Data Scientist / Engineer
Company | Broadcom Limited |
---|---|
Location | Portsmouth, NH, USA; Framingham, MA, USA |
Salary | $107,000 – $171,000 |
Type | Full-Time |
Degrees | Bachelor’s |
Experience Level | Senior, Expert or higher |
Requirements
- Proficient in Python for data manipulation (Pandas, NumPy), data visualization (Matplotlib, Seaborn, Plotly), and machine learning libraries (Scikit-learn, TensorFlow, PyTorch)
- Experience with data manipulation, statistical analysis, and visualization packages in R (dplyr, tidyr, ggplot2)
- Ability to query and manipulate large datasets from relational databases using SQL
- Experience integrating and deploying large language models (LLMs), diffusion models, and other generative AI models into data pipelines and applications
- Expertise in cleaning, transforming, and preparing data for use in generative AI models
- Deep understanding of prompt engineering techniques for maximizing the quality and relevance of outputs from generative AI models
- Familiarity with different retrieval techniques, such as BM25, dense retrieval (using embeddings), and hybrid approaches
- Understanding of how to make a system’s reasoning transparent and understandable, which is especially important for building trust in GenAI applications
- Bachelor’s degree and 8+ years of related experience
Responsibilities
- Design and develop a network operations AI assistant expert system using generative AI, traditional machine learning, and statistical analysis tools and techniques
- Integrate and deploy large language models (LLMs), diffusion models, and other generative AI models into data pipelines and applications
- Clean, transform, and prepare data for use in generative AI models
- Maximize the quality and relevance of outputs from generative AI models through prompt engineering techniques
- Communicate complex technical information to both technical and non-technical audiences
- Create clear and effective visualizations using various tools
Preferred Qualifications
- Experience with Cypher/GQL for querying complex graph-based data sets
- Proficiency in other programming languages like Java, Scala, or Go