Data Scientist II
Company | Robert Half |
---|---|
Location | San Ramon, CA, USA |
Salary | $117,000 – $176,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Mid Level, Senior |
Requirements
- Bachelor’s degree in Statistics, Computer Science, Mathematics or equivalent required; Master’s or PhD highly preferred
- 5 years of professional experience in data science, with a track record of designing and implementing large-scale data science projects
- 5 years of industry experience in predictive modeling and large data analysis
- Knowledge of open-source large language models and experience with evaluating and recommending appropriate models for specific use cases
- 3+ years of experience using big data platforms and technologies such as Hadoop, Azure Data Lake, Azure Cosmos DB, Pig, Hive, HBase, etc.
- 3+ years of hands-on experience in statistical modeling, data mining, large data analysis and predictive modeling; text mining a major plus
- 3+ years of experience in regression, classification and clustering methods such as GLM, LR, SVM, LVQ, SOM, Neural Networks
- Experience with two or more of the following: Python, Perl, MATLAB or Scala
- Expertise in various machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn)
- Excellent analytical and problem-solving skills
- Excellent communication skills, with a proven ability to translate technical findings into business recommendations and strategies
Responsibilities
- Develop and implement advanced predictive models and statistical analyses using a variety of machine learning algorithms
- Suggest algorithms or models appropriate for specific use cases and applications
- Analyze and extract relevant information from large amounts of historical business data to help automate and optimize key processes with business teams
- Apply technical solutions to business problems and questions using large-scale data analytics and machine learning; create highly calibrated solutions for business problems
- Work closely with software engineering teams to drive real-time model experiments, implementations and new feature development
- Continuously evaluate and refine models based on performance metrics
- Utilize cloud technologies such as Azure Machine Learning, Azure Databricks, and other Microsoft data services for data processing, model building, and deployment
- Enhance and evolve the performance of large language models by refining their capabilities through targeted fine-tuning
- Steer both the research trajectory and the practical engineering efforts of the team
- Design and implement algorithms for model enhancement, tune critical hyperparameters, and improve overall model efficiency
- Guarantee the integrity and relevance of datasets by conducting thorough preprocessing and data analysis within the fine-tuning workflow
- Conduct assessments on fine-tuned models, making necessary modifications to boost their effectiveness
- Enhance the performance of large language models through prompt engineering, personas and retrieval-based techniques
- Benchmark large language models with a human in the loop and iteratively improve model performance
- Design and build multi-agent workflows to solve complex business problems
- Foster a cooperative environment within the team, providing guidance to peers to ensure a smooth fine-tuning operation that yields superior results
- Stay at the forefront of advancements in large language model technologies and applications, perpetually refining technical expertise in model fine-tuning
- Collaborate with IT and data engineering teams in an enterprise setting to integrate data science solutions into the broader tech stack and data strategy
- Work closely with business stakeholders to identify opportunities for leveraging company data to drive business solutions
- Translate complex data-driven findings into actionable business insights and communicate these effectively to non-technical stakeholders
- Stay abreast of industry trends and advancements in data science, large language models and Azure technologies
- Conduct research to explore new methodologies and technologies that can enhance the organization’s data analytics capabilities
Preferred Qualifications
- Master’s or PhD highly preferred
- Certifications in Azure data services or advanced analytics preferred