Senior Machine Learning Engineer
Company | Leonardo.Ai |
---|---|
Location | Seattle, WA, USA; California, USA; San Francisco, CA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Not specified |
Experience Level | Senior |
Requirements
- Strong experience building and managing MLOps pipelines using frameworks like Kubeflow, MLflow, or similar.
- Proficiency in Python, with a focus on writing high-performance, maintainable code.
- Hands-on experience with AWS services (e.g., S3, EC2, SageMaker), and infrastructure-as-code tools like Terraform.
- Deep understanding of Docker and container orchestration tools like Kubernetes.
- Experience designing scalable ETL pipelines and working with SQL and NoSQL databases.
- Familiarity with API integrations, network configurations (e.g., proxies, SSH, NAT, VPN), and security best practices.
- Knowledge of monitoring tools such as Prometheus, Grafana, or CloudWatch.
- Highly adaptable and eager to learn emerging tools and technologies in the MLOps landscape.
Responsibilities
- Design, build, and maintain robust MLOps pipelines to support the end-to-end lifecycle of machine learning models, including data preparation, training, deployment, monitoring, and retraining.
- Develop reusable tools and modules to enable efficient experimentation, model deployment, and versioning.
- Integrate ComfyUI nodes and other workflow tools into the MLOps ecosystem, optimising for performance and scalability.
- Collaborate with DevOps teams to implement and manage cloud infrastructure, focusing on AWS (e.g., S3, EC2, SageMaker) using tools like Terraform and CloudFormation.
- Implement CI/CD pipelines tailored for machine learning workflows, ensuring smooth transitions from research to production.
- Optimise resource allocation and manage costs associated with cloud-based machine learning workloads.
- Design and maintain scalable data pipelines for collecting, processing, and storing large volumes of data.
- Automate data acquisition and preprocessing workflows, optimising I/O bandwidth and implementing efficient storage solutions.
- Manage data integrity and ensure compliance with privacy and security standards.
- Deploy machine learning models to production, ensuring robustness, scalability, and low latency.
- Implement monitoring solutions for deployed models to track performance metrics, detect drift, and trigger retraining pipelines.
- Continuously optimise inference performance using techniques like model quantisation, distillation, or caching strategies.
- Work closely with cross-functional teams, including AI researchers, data engineers, and software developers, to support ongoing projects and align MLOps efforts with organisational goals.
- Proactively identify opportunities to streamline and automate workflows, driving innovation and efficiency.
- Operate independently, managing deadlines and deliverables while producing high-quality solutions in a dynamic environment.
Preferred Qualifications
- Strong grasp of DevOps principles, including CI/CD and infrastructure automation.
- Understanding of machine learning model lifecycle, including data versioning, experiment tracking, and model explainability.
- Experience with distributed computing frameworks like Apache Spark, Dask, or Ray.
- Familiarity with performance optimisation techniques such as multi-threading, vectorisation, or distributed computing.