Senior Machine Learning Engineer
Company | Leonardo.Ai |
---|---|
Location | Seattle, WA, USA; California, USA; San Francisco, CA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Not specified |
Experience Level | Senior |
Requirements
- Strong experience building and managing MLOps pipelines using frameworks like Kubeflow, MLflow, or similar.
- Proficiency in Python, with a focus on writing high-performance, maintainable code.
- Hands-on experience with AWS services (e.g., S3, EC2, SageMaker), and infrastructure-as-code tools like Terraform.
- Deep understanding of Docker and container orchestration tools like Kubernetes.
- Experience designing scalable ETL pipelines and working with SQL and NoSQL databases.
- Familiarity with API integrations, network configurations (e.g., proxies, SSH, NAT, VPN), and security best practices.
- Knowledge of monitoring tools such as Prometheus, Grafana, or CloudWatch.
- Highly adaptable and eager to learn emerging tools and technologies in the MLOps landscape.
Responsibilities
- Design, build, and maintain robust MLOps pipelines to support the end-to-end lifecycle of machine learning models, including data preparation, training, deployment, monitoring, and retraining.
- Develop reusable tools and modules to enable efficient experimentation, model deployment, and versioning.
- Integrate ComfyUI nodes and other workflow tools into the MLOps ecosystem, optimising for performance and scalability.
- Collaborate with DevOps teams to implement and manage cloud infrastructure, focusing on AWS (e.g., S3, EC2, SageMaker) using tools like Terraform and CloudFormation.
- Implement CI/CD pipelines tailored for machine learning workflows, ensuring smooth transitions from research to production.
- Optimise resource allocation and manage costs associated with cloud-based machine learning workloads.
- Design and maintain scalable data pipelines for collecting, processing, and storing large volumes of data.
- Automate data acquisition and preprocessing workflows, optimising I/O bandwidth and implementing efficient storage solutions.
- Manage data integrity and ensure compliance with privacy and security standards.
- Deploy machine learning models to production, ensuring robustness, scalability, and low latency.
- Implement monitoring solutions for deployed models to track performance metrics, detect drift, and trigger retraining pipelines.
- Continuously optimise inference performance using techniques like model quantisation, distillation, or caching strategies.
- Work closely with cross-functional teams, including AI researchers, data engineers, and software developers, to support ongoing projects and align MLOps efforts with organisational goals.
- Proactively identify opportunities to streamline and automate workflows, driving innovation and efficiency.
- Operate independently, managing deadlines and deliverables while producing high-quality solutions in a dynamic environment.
Preferred Qualifications
- Strong grasp of DevOps principles, including CI/CD and infrastructure automation.
- Understanding of machine learning model lifecycle, including data versioning, experiment tracking, and model explainability.
- Experience with distributed computing frameworks like Apache Spark, Dask, or Ray.
- Familiarity with performance optimisation techniques such as multi-threading, vectorisation, or distributed computing.