Senior Machine Learning Engineer

Company: Leonardo.Ai
Location: Seattle, WA, USA; California, USA; San Francisco, CA, USA
Salary: Not provided
Type: Full-Time
Degrees: Not specified
Experience Level: Senior

Requirements

  • Strong experience building and managing MLOps pipelines using frameworks like Kubeflow, MLflow, or similar.
  • Proficiency in Python, with a focus on writing high-performance, maintainable code.
  • Hands-on experience with AWS services (e.g., S3, EC2, SageMaker), and infrastructure-as-code tools like Terraform.
  • Deep understanding of Docker and container orchestration tools like Kubernetes.
  • Experience designing scalable ETL pipelines and working with SQL and NoSQL databases.
  • Familiarity with API integrations, network configurations (e.g., proxies, SSH, NAT, VPN), and security best practices.
  • Knowledge of monitoring tools such as Prometheus, Grafana, or CloudWatch.
  • Highly adaptable and eager to learn emerging tools and technologies in the MLOps landscape.

Responsibilities

  • Design, build, and maintain robust MLOps pipelines to support the end-to-end lifecycle of machine learning models, including data preparation, training, deployment, monitoring, and retraining.
  • Develop reusable tools and modules to enable efficient experimentation, model deployment, and versioning.
  • Integrate ComfyUI nodes and other workflow tools into the MLOps ecosystem, optimising for performance and scalability.
  • Collaborate with DevOps teams to implement and manage cloud infrastructure, focusing on AWS (e.g., S3, EC2, SageMaker) using tools like Terraform and CloudFormation.
  • Implement CI/CD pipelines tailored for machine learning workflows, ensuring smooth transitions from research to production.
  • Optimise resource allocation and manage costs associated with cloud-based machine learning workloads.
  • Design and maintain scalable data pipelines for collecting, processing, and storing large volumes of data.
  • Automate data acquisition and preprocessing workflows, optimising I/O bandwidth and implementing efficient storage solutions.
  • Manage data integrity and ensure compliance with privacy and security standards.
  • Deploy machine learning models to production, ensuring robustness, scalability, and low latency.
  • Implement monitoring solutions for deployed models to track performance metrics, detect drift, and trigger retraining pipelines.
  • Continuously optimise inference performance using techniques like model quantisation, distillation, or caching strategies.
  • Work closely with cross-functional teams, including AI researchers, data engineers, and software developers, to support ongoing projects and align MLOps efforts with organisational goals.
  • Proactively identify opportunities to streamline and automate workflows, driving innovation and efficiency.
  • Operate independently to meet deadlines and deliver high-quality solutions in a dynamic environment.

Preferred Qualifications

  • Strong grasp of DevOps principles, including CI/CD and infrastructure automation.
  • Understanding of the machine learning model lifecycle, including data versioning, experiment tracking, and model explainability.
  • Experience with distributed computing frameworks like Apache Spark, Dask, or Ray.
  • Familiarity with performance optimisation techniques such as multi-threading, vectorisation, or distributed computing.