
MLOps Engineer
| Company | Otter.ai |
| --- | --- |
| Location | Mountain View, CA, USA |
| Salary | $155,000 – $185,000 |
| Type | Full-Time |
| Degrees | Master’s, PhD |
| Experience Level | Mid Level, Senior |
Requirements
- Has a Master’s degree plus 3 years of industry experience, or a Ph.D., in computer science, machine learning, or a related field.
- Has hands-on experience with Linux system administration, Kubernetes, Terraform, and AWS.
- Is proficient in GitOps, CI/CD tools (e.g., ArgoCD, Jenkins, GitHub Actions).
- Has experience in writing internal web applications (e.g., using Django, Flask, FastAPI, or React).
- Is familiar with ML experiment tracking tools (e.g., Weights & Biases, MLflow).
- Has expertise in model deployment and inference optimization using ONNX, TorchScript, TensorRT, or similar frameworks.
- Has strong programming skills in Python, with additional experience in Rust or C++.
- Understands large-scale distributed computing and has worked with Spark, Ray, or other big data processing frameworks.
- Has experience with performance tuning of ML models and infrastructure.
- Is comfortable collaborating with research and engineering teams to translate cutting-edge AI into scalable, production-ready solutions.
Responsibilities
- Design, deploy, and maintain scalable infrastructure on Linux, Kubernetes, and AWS to support machine learning workloads.
- Develop and manage automated CI/CD pipelines for machine learning models and applications, ensuring seamless deployments and version control.
- Build internal web applications to improve ML workflow efficiency, model monitoring, and deployment processes.
- Utilize tools such as Weights & Biases, MLflow, and other experiment tracking systems to manage model lifecycle and metadata.
- Deploy and optimize ML models for inference using ONNX, TorchScript, TensorRT, or other relevant technologies to maximize performance.
- Conduct performance profiling and tuning of ML workloads, optimizing memory usage, compute efficiency, and model latency.
- Design and maintain large-scale data processing pipelines using Spark, Ray, or other distributed computing frameworks to support AI research and production systems.
Preferred Qualifications
No preferred qualifications provided.