Engineering Manager – Machine Learning Platform
| Company | Chime |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $176,490 – $245,100 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior, Expert or higher |

Requirements
- Expertise in designing and scaling ML platforms for large-scale AI applications.
- Deep experience with ML infrastructure components, such as distributed training, model registries, feature stores, and inference serving.
- Hands-on knowledge of ML and data technologies, including TensorFlow, PyTorch, Kubeflow, MLflow, Airflow, Spark, and Kubernetes.
- Proficiency in cloud-based ML ecosystems, including AWS (SageMaker, S3, Lambda), GCP (Vertex AI, BigQuery), or Azure ML.
- Strong software engineering skills, with experience in Python, Java, or Scala and deep knowledge of MLOps best practices.
- Experience implementing monitoring and observability tools for model drift detection, automated retraining, and performance tracking.
- Leadership experience, with a track record of managing engineering teams and collaborating with data scientists.
Responsibilities
- Design and implement a scalable ML platform, enabling seamless model development, deployment, and monitoring across the organization.
- Optimize ML workflows, ensuring efficient experimentation, feature engineering, model training, and inference at scale.
- Build and maintain ML infrastructure, including distributed training systems, feature stores, model registries, and real-time serving frameworks.
- Work closely with ML engineers and data scientists, providing a self-service platform that accelerates research and deployment cycles.
- Ensure compliance and governance, defining best practices for ML model security, monitoring, versioning, and responsible AI.
- Improve ML model performance, enabling efficient inference pipelines, real-time model serving, and latency optimization.
- Lead and mentor a team of ML platform engineers, fostering a culture of innovation and technical excellence.
- Stay ahead of ML infrastructure trends, evaluating and adopting emerging technologies to improve scalability and performance.
Preferred Qualifications
No preferred qualifications provided.