
MLOps Engineer
| Company | Otter.ai |
| --- | --- |
| Location | Mountain View, CA, USA |
| Salary | $155,000 – $185,000 |
| Type | Full-Time |
| Degrees | Master’s, PhD |
| Experience Level | Mid Level, Senior |
Requirements
- Has a Master’s degree plus 3 years of industry experience, or a Ph.D., in computer science, machine learning, or a related field.
- Has hands-on experience with Linux system administration, Kubernetes, Terraform, and AWS.
- Is proficient in GitOps, CI/CD tools (e.g., ArgoCD, Jenkins, GitHub Actions).
- Has experience in writing internal web applications (e.g., using Django, Flask, FastAPI, or React).
- Is familiar with ML experiment tracking tools (e.g., Weights & Biases, MLflow).
- Has expertise in model deployment and inference optimization using ONNX, TorchScript, TensorRT, or similar frameworks.
- Has strong programming skills in Python, with additional experience in Rust or C++.
- Understands large-scale distributed computing and has worked with Spark, Ray, or other big data processing frameworks.
- Has experience with performance tuning of ML models and infrastructure.
- Is comfortable collaborating with research and engineering teams to translate cutting-edge AI into scalable, production-ready solutions.
Responsibilities
- Design, deploy, and maintain scalable infrastructure on Linux, Kubernetes, and AWS to support machine learning workloads.
- Develop and manage automated CI/CD pipelines for machine learning models and applications, ensuring seamless deployments and version control.
- Build internal web applications to improve ML workflow efficiency, model monitoring, and deployment processes.
- Utilize tools such as Weights & Biases, MLflow, and other experiment tracking systems to manage model lifecycle and metadata.
- Deploy and optimize ML models for inference using ONNX, TorchScript, TensorRT, or other relevant technologies to maximize performance.
- Conduct performance profiling and tuning of ML workloads, optimizing memory usage, compute efficiency, and model latency.
- Design and maintain large-scale data processing pipelines using Spark, Ray, or other distributed computing frameworks to support AI research and production systems.
Preferred Qualifications
No preferred qualifications provided.