Staff/Tech Lead-ML Infrastructure Engineer
| Company | Gatik AI |
|---|---|
| Location | Mountain View, CA, USA |
| Salary | Not provided |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s |
| Experience Level | Senior, Expert or higher |
Requirements
- Bachelor’s Degree in Computer Science, Machine Learning or relevant field
- 7+ years of experience working with large ML projects and/or building production ML systems
- Excellent C++, Python, and/or CUDA programming skills
- Familiarity with modern machine learning frameworks such as PyTorch
- Expert experience with optimization techniques, from high-level ML algorithms to low-level hardware utilization
- Experience in software architecture, system performance, latency, and data flow
- Expert experience in machine learning workflows: data sampling and curation, pre-processing, model training, ablation studies, evaluation, deployment, inference optimization
- Strong analytical skills, especially for performance troubleshooting (e.g., profiling, roofline analysis)
- Industry experience in building large-scale ML pipelines
Responsibilities
- Own development of ML models end-to-end, from data strategy through initial development, optimization, and production platform validation to fine-tuning based on metrics and on-road performance
- Lead efficient neural network development including quantization, pruning, sparsification, compression, and novel differentiable compute primitives
- Build foundation models for on-vehicle and offline applications
- Develop metrics and tools to analyze errors and understand improvements in our systems
- Train and evaluate DNNs to benchmark neural network optimization algorithms, optimizing for latency and power consumption
- Design and implement a horizontally scalable, high-throughput cloud inference pipeline for evaluation and KPI calculation
- Streamline workflows to allow creation of verified, deployable artifacts from annotated data
- Support data preparation for training by building a horizontally scalable data preparation pipeline that is simple to use and doesn’t delay training
- Support development of tools for introspection and visualization to understand what is going well and what can be improved in our work
Preferred Qualifications
- Master’s Degree with a focus on Machine Learning, Statistics, Optimization, or a related field, or equivalent relevant work experience
- Experience with cloud ML training pipelines in Azure
- High Performance Computing experience