Senior Software Engineer – Distributed Inference
| Company | NVIDIA |
| --- | --- |
| Location | Washington, USA; Texas, USA; Arizona, USA; Colorado, USA; Massachusetts, USA |
| Salary | $184,000 – $356,500 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s, PhD |
| Experience Level | Senior, Expert or higher |
Requirements
- Bachelor’s, Master’s, or PhD degree, or equivalent experience
- 8+ years of experience in computer science, computer architecture, or a related field
- Knowledge of distributed systems programming
- Ability to work in a fast-paced, agile team environment
- Excellent Python programming and software design skills, including debugging, performance analysis, and test design
Responsibilities
- Develop and enhance functionality within the GenAI-Perf, Triton Performance Analyzer, and Triton Model Analyzer tools.
- Collaborate with researchers and engineers to understand their performance analysis needs and translate them into actionable features.
- Collaborate closely with cross-functional teams including software engineers, system architects, and product managers to drive performance improvements throughout the development lifecycle.
- Set up, execute, and analyze the performance of LLM, generative AI, and deep learning models.
- Develop and implement efficient algorithms for measuring deep learning throughput and latency, benchmarking large language models, and deploying models.
- Integrate various tools to create a unified and user-friendly experience for deep learning performance analysis.
- Automate testing processes to ensure the quality and stability of the tools.
- Contribute to technical documentation and user guides.
- Stay up-to-date on the latest advancements in deep learning performance analysis and LLM optimization techniques.
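For a sense of the throughput and latency measurement work described above, here is a minimal sketch in Python. It times a generic synchronous inference callable and reports throughput and latency percentiles; the `infer` callable and its inputs are placeholders, not any specific Triton or GenAI-Perf API.

```python
import time
import statistics

def benchmark(infer, requests, warmup=3):
    """Measure per-request latency and overall throughput for a callable.

    `infer` stands in for any synchronous inference call; production tools
    such as GenAI-Perf measure live model endpoints instead.
    """
    for _ in range(warmup):            # warm up caches / lazy initialization
        infer(requests[0])
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        infer(req)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_rps": len(requests) / elapsed,
        "p50_ms": statistics.median(latencies) * 1e3,
        "p99_ms": statistics.quantiles(latencies, n=100)[98] * 1e3,
    }

# Example with a stand-in "model" that simply sleeps ~1 ms per request
stats = benchmark(lambda r: time.sleep(0.001), ["prompt"] * 50)
print(stats)
```

Reporting percentiles (p50/p99) rather than a single average matters for LLM serving, where tail latency often dominates the user experience.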
Preferred Qualifications
- Experience with deep learning algorithms and frameworks, especially large language models and frameworks such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
- Excellent troubleshooting abilities spanning multiple software layers (storage systems, kernels, and containers).
- Experience contributing to a large open-source project: use of GitHub, bug tracking, branching and merging code, handling patches, OSS licensing issues, etc.
- Familiarity with cloud computing platforms (e.g., AWS, Azure, GCP) and experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON, and related technologies.
- Experience working with NVIDIA GPUs and deep learning inference frameworks is a plus.
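As a small illustration of the HTTP REST + JSON service work mentioned in the qualifications, here is a sketch that assembles a JSON-over-HTTP inference request using only the Python standard library. The endpoint URL and request schema are hypothetical placeholders; real services (for example, Triton's KServe-style HTTP API) define their own formats.

```python
import json
import urllib.request

# Hypothetical endpoint and payload schema, for illustration only.
ENDPOINT = "http://localhost:8000/v2/models/example/infer"

def build_infer_request(prompt: str) -> urllib.request.Request:
    """Assemble a JSON-over-HTTP POST inference request (not sent here)."""
    payload = {"inputs": [{"name": "text", "shape": [1], "data": [prompt]}]}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_infer_request("Hello, world")
print(req.method, req.get_header("Content-type"))
```

In practice the same request shape would be sent with `urllib.request.urlopen(req)` (or an HTTP client library) and the JSON response decoded; the sketch stops before the network call so it stands alone.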