Senior Software Engineer-Distributed Inference

Company: NVIDIA
Location: Washington, USA; Texas, USA; Arizona, USA; Colorado, USA; Massachusetts, USA
Salary: $184,000 – $356,500
Type: Full-Time
Degrees: Bachelor’s, Master’s, PhD
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s, Master’s, or PhD, or equivalent experience
  • 8+ years of experience in Computer Science, computer architecture, or a related field
  • Knowledge of distributed systems programming
  • Ability to work in a fast-paced, agile team environment
  • Excellent Python programming and software design skills, including debugging, performance analysis, and test design

Responsibilities

  • Develop and enhance functionality within the GenAI-Perf, Triton Performance Analyzer, and Triton Model Analyzer tools.
  • Collaborate with researchers and engineers to understand their performance analysis needs and translate them into actionable features.
  • Collaborate closely with cross-functional teams including software engineers, system architects, and product managers to drive performance improvements throughout the development lifecycle.
  • Set up, execute, and analyze the performance of LLM, generative AI, and deep learning models.
  • Develop and implement efficient algorithms for measuring deep learning throughput and latency, benchmarking large language models, and deploying models (see the sketch after this list).
  • Integrate various tools to create a unified and user-friendly experience for deep learning performance analysis.
  • Automate testing processes to ensure the quality and stability of the tools.
  • Contribute to technical documentation and user guides.
  • Stay up to date on the latest advancements in deep learning performance analysis and LLM optimization techniques.
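
To make the throughput and latency responsibility above concrete, here is a minimal, hedged sketch of the kind of measurement loop involved. The endpoint URL, model name, and payload shape are illustrative assumptions, not part of this posting or of GenAI-Perf, Triton Performance Analyzer, or Triton Model Analyzer.

    # Minimal sketch: measure per-request latency and sequential throughput against
    # a hypothetical HTTP inference endpoint. The endpoint, model name, and payload
    # shape are assumptions for illustration only.
    import json
    import statistics
    import time
    import urllib.request

    ENDPOINT = "http://localhost:8000/v1/completions"  # hypothetical endpoint
    PROMPT = "Explain GPU inference in one sentence."
    NUM_REQUESTS = 20

    def send_request(prompt: str) -> float:
        """Send one completion request and return its latency in seconds."""
        payload = json.dumps({"model": "example-llm", "prompt": prompt,
                              "max_tokens": 64}).encode()
        req = urllib.request.Request(ENDPOINT, data=payload,
                                     headers={"Content-Type": "application/json"})
        start = time.perf_counter()
        with urllib.request.urlopen(req) as resp:
            resp.read()  # wait for the full response body
        return time.perf_counter() - start

    latencies = [send_request(PROMPT) for _ in range(NUM_REQUESTS)]
    print(f"p50 latency: {statistics.median(latencies) * 1000:.1f} ms")
    print(f"p95 latency: {statistics.quantiles(latencies, n=20)[18] * 1000:.1f} ms")
    print(f"throughput:  {NUM_REQUESTS / sum(latencies):.2f} req/s")

Tools such as GenAI-Perf build on this basic pattern with concurrency control, warm-up handling, and token-level metrics such as time to first token.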

Preferred Qualifications

  • Experience with deep learning algorithms and frameworks, especially with Large Language Models and frameworks such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
  • Excellent troubleshooting abilities spanning multiple layers of the software stack (storage systems, kernels, and containers).
  • Experience contributing to a large open source project: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.
  • Familiarity with cloud computing platforms (e.g., AWS, Azure, GCP) and experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON, and related technologies (a minimal sketch follows this list).
  • Experience working with NVIDIA GPUs and deep learning inference frameworks is a plus.
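
As a rough illustration of the HTTP REST and JSON experience mentioned above, the following is a minimal sketch of a JSON-over-HTTP health endpoint using only the Python standard library. The /v1/health route and the response shape are assumptions for illustration, not an NVIDIA or Triton API.

    # Minimal sketch of a JSON-over-HTTP service. The /v1/health route and the
    # response body are illustrative assumptions only.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/v1/health":
                body = json.dumps({"status": "ok"}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_response(404)
                self.end_headers()

    if __name__ == "__main__":
        # Serve on localhost:8080 until interrupted (Ctrl+C).
        HTTPServer(("127.0.0.1", 8080), HealthHandler).serve_forever()

A gRPC or protobuf-based service would replace the JSON body with generated message types, but the overall request/response shape of the interface stays similar.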