Senior DevOps Engineer - Enterprise Systems

Bachelor’s or Master’s Degree in Computer Science or Software Engineering, or equivalent experience
10+ years of proven experience, including 5+ years of Linux and scripting experience
Solid background in enterprise systems and clustering architectures
A track record of quickly understanding new technologies outside of your domain expertise and deploying systems in complex configurations from hardware through multiple layers of software in a fast-paced environment
Strong technical skills and understanding of embedded systems, orchestration & automation systems, data centers and cloud architecture, as well as excellent communication and planning skills
Strong problem-solving ability and experience in product engineering, failure analysis and debug, and hardware or test design
Understanding of dense data center design, including compute, storage and networking

Work with NVIDIA product teams to understand new product requirements, including HPC and AI/ML products
Find optimal solutions to deploy these products in a data center or lab environment using sophisticated design techniques, services and tools
Assist in roll-out and deployment of new development features aimed at supporting the latest NVIDIA hardware and technologies
Work closely with world-class engineers, architects, technical product managers and application developers setting the best strategies in place for a product launch
Define and implement full-scale solutions for product onboarding into our hosted and private cloud environments
Solve critically layered problems involving multi-site deployments of NVIDIA products
Collaborate with multi-functional teams, including system engineering, software engineering, mechanical/thermal engineering, operations, data center teams, external vendors, and other partners to optimally deliver a reliable and robust platform from concept to prototype to deployments
Directly contribute to the overall quality of deployments and improve time to market for next-generation products

Experience in large-scale QA environments for product bring-up
Background with supporting GPUs, embedded device development, driver development and CUDA applications
Specialized skills in large-scale computing and cluster computing (MPI), and in data center design, including high-speed interconnects (InfiniBand), cluster storage, and scheduling-related design and/or management experience
Experience with converged and hyper-converged hardware and servers
Background with Python and familiarity with Jenkins, Ansible and REST APIs, along with expert-level background in Windows and Linux administration