
Senior Performance Engineer II
| Company | DigitalOcean |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $165,000 – $210,000 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s |
| Experience Level | Senior |
Requirements
- Bachelor’s or Master’s degree in Computer Science, Mathematics, Statistics, or Computer/Electrical Engineering, or equivalent work experience
- Extensive knowledge of the Linux kernel, hypervisors, and open-source operating systems
- 5+ years of experience with performance measurement tools such as profilers, eBPF, XDP, fio, TPC-C, MLPerf, and NCCL
- 5+ years developing strategies for managing, monitoring, and analyzing infrastructure, applications, and services
- Strong proficiency in Go, Python, and/or Ruby
- Deep understanding of kernel performance aspects, including scheduling, context switching, and hardware acceleration
- Expertise in distributed systems performance, including tracing and debugging methodologies
- Demonstrated ability to solve complex problems at scale
- Excellent cross-team collaboration and communication skills
- Leadership experience in skills development and mentorship
- Professional-level written and spoken English with strong presentation abilities
Responsibilities
- Develop and implement comprehensive performance metrics, analysis tools, and reporting systems
- Lead initiatives to enhance shared infrastructure, balancing performance optimization with rigorous security standards
- Conduct in-depth performance analysis of the Linux kernel, virtualization layer, storage, and network stack to devise optimization strategies
- Identify system bottlenecks proactively and drive optimizations across the hypervisor software stack
- Work cross-functionally to harness new performance capabilities from evolving hardware architectures
- Enhance test frameworks and pipelines to ensure robust performance validation
- Investigate and resolve virtual machine downtime and performance issues in our production environment
- Participate in on-call rotations as needed to support system reliability
Preferred Qualifications
- Experience with observability platforms such as Splunk, Prometheus, Grafana, Elastic, or Dynatrace
- Experience with Chef, AWX, and/or Kubernetes
- Familiarity with x86_64 and/or ARM architectures
- Successful history of upstreaming Linux kernel patches
- In-depth knowledge of at least one Linux subsystem (CPU scheduling, memory management, file system, I/O, etc.)
- Experience in developing and deploying ML-based solutions for anomaly detection and dynamic load balancing