Software Engineer – Power Management – Hardware Health
| Company | OpenAI |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $310,000 – $460,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior, Expert or higher |
Requirements
- 7+ years of software engineering experience with a focus on solving large-scale, system-level challenges.
- Strong proficiency in Python and familiarity with automation and scripting tools (e.g., shell scripting).
- Experience building distributed systems that efficiently aggregate and analyze streaming data.
- Knowledge of electrical engineering concepts including digital signal processing, power systems, Fast Fourier Transforms, or related areas.
- Experience in system-level investigations and development of automated solutions to address power management, fault detection, and remediation.
- Strong analytical skills and the ability to dig into noisy data (experience with SQL, PromQL, Pandas, etc.).
- Comfort working with both hardware and software teams to solve multidisciplinary problems.
Responsibilities
- Develop and implement system-level and software-level solutions to optimize power usage in large-scale supercomputers, ensuring efficient and reliable operations.
- Build automation to monitor power consumption patterns during training workloads and design algorithms to stabilize fluctuations in that consumption, preventing grid reliability issues.
- Work with researchers and engineers to design tools for real-time monitoring, detection, and remediation of power-related hardware and system faults.
- Collaborate cross-functionally to translate complex electrical system requirements into code, while driving continuous improvements in power management solutions.
- Drive the development of power throttling mechanisms at the IT system level to dynamically adjust power usage based on workload demands and infrastructure limitations.
- Collaborate with hardware design teams to integrate system-level power control requirements into IT hardware design, ensuring seamless coordination between software-driven power management and hardware capabilities.
Preferred Qualifications
- Deep expertise in the power characteristics of synchronous workloads (as seen in supercomputing or model training environments).
- Knowledge of power control requirements in IT hardware design, with the ability to drive cross-functional collaboration to integrate power management features into hardware systems effectively.
- Working knowledge of control system fundamentals and how physical systems respond to control strategies.