Software Engineer in Systems
Field | Details |
---|---|
Company | OpenAI |
Location | San Francisco, CA, USA |
Salary | $405,000 – $405,000 |
Type | Full-Time |
Degrees | |
Experience Level | Mid Level |
Requirements
- Background in high-performance computing and/or low-level systems.
Responsibilities
- Build systems to distribute work across massive GPU clusters efficiently.
- Design and implement methods to make our training stack more efficient and scale it up to our next-generation supercomputers.
- Design and implement methods to robustly train models in the presence of hardware failures.
- Build tooling to help us better understand problems in our largest training jobs.
Preferred Qualifications
- Love getting into low-level details about performance.
- Passionate about building stable and highly efficient distributed systems.