Site Reliability Engineer II – Observability
| Company | Crusoe |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $135,000 – $158,000 |
| Type | Full-Time |
| Degrees | Bachelor’s |
| Experience Level | Senior |
Requirements
- 5+ years of professional SRE experience
- 5+ years of experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and existing systems
- Bachelor’s Degree in Computer Science or related field, or 6+ years relevant work experience
- Solid understanding of infrastructure design, including the operational trade-offs of various designs
- Experience writing high-quality code with at least one programming language (Python, Go, or similar)
- Experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible, CloudFormation, and Terraform
- Experience building with modern CI/CD practices and build systems such as GitLab CI/CD, CircleCI, and GitHub Actions
- Experience with logging, monitoring, and alerting systems and tools
- Experience with Unix/Linux environments
- Experience with TCP/IP and network programming
- Experience with information security best practices
- Excellent communication skills
- Must be able to pass a background check
- Embody the Company values
Responsibilities
- Help manage, maintain, and improve the observability stack, ensuring all systems are operating optimally
- Work closely with software engineering teams, helping them integrate observability best practices into their software lifecycle and guiding them on using analytics to pinpoint performance issues and optimize reliability
- Develop and deploy new monitoring capabilities, enhance telemetry infrastructure, and improve integration frameworks and libraries
- Write tooling and automation to support the observability platform and its users
- Help teams analyze telemetry data to gain valuable insights into system performance and reliability
- Identify opportunities to refine our observability stack, continuously improving its capabilities and usability
- Document work, share learnings with the team, and plan ahead
- Collaborate with teammates during stand-ups to discuss ongoing projects, recent insights, and priorities for enhancing observability tools and workflows
Preferred Qualifications
No preferred qualifications provided.