Senior/Staff+ Site Reliability Engineer I – Observability

Company: Crusoe
Location: San Francisco, CA, USA
Salary: $250,000 – $250,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Expert or higher

Requirements

  • 12+ years of professional SRE experience
  • 12+ years of experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and current systems
  • Bachelor’s Degree in Computer Science or related field, or 15+ years relevant work experience
  • Solid understanding of infrastructure design, including the operational trade-offs of various designs
  • Experience writing high quality code with at least one programming language (Python, Go, or similar)
  • Experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible, CloudFormation, and Terraform
  • Experience building with modern CI/CD practices and build systems, such as GitLab CI/CD, CircleCI, GitHub Actions
  • Experience with logging, monitoring and alerting systems and tools
  • Experience with Unix/Linux environments
  • Experience with TCP/IP and network programming
  • Experience with information security best practices
  • Excellent communication skills
  • Must be able to pass a background check

Responsibilities

  • Help manage, maintain, and improve the observability stack, ensuring all systems are operating optimally
  • Work closely with software engineering teams, helping them integrate observability best practices into their software lifecycle and guiding them on using analytics to pinpoint performance issues and optimize reliability
  • Develop and deploy new monitoring capabilities, enhance telemetry infrastructure, and improve integration frameworks and libraries
  • Write tooling and automation to support the observability platform and its users
  • Help teams analyze telemetry data to gain valuable insights into system performance and reliability
  • Identify opportunities to refine our observability stack, continuously improving its capabilities and usability
  • Document work, share learnings with the team, and plan ahead
  • Collaborate with teammates during stand-ups to discuss ongoing projects, recent insights, and priorities for enhancing observability tools and workflows
