Staff DevOps Engineer
| | |
|---|---|
| Company | Prenuvo |
| Location | San Francisco, CA, USA |
| Salary | $169,000 – $207,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Expert or higher |
Requirements
- 10+ years of experience in DevOps, SRE, or infrastructure engineering, including recent senior/staff-level roles with high-impact ownership
- Proven experience with Terraform at scale (modules, orchestration, testing), ideally across multiple cloud environments (AWS preferred)
- Familiarity with container orchestration technologies such as Amazon ECS and Kubernetes (K8s), as well as serverless frameworks, including experience deploying, scaling, and managing containerized applications in production environments
- Strong Python programming skills with the ability to build tools, automate systems, and contribute to application codebases, coupled with a solid understanding of application internals and deployment considerations
- Deep experience with GitHub Actions for CI/CD, including custom workflows, reusable actions, and multi-environment pipelines
- Solid understanding of network architecture, security principles, and cloud-native infrastructure patterns
- Hands-on experience with monitoring and observability, especially with Datadog, and the ability to interpret metrics/logs to guide system improvements
- Excellent communication and collaboration skills—you’re comfortable navigating technical conversations across engineering, security, and leadership
Responsibilities
- Take technical ownership of our Terraform-based IaC platform—assess the current state, define next steps, and drive delivery of improvements across environments
- Work closely with engineering teams to design and implement cloud-native architectures, optimize deployment pipelines, and contribute directly to Python-based codebases where needed
- Strengthen and scale our CI/CD processes using GitHub Actions, including artifact packaging, environment promotion, and automated testing/release workflows
- Shape and execute our global infrastructure strategy, including multi-region deployment, scalability, and resiliency planning
- Implement and improve disaster recovery planning, incident response, and high availability for critical systems
- Design and enforce cloud networking and security best practices, including IAM, VPC architecture, and secrets management
- Drive improvements in observability and performance monitoring using Datadog APM, metrics, logs, and alerting to proactively identify and resolve issues
- Collaborate across teams to implement and evangelize SRE principles, including SLIs, SLOs, and error budgets
- Serve as a technical mentor and thought partner, helping level up others while delivering meaningful improvements yourself
Preferred Qualifications
- Background working in platform engineering, developer experience, or infrastructure leadership roles
- Experience navigating the challenges of a startup transitioning into a scale-up, including evolving systems, processes, and team structures
- Experience mentoring or guiding teams to turn projects around
- Knowledge of compliance and risk management in regulated environments (HIPAA, SOC 2, ISO 27001, etc.)