Site Reliability Engineer II – Real-Time

Company: Esri
Location: West Redlands, Redlands, CA, USA
Salary: $82,160 – $138,320
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior

Requirements

  • 5+ years of experience managing Kubernetes (EKS), logging and monitoring (ELK, Prometheus), and container technologies (Docker, ECS)
  • Proficient in using Terraform for automating infrastructure provisioning and management
  • Ability to design and automate Git workflows for streamlined code integration, testing, and infrastructure deployment
  • Ability to write scripts to deploy infrastructure and/or applications (Bash, Python, Terraform)
  • High level of understanding and experience with cloud computing platforms (AWS)
  • Strong knowledge of Linux operating system administration, including troubleshooting, performance tuning, and shell scripting
  • Proficient in cloud networking, including VPCs, subnets, security groups, and VPNs in platforms like AWS
  • Skilled in identifying and resolving system and application issues through effective troubleshooting and root cause analysis
  • Working knowledge of a source control and issue management system, preferably GitHub
  • Working knowledge of authoring, deploying, and troubleshooting Java applications on AWS Lambda
  • Bachelor’s in computer science, computer engineering, GIS, or information systems

Responsibilities

  • Collaborate with a team of site reliability engineers to operate SaaS capabilities across multiple regions on the cloud platform
  • Design, implement, configure, and use monitoring systems to track the health of SaaS products
  • Manage infrastructure used for ArcGIS Velocity and ArcGIS Workflow Manager, respond to alerts, and troubleshoot problems to resolution
  • Develop, implement, and maintain automation solutions for repetitive operational tasks, such as deployment pipelines, incident resolution, and scaling processes
  • Design and implement the deployment and upgrade of containerized microservice components that, when combined, power Esri’s SaaS offerings
  • Create and automate Git workflows to simplify code integration, testing, and infrastructure deployments
  • Participate in technical spike efforts, bringing innovative ideas to future versions of our software
  • Troubleshoot system incidents and provide root cause analysis reports
  • Provide rotational on-call technical support

Preferred Qualifications

  • 5+ years of experience designing, administering, and/or maintaining cloud environments, such as AWS, supporting 24×7 high-availability production environments
  • Interest in working with GitOps principles to automate the deployment of applications on Kubernetes clusters
  • Certifications: AWS Certified Solutions Architect - Associate, CKA/CKAD, or similar
  • Experience managing OpenSearch (as a datastore or logstore) and Kafka to handle distributed data streams and ensure high availability in large-scale systems
  • Ability to work with continuous integration and delivery best practices
  • Knowledge of operating resilient, highly available, scalable, and performant SaaS capabilities
  • Knowledge of Esri ArcGIS or other web mapping technologies
  • Master’s in computer science, computer engineering, GIS, or information systems