Senior Manager, Observability and Reliability Platform Engineering

Company	Geico
Location	San Francisco, CA, USA; Bethesda, MD, USA
Salary	$150,000 – $300,000
Type	Full-Time
Degrees
Experience Level	Senior

Requirements

Strong expertise in Python, Golang, or Java and RESTful services, with a focus on building high-throughput, high-volume distributed systems
Strong expertise in Unix, container orchestration (e.g., Kubernetes), container runtimes, and optimization
Experience with open-source observability tools such as Prometheus and the LGTM stack (Loki, Grafana, Tempo, Mimir) is a big plus
Strong understanding of columnar data stores
Strong understanding of Site Reliability Engineering and DevOps principles
Strong technical acumen in cloud architecture, performance benchmarking, and capacity planning
Solid foundation in algorithms, data structures, and core computer science concepts
Experience managing and growing engineers and teams
In-depth knowledge of CS data structures and algorithms
Basic UI/UX and prototype design knowledge and experience
Proven ability to concentrate, learn technical concepts, and adapt quickly to new technologies
Strong cloud platform knowledge (AWS, GCP, Azure, etc.)
Proficiency in project management and work-item management tools such as Azure DevOps and Portfolio

Responsibilities

Bring strong technical expertise and leadership: lead from the trenches and demonstrate proven knowledge of observability
Drive the build-out of multi-cloud infrastructure, lead by example, and be a role model for the team of developers and infrastructure engineers
Work with your Director to address project dependencies, negotiate and estimate incremental delivery dates for milestones with the stakeholder community, and deliver projects on time
Understand how requirements and design choices may impact systems across multiple areas
Report on your team’s progress against project and other key metrics, and present detailed, implementable ideas for improving or influencing product or project delivery
Initiate and support performance evaluation of team members
Cultivate a culture that motivates all levels of performers to higher levels of achievement
Build and maintain relationships with your team members to support an environment of trust
Identify where technical or analytical skill gaps put future team deliverables at risk and craft a plan to remediate them; consistently challenge team members to share knowledge and learn new technologies
Contribute significantly to the team planning process, including surfacing associate-level proposals
Collaborate with the product teams to understand their pain points around performance and resiliency, and formulate strategies to address recurring issues sustainably
Develop and motivate teams to solve complex problems and be a strong advocate for open-source technologies and solutions
Be responsible for building and mentoring a new team of site reliability engineers
Drive the team toward building solutions that serve long-term goals while ensuring that high-priority tech debt is resolved efficiently
Be a strong thought leader in observability and site reliability engineering principles
Consistently share best practices and improve processes within and across teams

Preferred Qualifications

No preferred qualifications provided.