Observability Engineer
Company | Las Vegas Sands Corp |
---|---|
Location | Dallas, TX, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s |
Experience Level | Senior |
Requirements
- At least 21 years of age.
- Proof of authorization to work in the United States.
- Bachelor’s degree in Computer Science, Engineering, or a related discipline required.
- 5+ years of proven experience developing monitoring and observability solutions for on-premises IT infrastructure, applications, and private and public cloud environments.
- Strong expertise in Python, Java, and RESTful services, with a focus on building high-throughput, high-volume distributed systems.
- Strong expertise in Linux/Unix, container orchestration (e.g., Kubernetes), and container runtimes and their optimization.
- Strong understanding of Site Reliability Engineering and DevOps principles.
- Strong technical acumen in cloud architecture, performance benchmarking, and capacity planning.
- Strong knowledge of cloud platforms (AWS, GCP, Azure, etc.).
- Proficiency in project management and work item management tools such as Azure DevOps and Portfolio.
- Strong knowledge of logging systems, including experience with the ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, or similar platforms.
- Experience with tools like Harness, GitLab, Terraform, Ansible, or CloudFormation for managing and monitoring infrastructure.
- Demonstrated experience diagnosing performance bottlenecks and other system issues using observability data.
- Demonstrated understanding of and respect for IT service management practices (e.g., change, release, incident, and problem management).
- Able to multi-task and handle various types of requests from different people/areas.
- Strong analytical and problem-solving skills.
- Effective written and verbal communication skills in English.
Responsibilities
- Work with the Lead Observability Engineer to set and execute on priorities for the required monitoring, alerting, and observability KPIs.
- Develop solutions to observability demands.
- Deliver broad services covering the following domains: log collection and analysis; operational metrics; distributed tracing; build, test, and deployment automation; and platform reliability engineering monitoring.
- Design, develop, and maintain automation solutions to support observability and operations, focusing on improving system monitoring, alerting, and reporting capabilities.
- Provide technology and/or process solutions to high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards.
- Be accountable for execution in support of observability policies, processes, and architectural decisions.
- Ensure operational methods, procedures, facilities, and tools are developed in accordance with policies, well documented, and maintained.
- Monitor and research emerging observability trends and technologies with the potential to improve efficiency, security, and business capabilities.
- Develop and execute proof-of-concept projects to evaluate new solutions for potential adoption.
- Develop documentation (e.g., data flow diagrams, logical diagrams, and physical diagrams) and training in compliance with standards.
- Apply enterprise design principles and best practices for implementing and supporting observability services.
- Operate with a limited level of direct supervision and exercise independence of judgment and autonomy.
- Consistently share standard methodologies and improve processes within and across teams.
- Perform job duties in a safe manner.
- Attend work as scheduled on a consistent and regular basis.
- Perform other related duties as assigned.
Preferred Qualifications
- Advanced degree in technology or engineering is a plus.
- Experience with ITRS Geneos and Opsview is a plus.