Lead – DevOps Support Engineering

Company: Magna
Location: Lowell, MA, USA
Salary: Not provided
Type: Full-Time
Degrees: Not provided
Experience Level: Senior, Expert or higher

Requirements

  • 5+ years of experience in DevOps, SRE, or L2 technical support roles.
  • Experience creating and tracking tasks for L2 DevOps engineers to drive operational efficiency.
  • Strong expertise in automating support processes and troubleshooting complex systems.
  • Proficiency in scripting (Bash, Python, or similar) for automation and monitoring.
  • Hands-on experience with monitoring and logging tools (Prometheus, Grafana, ELK, Datadog, etc.).
  • Solid understanding of CI/CD pipelines, infrastructure components, and cloud services (AWS, GCP, or Azure).
  • Experience with containerized environments (Docker, Kubernetes) and troubleshooting containerized applications.
  • Strong analytical skills for root cause analysis, incident resolution, and risk assessment.

Responsibilities

  • Automate L2 support processes, incident resolution, and infrastructure management.
  • Develop and maintain scripts and automation tools to enhance efficiency and reduce manual work.
  • Ensure seamless integration between infrastructure, CI/CD pipelines, and monitoring solutions.
  • Optimize deployment processes and automate recurring operational tasks.
  • Lead DevOps L2 incident response, diagnosing and resolving infrastructure and application issues.
  • Perform root cause analysis and implement proactive fixes to prevent recurring incidents.
  • Work closely with L1 and L3 teams to streamline support escalations and improve response times.
  • Troubleshoot Kubernetes, cloud infrastructure, networking, and deployment failures.
  • Design, configure, and optimize monitoring and logging dashboards (Prometheus, Grafana, ELK, etc.).
  • Improve alerting mechanisms to enhance observability and reduce noise.
  • Ensure system performance metrics are effectively tracked and visualized for proactive incident management.
  • Define and optimize support workflows for efficient issue resolution.
  • Establish escalation routes to ensure timely handling of critical incidents.
  • Evaluate risks associated with deployments and infrastructure changes, implementing mitigation strategies.
  • Assist in QA validation of infrastructure changes and automation scripts.

Preferred Qualifications

  • Experience with Infrastructure as Code (Terraform, Ansible, CloudFormation).
  • Familiarity with AWS ALB Controller, external-dns, and DNS management.
  • Exposure to service mesh (Istio, Linkerd) and Kubernetes operators.
  • Certifications such as CKA, AWS DevOps Engineer, or similar.