Compute Operations Lead

Company: Leidos
Location: Washington, DC, USA
Salary: $104,650 – $189,175
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor's degree with 8+ years of prior relevant experience. Additional years of experience will be considered in lieu of a degree
  • 2+ years of formal or informal leadership experience, including project management. Experience supervising and mentoring System Administrators preferred
  • Foundational knowledge of Windows, Red Hat, and storage platforms
  • Experience building new servers (Physical and Virtual)
  • Experience troubleshooting issues in a growing, fast-paced environment
  • Experience with log reviews, incident analysis, and identification of issue trends
  • Knowledge of ITSM systems (ServiceNow)
  • Experience with server patch management methodologies
  • Time management skills
  • Strong oral and written communication skills
  • Track record of working effectively within a team and supporting peers toward improved processes and results
  • Candidate must, at a minimum, be able to meet IAT Level II certification requirements (currently Security+ CE, CCNA-Security, GSEC, or SSCP)
  • Experience supporting Windows Server 2019 and later

Responsibilities

  • Lead and manage daily operations, including incident response and maintenance, for a 24/7/365 enterprise server and storage infrastructure on-premises and in the cloud.
  • Lead and manage compute team members and contractors.
  • Maintain positive, constructive, and professional communication with the customer, which includes considering and executing their requests in the name of customer service.
  • Effectively manage ServiceNow queues for the compute team, ensuring that all tickets are assigned in a timely manner and all SLA requirements for ticket handling are followed.
  • Ensure comprehensive documentation is created and maintained for infrastructure, processes, and system configurations.
  • Ensure that all compute-related vulnerabilities are mitigated within timeframes established by SLAs and that the environment is configured to comply with FTC standards and regulations (e.g., CIS 3.0).
  • Maintain all operating systems and databases to version N or N-1 and ensure that future resource needs related to capacity or EOS/EOL status are communicated to the customer.
  • Address any technical issues or escalations, ensuring that any critical incidents related to compute resources are resolved quickly and efficiently.
  • Proactively collaborate with other teams (Network, Security) and maintain an open line of communication.
  • Work with customers to define project requirements, deliverables, and timelines, and ensure the team stays aligned with these objectives. Ensure project schedules are maintained and up to date.
  • Track individual and team performance, provide constructive feedback, and provide general system administration supervision and mentoring.
  • Effectively manage and project compute capacity and provide recommendations for management of both compute and storage resources.
  • Manage team workload and resources (personnel and hardware/software) to ensure operational support requirements are met and project timelines are adhered to.
  • Coordinate with senior leadership, customers, and stakeholders to collect data, conduct analysis, and develop and implement solutions associated with incident tickets and requirements.
  • Maintain consistent backups for servers and storage and participate in COOP/DR exercises as needed or required by the agency.
  • Develop solutions to complex technical issues.
  • Provide follow-up reports (technical findings, feedback, resolution steps taken) for root cause analysis, engineering technical assessments, and process improvement initiatives.

Preferred Qualifications

  • Experience with Ivanti Patch Management
  • Experience with NetApp
  • Experience with RHEL 8/9
  • Experience with Windows Server 2019 and later
  • Experience with Azure GovCloud