Lead ML&AI Operations Engineer - Evinova

Company	AstraZeneca
Location	Gaithersburg, MD, USA
Salary	$119,628 – $179,442
Type	Full-Time
Degrees	Bachelor’s
Experience Level	Senior, Expert or higher

Requirements

Minimum of 5 years in ML/AI engineering.
HS Diploma and 8 years of experience in Engineering/IT solutions OR BA/BS Degree and 5 years of experience or equivalent capabilities.
Proven track record of designing and implementing ML/AI end-to-end workflows.
Demonstrated ability to identify real business and product opportunities and implement cutting-edge technologies and methodologies in production environments to drive tangible and quantifiable product and business value.
Experience working with diverse teams to achieve product and organizational objectives through automation.
Deep understanding of the Data Science Lifecycle (DSLC) and the ability to shepherd data science projects from inception to production within the platform architecture and development process (AWS TypeScript CDK, Argo CD, Helm, Kubernetes).
Expertise in RAG methodologies and AWS Bedrock, SageMaker, Lex and OpenSearch for developing AI and automation solutions.
Expertise in, and up-to-date awareness of, the latest GenAI frameworks and tools such as DSPy, Letta, LlamaIndex, and LangChain.
Advanced coding skills in relevant programming languages (Python/JavaScript/TypeScript) and frameworks.
CI/CD and cloud operations skills comparable to those of a Cloud Solutions Architect or DevOps engineer, with additional expertise in data science and ML/AI engineering.

Responsibilities

Lead by example in creating high-performance, mission-focused and interdisciplinary teams/culture founded on trust, mutual respect, growth mindsets, and an obsession for building extraordinary products with extraordinary people.
Lead by example in using reactive firefighting to drive the creation of proactive capability and process enhancements that ensure enduring value creation and analytic compounding interest.
Design and implement resilient cloud ML/AI operational capabilities to maximize our system A-bilities (Learnability, Flexibility, Extendibility, Interoperability, Scalability).
Drive cost efficiency, optimized system performance, and risk mitigation with a data-driven strategy, analytics, and predictive capabilities for the systems we build, manage and govern.
Design and scale GenAI applications using AWS services and bleeding-edge GenAI stacks and frameworks.
Ensure AI projects have principled and methodical validation pathways that adhere to our platform development lifecycle, so that we either fail fast or get to market as quickly and efficiently as possible.
Maintain and teach an advanced understanding of rapidly evolving frameworks and techniques such as DSPy, Letta (formerly MemGPT), LlamaIndex, LangChain, vector DBs, and RAG, along with emerging innovations such as Cache-Augmented Generation (CAG), to enhance AI capabilities across engineering and science organizations.
Apply and teach advanced prompt engineering techniques to improve AI model interactions and outputs, and evangelize the ‘prompts as programs’ philosophy and its management and governance implications.
Embed deeply within and across product, design, science and business teams to build AI capabilities that have tangible and quantifiable product and business impact.

Preferred Qualifications

7+ years in ML/AI operations engineering roles.