AI Engineer
Company | Xoul |
---|---|
Location | San Francisco, CA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | |
Experience Level | Junior, Mid Level |
Requirements
- Deep familiarity with supervised and unsupervised learning techniques, including traditional and deep-learning methods.
- Experience with pipeline tools like Kafka, Dagster/Airflow, Flink, or equivalent streaming and batch processing systems.
- Hands-on experience with fine-tuning and customizing state-of-the-art models (LLMs, transformers, diffusion models, etc.) for production scenarios.
- Proficiency with performance-optimized inference engines such as TensorRT and FlashInfer, plus direct experience programming CUDA kernels.
- Comfortable deploying models using frameworks like vLLM, Triton, or similar deployment environments.
- Experience deploying, scaling, and maintaining AI/ML systems in production environments serving real-world users.
- Prior research experience is encouraged.
- A background in traditional software development and system design is a plus.
Responsibilities
- Relentlessly tackle novel AI/ML challenges—fine-tuning state-of-the-art models, creating innovative algorithms, and pushing boundaries.
- Handle AI/ML pipelines from data ingestion and preprocessing to inference optimization and deployment strategies.
- Debug, optimize, and proactively ensure your AI models and data pipelines perform robustly at scale in production.
- Actively flesh out project specs alongside executives, teammates, and product engineers.
- Anticipate and accommodate user needs in your AI systems.
- Clearly communicate progress, challenges, and blockers.
Preferred Qualifications
- Being an LLM whisperer (someone exceptionally skilled at coaxing maximum performance out of LLMs) is a significant bonus.