HPC Storage Engineer
| Company | IMC Trading |
|---|---|
| Location | Chicago, IL, USA |
| Salary | $150,000 – $225,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior |
Requirements
- 5+ years of experience in system storage architecture and system design within a large-scale compute environment
- Storage system design and optimization with systems like Lustre, GPFS, S3, BeeGFS, Weka, Ceph, Vast, PowerScale, DDN, MinIO
- Strong Linux engineering skills, including on bare-metal systems
- Understanding of how the storage stack is implemented from the kernel to user space (including file systems, block storage, I/O schedulers, VFS)
- Storage benchmarking and performance tuning, with experience analyzing throughput, latency, IOPS, and workload-specific optimizations
- Ability to manage large-scale, performance-critical environments, including capacity planning, scaling, and optimization
- Knowledge of hardware components critical to storage systems, including NVMe, CPU/GPU/xPU architectures, PCIe, power utilization, NICs (Ethernet/InfiniBand), PMEM, SCM, and Redfish, and how they impact performance and scalability
- Proficiency in programming with languages such as Rust, Python, Go, or Bash
Responsibilities
- Design, deploy, and manage our global storage platform, ensuring high performance, massive scalability, reliability, and future-proof design
- Collaborate with cross-functional research and engineering teams, enabling them to leverage high-performance storage solutions, optimize data workflows, and accelerate computational workloads across global infrastructures
- Diagnose and resolve complex storage, Linux, and networking challenges in a fast-paced environment
- Integrate storage systems with distributed computing clusters, ensuring seamless compatibility with hardware and software across multiple locations
- Participate in ‘follow-the-sun’ support, responding proactively to critical storage issues and ensuring continuous uptime for our operations worldwide
Preferred Qualifications
- Kubernetes
- CSI
- Slurm
- AWS
- GCP
- HDFS
- AI/ML frameworks
- Prometheus
- Terraform/Ansible