HPC Storage Engineer

Company: IMC Trading
Location: Chicago, IL, USA
Salary: $150,000 – $225,000
Type: Full-Time
Degrees:
Experience Level: Senior

Requirements

  • 5+ years of experience in system storage architecture and system design within a large-scale compute environment
  • Experience designing and optimizing storage systems such as Lustre, GPFS, S3, BeeGFS, Weka, Ceph, VAST, PowerScale, DDN, and MinIO
  • Strong Linux engineering skills, including on bare-metal systems
  • Understanding of how the storage stack is implemented, from the kernel to user space (including file systems, block storage, I/O schedulers, VFS)
  • Storage benchmarking and performance tuning, with experience analyzing throughput, latency, IOPS, and workload-specific optimizations
  • Ability to manage large-scale, performance-critical environments, including capacity planning, scaling, and optimization
  • Knowledge of hardware components critical to storage systems, including NVMe, CPU/GPU/xPU architectures, PCIe, power utilization, NICs (Ethernet/InfiniBand), PMEM, SCM, and Redfish, and how they impact performance and scalability
  • Proficiency in programming with languages such as Rust, Python, Go, or Bash

Responsibilities

  • Design, deploy, and manage our global storage platform, ensuring high performance, massive scalability, reliability, and a future-proof design
  • Collaborate with cross-functional research and engineering teams, enabling them to leverage high-performance storage solutions, optimize data workflows, and accelerate computational workloads across global infrastructures
  • Diagnose and resolve complex storage, Linux, and networking challenges in a fast-paced environment
  • Integrate storage systems with distributed computing clusters, ensuring seamless compatibility with hardware and software across multiple locations
  • Participate in ‘follow-the-sun’ support, responding proactively to critical storage issues and ensuring continuous uptime for our operations worldwide

Preferred Qualifications

  • Kubernetes
  • CSI
  • Slurm
  • AWS
  • GCP
  • HDFS
  • AI/ML frameworks
  • Prometheus
  • Terraform/Ansible