Job description
We’re partnering with a cutting-edge AI startup building next-generation infrastructure to power large-scale, intelligent systems. Their mission is to bridge the gap between world-class AI research and production-grade deployment - enabling faster experimentation, high-performance inference, and reliable large-scale training.
As a Member of Technical Staff (ML Infrastructure), you’ll design and scale the systems that keep state-of-the-art AI running - from distributed training clusters and inference engines to agentic frameworks and post-training pipelines. You’ll work alongside a small, elite team of researchers and engineers who move fast, think big, and take full ownership of their work.
What You’ll Do
- Design, build, and optimize high-performance ML infrastructure for large-scale training, inference, and evaluation.
- Develop and maintain distributed systems that power large compute clusters and AI networking.
- Streamline research workflows and accelerate experimentation by improving data pipelines (data collection, loading, SFT, RL).
- Enhance inference performance across both open-source and proprietary inference engines.
- Establish strong engineering practices for observability, reliability, and scalability.
- Collaborate with researchers and product teams to translate cutting-edge ideas into robust, production-ready systems.
What We’re Looking For
- Deep expertise in one or more of the following: inference optimization, GPU performance, cluster scheduling, or large-scale infrastructure.
- Strong experience with modern ML frameworks (e.g., PyTorch, vLLM, Verl).
- Startup-ready mindset - high ownership, adaptability, and comfort working in fast-moving environments.
- Passion for bridging research and real-world impact.
Why This Role
- High impact: You’ll ship meaningful work in weeks, not months.
- Elite team: Work alongside ex-founders, top AI researchers, and engineers from leading tech companies.
- Momentum: Well-funded, fast-growing, and laser-focused on building and shipping real products powered by cutting-edge AI.