Staff AI/ML Engineer – Large-Scale & Low-Precision AI

About Us: We build power-efficient, low-precision foundation models designed to run anywhere from edge devices to large-scale deployments. We train models ranging from roughly 1B to 100B+ parameters across LLMs, diffusion models, and other modalities, with a strong emphasis on efficient training, inference, and real-world deployment under power and memory constraints.

Role Overview: We are seeking a Staff-level (or higher) model training expert to lead large-scale model training with a focus on low-precision and power efficiency. This role combines hands-on ownership of 100B+ parameter training runs on TPUs using JAX with responsibility for setting technical direction, mentoring engineers, and raising training quality across the organization.

Responsibilities: You will design, implement, and debug distributed training pipelines on TPUs using JAX across all major training phases, while driving efficiency and stability at scale. Core responsibilities include:

  • Leading pretraining, supervised fine-tuning (SFT), reinforcement learning (RL/RLHF), and post-training optimization
  • Designing data curation strategies including filtering, deduplication, dataset mixing, and curriculum design
  • Applying and evaluating post-training quantization (PTQ) and quantization-aware training (QAT) techniques for low-precision, power-efficient deployment
  • Optimizing convergence, throughput, memory usage, and numerical stability using advanced optimizers and parallelism strategies
  • Translating state-of-the-art research into reliable production training systems
  • Providing technical leadership through mentoring, design reviews, and cross-team collaboration

Basic Qualifications: You bring deep experience in large-scale ML systems and a strong foundation in modern model training, including:

  • 8–10+ years of experience in machine learning or AI
  • Strong Python programming skills with production-quality code
  • Hands-on experience training multi-billion-parameter models
  • Solid understanding of optimization, distributed training, and training dynamics
  • Experience with LLM training phases including pretraining, SFT, and RL-based methods
  • A demonstrated ability to mentor and technically lead other ML engineers

Preferred Qualifications: You have additional experience that directly aligns with our efficiency-focused mission, including:

  • Training very large models in the 100B+ parameter range
  • Deep experience with TPUs and JAX, including XLA and SPMD optimization
  • Hands-on application of PTQ and QAT for low-precision models
  • Familiarity with edge or power-constrained deployment targets
  • Experience with advanced optimizers, scaling laws, and compute-efficient training
  • Prior contributions to research efforts or open-source ML systems

Ideal Candidate Profile: You have personally trained large models end-to-end, understand why large-scale training runs fail and how to fix them, care deeply about efficiency and real-world deployment, enjoy mentoring others, and are comfortable operating at the intersection of research, systems engineering, and product constraints.