About Us
We build high-performance foundation models designed to run efficiently across a wide range of environments, from edge devices to large-scale deployments. Our work spans models from ~1B to 100B+ parameters, including LLMs, diffusion models, and models for other modalities, with a strong focus on scalable training, efficient inference, and real-world deployment.
Role Overview
We are seeking a Staff-level (or higher) AI/ML engineer with expertise in multimodal systems to lead the development of capabilities that expand consumer use cases and product opportunities. This role focuses on building and integrating vision, speech, and other modalities into our core models, while providing technical leadership across AI/ML systems.
Responsibilities
You will design, build, and integrate multimodal components optimized for performance, quality, and real-world deployment. Key responsibilities include:
- Developing and integrating vision and multimodal components for large language models (LLMs)
- Building or integrating speech systems, including speech recognition (speech-to-text) and speech synthesis (text-to-speech)
- Optimizing multimodal pipelines for latency, efficiency, and deployment across a range of hardware environments
- Collaborating with model training and product teams to identify high-impact product opportunities
- Translating cutting-edge multimodal research into robust, production-ready AI/ML systems
- Mentoring engineers and establishing best practices for multimodal development
Basic Qualifications
You have a strong background in multimodal AI/ML systems and technical leadership:
- 8–10+ years of experience in machine learning or AI, or a strong publication record
- Strong Python programming skills and a track record of writing production-quality code
- Hands-on experience building and training multimodal models
- Solid understanding of computer vision, speech processing, and how modern AI/ML models learn and use representations (e.g., embeddings, feature extraction)
- Experience integrating AI/ML systems into production environments
- Proven ability to mentor and lead other AI/ML engineers
Preferred Qualifications
You bring experience that aligns with building efficient, consumer-facing AI systems:
- Experience developing multimodal components for large-scale models
- Experience building or integrating automatic speech recognition (ASR) and text-to-speech (TTS) systems
- Familiarity with generative models, including diffusion-based approaches
- Experience improving model performance and efficiency across different modalities
- Familiarity with deployment constraints in real-world environments, including latency, cost, and hardware limitations
- Experience contributing to product-driven AI development or launching consumer-facing features
Ideal Candidate Profile
You enjoy turning advanced AI research into usable products, understand how multimodal systems unlock new user experiences, and think carefully about performance, efficiency, and deployment trade-offs. You take ownership of technical direction, work effectively across research and product teams, and actively support the growth of other AI/ML engineers.