Staff AI/ML Engineer – Edge & Consumer AI

About Us: We build power-efficient, low-precision foundation models designed to run from edge devices to large-scale deployments. We train models ranging from roughly 1B to 100B+ parameters across LLMs, diffusion models, and other modalities, with a strong emphasis on efficient training, inference, and real-world deployment under power and memory constraints.

Role Overview: We are seeking a Staff-level (or higher) multimodal expert to lead the development of multimodal capabilities that expand monetization and consumer reach for our edge-optimized models. This role focuses on building vision, speech, and other modality components that integrate tightly with our core models, while providing technical leadership across multimodal systems.

Responsibilities: You will design, build, and integrate multimodal components optimized for efficiency, quality, and deployability. Key responsibilities include:

  • Building vision towers and multimodal encoders for LLMs and generative models
  • Developing or integrating speech-to-text and text-to-speech systems
  • Optimizing multimodal pipelines for latency, power efficiency, and edge deployment
  • Collaborating with model training and product teams to identify high-impact monetization paths
  • Translating multimodal research into production-ready systems
  • Mentoring engineers and setting technical standards for multimodal development

Basic Qualifications: You have a strong background in multimodal ML systems and technical leadership, including:

  • 8+ years of experience in machine learning or AI
  • Strong Python programming skills
  • Hands-on experience building and training multimodal models
  • Solid understanding of vision, speech, and representation learning fundamentals
  • Experience integrating multimodal systems into production environments
  • A proven ability to mentor and technically lead other ML engineers

Preferred Qualifications: You bring experience that directly supports consumer and edge-focused AI products, including:

  • Building vision towers for LLMs or multimodal foundation models
  • Developing or integrating ASR and TTS systems
  • Working with diffusion or generative multimodal models
  • Applying quantization or low-precision techniques to non-text modalities
  • Familiarity with edge deployment constraints and consumer-facing AI products
  • Driving technical work that directly enabled product monetization

Ideal Candidate Profile: You enjoy turning foundation models into usable products, understand how multimodal systems unlock consumer value, think deeply about efficiency and deployment constraints, and naturally take ownership of technical direction while helping other engineers grow.