The first commercially viable models with 1-bit weights. Available in 8B, 4B, and 1.7B sizes, these models were engineered for robotics, real-time agents, and edge computing. They have a 14× smaller footprint than their full-precision counterparts, run 8× faster, and are 5× more energy efficient, while matching leading models at similar parameter counts on benchmarks. This results in over 10× the intelligence density of full-precision equivalents¹.
Ternary Bonsai models use {-1, 0, 1} weights to deliver a powerful balance between model quality and deployment efficiency. Available in 8B, 4B, and 1.7B sizes, these models have a 9× smaller footprint than full-precision counterparts and run roughly 5× faster, while delivering substantially stronger benchmark performance than most models at similar parameter counts. This creates a compelling tradeoff between capability and efficiency².
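For a rough sense of where footprint figures like these come from, the sketch below computes the idealized size reduction for 1-bit and ternary weights. It assumes a 16-bit full-precision baseline and ignores scale factors, embeddings, and any layers kept at higher precision; the helper names `ideal_compression` and `weights_gib` are illustrative and not part of any Bonsai tooling.

```python
import math

FP16_BITS = 16  # assumption: the full-precision baseline stores 16 bits per weight


def ideal_compression(levels: int, baseline_bits: int = FP16_BITS) -> float:
    """Upper bound on size reduction when each weight takes one of `levels` values."""
    bits_per_weight = math.log2(levels)  # information-theoretic minimum per weight
    return baseline_bits / bits_per_weight


def weights_gib(params: float, bits_per_weight: float) -> float:
    """Weight storage in GiB, ignoring scales, embeddings, and other overhead."""
    return params * bits_per_weight / 8 / 2**30


one_bit = ideal_compression(2)  # two levels: exactly 1 bit per weight
ternary = ideal_compression(3)  # three levels {-1, 0, 1}: about 1.58 bits per weight

print(f"1-bit ideal reduction:   {one_bit:.1f}x")
print(f"ternary ideal reduction: {ternary:.1f}x")
print(f"8B weights at 1 bit:     {weights_gib(8e9, 1):.2f} GiB")
```

Under these assumptions the ceilings are 16× for 1-bit and about 10.1× for ternary weights. The quoted 14× and 9× sit below those ceilings, which would be consistent with some storage overhead beyond the quantized weights, though the exact accounting is not specified here.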
Available in 1-bit and Ternary variants, Bonsai Image 4B brings high-quality image generation to everyday devices. Built for local inference on iPhone, Mac, and GPUs, it reduces the diffusion transformer size by up to 8× and speeds image generation by up to 5.6× versus its full-precision counterparts, while preserving strong visual quality. The result is a lighter, more deployable image generation model.