Kairos
Kairos 3.0-4B is an open-source generative world model designed from the ground up for embodied intelligence, developed by ACE Robotics. With only 4 billion parameters, it achieves real-time, physics-consistent video prediction (1:1.5 speed ratio on the THOR platform) and runs 72x faster than NVIDIA's Cosmos 2.5-14B at comparable or better benchmark scores.

The model uses a unified "understanding-generation-prediction" architecture that maps visual representations directly to action outputs, eliminating the traditional pipeline of first generating future video and then inferring actions from it. It introduces the first hybrid linear attention operator designed specifically for world models, reducing temporal complexity from O(n²) to O(n). Kairos 3.0-4B integrates three data sources: physical-law Chain-of-Thought text, human behavior data, and real robot interaction data. It achieves state-of-the-art results on PAI-Bench (80.03 on the robot subset) and WorldModelBench, matching or exceeding models 3-7x larger.

The model is released in four variants: 480P pretrained, 480P robot fine-tuned, 480P distilled (for edge deployment), and 720P HD. It runs on NVIDIA GPUs (A800, RTX 5090) and supports Chinese domestic GPUs (MetaX, Hygon, Biren). It is compatible with the Agibot G1, Unitree G1, and Songling PIPER robot platforms for zero-shot cross-embodiment generalization.
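
The O(n²) to O(n) reduction mentioned above can be illustrated with generic kernelized linear attention. This is a minimal sketch, not the Kairos hybrid operator, whose details are not given here. It assumes that n is the number of tokens in the sequence, and it uses an `elu(x) + 1` feature map and non-causal attention chosen purely for illustration. All function names are hypothetical. The idea is that reassociating the matrix product avoids ever building the n x n similarity matrix.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1: a positive feature map (illustrative assumption,
    # not taken from the Kairos release).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def quadratic_attention(q, k, v):
    """Reference form: materializes the full (n, n) matrix, O(n^2 * d)."""
    fq, fk = feature_map(q), feature_map(k)
    scores = fq @ fk.T                                  # (n, n)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v                                  # (n, d_v)


def linear_attention(q, k, v):
    """Same result, reassociated as fq @ (fk.T @ v), O(n * d * d_v)."""
    fq, fk = feature_map(q), feature_map(k)
    kv = fk.T @ v                                       # (d, d_v) summary of keys/values
    z = fk.sum(axis=0)                                  # (d,) normalizer
    return (fq @ kv) / (fq @ z)[:, None]                # (n, d_v)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(256, 16)) for _ in range(3))
    print(np.allclose(quadratic_attention(q, k, v), linear_attention(q, k, v)))
```

Both functions compute the same output. The linear form only ever stores a d x d_v summary of the keys and values, so its cost grows linearly with sequence length rather than quadratically.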