Xiaomi Open-Sources the MiMo-V2.5 Series: 1T Parameters, Better Token Efficiency Than GPT-5.4

Gate News, April 27 — Xiaomi's MiMo team has open-sourced the MiMo-V2.5 series of large language models under the MIT license, permitting commercial deployment, continued training, and fine-tuning. Both models feature a 1-million-token context window. MiMo-V2.5-Pro is a text-only mixture-of-experts (MoE) model with 1.02 trillion total parameters and 42 billion active parameters, while MiMo-V2.5 is a native multimodal model with 310 billion total parameters and 15 billion active parameters, supporting text, image, video, and audio understanding.

MiMo-V2.5-Pro targets complex agent and programming tasks. On the ClawEval benchmark, it achieved 64% Pass@3 while consuming roughly 70,000 tokens per task trajectory, 40% to 60% fewer tokens than Claude Opus, Gemini 3.1 Pro, and GPT-5.4. The model scored 78.9 on SWE-bench Verified. In a demonstration, V2.5-Pro independently implemented a complete SysY-to-RISC-V compiler for a Peking University compiler course project in 4.3 hours with 672 tool calls, achieving a perfect score of 233/233 on the hidden test sets.
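Taken at face value, the "40% to 60% fewer tokens" claim implies the compared models spend roughly 117,000 to 175,000 tokens on the same trajectories. A quick back-of-envelope check (our arithmetic, not figures from the release):

```python
# Implied competitor token budgets if MiMo-V2.5-Pro's ~70k tokens
# represents a 40%-60% reduction: competitor = mimo / (1 - reduction).
mimo_tokens = 70_000

for reduction in (0.40, 0.60):
    implied = mimo_tokens / (1 - reduction)
    print(f"{reduction:.0%} fewer -> competitor ~ {implied:,.0f} tokens")
```

So the headline efficiency claim amounts to competitors spending roughly 1.7x to 2.5x as many tokens per task.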

MiMo-V2.5 is designed for multimodal agent scenarios, equipped with a dedicated vision encoder (729 million parameters) and audio encoder (261 million parameters), and scores 62.3 on the ClawEval general subset. Both models employ a hybrid architecture combining sliding window attention (SWA) and global attention (GA), paired with a 3-layer multi-token prediction (MTP) module for accelerated inference. Model weights are available on Hugging Face.
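The SWA/GA hybrid trades full-context attention in most layers for a fixed local window, which keeps per-layer attention cost linear in sequence length. A minimal sketch of the two mask types (the window size and layer interleaving here are illustrative assumptions, not details from the release):

```python
import numpy as np

def causal_global_mask(seq_len: int) -> np.ndarray:
    """Full causal mask: each token attends to itself and all earlier tokens."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def causal_sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal mask restricted to the most recent `window` tokens."""
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    return (j <= i) & (j > i - window)

seq_len, window = 8, 3
swa = causal_sliding_window_mask(seq_len, window)
ga = causal_global_mask(seq_len)

# A SWA row attends to at most `window` keys, regardless of position;
# a GA row's attended set grows linearly with position.
assert swa.sum(axis=1).max() == window
assert ga.sum(axis=1)[-1] == seq_len
```

In hybrid designs of this kind, most layers use the sliding-window mask and a minority use the global mask so that long-range information can still propagate; which ratio MiMo-V2.5 uses is not stated in the announcement.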

Alongside the open-source release, the MiMo team launched the “Orbit Quadrillion Token Creator Incentive Program,” offering 100 quadrillion tokens free over 30 days to global users. Individual developers, teams, and enterprises can apply via the program page with an evaluation cycle of approximately 3 business days; approved benefits are distributed as Token Plans or direct credits, compatible with tools like Claude Code and Cursor.

