OpenAI Engineer Clive Chan Challenges V4 Hardware Recommendations, Citing Errors and Vagueness vs. V3

Gate News message, April 24 — OpenAI engineer Clive Chan has raised detailed objections to the hardware recommendations chapter in the V4 technical report, calling it “surprisingly mediocre and error-prone” compared to the acclaimed V3 version. V3’s hardware guidance, which included Q&A sessions that became the most popular discussion topic at the ISCA academic conference, offered specific recommendations aligned with industry interconnect standards. V4, by contrast, is far more vague.

Chan systematically challenged three key recommendations. On power consumption, the report suggests that software optimization allows chips to run compute, storage, and communication at full capacity simultaneously, and recommends that chip manufacturers reserve additional power headroom. Chan argues this is counterproductive: total chip power is constrained by physical process limitations, so reserving more power margin only reduces operating frequency, ultimately decreasing computational performance. Regarding GPU-to-GPU data transfer, the report advocates a pull model—where GPUs actively fetch data—over a push model, citing high notification overhead in push operations. Chan disputes this, contending that pull is actually slower and that improved network adapter capabilities would be preferable. However, the two may be discussing different layers of the issue: the report addresses notification mechanism overhead, while Chan refers to transmission latency itself.

On activation functions, the report recommends replacing SwiGLU with simpler functions to reduce computational burden. Chan sees no merit in this, noting that Sonic MoE has already demonstrated optimal performance using SwiGLU. Chan suspects DeepSeek may have “deliberately weakened this section.”

免責聲明:本頁面資訊可能來自第三方,不代表 Gate 的觀點或意見。頁面顯示的內容僅供參考,不構成任何財務、投資或法律建議。Gate 對資訊的準確性、完整性不作保證,對因使用本資訊而產生的任何損失不承擔責任。虛擬資產投資屬高風險行為,價格波動劇烈,您可能損失全部投資本金。請充分了解相關風險,並根據自身財務狀況和風險承受能力謹慎決策。具體內容詳見聲明

相關文章

CZ 表示 YZi Labs 在 Consensus Miami 2026 將 70% 分配給區塊鏈,20% 分配給 AI

根據 ChainCatcher 報導,在 Consensus Miami 2026 上,趙長鵬(CZ)表示 YZi Labs 將 70% 的資金投向區塊鏈、20% 投向 AI、10% 投向生物科技。CZ 進一步補充,BNB 應被定位為 AI 代理的原生貨幣,且所有區塊鏈都需要「AI ready」以支援

GateNews1小時前

Public 收購 AI 投資平台 Treasury App 以擴大加密貨幣交易

根據 ChainCatcher,Public 宣布收購 AI 投資服務平台 Treasury App,以強化其由 AI 驅動的經紀業務。收購金額未予披露。Public 目前支援交易股票、債券與加密貨幣,包括 Bitcoin、Ethereum、a

GateNews2小時前

Blitzy 完成 $200M 融資輪,領投方為 Northzone

根據 ChainCatcher,Blitzy(一家由前 Nvidia 架構師 Sid Pardeshi 共同創立的 AI 編碼公司)已完成一輪由 Northzone 領投的 2 億美元 融資。Battery Ventures、Jump Capital 和 Morgan Creek Digital 參與了該輪融資。該平台可以解析複雜系統,並

GateNews3小時前

歐盟在 5 月 7 日禁止 AI 生成的深偽色情內容

根據新華社,5 月 7 日,歐洲議會議員與成員國達成共識,禁止人工智慧系統生成深度偽造色情內容。該禁令將被納入對 2024 年《人工智慧法案》的修正之中。歐洲議會

GateNews4小時前

Tether 發布 QVAC MedPsy 醫療 AI 模型,並在 17B 參數版本上取得 62.62 分

根據 Odaily,Tether AI Research Group 釋出了 QVAC MedPsy,這是一款醫療 AI 模型,旨在不依賴雲端的情況下可在智慧型手機與穿戴式裝置上本地運行。這款 17 億參數版本在七項醫療基準上取得 62.62 分,表現優於 Google 的 MedGemma-1.5-4B,領先 11.42 poi

GateNews4小時前

B.AI API 推出四款新模型,包括在 OpenAI 發布後 48 小時內推出的 GPT-5.5 Instant

B.AI API 已推出四款新模型:GPT-5.5 Instant、DeepSeek-v3.2、MiniMax-M2.7 和 GLM-5.1。GPT-5.5 Instant 在 OpenAI 發布後 48 小時內完成底層適配與介面整合,實現零延遲存取以

GateNews4小時前
留言
0/400
暫無留言