DeepSeek Open-Sources TileKernels, GPU Kernel Library for Large Model Training and Inference

Gate News message, April 23: DeepSeek has open-sourced TileKernels, a GPU kernel library written in TileLang for large language model training and inference, under the MIT license. TileLang is a domain-specific language developed by the tile-ai team for expressing high-performance GPU kernels in Python. DeepSeek said that most kernels in the library approach the hardware limits on compute density and memory bandwidth, and that some are already deployed in its internal training and inference workloads.

The library groups its kernels into six categories: MoE (Mixture of Experts) gating and routing, including top-k expert selection, token-to-expert mapping, and fused expand/shrink with weight normalization; quantization to FP8, FP4, and E5M6 formats at per-token, per-block, and per-channel granularity, including fused SwiGLU+quantization operations; batch transpose; Engram gating with fused RMSNorm forward and backward passes and weight gradient reduction; Manifold HyperConnection with Sinkhorn normalization and mixed split/apply; and high-level autograd interfaces that wrap the low-level kernels into trainable layers.
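To make the first category concrete, the following is a minimal pure-Python sketch of top-k expert selection with weight normalization and a token-to-expert mapping. It illustrates the general technique only and is not TileKernels code: the function names `topk_gate` and `token_to_expert_map` are hypothetical, and the use of a softmax over the gating scores before selection is an assumption, since the article does not describe how the library scores experts.

```python
import math


def topk_gate(logits, k):
    """Pick the k highest-scoring experts for one token.

    Hypothetical helper: returns (expert_index, weight) pairs whose
    weights are renormalized to sum to 1.
    """
    # Softmax over all experts (assumed scoring rule); subtract the max
    # for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the k largest probabilities, highest first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Weight normalization: rescale the selected weights to sum to 1.
    selected_sum = sum(probs[i] for i in top)
    return [(i, probs[i] / selected_sum) for i in top]


def token_to_expert_map(batch_logits, k, num_experts):
    """Group token indices by the experts they were routed to."""
    mapping = {expert: [] for expert in range(num_experts)}
    for token, logits in enumerate(batch_logits):
        for expert, _weight in topk_gate(logits, k):
            mapping[expert].append(token)
    return mapping


if __name__ == "__main__":
    # Two tokens, four experts, route each token to its top 2 experts.
    batch = [[2.0, 1.0, 0.1, -1.0], [0.0, 3.0, 0.5, 2.5]]
    print(topk_gate(batch[0], 2))
    print(token_to_expert_map(batch, 2, 4))
```

A GPU kernel library would typically fuse these steps and run them across all tokens in parallel; the sketch processes one token at a time purely for readability.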

Engram and Manifold HyperConnection are proprietary components of DeepSeek’s model architecture, and this release discloses their implementation details publicly for the first time. The library requires an NVIDIA GPU with the SM90 or SM100 architecture (H100/H200 or the Blackwell series), CUDA Toolkit 13.1 or higher, and PyTorch 2.10 or higher.
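The stated requirements can be expressed as a small pre-flight check. The sketch below is an illustration, not part of TileKernels: `check_environment` and `parse_version` are hypothetical helpers, and treating SM90 and SM100 as compute capabilities (9, 0) and (10, 0) with an exact-match rule is an assumption based only on the architectures named above.

```python
import re

# Requirements as reported in the article.
SUPPORTED_SM = {(9, 0), (10, 0)}  # SM90 and SM100 (assumed exact match)
MIN_CUDA = (13, 1)
MIN_TORCH = (2, 10)


def parse_version(text):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(capability, cuda_version, torch_version):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if tuple(capability) not in SUPPORTED_SM:
        problems.append(f"GPU compute capability {tuple(capability)} is not SM90 or SM100")
    # Compare as integer tuples so that 2.10 correctly ranks above 2.9.
    if parse_version(cuda_version) < MIN_CUDA:
        problems.append(f"CUDA Toolkit {cuda_version} is older than 13.1")
    if parse_version(torch_version) < MIN_TORCH:
        problems.append(f"PyTorch {torch_version} is older than 2.10")
    return problems


if __name__ == "__main__":
    print(check_environment((9, 0), "13.1", "2.10.0"))  # no problems expected
    print(check_environment((8, 0), "12.4", "2.9.1"))   # three problems expected
```

On a machine with PyTorch installed, the three arguments could be taken from `torch.cuda.get_device_capability()`, `torch.version.cuda`, and `torch.__version__`; they are passed in explicitly here so the check runs without a GPU.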

