
DeepSeek Launches NSA for Ultra-Fast Long Context Training and Inference

2025-02-18 08:34:43
On February 18, DeepSeek introduced NSA (Native Sparse Attention). According to DeepSeek, NSA is a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. DeepSeek says that, with a design optimized for modern hardware, NSA speeds up inference and reduces pre-training costs without compromising performance, and that it matches or outperforms full-attention models on general benchmarks, long-context tasks, and instruction-based reasoning.