Feng

Tag

#attention

2 posts found

Attention Residuals: How Kimi Rethinks Depth-Wise Information Flow in LLMs
Deep Learning

Kimi's Attention Residuals paper proposes replacing fixed residual connections with learned softmax …

March 20, 2026 7min
Attention Mechanisms Compared: Standard, Linear, and Flash
Deep Learning

A deep dive comparing standard softmax attention, linear attention, and Flash Attention: their math, …

March 11, 2026 5min

© 2025  •  Feng
