- 2025.08: 📢📢📢 The survey paper has been covered by many technical media outlets, including Zhihu (知乎), Synced (机器之心), PaperWeekly, 52CV (我爱计算机视觉), Deep Learning and NLP (深度学习自然语言处理), Emergent Clustering Points (涌现聚点), Global Economic Forum (全球经济论坛), and more. People ❤️ "Speed Always Wins"!
- 2025.08: 🎉🎉🎉 We have released our survey paper Speed Always Wins: A Survey on Efficient Architectures for Large Language Models, which covers 449 papers. Please feel free to open a PR to add your Awesome-Efficient-Arch work.
- 1 Introduction
- 1.1 Background
- 1.2 Position and Contributions
- 2 Linear Sequence Modeling
- 3 Sparse Sequence Modeling
- 4 Efficient Full Attention
- 4.1 IO-Aware Attention
- 4.2 Grouped Attention
- 4.3 Mixture of Attention
- 4.4 Quantized Attention
- 5 Sparse Mixture-of-Experts
- 5.1 Routing Mechanisms
- 5.2 Expert Architectures
- 5.3 MoE Conversion
- 6 Hybrid Architectures
- 6.1 Inter-layer Hybrid
- 6.2 Intra-layer Hybrid
- 7 Diffusion Large Language Models
- 8 Applications to Other Modalities
- 8.1 Vision
- 8.2 Audio
- 8.3 Multimodality
- 9 Conclusion and Future Directions
- Log-Linear Attention
- EdgeInfinite: A Memory-Efficient Infinite-Context Transformer for Edge Devices
- MoM: Linear Sequence Modeling with Mixture-of-Memories
- ReGLA: Refining Gated Linear Attention
- MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
- Gated Slot Attention for Efficient Linear-Time Sequence Modeling
- Scaling Laws for Linear Complexity Language Models
- Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
- Linear Transformers with Learnable Kernel Functions are Better In-Context Models
- Simple linear attention language models balance the recall-throughput tradeoff
- The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry
- Transformers are Multi-State RNNs
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
- Gated Linear Attention Transformers with Hardware-efficient Training
- GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning
- Recurrent Linear Transformers
- TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
- Retentive Network: A Successor to Transformer for Large Language Models
- The Devil in Linear Transformer
- Transformer Quality in Linear Time
- ABC: Attention with Bounded-memory Control
- Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
- Random Feature Attention
- Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
- RWKV-7" Goose" with Expressive Dynamic State Evolution
- xLSTM: Extended Long Short-Term Memory
- HGRN2: Gated Linear RNNs with State Expansion
- Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
- GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling
- Hierarchically Gated Recurrent Neural Network for Sequence Modeling
- RWKV: Reinventing RNNs for the Transformer Era
- Resurrecting Recurrent Neural Networks for Long Sequences
- Mamba: Linear‑Time Sequence Modeling with Selective State Spaces
- Transformers are SSMs: Generalized Models and Efficient Algorithms through Structured State Space Duality
- Comba: Improving Nonlinear RNNs with Closed‑loop Control
- Efficiently Modeling Long Sequences with Structured State Spaces (S4)
- S5: Simplified State Space Layers for Sequence Modeling
- HiPPO: Recurrent Memory with Optimal Polynomial Projections
- Diagonal State Spaces are as Effective as Structured State Spaces
- On the parameterization and initialization of diagonal state space models
- Liquid‑S4: Liquid Structural State‑Space Models
- Longhorn: State Space Models Are Amortized Online Learners
- Time‑SSM: Simplifying and Unifying State Space Models for Time Series Forecasting
- Effectively Modeling Time Series with Simple Discrete State Spaces
- Attractor Memory for Long‑Term Time Series Forecasting: A Chaos Perspective
- How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
- LSSL: Combining Recurrent, Convolutional, and Continuous‑Time Models with Linear State‑Space Layers
- Test‑Time Regression: A Unifying Framework for Designing Sequence Models with Associative Memory
- Titans: Learning to Memorize at Test Time
- MesaNet: Sequence Modeling by Locally Optimal Test‑Time Training
- Atlas: Learning to Optimally Memorize the Context at Test Time
- Test‑Time Training Done Right
- It's All Connected: A Journey Through Test‑Time Memorization, Attentional Bias, Retention, and Online Optimization
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States
- A Brief History of Linear Attention: From Imitation and Innovation to Feeding Back (线性注意力简史:从模仿、创新到反哺)
- Comba: Improving Nonlinear RNNs with Closed‑loop Control
- Understanding Transformer from the Perspective of Associative Memory
- Test‑Time Regression: A Unifying Framework for Designing Sequence Models with Associative Memory
- MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
- It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
- On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention
- LoLA: Low-Rank Linear Attention With Sparse Caching
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
- Liger: Linearizing Large Language Models to Gated Recurrent Structures
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners
- Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing
- LoLCATs: On Low-Rank Linearizing of Large Language Models
- The Mamba in the Llama: Distilling and Accelerating Hybrid Models
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
- Linearizing Large Language Models
- DiJiang: Efficient Large Language Models through Compact Kernelization
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
- Finetuning Pretrained Transformers into RNNs
- FLA: A Triton‑Based Library for Hardware‑Efficient Implementations of Linear Attention Mechanism
- Transformers are SSMs: Generalized Models and Efficient Algorithms through Structured State Space Duality
- Comba: Improving Nonlinear RNNs with Closed‑loop Control
- Gated Delta Networks: Improving Mamba2 with Delta Rule
- DeltaNet: Parallelizing Linear Transformers with the Delta Rule over Sequence Length
- Transformers are SSMs: Generalized Models and Efficient Algorithms through Structured State Space Duality
- Gated Linear Attention Transformers with Hardware-efficient Training
- Gated Slot Attention for Efficient Linear‑Time Sequence Modeling
- LongNet: Scaling Transformers to 1,000,000,000 Tokens
- Open-Sora: Democratizing Efficient Video Production for All
- LongT5: Efficient Text-to-Text Transformer for Long Sequences
- Blockwise Self-Attention for Long Document Understanding
- Longformer: The Long-Document Transformer
- GMAT: Global Memory Augmentation for Transformers
- ETC: Encoding Long and Structured Inputs in Transformers
- BigBird: Transformers for Longer Sequences
- Axial Attention in Multidimensional Transformers
- Generating Long Sequences with Sparse Transformers
- Star-Transformer
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
- Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
- The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
- Unlimiformer: Long-Range Transformers with Unlimited Length Input
- ColT5: Faster Long-Range Transformers with Conditional Computation
- Memorizing Transformers
- Efficient Content-Based Sparse Attention with Routing Transformers
- Sparse Sinkhorn Attention
- Reformer: The Efficient Transformer
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
- Compressive Transformers for Long-Range Sequence Modelling
- Adaptive Attention Span in Transformers
- ABC: Attention with Bounded-Memory Control
- LServe: Efficient Long-Sequence LLM Serving with Unified Sparse Attention
- XAttention: Block Sparse Attention with Antidiagonal Scoring
- MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
- MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
- SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
- Transformers are Multi-State RNNs
- RetrievalAttention: Accelerating Long-context LLM Inference via Vector Retrieval
- LongHeads: Multi-Head Attention is Secretly a Long Context Processor
- ShadowKV: KV Cache in Shadows for High-throughput Long-context LLM Inference
- PQCache: Product Quantization-based KVCache for Long Context LLM Inference
- Transformers are Multi-State RNNs
- H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
- Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
- Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
- DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
- LongLoRA: Efficient Fine-tuning of Long-context Large Language Models
- SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
- Efficient Streaming Language Models with Attention Sinks
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
- SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
- MoBA: Mixture of Block Attention for Long-Context LLMs
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
- Longformer: The Long-Document Transformer
- FlashAttention: Fast and Memory‑Efficient Exact Attention with IO‑Awareness
- FlashAttention‑2: Faster Attention with Better Parallelism and Work Partitioning
- FlashAttention‑3: Fast and Accurate Attention with Asynchrony and Low‑Precision
- Fast Transformer Decoding: One Write-Head is All You Need
- GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- DeepSeek-V3 Technical Report
- Hardware-Efficient Attention for Fast Decoding
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
- MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
- SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
- MoH: Multi-Head Attention as Mixture-of-Head Attention
- LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
- MoBA: Mixture of Block Attention for Long-Context LLMs
- MoM: Linear Sequence Modeling with Mixture-of-Memories
- Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
- SageAttention: Accurate 8‑Bit Attention for Plug‑and‑Play Inference Acceleration
- SageAttention 2 Technical Report: Accurate 4‑Bit Attention for Plug‑and‑Play Inference Acceleration
- SageAttention 3: Microscaling FP4 Attention for Inference and an Exploration of 8‑Bit Training
- INT‑FlashAttention: Enabling Flash Attention for INT8 Quantization
- Q‑BERT: Hessian‑Based Ultra‑Low‑Precision Quantization of BERT
- Q8BERT: Quantized 8‑Bit BERT
- Fully Quantized Transformer for Machine Translation
- I‑BERT: Integer‑Only BERT Quantization
- BitDistiller: Unleashing the Potential of Sub‑4‑Bit LLMs via Self‑Distillation
- GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Mixture-of-Experts with Expert Choice Routing
- BASE Layers: Simplifying Training of Large, Sparse Models
- Hash Layers For Large Sparse Models
- Harder Tasks Need More Experts: Dynamic Routing in MoE Models
- Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
- Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
- AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
- MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
- ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
- BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity
- Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
- Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
- From Sparse to Soft Mixtures of Experts
- OLMoE: Open Mixture-of-Experts Language Models
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
- Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
- ModuleFormer: Modularity Emerges from Mixture-of-Experts
- LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
- MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models
- Mixture of LoRA Experts
- Mixture of A Million Experts
- MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
- Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
- LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
- LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
- DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
- MoDification: Mixture of Depths Made Easy
- Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
- Zamba: A Compact 7B SSM Hybrid Model
- Zamba2 Suite: Technical Report
- Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
- Jamba: A Hybrid Transformer-Mamba Language Model
- RWKV-X: A Linear Complexity Hybrid Language Model
- Minimax-01: Scaling Foundation Models with Lightning Attention
- The Mamba in the Llama: Distilling and Accelerating Hybrid Models
- HunYuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought
- Zebra-Llama: Towards Extremely Efficient Hybrid Models
- YOCO: You Only Cache Once: Decoder-Decoder Architectures for Language Models
- RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
- LaCT: Test-time Training Done Right
- Hymba: A Hybrid-head Architecture for Small Language Models
- WuNeng: Hybrid State with Attention
- TransMamba: Flexibly Switching between Transformer and Mamba
- Liger: Linearizing Large Language Models to Gated Recurrent Structures
- LoLCATs: On Low-Rank Linearizing of Large Language Models
- LoLA: Low-Rank Linear Attention With Sparse Caching
- Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance
- Large Language Diffusion Models
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
- Likelihood-Based Diffusion Language Models
- DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
- Diffusion-LM Improves Controllable Text Generation
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models
- LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
- Unified Multimodal Discrete Diffusion
- LaViDa: A Large Diffusion Language Model for Multimodal Understanding
- MMaDA: Multimodal Large Diffusion Language Models
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
- AuroraLong: Bringing RNNs Back to Efficient Open-Ended Video Understanding
- Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning
- RSMamba: Remote Sensing Image Classification with State Space Model
- InsectMamba: State Space Model with Adaptive Composite Features for Insect Recognition
- SpectralMamba: Efficient Mamba for Hyperspectral Image Classification
- MedMamba: Vision Mamba for Medical Image Classification
- MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models
- MemoryMamba: Memory-Augmented State Space Model for Defect Recognition
- Scaling Vision with Sparse Mixture of Experts
- Robust Mixture-of-Expert Training for Convolutional Neural Networks
- Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
- Fusion-Mamba for Cross-Modality Object Detection
- SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients
- Mamba YOLO: SSMs-based YOLO for Object Detection
- MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection
- Voxel Mamba: Group-Free State Space Models for Point Cloud Based 3D Object Detection
- HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model
- ViG: Linear-Complexity Visual Sequence Learning with Gated Linear Attention
- Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
- Tutel: Adaptive Mixture-of-Experts at Scale
- Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model
- Vision Mamba-based Autonomous Crack Segmentation on Concrete, Asphalt, and Masonry Surfaces
- SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
- PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery
- PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement
- WaterMamba: Visual State Space Model for Underwater Image Enhancement
- MambaUIE&SR: Unraveling the Ocean's Secrets with Only 2.8 GFLOPs
- RetinexMamba: Retinex-Based Mamba for Low-Light Image Enhancement
- LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network
- FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining
- Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning
- LFMamba: Light Field Image Super-Resolution with State Space Model
- HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement
- FD-Vision Mamba for Endoscopic Exposure Correction
- StyleRWKV: High-Quality and High-Efficiency Style Transfer with RWKV-like Architecture
- U-Shaped Vision Mamba for Single Image Dehazing
- DVMSR: Distillated Vision Mamba for Efficient Super-Resolution
- Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model
- VmambaIR: Visual State Space Model for Image Restoration
- CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration
- Serpent: Scalable and Efficient Image Restoration via Multi-Scale Structured State Space Models
- GMSR: Gradient-Guided Mamba for Spectral Reconstruction from RGB Images
- Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
- MatIR: A Hybrid Mamba-Transformer Image Restoration Model
- Exploring Real & Synthetic Dataset and Linear Attention in Image Restoration
- Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV
- RainRWKV: A Deep RWKV Model for Video Deraining
- Multiple Span Bidirectional RWKV Network for Infrared Image Super-Resolution
- Multi-View Learning with Context-Guided Receptance for Image Denoising
- ID-RWKV: Image Deraining RWKV
- Scalable Diffusion Models with State Space Backbone
- Gamba: Marry Gaussian Splatting With Mamba for Single-View 3D Reconstruction
- Zigma: A DiT-Style Zigzag Mamba Diffusion Model
- DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
- Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs
- Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation
- Dimba: Transformer-Mamba Diffusion Models
- Scalable Autoregressive Image Generation with Mamba
- MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
- SDiT: Spiking Diffusion Model with Transformer
- Diffusion-RWKV: Scaling RWKV-like Architectures for Diffusion Models
- Vision Mamba for Classification of Breast Ultrasound Images
- MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology
- U-Mamba: Enhancing Long-Range Dependency for Biomedical Image Segmentation
- VM-UNet: Vision Mamba UNet for Medical Image Segmentation
- SegMamba: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation
- MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation
- MMR-Mamba: Multi-Contrast MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion
- VMambaMorph: A Visual Mamba-Based Framework with Cross-Scan Module for Deformable 3D Image Registration
- I2I-Mamba: Multi-Modal Medical Image Synthesis via Selective State Space Modeling
- Motion-Guided Dual-Camera Tracker for Endoscope Tracking and Motion Analysis in a Mechanical Gastric Simulator
- BSBP-RWKV: Background Suppression with Boundary Preservation for Efficient Medical Image Segmentation
- Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV
- Zig-RiR: Zigzag RWKV-in-RWKV for Efficient Medical Image Segmentation
- RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation
- RNN-Based Multiple Instance Learning for the Classification of Histopathology Whole Slide Images
- Delta-WKV: A Novel Meta-in-Context Learner for MRI Super-Resolution
- H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
- MambaBEV: An Efficient 3D Detection Model with Mamba2
- Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM
- DRaMa: An Efficient End-to-End Motion Planner for Autonomous Driving with Mamba
- Enhancing Autonomous Driving Perception With Mamba-Based Dual-Branch Depth Estimation
- SalM²: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention
- OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity
- RS-Mamba: Large Remote Sensing Image Dense Prediction with State Space Models
- HSIMamba: Hyperspectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification
- RS3Mamba: Visual State Space Model for Remote Sensing Image Semantic Segmentation
- Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model
- ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model
- Pan-Mamba: Effective Pan-Sharpening with State Space Model
- RSCaMa: Remote Sensing Image Change Captioning with State Space Model
- RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing
- Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution
- A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion
- RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection
- Audio Mamba: Pretrained Audio State Space Model for Audio Tagging
- Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
- Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
- SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
- It’s Raw! Audio Generation with State-Space Models
- Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
- Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation
- SPMamba: State-Space Model is All You Need in Speech Separation
- TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
- An Investigation of Incorporating Mamba for Speech Enhancement
- MAMCA: Optimal on Accuracy and Efficiency for Automatic Modulation Classification with Extended Signal Length
- Mamba in Speech: Towards an Alternative to Self-Attention
- Why Perturbing Symbolic Music is Necessary: Fitting the Distribution of Never-used Notes through a Joint Probabilistic Diffusion Model
- Exploring RWKV for Memory Efficient and Low Latency Streaming ASR
- Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures
- Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations
- AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation
- AV-Mamba: Cross-Modality Selective State Space Models for Audio-Visual Question Answering
- LaViDa: A Large Diffusion Language Model for Multimodal Understanding
- MMaDA: Multimodal Large Diffusion Language Models
- LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
- Unified Multimodal Discrete Diffusion
- VisualRWKV-HD and UHD: Advancing High-Resolution Processing for Visual Language Models
- RWKV-CLIP: A Robust Vision-Language Representation Learner
- Scaling Vision-Language Models with Sparse Mixture of Experts
- PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts
- Multimodal Contrastive Learning with LIMoE: The Language-Image Mixture of Experts
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
- Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
- Mixture of Cluster-conditional LoRA Experts for Vision-Language Instruction Tuning
- LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs
⭐ Join us to improve this repo! ⭐ If you know of any Awesome-Efficient-Arch work we've missed, please contribute via a PR or raise an issue. Your contributions are very welcome!
If you find this survey useful, please consider citing our paper:
@article{sun2025survey,
title={Speed Always Wins: A Survey on Efficient Architectures for Large Language Models},
author={Sun, Weigao and Hu, Jiaxi and Zhou, Yucheng and Du, Jusen and Lan, Disen and Wang, Kexin and Zhu, Tong and Qu, Xiaoye and Zhang, Yu and Mo, Xiaoyu and Liu, Daizong and Liang, Yuxuan and Chen, Wenliang and Li, Guoqi and Cheng, Yu},
journal={arXiv preprint arXiv:2508.09834},
year={2025}
}

