Chapter 4 of The Big Book of Large Language Models is Here!

Chapter 4 of the Big Book of Large Language Models is finally here! That was a difficult chapter to write! Originally, I wanted to cram into it every improvement to the Transformer architecture since the "Attention Is All You Need" paper, but I realized that would be far too long for a single chapter. I ended up focusing only on improvements to the attention layer and deferring things like relative positional encoding and Mixture of Experts to the next chapter. In this chapter, I cover the following improvements:

Sparse Attention Mechanisms

  • The First Sparse Attention: Sparse Transformers
  • Choosing Sparsity Efficiently: Reformer
  • Local vs Global Attention: Longformer and BigBird
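To give a quick flavor of this family (a toy sketch of mine, not code from the book): in Longformer-style local attention, each query only attends to keys within a fixed window around its own position, which cuts the cost from O(N²) toward O(N·w). A minimal NumPy illustration, with all names and shapes chosen purely for the example:

```python
import numpy as np

def sliding_window_attention(Q, K, V, window=2):
    """Toy Longformer-style local attention: each position attends only to
    keys within `window` positions of itself. Q, K, V: (seq_len, d)."""
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                   # dense here only for readability
    idx = np.arange(seq_len)
    local = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(local, scores, -np.inf)       # forbid attention outside the band
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(sliding_window_attention(Q, K, V, window=2).shape)  # (8, 4)
```

Note that the sketch still builds the dense score matrix for readability; real implementations only compute the banded entries (plus a few global tokens in Longformer and BigBird).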

Linear Attention Mechanisms

  • Low-Rank Projection of Attention Matrices: Linformer
  • Recurrent Attention Equivalence: The Linear Transformer
  • Kernel Approximation: Performers
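As an illustration of the linear-attention reformulation (again my own toy sketch, not the book's code): if we replace the softmax with a feature map φ, attention becomes φ(Q)(φ(K)ᵀV) up to a normalizer, and the φ(K)ᵀV summary can be computed once, so the cost grows linearly in sequence length. The elu(x)+1 feature map below is the one proposed for the Linear Transformer; Performers instead use random features to approximate the softmax kernel.

```python
import numpy as np

def elu_plus_one(x):
    # Feature map phi(x) = elu(x) + 1 (always positive), as in the Linear Transformer.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Non-causal linear attention: softmax(QK^T)V is replaced by
    phi(Q)(phi(K)^T V) / (phi(Q) sum_j phi(k_j)), computed in O(N d^2)."""
    Qf, Kf = elu_plus_one(Q), elu_plus_one(K)   # (N, d)
    KV = Kf.T @ V                               # (d, d): one summary of all keys/values
    Z = Qf @ Kf.sum(axis=0)                     # (N,): per-query normalizer
    return (Qf @ KV) / Z[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (8, 4)
```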

Memory-Efficient Attention

  • Self-attention Does Not Need O(N^2) Memory
  • FlashAttention
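The common trick behind both of these (sketched below with function names and a chunk size of my own choosing) is to process keys and values chunk by chunk while maintaining a running softmax normalizer, so the full N×N score matrix is never materialized; FlashAttention additionally organizes the computation to stay in fast GPU SRAM.

```python
import numpy as np

def chunked_attention(Q, K, V, chunk=4):
    """Exact attention computed key-chunk by key-chunk with an online softmax,
    so the N x N score matrix is never stored. Q, K, V: (N, d)."""
    N, d = Q.shape
    out = np.zeros_like(V, dtype=float)
    running_max = np.full(N, -np.inf)
    running_sum = np.zeros(N)
    for start in range(0, K.shape[0], chunk):
        Kc, Vc = K[start:start + chunk], V[start:start + chunk]
        s = Q @ Kc.T / np.sqrt(d)                    # scores for this chunk only
        new_max = np.maximum(running_max, s.max(axis=-1))
        scale = np.exp(running_max - new_max)        # rescale old accumulators
        p = np.exp(s - new_max[:, None])
        out = out * scale[:, None] + p @ Vc
        running_sum = running_sum * scale + p.sum(axis=-1)
        running_max = new_max
    return out / running_sum[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(chunked_attention(Q, K, V).shape)  # (8, 4), identical to dense softmax attention
```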

Faster Decoding Attention Mechanisms

  • Multi-Query Attention
  • Grouped-Query Attention
  • Multi-Head Latent Attention
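The idea these variants share (here a toy sketch of Grouped-Query Attention with illustrative shapes, not the book's code) is to shrink the key/value cache kept around during autoregressive decoding: several query heads reuse one key/value head, with Multi-Query Attention being the extreme case of a single shared KV head.

```python
import numpy as np

def grouped_query_attention(Q, K, V, n_q_heads=8, n_kv_heads=2):
    """Toy GQA: Q has n_q_heads heads, K/V only n_kv_heads heads, and each
    group of n_q_heads // n_kv_heads query heads reuses the same K/V head.
    Q: (n_q_heads, N, d), K and V: (n_kv_heads, N, d)."""
    group = n_q_heads // n_kv_heads
    d = Q.shape[-1]
    outs = []
    for h in range(n_q_heads):
        kv = h // group                              # which shared K/V head this query head uses
        scores = Q[h] @ K[kv].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outs.append(w @ V[kv])
    return np.stack(outs)                            # (n_q_heads, N, d)

rng = np.random.default_rng(0)
N, d = 6, 4
Q = rng.normal(size=(8, N, d)); K = rng.normal(size=(2, N, d)); V = rng.normal(size=(2, N, d))
print(grouped_query_attention(Q, K, V).shape)  # (8, 6, 4)
```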

Long Sequence Attentions

  • Transformer-XL
  • Memorizing Transformers
  • Infini-Attention
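These approaches extend attention beyond a single segment. As a hedged sketch of the recurrence idea popularized by Transformer-XL (function and variable names are mine), keys and values from the previous segment are cached and concatenated with the current segment's, so queries can look back past the segment boundary:

```python
import numpy as np

def attend_with_memory(Q, K, V, mem_K=None, mem_V=None):
    """Toy segment-level recurrence: the current segment's queries attend over
    cached keys/values from the previous segment plus the current ones.
    Returns the attention output and the cache to reuse for the next segment."""
    if mem_K is not None:
        K = np.concatenate([mem_K, K], axis=0)
        V = np.concatenate([mem_V, V], axis=0)
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    seg_len = Q.shape[0]
    return w @ V, (K[-seg_len:], V[-seg_len:])       # new memory = current segment's K, V

rng = np.random.default_rng(0)
segments = [rng.normal(size=(4, 4)) for _ in range(3)]   # three segments of 4 tokens
mem_K = mem_V = None
for x in segments:
    out, (mem_K, mem_V) = attend_with_memory(x, x, x, mem_K, mem_V)
print(out.shape)  # (4, 4)
```

The real mechanisms are richer (Memorizing Transformers retrieve from a large external memory with a kNN lookup, and Infini-Attention uses a compressive memory), but the cached-state idea is the common thread.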

Obviously, I could not include everything that has ever been invented around the attention layer, but I believe these topics capture well the different research routes that have been explored since the original paper. I think it is a very important chapter, as most materials available online focus on vanilla self-attention, which is starting to look outdated by today's standards. I also found that trying to understand how to improve self-attention is a very good way to understand what we are trying to improve in the first place! Self-attention may appear odd at first, but diving into the inner workings of the layer in order to improve it gives us a level of understanding far beyond what we can get by only looking at the original formulation. I hope you will enjoy it!

Looking for corporate training or consulting services for your AI/ML endeavors? Just send me an email: damienb@theaiedge.io


