Matrix Multiplication in LLMs

Understanding how neural networks transform information through matrix operations

Why Matrix Multiplication Matters

Every time you use ChatGPT, Claude, or any other large language model, billions of matrix multiplications happen behind the scenes. Each one transforms the numeric representation of your input into a new one, building up layer after layer of structure until the model can generate a response.

This demo will help you understand what matrix multiplication actually does (not just how to compute it) and why it's the workhorse operation of neural networks.

1 One Token, Three Features

A helpful fiction: Real LLMs use vectors with 768 to 12,288+ dimensions, not 3. And those dimensions don't map neatly to human concepts—they're abstract patterns learned from training data. But to build intuition, let's pretend we can label them with made-up meanings.
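For a sense of the real scale, here is a minimal sketch using Hugging Face's transformers library (an assumption: that you have it installed; GPT2Config's defaults describe GPT-2 small):

```python
# Minimal sketch of a real embedding width, assuming the `transformers`
# package is installed. GPT2Config's defaults match GPT-2 small.
from transformers import GPT2Config

config = GPT2Config()
print(config.n_embd)  # 768: each token is a 768-dimensional vector
```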

Imagine a single token is represented by just 3 numbers, where each number captures some aspect of the word:

Token "running"
plural-ish
past-ish
action-ish
2
1
3
x = [2, 1, 3]

Remember: "plural-ish" and "action-ish" are fictional labels we invented! In real models, dimensions don't have neat meanings. But the math works identically—so this intuition will serve you well.
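In code, the toy vector is just three numbers. A quick numpy sketch (the labels are the made-up names from above, not anything a real model exposes):

```python
import numpy as np

# Our invented feature labels; real model dimensions have no such names.
features = ["plural-ish", "past-ish", "action-ish"]
x = np.array([2.0, 1.0, 3.0])  # the token "running" as x = [2, 1, 3]

for label, value in zip(features, x):
    print(f"{label}: {value}")
```

The same code works unchanged if you swap in 768 numbers instead of 3, which is the whole point of the vector abstraction.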