This page explains the lambda‑calculus term λx.λy.zxy, its de Bruijn index form
λλ521, how the counting works, and why this representation solves real problems
for compilers and interpreters.
The original term is:
λx. λy. z x y
This term contains:
λx then λy)z x y, which is an application of three expressionsx and y are bound variablesz is a free variable
λx
|
λy
|
@
/ | \
z x y
The structure shows the two abstractions and the application of z to x and y.
The corresponding de Bruijn form is:
λ λ 5 2 1
In this representation, variable names are removed and replaced with numbers that indicate how many binders (λ’s) must be crossed to reach the variable’s definition.
λ
|
λ
|
@
/ | \
5 2 1
The structure is identical to the named‑variable tree; only the labels change.
The variable ordering assumed is:
x, y, z, x, y, z, …
Using this ordering:
yxz in the free‑variable list
Thus the body z x y becomes:
5 2 1
Bound variables use the “bucket” numbers created by λ’s; free variables use the later numbers.
There are two binders (λx and λy), so the free‑variable list begins at index 3:
3 → x
4 → y
5 → z
6 → x
7 → y
8 → z
...
In the term λx. λy. z x y:
y is bound → climbs 1 λ → index 1x is bound → climbs 2 λ’s → index 2z is free → uses the next free‑variable index → 5
Indices 3 and 4 correspond to free x and y, which do not appear in this term,
so they are not used here.
Every λ is a “bucket” on a stack.
Inside the body, a variable does not say “I am x”, it says:
“I live in bucket #1, #2, … above me.”
λx) → bucket #1λy) → bucket #2Inside the body z x y:
y is in the nearest bucket → index 1x is in the next bucket → index 2z is in no bucket → it is free → index 5 (from the free‑variable list)
z → 5 (free)
x → 2 (bound)
y → 1 (bound)
Therefore:
λx. λy. z x y
→
λ λ 5 2 1
At first glance, de Bruijn indices look like “just replacing names with numbers”. In reality, they solve several concrete problems that compilers and interpreters face.
def outer(x):
def inner(x):
return x + 1
return inner(x)
Two different variables are both called x.
Humans can track which is which by looking at indentation and scope.
A compiler must track this mechanically and precisely.
With de Bruijn indices, the inner variable is always “the nearest binder”, e.g. index 1. The outer one is index 2, and so on. Names no longer matter.
λx. λy. x + y
λa. λb. a + b
These two functions are mathematically the same, just with different variable names. A compiler must recognise that they are equivalent.
With de Bruijn indices, both become:
λ λ 2 + 1
They are now literally the same sequence of symbols. No special α‑equivalence logic is needed: renaming bound variables does not change the index form.
(λx. (λy. x + y)) (y)
If substitution is done naively with names, the free y in the argument can be
accidentally captured by the inner λy, changing the meaning of the program.
With de Bruijn indices, substitution is defined on numbers and binder depth. The rules are purely structural, which makes it much easier to avoid capture and reason about correctness.
Python closures already use a similar idea internally: they refer to variables by position, not by the original source name.
def outer(a):
b = 10
def inner(c):
return a + b + c
return inner
Conceptually, the inner function can be thought of as using something like:
load freevar #0 # a
load freevar #1 # b
load local #0 # c
The runtime does not need the original names a, b, c;
it only needs to know “which slot” or “which position” to read from.
This is the same spirit as de Bruijn indices: variables are identified by
where they live, not by what they are called.
Although the idea comes from early work in lambda calculus, it has become a practical tool for designing compilers and interpreters that can reason about code reliably and transform it safely.