Abstract

Show the limits of computability.

We explore the limits of computability, i.e. the question: *Are
there functions which cannot be computed?*

At the turn of the 19th/20th century there was a great optimism regarding this question. Most of the mathematicians thought that every mathematical function must be computable, every mathematical question has to be decidable.

The famous german mathematician David Hilbert formulated a program
(today called *Hilbert’s program*) to ground all mathematics on
axioms and formal proofs and prove that such a system is free of
inconsistencies. In such a formal system it should be possible to prove
or disprove all mathematical statements. He expressed his belief with
his famous sentence

We must know, we will know.

This optimism had not been restricted to mathematics. It had been present in all science and technology. The Eiffel tower has been build for the world exposition in 1889. A 324 m high steel building representing the power of technology. Albert Einstein published 1905 his theory of relativity and in 1915 his theory of general relativity. Both theories had been verified by various experiments. The Wright brothers invented and flew the first airplane in 1903.

But let’s go back to mathematics. In 1901 Bertrand Russel discovered a paradox in naive set theory. It has been customary to talk about the set of all sets. Bertrand Russel found out that this leads to the paradoxical

set of all sets which do not contain themself.

Does this unversal set contain itself? If yes, it contains itself and therefore cannot be in the set. If no, it does not contain itself and therefore must contain itself.

In order to avoid such paradoxes which lead to contradictions,
Whitehead and Russel published in 1910, 1912 and 1913 the famous
*Principia Mathematica* which provided a foundation for
mathematics free of paradoxes.

The young viennese mathematician Kurt Goedel invented a technique
called *Goedel Numbering*. He used this technique to formulate a
sentence similar to Russel’s paradox as a mathematical statement about
natural numbers. Such a statement can be expressed in the formalism of
Principia Mathematica and therefore can be injected into Principia
Mathematica like a Trojan horse. In his famous incompleteness theorem
(1931) Goedel demonstrated that paradoxical statements can be injected
into all formalisms which are powerful enough to express basic
arithmetics.

Imagine the blow to Hilbert’s program! All formal theories sufficiently powerful are either inconsistent or incomplete.

The positive side of Goedels incompleteness theorem: Paradoxes are not necessarily bad. They are part of the mathematicians toolkit to prove something.

In 1936 Alan Turing and Alonzo Church proved independently that there are undecidable (or uncomputable) problems in mathematics. They basically used the technique of Goedel numbering in their proofs.

In this paper we are going to show that there are functions which cannot be computed in lambda calculus. Note that functions can return a boolean value and a not computable boolean valued function represents an undecidable predicate.

We don’t use a lot of math here. We continue to express the ideas in programmer’s terms like we did in Programming with Lambda Calculus. Everybody who is able to program functions should be able to follow.

Nearly all proofs of undecidability use the encoding of some logical
paradox. Let’s review the outline of such proofs using the example of
the *barber paradox*.

Definition:

A village has the

barberproperty if there is a barber in the village who shaves all the men in the village who do not shave themselves.

Theorem:

A village with the barber property does not exist.

Proof:

Let’s assume by way of contradiction that such a village exists. Then there is a barber who shaves every man in the village who does not shave himself.

Now there are two possibilities:

The barber shaves himself: This is not possible because the barber can only shave men who do not shave themselves. Therefore the assumption that the barber shaves himself leads to a contradiction.

The barber does not shave himself: This is not possible because the barber shaves all the men who do not shave themselves. Therefore the assumption that the barber does not shave himself leads to a contradiction as well.

The assumption that such a village exists leads to a contradicion. Therefore we have to conclude that a village with the barber property cannot exist.

Note that some kind of self reference is essential in all paradoxes. In order to use this technique to show that some predicates are undecidable in lambda calculus we have to introduce some self referential lambda terms. I.e. we need some lambda terms and encoded versions of them.

In order to do this, we need some techniques. After having the machinery the actual proof is not complicated.

This section summarizes some basic knowledge of lambda calculus which is required to understand the rest of the text.

If the summary is too compressed, then read the text Programming with Lambda Calculus which gives more detailed introduction.

A lambda term is either a variable, an application of a function term to a variable term or a lambda abstraction.

```
::= x -- variable
t | (a b) -- application of 'a' to the argument 'b'
| (\ x := a) -- abstraction
^ ^ definition term
| bound variable
```

where `x`

ranges over an infinite supply
`x0, x1, x2, ...`

of variables and `a, b, t`

range
over arbitrary lambda terms.

In mathematical texts the abstraction `(\x := a)`

is
usually written as . In this
text we prefer an ascii notation in order to be able to write lambda
terms more like terms in a functional programming language. However the
mathematical notation and the ascii notation denote the same thing.

The name of a bound variable is irrelevant. The names of bound variables can be changed consistently as long as the change does not interfere with variables which are not bound by the same binder. This restriction is necessary, because a renaming must not make a free variable bound.

Bound variable names in lambda calculus have the same character as formal argument names in programming languages.

All functions in lambda calculus have only one argument. This is not a restriction, because the function applied to an argument can return another function which can be applied another argument.

The lambda term `(\ x := exp) a`

is a reducible
expression. The basic reduction step is

```
(\ x := exp) a ~> exp[x := a]
^
read "reduces to"
```

where `exp[x:=a]`

is the term `exp`

where all
free variables in `exp`

have been replaced by the term
`a`

. Note that it might be necessary to rename bound
variables in `exp`

before doing the substution in such a way
that they do not interfere with any free variables in `a`

.
This is always possible since we have an infinite supply of
variables.

Example:

```
((\ x := (\ y := x)) a) b
~> (\ y := x)[x:=a] b
= (\ y := a) b
~> a[y:=b]
= a
```

Note that the last equality is valid. `a`

must not contain
the free variable `y`

. Otherwise we would have been obliged
to rename `y`

to a new variable which is not contained in
`a`

.

In order to avoid excessive brackets we use some conventions.

Outer brackets can be ommitted.

Function application associates to the left. I.e.

`((a b) c) d)`

can be written as`a b c d`

.Functions with more than one argument can be written in a compressed form. I.e. instead of writing

`\ x := (\y := exp)`

we can write`\ x y := exp`

.

Data types in lambda calculus are represented by functions which
*do something* which is meaningful for the data type.

A boolean term has two values, true and false. We can encode such values in lambda calculus as functions taking two arguments and returning either the first in case of a true value and the second in case of a false value.

```
true := (\ x y := x)
false := (\ x y := y)
```

Therefore `true a b ~> a`

and
`false a b ~> b`

.

Note that we use the reduction symbol `~>`

in a sloppy
manner. `t ~> u`

can mean that `t`

reduces to
`u`

in one step or in more steps. In this paper there is no
need to distinguish between one step and multistep reduction. In other
contexts it might be necessary to use different symbols.

In order to make our notation look more like definitions in a
programming language we can write the definitions of `true`

and `false`

as

```
true x y := x
false x y := y
```

But remember that definitions are just abbreviations to make the terms more readable. Lambda calculus does not know of any definitions. The corresponding lambda term can always be obtained by expanding the definitions. This is always possible, because no recursive definitions are allowed. Definitions are just abbreviations.

We can use our notation to define pairs in lambda calculus.

` pair x y f := f x y`

If we apply `pair`

only to two arguments
`pair a b`

we get a function which expects another argument
and then applies the missing argument to the first two arguments. This
fact can be used to define the following projections

```
first p := p (\ x y := x)
second p := p (\ x y := y)
```

The following reduction shows that `first`

behaves as
expected extracting the first part of a pair.

```
first (pair a b)
= first ((\ x y f := f x y) a b)
~> first (\ f := f a b)
= (\ f := f a b) (\x y := x)
~> (f a b)[f:= (\x y := x)]
= (\x y := x) a b
~> (\ y := x)[x:=a] b
= (\ y := a) b
~> a[y:=b]
= a
```

Note again that `a`

must not contain `y`

.
Otherwise the variable `y`

must be renamed to another
variable which does not occur in `a`

.

We can represent numbers in lambda calculus as functions with two
arguments. The first argument is a function and the second is a start
value. The so represented number is a lambda term which applies the
function argument `n`

times on the start value.

Such lambda terms are called *Church Numerals*.

The number zero is represented by the term

` zero f s := s`

which is in our notation equivalent to

` zero := (\ f s := s)`

The function `zero`

applies the function argument
`f`

zero times on the start value `s`

.

The successor function takes a Church numeral `n`

and a
function argument `f`

and a start value `s`

and
applies the function `f`

one more time upon the start value
than `n`

already does.

` successor n f s := f (n f s)`

Make sure that you understand that the partial application
`successor n`

returns a Church numeral if supplied with a
Church numeral argument.

Having this we can define arbitrary Church numerals

```
one := successor zero
two := successor one
...
```

We can define addition as the iterated application of the successor function, multiplication as the iterated addition of a number and exponentiation as the iterated multiplication of a number.

```
plus n m := n successor m
times n m := n (plus m) zero
power n m := m (times n) one
```

Addition, multiplication and exponentiation are just simple iterations. In order to implement the predecessor function we need a little trick.

We want a function `predecessor`

which returns for any
number the predecessor of that number and which returns
`zero`

for the predecessor of the number
`zero`

.

We implement the function by iterating over a pair of numbers of the form

` pair counter predecessor-of-counter`

with the start value `pair zero zero`

. At each iteration
we copy the counter to the predecessor field and increment the counter.
At the end of the iteration we extract the second component of the
pair.

```
:=
predecessor n
second
(n step (pair zero zero))where
:=
step p pred-counter :=
p (\ counter pair (successor counter) counter)
```

These arithmetic functions are sufficient for the rest of the text.

The paper Programming with Lambda Calculus shows many more functions which can be implemented in lambda calculus. All computable functions can be implemented in lambda calculus.

It is not possible to write a lambda term which can inspect its own
arguments i.e. find something about the structure of its arguments. E.g.
a lambda term `isVariable`

where `isVariable a`

returns `true`

, if `a`

is a variable and otherwise
`false`

, is not possible.

A lambda term can only use its arguments as functions and apply it to other arguments or as arguments to feed them to other arguments. This is due to the fact that lambda terms are either variables, applications or abstractions.

Therefore we must encode a boolean value as a function taking two arguments and returning either the first or the second depending on its truth value. We have to encode numbers as functions taking two arguments and applying the first argument n times on the second argument.

All arithmetic functions expect their arguments to be Church
numerals. E.g. the function `successor`

expects one argument
which has to be a Church numeral to compute the Church numeral of the
successor of the number represented by its argument.

All our functions expressed in lambda calculus do the *right
thing* only if their arguments have the correct form. The lambda
calculus cannot check if the argument is a Church numeral and throw an
exception in case of failure.

If we want to have lambda terms which are self referential we have to give the lambda term an argument, which is some encoding of itself. The most popular encoding is Goedel numbering which is in lambda calculus the encoding of lambda terms as Church numerals. However Goedel numbering is not the only way to encode lambda terms. We separate in the following the parts which are independent of the encoding technique and the parts which do the actual encoding as Church numerals.

We use the following notation.

```
lambda term encoded term
a <a>
```

The correspondence between a lambda term `a`

and its
encoding `<a>`

must be one-to-one. I.e. two different
lambda terms must not have the same encoding.

In order to encode lambda terms we need the functions
`var`

, `appl`

, `lam`

and
`wrap`

which satisfy the following requirements.

```
var c[i] ~> <xi>
app <a> <b> ~> <a b>
lam c[i] <a> ~> <\ xi := a>
wrap <a> ~> < <a> >
```

The function

`var`

expects a Church numeral representing the number`i`

and returns the encoding of the variable`xi`

.The function

`app`

expects an encoding of the lambda terms`a`

and`b`

and returns the encoding of the lambda term`a`

applied to`b`

i.e. the encoding of`a b`

.The function

`lam`

expects a Church numeral representing the number`i`

and an encoding of the lambda term`a`

and returns an encoding of the lambda term`\ xi := a`

The function wrap expects an encoding of the lambda term

`a`

and returns the encoding of the encoding of the lambda term`a`

.

In order to satisfy the *one-to-one* requirement, the
functions `var`

, `app`

, `lam`

and
`wrap`

have to be one-to-one. Remember that the composition
of one-to-one functions is one-to-one as well. I.e. all combinations of
the four basic functions are one-to-one.

All four functions are needed to encode self referential terms. E.g.
to generate the encoding of the term `a <a>`

i.e. the
encoding of the term `a`

applied to its encoding we can use
the equivalence

` <a <a>> = app <a> (wrap <a>)`

I.e. the function

` self x := app x (wrap x)`

satisfies the specification

` self <a> ~> <a <a>>`

In this section we show how to encode lambda terms as Church
numerals, i.e. a term of the form `<a>`

is the Church
numeral encoding the term `a`

.

Before doing that we define a mathematical pairing function which maps pairs of natural numbers one-to-one and onto to the natural numbers. There are many possibilities. We choose the function

Note that this is a mathematical function and has nothing to do with a lambda term.

Some example values:

n | m | [n,m] |
---|---|---|

0 | 0 | 0 |

1 | 0 | 1 |

2 | 0 | 3 |

3 | 0 | 7 |

0 | 1 | 2 |

0 | 2 | 4 |

0 | 3 | 6 |

1 | 1 | 5 |

2 | 1 | 11 |

In order to see that the pairing function is one-to-one and onto we show that given the value of we can uniquely recover and . Assume . We divide by until we get an odd number. is the number of times, we can divide by . Since all odd numbers can be uniqueley represented by , we get a unique value for .

For example let . We get . Dividing 12 subsequently by 2 we get the sequence i.e. we can divide 2 times by 2 and therefore . Resolving the equation we get .

We can transform the mathematical pairing function into a lambda term.

```
:=
compress n m -- Compute the pairing function [n,m] where
-- n and m are Church numerals representing
-- the numbers.
predecessor
(times
(power two n) (successor (times two m)))
```

We use the function `compress`

to do the basic encoding of
lambda terms.

```
:=
var i -- Encoding of xj where i is the Church numeral
-- representing the index j.
compress zero i
:=
app enca encb -- Encoding of the lambda term a b, where
-- enca and encb are the encodings of a and b.
compress one (compress enca encb)
:=
lam i enca -- Encoding of (\xj := a) where i is the Church
-- numeral representing the index j and enca is
-- the Church numeral representing the lambda term a.
compress two (compress i enca)
```

The first argument is the Church numeral for the numbers 0, 1, or 2 depending whether the lambda term is a variable, an application or an abstraction. This shows that the encoding is not onto. The Church numerals for the numbers 3, 4, … are never used in the top level compress functions. I.e. there are Church numerals which do not encode a lambda term. But this is not a problem, because we only require that the encoding is one-to-one, i.e. that there are never two lambda terms which have the same encoding.

We still have to define the function `wrap`

which given
the encoding of a lambda term returns the encoding of the encoding of
the the lambda term. In order to achieve that we first encode
`zero`

and `successor`

and then use these
encodings to define `wrap`

.

Remember the definition of `zero`

and
`successor`

.

```
zero := (\ f s := s)
successor := (\n f s := f (n f s))
```

These functions have the bound variables `s`

,
`f`

and `n`

. But formally we have only the
variables `x0`

, `x1`

, `x2`

, … We use
here the convention to number the bound variables *inside-out*,
i.e. the innermost bound variable gets the name `x0`

, the
next outer variable gets the name `x1`

and so on. Since the
names of bound variables are arbitrary as long as they don’t interfere
with the outside world, this choice is as good as any other choice.

The encoding of the lambda term `zero`

is a
straightforward application of the basic encoding functions.

```
zero := (\ x1 x0 := x0)
enc_zero :=
lam one (lam zero (var zero))
```

The encoding of the successor function looks a little bit more
complicated. But no magic is involved. Just a systematic use of the
basic encoding functions `var`

, `app`

and
`lam`

.

```
:= (\x2 x1 x0 := x1 (x2 x1 x0))
successor
:=
enc_successor where
lam two (lam one (lam zero enc)) :=
enc
app
(var one) (app (app (var two) (var one)) (var zero))
```

Having an encoding of `zero`

and `successor`

,
the definition of `wrap`

is easy.

```
:=
wrap enca -- Given the encoding enca of a lambda term a,
-- compute the encoding of the encoding of a.
enca (app enc_successor) enc_zero
```

Note that the argument `enca`

is the encoding of some
lambda term. Therefore it is a Church numeral. Church numerals can be
used to iterate the application of the encoding of the successor
function `enca`

times to get an encoding of `enca`

which is an encoding of the encoding of the corresponding lambda
term.

You see: Encoding of lambda terms as Church numerals is not difficult. It is only necessary to be careful in distinguishing lambda terms and their encodings properly.

This chapter proves some undecidability theorems. In order to do this we have to encode lambda terms. From the previous chapter we need only two facts.

There is a unique encoding of lambda terms which maps a lambda term

`a`

into its encoding`<a>`

. The encoding is unique in the sense that different lambda terms never have the same enconding, i.e. the encoding is one-to-one.There is a lambda term

`self`

which maps the encoding`<a>`

of the lambda term`a`

into the lambda term`< a <a> >`

which represents the encoding of the lambda term`a`

applied to its own encoding`<a>`

.

Nothing more is needed in the proofs.

In the proofs we work with set of lambda terms.

Definition: Non-trivial set

We say a set

`A`

of lambda terms isnon-trivialif it is neither empty nor contains all possible lambda terms.

I.e. for each non-trivial set of lambda terms there is always a
lambda term `m0`

which is in the set and a lambda term
`m1`

which is not in the set.

Definition: Closed set

We say a set

`A`

of lambda terms isclosedif it contains with each lambda term`a`

all lambda terms which are alpha or beta equivalent to the term.

Definition: Decider

We say a lambda term

`p`

is a decider for a set of lambda terms`A`

if for any term`a`

the application`p <a>`

either returns`true`

or`false`

depending on the fact whether`a`

is in`A`

or`a`

is not in`A`

.

Theorem: *Any set A of lambda terms which is
non-trivial and closed does not have a decider.*

We prove this theorem by assuming that it is decidable i.e. that
there is a decider `p`

for the set `A`

and derive
a contradiction from the assumption.

Since the set `A`

is nontrivial there is term
`m0`

which is in the set and a term `m1`

which is
not in the set.

Using `p`

, `self`

, `m0`

and
`m1`

we define the term `g`

` g := (\ x := p (self x) m1 m0)`

Let’s see what happens if we apply `g`

to the description
of a lambda term `a`

.

```
g <a> ~> p (self <a>) m1 m0
~> p (<a <a>>) m1 m0
~> m1, if a <a> is in A
~> m0, if a <a> is not in A
```

The whole thing gets more interesting if we apply `g`

to
its own description. We have to distinguish the cases that
`g <g>`

is in `A`

and
`g <g>`

is not in `A`

:

`g <g>`

is in `A`

:

```
g <g> ~> p (self <g>) m1 m0
~> p (<g <g>>) m1 m0
~> true m1 m0
~> m1
```

`g <g>`

and `m1`

are betaequivalent.
However this would require that `m1`

is in `A`

by
the closedness of `A`

and contradicts the assumption that
`m1`

is not in `A`

.

`g <g>`

is not in `A`

:

```
g <g> ~> p (self <g>) m1 m0
~> p (<g <g>>) m1 m0
~> false m1 m0
~> m0
```

`m0`

and `g <g>`

are betaequivalent. This
would require that `g <g>`

is in `A`

by the
closedness of `A`

and contradicts the assumuption that
`g <g>`

is not in `A`

.

Conclusion: The assumption that there exists a decider `p`

for the terms in `A`

leads to a contradiction. Therefore such
a decider cannot exist.

Note how the application of the term `g`

to its own
description `<g>`

and the existence of a decider
`p`

are essential to prove the contradiction.

Theorem: *It is undecidable if two arbitrary lambda terms
a and b are beta equivalent.*

We proof this theorem by contradiction assuming that there is a
lambda term `q`

such that `q <a> <b>`

returns `true`

, if `a`

and `b`

are beta
equivalent and otherwise `false`

.

We form the set

`A`

which contains`a`

and all beta (or alpha) equivalent terms. Therefore`A`

is closed by definition.`A`

is nonempty, because it contains at least the term`a`

.`A`

cannot contain all lambda terms for the following reasons:If

`a`

is normalizing, then any term which does not reduce to a normal form cannot be contained in`A`

. There are terms which do not reduce to a normal form (e.g.`(\x := x x)(\x := x x)`

).If

`a`

does not reduce to a normal form, then any term in normal form cannot be in`A`

. There are terms in normal form (e.g.`\ x := x`

).

`A`

is nontrivial.Because

`A`

is closed and non-trivial there cannot be a decider for`A`

.Since

`q <a>`

is a decider for`A`

we get a contradiction.

Therefore the assumption that a decider `q`

exists must be
false.

Theorem: *It is undecidable if an arbitrary lambda term
a reduces to a normal form.*

We prove this theorem by contradiction assuming that there is a
lambda term `q`

such that `q <a>`

returns
`true`

, if `a`

reduces to a normal form and
otherwise `false`

.

We form the set

`A`

which contains all lambda terms which reduce to a normal form.`A`

is certainly closed because it contains all normal forms and all terms which reduce to a normal form i.e. all terms which are alpha or beta equivalent.`A`

is certainly nontrivial, because there are terms which are in normal form (e.g. all variables) and are therefore in the set and because there are terms which do not have a normal form and are therefore not in`A`

.`q`

would be a decider for`A`

.

Since a non-trival closed set cannot have a decider, the existence
that a term like `q`

exists leads to a contradiction and
therefore must be false.