It’s all arbitrary

The word arbitrary is ubiquitous in mathematical texts. It is a very useful word, but it is a common cause of confusion for a few reasons:

  • The meaning of arbitrary in mathematical texts is different from its standard meaning in English;
  • The concept of arbitrariness is confusing in its own right;
  • Even in the context of mathematics, the word arbitrary can mean more than one thing.

My aim in writing this post is to clarify the differences between the standard English and mathematical English uses of the word arbitrary, and to provide some examples of the word arbitrary at work.

“Let $x$ be arbitrary”

The word arbitrary is most commonly used when introducing variables for the purposes of proving universally quantified statements.

What I mean by this is the following. Suppose you are trying to prove that every even integer is the sum of two odd integers. Your proof might look a little bit like this:

Let $x$ be an arbitrary even integer. Then $x=2n$ for some integer $n$. It follows that $x = 1 + (2n-1)$, which is the sum of two odd integers.

Expressions of the form “let $x$ be an arbitrary […]” are typical when proving that a property $p(x)$ holds for all elements $x$ of some set $X$. These kinds of statements are called universally quantified statements. [Symbolically, we would write the assertion that $p(x)$ is true for all elements $x$ of a set $X$ as “$\forall x \in X,\ p(x)$”.]

In the above example, $X$ was the set of even integers and $p(x)$ was the property “$x$ is the sum of two odd integers”.

A direct proof of a universally quantified statement $\forall x \in X,\ p(x)$ usually looks like this:

Let $x \in X$ be arbitrary. […proof of $p(x)$ goes here…]

And this is where the confusion begins. To illustrate, consider the following (non-)proof that every even integer is the sum of two odd integers.

Let $x$ be an arbitrary even integer, say $x=42$. Then $x = 17 + 25$, which is the sum of two odd integers, as required.

A seasoned mathematician might scoff at such a proof, but to a novice it is less clear why the proof doesn’t work.

The issue here is the dissonance between the standard English usage of arbitrary (meaning ‘based on individual discretion’ or ‘determined by impulse rather than reason’) and the mathematical usage. As soon as the proof-writer wrote the word arbitrary, the everyday meaning took over and distorted their thought process.

From the proof-writer’s perspective, having just written ‘let $x$ be an arbitrary even integer’, they picked an even integer arbitrarily (it just happened to be $42$), and the proof went through just fine!

If they took a step back, they’d realise what they wrote was not a proof that every even integer can be written as the sum of two odd integers. In fact, what they proved is that some even integer, namely $42$, can be written as the sum of two odd integers.

So what did they do wrong?

The reader and the writer

The main observation to be made is that, when you write ‘let $x \in X$ be arbitrary’, the power to choose a value of $x$ arbitrarily belongs not to the person writing the proof, but to the person reading it.

What this means is that, as soon as you have written ‘let $x \in X$ be arbitrary’, the reader should be able to replace all subsequent instances of the variable $x$ by a value of their choice, and the proof should remain valid.

In this sense, the word arbitrary really means generic: the variable $x$ is treated as an element of $X$, but when reasoning about $x$, the only things we can assume about $x$ are those things that are true of all elements of $X$.

Let’s return to the previous example, where $X$ is the set of even integers. Lots of things are known about all even integers. For example, by definition of ‘even’, every even integer can be expressed in the form $2n$ for some integer $n$. This means that when reasoning about an ‘arbitrary’ even integer $x$, we are free to write $x=2n$ for some integer $n$, which may depend on the value of $x$.

There are things that are not true of all even integers, even though they might be true of some even integers. It is safe to say that ‘$x=17+25$’ is such a statement; this is only true for the integer $42$, and hence it is not a valid thing to use in a proof that all even integers $x$ are the sum of two odd integers. This was the shortcoming of the non-proof given above.

Recall the correct proof above that every even integer is the sum of two odd integers.

Let $x$ be an arbitrary even integer. Then $x=2n$ for some integer $n$. It follows that $x = 1 + (2n-1)$, which is the sum of two odd integers.

The reader should be able to replace $x$ by an arbitrary even integer, and the remaining proof should go through.

Let’s do this. Replacing $x$ by $42$, we obtain the following:

[Then] $42=2n$ for some integer $n$. It follows that $42 = 1 + (2n-1)$, which is the sum of two odd integers.

Is this true? Well, yes. The assertion that $42=2n$ for some integer $n$ is seen to be true by taking $n=21$. In this case, the rest of the proof reads:

It follows that $42 = 1 + 41$, which is the sum of two odd integers.

We can certainly agree that $1$ and $41$ are odd and that $1+41=42$.

But the point is that the reader didn’t need to have picked $x=42$. The reader could just as well have taken $x=64101272$ and the proof would still work.
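To make that second choice concrete, here is a sketch of the same substitution carried out for $x=64101272$. The value $n=32050636$ does not appear above; it is simply half of $x$, worked out for this illustration.

```latex
% The reader picks x = 64101272; the proof supplies n = x/2 = 32050636.
\[
  64101272 = 2 \cdot 32050636,
  \qquad
  64101272 = 1 + (2 \cdot 32050636 - 1) = 1 + 64101271 .
\]
```

Both $1$ and $64101271$ are odd, so the proof goes through for this value just as it did for $42$, without a single word of it having to change.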

“The values are arbitrarily small”

What makes the word arbitrary more confusing is that it can be used to mean something subtly different, especially in its adverbial form arbitrarily.

Here are some examples of statements that a learner of mathematics might encounter:

  • Since the terms of the sequence $(x_n)$ are eventually arbitrarily small, it follows that $\lim_{n \to \infty} x_n = 0$.
  • There are intervals in $S$ of arbitrarily long length.
  • If a theory $\mathbb{T}$ has arbitrarily large finite models, then $\mathbb{T}$ has an infinite model.

This kind of usage of arbitrar(il)y is even more confusing at first sight than the one discussed above (see here and here and here for some examples of questions asked by people confused by this very issue).

The best I can do to define this usage in the abstract is as follows: the expression ‘[object] is arbitrarily [adjective]’ means that no matter how [adjective] you want [object] to be, there is some instance of [object] which is at least as [adjective] as you wanted.

To illustrate, let’s look at what the relevant phrases in the three examples above really mean:

  • ‘The terms of the sequence $(x_n)$ are eventually arbitrarily small’ means that, for all $\varepsilon > 0$, there is a stage in the sequence after which all terms $x_n$ satisfy $|x_n| \le \varepsilon$.
  • ‘There are intervals in $S$ of arbitrarily long length’ means that, for all $\ell \ge 0$, there is an interval in $S$ whose length is $\ge \ell$.
  • ‘The theory $\mathbb{T}$ has arbitrarily large finite models’ means that, for all $n \in \mathbb{N}$, there is a finite model of $\mathbb{T}$ of size $\ge n$.
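Written out symbolically, the three translations might look something like the following. This is only a sketch: the names $N$, $I$ and $M$, and the notations $\mathrm{length}(I)$ and $|M|$, are my own choices for illustration, and the second line treats $S$ as a set of intervals.

```latex
% Requires amsmath (align*, \text) and amssymb (\mathbb).
\begin{align*}
  % 'The terms of (x_n) are eventually arbitrarily small'
  &\forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n \ge N,\ |x_n| \le \varepsilon \\
  % 'There are intervals in S of arbitrarily long length'
  &\forall \ell \ge 0,\ \exists I \in S,\ \mathrm{length}(I) \ge \ell \\
  % 'T has arbitrarily large finite models'
  &\forall n \in \mathbb{N},\ \exists M \models \mathbb{T},\ M \text{ is finite and } |M| \ge n
\end{align*}
```

Each line begins with a universal quantifier, followed by an existential one whose witness ($N$, $I$ or $M$) is allowed to depend on the value chosen before it.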

Notice in each case that the word arbitrar(il)y has been replaced by a universally quantified statement: ‘for all $\varepsilon > 0$’ or ‘for all $\ell \ge 0$’ or ‘for all $n \in \mathbb{N}$’.

Just like before, this means that the reader—not the writer—has the power to choose the value to be made arbitrarily [adjective].

But such statements carry another source of confusion: it is tempting to believe that the existence of arbitrarily [adjective] [objects] means that there exists an infinitely [adjective] [object].

For example, you might want to say that if the terms of $(x_n)$ are eventually arbitrarily small, then they are eventually zero. But this is not true: for example, taking $x_n = \frac{1}{n+1}$, we see that the terms are eventually arbitrarily small, but no value of $x_n$ is equal to zero.

As another example, take $S = \{ [0,n] \mid n \in \mathbb{N} \}$. This is a set of intervals, and they have arbitrarily long lengths since for any $\ell \ge 0$ the interval $[0, \lceil \ell \rceil]$ has length $\ge \ell$, where $\lceil \ell \rceil$ is the smallest integer greater than or equal to $\ell$. Since $S$ contains intervals of arbitrarily long length, you might be tempted to say that it contains an interval of infinite length… but it doesn’t, since the length of each interval in $S$ is a natural number, so each interval in $S$ has finite length.
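One way to see the difference is to compare the order of the quantifiers. The following is a sketch in notation of my own choosing, where $\mathrm{length}(I)$ stands for the length of an interval $I$ and ‘infinitely long’ is read as ‘longer than every $\ell$’.

```latex
% Requires amsmath (align*, \text).
\begin{align*}
  % The interval I may depend on the chosen ell.
  \text{arbitrarily long:} \quad
    & \forall \ell \ge 0,\ \exists I \in S,\ \mathrm{length}(I) \ge \ell \\
  % A single interval I must work for every ell at once.
  \text{infinitely long:} \quad
    & \exists I \in S,\ \forall \ell \ge 0,\ \mathrm{length}(I) \ge \ell
\end{align*}
```

For $S = \{ [0,n] \mid n \in \mathbb{N} \}$ the first statement is true, taking $I = [0, \lceil \ell \rceil]$, but the second is false: whichever interval $[0,n]$ is chosen, the value $\ell = n+1$ exceeds its length.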

The moral of the story

To summarise:

  • The word arbitrary means generic when used in the context ‘let [variable] be arbitrary’, meaning that the reader should be able to substitute any value they please for the variable, and the proof should still go through.
  • The expression arbitrarily [adjective] means that there is no bound on how [adjective] the object in question can be, but it does not necessarily imply that some object is infinitely [adjective], whatever that means.
  • When using the word arbitrary in mathematical writing, always remember that it is for the reader, not the writer, to make the arbitrary decisions.

Talk about formalisation at the GSS

Yesterday, I gave a talk at the CMU Graduate Student Seminar entitled Formalisation or: How I Learned to Stop Worrying and Love the Computer.

Abstract: What distinguishes mathematics from the empirical sciences is that we prove stuff with complete certainty. Ha, just kidding! If you look at the mathematical literature in a bit more detail, you’ll see a body of work littered with errors, omissions, disagreements, leaps of faith, appeals to intuition and duplicated efforts. In this talk, I will describe some ongoing efforts to use computers to help us overcome these problems, including work currently being done right here in our beloved Steel City. Prerequisites: nothing but an open mind and a hungry stomach.

Slides: available here.
