Constructing the hyperreal numbers

The hyperreal numbers are the members of an ordered field denoted {\star {\mathbf R}} which has a proper subfield isomorphic to {{\mathbf R}} (the set of all real numbers). One of the interesting things about {\star {\mathbf R}} is that it has infinite members. The formal definition of an infinite member of an ordered field is as follows.

Definition 1 For every ordered field {F} and every member {x} of {F}, {x} is infinite if and only if for every integral member {n} of {F}, {n < |x|}.

For every infinite member {x} of an ordered field {F}, we have {|x^{-1}| < n^{-1}} for every positive integer {n}, since {n < |x|}; therefore, {x^{-1}} is said to be infinitesimal. Now, the mathematicians who originally developed the calculus (principally Newton and Leibniz) often justified their proofs with reference to infinitesimals. For example, Leibniz would have justified that the derivative of {x \in {\mathbf R} \mapsto x^2} is {x \in {\mathbf R} \mapsto 2 x} roughly as follows: for every real number {x} and every infinitesimal {h},

\begin{array}{rcl} \frac {(x + h)^2 - x^2} h &=& \frac {x^2 + 2 h x + h^2 - x^2} h \\   &=& 2 x + h,  \end{array}

which differs from {2 x} only by an infinitesimal. He was on dubious ground here, because he was working before a rigorous definition of the real numbers had been developed, so it was not clear that infinitesimals actually existed. In fact, it was always pretty clear that no real numbers could be infinitesimal, since reciprocals of infinitesimals are infinite, and infinite real numbers clearly don’t exist. In the 19th century, the real numbers were finally defined in a rigorous way. It was then possible to prove in an absolutely rigorous way that there were no infinite real numbers, and hence no infinitesimal real numbers. At the same time, Augustin-Louis Cauchy gave an explanation of how the calculus could be developed without any reference to the concept of infinitesimals. Cauchy’s way of developing calculus (which, now that it has been placed on a rigorous footing, tends to be called analysis) is now standard. But in the 1960s, Abraham Robinson developed the concept of hyperreal numbers and explained how analysis could also be developed using the infinitesimals in {\star {\mathbf R}} in much the same way as Leibniz. This way of developing analysis is called non-standard analysis. Obviously, the disadvantage of non-standard analysis is that you have to construct a whole other number system, but proofs in it do tend to be a lot more intuitive. I’m not going to go into it in this post, but I think the best illustration of this is the proof of the extreme value theorem. You can see both the standard and non-standard proofs on Wikipedia. Once you are familiar with {\star {\mathbf R}} I think you’ll agree that the non-standard proof is much more natural.

In this post, I’m just going to explain how {\star {\mathbf R}} can be constructed. The basic idea is to let {\star {\mathbf R}} be the quotient of {{\mathbf R}^{\mathbf N}} (the set of all sequences of real numbers) by an equivalence relation {R}, so that every hyperreal number can be represented by a sequence of real numbers. Informally, two sequences of real numbers {\langle r_i \rangle} and {\langle s_i \rangle} are equivalent with respect to {R} if and only if for almost every natural number {i}, {r_i = s_i}. The difficulty in the definition is making the term “almost” precise. One natural way to do this would be to say that {r_i = s_i} for almost every natural number {i} if and only if the set of all natural numbers {i} such that {r_i = s_i} is infinite. Unfortunately, {R} will not be an equivalence relation if “almost” is defined in this way. An equivalence relation must be reflexive, symmetric and transitive. But consider the sequences {\langle 1 \rangle}, {\langle -1 \rangle} and {\langle (-1)^i \rangle}. The set of all natural numbers {i} such that {1 = -1} is empty, so {\langle 1 \rangle} and {\langle -1 \rangle} are not equivalent with respect to {R}. Yet the set of all natural numbers {i} such that {1 = (-1)^i} is the set of all even natural numbers, which is infinite, and the set of all natural numbers {i} such that {-1 = (-1)^i} is the set of all odd natural numbers, which is infinite. Therefore, {\langle 1 \rangle} and {\langle (-1)^i \rangle} are equivalent with respect to {R} and {\langle -1 \rangle} and {\langle (-1)^i \rangle} are equivalent with respect to {R}. So {R} is not transitive.
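The failure of transitivity can be checked mechanically. Here is a minimal Python sketch, under the assumption that we only look at periodic sequences, each stored as a tuple holding one period, with indices starting at 0. The function name is made up for this illustration. For periodic sequences the set of indices where they agree is itself periodic, so it is infinite exactly when it is non-empty within one joint period.

```python
from math import gcd

def agree_infinitely_often(r, s):
    """r and s each hold one full period of a periodic sequence.

    The agreement set of two periodic sequences is periodic, so it is
    infinite exactly when it is non-empty within one joint period.
    """
    period = len(r) * len(s) // gcd(len(r), len(s))
    return any(r[i % len(r)] == s[i % len(s)] for i in range(period))

ones = (1,)             # the constant sequence <1>
minus_ones = (-1,)      # the constant sequence <-1>
alternating = (1, -1)   # <(-1)^i>, indexed from i = 0 (an assumption)

print(agree_infinitely_often(ones, alternating))        # True
print(agree_infinitely_often(minus_ones, alternating))  # True
print(agree_infinitely_often(ones, minus_ones))         # False
```

The first two results together with the third are exactly the broken chain described above.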

Let us think about the problem in more general terms. Let {\mathscr F} be an arbitrary set of subsets of {{\mathbf N}} and let us say that {r_i = s_i} for almost every natural number {i} if and only if the set of all natural numbers {i} such that {r_i = s_i} is a member of {\mathscr F}. In other words, {\mathscr F} is the set of all subsets of {{\mathbf N}} which contain almost every natural number. Given that {R} must be an equivalence relation, what can we conclude about {\mathscr F}?

  1. {R} must be reflexive. Therefore, for every sequence of real numbers {\langle r_i \rangle}, the set of all natural numbers {i} such that {r_i = r_i} must be in {\mathscr F}. This set is simply {{\mathbf N}} itself (because equality is reflexive). So it must be the case that {{\mathbf N}} is in {\mathscr F}.
  2. {R} must be symmetric. Therefore, for every pair of sequences of real numbers {\langle r_i \rangle} and {\langle s_i \rangle}, if {\{i \in {\mathbf N} : r_i = s_i\}} is in {\mathscr F}, {\{i \in {\mathbf N} : s_i = r_i\}} must be in {\mathscr F} as well. But {\{i \in {\mathbf N} : s_i = r_i\}} and {\{i \in {\mathbf N} : r_i = s_i\}} are always the same (because equality is symmetric), so this does not impose any condition on {\mathscr F}.
  3. {R} must be transitive. Therefore, for every triple of sequences of real numbers {\langle r_i \rangle}, {\langle s_i \rangle} and {\langle t_i \rangle}, if {\{i \in {\mathbf N} : r_i = s_i\}} and {\{i \in {\mathbf N} : s_i = t_i\}} are both in {\mathscr F}, {\{i \in {\mathbf N} : r_i = t_i\}} must be in {\mathscr F} as well. Now, {\{i \in {\mathbf N} : r_i = t_i\}} includes the intersection of {\{i \in {\mathbf N} : r_i = s_i\}} and {\{i \in {\mathbf N} : s_i = t_i\}} (because equality is transitive). Since {\langle r_i \rangle}, {\langle s_i \rangle} and {\langle t_i \rangle} can be chosen so that {\{i \in {\mathbf N} : r_i = s_i\}} and {\{i \in {\mathbf N} : s_i = t_i\}} are any two subsets of {{\mathbf N}} and {\{i \in {\mathbf N} : r_i = t_i\}} is any superset of their intersection, it follows that {\mathscr F} must contain every superset of every finite intersection of its members.

A set of subsets of {{\mathbf N}} that has these properties is called a filter. That is, a set of subsets of {{\mathbf N}} is a filter if and only if it contains {{\mathbf N}} as well as every superset of every finite intersection of its members. We have just proven that {R} is an equivalence relation if and only if {\mathscr F} is a filter.
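The filter conditions are easy to test by brute force on a small finite set. The following Python sketch does this under the assumption that {\{0, 1, 2\}} stands in for {{\mathbf N}}; the helper names powerset and is_filter are mine. It only checks pairwise intersections, which is enough because closure under pairwise intersections gives closure under all finite ones.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_filter(family, universe):
    """True if family contains the universe and every superset of every
    pairwise intersection of its members."""
    universe = frozenset(universe)
    family = {frozenset(a) for a in family}
    if universe not in family:
        return False
    subsets = powerset(universe)
    return all(x in family
               for a in family for b in family
               for x in subsets if a & b <= x)

U = {0, 1, 2}                                       # finite stand-in for N
everything = powerset(U)                            # every subset of U
only_u = [U]                                        # just U itself
contains_zero = [s for s in everything if 0 in s]   # supersets of {0}
big_sets = [s for s in everything if len(s) >= 2]   # stand-in for "infinite" sets

print(is_filter(everything, U))     # True
print(is_filter(only_u, U))         # True
print(is_filter(contains_zero, U))  # True
print(is_filter(big_sets, U))       # False
```

The last family fails for the same reason the infinite subsets of {{\mathbf N}} do: two "big" sets can have a small intersection.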

One trivial example of a filter is {\mathcal P({\mathbf N})}, which is actually the greatest filter with respect to inclusion. If we take {\mathscr F} to be this filter, all sequences of real numbers are equivalent with respect to {R}, so {\star {\mathbf R}} has just one member. It cannot, therefore, be an ordered field extension of {{\mathbf R}}. So this filter isn’t useful. By the way, a filter which is not {\mathcal P({\mathbf N})} is said to be proper. {\mathcal P({\mathbf N})}, then, is the only improper filter. Every filter which contains the empty set is improper, because every subset of {{\mathbf N}} is a superset of the empty set, so if a filter contains the empty set it also contains every subset of {{\mathbf N}}.

Another trivial example of a filter is {\{{\mathbf N}\}}. But if we take {\mathscr F} to be this filter, two sequences are equivalent with respect to {R} only if they agree at every index, so {R} is simply equality and {\star {\mathbf R}} is just {{\mathbf R}^{\mathbf N}} itself, which is not a field. So this filter isn’t helpful either. In order to find a non-trivial filter, let’s think about our original example where {\mathscr F} was the set of all infinite subsets of {{\mathbf N}}. The problem there was that the intersection of two infinite subsets of {{\mathbf N}} can be finite. Exactly when does this occur? Well, if two infinite subsets {A} and {B} of {{\mathbf N}} have a finite intersection, it follows that the complement of {A \cap B} in {{\mathbf N}} is infinite. Now, {{\mathbf N} \setminus (A \cap B)} can also be written as {({\mathbf N} \setminus A) \cup ({\mathbf N} \setminus B)} (by De Morgan’s laws). If this set is infinite, at least one of {{\mathbf N} \setminus A} and {{\mathbf N} \setminus B} is infinite. So we can rule out the possibility that two infinite subsets {A} and {B} of {{\mathbf N}} have a finite intersection by requiring that both subsets have a finite complement in {{\mathbf N}}. Therefore, the set {\mathscr F} of all subsets of {{\mathbf N}} with a finite complement in {{\mathbf N}} is a filter. As perhaps the simplest non-trivial filter, it has a name of its own, the Fréchet filter.
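One way to see why the Fréchet filter is closed under intersection is to store each of its members by the finite set of numbers it leaves out. The Python sketch below makes that assumption; the function names are made up for the illustration. By De Morgan’s laws, the intersection of two such sets leaves out the union of what each leaves out, which is again finite.

```python
def intersect(missing_a, missing_b):
    """Intersection of two cofinite subsets of N, each represented by the
    finite set of natural numbers it omits."""
    return frozenset(missing_a) | frozenset(missing_b)

def contains(missing, n):
    """Membership test for the cofinite set that omits exactly `missing`."""
    return n not in missing

a = frozenset({0, 1, 2})   # stands for N \ {0, 1, 2}
b = frozenset({2, 5})      # stands for N \ {2, 5}
both = intersect(a, b)

print(sorted(both))                          # [0, 1, 2, 5]
print(contains(both, 3), contains(both, 5))  # True False
```

A superset of a cofinite set omits a subset of what the original omits, so it is representable in the same way.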

Of course the Fréchet filter may still not be good enough; we haven’t checked that {\star {\mathbf R}} is an ordered field extension of {{\mathbf R}} when {\mathscr F} is taken to be the Fréchet filter. Let’s see what conditions {\mathscr F} must satisfy, if {\star {\mathbf R}} is to be an ordered field extension of {{\mathbf R}}. First, we need to come up with a way of adding and multiplying hyperreal numbers. There’s a very natural way to do this: for every pair of sequences {\langle r_i \rangle} and {\langle s_i \rangle} of real numbers, let {[\langle r_i \rangle] + [\langle s_i \rangle] = [\langle r_i + s_i \rangle]} and {[\langle r_i \rangle] [\langle s_i \rangle] = [\langle r_i s_i \rangle]}, where for every sequence {\langle t_i \rangle} of real numbers, {[\langle t_i \rangle]} denotes its equivalence class with respect to {R}. The following two theorems prove that this definition does not lead to a contradiction as long as {\mathscr F} is a filter.

Theorem 2 For every pair of sequences {\langle r'_i \rangle} and {\langle s'_i \rangle} of real numbers such that {[\langle r'_i \rangle] = [\langle r_i \rangle]} and {[\langle s'_i \rangle] = [\langle s_i \rangle]}, {[\langle r'_i + s'_i \rangle] = [\langle r_i + s_i \rangle].}

Proof: Suppose {\langle r'_i \rangle} and {\langle s'_i \rangle} are sequences of real numbers and {[\langle r'_i \rangle] = [\langle r_i \rangle]} and {[\langle s'_i \rangle] = [\langle s_i \rangle]}. Then {\{i \in {\mathbf N} : r'_i = r_i\} \in \mathscr F} and {\{i \in {\mathbf N}: s'_i = s_i\} \in \mathscr F}. For every natural number {i} such that {r'_i = r_i} and {s'_i = s_i}, {r'_i + s'_i = r_i + s_i}, so the intersection of {\{i \in {\mathbf N} : r'_i = r_i\}} and {\{i \in {\mathbf N}: s'_i = s_i\}} is a subset of {\{i \in {\mathbf N} : r'_i + s'_i = r_i + s_i\}}, which means the latter is in {\mathscr F}, because {\mathscr F} is a filter. \Box

Theorem 3 For every pair of sequences {\langle r'_i \rangle} and {\langle s'_i \rangle} of real numbers such that {[\langle r'_i \rangle] = [\langle r_i \rangle]} and {[\langle s'_i \rangle] = [\langle s_i \rangle]}, {[\langle r'_i s'_i \rangle] = [\langle r_i s_i \rangle].}

Proof: Suppose {\langle r'_i \rangle} and {\langle s'_i \rangle} are sequences of real numbers and {[\langle r'_i \rangle] = [\langle r_i \rangle]} and {[\langle s'_i \rangle] = [\langle s_i \rangle]}. Then {\{i \in {\mathbf N} : r'_i = r_i\} \in \mathscr F} and {\{i \in {\mathbf N}: s'_i = s_i\} \in \mathscr F}. For every natural number {i} such that {r'_i = r_i} and {s'_i = s_i}, {r'_i s'_i = r_i s_i}, so the intersection of {\{i \in {\mathbf N} : r'_i = r_i\}} and {\{i \in {\mathbf N}: s'_i = s_i\}} is a subset of {\{i \in {\mathbf N} : r'_i s'_i = r_i s_i\}}, which means the latter is in {\mathscr F}, because {\mathscr F} is a filter. \Box

Given these two theorems it is easy to prove that {\star {\mathbf R}} is a commutative ring, whichever filter {\mathscr F} is. Its additive identity is {[\langle 0 \rangle]} and its multiplicative identity is {[\langle 1 \rangle]}. However, when we attempt to prove that {\star {\mathbf R}} is a field (i.e., that every hyperreal number other than {[\langle 0 \rangle]} has a multiplicative inverse) we run into difficulties. For every pair of sequences {\langle r_i \rangle} and {\langle s_i \rangle} such that {[\langle s_i \rangle]} is the multiplicative inverse of {[\langle r_i \rangle]}, {[\langle r_i s_i \rangle] = [\langle 1 \rangle]}, which means that {\{i \in {\mathbf N} : r_i s_i = 1\}} is a member of {\mathscr F}. But if {\mathscr F} is the Fréchet filter and {\langle r_i \rangle} is {\langle 1 + (-1)^i \rangle}, then for every odd natural number {i}, {r_i = 0}, so {r_i s_i = 0} (regardless of the value of {\langle s_i \rangle}), which means {\{i \in {\mathbf N} : r_i s_i = 1\}} does not contain any odd natural numbers, so it cannot be a member of {\mathscr F} since its complement in {{\mathbf N}} is infinite. So it is impossible for {[\langle r_i \rangle]} to have a multiplicative inverse. Yet {\{i \in {\mathbf N} : r_i = 0\}} does not contain any even natural numbers (for every even natural number {i}, {r_i = 2}), so its complement in {{\mathbf N}} is infinite, which means {[\langle r_i \rangle] \ne [\langle 0 \rangle]}. Therefore, {\star {\mathbf R}} cannot be a field if {\mathscr F} is the Fréchet filter.
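A closely related way to see the obstruction is that the Fréchet quotient has zero divisors, which a field cannot have. The Python sketch below illustrates this with periodic sequences stored as one period each, indexed from 0. The companion sequence {\langle 1 - (-1)^i \rangle} and all the function names are my own additions for the illustration. For periodic sequences the agreement set is periodic, so it has a finite complement exactly when the sequences agree at every index of one joint period.

```python
from math import gcd

def joint_period(r, s):
    return len(r) * len(s) // gcd(len(r), len(s))

def product(r, s):
    """Termwise product of two periodic sequences, as one joint period."""
    n = joint_period(r, s)
    return tuple(r[i % len(r)] * s[i % len(s)] for i in range(n))

def frechet_equal(r, s):
    """Equivalence modulo the Frechet filter, for periodic sequences only:
    a periodic agreement set is cofinite exactly when it is all of N."""
    n = joint_period(r, s)
    return all(r[i % len(r)] == s[i % len(s)] for i in range(n))

zero = (0,)
r = (2, 0)   # <1 + (-1)^i>: 2 at even indices, 0 at odd ones
t = (0, 2)   # <1 - (-1)^i>, a hypothetical companion sequence

print(product(r, t))                       # (0, 0)
print(frechet_equal(product(r, t), zero))  # True
print(frechet_equal(r, zero))              # False
print(frechet_equal(t, zero))              # False
```

So {[\langle r_i \rangle]} and the class of the companion sequence are both different from {[\langle 0 \rangle]}, yet their product is {[\langle 0 \rangle]}.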

In order to avoid this problem, it is sufficient to ensure that as long as {[\langle r_i \rangle] \ne [\langle 0 \rangle]}, i.e. {\{i \in {\mathbf N} : r_i = 0\} \not \in \mathscr F}, the complement in {{\mathbf N}} of this set, {\{i \in {\mathbf N} : r_i \ne 0\}}, is a member of {\mathscr F}. This is because for every natural number {i} such that {r_i \ne 0}, {1 / r_i} exists, and {r_i (1 / r_i) = 1}, so {\{i \in {\mathbf N} : r_i \ne 0 \wedge r_i (1 / r_i) = 1\}} is the same set as {\{i \in {\mathbf N} : r_i \ne 0\}} and hence a member of {\mathscr F}. Therefore, as long as this condition holds, if we let {\langle s_i \rangle} be any sequence of real numbers such that for every natural number {i}, {s_i = 1 / r_i} if {r_i \ne 0}, then {[\langle s_i \rangle]} will be a multiplicative inverse of {[\langle r_i \rangle]}. Since {\langle r_i \rangle} can be chosen so that {\{i \in {\mathbf N} : r_i = 0\}} is any subset of {{\mathbf N}}, this condition is equivalent to requiring that for every subset {S} of {{\mathbf N}}, either {S \in \mathscr F} or {{\mathbf N} \setminus S \in \mathscr F}. A proper filter with this property is called an ultrafilter. (The improper filter {\mathcal P({\mathbf N})} has the property too, but as we saw above it leaves {\star {\mathbf R}} with just one member.) We have just proved that {\star {\mathbf R}} is a field if and only if {\mathscr F} is an ultrafilter.
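The ultrafilter condition can also be tested on a finite stand-in for {{\mathbf N}}. The Python sketch below assumes the family passed in is already a filter on {\{0, 1, 2\}}, and it additionally insists that the family omit the empty set, so that the improper filter is excluded. The helper names are mine.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_ultrafilter(family, universe):
    """Assumes family is a filter on the finite universe. Checks that it is
    proper and that it contains S or the complement of S for every S."""
    universe = frozenset(universe)
    family = {frozenset(a) for a in family}
    proper = frozenset() not in family
    decides = all(s in family or (universe - s) in family
                  for s in powerset(universe))
    return proper and decides

U = {0, 1, 2}
contains_zero = [s for s in powerset(U) if 0 in s]

print(is_ultrafilter(contains_zero, U))  # True
print(is_ultrafilter([U], U))            # False
print(is_ultrafilter(powerset(U), U))    # False
```

On a finite set the only filters that pass are those of the form "all sets containing a fixed point", which is one reason a finite check like this can only illustrate the definition, not the construction of {\star {\mathbf R}}.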

Note that if {\mathscr F} is to be an ultrafilter, either the set of all even natural numbers or the set of all odd natural numbers must be a member of {\mathscr F}, since these two sets are each other’s complements in {{\mathbf N}}. But neither of these sets is intuitively “larger” than the other; our intuition would be that they are the same size. Therefore, if we were to explicitly construct an ultrafilter (like how we constructed the Fréchet filter as an example of a filter), we would have to make an essentially arbitrary choice about which of these sets should be in the ultrafilter (and we would have to make other arbitrary choices for other pairs of sets as well). Because this choice is arbitrary, we will not make any such explicit construction. We will just let {\mathscr F} be an arbitrary ultrafilter. However, we should at least ensure that an ultrafilter exists. The proof of this is given below. It relies on Zorn’s lemma, a well-known result of set theory equivalent to the axiom of choice.

Lemma 4 A filter {\mathscr U} is an ultrafilter if and only if it is a maximal proper filter with respect to inclusion, i.e., it is proper and the only filters {\mathscr V} such that {\mathscr U \subseteq \mathscr V} are the improper filter {\mathcal P({\mathbf N})} and {\mathscr U} itself.

Proof: Suppose {\mathscr U} is an ultrafilter, {\mathscr V} is a filter and {\mathscr U \subseteq \mathscr V}. Let {S} be a member of {\mathscr V}. Since {\mathscr U} is an ultrafilter, either {S \in \mathscr U} or {{\mathbf N} \setminus S \in \mathscr U}. If {{\mathbf N} \setminus S \in \mathscr U}, {{\mathbf N} \setminus S \in \mathscr V} as well, but then {S \cap ({\mathbf N} \setminus S)}, i.e. {\emptyset}, must also be in {\mathscr V}. Since {\emptyset} is a subset of every subset of {{\mathbf N}}, it follows that every subset of {{\mathbf N}} is in {\mathscr V}, so {\mathscr V} is the improper filter {\mathcal P({\mathbf N})}. The only way {\mathscr V} can avoid being {\mathcal P({\mathbf N})} is if this is not the case for any value of {S}, which means every value of {S} is a member of {\mathscr U}, so {\mathscr V \subseteq \mathscr U}, which means {\mathscr U = \mathscr V}.

Suppose {\mathscr U} is a filter and the only filters {\mathscr V} such that {\mathscr U \subseteq \mathscr V} are {\mathcal P({\mathbf N})} and {\mathscr U} itself. Let {S} be a subset of {{\mathbf N}} which is not in {\mathscr U}. We want to show that {{\mathbf N} \setminus S \in \mathscr U}. It is sufficient to show that a subset of {{\mathbf N} \setminus S} is in {\mathscr U} since {\mathscr U} is a filter. Now, a member {T} of {\mathscr U} is a subset of {{\mathbf N} \setminus S} if and only if it is disjoint from {S}, i.e. {S \cap T = \emptyset}. So we need to show that {\mathscr U} contains a member {T} disjoint from {S}. Let {\mathscr V} be the set of all supersets of intersections of {S} with members of {\mathscr U} (i.e. {\mathscr V = \{X \subseteq {\mathbf N} : (\exists T \in \mathscr U) (S \cap T \subseteq X)\}}). It can be immediately seen that {\mathscr V} is a filter, and, moreover, {\mathscr U \subseteq \mathscr V}, so {\mathscr V} is either {\mathcal P({\mathbf N})} or {\mathscr U} itself. But it cannot be {\mathscr U} itself, because {S \cap {\mathbf N} = S} is a member of {\mathscr V} but not {\mathscr U}. So {\mathscr V} must be {\mathcal P({\mathbf N})}, which contains {\emptyset}, so there is a member {T} of {\mathscr U} such that {S \cap T \subseteq \emptyset}, i.e. a member of {\mathscr U} disjoint from {S}. \Box
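As a sanity check, the statement of Lemma 4 can be verified by brute force on a finite stand-in for {{\mathbf N}}. The Python sketch below assumes the universe {\{0, 1, 2\}}, enumerates all 256 families of its subsets, and compares the maximal proper filters with the proper filters that contain every set or its complement. This is only a finite analogue with made-up helper names, not a substitute for the proof.

```python
from itertools import combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

UNIVERSE = frozenset({0, 1, 2})   # finite stand-in for N
SUBSETS = powerset(sorted(UNIVERSE))

def is_filter(family):
    """Contains the universe and every superset of every pairwise intersection."""
    if UNIVERSE not in family:
        return False
    return all(x in family
               for a in family for b in family
               for x in SUBSETS if a & b <= x)

def decides_every_set(family):
    """Contains S or the complement of S, for every subset S."""
    return all(s in family or (UNIVERSE - s) in family for s in SUBSETS)

families = powerset(SUBSETS)                        # all 256 families
filters = [f for f in families if is_filter(f)]
proper = [f for f in filters if frozenset() not in f]
maximal = [f for f in proper if not any(f < g for g in proper)]
ultra = [f for f in proper if decides_every_set(f)]

print(len(filters), len(proper), len(maximal))  # 8 7 3
print(set(maximal) == set(ultra))               # True
```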

Theorem 5 There is an ultrafilter.

Proof: Let {S} be the set of all filters other than {\mathcal P({\mathbf N})} which are supersets of the Fréchet filter. {S} is not empty, because it contains the Fréchet filter itself. Let {T} be any non-empty chain in {S} with respect to inclusion. In order to use Zorn’s lemma we need to show that {T} has an upper bound in {S}. Well, the union {\mathscr U} of {T} is an upper bound of {T} with respect to inclusion. The hard part is showing that {\mathscr U} is a filter and not {\mathcal P({\mathbf N})}. It obviously contains {{\mathbf N}} since all the members of {T} contain {{\mathbf N}}. And it obviously doesn’t contain {\emptyset} since none of the members of {T} contain {\emptyset}; therefore, it is not {\mathcal P({\mathbf N})}. Suppose {A} and {B} are members of {\mathscr U}. Then there are members {\mathscr F} and {\mathscr G} of {T} such that {A \in \mathscr F} and {B \in \mathscr G}. Since {T} is a chain we can assume without loss of generality that {\mathscr F \subseteq \mathscr G}, which means {A \in \mathscr G} as well, so every superset of {A \cap B} is in {\mathscr G} (since {\mathscr G} is a filter) and hence in {\mathscr U}. So {\mathscr U} is a filter. Therefore, by Zorn’s lemma, there is a maximal member of {S} with respect to inclusion. Every proper filter which includes this member also includes the Fréchet filter and is therefore in {S}, so this member is a maximal proper filter. By the lemma above, it must be an ultrafilter. \Box
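On a finite set no appeal to Zorn’s lemma is needed, because a proper filter can be enlarged one set at a time until it is maximal. The Python sketch below does this on the stand-in universe {\{0, 1, 2\}}, using the construction of {\mathscr V} from the proof of Lemma 4 as the enlargement step. The function names are mine, and the result depends on the arbitrary order in which the subsets are visited. Nothing like this finite enumeration is available on {{\mathbf N}} itself, which is why the proof above is non-constructive.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets, smallest first."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def extend(filt, s, subsets):
    """The filter V from the proof of Lemma 4: every superset of s & t
    for some member t of filt."""
    return frozenset(x for x in subsets if any(s & t <= x for t in filt))

def to_ultrafilter(filt, universe):
    """Enlarge a proper filter on a finite universe until every subset or
    its complement is a member."""
    universe = frozenset(universe)
    subsets = powerset(universe)
    current = frozenset(frozenset(a) for a in filt)
    for s in subsets:
        # Adding s keeps the filter proper because the complement of s is absent.
        if s not in current and (universe - s) not in current:
            current = extend(current, s, subsets)
    return current

U = {0, 1, 2}
result = to_ultrafilter([U], U)
print(sorted(sorted(s) for s in result))  # [[0], [0, 1], [0, 1, 2], [0, 2]]
```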

We have not yet defined the order on {\star {\mathbf R}}, but there is a natural way to do it which makes {\star {\mathbf R}} into an ordered field, without any more conditions being imposed on {\mathscr F}. I may expand on this in a later post, but this was supposed to only take a day or two to write.

One response to “Constructing the hyperreal numbers”

  1. Garry Whitworth Briggs

    use hyperreals in SUPERCOMPLEX NUMBERS, e.g. z = a + bi + cj + dij where j*j = +1 and j is the Perplex Operator
