# Large and small sets

For some infinite sets ${S = \{x_1, x_2, x_3, \dotsc\}}$ of positive integers, the sum of the reciprocals of the members of ${S}$, i.e.

$\displaystyle \frac 1 {x_1} + \frac 1 {x_2} + \frac 1 {x_3} + \dotsb,$

diverges. Such sets are said to be large. For example, ${\mathbb N}$, the set of all the positive integers, is large, because the harmonic series diverges. On the other hand, for some infinite sets ${S}$ the sum of the reciprocals of the members of ${S}$ converges instead; such sets are said to be small. For example, for every integer ${p}$ greater than 1, the set of all the positive integral ${p}$th powers is small, because the series

$\displaystyle 1 + \frac 1 {2^p} + \frac 1 {3^p} + \dotsb$

converges. There is thus an interesting distinction between two different sizes of infinite sets of positive integers. Let’s have a look at some other infinite sets of positive integers and see whether they are large or small.

For every positive integer ${m}$, the set of all the positive integral multiples of ${m}$ is easily seen to be large, due to the fact that the harmonic series diverges, and dividing the harmonic series by ${m}$ yields the series

$\displaystyle \frac 1 m + \frac 1 {2 m} + \frac 1 {3 m} + \dotsb.$

In fact, this proof can easily be generalised to show that for every set ${S}$ of positive integers and every positive integer ${m}$, ${\{m n : n \in S\}}$ is large if and only if ${S}$ is large.

What about the set ${S}$ of all the positive integers which are not multiples of ${m}$? The sum of the reciprocals of the members of ${S}$ can be written as the difference

$\displaystyle \left( 1 + \frac 1 2 + \frac 1 3 + \dotsb \right) - \left( \frac 1 m + \frac 1 {2 m} + \frac 1 {3 m} + \dotsb \right).$

The minuend and subtrahend of this difference are both equal to ${\infty}$, so the sum of the reciprocals of the members of ${S}$ cannot be immediately seen from writing it like this. However, if we take out the factor of ${1 / m}$ from the subtrahend, we get

$\displaystyle \left( 1 + \frac 1 2 + \frac 1 3 + \dotsb \right) - \frac 1 m \left( 1 + \frac 1 2 + \frac 1 3 + \dotsb \right),$

and then we can remove the common factor of ${1 + 1 / 2 + 1 / 3 + \dotsb}$ to get

$\displaystyle \left( 1 + \frac 1 2 + \frac 1 3 + \dotsb \right) \left( 1 - \frac 1 m \right).$

It is easy to see that the value of this expression is also ${\infty}$. So ${S}$ is large, as well.

In fact, for every finite set of pairwise coprime positive integers ${p_1}$, ${p_2}$, … and ${p_{\ell}}$, the set ${S}$ of all the positive integers which are not multiples of any of these integers is large, too. This is because, if we let ${H}$ be the value of the harmonic series, then, using the inclusion-exclusion principle, the sum of the reciprocals of the members of ${S}$ can be written as

$\displaystyle H - \sum_{k} \frac H k + \sum_{k_1, k_2} \frac H {k_1 k_2} - \dotsb + \frac {(-1)^{\ell} H} {k_1 k_2 \dotsm k_{\ell}},$

in which, for every integer $j$ such that $1 \le j \le \ell$, the $j$th term is a sum over all the finite sets of $j$ integers $k$ such that $1 \le k \le \ell$. By factorising this expression, we can rewrite it as

$\displaystyle H \left( 1 - \frac 1 {p_1} \right) \left( 1 - \frac 1 {p_2} \right) \dotsm \left( 1 - \frac 1 {p_{\ell}} \right),$

and it is easy to see that the value of this expression is ${\infty}$.

Generalising from this, one might expect that if we let ${P = \{p_1, p_2, p_3, \dotsc\}}$ be the prime numbers, so that the members of ${P}$ are pairwise coprime and every integer is a multiple of a member of ${P}$, then

$\displaystyle H \left( 1 - \frac 1 {p_1} \right) \left( 1 - \frac 1 {p_2} \right) \left( 1 - \frac 1 {p_3} \right) \dotsm = 1. \ \ \ \ \ (1)$

To prove this rigorously, all that is needed is to note that for every positive integer ${n}$,

$\displaystyle H \left( 1 - \frac 1 {p_1} \right) \left( 1 - \frac 1 {p_2} \right) \dotsm \left( 1 - \frac 1 {p_n} \right) = 1 + K,$

where ${K}$ is the sum of the reciprocals of the integers greater than 1 which are not multiples of any of the first ${n}$ prime numbers. The least such integer is ${p_{n + 1}}$, so ${K}$ is smaller than the sum of the reciprocals of the integers greater than or equal to ${p_{n + 1}}$. Therefore,

$\displaystyle H \left( 1 - \frac 1 {p_1} \right) \left( 1 - \frac 1 {p_2} \right) \dotsm \left( 1 - \frac 1 {p_n} \right) < 1 + \frac 1 {p_{n + 1}} + \frac 1 {p_{n + 1} + 1} + \frac 1 {p_{n + 1} + 2} + \dotsb.$

As ${n}$ tends towards positive infinity, the sum of the reciprocals of the integers greater than or equal to ${p_{n + 1}}$ tends towards 0, because it can be written as ${H - H_{p_{n + 1} - 1}}$, which is less than ${H - H_n}$, and by definition ${H_n}$ tends towards ${H}$ as ${n}$ tends towards positive infinity. So the right-hand side of the inequality above tends towards 1, and an application of the squeeze theorem completes the proof.

Rearranging (1) to solve for ${H}$ yields

$\displaystyle H = \left( \frac 1 {1 - 1 / p_1} \right) \left( \frac 1 {1 - 1 / p_2} \right) \left( \frac 1 {1 - 1 / p_3} \right) \dotsm,$

so we now have a way of expressing the harmonic series as an infinite product. This is quite interesting in its own right, but it can also be used to prove a rather surprising result. By taking the natural logarithm of both sides, we see that

$\displaystyle \ln H = \ln \left( \frac 1 {1 - 1 / p_1} \right) + \ln \left( \frac 1 {1 - 1 / p_2} \right) + \ln \left( \frac 1 {1 - 1 / p_3} \right) + \dotsb.$

The natural logarithm of ${\infty}$ is ${\infty}$, so this means the series on the right-hand side diverges. For every positive integer ${n}$, the ${n}$th term of the series on the right-hand side, ${\ln (1 / (1 - 1 / p_n))}$, can also be written as ${\ln (1 + 1 / (p_n - 1))}$, and this number is less than or equal to ${1 / (p_n - 1)}$ (in general, for every real number ${x}$ greater than ${-1}$, ${\ln (1 + x) \le x}$). If ${n}$ is greater than 1, then this number, in turn, is less than or equal to ${1 / p_{n - 1}}$, because ${p_n - 1}$ is greater than or equal to ${p_{n - 1}}$. And ${1 / (2 - 1)}$, i.e. 1, is less than or equal to 1. So, by the comparison test, the series

$\displaystyle 1 + \frac 1 {p_1} + \frac 1 {p_2} + \frac 1 {p_3} + \dotsb$

diverges, too. This shows that the set consisting of 1 as well as all the prime numbers is large, but it is then an immediate consequence that the set of all the prime numbers, not including 1, is large (since we can subtract 1 from the above series and the value of the result will still have to be ${\infty}$).

This is a rather interesting and surprising result. The set of all the prime numbers is large, but for every positive integer ${p}$, the set of all the positive integral ${p}$th powers is small, which means that, in a certain sense, there are more prime numbers than positive integral ${p}$th powers.