In this post, I’ll cover more or less the basic topics which are usually covered in any set theory course. I expect that the theorems presented in this post will be used extensively in my next articles and I will be referencing this post a lot.

I am skipping many formal details getting straight to the main idea of proofs, but all of the proofs can be made strict without much effort. I also present simpler results without proofs in the form of exercises. I am too lazy to prove each bit of trivialities, sorry.

We will start from a seemingly absurd statement.

**Theorem**. A unit line segment \([0, 1]\) contains exactly the same number of points as a filled unit square \([0, 1]^2\).**Proof**. Consider a decimal fraction representation of points of the unit segment and let function \(f\) map these points to a unit square by taking all odd positioned digits of decimal representation to the first coordinate of the square and all even positioned digits to the second coordinate of the square. This cryptic rule can be made clear with the help of an example. Point \(0.1234567\) has odd-positioned digits 1, 3, 5, 7 and even-positioned digits 2, 4, 6 in its decimal representation, so \(f\) will take this point into a point with coordinates \((0.1357, 0.246)\). It is easy to see that this mapping is one to one. QED.

**Exercise**. As promised, I provide quite informal proofs and sometimes admit holes in them. The proof above has one such hole. Can you see it? Can you resolve it? (Hint: any number has two decimal representations, for example number 0.2 can be also represented as an infinite fraction 0.19999…).

This proof can be easily extended to the case of a unit cube, or even a unit *n*-dimensional cube. Instead of mapping a point to two coordinates, map it to \(n\) coordinates and use each *n*-th digit of decimal representation for this purpose.

This result is quite counter-intuitive and may seem even wrong. After all, a line segment has area 0, but a unit square has area 1. A unit square has volume 0, but a

The catch here is in the terminology. When we say “line segment”, “square” or “cube” we immediately visualize geometric objects. The same terminology as in geometry is used in set theory, but set theory lacks all of the geometric sense. Things like “length”, “angle” or “area” are just not defined in set theory and we cannot make any references to them. Without these notions, set-theoretic figures are not exactly the same as geometric figures, even though we can describe them using coordinates. In particular, you are not supposed to draw set-theoretic figures on a sheet of paper, since any picture immediately suggests geometric information which doesn’t exist. As soon as you drop all of your geometric intuition, the only remaining way of dealing with these figures is to analyze functions that map points between them.

The same consideration applies to physical intuition. Mathematical objects don’t usually have mass, atoms inside, density etc, so when we stretch objects, they don’t become less dense. As an example, take function \(f:x\mapsto 2x\) which maps interval \([0,1]\) to \([0,2]\). Clearly, \([0,1]\) is a subset of \([0,2]\), but at the same time \(f\) is one-to-one, so each point in \([0,1]\) can be mapped uniquely with a point in \([0,2]\) and vice versa. Thus from one point of view \(f\) stretches the interval two times, since any subinterval is mapped to a subinterval of twice the length (e.g. \([0.1,0.2]\) is mapped to \([0.2, 0.4]\)). At the same time its image is not less “dense” than its preimage, so in this

The preceding paragraph can be summarized as a **rule**: stretching of an object doesn’t change the number of points in it.

To make this reasoning more clear, consider a set of integers. Are there more integer numbers than even numbers? From one point of view yes, since any even number is an integer, but not any integer is even. At the same time function \(x\mapsto 2x\) provides a way to match each integer with exactly one even number and vice

Which response is correct? In fact, both of them could be correct, depending on how you define “more”. As soon as we leave the world of finite sets, we cannot really say that one set has more elements than another in the same sense as we compare sets of 2 and 3 elements. Looking into subsets is not very productive though. We know that \([0,1]\subset[0,2]\), but at the same time \([100, 101]\not\subset[0,2]\), even though it is the same set moved on a real line. We clearly don’t want moving sets around affecting our reasoning about sizes of sets. Thus, it appears to be more natural when we use one-to-one mappings to compare cardinality of sets and so this definition is usually used.

**Definition**. Function \(f\) is called *injection*, if it maps different elements into different elements. Formally, for injective function, if \(f(x)=f(y)\) then \(x=y\).

**Definition**. We say that set \(A\) is *not smaller* than set \(B\) (or has *not smaller cardinality*, or has *not less elements*) and denote it as \(|A|\le|B|\) if there is an injection from \(A\) to \(B\). We say that set \(A\) has *the same number of elements* as \(B\) (or *the same cardinality*, or *the same number of elements*) and denote it as \(|A|=|B|\) if there is a one-to-one mapping between \(A\) and \(B\).

**Definition**. We say that set \(A\) has the cardinality of *the* *continuum* if there exists a one-to-one mapping between \(A\) and a unit segment. We denote this as \(|A|=\mathfrak{c}\).

**Schröder-Bernstein Theorem**. If \(|A|\le|B|\) and \(|B|\le|A|\), then \(|A|=|B|\).

The statement of the theorem seems quite intuitive, but given all of the subtlety of infinite sets, this requires a formal proof. I am skipping it, but if you are interested you can check one of the approaches on Wikipedia.

**Exercise**. Prove that the whole real line \(\mathbb{R}\) has the cardinality of the

**Exercise**. Prove that an arbitrarily dimensional space \(\mathbb{R}^n\)

**Exercise**. Given \(|A|\le|B|\le|C|\) and \(|A|=|C|\), prove that \(|A|=|B|\).

Consider an arbitrary figure in *n*-dimensional space. If this figure has a subset of continuum cardinality, we end up in the situation of the exercise above. Thus we conclude that any figure in any finite-dimensional space has the maximum cardinality of the

This result raises an interesting question: is it possible at all to have the cardinality bigger than the continuum? Before answering this question let’s consider a couple of more examples.

**Definition**. Set \(A\) is called *countable* if it contains the same number of elements as the set of natural numbers \(\mathbb{N}= \{0, 1, 2, 3, \ldots\}\). This is the same as to say that all elements of \(A\) can be arranged into an infinite sequence. We denote it as \(|A|=\aleph_0\).

The following exercises are elementary and compulsory to be solved before you continue.

**Exercise**. Prove that union of a

**Exercise**. Prove that Cartesian product of countable sets is countable.

**Exercise**. Prove that sets of integers and rational numbers are countable.

**Exercise**. Prove that a set of finite words built from a finite alphabet is countable.

**Exercise**. Prove that a set of polynomials with integer coefficients is countable.

All of the statements in the exercises above are quite simple, but at the same time important. I think that you will be able to solve these quite easily, otherwise just try to search for the solution on the Internet.

From the

**Cantor’s Theorem**. For an arbitrary set \(A\), its powerset (i.e. a set of all subsets, we denote it by \(2^A\)) has a strictly bigger cardinality than \(A\).

**Proof**. We will prove this theorem by contradiction. Let’s assume that there is a one-to-one mapping \(f: A \to 2^A\). Let $$X=\{x\in A | x\not\in f(x)\}$$ Since \(f\) is one-to-one and \(X\) is a subset of \(A\), we know that there is \(a\in A\) such that \(f(a) = X\). If we assume that \(a\in X\) than by definition of \(X\) we get \(a\not\in f(a)=X\) which is a contradiction. The assumption that \(a\not\in X\) leads to the same contradiction. Thus, both possibilities are impossible and our initial assumption that one-to-one \(f\) exists is false. We conclude that there is no one-to-one mapping between \(A\) and \(2^A\). QED.

If \(|A|=a\) then for cardinality of all subsets of \(A\) we will write \(|2^A| = 2^a\).

**Theorem**. \(\mathfrak{c} = 2^{\aleph_0}\).

**Proof**. Any number within a unit segment can be represented as a fraction between 0 and 1 in a binary number system (e.g. any number can be written as something like 0.11001001110…). If we match each digit in this binary representation to a number in \(\mathbb{N}\), then we see that we can map each number within a unit segment to a subset of \(\mathbb{N}\). By Cantor’s theorem we see that the cardinality of a unit segment has to be bigger than the cardinality of \(\mathbb{N}\). QED.

From here we can immediately make some practical statements. The first two of them demonstrate why it makes sense to compare sets using one-to-one mappings.

**Statement 1**. There are real numbers which are not solutions to any of the polynomial equations with integer coefficients (numbers which appear to be solutions are called *algebraic *and numbers which cannot be solutions are called *transcendental*; this statement says that transcendental numbers exist).

This statement follows immediately from the fact that the set of all polynomials is countable, while the set of real numbers is uncountable.

**Statement 2**. There are numbers such that we cannot write a computer program which prints all of the digits of this number sequentially (such numbers are often called non-*computable*, although terminology may vary). Note that I am not saying anything about the time which the program takes for computation. This is the statement only about the theoretical possibility of such a program, even in the

This is actually the same statement as statement 1. Each computer program is just a finite string made up of symbols of a finite alphabet and as you have proved above, sets of such strings are always countable. The proof of this fact mirrors the proof for the number of polynomials. Please note that this result is applied irrespectively to the programming language, formal logic or the computer architecture we use. Until programs are strings,

**Statement 3**. Set of all sets cannot exist.

If a set of all sets could exist, it would be “the biggest” set of all, since it has to contain all the elements from all of the sets. Since we know that for an arbitrary set \(A\) its powerset \(2^A\) is “bigger” than \(A\), we conclude that there cannot be “the biggest” set.

The last three statements are simple enough to be proven as an exercise.

**Statement 4**. Any set of intervals on a real line is at most countable.

**Statement 5**. Any real function can have at most a countable number of extremums.

**Statement 6**. Any real increasing function can have at most a countable number of discontinuities.

In fact, there is nothing special about increasing functions. In one of the following blog posts we will generalize the last statement to all functions \(\mathbb{R}\to\mathbb{R}^n\) which have only discontinuities of the first kind. But this is a topic for a separate blog post.