Why is implication so weird?

Everyone who learned formal logic knows the truth table for implication:

\(A\)  \(B\)  \(A \Rightarrow B\)
0      0      1
0      1      1
1      0      0
1      1      1

The last two rows don’t raise any concerns, but the first two say that anything follows from a false statement. According to this table, the following statements are all true:

  1. 2 + 2 = 4 follows from 2 + 2 = 5
  2. The Earth existed for not more than 10 years, so 2 + 2 = 5
  3. The Sun is black, so it follows that the Sun is red

All of these are obviously absurd, but professors keep teaching it at universities, and I’ll try to explain why. I won’t go into much detail and will sometimes take a very liberal approach to the terminology, aiming at the reader’s intuition rather than formal completeness. Hopefully this won’t become an obstacle, and the reader will benefit from an informal look at different views on this topic.

(Non-) explanation 1: By definition
You can make up an arbitrary binary operation (there are just 16 of them in classical logic) and call it whatever you want. This is a valid definition and one can work with it. After all, any definition is nothing more than an agreement on terminology, and this is the most common “explanation” people hear at universities.
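If you want to see those 16 possibilities explicitly, here is a small Python sketch (the code and names are mine, purely an illustration): each binary operation on {0, 1} is fully determined by its four outputs, so we can simply enumerate them.

```python
from itertools import product

# Every binary operation on {0, 1} is described by its outputs on the
# four input pairs (0,0), (0,1), (1,0), (1,1).
inputs = list(product([0, 1], repeat=2))
operations = list(product([0, 1], repeat=4))  # 2**4 = 16 possible truth tables

print(len(operations))  # 16

# The table the textbooks call "implication" outputs 1, 1, 0, 1
# on (0,0), (0,1), (1,0), (1,1) respectively.
implication = dict(zip(inputs, (1, 1, 0, 1)))
print(implication[(0, 1)])  # 1: "anything follows from a false statement"
```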

Even though such an explanation is perfectly valid, it doesn’t give you much. Why exactly was this definition chosen among the 16 possibilities, and why is it called “implication”? It remains unclear. You should also keep in mind that if a teacher tells you that implication is just an operation defined by the truth table, you are not in a position to derive paradoxes like the ones above, since the connection between the truth table and the meaning of natural language is not defined with this approach.

(Non-) explanation 2: Paradoxical part doesn’t matter
I’ve seen this explanation in several lecture notes published online. If we apply implication to deduce true statements from true statements, we are fine. On the other hand, if we try to deduce something from false statements, we are in trouble anyway, since we assume something which is not true. So, the part of the truth table for false assumptions doesn’t matter.

This explanation is not entirely wrong, but it misses one important point: the paradoxical implication is used in mathematics, even though in most cases not directly. Let’s see how it appears in practice.

In calculus we define an open set as a set \(A\) such that for any \(x \in A\) there is some neighborhood \(N\) of \(x\) which lies within \(A\). We also treat the empty set as open, and we can understand why using paradoxical implication. This is how we write that \(A\) is open formally (for simplicity I am skipping a formal definition of “neighborhood” here):

$$x \in A \Rightarrow \exists N. x \in N \land N \subset A$$

If we substitute the empty set for \(A\), we see that this property is true by paradoxical implication, since \(x \in \emptyset\) is always false. So the fact that the empty set is open is not part of the definition, but a theorem. This statement may look innocent, but it is critical: if we didn’t accept paradoxical implication and decided to treat the empty set as non-open, the whole theory would break. For example, we can prove that the intersection of two open sets is open, but if we take two non-intersecting open sets, their intersection is empty, so the empty set must be open; otherwise we get a contradiction.
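The same vacuous truth is easy to observe in code. Here is a tiny Python illustration of mine (not part of the original definition): a universally quantified statement over the empty set is true because no element can serve as a counterexample, which is exactly why the openness condition holds for the empty set.

```python
# Vacuous truth in practice: a "for all x in A" statement over an empty A
# is true, because there is no x that could contradict it.
empty_set = set()

print(all(x > 0 for x in empty_set))  # True
print(all(x < 0 for x in empty_set))  # True as well: no counterexample exists

# The openness condition "for every x in A there is a neighborhood of x inside A"
# is exactly such a statement; for A = empty set it holds vacuously.
def vacuously_open(A, has_inner_neighborhood):
    return all(has_inner_neighborhood(x) for x in A)

print(vacuously_open(empty_set, lambda x: False))  # True: the empty set is open
```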

For another example where the paradoxical implication is used in mathematical proof directly, you can check Godel’s ontological argument, explained in this blog recently (see the proof of theorem 1).

As a deeper example, consider proofs by contradiction in general. Let’s say that we proved that \(B\) follows from \(A\). We also proved that \(B\) is false. We conclude that \(A\) must be false as well, since otherwise, we get that \(B\) is true (as we said, it follows from \(A\)), contrary to what we know. This kind of reasoning is common in mathematics, but how can we prove that proofs by contradiction do actually work?

It appears that proof by contradiction depends directly on paradoxical implication. We’ll make use of a logical rule called case analysis: if we proved that \(A\Rightarrow C\) and \(B\Rightarrow C\), and we know that \(A\lor B\) is true, then \(C\) is true as well. It is quite intuitive if you meditate on it for a minute or two, so I won’t go into details here. We take this property as an axiom.

Let’s say that we proved \(A\Rightarrow B\) and \(\neg B\) (which means that \(B\) is false). We assume that every statement is either true or false, so \(A\lor\neg A\). We’ll apply case analysis to derive \(\neg A\) from here. First of all, we trivially have \(\neg A\Rightarrow \neg A\). If we assume \(A\) instead, then by \(A\Rightarrow B\) we know that \(B\) is true, which contradicts \(\neg B\), so by paradoxical implication we can derive anything from here, including \(\neg A\). So we have derived \(\neg A\) from both \(A\) and \(\neg A\), and by case analysis we conclude \(\neg A\).
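If you prefer a mechanical check, here is a small Python sketch of mine (assuming the classical truth table for implication): it verifies that in every classical valuation the proof-by-contradiction pattern and the case analysis rule hold.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# In every valuation where (A => B) and (not B) hold, (not A) holds as well:
# the proof-by-contradiction pattern.
assert all(
    (not a)
    for a, b in product([False, True], repeat=2)
    if implies(a, b) and not b
)

# The case-analysis rule: (A => C), (B => C) and (A or B) together give C.
assert all(
    c
    for a, b, c in product([False, True], repeat=3)
    if implies(a, c) and implies(b, c) and (a or b)
)
print("both patterns hold in every classical valuation")
```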

This is how proof by contradiction works under the hood. Other interpretations of this rule are also possible, but this one is the most common and I hope sufficient as an example.

Explanation 3: Logical necessity
There are a number of properties we want the implication operation to satisfy. For example, if \(A\Rightarrow B\) and \(B\Rightarrow C\), then we expect \(A\Rightarrow C\). This property is sometimes called transitivity and we can write it as

$$((A \Rightarrow B) \land (B \Rightarrow C)) \Rightarrow (A \Rightarrow C)$$

We consider this formula an axiom, so it must be true irrespective of the truth values of \(A, B, C\). Assume now that we already know the truth table of implication for a true assumption and want to fill in the remaining entries of the table:

\(A\)  \(B\)  \(A \Rightarrow B\)
0      0      ?
0      1      ?
1      0      0
1      1      1

If we substitute 1 for \(A\) and \(C\) and 0 for \(B\) in the transitivity axiom, we get the following formula

$$((1 \Rightarrow 0) \land (0 \Rightarrow 1)) \Rightarrow (1 \Rightarrow 1)$$

Using the truth table for the known values and taking into account that \(X\land 0 = 0\) for all \(X\), we can rewrite this statement as \(0 \Rightarrow 1\). Since we assume transitivity as an axiom, this final statement must be true.

Similarly, if in the example above we take \(C\) to be false, we get that \(0 \Rightarrow 0\) must also be true. So by demanding that implication be transitive, we were able to fill in the truth table for implication. Any other truth table would falsify the transitivity property.
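Here is a small Python sketch of mine which makes the argument concrete: it brute-forces the two unknown entries and keeps only the fillings under which transitivity remains true for every assignment of \(A, B, C\).

```python
from itertools import product

known = {(1, 0): 0, (1, 1): 1}   # the uncontroversial half of the table

for false_false, false_true in product([0, 1], repeat=2):
    table = {**known, (0, 0): false_false, (0, 1): false_true}

    def imp(a, b):
        return table[(a, b)]

    # transitivity must evaluate to 1 for every combination of A, B, C
    transitive = all(
        imp(imp(a, b) and imp(b, c), imp(a, c)) == 1
        for a, b, c in product([0, 1], repeat=3)
    )
    if transitive:
        print(f"0 => 0 must be {false_false}, 0 => 1 must be {false_true}")

# The only line printed is: 0 => 0 must be 1, 0 => 1 must be 1
```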

Transitivity is not the only property which forces the truth table to be what it is. As an even simpler example, we can consider the property

$$\neg (A \Rightarrow B) \Leftrightarrow (A \land \neg B)$$

This equivalence is also what we naturally expect from an operation called “implication”: we are sure that there is no implication exactly when we can provide an example of a true assumption and a false consequence.

Explanation 4: By definition II

So what we get is that the truth table for implication is forced by the properties which we want it to have, but this is only half of the story. The previous explanation creates the false impression that paradoxical implication is forced by the fact that any statement is either true or false. This is called the classical assumption, and it is not as obvious as you might think. There are plenty of other kinds of logic which either have more truth values or don’t make any assumptions about truth values at all.

If we don’t assume that everything is true or false, we cannot really have a truth table, but we still have alternatives. Before I present them, I have to explain how logical systems are defined in general.

The science of logic has two basic purposes. The first one is to provide a language which captures only a “logical essence” of statements without attaching any particular meaning to them. The second is to give rules according to which statements can be derived from each other. Consider a classical example:

  • All humans are mortal
  • Socrates is a human
  • Therefore, Socrates is mortal

This is trivial logical reasoning, but it is quite a general one. We can apply the same logic to a completely different problem domain:

  • All cats are beautiful
  • Tom is a cat
  • Therefore, Tom is beautiful

The situation is different, but you may notice that the reasoning is exactly the same. We can apply it even to absurd statements:

  • All quipee are totoked of unkes
  • Kupe is quipee
  • Therefore, kupe is totoked of unkes

We don’t understand what “quipee”, “unkes” and “kupe” are, but we can see that at least the reasoning is correct if such things ever exist. The science of logic abstracts away from the specific meaning of the phrases and allows us to concentrate only on the derivations of one statement from another. Using formulas, the same deduction can be written as

  • \(P(x) \Rightarrow Q(x)\)
  • \(P(a)\)
  • Therefore, \(Q(a)\)

Here instead of sentences of language we use formulas, which lack any meaning, but which still allow us to capture the pattern of reasoning.

The simplest logical system known to me is the one which is taught in universities and is called “propositional calculus”. The syntax of its formulas is very simple. A formula is one of the following three things:

  1. A letter: A, B, etc. Letters symbolize logical statements, the exact meaning of which is not important.
  2. Expression \((A) \land (B)\), where \(A\) and \(B\) are formulas.
  3. Expression \(\neg (A)\), where \(A\) is a formula.

Brackets around \(A\) and \(B\) above allow us to parse formulas unambiguously. For example, the expression \(\neg ((A) \land (B))\) is precisely defined, while the formula \(\neg A \land B\) can be interpreted either as \(\neg (A \land B)\) or as \((\neg A)\land B\). In practice we often omit excessive brackets and have priority rules for logical operations, but from the formal point of view this is not needed.

For some expressions we define shorthands. As the first example, we can define \(A\lor B\) to be a shorthand for the expression \(\neg((\neg A)\land(\neg B))\). If you meditate on it for a moment, you’ll see that it matches our intuitive understanding of the operation “or”: to say “A or B” is the same as to say “it is not true that both A and B are false”. The difference from the truth table approach is that we don’t actually assign any truth value to the operation \(\lor\); we only say that it is a shorthand.

In the same way, we can say that \(A\Rightarrow B\) is just a shorthand for \(B\lor\neg A\). Once again, we don’t assume that every statement is true or false and we don’t define a truth table; we just say that \(\Rightarrow\) is a shorthand for a particular syntactic construct which happens to match our intuitive notion of “implication”: to say “A implies B” is the same as to say “either B holds, or A cannot hold”.

At this point, if we wish, we can introduce truth tables. We only need to define the truth tables for the initial operations “and” and “not”, and from there we can derive the truth table of implication. As a non-classical example, consider Kleene’s ternary logic, where apart from “true” and “false” we also have the logical value “unknown”. In the table below the “and” and “not” operations are considered given, while the truth values for “or” and “implication” are derived using the shorthand definitions:

\(A\)  \(B\)  \(A\land B\)  \(\neg A\)  \(A\lor B\)  \(A\Rightarrow B\)
F  F  F  T  F  T
F  U  F  T  U  T
F  T  F  T  T  T
U  F  F  U  U  U
U  U  U  U  U  U
U  T  U  U  T  T
T  F  F  F  T  F
T  U  U  F  T  U
T  T  T  F  T  T
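As a sketch of how the shorthand approach works in practice, here is a small Python implementation of Kleene’s logic (the encoding of values and the function names are my own): only “and” and “not” are defined directly, while “or” and implication are computed from the shorthands, and the script reproduces the table above.

```python
from itertools import product

ORDER = {"F": 0, "U": 1, "T": 2}

def k_and(a, b):
    # Kleene's "and" picks the "least true" of the two values
    return a if ORDER[a] < ORDER[b] else b

def k_not(a):
    return {"F": "T", "U": "U", "T": "F"}[a]

def k_or(a, b):
    # shorthand: A or B  :=  not((not A) and (not B))
    return k_not(k_and(k_not(a), k_not(b)))

def k_implies(a, b):
    # shorthand: A => B  :=  B or (not A)
    return k_or(b, k_not(a))

for a, b in product("FUT", repeat=2):
    print(a, b, k_and(a, b), k_not(a), k_or(a, b), k_implies(a, b))
```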

Another example is fuzzy logic. In this logic, statements can be “partially true”, which is represented by logical values in the range \(\left[0;1\right]\). Logical “and” in this case is defined as taking the minimum of two reals: \(A\land B = \min (A, B)\). The operation “not” is defined as \(\neg A = 1 - A\).

Using the shorthand rule given above, we can now conclude that the operation “or” is given by the formula
$$A\lor B = 1 - \min(1 - A, 1 - B) = \max(A, B)$$
and implication by the formula
$$A\Rightarrow B = \max(1 - A, B)$$
You can see that if \(A\) is false (its value equals 0), then the implication is true (value equals 1). At the same time if \(A\) is true, then the value of the implication equals the value of \(B\).

There is a caveat, though. In classical logic, we can say that
$$A\Rightarrow B = (\neg A)\lor B = (\neg A)\lor (A \land B)$$
but if we use the latter expression as the definition of implication in fuzzy logic, we get a different formula (I use the prime sign \(\prime\) to distinguish it from the previous definition):
$$A\Rightarrow' B = \max(1 - A, \min(A, B))$$

Although this formula gives the same results when \(A\) is 0 or 1, for other values it is not equivalent to the previous one. If we take, for example, \(A=0.5\) and \(B=0.75\), then according to our formulas
$$A \Rightarrow B = 0.75$$
$$A \Rightarrow’ B = 0.5$$
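A few lines of Python (an illustration of mine) make the comparison concrete: the two definitions agree on the classical values 0 and 1 but diverge for intermediate ones.

```python
def f_not(a):
    return 1 - a

def f_and(a, b):
    return min(a, b)

def f_or(a, b):
    return max(a, b)

def implies(a, b):           # (not A) or B
    return f_or(f_not(a), b)

def implies_prime(a, b):     # (not A) or (A and B)
    return f_or(f_not(a), f_and(a, b))

print(implies(0.5, 0.75))        # 0.75
print(implies_prime(0.5, 0.75))  # 0.5

# On the classical values 0 and 1 both definitions coincide.
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert implies(a, b) == implies_prime(a, b)
```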

Which formula is correct? It is impossible to say. We must admit that the choice to define implication as \((\neg A)\lor B\) is quite arbitrary. It works for some logics, but doesn’t work for others and sometimes several different definitions are possible.

Explanation 5: Derivation rules

In the previous explanation, we mentioned that logic consists of two parts: syntax and derivations. We explained how an implication can be treated just as a shorthand for a logical formula. Here we’ll show that it also matches our intuitive idea of derivation.

Let’s introduce some notation. If formula \(B\) can be derived from formulas \(A_1,\ldots,A_n\), we denote it as \(A_1,\ldots,A_n\vdash B\). For our purposes we will need only four derivation rules:

  1. \(A\land B \vdash A\),
  2. \(A \land B \vdash B\),
  3. \(A \vdash A\lor B \), for arbitrary \(B\),
  4. \(A\lor B, \neg A \vdash B \).

You can easily check that these rules hold in classical logic, but in fact we expect them to be valid for any kind of logic, not only the classical one.
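For the classical case this is a one-minute check; here is a Python sketch of mine (assuming classical truth values) which verifies that none of the four rules ever leads from true premises to a false conclusion.

```python
from itertools import product

# Each rule is a pair (premises, conclusion) over classical truth values.
rules = [
    (lambda a, b: [a and b],        lambda a, b: a),       # 1. A and B |- A
    (lambda a, b: [a and b],        lambda a, b: b),       # 2. A and B |- B
    (lambda a, b: [a],              lambda a, b: a or b),  # 3. A |- A or B
    (lambda a, b: [a or b, not a],  lambda a, b: b),       # 4. A or B, not A |- B
]

for i, (premises, conclusion) in enumerate(rules, start=1):
    sound = all(
        conclusion(a, b)
        for a, b in product([False, True], repeat=2)
        if all(premises(a, b))
    )
    print(f"rule {i} is classically sound: {sound}")
```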

We want to define implication in such a way that \(A \Rightarrow B\) holds if and only if we can derive \(B\) from the assumption \(A\). We can formulate it as a rule:

5. If \(A \vdash B\), then \(\vdash A \Rightarrow B\) (no assumption to the left of \(\vdash\) here, in other words if we need assumption \(A\) to derive \(B\), then we can derive \(A \Rightarrow B\) without assumption \(A\)).

Let \(\bot\) denote a statement which cannot be true:

6. \(\bot = A \land (\neg A)\) for arbitrary \(A\).

From here we can derive an interesting property of implication. Let’s take some arbitrary statement \(B\). Then

  1. Let’s assume \(\bot\).
  2. From assumption 1 we derive \(A\) by applying rule 1 to the definition of \(\bot\).
  3. From assumption 1 we derive \(\neg A\) by applying rule 2 to the definition of \(\bot\).
  4. From statement 2 we derive \(A\lor B\) by rule 3.
  5. Applying rule 4 to statements 3 and 4, we conclude \(B\).

So we managed to derive \(B\) from \(\bot\), which means that \(\bot \Rightarrow B\) by rule 5. This is exactly what we were looking for: any statement follows from an absurdity, but this time we showed it based only on very generic derivation rules and not on any particular kind of logic. This rule holds even if we don’t assume that every statement is either true or false.

Explanation 6: Models

Let’s say that the universe is entirely described by a number of logical variables. For example, let’s say that just three variables \(A, B, C\) are enough. Each particular assignment of values to these logical variables gives us a particular instantiation of the universe. We call such instantiations models. In the example with three variables we will have 8 different models of the universe.

Now let’s say that we know some facts about the universe, which we are going to call axioms. For our three-variable world consider the following two axioms:

  1. \(A\lor B\)
  2. \(B = C\)

These axioms put restrictions on what the universe can be, so we have to throw away some of the models. In fact, after introducing these two axioms we have just three possible models remaining:

  1. \(A = 1, B = 0, C = 0\)
  2. \(A = 0, B = 1, C = 1\)
  3. \(A = 1, B = 1, C = 1\)

All other models are invalid. For instance, model
$$A = 0, B = 0, C = 0$$
is invalid because it doesn’t match the first axiom, and model
$$A = 1, B = 0, C = 1$$
invalidates the second axiom.

Now we can define implication:

Definition. \(A\Rightarrow B\), if and only if in any model where \(A\) is true, \(B\) is also true.

If we add axiom \(A\Rightarrow B\) to the set of axioms which we had before, the list of valid models narrows down to just two of them:

  1. \(A = 0, B = 1, C = 1\)
  2. \(A = 1, B = 1, C = 1\)

You can see from this example that with the axiom \(A\Rightarrow B\), if \(A\) is true in some model, then \(B\) must also be true in it. At the same time, if \(A\) is false, the constraint which this axiom imposes does not apply, although we still claim that the axiom holds. Since \(A\Rightarrow B\) is an axiom, it must be true in all models, including the models where the constraint does not apply. Thus, we conclude that \(A\Rightarrow B\) is true when \(A\) is false.
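This whole explanation is easy to replay in code. The following Python sketch (my own illustration) enumerates all eight models, filters them by the two axioms, and then shows which models survive after adding \(A\Rightarrow B\).

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# Every assignment of truth values to A, B, C is a candidate model.
models = list(product([0, 1], repeat=3))

axioms = [
    lambda a, b, c: a or b,   # axiom 1: A or B
    lambda a, b, c: b == c,   # axiom 2: B = C
]
valid = [m for m in models if all(ax(*m) for ax in axioms)]
print(valid)   # [(0, 1, 1), (1, 0, 0), (1, 1, 1)]

# Adding A => B as an axiom throws away the model where A holds but B doesn't.
valid = [m for m in valid if implies(m[0], m[1])]
print(valid)   # [(0, 1, 1), (1, 1, 1)]
```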

Conclusion

There are many other ways of looking at implication. Some of them are purely philosophical, others involve more advanced mathematics. Hopefully, the approaches which I demonstrated shed some light on this topic and the reader now understands it a little better, although, admittedly, many of the explanations are just manipulations with formulas which don’t give much intuition. Implication looks simple when you don’t dig deep, but when you try to formalize it, it becomes very counterintuitive, and I doubt that it is possible to make it look any more natural.

God’s existence

You may have seen headlines like “Mathematicians proved the existence of God”; I’ve seen them a couple of times. In this post we will look at two such proofs: a very famous one by Godel (the first version of which appears in Godel’s papers from 1941) and a slightly less famous one by Hartshorne (published in 1962).

We’ll use classical higher-order modal logic and first of all I have to explain what it means.

“Classical” in the context of logic means that any statement is either true or false. If \(S\) is an arbitrary statement, this means that \(S\lor\neg S\) is true, and this is an axiom. Here \(\lor\) means “or” and \(\neg\) means “not”. We refer to this as the law of excluded middle.

We will also make use of symbol \(\land\) for “and”, \(\rightarrow\) for implication and \(\leftrightarrow\) for logical equivalence.

Two things about implication may be non-intuitive. First of all, according to the laws of implication, absolutely any statement follows from a false statement. For example, the following seemingly absurd statement is considered to be true: “\(1+2=14\) follows from the fact that \(2+3=1\)”. It may be uncomfortable to accept this kind of reasoning, but this really is a true implication. I’ll explain this in more detail in one of the following blog posts, but for now I can just say that the ability to derive arbitrary absurdities from false statements doesn’t really affect anything: since the assumption of such an implication is never true, the rule is never applied. It is still important to know this for understanding what follows. We’ll refer to this as paradoxical implication.

Let’s say that \(A\) and \(B\) are some statements. The statement \(A\rightarrow B\) is equivalent to \(\neg B\rightarrow \neg A\). You can either learn this in any basic logic text or just think about it for a bit and convince yourself that the equivalence holds. We’ll refer to this as the implication contrapositive rule.

If statement \(S\) depends on some object \(x\) and we know that an object for which \(S\) holds exists, we write \(\exists x. S(x)\). If statement \(S\) is true for an arbitrary object, we write \(\forall x. S(x)\). The second classical assumption is that \(\neg\exists x.S(x)\leftrightarrow\forall x. \neg S(x)\). It is just an axiom, and you can convince yourself that it makes sense by exploring a couple of examples.

In the statement \(S(x)\), \(S\) is called a predicate, while \(x\) is an object (in the language of logic it is called a “term”, but we will stick to more non-logician-friendly terminology). First-order logic (the one which is studied in universities) allows us to apply the quantifiers \(\exists\) and \(\forall\) to objects, but not to predicates themselves. We are also not allowed to apply predicates to other predicates. Higher-order logic relaxes this requirement, so we may write statements like \(\forall S .\exists x. S(x)\).

Now the interesting part: modal logic. As I said above, we assume that any statement is either true or false. Modal logic goes further and says that some statements are necessarily true or false, and some statements are possibly true or false. We denote these with the symbols \(\Xi\) and \(\diamond\) respectively:

  1. Statement is “just” true: \(S\).
  2. Statement is possibly true: \(\diamond S\).
  3. Statement is necessarily true: \(\Xi S\).

Unfortunately, I am not using standard notation here. “Necessary truth” is traditionally denoted by a hollow square, but this symbol is unavailable in the tool which I use for writing formulas. I hope I’ll be able to fix it in the future and come back to this article to change it.

One way to think about these three modes of truthfulness is to assume that we have many possible worlds. They can be imaginary worlds or some physical alternative realities. We live in one of these worlds. Given this assumption we can interpret these formulas this way:

  1. \(S\): the statement is true in our world, but not necessarily in others.
  2. \(\diamond S\): the statement may be true in some of the worlds, but not necessarily in our world. What is important is that we can imagine the world where \(S\) is true.
  3. \(\Xi S\): the statement is true in all possible worlds we can imagine.

Modal logic has a number of axioms:

M1. \(\neg \Xi S \leftrightarrow \diamond \neg S\). If it is not true that something is true in each world, then there may exist a world where it is false. We can treat this first axiom as a definition of \(\diamond\) through \(\Xi\).

M2. \(\Xi S \rightarrow S\). If something is necessarily true, then it is true in our world as well.

M3. \((\Xi (S \rightarrow T)) \rightarrow ((\Xi S) \rightarrow (\Xi T))\). If in all worlds \(T\) follows from \(S\), then if in all worlds \(S\) is true we also know that in all worlds \(T\) is true as well.

M4. If \(S\) can be proven mathematically (which includes the case when \(S\) is an axiom), then \(\Xi S\).

M5. \(\diamond \Xi S \rightarrow \Xi S\). If \(\diamond\Xi S\) is true, then it means that there exists some world where it is known that \(\Xi S\). If we could transfer ourselves to that world, we would know that \(S\) is necessarily true. So we conclude that \(\Xi S\).

Applying the implication contrapositive rule to M5 and rewriting the negations with M1 gives us the formula \(\diamond \neg S \rightarrow \Xi \diamond \neg S\). Denoting \(\neg S\) by \(T\), we get an equivalent statement of the same axiom:

M5′. \(\diamond T \rightarrow \Xi \diamond T\)

Modal logic often uses other axioms as well, but we won’t need them.

Now we are ready to prove the existence of God following Hartshorne. I’ll loosely follow the paper by Small here.

Let’s denote statement “God exists” by \(g\). Hartshorne assumed two axioms:

HA1. \(\diamond g\). God may exist (he assumed that even atheists can imagine the existence of God in some alternative reality).

HA2. \(\diamond\neg g \rightarrow \neg g\). It is quite a pessimistic assumption: from the fact that it is possible that God doesn’t exist in some other reality, we immediately conclude that it doesn’t exist in our own reality. (In fact, I am tricking you right now and abusing the initial axiom, but I’ll explain it a little bit later).

Now we are ready for the proof. Below, if I don’t mention explicitly where a formula is derived from, it means that we use the previous formula.

H1. First let’s substitute \(\neg g\) into axiom M5′: \(\diamond\neg g \rightarrow \Xi\diamond \neg g\). Note that we are not making any assumptions about God here, only using tautologies of modal logic.

H2. Substitute \(\Xi g\) into the law of excluded middle and use M1: \((\Xi g)\lor(\diamond \neg g)\).

H3. Applying the implication H1 to H2, we get \((\Xi g) \lor (\Xi\diamond\neg g)\).

H4. Since we assume HA2 as an axiom, we can apply rule M4 to get \(\Xi(\diamond\neg g \rightarrow \neg g)\)

H5. By M3: \((\Xi\diamond\neg g) \rightarrow (\Xi\neg g)\)

H6. H5 and H3 together give us \((\Xi g) \lor (\Xi\neg g)\). This eliminates the possibility of accidental existence: either God exists in all possible worlds, or it cannot exist in any of them.

H7. Axiom HA1 is equivalent to \(\neg\neg\diamond g\) (two negations cancel each other), to which we can apply M1 to get \(\neg\Xi\neg g\).

H8. From H7 we know that the term \(\Xi\neg g\) in H6 cannot be true. So the only way for H6 to be true (and we know it is true, because we proved it) is that \(\Xi g\) is true. This shows that God necessarily exists, which completes the proof.
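If you find the symbolic derivation hard to trust, the conclusion can also be brute-forced in a toy possible-worlds reading (this is my own sanity check, not part of Hartshorne’s argument): treat \(\Xi\) as “true in every world”, \(\diamond\) as “true in some world”, read the axioms as holding in every world, and enumerate all small assignments of \(g\) to worlds.

```python
from itertools import product

# g_values[i] says whether g holds in world i.
# Xi S = "S holds in every world", diamond S = "S holds in some world".
for n_worlds in range(1, 5):
    for g_values in product([False, True], repeat=n_worlds):
        possibly_g = any(g_values)                       # HA1: diamond g
        possibly_not_g = any(not g for g in g_values)
        # HA2 read as an axiom, i.e. true in every world:
        # (diamond not g) -> (not g holds in this world)
        ha2_everywhere = all((not possibly_not_g) or (not g) for g in g_values)
        if possibly_g and ha2_everywhere:
            # the conclusion Xi g: g must hold in every world
            assert all(g_values)
print("whenever HA1 and HA2 hold, g holds in every world")
```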

Does this really prove that God exists? It depends on whether you agree with Hartshorne’s axioms and the axioms of modal logic. In fact, this proof doesn’t even say what “God” is. It can be applied to any statement which satisfies the same set of axioms. For example, substitute “God doesn’t exist” for \(g\), and the same reasoning would prove that God doesn’t exist.

Also, it is interesting to note that axiom HA2 is equivalent to the statement \(g \rightarrow \Xi g\) (using the implication contrapositive rule and M1), which reads as “if God exists in our world, it must exist in every possible world”. This statement looks much stronger than the original axiom and seems less likely to be true, even though the two formulations are totally equivalent! Actually, Hartshorne used exactly this “less pessimistic” axiom, but I chose to change it to make it look less shaky, at least initially.

Can this proof be considered mathematical? Once again, it depends on what you call “mathematics”. It definitely doesn’t use any advanced mathematical concepts and is quite trivial. The only bit of mathematics here is the usage of formulas and a strict agreement on the axioms and on which logical derivations we consider acceptable. This is what mathematics actually is: an agreement on the rules we use and a convenient language which helps our reasoning. Exactly the same argument could be repeated without any formulas, but then it would be much more difficult to follow. In fact, a version of this argument was originally presented informally by Anselm in the 11th century, and only in 1962 was it made formal by Hartshorne (although, honestly, I find the relation to be quite loose).

Does this proof have any value? Of course! It is at least a nice exercise in modal logic and an interesting model to think about. As was discussed in the post on models, even if some model is not correct, it can still give food for thought. In this particular case, the formal model made it obvious how shaky Anselm’s argument was (which was clear anyway, but the model makes the proof and its weaknesses accessible to non-philosophers).

Now, after a short warm-up with modal logic, we are ready to discuss Godel’s proof. In comparison to Hartshorne’s proof, this one also tries to explain what God actually is. Informally speaking, Godel assumes that each imaginable object has a set of properties and is uniquely described by that set (here I don’t assume any definition of “set” in the set-theoretic sense). This reflects what physicists do when they introduce some object: for example, in order to explain what an “electron” is, you have to specify its properties: its charge, mass, diameter, etc.

Each property is binary: an object either has a property or not; it cannot have a property to some degree. If object \(x\) has property \(\phi\), we denote it as \(\phi(x)\). This is not limiting at all, since properties can be arbitrarily precise (like “the object is 1.23 nanometers in diameter”); it is just a consequence of the syntax we use. Also, some properties are “positive” and some are not. If \(\phi\) is positive, we denote it as \(P(\phi)\). What “positive” means is not specified, but God is defined as an object that has all positive properties.

It is possible that for some property \(\phi\) there is no object which has this property (consider the property “this object does not exist”). If this is not the case and there is some object \(x\) such that \(\phi(x)\) holds, we call \(\phi\) instantiable and \(x\) an instance of property \(\phi\).

In the following, I’ll follow the notation and statements order of Wikipedia.

Axiom 1. \((P(\phi)\land \Xi \forall x . \phi(x)\rightarrow \psi(x)) \rightarrow P(\psi)\)

The formula \(\forall x . \phi(x)\rightarrow \psi(x)\) means that property \(\psi\) follows from property \(\phi\): any object which has property \(\phi\) also has property \(\psi\). We can also call \(\psi\) a subproperty of \(\phi\). As an example, consider the properties “bigger than 1 meter” and “bigger than 1 millimeter”: the second clearly follows from the first. What Axiom 1 says is that only positive properties can follow from a positive property; it cannot be the case that some positive property implies a negative one.

Axiom 2. \(P(\neg\phi)\leftrightarrow\neg P(\phi)\)

For any property \(\phi\) we have the opposite property, denoted by \(\neg\phi\), which reads simply as the absence of \(\phi\). Axiom 2 says that if the presence of some property is a positive thing, then its absence is not, and vice versa. It means that we cannot have neutral properties (or we just don’t consider such properties).

Theorem 1. For any positive property there may exist an object which has this property (in other words, any positive property is possibly instantiable): \(P(\phi)\rightarrow \diamond \exists x . \phi(x)\)

Proof: Let’s assume that property \(\phi\) is non-instantiable. Then the statement \(\phi(x)\) is always false and any statement follows from it by paradoxical implication. So we have the statement \(\forall x. \phi(x)\rightarrow \psi(x)\) for an arbitrary property \(\psi\): every property is a subproperty of \(\phi\), together with its opposite. If \(\phi\) were positive, then by Axiom 1 both \(\psi\) and \(\neg\psi\) would be positive, which contradicts Axiom 2. So no positive property can be non-instantiable. QED

Definition 1. \(G(x)\leftrightarrow \forall\phi . (P(\phi)\rightarrow \phi(x))\)

\(G\) is a property which we’ll call “God-like”. This definition says that a property of being God-like implies having all positive properties (and no negative properties, otherwise it would contradict Axiom 2).

Axiom 3. \(P(G)\) (being God-like is a positive property).

Theorem 2. \(\diamond \exists x . G(x)\) (it is possible that God-like object exists at least in some world).

Proof: This theorem is trivial since God-likeness is a positive property and we know that all positive properties are instantiable by Theorem 1. QED

Definition 2. \(\phi \;\mathrm{ess}\; x \leftrightarrow (\phi(x)\land (\forall \psi . \psi(x)\rightarrow \Xi \forall y . \phi(y)\rightarrow\psi(y)))\)

This defines the essence of an object. Informally speaking, the essence of an object is a property which determines the object uniquely up to a world. Let’s say that \(x\) and \(y\) are the same object but in different worlds. We cannot write \(x=y\), because these two objects are still not exactly the same: they live in different worlds. But if we had a way to bring them into the same world, they would become indistinguishable and effectively the same physical entity. In order to capture this idea, Godel introduced “essences”: properties which describe objects uniquely within a world.

Given this informal reasoning, you now can understand what the formal definition means. It just says that if some property \(\phi\) is the essence of an object \(x\), then any other object \(y\) which has the same essence \(\phi\), would have exactly the same set of properties in any world imaginable. As I said before, the idea here is that properties describe objects uniquely, but belonging to a world is not a property which can be used to identify objects across the worlds. Definition of essence does the trick.

Axiom 4. \(P(\phi)\rightarrow \Xi P(\phi)\) (if some property is positive, it is positive in all imaginable worlds).

Theorem 3. \(G(x) \rightarrow G\;\mathrm{ess}\; x\) (if \(x\) is God-like, then \(G\) is its essence)

Proof: We know by definition that being God-like determines all of the properties of an object uniquely (it is all positive properties). Since we know that positivity of a property is the same in all of the worlds (by Axiom 4), it means that all of the God-like objects have exactly the same properties in all of the worlds. This is what essence is all about (although it might require some reflection). QED

Definition 3. \(E(x) \leftrightarrow \forall \phi .( \phi \;\mathrm{ess}\; x \rightarrow \Xi \exists y . \phi(y))\)

Property \(E\) is called necessary existence. If object \(x\) necessarily exists, it means that in any imaginable world there exists some object with the same essence. It is instructive to compare this with the \(\Xi\) operator: while \(\Xi\) says that some logical statement is true in every possible world, \(E\) says that in every possible world there exists some object with exactly the essence of a given object.

Axiom 5. \(P(E)\)

This is another controversial axiom, which says that necessary existence is a positive property. One way to think about it is that positivity of a property is not about being “better”, but about being “greater” in the sense of being bigger, smarter, more powerful, containing more energy, being faster, etc. These examples are arbitrary; nothing actually says that being “greater” cannot mean “smaller in size”, but if we continue this intuitive idea, then we can think that the more worlds an object can exist in, the greater it is. This is what this axiom is about.

Theorem 4. God exists (more precisely, in any world exists a God-like object): \(\Xi\exists x. G(x)\)

Proof: Since a God-like object has all positive properties, it also has the property of necessary existence, which gives us formula $$\forall x.G(x)\rightarrow E(x)$$

If we combine this with theorem 2 we get $$\diamond \exists x. G(x)\land E(x)$$

By Theorem 3 and the definition of necessary existence $$\diamond \exists x. G(x)\land \Xi\exists y.G(y)$$

If some object has some two properties, it definitely has one of them, so we get $$\diamond \exists x. \Xi\exists y.G(y)$$

or, since \(x\) is not used in this formula anymore, we drop it and get $$\diamond\Xi\exists y.G(y)$$

Applying axiom of modal logic M5 we get the statement of the theorem. QED

Did this prove that God really exists? Not really; at least not in the popular religious sense. Next time you see in a newspaper that “scientists proved that God exists”, you will know what the scientists really did and how it is related to the God which you or your neighbor worship. Most likely, it is not related at all, and at best it is no more than a play with language by formal rules.

Once again, even though this proof doesn’t tell us much about the reality in which we live, and each of the axioms can be criticized, I still find it quite elegant. The play with essences is clever and may even be used in completely unrelated modal logic derivations.

In order to get a better feeling for this proof and for modal logic, I suggest you solve the following exercise:

Exercise. Let’s assume that the definition of essence is formulated as

$$\phi \;\mathrm{ess}\; x \leftrightarrow (\forall \psi . \psi(x)\rightarrow \Xi \forall y . \phi(y)\rightarrow\psi(y))$$

and all of the axioms are still the same. Prove that in this case, Godel’s axioms are self-contradictory. (Hint: start by proving that a non-instantiable property is the essence for any object and get a contradiction from this).

The hardest logic puzzle ever

Imagine you have a conversation with three gods. One god always tells the truth. Another god always lies. The third god gives random answers. These gods can answer any of your questions that admits a yes or no response. The problem is that you don’t know which god is lying, which tells the truth and which answers randomly. Moreover, the gods answer in their own language, in which “yes” and “no” sound like “da” and “ja”, but you don’t know which word means which. By asking as few questions as possible, you have to find out which god is random, which always tells the truth and which always lies.

Take your time to try to solve this problem. Then read on.

First of all, the problem says that we have to solve it using as few questions as possible. It turns out that we can estimate the number of required questions even without knowing the questions themselves. We have three gods and three roles for them. The number of different combinations of roles is just the number of permutations of these roles, which is 3!=6. At the same time, if you ask n yes/no questions, you can distinguish between at most 2^n different situations. We need 2^n to be at least 6, which means we need at least three questions to determine the roles of the gods.
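This counting argument fits in a couple of lines of Python (a trivial illustration of mine):

```python
import math

permutations = math.factorial(3)                   # 6 ways to assign the three roles
questions = math.ceil(math.log2(permutations))     # smallest n with 2**n >= 6
print(permutations, questions)                     # 6 3
```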

So, what should these three questions be? This is a hard problem and we will solve it in several steps:

  1. We will first solve the problem under the assumption that there is no random god and that the gods answer in English.
  2. After this, we will assume that they answer “da” and “ja” instead of “yes” and “no”.
  3. Eventually, we will solve the initial problem with a random god.

Each of these steps is much easier to solve and I advise you to stop here and try to think about the problem again.

So, we have two gods. One always lies and the other always tells the truth. Let’s say that S is some statement and we want to learn whether it is true. It turns out that we can do it by asking only one question to either god. To understand how to formulate the question, consider the following truth table first:

T S desired answer
0 0 0
0 1 1
1 0 0
1 1 1

Here T denotes the fact that we are talking to the god who tells the truth, and S, as above, is the actual truth value of the statement which interests us; 0 means “false” and 1 means “true”. Example: if we are talking to the truth-telling god and the statement S is false, then we are in the third line of the table and the desired answer is “no” (denoted in the table as 0).

The problem is that in the first two rows, instead of the desired answer, we will get an inverted answer, since we are talking to the lying god. But if we ask our question not about S, but about some other statement Q1, which has inverted truth values in the first two rows of the table, we will get the desired result:

T S Q1 answer we get
0 0 1 0
0 1 0 1
1 0 0 0
1 1 1 1

As you can see, in this case, if we ask a question about the truthfulness of Q1, we get the correct answer about the truthfulness of S. Q1 simply says that the statements T and S are logically equivalent, so our question to a god may sound like this: “Could you tell me, please, whether either both statements ‘you always tell the truth’ and ‘S’ are true, or both of them are false?” You can do the calculation yourself to make sure that the response to this question always matches the truth value of S, no matter which god you ask.
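A quick brute-force over the four cases confirms the trick (a Python sketch of mine; T and S are booleans, and the liar simply negates his answer):

```python
from itertools import product

for truthful_god, s in product([False, True], repeat=2):
    q1 = (truthful_god == s)                  # "are 'you tell the truth' and S equivalent?"
    answer = q1 if truthful_god else not q1   # the liar inverts his answer
    assert answer == s                        # the reply always equals the value of S
print("asking about Q1 always reveals S")
```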

The second problem requires just a slight extension of this approach and you may want to try to do it yourself before moving on.

So, now instead of “yes” and “no” we have responses “ja” and “da” with unknown meaning. We can make an assumption that “ja” means “yes” (I call this assumption A below) and extend our truth table the same way as we did before:

T A S Q1 Q2 answer we get
0 0 0 1 0 da
0 0 1 0 1 ja
0 1 0 1 1 da
0 1 1 0 0 ja
1 0 0 0 1 da
1 0 1 1 0 ja
1 1 0 0 0 da
1 1 1 1 1 ja

Here Q1 has the same meaning as before and Q2 inverts the value of Q1 in case “ja” means “no”. This may be a little difficult to digest at first, but the idea is quite simple: we assume that “ja” means “yes” and we want to hear the response “ja” if S is true. But if “ja” actually means “no”, we want a god to invert his response so that he still answers “ja” in case of truth, and this is the purpose of Q2.

How do we ask about the truthfulness of Q2? We can explain to the god how we built the table above and then ask for the value of Q2. The god knows the values of T, A and S, so he knows which entry we are interested in. Then he either tells us the truth or lies, using the rules of his own language. You may evaluate the different possibilities yourself and see that he will say “ja” if and only if S is true.
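Again, this is easy to check mechanically. The Python sketch below (my own encoding of the table) runs over all eight combinations of T, A and S and confirms that the reply is “ja” exactly when S is true.

```python
from itertools import product

for truthful_god, ja_means_yes, s in product([False, True], repeat=3):
    q1 = (truthful_god == s)                    # same Q1 as before
    q2 = q1 if ja_means_yes else not q1         # Q2 flips Q1 when "ja" means "no"
    asserted = q2 if truthful_god else not q2   # what the god claims about Q2
    yes_word = "ja" if ja_means_yes else "da"
    reply = yes_word if asserted else ("da" if yes_word == "ja" else "ja")
    assert (reply == "ja") == s                 # we hear "ja" if and only if S is true
print("the god answers 'ja' exactly when S is true")
```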

Another approach is to ask the same question in a more direct fashion by using the Conjunctive Normal Form of logical statements, but I’ll skip this since it is fairly simple and you can understand how to do it from the Wikipedia article.

The question can be formulated much more simply. Let’s once again assume that the gods answer “yes” and “no”. How would a god respond to the question “What would you tell me if I asked you about the truthfulness of S?” The god who tells the truth would give us the correct value of S; this is simple. But the god who lies will be tricked. Let’s assume that S is true. If you ask a question about S directly, the lying god would answer “no”. But we are not asking about S, we are asking about the answer he would give, so the god has to lie again and say “yes”, which is the actual value of S. The same logic applies if S is false.
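The effect of this embedded question is visible in a couple of lines of Python (my own illustration): applying a god’s answering function twice cancels the lie.

```python
def truth_teller(statement):
    return statement

def liar(statement):
    return not statement

for god in (truth_teller, liar):
    for s in (False, True):
        # "What would you tell me if I asked you whether S is true?"
        nested_answer = god(god(s))
        assert nested_answer == s   # both gods end up reporting the real value of S
print("the embedded question neutralizes the liar")
```

The same check is relevant to the coin-flipping random god discussed below: for any single question he behaves either as the truth-teller or as the liar, and both branches give the same reply.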

It is easy to extend this approach to the case when the gods answer with “ja” and “da”. The question is the following: “Would you say ‘ja’ if I asked you about the truthfulness of S?” You can work out the logic behind this question yourself.

There are two reasons why I presented two possible solutions for this problem. First of all, I was asked this question at a job interview and I was able to come up with a solution with truth tables myself, so I am particularly proud of it (although I didn’t get that job). Another reason is that it is interesting that the first solution was achieved using quite trivial formal logic, while the second one is based on a language play, which cannot be formalized by mathematical methods. This means that, for example, the first solution could be potentially found by Artificial Intelligence, but the second, short and elegant, couldn’t even be represented using the methods of AI available today. This is a fundamental limitation of all of the formal systems which mathematics and AI use in comparison to a “real” logic used by humans.

After solving the second subproblem, we know how to ask for any information from truth-telling or lying god using only a single question. We just have to eliminate the god which gives us totally random answers. I’ll present the questions as though we could ask them directly, but you have to keep in mind that these questions should be “translated” to a form which allows us to get a correct response. The next paragraph contains a solution to the entire problem and you know what to do.

Let’s say that we numbered the gods 1, 2, 3. We ask the first god: “Is the second god Random?” If he answers “yes”, it may be the truth, or it may just mean that the first god gave us a random response. In both cases we can conclude that the third god is not random, and we can proceed to ask our questions to him. If the response is “no”, then we know for sure that the second god is not random. The rest is trivial.

In the solution presented above we didn’t define what “random” actually means, and it can mean different things. This puzzle was originally published by George Boolos, and the definition he used assumed that the random god flips a coin before each answer, and the result of the flip determines whether he tells the truth or lies.

At first sight it may look like a valid definition of randomness, since to the question “Are all roses red?” we may receive the responses “yes” or “no” with equal probability. At the same time, to the question “What would you tell me if I asked you whether all roses are red?” he would always give the response “no”. He flips the coin to decide whether he tells the truth, but in both cases he gets caught in the trap of our indirect question: “yes” is impossible, and he doesn’t look random anymore. The solution presented above works for a truly random god as well, but the fact that the coin-flipping definition of randomness doesn’t make him truly random is remarkable.

The last bit. We assumed that even if we don’t know the meaning of the words “ja” and “da”, we at least know what the possible responses are. This is a lot of information already! What if we didn’t know even a single word of the gods’ language? It is easy to solve the problem in four questions: first ask an arbitrary question just to learn at least one word of the gods’ vocabulary, then apply the three-question strategy derived above. This straightforward solution works, but is it possible to solve the problem in fewer steps?

If one of the gods is truly random (not just flipping a coin), then a solution in three questions is impossible. It is clear that the last question can only be asked of a non-random god. If you think about it for a bit, this observation is itself quite remarkable: we cannot get any information by asking the last question of the random god, but asking the random god the first question does give us information about who is not random (we used this trick in our initial solution). This is quite difficult to explain without going into the depths of the theory, but the short story is that a randomized answer can only reduce uncertainty somewhat; it cannot give you precise information.

So we have to determine which god is not random using only two questions. The first response is some word which we’ve never heard before, so it cannot give us any information. This means that the second question cannot depend on the response to the first one. If you add true randomness to the mix, it turns out that even though we can get some information from these two responses, we cannot confidently eliminate the random god. The details, unfortunately, are quite complicated and I will skip them.

If the random god is defined by flipping a coin, the solution in three steps is possible. The first two questions are the following:

  1. First ask the first god: “Consider the English transliterations of the words for ‘yes’ and ‘no’ in your language. If I asked you whether you are random, would you answer with the word that comes alphabetically first among these transliterations?”
  2. Then ask the second god: “Would you give me the same response to the same question which I just asked?”

I leave it as a nice exercise to understand how you can make sense out of responses to these two questions and how to complete the puzzle in this case.