Scroll down for video lectures
1. What is a Mathematical Proof?
Mathematical proof is a method of discourse which allows a human being to:
- discover new mathematical knowledge,
- analyze existing mathematical knowledge,
- verify truthfulness of a piece of mathematical knowledge.
Mathematical knowledge is knowledge of abstract principles about our universe. As such, it requires use of symbols to represent entities
that are inherently abstract. For example, the symbol 2 may represent 2 apples or 2 pears. The number 2 is an abstract entity, since it is not confined to any
of these concrete representations.
Mathematics functions at different levels of
abstraction too. For instance, we may write a symbol, such as n, to represent
any number. In one case we could have n = 2, and in another case we could have n = 3.
This is a second layer of abstraction compared to the layer of each specific number,
such as number 2. Symbols representing abstract entities form basic ingredients of mathematical proof. The most complex parts of mathematical proofs deal with manipulations of these symbols, which sometimes may take an extremely long time. To optimize a proof, it is important to understand its most fundamental components. The aim of these lectures is to provide an exposition of these fundamental components.
Activity. Get Python IDE, if you do not already have one: there are many available, Spyder is recommended (very easy to install). Then, get the file sofia.py, which can be obtained from https://github.com/ZurabJanelidze/sofia. Save sofia.py to the runtime directory of your Python IDE. To test that you have done it correctly, create a separate file named something.py, copy-paste the following code in that file, and run the file with your Python IDE:
import sofia
sofia.help()
2. SOFiA Formalism
We will work in three language layers:
- Intuitive layer. This is ordinary English text such as the one you are reading now.
- Formal layer. Mathematical formulas, written out using rules that cannot be compromised on, belong to this layer. We will use the SOFiA formalism for this (to be explained below).
- Computational layer. This is the language in which we will communicate with the computer to help us reason mathematically. Python language with the package sofia.py, which implements the formal layer, will be used here.
SOFiA formalism is quite simple. A mathematical formula written in this formalism is called an expression. An expression can be any sequence of characters (with a few exceptions, not to cause a conflict with the computational layer -- for the Python implementation that we use, the quotation marks are the only characters that are not permitted, since SOFiA expressions appear there as strings and the quotation marks are reserved for representing strings in Python). The only rule that an expression needs to satisfy is a valid placement of square brackets. For instance, [[X][Y]] is a SOFiA expression, while [[X] is not. There is a simple algorithm to ensure that brackets are placed correctly:
- Read the expression from left to right, and count the number of open and the number of closed brackets.
- At each point in the expression, the number of open brackets should be more than the number of closed brackets.
- Once you complete reading the expression, the number of open brackets should match with the number of closed brackets.
At this point, if not before, you might be wondering what does SOFiA stand for? This term was coined by Brandon Laing and the author of this post. It is an acronym for Synaptic First-Order Assembler, except that Fi and O are rearranged for better readability. The first word synaptic refers to the idea of how complex SOFiA expressions are formed using simpler ones (we will come back to this a bit later). The words First-Order refer to First-Order language in mathematical logic. We will not go into this now, but you can think of this as the most standard type of mathematical language. The word Assembler refers to the fact that in some sense, SOFiA is to mathematical language what the assembler language in programming is to other programming languages.
Activity. You can verify whether something is a valid SOFiA expression or not using the following tool. Run the code below (create a separate file named something.py, copy-paste the following code in that file, and run the file with your Python IDE):
import sofia
P = sofia.prop()
E = "expression"
print(E + ' is a valid SOFiA expression = '+ str(P._valexp(E)))
Activity. Once you feel like you have a good understanding of what a SOFiA expression is, try to challenge yourself by coming up with an example where it may be less obvious whether it is a SOFiA expression or not.
3. Statements and Formulas
There are two types of SOFiA expressions: "statements" and "formulas". A statement is a SOFiA expression that has the form [...][...][...]...[...], where the three dots contained in the open-closed bracket pairs represent SOFiA expressions, while the fourth set of three dots tells us that there can be as many copies of [...] as we want.
Activity. Get a better understanding as to which SOFiA expressions are statements using the following code:
import sofia
P = sofia.prop()
E = "[statement]"
print(E + ' is a valid SOFiA statement = '+ str(P._valsta(E)))
A formula is a SOFiA expression that is not a statement.
Activity. Write your own code to determine whether a string is a formula or not. Then, use this as a tool to get a better understanding of which SOFiA expressions are formulas and which are not.
The intuition behind formulas and statements is as follows:
- A formula is a formalized idea. It can be a valid idea or an invalid one. When we write down a formula, we do not claim validity of the idea we express using it.
- A statement is a validation of one or more ideas: the validation could be achieved by means of an assumption (we agree to believe that the idea is valid), or conclusion (we can deduce validity of the idea based on earlier knowledge). When we write down a statement, we are making an assumption or a claim that the idea we express using it is valid.
Consider, for instance the following two SOFiA expressions:
it is a beautiful day today
[it is a beautiful day today]
The first of these is a formula - it is merely an idea of the day being beautiful today. It is all right to write down this formula even if the day is rainy. The second of these is a statement. Writing it down means that we are stating a fact that it is a beautiful day today. We may draw conclusions from this fact: for instance, we may decide to go for a walk. If it happens to be a rainy day, we will get wet and too bad, we have chosen to believe in something that was flawed.
A statement need not always be a fact. It can also be an object about which we may wish to establish a fact. When writing down such object as a statement, the existence of the object is being claimed, or at least, assumed. For instance, the statement [alien] refers to some alien whose presence is being asserted. In contrast, the formula alien is a consideration of an alien without any claims or assumptions about its existence.
A statement which begins and ends with a bracket is called an atomic statement. General statements always decompose into atomic statements: they are always a sequence of atomic statements written next to each other. For instance, [n > 3] is an atomic statement, while [n > 3][5 > n] is not. The second decomposes into two atomic statements, [n > 3] and [5 > n]. Such statements validate each of its atomic parts. For instance, [n > 3][5 > n] claims (or assumes) that n is greater than 3 and at the same time, n is less than 5. Thus, writing statements next to each other allows one to express in SOFiA what in mathematical logic is called conjunction of statements.
Activity. Come up with your own examples of formulas and statements to analyze the difference between their interpretation as ideas or validated ideas.
4. Ontology of formal expressions
In SOFiA formalism, the term synapsis refers to a process when one or more formulas are turned into a statement and vice versa, one or more statements are turned into a formula. There are two main types of synapsis. The first one, creates a statement out of several formulas.
Activity. Use the following tool to create statements out of lists of formulas.
import sofia
P = sofia.prop()
F = ["Jabulani likes red flowers","Jabulani likes yellow flowers"]
print('Formulas: '+str(F))
print('Statement: '+P._statfromformulas(F))
Discuss the difference between a list of formulas and the corresponding statement in the example above, or other examples of your own.
The second principal type of synapsis is when a formula is created from statements. We will restrict here our consideration to two statements. For example, [A]=[B] is a synapsis of statements [A] and [B]. It expresses the idea that the statements [A] and [B] could be the "same thing". For instance, consider [2+2]=[1+3]. We can think of this as a SOFiA encoding of the idea that 2+2 is the same thing as 1+3. The reason why we cannot write 2+2=1+3 is that without the brackets, 2+2 and 1+3 are formulas. Synapsis cannot create a new formula from given ones. It can only create a formula from statements. Note also that since [2+2]=[1+3] is a formula and not a statement, writing it does not mean that we are claiming (stating) that 2+2 is the same as 1+3. We are only proposing an idea of them being the same. But they are the same, so might as well have stated that they are the same. This can be done by applying another synapsis, to produce a statement out of the formula: [[2+2]=[1+3]]. The following diagram shows the ontology of our final statement as a combination of synapses:
A third type of synapsis creates a statement from a mix of statements and formulas. For instance, synapsis of [x][y], [z], [x]+[y]=[z] results in [[x][y]][[z]][[x]+[y]=[z]].
5. Variables
In mathematics, variables are symbols or combinations of symbols that represent abstract entities within the scope in which the variables appear, but which lose their meaning outside their "scope" (see below for the definition of a scope of a variable in the formalism of SOFiA). In SOFiA, variables are non-empty formulas that do not contain any square brackets.
Activity. Use the following tool to learn how to recognize variables in a given SOFiA expression. The output will give a list of all recognized variables with repetitions if the same variable appears several times in the expression.
import sofia
P = sofia.prop()
E = "[[x1]>x2][x3][Did you know that [[x3] is South African]?]"
print(P._vars(E))
The scope of a variable can be described in terms of synaptic ontology. Consider, for example, the following statement:
[[x][[x] speaks isiZulu][[x] speaks English]][[x] speaks Chinese]
Synaptic ontology for this statement is:
The scope of a variable is the block B in the ontology immediately above where variable appears in a separate block, if the block C immediately above B is a statement, and it is C if C is a formula. All variables within the scope of a given variable are meant to represent the same object. This means that the x that speaks isiZulu and the x that speaks English are both the same x, since they both fall within the scope of the left-most x. However, the x that speaks Chinese, need not be the same x as the one that speaks isiZulu and English. The scopes of variables for the example above are shown below:Build your own SOFiA expression having variables, draw their ontology and mark the scope of each variable.
Activity. Draw ontology for [[z]:[[x]:[[y]:[[x]>[y]+[z]]]][[y]+[z]>[x]]] and mark the scope of each variable.
6. Mathematical Reasoning
The process of mathematical reasoning involves writing down statements sequentially, with the goal of generating new knowledge. The generated knowledge is called a theorem and the steps leading to it constitute a proof. There are very precise rules, called deduction rules, for constructing a proof. This ensures that we can rely on validity of theorems.
We will now demonstrate the construction of a proof by giving a proof of "transitivity of equality". This is a theorem in mathematics that refers to the fact that we must have x = z, whenever we know that x = y and y = z. This theorem may seem to be quite obvious, and it is indeed intuitively obvious, but not everything that feels obvious is always correct, so knowing that we can establish this obvious fact by means of a mathematical proof, is important. In other words, no prior knowledge is required. The proof will use the following three deduction rules:
- Assumption - introducing an assumption in our proof. Note that the validity of the equality x = z depends on what information we are given about x and z. In this case, we are given that x = y and y = z. This will be our first line of proof: to assume that x = y and y = z are indeed valid.
- Substitution - replacing names of objects as permitted by given equalities. Since we are given that x = y and y = z, we could replace y in the equality x = y, with z (since we are given that y = z, so we are given that y and z are the same thing). Carrying out this replacement in the equality x = y, we will get x = z as a logical conclusion of the given information.
- Synapsis - stepping out from an assumption block (although closely related, this is not to be confused with synapsis for forming SOFiA expressions). Once we have assumed that x = y and y = z, and from this concluded that x = z, we are not done yet. We now want to make one more step and say: therefore, we have proved that if x = y and y = z then x = z. The synapsis deduction rule refers to this step.
To formalize the proof in SOFiA, first we need to have an idea how to express the statements that will be involved in the proof using SOFiA expressions. We already mentioned in an earlier section that the two sides of the equality should be statements. So we will work with [x]=[y], say, instead of merely x=y. Next, we want to be able to state that we are given both [x]=[y] and [y]=[z]. We need this in our assumption step. So we need to create a statement out of these two formulas: [[x]=[y]][[y]=[z]]. We want, however, the two y's to be in the same scope, so we will have to extend the previous statement as follows: [y][[x]=[y]][[y]=[z]]. Finally, we want to state what we are proving, namely that if [y][[x]=[y]][[y]=[z]] then [[x]=[z]]. In mathematics, we often use the symbol "=>" for this purpose: [y][[x]=[y]][[y]=[z]] => [[x]=[z]]. The symbol "=>" is called implication, and what we just wrote down would be read as "[y][[x]=[y]][[y]=[z]] implies [[x]=[z]]". To make such a statement in SOFiA, we use the colon ":" in place of "=>". So we will write [y][[x]=[y]][[y]=[z]]:[[x]=[z]]. In order for the two x's and the two z's to be in the same scope, we will expand this to: [x][y][z][[x]=[y]][[y]=[z]]:[[x]=[z]]. But this is only a formula. To turn this into a statement, we enclose the formula in a pair of square brackets: [[x][y][z][[x]=[y]][[y]=[z]]:[[x]=[z]]].
The Python implementation of SOFiA will do most of this by itself. Instead of entering a proof line-by-line, all we need to do is to give a reference to a deduction rule used to generate each next line. These references are actually Python functions coded in sofia.py.
Activity. Run the following program.
import sofia
P = sofia.prop("Equality is Transitive")
P.a("[x][y][z][[x]=[y]][[y]=[z]]")
P.rs(1,1,[1],5,4)
P.s()
You should get the following output for the program above (in fact, the output will consist of all intermediate steps of building the proof as well, where what is shown below will be the final step):
Try to interpret this output and figure out how it relates to the input. Experiment with the input by modifying some of its lines or parameters in the lines, to see what effect it will have on the output. If you struggle with interpretation, using sofia.help() as a line in your program should be helpful.
Activity. Use coding to build a proof of the following statement: [[x][y][z][t][[x]=[y]][[y]=[z]][[z]=[t]]:[[x]=[t]]].
Activity. Analyze the theorem and the proof that the following code outputs. Pay attention to the new deduction rule. What does it do?
import sofia
T=sofia.prop("Reflexivity of Equality")
T.a("[X]")
T.e(1)
T.s()
Then, write a program that can compose a proof for the theorem [[X][Y][[X]=[Y]]:[[Y]=[X]]], calling the resulting theorem "Symmetry of Equality".
7. Constants
If we want to talk about a specific human being, called Sofia, we do not want to write [Sofia], since it will get interpreted as a variable. Instead, we must write [Sofia[]].
Activity. Analyze the theorem and the proof that the following code outputs. Pay attention to the new deduction rule. What does it do?
import sofia
S=sofia.prop("Sofia can prove")
S.a("[[Sofia[]] is human][[X][[X] is human]:[[X] can prove]]")
S.a("[Sofia[]]")
S.d(1,[[2,1]],2)
S.s()
S.s()
Check what happens when you use [Sofia] instead of [Sofia[]].
We can think of a formula as an operator that takes input statements (those used in the synapsis that creates the formula) and produces a thought that combines them. But not all thoughts necessarily decompose into statements (via synapsis, explained in Section 4 above). For instance, Sofia[] is an example of such thought. Similarly, the thought that the world is flat can be formalized as the world is flat[]. Formulas that do not contain any variables in them are called constants. Both Sofia[] and the world is flat[] are examples of constants.
Activity. Consider the following the following formula:
[X][[x]:[[[X] teaches [x]]=[[[x] teaches [x]]:[the world is flat[]]]]]:[the world is flat[]].
Draw the ontology of this formula to analyze its logical meaning. Afterwards, complete the following proof for the statement of this formula.
import sofia
T=sofia.prop("Paradox of an Ideal Teacher")
T.a("[X][[x]:[[[X] teaches [x]]=[[[x] teaches [x]]:[the world is flat[]]]]]")
T.d(1,[[1,1]],2)
T.a("[[X] teaches [X]]")
T.rs(2,3)
...
8. Axioms
When a large body of mathematical theorems share common assumptions, it is convenient to separate the assumptions from the statements of the theorems and state them separately as axioms. We can recall the axioms when we need them in proofs.
Activity. The following program introduces two axioms.
import sofia
A1=sofia.prop("Everyone has a friend")
A1.p("[[x]:[y][[y] is friends with [x]]]")
A2=sofia.prop("Sofia exists")
A2.p("[Sofia[]]")
What do these axioms state?
Activity. Combine the program from the previous activity with the following to see how an axiom can be applied in a proof
S=sofia.prop("Sofia has a friend")
S.c(A1)
S.c(A2)
S.d(1,[[2,1]])
Activity. What does the theorem composed with the following program state?
import sofia
A1=sofia.prop("Everyone has a friend")
A1.p("[[x]:[y][[y] is friends with [x]]]")
A2=sofia.prop("Sofia exists")
A2.p("[Sofia[]]")
S=sofia.prop("Sofia has a friend")
S.c(A1)
S.c(A2)
S.d(1,[[2,1]])
S.d(1,[[3,1]])
S.r([[3,1],[4,1],[4,2],[3,2]])
Note a new deduction rule in the last line. What does it do?
It is possible to recall proved theorems just as it is possible to recall axioms in a proof. This saves time as well: one does not need to reprove the same theorem every time one needs to use it. So mathematics is somewhat similar to building a brick structure: one begins with bricks in the foundation (the axioms), then places on top of them more bricks (theorems), on which one places further bricks (more theorems), and so it. Of course, this pattern applies to any building process, whether it is building a physical structure or building knowledge. What makes mathematics special is the accuracy with which the bricks are placed on top of other bricks, given by the method of mathematical proof.
We have now come close to the end of this post and, hopefully, to the start of your fascination in exploring and composing mathematical proofs, with or without the help of SOFiA. Talking about help, you can actually ask her for a little help while programming: type sofia.help() in your program. Among other things, it will list all deduction rules currently implemented in SOFiA together with examples of their usage. You may want to browse through some further examples and, if you are an expert in Python, acquire the skill of creating axiom builders, which implement axiom schema (systems of infinitely many axioms) in the sofia.py software.
9. Fallacy
A subject in mathematics, called incidence geometry, studies affine subspaces of vector spaces, but only up to the relationship of which affine subspaces is contained in which. Affine subspaces of a three-dimensional space are points, straight lines and planes. The theory develops by introducing axioms that deal only with the information of one affine subspaces being contained in another, and deriving theorems from them. We will illustrate this process with one axiom and one theorem that we will prove using the axiom.
The purpose of what we are going to do is to illustrate the use of a special type of constant, a fallacy in SOFiA. This constant, written as ![], has two uses:
- To come to a contradiction in a proof - this is simply when fallacy is stated (so we have [![]]) as a line in the proof.
- To state that some statement is false... but, what does it mean for a statement to be false? Mathematical definition of something being false is that it implies fallacy! So to state, for instance, that 1+1=3 is false, we will write [[1+1=3]:[![]]]. Of course, a more proper way to write this, so that SOFiA can distinguish between numbers and identify equality, would be: [[[[1[]]+[1[]]]=[3[]]]:[![]]].
[[x][[[[x] is nice]:[![]]]:[![]]]:[[x] is nice]]
as an axiom as well as
[[[[life is good]:[![]]]:[![]]]:[life is good]].
The first of these says that for any value of the variable x, if it is false that it is false that x is nice, then x must be a nice. The second says that if it is false that it is false that life is good, then life must be good. We want this kind of axiom not only for the formula [x] is nice or the formula life is good, but for any other formula as well. When we are given a formula, [x] is nice, for which we want to generate such axiom, we will use the following code (omitting the first line, if it already appears in our module):
import sofia
BL=sofia.bool()
A=BL.n("[[x] is a nice]","[x]")
The first parameter in the function BL.n is the given formula in a stated form, while the second parameter is the context - a string of stated variables that are supposed to be degrees of freedom in the formula. Help for "bool" can be obtained as follows:
import sofia
BL=sofia.bool()
BL.help()
Activity. Run the code above and compare the result with the result of running the following:
import sofia
BL=sofia.bool()
B=BL.n("[[x] is a nice]")
Discuss the logical meanings of the axiom A and the axiom B.
We will now jump in into illustrating all of the above in a single code.
Activity. Analyze the output of the following code.
import sofia
BL=sofia.bool()
I2C=sofia.prop("I2 - plane has three noncolinear points")
I2C.p("[[p][[p]plane]:[a][b][c][[[a]=[b]]:[![]]][[[a]=[c]]:[![]]][[[b]=[c]]:[![]]][[a]point][[b]point][[c]point][[a]in[p]][[b]in[p]][[c]in[p]][[l][[l]line][[a]in[l]][[b]in[l]][[c]in[l]]:[![]]]]")
T1=sofia.prop("There is a point outside any line on a plane")
T1.a("[l][p][[l]line][[p]plane][[l]in[p]]")
T1.c(I2C)
T1.d(2,[[1,2]])
T1.a("[[d][[d]point][[d]in[p]]:[[d]in[l]]]")
T1.d(4,[[3,1]])
T1.d(4,[[3,2]])
T1.d(4,[[3,3]])
T1.d(3,[[1,1]],13)
T1.s()
T1.a("[[d][[d]point][[d]in[p]][[[d]in[l]]:[![]]]:[![]]]")
T1.a("[d'][[d']point][[d']in[p]]")
T1.a("[[[d']in[l]]:[![]]]")
T1.d(10,[[11,1]])
T1.s()
T1.c(BL.n("[[d']in[l]]","[d'][l]"))
T1.d(15,[[11,1],[1,1]])
T1.s()
T1.r([[17,1]],["d"])
T1.d(9)
T1.s()
T1.c(BL.n("[d][[d]point][[d]in[p]][[[d]in[l]]:[![]]]","[l][p]"))
T1.d(21,[[1,1],[1,2]])
T1.s()
T1.r([[23,1]],["l","p","a"])
Explain the logical meaning of the axiom I2C and the theorem T1.
If you struggle to make send of the proof contained in the activity above, the following translation of the proof in the semi-intuitive layer may be useful:
╔ Consider a line l in a plane p (L1)
║Recall the axiom that plane contains three noncolinear points (L2)
║We can then conclude that there are three distinct points a', b', and c' in p such that it is false that the three points lie in the same line (L3)
║╔ Assume now that every point in p is in l (L4)
║║Then a' is in l (L5)
║║b' is in l,(L6)
║║and c' is in l too (L7)
║╚ We can then conclude falsity, since by L3, it is false that the three points lie on the same line (L8)
║We have thus proved that it is false that every point d that is in p is also in l (L9)
║╔ Assume that it is false that there is a point that is in p but not in l (L10)
║║╔ Assume further that d' is a point that lies in p (L11)
║║║╔ Suppose it is false that d' is in l (L12)
║║║╚ We can conclude falsity by L10 (L13)
║║║ Therefore, it is false that it is false that it is false that d' lies in l (L14)
║║║ Recall the Boolean double negation axiom for the statement "d' lies in l", with d' and l being degrees of freedom (15)
║║╚ Applying the axiom above to L14 we get that d' lies in l (L16)
║║We have thus proved that every point d' in p is also in l (L17)
║║Restate removing the prime above d (L18)
║╚ Then by L9 we can conclude falsity (L19)
║So the assumption in L10 implies falsity - it is false that it is false that there is a point that is in p but not in l (L20)
║Recall again the Boolean double negation axiom, but this time for the statement that there is a point that is in p but not in l (L21)
╚ L21 to L20 we get that there is a point that is in p that is not in l (L22)
We have thus proved that for any line l in a plane p, there is a point d' that lies in p, but not in l (L23)
Restate this with d in the place of d' (just for convenience) (L24)
We can make this more intuitive by combining some steps into a single line:
╔ Consider a line l in a plane p (L1)
║By the axiom that every plane contains three noncolinear points, there are three distinct points a', b', and c' in p which never lie on the same line (L2-3)
║╔ Assume that every point in p lies in l (L4)
║╚ Then a', b', and c' all lie in l, which is not possible - a contradiction (L5-8)
║This proves that not every point of p lies in l (L9)
║╔ We will now derive from this that there must exist a point in p that does not lie in l. Assumption this is not true (L10)
║║╔ Suppose d' is a point that lies in p (L11)
║║║If d' does not lie in l, we get a contradiction (L12-13)
║║╚Therefore, d' lies in l (L14-16).
║║This proves that every point that lies in p also lies in l (L17)
║╚ A contradiction (L18-19)
║ So our assumption that there is a point in p that does not lie in l being false is actually false (L20)
╚ So there does indeed exist a point in p that does not lie in l (L21-22)
We have thus proved that for any line l in a plane p, there is a point that lies in p, but not in l (L23-24)
Aknowledgement
I am extremely grateful to Louise Beyers for her useful comments, corrections and suggestions in the process of creating this post. I am also grateful to Roy Ferguson and Gregor Feierabend for their comments and corrections. This post was creating as the framework for the lecture series by the same name given at the The 12th CHPC Introductory Programming School and The 4th NITheCS Summer School on the Foundations of Theoretical and Computational Science.
Lecture 1: Sections 1-4
Lecture 2: Sections 5, 6
Lecture 3: Sections 7
0 Comments:
Post a Comment