How to write Proofscape modules
If you're new to Proofscape, this tutorial is your starting point for learning how to contribute to the library. The library is built up out of Proofscape modules, which are files representing results from the mathematical literature, as well as expansions thereon. This tutorial teaches you the syntax of those modules, plus a bit about the library hierarchy.
But before beginning it's best to study at least one example. Every module in the library has a discussion page where you can see the contents of the module alongside the graphical display of those contents, and this is the best way to get a sense for how modules work.
As you read this guide, we encourage you to try things out for yourself, since that is the best way to learn. Therefore you might want to skip ahead and read about the module editor now, and then open up the sandbox in another tab, so that you can try things out as you read about them.
Everything in Proofscape, be it theorem, proof, or expansion, is called a deduction, and the purpose of a Proofscape module is to define deductions.
It all starts with a top-level deduction, which represents a result from the literature, such as a theorem, proposition, lemma, etc. A top-level deduction looks something like this:
In general every deduction has the form:
preamble { node_definitions meson_script }
and in Fig. 1 you see the preamble deduction Thm by Hilbert, followed by a single node definition defining node C, and a (quite minimal) Meson script. Meson scripts are covered in the Meson Language Tutorial.
The general form of the preamble is:
deduction NAME by AUTHOR [of TARGETS] [called TITLE]
and in Fig. 1 the NAME of the deduction is Thm, the AUTHOR is Hilbert, and we didn't use either of the two optional clauses, of TARGETS or called TITLE. In fact,
Don't worry about the called TITLE clause for now. It's a way to specify the display name of the deduction (what appears in the TrailMap), but display names are almost always generated automatically, and this clause is seldom needed.
Apart from top-level deductions, all other deductions have targets, which are simply nodes belonging to another deduction.
For example, the proof of the theorem defined in Fig. 1 might look like this:
This time the deduction is called Pf, it is again by Hilbert, and it has one target, namely the node called C in the deduction called Thm. That node is referenced from the context of this deduction by the dotted path Thm.C. If a deduction has several target nodes then they should be listed separated by commas, for example Thm.C1, Thm.C2, Thm.C3.
In this way, the deduction in Fig. 2 presents an argument that concludes with the statement that was made in the theorem. In other words, it is a proof of the theorem.
Our first two deductions were both by Hilbert, and evidently he felt that node S in deduction Pf was a fairly easy consequence of node R (see Fig. 2). Maybe we disagree, and after spending ten minutes figuring out how S follows from R we decide to record our newfound demonstration in an expansion:
This time the name of the deduction is Xpan, and for the AUTHOR argument of the preamble we have the username of the user who decided to code the expansion. The target is Pf.S, and we have adduced two additional reasons why that node follows, A1 and A2. Note that the Meson script continues to cite Pf.R as a reason as well.
These are the fundamentals of deductions. If you are encoding a result from the literature, you write it in a top-level deduction. If you are writing a proof or expansion, you encode it in a deduction whose targets are the nodes requiring proof or further explanation.
As you can learn in greater detail by reading about the structure of the Proofscape library, there are basically two kinds of module: literature modules, and expansion modules. Simply put, the former are for encoding results from the literature and their proofs, and the latter are for when you want to offer your own expansions.
Continuing with the fake example from the first section of this tutorial, a full literature module might look like this:
and an expansion module like this:
Apart from deduction definitions, the only other elements you see in the modules in Figs. 4 and 5 are import statements.
Import statements in Proofscape modules have more or less the same syntax and semantics as import statments in the Python programming language, so if you happen to be familiar with Python already then this will all sound very familiar. In fact the library of Proofscape modules has an __init__.pfsc module in every directory too, just as there are __init__.py modules in a Python library. (More about this below.)
Getting import statements right is probably the trickiest part of writing Proofscape modules, but fortunately much of the gritty details will be handled for you automatically. Each time you begin to edit a new module, the system will generate a certain amount of "boilerplate" code and drop it into the module editor, so that you'll never be starting with a blank slate. This mainly leaves you with just the interesting parts to do, namely filling in the nodes and the Meson script.
Nevertheless you should understand imports, and for that you must understand the structure of the Proofscape library. Fig. 6 shows a snapshot of a fragment of the library.
At the top level are the three main hierarchies, lit ("literature"), usr ("user"), and xpan ("expansions"), which house, respectively, the literature modules, modules representing users of the system, and the expansion modules.
Under lit we have folders for various authors, among them Hilbert. Under each author are folders representing those of their works which are currently represented in the library. For example under Hilbert we have ZB representing his work, the Zahlbericht. Within the folder for a work are Proofscape modules representing individual results in that work, and their proofs.
Let's now walk through the import statements in Figs. 4 and 5. The first is:
The dotted path lit.Hilbert points to the folder lit/Hilbert in the library, and -- just like in Python -- importing from a folder means importing from the __init__ module defined in that folder. In each author folder, the __init__ module defines metadata about the author under a variable called author, and it is the value of this variable that we are importing in this first import statement.
However, within the context of the literature module in Fig. 4 it makes sense to refer to the author variable as Hilbert, hence the as Hilbert clause at the end of the import statement, which sets the name for this variable within the module where we do the import. It is only because of this that we are able to name Hilbert in the preambles to the deductions defined in this module.
Next let's move to Fig. 5 and the next import statement, which is very similar to the last:
This time of course we are importing from the __init__ module in the usr/someuser folder, where a variable called user is defined, and which represents this user. We import this variable under this user's (not so imaginative) username someuser, so that, again, we can use this name in the preamble to the deduction, crediting this user with having written it.
Next comes:
Here we are imagining that Fig. 4 showed the contents of the Thm1.pfsc module in the folder lit/Hilbert/ZB (it doesn't actually!), and this import statement simply serves to import the Pf deduction defined there, so that we can now expand on one of its nodes. There's no as clause this time because we are already happy with the name Pf for this object.
It is only because of this import statement that we can refer to the node Pf.S in the preamble to our expansion, and again in its Meson script. In general,
When the expansion module of Fig. 5 is accepted into the library, it will be stored under the xpan hierarchy. For details on how that works, read more about the structure of the Proofscape library.
For now there is only one other type of statement you can make at the top level of a Proofscape module besides an import statement, and that is a tex_mode statement, which controls how the TeX that you write on node labels is interpreted.
The simplest thing you can do here is nothing at all. That is, if you don't make any tex_mode statement then the module interpreter will expect to find ordinary LaTeX markup on your node labels. If you wanted to, you could make the statment:
and that would have the same effect; but there's no reason to do this.
There is actually only one other option:
which allows you to write your node labels in an alternative (and some would say faster/easier/better) version of TeX syntax called VerTeX. The choice is yours.
You may write comments in a Proofscape module on lines beginning with #. However, you cannot put any comments at the top of the module. Any comments you write here will be deleted and replaced by the license header that is automatically added at the top of all Proofscape modules.
A deduction presents a logical argument, and,
In this section we talk about the various node types, and what kind of content belongs on them. Recall that every deduction definition has the form:
preamble { node_definitions meson_script }
and now we are discussing the node_definitions part of that. Meanwhile the arrows connecting the nodes are created by the meson_script, which is the subject of another tutorial.
Here we'll talk first about the things you define within a node definition, and how you define them; then cover the basic node types; finally we'll discuss subdeductions, which give Proofscape deductions a recursive structure.
Details aside,
Let's begin by examining the actual contents of the module lit.Hilbert.ZB.Thm1 which defines Theorem 1 from Hilbert's Zahlbericht (not the phony version we imagined earlier!):
and examine its one node definition.
The general form of a node definition is:
NODETYPE NAME { ASSIGNMENTS NODE_DEFINITIONS }
and in this example the NODETYPE is asrt (which stands for "assertion"), the NAME is C, and there are two ASSIGNMENTS, which assign string values to the variables de and en. This is where we define the text that will be rendered on the node as its label. In general a node definition may contain nested NODE_DEFINITIONS (you will see examples below), but in this case we don't have any.
A bit like Wikipedia, at Proofscape we think it's a nice idea to encode knowledge in more than one language; in our case that's especially true if and when you may be encoding a result that was originally published in a language other than English. However, today English is the lingua franca of mathematical and scientific communications in general, so in such cases it's a good idea to provide an English translation as well.
Hilbert's Zahlbericht is in German, so we have written the original theorem statement in German under the de variable, and we have provided an English translation under the en variable.
There is also one special "language code," sy which stands for "symbolic," and this is to be used for node labels that contain only mathematical symbols. Such labels can be displayed no matter what human language the user may select, so it's important to note these as such!
For the most part, node labels are the only ASSIGNMENTS you'll need to make; there are however a few special node types (discussed below) for which other assignments need to be made, and for which there are nested NODE_DEFINITIONS.
In the label for any node A you can create an internal link to any other node B with the markdown
[TEXT](DOTTED_PATH)
where the TEXT is what you want to appear in the node label, and the DOTTED_PATH is any valid reference to the node B from the context where node A is found.
For example if nodes A and B live in the same deduction then DOTTED_PATH can simply be B. If A belongs to a deduction Pf and B to a deduction Thm defined in the same module, then Thm.B will do. If B belongs to a deduction defined in another module, then you will need to use an import statement in order to refer to it.
An internal link appears as highlighted text in a node label. Hovering over it highlights the referenced node if it is already on the board; clicking the link puts the referenced node on the board if it is not already there.
As you may have noticed by now, strings are always defined in Proofscape modules using the quotation mark ", and you can use either one " or three """ at each end of the string (and they have to match!).
The rule is that you should use a single quotation mark " at each end of the string only when the whole string fits on a single line in the module. You should use triple quotation marks """ instead for multiline strings i.e. those whose definition spans more than a single line in the module. This holds for Meson scripts as well as any other strings, like node labels.
HOWEVER, in the case of node labels triple quotes do MORE than just allow you to define a string over several lines; the line breaks and indentation are actually honoured. That is, you can control the linebreaking and indentation that will appear when the node is displayed in the browser!
For example, if you define a node as in Fig. 8
then the node will appear as in Fig. 9.
Notice that when defining the node label in Fig. 8 we put a line break after the word "equation", we indented the equation by a few spaces, and we put line breaks after the equation and after the word "solutions". And this is exactly how the label appears when the node is rendered in the browser.
To be precise, indentation in triple-quoted strings is relative to that of the first line in the string containing non-whitespace characters. For example in Fig. 8, although the name "Catalan" is indented eight spaces in the module editor, this is interpreted as the level of zero indentation for the actual node label, and the indentation of subsequent lines is interpreted relative to this. (If subsequent lines have less indentation than the first line, they are simply put at the zero level.)
As another convenience, linebreaks occurring within LaTeX math modes are ignored. This is because the syntax required to define a mathematical expression is often much longer than the typeset expression itself, and so you may need to break the syntax over multiple lines in the module editor, while you still want the typeset expression to appear all on one line.
Altogether, the rules are summarised as follows:
It's important to use triple-quoted strings properly since, quite unlike writing a mathematics paper in LaTeX, you can't count on a pre-defined page width to create appropriate line breaks. Instead it's up to you to format the label text for a node in a way that's appropriate, so that it's easy to read and so that each node has an aesthetically pleasing aspect ratio -- neither too flat and wide nor too tall and thin.
In this section we cover the basic node types in Proofscape, which serve various purposes like making assertions, introducing defined symbols, making assumptions, etc. Examples can be seen in Fig. 10, and in the following subsections we discuss the node types one by one. As a preview, we show in Fig. 11 how each of the nodes in Fig. 10 could be defined.
You can use the links below if you want to jump down and read about a particular node type, but on a first reading we suggest beginning with Assertion nodes and reading straight through.
In the examples considered so far in this tutorial we have only seen the asrt or "assertion" node type, which, naturally, is for stating assertions. The most important thing to remember about assertion nodes is:
For example, a prose proof in the literature on group theory might conclude with a statement like, "Therefore H = G." When encoded in Proofscape, however, we would use an asrt node whose label said only "H = G" omitting "Therefore." Such a node could be defined as in Fig. 12.
In a Proofscape diagram it is entirely up to the arrows to indicate the logical flow of the argument, and that is why logical language like "therefore," "hence," "it follows that," etc. never appears on assertion nodes.
Before moving on to discuss the other node types, we pause a moment to talk about naming conventions for nodes. In Fig. 12 we used the name A30 for an asrt node. Nautrally enough, the A is for "Assertion," and if you look back at Fig. 11 you will see that we always use an initial letter corresponding to the node type. This makes it easier to keep track of your nodes when you are writing the Meson script. As for the numbering, we recommend that you employ the old programming trick of working by tens, in case you later realise that you need to go back and insert a node that you forgot about.
Introducing symbols to stand for various objects is a part of all mathematical proofs. In Proofscape this is done on an intr a.k.a. "introduction" or "intro" node.
Whereas asrt nodes are rectangular with ordinary boundary (Fig. 12), intr nodes are rectangular with bold boundary (Fig. 13). This helps you to locate intr nodes quickly when studying a Proofscape diagram, and also seems a natural metaphor for the introduction of objects and symbols by fiat.
NB: It is common practice in mathematical works for some definitions to persist for a whole section, a whole chapter, or even for an entire work. For this Proofscape provides a feature called RunningDefs which is covered below, in the section on advanced topics. For now we simply note:
Intro nodes are one of two node types that are called modal nodes, the other being Supposition/Assumption nodes, which we cover next. This distinction becomes relevant when writing a Meson script, as covered here.
"Supposition", "Assumption", or "Supp" nodes have nodetype supp and are for stating the premises of a theorem, as well as for introducing new assumptions during the course of a proof, such as in a proof by contradiction, or proof by cases (Fig. 14).
Supp nodes are rectangular with a dashed boundary, a natural metaphor for their contingent status.
Along with Intro nodes, Supp nodes are modal nodes.
The "existential introduction" or "exin" node has nodetype exin and it handles formulas like,
in which an object is both stated to exist and simultaneously introduced for further use (Fig. 15).
Since the exin node itself makes an assertion, its boundary is rectangular and in plain style like an ordinary assertion node. Meanwhile it features internal intro and assertion nodes for the components of the statement.
As such this is the first compound node type that we have considered, i.e. a node type containing nested node definitions within its own definition.
In Fig. 15 there is one internal intro node, introducing the object which is stated to exist, and two internal asrt nodes, stating properties of that object. (Within a compound node there is no need to employ the "number by tens" naming convention, since you are unlikely to accidentally omit part of the statement!)
The internal nodes can then be used individually in the Meson script, and are referenced via dotted paths such as E70.I, E70.A1, and E70.A2 in this example.
Note that the label text for the exin node itself specifies the connective text, "There exists ... such that" and a % character is used to separate the two parts of this text.
Sometimes in proofs we indicate how an inference follows by explicitly naming or describing the method that is involved. For example we use phrases like:
and in Proofscape these sorts of indications belong on mthd or Method nodes (Fig. 16).
In order to understand how method nodes are used in a Proofscape diagram you should check out the section of the Meson tutorial on method nodes and take a look at a real example.
In Proofscape you should use internal citations wherever possible. What this means is that if, for example, your proof cites a lemma, and that lemma has already been formalised in Proofscape, then you merely need to import that lemma into your module (see section above on imports) and then refer to it in your Meson script.
It's also common for proofs to cite a step in another proof, and one of the great things about Proofscape is that such a "step" will correspond to an actual node in that proof, and you can simply import that node and use it in your Meson script!
However, sometimes you want to cite something that is not in the Proofscape library, either because it just hasn't been added yet, or because it's too vague to ever be formalised. We refer to such cases as external citations, and for these we provide the cite or Citation node (Fig. 17).
Proof by contradiction or reductio ad absurdum is a staple of mathematical proof and, as mentioned earlier, when formalised in Proofscape a proof by contradiction will introduce its assumption using a supp node.
In order to make it quite clear when the contradiction has finally been derived in such a proof, there is a special node type just for this, the flse or Falsum node (Fig. 18).
Graphically a falsum node is simply an assertion node whose label shows nothing but the "uptack" or "falsum" symbol often used to indicate absurdity or contradiction in formal logical systems. While this could have been achieved with an asrt node, we feel it is better to semantically encode the contradiction using a special node type. Furthermore this does away with the need to choose a natural language phrasing to express the contradiction, or to record it in several different languages such as English, German, French, Russian, etc.
Since the label for a flse node is predefined, there is no need to actually define anything inside the node definition braces. Defining a flse node therefore really just means choosing a name for it, as in Fig. 18.
As in a case like this one, notice how the dashed boundary of the supp node makes it easy to find the assumption that led to the appearance of the flse node.
In derivations it is common to make sequential assertions of the form
in which the Ei are expressions and the Ri binary relations such as =, <, >, etc. In such cases it is standard practice to concatenate these relations, as
so that intermediate expressions are not repeated. In proofs we often find that such a "relation chain" is used altogether in support of some claim, or that individual relations within the chain may need to be cited one at a time, or proved one at a time.
In order to support these patterns, Proofscape provides the rels or Relation Chain node type (Fig. 19).
Each relation is given on an asrt node defined within the rels node. In the initial relation we write both the left-hand and right-hand expressions, whereas for subsequent relations we omit the left-hand side.
Much as we saw earlier with the exin node type, it is then possible to refer to the rels node itself in a Meson script -- as R80 in the present example -- as well as to refer to its subnodes using dotted paths, such as R80.A1, or R80.A2.
Often a proof P contains a subargument S that stands almost or completely on its own. It may be that S was included in P because it was not important enough to be stated as its own lemma, or S may depend on one or two assertions or definitions arising earlier in P. In any case, for a subproof like S Proofscape offers a feature called a subdeduction.
A subdeduction is a special node, of type subdeduc. It is declared and given a name just like any other node, and does not get a preamble like a proper deduction does. On the other hand, its internal structure is the same as that of a proper deduction: node declarations, followed by a Meson script.
Like all other compound nodes, a subdeduction itself can be named in a Meson script, and so can its internal nodes, using dotted paths.
Conversely, if S is any deduction -- either a proper deduction or a subdeduction -- and if T is a subdeduction declared within S, then the Meson script of T can refer to nodes in S, and this works according to the usual rules of variable scopes in programming languages. For example if both S and T declare a node called A10, then the Meson script of T can refer to the node in T as A10 and that of S as S.A10. If, on the other hand, S declares a node called I20 but T does not, then in the Meson script for T this node can be referenced simply as I20. In other words,
For example, in Fig. 20 the deduction Pf features a subdeduction named SD1. In the Meson script for the subdeduction you see that B refers to the node named B in Pf, whereas C refers to the node named C in SD1. Meanwhile Pf.C refers to the node by the same name in Pf.
The final ingredient to defining a deduction in a Proofscape module is the Meson script, and for that there is the separate
Now that you know how to write Proofscape modules, how do you open the module editor and begin editing?
The answer is that it depends on what you want to do. If you just want to practice then you can visit the Sandbox as explained below.
If you want to actually contribute to the library then you must first log in, after creating an account if you have not already done so. Accounts are free, and all you need is an email address.
If you are just learning how to write modules and would simply like to practice (with no impact on the library) you should visit the Sandbox.
Visit the Library page. Select the author and work that you are interested in. If the work (or author) is not yet in the library, submit a library requisition using the link on the left. Otherwise, after opening the work you'll see in the far-right column a list of all the results that have been formalised so far. If the result you want to encode is not listed yet, then click the 'Transcribe it' button, enter the result type and number, and then click the 'Begin' button.
There are two ways to offer an expansion. For one, you can scan the Requests page and look for questions you would like to answer. Otherwise, any time you are browsing in Moose you can select a node, and then click the 'Provide' button in the sidebar. Naturally, it has to be the kind of node that makes an assertion, so that you can offer further explanation as to how that inference is made.
If you want to edit an existing module, you must first open a deduction from that module in Moose, and then click the pencil icon in that deduction's row in the TrailMap tab of the sidebar. (There's help on using Moose here.)
If it is an expansion module, then you can only edit it if you were its original author, or if you are an admin user. If it is a literature module then anyone can edit it, unless it has been closed, which is a decision made by the admins when a module appears to be complete and correct. But even if you cannot edit a module you can still post a comment on the module's discussion page, which can be accessed by clicking the talk icon beside the pencil icon in the TrailMap.
The module editor has several buttons that may be visible, depending on the mode in which you are working.
This is the button you will use the most often. It passes your module to the validator, and, if everything is in order, your diagram is displayed on the canvas. You can check whether all the TeX came out as expected, revise the Meson script, etc., and click Preview again.
If instead there were any errors in your module, then the console will be overlaid atop the canvas, showing the error message(s).
Use this button to toggle the console overlay.
If you are editing an existing module, or a new module that is to be added to the library, then there will be a Submit button. You should not click this until you have already done a Preview, and checked that everything is exactly as you want it to be.
The Submit button sends the module to be checked for correctness, formatted (regularising things like indentations and line lengths), and finally written into the library.
Running Defs panel, when you are editing a module belonging to the lit hierarchy.
This opens theIt is common practice in mathematical works for some symbol definitions to persist for a whole section, a whole chapter, or even for an entire work. For this Proofscape provides a feature called RunningDefs.
When you are editing a literature module -- i.e. a module representing a result from the literature -- the module editor will feature a "Running Defs" button that you can click in order to open the Running Defs panel (Fig. 21).
Each work in the Proofscape library gets its own set of running defs (or "rdefs" for short). Therefore it is in the __init__ module for a work that all its rdefs are recorded, along with lists indicating to which of the results in that work each rdef applies. (We mentioned __init__ modules briefly in the section on imports, and you can read more about the structure of the Proofscape library here.)
Thus when you are formalising a new result for a work and you open the rdefs panel, you see a listing of all rdefs currently defined for that work. For example in Fig. 21 you see how the rdefs at one time looked for Hilbert's Zahlbericht.
The list of existing rdefs appears in Section 1 one of the panel, and you can simply check the box for each definition that applies to the result you are adding to the library. If instead the definition you need has not yet been added to the work, then you can create a new definition in Section 2 of the rdefs panel. HOWEVER, you must be careful here:
When you submit the module, the list of rdefs associated with it is automatically recorded in the __init__ module for the work, along with any new rdefs you have defined. You do not need to (and cannot) edit the __init__ module manually.
There are some subtle points about writing Proofscape modules which can only be properly understood after you have read more about the structure of the Proofscape library. These topics are covered in the Advanced Topcs section of the Proofscape Library Reference.