Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute of Computer Science, Saarbruecken, Germany
4941
Marino Miculan Ivan Scagnetto Furio Honsell (Eds.)
Types for Proofs and Programs
International Conference, TYPES 2007
Cividale del Friuli, Italy, May 2–5, 2007
Revised Selected Papers
Volume Editors
Marino Miculan
Ivan Scagnetto
Furio Honsell
Università degli Studi di Udine
Dipartimento di Matematica e Informatica
Via delle Scienze 206, 33100 Udine, Italy
Email: {miculan, scagnett, honsell}@dimi.uniud.it
Library of Congress Control Number: 2008926731
CR Subject Classification (1998): F.3.1, F.4.1, D.3.3, I.2.3
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-540-68084-5 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-68084-0 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com

© Springer-Verlag Berlin Heidelberg 2008
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 12267088 06/3180 543210
Preface
These proceedings contain a selection of refereed papers presented at or related to the Annual Workshop of the TYPES project (EU coordination action 510996), which was held during May 2–5, 2007 in Cividale del Friuli (Udine), Italy. The topic of this workshop was formal reasoning and computer programming based on type theory: languages and computerized tools for reasoning, and applications in several domains such as analysis of programming languages, certified software, formalization of mathematics and mathematics education. The workshop was attended by more than 100 researchers and included more than 40 presentations. We also had the pleasure of three invited lectures, from Frédéric Blanqui (INRIA, Protheo team), Peter Sewell (University of Cambridge) and Amy Felty (University of Ottawa). From 22 submitted papers, 13 were selected after a reviewing process. Each submitted paper was reviewed by three referees; the final decisions were made by the editors.

This workshop is the last of a series of meetings of the TYPES working group funded by the European Union (IST project 29001, ESPRIT Working Group 21900, ESPRIT BRA 6453). The proceedings of these workshops were published in the Lecture Notes in Computer Science series:

TYPES 1993 Nijmegen, The Netherlands, LNCS 806
TYPES 1994 Båstad, Sweden, LNCS 996
TYPES 1995 Turin, Italy, LNCS 1158
TYPES 1996 Aussois, France, LNCS 1512
TYPES 1998 Kloster Irsee, Germany, LNCS 1657
TYPES 1999 Lökeborg, Sweden, LNCS 1956
TYPES 2000 Durham, UK, LNCS 2277
TYPES 2002 Berg en Dal, The Netherlands, LNCS 2646
TYPES 2003 Turin, Italy, LNCS 3085
TYPES 2004 Jouy-en-Josas, France, LNCS 3839
TYPES 2006 Nottingham, UK, LNCS 4502
ESPRIT BRA 6453 was a continuation of ESPRIT Action 3245, Logical Frameworks: Design, Implementation and Experiments. The proceedings of the annual meetings under that action were published by Cambridge University Press in the books Logical Frameworks and Logical Environments.

TYPES 2007 was made possible by the contribution of many people. We thank all the participants in the workshop, and all the authors who submitted papers for consideration for these proceedings. We would also like to thank the referees for their effort in preparing careful reviews. Finally, we acknowledge the support of the University of Udine in the organization of the meeting.

January 2008
Marino Miculan Ivan Scagnetto Furio Honsell
Referees

Abel, Andreas
Alessi, Fabio
Asperti, Andrea
Benton, Nick
Bertot, Yves
Bove, Ana
Brady, Edwin
Brauner, Paul
Callaghan, Paul
Castéran, Pierre
Coquand, Thierry
Crosilla, Laura
D'Agostino, Giovanna
Damiani, Ferruccio
Di Gianantonio, Pietro
Filinski, Andrzej
Gabbay, Murdoch J.
Gambino, Nicola
Geuvers, Herman
Ghani, Neil
Grégoire, Benjamin
Hasuo, Ichiro
Herbelin, Hugo
Honsell, Furio
Hyland, Martin
Hyvernat, Pierre
Jacobs, Bart
Kamareddine, Fairouz
Kikuchi, Kentaro
Kirchner, Claude
Klein, Gerwin
Levy, Paul B.
Luo, Yong
Luo, Zhaohui
Mackie, Ian
Marra, Vincenzo
McBride, Conor
Miculan, Marino
Miller, Dale
Moggi, Eugenio
Momigliano, Alberto
Nordström, Bengt
Norell, Ulf
Omodeo, Eugenio
Ornaghi, Mario
Paulin-Mohring, Christine
Peyton Jones, Simon
Pichardie, David
Pitts, Andy
Pollack, Randy
Power, John
Rabe, Florian
Rubio, Albert
Scagnetto, Ivan
Schwichtenberg, Helmut
Soloviev, Sergei
Urban, Christian
Veldman, Wim
Vene, Varmo
Wenzel, Markus
Werner, Benjamin
Zacchiroli, Stefano
Table of Contents
Algorithmic Equality in Heyting Arithmetic Modulo ........................ 1
   Lisa Allali

CoqJVM: An Executable Specification of the Java Virtual Machine
Using Dependent Types ................................................... 18
   Robert Atkey

Dependently Sorted Logic ................................................ 33
   João Filipe Belo

Finiteness in a Minimalist Foundation ................................... 51
   Francesco Ciraulo and Giovanni Sambin

A Declarative Language for the Coq Proof Assistant ...................... 69
   Pierre Corbineau

Characterising Strongly Normalising Intuitionistic Sequent Terms ........ 85
   J. Espírito Santo, S. Ghilezan, and J. Ivetić

Intuitionistic vs. Classical Tautologies, Quantitative Comparison ....... 100
   Antoine Genitrini, Jakub Kozik, and Marek Zaionc

In the Search of a Naive Type Theory .................................... 110
   Agnieszka Kozubek and Paweł Urzyczyn

Verification of the Redecoration Algorithm for Triangular Matrices ...... 125
   Ralph Matthes and Martin Strecker

A Logic for Parametric Polymorphism with Effects ........................ 142
   Rasmus Ejlers Møgelberg and Alex Simpson

Working with Mathematical Structures in Type Theory ..................... 157
   Claudio Sacerdoti Coen and Enrico Tassi

On Normalization by Evaluation for Object Calculi ....................... 173
   J. Schwinghammer

Attributive Types for Proof Erasure ..................................... 188
   Hongwei Xi

Author Index ............................................................ 203
Algorithmic Equality in Heyting Arithmetic Modulo

Lisa Allali

LogiCal, École Polytechnique, Région Ile de France
www.lix.polytechnique.fr/Labo/Lisa.Allali/
1 Introduction
Deduction modulo is a formalism that aims at distinguishing reasoning from computation in proofs. A theory modulo is formed of a set of axioms and a congruence defined by rewrite rules: the reasoning part of the theory is given by the axioms, the computational part by the congruence. In deduction modulo, we can in particular build theories without any axiom, called purely computational theories. What is interesting in building such theories, purely defined by a set of rewrite rules, is the possibility, in some cases, of simplifying proofs (typically, proofs of equalities between closed terms), and also the algorithmic aspect of these proofs. The motivation for building a purely computational presentation of Heyting Arithmetic takes root in La science et l'hypothèse by Henri Poincaré [8], where the author asks: should the proposition 2 + 2 = 4 be proved or just verified? A good way to verify such propositions is to use the formalism of deduction modulo and rewrite rules. In this perspective, Gilles Dowek and Benjamin Werner have built a purely computational presentation of Heyting Arithmetic [4]. Yet, this presentation did not take advantage of the decidability of equality in arithmetic: in their system, equality was defined by rewrite rules that followed Leibniz's principle. This is the essential aspect that is changed in the work we present in this paper. The starting point of this work is a remark of Helmut Schwichtenberg, following the developments done in Minlog [6], about whether a set of rewrite rules could be enough to decide equality in Heyting Arithmetic expressed as a purely computational theory.
We answer that question positively with a new purely computational presentation of Heyting Arithmetic HA−→ such that:
– HA−→ is an extension of the usual axiomatic presentation of Heyting Arithmetic HA: Leibniz's proposition no longer defines equality, but is a consequence of the rewrite rules of the system;
– this extension is conservative over HA;
– the congruence of HA−→ is decidable;
– HA−→ has the cut elimination property.

This work opens new ways to consider equality for inductive types in general: not with Leibniz's axiom, as is the case in Coq for instance, but by building specific rewrite rules for each type we are interested in.

M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 1–17, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 Definitions
2.1 Deduction Modulo
Modern type theories feature a rule called the conversion rule, which allows one to identify propositions that are equal modulo beta-equivalence. It is often presented as follows:

   Γ ⊢ t : T    Γ ⊢ T' : Type
   ───────────────────────────  T ≡ T'
          Γ ⊢ t : T'

where T ≡ T' is read "T is convertible to T'". This convertibility is not checked by logical rules but by computation with the rule β. The idea of natural deduction modulo is to use this computation of convertibility inside natural deduction. For instance, the axiom and ⇒-elimination rules are the following:

   ──────────  Ax   if A ∈ Γ and A ≡ B
    Γ ⊢≡ B

   Γ ⊢≡ C    Γ ⊢≡ A
   ─────────────────  ⇒e   if C ≡ A ⇒ B
        Γ ⊢≡ B

The other rules of natural deduction modulo are built in the same way upon natural deduction [3]. The convertibility ≡ is not fixed but depends on the theory. It can be any congruence defined by the reflexive, symmetric and transitive closure of a rewrite system.

2.2 Theories in Natural Deduction Modulo
Definition 1 (Axiomatic theory)
An axiomatic theory is a set of axioms.

Definition 2 (Modulo theory)
A modulo theory T is a set of axioms and a congruence defined as the reflexive, transitive and symmetric closure of a set of rewrite rules. The rewrite rules are either from terms to terms or from atomic propositions to propositions. Quantifiers may bind variables, thus these rewrite systems are Combinatory Reduction Systems [7].

Notation: Γ ⊢T A means the proposition A is provable in the theory T under the hypotheses Γ.

Definition 3 (Purely computational theory)
A purely computational theory is a modulo theory where the set of axioms is empty.
2.3 Relations between Theories
We want to go from the Heyting Arithmetic theory, which is an axiomatic theory, to a purely computational theory that has the same expressiveness. We need the following definitions to be able to compare, step by step, each theory we build to the previous one.

Definition 4 (Equivalence between two theories)
Let T and T' be two theories formed on the same language L. The theories T and T' are equivalent if and only if, for any proposition P of L, T ⊢ P if and only if T' ⊢ P.

Definition 5 (Extension)
Let T and T' be two theories formed respectively on the languages L and L' with L ⊆ L'. The theory T' is an extension of T if and only if, for every proposition P of L, if T ⊢ P then T' ⊢ P.

Definition 6 (Conservative extension)
Let T and T' be two theories respectively formed on the languages L and L' with L ⊆ L'. T' is a conservative extension of T if and only if, for any proposition P of L, T ⊢ P if and only if T' ⊢ P.

2.4 Models
In this section, we introduce the material we need to build models for intuitionistic deduction modulo. We give the necessary definitions and state the main theorems. The interested reader can refer to [2] and [5] for further explanations.

Pseudo Heyting algebras as models for intuitionistic logic modulo

Definition 7 (Pseudo Heyting algebra)
Let B be a set and ≤ a relation on B. A structure ⟨B, ≤, ∧̃, ∨̃, ⊥̃, ⊤̃, ⇒̃, ∀̃, ∃̃⟩ is a pseudo Heyting algebra if
– ≤ is a reflexive and transitive relation (not necessarily antisymmetric)¹,
– ⊥̃ is a minimum of B for ≤,
– ⊤̃ is a maximum of B for ≤,
– x ∧̃ y is a lower bound of x and y (where x and y are in B),
– x ∨̃ y is an upper bound of x and y (where x and y are in B),
– ∀̃ and ∃̃ (infinite lower and upper bounds) are functions from ℘(B) to B such that:
  • x ∈ a ⇒ ∀̃a ≤ x (where x is in B and a is in ℘(B)),
  • (∀x ∈ a, c ≤ x) ⇒ c ≤ ∀̃a (where x and c are in B and a is in ℘(B)),

¹ When this relation is moreover antisymmetric, we get a Heyting algebra.
  • x ∈ a ⇒ x ≤ ∃̃a (where x is in B and a is in ℘(B)),
  • (∀x ∈ a, x ≤ c) ⇒ ∃̃a ≤ c (where x and c are in B and a is in ℘(B)),
– x ≤ (y ⇒̃ z) ⇔ x ∧̃ y ≤ z (where x, y and z are in B).

Definition 8 (Ordered pseudo Heyting algebra)
An ordered pseudo Heyting algebra is a pseudo Heyting algebra together with a relation ⊑ on B such that
– ⊑ is an order relation,
– if ⊤̃ ≤ b and b ⊑ b' then ⊤̃ ≤ b',
– ⊤̃ is a maximal element for ⊑ and ⊥̃ is a minimal element for ⊑,
– ∧̃, ∨̃, ∀̃, ∃̃ are monotonous, and ⇒̃ is left anti-monotonous and right monotonous.
Definition 9 (Complete ordered pseudo Heyting algebra)
An ordered pseudo Heyting algebra is said to be complete if every subset of B has a greatest lower bound for ⊑.

Definition 10 (Modulo intuitionistic model)
Let L be a language. An intuitionistic model M of L is:
– a set M,
– an ordered and complete pseudo Heyting algebra B,
– for each function symbol f of arity n, a function f̂ from Mⁿ to M,
– for each predicate symbol P of arity n, a function P̂ from Mⁿ to B.
Definition 11 (Denotation)
Let M be a model, A be a proposition and φ be an assignment. We define ⟦A⟧φ as follows:

⟦x⟧φ = φ(x)
⟦⊥⟧φ = ⊥̃
⟦⊤⟧φ = ⊤̃
⟦f(t1, ..., tn)⟧φ = f̂(⟦t1⟧φ, ..., ⟦tn⟧φ)
⟦P(t1, ..., tn)⟧φ = P̂(⟦t1⟧φ, ..., ⟦tn⟧φ)
⟦A ∧ B⟧φ = ⟦A⟧φ ∧̃ ⟦B⟧φ
⟦A ∨ B⟧φ = ⟦A⟧φ ∨̃ ⟦B⟧φ
⟦A ⇒ B⟧φ = ⟦A⟧φ ⇒̃ ⟦B⟧φ
⟦∀x A⟧φ = ∀̃{⟦A⟧φ,x:=v | v ∈ M}
⟦∃x A⟧φ = ∃̃{⟦A⟧φ,x:=v | v ∈ M}
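As a concrete toy instance of Definition 11 (my own illustration, not from the paper), the following Python sketch evaluates the propositional and quantifier cases of the denotation in the three-element Heyting algebra {0, 1/2, 1}: a chain where ∧̃ is min, ∨̃ is max, a ⇒̃ b is 1 if a ≤ b and b otherwise, and ∀̃/∃̃ are greatest lower and least upper bounds over a finite domain. The tuple encoding of formulas and the sample predicate are assumptions of the sketch:

```python
# Toy denotation function over the 3-element Heyting algebra {0, 1/2, 1}.
# Formulas are tuples: ('P', x), ('and', A, B), ('or', A, B), ('imp', A, B),
# ('forall', x, A), ('exists', x, A).
from fractions import Fraction

B = [Fraction(0), Fraction(1, 2), Fraction(1)]   # the algebra, as a chain

def imp(a, b):
    """Heyting implication on a chain: 1 if a <= b, else b."""
    return Fraction(1) if a <= b else b

M = [0, 1, 2]                                    # a finite domain for forall/exists

def denote(A, phi, pred):
    tag = A[0]
    if tag == 'P':
        return pred(phi[A[1]])                   # atomic case: P-hat applied to phi(x)
    if tag == 'and':
        return min(denote(A[1], phi, pred), denote(A[2], phi, pred))
    if tag == 'or':
        return max(denote(A[1], phi, pred), denote(A[2], phi, pred))
    if tag == 'imp':
        return imp(denote(A[1], phi, pred), denote(A[2], phi, pred))
    if tag == 'forall':                          # greatest lower bound over M
        return min(denote(A[2], {**phi, A[1]: v}, pred) for v in M)
    if tag == 'exists':                          # least upper bound over M
        return max(denote(A[2], {**phi, A[1]: v}, pred) for v in M)
    raise ValueError(tag)

pred = lambda v: B[v]                            # P(0) = 0, P(1) = 1/2, P(2) = 1
assert denote(('forall', 'x', ('P', 'x')), {}, pred) == 0
assert denote(('exists', 'x', ('P', 'x')), {}, pred) == 1
assert denote(('imp', ('P', 'x'), ('P', 'x')), {'x': 1}, pred) == 1
```

Note that ⟦P(x) ⇒ P(x)⟧ is the top element for every assignment, while ⟦∀x P(x)⟧ is the lower bound of the values taken by P, exactly as the ∀̃ clause prescribes.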
Definition 12 (Models for purely computational theories)
A model of a purely computational theory whose rewrite rules are R1 −→ R1', ..., Rn −→ Rn' is a model such that for each assignment φ, ⟦Ri⟧φ = ⟦Ri'⟧φ for i ∈ {1, ..., n}.

The concept of model is useful when trying to find relations between theories, as shown by the two following theorems:

Theorem 1 (Completeness Theorem)
Let T be a theory. If for every model M such that M |= T we have M |= A, then T ⊢ A.²

² M |= reads as "M is a model of".
Theorem 2 (Correctness Theorem)
If T ⊢ A then, for every model M, if M |= T then M |= A.

Definition 13 (Superconsistency)
A theory (T, ≡) in deduction modulo is superconsistent if, for each ordered and complete pseudo Heyting algebra B, there exists a B-model of this theory.

The main property of a superconsistent theory is to have a model valued in the algebra of reducibility candidates, and thus to normalize [3].

Theorem 3 (Normalization)
If a theory (T, ≡) is superconsistent, then each proof in (T, ≡) is strongly normalizable.
3 Different Presentations of Heyting Arithmetic: From Axioms to Rewrite Rules

3.1 The Axiomatic Presentation of Heyting Arithmetic
The language of arithmetic is formed with the constant 0, the unary function symbol S, the binary function symbols + and ×, and the binary predicate symbol =. The axioms are structured in four groups.

Definition 14 (HA)
1. The axioms of equality:
   Reflexivity: ∀x (x = x)
   Leibniz axiom scheme: ∀x ∀y (x = y ⇒ (P(x) ⇔ P(y)))ᵃ
2. The axioms 3 and 4 of Peano:
   ∀x ∀y (S(x) = S(y) ⇒ x = y)
   ∀x (0 = S(x) ⇒ ⊥)
3. The induction scheme:
   (P{x := 0} ∧ ∀y (P{x := y} ⇒ P{x := S(y)})) ⇒ ∀n (P{x := n})
4. The axioms of addition and multiplication:
   ∀y (0 + y = y)
   ∀x ∀y (S(x) + y = S(x + y))
   ∀y (0 × y = 0)
   ∀x ∀y (S(x) × y = x × y + y)

ᵃ We chose to formulate Leibniz's axiom here with an equivalence symbol. Note that ∀x ∀y (x = y ⇒ (P(x) ⇒ P(y))) would have been enough, but the equivalence form simplifies the proof of Proposition 5 (equivalence between this theory and HAR).
The steps from an axiomatic presentation of Heyting Arithmetic HA to a purely computational one HA−→. We shall introduce four successive theories to reach the final purely computational theory we aim at: HAR, HAN, HAK and HA−→. We will prove that each of them is equivalent to, or is a conservative extension of, HA. The main novelty is the step from HA to HAR, with new rewrite rules that compute equality instead of Leibniz's scheme. The three other theories follow the work done in [4], especially for the treatment of the induction scheme, but the rules are different, so the proofs need to be adapted.

3.2 HAR, a Theory Equivalent to HA
The theory HAR keeps an axiom scheme for induction, but orients the axioms of addition and multiplication as rewrite rules. It also introduces four rules for rewriting atomic propositions of the form t = u. As we shall see, these rules replace the axioms of equality (reflexivity and Leibniz's scheme) and the axioms 3 and 4 of Peano.

Definition 15 (HAR)
1. The induction scheme
   (P{x := 0} ∧ ∀y (P{x := y} ⇒ P{x := S(y)})) ⇒ ∀n P{x := n}
2. The rewrite rules
   0 = 0 −→ ⊤
   0 = S(x) −→ ⊥
   S(x) = 0 −→ ⊥
   S(x) = S(y) −→ x = y
   0 + y −→ y
   S(x) + y −→ S(x + y)
   0 × y −→ 0
   S(x) × y −→ x × y + y
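To illustrate how these eight rules decide equality of closed terms by computation alone, here is a minimal Python sketch (my own encoding, not part of the paper), restricted to closed terms so no induction is needed:

```python
# The eight rewrite rules of HA_R on closed terms.
# Terms are nested tuples: ZERO, ('S', t), ('+', t, u), ('*', t, u).
ZERO = ('0',)

def norm(t):
    """Normalize a closed term with the four arithmetic rules."""
    if t == ZERO:
        return t
    if t[0] == 'S':
        return ('S', norm(t[1]))
    if t[0] == '+':
        x, y = norm(t[1]), norm(t[2])
        if x == ZERO:                            # 0 + y --> y
            return y
        return ('S', norm(('+', x[1], y)))       # S(x) + y --> S(x + y)
    if t[0] == '*':
        x, y = norm(t[1]), norm(t[2])
        if x == ZERO:                            # 0 * y --> 0
            return ZERO
        return norm(('+', ('*', x[1], y), y))    # S(x) * y --> x*y + y
    raise ValueError(t)

def equal(t, u):
    """Decide t = u on closed terms with the four equality rules."""
    t, u = norm(t), norm(u)
    while t[0] == 'S' and u[0] == 'S':           # S(x) = S(y) --> x = y
        t, u = t[1], u[1]
    return t == ZERO and u == ZERO               # 0 = 0 --> T; other cases --> bottom

def num(n):
    return ZERO if n == 0 else ('S', num(n - 1))

assert equal(('+', num(2), num(2)), num(4))      # 2 + 2 = 4 rewrites to T
assert not equal(('*', num(2), num(3)), num(5))  # 2 * 3 = 5 rewrites to bottom
```

With this, Poincaré's 2 + 2 = 4 is indeed verified rather than proved: both sides normalize to S(S(S(S(0)))) and the equality rules reduce the atom to ⊤, with no appeal to the induction scheme.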
Proposition 1. The propositions ∀x (x = x), ∀x ∀y (x = y ⇒ y = x), and ∀x ∀y ∀z (x = y ⇒ y = z ⇒ x = z) are provable in HAR.

Proof. Reflexivity is proved by induction on x. This requires proving the propositions 0 = 0 and ∀y (y = y ⇒ S(y) = S(y)). The first proposition reduces to ⊤ and the second to ∀y (y = y ⇒ y = y), which is obviously provable. Symmetry is proved by two nested inductions, the first on x and the second on y. Transitivity is proved by three nested inductions, on x, y and then z. Notice that all these proofs can be written inside the system itself using the induction axiom scheme. □

Proposition 2. The propositions ∀x ∀y ∀z (x = y ⇒ x + z = y + z) and ∀x ∀y ∀z (x = y ⇒ z + x = z + y) are provable in HAR.

Proof. Both propositions are proved inside the system by two nested inductions on x and y. □

Proposition 3. The propositions ∀x ∀y ∀z (x = y ⇒ x × z = y × z) and ∀x ∀y ∀z (x = y ⇒ z × x = z × y) are provable in HAR.
Proof. Both propositions are proved inside the system by two nested inductions on x and y. But this first requires proving the propositions ∀x (x × 0 = 0), ∀y ∀x (y × S(x) = y × x + y) and ∀x ∀y (x × y = y × x), which are again proved with the induction axiom scheme. □

Proposition 4. For each term t, the proposition ∀a ∀b (a = b ⇒ t{y := a} = t{y := b}) is provable in HAR.

Proof. By induction on the structure of t, using Propositions 1, 2 and 3. □
Proposition 5. Each instance of Leibniz's scheme ∀x ∀y (x = y ⇒ (P(x) ⇔ P(y))) is provable in HAR.

Proof. By induction on the structure of P, using Proposition 4 for the atomic case. □

Proposition 6 (Equivalence between HA and HAR). The theory HAR is equivalent to HA, i.e. for any closed proposition A in the language of HA, A is provable in HA if and only if A is provable in HAR.

Proof
⇒ We check that each axiom of HA is provable in HAR and we conclude with an induction over the proof structure.
– The proposition ∀x (x = x) is provable in HAR by Proposition 1.
– Leibniz's scheme is provable in HAR by Proposition 5.
– The axioms 3 and 4 of Peano rewrite to the easily provable propositions x = y ⇒ x = y and ⊥ ⇒ ⊥.
– The induction scheme is an axiom of HAR.
– The axioms of addition and multiplication rewrite to propositions that are consequences of the reflexivity of equality.
⇐ The induction axiom scheme is the same in HAR as in HA. The rest of HAR is a rewrite system defining a congruence ≡. We prove that for all propositions A and B, if A ≡ B, there exists a proof of A ⇔ B in HA. To do so, we prove by induction on the structure of A that if A −→ B in HAR then there exists a proof of A ⇔ B in HA. □

3.3 HAN, a Conservative Extension of HAR
We add the predicate N to the language. We modify the induction scheme axiom by adding this predicate, and we add two axioms for the predicate N, which are the axioms 1 and 2 of Peano.
Definition 16 (HAN)
1. The induction scheme
   ∀n (N(n) ⇒ (P{x := 0} ∧ ∀y (N(y) ⇒ P{x := y} ⇒ P{x := S(y)})) ⇒ P{x := n})
2. The axioms 1 and 2 of Peano
   N(0)
   ∀x (N(x) ⇒ N(S(x)))
3. The rewrite rules
   (1) 0 = 0 −→ ⊤
   (2) 0 = S(x) −→ ⊥
   (3) S(x) = 0 −→ ⊥
   (4) S(x) = S(y) −→ x = y
   (5) 0 + y −→ y
   (6) S(x) + y −→ S(x + y)
   (7) 0 × y −→ 0
   (8) S(x) × y −→ x × y + y
Translation |.| from HAR to HAN
|t = u| = (t = u)
|⊤| = ⊤
|⊥| = ⊥
|A ∧ B| = |A| ∧ |B|
|A ∨ B| = |A| ∨ |B|
|A ⇒ B| = |A| ⇒ |B|
|∀x A| = ∀x (N(x) ⇒ |A|)
|∃x A| = ∃x (N(x) ∧ |A|)

We want to prove that HAN is an extension of HAR. The difficulty lies in the N(t) added by the translation. We first prove a few properties of the N predicate:

Proposition 7. HAN ⊢ ∀x ∀y (N(y) ⇒ N(x) ⇒ N(x + y))

Proof. We first introduce N(y) in the context, then we use the induction scheme axiom on x. □

Proposition 8. HAN ⊢ ∀x ∀y (N(x) ⇒ N(y) ⇒ N(x × y))

Proof. We first introduce N(y) in the context, then we use the induction scheme axiom on x. The proof uses Proposition 7. □

Proposition 9. N(z⃗)³ ⊢HAN N(t) for all t where FV(t) = z⃗

Proof. By structural induction on t. □
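The relativization translation |.| defined above is purely structural, which makes it easy to mechanize. The following Python sketch (a hypothetical tuple encoding of formulas, not the paper's code) implements it:

```python
# The translation |.| from HA_R to HA_N: guard every quantifier with the
# predicate N, translate everything else homomorphically.
def rel(A):
    tag = A[0]
    if tag in ('=', 'top', 'bot'):               # atoms are unchanged
        return A
    if tag in ('and', 'or', 'imp'):
        return (tag, rel(A[1]), rel(A[2]))
    if tag == 'forall':                          # |forall x A| = forall x (N(x) => |A|)
        return ('forall', A[1], ('imp', ('N', A[1]), rel(A[2])))
    if tag == 'exists':                          # |exists x A| = exists x (N(x) and |A|)
        return ('exists', A[1], ('and', ('N', A[1]), rel(A[2])))
    raise ValueError(tag)

# |forall x (x = x)| = forall x (N(x) => x = x)
assert rel(('forall', 'x', ('=', 'x', 'x'))) == \
       ('forall', 'x', ('imp', ('N', 'x'), ('=', 'x', 'x')))
```

Only the quantifier cases touch N; this is why, in the propositions below, the whole difficulty of the extension proof concentrates on the quantifier rules.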
Then we prove the following proposition, which is the key lemma of our proof.

Proposition 10. For each proposition A and vector z⃗ where FV(A) is included in z⃗, if Γ ⊢HAR A then |Γ|, N(z⃗) ⊢HAN |A|.

Proof. By induction on the size of the proof tree. Most of the cases are trivial, except those concerning the introduction and elimination of the quantifiers. For those, we use Proposition 9. □

³ N(z⃗) is the notation for N(z1), ..., N(zn) with z⃗ = {z1, ..., zn}.
Proposition 11 (HAN is an extension of HAR). Let A be a closed proposition of HAR. If A is provable in HAR then |A| is provable in HAN.

Proof. We use Proposition 10, and remove all the N(z⃗) appearing in the context by using the ∃-elimination rule as follows: from ⊢HAN N(0), which is an axiom, we derive ⊢HAN ∃x N(x) by ∃-introduction; combining this with the proof Π of N(z0), N(z⃗) ⊢HAN |A| given by Proposition 10, the ∃-elimination rule yields N(z⃗) ⊢HAN |A|. □

To prove that the extension is conservative with respect to the translation |.|, we introduce another translation * from HAN to HAR where every occurrence of the N predicate is replaced by ⊤. Then we prove some properties of this translation, to finally be able to prove the conservativity.

Translation * from HAN to HAR
(t = u)* = (t = u)
N(x)* = ⊤
⊤* = ⊤
⊥* = ⊥
(A ∧ B)* = A* ∧ B*
(A ∨ B)* = A* ∨ B*
(A ⇒ B)* = A* ⇒ B*
(∀x A)* = ∀x (A*)
(∃x A)* = ∃x (A*)
Proposition 12. Let A be a closed proposition of HAN. If A is provable in HAN then A* is provable in HAR.

Proof. By induction on the size of the proof, using the fact that the rewrite rules are the same and that the induction scheme axiom of HAR is the exact translation by * of the induction scheme axiom of HAN. □

Corollary 1. Let A be a closed proposition of HAR. If |A| is provable in HAN then |A|* is provable in HAR.

Proposition 13. Let A be a closed proposition of HAR. A and |A|* are equivalent in HAR.

Proof. By structural induction on A. □

Proposition 14 (Conservativity with respect to the translation |.|). Let A be a closed proposition of HAR. If |A| is provable in HAN then A is provable in HAR.

Proof. By Corollary 1, we know that if |A| is provable in HAN then |A|* is provable in HAR, and as A and |A|* are equivalent in HAR (Proposition 13), we can conclude that if |A| is provable in HAN then A is provable in HAR. □
3.4 HAK, a Conservative Extension of HAN
We sort our theory with the two sorts ι and κ, as follows:
0 : ι
S : ⟨ι, ι⟩
+ : ⟨ι, ι, ι⟩
× : ⟨ι, ι, ι⟩
= : ⟨ι, ι⟩
N : ⟨ι⟩

We add a symbol ∈ : ⟨ι, κ⟩. For each proposition P of HAN with FV(P) = z, y1, ..., yn, we add a new function symbol fz,y1,...,yn,P : ⟨ι, ..., ι, κ⟩ (with n arguments of sort ι).
The elements of sort κ are classes of integers. We build these classes with a comprehension axiom scheme restricted to the propositions of HAN following an idea going back to Takeuti. Finally we modify the induction axiom. We keep the previous rewrite rules. Definition 17 (HAK ) 1. The comprehension scheme ∀x∀y1 ...∀yn (x ∈ fz,y1 ,...,yn,P (y1 , . . . , yn ) ⇔ P {z := x})
a
2. The induction scheme ∀n(N (n) ⇔ ∀k(0 ∈ k ⇒ ∀y(N (y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)) 3. The (1) (2) (3) (4) a
rewrite rules 0 = 0 −→ 0 = S(x) −→ ⊥ S(x) = 0 −→ ⊥ S(x) = S(y) −→ x = y
(5) (6) (7) (8)
0 + y −→ y S(x) + y −→ S(x + y) 0 × y −→ 0 S(x) × y −→ x × y + y
Remark: by construction of the new function symbols of the form fz,y1 ,...,yn ,P , there is no occurrence of the ∈ symbol in proposition P .
Proposition 15. Let A be a closed proposition of HAN. A is provable in HAN if and only if A is provable in HAK.

Proof
⇒ The rewrite rules are the same. We check that each axiom of HAN is provable in HAK.
⇐ We begin with an arbitrary model MN of HAN. We show that we can extend this model to a model MK of HAK without changing the denotations of the propositions of HAN. As A is a theorem of HAK, MK validates A. As the denotation of a proposition of HAN is the same in MK as in MN, MN also validates A. As A is valid in every model of HAN, we conclude that A is a theorem of HAN. Thus all the theorems of HAK that are in the language of HAN are theorems of HAN.
We need the following definition to build such a model:

Definition 18 (Definable function in HAN)
Let M be a model of HAN. A function γ from M to B is definable if there exists a proposition P in the language of HAN with FV(P) = {x, y1, ..., yn} and an assignment Φ mapping y1, ..., yn to elements b1, ..., bn of M such that γ(a) = ⟦P⟧Φ,x:=a.

Let us now show how we build a model MK from a model MN without changing the denotations of the propositions of HAN. Let MN be a model of HAN, let MN be its domain and B its Heyting algebra.

Extension from MN to MK: Let Mι = MN. Let Mκ be the set of the definable functions from Mι to B. The domain MK of MK is made of the sets Mκ and Mι. The class variables are interpreted in Mκ; the other variables are interpreted in Mι. All the symbols of HAN have the same denotation in MK as in MN. The symbols of HAK that do not appear in HAN are interpreted as follows:
– We interpret the function symbol fz,y1,...,yn,P as the function mapping b1, ..., bn (elements of Mι) to the element a ↦ ⟦P⟧x:=a,y1:=b1,...,yn:=bn of Mκ.
– We interpret the ∈ symbol by application: ⟦x ∈ E⟧ = ⟦E⟧ ⟦x⟧.

Let us prove that MK is a model of HAK by proving that the two axioms of HAK are valid in MK.

• ∀x ∀y1 ... ∀yn (x ∈ fz,y1,...,yn,P(y1, ..., yn) ⇔ P{z := x})
We need to show
⟦∀x ∀y1 ... ∀yn (x ∈ fz,y1,...,yn,P(y1, ..., yn) ⇔ P{z := x})⟧ ≥ ⊤̃,
which leads to proving that for all a, b1, ..., bn of Mι,
⟦x ∈ fz,y1,...,yn,P(y1, ..., yn) ⇔ P{z := x}⟧x:=a,y1:=b1,...,yn:=bn ≥ ⊤̃.
Let Φ be the assignment {x := a, y1 := b1, ..., yn := bn}. We must now prove
⟦x ∈ fz,y1,...,yn,P(y1, ..., yn)⟧Φ = ⟦P{z := x}⟧Φ.
Let us focus on the first part of this equality:
⟦x ∈ fz,y1,...,yn,P(y1, ..., yn)⟧Φ = ⟦fz,y1,...,yn,P(y1, ..., yn)⟧Φ ⟦x⟧Φ.
We have:
⟦fz,y1,...,yn,P(y1, ..., yn)⟧Φ = ⟦fz,y1,...,yn,P⟧Φ(⟦y1⟧Φ, ..., ⟦yn⟧Φ) = ⟦fz,y1,...,yn,P⟧Φ(b1, ..., bn).
By the interpretation we have given to function symbols, ⟦fz,y1,...,yn,P⟧Φ(b1, ..., bn) is the definable function γ of Mκ associated to P with the assignment Φ' that maps y1, ..., yn to b1, ..., bn.
Thus (⟦fz,y1,...,yn,P⟧Φ(b1, ..., bn)) ⟦x⟧Φ = γ a, and by definition of the definable functions, γ a = ⟦P⟧Φ',z:=a. As x is not free in P, we can add x := a to Φ': we get Φ, because the values assigned to y1, ..., yn by Φ and Φ' are the same. We have γ a = ⟦P⟧Φ,z:=a.

Let us now look at the second part of the equality:
⟦P{z := x}⟧Φ = ⟦P⟧Φ,z:=⟦x⟧Φ = ⟦P⟧Φ,z:=a.

We finally have, for each interpretation Φ,
⟦x ∈ fz,y1,...,yn,P(y1, ..., yn)⟧Φ = ⟦P{z := x}⟧Φ.
We can conclude
⟦∀x ∀y1 ... ∀yn (x ∈ fz,y1,...,yn,P(y1, ..., yn) ⇔ P{z := x})⟧ ≥ ⊤̃.

• We proceed in the same way to prove that ∀n (N(n) ⇔ ∀f (0 ∈ f ⇒ ∀y (N(y) ⇒ y ∈ f ⇒ S(y) ∈ f) ⇒ n ∈ f)) is valid in MK. □
4 HA−→, a Purely Computational Presentation of Heyting Arithmetic
In the previous section, all the axioms of the theory HAK were in equivalence form (i.e., of the form A ⇔ B for some propositions A and B). Following [2], we can transform an axiom in equivalence form into a rewrite rule without changing the expressiveness of the theory: the same theorems can be proved. In this section, we change the axioms of HAK into rewrite rules to obtain a purely computational theory.

Definition 19 (HA−→)
(1) x ∈ fz,y1,...,yn,P(y1, ..., yn) −→ P{z := x}
(2) N(n) −→ ∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)
(3) 0 = 0 −→ ⊤
(4) 0 = S(x) −→ ⊥
(5) S(x) = 0 −→ ⊥
(6) S(x) = S(y) −→ x = y
(7) 0 + y −→ y
(8) S(x) + y −→ S(x + y)
(9) 0 × y −→ 0
(10) S(x) × y −→ x × y + y
Proposition 16. Let A be a closed proposition of HAK. A is provable in HAK if and only if A is provable in HA−→.

Proof. To go from HAK to HA−→, we have replaced two axioms in the form of equivalences by two rewrite rules that make the two sides of these equivalences congruent. It is trivial to prove that any proposition proved in HAK using these two axioms can be proved in HA−→ using the new rewrite rules. Conversely, any proposition proved in HA−→ can be proved in HAK using the transitivity of ⇔. □
5 Properties of HA−→

5.1 HA−→ Is a Conservative Extension of HA
HA−→ is equivalent to HAK. HAK is a conservative extension of HAN. HAN is a conservative extension of HAR with respect to the translation |.|. HAR is equivalent to HA. Thus, HA−→ is a conservative extension of HA.

5.2 Decidability of the Congruence Defined by the HA−→ Rewrite System
The rewrite system of HA−→ is not terminating, due to rule (2). We change the orientation of this rule to obtain the rewrite system R. As the congruence is defined by the reflexive, symmetric and transitive closure of the rewrite rules, the congruence defined by R is the same as the congruence of HA−→. Thus, in order to prove the decidability of the congruence of HA−→, we prove the termination and confluence of R.

(1) x ∈ fz,y1,...,yn,P(y1, ..., yn) −→ P{z := x}
(2) ∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k) −→ N(n)
(3) 0 = 0 −→ ⊤
(4) 0 = S(x) −→ ⊥
(5) S(x) = 0 −→ ⊥
(6) S(x) = S(y) −→ x = y
(7) 0 + y −→ y
(8) S(x) + y −→ S(x + y)
(9) 0 × y −→ 0
(10) S(x) × y −→ x × y + y
The R rewrite system

Remarks on the R rewrite system
– This system is not a first-order rewrite system, due to the second rule, which contains binders. Notice that, as k is bound in rule (2), one cannot rewrite with rule (1) inside rule (2).
14
L. Allali
– The first rule is a rule scheme (as is also the case in HA−→ ): there are infinitely many rewrite rules following this scheme, as many as there are propositions P we can write in HAN . Example: let us take the proposition z = y. For this proposition, the function symbol f_{z,y,z=y} has been added to the language. The instance of rule (1) following the scheme for this proposition is: x ∈ f_{z,y,z=y}(y) −→ x = y. The substitution P{z := x} only appears in the rule scheme; it does not appear in any instance of it.

Proposition 17. The R rewrite system is terminating.

Proof. We establish the following well-founded order on N × N × N. The first component is the number of occurrences of the symbol ∈ appearing in a proposition. This component decreases for rule (1): by construction of the comprehension scheme, the symbol ∈ does not appear in P. The value obviously decreases for rule (2) as well. The value does not change for the other rules, in which the symbol ∈ does not appear. For the second component we define a measure function w on terms and propositions. This function is first defined on terms using the following equations:

w(x) = w(0) = 2
w(S(t)) = 2 + w(t)
w(t + u) = 1 + w(t) + w(u)
w(t × u) = 2 + (w(t) × w(u))
We can easily prove that for any term t, w(t) ≥ 2. Then we propagate this measure to propositions as follows:

w(⊤) = 0
w(⊥) = 0
w(t = u) = w(t) + w(u)
w(t ∈ k) = w(t)
w(A ∨ B) = w(A) + w(B)
w(A ∧ B) = w(A) + w(B)
w(A ⇒ B) = w(A) + w(B)
w(∀x A) = w(∃x A) = w(A)
This measure obviously decreases for rules (3), (4), (5) and (6). A few simple calculations are enough to prove that the value decreases for rules (7), (9) and (10), knowing that for any term t, w(t) ≥ 2. However, the measure does not change for rule (8). We finally introduce a last measure w′ for rule (8). This measure is defined on terms using the following equations:

w′(x) = w′(0) = 2
w′(S(t)) = 2 + w′(t)
w′(t + u) = 1 + 2 × w′(t) + w′(u)
w′(t × u) = 2 + (w′(t) × w′(u))
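As an aside, both measures can be transcribed into Coq and checked mechanically on rule instances. The following is an illustrative sketch of ours (the term grammar and names are not from the paper; variables are omitted since they measure the same as 0):

```coq
(* Illustrative sketch: a minimal term grammar of HA and the two
   termination measures w and w' from the proof of Proposition 17. *)
Inductive term : Set :=
  | Zero  : term
  | Succ  : term -> term
  | Plus  : term -> term -> term
  | Times : term -> term -> term.

Fixpoint w (t : term) : nat :=
  match t with
  | Zero      => 2
  | Succ u    => 2 + w u
  | Plus u v  => 1 + w u + w v
  | Times u v => 2 + w u * w v
  end.

Fixpoint w' (t : term) : nat :=
  match t with
  | Zero      => 2
  | Succ u    => 2 + w' u
  | Plus u v  => 1 + 2 * w' u + w' v
  | Times u v => 2 + w' u * w' v
  end.

(* Rule (8): S(x) + y -> S(x + y).
   w is unchanged: both sides measure 3 + w x + w y.
   w' strictly decreases: 5 + 2*w' x + w' y  >  3 + 2*w' x + w' y. *)
```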
The propagation to propositions is the same as for w. This measure decreases for rule (8). □

Proposition 18. The R rewrite system is confluent.
Proof. There is no critical pair in the system, so the system is locally confluent [7]. As it is terminating, we can conclude that the system is confluent. □

Proposition 19. The congruence defined by the R rewrite system is decidable.

Proof. As the rewrite system is terminating and confluent, every proposition and term of our system has a normal form. Two propositions or terms are congruent if and only if they have the same normal form. As the system has the strong normalization property, the congruence is decidable. □

5.3
Cut Elimination Property
Proposition 20. HA−→ has the cut elimination property.

Using [5], we prove that HA−→ is superconsistent: from an ordered and complete pseudo-Heyting algebra B, we build a B-model M of HA−→ such that for each interpretation Φ, if A −→ A′ is a rule defining the congruence in our theory, then ⟦A⟧Φ = ⟦A′⟧Φ.

Proof. Let B = ⟨B, ≤, ∧̃, ∨̃, ⇒̃, ⊤̃, ⊥̃, ∀̃, ∃̃⟩. We build M as follows:
– The domain of M is Mι = N and Mκ = B^N.
– The interpretation of the function symbol 0 is the zero of N; S, + and × are interpreted, as expected, as the successor function, addition and multiplication on N.
– ⊥ and ⊤ are interpreted respectively by ⊥̃ and ⊤̃.
– We interpret membership and all the function symbols of sort κ as in the previous proof of conservativity of HAK : the interpretation of ∈ is the function that, for each n and f, associates f(n). The interpretation of a symbol of sort κ is a function receiving an assignment for the n free variables of the proposition associated to f, and returning a function from N to B.
– The interpretation of equality, =̃, is defined by an infinite array which is ⊤̃ on the diagonal and ⊥̃ everywhere else; that is, a =̃ b is ⊤̃ if a = b and ⊥̃ otherwise.
– Interpretation of the predicate N : This is the most technical construction. Indeed, this predicate appears recursively in the rewrite rule: N (n) −→ ∀k (0 ∈ k ⇒ ∀y (N (y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)
Let us keep in mind that we are looking for a certain function F from N to B to interpret N, such that for all a in N:

⟦N(n)⟧_{n:=a} = ⟦∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)⟧_{n:=a}

i.e.

F = a ↦ ⟦∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)⟧_{n:=a}

For each function f from N to B, we build a model Mf where N is interpreted by f and the other symbols are interpreted as defined previously. Let Φ be the function from B^N to B^N mapping f to the function

a ↦ ⟦∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)⟧^{Mf}_{n:=a}

We are interested in a function F such that Φ(F) = F. Does such a fixed point exist? The order on B^N defined by f ≤ g if for each x, f(x) ≤ g(x) is a complete order, and the function Φ is monotone, as the occurrence of N is positive in

∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)

Thus we can apply the Knaster–Tarski theorem and deduce that there exists a fixed point F of the function Φ. Let us interpret the predicate N by this fixed point F (i.e., choosing the model MF). By construction,

⟦N(n)⟧^{MF} = ⟦∀k (0 ∈ k ⇒ ∀y (N(y) ⇒ y ∈ k ⇒ S(y) ∈ k) ⇒ n ∈ k)⟧^{MF}

MF is a B-model of HA−→ . We conclude by Definition 13 that HA−→ is superconsistent and thus, by Proposition 3, all proofs in HA−→ strongly normalize. □
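For reference, the fixed-point theorem invoked above can be stated as follows, in one standard form (here the lattice is B^N with the pointwise order):

```latex
% Knaster–Tarski, in the form used above.
\textbf{Theorem (Knaster--Tarski).}
Let $(L,\le)$ be a complete lattice and let $\Phi : L \to L$ be monotone,
i.e.\ $f \le g$ implies $\Phi(f) \le \Phi(g)$. Then $\Phi$ has a fixed
point; in particular,
\[ F \;=\; \bigwedge\,\{\, f \in L \mid \Phi(f) \le f \,\} \]
satisfies $\Phi(F) = F$.
```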
6
Discussion
One can ask whether this system is really efficient in practice: on the one hand, the proofs of x = y are shorter; on the other hand, the proof of ∀x∀y (x = y ⇒ P(x) ⇒ P(y)) is longer. There is no theoretical answer to that question; only by running tests would we see how the size of proof terms changes. A good indication is that the way we manage to “simulate” an application of the Leibniz principle with our rewrite rules (the way it is shown in [1]) is linear in the size of the proposition.
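To make the discussion concrete, the decision procedure underlying the system, on closed arithmetic terms, is simply “normalize both sides with rules (7)–(10) and compare the numerals”. A toy Coq sketch of this idea (our own illustration, not part of the paper's development):

```coq
Inductive term : Set :=
  | Zero  : term
  | Succ  : term -> term
  | Plus  : term -> term -> term
  | Times : term -> term -> term.

(* Rules (7)-(10) compute the numeral normal form of a closed term. *)
Fixpoint norm (t : term) : nat :=
  match t with
  | Zero      => 0
  | Succ u    => S (norm u)
  | Plus u v  => norm u + norm v    (* rules (7) and (8) *)
  | Times u v => norm u * norm v    (* rules (9) and (10) *)
  end.

(* Two closed terms are congruent iff they share a normal form;
   rules (3)-(6) then decide the proposition t = u. *)
Definition congruent (t u : term) : bool := Nat.eqb (norm t) (norm u).

Example one_plus_one :
  congruent (Plus (Succ Zero) (Succ Zero)) (Succ (Succ Zero)) = true.
Proof. reflexivity. Qed.
```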
7
Conclusion
We have reached a presentation of Heyting Arithmetic without any axiom, defined simply by a rewrite system. A cornerstone of this presentation is that it makes use of the decidability of equality in Heyting Arithmetic: indeed, equality is defined as a decision procedure rather than as Leibniz's proposition, which becomes a consequence of the congruence of the system.
Acknowledgments I would like to thank Gilles Dowek for all his constructive advice, Arnaud Spiwack for the help he gave me during the writing of this paper, and the anonymous referees who provided useful comments that contributed to the correctness of the paper.
References
1. Allali, L.: Mémoire de DEA, http://www.lix.polytechnique.fr/Labo/Lisa.Allali/rapport_MPRI.pdf
2. Dowek, G., Hardin, T., Kirchner, C.: Theorem proving modulo. Journal of Automated Reasoning 31, 32–72 (2003)
3. Dowek, G., Werner, B.: Proof normalization modulo. The Journal of Symbolic Logic 68(4), 1289–1316 (2003)
4. Dowek, G., Werner, B.: Arithmetic as a theory modulo. In: Giesl, J. (ed.) RTA 2005. LNCS, vol. 3467, pp. 423–437. Springer, Heidelberg (2005)
5. Dowek, G.: Truth values algebras and normalization. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502. Springer, Heidelberg (2007)
6. Schwichtenberg, H.: Proofs as programs. In: Proof Theory: A Selection of Papers from the Leeds Proof Theory Programme 1990. Cambridge University Press, Cambridge (1992)
7. van Oostrom, V., van Raamsdonk, F.: Weak Orthogonality Implies Confluence: The Higher-Order Case. Technical Report ISRL945 (December 1994)
8. Poincaré, H.: La Science et l'Hypothèse, 1902. Flammarion (1968)
9. Dowek, G.: La part du calcul. Mémoire d'Habilitation à Diriger des Recherches, Université Paris 7 (1999)
10. The Coq Development Team: Manuel de Référence de Coq V8.0. LogiCal Project (2004–2006), http://coq.inria.fr/doc/main.html
CoqJVM: An Executable Speciﬁcation of the Java Virtual Machine Using Dependent Types Robert Atkey LFCS, School of Informatics, University of Edinburgh Mayﬁeld Rd, Edinburgh EH9 3JZ, UK
[email protected]

Abstract. We describe an executable specification of the Java Virtual Machine (JVM) within the Coq proof assistant. The principal features of the development are that it is executable, meaning that it can be tested against a real JVM to gain confidence in the correctness of the specification; and that it has been written with heavy use of dependent types, both to structure the model in a useful way and to constrain the model to prevent spurious partiality. We describe the structure of the formalisation and the way in which we have used dependent types.
1
Introduction
Large-scale formalisations of programming languages and systems in mechanised theorem provers have recently become popular [4,5,6,9]. In this paper, we describe a formalisation of the Java Virtual Machine (JVM) [8] in the Coq proof assistant [11]. The principal features of this formalisation are that it is executable, meaning that a purely functional JVM can be extracted from the Coq development and – with some O'Caml glue code – executed on real Java bytecode output from the Java compiler; and that it is structured using dependent types. The motivation for this development is to act as a basis for certified consumer-side Proof-Carrying Code (PCC) [12]. We aim to prove the soundness of program logics and the correctness of proof checkers against the model, and to extract the proof checkers to produce certified standalone tools. For this application, the model should faithfully model a realistic JVM. For the intended application of PCC, this is essential in order to minimise and understand the unavoidable semantic gap between the model and reality. PCC is intended as a secure defence against hostile code; the semantic gap is the point that could potentially be exploited by an attacker. To establish confidence in our model and to test it, we have designed it to be executable so that it can be checked against a real JVM. Further, we have structured the design of the model using Coq's module system, keeping the component parts abstract with respect to proofs about the model. This is intended to broaden the applicability of proofs performed against the model and to prevent “overfitting” to the specific implementation. In order to structure the model we have made heavy use of Coq's feature of dependent types to state and maintain invariants about the internal data
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 18–32, 2008.
© Springer-Verlag Berlin Heidelberg 2008
CoqJVM: An Executable Speciﬁcation of the JVM Using Dependent Types
19
structures. We have used dependent types as a local structuring mechanism to state the properties of functions that operate on data structures and to pass information between them. In some cases this is forced upon us in order to prove to Coq that our recursive definitions for class loading and searching the class hierarchy are always terminating, but it also allows us to tightly constrain the behaviour of the model, reducing the spurious partiality that would arise in a more loosely typed implementation.

To demonstrate the removal of spurious partiality, we consider the implementation of the invokevirtual instruction. To execute this instruction, we must resolve the method reference within the instruction; find the object in the heap; search for the method implementation starting from the object's class in the class pool; and then, if found, invoke this method. In a naive executable implementation, we would have to deal with the potential failure of some of these operations, for example, finding the class of a given object in the heap. We know from the way the JVM is constructed that this lookup can never fail, but we would still have to do something in the implementation, even if it just returns an error. Further, every proof that is conducted against this implementation must prove over again that this case cannot happen. We remove this spurious partiality from the model by making use of dependent types to maintain invariants about the state of the JVM. These invariants are then available to all proofs concerning the model. Our belief is that this will make large-scale proofs using the model easier to perform, and we have some initial evidence that this is the case, but detailed research of this claim is still required. There are still cases where the model should return an error. We have attempted to restrict these to when the code being executed is not type safe.
A basic design decision is to construct the model so that if the JVM's bytecode verifier would accept the code, then the model does not go wrong.

Overview. In the next section we give an overview of our formalisation, detailing the high-level approach and the module design that we have adopted. We describe the modelling of the class pool and its operations in Section 3. In Section 4 we describe the formalisation of object heaps and static fields, again using dependent types. In Section 5 we describe our modelling of the execution of single instructions. The extraction to O'Caml is discussed in Section 6. We discuss related work in Section 7 and conclude with notes on future work in Section 8.

The Formalisation. For reasons of space, this paper can only offer a high-level overview of the main points of the formalisation. For more information the reader is referred to the formalisation itself, which is downloadable from http://homepages.inf.ed.ac.uk/ratkey/coqjvm/.
2
HighLevel Structure of the Formalisation
The large-scale structure of the formalisation is organised using Coq's module facilities. We use this for two reasons: to abstract the model over the implementation of several basic types such as 32-bit integers and primitive class, method
20
R. Atkey
and field names; and also to provide an extra-logical reason to believe that we are modelling a platonic JVM, rather than fitting the model to our implementation. The interface for the basic types is contained within the signature BASICS. We assume a type Int32.t with some arithmetic operations. This is instantiated with O'Caml's int32 type after extraction. We also require types to represent the class, field and method names. Since these types are used as the domains of finite maps throughout the formalisation, we stipulate that they must also have an ordering suitable for use with Coq's implementation of balanced binary trees. We keep the constituent parts of the formalisation abstract from each other by use of the module system. This has the advantage of reducing the complexity of each part and keeping the development manageable. It is also an attempt to keep proofs about the model from “overfitting” to the concrete implementation used. For example, we only expose an abstract datatype for the set of loaded classes and some axiomatised operations on it. The intention is that any implementation of class pools will conform to this specification, and so proofs against it will have wider applicability than just the implementation we have coded¹. Thirdly, the use of modules makes the extracted code safer to use from O'Caml. Many of the datatypes we use in the formalisation have been refined by invariants expressed using dependent types. Since O'Caml does not have dependent types, the invariants are thrown away during the extraction process. By using type abstraction we can be sure that we are maintaining the invariants correctly.
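A signature of roughly this shape could be written as follows. This is an illustrative sketch, not the formalisation's actual code; the member names are our own assumptions:

```coq
(* Sketch of a BASICS-style signature: abstract 32-bit integers plus
   ordered name types usable as finite-map keys.  Names illustrative. *)
Require Import OrderedType.

Module Type BASICS_SKETCH.
  (* instantiated with O'Caml's int32 type after extraction *)
  Parameter int32 : Set.
  Parameter int32_zero : int32.
  Parameter int32_add : int32 -> int32 -> int32.

  (* class, field and method names: ordered, so that they can serve as
     the domains of finite maps built on balanced binary trees *)
  Declare Module ClassName  : OrderedType.
  Declare Module MethodName : OrderedType.
  Declare Module FieldName  : OrderedType.
End BASICS_SKETCH.
```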
The main module is Execution, which has the following functor signature:

Module Execution
  (B : BASICS)
  (C : CLASSDATATYPES with Module B := B)
  (CP : CLASSPOOL with Module B := B with Module C := C)
  (RDT : CERTRUNTIMETYPES with Module B := B with Module C := C
                           with Module CP := CP)

The module types mentioned in the signature correspond to the major components of our formalisation. The signature BASICS we have already mentioned. CLASSDATATYPES contains the basic datatypes for classes, methods, instructions and the like. CLASSPOOL is the interface to the set of loaded classes and the dynamic loading facilities, described in Section 3. CERTRUNTIMETYPES is the interface to the object heap and the static fields, described in Section 4. Note the use of sharing constraints in the functor signature. These are required since the components that fulfil each signature are themselves functors. We need sharing constraints to state that all their functor arguments must be equal. The heavy use of sharing constraints exposed two bugs in Coq's implementation, one to do with type checking code using sharing and one in extraction.
¹ This technique is also used in Bicolano: http://mobius.inria.fr/bicolano
3
Class Pools and Dynamic Loading
Throughout execution the JVM maintains a collection of classes that have been loaded into memory: the class pool. In this section we describe how we have modelled the class pool; how new classes are loaded; how functions that search the class pool are written; and how the class pool is used by the rest of the formalisation.

3.1
The Class Pool
Essentially, the class pool is nothing but a finite map from fully qualified class names to data structures containing information such as the class's superclass name and its methods. We store both proper classes and interfaces in the same structures, and we differentiate between them by a boolean field classInterface, which is true when the structure represents an interface, and false otherwise. The map from class names to class structures directly mimics the data structure that would be used in a real implementation of a JVM. In order to construct an executable model within Coq, though, the basic data structure is not enough; we have to refine the basic underlying data structure with some invariants. The motivation for adding invariants to the class pool data structure was originally to enable the writing of functions that search over the class hierarchy. Each class data structure is a record type, with a field classSuperClass : option className indicating the name of that class's superclass, if any. Searches over the class hierarchy operate by following the superclass links between classes. In a real JVM implementation it is known that, due to the invariants maintained by the class loading procedures, if a class points to a superclass, then that superclass will exist, and that java.lang.Object is the top of the hierarchy. Therefore, every search upwards through the inheritance tree will always terminate. When writing these functions in Coq we must convince Coq that the function actually does terminate, i.e. we must have a proof that the class hierarchy is well founded so that we can write functions that recurse on this fact. To this end, we package the basic data structure for the class pool, a finite map from class names to class structures, with two invariants:

classpool : Classpool.t
classpoolInvariant : ∀nm c. lookup classpool nm = Some c →
                       classInvariant classpool nm c
classpoolObject : ∃c. lookup classpool java.lang.Object = Some c ∧
                       classSuperClass c = None ∧
                       classInterfaces c = [ ] ∧
                       classInterface c = false

We call the type of these records certClasspool. The type Classpool.t represents the underlying finite map. The function lookup looks up a class name in a given class pool, returning an option class result. There are two invariants that we maintain on class pools; the first covers every class in classpool, and we describe this
below. The second states that there is an entry for java.lang.Object and that it has no superclass and no superinterfaces, and is a proper class. The invariants that every class must satisfy are given by the predicate:

classInvariant : Classpool.t → className → class → Prop
classInvariant classpool nm c ≡
  className c = nm ∧
  (classSuperClass c = None → nm = java.lang.Object) ∧
  classInvariantAux classpool c

which states that each class structure must be recorded under a matching name; that only java.lang.Object has no superclass; and that further invariants hold: classInvariantAux classpool c. This varies depending on whether c is a proper class or an interface. In the case of a proper class, we require that two properties hold: that all the class's superclasses are present in classpool, and likewise for all its superinterfaces. A proof that all a class's superclasses are present is recorded as a term of the following inductive predicate:

goodSuperClass classpool : option className → Prop
  gscTop : goodSuperClass classpool None
  gscStep : ∀cSuper nmSuper.
    lookup classpool nmSuper = Some cSuper →
    classInterface cSuper = false →
    goodSuperClass classpool (classSuperClass cSuper) →
    goodSuperClass classpool (Some nmSuper)

For a class record c, knowing goodSuperClass classpool (classSuperClass c) means that we know that all the superclasses of c, if there are any, are contained within classpool, finishing with a class that has no superclass. We also know that all the class structures in this chain are proper classes. By the other invariants of class pools, we know that the top class must be java.lang.Object. There is also a similar predicate goodInterfaceList classes interfaceList that states that the tree of interfaces starting from the given list is well founded.
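In concrete Coq syntax, the packaging described in this subsection might look roughly as follows. This is an illustrative sketch of ours, with the surrounding definitions stubbed out as parameters; it is not the formalisation's actual code:

```coq
(* Stubs standing in for the definitions described in the text. *)
Parameter className class : Set.
Parameter jlObject : className.   (* the name java.lang.Object *)
Module Classpool. Parameter t : Set. End Classpool.
Parameter lookup : Classpool.t -> className -> option class.
Parameter classInvariant : Classpool.t -> className -> class -> Prop.
Parameter classSuperClass : class -> option className.
Parameter classInterfaces : class -> list className.
Parameter classInterface : class -> bool.

(* The certified class pool: the raw map packaged with its invariants. *)
Record certClasspool : Type := mkCertClasspool
  { classpool : Classpool.t
  ; classpoolInvariant : forall nm c,
      lookup classpool nm = Some c -> classInvariant classpool nm c
  ; classpoolObject : exists c,
      lookup classpool jlObject = Some c /\
      classSuperClass c = None /\
      classInterfaces c = nil /\
      classInterface c = false
  }.
```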
For classInvariantAux classpool c, when c is an interface we again require that all the superinterfaces are present, but we also insist that the superclass must be java.lang.Object, to match the requirements of the JVM specification. These predicates may be used to write functions that search the class hierarchy and are accepted by Coq, by the technique of recursion on an ad-hoc predicate [3]. Unfortunately, they are not suitable for proving properties of these functions. We describe in Section 3.3 how we use an equivalent formulation to prove properties of these functions.

3.2
Dynamic Class Loading
New classes are loaded into the JVM as a result of the resolution of references to entities such as classes, methods and ﬁelds. For example, if the code for a method contains an invokevirtual instruction that calls a method int C.m(int), then
the class name C must be resolved to a real class, and then the method int C.m(int) must be resolved to an actual method in the class resolved by C, or one of its superclasses. The process of resolving the reference C may involve searching for an implementation on disk and loading it into the class pool. The JVM specification distinguishes between the loading and resolution of classes. It also takes care to maintain information on the different class loaders used to load classes, in order to prevent spoofing attacks on the JVM's type safety [7]. In our formalisation we only consider a single class loader, the bootstrap class loader. With this simplification, we can unfold the resolution and loading procedures described in the JVM specification to the following steps:

1. If the class C exists in the class pool as c, then return c.
2. Otherwise, search for an implementation of C. If not found, return an error. If found, check it for well-formedness and call it pc (a preclass).
3. Recursively resolve the reference to the superclass, returning sc.
4. Check that sc is a proper class, and that C is not the name of sc or any of its superclasses.
5. Recursively resolve the references to superinterfaces.
6. Convert pc to a class c and add it to the class pool. Return c.

The well-formedness check in step 2 checks that the class we have found has a superclass reference and that its name matches the one we are looking for. In the case of interfaces, it checks that the superclass is always java.lang.Object. Since the formalisation only deals with a structured version of the .class data loaded from disk, we do not formalise any of the low-level binary well-formedness checks prescribed by the JVM specification. We separate class implementations (preclasses pc) from loaded classes c because preclasses contain information that is discarded by the loading procedure, such as the proof representations required for PCC.
If we start from a class pool that contains nothing but an implementation of java.lang.Object satisfying the properties in the previous section, and add classes by the above procedure then it is evident that we maintain the invariants. We now describe how we have formalised this procedure as a Coq function. The ﬁrst decision to be made is how to represent the implementations of classes “on disk”. Following Liu and Moore’s ACL2 model [9], we do this by modelling them as a ﬁnite map from class names to preclasses. The intention is that this represents a ﬁle system mapping pathnames to ﬁles containing implementations. We use the O’Caml glue code described in Section 6 to load the actual ﬁles from disk before any execution, parse them and put them in this structure. With this, the implementation of the procedure above can look into this collection of preclasses in order to ﬁnd class implementations and check them for wellformedness. However, a problem arises due to the recursive nature of the loading and resolution procedure. We must be able to prove that the resolution of a single class reference will always terminate; we must not spend forever trying to resolve an inﬁnite chain of superclasses or interfaces. To solve this problem we deﬁne a predicate wfRemovals preclasses that states that preclasses may
be reduced to empty by removing elements one by one. With some work, one can prove ∀preclasses. wfRemovals preclasses. Using this, we define a function loadAndResolveAux, which has the type:

loadAndResolveAux
  (target : className)
  (preclasses : Preclasspool.t)
  (PI : wfRemovals preclasses)
  (classes : certClasspool)
  : {LOAD classes, preclasses ⇒ classes′ & c : class | lookup classes′ target = Some c}

This function takes the target class name to resolve, a preclasses map to search for implementations, a PI argument stating that preclasses can be reduced to empty by removing elements, and the current class pool classes. The return type is an instance of the following type with two constructors:

loadType (A : Set) (P : certClasspool → A → Prop)
         (classes : certClasspool) (preclasses : Preclasspool.t) : Set
  loadOk : ∀classes′ a.
    preserveOldClasses classes classes′ →
    onlyAddFromPreclasses classes classes′ preclasses →
    P classes′ a →
    loadType A P classes preclasses
  loadFail : ∀classes′.
    preserveOldClasses classes classes′ →
    onlyAddFromPreclasses classes classes′ preclasses →
    exn →
    loadType A P classes preclasses

and {LOAD classes, preclasses ⇒ classes′ & a : A | Q} is notation for loadType A (λclasses′ a. Q) classes preclasses. Hence, in the type of loadAndResolveAux, there are two possibilities: either a class structure is returned, along with a proof that this class is in the new class pool classes′; or a new class pool classes′ is returned, along with an error of type exn. These errors represent exceptions like java.lang.ClassFormatError that are turned into real Java exceptions by the code executing individual instructions. The two common parts of the constructors, the predicates preserveOldClasses and onlyAddFromPreclasses, relate the new class pool classes′ to the old class pool classes and to preclasses. The predicate preserveOldClasses classes classes′ states that any classes that were in classes must also be in classes′.
The predicate onlyAddFromPreclasses classes classes′ preclasses states that any new classes in classes′ that are not in classes must have been loaded from preclasses. These two properties are used to establish properties of the class pool as it evolves during the execution of the virtual machine. In particular, we use them to show that the invariants of previously loaded classes are not violated by loading new classes, and to allow known facts about preclasses to be inherited by the class pool. This is intended to be used in consumer-side PCC for pre-checking the proofs in a collection of class implementations before execution begins. We do not have space to go into the implementation of loadAndResolveAux. We have written the function in a heavily dependently typed style, making use of a “proof-passing style”. We describe this style in the next section.
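Rendered as an actual Coq inductive definition, loadType might look like this. It is a sketch of ours with the referenced types stubbed out as parameters, not the development's exact code:

```coq
(* Stubs for the types and predicates referred to in the text. *)
Parameter certClasspool : Set.
Module Preclasspool. Parameter t : Set. End Preclasspool.
Parameter exn : Set.
Parameter preserveOldClasses : certClasspool -> certClasspool -> Prop.
Parameter onlyAddFromPreclasses :
  certClasspool -> certClasspool -> Preclasspool.t -> Prop.

(* Either a result a satisfying P, or a Java-level error; in both cases
   the evolved class pool classes' conservatively extends the old one. *)
Inductive loadType (A : Set) (P : certClasspool -> A -> Prop)
    (classes : certClasspool) (preclasses : Preclasspool.t) : Set :=
  | loadOk : forall (classes' : certClasspool) (a : A),
      preserveOldClasses classes classes' ->
      onlyAddFromPreclasses classes classes' preclasses ->
      P classes' a ->
      loadType A P classes preclasses
  | loadFail : forall (classes' : certClasspool),
      preserveOldClasses classes classes' ->
      onlyAddFromPreclasses classes classes' preclasses ->
      exn ->
      loadType A P classes preclasses.
```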
3.3
Writing and Proving Functions That Search the Class Pool
Given a representation of class pools and functions that add classes to it, we also need functions that query the class pool. To execute JVM instructions, we need ways to determine when one class inherits from another, to search for virtual methods, and to search for fields and methods during resolution. The basic structure of these operations is to start at some point in the class hierarchy and then follow the superclass and superinterface links upwards until we find what we are looking for. We introduced goodSuperClass as a way to show that the hierarchy is well founded. While this definition is suitable for defining recursive functions over the superclass hierarchy, it is not suitable for proving properties of such functions. In the induction principle on goodSuperClass generated for a property P by Coq's Scheme keyword, the inductive step has P (classSuperClass cSuper) g as a hypothesis, where g has type goodSuperClass classes (classSuperClass cSuper). The variable cSuper denotes the implementation of the superclass of the current class whose existence goodSuperClass guarantees. However, during the execution of a recursive function over the class hierarchy we will have looked up the same class, but under a different Coq name cSuper′. We know from the other invariants maintained within certClasspool that cSuper and cSuper′ are equal, because they are stored under the same class name, but our attempts at rewriting the hypotheses with this fact were defeated by type equality issues. Although it may be possible to find a way to rewrite the hypotheses that allows us to apply the induction hypothesis, we took an easier route and defined an equivalent predicate, superClassChain:

superClassChain : certClasspool → option className → Prop :=
  sccTop : ∀classes. superClassChain classes None
  sccStep : ∀classes cSuper nmSuper.
    lookup classes nmSuper = Some cSuper →
    classInterface cSuper = false →
    (∀cSuper′. lookup classes nmSuper = Some cSuper′ →
       superClassChain classes (classSuperClass cSuper′)) →
    superClassChain classes (Some nmSuper)

The difference here is that the step to the next class in the chain is abstracted over the implementation of that class, removing the problem described above. We retain the original goodSuperClass predicate because it is easier to prove while adding classes to the class pool. These two are easily proved equivalent. We can now define functions that are structurally recursive on the superclass hierarchy by recursing on the structure of the superClassChain predicate. To define such functions we must prove two inversion lemmas. These have the types:

inv1 : ∀classes nm c optNm.
  optNm = Some nm →
  superClassChain classes optNm →
  ¬(lookup (classpool classes) nm = None)
fix search (classes : certClasspool)
           (supernameOpt : option className)
           (scc : superClassChain classes supernameOpt) :=
  match optionInformative supernameOpt with
  | inleft (exist supername snmEq) ⇒
      let notNotThere := inv1 snmEq scc in
      match lookupInformative classes supername with
      | inleft (exist superC superCExists) ⇒
          let scc′ := inv2 superCExists snmEq scc in
          (* examine superC here, use
             search classes (classSuperClass superC) scc′ for
             recursive calls *)
      | inright notThere ⇒ match notNotThere notThere with end
      end
  | inright _ ⇒ (* search failed code here *)
  end

Fig. 1. Skeleton search function
inv2 : ∀classes nm c optNm.
  optNm = Some nm →
  superClassChain classes optNm →
  lookup (classpool classes) nm = Some c →
  superClassChain classes (classSuperClass c)

A skeleton search function is shown in Figure 1. In addition to inv1 and inv2, we use the functions optionInformative : ∀A o. {a : A | o = Some a} + {o = None} and lookupInformative : ∀classes nm. {c | lookup classes nm = Some c} + {lookup classes nm = None}. The search function operates by first determining whether there is a superclass to be looked up. If so, optionInformative returns a proof object for supernameOpt = Some supername. This is passed to inv1 to obtain a proof that supername does not not exist in classes. The function then must look up supername for itself: since Coq does not allow the construction of members of Set by the examination of members of Prop, a proof can only tell a function that its work will not be fruitless, not do its work for it. To dispose of the impossible case when the superclass is discovered not to exist, we combine notNotThere and notThere to get a proof of False, which is eliminated with an empty match (recall that ¬A is represented as A → False in Coq). Otherwise, we use inv2 to establish superClassChain for the rest of the hierarchy, and proceed upwards.

3.4
Interface to the Rest of the Formalisation
The implementation of the type certClasspool is kept hidden from the rest of the formalisation. To state facts about a class pool, external clients must make do with a predicate classLoaded : certClasspool → className → class → Prop. To know classLoaded classes nm c is to know that c has been loaded under the name nm in classes. The resolution procedures for classes, methods and ﬁelds are
CoqJVM: An Executable Speciﬁcation of the JVM Using Dependent Types
also exposed, using the loadType type described above. Each of these functions returns witnesses attesting to the fact that, when they return a class, method or ﬁeld, that entity exists and matches the speciﬁcation requested. This module also provides two other services: assignability (or subtype) checking, and virtual method lookup. These both work by scanning the class pool using the technique described in the previous subsection.
4
The Object Heap and Static Fields
The two other major data structures maintained by the JVM are the object heap and the static ﬁelds. In this section we describe their concrete implementations and the dependently typed interface they expose to the rest of the formalisation. 4.1
Object Heaps
As with the class pool, the object heap is essentially nothing but a finite map, this time from object references to structures representing objects. Object references are represented as natural numbers using Coq's positive type. As above, we apply extra invariants to this basic data structure to constrain it to conform more closely to object heaps that actually arise during JVM execution. We build object heaps in two stages. First, we take a standard finite map data structure and repackage it as a heap. Heaps are an abstract datatype with the following operations, abstracted over a type obj of entities stored in the heap. We have operations lookup : heap → addr → option obj to look up items in the heap; update : heap → addr → obj → option heap to update the heap, but only at existing entries; and new : heap → obj → heap × addr to create new entries. Given a heap datatype with these operations and the obvious axioms, we build object heaps tailored to the needs of the JVM. The first thing to fix is the representation of objects themselves. We regard objects as pairs of a class name and a finite map from fields to values. As with class pools, we require several invariants to hold about the representation of each object. First, each of the class names mentioned in an object heap must actually exist in some class pool; thus, the type of an object heap depends on some class pool: certHeap classes. Second, we require that each of the fields in each object is well-typed. We are helped here by the fact that JVM field descriptors contain their type. The type of the operation that looks up a field in an object is heapLookupField classes (heap : certHeap classes) (a : addr) (fldCls : className) (fldNm : fieldName) (fldTy : javaType) : {v | objectFieldValue classes heap a fldCls fldNm fldTy v} + {¬objectExists classes heap a}. This operation looks up an object at address a, and a field within that object.
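The heap abstract datatype described above might be sketched in OCaml (the extraction target) as follows. The module names, the `new_` spelling, and the association-list implementation are illustrative assumptions, not CoqJVM's actual extracted interface; after extraction the proof arguments vanish, so only the computational types of lookup, update and new remain.

```ocaml
(* A sketch of the heap ADT from the text.  Names are assumptions. *)
module type HEAP = sig
  type obj                       (* entities stored in the heap *)
  type addr = int                (* object references *)
  type t                         (* the heap itself *)
  val empty  : t
  val lookup : t -> addr -> obj option
  val update : t -> addr -> obj -> t option  (* fails on absent entries *)
  val new_   : t -> obj -> t * addr          (* allocate a fresh entry *)
end

(* A minimal implementation over association lists, for illustration only. *)
module ListHeap (O : sig type obj end) : HEAP with type obj = O.obj = struct
  type obj = O.obj
  type addr = int
  type t = { next : addr; cells : (addr * obj) list }
  let empty = { next = 0; cells = [] }
  let lookup h a = List.assoc_opt a h.cells
  let update h a o =
    if List.mem_assoc a h.cells
    then Some { h with cells = (a, o) :: List.remove_assoc a h.cells }
    else None
  let new_ h o = ({ next = h.next + 1; cells = (h.next, o) :: h.cells }, h.next)
end
```

The point of the `option` in update's result is exactly the invariant stated in the text: updates succeed only at existing entries, while new is the only way to extend the domain.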
If the object exists then either the actual value of that ﬁeld or a default value for the ﬁeld (based on ﬂdTy) is returned, along with a proof that this is the value of that ﬁeld in the object at a in heap. Otherwise, a proof that the object does not
exist is returned. The predicates objectFieldValue and objectExists record facts about the heap that can be reasoned about by external clients, in a similar way to the classLoaded predicate for class pools. A possible invariant that we do not maintain is that each field present in an object is actually present in that object's class and vice versa. We choose not to maintain this invariant because it simplifies development at this stage of the construction of the model. As described above, and as we will elaborate in Section 5, we use dependent types to reduce spurious partiality in the model and to make the model more usable for proving properties. At the moment, it is useful to know that a field's value is well-typed; when proving a property of the model that relies on type safety we do not have to keep around an additional invariant stating that all fields are well-typed. Also, it is useful to know that an object's class exists so that the implementation of the invokevirtual instruction can use this information to find the class implementation and search for methods. If the class does not exist there is no sensible action to take other than to just go wrong, which introduces spurious partiality into the model. However, there is an obvious action to take if a field does not exist – return a default value. The interface that object heaps present to the rest of the formalisation is constructed in the same style as that for class pools. We present an abstract type certHeap classes, along with operations such as heapLookupField above. Operations heapUpdateField and heapNew update fields and create new objects respectively. All these operations are dependently typed so that they can be used in a proof-passing style within the implementations of the bytecode instructions. Since the type of object heaps depends on the current class pool to state its invariants, we have to update the invariants' proofs when new classes are added to the class pool.
We use the preserveOldClasses predicate from Section 3.2: preserveCertHeap : ∀classesA classesB. certHeap classesA → preserveOldClasses classesA classesB → certHeap classesB Since every operation that alters the class pool produces a proof object of type preserveOldClasses, this can be passed into the above function to produce a matching object heap. 4.2
Static Fields
The static fields are modelled in exactly the same way as the fields of a single object in the heap. The rest of the model is presented with a dependently typed interface that maintains the invariant that each field's value is well-typed according to the field's type. The type of well-typed static field stores is fieldStore classes heap.
5
Modelling Execution
All of the modules described above are arguments to the Execution functor, whose signature was given in Section 2. We now describe the implementation of this module.
5.1
The Virtual Machine State
The state of the virtual machine is modelled as a record with the fields

    stateFrameStack   : list frame
    stateClasses      : certClasspool
    stateObjectHeap   : certHeap stateClasses
    stateStaticFields : fieldStore stateClasses stateObjectHeap.

States contain the three major data structures for the class pool, object heap and static fields that we have covered above. The additional field records the current frame stack of the virtual machine. Individual stack frames have the fields:

    frameOpStack : list rtVal
    frameLVars   : list (option rtVal)
    framePC      : nat
    frameCode    : code
    frameClass   : class
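After extraction the two records above become ordinary OCaml records, since the dependent indices (certHeap stateClasses and so on) are erased. The sketch below is illustrative, not the extracted CoqJVM code: the field names follow the text, but the constructors of rt_val and the placeholder code and cls types are assumptions.

```ocaml
(* Illustrative OCaml rendering of the frame record; placeholders assumed. *)
type rt_val = Int of int32 | Long of int64 | Double of float | Ref of int | Null

type code = unit   (* placeholder: instruction list + exception handler tables *)
type cls  = unit   (* placeholder: the class structure *)

type frame = {
  frame_op_stack : rt_val list;
  frame_lvars    : rt_val option list;
  frame_pc       : int;
  frame_code     : code;
  frame_class    : cls;
}

(* A long occupies two local-variable slots; the second half is None,
   matching the convention described in the text for 64-bit values. *)
let store_long (v : int64) (lvars : rt_val option list) : rt_val option list =
  Some (Long v) :: None :: lvars
```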
The type rtVal is used to represent runtime values manipulated by the virtual machine, such as integers and references. There are entries for the current operand stack and the local variables. The use of option types in the local variables is due to the presence of values that occupy multiple words of memory. Values of types such as int occupy only a single 32-bit word of memory on the real hardware, but long and double values occupy 64 bits. When stored in the local variables, the second half of a 64-bit value is represented using None. The rest of the fields in frame are as follows: the framePC field records the current program counter; frameCode records the code being executed, which consists of a list of instructions and the exception handler tables; frameClass is the class structure for the code being executed, which is used to look up items in the class's constant pool. 5.2
Instruction Execution
The main function of the formalisation is exec : state → execResult. This executes a single bytecode instruction within the machine. If an exception is thrown then its catching and handling, or the resulting termination of the machine, are all dealt with before exec returns. The type execResult sets out the possible results of a single step of execution:

    execResult : Set
    cont    : state → execResult
    stop    : state → option rtVal → execResult
    stopExn : state → addr → execResult
    wrong   : execResult.

An execution step can either request continuation to the next step; stop, possibly with a value; stop with a reference to an uncaught exception; or go wrong. The basic operation of exec is simple. The current instruction is found by unpacking the current stack frame from the state and looking up the instruction at framePC. Each instruction is implemented by a different function. The non-object-oriented
implementations are relatively straightforward; the object-oriented instructions are more complex. They interact with the class pool, object heap and static field data structures described in earlier sections. The dependently typed interfaces are used to ensure that we maintain the invariants of each data structure, and that we only go wrong when the executed code is not type safe.
6
Extraction to O’Caml
The Coq development consists of roughly 5275 lines of speciﬁcation and 2288 lines of proof, as measured by the coqwc tool. The proof component primarily comprises glue lemmas to allow the coding of proofpassing dependently typed functions. After extraction this becomes 16454 lines of O’Caml code, with a .mli ﬁle of 17132 lines. The expansion can be explained partially by the inclusion of some elements of Coq’s standard library, but mainly by the repetition of module interfaces several times. This appears to be due to an internal limitation of Coq in the way it represents module signatures with sharing constraints. To turn this extracted code into a simulator for the JVM we have written around 700 lines of O’Caml glue code. The bulk of this is code to translate from the representation of .class ﬁles used by the library that loads and parses them to the representation required by the extracted Coq code. The action of the O’Caml code is simply to generate a suitable preclasses by scanning the classpath for .class ﬁles, construct an initial state for a nominated static method, and then iterate the exec function until the machine halts. The JVM so produced is capable of running bytecode produced by the javac compiler, albeit very slowly. We have not yet implemented arrays or strings, so the range of examples is limited, but we have used it to test the dynamic loading and virtual method lookup and invocation, discovering several bugs in the model.
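The glue code's main loop, "iterate the exec function until the machine halts", can be sketched as follows. The variant mirrors the execResult type of Section 5.2, but the OCaml names, the toy exec, and the driver run are illustrative assumptions about the extracted interface, not its actual signature.

```ocaml
(* Sketch of the O'Caml driver loop; names and toy exec are assumptions. *)
type state = int  (* stands in for the extracted state type *)
type rt_val = VInt of int
type addr = int

type exec_result =
  | Cont of state                      (* continue to the next step *)
  | Stop of state * rt_val option      (* halt, possibly with a value *)
  | StopExn of state * addr            (* halt with an uncaught exception *)
  | Wrong                              (* the executed code was not type safe *)

(* A toy exec: count down to zero, then stop with the final value. *)
let exec (s : state) : exec_result =
  if s > 0 then Cont (s - 1) else Stop (s, Some (VInt s))

(* The driver: iterate exec until something other than Cont is returned. *)
let rec run (s : state) : exec_result =
  match exec s with
  | Cont s' -> run s'
  | result  -> result
```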
7
Related Work
We know of two other large-scale executable specifications of the JVM constructed within theorem provers. Liu and Moore [9] describe an executable JVM model, called M6, implemented using ACL2's purely functional subset of Common Lisp. M6 is particularly complete: in common with the work described here it simulates dynamic class loading and interfaces; it also goes beyond our work in simulating class initialisation and instruction-level interleaving concurrency (though see comments on concurrency in the next section). Liu and Moore describe two applications of their model. They use the model to prove the invariant that if a class is loaded then all its superclasses and interfaces must also be loaded. Unlike in our model, where this invariant is built in, this proof is an optional extra for M6. Our motivation for maintaining this invariant is to prove that searches through the class hierarchy terminate. In M6 this is proved by first establishing that there will always be a finite number of classes loaded, and so there will be a finite number of superclasses to any class; then recursion
proceeds on this finite list. Note that, unlike our method, this does not guarantee that, at the point of searching through the hierarchy, all the classes will exist. Liu and Moore also directly prove some correctness properties of concurrent Java programs. Another executable JVM specification is that of Barthe et al. [2]. This is an executable specification of JavaCard within Coq. With respect to our model, they do not need to implement dynamic class loading since all references are resolved before runtime in the JavaCard environment. They also prove the soundness of a bytecode verifier with respect to their model. Klein and Nipkow [4] have formalised a Java-like language, Jinja, complete with virtual machine, compiler and bytecode verifier, in Isabelle/HOL. They have proved that the compilation from the high-level language to bytecode is semantics preserving. The language they consider is not exactly Java, and their model simplifies some aspects, such as the lack of dynamic loading and interfaces. The formalisation is executable via extraction to SML. Another large-scale, and very complete, formalisation is that of Stärk et al. [13] in terms of abstract state machines. This formalisation is executable, but proofs about the model have not been mechanised. The Mobius project contains a formalisation of the JVM called Bicolano (http://mobius.inria.fr/bicolano). This formalisation uses Coq's module system to abstract away from the representation of the machine's data structures in a similar way to ours, and has been used to prove the soundness of the Mobius Base Logic. It does not model dynamic class loading. Other large-scale formalisations of programming languages include Leroy's formalisation of a subset of C and PowerPC machine code for the purposes of a certified compiler [6]. This is a very large example of a useful program being extracted from a formal development. Another large mechanisation effort is Lee et al.'s Twelf formalisation of an intermediate language for Standard ML [5].

8

Conclusions

We have presented a formal, executable model of a subset of the Java Virtual Machine structured using a combination of dependent types and Coq's module system. We believe that this use of dependent types as a structuring mechanism is the first application of such a strategy to a large program. The model is incomplete at the time of writing. In the immediate future we intend to add arrays and strings to the model in order to extend the range of real Java programs that may be executed. There are several extensions that require further research. Modelling the I/O behaviour of JVM programs would be a useful feature. We speculate that a suitable way to do this would be to write the formalisation using a monad. The monad would be left abstract but axiomatised in Coq in order to prove properties, and be implemented by actual I/O in O'Caml. Even more difficult is the implementation of concurrency. Liu and Moore's ACL2 model simulates concurrency by interleaving, but this does not capture all the possible behaviours allowed by the Java Memory Model [10]. There has been recent work on formalising the Java Memory Model in Isabelle/HOL [1], but it
is difficult to see how this could be made into an executable model. A suitable approach may be to model only data-race-free programs, for which the Java Memory Model guarantees the validity of the interleaving semantics.

Acknowledgement. This work was funded by the ReQueST grant (EP/C537068) from the Engineering and Physical Sciences Research Council.
References

1. Aspinall, D., Ševčík, J.: Formalising Java's Data Race Free Guarantee. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 22–37. Springer, Heidelberg (2007)
2. Barthe, G., Dufay, G., Jakubiec, L., Serpette, B., de Sousa, S.M.: A Formal Executable Semantics of the JavaCard Platform. In: Sands, D. (ed.) ESOP 2001. LNCS, vol. 2028, pp. 302–319. Springer, Heidelberg (2001)
3. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development: Coq'Art: The Calculus of Inductive Constructions. Springer, Heidelberg (2004)
4. Klein, G., Nipkow, T.: A machine-checked model for a Java-like language, virtual machine and compiler. ACM Transactions on Programming Languages and Systems 28(4), 619–695 (2006)
5. Lee, D.K., Crary, K., Harper, R.: Towards a Mechanized Metatheory of Standard ML. In: POPL 2007: Proceedings of the 34th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 173–184. ACM Press, New York (2007)
6. Leroy, X.: Formal certification of a compiler back-end or: programming a compiler with a proof assistant. In: POPL 2006: Conference Record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 42–54. ACM Press, New York (2006)
7. Liang, S., Bracha, G.: Dynamic class loading in the Java virtual machine. In: OOPSLA 1998: Proceedings of the 13th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 36–44. ACM Press, New York (1998)
8. Lindholm, T., Yellin, F.: The Java Virtual Machine Specification, 2nd edn. Addison-Wesley, Reading (1999)
9. Liu, H., Moore, J.S.: Executable JVM model for analytical reasoning: A study. Sci. Comput. Program. 57(3), 253–274 (2005)
10. Manson, J., Pugh, W., Adve, S.V.: The Java memory model. In: POPL 2005: Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 378–391. ACM Press, New York (2005)
11. The Coq development team: The Coq proof assistant reference manual. LogiCal Project, Version 8.0 (2004)
12. Necula, G.C.: Proof-carrying code. In: Proceedings of POPL 1997 (January 1997)
13. Stärk, R., Schmid, J., Börger, E.: Java and the Java Virtual Machine – Definition, Verification, Validation. Springer, Heidelberg (2001)
Dependently Sorted Logic

João Filipe Belo

The University of Manchester, School of Computer Science
Oxford Road, Manchester, M13 9PL, UK
[email protected]

Abstract. We propose syntax and semantics for systems of intuitionistic and classical first order dependently sorted logic, with and without equality, retaining type dependency but otherwise abstracting from systems for dependent type theory, and which can be seen as generalised systems of multisorted logic. These are presented as extensions of Gentzen systems for first order logic in which the logic is developed relative to a context of variable declarations over a theory of dependent sorts. A generalised notion of Kripke structure provides the semantics for the intuitionistic systems.
1
Introduction
Dependently sorted logic may be described as the generalisation of multisorted first order logic to a logic with dependent sorts, i.e., as the generalisation of multisorted first order logic in which the languages have been extended to include dependent sorts. This extension is nevertheless minimal in the sense that sort dependency is the only extra structure assumed. We may thus describe the systems we introduce here as systems for multisorted first order logic generalised with dependent sorts. Alternatively, we may describe these systems as abstractions of logic-enriched type theories [1] in which the only structure retained from the type theory is type dependency. The overall structure of these systems is thus that of a system of predicates and proofs over a type theory, or a theory of sorts [2]. The idea of minimally extending multisorted logic with dependent sorts has been addressed several times already [3,4,5,6] with seemingly different motivations. A system for intuitionistic dependently sorted logic without equality and an abstract notion of theory of dependent sorts called a type setup is proposed by Aczel in [3], starting the work we present in this paper. This is further developed in [4] with set theoretical semantics and completeness, but for classical logic instead. In [5], Makkai introduces a dependently sorted logic to formulate category theoretical notions, in which the use of equality is restricted, consequently avoiding function symbols. The paper [6] by Rabe is rather close in goal to ours, but the development is somewhat different, deals only with classical logic, and imposes conditions on the syntax which, as said there, are rather restrictive, and which we don't need.

M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 33–50, 2008.
© Springer-Verlag Berlin Heidelberg 2008
One motivation for developing dependently sorted logic as presented here is to ﬁt dependent sorts in systems for software speciﬁcation founded on multisorted logic, like CASL [7]. We want to try to apply these systems to generic programming following related work [8,9] in dependent type theory. In this paper we propose syntax and semantics for systems of intuitionistic and classical dependently sorted logic, with and without equality, staying as close as possible to traditional presentations of multisorted logic. One may then be concerned with important properties of multisorted logic, like interpolation, still holding in these systems. In a subsequent paper we intend to show that Craig interpolation indeed holds for these systems. This paper has two parts. In the ﬁrst part we are concerned with theories of sorts. These are syntactical theories, so it begins with the notions of expression and substitution, after which the notion of signature is presented. Signatures are the simplest of the theories of sorts due to the absence of equality. This is followed by the notion of generalised algebraic theory, introduced by Cartmell [10,11,12], which adds equality both on terms and on sorts. Then set theoretical semantics of generalised algebraic theories is given and a completeness result is proved. The second part presents the notion of a dependently sorted ﬁrst order theory and its complete semantics, both classical and intuitionistic.
2
Theories of Dependent Sorts
Roughly, a theory of dependent sorts is one which allows individual variables to occur in the expression of sorts. The actual sort, in the multisorted sense, depends on the value of the variables. It’s very much like the truth value of a predicate depending on the value of the variables occurring in it. Consider, for instance, the expression vec(n) for the sort of vectors of length n [13]. Sort dependency is the only extra structure, with regard to sorts in the multisorted sense, that we assume in dependently sorted logic. We ignore further structure, like dependent products, common in dependent type theories. This section presents two theories of dependent sorts, namely signatures and generalised algebraic theories. The distinguishing feature among them is equality, which is absent in signatures. The main observation about the presentation of these theories is that the inductive deﬁnition of sorts and terms must be simultaneous by the nature of dependent sorts. 2.1
Expressions
We shall now give a definition of a notion of expression, which we do mainly to simplify the definition of substitution on the objects we shall introduce later. We give it in very much the standard way, where the expressions are built from variables by a series of symbol applications. It should nevertheless be noted that for now we don't assign arities to the symbols, and hence that we don't impose any arity constraint on the formation of expressions. It should also be noted that we intend a notion of expression modulo renaming of bound variables, or α-conversion.
For certain inductively defined sets we need an objective or formal notion of derivation according to the clauses of the inductive definition. These sets shall be defined indirectly through the inductive definition of the derivations instead. For this we use the notation

    Π1 · · · Πk
    ───────────  {R}
         Q

to abbreviate a clause "If Π1, . . . Πk are derivations, then (Π1, . . . Πk, Q) is a derivation, provided that R." We may omit R, which we then consider to hold. The Πi we call the premises of the derivation and Q the conclusion. When writing down such a clause we shall in fact, of each Πi, show only its conclusion. We say that "Q is derivable," write Q, when Q is the conclusion of some derivation. The sets indirectly being defined are the sets of those Q which are derivable. For uniformity's sake, hereafter we shall use this notation in every inductive definition, even when we don't need the formal derivations. For us a sequence is always a possibly empty finite sequence, the empty sequence being denoted by the letter . In a situation where a sequence is specified, say E1, . . . En, the sequence may be denoted simply by the unsubscripted letter, say E.

Definition 1. We assume fixed an infinite set U of symbols and an infinite set V of variables and inductively define the expressions by the following clauses.

Variable, compound, variable binding
    v  {v ∈ V}

    E1 · · · En
    ───────────  {H ∈ U}
       H(E)

    A    ψ
    ───────────  {H ∈ U and v ∈ V}
    (Hv : A)ψ
We may use infix notation E1 H E2 when denoting expressions of the form H(E1, E2) and also omit the parentheses when denoting expressions of the form H(). Moreover, in this subsection we let the letters, possibly subscripted or primed,

– H denote an arbitrary symbol,
– u, v, w, x, y denote arbitrary variables,
– A, C, D, E, ψ denote arbitrary expressions,

except otherwise indicated.

Definition 2. An occurrence of a variable v in an expression E is free (bound) in E according to the following induction on the structure of E: (1) the occurrence of v in v is free; (2) a free (bound) occurrence of v in Ei is free (bound) in H(E1, . . . En); (3) a free (bound) occurrence of v in A is free (bound) in (Hv : A)ψ; (4) a free (bound) occurrence of v in ψ is free (bound) in (Hu : A)ψ if v is distinct from u, otherwise it is bound.

Definition 3. An occurrence of a variable v in an expression E is said to be binding in E according to the following induction on the structure of E: (1) the occurrence of v in v is not binding; (2) a binding occurrence of v in Ei is binding in H(E1, . . . En); (3) the occurrence of v in (Hv : A)ψ is binding; (4) a binding occurrence of v in ψ is binding in (Hu : A)ψ.
Definition 4. A variable v is said to be free in an expression E if v has a free occurrence in E, and v is said to be bound in E if v has a binding occurrence in E. The set FV(E) is the set of free variables in E. The set FV(E1, . . . En) is the union of the sets of free variables of each Ei.

Definition 5. A substitution is a pair of sequences D1, . . . Dm/y1, . . . ym, for any m ≥ 0, such that the variables in y are distinct.

Definition 6. Let the pair D1, . . . Dm/y1, . . . ym be a substitution. The simultaneous substitution E[D/y] of D for y in E is defined by induction on the structure of E by the equations:

    v[D/y] = Di   if v = yi for some i,
    v[D/y] = v    if v ≠ yi for all i,

    H(E1, . . . En)[D/y] = H(E1[D/y], . . . En[D/y]),

    (Hv : A)ψ[D/y] = (Hv : A)(ψ[D′/y′])   if v ∉ FV(D),

where D′/y′ is

    D1, . . . Di−1, Di+1, . . . Dm/y1, . . . yi−1, yi+1, . . . ym   if v = yi for some i,
    D/y                                                           if v ≠ yi for all i.

Each equation must be considered under the condition that the simultaneous substitutions on its right-hand side are defined.

Proposition 1 ((Strict) Substitution lemma). Let C1, . . . Cl/x1, . . . xl and D1, . . . Dm/y1, . . . ym be substitutions. Then E[C/x][D/y] = E[D/y][C[D/y]/x], provided the simultaneous substitutions are defined and none of the variables in x is in y or occurs free in D.

As said, we intend a notion of expression modulo renaming of bound variables. For that we define syntactic equality, which relates those expressions differing only in their bound variables.

Definition 7. Syntactic equality ≡ of expressions is inductively defined by the clauses:

Reflexivity, congruence, bound variable renaming

    v ≡ v
    E1 ≡ E1′  · · ·  En ≡ En′
    ─────────────────────────
        H(E) ≡ H(E′)

    ψ[w/v] ≡ ψ′[w/v′]
    ────────────────────────
    (Hv : A)ψ ≡ (Hv′ : A)ψ′

where w is some variable occurring neither in ψ nor in ψ′.

Proposition 2. Syntactic equality is an equivalence relation.
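Definitions 1–6 can be transcribed into a small program. The OCaml datatype and functions below are my own sketch, not code from the paper: the constructor names are assumptions, substitution is made partial by raising an exception where the side condition of Definition 6 fails, and, following the fourth equation as printed, the binder's sort annotation A is left untouched under the binder.

```ocaml
(* Expressions of Definition 1: variables, compounds H(E1,...En),
   and variable bindings (Hv : A)ψ.  Names are assumptions. *)
type expr =
  | Var of string                          (* v *)
  | App of string * expr list              (* H(E1, ... En) *)
  | Bind of string * string * expr * expr  (* (Hv : A)ψ *)

(* Free variables, following Definition 2: occurrences in A stay free
   under the binder; only occurrences of v in ψ are bound. *)
let rec fv = function
  | Var v -> [ v ]
  | App (_, es) -> List.concat_map fv es
  | Bind (_, v, a, psi) ->
      fv a @ List.filter (fun u -> u <> v) (fv psi)

(* Simultaneous substitution E[D/y] (Definition 6).  It is partial:
   we raise Undefined when the bound variable occurs free in some D. *)
exception Undefined

let rec subst (e : expr) (sub : (string * expr) list) : expr =
  match e with
  | Var v -> (try List.assoc v sub with Not_found -> Var v)
  | App (h, es) -> App (h, List.map (fun e -> subst e sub) es)
  | Bind (h, v, a, psi) ->
      (* D'/y' of Definition 6: drop the pair for v if v is some yi *)
      let sub' = List.remove_assoc v sub in
      if List.exists (fun (_, d) -> List.mem v (fv d)) sub' then raise Undefined
      else Bind (h, v, a, subst psi sub')
```

Proposition 4 then corresponds to renaming the bound variable before substituting, so that the Undefined case never has to arise on α-equivalence classes.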
Proposition 3. Let E and E′ be expressions such that E ≡ E′. Then E[D/y] ≡ E′[D/y], for any substitution D1, . . . Dm/y1, . . . ym for which the simultaneous substitutions are defined.

Proposition 4. For any expression E and substitution D1, . . . Dm/y1, . . . ym there exists an expression E′ such that E ≡ E′ and E′[D/y] is defined.

Proposition 4 implies that simultaneous substitution on equivalence classes of expressions is totally defined. Thus, hereafter the expressions are taken modulo syntactic equality, or renaming of bound variables. 2.2
Signatures
The simplest way to set up a theory of sorts with sort dependency is, perhaps, through the notion of signature we present in this section. The fundamental notions in this presentation are those of context, of sort, and of term. The set up will be such that (1) every sort and term shall explicitly contain a context declaring the variables from which it may be formed: no variable may be used other than those declared in that context; and (2) every term shall explicitly contain an expression designating its sort.

Definition 8. A variable declaration is a pair v : A, where v is a variable and A is an expression.

Signatures enjoy the relevant properties of the generalised algebraic theories we present in the next section. In fact they are a particular kind of generalised algebraic theory. We thus choose to develop signatures only to the point where we can give some examples involving dependent sorts. As metavariables, we now let the letters, possibly subscripted or primed,

– f, g, h, F, G denote arbitrary symbols,
– u, v, w denote arbitrary variables,
– p, q, r, s, t, A, B, C, D denote arbitrary expressions,
– Γ, Δ, E denote arbitrary sequences of variable declarations,
except otherwise indicated. Also, unless stated otherwise, we assume the sequences of variables in order declared in Γ and Δ to be x1 , . . . xm and y1 , . . . yn , respectively. Deﬁnition 9. We start with the following deﬁnitions. 1. A sort constructor declaration is a pair (Δ) F . 2. A term constructor declaration is a triple f :: Δ → D. Sort and term constructor declarations simply assign arities to symbols: the sequence of variable declarations assigned to a symbol determines the number
of terms, and their sort, to which that symbol may be applied, and the target expression of a term constructor declaration determines furthermore the sort of that application. Note nevertheless that, at this point, symbols are assigned arbitrary sequences of variable declarations, i.e., the expressions assigned to the variables in those sequences don't necessarily designate sorts. Nor for that matter does the target expression of a term constructor declaration designate a sort, as it should. These two constraints must be further imposed by stating that, in a proper collection of declarations, in what we shall call a signature, the sequences of variable declarations assigned to the symbols must actually be contexts and, roughly, the target expressions of term constructor declarations must be sorts.

Definition 10. A signature is a set Σ of sort and term constructor declarations such that

1. there are no two declarations for the same symbol in Σ,
2. if (Δ) F ∈ Σ, then Δ, and
3. if f :: Δ → D ∈ Σ, then (Δ) D,

according to the following clauses for the simultaneous inductive definition of contexts Γ, sorts (Γ) A, and terms (Γ) t : A over Σ, where (Γ) t : Δ abbreviates the list of premises Γ
    (Γ) t1 : B1    · · ·    (Γ) tn : Bn[t1, . . . tn−1/y1, . . . yn−1]    Δ
for B1 , . . . Bn the sequence of sorts of the variables in order declared in Δ. Note that since Δ may be the empty sequence, the premise Γ is necessary to disallow an arbitrary sequence of variables as the context of the conclusion. empty context, context extension
    (Γ) A
    ──────────  {v ∉ Γ}
    Γ, v : A
sort constructor application

    (Γ) t : Δ
    ──────────  {(Δ) F ∈ Σ}
    (Γ) F(t)

declared variable, term constructor application

    Γ
    ──────────  {v : A ∈ Γ}
    (Γ) v : A
    (Γ) t : Δ    (Δ) D
    ───────────────────  {f :: Δ → D ∈ Σ}
    (Γ) f(t) : D[t/y]
We may omit the pair of parentheses when denoting an application of a constructor symbol to the empty sequence. We usually omit the context, and the arrow, in the declaration of sort and term constructors when the context is empty.

Example 1. The canonical example of a signature is perhaps that for a theory of categories, where the set of morphisms between two objects is designated by a dependent sort:
Dependently Sorted Logic
Sort constructors

    () Obj
    (x : Obj, y : Obj) Arr

Term constructors

    ◦ :: (x : Obj, y : Obj, z : Obj, g : Arr(x, y), f : Arr(y, z)) → Arr(x, z)
    id :: (x : Obj) → Arr(x, x)

Example 2. A more interesting example deals with indexed families of categories:

Sort constructors

    () V
    (i : V) Obj
    (i : V, x : Obj(i), y : Obj(i)) Arr

Term constructors

    ◦ :: (i : V, x : Obj(i), . . . f : Arr(i, y, z)) → Arr(i, x, z)
    id :: (i : V, x : Obj(i)) → Arr(i, x, x)

Example 3. Another example is that of a signature for a theory of stacks, where the sort of a stack includes the number of elements in it:

Sort constructors

    () Nat
    () T
    (len : Nat) Stk

Term constructors

    0 :: Nat
    suc :: Nat → Nat
    empty :: Stk(0)
    push :: (len : Nat, s : Stk(len), e : T) → Stk(suc(len))
    top :: (len : Nat, s : Stk(suc(len))) → T

This disallows, for instance, the application of top to the empty stack. Before we move on to the next subject we note, regarding the definition of signature, that dropping the premise ⊢ Δ from the sort constructor application clause, and the premises ⊢ Δ and (Δ) D from the term constructor application clause, enables the following to be a signature.

Sort constructors

    (x : G(c)) G

Term constructors

    c :: G(c)

Note the "circularity" in the declarations. The theory can still be developed in this case, without much change, although the proofs are not so straightforward, since the language no longer has a proper simultaneous inductive definition.
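A signature such as that of Example 3 is, at bottom, a finite list of declarations subject to the well-formedness conditions of Definition 10. As a sketch, the stack signature can be written down concretely, with condition 1 (no symbol declared twice) as an executable check; the representation and the names `SortDecl`, `TermDecl`, `distinct_symbols` are ours, and sort expressions are kept as uninterpreted strings rather than checked against the inductive clauses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SortDecl:            # a sort constructor declaration (Delta) F
    name: str
    ctx: tuple             # Delta: sequence of (variable, sort expression) pairs

@dataclass(frozen=True)
class TermDecl:            # a term constructor declaration f :: Delta -> D
    name: str
    ctx: tuple
    target: str            # the target sort expression D

def distinct_symbols(sigma):
    """Condition 1 of Definition 10: no two declarations for the same symbol."""
    names = [decl.name for decl in sigma]
    return len(names) == len(set(names))

# The stack signature of Example 3:
stacks = (
    SortDecl("Nat", ()),
    SortDecl("T", ()),
    SortDecl("Stk", (("len", "Nat"),)),
    TermDecl("0", (), "Nat"),
    TermDecl("suc", (("n", "Nat"),), "Nat"),
    TermDecl("empty", (), "Stk(0)"),
    TermDecl("push", (("len", "Nat"), ("s", "Stk(len)"), ("e", "T")), "Stk(suc(len))"),
    TermDecl("top", (("len", "Nat"), ("s", "Stk(suc(len))")), "T"),
)

print(distinct_symbols(stacks))                           # -> True
print(distinct_symbols(stacks + (SortDecl("Nat", ()),)))  # -> False
```

Checking conditions 2 and 3 would require implementing the simultaneous inductive definition of contexts, sorts, and terms, which is exactly where the dependency (the occurrence of len in Stk(suc(len))) comes into play.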
J.F. Belo

2.3 Generalised Algebraic Theories
The notion of generalised algebraic theory extends that of signature with the expression of equality both on terms and on sorts. It was introduced by Cartmell [10] as "a generalisation of the usual notion of a many-sorted algebraic or equational theory." Our presentation uses slightly different notation and terminology, in a style closer to that of the traditional presentations of multisorted languages. The key observation here is that an inferred equality may enable a new application of a constructor symbol; thus the contexts, the sorts, and the terms must be formed simultaneously with the inference of equality.

Definition 11

1. An algebraic equation axiom is a triple (Γ) r = s : A.
2. A sort equation axiom is a triple (Γ) A = B.

Definition 12. A generalised algebraic theory is a set Σ of sort constructor declarations, term constructor declarations, algebraic equation axioms, and sort equation axioms such that

1. there are no two declarations for the same symbol in Σ,
2. if (Δ) F ∈ Σ, then ⊢ Δ,
3. if f :: Δ → D ∈ Σ, then (Δ) D,
4. if (Γ) s = t : A ∈ Σ, then (Γ) s : A and (Γ) t : A,
5. if (Γ) A = B ∈ Σ, then (Γ) A and (Γ) B,
according to the following clauses for the simultaneous inductive definition of contexts ⊢ Γ, sorts (Γ) A, terms (Γ) t : A, algebraic equations (Γ) r = s : A, and sort equations (Γ) A = B over Σ. For brevity we omit the signature clauses for the contexts, sorts, and terms.

sort replacement on terms

    (Γ) r : A    (Γ) A = B
    ──────────────────────
    (Γ) r : B

algebraic equation axiom

    (Γ) r : A    (Γ) s : A    {(Γ) r = s : A ∈ Σ}
    ─────────────────────────────────────────────
    (Γ) r = s : A

reflexivity, symmetry, transitivity

    (Γ) t : A          (Γ) r = s : A        (Γ) r = s : A    (Γ) s = t : A
    ─────────────      ─────────────        ─────────────────────────────
    (Γ) t = t : A      (Γ) s = r : A        (Γ) r = t : A

substitution on algebraic equations

    (Γ) r = s : Δ    (Δ) p = q : D
    ──────────────────────────────
    (Γ) p[r/y] = q[s/y] : D[r/y]
sort replacement on algebraic equations

    (Γ) r = s : A    (Γ) A = B
    ──────────────────────────
    (Γ) r = s : B

sort equation axiom

    (Γ) A    (Γ) B    {(Γ) A = B ∈ Σ}
    ─────────────────────────────────
    (Γ) A = B

reflexivity, symmetry, transitivity of sort equations

    (Γ) A           (Γ) A = B        (Γ) A = B    (Γ) B = C
    ──────────      ─────────        ───────────────────────
    (Γ) A = A       (Γ) B = A        (Γ) A = C

substitution on sort equations

    (Γ) r = s : Δ    (Δ) C = D
    ──────────────────────────
    (Γ) C[r/y] = D[s/y]

We next display a series of fundamental properties of these derivations, writing (Γ) t : Δ as an abbreviation for ⊢ Γ, (Γ) t1 : B1, . . . (Γ) tn : Bn[t1, . . . tn−1/y1, . . . yn−1], and ⊢ Δ, for Δ the sequence y1 : B1, . . . yn : Bn of variable declarations.

Proposition 5. Let Σ be a generalised algebraic theory.

i. If u occurs in p in (Δ) p : D, then (Δ) u : Bi for some i.
ii. If u occurs in D in (Δ) D, then (Δ) u : Bi for some i.

Proposition 6. Suppose that (Γ) t : Δ.

i. If (Δ) p : D, then (Γ) p[t/y] : D[t/y].
ii. If (Δ) D, then (Γ) D[t/y].

Proposition 7

i. If (Γ) A, then ⊢ Γ.
ii. If ⊢ Γ and v : A ∈ Γ, then (Γ) A.
iii. If (Γ) r = s : A, then (Γ) r : A and (Γ) s : A.
iv. If (Γ) A = B, then (Γ) A and (Γ) B.
v. If (Γ) r : A, then (Γ) A.
Proposition 8. If (Γ) r : A and (Γ) r : B, then (Γ) A = B.

Proof. The proof is by a double induction on the heights of the derivations. Take a derivation of (Γ) r : A and a derivation of (Γ) r : B. Each must end in a declared variable clause, a term constructor application, or a sort replacement on terms. For every possible combination of these, it can be checked that (Γ) A = B indeed holds, either by reflexivity or by the induction hypothesis and transitivity.
3 Dependently Sorted Algebras
We now turn to the set-theoretic semantics for generalised algebraic theories, which we prove sound and complete, although in a weak sense. The semantics is based on the idea that a sort is interpreted by a family of sets indexed by the interpretation of its context, and that a term is interpreted by a function from the interpretation of its context to the interpretation of its sort, but in a way which respects the indexing of that sort. The condition that such an interpretation respects the interpretation of substitution as composition defines the notion of structure for a generalised algebraic theory. If, furthermore, the equality axioms are satisfied by the interpretation, then the structure is called a dependently sorted algebra. Thus, let Σ be a generalised algebraic theory.

Definition 13. A structure M for Σ is a triple of interpreting functions ·M on contexts, sorts, and terms such that

1. Γ M is a set:
   ()M = {∅}
   Γ, v : AM = {(e, e′) | e ∈ Γ M and e′ ∈ (Γ) AM(e)}
2. (Γ) AM is a family of sets indexed by Γ M:
   (Γ) D[t/y]M(e) = (Δ) DM((Γ) t : ΔM(e))
3. (Γ) r : AM : (e ∈ Γ M) → (Γ) AM(e):
   (Γ) xi : Ai M(e) = ei
   (Γ) p[t/y] : D[t/y]M(e) = (Δ) p : DM((Γ) t : ΔM(e))

where xi : Ai is the i-th variable declaration in Γ and (Γ) t : ΔM(e) abbreviates ((Γ) t1 : B1 M(e), . . . (Γ) tn : Bn[t1, . . . tn−1/y1, . . . yn−1]M(e)), B1, . . . Bn being the sequence of sorts of the variables declared, in order, in Δ.

One can also define a structure to be an assignment of a family of sets to each sort constructor and of a function to each term constructor such that the above equations, slightly changed and taken in general as inductively defining a partial assignment on the contexts, sorts, and terms, are indeed total. Nevertheless, the above definition captures the properties needed for what follows.

Definition 14. An algebraic equation (Γ) r = s : A is valid in a structure M for Σ, written M ⊨Σ (Γ) r = s : A, if (Γ) r : AM = (Γ) s : AM. Similarly for a sort equation. The structure M is called a (dependently sorted) algebra for Σ if every algebraic and sort equation in Σ is valid in M.

Proposition 9 (Soundness)

i. For any algebraic equation (Γ) r = s : A over Σ, (Γ) r = s : A only if M ⊨Σ (Γ) r = s : A for every algebra M for Σ.
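To make Definition 13 concrete, here is a finitary sketch of a structure for the stack signature of Example 3, truncating the interpretation of Nat to {0, . . . , 3} so that every carrier is finite; the encoding and all names are ours, and the assertions only spot-check the indexing conditions.

```python
NAT = range(4)            # interpretation of the closed sort Nat (truncated)
T = {"a", "b"}            # interpretation of the closed sort T

def STK(n):
    """(len : Nat) Stk as a family of sets indexed by NAT:
    Stk(n) is interpreted by the set of tuples over T of length n."""
    stacks = [()]
    for _ in range(n):
        stacks = [s + (e,) for s in stacks for e in T]
    return set(stacks)

# term constructors as functions respecting the indexing of their sorts
def push(n, s, e):
    assert n in NAT and s in STK(n) and e in T
    return (e,) + s       # lands in STK(n + 1), i.e. the fibre of Stk(suc(len))

def top(n, s):
    assert s in STK(n + 1)   # top is only defined at length suc(len)
    return s[0]

assert push(0, (), "a") in STK(1)
assert top(0, ("a",)) == "a"
```

The point of the definition is visible here: the value of push at index n must land in the fibre over n + 1. Validity of an algebraic equation (Definition 14) would then be equality of the two induced functions, fibre by fibre.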
ii. For any sort equation (Γ) A = B over Σ, (Γ) A = B only if M ⊨Σ (Γ) A = B for every algebra M for Σ.

Proof. By induction on the height of the derivation of (Γ) r = s : A and of (Γ) A = B, using properties 2 and 3 of the definition of structure for Σ in the cases of substitution on sort equations and substitution on algebraic equations, respectively.

3.1 Completeness
By completeness we understand the converse of Propositions 9.i and 9.ii. We shall show that the converse of 9.ii does not hold, so we aim solely at the converse of 9.i, which we call weak completeness. The method we use is the standard one of building a term algebra and then showing that it validates a given algebraic equation only if that equation is derivable. Thus, let Σ be a generalised algebraic theory. We shall prove the following proposition.

Proposition 10 (Weak Completeness). For any equation (Γ) r = s : A, it holds that (Γ) r = s : A if M ⊨Σ (Γ) r = s : A for every algebra M.

For simplicity of notation, we restrict the proof to the case where Γ is the empty context, and thus build the term algebra by collecting closed terms. For the general case one would collect terms in the context Γ instead.

Definition 15. We say a sort A is closed if it has the empty context, and similarly for terms. We shall often leave out the empty context when referring to closed sorts and closed terms.

Recall Proposition 8 for the next definition.

Proposition 11. Let the binary relation ∼ on the closed terms over Σ be defined by r : A ∼ s : B if r = s : B, for closed terms r : A and s : B. Then,

i. ∼ is an equivalence relation,
ii. c : Γ ∼ e : Γ implies r : A[c/x] ∼ r : A[e/x], for (Γ) A and r : A[e/x], and
iii. c : Γ ∼ e : Γ implies t[c/x] : A[c/x] ∼ t[e/x] : A[e/x], for (Γ) t : A.

Definition 16. The canonical structure for Σ is the triple of assignments · defined on the contexts Γ, sorts (Γ) A, and terms (Γ) t : A by induction on the number of declarations in Γ:

1. () = {∅},
2. Γ, y : B = {(e, e′) | e ∈ Γ and e′ ∈ (Γ) B(e)},
3. (Γ) A is a family of sets indexed by Γ:
   (Γ) A([e]∼) = {[r : A[e/x]]∼ | r : A[e/x]},
4. (Γ) t : A : ([e]∼ ∈ Γ) → (Γ) A([e]∼)
(Γ) t : A([e]∼) = [t[e/x] : A[e/x]]∼,

where x is the sequence of variables declared, in order, in Γ.

Proposition 12. The canonical structure for Σ is a structure for Σ.

Proof. The proof is by induction on the number of variables declared in Γ, checking the conditions in the definition of structure. It essentially follows from the properties of substitution.

Proposition 13. The canonical structure for Σ is an algebra for Σ.

Proof. It needs only to be checked that, whenever (Γ) r = s : A ∈ Σ, (Γ) r : A([e]∼) = (Γ) s : A([e]∼) for all [e]∼ ∈ Γ, and that, whenever (Γ) A = B ∈ Σ, (Γ) A([e]∼) = (Γ) B([e]∼) for all [e]∼ ∈ Γ.

Proposition 14. Let Σ be a generalised algebraic theory, and let M be the canonical structure for Σ. Then M ⊨Σ r = s : A only if r = s : A, for closed terms r : A and s : A.

Proof. This is an immediate consequence of the definition of ∼.

Again, we do not claim that M ⊨Σ A = B only if A = B, for closed sorts A and B. Here a problem arises as follows. Suppose Σ is the generalised algebraic theory {() A, () B, (x : A) A = B, (x : B) A = B}. Then A = B is not derivable, but M ⊨Σ A = B for every algebra M for Σ. This is because in every algebra either A and B are both interpreted by the empty set, and then their interpretations are equal, or they must otherwise be interpreted by equal sets, as imposed by the axioms. Thus we have the following.

Proposition 15. There is a generalised algebraic theory over which an equality on sorts is not derivable but is nevertheless satisfied in every algebra.

Thus, for the stronger completeness result we need a more general category, one which allows a more intensional notion of equality between the objects interpreting the sorts.
4 Systems of Dependently Sorted Logic
As said in the introduction, we consider a logic system to be a system of predicates and proofs ﬁtted over a theory of sorts. The purpose of this section is to make precise what we mean by this for the particular case of dependently sorted logic. We shall introduce systems for both classical and intuitionistic ﬁrst order logic. The inference rules are essentially those of the systems G1c and G1i in [14], except that:
1. the sequents in our systems include a context, which is a way of dealing with empty sorts, as the availability of variables of given sorts is implicitly an existence assumption, see [15], page 811, and
2. a substitution rule is included, with an algebraic equality as a side condition, so that a term in the conclusion of a derivation may be replaced by another algebraically equal to it.

We use the letters ζ, ρ, φ, ψ, possibly subscripted or primed, to denote arbitrary expressions, and Θ, P, Φ to denote arbitrary multisets of expressions.

Definition 17

1. A predicate symbol declaration is a pair R ⊂ (Δ).
2. A logic system is a generalised algebraic theory Σ together with a set of predicate symbol declarations R ⊂ (Δ) such that ⊢Σ Δ.

Let Σ be a logic system. We shall use the same letter to denote a logic system and its theory of sorts.

Definition 18. The formulas over Σ are inductively defined according to the following clauses.

atomic, truth, and falsity

    ─────────  {R ⊂ (Δ) ∈ Σ and (Γ) t : Δ}        ───────  {⊢ Γ}        ───────  {⊢ Γ}
    (Γ) R(t)                                       (Γ) ⊤                 (Γ) ⊥

conjunction, disjunction, and implication

    (Γ) ζ0    (Γ) ζ1        (Γ) ζ0    (Γ) ζ1        (Γ) ζ0    (Γ) ζ1
    ────────────────        ────────────────        ────────────────
    (Γ) ζ0 ∧ ζ1             (Γ) ζ0 ∨ ζ1             (Γ) ζ0 → ζ1

universal and existential quantification

    (Γ, x : A) ψ            (Γ, x : A) ψ
    ──────────────          ──────────────
    (Γ) (∃x : A)ψ           (Γ) (∀x : A)ψ

Proposition 16

1. If u occurs free in φ in (Δ) φ, then (Δ) u : Bi for some i, where B1, . . . Bn is the sequence of sorts of the variables declared in Δ.
2. If (Γ) t : Δ and (Δ) φ, then (Γ) φ[t/y].

Definition 19. A sentence is a formula with the empty context. We usually omit the context when denoting a sentence. A sequent is a triple (Γ) Φ ⇒ P such that (Γ) Φ and (Γ) P are finite multisets of formulas. The multiset (Γ) Φ is called the antecedent and (Γ) P the succedent. A sequent is called intuitionistic if the succedent has at most one formula.

Definition 20. Sequents are derived over a logic system according to the following clauses, called the inference rules for classical logic.
logical axiom, ⊥ elimination, and ⊤ introduction

    ──────────        ──────────        ──────────
    (Γ) ζ ⇒ ζ         (Γ) ⊥ ⇒           (Γ) ⇒ ⊤

weakening left and right, contraction left and right

    (Γ) Φ ⇒ P          (Γ) Φ ⇒ P          (Γ) ζ, ζ, Φ ⇒ P        (Γ) Φ ⇒ P, ζ, ζ
    ─────────────      ─────────────      ───────────────        ───────────────
    (Γ) ζ, Φ ⇒ P       (Γ) Φ ⇒ P, ζ       (Γ) ζ, Φ ⇒ P           (Γ) Φ ⇒ P, ζ

substitution and cut

    (Δ) Ψ ⇒ Θ                                       (Γ) Φ0 ⇒ P0, ζ    (Γ) ζ, Φ1 ⇒ P1
    ────────────────────  {(Γ) r = s : Δ}           ─────────────────────────────────
    (Γ) Ψ[r/y] ⇒ Θ[s/y]                             (Γ) Φ0, Φ1 ⇒ P0, P1

∧ elimination and introduction

    (Γ) ζi, Φ ⇒ P              (Γ) Φ ⇒ P, ζ0    (Γ) Φ ⇒ P, ζ1
    ──────────────────         ─────────────────────────────
    (Γ) ζ0 ∧ ζ1, Φ ⇒ P         (Γ) Φ ⇒ P, ζ0 ∧ ζ1

∨ elimination and introduction

    (Γ) ζ0, Φ ⇒ P    (Γ) ζ1, Φ ⇒ P         (Γ) Φ ⇒ P, ζi
    ─────────────────────────────          ──────────────────
    (Γ) ζ0 ∨ ζ1, Φ ⇒ P                     (Γ) Φ ⇒ P, ζ0 ∨ ζ1

→ elimination and introduction

    (Γ) Φ ⇒ P, ζ0    (Γ) ζ1, Φ ⇒ P         (Γ) ζ0, Φ ⇒ P, ζ1
    ─────────────────────────────          ──────────────────
    (Γ) ζ0 → ζ1, Φ ⇒ P                     (Γ) Φ ⇒ P, ζ0 → ζ1

∃ elimination and introduction

    (Γ, u : A) ψ[u/v], Φ ⇒ P          (Γ) Φ ⇒ P, ψ[t/v]
    ────────────────────────          ──────────────────────  {(Γ) t : A}
    (Γ) (∃v : A)ψ, Φ ⇒ P              (Γ) Φ ⇒ P, (∃v : A)ψ

∀ elimination and introduction

    (Γ) ψ[t/v], Φ ⇒ P                         (Γ, u : A) Φ ⇒ P, ψ[u/v]
    ────────────────────  {(Γ) t : A}         ────────────────────────
    (Γ) (∀v : A)ψ, Φ ⇒ P                      (Γ) Φ ⇒ P, (∀v : A)ψ
Definition 21. The inference rules for intuitionistic logic are the same as those for classical logic, except that the sequents must be intuitionistic and the → elimination clause is replaced by:

    (Γ) Φ0 ⇒ ζ    (Γ) ζ, Φ1 ⇒ P
    ────────────────────────────
    (Γ) Φ0, Φ1 ⇒ P

Definition 22. If a sequent (Γ) Φ ⇒ P is derivable over Σ using the inference rules of classical logic, then we write ⊢Σ (Γ) Φ ⇒ P. If the sequent is intuitionistic and the derivation uses the inference rules of intuitionistic logic, then we write instead ⊢iΣ (Γ) Φ ⇒ P. We omit the subscript Σ if no confusion arises.

Definition 23. A first order theory over Σ is a set of sentences over Σ, called the first order axioms of the theory. A sentence ζ is said to be derivable from the first order axioms in S, written S ⊢ ζ, if ⊢ () Φ ⇒ ζ for some finite multiset Φ such that the sentences in Φ are all first order axioms in S. We again use a superscript i in case the derivation is intuitionistic.
5 Structures for Dependently Sorted Logic
Having presented the systems for intuitionistic and classical logic in the previous section, we now proceed with their semantics. The striking aspect of the following development is, perhaps, that it is only a slight generalisation of that for multisorted logic. We present complete semantics for both classical and intuitionistic logic. The definition of classical structure is straightforward: given a structure for the theory of sorts, only the declared predicate symbols remain to be interpreted.

Definition 24. Let Σ be a logic system. A classical structure M for Σ is a structure for Σ together with an assignment of a subset RM of ΔM to each predicate symbol declaration R ⊂ (Δ).

The interpretation of arbitrary formulas is then given by appropriate subsets of the interpretation of their contexts and is defined as follows.

Definition 25. The satisfaction relation ⊨Σ on M between formulas (Γ) ζ and elements e of Γ M is defined by induction on the structure of (Γ) ζ as follows.

1. M, e ⊨ (Γ) R(t) if (Γ) t : ΔM(e) ∈ RM,
2. M, e ⊨ (Γ) ⊤,
3. M, e ⊭ (Γ) ⊥,
4. M, e ⊨ (Γ) ζ0 ∧ ζ1 if M, e ⊨ (Γ) ζ0 and M, e ⊨ (Γ) ζ1,
5. M, e ⊨ (Γ) ζ0 ∨ ζ1 if M, e ⊨ (Γ) ζ0 or M, e ⊨ (Γ) ζ1,
6. M, e ⊨ (Γ) (∃x : A)ψ if M, (e, e′) ⊨ (Γ, x : A) ψ for some (e, e′) ∈ Γ, x : AM,
7. M, e ⊨ (Γ) ζ0 → ζ1 if M, e ⊨ (Γ) ζ0 implies M, e ⊨ (Γ) ζ1,
8. M, e ⊨ (Γ) (∀x : A)ψ if M, (e, e′) ⊨ (Γ, x : A) ψ for all e′ ∈ (Γ) AM(e).

Proposition 17 (Substitution lemma). For every formula (Δ) φ, terms (Γ) t : Δ, and e ∈ Γ M, M, e ⊨ (Γ) φ[t/y] if and only if M, (Γ) t : ΔM(e) ⊨ (Δ) φ.

Proof. By induction on the structure of (Δ) φ.

Definition 26. A formula (Γ) ζ is valid in M, written M ⊨Σ (Γ) ζ, if M, e ⊨Σ (Γ) ζ for all e ∈ Γ M. The classical structure M is a classical model of a first order theory S over Σ, written M ⊨Σ S, if M is an algebra for Σ and every first order axiom in S is valid in M. The formula is a consequence of S, written S ⊨Σ (Γ) ζ, if it is valid in every model of S. Again, we omit the subscript Σ if no confusion arises.

We claim the completeness of the above semantics for classical logic.

Proposition 18 (Completeness). For any first order theory S and any sentence ζ, S ⊢ ζ if and only if S ⊨ ζ.
Proof. The "only if" direction of the claim, soundness, is proved by induction on the derivation of ζ from the first order axioms in S. The classical approach to the other direction is to show that if ζ is not derivable from the first order axioms in S, then S has a model that does not satisfy ζ. The traditional proof, which proceeds by first extending the theory to a complete Henkin one (a theory such that, for any sentence, either it or its negation is derivable, and such that there is a constant witnessing every derivable existential) and then by defining a classical structure interpreting the predicate symbols in the canonical algebra through derivability, like that in [16] for the case of enumerable languages or the more general one in [17], carries over to our case. Such a proof, carried over to dependently sorted languages without equality, may be found in [4].

We now proceed to the semantics for the intuitionistic systems. We define a notion of Kripke structure composed of classical structures, generalised in the sense that between the nodes we allow arbitrary classical structure morphisms. We should note that this generalisation is not essential to our results, as the standard definition, which only considers inclusions between the nodes, should suffice.

Definition 27. Let M and N be classical structures. A classical structure morphism on M to N, also called an extension of M, is a family h of functions indexed by the sorts over Σ such that

1. for every sort (Δ) D, every d ∈ ΔM, and every d′ ∈ (Δ) DM(d),
   h_(Δ) D(d′) ∈ (Δ) DN(h_Δ(d)),
2. for every term (Δ) p : A and every d ∈ ΔM,
   (Δ) p : AN(h_Δ(d)) = h_(Δ) A((Δ) p : AM(d)),
3. for every R ⊂ (Δ) ∈ Σ and every d ∈ ΔM,
   d ∈ RM only if h_Δ(d) ∈ RN,

where h_Δ is the induced assignment on contexts:

    h_() = id_{∅}
    h_(Γ, v : A)(e, e′) = (h_Γ(e), h_(Γ) A(e′))

for all e ∈ Γ M and e′ ∈ (Γ) AM(e).

Definition 28. A (generalised) Kripke structure for Σ is any category of classical structures for Σ and classical structure morphisms between them.

The extension of a formula is defined in the usual way through forcing, except that we now have to consider arbitrary structure morphisms on each node for implications and universal quantifications.
Definition 29. The forcing relation ⊩Σ on a Kripke structure K between classical structures M in K, formulas (Γ) ζ, and elements e of Γ M is defined by induction on the structure of (Γ) ζ as follows.

1. M, e ⊩ (Γ) R(t) if (Γ) t : ΔM(e) ∈ RM,
2. M, e ⊩ (Γ) ⊤,
3. M, e ⊮ (Γ) ⊥,
4. M, e ⊩ (Γ) ζ0 ∧ ζ1 if M, e ⊩ (Γ) ζ0 and M, e ⊩ (Γ) ζ1,
5. M, e ⊩ (Γ) ζ0 ∨ ζ1 if M, e ⊩ (Γ) ζ0 or M, e ⊩ (Γ) ζ1,
6. M, e ⊩ (Γ) (∃x : A)ψ if M, (e, e′) ⊩ (Γ, x : A) ψ for some (e, e′) ∈ Γ, x : AM,
7. M, e ⊩ (Γ) ζ0 → ζ1 if N, h_Γ(e) ⊩ (Γ) ζ0 implies N, h_Γ(e) ⊩ (Γ) ζ1 for every classical structure N and classical structure morphism h : M → N in K,
8. M, e ⊩ (Γ) (∀x : A)ψ if N, (h_Γ(e), e′) ⊩ (Γ, x : A) ψ for all (h_Γ(e), e′) ∈ Γ, x : AN, every classical structure N, and every classical structure morphism h : M → N in K.

Definition 30. A formula (Γ) ζ is valid at a classical structure M in K, written M ⊩Σ (Γ) ζ, if M, e ⊩ (Γ) ζ for all e ∈ Γ M. The formula is valid in K, written K ⊩Σ (Γ) ζ, if M ⊩ (Γ) ζ for every classical structure M in K. The Kripke structure K is a Kripke model of a first order theory S, written K ⊩ S, if every first order axiom in S is valid in K. The formula is a Kripke consequence of S, written S ⊩Σ (Γ) ζ, if it is valid in every Kripke model of S.

Proposition 19 (Substitution lemma). Let (Δ) φ be a formula, let M be a classical structure in K, and let e ∈ Γ M. Suppose that (Γ) t : Δ. Then M, e ⊩ (Γ) φ[t/y] if and only if M, (Γ) t : ΔM(e) ⊩ (Δ) φ.

Proof. By induction on the structure of (Δ) φ.

Proposition 20 (Completeness). For any first order theory S and sentence ζ, S ⊢i ζ if and only if S ⊩ ζ.

Proof. The "only if" part of the claim is easy. For the other, one assumes a sentence ζ not derivable intuitionistically from the first order axioms in S and proceeds to build a Kripke model that does not force it. Roughly, the traditional proof considers all possible saturations of S (extensions of the theory such that, for every derivable disjunction, at least one of its disjuncts is derivable, and such that there is a constant witnessing every derivable existential) and then takes as a model the category composed of the canonical structures of each of those saturations and of all classical structure morphisms between them. This can be carried over to our semantics. One can show that the model thus constructed does not force ζ from the fact that ζ is not derivable in at least one of the saturations.
Acknowledgments This work was carried out under the supervision of Professor Peter Aczel. Also, many thanks to the referees for their generous and helpful comments.
References

1. Gambino, N., Aczel, P.: The generalised type-theoretic interpretation of constructive set theory (preprint 2005)
2. Jacobs, B.: Categorical Logic and Type Theory. Studies in Logic and the Foundations of Mathematics, vol. 141. North-Holland, Amsterdam (1999)
3. Aczel, P.: Predicate logic with dependent sorts or types. Unpublished (2004)
4. Belo, J.F.: Dependently typed predicate logic. Master's thesis, University of Manchester (2004)
5. Makkai, M.: First order logic with dependent sorts, with applications to category theory. Unpublished (1995)
6. Rabe, F.: First-order logic with dependent types. In: Furbach, U., Shankar, N. (eds.) IJCAR 2006. LNCS (LNAI), vol. 4130, pp. 377–391. Springer, Heidelberg (2006)
7. Mosses, P.D. (ed.): CASL Reference Manual. LNCS, vol. 2960. Springer, Heidelberg (2004)
8. Benke, M., Dybjer, P., Jansson, P.: Universes for generic programs and proofs in dependent type theory. Nordic J. of Computing 10(4), 265–289 (2003)
9. Pfeifer, H., Rueß, H.: Polytypic proof construction. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 55–72. Springer, Heidelberg (1999)
10. Cartmell, J.: Generalized algebraic theories and contextual categories. PhD thesis, Univ. Oxford (1978)
11. Cartmell, J.: Generalized algebraic theories and contextual categories. Ann. Pure Appl. Logic 32, 209–243 (1986)
12. Pitts, A.M.: Categorical logic. In: Abramsky, S., Gabbay, D.M., Maibaum, T.S.E. (eds.) Handbook of Logic in Computer Science. Algebraic and Logical Structures, vol. 5, ch. 2. Oxford University Press, Oxford (2000)
13. Hofmann, M.: Syntax and semantics of dependent types. In: Pitts, A.M., Dybjer, P. (eds.) Semantics and Logics of Computation, vol. 14, pp. 79–130. Cambridge University Press, Cambridge (1997)
14. Troelstra, A.S., Schwichtenberg, H.: Basic Proof Theory. Cambridge University Press, Cambridge (2000)
15. Johnstone, P.T.: Sketches of an Elephant: A Topos Theory Compendium, vol. 2. Oxford University Press, Oxford (2002)
16. Johnstone, P.T.: Notes on Logic and Set Theory. Cambridge University Press, Cambridge (1987)
17. Shoenfield, J.R.: Mathematical Logic. Association for Symbolic Logic (1967)
Finiteness in a Minimalist Foundation

Francesco Ciraulo¹ and Giovanni Sambin²

¹ Università di Palermo, Dipartimento di Matematica ed Applicazioni, Via Archirafi 34, 90123 Palermo, Italy
[email protected]
http://www.math.unipa.it/~ciraulo
² Università di Padova, Dipartimento di Matematica Pura ed Applicata, Via Trieste 63, 35121 Padova, Italy
[email protected]
http://www.math.unipd.it/~sambin/
Abstract. We analyze the concepts of finite set and finite subset from the perspective of a minimalist foundational theory which has recently been introduced by Maria Emilia Maietti and the second author. The main feature of that theory and, as a consequence, of our approach is compatibility with other foundational theories such as Zermelo-Fraenkel set theory, Martin-Löf's intuitionistic Type Theory, topos theory, Aczel's CZF, Coquand's Calculus of Constructions. This compatibility forces our arguments to be constructive in a strong sense: no use is made of powerful principles such as the axiom of choice, the powerset axiom, the law of the excluded middle.

Keywords: minimalist foundation, finite sets, finite subsets, type theory, constructive mathematics.
1 Introduction
The behaviour of a mathematical object and the properties it possesses are influenced by the foundational assumptions one accepts. That is true also for the apparently clear concepts of finite set and finite subset of a given set. For this reason, it seems interesting to know a stock of properties about finiteness which are true in all foundational theories (or, at least, in the most used ones). Maria Emilia Maietti and the second author have recently proposed (see [5]) a foundational theory which is "minimalist" in the sense that it can be seen as the common core of some of the most used foundations, namely, Zermelo-Fraenkel set theory, topos theory, Martin-Löf's Type Theory, Aczel's CZF, Coquand's Calculus of Constructions. A peculiarity of this minimalist foundation is that it is based on two levels of abstraction: an extensional theory to develop mathematics in more or less the usual informal way (see [4]) and an underlying intensional type theory called "minimal Type Theory" ("mTT" from now on) in which mathematics is formalized (see [5]). Therefore, our task of speaking about finiteness independently from foundations acquires a more precise form: to study finiteness from the perspective of this minimalist foundation, and hence

M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 51–68, 2008.
© Springer-Verlag Berlin Heidelberg 2008
eventually in terms of mTT. Accomplishing this task is the aim of the present paper. Thanks to the reasons explained above, the definitions and results in the present paper are constructive in a strong sense: no use is made of powerful principles such as the axiom of choice, the powerset axiom, the principle of excluded middle. In fact, each of these principles breaks compatibility with at least one of the above foundational theories. The present work can be seen as a sequel to [9] because all the definitions and properties stated there, even if originally intended for Martin-Löf's theory, remain valid when viewed from the point of view of mTT, since they do not need any application of the axiom of choice. For the same reason, a large part of [6] and [8] can be read as an explanation of the formal system mTT. For all those notions which are used but not explained in this paper, we refer to [5] and [6].
2 Minimal Type Theory: A Brief Introduction
The type theory mTT can be formalized as a variant of Martin-Löf's theory (see [6] and [8]); thus we feel free to use all the standard notation developed for Type Theory, mainly the set constructors Σ and Π. The main difference between the two systems is that mTT identifies each proposition with a particular set (namely, the set of all its proofs), but not conversely, as Martin-Löf's theory does. This implies that the usual identification between logical constants and set constructors can no longer be performed. In other words, in mTT every logical constant needs an independent definition; for example, the always false proposition, written ⊥, has to be kept distinct from N(0) (the set with no elements; see below) simply because N(0) is not a proposition. As a consequence, the axiom of choice is no longer provable in mTT.¹ To see this, let us briefly explain the difference between (Σ x ∈ A)B(x) (disjoint union) and (∃ x ∈ A)B(x) (existential quantifier) in mTT. Both are sets, but only the latter is a proposition. Their formation and introduction rules are formally the same, but their elimination rules, namely

    C(z) set [z ∈ (Σ x ∈ A)B(x)]    d ∈ (Σ x ∈ A)B(x)    m(x, y) ∈ C(<x, y>) [x ∈ A, y ∈ B(x)]
    ──────────────────────────────────────────────────────────────────────────────────────────  (1)
    ElΣ(d, m) ∈ C(d)

    C prop    d ∈ (∃ x ∈ A)B(x)    m(x, y) ∈ C [x ∈ A, y ∈ B(x)]
    ────────────────────────────────────────────────────────────  (2)   ∃-elimination
    El∃(d, m) ∈ C

¹ Note that the absence of the axiom of choice is necessary to keep compatibility with topos theory (see [5] for more details).
differ because the proposition C in the ∃-elimination rule cannot depend on a proof of (∃ x ∈ A)B(x). This apparently small limitation is enough to make the axiom of choice non-deducible. Here by "the axiom of choice" we mean the following proposition:

    (∀x ∈ A)(∃y ∈ B(x))C(x, y) → (∃f ∈ (Π x ∈ A)B(x))(∀x ∈ A)C(x, f(x))    (3)

(where f(x) stands for Ap(f, x), the element of B(x) which is obtained by applying the function f to the input x in A). On the contrary, the set

    (Π x ∈ A)(Σ y ∈ B(x))C(x, y) → (Σ f ∈ (Π x ∈ A)B(x))(Π x ∈ A)C(x, f(x))    (4)

can be proved to be inhabited. The reason is that the second (or right) projection can be defined with respect to Σ, but not with respect to ∃. Provided that c is an element of (Σ x ∈ A)B(x) (respectively (∃ x ∈ A)B(x)), the first (or left) projection, written p(c), is the element ElΣ(c, m) (respectively El∃(c, m)) obtained by elimination with A in place of C and x in place of m(x, y); of course, p(<a, b>) = a by equality. The second projection is obtained, in the case of Σ, by taking m(x, y) to be y; this forces C(x) to be B(x). Hence, this technique cannot be used in the case of ∃. Summing up, from a proof c ∈ (∃x ∈ A)B(x) we are able to construct an element p(c) ∈ A which, at a metalinguistic level, can be seen to satisfy B; nevertheless, we are not able to construct a proof of B(p(c)) within the system mTT. This fact is intimately related to the fact that, even if the axiom of choice is non-deducible within the system, it does hold at a metalinguistic level as long as the pure system mTT is considered; this happens because of our constructive interpretation of quantifiers. Of course, not all extensions of mTT (e.g. topos theory) share this property, and hence we cannot expect to prove the axiom of choice within our system. By the way, note that the usual logical rule of ∃-elimination can be obtained from the above one by suppressing all proof terms; so:
    C prop    (∃ x ∈ A)B(x) true    C true [x ∈ A, B(x) true]
    ─────────────────────────────────────────────────────────  (5)   logical ∃-elimination
    C true
This rule says that if we want to infer C (which does not depend on x ∈ A) from (∃x ∈ A)B(x), then we can assume to have an arbitrary x ∈ A and a proof of B(x); of course, that does not mean we are using the first and second projections. We take the occasion to warn the reader that we will often use a =_S b, or simply a = b, instead of the proposition Id(S, a, b), provided that S is a set; this proposition, however, has to be kept distinct from the judgement a = b ∈ S. Provided that A set, B set, f ∈ A → B and a ∈ A, we often write f(a) instead of Ap(f, a).
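The contrast between (3) and (4) has a simply typed shadow that can be run: when pairs come with both projections, as Σ does, a choice function falls out mechanically. The sketch below is ours and, being simply typed, it cannot express the mTT distinction in which El∃ admits no second projection; it only illustrates why the Σ version (4) is inhabited.

```python
def sigma_ac(f):
    """From f mapping each x to a pair (witness, proof), extract a choice
    function and a proof map, using the first and second projections."""
    return (lambda x: f(x)[0],   # the choice function, via the projection p
            lambda x: f(x)[1])   # the proof map, via the second projection

# for every x there is a y (namely x + 1) together with a "proof" token:
f = lambda x: (x + 1, "proof that " + str(x + 1) + " > " + str(x))
choose, justify = sigma_ac(f)
print(choose(3))   # -> 4
print(justify(3))  # -> proof that 4 > 3
```

In mTT this construction goes through for (Σ x ∈ A)B(x) but is blocked for (∃ x ∈ A)B(x), because the ∃-elimination rule forbids the eliminating proposition from depending on the proof being eliminated.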
F. Ciraulo and G. Sambin

3 A Constructive Concept of Finiteness
In the framework of mTT, as in other constructive approaches, a collection of objects is called a set when, roughly speaking, we have rules to construct such objects; we reserve the word "element" for an object of a set. It is common practice to distinguish intensional sets from extensional sets (also called setoids), which are (intensional) sets endowed with an equivalence relation. Even if the definitions in the present paper are formulated with regard to sets, they can easily be extended to setoids: it is enough to replace the propositional equality by the equivalence relation of the setoid. Thus the natural framework in which to set the following results would be the extensional level of the minimal type theory (see [4]). An example of an (intensional) set is N, the set of formal natural numbers.

N-formation
  ─────   (6)
  N set

N-introduction
                  n ∈ N
  0 ∈ N        ──────────   (7)
                s(n) ∈ N
N-elimination
  [z ∈ N]          [x ∈ N, y ∈ C(x)]
     ⋮                     ⋮
  C(z) set   c ∈ N   d ∈ C(0)   e(x, y) ∈ C(s(x))
  ───────────────────────────────────────────────   (8)
                R(c, d, e) ∈ C(c)
The program R (for "recursion") performs the following steps. First, it brings c to its canonical form, which will be either 0 or s(n) for some n ∈ N. In the first case it returns d ∈ C(0) (or, better, the canonical element produced by d); in the second case it evaluates R(n, d, e) and then computes e(n, R(n, d, e)).

N-equality
  d ∈ C(0)   e(x, y) ∈ C(s(x)) [x ∈ N, y ∈ C(x)]
  ──────────────────────────────────────────────
              R(0, d, e) = d ∈ C(0)

  n ∈ N   d ∈ C(0)   e(x, y) ∈ C(s(x)) [x ∈ N, y ∈ C(x)]
  ──────────────────────────────────────────────────────   (9)
       R(s(n), d, e) = e(n, R(n, d, e)) ∈ C(s(n))
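The computational behavior of R just described can be rendered as a small executable sketch. This is our own untyped model, not mTT syntax: naturals are Python integers, with s(n) = n + 1.

```python
def R(c, d, e):
    """Recursor on natural numbers: R(0, d, e) = d and
    R(s(n), d, e) = e(n, R(n, d, e)), exactly as in the equality rules."""
    if c == 0:
        return d              # canonical form 0: return the base value
    n = c - 1                 # canonical form s(n)
    return e(n, R(n, d, e))   # recursive step

def add(a, b):
    """Addition as defined in the text: a + b is R(b, a, e) with e(x, y) = s(y)."""
    return R(b, a, lambda x, y: y + 1)
```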
As usual, we write sⁿ(0) (n an informal natural number) for the canonical element of N which is obtained from 0 by n applications of s. Thus sⁿ(0) is a shorthand for the formal expression which represents the informal natural number n. When no confusion arises, we will use the symbol n instead of the formal natural number sⁿ(0). Note that, provided that n and m are two different informal natural numbers, surely the proposition Id(N, sⁿ(0), sᵐ(0)) cannot be proved within the system, as is clear by an easy metalinguistic investigation.
Finiteness in a Minimalist Foundation
Nevertheless, the proposition ¬Id(N, sⁿ(0), sᵐ(0)) is not deducible either, unless the first universe (also called the set of small sets) is defined (actually the boolean universe defined in [4] is enough). Once the above rules are given, one can define addition in the usual recursive way: let the value of e(x, y) be s(y); then the element R(b, a, e) is what is called a + b. Moreover, one can define a ≤ b as (∃c ∈ N)(a + c = b), where x = y is the proposition Id(N, x, y). Of course, the standard product and limited subtraction, like all other recursive functions, can be defined in the standard way. Another example is the definition of N(k), the standard set with k elements.

N(k)-formation
   k ∈ N          k = k′ ∈ N
  ─────────     ──────────────   (10)
  N(k) set      N(k) = N(k′)

N(k)-introduction
  n ∈ N   n < k true       n = m ∈ N   n < k true
  ──────────────────      ───────────────────────   (11)
      nk ∈ N(k)                nk = mk ∈ N(k)
These rules introduce the k canonical elements of the set N(k), namely 0k, (s(0))k, . . . , (sᵏ⁻¹(0))k, which, for the sake of brevity, we write 0k, 1k, . . . , (k − 1)k.

N(k)-elimination
  [z ∈ N(k)]       [n ∈ N, n < k true]
      ⋮                    ⋮
  C(z) set   c ∈ N(k)   cn ∈ C(nk)
  ─────────────────────────────────   (12)
   Rk(c, c0, . . . , ck−1) ∈ C(c)
where Rk is the function that brings c to its canonical form, which will be a certain nk for some n < k, and hence picks the corresponding cn.

N(k)-equality
  [z ∈ N(k)]       [x ∈ N, x < k true]
      ⋮                    ⋮
  C(z) set   n ∈ N   n < k true   cx ∈ C(xk)
  ──────────────────────────────────────────   (13)
   Rk(nk, c0, . . . , ck−1) = cn ∈ C(nk)
Note that, for n ∈ N it is possible to prove by induction the proposition (n < k) → (n = 0) ∨ (n = 1) ∨ . . . (n = (k − 1)), provided that k ∈ N is ﬁxed. Thus for x ∈ N (k), it is possible to prove the proposition (x = 0k ) ∨ (x = 1k )∨. . . ∨(x = (k − 1)k ). This implies that every quantiﬁcation over N (k) can be replaced by a ﬁnite conjunction or disjunction. More precisely, a proposition of the form (∀x ∈ N (k))P (x) is equivalent to P (0k ) & P (1k ) & . . . & P ((k − 1)k ), while (∃x ∈ N (k))P (x) is the same as P (0k ) ∨ . . . ∨ P ((k − 1)k ). Even if the axiom of choice is not deducible within the system mTT, nevertheless it holds with respect to the sets of the form N (k), in the sense of the following proposition.
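Computationally, the equivalence just stated means that quantification over N(k) reduces to a finite scan, and is therefore decidable whenever P itself is. A plain Python sketch (the function names are ours, not part of mTT):

```python
def forall_Nk(k, P):
    """(∀x ∈ N(k))P(x) as the finite conjunction P(0) & ... & P(k-1)."""
    return all(P(x) for x in range(k))

def exists_Nk(k, P):
    """(∃x ∈ N(k))P(x) as the finite disjunction P(0) ∨ ... ∨ P(k-1)."""
    return any(P(x) for x in range(k))
```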
Proposition 1. Let k ∈ N and S(x) set [x ∈ N(k)]; then the proposition

(∀x ∈ N(k))(∃a ∈ S(x))P(x, a) → (∃f ∈ T)(∀x ∈ N(k))P(x, f(x))   (14)

is deducible, where T is (Πx ∈ N(k))S(x).

Proof. Let Q(x) be (∃a ∈ S(x))P(x, a); then (∀x ∈ N(k))Q(x) is equivalent to Q(0k) & . . . & Q((k − 1)k). Thus, we can replace (∀x ∈ N(k))Q(x) with the k assumptions (∃a ∈ S(nk))P(nk, a), n = 0, . . . , k − 1. By ∃-elimination k times, we can assume P(0k, a0), . . . , P((k − 1)k, ak−1), where each ai is an element of S(ik), i = 0, . . . , k − 1. By N(k)-elimination, we can construct a family Rk(x, a0, . . . , ak−1) ∈ S(x) and then a function f ∈ (Πx ∈ N(k))S(x), namely f = λx.Rk(x, a0, . . . , ak−1), such that P(x, f(x)) holds for all x ∈ N(k). Thus the proposition (∃f ∈ (Πx ∈ N(k))S(x))(∀x ∈ N(k))P(x, f(x)) can be inferred from P(nk, an), n = 0, . . . , k − 1, and then, since it does not depend on any an, directly from (∃a ∈ S(x))P(x, a), x ∈ N(k). □

A classical definition says that a set is finite if it is not infinite, where it is infinite if there exists a one-to-one correspondence between it and one of its proper subsets. An alternative is to consider the sets of the form N(k) as prototypes of the finite sets and, hence, to call a set finite if it is in bijective correspondence with N(k), for some k ∈ N. That is just the definition given by Brouwer in [3] and then by Troelstra and van Dalen in [10]. Of course, several other notions are possible (see Section 5; see also [11]). For example, following [3], we could say that a set is (numerically) bounded if it cannot have a subset of cardinality n, for some natural number n. Otherwise, following [10], we could say that a set is finitely indexed or finitely enumerable or listable if there exists a surjective function from some N(k) onto it. From a classical point of view, that is, in the framework of Zermelo–Fraenkel set theory with choice, the above definitions turn out to be all equivalent; the same does not happen in other foundations (see [11] for counterexamples in intuitionistic mathematics).
So we have to make a choice; of course, we look for the simplest, most natural and effective one. What we do is to adopt the following (see "finitely indexed" in [10]). Provided that A set, B set and f ∈ A → B, we write f(A) = B for the proposition (∀b ∈ B)(∃a ∈ A)Id(B, b, Ap(f, a)); in other words, f(A) = B true is the judgement "f is surjective".

Definition 1 (finite set). Let S be a set; S is said to be finite if the proposition (∃k ∈ N)(∃f ∈ N(k) → S)(f(N(k)) = S), which we denote shortly by Fin(S), is true.

Proposition 2. If I is a finite set and (∃g ∈ I → S)(g(I) = S) is true, then S is finite.

Proof. The proof is quite obvious; however, we give a sketch of it in order to show that it can be carried out within mTT. By ∃-elimination (twice) on Fin(I), we can assume k ∈ N, f ∈ N(k) → I and f(N(k)) = I. Again by ∃-elimination, we can assume g ∈ I → S and g(I) = S. The function λx.g(f(x)) ∈ N(k) → S is surjective and thus Fin(S) is true, regardless of the particular k, f and g. □
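The content of Proposition 2 is just that the composite of two surjections is a surjection; a hedged sketch in plain Python, with N(k) modeled as range(k) (our names, not mTT syntax):

```python
def image(f, k):
    """The image f(N(k)) of f on {0, ..., k-1}."""
    return {f(x) for x in range(k)}

def compose(g, f):
    """λx.g(f(x)); if f is onto I and g is onto S, the composite
    is onto S, which is what makes S finite in Proposition 2."""
    return lambda x: g(f(x))
```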
It is also possible to give the notion of unary set, i.e. a set with at most one element; trivially, every unary set is finite too.

Definition 2 (unary set). Let S be a set; we say that S is unary if the proposition (∃k ∈ N)(k ≤ s(0) & (∃f ∈ N(k) → S)(f(N(k)) = S)) is true.

Given a set I and a set-indexed family of sets S(i) set [i ∈ I], it is possible to construct their indexed sum (or disjoint union), written (Σi ∈ I)S(i). Its canonical elements are couples of the kind < i, a > with i ∈ I and a ∈ S(i). The following lemma and the subsequent proposition say that finite sets have the expected behavior with respect to indexed sums.

Lemma 1. Let k ∈ N and n(x) ∈ N [x ∈ N(k)]; then (Σx ∈ N(k))N(n(x)) is finite.

Proof. Let m = n(0k) + n(1k) + . . . + n((k − 1)k) ∈ N and consider the function f ∈ N(m) → (Σx ∈ N(k))N(n(x)) defined by the following m conditions:

  0m              ↦  < 0k, 0n(0k) >
  1m              ↦  < 0k, 1n(0k) >
  ⋮
  (n(0k) − 1)m    ↦  < 0k, (n(0k) − 1)n(0k) >
  (n(0k))m        ↦  < 1k, 0n(1k) >
  ⋮
  (m − 1)m        ↦  < (k − 1)k, (n((k − 1)k) − 1)n((k−1)k) >   (15)

The idea is simple: we perform k stages: first we enumerate the n(0k) elements of N(n(0k)), then the n(1k) elements of N(n(1k)), and so on, until we reach the last element of N(n((k − 1)k)). □

Proposition 3. Let A(i) set [i ∈ I] be a finite set-indexed family of finite sets, that is, let I and each of the A(i) be finite. Then (Σi ∈ I)A(i) is finite.

Proof. By ∃-elimination on Fin(I), we can assume k ∈ N, f ∈ N(k) → I and f(N(k)) = I. Firstly, let Q(i) prop [i ∈ I] be an arbitrary propositional function over I. From f ∈ N(k) → I we can infer (∀i ∈ I)Q(i) → (∀x ∈ N(k))Q(f(x)) true. Also, from f(N(k)) = I we can infer (∀x ∈ N(k))Q(f(x)) → (∀i ∈ I)Q(i). Thus (∀i ∈ I)Q(i) is equivalent to (∀x ∈ N(k))Q(f(x)), provided that the assumptions at the very beginning of the proof hold. Now let Q(i) ≡ Fin(A(i)) ≡ (∃n ∈ N)(∃g ∈ N(n) → A(i))(g(N(n)) = A(i)). Thus (∀i ∈ I)Fin(A(i)) is equivalent to (∀x ∈ N(k))Fin(A(f(x))). Hence, by Proposition 1 applied twice, we can infer the existence of n ∈ N(k) → N and g ∈ (Πx ∈ N(k))(N(n(x)) → A(f(x))) such that g(x)(N(n(x))) = A(f(x)), that is, g(x) is surjective, for all x ∈ N(k).
Let h ∈ (Σx ∈ N(k))N(n(x)) → (Σi ∈ I)A(i) be the function defined by h ≡ λz. < f(p(z)), g(p(z))(q(z)) >, that is, h(< x, y >) = < f(x), g(x)(y) >. The function h is surjective and the thesis follows by the previous lemma and Proposition 2. □

As a corollary one gets that the cartesian product A × B of two finite sets is finite too. Beside the Σ operator, another common constructor for sets is the so-called dependent (or cartesian) product, written Π, which includes the set of functions between two sets as a special case. The canonical elements of (Πi ∈ I)S(i) are functions of the kind λx.f(x) with x ∈ I and f(x) ∈ S(x). The behavior of Π with respect to finiteness is described in the following lemma and proposition.

Lemma 2. Let k ∈ N and n(x) ∈ N [x ∈ N(k)]; then (Πx ∈ N(k))N(n(x)) is finite.

Proof. Let m = n(0k) · n(1k) · . . . · n((k − 1)k) ∈ N and consider the function f ∈ N(m) → (Πx ∈ N(k))N(n(x)) defined by m conditions which enumerate all the possible tuples of values (we suppress indexes): f(0m) is the function mapping every x to 0; the subsequent values are then increased one at a time, in the manner of a mixed-radix counter, until f((m − 1)m), the function mapping every x to n(x) − 1.   (16)

Proposition 4. Let k ∈ N and A(x) set [x ∈ N(k)] be a family of finite sets indexed by N(k). Then (Πx ∈ N(k))A(x) is finite.

Proof. As in the proof of the previous proposition, from (∀x ∈ N(k))Fin(A(x)) we can construct n ∈ N(k) → N and g ∈ (Πx ∈ N(k))(N(n(x)) → A(x)) such that g(x)(N(n(x))) = A(x), that is, g(x) is surjective, for all x ∈ N(k). Let h ∈ (Πx ∈ N(k))N(n(x)) → (Πx ∈ N(k))A(x) be the function defined by λz.(λx.g(x)(z(x))), that is, h(λx.f(x)) = λx.g(x)(f(x)). Note that it is surjective and apply the previous lemma and Proposition 2. □
As a corollary, the set of functions N (k) → S is ﬁnite provided that k ∈ N and S is a ﬁnite set.
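Computationally, Lemmas 1 and 2 are just the enumeration of a disjoint union and of a product of finite sets, in the stage-by-stage and counter-like orders described in the proofs. A plain Python sketch (our names, not mTT syntax):

```python
from itertools import product

def enum_sigma(k, n):
    """Enumerate (Σ x ∈ N(k))N(n(x)) as pairs <x, y>, stage by stage
    as in Lemma 1: first the n(0) elements over 0, then those over 1, ..."""
    return [(x, y) for x in range(k) for y in range(n(x))]

def enum_pi(k, n):
    """Enumerate (Π x ∈ N(k))N(n(x)) as tuples of values, running through
    all n(0)·...·n(k-1) combinations as in Lemma 2."""
    return list(product(*[range(n(x)) for x in range(k)]))
```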
Note that we cannot generalize the previous proposition to the case of an arbitrary finite set I in place of N(k). That is so because proving finiteness of (Πi ∈ I)A(i) would require constructing a partial inverse of the surjective function giving finiteness of I (think of the special case I → A). The point is that such a partial inverse cannot be constructed if the axiom of choice is missing. Hence, finiteness of A → B does not follow from finiteness of A and B. Incidentally, note that if both the axiom of choice and the first universe (or, simply, the boolean universe of [4]) are adopted, then a finite set becomes exactly a set that can be put in bijective correspondence with some N(k) (see Brouwer's definition of finite set in [3]). In fact, if S is finite, then there exists an onto map f : N(k) → S; thus, by choice, we can construct a partial inverse, say g, whose image is in one-to-one correspondence with S. Now, since equality in N(k) is decidable (thanks to the existence of either the first or the boolean universe), we can count the elements of g(S); let n be this number. Then it is possible to construct a bijection between N(n) and S.
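The closing argument can be made concrete in a setting where equality is decidable (as Python's `==` is): from an onto map one extracts a repetition-free enumeration, i.e. a bijection with some N(n). A hedged sketch, with names of our own choosing:

```python
def bijection_with_Nn(f, k):
    """From an onto map f : N(k) -> S, build n and a bijection N(n) -> S
    by discarding repeated values; the membership test plays the role of
    decidable equality on S, and the kept values that of the partial inverse."""
    seen = []
    for x in range(k):
        if f(x) not in seen:   # decidable equality on S
            seen.append(f(x))
    n = len(seen)              # the number of elements of S
    return n, (lambda i: seen[i])
```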
4 Finite Subsets
Before turning our attention to finite subsets, we have to introduce the notion of subset we are going to use. Following [9], a subset of a given set S is represented by a first-order (that is, with variables ranging only over sets) propositional function with at most one free variable over that set. A propositional function over S is of the kind U(x) prop [x ∈ S]; thus U(x) is a proposition provided x ∈ S. We write U, or also {x ∈ S : U(x)}, when we think of it as a subset of S, and U ⊆ S to express that U is a subset of S. The membership relation between an element a ∈ S and a subset U ⊆ S is written a εS U (or a ε U when no confusion arises) and is defined as the proposition Id(S, a, a) & U(a), where U(x) is a propositional function which represents U. Note that a ε U is a proposition provided that a ∈ S and U ⊆ S; that is,

(a εS U) prop [a ∈ S, U(x) prop [x ∈ S]].   (17)

Hence, a ε U is not a judgement, but only a proposition; moreover, a ε U is true exactly when U(a) is true and a ∈ S. Thus, from (a ε U) true we can derive the judgement a ∈ S; note that the proposition Id(S, a, a) is introduced just to keep track of the element a, since U(a) alone could lose the information about it (see [9] for further explanations). Given two subsets, say U and V, i.e. two propositional functions over S, we say that U is included in V when the proposition (∀x ∈ S)(x ε U → x ε V), written U ⊆ V, is true. Of course, U = V is the proposition (U ⊆ V) & (V ⊆ U); hence, equality between subsets is extensional; in other words, a subset is a class of equivalent propositional functions. Important examples of subsets are: the empty subset, written ∅, which corresponds to the propositional function ⊥ prop [x ∈ S], where ⊥ is the false proposition; the total subset, denoted simply by S, which corresponds to the always-true propositional function ⊤ prop [x ∈ S];
the singletons {a} for a ∈ S, i.e. the propositional functions x = a (or, better, Id(S, x, a) prop [x ∈ S]).² Finally, operations on subsets are defined by reflecting the corresponding connectives (of intuitionistic logic). For example, provided that U and V are represented by U(x) and V(x) respectively, U ∩ V is represented by the propositional function U(x) & V(x); in other words, x ε U ∩ V if and only if (x ε U) & (x ε V). Infinitary operations are also available, such as the union of a set-indexed family of subsets (more details can be found in [9]). Note that an operation corresponding to implication is also definable; in particular, given a subset U represented by U(x), we denote by −U the subset represented by ¬U(x) ≡ U(x) → ⊥. We write PS for the collection of all subsets of the set S; it is surely not a set in the framework of mTT: assuming the powerset axiom would break compatibility with predicative foundations, e.g. Martin-Löf's type theory (for more details, see [5]). To be precise, PS is an extensional collection, namely the quotient over logical equivalence of the collection of all propositional functions over S. Subsets of a set S can be identified with images of functions with S as their codomain. In fact, let U(x) prop [x ∈ S]; then it is also U(x) set [x ∈ S]; thus we can construct (Σx ∈ S)U(x) and the function λx.p(x) : (Σx ∈ S)U(x) → S, where p is the first projection; that is, provided that a ∈ S and π ∈ U(a), we map each < a, π > to a. Thus an element a ∈ S is in the image of λx.p(x) if and only if there exists a proof π of U(a). Vice versa, provided that I set and f : I → S, the propositional function (∃i ∈ I)(x = f(i)) prop [x ∈ S] defines a subset of S which is exactly the image of I under f. Following the same pattern as for sets, we give the following.
Definition 3 (finite (unary) subset). A subset K of a set S is finite (unary) if it is the image of a function f : N(k) → S, for some k ∈ N (respectively, for k = 0, 1); that is, K can be represented by the propositional function (in the free variable x): (∃i ∈ N(k))(x = f(i)). The collection of all finite (unary) subsets of S is denoted by Pω S (P1 S, respectively).

It follows directly from the definition that every unary subset is finite too; so we can think of P1 S as included in Pω S. Trivially, S (in the sense of the total subset) belongs to Pω S (P1 S) if and only if S is finite (unary) as a set. The above definition is just the same as in [9] and coincides with the notion of finitely indexed as given in [10]. Until the end of this section, we will prove some basic properties of finite (and unary) subsets. First of all, we give a natural characterization in terms of finite sequences of elements, i.e. lists. So we need to introduce the set constructor List, which is defined by the following rules.

List-formation
     S set
  ───────────   (18)
  List(S) set

² Note that, since the symbol S can represent both a set and the total subset of that set, U ⊆ S can denote both a judgement and a proposition. Which case occurs will be clear from the context.
List-introduction
                      l ∈ List(S)   a ∈ S
  nil ∈ List(S)      ─────────────────────   (19)
                     cons(l, a) ∈ List(S)

that is, lists are recursively constructed starting from the empty list nil and adding elements of S one at a time (cons can be thought of as the function that attaches an element at the end of a given list).

List-elimination
  [x ∈ List(S)]       [x ∈ List(S), y ∈ S, z ∈ A(x)]
       ⋮                          ⋮
  A(x) set   l ∈ List(S)   a ∈ A(nil)   f(x, y, z) ∈ A(cons(x, y))
  ────────────────────────────────────────────────────────────────   (20)
                       LR(a, f, l) ∈ A(l)

that is, if we have an element of A(nil) and, every time we know an element of A(x), we can construct (by means of the function f) an element of A(cons(x, y)), then we are able to construct an element of A(l) for every list l. In other words, we have a function LR (for "list recursion") that for every list l returns a value in A(l), depending on the method f and the starting value a ∈ A(nil). An important consequence of this last rule is that we can use induction when proving a property of lists. Remember that every proposition is also a set (an element is just a proof, a verification); then the elimination rule yields:

  [x ∈ List(S)]       [x ∈ List(S), y ∈ S, P(x) true]
       ⋮                          ⋮
  P(x) prop   l ∈ List(S)   P(nil) true   P(cons(x, y)) true
  ──────────────────────────────────────────────────────────  induction   (21)
                        P(l) true
LR(a, f, cons(l, b)) = f (l, b, LR(a, f, l)).
(22)
These conditions can be read, of course, as a recursive definition of the function LR. In order to continue our simultaneous treatment of finite and unary subsets, we need to define the set of sequences of length at most one, written List1(S). The rules for it are obtained as a slight modification of those for List(S) and hence we do not write them all down in detail. We give only the introduction rules, as an example:

                            a ∈ S
  nil ∈ List1(S)     ──────────────────────   (23)
                     cons(nil, a) ∈ List1(S)
Even if not formally correct, we can conceive of List1(S) as included in List(S) in order to avoid tedious distinctions.
Sometimes we write [ ] instead of nil, [a] instead of cons(nil, a), [a, b] instead of cons(cons(nil, a), b), and so on. List(S) can be endowed with a binary operation, called concatenation and written ∗, which is recursively defined by the following clauses:

  l ∗ nil =def l
  l ∗ cons(m, a) =def cons(l ∗ m, a)   (24)

where l, m are in List(S) and a ∈ S. Finally, we would like to define a function dec (for "deconstruct") from List(S) to PS; of course, formally we will have

  (dec(l)(x) prop [x ∈ S]) [l ∈ List(S)]   (25)

because we want dec(l) to be a subset of S (that is, a propositional function over S) for any l ∈ List(S). Let us define dec recursively as follows³:

  dec(nil)(x) ≡ ⊥(x)
  dec(cons(l, a))(x) ≡ dec(l)(x) ∨ Id(S, x, a)   (26)

Proposition 5 (a characterization of finite (unary) subsets). Let S be a set and K ⊆ S. Then K is finite (unary) if and only if there exists l ∈ List(S) (respectively, l ∈ List1(S)) such that K is equal to dec(l) in PS.

Proof. Suppose K is finite (unary); then, by ∃-elimination, we can assume to know a number k (respectively, k = 0, 1) and a function f from N(k) to S such that K = f(N(k)). Now consider the list lf = [f(0k), . . . , f((k − 1)k)] and remember that (∃i ∈ N(k))(x = f(i)) is equivalent to (x = f(0k)) ∨ (x = f(1k)) ∨ . . . ∨ (x = f((k − 1)k)). In other words, x ε f(N(k)) and x ε dec(lf) are equivalent. Vice versa, by ∃-elimination again, we can assume to have a list, say l = [a0, . . . , ak−1], whose length is k and such that dec(l) = K. Then we can define a surjection fl from N(k) to S by prescribing the k conditions fl(nk) =def an, for n = 0, . . . , k − 1. Now it is easy to see the equivalence between x ε fl(N(k)) and x ε dec(l). □

The previous proposition can be stated informally by saying that Pω S = (dec(List(S)), ↔) and P1 S = (dec(List1(S)), ↔); in other words, Pω S and P1 S are set-indexed extensional families. It is also possible to define Pω S as a setoid, that is, a quotient set. Indeed, let ∼ be the relation over List(S) defined by l1 ∼ l2 if dec(l1) ↔ dec(l2). One sees at once that ∼ is an equivalence relation; hence we can consider the setoid (List(S), ∼). So Pω S can be identified with (List(S), ∼); a similar argument holds for P1 S and (List1(S), ∼). The general idea is that a finite subset is obtained from a list by forgetting (that is, by abstracting from) the order and multiplicity with which items appear in it. The fact that Pω S and P1 S are set-indexed families allows us to treat them almost like sets.
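List recursion, concatenation and dec can all be sketched executably; this is our own model (not mTT syntax), with lists as Python tuples, cons appending at the end, and subsets as extensional sets of items:

```python
def LR(a, f, l):
    """List recursion: LR(a, f, nil) = a and
    LR(a, f, cons(l, b)) = f(l, b, LR(a, f, l)), cf. (22)."""
    if not l:
        return a
    init, b = l[:-1], l[-1]          # l = cons(init, b)
    return f(init, b, LR(a, f, init))

def dec(l):
    """dec(nil) = ∅; dec(cons(l, a)) = dec(l) ∨ (x = a), cf. (26):
    the subset of items of l, forgetting order and multiplicity."""
    return LR(frozenset(), lambda _l, a, u: u | {a}, l)

def concat(l, m):
    """Concatenation l * m, by recursion on m, cf. (24)."""
    return LR(l, lambda _m, a, r: r + (a,), m)
```

Note how `dec(concat(l, m))` is the union of `dec(l)` and `dec(m)`, which is exactly the argument used below for closure of finite subsets under union.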
³ We write ⊥(x) to emphasize that we look at ⊥ as a propositional function over S.

First of all, we can quantify over them; in fact, every quantification intended over finite subsets of S can be given a constructive meaning
by quantifying over the set List(S) and then using the function dec. In particular, an expression like "(∀K ∈ Pω S)(. . . K . . .)" is a shorthand for "(∀l ∈ List(S))(. . . dec(l) . . .)"; similarly for ∃. Of course, a proposition over Pω S, say P(K), has to be thought of as a proposition over List(S) of the kind Q(l) such that Q(l1) ↔ Q(l2) true if dec(l1) ↔ dec(l2) true. Moreover, it is possible to use Pω S to construct new setoids. For example, List(Pω S) can be defined as the setoid (List(List(S)), ≈), where {l0, . . . , ln−1} ≈ {k0, . . . , km−1} is Id(N, n, m) & (∀i ∈ N)((i < n) → (dec(li) ↔ dec(ki))). However, if one is interested in constructing new objects based on Pω S, then it is more convenient to define Pω S (and P1 S similarly) by adding to the rules of List(S) the following ones (and by modifying the elimination rule in order to take the new equality into account; see [7]):

exchange
          a ∈ S   b ∈ S   l ∈ Pω S
  ──────────────────────────────────────────────   (27)
  cons(cons(l, a), b) = cons(cons(l, b), a) ∈ Pω S

contraction
            a ∈ S   l ∈ Pω S
  ────────────────────────────────────   (28)
  cons(cons(l, a), a) = cons(l, a) ∈ Pω S
It is easy to show that these two rules are enough to force two canonical elements to be equal when they are formed by the same items, regardless of order and repetitions. Thus, for a and b in S, we can infer that [a, b, b, a, b] and [a, b] are equal elements of Pω S. Of course, this does not mean that the equality in Pω S is decidable; for example, we can infer [a] = [b] ∈ Pω S only if a = b ∈ S. In other words, the equality in Pω S is decidable if and only if that of S is. The usual way of dealing with finite subsets can be reconstructed by means of suitable definitions and derived rules. As an example, let us consider the notion of membership. The idea is that an element a ∈ S belongs to l ∈ Pω S if the assumption a ∈ S has been used in the construction of l. However, by exchange and contraction, one may assume that a is the last item in l. So one can put

  a ε l ≡ (∃m ∈ Pω S)(l = cons(m, a)).   (29)
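In a setting where equality on S is decidable (Python's `==` plays that role here), the equality forced by exchange and contraction, and the membership relation (29), become computable; a hedged sketch with names of our own:

```python
def pw_eq(l1, l2):
    """Equality of two elements of Pω S in the list representation:
    by exchange and contraction, l1 = l2 iff they are formed by the
    same items, regardless of order and repetitions."""
    return set(l1) == set(l2)

def member(a, l):
    """a ε l: a was used in the construction of l, cf. (29)."""
    return a in set(l)
```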
If Pω is seen as a constructor, then it is possible to construct Pω Pω S and so on. In [12] a proof can be found of the fact that Pω S is finite, provided that S is finite. However, we prefer to keep our original definition and look at Pω S as an extensional set-indexed collection of subsets of S. The main reason is that we are thus allowed to apply to finite subsets all the operations of PS, even if Pω S is not closed under them.

Proposition 6 (basic properties of finite and unary subsets). Let Pω S and P1 S be the collections of all finite and unary subsets of a set S; then:
i) ∅ belongs to P1 S;
ii) {a} belongs to P1 S for any a ∈ S;
iii) K ∪ L ∈ Pω S, for all K and L belonging to Pω S.
Proof. For i) and ii) consider the lists nil and cons(nil, a), respectively. With regard to iii), take the concatenation of two lists corresponding to K and L and note that dec(l ∗ m) = dec(l) ∨ dec(m); in other words, the concatenation of two lists corresponds to the union of the corresponding finite subsets. □

Proposition 7 (induction principle for finite subsets). Let P(K) be a predicate over Pω S such that:
1. P(∅) holds;
2. P(L) implies P(L ∪ {a}), for any a ∈ S and L in Pω S;
then P(K) holds for every K in Pω S.

Proof. Note that hypotheses 1 and 2 can be rewritten as P(dec(nil)) and (∀l ∈ List(S))(∀a ∈ S)(P(dec(l)) → P(dec(cons(l, a)))), while the thesis is (∀l ∈ List(S))P(dec(l)). Thus the statement is just a reformulation of the induction rule for lists with respect to the proposition Q(l) ≡ P(dec(l)). □

Proposition 8. Let S be a set and K ⊆ S be finite (possibly unary). Then it is decidable whether K is empty or inhabited.

Proof. We prove (K = ∅) ∨ (∃a ∈ S)(a ε K) by induction on Pω S. If K = ∅, then we are done. Now suppose the statement is true for K and consider the subset K ∪ {a}, for a ∈ S; of course, a ε K ∪ {a} and the proof is complete. □

Proposition 6 says that (Pω S, ∪, ∅) is the sup-semilattice generated by the singletons. In general, the intersection of two finite (unary) subsets cannot be proved to be finite (unary) too. This phenomenon corresponds to the fact that we cannot find the common elements of two given lists unless the equality relation in the underlying set S is decidable.⁴

Proposition 9. Let S be a set. The following are equivalent:
1. the equality in S is decidable;
2. {a} ∩ {b} is finite, for all a and b in S;
3. K ∩ L is finite for all finite K and L;
4. {a} ∪ −{a} = S for all a ∈ S;
5. K ∪ −K = S for all finite K.
Proof. 1 ⇒ 2. If a = b holds, then {a} ∩ {b} is equal to {a}, which is finite; instead, if a ≠ b, then a cannot belong to {b} and {a} ∩ {b} is empty, hence finite.
2 ⇒ 3. Assume K = {a0, . . . , an−1} and L = {b0, . . . , bm−1}; then K = {a0} ∪ . . . ∪ {an−1}, while L = {b0} ∪ . . . ∪ {bm−1}. Thus K ∩ L = ∪i,j({ai} ∩ {bj}), by distributivity of ∩ with respect to ∪; thus it is finite, since it is the union of finitely many (namely n · m) finite subsets.

⁴ Of course, the equality in S is decidable if the proposition (∀a ∈ S)(∀b ∈ S)(Id(S, a, b) ∨ ¬Id(S, a, b)) is true.
3 ⇒ 4. Let a, b ∈ S; then {a} and {b} are finite; thus {a} ∩ {b} is finite and it is decidable whether it is empty or inhabited. If the former holds, then b cannot belong to {a}, thus b ε −{a}. Instead, if the latter holds, then the proposition (∃c ∈ S)(c ε {a} ∩ {b}) yields a = b; so b ε {a}. Since b was arbitrary, {a} ∪ −{a} is the whole of S.
4 ⇒ 5. If K = {a0} ∪ . . . ∪ {an−1}, then −K = −{a0} ∩ . . . ∩ −{an−1}. By distributivity of ∪ with respect to ∩, K ∪ −K can be seen as the intersection of n subsets of the kind K ∪ −{ai}, for i = 0, . . . , n − 1. Each of them is a union of n + 1 subsets and contains {ai} ∪ −{ai}, which is S by hypothesis; hence K ∪ −K = S.
5 ⇒ 1. Let a and b be two arbitrary elements of S. Since {b} is finite, {b} ∪ −{b} = S; thus a belongs to it. In other words, (a = b) ∨ (a ≠ b) holds. □

Thus, provided that the equality of S is decidable, Pω S is closed under intersection. On the contrary, an arbitrary subset of a finite (unary) subset is not forced to be finite (unary) too, even in the case of a decidable equality. For let P be an arbitrary proposition and read P & Id(N(1), x, 01) as a propositional function over the finite set N(1). If the subset {x ∈ N(1) : P & (x = 01)} were finite, then we could decide whether it is empty or inhabited: in the first case ¬P would hold, otherwise P would; in other words, we could prove the law of excluded middle. Moreover, note that the above argument holds even if the existence of the first universe is assumed (so that the equality of N(1) is decidable). Thus we have proved the following.

Proposition 10. In the framework of mTT, the statement that every subset of a finite (sub)set is also finite is equivalent to the full law of excluded middle.

We conclude the present section with a property of finite subsets which was used both in [2] and [12] to prove constructive versions of Tychonoff's theorem.

Proposition 11. Let S be a set and K, V, W subsets of S. If K ⊆ V ∪ W and K is finite (unary), then there exist V0 ⊆ V and W0 ⊆ W, both finite (unary), such that K = V0 ∪ W0.

The above proposition looks intuitively clear: take V0 and W0 to be K ∩ V and K ∩ W respectively. But formally we cannot follow this road, because a part of a finite subset is not finite, in general (previous proposition).

Proof. Let us start from the unary case. We can effectively decide whether K is the empty subset or a singleton. In the first case take V0 = ∅ = W0. Otherwise we have K = {a} for some a and, moreover, a ε V ∪ W; so either a ε V or a ε W. In the first case take V0 = {a} and W0 = ∅; W0 = {a} and V0 = ∅, otherwise. In order to prove the statement in the finite case, we use induction. If K = ∅ we can take V0 = ∅ = W0. Now assume the theorem to be true for K and prove it for K ∪ {a}. So, let K ∪ {a} ⊆ V ∪ W. Then K ⊆ V ∪ W and by inductive hypothesis there exist V0 ⊆ V and W0 ⊆ W, both finite and satisfying K = V0 ∪ W0; hence K ∪ {a} = V0 ∪ W0 ∪ {a}. On the other hand, we know that a ε V ∪ W: so, if a ε V, then we can take V0 = V0 ∪ {a} and W0 = W0, while if
a ε W, then we take V0 = V0 and W0 = W0 ∪ {a} (if a belongs to both, then both choices are good). □

A remark on the last part of the previous proof may be useful. Even if we proceed by cases starting from a ε V ∪ W, this does not mean we are assuming that we can decide whether a ε V or a ε W; what we are doing is just an application of the logical rule called "elimination of disjunction". Thus, the effective construction of V0 and W0 strongly depends on the degree of constructiveness of the hypothesis K ⊆ V ∪ W. The previous proposition, combined with the fact that we are always able to decide whether a finite subset is inhabited or not, yields the following corollary (see [12]).

Corollary 1. Let P(x) and Q(x) be two propositional functions over a set S and let K ⊆ S be a finite subset such that for every x ε K either P(x) or Q(x) holds. Then either P(x) holds for every x ε K or there exists some x ε K such that Q(x) holds.

In fact, from K ⊆ P ∪ Q we can infer, as in the previous proposition, K = P0 ∪ Q0, and then decide whether Q0 is empty or inhabited.
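On the list representation, Proposition 11 has a direct computational reading: given, for each listed element, a tag saying which disjunct it falls under, one splits the list accordingly. A hedged sketch in Python, where the tagging function `side` stands for the constructive content of the hypothesis K ⊆ V ∪ W (our names, not part of the paper's formalism):

```python
def split(l, side):
    """Split a list l representing K into (v0, w0) with
    dec(v0) ⊆ V, dec(w0) ⊆ W and dec(l) = dec(v0) ∪ dec(w0);
    side(a) returns 'V' or 'W' for each item, playing the role
    of the ∨-elimination step in the proof of Proposition 11."""
    v0, w0 = [], []
    for a in l:
        (v0 if side(a) == 'V' else w0).append(a)
    return v0, w0
```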
5 Some Other Notions of Finiteness

The notion of finite (sub)set we have adopted throughout the present paper looks like the most natural one and, in fact, it is used by many authors (see [2] and [12]), including the present ones (see [1]). On the other hand, such a notion lacks some desirable properties, such as closure under intersection. Hence, one can look for other definitions which enjoy the desired properties. Here we give a brief list of possible alternative notions, each accompanied by a short report on its properties and disadvantages. For each of the following notions about subsets, a corresponding definition for sets can be obtained by identifying each set with its total subset.

Definition 4 (subfinite; see [10]). U ⊆ S is subfinite if U ⊆ K is true, for some K which is finite according to Definition 3.

The collection of all subfinite subsets is closed both under (arbitrary) intersections and finite unions; on the other hand, it is not set-indexed, in general. Moreover, the computational content carried by a subfinite subset is very poor; for instance, it is not possible to decide its emptiness.

Definition 5 (bounded; see [3]). U ⊆ S is bounded if there exists k ∈ N such that

  (∀f ∈ N(k) → S)( f(N(k)) ⊆ U → (∃i, j ∈ N(k))(i ≠ j & f(i) = f(j)) )   (30)

is true; that is, there cannot exist an injective map from N(k) into U (i.e. U has fewer than k elements).
Finiteness in a Minimalist Foundation
67
Contrary to the case of finite subsets, which are always represented by propositional functions of the form (∃i ∈ N(k))(x = f(i)), for some k ∈ N and f ∈ N(k) → S, it appears quite difficult to characterize the propositional functions corresponding to bounded subsets. Also, answering the question whether the collection of all bounded subsets is set-indexed or not seems a hard task. This is surely due to the negative, indirect character of this definition.

Proposition 12. Let S be a set and U, V ⊆ S; then:
1. if U is finite, then U is bounded;
2. if U ⊆ V and V is bounded, then U is bounded;
3. if U is subfinite, then U is bounded.

Proof. If U is finite, then there exists a number k such that U has at most k elements; so U has less than k + 1 elements and hence it is bounded. If V is bounded, then there exists a number k such that no f from N(k) to V can be injective. Consider an arbitrary function f from N(k) to U; of course, it can be seen as a map that takes its values in V, hence it cannot be injective, and U is bounded. If U is subfinite, then there exists K ⊆ S such that K is finite and U ⊆ K. By item 1, K is bounded; by item 2, U is bounded.

Item 1 in the previous proposition says that every finite (sub)set is bounded. On the contrary, it cannot be formally proved that a bounded (sub)set is finite: classical logic seems necessary. Finally, an interesting generalization of the notion of finite subset is the following one, which was proposed to us by Silvio Valentini.

Definition 6 (semifinite). U ⊆ S is semifinite if:

(x ∈ U) ↔ ∨_{j∈J} (&_{i∈I(j)} x = a_{ji}),   (31)

where a_{ji} ∈ S and both the set J and each I(j), j ∈ J, are of the form N(k). Of course, the a_{ji} in the definition above has to be thought of as a map from (Σj ∈ J)I(j) to S. The collection of all semifinite subsets is closed under finite intersections and unions; moreover, provided that the equality in S is decidable, semifiniteness collapses to finiteness. Note that a semifinite subset can be seen, by distributivity, as the intersection of a certain finite family of finite subsets. In other words, as Pω S is the ∪-semilattice generated by singletons, so the collection of all semifinite subsets is the lattice generated by them (with respect to intersection and union). Note also that semifinite subsets form a family indexed by the set List(List(S)). However, it is no longer decidable whether a semifinite subset is empty or not; in other words, with respect to this definition, Proposition 8 (and hence Proposition 11 and Corollary 1) fails.

Acknowledgments. The authors thank Maria Emilia Maietti for a lot of essential suggestions she gave them during long and dense discussions.
F. Ciraulo and G. Sambin
References

1. Ciraulo, F., Sambin, G.: Finitary Formal Topologies and Stone's Representation Theorem. Theoretical Computer Science (to appear)
2. Coquand, T.: An Intuitionistic Proof of Tychonoff's Theorem. Journal of Symbolic Logic 57(1), 28–32 (1992)
3. van Dalen, D. (ed.): Brouwer's Cambridge Lectures on Intuitionism. Cambridge University Press, Cambridge (1981)
4. Maietti, M.E.: Quotients over Minimal Type Theory. In: Cooper, S.B., Löwe, B., Sorbi, A. (eds.) CiE 2007. LNCS, vol. 4497. Springer, Heidelberg (2007)
5. Maietti, M.E., Sambin, G.: Toward a Minimalist Foundation for Constructive Mathematics. In: Crosilla, L., Schuster, P. (eds.) From Sets and Types to Topology and Analysis: Towards Practicable Foundations for Constructive Mathematics. Oxford Logic Guides, vol. 48. Oxford University Press (2005)
6. Martin-Löf, P.: Intuitionistic Type Theory. Notes by G. Sambin of a series of lectures given in Padua, June 1980. Bibliopolis, Naples (1984)
7. Negri, S., Valentini, S.: Tychonoff's Theorem in the Framework of Formal Topologies. Journal of Symbolic Logic 62(4), 1315–1332 (1997)
8. Nordström, B., Petersson, K., Smith, J.: Programming in Martin-Löf's Type Theory. Clarendon Press, Oxford (1990)
9. Sambin, G., Valentini, S.: Building up a Toolbox for Martin-Löf Type Theory: Subset Theory. In: Sambin, G., Smith, J. (eds.) Twenty-Five Years of Constructive Type Theory. Proceedings of a Congress Held in Venice, October 1995, pp. 221–240. Oxford University Press, Oxford (1998)
10. Troelstra, A.S., van Dalen, D.: Constructivism in Mathematics: An Introduction, vol. 1. North-Holland, Amsterdam (1988)
11. Veldman, W.: Some Intuitionistic Variations on the Notion of a Finite Set of Natural Numbers. In: de Swart, H.C.M., Bergman, L.J.M. (eds.) Perspectives on Negation. Essays in Honour of Johan J. de Iongh on his 80th Birthday, pp. 177–202. Tilburg University Press, Tilburg (1995)
12. Vickers, S.J.: Compactness in Locales and in Formal Topology. In: Banaschewski, B., Coquand, T., Sambin, G. (eds.) Papers Presented at the 2nd Workshop on Formal Topology (2WFTop 2002), Venice, Italy, April 4–6, 2002. Annals of Pure and Applied Logic, vol. 137, pp. 413–438 (2006)
A Declarative Language for the Coq Proof Assistant

Pierre Corbineau

Institute for Computing and Information Science, Radboud University Nijmegen, Postbus 9010, 6500 GL Nijmegen, The Netherlands
[email protected] Abstract. This paper presents a new proof language for the Coq proof assistant. This language uses the declarative style. It aims at providing a simple, natural and robust alternative to the existing Ltac tactic language. We give the syntax of our language, an informal description of its commands and its operational semantics. We explain how this language can be used to implement formal proof sketches. Finally, we present some extra features we wish to implement in the future.
1 Introduction

1.1 Motivations
An interactive proof assistant can be described as a state machine that is guided by the user from the 'statement φ to be proved' state to the 'QED' state. The system ensures that the state transitions (also known as proof steps in this context) are sound. The user's guidance is required because automated theorem proving in any reasonable logic is undecidable in theory and difficult in practice. This guidance can be provided either through some kind of text input or using a graphical interface and a pointing device. In this paper, we will focus on the former method. The ML language developed for the LCF theorem prover [7] was a seminal work in this domain. ML was a fully-blown programming language with specific functions, called tactics, to modify the proof state; the tool itself consisted of an interpreter for the ML language. Thus a formal proof was merely a computer program. With this in mind, think of a person reading somebody else's formal proof, or even one of his/her own proofs a couple of months after having written it. Similarly to what happens with source code, this person will have a lot of trouble understanding what is going on in the proof unless he/she has a very good memory or the proof is thoroughly documented. Of course, running the proof through the prover and looking at the output might help a little. This illustrates a major inconvenience which still affects many popular proof languages used nowadays: they lack readability. Most proofs written are actually
This work was partially funded by NWO Bricks/Focus Project 642.000.501.
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 69–84, 2008. © Springer-Verlag Berlin Heidelberg 2008
P. Corbineau
write-only, or rather write-and-execute-only, since what the user is interested in when rerunning the proof is not really the input, but rather the output of the proof assistant, i.e. the sequence of proof states from the statement of the theorem to 'QED'. The idea behind declarative-style proofs is to base the proof language on this very sequence of proof states. This is indeed the feature that distinguishes procedural proof languages (like ML tactics in LCF) from declarative proof languages. On the one hand, procedural languages emphasize proof methods (application of theorems, rewriting, proof by induction...), at the expense of a loss of precision on intermediate proof states: the intermediate states depend on the implementation of tactics instead of a formal semantics. On the other hand, declarative languages emphasize proof states but are less precise about the logical justification of the gap between one state and the next.

1.2 Related Work
The first proof assistant to implement a declarative-style proof language was the Mizar system, whose modern versions date back to the early 1990s. The Mizar system is a batch proof assistant: it compiles whole files and writes error messages in the body of the input text, so it is not exactly interactive, but its proof language has been an inspiration for all later designs [15]. Another important source in this subject is Lamport's How to Write a Proof [10], which takes the angle of the mathematician and provides a very simple system for proof notations, aimed at making proof verification as simple as possible. John Harrison was the first to develop a declarative proof language for an interactive proof assistant: the HOL88 theorem prover [8]. Donald Syme developed the DECLARE [12,13] batch proof assistant for higher-order logic with declarative proofs in mind from the start. He also describes CAPW (Computer-Aided Proof Writing) as a means of overcoming the verbosity of declarative proofs. The first interactive proof assistant for which a declarative language has been widely adopted is Isabelle [11], with the Isar (Intelligible Semi-Automated Reasoning) language [14], designed by Markus Wenzel. Freek Wiedijk also developed a light declarative solution [16] for John Harrison's own prover HOL Light [9]. For the Coq proof assistant [2], Mariusz Giero and Freek Wiedijk built a set of tactics called the MMode [6] to provide an experimental mathematical mode giving a declarative flavor to Coq. Recently, Claudio Sacerdoti Coen added a declarative language to the Matita proof assistant [1].

1.3 A New Language for the Coq Proof Assistant
The Coq proof assistant is a Type Theory-based interactive proof assistant developed at INRIA. It has a strong user base, both in the field of software and
hardware verification and in the field of formalized mathematics. It also has the reputation of being a tool whose procedural proof language, Ltac, has a very steep learning curve, both at the beginner and the advanced level. Coq has been evolving quite extensively during the last decade, and this evolution has made it necessary to regularly update existing proofs to maintain compatibility with the most recent versions of the tool. Coq also has a documentation generation tool that produces hyperlinked renderings of files containing proofs, but most proofs are written in a style that makes them hard to understand (even with syntax highlighting) unless you can actually run them, as was said earlier of other procedural proof languages. Building on previous experience from Mariusz Giero, we have built a stable mainstream declarative proof language for Coq. This language was built to have the following characteristics:

readable. The designed language should use clear English words to make proof reading a feasible exercise.
natural. We want the language to use a structure similar to the ones used in textbook mathematics (e.g. for case analysis), not a bare sequence of meaningless commands.
maintainable. The new language should make it easy to upgrade the prover itself: behavior changes should only affect the proof locally.
standalone. The proof script should contain enough explicit information to be able to retrace the proof path without running Coq.

The Mizar language has been an important source of inspiration in this work, but additional considerations had to be taken into account because the Calculus of Inductive Constructions (CIC) is much richer than Mizar's (essentially) first-order Set Theory. One of the main issues is that of proof genericity: Coq proofs use a lot of inductive objects for lots of different applications (logical connectives, records, natural numbers, algebraic datatypes, inductive relations...).
Rather than enforcing the use of the most common inductive definitions, we want to be as generic as possible in the support we give for reasoning with these objects. Finally, we want to give extended support for proofs by induction by allowing multi-level induction proofs, using a very natural syntax to specify the different cases in the proof. The implementation was part of the official release 8.1 of Coq.

1.4 Outline
We first describe some core features of our language, such as forward and backward steps, justifications, and partial conclusions. Then we give a formal syntax and a quick reference of the commands of our language, as well as an operational semantics. We go on by explaining how our language is indeed an implementation of the formal proof sketches [17] concept, and we define the notion of a well-formed proof. We finally give some perspectives for future work.
2 Informal Description

2.1 An Introductory Example
To give a sample of the declarative language, we provide here the proof of a simple lemma about Peano numbers: the double function is defined by double x = x + x, and the div2 function by:

div2 0 = 0
div2 1 = 0
div2 (S (S x)) = S (div2 x)

The natural numbers are defined by means of an inductive type nat with two constructors, 0 and S (the successor function). The lemma states that div2 is a left inverse of double. We first give a proof of the lemma using the usual tactic language:

Lemma double_div2: forall n, div2 (double n) = n.
intro n.
induction n.
reflexivity.
unfold double in *.
simpl.
rewrite

Fig. 1. The partial conclusion mechanism
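The same lemma can also be proved in the declarative language developed in this paper. The following script is our reconstruction of such a declarative proof (illustrative only; the individual commands are described in Sect. 3.2):

```coq
Lemma double_div2: forall n, div2 (double n) = n.
proof.
  let n:nat.
  per induction on n.
  suppose it is 0.
    thus (div2 (double 0) = 0).
  suppose it is (S m) and H:thesis for m.
    have (div2 (double (S m)) = div2 (S (S (double m)))).
    ~= (S (div2 (double m))).
    thus ~= (S m) by H.
  end induction.
end proof.
Qed.
```

Each intermediate state is visible in the script itself: the case analysis, the induction hypothesis H, and the equational chain leading from div2 (double (S m)) to S m.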
a pair of a new conclusion G′ and a proof of G′ → G. If Γ is a list of typed variables and σ a substitution, then Γσ is formed by applying σ to all elements of Γ in the following way:
– if σx ≠ x then x : T is discarded;
– if σx = x then x : Tσ is added to Γσ.
The notation Σ x̄ : T̄ stands for a (possibly dependent) tuple built using the standard (and, prod, ex, sig, sigT) binary type constructors. Since a given problem may have several solutions, a simple search strategy has been adopted, and the constructor rule has been restricted to non-recursive types in order to keep the search space finite.

thesis                               instruction                                        remaining thesis
A ∧ B                                thus B                                             A
(A ∧ B) ∧ C                          thus B                                             A ∧ C
A ∨ B                                thus B                                             (everything proved)
(A ∧ B) ∨ (C ∧ D)                    thus C                                             D
∃x : nat, P x                        take (2 : nat)                                     P 2
∃x : nat, P x                        thus P 2                                           (everything proved)
∃x : nat, ∃y : nat, (P y ∧ R x y)    thus P 2                                           ∃x : nat, R x 2
A ∧ B                                suffices to have x : nat such that P x to show B   A ∧ ∃x : nat, P x

Fig. 2. Examples of partial conclusions
In Fig. 2, we give various examples of uses of the partial conclusion mechanism. A special constant stands for a conclusion where everything is proved. When P 2 is used as a partial conclusion for ∃x : nat, P x, even though a placeholder for nat should remain, this placeholder is filled by 2 because of typing constraints. Note that instantiating an existential quantifier with a specific witness is an instance of this operation. In a former version of this language, we kept a so-called split thesis (i.e. several conclusions) instead of assembling them again. We decided to abandon this feature because it added much confusion without significantly extending the
functionalities. In the case of the suffices construction, the relevant part of the conclusion is removed and replaced by the sufficient conditions. Using this mechanism helps the user build the proof piece by piece, using partial conclusions to simplify the thesis and thus keep track of his/her progress.
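As a small illustration of this mechanism (our example, not from the paper), a conjunction can be proved piecewise, each thus step removing one conjunct from the thesis:

```coq
(* thesis: A -> B -> A /\ B, for some hypothetical propositions A and B *)
proof.
  assume HA:A and HB:B.   (* thesis is now A /\ B *)
  thus A by HA.           (* remaining thesis: B *)
  thus B by HB.           (* thesis fully proved *)
end proof.
```

Each thus both discharges a subformula of the thesis and displays what remains to be shown.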
3 Syntax and Semantics

3.1 Syntax
Figure 3 gives the complete formal syntax of the declarative language. The unbound nonterminals are id for identifiers, num for natural numbers, and term and type for terms and types of the Calculus of Inductive Constructions; pattern refers to a pattern for matching against inductive objects. Patterns may be nested and contain the as keyword and the wildcard, but no disjunctive pattern is allowed.

instruction ::=
    proof
  | assume statement [and statement]* [and (we have)-clause]?
  | (let, be)-clause
  | (given)-clause
  | (consider)-clause from term
  | [have | then | thus | hence] statement justification
  | thus? [~= | =~] [id:]? term justification
  | suffices [(to have)-clause | statement [and statement]*] to show statement justification
  | [claim | focus on] statement
  | take term
  | define id [var [, var]*]? as term
  | reconsider [id | thesis [[num]]?] as type
  | per [cases | induction] on term
  | per cases of type justification
  | suppose [id [, id]* and]? it is pattern [such that statement [and statement]* [and (we have)-clause]?]?
  | end [proof | claim | focus | cases | induction]
  | escape
  | return

α, β-clause ::= α var [, var]* [β such that statement [and statement]*]? [and α, β-clause]?

statement ::= [id:]? type | thesis | thesis for id

var ::= id [: type]?

justification ::= [by [* | term [, term]*]]? [using tactic]?

Fig. 3. Syntax for the declarative language
3.2 Commands Description
proof. ... end proof.
This is the outermost block of any declarative proof. If several subgoals existed when the proof command occurred, only the first one is proved in the declarative proof. If the proof is not complete when end proof is encountered, the proof is closed all the same, but with a warning, and Qed or Defined to save the proof will fail.

have h:φ justification.  /  then h:φ justification.
This command adds a new hypothesis h of type φ to the context. If the justification fails, a warning is issued but the hypothesis is still added to the context. The then variant adds the previous fact to the list of objects used in the justification.

thus h:φ justification.  /  hence h:φ justification.
These commands behave like have and then respectively, but the proof of φ is used as a partial conclusion. This can end the proof or remove part of the proof obligations. These commands fail if φ is not a subformula of the thesis.

claim h : φ. ... end claim.
This block contains a proof of φ, which will be named h after end claim. If the subproof is not complete when end claim is encountered, the subproof is still closed, but with a warning, and Qed or Defined to save the proof later will fail.

focus on φ. ... end focus.
This block is similar to the claim block, except that it leads to a partial conclusion. In a way, focus is to claim what thus is to have. This comes in handy when the thesis is a conjunction and one of the conjuncts is an implication or a universal quantification: the focus block allows the use of local hypotheses.

(thus) ~= t justification.  /  (thus) =~ t justification.
These commands can only be used if the last step was an equality l = r; t should be a term of the same type as l and r. If ~= is used, the justification will be used to prove r = t and the new statement will be l = t. Otherwise, the justification will be used to prove t = l and the new statement will be t = r. When present, the thus keyword triggers a conclusion step.
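For instance, a hypothetical fragment in this syntax extends an equality step by step on the right-hand side:

```coq
have (double 2 = 2 + 2).   (* last step: an equality l = r *)
~= 4.                      (* justification proves 2 + 2 = 4; statement becomes double 2 = 4 *)
```

The default automation is expected to discharge the small arithmetic justification here; an explicit by clause could be supplied otherwise.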
                     simple    with previous step    opens subproof    iterated equality
intermediate step    have      then                  claim             ~= / =~
conclusive step      thus      hence                 focus on          thus ~= / thus =~

Fig. 4. Synthetic classification of forward steps

suffices H : Φ to show Ψ justification.
This command allows replacing part of the thesis by a sufficient condition, e.g. to strengthen it before
starting a proof by induction. In the thesis, Ψ is then replaced by Φ. The justification should prove Ψ using Φ.

assume G:Ψ ... and we have x such that H:Φ.  /  let x be such that H:Φ.
These commands are two different flavors of the introduction of hypotheses. They expect the thesis to be a product (implication or universal quantification) of the shape Πx1 : T1 . . . Πxn : Tn . G, and the Ti to be convertible with the provided hypothesis statements. This command is well-formed only if the missing types can be inferred.

given x such that H:Φ.  /  consider x such that H:Φ from G.
given is similar to let, except that this command works up to the elimination of tuples and dependent tuples, such as conjunctions and existential quantifiers. Here the thesis could be ∃x.Φ′ → Ψ with Φ′ convertible to Φ. The consider command takes an explicit object G to destruct instead of using an introduction rule.

define f (x : T) as t.
This command defines objects locally. If parameters are given, a function (λ-abstraction) is defined.

reconsider thesis as T.  /  reconsider H as T.
These commands replace the statement of the thesis or of a hypothesis with a convertible one, and fail if the provided statement is not convertible.

take t.
This command performs a partial conclusion using an explicit proof object. This is especially useful when proving an existential statement: it allows one to specify the existential witness.

per cases on t.  /  per cases of F justification.
suppose x : H. ... suppose x : H′. ... end cases.
This introduces a proof per cases, either on a disjunctive proof object t, or on a proof of the statement F derived from the justification. The per cases command must immediately be followed by a suppose command, which introduces the first case. Further suppose commands or end cases can be typed even if the previous case is not complete; in that case a warning is issued. If t occurs in the thesis, you should use suppose it is instead of suppose.

per induction on t.  /  per cases on t.
suppose it is patt and x : H. ... suppose it is patt′ and x′ : H′. ... end induction.
This introduces a proof per dependent cases or by induction. When doing the proof, t is substituted with patt in the thesis; patt must be a pattern for a value of the same type as t. It may contain arbitrary subpatterns and as statements to bind names to subpatterns.
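A per cases proof can, for instance, branch on a decidable comparison. In the following sketch (ours; the thesis is left hypothetical, assumed provable from either hypothesis), le_gt_dec n m : {n <= m} + {n > m} comes from Coq's standard library:

```coq
per cases on (le_gt_dec n m).
suppose H:(n <= m).
  thus thesis by H.
suppose H:(n > m).
  thus thesis by H.
end cases.
```

Because le_gt_dec produces an informative disjunction, no classical reasoning is needed to justify the case split.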
Those name aliases are necessary to apply the induction hypothesis at multiple levels. If you are doing a proof by induction, you may use the thesis for construction in the suppose it is command to refer to an induction hypothesis. You may also write induction hypotheses explicitly.

escape. ... return.
This block allows escaping the declarative mode back to the tactic mode. When the return instruction is encountered, this subproof should be closed; otherwise a warning is issued.

3.3 Operational Semantics
The purpose of this section is to give precise details about what happens to the proof state when a proof command is typed. The proof state consists of a stack S that contains open proofs and markers to count open subproofs; each subproof is either a judgement Γ ⊢ G, where Γ is a list of types (or propositions) indexed by names and G is a type (or proposition), or a marker (written ∎ below) which stands for a closed subproof. A proof script S consists of a concatenation of instructions. The rules are given as a big-step semantics: S ⇒T S′ means that, when proving theorem T, we reach state S′ when executing the script S. Hence any script allowing us to reach the empty stack [] is a complete proof of the theorem T. For clarity, we only give here some essential rules; the remaining rules can be found in Appendix A.

T = {Γ ⊢ G}
─────────────────────────────────────
proof. ⇒T (Γ ⊢ G); []

S ⇒T (Γ ⊢ ∎); []
─────────────────────────────────────
S end proof. ⇒T []

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ T      G ≠ ∎
─────────────────────────────────────
S have (x : T) j. ⇒T (Γ; x : T ⊢ G); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ T      G ≠ ∎
─────────────────────────────────────
S thus (x : T) j. ⇒T (Γ; x : T ⊢ G \ (x : T)); S

The expression j ⊩ Γ ⊢ G means that the justification j is sufficient to solve the problem Γ ⊢ G; if it is not, the command issues a warning. The ≡ relation is the conversion (βδιζ-equivalence) relation of the Calculus of Inductive Constructions (see [2]). We write L ⊑ R whenever the context L can be obtained by decomposing tuples in the context R. The \ operator is defined in Fig. 1, rule gather. We use the traditional λ notation for abstractions and Π for dependent products (either implication or universal quantification, depending on the context). The distinction between casesd and casesnd is used to prevent the mixing of suppose with suppose it is. For simplicity, Appendix A omits the coverage condition for case analysis as well as the semantics of escape and return.
4 Proof Editing

4.1 Well-Formedness
If we drop the justification (j ⊩ · · ·) and completeness (G ≠ ∎) conditions in our formal semantics, we get a notion of well-formed proofs. These proofs, when run in Coq, are accepted with warnings but cannot be saved, since the proof tree contains gaps. This does not prevent the user from going further with the proof, since the user is still able to use the result of the previous step. The smallest well-formed proof is: proof. end proof. Introduction steps such as assume have additional well-formedness requirements: the introduced hypotheses must match the available ones. The given construction allows a looser correspondence. The reconsider statements have to give a convertible type. For proofs by induction, well-formedness requires the patterns to be of the correct type, and induction hypotheses to be built from the correct subobjects in the pattern.

4.2 Formal Proof Sketches
We claim that well-formed but incomplete proofs in our language play the role of formal proof sketches: they ensure that hypotheses correspond to the current statement and that the objects referred to exist and have a correct type. When avoiding the by * construction, justifications are preserved when extra commands are added inside the proof. In this sense our language supports incremental proof development. The only difficulty the user might face when turning a textbook proof into a proof sketch in our language is ensuring that first-order objects are introduced before a statement refers to them, since a textbook proof might not be topologically organized. The user will then be able to add new lines within blocks (mostly forward steps).
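A proof sketch in this sense is simply a well-formed script whose justifications are still missing; for example (our illustration, with a hypothetical predicate P):

```coq
proof.
  assume n:nat.
  have H:(P n).        (* no justification yet: accepted with a warning *)
  thus thesis by H.    (* Qed will fail until the gap is filled *)
end proof.
```

Later, the missing by clause (or an intermediate sub-derivation) can be inserted without disturbing the surrounding steps.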
5 Conclusion

5.1 Further Work
Arbitrary relation composition. The first extension needed for our language is support for iterated relations other than equality. This is possible as soon as a generalized transitivity lemma of the form ∀x y z, x R1 y → y R2 z → x R3 z is available.

Better automation. There is a need for more precise and powerful automation for the default justification method, to be able to give better predictions of when a deduction step will be accepted. A specific need is an extension of equality reasoning to arbitrary equivalence relations (setoids, PERs ...).
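Lemmas of exactly this shape already exist in Coq's standard library; for instance (assuming the Arith library is loaded), taking R1 = ≤, R2 = < and R3 = <:

```coq
Require Import Arith.

Check le_lt_trans : forall n m p : nat, n <= m -> m < p -> n < p.
```

With such a lemma registered, a chain mixing ≤ and < steps could be composed just like the iterated equalities of Sect. 3.2.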
Multiple induction. The support for induction is already quite powerful (support for deep patterns with multiple as bindings), but more can be done if we consider multiple induction. It might be feasible to detect the induction scheme used (double induction, lexicographic induction ...) and build the corresponding proof on-the-fly.

Translation of procedural proofs. The declarative language offers a stable format for the preservation of old proofs over time. Since many Coq proofs in procedural style already exist, it will be necessary to translate them to this new format. The translation can be done in two ways: by generating a declarative proof either from the proof tree or from the proof term. The latter will be more fine-grained but might miss some aspects of the original procedural proof; the former looks more difficult to implement.

5.2 Conclusion
The new declarative language is now widely distributed, though not yet widely used, and we hope that this paper will help new users discover our language. As a beginning, Jean-Marc Notin (INRIA Futurs) has translated a part of the Coq Standard Library to the declarative language. The implementation is quite stable, and the automation, although not very predictable, offers a reasonable compromise between speed and power. We really hope that this language will be a useful medium to make proof assistants more popular, especially in the mathematical community and among undergraduate students. We believe that our language provides a helpful implementation of the formal proof sketch concept; this means it could be a language of choice for turning textbook proofs into formal proofs. It could also become a tool of choice for education. In the context of collaborative proof repositories such as in [5], our language, together with other declarative languages, will fill the gap between the narrow proof assistant community and the general public: we aim at presenting big formal proofs to the public.
References

1. Coen, C.S.: Automatic Generation of Declarative Scripts. CHAT: Connecting Humans and Type Checkers (December 2006)
2. The Coq Development Team: The Coq Proof Assistant Reference Manual, Version V8.1 (February 2007)
3. Corbineau, P.: First-Order Reasoning in the Calculus of Inductive Constructions. In: Berardi, S., Coppo, M., Damiani, F. (eds.) TYPES 2003. LNCS, vol. 3085, pp. 162–177. Springer, Heidelberg (2004)
4. Corbineau, P.: Deciding Equality in the Constructor Theory. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502, pp. 78–92. Springer, Heidelberg (2007)
5. Corbineau, P., Kaliszyk, C.: Cooperative Repositories for Formal Proofs. In: Kauers, M., Kerber, M., Miner, R., Windsteiger, W. (eds.) MKM/CALCULEMUS 2007. LNCS (LNAI), vol. 4573, pp. 221–234. Springer, Heidelberg (2007)
6. Giero, M., Wiedijk, F.: MMode, a Mizar Mode for the Proof Assistant Coq. Technical report, ICIS, Radboud Universiteit Nijmegen (2004)
7. Gordon, M., Milner, R., Wadsworth, C.: Edinburgh LCF. LNCS, vol. 78. Springer, Heidelberg (1979)
8. Harrison, J.: A Mizar Mode for HOL. In: von Wright, J., Harrison, J., Grundy, J. (eds.) TPHOLs 1996. LNCS, vol. 1125, pp. 203–220. Springer, Heidelberg (1996)
9. Harrison, J.: The HOL Light Manual, Version 2.20 (2006)
10. Lamport, L.: How to Write a Proof. American Mathematical Monthly 102(7), 600–608 (1995)
11. Paulson, L.: Isabelle. LNCS, vol. 828. Springer, Heidelberg (1994)
12. Syme, D.: DECLARE: A Prototype Declarative Proof System for Higher Order Logic. Technical report, University of Cambridge (1997)
13. Syme, D.: Three Tactic Theorem Proving. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 203–220. Springer, Heidelberg (1999)
14. Wenzel, M.: Isar – A Generic Interpretative Approach to Readable Formal Proof Documents. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 167–184. Springer, Heidelberg (1999)
15. Wenzel, M., Wiedijk, F.: A Comparison of Mizar and Isar. Journal of Automated Reasoning 29(3–4), 389–411 (2002)
16. Wiedijk, F.: Mizar Light for HOL Light. In: Boulton, R.J., Jackson, P.B. (eds.) TPHOLs 2001. LNCS, vol. 2152, pp. 378–394. Springer, Heidelberg (2001)
17. Wiedijk, F.: Formal Proof Sketches. In: Berardi, S., Coppo, M., Damiani, F. (eds.) TYPES 2003. LNCS, vol. 3085, pp. 378–393. Springer, Heidelberg (2004)
A Operational Semantics (Continued)

S ⇒T (Γ; l : T′ ⊢ G); S      j, l ⊩ Γ ⊢ T      G ≠ ∎
─────────────────────────────────────
S then (x : T) j. ⇒T (Γ; l : T′; x : T ⊢ G); S

S ⇒T (Γ; l : T′ ⊢ G); S      j, l ⊩ Γ ⊢ T      G ≠ ∎
─────────────────────────────────────
S hence (x : T) j. ⇒T (Γ; l : T′; x : T ⊢ G \ (x : T)); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ r = u      G ≠ ∎
─────────────────────────────────────
S ~= u j. ⇒T (Γ; e : l = u ⊢ G); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ u = l      G ≠ ∎
─────────────────────────────────────
S =~ u j. ⇒T (Γ; e : u = r ⊢ G); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ r = u      G ≠ ∎
─────────────────────────────────────
S thus ~= u j. ⇒T (Γ; e : l = u ⊢ G \ (l = u)); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ u = l      G ≠ ∎
─────────────────────────────────────
S thus =~ u j. ⇒T (Γ; e : u = r ⊢ G \ (u = r)); S
S ⇒T (Γ ⊢ G); S      G ≠ ∎
─────────────────────────────────────
S claim (x : T). ⇒T (Γ ⊢ T); claim; (Γ; x : T ⊢ G); S

S ⇒T (Γ ⊢ G); S      G ≠ ∎
─────────────────────────────────────
S focus on (x : T). ⇒T (Γ ⊢ T); focus; (Γ; x : T ⊢ G \ (x : T)); S

S ⇒T (Γ ⊢ ∎); claim; (Γ ⊢ G); S
─────────────────────────────────────
S end claim. ⇒T (Γ ⊢ G); S

S ⇒T (Γ ⊢ ∎); focus; (Γ ⊢ G); S
─────────────────────────────────────
S end focus. ⇒T (Γ ⊢ G); S

S ⇒T (Γ ⊢ G); S      Γ ⊢ t : T      G ≠ ∎
─────────────────────────────────────
S take t. ⇒T (Γ ⊢ G \ (t : T)); S

S ⇒T (Γ ⊢ G); S      Γ; x1 : T1, . . . , xn : Tn ⊢ t : T      G ≠ ∎
─────────────────────────────────────
S define f (x1 : T1) . . . (xn : Tn) as t. ⇒T (Γ; f := λx1 : T1 . . . λxn : Tn . t ⊢ G); S

S ⇒T (Γ ⊢ Πx1 : T1 . . . Πxn : Tn . G); S      (T1 . . . Tn) ≡ (T1′ . . . Tn′)
─────────────────────────────────────
S assume/let (x1 : T1′) . . . (xn : Tn′). ⇒T (Γ; x1 : T1; . . . ; xn : Tn ⊢ G); S

S ⇒T (Γ ⊢ Πx1 : T1 . . . Πxn : Tn . G); S      (T1′ . . . Tm′) ⊑ (T1 . . . Tn)
─────────────────────────────────────
S given (x1 : T1′) . . . (xm : Tm′). ⇒T (Γ; x1 : T1′; . . . ; xm : Tm′ ⊢ G); S

S ⇒T (Γ ⊢ G); S      Γ ⊢ t : T      (T1 . . . Tn) ⊑ (T)      G ≠ ∎
─────────────────────────────────────
S consider (x1 : T1) . . . (xn : Tn) from t. ⇒T (Γ; x1 : T1; . . . ; xn : Tn ⊢ G); S

S ⇒T (Γ; x : T ⊢ G); S      T ≡ T′      G ≠ ∎
─────────────────────────────────────
S reconsider x as T′. ⇒T (Γ; x : T′ ⊢ G); S

S ⇒T (Γ ⊢ T); S      T ≡ T′      T ≠ ∎
─────────────────────────────────────
S reconsider thesis as T′. ⇒T (Γ ⊢ T′); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ; x1 : T1; . . . ; xn : Tn ⊢ T      G ≠ ∎
─────────────────────────────────────
S suffices (x1 : T1) . . . (xn : Tn) to show T j. ⇒T (Γ ⊢ G \ (T1; . . . ; Tn ⊢ T)); S

S ⇒T (Γ ⊢ G); S      j ⊩ Γ ⊢ t : T
─────────────────────────────────────
S per cases of T j. ⇒T casesnd(t : T); (Γ; x : T ⊢ G); S

S ⇒T (Γ ⊢ G); S      Γ ⊢ t : T
─────────────────────────────────────
S per cases on t. ⇒T cases(t : T); (Γ; x : T ⊢ G); S

S ⇒T cases(t : T); (Γ ⊢ G); S
─────────────────────────────────────
S suppose (x1 : T1) . . . (xn : Tn). ⇒T (Γ; x1 : T1; . . . ; xn : Tn ⊢ G); casesnd(t : T); (Γ ⊢ G); S
P. Corbineau
suppose (x1 : T1) . . . (xn : Tn).  (subsequent case)
    (Γ ⊢ ); cases_nd(t : T); (Γ ⊢ G); S  ⇒_T  (Γ; x1 : T1; . . . ; xn : Tn ⊢ G); cases_nd(t : T); (Γ ⊢ G); S

suppose it is p and (x1 : T1) . . . (xn : Tn).  (first case)
    cases(t : T); (Γ ⊢ G); S  ⇒_T  (Γ; x1 : T1; . . . ; xn : Tn ⊢ G[p/t]); cases_d(t : T); (Γ ⊢ G); S

suppose it is p and (x1 : T1) . . . (xn : Tn).  (subsequent case)
    (Γ ⊢ ); cases_d(t : T); (Γ ⊢ G); S  ⇒_T  (Γ; x1 : T1; . . . ; xn : Tn ⊢ G[p/t]); cases_d(t : T); (Γ ⊢ G); S

end cases.
    (Γ ⊢ ); cases_{d/nd}(t : T); (Γ ⊢ G); S  ⇒_T  (Γ ⊢ ); S

per induction on t.
    Premise: Γ ⊢ t : T
    (Γ ⊢ G); S  ⇒_T  induction(t : T); (Γ; x : T ⊢ G); S

suppose it is p and (x1 : T1) . . . (xn : Tn).  (first induction case)
    induction(t : T); (Γ ⊢ G); S  ⇒_T  (Γ; x1 : T1; . . . ; xn : Tn ⊢ G[p/t]); induction(t : T); (Γ ⊢ G); S

suppose it is p and (x1 : T1) . . . (xn : Tn).  (subsequent induction case)
    (Γ ⊢ ); induction(t : T); (Γ ⊢ G); S  ⇒_T  (Γ; x1 : T1; . . . ; xn : Tn ⊢ G[p/t]); induction(t : T); (Γ ⊢ G); S

end induction.
    (Γ ⊢ ); induction(t : T); (Γ ⊢ G); S  ⇒_T  (Γ ⊢ ); S
Characterising Strongly Normalising Intuitionistic Sequent Terms

J. Espírito Santo¹, S. Ghilezan², and J. Ivetić²

¹ Mathematics Department, University of Minho, Portugal
[email protected]
² Faculty of Engineering, University of Novi Sad, Serbia
[email protected], [email protected]

Abstract. This paper gives a characterisation, via intersection types, of the strongly normalising terms of an intuitionistic sequent calculus (where LJ easily embeds). The soundness of the typing system is reduced to that of a well-known typing system with intersection types for the ordinary λ-calculus. The completeness of the typing system is obtained from subject expansion at root position. This paper's sequent term calculus smoothly integrates the λ-terms with generalised application or explicit substitution. Strong normalisability of these terms as sequent terms characterises their typeability in certain "natural" typing systems with intersection types. The latter are in the natural deduction format, like systems previously studied by Matthes and Lengrand et al., except that they do not contain any extra, exceptional rules for typing generalised applications or substitution.
Introduction

The recent interest in the Curry-Howard correspondence for sequent calculus [9,2,5,8,6] made it clear that the computational content of sequent derivations and cut-elimination can be expressed through an extension of the λ-calculus, where the construction that interprets cut subsumes both explicit substitution and an enlarged concept of application, exhibiting the features of "multiarity" and "generality" [8]. Relative to such a calculus of sequent terms, the sequent calculus acts as a typing system, and the ensuing notion of typeability is sufficient, but not necessary, for strong normalisability. This situation is well known in the context of the ordinary λ-calculus, where simple typeability is sufficient, but not necessary, for strong β-normalisability. One way to obtain a characterisation of the strongly normalising λ-terms is to extend the typing system with intersection types. For this reason, intersection type assignment systems were introduced into the λ-calculus in the late 1970s by Coppo and Dezani [3], Pottinger [15] and Sallé [18]. Intersection types completely characterise strong normalisation in the λ-calculus (see [1]). In this paper we seek a characterisation of the strongly normalising sequent terms via intersection types. We first introduce, following [6], an extension of the λ-calculus named λGtz (after Gentzen), corresponding to a sequent calculus for intuitionistic implicational logic and equipped with reduction rules for cut-elimination.

M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 85–99, 2008.
© Springer-Verlag Berlin Heidelberg 2008
The typing system is from the beginning equipped with intersection types, following [4]. The correctness of the typing system is obtained by a reduction to the correctness of the system D [12]. The completeness of the typing system is obtained as a corollary to subject expansion at root position. A recent topic of research is the use of intersection types for the characterisation of strong normalisability in extensions of the λ-calculus with generalised applications or explicit substitutions [14,13,11]. A common symptom of these works is the need to add to the typing system some extra, exceptional rules for typing generalised applications or substitutions. This somehow breaks the harmony observed in the ordinary λ-calculus between the typeability induced by intersection types and strong β-normalisability. One may wonder whether, in the extended scenario with generalised applications or explicit substitutions, the blame for the slight mismatch lies with some insufficiency of the intersection types technique, or with some insufficiency of the reduction relations, causing too many terms to be terminating. It turns out that, because of its expressive power, λGtz is a good tool to analyse this question. A simple analysis of our main characterisation result shows that strong normalisability as sequent terms (i.e. inside λGtz) of λ-terms with generalised applications or explicit substitutions characterises their typeability in certain "natural" typing systems with intersection types. The latter are in the natural deduction format, like systems previously studied in [14,13], except that they do not contain any extra, exceptional rules for typing generalised applications or substitutions. So one is led to compare the behaviour under reduction of λ-terms with generalised applications or explicit substitutions inside λGtz and inside their native systems ΛJ [10] and λx [17].

We conclude that the problem in ΛJ is that we cannot form explicit substitutions, and in λx that we cannot compose substitutions. The paper is organised as follows. Section 1 presents the syntax of the untyped λGtz-calculus. Section 2 introduces an intersection type system λGtz∩. Strong normalisation is proved in Section 3, and the characterisation of strong normalisation is given in Section 4. In Section 5, the relation between the λGtz-calculus and calculi with generalised applications and explicit substitutions is discussed. Finally, Section 6 concludes the paper.
1 Syntax of λGtz
The abstract syntax of λGtz is given by:

(Terms) t, u, v ::= x | λx.t | tk
(Contexts) k ::= x̂.t | u :: k

where x ranges over a denumerable set of term variables. Terms are either variables, abstractions or cuts tk. A context is either a selection or a context cons(tructor). Terms and contexts are together referred to as the expressions and will be ranged over by E. In λx.t and x̂.t, t is the scope of
the binders λx and x̂, respectively. Free variables in the λGtz-calculus are those bound neither by the abstraction nor by the selection operator, and Barendregt's convention applies to both binders. In order to avoid parentheses, we let the scope of binders extend to the right as much as possible.

According to the form of k, a cut may be an explicit substitution t(x̂.v) or a multiary generalised application t(u1 :: · · · :: um :: x̂.v) (m ≥ 1). In the latter case, if m = 1, we get a generalised application t(u :: x̂.v); if v = x, we get a multiary application t[u1, · · · , um] (think of x̂.x as the empty list of arguments); the combination of the constraints m = 1 and v = x brings cuts to the form of an ordinary application.

The reduction rules of λGtz are as follows:

(β) (λx.t)(u :: k) → u(x̂.tk)
(π) (tk)k′ → t(k@k′)
(σ) t(x̂.v) → v[x := t]
(μ) x̂.xk → k, if x ∉ k

where t[x := u] (or k[x := u]) denotes meta-substitution, and k@k′ is defined by (u :: k)@k′ = u :: (k@k′) and (x̂.v)@k′ = x̂.vk′.

The rules β, π, and σ reduce cuts to the trivial form y(u1 :: · · · :: um :: x̂.v), for some m ≥ 1, which represents a sequence of left introductions. Rule β generates a substitution, and rule σ executes a substitution at the meta-level. Rule π generalises the permutative conversion of the λ-calculus with generalised applications. Rule μ has a structural character, and either performs a trivial substitution in the reduction t(x̂.xk) → tk, or minimises the use of the generality feature in the reduction t(u1 :: · · · :: um :: x̂.xk) → t(u1 :: · · · :: um :: k).

The βπσ-normal forms of λGtz are:

(Terms) tnf, unf, vnf ::= x | λx.tnf | x(unf :: knf)
(Contexts) knf ::= x̂.tnf | tnf :: knf

λGtz is a flexible system for representing logical derivations in the sequent calculus format and for studying cut-elimination. The inference rules of LJ (axiom, right introduction, left introduction, and cut) are represented by the constructions x, λx.t, y(u :: x̂.v), and t(x̂.v), respectively. The βπσ-normal forms correspond to the multiary, cut-free sequent terms of [19]. See [6] for more on λGtz.
2 Intersection Types for λGtz
Definition 1. The set of types Types, ranged over by A, B, C, . . . , A1, . . . , is inductively defined as follows:

A, B ::= p | A → B | A ∩ B

where p ranges over a denumerable set of type atoms.
Definition 2
(i) The preorder ≤ over the set of types is the smallest relation that satisfies the following properties:

1. A ≤ A
2. A ∩ B ≤ A and A ∩ B ≤ B
3. (A → B) ∩ (A → C) ≤ A → (B ∩ C)
4. A ≤ B and B ≤ C implies A ≤ C
5. A ≤ B and A ≤ C implies A ≤ B ∩ C
6. A′ ≤ A and B ≤ B′ implies A → B ≤ A′ → B′
(ii) Two types are equivalent, A ∼ B, if and only if A ≤ B and B ≤ A. In this paper we consider types modulo this equivalence relation.

Remark 3. The equivalence (A → B) ∩ (A → C) ∼ A → (B ∩ C), or more generally ∩(∩Ak → Bi) ∼ ∩Ak → ∩Bi, follows from the given set of rules and will be used in the sequel.

Definition 4
(i) A basic type assignment is a declaration of the form x : A, where x is a term variable and A is a type.
(ii) A basis Γ is a set of basic type assignments in which all term variables are different.
(iii) There are two kinds of type assignment:
- Γ ⊢ t : A for typing terms;
- Γ; B ⊢ k : A for typing contexts.

The following typing system for λGtz is named λGtz∩. In Ax, →L, and Cut, ∩Ai = A1 ∩ · · · ∩ An, for some n ≥ 1.

(Ax)  Γ, x : ∩Ai ⊢ x : Aj, for j ∈ {1, · · · , n}

(→R)  from Γ, x : A ⊢ t : B infer Γ ⊢ λx.t : A → B

(Cut)  from Γ ⊢ t : Ai for all i ∈ {1, · · · , n} and Γ; ∩Ai ⊢ k : B infer Γ ⊢ tk : B

(→L)  from Γ ⊢ u : Ai for all i ∈ {1, · · · , n} and Γ; B ⊢ k : C infer Γ; ∩Ai → B ⊢ u :: k : C

(Sel)  from Γ, x : A ⊢ v : B infer Γ; A ⊢ x̂.v : B

By taking n = 1 in Ax, →L, and Cut we get the typing rules of [6] for assigning simple types. Notice that this typing system has no separate rules for the right introduction of intersections: the management of intersection is built into the other rules.
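For the simple-type fragment (n = 1 everywhere) the two judgement forms Γ ⊢ t : A and Γ; B ⊢ k : A admit a straightforward bidirectional check. The sketch below is our own encoding (it synthesises types only for variable heads of cuts and does not handle intersection types at all):

```python
from dataclasses import dataclass

# --- simple types ---
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Arrow:
    src: object
    tgt: object

# --- lambda-Gtz expressions (illustrative encoding) ---
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class Cut:
    head: object
    ctx: object

@dataclass(frozen=True)
class Sel:
    var: str
    body: object

@dataclass(frozen=True)
class Cons:
    arg: object
    rest: object

def check(env, t, ty):
    """Gamma |- t : ty, restricted to the n = 1 (simple-type) fragment."""
    if isinstance(t, Var):                        # (Ax)
        return env.get(t.name) == ty
    if isinstance(t, Lam):                        # (->R)
        return isinstance(ty, Arrow) and check({**env, t.var: ty.src}, t.body, ty.tgt)
    if isinstance(t, Cut):                        # (Cut): synthesise the head, check the context
        if isinstance(t.head, Var) and t.head.name in env:
            return check_ctx(env, env[t.head.name], t.ctx, ty)
        return False                              # sketch: only variable heads are synthesised
    return False

def check_ctx(env, b, k, ty):
    """Gamma ; b |- k : ty."""
    if isinstance(k, Sel):                        # (Sel)
        return check({**env, k.var: b}, k.body, ty)
    if isinstance(k, Cons):                       # (->L)
        return (isinstance(b, Arrow)
                and check(env, k.arg, b.src)
                and check_ctx(env, b.tgt, k.rest, ty))
    return False
```

For instance, with x : A → B and y : A in the basis, the cut x(y :: ẑ.z) checks against B: the head x synthesises A → B, the argument y checks against A, and the selection ẑ.z checks against B.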
Proposition 5 (Admissible rule (∩L))
(i) If Γ, x : Ai ⊢ t : B, for some i, then Γ, x : ∩Ai ⊢ t : B.
(ii) If Γ, x : Ai; C ⊢ k : B, for some i, then Γ, x : ∩Ai; C ⊢ k : B.

Proof. By mutual induction on the derivation.

Proposition 6 (Basis expansion)
(i) Γ ⊢ t : A ⇔ Γ, x : B ⊢ t : A and x ∉ Fv(t).
(ii) Γ; C ⊢ k : A ⇔ Γ, x : B; C ⊢ k : A and x ∉ Fv(k).

Definition 7
Γ1 ∩ Γ2 = {x : A | x : A ∈ Γ1 & x ∉ Γ2} ∪ {x : A | x : A ∈ Γ2 & x ∉ Γ1} ∪ {x : A ∩ B | x : A ∈ Γ1 & x : B ∈ Γ2}.

Proposition 8 (Bases intersection)
(i) Γ1 ⊢ t : A ⇒ Γ1 ∩ Γ2 ⊢ t : A.
(ii) Γ1; B ⊢ k : A ⇒ Γ1 ∩ Γ2; B ⊢ k : A.

Proposition 9 (Generation lemma, GL)
(i) Γ ⊢ x : A iff x : ∩Ai ∈ Γ and A ≡ Ai, for some i.
(ii) Γ ⊢ λx.t : A iff A ≡ B → C and Γ, x : B ⊢ t : C.
(iii) Γ; A ⊢ x̂.t : B iff Γ, x : A ⊢ t : B.
(iv) Γ ⊢ tk : A iff there is a type B ≡ ∩Bi such that Γ ⊢ t : Bi for all i, and Γ; ∩Bi ⊢ k : A.
(v) Γ; D ⊢ t :: k : C iff D ≡ ∩Ai → B, Γ; B ⊢ k : C, and Γ ⊢ t : Ai for all i.

Proof. Straightforward, since all rules are syntax-directed.
Lemma 10 (Substitution and append lemma)
(i) If Γ, x : ∩Ai ⊢ t : B and Γ ⊢ u : Ai, for each i, then Γ ⊢ t[x := u] : B.
(ii) If Γ, x : ∩Ai; C ⊢ k : B and Γ ⊢ u : Ai, for each i, then Γ; C ⊢ k[x := u] : B.
(iii) If Γ; B ⊢ k : Ci, for all i, and Γ; ∩Ci ⊢ k′ : A, then Γ; B ⊢ k@k′ : A.

Proof. (i) and (ii) are proved by simultaneous induction on t and k. (iii) is proved by induction on k.
Theorem 11 (Subject Reduction). If Γ ⊢ t : A and t → t′, then Γ ⊢ t′ : A.

Proof. The proof employs the previous lemma; it is omitted for lack of space.
Example 12. In the λ-calculus, the term λx.xx has the type (A ∩ (A → B)) → B. The corresponding term in the λGtz-calculus is λx.x(x :: ŷ.y). Although it is a normal form, this term is not typeable in the simply typed λGtz-calculus. It is typeable in λGtz∩ as follows. By (Ax), x : A ∩ (A → B), y : B ⊢ y : B, hence by (Sel), x : A ∩ (A → B); B ⊢ ŷ.y : B. By (Ax), x : A ∩ (A → B) ⊢ x : A, so by (→L), x : A ∩ (A → B); A → B ⊢ (x :: ŷ.y) : B. By (Ax), x : A ∩ (A → B) ⊢ x : A → B, so by (Cut), x : A ∩ (A → B) ⊢ x(x :: ŷ.y) : B, and finally, by (→R), ⊢ λx.x(x :: ŷ.y) : (A ∩ (A → B)) → B.

3 Typeability ⇒ SN
In order to prove strong normalisation for the λGtz∩ system, we connect it with the well-known system D for the λ-calculus via an appropriate mapping, and then use the strong normalisation theorem for λ-terms typeable in system D. λ-terms are given by

M, N, P ::= x | λx.M | M N

and equipped with the reduction rules

(β) (λx.M)N → M[x := N]
(π1) (λx.M)N P → (λx.M P)N
(π2) M((λx.P)N) → (λx.M P)N

without clash of free and bound variables (Barendregt's convention). We let π = π1 ∪ π2.

Proposition 13. If a λ-term M is β-SN, then M is βπ-SN.
Proof. This is Theorem 2 in [7].

The following typing system for λ is named D in [12]:

(Ax)  Γ, x : A ⊢ x : A

(→I)  from Γ, x : A ⊢ M : B infer Γ ⊢ λx.M : A → B

(→E)  from Γ ⊢ M : A → B and Γ ⊢ N : A infer Γ ⊢ M N : B

(∩I)  from Γ ⊢ M : A and Γ ⊢ M : B infer Γ ⊢ M : A ∩ B

(∩E)  from Γ ⊢ M : A1 ∩ A2 infer Γ ⊢ M : Ai
Lemma 14. The following rules are admissible in D:

(Weak)  from Γ ⊢ M : A and Γ ⊆ Γ′ infer Γ′ ⊢ M : A

(Subst)  from Γ ⊢ N : A and Γ, x : A ⊢ M : B infer Γ ⊢ M[x := N] : B
Proposition 15 (SN). If a λ-term M is typeable in D, then M is β-SN.
Proof. A result from [16], [12].
We define a mapping F from λGtz to λ. The idea is as follows: if F(t) = M, F(ui) = Ni and F(v) = P, then t(u1 :: u2 :: x̂.v), say, is mapped to (λx.P)(M N1 N2). Formally, the mapping F : λGtz-Terms → λ-Terms is defined simultaneously with an auxiliary mapping F′ : λ-Terms × λGtz-Contexts → λ-Terms as follows:

F(x) = x
F(λx.t) = λx.F(t)
F(tk) = F′(F(t), k)
F′(N, x̂.t) = (λx.F(t))N
F′(N, u :: k) = F′(N F(u), k)

Proposition 16. If λGtz∩ proves Γ ⊢ t : A, then D proves Γ ⊢ F(t) : A.

Proof. The proposition is proved together with the claim: if λGtz∩ proves Γ; A ⊢ k : B and D proves Γ ⊢ N : A, then D proves Γ ⊢ F′(N, k) : B. The proof is by simultaneous induction on the derivations Π1 and Π2 of Γ ⊢ t : A and Γ; A ⊢ k : B, respectively, with cases according to the last typing rule used. The case (Ax) is obtained from the corresponding Ax in D together with ∩E. The case (→R) is easy, because D has the corresponding typing rule.

Case (Cut). Π1 ends with subderivations Π11i of Γ ⊢ t : Ai, for all i, and Π12 of Γ; ∩Ai ⊢ k : B. By IH(Π11i), D proves Γ ⊢ F(t) : Ai. By repeated application of ∩I, D proves Γ ⊢ F(t) : ∩Ai. By IH(Π12), D proves Γ ⊢ F′(F(t), k) : B. This is what we want, since F′(F(t), k) = F(tk).

Case (Sel). Π2 ends with a subderivation Π21 of Γ, x : A ⊢ t : B. Suppose D proves Γ ⊢ N : A. Then, in D: by IH, Γ, x : A ⊢ F(t) : B; by →I, Γ ⊢ λx.F(t) : A → B; and by →E with Γ ⊢ N : A, Γ ⊢ (λx.F(t))N : B. This is what we want, since F′(N, x̂.t) = (λx.F(t))N.

Case (→L). Π2 ends with subderivations Π21i of Γ ⊢ u : Ai, for all i, and Π22 of Γ; B ⊢ k : C. Suppose D proves Γ ⊢ N : ∩Ai → B. By IH(Π21i), D proves Γ ⊢ F(u) : Ai for all i; therefore, by repeated application of ∩I, D proves Γ ⊢ F(u) : ∩Ai. Then, by →E, D proves Γ ⊢ N F(u) : B. Hence, by IH(Π22), D proves Γ ⊢ F′(N F(u), k) : C. This is what we want, since F′(N F(u), k) = F′(N, u :: k).
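The mapping F and its auxiliary on contexts are directly executable. A sketch in our own encoding (all class and function names are illustrative), run on the term λx.x(x :: ŷ.y) of Example 12:

```python
from dataclasses import dataclass

# lambda-Gtz syntax (illustrative encoding)
@dataclass(frozen=True)
class GVar:
    name: str

@dataclass(frozen=True)
class GLam:
    var: str
    body: object

@dataclass(frozen=True)
class GCut:
    head: object
    ctx: object

@dataclass(frozen=True)
class GSel:
    var: str
    body: object

@dataclass(frozen=True)
class GCons:
    arg: object
    rest: object

# ordinary lambda-calculus syntax
@dataclass(frozen=True)
class LVar:
    name: str

@dataclass(frozen=True)
class LLam:
    var: str
    body: object

@dataclass(frozen=True)
class LApp:
    fun: object
    arg: object

def F(t):
    """F : lambda-Gtz terms -> lambda terms."""
    if isinstance(t, GVar):
        return LVar(t.name)
    if isinstance(t, GLam):
        return LLam(t.var, F(t.body))
    return F_ctx(F(t.head), t.ctx)       # F(t k) = F'(F(t), k)

def F_ctx(n, k):
    """Auxiliary map: F'(N, x^.t) = (lambda x. F(t)) N  and
       F'(N, u :: k) = F'(N F(u), k)."""
    if isinstance(k, GSel):
        return LApp(LLam(k.var, F(k.body)), n)
    return F_ctx(LApp(n, F(k.arg)), k.rest)
```

Running F on λx.x(x :: ŷ.y) produces λx.(λy.y)(x x), which is β-reducible to λx.xx, in line with the intended reading of contexts as argument lists ended by a selection.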
Proposition 17. For all t ∈ λGtz, if F(t) is βπ-SN, then t is βπσμ-SN.

Proof. Consequence of the following properties of F: (i) if t →βπ u in λGtz, then F(t) →+π F(u) in λ; (ii) if t →σμ u in λGtz, then F(t) →β F(u) in λ.

Theorem 18 (Typeability ⇒ SN). If a λGtz-term t is typeable in λGtz∩, then t is βπσμ-SN.

Proof. Suppose t is typeable in λGtz∩. Then, by Proposition 16, F(t) is typeable in D. So, by Proposition 15, F(t) is β-SN. Hence, by Proposition 13, F(t) is βπ-SN. Finally, by Proposition 17, t is βπσμ-SN.
4 SN ⇒ Typeability

4.1 Typeability of Normal Forms
Proposition 19. The βπσ-normal forms of the λGtz-calculus are typeable in the λGtz∩ system. Hence so are the βπσμ-normal forms.

Proof. By simultaneous induction on the structure of βπσ-normal terms and contexts.

- Basic case: every variable is typeable.
- λx.tnf is typeable. By IH, tnf is typeable, say Γ ⊢ tnf : B. We examine two cases.
  Case 1. If x : A ∈ Γ, then Γ = Γ′, x : A, and from Γ′, x : A ⊢ tnf : B the rule (→R) gives Γ′ ⊢ λx.tnf : A → B.
  Case 2. If x : A ∉ Γ, then by Proposition 6 we get Γ, x : A ⊢ tnf : B, and conclude Γ ⊢ λx.tnf : A → B by (→R).
- x̂.tnf is typeable. The proof is very similar to the previous one.
- tnf :: knf is typeable. By IH, tnf and knf are typeable, i.e. Γ1 ⊢ tnf : A and Γ2; B ⊢ knf : C. Then, by Proposition 8, we get Γ1 ∩ Γ2 ⊢ tnf : A and Γ1 ∩ Γ2; B ⊢ knf : C, so by (→L) we derive Γ1 ∩ Γ2; A → B ⊢ tnf :: knf : C.
- x(tnf :: knf) is typeable. By IH and the previous case, the context tnf :: knf is typeable, i.e. Γ; A → B ⊢ tnf :: knf : C. We examine three cases.
  Case 1. If x : A → B ∈ Γ, then (Ax) gives Γ ⊢ x : A → B, and (Cut) gives Γ ⊢ x(tnf :: knf) : C.
  Case 2. If x : D ∈ Γ, then Γ = Γ′, x : D, and we can expand the basis of x : A → B ⊢ x : A → B to Γ′, x : D ∩ (A → B) ⊢ x : A → B using Propositions 5 and 6. Also, by Proposition 5, we can write Γ′, x : D ∩ (A → B); A → B ⊢ tnf :: knf : C. Now (Cut) gives Γ′, x : D ∩ (A → B) ⊢ x(tnf :: knf) : C.
  Case 3. If x is not declared at all, then by Proposition 6 we get Γ, x : A → B; A → B ⊢ tnf :: knf : C from Γ; A → B ⊢ tnf :: knf : C, and then (Ax) and (Cut) yield Γ, x : A → B ⊢ x(tnf :: knf) : C.
4.2 Subject Expansion at Root Position
Lemma 20. If Γ ⊢ u(x̂.tk) : A and x ∉ Fv(u) ∪ Fv(k), then Γ ⊢ (λx.t)(u :: k) : A.

Proof. Γ ⊢ u(x̂.tk) : A implies, by GL(iv), that there is a type B ≡ ∩Bi such that Γ ⊢ u : Bi, for all i, and Γ; ∩Bi ⊢ x̂.tk : A. Further, this implies, by GL(iii), that Γ, x : ∩Bi ⊢ tk : A, so there is a C ≡ ∩Cj such that Γ, x : ∩Bi ⊢ t : Cj, for all j, and Γ, x : ∩Bi; ∩Cj ⊢ k : A. By assumption, the variable x is not free in k, so using Proposition 6 we can write the latter sequent as Γ; ∩Cj ⊢ k : A. Now, because of the equivalence ∩(∩Bi → Cj) ∼ ∩Bi → ∩Cj, we can derive: by (→R), Γ ⊢ λx.t : ∩Bi → Cj, for all j; by (→L), from Γ ⊢ u : Bi (all i) and Γ; ∩Cj ⊢ k : A, Γ; ∩Bi → ∩Cj ⊢ u :: k : A; and by (Cut), Γ ⊢ (λx.t)(u :: k) : A.
Lemma 21 (Inverse substitution lemma)
(i) Let Γ ⊢ v[x := t] : A, and let t be typeable. Then there is a basis Γ′ and a type B ≡ ∩Bi such that Γ′, x : ∩Bi ⊢ v : A and, for all i, Γ′ ⊢ t : Bi.
(ii) Let Γ; C ⊢ k[x := t] : A, and let t be typeable. Then there is a basis Γ′ and a type B ≡ ∩Bi such that Γ′, x : ∩Bi; C ⊢ k : A and, for all i, Γ′ ⊢ t : Bi.

Proof. By simultaneous induction on the structure of the term v and the context k.
Lemma 22 (Inverse append lemma). If Γ; B ⊢ k@k′ : A, then there is a type C ≡ ∩Ci such that Γ; B ⊢ k : Ci, for all i, and Γ; ∩Ci ⊢ k′ : A.

Proof. By induction on the structure of k.
- Basic case: k ≡ x̂.v. In this case k@k′ = (x̂.v)@k′ = x̂.vk′. From Γ; B ⊢ x̂.vk′ : A, by GL(iii), we have Γ, x : B ⊢ vk′ : A. Then, by GL(iv), there is a C ≡ ∩Ci such that Γ, x : B ⊢ v : Ci, for all i, and Γ, x : B; ∩Ci ⊢ k′ : A. From the first sequent we get Γ; B ⊢ x̂.v : Ci, for all i. From the second one, considering that x is not free in k′, we get Γ; ∩Ci ⊢ k′ : A.
- k ≡ u :: k″. In this case k@k′ = (u :: k″)@k′ = u :: (k″@k′). From Γ; B ⊢ u :: (k″@k′) : A, by GL(v), B ≡ ∩Ci → D, Γ; D ⊢ k″@k′ : A, and Γ ⊢ u : Ci, for all i. From the first sequent, by IH, we get some E ≡ ∩Ej such that Γ; D ⊢ k″ : Ej, for all j, and Γ; ∩Ej ⊢ k′ : A. Finally, for each j, (→L) applied to Γ ⊢ u : Ci (all i) and Γ; D ⊢ k″ : Ej yields Γ; ∩Ci → D (≡ B) ⊢ u :: k″ : Ej, so the proof is complete.
Proposition 23 (Subject expansion at root position). If t → t′, where t is the contracted redex, and t′ is typeable in λGtz∩, then t is typeable in λGtz∩.

Proof. We examine four cases, according to the applied reduction.
- (β): directly follows from Lemma 20.
- (σ): we should show that typeability of t′ ≡ v[x := u] leads to typeability of t ≡ u(x̂.v). Assume Γ ⊢ v[x := u] : A. By Lemma 21 there are a Γ′ and a B ≡ ∩Bi such that Γ′ ⊢ u : Bi, for all i, and Γ′, x : ∩Bi ⊢ v : A. Now (Sel) gives Γ′; ∩Bi ⊢ x̂.v : A, and (Cut) gives Γ′ ⊢ u(x̂.v) : A.
- (π): we should show that typeability of t(k@k′) implies typeability of (tk)k′. Γ ⊢ t(k@k′) : A, by GL(iv), yields a B ≡ ∩Bi such that Γ ⊢ t : Bi, for all i, and Γ; ∩Bi ⊢ k@k′ : A. Applying Lemma 22 to the latter sequent, we get Γ; ∩Bi ⊢ k : Cj, for all j, and Γ; ∩Cj ⊢ k′ : A, for some type C ≡ ∩Cj. Now, for each j, (Cut) applied to Γ ⊢ t : Bi (all i) and Γ; ∩Bi ⊢ k : Cj gives Γ ⊢ tk : Cj. So Γ ⊢ tk : Cj, for all j, and we obtain Γ ⊢ (tk)k′ : A with a further application of (Cut).
- (μ): it should be shown that typeability of k implies typeability of x̂.xk. Assume Γ; B ⊢ k : A. Since x ∉ k, we may suppose that x ∉ Γ, and by Proposition 6 write Γ, x : B; B ⊢ k : A. Now (Cut) applied to Γ, x : B ⊢ x : B and Γ, x : B; B ⊢ k : A gives Γ, x : B ⊢ xk : A, and (Sel) gives Γ; B ⊢ x̂.xk : A.
Theorem 24 (SN ⇒ typeability). All strongly normalising (βπσ-SN) expressions are typeable in the λGtz∩ system.

Proof. The proof is by induction on the length of the longest reduction path out of a strongly normalising expression E, with a subinduction on the size of E. If E is a βπσ-normal form, then E is typeable by Proposition 19. If E is itself a redex, let E′ be the expression obtained by contracting the redex E. Then E′ is strongly normalising, and by IH it is typeable; hence E is typeable, by Proposition 23. Next, suppose that E is neither a redex nor a normal form. Then E has one of the following forms: λx.u, x(u :: k), u :: k, or x̂.u (in each case with u or k not βπσ-normal). Each such u and k is typeable by IH, being a subexpression of E. It is then easy to build the typing of E, as in the proof of Proposition 19.
Corollary 25. A term is strongly normalising if and only if it is typeable in λGtz∩.

Proof. By Theorems 18 and 24.
5 Generalised Applications and Explicit Substitutions
We consider two extensions of the λ-calculus: the ΛJ-calculus, where application M(N, x.P) is generalised [10]; and the λx-calculus, where substitution M⟨x := N⟩ is explicit [17]. Intersection types have been used to characterise the strongly normalising terms of both the ΛJ-calculus [14] and the λx-calculus [13]. Both in [14] and in [13] the "natural" typing rules for generalised application or substitution had to be supplemented with extra rules (the rule app2 in [14]; the rules drop and K-Cut in [13]) in order to secure that every strongly normalising term is typeable. Indeed, examples are given of terms whose reduction in ΛJ or λx always terminates, but which would not be typeable had the extra rules not been added to the typing system. The examples in ΛJ [14] and λx [13] are

t0 := (λx.x(x, w.w))(λz.z(z, w.w), y.y′), with y ≠ y′,
t1 := y′⟨y := xx⟩⟨x := λz.zz⟩,

respectively. Two questions are raised by these facts: first, why do the "natural" rules fail to capture the strongly normalising terms; second, how to characterise, in terms of reduction, the terms that receive a type under the "natural" typing rules. We now prove that λGtz and λGtz∩ are useful for giving an answer to these questions.

Definition 26. Let t be a λGtz-term.
1. t is a λJ-term if every cut occurring in t is of the form t(u :: x̂.v).
2. t is a λx-term if every cut occurring in t has one of the forms t(u :: x̂.x) or t(x̂.v).

We adopt the terminology "λJ-term" (instead of "ΛJ-term") for the sake of uniformity. We may write t(u, x.v) instead of t(u :: x̂.v). Let t(u) abbreviate t(u :: x̂.x), and let v⟨x := t⟩ denote t(x̂.v). An inductive characterisation is:

(λJ-terms) t, u, v ::= x | λx.t | t(u, x.v)
(λx-terms) t, u, v ::= x | λx.t | t(u) | v⟨x := t⟩
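The restrictions of Definition 26 are purely syntactic shapes of cuts, so they can be checked mechanically. A sketch, in our own illustrative encoding of λGtz-terms (class and function names are not from the paper):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class Cut:            # t k
    head: object
    ctx: object

@dataclass(frozen=True)
class Sel:            # x^.v
    var: str
    body: object

@dataclass(frozen=True)
class Cons:           # u :: k
    arg: object
    rest: object

def is_lambdaJ(t):
    """Every cut has the form t(u :: x^.v), i.e. a generalised application."""
    if isinstance(t, Var):
        return True
    if isinstance(t, Lam):
        return is_lambdaJ(t.body)
    k = t.ctx
    return (isinstance(k, Cons) and isinstance(k.rest, Sel)
            and is_lambdaJ(t.head) and is_lambdaJ(k.arg) and is_lambdaJ(k.rest.body))

def is_lambdax(t):
    """Every cut has the form t(u :: x^.x) (application) or t(x^.v) (substitution)."""
    if isinstance(t, Var):
        return True
    if isinstance(t, Lam):
        return is_lambdax(t.body)
    k = t.ctx
    if isinstance(k, Sel):                                   # v<x := t>
        return is_lambdax(t.head) and is_lambdax(k.body)
    if isinstance(k, Cons) and isinstance(k.rest, Sel) and k.rest.body == Var(k.rest.var):
        return is_lambdax(t.head) and is_lambdax(k.arg)      # ordinary application t(u)
    return False
```

For example, x(y :: ẑ.w) is a λJ-term but not a λx-term (its selection body is not the selected variable), while x(ẑ.w) is a λx-term but not a λJ-term.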
Definition 27
1. λJ∩ is the typing system consisting of the rules Ax, →R and the following rule, where ∩Ak = A1 ∩ · · · ∩ An and ∩Bi = B1 ∩ · · · ∩ Bm, for some n, m ≥ 1:

(Gen.Elim)  from Γ ⊢ t : ∩Ak → Bi, for all i ∈ {1, · · · , m}; Γ ⊢ u : Ak, for all k ∈ {1, · · · , n}; and Γ, x : ∩Bi ⊢ v : C, infer Γ ⊢ t(u, x.v) : C.

2. λx∩ is the typing system consisting of the rules Ax, →R and the following rules, where ∩Ak = A1 ∩ · · · ∩ An, for some n ≥ 1:

(Elim)  from Γ ⊢ t : ∩Ak → B and Γ ⊢ u : Ak, for all k ∈ {1, · · · , n}, infer Γ ⊢ t(u) : B.

(Subst)  from Γ ⊢ t : Ak, for all k ∈ {1, · · · , n}, and Γ, x : ∩Ak ⊢ v : B, infer Γ ⊢ v⟨x := t⟩ : B.

If n = m = 1 in (Gen.Elim), we obtain the usual rule for assigning simple types to generalised applications. If n = 1 in (Elim) or (Subst), we obtain the usual rule for assigning simple types to applications or substitutions.

λJ∩ is a "natural" system for typing λJ-terms in two senses. First, the rules of λJ∩ follow the natural deduction format: we retained in λJ∩ only the rules of λGtz∩ that act on the RHS formula of sequents, and replaced the other rules of λGtz∩ by an elimination rule. Second, λJ∩ has just one rule for typing generalised applications, contrary to [14]. Similarly, λx∩ is a "natural" system for typing λx-terms. Again, we retained in λx∩ only the rules of λGtz∩ that act on the RHS formula of sequents, and replaced the other rules of λGtz∩ by an elimination rule and a substitution rule. In addition, no extra cut or substitution rules are needed, contrary to [13].

The following is an addendum to GL.

Proposition 28. In λGtz∩ one has:
1. Γ ⊢ t(u, x.v) : C iff there are A1, . . . , An, B1, . . . , Bm such that Γ ⊢ t : ∩Ak → Bi, for all i; Γ ⊢ u : Ak, for all k; and Γ, x : ∩Bi ⊢ v : C.
2. Γ ⊢ t(u) : B iff there are A1, . . . , An such that Γ ⊢ t : ∩Ak → B and Γ ⊢ u : Ak, for all k.
3. Γ ⊢ v⟨x := t⟩ : B iff there are A1, . . . , An such that Γ ⊢ t : Ai, for all i, and Γ, x : ∩Ai ⊢ v : B.

Proof. We just sketch the proof of statement 1. The "only if" implication follows by successive applications of GL. As to the "if" implication, let A1, . . . , An, B1, . . . , Bm be such that Γ ⊢ t : ∩Ak → Bi, for all i; Γ ⊢ u : Ak, for all k; and Γ, x : ∩Bi ⊢ v : C. Here we use ∩Ak → ∩Bi ∼ ∩(∩Ak → Bi). Recall that t(u :: x̂.v) is denoted by t(u, x.v). From Γ, x : ∩Bi ⊢ v : C, (Sel) gives Γ; ∩Bi ⊢ x̂.v : C; with Γ ⊢ u : Ak (all k), (→L) gives Γ; ∩Ak → ∩Bi ⊢ u :: x̂.v : C; and (Cut), with Γ ⊢ t : ∩Ak → Bi (all i), gives Γ ⊢ t(u :: x̂.v) : C.
Proposition 29
1. Let t be a λJ-term. λGtz∩ derives Γ ⊢ t : A iff λJ∩ derives Γ ⊢ t : A.
2. Let t be a λx-term. λGtz∩ derives Γ ⊢ t : A iff λx∩ derives Γ ⊢ t : A.

Proof. The "if" implications are proved by induction on Γ ⊢ t : A in λJ∩ or λx∩, using the fact that Gen.Elim, Elim, and Subst are derived rules of λGtz∩ (which is clear from the proof of Proposition 28). The "only if" implications are proved by induction on t, and rely on GL and its addendum (Proposition 28).

So we get a characterisation of typeability of t in the "natural" systems λJ∩ or λx∩ in terms of strong normalisability of t as a sequent term:

Corollary 30
1. Let t be a λJ-term. t is βπσμ-SN iff t is typeable in λJ∩.
2. Let t be a λx-term. t is βπσμ-SN iff t is typeable in λx∩.

In addition, the "natural" systems λJ∩ and λx∩ do capture the strongly normalising terms, the point being what we mean by "strongly normalising". Going back to the examples t0 and t1 from the beginning of this section: although t0 and t1 are strongly normalising in ΛJ and λx, respectively, they are not so in λGtz. Indeed, after one β-reduction step, t0 becomes (λz.z(z, w.w))(x̂.((x(x, w.w))(ŷ.y′))), which, by abbreviation, is y′⟨y := x(x)⟩⟨x := λz.z(z)⟩, that is, t1! After one σ-reduction step, t1 becomes the clearly non-terminating y′⟨y := (λz.z(z))(λz.z(z))⟩. So, in this sense, it is correct that the natural typing systems λJ∩ and λx∩ (as well as the typing systems of [14] and [13] without the extra rules app2, drop, and K-Cut) fail to give a type to t0 and t1, because these terms are, after all, non-terminating. Why were these terms not so in their native reduction systems? In ΛJ, t0 becomes y′ after one step of β-reduction, because the two substitutions of t1 cannot be formed and hence are immediately executed. In λx, the execution of the outer substitution of t1 is blocked, because λx has no composition of substitutions.
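The divergence of t1 can also be observed concretely. The sketch below uses our own encoding of terms with explicit substitutions (`Sub`, `sigma`, and the variable names are illustrative, not from the paper); executing the outer substitution of t1 by one σ-like step exposes the self-application (λz.zz)(λz.zz) inside the remaining substitution:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Sub:            # v<x := t>: an explicit substitution
    body: object
    var: str
    term: object

def subst(v, x, t):
    """Meta-level substitution v[x := t]. In this example all binders use
    distinct names and t is closed, so variable capture cannot occur."""
    if isinstance(v, Var):
        return t if v.name == x else v
    if isinstance(v, Lam):
        return v if v.var == x else Lam(v.var, subst(v.body, x, t))
    if isinstance(v, App):
        return App(subst(v.fun, x, t), subst(v.arg, x, t))
    return Sub(subst(v.body, x, t), v.var, subst(v.term, x, t))

def sigma(e):
    """One sigma-like step at the root: v<x := t> -> v[x := t]."""
    return subst(e.body, e.var, e.term)

# t1 = y'<y := x x><x := lambda z. z z>
delta = Lam("z", App(Var("z"), Var("z")))
t1 = Sub(Sub(Var("y1"), "y", App(Var("x"), Var("x"))), "x", delta)
```

One `sigma` step turns t1 into y′⟨y := (λz.zz)(λz.zz)⟩, whose substituted term is the classic non-terminating self-application, matching the discussion above.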
6 Conclusion
This paper gives a characterisation, via intersection types, of the strongly normalising intuitionistic sequent terms. This expands the range of application of the intersection types technique. One of the points of extending the Curry-Howard correspondence to sequent calculus is that such an exercise sheds light on issues like reduction, strong normalisability, and typeability in the original systems in natural deduction format. In this paper this promise is fulfilled, because the characterisation of strong normalisability in the sequent calculus proves useful for analysing recent applications of intersection types in natural deduction systems containing generalised applications or explicit substitutions. This analysis confirms that there is a delicate equilibrium between clean typing systems and expressive reduction systems.
References
1. Amadio, R., Curien, P.-L.: Domains and Lambda-Calculi. Cambridge Tracts in Theoretical Computer Science, vol. 46. Cambridge University Press, Cambridge (1998)
2. Barendregt, H., Ghilezan, S.: Lambda terms for natural deduction, sequent calculus and cut elimination. J. Funct. Program. 10(1), 121–134 (2000)
3. Coppo, M., Dezani-Ciancaglini, M.: A new type-assignment for lambda terms. Archiv für Mathematische Logik 19, 139–156 (1978)
4. Dougherty, D., Ghilezan, S., Lescanne, P.: Characterizing strong normalization in the Curien-Herbelin symmetric lambda calculus: extending the Coppo-Dezani heritage. Theoretical Computer Science (to appear, 2007)
5. Espírito Santo, J.: Revisiting the correspondence between cut-elimination and normalisation. In: Welzl, E., Montanari, U., Rolim, J.D.P. (eds.) ICALP 2000. LNCS, vol. 1853, pp. 600–611. Springer, Heidelberg (2000)
6. Espírito Santo, J.: Completing Herbelin's programme. In: Della Rocca, S.R. (ed.) TLCA 2007. LNCS, vol. 4583, pp. 118–132. Springer, Heidelberg (2007)
7. Espírito Santo, J.: Delayed substitutions. In: Baader, F. (ed.) RTA 2007. LNCS, vol. 4533, pp. 169–183. Springer, Heidelberg (2007)
8. Espírito Santo, J., Pinto, L.: Permutative conversions in intuitionistic multiary sequent calculi with cuts. In: Hofmann, M.O. (ed.) TLCA 2003. LNCS, vol. 2701, pp. 286–300. Springer, Heidelberg (2003)
9. Herbelin, H.: A lambda calculus structure isomorphic to Gentzen-style sequent calculus structure. In: Pacholski, L., Tiuryn, J. (eds.) CSL 1994. LNCS, vol. 933, pp. 61–75. Springer, Heidelberg (1995)
10. Joachimski, F., Matthes, R.: Standardization and confluence for ΛJ. In: Bachmair, L. (ed.) RTA 2000. LNCS, vol. 1833, pp. 141–155. Springer, Heidelberg (2000)
11. Kikuchi, K.: Simple proofs of characterizing strong normalization for explicit substitution calculi. In: Baader, F. (ed.) RTA 2007. LNCS, vol. 4533, pp. 257–272. Springer, Heidelberg (2007)
12. Krivine, J.-L.: Lambda-calcul, types et modèles. Masson, Paris (1990)
13. Lengrand, S., Lescanne, P., Dougherty, D., Dezani-Ciancaglini, M., van Bakel, S.: Intersection types for explicit substitutions. Inf. Comput. 189(1), 17–42 (2004)
14. Matthes, R.: Characterizing strongly normalizing terms of a λ-calculus with generalized applications via intersection types. In: Rolim, J., et al. (eds.) ICALP Workshops 2000, pp. 339–354. Carleton Scientific (2000)
15. Pottinger, G.: A type assignment for the strongly normalizable λ-terms. In: Seldin, J.P., Hindley, J.R. (eds.) To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, pp. 561–577. Academic Press, London (1980)
16. Ronchi Della Rocca, S.: Principal type scheme and unification for intersection type discipline. Theor. Comput. Sci. 59, 181–209 (1988)
17. Rose, K.: Explicit substitutions: tutorial & survey. Technical Report LS-96-3, BRICS (1996)
18. Sallé, P.: Une extension de la théorie des types en lambda-calcul. In: Ausiello, G., Böhm, C. (eds.) ICALP 1978. LNCS, vol. 62, pp. 398–410. Springer, Heidelberg (1978)
19. Schwichtenberg, H.: Termination of permutative conversions in intuitionistic Gentzen calculi. Theoretical Computer Science 212(1–2), 247–260 (1999)
Intuitionistic vs. Classical Tautologies, Quantitative Comparison

Antoine Genitrini¹, Jakub Kozik², and Marek Zaionc²

¹ PRiSM, CNRS UMR 8144, Université de Versailles Saint-Quentin-en-Yvelines, 45 av. des États-Unis, 78035 Versailles cedex, France
[email protected]
² Theoretical Computer Science, Jagiellonian University, Gronostajowa 3, Kraków, Poland
{jkozik,zaionc}@tcs.uj.edu.pl
Abstract. We consider propositional formulas built on implication. The size of a formula is the number of occurrences of variables in it. We assume that two formulas which differ only in the naming of variables are identical. For every n ∈ N, there is a finite number of different formulas of size n. For every n we consider the proportion between the number of intuitionistic tautologies of size n and the number of classical tautologies of size n. We prove that the limit of that fraction is 1 when n tends to infinity.
1 Introduction
In the present paper we consider propositional formulas built on implication only. In particular, we do not use the logical constant ⊥. The size of a formula is the number of occurrences of variables in it. We assume that two formulas which differ only in the naming of variables are identical. For every n ∈ N, there is a finite number of different formulas of size n; we denote that number by F(n). Consequently, there is also a finite number of classical tautologies and of intuitionistic tautologies of that size. These numbers are denoted by Cl(n) and Int(n), respectively. We are going to prove that:

lim_{n→∞} Int(n) / Cl(n) = 1.

This work is a part of the research in which the asymptotic likelihood of truth is estimated. We refer to Gardy [4] for a survey on probability distributions on Boolean functions induced by random Boolean expressions. For the purely implicational logic of one variable the exact value of the density of truth was computed in the paper [11] of Moczurad, Tyszkiewicz and Zaionc. It is well known that under the Curry-Howard isomorphism this result answered the question
Research described in this paper was partially supported by POLONIUM grant Quantitative research in logic and functional languages, cooperation between Jagiellonian University of Kraków, L'École Normale Supérieure de Lyon and Université de Versailles Saint-Quentin, contract number 7087/R07/R08.
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 100–109, 2008. © Springer-Verlag Berlin Heidelberg 2008
of finding the "density" of inhabited types in the set of all types. The classical logic of one variable and the two connectors – implication and negation – was studied in Zaionc [13]. Over the same language, the exact proportion between intuitionistic and classical logics has been determined in Kostrzycka and Zaionc [7]. Some variants involving formulas with other logical connectors have also been considered. The case of and/or connectors received much attention – see Lefmann and Savický [8], Chauvin, Flajolet, Gardy and Gittenberger [1], and Gardy and Woods [5]. Matecki [9] considered the case of the equivalence connector. In the latest paper [6] of Fournier, Gardy, Genitrini and Zaionc, the proportion between intuitionistic and classical logics when the overall number of variables is finite was studied. In this paper, the methods (and moreover the partition of formulas into several classes) developed in [6] are used in a different case, when the number of variables is arbitrary and not fixed. Formally, however, this paper and [6] are incomparable, in the sense that neither is an extension of the other.
2 Basic Facts

2.1 Catalan Numbers
The nth Catalan number is the number of binary trees with n internal nodes or, equivalently, n + 1 leaves. For our exposition it will be convenient to focus on leaves, therefore we denote by C(n) the (n−1)th Catalan number. Its (ordinary) generating function is

c(z) = ∑_{n∈N} C(n) z^n = (1 − √(1 − 4z)) / 2.

That function fulfils the following property:

c(z) = c(z)c(z) + z.    (1)

The radius of convergence of c(z) is 1/4 and lim_{z→(1/4)⁻} c(z) = 1/2. We have also the following property:

lim_{n→∞} C(n−1) / C(n) = 1/4.
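These facts are easy to check numerically. The following standalone sketch (our own code, not part of the paper) computes C(n) in closed form, verifies equation (1) coefficient-wise, and illustrates the limit of C(n−1)/C(n):

```python
from math import comb

def C(n):
    # C(n) = number of binary trees with n leaves = (n-1)-th Catalan number
    return comb(2 * (n - 1), n - 1) // n

N = 30
c = [0] + [C(n) for n in range(1, N)]  # c[n] = [z^n] c(z)

# Coefficient-wise check of equation (1): c(z) = c(z)*c(z) + z
for n in range(1, N):
    convolution = sum(c[i] * c[n - i] for i in range(n + 1))
    assert c[n] == convolution + (1 if n == 1 else 0)

# The ratio C(n-1)/C(n) approaches 1/4, the radius of convergence of c(z)
print(C(999) / C(1000))  # close to 0.25
```

The coefficient convolution is exactly the combinatorial statement behind (1): a tree is either a single leaf (the term z) or a pair of subtrees (the term c(z)c(z)).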
2.2 Algebraic Asymptotics
Lemma 1. Let f, g ∈ Z[[z]] be two algebraic generating functions, having (as complex analytic functions) unique dominating singularities at ρ ∈ R⁺. Suppose that these functions have Puiseux expansions around ρ of the form

f(z) = c_f + d_f (z − ρ)^{1/2} + o((z − ρ)^{1/2}),
g(z) = c_g + d_g (z − ρ)^{1/2} + o((z − ρ)^{1/2}).

Then

lim_{n→∞} [z^n]f(z) / [z^n]g(z) = lim_{z→ρ⁻} f′(z) / g′(z).
By the singularity analysis for algebraic generating functions (see e.g. Theorem 8.12 from [2]) we obtain that

lim_{n→∞} [z^n]f(z) / [z^n]g(z) = d_f / d_g.

On the other hand it can be easily calculated that lim_{z→ρ⁻} f′(z)/g′(z) = d_f/d_g. An analogous argument can be derived from the Szegő Lemma (see [12]).

2.3 Bell Numbers
The nth Bell number, denoted by B(n), is the number of equivalence relations which can be defined on some fixed set of size n. We use the following property, which can be derived from the asymptotic formula for Bell numbers by Moser and Wyman ([10], see also [3]):

B(n−1) / B(n) ∼ e log(n) / n.

2.4 Formulas
Implicational formulas can be represented by binary trees, suitably labeled: their internal nodes are labeled by the connector → and their leaves by variables. By |φ| we mean the size of a formula φ, which we define to be the total number of occurrences of propositional variables in the formula (or leaves in the tree representation of the formula). Parentheses (which are sometimes necessary) and the implication sign itself are not included in the size of expressions. Formally, |x_i| = 1 and |φ → ψ| = |φ| + |ψ|. We denote by F(n) the number of implicational formulas of size n. Let T be a formula (tree). It can be decomposed with respect to its right branch; hence it is of the form A₁ → (A₂ → (. . . → (A_p → r(T)) . . .)), where r(T) is a variable. We shall write it as T = A₁, . . . , A_p → r(T). The formulas A_i are called the premises of T, and r(T), the rightmost leaf of the tree, is called the goal of T.

2.5 Counting up to Names
It is easy to observe that the number of different formulas of size n is F(n) = B(n)C(n): C(n) corresponds to the shapes of formulas represented by trees, and B(n) to all possible distributions of variables in such a shape.
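This identity is easy to confirm for small sizes; the sketch below (our own code, with helper names of our choosing) computes B(n) and C(n) by standard recurrences and prints F(n):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def catalan_leaves(n):
    # number of binary tree shapes with n leaves: C(n) in the text
    if n == 1:
        return 1
    return sum(catalan_leaves(i) * catalan_leaves(n - i) for i in range(1, n))

@lru_cache(maxsize=None)
def bell(n):
    # number of set partitions of an n-element set: B(n) in the text;
    # a partition of the leaves records which leaves carry the same variable
    if n == 0:
        return 1
    return sum(comb(n - 1, k) * bell(k) for k in range(n))

def F(n):
    # formulas of size n, counted up to renaming of variables
    return bell(n) * catalan_leaves(n)

print([F(n) for n in range(1, 6)])  # [1, 2, 10, 75, 728]
```

For instance, for n = 3 there are 2 shapes and 5 partitions of the three leaves, hence F(3) = 10.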
3 Simple Tautologies
We follow the notation from [11], [14] and [6]. We are going to prove a theorem analogous to the main theorem of [6].

Definition 1. G is the set of simple tautologies, i.e. expressions that can be written as T = A₁, . . . , A_p → r(T), where there exists i such that A_i is a variable equal to r(T). We let (G(n))_{n∈N} be the sequence of the numbers of simple tautologies of size n.

It is easy to prove that simple tautologies are indeed intuitionistic tautologies. The asymptotic equivalence of classical and intuitionistic logic is a direct consequence of the following theorem.

Theorem 1. Asymptotically, all the classical tautologies are simple.

We are going to prove the theorem in three steps. First, we estimate the number of simple tautologies. Then, in two steps, we show that the number of remaining tautologies is asymptotically negligible.

Lemma 2. The fraction of simple tautologies among all formulas of size n is asymptotically equal to e log(n)/n.

Proof. First, we enumerate all the shapes of trees which can be labelled to be simple tautologies. The set of such trees will be denoted by GT. A tree belongs to GT if and only if it has at least one premise which is a leaf. Let GT(n, l) denote the number of trees of size n in which l premises are leaves. We define the bivariate generating function gt(x, z) = ∑_{n,l∈N\{0}} GT(n, l) z^n x^l. We use standard unlabeled constructions (see [3]) to obtain an explicit expression for gt(x, z). Clearly, for every tree t from GT there is a premise which is a leaf. The last such premise decomposes the sequence of premises of t uniquely into two sequences: the first consists of arbitrary trees, while the second consists of trees which are not leaves. Note that c²(z) is the generating function for trees which are not leaves. The corresponding constructions on generating functions yield:

gt(x, z) = 1/(1 − c²(z) − xz) · xz · 1/(1 − c²(z)) · z.    (2)
In the expression above the term 1/(1 − c²(z) − xz) corresponds to the sequence of trees which are either leaves (xz) or not (c²(z)). The second term xz corresponds to the last premise that is a leaf. The third term 1/(1 − c²(z)) corresponds to the remaining sequence of premises, which are not leaves. The last occurrence of z corresponds to the goal. Let us fix l ∈ N \ {0}. Let G_l(n) denote the number of simple tautologies of size n in which l premises are leaves. From the inclusion-exclusion principle we obtain
GT(n, l) · l · B(n−1) > G_l(n) > GT(n, l) · l · B(n−1) − GT(n, l) · (l(l−1)/2) · B(n−2).
The first inequality comes from the fact that in every simple tautology there is at least one premise which is equal to the goal. Here GT(n, l) corresponds to the shape of the tree, l corresponds to the possible choices of the premise, and B(n−1) corresponds to all possible labelings of variables (n−1, since one premise is chosen to be equal to the goal). Of course, the formulas in which many premises are equal to the goal are counted more than once. The second inequality comes from subtracting all the formulas in which at least two premises are equal to the goal (again, some formulas are subtracted many times). We have
∑_{l∈N\{0}} GT(n, l) · l · B(n−1) > ∑_{l∈N\{0}} G_l(n)

∑_{l∈N\{0}} G_l(n) > ∑_{l∈N\{0}} ( GT(n, l) · l · B(n−1) − GT(n, l) · (l(l−1)/2) · B(n−2) ),

therefore for every n ∈ N

( ∑_{l∈N\{0}} GT(n, l) · l · B(n−1) ) / F(n) > ( ∑_{l∈N\{0}} G_l(n) ) / F(n)    (3)

and

( ∑_{l∈N\{0}} G_l(n) ) / F(n) > ( ∑_{l∈N\{0}} ( GT(n, l) · l · B(n−1) − GT(n, l) · (l(l−1)/2) · B(n−2) ) ) / F(n).    (4)

We are going to find a succinct formula for the generating function \overline{gt}(z) = ∑_{n∈N} z^n ∑_{l∈N\{0}} l · GT(n, l). Taking the derivative of the function gt(x, z) with respect to x we obtain the generating function of that sequence:

gt_x(x, z) = ∑_{n,l∈N\{0}} GT(n, l) · l · x^{l−1} z^n.
It remains to substitute 1 for x to obtain the sought generating function. We can write that function explicitly by applying those operations to the explicit formula for gt(x, z). We obtain:

\overline{gt}(z) = ( z / (1 − c(z)) )² = c(z)²,

where the last equality results from (1). We encourage the reader to find a direct interpretation, in terms of trees, of the obtained expression for \overline{gt}(z). By differentiating gt(x, z) twice with respect to x, substituting 1 for x and multiplying by 1/2, we analogously obtain the generating function

\overline{\overline{gt}}(z) = ∑_{n∈N\{0}} z^n ∑_{l∈N\{0}} GT(n, l) · l(l−1)/2.
Hence \overline{\overline{gt}}(z) = (1/2) · gt_{xx}(1, z) = c(z)(c(z) − z). Both \overline{gt}(z) and \overline{\overline{gt}}(z) are algebraic generating functions with unique dominating singularities at 1/4. Hence, by Lemma 1:
lim_{n→∞} [z^n]\overline{gt}(z) / [z^n]c(z) = lim_{z→(1/4)⁻} \overline{gt}′(z) / c′(z) = 1

and

lim_{n→∞} [z^n]\overline{\overline{gt}}(z) / [z^n]c(z) = lim_{z→(1/4)⁻} \overline{\overline{gt}}′(z) / c′(z) = 3/4.

Therefore

( ∑_{l∈N} GT(n, l) · l · B(n−1) ) / ( C(n)B(n) ) = ( [z^n]\overline{gt}(z) / C(n) ) · ( B(n−1) / B(n) ) ∼ e log(n)/n

and

( ∑_{l∈N} GT(n, l) · (l(l−1)/2) · B(n−2) ) / ( C(n)B(n) ) = ( [z^n]\overline{\overline{gt}}(z) / C(n) ) · ( B(n−2) / B(n) ) ∼ (3/4) · ( e log(n)/n )².
Finally, from (3) and (4) we obtain

G(n)/F(n) = ( ∑_{l∈N} G_l(n) ) / F(n) ∼ e log(n)/n.    (5)
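The objects of this proof can also be explored by brute force. The sketch below (our own encoding, not the authors') enumerates all formulas of small sizes up to renaming of variables, decomposes each one into premises and goal, and checks that every simple tautology is in particular a classical tautology:

```python
from itertools import product

def shapes(n):
    # binary tree shapes with n leaves; a leaf is 'L', a node a pair
    if n == 1:
        return ['L']
    return [(a, b) for i in range(1, n)
            for a in shapes(i) for b in shapes(n - i)]

def labelings(n):
    # canonical variable assignments for n leaves (restricted growth
    # strings): one per set partition, hence bell(n) of them
    seqs = [[]]
    for _ in range(n):
        seqs = [s + [v] for s in seqs for v in range(max(s, default=-1) + 2)]
    return seqs

def decorate(shape, ids):
    # replace the leaves of a shape by variable ids, left to right
    it = iter(ids)
    def go(t):
        return next(it) if t == 'L' else (go(t[0]), go(t[1]))
    return go(shape)

def all_formulas(n):
    return [decorate(s, ids) for s in shapes(n) for ids in labelings(n)]

def premises_and_goal(t):
    # spine decomposition T = A1, ..., Ap -> r(T)
    prems = []
    while not isinstance(t, int):
        prems.append(t[0])
        t = t[1]
    return prems, t

def is_simple_tautology(t):
    prems, goal = premises_and_goal(t)
    return any(p == goal for p in prems)

def evaluate(t, assignment):
    if isinstance(t, int):
        return assignment[t]
    return (not evaluate(t[0], assignment)) or evaluate(t[1], assignment)

def variables(t):
    return {t} if isinstance(t, int) else variables(t[0]) | variables(t[1])

def is_classical_tautology(t):
    vs = sorted(variables(t))
    return all(evaluate(t, dict(zip(vs, bits)))
               for bits in product([False, True], repeat=len(vs)))

for n in range(1, 7):
    forms = all_formulas(n)
    simple = [t for t in forms if is_simple_tautology(t)]
    # every simple tautology is a classical (indeed intuitionistic) tautology
    assert all(is_classical_tautology(t) for t in simple)
    print(n, len(forms), len(simple), sum(map(is_classical_tautology, forms)))
```

The printed counts give F(n), G(n) and Cl(n) for small n; the asymptotic statement (5) itself is of course not observable at these sizes.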
It remains to estimate the number of tautologies which are not simple. Those are to be found among formulas which are neither simple tautologies nor simple nontautologies (a formula T is a simple nontautology if the goal of T does not occur as a goal of any premise of T). That means that in every such formula T there is at least one premise which is not a variable and whose goal is equal to the goal of T. First we will show that the number of formulas A₁, . . . , A_k → x in which x is a goal of at least two premises is negligible; the set of such formulas will be denoted by MP and the number of such formulas of size n by MP(n).

Lemma 3. The fraction of formulas A₁, . . . , A_k → x in which x is a goal of at least two premises among all formulas of size n is o( e log(n)/n ).

Proof. Let P(n, l) denote the number of trees of size n in which l premises are not leaves. Let p(x, z) = ∑_{n,l∈N} P(n, l) z^n x^l be its generating function. From equation (1) we know that

c(z) = z / (1 − c²(z) − z).
That equation can be interpreted in terms of combinatorial constructions. Every tree is a sequence of premises, followed by the goal. That translates to the expression c(z) = 1/(1 − c(z)) · z. Every tree is either a leaf or it consists of two subtrees, therefore we can substitute c²(z) + z for c(z) in the last equation. We add a formal parameter x for every premise which is not a leaf to obtain the generating function p(x, z):

p(x, z) = z / (1 − xc²(z) − z).

Every formula T ∈ MP has two premises with the goal equal to the goal of T, and those premises are not leaves. Therefore

MP(n) ≤ B(n−2) · ∑_{l∈N\{0}} P(n, l) · l(l−1)/2.
Note that

∑_{n∈N} z^n ∑_{l∈N} P(n, l) · l(l−1)/2 = (1/2) · p_{xx}(1, z) = c⁵(z) / (1 − c(z))² = c⁷(z) / z².

Hence
lim_{n→∞} [z^n]( c⁷(z)/z² ) / [z^n]c(z) = lim_{z→(1/4)⁻} ( c⁷(z)/z² )′ / c′(z) = 7/4.
It follows that

MP(n) / F(n) ≤ ( B(n−2) · ∑_{l∈N} P(n, l) · l(l−1)/2 ) / ( C(n)B(n) ) ∼ (7/4) · ( e log(n)/n )²

and, comparing to (5), formulas from MP are negligible.
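The key coefficient identity used here can be cross-checked numerically. The sketch below (our own code) counts, over all tree shapes of size n, the quantity ∑_l P(n, l)·l(l−1)/2 directly, and compares it with the coefficient [z^{n+2}] of c(z)⁷:

```python
from functools import lru_cache

N = 13  # compare coefficients up to z^N

# coefficients of c(z) from the recurrence given by equation (1)
c = [0] * (N + 1)
c[1] = 1
for n in range(2, N + 1):
    c[n] = sum(c[i] * c[n - i] for i in range(1, n))

def conv(a, b):
    # truncated product of two coefficient lists
    r = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j <= N:
                    r[i + j] += ai * bj
    return r

c7 = c
for _ in range(6):
    c7 = conv(c7, c)  # coefficients of c(z)^7

@lru_cache(maxsize=None)
def shapes(n):
    # binary tree shapes with n leaves; leaf = 'L', node = pair
    if n == 1:
        return ('L',)
    return tuple((a, b) for i in range(1, n)
                 for a in shapes(i) for b in shapes(n - i))

def nonleaf_premises(t):
    # l = number of premises of t that are not leaves
    l = 0
    while t != 'L':
        if t[0] != 'L':
            l += 1
        t = t[1]
    return l

for n in range(1, N - 1):
    lhs = sum(nonleaf_premises(t) * (nonleaf_premises(t) - 1) // 2
              for t in shapes(n))
    assert lhs == c7[n + 2]  # sum_l P(n,l) l(l-1)/2 = [z^(n+2)] c(z)^7
```

The smallest nontrivial case is n = 5, where exactly one shape has two non-leaf premises, matching [z⁷]c(z)⁷ = 1.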
Finally, we estimate the number of tautologies which have exactly one premise with goal equal to the goal of the whole formula (compare the part about less simple nontautologies in [6]). Let T be such a formula, and C be that premise. Let D be the first premise of C (C is not a variable), and r(D) be the goal of D (see Figure 1). A necessary condition for the formula T to be a tautology is that either r(D) is a goal of at least one premise of T or D, or r(T) is a goal of some premise of D. We estimate the number of such formulas. Let LT be the set of formulas T for which both of the following conditions hold:

1. T has exactly one premise C whose goal is r(T), and that premise is not a variable (this implies that T is neither a simple tautology nor a simple nontautology).
2. Let D be the first premise of C. At least one of the following conditions holds:
   (a) there is a premise of D with goal r(T),
   (b) there is a premise of D with goal r(D),
   (c) r(D) ≠ r(T) and there is a premise of T with goal r(D).
[Figure: tree diagram of a formula T with premises T₁, . . . , T_p, the distinguished premise C = D, C₁, . . . , C_q → r(T), and goal r(T)]

Fig. 1. Tautologies from LT
Lemma 4. The fraction of tautologies which are not simple among all formulas of size n is o( e log(n)/n ).

Proof. Clearly all the tautologies which are not simple and not in MP belong to LT. Let LT(n) denote the number of formulas from LT of size n. Since MP(n)/F(n) is o( e log(n)/n ), it is enough to prove the estimation for LT(n). Let LTT(n, m, l, k) denote the number of trees of size n which have l + 1 premises, and in which the first premise of the mth premise has exactly k premises. Let LTT(n, l, k) = ∑_{m∈N} LTT(n, m, l, k). Every such tree can be turned into a formula from LT by an appropriate assignment of variables, and every formula from LT can be constructed in this way. Therefore

LT(n) ≤ ∑_{l,k∈N} (l + k) · LTT(n, l, k) · B(n−2) + ∑_{l,k∈N} k · LTT(n, l, k) · B(n−2).

The first sum corresponds to the situation where r(D) occurs as a goal in some premise of D or T. The second one corresponds to the situation where r(T) occurs as a goal of some premise of D (these situations are not disjoint). Clearly

LT(n) ≤ 2 · ∑_{l,k∈N} (l + k) · LTT(n, l, k) · B(n−2).
Let ltt(x, y, z) = ∑_{l,k,n∈N} x^l y^k z^n LTT(n, l, k). We have

ltt(x, y, z) = 1/(1 − xc(z)) · z/(1 − yc(z)) · c(z) · 1/(1 − xc(z)) · z.

The first term 1/(1 − xc(z)) corresponds to the sequence of premises preceding the distinguished premise C. The component z/(1 − yc(z)) · c(z) corresponds to the premise
C (the formal parameter y counts the premises of the subtree corresponding to D). The last component 1/(1 − xc(z)) · z corresponds to the remaining premises of the main tree and the leaf (goal). Let ltt(x, z) = ltt(x, x, z); then ltt(x, z) = ∑_{l,k,n∈N} x^{l+k} z^n LTT(n, l, k) and ltt_x(1, z) = ∑_{l,k,n∈N} z^n (l + k) LTT(n, l, k). We denote the last function by \overline{ltt}(z); it is the generating function for a sequence, denoted (\overline{LTT}(n))_{n∈N}, which majorizes ( LT(n)/(2B(n−2)) )_{n∈N}. We can write it explicitly as

\overline{ltt}(z) = 3c⁶(z) / z².

Then we have

lim_{n→∞} \overline{LTT}(n) / C(n) = lim_{n→∞} [z^n]( 3c⁶(z)/z² ) / [z^n]c(z) = lim_{z→(1/4)⁻} ( 3c⁶(z)/z² )′ / c′(z) = 9.
Therefore

LT(n) / F(n) ≤ 2 · \overline{LTT}(n) · B(n−2) / ( B(n)C(n) ) ∼ 18 · B(n−2)/B(n) ∼ 18 · ( e log(n)/n )².
Theorem 1 is a direct consequence of Lemmas 2, 3 and 4.
4 Discussion
Actually, in this paper much more is proved. The result obtained is not related only to intuitionistic tautologies: it also holds in any logic which is able to prove simple tautologies. Indeed, all formulas of the form of simple tautologies are tautologies of every reasonable logic with this syntax. Therefore the results comparing the densities of any logics between the minimal and the classical one are the same, so the theorem proved may be applied as well to minimal, intuitionistic, and any intermediate logic. It shows, in fact, that a randomly chosen theorem has a proof which is a projection, and that statistically all true statements are the trivial ones. In this paper only the implicational fragment is taken into consideration. Right now we do not know the analogous result for more complex syntax, but based on our experience we believe that similar theorems hold for more complex syntaxes, including full propositional logic. At the moment these are just expectations, but it is certainly worth looking in this direction. Although all the discussed problems and methods are handled by purely mathematical means, the paper, as was suggested by the referees, may have some philosophical interpretation and impact. However, the paper is purely technical and we are not ready to comment on these philosophical issues.

Acknowledgements. We are very grateful to all three anonymous referees who suggested many improvements to the presentation of our paper.
References

1. Chauvin, B., Flajolet, P., Gardy, D., Gittenberger, B.: And/Or trees revisited. Combinatorics, Probability and Computing 13(4-5), 475–497 (2004)
2. Flajolet, P., Sedgewick, R.: Analytic combinatorics: functional equations, rational and algebraic functions. INRIA Research Report 4103 (2001)
3. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Book in preparation (2007), available at http://algo.inria.fr/flajolet/Publications/books.html
4. Gardy, D.: Random Boolean expressions. In: Colloquium on Computational Logic and Applications, Proceedings in DMTCS, Chambéry (France), June 2005, pp. 1–36 (2006)
5. Gardy, D., Woods, A.: And/or tree probabilities of Boolean functions. Discrete Mathematics and Theoretical Computer Science, 139–146 (2005)
6. Fournier, H., Gardy, D., Genitrini, A., Zaionc, M.: Classical and intuitionistic logic are asymptotically identical. In: Duparc, J., Henzinger, T.A. (eds.) CSL 2007. LNCS, vol. 4646, pp. 177–193. Springer, Heidelberg (2007)
7. Kostrzycka, Z., Zaionc, M.: Statistics of intuitionistic versus classical logic. Studia Logica 76(3), 307–328 (2004)
8. Lefmann, H., Savický, P.: Some typical properties of large And/Or Boolean formulas. Random Structures and Algorithms 10, 337–351 (1997)
9. Matecki, G.: Asymptotic density for equivalence. Electronic Notes in Theoretical Computer Science 140, 81–91 (2005)
10. Moser, L., Wyman, M.: An asymptotic formula for the Bell numbers. Transactions of the Royal Society of Canada XLIX (1955)
11. Moczurad, M., Tyszkiewicz, J., Zaionc, M.: Statistical properties of simple types. Mathematical Structures in Computer Science 10(5), 575–594 (2000)
12. Wilf, H.: generatingfunctionology, 3rd edn. A K Peters (2006)
13. Zaionc, M.: On the asymptotic density of tautologies in logic of implication and negation. Reports on Mathematical Logic 39, 67–87 (2005)
14. Zaionc, M.: Probability distribution for simple tautologies. Theoretical Computer Science 355(2), 243–260 (2006)
In the Search of a Naive Type Theory

Agnieszka Kozubek and Paweł Urzyczyn

Institute of Informatics, University of Warsaw, Poland
{kozubek,urzy}@mimuw.edu.pl
Abstract. This paper consists of two parts. In the ﬁrst part we argue that an appropriate “naive type theory” should replace naive set theory (as understood in Halmos’ book) in everyday mathematical practice, especially in teaching mathematics to Computer Science students. In the second part we make the ﬁrst step towards developing such a theory: we discuss a certain pure type system with powerset types. While the system only covers very initial aspects of the intended theory, we believe it can be used as an initial formalism to be further developed. The consistency of this basic system is established by proving strong normalization.
1 Why Not Set Theory?
Set theory is an enormous success in contemporary mathematics, including the mathematics relevant to Computer Science. Virtually all maths is developed within the framework of set theory, and virtually all books and papers are written under the silent assumption of the ZF or ZFC axioms occurring "behind the back". We sometimes feel as if we actually lived in set theory, as if it was the only true and real world. The set-theoretical background has made its way to education, from the university to the kindergarten level, and what once was a foundational subject on the border of logic and philosophy has now become a part of elementary mathematics. And indeed, set theory deserves its pride. From an extremely modest background—the notion of "being an element" and the idea of equality—it develops complex notions and objects serving the needs of even the most demanding researcher. Enjoying the paradise of sets, we tend to forget about the price we pay for that. Of course, we must avoid paradoxes, and thus the set formation patterns are severely restricted. We must give up Cantor's idea of "putting together" any collection of objects, resigning therefore, at least partly, from the very basic intuition that a set of objects can be selected by any criterion at all.

Universes vs predicates. In fact, there are two very basic intuitions that are glued together into the notion of a "set":
Partly supported by the Polish Government Grant 3 T11C 002 27, and by the EU Coordination Action 510996 “Types for Proofs and Programs”.
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 110–124, 2008. © Springer-Verlag Berlin Heidelberg 2008
– Set as a domain or universe;
– Set as a result of selection.

We used to treat this identification as natural and obvious. But perhaps only because we were taught to do so. These two ideas are in fact different, and this very confusion is responsible for Russell's paradox. In addition, ordinary mathematical practice often makes an explicit difference between the two aspects. Mathematicians have been classifying objects according to their domain, kind, sort or type since antiquity [2,21]. An empty set of numbers and an empty set of apples are intuitively not the same, just as in most cases we do not need and do not want to treat a function in the same way as its arguments. The difference between domains (types) and predicates is made explicit in type theory. This results in various simplifications. For instance, the difference between operations on universes (product, disjoint sum) and operations on predicates (intersection, set union) becomes immediately apparent and natural. Yet another example is that a union ⋃A of a family A of sets is typically of the same "type" as the members of A, rather than as A itself. In set theory this argument is not sufficient to reject common students' misconceptions like e.g. ⋃A ⊆ A, because classifying sets (a priori) into types is illegal.

Everyday maths vs foundations of mathematics. The purpose of set theory was to give a universal foundation for a consistent mathematics. That happened at the beginning of the 20th century, when the consistency of elementary notions was a serious issue, threatened by the danger of antinomies, and when modern formal mathematics was in its infancy. It was then important to ensure as much security as possible. Therefore all the development had to be done from first principles, and the results of it have little to do with ordinary mathematical practice. For instance, using the Axiom of Foundation one derives in set theory the surrealistic conclusion that all the world is built from curly braces.
This foundational tool is now being widely used for a quite different purpose. We use sets as a basic everyday framework for various kinds of mathematics, and we teach set theory to students, beginning at a very elementary level. But that puts us into an awkward situation. On the one hand, we want to use as much common sense as possible; on the other hand, we do not want paradoxes and inconsistency. So if we do not want to cheat, what can we do? One possibility is to hide the problem and pretend that everything is OK: "Emmm. . . We assume that all sets under consideration are subsets of a certain large set." This is what often happens in elementary and high-school textbooks. But is it really different than saying that the world is placed on the back of a giant turtle? An intelligent student must eventually ask on whose back the turtle is standing. And then all we can say is "Sit up straight!" The other option is to pull the skeleton out of the closet, put all the axioms on the table, and pay a heavy overhead by spending a lot of effort on constructing ordered pairs in the style of Kuratowski, integers in the style of von Neumann, and so on. This approach is common at the university level and has been considerably mastered. For half a century, the book [18] by Halmos has been giving guidance to lecturers on how to achieve a balance between precision and simplicity.
(Contrary to its title, the book is not about naive set theory. It is about axiomatic set theory taught in a "naive" or "commonsense" style.) But even this didactic masterpiece is a certain compromise.

The idea vs the implementation. This is because the overhead is unavoidable. Very basic mathematical ideas must be encoded in set theory before they can be used, and a substantial part of a student's attention is paid to the details of the encoding. To a large extent this is a wasted effort, and it would certainly be more efficient to concentrate on "top-level" issues. Using an old comparison in a different context, it is like teaching the details of fuel injection in a driving school, while we should rather let students practice driving. Getting accustomed to the set-theoretical "implementation" of mathematics is painful to many students. In set theory the implementation is not "encapsulated" at all, and we can smell the fuel in the passenger's cabin. One of the most fundamental of God's creations is turned into a transitive set of von Neumann's numbers, so we must live with phenomena like 1 ∈ 2 ⊆ 3 ∈ 4 or ⋃N = N. We do not really need these phenomena. The actual use of various objects and notions in mathematics is based on their intensional "specification" rather than formal implementation. We still have to ask students to remember the rule

⟨a, b⟩ = ⟨c, d⟩ iff a = c ∧ b = d,    (*)

in addition to the definition ⟨a, b⟩ = {{a}, {a, b}}. But we must spend time on proving the above equivalence. A doubtful reward is the malicious homework "Prove that ⋃⋃(N × N) = N." We got so used to such homeworks that we do not notice that they are nonsense. In a typed framework a substantial part of this nonsense simply disappears.
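Both the characteristic property (*) of Kuratowski pairs and the leakiness of the encoding are easy to demonstrate mechanically; here is a toy sketch with Python frozensets (ours, purely illustrative):

```python
def pair(a, b):
    # Kuratowski: <a, b> = {{a}, {a, b}}
    return frozenset({frozenset({a}), frozenset({a, b})})

# the characteristic property (*): <a,b> = <c,d>  iff  a = c and b = d,
# checked exhaustively over a small universe
U = range(4)
for a in U:
    for b in U:
        for c in U:
            for d in U:
                assert (pair(a, b) == pair(c, d)) == (a == c and b == d)

# the implementation leaks: the union of the members of the pair <1, 2>
# is {1, 2} -- an artefact of the encoding, not a property of pairs
print(set().union(*pair(1, 2)))  # {1, 2}
```

Note that the degenerate case pair(a, a) collapses to {{a}}, and the exhaustive check confirms that (*) still holds for it.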
2 Why Type Theory and What Type Theory?
We believe that an appropriate type theory should give a chance to build a framework for "naive" mathematics that would not exhibit many of the drawbacks mentioned above. In particular, it is reasonable to expect that a "naive type theory" can be more adequate than "naive set theory" from our point of view, in that it should

– be free from both paradoxes and unnecessary artificial formalization;
– distinguish between domains (universes) and sets understood as predicates;
– begin with intensional specifications rather than from bare first principles;
– be closer to the everyday maths and computer science practice;
– be more appropriate for automatic verification.
We do not want to depart from ordinary mathematical practice, and thus our naive type theory should be adequate for classical reasoning, and extensional with respect to functions and predicates. We find it however methodologically appropriate that these choices are made explicit (introduced by appropriate axioms) rather than implicit (built into the design principles). We would also like to include a Curry-Howard flavour, taking seriously De Bruijn's slogan [6]:
Treating propositions as types is definitely not in the way of thinking of the ordinary mathematician, yet it is very close to what he actually does.

The basic idea is of course to separate the two roles played by sets, namely to put apart domains (types) and predicates (selection criteria for objects of a given type). Thus for any type A we need a powerset type P(A), identified with the function space A → ∗, where ∗ is the sort of propositions. That is, we would like to treat "M ∈ {a : A | ϕ(a)}" as syntactic sugar for "ϕ(M)". Although our principal aim is a "naive" approach, we should be aware of the necessity of a formalization. Firstly, because we still need some justification for consistency; secondly, because it may be desirable that "naive" reasoning can be computer-assisted. We find it quite natural and straightforward to build such a formalization beginning with a certain pure type system, to be later extended with additional constructs and axioms.

Related systems. Simple type theory: In Church's simple type theory [8,21] there are two base types: the type i of individuals and the type b of truth values. Expressions have types, and formulas are simply expressions of type b. There is no built-in notion of a proof, and formulas are not types. In addition to lambda-abstraction, there is another binding operator that can be used to build expressions, namely the definite description ιx. ϕ(x), meaning "the only object x that satisfies ϕ(x)". While various forms of definite description are often used in the informal language of mathematics, the construct does not occur in most contemporary logical systems. As argued by William Farmer in a series of papers [11,12,13,14], simple type theory could be efficiently used in mathematical practice and teaching. Also the textbook [2] by P.B. Andrews develops a version of simple type theory as a basis for everyday mathematics. This is very much in line with our way of thinking.
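The powerset-as-predicates idea sketched above can be rendered almost verbatim in a modern proof assistant. The following Lean-style fragment (our own illustration, not the system developed in this paper) treats P(A) as the function space A → Prop, so that membership in {a : A | ϕ(a)} unfolds, by definition, to ϕ(M):

```lean
-- P(A) as the function space A → Prop
def PSet (A : Type) : Type := A → Prop

-- membership in a "set" is just application of the predicate
def mem {A : Type} (x : A) (s : PSet A) : Prop := s x

-- an empty set of numbers and an empty set of functions live in
-- different types, so they cannot be confused
def emptyNat : PSet Nat         := fun _ => False
def emptyFun : PSet (Nat → Nat) := fun _ => False

-- "M ∈ {a : A | ϕ a}" is definitionally just "ϕ M"
example (ϕ : Nat → Prop) (M : Nat) : mem M ϕ = ϕ M := rfl
```

The `rfl` proof works precisely because no encoding stands between the set-builder and the predicate: the two sides are equal by unfolding definitions.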
We choose a slightly different approach, mostly to avoid the inherently two-valued Boolean logic built into Church's type theory. Quine's New Foundations: Quine's type theory [20,23] is based on an implicit linear hierarchy of universes. Full comprehension is possible at each level, but a set always lives at a higher floor than its elements. The idea of a linear hierarchy is of course convenient from a foundational point of view, but it is not very intuitive. Also, implementing “ordinary” mathematics requires a similar effort as in the usual set theory. The restriction to stratified constructs does not help either: one encounters difficulties when trying to define functions between objects belonging to different levels of the hierarchy. Constable's computational naive type theory: We have to admit that the title of Halmos' book has already been rephrased by R. Constable [9]. But Constable's idea of a “naive type theory” is quite different from ours. It is inspired by Martin-Löf's theory and based on the idea of a setoid type, determined by a domain of objects plus an appropriate notion of equality. (In other words, quotient becomes a basic notion.) For instance, the field Z3 has the same domain as the set of integers Z, but a different equality. And Z6 is defined by taking an “intersection” of Z2 and Z3. This is a very convenient and natural way of dealing with quotient constructions. However (even putting aside the slight counterintuitiveness of the “contravariant” intersection) we still believe that a “naive” notion of equality
114
A. Kozubek and P. Urzyczyn
should be more strict: two objects should not be considered the same in one context but different in another. Coq and the calculus of inductive constructions: An almost adequate framework for a naive type theory is the Calculus of Constructions extended with inductive types. This is essentially the basic part of the type theory of the Coq proof assistant [5]. The paper [7] describes an attempt to use Coq in teaching rudiments of set theory. But in Coq, if A is a type (A : Set is provable) then the powerset A → Prop of A is a kind (A → Prop : Type is provable). That is, a set and its powerset do not live in the same sort, although they should receive similar treatment. Weyl's predicative mathematics and Luo's logic-enriched theories: Zhaohui Luo in [22] considers “logic-enriched type theories”, where the logical aspect is separated by design from the datatype aspect (in particular a separate kind Prf(P) is used for proofs of any proposition P). Within that framework one can introduce both predicative and impredicative notions of a set, so that the kind Type is closed under the powerset construction. This approach is used by Adams and Luo [1] to formalize the predicative mathematics of Weyl [25], who long ago made an explicit distinction between “categories” and sets, understood respectively as universes and predicates. Weyl's theory is strictly predicative, and this certainly departs from our “naive” understanding of sets, but the impredicative version mentioned in [22] is very much consistent with it. In this paper. In the next section we collect a few postulates concerning the possible exposition of a naive type theory. With this exposition we would like to initiate a discussion to help establish a new approach to both teaching and using mathematics in a way that will avoid the set-theoretic “overheads” and remain sufficiently precise and paradox-free.
We realize that a naive approach to type theory can result in an inconsistency, as happened to naive set theory and many other ideas. Therefore we consider it necessary to build the naive approach on top of a rigorous formal system, to be developed in parallel. The relation between the formal language and the naive theory should be similar to the relation between the first-order ZFC formal theory and Halmos' book [18]. In the present paper we do not attempt to solve the problem in general but rather to formulate it explicitly and highlight its importance. On the technical side, we only address here one very initial but important problem. A set X and its powerset P(X) should be objects of the same sort, and we also assume that subsets of X should be identified with predicates on X. In the language of pure type systems that leads to the idea of a type assignment of the form X → ∗ : ∗, which turns out to imply inconsistency. In Section 4 we show that this inconsistency can be eliminated if the difference between propositions and types is made explicit. More precisely, we prove strong normalization (and thus consistency) of an appropriate PTS. Of course, that is only the first step. We need a much richer consistent system to back up our “practical” exposition of sets, functions, and composite types, as sketched in the previous section. This will most likely require extending our
system LNTT by various additional constructs, in particular a general scheme for inductive types, and additional axioms. All this is future work.
3 Informal Exposition
In this section we sketch some basic ideas of how a “naive” informal presentation of basic mathematics could look when set theory is replaced by type theory. As we said, these ideas go far beyond the initial formalism of Section 4. Types. Every object is assigned a type. Types themselves are not objects.¹ Certain types are postulated by axioms, and many of these should be special cases of a general scheme for introducing inductive (perhaps also coinductive) types. In particular, the following should be assumed:
– A unit type with a single element nil.
– Product types A × B and coproduct types A + B, for any types A, B.
– The type N of integers.
– The powerset type P(A), for any type A.
– Function types A → B, perhaps as a special case of a more general product.
In particular, a powerset P(A) of a type A should form a type and not a kind (i.e., it should be in the same sort as A), so that operations on types can be applied equally to both. Otherwise the classification of compound objects becomes unreasonably complicated: just think of a product A × P(A) × (A → P(A)). Types come together with their constructors, eliminators etc., their properties postulated by axioms. For instance, the equivalence (*) should be an axiom. Equality. In “ordinary” mathematics two objects are equal iff they are the same object; one can do the same in the typed framework. As in common mathematical practice, equality between sets, functions, etc. should be extensional. In the formal model Leibniz's equality should probably be an axiom. Sets. A predicate ϕ(x), where x : A, is identified with a subset {x : A | ϕ(x)} of type A. Subsets are assumed to be extensional, i.e., ϕ = ϕ′ iff ∀x:A. ϕ(x) ↔ ϕ′(x). Inclusion is defined as usual by ϕ ⊆ ϕ′ iff ∀x:A. ϕ(x) → ϕ′(x). Set union and intersection, as well as the complement −ϕ = {x : A | ¬ϕ(x)}, are well-defined operations on sets. Note the difference between operations on sets (like union) and on types (like disjoint sum). An indexed family of sets is given by any 2-argument predicate, so that e.g. we can write the ordinary definition ⋂_{y:Y} A_y = {x : X | ∀y:Y. A_y(x)}. Should we need an intersection indexed by elements of a set rather than a type, we must explicitly include it in the definition by writing ⋂_{y∈ψ} A_y = {x : X | ∀y:Y (ψ(y) → A_y(x))}.
¹ At least not yet. We may have to relax this restriction if we want to deal with, e.g., objects of type “semigroup”. This may lead to an infinite hierarchy of universes.
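The sets-as-predicates view has a direct functional reading: a subset of a type A is any function from A to truth values, and the algebra of sets is pointwise logic. A minimal Python sketch, where bool-valued functions stand in for predicates and index types are approximated by finite iterables (all names here are assumptions of this sketch):

```python
def union(phi, psi):      return lambda x: phi(x) or psi(x)
def inter(phi, psi):      return lambda x: phi(x) and psi(x)
def complement(phi):      return lambda x: not phi(x)

# membership "M ∈ {a : A | ϕ(a)}" is just predicate application ϕ(M)
even = lambda n: n % 2 == 0
positive = lambda n: n > 0

print(inter(even, positive)(4))         # → True
print(complement(even)(4))              # → False

# an intersection indexed by elements y of a type Y:
#   ⋂_{y:Y} A_y = {x | ∀y:Y. A_y(x)}
def indexed_inter(family, index_domain):
    return lambda x: all(family(y)(x) for y in index_domain)

# A_y = multiples of y, for y ranging over {2, 3}: intersection = multiples of 6
mult = lambda y: (lambda x: x % y == 0)
print(indexed_inter(mult, [2, 3])(12))  # → True
print(indexed_inter(mult, [2, 3])(8))   # → False
```

Note that, as in the text, union and intersection here operate on predicates over one fixed type and should not be confused with the type-level operations A + B and A × B.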
At this stage, one can prove standard results about the properties of the algebra of sets. Subsets of a Cartesian product A × A are of course called relations, and we can discuss properties of relations and introduce constructions like transitive closure and so on. Equivalences and quotients. While a definition of an equivalence relation (possibly partial) over a type A presents no difficulty, the notion of a quotient type must be postulated separately. Clearly, for every a : A we could consider a set [a]_r = {b : A | b r a}, and form a subset of P(A) consisting of all such sets. However, that would be inconsistent with our main idea: a domain of interpretation is always a type and not a set. Also, there is no actual reason to define the abstract objects, the equivalence classes, as equivalence sets, as is done in set theory. There is a difference between abstraction and implementation. For instance, we define rationals from integers, but we do not think of 1/2 as a set. The quotient type A/r induced by a (partial) equivalence r should be equipped with a canonical (partial) map abstract : A → A/r and (as a form of the axiom of choice) one could also postulate another map select : A/r → A satisfying abstract ∘ select = id_{A/r}. The one-to-one correspondence between the quotient type and the set of equivalence classes should then be proven as a theorem (“the principle of equivalence”). Functions: total or partial? The notion of a function brings the first serious difficulty. In typed systems, once we assert f : A → B and a : A we usually conclude f(a) : B. That means we treat f(a) as a legitimate, well-defined object of type B. Everything works well as long as we can assume that all functions from A to B are total. However, it can happen that a function is defined only on a certain subset A′ of a given domain.
In set theory this is not a problem, because both the type of arguments and the actual domain are simply sets, and we can always take f : A′ → B rather than f : A → B. In the typed framework, we would still like to say that e.g. λx:R. 1/x maps reals to reals, but the domain of the function is a proper subset of the type R. There are several possible solutions to this problem, see [11,15]. Perhaps the most adequate solution for our needs is to distinguish between the function space A → B and the type A −◦ B of partial functions. Then one assigns a domain predicate dom(f) to every partial function f, and restricts the use of the expression f(a) to the cases when a ∈ dom(f). This seems to be quite consistent with ordinary mathematical practice. The old idea of a definite description may turn out useful in this respect. A standard function definition may have the form f(x) = ιy.ϕ(x, y), or equivalently f = λx. ιy.ϕ(x, y), and we would postulate an axiom of the form x ∈ dom(λx. ιy.ϕ(x, y)) iff ∃!y. ϕ(x, y). Extensionality for partial functions would then be stated as f = g ↔ (dom(f) = dom(g)) ∧ ∀x (x ∈ dom(f) → f(x) = g(x)). This approach assumes that f(a) is a well-formed expression of the appropriate type only if a ∈ dom(f), a problem that does not formally occur² in set theory,
² But it occurs in practice: e.g. f(x) = y can be understood differently than ⟨x, y⟩ ∈ f.
where f(x) = y is syntactic sugar for ⟨x, y⟩ ∈ f. In type theory, it is more natural to refrain from entering this level of extensionality, and to assume function application as a primitive. Mathematics is not an exact science. Various identifications are common in mathematical practice. In a strictly typed framework such identifications are unavoidable. For instance, we would like to treat total functions as special cases of partial functions, even if these are of two different types. There is a natural coercion from A → B to A −◦ B, which can, at the meta-level, be treated as identity without creating confusion. Also the difference between types and subsets becomes inconvenient in certain situations. One specific example is when we have an algebra with a domain represented as a type, and we need to consider a subalgebra based on a subset of that domain. Then we would prefer to have the “large” and the “small” domain living in the same sort. To overcome this difficulty, one may have to postulate a selection scheme: for every subset S of type A there exists a type A_S, such that objects of type A_S are in a bijective correspondence with elements of S. This partially brings back the identification of domains and predicates, but in a controlled way.
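The partial-function discipline sketched in this section, where the application f(a) is legitimate only when a ∈ dom(f), can be modelled by pairing a domain predicate with the underlying map. The class name and the choice of error are assumptions of this sketch:

```python
class Partial:
    """A partial function A -o B: an underlying map plus a domain predicate.

    Application is guarded: f(a) is well-formed only when a is in dom(f),
    mirroring the restriction on the expression f(a) discussed in the text.
    """
    def __init__(self, fn, dom):
        self.fn, self.dom = fn, dom

    def __call__(self, a):
        if not self.dom(a):
            raise ValueError(f"{a} is not in the domain")
        return self.fn(a)

# λx:R. 1/x "maps reals to reals", but its domain is the proper subset x ≠ 0
recip = Partial(lambda x: 1 / x, lambda x: x != 0)
print(recip(4.0))      # → 0.25
print(recip.dom(0.0))  # → False; recip(0.0) would raise
```

Extensional equality of two such objects would compare the domain predicates and the values on the shared domain, exactly as in the formula f = g ↔ (dom(f) = dom(g)) ∧ ∀x (x ∈ dom(f) → f(x) = g(x)).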
4 Naive Type Theory as a Pure Type System
The assumption that a set and a powerset should live in the same sort leads naturally to the following idea: consider a pure type system with the usual axiom ∗ : □ and with the rule (∗, □, ∗). This rule makes it possible to build products of the form Πx:A. κ, where A : ∗ and κ : □, and the product itself is then a type (is assigned the sort ∗). In particular, the function space A → ∗ is a type, and this is exactly the powerset of A. A subset of A is then represented by any abstraction λx:A. ϕ(x), where ϕ(x) is a (dependent) proposition. Unfortunately, this idea is too naive. As pointed out by A.J.C. Hurkens and H. Geuvers, this theory suffers from Girard's paradox, and thus it is inconsistent. Theorem 1 (Geuvers, Hurkens [17]). Let VNTT (Very Naive Type Theory) be an extension of λP by the additional rule (∗, □, ∗). Then every type is inhabited in VNTT (every proposition has a proof). Proof. The proof is essentially the same as Hurkens' proof in [19] (cf. the version given in [24, chapter 14]) for the system λU−. There are two essential factors that imply that Russell's paradox can be implemented in a theory: – A powerset P(x) of a domain x of a sort s lives in the same sort s. – There is enough polymorphism available in s to implement a construction of an inductive object μx:s. P(x). In λU− we have s = □ and polymorphism on the kind level is directly available. But almost the same can happen in VNTT, for s = ∗. Indeed, the powerset A → ∗ of any type A is a type, and although type polymorphism as such is not
present, it sneaks in easily by the back door. Instead of quantifying over types, one can quantify over object variables of type T → ∗, where T is any type. Thus instead of using μt : ∗.P (t) = ∀t(∀u: ∗ ((u → t) → P (u) → t) → t) one takes a : T and then deﬁnes μt : ∗.P (t) = ∀x : T → ∗ (∀y : T → ∗ ((ya → xa) → P (ya) → xa) → xa), with essentially the same eﬀect.
It follows that our naive type theory cannot be too naive, and must avoid the danger of Girard's paradox. The solution is to distinguish between propositions and sets, like in Coq. Define a pure type system LNTT (Less Naive Type Theory) with four sorts ∗t, ∗p, □t, □p, with axioms (∗t : □t) and (∗p : □p) and with the following rules: (∗t, ∗t, ∗t), (∗p, ∗p, ∗p), (∗t, □t, □t), (∗t, ∗p, ∗p), (∗t, □p, ∗t). The first and second rules represent, respectively, the formation of function types and logical implication; the third rule is for dependent types and the fourth one permits quantification over objects of any type. The last rule is for the powerset. Note that there is no polymorphism involved, as rule (□t, ∗t, ∗t) can be fatal; however, impredicativity is still present because of rule (∗t, □p, ∗t).
Strong Normalization
First note that, as all PTSs with only β-reduction, the system LNTT has the Church-Rosser property and the subject reduction property on well-typed terms [16]. Moreover, LNTT is a singly sorted PTS [3], so the uniqueness of types also holds.
Definition 2. In a fixed context Γ we use the following terminology.
1. A is a term if and only if there exists B such that Γ ⊢ A : B or Γ ⊢ B : A.
2. A is a kind if and only if Γ ⊢ A : □t.
3. A is a constructor if and only if there exists B such that Γ ⊢ A : B : □t.
4. A is a type if and only if Γ ⊢ A : ∗t.
5. A is a formula if and only if Γ ⊢ A : ∗p.
6. A is an object if and only if there exists B such that Γ ⊢ A : B : ∗t.
7. A is a proof if and only if there exists B such that Γ ⊢ A : B : ∗p.
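The PTS specification of LNTT (sorts, axioms, and product rules) is small enough to tabulate as data, after which one can mechanically ask which products are allowed. The string tags `"*t"`, `"[]t"`, etc. are an ad-hoc ASCII encoding assumed only for this sketch:

```python
# Sorts of LNTT: *t (types), *p (propositions), and their kinds []t, []p
SORTS = {"*t", "*p", "[]t", "[]p"}
AXIOMS = {"*t": "[]t", "*p": "[]p"}

# A rule (s1, s2, s3) allows forming Πx:A.B with A:s1, B:s2, product in s3
RULES = {
    ("*t", "*t"): "*t",    # function types
    ("*p", "*p"): "*p",    # logical implication
    ("*t", "[]t"): "[]t",  # dependent types
    ("*t", "*p"): "*p",    # quantification over objects of any type
    ("*t", "[]p"): "*t",   # powerset: A -> *p is itself a type
}

def product_sort(s1, s2):
    """Sort of Πx:A.B when A:s1 and B:s2, or None if not allowed in LNTT."""
    return RULES.get((s1, s2))

# the powerset A -> *p of a type A lives in *t, the same sort as A:
print(product_sort("*t", "[]p"))  # → *t
# no polymorphism: ([]t, *t, *t) is deliberately absent
print(product_sort("[]t", "*t"))  # → None
```

The absent entry for `("[]t", "*t")` records exactly the restriction the text insists on: admitting that rule would reintroduce the polymorphism that makes Girard's paradox expressible.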
The classification of terms in LNTT is more complicated than e.g. in the calculus of constructions λC. While in λC there is a simple “linear” hierarchy (from objects via types/constructors to kinds), in LNTT we also have a separate hierarchy from proofs via formulas to ∗p. The relation between the two hierarchies is not straightforward: in some respects formulas correspond to types, in others to objects. This is why we need two translations in the proof of strong normalization. We use the notation TermΓ, KindΓ, ConstrΓ, TypeΓ, PropΓ, ObjΓ, and ProofΓ for, respectively, terms, kinds, constructors, types, formulas, objects, and proofs of the context Γ. The following lemma explains the various cases.
Lemma 3. Assume a fixed context Γ.
– If A is a term such that Γ ⊢ A : □p then A = ∗p.
– If A is a kind then A is of the following form:
• A = ∗t, or
• A = Πx:τ.B where τ is a type and B is a kind.
– If A is a constructor then
• A is a type, or
• A is a variable, or
• A is of the form λx:τ.κ where τ is a type and κ is a constructor, or
• A is of the form κM where M is an object and κ is a constructor.
– If A is a type then
• A is a type variable, or
• A is of the form Πx:τ.σ where τ and σ are types, or
• A is of the form Πx:τ.∗p where τ is a type, or
• A is of the form κM where M is an object and κ is a constructor.
– If A is a formula then
• A is a propositional variable, or
• A is of the form Πx:ϕ.ψ where ϕ and ψ are formulas, or
• A is of the form Πx:τ.ϕ where τ is a type and ϕ is a formula, or
• A is of the form M N where M and N are objects.
– If A is an object then
• A is an object variable, or
• A is of the form λx:τ.N where τ is a type and N is an object, or
• A is of the form λx:τ.ϕ where τ is a type and ϕ is a formula, or
• A is of the form M N where M and N are objects.
– If A is a proof then
• A is a proof variable, or
• A is of the form λx:τ.D where τ is a type and D is a proof, or
• A is of the form λx:ϕ.D where ϕ is a formula and D is a proof, or
• A is of the form D1 D2 where D1 and D2 are proofs, or
• A is of the form DN where D is a proof and N is an object.
Lemma 4. If A is a term which is not a proof and B is a subterm of A then B is not a proof.
Proof. This is an immediate consequence of Lemma 3.
Note that it follows from Lemma 4 that all formulas of the form Πx:ϕ. ψ, where ϕ and ψ are formulas, are actually implications (can be written as ϕ → ψ), because the proof variable x cannot occur in ψ. The first part of our strong normalization proof applies to all terms but proofs. For a fixed context Γ we define the translation TΓ : TermΓ \ ProofΓ → Term(λP2) from terms of LNTT into the system λP2. Special variables Bool, Forall and Impl will be used in the definition of T. Types for these variables are given by the following context: Γ0 = {Bool : ∗, Forall : Πτ:∗. (τ → Bool) → Bool, Impl : Bool → Bool → Bool}.
The definition of the translation TΓ follows:
– TΓ(□t) = □;
– TΓ(□p) = ∗;
– TΓ(∗t) = ∗;
– TΓ(∗p) = Bool;
– TΓ(x) = x, when x is a variable;
– TΓ(Πx:A.B) = Πx:TΓ(A). TΓ,x:A(B), for products created with the rules (∗t, ∗t, ∗t), (∗t, □p, ∗t), (∗t, □t, □t);
– TΓ(Πx:τ.ϕ) = Forall TΓ(τ) (λx:TΓ(τ). TΓ,x:τ(ϕ)), for products created with the rule (∗t, ∗p, ∗p);
– TΓ(Πx:ϕ.ψ) = Impl TΓ(ϕ) TΓ(ψ), for products created with the rule (∗p, ∗p, ∗p);
– TΓ(λx:A.B) = λx:TΓ(A). TΓ,x:A(B);
– TΓ(AB) = TΓ(A) TΓ(B).
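The translation T can be sketched as a recursion on raw syntax. Since the PTS rule that formed a product is not visible in a raw term, the sketch below simply tags each product with its rule; the tuple encoding of terms, the tags, and the omission of the context subscript of TΓ are all assumptions of this sketch:

```python
# Terms as tuples: ("sort", s), ("var", x), ("pi", rule, x, A, B),
# ("lam", x, A, B), ("app", A, B).  Each product carries the rule
# that formed it, because raw syntax alone does not determine it.

def T(term):
    tag = term[0]
    if tag == "sort":
        return {"[]t": ("sort", "[]"), "[]p": ("sort", "*"),
                "*t": ("sort", "*"), "*p": ("var", "Bool")}[term[1]]
    if tag == "var":
        return term
    if tag == "pi":
        _, rule, x, a, b = term
        if rule == ("*t", "*p", "*p"):      # quantification: encode via Forall
            return ("app", ("app", ("var", "Forall"), T(a)),
                    ("lam", x, T(a), T(b)))
        if rule == ("*p", "*p", "*p"):      # implication: encode via Impl
            return ("app", ("app", ("var", "Impl"), T(a)), T(b))
        return ("pi", rule, x, T(a), T(b))  # remaining rules: stay a product
    if tag == "lam":
        _, x, a, b = term
        return ("lam", x, T(a), T(b))
    if tag == "app":
        return ("app", T(term[1]), T(term[2]))

# ∀x:nat. p  becomes  Forall nat (λx:nat. p)
print(T(("pi", ("*t", "*p", "*p"), "x", ("var", "nat"), ("var", "p"))))
# → ('app', ('app', ('var', 'Forall'), ('var', 'nat')), ('lam', 'x', ('var', 'nat'), ('var', 'p')))
```

The two reification cases show concretely why formulas of LNTT become objects of λP2: quantification and implication are turned into applications of the object-level constants Forall and Impl from Γ0.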
For the sake of simplicity we omit the subscript Γ if it is clear which context we are using.³ Note that we cannot apply the translation T to proofs. Formulas of LNTT get translated by TΓ into objects of λP2. Thus each abstraction of the form λx:ϕ.N would have to be translated into an expression λx:T(ϕ).T(N). But T(ϕ) is an object, so this expression would be ill-formed. The translation T is extended to contexts as follows:
– T(⟨⟩) = Γ0,
– T(Γ, x:A) = T(Γ), x:TΓ(A), if A is a kind, a type, or ∗p,
– T(Γ, x:A) = T(Γ), if A is a formula.
We now state some technical lemmas which are used in the proof of soundness of the translation T.
Definition 5. We say that contexts Γ and Γ′ are equivalent with respect to the set of variables X = {x1, ..., xn} if and only if Γ and Γ′ are legal contexts and for all x ∈ X we have Γ(x) =β Γ′(x).
Lemma 6. If Γ and Γ′ are equivalent with respect to X, and N ∈ TermΓ is such that FV(N) ⊆ X and Γ ⊢ N : A, then Γ′ ⊢ N : A′ where A =β A′. In particular, if N ∈ TermΓ then N ∈ TermΓ′.
Proof. Induction with respect to the structure of the derivation Γ ⊢ N : A.
Lemma 7. If Γ and Γ′ are equivalent with respect to FV(M) and M ∈ TermΓ then TΓ(M) = TΓ′(M). Proof. Induction with respect to the structure of M.
Lemma 8. If Γ ⊢ a : A and Γ, x:A ⊢ B : C for some C, and a, B are not proofs, then TΓ(B[x := a]) = TΓ,x:A(B)[x := TΓ(a)]. Proof. Induction with respect to the structure of B, using Lemma 7.
³ I.e., when the context is clear from the context ;)
Lemma 9. If B and B′ are not proofs and B →β B′ then TΓ(B) →β+ TΓ(B′).
Proof. The proof is by a routine induction with respect to B →β B′. If B is a redex then, by Lemma 4, it must be of one of the following forms: (λx:τ.ϕ)N, (λx:τ.M)N, (λx:τ.κ)N, where τ is a type, ϕ is a formula, and M, N are objects. In each of these cases we apply Lemma 8. If B is not a redex, we apply the induction hypothesis, using Lemma 7.
Lemma 10. If B =β B′ and B, B′ are kinds, types or objects then TΓ(B) =β TΓ(B′).
Proof. By the Church-Rosser property there exists a well-typed term C such that B ↠β C and B′ ↠β C. We have TΓ(B) ↠β TΓ(C) and TΓ(B′) ↠β TΓ(C), by Lemma 9, whence TΓ(B) =β TΓ(B′).
Lemma 11 (Soundness of the translation T). If Γ ⊢ A : B and A is not a proof then T(Γ) ⊢ TΓ(A) : TΓ(B) in λP2. Proof. Induction with respect to the structure of the derivation of Γ ⊢ A : B, using Lemmas 7, 8 and 10.
Corollary 12. If M is not a proof then M is strongly normalizing.
Proof. Assume that there is an infinite reduction M →β M1 →β M2 →β ··· By Lemma 9 then T(M) →β+ T(M1) →β+ T(M2) →β+ ··· But T(M) is a valid term of λP2, by Lemma 11, thus it is strongly normalizing. The contradiction shows that M is also strongly normalizing.
To show strong normalization for proofs we use another translation t from LNTT to the calculus of constructions λC. This translation depends on a given context Γ. Observe that the classification of a term A in LNTT does not a priori determine the classification of t(A) in λC. For instance, some types of LNTT are translated to types and some (those which have ∗p as a “target”) to kinds, cf. Lemma 18. Similarly, some object terms of LNTT are translated as type constructors. Note that we do not define the translation for □t and □p as it is not needed for soundness.
– tΓ(∗t) = ∗;
– tΓ(∗p) = ∗;
– tΓ(x) = x, if x is a variable;
– tΓ(Πx:τ.B) = tΓ,x:τ(B), for products constructed using the rule (∗t, □t, □t);
– tΓ(Πx:A.B) = Πx:tΓ(A). tΓ,x:A(B), for all other products;
– tΓ(λx:τ.κ) = tΓ,x:τ(κ), if κ is a constructor and τ is a type;
– tΓ(λx:A.B) = λx:tΓ(A). tΓ,x:A(B), for all other abstractions;
– tΓ(κN) = tΓ(κ), if κ is a constructor;
– tΓ(AB) = tΓ(A) tΓ(B), for all other applications.
We extend the translation t to contexts by taking t(⟨⟩) = ⟨⟩ and t(Γ, x:A) = t(Γ), x:tΓ(A). Lemma 13. If Γ and Γ′ are equivalent with respect to FV(M) and M is a term in Γ then tΓ(M) = tΓ′(M). Proof. Induction with respect to the structure of M.
Lemma 14. Assume that Γ, x:A ⊢ B : C and Γ ⊢ N : A, and N is an object or a proof. – If N is an object and B is a type or a constructor then tΓ(B[x := N]) = tΓ,x:A(B). – If B is neither a type nor a constructor then tΓ(B[x := N]) = tΓ,x:A(B)[x := tΓ(N)]. Proof. Induction with respect to the structure of B, using Lemma 13.
Definition 15. A reduction step A →β A′ is silent if
– A = (λx:τ.κ)N →β κ[x := N] = A′, where κ is a constructor and N is an object, or
– A = Πx:τ.B →β Πx:τ′.B = A′, where τ →β τ′ and B is a kind, or
– A = κN →β κN′ = A′, where N →β N′ and κ is a constructor, or
– A = λx:τ.κ →β λx:τ′.κ = A′, where τ →β τ′ and κ is a constructor, or
– A = C[B] →β C[B′] = A′, where C[ ] is any context and B →β B′ is a silent reduction.
Lemma 16. If A →β B then tΓ(A) ↠β tΓ(B). In addition, if the reduction A →β B is not silent then tΓ(A) →β+ tΓ(B).
Proof. Induction with respect to A →β B, using Lemma 14 when A is a redex, and Lemma 13 in the other cases.
Corollary 17. If B =β B′ then tΓ(B) =β tΓ(B′). The following lemma states soundness of the translation tΓ. In particular, item 2 implies that all the rules in the calculus of constructions are needed. Lemma 18. Assume a fixed environment Γ. 1. If M is a proof, an object, or a formula and Γ ⊢ M : B holds in LNTT then t(Γ) ⊢ tΓ(M) : tΓ(B) in λC. 2. If M is a type or a constructor then t(Γ) ⊢ tΓ(M) : ∗ or t(Γ) ⊢ tΓ(M) : □. 3. If M is a kind then t(Γ) ⊢ tΓ(M) : □. Proof. Simultaneous induction with respect to the structure of the appropriate derivation, using Lemma 13.
Theorem 19. System LNTT has the strong normalization property. Proof. We already know that all expressions except proofs are strongly normalizing. Arguing as in the proof of Corollary 12, and using Lemma 16, we conclude that almost all steps in an infinite reduction sequence must be silent. Thus it suffices to prove that if D is a proof then there is no infinite silent reduction of D. This goes by induction with respect to the size of D, by cases depending on its shape.
No Conclusion
The above is by no means a complete proposal of either theoretical or didactic character. It is essentially a collection of questions and partial suggestions about how such a proposal should eventually be designed. These questions are of a dual nature, and we would like to pursue both directions. The first one is to find means to talk about basic mathematics without referring to set theory in either a naive (i.e., inconsistent) or axiomatic way, using instead an appropriate type-based language. That should happen in a possibly non-invasive way, keeping as much linguistic compatibility with the “standard” style as possible. The second problem is to give a formal foundation to this informal type-based language. This formalization is to be used for two purposes: to guarantee logical consistency of the naive exposition and to facilitate computer-assisted verification and teaching. That requires building a complex system, of which our PTS-style Less Naive Type Theory is just a very basic core. This system must involve various extensions as in [4], perhaps including a hierarchy of sorts, etc. All this is future work.
Acknowledgement Thanks to Herman Geuvers and Christophe Raﬀalli for helpful discussions. Also thanks to the anonymous referees for their suggestions.
References
1. Adams, R., Luo, Z.: Weyl's predicative classical mathematics as a logic-enriched type theory. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502. Springer, Heidelberg (2007)
2. Andrews, P.B.: An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof, 2nd edn. Applied Logic Series, vol. 27. Kluwer Academic Publishers, Dordrecht (2002)
3. Barendregt, H.P.: Lambda calculi with types. In: Abramsky, S., Gabbay, D.M., Maibaum, T.S.E. (eds.) Handbook of Logic in Computer Science, vol. II, pp. 117–309. Oxford University Press, Oxford (1992)
4. Barthe, G.: Extensions of pure type systems. In: Dezani-Ciancaglini and Plotkin [10], pp. 16–31
5. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development. Coq'Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. An EATCS Series. Springer, Heidelberg (2004)
6. de Bruijn, N.G.: A survey of the project Automath. In: Seldin, J.P., Hindley, J.R. (eds.) To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, pp. 579–606. Academic Press, London (1980)
7. Chrząszcz, J., Sakowicz, J.: Papuq: a Coq assistant (manuscript, 2007)
8. Church, A.: A formulation of the simple theory of types. Journal of Symbolic Logic 5(2), 56–68 (1940)
9. Constable, R.L.: Naive computational type theory. In: Schwichtenberg, H., Steinbruggen, R. (eds.) Proof and System-Reliability, pp. 213–259. Kluwer Academic Press, Dordrecht (2002)
10. Dezani-Ciancaglini, M., Plotkin, G. (eds.): TLCA 1995. LNCS, vol. 902. Springer, Heidelberg (1995)
11. Farmer, W.M.: A partial functions version of Church's simple theory of types. Journal of Symbolic Logic 55(3), 1269–1291 (1990)
12. Farmer, W.M.: A simple type theory with partial functions and subtypes. Annals of Pure and Applied Logic 64, 211–240 (1993)
13. Farmer, W.M.: A basic extended simple type theory. Technical Report 14, McMaster University (2003)
14. Farmer, W.M.: The seven virtues of simple type theory. Technical Report 18, McMaster University (2003)
15. Farmer, W.M.: Formalizing undefinedness arising in calculus. In: Basin, D., Rusinowitch, M. (eds.) IJCAR 2004. LNCS (LNAI), vol. 3097, pp. 475–489. Springer, Heidelberg (2004)
16. Geuvers, H.: The Church-Rosser property for beta-eta-reduction in typed lambda calculi. In: Logic in Computer Science, pp. 453–460 (1992)
17. Geuvers, H.: Private communication (2006)
18. Halmos, P.R.: Naive Set Theory. Van Nostrand (1960); reprinted by Springer, Heidelberg (1998)
19. Hurkens, A.J.C.: A simplification of Girard's paradox. In: Dezani-Ciancaglini and Plotkin [10], pp. 266–278
20. Jensen, R.B.: On the consistency of a slight(?) modification of Quine's NF. Synthese 19, 250–263 (1969)
21. Kamareddine, F., Laan, T., Nederpelt, R.: Types in logic and mathematics before 1940. Bulletin of Symbolic Logic 8(2), 185–245 (2002)
22. Luo, Z.: A type-theoretic framework for formal reasoning with different logical foundations. In: Okada, M., Satoh, J. (eds.) ASIAN 2006. LNCS, vol. 4435, pp. 214–222. Springer, Heidelberg (2006)
23. Quine, W.V.: New foundations for mathematical logic. American Mathematical Monthly 44, 70–80 (1937)
24. Sørensen, M.H., Urzyczyn, P.: Lectures on the Curry-Howard Isomorphism. Elsevier, Amsterdam (2006)
25. Weyl, H.: The Continuum. Dover, Mineola, NY (1994)
Verification of the Redecoration Algorithm for Triangular Matrices

Ralph Matthes and Martin Strecker

C.N.R.S. et Université Paul Sabatier (Toulouse III)
Institut de Recherche en Informatique de Toulouse (IRIT)
118 route de Narbonne, F-31062 Toulouse Cedex 9
Abstract. Triangular matrices with a dedicated type for the diagonal elements can be profitably represented by a nested datatype, i.e., a heterogeneous family of inductive datatypes. These families are fully supported since version 8.1 of the Coq theorem proving environment, released in 2007. Redecoration of triangular matrices has a succinct implementation in this representation, thus giving the challenge of proving it correct. This has been achieved within Coq, also using induction with measures. An axiomatic approach allowed a verification in the Isabelle theorem prover, giving insights into the differences between the two systems.
1 Introduction
Nested datatypes [9] may keep certain invariants (see also the illuminating [11]) even without employing a dependently-typed system where types may also depend on objects, thus, e.g., maintaining size information in the types. Redecoration for triangular matrices by means of a nested datatype was first studied in the case of infinite triangles [2]. Its finitary version has been programmed iteratively in the subsequent journal version [3] and through primitive recursion in Mendler style [1]. In all these cases, no attempt was made to verify properties other than termination. We put forward the example of redecoration of triangular matrices as a prototypical situation where nested datatypes yield concise and elegant programs that are verifiable. The price to pay is a more complex framework needed in order to formulate the programs and a more complex logical apparatus for verifying them. Moreover, as in all formal verification tasks, a major challenge is to develop an appropriate correctness criterion. We have chosen to give a very precise intuition about the algorithm. Even though this might satisfy the experienced programmer, we felt the need for a subsequent verification against a completely different model: a model that is just based on the ordinary type of lists and thus does not impose the aforementioned complex machinery.
With ﬁnancial support by the European Union FP62002ISTC Coordination Action 510996 “Types for Proofs and Programs”
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 125–141, 2008. c SpringerVerlag Berlin Heidelberg 2008
R. Matthes and M. Strecker
Since we chose not to use dependent types, the list-based model speaks about a too large datatype, namely "triangles" that may be quite degenerate. Nevertheless, we may fully profit from tool support for lists, which is well-developed in interactive theorem provers. In this case study, we deployed the Coq and the Isabelle proof assistants. Coq has a very strong type system that is fully adequate for representing nested datatypes and reasoning about them. The latest release 8.1 of Coq explicitly supports nested datatypes: definition by pattern-matching, induction principles, etc. Even though Isabelle, which is based on simply-typed lambda calculus with type variables, does not accept nested datatypes as such, it makes it possible to simulate essential aspects of the development in an axiomatic manner. Two critical aspects can be ensured by the use of theorem provers such as the two systems under study: termination of the algorithm (a non-functional specification) and functional correctness with respect to a chosen correctness criterion, in our case the relation with the list-based model. The axiomatic approach that we had to use in Isabelle cannot ensure termination of the algorithms on nested datatypes and cannot justify their induction schemes from first principles. However, the development of the list-based model is entirely derivable from a small logical core. The main challenge of the verification is to find the right lemmas that allow one to get the inductive proof of the simulation theorem through. The explanations and the semi-formal development in standard mathematical style of the proof in the next three sections have only been possible with the aid of a proof engineering effort in the proof assistants. Simplification and rewriting are tasks that are error-prone for humans and where the tool support is particularly well-developed and helpful.
In the light of two complete formalizations in two entirely independent systems, it does not seem necessary to reproduce such proof steps in the main part of this article. The curious reader is invited to consult the full proof scripts that are available online [12]. The article is structured as follows: Section 2 introduces the problem and gives the intuitive justification of redecoration for triangular matrices, viewed as a nested datatype. In Section 3, the list-based model is developed, against which the original model is verified in Section 4. Highlights of the formalizations in the proof assistants are presented in Section 5 for Coq and Section 6 for Isabelle. We conclude in Section 7.
2 Triangular Matrices
The “triangular matrices” of the present article are ﬁnite square matrices, where the part below the diagonal has been cut oﬀ. Equivalently, one may see them as symmetric matrices where the redundant information below the diagonal has been omitted. The elements on the diagonal play a diﬀerent role than the other elements in many mathematical applications, e. g., one might require that the diagonal elements are invertible (nonzero). This is modeled as follows: A type E of elements outside the diagonal is ﬁxed throughout, and there is a type of diagonal elements that enters all deﬁnitions as a parameter. More technically, if A
Veriﬁcation of the Redecoration Algorithm for Triangular Matrices
is the type of diagonal elements, then Tri A shall denote the type of triangular matrices with A's on the diagonal and E's outside. Then, Tri becomes a family of types, indexed over all types, hence a type transformation. Moreover, the different Tri A are inductive datatypes that are all defined simultaneously, hence they form an "inductive family of types" or "nested datatype". We do not consider empty triangles here. So, the smallest element of Tri A contains a single element that thus is a diagonal element and hence taken from the type A. This is materialized by the datatype constructor sg : A → Tri A that constructs these "singletons". Here, A is meant to be a type variable, i.e., a variable of type Set, the universe of computational types. The reader may simply conceive of sg as having the quantified type ∀A. A → Tri A. The non-singleton case can be visualized like this:

    A | E E E E
      | A E E E
      |   A E E
      |     A E
      |       A

(the pattern continues for triangles of arbitrary size).
The vertical line cuts the triangle into one element of A and a "trapezium", with an uppermost row solely consisting of E's. One might now, for the purpose of having an inductive generation process, decompose that trapezium into the uppermost row and the triangle below, but it would be hard to keep the information that both have the same number of columns (unless making use of dependent types). The approach to be followed is the integration of the side diagonal (i.e., the elements just above the diagonal) into the diagonal. In this way, the trapeziums above are in one-to-one correspondence to triangles as follows:

    E E E E            E×A  E    E    E
    A E E E    −→           E×A  E    E
      A E E    ←−                E×A  E
        A E                           E×A
          A
    trapezium                triangle

(again extending to arbitrary size).
The trapezium to the left is then considered as the "trapezium view" of the triangle to the right. Vice versa, the triangle to the right is the "triangle view" of the trapezium to the left. Since we are about to define what triangles are, it is now convenient to refer to the triangle view of the trapezium in the second datatype constructor

    constr : A → Tri(E × A) → Tri A ,

again with A a type variable. Hence, the non-singleton triangles are conceived to consist of the topmost leftmost element, taken from A, and a triangle with
diagonal elements taken from E × A. Abbreviate Trap A := Tri(E × A). Therefore, constr : ∀A. A → Trap A → Tri A, but this is just a point of view to refer to the trapezium view of the argument that "is" a triangle. From sg and constr, the inductive family Tri is now fully defined as everything that is finitely generated from these two constructors. As is usual with nested datatypes, one cannot understand one specific family member Tri A for some type A in isolation, but the recursive structure will include Tri(E × A), Tri(E × (E × A)), . . . , hence infinitely many family members (with indices of increasing size). Naturally, the induction principle for Tri also cannot speak about one instance Tri A in isolation. The following induction principle is intuitively justified.¹ Given a predicate P that speaks about all triangles with all types of diagonal elements, i.e., P : ∀A. Tri A → Prop, where Prop is the universe of propositions, the aim is to assure that P holds universally, i.e., that ∀A∀t : Tri A. P_A t holds true. (Here and everywhere, we suppress the type information that A is a type variable, and we write the type argument to P as a subscript.) An inductive proof of this universal statement now only requires a proof of the two following statements:

– ∀A∀a : A. P_A (sg a).
– ∀A∀a : A∀r : Trap A. P_{E×A} r → P_A (constr a r).

The inductive hypothesis P_{E×A} r refers to the instantiation of the predicate P with type argument E × A. Apart from this, the principle is no more difficult in nature than the induction principle for (homogeneous) lists. The redecoration algorithm whose verification is the aim of this article can be described as the binding operation of a comonad². In other words, we will organize the material in the form of a comonad for the type transformer Tri, which might then be called the redecoration comonad. The function top : ∀A. Tri A → A that computes the top left element is programmed as follows:

    top(sg a) := a ,
top(constr a r) := a .
This is a simple non-recursive instance of definition by pattern-matching. The function top will be the counit of the comonad that we are defining. Redecoration in general is dual to substitution, see [15]. Following this view, redecoration for triangular matrices will be defined as the Kleisli coextension operation: given types A, B and a function f : Tri A → B – the "redecoration rule" – define the function redec f : Tri A → Tri B. In the formulation of [15], Tri A becomes the type of A-decorated structures. So, the redecoration rule f assigns B-decorations to A-decorated structures, and redec f "coextends" this to an assignment of B-decorated structures to A-decorated structures. The intuitive idea of redecoration in the case of triangles is to go recursively through the triangle and to replace each diagonal element by the result of

¹ A formal justification is provided by Coq, see Section 5.
² No category-theoretic knowledge is required to follow the article. The laws are given in our concrete situation, but they are mentioned to be instances of the general comonad notions.
applying f to the sub-triangle that extends to the right and below the diagonal element.

    redec : ∀A∀B. (Tri A → B) → Tri A → Tri B ,
    redec f (sg a) := sg(f (sg a)) ,
    redec f (t := constr a r) := constr(f t) (rest(redec f t)) .
Here, rest(redec f t) is a meta-notation for the element of type Tri(E × B) yet to be defined. However, already at this stage of the definition, the first comonad law becomes apparent: top(redec f t) = f t. It remains to define rest(redec f t) : Tri(E × B) for f : Tri A → B and t = constr a r with a : A and r : Tri(E × A). The type of r says it "is" a triangle, but, from the description of constr, r ought to be seen in trapezium view in order to follow the above intuition of redecoration. Going back to the earlier illustration of the trapezium correspondence, the uppermost row is to be cut off, then the topmost A is to be replaced by f applied to the remaining triangle, and then redecoration has to be carried on recursively. Finally, the uppermost row has to be recovered. First, define the operation cut that cuts off the top row from the trapezium view of its argument:

    cut : ∀A. Trap A → Tri A ,
    cut(sg (e, a)) := sg a ,
    cut(constr (e, a) r) := constr a (cut r) .

Here, (e, a) : E × A denotes pairing. Note that r is of type Trap(E × A) in the second clause. The definition principle is thus "polymorphic recursion", where the type parameter can change in the recursive calls. Since it is even a definition that exploits that the argument is not an arbitrary Tri A, it goes beyond the iteration schemes proposed in [3].³ Note that, in the recursive equation for cut, no change of view is necessary since the arguments are always seen as trapeziums. We will define rest(redec f t) just from f and r : Trap A, where the latter might be called rest t. In "reality", r is no trapezium, so a recursive call to redec for r will need a dedicated redecoration rule for the trapezium view, to be obtained from the "original" redecoration rule f in (ordinary) triangle view:

    f̂ : Trap A → E × B ,
    f̂ r := (fst(top r), f (cut r)) ,

with fst the left/first projection out of a pair. Note that the target type E × B of f̂ is the type parameter of Tri(E × B) = Trap B. The left component of f̂ r
³ Basically, that article only allows one to define iterative functions of type ∀A. Tri A → X A for some type transformation X. Through the use of syntactic Kan extensions for X, this can be relaxed somewhat, and an iterative function (called fcut on page 49 of that article) with the more general type ∀A∀B. (B → E × A) → Tri B → Tri A had to be defined before instantiating it to cut by using the identity on E × A as the functional parameter (hence with B := E × A).
instructs us to keep the leftmost element of the uppermost row in the trapezium view. Note that the original definition of the right component of f̂ r in [2] did not use a cut function but just lifted the second projection via a mapping function for Tri. Correct types do not suffice, and verification would have been welcome to exclude such an error. The operation that associates f̂ with f is named lift, so lift f := f̂ and

    lift : ∀A∀B. (Tri A → B) → Trap A → E × B .

The definition of redecoration is finished by setting rest(redec f t) := redec (lift f) r, whence (without using the abbreviations) the recursive equation for redec becomes

    redec f (constr a r) = constr (f (constr a r)) (redec (lift f) r) ,

which is the equational form of the reduction behaviour established through Mendler recursion in earlier work [1]. A very typical phenomenon of nested datatypes is recursive functions that take an additional functional parameter – here the f – that is modified during the recursion. The major question is now: did we come up with the right definition? By fairly easy inductive reasoning, using some auxiliary lemmas about cut and lift, the other two comonad laws can be established for the triple consisting of Tri, top and redec:⁴

– redec top t = t,
– redec (g ◦ (redec f)) t = redec g (redec f t)

for f, g, t of appropriate types, namely f : Tri A → B, g : Tri B → C, t : Tri A. In general, ◦ denotes functional composition λgλf λx. g(f x), but is written in infix notation. However, these laws and the textual description do not yet confirm a computational intuition that might have been formed through the experience with simpler datatypes such as lists. Therefore, we will set out to relate the behaviour of redec to a function redecL that does not involve nested datatypes but is based on just the ubiquitous datatype of lists.
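To make the recursion scheme tangible, here is a minimal executable sketch in Python (our own modelling choice – the paper's formal development is in Coq and Isabelle): sg a is modelled as the tuple ("sg", a) and constr a r as ("constr", a, r); the names mirror the text, but the encoding itself is hypothetical.

```python
# Hypothetical Python model (not the paper's Coq code): sg a ~ ("sg", a),
# constr a r ~ ("constr", a, r), where the sub-triangle r carries pairs
# (e, a) on its diagonal.

def top(t):
    # counit: the top-left element of the triangle
    return t[1]

def cut(r):
    # cut off the top row of the trapezium view of r : Trap A = Tri (E x A)
    if r[0] == "sg":
        (e, a) = r[1]
        return ("sg", a)
    (e, a) = r[1]
    return ("constr", a, cut(r[2]))

def lift(f):
    # adapt a redecoration rule f : Tri A -> B to the trapezium view
    return lambda r: (top(r)[0], f(cut(r)))

def redec(f, t):
    # replace every diagonal element by f of the sub-triangle below/right of it
    if t[0] == "sg":
        return ("sg", f(t))
    return ("constr", f(t), redec(lift(f), t[2]))

# a 3x3 triangle with diagonal 1, 2, 3 and E = strings
t = ("constr", 1, ("constr", ("x", 2), ("sg", ("y", ("z", 3)))))

assert top(redec(lambda s: top(s) * 10, t)) == 10  # first comonad law instance
assert redec(top, t) == t                          # second comonad law instance
```

The two assertions check instances of the first two comonad laws on the sample triangle; they are illustrations, not proofs.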
3 A List-Based Model
We assume the type transformation List, where for any type A, the type of all finite lists with elements taken from A is List A. Although it is also a family of inductive types, List is not a nested datatype since there is no relation between any List A and List B for A ≠ B in the definition of List. Clearly, such relations

⁴ We also need that redec is extensional in its function argument; see the discussion in the implementation-related sections.
occur with the usual mapping function map : ∀A∀B. (A → B) → List A → List B that maps its function argument over all the elements of the second argument, but this comes only after the definition of List. The list-based representation of triangles is now a simply parameterized family of inductive types, defined explicitly by reference to List:

    TriL A := List(List E × A) .

Any element of some TriL A is a finite list of "columns", and each column consists of the finite list of E elements above the diagonal and the A element on the diagonal. Note that the argument List E × A of List has to be parenthesized to the left, i.e., as (List E) × A. We visualize an element of TriL A as a generalized triangle, with the A's still in the diagonal, but always the list of E's above each diagonal element, with the first element the farthest away from the diagonal. An example with 4 columns would be: (figure: a generalized triangle with four columns, each showing its list of E's above the diagonal A, where the E-lists need not have the lengths of a proper triangle)
Triangularity is not expressed since, again, we do not want to make use of dependent types, by which this could be controlled through the lengths of the E lists. In order to relate statements about Tri and TriL, we define the "list representation" of triangles of type Tri A as elements of type TriL A. Assume we want the representation of some constr a r with r : Trap A; then a recursive call to the representation function would yield an element of TriL(E × A). So, we would have to push out those E's within the diagonal elements to the E lists. The column-wise operation is thus:

    shiftToE : ∀A. List E × (E × A) → List E × A ,
    shiftToE(es, (e, a)) := (es + [e], a) ,

where + is used to denote list concatenation and [e] is the list that consists of just the element e, while the empty list will be denoted by [ ], and the "cons" operation will be denoted by infix "::". The mapping with shiftToE changes from "triangle view" to "trapezium view" in the list-based representation:

    shiftToEs : ∀A. TriL(E × A) → TriL A ,
    shiftToEs := map shiftToE .

The list representation of triangles is given iteratively as follows:

    toListRep : ∀A. Tri A → TriL A ,
    toListRep(sg a) := [([ ], a)] ,
    toListRep(constr a r) := ([ ], a) :: shiftToEs (toListRep r) .
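Under the same hypothetical Python encoding of triangles as nested tuples, toListRep can be sketched as follows; the representation TriL A becomes a plain Python list of (E-list, diagonal-element) pairs.

```python
# Hypothetical Python model: flattening a triangle (nested tuples, as before)
# into its list representation TriL A = List (List E x A), column by column.

def shiftToE(col):
    # push the E stored inside the diagonal pair out into the E-list
    es, (e, a) = col
    return (es + [e], a)

def shiftToEs(t):
    # change from triangle view to trapezium view, column-wise
    return [shiftToE(c) for c in t]

def toListRep(t):
    if t[0] == "sg":
        return [([], t[1])]
    return [([], t[1])] + shiftToEs(toListRep(t[2]))

t = ("constr", 1, ("constr", ("x", 2), ("sg", ("y", ("z", 3)))))
print(toListRep(t))  # [([], 1), (['x'], 2), (['y', 'z'], 3)]
```

On the sample triangle, the diagonal 1, 2, 3 comes out with E-lists of lengths 0, 1, 2, as the triangular shape demands.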
The intention is to define a notion of redecoration also for the list-based representation, i.e., an operation

    redecL : ∀A∀B. (TriL A → B) → TriL A → TriL B .

However, there will be no proper comonad structure since no counit topL : ∀A. TriL A → A can exist: A could be instantiated by an empty type A0, and TriL A0 would still not be empty since it contains [ ]. As a preparation for the definition of redecL, more operations on columns are introduced that allow one to cut off and restore the topmost E element:

    removeTopE : ∀A. List E × A → List E × A ,
    removeTopE([ ], a) := ([ ], a) ,
    removeTopE(e :: es, a) := (es, a) ,

    singletonTopE : ∀A. List E × A → List E ,
    singletonTopE([ ], a) := [ ] ,
    singletonTopE(e :: es, a) := [e] ,

    appendEs : ∀A. List E → List E × A → List E × A ,
    appendEs es (es′, a) := (es + es′, a) .

For all pairs p : List E × A, one has appendEs (singletonTopE p) (removeTopE p) = p. The technical problem here is just that the E list can be empty, and so there is the need for the list with at most one element. These operations can be canonically extended to multiple columns. For one-place functions, this is done via map; for the two-place function appendEs, the generic zipWith function known from the Haskell programming language (see www.haskell.org) comes into play:

    zipWith : ∀A∀B∀C. (A → B → C) → List A → List B → List C ,
    zipWith f (a :: ℓ₁) (b :: ℓ₂) := f a b :: zipWith f ℓ₁ ℓ₂ ,
    zipWith f ℓ₁ [ ] := [ ] ,
    zipWith f [ ] ℓ₂ := [ ] .

The last auxiliary definitions for redecL are:

    removeTopEs : ∀A. TriL A → TriL A ,
    removeTopEs := map removeTopE ,
    singletonTopEs : ∀A. TriL A → List(List E) ,
    singletonTopEs := map singletonTopE ,
    zipAppendEs : ∀A. List(List E) → TriL A → TriL A ,
    zipAppendEs := zipWith appendEs .

The following definition is by well-founded recursion over the TriL A argument of redecL.
Note that removeTopEs does not change the list length of its argument and that, therefore, it is just the list length of the TriL argument of redecL that is smaller in the recursive call.

    redecL : ∀A∀B. (TriL A → B) → TriL A → TriL B ,
    redecL f [ ] := [ ] ,
    redecL f ((es, a) :: r) := (es, f ((es, a) :: r)) :: zipAppendEs (singletonTopEs r) (redecL f (removeTopEs r)) .
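A sketch of redecL and its helpers on this list representation, again in Python as a model only (note that Python's zip already truncates at the shorter list, matching the zipWith clauses above):

```python
# Hypothetical Python sketch of redecL on TriL A = List (List E x A);
# note that f stays a fixed parameter throughout the recursion.

def removeTopE(col):
    es, a = col
    return (es[1:], a)      # drop the topmost E, if any

def singletonTopE(col):
    es, a = col
    return es[:1]           # the list with at most the topmost E

def appendEs(es, col):
    es2, a = col
    return (es + es2, a)

def zipWith(f, l1, l2):
    # truncates at the shorter list, like the zipWith clauses above
    return [f(a, b) for a, b in zip(l1, l2)]

def redecL(f, t):
    if not t:
        return []
    r = t[1:]
    es = t[0][0]
    return [(es, f(t))] + zipWith(appendEs,
                                  [singletonTopE(c) for c in r],
                                  redecL(f, [removeTopE(c) for c in r]))

# redecorate each diagonal element with the number of remaining columns
tl = [([], 1), (["x"], 2), (["y", "z"], 3)]
print(redecL(len, tl))  # [([], 3), (['x'], 2), (['y', 'z'], 1)]
```

With the redecoration rule len, every diagonal element is replaced by the size of its remaining sub-triangle, while the E-lists are left intact.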
In comparison with redec, this definition contains a new trivial case for the empty list, and the redecoration rule f does not need to be adapted to a trapezium view in the recursive call. Thus, f is just a fixed parameter throughout the recursion; hence, the type parameters also stay fixed. Due to the less rigid constraints on the form of elements of TriL, there may be E's above the leftmost A. This parameter es is taken into account when evaluating the redecoration rule, but still, only the diagonal elements are modified by the algorithm.
4 Verification Against the List-Based Model
Theorem 1 (Simulation). If E is nonempty, then for all types A, B and terms t : Tri A and f : TriL A → B:

    redecL f (toListRep t) = toListRep(redec (f ◦ toListRep) t) .

This is the most natural theorem that relates redec and redecL through toListRep: if there were an operation topL turning TriL and redecL into a comonad, this theorem would establish for toListRep one of the two properties of a comonad morphism from Tri to TriL. Unfortunately, it does not reduce redec to redecL or vice versa. The former direction already seems to be hampered by the need for a redecoration rule f : TriL A → B, hence with a much wider domain than prescribed for redec. However, this is not so, due to the existence of a left inverse fromListRep of toListRep. As we will see in the main theorem at the end of this section, redec f t can be expressed in terms of redecL, toListRep and fromListRep. For the proof of the simulation theorem, one has to replay cut and lift on the list representations. Define

    remsh : ∀A. List E × (E × A) → List E × A ,
    remsh := removeTopE ◦ shiftToE .

Abbreviate TrapL A := TriL(E × A). Define

    cutL : ∀A. TrapL A → TriL A ,
    cutL := map remsh ,

where the operational intuition is just to put the argument in trapezium view and then to cut off the top row. This intuition is met thanks to the functor law for map stating preservation of composition, i.e., map (g ◦ f) t = map g (map f t).

Lemma 1. toListRep(cut r) = cutL(toListRep r) for all A and terms r : Trap A.

Proof. This is by induction on Trap, hence a section of Tri. The induction principle is as follows: given P : ∀A. Trap A → Prop, one concludes its universality, i.e., ∀A∀t : Trap A. P_A t, from the following two clauses:

– ∀A∀a : E × A. P_A (sg a).
– ∀A∀a : E × A∀r : Trap(E × A). P_{E×A} r → P_A (constr a r).
It should be as intuitive as the Tri induction principle. For formal justifications, see the later sections. The inductive step of the lemma will need (for r : TrapL(E × A))

    shiftToEs (cutL r) = cutL (shiftToEs r) ,

which in turn follows from (for r : List E × (E × (E × A)))

    shiftToE(remsh r) = remsh(shiftToE r)

and the above-mentioned functor law for map.
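Lemma 1 can be checked on a concrete instance in the same hypothetical Python model of both sides; the sample trapezium r below is our own illustration:

```python
# A numeric check of Lemma 1, toListRep(cut r) = cutL(toListRep r),
# in a hypothetical Python model of both sides.

def cut(r):
    if r[0] == "sg":
        return ("sg", r[1][1])
    return ("constr", r[1][1], cut(r[2]))

def shiftToE(col):
    es, (e, a) = col
    return (es + [e], a)

def toListRep(t):
    if t[0] == "sg":
        return [([], t[1])]
    return [([], t[1])] + [shiftToE(c) for c in toListRep(t[2])]

def remsh(col):
    # removeTopE after shiftToE, on a single column
    es, a = shiftToE(col)
    return (es[1:], a)

def cutL(t):
    return [remsh(c) for c in t]

r = ("constr", ("x", 2), ("sg", ("y", ("z", 3))))  # a sample r : Trap A
assert toListRep(cut(r)) == cutL(toListRep(r)) == [([], 2), (["z"], 3)]
```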
The analogue liftL of lift can only be defined for nonempty E. We will assume some fixed e0 of type E in the sequel. Define

    liftL : ∀A∀B. (TriL A → B) → TrapL A → E × B ,
    liftL f [ ] := (e0, f (cutL [ ])) ,
    liftL f (r := (es, (e, a)) :: r′) := (e, f (cutL r)) .
The following relation between lift and liftL is a consequence of the preceding lemma.

Lemma 2. lift (f ◦ toListRep) r = liftL f (toListRep r) for types A, B and terms f : TriL A → B and r : Trap A.

The major obstacle on the way to proving the theorem is the following lemma.

Lemma 3 (Main Lemma). For any types A, B and terms f : TriL A → B and r : TrapL A, one has

    shiftToEs (redecL (liftL f) r) = zipAppendEs ℓ₁ ℓ₂

with

    ℓ₁ := singletonTopEs(shiftToEs r) : List(List E) ,
    ℓ₂ := redecL f (removeTopEs(shiftToEs r)) : TriL B .
Note that r : TrapL A, but that r is nevertheless just a (generalized) triangle. Hence liftL f is the right redecoration rule to treat it in trapezium view. Redecoration will nevertheless produce a (generalized) triangle, so the result is finally transformed into trapezium view. On the right-hand side, r is first made into a (generalized) trapezium; then redecoration is done on the result after cutting off the top row, but then the cut-off elements are restored, hence the outcome is also a (generalized) trapezium. Note also that, as argued before, by virtue of the functor law for map, the argument to redecL f in ℓ₂ is equal to cutL r.

Proof. The function redecL can be understood as being defined by recursion over the list length of its argument, and the proof can likewise be done by induction on the list length of r. See the following specific sections on Coq and Isabelle for details on how this is done more elegantly.
Theorem 1 follows from the main lemma by induction on Tri for t.
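The simulation theorem can be replayed on a concrete instance in a small Python model of both sides (a sketch under the tuple encoding used before, not the formal Coq/Isabelle proof):

```python
# A concrete instance of the simulation theorem in a hypothetical Python model:
# redecL f (toListRep t) = toListRep (redec (f . toListRep) t).

def top(t):
    return t[1]

def cut(r):
    if r[0] == "sg":
        return ("sg", r[1][1])
    return ("constr", r[1][1], cut(r[2]))

def lift(f):
    return lambda r: (top(r)[0], f(cut(r)))

def redec(f, t):
    if t[0] == "sg":
        return ("sg", f(t))
    return ("constr", f(t), redec(lift(f), t[2]))

def shiftToE(col):
    es, (e, a) = col
    return (es + [e], a)

def toListRep(t):
    if t[0] == "sg":
        return [([], t[1])]
    return [([], t[1])] + [shiftToE(c) for c in toListRep(t[2])]

def removeTopE(col):
    return (col[0][1:], col[1])

def appendEs(es, col):
    return (es + col[0], col[1])

def redecL(f, t):
    if not t:
        return []
    r = t[1:]
    rec = redecL(f, [removeTopE(c) for c in r])
    return [(t[0][0], f(t))] + [appendEs(c[0][:1], d) for c, d in zip(r, rec)]

t = ("constr", 1, ("constr", ("x", 2), ("sg", ("y", ("z", 3)))))
f = len  # a redecoration rule TriL A -> B: the number of columns

lhs = redecL(f, toListRep(t))
rhs = toListRep(redec(lambda s: f(toListRep(s)), t))
assert lhs == rhs == [([], 3), (["x"], 2), (["y", "z"], 1)]
```

One instance is of course no proof, but it illustrates exactly the equation of Theorem 1 with the rule f = len and f ◦ toListRep on the nested-datatype side.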
We want to define a left inverse to toListRep. The type ∀A. TriL A → Tri A cannot be inhabited, since TriL A is never empty while Tri A inherits emptiness from A. Hence, we will only be able to define a function fromListRep : ∀A. A → TriL A → Tri A such that fromListRep a0 (toListRep t) = t for all a0 : A, t : Tri A. Recall that we have fixed an element e0 of type E. The operation on columns is defined as

    shiftFromE : ∀A. List E × A → List E × (E × A) ,
    shiftFromE([ ], a) := ([ ], (e0, a)) ,
    shiftFromE(e :: es, a) := (removelast (e :: es), (last (e :: es), a)) ,

with functions removelast and last of the meaning suggested by their names. It is easy to establish that, for all pairs p : List E × (E × A), one has

    shiftFromE(shiftToE p) = p .

This is extended to an operation on (generalized) triangles:

    shiftFromEs : ∀A. TriL A → TrapL A ,
    shiftFromEs := map shiftFromE ,

and shiftFromEs(shiftToEs r) = r for all r : TrapL A follows from the respective result on columns, the two functor laws for map (hence, also map (λx.x) l = l) and extensionality of map in its function argument.⁵ The definition of fromListRep is by well-founded recursion over the TriL A argument, and as for redecL, this is justified by the decrease of the list length of this argument in the recursive calls. However, unlike the situation of redecL, we need polymorphic recursion in that the function at type A calls itself at type E × A:

    fromListRep a0 [ ] := sg a0 ,
    fromListRep a0 [(es, a)] := sg a ,
    fromListRep a0 ((es, a) :: p :: r) := constr a (fromListRep (e0, a0) (shiftFromEs(p :: r))) .

Lemma 4 (left inverse). For any type A and terms a0 : A and t : Tri A, one has fromListRep a0 (toListRep t) = t.

Proof. By a use of the induction principle for Tri, exploiting the fact that any toListRep t is a nonempty list, hence of the form p :: r.

Theorem 2 (Main Theorem).
If E is nonempty, then for all types A, B and terms a0 : A, b0 : B, f : Tri A → B and t : Tri A:

    redec f t = fromListRep b0 (redecL (f ◦ (fromListRep a0)) (toListRep t)) .

Proof. An immediate consequence of the simulation theorem and the preceding lemma, by using extensionality of redec in its functional argument once more.

⁵ See the discussion on extensionality in Section 5.
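A sketch of shiftFromE and fromListRep in the same hypothetical Python model, together with a check of the left-inverse property of Lemma 4 (the constant E0 stands for the fixed element e0 of E and is our own assumption):

```python
# Hypothetical Python sketch of shiftFromE/fromListRep and a check of the
# left-inverse property of Lemma 4; E0 models the fixed element e0 of E.

E0 = "e0"

def shiftToE(col):
    es, (e, a) = col
    return (es + [e], a)

def toListRep(t):
    if t[0] == "sg":
        return [([], t[1])]
    return [([], t[1])] + [shiftToE(c) for c in toListRep(t[2])]

def shiftFromE(col):
    # move the last E of the column back into the diagonal pair
    es, a = col
    if not es:
        return ([], (E0, a))
    return (es[:-1], (es[-1], a))

def fromListRep(a0, t):
    # polymorphic recursion in the model: the default element grows with depth
    if not t:
        return ("sg", a0)
    if len(t) == 1:
        return ("sg", t[0][1])
    (es, a) = t[0]
    return ("constr", a, fromListRep((E0, a0), [shiftFromE(c) for c in t[1:]]))

t = ("constr", 1, ("constr", ("x", 2), ("sg", ("y", ("z", 3)))))
assert fromListRep(0, toListRep(t)) == t            # left-inverse property
assert shiftFromE(shiftToE(([], ("x", 2)))) == ([], ("x", 2))
```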
5 Details on Formal Verification with Coq
In this section, a Coq development of the mathematical contents of the last three sections is discussed. The Coq vernacular file can be found at the web site [12]. We mentioned above that the Coq system [10]⁶ has had genuine pattern-matching support for nested datatypes like Tri since version 8.1, contributed by Christine Paulin. In version 8.0, there were subtle problems because such datatypes could only be specified through datatype constructors with universally quantified types that had to live in the universe Set as well; hence Set had to be made impredicative by an option to the Coq runtime system. The following remarks concern Coq 8.1 at patch level 3, released in December 2007. The nested datatype (a.k.a. inductive family) Tri is introduced as follows:

    Inductive Tri (A:Set) : Set :=
        sg : A -> Tri A
      | constr : A -> Tri (E * A) -> Tri A.

Then, the appropriate induction principle is automatically generated, and one can check its type:

    Check Tri_ind : forall P : forall A : Set, Tri A -> Prop,
      (forall (A : Set) (a : A), P A (sg a)) ->
      (forall (A : Set) (a : A) (r : Tri (E * A)),
          P (E * A) r -> P A (constr a r)) ->
      forall (A : Set) (t : Tri A), P A t.

This is exactly the induction principle of Section 2. However, the induction principle for Trap in Section 4 seems to need a (straightforward) proof via the fix construction for structurally recursive functions/proofs. The definition of redecL by recursion over a measure, and reasoning about redecL by "measure induction", use an experimental feature of Coq 8.1 (one has to load the package Recdef separately), provided by Pierre Courtieu, Julien Forest and Yves Bertot [4,5,6].

    Function redecL (A B:Set) (f:TriL A -> B) (t: TriL A)
        {measure length t} : TriL B :=
      match t with
        nil => nil
      | (es,a)::rest => (es, f((es,a)::rest)) ::
          zipAppendEs (singletonTopEs rest)
                      (@redecL A B f (removeTopEs rest))
      end.

The fact that the length is a measure that decreases in the recursive call has to be proven in order to get Coq to accept this as a definition.
Thanks to the explicit form @redecL A B, which reveals the Church-style syntax that underlies
⁶ We will only presuppose concepts and features of Coq that are explained in the Coq textbook [8].
Coq (although it is hidden from the user by the mechanism of implicit arguments), measure induction even works with this polymorphic function. Coq automatically generates an induction principle redecL_ind, called functional induction, that allows one to argue about values of redecL directly along the recursive call scheme of its definition. The induction hypothesis is prepared with the argument removeTopEs rest, and there is no need to redo the justification by means of the decreasing length again. The proof of the main lemma is then an instance of redecL_ind, and this is interactively initiated by functional induction (redecL (liftL e0 f) r). In a simpler form, functional induction is used for the analysis of zipWith, which, despite being structurally recursive in both list arguments, also profits from being defined by the "Function" command that again prepares the induction hypotheses; this has already been available in Coq for years. However, even the current extensions to functional induction in Coq 8.1 patch level 3 do not cover the definition of fromListRep, because it combines recursion over a decreasing measure with polymorphic recursion. In the development version of Coq, a proposal by Julien Forest works well in which the type A and the elements a0 : A and t : TriL A are encapsulated in a record; see [12]. Our solution consists in defining an auxiliary function with an additional parameter n of type nat by ordinary recursion on n and then fixing n to the length of t. This works very well because the list length is just one less in the recursive call and because the proof of Lemma 4 only needs the defining equations of fromListRep immediately preceding that lemma. In the middle of the proof of the second comonad law, redec top t = t, we have to prove redec top t = redec (lift top) t. It was easy to prove ∀r. lift top r = top r before that.
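The workaround just described – an auxiliary function with an extra nat parameter, consumed as "fuel" by ordinary recursion and finally fixed to the list length – can be sketched in Python as a model (the real definition is in Coq; the names and the constant E0 are our own):

```python
# A sketch (Python model only; the real definition is in Coq) of the
# workaround described above: an auxiliary function with an extra "fuel"
# parameter n, by ordinary recursion on n, finally fixed to the list length.

E0 = "e0"  # hypothetical fixed element of E

def shiftFromE(col):
    es, a = col
    return ([], (E0, a)) if not es else (es[:-1], (es[-1], a))

def fromListRep_aux(n, a0, t):
    # structural recursion on the fuel n instead of well-founded recursion on t
    if n == 0 or not t:
        return ("sg", a0)
    if len(t) == 1:
        return ("sg", t[0][1])
    (es, a) = t[0]
    return ("constr", a,
            fromListRep_aux(n - 1, (E0, a0), [shiftFromE(c) for c in t[1:]]))

def fromListRep(a0, t):
    # the fuel len(t) suffices: the list shrinks by one in each recursive call
    return fromListRep_aux(len(t), a0, t)

tl = [([], 1), (["x"], 2), (["y", "z"], 3)]
print(fromListRep(0, tl))
# -> ('constr', 1, ('constr', ('x', 2), ('sg', ('y', ('z', 3)))))
```

The point of the design is that the fuel recursion is structurally decreasing, so no well-founded-recursion machinery is needed, while the defining equations of fromListRep still hold for n = length of the argument.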
It would be easy to conclude if this implied lift top = top, but this typically cannot be done in intensional type theory, to which the underlying system of Coq, namely the Calculus of Inductive Constructions, belongs. But we do not even need that equality since, in general, redec f t only depends on the values of f (the "extension" of f) and not on its definition (or "intension"). More precisely, one can show by Tri induction on t that redec is "extensional":

    ∀f ∀f′. (∀t′. f t′ = f′ t′) → redec f t = redec f′ t .

This property is also needed for the proofs of the third comonad law, the simulation theorem and the main theorem, and the analogous property for map enters the proof of the main lemma, its auxiliary lemmas and the proof that shiftFromEs is a left inverse of shiftToEs.
6 Details on Formal Verification with Isabelle
We are going to sketch an alternative to the Coq implementation, described in the previous section. This is done within the system Isabelle (more precisely,
Isabelle 2007 of November 2007), and the script with the theory development is also available from the web site [12]. The type system of Isabelle is less expressive than that of Coq: it is a simply typed lambda calculus with ML-style polymorphism [13]. Type parameters of polymorphic functions need not be supplied explicitly but can be inferred by the system, and universal quantification over types at the top level is provided through schematic type variables. The datatype definition mechanism currently implemented in Isabelle is described in more detail in [7]. To be consistent with the Coq formalization, we would like to fix a type constant E by declaring "typedecl E" and define the polymorphic tri datatype as follows:
=
sg (’a)

constr (’a) ((E * ’a) tri)
As spelled out in Section 2, in the resulting induction principle

    ∀P. (∀a. P (sg a)) −→ (∀a r. P r −→ P (constr a r)) −→ ∀t. P t

the universally quantified induction predicate P would then be applied both to an (E * ’a) tri and to an ’a tri, thus overstraining Isabelle's type system. Therefore, such a datatype definition is not valid in Isabelle. We circumvent this and related problems by not conceiving sg and constr as constructors of an inductive type, but just as constants declared by

    consts
      sg :: ’a ⇒ (’a,’e) tri
      constr :: ’a ⇒ (’a,’e) trap ⇒ (’a,’e) tri

As above, (’a,’e) trap abbreviates (’e * ’a,’e) tri. (And the fixed parameter E is replaced by a second type parameter ’e. For the whole theoretical development, this difference does not play any role, but it facilitates the concrete programming examples that are also provided in the Isabelle script.) For carrying out proofs, we have to provide appropriate instances of the induction predicate. In order to obtain the desired computational behaviour, we manually have to add reduction rules, as will be shown in the following. As an example, take the cut function of Section 2. We declare the function cut by:

    consts
      cut :: (’a,’e) trap ⇒ (’a,’e) tri

The primitive-recursive function definition is accomplished by providing the following characteristic equations:

    axioms
  cut_sg     [simp]: cut (sg (e,a)) = sg a
  cut_constr [simp]: cut (constr (e,a) r) = constr a (cut r)
Note that in the second equation, cut is applied to expressions of different types: on the left, to a term of type 'a tri; on the right, to a term of type ('a,'e) tri. Here,
Veriﬁcation of the Redecoration Algorithm for Triangular Matrices
139
we exploit an essential difference between a universally quantified variable (as in the induction predicate above), which can only be applied to elements of the same type, and a globally declared constant such as cut, which can be applied to instances of different types. This distinction is reminiscent of the difference, in an ML-style type system, between the term λid : a ⇒ a. λf : nat ⇒ bool ⇒ nat. f (id 0) (id True) (which is not well-typed) and let id = (λx : a. x) in λf : nat ⇒ bool ⇒ nat. f (id 0) (id True) (which is). Of course, this axiomatization does not provide the guarantees of a genuine primitive recursive definition, such as termination. As mentioned above, for typing reasons, we cannot state a general induction principle. We can, however, exploit the same mechanism as for function definitions and provide instances of the induction principle for proving individual theorems. We illustrate the procedure for the proof of the following lemma (where # is the "cons" operation and snd is the second projection out of a pair):

  lemma toListRep_cons_inv: toListRep t = a # list −→ top t = snd a

We notice that the proof can be carried out using the following instance of the induction predicate:

  (∀a. P1 (sg a)) −→ (∀a r. P1 r −→ P1 (constr a r)) −→ ∀t. P1 t

where P1 is defined as λt. (∀ a list. toListRep t = a # list −→
top t = snd a)
The proof of the lemma is now very easy: unfold the definition of P1 and carry out elementary term simplification. Altogether, the proof of Theorem 1 requires four instances of the induction schema. This approach is not difficult, but it suffers from the well-known drawbacks of code duplication: it is error-prone, and the resulting theories are hard to maintain. This is even more true since, for the proof of Lemma 3, in order to get the induction through, we have to quantify over the function f as well, which implicitly requires quantifying over its additional type variable B. Since the latter quantification cannot be expressed, we cannot just use the above induction axiom with the respective new predicate in place of P1, but have to copy its definition to the four occurrences in the induction formula, giving rise to the axiom Tri_ind_MAIN_appl2 in the Isabelle script. Even though it is possible in principle to generate the required induction schemas, the discussion shows that
R. Matthes and M. Strecker
the result tends to be artificial and, by excessive code duplication, contrary to good practice. On the positive side, the difference between the induction principles for Tri and Trap becomes invisible in this approach, while in Coq the former is provided and the latter has to be defined by structural recursion. In the list-based model, we enjoy full support from Isabelle for datatypes, here for lists. The proof of the main lemma can just follow the structure of the recursive calls in the definition, expressed in the generated theorem redecL.induct, which is a version of the respective "functional induction" scheme in Coq, without dependent types and hence without the need to reference redecL in it. This functionality has been developed by Konrad Slind [14]. Also note that proving extensionality of redec in its function argument becomes a triviality in Isabelle, thanks to its rule expand_fun_eq, which asserts (?f = ?g) = (∀x. ?f x = ?g x), i.e., functions are equal if and only if they are pointwise equal. Finally, we remark that Isabelle's type system only allows inhabited types, hence the type parameters only range over non-empty types. Thus, all the elements denoted by a0, b0 and e0 in our informal description (and present in the Coq development) could have been obtained by the ε operator and therefore do not show up in the Isabelle scripts [12].
7 Conclusions
This article has presented a mathematical formalization of redecoration for triangular matrices by means of a nested datatype. Redecoration provides a comonad structure for this datatype. Moreover, we have established a precise relationship with a model that is based only on lists. For its verification, we have contrasted two formalizations in the proof assistants Coq and Isabelle and discussed their different approaches, in particular to recursion and induction that do not just follow the datatype definition. An important difficulty has been the necessity of polymorphic recursion, but this is intrinsic to nested datatypes. We would hope for some Isabelle extension with full support for nested datatypes, i.e., where induction axioms and equational specifications of recursive functions are generated and justified in the kernel of Isabelle, just as in the existing datatype package. Interesting future work would treat the original infinite triangular matrices of [2,3] or even specify and verify a datatype-generic definition of redec. Acknowledgements. With the help of Stefan Berghofer, we overcame the more subtle problems with variables of different kinds in Isabelle. Mamoun Filali provided valuable suggestions for the elimination of functional induction in the Coq development, as an alternative to Julien Forest's construction that is no longer supported by the current Coq version. The referees' suggestions helped to substantially strengthen the main theorem.
References

1. Abel, A., Matthes, R.: Fixed points of type constructors and primitive recursion. In: Marcinkowski, J., Tarlecki, A. (eds.) CSL 2004. LNCS, vol. 3210, pp. 190–204. Springer, Heidelberg (2004)
2. Abel, A., Matthes, R., Uustalu, T.: Generalized iteration and coiteration for higher-order nested datatypes. In: Gordon, A.D. (ed.) FOSSACS 2003. LNCS, vol. 2620, pp. 54–68. Springer, Heidelberg (2003)
3. Abel, A., Matthes, R., Uustalu, T.: Iteration and coiteration schemes for higher-order and nested datatypes. Theoretical Computer Science 333, 3–66 (2005)
4. Balaa, A., Bertot, Y.: Fix-point equations for well-founded recursion in type theory. In: Aagaard, M.D., Harrison, J. (eds.) TPHOLs 2000. LNCS, vol. 1869, pp. 1–16. Springer, Heidelberg (2000)
5. Barthe, G., Courtieu, P.: Efficient reasoning about executable specifications in Coq. In: Carreño, V.A., Muñoz, C.A., Tahar, S. (eds.) TPHOLs 2002. LNCS, vol. 2410, pp. 31–46. Springer, Heidelberg (2002)
6. Barthe, G., Forest, J., Pichardie, D., Rusu, V.: Defining and reasoning about recursive functions: A practical tool for the Coq proof assistant. In: Hagiya, M., Wadler, P. (eds.) FLOPS 2006. LNCS, vol. 3945, pp. 114–129. Springer, Heidelberg (2006)
7. Berghofer, S., Wenzel, M.: Inductive datatypes in HOL - lessons learned in formal-logic engineering. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 19–36. Springer, Heidelberg (1999)
8. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development. Coq'Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. Springer, Heidelberg (2004)
9. Bird, R., Meertens, L.: Nested datatypes. In: Jeuring, J. (ed.) MPC 1998. LNCS, vol. 1422, pp. 52–67. Springer, Heidelberg (1998)
10. Coq Development Team: The Coq Proof Assistant Reference Manual, Version 8.1. Project LogiCal, INRIA (2006), system available at http://coq.inria.fr
11. Hinze, R.: Manufacturing datatypes. Journal of Functional Programming 11, 493–524 (2001)
12. Matthes, R., Strecker, M.: Coq and Isabelle development for Verification of the Redecoration Algorithm for Triangular Matrices (2007), http://www.irit.fr/∼Ralph.Matthes/CoqIsabelle/TYPES07/
13. Nipkow, T., Paulson, L.C., Wenzel, M.T.: Isabelle/HOL. LNCS, vol. 2283. Springer, Heidelberg (2002)
14. Slind, K.: Wellfounded schematic definitions. In: McAllester, D. (ed.) CADE 2000. LNCS, vol. 1831, pp. 45–63. Springer, Heidelberg (2000)
15. Uustalu, T., Vene, V.: The dual of substitution is redecoration. In: Hammond, K., Curtis, S. (eds.) Trends in Functional Programming 3, pp. 99–110. Intellect, Bristol/Portland, OR (2002)
A Logic for Parametric Polymorphism with Effects

Rasmus Ejlers Møgelberg and Alex Simpson

LFCS, School of Informatics, University of Edinburgh

Abstract. We present a logic for reasoning about parametric polymorphism in combination with arbitrary computational effects (non-determinism, exceptions, continuations, side-effects, etc.). As examples of reasoning in the logic, we show how to verify correctness of polymorphic type encodings in the presence of effects.
1 Introduction
Strachey [11] defined a polymorphic program to be parametric if it applies the same uniform algorithm across all of its type instantiations. Parametric polymorphism has proved to be a very useful programming language feature. However, Strachey's informal definition does not lend itself to providing methods for verifying properties of polymorphic programs. Reynolds [10] addressed this by formulating the mathematical notion of relational parametricity, in which the uniformity in Strachey's definition is captured by requiring programs to preserve certain relations induced by the type structure. In the context of pure functional polymorphic languages, such as the second-order lambda calculus, relational parametricity has proven to be a powerful principle for establishing abstraction properties, proving equivalence of programs, and inferring useful properties of programs from their types alone [12]. Obtaining a useful, and indeed consistent, formulation of relational parametricity becomes trickier in the presence of computational effects (non-determinism, exceptions, side-effects, continuations, etc.). Even the addition of recursion (and hence possible non-termination) to the second-order lambda calculus causes difficulties. For this special case, Plotkin proposed second-order intuitionistic linear type theory as a suitable framework for formulating relational parametricity [9]. This framework has since been developed by the first author and colleagues [2], but it does not adapt to general effects. Recently, the authors have developed a more general framework that is appropriate for modelling parametric polymorphism in combination with arbitrary computational effects [4]. The framework is based on a custom-built type theory PE for combining polymorphism and effects, which is strongly influenced by Moggi's computational metalanguage [6] and Levy's call-by-push-value calculus [3].
As presented in [4], the type theory is interpreted in relationally parametric models developed within the context of an intuitionistic set theory as the mathematical metatheory. While this approach provides an efficient
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 142–156, 2008. © Springer-Verlag Berlin Heidelberg 2008
framework for building models, the underlying principles for reasoning about the combination of parametricity and effects are left buried amongst the (considerable) semantic details. The purpose of the present article is to extract the logic for parametricity with effects that is implicit within these models, and to give a self-contained presentation of it. In particular, no understanding of the semantic setting of [4] is required. The logic we present builds on Plotkin and Abadi's logic for parametric polymorphism in the second-order lambda calculus [7], and is influenced by the existing refinements of this logic to linear type theory and recursion [9,2]. The logic is built over the type theory PE, presented by the authors in [4]. As in Levy's call-by-push-value (CBPV) calculus [3], the calculus PE has two kinds of types: value types (whose elements are static values) and computation types (whose elements are dynamic effect-producing computations). The type theory allows for polymorphic quantification over value types as well as over computation types. A central result in [4] is that the algebraic operations that cause effects (as in [8]) can be given polymorphic types and satisfy a parametricity principle. For example, in a type theory for polymorphism and non-determinism, the non-deterministic choice operation has polymorphic type ∀X. X → X → X, where X ranges over all computation types. An essential ingredient in the logic we present is the division of relations into value relations and computation relations. The latter generalise the notion of admissible relations that arises in the theory of parametricity and recursion [2]. To see why such a notion is necessary for the formulation of a consistent theory of parametricity, consider the type ∀X. X → X → X of a binary non-deterministic choice operation, as above. Relational parametricity states that for all computation types X, Y and all relations R between them, any operation of the above type must preserve R.
If R were to range over arbitrary relations, then only the first and second projections would satisfy this condition, and so algebraic operations (such as non-deterministic choice) would not count as parametric. This is why a restricted class of computation relations is needed. Such relations can be thought of as relations that respect the computational structure. This paper makes two main contributions. The first is the formulation of the logic itself, which is given in Section 3. Here, our goal is to present the logic in an intelligible way, and we omit the (straightforward) proofs of the basic properties of the logic. Our second contribution is to use the logic to formalize correctness arguments for the type theory PE. In particular, we verify that our logic for parametricity with effects proves the desired universal properties for several polymorphically defined type constructors, including existential and coinductive computation types. For this, we include as much detail as space permits.
2 A Type Theory for Polymorphism and Effects
This section recalls the type theory PE for polymorphism and eﬀects as deﬁned and motivated in [4]; see also [5] for an application. As mentioned in the introduction, like CBPV [3], PE has two collections of types: value types and
  Γ, x : B ⊢− x : B
  Γ ⊢x:A x : A
  Γ, x : B ⊢Δ t : C  ⟹  Γ ⊢Δ λx : B. t : B → C
  Γ ⊢Δ s : B → C,  Γ ⊢− t : B  ⟹  Γ ⊢Δ s(t) : C
  Γ ⊢x:A t : B  ⟹  Γ ⊢− λ◦x : A. t : A ⊸ B
  Γ ⊢− s : A ⊸ B,  Γ ⊢Δ t : A  ⟹  Γ ⊢Δ s(t) : B
  Γ ⊢Δ t : B  ⟹  Γ ⊢Δ ΛX. t : ∀X. B    (X ∉ FTV(Γ, Δ))
  Γ ⊢Δ t : ∀X. B  ⟹  Γ ⊢Δ t(A) : B[A/X]
  Γ ⊢Δ t : B  ⟹  Γ ⊢Δ ΛX. t : ∀X. B    (X ∉ FTV(Γ, Δ))
  Γ ⊢Δ t : ∀X. B  ⟹  Γ ⊢Δ t(A) : B[A/X]

Fig. 1. Typing rules for PE
computation types. We follow Levy's convention of distinguishing syntactically between the two by underlining computation types, as in A, B, C, . . .. The calculus PE has polymorphic quantification over both value types and computation types, with type variables denoted X, Y, . . . and X, Y, . . . respectively. Value types and computation types are defined by the grammar

  A, B ::= X | A → B | ∀X. A | X | ∀X. A | A ⊸ B
  A, B ::= A → B | ∀X. A | X | ∀X. A

Note that the computation types form a subcollection of the value types. One semantic intuition is that value types are sets and computation types are algebras for some computational monad in the sense of Moggi [6]. In such a model, A ⊸ B is modelled by the collection of algebra homomorphisms, a set which does not in general carry a natural algebra structure and is thus a value type in PE, and the inclusion of computation types into value types is modelled by the forgetful functor mapping an algebra to its carrier. We refer the interested reader to [4] for a detailed discussion of such models. Typing judgements of PE are of the form Γ ⊢Δ t : A, where Γ is an ordinary context of variables, and Δ is a second context called the stoup, subject to the following conditions: either Δ is empty, or it is of the form Δ = z : B, in which case A is also required to be a computation type. The semantic intuition for the second case is that t denotes an algebra homomorphism from B to A. The typing rules are presented in Figure 1. In them, Γ ⊢− t : A denotes a judgement with empty stoup, and the operation FTV returns the set of free type variables, which is defined in the obvious way. Note the following consequence of the typing rules: if Γ ⊢z:A t : B is well typed, then so is Γ, z : A ⊢− t : B. Terms of PE are identified up to α-equivalence as usual. Although the calculus PE has just a few primitive type constructors, a wide range of derived type constructors, on both value types and computation types, can be encoded using polymorphism.
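The algebra/monad intuition can be made concrete in ordinary programming terms. Below is a hedged Python sketch using the exception monad as the computational effect; the choice of effect, and all names, are ours, and this is an illustration rather than part of PE:

```python
# A computation over values of type B is represented as either
# ("ok", b) or ("raise", e).  unit and bind are the monad structure in
# Moggi's sense; computation types then correspond to "effectful" data,
# value types to plain data.

def unit(b):
    # Inject a pure value into the computation type.
    return ("ok", b)

def bind(m, k):
    # Sequence a computation m with a continuation k; exceptions propagate.
    tag, payload = m
    return k(payload) if tag == "ok" else m

safe_div = lambda x, y: ("raise", "div0") if y == 0 else unit(x // y)

print(bind(unit(10), lambda n: safe_div(n, 2)))   # -> ('ok', 5)
print(bind(safe_div(1, 0), lambda n: unit(n + 1)))  # -> ('raise', 'div0')
```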
  1 =def ∀X. X → X
  0 =def ∀X. X
  A × B =def ∀X. (A → B → X) → X              (X ∉ FTV(A,B))
  A + B =def ∀X. (A → X) → (B → X) → X        (X ∉ FTV(A,B))
  ∃X. B =def ∀Y. (∀X. (B → Y)) → Y            (Y ∉ FTV(B))
  ∃X. B =def ∀Y. (∀X. (B → Y)) → Y            (Y ∉ FTV(B))
  μX. B =def ∀X. (B → X) → X                  (X +ve in B)
  νX. B =def ∃X. (X → B) × X                  (X +ve in B)

Fig. 2. Definable value types

  !B =def ∀X. (B → X) → X                     (X ∉ FTV(B))
  1◦ =def ∀X. 0 → X
  0◦ =def ∀X. X
  A ×◦ B =def ∀X. ((A ⊸ X) + (B ⊸ X)) → X    (X ∉ FTV(A,B))
  A ⊕ B =def ∀X. (A ⊸ X) → (B ⊸ X) → X       (X ∉ FTV(A,B))
  B · A =def ∀X. (B → A ⊸ X) → X              (X ∉ FTV(B,A))
  ∃◦X. A =def ∀Y. (∀X. (A ⊸ Y)) → Y           (Y ∉ FTV(A))
  ∃◦X. A =def ∀Y. (∀X. (A ⊸ Y)) → Y           (Y ∉ FTV(A))
  μ◦X. A =def ∀X. (A ⊸ X) → X                 (X +ve in A)
  ν◦X. A =def ∃◦X. (X ⊸ A) · X                (X +ve in A)

Fig. 3. Definable computation types
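The impredicative encodings of Figure 2 can be executed directly in an untyped language, where the quantifier over X becomes implicit. A hedged Python sketch of the product and coproduct encodings (uncurried for brevity, and of course without the typing guarantees of PE):

```python
# A x B = forall X. (A -> B -> X) -> X : a pair is a function awaiting a
# consumer k; the projections supply the obvious consumers.

def pair(a, b):
    return lambda k: k(a, b)

fst = lambda p: p(lambda a, b: a)
snd = lambda p: p(lambda a, b: b)

# A + B = forall X. (A -> X) -> (B -> X) -> X : an injection remembers
# which branch to invoke; case analysis is just application.

def in1(a):
    return lambda f, g: f(a)

def in2(b):
    return lambda f, g: g(b)

def case(s, f, g):
    return s(f, g)

print(fst(pair(1, 2)), snd(pair(1, 2)))                     # -> 1 2
print(case(in1(3), lambda x: x + 1, lambda y: -y))          # -> 4
print(case(in2(3), lambda x: x + 1, lambda y: -y))          # -> -3
```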
Since value types extend the second-order lambda calculus, the polymorphic type encodings known from that setting can be used for type encodings on value types in PE. Figure 2 recalls these type encodings and also shows how to encode existential quantification over computation types. Note that the encodings of inductive and coinductive types require a positive polarity of the type variable X. This notion is defined in the standard way, cf. Section 5. Figure 3 describes polymorphic encodings of a number of constructions on computation types. The first of these is the free computation type !B on a value type B. This plays the role of the monad in Moggi's computational lambda calculus [6], or more precisely of the F constructor in CBPV [3] (for further details, see [4]). The next constructions are the unit, product, initial object and binary coproduct of computation types. The type B · A is the B-fold copower of A, and thus can be thought of as a coproduct ∐x∈B A of computation types indexed by a value type. The remaining constructions are existential quantification over value types and computation types, packaged up as computation types, and inductive and coinductive computation types. We remark that the somewhat exotic-looking types appearing in this figure do have applications. For example,
in forthcoming work, we shall demonstrate an application to giving a (linear) continuation-passing translation of Levy's CBPV. In Section 3 below we formulate a logic for reasoning about relational parametricity in PE. The main applications of this logic, in Sections 4 and 5, will be to verify the correctness of a selection of the above type encodings.
3 The Logic
This section presents the first main contribution of the paper, our logic for reasoning about parametricity in PE. As mentioned in the introduction, this logic has been extracted as a formalization of the reasoning principles validated by the relationally parametric models of PE described in [4]. The purpose of this paper, however, is to give a self-contained presentation of the logic without reference to [4]. The idea is that the logic can be understood independently of its (somewhat convoluted) models. In order to be well-typed, propositions are defined in contexts of relation variables and term variables, denoted Θ and Γ respectively in the meta-notation. As mentioned in the introduction, the logic has two classes of relations: value relations between value types and computation relations between computation types. We use the notations Relv(A, B) and Relc(A, B) for the collections of all value relations between A and B and of all computation relations between A and B, respectively. The formation rules for propositions are given in Figure 4. In the figure, the notation Rel−(A, B) is used in some rules. In these cases the rule holds for both value relations and computation relations, and so is a shorthand for two rules. Note that we only include connectives and quantifiers from the negative fragment of intuitionistic logic. Although the others could be included in principle, we shall not need them, and so omit them for space reasons. The formation rules for relations are given in Figure 5. Relations are closed under conjunctions, under universal quantification, and under implications whose antecedent does not depend on the variables being related. This last restriction is motivated by the models considered in [4]. A similar condition is required on admissible relations in [2]. For the purposes of the present paper, this condition should just be accepted as a syntactic condition that needs to be adhered to when using the logic. Lemma 1. 1.
If ρ : Relc(A, B) in some context, then also ρ : Relv(A, B) in the same context. 2. If Γ ⊢− f : A → B then Θ ; Γ ⊢ (x : A, y : B). f(x) = y : Relv(A, B) for any relational context Θ. 3. If Γ ⊢− g : A ⊸ B then Θ ; Γ ⊢ (x : A, y : B). g(x) = y : Relc(A, B). We write f and g for the relations of items 2 and 3 in the lemma, and call these relations graphs. We use eqA for the graph of the identity function on A. Since relations ρ are always of the form (x : A, y : B). φ, we can use the meta-notation ρ(t, u) for φ[t, u/x, y] whenever Γ ⊢− t : A and Γ ⊢− u : B.
  Γ ⊢− t : A,  Γ ⊢− u : B,  R : Rel−(A, B) ∈ Θ  ⟹  Θ ; Γ ⊢ R(t, u) : Prop
  Γ ⊢− t : A,  Γ ⊢− u : A  ⟹  Θ ; Γ ⊢ t =A u : Prop
  Θ ; Γ ⊢ φ : Prop,  Θ ; Γ ⊢ ψ : Prop  ⟹  Θ ; Γ ⊢ φ ⋆ ψ : Prop    (⋆ ∈ {∧, ⊃})
  Θ ; Γ, x : A ⊢ φ : Prop  ⟹  Θ ; Γ ⊢ ∀x : A. φ : Prop
  Θ, R : Rel−(A, B) ; Γ ⊢ φ : Prop  ⟹  Θ ; Γ ⊢ ∀R : Rel−(A, B). φ : Prop
  Θ ; Γ ⊢ φ : Prop  ⟹  Θ ; Γ ⊢ ∀X. φ : Prop    (∗)
  Θ ; Γ ⊢ φ : Prop  ⟹  Θ ; Γ ⊢ ∀X. φ : Prop    (∗)

Fig. 4. Typing rules for propositions. Here (∗) is the side condition X ∉ FTV(Θ, Γ) and − ranges over {v, c}.
  Γ, x : A′ ⊢− t : A,  Γ, y : B′ ⊢− u : B  ⟹  Θ, R : Rel−(A, B) ; Γ ⊢ (x : A′, y : B′). R(t, u) : Relv(A′, B′)
  Γ, x : A ⊢− t : C,  Γ, y : B ⊢− u : C  ⟹  Θ ; Γ ⊢ (x : A, y : B). t = u : Relv(A, B)
  Γ ⊢x:A′ t : A,  Γ ⊢y:B′ u : B  ⟹  Θ, R : Relc(A, B) ; Γ ⊢ (x : A′, y : B′). R(t, u) : Relc(A′, B′)
  Γ ⊢x:A t : C,  Γ ⊢y:B u : C  ⟹  Θ ; Γ ⊢ (x : A, y : B). t = u : Relc(A, B)
  Θ ; Γ, z : C ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ ⊢ (x : A, y : B). ∀z : C. φ : Rel−(A, B)
  Θ, R : Rel=(C, C′) ; Γ ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ ⊢ (x : A, y : B). ∀R : Rel=(C, C′). φ : Rel−(A, B)
  Θ ; Γ ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ ⊢ (x : A, y : B). ∀X. φ : Rel−(A, B)
  Θ ; Γ ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ ⊢ (x : A, y : B). ∀X. φ : Rel−(A, B)
  Θ ; Γ ⊢ ψ : Prop,  Θ ; Γ ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ ⊢ (x : A, y : B). ψ ⊃ φ : Rel−(A, B)

Fig. 5. Typing rules for relations. In these rules −, = range over {v, c}.
Similarly, we can write ρop : Rel−(B, A) for (y : B, x : A). φ. If ρ is a value relation then so is ρop, and likewise for computation relations. Deduction sequents are written in the form Θ ; Γ | Φ ⊢ ψ, where Φ is a finite set of formulas. A deduction sequent is well-formed if Θ ; Γ ⊢ ψ : Prop and Θ ; Γ ⊢ φi : Prop for all φi in Φ, and we shall assume well-formedness whenever writing a deduction sequent. The rules for deduction in the logic are presented in Figure 6, to which should be added the rules for β- and η-equality as in Figure 7, and the usual rules for implication and conjunction, which we have omitted for reasons of space. The rules for equality implement a congruence relation on terms (the congruence rules not explicit in Figure 6 can be derived from the equality elimination rule). An important application of the logic is to prove equalities between terms. For terms Γ ⊢Δ s : A and Γ ⊢Δ t : A, we write Γ ⊢Δ s = t : A
  Θ ; Γ, x : A | Φ ⊢ ψ  ⟹  Θ ; Γ | Φ ⊢ ∀x : A. ψ    (x ∉ FV(Φ))
  Θ ; Γ | Φ ⊢ ∀x : A. ψ,  Γ ⊢− t : A  ⟹  Θ ; Γ | Φ ⊢ ψ[t/x]
  Θ, R : Rel−(A, B) ; Γ | Φ ⊢ ψ  ⟹  Θ ; Γ | Φ ⊢ ∀R : Rel−(A, B). ψ    (− ∈ {v, c})
  Θ ; Γ | Φ ⊢ ∀R : Rel−(A, B). ψ,  Θ ; Γ ⊢ (x : A, y : B). φ : Rel−(A, B)  ⟹  Θ ; Γ | Φ ⊢ ψ[φ[t, u/x, y]/R(t, u)]
  Θ ; Γ | Φ ⊢ ψ  ⟹  Θ ; Γ | Φ ⊢ ∀X. ψ    (X ∉ FTV(Θ, Γ, Φ))
  Θ ; Γ | Φ ⊢ ∀X. ψ  ⟹  Θ ; Γ | Φ ⊢ ψ[A/X]
  Θ ; Γ | Φ ⊢ ψ  ⟹  Θ ; Γ | Φ ⊢ ∀X. ψ    (X ∉ FTV(Θ, Γ, Φ))
  Θ ; Γ | Φ ⊢ ∀X. ψ  ⟹  Θ ; Γ | Φ ⊢ ψ[A/X]
  Γ ⊢− t : A  ⟹  Θ ; Γ | Φ ⊢ t = t
  Θ ; Γ | Φ ⊢ t = u,  Θ ; Γ | Φ ⊢ φ[t/x]  ⟹  Θ ; Γ | Φ ⊢ φ[u/x]
  Γ, x : B ⊢− t, u : C,  Θ ; Γ, x : B | Φ ⊢ t = u  ⟹  Θ ; Γ | Φ ⊢ λx : B. t = λx : B. u    (x ∉ FV(Φ))
  Γ ⊢x:A t, u : B,  Θ ; Γ, x : A | Φ ⊢ t = u  ⟹  Θ ; Γ | Φ ⊢ λ◦x : A. t = λ◦x : A. u    (x ∉ FV(Φ))
  Γ ⊢− t, u : B,  Θ ; Γ | Φ ⊢ t = u  ⟹  Θ ; Γ | Φ ⊢ ΛX. t = ΛX. u    (X ∉ FTV(Γ, Δ, Φ))
  Γ ⊢− t, u : B,  Θ ; Γ | Φ ⊢ t = u  ⟹  Θ ; Γ | Φ ⊢ ΛX. t = ΛX. u    (X ∉ FTV(Γ, Δ, Φ))

Fig. 6. Deduction rules
(although we shall often omit the type A) as notation for the deduction sequent − ; Γ, Δ ⊢ s = t. Thus Γ ⊢Δ s = t : A and Γ, Δ ⊢− s = t : A are equivalent. This corresponds to the faithfulness of the forgetful functor from computation types to value types in the semantic models of [4]. A related fact is that the canonical map of type (A ⊸ B) → A → B, given by the term λf : A ⊸ B. λx : A. f(x), is injective, which is derivable using the lemma below.

Lemma 2. The following extensionality schemas are provable in the logic.

  ∀f, g : A → B. (∀x : A. f(x) = g(x)) ⊃ f = g
  ∀f, g : A ⊸ B. (∀x : A. f(x) = g(x)) ⊃ f = g
  ∀x, y : (∀X. A). (∀X. x X = y X) ⊃ x = y
  ∀x, y : (∀X. A). (∀X. x X = y X) ⊃ x = y

  (λx : A. t)(u) = t[u/x]
  (λ◦x : A. t)(u) = t[u/x]
  λx : A. t(x) = t          if t : A → B and x ∉ FV(t)
  λ◦x : A. t(x) = t         if t : A ⊸ B and x ∉ FV(t)
  (ΛX. t) A = t[A/X]
  ΛY. t Y = t               if t : ∀X. A and Y ∉ FTV(t)
  (ΛX. t) A = t[A/X]
  ΛY. t Y = t               if t : ∀X. A and Y ∉ FTV(t)

Fig. 7. β, η rules for PE

  Xi[ρ, ρ] = ρi
  Xj[ρ, ρ] = ρj
  (A → B)[ρ, ρ] = (f : (A → B)(C, C), g : (A → B)(C′, C′)). ∀x : A(C, C), y : A(C′, C′). A[ρ, ρ](x, y) ⊃ B[ρ, ρ](f(x), g(y))
  (A ⊸ B)[ρ, ρ] = (f : (A ⊸ B)(C, C), g : (A ⊸ B)(C′, C′)). ∀x : A(C, C), y : A(C′, C′). A[ρ, ρ](x, y) ⊃ B[ρ, ρ](f(x), g(y))
  (∀X. A)[ρ, ρ] = (x : ∀X. A(C, C), y : ∀X. A(C′, C′)). ∀Y, Z. ∀R : Relv(Y, Z). A[ρ, ρ, R](x Y, y Z)
  (∀X. A)[ρ, ρ] = (x : ∀X. A(C, C), y : ∀X. A(C′, C′)). ∀Y, Z. ∀R : Relc(Y, Z). A[ρ, ρ, R](x Y, y Z)

Fig. 8. Relational interpretation of types
We now come to the crucial relational interpretation of types, needed to deﬁne relational parametricity. Suppose A is a type such that FTV(A) ⊆ {X, X} (using bold font for vectors), and ρ and ρ are vectors of relations of the same lengths as X and X respectively such that Θ ; Γ ρi : Relv (Ci , Ci ) for each i indexing an element of X, and Θ ; Γ ρj : Relc (Cj , Cj ) for each j indexing an element of
X. We deﬁne A[ρ, ρ/X, X] : Relv (A[C, C/X, X], A[C , C /X, X]) by structural induction on A as in Figure 8, using the short notation A[ρ, ρ] for A[ρ, ρ/X, X] and A(C, C) for A[C, C/X, X].
Lemma 3. If A is a computation type then A[ρ, ρ] is a computation relation. As our axiom for parametricity we shall take a version of Reynolds' identity extension schema [10], adapted to our setting. Using the shorthand notation ρ ≡ ρ′ for ∀x, y. ρ(x, y) ⊃⊂ ρ′(x, y), this can be stated as: A[eqB , eqB ] ≡ eqA[B,B]
(1)
where A ranges over all value types such that FTV(A) ⊆ {X, X} and B and B range over all vectors of value types and computation types respectively (open as well as closed). Lemma 4. Identity extension (1) is equivalent to the two parametricity schemas: ∀x : (∀Y. A(B, B, Y )). ∀Y, Z, R : Relv (Y, Z). A[eqB , eqB , R](x(Y ), x(Z)) ∀x : (∀Y . A (B, B, Y )). ∀Y , Z, R : Relc (Y , Z). A [eqB , eqB , R](x(Y ), x(Z)) where A, A range over types with FTV(A) ⊆ {X, X, Y } and FTV(A ) ⊆ {X, X, Y }
In the case of parametricity in the second-order lambda calculus, the equivalence asserted by Lemma 4 is well known. The proof in our setting is similar. In one direction, the parametricity schemas are special cases of the identity extension schema for polymorphic types. The other direction is proved by induction over the structure of types. Lemma 5 (Logical relations). Suppose Γ ⊢Δ t : A is a valid typing judgement with FTV(Γ, Δ, t, A) ⊆ {X, X} and suppose we are given vectors of relations ρ : Relv(C, C′) and ρ : Relc(C, C′). Suppose we are further given si : Bi(C, C) and si′ : Bi(C′, C′) for each xi : Bi in Γ, and if Δ = xn+1 : Bn+1 is non-empty, also sn+1 : Bn+1(C, C) and sn+1′ : Bn+1(C′, C′). If Bi[ρ, ρ](si, si′) for all i, then A[ρ, ρ](t[s, C, C/x, X, X], t[s′, C′, C′/x, X, X]). In the sequel, we apply the logic defined in this section to verify properties of PE. In doing so, we call the underlying logic, without the identity extension schema, L; and we write L+P for the logic with the identity extension schema (equivalently, the parametricity schemas) added.
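Lemma 5 is the engine behind the familiar "free theorems". As a hedged illustration outside PE, take the second-order type ∀X. list X → list X: the logical-relations property forces any inhabitant to commute with mapping, since it can only rearrange, duplicate or drop elements. Python can at least demonstrate the consequence on one instance (rearrange is our hypothetical inhabitant, not an operation from the paper):

```python
# One candidate inhabitant of forall X. list X -> list X: reverse the
# list and duplicate the old head at the end.  Being "parametric", it
# never inspects elements, so it must commute with any element-wise map.

def rearrange(xs):
    return xs[::-1] + xs[:1]

def fmap(f, xs):
    return [f(x) for x in xs]

f = lambda n: n * 10
xs = [1, 2, 3]
print(rearrange(fmap(f, xs)) == fmap(f, rearrange(xs)))  # -> True
```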
4 Verifying Polymorphic Type Encodings
In this section, we apply the logic to verify the correctness of a selection of the datatype encodings presented in Section 2. Our arguments will be in an informal style, including as much detail as space permits, but ensuring that they are always directly formalizable. The value type encodings of Figure 2 can be verified essentially as in Plotkin and Abadi's logic [7] (see also [1]). Nevertheless, we briefly discuss the case of coproducts, as it serves to illustrate a subtlety introduced by the stoup. The type A + B supports derived introduction and elimination rules as follows.

  Γ ⊢− t : A  ⟹  Γ ⊢− in1(t) : A + B
  Γ ⊢− u : B  ⟹  Γ ⊢− in2(u) : A + B
  Γ, x : A ⊢Δ u : C,  Γ, y : B ⊢Δ u′ : C,  Γ ⊢− t : A + B  ⟹  Γ ⊢Δ case t of in1(x). u; in2(y). u′ : C    (2)
Here, the left and right inclusions are defined as expected:

  in1(t) =def ΛX. λf : A → X. λg : B → X. f(t)
  in2(u) =def ΛX. λf : A → X. λg : B → X. g(u)

But the definition of the case construction depends on the stoup. If the stoup is empty, then

  case t of in1(x). u; in2(y). u′ =def t C (λx : A. u) (λy : B. u′)

but if it is non-empty, say Δ = z : C′, then case t of in1(x). u; in2(y). u′ is:
  (t (C′ ⊸ C) (λx : A. λ◦z : C′. u) (λy : B. λ◦z : C′. u′)) z.

That this encoding of coproducts enjoys the expected universal property is captured by the equalities in the theorem below.

Theorem 1. Suppose u, u′ are as in the hypotheses of the elimination rule (2). Then L proves
• If Γ ⊢− t : A then Γ ⊢Δ case in1(t) of in1(x). u; in2(y). u′ = u[t/x]
• If Γ ⊢− s : B then Γ ⊢Δ case in2(s) of in1(x). u; in2(y). u′ = u′[s/y]
If Γ ⊢− t : A + B and Γ, z : A + B ⊢Δ u : C then L+P proves

  Γ ⊢Δ case t of in1(x). u[in1(x)/z]; in2(y). u[in2(y)/z] = u[t/z]

We omit the proof, since it follows the usual argument using relational parametricity, cf. [7,1]. Instead, we turn to the constructs on computation types of Figure 3, whose verification makes essential use of computation relations. Although the type !A, corresponding to Moggi's T A [6] and Levy's F A [3], is a particularly important one for effects, we omit the verification of its universal property here, since the argument is given in detail in [4]. There, an informal argument is presented, which is justified in semantic terms; however, every detail of this argument is directly translatable into our logic. Instead, as our first example, we consider the type A · B, which represents an A-fold copower of B. Type-theoretically, the universal property requires a natural bijection between terms of type A → (B ⊸ C) and terms of type (A · B) ⊸ C. The derived introduction and elimination rules for copowers are

  Γ ⊢− t : A,  Γ ⊢Δ s : B  ⟹  Γ ⊢Δ t · s : A · B    (3)
  Γ, x : A ⊢y:B u : C,  Γ ⊢Δ t : A · B  ⟹  Γ ⊢Δ let x · y be t in u : C    (4)
Indeed, we define

  t · s = ΛX. λf : A → B ⊸ X. f(t)(s)
  let x · y be t in u = t C (λx : A. λ◦y : B. u).

Lemma 6. Suppose t, u are as in the hypotheses of the elimination rule (4) and Γ ⊢− f : C ⊸ C′. Then L+P proves

  Γ ⊢Δ let x · y be t in f(u) = f(let x · y be t in u)

Proof. Parametricity for t states

  − ; Γ, Δ ⊢ ∀X, Y, R : Relc(X, Y). ((eqA → eqB ⊸ R) → R)(t X, t Y)    (5)
152
R.E. Møgelberg and A. Simpson
By definition, −; Γ ⊢ (eqA → eqB ⊸ f)(λx : A. λ◦y : B. u, λx : A. λ◦y : B. f(u)). Also, by Lemma 1, f is a computation relation. Thus, applying (5), we get

−; Γ, Δ ⊢ f(t C (λx : A. λ◦y : B. u), t C′ (λx : A. λ◦y : B. f(u)))

i.e., by definition of the copower let expressions,

−; Γ, Δ ⊢ f(let x · y be t in u, let x · y be t in f(u))

So, by definition of f,

−; Γ, Δ ⊢ let x · y be t in f(u) = f(let x · y be t in u)

which means that Γ ⊢Δ let x · y be t in f(u) = f(let x · y be t in u)
is provable as desired.

Lemma 7. Suppose Γ ⊢Δ t : A · B and x, y are fresh. Then L+P proves

Γ ⊢Δ let x · y be t in x · y = t

Proof. By extensionality it suffices to prove that if X and f are fresh then

Γ, f : A → (B ⊸ X) ⊢Δ (let x · y be t in x · y) X f = t X f.

Since

Γ, f : A → (B ⊸ X) ⊢− λ◦x : A · B. x X f : (A · B) ⊸ X,

by Lemma 6,

Γ, f : A → (B ⊸ X) ⊢Δ (let x · y be t in x · y) X f = let x · y be t in (x · y X f)

But

let x · y be t in (x · y X f) = t X (λx : A. λ◦y : B. x · y X f) = t X f.

Theorem 2. If Γ ⊢− t : A, Γ ⊢Δ s : B and Γ, x : A ⊢y:B u : C then L proves

Γ ⊢Δ let x · y be t · s in u = u[t, s/x, y]

If Γ ⊢z:A·B u : C and Γ ⊢Δ t : A · B then L+P proves

Γ ⊢Δ let x · y be t in u[x · y/z] = u[t/z]

Proof. The first part follows from β and η equalities, and the second part from Lemmas 6 and 7.
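As an informal cross-check outside the paper's logic, the copower encoding A · B =def ∀X. (A → (B ⊸ X)) → X and the β-law of Theorem 2 can be transcribed into Lean, reading the linear arrow ⊸ as a plain function arrow (so linearity is not tracked); all names here are hypothetical, not the paper's:

```lean
-- Sketch (not from the paper): the copower encoding with linearity erased.
def Copower (A B : Type) : Type 1 :=
  (X : Type) → (A → B → X) → X

-- introduction: t · s
def pair {A B : Type} (t : A) (s : B) : Copower A B :=
  fun _ f => f t s

-- elimination: let x · y be p in u, with u given as a function of x and y
def letPair {A B C : Type} (p : Copower A B) (u : A → B → C) : C :=
  p C u

-- the β-law of Theorem 2 holds definitionally in this transcription:
example {A B C : Type} (t : A) (s : B) (u : A → B → C) :
    letPair (pair t s) u = u t s := rfl
```

The η-law, by contrast, is exactly what requires parametricity (Lemmas 6 and 7) and does not hold definitionally.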
This theorem formulates the desired universal property for copower types. We consider one other example from Figure 3, existential computation types of the form ∃◦X. A. The derived introduction and elimination rules are:

Γ ⊢Δ t : A[B/X]
———————————————————  (6)
Γ ⊢Δ ⟨B, t⟩ : ∃◦X. A

Γ ⊢x:A u : B    Γ ⊢Δ t : ∃◦X. A
————————————————————————————————  (7)  (X ∉ FTV(B))
Γ ⊢Δ let ⟨X, x⟩ be t in u : B

with the relevant term constructors defined by:

⟨B, t⟩ =def ΛY. λf : (∀X. A ⊸ Y). f B (t)
let ⟨X, x⟩ be t in u =def t B (ΛX. λ◦x : A. u)

Since the correctness argument follows very closely that for copower types, we merely state the relevant lemmas and theorem, omitting the proofs.

Lemma 8. Suppose t, u are as in the hypothesis of the elimination rule (7) and that Γ ⊢− f : B ⊸ B′. Then L+P proves

Γ ⊢Δ let ⟨X, x⟩ be t in f(u) = f(let ⟨X, x⟩ be t in u)

Lemma 9. Suppose Γ ⊢Δ t : ∃◦X. A. Then L+P proves

Γ ⊢Δ let ⟨X, x⟩ be t in ⟨X, x⟩ = t

Theorem 3. Suppose Γ ⊢Δ t : A[B/X] and Γ ⊢x:A u : C. Then:
• L proves Γ ⊢Δ let ⟨X, x⟩ be ⟨B, t⟩ in u = u[B, t/X, x]
• If Γ ⊢z:∃◦X.A s : C then L+P proves Γ ⊢Δ let ⟨X, x⟩ be t in s[⟨X, x⟩/z] = s[t/z]
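The existential encoding ∃◦X. A =def ∀Y. (∀X. A ⊸ Y) → Y admits the same kind of transcription, again erasing linearity; this is a sketch with hypothetical names, not the paper's formalisation:

```lean
-- Sketch (not from the paper): the existential encoding with linearity erased.
def Ex (A : Type → Type) : Type 1 :=
  (Y : Type) → ((X : Type) → A X → Y) → Y

-- introduction: ⟨B, t⟩
def pack (A : Type → Type) (B : Type) (t : A B) : Ex A :=
  fun _ f => f B t

-- elimination: let ⟨X, x⟩ be e in u
def unpack {A : Type → Type} {Y : Type} (e : Ex A)
    (u : (X : Type) → A X → Y) : Y :=
  e Y u

-- the β-rule of Theorem 3 holds definitionally in this transcription:
example (A : Type → Type) (B : Type) (t : A B) {Y : Type}
    (u : (X : Type) → A X → Y) :
    unpack (pack A B t) u = u B t := rfl
```

As with copowers, the η-rule (second bullet of Theorem 3) is the part that needs parametricity.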
5 Inductive and Coinductive Types
This final section describes encodings of general inductive and coinductive computation types and verifies the correctness of the latter. To describe the universal properties of these types we need to consider the functorial actions of the type constructors of PE. This is essentially a standard analysis of type structure, adapted to the setting of the two collections of types in PE. We define positive and negative occurrences of type variables in types in the standard way (→ and ⊸ reverse the polarity of the type variables occurring on the left; all other type constructors preserve polarity). If A is a value type in
Yi(f, g, h, k) = gi   (value type variable)
Yj(f, g, h, k) = kj   (computation type variable)
(A → B)(f, g, h, k) = λl : (A → B). B(f, g, h, k) ◦ l ◦ A(g, f, k, h)
(A ⊸ B)(f, g, h, k) = λl : (A ⊸ B). B(f, g, h, k) ◦ l ◦ A(g, f, k, h)
(∀Y. A)(f, g, h, k) = λx : ∀Y. A. ΛY. A(f, g, h, k, idY, idY)(x Y)   (value ∀)
(∀Y. A)(f, g, h, k) = λx : ∀Y. A. ΛY. A(f, g, h, k, idY, idY)(x Y)   (computation ∀)

Fig. 9. The functorial interpretation of types
which the variables X, X′ occur only negatively and the type variables Y, Y′ occur only positively, then we can define a term

MA : ∀X, X′, Y, Y′, X, X′, Y, Y′. (X′ → X) → (Y → Y′) → (X′ ⊸ X) → (Y ⊸ Y′) → A → A(X′, Y′, X′, Y′)

The term MA is defined by structural induction over A, simultaneously with the definition of a term

NB : ∀X, X′, Y, Y′, X, X′, Y, Y′. (X′ → X) → (Y → Y′) → (X′ ⊸ X) → (Y ⊸ Y′) → B ⊸ B(X′, Y′, X′, Y′)

for any computation type B satisfying the same condition on the occurrences of variables as above. The definition is in Figure 9, in which the simplified notation A(f, g, h, k) is used for MA (or NA whenever A is a computation type) applied to f, g, h, k, alongside evident notation for function composition.

Lemma 10. For computation types A the terms MA and NA agree up to inclusion of ⊸ into →. Moreover, the terms MA define functors, since they:
• preserve identities: A(id, id, id, id) = id
• preserve compositions: A(f ◦ f′, g′ ◦ g, h ◦ h′, k′ ◦ k) = A(f′, g′, h′, k′) ◦ A(f, g, h, k)

Finally, we adapt the graph lemma of [7] to our setting.

Lemma 11 (Graph Lemma). If f : B′ → B, g : C → C′, h : B′ ⊸ B, k : C ⊸ C′ then L+P proves A[f op, g, hop, k] ≡ A(f, g, h, k)

Suppose A is a computation type whose only free type variable is the computation type variable X, which occurs only positively. As a consequence of parametric polymorphism, the types

μ◦X. A =def ∀X. (A ⊸ X) → X
ν◦X. A =def ∃◦X. (X ⊸ A) · X

are carriers of initial algebras and final coalgebras, respectively, for the functor induced by A. Here, we show how the universal property for final coalgebras can be verified in our logic.
The final coalgebra structure is defined using

unfold : ∀X. (X ⊸ A) → X ⊸ ν◦X. A
out : (ν◦X. A) ⊸ A[ν◦X. A/X]

defined as

unfold =def ΛX. λf : X ⊸ A. λ◦x : X. ⟨X, f · x⟩
out =def λ◦x : ν◦X. A. let ⟨X, y⟩ be x in (let f · z be y in A(unfold X f)(f(z)))

Lemma 12. If Γ ⊢− f : B ⊸ A[B/X] then L proves

Γ ⊢x:B out(unfold B f x) = A(unfold B f)(f(x)).

In diagrammatic form, Lemma 12 means that the square

B ————————f———————⊸ A[B/X]
│                        │
unfold B f          A(unfold B f)
↓                        ↓
ν◦X. A ——————out—————⊸ A[ν◦X. A/X]

commutes, i.e., the term unfold verifies that out is a weak final coalgebra.

Lemma 13. Suppose Γ ⊢− h : B ⊸ B′, f : B ⊸ A[B/X], f′ : B′ ⊸ A[B′/X], and that Γ ⊢x:B f′(h(x)) = A(h)(f(x)). Then L+P proves

Γ ⊢x:B unfold B f x = unfold B′ f′ (h(x))

Proof. By the Graph Lemma (Lemma 11) the assumption can be reformulated as (h ⊸ A[h/X])(f, f′). So, by parametricity of unfold,

(h ⊸ eqν◦X.A)(unfold B f, unfold B′ f′)

implying Γ ⊢x:B unfold B f x = unfold B′ f′ (h(x)).
Lemma 14. L+P proves unfold (ν◦X. A) out = idν◦X.A.

Proof. By Lemma 13, for arbitrary X, f : X ⊸ A,

unfold X f = (unfold (ν◦X. A) out) ◦ (unfold X f)

so Γ, f : X ⊸ A ⊢x:X ⟨X, f · x⟩ = unfold (ν◦X. A) out ⟨X, f · x⟩. This implies, using Lemmas 8 and 9, that for any y : ν◦X. A

y = let ⟨X, f · x⟩ be y in ⟨X, f · x⟩
  = let ⟨X, f · x⟩ be y in (unfold (ν◦X. A) out ⟨X, f · x⟩)
  = unfold (ν◦X. A) out (let ⟨X, f · x⟩ be y in ⟨X, f · x⟩)
  = unfold (ν◦X. A) out y

and so the lemma follows from extensionality.
Theorem 4. Suppose h : B ⊸ ν◦X. A and f : B ⊸ A[B/X] are such that A(h) ◦ f = out ◦ h. Then L+P proves h = unfold B f.

Proof. By Lemma 13 and Lemma 14, unfold B f = (unfold (ν◦X. A) out) ◦ h = h.
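The shape of ν◦X. A =def ∃◦X. (X ⊸ A) · X — a packaged state type, a step map, and a seed — can be made concrete by transcribing it into Lean for the specific functor A = Nat × X (streams of naturals), which keeps the state type in Type; all names are hypothetical and linearity is again ignored:

```lean
-- Sketch (not from the paper): ν◦X. A =def ∃◦X. (X ⊸ A) · X,
-- specialised to A = Nat × X, i.e. streams of naturals.
structure NatStream where
  X : Type              -- the existentially quantified state type
  step : X → Nat × X    -- the coalgebra f : X ⊸ A
  seed : X              -- the copower component

-- unfold packs a coalgebra and a state, as in unfold X f x = ⟨X, f · x⟩
def unfold {B : Type} (f : B → Nat × B) (b : B) : NatStream :=
  ⟨B, f, b⟩

-- out observes one step; the functorial action A(unfold X f) re-packs the tail
def out (s : NatStream) : Nat × NatStream :=
  let (a, x') := s.step s.seed
  (a, unfold s.step x')

-- e.g. the stream of naturals counting up from n
def from' (n : Nat) : NatStream := unfold (fun k => (k, k + 1)) n

#eval (out (from' 5)).1  -- 5
```

Theorem 4 says that, with parametricity, this weak final coalgebra is in fact final: any coalgebra morphism into it equals the corresponding unfold.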
References
1. Birkedal, L., Møgelberg, R.E.: Categorical models of Abadi-Plotkin's logic for parametricity. Mathematical Structures in Computer Science 15(4), 709–772 (2005)
2. Birkedal, L., Møgelberg, R.E., Petersen, R.L.: Linear Abadi & Plotkin logic. Logical Methods in Computer Science 2 (2006)
3. Levy, P.B.: Call-By-Push-Value: A Functional/Imperative Synthesis. Kluwer, Dordrecht (2004)
4. Møgelberg, R.E., Simpson, A.K.: Relational parametricity for computational effects. In: LICS, pp. 346–355. IEEE Computer Society Press, Los Alamitos (2007)
5. Møgelberg, R.E., Simpson, A.K.: Relational parametricity for control considered as a computational effect. Electronic Notes in Theoretical Computer Science 173, 295–312 (2007)
6. Moggi, E.: Notions of computation and monads. Information and Computation 93, 55–92 (1991)
7. Plotkin, G.D., Abadi, M.: A logic for parametric polymorphism. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 361–375. Springer, Heidelberg (1993)
8. Plotkin, G.D., Power, J.: Algebraic operations and generic effects. Applied Categorical Structures 11(1), 69–94 (2003)
9. Plotkin, G.D.: Type theory and recursion (extended abstract). In: Proceedings, Eighth Annual IEEE Symposium on Logic in Computer Science, Montreal, Canada, June 19–23, 1993, p. 374. IEEE Computer Society Press, Los Alamitos (1993)
10. Reynolds, J.C.: Types, abstraction, and parametric polymorphism. Information Processing 83, 513–523 (1983)
11. Strachey, C.: Fundamental concepts in programming languages. Lecture Notes, International Summer School in Computer Programming, Copenhagen (August 1967)
12. Wadler, P.: Theorems for free! In: Proceedings 4th International Conference on Functional Programming Languages and Computer Architecture (1989)
Working with Mathematical Structures in Type Theory Claudio Sacerdoti Coen and Enrico Tassi Department of Computer Science, University of Bologna Mura Anteo Zamboni, 7 – 40127 Bologna, Italy {sacerdot,tassi}@cs.unibo.it
Abstract. We address the problem of representing mathematical structures in a proof assistant which: 1) is based on a type theory with dependent types, telescopes, and a computational version of Leibniz equality; 2) implements coercive subtyping, accepting multiple coherent paths between type families; 3) implements a restricted form of higher-order unification and type reconstruction. We show how to exploit these quite common features to reduce the "syntactic" gap between pen&paper and formalised algebra. However, to reach our goal we need to propose unification and type-reconstruction heuristics that are slightly different from the ones usually implemented. We have implemented them in Matita.
1 Introduction
It is well known that formalising mathematical concepts in type theory is not straightforward, and one of the most used metrics for this difficulty is the gap (in lines of text) between the pen&paper proof and the formalised version. A reason may be that many intuitive concepts widely used in mathematics, like graphs for example, have no simple and handy representation (see for example the complex hypermap construction used to describe planar maps in the four colour theorem [11]). On the contrary, some widely studied fields of mathematics do have a precise and formal description of the objects they study. The best known one is algebra, where a rigorous hierarchy of structures is defined and investigated. One may expect that formalising algebra in an interactive theorem prover should be smooth, and that the so-called De Bruijn factor should not be so high for that particular subject. Many papers in the literature [9] give evidence that this is not the case. In this paper we analyse some of the problems that arise in formalising a hierarchy of algebraic structures, and we propose a general mechanism that allows one to tighten the distance between the algebraic hierarchy as conceived by mathematicians and the one that can be effectively implemented in type theory. In particular, we want to be able to formalise the following informal piece of mathematics1 without making more information explicit, expecting the interactive theorem prover to understand it as a mathematician would.

1 PlanetMath, definition of Ordered Vector Space.
M. Miculan, I. Scagnetto, and F. Honsell (Eds.): TYPES 2007, LNCS 4941, pp. 157–172, 2008. © Springer-Verlag Berlin Heidelberg 2008
Example 1. Let k be an ordered field. An ordered vector space over k is a vector space V that is also a poset, such that the following conditions are satisfied:
1. for any u, v, w ∈ V, if u ≤ v then u + w ≤ v + w;
2. if 0 ≤ u ∈ V and 0 < λ ∈ k, then 0 ≤ λu.
Here is a property that can be immediately verified: u ≤ v iff λu ≤ λv for any 0 < λ.

We choose this running example instead of the most common example about rings [9,16,3] because we believe the latter to be a little deceiving. Indeed, a ring is usually defined as a triple (C, +, ∗) such that (C, +) is a group, (C, ∗) is a semigroup, and some distributive properties hold. This definition is imprecise, or at least not complete, since it does not list the neutral element and the inverse function of the group. Its real meaning is just that a ring is an additive group that is also a multiplicative semigroup (on the same carrier) with some distributive properties. Indeed, the latter way of defining structures is often adopted also by mathematicians when the structures become more complex and embed more operations (e.g. vector spaces, Riesz spaces, integration algebras). Considering again our running example, we want to formalise it using the following syntax2, and we expect the proof assistant to interpret it as expected:
The ﬁrst statement declares a record type. A record type is a sort of labelled telescope. A telescope is just a generalised Σtype. Inhabitants of a telescope of length n are heavily typed ntuples x1 , . . ., xn T1 ,...,Tn where xi must have type Ti x1 . . .xi−1 . The heavy types are necessary for type reconstruction. Instead, inhabitants of a record type with n ﬁelds are not heavily typed ntuples, but lighter ntuples x1 , . . ., xn R where R is a reference to the record type declaration, which declares once and for all the types of ﬁelds. Thus terms containing inhabitants of records are smaller and require less typechecking time than their equivalents that use telescopes. Beware of the diﬀerences between our records — which are implemented, at least as telescopes, in most systems like Coq — and dependently typed records “`a la Betarte/Tasistro/Pollack” [5,4,8]: 1. there is no “dot” constructor to uniformly access by name ﬁelds of any record. Thus the names of these projections must be diﬀerent, as .CApo and .CAvs . 2
2 The syntax is the one of the Matita proof assistant, which is quite close to the one of Coq. We reserve λ for lambda-abstraction.
We suppose that ad-hoc projections .k, .v, etc. are automatically declared by the system. When we write x.v we mean the application of the .v function to x;
2. there is no structural subtyping relation "à la Betarte/Tasistro" between records; however, ad-hoc coercions "à la Pollack" can be declared by the user; in particular, we suppose that when a field is declared using ":>", the relative projection is automatically declared as a coercion by the system;
3. there are no manifest fields "à la Pollack"; the with notation is usually understood as syntactic sugar for declaring on-the-fly a new record with a manifest field; however, having no manifest fields in our logic, we will need a different explanation for the with type constructor; it will be given in Sec. 2.

When lambda-abstractions and dependent products do not type their variable, the type of the variable must be inferred by the system during type reconstruction. Similarly, all mathematical notation (e.g. "∗") hides the application of one projection to a record (e.g. "?.∗" where ? is a placeholder for a particular record). The notation "x:R" can also hide a projection R.CA from R to its carrier. All projections are monomorphic, in the sense that different structures have different projections to their carrier. All placeholders in projections must be inferred during type reconstruction. This is not a trivial task: in the expression "α ∗ u ≤ α ∗ v" both sides of the inequation are applications of the scalar product of some vector space R (since u and v have been previously assigned the type R.CA); since their results are compared, the system must deduce that the vector space R must also be a poset, hence an ordered vector space.
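For contrast, the structure of Example 1 can be written as a single flattened record in Lean; this is a hypothetical sketch, not the paper's Matita encoding, and it deliberately sidesteps the carrier-sharing problem by merging the vector-space and poset components (losing the hierarchy the paper wants to preserve):

```lean
-- Flattened sketch (hypothetical): one record, one shared carrier.
-- Scalars are simplified to Nat instead of an ordered field k.
structure OrderedVectorSpace where
  CA : Type                         -- the shared carrier
  add : CA → CA → CA
  smul : Nat → CA → CA              -- scalar multiplication
  le : CA → CA → Prop
  add_le_compat : ∀ u v w, le u v → le (add u w) (add v w)
```

With separate `V :> VectorSpace` and `p :> Poset` fields as in the paper's syntax, the two carriers live in different records, and reconciling them is exactly what the `with` constraint is for.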
In the rest of the paper we address the problem of representing mathematical structures in a proof assistant which: 1) is based on a type theory with dependent types, telescopes and a computational version of Leibniz equality; 2) implements coercive subtyping, accepting multiple coherent paths between type families; 3) implements a restricted form of higher order uniﬁcation and type reconstruction. Lego, Coq, Plastic and Matita are all examples of proof assistants based on such theories. In the next sections we highlight one by one the problems that all these systems face in understanding the syntax of the previous example, proposing solutions that require minimal modiﬁcations to the implementation.
2 Dependently Typed Records in Type Theory
The first problem is understanding the with type constructor employed in the example. Pollack et al. in [8] propose a model for a new type theory having primitive dependently typed records in the syntax, and show how to interpret records in the model. The theory lacks with, but it can easily be added to the syntax (adopting the rules proposed in [16]) and also interpreted in the model. However, no non-prototypical proof assistant currently implements primitive dependently typed records.
2.1 Ψ and Σ Types
In [16], Randy Pollack shows that dependently typed records with uniform field projections and with can be implemented in a type theory extended with inductive types and the induction-recursion principle [10]. However, induction-recursion is also not implemented in most proof assistants, and we are looking for a solution in a simpler framework where we only have primitive records (or even simply primitive telescopes or primitive Σ-types), but no inductive types. In the same paper, he also shows how to interpret dependently typed records with and without manifest fields in a simpler type theory having only primitive Σ-types and primitive Ψ-types. A Σ-type (Σx:T. P x) is inhabited by heavily typed couples ⟨w, p⟩T,P where w is an inhabitant of the type T and p is an inhabitant of P w. The heavy type annotation is required for type inference. A Ψ-type (Ψx:T. p) is inhabited by heavily typed singletons ⟨w⟩T,P,p where w is an inhabitant of the type T and p is a function mapping x of type T to a value of type P x. The intuitive idea is that ⟨w, p[w]⟩T,P and ⟨w⟩T,P,λx:T. p[x] should represent the same couple, where in the first case the value of the second component is opaque, while in the second case it is made manifest (as a function of the first component). However, the two representations actually are different, and morally equivalent inhabitants of the two types are not convertible, against intuition. We will see later how it is possible to represent couples typed with manifest fields as convertible couples with opaque fields. We will denote by .1 and .2 the first and second projection of a Σ/Ψ-type. The syntax "Σx:T. P x with .2 = t[.1]" can now be understood as syntactic sugar for "Ψx:T. t[x]". The illusion is completed by declaring a coercion from Ψx:T. p to Σx:T. P x so that ⟨w⟩T,P,p is automatically mapped to ⟨w, p w⟩T,P when required. Most common mathematical structures are records with more than two fields.
Pollack explains that such a structure can be understood as a sequence of left-associating3 nested heavily typed pairs/singletons. For instance, the record r ≡ ⟨nat, list nat, @⟩R of type R := {C : Type; T := list C; app: T → T → T} is represented as4

r0 ≡ ⟨(), nat⟩Unit, λC:Unit. Type
r1 ≡ ⟨r0⟩ΣC:Unit. Type, λx:(ΣC:Unit. Type). Type1, λy:(ΣC:Unit. Type). list y.1
r ≡ ⟨r1, @⟩Ψy:(ΣC:Unit. Type). list y.1, λx:(Ψy:(ΣC:Unit. Type). list y.1). x.2→x.2→x.2

of type Σx:(Ψy:(ΣC:Unit. Type). list y.1). x.2 → x.2 → x.2. However, the deep heavy type annotations are actually useless and make the term extremely large and its type checking inefficient. The interpretation of with also becomes more complex, since the nested Σ/Ψ types must be recursively traversed to compute the new type.
3 In the same paper he also proposes to represent a record type as a right-associating sequence of Σ/Φ types, where a Φ type looks like a Ψ type, but makes its first field manifest. However, in Sect. 5.2.2 he also argues for the left-associating solution.
4 Type1 in the definition of r1 is the second universe in Luo's ECC [13]. Note that Type has type Type1.
2.2 Weakly Manifest Types
In this paper we drop Σ/Ψ types in favour of primitive records, whose inhabitants do not require heavy type annotations. However, we are back at the problem of manifest fields: every time the user declares a record type with n fields, to follow closely the approach of Pollack the system should declare 2^n record types having all possible combinations of manifest/opaque fields. To obtain a new solution for manifest fields we exploit the fact that manifest fields can be declared using with, and we also go back to the intuition that records with and without manifest fields should all have the same representation. That is, when x ≡ 3 (x is definitionally equal to 3) and p : P x, the term ⟨x, p⟩R should be an inhabitant both of the record R := {n: nat; H: P n} and of the record R with n = 3. Intuitively, the with notation should only add in the context the new "hypothesis" x ≡ 3. However, we want to be able to obtain this effect without extending the type theory with with and without adding at run time new equations to the convertibility check. This is partially achieved by approximating x ≡ 3 with a hypothesis of type x = 3, where "=" is Leibniz polymorphic equality. To summarise, the idea is to represent an inhabitant of R := {n: nat; H: P n} as a couple ⟨x, p⟩R and an inhabitant of R with n=3 as a couple ⟨c, q⟩R, λc:R. c.n=3 of type Σc:R. c.n=3. Declaring the first projection of the Σ-type as a coercion, the system is able to map every element of R with n=3 into an element of R. However, the illusion is not complete yet: if c is an inhabitant of R with n=3, then c.1.n (which can be written as c.n because .1 is a coercion) is Leibniz-equal to 3 (because of c.2), but is not convertible to 3. This is problematic, since terms that were well-typed in the system presented by Pollack are rejected here.
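The approximation just described can be transcribed into Lean, where the Σ-type becomes a subtype; the names (`P`, `R`, `RWith`, `toR`) are hypothetical, not the paper's:

```lean
-- Sketch (hypothetical names): "R with n = m" as a Σ-type pairing
-- a record with a Leibniz equality on its field.
def P (n : Nat) : Prop := 0 < n

structure R where
  n : Nat
  H : P n

-- R with n = m, encoded as Σ c : R. c.n = m
def RWith (m : Nat) : Type := { c : R // c.n = m }

-- the naive coercion is just the first projection; for c : RWith 3,
-- (toR 3 c).n is Leibniz-equal to 3 (via c.property) but, for an
-- opaque c, not convertible to 3
def toR (m : Nat) (c : RWith m) : R := c.val
```

The failure of convertibility under this naive coercion is exactly the problem the next section's manifesting coercions solve.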
Several concrete examples can already be found in our running example: to type u + w ≤ v + w (in the declaration of add_le_compat), the carriers p.CApo and V.CAvs must be convertible, whereas they are only Leibniz-equal. In principle, it would be possible to avoid the problem by replacing u + w ≤ v + w with [u+w]p.2 ≤ [v+w]p.2, where [ ] is the constant corresponding to Leibniz elimination, i.e. [x]w has type Q[M] whenever x has type Q[N] and w has type N=M. However, the insertion of these constants, even if done automatically with a couple of mutual coercions, makes the terms much larger and more difficult to reason about.

2.3 Manifesting Coercions
To overcome the problem, consider c of type R with n=3 and notice that the lack of conversion can be observed only in c.1.n (which is not definitionally equal to 3) and in all fields of c.1 that come after n (for instance, the second field has type P c.1.n in place of P 3). Moreover, the user never needs to write c.1 anywhere, since c.1 is declared as a coercion. Thus we can try to solve the problem by declaring a different coercion such that c.1.n is definitionally equal to 3. In our example, the coercion5 is

5 The name of the coercion is knR verbatim; R and n are not indices.
definition knR : ∀M:nat. R with n=M → R :=
  λM:nat. λx:(Σc:R. c.n=M). ⟨M, [x.1.H]x.2⟩R
Once knR is declared as a coercion, c.H is interpreted as (knR 3 c).H, which has type P (knR 3 c).n, which is now definitionally equal to P 3. Note also that (knR 3 c).H is definitionally equal to [c.1.H]c.2, which is definitionally equal to c.1.H when c.2 is a closed term of type c.1.n = 3. When the latter holds, c.1.n is also definitionally equal to 3, and the manifest type information is actually redundant, according to the initial intuition. The converse holds when the system is proof-irrelevant or, with minor modifications, when Leibniz equality is stated on a decidable type [12]. Coming back to our running example, u + w ≤ v + w can now be parsed as the well-typed term

u (V.+) w ((kCApoPoset V.CAvs p).≤) v (V.+) w
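The manifesting coercion has a direct Lean counterpart: rebuild the record with the manifest value in the field position and transport the dependent proof along the equality. Again the names are hypothetical, continuing the sketch of the previous section:

```lean
-- Sketch (hypothetical names): the manifesting coercion knR.
-- The field n is replaced by the manifest value m, and the proof field
-- is transported along the equality, so (kR m c).n is *definitionally* m.
def P (n : Nat) : Prop := 0 < n

structure R where
  n : Nat
  H : P n

def RWith (m : Nat) : Type := { c : R // c.n = m }

def kR (m : Nat) (c : RWith m) : R :=
  { n := m, H := c.property ▸ c.val.H }

-- the manifest field is now convertible to m, not merely Leibniz-equal:
example (m : Nat) (c : RWith m) : (kR m c).n = m := rfl
```

Here `▸` plays the role of the paper's Leibniz-elimination constant [ ]: it transports `c.val.H : P c.val.n` along `c.property : c.val.n = m` to `P m`.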
Things get a little more complex when with is used to change the value of a field f1 that occurs in the type of a second field f2 that occurs in the type of a third field f3. Consider the record type declaration R := {f1 : T; f2 : P f1; f3 : Q f1 f2} and the expression R with f1 = M, interpreted as Σc:R. c.f1 = M. We must find a coercion from R with f1 = M to R, declared as follows

definition kf1R : ∀M:T. R with f1 = M → R :=
  λM:T. λc:(Σc:R. c.f1=M). ⟨M, [c.1.f2]c.2, w⟩R
for some w that inhabits Q M [c.1.f2]c.2 and that must behave as c.1.f3 when c.1.f1 ≡ M. Observe that c.1.f3 has type Q c.1.f1 c.1.f2, which is definitionally equivalent to Q c.1.f1 [c.1.f2]reflT c.1.f1, where reflT c.1.f1 is the canonical proof of c.1.f1 = c.1.f1. Thus, the term w we are looking for is simply [[c.1.f3]]c.2, which has type Q M [c.1.f2]c.2, where [[ ]] is the constant corresponding to computational dependent elimination for Leibniz equality:

lemma [[ ]]p : Q x (reflA x) → Q y p.

where x : A, y : A, p : x = y, Q : (∀z. x = z → Type), and [[M]]reflA x ≡ M.
To avoid handling the first field differently from the following ones, we can always use [[ ]] in place of [ ]. The following derived typing and reduction rules show that our encoding of with behaves as expected.

PhiStart
  ──────────
  ∅ valid

PhiCons
  Φ valid    R, l1, . . . , ln free in Φ    ⊢ Ti : Πl1:T1. . . . Πli−1:Ti−1. Type    (i ∈ {1, . . . , n})
  ──────────
  Φ, R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type valid

Form
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ    Γ, l1 : T1, . . . , li−1 : Ti−1 ⊢ a : Ti
  ──────────
  Γ ⊢ R with li = a : Type

Intro
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ    Γ ⊢ R with li = a : Type
  Γ ⊢ Mk : Tk M1 . . . Mi−1 a Mi+1 . . . Mk−1    (k ∈ {1, . . . , i − 1, i + 1, . . . , n})
  ──────────
  Γ ⊢ ⟨⟨M1, . . . , Mi−1, a, Mi+1, . . . , Mn⟩R, reflA a⟩R, λr:R. r.li=a : R with li = a

Coerc
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ    Γ ⊢ R with li = a : Type    Γ ⊢ c : R with li = a
  ──────────
  Γ ⊢ kliR a c : R

CoercRed
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ
  ──────────
  Γ ⊢ kliR a ⟨⟨M1, . . . , Mn⟩R, w⟩R,s ▷ ⟨M1, . . . , Mi−1, a, [[Mi+1]]w, . . . , [[Mn]]w⟩R

Proj
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ    Γ ⊢ R with li = a : Type    Γ ⊢ c : R with li = a
  ──────────
  Γ ⊢ (kliR a c).lj : Tj (kliR a c).l1 . . . (kliR a c).lj−1

ProjRed1
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ
  ──────────
  Γ ⊢ (kliR a ⟨⟨M1, . . . , Mn⟩R, w⟩R,s).lj ▷ Mj    (j < i)

ProjRed2
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ
  ──────────
  Γ ⊢ (kliR a c).li ▷ a

ProjRed3
  (R = ⟨l1 : T1, . . . , ln : Tn⟩ : Type) ∈ Φ
  ──────────
  Γ ⊢ (kliR a ⟨⟨M1, . . . , Mn⟩R, w⟩R,s).lj ▷ [[Mj]]w    (i < j)

2.4 Deep with Construction
In order to interpret the with type constructor on "deep" fields, it is sufficient to follow the same schema, changing the coercion to make their subrecords manifest. Formally, when Q := {f: T; l : U} and R := {q: Q; s: S}, we interpret R with q.f = M as Σc:R. c.q.f = M and we declare the coercion:

definition kq.fR : ∀M:T. R with q.f = M → R :=
  λM:T. λx:(Σc:R. c.q.f=M).
    kqR (kfQ M ⟨x.1.1, x.2⟩Q, λq:Q. q.f=M)
        (match x with ⟨⟨⟨a, l⟩Q, s⟩R, w⟩R, λr:R. r.q.f=M ⇒
           ⟨⟨⟨a, l⟩Q, s⟩R, [[reflQ ⟨a, l⟩Q]]w⟩R, λr:R. r.q = kfQ M ⟨⟨a, l⟩Q, w⟩Q, λq:Q. q.f=M)
Note that the computational rule associated to the computational dependent elimination of Leibniz equality is necessary to type the previous coercion:

⟨⟨⟨a, l⟩Q, s⟩R, [[reflQ ⟨a, l⟩Q]]w⟩R, λr:R. r.q = kfQ M ⟨⟨a, l⟩Q, w⟩Q, λq:Q. q.f=M

is well typed since reflQ ⟨a, l⟩Q has type ⟨a, l⟩Q = ⟨a, l⟩Q, which is equivalent to ⟨a, l⟩Q = kfQ a ⟨⟨a, l⟩Q, reflT a⟩Q, λq:Q. q.f=q.f; thus [[reflQ ⟨a, l⟩Q]]w has type ⟨a, l⟩Q = kfQ M ⟨⟨a, l⟩Q, w⟩Q, λq:Q. q.f=M. As expected, (kq.fR M c).q.f ▷ M for all c of type R with q.f = M. Due to lack of space we omit all the other derived typing and reduction rules associated to the deep with construct.

2.5 Nested with Constructions
Finally, from the derived typing and reduction rules it is not evident that a type R with la=M with lb=N can be formed. Surprisingly, this type poses no additional problem. The system simply desugars it as

Σd:(Σc:R. c.la=M). (klaR M d).lb = N

and, as explained in the next section, automatically declares the composite coercion kla,lbR := λM,N,c. klbR N (klaR M c) as a coercion from R with la=M with lb=N to R such that:

(kla,lbR M N c).la ▷ M and (kla,lbR M N c).lb ▷ N and
(kla,lbR M N ⟨⟨M1, . . ., Mn⟩R, wa, wb⟩).li ▷ {{Mi}}wa,wb

where {{Mi}}wa,wb is Mi (if i