HANDBOOK OF LOGIC IN COMPUTER SCIENCE
Editors S. Abramsky, Dov M. Gabbay, T. S. E. Maibaum
HANDBOOKS OF LOGIC IN COMPUTER SCIENCE and ARTIFICIAL INTELLIGENCE AND LOGIC PROGRAMMING
Executive Editor Dov M. Gabbay
Administrator Jane Spurr
Handbook of Logic in Computer Science
Volume 1 Background: Mathematical structures
Volume 2 Background: Computational structures
Volume 3 Semantic structures
Volume 4 Semantic modelling
Volume 5 Logic and algebraic methods
Volume 6 Logical methods in computer science

Handbook of Logic in Artificial Intelligence and Logic Programming
Volume 1 Logical foundations
Volume 2 Deduction methodologies
Volume 3 Nonmonotonic reasoning and uncertain reasoning
Volume 4 Epistemic and temporal reasoning
Volume 5 Logic programming
Handbook of Logic in Computer Science Volume 5 Logic and Algebraic Methods
Edited by S. ABRAMSKY Christopher Stanley Professor of Computing University of Oxford
DOV M. GABBAY Augustus De Morgan Professor of Logic King's College, London and
T. S. E. MAIBAUM Professor of the Foundation of Software Engineering King's College, London
Volume Co-ordinator DOV M. GABBAY
CLARENDON PRESS • OXFORD 2000
OXFORD UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Athens Auckland Bangkok Bogota Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Shanghai Singapore Taipei Tokyo Toronto Warsaw with associated companies in Berlin Ibadan

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© The contributors listed on pp. xvii-xviii 2000

The moral rights of the authors have been asserted

Database right Oxford University Press (maker)

First published 2000

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data

ISBN 0-19-853781-6

Typeset by the authors and Jane Spurr using LaTeX

Printed in Great Britain on acid-free paper by Biddles Ltd., Guildford & King's Lynn
Preface

We are happy to present Volume 5 of the Handbook of Logic in Computer Science, on Logic and Algebraic Methods. The first two volumes of the Handbook presented the background on fundamental mathematical structures (consequence relations, model theory, recursion theory, category theory, universal algebra, topology) and on computational structures (term-rewriting systems, λ-calculi, modal and temporal logics and algorithmic proof systems). The computational structures considered thus far have been predominantly syntactic in character, while the discussion of mathematical structures has been quite general and free-standing.

In Volumes 3 and 4 these threads were drawn together. We looked at how mathematical structures are used to model computational processes. In Volume 3, the focus is on general approaches: domain theory, denotational and algebraic semantics, and the semantics of types. In Volume 4, some more specific topics are considered, and the emphasis shifts from structures to the actual modelling of some key computational features.

The present Volume 5 continues with logical and algebraic methodologies basic to computer science. Chapter 1 covers Martin-Löf's type theory; originally developed to clarify the foundations of constructive mathematics, it now plays a major role in theoretical computer science. The second chapter covers categorical logic, the interaction area between category theory and mathematical logic. It builds on the basic concepts introduced in the chapter 'Basic Category Theory' in Volume 1 of this Handbook series. The third chapter presents methods for obtaining lower bounds on the computational complexity of logical theories. Many such theories show up in the landscape of logic and computation. The fourth chapter covers algebraic specification and types. It treats the subject using set theoretical notions only and is thus accessible to a wide range of readers. The last (fifth) chapter deals with computability on abstract data types. It develops a theory of computable functions on abstract many-sorted algebras, a general enough notion for the needs of computer science.
The Handbooks

We see the creation of this Handbook and its companion, the Handbook of Logic in Artificial Intelligence and Logic Programming, as a combination of authoritative exposition, comprehensive survey, and fundamental research exploring the underlying unifying themes in the various areas. The intended
audience is graduate students and researchers in the areas of computing and logic, as well as other people interested in the subject. We assume as background some mathematical sophistication. Much of the material will also be of interest to logicians and mathematicians.

The tables of contents of the volumes were finalized after extensive discussions between Handbook authors and second readers. The first two volumes present the background: Mathematical Structures and Computational Structures. The chapters, which in many cases are of monographic length and scope, are written with emphasis on possible unifying themes. The chapters have an overview, introduction, and main body. A final part is dedicated to more specialized topics.

Chapters are written by internationally renowned researchers in their respective areas. The chapters are co-ordinated and their contents were discussed in joint meetings. Each chapter has been written using the following procedures:

1. A very detailed table of contents was discussed and co-ordinated at several meetings between authors and editors of related chapters. The discussion was in the form of a series of lectures by the authors. Once an agreement was reached on the detailed table of contents, the authors wrote a draft and sent it to the editors and to other related authors. For each chapter there is a second reader (the first reader is the author) whose job it has been to scrutinize the chapter together with the editors. The second reader's role is very important and has required effort and serious involvement with the authors.

2. Once this process was completed (i.e. drafts seen and read by a large enough group of authors), there were other meetings on several chapters in which authors lectured on their chapters and faced the criticism of the editors and audience. The final drafts were prepared after these meetings.

3. We attached great importance to group effort and co-ordination in the writing of chapters. The first two parts of each chapter, namely the introduction-overview and main body, are not completely under the discretion of the author, as he/she had to face the general criticism of all the other authors. Only the third part of the chapter is entirely for the authors' own personal contribution.

The Handbook meetings were generously financed by OUP, by SERC contract SO/809/86, by the Department of Computing at Imperial College, and by several anonymous private donations. Recently I have moved, together with the Editorial Office, to King's College, London. I thank King's College for their open-ended support of our publication efforts.

We would like to thank our colleagues, authors, second readers, and students for their effort and professionalism in producing the manuscripts
for the Handbook. We would particularly like to thank the staff of OUP for their continued and enthusiastic support, and Mrs Jane Spurr, our OUP Administrator, for her dedication and efficiency.
A View

Finally, fifteen years after the start of the Handbook project, I would like to take this opportunity to put forward my current views about logic in computer science. In the early 1980s the perception of the role of logic in computer science was that of a specification and reasoning tool and that of a basis for possibly neat computer languages. The computer scientist was manipulating data structures and the use of logic was one of his options.

My own view at the time was that there was an opportunity for logic to play a key role in computer science and to exchange benefits with this rich and important application area and thus enhance its own evolution. The relationship between logic and computer science was perceived as very much like the relationship of applied mathematics to physics and engineering. Applied mathematics evolves through its use as an essential tool, and so we hoped for logic.

Today my view has changed. As computer science and artificial intelligence deal more and more with distributed and interactive systems, processes, concurrency, agents, causes, transitions, communication and control (to name a few), the researcher in this area has more and more in common with the traditional philosopher who has been analysing such questions for centuries (unrestricted by the capabilities of any hardware). The principles governing the interaction of several processes, for example, are abstract and similar to principles governing the cooperation of two large organisations. A detailed, rule-based, effective but rigid bureaucracy is very similar to a complex computer program handling and manipulating data. My guess is that the principles underlying one are very much the same as those underlying the other.

I believe the day is not far away in the future when the computer scientist will wake up one morning with the realisation that he is actually a kind of formal philosopher!

London, April 1999
D. M. Gabbay
Contents

List of contributors

Martin-Löf's type theory
B. Nordström, K. Petersson and J. M. Smith
1 Introduction
  1.1 Different formulations of type theory
  1.2 Implementations
2 Propositions as sets
3 Semantics and formal rules
  3.1 Types
  3.2 Hypothetical judgements
  3.3 Function types
  3.4 The type Set
  3.5 Definitions
4 Propositional logic
5 Set theory
  5.1 The set of Boolean values
  5.2 The empty set
  5.3 The set of natural numbers
  5.4 The set of functions (Cartesian product of a family of sets)
  5.5 Propositional equality
  5.6 The set of lists
  5.7 Disjoint union of two sets
  5.8 Disjoint union of a family of sets
  5.9 The set of small sets
6 The ALF series of interactive editors for type theory
Categorical logic
Andrew M. Pitts
1 Introduction
2 Equational logic
  2.1 Syntactic considerations
  2.2 Categorical semantics
  2.3 Internal languages
3 Categorical datatypes
  3.1 Disjoint union types
  3.2 Product types
  3.3 Function types
  3.4 Inductive types
  3.5 Computation types
4 Theories as categories
  4.1 Change of category
  4.2 Classifying category of a theory
  4.3 Theory-category correspondence
  4.4 Theories with datatypes
5 Predicate logic
  5.1 Formulas and sequents
  5.2 Hyperdoctrines
  5.3 Satisfaction
  5.4 Propositional connectives
  5.5 Quantification
  5.6 Equality
  5.7 Completeness
6 Dependent types
  6.1 Syntactic considerations
  6.2 Classifying category of a theory
  6.3 Type-categories
  6.4 Categorical semantics
  6.5 Dependent products
7 Further reading
A uniform method for proving lower bounds on the computational complexity of logical theories
K. Compton and C. Ward Henson
1 Introduction
2 Preliminaries
3 Reductions between formulas
4 Inseparability results for first-order theories
5 Inseparability results for monadic second-order theories
6 Tools for NTIME lower bounds
7 Tools for linear ATIME lower bounds
8 Applications
9 Upper bounds
10 Open problems

Algebraic specification of abstract data types
J. Loeckx, H.-D. Ehrich and M. Wolf
1 Introduction
2 Algebras
  2.1 The basic notions
  2.2 Homomorphisms and isomorphisms
  2.3 Abstract data types
  2.4 Subalgebras
  2.5 Quotient algebras
3 Terms
  3.1 Syntax
  3.2 Semantics
  3.3 Substitutions
  3.4 Properties
4 Generated algebras, term algebras
  4.1 Generated algebras
  4.2 Freely generated algebras
  4.3 Term algebras
  4.4 Quotient term algebras
5 Algebras for different signatures
  5.1 Signature morphisms
  5.2 Reducts
  5.3 Extensions
6 Logic
  6.1 Definition
  6.2 Equational logic
  6.3 Conditional equational logic
  6.4 Predicate logic
7 Models and logical consequences
  7.1 Models
  7.2 Logical consequence
  7.3 Theories
  7.4 Closures
  7.5 Reducts
  7.6 Extensions
8 Calculi
  8.1 Definitions
  8.2 An example
  8.3 Comments
9 Specification
10 Loose specifications
  10.1 Genuinely loose specifications
  10.2 Loose specifications with constructors
  10.3 Loose specifications with free constructors
11 Initial specifications
  11.1 Initial specifications in equational logic
  11.2 Examples
  11.3 Properties
  11.4 Expressive power of initial specifications
  11.5 Proofs
  11.6 Term rewriting systems and proofs
  11.7 Rapid prototyping
  11.8 Initial specifications in conditional equational logic
  11.9 Comments
12 Constructive specifications
13 Specification languages
  13.1 A simple specification language
  13.2 Two further language constructs
  13.3 Adding an environment
  13.4 Flattening
  13.5 Properties and proofs
  13.6 Rapid prototyping
  13.7 Further language constructs
  13.8 Alternative semantics description
14 Modularization and parameterization
  14.1 Modularized abstract data types
  14.2 Atomic module specifications
  14.3 A modularized specification language
  14.4 A parameterized specification language
  14.5 Comments
  14.6 Alternative parameter mechanisms
15 Further topics
  15.1 Behavioural abstraction
  15.2 Implementation
  15.3 Ordered sorts
  15.4 Exceptions
  15.5 Dynamic data types
  15.6 Objects
  15.7 Bibliographic notes
16 The categorical approach
  16.1 Categories
  16.2 Institutions

Computable functions and semicomputable sets on many-sorted algebras
J. V. Tucker and J. I. Zucker
1 Introduction
  1.1 Computing in algebras
  1.2 Examples of computable and non-computable functions
  1.3 Relations with effective algebra
  1.4 Historical notes on computable functions on algebras
  1.5 Objectives and structure of the chapter
  1.6 Prerequisites
2 Signatures and algebras
  2.1 Signatures
  2.2 Terms and subalgebras
  2.3 Homomorphisms, isomorphisms and abstract data types
  2.4 Adding Booleans: Standard signatures and algebras
  2.5 Adding counters: N-standard signatures and algebras
  2.6 Adding the unspecified value u; algebras A^u of signature Σ^u
  2.7 Adding arrays: Algebras A* of signature Σ*
  2.8 Adding streams: Algebras A of signature Σ
3 While computability on standard algebras
  3.1 Syntax of While(Σ)
  3.2 States
  3.3 Semantics of terms
  3.4 Algebraic operational semantics
  3.5 Semantics of statements for While(Σ)
  3.6 Semantics of procedures
  3.7 Homomorphism invariance theorems
  3.8 Locality of computation
  3.9 The language WhileProc(Σ)
  3.10 Relative While computability
  3.11 For(Σ) computability
  3.12 WhileN and ForN computability
  3.13 While* and For* computability
  3.14 Remainder set of a statement; snapshots
  3.15 Σ*/Σ conservativity for terms
4 Representations of semantic functions; universality
  4.1 Gödel numbering of syntax
  4.2 Representation of states
  4.3 Representation of term evaluation
  4.4 Representation of the computation step function
  4.5 Representation of statement evaluation
  4.6 Representation of procedure evaluation
  4.7 Computability of semantic representing functions; term evaluation property
  4.8 Universal WhileN procedure for While
  4.9 Universal WhileN procedure for While*
  4.10 Snapshot representing function and sequence
  4.11 Order of a tuple of elements
  4.12 Locally finite algebras
  4.13 Representing functions for specific terms or programs
5 Notions of semicomputability
  5.1 While semicomputability
  5.2 Merging two procedures: Closure theorems
  5.3 Projective While semicomputability: semicomputability with search
  5.4 WhileN semicomputability
  5.5 Projective WhileN semicomputability
  5.6 Solvability of the halting problem
  5.7 While* semicomputability
  5.8 Projective While* semicomputability
  5.9 Homomorphism invariance for semicomputable sets
  5.10 The computation tree of a While statement
  5.11 Engeler's lemma
  5.12 Engeler's lemma for While* semicomputability
  5.13 Σ1* definability: Input/output and halting formulae
  5.14 The projective equivalence theorem
  5.15 Halting sets of While procedures with random assignments
6 Examples of semicomputable sets of real and complex numbers
  6.1 Computability on R and C
  6.2 The algebra of reals; a set which is projectively While semicomputable but not While* semicomputable
  6.3 The ordered algebra of reals; sets of reals which are While semicomputable but not While* computable
  6.4 A set which is projectively While* semicomputable but not projectively WhileN semicomputable
  6.5 Dynamical systems and chaotic systems on R; sets which are WhileN semicomputable but not While* computable
  6.6 Dynamical systems and Julia sets on C; sets which are WhileN semicomputable but not While* computable
7 Computation on topological partial algebras
  7.1 The problem
  7.2 Partial algebras and While computation
  7.3 Topological partial algebras
  7.4 Discussion: Two models of computation on the reals
  7.5 Continuity of computable functions
  7.6 Topological characterisation of computable sets in compact algebras
  7.7 Metric partial algebra
  7.8 Connected domains: computability and explicit definability
  7.9 Approximable computability
  7.10 Abstract versus concrete models for computing on the real numbers
8 A survey of models of computability
  8.1 Computability by function schemes
  8.2 Machine models
  8.3 High-level programming constructs; program schemes
  8.4 Axiomatic methods
  8.5 Equational definability
  8.6 Inductive definitions and fixed-point methods
  8.7 Set recursion
  8.8 A generalised Church-Turing thesis for computability
  8.9 A Church-Turing thesis for specification
  8.10 Some other applications

Index
Contributors

B. Nordström, Kent Petersson and Jan Smith
Department of Computer Science, Chalmers Tekniska Högskola, Institutionen för Datavetenskap, S-412 96 Göteborg, Sweden

A. Pitts
University of Cambridge Computer Laboratory, New Museums Site, Pembroke Street, Cambridge CB2 3QG

J. Loeckx
Universität des Saarlandes, D-66041 Saarbrücken, Germany

H.-D. Ehrich
Technische Universität Braunschweig, D-38023 Braunschweig, Germany

M. Wolf
Universität des Saarlandes, D-66041 Saarbrücken, Germany

K. Compton
Department of Computer Science, University of Aarhus, Ny Munkegade, Bldg 540, DK-8000 Aarhus C, Denmark

C. Ward Henson
Department of Mathematics, University of Illinois, 1409 Green St., Urbana, IL 61801, USA

J. V. Tucker
Department of Computer Science, University of Swansea, Singleton Park, Swansea, Wales

J. I. Zucker
Department of Computing and Software, Faculty of Engineering, McMaster University, 1280 Main Street West, JHE-327, Hamilton, Ontario, L8S 4L7, Canada
Martin-Löf's type theory
B. Nordström, K. Petersson and J. M. Smith
Contents
1 Introduction
  1.1 Different formulations of type theory
  1.2 Implementations
2 Propositions as sets
3 Semantics and formal rules
  3.1 Types
  3.2 Hypothetical judgements
  3.3 Function types
  3.4 The type Set
  3.5 Definitions
4 Propositional logic
5 Set theory
  5.1 The set of Boolean values
  5.2 The empty set
  5.3 The set of natural numbers
  5.4 The set of functions (cartesian product of a family of sets)
  5.5 Propositional equality
  5.6 The set of lists
  5.7 Disjoint union of two sets
  5.8 Disjoint union of a family of sets
  5.9 The set of small sets
6 The ALF series of interactive editors for type theory
1 Introduction

The type theory described in this chapter has been developed by Martin-Löf with the original aim of being a clarification of constructive mathematics. Unlike most other formalizations of mathematics, type theory is not based on predicate logic. Instead, the logical constants are interpreted within type theory through the Curry-Howard correspondence between propositions and sets [Curry and Feys, 1958; Howard, 1980]: a proposition is interpreted as a set whose elements represent the proofs of the proposition.
It is also possible to view a set as a problem description in a way similar to Kolmogorov's explanation of the intuitionistic propositional calculus [Kolmogorov, 1932]. In particular, a set can be seen as a specification of a programming problem; the elements of the set are then the programs that satisfy the specification. An advantage of using type theory for program construction is that it is possible to express both specifications and programs within the same formalism. Furthermore, the proof rules can be used to derive a correct program from a specification as well as to verify that a given program has a certain property. As a programming language, type theory is similar to typed functional languages such as ML [Gordon et al., 1979; Milner et al., 1990] and Haskell [Hudak et al., 1992], but a major difference is that the evaluation of a well-typed program always terminates.

The notion of constructive proof is closely related to the notion of computer program. To prove a proposition (∀x ∈ A)(∃y ∈ B)P(x, y) constructively means to give a function f which when applied to an element a in A gives an element b in B such that P(a, b) holds. So if the proposition (∀x ∈ A)(∃y ∈ B)P(x, y) expresses a specification, then the function f obtained from the proof is a program satisfying the specification. A constructive proof could therefore itself be seen as a computer program and the process of computing the value of a program corresponds to the process of normalizing a proof. It is by this computational content of a constructive proof that type theory can be used as a programming language; and since the program is obtained from a proof of its specification, type theory can be used as a programming logic. The relevance of constructive mathematics to computer science was pointed out by Bishop [1970].

Several implementations of type theory have been made which can serve as logical frameworks, that is, different theories can be directly expressed in the implementations.
The formulation of type theory we will describe in this chapter forms the basis for such a framework, which we will briefly present in the last section.

The chapter is structured as follows. First we will give a short overview of different formulations and implementations of type theory. Section 2 will explain the fundamental idea of propositions as sets by means of Heyting's explanation of the intuitionistic meaning of the logical constants. The following section will give a rather detailed description of the basic rules and their semantics; on a first reading some of this material may just be glanced at, in particular the subsection on hypothetical judgements. In section 4 we illustrate type theory as a logical framework by expressing propositional logic in it. Section 5 introduces a number of different sets and the final section gives a short description of ALF, an implementation of the type theory of this chapter.

Although self-contained, this chapter can be seen as a complement to our book, Programming in Martin-Löf's Type Theory. An Introduction [Nordström et al., 1990], in that we here give a presentation of Martin-Löf's monomorphic type theory in which there are two basic levels, that of types and that of sets. The book is mainly concerned with a polymorphic formulation where instead of a level of types there is a theory of expressions. One major difference between these two formulations is that in the monomorphic formulation there is more type information in the terms, which makes it possible to implement a type checker [Magnusson and Nordström, 1994]; this is important when type theory is used as a logical framework where type checking is the same as proof checking.
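The reading of a constructive proof of (∀x ∈ A)(∃y ∈ B)P(x, y) as a program, described in the introduction, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: A and B are both the natural numbers, P(a, b) says that b is the integer square root of a, and the names spec and witness are hypothetical. Python cannot represent the proof object itself, so the specification is only checked at run time rather than guaranteed by the type of the program.

```python
def spec(a: int, b: int) -> bool:
    """P(a, b): b is the integer square root of the natural number a."""
    return b * b <= a < (b + 1) * (b + 1)


def witness(a: int) -> int:
    """The function f: for each natural number a, compute some b with P(a, b)."""
    b = 0
    # The loop terminates because (b + 1)^2 eventually exceeds a.
    while (b + 1) * (b + 1) <= a:
        b += 1
    return b


if __name__ == "__main__":
    # Check the specification on a finite sample of inputs.
    print(all(spec(a, witness(a)) for a in range(200)))
```

In type theory the check in the last line would be unnecessary: an element of the set expressing the specification carries, together with the witness, the evidence that P(a, b) holds.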
1.1 Different formulations of type theory
One of the basic ideas behind Martin-Löf's type theory is the Curry-Howard interpretation of propositions as types, that is, in our terminology, propositions as sets. This view of propositions is closely related to Heyting's explanation of intuitionistic logic [Heyting, 1956] and will be explained in detail below.

Another source for type theory is proof theory. Using the identification of propositions and sets, normalizing a derivation corresponds to computing the value of the proof term expressing the derivation. One of Martin-Löf's original aims with type theory was that it could serve as a framework in which other theories could be interpreted. A normalization proof for type theory would then immediately give normalization for a theory expressed in type theory.

In Martin-Löf's first formulation of type theory in 1971 [Martin-Löf, 1971a], theories like first-order arithmetic, Gödel's T [Gödel, 1958], second-order logic and simple type theory [Church, 1940] could easily be interpreted. However, this formulation contained a reflection principle expressed by a universe V and including the axiom V ∈ V, which was shown by Girard to be inconsistent. Coquand and Huet's 'Calculus of Constructions' [Coquand and Huet, 1986] is closely related to the type theory in [Martin-Löf, 1971a]: instead of having a universe V, they have the two types Prop and Type and the axiom Prop ∈ Type, thereby avoiding Girard's paradox.

Martin-Löf's later formulations of type theory have all been predicative; in particular, second-order logic and simple type theory cannot be interpreted in them. The strength of the theory considered in this chapter instead comes from the possibility of defining sets by induction.

The formulation of type theory from 1979 in 'Constructive Mathematics and Computer Programming' [Martin-Löf, 1982] is polymorphic and extensional.
One important difference with the earlier treatments of type theory is that normalization is not obtained by metamathematical reasoning; instead, a direct semantics is given, based on Tait's computability method. A consequence of the semantics is that a term, which is an element in a set, can be computed to normal form. For the semantics of this theory, lazy evaluation is essential. Because of a strong elimination rule for the set expressing the propositional equality, judgemental equality is not decidable. This theory is also the one in Intuitionistic Type Theory [Martin-Löf, 1984]. It is also the theory used in the NuPRL system [Constable et al., 1986] and by the group in Groningen [Backhouse et al., 1989].

The type theory presented in this chapter was put forward by Martin-Löf in 1986 with the specific intention that it should serve as a logical framework.
1.2 Implementations
One major application of type theory is its use as a programming logic in which you derive programs from specifications. Such derivations easily become long and tedious and, hence, error-prone, so it is essential to formalize the proofs and to have computerized tools to check them.

There are several examples of computer implementations of proof checkers for formal logics. An early example is the AUTOMATH system [de Bruijn, 1970; de Bruijn, 1980] which was designed by de Bruijn to check proofs of mathematical theorems. Quite large proofs were checked by the system, for example the proofs in Landau's book Grundlagen der Analysis [Jutting, 1979]. Another system, which is more intended as a proof assistant, is the Edinburgh (Cambridge) LCF system [Gordon et al., 1979; Paulson, 1987]. The proofs are constructed in a goal-directed fashion, starting from the proposition the user wants to prove and then using tactics to divide it into simpler propositions. The LCF system also introduced the notion of metalanguage (ML) in which the user could implement her own proof strategies. Based on the LCF system, a system for Martin-Löf's type theory was implemented in Göteborg in 1982 [Petersson, 1982, 1984]. Another, more advanced, system for type theory was developed by Constable et al. at Cornell University [Constable et al., 1986].

In recent years, several logical frameworks based on type theory have been implemented: the Edinburgh LF [Harper et al., 1993], Coq from INRIA [Dowek et al., 1991], LEGO from Edinburgh [Luo and Pollack, 1992], and ALF from Göteborg [Augustsson et al., 1990; Magnusson and Nordström, 1994]. Coq and LEGO are both based on Coquand and Huet's calculus of constructions, while ALF is an implementation of the theory we describe in this chapter. A brief overview of the ALF system is given in section 6.
2 Propositions as sets
The basic idea of type theory to identify propositions with sets goes back to Curry [Curry and Feys, 1958], who noticed that the axioms for positive implicational calculus, formulated in the Hilbert style,

A ⊃ B ⊃ A
(A ⊃ B ⊃ C) ⊃ (A ⊃ B) ⊃ A ⊃ C

correspond to the types of the basic combinators K and S,

K ∈ A → B → A
S ∈ (A → B → C) → (A → B) → A → C

Modus ponens then corresponds to functional application. Tait [Tait, 1965] noticed the further analogy that removing a cut in a derivation corresponds to a reduction step of the combinator representing the proof. Howard [Howard, 1980] extended these ideas to first-order intuitionistic arithmetic.

Another way to see that propositions can be seen as sets is through Heyting's [Heyting, 1956] explanations of the logical constants. The constructive explanation of logic is in terms of proofs: a proposition is true if we know how to prove it. For implication we have:

A proof of A ⊃ B is a function (method, program) which to each proof of A gives a proof of B.

The notion of function or method is primitive in constructive mathematics and a function from a set A to a set B can be viewed as a program which when applied to an element in A gives an element in B as output. The idea of propositions as sets is now to identify a proposition with the set of its proofs. In the case of implication we get:

A ⊃ B is identified with A → B, the set of functions from A to B.

The elements in the set A → B are of the form λx.b, where b ∈ B and b may depend on x ∈ A.

Heyting's explanation of conjunction is that a proof of A ∧ B is a pair whose first component is a proof of A and whose second component is a proof of B. Hence, we get the following interpretation of a conjunction as a set.

A ∧ B is identified with A × B, the cartesian product of A and B.

The elements in the set A × B are of the form (a, b), where a ∈ A and b ∈ B.

A disjunction is constructively true if and only if we can prove one of the disjuncts. So a proof of A ∨ B is either a proof of A or a proof of B. Hence,

A ∨ B is identified with A + B, the disjoint union of A and B.

The elements in the set A + B are of the form inl(a) and inr(b), where a ∈ A and b ∈ B.

The negation of a proposition A can be defined by:
B. Nordstrom, K. Petersson and J. M. Smith
where $\bot$ stands for absurdity, that is, a proposition which has no proof. If we let {} denote the empty set, we have - {} and then obtain an element in {}
and therefore
Example 5.2. Using the rules for natural numbers, Booleans and functions we will show how to define a function, $\mathrm{eqN} \in (N, N)\mathrm{Bool}$, that decides if two natural numbers are equal. We want the following equalities to hold:
$$\mathrm{eqN}(0, 0) = \mathrm{true}$$
$$\mathrm{eqN}(0, \mathrm{succ}(n)) = \mathrm{false}$$
$$\mathrm{eqN}(\mathrm{succ}(m), 0) = \mathrm{false}$$
$$\mathrm{eqN}(\mathrm{succ}(m), \mathrm{succ}(n)) = \mathrm{eqN}(m, n)$$
It is impossible to define eqN directly just using natural numbers and recursion on the arguments. We have to do recursion on the arguments separately: we first use recursion on the first argument to compute a function which, when applied to the second argument, gives us the result we want. So we first define a function $f \in (N)(N \to \mathrm{Bool})$ which satisfies the equalities
where
If we use the recursion operator explicitly, we can define / as
The function $f$ is such that $f(n)$ is equal to a function which gives true if applied to $n$ and false otherwise, that is, we can use it to define eqN as follows:
It is a simple exercise to show that
and that it satisfies the equalities we want it to satisfy.
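The two-stage recursion described in Example 5.2 can be sketched in Python. This is only an illustration under stated assumptions: `natrec` is a hypothetical stand-in for the recursion operator on natural numbers (its argument order is chosen here, not taken from the text), Python integers and `bool` play the roles of N and Bool, and the body of `f` is one possible definition satisfying the wanted equalities, not necessarily the one given in the original.

```python
def natrec(n, base, step):
    """Primitive recursion: natrec(0) = base, natrec(k + 1) = step(k, natrec(k))."""
    acc = base
    for k in range(n):
        acc = step(k, acc)
    return acc


def iszero(m):
    # True at 0, False at every successor.
    return natrec(m, True, lambda _x, _rec: False)


def f(n):
    # Recursion on the FIRST argument only; the result is a function
    # that still expects the second argument.
    #   f(0)       = iszero
    #   f(succ(x)) = the function sending 0 to False and succ(y) to f(x)(y)
    return natrec(
        n,
        iszero,
        lambda _x, fx: (lambda m: natrec(m, False, lambda y, _rec: fx(y))),
    )


def eqN(m, n):
    # Apply the function computed from the first argument to the second one.
    return f(m)(n)
```

The inner `natrec` recomputes `fx(y)` at every step, so the sketch is only practical for small arguments; it is meant to mirror the shape of the definition rather than to be efficient.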
5.5 Propositional equality
The equality on the judgement level, $a = b \in A$, is a definitional equality, and two objects are equal if they have the same normal form. In order to express the fact that, for example, addition of natural numbers is a commutative operation, it is necessary to introduce a set for propositional equality. If $a$ and $b$ are elements in the set $A$, then $\mathrm{Id}(A, a, b)$ is a set. We express this by introducing the constant Id and its type
$$\mathrm{Id} \in (A \in \mathrm{Set};\ a \in A;\ b \in A)\,\mathrm{Set}$$
The only constructor of elements in equality sets is id, and it is introduced by the type declaration
$$\mathrm{id} \in (A \in \mathrm{Set};\ x \in A)\,\mathrm{Id}(A, x, x)$$
To say that id is the only constructor for $\mathrm{Id}(A, a, b)$ is the same as to say that $\mathrm{Id}(A, a, b)$ is the least reflexive relation. Transitivity, symmetry and congruence can be proven from this definition. We use the name idpeel for the selector, and it is introduced by the type declaration
and the equality
The intuition behind this constant is that it expresses a substitution rule for elements which are propositionally equal.
Example 5.3. The type of the constructor in the set $\mathrm{Id}(A, a, b)$ corresponds to the reflexivity rule of equality. The symmetry and transitivity rules can easily be derived.
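Let $A$ be a set and $a$ and $b$ two elements of $A$. Assume that
$$d \in \mathrm{Id}(A, a, b)$$
In order to prove symmetry, we must construct an element in $\mathrm{Id}(A, b, a)$. By applying idpeel to $A$, $[x, y, e]\mathrm{Id}(A, y, x)$, $a$, $b$, $d$ and $[x]\mathrm{id}(A, x)$ we get, by simple type-checking, an element in the set $\mathrm{Id}(A, b, a)$:
$$\mathrm{idpeel}(A, [x, y, e]\mathrm{Id}(A, y, x), a, b, d, [x]\mathrm{id}(A, x)) \in \mathrm{Id}(A, b, a)$$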
The derived rule for symmetry can therefore be expressed by the constant symm defined by
Transitivity is proved in a similar way. Let $A$ be a set and $a$, $b$ and $c$ elements of $A$. Assume that
$$d \in \mathrm{Id}(A, a, b) \quad\text{and}\quad e \in \mathrm{Id}(A, b, c)$$
By applying idpeel to $A$, $[x, y, z](\mathrm{Id}(A, y, c) \to \mathrm{Id}(A, x, c))$, $a$, $b$, $d$ and the identity function $[x]\lambda(\mathrm{Id}(A, x, c), \mathrm{Id}(A, x, c), [w]w)$ we get an element in the set $\mathrm{Id}(A, b, c) \to \mathrm{Id}(A, a, c)$. This element is applied to $e$ in order to get the desired element in $\mathrm{Id}(A, a, c)$:
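For comparison, the same two derived rules can be stated in a proof assistant. The following is a rough Lean 4 analogue, offered as a sketch only: Lean's built-in equality `=` and its `cases` tactic stand in for Id and idpeel, and the names `symm'` and `trans'` are hypothetical. It does not reproduce the framework of this chapter; it only shows that both rules follow from eliminating the single equality constructor.

```lean
-- Symmetry: eliminating d reduces the goal to a = a, which holds by reflexivity.
theorem symm' {A : Type} {a b : A} (d : a = b) : b = a := by
  cases d
  rfl

-- Transitivity: eliminating d turns e : b = c into a proof of the goal a = c.
theorem trans' {A : Type} {a b c : A} (d : a = b) (e : b = c) : a = c := by
  cases d
  exact e
```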
Example 5.4. Let us see how we can derive a rule for substitution in set expressions. We want to have a rule
To derive such a rule, first assume that we have a set $A$ and elements $a$ and $b$ of $A$. Furthermore, assume that $c \in \mathrm{Id}(A, a, b)$, $P(x) \in \mathrm{Set}\ [x \in A]$ and $p \in P(a)$. Type checking gives us that
We can now apply the function above to p to obtain an element in P(b). So we can define a constant subst that expresses the substitution rule above. The type of subst is
and the defining equation
5.6 The set of lists
The set of lists List(A) is introduced in a similar way to the natural numbers, except that there is a parameter A that determines which set the elements of a list belong to. There are two constructors to build a list: nil for the empty list and cons to add an element to a list. The constants we have introduced so far have the following types:
The selector listrec for types is a constant that expresses primitive recursion for lists. The selector is introduced by the type declaration
The defining equations for the listrec constant are
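Primitive recursion on lists can be illustrated by a small Python sketch. The argument order of `listrec` and the shape of the step function (head, tail, result of the recursive call) are assumptions made for this illustration, since the type declaration is not reproduced above; Python tuples stand in for elements of List(A).

```python
def listrec(xs, base, step):
    """Primitive recursion on a tuple used as a list.

    listrec(nil, base, step)        = base
    listrec(cons(a, l), base, step) = step(a, l, listrec(l, base, step))
    """
    if not xs:
        return base
    head, tail = xs[0], xs[1:]
    return step(head, tail, listrec(tail, base, step))


def length(xs):
    # Each cons cell adds one to the length of its tail.
    return listrec(xs, 0, lambda _a, _l, rec: rec + 1)


def append(xs, ys):
    # Recursion on the first list; the second list is the base case.
    return listrec(xs, ys, lambda a, _l, rec: (a,) + rec)
```

For instance, `append((1, 2), (3,))` rebuilds the first list in front of the second one, one cons cell at a time.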
5.7 Disjoint union of two sets
If we have two sets $A$ and $B$ we can form the disjoint union $A + B$. The elements of this set are either of the form $\mathrm{inl}(A, B, a)$ or of the form $\mathrm{inr}(A, B, b)$, where $a \in A$ and $b \in B$. In order to express this in the framework we introduce the constants
The selector when is introduced by the type declaration
and defined by the equations
Seen as a proposition, the disjoint union of two sets expresses disjunction. The constructors correspond to the introduction rules
and the selector when corresponds to the elimination rule.
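A minimal Python sketch of the constructors and the selector for the disjoint union follows. It makes two simplifying assumptions: the set arguments A and B of inl and inr are dropped (Python is untyped), and the argument order of `when` (the element first, then one function per injection) is chosen for illustration rather than taken from the declaration above.

```python
def inl(a):
    # Left injection: tag the value so that its origin is remembered.
    return ("inl", a)


def inr(b):
    # Right injection.
    return ("inr", b)


def when(c, d, e):
    """Case analysis on an element of A + B.

    when(inl(a), d, e) = d(a)
    when(inr(b), d, e) = e(b)
    """
    tag, value = c
    if tag == "inl":
        return d(value)
    if tag == "inr":
        return e(value)
    raise ValueError(f"not an element of a disjoint union: {c!r}")
```

Read as a proof rule, `when` is disjunction elimination: from a proof of A or B, a method turning proofs of A into proofs of C, and a method turning proofs of B into proofs of C, it produces a proof of C.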
5.8 Disjoint union of a family of sets
In order to be able to deal with the existential quantifier and to have a set of ordinary pairs, we will introduce the disjoint union of a family of sets. The set is introduced by the type declaration
There is one constructor in this set, pair, which is introduced by the type declaration
The selector of a set $\Sigma(A, B)$ splits a pair into its parts. It is defined by the type declaration
and the defining equation
Given the selector split, it is easy to define the two projection functions that give the first and second component of a pair:
When viewed as a proposition, the disjoint union of a family of sets $\Sigma(A, B)$ corresponds to the existential quantifier $(\exists x \in A)B(x)$. The types of the constructor pair and the selector split correspond to the natural deduction rules for the existential quantifier
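The way the two projections arise from the single selector can be sketched in Python. The argument order of `split` (the function first, then the pair) is an assumption for this illustration, and dependent typing is not captured: in the type theory the set of the second component may depend on the first component, which Python cannot express.

```python
def pair(a, b):
    # The only constructor: an element together with an element of the
    # set it selects from the family.
    return (a, b)


def split(d, p):
    """split(d, pair(a, b)) = d(a, b)."""
    a, b = p
    return d(a, b)


def fst(p):
    # First projection, defined through split alone.
    return split(lambda a, _b: a, p)


def snd(p):
    # Second projection, defined through split alone.
    return split(lambda _a, b: b, p)
```

Under the propositions-as-sets reading, `fst` extracts the witness of an existential statement and `snd` the proof that the witness has the stated property.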
5.9 The set of small sets
A set of small sets U, or a universe, is a set that reflects some part of the set structure on the object level. It is of course necessary to introduce this set if one wants to do some computation using sets, for example to specify a type checking algorithm and prove it correct, but it is also necessary in order to prove inequalities such as $0 \neq \mathrm{succ}(0)$. Furthermore, the universe can be used for defining families of sets using recursion, for example non-empty lists and sets such as $N_n$. We will introduce the universe simultaneously with a function S that maps an element of U to the set the element encodes. The universe we will introduce has one constructor for each set we have defined. The constants for sets are introduced by the type declaration
Then we introduce the constructors in U and the defining equations for S
Example 5.5. Let us see how we can derive an element in the set
or, in other words, how we can find an expression in the set
We start by assuming that
Then we construct a function, Iszero, that maps a natural number to an element in the universe:
It is easy to see that
and therefore
Finally, we have the element we are looking for:
It is shown in Smith [Smith, 1988] that without a universe no negated equalities can be proved.
6 The ALF series of interactive editors for type theory
At the Department of Computing Science in Göteborg, we have developed a series of interactive editors for objects and types in type theory. The editors are based on direct manipulation, that is, the things which are being built are shown on the screen, and editing is done by pointing and clicking on the screen. The proof object is used as a true representative of a proof. The process of proving the proposition A is represented by the process of building a proof object of A. The language of type theory is extended with placeholders (written as indexed question marks). The notation $? \in A$ stands for the problem of finding an object in A. An object is edited by replacing the placeholders by expressions which may contain placeholders. It is also possible to delete a subpart of an object by replacing it with a placeholder.
There is a close connection between the individual steps in proving A and the steps to build a proof object of A. When we are making a top-down proof of a proposition A, we try to reduce the problem A to some subproblems $B_1, \ldots, B_n$ by using a rule c which takes proofs of $B_1, \ldots, B_n$ to a proof of A. Then we continue by proving $B_1, \ldots, B_n$. For instance, we can reduce the problem A to the two problems $C \supset A$ and $C$ by using modus ponens. In this way we can continue until we have only axioms and assumptions left. This process corresponds exactly to the way we can build a mathematical object from the outside in. If we have a problem
$$? \in A$$
then it is possible to refine the placeholder in the following ways:
• The placeholder can be replaced by an application $c(?_1, \ldots, ?_n)$ where c is a constant, or $x(?_1, \ldots, ?_n)$, where x is a variable. In the case that we have a constant, we must have that $c(?_1, \ldots, ?_n) \in A$, which holds if the type of the constant c is equal to $(x_1 \in A_1; \ldots; x_n \in A_n)B$ and $?_1 \in A_1$, $?_2 \in A_2[x_1 := ?_1]$, ..., $?_n \in A_n[x_1$ Xi,
Fig. 1. Equational logic
Andrew M. Pitts
will denote the unique morphism whose composition with each $\pi_i$ is $f_i$. For definiteness we will assume that the product $X_1 \times \cdots \times X_n$ is defined by induction on the length of the list $[X_1, \ldots, X_n]$ using a terminal object, 1, and binary products, $- \times -$. Thus the product of the empty list is 1; and, inductively, the product of a list $[X_1, \ldots, X_n, X_{n+1}]$ of length $n + 1$ is given by the binary product $(X_1 \times \cdots \times X_n) \times X_{n+1}$.
A structure in C for a given signature Sg is specified by giving an object $[\![\sigma]\!]$ in C for each sort $\sigma$, and a morphism $[\![F]\!] : [\![\sigma_1]\!] \times \cdots \times [\![\sigma_n]\!] \to [\![\tau]\!]$ in C for each function symbol $F : \sigma_1, \ldots, \sigma_n \to \tau$. (In the case $n = 0$, this means that a structure assigns a global element $[\![c]\!] : 1 \to [\![\tau]\!]$ to a constant $c : \tau$.) Given such a structure, for each context $\Gamma = [x_1 : \sigma_1, \ldots, x_n : \sigma_n]$, term $M$ and sort $\tau$ for which $M : \tau\ [\Gamma]$ holds, we will define a morphism in C:
$$[\![M : \tau\ [\Gamma]]\!] : [\![\Gamma]\!] \to [\![\tau]\!]$$
where $[\![\Gamma]\!]$ denotes the product $[\![\sigma_1]\!] \times \cdots \times [\![\sigma_n]\!]$. Note that the rules for deriving well-formed terms-in-context are such that for each $\Gamma$, $M$ and $\tau$, there is at most one way to derive $M : \tau\ [\Gamma]$. The definition of $[\![M : \tau\ [\Gamma]]\!]$ can therefore be given by induction on the structure of this derivation. Since it is also the case that $\tau$ is uniquely determined by $\Gamma$ and $M$, we will abbreviate $[\![M : \tau\ [\Gamma]]\!]$ to $[\![M[\Gamma]]\!]$. The definition has two clauses, corresponding to rules (2.2) and (2.3):
$$[\![x_i[\Gamma]]\!] = \pi_i$$
$$[\![F(M_1, \ldots, M_n)[\Gamma]]\!] = [\![F]\!] \circ \langle [\![M_1[\Gamma]]\!], \ldots, [\![M_n[\Gamma]]\!] \rangle$$
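Lemma 2.2 (Semantics of substitution). In the categorical semantics of terms-in-context, substitution of a term for a variable in a term is interpreted via composition in the category. Specifically, if $M_i : \sigma_i\ [\Gamma']$ for $i = 1, \ldots, n$ and $N : \tau\ [\Gamma]$ with $\Gamma = [x_1 : \sigma_1, \ldots, x_n : \sigma_n]$, then
$$[\![N[\vec{M}/\vec{x}][\Gamma']]\!] = [\![N[\Gamma]]\!] \circ \langle [\![M_1[\Gamma']]\!], \ldots, [\![M_n[\Gamma']]\!] \rangle$$
where $N[\vec{M}/\vec{x}]$ denotes the result of simultaneously substituting $M_i$ for $x_i$ ($i = 1, \ldots, n$) in $N$. (Note that $N[\vec{M}/\vec{x}] : \tau\ [\Gamma']$ holds by repeated use of (2.4).)
The lemma is proved by induction on the structure of $N$. By taking the terms $M_i$ to be suitable variables one obtains the following result as a special case.
Corollary 2.3 (Semantics of weakening). Suppose that $N : \tau\ [\Gamma]$ and that $\Gamma'$ is another context that contains $\Gamma = [x_1 : \sigma_1, \ldots, x_n : \sigma_n]$ as a sublist. Then
$$[\![N[\Gamma']]\!] = [\![N[\Gamma]]\!] \circ \pi \quad\text{where}\quad \pi = \langle \pi_{m(1)}, \ldots, \pi_{m(n)} \rangle$$
with $m(i)$ the position in $\Gamma'$ of the variable $x_i$.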
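The semantics of terms-in-context and the substitution lemma can be checked on a small instance in the category of sets. The sketch below is illustrative only and rests on several assumptions: there is a single sort interpreted by Python integers, terms are encoded as nested tuples, morphisms out of a product are functions on environment tuples, and the names `interpret`, `substitute` and `structure` are hypothetical.

```python
def interpret(term, context, structure):
    """Return the function [[term[context]]] from environments (tuples) to values."""
    kind = term[0]
    if kind == "var":
        i = context.index(term[1])  # a variable denotes a product projection
        return lambda env: env[i]
    if kind == "app":
        _, symbol, args = term
        arg_maps = [interpret(a, context, structure) for a in args]
        op = structure[symbol]
        # [[F]] composed with the tuple of the argument interpretations
        return lambda env: op(*(m(env) for m in arg_maps))
    raise ValueError(f"unknown term: {term!r}")


def substitute(term, mapping):
    """Simultaneously substitute terms for variables."""
    if term[0] == "var":
        return mapping[term[1]]
    _, symbol, args = term
    return ("app", symbol, [substitute(a, mapping) for a in args])


# A structure for a signature with two binary function symbols.
structure = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

N = ("app", "add", [("var", "x"), ("var", "y")])   # N in context [x, y]
M1 = ("app", "mul", [("var", "u"), ("var", "u")])  # M1 in context [u]
M2 = ("var", "u")                                  # M2 in context [u]

# Left-hand side of the lemma: interpret the substituted term in context [u].
lhs = interpret(substitute(N, {"x": M1, "y": M2}), ["u"], structure)

# Right-hand side: compose [[N]] with the pairing of [[M1]] and [[M2]].
n = interpret(N, ["x", "y"], structure)
m1 = interpret(M1, ["u"], structure)
m2 = interpret(M2, ["u"], structure)
rhs = lambda env: n((m1(env), m2(env)))
```

Both sides denote the function sending u to u·u + u, which is the content of Lemma 2.2 for this particular N, M1 and M2.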
Categorical logic
Suppose $M = M' : \sigma\ [\Gamma]$ is an equation-in-context over a given signature, Sg. Since $M : \sigma\ [\Gamma]$ and $M' : \sigma\ [\Gamma]$ are required to hold, a structure in C for Sg gives rise, via the above definition, to morphisms $[\![M[\Gamma]]\!], [\![M'[\Gamma]]\!] : [\![\Gamma]\!] \to [\![\sigma]\!]$. The structure is said to satisfy the equation-in-context if these morphisms are equal. If Th is an algebraic theory over Sg, then the structure is a Th-algebra in C if it satisfies all the axioms of Th.
There are very many different categories with finite products, and algebras for an algebraic theory in one may have very different detailed structure from algebras for the same theory in another category. Nevertheless, the following theorem shows that whatever the underlying category, we can still use the familiar kind of equational reasoning embodied by the rules in Fig. 1 whilst preserving satisfaction of equations.
Theorem 2.4 (Soundness). Let C be a category with finite products and Th an algebraic theory. Then a Th-algebra in C satisfies any equation-in-context which is a theorem of Th.
Proof. The properties of equality of morphisms in C imply that the collection of equations-in-context satisfied by the Th-algebra is closed under rules (2.6), (2.7) and (2.8) in Fig. 1. Closure under rule (2.9) is a consequence of Lemma 2.2. □
The converse of this theorem, namely the completeness of the categorical semantics for equational logic, will be a consequence of the material in Section 4.2.
Summary. We conclude this section by summarizing the important features of the categorical semantics of terms and equations in a category with finite products.
• Sorts are interpreted as objects.
• A term is only interpreted in a context (an assignment of sorts to finitely many variables) containing at least the variables mentioned in the term, and such a term-in-context is interpreted as a morphism with:
  * the codomain of the morphism determined by the sort of the term;
  * the domain of the morphism determined by the context;
  * variables interpreted as product projection morphisms (identity morphisms being a special case of these);
  * substitution of terms for variables interpreted via composition and pairing;
  * weakening of contexts interpreted via composition with a product projection morphism.
• An equation is only considered in a context (containing at least the variables mentioned), and such an equation-in-context is satisfied if the two morphisms interpreting the equated terms-in-context are actually equal in the category.
2.3 Internal languages
The interpretation of equational logic described in the previous section provides a means of describing certain properties of a category C with finite products as though its objects were sets of elements and its morphisms functions between those sets. To do this we use the terms over a many-sorted signature $Sg_C$ naming the objects and morphisms of C. The sorts of $Sg_C$ are the objects of C; and for each non-empty list $X_1, \ldots, X_n, Y$ of objects, the function symbols of type $X_1, \ldots, X_n \to Y$ are the morphisms $f : X_1 \times \cdots \times X_n \to Y$ in C. Of course the size of this signature depends on the size of C: it will have a set of symbols, rather than a proper class, only if C is small (i.e. has only a set of morphisms). For the sake of clarity, we are not making a notational distinction between a symbol in $Sg_C$ and the element of C that it names; as a result a morphism $f : X_1 \times \cdots \times X_n \to Y$ occurs in $Sg_C$ both as an n-ary function symbol of type $X_1, \ldots, X_n \to Y$ and as a unary one of type $X_1 \times \cdots \times X_n \to Y$.
The terms-in-context over this signature constitute a language for describing morphisms of C, a so-called internal language for C. The connection between these terms-in-context and the morphisms they describe is given by means of the evident structure for $Sg_C$ in C which maps symbols in $Sg_C$ to the corresponding objects and morphisms in C. Given $M : Y\ [x_1 : X_1, \ldots, x_n : X_n]$, the definitions of section 2.2 applied to this structure yield a morphism
$$[\![M[x_1 : X_1, \ldots, x_n : X_n]]\!] : X_1 \times \cdots \times X_n \to Y$$
in C. We will not refer to this structure for $Sg_C$ in C by name, but simply say that 'C satisfies $M = N : X\ [\Gamma]$' if the structure satisfies this equation-in-context. The following results are easy exercises in the use of the categorical semantics.
Proposition 2.5.
(i) Two parallel morphisms in C, $f, g : X \to X$, are equal if and only if C satisfies $f(x) = g(x) : X\ [x : X]$.
(ii) A morphism $f : X \to X$ in C is the identity on $X$ if and only if C satisfies $f(x) = x : X\ [x : X]$.
(iii) A morphism $f : X \to Z$ is the composition of $g : X \to Y$ and $h : Y \to Z$ if and only if C satisfies $f(x) = h(g(x)) : Z\ [x : X]$.
(iv) An object $T$ is a terminal object in C if and only if there is some morphism $t : 1 \to T$ satisfying $x = t : T\ [x : T]$.
(v) $X \leftarrow Z \rightarrow Y$ is a binary product diagram in C if and only if there is some morphism $r : X \times Y \to Z$ satisfying
The internal language of C makes it look like a category of sets and functions: given an object $X$, its 'elements' in the language are the terms $M$ of sort $X$; and a morphism $f : X \to Y$ yields a function sending 'elements' $M$ of sort $X$ to 'elements' $f(M)$ of sort $Y$. However, in the internal language the 'elements' of $X$ depend upon the context; and these elements are interpreted as morphisms to $X$ in C whose domain is the interpretation of the context. Thus we can think of morphisms $x : I \to X$ in C as generalized elements of $X$ at stage $I$ and will write $x \in_I X$ to indicate this. A particular case is when $I$ is the terminal object 1: the generalized elements of an object at stage 1 are more normally called its global elements.
In general, it is not sufficient to restrict attention to global elements in order to describe properties of C. For example, each morphism $f : X \to Y$ yields via composition a function taking generalized elements of $X$ at each stage, $x \in_I X$, to generalized elements of $Y$ at the same stage, $(f \circ x) \in_I Y$. Two such morphisms $f, g : X \to Y$ are equal just if they determine equal functions on generalized elements at all stages, but are not necessarily equal if they determine the same functions just on global elements. Call a category with terminal object well-pointed if for all $f, g : X \to Y$ in C, $f = g$ whenever $f \circ x = g \circ x$ for all global elements $x : 1 \to X$. Typically, categories of sets equipped with extra structure and structure-preserving functions are well-pointed, and algebraic structures in such categories are closely related to corresponding set-valued structures via the use of global elements. If we only consider algebras in such well-pointed categories, we lose the ability to model certain phenomena. Here is a little example stolen from [Goguen and Meseguer, 1985], to do with 'empty sorts'.
Example 2.6. Consider the usual algebraic theory of Boolean algebras (with sort bool and operations $T$, $F$, $\wedge$, $\vee$, $\neg$) augmented with a second sort hegel and a function symbol $C : \mathit{hegel} \to \mathit{bool}$ satisfying the axiom $\neg(C(x)) = C(x) : \mathit{bool}\ [x : \mathit{hegel}]$. Then $T = F : \mathit{bool}\ [x : \mathit{hegel}]$ is a theorem of this theory, but $T = F : \mathit{bool}\ [\,]$ is not. Any set-valued model of the theory that assigns a non-empty set to hegel must identify $T$ and $F$. However, there are models in non-well-pointed categories for which hegel is non-empty (in the sense of not being an initial object), but which do not identify $T$ and $F$. For example, consider the category $[\omega, \mathrm{Set}]$ of set-valued functors and natural transformations from the ordinal $\omega$ (which is a poset, hence a category). A more concrete description of $[\omega, \mathrm{Set}]$ is as the category of 'sets
evolving through (discrete) time'. The objects are sets $X$ equipped with a function $E : X \to \omega$ recording that $x \in X$ exists at time $E(x) \in \omega$, together with a function $(-)^+ : X \to X$ describing how elements evolve from one instant to the next, and hence which is required to satisfy $E(x^+) = E(x) + 1$. A morphism between two such objects, $f : X \to Y$, is a function between the sets which preserves existence ($E(f(x)) = E(x)$) and evolution ($f(x^+) = f(x)^+$). The global sections of $X$ are in bijection with the elements which exist at the beginning of time, $\{x \in X \mid E(x) = 0\}$. So this category is easily seen not to be well-pointed. One can model the algebraic theory of the previous paragraph in this category by interpreting bool as the set having two distinct elements at time 0 that evolve to a unique element at time 1, and interpreting hegel as the set with no elements at time 0 and a single element thereafter. (This interpretation for hegel is different from the initial object of $[\omega, \mathrm{Set}]$, which has no elements at any time.)
The internal language of a category with just finite products is of rather limited usefulness, because of the very restricted forms of expression and judgement in equational logic. However, internal languages over richer logics can be used as powerful tools for proving complicated 'arrow-theoretic' results by more familiar-looking 'element-theoretic' arguments. See [Bell, 1988] for extended examples of this technique in the context of topos theory and higher-order predicate logic. The use of categorical semantics in non-well-pointed categories gives us an increase in ability to construct useful models compared with the more traditional 'sets-and-elements' approach. The work of Reynolds and Oles (see [Oles, 1985]) on the semantics of block structure using functor categories is an example. (As the example above shows, functor categories are not well-pointed in general.)
See [Mitchell and Scott, 1989] for a comparison of the categorical and set-theoretic approaches in the case of the simply typed lambda calculus.
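The model of Example 2.6 in 'sets evolving through time' can be explored with a small Python sketch. It is an illustration under assumptions: elements are encoded as (name, time) pairs, only times below a hypothetical bound `HORIZON` are inspected so that the checks stay finite, and the names `constant` and `bang` are introduced here for the global elements of bool and for the unique morphism from hegel to the terminal object.

```python
HORIZON = 6  # assumption: only times 0 .. HORIZON - 1 are inspected


def evolve_bool(x):
    # In bool, the two elements at time 0 both evolve to the single element
    # "*" at time 1, which then persists.
    _name, t = x
    return ("*", t + 1)


def iterate(step, x, k):
    # Apply the evolution function k times.
    for _ in range(k):
        x = step(x)
    return x


# Terminal object: exactly one element at each time.
terminal = [("pt", t) for t in range(HORIZON)]

# hegel: no element at time 0, a single element at each later time.
hegel = [("h", t) for t in range(1, HORIZON)]


def constant(start):
    """Global element 1 -> bool picking `start` at time 0.

    Preservation of evolution forces its value at every later time.
    """
    return lambda pt: iterate(evolve_bool, start, pt[1])


T = constant(("T", 0))
F = constant(("F", 0))


def bang(h):
    # The unique morphism hegel -> 1 preserves the time of existence.
    return ("pt", h[1])
```

Composing with `bang` gives the interpretations of T and F in the context [x : hegel]. They agree on every element of hegel, since hegel has nothing at time 0, while T and F themselves still differ at time 0.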
3 Categorical datatypes
This section explores the categorical treatment of various type constructors. We postpone considering types that depend upon variables until section 6, and concentrate here upon 'simple' types with no such dependencies. We will consider properties of these types expressible within the realm of equational logic; so from the work in the previous section we know that at the categorical level we must start with a category with finite products and ask what more it should satisfy in order to model a particular type constructor. We hope to demonstrate that the necessary categorical structure for a particular type constructor can be derived from its introduction, elimination and equality rules in a systematic way, given the basic framework of the previous section. In fact what emerges is an interplay between type theory and categorical structure. To model the term-forming operations associated with
a type constructor one needs certain structures in a category; but then the characterization (up to isomorphism) of these structures via categorical properties (typically adjointness properties) sometimes suggests extra, semantically meaningful rules to add to the logic.
Metalinguistic conventions. The term-forming operations we will consider in this and subsequent sections will contain various forms of variable-binding, with associated notions of free and bound variables and substitution. To deal with all this in a uniform manner, we adopt a higher-order metalanguage of 'arities and expressions' as advocated by Martin-Löf. See [Nordstrom et al., 1990, Chapter 3] for an extended discussion of this approach. For our purposes it is sufficient to take this metatheory to be a simply typed lambda calculus with $\alpha$-, $\beta$-, and $\eta$-conversion, whose terms will be called metaexpressions. The result of applying metaexpression $e$ to metaexpression $e'$ will be denoted $e(e')$, with a repeated application such as $e(e')(e'')$ abbreviated to $e(e', e'')$; in certain situations the application e(e') will be written ee• [ ] in C can be given unambiguously by induction on the structure of this derivation. The same is true for the rules considered in section 5. It is only when we come to consider dependent type theories in section 6 that this property breaks down and a more careful approach to the categorical semantics must be adopted. We will also take care to retain the property of derivable typing judgements, $M : \sigma\ [\Gamma]$, that $\sigma$ is uniquely determined by the term $M$ and the context $\Gamma$. This is the reason for the appearance of type subscripts in the syntax given below for the introduction terms of disjoint union, function and list types.
Remark 3.2 (Implicit contexts). In order to make the statement of rules less cluttered, from now on we will adopt the convention that a left-hand part of a context will not be shown if it is common to all the judgements in a rule. Thus for example rule (3.2) below is really an abbreviation for
Remark 3.3 (Congruence rules). The introduction and elimination rules for the various datatype constructors to be given below have associated with them congruence rules expressing the fact that the term-forming constructions respect equality. For example, the congruence rules corresponding to (3.1) and (3.2) are:
We will not bother to give such congruence rules explicitly in what follows.
3.1 Disjoint union types
We will use infix notation and write $\sigma + \sigma'$ rather than $+(\sigma, \sigma')$ for the disjoint union of two types $\sigma$ and $\sigma'$. (Thus this type constructor has arity TYPES → TYPES → TYPES.) The rules for introducing terms of type $\sigma + \sigma'$ are
and the corresponding elimination rule is
What structure in a category C with finite products is needed to define $[\![M[\Gamma]]\!]$ in the presence of these new term-forming operations? We already have at our disposal the general framework for interpreting equational logic in categories with finite products, given in the previous section. Since this is our first example of the technique, we will proceed with some care in applying this framework.
Formation. Since types $\sigma$ get interpreted as objects $X = [\![\sigma]\!]$ of C, we need a binary operation, $X, X' \mapsto X + X'$, on objects in order to interpret the formation of disjoint union types as:
$$[\![\sigma + \sigma']\!] = [\![\sigma]\!] + [\![\sigma']\!]$$
Introduction. Since terms-in-context $M : \sigma\ [\Gamma]$ get interpreted as morphisms from $I = [\![\Gamma]\!]$ to $X = [\![\sigma]\!]$, to interpret the introduction rules (3.1) we need functions on hom-sets of the form
$$C(I, X) \to C(I, X + X') \qquad C(I, X') \to C(I, X + X')$$
that can be applied to $[\![M[\Gamma]]\!]$ and $[\![M'[\Gamma]]\!]$ respectively to give the interpretation of $\mathrm{inl}(M)$ and $\mathrm{inr}(M')$. Now in the syntax, substitution commutes with term formation, and since substitution is interpreted by composition in C, it is necessary that the above functions be natural in $I$. Therefore by the Yoneda lemma (see [Mac Lane, 1971, III.2]), there are morphisms $X \to X + X'$ and $X' \to X + X'$ which induce the above functions on hom-sets via composition. So associated with the object $X + X'$ we need morphisms
$$\mathrm{inl}_{X,X'} : X \to X + X' \quad\text{and}\quad \mathrm{inr}_{X,X'} : X' \to X + X'$$
and can then define
$$[\![\mathrm{inl}(M)[\Gamma]]\!] = \mathrm{inl}_{X,X'} \circ [\![M[\Gamma]]\!] \qquad [\![\mathrm{inr}(M')[\Gamma]]\!] = \mathrm{inr}_{X,X'} \circ [\![M'[\Gamma]]\!]$$
Elimination. Turning now to the interpretation of the elimination rule (3.2), we need a ternary function on hom-sets of the form
and, as before, it must be natural in I because of the way that substitution is modelled by composition in C. Naturality here means that for each caseI
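Applying this with $\pi_2$, $n \circ (\pi_1 \times id)$, and $n' \circ (\pi_1 \times id)$ for $c$, $n$, and $n'$ respectively, gives
$$\mathrm{case}_I(c, n, n') = \mathrm{case}_{I \times (X + X')}(\pi_2,\ n \circ (\pi_1 \times id),\ n' \circ (\pi_1 \times id)) \circ \langle id, c \rangle$$
since $\pi_2 \circ \langle id, c \rangle = c$ and $(\pi_1 \times id) \circ (\langle id, c \rangle \times id) = id$. So in fact $\mathrm{case}_I(c, n, n')$ can be expressed more fundamentally as a composition $\{n \mid n'\}_I \circ \langle id, c \rangle$ where
$$\{- \mid -\}_I : C(I \times X, Y) \times C(I \times X', Y) \to C(I \times (X + X'), Y)$$
is a family of functions, natural in $I$. Thus the semantics of case-terms becomes:
$$[\![\mathrm{case}(C, N, N')[\Gamma]]\!] = \{[\![N(x)[\Gamma, x : \sigma]]\!] \mid [\![N'(x')[\Gamma, x' : \sigma']]\!]\}_I \circ \langle id, [\![C[\Gamma]]\!] \rangle$$
Equality. Next we consider equality rules for $\sigma + \sigma'$. Consider first the familiar rules: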
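$$\mathrm{case}(\mathrm{inl}(M), N, N') = N(M) : \tau \qquad (3.4)$$
$$\mathrm{case}(\mathrm{inr}(M'), N, N') = N'(M') : \tau \qquad (3.5)$$
Writing $m : I \to X$ for $[\![M[\Gamma]]\!]$, $m' : I \to X'$ for $[\![M'[\Gamma]]\!]$, $n : I \times X \to Y$ for $[\![N(x)[\Gamma, x : \sigma]]\!]$, and $n' : I \times X' \to Y$ for $[\![N'(x')[\Gamma, x' : \sigma']]\!]$, then from above we have that $[\![\mathrm{case}(\mathrm{inl}(M), N, N')[\Gamma]]\!]$ is $\{n \mid n'\}_I \circ \langle id, \mathrm{inl}_{X,X'} \circ m \rangle$ and $[\![\mathrm{case}(\mathrm{inr}(M'), N, N')[\Gamma]]\!]$ is $\{n \mid n'\}_I \circ \langle id, \mathrm{inr}_{X,X'} \circ m' \rangle$. From Lemma 2.2, we also have that $[\![N(M)[\Gamma]]\!]$ is $n \circ \langle id, m \rangle$ and that $[\![N'(M')[\Gamma]]\!]$ is $n' \circ \langle id, m' \rangle$. Consequently the structure on C for interpreting terms associated with disjoint union types described above is sound for the rules (3.4) and (3.5) provided the equations
$$\{n \mid n'\}_I \circ \langle id, \mathrm{inl}_{X,X'} \circ m \rangle = n \circ \langle id, m \rangle$$
$$\{n \mid n'\}_I \circ \langle id, \mathrm{inr}_{X,X'} \circ m' \rangle = n' \circ \langle id, m' \rangle$$
hold for all $m$, $m'$, $n$ and $n'$ (with appropriate domains and codomains). Clearly for this it is sufficient to require for all $n$ and $n'$ that
$$\{n \mid n'\}_I \circ (id \times \mathrm{inl}_{X,X'}) = n \qquad (3.6)$$
$$\{n \mid n'\}_I \circ (id \times \mathrm{inr}_{X,X'}) = n' \qquad (3.7)$$
and the naturality of $\{- \mid -\}_I$ ensures that (3.6) and (3.7) are also necessary for the previous two equations to hold for all $m$, $m'$, $n$ and $n'$.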
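So we require that $\{- \mid -\}_{-}$ provides a natural left inverse to the natural transformation
$$C(I \times (X + X'), Y) \to C(I \times X, Y) \times C(I \times X', Y) \qquad (3.8)$$
which sends $f$ to $(f \circ (id \times \mathrm{inl}_{X,X'}),\ f \circ (id \times \mathrm{inr}_{X,X'}))$. If, furthermore, we require that it provide a two-sided inverse, so that
$$\{f \circ (id \times \mathrm{inl}_{X,X'}) \mid f \circ (id \times \mathrm{inr}_{X,X'})\}_I = f \qquad (3.9)$$
then the categorical semantics becomes sound for a further rule for $\sigma + \sigma'$, namely:
$$\mathrm{case}(C, (x)F(\mathrm{inl}(x)), (x')F(\mathrm{inr}(x'))) = F(C) : \tau \qquad (3.10)$$
(Recall that an expression like $(x)F(\mathrm{inl}(x))$ denotes a lambda abstraction in the metalanguage.) If we think of (3.4) and (3.5) as the '$\beta$-rules' for disjoint union, then (3.10) is an '$\eta$-rule'. As we will see, the rule is necessary if we wish $\sigma + \sigma'$ to be characterized up to isomorphism by $\sigma$ and $\sigma'$. Now taking the component of (3.8) at the terminal object 1 and using the canonical isomorphisms $\pi_2 : 1 \times (-) \cong (-)$ gives that
$$C(X + X', Y) \to C(X, Y) \times C(X', Y), \qquad f \mapsto (f \circ \mathrm{inl}_{X,X'},\ f \circ \mathrm{inr}_{X,X'})$$
is a bijection for each $Y$. By definition, this means that
$$X \xrightarrow{\ \mathrm{inl}_{X,X'}\ } X + X' \xleftarrow{\ \mathrm{inr}_{X,X'}\ } X'$$
is a binary coproduct diagram in C. We will denote the inverse bijection to $(\mathrm{inl}_{X,X'}, \mathrm{inr}_{X,X'})$ at a pair of morphisms $f : X \to Y$, $f' : X' \to Y$ by
So we are led to require that C have binary coproducts. But this is not all: we can now compose the natural bijection (3.8) with
to deduce that for each $I$, $X$, $X'$, composition with the canonical morphism
$$d : (I \times X) + (I \times X') \to I \times (X + X')$$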
induces a bijection on hom-sets, and hence, by the Yoneda lemma, that $d$ must be an isomorphism. In this case we say that binary products distribute over binary coproducts in C, or simply that C has stable binary coproducts. Thus the construct $\{- \mid -\}_{-}$ takes $n : I \times X \to Y$ and $n' : I \times X' \to Y$ to
Defining satisfaction of equations-in-context over this richer collection of terms just as in section 2.2, Theorem 2.4 extends to include the equality rules for disjoint unions, i.e. if C is a category with finite products and stable binary coproducts, then the equations-in-context that are satisfied by a structure in C are closed under the rules of equational logic augmented by rules (3.4), (3.5), and (3.10).
As with all categorical structure defined by universal properties, coproducts in a category are unique up to isomorphism. So we have that the structure needed in a category to interpret disjoint unions with all three equality rules is essentially unique. If we do not wish to model the third equality rule (3.10), then we have seen that a weaker structure than stable coproducts is sufficient: just $\mathrm{inl}_{X,X'}$, $\mathrm{inr}_{X,X'}$ and an operation $\{- \mid -\}_{-}$ satisfying the two equations (3.6) and (3.7), together with the naturality condition:
$$\{n \mid n'\}_I \circ (g \times id) = \{n \circ (g \times id) \mid n' \circ (g \times id)\}_{I'} \qquad (g : I' \to I)$$
(This naturality is automatic in the presence of the uniqueness rule (3.10).) However, in this case there may be several non-isomorphic 'disjoint unions' for $X$ and $X'$ in C. Put another way, provided the third equality rule (3.10) is included, the rules for disjoint unions characterize this type constructor up to bijection. This kind of observation seems important from a foundational point of view, and arises very naturally from the categorical approach to logic.
The stability of binary coproducts in C says precisely that for each object $I$, the functor $I \times (-) : C \to C$ preserves binary coproducts. Since coproducts, being colimits, are preserved by any functor with a right adjoint, stability is automatic if each $I \times (-) : C \to C$ has a right adjoint. The value of this right adjoint at an object $X$ is by definition the exponential object $I \to X$. In particular if C is a cartesian closed category, then coproducts in C are automatically stable, since all exponential objects exist in this case by definition. We will see in Section 3.3 that exponential objects model function types. So if one is only concerned with modelling disjoint sums in a higher-order setting, one need only verify the presence of coproducts in
the category. Here is an example of a category with non-stable coproducts (and hence without exponentials).
Example 3.4. Recall that the category of complete, atomic Boolean algebras and Boolean homomorphisms is equivalent to the opposite of the category of sets and functions. In Set one generally does not have
$$I + (X \times X') \cong (I + X) \times (I + X')$$
(Consider the case $I = X = X' = 1$, for example.) Thus in the opposite of the category of sets, binary coproducts are not in general stable, and hence nor are they stable in the category of complete atomic Boolean algebras. Therefore, even though this category has coproducts (it has all small limits and colimits), we cannot soundly interpret disjoint union types in it.
Remark 3.5 (Empty type). We have treated binary disjoint union types. The 'nullary' version of this is the empty type null, with rules:
These rules can be soundly interpreted in C if it has a stable initial object, 0. This means that not only is 0 initial (i.e. for each object $X$, there is a unique morphism $0 \to X$), but so also is $I \times 0$ for any object $I$. This latter condition is equivalent to requiring that $\pi_2 : I \times 0 \to 0$ is an isomorphism. As before, this stability condition is automatic if the functors $I \times (-) : C \to C$ have right adjoints, and thus, in particular, an initial object in a cartesian closed category is automatically stable.
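Both halves of the discussion of stability can be checked concretely in the category of sets. The Python sketch below is illustrative and relies on assumptions: coproducts are encoded as tagged values, morphisms out of a product take a tuple, and the names `d`, `d_inv`, `copair` and `brace` are introduced here; in particular, writing the copairing as `copair(n, n2)` is a notational choice of this sketch. The first part exhibits the distributivity isomorphism and the operation it induces on pairs of morphisms; the second part counts elements in the case I = X = X' = 1 of Example 3.4.

```python
def inl(x):
    return ("inl", x)


def inr(x):
    return ("inr", x)


def d(s):
    """Canonical map (I x X) + (I x X') -> I x (X + X')."""
    tag, (i, x) = s
    return (i, (tag, x))


def d_inv(p):
    """Inverse of d, which exists because coproducts in Set are stable."""
    i, (tag, x) = p
    return (tag, (i, x))


def copair(n, n2):
    """Copairing of n : I x X -> Y and n2 : I x X' -> Y."""
    def h(s):
        tag, arg = s
        return n(arg) if tag == "inl" else n2(arg)
    return h


def brace(n, n2):
    """The induced map I x (X + X') -> Y: copairing composed with d_inv."""
    h = copair(n, n2)
    return lambda p: h(d_inv(p))


# Failure of the dual law in Set, by counting elements for I = X = X' = 1.
def disjoint_union(a, b):
    return {("l", x) for x in a} | {("r", y) for y in b}


def product(a, b):
    return {(x, y) for x in a for y in b}


one = {0}
lhs = disjoint_union(one, product(one, one))                       # I + (X x X')
rhs = product(disjoint_union(one, one), disjoint_union(one, one))  # (I + X) x (I + X')
```

The two sets at the end have 2 and 4 elements respectively, so no bijection between them exists, whereas `d` and `d_inv` are mutually inverse on every input.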
3.2 Product types
One could be forgiven for jumping straight to the conclusion that product types σ × σ' should be interpreted by binary categorical product in a category C. However, as we have seen in section 2, the finite product structure of C is there primarily to enable terms involving multiple variables to be interpreted. Also, we will use the elimination rule for products that is derived systematically from its formation and introduction rules, rather than the more familiar formulation in terms of first and second projections. (See [Backhouse et al., 1989] for comments on this issue in general.) Nevertheless, with a full set of equality rules including the analogue of the 'surjective pairing' rule, binary products in C are exactly what is required to model product types. If one does not wish to model the surjective pairing rule, then a weaker structure on C (not determined uniquely up to isomorphism) suffices. Recall that the introduction rule and corresponding elimination rule for product types are
(together with associated congruence rules; cf. Remark 3.3). By a similar analysis to that in section 3.1, one finds that the following structure on C is needed to interpret these new terms:
• for all objects X, X', an object X * X';
• for all objects X, X', a morphism pair_{X,X'} : X × X' → X * X';
• for all objects I, X, X', Y, a function
\[ \mathrm{split}_I : C(I \times X \times X', Y) \to C(I \times (X * X'), Y) \]
that is natural in I (so that split_{I'}(n ∘ (g × id)) = split_I(n) ∘ (g × id) holds for all n : I × X × X' → Y and g : I' → I).
The semantics of product types and terms in C is then:
The equality rules for product types are:
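For the above semantics to be sound for (3.14), one needs that for each n : I × X × X' → Y,
\[ \mathrm{split}_I(n) \circ (\mathrm{id}_I \times \mathrm{pair}_{X,X'}) = n. \]
In other words, the natural transformation induced by pre-composition with (id_I × pair_{X,X'}),
\[ (\mathrm{id}_I \times \mathrm{pair}_{X,X'})^{*} : C(I \times (X * X'), Y) \to C(I \times X \times X', Y) \qquad (3.16) \]
should have a right inverse given by n ↦ split_I(n). This condition does not suffice to determine X * X' uniquely up to isomorphism: there may be several different such 'weak product' structures on C. However, if we insist that (3.15) is also satisfied, i.e. that
\[ \mathrm{split}_I(m \circ (\mathrm{id}_I \times \mathrm{pair}_{X,X'})) = m \quad \text{for all } m : I \times (X * X') \to Y, \]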
then split_I becomes a two-sided inverse for (3.16). By considering the component of this natural isomorphism at the terminal object, one has that pre-composition with pair_{X,X'} : X × X' → X * X' induces a bijection on hom-sets, and hence is itself an isomorphism. Consequently, to interpret product types satisfying both equality rules, the structure needed on C must be essentially the existing binary product structure (and so, in particular, is unique up to isomorphism). Thus the semantics of product types and terms in C simplifies to:
In particular, the definable first and second projections
when interpreted, give the first and second projection morphisms as one might hope:
Remark 3.6 (One-element type). The 'nullary' version of binary product types is the one-element type unit with the following rules (together with associated congruence rules). As for product types, we use the possibly less familiar elimination rule derived systematically from the form of the formation and introduction rules.
The last rule is the analogue of rule (3.15); in other words, it is the 'η-rule' for this type. The 'β-rule' for unit (analogous to (3.14)) is
but it is easy to see that in fact this rule is derivable from the above rules, as is the rule
Using an argument similar to that for product types, one finds that these rules can be interpreted soundly in C provided it has a terminal object.
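To make the role of the two equality rules for product types concrete, here is a minimal Python sketch of the structure (pair, split) in the category of sets, with X * X' modelled by Python 2-tuples. The function precompose_pair stands for pre-composition with id × pair; the sample maps n and m, and the use of the empty tuple for the single element of the terminal object, are assumptions made purely for illustration.

```python
def pair(x, xp):
    """pair : X × X' -> X * X' (here X * X' is modelled by Python 2-tuples)."""
    return (x, xp)

def split(n):
    """split_I(n) : I × (X * X') -> Y, built from n : I × X × X' -> Y."""
    return lambda i, p: n(i, p[0], p[1])

def precompose_pair(m):
    """Send m : I × (X * X') -> Y to m ∘ (id × pair) : I × X × X' -> Y."""
    return lambda i, x, xp: m(i, pair(x, xp))

# Definable projections, taking I to be a one-element set with element ().
def fst(p):
    return split(lambda _i, x, _xp: x)((), p)

def snd(p):
    return split(lambda _i, _x, xp: xp)((), p)

n = lambda i, x, xp: i + x * xp      # a sample n : I × X × X' -> Y
m = lambda i, p: i * (p[0] - p[1])   # a sample m : I × (X * X') -> Y

# Soundness condition for (3.14): splitting, then pre-composing with pair, gives n back.
beta_ok = precompose_pair(split(n))(2, 3, 4) == n(2, 3, 4)
# Soundness condition for (3.15): pre-composing with pair, then splitting, gives m back.
eta_ok = split(precompose_pair(m))(2, (3, 4)) == m(2, (3, 4))
```

In this model both conditions hold, so split is a two-sided inverse and the tuples behave as a genuine categorical product, with fst and snd acting as the projections.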
3.3 Function types
Since types σ are interpreted as objects X = ⟦σ⟧ of C, we need a binary operation X, Y ↦ X → Y on objects in order to interpret function types as follows:
\[ [\![\sigma \to \tau]\!] = [\![\sigma]\!] \to [\![\tau]\!] \]
Introduction. The introduction rule for function types is:
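To interpret this rule in a category C with finite products we need a function on hom-sets of the form
\[ \mathrm{cur}_I : C(I \times X, Y) \to C(I, X \to Y) \]
in order to define the meaning of a function abstraction as cur_I applied to the meaning of its body. In order to preserve the basic property of the semantics that substitution is modelled by composition, we require the above functions to be natural in I, meaning
\[ \mathrm{cur}_{I'}(f \circ (g \times \mathrm{id}_X)) = \mathrm{cur}_I(f) \circ g \]
for all f : I × X → Y and g : I' → I.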
Elimination. We will use the familiar elimination rule involving application rather than the rule systematically derived from formation and introduction, which involves judgements with higher-order contexts (see [Nordström et al., 1990, Section 7.2]):
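To interpret this rule, we need a function on hom-sets of the form
\[ C(I, X \to Y) \times C(I, X) \to C(I, Y) \]
which is natural in I (in order to respect the semantics of substitution in terms of composition). Naturality in I of these functions implies that they are induced by a morphism
\[ \mathrm{ap} : (X \to Y) \times X \to Y \]
and we define the meaning of an application to be ap ∘ ⟨m, n⟩, where m : I → (X → Y) and n : I → X are the meanings of the function and of its argument.
Equality. The rule of β-conversion is
The structure in C we have described for interpreting function types is sound for this rule provided
\[ \mathrm{ap} \circ \langle \mathrm{cur}_I(f), n \rangle = f \circ \langle \mathrm{id}_I, n \rangle \]
holds for all f : I × X → Y and n : I → X. For this it is sufficient, and because of the naturality of cur also necessary, that
\[ \mathrm{ap} \circ (\mathrm{cur}_I(f) \times \mathrm{id}_X) = f \]
hold for all f : I × X → Y. The rule for η-conversion is
The semantics is sound for this rule provided that for all m : I → (X → Y)
\[ \mathrm{cur}_I(\mathrm{ap} \circ (m \times \mathrm{id}_X)) = m. \]
In fact the naturality of cur implies that this holds if and only if for each X and Y
\[ \mathrm{cur}_{X \to Y}(\mathrm{ap}) = \mathrm{id}_{X \to Y}. \]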
Now (3.20) and (3.22) say precisely that the natural transformation C(I, X → Y) → C(I × X, Y) given by m ↦ ap ∘ (m × id_X), where ap : (X → Y) × X → Y is the morphism inducing application, is a bijection with inverse cur_I. Thus, by definition, X → Y is the exponential of Y by X in the category C, with evaluation morphism ap.
Recall that by definition, a cartesian closed category has all finite products and exponentials. Thus we have: function types satisfying β- and η-conversion can be soundly interpreted in C provided it is cartesian closed.
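The bijection between maps I × X → Y and maps I → (X → Y) is the familiar currying of functional programming. The following Python sketch works in the category of sets, with X → Y modelled by Python functions; the names cur, ap and uncur, and the sample maps f, m and g, are illustrative assumptions. It checks the β and η conditions and the naturality of cur pointwise on a few sample arguments.

```python
def cur(f):
    """cur_I(f) : I -> (X -> Y), from f : I × X -> Y."""
    return lambda i: (lambda x: f(i, x))

def ap(h, x):
    """Evaluation (X -> Y) × X -> Y."""
    return h(x)

def uncur(m):
    """m ↦ ap ∘ (m × id_X), sending m : I -> (X -> Y) to a map I × X -> Y."""
    return lambda i, x: ap(m(i), x)

f = lambda i, x: 10 * i + x          # sample f : I × X -> Y
m = lambda i: (lambda x: x ** i)     # sample m : I -> (X -> Y)
g = lambda j: j + 1                  # sample reindexing g : I' -> I

points = [(i, x) for i in range(3) for x in range(3)]

# β condition: ap ∘ (cur(f) × id) = f.
beta_ok = all(uncur(cur(f))(i, x) == f(i, x) for i, x in points)
# η condition: cur(ap ∘ (m × id)) = m, compared extensionally.
eta_ok = all(cur(uncur(m))(i)(x) == m(i)(x) for i, x in points)
# Naturality of cur in I: cur(f ∘ (g × id)) = cur(f) ∘ g.
natural_ok = all(
    cur(lambda j, x: f(g(j), x))(j)(x) == cur(f)(g(j))(x) for j, x in points
)
```

Python functions can only be compared on sample arguments, so these checks illustrate the equations rather than prove them.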
3.4 Inductive types
In this section we consider the categorical interpretation of inductively defined datatypes (such as the natural numbers, lists and trees) using initial algebras. For definiteness we will concentrate on the case of the list type constructor, σ ↦ σ list. Recall that the introduction rules and corresponding elimination rule for list are:
By a process of analysis that one hopes is becoming familiar by now, one finds that the following structure on a category C with finite products is needed to interpret this new syntax:
• For each object X, an object LX equipped with morphisms nil_X : 1 → LX and cons_X : X × LX → LX.
• For all objects I, X, X', an operation sending a morphism f : I × X × LX × X' → X' to a morphism listrec_I(f) : I × LX × X' → X', which is natural in I (so that for all g : I' → I, listrec_{I'}(f ∘ (g × id × id × id)) = listrec_I(f) ∘ (g × id × id)).
The semantics of σ list and its associated terms in C is then:
(In the clause for nil, ⟨⟩ denotes the unique morphism from ⟦Γ⟧ to the terminal object 1.) Now consider the usual equality rules for list:
For the semantics to be sound for these two rules, one needs that for each
We can go one step further and insist that, given f, listrec_I(f) is the unique morphism satisfying (3.27) and (3.28). In this case the structure on C is also sound for the rule
With (3.29) and in the presence of product types, the scheme for primitive recursion becomes interdefinable with a simpler scheme of iteration, given by the following rules:
For clearly iteration is a special case of primitive recursion, and in the reverse direction one can define
and use (3.31), (3.32) and (3.33) to prove that this has the correct properties. To do this, one also has to prove that
and this requires the uniqueness property (3.33). In C, the rules (3.30)-(3.33) correspond to the following universal property of the object LX equipped with morphisms nil_X : 1 → LX and cons_X : X × LX → LX: for each pair of morphisms f : I × X × X' → X' and m' : I → X', there is a unique morphism g : I × LX → X' such that the following diagrams commute:
The uniqueness property of g implies that the operation sending f to g is natural in I, so it is unnecessary to state this as a separate requirement. If the category C is cartesian closed, then a further simplification can be made: it is sufficient to have the above universal property for I = 1 to deduce the general case. If C also has binary coproducts, then one can combine nil_X and cons_X into a single morphism
\[ i = [\mathrm{nil}_X, \mathrm{cons}_X] : T(LX) \to LX \]
where T : C → C is the functor 1 + (X × −). The universal property enjoyed by nil_X and cons_X is equivalent to the following property of i: for each morphism h : T(X') → X', there is a unique morphism g : LX → X' with g ∘ i = h ∘ T(g). In other words, (LX, i) is an initial object in the category of algebras for the functor T. (Recall that a T-algebra is an object X' of C equipped with a 'structure' morphism T(X') → X'; a morphism of T-algebras is a morphism in C between the underlying objects of the algebras that makes the evident square involving the structure morphisms commute.) As usual, this initiality property characterizes LX, nil_X and cons_X uniquely up to isomorphism in C. Moreover, the structure morphism of an initial algebra is always an isomorphism: so i gives an isomorphism between 1 + (X × LX) and LX. (The inverse of i is the unique T-algebra morphism from the initial algebra to (T(LX), T(i)).)
Taking the type σ in σ list to be the one-element type unit, one obtains the type of natural numbers. On the categorical side, when X is the terminal object 1 in C, the above universal property of L1 is that given by Lawvere in defining the notion of a natural number object in a category. Other inductive datatypes (equipped with iterators satisfying uniqueness conditions) can be modelled soundly by requiring C to have initial algebras for the 'strictly positive' functors T : C → C, namely those expressible via a combination of constant functors, products, coproducts, and exponentiation by a fixed object.
Remark 3.7 (Weak natural numbers object). Without the uniqueness rules (3.29) or (3.33), the scheme of iteration is weaker than that of primitive recursion. (For example, the expected properties of the predecessor function on the natural numbers cannot be derived from an iterative definition; see [Girard, 1989, page 51].)
The corresponding uniqueness part of the universal property defining the notion of natural number object N makes this notion conditional-equational rather than equational. If this uniqueness requirement is removed, N is termed a weak natural numbers object. Lambek [1988] and Roman [1989] have studied extensions of the algebraic theory of weak natural numbers involving Mal'cev operations in which the uniqueness of the iterator can be derived.
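The passage from iteration to primitive recursion described in this section can be sketched in Python, taking the category of sets, Python lists for LX, and a trivial parameter object I = 1. The names listiter and listrec are illustrative, and the pairing trick below is one standard way of realizing the definition alluded to in the text, not necessarily the exact term used there: the iterator computes a pair consisting of the list seen so far and the recursive result.

```python
def listiter(f, base, xs):
    """Iterator: the map g with g(nil) = base and g(cons(x, l)) = f(x, g(l))."""
    acc = base
    for x in reversed(xs):
        acc = f(x, acc)
    return acc

def listrec(f, base, xs):
    """Primitive recursion from iteration: f(x, l, r) sees the tail l as well as
    the recursive result r. The iterator rebuilds the tail in the first
    component of a pair and keeps the result in the second."""
    step = lambda x, lr: ([x] + lr[0], f(x, lr[0], lr[1]))
    return listiter(step, ([], base), xs)[1]

# Length only needs iteration.
length = lambda xs: listiter(lambda _x, n: n + 1, 0, xs)

# Tail needs access to the list itself, so it uses primitive recursion.
tail = lambda xs: listrec(lambda _x, l, _r: l, [], xs)

# Natural numbers as lists over a one-element type; predecessor is then tail.
nat = lambda n: [()] * n
pred = lambda n: len(tail(nat(n)))
```

That the first component of the pair really is the list being consumed is exactly the kind of fact whose proof needs the uniqueness property; here it is simply true of the Python implementation.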
3.5 Computation types
Moggi [1991] introduces a new approach to the denotational semantics of programming languages based on the categorical notion of a strong monad
and the corresponding type-theoretic notion of 'computation types'. As a further illustration of the methods developed so far, we will give the semantics of Moggi's 'computational lambda calculus' in a category with finite products equipped with a strong monad. The syntax of computation types is as follows. We are given a unary type constructor σ ↦ Tσ. One should think of Tσ as the type of 'computations' of elements of type σ. There are two rules for building terms of computation types:
(together with associated congruence rules; cf. Remark 3.3). Intuitively, val(M) is the value M regarded as a trivial computation which immediately evaluates to itself; and let(E, F) denotes the sequential computation which firstly tries to evaluate E to some value M : σ and then proceeds to evaluate F(M). These intended meanings motivate the following three equality rules:
To interpret this syntax soundly in a category C with finite products we need the following structure on C:
• for each object X, an object TX;
• for each object X, a morphism η_X : X → TX;
• for all objects I, X, X', a function
\[ \mathrm{lift}_I : C(I \times X, TX') \to C(I \times TX, TX') \]
that is natural in I (so that lift_{I'}(f ∘ (g × id_X)) = lift_I(f) ∘ (g × id_{TX}) holds for all f : I × X → TX' and g : I' → I).
Then the semantics of computation types and their associated terms in C is:
Satisfaction of rules (3.36), (3.37), and (3.38) requires the following equational properties of η_X and lift_I to hold for all morphisms of the appropriate types (we have used the naturality of lift_I in I to state these in their simplest form):
Specifying the structure (T, η, lift) with the above properties is in fact equivalent to specifying a monad on the category C together with a 'tensorial strength' for the monad: we refer the reader to [Moggi, 1991] and the references therein for more details. Note that unlike the other categorical datatypes considered in this section, for a given category C the structure of a strong monad is not determined up to isomorphism: there may be many different ones on C. (Moggi [1991] gives several examples with computational significance.)
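As an illustration of a structure (T, η, lift), the following Python sketch uses a simple 'exceptions' monad on sets, where a computation is either ("ok", value) or ("err", message). This particular monad, the helper names, and the sample maps safe_div and double are assumptions chosen for illustration. The three properties checked are the standard unit and associativity laws for such a structure, stated with a parameter i; they are assumed here to be the form the equational properties take.

```python
def eta(x):
    """η_X : X -> TX, the trivial computation returning x."""
    return ("ok", x)

def lift(f):
    """lift_I(f) : I × TX -> TX', from f : I × X -> TX'."""
    def lifted(i, e):
        return f(i, e[1]) if e[0] == "ok" else e   # errors propagate unchanged
    return lifted

def val(m):
    """val(M): the value m as a trivial computation."""
    return eta(m)

def let(e, f):
    """let(E, F): evaluate e, then continue with f (trivial parameter object)."""
    return lift(lambda _i, x: f(x))((), e)

safe_div = lambda i, x: ("err", "division by zero") if x == 0 else ("ok", i // x)
double = lambda i, y: ("ok", 2 * y)

samples = [("ok", 0), ("ok", 3), ("err", "boom")]

# lift(f) ∘ (id × η) = f
law1 = all(lift(safe_div)(12, eta(x)) == safe_div(12, x) for x in range(4))
# lift(η ∘ π2) = π2
law2 = all(lift(lambda i, x: eta(x))(12, e) == e for e in samples)
# lift(lift(f') ∘ <π1, f>) = lift(f') ∘ <π1, lift(f)>
law3 = all(
    lift(lambda i, x: lift(double)(i, safe_div(i, x)))(12, e)
    == lift(double)(12, lift(safe_div)(12, e))
    for e in samples
)
```

For example, let(val(3), lambda x: ("ok", x + 1)) evaluates to ("ok", 4), while any "err" computation passed to let is returned unchanged. Other choices of T on the same category (state, nondeterminism, and so on) satisfy the same laws, which is one way to see that the structure is not determined by the category.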
4 Theories as categories
Sections 2 and 3 have shown how to model an equational theory, possibly involving some simple datatypes, in a suitably structured category in a way which respects the rules of equational logic. The purpose of this section is to demonstrate that this categorical semantics is completely general, in the sense that the category theory provides an alternative but equivalent formulation of the concepts embodied in the logic. We will see that there is a correspondence between theories and categories, which enables one to view the latter as giving an abstract or presentation-free notion of theory. Such a theory-category correspondence relies upon the fact that the categorical formulation of semantics allows us to consider models of a theory in different categories. For a given theory Th, amongst all these different categories it is possible to construct one which contains the 'most general' model of Th. This category is called the 'classifying category' of Th and the model it contains is called the 'generic' model. They are characterized by a certain category-theoretic, universal property which depends upon the ability to compare algebras in different categories by transporting them along functors preserving the relevant categorical structure. So we begin by describing this process. Until section 4.4 we will confine our attention to algebraic theories versus categories with finite products. Th will denote a fixed, many-sorted algebraic theory (as defined in section 2.1), with underlying signature Sg.
4.1 Change of category
Recall that a functor T : C → D is said to preserve the finite product X₁ × ⋯ × Xₙ if the morphisms T(π_i) : T(X₁ × ⋯ × Xₙ) → T(X_i) make T(X₁ × ⋯ × Xₙ) into the product of the T(X_i) in D. If D itself has finite products, then this is equivalent to requiring that
\[ \langle T(\pi_1), \ldots, T(\pi_n) \rangle : T(X_1 \times \cdots \times X_n) \to T(X_1) \times \cdots \times T(X_n) \]
be an isomorphism. (A stronger condition would be to demand that it be an identity morphism, in which case we say that T strictly preserves the finite product X₁ × ⋯ × Xₙ.) If C and D are categories with finite products and T : C → D is a functor preserving them, then every structure S for Sg in C gives rise to a structure T(S) in D defined by:
for all sorts σ and function symbols F : σ₁, …, σₙ → τ. Since T preserves finite products and the semantics of terms-in-context is defined in terms of these, the meaning of a term in S is mapped by T to its meaning in T(S). More precisely, if M : τ [Γ] then
commutes in D. Consequently, if S satisfies M = M' : τ [Γ], then so does T(S). Thus finite product preserving functors preserve the satisfaction of equations. In particular, if S is an algebra in C for a theory Th, then T(S) is an algebra for Th in D.
Example 4.1. If C is locally small, then for each object I in C the hom-functor C(I, −) : C → Set preserves products (but not strictly so, in general). Thus every algebra S in C gives rise to an ordinary, set-valued algebra C(I, S), whose elements are the 'generalized elements of S at stage I' in the sense of section 2.3.
4.2 Classifying category of a theory
Two contexts over the signature Sg are α-equivalent if they differ only in their variables; in other words, the lists of sorts occurring in each context are equal (and, in particular, the contexts are of equal length). Clearly this gives an equivalence relation on the collection of contexts, and the set of α-equivalence classes of contexts is in bijection with the set of lists of sorts in Sg. Assuming a fixed enumeration Var = {v₁, v₂, …} of the set of variables (i.e. of the set of metavariables of arity TERMS), we can pick a canonical representative context for the α-equivalence class corresponding to each list σ₁, …, σₙ of sorts. We will not distinguish notationally between a context and the α-equivalence class it determines.
Given contexts Γ and Γ' = [y₁ : τ₁, …, y_m : τ_m], a context morphism γ : Γ → Γ' from Γ to Γ' is specified by a list γ = [N₁, …, N_m] of terms over Sg such that N_j : τ_j [Γ] holds for each j = 1, …, m. Note that Γ' could be replaced by any α-equivalent context without affecting this definition. Given another such morphism γ' = [N₁', …, N_m'], we will write
\[ \gamma =_{Th} \gamma' : \Gamma \to \Gamma' \]
to indicate the judgement that γ and γ' are context morphisms from Γ to Γ' that are Th-provably equal: by definition, this means that for each j = 1, …, m, the equation N_j = N_j' : τ_j [Γ] is a theorem of Th. The equational rules (including (2.8)) imply that Th-provable equality of context morphisms (between two given contexts) is an equivalence relation. Moreover, rule (2.9) shows that changing Γ up to α-equivalence will not change the class of γ under this equivalence relation. The composition of context morphisms γ : Γ → Γ' and γ' : Γ' → Γ'' is the context morphism γ' ∘ γ : Γ → Γ'' formed by making substitutions:
The fact that γ' ∘ γ does constitute a context morphism from Γ to Γ'' follows from rule (2.4); and rule (2.9) implies that composition respects Th-provable equality: if γ₁ =_Th γ₂ : Γ → Γ' and γ₁' =_Th γ₂' : Γ' → Γ'', then γ₁' ∘ γ₁ =_Th γ₂' ∘ γ₂ : Γ → Γ''.
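The usual properties of term substitution imply that composition of context morphisms is associative:
\[ (\gamma'' \circ \gamma') \circ \gamma = \gamma'' \circ (\gamma' \circ \gamma) \]
when γ : Γ → Γ', γ' : Γ' → Γ'' and γ'' : Γ'' → Γ'''. Moreover, the operations of composition possess units: the identity context morphism for Γ = [x₁ : σ₁, …, xₙ : σₙ] is given by the list id_Γ = [x₁, …, xₙ] of variables in Γ; clearly one has
\[ \gamma \circ \mathrm{id}_\Gamma = \gamma = \mathrm{id}_{\Gamma'} \circ \gamma \quad \text{for } \gamma : \Gamma \to \Gamma'. \]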
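Composition of context morphisms by substitution can be sketched in a few lines of Python. The term representation is an assumption made for illustration: a variable is a string, and an application of a function symbol is a tuple whose first entry is the symbol's name; the symbols add, neg and zero are hypothetical. The sketch checks the unit and associativity properties on one example, up to syntactic identity of terms.

```python
def subst(term, env):
    """Simultaneously substitute env[v] for each variable v in term."""
    if isinstance(term, str):          # a variable
        return env[term]
    head, *args = term                 # a function symbol applied to arguments
    return (head, *[subst(a, env) for a in args])

def compose(gamma2, mid_vars, gamma1):
    """γ' ∘ γ, where γ : Γ -> Γ', γ' : Γ' -> Γ'' and mid_vars lists the variables of Γ'."""
    env = dict(zip(mid_vars, gamma1))
    return [subst(n, env) for n in gamma2]

gamma = [("add", "x", "y"), "x"]            # γ   : [x, y] -> [u, w]
gamma_p = [("neg", "u")]                    # γ'  : [u, w] -> [z]
gamma_pp = [("add", "z", ("zero",))]        # γ'' : [z] -> [t]

composite = compose(gamma_p, ["u", "w"], gamma)

# Associativity: (γ'' ∘ γ') ∘ γ and γ'' ∘ (γ' ∘ γ) are the same list of terms.
left = compose(compose(gamma_pp, ["z"], gamma_p), ["u", "w"], gamma)
right = compose(gamma_pp, ["z"], compose(gamma_p, ["u", "w"], gamma))

# Unit law: composing with the identity context morphism [x, y] changes nothing.
identity_ok = compose(gamma, ["x", "y"], ["x", "y"]) == gamma
```

Here composite is the one-element list containing the term neg(add(x, y)). Passing to the classifying category additionally identifies lists of terms that are Th-provably equal, which this syntactic sketch does not model.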
We now have all the ingredients necessary to define a category. Specifically, the classifying category, Cl(Th), of an algebraic theory Th is defined as follows:
Objects of Cl(Th) are α-equivalence classes of contexts (or, if you prefer, finite lists of sorts) over the signature of Th.
Morphisms of Cl(Th) from one object Γ to another Γ' are equivalence classes of context morphisms for the equivalence relation which identifies γ : Γ → Γ' with γ' : Γ → Γ' just if γ =_Th γ' : Γ → Γ'.
Composition of morphisms in Cl(Th) is induced by composition of context morphisms, and identities are equivalence classes of identity context morphisms.
We will not distinguish notationally between a context morphism and the morphism of Cl(Th) which it determines.
Proposition 4.2. Cl(Th) has finite products.
Proof. Clearly, for each context Γ the empty list of terms is the unique context morphism Γ → [], and so its equivalence class is the unique morphism from Γ to [] in Cl(Th). Thus the (α-equivalence class of the) empty context [] is a terminal object in Cl(Th). Given contexts Γ = [x₁ : σ₁, …, xₙ : σₙ] and Γ' = [y₁ : τ₁, …, y_m : τ_m], we make use of the given enumeration Var = {v₁, v₂, …} of the set of variables to define a context
\[ \Gamma \times \Gamma' = [v_1 : \sigma_1, \ldots, v_n : \sigma_n, v_{n+1} : \tau_1, \ldots, v_{n+m} : \tau_m]. \]
Then [v₁, …, vₙ] and [v_{n+1}, …, v_{n+m}] determine morphisms π₁ : Γ × Γ' → Γ and π₂ : Γ × Γ' → Γ' in Cl(Th) which make Γ × Γ' into the product of Γ and Γ' in Cl(Th). For if γ = [M₁, …, Mₙ] : Δ → Γ and γ' = [M₁', …, M_m'] : Δ → Γ', then ⟨γ, γ'⟩ = [M₁, …, Mₙ, M₁', …, M_m'] is a context morphism ⟨γ, γ'⟩ : Δ → Γ × Γ' which is unique up to Th-provability with the property that π₁ ∘ ⟨γ, γ'⟩ =_Th γ and π₂ ∘ ⟨γ, γ'⟩ =_Th γ'.
Remark 4.3. We could have defined Cl(Th) by taking its objects to be just contexts rather than α-equivalence classes of contexts; this would have resulted in a larger category, equivalent to the one defined above (and hence with the same categorical properties). One advantage of the definition we gave is that the particular choice of finite products given in the proof of Proposition 4.2 is strictly associative and strictly unital: this means that for all objects Γ, Γ' and Γ'', the canonical morphisms
\[ (\Gamma \times \Gamma') \times \Gamma'' \to \Gamma \times (\Gamma' \times \Gamma''), \qquad [\,] \times \Gamma \to \Gamma, \qquad \Gamma \times [\,] \to \Gamma \]
are not merely isomorphisms (as is always the case), but actually identity morphisms.
Remark 4.4. There is a close relationship between Cl(Th) and the free Th-algebras in Set generated by finitely many indeterminates. Writing Γ for the context [x₁ : σ₁, …, xₙ : σₙ], let F_Th(Γ) denote the free algebra generated by finitely many indeterminates x₁, …, xₙ of sorts σ₁, …, σₙ respectively. Then F_Th(Γ) can be constructed by taking its underlying set at a sort τ to consist of the set of terms M satisfying M : τ [Γ], quotiented by the equivalence relation which identifies M with M' just if M = M' : τ [Γ] is a theorem of Th. This quotient set is precisely the set of morphisms Γ → [v₁ : τ] in Cl(Th). Thus the hom-sets of Cl(Th) can be used to construct the free finitely generated Th-algebras in Set. Conversely, it can be shown that the category Cl(Th) is equivalent to the opposite of the category whose objects are the free, finitely generated Th-algebras in Set, and whose morphisms are all Th-algebra homomorphisms.
Next we describe the 'generic' model of Th in the classifying category. Each sort σ of the underlying signature Sg of the theory Th determines an object in the classifying category of Th represented by the context [v₁ : σ]. We will denote this object of Cl(Th) by G⟦σ⟧. If F : σ₁, …, σₙ → τ is a function symbol in Sg, then, since
\[ F(v_1, \ldots, v_n) : \tau \; [v_1 : \sigma_1, \ldots, v_n : \sigma_n] \]
we have that [F(v₁, …, vₙ)] is a context morphism from [v₁ : σ₁, …, vₙ : σₙ] to [v₁ : τ]. Now the proof of Proposition 4.2 shows that in Cl(Th) the object [v₁ : σ₁, …, vₙ : σₙ] is the product of the objects [v_i : σ_i] = G⟦σ_i⟧; hence [F(v₁, …, vₙ)] determines a morphism G⟦F⟧ : G⟦σ₁⟧ × ⋯ × G⟦σₙ⟧ → G⟦τ⟧. Altogether, then, G is a structure for Sg in the category Cl(Th).
Lemma 4.5. The structure G has the following properties.
(i) For each context Γ, the object G⟦Γ⟧ in Cl(Th) is (the α-equivalence class of) Γ;
(ii) If M : τ [Γ] then the morphism G⟦M [Γ]⟧ : G⟦Γ⟧ → G⟦τ⟧ in Cl(Th) is that determined by the context morphism [M] : Γ → [v₁ : τ];
(iii) M = M' : τ [Γ] is a theorem of Th if and only if G⟦M [Γ]⟧ = G⟦M' [Γ]⟧. In other words, G satisfies an equation-in-context just if it is a theorem of Th.
Proof.
(i) We have already observed that [v₁ : σ₁, …, vₙ : σₙ] is the product G⟦σ₁⟧ × ⋯ × G⟦σₙ⟧ in Cl(Th); but recall from section 2.2 that this is also the definition of G⟦[v₁ : σ₁, …, vₙ : σₙ]⟧.
(ii) This follows by induction on the structure of M using the explicit description of the product structure of Cl(Th) given in the proof of Proposition 4.2.
(iii) By part (ii), G⟦M [Γ]⟧ = G⟦M' [Γ]⟧ holds if and only if [M] and [M'] determine equal morphisms in Cl(Th), which by definition means that M = M' : τ [Γ] is a theorem of Th.
Part (iii) of the above lemma implies in particular that G satisfies the axioms of Th and hence is a Th-algebra. G will be called the generic Th-algebra. It enjoys the following universal property.
Theorem 4.6 (Universal property of the generic algebra). For each category C with finite products, any Th-algebra S in C is equal to Ŝ(G) for some finite product preserving functor Ŝ : Cl(Th) → C. Moreover, the functor Ŝ is uniquely determined up to (unique) natural isomorphism by the algebra S. (Ŝ is called the functor classifying the algebra S.)
Proof. Define Ŝ on objects Γ by
\[ \hat{S}(\Gamma) = S[\![\Gamma]\!] \]
and on morphisms γ = [N₁, …, N_m] : Γ → Γ' by
\[ \hat{S}(\gamma) = \langle S[\![N_1\,[\Gamma]]\!], \ldots, S[\![N_m\,[\Gamma]]\!] \rangle : S[\![\Gamma]\!] \to S[\![\Gamma']\!]. \]
Then the fact that Ŝ is a functor preserving finite products follows easily from the definition of Cl(Th) in section 4.2. Applying Ŝ to G, one has for a sort σ that Ŝ(G)⟦σ⟧ = Ŝ([v₁ : σ]) = S⟦σ⟧; similarly for a function symbol F that Ŝ(G)⟦F⟧ = S⟦F⟧. Hence Ŝ(G) = S. Now suppose that T : Cl(Th) → C is another product-preserving functor and that there is an isomorphism, h : T(G) ≅ S, of Th-algebras in C. For each object Γ = [x₁ : σ₁, …, xₙ : σₙ], since Γ = G⟦σ₁⟧ × ⋯ × G⟦σₙ⟧ one gets isomorphisms
\[ T(\Gamma) \cong T(G[\![\sigma_1]\!]) \times \cdots \times T(G[\![\sigma_n]\!]) \cong S[\![\sigma_1]\!] \times \cdots \times S[\![\sigma_n]\!] = \hat{S}(\Gamma) \]
which are the components of a natural isomorphism k : T ≅ Ŝ. Finally, note that since T and Ŝ preserve finite products and every object in Cl(Th) is a finite product of objects of the form G⟦σ⟧, k is uniquely determined by the condition that it induces h. □
Corollary 4.7.
(i) The classifying category of Th is determined uniquely up to equivalence, and the generic Th-algebra uniquely up to isomorphism by the universal property in Theorem 4.6.
(ii) The operation of evaluating a finite product preserving functor from Cl(Th) to C at the generic algebra G is the object part of an equivalence of categories:
\[ FP(Cl(Th), C) \simeq Th\text{-}ALG(C) \qquad (4.1) \]
where FP(Cl(Th), C) is the category of finite product preserving functors and natural transformations from Cl(Th) to C, and Th-ALG(C) is the category of Th-algebras and homomorphisms in C.
Proof. Part (i) is a routine consequence of the universal property, but part (ii) deserves further comment. The statement of Theorem 4.6 says that the functor T ↦ T(G) is essentially surjective, and full and faithful for isomorphisms; hence it gives an equivalence between the category of functors and natural isomorphisms and the category of algebras and algebra isomorphisms. Since this is true for any C with finite products, we can replace C by its arrow category (whose objects are the morphisms of C and whose morphisms are commutative squares in C), which certainly has finite products when C does. The equivalence for objects and isomorphisms in this case implies that the original functor T ↦ T(G) is full and faithful, and hence that (4.1) holds. □
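For a set-valued algebra, the functor classifying it can be sketched directly: a context is sent to tuples of values, and a context morphism to the function that evaluates each of its terms. In the Python sketch below the term representation (strings for variables, tuples for applications), the one-sorted structure S on the integers, and the names meaning and s_hat are all illustrative assumptions. The final lines check functoriality on one composite.

```python
def meaning(term, S, ctx):
    """S[[M [Γ]]] as a function on tuples of values, one per variable of ctx."""
    def run(values):
        env = dict(zip(ctx, values))
        def ev(t):
            if isinstance(t, str):          # a variable: look up its value
                return env[t]
            head, *args = t                 # a function symbol: apply its interpretation
            return S[head](*[ev(a) for a in args])
        return ev(term)
    return run

def s_hat(gamma, S, ctx):
    """The classifying functor on a context morphism γ = [N1, ..., Nm] : Γ -> Γ'."""
    return lambda values: tuple(meaning(n, S, ctx)(values) for n in gamma)

# A hypothetical set-valued structure: a single sort interpreted as the integers.
S = {"add": lambda a, b: a + b, "neg": lambda a: -a, "zero": lambda: 0}

gamma = [("add", "x", "y"), "x"]            # γ  : [x, y] -> [u, w]
gamma_p = [("neg", "u")]                    # γ' : [u, w] -> [z]
composite = [("neg", ("add", "x", "y"))]    # γ' ∘ γ, obtained by substitution

via_functor = s_hat(composite, S, ["x", "y"])((2, 3))
via_functions = s_hat(gamma_p, S, ["u", "w"])(s_hat(gamma, S, ["x", "y"])((2, 3)))

# S satisfies the equation add(x, zero) = x at a sample point.
unit_ok = meaning(("add", "x", ("zero",)), S, ["x"])((7,)) == 7
```

Both routes give the same tuple, reflecting the fact that substitution of terms is sent to composition of functions.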
4.3 Theory-category correspondence
Let Sg_C be the signature defined from a category C with finite products, as in section 2.3. As we noted in that section, there is a canonical structure for Sg_C in C. Define Th_C to be the algebraic theory over Sg_C whose axioms are all equations-in-context which are satisfied by this structure. Then the structure is automatically an algebra for this theory, and hence by Theorem 4.6 corresponds to a finite product preserving functor T : Cl(Th_C) → C. The definition of Sg_C (which names the various objects and morphisms in C) and Th_C (which identifies terms which name the same things in C) entails that T is full, faithful and essentially surjective, and hence is an equivalence of categories. Thus we have that every category with finite products is equivalent to the classifying category of some many-sorted algebraic theory. As we noted in section 2.3, Th_C will have a set of symbols and axioms (rather than a proper class of them) only in the case that C is a small category.
Starting with C, forming Th_C and then Cl(Th_C), one gets back not C itself, but an equivalent category. What about the reverse process: how are the theories Th and Th_{Cl(Th)} related? To answer this question, we need an appropriate notion of morphism, or translation between algebraic theories. A rather general notion is obtained by declaring a translation Th → Th' to be an algebra for Th in the classifying category of Th'. The syntactic nature of the construction of Cl(Th') permits one to give a purely syntactic explanation of this notion of translation, which we omit here. In fact we can form a 2-category of many-sorted algebraic theories, Alg, whose hom-categories are the categories of algebras and homomorphisms in classifying categories:
\[ \mathbf{Alg}(Th, Th') = Th\text{-}ALG(Cl(Th')). \]
If FP denotes the 2-category whose objects are categories with finite products and whose hom-categories consist of finite product preserving functors and all natural transformations, then the classifying-category construction is the object part of a 2-functor
\[ Cl : \mathbf{Alg} \to FP. \]
This 2-functor is 'full and faithful' in the sense that it is an equivalence on hom-categories (using (4.1)), and is 'essentially surjective' in the sense that every object of FP is equivalent to one in the image of Cl. Thus Cl induces an equivalence between Alg and FP in an appropriate, up-to-equivalence sense.
Remarks 4.8. We close this section by mentioning some consequences of the correspondence.
(i) We can view a particular small category with finite products C as specifying a (many-sorted) algebraic theory independently of any particular presentation in terms of a signature and axioms: there may be many such presentations whose syntactic details are different, but which nevertheless specify the 'same' theory, in the sense that their classifying categories are equivalent to the given category C. Indeed, once one has this categorical view of algebraic theories, one sees that there are many 'naturally occurring' algebraic theories which do not arise in terms of a presentation with operators and equations. For example, the category whose objects are the finite cartesian powers of the set of natural numbers ℕ and whose morphisms ℕ^m → ℕ^n are n-tuples of m-ary primitive recursive functions, is a category with finite products and hence an algebraic theory. This category is a paradigmatic example of the notion of 'iteration theory' introduced by Elgot: see the book by Bloom and Esik [1993]. (A reader who takes this advice should be warned that [Bloom and Esik, 1993] adopts a not uncommon viewpoint that algebraic theories can be identified with categories with finite coproducts. Since the 2-category of categories with finite coproducts and functors preserving such is equivalent (under the 2-functor taking a category to its opposite category) to the 2-category of categories with finite products, this viewpoint is formally equivalent to the one presented here. However, it does not sit well with the intuitively appealing picture we have built up of the sorts of a theory (objects of a category) as generalized sets and the terms-in-context (morphisms) as generalized functions.)
(ii) If T : C → D is a morphism in FP between small categories, then, for any E ∈ FP with small colimits, it can be shown that the functor T* : FP(D, E) → FP(C, E) induced by composition with T has a left adjoint, given by left Kan extension along T. Since T corresponds to a translation between algebraic theories and T* is the functor restricting algebras along T, this left adjoint provides 'relatively free' constructions on algebras for a wide variety of situations (depending upon the nature of the translation).
(iii) Free constructions (indeed weighted colimits in general) in FP can be constructed via appropriate syntactic constructions on algebraic theories and translations between them.
4.4 Theories with datatypes
In this section we will examine the effect on the classifying category Cl(Th) of a theory Th when we enrich equational logic with the various datatype constructions considered in section 3, together with their associated introduction, elimination, and equality rules. We will look at product, disjoint union, and function types. In each case the classifying category turns out to possess the categorical structure that was used in Section 3 to interpret these datatype constructs. (Similar considerations apply to the other type-forming operations considered in that section, namely list types and 'computation' types.)
Product types. (Cf. Section 3.2.) In this case, for each pair of types σ and σ' there is a binary product diagram in the classifying category Cl(Th) of the form
Similarly, in the presence of a one-element type unit (cf. Remark 3.6), [z : unit] is a terminal object in Cl(Th). It follows that in the presence of these type-forming constructs, Cl(Th) is equivalent to the full subcategory whose objects are represented by contexts of length one.
Disjoint union types. (Cf. Section 3.1.) In this case, for each pair of types σ and σ' and each context Γ = [x₁ : σ₁, …, xₙ : σₙ], there is a coproduct diagram in the classifying category Cl(Th) of the form
(where x, x', z are chosen to be distinct from x₁, …, xₙ). In the case Γ = [], we have that [z : σ + σ'] is the binary coproduct of [x : σ] and [x' : σ'] in Cl(Th). Given the description of products in the category Cl(Th) in Proposition 4.2, it also follows that product with an arbitrary object Γ distributes over this coproduct, i.e. it is a stable coproduct (cf. Section 3.1). Similarly, in the presence of an empty type null (cf. Remark 3.5), [z : null] is a stable initial object in Cl(Th). In the presence of product and one-element types we have seen that every object is isomorphic to one of the form [x : σ]. In the presence of disjoint union and empty types as well, we can conclude that Cl(Th) has all stable finite coproducts.
Function types. (Cf. Section 3.3.) In this case, for each pair of types σ and σ', the object [f : σ → σ'] is the exponential of [x' : σ'] by [x : σ] in Cl(Th), with associated evaluation morphism
Unlike the case for disjoint union types, it is not necessary to assume the presence of product types in order to conclude that the classifying category possesses exponentials for any pair of objects. For, given any objects T = the is given by the object
Thus in the presence of function types, the classifying category Cl( Th) is a cartesian closed category, whether or not we assume the theory Th involves product types. Remark 4.9 (The generic model of a theory). Suppose we consider equational theories Th in the equational logic of product, disjoint union,
Categorical logic
and function types. We have seen that Cl(Th) is a cartesian closed category with finite coproducts. (Recall that the stability of coproducts is automatic in the presence of exponentials.) Such a category is sometimes called bicartesian closed. We can define a structure G in Cl(Th) for the underlying signature of Th just as in Section 4.2 and satisfying Lemma 4.5. Thus the structure G in Cl(Th) is a model of Th (i.e. satisfies the axioms of Th) and indeed satisfies exactly the equations that are theorems of Th. An immediate corollary of this property is the completeness of the categorical semantics: an equation is derivable from the axioms of Th using the equational logic of product, disjoint union, and function types if (and only if) it is satisfied by all models of Th in bicartesian closed categories. Indeed, Theorem 4.6 extends to yield a universal property characterizing G as the generic model of Th in bicartesian closed categories: given any other model S of Th in a bicartesian closed category C, the functor S : Cl(Th) → C defined in the proof of Theorem 4.6 is a morphism of bicartesian closed categories, i.e. preserves finite products, finite coproducts, and exponentials. It also maps G to S and is unique up to unique isomorphism with these properties.
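To make the bicartesian closed structure of Remark 4.9 concrete, the following Python sketch checks it in the familiar category of finite sets rather than in Cl(Th) itself. The helper names (prod, coprod, exponential, curry, uncurry, distribute) and the encoding of a function as a tuple of (argument, value) pairs are assumptions made for this illustration only; they do not come from the text.

```python
from itertools import product

def prod(X, Y):
    """Binary product X x Y as a list of pairs."""
    return [(x, y) for x in X for y in Y]

def coprod(X, Y):
    """Binary coproduct X + Y as a list of tagged elements."""
    return [(0, x) for x in X] + [(1, y) for y in Y]

def exponential(X, Y):
    """All functions X -> Y; each is a tuple of (argument, value) pairs."""
    X = list(X)
    return [tuple(zip(X, values)) for values in product(Y, repeat=len(X))]

def curry(f, Z, X):
    """Turn f : Z x X -> Y into the corresponding function Z -> (X -> Y)."""
    table = dict(f)
    return tuple((z, tuple((x, table[(z, x)]) for x in X)) for z in Z)

def uncurry(g, Z, X):
    """Inverse of curry: evaluate g(z) at x for every pair (z, x)."""
    table = {z: dict(h) for z, h in g}
    return tuple(((z, x), table[z][x]) for z in Z for x in X)

def distribute(w):
    """Canonical map Z x (X + Y) -> (Z x X) + (Z x Y)."""
    z, (tag, v) = w
    return (tag, (z, v))

# Illustrative finite sets (assumed, not from the text).
Z, X, Y = [0, 1], ["a", "b", "c"], ["yes", "no"]
functions = exponential(prod(Z, X), Y)
curried = {curry(f, Z, X) for f in functions}
distributed = {distribute(w) for w in prod(Z, coprod(X, Y))}
```

Currying is a bijection between functions Z × X → Y and functions Z → Y^X, which is the universal property of the exponential, and distribute witnesses the stability of the coproduct under product with Z.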
5 Predicate logic
In previous sections we have considered the categorical interpretation of properties of structures that can be expressed equationally. Now we consider what it means for a structure in a category to satisfy properties expressible in the (first-order) predicate calculus, using logical connectives and quantifiers. As in section 3, adjunctions play an important role: we will see that the propositional connectives, the quantifiers, and equality can be characterized in terms of their adjointness properties. First we set up the basic framework of predicate logic and its categorical semantics.
5.1 Formulas and sequents
Let us augment the notion of many-sorted signature considered in Section 2 by allowing not only sort and function symbols, but also relation symbols, R. We assume that each relation symbol comes equipped with a typing indicating both the number and the sorts of the terms to which R can be applied. To judgements of the form (2.1) asserting that an expression is a well-formed term of a given sort in a given context, we add judgements of the form
$$\phi\ \mathrm{prop}\ [\Gamma]$$
asserting that the expression φ is a well-formed formula in context Γ. The atomic formulas over the given signature are introduced by the rules
with one such rule for each relation symbol R. Later we will consider compound formulas built up from the atomic ones using various proposition-forming operations. The logical properties of these operations will be specified in terms of sequents of the form
$$\Phi \vdash \psi\ [\Gamma] \tag{5.2}$$
where Φ = φ1, ..., φn is a finite list of formulas, ψ is a formula and the judgements φi prop [Γ] and ψ prop [Γ] are derivable. The intended meaning of the sequent (5.2) is that the joint validity of all the formulas in Φ logically entails the validity of the formula ψ. Figure 2 gives the basic, structural rules for deriving sequents. Since we wish to consider predicate logic as an extension of the rules for equational logic given in Section 2.1, Fig. 2 includes a rule (Subst) permitting the substitution in a sequent of a term for a provably equal one. The rule (Weaken) allows for weakening of contexts. The other possible form of weakening is to add formulas to the left-hand side of a sequent:
This is derivable from the rules in Fig. 2 because of the form of rule (Id) . Note that in stating these rules we continue to use the convention established in Remark 3.2 that the left-hand part of a context is not shown if it is common to all the judgements in a rule. Thus for example, the full form of rule (Subst) is really
Let us enrich the notion of theory used in Section 2 by permitting the signature of the theory to contain typed relation symbols in addition to sorts and typed function symbols, and by permitting the axioms of the theory to be sequents as well as equations-in-context. The theorems of such a theory will now be those judgements that can be derived from the axioms using the rules in Figs. 1 and 2 (together with the rules for proposition-forming operations to be considered below, as appropriate).
5.2 Hyperdoctrines
If the judgement φ prop [x1 : σ1, ..., xn : σn] over a given signature is derivable, then {x1, ..., xn} contains the variables occurring in the formula φ.
Fig. 2. Structural rules

Intuitively such a formula describes a subset of the product of the sorts σ1, ..., σn. Indeed, suppose we are given a set-valued structure for the signature: such a structure assigns a set [[σ]] to each sort σ, a function between the corresponding sets to each function symbol F, and a subset of the corresponding product of sets to each relation symbol R. Then each formula-in-context φ prop [x1 : σ1, ..., xn : σn] gives rise to the subset of [[σ1]] × ... × [[σn]] consisting of those n-tuples (a1, ..., an) for which the sentence φ[a1/x1, ..., an/xn] is satisfied in the classical, Tarskian sense (see [Chang and Keisler, 1973, 1.3]). Our aim is to explain how this traditional notion of satisfaction can be generalized to give an interpretation of formulas-in-context in categories other than the category of sets. To do this one needs a categorical treatment of the notion of 'subset' and the various operations performed on subsets when interpreting logical connectives and quantifiers. For this we will use the 'hyperdoctrine' approach originated by Lawvere [1969; 1970], whereby the 'subsets' of an object are given abstractly via additional structure on a category.
Definition 5.1. A prop-category, C, is a category possessing finite products and equipped with the following structure:
• For each object X in C, a partially ordered set Prop_C(X), whose elements will be called C-properties of X. The partial order on Prop_C(X) will be denoted by ≤.
• For each morphism f : Y → X in C, an operation assigning to each C-property A of X, a C-property f*A of Y called the pullback of A along f. These operations are required to be monotone: if A ≤ A′, then f*A ≤ f*A′; and functorial: (id_X)*A = A and (f ∘ g)*A = g*(f*A).
Thus a prop-category is nothing other than a category with finite products, C, together with a contravariant functor Prop_C(−) from C to the category of posets and monotone functions. Prop_C(−) is a particular instance of a Lawvere 'hyperdoctrine', which in general is category- rather than poset-valued (and pseudofunctorial rather than functorial). The reader should be warned that the term 'prop-category' is not standard; indeed there is no standard terminology for the various kinds of hyperdoctrine which have been used in categorical logic. Here are some important examples of prop-categories from the worlds of set theory, domain theory, and recursion theory.
Example 5.2 (Subsets). The category Set of sets and functions supports the structure of a prop-category in an obvious way by taking the Set-properties of a set X to be the ordinary subsets of X. Given a function f : Y → X, the operation f* is that of inverse image, taking a subset A ⊆ X to the subset {y | f(y) ∈ A} ⊆ Y.
Example 5.3 (Inclusive subsets of cpos). Let Cpo denote the category whose objects are posets possessing joins of all ω-chains and whose morphisms are the ω-continuous functions (i.e. monotone functions preserving joins of ω-chains). This category has products, created by the forgetful functor to Set. We make it into a prop-category by taking Prop_Cpo(X) to consist of all inclusive subsets of X, i.e. those that are closed under joins of ω-chains in X. The partial order on Prop_Cpo(X) is inclusion of subsets. Given f : Y → X, since f is ω-continuous, the inverse image of an inclusive subset of X along f is an inclusive subset of Y; this defines the operation f* : Prop_Cpo(X) → Prop_Cpo(Y).
Example 5.4 (Realizability). The prop-category Kl has as underlying category the category of sets and functions. For each set X, the poset Prop_Kl(X) is defined as follows. Let X → P(N) denote the set of functions from X to the power set of the set of natural numbers.
Let ≤ denote the binary relation on this set defined by: p ≤ q if and only if there is a partial recursive function φ : N ⇀ N such that for all x ∈ X and n ∈ N, if n ∈ p(x) then φ is defined at n and φ(n) ∈ q(x). Since partial recursive functions are closed under composition and contain the identity function, ≤ is transitive and reflexive, i.e. it is a pre-order. Then Prop_Kl(X) is the quotient of X → P(N) by the equivalence relation generated by ≤. Given a function f : Y → X, the pullback operation f* : Prop_Kl(X) → Prop_Kl(Y) sends the equivalence class of p to that of p ∘ f; it is easily seen to be well-defined, monotonic and functorial.
Example 5.5 (Subobjects). An important class of examples of prop-categories is provided by using the category-theoretic notion of a subobject of an object. Recall that a morphism f : Y → X in a category C is a
monomorphism if for all parallel pairs of morphisms g, h with codomain Y, g = h whenever f ∘ g = f ∘ h. The collection of monomorphisms in C with codomain X can be pre-ordered by declaring that f ≤ f′ if and only if f = f′ ∘ g for some (mono)morphism g. Then a subobject of X is an equivalence class of monomorphisms with codomain X under the equivalence relation generated by ≤.
Given Kl-properties of a set X represented by functions p, q : X → P(N) say, the meet of [p] and [q] in Prop_Kl(X) is represented by the function n ↦ p(n) ∧ q(n). The greatest element ⊤_X ∈ Prop_Kl(X) is represented by the function n ↦ N. Since the pullback operations f* in Kl are given by pre-composition with f, it is clear from these descriptions that finite meets are preserved by f*. Finally, let C be a category with finite limits, made into a prop-category as in Example 5.5 by taking the C-properties of an object X to be its subobjects. Then the top subobject ⊤_X is represented by the identity on X; clearly this is stable under pullback. Given two subobjects represented by monomorphisms A → X, B → X say, their binary meet is represented by the composition A ∧ B → A → X, where A ∧ B → A is the categorical pullback of B → X along A → X. Elementary properties of pullbacks in categories guarantee that this meet operation is stable under pullback.
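The prop-category structure on Set from Example 5.2 can be checked mechanically on small finite sets. The sketch below is only an illustration: the function names (powerset, pullback, compose) and the particular sets and functions are assumptions chosen for the example, with functions encoded as Python dictionaries. It confirms that inverse image is monotone and functorial, and that it preserves binary meets (intersections) and the top element.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = list(X)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def pullback(f, A):
    """Inverse image f*A = {y | f(y) in A}, for f : Y -> X given as a dict."""
    return frozenset(y for y in f if f[y] in A)

def compose(f, g):
    """Composite f o g of dicts g : Z -> Y and f : Y -> X."""
    return {z: f[g[z]] for z in g}

# Illustrative data (assumed): f : Y -> X and g : Z -> Y.
X = frozenset({0, 1, 2})
f = {"a": 0, "b": 0, "c": 2, "d": 1}
g = {10: "a", 11: "c", 12: "c"}
Y = frozenset(f)

monotone = all(pullback(f, A) <= pullback(f, B)
               for A in powerset(X) for B in powerset(X) if A <= B)
functorial = all(pullback(compose(f, g), A) == pullback(g, pullback(f, A))
                 for A in powerset(X))
preserves_meets = all(
    pullback(f, A & B) == (pullback(f, A) & pullback(f, B))
    for A in powerset(X) for B in powerset(X))
preserves_top = pullback(f, X) == Y
```

Note the contravariance: pulling back along f ∘ g is pulling back along f first and then along g.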
5.4 Propositional connectives
Given the well-known connection between intuitionistic propositional logic and Heyting algebras (see [Dummett, 1977, Chapter 5], for example), it is not surprising that to model these propositional connectives in a prop-category we will need each poset Prop_C(X) to be a Heyting algebra. However, we also require that the Heyting algebra structure be preserved by the pullback operations f*. This is because these operations will be used in the categorical semantics to interpret the operation of substituting terms for variables in a formula.
Definition 5.10. Let C be a prop-category.
(i) C has finite joins if for each object X in C the poset Prop_C(X) possesses all finite joins and these are preserved by the pullback operations f*. Thus for each X there is ⊥_X ∈ Prop_C(X) satisfying
• ⊥_X ≤ A for all A ∈ Prop_C(X),
• f*(⊥_X) = ⊥_Y, for all f : Y → X;
and for all A, B ∈ Prop_C(X), there is an element A ∨ B ∈ Prop_C(X) satisfying
• for all C ∈ Prop_C(X), A ∨ B ≤ C if and only if A ≤ C and B ≤ C,
• f*(A ∨ B) = f*A ∨ f*B, for all f : Y → X.
(ii) C has Heyting implications if for each object X in C and all A, B ∈ Prop_C(X), there is an element A → B ∈ Prop_C(X) satisfying
• for all C ∈ Prop_C(X), C ≤ A → B if and only if C ∧ A ≤ B,
• f*(A → B) = f*A → f*B, for all f : Y → X.
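Given a signature Sg as in Section 5.1, consider compound formulas built up from the atomic ones and the propositional constant false using the propositional connectives & (conjunction), ∨ (disjunction), and ⇒ (implication). The rules of formation are
$$\frac{}{\mathit{false}\ \mathrm{prop}\ [\Gamma]} \qquad \frac{\phi\ \mathrm{prop}\ [\Gamma] \quad \phi'\ \mathrm{prop}\ [\Gamma]}{\phi \mathbin{\#} \phi'\ \mathrm{prop}\ [\Gamma]} \quad (\text{where } \# \text{ is one of } \&, \vee, \Rightarrow). \tag{5.5}$$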
Negation, ¬φ, will be treated as an abbreviation of φ ⇒ false; truth, true, will be treated as an abbreviation of ¬false; bi-implication, φ ⇔ φ′, will be treated as an abbreviation of (φ ⇒ φ′) & (φ′ ⇒ φ). Given a structure for Sg in a prop-category C with finite meets, finite joins and Heyting implications, we can interpret formulas-in-context φ[Γ] as C-properties [[φ[Γ]]] ∈ Prop_C([[Γ]]), by induction on the derivation of φ prop [Γ]. Atomic formulas are interpreted as in Section 5.3, and compound formulas as follows:
$$[\![\mathit{false}\,[\Gamma]]\!] = \bot_{[\![\Gamma]\!]}, \qquad [\![\phi \mathbin{\&} \phi'\,[\Gamma]]\!] = [\![\phi[\Gamma]]\!] \wedge [\![\phi'[\Gamma]]\!],$$
$$[\![\phi \vee \phi'\,[\Gamma]]\!] = [\![\phi[\Gamma]]\!] \vee [\![\phi'[\Gamma]]\!], \qquad [\![\phi \Rightarrow \phi'\,[\Gamma]]\!] = [\![\phi[\Gamma]]\!] \to [\![\phi'[\Gamma]]\!].$$
In this way the notion given in Section 5.3 of satisfaction of a sequent by a structure applies to sequents involving the propositional connectives.
Theorem 5.11 (Soundness). Given a structure in a prop-category C with finite meets, finite joins, and Heyting implications, the collection of sequents that are satisfied by the structure (cf. Section 5.3) is closed under the usual introduction and elimination rules for the propositional connectives in Gentzen's natural deduction formulation of the intuitionistic sequent calculus, set out in Fig. 3.
Recall that a poset P can be regarded as a category whose objects are the elements of P and whose morphisms are instances of the order relation. From this point of view, meets, joins, and Heyting implications are all instances of adjoint functors. The operation of taking the meet (or the join) of n elements is right (or left) adjoint to the diagonal functor P → P^n; and given A ∈ P, the Heyting implication operation A → (−) : P → P is right adjoint to (−) ∧ A : P → P. Figure 4 gives an alternative formulation of the rules for the intuitionistic propositional connectives reflecting this adjoint formulation. The rules take the form
Fig. 3. Natural deduction rules for intuitionistic propositional connectives
Fig. 4. 'Adjoint' rules for intuitionistic propositional connectives
$$\frac{\text{sequents}}{\overline{\ \text{sequent}\ }}$$
A set of sequents is closed under such a 'bi-rule' if the set contains the sequents above the double line if and only if it contains the sequent below the double line; such a rule is thus an abbreviation for several 'uni-directional' rules. Note that the presence of the extra formulas in rule (∨-Adj) means that the character of disjunction is captured not just by the left adjointness
property of binary joins: one also needs the adjoint to possess a 'stability' property, which in this case is the distribution of finite meets over finite joins. (Compare this with the need identified in Section 3.1 for coproducts to be stable in order to model disjoint union types correctly.) In the presence of rule (⇒-Adj) this stability property is automatic, and we could have given the rule without the extra formulas. ((false-Adj) is also a 'stable' left adjunction rule, but in this case the stability gives rise to no extra condition.) The proof of Theorem 5.11 amounts to showing that the rules in Fig. 3 are derived rules of Fig. 4. Conversely, it is not hard to prove that (the unidirectional versions of) the rules in Fig. 4 are derived rules of Fig. 3. In this sense, the two presentations give equivalent formulations of provability (i.e. what can be proved in intuitionistic propositional logic). However, the structure of proofs in the two formulations may be very different. One should note that the presence of rule (Cut) (which corresponds to the transitivity of ≤ in prop-categories) in Fig. 4 is apparently essential for the adjoint formulation to be equivalent to the natural deduction formulation of Fig. 3, unlike the situation for Gentzen's sequent calculus formulation where cuts can be eliminated (see [Dummett, 1977, 4.3]).
Remark 5.12. A typical feature of the categorical treatment of logical constructs is the identification of which constructs are essentially uniquely determined by their logical properties. Adjoint functors are unique up to isomorphism if they exist. So, in particular, the operations of Definitions 5.7 and 5.10 are uniquely determined if they exist for C. Correspondingly, the adjoint rules of Fig. 4 make plain that, up to provable equivalence, the intuitionistic connectives are uniquely determined by their logical properties.
For example, there cannot be two different binary meet operations on a poset, and correspondingly if &′ were another connective satisfying rule (&-Adj), then φ & ψ and φ &′ ψ would be provably equivalent for all formulas φ and ψ. Such uniqueness for logical constructs does not always hold: for example, Girard's [1987] exponential operator does not have this property.
Example 5.13. The prop-category Set of Example 5.2 has finite meets and finite joins, given by set-theoretic intersection and union respectively (which are preserved by the pullback operations since these are given by taking inverse images along functions). Indeed, each Prop_Set(X) is a Boolean algebra and so the categorical semantics of formulas is sound for classical propositional logic. This semantics coincides with the usual set-theoretic one, in the sense that the subset interpreting a formula-in-context consists of those n-tuples for which the corresponding sentence is satisfied in the classical, Tarskian sense (see [Chang and Keisler, 1973, 1.3]).
Example 5.14. Turning to the prop-category Cpo of Example 5.3, first note that each poset Prop_Cpo(X) does have meets, joins, and Heyting implications. Meets (even infinite ones) are given by set-theoretic intersection. Finite joins are given by set-theoretic union. The Heyting implication A → B of inclusive subsets A, B ∈ Prop_Cpo(X) is given by taking the intersection of all inclusive subsets of X that contain {x ∈ X | x ∉ A or x ∈ B}. The operation f* of taking the inverse image of an inclusive subset along an ω-continuous function f : Y → X preserves meets and finite joins, but it does not preserve Heyting implications. (For example, take X to be the successor ordinal ω⁺, Y to be the discrete ω-cpo with the same set of elements, and f to be the identity function; when A = {ω} and B = ∅, one has f*(A → B) ≠ f*A → f*B.) Thus Cpo supports the interpretation of conjunction and disjunction, but not implication.
Example 5.15. The prop-category Kl of Example 5.4 has finite meets, finite joins, and Heyting implications. To see this, one needs to consider numerical codes for partial recursive functions. Let n · x denote the result, if defined, of applying the nth partial recursive function (in some standard enumeration) to x. The notation {n}(x) is traditional for n · x. We will often write nx for n · x; a multiple application (n · x) · y will be written nxy, using the convention that − · − associates to the left. So for each partial recursive function φ : N ⇀ N there is some n ∈ N such that nx ≃ φ(x) for all x ∈ N. Here 'e ≃ e′' means e↓ if and only if e′↓, in which case e = e′, and 'e↓' means the expression e is defined. The key requirement of this enumeration of partial recursive functions is that the partial binary operation n, m ↦ n · m makes N into a 'partial combinatory algebra', i.e. there should be K, S ∈ N satisfying, for all x, y, z ∈ N, that Kx↓ and Kxy ≃ x, Sx↓, Sxy↓ and Sxyz ≃ xz(yz). In particular, K and S can be used to define P, P0, P1 ∈ N, so that (x, y) ↦ Pxy defines a recursive bijection N × N ≅ N with inverse z ↦ (P0z, P1z). Now define three binary operations on subsets A, B ⊆ N as follows:
Using elementary properties of the partial combinatory algebra (N, ·) one can show that for any pair of Kl-properties of a set X, represented by functions p, q : X → P(N) say, the meet of [p] and [q] in Prop_Kl(X) is represented by the function n ↦ p(n) ∧ q(n). Similarly, their join is represented by the function n ↦ p(n) ∨ q(n) and their Heyting implication
is represented by the function n ↦ p(n) → q(n). The greatest and least elements ⊤_X, ⊥_X ∈ Prop_Kl(X) are represented by the functions n ↦ N and n ↦ ∅ respectively. Since the pullback operations f* in Kl are given by pre-composition with f, it is clear from these descriptions that finite meets, finite joins, and Heyting implications are all preserved by f*. So the prop-category Kl supports an interpretation of the propositional connectives that is sound for intuitionistic logic. Combining the definition of the categorical semantics with the above description of finite meets, finite joins, and Heyting implications in Kl, one finds that this interpretation is in fact Kleene's '1945-realizability' interpretation of the intuitionistic propositional connectives (see [Dummett, 1977, 6.2]), in the sense that membership of a number n in the interpretation of a sentence φ coincides with the relation 'n realizes φ'.
Remark 5.16 (Classical logic). We can obtain classical propositional logic by adding the rule
to those in Fig. 3. Recalling that ¬φ is an abbreviation for φ ⇒ false, one has that [[¬φ[Γ]]] is the pseudocomplement of [[φ[Γ]]] in the Heyting algebra Prop_C([[Γ]]). For the above rule to be sound in C, the pseudocomplement of every C-property must actually be a complement. So to obtain a sound (and in fact complete) semantics for classical propositional logic we require that each Prop_C(X) is a Boolean algebra and that each pullback operation f* preserve finite meets and joins (and hence also complements).
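The gap between Heyting and Boolean algebras described in Remark 5.16 can be seen in a very small example. The sketch below is an illustration under stated assumptions: the poset is the lattice of open sets of a three-point topological space chosen for this example (it is not one of the prop-categories of the text), and the names OPENS, implies and neg are hypothetical. It checks the adjointness property of Definition 5.10(ii) and then shows that the pseudocomplement of an element need not be a complement, using the law of excluded middle as the test.

```python
# Open sets of an assumed topology on {0, 1, 2}; they form a Heyting algebra
# under inclusion, with meet = intersection and join = union.
OPENS = [frozenset(s) for s in ([], [1], [0, 1], [1, 2], [0, 1, 2])]
TOP, BOTTOM = frozenset({0, 1, 2}), frozenset()

def implies(a, b):
    """Heyting implication a -> b: the largest open c with c & a <= b."""
    return frozenset().union(*(c for c in OPENS if c & a <= b))

def neg(a):
    """Pseudocomplement of a, i.e. a -> bottom."""
    return implies(a, BOTTOM)

# Adjointness: c <= (a -> b) if and only if (c meet a) <= b.
adjunction_holds = all(
    (c <= implies(a, b)) == ((c & a) <= b)
    for a in OPENS for b in OPENS for c in OPENS)

# Implication never leaves the lattice of opens.
closed_under_implies = all(implies(a, b) in OPENS
                           for a in OPENS for b in OPENS)

A = frozenset({1})
excluded_middle_for_A = (A | neg(A)) == TOP
double_negation_for_A = neg(neg(A)) == A
```

Here every non-empty open set contains the point 1, so the pseudocomplement of {1} is empty and neither excluded middle nor double-negation elimination holds for it; in a Boolean algebra such as a full power set both checks would succeed.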
5.5 Quantification
The rules for forming quantified formulas are
The usual natural deduction rules for introducing and eliminating quantifiers are given in Fig. 5. Modulo the structural rules of Fig. 2, these natural deduction rules are interderivable with an 'adjoint' formulation given by the bidirectional rules of Fig. 6. It should be noted that some side conditions are implicit in these rules because of the well-formedness conditions mentioned after (5.2) that are part of the definition of a sequent. Thus x does not occur free in Φ in (∀-Intro) or (∀-Adj); and it does not occur free in Φ or ψ in (∃-Elim) or (∃-Adj). What structure in a prop-category C with finite meets is needed to soundly interpret quantifiers? Clearly we need functions
Fig. 5. Natural deduction rules for quantifiers
Fig. 6. 'Adjoint' rules for quantifiers
so that we can define
In order to retain the soundness of the structural rules (Weaken) and (Subst), we must require that these operations be natural in I, i.e. for any f : I′ → I we require
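Given the semantics of weakening in Lemma 5.6, for the soundness of (∀-Adj) we need
$$\pi_1^*(A) \le B \quad\text{if and only if}\quad A \le \textstyle\bigwedge_{I,X}(B)$$
for all A ∈ Prop_C(I) and B ∈ Prop_C(I × X). In other words, ⋀_{I,X} provides a right adjoint to the monotone function π1* : Prop_C(I) → Prop_C(I × X).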
Note that this requirement determines the function ⋀_{I,X} uniquely and implies in particular that it is a monotone function. Given the definition of satisfaction of sequents in Section 5.3, for the soundness of (∃-Adj) we need
for all A, C ∈ Prop_C(I) and B ∈ Prop_C(I × X). We can split this requirement into two: first that ⋁_{I,X} provides a left adjoint to π1*, and second a 'stability' condition for the left adjoint with respect to meets:
Definition 5.17. Let C be a prop-category with finite meets. We say that C has universal quantification if, for each product projection π1 : I × X → I in C, the pullback function π1* possesses a right adjoint ⋀_{I,X} and these right adjoints are natural in I (5.6). We say that C has existential quantification if each π1* possesses a left adjoint ⋁_{I,X} (5.9) and these left adjoints are both natural in I (5.7) and satisfy the stability condition (5.10).
Thus we have shown that if C is a prop-category with finite meets and both universal and existential quantification, then the sequents that are satisfied by a structure in C are closed under the quantifier rules in Fig. 5.
Remark 5.18. Conditions (5.6) and (5.7) are instances of what Lawvere [1969] termed the 'Beck-Chevalley condition': see Remark 5.27. Condition (5.10) was called by him 'Frobenius reciprocity'. It is somewhat analogous to the stability condition on coproducts needed in Section 3.1 to soundly interpret disjoint union types. Just as there we noted that the existence of exponentials makes stability of coproducts automatic, so here it is the case that condition (5.10) holds automatically if C has Heyting implications (see Definition 5.10).
Example 5.19. The prop-category Set of Example 5.2 possesses adjoints for all pullback operations and in particular for pullback along projections. Specifically, for A ⊆ I × X we have
$$\textstyle\bigwedge_{I,X}(A) = \{\, i \in I \mid \forall x \in X.\ (i,x) \in A \,\}, \qquad \textstyle\bigvee_{I,X}(A) = \{\, i \in I \mid \exists x \in X.\ (i,x) \in A \,\}. \tag{5.11}$$
These adjoints are easily seen to be natural in I, and since Set has Heyting (indeed Boolean) implications, as noted above the Frobenius reciprocity condition (5.10) holds automatically. Thus Set has both universal and existential quantification. Given the above formulas for the adjoints, it is clear that the remark in Example 5.13 about the coincidence of the categorical semantics in Set with the usual, Tarskian semantics of formulas continues to hold in the presence of quantified formulas.
Example 5.20. The prop-category Kl of Example 5.4 also has universal and existential quantification. Given A ∈ Prop_Kl(I × X), suppose A is represented by the function p : I × X → P(N). Then ⋀_{I,X}(A) and ⋁_{I,X}(A) can be represented respectively by the functions
defined by
Some calculations with partial recursive functions show that these formulas do indeed yield the required adjoints, and that they are natural in I. Since we noted in Example 5.15 that Kl has Heyting implications, the Frobenius reciprocity condition (5.10) holds automatically.
Example 5.21. The prop-category Cpo of Example 5.3 possesses natural right adjoints to pulling back along projections: they are given just as for Set by the formula (5.11), since this is an inclusive subset when A is. Cpo also possesses left adjoints to pulling back along projections: ⋁_{I,X}(A) is given by the smallest inclusive subset of I containing {i ∈ I | ∃x ∈ X. (i, x) ∈ A} (i.e. by the intersection of all inclusive subsets containing that set). However, these left adjoints do not satisfy the Beck-Chevalley condition (5.7). For if this condition did hold, for any i ∈ I we could apply it with I′ a one-element ω-cpo and f : I′ → I the function mapping the unique element of I′ to i, to conclude that i ∈ ⋁_{I,X}(A) if and only if (i, x) ∈ A for some x ∈ X. In other words, the set in (5.11) would already be inclusive when A is inclusive. But this is by no means the case in general. (For example, consider when I is the successor ordinal ω⁺ and X is the discrete ω-cpo with the same set of elements. Then A = {(m, n) | m < n} is an inclusive subset of I × X, but (5.11) is ω, which is not an inclusive subset of ω⁺.) Thus the prop-category Cpo has universal quantification, but not existential quantification.
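For the prop-category Set of Example 5.19, the adjointness, Beck-Chevalley and Frobenius conditions can be verified exhaustively on small finite sets. The following sketch is an illustration only: the function names (weaken, forall, exists, reindex) and the chosen sets I, X and function f are assumptions, and the Frobenius condition is checked in the standard form B ∧ ∃(A) = ∃(π1*(B) ∧ A), which is assumed to be the content of (5.10).

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = list(X)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def weaken(I, X, B):
    """Pullback of B (a subset of I) along the projection I x X -> I."""
    return frozenset((i, x) for i in I for x in X if i in B)

def forall(I, X, A):
    """Right adjoint to weaken: {i | for all x, (i, x) in A}."""
    return frozenset(i for i in I if all((i, x) in A for x in X))

def exists(I, X, A):
    """Left adjoint to weaken: {i | for some x, (i, x) in A}."""
    return frozenset(i for i in I if any((i, x) in A for x in X))

def reindex(f, X, A):
    """Pullback of A (a subset of I x X) along f x id, for f : I' -> I."""
    return frozenset((j, x) for j in f for x in X if (f[j], x) in A)

# Illustrative data (assumed).
I, X = [0, 1], ["a", "b"]
f = {"u": 0, "v": 0, "w": 1}          # f : I' -> I
props_I = powerset(I)
props_IX = powerset([(i, x) for i in I for x in X])

right_adjoint = all((weaken(I, X, B) <= A) == (B <= forall(I, X, A))
                    for A in props_IX for B in props_I)
left_adjoint = all((exists(I, X, A) <= B) == (A <= weaken(I, X, B))
                   for A in props_IX for B in props_I)
frobenius = all((B & exists(I, X, A)) == exists(I, X, weaken(I, X, B) & A)
                for A in props_IX for B in props_I)
beck_chevalley = all(
    frozenset(j for j in f if f[j] in q(I, X, A))
    == q(list(f), X, reindex(f, X, A))
    for A in props_IX for q in (forall, exists))
```

The Beck-Chevalley check says that quantifying and then substituting along f gives the same subset as substituting along f × id and then quantifying, which is exactly what fails for the left adjoints in Cpo in Example 5.21.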
Fig. 8. 'Adjoint' rule for equality
5.6 Equality
We have seen that the categorical semantics of the propositional connectives and quantifiers provides a characterization of these logical operations in terms of various categorical adjunctions. In this section we show that, within the context of first-order logic, the same is true of equality predicates. (As with so much of categorical logic, this observation originated with Lawvere [1970].) The formation rule for equality formulas is
Natural deduction rules for introducing and eliminating such formulas are given in Fig. 7 (cf. [Nordström et al., 1990, Section 8.1]). The usual properties of equality, such as reflexivity, symmetry, and transitivity, can be derived from these rules. It is not hard to see that modulo the structural rules of Fig. 2, the natural deduction rules are interderivable with an 'adjoint' formulation given by the bidirectional rule of Fig. 8. Suppose C is a prop-category with finite meets. To interpret equality formulas whilst respecting the structural rules (Weaken) and (Subst) of Fig. 2, for each C-object X we need a C-property
so that we can define
Given the definition of satisfaction of sequents in section 5.3, for the soundness of (=-Adj) we need
for all C-objects I, X and all C-properties A ∈ Prop_C(I × X) and B ∈ Prop_C(I × (X × X)). Here Δ_X is the diagonal morphism ⟨id_X, id_X⟩ : X → X × X, and π1 and π2 are the product projections X × X → X and I × (X × X) → X × X. Condition (5.12) says that for each A ∈ Prop_C(I × X), the value of the left adjoint to (id_I × Δ_X)* exists at A and is equal to (id_I × π1)*(A) ∧ π2*(Eq_X). In particular, taking I to be the terminal object in C and A = ⊤, we have that Eq_X is the value of the left adjoint to Δ_X* at the top element ⊤_X ∈ Prop_C(X), i.e.
$$\mathrm{Eq}_X \le B \quad\text{if and only if}\quad \top_X \le \Delta_X^*(B) \tag{5.13}$$
for all B ∈ Prop_C(X × X). Note that the C-properties Eq_X are uniquely determined by this requirement.
Definition 5.22. Let C be a prop-category with finite meets. We say that C has equality if for each C-object X, the value Eq_X of the left adjoint to Δ_X* at the top element ⊤_X ∈ Prop_C(X) exists and satisfies (5.12).
Thus we have shown that if C is a prop-category with finite meets and equality, then the sequents that are satisfied by a structure in C are closed under the equality rules in Fig. 7.
Remark 5.23. In fact when C also has Heyting implications and universal quantification, then property (5.12) is automatic if Eq_X exists satisfying (5.13). This observation corresponds to the fact that in the presence of implication and universal quantification, the rule (=-Adj) is interderivable with a simpler rule without 'parameters':
Example 5.24. The prop-category Set of Example 5.2 has equality. Since we have seen in previous sections that Set has implication and universal quantification, by Remark 5.23 it suffices to establish the existence of the left adjoint to Δ_X* at ⊤_X. But for this, clearly we can take Eq_X ⊆ X × X to be the identity relation {(x, x) | x ∈ X}. Similarly for the prop-category Kl of Example 5.4: since it has implication and universal quantification, to see that it also has equality we just have to verify (5.13). This can be done by taking Eq_X to be the Kl-property represented by the function δ_X : X × X → P(N) defined by
Example 5.25. We saw in Example 5.14 that the prop-category Cpo does not have Heyting implications. Hence condition (5.12) is not necessarily implied by the special case (5.13) in this case. Nevertheless, we can satisfy (5.12) by taking each Eq_X to be the inclusive subset {(x, x) | x ∈ X} of the ω-cpo X × X. Thus Cpo does have equality.
Example 5.26. Let C be a category with finite limits, made into a prop-category (with finite meets) as in Example 5.5. Then C has equality, with Eq_X the subobject of X × X represented by the diagonal monomorphism Δ_X = ⟨id_X, id_X⟩. Note that in this case, definition (5.11) implies that the interpretation [[(M =_σ M′)[Γ]]] of an equality formula is the subobject of [[Γ]] represented by the equalizer of the pair of C-morphisms [[M[Γ]]], [[M′[Γ]]] : [[Γ]] → [[σ]].
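The adjoint characterization of equality can likewise be tested exhaustively in Set on small finite sets. The sketch below is an illustration under assumptions: the names EQ, diag_pullback, lhs and rhs are hypothetical, elements of I × (X × X) are encoded as nested pairs (i, (x, y)), and the parameterized condition is checked in the form "(id × π1)*A ∧ π2*Eq ≤ B if and only if A ≤ (id × Δ)*B", which is the reading of (5.12) suggested by the surrounding text.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = list(X)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

# Illustrative data (assumed).
I, X = [0, 1], ["a", "b"]
pairs = [(x, y) for x in X for y in X]
EQ = frozenset((x, x) for x in X)          # the identity relation on X

def diag_pullback(B):
    """Pullback of B (a subset of X x X) along the diagonal X -> X x X."""
    return frozenset(x for x in X if (x, x) in B)

def lhs(A):
    """(id x pi1)*A meet pi2*EQ, a subset of I x (X x X)."""
    return frozenset((i, (x, y)) for i in I for (x, y) in pairs
                     if (i, x) in A and (x, y) in EQ)

def rhs(B):
    """(id x diagonal)*B, a subset of I x X."""
    return frozenset((i, x) for i in I for x in X if (i, (x, x)) in B)

# Parameter-free condition: EQ <= B iff top <= diagonal*(B).
simple_condition = all((EQ <= B) == (frozenset(X) <= diag_pullback(B))
                       for B in powerset(pairs))

# Condition with parameters I.
props_IX = powerset([(i, x) for i in I for x in X])
props_IXX = powerset([(i, p) for i in I for p in pairs])
parameterized_condition = all((lhs(A) <= B) == (A <= rhs(B))
                              for A in props_IX for B in props_IXX)
```

Both checks succeed because a subset of X × X contains the identity relation exactly when it contains every pair (x, x), which is the set-theoretic content of the adjunction.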
Remark 5.27 (Generalized quantifiers). Suppose that C is a prop-category with finite meets, Heyting implications, and both universal and existential quantification. Then not only do the pullback functions along product projections have adjoints, but in fact for any C-morphism f : X → Y, the monotone function f* : Prop_C(Y) → Prop_C(X) has both left and right adjoints, which will be denoted ⋁_f and ⋀_f respectively. Indeed, for B ∈ Prop_C(Y) we can define
It is easier to see what these expressions mean and to prove that they have the required adjointness properties if we work in a suitable internal language for the prop-category C. The signature of such an internal language is like that discussed in Section 2.3, but augmented with relation symbols R ⊆ X1 × ··· × Xn for each C-property R ∈ Prop_C(X1 × ··· × Xn) (for each tuple of C-objects X1, ..., Xn). Using the obvious structure in C for this signature, one can describe C-properties using the interpretation of formulas over the signature; and relations between such C-properties can be established by proving sequents in the predicate calculus and then appealing to the soundness results we have established. From this perspective ⋁_f and ⋀_f are the interpretation of formulas that are generalized quantifiers, adjoint to substitution:
The adjointness properties
can be deduced from the fact that the following bidirectional rules are derivable from the natural deduction rules for &, ⇒, ∃, ∀, =:
We required the adjoints V/ x( = V* ) an ~ Th ' holds if and only if both <j> (- ' [F] and $ h (j) [F] are theorems of Th. The partial order on -Prop^f T/I) (F) is that induced by Th-provable entailment: A < A' holds if and only if for some (indeed, any) formulas and 4>' representing the equivalence classes A and A' respectively, is a theorem of Th. To complete the definition of the prop-category structure (cf. Definition 5.1), we have to define the action of pulling back a C£(Th)-property along a morphism. This is induced by the operation of substituting terms for variables in formulas. Given A € Propcl(Th) (F) and 7 : F' —> F with
The properties of monotonicity and functoriality of $(-)^*$ in Definition 5.1 are easily verified (using the definition of identities and composition in Cl(Th) given in section 4.2).

Proposition 5.29. The classifying prop-category Cl(Th) has finite meets, finite joins, Heyting implication, universal and existential quantification, and equality.
Categorical logic
Proof. The propositional operations are induced by the corresponding logical connectives. Thus for any object $\Gamma$ and Cl(Th)-properties $A = [\phi]$ and $A' = [\phi']$ of $\Gamma$, we have:
Similarly, quantification is induced by the corresponding operation on formulas. Thus if $A = [\phi] \in \mathrm{Prop}_{Cl(Th)}(\Delta \times \Gamma)$, so that $\phi$ prop $[\Delta, \Gamma]$, then
if F = [xi :CTI, . . . , xn : TYPES TERMS^TERMS TERMS5-»TERMS.
The axioms are as follows:
where
It is evident from this example that the formal requirement in the introductory axiom of a function symbol (such as that for Comp) that all variables in the context occur explicitly as arguments of the function is at variance with informal practice. See [Cartmell, 1986, Section 10] for a discussion of this issue.

Remark 6.6 (Substitution and weakening). A general rule for substituting along a context morphism is derivable from the rules in Figs 9 and 10, viz.
where $J$ is one of the four forms '$e$ type', '$e = e'$', '$e : e'$', or '$e = e' : e''$', and $\vec{y}$ is the list of variables in $\Gamma'$. A special case of this is a general derived rule for weakening contexts:
The rules for substitution in Fig. 9 also have as special cases forms which correspond more closely to the substitution rule (2.9) for simply typed equational logic, viz:
6.2 Classifying category of a theory
We are now going to construct a category, Cl(Th), out of the syntax of a dependently typed algebraic theory Th. In fact the construction is essentially just as in section 4.2. More accurately, the construction in that section is the special case of the one given here when the theory has only 0-ary type-valued function symbols. We will use the following terminology with respect to the judgements which are theorems of Th. Say that $\Gamma$ is a Th-context if the judgement $\Gamma$ ctxt is a theorem of Th; say that Th-contexts $\Gamma$ and $\Gamma'$ are Th-provably equal if the judgement $\Gamma = \Gamma'$ is a theorem of Th; say that $\gamma : \Gamma \to \Gamma'$ is a morphism of Th-contexts if the judgement $\gamma : \Gamma \to \Gamma'$ is a theorem of Th; and finally, say that two Th-context morphisms are Th-provably equal if $\gamma = \gamma' : \Gamma \to \Gamma'$ is a theorem of Th.

Objects of Cl(Th). The collection of objects of the classifying category is the quotient of the set of Th-contexts by the equivalence relation of being Th-provably equal. This definition makes sense because the following rules are derivable from those in Figs 9 and 10:
We will tend not to distinguish notationally between a Th-context and the object of Cl(Th) that it determines.

Morphisms of Cl(Th). Given two objects in Cl(Th), represented by Th-contexts $\Gamma$ and $\Gamma'$ say, the collection of morphisms in the classifying category from the first object to the second is the quotient of the set of Th-context morphisms $\Gamma \to \Gamma'$ by the equivalence relation of being Th-provably equal. This definition makes sense because the following rules are derivable from those in Figs 9 and 10:
We will tend not to distinguish notationally between a Th-context morphism and the morphism of Cl(Th) which it determines.

Composition in Cl(Th). Given context morphisms $\gamma : \Gamma \to \Gamma'$ and $\gamma' : \Gamma' \to \Gamma''$, define their composition $\gamma' \circ \gamma : \Gamma \to \Gamma''$ just as in section 4.2, viz.
The derived rules for substitution mentioned in Remark 6.6 can be used to show that the following derived rules for composition are valid.
From these rules it follows that a well-defined and associative operation of composition is induced on the morphisms of the classifying category. The identity morphisms for this composition are induced by the identity context morphisms $\mathrm{id}_\Gamma : \Gamma \to \Gamma$, which are defined just as in section 4.2:
\[ \mathrm{id}_\Gamma = [x_1, \ldots, x_n] \]
where $x_1, \ldots, x_n$ are the variables listed in $\Gamma$. These context morphisms do induce morphisms in Cl(Th) which are units for composition because the following rules are derivable:
This completes the definition of Cl(Th) as a category. We now move on to examine what extra categorical structure it possesses.
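The following sketch illustrates, under simplifying assumptions, how composition and identities of context morphisms arise from substitution. It deliberately ignores types and provable equality: a context is reduced to its list of variable names, a context morphism to a list of raw terms, and composition is assumed to be simultaneous substitution of the terms of the first morphism for the variables of the middle context. The term encoding and all names are hypothetical.

```python
def subst(term, env):
    """Simultaneously replace variables in term according to env."""
    if isinstance(term, str):          # a string stands for a variable
        return env.get(term, term)
    head, *args = term                 # a function symbol applied to arguments
    return (head, *(subst(a, env) for a in args))

def compose(gamma2, mid_vars, gamma1):
    """gamma2 after gamma1, where mid_vars lists the variables of the
    middle context (the codomain of gamma1 and domain of gamma2)."""
    env = dict(zip(mid_vars, gamma1))
    return [subst(t, env) for t in gamma2]

def identity(variables):
    """The identity context morphism: the list of the context's variables."""
    return list(variables)

# Hypothetical untyped example with three composable context morphisms.
xs, us, ws = ["x", "y"], ["u", "v"], ["w"]
g1 = [("f", "x"), "y"]        # from the context with variables xs to us
g2 = [("g", "u", "v")]        # from us to ws
g3 = [("h", "w"), "w"]        # from ws to a two-variable context
```

In this untyped skeleton the unit and associativity laws hold on the nose; in Cl(Th) they hold only up to Th-provable equality, which is why the derived rules mentioned above are needed.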
6.3 Type-categories
For a simply typed algebraic theory, we saw in section 4 that the relevant categorical structure on the corresponding classifying category was finite products. Here it will turn out to be the more general³ property of possessing a terminal object and some pullbacks. To explain which pullbacks, we identify a special class of morphisms in the classifying category of Th.

Definition 6.7. If $\Gamma$ is a Th-context, the collection of $\Gamma$-indexed types in Cl(Th) is defined to be the quotient of the set of types $\sigma$ such that $\sigma$ type $[\Gamma]$ is a theorem of Th, with respect to the equivalence relation identifying $\sigma$ and $\sigma'$ just if $\sigma = \sigma'\ [\Gamma]$ is a theorem of Th. As usual, we will not make a notational distinction between $\sigma$ and the $\Gamma$-indexed type it determines. Each such $\Gamma$-indexed type has associated with it a projection morphism, represented by the context morphism
\[ \pi_\sigma = [x_1, \ldots, x_n] : [\Gamma, x : \sigma] \to \Gamma \]
Here $x_1, \ldots, x_n$ are the variables listed in $\Gamma$, and $x$ is any other variable. Note that the object in Cl(Th) represented by $[\Gamma, x : \sigma]$ and the morphism represented by $\pi_\sigma$ are independent of which particular variable $x$ is chosen.

Lemma 6.8. Given a morphism $\gamma : \Gamma' \to \Gamma$ in Cl(Th) and a $\Gamma$-indexed type represented by $\sigma$, then
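is a pullback square in Cl(Th).

Proof. Write $[\gamma, x'] : [\Gamma', x' : \sigma[\gamma/\vec{x}]] \to [\Gamma, x : \sigma]$ for the context morphism given by the list $\gamma$ followed by the variable $x'$. Suppose we are given $\gamma' : \Gamma'' \to \Gamma'$ and $\gamma'' : \Gamma'' \to [\Gamma, x : \sigma]$ in Cl(Th) satisfying $\gamma \circ \gamma' = \pi_\sigma \circ \gamma'' : \Gamma'' \to \Gamma$. We have to prove that there is a unique morphism $\delta : \Gamma'' \to [\Gamma', x' : \sigma[\gamma/\vec{x}]]$ satisfying $\pi_{\sigma[\gamma/\vec{x}]} \circ \delta = \gamma'$ and $[\gamma, x'] \circ \delta = \gamma''$. Now since $\pi_\sigma \circ \gamma'' = \gamma \circ \gamma'$, we must have that the list of terms $\gamma''$ is of the form $[\gamma \circ \gamma', N]$ for some term $N$ for which $N : \sigma[\gamma \circ \gamma'/\vec{x}]\ [\Gamma'']$ is a theorem of Th. Now since $\sigma[\gamma/\vec{x}][\gamma'/\vec{x}'] = \sigma[\gamma \circ \gamma'/\vec{x}]$ (where $\vec{x}'$ is the list of variables in $\Gamma'$), we get a morphism $\delta = [\gamma', N] : \Gamma'' \to [\Gamma', x' : \sigma[\gamma/\vec{x}]]$ satisfying $\pi_{\sigma[\gamma/\vec{x}]} \circ \delta = \gamma'$ and $[\gamma, x'] \circ \delta = \gamma''$, as required. If $\delta' : \Gamma'' \to [\Gamma', x' : \sigma[\gamma/\vec{x}]]$ were any other such morphism, then from the requirement $\pi_{\sigma[\gamma/\vec{x}]} \circ \delta' = \gamma'$ we conclude that the list $\delta'$ is of the form $[\gamma', N']$; and then from the requirement $[\gamma, x'] \circ \delta' = \gamma''$ we conclude further that $N' = N$, so that $\delta' = \delta$. □

³ Since finite products can be constructed from pullbacks and a terminal object.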
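The above lemma motivates the following definition, which is essentially Cartmell's (unpublished) notion of 'category with attributes'.

Definition 6.9. A type-category, C, is a category possessing a terminal object, 1, and equipped with the following structure:
• For each object X in C, a collection $\mathrm{Type}_C(X)$, whose elements will be called X-indexed types in C.
• For each object X in C, operations assigning, to each X-indexed type A, an object $X \ltimes A$, called the total object of A, together with a morphism $\pi_A : X \ltimes A \to X$, called the projection morphism of A.
• For each morphism $f : Y \to X$ in C, an operation assigning to each X-indexed type A a Y-indexed type $f^*A$, called the pullback of A along f, together with a morphism $f \ltimes A : Y \ltimes f^*A \to X \ltimes A$ making the following a pullback square in the category C:
\[
\begin{array}{ccc}
Y \ltimes f^*A & \xrightarrow{\; f \ltimes A \;} & X \ltimes A \\
{\scriptstyle \pi_{f^*A}} \downarrow & & \downarrow {\scriptstyle \pi_A} \\
Y & \xrightarrow{\quad f \quad} & X
\end{array}
\tag{6.5}
\]
In addition, the following strictness conditions will be imposed on these operations (for all $g : Z \to Y$ and $f : Y \to X$):
\[ \mathrm{id}_X^*A = A, \qquad \mathrm{id}_X \ltimes A = \mathrm{id}_{X \ltimes A}, \]
\[ g^*(f^*A) = (f \circ g)^*A, \qquad (f \ltimes A) \circ (g \ltimes f^*A) = (f \circ g) \ltimes A. \]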
Notation 6.10. If C is a type-category and X is a C-object, then as usual C/X will denote the slice category whose objects are the C-morphisms with codomain X, $f : \mathrm{dom}(f) \to X$, and whose morphisms $f \to f'$ are C-morphisms $g : \mathrm{dom}(f) \to \mathrm{dom}(f')$ satisfying $f' \circ g = f$. Identity morphisms and composition in C/X are inherited from those in C. Note that for each $A \in \mathrm{Type}_C(X)$, the projection morphism $\pi_A$ is an object in the slice category C/X. So we can make $\mathrm{Type}_C(X)$ into a category (equivalent to a full subcategory of C/X) by taking morphisms $A \to A'$ to be C/X-morphisms $\pi_A \to \pi_{A'}$ (with identities and composition inherited from C/X).

Proposition 6.11. The classifying category, Cl(Th), of a dependently typed algebraic theory Th is a type-category.

Proof. We identified notions of indexed type and associated projection morphisms for Cl(Th) in Definition 6.7 and verified in Lemma 6.8 the existence of pullback squares of the required form (6.5). The 'strictness' conditions in Definition 6.9 are met because of the usual properties of substitution of terms for variables. Finally, note that Cl(Th) does have a terminal object, namely the (equivalence class of the) empty context [ ]. □

Here are some examples of 'naturally occurring' type-categories.

Example 6.12 (Sets). The category Set of sets and functions supports the structure of a type-category: for each set X, we can take the X-indexed types to be set-valued functions with domain X. Given such a function $A : X \to Set$, the associated total set is the disjoint union
\[ X \ltimes A = \{(x, a) \mid x \in X \text{ and } a \in A(x)\} \]
with projection function $\pi_A$ given by first projection $(x, a) \mapsto x$. The pullback of A along a function $f : Y \to X$ is just the composition
\[ f^*A \overset{\mathrm{def}}{=} A \circ f : Y \to Set \]
which indeed results in a pullback square in Set of the required form (6.5) and satisfying the strictness conditions, with $f \ltimes A$ equal to the function $(y, a) \mapsto (f(y), a)$. Note the following property of this example, which is not typical of type-categories in general: up to bijection over X, any function with codomain X, $f : Y \to X$, can be expressed as the projection function associated with an indexed type:
\[ A(x) = f^{-1}\{x\} = \{y \in Y \mid f(y) = x\} \]
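To make Example 6.12 concrete, here is a small sketch for finite sets. It assumes that an X-indexed type is encoded as a dictionary from elements of X to Python sets, and that functions between finite sets are dictionaries; these encodings, the helper names, and the data are illustrative assumptions rather than anything fixed by the text.

```python
def total(X, A):
    """The total set of A: all pairs (x, a) with x in X and a in A(x)."""
    return {(x, a) for x in X for a in A[x]}

def proj(pair):
    """The projection function: first component of a pair."""
    return pair[0]

def pullback_type(f, Y, A):
    """The pullback of A along f, i.e. the composite A after f."""
    return {y: A[f[y]] for y in Y}

def lift(f, pair):
    """The function on total sets sending (y, a) to (f(y), a)."""
    y, a = pair
    return (f[y], a)

# Hypothetical example data.
X = {0, 1}
A = {0: {"p"}, 1: {"q", "r"}}
Y = {"u", "v", "w"}
f = {"u": 1, "v": 1, "w": 0}
fA = pullback_type(f, Y, A)

# The set-theoretic pullback of f against the projection of A,
# computed directly as pairs (y, p) with f(y) equal to proj(p).
cone = {(y, p) for y in Y for p in total(X, A) if f[y] == proj(p)}
```

Pairing the projection with `lift` identifies the total set of `fA` with `cone`, which is one way to see that the square is a pullback; the strictness conditions hold because composing dictionaries is strictly associative.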
Example 6.13 (Constant families). Every category C with finite products can be endowed with the structure of a type-category, in which the indexed types are 'constant families'. For each object X in C, one defines $\mathrm{Type}_C(X)$ to be $\mathrm{Obj}_C$, the class of all objects of C. The associated operations are
\[ X \ltimes A = X \times A, \qquad \pi_A = \pi_1 : X \times A \to X. \]
Given $f : Y \to X$ in C and $A \in \mathrm{Type}_C(X) = \mathrm{Obj}_C$, the pullback Y-indexed type $f^*A$ is just the object A itself and the morphism $f \ltimes A : Y \times A \to X \times A$ is $f \times \mathrm{id}_A$. This does yield a pullback square in C, viz.
and these pullbacks do satisfy the strictness conditions in Definition 6.9. When a category with finite products is regarded as a type-category in this way, the categorical semantics of simply typed algebraic theories given in section 2 becomes essentially a special case of the semantics of dependently typed theories to be given below.

Example 6.14 (Toposes). The following example is given for readers familiar with the notion of topos (see the references in section 7). One
can make each topos $\mathcal{E}$ into a type-category as follows. First note that $\mathcal{E}$ is indeed a category with a terminal object. For each object $X \in \mathrm{Obj}_\mathcal{E}$, define $\mathrm{Type}_\mathcal{E}(X)$ to consist of all pairs $(A, a)$ where $A \in \mathrm{Obj}_\mathcal{E}$ and $a : X \times A \to \Omega$. Here $\Omega$ denotes the codomain of the subobject classifier of $\mathcal{E}$, $\top : 1 \to \Omega$. Given such an X-indexed type, the total object $X \ltimes (A, a)$ is given by forming a pullback square (6.6) in $\mathcal{E}$, namely the pullback of $\top$ along $a$, which exhibits $X \ltimes (A, a)$ as a subobject of $X \times A$.

The projection morphism associated to $(A, a)$ is defined by the composition of this subobject inclusion $X \ltimes (A, a) \to X \times A$ with the first projection $\pi_1 : X \times A \to X$.

For any morphism $f : Y \to X$, $f^*(A, a)$ is defined to be the Y-indexed type $(A, a \circ (f \times \mathrm{id}_A))$. The pullback square (6.5) is obtained as the factorization through (6.6) of the corresponding pullback for $a \circ (f \times \mathrm{id}_A)$. Elementary properties of pullback squares ensure that the strictness conditions of Definition 6.9 hold, without the need for any coherence assumptions on the chosen pullback operations for $\mathcal{E}$.

Traditionally, dependently typed theories have been interpreted in toposes using the locally cartesian closed structure that they possess, whereby the category $\mathrm{Type}_\mathcal{E}(X)$ of X-indexed types (cf. Notation 6.10) is just the slice category $\mathcal{E}/X$: see [Seely, 1984]. Such a choice for indexed types yields a 'non-strict' type-category structure, in that the strictness conditions of Definition 6.9 have to be weakened by replacing the equalities by coherent isomorphisms. This can lead to complications in the interpretation of type theory: see [Curien, 1990]. The method of regarding a topos as a type-category given here achieves the strictness conditions and side-steps issues of coherence. Given that toposes correspond to theories in intuitionistic higher-order logic (cf. [Bell, 1988], [Mac Lane and Moerdijk, 1992]), this example can form the basis for interpreting dependently typed theories in higher-order predicate logic: see [Jacobs and Melham, 1993]. However, Hofmann [1995] has recently pointed out that using a particular construction of Bénabou (for producing a split fibration equivalent to a given fibration), an arbitrary locally cartesian closed category, $\mathcal{E}$, can be endowed with a more involved notion of X-indexed type which makes $\mathcal{E}$ a type-category (supporting the interpretation of product, sum, and identity types) and with $\mathrm{Type}_\mathcal{E}(X)$ equivalent to $\mathcal{E}/X$ (pseudo-naturally in X).
Example 6.15 (Split fibrations). Let Cat denote the category of small categories and functors. We can turn it into a type-category by decreeing that for each small category, C, the C-indexed types are functors $A : C^{op} \to Cat$. The associated total category $C \ltimes A$ is given by a construction due to Grothendieck:
• An object of $C \ltimes A$ is a pair $(X, A)$, with X an object of C and A an object of $A(X)$.
• A morphism in $C \ltimes A$ from $(X, A)$ to $(X', A')$ is a pair $(x, a)$, where $x : X \to X'$ in C and $a : A \to A(x)(A')$ in $A(X)$.
The composition of two such morphisms $(x, a) : (X, A) \to (X', A')$ and $(x', a') : (X', A') \to (X'', A'')$ is given by the pair $(x' \circ x, A(x)(a') \circ a)$. The identity morphism for the object $(X, A)$ is given by $(\mathrm{id}_X, \mathrm{id}_A)$. The projection functor $\pi_A : C \ltimes A \to C$ is of course given on both objects and morphisms by projection onto the first coordinate. The pullback of the indexed type $A : C^{op} \to Cat$ along a functor $F : D \to C$ is just given by composing A with F, regarded as a functor $D^{op} \to C^{op}$. Applying the Grothendieck construction to A and to $A \circ F$, one obtains a pullback square in the category Cat of the required form, with $F \ltimes A : D \ltimes (A \circ F) \to C \ltimes A$ the functor which acts on objects and morphisms by applying F in the first coordinate.

Regarding a set as a discrete category, there is an inclusion between the type-category of Example 6.12 and the present one. Unlike that example, not every morphism in Cat arises as the first projection of an indexed type. Those that do were characterized by Grothendieck and are known as 'split Grothendieck fibrations'. As the name suggests, these are a special instance of the more general notion of 'Grothendieck fibration', which would give a more general way of making Cat into a type-category, except that the 'strictness' conditions in Definition 6.9 are not satisfied. See [Jacobs, 1999] for more information on the use of Grothendieck fibrations in the semantics of type theory.
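A full implementation of the Grothendieck construction needs a representation of small categories, so the sketch below restricts attention to the special case where the base category and all fibres are finite posets (categories with at most one morphism between two objects). Under that assumption a morphism $(X, A) \to (X', A')$ exists exactly when $X \le X'$ and $A \le A(X \le X')(A')$, and composition is automatic once this relation is transitive. The base poset, fibres, and reindexing maps are hypothetical example data.

```python
from itertools import product

base = [0, 1]                           # the poset 0 <= 1, viewed as a category
fibre = {0: [0, 1], 1: [0, 1, 2, 3]}    # a finite chain over each base object

def base_leq(x, x2):
    """Order of the base poset."""
    return x <= x2

def reindex(x, x2, a2):
    """The monotone map from the fibre over x2 to the fibre over x,
    defined for x <= x2; identities are sent to identity maps."""
    return a2 if x == x2 else min(a2, 1)

def total_objects():
    """Objects of the total poset: pairs (X, A) with A in the fibre over X."""
    return [(x, a) for x in base for a in fibre[x]]

def total_leq(p, p2):
    """There is a morphism p -> p2 in the total poset."""
    (x, a), (x2, a2) = p, p2
    return base_leq(x, x2) and a <= reindex(x, x2, a2)

def is_preorder():
    """Reflexivity and transitivity, i.e. identities and composition exist."""
    objs = total_objects()
    reflexive = all(total_leq(p, p) for p in objs)
    transitive = all(total_leq(p, r)
                     for p, q, r in product(objs, repeat=3)
                     if total_leq(p, q) and total_leq(q, r))
    return reflexive and transitive
```

Transitivity here relies on the reindexing maps being monotone and functorial, mirroring the role that functoriality of A plays in the composition formula $(x' \circ x, A(x)(a') \circ a)$.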
6.4 Categorical semantics
We will now give the definition of the semantics of the types, terms, contexts, and context morphisms of Th in a type-category C. In general, contexts are interpreted as C-objects, context morphisms as C-morphisms, and types as indexed types in C. Terms are interpreted using the following notion of 'global section' of an indexed type.

Definition 6.16. Given an object X in a type-category C, the global sections of an X-indexed type $A \in \mathrm{Type}_C(X)$ are the morphisms $a : X \to X \ltimes A$ in C satisfying $\pi_A \circ a = \mathrm{id}_X$. We will write
\[ a \in_X A \]
to indicate that a is a global section of the X-indexed type A. For any morphism $f : Y \to X$, using the universal property of the pullback square (6.5) we get $f^*a \in_Y f^*A$, where $f^*a : Y \to Y \ltimes f^*A$ is the unique morphism with $\pi_{f^*A} \circ f^*a = \mathrm{id}_Y$ and $(f \ltimes A) \circ f^*a = a \circ f$.
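In the type-category of sets (Example 6.12), a global section picks one element of A(x) for each x, and pulling it back along f simply precomposes with f. The sketch below checks the two equations characterizing the pulled-back section on a small example; the dictionary encodings, helper names, and data are assumptions made for illustration.

```python
def is_section(X, A, a):
    """a maps each x to a pair (x, v) with v in A(x), so that the
    projection after a is the identity on X."""
    return all(a[x][0] == x and a[x][1] in A[x] for x in X)

def pullback_section(f, Y, a):
    """The section of the pulled-back type sending y to (y, v),
    where (f(y), v) is the value of a at f(y)."""
    return {y: (y, a[f[y]][1]) for y in Y}

def lift(f, pair):
    """The map on total sets sending (y, v) to (f(y), v)."""
    y, v = pair
    return (f[y], v)

# Hypothetical example data.
X = {0, 1}
A = {0: {"p"}, 1: {"q", "r"}}
a = {0: (0, "p"), 1: (1, "r")}
Y = {"u", "v", "w"}
f = {"u": 1, "v": 1, "w": 0}
fA = {y: A[f[y]] for y in Y}      # the pullback of A along f
fa = pullback_section(f, Y, a)    # the pullback of the section a
```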
Definition 6.17. Let Sg be a signature as in section 6.1. A structure for Sg in a type-category C is specified by:
• For each type-valued function symbol s, a C-object $X_s$ and an indexed type $A_s \in \mathrm{Type}_C(X_s)$.
• For each term-valued function symbol F, a C-object $X_F$, an indexed type $A_F \in \mathrm{Type}_C(X_F)$, and a global section $a_F \in_{X_F} A_F$.

Semantics of expressions. Given such a structure, we will define four relations:
where the judgements enclosed by $[\![\ ]\!]$ are all well-formed and where X is a C-object, $A \in \mathrm{Type}_C(X)$, $a \in_X A$, and f is a morphism from X' to X in C. These relations are defined inductively by the rules in Fig. 11. In the first rule concerning context morphisms, $\langle\rangle : X \to 1$ denotes the unique morphism from X to the terminal object. The two rules concerning the semantics of variables use the following pairing notation.

Notation 6.18. Referring to the pullback square (6.5), if $g : Z \to Y$ and $h : Z \to X \ltimes A$ are morphisms satisfying $f \circ g = \pi_A \circ h$, then
\[ \langle g, h \rangle : Z \to Y \ltimes f^*A \]
will denote the unique morphism satisfying $\pi_{f^*A} \circ \langle g, h \rangle = g$ and $(f \ltimes A) \circ \langle g, h \rangle = h$, whose existence is guaranteed by the universal property of the pullback square. Thus for example, in this notation the global section $f^*a \in_Y f^*A$ referred to in Definition 6.16 is given by the morphism $\langle \mathrm{id}_Y, a \circ f \rangle$.

It is clear from the form of the rules in Fig. 11 that the relations they define are monogenic, in the sense that for a judgement $J$ and categorical structures $E$ and $E'$, if $[\![J]\!] \rightsquigarrow E$ and $[\![J]\!] \rightsquigarrow E'$, then $E = E'$ in C. Thus given a structure for Sg in C, the assignments
[Fig. 11. Categorical semantics: rules for Contexts, Terms, and Context morphisms]
determine partial functions. We will write $[\![J]\!]\Downarrow$ to indicate that these partial functions are defined at a particular argument $J$.

Remark 6.19. The case when $J$ is $x : \tau\ [\Delta]$ deserves some comment, since this is the most complicated case in the definition of $[\![J]\!]$. The well-formedness of $x : \tau\ [\Delta]$ guarantees that $x$ is assigned a type in $\Delta$, i.e. that $\Delta$ is of the form $[\Gamma, x : \sigma, \Gamma']$. However, the presence of type equality judgements in the logic means that $\sigma$ is not necessarily syntactically identical to $\tau$. This accounts for the presence of the hypotheses concerning $\tau$ in the two rules for this case in Fig. 11. The first of these two rules deals with the case when $\Gamma' = [\,]$, and then the second rule must be applied repeatedly to deal with general $\Gamma'$.

Satisfaction of judgements. Next we define what it means for the various forms of judgement to be satisfied by a structure for Sg in a type-category, C.
• $\Gamma$ ctxt is satisfied if and only if $[\![\Gamma\ \mathrm{ctxt}]\!]\Downarrow$.
• $\sigma$ type $[\Gamma]$ is satisfied if and only if $[\![\sigma\ \mathrm{type}\ [\Gamma]]\!]\Downarrow$.
• $M : \sigma\ [\Gamma]$ is satisfied if and only if $[\![M : \sigma\ [\Gamma]]\!]\Downarrow$.
• $\gamma : \Gamma \to \Gamma'$ is satisfied if and only if $[\![\gamma : \Gamma \to \Gamma']\!]\Downarrow$.
• $\Gamma = \Gamma'$ is satisfied if and only if for some (necessarily unique) object X, $[\![\Gamma\ \mathrm{ctxt}]\!] \rightsquigarrow X$ and $[\![\Gamma'\ \mathrm{ctxt}]\!] \rightsquigarrow X$.
• $\sigma = \sigma'\ [\Gamma]$ is satisfied if and only if for some (necessarily unique) object X and X-indexed type $A \in \mathrm{Type}_C(X)$, $[\![\sigma\ \mathrm{type}\ [\Gamma]]\!] \rightsquigarrow A\ [X]$ and $[\![\sigma'\ \mathrm{type}\ [\Gamma]]\!] \rightsquigarrow A\ [X]$.
• $M = M' : \sigma\ [\Gamma]$ is satisfied if and only if for some (necessarily unique) object X, indexed type $A \in \mathrm{Type}_C(X)$, and global section $a \in_X A$, $[\![M : \sigma\ [\Gamma]]\!] \rightsquigarrow a : A\ [X]$ and $[\![M' : \sigma\ [\Gamma]]\!] \rightsquigarrow a : A\ [X]$.
• $\gamma = \gamma' : \Gamma' \to \Gamma$ is satisfied if and only if for some (necessarily unique) morphism $f : X' \to X$ in C, $[\![\gamma : \Gamma' \to \Gamma]\!] \rightsquigarrow f : X' \to X$ and $[\![\gamma' : \Gamma' \to \Gamma]\!] \rightsquigarrow f : X' \to X$.
Suppose Th is a dependently typed algebraic theory. A model of Th in a type-category, C, is a structure in C for the underlying signature of Th with the property that the collection of judgements satisfied by the structure is closed under the rules for the axioms of Th given in Fig. 10.

Theorem 6.20 (Soundness). The collection of judgements that are satisfied by a structure in a type-category, C, is closed under the general rules of dependently typed equational logic given in Fig. 9. Consequently, a model in C of a dependently typed algebraic theory, Th, satisfies all the judgements which are theorems of Th.

Much as with the corresponding result for simply typed equational logic, the key tool used in proving the above theorem is a lemma giving the behaviour of substitution in the categorical semantics:

Lemma 6.21 (Semantics of substitution). Suppose that
Then
The generic model of a well-formed theory. Suppose that Th is a dependently typed algebraic theory that is well-formed, in the sense of Definition 6.3. Then the classifying category Cl(Th) contains a structure, G, for the underlying signature of Th. G is defined as follows, where we use a notation for the components of the structure as in Definition 6.17.

The well-formedness of Th implies that for each type-valued function symbol s, with introductory axiom $s(\vec{x})$ type $[\Gamma_s]$ say, $\Gamma_s$ is a Th-context and hence determines an object $X_s$ of Cl(Th). Then define the $X_s$-indexed type $A_s \in \mathrm{Type}_{Cl(Th)}(X_s)$ to be that represented by $s(\vec{x})$ (which is a $\Gamma_s$-indexed type because $s(\vec{x})$ type $[\Gamma_s]$ is a theorem of Th).

For each term-valued function symbol F with introductory axiom $F(\vec{x}) : \sigma_F\ [\Gamma_F]$, the well-formedness of Th implies that $\sigma_F$ type $[\Gamma_F]$ is a theorem of Th. Hence $\Gamma_F$ is a Th-context and hence determines an object $X_F$ of Cl(Th). Then define the $X_F$-indexed type $A_F \in \mathrm{Type}_{Cl(Th)}(X_F)$ to be that represented by the $\Gamma_F$-indexed type $\sigma_F$, and define the global section $a_F \in_{X_F} A_F$ to be the morphism represented by the Th-context morphism $[\vec{x}, F(\vec{x})] : \Gamma_F \to [\Gamma_F, x : \sigma_F]$.

The structure G has the following properties:
• For each judgement $\Gamma$ ctxt and Cl(Th)-object X, the relation $[\![\Gamma\ \mathrm{ctxt}]\!] \rightsquigarrow X$ holds if and only if $\Gamma$ ctxt is a theorem of Th and X is the object of Cl(Th) represented by $\Gamma$.
• For each judgement $\sigma$ type $[\Gamma]$, each object X and each X-indexed type $A \in \mathrm{Type}_{Cl(Th)}(X)$, the relation $[\![\sigma\ \mathrm{type}\ [\Gamma]]\!] \rightsquigarrow A\ [X]$ holds if and only if $\sigma$ type $[\Gamma]$ is a theorem of Th, X is the object of Cl(Th) represented by $\Gamma$, and A is the X-indexed type represented by $\sigma$.
• For each judgement $M : \sigma\ [\Gamma]$, each object X and each X-indexed type $A \in \mathrm{Type}_{Cl(Th)}(X)$, and each global section $a \in_X A$, the relation $[\![M : \sigma\ [\Gamma]]\!] \rightsquigarrow a : A\ [X]$ holds if and only if X is the object of Cl(Th) represented by $\Gamma$, A is the X-indexed type represented by $\sigma$, and a is the global section represented by the Th-context morphism $[\vec{x}, M] : \Gamma \to [\Gamma, x : \sigma]$.
• An equality judgement is satisfied by G if and only if it is a theorem of Th.
Thus G is a model of Th and a judgement is satisfied by this model if and only if it is a theorem of Th.

Remark 6.22. The classifying type-category Cl(Th) and the generic model G of a well-formed, dependently typed algebraic theory Th have a universal property analogous to that given for algebraic theories in Theorem 4.6, with respect to a suitable notion of morphism of type-categories. To go further and develop a correspondence between dependently typed algebraic theories and type-categories along the lines of section 4.3, one has to restrict to the subclass of 'reachable' type-categories. For note that an object X in a type-category C can appear as the interpretation of a context only if for some $n \ge 0$ there is a sequence of indexed types
\[ A_1 \in \mathrm{Type}_C(1),\ A_2 \in \mathrm{Type}_C(1 \ltimes A_1),\ \ldots,\ A_n \in \mathrm{Type}_C(1 \ltimes A_1 \ltimes \cdots \ltimes A_{n-1}) \]
with $X = 1 \ltimes A_1 \ltimes \cdots \ltimes A_n$. Call C reachable if there is such a sequence for every C-object. Every classifying category has this property. Conversely if C is reachable, then it is equivalent to Cl(Th) for a theory in a suitable 'internal language' for C (cf. section 2.3).
6.5 Dependent products
The categorical framework corresponding to dependently typed equational logic that we have given can be used to infer the categorical structure needed for various dependently typed datatype constructors, much as we did for simple types in section 3. We will just consider one example here, namely the structure in a type-category corresponding to dependent products. The formation, introduction, elimination, and equality rules for dependent product types are given in Fig. 12. Equality rules for both $\beta$- and $\eta$-conversion are given. There are also congruence rules for the constructors $\Pi$, $\lambda$, and ap, which we omit (cf. Remark 3.3). If C is a type-category, the existence of the pullback squares (6.5) allows us to define a functor between slice categories
\[ \pi_A^* : C/X \to C/(X \ltimes A) \tag{6.7} \]
given by pullback along $\pi_A : X \ltimes A \to X$:
[Fig. 12. Rules for dependent product types]
(This definition makes use of the pairing notation introduced in Notation 6.18.) Symmetrically, there is a functor
induced by pullback along
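Definition 6.23. A type-category C has dependent products if for each $A \in \mathrm{Type}_C(X)$ and $A' \in \mathrm{Type}_C(X \ltimes A)$ there is an indexed type $\Pi(A, A') \in \mathrm{Type}_C(X)$ (6.9) and a morphism $\mathrm{ap}_{A,A'}$ in $\mathrm{Type}_C(X \ltimes A)$ (6.10)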
satisfying the following:
• Adjointness property. $\pi_{\Pi(A,A')} : X \ltimes \Pi(A, A') \to X$ is the value of the right adjoint to the pullback functor (6.7) at $\pi_{A'} : X \ltimes A \ltimes A' \to X \ltimes A$, with counit $\mathrm{ap}_{A,A'}$. In other words, for any $f : Y \to X$ in C and $g$ : […] there is a unique morphism […] in C/X satisfying […].
• Strictness property. For any morphism […]
Here $f^*$ and $(f \ltimes A)^*$ are instances of the pullback functors (6.8). To specify the categorical semantics of dependent product types in such a type-category, we give rules extending the inductively defined relation of section 6.4:
Introduction
Note that the conclusion of this rule is well-formed if the hypotheses are. For if a' &XKA A', then a' so we can apply the adjointness property from Definition 6.23 to form cur(o') : idx — > ^n(A>i')' wmch is, in particular, a global section of H(A,A'). Elimination
If the hypotheses of this rule are well-formed, then a Ex A and b €x H(A,A'), Since
122 we
Andrew M. Pitts can use
the pairing operation of Notation
6.18 to form Since by definition and
a, we get a morphism (a, b) : hence by composition a morphism
pairing operation yields a morphism
whose composition with ' is idx and hence which is a global section of a" A'. Thus the conclusion of the above rule is well-formed when its hypotheses are. With these rules we can extend Theorem 6.20 to include dependent product types. Theorem 6.24 (Soundness). Defining satisfaction of judgements as in section 6.4, it is the case that the collection of judgements satisfied by a structure in a type-category with dependent products is closed under the rules in Fig. 12 (as well as the general rules of Fig. 9). Thus the structure of Definition 6.23 is sufficient for soundly interpreting dependent products. On the other hand it is necessary, in the sense that Proposition 6.11 can be extended as follows: Proposition 6.25. If Th is a theory in dependently typed equational logic with dependent product types, then the classifying category C l ( T h ) constructed in section 6.2 is a type-category with dependent products. Proof. Referring to Definition 6.23, to define (6.9) find judgements
that are theorems of Th and such that $X$, $A$, and $A'$ are the equivalence classes of $\Gamma$, $\sigma$, and $\sigma'(x)$, respectively. Then take $\Pi(A, A')$ to be the $\Gamma$-indexed type determined by […]. This definition is independent of the choice of representatives. The morphism (6.10) is that represented by the Th-context morphism
Since […], this does indeed induce a morphism of the required kind; and once again the definition of $\mathrm{ap}_{A,A'}$ is independent of the choice of representatives. The adjointness property required for $\Pi(A, A')$ can be deduced from the equality rules in Fig. 12, and the stability property follows from standard properties of substitution. □
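For intuition, dependent products can be computed explicitly in the type-category of finite sets (Example 6.12). The sketch below assumes that $\Pi(A, A')(x)$ is the set of all dependent functions choosing an element of $A'(x, a)$ for each $a \in A(x)$, that application evaluates such a function, and that currying turns a global section of $A'$ into a global section of $\Pi(A, A')$. Dependent functions are encoded as dictionaries; all names and data are hypothetical.

```python
from itertools import product

def dependent_product(X, A, A2):
    """For each x, list all dependent functions h with h[a] in A2[(x, a)]."""
    Pi = {}
    for x in X:
        elems = sorted(A[x])
        choices = [sorted(A2[(x, a)]) for a in elems]
        Pi[x] = [dict(zip(elems, pick)) for pick in product(*choices)]
    return Pi

def ap(h, a):
    """Application of a dependent function to an argument."""
    return h[a]

def cur(X, A, b):
    """Curry a section b of A2 (indexed by pairs (x, a)) into a section
    of the dependent product (indexed by x alone)."""
    return {x: {a: b[(x, a)] for a in A[x]} for x in X}

# Hypothetical example data.
X = [0, 1]
A = {0: [0, 1], 1: [0]}
A2 = {(0, 0): ["p", "q"], (0, 1): ["r"], (1, 0): ["s", "t", "u"]}
Pi = dependent_product(X, A, A2)
b = {(0, 0): "q", (0, 1): "r", (1, 0): "t"}
h = cur(X, A, b)
```

In this model the $\beta$-rule says that applying `cur(X, A, b)[x]` to `a` returns `b[(x, a)]`, and the $\eta$-rule says that every element of `Pi[x]` is determined by its values under `ap`.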
7 Further reading
This section lists some important topics in categorical logic which have not been covered in this chapter and gives pointers to the literature on them.

Higher-order logic. The study of toposes (a categorical abstraction of key properties of categories of set-valued sheaves) and their relationship to set theory and higher-order logic has been one of the greatest stimuli of the development of a categorical approach to logic. [Mac Lane and Moerdijk, 1992] provides a very good introduction to topos theory from a mathematical perspective; [Lambek and Scott, 1986, Part II] and [Bell, 1988] give accounts emphasizing the connections with logic. The hyperdoctrine approach to first-order logic outlined in section 5 can be extended to higher-order logic and such higher-order hyperdoctrines can be used to generate toposes: see [Hyland et al., 1980], [Pitts, 1981, 1999] and [Hyland and Ong, 1993].

Polymorphic lambda calculus. Hyperdoctrines have also been used successfully to model type theories involving type variables and quantification over types. [Crole, 1993, Chapters 5 and 6] provides an introduction to the categorical semantics of such type theories very much in the spirit of this chapter.

Categories of relations. Much of the development of categorical logic has been stimulated by a process of abstraction from the properties of categories of sets and functions. However, categorical properties of sets and binary relations (under the usual operation of composition of relations) have also been influential. The book by Freyd and Scedrov [1990] provides a wealth of material on categorical logic from this perspective.

Categorical proof theory. Both Lawvere [1969] and Lambek [1968] put forward the idea that proofs of logical entailment between propositions may be modelled by morphisms in categories. If one is only interested in the existence of such proofs, then one might as well only consider categories with at most one morphism between any pair of objects, i.e. only consider pre-ordered sets. This is the point of view taken in section 5. However, to study the structure of proofs one must consider categories rather than just pre-orders. Lambek, in particular, has studied the connection between this categorical view of proofs and the two main styles of proof introduced by Gentzen (natural deduction and sequent calculus), introducing the notion of multicategory to model sequents in which more than one proposition occurs on either side of the turnstile: see [Lambek, 1989]. This has resulted in applications of proof theory to category theory, for example in the use of cut elimination theorems to prove coherence results: see [Minc, 1977], [Mac Lane, 1982]. In the reverse direction of applications of category theory to proof theory, general categorical machinery, particularly enriched category theory, has provided useful guidelines for what constitutes a model of proofs in Girard's linear logic (see [Seely, 1989], [Barr, 1991], [Mackie et al., 1993], [Bierman, 1994]); for example, in [Benton et al., 1993], such considerations facilitated the discovery of a well-behaved natural deduction formulation of intuitionistic linear logic.

Categorical combinators. The essentially algebraic nature of the category theory corresponding to various kinds of logic and type theory gives rise to variable-free, combinatory presentations of such systems. These have been used as the basis of abstract machines for expression evaluation and type checking: see [Curien, 1993], [Ritter, 1992].
References

[Backhouse et al., 1989] R. Backhouse, P. Chisholm, G. Malcolm, and E. Saaman. Do-it-yourself type theory. Formal Aspects of Computing, 1:19-84, 1989.
[Barr, 1991] M. Barr. *-autonomous categories and linear logic. Math. Structures in Computer Science, 1:159-178, 1991.
[Barr and Wells, 1990] M. Barr and C. Wells. Category Theory for Computing Science. Prentice Hall, 1990.
[Bell, 1988] J. L. Bell. Toposes and Local Set Theories. An Introduction, volume 14 of Oxford Logic Guides. Oxford University Press, 1988.
[Benton et al., 1993] P. N. Benton, G. M. Bierman, V. C. V. de Paiva, and J. M. E. Hyland. A term calculus for intuitionistic linear logic. In M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science 664, pages 75-90. Springer-Verlag, 1993.
[Bierman, 1994] G. M. Bierman. On intuitionistic linear logic. PhD thesis, Cambridge Univ., 1994.
[Bloom and Esik, 1993] S. L. Bloom and Z. Esik. Iteration Theories. EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1993.
[Cartmell, 1986] J. Cartmell. Generalised algebraic theories and contextual categories. Annals of Pure and Applied Logic, 32:209-243, 1986.
[Chang and Keisler, 1973] C. C. Chang and H. J. Keisler. Model Theory. North-Holland, 1973.
[Coste, 1979] M. Coste. Localisation, spectra and sheaf representation. In M. P. Fourman, C. J. Mulvey, and D. S. Scott, editors, Applications of Sheaves, Lecture Notes in Mathematics 753, pages 212-238. Springer-Verlag, 1979.
[Crole, 1993] R. L. Crole. Categories for Types. Cambridge Univ. Press, 1993.
[Crole and Pitts, 1992] R. L. Crole and A. M. Pitts. New foundations for fixpoint computations: Fix-hyperdoctrines and the fix-logic. Information and Computation, 98:171-210, 1992.
[Curien, 1989] P.-L. Curien. Alpha-conversion, conditions on variables and categorical logic. Studia Logica, 48:319-360, 1989.
[Curien, 1990] P.-L. Curien. Substitution up to isomorphism. Technical Report LIENS-90-9, Laboratoire d'Informatique, Ecole Normale Superieure, Paris, 1990.
[Curien, 1993] P.-L. Curien. Categorical Combinators, Sequential Algorithms, and Functional Programming (2nd edition). Birkhauser, 1993.
[Dummett, 1977] M. Dummett. Elements of Intuitionism. Oxford University Press, 1977.
[Ehrhard, 1988] Th. Ehrhard. A categorical semantics of constructions. In 3rd Annual Symposium on Logic in Computer Science, pages 264-273. IEEE Computer Society Press, 1988.
[Freyd, 1972] P. J. Freyd. Aspects of topoi. Bulletin of the Australian Mathematical Society, 7:1-76 and 467-80, 1972.
[Freyd and Scedrov, 1990] P. J. Freyd and A. Scedrov. Categories, Allegories. North-Holland, Amsterdam, 1990.
[Girard, 1986] J.-Y. Girard. The system F of variable types, fifteen years later. Theoretical Computer Science, 45:159-192, 1986.
[Girard, 1987] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50:1-102, 1987.
[Girard, 1989] J.-Y. Girard. Proofs and Types, Cambridge Tracts in Theoretical Computer Science 7. Cambridge University Press, 1989.
[Goguen and Meseguer, 1985] J. A. Goguen and J. Meseguer. Initiality, induction and computability. In M. Nivat and J. C. Reynolds, editors, Algebraic Methods in Semantics, pages 459-541. Cambridge University Press, 1985.
[Goguen et al., 1977] J. A. Goguen, J. W. Thatcher, E. G. Wagner, and J. B. Wright. Initial algebra semantics and continuous algebras. Journal of the Association for Computing Machinery, 24:68-95, 1977.
[Hofmann, 1995] M. Hofmann. On the interpretation of type theory in locally cartesian closed categories. In L. Pacholski and J. Tiuryn, editors, Computer Science Logic, Kazimierz, Poland, 1994, Lecture Notes in Computer Science 933. Springer-Verlag, 1995.
[Hyland, 1988] J. M. E. Hyland. A small complete category. Annals of Pure and Applied Logic, 40:135-165, 1988.
[Hyland and Ong, 1993] J. M. E. Hyland and C.-H. L. Ong. Modified realizability toposes and strong normalization proofs. In M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science 664, pages 179-194. Springer-Verlag, 1993.
[Hyland and Pitts, 1989] J. M. E. Hyland and A. M. Pitts. The theory of constructions: categorical semantics and topos-theoretic models. In J. W. Gray and A. Scedrov, editors, Categories in Computer Science and Logic, Contemporary Mathematics 92, pages 137-199. American Mathematical Society, 1989.
[Hyland et al., 1980] J. M. E. Hyland, P. T. Johnstone, and A. M. Pitts. Tripos theory. Mathematical Proceedings of the Cambridge Philosophical Society, 88:205-232, 1980.
[Jacobs and Melham, 1993] B. Jacobs and T. Melham. Translating dependent type theory into higher order logic. In M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science, pages 209-229. Springer-Verlag, 1993.
[Jacobs, 1999] B. Jacobs. Categorical Logic and Type Theory. Elsevier, 1999.
[Lambek, 1968] J. Lambek. Deductive systems and categories I. Math. Systems Theory, 2:287-318, 1968.
[Lambek, 1988] J. Lambek. On the unity of algebra and logic. In F. Borceux, editor, Categorical Algebra and its Applications, Lecture Notes in Mathematics 1348. Springer-Verlag, Berlin, 1988.
[Lambek, 1989] J. Lambek. Multicategories revisited. In J. W. Gray and A. Scedrov, editors, Categories in Computer Science and Logic, Contemporary Mathematics 92, pages 217-239. American Mathematical Society, 1989.
[Lambek and Scott, 1986] J. Lambek and P. J. Scott. Introduction to Higher Order Categorical Logic, Cambridge Studies in Advanced Mathematics 7. Cambridge University Press, 1986.
[Lawvere, 1964] F. W. Lawvere. An elementary theory of the category of sets. Proceedings of the National Academy of Sciences of the USA, 52:1506-1511, 1964.
[Lawvere, 1966] F. W. Lawvere. The category of categories as a foundation for mathematics. In S. Eilenberg, D. K. Harrison, S. MacLane and H. Rohrl, editors, Proc. Conference on Categorical Algebra, La Jolla, pages 1-21. Springer-Verlag, 1966.
[Lawvere, 1969] F. W. Lawvere. Adjointness in foundations. Dialectica, 23:281-296, 1969.
[Lawvere, 1970] F. W. Lawvere. Equality in hyperdoctrines and the comprehension schema as an adjoint functor. In A. Heller, editor, Applications of Categorical Algebra, pages 1-14. American Mathematical Society, 1970. [Mac Lane, 1971] S. Mac Lane. Categories for the Working Mathematician, Graduate Texts in Mathematics 5. Springer-Verlag, Berlin, 1971.
Categorical logic
127
[Mac Lane, 1982] S. Mac Lane. Why commutative diagrams coincide with equivalent proofs. Contemporary Mathematics, 13:387-401, 1982. [Mac Lane and Moerdijk, 1992] S. Mac Lane and I. Moerdijk. Sheaves in Geometry and Logic. A First Introduction to Topos Theory. Universitext. Springer-Verlag, 1992. [Mackie et al., 1993] I. Mackie, L. Roman, and S. Abramsky. An internal language for autonomous categories. Applied Categorical Structures, 1:311-343, 1993. [Makkai and Reyes, 1977] M. Makkai and G. E. Reyes. First Order Categorical Logic, Lecture Notes in Mathematics 61. Springer-Verlag, 1977. [McLarty, 1992] C. McLarty. Elementary Categories, Elementary Toposes, Oxford Logic Guides 21. Oxford University Press, 1992. [Mine, 1977] G. E. Mine. Closed categories and the theory of proofs. Zap. Nauch. Sem. Leningrad. Otdel. Mat. Inst. im. V. A. Steklova (LOMI), 96:83-114,145, 1977. Russian, with English summary. [Mitchell and Scott, 1989] J. C. Mitchell and P. J. Scott. Typed lambda models and cartesian closed categories (preliminary version). In J. W. Gray and A. Scedrov, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science, pages 301-316, Springer-Verlag, 1989. [Moggi, 1991] E. Moggi. Notions of computations and monads. Information and Computation, 93:55-92, 1991. [Nordstrom et al., 1990] B. Nordstrom, K. Petersson, and J. M. Smith. Programming in Martin-Lof's Type Theory, volume 7 of International Series of Monographs on Computer Science. Oxford University Press, 1990. [Obtulowicz, 1989] A. Obtulowicz. Categorical and algebraic aspects of Martin-L6f type theory. Studia Logica, 48:299-318, 1989. [Oles, 1985] F. J. Oles. Types algebras, functor categories and block structure. In M. Nivat and J. C. Reynolds, editors. Algebraic Methods in Semantics pages 543-574. Cambridge University Press, 1985. [Pierce, 1991] B. C. Pierce. Basic Category Theory for Computer Scientists. MIT Press, 1991. [Pitts, 1981] A. M. Pitts. The theory of toposes. 
PhD thesis, Cambridge Univ., 1981. [Pitts, 1987] A. M. Pitts. Polymorphism is set theoretic, constructively. In D. H. Pitt, A. Poigne, and D. E. Rydeheard, editors, Category Theory and Computer Science, Proc. Edinburgh 1987, Lecture Notes in Computer Science 283, pages 12-39. Springer-Verlag, Berlin, 1987. [Pitts, 1989] A. M. Pitts. Non-trivial power types can't be subtypes of polymorphic types. In 4th Annual Symposium on Logic in Computer Science, pages 6-13. IEEE Computer Society Press, 1989.
128
Andrew M. Pitts
[Pitts, 1999] A. M. Pitts. Tripos theory in retrospect. In L. Birkedal and G. Rosolini, editors, Proceedings of the Tutorial Workshop on Realizabitity Semantics, FLOC '99, Trento, Italy, 1999. Electronic Notes in Theoretical Comptuer Science 23, Elsevier, 1999. [Poigne, 1992] A. Poigne. Basic category theory. Chapter in S. Abramsky, D. M. Gabbay and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, Vol. 1, Oxford University Press, 1992. [Reynolds, 1983] J. C. Reynolds. Types, abstraction and parametric polymorphism. In R. E. A. Mason, editor, Information Processing 83, pages 513-523. North-Holland, 1983. [Reynolds and Plotkin, 1993] J. C. Reynolds and G. D. Plotkin. On functors expressible in the polymorphic typed lambda calculus. Information and Computation, 105:1-29, 1993. [Ritter, 1992] E. Ritter. Categorical abstract machines for higher-order typed lambda calculi. PhD thesis, Cambridge University, 1992. [Roman, 1989] L. Roman. Cartesian categories with natural numbers object. Journal of Pure and Applied Algebra, 58:267-278, 1989. [Scott, 1980] D. S. Scott. Relating theories of the lambda calculus. In J. P. Seldin and J. R. Hindley, editors, To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, pages 403-450. Academic Press, 1980. [Seely, 1984] R. A. G. Seely. Locally cartesian closed categories and type theories. Mathematical Proceedings of the Cambridge Philosophical Society, 95:33-48, 1984. [Seely, 1987] R. A. G. Seely. Categorical semantics for higher order polymorphic lambda calculus. Journal of Symbolic Logic, 52:969-989, 1987. [Seely, 1989] R. A. G. Seely. Linear logic, *-autonomous categories and cofree algebras. In J. W. Gray and A. Scedrov, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science 664, pages 371-382. Springer-Verlag, 1989. [Streicher, 1989] Th. Streicher. Correctness and completeness of a categorical semantics of the calculus of constructions. 
PhD thesis, University of Passau, 1989. Tech. Report MIP-8913. [Streicher, 1991] Th. Streicher. Semantics of Type Theory. Birkhauser, 1991. [Taylor, 1999] P. Taylor. Practical Foundations of Mathematics. Cambridge University Press, 1999.
A uniform method for proving lower bounds on the computational complexity of logical theories
Kevin J. Compton and C. Ward Henson
Contents
1 Introduction
2 Preliminaries
3 Reductions between formulas
4 Inseparability results for first-order theories
5 Inseparability results for monadic second-order theories
6 Tools for NTIME lower bounds
7 Tools for linear ATIME lower bounds
8 Applications
9 Upper bounds
10 Open problems
1 Introduction

In this chapter we present a method for obtaining lower bounds on the computational complexity of logical theories, and give several illustrations of its use. This method is an extension of widely used procedures for proving the recursive undecidability of logical theories. (See Rabin [1965] and Ersov et al. [1965].)

One important aspect of this method is that it is based on a family of inseparability results for certain logical problems, closely related to the well-known inseparability result of Trakhtenbrot (as refined by Vaught), that no recursive set separates the logically valid sentences from those which are false in some finite model, as long as the underlying language has at least one non-unary relation symbol. By using these inseparability results as a foundation, we are able to obtain hereditary lower bounds, i.e., bounds which apply uniformly to all subtheories of the theory.

The second important aspect of this method is that we use interpretations to transfer lower bounds from one theory to another. By doing this we eliminate the need to code machine computations into the models of the theory being studied. (The coding of computations is done once and for all
in proving the inseparability results.) By using interpretations, attention is centred on much simpler definability considerations, viz., what kinds of binary relations on large finite sets can be defined using short formulas in models of the theory. This is conceptually much simpler than other approaches that have been proposed for obtaining lower bounds, such as the method of bounded concatenations of Fleischmann et al. [1977].

We will deal primarily with theories in first-order logic and monadic second-order logic. Given a set $\Sigma$ of sentences in a logic $L$, we will consider the satisfiability problem

$$\mathrm{sat}(\Sigma) = \{\sigma \in L \mid \sigma \text{ is true in some model of } \Sigma\}$$

and the validity problem

$$\mathrm{val}(\Sigma) = \{\sigma \in L \mid \sigma \text{ is true in all models of } \Sigma\}.$$

A hereditary lower bound for $\Sigma$ is a bound that holds for $\mathrm{sat}(\Sigma')$ and $\mathrm{val}(\Sigma')$ whenever $\Sigma' \subseteq \mathrm{val}(\Sigma)$. If $L$ is a first-order logic, define $\mathrm{inv}(L)$ to be the set of sentences in $L$ that are logically invalid, i.e., false in all models. If $L$ is a monadic second-order logic, define $\mathrm{inv}(L)$ to be the set of sentences false in all weak models. (See section 2 for definitions.)

The complexity classes used here are time-bounded classes for nondeterministic Turing machines and for the more general class of linear alternating Turing machines. In providing reductions between different decision problems, we are always able to give log-lin reductions. That is, our reduction functions can be computed by a deterministic Turing machine which operates simultaneously in log space and linear time. In particular, such functions have the property that the size of a value is bounded uniformly by a constant multiple of the size of the argument.

Let $L_0$ denote the first-order logic with a single binary relation symbol. Let $ML_0$ denote the corresponding monadic second-order logic. Let $T(n)$ be a time resource bound which grows at least exponentially in the sense that there exists a constant $d$, $0 < d < 1$, such that $T(dn)/T(n)$ tends to $0$ as $n$ tends to $\infty$. (This condition is satisfied by the iterated exponential functions and other time resource bounds which arise most commonly in connection with the computational complexity of logical theories.) Let $\mathrm{sat}_T(L_0)$ denote the set of sentences $\sigma$ in $L_0$ such that $\sigma$ is true in some model on a set of size at most $T(|\sigma|)$. (Here $|\sigma|$ denotes the length of $\sigma$.) Similarly define $\mathrm{sat}_T(ML_0)$ for sentences of monadic second-order logic. The inseparability results which form the cornerstone of our method are as follows:

(a) For some $c > 0$, $\mathrm{sat}_T(L_0)$ and $\mathrm{inv}(L_0)$ cannot be separated by any set in $\mathrm{NTIME}(T(cn))$.

(b) For some $c > 0$, $\mathrm{sat}_T(ML_0)$ and $\mathrm{inv}(ML_0)$ cannot be separated by
any set in $\mathrm{ATIME}(T(cn), cn)$. ($\mathrm{ATIME}(T(cn), cn)$ denotes the set of problems recognized by alternating Turing machines in time $T(cn)$ making $cn$ alternations along any branch.)

In proving (b) we prove another interesting result. Let $ML_+$ be a monadic second-order logic with a ternary relation PLUS and $\mathrm{sat}_T(ML_+)$ be the set of sentences $\varphi$ in this logic true in the model $\langle T(n), + \rangle$, where $n = |\varphi|$ and $+$ is the usual ternary addition relation on the set $T(n) = \{0, \ldots, T(n) - 1\}$. Then for some $c > 0$, $\mathrm{sat}_T(ML_+)$ and $\mathrm{inv}(ML_+)$ cannot be separated by any set in $\mathrm{ATIME}(T(cn), cn)$.

We prove and discuss these results in sections 4 and 5 respectively. In fact, we prove more: any problem separating $\mathrm{sat}_T(L_0)$ and $\mathrm{inv}(L_0)$ is a hard problem (under log-lin reductions) for the complexity class $\mathrm{NTIME}(T(cn))$. Any problem separating $\mathrm{sat}_T(ML_0)$ and $\mathrm{inv}(ML_0)$ is a hard problem for the complexity class $\mathrm{ATIME}(T(cn), cn)$.

In these results one can see a parallel between first-order logic and NTIME, on the one hand, and monadic second-order logic and linear alternating time, on the other. This parallel persists throughout the lower bounds for logical theories which we discuss here, and we feel that our point of view helps to explain why the complexity of some theories is best measured using NTIME while for others the best measure is linear alternating time.

In order to obtain lower bounds from these inseparability results, or to transfer lower bounds from one theory to another, we use interpretations. However, sometimes we require not just a single interpretation, but rather a sequence $\{I_n \mid n \ge 0\}$ of such interpretations. Not only do we require that each $I_n$ define a sufficiently rich class of models when applied to the models of the theory under study, but also we require that the function taking $n$ (in unary) to $I_n$ should be log-lin computable.
As an example of how such interpretations are used, suppose $\Sigma$ is a theory such that for each $n \ge 0$, $I_n$ applied to models of $\Sigma$ yields all the binary relations of size at most $T(n)$ (and perhaps others), for a given time resource bound $T$. It follows that for some constant $c > 0$, $\mathrm{sat}(\Sigma)$ and $\mathrm{inv}(L)$ cannot be separated by any set in $\mathrm{NTIME}(T(cn))$. In general, it follows that $\Sigma$ has a hereditary $\mathrm{NTIME}(T(cn))$ lower bound. There is a corresponding result for the complexity classes $\mathrm{ATIME}(T(cn), cn)$ when the interpretations $I_n$ interpret monadic second-order logic of binary relations on sets of size at most $T(n)$.
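To make concrete what it means for formulas to yield binary relations when applied to a model, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `interpret`, `delta` and `pi`, and the toy model (the numbers 0 to 7 with the ternary addition relation), are hypothetical and are not taken from the chapter's constructions. One formula carves out a domain, the other defines a binary relation on it, and both may use a parameter from the model.

```python
from itertools import product

def interpret(universe, delta, pi, param):
    """Return the (domain, relation) pair defined by delta and pi with a parameter.

    delta(x, param) plays the role of a domain formula and
    pi(x, y, param) the role of a binary relation formula.
    """
    domain = {x for x in universe if delta(x, param)}
    # The relation is restricted to pairs from the defined domain.
    relation = {(x, y) for x, y in product(domain, repeat=2) if pi(x, y, param)}
    return domain, relation

# Toy model (an assumption for illustration): universe {0,...,7}
# with ternary addition as its only basic relation.
universe = range(8)

def plus(a, b, c):
    return a + b == c

def delta(x, m):
    # "exists z such that x + z = m", i.e. x is at most the parameter m
    return any(plus(x, z, m) for z in universe)

def pi(x, y, m):
    # "x + y = m"
    return plus(x, y, m)

domain, relation = interpret(universe, delta, pi, 5)
print(sorted(domain))
print(sorted(relation))
```

Changing the parameter changes the relation that is defined, which is the sense in which a single pair of short formulas can yield many different binary relations in the models of a theory.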
Upon establishing a lower bound for $\mathrm{sat}(\Sigma)$ in this manner, we may then use interpretations of $\Sigma$ to obtain lower bounds for other theories. Continuing the example above, assume $I_n$ applied to the set of models of $\Sigma$ yields all the binary relations of size at most $T(n)$. In fact, it will not be necessary to apply $I_n$ to all models of $\Sigma$ to obtain all binary relations of size at most $T(n)$, since there are only finitely many such binary relations. Suppose that $C_n$ is a set of models of $\Sigma$ such that $I_n$ applied to $C_n$ yields all the binary relations of size at most $T(n)$. If $\{I'_n \mid n \ge 0\}$ is another log-lin computable sequence of interpretations such that $I'_n$ applied to models of a theory $\Sigma'$ in a language $L'$ yields all models in $C_n$ (and perhaps others, some possibly not even models of $\Sigma$), then for some constant $c > 0$, $\mathrm{sat}(\Sigma')$ and $\mathrm{inv}(L')$ cannot be separated by any set in $\mathrm{NTIME}(T(cn))$. Thus, we have developed a theory for establishing lower bounds of logical theories analogous to the theory for establishing NP-hardness results via polynomial time reductions.

The observation that $I'_n$ applied to models of $\Sigma'$ is allowed to yield models not satisfying $\Sigma$ may seem unimportant, but in practice it results in significant simplifications in interpretations, especially compared to the method of establishing lower bounds by Turing machine encodings. There one must produce, for a nondeterministic Turing machine $M$ with the appropriate running time, an efficient reduction from strings $w$ to sentences $\varphi_w$ in such a way that when $M$ accepts $w$, $\varphi_w$ is true in some model of $\Sigma'$, and when $M$ does not accept $w$, $\varphi_w$ is true in no model of $\Sigma'$. Ensuring that a sentence is true in no model of $\Sigma'$ in the case of nonacceptance can be cumbersome. Inseparability considerations eliminate the need for it.

Our method can give short, transparent lower bound results in many cases where Turing machine encodings are far from apparent. An example is the result of Compton et al.
[1987] that the theory of almost all finite unary functions is not elementary recursive. The proof there, using the methods of this chapter, is set forth in a short paragraph. A proof by Turing machine encodings would run to many pages, and possibly would never have been discovered at all. It seems likely that our method will have to be used if sharp lower bounds are to be obtained for some of the important, more algebraic (and less directly combinatorial) theories which are known to be decidable. (See, for example, Problems 10.1, 10.2, 10.7, 10.9, and 10.10.)

In making use of sequential families of interpretations there are certain technicalities regarding lengths of formulas which must be addressed. They can be illustrated by considering the formula $\varphi'$ which results from a formula $\varphi$ when one replaces every occurrence of a certain binary relation symbol $P$ by a given formula $\psi$. If $\varphi$ has many occurrences of $P$ and if the length of $\psi$ is of the same order of magnitude as the length of $\varphi$, then $\varphi'$ may well be extremely long compared to $\varphi$. (This is precisely the kind of operation on formulas used as a reduction function between theories when
one uses interpretations to obtain lower bounds.) This difficulty can be overcome if one uses sequences $\{I_n\}$ of interpretations in which either all formulas in each $I_n$ are in prenex form, or they are obtained by a certain kind of iterative process. In practice, the interpretation sequences used to transfer lower bounds from one logical problem to another can always be found satisfying one of these conditions.

We have used this approach to give a precise analysis of the computational complexity of various theories of finite trees. For each $r$ let $\Sigma_r$ denote the first-order theory of all finite trees of height $r$, and let $M\Sigma_r$ denote the corresponding monadic second-order theory. Also let $\Sigma_\infty$ and $M\Sigma_\infty$ denote the corresponding theories of all finite trees. Let $\exp_m(n)$ be the $m$-times iterated exponential function (e.g., $\exp_2(n) = 2^{2^n}$) and let $\exp_\infty(n)$ be the tower of twos function:

$$\exp_\infty(n) = \left. 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}} \right\} \text{height } n.$$

Our results concerning the various theories of finite trees can be summarized as follows:

(a) For each $r \ge 4$ there are constants $c$ and $d > 0$ such that $\mathrm{sat}(\Sigma_r)$ is in $\mathrm{NTIME}(\exp_{r-2}(dn))$ but $\mathrm{sat}(\Sigma_r)$ and $\mathrm{val}(\Sigma_r)$ are hereditarily not in $\mathrm{NTIME}(\exp_{r-2}(cn))$. For $r = 3$ the upper bound is $\mathrm{NTIME}(2^{dn^2})$ and the hereditary lower bound is $\mathrm{NTIME}(2^{cn})$.

(b) There exist constants $c$ and $d > 0$ such that $\mathrm{sat}(\Sigma_\infty)$ is in $\mathrm{NTIME}(\exp_\infty(dn))$ but $\mathrm{sat}(\Sigma_\infty)$ and $\mathrm{val}(\Sigma_\infty)$ are hereditarily not in $\mathrm{NTIME}(\exp_\infty(cn))$. Hence, these problems are hereditarily not elementary recursive.

(c) For each $r \ge 1$ there are constants $c$ and $d > 0$ such that $\mathrm{sat}(M\Sigma_r)$ is in $\mathrm{ATIME}(\exp_r(dn/\log n), dn)$ but $\mathrm{sat}(M\Sigma_r)$ and $\mathrm{val}(M\Sigma_r)$ are hereditarily not in $\mathrm{ATIME}(\exp_r(cn/\log n), cn)$.

It is not hard to show that $\Sigma_\infty$ and $M\Sigma_\infty$ are mutually interpretable, and hence have the same complexity. In any case, for such rapidly growing time resource bounds as $\exp_\infty(n)$, the difference between NTIME and ATIME has vanished.
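The time bounds in these results can be explored numerically. The following Python sketch is an illustration only: the function names `exp_iter`, `exp_tower` and `ratio` are hypothetical, and the tower of twos is assumed here to be a tower of n twos (taken to be 1 when n = 0). The last function evaluates the quotient T(dn)/T(n) from the growth condition on time resource bounds stated earlier in this section, for d = 1/2.

```python
from fractions import Fraction

def exp_iter(m, n):
    """m-times iterated exponential: exp_0(n) = n, exp_{m+1}(n) = 2 ** exp_m(n)."""
    value = n
    for _ in range(m):
        value = 2 ** value
    return value

def exp_tower(n):
    """Tower of twos of height n (assumed convention: value 1 for n = 0)."""
    return exp_iter(n, 1)

def ratio(T, n, d=Fraction(1, 2)):
    """T(dn)/T(n) as an exact fraction, rounding dn down to an integer."""
    return Fraction(T(int(d * n)), T(n))

print(exp_iter(2, 3))   # 2 ** (2 ** 3)
print(exp_tower(4))     # 2 ** (2 ** (2 ** 2))

# For T(n) = 2**n the quotient shrinks as n grows; for T(n) = n**2 it stays constant.
for n in (10, 20, 40):
    print(n, ratio(lambda k: 2 ** k, n), ratio(lambda k: k * k, n))
```

Under these assumptions the exponential bound satisfies the growth condition while the quadratic bound does not, since its quotient stays at 1/4 for every even n.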
In order to test our method for effectiveness and smoothness of use, we have used it to provide proofs of essentially all previously known complexity lower bounds for first-order and monadic second-order theories. These arguments avoid direct coding of machine computations, and are usually much simpler and more conceptual than the original arguments. We present many of these proofs, or at least sketches of them, in section 8. In all cases our lower bounds are hereditary, and are expressed in terms of log-lin hardness for certain complexity classes, providing complete problems for many of these NTIME and linear ATIME classes. In some cases we verify results which had been only announced, no published proof ever having appeared.

We do not consider lower bounds for sentences with restricted quantification prefixes, as in Lewis [1980], Scarpellini [1984], Gradel [1988; 1989], Schoning [1997], and Borger et al. [1997]; nor do we consider lower bounds for nonclassical logics such as temporal logics, interval logics, dynamic logics, logics of knowledge, and other modal logics. Many of these logics have decision problems complete in deterministic time classes. See, for example, Chagrov and Zakharyaschev [1997], Fischer and Ladner [1979], Emerson and Halpern [1985], Fagin et al. [1995], Halpern and Moses [1992], Halpern and Vardi [1986], Harel [1984], Kutty et al. [1995], Kozen and Tiuryn [1990], Rabinovich [1998], Ramakrishna et al. [1992; 1996], Stirling [1992], and Vardi [1997].

It is our hope that the appearance of this chapter will stimulate the interest of many computer scientists and mathematicians, and that they will be inspired to investigate the many decidable theories for which no detailed complexity bounds have been found. Even though there has been continued interest in the complexity of decidable theories since this material first appeared (see, for example, Cegielski [1996], Cegielski et al.
[1996], Maurin [1996; 1997a; 1997b], and Michel [1992]), researchers in the area are sometimes unaware of the tools for obtaining lower bounds, and, in particular, of hereditary lower bounds and inseparability results. The organization of our chapter is as follows. Section 2 contains various definitions and technical conventions. In section 3 we present some technical machinery needed to handle the details about lengths of formulas which arise in complexity arguments. Sections 4 and 5 contain the basic inseparability results for logical problems; these are the analogues of the Trakhtenbrot-Vaught theorem. In sections 6 and 7 we discuss interpretations and set up the procedures by which they are used to obtain lower bounds for logical theories. Here too are proved the lower bounds for the various theories of finite trees which are treated here. Section 8 contains a lengthy series of applications of our method, yielding lower bounds for a wide variety of theories of independent interest. In section 9 we obtain various upper bounds for problems treated here, in order to show that our method is capable of achieving sharp results. We present a selected list of
open problems at the end of section 10.

In this chapter we have not discussed ways in which our method can be used to obtain lower bounds in terms of $\mathrm{SPACE}(T(n))$ complexity classes. This is because there are so few known cases in which best possible lower bounds for logical theories are expressed in terms of space complexity classes. The exceptions are the PSPACE-complete theories such as those discussed in Stockmeyer [1977] and Grandjean [1983]. These are, in some sense, the least complex theories since Stockmeyer shows implicitly that if $\Sigma$ is a logical theory which has at least one model with at least one nontrivial definable relation (i.e., the relation is true of some elements and false of others), then $\mathrm{sat}(\Sigma)$ is log-space, polynomial-time hard for PSPACE. (Note that if equality is taken as a basic relation in the language, or is definable, then the hypothesis means simply that $\Sigma$ has at least one model with two or more elements.) If $\Sigma$ does not have any such model, then it is utterly trivial, and both $\mathrm{sat}(\Sigma)$ and $\mathrm{val}(\Sigma)$ are in LOGSPACE. (We assume throughout that the vocabularies of logics are finite.) Possibly methods of this chapter could be extended to obtain polynomial nondeterministic time lower bounds for PSPACE-complete theories.

On the other hand, all 'natural' logical theories which are known to be decidable seem actually to be primitive recursive. Furthermore, among theories where a somewhat careful upper bound analysis has been carried out, decidable logical theories seem to fall into $\mathrm{NTIME}(\exp_\infty(dn))$ for some constant $d$ in all but a few cases. The exceptions have been discovered only recently by Vorobyov [1997; preprint].

We believe that the method presented here is simple and can be mastered quickly. To get an overview of the applications of our method we advise looking over the results in section 8 first.
Not only does this give a summary of the main complexity lower bounds now known, but also we have tried to present these applications in such a way as to give an accessible exposition of how our methods are meant to be used, and the main technical points which arise in their use.
2 Preliminaries
In this section we present the definitions and notations used throughout the chapter. All alphabets considered will be finite. The length of a string $w$ is denoted $|w|$. The empty string is denoted . We use the standard 'big oh' and 'little oh' notations throughout, as well as the 'big omega' notation: write $f(n) = \Omega(g(n))$ if $f(n) \ge k\,g(n)$ for some $k > 0$ on all large $n$.

All theories considered here are either first-order or monadic second-order. For convenience we explicitly treat only relational languages in sections 3–7; functions are handled by using their graphs as relations, and constants
are treated as special unary relations. This restriction makes no difference as far as concerns the lower bounds we obtain: sentences containing function and constant symbols can be transformed into equivalent sentences containing relation symbols with an increase in length of only a constant factor. (It is necessary to 'reuse' variables to accomplish this.)

We will assume for convenience that our languages contain equality. However, in many situations our methods work even without equality. We use equality mainly to keep formulas short while substituting other formulas for atomic formulas: equations are used for coding truth values. This can usually be done by other formulas, as long as they satisfy the right conditions of nontriviality. If one is willing to accept polynomial-time reductions, then equality is generally not necessary, but the lower bounds degrade slightly.

To specify a logic in this chapter, we need only give its set of relation symbols and their associated arities (the vocabulary of the logic) and indicate whether the logic is first-order or monadic second-order. We formulate all of our logics using finitely many symbols, so that all terms and formulas are strings on a finite alphabet. In particular, a variable is represented by a symbol followed by a subscript in binary notation. Thus, to represent $n$ distinct variables we need strings with total length about $n \log n$. (Logarithms will always have base 2.) To avoid subscripted subscripts we use lower-case Roman letters $t, u, x, y, z$ (possibly with subscripts) as formal variables to denote actual variables $v_i$. Monadic variables are represented by corresponding upper-case letters.

The power of a model is the cardinality of its universe. A weak model for a monadic second-order logic $L$ is a pair $\langle \mathfrak{A}, \mathcal{F} \rangle$, where $\mathfrak{A}$ is a model for $L$ and $\mathcal{F}$ is a collection of subsets of the universe of $\mathfrak{A}$.
The truth value of a formula from $L$ in $\langle \mathfrak{A}, \mathcal{F} \rangle$ is determined in the usual way except that monadic quantifiers range over the sets in $\mathcal{F}$ rather than the collection of all sets. Throughout the chapter, equivalence of monadic second-order formulas will mean equivalence on weak models. This is stronger than the usual notion of equivalence.

The first-order and monadic second-order logics with vocabulary consisting just of a binary relation symbol $P$ are central to our investigation. They will be denoted $L_0$ and $ML_0$, respectively. We also study theories of finite trees; again the vocabulary consists just of one binary relation symbol which, in this case, interprets the successor (or parent-child) relation. Let $L_t$ and $ML_t$ denote the first-order and monadic second-order logics with this vocabulary. These are essentially the same as $L_0$ and $ML_0$; they differ only in the binary relation symbol used. However, it will be convenient to have a different notation for these logics when considering trees.

When considering finite trees we will often require the notion of a primary subtree. Such a subtree is formed by restricting to a set of vertices
consisting of a child of the root and all its descendants. Thus, we may regard a tree as being formed by directing an edge from the root of the tree to the root of each of its primary subtrees. The depth of a vertex in a tree is its distance from the root. The height of a vertex is the maximum distance to a leaf below it. Thus, the height of a tree is the maximum depth of its vertices, which is also the height of the root.

As we noted in section 1, we will consider problems of the form $\mathrm{sat}(\Sigma)$ and $\mathrm{val}(\Sigma)$. From a computational point of view, these two problems are complementary. That is, a sentence $\sigma$ is in $\mathrm{sat}(\Sigma)$ exactly when $\neg\sigma$ is not in $\mathrm{val}(\Sigma)$. Hence, $\mathrm{sat}(\Sigma)$ is a member of a particular complexity class if and only if $\mathrm{val}(\Sigma)$ is a member of the corresponding co-complexity class. If $\Sigma$ is a complete theory, then $\mathrm{sat}(\Sigma) = \mathrm{val}(\Sigma)$. When we are in a first-order logic, $\mathrm{val}(\Sigma)$ is the deductive closure of $\Sigma$ by the Gödel completeness theorem. There is no corresponding result for monadic second-order logic.

Often a logical theory is specified not by giving a set of axioms $\Sigma$, but by giving a class of models $C$. In this situation we take $\Sigma$ to be the set of sentences true in all members of $C$. It is easy to verify in this case that $\mathrm{val}(\Sigma) = \Sigma$ and $\mathrm{sat}(\Sigma)$ is the set of sentences true in some member of $C$.

If $L$ is a first-order logic we define $\mathrm{inv}(L)$ to be the set of sentences false in all models for $L$. This is just the complement of $\mathrm{sat}(\emptyset)$. If $L$ is a monadic second-order logic we define $\mathrm{inv}(L)$ to be the set of sentences false in all weak models for $L$.

Given a time resource bound $T(n)$, let $\mathrm{sat}_T(\Sigma)$ be the set of sentences $\sigma$ true in some model of $\Sigma$ of size at most $T(|\sigma|)$. Also, write $\mathrm{sat}_T(L)$ for $\mathrm{sat}_T(\emptyset)$. Let $\mathrm{sat}_{PT}(\Sigma)$ be the set of prenex sentences $\sigma$ true in some model of $\Sigma$ of size at most $T(|\sigma|)$.

Let $\mathfrak{A}$ be a model for a logic $L$, $\bar m = m_1, \ldots, m_k$ elements of $\mathfrak{A}$, and $\varphi(x_1, \ldots, x_n, y_1, \ldots, y_k)$ a formula from $L$. Then $\varphi^{\mathfrak{A}}(\bar x, \bar m)$ denotes the $n$-ary relation defined by

$$\bar a \in \varphi^{\mathfrak{A}}(\bar x, \bar m) \iff \mathfrak{A} \models \varphi(\bar a, \bar m)$$
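for $\bar a = a_1, \ldots, a_n \in \mathfrak{A}$.

Interpretations of one class of models in another are fundamental to many parts of logic, and we use them extensively here. For example, to interpret a binary relation $\mathfrak{A}'$ (i.e., a model for the logic $L_0$) in a theory $\Sigma$ from a logic $L$, we must produce formulas $\delta(x, \bar u)$ and $\pi(x, y, \bar u)$ from $L$ so that for some model $\mathfrak{A}$ of $\Sigma$ and some elements $\bar m$ of $\mathfrak{A}$, $\mathfrak{A}'$ is isomorphic to the structure

$$\langle \delta^{\mathfrak{A}}(x, \bar m),\ \pi^{\mathfrak{A}}(x, y, \bar m) \rangle,$$

where we require that $\pi^{\mathfrak{A}}(x, y, \bar m) \subseteq \delta^{\mathfrak{A}}(x, \bar m) \times \delta^{\mathfrak{A}}(x, \bar m)$. There is also a more general kind of interpretation that is often used, in which the domain of $\mathfrak{A}'$ can be a set of $k$-tuples from $\mathfrak{A}$ (not just elements of $\mathfrak{A}$) and in which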
$\mathfrak{A}'$ is isomorphic to a quotient of the defined structure by an equivalence relation definable in $\mathfrak{A}$.

Let $\varphi$ be a formula in a logic $L$ and $D$ a unary relation symbol. By $\varphi^D$ we mean the relativization of $\varphi$ to $D$. This is formed by systematically replacing all subformulas $\forall y\,\psi$ and $\exists y\,\psi$ of $\varphi$ with $\forall y\,(D(y) \rightarrow \psi)$ and $\exists y\,(D(y) \wedge \psi)$, respectively. If $L$ is a monadic second-order logic, it is not necessary to relativize the set quantifiers since elements have already been restricted to $D$.

The complexity classes we use are defined by time resource bounds. A time resource bound $T$ is a mapping from the nonnegative reals to the nonnegative reals such that for each $k > 0$, $T(kn)$ is dominated by some fully time-constructible function on the integers; see Hopcroft and Ullman [1979] for definitions. (Readers who need a primer in complexity theory may also wish to consult Stockmeyer [1987].) We will also require that $T(n) > n$ and that for each $k \ge 1$, $T(kn) \ge k\,T(n)$. This last condition Machtey and Young [1978] call at least linear. It says that when input length is increased by some factor, the allowed computation time increases by at least the same factor. It is included for technical reasons; we could get by with less, namely, that $T$ be nondecreasing and for every $l$ there should be a $k$ such that $T(kn) \ge l\,T(n)$.

The iterated exponentials and tower of twos functions appear often as time resource bounds in the problems we consider. The iterated exponentials $\exp_m(n)$, where $m$ is a nonnegative integer, are defined by induction on $m$. Let $\exp_0(n) = n$ and $\exp_{m+1}(n) = 2^{\exp_m(n)}$. The tower of twos function $\exp_\infty(n)$ is defined to be

$$\exp_\infty(n) = \left. 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}} \right\} \text{height } n.$$
Recall that a problem is elementary recursive if it is recognized in time $\exp_m(n)$ for some $m \ge 0$.

All of our bounds are for nondeterministic or alternating Turing machines. The set of problems recognized by nondeterministic Turing machines in time $T(n)$ is denoted NTIME($T(n)$). With alternating Turing machines we will be concerned chiefly with the complexity classes ATIME($T(n)$, $cn$), the set of problems recognized by an alternating Turing machine in time $T(n)$ making at most $cn$ alternations. We will assume that alternating Turing machines have four types of states: universal, existential, accepting, and rejecting. See Chandra et al. [1981] for the definition of acceptance by alternating Turing machines and a description of the computation trees associated with these machines.

We will sometimes say that a theory $\Sigma$ has a hereditary NTIME($T(cn)$) (or ATIME($T(cn)$, $cn$)) lower bound. By this we mean that there is a $c > 0$ such that for all $\Sigma' \subseteq \mathrm{val}(\Sigma)$, neither $\mathrm{sat}(\Sigma')$ nor $\mathrm{val}(\Sigma')$ is in NTIME($T(cn)$)
Proving lower bounds
(or in ATIME($T(cn)$, $cn$)).

A log-lin reduction is a mapping computable in log space and linear time. In some sources this terminology is used for a log space computable, linearly bounded mapping, which is a weaker notion. (Linearly bounded means that output length is less than some constant multiple of input length.) It is not crucial for the applications presented here that our reductions be quite so restricted: polynomial-time, linearly bounded reductions suffice. However, to obtain some results in the literature, such as the nondeterministic polynomial lower bounds in Grandjean [1983], linear time reductions would be needed.

We encounter a technical problem with log-lin reductions: we do not know if they are closed under composition. To overcome this difficulty we define a stronger notion of reset log-lin reduction. A machine performing such a reduction is a log space, linear time bounded Turing machine with work tapes, an input tape, and an output tape. It has the capability to reset the input tape head to the initial input cell on $k$ moves during a computation, where $k$ is fixed for all inputs; on all other moves the input tape head remains in place or moves one cell to the right. It writes the output sequentially from left to right.

Suppose that $M'$ and $M''$ are two such machines using at most $k'$ and $k''$ resets, respectively. We informally describe a machine $M$ to compute the composition of the reductions computed by $M'$ and $M''$. Imagine that the output tape of $M'$ and the input tape of $M''$ have been removed. Instead, $M'$ sends its output directly to $M''$. As $M''$ computes its output, it calls $M'$ to supply it with a new symbol on those moves when the input head of $M''$ would have moved right. $M'$ has only to resume its computation from the last call to supply this symbol. On those moves where $M''$ would have reset its input head, $M'$ must begin its computation anew.
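The on-demand composition just described can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: a machine is modelled as a generator that reads its input through an `open_input` callback (each call starts a fresh left-to-right pass, so every call after the first corresponds to a reset), and the log space and linear time bounds are not modelled at all. The names `duplicate`, `compose`, and `run` are hypothetical and do not come from the text.

```python
from typing import Callable, Iterator, Tuple

# Assumption: calling an Opener starts a fresh left-to-right pass over the input.
Opener = Callable[[], Iterator[str]]
# Assumption: a machine writes its output sequentially, reading only via the opener.
Machine = Callable[[Opener], Iterator[str]]


def duplicate(open_input: Opener) -> Iterator[str]:
    """Toy reduction w -> ww that uses one reset (two passes over its input)."""
    for _ in range(2):
        yield from open_input()


def compose(first: Machine, second: Machine) -> Machine:
    """Feed the output of `first` to `second` symbol by symbol.

    Whenever `second` restarts its input, `first` is restarted from scratch.
    """
    def composed(open_input: Opener) -> Iterator[str]:
        return second(lambda: first(open_input))
    return composed


def run(machine: Machine, word: str) -> Tuple[str, int]:
    """Run a machine on `word`; return its output and the number of resets used."""
    passes = 0

    def open_input() -> Iterator[str]:
        nonlocal passes
        passes += 1          # every pass after the first one is a reset
        return iter(word)

    output = "".join(machine(open_input))
    return output, passes - 1
```

With two copies of `duplicate` (one reset each), `run(compose(duplicate, duplicate), "ab")` produces `"abababab"` using three resets of the real input head, in line with the count of resets in the argument that follows.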
Now the input head of $M''$ would have passed over each input cell at most $k'' + 1$ times during the computation, and to supply each symbol the input head of $M'$ passes over each input cell at most $k' + 1$ times. Thus, $M$ resets its input head at most $(k' + 1)(k'' + 1) - 1$ times. Clearly, $M$ is log space bounded. Since the part of $M$ corresponding to $M'$ is forced to begin its computation anew at most $k''$ times, it is easy to see that $M$ is linear time bounded.

It is not difficult to show that the prenex formulas of a logic are closed under relativization up to reset log-lin reductions. That is, there is a reset log-lin reduction which takes formulas of the form $\varphi^D$, where $\varphi$ is a prenex formula with no variable quantified more than once, to equivalent prenex formulas. We use this fact often. Unfortunately, we know of no way to eliminate duplicate quantifications of variables using reset log-lin reductions, but this can be accomplished easily with polynomial-time, linearly bounded reductions.

A problem $\Sigma$ is hard for a complexity class $\mathcal{C}$ via reductions from a class $\mathcal{S}$ if every problem $\Sigma' \in \mathcal{C}$ can be reduced to $\Sigma$ by some $f \in \mathcal{S}$. That is, if
$A$ and $A'$ are the alphabets for $\Sigma$ and $\Sigma'$ respectively, then $f$ maps $A'^*$ to $A^*$ so that $w \in \Sigma'$ if and only if $f(w) \in \Sigma$. If, in addition, $\Sigma \in \mathcal{C}$, we say that $\Sigma$ is complete for $\mathcal{C}$ via reductions from $\mathcal{S}$.
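For concreteness, the iterated exponentials and the tower of twos function introduced in this section can be computed directly for small arguments. The following Python sketch assumes that the tower of twos has height $n$, so that it equals $\exp_n(1)$; the function names `exp_iter` and `tower_of_twos` are illustrative and are not notation from the text.

```python
def exp_iter(m: int, n: int) -> int:
    """Iterated exponential: exp_0(n) = n and exp_{m+1}(n) = 2 ** exp_m(n)."""
    value = n
    for _ in range(m):
        value = 2 ** value
    return value


def tower_of_twos(n: int) -> int:
    """Tower of n twos, computed as exp_n(1) (assumed height convention)."""
    return exp_iter(n, 1)
```

The values grow very quickly: a tower of height 4 is already 65536, and height 5 has 19729 decimal digits, so only very small arguments are practical.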
3 Reductions between formulas
One of our goals is to develop effective and easily used methods for transferring lower bounds from one problem to another. Our methods are based on interpretations between theories (or equivalently, between classes of models) and can be seen as an extension of the most widely used methods for proving the undecidability of logical theories; see Ersov et al. [1965] and Rabin [1965] for a discussion of undecidable theories from this point of view. To obtain complexity lower bounds for decidable theories we must use interpretations which have a somewhat more general form than those used in undecidability proofs, and there are certain technicalities about lengths of formulas which must be addressed in this more general setting. In this section we will develop the required machinery. The first-time reader may wish to skip the proofs in this section as they are somewhat tedious and only the statements of results will be used later.

A common method for proving that a theory $\Sigma$ in a logic $L$ is undecidable is to show that the theory $\Sigma_0$ of finite binary relations, formulated in the logic $L_0$, can be interpreted in $\Sigma$. In the simplest case this means that formulas $\delta(x, u)$ and $\pi(x, y, u)$ of $L$ are given so that every finite binary relation can be obtained (up to isomorphism) in the form
$$\langle \delta^{\mathfrak{A}}(x, m),\ \pi^{\mathfrak{A}}(x, y, m) \rangle$$
for $\mathfrak{A}$ a model of $\Sigma$ and $m$ a sequence of elements of $\mathfrak{A}$. The formulas $\delta$ and $\pi$ are then used to define a reduction from formulas of $L_0$ to formulas of $L$, as follows: given a formula $\varphi$ of $L_0$, replace every occurrence of an atomic formula $P(z, t)$ by the formula $\pi(z, t, u)$ and relativize every quantifier to the formula $\delta$. (One must rewrite bound variables to avoid conflicts and make sure that $u$ is a sequence of otherwise unused variables.) Call the resulting formula $\varphi'$. The reduction mapping $\varphi \mapsto \varphi'$ is then used to obtain undecidability results for $\Sigma$ from corresponding results for $\Sigma_0$.

This kind of simple interpretation is not adequate for obtaining lower complexity bounds when $\Sigma$ is a decidable theory. One works instead with a parameterized family of formulas $\{\varphi_n \mid n > 0\}$ from $L_0$ and uses a sequence of formula pairs $\{(\delta_n(x, u), \pi_n(x, y, u)) \mid n > 0\}$ from $L$. In reducing the formulas $\varphi_n$ to $L$ one proceeds as above, except that $\varphi_n'$ is obtained from $\varphi_n$ using $\delta_n$ and $\pi_n$. In complexity lower bound arguments, it is not only necessary that the function $\varphi_n \mapsto \varphi_n'$ should be efficiently computable, but also that $|\varphi_n'|$ should be linearly bounded in $|\varphi_n|$. If $P$ occurs many times in $\varphi_n$ and $|\pi_n|$ grows without bound as $n$ increases, or if $\varphi_n$ has many quantifiers and $|\delta_n|$ grows without bound as $n$ increases, then the linear
boundedness condition may not hold. However, in certain cases there are methods to efficiently replace $\varphi_n'$ by an equivalent formula for which the linear boundedness condition does hold. Roughly speaking, we can do this when the formulas $\delta_n$ and $\pi_n$ are all in prenex form, or are obtained by a certain kind of iterative procedure. The machinery developed here to accomplish this task is implicit in most complexity lower bound arguments for logical problems.

In order to describe this machinery, it is convenient to introduce an extension $L^*$ of each logic $L$, in which explicit definitions are allowed. ($L^*$ has no more expressive power than $L$, but properties can sometimes be expressed by shorter formulas in $L^*$ than in $L$.) Continuing the example above, let $\varphi_n^D$ denote a formula in which all quantifiers of $\varphi_n$ have been relativized to a new unary relation symbol $D$. Then the extended language $L^*$ in this case would include a formula
$$[P(x, y) \equiv \pi_n(x, y, u)]\,[D(x) \equiv \delta_n(x, u)]\,\varphi_n^D$$
whose interpretation is exactly the same as that of $\varphi_n'$, although its length is likely to be more under control. Here the equivalences in brackets are interpreted to mean that $P$ is explicitly defined by $\pi_n$ and that $D$ is explicitly defined by $\delta_n$. The general problem, treated below in this section, is to find situations in which certain formulas of the extended language $L^*$ can be efficiently reduced to equivalent formulas of $L$, without a significant increase in the length of the formulas. (In general, it is possible to find for each $L^*$ formula of length $n$ an equivalent $L$ formula of length $O(n \log n)$; this is not good enough for sharp complexity bounds.)

Let $L$ be either a first-order or monadic second-order logic. Define $L^*$ as follows. Formulas of $L^*$ may contain any of the symbols occurring in formulas of $L$ and, in addition, relation variables $S_i^j$ for each $i, j \ge 0$. In each case the arity of $S_i^j$ is $j$ and the subscript and superscript of $S_i^j$ are expressed in binary notation. (If $L$ is a monadic second-order logic we need two superscripts, the first denoting the arity of element arguments and the second denoting the arity of set arguments.) Subscripts and superscripts of relation variables contribute to the length of formulas in which they occur, just as element variable subscripts do. (However, superscripts may be ignored in asymptotic estimates of formula length because they are dominated in length by their corresponding argument lists.)

We define the set of formulas of $L^*$ inductively, and at the same time define free($\varphi$), the set of free variables in $\varphi$. An atomic formula of $L^*$ is either an atomic formula of $L$ or a formula $P(x_1, \ldots, x_j)$ where $P$ denotes a relation variable. In the former case free($\varphi$) is the same as in $L$; in the latter, free($\varphi$) $= \{P, x_1, \ldots, x_j\}$. More complex formulas may be constructed using the logical connectives and quantifiers appropriate to $L$; in these cases free($\varphi$) is defined just as in $L$. The only other way to construct more complex
formulas is by explicit definition. Let $\varphi$ and $\theta$ be formulas in $L^*$, $P$ a relation variable which does not occur freely in $\theta$, and $x = x_1, \ldots, x_j$ a sequence of distinct element variables. Then $\psi$ given by
$$[P(x) \equiv \theta]\,\varphi$$
is also a formula of $L^*$ and free($\psi$) $= (\mathrm{free}(\theta) \setminus \{x_1, \ldots, x_j\}) \cup (\mathrm{free}(\varphi) \setminus \{P\})$. The part of $\psi$ within brackets is an explicit definition which defines the interpretation of $P$ in $\varphi$. If $\theta$ is a prenex formula from $L$ we will say that it is a prenex definition. The truth value of $\psi$ is the same as that of the second-order expression
$$\exists P\,(\forall x\,(P(x) \leftrightarrow \theta) \wedge \varphi).$$
Notice that the truth value is consistent with the definition of free($\psi$). Notice also that the second-order expression above is equivalent to
$$\forall P\,(\forall x\,(P(x) \leftrightarrow \theta) \to \varphi),$$
so that $\neg[P(x) \equiv \theta]\,\varphi$ is equivalent to $[P(x) \equiv \theta]\,\neg\varphi$.

If free($\psi$) $= \emptyset$, then $\psi$ is a sentence of $L^*$. We will let $\mathrm{sat}^*(\Sigma)$ denote the set of sentences from $L^*$ true in some model of $\Sigma$, $\mathrm{sat}_T^*(\Sigma)$ denote the set of sentences $\varphi$ from $L^*$ true in some model of $\Sigma$ of size at most $T(|\varphi|)$, $\mathrm{sat}_T^*(L)$ denote the set of sentences $\varphi$ from $L^*$ true in some model of size at most $T(|\varphi|)$, and $\mathrm{inv}^*(L)$ denote the set of sentences from $L^*$ true in no model (or no weak model when $L$ is a monadic second-order logic).

Introduction of explicitly defined relations is standard practice in mathematical discourse. Explicitly defined relations are also similar to nonrecursive procedures in programming languages.

Explicit definitions can be used to define reductions between satisfiability problems. To provide good lower bounds these reductions must be efficiently computable and linearly bounded. We will show, in fact, that there are reset log-lin reductions, defined on certain subsets of sentences from $L^*$, that take formulas to equivalent formulas in $L$. (Unfortunately, such reductions probably cannot be defined on the set of all sentences in $L^*$; with a little effort we can produce a polynomial-time reduction which maps sentences in $L^*$ of length $n$ to equivalent sentences in $L$ of length $O(n \log n)$.)

We inductively define positive and negative occurrences of a relation symbol $Q$ in formulas from $L^*$. $Q$ occurs positively in atomic formulas
of the form $Q(x)$. $Q$ occurs positively (negatively) in the formulas $\varphi \wedge \psi$ and $\varphi \vee \psi$ when it occurs positively (negatively) in either of the formulas $\varphi$ or $\psi$. $Q$ occurs positively (negatively) in the formula $\neg\varphi$ if it occurs negatively (positively) in the formula $\varphi$. $Q$ occurs positively (negatively) in the formulas $\forall x\,\varphi$ and $\exists x\,\varphi$ if it occurs positively (negatively) in the formula $\varphi$. $Q$ occurs positively (negatively) in the formula $[P(x) \equiv \theta]\,\varphi$, where $P$ is not $Q$, if it occurs positively (negatively) in $\varphi$, or if it occurs positively (negatively) in $\theta$ and $P$ occurs positively in $\varphi$, or if it occurs negatively (positively) in $\theta$ and $P$ occurs negatively in $\varphi$. We say that $P$ occurs only positively in a formula if it does not occur negatively (in particular, it may not occur at all).

We inductively define an iterative definition $[P(x) \equiv \theta]_n$ as follows. The iterative definition $[P(x) \equiv \theta]_0$ is equivalent to the explicit definition $[P(x) \equiv \bot]$, where $\bot$ is a sentence false in all models (or all weak models if $L$ is a monadic second-order logic). The iterative definition $[P(x) \equiv \theta]_{n+1}$ is equivalent to $[P(x) \equiv [P(x) \equiv \theta]_n\,\theta]$. We make iterative definitions part of the syntax of $L^*$, but we require that the subscript $n$ be written in unary notation so that the length of an iterative definition is of the same order ($O(n)$ and $\Omega(n)$) as the length of the nested explicit definitions it replaces. We call $\theta$ the operator formula for the iterative definition.

We can think of iterative definitions as approximations to implicit definitions. We will not formally define implicit definitions, since they do not figure directly in what follows, but an example should convey the idea. (See Moschovakis [1974] for an account.) Consider a language containing just a binary relation symbol $E$ denoting the edge relation on graphs. The implicit definition
$$[P(x, y) \equiv (x = y \vee \exists z\,(P(x, z) \wedge E(z, y)))]$$
defines the path relation in each graph: $P(x, y)$ is the least relation satisfying the equivalence, so it holds precisely when there is a path between $x$ and $y$. Now consider the related iterative definition
$$[P(x, y) \equiv (x = y \vee \exists z\,(P(x, z) \wedge E(z, y)))]_n$$
which defines a relation $P(x, y)$ which holds precisely when the distance between $x$ and $y$ is at most $n - 1$ (when $n > 1$). Notice that this 'approximation' to the implicitly defined relation does not converge very rapidly. The iterative definition
$$[P(x, y) \equiv (x = y \vee E(x, y) \vee \exists z\,(P(x, z) \wedge P(z, y)))]_n$$
defines a relation $P(x, y)$ which holds precisely when the distance between $x$ and $y$ is at most $2^{n-1}$ (for $n \ge 1$), so this approximation to the path relation converges exponentially 'faster'.

For an implicit definition to make sense, $\theta = \theta(P)$ should be monotone in $P$ (i.e., for every structure $\mathfrak{A}$, if $P$ and $P'$ are relations on $\mathfrak{A}$ with $P \subseteq P'$, then $\theta^{\mathfrak{A}}(P) \subseteq \theta^{\mathfrak{A}}(P')$). Monotonicity can be guaranteed by requiring that $P$ is positive in $\theta$. No such restriction is needed for iterative definitions. In most of our applications $P$ does occur positively and the iterative definitions approximate an implicit definition. Usually, the faster the convergence, the better the lower bounds obtained by our methods. We will see that the positivity of $P$ in $\theta$ does have implications in lower bound results.

To show that we can efficiently transform iterative definitions into equivalent explicit definitions, we require the following theorem, which will also be used to show that certain sets of formulas in $L^*$ can be efficiently transformed into equivalent formulas from $L$.

Theorem 3.1. Let $L$ be a first-order or monadic second-order logic and let $L'$ be a logic, of the same type, whose vocabulary consists of the vocabulary of $L$ together with relation symbols $P_1, \ldots, P_m$. There is a reset log-lin reduction taking each prenex formula $\varphi$ of $L'$ to an equivalent prenex formula $\varphi'$ of $L'$ having at most one occurrence of each $P_j$.

Proof. The proof follows an argument of Ferrante and Rackoff [1979, pp. 155-157]. We must add some details, however, because they were not interested in obtaining a reset log-lin reduction. We adopt the same assumption they did there: we assume that $L$ has a symbol for equality and that all structures have cardinality at least 2. We could dispense with this assumption at the cost of added complications. We deal explicitly only with the case $m = 1$. It will be clear from the proof that the procedure can be iterated to treat $P_1, P_2, \ldots$ in succession.
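The two iterative definitions of the path relation discussed before Theorem 3.1 can be evaluated on a small graph to see the difference in convergence. The following Python sketch is only an illustration: it assumes a finite directed graph given as a set of edge pairs, starts from the empty relation at depth 0, and applies the operator formula $n$ times. The helper names (`iterate`, `slow_operator`, `fast_operator`) and the example graph are hypothetical.

```python
from itertools import product


def iterate(operator, n):
    """Depth-n iterative definition: start from the empty relation, apply the operator n times."""
    relation = frozenset()
    for _ in range(n):
        relation = frozenset(operator(relation))
    return relation


def slow_operator(nodes, edges):
    """Operator for: x = y  or  exists z (P(x, z) and E(z, y))."""
    def op(p):
        return {(x, y) for x, y in product(nodes, repeat=2)
                if x == y or any((x, z) in p and (z, y) in edges for z in nodes)}
    return op


def fast_operator(nodes, edges):
    """Operator for: x = y  or  E(x, y)  or  exists z (P(x, z) and P(z, y))."""
    def op(p):
        return {(x, y) for x, y in product(nodes, repeat=2)
                if x == y or (x, y) in edges
                or any((x, z) in p and (z, y) in p for z in nodes)}
    return op


# Example graph (an assumption for illustration): the directed path 0 -> 1 -> ... -> 8.
NODES = range(9)
EDGES = {(i, i + 1) for i in range(8)}

slow4 = iterate(slow_operator(NODES, EDGES), 4)   # pairs at distance at most 3
fast3 = iterate(fast_operator(NODES, EDGES), 3)   # pairs at distance at most 4
fast4 = iterate(fast_operator(NODES, EDGES), 4)   # pairs at distance at most 8
```

On this path graph, depth 4 of the first definition relates node 0 only to nodes up to 3, while depth 4 of the second already relates node 0 to node 8.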
We describe the action of our algorithm on $\varphi$, a prenex formula from $L'$. First add a 0 bit to the end of every variable index occurring in $\varphi$. This will allow us to introduce variables of odd index without creating a conflict. Now $\varphi$ is of the form
$$Q_1 x_1\, Q_2 x_2 \cdots Q_k x_k\, \psi,$$
where each $Q_i$ is a quantifier and $\psi$ is quantifier-free. Let
$$P_1(x_{11}, \ldots, x_{1l}),\ \ldots,\ P_1(x_{r1}, \ldots, x_{rl})$$
be all the subformulas of $\psi$ containing $P_1$. The idea is to replace each subformula $P_1(x_{i1}, \ldots, x_{il})$ of $\psi$ with a Boolean variable and stipulate with a formula containing just one occurrence of $P_1$ that each of these Boolean variables has the same truth value
as the formula it replaces. Since we have no Boolean variable type, we instead replace each subformula $P_1(x_{i1}, \ldots, x_{il})$ with an equation $v_1 = v_{b(i)}$. We must ensure for each $i$ that $b(i)$ is odd and greater than 1, that $b(i)$ is log-lin computable from $P_1(x_{i1}, \ldots, x_{il})$ (with no resets), and that $b(i) \ne b(j)$ when $i \ne j$. To produce $b$ satisfying these conditions suppose that $x_{i1}, \ldots, x_{il}$ are formal variables denoting actual variables with subscripts $j_1, \ldots, j_l$ respectively. In the string $j_1 \# j_2 \# \cdots \# j_l$ replace every occurrence of 0 with 01, of 1 with 11, and of # with 10; let the result be $b(i)$. Let $\psi'$ be the result of replacing each formula $P_1(x_{i1}, \ldots, x_{il})$ in $\psi$ by the formula $v_1 = v_{b(i)}$. Now with a little effort we can see that $\varphi$ is equivalent to
where $y$ and $y^* = y_1, \ldots, y_l$ denote variables with odd indices of odd length. It is not difficult to verify that this formula is reset log-lin computable from $\varphi$ using two resets. ∎

Remark 3.2. The formula $\varphi'$ in the proof of Theorem 3.1 uses the symbol $\leftrightarrow$. If we require that formulas use only the Boolean connectives $\wedge$, $\vee$, and $\neg$, we must expand the subformula $v_1 = y \leftrightarrow P_1(y^*)$ of $\varphi'$ to obtain a formula in which $P$ occurs twice, once positively and once negatively. It is easy to see that this is the best we can do. Suppose that the number of occurrences of $P$ in such a formula could be reduced to one. If this occurrence were positive, then $\varphi'$ would be monotone in $P$ (i.e., truth is preserved when the interpretation of $P$ is expanded). If this occurrence were negative, then $\neg\varphi'$ would be monotone in $P$. But it is easy to produce a formula $\varphi$ such that neither it nor its negation is monotone in $P$. However, in the case where $P$ occurs only positively in $\varphi$, we can construct $\varphi'$ in the proof of Theorem 3.1 using the subformula $v_1 = y \to P_1(y^*)$ in place of $v_1 = y \leftrightarrow P_1(y^*)$. For this case the theorem is true even if just the connectives $\wedge$, $\vee$, and $\neg$ are allowed. This is one of the advantages of using positive formulas.

Theorem 3.3. Let $L$ be a first-order or monadic second-order logic and $l$ be a fixed positive integer. There is a reset log-lin reduction taking each iterative definition of the form $[P(x) \equiv \theta]_n$, where $\theta$ is a formula from $L^*$ of length at most $l$, to an equivalent explicit definition.

Proof. Since there are only finitely many formulas from $L^*$ of length at most $l$, we may, given such a formula $\theta$, find an equivalent prenex formula $\theta'$ from $L$ in constant time. Moreover, by Theorem 3.1 we may assume that $P$ occurs in $\theta'$ just once, say in a subformula $P(y)$. Define formulas $\theta_n = \theta_n(x)$ by induction on $n$. Let $\theta_0$ be a sentence false in all models
(or all weak models if $L$ is a monadic second-order logic). Form $\theta_{i+1}$ by substituting the variables $y$ for corresponding free variables $x$ in $\theta_i$ (perhaps changing other variables to avoid conflicts) and substituting the result for $P(y)$ in $\theta'$. It is clear that $[P(x) \equiv \theta]_n$ is equivalent to $[P(x) \equiv \theta_n]$. If the substitution of variables has been done in a systematic way in the construction of $\theta_n$, then it is clear that $\theta_n$ can be obtained from $[P(x) \equiv \theta]_n$ by a reset log-lin reduction. ∎

Often we need to make several iterative definitions simultaneously. For example, Fischer and Rabin [1974], in their lower bound proof for the theory of real addition, define two sequences of formulas, $\mu_n(x, y, z)$ and a second sequence written $\varepsilon_n(x, y, z)$ here. Formula $\mu_n(x, y, z)$ holds precisely when $x$ is a nonnegative integer less than $2^{2^n}$ and $x \cdot y = z$; formula $\varepsilon_n(x, y, z)$ holds precisely when $x$, $y^x$, and $z$ are nonnegative integers less than $2^{2^n}$ and $y^x = z$. These definitions are simultaneous: the definition of $\varepsilon_{n+1}$, for example, depends not only on $\varepsilon_n$, but also on $\mu_n$.

Let us make the notion of simultaneous definition precise. Let $L$ be a first-order or monadic second-order logic and $\theta_1, \ldots, \theta_k$ be formulas from $L^*$. A simultaneous iterative definition is denoted
$$\begin{bmatrix} P_1(x_1) \equiv \theta_1 \\ \vdots \\ P_k(x_k) \equiv \theta_k \end{bmatrix}_n$$
Fix a structure $\mathfrak{A}$ for $L$ and an assignment from $\mathfrak{A}$ to the free variables of this definition (defined in the obvious way). The simultaneous iterative definition assigns a relation from the universe of $\mathfrak{A}$ to each symbol $P_1, \ldots, P_k$. We define this assignment by induction on the depth $n$ of the definition. When $n = 0$ it assigns the empty relation to each symbol. When $n > 0$ the relation assigned to $P_i$ is determined by $\theta_i$, letting the assignment for depth $n - 1$ interpret the free occurrences of $P_1, \ldots, P_k$ in $\theta_i$.

We can use simultaneous iterative definitions to augment the syntax of a logic in the same way we used iterative definitions. In particular, subscripts on definitions are expressed in unary notation. The following theorem shows that simultaneous iterative definitions do not increase the expressiveness of a logic. Moreover, their use does not make for appreciably shorter expressions than use of ordinary iterative definitions. The theorem is proved along the same lines as similar results in Fischer and Rabin [1974] and Ferrante and Rackoff [1979, p. 159]. Moschovakis [1974, p. 12] used similar ideas to prove an analogous theorem for simultaneous implicit definitions.

Theorem 3.4. Let $L$ be a first-order or monadic second-order logic and $P_1, \ldots, P_k$ be fixed relation variables. There is a reset log-lin reduction taking each formula of the form
$$\begin{bmatrix} P_1(x_1) \equiv \theta_1 \\ \vdots \\ P_k(x_k) \equiv \theta_k \end{bmatrix}_n \varphi,$$
where $\varphi$ and $\theta_1, \ldots, \theta_k$ are formulas from $L^*$ whose only free relation variables are $P_1, \ldots, P_k$, to an equivalent formula of the form
$$[P(x) \equiv \theta']_n\, \varphi',$$
where $\theta'$ and $\varphi'$ are formulas from $L^*$ whose only free relation variable is $P$. Moreover, if $P_1, \ldots, P_k$ occur only positively in each of the formulas $\theta_1, \ldots, \theta_k$, then we may arrange that $P$ occurs only positively in $\theta'$.

Proof. As before, we assume that $L$ has a symbol for equality and that all structures have cardinality at least 2. Again, we could dispense with these assumptions at the cost of added complications.

Without loss of generality, we may assume that the variable sequences $x_1, \ldots, x_k$ are mutually disjoint. Let $z$ denote a sequence $z, z_1, \ldots, z_k$ of distinct variables disjoint from $x_1, \ldots, x_k$. The idea of the proof is that one relation $P(z, x_1, \ldots, x_k)$ will code the relations $P_1(x_1), \ldots, P_k(x_k)$. To be more precise, the relation $P(z, x_1, \ldots, x_k)$ is equivalent to
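Thus, a particular $P_i(x_i)$ can be extracted by writing

Call this formula $\delta_i(x_i)$ (or $\delta_i$ for short). Define $\theta'$ to be the $L^*$ formula

Notice that $P$ is the only free relation variable in $\theta'$. Let $\varphi'$ be the formula
$$[P_1(x_1) \equiv \delta_1] \cdots [P_k(x_k) \equiv \delta_k]\, \varphi.$$
Now we can easily show that
$$\begin{bmatrix} P_1(x_1) \equiv \theta_1 \\ \vdots \\ P_k(x_k) \equiv \theta_k \end{bmatrix}_n \varphi$$
is equivalent to $[P(x) \equiv \theta']_n\, \varphi'$ by induction on $n$. ∎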
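The coding idea behind Theorem 3.4 can be illustrated semantically. The following Python sketch is a simplification under stated assumptions: relations live on a small finite universe, each operator is given as a Python function rather than a formula, and the single coding relation tags tuples with an explicit integer index instead of the equality pattern among the variables $z, z_1, \ldots, z_k$ used in the proof. All names, and the even/odd example, are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Relation = FrozenSet[Tuple[int, ...]]
Operator = Callable[[Sequence[Relation]], set]


def iterate_simultaneous(ops: Sequence[Operator], n: int) -> List[Relation]:
    """Depth-n simultaneous iteration: every P_i starts empty and all are updated together."""
    rels: List[Relation] = [frozenset() for _ in ops]
    for _ in range(n):
        # All operators read the depth n-1 relations before any is replaced.
        rels = [frozenset(op(rels)) for op in ops]
    return rels


def decode(coded, k: int) -> List[Relation]:
    """Extract P_0, ..., P_{k-1} from the single tagged relation."""
    return [frozenset(t for (i, t) in coded if i == j) for j in range(k)]


def iterate_coded(ops: Sequence[Operator], n: int):
    """Depth-n iteration of ONE relation whose tuples carry an index tag (assumed coding)."""
    coded = frozenset()
    for _ in range(n):
        parts = decode(coded, len(ops))
        coded = frozenset((j, t) for j, op in enumerate(ops) for t in op(parts))
    return coded


# Illustrative operators on the universe {0, ..., 5}: "even" and "odd" defined from each other.
UNIVERSE = range(6)


def even_op(rels):
    _, odd = rels
    return {(x,) for x in UNIVERSE if x == 0 or (x - 1,) in odd}


def odd_op(rels):
    even, _ = rels
    return {(x,) for x in UNIVERSE if (x - 1,) in even}


OPS = [even_op, odd_op]
```

At every depth, decoding the single tagged relation gives back exactly the relations produced by the simultaneous iteration, which is the semantic content of the induction on $n$.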
Remark 3.5. Notice that in the proof of Theorem 3.4 formula $\varphi'$ is formed simply by inserting explicit definitions of fixed length before $\varphi$. These definitions may be eliminated by replacing relation variables in $\varphi$ with their corresponding definitions. Now if $\varphi$ is a prenex formula or a member of a prescribed set of formulas (defined below), it is easy to arrange that $\varphi'$ is a formula of the same type.

We can now say precisely which kinds of definitions are used in the reductions described at the beginning of this section: they are prenex definitions and iterative definitions. It is useful, therefore, to have terminology to describe sets of formulas in $L^*$ built up from prenex formulas using prenex and iterative definitions. We must place some restrictions on these sets to be able to efficiently translate them into equivalent formulas from $L$.

Let $L$ be a first-order or monadic second-order logic. Let $L'$ be the logic formed by adding relation variables $P_1, \ldots, P_k$ to the vocabulary of $L$, and $l$ be a fixed positive integer. A prescribed set of formulas over $L$ is a set of formulas of the form
$$[P_1(x_1) \equiv \theta_1]_{n_1} \cdots [P_k(x_k) \equiv \theta_k]_{n_k}\, \varphi,$$
where $\varphi$ is a prenex formula from $L'$, and for each $i$ either $n_i = 1$ and $\theta_i$ is a prenex formula from $L'$ in which only $P_1, \ldots, P_{i-1}$ may occur as free relation variables (i.e., $P_i$ has a prenex definition), or $\theta_i$ is a formula of length at most $l$ from $L^*$ in which only $P_1, \ldots, P_i$ may occur as free relation variables (i.e., $P_i$ has an iterative definition in which the operator formula has bounded length).

We place one further restriction on sets of prescribed formulas: each variable is quantified at most once in $\varphi$ and in each formula $\theta_i$ where $P_i$ has a prenex definition. We impose this condition so that when we relativize all the formulas within a set to a unary relation symbol $D$, there is a reset log-lin reduction taking resulting formulas to equivalent formulas from another prescribed set of formulas. The condition is easy to satisfy in practice.

We now present our fundamental theorem for making reductions between formulas.

Theorem 3.6. Let $L$ be a first-order or monadic second-order logic. For each prescribed set of formulas over $L$ there is a reset log-lin reduction taking each formula in the set to an equivalent formula in $L$.
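The restriction on quantified variables above is motivated by relativization to a unary relation symbol $D$. As a reminder of what relativization does syntactically, here is a minimal Python sketch. It assumes formulas are represented as nested tuples, a representation chosen here purely for illustration; the function `relativize` and the tuple tags are hypothetical, and the sketch makes no attempt to keep the result in prenex form.

```python
def relativize(phi, d="D"):
    """Relativize every quantifier of `phi` to the unary relation symbol `d`.

    Assumed representation (illustrative only):
      ("atom", name, args)          atomic formula
      ("not", f)                    negation
      ("and" | "or" | "implies", f, g)
      ("forall" | "exists", var, f)
    """
    tag = phi[0]
    if tag == "atom":
        return phi
    if tag == "not":
        return ("not", relativize(phi[1], d))
    if tag == "forall":
        _, var, body = phi
        # forall y psi  becomes  forall y (D(y) -> psi^D)
        return ("forall", var, ("implies", ("atom", d, (var,)), relativize(body, d)))
    if tag == "exists":
        _, var, body = phi
        # exists y psi  becomes  exists y (D(y) and psi^D)
        return ("exists", var, ("and", ("atom", d, (var,)), relativize(body, d)))
    # Remaining tags are binary connectives.
    return (tag, relativize(phi[1], d), relativize(phi[2], d))
```

For example, `relativize(("exists", "y", ("atom", "P", ("x", "y"))))` wraps the body in a conjunction with `D(y)`.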
Proof. Fix a prescribed set of formulas over $L$. There are relation variables $P_1, \ldots, P_k$ as in the definition such that all formulas in the set are of the form
$$[P_1(x_1) \equiv \theta_1]_{n_1} \cdots [P_k(x_k) \equiv \theta_k]_{n_k}\, \varphi,$$
where $\varphi$ is a prenex formula in which only $P_1, \ldots, P_k$ may occur as free relation variables, and for each $i$ either $n_i = 1$ and $\theta_i$ is a prenex formula in which only $P_1, \ldots, P_{i-1}$ may occur as free relation variables, or $\theta_i$ is a formula of length at most $l$ in which only $P_1, \ldots, P_i$ may occur as free relation variables.

At first glance it may seem that $P_1, \ldots, P_k$ are being defined simultaneously, but this is not the case. First $P_1$ is assigned a value by an iterative definition of depth $n_1$ which is substituted in the remaining definitions. Then $P_2$ is assigned a value by the next iterative definition of depth $n_2$ which is substituted in the remaining definitions, and so on. The proof combines this observation with the construction used in Theorem 3.4. As in that theorem, we will code the relations $P_1(x_1), \ldots, P_k(x_k)$ into a single relation $P(y)$ equivalent to
where $y$ is the variable sequence $z, z_1, \ldots, z_k, x_1, \ldots, x_k$. As before, let $\delta_i(x_i)$ be the formula

To construct a formula from $L$ equivalent to a formula of the form above we build, inductively, a sequence of formulas $\psi_0(y), \psi_1(y), \ldots, \psi_k(y)$. Begin by taking $\psi_0$ to be a sentence false in all models (or weak models, if $L$ is a monadic second-order logic). Suppose now that $\psi_{i-1}$ is given. Consider the simultaneous definition
This definition simply defines $P_i(x_i)$ to be $\theta_i$ and leaves the other relations unchanged. Use the construction in the proof of Theorem 3.4 to produce an
equivalent definition $[P(y) \equiv \eta_i(y)]$. Hence, $\eta_i$ is
where
is the formula
We claim that there is a reset log-lin reduction taking $\eta_i$ to an equivalent prenex formula $\eta_i'$ with just one subformula $P(u)$ in which $P$ occurs. Whether $P_i$ has a prenex definition, in which case $\theta_i$ is in prenex form, or an iterative definition, in which case $\theta_i$ is of bounded length, there is a simple reset log-lin reduction to convert $\theta_i$ into prenex form. Apply the reduction given by Theorem 3.1 to the result to obtain an equivalent formula in which each of the symbols $P_1, \ldots, P_k$ occurs just once. In this formula, for each $j$, substitute $\delta_j(y_j)$ for the subformula $P_j(y_j)$. Convert to prenex form again by a reset log-lin reduction and apply the reduction of Theorem 3.1 one more time to obtain $\eta_i'$ as desired. Notice that if $P_i$ has an iterative definition, $\eta_i'$ has length less than some constant determined by $l$ and the arities of $P_1, \ldots, P_k$.

If $P_i$ has a prenex definition, form $\psi_i$ by substituting $\psi_{i-1}(u)$ for $P(u)$ in $\eta_i'$. If $P_i$ has an iterative definition we must make several substitutions. Beginning with $\psi_{i-1}$, replace free variables with the corresponding variables $u$ and substitute the result for $P(u)$ in $\eta_i'$. Repeat this operation $n_i$ times. The resulting formula is $\psi_i$. In either case it is easy to see that $\psi_i$ is obtained by a reset log-lin reduction.

Since $\varphi$ is in prenex form, we can apply the reset log-lin reduction of Theorem 3.1 to obtain an equivalent prenex formula $\varphi'$ in which each of the symbols $P_1, \ldots, P_k$ occurs at most once. As before, there is a reset log-lin reduction to convert
into a prenex formula with just one subformula $P(u)$ in which $P$ occurs. Substitute $\psi_k(u)$ for this subformula to obtain finally the desired formula of $L$. Repeated use of closure of reset log-lin reductions under composition shows that the mapping is reset log-lin computable. ∎

Remark 3.7. Scrutiny of the preceding proof reveals two useful facts. First, if all the symbols $P_i$ have prenex definitions we can arrange that the resulting formula is in prenex form. Second, if we wish to restrict to formulas in which the only connectives are $\wedge$, $\vee$, and $\neg$, the theorem remains true providing $P_i$
occurs only positively in $\theta_i$ when $P_i$ has an iterative definition. To see this, observe that by the remark following Theorem 3.1 we can always ensure that the formulas $\eta_i'$ each contain at most two occurrences of $P$. This is not a problem when $P_i$ has a prenex definition because $\eta_i'$ figures only once in the construction of $\psi_k$ and there are a bounded number of such definitions. When $P_i$ has an iterative definition we can ensure, again by the remark following Theorem 3.1, that $P$ occurs at most once in $\eta_i'$ since it occurs only positively in $\eta_i$.
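The index map $b$ from the proof of Theorem 3.1 is simple enough to write out. The following Python sketch assumes that variable subscripts are given as nonempty binary strings and returns $b(i)$ as a binary string; the function name `encode_index` is hypothetical.

```python
from typing import Sequence

# Substitution from the proof: 0 -> 01, 1 -> 11, # -> 10.
_CODE = {"0": "01", "1": "11", "#": "10"}


def encode_index(subscripts: Sequence[str]) -> str:
    """Encode the subscripts j_1, ..., j_l of an argument list as one binary index b(i)."""
    if not subscripts:
        raise ValueError("at least one subscript is required")
    for s in subscripts:
        if s == "" or any(ch not in "01" for ch in s):
            raise ValueError("subscripts must be nonempty binary strings")
    joined = "#".join(subscripts)          # j_1 # j_2 # ... # j_l
    return "".join(_CODE[ch] for ch in joined)
```

Because every symbol is replaced by a two-bit block, distinct argument lists receive distinct strings, and because the last block encodes a binary digit, the result always ends in 1, so the index is odd. The result also has even length, which keeps it apart from the fresh variables with odd indices of odd length used in the proof.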
4 Inseparability results for first-order theories
Hereditary lower bound results have proofs similar to the classical hereditary undecidability results. Young [1985], for example, modified techniques used in the proof of the hereditary version of Gödel's undecidability theorem, which states that all subtheories of Peano arithmetic are undecidable, to show that all subtheories of Presburger arithmetic have an NTIME($2^{2^{cn}}$) lower bound. Our starting point is another classical undecidability result: the Trakhtenbrot-Vaught inseparability theorem. Many hereditary undecidability results have been derived from this theorem.

Recall that $L_0$ is the first-order logic whose vocabulary contains just a binary relation symbol $P$. Let $\mathrm{fsat}(L_0)$ be the set of sentences of $L_0$ true in some finite model, and $\mathrm{inv}(L_0)$ the set of sentences of $L_0$ true in no model. The Trakhtenbrot-Vaught inseparability theorem states that $\mathrm{fsat}(L_0)$ and $\mathrm{inv}(L_0)$ are recursively inseparable: no recursive set contains one of these sets and is disjoint from the other. Trakhtenbrot [1950] showed this for a first-order logic with sufficiently many binary relations in its vocabulary and Vaught [1960; 1962] reduced the number of binary relations to one.

To see how this theorem gives hereditary undecidability results, suppose that for some theory $\Sigma$ in a logic $L$ there is a recursive reduction from the sentences of $L_0$ to the sentences of $L$ that takes $\mathrm{fsat}(L_0)$ into $\mathrm{sat}(\Sigma)$ and $\mathrm{inv}(L_0)$ into $\mathrm{inv}(L)$. Clearly $\mathrm{sat}(\Sigma)$ is not recursive since it separates the image of $\mathrm{fsat}(L_0)$ from the image of $\mathrm{inv}(L_0)$. Moreover, if $\Sigma' \subseteq \mathrm{val}(\Sigma)$, then $\mathrm{sat}(\Sigma) \subseteq \mathrm{sat}(\Sigma')$ and $\mathrm{sat}(\Sigma') \cap \mathrm{inv}(L) = \emptyset$, so $\mathrm{sat}(\Sigma')$ is not recursive either.

Let $T(n)$ be a time resource bound. Recall that $\mathrm{sat}_T(L_0)$ is the set of sentences $\varphi$ in $L_0$ such that $\varphi$ is true in a structure of power at most $T(|\varphi|)$. Our analogue of the Trakhtenbrot-Vaught Inseparability Theorem states that for $T$ satisfying certain weak hypotheses, $\mathrm{sat}_T(L_0)$ and $\mathrm{inv}(L_0)$ are NTIME($T(cn)$)-inseparable for some $c > 0$.
That is, no set in NTIME(T(cn)) contains one of these sets and is disjoint from the other. We show, in fact, that the result is true if we restrict to prenex sentences in L0. Thus, using the reductions between formulas described in the previous section, we can obtain hereditary NTIME lower bounds for theories in much the same way that we obtain hereditary undecidability results. In
Kevin J. Compton and C. Ward Henson
the next section we prove an inseparability theorem which gives hereditary linear ATIME lower bounds. Our result is a consequence of the following theorem.

Theorem 4.1. Let $T(n)$ be a time resource bound and $\Lambda$ be an alphabet. Given a problem $A \subseteq \Lambda^*$ in $\bigcup_{c>0}$ NTIME($T(cn)$), there is a reset log-lin reduction taking each $w \in \Lambda^*$ to a prenex sentence $\varphi_w$ of $L_0$ such that if $w \in A$, then $\varphi_w \in \mathrm{sat}_T(L_0)$, and if $w \notin A$, then $\varphi_w \in \mathrm{inv}(L_0)$. Moreover, each variable occurring in $\varphi_w$ is quantified just once.

Proof. Let $M$ be a $T(cn)$ time-bounded nondeterministic Turing machine that accepts $A$. We may assume that on all inputs, all runs of $M$ eventually halt, since we may incorporate into $M$ a deterministic 'timer' which halts after some number of moves given by a fully time-constructible function dominating $T(cn)$. To simplify notation we assume that $M$ has just one tape. Extending to multitape Turing machines requires only minor modifications. Let $m$ be the number of tape symbols used by $M$. We assume that one of the tape symbols not in $\Lambda$ is a blank symbol, denoted #.

The proof has two parts. In the first we translate information about runs of $M$ on input $w$ into formulas $\varphi'_w$ in a logic with a vocabulary consisting of $m + 3$ binary relation symbols so that $\varphi'_w$ satisfies the conditions of the theorem. In the second we transform the sentences $\varphi'_w$ into the desired sentences $\varphi_w$ by combining $m + 3$ binary relations into one.

Translating Turing machine runs into first-order sentences is an old idea in logic; see Turing [1937], Büchi [1962], and, for a general discussion, Börger [1984a; 1984b; 1985]. Our translation of runs into sentences is standard except for some difficulties that must be overcome to obtain prenex sentences using reset log-lin reductions.

First we describe the intended meanings of the $m + 3$ binary relation symbols constituting the vocabulary of the logic for the sentences $\varphi'_w$. The symbol $\le$ will interpret a discrete linear order with a least element. For convenience we use infix notation with this symbol.
We call the least element with respect to this order 0 and denote the successor and predecessor of an element $x$ by $x + 1$ and $x - 1$, respectively. In this way we identify elements of a model with consecutive nonnegative integers. We also have relation symbols STATE, HEAD, and SYM$_a$ for each tape symbol $a$. STATE($x, t$) holds if $M$ is in state $x$ at time $t$. (States are ordered arbitrarily. We may assume that all models considered have at least as many elements as $M$ has states, since we can precede $\varphi'_w$ with enough dummy quantifiers to ensure that $T(|\varphi'_w|)$ exceeds the number of states.) HEAD($x, t$) holds if the read head of $M$ scans the tape cell at position $x$ at time $t$. SYM$_a$($x, t$) holds if the tape cell at position $x$ contains symbol $a$ at time $t$.

Let $\varphi'_w$ be a prenex sentence asserting the following.
(a) Relation $\le$ is a discrete linear order with a least element.
(b) Each tape cell contains precisely one symbol at each time. The
Proving lower bounds
read head scans precisely one cell at a given time. $M$ is in precisely one state at a given time.
(c) If HEAD($x, t$) does not hold and SYM$_a$($x, t$) holds, then SYM$_a$($x, t + 1$) also holds. If HEAD($x, t$) holds, then the values of SYM$_a$($x, t + 1$), HEAD($x \pm 1, t + 1$), and the element $z$ making STATE($z, t + 1$) true are determined by the values of SYM$_a$($x, t$) and the element $y$ making STATE($y, t$) true, in accordance with the transition function of $M$.
(d) The read head initially scans cell 0. $M$ begins in its initial state.
(e) If STATE($x, t$) holds, then $t$ has a successor if and only if $x$ is not a final state. For some $t$ there is a final state $x$ such that STATE($x, t$) holds.
(f) The input tape initially contains $w$.

Notice that (f) is the only conjunct depending on $w$; the others are fixed. Thus, if we can express (f) as a prenex sentence obtainable from $w$ by a reset log-lin reduction, it is a simple matter to produce a prenex sentence $\varphi'_w$ equivalent to the conjunction of (a)-(f) and obtainable from $w$ by a reset log-lin reduction.

Suppose that $w = a_0 a_1 \ldots a_{n-1}$ where each $a_i$ is an input alphabet symbol. We cannot just say that there are positions $v_0, v_1, \ldots, v_{n-1}$ such that $v_0 = 0$, $v_{i+1} = v_i + 1$, and SYM$_{a_i}$($v_i, 0$) for $i < n$, because the combined length of the variable indices is of order $n \log n$. We must define two relations LEFT and RIGHT to reduce the number of quantified variables. Intuitively, LEFT and RIGHT interpret the left and right child relations for a binary tree $\mathcal{T}$ on the first $n$ elements $0, \ldots, n - 1$ of the model. It is intended that LEFT($x, y$) holds precisely when $y = 2x$ and RIGHT($x, y$) holds precisely when $y = 2x + 1$. Then $\mathcal{T}$ has root 0 and 0 is its own left child (so the notion of tree is interpreted somewhat loosely).

Using the machinery of the previous section it is easy to give short formulas defining LEFT and RIGHT on long intervals. Let $\theta$ be the formula

[displayed formula missing in source]
Then the relation $P(x_1, x_2, y_1, y_2)$ given by the iterative definition

[displayed definition missing in source]

is true when $0 \le y_1 - x_1 \le 2^m - 1$ and $2(y_1 - x_1) = y_2 - x_2$. Now LEFT($x, y$) is equivalent to $P(0, 0, x, y)$ and RIGHT($x, y$) is equivalent to $P(0, 1, x, y)$ on the interval $0, \ldots, 2^m + 1$, so we take $m = \lceil \log(n - 1) \rceil$ to obtain RIGHT and LEFT on the interval $0, \ldots, n$. By Theorem 3.3 there are first-order formulas defining LEFT and RIGHT; moreover, they are computable from the unary representation of $m$ by a reset Turing machine in
time $\log n$ and space $\log \log n$. By increasing time to $\log n \log \log n$ we can make the formulas defining LEFT and RIGHT prenex.

The height of $\mathcal{T}$ is $h = \lceil \log n \rceil$. Now for every $i$ such that $0 \le i \le h$ and every $j < 2^{h-i}$ define a quantifier-free formula $\theta_{i,j}(x_0, \ldots, x_i)$ by induction on $i$. Roughly, $\theta_{i,j}(x_0, \ldots, x_i)$ says that if $(x_i, x_{i-1}, \ldots, x_0)$ is a path in $\mathcal{T}$ from vertex $j = x_i$, then the symbol at position $x_0$ on the input tape is $a_{x_0}$. First, $\theta_{0,j}(x_0)$ is SYM$_{a_j}$($x_0, 0$) when $j < n$ and some tautology (say $x_0 = x_0$) when $n \le j < 2^h$. Next, $\theta_{i+1,j}(x_0, \ldots, x_{i+1})$ is the formula

[displayed formula missing in source]

By induction on $i$ we can show that for $j < 2^{h-i}$ the sentence

[displayed sentence missing in source]

is true if and only if for every vertex $k < n$ which is an $i$th-generation descendant of $j$, SYM$_{a_k}$($k, 0$) holds. Since 0 is a left child of itself, every vertex in $\mathcal{T}$ is an $h$th-generation descendant of 0, so the sentence

[displayed sentence missing in source]

says that SYM$_{a_k}$($k, 0$) holds when $k < n$; that is, it says $w$ is written on the first $n$ cells of the input tape at time 0. Let $\psi_w$ be a conjunction of this sentence and a sentence that says SYM$_\#$($x, 0$) holds for all $x \ge n$ (that is, that all the tape cells from position $n$ onward are blank at time 0). We must show that there is a reset log-lin reduction taking $w$ to $\psi_w$, from which it follows easily that there is a reset log-lin reduction taking $w$ to $\varphi'_w$.
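The loose binary tree on the positions $0, \ldots, n - 1$ can be checked on small cases. The following Python sketch is only an illustration and not part of the construction: the function names are invented here, LEFT and RIGHT are modelled as the maps $y = 2x$ and $y = 2x + 1$, and $h$ is taken to be $\lceil \log n \rceil$ as in the text. It confirms that, because 0 is its own left child, the $h$th-generation descendants of 0 include every position below $n$.

```python
def left(x):
    # LEFT(x, y) is intended to hold exactly when y = 2x.
    return 2 * x


def right(x):
    # RIGHT(x, y) is intended to hold exactly when y = 2x + 1.
    return 2 * x + 1


def descendants(j, i):
    """Vertices reached from j by exactly i child steps (LEFT or RIGHT)."""
    level = {j}
    for _ in range(i):
        level = {child for v in level for child in (left(v), right(v))}
    return level


def height(n):
    """h = ceil(log2(n)) for n >= 1, computed without floating point."""
    return (n - 1).bit_length()


if __name__ == "__main__":
    n = 11
    h = height(n)
    # Every input position 0..n-1 is an h-th generation descendant of 0.
    print(h, set(range(n)) <= descendants(0, h))
```

Since 0 is its own left child, the $i$th-generation descendants of 0 are exactly $0, \ldots, 2^i - 1$, which is why a single family of paths of length $h$ suffices to reach every input position.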
We describe the actions of a machine effecting such a reduction. For the conjunct asserting that cells from position $n$ onward are blank this is straightforward, so we need only show it for the conjunct asserting that $w$ is written on the first $n$ cells. We will suppose that indices of variables are in unary; thus, the formal variable $x_i$ denotes the actual variable whose index is $i$ written in unary.
First, $w$ is read from the input tape while $h$ cells are marked off on a work tape. One way to accomplish this is simply to keep a binary count on a work tape of the number of input tape cells scanned. Incrementing the count requires changing the low-order 1-bits to 0 until encountering a 0, which is changed to 1. The work tape head is then returned to the
lowest-order bit to prepare for the next advance of the input head. It is not difficult to show that the time required to read the input tape and do all the increments is $O(n)$. Next the input head is reset. Now a simple algorithm utilizing a stack to keep track of subscripts will generate $\psi_w$, the sentence describing the initial tape contents. The maximal stack height is $h$. This sentence was defined in such a way that the information required from the input tape can be read off from left to right as the algorithm proceeds. Variable indices are easily computed from the stack height since they are in unary. This computation clearly uses just log space.

We need to show that it takes just linear time. The time required is less than a constant multiple of the length of $\theta_{h,0}(x_0, \ldots, x_h)$; the length of this formula is in turn less than a constant multiple of the combined lengths of the variable indices occurring within it. By induction on $i$, variable $x_k$ occurs no more than $3 \cdot 2^{i-k}$ times in $\theta_{i,j}$ when $k < i$, and $x_i$ occurs just twice. Hence, the combined lengths of the variable indices occurring in $\theta_{h,0}$ amount to no more than

[displayed bound missing in source]
Thus, the computation requires just linear time. Notice that each variable in $\psi_w$ is quantified just once, so it is easily arranged that the same is true of $\varphi'_w$.
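Both linear-time claims in this part of the proof can be checked numerically. The Python sketch below is an illustration under stated assumptions rather than the authors' machine: the counter is a little-endian list of bits, cost is measured in cells written, and the unary index of $x_k$ is assumed to occupy $k + 1$ cells. The function names are invented for this example.

```python
def increment(bits):
    """Increment a little-endian binary counter in place.

    Low-order 1-bits are changed to 0 until a 0 is met, which becomes 1.
    Returns the number of cells written.
    """
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        i += 1
    if i == len(bits):
        bits.append(1)  # the counter grows by one cell
    else:
        bits[i] = 1
    return i + 1


def count_to(n):
    """Count from 0 to n; return (total cells written, final counter)."""
    bits, total = [0], 0
    for _ in range(n):
        total += increment(bits)
    return total, bits


def index_length_bound(h):
    """Combined unary index length, using the occurrence counts in the text.

    Assumes x_k occurs at most 3 * 2**(h - k) times for k < h, x_h occurs
    twice, and the unary index of x_k has length k + 1 (an assumption).
    """
    return sum(3 * 2 ** (h - k) * (k + 1) for k in range(h)) + 2 * (h + 1)


if __name__ == "__main__":
    total, bits = count_to(1000)
    print(total, 2 * 1000)                       # writes stay below 2n
    print(index_length_bound(10), 12 * 2 ** 10)  # index length stays below 12 * 2^h
```

The first bound is the usual amortized argument for a binary counter: bit $i$ is rewritten once every $2^i$ increments. The second follows from $\sum_{k \ge 0} (k + 1)/2^k = 4$, and since $2^h < 2n$ both quantities are $O(n)$.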
If necessary we can add dummy quantifiers at the beginning of $\varphi'_w$ to make its length at least $cn$. Thus, if $w \in A$, then $\varphi'_w$ is true in some model of size at most $T(|\varphi'_w|)$. If $w \notin A$, then $\varphi'_w$ is true in no model. Suppose, on the contrary, that $\varphi'_w$ is true in some model $\mathfrak{A}$. Consider the submodel of $\mathfrak{A}$ obtained by restricting to the elements which can be reached from 0 by finitely many applications of successor. The values of STATE($x, t$), HEAD($x, t$), and SYM$_a$($x, t$) on this submodel describe a run of $M$ on input $w$. Since we have incorporated a timer which halts $M$ on every run, this submodel is finite. But then the last element $t$ in this submodel has no successor, so the state $x$ for which STATE($x, t$) holds is final. Thus, $M$ accepts $w$, a contradiction. This completes the first part of the proof.

We now show how to combine $m + 3$ binary relations into one. To simplify notation, let us rename the relation symbols $P_0, P_1, \ldots, P_{m+2}$. Suppose $\mathfrak{A}'$ is a model of $\varphi'_w$. Before describing how to transform $\varphi'_w$ into $\varphi_w$, we describe the model $\mathfrak{A}$ of $\varphi_w$ corresponding to $\mathfrak{A}'$. First form the disjoint union of the interpretations of $P_0, P_1, \ldots, P_{m+2}$ in $\mathfrak{A}'$. We then have relations $R_0, R_1, \ldots, R_{m+2}$ on disjoint domains $B_0, B_1, \ldots, B_{m+2}$. Their union is a single binary relation $R$ on the domain $B = \bigcup_{i \le m+2} B_i$. Now enlarge $R$ so that $R \cap (B_0 \times B_i)$ is the natural bijection from $B_0$ to $B_i$
when $1 \le i \le m + 2$. Next, enlarge $B$ by adding elements $b_0, b_1, \ldots, b_{m+2}$, and then add the pairs $(b_i, b)$ to $R$ for each $i \le m + 2$ and $b \in B_i$. Define $\mathfrak{A}$ to be $\langle B, R \rangle$. We see that if $b_0, b_1, \ldots, b_{m+2}$ are known, then the relations $R_0, R_1, \ldots, R_{m+2}$ can be recovered in $\mathfrak{A}$, and in fact we may use the natural bijections from $B_0$ to $B_i$ to define an isomorphic image of $\mathfrak{A}'$ with domain $B_0$.

Relativize the quantifiers of $\varphi'_w$ to a unary relation symbol $D$, thereby forming $(\varphi'_w)^D$. This sentence contains relation symbols $D, P_0, P_1, \ldots, P_{m+2}$. Put the explicit definitions $[D(x) \equiv P(x_0, x)]$, $[P_0(x, y) \equiv P(x, y)]$, and, for $1 \le i \le m + 2$,

[displayed definition of $P_i$ missing in source]
before this formula. Here $x_0, x_1, \ldots, x_{m+2}$ are new free variables whose intended interpretations in $\mathfrak{A}$ are $b_0, b_1, \ldots, b_{m+2}$. Now existentially quantify $x_0, x_1, \ldots, x_{m+2}$. There is a reset log-lin reduction that takes the resulting formula to an equivalent prenex formula $\varphi_w$. (Since each variable in $\varphi'_w$ is quantified just once, the relativizations may be pushed inward. Then, since all of the explicit definitions are of fixed length, conversion to prenex form is straightforward.)

If $M$ accepts $w$, then $\varphi'_w$ is true in some model $\mathfrak{A}'$ of power at most $T(|\varphi'_w|)$. It follows from the construction of $\mathfrak{A}$ ($m + 3$ disjoint copies of the domain together with the $m + 3$ new elements) that $\varphi_w$ is true in some model $\mathfrak{A}$ of power at most $(m + 3)(T(|\varphi'_w|) + 1)$.
We can assume this quantity is less than $T(|\varphi_w|)$ by lengthening $\varphi_w$ with dummy quantifiers if necessary and using the definition of time resource bound to infer that $(m + 3)T(n) \le T((m + 3)n)$. If $M$ does not accept $w$, then $\varphi'_w$ has no models and hence neither does $\varphi_w$. Finally, it is easy to arrange that every variable in $\varphi_w$ is quantified just once. ∎

Remark 4.2. Inspection of the construction of the sentences $\varphi_w$ in Theorem 4.1 reveals that for every constant $b > 0$ there are constants $c_0$ and $c_1$ such that $|\varphi_w| \le c_1|w| + c_0$ whenever $0 < c < b$. The constant $c_1$ depends only on the size of the input alphabet $\Lambda$ and the constant $b$. (This will be useful in the proof of Theorem 4.5.) On the other hand, the constant $c_0$ depends on the particular Turing machine $M$ whose runs $\varphi_w$ describes.

Corollary 4.3. Let $T_1(n)$ and $T_2(n)$ be time resource bounds such that NTIME($T_2(n)$) $-$ NTIME($T_1(n)$) $\ne \emptyset$. Suppose that $\lim_{n \to \infty} T_1(n)/n = \infty$. Then there is a constant $c > 0$ such that for each set $\Gamma$ of satisfiable sentences with $\mathrm{sat}_{T_2}(L_0) \subseteq \Gamma$, $\Gamma \notin$ NTIME($T_1(cn)$).
Proof. Let $A$ be an element of NTIME($T_2(n)$) $-$ NTIME($T_1(n)$), where $A \subseteq \Lambda^*$. By Theorem 4.1 there is a reset log-lin reduction taking each $w \in \Lambda^*$ to a sentence $\varphi_w$ in $L_0$ so that $A$ is mapped into $\mathrm{sat}_{T_2}(L_0)$ and $\Lambda^* - A$ is mapped into $\mathrm{inv}(L_0)$. Suppose that this reduction takes at most time $b|w|$. Let $c = 1/b$ and let $\Gamma$ be a set of satisfiable sentences containing $\mathrm{sat}_{T_2}(L_0)$. Suppose that $\Gamma \in$ NTIME($T_1(cn)$). To decide if $w \in A$ we will compute $\varphi_w$ in time $b|w|$, and then determine if $\varphi_w \in \Gamma$ using a $T_1(cn)$ time-bounded nondeterministic Turing machine. Certainly $|\varphi_w| \le b|w|$, so the composition of these two computations takes time at most $b|w| + T_1(bc|w|) = b|w| + T_1(|w|)$. We know that $|w| \le T_1(|w|)$, so this time is bounded above by $(b + 1)T_1(|w|)$. Since $\lim_{n \to \infty} T_1(n)/n = \infty$ we can apply the linear speed-up theorem (see Hopcroft and Ullman [1979]) to show that $A \in$ NTIME($T_1(n)$), a contradiction. Therefore, $\Gamma \notin$ NTIME($T_1(cn)$). ∎

Remark 4.4. We see from Corollary 4.3 that application of our results relies on the ability to separate nondeterministic time complexity classes. The strongest result in this direction for time resource bounds in the range bounded above by exp($n$) is due to Seiferas et al. [1978]. It says that if $T_2(n)$ is a time resource bound and $T_1(f(n + 1)) \in o(T_2(f(n)))$ for some recursively bounded, strictly increasing function $f(n)$, then NTIME($T_2(n)$) $-$ NTIME($T_1(n)$) $\ne \emptyset$.
This theorem has interesting implications for us. Let $T(n)$ be a time resource bound. Take $T_1(n) = T(dn)$, where $0 < d < 1$, $T_2(n) = T(n)$, and $f(n) = n$. The Seiferas-Fischer-Meyer theorem tells us that if $T(dn + d) = o(T(n))$, then NTIME($T(n)$) $-$ NTIME($T(dn)$) $\ne \emptyset$.
By taking a slightly smaller $d$ the hypothesis may be simplified to $T(dn) = o(T(n))$. By Corollary 4.3 there is a constant $c > 0$ such that if $\Gamma$ is a set of satisfiable sentences with $\mathrm{sat}_T(L_0) \subseteq \Gamma$, then $\Gamma \notin$ NTIME($T(cn)$).

Most time resource bounds that occur as complexities of theories satisfy the hypothesis $T(dn) = o(T(n))$. Among them are the functions $\exp_r(n)$ when $r \ge 1$, $2^{n/\log n}$, and $2^{n^k}$. Powers do not satisfy this hypothesis, so when $T(n) = n^k$ where $k > 1$, we can only conclude that $\Gamma \notin$ NTIME($g(n)$) for all $g(n)$ such that $g(n) = o(n^k)$.

By considering just one time resource bound $T(n)$ we gain another advantage: we can prove a fully-fledged inseparability theorem. This will allow us to obtain NTIME lower bounds for problems of the form $\mathrm{val}(\Sigma)$, as well as problems of the form $\mathrm{sat}(\Sigma)$.

Theorem 4.5. If $T$ is a time resource bound such that for some $d$ between
0 and 1, $T(dn) = o(T(n))$, then there is a constant $c > 0$ such that $\mathrm{sat}_T(L_0)$ and $\mathrm{inv}(L_0)$ are NTIME($T(cn)$)-inseparable.

Proof. Corollary 4.3 and the preceding remark show that there is a $c > 0$ such that if $\mathrm{sat}_T(L_0) \subseteq \Gamma$ and $\Gamma \cap \mathrm{inv}(L_0) = \emptyset$, then $\Gamma \notin$ NTIME($T(cn)$). (The corollary applies since $\lim_{n \to \infty} T(n)/n = \infty$ when $T(dn) = o(T(n))$.) We must show that there is a $c' > 0$ such that if $\mathrm{sat}_T(L_0) \subseteq \Gamma$ and $\Gamma \cap \mathrm{inv}(L_0) = \emptyset$, then $\bar{\Gamma}$, the set of sentences from $L_0$ not in $\Gamma$, is not in NTIME($T(c'n)$).

Let $b$ be a positive constant (say 1). By Theorem 4.1, if $A \subseteq \Lambda^*$ is in NTIME($T(c'n)$) for $0 < c' < b$, there is a reset log-lin reduction taking each $w \in \Lambda^*$ to a sentence $\varphi_w$ of $L_0$ such that $A$ is mapped into $\mathrm{sat}_T(L_0)$ and $\bar{A}$ is mapped into $\mathrm{inv}(L_0)$. We know also from the remark following Theorem 4.1 that $|\varphi_w| \le c_1|w| + c_0$, where $c_1$ is a constant depending only on $\Lambda$ and $b$, not on $c'$ or $A$. Take $c'$ small enough that $c'c_1 < c$.

Now consider a set $\Gamma$ such that $\mathrm{sat}_T(L_0) \subseteq \Gamma$ and $\Gamma \cap \mathrm{inv}(L_0) = \emptyset$. We must show that $\bar{\Gamma} \notin$ NTIME($T(c'n)$), so suppose the contrary. We can take $A$ in the previous paragraph to be $\bar{\Gamma}$, so there is a reset log-lin reduction mapping $\bar{\Gamma}$ into $\mathrm{sat}_T(L_0) \subseteq \Gamma$ and $\Gamma$ into $\mathrm{inv}(L_0) \subseteq \bar{\Gamma}$. This reduction takes an input $w$ to a sentence $\varphi_w$ in time at most $b|w|$. Thus, we can determine whether $w \in \Gamma$ by computing $\varphi_w$ and determining in nondeterministic time $T(c'n)$ if $\varphi_w \in \bar{\Gamma}$. Hence, $\Gamma \in$ NTIME($bn + T(c'(c_1 n + c_0))$) $\subseteq$ NTIME($T(cn)$), a contradiction. ∎

Remark 4.6. For simplicity, Theorem 4.5 was stated for the sets $\mathrm{sat}_T(L_0)$ and $\mathrm{inv}(L_0)$. By Theorem 4.1 it remains true if we take instead the set of prenex sentences in $\mathrm{sat}_T(L_0)$ and the set of prenex formulas in $\mathrm{inv}(L_0)$. Similarly, Corollary 4.3 holds if we restrict to prenex sentences.
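The hypothesis $T(dn) = o(T(n))$ used in Theorem 4.5 can be explored numerically. The Python sketch below is an informal illustration, not a proof: it works with $\log_2 T$ to avoid overflow, fixes $d = 1/2$, and uses invented function names. The hypothesis holds exactly when $\log_2(T(dn)/T(n))$ tends to $-\infty$.

```python
import math


def log2_ratio(log2_T, d, n):
    """Return log2(T(d*n) / T(n)) given a function that computes log2 T."""
    return log2_T(d * n) - log2_T(n)


def log2_exp(n):
    # T(n) = 2**n
    return n


def log2_subexp(n):
    # T(n) = 2**(n / log2 n)
    return n / math.log2(n)


def log2_power(n, k=3):
    # T(n) = n**k
    return k * math.log2(n)


if __name__ == "__main__":
    for n in (2 ** 10, 2 ** 15, 2 ** 20):
        print(n,
              log2_ratio(log2_exp, 0.5, n),
              log2_ratio(log2_subexp, 0.5, n),
              log2_ratio(log2_power, 0.5, n))
```

For $2^n$ and $2^{n/\log n}$ the logarithm of the ratio decreases without bound, whereas for $n^k$ it stays at the constant $k \log_2 d$, which is why powers fall outside the hypothesis.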
5 Inseparability results for monadic second-order theories
In this section we develop inseparability results for monadic second-order theories analogous to those for first-order theories in section 4. The appropriate complexity classes here are the linear alternating time classes rather than nondeterministic time classes. Because linear alternating time classes are closed under complementation, we do not need a special argument like the one used in Theorem 4.5 to obtain inseparability results. Also, lower bounds are obtained by simple diagonalization, rather than more sophisticated results such as the theorem of Fischer, Meyer, and Seiferas used in the previous section.
As before, inseparability results are closely related to satisfiability problems that are hard for certain complexity classes. The classes are of the form ATIME($T(cn), cn$), which is in many ways more natural than ATIME($T(cn), n$). If there is a reset log-lin reduction from a problem $\Gamma$ to a problem in a class of the first form, then we may conclude that $\Gamma$ is also in the class. We know of no speed-up theorem for alternations, so we cannot make the same claim for classes of the second form.

One of the main results of the section is Theorem 5.2, an analogue of Theorem 4.1. We could prove this result along the same lines as Theorem 4.1, but we obtain a somewhat sharper result if we appeal to a result of Lynch [1982] relating nondeterministic time classes to the spectra of monadic second-order sentences. Lynch encodes Turing machine runs in a way different from the classical method used in the last section. Rather than explicitly accounting for symbols at each tape position and time in a machine run, he keeps track of just the symbol changed (not its position), the symbol which replaces it, and the direction of head movement at each time. If the underlying models have enough structure, it is possible to express derivability between instantaneous descriptions of nondeterministic Turing machines with just this information. Lynch shows, in particular, that this is the case if the underlying models have an addition relation PLUS($x, y, z$) which holds when $x + y = z$.

We begin, therefore, by considering the monadic second-order logic ML+ whose vocabulary contains just a ternary relation symbol PLUS, and M+, the monadic second-order theory of addition on initial segments of the natural numbers. M+ can be axiomatized by a set of first-order sentences. Explicitly define a relation $\le$ by

[displayed definition missing in source]

Then M+ says that $\le$ is a discrete linear order with least element 0 and

[displayed axiom missing in source]

(The immediate predecessor of an element $x$ is denoted $x - 1$; the immediate successor is denoted $x + 1$.) Note that even though M+ consists of first-order sentences, $\mathrm{sat}_T$(M+) is the set of monadic second-order sentences $\varphi$ true in some model of M+ of size at most $T(|\varphi|)$.
Theorem 5.1. Let $T(n)$ be a time resource bound and $\Lambda$ an alphabet. Given a problem $A \subseteq \Lambda^*$ in $\bigcup_{c>0}$ ATIME($T(cn), cn$), there is a prescribed set of sentences over ML+, and a reset log-lin reduction taking each $w \in \Lambda^*$ to a sentence $\varphi_w$ in this set such that if $w \in A$, then $\varphi_w \in \mathrm{sat}_T$(M+), and if $w \notin A$, then $\varphi_w \in \mathrm{inv}^*$(ML+).

Proof. Fix $c > 0$; we may take $c$ to be an integer. Let $M$ be an alternating Turing machine that accepts $A$ in time $T(cn)$ with at most $cn$ alternations. As in Theorem 4.1 we may assume that on all inputs, all runs of $M$ eventually halt: incorporate into $M$ a deterministic 'timer' which halts after some number of moves given by a fully time-constructible function dominating $T(cn)$. We assume for simplicity that $M$ has just one tape; extending to multitape Turing machines requires only minor modifications. Let $a_1, \ldots, a_m$ be the tape symbols used by $M$.

Let $r = \{0, \ldots, r - 1\}$ be a finite ordinal and $+$ be the usual addition relation on $r$, so that $\langle r, + \rangle$ is a model of M+. In this model we represent an instantaneous description of $M$ by a sequence of sets $X_1, \ldots, X_{m+2} = \mathbf{X}$, where the sets $X_1, \ldots, X_m$ partition $r$, $X_{m+1}$ is a singleton set, and $X_{m+2}$ is a singleton set whose element is one of the states of $M$. The set of states is identified with an initial interval of $r$. We intend that the symbol $a_i$ is at position $x$ when $x \in X_i$, that the head scans position $x$ when $x \in X_{m+1}$, and that $M$ is in state $x$ when $x \in X_{m+2}$.

We will need to restrict to initial intervals in our models because, when we consider truth in weak models at the end of the proof, we may not be able to quantify over all subsets, but we can arrange to quantify over subsets of finite initial intervals. Let ID($x, \mathbf{X}$) be the formula specifying that the sets $X_1, \ldots, X_m$ partition the interval $[0, x]$, that $X_{m+1}$ and $X_{m+2}$ are singleton sets contained in this interval, and that the element in $X_{m+2}$ is a state. Recall that each state of $M$ is of one of four types: universal, existential, accepting, and rejecting. Let

[displayed list of four formula names missing in source]
be formulas indicating that ID($x, \mathbf{X}$) holds and the state for the instantaneous description represented by $\mathbf{X}$ is of the corresponding type.

Lynch [1982] shows that for each nondeterministic Turing machine $M'$ there is a monadic second-order formula $\eta_{M'}(\mathbf{X}, \mathbf{Y})$ that holds in $\langle r, + \rangle$ precisely when $\mathbf{X}$ and $\mathbf{Y}$ represent instantaneous descriptions for $M'$ and $\mathbf{Y}$ can be obtained from $\mathbf{X}$ within $r$ or fewer moves of $M'$ by a computation in which the head does not reach a tape position greater than or equal to $r$. We can regard the alternating Turing machine $M$ as a nondeterministic Turing machine simply by ignoring state types. We also form the nondeterministic Turing machine $M'$ by eliminating transitions out of all states in $M$ except existential states and then ignoring state types, and $M''$ by eliminating
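transitions out of all states in $M$ except universal states and then ignoring state types. Three formulas in the variables $x$, $\mathbf{X}$, $\mathbf{Y}$ are defined next, one for each of the machines $M$, $M'$, and $M''$.

[three displayed definitions missing in source; only the name $\eta(x, \mathbf{X}, \mathbf{Y})$ of one of the formulas survives]

That is, the first formula expresses derivability between instantaneous descriptions on the interval $[0, x]$; the second (third) expresses the same except that all states, excluding possibly the last, are existential (universal). Notice, in particular, that $\eta(x, \mathbf{X}, \mathbf{X})$ holds for all $\mathbf{X}$. Two formulas named TERM, with arguments $x$, $\mathbf{X}$, $\mathbf{Y}$, are then defined.

[two displayed definitions missing in source]

Let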
$\psi_w(x, \mathbf{X})$ be a prenex sentence asserting the following:
(a) Relation $\le$ is a discrete linear order with a least element and a greatest element; every element other than the greatest has a successor.
(b) The relation PLUS($x_0, x_1, x_2$) holds if and only if either $x_0 = x_2$ and $x_1 = 0$, or PLUS($x_0, x_1 - 1, x_2 - 1$) holds.
(c) There is a set $Y$ with no elements. For every set $Y$ and element $y$ there is a set $Y \cup \{y\}$.
(d) If $y$ is an element such that for all $\mathbf{Y}$ satisfying ID($y, \mathbf{Y}$), TERM($y, \mathbf{X}, \mathbf{Y}$) implies ACC($y, \mathbf{Y}$) or REJ($y, \mathbf{Y}$), then $x \le y$.
(e) The sequence $\mathbf{X}$ represents the instantaneous description for an input tape with $w$ written on it, the head scanning the first position, and $M$ in the initial state.

Each of these items can be expressed by a fixed sentence except the part of condition (e) concerning the input tape. The prenex formula expressing this part is constructed in the same way as in Theorem 4.1. As in that construction, it follows that there is a reset log-lin reduction taking $w$ to $\psi_w(x, \mathbf{X})$.
Consider any weak model $\mathfrak{A}$ in which conditions (a)-(c) hold. For the moment, let us suppose that the element $x$ from the universe of $\mathfrak{A}$ is finite, i.e., a finite distance from the least element 0. (By condition (a), this is a well-defined notion.) Condition (b) ensures that PLUS restricted to the interval $[0, \ldots, x]$ in $\mathfrak{A}$ is the usual addition relation. Condition (c) ensures that quantification over subsets of $\{0, \ldots, x\}$ in the weak model $\mathfrak{A}$ is quantification over all subsets of $\{0, \ldots, x\}$. Thus, for finite $x$, the three derivability formulas in $x$, $\mathbf{X}$, $\mathbf{Y}$ have interpretations in $\mathfrak{A}$ corresponding to computations of $M$ as described above.

Now in $\mathfrak{A}$ consider $x$ and $\mathbf{X}$ satisfying conditions (d) and (e). (We no longer stipulate that $x$ is finite.) Since all runs of $M$ on input $w$ halt, say within $k$ moves, we see that every such $x$ is at most distance $k$ from 0. (Note that this is the case even if there is no $y$ as described in (d).) Thus, such an $x$ will be finite and the observations of the previous paragraph pertain. Let a formula in $x$ and $\mathbf{X}$ be given by

[displayed formula missing in source]
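Let $\varphi_w$ be the sentence

[displayed sentence missing in source]

where $n = |w|$. Clearly, there is a prescribed set of sentences containing every $\varphi_w$. It is also clear that if $w \in A$, then $\varphi_w$ is true in some model of size at most $T(|\varphi_w|)$, because in such models $P$ interprets acceptance by $M$ for $x = T(cn)$. If $w \notin A$, then $\varphi_w$ is true in no model. Suppose on the contrary that $\varphi_w$ is true in $\mathfrak{A}$. Then in $\mathfrak{A}$ there are $x$ and $\mathbf{X}$ such that $\psi_w(x, \mathbf{X})$ and $P(x, \mathbf{X})$ hold, where $P$ is given by the implicit definition

[displayed definition missing in source]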
By the remarks above, $x$ must be finite, and hence $P$ describes an accepting computation of $M$ on input $w$, which is a contradiction. ∎

The next theorem, the analogue of Theorem 4.1, follows from the previous theorem and a result of Kaufmann and Shelah [1983].

Theorem 5.2. Let $T(n)$ be a time resource bound and $\Lambda$ an alphabet. Given a problem $A \subseteq \Lambda^*$ in $\bigcup_{c>0}$ ATIME($T(cn), cn$), there is a prescribed set of sentences over $ML_0$, and a reset log-lin reduction taking each
$w \in \Lambda^*$ to a sentence $\varphi_w$ in this set such that if $w \in A$, then $\varphi_w \in \mathrm{sat}_T(ML_0)$, and if $w \notin A$, then $\varphi_w \in \mathrm{inv}^*(ML_0)$.
Proof. By the previous theorem we need only show that there is a formula of $ML_0$ with free variables $x, y, z$ such that for each finite ordinal $n = \{0, \ldots, n - 1\}$ there is a binary relation $R$ on $n$ for which the formula defines an addition relation on $n$ in the model $\mathfrak{A} = \langle n, R \rangle$. Kaufmann and Shelah [1983] prove a much stronger result: there is such a formula which, for almost every binary relation $R$ on $n$, defines an addition relation on $n$ in $\mathfrak{A} = \langle n, R \rangle$. For the sake of completeness, we sketch a proof of the simpler result that there is a formula that codes an addition relation on some binary relation of each finite power.

First suppose that the vocabulary for the logic has three binary relation symbols $P_1, P_2, P_3$, rather than just one, and that they interpret binary relations $R_1, R_2, R_3$ on $m$. To simplify the proof we assume that $m = r^3$. It is easy to specify a formula in a set variable $X$ saying that the relations $R_1, R_2, R_3$ restricted to $m \times X$ are functions, respectively denoted $f_1, f_2, f_3$, and that $(f_1(x), f_2(x), f_3(x))$ ranges over each triple in $X^3$ precisely once as $x$ ranges over $m$. Thus $|X| = r$ and we have defined a bijection between $m$ and $X^3$. Since we can quantify over subsets of $m$, we can quantify over ternary relations on $X$ when this formula holds. Therefore, we can, without much trouble, define an addition relation on $X$. Also, we can extend this relation to define addition modulo $r$. But then it is easy to define addition on $m$ using the bijection between $m$ and $X^3$.

Using the construction in the proof of Theorem 4.1, the three binary relations $R_1, R_2, R_3$ on a set of size $m = r^3$ can be coded as a single binary relation on a set of size $n = 3(m + 1)$. This set has three disjoint subsets of size $m$ on which addition and addition modulo $m$ can be coded. It is not difficult now to define addition on all of $n$. Thus, there is a binary relation on $n$ from which an addition relation can be defined when $n$ is of the form $3(r^3 + 1)$. With a little effort this construction can be made to work for arbitrary $n$.
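Two steps of this sketch can be tried out concretely. The Python code below is a minimal illustration under explicit assumptions, not the authors' definitions: `encode` and `decode` follow the disjoint-union coding from the proof of Theorem 4.1 with tagged pairs standing for the copies $B_i$ and the marker elements $b_i$, and `addition_on_m` assumes that the bijection between $m$ and $X^3$ is the base-$r$ digit expansion (the proof only needs some definable bijection). All function names are invented here.

```python
def encode(rels, size):
    """Code binary relations R_0..R_{k-1} on {0..size-1} as one relation R."""
    k = len(rels)
    domain = {(i, a) for i in range(k) for a in range(size)}
    domain |= {("b", i) for i in range(k)}               # marker elements b_i
    R = set()
    for i, rel in enumerate(rels):
        R |= {((i, a), (i, c)) for (a, c) in rel}        # copy of R_i on block B_i
        R |= {(("b", i), (i, a)) for a in range(size)}   # b_i points at all of B_i
        if i > 0:
            R |= {((0, a), (i, a)) for a in range(size)} # bijection B_0 -> B_i
    return domain, R


def decode(domain, R, markers):
    """Recover every R_i as a relation on B_0, using only R and the markers."""
    blocks = [{y for y in domain if (b, y) in R} for b in markers]
    b0 = blocks[0]
    rels = []
    for i, block in enumerate(blocks):
        image = {x: (x if i == 0 else next(y for y in block if (x, y) in R))
                 for x in b0}
        rels.append({(x[1], y[1]) for x in b0 for y in b0
                     if (image[x], image[y]) in R})
    return rels


def addition_on_m(r):
    """Addition on m = r**3 built from addition and addition mod r on X."""
    assert r >= 2
    small = {(a, b) for a in range(r) for b in range(r) if a + b < r}
    mod = {(a, b): (a + b) % r for a in range(r) for b in range(r)}

    def digits(x):
        return (x % r, (x // r) % r, x // (r * r))  # least significant first

    plus = set()
    for x in range(r ** 3):
        for y in range(r ** 3):
            carry, out = False, []
            for a, b in zip(digits(x), digits(y)):
                s, c = mod[(a, b)], (a, b) not in small
                if carry:
                    s, c = mod[(s, 1)], c or (s, 1) not in small
                out.append(s)
                carry = c
            if not carry:  # a final carry means x + y falls outside m
                plus.add((x, y, out[0] + out[1] * r + out[2] * r * r))
    return plus
```

With three relations on a set of size $m$ the coded structure has $3(m + 1)$ elements, matching the count in the text, and the digit-wise addition only ever consults the two relations on $X$ together with the bijection.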
∎

We now state an analogue of Corollary 4.3.

Corollary 5.3. Let $T_1(n)$ and $T_2(n)$ be time resource bounds such that ATIME($T_2(n), n$) $-$ ATIME($T_1(n), n$) $\ne \emptyset$. Suppose that $\lim_{n \to \infty} T_1(n)/n = \infty$. Then there is a constant $c > 0$ such that for each set $\Gamma$ of satisfiable sentences with $\mathrm{sat}_{T_2}(ML_0) \subseteq \Gamma$, $\Gamma \notin$ ATIME($T_1(cn), cn$).

The proof is the same as for Corollary 4.3. Note, however, that we rely on a result that says the linear speed-up theorem applies to alternating
Turing machines. We must also use Theorem 3.6 to obtain a reset log-lin reduction from a prescribed set of sentences over $ML_0$ to equivalent sentences in $ML_0$. We also have an analogue for Theorem 4.5.

Theorem 5.4. If $T$ is a time resource bound such that for some $d$ between 0 and 1, $T(dn) = o(T(n))$, then there is a constant $c > 0$ such that $\mathrm{sat}_T(ML_0)$ and $\mathrm{inv}(ML_0)$ are ATIME($T(cn), cn$)-inseparable.

Proof. The proof is much simpler than that of Theorem 4.5. We can separate ATIME($T(n), n$) and ATIME($T(dn), dn$) using a straightforward diagonalization, so we do not appeal to the more difficult methods used in separating NTIME classes. Then we use the previous corollary to show that for some $c > 0$, if $\mathrm{sat}_T(ML_0) \subseteq \Gamma$ and $\Gamma \cap \mathrm{inv}(ML_0) = \emptyset$, then $\Gamma \notin$ ATIME($T(cn), cn$). Since ATIME($T(cn), cn$) is closed under complementation, we have that $\mathrm{sat}_T(ML_0)$ and $\mathrm{inv}(ML_0)$ are ATIME($T(cn), cn$)-inseparable. ∎

Remark 5.5. By Theorem 5.1, Corollary 5.3 and Theorem 5.4 hold with $\mathrm{sat}_T$(M+) and $\mathrm{inv}$(ML+) in place of $\mathrm{sat}_T(ML_0)$ and $\mathrm{inv}(ML_0)$. It is also important to note that even though we reduced the prescribed sets of sentences in these theorems to equivalent sets of monadic second-order sentences, it is the prescribed sets which are used to obtain lower bound results. For example, in the proof of Theorem 5.4 we actually showed that there is a prescribed set $\Phi$ of sentences over $ML_0$ such that $\mathrm{sat}_T(ML_0) \cap \Phi$ and $\mathrm{inv}^*(ML_0) \cap \Phi$ are ATIME($T(cn), cn$)-inseparable. In sections 6 and 7 we will find lower bounds for various theories $\Sigma$ from logics $L$ by finding a reset log-lin reduction from $\Phi$ to $\Phi'$, a prescribed set of sentences over $L$, so that $\mathrm{sat}_T(ML_0) \cap \Phi$ is mapped into $\mathrm{sat}^*(\Sigma) \cap \Phi'$ and $\mathrm{inv}^*(ML_0) \cap \Phi$ is mapped into $\mathrm{inv}^*(L) \cap \Phi'$. Thus, for some $c > 0$, $\mathrm{sat}^*(\Sigma) \cap \Phi'$ and $\mathrm{inv}^*(L) \cap \Phi'$ are ATIME($T(cn), cn$)-inseparable. Then by Theorem 3.6, $\mathrm{sat}(\Sigma)$ and $\mathrm{inv}(L)$ are ATIME($T(cn), cn$)-inseparable.
6 Tools for NTIME lower bounds
We present several useful tools for establishing NTIME lower bounds for theories by interpreting models from classes of known complexity. We begin with some definitions regarding interpretations of classes of models and give a general outline of how interpretations are used to obtain lower bounds. Theorem 6.2, a specific instance of the method, follows from the results in section 4. It tells how to obtain lower bounds by interpreting binary relations. We then show in Theorem 6.3 how to interpret binary relations in finite trees of bounded height. As a consequence we obtain hereditary lower bounds for theories of finite trees of bounded height and a tool for obtaining further lower bounds by interpreting classes of these
trees in other theories. We obtain similar results for classes of finite trees of unbounded height in Theorem 6.7 and its corollaries.

Let $\Sigma$ be a theory in a logic $L'$ and $C_0, C_1, C_2, \ldots$ be classes of models for a logic $L$ whose vocabulary consists of relation symbols $P_1, \ldots, P_k$. Let $x_1, \ldots, x_k$ be sequences of distinct variables with the length of $x_i$ equal to the arity of $P_i$. Suppose that there are formulas $\delta_n(x, u), \pi^1_n(x_1, u), \ldots, \pi^k_n(x_k, u)$ from $L'$ which are reset log-lin computable from $n$ (expressed in unary notation) so that for each $\mathfrak{A} \in C_n$ we have a model $\mathfrak{A}'$ of $\Sigma$ and elements $m$ in $\mathfrak{A}'$ with

[displayed structure missing in source]

isomorphic to $\mathfrak{A}$. The parameter sequence $u$ is allowed to grow as a function of $n$. The sequence $\{I_n \mid n \ge 0\}$, where

[displayed definition of $I_n$ missing in source]

is called an interpretation of the classes $C_n$ in $\Sigma$. The interpretation is a simple interpretation if the formulas $\delta_n, \pi^1_n, \ldots, \pi^k_n$ are fixed with respect to $n$; this is the situation traditionally found in logic. The interpretation is a prenex interpretation if the formulas $\delta_n, \pi^1_n, \ldots, \pi^k_n$ are all in prenex form. The interpretation is an iterative interpretation if the formulas $\delta_n, \pi^1_n, \ldots, \pi^k_n$ are given by iterative definitions. By this we mean that there are formulas

[displayed formulas missing in source]

and integer functions $f, g_1, \ldots, g_k$ which are reset log-lin computable (using unary notation) such that $\delta_n$ is the formula given by the iterative definition

[displayed definition missing in source]

and $\pi^i_n$ is the formula given by the iterative definition

[displayed definition missing in source]

as in Theorem 3.3. Notice that we may regard simple interpretations as special cases of either prenex or iterative interpretations.

There is a slightly more general point of view toward interpretations which is sometimes useful. Suppose $C'_0, C'_1, C'_2, \ldots$ are classes of models of $\Sigma$ such that for each $n \ge 0$ and $\mathfrak{A} \in C_n$, the model $\mathfrak{A}'$ in the above definition can be found in $C'_n$. Then we will say that we have an interpretation of the classes $C_n$ in the classes $C'_n$. Sometimes we will not mention $\Sigma$ at all when discussing interpretations in this context. In that
case we must say whether we intend the classes $C'_n$ to be models for a first-order or for a monadic second-order logic. Thus, we will say that there is an interpretation of the classes $C_n$ in the first-order (or monadic second-order) classes $C'_n$. Interpretations, inseparability, and prescribed sets are the cornerstones of our method. Suppose we have an interpretation of classes $C_n$ in classes $C'_{kn}$ for some nonnegative integer $k$. (In most cases $k$ is 1 but occasionally we need a larger value.) Suppose also that there is a prescribed set $\Phi$ of formulas over $L$ such that
$$\{\varphi \in \Phi \mid \varphi \text{ is realized in some } \mathfrak{A} \in C_n \text{ where } |\varphi| = n\}$$
and inv*($L$) are NTIME($T(cn)$)-inseparable for some $c > 0$. (A formula is realized in $\mathfrak{A}$ if it is true in $\mathfrak{A}$ for some assignment to its free variables.) Now map each formula $\varphi$ in $\Phi$ to the formula $\varphi'$ given by
where $n = |\varphi|$. By adding dummy quantifiers in the right places we can ensure that $|\varphi'| \geq kn$. If $\varphi$ is realized in some model in $C_n$, then $\varphi'$ is realized in some model in $C'_{kn}$. If $\varphi$ is true in no model, then $\varphi'$ is true in no model. (There is a minor point which should be addressed here. To be completely rigorous we should require for all models $\mathfrak{A}'$ that $\delta_n^{\mathfrak{A}'}(x) \neq \emptyset$ since certain formulas in inv*($L$) may become true when relativized to an empty relation. For example, consider the sentence $\forall x\,(x \neq x)$. We can always meet this requirement by replacing $\delta_n(x)$ with the formula $\delta_n(x) \vee \forall x\,\neg\delta_n(x)$ so we can ignore this point in subsequent discussions.) When we have a prenex interpretation or an iterative interpretation and the definitions in $\varphi'$ are replaced by the appropriate iterative definitions, the sentences $\varphi'$ all belong to some prescribed set $\Phi'$ of formulas over $L'$. (If the interpretation is iterative, then by definition the parameter sequence cannot grow with $n$.) We have now that
$$\{\varphi' \in \Phi' \mid \varphi' \text{ is realized in some } \mathfrak{A}' \in C'_n \text{ where } |\varphi'| = n\}$$
and inv*($L'$) are NTIME($T(cn)$)-inseparable for some $c > 0$. It follows by Theorem 3.6 that sat($\Sigma$) and inv($L'$) are NTIME($T(cn)$)-inseparable for some $c > 0$ so $\Sigma$ has a hereditary NTIME($T(cn)$) lower bound. Now let us broaden the definition of interpretation to cover instances where the formulas $\delta_n$ and $\pi^1_n, \ldots, \pi^k_n$ contain free relation variables which also receive prenex or iterative definitions. For example, if these formulas contain a free unary relation variable $Q$ we would write
for , the formulas $\theta_n$ having the same sort of restrictions as $\delta_n$ and $\pi^1_n, \ldots, \pi^k_n$. In this case we would have
as the elements of our interpretation. If these formulas all have prenex definitions we could use Theorem 3.1 to rewrite $\delta_n$ and $\pi^1_n, \ldots, \pi^k_n$ so that $Q$ occurs just once and substitute $\theta_n$ for occurrences of $Q$. For iterative definitions, we know of no similar substitution which eliminates $Q$ and keeps the length of the resulting formulas linearly bounded in $n$. Fortunately, such a substitution is not needed and is even undesirable. By taking a top-down approach to the construction of interpretations, building complex relations from simpler relations, we make our task easier and the exposition clearer. We extend our definitions of simple, prenex, and iterative interpretation to this more general situation. (In principle, some definitions could be prenex and others iterative, but this does not seem to occur in practice.) We see that hereditary lower bounds are obtained using interpretations to transfer inseparability results from one prescribed set of formulas to another. One of the advantages of this method is that by establishing lower bounds in this manner, we also establish tools for proving further lower bounds. In the situation described above, if we have another prenex or iterative interpretation of the classes $C'_n$ in classes $C''_n$ of models of a further theory, then we can use $C'_n$ and $\Phi'$ in place of $C_n$ and $\Phi$ to establish a lower bound for that theory. Compare with the well-known methods for establishing NP-hardness of a problem by reducing to it a problem already known to be NP-hard. After the first lower bound or hardness result has been proved one should never again have to code Turing machines. It is worth noting what happens when the interpretation used to establish a lower bound is not prenex or iterative. In that case we do not know that there is a reset log-lin reduction taking the formulas $\varphi'$ defined above to equivalent formulas in $L'$. As we mentioned in section 3, the shortest equivalent formula in $L'$ we know how to obtain has length of order $n \log n$ in the worst case.
We can only conclude that sat($\Sigma$) and inv($L'$) are NTIME($T(cn/\log n)$)-inseparable for some $c > 0$ so $\Sigma$ has a hereditary NTIME($T(cn/\log n)$) lower bound. Successive applications of such interpretations give increasingly worse bounds. After $k$ interpretations the lower bound would be $T(cn/(\log n)^k)$ rather than $T(cn)$. In the case where the formulas in $\Phi$ are in prenex form already we do not have this loss in the lower bound, but even then, when the interpretation is not prenex or iterative, we cannot use the classes $C_n$ to obtain further lower bounds without a subsequent loss. We have introduced prenex and iterative interpretations to avoid these losses. In our experience prenex and iterative interpretations not only achieve sharp lower bounds, but also are easy to manage and occur quite naturally in applications.
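The size of this loss can be seen numerically. The following is only an illustrative sketch: it ignores the constant c, takes logarithms to base 2 (an assumption), and uses the hypothetical function name `effective_argument` for the quantity n/(log n)^k that replaces n inside T after k successive interpretations that are neither prenex nor iterative.

```python
import math


def effective_argument(n: int, k: int) -> float:
    """Argument of T after k interpretations that are not prenex or iterative.

    Each such interpretation may lengthen formulas by a factor of about
    log n, so a bound T(cn) degrades to T(c * n / (log n)**k).  The constant
    c is ignored and logarithms are taken to base 2; both choices are
    assumptions of this sketch.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log n is positive")
    return n / (math.log2(n) ** k)


if __name__ == "__main__":
    n = 1 << 20  # formulas of length about one million
    for k in range(4):
        print(k, effective_argument(n, k))
```

With prenex or iterative interpretations the exponent k stays at 0 no matter how many interpretations are composed, which is the reason for introducing them.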
Within this framework we can also accommodate the more general kind of interpretation in which the domain of the interpreted model is not a subset of $\mathfrak{A}'$, but a set of $k$-tuples from $\mathfrak{A}'$, and the equality relation is interpreted by an equivalence relation definable in $\mathfrak{A}'$. We have found this to be necessary for only two theories treated here and so we avoided stating these definitions in the fullest generality. However, it would not have been difficult to introduce these features explicitly. (See Examples 8.12 and 8.14.) Although we have emphasized inseparability results, we should not lose sight of the fact that the starting point for our reductions, Theorem 4.1, is a hardness result: every set $\Gamma$ of satisfiable sentences with $\mathrm{sat}_T(L_0) \subseteq \Gamma$ is hard for the complexity class
$$\bigcup_{c>0} \mathrm{NTIME}(T(cn))$$
via reset log-lin reductions and for the class
$$\bigcup_{c>0} \mathrm{NTIME}(T(n^c))$$
via polynomial time reductions. Thus, all our inseparability results can be reformulated as hardness results. We summarize the previous discussion and make this point precise in the following theorem:
Theorem 6.1. Let $C_0, C_1, C_2, \ldots$ be classes of models such that for some prescribed set $\Phi$ of formulas over a first-order logic $L$,
$$\{\varphi \in \Phi \mid \varphi \text{ is realized in some } \mathfrak{A} \in C_n \text{ where } |\varphi| = n\}$$
and inv*($L$) are NTIME($T(cn)$)-inseparable for some $c > 0$. Let $C'_0 \subseteq C'_1 \subseteq C'_2 \subseteq \cdots$ be classes of models of a theory $\Sigma$ in a logic $L'$. If there is a prenex or iterative interpretation of the classes $C_n$ in the classes $C'_{kn}$ for some nonnegative integer $k$, then the following are true.
(i) The sets sat($\Sigma$) and inv($L'$) are NTIME($T(cn)$)-inseparable for some $c > 0$.
(ii) If for some $d$ between 0 and 1, $T(dn) = o(T(n))$, then $\Sigma$ has a hereditary NTIME($T(cn)$) lower bound.
(iii) For each $\Sigma' \subseteq \mathrm{val}(\Sigma)$, sat($\Sigma'$) and val($\Sigma'$) are both hard for the complexity class $\bigcup_{c>0} \mathrm{NTIME}(T(cn))$ via reset log-lin reductions.
(iv) For each $\Sigma' \subseteq \mathrm{val}(\Sigma)$, sat($\Sigma'$) and val($\Sigma'$) are both hard for the complexity class
$$\bigcup_{c>0} \mathrm{NTIME}(T(n^c))$$
via polynomial time reductions.
(v) There is a prescribed set $\Phi'$ of sentences over $L'$ such that
$$\{\varphi' \in \Phi' \mid \varphi' \text{ is realized in some } \mathfrak{A}' \in C'_n \text{ where } |\varphi'| = n\}$$
and inv*($L'$) are NTIME($T(cn)$)-inseparable for some $c > 0$.
Usually when we apply our method we state the result as in (ii) for brevity, but the reader should be aware that all of the conclusions hold. The following result is an immediate consequence of the above theorem and Theorem 4.5. It is one of the most useful tools for establishing NTIME lower bounds.
Theorem 6.2. Let $T(n)$ be a time resource bound such that for some $d$ between 0 and 1, $T(dn) = o(T(n))$. Let $C_n$ be the class of binary relations (i.e., structures for $L_0$) on sets of size at most $T(n)$ and $\Sigma$ a theory in a logic $L$. If there is an interpretation of the classes $C_n$ in $\Sigma$, then $\Sigma$ has a hereditary NTIME($T(cn)$) lower bound.
The first application of this result is to first-order theories of finite trees of bounded height. Recall that we express first-order properties of trees in the logic $L_t$ whose vocabulary contains just a binary relation symbol which interprets the parent-child relation. Let $\Sigma_r$ be the theory of finite trees of height at most $r$ and $\Sigma_\infty$ be the theory of finite trees of arbitrary height. Define $M\Sigma_r$ and $M\Sigma_\infty$ similarly for the monadic second-order logic $ML_t$. Let $\mathcal{T}_k$ be the class of finite trees of height $k$. We inductively define certain restricted classes of finite trees in which the classes $C_n$ in Theorem 6.2 are interpreted. For each $m > 0$ let $\mathcal{T}^m_0$ be the class $\mathcal{T}_0$ (all of whose elements are isomorphic). The class $\mathcal{T}^m_{k+1}$ consists of those trees whose primary subtrees are all in $\mathcal{T}^m_k$, but such that no more than $m$ primary subtrees may be in the same isomorphism class. Clearly, $\mathcal{T}^m_k \subseteq \mathcal{T}_k$ and if $\mathcal{T}^m_k$ contains $t$ isomorphism types, then $\mathcal{T}^m_{k+1}$ contains at most $(m+1)^t$ isomorphism types. From the following theorem we obtain hereditary lower bounds for theories of finite trees of bounded height and another useful tool for obtaining other lower bounds.
Theorem 6.3. Let $C_n$ be the class of binary relations on a set of size $\exp_{r-2}(n)$ and $m = m(n)$ be the least integer such that $m \log m \geq n$. Then there is a prenex interpretation of the classes $C_n$ in the first-order classes $\mathcal{T}^m_r$ when $r > 3$ and in the first-order classes $\mathcal{T}^{2m}_r$ when $r = 3$.
Proof. Note that $m = O(n/\log n)$.
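The parameter m = m(n) of Theorem 6.3 is easy to tabulate. The sketch below assumes that the logarithm is taken to base 2, which the text does not state explicitly; `least_m` is a hypothetical helper name. Under that assumption the inequality m log m >= n is the same as m^m >= 2^n, so it can be checked exactly with integers.

```python
import math


def least_m(n: int) -> int:
    """Least positive integer m with m * log2(m) >= n.

    The test is done exactly with integers, using the equivalence
    m * log2(m) >= n  <=>  m**m >= 2**n.  Base-2 logarithms are an
    assumption of this sketch.
    """
    if n < 0:
        raise ValueError("n must be nonnegative")
    m = 1
    while m ** m < 2 ** n:
        m += 1
    return m


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        m = least_m(n)
        # The ratio m / (n / log2 n) stays bounded, in line with m = O(n / log n).
        print(n, m, m / (n / math.log2(n)))
```

The same integer test also confirms, for any particular n, the inequality $(m+1)^{m+1} > 2^n$ that is used later in the proof.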
First, consider the case $r > 3$. Define $\psi_m(x, y)$ to be a formula, with free relation variable $Q$, which says that for all children $t_1, \ldots, t_m$ of $x$, there are children $u_1, \ldots, u_m$ of $y$ such that $Q(t_i, u_i)$ holds for $1 \leq i \leq m$, and
$$\bigwedge_{1 \leq i, j \leq m} (t_i = t_j \leftrightarrow u_i = u_j).$$
We wish to write $\psi_m$ as a prenex formula which can be computed from $n$ (in unary) by a reset log-lin reduction. Unfortunately, the displayed formula has length of order $n^2$ so we replace it with
which is reset log-lin computable from $n$. Hence, we can write $\psi_m$ as a prenex formula which is reset log-lin computable from $n$. When $k > 0$, the iterative definition
$$[Q(x, y) \equiv \psi_m]^k$$
defines an equivalence relation on the vertices of height less than $k$. For $k = 1$, $Q$ is an equivalence relation on the leaves, which are precisely the vertices of height 0. This relation makes all leaves equivalent. Increasing $k$ to 2, we must extend the relation to vertices of height 1. These are the vertices adjacent to a leaf and all of their children are leaves. Two such vertices are equivalent if they either have the same number of children or both have at least $m$ children. In general, for larger $k$ we extend the relation to vertices of height $k - 1$ leaving the relation unchanged for lower heights. Two vertices of height $k - 1$ are equivalent if for each equivalence class represented among their children (on which the relation has already been defined), they either have the same number of children in the class or both have at least $m$ children in the class. When $k = r$ we have an equivalence relation on the set of all vertices in a tree of height $r$ except the root. By Theorem 3.3 there is a reset log-lin reduction taking the iterative definition
$$[Q(x, y) \equiv \psi_m]^r$$
to an equivalent explicit definition
$$[Q(x, y) \equiv \theta^m_r].$$
Moreover, since $r$ is fixed we can arrange that $\theta^m_r$ is a prenex formula. We will say that two vertices $x$ and $y$ in a tree of height $r$ have the same $\theta^m_r$-type if $\theta^m_r(x, y)$ holds. Define $\delta_n(x)$ to be a prenex formula that says $x$ is a child of the root, $x$ has at least one child, and no two distinct children of $x$ have the same type. Define $\pi_n(x, y)$ to be a prenex formula that says $\delta_n(x)$ and $\delta_n(y)$ hold and there is a child $z$ of the root coding $(x, y)$. By $z$ coding $(x, y)$ we mean that for every type, if $x$ and $y$ have no child of that type, then neither does $z$; if $x$ has a child of that type but $y$ does not, then $z$ has precisely two children of that type; if $x$ has no child of that type but $y$ does, then $z$ has precisely three children of that type; and if $x$ and $y$ both have a child of that type, then $z$ has at least four children of that type. (For this coding to work, $m$ must be at least 4, but this is not really a problem because if $m < 4$, then $n < 5$ and we can easily formulate interpretations of $C_n$ in $\mathcal{T}^m_r$ when $n < 5$.) Both $\delta_n(x)$ and $\pi_n(x, y)$ are reset log-lin computable from $n$. Now it is easy to show by induction on $k$ that if $x$ and $y$ are vertices of height at most $k$, where $k < r - 1$, and if the two subtrees formed by restricting to $x$ and its descendants, and to $y$ and its descendants, are nonisomorphic trees in $\mathcal{T}^m_k$, then $x$ and $y$ have different $\theta^m_r$-types. We know that $|\mathcal{T}^m_2| = (m+1)^{m+1} > 2^n$ and that if $|\mathcal{T}^m_k| = t$, then $|\mathcal{T}^m_{k+1}| = (m+1)^t$ so $|\mathcal{T}^m_k| > \exp_{k-1}(n)$ when $k \geq 2$. Thus, for vertices of height $r - 2$ there are at least $\exp_{r-3}(n)$ $\theta^m_r$-types. If $x$ is a child of the root satisfying $\delta_n(x)$, its children are of height at most $r - 2$ and it has either 0 or 1 children of each possible $\theta^m_r$-type. Thus, it is possible to distinguish between as many as $\exp_{r-2}(n)$ vertices $x$ satisfying $\delta_n(x)$. Clearly, if $\delta_n(x)$ and $\delta_n(y)$ hold, $(x, y)$ can be coded by some child of the root. It is easy to see that every binary relation on a set of size at most $\exp_{r-2}(n)$ is isomorphic to $\langle \delta_n(x), \pi_n(x, y) \rangle$ for some tree in $\mathcal{T}^m_r$.
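The two ingredients of this argument, types computed from capped counts of children and the coding of a pair by multiplicities 2, 3, and 4, can be tried out on small trees. The sketch below is an illustration only, not the formulas of the proof: a tree is represented as a tuple of its primary subtrees (a leaf is the empty tuple), and the names `capped_type`, `code_pair`, and `decode_pair` are hypothetical. It does not check that x and y satisfy the domain condition (at least one child, no two children of the same type).

```python
from collections import Counter


def capped_type(tree, m):
    """Canonical type of the root of `tree` when counts are capped at m.

    Two vertices receive the same value exactly when, for every type that
    occurs among their children, they have the same number of children of
    that type or both have at least m.
    """
    counts = Counter(capped_type(child, m) for child in tree)
    return tuple(sorted((t, min(c, m)) for t, c in counts.items()))


def code_pair(x, y, m):
    """Build a vertex z coding (x, y): 2 copies of a type for 'x only',
    3 copies for 'y only', 4 copies for 'both'."""
    if m < 4:
        raise ValueError("the coding needs m >= 4")
    x_kids = {capped_type(c, m): c for c in x}
    y_kids = {capped_type(c, m): c for c in y}
    z = []
    for t in set(x_kids) | set(y_kids):
        representative = x_kids[t] if t in x_kids else y_kids[t]
        if t in x_kids and t in y_kids:
            copies = 4
        elif t in x_kids:
            copies = 2
        else:
            copies = 3
        z.extend([representative] * copies)
    return tuple(z)


def decode_pair(z, m):
    """Recover the sets of child types of x and of y from z."""
    counts = Counter(capped_type(c, m) for c in z)
    x_types = {t for t, c in counts.items() if c == 2 or c >= 4}
    y_types = {t for t, c in counts.items() if c >= 3}
    return x_types, y_types


if __name__ == "__main__":
    leaf = ()
    a, b, c = (leaf,), (leaf, leaf), (leaf, leaf, leaf)
    x, y = (a, b), (b, c)
    z = code_pair(x, y, m=4)
    x_types, y_types = decode_pair(z, m=4)
    print(x_types == {capped_type(t, 4) for t in x})
    print(y_types == {capped_type(t, 4) for t in y})
```

Since a coding vertex always has at least two children of some type, it never satisfies the domain condition, so domain elements and coding vertices cannot be confused.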
This concludes the case $r > 3$. Now we consider the case $r = 3$. The construction just given shows that every binary relation on a set of size at most $2^m = 2^{O(n/\log n)}$ is isomorphic to $\langle \delta_n(x), \pi_n(x, y) \rangle$ for some tree in $\mathcal{T}^m_3$. We must work harder to remove the $\log n$ denominator in the exponent; to do this we must interpret $C_n$ in $\mathcal{T}^{2m}_3$.
We begin by specifying formulas $\theta_m(x, y)$ and $\eta_m(x, y)$ which will define equivalence relations on vertices of height 2. The formula $\theta'_m(x, y)$ says that for all children $t_1, \ldots, t_m$ of $x$ with at least 1 but no more than $m$ children, there are children $u_1, \ldots, u_m$ of $y$ with at least 1 but no more than $m$ children, such that $\theta^m_2(t_i, u_i)$ holds for $1 \leq i \leq m$, and $t_i = t_j \leftrightarrow u_i = u_j$ holds for $1 \leq i, j \leq m$. The formula $\eta'_m(x, y)$ says that for all children $t_1, \ldots, t_m$ of $x$ with more than $m$ children, there are children $u_1, \ldots, u_m$ of $y$ with more than $m$ children, such that $\theta^{2m}_2(t_i, u_i)$ holds for $1 \leq i \leq m$,
and $t_i = t_j \leftrightarrow u_i = u_j$ holds for $1 \leq i, j \leq m$. Using the same argument as above we can say that $\theta_m(x, y)$ is a prenex formula which is reset log-lin computable from $n$ and equivalent to $\theta'_m(x, y) \wedge \theta'_m(y, x)$, and similarly for $\eta_m(x, y)$. Since $\theta_m$ and $\eta_m$ define equivalence relations on the set of vertices of height 2 we can speak of the $\theta_m$-type and $\eta_m$-type of a vertex $x$ of this height. Define $\nu_i(x)$ to be the minimum of $m$ and the number of children of $x$ with precisely $i$ children. Define $\nu^+_i(x)$ to be the minimum of $m$ and the number of children of $x$ with at least $i$ children. The $\theta_m$-type of $x$ is precisely determined by the values $\nu_1(x), \ldots, \nu_m(x)$ and the $\eta_m$-type by the values $\nu_{m+1}(x), \ldots, \nu_{2m-1}(x), \nu^+_{2m}(x)$. We see then that the $\theta_m$-type and the $\eta_m$-type of a vertex are independent, and that there are $(m+1)^m > 2^n$ $\theta_m$-types and the same number of $\eta_m$-types. Let $\delta_n(x)$ be a prenex formula that says $x$ is a child of the root and $\nu_0(x) = 0$. Let $\pi_n(x, y)$ be a prenex formula that says $\delta_n(x)$ and $\delta_n(y)$ hold and there is a child $z$ of the root such that $\nu_0(z) \geq 1$ and $\theta_m(x, z)$ and $\eta_m(z, y)$ hold. It is easy to arrange that $\delta_n(x)$ and $\pi_n(x, y)$ are reset log-lin computable from $n$. Each binary relation on a set of size at most $2^n$ is isomorphic to $\langle \delta_n(x), \pi_n(x, y) \rangle$ for some tree in $\mathcal{T}^{2m}_3$.
Corollary 6.4. Let $r \geq 3$. $\Sigma_r$ has a hereditary NTIME($\exp_{r-2}(cn)$) lower bound.
Corollary 6.5. Let $r \geq 3$ and $\Sigma$ be a theory in a logic $L$. If there is an interpretation of the classes $\mathcal{T}^{n/\log n}_r$ in $\Sigma$, then $\Sigma$ has a hereditary NTIME($\exp_{r-2}(cn)$) lower bound.
Remark 6.6. For each $r \geq 3$ there is a constant $d > 0$ such that every tree in $\mathcal{T}^{n/\log n}_r$ has at most $\exp_{r-2}(dn)$ vertices. Hence, we can view Corollary 6.5 as a significant improvement over Theorem 6.2 for obtaining NTIME($\exp_{r-2}(cn)$) lower bounds: rather than interpreting all binary relations on sets of size $\exp_{r-2}(cn)$ we need only interpret all trees of height $r$ on sets of this size. In applications it is often much more natural to interpret trees than binary relations. See also Theorems 7.5 and 7.9.
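The construction for r = 3 in the proof of Theorem 6.3 can be simulated on small inputs. The following is a sketch under simplifying assumptions, not the prenex formulas themselves: a vertex of height at most 2 is represented by the tuple of the numbers of children of its children, a tree of height 3 by the list of the root's children, and the names `theta_type`, `eta_type`, `vertex`, `encode`, and `decode` are hypothetical. Domain elements are the children of the root with no leaf child; a pair (a, b) is coded by a child of the root that has a leaf child, carries the first profile of a, and carries the second profile of b.

```python
from collections import Counter
from itertools import product


def theta_type(v, m):
    """Capped numbers of children with exactly i children, for i = 1..m."""
    counts = Counter(v)
    return tuple(min(m, counts[i]) for i in range(1, m + 1))


def eta_type(v, m):
    """Capped numbers of children with exactly i children for i = m+1..2m-1,
    followed by the capped number of children with at least 2m children."""
    counts = Counter(v)
    exact = tuple(min(m, counts[i]) for i in range(m + 1, 2 * m))
    at_least = min(m, sum(c for i, c in counts.items() if i >= 2 * m))
    return exact + (at_least,)


def vertex(theta, eta, leaves, m):
    """Vertex with the given two profiles and `leaves` leaf children."""
    kids = [0] * leaves
    for i, a in enumerate(theta, start=1):
        kids += [i] * a
    for i, b in enumerate(eta, start=m + 1):
        kids += [i] * b  # the last entry is realized with exactly 2m children
    return tuple(kids)


def encode(relation, size, m):
    """Children of the root of a tree interpreting `relation` on range(size)."""
    profiles = list(product(range(m + 1), repeat=m))
    if size > len(profiles):
        raise ValueError("too many elements for this value of m")
    tree = [vertex(profiles[e], profiles[e], 0, m) for e in range(size)]
    tree += [vertex(profiles[a], profiles[b], 1, m) for (a, b) in relation]
    return tree


def decode(tree, m):
    """Read the relation back using only the two kinds of profile."""
    domain = [v for v in tree if 0 not in v]   # no leaf child
    coders = [v for v in tree if 0 in v]       # at least one leaf child
    coded = {(theta_type(z, m), eta_type(z, m)) for z in coders}
    return {(i, j)
            for i, x in enumerate(domain)
            for j, y in enumerate(domain)
            if (theta_type(x, m), eta_type(y, m)) in coded}


if __name__ == "__main__":
    relation = {(0, 0), (0, 3), (4, 1), (2, 2)}
    print(decode(encode(relation, 5, 2), 2) == relation)
```

Because the two profiles are read from disjoint ranges of child sizes, a coding vertex can combine the first profile of one element with the second profile of another, which is the independence used in the proof.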
We next prove results similar to Theorem 6.3 and Corollaries 6.4 and 6.5 for finite trees of unbounded height.
Theorem 6.7. Let $C_n$ be the class of binary relations on a set of size $\exp_\infty(n - 3)$. Then there is a prenex interpretation of the classes $C_n$ in the first-order classes $\mathcal{T}^2_n$.
Proof. Recall from the proof of Theorem 6.3 the formula $\theta^m_r$ which defines an equivalence relation on the set of all vertices in trees of depth $r$ except the root. In that proof $r$ was fixed, so we could assume that $\theta^m_r$ was in prenex form, and $m$ increased with $n$. In this proof we fix $m = 2$ and
consider formulas $\theta^2_n$. We see now that $\theta^2_n$ has an iterative definition. By induction on $k$, there can be as many as $\exp_\infty(k)$ $\theta^2_n$-types among vertices of height $k < n$ in a tree in $\mathcal{T}^2_n$. We define $\delta_n$ and $\pi_n$ in essentially the same way as in Theorem 6.3. The only difference is that when a vertex $z$ coded a pair $(x, y)$ there, $z$ could have up to four children of the same type. Here, in trees from $\mathcal{T}^2_n$, we have at most two, so we refer to types of grandchildren rather than children. Thus, we interpret binary relations on sets of size $\exp_\infty(n - 3)$ rather than $\exp_\infty(n - 2)$.
Corollary 6.8. $\Sigma_\infty$ has a hereditary NTIME($\exp_\infty(cn)$) lower bound.
Corollary 6.9. If there is an interpretation of the classes $\mathcal{T}^2_n$ in a theory $\Sigma$, then $\Sigma$ has a hereditary NTIME($\exp_\infty(cn)$) lower bound.
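The growth behind Theorem 6.7 can be checked for the first few heights. The sketch below makes two assumptions that are not taken from the text: the iterated exponential is normalized so that a tower of height 0 equals 1, and the number of types at each height is modelled by the recurrence t -> (m+1)^t with m = 2, starting from a single type at height 0. The names `tower` and `two_capped_type_counts` are hypothetical.

```python
def tower(k: int) -> int:
    """Tower of k twos: tower(0) = 1 and tower(k + 1) = 2 ** tower(k)."""
    value = 1
    for _ in range(k):
        value = 2 ** value
    return value


def two_capped_type_counts(height: int) -> list:
    """Numbers of types at heights 0..height when at most two isomorphic
    primary subtrees are allowed, following t -> (2 + 1) ** t."""
    counts = [1]
    for _ in range(height):
        counts.append(3 ** counts[-1])
    return counts


if __name__ == "__main__":
    for k, t in enumerate(two_capped_type_counts(3)):
        print(k, t, tower(k), t >= tower(k))
```

Heights beyond 3 are deliberately left out, since the next count already has several trillion digits.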
7 Tools for linear ATIME lower bounds
The theorems in this section are counterparts of those in the last section. In order to obtain linear alternating time lower bounds for logical theories we must introduce a stronger form of interpretability which we call monadic interpretability. Theorems 7.2 and 7.3 tell how to obtain lower bounds by monadic interpretation of addition relations and binary relations. We then show, in Theorems 7.4 and 7.7, that binary relations have monadic interpretations in certain classes of trees of bounded height. From these results we obtain useful tools for establishing linear ATIME lower bounds, and lower bounds for monadic second-order theories of trees of bounded height. Suppose $\Sigma$ is a theory in a logic $L'$ and $C_0, C_1, C_2, \ldots$ are classes of models for a monadic second-order logic $ML$ whose vocabulary consists of relation symbols $P_1, \ldots, P_k$. Suppose that there are formulas $\delta_n(x, u)$, $\pi^1_n(x_1, u), \ldots, \pi^k_n(x_k, u)$, and $\sigma_n(x, u, t)$, reset log-lin computable from $n$, so that for each $\mathfrak{A} \in C_n$ there is a model $\mathfrak{A}'$ of $\Sigma$ and elements $m$ in $\mathfrak{A}'$ with
$$\langle \delta_n^{\mathfrak{A}'}(x, m),\ (\pi^1_n)^{\mathfrak{A}'}(x_1, m),\ \ldots,\ (\pi^k_n)^{\mathfrak{A}'}(x_k, m) \rangle$$
isomorphic to $\mathfrak{A}$, and the sets
$$\sigma_n^{\mathfrak{A}'}(x, m, p)$$
range over all subsets of $\delta_n^{\mathfrak{A}'}(x, m)$ as $p$ ranges over $\mathfrak{A}'$. The parameter sequence $u$ is allowed to grow as a function of $n$ but $t$ must remain fixed. The sequence $\{I_n \mid n \geq 0\}$ where
$$I_n = \langle \delta_n, \pi^1_n, \ldots, \pi^k_n, \sigma_n \rangle$$
is called a monadic interpretation of the classes $C_n$ in $\Sigma$. We define simple, prenex, and iterative monadic interpretations similarly to definitions in the
previous section. We also define the notion of monadic interpretation of classes $C_n$ in classes $C'_n$ as in the previous section. Evidently, a monadic interpretation of classes $C_n$ is nothing more than an interpretation where the models in $C_n$ are regarded as models for a monadic second-order logic. Note that if we have an interpretation of classes $C_n$ in a theory $\Sigma$ in some monadic second-order logic $L'$, then we can automatically extend it to a monadic interpretation by taking $\sigma_n(x, X)$ to be the formula $x \in X$ where $X$ is a new set variable. The framework for obtaining linear alternating time lower bounds is essentially the same as before. Suppose we have a monadic interpretation of classes $C_n$ in classes $C'_{kn}$ for some nonnegative integer $k$ and that there is a prescribed set $\Phi$ of formulas over $ML$ such that
$$\{\varphi \in \Phi \mid \varphi \text{ is realized in some } \mathfrak{A} \in C_n \text{ where } |\varphi| = n\}$$
and inv*($ML$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$. Now given a formula $\varphi$ in $ML$ we form a new formula as follows. Replace each monadic quantification $\exists X$ or $\forall X$ with a quantification $\exists t_X$ or $\forall t_X$, where $t_X$ is a variable sequence of the same length as $t$, uniquely determined by $X$, and not in conflict with the other variables in $\varphi$. (We may need to change other indices to avoid conflicts.) Introduce a relation variable $S$ and replace each atomic formula $x \in X$ by $S(x, t_X)$. We can easily arrange that there be a reset log-lin reduction taking $\varphi$ to this new formula. Now map each formula $\varphi$ in $\Phi$ to the formula $\varphi'$ given by
where $n = |\varphi|$. As before, it is easy to arrange that $|\varphi'| \geq kn$. If $\varphi$ is realized in some model in $C_n$, then $\varphi'$ is realized in some model in $C'_{kn}$; if $\varphi$ is true in no weak model, then $\varphi'$ is true in no model or weak model (depending on whether $L'$ is first-order or monadic second-order). When we have a prenex interpretation or an iterative interpretation and the definitions in $\varphi'$ are replaced by the appropriate iterative definitions, the sentences $\varphi'$ all belong to some prescribed set $\Phi'$ of formulas over $L'$. We have now that
$$\{\varphi' \in \Phi' \mid \varphi' \text{ is realized in some } \mathfrak{A}' \in C'_n \text{ where } |\varphi'| = n\}$$
and inv*($L'$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$. Then sat($\Sigma$) and inv($L'$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$ so $\Sigma$ has a hereditary ATIME($T(cn), cn$) lower bound. As before we allow a more elaborate definition of monadic interpretation where the formulas $\delta_n$, $\pi^1_n, \ldots, \pi^k_n$, and $\sigma_n$ may contain free relation variables which also receive prenex or iterative definitions. We summarize these remarks in the following theorem which parallels Theorem 6.1:
Theorem 7.1. Let $C_0, C_1, C_2, \ldots$ be classes of models such that for some prescribed set $\Phi$ of formulas over a monadic second-order logic $L$,
$$\{\varphi \in \Phi \mid \varphi \text{ is realized in some } \mathfrak{A} \in C_n \text{ where } |\varphi| = n\}$$
and inv*($L$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$. Let $C'_0 \subseteq C'_1 \subseteq C'_2 \subseteq \cdots$ be classes of models of a theory $\Sigma$ in a logic $L'$. If there is a prenex or iterative monadic interpretation of the classes $C_n$ in the classes $C'_{kn}$ for some nonnegative integer $k$, then the following are true.
(i) The sets sat($\Sigma$) and inv($L'$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$.
(ii) If, for some $d$ between 0 and 1, $T(dn) = o(T(n))$, then $\Sigma$ has a hereditary ATIME($T(cn), cn$) lower bound.
(iii) For each $\Sigma' \subseteq \mathrm{val}(\Sigma)$, sat($\Sigma'$) and val($\Sigma'$) are both hard for the complexity class $\bigcup_{c>0} \mathrm{ATIME}(T(cn), cn)$ via reset log-lin reductions.
(iv) For each $\Sigma' \subseteq \mathrm{val}(\Sigma)$, sat($\Sigma'$) and val($\Sigma'$) are both hard for the complexity class $\bigcup_{c>0} \mathrm{ATIME}(T(n^c), n^c)$ via polynomial time reductions.
(v) There is a prescribed set $\Phi'$ of sentences over $L'$ such that
$$\{\varphi' \in \Phi' \mid \varphi' \text{ is realized in some } \mathfrak{A}' \in C'_n \text{ where } |\varphi'| = n\}$$
and inv*($L'$) are ATIME($T(cn), cn$)-inseparable for some $c > 0$.
The following theorems are immediate consequences of the preceding theorem and Theorems 5.1, 5.2, and 5.4.
Theorem 7.2. Let $T(n)$ be a time resource bound such that for some $d$ between 0 and 1, $T(dn) = o(T(n))$. Let $C_n$ be the class of addition relations on sets of size at most $T(n)$ and $\Sigma$ a theory in a logic $L$. If there is a monadic interpretation of the classes $C_n$ in $\Sigma$, then $\Sigma$ has a hereditary ATIME($T(cn), cn$) lower bound.
Theorem 7.3. The previous theorem holds with binary relations in place of addition relations.
From the following theorem we obtain a useful tool for obtaining lower bounds, but not the best lower bounds for monadic second-order theories of trees of bounded height.
Theorem 7.4. Let $C_n$ be the class of binary relations on a set of size $\exp_{r-1}(n)$. Then there is an iterative monadic interpretation of the classes $C_n$ in the monadic second-order classes $\mathcal{T}^{2^n}_r$ when $r > 2$ and in the monadic second-order classes $\mathcal{T}^{2^{2n}}_r$ when $r = 2$.
Proof. The proof is very similar to the proof of Theorem 6.3. First, consider the case $r > 2$. We iteratively define a relation $Q'(X, Y)$ which says that $|X| = |Y| \leq 2^n$. Write $Z = X \mathbin{\dot\cup} Y$ as an abbreviation for $Z = X \cup Y \wedge X \cap Y = \emptyset$. Let $\rho(X, Y)$ be the formula
The iterative definition $[Q'(X, Y) \equiv \rho]^{n+1}$ gives the desired relation. Now let $\psi(x, y)$ be a formula, with free relation variable $Q$, which says that if $X$ is a subset of the set of children of $x$, and $Q(x_1, x_2)$ holds for all $x_1, x_2 \in X$, then there is a set $Y$ which is a subset of the set of children of $y$ such that $Q'(X, Y)$ holds, and $Q(y_1, y_2)$ holds for all $y_1, y_2 \in Y$, and $Q(x_1, y_1)$ holds for all $x_1 \in X$ and $y_1 \in Y$. Consider a tree in $\mathcal{T}^{2^n}_r$. When $k > 0$, the iterative definition
$$[Q(x, y) \equiv \psi]^k$$
defines an equivalence relation on the vertices of height less than $k$. For $k = 1$, $Q$ is an equivalence relation making all leaves equivalent. Increasing $k$ to 2, two vertices of height 1 are equivalent if they have the same number of children. (In Theorem 6.3 this was true only up to $O(n/\log n)$ children. This is the reason why we get an additional level of exponentiation here.) For larger $k$ extend to vertices of height $k - 1$, leaving the relation unchanged at lower heights. Two vertices of height $k - 1$ are equivalent if, for each equivalence class represented among their children (on which the relation has already been defined), they either have the same number of children in the class or both have at least $2^n$ children in the class. There is a reset log-lin reduction taking the iterative definition
$$[Q(x, y) \equiv \psi]^r$$
to an equivalent explicit definition
$$[Q(x, y) \equiv \theta^n_r].$$
Moreover, since $r$ is fixed this can be expressed as a simple definition. The rest of the theorem proceeds as in the proof of Theorem 6.3 except that $\theta^n_r$ is
used in place of $\theta^m_r$, $\mathcal{T}^{2^n}_r$ is used in place of $\mathcal{T}^m_r$, and we automatically have a monadic interpretation since we are interpreting in a monadic second-order theory. Let us consider the case $r = 2$. Our formulas will now have parameters $U$, $V$, $W$, and $Z$. Let $\theta^n_2(x, y, U)$ and $\eta^n_2(x, y, V)$ be defined as $\theta^n_2(x, y)$ above except that, rather than comparing the number of children of vertices $x$ and $y$ of height 1, $\theta^n_2(x, y, U)$ compares the number of children of $x$ and $y$ in $U$, and $\eta^n_2(x, y, V)$ compares the number of children of $x$ and $y$ in $V$. Now let $\delta_n(x)$ be the formula $x \in Z$. Let $\pi_n(x, y)$ be a prenex formula that asserts the following:
(a) $x \in Z$ and $y \in Z$.
(b) If $x \neq y$ there is a $t \notin Z$ such that $\theta^n_2(x, t, U)$ holds and $\eta^n_2(t, y, V)$ holds.
(c) If $x = y$, then $x \in W$.
Every binary relation on a set of size at most $2^n$ is interpreted in some tree in $\mathcal{T}^{2^{2n}}_2$. For every pair $(x, y)$ in the relation with $x \neq y$ there is at least one $t \notin Z$ such that $\theta^n_2(x, t, U)$ and $\eta^n_2(t, y, V)$ hold.
Corollary 7.5. Let $r \geq 2$ and $\Sigma$ be a theory in a logic $L$. If there is a monadic interpretation of the classes $\mathcal{T}^{2^n}_r$ in $\Sigma$, then $\Sigma$ has a hereditary ATIME($\exp_{r-1}(cn), cn$) lower bound.
Remark 7.6. For each $r \geq 2$ there is a constant $d > 0$ such that every tree in $\mathcal{T}^{2^n}_r$ has at most $\exp_{r-1}(dn)$ vertices, so we can view Corollary 7.5 as an improvement over Theorem 7.3 for obtaining ATIME($\exp_{r-1}(cn), cn$) lower bounds. It is instructive to compare this result with Corollary 6.5. It often happens that an interpretation of the classes $\mathcal{T}^{n/\log n}_r$ in a theory can be modified slightly to obtain a monadic interpretation of the classes $\mathcal{T}^{2^n}_{r-1}$ even when the theory is first-order. This explains, in part, why NTIME lower bounds can often be pushed up to linear ATIME lower bounds. We now prove another theorem about monadic interpretations of classes of trees of bounded height. We shall see in section 9 that this theorem gives the best lower bounds for monadic second-order theories of trees of bounded height.
Theorem 7.7. Let $C_n$ be the class of binary relations on a set of size $\exp_r([n/\log n])$. Then there is a prenex monadic interpretation of the classes $C_n$ in the monadic second-order classes and in the monadic second-order classes
Proof. First, consider the case $r > 1$. Formulas will contain parameters $X = X_1, X_2, \ldots, X_m$, where $m = 1 + [n/\log n]$. This is the first case we have encountered where the parameter sequence grows with $n$. Let $Q'$ be
a binary relation variable and $\theta_m(x, y)$ a prenex formula that says $x$ and $y$ are both leaves and
$$\bigwedge_{1 \leq i \leq m} (x \in X_i \leftrightarrow y \in X_i).$$
$Q'$ will be given by the prenex definition $[Q'(x, y) \equiv \theta_m]$. $Q'(x, y)$ is obviously an equivalence relation on the vertices of height 0. Now let $\psi(x, y)$ be a formula (with free relation variable $Q$) that says $x$ is not a leaf and for every child $t$ of $x$ there is a child $u$ of $y$ such that $Q(t, u)$ holds. Now the iterative definition
defines an equivalence relation on the vertices of heights less than $k$. For $k = 1$, $Q$ is identical to the equivalence relation $Q'$. For larger $k$ we extend the relation to vertices of height $k - 1$ by specifying that two such vertices are equivalent if precisely the same equivalence classes are represented among their children. When $k = r$ we have an equivalence relation on the set of all vertices in a tree of depth $r$ except the root. The iterative definition
can be converted to a prenex definition
of fixed length since $r$ is constant. By Theorem 3.1 the sequence
of prenex definitions can be replaced by a single prenex definition
where $\theta^r_m(x, y)$ is reset log-lin computable from $n$. We will say that two vertices $x$ and $y$ in a tree of height $r$ have the same $\theta^r_m$-type if $\theta^r_m(x, y)$ holds. Now it is easy to show by induction on $k$ that there is a tree of height $r$ and sets $X_1, \ldots, X_m$ of leaves in this tree such that there are at least $1 + \exp_{k+1}([n/\log n])$ $\theta^r_m$-types among vertices of height $k$ when $k < r$: every nonempty set of $\theta^r_m$-types of vertices of height $k - 1$ determines a distinct $\theta^r_m$-type for a vertex of height $k$. More is required to see that there is such a tree in $\mathcal{T}^{\exp_2([n/\log n])}_r$. When $k = 0$ there is no problem. Consider the case $k = 1$. For $i = 1, 2, \ldots, \exp_2([n/\log n])$ let $T_i$ be the tree of height
1 with precisely $i$ leaves. Each of these trees is in $\mathcal{T}^{\exp_2(m)}_1$. Without much trouble we can choose, from the leaves of each $T_i$, sets $X_1, \ldots, X_m$ so that the roots of $T_i$ and $T_j$ have different $\theta^r_m$-types when $i \neq j$. Further, for $T_1$, $X_1, \ldots, X_m$ can be chosen in at least two ways. Thus, there are at least $1 + \exp_2([n/\log n])$ $\theta^r_m$-types among vertices of height 1. From the set of trees $\{T_i \mid 1 \leq i \leq \exp_2(m)\}$ choose a nonempty subset and form a tree of height 2 by directing edges from a new vertex to the roots of trees in this subset. We can form at least $1 + \exp_3([n/\log n])$ such trees, each one in $\mathcal{T}^{\exp_2(m)}_2$. Moreover, if we carry along the subsets $X_1, \ldots, X_m$ in each subtree $T_i$, each root has a distinct $\theta^r_m$-type. Continue for each $k < r$, forming at least $1 + \exp_{k+1}([n/\log n])$ trees in $\mathcal{T}^{\exp_2([n/\log n])}_k$ with subsets $X_1, \ldots, X_m$, so that each root has a distinct $\theta^r_m$-type. The interpretation of binary relations on sets of size $\exp_r([n/\log n])$ now uses precisely the same construction used in the proofs of Theorems 6.3 and 7.4. Let us consider the case $r = 1$. Our formulas will now have parameters $X = X_1, \ldots, X_m$, $Y = Y_1, \ldots, Y_m$, and $Z$ and $W$. Define $\theta_m(x, y)$ as above and $\eta_m(x, y)$ in the same way except that $Y$ is used in place of $X$. These formulas define independent equivalence relations on leaves so we can speak of the $\theta_m$- and $\eta_m$-type of a leaf. The relations are independent and of index at most $2^m$. The rest of the proof then follows as in the case $r = 2$ in Theorem 7.4.
Corollary 7.8. Let $r \geq 1$. $M\Sigma_r$ has a hereditary ATIME($\exp_r(cn/\log n), cn$) lower bound.
Corollary 7.9. Let $r \geq 2$ and $\Sigma$ be a theory in a logic $L$. If there is a monadic interpretation of the classes $\mathcal{T}^{\exp_2(n/\log n)}_r$ in $\Sigma$, then $\Sigma$ has a hereditary ATIME($\exp_r(cn/\log n), cn$) lower bound.
The case $r = 1$ is worth stating separately. Observe that trees of height 1 do not really have much structure. We can regard them as sets with one distinguished point, the root.
Therefore, we state the result in terms of interpretations of sets rather than trees.
Corollary 7.10. Let $C_n$ be the class of sets of size at most $2^{n/\log n}$ and $\Sigma$ a theory in a logic $L$. If there is a monadic interpretation of the classes $C_n$ in $\Sigma$, then $\Sigma$ has a hereditary ATIME($2^{cn/\log n}, cn$) lower bound.
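The role of the parameter sets in the proof of Theorem 7.7 can be illustrated directly: leaves are classified by which of the sets X_1, ..., X_m contain them, so m sets distinguish at most 2^m leaves. The sketch below is only an illustration; the names `leaf_types` and `separating_sets` are hypothetical, and elements are represented by the integers 0, 1, 2, and so on.

```python
def leaf_types(elements, families):
    """Map each element to its membership vector in the sets X_1, ..., X_m."""
    return {e: tuple(e in X for X in families) for e in elements}


def separating_sets(count: int, m: int):
    """Sets X_1, ..., X_m over range(count) that give every element its own
    membership vector; this needs count <= 2 ** m."""
    if count > 2 ** m:
        raise ValueError("m sets distinguish at most 2 ** m elements")
    # X_i collects the elements whose i-th binary digit is 1.
    return [{e for e in range(count) if (e >> i) & 1} for i in range(m)]


if __name__ == "__main__":
    m, count = 3, 8
    types = leaf_types(range(count), separating_sets(count, m))
    print(len(set(types.values())))
```

Any other choice of m sets yields at most 2^m distinct vectors, which is the bound on the index of the equivalence relations on leaves used in the case r = 1.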
Kevin J. Compton and C. Ward Henson
This concludes our survey of tools for establishing lower bounds. The next section contains many examples of their application.
8 Applications
In this section we use the methods developed in earlier sections to give a representative sample of arguments for known lower bounds of theories. We believe that the details given here justify the claim that every known lower bound for a theory can be obtained in this way, with simpler, more conceptual proofs. In particular, there is no further need to code Turing machine computations. Moreover, in almost all cases our approach gives technical improvements on known results: we always obtain hereditary lower bounds; these bounds hold for both sat($\Sigma$) and val($\Sigma$), in contrast to most published NTIME lower bounds, which are for just sat($\Sigma$); and the reductions used are reset log-lin reductions rather than polynomial time, linearly bounded reductions. In a few cases we obtain qualitative improvements in the bounds. The most significant improvement is that we always obtain inseparability results.

To simplify the statement of results and avoid repetition, when we say a theory $\Sigma$ has a hereditary NTIME(T(cn)) lower bound, we intend each of the statements (i)-(iv) in Theorem 6.1. When we say a theory has a hereditary ATIME(T(cn), cn) lower bound, we intend each of the statements (i)-(iv) in Theorem 7.1. For convenience we will also use functional notation rather than relations in some examples.

Example 8.1 (The first-order theory of finite linear orders with an added unary predicate). We show that this theory has a hereditary lower bound of NTIME($\exp_\infty(cn)$) by iteratively interpreting the classes $T_n^2$ and applying Corollary 6.9. In fact, we interpret the classes $T_n$ consisting of the finite trees of height n. We denote the linear order by $\le$ and the predecessor and successor of an element x by $x - 1$ and $x + 1$ (when they exist). Let x < y be the formula $x \le y \wedge x \ne y$ and LAST(x) the formula asserting that x is the last element in the order. Let P be the unary relation symbol. We can identify each finite model (m, R) of this theory (where m = {0, ..., m - 1}) with a string $a_0 a_1 \ldots a_{m-1}$ of 0's and 1's. We stipulate that $a_i = 1$ if and only if P(i) holds. Representing a finite tree as a string of 0's and 1's is straightforward. With each vertex x in a tree of height n associate a string that begins $0^{n-r}1$, where r is the depth of x, followed by the strings associated with each child y of x. The tree is represented by the string associated with the root. For example, the tree of Figure 1 is represented by the string 0001001010100100101.
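As an illustration, the string representation just described can be sketched in a few lines of Python. The nested-list tree format, the name `encode`, and the particular tree shape are assumptions made for this sketch: the shape was read off from the quoted string, since the figure itself is not reproduced here, and n = 3 is inferred from the leading block 0001.

```python
def encode(children, n, depth=0):
    """Encode a tree given as nested lists (a vertex is the list of its children).

    Each vertex contributes n - depth zeros followed by a 1, and is then
    followed by the encodings of its children in the listed order.
    """
    return "0" * (n - depth) + "1" + "".join(
        encode(child, n, depth + 1) for child in children
    )


# Assumed shape: a root whose three children have 2, 0 and 1 leaf children.
tree = [[[], []], [], [[]]]
# The same tree with the children listed in a different order.
reordered = [[], [[]], [[], []]]

print(encode(tree, 3))       # 0001001010100100101
print(encode(reordered, 3))  # 0001001001010010101
```

Listing the children in a different order changes the string but not the tree, which is why a tree may have many representations.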
Proving lower bounds
Fig. 1. Tree example

In general, T may have many representations, since the children of each vertex are ordered arbitrarily. Thus, the tree above is also represented by the string 0001001001010010101. However, T can easily be interpreted within each of its representations. Let n(x) be the formula P(x) (indicating that x is a position where 1 occurs). Let $\theta(x, y)$ be the following formula with free relation variable Q:
Then $[Q(x, y) \equiv \theta]_{n+1}$ defines a relation Q(x, y) which holds precisely when the number of consecutive 0's preceding position x and the number of consecutive 0's preceding position y are equal and at most n. Let n(x,y) be the formula
Thus, n(x,y) holds precisely when there are 1's at positions x and y, x precedes y, there is exactly one more 0 preceding x than preceding y (but no more than n 0's preceding y), and there is no position between x and y which has as many 0's preceding as x has. Now a tree T of height n represented by a linear order is isomorphic to the structure defined by these formulas in that order when Q is given by the iterative definition $[Q(x, y) \equiv \theta]_{n+1}$.

Remark 8.2. Note that for some d > 0 each tree in $T_n^2$ has at most $\exp_\infty(dn)$ vertices, so for some d' > 0 each representation of a tree in this class has at most $\exp_\infty(d'n)$ elements. From Theorem 6.1 we see that we have a tool for obtaining further lower bounds. If there is a prenex or iterative interpretation of the classes $C_n$ consisting of linear orders of length at most $\exp_\infty(n)$ with added unary predicates in a theory $\Sigma$, then $\Sigma$ has a hereditary lower bound of NTIME($\exp_\infty(cn)$).
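The verbal description of the edge relation in Example 8.1 can be checked mechanically on the two strings quoted there. The sketch below is a brute-force reading of that description in Python, not the formula of the text (which is given by an iterative definition inside the logic); the helper names `zeros_before`, `edges`, and `shape`, and the choice n = 3, are assumptions of this sketch.

```python
def zeros_before(s, i):
    """Number of consecutive 0's immediately preceding position i of s."""
    k = 0
    while i - k - 1 >= 0 and s[i - k - 1] == "0":
        k += 1
    return k


def edges(s, n):
    """Pairs (x, y) of positions meeting the conditions stated in the text."""
    ones = [i for i, ch in enumerate(s) if ch == "1"]
    found = []
    for x in ones:
        zx = zeros_before(s, x)
        for y in ones:
            zy = zeros_before(s, y)
            if not (x < y and zx == zy + 1 and zy <= n):
                continue
            # No position strictly between x and y may have as many
            # preceding 0's as x has.
            if any(zeros_before(s, w) == zx for w in range(x + 1, y)):
                continue
            found.append((x, y))
    return found


def shape(s, n):
    """Order-independent shape of the decoded tree, as nested sorted tuples."""
    kids = {}
    for x, y in edges(s, n):
        kids.setdefault(x, []).append(y)

    def walk(v):
        return tuple(sorted(walk(c) for c in kids.get(v, [])))

    return walk(s.index("1"))  # the first 1 marks the root


a = "0001001010100100101"
b = "0001001001010010101"
print(len(edges(a, 3)), shape(a, 3) == shape(b, 3))  # 6 True
```

Both strings decode to a tree with six edges and the same shape, in line with the statement that they represent the same tree.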
Example 8.3 (The first-order theory of all linear orders). We show that this theory has a hereditary lower bound of NTIME($\exp_\infty(cn)$) by a simple interpretation of the models in Example 8.1. Consider a finite linear order $(n, \le)$ together with a unary relation R on n interpreting the predicate symbol P. We will represent $(n, \le, R)$ by a linear order $(S, \le)$ formed by replacing each $i \in n$ with a copy of the closed unit interval [0,1] if i is not in R and with a single point followed by a copy of the closed unit interval [0,1] if i is in R; the order is extended in the obvious way. Now we interpret $(n, \le, R)$ in $(S, \le)$ as follows. The first formula says that x has no immediate successor and x is either the least element or has an immediate predecessor. Clearly, it picks out all the left endpoints of the unit intervals in $(S, \le)$; there are precisely n such elements. The second formula says that x has no immediate successor but does have an immediate predecessor y such that y either has no immediate predecessor or is the least element. Thus, it picks out the left endpoints of the unit intervals associated with elements i in R. Thus, (n, exp3(dn); this estimate comes from the prime number theorem. We then obtain a monadic interpretation as in the previous example.

Remark 8.23. Evidently, the basic ideas used in Examples 8.20 and 8.22 are already found in Fischer and Rabin [1974]. However, we obtain several benefits from our approach. First, by using the machinery of the previous sections we avoid technicalities concerning coding sequences and Turing machine computations which Fischer and Rabin needed to address. Second, by emphasizing inseparability instead of just complexity, we obtain hereditary lower bounds. Finally, by observing that monadic quantification is implicit in the interpretations, we see why the appropriate lower bounds are ATIME bounds rather than NTIME bounds.
The linear alternating time lower bounds in Examples 8.20 and 8.22 were proved by Berman [1980]. A slightly weaker hereditary NTIME($2^{cn}$) lower bound for real addition was obtained by Ferrante and Rackoff [1979], and a hereditary NTIME($2^{2^{cn}}$) lower bound for Presburger arithmetic was obtained by Young [1985]. Maurin [1996; 1997a] has shown complexity bounds for theories of ordinal addition. The full strength of the prime number theorem is not needed in Example 8.22. A much simpler result, due to Chebyshev, suffices; see Theorem 7 of Hardy and Wright [1964]. This illustrates a common phenomenon
in lower bound results for theories. Crude arguments often suffice. Sophisticated mathematics and an intimate knowledge of the theory under consideration are rarely needed. In this respect, upper bound results may be much more difficult.

Note that since we have monadic quantification on sets of size $2^{cn}$ in a model of real addition (or any extension of the theory of semigroups as described in the remark after Example 8.20), we have a monadic interpretation of the corresponding classes of trees from the same representation of trees used in Example 8.1. Similarly, since we have monadic quantification on sets of size $2^{2^{cn}}$ in a model of Presburger arithmetic, we have a monadic interpretation of the corresponding classes of trees. These facts will be useful in the next two examples.

Fischer and Rabin [1974] announced two other lower bounds: a lower bound of NTIME($\exp_3(cn)$) for the first-order theory of integer multiplication; and a lower bound of NTIME($2^{2^{cn}}$) for the theory of finite Abelian groups. They did not supply proofs (but mentioned the key idea of encoding sequences of integers as exponents in a prime decomposition). To our knowledge, no proofs have ever been published. In the next two examples we sketch proofs of the stronger ATIME versions of these results.

Example 8.24 (The first-order theory of multiplication on the positive integers). This is the first-order theory of the model (I, ·), where I is the set of positive integers. We obtain a hereditary lower bound of ATIME($\exp_3(cn)$, cn)
by giving a monadic iterative interpretation of the classes T42n and applying Corollary 7.5. Let Pi be the i-th prime number. Observe that (II,.) is isomorphic to a direct sum of countably many copies of (N, +) by the mapping that takes the sequence (a1,a2,...), where ai is 0 for all large i, to Moreover, each direct summand (i.e., set of powers of some prime Pi) can be defined. But we saw in Example 8.22 and Remark 8.23 that there is a monadic interpretation of the classes in {N, +). The idea here is that we interpret a tree from in by directing edges from a new root to the roots of trees interpreted in each direct summand. Clearly, we can specify a first-order formula a(x,y,z,t) that says t is a prime, x, y, and z are powers of t, and x.y = z. Let 6n(x, u'), n(x,y,u'), and o n (x,u',v') be formulas giving a monadic iterative interpretation of in (N,+). We saw in Example 8.22 that such formulas exist. (If we substitute a(x,y, z,t) for all occurrences of x + y= z we obtain formulas 8' n (x,t,u'), 'n(x,y,t,u'), and o' n (x,t,u',v') in the language of (II,.). For each fixed prime t, these formulas give a monadic iterative interpretation of in (II,.). For S'n(x,t,u') to be satisfied, it is necessary that x and u' be powers of t. Thus, except possibly for 1, which is a power of every
prime, there are no elements common to the interpretations of S'n(x, t,u') and S'n(x, t',u") when t and t' are distinct primes. Let n(t,u',u) be a formula that says t is a prime and u' is the largest power of t dividing u. Define S'n(x,u) to be the formula
. Define n"(x,y,u) to be the formula
The first disjunct gives an edge from 1, the root of the tree, to the root of each primary subtree; the second disjunct gives the edges in the primary subtrees. UsingS"n|and n" we can interpret each tree from by taking a disjoint collection of trees in and 1 as a new root—we need only choose u suitably. Define 0 " n ( x , u , v , w ) to be the formula
By varying v and w we obtain all subsets of $\delta''_n(x, u)$, so we have a monadic interpretation.

Remark 8.25. An upper bound of ATIME($\exp_3(dn)$, dn) for the first-order theory of integer multiplication can be obtained from the treatment of this theory in Ferrante and Rackoff [1979]. The original reference for an upper bound on the first-order theory of integer multiplication is Rackoff [1975b; 1976]. Decidability results and complexity bounds for related theories have been given by Maurin [1997b] and Michel [1981; 1992].

As our last example we consider the first-order theory of finite Abelian groups. Lo [1988] has given an extensive treatment of upper bounds for theories of Abelian groups. He states these bounds in terms of the classes SPACE, but it is clear that his analysis gives ATIME($2^{2^{dn}}$, dn) upper bounds. We derive a matching lower bound, not just for the theory of finite Abelian groups, but also for the theory of finite cyclic groups.

Example 8.26 (The first-order theory of finite cyclic groups). We
obtain a hereditary lower bound of ATIME($2^{2^{cn}}$, cn)
by giving a monadic iterative interpretation of the classes $T_3^{2^n}$ and applying Theorem 7.5. We use a device similar to the one used in Example 8.24. Now, rather than regarding the positive integers with multiplication as a direct sum of copies of (N, +), we regard a finite cyclic group as a direct sum of cyclic groups whose orders are prime powers.
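The decomposition of a finite cyclic group into prime-power summands can be made concrete with a small Python sketch. It takes the additive group of integers modulo a hypothetical m = 360 and computes, by exhaustive search, the component of each element in each summand, using the criterion this example goes on to state: the component is annihilated by the prime power s, and its difference from the original element is divisible by s. The function names are assumptions of this sketch.

```python
def prime_power_factors(m):
    """Maximal prime powers dividing m, e.g. 360 -> [8, 9, 5]."""
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:
                m //= p
                q *= p
            factors.append(q)
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def components(t, m):
    """Write t in Z_m as a sum of elements, one from each prime-power summand."""
    parts = []
    for s in prime_power_factors(m):
        multiples_of_s = {(s * k) % m for k in range(m)}
        # The component is annihilated by s, and t minus it is divisible by s.
        candidates = [c for c in range(m)
                      if (s * c) % m == 0 and (t - c) % m in multiples_of_s]
        assert len(candidates) == 1  # the component is unique
        parts.append(candidates[0])
    return parts


m = 360  # hypothetical modulus: 360 = 8 * 9 * 5
for t in range(m):
    assert sum(components(t, m)) % m == t
print(components(1, m))  # [225, 280, 216]
```

Taking t to be a generator (here t = 1), the components are generators of the respective summands, which is how a generator is singled out for each factor subgroup.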
Let C(l) be the cyclic group of order l. We know from the remarks following Example 8.22 that there is a d > 0 such that when $l > 2^{2^{dn}}$, there is a monadic iterative interpretation of $T_2^{2^n}$ in C(l). More precisely, there are formulas $\delta_n(x,t',u')$, n(x,y,t',u'), and $\sigma_n(x,t',u',v')$ given by iterative definitions such that if u' is a generator of C(l), then each tree in $T_2^{2^n}$ is isomorphic to ($\delta_n(x,t',u')$, n(x,y,t',u')) for some t' in C(l), and $\sigma_n(x,t',u',v')$ includes all subsets of $\delta_n(x,t',u')$ as v' ranges over elements of C(l). (We need to mention the generator u' of C(l) explicitly in these formulas because there is no preferred generator; this necessitates only a minor modification of the formulas in Example 8.20.)

Let $p_1, p_2, \ldots, p_k$ be the prime numbers less than $2^{2^{dn+1}}$ and $m_i$ the largest power of $p_i$ less than $2^{2^{dn+1}}$. Then since $p_i \cdot m_i \ge 2^{2^{dn+1}}$ and $p_i \le m_i$, we know that $m_i \ge (2^{2^{dn+1}})^{1/2} = 2^{2^{dn}}$; equality would force $p_i = m_i = 2^{2^{dn}}$, which is not prime, so $m_i > 2^{2^{dn}}$ and there is a monadic iterative interpretation of $T_2^{2^n}$ in each $C(m_i)$ as described above. By Chebyshev's theorem (Theorem 7 of Hardy and Wright [1964]) we know that k, the number of primes less than $2^{2^{dn+1}}$, is at least $2^{2^{dn}}$ for sufficiently large n. By taking d large enough, we can ensure that k is greater than the maximum number of primary subtrees in each tree of $T_3^{2^n}$. Now take $m = m_1 m_2 \cdots m_k$ so that $C(m) = C(m_1) \oplus C(m_2) \oplus \cdots \oplus C(m_k)$. We see that $m > (2^{2^{dn}})^k > 2^{2^{dn}}$.

We need to show that we can combine the monadic iterative interpretations of $T_2^{2^n}$ in the direct summands $C(m_i)$ to obtain a monadic iterative interpretation of $T_3^{2^n}$ in C(m). To do this, we will show that we can define the decomposition of C(m) into the factor subgroups $C(m_i)$. Let $\theta(x, y, t, u)$ be the following formula with free relation variable Q:
Then by induction on n the relation Q(x, y, t, u) given by the iterative definition is true precisely when there is an integer j in the range $0 \le j < 2^{2^{n-1}}$ such that x = jt and y = ju. Notice that we are using an idea first exploited by Fischer and Rabin [1974] in their proof of a lower bound for Example 8.20: each integer $j < 2^{2^n}$ can be written $j = j_1 j_2 + j_3 + j_4$, where $j_1, j_2, j_3, j_4 < 2^{2^{n-1}}$. For the remainder of the proof we will assume that Q(x, y, t, u) is given by the iterative definition
Fix a generator u of C(m). We can specify that x = ju for some integer j in the interval by writing Q(x,x,u,u). Since m > 22 , the values ju are distinct for j E I. Let us identify I with the set of elements ju such that j E I. The group operation restricted to 7 defines integer addition. We can also define multiplication: if X1 = j1u and x2 = J2U then y = j1j2u precisely when Q ( y , x 1 , X 2 , u) holds. Therefore, we can say that s € I is prime and also that s E I is a prime power. Let a(x,y,z, s,u) be a first-order formula that says that s E I is the largest power of some prime in I, that x, y, and z are annihilated by s (i.e., if a(x, y, z, s, u) holds, then s = mi for some i < k and x,y,z E C(mi). In other words, a(x, y, z, s, u) defines addition on the direct summands C(mi). Now every element t in C(m) can be uniquely expressed as a sum
where each ti is an element of C(mi). We can specify a formula n(s, t', t, u) that says t1 = ti when s is m,. We say simply that t' is annihilated by s and t — t' is divisible by s:
Thus, we can define the decomposition of C(m) into its factor subgroups. In particular, since u can be expressed as a sum uI + u2 + • • • + Uk , where ui is a generator of (C(mi), the formula n(s,u',u,u) picks a unique generator for each factor subgroup as s ranges over maximal prime powers in I. The rest of the proof proceeds as in Example 8.24. For example, we form S'n(x, s,t',u',u) by substituting a(x,y,z,s,u) for each occurrence of x + y = z in Sn(x,t', u'). ThenS"n£(x,t, u) is the formula
We define n " ( x , y , t , u ) and a"n(x,t,u,v,w) similarly.
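The arithmetic fact borrowed from Fischer and Rabin in Example 8.26, that every integer below $2^{2^n}$ is of the form $j_1 j_2 + j_3 + j_4$ with all four terms below $2^{2^{n-1}}$, can be verified exhaustively for small n. The Python sketch below exhibits one possible choice of the four terms; the function name `split` and this particular choice are assumptions of the sketch, and the exponents are read as iterated powers of 2.

```python
def split(j, n):
    """Return (j1, j2, j3, j4), each below 2**(2**(n-1)), with j = j1*j2 + j3 + j4."""
    bound = 2 ** (2 ** (n - 1))
    assert n >= 1 and 0 <= j < bound * bound
    j1 = bound - 1
    j2 = min(j // j1, bound - 1)
    rest = j - j1 * j2          # at most 2 * (bound - 1)
    j3 = min(rest, bound - 1)
    j4 = rest - j3
    return j1, j2, j3, j4


for n in (1, 2, 3):
    bound = 2 ** (2 ** (n - 1))
    for j in range(bound * bound):
        j1, j2, j3, j4 = split(j, n)
        assert j == j1 * j2 + j3 + j4
        assert max(j1, j2, j3, j4) < bound
print(split(255, 3))  # (15, 15, 15, 15)
```

The extreme case j = 255 shows why two additive terms are needed: with all terms at most 15, the product alone reaches only 225.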
9 Upper bounds
In this section we give upper bounds that show that most of the lower bounds obtained in sections 4-7 are the best possible. First we give upper bounds for satT(Lo), satpT(Lo), and sat*T(Lo). Recall from Theorem 4.5 that when T(dn) = o(T(n)) for some d between 0 and 1, these sets are not in NTIME(T(cn)) for some c > 0.
Proposition 9.1. Let T be a time resource bound. Then
Proof. To determine whether a sentence from $L_0$ is in $\mathrm{sat}_T(L_0)$, nondeterministically generate a finite binary relation. We give a nondeterministic recursive procedure that determines whether the sentence is true in this relation. If the sentence is in $\mathrm{sat}_T(L_0)$, then it holds in some binary relation on a set of size at most T(n). The representation of this relation requires at most $T(n)^2$ bits. We will show that our recursive procedure halts within time $cT(n)^{n+2}$ on this input. The procedure tests the subformulas of the sentence and combines results to produce an answer. We may assume that all negations have been pushed inward so that only atomic formulas are negated. It is clear how the procedure works if a subformula is a conjunction or disjunction. If it begins with an existential quantifier, an element of the underlying set is nondeterministically assigned as the value of the quantified variable and the enclosed formula is checked. If it begins with a universal quantifier, then each element of the underlying set is assigned in turn to the quantified variable and the enclosed formula is checked. When an atomic formula is reached it can be determined in time $O(T(n)^2)$ whether it is true in the relation for the assignment values at that point. Since, for each of at most n universal quantifiers, T(n) values are generated, the total time is $O(T(n)^2 T(n)^n)$, as claimed.

Suppose the sentence is in prenex normal form. For each $\varepsilon > 0$, whenever n is sufficiently large there are at most $(1 + \varepsilon)n/\log n$ universal quantifiers in it, so determining whether a prenex formula is in sat(Lo)) is in then used, except that when a relation variable is encountered, it is necessary to jump to its definition (this may take n moves), compute its value by calling our recursive procedure, and return.
Total time, then, is $O(nT(n)^2 T(n)^n)$ because the tree of recursive procedure calls has height at most n and branches at most T(n) times at each vertex; at the leaves, there is a cost of $T(n)^2$ moves to evaluate atomic formulas; at the vertices corresponding to relation variable references there is a cost of O(n) moves to find the definition. For all three bounds we must use the linear speed-up theorem (see Hopcroft and Ullman [1979]) to eliminate constants in front of the time bounds.

We see that if $T(n)^{n+2} = O(T(dn))$ for some d > 0, then $\mathrm{sat}_T(L_0) \in \mathrm{NTIME}(T(dn))$, so we have essentially the same upper and lower bounds. Similar remarks pertain in the other cases.
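The recursive procedure in the proof of Proposition 9.1 can be sketched in Python as follows. This is a deterministic stand-in under stated assumptions: the nondeterministic guess at an existential quantifier is replaced by a search over all elements, the relation is given rather than guessed, and the tuple encoding of formulas (tags `"R"`, `"notR"`, `"and"`, `"or"`, `"exists"`, `"forall"`) is a hypothetical format chosen for this sketch. Formulas are assumed to be in negation normal form, as in the proof.

```python
def holds(phi, universe, rel, env=None):
    """Evaluate a formula in negation normal form over a finite binary relation."""
    env = env or {}
    op = phi[0]
    if op == "R":        # atomic formula R(x, y)
        return (env[phi[1]], env[phi[2]]) in rel
    if op == "notR":     # negated atomic formula
        return (env[phi[1]], env[phi[2]]) not in rel
    if op == "and":
        return holds(phi[1], universe, rel, env) and holds(phi[2], universe, rel, env)
    if op == "or":
        return holds(phi[1], universe, rel, env) or holds(phi[2], universe, rel, env)
    if op == "exists":   # deterministic stand-in for a nondeterministic guess
        return any(holds(phi[2], universe, rel, {**env, phi[1]: a}) for a in universe)
    if op == "forall":   # every element is tried in turn
        return all(holds(phi[2], universe, rel, {**env, phi[1]: a}) for a in universe)
    raise ValueError(f"unknown connective: {op}")


universe = range(3)
rel = {(0, 1), (1, 2)}  # hypothetical relation: a path 0 -> 1 -> 2
has_neighbour = ("forall", "x", ("exists", "y", ("or", ("R", "x", "y"), ("R", "y", "x"))))
has_successor = ("forall", "x", ("exists", "y", ("R", "x", "y")))
print(holds(has_neighbour, universe, rel), holds(has_successor, universe, rel))  # True False
```

Each universal quantifier multiplies the work by the size of the universe, which is the source of the factor $T(n)^n$ in the time bound.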
Proposition 9.2. Let T be a time resource bound. Then
Proof. Given a sentence of length n in MZ/o, nondeterministically generate a binary relation. We use alternation to determine whether holds in the relation. If E satT(MLo), then it holds in some binary relation on a set of size at most T ( n ) . For each set quantifier encountered it is necessary to generate T(n) bits to assign a value to the quantified variable. There are O(n) such variables so this part of the computation takes time O(nT(n}). This time is dominated by the O(T(n)2) time needed to generate and verify atomic formulas. If is a sentence in ML0,we use the same procedure except that when a subformula with a relation variable is encountered, the value of the subformula is guessed and verified using alternation. I Notice that if
for some c > 0, then, for some so
same upper and lower bounds.

We now turn to upper bounds for theories of finite trees. To determine if a sentence of length n is in sat($\Sigma_r$), we will show that it suffices to nondeterministically generate a tree in $T_r^m$, where m log m > n, and verify that the sentence holds in this tree. In fact, we prove a somewhat stronger result: given a tree of height r or less, there is a tree in $T_r^m$ satisfying precisely the same sentences of length n or less. Our proof uses Ehrenfeucht games. Observe that a sentence of length n can have at most m distinct variables (i.e., variables with different subscripts). Therefore, we will use the formulation of Ehrenfeucht games for logics with a bounded number of variables. These games were introduced for infinitary logics by Barwise [1977] and later used by Immerman [1986] to obtain lower bounds for queries on finite relational structures.

Given two structures A and B for a first-order logic L, write $A \equiv_n^m B$ to indicate that A and B satisfy precisely the same sentences from L of quantifier rank at most n and with at most m distinct variables. The game used to characterize $\equiv_n^m$ is played for n moves between players I and II on a pair of structures A and B. Each player begins with m pebbles. On the first move player I places a pebble on an element of A (or B) and player II responds by placing a corresponding pebble on an element of B (or A). On each remaining move player I has two options: he may place an unplayed pebble on an element of A (or B), in which case player II places a corresponding pebble on an element of B (or A); or he may remove one of the pebbles on A (or B) and replay it (not necessarily on the same structure), in which case player II removes the corresponding pebble from
B (or A) and replays it in response to the move of player I. Player II wins if the mapping from the set of elements of A covered by pebbles at the end of the game to corresponding elements in B is an isomorphism between substructures of A and B; otherwise, player I wins. The basic result concerning this game is that player II has a winning strategy if and only if $A \equiv_n^m B$. We use this fact to prove two simple lemmas. We will assume henceforth that in addition to the binary edge relation, trees also have a unary relation that is true only at the root. This is a technical convenience to force player II to pebble a root whenever player I pebbles a root. For a tree A and a vertex x in A, let $A_x$ be the subtree of A whose set of vertices consists of x and all of its descendants.

Lemma 9.3. Suppose A is a tree and x is a vertex of A. Let A' be the result of replacing $A_x$ in A by another tree B. If $A_x \equiv_n^m B$, then $A \equiv_n^m A'$.

Proof. We know that player II has a winning strategy for the m pebble game of length n on $A_x$ and B. Player II uses this as part of a winning strategy for the m pebble game of length n on A and A'. Whenever player I pebbles an element in $A - A_x$ or $A' - B$, player II responds by pebbling the same element; whenever player I pebbles an element in $A_x$ or B, player II responds by pebbling the element dictated by the strategy for $A_x$ and B. Notice that if player I pebbles the root of $A_x$, player II pebbles the root of B (and vice versa) so that adjacency to elements in $A - A_x$ will be the same. This is a winning strategy for player II, so $A \equiv_n^m A'$.

Lemma 9.4. Let A and B be trees. Suppose that for each isomorphism type, the set of primary subtrees of A and the set of primary subtrees of B either contain the same number of trees of that type, or both contain at least m trees of that type. Then $A \equiv_n^m B$ for every n > 0.

Proof. It is easy to determine a winning strategy for player II. If player I pebbles the root of A (or B), player II pebbles the root of B (or A).
If player I pebbles an element in a primary subtree A' of A and no other elements of A' have been pebbled, player II responds by pebbling the corresponding element in a primary subtree $B' \cong A'$ of B where no other elements have been pebbled. Player II responds similarly if player I pebbles an element in a primary subtree B' of B and no other elements of B' have been pebbled. If player I chooses an element in a primary subtree A' of A where some elements have already been pebbled, these elements correspond to elements already pebbled in some primary subtree $B' \cong A'$ of B. The isomorphism determines the response of player II. Player II responds similarly if player I chooses from a primary subtree of B where elements have already been pebbled. Because no more than m elements in a structure are pebbled at
any time, it is easy to see that this strategy can always be carried out for A and B satisfying the hypotheses of the lemma.

Theorem 9.5. Given a finite tree A of height at most r, there is a tree $B \in T_r^m$ such that $A \equiv_n^m B$ for all n > 0.

Proof. Modify A in the following manner. For each nonleaf vertex x of depth r - 1, consider all children y of x and subtrees $A_y$; for each isomorphism type, if more than m subtrees $A_y$ are isomorphic, delete enough of them so that there are precisely m. Continue this modification procedure for vertices of depth r - 2, r - 3, and so on, up to the root. Call the resulting tree B. It is clear from the two preceding lemmas that every time we delete subtrees in this process, we obtain a tree in the same $\equiv_n^m$-class as A. Thus, $A \equiv_n^m B$. It is evident that $B \in T_r^m$.

Remark 9.6. With slight modifications, this proof shows that for any tree A of height r or less, there is a tree $B \in T_r^n$ such that $A \equiv_n^m B$ for all m > 0. We have, as an immediate consequence of the preceding theorem, upper bounds for theories of finite trees.
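The pruning procedure in the proof of Theorem 9.5 is easy to sketch in Python. The nested-list input format, the canonical nested-tuple output, and the names `prune` and `size` are assumptions of this sketch; the recursion handles the deepest vertices first, matching the bottom-up order of the proof.

```python
from collections import Counter


def prune(children, m):
    """Canonical form of the tree after keeping at most m copies of each subtree type.

    A tree is given by the list of its primary subtrees; the result is a
    nested tuple whose entries are sorted, so isomorphic trees compare equal.
    """
    pruned = [prune(child, m) for child in children]   # deepest vertices first
    kept = []
    for subtree, count in Counter(pruned).items():
        kept.extend([subtree] * min(count, m))         # delete surplus copies
    return tuple(sorted(kept))


def size(tree):
    """Number of vertices of a tree in canonical (nested tuple) form."""
    return 1 + sum(size(child) for child in tree)


star = [[], [], [], []]        # hypothetical input: a root with four leaves
big = [star, star, star]       # three isomorphic primary subtrees
small = prune(big, 2)
print(size(small))  # 7
```

With m = 2 the sixteen-vertex input shrinks to seven vertices: each star keeps two leaves, and only two of the three resulting isomorphic subtrees survive.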
Corollary 9.7.
Proof. When $r \ge 3$ and m is the least integer such that m log m > n, each tree in $T_r^m$ has at most $\exp_{r-2}(cn)$ vertices. Thus, the time required to nondeterministically generate a tree in $T_r^m$ and determine whether a sentence of length n holds in this tree is $\exp_{r-2}(cn)^n$. This function is dominated by $\exp_{r-2}(dn)$ for some d > 0 when r > 3 and by $\exp_{r-2}(dn^2)$ when r = 3.

Remark 9.8. This theorem gives matching upper bounds for the lower bounds obtained in Corollary 6.4 except for the case r = 3. There the lower bound is NTIME($2^{cn}$) and the upper bound is NTIME($2^{dn^2}$). Ferrante and Rackoff [1979] get precisely the same bounds for the theory of one-to-one functions (cf. Example 8.12). It is more satisfying to say that sat($\Sigma_3$) is a complete problem for
via polynomial time reductions, according to Theorem 6.1(iv). Both sat(E1) and sat(E2) are in PSPACE since the number of vertices in trees in T1m and 72m is polynomially bounded in n, where m is the
least integer such that m log m > n. (Recall that by Theorem 9.5, for every finite tree A of height at most r, there is a tree such that A =™ B.) Every first-order theory with a model of power greater than 1 is hard for PSPACE (via log-space reductions), so we know fairly precisely the complexity of these theories. Write A &% B to indicate that A and 03 satisfy the same monadic second-order sentences of quantifier rank n with at most m variables. To obtain upper bounds for monadic second-order theories of finite trees we must introduce Ehrenfeucht games characterizing the relation mn. In such a game, players I and II play for n moves on structures A and B. Let PI , P2, . . . , Pm be unary relation symbols not in the language of A and B. During each move of the game one of the symbols Pi will be assigned a pair of sets. This pair contains a subset of A and a subset of B. Initially, each of these symbols is assigned the empty set for A and the empty set for 03. On each move player I picks a relation symbol Pi. The previous assignment to Pi, is forgotten. Player I assigns a subset of A (or B). Player II responds by assigning a subset of B (or A) to Pi . Whenever player I picks a singleton set, player II must respond with a singleton set. (Singleton set moves correspond to element quantifiers.) Now suppose that at the end of the game symbols P1 , P2, . . . , Pm are assigned subsets P1 , R2, . . . , Rm of A and subsets S1 , S2, . . . , Sm of 03. If the set
is an isomorphism between substructures of
then player II wins; otherwise, player I wins. The basic result concerning this game is that player II has a winning strategy if and only if A ™ B. We will sometimes say that an Ehrenfeucht game is played on structures (A, RI , . . . , Rm) and (B, S1 , . . . , Sm), where R1 , . . . , Rm are subsets of A and S1, . . . ,Sm are subsets of 03. By this we mean that the symbols P1 , . . . , Pm initially have R1 , . . . , Rm and S1 , . . . , Sm assigned to them. The game then proceeds as before. We will say that (A, R1, . . . , Rm) and (03, S1, . . . ,Sm) are n-equivalent if player II has a winning strategy for games of length n on this pair of structures. Clearly, n-equivalence is an equivalence relation. Using monadic second-order Ehrenfeucht games, we can show that =™ can be replaced by ™ in Lemma 9.3. In fact, we can show more. Lemma 9.9. Suppose that A is a finite tree with subsets RI, . . . , Rm. Let x be a vertex the structure obtained by replacing the substructure by another
structure (B, S1,..., Sm), where B is a tree. If
are n-equivalent, then so are We would also like to prove a result like Lemma 9.4, but this is not so simple. We must prove results like Lemma 9.4 and Theorem 9.5 simultaneously. To do this we need more complicated sets of trees. Definition 9.10. Define functions f(m,n,r) and g(m,n,r) as follows.
and
Functions f and g are defined for all integers m,n,r repeatedly use the last equation to obtain
> 0 because we can
and then replace each factor f(m,i,r + 1) by 2m(g(m,i,r) + l)f( m , i , r ). Now define classes Urm,n of trees of height at most r. Uom,n contains all trees of height 0. consists of all trees whose primary subtrees come f r o m , no more than g(m, n, r) primary subtrees coming from the same isomorphism class. Define classes Vrm,n of structures (A,R1,..., Rm), where A is a tree of height at most r and each Ri is a subset ofA.vom,n contains all structures (A, R1, ..., Rm), where A is a tree of height 0. by restricting to primary subtrees of A all come from Vrm,n, no more than g(m,n,r) such substructures coming from the same isomorphism class. Notice that . Theorem 9.11. Given a finite tree A of height at most r and an integer n>0, there is a tree B € Urm,n such that A mnB. Proof. We prove a more general assertion. Given a structure ( A , R I , ..., Rm}, where A is a tree of height at most r and R1,..., Rm are subsets of A, there is a structure such that and n-equivalent. The proof is by induction on r. For r = 0, the assertion is obvious. Assume that A has height r > 0 and that the assertion holds for trees of lesser
height. Consider the substructures of $(A, R_1, \ldots, R_m)$ formed by restricting to primary subtrees. By the induction hypothesis, each such substructure is $n$-equivalent to some structure in [...]. For each $n$-equivalence class, replace the substructures in that class by a structure from [...] in the same class, with the provision that if there are more than $g(m, n, r-1)$ substructures in the class we first eliminate enough of them to make their number precisely $g(m, n, r-1)$. In this way we form a structure $(B, S_1, \ldots, S_m)$ in [...]. We show that $(A, R_1, \ldots, R_m)$ and $(B, S_1, \ldots, S_m)$ are $n$-equivalent.
Fix an $n$-equivalence type $T$. Let $(A', R'_1, \ldots, R'_m)$ be the union of the substructures of $(A, R_1, \ldots, R_m)$ of type $T$ formed by restricting to primary subtrees of $A$. Define the substructure $(B', S'_1, \ldots, S'_m)$ of $(B, S_1, \ldots, S_m)$ similarly. Thus $A'$ and $B'$ are forests, and each substructure of $(A', R'_1, \ldots, R'_m)$ formed by restricting to a tree in the forest is of type $T$. Moreover, $A'$ and $B'$ either contain the same number of trees or both contain at least $g(m, n, r-1)$ trees. We claim that $(A', R'_1, \ldots, R'_m)$ and $(B', S'_1, \ldots, S'_m)$ are $n$-equivalent. From this claim it follows easily that $(A, R_1, \ldots, R_m)$ and $(B, S_1, \ldots, S_m)$ are $n$-equivalent, because player II can combine the winning strategies on the pairs of substructures.
We establish the claim by induction on $n$. The case $n = 0$ is clear. (Notice, however, that it is crucial that $g(m, 0, r-1) = 2$, because $S'_i$ must be assigned a nonsingleton set whenever $R'_i$ is assigned a nonsingleton set.) Suppose that $n > 0$ and that the claim is true for smaller values. Player I will begin by assigning a subset of one of the forests, say a subset $R''_i$ of $A'$, to a relation symbol $P_i$. This gives a new structure $(A', R''_1, \ldots, R''_m)$, where $R''_j = R'_j$ when $j \neq i$. By the induction hypothesis (for the induction on $r$), every substructure formed by restricting this structure to a tree in $A'$ is $(n-1)$-equivalent to some structure in [...]; there can be at most $f(m, n-1, r-1)$ $(n-1)$-equivalence classes represented among such substructures. Player II responds by assigning a subset $S''_i$ of $B'$ to $P_i$ to obtain a structure $(B', S''_1, \ldots, S''_m)$, where $S''_j = S'_j$ when $j \neq i$. She does this in such a way that for every $(n-1)$-equivalence type $T'$, the two structures either have the same number of substructures of type $T'$ formed by restricting to trees in $A'$ and $B'$, or both have at least $g(m, n-1, r-1)$ such substructures of type $T'$. Player II can always make such a response because $A'$ and $B'$ either have the same number of trees or both have at least $g(m, n, r-1) = f(m, n-1, r-1)\, g(m, n-1, r-1)$ trees. By the induction hypothesis (for the induction on $n$), $(A', R''_1, \ldots, R''_m)$ and $(B', S''_1, \ldots, S''_m)$ are $(n-1)$-equivalent, so $(A', R'_1, \ldots, R'_m)$ and $(B', S'_1, \ldots, S'_m)$ are $n$-equivalent. ∎
Theorem 9.12. For each $r \geq 1$ there is a $d > 0$ such that [...].
Kevin J. Compton and C. Ward Henson
Proof. It is easy to show by induction that for each $r \geq 1$ there is a $c > 0$ such that [...]. If we let $h(m, n, r)$ be the maximum number of vertices of any structure in [...], we see that [...]. For each $r \geq 2$ there is a $c > 0$ such that $h(m, n, r) < \exp_r(c(m + \log n))$. When $r \geq 2$ we can determine whether a sentence from $ML_t$ is in sat($ME_r$) by nondeterministically generating a tree in [...], where $m \log m > n$, and using alternation to verify that the sentence holds in this tree. This can be done in ATIME($\exp_r(dn/\log n)$, $n$).
When $r = 1$ we must be a little more careful because a tree in [...] structure. Suppose we have chosen $m$ subsets $R_1, \ldots, R_m$ from a tree $A$ of height 1. Let $+R_i$ be $R_i$ and $-R_i$ be the complement of $R_i$. For [...]. We only keep track of which sets $R_i$ contain the root of $A$ and the values $|d \cdot R|$ for each $d \in \{+, -\}^m$. Thus, $(A, R_1, \ldots, R_m)$ can be represented in space $O(2^m \log n)$. Using this kind of representation, the argument above shows that sat [...].
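The bookkeeping for trees of height 1 described in the proof can be sketched in Python. This is only an illustration under stated assumptions, not the authors' construction: the formula defining $d \cdot R$ is missing from the text above, so the sketch assumes that $d \cdot R$ is the set of leaves lying in $d_1 R_1 \cap \cdots \cap d_m R_m$. The names `sign_vector` and `summarize_height1_tree` are hypothetical.

```python
from collections import Counter
from itertools import product


def sign_vector(vertex, subsets):
    """Record, for each R_i, whether vertex lies in R_i ('+') or in its complement ('-')."""
    return tuple('+' if vertex in r else '-' for r in subsets)


def summarize_height1_tree(root, leaves, subsets):
    """Summarize (A, R_1, ..., R_m) for a tree A of height 1.

    Assumption (not taken from the text): d . R is read as the set of leaves
    in d_1 R_1 & ... & d_m R_m, where +R_i = R_i and -R_i is its complement.
    """
    root_signs = sign_vector(root, subsets)
    counts = Counter(sign_vector(leaf, subsets) for leaf in leaves)
    m = len(subsets)
    # One counter for each of the 2**m sign vectors d.
    table = {d: counts.get(d, 0) for d in product('+-', repeat=m)}
    return root_signs, table


# Hypothetical example: root 0, five leaves, two subsets.
root_signs, table = summarize_height1_tree(0, [1, 2, 3, 4, 5], [{0, 1, 2}, {2, 3}])
print(root_signs)  # ('+', '-')
print(table)       # {('+', '+'): 1, ('+', '-'): 1, ('-', '+'): 1, ('-', '-'): 2}
```

The sketch stores exact counts and makes no attempt to reproduce the space bound stated in the proof. It only shows that the summary needs one number per sign vector rather than a list of all leaves.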
10 Open problems
We close with a list of problems, first presented in Compton and Henson [1990]. They are mostly concerned with lower bounds for theories arising in algebra. The only one on which there has been substantial progress is Problem 10.13. These problems represent only a small number of the decidable theories whose complexities deserve to be investigated. We have not mentioned many of the decidable theories of modules investigated in recent years; see Point [1986; 1993], Point and Prest [1993], and Prest [1988a; 1988b], for example.
Problem 10.1. Determine the complexity of first-order theories of finite fields.
The first-order theory of finite fields, and several related theories, were shown to be decidable by Ax [1968] in a paper which has proved to be
of great mathematical influence. Later Fried and Sacerdote [1976] gave a primitive recursive decision procedure, but it is not known whether any of these theories is elementary recursive. It is not difficult to show that these theories have a hereditary lower bound of ATIME($\exp_2(cn)$, $cn$). The method is to give a monadic interpretation of the classes of binary relations on sets of size at most $\exp_2(n)$, and then apply Theorem 7.3. We give a brief sketch of the argument.
Let $F$ be any infinite field which is a model of the theory of finite fields and in which one has the coding of finite sets used by Duret [1980]. That is, given any two disjoint finite sets $A, B \subseteq F$, there is an element $w \in F$ such that if $a \in A$ then $a + w$ is a square in $F$, and if $b \in B$ then $b + w$ is not a square in $F$. Construct by iteration formulas $\alpha_n(x, u)$ such that for each $n$ there is a choice of parameters $u$ so that $\alpha_n(x, u)$ is true in $F$ of $\exp_2(n)$ values of $x$. For example, if $F$ has characteristic 0, then $\alpha_n(x)$ can be constructed as in Fischer and Rabin [1974] so that $\alpha_n(x)$ holds in $F$ exactly when $x$ is one of the integers $0, \ldots, \exp_2(n) - 1$. Alternatively, one could use formulas $\alpha_n(x, y)$ asserting that $x$ is an $\exp_2(n)$th root of $y$. Now consider the formulas $\beta_n(x, t, u)$ given by
For an appropriate choice of $t$ in $F$, the mapping [...] is one-to-one on [...]. This together with the coding of finite sets gives every binary relation on sets of size at most $\exp_2(n)$. The coding of finite sets also gives every subset of the universe, so we have the required monadic interpretation.
Problem 10.2. Determine the complexity of the first-order theory of linearly ordered Abelian groups.
This theory was shown to be decidable by Gurevich [1965], with later improvements in Gurevich [1977]. There is a simple interpretation of the first-order theory of linear orders in this theory, so it has a hereditary NTIME(exp(cn)) lower bound. (See Example 8.3.) On the other hand, a primitive recursive decision procedure for the theory was given by Gurevich [1977]. It would be interesting to improve either of these bounds.
Problem 10.3. Determine the complexity of first-order theories of valued fields.
Ax and Kochen [1966] and Ersov [1965] proved the decidability of various first-order theories of valued fields, including some power series fields and the fields of p-adic numbers $Q_p$. Brown [1978] obtained an elementary recursive upper bound for the first-order theory of 'almost all' of the fields $Q_p$, that is, the set of sentences true in $Q_p$ for all but finitely many $p$. Very little is known about lower bounds for this theory or about the other related theories covered by the Ax-Kochen-Ersov work, with the exception
of the theory of p-adic numbers, whose complexity was determined by Egidi [1993].
Problem 10.4. Determine the complexity of the first-order theory of Boolean algebras with several distinguished filters.
Ersov [1964] proved decidability of the first-order theory of Boolean algebras with a distinguished filter. Touraille [1985] presents some results on the elimination of quantifiers for this theory, but does not show decidability. Rabin [1969] showed decidability of the theory of Boolean algebras with quantification over filters by giving an interpretation in the monadic second-order theory of two successors. This gives an upper bound of NTIME($\exp_\infty(dn)$), but nothing is known about lower bounds.
Problem 10.5. Determine the complexity of the first-order theory of the lattice of closed subsets of the Cantor set.
Rabin [1969] proved that this theory is decidable by interpreting it in the monadic second-order theory of two successors. As in the previous problem, this gives an upper bound of NTIME($\exp_\infty(dn)$). It is not known whether this theory is elementary recursive, and no nontrivial lower bounds are known. A more explicit analysis of this theory has been given by Gurevich [1982].
Problem 10.6. Determine the complexity of the first-order theory of the ring of bounded sequences of real numbers.
Cherlin [1980] gave a very explicit and difficult decision procedure for this theory, but its complexity has not been analysed. It should be possible to extract an upper bound from Cherlin's work. While it seems unlikely to us that this theory is elementary recursive, there are no good lower bounds known.
Problem 10.7. Determine the complexity of the theory of pairs of torsion-free Abelian groups, and of the theory of a vector space $V$ with $k$ distinguished subspaces ($k = 1, 2, 3, 4$).
The theory of pairs of torsion-free Abelian groups was proved decidable by Kozlov and Kokorin [1969]; see also Schmitt [1982].
For $k \geq 5$, the theory $T_k$ of vector spaces with $k$ distinguished subspaces (over some fixed field $F$) is undecidable; see Baur [1976] or Slobodskoi and Fridman [1976]. Here the vector space is equipped with $+$ and a family of unary scalar multiplication functions, one for each element of $F$. If $k \leq 4$, however, the theory $T_k$ was shown to be decidable by Baur [1980]. See Prest [1988a] for a discussion of how these theories are related to the representation theory of finite-dimensional algebras over $F$. The theories $T_k$ are stable, and hence the undecidability of $T_5$ could not be proved by the usual means of interpreting arithmetic or finite graphs. There does not seem to be any corresponding a priori impediment to using the methods of this chapter to
obtain lower bounds on the complexity of [...].
Problem 10.8. Determine the complexity of the first-order theory of real closed fields and of the first-order theory of algebraically closed fields.
These theories are, respectively, the first-order theory of the real numbers and the first-order theory of the complex numbers. Good upper and lower bounds are known for these theories, but the gap has not been completely closed. Berman's ATIME($2^{cn}$, $n$) lower bound for real addition is the best bound known for the theory of real closed fields. We discussed this bound in Example 8.20. By the remarks following the example, we have the same lower bound for the theory of algebraically closed fields. The best upper bound at present for the theory of real closed fields is SPACE($2^{dn}$); this was proved by Ben-Or et al. [1986]. This bound holds as well for the theory of algebraically closed fields, since there is a simple interpretation of the complex numbers in the real numbers. For the same reason, any lower bound for the theory of algebraically closed fields would hold for the theory of real closed fields.
Robinson [1959b] showed that if $A$ is the field of real algebraic numbers, then the first-order theory of $(R, +, \cdot, A)$ is also decidable. It would be interesting to know whether this theory has a somewhat higher complexity than the theory of real closed fields.
Problem 10.9. Determine the complexity of the first-order theory of differentially closed fields.
Robinson [1959a] proved the decidability of this theory, but essentially nothing more is known about its complexity. See Wood [1976] for a fuller discussion of this mathematically interesting theory.
Problem 10.10. Is elementary recursiveness of a theory preserved under product and sheaf constructions?
Decidability of first-order theories is preserved by many general constructions, such as products (Feferman and Vaught [1959]) and sheaf constructions (Macintyre [1973]).
Some upper bound results for weak products are presented in Ferrante and Rackoff [1979], where the question is raised whether, for every model $A$ whose first-order theory is elementary recursive, the first-order theory of the weak direct product $A^{\omega}$ is also elementary recursive. The same type of question for other product and sheaf constructions is also open and worth investigating. (See Chapter 5 of Ferrante and Rackoff [1979].)
Problem 10.11. Give nontrivial lower bounds for mathematically interesting theories whose decidability is still open.
Examples include the first-order theories of the field of rational functions over the complex numbers; the real exponential field $(R, +, \cdot, \exp)$;
the field of meromorphic functions; and many others. It may be possible to show that some of these theories are not elementary recursive, just as Semenov [1980] did for the theory of free groups. (See the remarks following Example 8.8.)
Problem 10.12. Is there a 'natural' decidable theory which is not primitive recursive?
Problem 10.13. In [Compton and Henson, 1990], we asked whether there is a 'natural' decidable theory with a lower bound of the form NTIME($\exp_\infty(f(n))$) where $f(n)$ is not linearly bounded.
This was answered affirmatively by Vorobyov [1997; preprint], who showed that a rudimentary version of the theory of finite typed sets investigated by Henkin [1963] and Statman [1979] is decidable, but has a lower bound of [...].
Problem 10.14. Determine the complexities of fragments of theories with given prefix structures.
There has been some interesting work done in this area. (See, for example, Robertson [1974], Reddy and Loveland [1978], Furer [1982], and Scarpellini [1984].) In certain cases the methods of this chapter should give results under these restrictions. This is not likely to be true where iterative interpretations are used, since iterative definitions almost always introduce an unbounded number of alternations of quantifiers. However, where prenex interpretations are used in conjunction with Theorem 6.1 and Corollary 6.1, it seems clear that complexity results for sentences with specific limitations on prefix structure can be obtained.
Other syntactic limitations can also be imposed on decision problems and have been widely studied in the setting of the decidable/undecidable distinction. For example, in algebraic theories, one may pay attention to the degree and number of variables of occurring polynomials.
In decision problems (and, more generally, complexity problems) the refinements mentioned above, especially limitations to a simpler and more intelligible prefix structure, often reflect restriction to mathematically more interesting and significant problems (as has been emphasized to us by G. Kreisel). Thus, the undecidability of Hilbert's tenth problem is far more interesting than the undecidability of arithmetic; the undecidability of the word problem for finitely presented groups is far more interesting than the undecidability of the theory of groups. One can hope for and expect to see a similar kind of increasing maturity in the study of complexity of decidable problems, not only at the level of NP-complete or PSPACE-complete problems (where it already exists to some extent), but also at
higher levels of complexity.
Problem 10.15. Characterize the PSPACE-complete theories.
We noted in section 1 that every theory with a model of power greater than 1 is PSPACE-hard. Thus, the PSPACE-complete theories are, in some sense, the simplest nontrivial theories. A number of different theories have been shown to be PSPACE-complete. (See Stockmeyer [1977], Ferrante and Rackoff [1979], and Grandjean [1983].) It would be interesting to have a model-theoretic characterization of these theories.
Problem 10.16. If one substitutes 'tree' for 'binary relation' in the definitions of $\mathrm{sat}_T(L_0)$ and $\mathrm{sat}_T(ML_0)$, do Theorems 6.1 and 7.1 still hold for [...]?
An affirmative answer would give a generalization of Corollaries 6.5, 7.5, and 7.9.
Problem 10.17. Use the techniques of this chapter to derive a lower bound for the emptiness problem for *-free regular expressions.
The proof that this problem is not elementary recursive is one of the more difficult results in Stockmeyer [1974]. McNaughton and Papert [1971] show that the first-order theory of finite linear orders with an added unary predicate (Example 8.1) can be reduced to this problem, but it is not clear that an elementary recursive reduction can be found. Furer [1978] gives a lower bound of [...] for the emptiness problem for *-free regular expressions. It would be interesting to know if this could be strengthened to a lower bound of [...].
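To make the decision problem in Problem 10.17 concrete, here is a small Python sketch that tests whether a *-free regular expression denotes the empty language. It illustrates the problem only, not the lower-bound techniques of this chapter, and it uses Brzozowski derivatives, a technique swapped in for this example rather than one taken from the works cited above. It assumes that *-free expressions are built from single letters, the empty word, the empty language, union, concatenation and complement. The tuple encoding and all function names are hypothetical.

```python
EMPTY = ('empty',)  # the empty language
EPS = ('eps',)      # the language containing only the empty word


def sym(a):
    return ('sym', a)


def union(*es):
    # Flatten nested unions and drop EMPTY so that equal languages built in
    # different orders get the same representation (needed for termination).
    parts = set()
    for e in es:
        if e[0] == 'union':
            parts |= e[1]
        elif e != EMPTY:
            parts.add(e)
    if not parts:
        return EMPTY
    if len(parts) == 1:
        return next(iter(parts))
    return ('union', frozenset(parts))


def cat(e, f):
    if e == EMPTY or f == EMPTY:
        return EMPTY
    if e == EPS:
        return f
    if f == EPS:
        return e
    return ('cat', e, f)


def neg(e):
    # Complement relative to all words; double complements cancel.
    return e[1] if e[0] == 'neg' else ('neg', e)


def inter(e, f):
    # Intersection via De Morgan, so it stays within the *-free operations.
    return neg(union(neg(e), neg(f)))


def nullable(e):
    """Return True when the language of e contains the empty word."""
    tag = e[0]
    if tag == 'eps':
        return True
    if tag in ('empty', 'sym'):
        return False
    if tag == 'union':
        return any(nullable(x) for x in e[1])
    if tag == 'cat':
        return nullable(e[1]) and nullable(e[2])
    if tag == 'neg':
        return not nullable(e[1])
    raise ValueError(tag)


def deriv(e, a):
    """Return an expression for { w : a followed by w is in the language of e }."""
    tag = e[0]
    if tag in ('empty', 'eps'):
        return EMPTY
    if tag == 'sym':
        return EPS if e[1] == a else EMPTY
    if tag == 'union':
        return union(*(deriv(x, a) for x in e[1]))
    if tag == 'cat':
        left = cat(deriv(e[1], a), e[2])
        return union(left, deriv(e[2], a)) if nullable(e[1]) else left
    if tag == 'neg':
        return neg(deriv(e[1], a))
    raise ValueError(tag)


def is_empty(e, alphabet):
    """Search the derivatives of e; the language is nonempty iff one is nullable."""
    seen, todo = {e}, [e]
    while todo:
        cur = todo.pop()
        if nullable(cur):
            return False
        for a in alphabet:
            d = deriv(cur, a)
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return True


ALL = neg(EMPTY)                       # every word over the alphabet
has_a = cat(cat(ALL, sym('a')), ALL)   # words containing the letter a
print(is_empty(inter(has_a, neg(has_a)), 'ab'))  # True
print(is_empty(has_a, 'ab'))                     # False
```

The search visits derivative expressions one at a time, and the number of distinct derivatives can grow very quickly with the nesting depth of complements, so the sketch is only practical for small expressions.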
Acknowledgements
The authors wish to thank Elsevier Scientific for permission to publish an updated, revised version of the paper [Compton and Henson, 1990]. We would like to thank the referee of Compton and Henson [1990] for his extraordinarily careful reading of this chapter and helpful comments.
References
[Ax, 1968] J. Ax. The elementary theory of finite fields, Ann. Math., 88, 239-271, 1968.
[Ax and Kochen, 1966] J. Ax and S. Kochen. Diophantine problems over local fields III: decidable fields, Ann. Math., 83, 437-456, 1966.
[Barwise, 1977] J. Barwise. On Moschovakis closure ordinals, J. Symbolic Logic, 42, 292-296, 1977.
[Baur, 1976] W. Baur. Undecidability of the theory of abelian groups with a subgroup, Proc. Amer. Math. Soc., 55, 125-128, 1976.
[Baur, 1980] W. Baur. On the elementary theory of quadruples of vector spaces, Ann. Math. Logic, 19, 243-262, 1980.
[Ben-Or et al., 1986] M. Ben-Or, D. Kozen, and J. Reif. The complexity of elementary algebra and geometry, J. Comput. System Sci., 32, 251-264, 1986.
[Berman, 1980] L. Berman. The complexity of logical theories, Theoret. Comput. Sci., 11, 71-77, 1980.
[Borger, 1984a] E. Borger. Decision problems in predicate logic. In Logic Colloquium 1982, G. Lolli, G. Longo, and A. Marcja, eds., pp. 263-301. North-Holland, Amsterdam, 1984.
[Borger, 1984b] E. Borger. Spektralproblem and completeness of logical decision problems. In Logics and Machines: Decision Problems and Complexity, E. Borger, G. Hasenjaeger, and D. Rodding, eds., Lecture Notes in Comput. Sci. 171, pp. 333-356. Springer-Verlag, Berlin, 1984.
[Borger, 1985] E. Borger. Berechenbarkeit, Komplexität, Logik. Vieweg-Verlag, Wiesbaden, 1985.
[Borger et al., 1997] E. Borger, E. Gradel, and Y. Gurevich. The Classical Decision Problem. Springer-Verlag, Berlin, 1997.
[Brown, 1978] S. S. Brown. Bounds on transfer principles for algebraically closed and discretely valued fields, Mem. Amer. Math. Soc., 204, 1978.
[Bruss and Meyer, 1980] A. Bruss and A. Meyer. On time-space classes and their relation to the theory of real addition, Theoret. Comput. Sci., 11, 56-69, 1980.
[Buchi, 1962] J. Buchi. Turing machines and the Entscheidungsproblem, Math. Ann., 148, 201-213, 1962.
[Cegielski, 1996] P. Cegielski. Definability, decidability, complexity, Ann. Math. Artificial Intelligence, 16, 311-341, 1996.
[Cegielski et al., 1996] P. Cegielski, Y. Matiyasevich, and D. Richard. Definability and decidability issues in extensions of the integers with the divisibility predicate, J. Symbolic Logic, 61, 515-540, 1996.
[Chagrov and Zakharyaschev, 1997] A. Chagrov and M. Zakharyaschev. Modal Logic. The Clarendon Press, New York, 1997.
[Chandra et al., 1981] A. Chandra, D. Kozen, and L. Stockmeyer. Alternation, J. Assoc. Comput. Mach., 28, 114-133, 1981.
[Cherlin, 1980] G. Cherlin. Rings of continuous functions: decision problems. In Model Theory of Algebra and Arithmetic, L. Pacholski, J. Wierzejewski, and A. J. Wilkie, eds., Lecture Notes in Math. 834, pp. 44-91. Springer-Verlag, Berlin, 1980.
[Compton and Henson, 1990] K. J. Compton and C. W. Henson. A uniform method for proving lower bounds on the complexity of logical theories, Ann. Pure Appl. Logic, 48, 1-79, 1990.
[Compton et al., 1987] K. J. Compton, C. W. Henson, and S. Shelah. Nonconvergence, undecidability, and intractability in asymptotic problems, Ann. Pure Appl. Logic, 36, 207-224, 1987.
[Duret, 1980] J.-L. Duret. Les corps pseudo-finis ont la propriete d'independence, C. R. Acad. Sci. Paris Ser. A-B, 290, A981-A983, 1980.
[Egidi, 1993] L. Egidi. The complexity of the theory of p-adic numbers. In Proceedings of the 34th Annual Symposium on Foundations of Computer Science, Palo Alto, CA, pp. 412-421. IEEE Computer Society Press, Los Alamitos, CA, 1993.
[Emerson and Halpern, 1985] E. A. Emerson and J. Halpern. Decision procedures and expressiveness in the temporal logic of branching time, J. Comput. System Sci., 30, 1-24, 1985.
[Ersov, 1964] Y. Ersov. Decidability of the elementary theory of relatively complemented distributive lattices and the theory of filters, Algebra i Logika, 3, 17-38, 1964. (In Russian.)
[Ersov, 1965] Y. Ersov. On the elementary theory of maximal normed fields, Soviet Math. Dokl., 6, 1390-1393, 1965. (English translation.)
[Ersov et al., 1965] Y. Ersov, I. A. Lavrov, A. D. Taimanov, and M. A. Taitslin. Elementary theories, Russian Math. Surveys, 20, 35-100, 1965. (English translation.)
[Fagin et al., 1995] R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. MIT Press, Cambridge, MA, 1995.
[Feferman and Vaught, 1959] S. Feferman and R. Vaught. The first-order properties of products of algebraic systems, Fund. Math., 47, 57-103, 1959.
[Ferrante, 1974] J. Ferrante. Some upper and lower bounds on decision procedures in logic, Doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA, 1974.
[Ferrante and Rackoff, 1979] J. Ferrante and C. Rackoff. The Computational Complexity of Logical Theories, Lecture Notes in Math. 718. Springer-Verlag, Berlin, 1979.
[Fischer and Ladner, 1979] M. J. Fischer and R. Ladner. Propositional dynamic logic of regular programs, J. Comput. System Sci., 18, 194-211, 1979.
[Fischer and Rabin, 1974] M. J. Fischer and M. Rabin. Super-exponential complexity of Presburger arithmetic. In Complexity of Computation, R. M. Karp, ed., SIAM-AMS Proc., vol. VII, pp. 27-42. American Mathematical Society, Providence, RI, 1974.
[Fleischmann et al., 1977] K. Fleischmann, B. Mahr, and D. Siefkes. Bounded concatenation theory as a uniform method for proving lower complexity bounds. In Logic Colloquium 1976, R. Gandy and M. Hyland, eds., pp. 471-490. North-Holland, Amsterdam, 1977.
[Fried and Sacerdote, 1976] M. Fried and G. Sacerdote. Solving Diophantine problems over all residue class fields of a number field and all finite fields, Ann. Math., 104, 203-233, 1976.
[Furer, 1978] M. Furer. Nicht-elementare untere Schranken in der Automaten-theorie, Doctoral thesis, ETH, Zurich, 1978.
[Furer, 1982] M. Furer. The complexity of Presburger arithmetic with bounded quantifier alternation, Theoret. Comput. Sci., 18, 105-111, 1982.
[Gradel, 1988] E. Gradel. Subclasses of Presburger arithmetic and the polynomial-time hierarchy, Theoret. Comput. Sci., 56, 289-301, 1988.
[Gradel, 1989] E. Gradel. Dominoes and the complexity of subclasses of logical theories, Ann. Pure Appl. Logic, 43, 1-30, 1989.
[Grandjean, 1983] E. Grandjean. Complexity of the first-order theory of almost all finite structures, Inform. Control, 57, 180-204, 1983.
[Gurevich, 1965] Y. Gurevich. Elementary properties of ordered Abelian groups, Amer. Math. Soc. Transl., 46, 165-192, 1965.
[Gurevich, 1977] Y. Gurevich. Expanded theory of ordered Abelian groups, Ann. Math. Logic, 12, 192-228, 1977.
[Gurevich, 1982] Y. Gurevich. Crumbly spaces. In Proc. 1979 Intl. Cong. Logic, Methodology and Philosophy of Science, pp. 179-191. North-Holland, Amsterdam, 1982.
[Halpern and Moses, 1992] J. Halpern and Y. Moses. A guide to completeness and complexity for modal logics of knowledge and belief, Artificial Intelligence, 54, 319-379, 1992.
[Halpern and Vardi, 1986] J. Halpern and M. Vardi. The complexity of reasoning about knowledge and time. In Proc. 18th Ann. ACM Symp. on Theory of Computing, pp. 304-315. Association for Computing Machinery, New York, 1986.
[Hardy and Wright, 1964] G. H. Hardy and E. M. Wright. Theory of Numbers. Oxford University Press, Oxford, 1964.
[Harel, 1984] D. Harel. Dynamic logic. In Handbook of Philosophical Logic, Vol. II, D. M. Gabbay and F. Guenthner, eds., pp. 497-604. Reidel, Dordrecht, 1984.
[Henkin, 1963] L. Henkin. A theory of propositional types, Fund. Math., 52, 323-344, 1963.
[Hopcroft and Ullman, 1979] J. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, MA, 1979.
[Immerman, 1986] N. Immerman. Relational queries computable in polynomial time, Inform. Control, 68, 86-104, 1986.
[Kaufmann and Shelah, 1983] M. Kaufmann and S. Shelah. On random models of finite power and monadic logic, Discrete Math., 54, 285-293, 1983.
[Korec and Rautenberg, 1975/76] I. Korec and W. Rautenberg. Model interpretability into trees and applications, Arch. Math. Logik Grundlag., 17, 97-104, 1975/76.
[Kozen, 1980] D. Kozen. Complexity of Boolean algebras, Theoret. Comput. Sci., 10, 221-247, 1980.
[Kozen and Tiuryn, 1990] D. Kozen and J. Tiuryn. Logics of programs. In Handbook of Theoretical Computer Science, Vol. B, pp. 789-840. Elsevier, Amsterdam, 1990.
[Kozlov and Kokorin, 1969] G. T. Kozlov and A. I. Kokorin. Elementary theory of Abelian groups without torsion, with a predicate selecting a subgroup, Algebra and Logic, 8, 182-190, 1969. (English translation.)
[Kutty et al., 1995] G. Kutty, L. Moser, P. Melliar-Smith, Y. Ramakrishna, and L. Dillon. Axiomatizations of interval logics, Fund. Inform., 24, 313-331, 1995.
[Lewis, 1980] H. R. Lewis. Complexity results for classes of quantificational formulas, J. Comput. System Sci., 21, 304-315, 1980.
[Lo, 1988] Libo Lo. On the computational complexity of the theory of abelian groups, Ann. Pure Appl. Logic, 37, 205-248, 1988.
[Lynch, 1982] J. F. Lynch. Complexity classes and finite models, Math. Systems Theory, 15, 127-144, 1982.
[Machtey and Young, 1978] M. Machtey and P. Young. Introduction to the General Theory of Algorithms. North-Holland, New York, 1978.
[Macintyre, 1973] A. Macintyre. Model completeness for sheaves of structures, Fund. Math., 81, 73-89, 1973.
[Maurin, 1996] F. Maurin. Exact complexity bounds for ordinal addition, Theoret. Comput. Sci., 165, 247-273, 1996.
[Maurin, 1997a] F. Maurin. Ehrenfeucht games and ordinal addition, Ann. Pure Appl. Logic, 89, 53-73, 1997.
[Maurin, 1997b] F. Maurin. The theory of integer multiplication with order restricted to primes is decidable, J. Symbolic Logic, 62, 123-130, 1997.
[McNaughton and Papert, 1971] R. McNaughton and S. Papert. Counter Free Automata. MIT Press, Cambridge, MA, 1971.
[Meyer, 1974] A. Meyer. The inherent computational complexity of theories of ordered sets. In Proc. 1974 Intl. Cong. of Mathematicians, Vancouver, BC, pp. 477-482, 1974.
[Meyer, 1975] A. Meyer. Weak monadic second order theory of successor is not elementary recursive. In Logic Colloquium (Boston 1972-73), R. Parikh, ed., Lecture Notes in Math. 453, pp. 132-154. Springer-Verlag, Berlin, 1975.
[Michel, 1981] P. Michel. Borne superieure de la complexite de la theorie de N muni de la relation de divisibilite. In Model Theory and Arithmetic (Paris, 1979-1980), C. Berline, M. McAloon, and J.-P. Ressayre, eds., Lecture Notes in Math. 890, pp. 242-250. Springer-Verlag, Berlin, 1981.
[Michel, 1992] P. Michel. Complexity of logical theories involving coprimality, Theoret. Comput. Sci., 106, 221-241, 1992.
[Moschovakis, 1974] Y. Moschovakis. Elementary Induction on Abstract Structures. North-Holland, Amsterdam, 1974.
[Point, 1986] F. Point. Problemes de decidabilite pour les theories de modules, Bull. Soc. Math. Belg. Ser. B, 38, 58-74, 1986.
[Point, 1993] F. Point. Decidability questions for theories of modules. In Logic Colloquium '90 (Helsinki, 1990), J. M. R. Oikkonen and J. Vaananen, eds., Lecture Notes in Logic 2, pp. 266-280. Springer, Berlin, 1993.
[Point and Prest, 1993] F. Point and M. Prest. Decidability for theories of modules, J. London Math. Soc. (2), 38, 193-206, 1988.
[Prest, 1988a] M. Prest. Model theory and representation type of algebras. In Logic Colloquium '86 (Hull, 1986), pp. 219-260. North-Holland, 1988.
[Prest, 1988b] M. Prest. Model Theory and Modules. Cambridge University Press, Cambridge, 1988.
[Rabin, 1965] M. Rabin. A simple method for undecidability proofs and some applications. In Proc. 1964 Intl. Cong. Logic, Methodology and Philosophy of Science, pp. 58-68. North-Holland, Amsterdam, 1964.
[Rabin, 1969] M. Rabin. Decidability of second order theories and automata on infinite trees, Trans. Amer. Math. Soc., 141, 1-35, 1969.
[Rabinovich, 1998] A. Rabinovich. Non-elementary lower bound for propositional duration calculus, Inform. Process. Lett., 66, 7-11, 1998.
[Rackoff, 1975a] C. Rackoff. The computational complexity of some logical theories, Doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA, 1975.
[Rackoff, 1975b] C. Rackoff. The complexity of theories of the monadic predicate calculus, Research Report no. 136, INRIA, Rocquencourt, France, 1975.
[Rackoff, 1976] C. Rackoff. On the complexity of the theories of weak direct powers, J. Symbolic Logic, 41, 561-573, 1976.
[Ramakrishna et al., 1992] Y. Ramakrishna, L. Dillon, L. Moser, P. Melliar-Smith, and G. Kutty. An automata-theoretic decision procedure for future interval logic. In Foundations of Software Technology and Theoretical Computer Science, R. Shyamasundar, ed., Lecture Notes in Comput. Sci. 652, pp. 51-67. Springer-Verlag, Berlin, 1992.
[Ramakrishna et al., 1996] Y. Ramakrishna, P. Melliar-Smith, L. Moser, L. Dillon, and G. Kutty. Interval logics and their decision procedures. I. An interval logic, Theoret. Comput. Sci., 166, 1-47, 1996.
[Reddy and Loveland, 1978] C. R. Reddy and D. Loveland. Presburger arithmetic with bounded quantifier alternation. In Proc. 10th Ann. ACM Symp. on Theory of Computing, pp. 320-325. ACM, New York, 1978.
[Robertson, 1974] E. L. Robertson. Structure of complexity in the weak monadic second-order theories of the natural numbers. In Proc. 6th Ann. ACM Symp. on Theory of Computing, pp. 161-171. ACM, New York, 1974.
[Robinson, 1959a] A. Robinson. On the concept of a differentially closed field, Bull. Res. Council Israel, 8F, 113-128, 1959.
[Robinson, 1959b] A. Robinson. Solution of a problem of Tarski, Fund. Math., 47, 179-204, 1959.
[Scarpellini, 1984] B. Scarpellini. Complexity of subcases of Presburger arithmetic, Trans. Amer. Math. Soc., 284, 203-218, 1984.
[Schmitt, 1982] P. Schmitt. The elementary theory of torsion free Abelian groups with a predicate specifying a subgroup, Z. Math. Logik Grundlagen Math., 28, 323-329, 1982.
[Schoning, 1997] U. Schoning. Complexity of Presburger arithmetic with fixed quantifier dimension, Theory Comput. Syst., 30, 423-428, 1997.
[Seiferas et al., 1978] J. Seiferas, M. Fischer, and A. Meyer. Separating nondeterministic time complexity classes, J. Assoc. Comput. Mach., 25, 146-167, 1978.
[Semenov, 1980] A. L. Semenov. An interpretation of free algebras in free groups, Soviet Math. Dokl., 21, 952-955, 1980. (English translation.)
[Semenov, 1984] A. L. Semenov. Logical theories of one-place functions on the set of natural numbers, Math. USSR-Izv., 22, 587-618, 1984. (English translation.)
[Slobodskoi and Fridman, 1976] A. M. Slobodskoi and E. I. Fridman. Theories of abelian groups with predicates specifying a subgroup, Algebra and Logic, 14, 353-355, 1976. (English translation.)
[Stern, 1988] J. Stern. Equivalence relations on lattices and the complexity of the theory of permutations which commute. In Methods and Applications of Mathematical Logic (Campinas, 1985), W. Carnielli and L. de Alcantara, eds., pp. 229-240. American Mathematical Society, Providence, RI, 1988.
[Statman, 1979] R. Statman. The typed λ-calculus is not elementary recursive, Theoret. Comput. Sci., 9, 73-81, 1979.
[Stirling, 1992] C. Stirling. Modal and temporal logics. In Handbook of Logic in Computer Science, Vol. 2, pp. 477-563. Oxford University Press, Oxford, 1992.
[Stockmeyer, 1974] L. Stockmeyer. The complexity of decision problems in automata and logic, Doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA, 1974.
[Stockmeyer, 1977] L. Stockmeyer. The polynomial-time hierarchy, Theoret. Comput. Sci., 3, 1-22, 1977.
[Stockmeyer, 1987] L. Stockmeyer. Classifying the computational complexity of problems, J. Symbolic Logic, 52, 1-43, 1987.
[Tetruashvili, 1984] M. Tetruashvili. The computational complexity of the theory of abelian groups with a given number of generators. In Frege Conference, 1984 (Schwerin, 1984), pp. 371-375. Akademie-Verlag, Berlin, 1984.
[Touraille, 1985] A. Touraille. Elimination des quantificateurs dans la theorie elementaire des algebres de Boole munies d'une famille d'ideaux distingues, C. R. Acad. Sci. Paris, Ser. I, 300, 125-128, 1985.
[Trakhtenbrot, 1950] B. A. Trakhtenbrot. The impossibility of an algorithm for the decision problem for finite models, Dokl. Akad. Nauk SSSR, 70, 569-572, 1950.
[Turing, 1937] A. M. Turing. On computable numbers with an application to the Entscheidungsproblem, Proc. London Math. Soc. (2), 42, 230-265, 1937. Correction, ibid., 43, 544-546, 1937.
[Vardi, 1997] M. Vardi. Why is modal logic so robustly decidable? In Descriptive Complexity and Finite Models (Princeton, NJ), pp. 149-183. Amer. Math. Soc., Providence, RI, 1997.
[Vaught, 1960] R. Vaught. Sentences true in all constructive models, J. Symbolic Logic, 25, 39-58, 1960.
[Vaught, 1962] R. Vaught. On a theorem of Cobham concerning undecidable theories. In Proc. 1960 Intl. Cong. Logic, Phil. and Methodology of Sci., pp. 14-25. Stanford University Press, Stanford, CA, 1962.
[Volger, 1985] H. Volger. Turing machines with linear alternation, theories of bounded concatenation and the decision problem of first-order theories, Theoret. Comput. Sci., 23, 333-337, 1983.
[Vorobyov, 1997] S. Vorobyov. The 'hardest' natural decidable theory. In Proc. 12th Ann. IEEE Conf. on Logic in Computer Science, pp. 294-305. IEEE Computer Society Press, 1997.
[Vorobyov, preprint] S. Vorobyov. The most nonelementary theory, preprint.
[Wood, 1976] C. Wood. The model theory of differential fields revisited, Israel J. Math., 25, 331-352, 1976.
[Young, 1985] P. Young. Godel theorems, exponential difficulty and undecidability of arithmetic theories: an exposition. In Recursion Theory, A. Nerode and R. Shore, eds., Proc. Symp. Pure Math. 42. American Mathematical Society, Providence, RI, 1985.
Algebraic specification of abstract data types J. Loeckx, H.-D. Ehrich and M. Wolf
Contents

1 Introduction
2 Algebras
  2.1 The basic notions
  2.2 Homomorphisms and isomorphisms
  2.3 Abstract data types
  2.4 Subalgebras
  2.5 Quotient algebras
3 Terms
  3.1 Syntax
  3.2 Semantics
  3.3 Substitutions
  3.4 Properties
4 Generated algebras, term algebras
  4.1 Generated algebras
  4.2 Freely generated algebras
  4.3 Term algebras
  4.4 Quotient term algebras
5 Algebras for different signatures
  5.1 Signature morphisms
  5.2 Reducts
  5.3 Extensions
6 Logic
  6.1 Definition
  6.2 Equational logic
  6.3 Conditional equational logic
  6.4 Predicate logic
7 Models and logical consequences
  7.1 Models
  7.2 Logical consequence
  7.3 Theories
  7.4 Closures
  7.5 Reducts
  7.6 Extensions
8 Calculi
  8.1 Definitions
  8.2 An example
  8.3 Comments
9 Specification
10 Loose specifications
  10.1 Genuinely loose specifications
  10.2 Loose specifications with constructors
  10.3 Loose specifications with free constructors
11 Initial specifications
  11.1 Initial specifications in equational logic
  11.2 Examples
  11.3 Properties
  11.4 Expressive power of initial specifications
  11.5 Proofs
  11.6 Term rewriting systems and proofs
  11.7 Rapid prototyping
  11.8 Initial specifications in conditional equational logic
  11.9 Comments
12 Constructive specifications
13 Specification languages
  13.1 A simple specification language
  13.2 Two further language constructs
  13.3 Adding an environment
  13.4 Flattening
  13.5 Properties and proofs
  13.6 Rapid prototyping
  13.7 Further language constructs
  13.8 Alternative semantics description
14 Modularization and parameterization
  14.1 Modularized abstract data types
  14.2 Atomic module specifications
  14.3 A modularized specification language
  14.4 A parameterized specification language
  14.5 Comments
  14.6 Alternative parameter mechanisms
15 Further topics
  15.1 Behavioural abstraction
  15.2 Implementation
  15.3 Ordered sorts
  15.4 Exceptions
  15.5 Dynamic data types
  15.6 Objects
  15.7 Bibliographic notes
16 The categorical approach
  16.1 Categories
  16.2 Institutions
1 Introduction

It is widely accepted that the quality of software can be improved if its design is systematically based on the principles of modularization and formalization. Modularization consists in replacing a problem by several "smaller" ones. Formalization consists in using a formal language; it obliges the software designer to be precise and, in principle, allows a mechanical treatment.

One may distinguish two modularization techniques for software design. The first technique consists in a modularization on the basis of the control structures. It is used in classical programming languages, where it leads to the notion of a procedure. Moreover, it is used in "imperative" specification languages such as VDM [Woodman and Heal, 1993; Andrews and Ince, 1991], Raise [Raise Development Group, 1995], Z [Spivey, 1989] and B [Abrial, 1996]. The second technique consists in a modularization on the basis of the data structures. While modern programming languages such as Ada [Barstow, 1983] and ML [Paulson, 1991] provide facilities for this modularization technique, its systematic use leads to the notion of abstract data types. This technique is particularly interesting in the design of software for non-numerical problems. Compared with the first technique it is more abstract in the sense that algebras are more abstract than algorithms; in fact, control structures are related to algorithms whereas data structures are related to algebras.

Formalization leads to the use of logic. The logics used are generally variants of equational logic or of first-order predicate logic.

The present chapter is concerned with the specification of abstract data types. The theory of abstract data type specification is not trivial, essentially because the objects considered, viz. algebras, have a more complex structure than, say, integers.
For the sake of clarity the present chapter treats algebras, logics, specification methods ("specification-in-the-small"), specification languages ("specification-in-the-large") and parameterization separately. In order to be accessible to a large number of readers it makes use of set-theoretical notions only. This contrasts with a large number of publications on the subject that make use of category theory [Ehrig and Mahr, 1985; Ehrich et al., 1989; Sannella and Tarlecki, 2001]. Our attention is restricted to those topics that are now standard. Proofs of theorems are omitted. For more details the reader is referred to the literature and, in particular, to the textbook [Loeckx et al., 1996]. This textbook treats the subject along the same lines as the present chapter, uses the same notation and contains the proofs of most of the theorems. A book treating more advanced issues is [Sannella and Tarlecki, 2001]. A comprehensive state-of-the-art report on recent advances in algebraic foundations of systems specification may be found in [Astesiano et al., 1999b]. A broad survey of the topic together with an annotated bibliography is in [Cerioli et al., 1997].

Sections 2 to 8 describe the fundamental specification tools. More precisely, sections 2 to 5 are devoted to many-sorted algebras and sections 6 to 8 to logic. Section 9 introduces the general notion of a specification. Sections 10 to 12 present three specification methods for specification-in-the-small: loose specifications, initial specifications and constructive specifications. Section 13 presents a simple prototypical specification language for specification-in-the-large and discusses the language constructs in some detail. Section 14 shows how specification languages may be generalized for modularization and parameterization. Section 15 briefly discusses some further issues. Finally, section 16 briefly indicates how the notions of a category and of an institution may be used in the study of abstract data type specifications.
2 Algebras

The algebras to be discussed here are essentially the universal algebras discussed by Meinke and Tucker in [Tucker and Meinke, 1992]. In contrast, the present section treats only those aspects that play a major role in specification.

2.1 The basic notions

The first notion introduced is that of a signature. Informally, a signature constitutes the syntax of an algebra by fixing the names and the arities of the operations.

Definition 2.1 (Signature). A signature is a pair $(S, \Omega)$ of sets, the elements of which are called sorts and operations, respectively. Each operation consists of a $(k+2)$-tuple
$$n : s_1 \times \dots \times s_k \to s$$
with $s_1, \dots, s_k, s \in S$ and $k \ge 0$; $n$ is called the operation name of the operation and $s_1 \times \dots \times s_k \to s$ its arity; the sorts $s_1, \dots, s_k$ are called the argument sorts of the operation and the sort $s$ its target sort. If $k = 0$ the operation $n : \to s$ is called a constant of sort $s$.

Note that different operations may have the same operation name. Actually, the equality of two operations implies the equality of their names and the equality of their arities.
The preceding definition differs from Definition 3.1.1 in [Tucker and Meinke, 1992] essentially by explicitly distinguishing between an operation and its name. This avoids difficulties when several operations have the same name.

Example 2.2. The following signature $\Sigma = (S, \Omega)$ is intended for an algebra with Boolean values and natural numbers:

Signatures may be represented graphically. In this representation sorts correspond to labelled nodes and operations to edges pointing from the argument sorts to the target sort. A graphical representation of the signature of Example 2.2 is in Figure 1. While such a representation is easily understandable, it may be ambiguous in that it does not fix the order of the argument sorts. For instance, the graphical representations of operations such as $n : s \times t \to u$ and $n : t \times s \to u$ coincide.

Fig. 1. A graphical representation of the signature of Example 2.2.

Informally, an algebra provides a meaning to a signature.

Definition 2.3 (Algebra). Let $\Sigma = (S, \Omega)$ be a signature. An algebra $A$ for this signature (or a $\Sigma$-algebra) assigns:
• a set $A(s)$ to each sort $s \in S$, called the carrier set of the sort $s$; the elements of a carrier set are called carriers;
• a total function $A(n : s_1 \times \dots \times s_k \to s) : A(s_1) \times \dots \times A(s_k) \to A(s)$ to each operation $n : s_1 \times \dots \times s_k \to s$ of $\Omega$.
It is understood that $A(n : \to s)$ denotes an element of the carrier set $A(s)$.

The class of all $\Sigma$-algebras is denoted $Alg(\Sigma)$. Whenever no ambiguity arises one may write $A(n)$ instead of $A(n : s_1 \times \dots \times s_k \to s)$. Note that many publications, including [Tucker and Meinke, 1992], use the notation $A_s$ and $n_A$ instead of $A(s)$ and $A(n)$, respectively. By the way, $Alg(\Sigma)$ is in general a proper class rather than a set (see [Loeckx et al., 1996] for a proof).

Example 2.4. Let $\Sigma = (S, \Omega)$ be the signature of Example 2.2.
(i) The "classical" algebra for $\Sigma$ is the algebra $A$ defined by:
and similarly for the other operations. Note that, for instance, stands for
(ii) The following "fancy" algebra $B$ is also a $\Sigma$-algebra:
etc.
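Definitions 2.1 and 2.3 can be made concrete in a few lines of Python. The following is only a sketch under stated assumptions: the names `Op`, `CARRIER`, `FUNC` and `respects_arity` are invented for illustration, the sort name `bool` is assumed, and only the operations True, False, 0 and Succ mentioned in Example 4.2 are modelled. Because a carrier set may be infinite, it is represented by a membership test.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Op:
    """An operation n : s1 x ... x sk -> s (Definition 2.1)."""
    name: str
    args: Tuple[str, ...]   # argument sorts s1, ..., sk
    target: str             # target sort s


# Assumed fragment of a signature with Boolean values and natural numbers.
TRUE = Op("True", (), "bool")
FALSE = Op("False", (), "bool")
ZERO = Op("0", (), "nat")
SUCC = Op("Succ", ("nat",), "nat")

# An algebra (Definition 2.3): one carrier set per sort, given here as a
# membership test, and one total function per operation.
CARRIER: Dict[str, Callable[[object], bool]] = {
    "bool": lambda v: isinstance(v, bool),
    "nat": lambda v: isinstance(v, int) and not isinstance(v, bool) and v >= 0,
}
FUNC: Dict[Op, Callable] = {
    TRUE: lambda: True,
    FALSE: lambda: False,
    ZERO: lambda: 0,
    SUCC: lambda n: n + 1,
}


def respects_arity(op: Op, *args) -> bool:
    """Check, for one argument tuple, that FUNC[op] maps carriers to carriers."""
    if len(args) != len(op.args):
        return False
    if not all(CARRIER[s](a) for s, a in zip(op.args, args)):
        return False
    return CARRIER[op.target](FUNC[op](*args))
```

Since an operation is the whole tuple and not just its name, `Op("n", (), "s")` and `Op("n", (), "t")` are different operations with the same operation name, which is the distinction Definition 2.1 insists on.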
While the last example may suggest that a signature allows any algebra provided its functions respect the arities, the following example illustrates the contrary.

Example 2.5. Let $\Sigma = (S, \Omega)$ be the signature with For any $\Sigma$-algebra $A$ the carrier sets $A(s)$ and not empty. In fact,

Further examples of algebras may be found in [Tucker and Meinke, 1992].

It is of course possible to generalize the definition of an algebra by allowing the functions associated with an operation to be partial. Unfortunately, this generalization leads to serious problems with homomorphisms (see section 2.2) as well as with the logic (see section 6) and is therefore not considered further.
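2.2 Homomorphisms and isomorphisms

Informally, homomorphisms are mappings between algebras or, more precisely, mappings between their carrier sets that "respect" their functions. The following definition is "classical". Examples of the different notions and the proofs of the theorems may be found in [Tucker and Meinke, 1992] or any textbook on universal algebra.

Definition 2.6 (Homomorphism). Let $A$, $B$ be two $\Sigma$-algebras, $\Sigma = (S, \Omega)$. A ($\Sigma$-)homomorphism $h : A \to B$ from $A$ to $B$ is a family $(h_s)_{s \in S}$ of functions $h_s : A(s) \to B(s)$ such that for any operation of $\Omega$, say $\omega = (n : s_1 \times \dots \times s_k \to s)$, the following holds:
$$h_s(A(\omega)(a_1, \dots, a_k)) = B(\omega)(h_{s_1}(a_1), \dots, h_{s_k}(a_k)) \qquad (2.1)$$
for all $a_1 \in A(s_1), \dots, a_k \in A(s_k)$. Equation (2.1) is called the homomorphism condition of the operation $\omega$ for the homomorphism $h$.

Theorem 2.7. The composition of two homomorphisms yields a homomorphism. More precisely, if $\Sigma = (S, \Omega)$ is a signature and if $h : A \to B$, $g : B \to C$ are two $\Sigma$-homomorphisms, then $g \circ h = (g_s \circ h_s)_{s \in S}$ constitutes a $\Sigma$-homomorphism too.

Definition 2.8 (Homomorphic image). The homomorphic image of the $\Sigma$-algebra $A$ under the homomorphism $h : A \to B$ is the $\Sigma$-algebra $h(A)$ defined by:
$$h(A)(s) = h_s(A(s)) \quad \text{for each } s \in S,$$
$$h(A)(\omega) = B(\omega)|_{h(A)(s_1) \times \dots \times h(A)(s_k)} \quad \text{for each } \omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega,$$
where $B(\omega)|_{h(A)(s_1) \times \dots \times h(A)(s_k)}$ denotes the restriction of the function $B(\omega)$ to the set $h(A)(s_1) \times \dots \times h(A)(s_k)$.

Definition 2.9 (Isomorphism).
(i) A ($\Sigma$-)isomorphism is a bijective $\Sigma$-homomorphism.
(ii) Let $\Sigma$ be a signature. Two $\Sigma$-algebras $A$ and $B$ are called isomorphic if there exists a $\Sigma$-isomorphism from $A$ to $B$. In that case one writes $A \cong B$.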
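For finite algebras the homomorphism condition of Definition 2.6 can be checked exhaustively. The sketch below is a toy illustration, not part of the chapter: the one-sorted signature with `0` and `Succ`, the two counter algebras and the names `SIG`, `A`, `B`, `parity` and `constant` are all assumptions made for the example. It does not check that `h` maps into the carrier sets of `B`.

```python
from itertools import product

# Signature: operation name -> (argument sorts, target sort).
# Assumption: operation names are unique, so a name identifies an operation.
SIG = {"0": ((), "nat"), "Succ": (("nat",), "nat")}

# Two finite algebras for SIG: counters modulo 4 and counters modulo 2.
A = {"carriers": {"nat": range(4)},
     "ops": {"0": lambda: 0, "Succ": lambda a: (a + 1) % 4}}
B = {"carriers": {"nat": range(2)},
     "ops": {"0": lambda: 0, "Succ": lambda b: (b + 1) % 2}}


def is_homomorphism(h, A, B, sig):
    """Check condition (2.1) for every operation and every argument tuple."""
    for name, (arg_sorts, target) in sig.items():
        for args in product(*(A["carriers"][s] for s in arg_sorts)):
            lhs = h[target](A["ops"][name](*args))
            rhs = B["ops"][name](*(h[s](a) for s, a in zip(arg_sorts, args)))
            if lhs != rhs:
                return False
    return True


parity = {"nat": lambda a: a % 2}    # satisfies (2.1) for 0 and Succ
constant = {"nat": lambda a: 0}      # violates (2.1) for Succ
```

Here `is_homomorphism(parity, A, B, SIG)` holds, whereas the constant family fails because applying `Succ` in `B` to the image of any carrier yields 1 rather than 0.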
Informally, isomorphic algebras are "abstractly" identical in that they differ only by the nature of their carriers. This property is illustrated by the following two theorems, the proofs of which are straightforward.

Theorem 2.10. The relation of isomorphism between algebras is an equivalence relation.

Theorem 2.11. Let $A$, $B$, $C$ be algebras of the same signature. If $A \cong B$ then the following facts hold:
(i) whenever there exists a homomorphism from $B$ to $C$, then there exists a homomorphism from $A$ to $C$;
(ii) whenever there exists a homomorphism from $C$ to $B$, then there exists a homomorphism from $C$ to $A$.

In the following theorem $1_A : A \to A$ and $1_B : B \to B$ stand for the identity homomorphisms in $A$ and $B$, respectively.

Theorem 2.12. A homomorphism $h : A \to B$ is an isomorphism if and only if there exists a homomorphism $g : B \to A$ such that $h \circ g = 1_B$ and $g \circ h = 1_A$.

The following definition plays an important role in section 11.

Definition 2.13 (Initial algebra, final algebra). Let $C \subseteq Alg(\Sigma)$ be a class of $\Sigma$-algebras for a signature $\Sigma$.
(i) An algebra $A \in C$ is called initial in the class $C$ if for each $B \in C$ there exists exactly one homomorphism from $A$ to $B$.
(ii) An algebra $A \in C$ is called final in the class $C$ if for each $B \in C$ there exists exactly one homomorphism from $B$ to $A$.

Examples may be found in section 4.2 and in [Tucker and Meinke, 1992, Example 5.1.15].

Theorem 2.14. Let $C \subseteq Alg(\Sigma)$ be a class of $\Sigma$-algebras and let $A, B \in C$.
(i) Assume $A$ is initial in $C$. Then $B$ is initial in $C$ if and only if $A \cong B$.
(ii) As (i) with "final" instead of "initial".
2.3 Abstract data types

Informally, an abstract data type is a class of algebras closed under isomorphism (cf. [Tucker and Meinke, 1992]). This definition fits the study of specification well because the logics to be used cannot distinguish between isomorphic algebras.

Definition 2.15 (Abstract data type). An abstract data type for a signature $\Sigma$ is a class $C \subseteq Alg(\Sigma)$ satisfying the condition:
$$A \in C \text{ and } A \cong B \text{ imply } B \in C$$
for any pair of $\Sigma$-algebras $A$, $B$. An abstract data type is called monomorphic if all its algebras are isomorphic to each other; otherwise it is called polymorphic.
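2.4 Subalgebras

Informally, a subalgebra is an algebra with some carriers deleted and with the functions restricted accordingly.

Definition 2.16 (Subalgebra). Let $\Sigma = (S, \Omega)$ be a signature and let $A$ and $B$ be $\Sigma$-algebras. The algebra $B$ is called a subalgebra of $A$ if the following two conditions hold:
(i) $B(s) \subseteq A(s)$ for each $s \in S$;
(ii) $B(\omega)(b_1, \dots, b_k) = A(\omega)(b_1, \dots, b_k)$ for each operation $\omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega$ and all $b_1 \in B(s_1), \dots, b_k \in B(s_k)$.

Examples may be found in [Tucker and Meinke, 1992].

Fact 2.17. Let $C$ be a class of $\Sigma$-algebras, i.e. $C \subseteq Alg(\Sigma)$. The relation "is a subalgebra of" constitutes a partial order on $C$.

As a result it makes sense to speak, for instance, of a minimal algebra of a class. Subalgebras may be defined by predicates, as is now shown.

Definition 2.18 (Induced subalgebra). Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $P = (P_s)_{s \in S}$ a family of predicates $P_s$ on $A(s)$, $s \in S$. The subalgebra induced on $A$ by $P$ is the $\Sigma$-algebra $B$ defined by:
$$B(s) = \{ a \in A(s) \mid P_s(a) \} \quad \text{for each } s \in S,$$
$$B(\omega) = A(\omega)|_{B(s_1) \times \dots \times B(s_k)} \quad \text{for each } \omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega.$$
To be consistent the definition requires that the following closure condition holds: for each operation $\omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega$ and all $a_1 \in A(s_1), \dots, a_k \in A(s_k)$,
$$P_{s_1}(a_1), \dots, P_{s_k}(a_k) \text{ imply } P_s(A(\omega)(a_1, \dots, a_k)).$$

A proof of the following theorem may be found in [Tucker and Meinke, 1992].

Theorem 2.19. Let $A$ and $B$ be $\Sigma$-algebras and let $h : A \to B$ be a homomorphism. The homomorphic image $h(A)$ is a subalgebra of $B$.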
2.5 Quotient algebras

Informally, a quotient algebra is an algebra in which some carriers are identified with the help of a "congruence relation". A congruence relation is an equivalence relation which is "compatible" with the functions of an algebra in that equivalent arguments lead to equivalent function values.

Definition 2.20 (Congruence relation). Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $Q = (Q_s)_{s \in S}$ a family of equivalence relations $Q_s$ on $A(s)$, $s \in S$. If $Q$ satisfies the following substitutivity condition: for each operation $\omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega$ and all $a_i, b_i \in A(s_i)$, $1 \le i \le k$,
$$a_1 \, Q_{s_1} \, b_1, \dots, a_k \, Q_{s_k} \, b_k \text{ imply } A(\omega)(a_1, \dots, a_k) \; Q_s \; A(\omega)(b_1, \dots, b_k),$$
then it is called a congruence relation on $A$.

Definition 2.21 (Quotient algebra). Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $Q$ a congruence relation on $A$. The quotient algebra of $A$ by $Q$ is the $\Sigma$-algebra $A/Q$ defined by:
$$(A/Q)(s) = \{ [a] \mid a \in A(s) \} \quad \text{for each } s \in S,$$
$$(A/Q)(\omega)([a_1], \dots, [a_k]) = [A(\omega)(a_1, \dots, a_k)] \quad \text{for each } \omega = (n : s_1 \times \dots \times s_k \to s) \in \Omega,$$
where, for instance, $[a]$ denotes the equivalence class of $a \in A(s)$ with respect to the equivalence relation $Q_s$. Building a quotient algebra is called factorization.

It is not evident that this definition is consistent. In fact, the value of the right-hand side of the second equality in principle depends on the choice of the representative $a_i$ for the equivalence class $[a_i]$ on the left-hand side. The consistency proof as well as examples of quotient algebras may be found in [Tucker and Meinke, 1992].

The following two definitions and the subsequent theorem show that there exists a one-to-one correspondence between homomorphisms and quotients.

Definition 2.22 (Induced congruence relation). Let $\Sigma = (S, \Omega)$ be a signature, let $A$ and $B$ be $\Sigma$-algebras and let $h : A \to B$ be a homomorphism. Furthermore, let $\equiv_h = (\equiv_{h,s})_{s \in S}$ be the family of equivalence relations on $A$ defined by:
$$a \equiv_{h,s} b \text{ if and only if } h_s(a) = h_s(b)$$
for all $a, b \in A(s)$, $s \in S$. Then $\equiv_h$ is called the congruence relation induced by the homomorphism $h$.

Definition 2.23 (Quotient homomorphism). Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $Q$ a congruence relation on $A$. The family $hom_Q : A \to A/Q$ of mappings defined by:
$$hom_Q(a) = [a]$$
for all $a \in A(s)$, $s \in S$, is called the quotient homomorphism of the congruence relation $Q$.

Of course, it has to be proved that the relation of Definition 2.22 is indeed a congruence relation and that the family of functions of Definition 2.23 indeed constitutes a homomorphism. The following fact expresses the "one-to-one correspondence" announced.

Fact 2.24. With the notation of Definitions 2.22 and 2.23 one has: $\equiv_{hom_Q} \; = \; Q$.

The following theorem is called the first homomorphism theorem [Tucker and Meinke, 1992] and plays an important role in section 11.

Theorem 2.25. Let $\Sigma$ be a signature and let $A$ and $B$ be two $\Sigma$-algebras. Furthermore, let $h : A \to B$ be a homomorphism and $\equiv_h$ the congruence relation induced by $h$. If the homomorphism $h$ is surjective, then $A/{\equiv_h} \cong B$.
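The construction behind Definitions 2.21 and 2.22 can be traced on a small finite algebra. The sketch below is an assumed toy example (counters modulo 4 with the operations 0 and Succ, and the parity map as surjective homomorphism onto counters modulo 2); the names `eq_class`, `QUOTIENT_CARRIER`, `succ_quotient` and `zero_quotient` are invented for illustration.

```python
# Assumed toy algebra A: counters modulo 4 with 0 and Succ, and the
# homomorphism h = parity onto counters modulo 2.
CARRIER_A = range(4)


def succ_A(a):
    return (a + 1) % 4


def h(a):
    return a % 2


def eq_class(a):
    """Class [a] under the congruence induced by h: b ~ a iff h(b) = h(a)."""
    return frozenset(b for b in CARRIER_A if h(b) == h(a))


# Carrier set of the quotient algebra: the set of equivalence classes.
QUOTIENT_CARRIER = {eq_class(a) for a in CARRIER_A}


def succ_quotient(cls):
    """(A/Q)(Succ)([a]) = [A(Succ)(a)]; the result must not depend on a."""
    results = {eq_class(succ_A(a)) for a in cls}
    if len(results) != 1:
        raise ValueError("the relation is not a congruence")
    return next(iter(results))


zero_quotient = eq_class(0)
```

The quotient has exactly two carriers, the classes {0, 2} and {1, 3}, and `succ_quotient` swaps them. This mirrors the behaviour of Succ on counters modulo 2, as Theorem 2.25 predicts for a surjective homomorphism.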
3 Terms

If an algebra is viewed as a functional programming language, a program in this language consists of a term. This explains why terms play a prominent role in the specification of abstract data types.

3.1 Syntax

First the notion of a variable has to be made precise. A family $V = (V_s)_{s \in S}$ of pairwise disjoint infinite sets is assumed to be associated with any given signature $\Sigma = (S, \Omega)$. An element of $V_s$ is called a variable of sort $s$, $s \in S$. A family $X = (X_s)_{s \in S}$ with $X_s \subseteq V_s$ is called, by abuse of language, a set of variables for the signature $\Sigma$. It is required that the variables of $V$ and the operation names of $\Omega$ have different denotations.

Informally, $V$ constitutes a "universe" of variables whereas the variables effectively used in a particular application are those of $X$. The existence of the universe $V$ makes it possible to add "at any time" "new" variables (i.e. variables of $V - X$) to $X$, provided $V - X$ is infinite.

Definition 3.1 (Syntax of terms). Let $\Sigma = (S, \Omega)$ be a signature and $X$ a set of variables for it. The sets $T_{\Sigma(X),s}$ constituting the family $T_{\Sigma(X)} = (T_{\Sigma(X),s})_{s \in S}$ are defined by simultaneous induction:
(i) $X_s \subseteq T_{\Sigma(X),s}$ for each $s \in S$;
(ii) if $n : s_1 \times \dots \times s_k \to s$ is an operation of $\Omega$, $k \ge 0$, and $t_i \in T_{\Sigma(X),s_i}$ for $1 \le i \le k$, then $n(t_1, \dots, t_k) \in T_{\Sigma(X),s}$.
An element of $T_{\Sigma(X),s}$ is called a $\Sigma(X)$-term of sort $s$ or, for short, a term. A term without variables is called a ground term. One writes $T_\Sigma$ and $T_{\Sigma,s}$ for the family and the sets of ground terms.

As different operations may have the same operation name, a term may have several sorts. A trivial example is a signature containing the constants $n : \to s$ and $n : \to s'$; in that case the term $n$ has two sorts, viz. $s$ and $s'$. In practice terms with different sorts are a nuisance because they are a potential cause of "programming errors". In fact, the introduction of typing was a milestone in the development of programming languages such as Algol and Pascal. This consideration justifies the following definition.

Definition 3.2 (Strongly typed signature). A signature $\Sigma$ is strongly typed if for any set $X$ of variables for $\Sigma$ each $\Sigma(X)$-term has a single sort.

Clearly, a signature in which all operations have different names is strongly typed. The following theorem provides a less restrictive criterion.

Theorem 3.3. Let $\Sigma = (S, \Omega)$ be a signature. If for any two different operations $n : s_1 \times \dots \times s_k \to s$ and $n : s'_1 \times \dots \times s'_l \to s'$ with the same operation name the following condition holds:
$$k \neq l \text{ or } s_i \neq s'_i \text{ for some } i, \; 1 \le i \le k,$$
then the signature $\Sigma$ is strongly typed.

Informally, the theorem states that operations with the same name must have a different number of arguments or differ in the sorts of at least one argument. Of course, this still excludes signatures with, for instance, constants $n : \to s$ and $n : \to s'$, as does a programming language such as Pascal that uses the constants 1 and 1.0 to distinguish between the sorts integer and real.

From now onwards all signatures will be assumed to be strongly typed.

Infix and mixfix notation allow us to write, for instance, $t_1 < t_2$ and if $b$ then $t_1$ else $t_2$ fi instead of $<(t_1, t_2)$ and ifthenelse$(b, t_1, t_2)$. Priorities allow us to write, for instance, $t_1 + t_2 * t_3$ instead of $(t_1 + (t_2 * t_3))$. While the formal parts of this chapter will stick to Definition 3.1, examples will make use of these notational facilities. For the case of infix and mixfix notation the position of the arguments is classically indicated by underscores in the signature; for instance:
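3.2 Semantics

Variables are given a semantics through "assignments".

Definition 3.4 (Assignment). Let $\Sigma = (S, \Omega)$ be a signature, $X$ a corresponding set of variables and $A$ a $\Sigma$-algebra. An assignment of $X$ for $A$ is a family $\alpha = (\alpha_s)_{s \in S}$ of (total) functions $\alpha_s : X_s \to A(s)$. One writes $\alpha : X \to A$.

The following definition is classical.

Definition 3.5 (Semantics of terms). Let $\Sigma = (S, \Omega)$ be a signature, $X$ a set of variables for $\Sigma$, $t \in T_{\Sigma(X)}$ a term, $A$ a $\Sigma$-algebra and $\alpha : X \to A$ an assignment. The value $A(\alpha)(t)$ of $t$ for $\alpha$ and $A$ is defined by induction on the structure of $t$:
(i) $A(\alpha)(x) = \alpha_s(x)$ if $t = x$ with $x \in X_s$;
(ii) $A(\alpha)(n(t_1, \dots, t_k)) = A(n : s_1 \times \dots \times s_k \to s)(A(\alpha)(t_1), \dots, A(\alpha)(t_k))$ if $t = n(t_1, \dots, t_k)$ with $t_i$ of sort $s_i$, $1 \le i \le k$.

When $t$ is a ground term the value of $t$ does not depend on $\alpha$ and one may write $A(t)$ instead of $A(\alpha)(t)$.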
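The inductive definition of the value of a term translates directly into a recursive function. The sketch below is illustrative only: the tuple encoding of terms, the one-sorted algebra with `0`, `Succ` and `+`, and the names `ALGEBRA` and `value` are assumptions made for the example, and sorts are not checked.

```python
# Terms: a variable is a string; an application n(t1, ..., tk) is the tuple
# (n, t1, ..., tk); a constant n is the one-element tuple (n,).
ALGEBRA = {                       # an assumed interpretation on the naturals
    "0": lambda: 0,
    "Succ": lambda n: n + 1,
    "+": lambda m, n: m + n,
}


def value(term, assignment, algebra=ALGEBRA):
    """A(alpha)(t), by induction on the structure of t (Definition 3.5)."""
    if isinstance(term, str):                 # case t = x
        return assignment[term]
    name, *subterms = term                    # case t = n(t1, ..., tk)
    return algebra[name](*(value(s, assignment, algebra) for s in subterms))


t = ("+", ("Succ", "x"), ("Succ", ("0",)))    # Succ(x) + Succ(0)
ground = ("Succ", ("Succ", ("0",)))           # Succ(Succ(0))
```

With the assignment `{"x": 4}` the term `t` has the value 6. The ground term evaluates to 2 under any assignment, including the empty one, which is why the assignment can be dropped for ground terms.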
3.3 Substitutions

Informally, a substitution is a syntactical operation replacing variables by terms.

Definition 3.6 (Substitution). Let $\Sigma = (S, \Omega)$ be a signature and $X$, $Y$ two (not necessarily distinct) sets of variables for $\Sigma$. A family $\sigma = (\sigma_s)_{s \in S}$ of functions $\sigma_s : X_s \to T_{\Sigma(Y),s}$ is called a substitution $\sigma : X \to T_{\Sigma(Y)}$. The application of this substitution to a term $t \in T_{\Sigma(X)}$ yields a term $t\sigma \in T_{\Sigma(Y)}$ inductively defined by:
(i) $x\sigma = \sigma_s(x)$ if $t = x$ with $x \in X_s$;
(ii) $n(t_1, \dots, t_k)\sigma = n(t_1\sigma, \dots, t_k\sigma)$ if $t = n(t_1, \dots, t_k)$.

Where $\sigma$ is the identity except for a finite number of variables, say $x_1, \dots, x_m$, it is common practice to write: instead of $\sigma$. If $Y$ is empty one speaks of a ground substitution.

Example 3.7. Consider the signature of Example 2.2. Let $X$ be defined by
Then:
3.4 Properties

The following important theorem relates terms to homomorphisms.

Theorem 3.8. Let $A$ and $B$ be two algebras for the same signature $\Sigma$ and let $h : A \to B$ be a $\Sigma$-homomorphism. Then the following facts hold:

The next theorem states that a term keeps its value in a subalgebra.

Theorem 3.9. Let $A$ and $B$ be $\Sigma$-algebras for some signature $\Sigma = (S, \Omega)$. If $B$ is a subalgebra of $A$, then:
$$B(\alpha)(t) = A(\alpha)(t)$$
for each term $t \in T_{\Sigma(X)}$ and each assignment $\alpha : X \to B$.

The following theorem states that in a quotient algebra the value of a term is replaced by the equivalence class of its value in the original algebra.

Theorem 3.10. Let $A$ be a $\Sigma$-algebra for some signature $\Sigma$ and let $Q$ be a congruence relation on $A$. Then:
$$(A/Q)(\beta)(t) = [A(\alpha)(t)]$$
for each term $t \in T_{\Sigma(X)}$ and each assignment $\alpha : X \to A$, where $\beta : X \to A/Q$ is the assignment defined by $\beta(x) = [\alpha(x)]$.

The substitution theorem expresses the semantical effect of a substitution.

Theorem 3.11 (Substitution theorem). Let $X$, $Y$ be sets of variables for a signature $\Sigma = (S, \Omega)$ and let $\sigma : X \to T_{\Sigma(Y)}$ be a substitution. Finally, let $A$ be a $\Sigma$-algebra and $\beta : Y \to A$ an assignment of $Y$ for $A$. Then for each $t \in T_{\Sigma(X)}$:
$$A(\beta)(t\sigma) = A(\alpha)(t)$$
where $\alpha : X \to A$ is the assignment of $X$ for $A$ defined by $\alpha(x) = A(\beta)(\sigma(x))$.
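The substitution theorem can be tried out on a small instance. The sketch below is an illustration under assumptions: terms are encoded as nested tuples with strings as variables, the algebra is an assumed interpretation of `0`, `Succ` and `+` on the naturals, and all names (`value`, `substitute`, `sigma`, `alpha`, `beta`) are chosen for the example.

```python
ALGEBRA = {"0": lambda: 0, "Succ": lambda n: n + 1, "+": lambda m, n: m + n}


def value(term, assignment):
    """Value of a term under an assignment (variables are strings)."""
    if isinstance(term, str):
        return assignment[term]
    name, *subterms = term
    return ALGEBRA[name](*(value(s, assignment) for s in subterms))


def substitute(term, sigma):
    """t sigma: replace each variable x of t by the term sigma[x]."""
    if isinstance(term, str):
        return sigma[term]
    name, *subterms = term
    return (name, *(substitute(s, sigma) for s in subterms))


t = ("+", "x", ("Succ", "y"))                        # x + Succ(y)
sigma = {"x": ("Succ", "z"), "y": ("+", "z", "z")}   # substitution X -> T(Y)
beta = {"z": 3}                                      # assignment of Y
# The assignment of X derived from sigma and beta: alpha(x) = A(beta)(sigma(x)).
alpha = {x: value(sigma[x], beta) for x in sigma}
```

Evaluating the substituted term under `beta` and evaluating the original term under `alpha` give the same number, which is the equality the theorem asserts for this instance.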
4 Generated algebras, term algebras

4.1 Generated algebras

A carrier of an algebra can be accessed by a user only through a ground term. A carrier that fails to be the value of a ground term is therefore called junk. Generated algebras capture the notion of an algebra without junk. As an additional important feature, generated algebras allow proofs by induction.

Definition 4.1 (Generated algebra). Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $\Omega_c \subseteq \Omega$ a set of operations called constructors. The algebra $A$ is said to be:
(i) generated by the set $\Omega_c$ (or sufficiently complete with respect to $\Omega_c$), if for each carrier of $A$, say $a \in A(s)$, $s \in S$, there exists at least one ground term $t \in T_{(S,\Omega_c),s}$ with $a = A(t)$;
(ii) generated (or reachable), if it is generated by $\Omega$.
The class of all generated $\Sigma$-algebras is denoted $Gen(\Sigma)$.

Example 4.2. The algebra $A$ of Example 2.4 is generated by {True, False, 0, Succ}. The algebra $C$ identical with $A$ except for $C(nat) = \mathbb{Z}$ (the set of integers) is not generated because no ground term has a negative value.

As a generated algebra contains no junk, it is "minimal". This is made precise by the following theorem.

Theorem 4.3. A $\Sigma$-algebra is generated if and only if it is minimal in $Alg(\Sigma)$ with respect to the partial order "is a subalgebra of".

While the above theorem is mainly of theoretical interest, the following properties play an important role in the theory of specification.

Theorem 4.4. Let $h : A \to B$ be a $\Sigma$-homomorphism for some signature $\Sigma$.
(i) If $B$ is a generated algebra, then $h$ is surjective.
(ii) If $A$ is a generated algebra, then $h$ is the only homomorphism from $A$ to $B$.

Theorem 4.5. Let $A$ and $B$ be $\Sigma$-algebras for some signature $\Sigma$.
(i) If $A$ is generated and $A \cong B$, then $B$ is also generated.
(ii) If $A$ and $B$ are generated and there exist a homomorphism from $A$ to $B$ and a homomorphism from $B$ to $A$, then $A \cong B$.

Theorem 4.6. Let $A$, $B$ be generated $\Sigma$-algebras for some signature $\Sigma = (S, \Omega)$ such that $B(s) \subseteq T_{\Sigma,s}$ for each $s \in S$. If
(i) there exists a homomorphism $h : A \to B$ and
(ii) $A(B(t)) = A(t)$ for all ground terms $t \in T_\Sigma$,
then $A \cong B$.

The proofs of these theorems are not very difficult (see, for example, [Loeckx et al., 1996]).

During specification one may leave the definition of some sorts pending. Of course, the algebras thus specified fail to be generated because of the "pending" sorts.
This suggests a generalization of the concept of a generated algebra which consists in disregarding some sorts. While this idea sounds simple, its realization is less trivial; the reason lies in the fact that the syntax of terms has been defined by simultaneous induction on all sorts.

Definition 4.7. Let $\Sigma = (S, \Omega)$ be a signature, $A$ a $\Sigma$-algebra and $S_c \subseteq S$ a set of sorts. The following notation is introduced:
(i) $\Sigma_c^A = (S, \Omega_c^A)$ is the signature with $\Omega_c^A = \Omega \cup \{ c_a : \to s \mid s \in S - S_c, \; a \in A(s) \}$, where each $c_a : \to s$ is a "new" constant;
(ii) $A_c$ is the $\Sigma_c^A$-algebra defined by: $A_c(s) = A(s)$ for each $s \in S$, $A_c(\omega) = A(\omega)$ for each $\omega \in \Omega$, and $A_c(c_a) = a$ for each new constant $c_a$.
Informally, $\Sigma_c^A$ contains additional constants but $A_c$ "remains unchanged".

Definition 4.1 may now be generalized.

Definition 4.8 (Algebras generated in some sorts). Let $\Sigma = (S, \Omega)$, $S_c$, $\Sigma_c^A$ and $A_c$ be as in Definition 4.7. Finally, let $\Omega_c \subseteq \Omega$ be a set of operations with their target sort in $S_c$. The $\Sigma$-algebra $A$ is said to be:
• generated in the sorts of $S_c$ by the set $\Omega_c$ of constructors if the $\Sigma_c^A$-algebra $A_c$ is generated by the set $\Omega_c \cup \{ c_a : \to s \mid s \in S - S_c, \; a \in A(s) \}$;
• generated in the sorts of $S_c$ if it is generated in the sorts of $S_c$ by the operations of $\Omega$ with their target sort in $S_c$.

Example 4.9. Let $\Sigma = (S, \Omega)$ be the signature with
$$S = \{ el, list \}, \qquad \Omega = \{ [\,] : \to list, \; Add : el \times list \to list \}$$
and $A$ a $\Sigma$-algebra with $A(el)$ = a not further defined set of "elements", $A(list)$ = the set of finite lists of elements, $A([\,])$ = the empty list, and where $A(Add)$ adds an element in front of the list. Let $S_c = \{ list \}$. Then $A_c$ is defined like $A$ with additionally $A_c(c_e) = e$ for each $e \in A(el)$. Clearly, $A_c$ is generated by $\{ [\,], Add \} \cup \{ c_e : \to el \mid e \in A(el) \}$. For instance, the list consisting of a single element, say $e$, is represented by the term $Add(c_e, [\,])$. Hence, the $\Sigma$-algebra $A$ is generated in the sort list by $\{ [\,], Add \}$.

Note that $A$ is not generated in the sense of Definition 4.1, because in the signature $\Sigma$ even a list consisting of a single element is not representable by a ground term. It should be clear that a generated algebra (Definition 4.1) is simply an algebra generated in all sorts (Definition 4.8).

When an algebra is generated in some sorts, properties of the carrier sets of these sorts may be proved by induction. For instance, to prove that a property $P$ holds for all carriers of the carrier set $A(list)$ of Example 4.9, it is sufficient to prove:
(i) $P(A([\,]))$;
(ii) $P(l)$ implies $P(A(Add)(e, l))$ for all $e \in A(el)$ and all $l \in A(list)$.
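For a finite one-sorted algebra, the property of being generated (Definition 4.1) can be decided by computing the values of all ground terms as a least fixed point. The sketch below is an assumed toy setting: the counter algebras and the names `reachable`, `MOD4` and `WITH_JUNK` are invented for illustration, and the restriction to one sort is a simplification.

```python
from itertools import product


def reachable(ops):
    """Values of all ground terms of a finite one-sorted algebra.

    `ops` maps an operation name to (number of arguments, function).
    """
    reached = set()
    while True:
        new = {f(*args)
               for k, f in ops.values()
               for args in product(reached, repeat=k)}
        if new <= reached:          # nothing new: fixed point reached
            return reached
        reached |= new


# Counters modulo 4 with the operations 0 and Succ.
MOD4 = {"0": (0, lambda: 0), "Succ": (1, lambda a: (a + 1) % 4)}
# The same operations on the carrier set {0, 1, 2, 3, "junk"}:
# Succ maps the extra carrier "junk" to itself.
WITH_JUNK = {"0": (0, lambda: 0),
             "Succ": (1, lambda a: a if a == "junk" else (a + 1) % 4)}
```

An algebra is generated exactly when the reachable set equals its carrier set. This is the case for `MOD4` with carrier set {0, 1, 2, 3}, but not for the second algebra, whose carrier "junk" is not the value of any ground term.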
4.2 Freely generated algebras

Definition 4.10 (Freely generated algebra). A freely generated algebra is defined as a generated algebra in Definition 4.1 but with the condition "there exists at least one ground term" replaced by "there exists exactly one ground term".

Example 4.11. The algebra $A$ of Example 2.4 is freely generated by

The following theorem is relatively easy to prove:

Theorem 4.12. A freely generated $\Sigma$-algebra is initial in the class $Alg(\Sigma)$ of all $\Sigma$-algebras.

As a direct consequence freely generated algebras are isomorphic. Moreover, there exists exactly one homomorphism from a freely generated $\Sigma$-algebra to any other $\Sigma$-algebra. By the way, Theorem 4.14 will show that there actually exists a freely generated algebra for each signature.

The definition of an algebra freely generated in some sorts may be deduced from Definition 4.8 in a similar way to the above. The algebra $A$ of Example 4.9 is freely generated by $\{ [\,], Add \}$ in the sort list.

While generated algebras allow inductive proofs, freely generated algebras also allow inductive definitions of functions. For instance, to define a function $f : A(s) \to \dots$ it is sufficient to define the value $f(A(t))$ for all ground terms $t$ of sort $s$. The reason is that (syntactically) different ground terms have (semantically) different values; this guarantees that the function definition is consistent because the function value for given arguments is defined exactly once (see, for example, [Loeckx and Sieber, 1987] for more details).
4.3 Term algebras

With each signature is associated an algebra called a "term algebra", the carriers of which consist of ground terms.

Definition 4.13 (Term algebra). The term algebra for the signature $\Sigma = (S, \Omega)$ is the $\Sigma$-algebra $T(\Sigma)$ defined by:
$$T(\Sigma)(s) = T_{\Sigma,s} \quad \text{for each } s \in S,$$
$$T(\Sigma)(n : s_1 \times \dots \times s_k \to s)(t_1, \dots, t_k) = n(t_1, \dots, t_k) \quad \text{for each operation of } \Omega \text{ and all } t_i \in T_{\Sigma,s_i}.$$

At first sight this definition may be confusing. The reason is that ground terms now play a dual role: as elements of the set $T_\Sigma$ they are syntactical objects; as carriers of the algebra $T(\Sigma)$ they are semantical objects. Actually, this dual role is what makes term algebras so interesting.

Theorem 4.14. A term algebra is freely generated.

By Theorem 4.12 there exists a unique homomorphism $h : T(\Sigma) \to A$ for any $\Sigma$-algebra $A$. This homomorphism is called the evaluation homomorphism of $A$. Note that $h(t) = A(t)$ for any ground term $t$.
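The dual role of ground terms and the evaluation homomorphism can be illustrated as follows. This is a sketch under assumptions: the one-sorted signature with `0`, `Succ` and a binary `+`, the interpretation on the naturals, and the names `term_op`, `TERM_ALGEBRA`, `CLASSICAL` and `evaluate` are chosen for the example.

```python
# Ground terms as nested tuples: ("0",), ("Succ", t), ("+", t1, t2).
def term_op(name):
    """T(Sigma)(n): maps the terms t1, ..., tk to the term n(t1, ..., tk)."""
    return lambda *subterms: (name, *subterms)


TERM_ALGEBRA = {name: term_op(name) for name in ("0", "Succ", "+")}
CLASSICAL = {"0": lambda: 0, "Succ": lambda n: n + 1, "+": lambda m, n: m + n}


def evaluate(term, algebra=CLASSICAL):
    """Evaluation homomorphism h : T(Sigma) -> A, with h(t) = A(t)."""
    name, *subterms = term
    return algebra[name](*(evaluate(s, algebra) for s in subterms))


zero = TERM_ALGEBRA["0"]()
one = TERM_ALGEBRA["Succ"](zero)
t1 = TERM_ALGEBRA["+"](zero, one)                        # 0 + Succ(0)
t2 = TERM_ALGEBRA["+"](
    TERM_ALGEBRA["Succ"](TERM_ALGEBRA["+"](zero, zero)), zero)  # Succ(0 + 0) + 0
```

In the term algebra `one`, `t1` and `t2` are three different carriers, yet `evaluate` sends all of them to 1. Applying `Succ` in the term algebra and then evaluating gives the same result as evaluating first and then applying `Succ` in the target algebra, which is the homomorphism condition for this operation.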
Example 4.15.
(ii) Let $A$ be the classical $\Sigma$-algebra with $A(nat) = \mathbb{N}$
The evaluation homomorphism $h$ is defined by: Note that in this case the evaluation homomorphism is an isomorphism.

The main property of term algebras is the following.

Theorem 4.16. Let $A$ be a generated $\Sigma$-algebra for some signature $\Sigma$ and let $h : T(\Sigma) \to A$ be its evaluation homomorphism. Then $A \cong T(\Sigma)/{\equiv_h}$ (where $\equiv_h$ is the congruence relation induced by $h$ according to Definition 2.22).

Example 4.17. Let $\Sigma$ be as in Example 4.15 but with the additional operation: $\_ + \_ : nat \times nat \to nat$. The set $T(\Sigma)(nat)$ contains additional carriers such as $0 + Succ(0)$ and $Succ(0 + 0) + 0$. Let $A$ again be the "classical" algebra, with $A(+)$ being addition, and let $h$ be its evaluation homomorphism. Then the carrier

4.4 Quotient term algebras
Informally, a quotient term algebra is a quotient of the term algebra that possesses the properties common to a given class of algebras. Such an algebra may be viewed as a generalization of the algebra $T(\Sigma)/{\equiv_h}$ introduced above. Quotient term algebras play a major role in section 11.

Definition 4.18 (Quotient term algebra). Let $C \subseteq Alg(\Sigma)$ be a non-empty class of $\Sigma$-algebras for some signature $\Sigma = (S, \Omega)$. The quotient term algebra of $C$ is the $\Sigma$-algebra $T(\Sigma, C)$ defined by:
$$T(\Sigma, C) = T(\Sigma)/{\equiv_C}$$
where $t \equiv_C t'$ if and only if $A(t) = A(t')$ for all $A \in C$, for ground terms $t$, $t'$ of the same sort.

This definition is consistent because the relation $\equiv_C$ may be proved to be a congruence relation on $T(\Sigma)$. The following theorem shows the interest of quotient term algebras.

Theorem 4.19. Let $C \subseteq Alg(\Sigma)$ be a class of $\Sigma$-algebras for some signature $\Sigma$. For each algebra $A \in C$ there exists a unique homomorphism from $T(\Sigma, C)$ to $A$.

Hence, $T(\Sigma, C)$ has properties similar to those of an initial algebra. It is indeed an initial algebra if it belongs to $C$. More precisely:

Corollary 4.20. Let $C \subseteq Alg(\Sigma)$ be a class of $\Sigma$-algebras. If $T(\Sigma, C) \in C$, then it is initial in $C$.
5
Algebras for different signatures
The study of algebras above was for a given, fixed signature. The relation between algebras with different signatures is investigated in the present section.
5.1
Signature morphisms
Informally, a signature morphism is a mapping of signatures that respects the arity of the operations.
Definition 5.1 (Signature morphism). Let Σ = (S, Ω) and Σ' = (S', Ω') be two signatures.
J. Loeckx, H.-D. Ehrich and M. Wolf such that, for each operation exists an operation name m such that
(ii) A renaming is a bijective signature morphism, i.e. a signature morphism μ = (μS, μΩ) with μS and μΩ bijective. If no ambiguities arise one may write μ instead of μS and μΩ.
Example 5.2.
a possible signature morphism One then speaks of an inclusion and writes
A possible signature morphism
defined by
To extend signature morphisms to terms it is first necessary to extend them to variables. To simplify the definition it is customary to make the following assumption. Let μ : Σ → Σ' be a signature morphism where the associated with Σ and Σ', respectively (see section 3.1). It is then assumed
Definition 5.3 (Extension of signature morphisms to variables). Let μ : Σ → Σ' be a signature morphism with Σ = (S, Ω), Σ' = (S', Ω'). If X is a set of variables for Σ, then μ(X) is defined to be the following set of variables for Σ':
Informally, μ(X) contains the same variables as X but the sort of a variable may be different. Incidentally, thanks to the above assumption μ(X) is indeed a set of variables for Σ'. In the absence of this assumption, Definition 5.3 would additionally have had to rename the variables.
Definition 5.4 (Extension of signature morphisms to terms). Let Σ = (S, Ω) and Σ' be signatures, X a set of variables for Σ and μ : Σ → Σ' a signature morphism. For any term t ∈ TΣ(X) the term μ(t) ∈ TΣ'(μ(X)) is defined inductively as follows:
Example 5.5. Let μ be the signature morphism of Example 5.2(ii). Then:
Note that in the left-hand side of the equation the variables x and y are of sorts el1 and el2, respectively; in the right-hand side they are both of sort nat.
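The inductive extension of a signature morphism to terms can be illustrated by a small Python sketch. The morphism used here is hypothetical: the sort names el1, el2, pair, pairnat and the operation names are assumptions chosen to echo Example 5.5, not a transcription of Example 5.2(ii). Variables keep their names and only change sort; operations are renamed recursively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    sort: str


@dataclass(frozen=True)
class App:
    op: str
    args: tuple = ()


# Hypothetical signature morphism (illustration only): sort map and operation map.
SORT_MAP = {"el1": "nat", "el2": "nat", "pair": "pairnat"}
OP_MAP = {"Pair": "Pair", "First": "First", "Second": "Second"}


def translate(term):
    """Apply the morphism to a term by structural induction."""
    if isinstance(term, Var):
        # A variable keeps its name; only its sort is mapped.
        return Var(term.name, SORT_MAP[term.sort])
    # An application maps the operation and translates the arguments.
    return App(OP_MAP[term.op], tuple(translate(arg) for arg in term.args))


x, y = Var("x", "el1"), Var("y", "el2")
source_term = App("First", (App("Pair", (x, y)),))
target_term = translate(source_term)
print(target_term)
```

In the translated term both variables are of sort nat, which is the effect described in the note above.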
5.2 Reducts
Reducts constitute a semantical counterpart of (the syntactical notion of) signature morphisms. The difference is that the "mapping" goes from Σ'-algebras to Σ-algebras.
Definition 5.6 (Reduct). Let μ : Σ → Σ' be a signature morphism with Σ = (S, Ω) and let A' be a Σ'-algebra. The μ-reduct of A' is the Σ-algebra A'|μ defined by:
If μ is an inclusion, one may write A'|Σ instead of A'|μ. Informally, A'|μ is a Σ-algebra with the semantics of A'. In the case of an inclusion the effect of the μ-reduct is simply to "forget" the sorts and operations from Σ' − Σ. Incidentally, it is important not to confuse the notions of a subalgebra and of a reduct: a subalgebra is an algebra for the same signature whereas a (non-trivial) reduct is an algebra for a different signature.
Example 5.7. Resuming Example 5.2(ii), let A' be the "classical" Σ'-algebra, i.e. A'(pairnat) = N0 × N0, A'(nat) = N0, A'(Pair)(m, n) = (m, n), A'(First)((m, n)) = m and A'(Second)((m, n)) = n for all m, n ∈ N0. The reduct A'|μ is defined by:
Definition 5.8 (Reduct of a homomorphism). Let Σ, Σ' be signatures and μ : Σ → Σ' a signature morphism. Let A', B' be Σ'-algebras and h' : A' → B' a Σ'-homomorphism. The μ-reduct of h' is the Σ-homomorphism
The following reduct theorem bears strong similarities to Theorem 3.11.
Theorem 5.9 (Reduct theorem). Let Σ = (S, Ω), Σ' be signatures and μ : Σ → Σ' a signature morphism. Let X be a set of variables for Σ and t a term from TΣ(X). Finally, let A' be a Σ'-algebra and a' : μ(X) → A' be an assignment for A'. Then:
Note that if t is a ground term the theorem yields
5.3 Extensions
Contrasting with a reduct, an extension is a "mapping" in the same "direction" as a signature morphism, i.e. from Alg(Σ) to Alg(Σ').
Definition 5.10 (Extension). Let μ : Σ → Σ' be a signature morphism.
When μ is an inclusion one speaks of an extension instead of a μ-extension. Clearly, an algebra possesses in general several extensions. Note that the notion of a μ-extension of a class is very "liberal" as it only requires that does not contain "additional" algebras. Note also that the term "extension" refers to the signature, not to the "size" of the class.
6
Logic
Most books on mathematical logic concentrate on first-order predicate logic. In contrast, specification makes use of a variety of logics such as equational logic, conditional equational logic, first-order predicate logic or modal logic. Moreover, depending on the application, the semantical domains may vary from the class of all algebras to the class of algebras generated in some sorts. Hence, it is appropriate to introduce a general notion of logic encompassing these different logics as particular cases. Such a general notion is introduced in the present section. A still more general notion called "institution" is based on categories and will be briefly discussed in section 16.1. The present section together with section 7 presents a satisfaction-based view of logic; section 8 will present a proof-based view of logic (cf. [Ryan and Sadler, 1992]).
6.1 Definition
Definition 6.1 (Logic). An algebra logic (logic for short) L consists of:
(i) a decidable set L(Σ) for each signature Σ; an element of L(Σ) is called a (Σ-)formula;
(ii) a function μL : L(Σ) → L(Σ') for each signature morphism μ : Σ → Σ'; this function μL is called the formula morphism;
(iii) a relation ⊨Σ ⊆ Alg(Σ) × L(Σ) for each signature Σ; this relation is called the satisfaction relation (for Σ); if A ⊨Σ φ for some Σ-algebra A and some formula φ ∈ L(Σ), one says that φ is valid in A or that A satisfies φ.
It is required that the following two conditions are satisfied:
(i) (Isomorphism condition.) For any signature Σ, for any formula φ ∈ L(Σ) and for any Σ-algebras A, B with A ≅ B, A ⊨Σ φ if and only if B ⊨Σ φ.
(ii) (Satisfaction condition.) For any signatures Σ, Σ', for any signature morphism μ : Σ → Σ', for any formula φ ∈ L(Σ) and for any Σ'-algebra A', A'|μ ⊨Σ φ if and only if A' ⊨Σ' μL(φ).
One may write μ instead of μL, if no confusion arises. The following definition defines an important subclass of logics.
Definition 6.2 (Logic with equality). A logic L is called a logic with equality if, for any signature Σ, any sort s of this signature and any ground terms t, u ∈ TΣ,s, there exists a formula, say φ, such that A ⊨Σ φ if and only if A(t) = A(u), for all Σ-algebras A.
Sections 6.2 to 6.4 now present three different "instances" of this general logic. Each of these instances constitutes a logic with equality — as the reader may easily verify.
6.2
Equational logic
Definition 6.3 (Equational logic). According to Definition 6.1, the equational logic EL is defined as follows: • for each signature Σ,
A formula from EL(Σ) is also called an equation. An equation ∀∅.t = u is called a ground equation. It is important not to confuse the symbol "=" of an equation with the operation name "=" of an operation such as As a fundamental difference the symbol "=" of an equation is a logical symbol (i.e. a "metasymbol") with a fixed semantics. One may write t = u instead of ∀X.t = u if each variable of X has at least one occurrence in t or u.
Theorem 6.4. Equational logic satisfies the isomorphism and satisfaction conditions of Definition 6.1.
Example 6.5. (i) Let Σ = (S, Ω) be the signature with S = {bool, nat} and Let, moreover, X be a set of variables for Σ and x, y ∈ Xnat, b ∈ Xbool. Examples of Σ-equations are:
Note that, for instance, x + 1 = 1 + x stands for ∀{x}.x + 1 = 1 + x. (ii) Let Σ = (S, Ω) be the signature with S = {s, t} and Let X be a set of variables for Σ and Finally, let A be a Σ-algebra with A(s) = ∅ and A(a) ≠ A(b). Clearly, A ⊨ a = b (which stands for A ⊨ ∀∅.a = b) does not hold. On
the other hand A ⊨ ∀{x}.a = b does hold because there exists no assignment a : {x} → A due to the fact that A(s) = ∅.
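The vacuous validity in Example 6.5(ii) can be reproduced with a brute-force check over all assignments. The sketch below is only an illustration under assumptions: the constants a and b are taken to be of sort t, the variable x of sort s, and the carrier A(t) = {0, 1} is an arbitrary choice. An equation holds if both sides agree under every assignment; with an empty carrier for s there is no assignment for x at all.

```python
from itertools import product


def holds(carriers, variables, lhs, rhs):
    """Check a universally quantified equation in a finite algebra.

    variables: list of (name, sort) pairs; lhs and rhs map an assignment
    (a dict from variable names to values) to a value.
    """
    names = [name for name, _ in variables]
    domains = [sorted(carriers[sort]) for _, sort in variables]
    return all(
        lhs(dict(zip(names, values))) == rhs(dict(zip(names, values)))
        for values in product(*domains)
    )


# Assumed algebra: A(s) is empty, A(t) = {0, 1}, A(a) = 0, A(b) = 1.
carriers = {"s": set(), "t": {0, 1}}
a = lambda assignment: 0
b = lambda assignment: 1

ground = holds(carriers, [], a, b)                # the equation over the empty variable set
quantified = holds(carriers, [("x", "s")], a, b)  # the equation over {x}
print(ground, quantified)
```

The ground equation fails because the single empty assignment distinguishes a from b, whereas the quantified equation holds because the product over an empty domain yields no assignment to test.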
6.3
Conditional equational logic
This logic is a mild generalization of equational logic. Its interest stems from the fact that in the frame of specification it is easier to use than equational logic. Definition 6.6 (Conditional equational logic). The conditional equational logic is the logic CEL defined as follows:
The theorem, the remarks and the notational convention of Section 6.2 may be generalized for conditional equational logic. Example 6.7. Let E be the signature of Example 6.5(i). Examples of conditional equations are:
In the same vein a logic PL called predicate logic can be defined by generalizing classical first-order predicate logic to many-sorted algebras. For more transparency the set PL(Σ), the formula morphisms μPL and the satisfaction relation ⊨Σ are defined separately.
Definition 6.8 (The set PL(Σ)). For each signature Σ the set PL(Σ) of formulas is defined inductively:
As usual it is possible to introduce additional logical symbols such as ∨, ⊃, ≡, ∃ as abbreviations, and to reduce the number of parentheses in formulas by assigning priorities to the logical symbols. Moreover, one may write ∀x.φ instead of ∀x: s.φ when the sort s of x is known.
Definition 6.9 (The formula morphisms μPL). For each signature morphism μ : Σ → Σ' the corresponding formula morphism μPL is defined by:
The definition of the satisfaction relation requires two additional notions introduced in the following definitions.
Definition 6.10 (Free occurrence of a variable). Let Σ be a signature and φ a formula of predicate logic. The set free(φ) of the variables occurring free in φ is defined by:
Definition 6.11 (Value of a formula). Let Σ be a signature, φ a formula of predicate logic, X a set of variables for Σ with free(φ) ⊆ X and a : X → A an assignment. The value of φ for a and A is a truth value from {true, false} and is defined by:
The following definition concludes the definition of the logic PL.
Definition 6.12 (The satisfaction relation of PL). Let S be a signature. The satisfaction relation of first-order predicate logic is defined by: A (=s V O- (A(a)(
Logical consequence
Definition 7.7 (Logical consequence). Let L be a logic, Σ a signature, φ ∈ L(Σ) a formula and Φ ⊆ L(Σ) a set of formulas. Furthermore, let U be a domain for Σ.
(i) The formula φ is called a logical consequence of Φ in U if A ⊨Σ φ for each A ∈ U with A ⊨Σ Φ; one then writes Φ ⊨U,Σ φ. If U = Alg(Σ) one speaks of a logical consequence of Φ and writes Φ ⊨Σ φ. If U = Gen(Σ) one speaks of an inductive consequence of Φ and writes Φ ⊨Ind,Σ φ.
(ii) The formula φ is said to be valid in U if it is a logical consequence
of the empty set of formulas in U; one then writes ⊨U,Σ φ instead of ∅ ⊨U,Σ φ. If U = Alg(Σ) the formula (Φ') is a persistent μ-extension of ModΣ(Φ), then Φ' is a persistent μ-extension This theorem allows Theorem 7.24 to be rephrased.
8
Calculi
Informally, a calculus constitutes an inductive definition of a set of strings. In a calculus for a logic these strings are formulas. The goal of such a calculus is to syntactically grasp the notion of logical consequence. Hence, with such a calculus it is in principle possible to mechanically prove statements such as Φ ⊨ φ. Note that an inference rule { ((φ1, ..., φn), φ) | ... } is generally written
as V 1 , • • • , list
    _ . _ : list × list → list
    Isprefix : list × list → bool
  vars l, m, n: list, e: el
  axioms [ ].l = l
    Add(e, l).m = Add(e, l.m)
    (Isprefix(l, m) = True) ≡ (∃n.(m = l.n))
endspec
The specification is an adequate specification of the "classical" algebra because this algebra constitutes a model of the "axioms", as one may easily check. It is not a strictly adequate specification due to the existence of models that are not generated in the sorts bool and list.
Example 10.3 (Peano axioms). Again the logic considered is PL.
loose spec
  sorts bool, nat
  opns True : → bool
    False : → bool
    0 : → nat
    Succ : nat → nat
    _ + _ : nat × nat → nat
    _ * _ : nat × nat → nat
    _ ≤ _ : nat × nat → bool
  vars m, n: nat
  axioms ¬(True = False)
    ¬(0 = Succ(n))
    Succ(n) = Succ(m) ⊃ n = m
    (0 ≤ n) = True
    (Succ(n) ≤ 0) = False
    (Succ(n) ≤ Succ(m)) = (n ≤ m)
    n + 0 = n
    n + Succ(m) = Succ(n + m)
    n * 0 = 0
    n * Succ(m) = n + (n * m)
endspec
Clearly, the specification is an adequate specification of Peano arithmetic (see Example 7.15). As Peano arithmetic is not axiomatizable, it cannot be a strictly adequate one.
The following theorem indicates some limits of the expressive power of loose specifications.
Theorem 10.4. Let sp = (Σ, Φ) be a loose specification in a logic L. Then: (i) M(sp) = (M(sp)YL; (ii) if L is a logic for which there exists a sound and complete calculus, then M(sp) is axiomatizable in L; (iii) if L is one of the logics of sections 6.2 to 6.4 and if M(sp) contains an algebra with an infinite carrier set, then M(sp) also contains nongenerated algebras.
A variant of Theorem 10.4(iii) is also known as the Löwenheim-Skolem theorem (see, for instance, [Ebbinghaus et al., 1984]). Properties of loose specifications may be proved along the lines discussed in section 9. In general, rapid prototyping is not possible for loose specifications.
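That the "classical" algebra is a model of the axioms of Example 10.3 can be spot-checked on an initial segment of the natural numbers. The following Python sketch is only a finite sanity check, not a proof; it assumes that the order operation of the specification is read as "less than or equal", and it contrasts this with a strict reading to show which axiom would then fail. The function name check_axioms and the bound are illustrative choices.

```python
from itertools import product


def check_axioms(leq, bound=6):
    """Evaluate each axiom for all m, n below the bound in the natural numbers."""
    succ = lambda n: n + 1
    axioms = {
        "not (True = False)": lambda n, m: True != False,
        "not (0 = Succ(n))": lambda n, m: 0 != succ(n),
        "Succ(n) = Succ(m) implies n = m": lambda n, m: succ(n) != succ(m) or n == m,
        "(0 <= n) = True": lambda n, m: leq(0, n) == True,
        "(Succ(n) <= 0) = False": lambda n, m: leq(succ(n), 0) == False,
        "(Succ(n) <= Succ(m)) = (n <= m)": lambda n, m: leq(succ(n), succ(m)) == leq(n, m),
        "n + 0 = n": lambda n, m: n + 0 == n,
        "n + Succ(m) = Succ(n + m)": lambda n, m: n + succ(m) == succ(n + m),
        "n * 0 = 0": lambda n, m: n * 0 == 0,
        "n * Succ(m) = n + (n * m)": lambda n, m: n * succ(m) == n + n * m,
    }
    pairs = list(product(range(bound), repeat=2))
    return {name: all(ax(n, m) for n, m in pairs) for name, ax in axioms.items()}


with_leq = check_axioms(lambda n, m: n <= m)   # order read as "less than or equal"
with_lt = check_axioms(lambda n, m: n < m)     # strict reading, for contrast
failed_lt = [name for name, ok in with_lt.items() if not ok]
print(all(with_leq.values()), failed_lt)
```

Under the strict reading only the axiom for 0 fails (at n = 0), which is why the non-strict reading is assumed here.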
10.2 Loose specifications with constructors Definition 10.5 (Loose specification with constructors). Let L be a logic. (i) (Abstract syntax.) A loose specification with constructors is a 4-tuple
sp = (Σ, Φ, Sc, Ωc) where Σ = (S, Ω) is a signature, Φ ⊆ L(Σ) a set of formulas, Sc ⊆ S a set of sorts and Ωc ⊆ Ω a set of operations with target sorts in Sc called constructors.
(ii) (Semantics.) The meaning M(sp) of sp is defined to be the abstract data type M(sp) = ModU(Φ) where U = {A ∈ Alg(Σ) | A is generated in the sorts of Sc by the set Ωc}.
The concrete syntax for such specifications may be chosen to be identical with that of section 10.1 except that in the list of operations each sort of Sc and each operation of Ωc is preceded by the keywords generated and constr respectively.
Example 10.6. Consider the specification sp of Example 10.2 with generated preceding bool and list and with constr preceding each of the first four operations. By its very definition M(sp) no longer contains algebras that are not generated in the sorts bool and list. Nevertheless, it still fails to be a strictly adequate specification because it contains models for which the carrier set of sort list is a singleton.
Example 10.7. Consider the specification of Example 10.3 with generated preceding bool and nat and with constr preceding each of the first four operations. It may be shown that the abstract data type defined by the specification is monomorphic. Hence, it constitutes a strictly adequate specification of Peano arithmetic. Incidentally, this specification illustrates that the first-order predicate calculus together with the induction principle is in some sense complete with respect to Gen(Σ) (cf. section 8.3).
The properties of loose specifications (including Theorem 10.4) carry over to loose specifications with constructors. Of course, the effect of the Löwenheim-Skolem theorem (Theorem 10.4(iii)) is void.
10.3
Loose specifications with free constructors
The definition of these specifications is as in Definition 10.5 with "freely generated" instead of "generated". Similarly, one uses the keyword freely generated instead of generated.
Example 10.8. Consider the specification of Example 10.6 with freely generated instead of generated. The specification is now a strictly adequate specification of the "classical" algebra. Note that the abstract data type defined is still polymorphic due to the non-generated sort el.
Example 10.9. Consider Example 10.7 but with freely generated instead of generated. The first three formulas are now superfluous and may be removed.
Example 10.10. Consider the following specification:
loose spec
  sorts freely generated nat
  opns constr 0 : → nat
    constr Succ : nat → nat
    Pred : nat → nat
  vars n: nat
  axioms Pred(Succ(n)) = n
endspec
in which Pred is "intended" to be the predecessor function. This polymorphic specification is a strictly adequate specification of the "classical" algebra in which the value Pred(0) is left pending. More precisely, the abstract data type defined contains non-isomorphic algebras differing from each other only by the value of Pred(0).
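The polymorphism of Example 10.10 can be made concrete with a few candidate interpretations of Pred over the natural numbers. In the sketch below the carrier is taken to be the natural numbers with 0 and Succ as usual; the helper names make_pred and satisfies_axiom and the sample values 0, 7 and 42 are illustrative assumptions. Each candidate satisfies the single axiom on the tested range, yet the candidates disagree on Pred(0).

```python
def make_pred(value_at_zero):
    """Build an interpretation of Pred that fixes Pred(0) to the given value."""
    def pred(n):
        return value_at_zero if n == 0 else n - 1
    return pred


def satisfies_axiom(pred, bound=100):
    """Check Pred(Succ(n)) = n for all n below the bound."""
    return all(pred(n + 1) == n for n in range(bound))


candidates = {v: make_pred(v) for v in (0, 7, 42)}
all_models = all(satisfies_axiom(p) for p in candidates.values())
values_at_zero = sorted(p(0) for p in candidates.values())
print(all_models, values_at_zero)
```

Because the sort nat is freely generated by 0 and Succ, an isomorphism between two such algebras must map each number to itself, so interpretations with different values of Pred(0) are not isomorphic.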
11
Initial specifications
Initial specifications are atomic specifications in the sense of section 9. They differ from loose specifications in that they always define monomorphic abstract data types. Furthermore, initial specifications make use of equational logic or conditional equational logic. While sections 11.1 to 11.7 treat the case of equational logic, section 11.8 briefly considers the case of conditional equational logic.
11.1
Initial specifications in equational logic
Definition 11.1 (Initial specifications). Let EL be the equational logic of section 6.2 and let T(Σ, Φ) denote the quotient term algebra of Definition 7.5.
(i) (Abstract syntax.) An initial specification in equational logic is a pair sp = (Σ, Φ) where Σ is a signature and Φ ⊆ EL(Σ) a set of equations.
(ii) (Semantics.) The meaning M(sp) of an equational specification sp = (Σ, Φ) is the monomorphic abstract data type M(sp) = {A ∈ Alg(Σ) | A ≅ T(Σ, Φ)}.
The interest of this definition stems from the following fundamental theorem of equational logic:
Theorem 11.2. Let Σ be a signature and Φ ⊆ EL(Σ) a set of equations. Then T(Σ, Φ) ∈ ModΣ(Φ).
The proof of this theorem is essentially based on Theorems 3.10 and 3.11. Together with Theorems 7.6 and 2.14, this leads to the following important result:
Corollary 11.3. Let sp = (Σ, Φ) be an initial specification. Each algebra of M(sp) is initial in ModΣ(Φ).
From the properties of algebras and, in particular, of quotient term algebras one easily deduces the following fact:
Fact 11.4. Let sp = (Σ, Φ) be an initial specification.
(i) Each algebra of M(sp) is generated.
(ii) For each algebra A ∈ ModΣ(Φ) there exists a unique homomorphism h : T(Σ, Φ) → A defined by h([t]) = A(t), t ∈ TΣ, viz. the initial homomorphism (see Definition 7.5).
(iii) For each algebra B ∈ M(sp) there exists a unique and surjective homomorphism h : T(Σ) → B defined by h(t) = B(t), t ∈ TΣ.
A concrete syntax for initial specifications is chosen to be identical with that of section 10.1 but with the keywords initial and eqns instead of loose and axioms.
11.2
Examples
Example 11.5. (i) A trivial initial specification is:
initial spec
  sorts nat
  opns 0 : → nat
    Succ : nat → nat
    _ + _ : nat × nat → nat
  vars m, n: nat
  eqns n + 0 = n
    n + Succ(m) = Succ(n + m)
endspec
One may prove that the specification is an adequate specification of the "classical" algebra. Note, in particular, that T(Σ, Φ)(nat) consists of the equivalence classes [0], [Succ(0)], [Succ(Succ(0))], etc. Note also that, for instance, [0] = [0 + 0] = [0 + 0 + 0] = ... and [Succ(0)] = [Succ(0) + 0] = [0 + Succ(0)] = [Succ(0) + 0 + 0] = ....
(ii) The following example is the classical example of an initial specification. Its glamour stems from the fact that there is no equivalent loose specification with free constructors that is equally "abstract".
initial spec
  sorts nat, set
  opns 0 : → nat
    Succ : nat → nat
    ∅ : → set
    Insert : set × nat → set
  vars m, n: nat, s: set
  eqns Insert(Insert(s, n), n) = Insert(s, n)
    Insert(Insert(s, n), m) = Insert(Insert(s, m), n)
endspec
Informally, the first equation identifies terms in which the same "element" occurs several times; the second equation identifies terms in which the same "elements" occur in different order.
The following example illustrates the problems resulting from the fact that initial specifications cannot define polymorphic abstract data types.
Example 11.6. Consider the initial specification deduced from the loose specification of Example 10.10 by deleting the keywords freely generated and constr and by replacing loose and axioms by initial and eqns. This specification is not an adequate specification of the "classical" algebra because T(Σ, Φ)(nat) now contains additional carriers such as [Pred(0)], [Pred(Pred(0))] and [Succ(Pred(0))].
Correcting non-adequate initial specifications is non-trivial and error-prone, as the following example illustrates.
Example 11.7. To correct the non-adequate specification of Example 11.6 the following solutions may be envisaged. One may add the equation Pred(0) = Succ^n(0) for some fixed n > 0, and thus use
Note that C indirectly proves that 0 and Succ are constructors of the sort nat (see the remark in Example 11.15). The fourth principle of proof consists in the use of term rewriting systems, to which we now turn.
11.6
Term rewriting systems and proofs
Term rewriting systems are particular reduction systems. Both types of systems are extensively discussed in [Klop, 1992; Avenhaus, 1995]. In order to fix the notation the main definitions are briefly recalled. First the notions related to reduction systems are defined.
Definition 11.18 (Reduction system).
(i) A reduction system is a pair (R, →) where R is a set and "→" a relation on R.
(ii) A reduction sequence of a reduction system (R, →) is a possibly infinite sequence r1, ..., rk, ... of elements of R such that ri → ri+1 for each i ≥ 1; one then writes r1 →* rk for any k ≥ 1.
(iii) An equivalence sequence is defined similarly but with ri → ri+1 or ri+1 → ri for each i ≥ 1; one writes r1 ~ rk for any k ≥ 1.
Definition 11.19 (Noetherian, confluent, locally confluent).
(i) A reduction system is Noetherian if it possesses no infinite reduction sequence.
(ii) A reduction system (R, →) is locally confluent if for all r, s, t ∈ R the following holds: if r → s and r → t then there exists u ∈ R with s →* u and t →* u.
(iii) The notion of a confluent reduction system is defined as in (ii) but with r →* s and r →* t instead of r → s and r → t, respectively.
Definition 11.20 (Normal form). Let (R, →) be a reduction system and r ∈ R. A normal form of r is an element s ∈ R such that r →* s and there exists no t ∈ R with s → t.
Theorem 11.21 (Newman's lemma). A Noetherian and locally confluent reduction system is confluent.
Theorem 11.22. Let (R, →) be a Noetherian and confluent reduction system.
(i) Each element of R has exactly one normal form.
(ii) Let r, s ∈ R. Then r ~ s if and only if r and s have the same normal form.
The following definition associates a reduction system with each initial specification.
Definition 11.23 (Term rewriting system). The term rewriting system of an initial specification (Σ, Φ) is the reduction system (TΣ, →) with "→" inductively defined by:
(i) tσ → uσ for each equation ∀X.t = u ∈ Φ and each ground substitution σ : X → TΣ;
(ii) if t → u, then v[t/y] → v[u/y] for all terms v ∈ TΣ({y}) containing at least one occurrence of the variable y.
Note that an initial specification in which an equation ∀X.t = u is replaced by ∀X.u = t leads to a different relation "→*" but to the same relation "~". The following definition and theorem constitute the basis of a criterion guaranteeing that a term rewriting system is Noetherian:
Definition 11.24 (Rewrite ordering, reduction ordering). Let Σ be a signature with TΣ,s ≠ ∅ for each sort s from Σ. Furthermore, let X be a set of variables for Σ. An irreflexive partial order ">" on TΣ(X) is called a rewrite ordering on TΣ(X) if for all terms t, u ∈ TΣ(X):
(i) t > u implies tσ > uσ for all substitutions σ : X → TΣ(X);
(ii) t > u implies v[t/y] > v[u/y] for all terms v ∈ TΣ(X ∪ {y}) where y is a variable for Σ, y ∉ X, which occurs at least once in v.
A rewrite ordering is called a reduction ordering if there exists no infinite sequence ... < t3 < t2 < t1, ti ∈ TΣ(X) for all i ≥ 1.
Theorem 11.25. Let (Σ, Φ) be an initial specification and X a set of variables for Σ containing the variables occurring in Φ. If there exists a reduction ordering < on TΣ(X), such that u < t for any equation ∀Y.t = u ∈ Φ, then the term rewriting system of (Σ, Φ) is Noetherian.
The main theorem of term rewriting states that the equivalence relation "~" coincides with the validity of ground equations in the initial algebra:
Theorem 11.26. Let (Σ, Φ) be an initial specification and (TΣ, →) its term rewriting system. For all ground terms t, u ∈ TΣ it is the case that t ~ u if and only if T(Σ, Φ) ⊨ t = u.
Note that the theorem does not require that the term rewriting system is Noetherian and/or confluent. The previous theorem is the basis of the fourth principle of proof mentioned in section 11.5. Suppose one has to prove T(Σ, Φ) ⊨ ∀X.t = u. It is sufficient to show that for all ground substitutions σ : X → TΣ it is the case that T(Σ, Φ) ⊨ tσ = uσ (according to Theorem 11.14) or, equivalently, that tσ ~ uσ (according to Theorem 11.26),
Example 11.27. Consider the specification of Example 11.5(i). (i) To prove that T(Σ, Φ) ⊨ ∀n.n + Succ(Succ(0)) = Succ(Succ(n)) + 0 one proves that t + Succ(Succ(0)) ~ Succ(Succ(t)) + 0 for an arbitrary t ∈ TΣ: t + Succ(Succ(0)) → Succ(t + Succ(0)) → Succ(Succ(t + 0)) → Succ(Succ(t)) ← Succ(Succ(t)) + 0. Hence t + Succ(Succ(0)) and Succ(Succ(t)) + 0 have the same normal form, viz. the normal form of Succ(Succ(t)).
Reducing proofs to the calculation of normal forms is possible only if the term rewriting system of the specification is Noetherian and confluent. Several criteria have been developed that guarantee each of these properties; they are based on Theorem 11.25 and Theorem 11.21, respectively. The Knuth-Bendix algorithm constitutes a particular implementation of these criteria: it tries to transform an initial specification sp = (Σ, Φ), the term rewriting system of which is Noetherian, into an initial specification sp' = (Σ, ) be an initial specification, the term rewriting system of which is Noetherian and confluent. One associates with this specification a Σ-algebra C defined by:
C(s) = { t ∈ TΣ,s | t is a normal form }, for each sort s of Σ;
C(ω) = the normal form of the term n, for each operation ω = (n : → s) of Σ;
C(ω)(t1, ..., tk) = the normal form of the term n(t1, ..., tk), for each operation ω = (n : s1 × ... × sk → s) of Σ.
Theorem 11.30. Let (Σ, Φ) and C be as in Definition 11.29. Then C is a characteristic term algebra of the initial specification (Σ, Φ). Hence
Informally, the theorem states that the algebra C of Definition 11.29 may be viewed as a representative of the abstract data type defined by the initial specification (Σ, Φ). This justifies why the evaluation of ground terms in the algebra C may be defined as rapid prototyping. As this evaluation consists in the calculation of the normal form of these ground terms, it may be performed automatically by the term rewriting tools mentioned in section 11.6.
Example 11.31. Consider the specification of Example 11.5(i). Rapid prototyping of the ground term Succ(Succ(0)) + Succ(0) yields the ground term Succ(Succ(Succ(0))).
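The calculation of normal forms for the specification of Example 11.5(i) can be sketched in a few lines of Python. This is a minimal illustration and not one of the term rewriting tools referred to in the text: ground terms are encoded as nested tuples (an assumed representation), the two equations are read as rewrite rules from left to right, and the helper names step and normal_form are hypothetical.

```python
ZERO = ("0",)


def succ(t):
    return ("Succ", t)


def plus(t, u):
    return ("+", t, u)


def step(t):
    """Perform one rewrite step on a ground term, or return None if t is a normal form."""
    if t[0] == "+":
        left, right = t[1], t[2]
        if right == ZERO:                  # rule: n + 0 -> n
            return left
        if right[0] == "Succ":             # rule: n + Succ(m) -> Succ(n + m)
            return succ(plus(left, right[1]))
    # No rule applies at the root: try to rewrite a subterm.
    for i, sub in enumerate(t[1:], start=1):
        reduced = step(sub)
        if reduced is not None:
            return t[:i] + (reduced,) + t[i + 1:]
    return None


def normal_form(t):
    """Rewrite until no rule applies; return the normal form and the reduction sequence."""
    trace = [t]
    while (nxt := step(t)) is not None:
        t = nxt
        trace.append(t)
    return t, trace


two, one = succ(succ(ZERO)), succ(ZERO)
result, trace = normal_form(plus(two, one))
print(result)

# Example 11.27 for one sample ground term t: both sides share a normal form.
t = one
same = normal_form(plus(t, two))[0] == normal_form(plus(succ(succ(t)), ZERO))[0]
print(same)
```

The first computation reproduces the rapid prototyping of Example 11.31; the second checks the equivalence of Example 11.27 for the sample term Succ(0) only, whereas the argument in the text covers an arbitrary ground term.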
11.8
Initial specifications in conditional equational logic
The main definitions and theorems of sections 11.1 to 11.7 may be generalized in a straightforward way for conditional equational logic. In particular, Theorem 11.2 holds for conditional equational logic as well. The use of conditional equational logic often simplifies the design of initial specifications because it allows the expression of implications between equalities instead of equalities only. On the other hand it may be more difficult to grasp the structure of the initial algebra T(E, $) and, in particular, of its carriers; moreover, term rewriting systems and their tools are more complex (see, for example, [Klop, 1992; Padawitz, 1988]).
11.9
Comments
It is tempting to try to generalize initial specifications for logics other than equational and conditional equational logic. The following example illustrates the limits of such a generalization.
Example 11.32. Let Σ be a signature with a single sort and four constants, viz. a, b, c, d. Let Φ be a set consisting of the single formula (of predicate logic): (¬a = b ∧ c = d) ∨ (a = b ∧ ¬c = d). Then ModΣ(Φ) has no initial algebra. In fact, by Theorem 3.8 an initial algebra A has to satisfy both A(a) ≠ A(b) and A(c) ≠ A(d). Hence, it cannot be a model of Φ.
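The argument of Example 11.32 can be replayed by brute force if one restricts attention to generated algebras, which for a signature with four constants correspond to the partitions of {a, b, c, d}. The sketch below rests on that restriction and on the observation that a homomorphism between two such algebras exists exactly when the first partition refines the second; the function names are illustrative. It searches for a model that refines every other model.

```python
from itertools import product

CONSTS = ("a", "b", "c", "d")


def partitions():
    """Enumerate all partitions of the constants as maps from constant to block index."""
    seen = set()
    for labels in product(range(len(CONSTS)), repeat=len(CONSTS)):
        relabel = {}
        canon = tuple(relabel.setdefault(label, len(relabel)) for label in labels)
        if canon not in seen:
            seen.add(canon)
            yield dict(zip(CONSTS, canon))


def is_model(p):
    """The formula of the example: exactly one of a = b and c = d holds."""
    ab, cd = p["a"] == p["b"], p["c"] == p["d"]
    return (not ab and cd) or (ab and not cd)


def refines(p, q):
    """p refines q: whatever p identifies, q identifies as well."""
    return all(q[x] == q[y] for x in CONSTS for y in CONSTS if p[x] == p[y])


models = [p for p in partitions() if is_model(p)]
initial_candidates = [p for p in models if all(refines(p, q) for q in models)]
print(len(models), initial_candidates)
```

No model refines all others: a candidate would have to keep a and b apart (because some model does) and keep c and d apart (because some other model does), and would then violate the formula.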
12
Constructive specifications
While loose specifications and initial specifications have a model-theoretic semantics, constructive specifications have an operational semantics. Hence, constructive specifications allow rapid prototyping by their very definition. The literature proposes several notions of constructive specifications, some of which lack a precise definition. The following general notion encompasses most of them. For more clarity its definition is given in three steps. Definition 12.1 (Constructive specification: abstract syntax). A
constructive specification is a triple sp = (Σ, Φ, Ωc) where Σ = (S, Ω) is a signature, Φ ⊆ EL(Σ) a set of equations and Ωc ⊆ Ω a set of operations called constructors. It is required that the following three constraints are satisfied. In these constraints X is a set of variables for Σ, Σc = (S, Ωc) and Var(t) denotes the set of variables occurring in a term t.
(i) Each equation of Φ is of the form n(v1, ..., vk) = t with (n: s1 × ... × sk → s) ∈ Ω − Ωc, k ≥ 0, vi ∈ TΣc(X),si for all i, 1 ≤ i ≤ k, and t ∈ TΣ(X),s. Moreover, it must be the case that Var(t) ⊆ Var(n(v1, ..., vk)). Finally, no variable may occur more than once in n(v1, ..., vk).
(ii) For each ground term n(w1, ..., wk) of TΣ with (n: s1 × ... × sk → s) ∈ Ω − Ωc, k ≥ 0, wi ∈ TΣc,si for each i, 1 ≤ i ≤ k, there exists exactly one equation n(v1, ..., vk) = t ∈ Φ and exactly one ground substitution σ : Var(n(v1, ..., vk)) → TΣc such that wi = viσ for all i, 1 ≤ i ≤ k.
(iii) There exists a reduction ordering
Φ'* if Φ' is a μ-extension of Φ (cf. Definition 7.25).
(ii) The signature morphism μ is called a specification morphism and one writes μ : (Σ, Φ) → (Σ', Φ') if μ : Φ* → Φ'* is a theory morphism.
(iii) The pair ((Σ, Φ), (Σ', Φ')) is called a hierarchical specification if the signature morphism μ is an inclusion and μ : (Σ, Φ) → (Σ', Φ') is a specification morphism.
14
Modularization and parameterization
The specification languages discussed above have two shortcomings. First, they are concerned with algebras and thus implicitly suggest a bottom-up design of specifications. A more general design would require the possibility of handling incomplete specifications, say "specification pieces". Next, the languages do not support reusability. This would require the possibility of providing "specification pieces" with parameters. Sections 14.1 to 14.3 introduce a notion of "specification piece", henceforth called "module specification" . A facility for parameterization is proposed in section 14.4.
14.1
Modularized abstract data types
Informally, modularized abstract data types are the objects to be specified by the module specifications to be introduced in sections 14.2 and 14.3.
Definition 14.1 (Module signature). A module signature is a pair (Σi, Σe) of signatures; Σi and Σe are called the import signature and export signature, respectively. A sort or operation from the signature Σi ∩ Σe is called inherited.
A graphical representation of the module signature (Σi, Σe) with Σi = ({s, r}, {ω1, ω2}), Σe = ({s}, {ω1, ω3}) is:
The dotted lines characterize the inherited sorts and operations.
Definition 14.2 (Modularized abstract data type).
(i) A modularized abstract data type for the module signature (Σi, Σe), or (Σi, Σe)-module for short, is a (total) function F : Alg(Σi) → Class(Σe) such that the class F(A) is an abstract data type for each A ∈ Alg(Σi); in this definition Class(Σe) denotes the "collection" of all classes of Σe-algebras.
(ii) A (Σi, Σe)-module F is called persistent if for each A ∈ Alg(Σi): for each B ∈ It is called consistent if F(A) ≠ ∅ for each A ∈ Alg(Σi).
(iii) A (Σi, Σe)-module F is called monomorphic if F(A) is monomorphic for each A ∈ Alg(Σi).
Informally, persistency expresses the fact that inherited sorts and operations have the "same" meaning in A and F(A); consistency expresses the fact that the mapping F is "effective". Clearly, an abstract data type (in the sense of Definition 2.15) may be viewed as a module with an empty import signature.
14.2
Atomic module specifications
The different atomic specifications of sections 10 to 12 are now generalized for module specifications. By way of restriction, only module signatures $(\Sigma_i, \Sigma_e)$ with $\Sigma_i \subseteq \Sigma_e$ are considered. This restriction is irrelevant because
the constructs forget and export rename of the specification language of section 14.3 will allow modules with $\Sigma_i \not\subseteq \Sigma_e$ to be obtained. First, the case of loose specifications is treated (cf. Definition 10.1).
Definition 14.3 (Loose module specification). Let $L$ be a logic.
(i) (Abstract syntax.) A loose module specification in $L$ is a pair $msp = ((\Sigma_i, \Sigma_e), \Phi)$ where $(\Sigma_i, \Sigma_e)$ is a module signature with $\Sigma_i \subseteq \Sigma_e$ and where $\Phi \subseteq L(\Sigma_e)$ is a set of formulas.
(ii) (Semantics.) The meaning of the loose module specification $msp = ((\Sigma_i, \Sigma_e), \Phi)$ is the module $\mathcal{M}(msp)$ defined by $\mathcal{M}(msp)(A) = \{ B \in Alg(\Sigma_e) \mid B \models \Phi \text{ and } (B \mid \Sigma_i) \cong A \}$ for each $A \in Alg(\Sigma_i)$.
Clearly, the meaning of a loose module specification is persistent but not necessarily consistent. A possible concrete syntax is like that for loose specifications of section 10.1 but with the keyword import preceding each sort and operation of the import signature and with mspec instead of spec.
Example 14.4. The following example is similar to Example 10.2:
    loose mspec
        sorts  list, import bool, import el
        opns   import True: → bool
               import False: → bool
               [ ]: → list
               Add: el × list → list
               _ . _: list × list → list
               Isprefix: list × list → bool
        vars   l, m, n: list, e: el
        axioms [ ].l = l
               Add(e, l).m = Add(e, l.m)
               (Isprefix(l, m) = True) ≡ (∃n.(m = l.n))
    endmspec
Hence, $\Sigma_i = (\{bool, el\}, \{True, False\})$ and $\Sigma_e = \Sigma_i \cup (\{list\}, \{[\ ], Add, \ldots\})$. Informally, the module specification specifies lists. The abstract data types bool and el are to be defined "elsewhere". Clearly, the abstract data type el is intended to be a parameter. Loose module specifications with constructors and/or free constructors may be defined similarly. Next, the case of initial specifications is treated (cf. section 11.1).
Definition 14.5 (Initial module specification).
(i) (Abstract syntax.) An initial module specification in equational logic is a pair $msp = ((\Sigma_i, \Sigma_e), \Phi)$ where $(\Sigma_i, \Sigma_e)$ is a module signature with $\Sigma_i \subseteq \Sigma_e$ and where $\Phi \subseteq EL(\Sigma_e)$ is a set of equations.
(ii) (Semantics.) The meaning of the initial module specification $msp = ((\Sigma_i, \Sigma_e), \Phi)$ is the module $\mathcal{M}(msp)$ defined by $\mathcal{M}(msp)(A) = \{ B \in Alg(\Sigma_e) \mid B \text{ is a free extension of } A \text{ for } (\Sigma_e, \Phi) \}$ for each $A \in Alg(\Sigma_i)$ (cf. Definition 13.4).
Clearly, the meaning of an initial module specification is consistent; it is persistent only if for each $A \in Alg(\Sigma_i)$ the free extension of $A$ for $(\Sigma_e, \Phi)$ is persistent (cf. Definition 13.4). Constructive module specifications may be defined similarly. Their meaning is persistent and consistent. By the way, loose module specifications may be simulated by normal specifications in the following sense. If $msp = ((\Sigma_i, \Sigma_e), \Phi)$, $\Sigma_i \subseteq \Sigma_e$, is a loose module specification, consider the loose specification $sp = (\Sigma_e,$ [...]
    [...] bool
          constr False: → bool
    endmspec;
    LIST ∘ BOOL
The import signature of this module specification is empty. Hence, it may be viewed as a specification with signature ({bool, el, list}, {True, False, [ ], Add,...}) defining lists of "elements". Further comments on the modularized specification language may be found in section 14.5.
14.4
A parameterized specification language
The parameter mechanism introduced in this section is a very elementary one. It allows the simulation of most of the parameter mechanisms that are described in the literature (see section 14.6). According to this parameter mechanism, a parameter is a distinguished sort or operation of the import signature of a module specification. In the module specification of Example 14.4 the imported sort el is predestined as a parameter in contrast with the imported sort bool. The reason is that the sort el is concerned with reusability while bool is concerned with modular design. More precisely, the intended meaning of bool is fixed while the meaning of el is intentionally left pending. In fact, it makes sense to use the module specification with different meanings for el, for instance natural numbers, strings or lists of lists of natural numbers, but it is not sensible to use the module with a meaning for bool other than the intended one. By the way, the difference between imported sorts or operations that are parameters and those that are not is similar to that between parameters and global variables in the procedure body of an imperative language. As a result, imported sorts and operations that are parameters and those that are not do not differ in their semantics but merely differ in their intended use. Providing a module specification language with a parameter mechanism may therefore be reduced to the introduction of two additional language constructs called import rename and import model, respectively. The first of these constructs allows parameter passing by renaming the (imported sorts and operations that constitute the) formal parameters into their actual values. In the case of Example 14.4 it allows one to rename the sort el into, for instance, nat or string. The second construct allows one to put semantic constraints on the parameters. 
For instance, a module specification of "ordered lists" that is parameterized in the sort of its elements requires that the carriers of this sort satisfy the axioms of a partial order, as will be illustrated in Example 14.13. The following two definitions introduce a parameterized specification language called PSL. This language is identical with the module specification language MSL except for the two additional language constructs mentioned above. The definition of the construct import rename is slightly more complex than the above comments may suggest. While the construct introduces "new" names for the sorts and operations of the import signature, it generally modifies the export signature too. In fact, the inherited sorts and operations in the export signature have to be renamed accordingly. The same holds for the inherited sorts occurring in the arities of the exported operations. For this reason the signature morphism is defined as a signature morphism $\mu : \Sigma_i \cup \Sigma_e \to \Sigma'$ rather than $\mu : \Sigma_i \to \Sigma'$.
Definition 14.9 (Abstract syntax of the parameterized specification language PSL). The set of parameterized specifications $psp$ of the language PSL and their module signatures $\mathcal{S}(psp)$ are defined inductively:
(i)-(x) as Definition 14.6(i) to (x) but with "parameterized specification" instead of "module specification";
(xi) if $psp$ is a parameterized specification with $\mathcal{S}(psp) = (\Sigma_i, \Sigma_e)$ and if $\mu : \Sigma_i \cup \Sigma_e \to \Sigma'$ is a surjective signature morphism satisfying the following four conditions:
    (a) for each sort $s$ from $\Sigma_e \setminus \Sigma_i$, $\mu(s) = s$,
    (b) for each operation $\omega$ from $\Sigma_e \setminus \Sigma_i$, $\mu(\omega)$ and $\omega$ have the same operation name,
    (c) for any two different sorts or operations $so_{1e}$ and $so_{2e}$ from $\Sigma_e$, $\mu(so_{1e}) = \mu(so_{2e})$ implies that both $so_{1e}$ and $so_{2e}$ are inherited,
    (d) for any sort or operation $so_i$ from $\Sigma_i$ and $so_e$ from $\Sigma_e$, $\mu(so_i) = \mu(so_e)$ implies that $so_e$ is inherited,
then (import rename $psp$ by $\mu$) is a parameterized specification with $\mathcal{S}(\text{import rename } psp \text{ by } \mu) = (\mu(\Sigma_i), \mu(\Sigma_e))$;
(xii) if $psp$ is a parameterized specification with $\mathcal{S}(psp) = (\Sigma_i, \Sigma_e)$ and if $\Phi \subseteq L(\Sigma_i)$ is a set of formulas for some logic $L$, then ($psp$ import model $\Phi$) is a parameterized specification with $\mathcal{S}(psp \text{ import model } \Phi) = \mathcal{S}(psp)$.
Informally, conditions (xi)(a) and (xi)(b) express the fact that $\mu$ constitutes a renaming of the import signature. More precisely, condition (xi)(a) expresses the fact that $\mu$ renames sorts from $\Sigma_e$ only if they are inherited.
Condition (xi)(b) expresses the same property for operations or, at least, for their names; the condition does not extend to their arities: if a non-inherited operation $\omega$ from $\Sigma_e$ contains imported sorts in its arity, $\omega$ and $\mu(\omega)$ may differ from each other in their arities. Conditions (xi)(c) and (xi)(d) avoid "name clashes". More precisely, condition (xi)(c) expresses
the fact that $\mu$ is injective on the non-inherited sorts and operations from $\Sigma_e$. Condition (xi)(d) expresses the fact that $\mu$ may identify a sort or operation from $\Sigma_i$ and a sort or operation from $\Sigma_e$ only if the latter is inherited. Note that the signature morphism $\mu$ is not necessarily bijective and hence may fail to constitute a renaming in the sense of Definition 5.1(ii). This is sensible because it must be possible that two different formal parameters get the same actual value (see Example 14.11). In the concrete syntax one uses pspec and endpspec instead of mspec and endmspec.
Definition 14.10 (Semantics of PSL). The meaning $\mathcal{M}(psp)$ of a parameterized specification $psp$ is a module (in the sense of Definition 14.2) and is inductively defined according to Definition 14.9:
(i) to (x) as in Definition 14.7 but with "parameterized specification" instead of "module specification";
(xi) if $\mathcal{S}(psp) = (\Sigma_i, \Sigma_e)$, then
$$\mathcal{M}(\text{import rename } psp \text{ by } \mu)(A) = \{ B \in Alg(\mu(\Sigma_e)) \mid (B \mid (\mu|_{\Sigma_e})) \in \mathcal{M}(psp)(A \mid (\mu|_{\Sigma_i})) \}$$
for each $A \in Alg(\mu(\Sigma_i))$;
(xii) if $\mathcal{S}(psp) = (\Sigma_i, \Sigma_e)$, then
$$\mathcal{M}(psp \text{ import model } \Phi)(A) = \begin{cases} \mathcal{M}(psp)(A) & \text{if } A \models \Phi, \\ \emptyset & \text{otherwise} \end{cases}$$
for each $A \in Alg(\Sigma_i)$.
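The four conditions of Definition 14.9(xi) lend themselves to a mechanical check. The following sketch is one possible encoding under stated assumptions: a signature is a pair (sorts, operations), an operation is a triple (name, argument sorts, result sort), and the morphism $\mu$ is given by two dictionaries that default to the identity. The function names are hypothetical, and the test data are a reading of the PAIR signature of Example 14.11 with its imported sorts also listed in the export signature.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple

Op = Tuple[str, Tuple[str, ...], str]        # (name, argument sorts, result sort)
Sig = Tuple[FrozenSet[str], FrozenSet[Op]]   # (sorts, operations)
Renaming = Dict[str, str]                    # identity wherever a key is absent


def rename_op(op: Op, sort_map: Renaming, name_map: Renaming) -> Op:
    """Apply the morphism to an operation: rename its name and its arity."""
    name, args, result = op
    return (name_map.get(name, name),
            tuple(sort_map.get(s, s) for s in args),
            sort_map.get(result, result))


def rename_sig(sig: Sig, sort_map: Renaming, name_map: Renaming) -> Sig:
    """Image of a signature under the morphism."""
    sorts, ops = sig
    return (frozenset(sort_map.get(s, s) for s in sorts),
            frozenset(rename_op(o, sort_map, name_map) for o in ops))


def check_conditions(imp: Sig, exp: Sig, sort_map: Renaming,
                     name_map: Renaming) -> Dict[str, bool]:
    """Evaluate conditions (a) to (d) of Definition 14.9(xi)."""
    def mu_s(s): return sort_map.get(s, s)
    def mu_o(o): return rename_op(o, sort_map, name_map)
    inh_sorts, inh_ops = imp[0] & exp[0], imp[1] & exp[1]
    # (a) non-inherited export sorts are left unchanged
    a = all(mu_s(s) == s for s in exp[0] - imp[0])
    # (b) non-inherited export operations keep their operation name
    b = all(mu_o(o)[0] == o[0] for o in exp[1] - imp[1])
    # (c) only inherited export symbols may be identified with each other
    c = (all(mu_s(s) != mu_s(t) or {s, t} <= inh_sorts
             for s, t in combinations(exp[0], 2))
         and all(mu_o(o) != mu_o(p) or {o, p} <= inh_ops
                 for o, p in combinations(exp[1], 2)))
    # (d) an import symbol may coincide only with an inherited export symbol
    d = (all(mu_s(s) != mu_s(t) or t in inh_sorts
             for s in imp[0] for t in exp[0])
         and all(mu_o(o) != mu_o(p) or p in inh_ops
                 for o in imp[1] for p in exp[1]))
    return {"a": a, "b": b, "c": c, "d": d}


pair_imp: Sig = (frozenset({"el1", "el2"}), frozenset())
pair_exp: Sig = (frozenset({"el1", "el2", "pair"}),
                 frozenset({("[_,_]", ("el1", "el2"), "pair"),
                            ("First", ("pair",), "el1"),
                            ("Second", ("pair",), "el2")}))

to_nat = {"el1": "nat", "el2": "nat"}   # both formal parameters become nat
clash = {"el1": "pair"}                 # collides with the exported sort pair

print(check_conditions(pair_imp, pair_exp, to_nat, {}))
print(check_conditions(pair_imp, pair_exp, clash, {}))
print(sorted(rename_sig(pair_exp, to_nat, {})[0]))
```

The non-injective renaming of el1 and el2 to nat satisfies all four conditions, since both sorts are inherited. Renaming el1 to pair keeps (a) and (b) but violates (c) and (d), because pair is a non-inherited sort of the export signature. Sorts and operations live in separate name spaces in this encoding, so clashes between a sort and an operation are not modelled.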
The meaning of the construct import rename is as expected. The effect of the construct import model is to "eliminate" the imported algebras that fail to satisfy $\Phi$. The construct import rename preserves persistency and, when applied to persistent specifications, consistency. The construct import model preserves persistency but not necessarily consistency. Again, the addition of an environment presents no problem.
Example 14.11. A "declaration" of a parameterized specification is
    PAIR is loose pspec
        sorts  freely generated pair, import el1, import el2
        opns   constr [_ , _]: el1 × el2 → pair
               First: pair → el1
               Second: pair → el2
        vars   e1: el1, e2: el2
        axioms First([e1, e2]) = e1
               Second([e1, e2]) = e2
    endpspec
An instantiation of this parameterized specification is
    import rename PAIR by sorts el1, el2 as sorts nat, nat
The instantiation specifies pairs of natural numbers or, more precisely, of carriers of sort nat. Its module signature $(\Sigma_i, \Sigma_e)$ is as expected: $\Sigma_i = (\{nat\}, \emptyset)$, $\Sigma_e = (\{nat, pair\}, \{[\_,\_] : nat \times nat \to pair,\ First : pair \to nat,\ Second : pair \to nat\})$. The example illustrates that import renaming need not be injective. The notation for an instantiation is clumsy and requires repetition of the formal parameters. By storing the formal parameters in the environment one may adopt a "sugared" notation similar to that of programming languages: the formal and actual parameters are written between brackets after the name of the parameterized specification. This notation has the additional advantage that it explicitly distinguishes between parameters and imported sorts and operations that are not parameters.
Example 14.12. The parameterized specification of Example 14.11 may now be written:
    PAIR (sorts el1, el2) is loose pspec
        sorts  freely generated pair, import el1, import el2
        opns   [_ , _]: ...
    endpspec
Similarly, the instantiation is written:
    PAIR (sorts nat, nat)
The following example illustrates the use of import model for expressing parameter constraints.
Example 14.13. The following parameterized specification defines ordered lists. In it the construct import model makes sure that the relation "⊑" is a partial order.
    ORDERED-LISTS (sorts el, opns _ ⊑ _ : el × el → bool) is
    ((loose pspec
        sorts  freely generated list, import bool, import el
        opns   import True: → bool
               import False: → bool
               import _ ⊑ _ : el × el → bool
               constr [ ]: → list
               constr Add: el × list → list
               Is-ordered: list → bool
        vars   e, e1, e2: el, l: list
        axioms Is-ordered([ ]) = True
               Is-ordered(Add(e, [ ])) = True
               (e1 ⊑ e2) = True ⊃ Is-ordered(Add(e1, Add(e2, l))) = Is-ordered(Add(e [...]
    [...] bool
               constr False: → bool
               constr 0: → nat
               constr Succ: nat → nat
               _ ≤ _ : nat × nat → bool
        vars   m, n: nat
        axioms (0 ≤ n) = True
               (Succ(m) ≤ 0) = False
               (Succ(m) ≤ Succ(n)) = (m ≤ n)
    endpspec)
Then ORDERED-LISTS (sorts nat, opns 600?) ∘ is a specification of ordered lists of natural numbers. Of course, this specification makes sense only if the module it defines is consistent (in the sense of Definition 14.2). Hence, it is necessary to prove that "