$F(x, \alpha) \wedge F(x, \beta))$

Proof of Lemma 1.3. Without loss of generality, suppose that $F(x, y)$ is convex to the right. For a contradiction, suppose that there are no $\alpha, \beta \in p(M)$ such that $M \models F(\beta, \alpha) \wedge \exists x[\neg F(x, \alpha) \wedge F(x, \beta)]$. As $F(x, y)$ is not equivalence generating, there are $\alpha, \beta \in p(M)$ such that $M \models F(\beta, \alpha) \wedge \exists x[F(x, \alpha) \wedge \neg F(x, \beta)]$. Consequently there is $\gamma \in p(M)$ such that $\gamma \in F(M, \alpha) \setminus F(M, \beta)$. Consider $f \in \mathrm{Aut}_A(M)$ such that $f(\gamma) = \alpha$. As $\alpha < \beta < \gamma$ we have $f^n(\alpha) < f^n(\beta) < f^n(\gamma) = f^{n-1}(\alpha)$ for each $n < \omega$. As $M \models F(\gamma, \alpha)$ we have $M \models F(f^{n-1}(\alpha), f^n(\alpha))$ for each $n < \omega$. By our supposition we have $M \models F(\gamma, f^n(\alpha))$ for each $n < \omega$. As $M \models \neg F(\gamma, \beta)$ we have $M \models \neg F(f^n(\gamma), f^n(\beta))$ and consequently $M \models \neg F(\gamma, f^n(\beta))$ for each $n < \omega$. Thus, $F(\gamma, M)$ is a union of infinitely many convex sets, contradicting the weak o-minimality of $M$. $\square$
Lemma 1.4. Let $M$ be a weakly o-minimal structure, $A \subseteq M$, $p \in S_1(A)$ be non-algebraic, $M$ be $|A|^+$-saturated. Suppose that $F(x, y)$ is a $p$-stable convex to the right formula so that $F(x, y)$ is equivalence generating. Then
1) $G(x, y) := F(y, x)$ is a $p$-stable convex to the left formula which is also equivalence generating.
2) $E(x, y) := F(x, y) \vee F(y, x)$ is an equivalence relation which partitions $p(M)$ into convex classes.

Proof of Lemma 1.4. 1) As $F(x, y)$ is convex to the right we have $M \models \forall y \forall x\,[F(x, y) \to y < x]$ and consequently $M \models \forall x \forall y\,[G(x, y) \to x < y]$. Let $\alpha, \beta \in M$ be such that $M \models G(\beta, \alpha)$. Then we have $\beta < \alpha$. Prove that for any $\beta'$ such that $\beta < \beta' < \alpha$ we have $M \models G(\beta', \alpha)$. As $M \models G(\beta, \alpha)$, we have $M \models F(\alpha, \beta)$. As $F(x, y)$ is convex to the right we have $M \models F(\beta', \beta)$. As $F(x, y)$ is equivalence generating we have $M \models F(\alpha, \beta')$. Therefore $M \models G(\beta', \alpha)$ and $G(x, y)$ is convex to the left.

Prove that $G(x, y)$ is $p$-stable. Take an arbitrary $\alpha \in p(M)$ and consider $G(M, \alpha)$. We need to find $\gamma_1, \gamma_2 \in p(M)$ such that $\gamma_1 < G(M, \alpha) < \gamma_2$. It is obvious that for any $\gamma_2 \in p(M)$ such that $\gamma_2 > \alpha$ we have $G(M, \alpha) < \gamma_2$. Show the existence of such $\gamma_1$. For a contradiction, suppose that for any $\gamma_1 \in p(M)$ such that $\gamma_1 < \alpha$ we have $M \models G(\gamma_1, \alpha)$. Take an arbitrary $\beta \in p(M)$ such that $\beta < \alpha$ and consider $F(M, \beta)$. By our supposition $M \models G(\beta, \alpha)$ and consequently $M \models F(\alpha, \beta)$. As $F(x, y)$ is $p$-stable there is $\gamma_2 \in p(M)$ such that $F(M, \beta) < \gamma_2$. Consider $f \in \mathrm{Aut}_A(M)$ such that $f(\gamma_2) = \beta$. As $\beta < \gamma_2$ we have $f(\beta) < f(\gamma_2)$, i.e. $f(\beta) < \beta$. By the supposition we have $M \models G(f(\beta), \alpha)$. Consequently $M \models G(\beta, f^{-1}(\alpha))$ or equivalently $M \models F(f^{-1}(\alpha), \beta)$. As $\beta < \alpha$ we have $f^{-1}(\beta) < f^{-1}(\alpha)$, i.e. $\gamma_2 < f^{-1}(\alpha)$. We have $M \models \neg F(\gamma_2, \beta) \wedge F(f^{-1}(\alpha), \beta) \wedge \gamma_2 < f^{-1}(\alpha)$, contradicting the fact that $F(x, y)$ is convex to the right. Consequently $G(x, y)$ is $p$-stable.

Prove now that $G(x, y)$ is equivalence generating. For a contradiction, suppose that $G(x, y)$ is not equivalence generating. By Lemma 1.3 there are $\alpha, \beta \in p(M)$ such that $M \models G(\beta, \alpha) \wedge \exists x[\neg G(x, \alpha) \wedge G(x, \beta)]$. There is $\gamma \in M$ such that $M \models \neg G(\gamma, \alpha) \wedge G(\gamma, \beta)$. Then we have $M \models F(\beta, \gamma) \wedge F(\alpha, \beta) \wedge \neg F(\alpha, \gamma) \wedge \gamma$
is equivalence generating we have $M \models F(\gamma, \alpha)$ and consequently $M \models E(\gamma, \alpha)$. Thus, $E(x, y)$ is an equivalence relation. Prove now that for any $\alpha \in M$ the set $E(\alpha, M)$ is convex. Let $\gamma_1, \gamma_2 \in E(\alpha, M)$. Without loss of generality we can assume $\alpha < \gamma_1 < \gamma_2$. Then we have $M \models F(\gamma_1, \alpha) \wedge F(\gamma_2, \alpha)$. As $F(x, y)$ is convex to the right, $M \models F(\gamma, \alpha)$ and consequently $M \models E(\gamma, \alpha)$ for any $\gamma$ such that $\gamma_1 < \gamma < \gamma_2$. $\square$

Definition 1.6. We will say $p$ is semi-quasisolitary to the right (left) if there is the greatest $p$-stable convex to the right (left) formula.

Definition 1.7. We will say $p$ is quasisolitary if $p$ is semi-quasisolitary both to the right and to the left.

In Example 1.1 the type $p$ is not quasisolitary.

Proposition 1.1. Let $M$ be a weakly o-minimal structure, $A \subseteq M$, $M$ be $|A|^+$-saturated, $p \in S_1(A)$ be non-algebraic. Then
1. If $F(x, y)$ is the greatest $p$-stable convex to the right (left) formula then $F(x, y)$ is equivalence generating.
2. Any semi-quasisolitary one-type is quasisolitary.

Proof of Proposition 1.1. 1. Let $F(x, y)$ be the greatest $p$-stable convex to the right formula. Suppose that $F(x, y)$ is not equivalence generating. By Lemma 1.3 there are $\alpha, \beta \in p(M)$ such that $M \models F(\beta, \alpha) \wedge \exists x[\neg F(x, \alpha) \wedge F(x, \beta)]$. By Lemma 1.2 $F'(x, y) := \exists z[F(z, y) \wedge F(x, z)]$ is $p$-stable convex to the right. It is obvious that $F(M, \alpha) \subset F'(M, \alpha)$, contradicting the fact that $F(x, y)$ is the greatest.

2. Let $F(x, y)$ be the greatest $p$-stable convex to the right formula. By item 1 $F(x, y)$ is equivalence generating. By Lemma 1.4 $G(x, y) := F(y, x)$ is a $p$-stable convex to the left formula. For a contradiction, suppose $G(x, y)$ is not the greatest. Then there is a $p$-stable convex to the left formula $G'(x, y)$ which is greater than $G(x, y)$. Consequently there are $\alpha, \gamma \in p(M)$ such that $\gamma \in G'(M, \alpha) \setminus G(M, \alpha)$ (*). Let $G'_0(\gamma, x)$ be a convex subformula of $G'(\gamma, x)$ such that $M \models G'_0(\gamma, \alpha)$. As $G'(x, y)$ is $p$-stable there is $\gamma' \in p(M)$ such that $\gamma' < G'(M, \alpha)$. Consider $f \in \mathrm{Aut}_A(M)$ such that $f(\gamma') = \alpha$. As $\gamma' < \gamma < \alpha$ we have $\alpha = f(\gamma') < f(\gamma) < f(\alpha)$. As $M \models \neg G'(\gamma', \alpha)$ we have $M \models \neg G'(\alpha, f(\alpha))$ and consequently $M \models \neg G'(\gamma, f(\alpha))$. We have $M \models \neg G'_0(\gamma, f(\alpha))$ and consequently $G'_0(\gamma, M) < f(\alpha)$. Consequently $G'_0(\gamma, M) \subset p(M)$. Consider the following formula:

$F'(x, y) := x > y \wedge [G'_0(y, x) \vee \exists z(G'_0(y, z) \wedge x < z)]$   (**)
Prove that $F'(x, y)$ is $p$-stable convex to the right and greater than $F(x, y)$. Consider arbitrary $\gamma_1, \gamma_2 \in M$ such that $M \models F'(\gamma_2, \gamma_1)$. Then $M \models \gamma_2 > \gamma_1 \wedge [G'_0(\gamma_1, \gamma_2) \vee \exists z(G'_0(\gamma_1, z) \wedge \gamma_2 < z)]$. Consider arbitrary $\beta \in M$ such that $\gamma_1 < \beta < \gamma_2$. If $M \models G'_0(\gamma_1, \beta)$ then $M \models F'(\beta, \gamma_1)$. If not, we have $M \models \beta < \gamma_2 \wedge G'_0(\gamma_1, \gamma_2)$ and consequently $M \models F'(\beta, \gamma_1)$. Thus, $F'(x, y)$ is convex to the right. Let $g \in \mathrm{Aut}_A(M)$ be such that $g(\gamma) = \gamma_1$. By (**) we have $G'_0(\gamma_1, M) < g(f(\alpha))$ and consequently $F'(M, \gamma_1) < g(f(\alpha))$, i.e. $F'(x, y)$ is $p$-stable. Observe that $F(M, \gamma) \subset F'(M, \gamma)$. We have $M \models F'(\alpha, \gamma)$. If $M \models F(\alpha, \gamma)$ then $M \models G(\gamma, \alpha)$, contradicting (*). Consequently $\alpha \in F'(M, \gamma) \setminus F(M, \gamma)$ and thus $F'(x, y)$ is greater than $F(x, y)$, contradicting the fact that $F(x, y)$ is the greatest. $\square$

2. Main theorem

Lemma 2.1. Let $T$ be a weakly o-minimal theory, $M \models T$, $A \subseteq M$, $M$ be $|A|^+$-saturated, $p \in S_1(A)$ be non-algebraic. Suppose that $E(x, y)$ is an $A$-definable non-trivial equivalence relation which partitions $p(M)$ into convex classes. Then $E$ partitions $p(M)$ into infinitely many such classes, so that the induced ordering on classes is either a dense order without endpoints or a discrete order without endpoints.

Proof of Lemma 2.1. First show that there is no leftmost $E$-class which is contained in $p(M)$. By assumption there are elements $\alpha, \beta \in p(M)$ with $\alpha < \beta$ and $\neg E(\alpha, \beta)$. As all elements of $p(M)$ have the same type as $\beta$ over $A$ we have: for every element $\beta'$ in $p(M)$ there exists $\alpha' \in p(M)$ such that $\alpha' < \beta'$ and $\neg E(\alpha', \beta')$. Therefore there is no smallest $E$-class in $p(M)$. We can also show that there is no rightmost $E$-class. Thus, $E$ partitions $p(M)$ into infinitely many classes. Now, consider the following formula:

$\Phi(x) := \exists z[\neg E(z, x) \wedge x < z \wedge \forall t(x < t < z \to E(x, t) \vee E(z, t))]$

If $\Phi(x) \in p$ then the $E$-classes are discretely ordered. If not, then the $E$-classes are densely ordered. $\square$

Corollary 2.1. Let $T$ be an $\aleph_0$-categorical weakly o-minimal theory, $M \models T$, $A \subseteq M$, $M$ be $|A|^+$-saturated, $p \in S_1(A)$ be non-algebraic. Suppose that $E(x, y)$ is an $A$-definable non-trivial equivalence relation which partitions $p(M)$ into convex classes. Then the induced order on $E$-classes is a dense order without endpoints.
Lemma 2.2. Let $T$ be an $\aleph_0$-categorical weakly o-minimal theory, $M \models T$, $A \subseteq M$, $M$ be $|A|^+$-saturated, $p \in S_1(A)$ be non-algebraic. Suppose that $E_1(x, y)$, $E_2(x, y)$ are $A$-definable equivalence relations which partition $p(M)$ into convex classes so that there is an element $\alpha \in p(M)$ such that $E_1(M, \alpha) \subset E_2(M, \alpha)$. Then $E_1$ partitions each $E_2$-class into infinitely many $E_1$-classes.

If $F(x, y)$ is a $p$-stable convex to the right (left) formula, we will say $F(x, y)$ is trivial if for any $\alpha \in p(M)$ we have $M \models \forall x\,[F(x, \alpha) \to x = \alpha]$. Otherwise, such a formula is said to be non-trivial. A quasisolitary type $p$ is said to be solitary if the greatest $p$-stable convex to the right (left) formula is trivial.

If $A, B \subseteq M$, $n \in \omega$, we will say $A$ is $n$-indiscernible over $B$ in $M$ if for any properly ordered $n$-tuples $\bar{a}, \bar{b} \in A^n$ we have $tp(\bar{a}/B) = tp(\bar{b}/B)$, and we will say $A$ is indiscernible over $B$ in $M$ if for any $n \in \omega$ the set $A$ is $n$-indiscernible over $B$ in $M$.

Lemma 2.3. Let $M$ be a linearly ordered structure, $M$ be $\max\{|A|^+, \omega\}$-saturated. Then $p(M)$ is $(n+1)$-indiscernible over $A$ if $p(M)$ is $n$-indiscernible over $A$ and for every $\alpha_1, \ldots, \alpha_n \in p(M)$ such that $\alpha_1 < \cdots < \alpha_n$ the set $p(M) \cap \{\beta \in M \mid \alpha_n < \beta\}$ is 1-indiscernible over $A \cup \{\alpha_1, \ldots, \alpha_n\}$.

Lemma 2.4. Let $M$ be a weakly o-minimal structure, $A \subseteq M$, $p \in S_1(A)$ be non-algebraic. Suppose that $p$ is solitary. Then $p(M)$ is 2-indiscernible over $A$.

Proof of Lemma 2.4. By Lemma 2.3 we have to prove that for $\alpha_1 \in p(M)$ all elements of $p(M)$ that are bigger than $\alpha_1$ have the same type over $A \cup \{\alpha_1\}$. If this were not the case, there would be an $A$-definable formula $F(x, \alpha_1)$ which does not hold for all elements of $p(M) \cap \{x \mid x > \alpha_1\}$. By changing $F(x, \alpha_1)$ either to the smallest connected component or to the formula $x > \alpha_1 \wedge \forall y[F(y, \alpha_1) \to x < y]$ we can assume $F(x, y)$ to be convex to the right. Furthermore $F(x, y)$ is $p$-stable. This is a contradiction to the solitarity of $p$. $\square$
Theorem 2.1. Let $T$ be a weakly o-minimal theory, $M \models T$, $A \subseteq M$, $M$ be $|A|^+$-saturated, $p \in S_1(A)$ be non-algebraic. Suppose that the set of all $p$-stable convex to the right formulas is ordered by $\omega^*$, where $\omega^*$ is the reverse ordering on the natural numbers. Then any $p$-stable convex to the right (left) formula is equivalence generating.
Proof of Theorem 2.1. By the hypothesis the set of all $p$-stable convex to the right formulas is ordered by $\omega^*$: $F_1(x, y), F_2(x, y), \ldots, F_n(x, y), \ldots$ so that for any $\alpha \in p(M)$ we have $F_1(M, \alpha) \supset F_2(M, \alpha) \supset \ldots \supset F_n(M, \alpha) \supset \ldots$. Prove that for any $i \ge 1$ the formula $F_i(x, y)$ is equivalence generating.

Step $i$. Suppose that for any $j \in \{1, \ldots, i\}$ the formula $F_j(x, y)$ is equivalence generating. Prove that $F_{i+1}(x, y)$ also is equivalence generating. For a contradiction, suppose that $F_{i+1}(x, y)$ is not equivalence generating. By Lemma 1.3 there are $\alpha, \beta, \gamma \in p(M)$ such that

$M \models F_{i+1}(\beta, \alpha) \wedge F_{i+1}(\gamma, \beta) \wedge \neg F_{i+1}(\gamma, \alpha)$   (*)

Consider the following formula: $F'(x, y) := \exists t\,[F_{i+1}(t, y) \wedge F_{i+1}(x, t)]$. By Lemma 1.2 $F'(x, y)$ is $p$-stable convex to the right. By (*) $\gamma \in F'(M, \alpha) \setminus F_{i+1}(M, \alpha)$, i.e. $F'(M, \alpha) \supset F_{i+1}(M, \alpha)$. Consequently, there is $j \in \{1, \ldots, i\}$ such that $F'(M, \alpha) = F_j(M, \alpha)$. Then the following holds:

$M \models \forall x\,(F_j(x, \alpha) \leftrightarrow \exists t[F_{i+1}(t, \alpha) \wedge F_{i+1}(x, t)])$   (1)

Consider an arbitrary element $\gamma_1 \in F_j(M, \alpha) \setminus F_{i+1}(M, \alpha)$. By (1) there is $\beta \in F_{i+1}(M, \alpha)$ such that $\gamma_1 \in F_{i+1}(M, \beta)$. Consider $f \in \mathrm{Aut}_A(M)$ such that $f(\gamma_1) = \alpha$. As $\alpha < \gamma_1$ we have $f^{n+1}(\alpha) < f^{n+1}(\gamma_1) = f^n(\alpha)$ for each $n < \omega$. As $M \models F_j(\gamma_1, \alpha)$ we have $M \models F_j(f^n(\alpha), f^{n+1}(\alpha))$ for each $n < \omega$. As $F_j(x, y)$ is equivalence generating we have $M \models F_j(\gamma_1, f^n(\alpha))$ for each $n < \omega$. By (1) there is $\beta_n \in F_{i+1}(M, f^n(\alpha))$ such that $\gamma_1 \in F_{i+1}(M, \beta_n)$. As $M \models \neg F_{i+1}(\gamma_1, \alpha)$ we have $M \models \neg F_{i+1}(f^n(\alpha), f^{n+1}(\alpha))$ and consequently $M \models \neg F_{i+1}(\gamma_1, f^n(\alpha))$ for each $n < \omega$. As $M \models \neg F_{i+1}(\gamma_1, f^n(\alpha)) \wedge F_{i+1}(\beta_n, f^n(\alpha)) \wedge F_{i+1}(\gamma_1, \beta_n)$ we have $f^n(\alpha) < \beta_n$ for each $n < \omega$. As $M \models \neg F_{i+1}(f^{n-1}(\alpha), f^n(\alpha)) \wedge F_{i+1}(\beta_n, f^n(\alpha))$ we have $\beta_n < f^{n-1}(\alpha)$ for each $n < \omega$. Thus, $f^n(\alpha) \notin F_{i+1}(\gamma_1, M)$, $\beta_n \in F_{i+1}(\gamma_1, M)$ and $f^n(\alpha) < \beta_n < f^{n-1}(\alpha)$ for each $n < \omega$. Consequently $F_{i+1}(\gamma_1, M)$ is a union of infinitely many convex sets, contradicting the weak o-minimality of $M$. Step $i$ is proved. $\square$
Corollary 2.2. Let $T$ be an $\aleph_0$-categorical weakly o-minimal theory, $M \models T$, $A \subseteq M$, $p \in S_1(A)$ be non-algebraic. Then any $p$-stable convex to the right (left) formula is equivalence generating.

Proof of Corollary 2.2. If $A$ is finite it immediately follows from Theorem 2.1. Consider the case when $A$ is infinite. Consider an arbitrary $p$-stable convex to the right
formula $F(x, y)$. Let $A_0$ be a finite subset of $A$ such that $F(x, y)$ is a formula over $A_0$. Let $p_0$ be $p|A_0$. It is obvious that $F(x, y)$ is $p_0$-stable convex to the right. By Theorem 2.1 $F(x, y)$ is equivalence generating in $p_0$ and therefore also equivalence generating in $p$. $\square$

The following corollary is very close to results of Section 2 of [3].

Corollary 2.3. Let $T$ be an $\aleph_0$-categorical weakly o-minimal theory, $M \models T$, $A \subseteq M$, $A$ be finite, $p \in S_1(A)$ be non-algebraic. Suppose that $\{F_1(x, y), \ldots, F_m(x, y)\}$ is a complete list of all non-trivial $p$-stable convex to the right formulas so that for any $\alpha \in p(M)$ we have $F_1(M, \alpha) \subset \ldots \subset F_m(M, \alpha)$. Then the $A$-definable non-trivial equivalence relations with infinite classes on $p(M)$ are precisely $E_i$ for $1 \le i \le m$ given by $E_i(x, y) := F_i(x, y) \vee F_i(y, x)$ so that the following holds:
(1) $E_m$ partitions $p(M)$ into infinitely many $E_m$-classes, every $E_m$-class is convex and open so that the induced ordering on classes is a dense order without endpoints.
(2) For any $i \in \{1, \ldots, m-1\}$, $E_i$ partitions every $E_{i+1}$-class into infinitely many $E_i$-classes, every $E_i$-class is convex and open so that the $E_i$-subclasses of every $E_{i+1}$-class are densely ordered without endpoints.
(3) For any $\alpha \in p(M)$ the set $E_1(M, \alpha)$ is 2-indiscernible over $A$.
References
1. H.D. Macpherson, D. Marker, Ch. Steinhorn, Weakly o-minimal structures and real closed fields, Trans. Amer. Math. Soc., 352 (2000), pp. 5435-5483.
2. B.S. Baizhanov, Expansion of a model of a weakly o-minimal theory by a family of unary predicates, JSL, 66 (2001), pp. 1382-1414.
3. B. Herwig, H.D. Macpherson, G. Martin, A. Nurtazin, J.K. Truss, On $\aleph_0$-categorical weakly o-minimal structures, APAL, 101 (2000), pp. 65-93.
PROOFS ABOUT FOLKLORE: WHY MODEL CHECKING = REACHABILITY? (EXTENDED ABSTRACT)

K. CHOE, H. EO, S. O
Korea Advanced Institute of Science and Technology, Kusong-dong Yusong-gu 373-1, Taejon 305-701, Korea

N. V. SHILOV
Institute of Informatics Systems, Lavrent'ev av., 6, Novosibirsk 630090, Russia

K. YI
Seoul National University, San 56-1 Shilim-dong Kwanak-gu, Seoul 151-742, Korea

We demonstrate that Hintikka-like game-theoretic semantics for a so-called Second-Order Elementary Propositional Dynamic Logic (SOEPDL) leads to a principal opportunity to use solvers of simple reachability properties as engines for model checking classical temporal and program logics like the $\mu$-Calculus ($\mu$C) and Computation Tree Logic (CTL).
1. Introduction and Motivation

It is well known that various propositional program logics like CTL [1] (Computation Tree Logic) are easy to encode into the propositional $\mu$-Calculus ($\mu$C) of Kozen [4] due to its expressive power. It implies that any model checking engine for $\mu$C can be used for checking CTL without any model modification. But there is also interest [6, 7] in using model checking engines for simple temporal properties, like safety or progress properties, for model checking other more complicated temporal properties, with the aid of some algebraic transformations of models. In particular, the paper of Schuppan and Biere [6] has admitted Cartesian products of models with finite sets for an efficient reduction of model checking progress (liveness) properties to model checking safety properties. It leads to a practically efficient model checking of progress properties via safety ones.
The paper of Shilov and Yi [7] has demonstrated how (in principle) to use a model checker that can solve finite games for finite model checking of $\mu$C and the second-order propositional program logic 2M of Stirling [10]. For it, Shilov and Yi have admitted Cartesian products and the power set construct on models, and introduced a very expressive Second-Order Elementary Propositional Dynamic Logic (SOEPDL). The cited paper [7] has demonstrated that SOEPDL is more expressive than CTL, $\mu$C, and 2M, and that the Second-order logic of several monadic Successors (S(n)S-Logic) of Rabin [5] can be interpreted in SOEPDL. The reduction of SOEPDL model checking to $\mu$C model checking is based on Hintikka-like game-theoretic semantics [3]. For a SOEPDL-formula we construct a game between two players, Spoiler and Duplicator, in a manner that validity of the formula is reduced to existence of a winning strategy for Duplicator. Because there exists a $\mu$C-formula WIN $= \mu x.(p \vee \langle a \rangle x \vee (\langle b \rangle true \wedge [b]x))$ that represents existence of a winning strategy in terminating games, model checking of a given SOEPDL-formula is reduced to model checking of WIN in the model of the game.

Unfortunately, the game-theoretic semantics suggested by Shilov and Yi [7] is extremely inefficient: the complexity of the game construction is exponential in the size of the model and the length of the formula. In the present paper we suggest a more efficient game semantics whose complexity is exponential in the size of the model and the number of independent variables in the formula$^{a}$. Then we study how to solve finite games by CTL model checkers and, moreover, by checkers of $\forall$-reachability and $\exists$-reachability properties. It implies (in combination with the reduction of SOEPDL model checking to existence of a winning strategy) a formal justification for a folklore opinion of the software engineering community that finite-state model checking is basically a kind of reachability checking.

2. Preliminaries

Let Prp and Act be two disjoint alphabets of propositional variables and action symbols respectively. Let us assume that the reader is familiar with the basic concepts of CTL and $\mu$C. We would like to use a standard notation [1] for CTL, and adopt a quite standard notation [7] for $\mu$C. In contrast, let us define below two less known second-order propositional program logics, namely 2M by Stirling [10] and SOEPDL by Shilov and Yi [7].

$^{a}$ Thus we try to restrict the set of 'valuable' variables in a manner that can improve the upper complexity bound. This approach is very useful in complexity research, see for instance the paper of Vardi [11].
Semantics of propositional program logics is defined in models, which are called labeled transition systems or Kripke structures.

Definition 2.1. A model $M$ is a pair $(D_M, I_M)$ where the domain $D_M$ is a nonempty set, and the interpretation $I_M$ is a pair of mappings $(P_M, R_M)$. Elements of the domain $D_M$ are called states. The interpretation maps propositional variables into sets of states and action symbols into binary relations on states: $P_M : Prp \to 2^{D_M}$ and $R_M : Act \to 2^{D_M \times D_M}$. (We frequently use $I_M(a)$ and $I_M(p)$ instead of $R_M(a)$ and $P_M(p)$ when it is explicit that $a \in Act$ and $p \in Prp$.)

Every model can be considered as a labeled directed graph with nodes and edges marked by sets of propositional variables and action symbols respectively.

Definition 2.2. Assume we are given a set of formulae of any propositional program logic that is syntactically closed with respect to negation ($\neg$), conjunction ($\wedge$), and disjunction ($\vee$) (it can be a set of CTL-formulae, $\mu$C-formulae, etc.). A satisfiability relation $\models$ between models, states, and formulae is defined inductively with respect to the structure of formulae. For every model $M = (D_M, I_M)$, every state $s \in D_M$, and every formula $\phi$ let us write:
• "$s \models_M \phi$" iff $(M, s, \phi) \in\ \models$
• "$s \not\models_M \phi$" iff $(M, s, \phi) \notin\ \models$.
For Boolean constants, $s \models_M true$ and $s \not\models_M false$. For propositional variables we have: $s \models_M p$ iff $s \in I_M(p)$. For Boolean constructs $\models_M$ is defined in the standard manner: $s \models_M \neg\phi$ iff $s \not\models_M \phi$, $s \models_M \phi \wedge \psi$ iff $s \models_M$

which are read as "box/diamond $a$ $\phi$" or "after $a$ always/sometimes $\phi$" respectively
• $s \models_M ([a]\phi)$ iff for every state $s'$: $(s, s') \in I_M(a)$ implies $s' \models_M \phi$

which are read as "for every/some $p$" respectively

which are read as "box/diamond
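The model of Definition 2.1 and the box/diamond clauses of Definition 2.2 can be illustrated by a small sketch. This is only an illustration under stated assumptions: the class name `Model`, the field names, and the helper functions `box` and `diamond` are hypothetical and do not come from the paper; formulae are represented directly by the sets of states where they hold.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set, Tuple

State = Hashable


@dataclass
class Model:
    """A labeled transition system (D_M, I_M); all names here are illustrative."""
    domain: Set[State]                            # D_M
    props: Dict[str, Set[State]]                  # P_M : Prp -> 2^{D_M}
    actions: Dict[str, Set[Tuple[State, State]]]  # R_M : Act -> 2^{D_M x D_M}


def box(m: Model, action: str, phi: Set[State]) -> Set[State]:
    """States where [action]phi holds: every action-successor lies in phi."""
    edges = m.actions.get(action, set())
    return {s for s in m.domain
            if all(t in phi for (u, t) in edges if u == s)}


def diamond(m: Model, action: str, phi: Set[State]) -> Set[State]:
    """States where <action>phi holds: some action-successor lies in phi."""
    edges = m.actions.get(action, set())
    return {s for s in m.domain
            if any(t in phi for (u, t) in edges if u == s)}


# A three-state chain 0 -a-> 1 -a-> 2 where p holds only at state 2.
m = Model(domain={0, 1, 2},
          props={"p": {2}},
          actions={"a": {(0, 1), (1, 2)}})
after_a_always_p = box(m, "a", m.props["p"])
after_a_sometimes_p = diamond(m, "a", m.props["p"])
```

Note that state 2 has no `a`-successor, so it satisfies the box formula vacuously but not the diamond formula.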
Definition 2.8. Let $\phi$ be a formula of $\mu$C, 2M, or SOEPDL. A propositional variable $p$ is said to be a propositional constant$^{e}$ in the formula $\phi$ if $p$ has no bound instances in $\phi$. Let $C(\phi)$ be the set of propositional constants in [...] $2^{D_M}$ is a total function that maps each variable that is not a constant in $\xi$ but has free instances in $\psi$ to $S(x) \subseteq D_M$.
• Spoiler has moves of 4 kinds related to conjunctive subformulae, and wins in positions of 5 kinds.
• Duplicator has moves of 4 kinds related to disjunctive subformulae, and wins in positions of 5 kinds.
All moves and winning positions are represented in Fig. 1 and 2.

$^{e}$ Recall that we call true and false Boolean constants.

Here we use the following notation for functions. First, the empty set $\emptyset$ can be considered as a total function $\emptyset : \emptyset \to B$ for every set $B$. Then for every two elements $a$ and $b$ let $(a \mapsto b) : \{a\} \to \{b\}$ be the total function that maps $a$ into $b$. Next, let $F : A \to B$ be any total function, $C \subseteq A$, $d \notin A$ and $b \in B$; then let $F|_C : C \to B$ be the restriction of $F$ to $C$ and $F^{b/d} : \{d\} \cup A \to B$ be the extension of $F$ on $d$, i.e. the following functions:

$(F|_C)(x) = \begin{cases} F(x) & \text{if } x \in C \\ \text{undefined} & \text{otherwise,} \end{cases} \qquad (F^{b/d})(x) = \begin{cases} F(x) & \text{if } x \in A \\ b & \text{if } x = d. \end{cases}$
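The function notation above (the singleton function, restriction, and extension) can be sketched with finite dictionaries. This is a minimal illustration, assuming total functions on finite sets are represented as Python `dict` objects; the helper names `singleton`, `restrict` and `extend` are hypothetical.

```python
from typing import Dict, Hashable, Set


def singleton(a: Hashable, b: Hashable) -> Dict[Hashable, Hashable]:
    """The total function (a -> b) with domain {a}."""
    return {a: b}


def restrict(f: Dict[Hashable, Hashable], c: Set[Hashable]) -> Dict[Hashable, Hashable]:
    """Restriction of f to c; arguments outside c become undefined (absent)."""
    return {x: v for x, v in f.items() if x in c}


def extend(f: Dict[Hashable, Hashable], d: Hashable, b: Hashable) -> Dict[Hashable, Hashable]:
    """Extension of f by a new argument d (d must not be in the domain of f)."""
    if d in f:
        raise ValueError("d must lie outside the domain of f")
    g = dict(f)   # copy, so that f itself is left unchanged
    g[d] = b
    return g


# A valuation S of two variables by sets of states, then extended and restricted.
S = {"x": frozenset({0, 1}), "y": frozenset({2})}
S_ext = extend(S, "z", frozenset())
S_res = restrict(S_ext, {"x", "z"})
```

The empty dictionary `{}` plays the role of the empty function in this representation.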
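Moves of Duplicator:
• $i \in \{1, 2\}$: $(s, (\psi_1 \vee \psi_2), S) \to (s, \psi_i, S|_{F(\psi_i)})$
• $(s, t) \in I_M(a)$: $(s, (\langle a \rangle \psi), S) \to (t, \psi, S)$
• $t \in D_M$: $(s, (\Diamond \psi), S) \to (t, \psi, S)$
• $T \subseteq D_M$: $(s, (\exists y.\psi), S) \to (s, \psi, (S^{T/y})|_{F(\psi)})$

Moves of Spoiler:
• $i \in \{1, 2\}$: $(s, (\psi_1 \wedge \psi_2), S) \to (s, \psi_i, S|_{F(\psi_i)})$
• $(s, t) \in I_M(a)$: $(s, ([a]\psi), S) \to (t, \psi, S)$
• $t \in D_M$: $(s, (\Box \psi), S) \to (t, \psi, S)$
• $T \subseteq D_M$: $(s, (\forall y.\psi), S) \to (s, \psi, (S^{T/y})|_{F(\psi)})$

Figure 1. Moves of Spoiler and Duplicator

Winning positions of Spoiler:
• $(s, false, \emptyset)$
• $(s, p, \emptyset)$ where $s \notin I_M(p)$, $p \in C(\xi)$
• $(s, \neg p, \emptyset)$ where $s \in I_M(p)$, $p \in C(\xi)$
• $(s, x, (x \mapsto T))$ where $s \notin T$, $x \in Prp$
• $(s, \neg x, (x \mapsto T))$ where $s \in T$, $x \in Prp$

Winning positions of Duplicator:
• $(s, true, \emptyset)$
• $(s, p, \emptyset)$ where $s \in I_M(p)$, $p \in C(\xi)$
• $(s, \neg p, \emptyset)$ where $s \notin I_M(p)$, $p \in C(\xi)$
• $(s, x, (x \mapsto T))$ where $s \in T$, $x \in Prp$
• $(s, \neg x, (x \mapsto T))$ where $s \notin T$, $x \in Prp$

Figure 2. Winning positions of Spoiler and Duplicator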
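The winning positions of Fig. 2 can be read as a simple case analysis on literals. The sketch below is an illustration only: the tuple encoding of literals, the function name `winner`, and the dictionary representation of the interpretation and of the valuation are assumptions made here, not the paper's data structures.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

State = Hashable
# Hypothetical literal encoding: ("true",), ("false",), or ("lit", name, positive).
Literal = Tuple


def winner(state: State,
           literal: Literal,
           valuation: Dict[str, Set[State]],
           interp: Dict[str, Set[State]]) -> Optional[str]:
    """Return who wins at a terminal position (state, literal, valuation).

    `valuation` plays the role of S (it maps a variable x to a set T of states),
    `interp` plays the role of I_M for propositional constants.
    Returns None when the symbol is interpreted by neither.
    """
    if literal[0] == "true":
        return "Duplicator"
    if literal[0] == "false":
        return "Spoiler"
    _, name, positive = literal
    if name in valuation:        # variable x with valuation (x -> T)
        holds = state in valuation[name]
    elif name in interp:         # propositional constant p, read off I_M(p)
        holds = state in interp[name]
    else:
        return None
    # Duplicator wins exactly when the literal is true at the state.
    return "Duplicator" if holds == positive else "Spoiler"


interp = {"p": {"s0"}}
r1 = winner("s0", ("lit", "p", True), {}, interp)
r2 = winner("s0", ("lit", "p", False), {}, interp)
r3 = winner("s1", ("lit", "x", True), {"x": {"s0"}}, interp)
r4 = winner("s1", ("lit", "x", False), {"x": {"s0"}}, interp)
```

Each of the ten rows of Fig. 2 corresponds to one branch outcome of this function.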
While the game in the paper of Shilov and Yi [7] traces valuations of all variables for all positions, we only trace valuations of variables with free instances. The improvement of complexity comes from the relative scarceness of free variables in each subformula.

Proposition 3.1. For every finite model $M = (D_M, I_M)$ and every normal SOEPDL-formula $\xi$ there exists a finite game $G(M, \xi)$ of two players Spoiler and Duplicator such that the following holds for every state $s \in D_M$ and every subformula
finite models (where $d$ is the number of states in a model, $f$ is the size and $n$ is the independent variable index of a formula).

4. Reduction to CTL

We study two opportunities to use a model checker for CTL for solving $\mu$C, 2M, and SOEPDL formulas in finite models.

Definition 4.1. Let $G = (P_A, P_B, M_A, M_B, W_A, W_B)$ be a game with winning positions. Let $D_G$ be $P_A \cup P_B$. A powerset model $\mathcal{P}_G$ of this game is defined as follows. The domain of the model is the powerset $2^{D_G}$. A single propositional variable $p$ is interpreted by the powerset $2^{W_A}$. The interpretation of a single action symbol next comprises all pairs $(S', S'')$ such that $S', S'' \subseteq D_G$ and
• for every $s' \in S'$, for some $(s', s'') \in (M_A \cup M_B)$: $s' \notin (W_A \cup W_B) \Rightarrow s'' \in S''$;
• for every $s' \in S'$, for every $(s', s'') \in M_B$: $s' \in (P_B \setminus (W_A \cup W_B)) \Rightarrow s'' \in S''$;
• for every $s'' \in S''$, for some $(s', s'') \in (M_A \cup M_B)$: $s' \in S'$.

Proposition 4.1. For every game with winning positions $G$, for every set of positions $S$, if there is a finite upper bound on the length of $G$ sessions, then $S \models_{\mathcal{P}_G} EF\,p$ iff player A has winning strategies against the counterpart B in all positions within $S$.

In combination with theorem 3.1, it implies the following theorem.

Theorem 4.1. Let MC be a model checker which can check the CTL formula $(EF\,p)$ in finite models. Then MC can be reused for checking all formulae of SOEPDL, 2M, $\mu$C, and CTL in all finite models.

Unfortunately, this kind of reuse is double exponential in the model size. But there is another, more efficient opportunity to use a CTL model checker for solving finite games. This time we assume that we have an engine for solving AF and EF queries (formulas) and design a 'driver' that solves the $\mu$C-formula WIN $= \mu x.(p \vee \langle a \rangle x \vee (\langle b \rangle true \wedge [b]x))$ in finite games. These engines should be able to solve AF and EF queries in the following sense: for every finite model $M$, for every set of states $P$ within this model
• let $AF_M P$ be the set of states where $AF\,p$ holds in $M$ when $p$ holds exactly at the states of $P$;
• let $EF_M P$ be the set of states where $EF\,p$ holds in $M$ when $p$ holds exactly at the states of $P$.

    X_0 := W_A;                        % winning positions for A %
    i := 0;
    DO
        Y_i := AF_{G_B} X_i;           % G_B stands for G with B-moves only %
        X_{i+1} := Y_i ∪ EF_{G_A} X_i; % G_A stands for G with A-moves only %
        i := i + 1
    UNTIL X_i = X_{i-1};
    WIN := X_i

Figure 3. A driver that solves finite games in terms of AF and EF
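The driver of Fig. 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `af`, `ef` and `solve_game` are hypothetical, both queries are computed as naive least fixpoints rather than by a real model checker, and a position without outgoing moves in the restricted graph is taken to satisfy AF or EF only if it already belongs to the target set (a convention chosen here for finite plays).

```python
from typing import Hashable, Set, Tuple

Position = Hashable
Move = Tuple[Position, Position]


def successors(moves: Set[Move], s: Position) -> Set[Position]:
    """Positions reachable from s by one move."""
    return {t for (u, t) in moves if u == s}


def ef(positions: Set[Position], moves: Set[Move], target: Set[Position]) -> Set[Position]:
    """EF query: positions from which some path along `moves` reaches `target`."""
    reached = set(target)
    while True:
        new = {s for s in positions
               if s not in reached and successors(moves, s) & reached}
        if not new:
            return reached
        reached |= new


def af(positions: Set[Position], moves: Set[Move], target: Set[Position]) -> Set[Position]:
    """AF query: positions from which every path along `moves` reaches `target`."""
    reached = set(target)
    while True:
        new = set()
        for s in positions:
            succ = successors(moves, s)
            # Assumption: a dead end outside `target` does not satisfy AF.
            if s not in reached and succ and succ <= reached:
                new.add(s)
        if not new:
            return reached
        reached |= new


def solve_game(positions: Set[Position], moves_a: Set[Move],
               moves_b: Set[Move], wins_a: Set[Position]) -> Set[Position]:
    """Iterate X := AF_{G_B} X  union  EF_{G_A} X until stabilization (Fig. 3)."""
    x = set(wins_a)
    while True:
        y = af(positions, moves_b, x)            # G_B: B-moves only
        x_next = y | ef(positions, moves_a, x)   # G_A: A-moves only
        if x_next == x:
            return x
        x = x_next


# Toy game: A moves from a0 and a1, B moves from b0 and b1; w is winning for A.
positions = {"a0", "a1", "b0", "b1", "w", "l"}
moves_a = {("a0", "b0"), ("a0", "l"), ("a1", "b1")}
moves_b = {("b0", "w"), ("b1", "w"), ("b1", "l")}
win = solve_game(positions, moves_a, moves_b, {"w"})
print(sorted(win))  # ['a0', 'b0', 'w']
```

In the toy game B can escape to the dead end `l` from `b1`, so neither `b1` nor `a1` is winning for A, while `a0` wins by moving to `b0`, from which B is forced into `w`.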
Proposition 4.2. For every game with winning positions $G$ with some upper bound on the length of plays, the method depicted in Fig. 3 eventually terminates, and upon termination the set WIN consists of all positions where player A has winning strategies against the counterpart B.

Proof. Termination of the method trivially follows from a simple observation: every turn in any game consists of at least one game step. Thus, if $k$ is an upper bound for the length of the game sessions, then $X_k = X_{k+1}$. We show that for every $i \ge 1$, $X_i$ consists of all positions within the game where player A has a strategy that leads to his/her win in all sessions that consist of at most $(i - 1)$ changes of turns (by induction on $i$). Proposition 4.2 follows from this claim since the loop condition is a fixpoint stabilization of $X_i$ and the method terminates eventually.

Induction basis: $i = 1$. Then by construction $Y_1 = AF_{G_B} W_A$, i.e. it consists of all positions where it is the turn of B, but he/she loses every session that consists of his/her moves only. Hence $X_1 = EF_{G_A} W_A \cup AF_{G_B} W_A$, i.e. it consists of all positions where it is the turn of A, and he/she can run a session that consists of his/her moves only and leads to his/her win, or (alternatively) it is the turn of B, but he/she loses every session that consists of his/her moves only.

Induction hypothesis: assume that for some $k \ge 1$ the claim holds.

Induction step: $i = (k + 1)$. Then by construction $Y_i = AF_{G_B} X_k$, i.e. it consists of all positions where it is the turn of B, but he/she loses every session that starts with his/her turn and then consists of $(k - 1)$ turns at
most, where A utilizes a strategy that guarantees a win for A. Hence $X_i = Y_i \cup EF_{G_A} X_k$ consists of all positions where
• A can run a turn that leads to a position where he/she has a strategy that leads to A's win in all sessions that consist of at most $(k - 1)$ changes of turns, or (alternatively)
• B loses every session that starts with a turn of B and then consists of at most $(k - 1)$ turns, where A utilizes a strategy that guarantees a win for A,
i.e. where player A has a strategy that leads to his/her win in all sessions that consist of at most $k = (i - 1)$ changes of turns. $\square$

Combining this proposition with theorem 3.1, we get another theorem.

Theorem 4.2. Let MC be a model checker that can solve AF and EF queries in time linear in the size of a finite model. Then MC can be reused for checking all formulae of $\mu$C, 2M, and SOEPDL with the upper time bound $f^2 \times \exp(d \times n)$ in all finite models (where $d$ is the number of states in a model, $f$ is the size and $n$ is the independent variable index of a formula).

Observe that our above reuse theorems 1, 2, and 3 can be extended to the class of (not necessarily finite) models closed under Cartesian products and powersets, because the actual game model construction and CTL model construction use only Cartesian products and powersets.

References
1. Emerson E.A. Temporal and Modal Logic. Handbook of Theoretical Computer Science, v. B, Elsevier and The MIT Press, 1990, 995-1072.
2. Harel D. First-Order Dynamic Logic. Lecture Notes in Computer Science, v. 68, 1979.
3. Hintikka J., Sandu G. Game-Theoretical Semantics. Handbook of Logic and Language, 1997.
4. Kozen D. Results on the Propositional Mu-Calculus. Theoretical Computer Science, v. 27, n. 3, 1983, p. 333-354.
5. Rabin M.O. Decidable Theories, in Handbook of Mathematical Logic, ed. Barwise J. and Keisler H.J., North-Holland Pub. Co., 1977, 595-630.
6. Schuppan V. and Biere A. Efficient reduction of finite state model checking to reachability analysis. International Journal on Software Tools for Technology Transfer (STTT), v. 5 (2-3), p. 185-204, 2004.
7. Shilov N.V., Yi K. On Expressive and Model Checking Power of Propositional Program Logics. Lecture Notes in Computer Science, v. 2244, p. 39-46, 2001.
8. Shilov N.V., Yi K. Engaging Students with Theory through ACM Collegiate Programming Contests. Communications of ACM, v. 45, n. 9, 2002.
9. Shilov N.V., Yi K. How to find a coin: propositional program logics made easy. In Current Trends in Theoretical Computer Science, World Scientific, v. 2, 2004, p. 181-214.
10. Stirling C. Games and Modal Mu-Calculus. Lecture Notes in Computer Science, v. 1055, 1996, p. 298-312.
11. Vardi M.Y. On the complexity of bounded-variable queries. Proc. 14th ACM Symp. on Principles of Database Systems, 1995, p. 266-276.
A NOTE ON $\Delta_1$ INDUCTION

C. DIMITRACOPOULOS*
Department of History and Philosophy of Science, University of Athens, GR-157 71 Zografou, Greece

A. SIROKOFSKICH†
Department of Mathematics, University of Athens, GR-157 84 Zografou, Greece

We give an alternative proof of a result of T. Slaman, concerning the strength of $I\Delta_1$, i.e. the theory of provably-$\Delta_1$ induction.
1. Introduction

We work with subsystems of first-order Peano Arithmetic ($PA$). As usual, for $n \in \mathbb{N}$, $I\Sigma_n$ denotes the induction schema for $\Sigma_n$ formulae (plus the well-known base theory $PA^-$), $L\Sigma_n$ the least number axiom schema for $\Sigma_n$ formulae (plus $PA^-$), $B\Sigma_n$ denotes $I\Delta_0$ plus the collection schema for $\Sigma_n$ formulae, $\exp$ denotes the axiom expressing "exponentiation is total" and $\Omega_k$, $k \ge 1$, denotes the axiom expressing "$x^{\log^{(k)} x}$ is total", where $\log^{(k)}$ denotes the $k$-th iterate of the logarithmic function. Also, $I\Delta_n$ denotes the induction schema for provably-$\Delta_n$ formulae (plus $PA^-$) and $L\Delta_n$ the least number schema for provably-$\Delta_n$ formulae (plus $PA^-$). Finally, $\langle\ ,\ \rangle$ denotes one of the usual pairing functions and $x \in y$ denotes the $\Delta_0$ formula expressing "$2^x$ appears in the binary expansion of $y$". For details, the reader can consult Hájek-Pudlák [6].

We will also deal with other fragments of $PA$, namely versions of the pigeonhole principle. As usual, $PHP\Sigma_n$ denotes $PA^-$ plus the schema expressing this principle for $\Sigma_n$-definable functions:

* Research co-funded by the European Social Fund and National Resources (EPEAEK II) PYTHAGORAS II.
† Research co-funded by the European Social Fund and National Resources (EPEAEK II) HERAKLEITOS.
$\forall z$ "$\varphi$ does not define a 1-1 function from $z + 1$ into $z$", for any

imply $B\Sigma_n$ ($n \ge 1$)?
Motivated by this problem, several authors have recently studied the strength of $I\Delta_n$ and its parameter free counterpart, $I\Delta_n^-$, especially for the case $n = 1$ (see Beklemishev [1], Fernández-Margarit et al. [5], Slaman [10] and Thapen [11]). Before we refer briefly to some of their results, we mention a few early theorems concerning fragments of $PA$. One of the first results concerning $B\Sigma_n$, proved by C. Parsons [9], is that it is not implied by the set of $\Pi_{n+1}$ sentences true in the standard model.

Theorem 1. For $n \ge 1$, $\Pi_{n+1}(\mathbb{N}) \not\Rightarrow B\Sigma_n$.

Relationships among $B\Sigma_n$ and other fragments of $PA$ were extensively studied in Paris-Kirby [7]; the ones that are relevant to the sequel are summarized as follows.

Theorem 2. For all $n \in \mathbb{N}$, $I\Sigma_{n+1} \Rightarrow B\Sigma_{n+1} \Rightarrow I\Sigma_n \Leftrightarrow L\Sigma_n$ (and the implications are strict).

By using (easy) modifications of some arguments in Paris-Kirby [7] and an argument due to R. Gandy (unpublished, see p. 66 of Hájek-Pudlák [6]), one obtains

Theorem 3. For all $n \in \mathbb{N}$, $B\Sigma_{n+1} \Leftrightarrow L\Delta_{n+1} \Rightarrow I\Delta_{n+1}$.
Turning now to recent work motivated by Problem 1, Fernandez-Margarit et al. [5] contains a study of restrictions of $I\Delta_n$, $L\Delta_n$ etc. to $\Delta_n(T)$ formulae, i.e. formulae that are $\Delta_n$ provably in (a certain theory) $T$, while the other papers studied the original problem. In the rest of this section, we will refer briefly to results in Beklemishev [1], Slaman [10] and Thapen [11], in order to place our result in the appropriate background.

Motivated by the following picture,

[Diagram relating $I\Sigma_1$, $I\Delta_0 + \exp$, $B\Sigma_1$, $\Pi_2(\mathbb{N})$ and $I\Delta_1$]

which emerged as a synthesis of earlier results for $n = 1$, Beklemishev considered the problem whether $\Pi_2(\mathbb{N})$ implies $I\Delta_1$ or not and he solved it by proving

Theorem 4. $\Pi_2(\mathbb{N}) \nRightarrow I\Delta_1$.

At the end of his paper, Beklemishev noted that Theorem 4 (and other results of his paper) can be easily generalized for any $n > 1$ and stated some problems motivated by Problem 1. Two of his problems were the following.

Problem 2. Does $I\Delta_1^-$ follow from EA? From any r.e. set of true $\Pi_2$ sentences?

($I\Delta_1^-$ denotes parameter-free $I\Delta_1$, while EA is a theory he called "Elementary Arithmetic" and is easily seen to be equivalent to $I\Delta_0 + \exp$; see, e.g., section 1(b) in chapter I of Hajek-Pudlak [6].)

Problem 3. Does $B\Sigma_1$ follow from $I\Delta_1$ together with all true $\Pi_2$ sentences?
Slaman considered a variant of Problem 1, namely the same question but taking as base theory $I\Delta_0 + \exp$ instead of $I\Delta_0$, and showed that the answer is "yes", i.e. he proved

Theorem 5. For all $n \ge 1$, $I\Delta_n + \exp \Rightarrow B\Sigma_n$.

As a consequence, he obtained positive solutions to (i) Problem 1 for $n > 1$, since $I\Delta_n \Rightarrow I\Sigma_{n-1} \Rightarrow \exp$ for such $n$; (ii) Problem 3, since exp is a particular case of a true $\Pi_2$ sentence.
Thapen improved Slaman's result for $n = 1$, by showing

Theorem 6. $I\Delta_1 + t \Rightarrow B\Sigma_1$, where $t$ is the axiom $\forall x\, \exists y\, \exists z\, (x < p(y) \wedge z = x^y)$, with $p$ being any primitive recursive function.

In particular, it follows that $I\Delta_1 + \Omega_k \Rightarrow B\Sigma_1$. Cordon-Franco et al. [3] produced a negative solution to Problem 2. Actually, they proved the following more general result:

Theorem 7. For any $n \in \mathbb{N}$, there is no r.e. set of true $\Pi_{n+2}$-sentences which implies $I\Delta_{n+1}^-$.

In the next section, we give an alternative argument for Slaman's result and discuss the plausibility of the conjecture that a modification of this argument can lead to a proof, in the same spirit, of Thapen's result. The main difference between our approach and those of Slaman and Thapen is that we exploit results of Dimitracopoulos-Paris [4] and Paris et al. [8], concerning the provability of versions of the pigeonhole principle in fragments of PA.

2. $\Delta_1$-induction vs. $\Sigma_1$-pigeonhole principle

Before giving proofs we first recall some results of Dimitracopoulos-Paris [4].

Proposition 8. $I\Delta_0 + \exp \Rightarrow PHP\Delta_0$.

Idea of proof. Working in $M \models I\Delta_0 + \exp$, assume $PHP\Delta_0$ fails, i.e. for some $\theta \in \Delta_0$, $a \in M$ we have $M \models$ "$\theta$ defines a 1-1 function $f : a + 1 \to a$". Since $M \models \exp$, $f$ can be coded and hence

$M \models \exists z \le a\, \exists w < 2^{\langle a, a-1 \rangle + 1}$ "$w$ codes a 1-1 function from $z + 1$ into $z$".

But now we can use $L\Delta_0$ to find the least such $z$ and then derive a contradiction. □

Proposition 9. For all $n > 0$, $PHP(\Sigma_n \vee \Pi_n) \Rightarrow I\Sigma_n$.

Theorem 10. For all $n > 0$, $PHP\Sigma_n \Rightarrow B\Sigma_n$.

Idea of proof. Working in $M \models PHP\Sigma_n$, assume $B\Sigma_n$ fails, i.e. for some $\theta \in \Pi_{n-1}$, $a \in M$ we have

$M \models \forall x < a\, \exists y\, \theta(x, y) \wedge \forall t\, \neg \forall x < a\, \exists y < t\, \theta(x, y)$.
By Proposition 9, $M \models L\Sigma_{n-1}$ and hence the formula $\theta(x, y) \wedge \forall u < y\, \neg\theta(x, u)$ defines a function $f : a \to M$ with unbounded range. By considering the elements in the range of $f$ in increasing order, we can now produce a $\Sigma_n$ formula defining a 1-1 function $g$ from $a + 1$ onto $a$ (the idea is that $g(x) = y \iff f(y)$ is the immediate successor of $f(x)$ in the range of $f$). But such a function cannot exist, since $PHP\Sigma_n$ holds. □

In view of Theorem 10, to prove that $I\Delta_1 + \exp \Rightarrow B\Sigma_1$, it suffices to prove an apparently stronger result, i.e.

Theorem 11. $I\Delta_1 + \exp \Rightarrow PHP\Sigma_1$.
Proof. Let $M \models I\Delta_1 + \exp$, $a \in M$ and $\theta(x, y) \in \Sigma_1$ such that $\theta$ is of the form $\exists z$
$a\, \forall w\, [\varphi(b, y, w) \to \forall x < b\, \exists y < a\, \exists z < w$
($\Rightarrow$) Assume $b \in A$, $c < a$, $d$ satisfy $M \models$
$I\Delta_0 + \Omega_1 \vdash wPHP\Delta_0$. Using this result instead of Lemma 13, one can immediately modify the proof above to show that $I\Delta_1 + \Omega_k \vdash wPHP\Sigma_1$, for any $k > 1$. As mentioned by Paris et al. [8], it is not known whether $\Omega_1$ in Lemma 13 can be weakened to $\Omega_k$, for any/some $k > 1$; if this were proved, Theorem 12 could be strengthened accordingly.
Acknowledgement. The authors would like to thank Jeff Paris for comments and corrections which helped to improve this paper.
References
1. L. D. Beklemishev: On the induction schema for decidable predicates, J. Symbolic Logic 68 (2003), 17-34.
2. P. Clote and J. Krajicek: Open problems, Oxford Logic Guides, 23, Arithmetic, proof theory, and computational complexity (Prague, 1991), 1-19, Oxford Univ. Press, New York, 1993.
3. A. Cordon-Franco, A. Fernandez-Margarit and F. F. Lara-Martin: Fragments of Arithmetic and True Sentences, Mathematical Logic Quarterly 51 (2005), no. 3, 313-328.
4. C. Dimitracopoulos and J. Paris: The pigeonhole principle and fragments of arithmetic, Z. Math. Logik Grundlag. Math. 32 (1986), 73-80.
5. A. Fernandez-Margarit and F. F. Lara-Martin: Induction, minimization and collection for $\Delta_{n+1}$-formulas, Arch. Math. Logic 43 (2004), 505-541.
6. P. Hajek and P. Pudlak: Metamathematics of first-order arithmetic, Springer-Verlag, Berlin, 1993.
7. J. B. Paris and L. A. S. Kirby: $\Sigma_n$-collection schemas in arithmetic, Logic Colloquium '77, North-Holland, Amsterdam, 1978, 199-209.
8. J. B. Paris, A. J. Wilkie and A. R. Woods: Provability of the pigeonhole principle and the existence of infinitely many primes, J. Symbolic Logic 53 (1988), no. 4, 1235-1244.
9. C. Parsons: On a number theoretic choice schema and its relation to induction, Intuitionism and Proof Theory (Proc. Conf., Buffalo, N.Y., 1968), 459-473, North-Holland, Amsterdam, 1970.
10. T. A. Slaman: $\Sigma_n$-bounding and $\Delta_n$-induction, Proc. Amer. Math. Soc. 132 (2004), 2449-2456.
11. N. Thapen: A note on $\Delta_1$ induction and $\Sigma_1$ collection, Fund. Math. 186 (2005), no. 1, 79-84.
ARITHMETIC TURING DEGREES AND CATEGORICAL THEORIES OF COMPUTABLE MODELS*

E. FOKINA
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, 4 Acad. Koptyug avenue, Novosibirsk, 630090, Russia
Email: [email protected]

In this paper we study the complexity of uncountably categorical and of countably categorical theories with computable models. We also study the complexity of index sets of countably categorical computable $d$-decidable models.
* This work was partially supported by RFBR grant 05-01-00819 and grant of Scientific schools of Russia 4413.2006.1.

1. Introduction

One of the approaches of computable model theory deals with the existence of computable models of first-order theories. Every consistent decidable theory $T$ (that is, a theory with a computable set of theorems) has a decidable model. For uncountably categorical first-order theories, Harrington and Khissamiev [7, 8] showed that in fact all countable models of such a theory $T$ are decidable. If $T$ is uncountably categorical but not decidable, then some of its models can be computable while the others are not. Baldwin and Lachlan [1] proved that all countable models of an uncountably categorical theory can be listed in a chain of elementary embeddings $A_0 \preceq A_1 \preceq A_2 \preceq \dots \preceq A_\omega$, where $A_0$ is a prime model, $A_\omega$ is a saturated model and every $A_{i+1}$ is a minimal proper elementary extension of $A_i$. The spectrum of computable models of the theory $T$ is then the set $SCM(T) = \{ i \mid A_i \text{ has a computable presentation} \}$. Thus, the result of Harrington and Khissamiev can be presented as $SCM(T) = \omega \cup \{\omega\}$.

This result led to the investigation of computable models of $\aleph_1$-categorical undecidable theories. In [4] Goncharov showed that there exists an $\aleph_1$-categorical, $0'$-computable theory with $SCM(T) = \{0\}$. This result was generalized by Kudaibergenov in [11], where he presented an $\aleph_1$-categorical, $0'$-computable theory with $SCM(T) = \{0, 1, \dots, n\}$. In [9] Khoussainov, Nies and Shore built examples of $\aleph_1$-categorical, $0''$-computable theories $T_1$ and $T_2$ with $SCM(T_1) = \omega$ and $SCM(T_2) = \omega \cup \{\omega\} \setminus \{0\}$. Moreover, Nies in [14] built an example of an $\aleph_1$-categorical theory $T$ with $SCM(T) = \{1\}$ and proved that for an arbitrary $\aleph_1$-categorical theory $T$, $SCM(T) \in \Sigma^0_3(0^{(\omega)})$. All the examples of $\aleph_1$-categorical theories given above are $0''$-computable. In [5] Goncharov and Khoussainov, for any $n > 1$, built an example of an $\aleph_1$-categorical theory which is Turing equivalent to $0^{(n)}$ and has a computable model. Using a modification of the construction from [5], for any arithmetic degree $a$ we build an $\aleph_1$-categorical theory which is Turing equivalent to $a$ and has a computable model. Moreover, we show that every countable model of this theory has a computable presentation, that is, $SCM(T) = \omega \cup \{\omega\}$.

Lerman and Schmerl in [12] gave some sufficient conditions for a countably categorical theory to have a computable model. They showed that any countably categorical arithmetic theory $T$ for which the set of all sentences beginning with $\exists$ and having $n + 1$ changes of quantifiers is a $\Sigma^0_{n+1}$ set for all $n$ has a computable model. In [10] Knight improved this result, omitting the condition that $T$ is arithmetic but requiring certain uniformity. However, all known examples of $\aleph_0$-categorical theories with computable models were of low complexity. In [5] Goncharov and Khoussainov, for any $n$, built an example of a countably categorical theory, Turing equivalent to $0^{(n)}$, with a computable model. Using the techniques from [5], for any given arithmetic degree $a$ we build a countably categorical theory which is Turing equivalent to $a$ and has a computable model.

We now introduce some basic definitions. We fix some computable Godel numbering of a language $L$. An algebraic structure $A$ of the language $L$ is computable if its domain is computable and all basic operations and predicates are uniformly computable. This definition is equivalent to the condition that the atomic diagram of $A$ is computable. The algebraic structure $B$ is computably presentable if it is isomorphic to a computable structure.
In this case any isomorphism of $B$ onto $A$ is called a computable presentation of $B$. A complete theory $T$ is $\alpha$-categorical if any two models of $T$ of power $\alpha$ are isomorphic. It is well known that if a theory is $\beta$-categorical for some uncountable $\beta$ then it is $\alpha$-categorical for any uncountable $\alpha$.

To prove the basic results of the paper we need, as in [5], to define two operators. The construction of the operators follows the ideas of Marker from [13]. Their definition and properties are formulated in the next section. Detailed information can be found in [5]. In Section 3 we give the definition of a 1-to-1 representation of $\Sigma^0_2$-sets and state the 1-to-1 representation lemmas and corollaries. The proofs of the lemmas are in [5, 6]. In the next two sections the following theorems are proved:

Theorem 4.1. For any arithmetic Turing degree there exists an $\aleph_1$-categorical theory $T$ of a finite signature which is Turing equivalent to this degree and has a computable model. Moreover, all countable models of $T$ have a computable presentation.

Theorem 5.1. For any arithmetic Turing degree there exists an $\aleph_0$-categorical theory $T$ of a finite signature which is Turing equivalent to this degree and has a computable model.

In the last section we are interested in the complexity of the index sets of $d$-decidable $\aleph_0$-categorical models, where $d$ is arithmetic. More precisely, we prove the following:

Theorem 6.1. For every arithmetic Turing degree $d$ the index set of all $d$-decidable models has the Turing degree $d^{(3)}$ in the universal computable numbering of all computable $\aleph_0$-categorical models in the signature with one binary predicate.

2. Marker's construction

Let $L$ be a finite language with no function symbols, and let $A = (A, P_0^{n_0}, \dots, P_m^{n_m})$ be a structure of $L$. We assume that for every $P$ of this structure the sets $P$ and $A^k \setminus P$ are infinite, where $k$ is the arity of $P$. For each $k$-ary predicate $P$ of this structure we define $\exists$- and $\forall$-extensions of $P$.

Marker's $\exists$-extension of $P$ is a $(k + 1)$-ary predicate denoted by $P_\exists$ with the following properties. Let $X$ be an infinite set disjoint from $A$. Then $P_\exists$ satisfies the following conditions:
(1) If $P_\exists(a_1, a_2, \dots, a_k, a_{k+1})$ then $P(a_1, \dots, a_k)$ and $a_{k+1} \in X$.
(2) For all $a_{k+1} \in X$ there exists a unique tuple $(a_1, \dots, a_k)$ such that $P_\exists(a_1, a_2, \dots, a_k, a_{k+1})$.
(3) If $P(a_1, \dots, a_k)$ then there exists a unique $a$ such that $P_\exists(a_1, a_2, \dots, a_k, a)$.

Marker's $\forall$-extension of the predicate $P$ is a $(k + 1)$-ary predicate $P_\forall$ with the following properties. Let $X$ be an infinite set disjoint from $A$. Then $P_\forall$ satisfies the following conditions:
(1) If $P_\forall(a_1, a_2, \dots, a_k, a_{k+1})$ then $a_1, \dots, a_k \in A$ and $a_{k+1} \in X$.
(2) For all $a_1, \dots, a_k \in A$ there exists at most one $a_{k+1} \in X$ such that $\neg P_\forall(a_1, a_2, \dots, a_k, a_{k+1})$.
(3) If $P_\forall(a_1, a_2, \dots, a_k, a_{k+1})$ for all $a_{k+1} \in X$ then $P(a_1, \dots, a_k)$.
(4) For all $a_{k+1} \in X$ there exists a unique tuple $(a_1, \dots, a_k)$ such that $\neg P_\forall(a_1, a_2, \dots, a_k, a_{k+1})$.

The set $X$ in an $\exists$- or $\forall$-extension is called a fellow of $P$.

Definition 2.1. Let $A = (A, P_0^{n_0}, \dots, P_m^{n_m})$ be a model.
(1) The model $A_\exists$ is the model $(A \cup X_0 \cup \dots \cup X_m, P_0^{n_0+1}, \dots, P_m^{n_m+1}, X_0, \dots, X_m)$, where each $P_i^{n_i+1}$, $i = 0, \dots, m$, is a Marker's $\exists$-extension of $P_i^{n_i}$ such that the fellows $X_i$ of distinct predicates are pairwise disjoint sets.
(2) The model $A_\forall$ is the model $(A \cup X_0 \cup \dots \cup X_m, P_0^{n_0+1}, \dots, P_m^{n_m+1}, X_0, \dots, X_m)$, where each $P_i^{n_i+1}$, $i = 0, \dots, m$, is a Marker's $\forall$-extension of $P_i^{n_i}$ such that the fellows $X_i$ of distinct predicates are pairwise disjoint sets.

Theorem 2.1. Let $A_\exists$ and $A_\forall$ be the Marker's extensions of the model $A$. Then they satisfy the following properties:
(1) The model $A$ is definable in each of the extensions.
(2) If the theory of the model $A$ is $\aleph_0$-categorical then so is the theory of each of the extensions.
(3) If the theory of the model $A$ is $\aleph_1$-categorical then so is the theory of each of the extensions.
(4) If the theory of $A$ is almost strongly minimal then so is the theory of each of the extensions.
(5) Any automorphism of $A$ can be extended to automorphisms of each of the extensions.

Let $A$ be a structure and $w$ be a word over the alphabet $\{\exists, \forall\}$. We define $A_w$ by induction. If $w$ is the empty string then $A_w = A$. If $w = w'\exists$ or $w = w'\forall$ and $B = A_{w'}$, then $A_{w'\exists} = B_\exists$ and $A_{w'\forall} = B_\forall$. Therefore we have the following corollary:

Corollary 2.1. Let $A$ be a structure and $w$ be a word over the alphabet $\{\exists, \forall\}$. Then
(1) The model $A$ is definable in $A_w$.
(2) If the theory of the model $A$ is $\aleph_0$-categorical ($\aleph_1$-categorical) then so is the theory of $A_w$.
(3) Any automorphism of $A$ can be extended to an automorphism of $A_w$.
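To make conditions (1)-(3) of the $\exists$-extension concrete, the following Python sketch builds a finite analogue of it. This is only an illustration under stated assumptions: the construction in the text requires $X$, $P$ and the complement of $P$ to be infinite, whereas here the predicate is a finite set of tuples, and the names `marker_exists_extension` and `satisfies_exists_conditions` are hypothetical helpers, not notation from the paper.

```python
from itertools import count


def marker_exists_extension(relation):
    """Finite analogue of Marker's exists-extension of a k-ary predicate P.

    `relation` is a finite set of k-tuples standing in for P.
    Returns (fellow, extended): `fellow` plays the role of the set X and
    `extended` the role of the (k+1)-ary predicate. Fresh elements are
    tagged pairs ("x", i), assumed not to occur in the original domain.
    """
    fresh = (("x", i) for i in count())
    fellow, extended = set(), set()
    for tup in sorted(relation):
        witness = next(fresh)          # one new witness per tuple of P
        fellow.add(witness)
        extended.add(tup + (witness,))
    return fellow, extended


def satisfies_exists_conditions(relation, fellow, extended):
    """Check conditions (1)-(3) of the exists-extension on finite data."""
    # (1) every extended tuple projects into P and ends in X
    cond1 = all(t[:-1] in relation and t[-1] in fellow for t in extended)
    # (2) every element of X is the last coordinate of exactly one tuple
    cond2 = all(sum(1 for t in extended if t[-1] == x) == 1 for x in fellow)
    # (3) every tuple of P has exactly one witness
    cond3 = all(sum(1 for t in extended if t[:-1] == p) == 1 for p in relation)
    return cond1 and cond2 and cond3
```

In this finite setting the original predicate is recovered by projecting away the last coordinate, which mirrors the fact that $P(a_1, \dots, a_k)$ holds exactly when some witness $a$ satisfies $P_\exists(a_1, \dots, a_k, a)$.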
Our next goal is to show that $A_{\forall\exists}$ is less complex than $A$ itself from a computability-theoretic point of view.

3. On one-to-one representation of $\Sigma^0_2$-sets

The following definition and lemmas can be found in [5]. We will need them for the proof of the main results of the paper.

Definition 3.1. A $\Sigma^0_2$-set $A$ is one-to-one representable if for some computable predicate $Q \subseteq \omega^3$ the following is true:
(1) For every $n \in \omega$, $\exists a\, \forall b\, Q(n, a, b)$ if and only if $n \in A$.
(2) For every $n \in \omega$, $\exists a\, \forall b\, Q(n, a, b)$ if and only if $\exists^{=1} a\, \forall b\, Q(n, a, b)$ (a).
(3) For every $b$ there exists a unique pair $\langle n, a \rangle$ such that $\neg Q(n, a, b)$.
(4) For every pair $\langle n, a \rangle$ either $\exists^{=1} b\, \neg Q(n, a, b)$ or $\forall b\, Q(n, a, b)$.
(5) For every $a$ there exists a unique $n$ such that $\forall b\, Q(n, a, b)$.
Lemma 3.1. Let $A$ be a coinfinite $\Sigma^0_2$-set that possesses an infinite computable subset $S$ such that $A \setminus S$ is infinite. Then $A$ has a one-to-one representation.

The definition of a one-to-one representation of a $\Sigma^0_2$-set can be relativized with respect to any oracle $X$. The relativized version of the lemma will be used in the proofs in the next sections.

Lemma 3.2. Let $A$ be a coinfinite $\Sigma^{0,X}_2$ set that possesses an infinite $X$-computable subset $S$ such that $A \setminus S$ is infinite. Then there exists an $X$-computable set $Q$ such that $Q$ is a one-to-one representation of $A$.

The following two theorems are corollaries of Lemma 3.2 and Corollary 2.1.

Theorem 3.1. For any Turing degree $d$ and for every computable sequence of $d''$-computable models $M_0, \dots, M_n, \dots$ of a finite signature there exists a computable sequence $(M_0)_{\forall\exists}, \dots, (M_n)_{\forall\exists}, \dots$ of $d$-computable models.

Proof of Theorem 3.1. The proof of Lemma 3.1 in [5] shows that the construction of one-to-one representations may be arranged uniformly for all $n$. Using the uniform version of Lemma 3.1 one can construct the sequence $(M_0)_{\forall\exists}, \dots, (M_n)_{\forall\exists}, \dots$ and show that every $(M_n)_{\forall\exists}$ is $d$-computable.

(a) $\exists^{=1} x\, P(x)$ means that there exists a unique $x$ satisfying $P$.
Theorem 3.2. For every Turing degree $d$ a model $M$ is $d$-decidable if and only if $M_\forall$ and $M_\exists$ are $d$-decidable.

Proof of Theorem 3.2. According to Theorem 2.1 the model $M$ is definable in each of the extensions $M_\forall$ and $M_\exists$. Therefore, if $M_\forall$ or $M_\exists$ is $d$-decidable then so is $M$. On the other hand, the properties of $M_\forall$ or $M_\exists$ are completely determined by $M$. Thus, if $M$ is $d$-decidable then $M_\forall$ and $M_\exists$ are $d$-decidable.

4. $\aleph_1$-categorical theory with computable models

Theorem 4.1. For any arithmetic Turing degree there exists an $\aleph_1$-categorical theory $T$ of a finite signature which is Turing equivalent to this degree and has a computable model. Moreover, all countable models of $T$ have a computable presentation.

Proof. In the proof we follow the ideas from [5]. Let $X$ be a $\Sigma^0_n$-set. We consider the structure $M = (M, P)$, where $P$ is a binary predicate on $M$ with the following properties:
(1) The predicate $P$ is antireflexive, that is, $\neg P(x, x)$ for all $x$.
(2) For any $x$ there exists a unique $y$ such that $P(x, y)$. For any $y$ there exists a unique $x$ such that $P(x, y)$.
(3) $m \in X$ if and only if there exists a unique $P$-cycle of length $3m + 1$ and there is no $P$-cycle of length $3m + 2$. $m \notin X$ if and only if there exists a unique $P$-cycle of length $3m + 2$ and there is no $P$-cycle of length $3m + 1$. For any $m$ there exists a $P$-cycle of length $3m$.
(4) Any $x \in M$ is in some $P$-cycle.
We denote $T_X = Th(M)$.
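The coding of $X$ by cycle lengths in property (3) can be illustrated on a finite truncation. The Python sketch below is not part of the proof: it assumes $m$ ranges over $1 \le m < \text{bound}$ (a cycle of length 1 would contradict antireflexivity, so $m = 0$ is skipped), it builds exactly one cycle of each required length, and the names `build_cycles`, `cycle_lengths` and `decode_set` are hypothetical.

```python
def build_cycles(X, bound):
    """Successor map of P on a finite truncation of the structure.

    For each 1 <= m < bound there is one cycle of length 3m and one cycle
    of length 3m+1 (if m is in X) or 3m+2 (if m is not in X).
    """
    succ = {}
    next_free = 0
    for m in range(1, bound):
        coding_length = 3 * m + 1 if m in X else 3 * m + 2
        for length in (3 * m, coding_length):
            nodes = range(next_free, next_free + length)
            for i in range(length):
                succ[nodes[i]] = nodes[(i + 1) % length]
            next_free += length
    return succ


def cycle_lengths(succ):
    """Sorted list of the lengths of all cycles of the permutation succ."""
    seen, lengths = set(), []
    for start in succ:
        if start in seen:
            continue
        steps, v = 0, start
        while v not in seen:
            seen.add(v)
            v = succ[v]
            steps += 1
        lengths.append(steps)
    return sorted(lengths)


def decode_set(succ, bound):
    """Recover X below bound: m is in X iff a cycle of length 3m+1 exists."""
    lengths = set(cycle_lengths(succ))
    return {m for m in range(1, bound) if 3 * m + 1 in lengths}
```

For example, `decode_set(build_cycles({1, 3}, 5), 5)` recovers the part of the coded set below the bound, which is the finite shadow of the equivalence between $T_X$ and $X$ claimed for the full structure.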
The properties of the theory $T_X$ are:
(1) The theory $T_X$ is $\aleph_1$-categorical.
(2) The theory $T_X$ is complete.
(3) $T_X \equiv_T X$.

It is easily seen that, up to isomorphism, $M$ has a presentation $M = (\omega, P)$, where $P \in \Sigma^0_n$ and $P$ satisfies the conditions of Lemma 3.2: $P$ is coinfinite and possesses an infinite $0^{(n-1)}$-computable subset $S$ of $3m$-cycles such that $P \setminus S$ is infinite. According to Lemma 3.2, $P$ has a 1-to-1 representation, that is, there exists $Q_1 \subseteq \omega^4$ such that $Q_1 \in \Sigma^0_{n-2}$ and
(1) For all $(x, y)$, $\exists a\, \forall b\, Q_1(x, y, a, b)$ if and only if $(x, y) \in P$.
(2) For all $(x, y)$, $\exists a\, \forall b\, Q_1(x, y, a, b)$ if and only if $\exists^{=1} a\, \forall b\, Q_1(x, y, a, b)$.
(3) For every $b$ there exists a unique tuple $\langle x, y, a \rangle$ such that $\neg Q_1(x, y, a, b)$.
(4) For every $\langle x, y, a \rangle$ either $\exists^{=1} b\, \neg Q_1(x, y, a, b)$ or $\forall b\, Q_1(x, y, a, b)$.
(5) For every $a$ there exists a unique $(x, y)$ such that $\forall b\, Q_1(x, y, a, b)$.
Let us consider the models $M_1 = M_{\forall\exists}$ and $N_1 = (M \cup X_1 \cup X_2 \cup Y_1, P_1^4, A_1^1, A_2^1, B_1^1)$, where $X_1$, $X_2$, $Y_1$ are infinite pairwise disjoint sets that are disjoint from $M$; $P_1(x, y, u_1, v_1) \iff x, y \in M$, $u_1 \in X_1$, $v_1 \in Y_1$ and $Q_1(x, y, u_1, v_1)$; $A_1$ and $A_2$ are for $X_1$ and $X_2$ correspondingly; $v_1 \in Y_1 \iff (\exists u_2 \in X_2) P_1(u_1, u_2)$; and the predicates satisfy the conditions from the definition of the $\exists$-extension. Then $P_1$ is $0^{(n-1)}$-computable, $N_1$ is $0^{(n-1)}$-computable and the following holds:
(1) From the definition of $P_1$ and the properties of $P$ it follows that $P_1$ satisfies the conditions of Lemma 3.2.
(2) The mapping $F : X_1 \to P$ such that $F(u_1) = (x, y) \iff (\forall v_1) P_1(x, y, u_1, v_1)$ is a bijection. Proof. From the properties of $Q_1$, for every $u_1 \in X_1$ there exists a unique pair $(x, y) \in P$ such that $(\forall v_1) Q_1(x, y, u_1, v_1)$, i.e. $(\forall v_1) P_1(x, y, u_1, v_1)$; for every pair $(x, y) \in P$ there exists a unique element $u_1 \in X_1$ such that $(\forall v_1) Q_1(x, y, u_1, v_1)$, i.e. $(\forall v_1) P_1(x, y, u_1, v_1)$.
(3) The mapping $G$ such that $G(x, y, u_1) = v_1 \iff \neg P_1(x, y, u_1, v_1)$ is a bijection. Proof. From the properties of $Q_1$, for every tuple $(x, y, u_1)$ either there exists a unique element $v_1 \in Y_1$ such that $\neg Q_1(x, y, u_1, v_1)$ (i.e. $\neg P_1(x, y, u_1, v_1)$) or $(\forall v_1) Q_1(x, y, u_1, v_1)$, i.e. $(\forall v_1) P_1(x, y, u_1, v_1)$; for every $v_1 \in Y_1$ there exists a unique $(x, y, u_1)$ such that $\neg Q_1(x, y, u_1, v_1)$, i.e. $\neg P_1(x, y, u_1, v_1)$.

Thus, from properties 2 and 3 it follows that $M_1 \cong N_1$. The properties of $M_1$ guarantee that $T_1 = Th(M_1) = Th(N_1)$ is $\aleph_1$-categorical and $T_1 \equiv_T T_X \equiv_T X$.

By induction we build $M_0 = M, M_1, \dots, M_{n-1}$, where $M_k = (M_{k-1})_{\forall\exists}$. We also build $N_1, \dots, N_{n-1}$. Using the lemma we find $Q_k$, which is a one-to-one representation of $P_{k-1}$. We define

$N_k = (M \cup X_{1,1} \cup X_{2,1} \cup X_{1,2} \cup \dots \cup X_{1,k} \cup \dots \cup X_{2^{2k-1},k} \cup Y_{1,1} \cup Y_{1,2} \cup \dots \cup Y_{2^{2k-2},k},\ P_k^{2k+2},\ A^1_{1,1}, \dots, A^1_{1,k}, \dots, A^1_{2^{2k-1},k},\ B^1_{1,1}, \dots, B^1_{2^{2k-2},k})$,

where the sets $X_{i,j}$, $Y_{i,j}$ and $M$ are pairwise disjoint; $P_k(x, y, u_{1,1}, u_{1,2}, \dots, u_{1,k}, v_{1,1}, \dots, v_{1,k}) \iff x, y \in M$, $u_{i,j} \in X_{i,j}$, $v_{i,j} \in Y_{i,j}$ and $Q_k(x, y, u_{1,1}, u_{1,2}, \dots, u_{1,k}, v_{1,1}, \dots, v_{1,k})$, and $v_{1,1} \in Y_{1,1} \iff \exists u_{2,k}\, \forall v_{2,k} \dots \forall v_{2^{2k-1},1}\, (u_{1,1}, u_{2,k})$, etc.; $A^1_{1,k}, \dots, A^1_{2^{2k-1},k}$ are for $X_{1,k}, \dots, X_{2^{2k-1},k}$.

Similarly to the case $k = 1$, $P_k$ satisfies the conditions of the lemma, $Q_k$ and $P_k$ are $0^{(n-k-1)}$-computable, and $N_k$ is $0^{(n-k-1)}$-computable. From the properties of $M_k$ and $N_k$: $M_k \cong N_k$, $T_k = Th(N_k)$ is $\aleph_1$-categorical, and $T_k \equiv_T T_{k-1} \equiv_T \dots \equiv_T T_X \equiv_T X$. In particular, $N_{n-1}$ is computable and its theory $T_{n-1} = Th(N_{n-1})$ is $\aleph_1$-categorical and $T_{n-1} \equiv_T X$.

We prove now that all models of $T_{n-1} = Th(N_{n-1})$ have a computable presentation. Let us note the following. Let $A = (A, P)$ be some algebraic structure and let $A_{\forall\exists} = (A \cup X_1 \cup X_2 \cup Y_1, P_{\forall\exists}, U_1, U_2, V_1)$ be its Marker extension. We define $A_1 = (A, P_1)$ as a structure in which $A_1 \models P_1(x, y) \iff A_{\forall\exists} \models \exists u_1\, \forall v_1\, P_{\forall\exists}(x, y, u_1, v_1)$. The properties of the Marker extensions guarantee that $P_{\forall\exists}$ is a 1-to-1 representation of $P_1$ and $P = P_1$, $A = A_1$.

As was proved in [1], all models $M_i$, $i = 1, 2, \dots$, of the theory $T_X$ can be listed in a chain of elementary embeddings from the prime model to the saturated model. The prime model consists only of finite $P$-cycles and every next model contains one more infinite chain than the previous one. Thus, every model $M_i$ of $T_X$ is $X$-computably presentable. We apply to $M_i$ the operator $\forall\exists$ $n - 1$ times and we get a computable model $(M_i)_{(\forall\exists)^{n-1}}$ of the theory $T_{n-1}$. Moreover, if $m \in X$ then
$SCM(T_{n-1}) = \omega \cup \{\omega\}$.

5. $\aleph_0$-categorical theory with computable model

In this section we prove the following theorem.

Theorem 5.1. For any arithmetic Turing degree there exists an $\aleph_0$-categorical theory $T$ of a finite signature which is Turing equivalent to this degree and has a computable model.
Proof. We code a $\Sigma^0_{n+1}$ set $Y$ into an $\aleph_0$-categorical theory $T_Y$ so that $Y$ and $T_Y$ have the same Turing degree. The construction of $T_Y$ is similar to the construction in [5]. The language of $T_Y$ consists of one binary predicate $R$. For every $n \in \omega$ we define a cycle $C_n = (\{0, 1, \dots, n + 2\}, R)$ of length $n + 3$, where $R(x, y)$ is true if and only if $\{x, y\} = \{i, i + 1\}$ or $\{x, y\} = \{n + 2, 0\}$. Let $K_Y$ be the class of all finite graphs $G$ such that $m \in Y$ if and only if $C_{3m+1} \in K_Y$ and $C_{3m+2} \notin K_Y$, and $m \notin Y$ if and only if $C_{3m+2} \in K_Y$ and $C_{3m+1} \notin K_Y$.

As in [5], the class $K_Y$ satisfies the following properties. If $A$, $B_1$, $B_2$ are in $K_Y$ and there are embeddings $e : A \to B_1$, $f : A \to B_2$, then there exist $C$ from $K_Y$ and embeddings $g : B_1 \to C$ and $h : B_2 \to C$ such that $g \circ e = h \circ f$. If $A \in K_Y$ and $B$ is a subgraph of $A$ then $B \in K_Y$. For all $A$ and $B$ from $K_Y$ there exists $C \in K_Y$ such that there are embeddings of $A$ and $B$ into $C$.

The axioms of $T_Y$ are the following. The first of them states that $R$ is antireflexive. The infinite list of universal sentences says that any $B \notin K_Y$ cannot be embedded into models of $T_Y$. The infinite list of $\forall\exists$-sentences guarantees the following property: if $A, B \in K_Y$ and there is an embedding $f : A \to B$, then there is an extension $A'$ of $A$ and an isomorphic mapping $f' : A' \to B$ that extends $f$. The sentence that guarantees the existence of a $(3n + 1)$-cycle belongs to $T_Y$ if and only if $n \in Y$. At the same time, $n \notin Y$ if and only if the sentence that guarantees the existence of a $(3n + 2)$-cycle belongs to $T_Y$. Therefore, $Y$ and $T_Y$ have the same Turing degree.

Using the properties of $K_Y$, for every $A \in K_Y$ we find $A^*$ that satisfies the following property: for all $B, C \in K_Y$ and an embedding $f$ of $B$ into $A$ such that $B$ is a subgraph of $C$ and $card(C) = card(B) + 1$, there is an embedding $g : C \to A^*$ that extends $f$. Then the model $A$ of $T_Y$ is constructed as follows. Let $A_0$ be a model from $K_Y$. We consider the chain $A_0 \subseteq A_1 \subseteq A_2 \subseteq \dots$ of models from $K_Y$ such that $A_{n+1} = A_n^*$. Let $A$ be the union of the chain.
Then $A$ is a model of $T_Y$. Using the back-and-forth method we can show that any two countable models $A$ and $B$ of $T_Y$ are isomorphic. That is, $T_Y$ is $\aleph_0$-categorical.
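The cycles $C_n$ used in the coding above are ordinary cycle graphs, and a short Python sketch may help fix the indexing. It is an illustration only, assuming that $i$ in the definition ranges over $0, \dots, n + 1$; the names `cycle_graph` and `coded_cycle_index` are hypothetical.

```python
def cycle_graph(n):
    """Edge relation R of the cycle C_n on vertices 0..n+2 (length n+3).

    R(x, y) holds iff {x, y} = {i, i+1} for some i, or {x, y} = {n+2, 0}.
    The relation is returned as a set of ordered pairs.
    """
    size = n + 3
    edges = set()
    for i in range(size):
        j = (i + 1) % size          # i = n+2 wraps around to 0
        edges.add((i, j))
        edges.add((j, i))           # R depends only on the set {x, y}
    return edges


def coded_cycle_index(m, Y):
    """Index of the cycle admitted for m: 3m+1 if m is in Y, else 3m+2."""
    return 3 * m + 1 if m in Y else 3 * m + 2
```

Each vertex of `cycle_graph(n)` has exactly two neighbours and the relation is irreflexive and symmetric, in line with the first axiom of $T_Y$.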
As $T_Y \equiv_T Y$, $T_Y$ has a model $A = (\omega, R)$ with $R \in \Sigma^0_{n+1}$, and $R$ satisfies the conditions of Lemma 3.2. We construct the sequence of models $\{A\}_j$
is a $d$-decidable model}.
Theorem 6.1. For every arithmetic Turing degree $d$ the index set $CK_d$ of all $d$-decidable models has the Turing degree $d^{(3)}$ in the universal computable numbering of all computable $\aleph_0$-categorical models in the signature with one binary predicate.

Lemma 6.1. $CK_d \in \Sigma^{0,d}_3$.

Proof of Lemma 6.1. We need to write that there exists a $d$-computable function which is the characteristic function of the full diagram of $M_n$. This statement is a $\Sigma^{0,d}_3$ sentence.

Lemma 6.2. For every $d$ there exists an ordering $L_d = (\mathbb{N}, \le_d)$ that is $d$-computable and has the type $\omega + \omega^*$, but $\omega$ is not $d$-c.e.

Proof of Lemma 6.2. The lemma is the relativized modification of the lemma from [3]. The original version states the existence of a computable ordering of the type $\omega + \omega^*$ such that its initial segment of the type $\omega$ is not computable. It is not hard to see that in this case $\omega$ or $\omega^*$ is not c.e., because otherwise we could enumerate both of them and, therefore, both would be computable.
Lemma 6.3. If $A \in \Sigma^{0,d}_3$ then there exists a $d$-computable predicate $Q(n, x, y)$ such that

$n \in A \iff (\exists x)(\exists^\infty y)\, Q(n, x, y)$

and for all $n$, $Q(n, 0, 0)$ and $Q(n, 0, 1)$.

Proof of Lemma 6.3 can be found in [15].

Proof of Theorem 6.1. For all $x \in L_d$ and for all $n$ we consider the set $L_{(n,x)} = \{(x', y') \mid x' \le x \text{ and } Q(n, x', y')\}$, which is uniformly $d$-computable. Let $R_{(n,x)}$ be a linear ordering of $L_{(n,x)}$ such that if $L_{(n,x)}$ is infinite then $(L_{(n,x)}, R_{(n,x)})$ has the type $\eta$ and $(0, 0) < (0, 1)$; if $L_{(n,x)}$ is finite then $(L_{(n,x)}, R_{(n,x)})$ is a linear ordering. We define now

$L_n = \sum_{x \in L_d} (L_{(n,x)}, R_{(n,x)})$.

If $n \in A$ then according to Lemma 6.3 $(\exists x_0)(\exists^\infty y)\, Q(n, x_0, y)$. Thus, for all $x$ such that $x_0 \le x$ the set $L_{(n,x)}$ is infinite. By the definition of $R_{(n,x)}$,

$L_n = \sum_{k=0}^{r(n)} (S_k + P_k)$,

where $S_k$ is finite and $P_k = \eta$ for every $k \le r(n)$, and $r(n)$ is finite and depends on $n$. Thus, $L_n$ is $d$-decidable. If $n \notin A$ then for all $x$ the set $L_{(n,x)}$ is finite and $L_n = \omega + \omega^*$. By Lemma 6.2, $L_n$ is $d$-computable as $L_d$ is $d$-computable. At the same time $L_n$ is not $d$-decidable: if $L_n$ were $d$-decidable then we could enumerate $\omega$ with the oracle $d$, but $\omega$ is not c.e. over $d$.

For every $n$ the structure $L_n$ satisfies the conditions of Lemma 3.2. There exists $m$ such that each $R_n$ is a coinfinite $\Sigma^0_m$-set. The set $S_n$ of all pairs $(0, 0)$ and $(0, 1)$ from all $L_{(n,x)}$ forms an infinite subset which is $0^{(m-1)}$-computable, and Lemma 3.2 gives a 1-to-1 representation $R_n^1$ of $R_n$. Again, as in the previous two sections, for all $n$ we build the sequence of models $L_n^i$, $1 \le i \le m$, such that $L_n^0$ is $L_n$ and each $L_n^i$ is a $\forall\exists$-extension of $L_n^{i-1}$. Now we use Theorems 3.1 and 3.2. Each $L_n^i$ is $0^{(m-i)}$-computable. According to Corollary 2.1, for all $n$ and for all $i \le m$, $L_n^i$ is $\aleph_0$-categorical. In particular, $L_n^m$ is computable and $\aleph_0$-categorical. If $n \in A$ then $L_n^m$ is $d$-decidable, and if $n \notin A$ then $L_n^m$ is not $d$-decidable.
Corollary. The index set of all decidable models has the Turing degree $0^{(3)}$ in the universal computable numbering of all computable $\aleph_0$-categorical models in the signature with one binary predicate.
References
1. J. Baldwin, A. Lachlan, On Strongly Minimal Sets, Journal of Symbolic Logic, 36, 1971, 79-96.
2. C. C. Chang and H. J. Keisler, Model Theory, 3rd ed., Stud. Logic Found. Math., 73, 1990.
3. Yu. Ershov, Theory of Numberings 3, Novosibirsk, Novosibirsk State University, 1974.
4. S. Goncharov, Constructive Models of $\aleph_1$-categorical Theories, Matematicheskie Zametki, 23, 1978, 885-888.
5. S. Goncharov and B. Khoussainov, Complexity of categorical theories with computable models, Dokl. Russian Academy of Science, 2002, 385, N 3, 299-301.
6. S. Goncharov and B. Khoussainov, On complexity of theories of computable $\aleph_1$-categorical models, Vestnik of Novosibirsk State University, series: mathematics, mechanics and informatics, 2001, 1, N 2, 63-76.
7. L. Harrington, Recursively Presentable Prime Models, J. of Symbolic Logic, 39, 1974, 305-309.
8. N. Khisamiev, Strongly Constructive Models of a Decidable Theory, Izv. Akad. Nauk Kazakh. SSR, Ser. Fiz.-Mat., 1, 1974, 83-84.
9. B. Khoussainov, A. Nies, R. Shore, On Recursive Models of Theories, Notre Dame Journal of Formal Logic 38, 2, 1997, 165-178.
10. J. Knight, Nonarithmetical $\aleph_0$-categorical Theories with Recursive Models, J. Symbolic Logic, 59, N 1, 1994, 106-112.
11. K. Kudaibergenov, On Constructive Models of Undecidable Theories, Siberian Mathematical Journal, v. 21, no 5, 1980, 155-158.
12. M. Lerman, J. Schmerl, Theories With Recursive Models, J. Symbolic Logic 44, N 1, 1979, 59-76.
13. D. Marker, Non-$\Sigma_n$-axiomatizable almost strongly minimal theories, J. Symbolic Logic 54, 1989, 921-927.
14. A. Nies, A new spectrum of recursive models, Notre Dame Journal of Formal Logic, 40, 1999, 307-314.
15. H. Rogers, Theory of recursive functions and effective computability, McGraw-Hill, 1967.
EQUIVALENCE RELATIONS AND CLASSICAL BANACH SPACES*

SU GAO
Department of Mathematics, PO Box 311430, University of North Texas, Denton, TX 76210, U.S.A.
Email: [email protected]

We give a survey of results on Borel reducibility among equivalence relations induced by classical Banach spaces. We present an application of this study to a classification problem related to the big O notation. Finally we study a question of Kanovei and provide some information on the complexity of the equivalence relations involved.
1. Preamble

Classical Banach spaces and their actions give important examples of equivalence relations and are intensively studied in the descriptive set theory of Borel reducibility. In this article we give a survey of the results related to these equivalence relations and discuss some intriguing open problems. Many folklore results in the area have simple proofs which are hard to find in the literature; for some of them we give the proofs here. Our selection of results is not complete and inevitably reflects personal taste. At times attributions of the results are hard to determine and might not be accurate. The main objective of the paper is to provide an overview of the area for the reader and to motivate further research.

We recall the definition of Borel reducibility. Let $E$, $F$ be equivalence relations on Polish spaces $X$, $Y$, respectively. We say that $E$ is Borel reducible to $F$, and denote $E \le_B F$, if there is a Borel function $\theta : X \to Y$ such that, for all $x_1, x_2 \in X$,

$x_1 E x_2 \iff \theta(x_1) F \theta(x_2)$.

If $E$ … 1), $c_0$, $\ell^\infty$ and briefly on $C_0(\mathbb{R}^+)$. There is in fact very interesting work done on non-classical Banach spaces, but it is not covered in this paper.

* Research partially supported by the U.S. NSF grant DMS-0501039. I would like to thank the Chinese Academy of Sciences for a partial travel grant.
2. $\ell^1$

This notation is now overloaded with several meanings. As a Banach space it denotes the linear subspace of $\mathbb{R}^\omega$ given by

$\ell^1 = \{ (x_n) \in \mathbb{R}^\omega : \sum_{n=0}^{\infty} |x_n| < \infty \}$

endowed with the complete norm

$\| (x_n) \|_1 = \sum_{n=0}^{\infty} |x_n|$.

As an equivalence relation its underlying space is $\mathbb{R}^\omega$ and it is defined as

$(x_n)\, \ell^1\, (y_n) \iff (x_n - y_n) \in \ell^1$,

where the $\ell^1$ on the right hand side is the above space. Sometimes the equivalence relation is also represented by the quotient space and is denoted by $\mathbb{R}^\omega / \ell^1$. This last notation emphasizes the fact that the equivalence relation is induced by the additive action of $\ell^1$ as an additive group. Thus the notation $\ell^1$ is also used to denote the Polish group under addition.

Kechris noticed early on that the equivalence relation $\ell^1$ is related to ideals. Recall that the summable ideal on $\mathbb{N}$ is defined by

$I = \{ A \subseteq \mathbb{N} : \sum_{n \in A} \frac{1}{n+1} < \infty \}$.

For any ideal $I$ on $\mathbb{N}$ define the equivalence relation $E_I$ on $2^{\mathbb{N}}$ by

$x E_I y \iff x \triangle y \in I$,

where $x, y \in 2^{\mathbb{N}}$ are understood as subsets of $\mathbb{N}$ in a natural way and $x \triangle y = (x \setminus y) \cup (y \setminus x)$ is the symmetric difference of $x$ and $y$. The following simple fact was the starting point of the study of the equivalence relation $\ell^1$.

Lemma 2.1 (Kechris). Let $I$ be the summable ideal on $\mathbb{N}$. Then $\ell^1 \sim_B E_I$.

Hjorth discovered the following important dichotomy regarding equivalence relations below $\ell^1$ in the Borel reducibility hierarchy. Recall that an equivalence relation $F$ is countable if every $F$-equivalence class is countable,
and an equivalence relation E is essentially countable if there is a countable equivalence relation F such that E ^rriin > m (x(m) = y(m)). In the same spirit Kanovei asked the following concrete question. Question 2.3 (Kanovei [12]). 7s E^
<==> 3g eG(gxi
=x2).
Theorem 2.4 (GaoPestov [9]). Let G be any abelian Polish group and X be a Polish Gspace. Then there is a Polish i1space Y such that EQ
gF = {g + f :
f£F}. FY/*1 ^
Then results of [9] show that the orbit equivalence relation E£1 universal orbit equivalence relation for abelian Polish group actions.
is a
This last result was proved indirectly. It is still of interest to produce a meaningful reduction from £} to Eel . In other words, which closed 1 1 subsets of I can code elements of IR' ' in an ^invariant way? A more intriguing question is whether it is possible to Borel reduce .Eoo to any orbit equivalence relation by abelian Polish group action. This is weaker than Kanovei's question above, but a negative answer would be more striking.
Question 2.5. Is £ M
E^?
3. £P (p > 1) Similar to the situation for tl, £p for all p > 1 have been investigated as equivalence relations and Polish groups in action. One naturally wonders about their Borel reducibility, and Dougherty and Hjorth gave the full answer. Theorem 3.1 (DoughertyHjorth [3]). For any l
£p
Thus they form a chain of order type $\mathbb{R}^{\ge 0}$ (the nonnegative real numbers). A natural question is whether they exhaust all equivalence relations on a chain in the Borel reducibility hierarchy. We note next that this is not the case. For this we need a definition. Let $X_n$, $n \in \omega$, be Polish spaces and $E_n$ be equivalence relations on $X_n$ respectively. The direct sum of the $X_n$, denoted $\bigoplus_{n \in \omega} X_n$ or simply $X_\omega$, is the disjoint union of all $X_n$ with the topology naturally induced by the topologies of all $X_n$, with each $X_n$ clopen. The direct sum of the $E_n$, denoted $\bigoplus_{n \in \omega} E_n$ or simply $E_\omega$, is defined by
$$x E_\omega y \iff \exists n \in \omega\, (x, y \in X_n \wedge x E_n y).$$
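As a small illustration of the direct sum, the following Python sketch models points of the disjoint union as tagged pairs `(n, x)`. This encoding, the helper name `direct_sum`, and the parity relation are assumptions chosen for the example; the topological part of the definition is not modelled.

```python
def direct_sum(relations):
    """Return E_omega on tagged points (n, x):
    (n, x) and (m, y) are related iff n == m and x E_n y."""
    def e_sum(p, q):
        (n, x), (m, y) = p, q
        return n == m and relations[n](x, y)
    return e_sum

# Hypothetical summands: three copies of "same parity" on the integers.
parity = lambda a, b: a % 2 == b % 2
E_sum = direct_sum([parity, parity, parity])

same_copy_related = E_sum((0, 1), (0, 3))      # same summand, same parity
different_copies = E_sum((0, 1), (1, 3))       # different summands: never related
same_copy_unrelated = E_sum((2, 1), (2, 4))    # same summand, different parity
```

Points lying in different summands are never related, which is the feature used when a single relation is shown to contain countably many disjoint copies of itself.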
In particular, if $X_n = X$ and $E_n = E$ for all $n \in \omega$, $E_\omega$ is an equivalence relation on $X_\omega$. An equivalence relation $E$ on $X$ is called splitting if there is a Borel isomorphism $\varphi : X_\omega \to X$ such that
$$x E_\omega y \iff \varphi(x) E \varphi(y).$$
By the Schröder-Bernstein theorem $E$ is splitting iff there is a Borel isomorphic embedding $\varphi : X_\omega \to X$ as above, so that $\varphi(X_\omega)$ is an $E$-invariant subset of $X$. We note the following simple fact.

Lemma 3.2. For any $p > 1$, $\ell^p$ is splitting.
Proof. Let $Y_n \subseteq \mathbb{R}^\omega$ be the set of all $(x_k)$ such that $x_k = n$ if $k$ is even. Let $E_n$ be $\ell^p \restriction Y_n$. It is clear that $E_n$ is Borel isomorphic to $\ell^p$. However, $E_\omega = \ell^p \restriction \bigcup_{n \in \omega} Y_n$. In particular, for $(x_k) \in Y_n$, $(x'_k) \in Y_m$ and $n \ne m$, $(x_k)$ is not $\ell^p$-equivalent to $(x'_k)$. Finally, notice that the saturation of $Y_\omega$ in $\mathbb{R}^\omega$ can be obtained by an action of $\ell^p$ on $Y_\omega$. This action actually gives a Borel isomorphism of $Y_\omega$ with an invariant subset of $\mathbb{R}^\omega$. □

For each p > 1, define an equivalence relation £
. Then the DoughertyHjorth theorem implies that E^ \,
£
Proof. Toward a contradiction assume that 9 : R" —> Xu is a Borel reduction of £p to £
N (see the books [25, 16]). $\varphi_i$ denotes the partial computable function computed by program $i$ in the $\varphi$-system. $W_i$ denotes domain($\varphi_i$). $W_i$ is, then, the r.e. set/language ($\subseteq N$) accepted (or equivalently, generated) by the $\varphi$-program $i$. $\mathcal{E}$ will denote the set of all r.e. languages. $L$, with or without decorations, ranges over $\mathcal{E}$. $\overline{L}$ denotes the complement of $L$. $\mathcal{L}$, with or without decorations, ranges over subsets of $\mathcal{E}$. $\mathcal{L} = \{L_i \mid i \in N\}$ is called an indexed family iff there exists a recursive function $f$ such that $f(i, x) = 1$ iff $x \in L_i$.

2.2. Some Notions from Language Learning
We now consider some basic notions in language learning. The following definition gives the concepts of data that is presented to a learner. Part (a) considers the notion of positive data, and part (b) considers the case when both positive and negative data are given.

Definition 2.1. (Gold [10])
(a) A text $T$ is a mapping from $N$ into $(N \cup \{\#\})$. The content of a text $T$, denoted content($T$), is the set of natural numbers in the range of $T$.
(b) An infinite information sequence $I$ is a mapping from $N$ to $(N \times \{0,1\}) \cup \{\#\}$, such that if $(x, b)$ appears in the sequence, then $(x, 1-b)$ does not appear in the sequence. The content of an information sequence $I$, denoted content($I$), is the set of pairs in the range of $I$. PosInfo($I$) $= \{x \mid (x,1) \in$ content($I$)$\}$, and NegInfo($I$) $= \{x \mid (x,0) \in$ content($I$)$\}$.
(c) $T$ is a text for $L$ iff content($T$) $= L$. $I$ is an information sequence for $L$ iff PosInfo($I$) $= L$ and NegInfo($I$) $= \overline{L}$.
(d) $T[n]$ denotes the initial segment of $T$ of length $n$. Similarly, $I[n]$ denotes the initial segment of $I$ of length $n$.

We let $T$ ($I$), with or without superscripts, range over texts (information sequences). Intuitively, #'s in the texts/information sequences denote pauses in the presentation of data. For example, the only text for the empty language is just an infinite sequence of #'s. Note that by our convention on information sequences, PosInfo($I$) $\cap$ NegInfo($I$) $= \emptyset$. A finite sequence $\sigma$ is an initial segment of a text or an infinite information sequence. One can similarly define content($\sigma$) (and PosInfo($\sigma$), NegInfo($\sigma$) in case of $\sigma$ being an initial segment of an information sequence). SEQ denotes the set of all finite initial segments of texts. SEG denotes the set of all finite initial segments of information sequences. Note that SEQ and SEG can be coded onto $N$.

Definition 2.2. A language learning machine is an algorithmic device which computes a mapping from SEQ (or SEG) into $N$.

Later we will consider variations of learning machines. For convenience of exposition we avoid defining these variants until we need them. We let $\mathbf{M}$, with or without decorations, range over learning machines.

We say that $\mathbf{M}(T){\downarrow} = i \iff (\forall^\infty n)[\mathbf{M}(T[n]) = i]$. Convergence on information sequences is similarly defined. We now define some common criteria for learning. Our first criterion is based on the learner, given a text for the language, converging to a grammar for the language.
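The notions of text, content and convergence can be made concrete with a small Python sketch. It rests on simplifying assumptions that are not part of the paper: conjectures are finite sets rather than grammar indices, the learner simply conjectures the data seen so far, and only a finite prefix of a text is inspected, so the check below illustrates convergence but cannot establish it in the limit.

```python
PAUSE = "#"

def content(segment):
    """content(sigma): the numbers occurring in a finite segment, ignoring pauses."""
    return frozenset(x for x in segment if x != PAUSE)

def learner(segment):
    """Hypothetical learner: conjecture exactly the data seen so far.
    The frozenset stands in for a grammar of that finite language."""
    return content(segment)

def stable_from(text_prefix, n0):
    """True iff the learner outputs the same conjecture on T[n] for all
    n0 <= n <= len(text_prefix)."""
    outputs = {learner(text_prefix[:n]) for n in range(n0, len(text_prefix) + 1)}
    return len(outputs) == 1

# A prefix of a text for L = {2, 3, 5}, with pauses and repetitions.
T = [3, PAUSE, 2, 3, 5, PAUSE, 5, 2, PAUSE, PAUSE]
early = learner(T[:3])        # conjecture after seeing 3, #, 2
late = learner(T[:5])         # conjecture once 5 has appeared
```

On any text for a finite language every element eventually appears, after which this learner never changes its conjecture; this is the behaviour that the convergence notation above describes.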
Definition 2.3. (Gold [10], Case and Lynes [6], Osherson and Weinstein [22]) Let $a \in N \cup \{*\}$.
(a) $\mathbf{M}$ $\mathbf{TxtEx}^a$-identifies $L$ (written: $L \in \mathbf{TxtEx}^a(\mathbf{M})$) $\iff$ $(\forall$ texts $T$ for $L)(\exists i \mid W_i =^a L)[\mathbf{M}(T){\downarrow} = i]$.
(b) $\mathbf{TxtEx}^a = \{\mathcal{L} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{TxtEx}^a(\mathbf{M})]\}$.

The criterion we call $\mathbf{TxtEx}^0$ is due to Gold [10]. The $a > 0$ case is from Case and Lynes [6] (Osherson and Weinstein [22] independently introduced the $a = *$ case). We refer the reader to Pinker [24], Wexler and Culicover [28], Wexler [27], Osherson, Stob, and Weinstein [19, 20, 21], and Jain et al. [14] for further discussion on the paradigm. The next definition is based on the learner semantically rather than syntactically converging to the grammar(s) for the language.

Definition 2.4. (Case and Lynes [6]) Let $a \in N \cup \{*\}$.
(a) $\mathbf{M}$ $\mathbf{TxtBc}^a$-identifies $L$ (written: $L \in \mathbf{TxtBc}^a(\mathbf{M})$) $\iff$ $(\forall$ texts $T$ for $L)(\forall^\infty n)[W_{\mathbf{M}(T[n])} =^a L]$.
(b) $\mathbf{TxtBc}^a = \{\mathcal{L} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{TxtBc}^a(\mathbf{M})]\}$.

The $a \in \{0, *\}$ cases were independently introduced by Osherson and Weinstein [22, 23]. The corresponding notion in the case of learning functions was introduced by Barzdins [2] and Case and Smith [7]. We now consider the corresponding learning criteria when information sequences are provided to the learner.

Definition 2.5. (Gold [10] and Case and Lynes [6]) Let $a \in N \cup \{*\}$.
(a) $\mathbf{M}$ $\mathbf{InfEx}^a$-identifies $L$ (written: $L \in \mathbf{InfEx}^a(\mathbf{M})$) $\iff$ for all information sequences $I$ for $L$, $\mathbf{M}(I){\downarrow}$ and $W_{\mathbf{M}(I)} =^a L$. $\mathbf{InfEx}^a = \{\mathcal{L} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{InfEx}^a(\mathbf{M})]\}$.
(b) $\mathbf{M}$ $\mathbf{InfBc}^a$-identifies $L$ (written: $L \in \mathbf{InfBc}^a(\mathbf{M})$) $\iff$ for all information sequences $I$ for $L$, $(\forall^\infty n)[W_{\mathbf{M}(I[n])} =^a L]$. $\mathbf{InfBc}^a = \{\mathcal{L} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{InfBc}^a(\mathbf{M})]\}$.

We often write $\mathbf{TxtEx}$ (respectively, $\mathbf{TxtBc}$, $\mathbf{InfEx}$, $\mathbf{InfBc}$) for $\mathbf{TxtEx}^0$ (respectively, $\mathbf{TxtBc}^0$, $\mathbf{InfEx}^0$, $\mathbf{InfBc}^0$). The following theorem gives some basic comparison between the criteria of inference discussed above. Note that by definition, for all $a \in N \cup \{*\}$, $\mathbf{TxtEx}^a \subseteq \mathbf{InfEx}^a \cap \mathbf{TxtBc}^a$, and $(\mathbf{TxtBc}^a \cup \mathbf{InfEx}^a) \subseteq \mathbf{InfBc}^a$.

Theorem 2.1. (Gold [10], Blum and Blum [3], Case and Lynes [6] and Case and Smith [7]) For all $n \in N$, the following hold.
(a) $\mathbf{TxtEx}^{n+1} - \mathbf{InfEx}^n \ne \emptyset$.
(b) $\mathbf{TxtEx}^* - \bigcup_{m \in N} \mathbf{InfEx}^m \ne \emptyset$.
(c) $\mathbf{TxtBc} - \mathbf{InfEx}^* \ne \emptyset$.
(d) $\mathbf{TxtBc}^{n+1} - \mathbf{InfBc}^n \ne \emptyset$.
(e) $\mathbf{TxtBc}^* - \bigcup_{m \in N} \mathbf{InfBc}^m \ne \emptyset$.
(f) $\mathbf{TxtEx}^{2n} \subseteq \mathbf{TxtBc}^n$.
(g) $\mathbf{TxtEx}^{2n+1} - \mathbf{TxtBc}^n \ne \emptyset$.
(h) $\mathbf{InfEx}^* \subseteq \mathbf{InfBc}^0$.
(i) $\mathbf{InfEx} - \mathbf{TxtBc}^* \ne \emptyset$.
(j) $\mathcal{E} \in \mathbf{InfBc}^*$.

3. Identification with Finite Negative Information

We first consider the model where an apparently small finite set of negative information is given in addition to text. In part (a) of both Definitions 3.1 and 3.2 just below, $S$ is the core of negative information. The learner gets (besides the positive data) exactly this core negative data (marked as such) and no other negative data.

Definition 3.1. (Baliga, Case and Jain [1]) Suppose $a, b \in N \cup \{*\}$.
(a) $\mathbf{M}$ $\mathbf{NegF}^b\mathbf{TxtEx}^a$-identifies $L \in \mathcal{E}$ (written: $L \in \mathbf{NegF}^b\mathbf{TxtEx}^a(\mathbf{M})$) $\iff$ $(\exists S \subseteq \overline{L} \mid \mathrm{card}(S) \le b)(\forall I \mid \mathrm{PosInfo}(I) = L$ & $\mathrm{NegInfo}(I) = S)[\mathbf{M}(I){\downarrow}$ and $W_{\mathbf{M}(I)} =^a L]$.
(b) $\mathbf{NegF}^b\mathbf{TxtEx}^a = \{\mathcal{L} \subseteq \mathcal{E} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{NegF}^b\mathbf{TxtEx}^a(\mathbf{M})]\}$.

Definition 3.2. (Baliga, Case and Jain [1]) Suppose $a, b \in N \cup \{*\}$.
(a) $\mathbf{M}$ $\mathbf{NegF}^b\mathbf{TxtBc}^a$-identifies $L \in \mathcal{E}$ (written: $L \in \mathbf{NegF}^b\mathbf{TxtBc}^a(\mathbf{M})$) $\iff$ $(\exists S \subseteq \overline{L} \mid \mathrm{card}(S) \le b)(\forall I \mid \mathrm{PosInfo}(I) = L$ & $\mathrm{NegInfo}(I) = S)(\forall^\infty n)[W_{\mathbf{M}(I[n])} =^a L]$.
(b) $\mathbf{NegF}^b\mathbf{TxtBc}^a = \{\mathcal{L} \subseteq \mathcal{E} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{NegF}^b\mathbf{TxtBc}^a(\mathbf{M})]\}$.

By definition, for all $a$, $\mathbf{NegF}^0\mathbf{TxtEx}^a = \mathbf{TxtEx}^a$ and $\mathbf{NegF}^0\mathbf{TxtBc}^a = \mathbf{TxtBc}^a$. The next theorem illustrates the gain in learning power obtained by using sets of negative information with cardinality at most one/two.

Theorem 3.1. (Baliga, Case and Jain [1]) $\mathcal{E} \in \mathbf{NegF}^2\mathbf{TxtEx} \cap \mathbf{NegF}^1\mathbf{TxtEx}^1 \cap \mathbf{NegF}^1\mathbf{TxtBc}$.

In contrast to the above result, we have:
Theorem 3.2. (Baliga, Case and Jain [1]) £ (£ N e g P ^ x t E x . However N e g F 1 T x t E x is still quite powerful as shown by the following theorem. Theorem 3.3. (Baliga, Case and Jain [1]) (a) {L G £ 1 1 is infinite} G N e g F ^ x t E x . (b) N e g F ^ x t E x  TxtBc* ^ 0. (c) N e g F ^ x t E x  InfBc n ^ 0. (d) T x t E x 1 C N e g F ^ x t E x . (e) InfEx C N e g F ^ x t E x . For i > 2, it is open at present whether TxtEx 1 c N e g F x T x t E x . 4. Some other Negative Information Models Shinohara [26] considered giving to the learner atleast (but arbitrary) n negative data items. Definition 4.1. (Shinohara [26]) Let n G N. (a) Suppose L has at least n elements. M PP™identifies L (written L G P P " ( M ) ) , iff for all information sequences / such that PosInfo(J) = L and card(NegInfo(/)) > n, M(7) converges to a grammar for L. (b) P P n = {£  (3M)[£ C PP n (M)]}. Theorem 4.1. (Shinohara [26]) Let n G N. Suppose for any L G C, L contains at least n elements. Then C G T x t E x iff C G P P n . Fulk considered giving the grammar for the complement of L to the learner. For this notion consider M as being given two inputs: (a) a grammar, and (b) a text. Convergence of M(i,T) can be defined as usual. Definition 4.2. (Fulk [9]) (a) M CTxtExidentifies L (written: L G CTxtEx(M)) iff for all i such that Wj = L, for all texts T for L, M ( i , T ) converges to a grammar fori. (b) CTxtEx = { £  (3M)[£ C CTxtEx(M)]}. Fulk showed that having a grammar for the complement gives tremendous advantages. Theorem 4.2. (Fulk [9]) Let n&N.
CTxtEx  InfBc" ^ 0.
Fulk also considered the case when instead of being given a grammar for complement of L, the learner is given a sequence of grammars all but finitely many of which are grammars for L. It is not known at present whether this gives any advantages over informants. Jain and Sharma [15] considered giving a grammar for a subset of the complement of the language being learned, where this subset has certain density. Motoki [18] considered a form of open negative information as follows. Definition 4.3. (Motoki [18]) M identifies L using advisor Ai iff for all information sequences / such that PosInfo(Z) = L and Neglnfo(L) D AL, M ( / ) converges to a grammar for L. We use a general definition, though Motoki was mainly interested in indexed families. Motoki showed that there exists a class C 0 TxtEx, such that a learner M can identify each L € £ using some advisor Ai, where caxd(Ai) < 1. Motoki also gave a characterization of indexed families which can be learned using some advisor. We will be discussing a general form of open negative information in the next section. 5. Identification with Open Negative Information We now consider another model of presenting negative information to learning machines. Here the negative information is supplied in a manner reminding one of the basic open sets for the topology with respect to which enumeration operators are continuous. This is the first topology described in Exercise 1135, page 217 of Rogers [25]. These models were motivated in part by those considered by Motoki [18] (see Definition 4.3 above) and those in Section 3 above. Basically, this model allows the possibility of more negative information being supplied in addition to the finite cores of negative information. Definition 5.1. (Baliga, Case and Jain [1]) Suppose a, b € N U {*}. (a) M NegO b TxtEx a identifies L € S (written: h L G NegO TxtEx a (M)) e> (3S C I  card(5) < b)(VI  PosInfo(Z) = L & S C Neglnfo(J) C L)[M(T)i and WM{I) =a L]. 
(b) NegO b TxtEx a = {£ C £  (3M)[£ C NegO f e TxtEx a (M)]}. Thus, in contrast with Definition 3.1, in above model the learner must satisfy the stronger constraint that it needs to learn when the negative
information present in the data given to it is any S' such that S C S' C L (here S' may be infinite). Definition 5.2. (Baliga, Case and Jain [1]) Suppose a, b £ N U {*}. (a) M N e g O b T x t B c a identifies L G£ (written: L G_Neg0 6 TxtBc a (M)) <£> (3S C I  card(S) < 6)(VI  PosInfo(J) = L & S C NegInfo(/) C I ) ( V n)[WM{I[n]) = a L]. (b) NegO b TxtBc a = {C C £ \ (3M)[£ C NegO b TxtBc a (M)]}. Clearly, for all a, NegO°TxtEx° = T x t E x a and N e g O ° T x t B c a = TxtBca. Theorem 5.1 below shows that the NegO* criteria are equivalent to supplying all the negative (as well as the positive) information to a learning machine. Theorem 5.1. (Baliga, Case and Jain [1]) For all a € NegO*TxtEx a = InfEx a and NegO*TxtBc° = InfBc°.
N U {*},
Thus, in particular we have £ e NegO*TxtBc*, and NegO*TxtEx C NegF^xtEx. Note that if we consider languages such that informant for a language can be effectively obtained from its text, then above theorem shows that NegO type negative data does not help. As a corollary to Theorem 5.1, using Theorems 2.1 and 3.3, we have Corollary 5.1. (Baliga, Case and Jain [1]) (a) For all neN, TxtEx" + 1  NegO*TxtEx" ^ 0; (b) For all neN, T x t B c n + 1  NegO*TxtBc" ^ 0; (c) T x t B c  NegO*TxtEx* ^ 0; (d) N e g F ^ x t E x  NegO*TxtBc" ^ 0; (e) N e g F ^ x t E x  NegO*TxtEx* ^ 0. The above Corollary shows that there are classes of languages which can be learned with n +1 mistakes, but not with n, no matter how much open negative information is provided in the n mistake case. In other words, the gap left by the possible extra anomaly can be greater in information content than the information provided by open negative information. The following theorem generalizes Theorem 2.1(f). Theorem 5.2. (Baliga, Case and Jain [1]) For alia € NU {*} and j e N, [NegO a TxtEx 2 j C N e g O a T x t B c j ] .
The following result contrasts with Theorem 2.1(g). Theorem 5.3. (Baliga, Case and Jain [1]) TxtEx* C NegC^TxtBc. The next theorem contrasts nicely with Theorem 5.1 above. It provides classes of languages which can be learned with n + 1 pieces of core open negative information, but not with n, no matter how many anomalies are permitted in the n piece case. In other words, the extra possible negative information can be greater in information content than the information that may be omitted by the anomalies. Theorem 5.4. (Baliga, Case and Jain [1]) (a) NegC^TxtEx  NegO°TxtBc* ^ 0. (b) For all n e N, N e g O " + 1 T x t E x  NegO"TxtEx* ^ 0. (c) For all n e N, N e g O n + 1 T x t E x  \Jj€N NegO n TxtBc J ' ^ 0. The previous theorem has the following straightforward corollary. Corollary 5.2. (Baliga, Case and Jain [1]) For all a € NU {*} and j , n e N, (a) NegO"TxtEx a C N e g O n + 1 T x t E x a and (b) NegO"TxtBc i C NegO n + 1 TxtBc J '. 5.1. Complexity
Advantages of Open Negative Information
McNeill [17] posits that there is faster learning of language for children in homes in which more corrections (usually in the form of, possibly exemplary, expansions) are given. These corrections are, in part, a form of negative information. Theorem 5.5 below shows that an improvement in speed (measured by mindchanges) can result from the presence of open negative information even when the classes themselves can be learned without the negative information. For this section it is convenient to modify the definition of the learning machine to the following. Definition 5.3. A language learning machine is an algorithmic device which computes a mapping from SEQ (or SEG) into iV U {?}. Intuitively the outputted ?s represent the machine not yet committing to an output. This avoids biasing the number of mind changes before a learning machine converges. In the next definition, the subscript b represents a bound on the number of mind changes allowed before convergence.
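Counting mind changes while ignoring the initial ?s can be sketched in a few lines of Python. The sketch assumes a mind change is a position where the current output is a committed conjecture (not ?) and differs from the next output; the function name and the sample conjecture sequences are illustrative only.

```python
def mind_changes(conjectures):
    """Count positions x where conjectures[x] is not '?' and
    conjectures[x] != conjectures[x + 1]."""
    return sum(
        1
        for current, following in zip(conjectures, conjectures[1:])
        if current != "?" and current != following
    )

# Leading ?s cost nothing: the first commitment is not a change of mind.
no_change = mind_changes(["?", "?", 7, 7, 7])
# Two changes: 4 -> 9 and 9 -> 2.
two_changes = mind_changes(["?", 4, 4, 9, 9, 2])
```

Without the ? output, a machine would be forced to emit some conjecture on the empty input, and abandoning that forced guess would be counted against it.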
Definition 5.4. (Case and Smith [7], Case and Lynes [6]) Suppose a,b £ NU{*}. We say that M TxtExgidenti/ies L & [[L G TxtEx a (M)]A (V textsT forL)[card({a;  [? ^ M{T[X])}A[M(T[X}) + M(T[a;+l])]}) < &]]. One can similarly define N e g O c T x t E x £ . Next theorem shows the speed advantage of having open negative information. T h e o r e m 5.5. (Baliga, Case and Jain [1]) There exists a class of languages C such that, (a) C G T x t E x , (b) £ G N e g C ^ T x t E x o , and (c)£0Une;v TxtEx;. We now list some of the open problems regarding this model. (a) For i > 1, £ G NegO^TxtBc*? Here note that £ G NegO*TxtBc*. (b) By Theorem 2.1(g), T x t E x 2 j + 1  T x t B c j f 0. Similarly, can it be shown that, for i > 1, NegC^TxtEx^ 4 " 1  N e g O ^ x t B c ^ ' ^ 0? (c) For i > 1, is NegCTTxtEx* c N e g O i + 1 T x t B c ? So far we know that NegO*TxtEx* C NegO*TxtBc. 6. Learning w i t h Negative C o u n t e r e x a m p l e s We now consider providing negative data to the learner via counterexamples to the conjectures of the learner. We will be considering three variants of the model. Intuitively, for learning with negative counterexamples, we may consider the learner being provided a text, one element at a time, along with a negative counterexample to the latest conjecture, if any. The list of negative counterexamples may be modeled as a second text provided to the learner. Thus the learning machines get as input two texts, one for positive data, and other for negative counterexamples. We say that M(T, T") converges to a grammar i, iff for all but finitely many n, M(T[n], T'[n]) = i. In the basic model of learning from positive data and negative counterexamples, if a conjecture contains elements not in the target language, then a negative counterexample is provided to the learner. N C in the definition below stands for negative counterexample. Definition 6.1. (Jain and Kinber [13]) Suppose a 6 N U {*}. 
(a) $\mathbf{M}$ $\mathbf{NCEx}^a$-identifies a language $L$ (written: $L \in \mathbf{NCEx}^a(\mathbf{M})$) iff for all texts $T$ for $L$, and for all $T'$ satisfying the condition:
$T'(n) \in S_n$, if $S_n \ne \emptyset$, and $T'(n) = \#$, if $S_n = \emptyset$, where $S_n = \overline{L} \cap W_{\mathbf{M}(T[n], T'[n])}$,
$\mathbf{M}(T, T')$ converges to a grammar $i$ such that $W_i =^a L$.
(b) $\mathbf{NCEx}^a = \{\mathcal{L} \mid (\exists \mathbf{M})[\mathcal{L} \subseteq \mathbf{NCEx}^a(\mathbf{M})]\}$.

We also consider two variants of the above definition as follows:
- the learner gets the least negative counterexample instead of any counterexample. This criterion is denoted $\mathbf{LNCEx}^a$.
- the negative counterexample is provided only if there exists one such counterexample $\le$ the maximum positive element seen in the input so far (otherwise the learner gets #). This criterion is denoted by $\mathbf{BNCEx}^a$. (Essentially $S_n$ in the definition of $T'(n)$ in part (a) is replaced by $S_n = \overline{L} \cap W_{\mathbf{M}(T[n], T'[n])} \cap \{x \mid x \le \max(\mathrm{content}(T[n]))\}$.) The $\mathbf{BNC}$ model essentially addresses some complexity constraints.

Similarly, we can define $\mathbf{NCBc}^a$, $\mathbf{LNCBc}^a$ and $\mathbf{BNCBc}^a$ criteria of inference. It is easy to see that $\mathbf{TxtEx}^a \subseteq \mathbf{BNCEx}^a \subseteq \mathbf{NCEx}^a \subseteq \mathbf{LNCEx}^a$. All of these containments, except the last one, are proper.

Part (a) of the following theorem shows that every indexed family can be learned using positive data and negative counterexamples. This improves a classical result that every indexed family is learnable from informants. Since there exist indexed families not in $\mathbf{TxtEx}$, this illustrates a difference between $\mathbf{NCEx}$ learning and learning without negative counterexamples. Part (b) of the following theorem illustrates another difference between $\mathbf{NCEx}$ learning and $\mathbf{TxtEx}$ learning. Such a result does not hold for $\mathbf{TxtEx}$ (for example, $\{F \mid F$ is finite$\} \cup \{L\} \notin \mathbf{TxtEx}$, for any infinite language $L$).

Theorem 6.1. (Jain and Kinber [13])
(a) Suppose $\mathcal{L}$ is an indexed family. Then $\mathcal{L} \in \mathbf{NCEx}$.
(b) Suppose $\mathcal{L} \in \mathbf{NCEx}$ and $L$ is a recursive language. Then $\mathcal{L} \cup \{L\} \in \mathbf{NCEx}$.

Part (b) of the above theorem does not generalize to taking an r.e. language (instead of a recursive language) $L$, as witnessed by $\mathcal{L} = \{A \cup \{x\} \mid x \notin A\}$, and $L = A$, where $A$ is any nonrecursive r.e. set. Here note that $\mathcal{L} \in \mathbf{TxtEx}$, but $\mathcal{L} \cup \{L\}$ is not in $\mathbf{NCEx}$.
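The interaction between learner and teacher in Definition 6.1 can be sketched on a toy class. The Python below assumes the class of languages of the form $N - \{m\}$, encodes hypotheses by hand (`None` for $N$, an integer `k` for $N - \{k\}$) rather than by grammars, and uses a hypothetical teacher function; none of these names come from the paper. The positive text is ignored, since for this particular class the counterexamples alone suffice.

```python
PAUSE = "#"

def in_hypothesis(hyp, x):
    """Membership in the hypothesised language: None encodes N,
    an integer k encodes N - {k}."""
    return hyp is None or x != hyp

def counterexample(missing, hyp):
    """Teacher for the target L = N - {missing}: return an element of
    complement(L) intersected with the hypothesis, or '#' if that set is empty.
    Here complement(L) = {missing}."""
    return missing if in_hypothesis(hyp, missing) else PAUSE

def learner(previous, negative):
    """Start with N; on a counterexample x switch to N - {x}; otherwise keep."""
    return previous if negative == PAUSE else negative

def run(missing, steps=5):
    """Return the sequence of hypotheses over the first `steps` rounds."""
    hyp, history = None, []
    for _ in range(steps):
        history.append(hyp)
        hyp = learner(hyp, counterexample(missing, hyp))
    return history

trace = run(4)
```

The first conjecture $N$ overshoots the target, the single counterexample pins down the missing element, and the learner converges after one change of hypothesis.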
The following theorem shows that using least negative counterexamples, rather than arbitrary negative counterexamples, does not enhance power of a learner.
Theorem 6.2. (Jain and Kinber [13]) Let $a \in N \cup \{*\}$. Then, $\mathbf{NCEx}^a = \mathbf{LNCEx}^a \subseteq \mathbf{InfEx}^a$.

For Bc-style learning, a limited version of the above holds. Though the equality $\mathbf{NCBc} = \mathbf{LNCBc}$ can be generalized to learning with anomalies (see Corollary 6.2 below), the inclusion $\mathbf{LNCBc} \subseteq \mathbf{InfBc}$ cannot be generalized to learning with anomalies.

Proposition 6.1. (Jain and Kinber [13]) $\mathbf{NCBc} = \mathbf{LNCBc} \subseteq \mathbf{InfBc}$.

Part (a) of the following theorem shows that all classes of languages learnable in the basic Ex-style model with an arbitrary finite number of errors in almost all conjectures can be learned without errors in the basic Bc-style model. This contrasts with learning from texts, where $\mathbf{TxtEx}^{2j+1} - \mathbf{TxtBc}^j \ne \emptyset$ (Theorem 2.1(g)). Part (b) of the following theorem is somewhat surprising. It shows that sometimes negative counterexamples are not enough: to learn a language, the learner must have access to all negative examples.

Theorem 6.3. (Jain and Kinber [13])
(a) $\mathbf{NCEx}^* \subseteq \mathbf{NCBc}$.
(b) $\mathbf{InfEx} - \mathbf{NCBc} \ne \emptyset$.

We now show advantages of having negative counterexamples. Part (a) of the following theorem shows that the model $\mathbf{BNCEx}$ is quite powerful: there are classes of languages learnable in this model that cannot be learned in the classical Bc-style model even when an arbitrary finite number of errors is allowed in almost all conjectures. Part (b) of the following theorem shows that there are classes of languages learnable in the basic model that cannot be learned in any of the models that use negative counterexamples of limited size.

Theorem 6.4. (Jain and Kinber [13])
(a) $\mathbf{BNCEx} - \mathbf{TxtBc}^* \ne \emptyset$.
(b) $\mathbf{NCEx} - \mathbf{BNCBc}^* \ne \emptyset$.

Note that the diagonalizations in Theorem 6.4 can be shown using indexed families of languages. Thus, in contrast to Theorem 6.1, there exists an indexed family not in $\mathbf{BNCBc}^*$. In contrast to Theorem 6.4 (b), the following shows that if attention is restricted to only infinite languages, then $\mathbf{NCEx}$ and $\mathbf{BNCEx}$ behave similarly.
Theorem 6.5. (Jain and Kinber [13]) Suppose $a \in N \cup \{*\}$. Suppose $\mathcal{L}$ consists of only infinite languages. Then
(a) $\mathcal{L} \in \mathbf{NCEx}^a$ iff $\mathcal{L} \in \mathbf{BNCEx}^a$.
(b) $\mathcal{L} \in \mathbf{NCBc}^a$ iff $\mathcal{L} \in \mathbf{BNCBc}^a$.

We now consider the error hierarchy for learning with negative counterexamples. That is, learning with at most $n + 1$ errors in almost all conjectures in the basic model is stronger than learning with at most $n$ errors. The hierarchy easily follows from the following theorem.

Theorem 6.6. (Jain and Kinber [13]) Suppose $n \in N$.
(a) $\mathbf{TxtEx}^{n+1} - \mathbf{NCEx}^n \ne \emptyset$.
(b) $\mathbf{TxtEx}^* - \bigcup_{n \in N} \mathbf{NCEx}^n \ne \emptyset$.
(c) $\mathbf{TxtBc} - \mathbf{NCEx}^* \ne \emptyset$.
(d) $\mathbf{TxtBc}^1 - \mathbf{NCBc} \ne \emptyset$.

As $\mathbf{TxtEx}^{n+1} \subseteq \mathbf{BNCEx}^{n+1} \subseteq \mathbf{NCEx}^{n+1} \subseteq \mathbf{LNCEx}^{n+1}$, the following corollary follows from Theorem 6.6.

Corollary 6.1. (Jain and Kinber [13]) Suppose $n \in N$. Then, for $\mathbf{I} \in \{\mathbf{NCEx}, \mathbf{LNCEx}, \mathbf{BNCEx}\}$, we have $\mathbf{I}^n \subset \mathbf{I}^{n+1}$.

Now we consider another surprising result. There exists a $\mathbf{Bc}^1$-style learner with negative counterexamples, with the "ultimate power": it can learn the class of all recursively enumerable languages!

Theorem 6.7. (Jain and Kinber [13]) $\mathcal{E} \in \mathbf{NCBc}^1$.

Since $\mathcal{E} \in \mathbf{InfBc}^*$, we have

Corollary 6.2. (Jain and Kinber [13])
(a) $\mathbf{NCBc}^1 = \mathbf{InfBc}^*$.
(b) For all $a \in N \cup \{*\}$, $\mathbf{NCBc}^a = \mathbf{LNCBc}^a$.

The following corollary shows a contrast with respect to the case when there are no errors in conjectures (Proposition 6.1 and Theorem 6.3(c)). What a difference just one error can make!

Corollary 6.3. (Jain and Kinber [13]) For all $n \in N$, $n > 0$, $\mathbf{InfBc}^n \subseteq \mathbf{NCBc}^n = \mathbf{NCBc}^1$.

Based on ideas similar to the ones used for proving Theorem 6.7, one can show

Theorem 6.8. (Jain and Kinber [13])
(a) Let $\mathcal{L} = \{L \in \mathcal{E} \mid L$ is infinite$\}$. Then $\mathcal{L} \in \mathbf{BNCBc}^1$.
(b) For all $n \in N$, $\mathbf{TxtBc}^n \subseteq \mathbf{BNCBc}^1$.
(c) $\mathbf{TxtEx}^* \subseteq \mathbf{BNCBc}^1$.

As there exists a class of infinite languages which does not belong to $\mathbf{InfBc}^n$ (see Case and Smith [7]), we have

Corollary 6.4. (Jain and Kinber [13]) For all $n \in N$, $\mathbf{BNCBc}^1 - \mathbf{InfBc}^n \ne \emptyset$.

Thus, $\mathbf{BNCBc}^m$ and $\mathbf{InfBc}^n$ are incomparable for $m > 0$, $m, n \in N$. The above result does not generalize to $\mathbf{InfBc}^*$, as $\mathbf{InfBc}^*$ contains the class $\mathcal{E}$. We now mention some of the open questions regarding behaviourally correct learning when the size of the negative counterexamples is bounded.
(a) Is the $\mathbf{BNCBc}^n$ hierarchy strict?
(b) Is $\mathbf{TxtBc}^* \subseteq \mathbf{BNCBc}^1$?

6.1. Complexity Issues
We now consider the complexity advantages of having negative counterexamples. This section is based on the paper [13]. The class L\ = {L  card(iV — L) = 1} is in TxtEx, but requires unbounded number of mind changes to learn. On the other hand, C\ can be easily learned using one mind change if negative counterexamples are available. Thus, not only does N C E x model give learnability advantages over TxtEx, it also gives complexity advantages over T x t E x for some classes in TxtEx. Note that if one does not allow mind changes, then N C E x and T x t E x are both the same — thus the above result is the best mind change complexity advantage possible. The class £ 2 = {L  (3i)[L = {x \ x < i}}} U {JV}, is learnable in N C E x model, but the number of mind changes is unbounded. However, £ 2 can be learned by using at most one mind change in the model LNCEx. Thus, even though LNCEx does not give learnability advantages over NCEx, it does give complexity advantages. Let a — b = a — b, if a>b; a — 6 = 0 otherwise; Consider the class: £ 3 = {L  (3!e)[(0,e) G L A L  {(0,e)} C {(x,y)  x > 1} A card(L {(0,e)})=emin(We)]} £3 is in LNCEx with at most one mind change. However £3 cannot be learned in InfEx using bounded number of mind changes. Note that LNCEx C InfEx. So getting negative counterexamples gives complexity
advantages over informants, despite informant being more advantageous for learning as a whole. The situation is more complex in considering the complexity advantages of NCExmodel compared to InfEx model. There exist classes which can be NCExidentifies using n — 1 mind changes, but cannot be InfExidentified using (2™ — 1) — 2 mind changes. This is optimal as it can be shown that any class which can be NCExidentified using n — 1 mind changes can also be identified using (2™ — 1) — 1 mind changes in InfExmodel. We omit the details. 7. Learning W i t h Subset Queries We now consider learning with subset queries, which turn out to be another mechanism for providing negative examples. In this model learner is allowed to ask queries of the form "is Q C L?", where L is the language being learned. If the answer to query is "no", we additionally can have the following possibilities: (a) Learner is given an arbitrary counterexample (a member of Q — L); (b) Learner is given the least counterexample; (c) Learner is just given the answer 'no', without any counterexample. We would often also consider bounds on the number of queries. We first formalize the definition of a learner which uses queries. Definition 7.1. (Jain and Kinber [12]) A learner using queries can ask a query of form uWj C L?" on any input a. Answer to the query is "yes" or "no" (along with a possible counterexample). Then, based on input a and answers received for queries made on prefixes of cr, M outputs a conjecture (from N). Note that the queries are for recursively enumerable languages, which are posed to the teacher using a grammar (index) for the language. Many of the diagonalization results stand even if one uses arbitrary type of query language. However simulation results often crucially depend on the queries being made only via grammars for the queried languages. 
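The three answer modes for a query "is Q ⊆ L?" can be sketched in Python on finite sets. This is a simplification under stated assumptions: in the model the queried sets are r.e. languages presented by grammars, for which such a teacher is not computable in general, and the function name `subset_query` and its `mode` values are invented for this example.

```python
def subset_query(Q, L, mode="any"):
    """Answer "is Q a subset of L?" for finite sets.
    mode 'any'   : a "no" answer comes with some element of Q - L,
    mode 'least' : a "no" answer comes with the least element of Q - L,
    mode 'none'  : a "no" answer carries no counterexample."""
    difference = set(Q) - set(L)
    if not difference:
        return ("yes", None)
    if mode == "least":
        return ("no", min(difference))
    if mode == "any":
        return ("no", next(iter(difference)))
    return ("no", None)

target = {1, 2, 3}
positive = subset_query({1, 2}, target)
least = subset_query({1, 5, 9}, target, mode="least")
bare = subset_query({1, 5, 9}, target, mode="none")
arbitrary = subset_query({1, 5, 9}, target, mode="any")
```

In the `'any'` mode the only guarantee is that the returned element lies in Q − L, which is why the usage above does not rely on which element comes back.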
Here, if one allows infinite number of subset queries, then one can learn the whole class £ of recursively enumerable languages in Exmodel of learning. Furthermore, as we will see below (Proposition 7.2) if one allows finite, but unbounded, number of queries, then for Exmodel of learning the notion coincides with learning from negative counterexamples. We now formalize learning via subset queries.
Definition 7.2. (Jain and Kinber [12]) Suppose a G N U {*}. (a) M SubQaExidentifies a language L (written: L G S u b Q a E x ( M ) ) iff for any text T for L, it behaves as follows: (i) The number of queries M asks on prefixes of T is bounded by a (if a — *, then the number of such queries is finite). Furthermore, all the queries are of the form uWj C L?" (ii) Suppose the answers to the queries are made as follows. For a query uWj C L?", the answer is "yes" if Wj C L, and the answer is "no" if Wj — L ^ 0. For "no" answers, M is also provided with a counterexample, x G Wj — L. Then, for some k such that Wk = L, for all but finitely many n, M(T[n]) outputs the grammar k. (b) S u b Q a E x = {£  (3M)[£ C S u b Q a E x ( M ) ] } . LSubQ a Exidentification and ResSubQ°Exidentification can be defined similarly, where for LSubQ°Exidentification the learner gets the least counterexample for "no" answers, and for R e s S u b Q a E x identification, the learner does not get any counterexample along with the "no" answers. For a,b G JVU{*}, for I G {Ex 6 ,Be 6 }, one can similarly define SubQ°I, LSubQ"I, and R e s S u b Q a I . Next two propositions show a close correspondence between learning via negative counterexamples and learning via subset queries. In particular, learning via finite number of subset queries coincides with learning via negative counterexamples for Exmodel of learning. P r o p o s i t i o n 7.1. (Jain and Kinber [12]) For any a G ./VU{*},I G {ExQ,Bca}, (a) SubQ*I C N C I . (b) LSubQ*I C L N C I . (c) ResSubQ*I C R e s N C I . P r o p o s i t i o n 7.2. (Jain and Kinber [12]) Suppose a G N U {*}. N C E x a = SubQ*Ex a = L N C E x a = LSubQ*Ex° = R e s N C E x a = ResSubQ*Exa. Next theorem establishes a hierarchy of learning capabilities with respect to the number of subset queries. T h e o r e m 7.1. (Jain and Kinber [12]) Suppose n G N. R e s S u b Q " + 1 E x  L S u b Q n B c * ^ 0.
Then,
We now consider the relationship between various types of subset queries. When only a single query or an unbounded but finite number of queries is used, different types of counterexamples do not make a difference.

Theorem 7.2. (Jain and Kinber [12]) Suppose $a \in N \cup \{*\}$, $b \in \{0, 1, *\}$, and $I \in \{\mathrm{Ex}^a, \mathrm{Bc}^a\}$. Then, $\mathrm{ResSubQ}^b I = \mathrm{SubQ}^b I = \mathrm{LSubQ}^b I$.

Thus, one needs to consider at least two queries when showing differences between various types of subset queries. The following theorem establishes the relationship between different types of subset queries.

Theorem 7.3. (Jain and Kinber [12]) For all $n \in N$,
(a) $\mathrm{LSubQ}^2\mathrm{Ex} - \mathrm{SubQ}^n\mathrm{Bc}^* \neq \emptyset$.
(b) $\mathrm{SubQ}^2\mathrm{Ex} - \mathrm{ResSubQ}^n\mathrm{Bc}^* \neq \emptyset$.

We next consider the anomaly hierarchy for the subset query learning criteria.

Theorem 7.4. (Jain and Kinber [12])
(a) For all $n \in N$, $\mathrm{TxtEx}^{n+1} - \mathrm{LSubQ}^*\mathrm{Ex}^n \neq \emptyset$.
(b) For all $n \in N$, $\mathrm{TxtBc}^{n+1} - \mathrm{LSubQ}^*\mathrm{Bc}^n \neq \emptyset$.
(c) $\mathrm{LSubQ}^*\mathrm{Ex}^* \subseteq \mathrm{ResSubQ}^*\mathrm{Bc}$.

As a corollary we get:

Corollary 7.1. (Jain and Kinber [12]) Let $a \in N \cup \{*\}$ and $n \in N$.
(a) $\mathrm{SubQ}^a\mathrm{Ex}^n \subset \mathrm{SubQ}^a\mathrm{Ex}^{n+1}$.
(b) $\mathrm{LSubQ}^a\mathrm{Ex}^n \subset \mathrm{LSubQ}^a\mathrm{Ex}^{n+1}$.
(c) $\mathrm{ResSubQ}^a\mathrm{Ex}^n \subset \mathrm{ResSubQ}^a\mathrm{Ex}^{n+1}$.

A similar corollary holds for the Bc-criteria of learning, with Ex replaced by Bc in the above.

8. Random Negative Examples

In this section we briefly consider the impact of having random negative examples. It would be interesting to explore in general how random negative examples affect learning compared to the other kinds of negative examples discussed in this paper. When considering giving random negative examples, one may consider any measure theoretic method of selecting a random negative example. The only property used in the following is that if A is infinite and B is a finite subset of A, then the measure of $A - B$ (with respect to A) is 1. Let $\mathrm{Rand}_p\mathrm{TxtEx}$ denote the class of languages that can be identified using positive data and one random negative example with probability p.

Theorem 8.1. Consider the following class of languages: $C = \{L \mid (\exists i)[W_i = L \ \&\ \mathrm{card}(L - \{\langle i, x\rangle \mid x \in N\}) < \infty \ \&\ \mathrm{card}(L \cap \{\langle i, x\rangle \mid x \in N\}) = \infty]\}$. Then, $C \in \mathrm{Rand}_1\mathrm{TxtEx} - \mathrm{TxtEx}$.
 
Note here that by Theorem 4.1, if one considers having arbitrary counterexamples, then for any class of languages which consists only of coinfinite languages, k arbitrary negative examples do not help in learning. So the above theorem also shows that random negative examples are more useful for learning compared to arbitrary negative examples.

Acknowledgements

Sanjay Jain was supported in part by NUS grant number R252000127112.

References

1. G. Baliga, J. Case, and S. Jain. Language learning with some negative information. Journal of Computer and System Sciences, 51(5):273-285, 1995.
2. J. Barzdins. Two theorems on the limiting synthesis of functions. In Theory of Algorithms and Programs, vol. 1, pages 82-88. Latvian State University, 1974. In Russian.
3. L. Blum and M. Blum. Toward a mathematical theory of inductive inference. Information and Control, 28:125-155, 1975.
4. M. Blum. A machine-independent theory of the complexity of recursive functions. Journal of the ACM, 14:322-336, 1967.
5. R. Brown and C. Hanlon. Derivational complexity and the order of acquisition in child speech. In J. R. Hayes, editor, Cognition and the Development of Language. Wiley, 1970.
6. J. Case and C. Lynes. Machine inductive inference and language identification. In M. Nielsen and E. M. Schmidt, editors, Proceedings of the 9th International Colloquium on Automata, Languages and Programming, volume 140 of Lecture Notes in Computer Science, pages 107-115. Springer-Verlag, 1982.
7. J. Case and C. Smith. Comparison of identification criteria for machine inductive inference. Theoretical Computer Science, 25:193-220, 1983.
8. M. Demetras, K. Post, and C. Snow. Feedback to first language learners: The role of repetitions and clarification questions. Journal of Child Language, 13:275-292, 1986.
9. M. Fulk. A Study of Inductive Inference Machines. PhD thesis, SUNY/Buffalo, 1985.
10. E. M. Gold. Language identification in the limit. Information and Control, 10:447-474, 1967.
11. K. Hirsh-Pasek, R. Treiman, and M. Schneiderman. Brown and Hanlon revisited: Mothers' sensitivity to ungrammatical forms. Journal of Child Language, 11:81-88, 1984.
12. S. Jain and E. Kinber. Learning languages from positive data and a finite number of queries. In Kamal Lodaya and Meena Mahajan, editors, Foundations of Software Technology and Theoretical Computer Science, volume 3328 of Lecture Notes in Computer Science, pages 360-372. Springer-Verlag, 2004.
13. S. Jain and E. Kinber. Learning languages from positive data and negative counterexamples. Journal of Computer and System Sciences, 2005. To appear.
14. S. Jain, D. Osherson, J. Royer, and A. Sharma. Systems that Learn: An Introduction to Learning Theory. MIT Press, Cambridge, Mass., second edition, 1999.
15. S. Jain and A. Sharma. Learning in the presence of partial explanations. Information and Computation, 95:162-191, 1991.
16. M. Machtey and P. Young. An Introduction to the General Theory of Algorithms. North Holland, New York, 1978.
17. D. McNeill. Developmental psycholinguistics. In F. Smith and G. Miller, editors, The Genesis of Language, pages 15-84. MIT Press, 1966.
18. T. Motoki. Inductive inference from all positive and some negative data. Information Processing Letters, 39(4):177-182, 1991.
19. D. Osherson, M. Stob, and S. Weinstein. Ideal learning machines. Cognitive Science, 6:277-290, 1982.
20. D. Osherson, M. Stob, and S. Weinstein. Learning theory and natural language. Cognition, 17:1-28, 1984.
21. D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
22. D. Osherson and S. Weinstein. Criteria of language learning. Information and Control, 52:123-138, 1982.
23. D. Osherson and S. Weinstein. A note on formal learning theory. Cognition, 11:77-88, 1982.
24. S. Pinker. Formal models of language learning. Cognition, 7:217-283, 1979.
25. H. Rogers. Theory of Recursive Functions and Effective Computability. McGraw-Hill, 1967. Reprinted, MIT Press 1987.
26. T. Shinohara. Studies on Inductive Inference from Positive Data. PhD thesis, Kyushu University, Kyushu, Japan, 1986.
27. K. Wexler. On extensional learnability. Cognition, 11:89-95, 1982.
28. K. Wexler and P. Culicover. Formal Principles of Language Acquisition. MIT Press, 1980.
EFFECTIVE CARDINALS IN THE NONSTANDARD UNIVERSE

VLADIMIR KANOVEI*
Institute for the Information Transmission Problems (IITP RAS), Bol. Karetnyj Per. 19, GSP4, Moscow 127994, Russia
Email: [email protected] and [email protected]

MICHAEL REEKEN
Department of Mathematics, University of Wuppertal, Gauss Strasse 20, Wuppertal 42097, Germany
Email: [email protected] de
We study the structure of effective cardinals in the nonstandard set universe of Hrbacek set theory HST. Some results resemble those known in descriptive set theory in the domain of Borel reducibility of equivalence relations.
Introduction

Nonstandard analysis as a branch of mathematics^a emerged in the beginning of the 1960s when A. Robinson [26] demonstrated that nonstandard models (that is, proper elementary extensions) of the real continuum lead to a mathematically rigorous system including infinitesimals and infinitely large numbers. In the course of the 1960s, the model-theoretic tools used by Robinson were shown to be applicable to a variety of mathematical structures, and this applicability was shown to rest on a few general properties of nonstandard extensions, in particular, elementarity and saturation. For instance, any $\aleph_1$-saturated elementary extension ${}^*\mathbb{N}$ of the integers $\mathbb{N}$ contains an infinitely large number. Several nonstandard axiomatic systems based on those general principles were proposed, beginning in the mid-1970s. Unlike the model-theoretic approach, such theories as Nelson's internal set theory [23], the two theories of [8, 9], and bounded set theory [13] axiomatically describe nonstandard extensions of the whole standard set universe of ZFC rather than extensions of any particular structure. In the mid-1990s we formulated Hrbacek set theory HST [14], based on

* Contact author. Partially supported by RFBR 030100757, 060100608, and DFG 436 RUS 17/68/05.
a See [5, 29] on the early history of infinitesimal analysis.
earlier theories in [8, 9]. This theory combined achievements of different nonstandard set theories and avoided their faults. The set universe of HST is axiomatized as a von Neumann superstructure $\mathbb{H}$ over a fully saturated elementary extension $\mathbb{I}$ ($\mathbb{I}$ = internal sets) of the class $\mathbb{WF}$ of all well-founded sets; see more on this in Section 1. Our monograph [17] presents in detail the structure of the HST universe and metamathematical properties of HST and some other popular nonstandard set theories.

This paper is devoted to the structure of cardinalities in the nonstandard set universe of HST. Note that HST does not include the axioms of Power Set, Choice, and Regularity. In fact these axioms contradict HST. This is why methods of study of the structure of cardinalities known from ZFC are not always applicable in HST. Nevertheless there are two rather regular families of cardinalities in HST: $\mathbb{WF}$-cardinals and $\mathbb{I}$-cardinals. Either family behaves in a ZFC-like manner simply because both $\mathbb{WF}$ and $\mathbb{I}$ satisfy ZFC. The intersection of the two families consists of finite cardinals. But little is known beyond this. Some independence results have been obtained. For instance, the hypothesis that all infinite sets in $\mathbb{I}$ are equinumerous in the whole universe $\mathbb{H}$, and the hypothesis that $\mathbb{I}$-cardinals are preserved in $\mathbb{H}$ (except for hyperfinite cardinalities $m < n$ such that $m/n$ is not infinitesimal, [19]) are consistent with HST, see [15] or [17], Chapter 7.

Yet an alternative approach seems to be much more promising in the context of HST. Instead of abstract "Cantorian" cardinalities, we consider here those induced by effective embeddings, i.e. those definable in some way or given by a certain construction. In this we follow earlier works in nonstandard analysis. For instance, studies on the collapse of hyperfinite cardinalities by Borel and countably determined maps were carried out in the 1980s, see [12, 19, 27].
Further studies revealed a complicated structure of "Borel" and "countably determined" cardinalities of hyperfinite sets [16]. However, HST admits a much more general concept of effective cardinality than those based on Borel or countably determined maps. This concept involves the class $\mathbb{L}[\mathbb{I}]$ of all sets constructible over $\mathbb{I}$, and the class $\Delta_2^{ss}$ of all sets $x \in \mathbb{L}[\mathbb{I}]$, $x \subseteq \mathbb{I}$ (see details below), which includes and greatly exceeds Borel and countably determined sets.

The first part of the paper is devoted to effective cardinalities of internal sets and, generally, sets that consist of internal elements. We prove that effective cardinalities of internal sets are just their $\mathbb{I}$-cardinals in the $\mathbb{I}$-infinite domain, and resemble multiplicative galaxies in the hyperfinite domain. Effective cardinalities of $\Sigma_1^{ss}$ sets ($\mathbb{WF}$-size unions of internal sets) are still linearly ordered and admit characterization in terms of cuts (initial
segments) in the class ${}^*\mathrm{Card}$ of all $\mathbb{I}$-cardinals. Some results for cardinalities in the more complicated classes $\Pi_1^{ss}$ and $\Delta_2^{ss}$ will be presented, too.

The second part of the paper considers effective cardinalities in full generality. Fortunately there is a reduction down to $\mathbb{I}$: any set in $\mathbb{L}[\mathbb{I}]$ admits an effective bijection onto a quotient structure of the form $X/E$, where E is a $\Delta_2^{ss}$ relation on a $\Delta_2^{ss}$ set X (by necessity $X \subseteq \mathbb{I}$). And this brings us to an analogy with modern descriptive set theory, where cardinality problems for Borel quotient structures in Polish spaces have been the focal point since the early 1990s, especially in the form of Borel reducibility of quotients and the corresponding equivalence relations, see e.g. [6, 7, 18]. We pursue essentially the same idea, with $\Delta_2^{ss}$ reduction maps in the same role as Borel reductions in descriptive set theory. Inspired by this analogy, we prove several results related to the dichotomy of "large" and "small" sets, a nonstandard form of the Ramsey theorem, a theorem saying that quotients with rather small (for instance countable) classes are "smooth" in a sense similar to the smoothness for quotients in descriptive set theory, and finally consider effective reducibility within the family of monadic equivalence relations. Readers with experience in descriptive set theory may be interested to recognize similarities and differences with the setup they are accustomed to.

1. Structure of the nonstandard universe

The language of Hrbacek set theory HST contains two basic predicates, the membership $\in$ and the standardness st, hence it is called the st-$\in$-language. The axioms of HST describe a set universe $\mathbb{H}$ where the following classes are defined:

$\mathbb{S} = \{x : \mathrm{st}\, x\}$ (standard sets);
$\mathbb{I} = \{y : \exists^{\mathrm{st}} x\, (y \in x)\}$ (internal sets);
$\mathbb{WF}$ (well-founded^b sets);
so that $\mathbb{S} \subseteq \mathbb{I}$, $\mathbb{I}$ is an elementary extension of $\mathbb{S}$ in the $\in$-language, $\mathbb{S}$ (and $\mathbb{I}$ as well) satisfies ZFC in the $\in$-language, the class $\mathbb{I}$ is transitive, and the universe $\mathbb{H}$ is a von Neumann superstructure over $\mathbb{I}$. The universe $\mathbb{H}$ satisfies all ZFC axioms except for Regularity (weakened to Regularity over $\mathbb{I}$), Choice (weakened to Standard Size Choice) and Power Set. The axioms of Separation and Replacement are accepted in the st-$\in$-language.

a $\exists^{\mathrm{st}}$ and $\forall^{\mathrm{st}}$ are shorthands for "there is a standard", "for all standard".
b A set x is well-founded iff its transitive closure has no infinite $\in$-decreasing chains.
Metamathematically, HST is equiconsistent with ZFC, and HST is a conservative extension of ZFC in the sense that any $\in$-formula $\Phi$ is a theorem of ZFC iff $\Phi^{\mathrm{st}}$ (the relativization of $\Phi$ to $\mathbb{S}$) is a theorem of HST. See [17] on axioms, metamathematics, basic set theoretic structures, and the structure of hyperreals in the HST universe.

Convention 1.1. We argue in HST below unless otherwise stated.
•
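Asterisks. An $\in$-isomorphism $x \mapsto {}^*x$ of $\mathbb{WF}$ onto $\mathbb{S}$ is defined in HST so that ${}^*x \cap \mathbb{S} = \{{}^*y : y \in x\}$ for all $x \in \mathbb{WF}$. The map $*$ is an elementary embedding of $\mathbb{WF}$ in $\mathbb{I}$ in the $\in$-language. The classes $\mathbb{S}$ and $\mathbb{WF}$ are $\in$-isomorphic and satisfy ZFC. Each of them can be informally identified with the conventional set theoretic universe. The class $\mathbb{WF}$ is somewhat more convenient in this role as it is transitive and contains all its subsets, hence some important set theoretic operations are absolute for $\mathbb{WF}$ in HST.

Integers and reals. The sets $\mathbb{N}, \mathbb{Q}, \mathbb{R}$ (integers, rationals, reals) belong to $\mathbb{WF}$ and are equal to, resp., $(\mathbb{N})^{\mathbb{WF}}$ (i.e. $\mathbb{N}$ defined in $\mathbb{WF}$), $(\mathbb{Q})^{\mathbb{WF}}$, $(\mathbb{R})^{\mathbb{WF}}$. In addition ${}^*n = n$ for all $n \in \mathbb{N}$, therefore $\mathbb{N} \subseteq {}^*\mathbb{N}$; moreover $\mathbb{N}$ is an initial segment in ${}^*\mathbb{N}$. The set ${}^*\mathbb{N}$ coincides with the set $(\mathbb{N})^{\mathbb{I}}$ of all $\mathbb{I}$-natural numbers; similarly ${}^*\mathbb{Q}$ and ${}^*\mathbb{R}$ are equal to, resp., $(\mathbb{Q})^{\mathbb{I}}$ and $(\mathbb{R})^{\mathbb{I}}$. Elements of ${}^*\mathbb{N}, {}^*\mathbb{Q}, {}^*\mathbb{R}$ are often called, resp., hyperintegers, hyperrationals, hyperreals. A hyperreal $x \in {}^*\mathbb{R}$ is infinitesimal, $x \simeq 0$ in symbols, if $|x| < {}^*r$ in ${}^*\mathbb{R}$ for all $r \in \mathbb{R}$, $r > 0$, and infinitely large if $x^{-1} \simeq 0$, i.e. $|x| > {}^*r$ for all $r \in \mathbb{R}$. A hyperreal x is limited if it is not infinitely large. In this case there exists a unique $r \in \mathbb{R}$ such that $x \simeq {}^*r$ (that is, $x - {}^*r \simeq 0$). Such a real r is denoted by ${}^{\circ}x$ (the shadow, or standard part, of $x \in {}^*\mathbb{R}$).

Ordinals and cardinals. The operation $*$ extends to proper classes $X \subseteq \mathbb{WF}$ by ${}^*X = \bigcup_{x \in \mathbb{WF},\, x \subseteq X} {}^*x$, and this does not yield a contradiction provided $X \in \mathbb{WF}$. Then ${}^*\mathbb{WF} = \mathbb{I}$. In HST, the classes Card and Ord (all cardinals, resp., ordinals) satisfy $\mathrm{Card} \subseteq \mathrm{Ord} \subseteq \mathbb{WF}$ and $\mathrm{Ord} = (\mathrm{Ord})^{\mathbb{WF}}$ (that is, ordinals = $\mathbb{WF}$-ordinals), $\mathrm{Card} = (\mathrm{Card})^{\mathbb{WF}}$. Thus the classes ${}^*\mathrm{Card} \subseteq {}^*\mathrm{Ord} \subseteq \mathbb{I}$ are defined (all $\mathbb{I}$-cardinals, resp., $\mathbb{I}$-ordinals). Note that ${}^*\mathbb{N} \subseteq {}^*\mathrm{Card}$.

Sets of standard size. Sets equinumerous with sets in $\mathbb{WF}$ are called sets of standard size. Note that $\mathrm{card}\, X \in \mathrm{Card}$ is then defined for any set X of standard size. In HST, sets of standard size are the same as well-orderable sets, 1.3.1 in [17].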
The axiom of Saturation claims that every nonempty $\cap$-closed set $\mathscr{X} \subseteq \mathbb{I} \setminus \{\emptyset\}$ of standard size has a nonempty intersection $\bigcap \mathscr{X}$. The axiom of Standard Size Choice claims the existence of a choice function f for any set X of standard size (i.e. $f(x) \in x$ for all $x \in X$, $x \neq \emptyset$). An easy consequence is the axiom of Power Set for sets X of
standard size: $\mathscr{P}(X)$ is a set of standard size for any such X. Finite sets are sets of standard size. On the other hand any infinite set $I \in \mathbb{I}$, for instance any set of the form $\{0, 1, 2, \ldots, h\}$, where $h \in {}^*\mathbb{N} \setminus \mathbb{N}$, is not a set of standard size.

2. Classes $\Delta_2^{ss}$ and $\mathbb{L}[\mathbb{I}]$: effective sets

Which sets should be viewed as effective in HST? Following the examples of recursive, Borel, and constructible sets, we have to choose an initial class of sets and a set of operations applying to the initial sets. The sets obtained this way are considered as effective. In nonstandard set theoretic systems, internal sets are usually considered as the initial sets, because of their special role in the construction of nonstandard universes. (In particular $\mathbb{I}$ is the von Neumann basis of the HST universe of sets.) As for the operations, let us take unions and intersections of families of standard size. We immediately obtain the classes $\Sigma_1^{ss}$, $\Pi_1^{ss}$ of all sets of the form, resp., $\bigcup_{a \in A} X_a$, $\bigcap_{a \in A} X_a$, where $A \in \mathbb{WF}$ and all sets $X_a$ belong to $\mathbb{I}$, or, which is the same, of the form, resp., $\bigcup \mathscr{X}$, $\bigcap \mathscr{X}$, where $\mathscr{X} \subseteq \mathbb{I}$ is a set of standard size. (The index ss indicates that unions and intersections of sets of standard size are taken.) We further define the class $\Delta_2^{ss}$ of all sets that can be represented both in the form $\bigcup_{a \in A} \bigcap_{b \in B} X_{ab}$, where $A, B \in \mathbb{WF}$ and all $X_{ab}$ belong to $\mathbb{I}$, and in the dual form (possibly with different sets $A, B, X_{ab}$). Note that taking, say, three operations of union and intersection yields no new sets, according to the following result (1.4.2, 1.4.3 in [17]).

Proposition 2.1. If $\mathscr{X} \subseteq \Delta_2^{ss}$ is a set of standard size then the sets $\bigcup \mathscr{X}$ and $\bigcap \mathscr{X}$ belong to $\Delta_2^{ss}$. In addition, any set $X \subseteq \mathbb{I}$ defined in $\mathbb{I}$ by a st-$\in$-formula with sets in $\mathbb{I}$ as parameters belongs to $\Delta_2^{ss}$. □

Thus $\Delta_2^{ss}$ is a rather large class of sets.^c Yet it consists only of those sets X satisfying $X \subseteq \mathbb{I}$. The class $\mathbb{L}[\mathbb{I}]$ of all sets constructible over $\mathbb{I}$ extends $\Delta_2^{ss}$ to higher levels of the von Neumann hierarchy over $\mathbb{I}$.

Definition 2.2.
$\mathbb{L}[\mathbb{I}]$ consists of all sets x which admit a transfinite construction determined by a well-founded tree T with sets in $\mathbb{I}$ attached to all endpoints of T. The tree T itself and the map which attaches internal sets to the endpoints of T belong to $\Delta_2^{ss}$. To every node t of T that is not an endpoint, the set of all sets attached to the immediate successors of t in T is assigned. The final set x is obtained at the root of T. □

c There are meaningful subclasses within $\Delta_2^{ss}$, namely countably determined sets, i.e. those of the form $X = \bigcup_{b \in B} \bigcap_{n \in b} X_n$, where $B \subseteq \mathscr{P}(\mathbb{N})$ and all sets $X_n$ are internal (there are different but equivalent formulations), and Borel sets, which belong to the closure of $\mathbb{I}$ under countable operations of $\bigcup$ and $\bigcap$. These classes are considered within model theoretic nonstandard analysis under the assumption of $\aleph_1$-Saturation, [19].

Thus sets in $\mathbb{L}[\mathbb{I}]$ are obtained via effectively coded (in $\Delta_2^{ss}$) transfinite iterations of the operation of assembling a set from its elements. This enables us to view sets in $\mathbb{L}[\mathbb{I}]$ as effectively definable. Conversely, any effective (informally) set belongs to $\mathbb{L}[\mathbb{I}]$. Indeed it follows from Theorem 2.3(ii) below that effective constructions have to be absolute for $\mathbb{L}[\mathbb{I}]$, hence the results of such constructions are necessarily sets in $\mathbb{L}[\mathbb{I}]$. Identifying the informal notion of effectivity in HST with $\mathbb{L}[\mathbb{I}]$, we put

$x \le_{\mathrm{eff}} y$ iff there is an injection $f \in \mathbb{L}[\mathbb{I}]$ of x into y;
$x =_{\mathrm{eff}} y$ iff there is a bijection $f \in \mathbb{L}[\mathbb{I}]$ of x onto y;    (1)
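and $x <_{\mathrm{eff}} y$ iff $x \le_{\mathrm{eff}} y$ but $y \not\le_{\mathrm{eff}} x$. The ordinary Cantor-Bernstein argument proves $x \le_{\mathrm{eff}} y \wedge y \le_{\mathrm{eff}} x \iff x =_{\mathrm{eff}} y$ for any sets $x, y \in \mathbb{L}[\mathbb{I}]$. Define $|x|^{\mathrm{eff}}$, the effective cardinality of $x \in \mathbb{L}[\mathbb{I}]$, to be the $=_{\mathrm{eff}}$-equivalence class $\{y \in \mathbb{L}[\mathbb{I}] : x =_{\mathrm{eff}} y\}$. The inequalities $|x|^{\mathrm{eff}} \le |y|^{\mathrm{eff}}$ and $|x|^{\mathrm{eff}} < |y|^{\mathrm{eff}}$ will be understood as synonymous to, resp., $x \le_{\mathrm{eff}} y$ and $x <_{\mathrm{eff}} y$.

Theorem 2.3.
(i) If $x \subseteq \mathbb{I}$ then $x \in \Delta_2^{ss} \iff x \in \mathbb{L}[\mathbb{I}]$.
(ii) $\mathbb{L}[\mathbb{I}]$ is a transitive class satisfying HST^d and $\mathbb{WF} \cup \Delta_2^{ss} \subseteq \mathbb{L}[\mathbb{I}]$.
(iii) For any set $A \in \mathbb{L}[\mathbb{I}]$ there is a set $X \in \mathbb{I}$ and an equivalence relation E on X, $E \in \Delta_2^{ss}$, such that $A =_{\mathrm{eff}} X/E$.

Proof. On (i), (ii) see 5.5.4 in [17], where the class $\Delta_2^{ss}$ is denoted by E. (iii) According to 5.5.4(8) in [17], there exist a set $X \in \mathbb{I}$ and a map $h \in \mathbb{L}[\mathbb{I}]$ from X onto A. Define, for $x, y \in X$, $x \mathrel{E} y$ iff $h(x) = h(y)$, and consider the map $a \in A \mapsto f(a) = \{x \in X : h(x) = a\}$. □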
Theorem 2.3 allows us to replace $\mathbb{L}[\mathbb{I}]$ by $\Delta_2^{ss}$ in the context of $|\cdot|^{\mathrm{eff}}$. For instance we conclude from 2.3(i) that (1) is equivalent to the following in the domain of subsets of $\mathbb{I}$:

$x \le_{\mathrm{eff}} y$ iff there is a $\Delta_2^{ss}$ injection $f : x \to y$;
$x =_{\mathrm{eff}} y$ iff there is a $\Delta_2^{ss}$ bijection f of x onto y;    for $x, y \subseteq \mathbb{I}$.    (2)

We begin the study of the structure of effective cardinalities $|\cdot|^{\mathrm{eff}}$ with rather simple classes, internal sets and sets of standard size.

d In fact the least class with these properties. A suitable version of Gödel's definition of relative constructibility leads to exactly the same class $\mathbb{L}[\mathbb{I}]$ in HST. See 5.5.6 in [17].
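3. Effective cardinalities of internal sets

Generally elements of ${}^*\mathrm{Card}$, that is, $\mathbb{I}$-cardinals, behave like ZFC cardinals since $\mathbb{I}$ is a ZFC universe (in the $\in$-language). Let $|x|^{\mathrm{int}} \in {}^*\mathrm{Card}$ denote the $\mathbb{I}$-cardinality of a set $x \in \mathbb{I}$. Obviously $|x|^{\mathrm{int}} = |y|^{\mathrm{int}}$ implies $|x|^{\mathrm{eff}} = |y|^{\mathrm{eff}}$ since $\mathbb{I} \subseteq \Delta_2^{ss}$. This implication is partially reversible according to Corollary 3.2 below. To figure out the effect of non-internal maps in the domain of internal sets, let us give some definitions. Define, for any x,

$\|x\|_* = \{|y|^{\mathrm{int}} : x \supseteq y \in \mathbb{I}\}$ (the interior spectrum of x),
$\|x\|^* = \{|y|^{\mathrm{int}} : x \subseteq y \in \mathbb{I}\}$ (the exterior spectrum of x).    (3)

Then $\|x\|_*$ is a cut (initial segment) in ${}^*\mathrm{Card}$ while $\|x\|^*$ is a proper class and a final segment in ${}^*\mathrm{Card}$. Further, for any $\kappa \in {}^*\mathbb{N}$ define the cuts $\kappa\mathbb{N} = \{\lambda \in {}^*\mathbb{N} : \exists n \in \mathbb{N}\, (\lambda < n\kappa)\}$ and $\kappa/\mathbb{N} = \{\lambda \in {}^*\mathbb{N} : \forall n \in \mathbb{N}\, (\lambda < \kappa/n)\}$ in ${}^*\mathbb{N}$, and the multiplicative galaxy $\mathrm{gal}\,\kappa = \kappa\mathbb{N} \setminus \kappa/\mathbb{N}$ of $\kappa$. Then $\lambda \in \mathrm{gal}\,\kappa$ iff neither of the fractions $\kappa/\lambda$, $\lambda/\kappa$ is infinitesimal. To preserve the unity of notation put $\kappa\mathbb{N} = \kappa$, $\mathrm{gal}\,\kappa = \{\kappa\}$ for any $\kappa \in {}^*\mathrm{Card} \setminus {}^*\mathbb{N}$.

Define, for $K, L \subseteq {}^*\mathrm{Card}$, $K \le L$ iff $\forall \kappa \in K\, \exists \lambda \in L\, (\kappa \le \lambda)$. Accordingly, $K < L$ iff $K \le L$ but $L \not\le K$. In particular, in the two cases when one of the sets K, L is a singleton, we obtain

$\kappa \le L$ iff $\exists \lambda \in L\, (\kappa \le \lambda)$, and $K \le \lambda$ iff $\forall \kappa \in K\, (\kappa \le \lambda)$.    (4)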
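Note that galaxies are pairwise disjoint intervals in ${}^*\mathrm{Card}$ (singletons outside of ${}^*\mathbb{N}$), thus for any two galaxies $\Gamma_1, \Gamma_2$, $\Gamma_1 < \Gamma_2$ means that $\kappa_1 < \kappa_2$ for any (equivalently, for all) $\kappa_1 \in \Gamma_1$, $\kappa_2 \in \Gamma_2$. See 1.4.9 and 9.6.12 in [17], or [19], on the next theorem. In the case of $\mathbb{I}$-infinite sets the factors $\mathbb{N}$ and h in 3.1 vanish for obvious reasons.

Theorem 3.1.
(i) Suppose that $X, Y \in \mathbb{I}$ and $f : X \to Y$ is a $\Delta_2^{ss}$ map. Then $|X|^{\mathrm{int}} h \in \|\mathrm{ran}\, f\|^*$ for any $h \in {}^*\mathbb{N} \setminus \mathbb{N}$. In addition, (a) if $\mathrm{ran}\, f = Y$ then $|Y|^{\mathrm{int}} \in |X|^{\mathrm{int}}\mathbb{N}$, and (b) if f is an injection then $|X|^{\mathrm{int}} \in |Y|^{\mathrm{int}}\mathbb{N}$.
(ii) Suppose that $X \in \mathbb{I}$ is infinite. Then $|X|^{\mathrm{eff}} = |X \times \mathbb{N}|^{\mathrm{eff}}$; in particular, $|Y|^{\mathrm{eff}} \le |X|^{\mathrm{eff}}$ for any internal Y with $|Y|^{\mathrm{int}} \in |X|^{\mathrm{int}}\mathbb{N}$. □

Corollary 3.2. If $x, y \in \mathbb{I}$ then $|x|^{\mathrm{eff}} \le |y|^{\mathrm{eff}}$ is equivalent to $|x|^{\mathrm{int}} \in |y|^{\mathrm{int}}\mathbb{N}$ provided $|y|^{\mathrm{int}} \in {}^*\mathbb{N} \setminus \mathbb{N}$, and to just $|x|^{\mathrm{int}} \le |y|^{\mathrm{int}}$ otherwise. □

Thus $|x|^{\mathrm{eff}} = |y|^{\mathrm{eff}}$ is equivalent to $\mathrm{gal}\,|x|^{\mathrm{int}} = \mathrm{gal}\,|y|^{\mathrm{int}}$ in the domain ${}^*\mathbb{N} \setminus \mathbb{N}$, and equivalent to just $|x|^{\mathrm{int}} = |y|^{\mathrm{int}}$ outside of the domain ${}^*\mathbb{N} \setminus \mathbb{N}$. In the $\mathbb{I}$-infinite domain ${}^*\mathrm{Card} \setminus {}^*\mathbb{N}$, the two characterizations coincide.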
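As a small worked illustration of Corollary 3.2, fix an infinitely large $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ and compare the hyperfinite intervals $[0,h)$, $[0,2h)$ and $[0,h^2)$. These particular sets are chosen here only as an example and are not taken from the text; the computation uses just the definition of the cut $\kappa\mathbb{N}$ and the fact that $|[0,k)|^{\mathrm{int}} = k$ for $k \in {}^*\mathbb{N}$.

```latex
\begin{align*}
  h < 1\cdot 2h \ \text{and}\ 2h < 3\cdot h
    &\;\Longrightarrow\; h \in (2h)\mathbb{N},\ 2h \in h\mathbb{N}
     \;\Longrightarrow\; |[0,h)|^{\mathrm{eff}} = |[0,2h)|^{\mathrm{eff}},\\
  h < 1\cdot h^{2}
    &\;\Longrightarrow\; h \in h^{2}\mathbb{N}
     \;\Longrightarrow\; |[0,h)|^{\mathrm{eff}} \le |[0,h^{2})|^{\mathrm{eff}},\\
  h^{2} \ge n h \ \text{for all } n \in \mathbb{N}
    &\;\Longrightarrow\; h^{2} \notin h\mathbb{N}
     \;\Longrightarrow\; |[0,h^{2})|^{\mathrm{eff}} \not\le |[0,h)|^{\mathrm{eff}}.
\end{align*}
```

So doubling a hyperfinite set does not change its effective cardinality, while squaring its size moves it to a strictly larger galaxy, hence $|[0,h)|^{\mathrm{eff}} < |[0,h^2)|^{\mathrm{eff}}$.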
4. Effective cardinalities of sets of standard size

By definition sets of standard size, or s.s. sets, are those equinumerous with (that is, admitting a bijection onto) sets in $\mathbb{WF}$. For any s.s. set X define $\mathrm{card}\, X = \mathrm{card}\, W \in \mathrm{Card}$, where W is a set in $\mathbb{WF}$ equinumerous with X.

Lemma 4.1.
(i) Any s.s. set $X \subseteq \mathbb{I}$ is $\Sigma_1^{ss}$ and $\Delta_2^{ss}$.
(ii) Any s.s. set W is equinumerous with an s.s. set $X \subseteq \mathbb{I}$.
(iii) If $X \subseteq \mathbb{I}$ is an s.s. set then ${}^*\mathrm{Card} \setminus \mathbb{N} \subseteq \|X\|^*$ and $\|X\|_* \subseteq \mathbb{N}$.
(iv) If $X, Y \subseteq \mathbb{I}$ are s.s. sets then $\mathrm{card}\, X = \mathrm{card}\, Y$ iff $|X|^{\mathrm{eff}} = |Y|^{\mathrm{eff}}$, thus $|X|^{\mathrm{eff}}$ can be identified with $\mathrm{card}\, X$.

Proof. (ii) We may assume that $W \in \mathbb{WF}$. Then the map $w \mapsto {}^*w$ is a bijection of W onto $X = \{{}^*w : w \in W\}$, and X is a set of standard size, too. (iii) To prove ${}^*\mathrm{Card} \setminus \mathbb{N} \subseteq \|X\|^*$ fix $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ and apply Saturation to the family of all sets $C_u = \{c \in \mathbb{I} : u \subseteq c \wedge |c|^{\mathrm{int}} = h\}$, where $u \subseteq X$ is finite. (iv) Any bijection f between two sets $X, Y \subseteq \mathbb{I}$ of standard size is itself a set of standard size; then apply (i). □

It follows that s.s. sets are adequately represented among $\Sigma_1^{ss}$ sets in the context of card, and on the other hand $|\cdot|^{\mathrm{eff}}$ and card coincide on s.s. sets. The next theorem shows that effective cardinalities of $\Delta_2^{ss}$ sets begin with sets of standard size, where they coincide with well-founded cardinals, followed by the domain of $\Delta_2^{ss}$ sets not of standard size. It will be demonstrated below that the structure of effective cardinalities in the latter is connected with ${}^*\mathrm{Card}$ in a certain way.

Theorem 4.2.
(i) Infinite internal sets are not s.s. sets.
(ii) Any $\Delta_2^{ss}$ set X not of standard size contains an infinite internal subset, that is, formally, $\mathbb{N} \subsetneq \|X\|_*$.
(iii) If X is an s.s. set and Y is a $\Delta_2^{ss}$ but not s.s. set then $|X|^{\mathrm{eff}} < |Y|^{\mathrm{eff}}$.

Proof. (i) A simple corollary of Lemma 4.1(iii). (ii) By definition $\Delta_2^{ss}$ sets are s.s. unions of $\Pi_1^{ss}$ sets. Yet it is another rather simple corollary of Saturation that any infinite $\Pi_1^{ss}$ set contains an infinite internal subset, see 1.4.11 in [17].
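(iii) By (ii) some number $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ belongs to $\|Y\|_*$. On the other hand $h \in \|X\|^*$ by Lemma 4.1(iii). This implies $|X|^{\mathrm{eff}} \le |Y|^{\mathrm{eff}}$. The inequality $|Y|^{\mathrm{eff}} \not\le |X|^{\mathrm{eff}}$ follows from (i). □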
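A simple instance of Theorem 4.2(iii), given here as an illustrative sketch rather than as part of the original argument: take $X = \mathbb{N}$ and $Y = [0,h)$ for some $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ (both choices are assumptions made for the example).

```latex
\begin{align*}
  &\mathbb{N} \in \mathbb{WF},\ \mathbb{N} \subseteq {}^*\mathbb{N} \subseteq \mathbb{I}
     &&\Longrightarrow\ \mathbb{N} \text{ is an s.s. subset of } \mathbb{I},\\
  &[0,h) \in \mathbb{I} \text{ is infinite}
     &&\Longrightarrow\ [0,h) \text{ is } \Delta_2^{ss} \text{ but not s.s. (Theorem 4.2(i))},\\
  &\text{Theorem 4.2(iii)}
     &&\Longrightarrow\ |\mathbb{N}|^{\mathrm{eff}} < |[0,h)|^{\mathrm{eff}}.
\end{align*}
```

Here the inequality $\le$ is even witnessed by the identity map, since $\mathbb{N}$ is an initial segment of ${}^*\mathbb{N}$ and therefore $\mathbb{N} \subseteq [0,h)$; by Lemma 4.1(iv) the left-hand side can be identified with $\mathrm{card}\,\mathbb{N}$.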
5. Exteriors and interiors

It turns out that the internal approximations $\|X\|_*$, $\|X\|^*$ are very instrumental in the study of effective cardinalities of $\Sigma_1^{ss}$ and partly $\Pi_1^{ss}$ sets X. Now a few words on cuts (initial segments) in ${}^*\mathrm{Card}$.

Definition 5.1. A cut $U \subseteq {}^*\mathrm{Card}$ is standard size (s.s.) cofinal, resp. coinitial, iff there exist a cardinal $\vartheta \in \mathrm{Card}$, infinite or equal to $1 = \{0\}$, and an increasing, resp. decreasing sequence $\{\nu_\xi\}_{\xi < \vartheta}$ of $\nu_\xi \in {}^*\mathrm{Card}$ such that $U = \bigcup_{\xi < \vartheta} \{\kappa \in {}^*\mathrm{Card} : \kappa < \nu_\xi\}$, resp. $U = \bigcap_{\xi < \vartheta} \{\kappa \in {}^*\mathrm{Card} : \kappa < \nu_\xi\}$. □

Note that s.s. cofinal cuts are $\Sigma_1^{ss}$ while s.s. coinitial cuts are $\Pi_1^{ss}$. Internal cuts, i.e. those of the form $U = \{\kappa \in {}^*\mathrm{Card} : \kappa < \nu\}$, $\nu \in {}^*\mathrm{Card}$, belong to either of the two "standard size" categories: take $\vartheta = 1$ and $\nu_0 = \nu$. See 1.4b in [17] on the next result:

Proposition 5.2. Any $\Delta_2^{ss}$ cut in ${}^*\mathrm{Card}$ is s.s. cofinal or s.s. coinitial. If a cut is both s.s. cofinal and s.s. coinitial then it is internal. □

Coming back to $\|X\|_*$ and $\|X\|^*$, note that for any X the intersection $\|X\|_* \cap \|X\|^*$ contains at most one element. If $\kappa \in \|X\|_* \cap \|X\|^*$ then there exist internal sets Y, Z with $Y \subseteq X \subseteq Z$ and $|Y|^{\mathrm{int}} = |Z|^{\mathrm{int}} = \kappa$. In this case, if $\kappa \in {}^*\mathbb{N}$ then X itself is internal with $|X|^{\mathrm{int}} = \kappa$, while if $\kappa$ is $\mathbb{I}$-infinite then only $|X|^{\mathrm{eff}} = |\kappa|^{\mathrm{eff}}$ holds provided X is $\Delta_2^{ss}$.

Lemma 5.3.
(i) If X is a set in $\Sigma_1^{ss}$, resp. $\Pi_1^{ss}$, then $\|X\|_*$ is a standard size cofinal, resp. standard size coinitial cut in ${}^*\mathrm{Card}$.
(ii) In both cases, $\|X\|_* \cup \|X\|^* = {}^*\mathrm{Card}$.
(iii) In both cases, if either $\|X\|_*$ contains a largest element $\kappa$, or $\|X\|^*$ contains a least element $\kappa$, then $\kappa \in \|X\|_* \cap \|X\|^*$.

Proof. (i) Consider a set $\mathscr{X} \subseteq \mathbb{I}$ of standard size. Let $X = \bigcup \mathscr{X}$. Then by Saturation any internal set $Y \subseteq X$ is covered by a set of the form $\bigcup \mathscr{X}'$ where $\mathscr{X}' \subseteq \mathscr{X}$ is finite. On the other hand, by 1.3.3 in [17] the set $\mathscr{P}_{\mathrm{fin}}(\mathscr{X}) = \{\mathscr{X}' \subseteq \mathscr{X} : \mathscr{X}' \text{ is finite}\}$ is still a set of standard size. Prove (ii) for $\Sigma_1^{ss}$. Let $X = \bigcup \mathscr{X}$ be as above. Show that any $\mathbb{I}$-cardinal $\kappa \notin \|X\|_*$ belongs to $\|X\|^*$.
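Take any set $Z \in \mathbb{I}$ such that $X \subseteq Z$. If $\mathscr{X}' \subseteq \mathscr{X}$ is finite then by definition $\bigcup \mathscr{X}'$ is covered by an internal set of $\mathbb{I}$-cardinality $\kappa$, hence the set $P_{\mathscr{X}'} = \{C \in \mathbb{I} : \bigcup \mathscr{X}' \subseteq C \subseteq Z \wedge |C|^{\mathrm{int}} \le \kappa\} \in \mathbb{I}$ is nonempty. Apply Saturation to the family of all these sets $P_{\mathscr{X}'}$. (iii) Apply Saturation. □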
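Example 5.4. The following example^e of a $\Delta_2^{ss}$ set X such that $\|X\|_* \cup \|X\|^* \subsetneq {}^*\mathrm{Card}$ employs a nontrivial ultrafilter $U \in \mathbb{WF}$ over $\mathbb{N}$. Let $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ and $D = \{1, 2, \ldots, h\}$. The set $P = \mathscr{P}^{\mathrm{int}}(D) = \mathscr{P}(D) \cap \mathbb{I}$ of all internal sets $x \subseteq D$ belongs to $\mathbb{I}$ and satisfies $|P|^{\mathrm{int}} = 2^h$. Then

$U' = \{x \in P : x \cap \mathbb{N} \in U\} = \bigcup_{b \in U} \bigcap_{n \in b} \{x \in P : n \in x\}$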
is an ultrafilter in P and a A3,3 set. f We claim that \\U'\\^ — 2h/\ti. Let Z' C P be an internal set. By Saturation (see e.g. 9.2.15 in [17] or 1.6 in [19]), Z = {x n IM : x £ Z'} is a closed subset of £/. It follows that the Lebesgue measure of Z in ^(tM) (identified with 2 N ) is 0. Then easily the Loeb measure of Z in &>l(D) is 0, so that \Z'\im £ 2h/IM. Thus t/'„ C 2/l/Dsl. To prove the converse note that for any u £ U the set X = {x £ P : x n IM = u] is a I I " subset of [/' that surely satisfies \\X\\^ = 2h/U. It follows from t/'„ = 2h/M that £/'* = 2 h — by the symmetry of the sets U' and P \ U' = {D \ x : x £ P} within P. • One can easily transform the set £/' as in 5.4 to a A " set X C *N such that X, = 2fc/N and X*  2hIM. The gap *Card\ (X* U \\X\\t) consists, in this case, of the whole galaxy g a l 2h = 2hM \ 2h/^\ in *N. The next theorem shows that this is a maximal possible gap! Theorem 5.5. / / X is AS2S and K £ *Card, K £ X*U Xj)t, then K £ *!M and the difference *Card \ (X* U X"•) is a subset of gain. Thus if X is AS2S and *IM C X„ then X # U X* = *Card. Proof. By definition X — \JaeA Xa where A £ WF and every Xa is a I l f set. Take any Ocardinal K £ \\X\\* \ \\X\\t. Obviously U a e A \\Xa\\t C LY„, thus K £ f]aeA ll^all* ^y Lemma 5.3. It suffices to prove that any A £ *Card belongs to ^"* in either of the two cases: 1) A = K $ *N, 2) A £ *IM \ KIM. Note that n/c < A holds for all n £ IM in both cases. Using Standard Size Choice, choose, for any a £ A, a set Ya £ II such that Xa C Ya and V a  i n t = «• Thus X is covered by the union \Ja£AYa. For any finite A' C A, the finite union YA> = U a e ^ ' ^a 1S a n internal set satisfying Y"^'int < A by the above. The same application of Saturation as in the proof of Lemma 5.3 yields an internal set Y still with  y  i n t < A, satisfying \Ja€A Ya C Y, and hence X C.Y and A £ *Card. D e
e Essentially given in [24], see also [19], p. 1172, but with a more complicated proof based on a rather nontrivial combinatorial theorem in [4].
f The set $U'$ is even countably determined.
The following corollary belongs to the "small-large dichotomy" type. (B) witnesses that a given $\Delta_2^{ss}$ set X is rather large w.r.t. a given cut U (has rather large internal subsets), while (A1) and (A2) witness that X is rather small (can be covered by rather small internal sets). The proof is easy: if $U \subsetneq \|X\|_*$ then (B) holds by definition, otherwise apply Theorem 5.5 and get (A1) or (A2) (or Lemma 5.3(ii) in the case of $\Sigma_1^{ss}$ and $\Pi_1^{ss}$ sets).

Corollary 5.6. If $X \subseteq \mathbb{I}$ is a $\Delta_2^{ss}$ set and $U \subseteq {}^*\mathrm{Card}$ is a $\Delta_2^{ss}$ cut then at least one of the following conditions holds, and moreover (A2) can be excluded for $\Sigma_1^{ss}$ and $\Pi_1^{ss}$ sets X:
(A1) for any $\kappa \notin U$ there is an internal set $Y \supseteq X$ such that $|Y|^{\mathrm{int}} = \kappa$;
(A2) there exists $h \in {}^*\mathbb{N} \setminus \mathbb{N}$ such that $h/\mathbb{N} \subseteq U \subseteq h\mathbb{N}$, and for any $\kappa \in {}^*\mathrm{Card} \setminus h\mathbb{N}$ there exists an internal set $Y \supseteq X$ such that $|Y|^{\mathrm{int}} = \kappa$;
(B) there exists an internal set $Y \subseteq X$ such that $|Y|^{\mathrm{int}} \notin U$.
•
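6. Effective cardinalities of $\Sigma_1^{ss}$ sets

One may expect that the bigger $\|X\|_*$ (or the smaller $\|X\|^*$) is, the bigger $|X|^{\mathrm{eff}}$ should be. According to the next theorem, such a connection holds for $\Sigma_1^{ss}$ sets X except those satisfying $\|X\|_* \subseteq \mathbb{N}$. Following the notation in Section 3, we define, for any $K \subseteq {}^*\mathrm{Card}$, a cut $K\mathbb{N} = \{\lambda : \exists \kappa \in K\, \exists n \in \mathbb{N}\, (\lambda < n\kappa)\}$ in ${}^*\mathrm{Card}$.

Theorem 6.1. If X, Y are $\Sigma_1^{ss}$ sets and $\mathbb{N} \subsetneq \|Y\|_*$ then $|X|^{\mathrm{eff}} \le |Y|^{\mathrm{eff}}$ is equivalent to $\|X\|_* \subseteq \|Y\|_*\mathbb{N}$, and also to $\|X\|_* \subseteq \|Y\|_*$ if ${}^*\mathbb{N} \subseteq \|Y\|_*$.

The case $\|Y\|_* \subseteq \mathbb{N}$ will be considered below.

Proof. Suppose that $X = \bigcup \mathscr{X}$ and $Y = \bigcup \mathscr{Y}$, where $\mathscr{X}, \mathscr{Y} \subseteq \mathbb{I}$ are sets of standard size. There is a set $D \in \mathbb{I}$ such that $X \cup Y \subseteq D$. Assume w.l.o.g. that $\mathscr{X}, \mathscr{Y}$ are $\cup$-closed families. By Saturation, the sets of $\mathbb{I}$-cardinals $\{|X'|^{\mathrm{int}} : X' \in \mathscr{X}\}$, $\{|Y'|^{\mathrm{int}} : Y' \in \mathscr{Y}\}$ are cofinal in, resp., $\|X\|_*$, $\|Y\|_*$.

Direction $\Longrightarrow$. Suppose otherwise. Then there is an internal set $X' \subseteq X$ such that $|Y'|^{\mathrm{int}} m < |X'|^{\mathrm{int}}/n$ for any internal $Y' \subseteq Y$ and $m, n \in \mathbb{N}$. As $\|Y\|_*$ is an s.s. cofinal cut in ${}^*\mathrm{Card}$ by Lemma 5.3(i), there exists, by Saturation, $\kappa \in {}^*\mathrm{Card}$ such that $|Y'|^{\mathrm{int}} m < \kappa < |X'|^{\mathrm{int}}/n$ for any internal $Y' \subseteq Y$ and $m, n \in \mathbb{N}$. Thus $\|Y\|_*\mathbb{N} < \kappa$, hence $\kappa \in \|Y\|^*$ by Lemma 5.3(ii). In other words, there is an internal set Z such that $Y \subseteq Z$ and $|Z|^{\mathrm{int}} = \kappa$. On the other hand, $\kappa\mathbb{N} < |X'|^{\mathrm{int}}$, while by $|X|^{\mathrm{eff}} \le |Y|^{\mathrm{eff}}$ there exists a $\Delta_2^{ss}$ injection $X' \to Z$, a contradiction to Theorem 3.1(i).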
Direction •$==, in a stronger assumption that simply X # C y„. Case 1: y„ contains a maximal element K = yo l n t , where YQ £ ^ , hence Y0 C y. Then for any X' £ SC the set # X ' = { / i e D : / i : D  > D A / i t X ' i s a n injection A h" X' C y 0 } is nonempty. In addition, Hx"uX' = Hx" n i^x' Saturation yields an element /i G Plx'e^r ^f' Clearly /i f X is an injection of X into y"o, and hence X e f f <  y  e f f , as required. Case 2: y„, does not contain a maximal element, and for every a £ y„ n *IM there exists 7 £ y„, n *IM such that a IN < 7  meaning that 7 > a n for any n £ IN. By Standard Size Choice there is a map / : SE —>tysuch that  X '  l n t <  / ( X ' )  i n t for all X' £ X, and even X' int tN <  / ( X ' )  i n t provided  / ( X ' )  i n t (then also X' i n t ) belongs to *tN. Then Hx> = {h £ 0: h : D > D A h \ X' is an injection Ah"X' C / ( X ' ) } is nonempty for any X ' G <5T. Then argue as in Case 1. Case 3: the negation of cases 1,2. Then there is a number c G *IN \ IN such that c G y„ but 2c 0 y„. Then [0,c) e " <  y  e f f while  X  e " < [0,2c) eff (see case 1). However [0,2e) eff = [0,c) e " by Corollary 3.2. Direction <=, general case. If X # C Hy^N, but HXj^ C \\Y\\^ does not hold then there exist numbers c G *IM \ IN and n £ IN such that \\X\\^ C [0,nc) and [0,c) C y % C [0,2c). We have  X  e " < [0,nc) eff by the above, and [0,c) eff <  y  e f f . It remains to apply Corollary 3.2. Q It remains to consider the case y„, C N avoided in the theorem. It leads to sets of standard size! Lemma 6.2. For a set X CD to be of standard size each of the conditions \\X\\* C IN, *tlsl \ IN C X* is necessary and, if X is A ^ , also sufficient. Proof. By Theorem 4.2(i) \\X\\t C IN. On the other hand *IN \ IN C X* by Lemma 4.1 (iii). The sufficiency follows from Theorem 4.2(h). 
□

Thus Theorem 6.1 fails in the case ‖X‖_* = ℕ: take any pair of infinite sets X, Y ⊆ 𝕀 of standard size with card X ≠ card Y and apply 6.2 to show that ‖X‖_* = ‖Y‖_* = ℕ, and Lemma 4.1 to show that |X|^eff ≠ |Y|^eff. Nevertheless we easily obtain the following corollary.

Corollary 6.3. If X, Y are Σ^ss_1 sets then their effective cardinalities are comparable in the sense that at least one of the following inequalities holds: |X|^eff ≤ |Y|^eff or |Y|^eff ≤ |X|^eff. •
7. Effective cardinalities of II^ S sets The proof of = > in Theorem 6.1 does not work for I I " sets since y„ is now s. s. coinitial and the Saturation argument does not work. On the other hand there is a suitable counterexample. E x a m p l e 7 . 1 . Fix h G *N \ IN and let S be the set of all internal maps s : {0,1, 2 , . . . , h  1, h} > {0,1} = 2. Define as,bs G 2 N (hence G WF) so that as(k) = s(k) and bs(k) = s(h  k) for all k G IN. For a,6 G 2 N put Sab = {s : as = a A bs = 6} and 5 a = {s : a s = a}. Then 5 is internal,  5 mt = 2h+1, while each Sa is a II 5 8 set with \\Sa\\, = 2 h /N. Obviously (2'1/N)[H = (2h/\ti). To see that S and 5 a lead to a counterexample to =>of Theorem 6.1, it suffices to prove that  5  e f f =  5 a  e f f for some a. Since either of S, Sa is a union of 2 N many sets of the form Sab, it remains to show that 5 a b e f f — \Saib>\eif for all a,b,a',b'. By Saturation there is <7 G S such that a(n) = a'(n) © a{n) and 6(n) = b'(n) © cr(/i — n) for all n G N, where © is addition modulo 2. Finally the internal map s \> s © a (in the termwise sense) easily maps Sab onto S a '(/ in 11 way. • In fact 7.1 is the only possible counterexamle for nf s sets in the following sense: if X,Y are n\s sets, N C \\Y\l, and X e f f <  y  e f f then either llX'll,, C yyi^lM or there is a number K G H^H,, K € *Hsl \ M, such that HA"!!, = K/N while y, C KN. We skip the proof. Our further goal is to present what looks like a nearcounterexample, (ii) of Theorem 7.2, to < = of Theorem 6.1 in the field of n f sets. If X is a n , s set then 11X11* is standard size coinitial by Lemma 5.3. 1
If 11X11* contains a least element K then K is simultaneously the largest element in HX^ still by Lemma 5.3, and then easily X e f f = «; eff . It follows that if in this case Y is another n ^ 3 set with V* = X* then X e f f =  y  e f f . But if X* does not contain a least element then there is an infinite coinitial sequence with standard size many terms. This case is considered by the next theorem. It follows from (ii) that there are sets of the largest effective cardinality among all H\s sets X with the same X*, while (iii) presents a rather nontrivial partial counterexample to Theorem 6.1 for Tl\s sets. We deal with linfinite cardinals here, but similar results can be obtained in the hypernnite domain — we leave it to the reader. T h e o r e m 7.2. (i) 1/ f , f C I are sets of standard size, X = f] SC, Y = Pl^i ^ = card 3£ G WF is an infinite regular cardinal, X* = V*, and the coinitiality of \\X\\* is exactly •#, then \Y\eti <  X  e f f .
(ii) There exist U\s sets X,Y as in (ii) such that \X\e" < \Y\ett fails via A^s injections g of the form g = \Jwe\y f \ < # 9w£> where all gw£ are internal and W is a set of standard size. Proof, (i) Assume w. 1. o. g. that there exist sets XQ G 2£', YQ G *3f such that X C X0 and Y C Y0 for all X G ST, Y G 0 \ and the families ST, & are Dclosed. We claim that there exists a function if) : ?£ —> & satisfying VAG ^
i n
(f")3/eFVIe AWIJCdoi/A/'^I) CI),
(5)
where F G I is the set of all 11 functions / G D with dom/ C YQ and r a n / C XQ. TO define ip fix an enumeration ^T = {Xa : a < $}. Suppose that a' < •&, and the values ip{Xa) G &, a < a', have been defined. In our assumptions, there is a set Y G & such that  y  l n t <  HaeA ^a l n t f° r every finite A C [0, a']. To complete the inductive step put (p(Xai) = ^ To prove (5) consider a finite set A = \a.\ < • • • < « „ } C i3. By the construction \ip(Xak)\lnt <  ni
&(I/>(X)
C d o m / A / " ( ^ ( X ) ) C X). e
(6)
ett
Thus f"Y C X, for such an / , and hence  y  " < \X\ holds even by means of an internal map / . (ii) Fix an infinite cardinal i9 in WF. It easily follows from Saturation that there exists a strictly decreasing sequence V = {^}^ u{). Let r = 1?+ (the next cardinal in WF). The counterexample is based on a sequence {Y~f}~l
Z u „ l n t satisfy KUV = KU, where KU = KU0, and 3£ < i9 {KU > u$). For any finite u C A let £(u) be the least ordinal £ < •# such that i/£ < KU and £ > supu. We assert that there is an internal set Z satisfying (d) \ZD Zuv\int = V£(u) and \ZUV \ Z  i n t = KU for any pair of disjoint finite sets tt,«CA, 1 1 ^ 0 , and (e) \Z \ U ^ ^ / j  i n t =  Z  l n t > ^o for each finite set vQX. Indeed as d is a set of standard size it suffices to prove that for any finite d C d there is a set Z £ I satisfying (d), (e) for all u,v C d. Note that the sets of the form Zuv, where u U v = d and uC\v = 0 , are mutually disjoint, and by definition satisfy v^(u) < « « = ^uu l n t  This allows us to define an internal Z satisfying (d) for all pairs u, v with u\Jv = d, uHv = 0, u ^ 0, and, adding a sufficient portion out of [jged Zp, also \Z \ (Led Zp\int = \Z\int > VQ. It remains to show (d) for all disjoint sets u,v C d not necessarily with uUv = d. We show this by backward induction on the cardinality of uUv. Suppose that M U « C ( J , Take any a £ d\(uUv). Let u' = uL){a} and v' = vU{a}. Then by the inductive hypothesis \Z(1 ZU'V\int = v^(u') a n d \Zn Zuvi\int = V£(u). Since Zuv = Zu>v U Zuv> and easily £(«) < £(u') whenever ii C «', we conclude that  Z n Z u t j  i n t = u^u^ as required. Similarly,  Z u ' „ \ Z  i n t = KU> and ZU„' \ Z\lnt = KU, therefore \ZUV \ Z  l n t = KU* + KU = KU as required. Take as Yy any set Z £ D satisfying (d), (e). We have to demonstrate that (a), (b) remain true for the sequence {Ys}s<y, or, that is equivalent, for the sequence {Za}a<\, where Z\ = V7 = Z. Take any pair of disjoint sets u,v C A U {A}. If A S u U v then the set Zuv = HaGu ^a x Usgi; Zp ls the same as above so there is nothing to prove. Suppose that A £ u; put u' = u \ {A}. Then £„„ = Z n Z u /„, and hence Z u „ is i7large by (d) (applied for the pair u', v). 
Separately if u = {A} then u' = 0 , hence Zu*v is not defined, but obviously Zuv = Z \ Us^t, %Pi therefore Zuv is i?large by (e). Suppose that X £ v; put v' = v \ {A}. Then Z u „ = Zuw/ \ Z, and hence Z u „ is i?large still by (d). This proves (a); the derivation of (b) from (d), (e) is similar. This ends the recursive construction of the sets Yy. Show that such a sequence {Yy}7
inequality Y7 fl Za\lnt > V£ can be true only for a < £. Thus there exist (<#)many sets Y$, S < 7, satisfying Y7 D Ya lnt > ^ , as required. Coming back to the proof of (ii) of Theorem 7.2, we fix a sequence { Y 7 } 7 < r satisfying (a), (b), (c), and put <3T = {Y7 : 7 < T } . Then 7 = fl7
one and the same internal set R = h " X j . Note that i? i n t = \X$lnt because h is an injection. But X% is a i/large set, a contradiction with (c). • Question 7.3. Is Corollary 6.3 still true for sets in A8,8 or in II 8 8 ?
•
Theorem 10.2 below shows that a wider category of Δ^ss_2 quotients has plenty of incomparable sets. Note that the existence of countably determined sets incomparable in the sense of countably determined injections is also an open problem. A counterexample defined in [2] in the AST framework makes use of the hypothesis that there exist only ℵ_1-many internal sets, and hence is irreproducible in HST. On the other hand all Borel sets (in the sense of Footnote 0) are Borel-comparable. This result was first obtained by AST-followers, see e.g. [12], and then reproved in [27]. See more on this in [17], 9.6 and 9.7.

8. Effective sets in the form of quotients

Sets of the form X/E, where X is Δ^ss_2 while E is a Δ^ss_2 equivalence relation on X, will be called Δ^ss_2 quotients. These Δ^ss_2 quotients include the class Δ^ss_2 itself: take E to be just the equality on a given Δ^ss_2 set X, so that the map sending any x ∈ X to {x} is a bijection of X onto X/E. On the other hand, it follows from Theorem 2.3(iii) that every set in L[𝕀], that is, every effective set in the sense explained in Section 2, admits an effective bijection onto a Δ^ss_2 quotient. Thus Δ^ss_2 quotients exhaust, in the context of effective cardinalities, all effective (= L[𝕀]) sets in general.

One may ask whether Δ^ss_2 quotients produce more effective cardinalities than just Δ^ss_2 sets. Call smooth any Δ^ss_2 quotient that admits a Δ^ss_2 bijection onto a Δ^ss_2 set. We show in Section 9 that every Δ^ss_2 quotient X/E such that all E-classes [x]_E = {y ∈ X : x E y}, x ∈ X, are sets of standard size, is smooth. A family of non-smooth Δ^ss_2 quotients, those defined by means of monadic partitions of *ℕ, will be studied in Sections 10, 11. We prove there that there exist incomparable effective cardinalities of monadic Δ^ss_2 quotients, still an open problem for Δ^ss_2 sets themselves. We also prove a "small-large" type theorem for Δ^ss_2 quotients in Section 12, similar to 5.6 but not so sharp, with an interesting Ramsey-like corollary.
Note that A8.8 quotients consist of subsets of D which are not necessarily internal sets themselves. Accordingly injections of A8,8 quotients are maps whose dom and ran not necessarily consist of internal sets. Still there is a way to pull the consideration down to the basic level. Definition 8.1. Let E, F be equivalence relations on sets X, Y. A set R C
X x Y is a (E, F)invariant preinjection of X into Y iff 1) domi? = Xg and 2) the equivalence xEx' •£=>• y F y' holds for all (a;, y) £ R and (a;', y') £ R. Such a set R is a reduction of X/E to Y/F (or just of E to F) if in addition 3) J? is a (graph of a) function X ^>Y. Write E < e f f F iff there is a (E, F)invariant preinjection P C X x Y, P £ A8,8, of X into Y. Write E < ^ f F, in words: E is effectively reducible to F, iff there is a reduction p £ A8,8, p : X —> Y of E to F. An equivalence relation E on a set X and the quotient X/E are A8,8smooth iff there is a A3;8 set Y such that E <eff Dy, where Dy is the equality on Y considered as an equivalence relation. h • This definition resembles some central concepts in modern descriptive set theory, like Borel reducibility and "Borel cardinals" (see, for instance, [6, 7, 18]), where Borel maps are used in approximately the same role as A8,8 maps in this paper. Proposition 8.2. (i) Suppose that E, F are A^ s equivalence relations on A8.8 sets X,Y. Then \X/E\eii < \Y/F\elt iff E < . „ F. (ii) An A8,55 equivalence relation E on a A^ s set X is A^ s smooth iff there exists a A8,3 set Y such that X/E e f f = \Y\ett. Proof, (i) Suppose that / £ L[D] is an injection X/E > Y/F. Then P = {(x,y) £ X x Y : / ( [ Z ] E ) = [V]F} is a set in L[D], hence a A  s set by 2.3(i), and obviously an invariant preinjection. The converse is equally simple: if P is an invariant preinjection then to define an injection / : X/E —» Y/F put /([x] E ) = [J/]F for any (x,y) £ P. (ii) Suppose that E < e f f Dz, where Z is a A8,8 set. Let this be witnessed by an invariant preinjection R C X x Z of class A8,8. Clearly R = p is then a reduction (a map X —> Z such that x E x ' <=*> p{x) = p{x')). The set Y — ran p C Z is as required. • 9. Equivalence relations with standard size classes In modern descriptive set theory, an equivalence relation E is countable iff all equivalence classes [X]E = {y : x E y}, x £ dom E, are at most countable. 
See [10] on properties and some open problems related to countable equivalence relations. But in the nonstandard setting the structure of equivalence 8
This condition can be weakened to [X]E n d o m i i 5^ 0 for any x 6 X without any harm. Note that in this case any invariant preinjection is a partial map that can be immediately extended to a reduction, and hence in fact E <*tl Dy holds. h
relations in a much wider class turns out to be considerably simpler: all of them admit effective transversals. Recall that a transversal of an equivalence relation is any set having exactly one element in common in every equivalence class. Theorem 9.1. Any A3,5* equivalence relation E, on an internal set H and with s.s. classes, has a A3,3 transversal and hence is A™smooth. Opposed to this, the Vitali equivalence on the reals is obviously countable but not smooth (via Borel maps), neither it admits a Borel transversal. Proof. First of all, a A " transversal implies A^smoothness: let p(x) denote the only element of the transversal equivalent to x and apply 2.1 to show that p is still A3,". Let us prove the existence of a A3;3 transversal. By definition E = \JaeA HbeB Eab, where Eab C H x H are internal sets while A, B e WF. Put P"x = {y: (x,y) € P} for P C H x H and x G H. Lemma 9.2. There exists a standard size family & of internal maps F : H > H such that [x]E C {F{x): F G J*"} for all x € H. Proof. It suffices to prove the lemma for each "constituent" Ea = ObeB Eab of E. According to 1.3.6 in [17], the intersection f]X of a s.s. family X of internal sets either is not a s. s. set or it is finite and there is a finite X' C % such that f]X' = f]X.lt follows that every set Ea[x] is finite and moreover there is a finite set (3ax £ B such that Ea " x = flbe^x Eab" x' Put, for any n G IN and any finite (3 C B, Ea0 = flbe/3 Eab
and
Papn = {(x,y) e Ea0 : ca.rdEa0 "x < n} .
All sets Paj3n are internal. We define Fa0ni(x) = ui th element of Pa/3n in the sence of a fixed internal linear ordering of Pa/3n" in the case when 1 < i < n and Pa/3n contains at least i elements, and Fapni{x) = Vo otherwise, where yo is a once and for all fixed element of H. It remains to define & to be the family of all functions Fa$ni. • Let & be as in the lemma. The sets D F = dom(EnF) = {:cG# :xEF(x)}
(F G &)
belong to A3,3 by Proposition 2.1. Let us fix an internal wellordering < of the set H. Suppose that F G &. For any x G H we carry out the following construction called the Fconstruction for x. Define an internal <decreasing sequence {z(a)}a
If z = F(x(a)) < X(Q) then put X(a+i) = z, otherwise put a{x) = a and stop the construction. Eventually the construction ends since £( a +i) ~< x{a) f° r all a. Put vp(x) = 0 if a(x) is even and vp{x) = 1 otherwise. Define ^{x)(F) = vF\x) for any x G H, F G &\ thus ip : H * 2*. Lemma 9.3. If r G23* then <&r = {x G H: ip(x) = r} belongs to Ag 3 . Proof. Note that x G * r iff vF{x) = r(F) for all F G &. On the other hand, all sets Xp = {x G H : vp(x) = 0} ( F G &) are internal because the Fconstruction is internal. It remains to apply Proposition 2.1. • According to the next lemma, any two different but Eequivalent elements x G H have different "profiles" ip(x). Lemma 9.4. If x ^ y G H and xEy
then ip(x) ¥" V'(y)
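The parity mechanism behind Lemma 9.4 can be tried out on a finite toy model. The following Python sketch is only an illustration under stated assumptions: H is replaced by the finite set {0, ..., 11} with its usual order, E by congruence modulo 3, and the family of Lemma 9.2 by four hypothetical maps F_c; the names steps, nu and profile are ours, and we assume that the F-construction starts with x_(0) = x.

```python
# Toy model (assumption): H = {0,...,11}, E = congruence mod 3,
# usual order on integers in place of the internal well-ordering.
H = range(12)


def make_F(c):
    # Hypothetical map sending x to the element 3*c + (x mod 3) of its E-class.
    return lambda x: 3 * c + (x % 3)


# Every E-class has 4 elements, so 4 maps cover each class (cf. Lemma 9.2).
family = [make_F(c) for c in range(4)]


def steps(F, x):
    """Number a(x) of steps of the F-construction for x.

    Assumed rule: x_(0) = x; while F(x_(a)) < x_(a), put x_(a+1) = F(x_(a)).
    The loop stops because the sequence is strictly decreasing.
    """
    a = 0
    while F(x) < x:
        x = F(x)
        a += 1
    return a


def nu(F, x):
    # 0 if a(x) is even, 1 otherwise.
    return steps(F, x) % 2


def profile(x):
    # The "profile" of x: one parity bit for each map of the family.
    return tuple(nu(F, x) for F in family)


# Different but E-equivalent elements get different profiles.
for x in H:
    for y in H:
        if x != y and x % 3 == y % 3:
            assert profile(x) != profile(y)
```

In this model, if y = F(x) < x then the F-construction for y has exactly one step less than the one for x, so the two parities differ; this is the step used in the proof below.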
Proof. Suppose that y < x. There exists a function F G & such that y = F(x). Then y = x^ in the sense of Fconstruction for x. It follows that the Fconstruction for y has exactly one step less than the Fconstruction for x. Thus vp(x) =£ vp(y) and ip(x) ^ ip(y). • We continue the proof of Theorem 9.1. Note that 2^ and ^(2^) are sets of standard size together with & (1.3.3 in [17]). Thus by the axiom of Standard Size Choice there is a map A — i > rA such that rA G A for any nonempty A C 2^. Its graph C = {{A,r) :A
& (x GDp AF(x)
G*r) A
AVFG&3rGA(xGDp=>
F(x) G * r ) .
Yet the sets \&r and Dp are A^ s (see above), while the domains A and & are sets of standard size. Now apply Proposition 2.1.
• (Thm 9.1)

10. Monadic partitions

A cut U ⊆ *ℕ is additive if a ∈ U ⟹ 2a ∈ U. Any such cut U induces an equivalence relation x M_U y iff |x − y| ∈ U on *ℕ. (The additivity implies that M_U is transitive.) Its equivalence classes [x]_U = {y : x M_U y} = {y : |x − y| ∈ U} are called U-monads, and relations of the form M_U, accordingly, monadic equivalence relations or monadic partitions. Monads of various kinds are considered in nonstandard analysis. As for those induced by additive cuts in *ℕ, see [11, 20]. The following is an elementary corollary of Proposition 5.2:

Proposition 10.1. If ∅ ≠ U ⊊ *ℕ is an additive Δ^ss_2 cut then U is non-internal and either standard size cofinal or standard size coinitial. •

Any additive Δ^ss_2 cut U ⊆ *ℕ defines a Δ^ss_2 quotient *ℕ/U = *ℕ/M_U, the set of all U-monads. According to the next theorem, effective cardinalities of those quotients are determined by two factors. The first of them is wid U = ⋂_{u ∈ U, u' ∈ *ℕ \ U} [0, u'/u) =
fWlWt/.^JO,^
the width of U.¹ The second one is the cofinality/coinitiality. The cofinality cof U of a standard size (s.s.) cofinal non-internal cut is the least cardinal ϑ ∈ Card such that U has an increasing cofinal sequence of type ϑ. The coinitiality coi U of a standard size coinitial cut is defined similarly, with a reference to coinitial sequences in *Card \ U. Note that cof U and coi U are infinite regular cardinals. Additive cuts of lowest possible width are obviously those of the form U = cℕ, c ∈ *ℕ, and U = c/ℕ, c ∈ *ℕ \ ℕ, which we call slow; they satisfy wid U = ℕ. Other additive cuts will be called fast.

Theorem 10.2. Suppose that U, V are additive Δ^ss_2 cuts in *ℕ other than ∅ and *ℕ. Then (i) |*ℕ|^eff ≤ |*ℕ/U|^eff. In addition,
(ii) *ℕ/U is Δ^ss_2-smooth iff *ℕ/U has a Δ^ss_2 transversal iff U is slow;
(iii) if U is slow then |*ℕ/U|^eff ≤ |*ℕ/V|^eff;
(iv) if both U, V are s.s. cofinal cuts and U is fast then |*ℕ/U|^eff ≤ |*ℕ/V|^eff iff cof U = cof V and wid U ⊆ wid V;
(v) if both U, V are s.s. coinitial cuts and U is fast then |*ℕ/U|^eff ≤ |*ℕ/V|^eff iff coi U = coi V and wid U ⊆ wid V;
(vi) if U, V are fast cuts, U is s.s. cofinal and V is s.s. coinitial then |*ℕ/U|^eff and |*ℕ/V|^eff are incomparable.

¹ Also called the thickness of U in some papers on AST.
Thus either of the two classes of monadic partitions (s. s. cofinal and s.s. coinitial) is linearly < eff (pre)ordered in each subclass of the same cofinality (coinitiality), slow partitions of both classes form the <eflleast type, and there is no other < eff connection between the two classes and their samecofinality/coinitiality subclasess. See [16] for earlier results of countably determined and Borel reducubility of monadic partitions for countably cofinal/coinitial cuts. 11. The proof of the reducibility theorem We begin the proof of Theorem 10.2 with the following observation. Remark 11.1. Call a set X C *[Nl scattered iff there is a number c G *N \ N such that I— is infinitesimal for any interval I in *N of length c. It is quite clear that *N is not a finite union of scattered sets, and hence; by Saturation, *Hsl is not a standard size union of internal scattered sets. • Proof of Theorem 10.2. (i) Choose a number h e *N\C/. The map x i+ [xh]u is an injection of %i into *D\l//7. (ii) If *Dsl/C/ admits a A " transversal then it is A s smooth. (Let, for x G *IM, p{x) be the only element of the transversal equivalent to a;.) Suppose that %\/U is smooth, i.e. My < e f f DR for a suitable A " set Z. This is witnessed by a A^ 8 reduction p : *tM —» Z By Theorem 3.1(i) the set r a n p can be covered by an internal set Y with  y  l n t < *IMint. Thus *N/[/ eff < *Neff • Then *IM/?7eff < \*M/V\eil for any other additive AS2S cut V by (i), thus U must be slow by (vi). Finally, if U is slow then *IN/C/ has a A " transversal by Theorem 1.4.7 in [17].•> (iii) If U is slow then *b\/U is A^ s smooth, and in fact *!M/[/ e " < *N eff , see the proof of (ii). It remains to apply (i). (iv) Thus let U, V be additive s.s. cofinal cuts. Choose increasing sequences {M^}{<,3 and {vr)}ri
Since R is A^s, we have, by definition, R = [jaeA ObeB ^ a6 > A,B £\J¥ and the sets Rab C *IKI x *N are internal. Let us fix a € A. Then Ra = flbes ^ob £ R, hence for any r\ < r we have Vb(xRaby
A x'Raby')
for all x,x',y,y' V?7
A yy' < ^
wnere
=*> 3 ^ < i? (a;  i '  < u £ )
£ *N. We obtain, by Saturation, 3 finite F C 5 3 £ < t f
Vx,x',y,y'
£ *N :
a; i ? a F y A x' i ? a F y' A \y  y'\ < vv =4> x  x'\ < U(. ,
(7)
where Rap = f~\beF Rab A similar (symmetric) argument yields: V£
3 finite F ' C B 3 ? 7 < T
V i , i ' , y , y ' e *N :
^ RaF' j A l ' fiaF' 2/' A \x  x'\ < Uf =4>  j /  J/' < Vv .
(8)
Suppose, towards the contrary, that vidU % widK Then there exists 7] < T such that the sequence {^}rj<7j'
=> \x —
x'\; \yy\
\xx'\
=>
Z,£,r),V
depend on o.
(9)
Put D(a) = domii(a), an internal subset of *IM together with R(a). Note that any interval of length t y in *N consists of approximately s = 1 subintervals of length vv. Accordingly any interval of length v^ consists of approximately t = ^ subintervals of length w^, while  is infinitesimal by the above. It follows by (9) that ' n,j\£l— is infinitesimal for any interval / in *M of length wi, hence D(a) is scattered in the sense of 11.1. On the other hand *N = domi? = UaeA Da — UaeA £Ka)> where Da = domi? a , simply because Ra C R(a), which is a contradiction with 11.1. Part 2: in the same assumptions and notation as in Part 1, we prove that cof U = cof V. This means to prove fl — r. Suppose •& ^ T. Let say •d < T. (The other case is similar.) Then, for a fixed a £ A, there is an
ordinal rj < r, one and the same for all £ < i9, such that (8) takes the form: V£ < d 3 finite F' C B Vx,x',y,y' € *N : x RaF, y A x' RaF> y' A x  x' < u € = > y  j / '  < vv .
(10)
Take an ordinal £ < i? for this 77 by (7), and then apply (10) for £ + 1. We obtain a finite set F C 5 such that, for all x, x' € D(a) = dom fia^ : \x — x'\
=>• x — x'\ < wj.
(11)
However, as £/ is fast, the cofinal sequence {«{} can be chosen so that — is infinitesimal for all £. Then the set D(a) is scattered by (11), and so on towards the contradiction as in Part 1. Part 3. Suppose that cof U = cof V = $ (an infinite regular cardinal in Card) and vidU C widV. To prove \*M/U\efI < \*M/V\e11 it suffices, by 8.2, to define a reduction of *IN/[/ to *INI/V. Let { U J } J < ^ , {^}^
3uGU
VU'€U,U'>U
V'> V
3V'GV,
/u' v' —<— \U
V
This allows us to define an unbounded subsequence of {u^}^,? such that, after the renumeration, the following holds (£,T7, C a r e ordinals < 1?): VC V ( > (
3r? > C f— < —,
that is,
^<^<
and then to once again define an unbounded subsection of, now, {vri]ri<^ to satisfy, after the renumeration, the following: V£ <
v
< 0 (^
< ^,
that is,
^
< ^ V
(12)
Finally, we may assume that u 0 = 1. (Replace each u^ by u'e = ^ . As all ii£ are powers of 2, these fractions belong to *IM. The sequence {u'A is then cofinal in the cut U' = U/u0 = {u: uu0 € U}. The inequality *IM/t/e" < *N/J7' eff is witnessed by the map \x]u >—> [entire part of ^}u') Note that the map / sending each u^ to v^ satisfies the following: dom / = {«£ : £ < t?} is a s.s. set, dom/ and r a n / consist of powers of 2, and ^ T ^ ^f1 f o r a11 u < u> i n d o m / b y ( 1 2 ) B y Saturation there is an internal function F with D — domF a hyperfinite subset of *Jsl \ {0}, such
that dom / C dom F, F(u^) = v^ for all £, and still D = dom F and Z — ran F consist of powers of 2 and —jL < ^, ' for all d < d' in D. Let /i =  D  i n t =  Z  i n t and D = {du d2,..., dh}, Z = {zu z 2 , . . . , z fc }, in the increasing order of *N in D. Then zv — F(d„) for all v = 1 , . . . , h. As all dv, zv are powers of 2, the fractions jj, =  j 1 ^ andfc^= ^ j ^ belong to *IM and j'„ < kv by the above. Note also that dj = u0 = 1. Any number i 6 ' N admits, in D, a unique representation in the form x = 53„ = 1 o^di,, where a„ G *IM and 0 < av < j„ for all v = 1 , . . . , h — 1 (but ah is not restricted, of course). The first idea that comes to mind is as a to try a(x) = Ylv=ia"zv reduction of *N/£7 to *N/V. However this does not work. Indeed let x = Ylv=i ^ anc ^ x' = X ^ = i ( i f ~ 1)^,, s o that x — x' = 1 but \<J{X) — <J{X')\ can be very big in the case when, say, ku > j v for all v. However there is a useful modification. Suppose that x = Y^v=\ avdv £ *N, and 0 < a„ < j v for v — 1 , . . . , h—1, as above. Say that x is type1 if there exist indices 1 < v' < v" < h — 1 such that dui £ U, dv» $ U, and ct^ = j v — 1 for all v such that v'
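The failure of the naive map σ considered in Part 3 of the proof of (iv) can be observed already with ordinary integers. The following Python sketch is an illustration only: the finite lists D and Z are standard stand-ins (our assumption) for the hyperfinite sets D and Z of powers of 2, and the names digits and sigma are ours. It computes the representation x = Σ a_ν d_ν with 0 ≤ a_ν < j_ν = d_{ν+1}/d_ν and then replaces each d_ν by z_ν.

```python
def digits(x, D):
    """Coefficients a_v with x = sum(a_v * D[v]).

    Assumes D[0] == 1 and that each D[v] divides D[v + 1].
    All coefficients except the last satisfy 0 <= a_v < D[v + 1] // D[v];
    the last one is not restricted.
    """
    a = []
    for v in range(len(D) - 1):
        j = D[v + 1] // D[v]          # the ratio j_v
        a.append((x // D[v]) % j)
    a.append(x // D[-1])              # unrestricted top coefficient
    return a


def sigma(x, D, Z):
    # The naive candidate: keep the coefficients, replace d_v by z_v.
    return sum(a * z for a, z in zip(digits(x, D), Z))


# Illustrative data: j_v = 2 and k_v = 4, so j_v <= k_v as in the text.
D = [1, 2, 4, 8]
Z = [1, 4, 16, 64]

x = D[-1]          # x = d_h, coefficients (0, ..., 0, 1)
x_prime = x - 1    # x' = sum of (j_v - 1) * d_v over v < h

gap = sigma(x, D, Z) - sigma(x_prime, D, Z)
```

Here x − x' = 1 while gap equals 64 − (1 + 4 + 16) = 43, and the gap grows with the ratios k_ν/j_ν. Two U-equivalent numbers may thus be sent far apart, which is why the proof modifies σ on numbers with long runs of maximal coefficients.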
(vi) Suppose that U, V are resp. s. s. cofinal, s. s. coinitial additive fast cuts. Prove that \*M/U\eil £ \*M/V\efl; the proof of \*\fi/V\eil g \*^/U\eii is similar. Choose an increasing sequence {u^}^ < ^ and a decreasing sequence {vv}v
\/x,x',y,y'
G *N :
x RaF y A x' RaF y' A \y  y'\ < vv = > x  x'\ < u^ ,
(13)
where Rap = f]beF Rab, and, in the opposite direction, V £ < t f V ? y < T 3 finite F'C B x RaF< y A x' RaF' y1 /\\x
Vx,x',y, y' G *N : x'\ < u( =>• \y  y'\ < vv .
(14)
Let a G A. Take £, 77, F as in (13). Take then F' as in (14) for £ + 1 and 77. We may assume that F C F ' . Then for all x,x' in the set D{a) = domR(a), where R(a) = RaF>, we have \x — x'\ < u^+i =>• \x — x'\ 2K €U,OT, equivalently, U = 2U holds, where 2U = {1? G *Card : 3 K G U (•& < 2K)}. (2K is understood as the cardinal exponentiation in I.) We write A > 2U to mean A > 2K for all K G U. Theorem 12.1. Suppose that E is a A  s equivalence relation on an internal set H and U C *Card is a A^ s cut such that IN C U. Then at least one of the following conditions holds:
(A) for any A G *Card with A > 2U and any m G *N \ N there is an internal map p defined on H such that  r a n p  l n t < Am (=A whenever A ^ *N) and p{x) = p(y) =4> x E y for all x,y G H; (B) i/iere exists an internal set Y C H of pairwise Einequivalent elements such that Y i n t G" U. If U is an exponential noninternal cut then (A) and (B) are incompatible even in the case when A8;3 maps p are allowed in (A). In terms of effective cardinals (B) means K < \H/E\eit (and even by means of an internal reduction) for some K — Y i n t G *N \ U, that is a restriction of the cardinality of the quotient H/E from below. Accordingly (A) means that for all A > 2U and m G *N \ N and any internal Z with  Z  l n t = Am there is an equivalence relation F on Z (in terms of (A), p{x) F p(y) iff x E y) such that  F / E  e f f <  Z / F  e " (still by means of an internal reduction), a restriction of the cardinality of H/E from above. Some theorems of this form are known from descriptive set theory, for instance Silver's theorem on II} equivalence relations in [28], in which "small" means at most countably many equivalence classes while "large" means that there exists a pairwise Einequivalent perfect set. Note that the implication p(x) = p(y) => x E y in (A) cannot be replaced by the equivalence p(x) = p(y) <==>• xEy: indeed the latter would imply the A8,3 smoothness of E, which, generally speaking, is not the case even for equivalence relations of the form Mj/ by Theorem 10.2. Proof (Theorem 12.1). Case 1: U is standard size cofinal, including internal cuts. In this case we prove a stronger disjunction (A') V (B), where (A') there exist a set D G WF, and for each d € D a,n internal set Rd and an internal map fd:H>Rd such that \Rd\iDt G 2U and f(x) = f(y) => x E y for all x,y G H, where f(x) = {fd(x)}deD . We first show that (A') implies (A). Suppose that A > 2U, m G *N \ N. Recall that the map d H> *d is an injection D —• *D. 
Its image D' = {*d: d G £>} C *D is a set of standard size together with D. By 4.1(iii), D' can be covered by an internal set S C *D such that  5  l n t < m. The Extension principle (1.3.13 in [17]) yields an internal function F defined on S x H so that F(*d, x) = fd{x) for all d G D, x G H. By the same reasons there is an internal map r defined on S so that r(*d) = Rd for all d G D. We can assume that for any s G S, r(s) is an internal set with r(s) l n t < A, and F(s, x) G r(s) for all x G H. (Otherwise redefine r and F by r(s) = {0}
and F(s, x) = 0 for all "bad" s — but none of s = *d, d € D, is "bad" in the assumptions of (A').) Put p(x)(s) = F(s,x) for x G H, s G S. We begin the proof of (A') V (B). By definition E = \JaeAf)beBEb, where all sets Eg C H x H are internal while A,B€ WF. We may w. 1. o. g. assume that every set Eg is symmetric (similarly to E itself), that is, Eg = (Eg)"1, where E~l = {(y,x) : x E y} : indeed E = E n E" 1 = \Ja(A f V e B Eg U (Eg,)1 = [)aGA
C
f\,VeB
w ,
where the sets Cgb, = (Eg U(Eg,)~1)n (Eg, U (Eg)'1) are symmetric. (We write x E y for (a;, y) £ E whenever E is a binary relation.) It follows from the transitivity of E that for any x, y G H 3aeA3z£HVb<=B(xEgzAyEgz)
=^
xEy.
The axiom of Saturation transforms this to 3aeAVB'€
&>fin(B)3z£H
(xEaB, z A y E%, z) = > x E y,
where EB, = C\beB' Eg As the two leftmost quantifiers are restricted to the sets A and ^tin(B) in WF, the last formula is equivalent to V
=* xEy,
(15)
where $ G WF is the set of all functions
^fin(B). As U is standard size cofinal, there is an increasing sequence {^}^ i/£) ==>• 3 i ^ y £ y 3 a G AVfo £ B (x Eg y)) , where P = &l(H) = {y C ff : y is internal}. Saturation converts the expression to the right of = > to laeAVB'e and then toV ip e $ 3a € A3x any function ip € $
&>fin(B)3x^yGY(xEB,y), ^ y sY
(x E\a, y). We conclude that for
V y G P (VC < i? (  y  l n t > ^ ) ^ 3 f l € 4 3 ^ ! / e F ( x
£ £ ( o ) y)) .
Saturation yields an ordinal £(
vi{v) ^3a^Av3x^y&Y(x
E£ ( a ) y)) .
(16)
Let Yy be any maximal (internal) subset of H such that i x E",as y for all a £ Av and I / J £ Yv. Then (16) implies l^, i n t < V£(v), while the properties of maximality of Yv and symmetricity of E% imply Vx£H3y£Ytp3a£Aip{xE°{a)y).
(17)
Put Cx(v, a) = {y G Vy : x E«(a) i/} for i e iJ, ^ e $, a 6 A , . Thus £r belongs to the set Z of all functions £ defined on the set D = {(tp,a) : if £ $ A a € A^,} G WF and satisfying Cx(<^,a) £ Rv = ^{Y^). The sets 1 U int i ^ are internal and satisfy Ifl^l" * G 2 (because \Yv\ G U). We claim that £x = (y implies x Ey.lt suffices, by (15), to prove that for every
{B), we put fd(x) = Cx{f,a) and fl^ = Rv for all x £ H and d = (
2U. Then (B) fails also for the internal, hence, s. s. cofinal, cut U' = {K £ *Card: 2K < A} : indeed, U C U' by the choice of A. Therefore (A) holds for U'. Thus there is an internal map p, domp = H, such that  xan.p\xnt < Am and p(x) = p(y) =>• x E y. Incompatibility. Assume that Y C H witnesses (B), in particular, K = \Y\iM (£U = 2U. Then U' = {A G "Card: 2X < K} is an internal cut with U C U'. Thus U C J]' since U is noninternal. Therefore there is •& 0 U such that 2s < K. Applying this trick once again, we find i? 0 U with 2 2 < K. Suppose on the contrary that p witnesses (A) for A = 2tf and some m G *N \ IN, m < i?. Then p fV is an internal injection of V into an internal set Z = p"Y satisfying  Z  i n t < 2* m . But this contradicts Theorem 3.1, since by definition 2®m • n< 2^ < 2 2 " < K =  y  i n t for any n £ N. • The case U = IN deserves special attention. Since Dsl is a s.s. cofinal cut, a stronger dichotomy holds: (A') V (B). Clearly (B) claims the existence of an infinite internal set of pairwise Einequivalent elements in this case. On the other hand, the sets fl^ in (A') are finite, hence P = Ylde£> Rd is a set of standard size, and so is any quotient of the form fl/F, where F is an equivalence relation on P. Thus (A') implies that H/E itself is a set of standard size. Such a dichotomy (i.e. standard size of H/E or an infinite internal pairwise inequivalent set) is contained in Theorem 1.4.11 in [17]. Similar dichotomies appeared in [16] for countably determined equivalence relations. P. Zlatos informed us that a close result for U = IN was earlier obtained by Vencovska (unpublished) in the frameworks of AST.
V'CJ/)) ( w e suppose that the variable a: does not appear in the sentence £). Denote 9rt ^± 03 x £, 6 ^ ( I s , 0 ) and c ;= (0,1 £ ). Then c = 6 and 9rt = (6) x (c). We may assume that 9rt = (6) x (6). Notice that (b) \=
13. Nonstandard version of the finite Ramsey theorem

The following corollary of Theorem 12.1 is a Ramsey-like result. Recall that $[A]^n = \{X \subseteq A : \operatorname{card} X = n\}$. By a partition of $[A]^n$ we understand any equivalence relation $E$ on $[A]^n$, and a homogeneous set for $E$ is any $H \subseteq A$ such that the sets $X \in [H]^n$ are pairwise $E$-equivalent. The finite Ramsey theorem claims (in ZFC) that (*) for any natural numbers $\ell, n, s$ there is $k \in \mathbb{N}$ such that $k \to (\ell)^n_s$. Here $k \to (\ell)^n_s$ means that for any partition of $[k]^n$ into $s$-many parts there is an $\ell$-element homogeneous set $H \subseteq k$. We refer to [25], and also to 3.3.7 in [1], §6 in [21], or [3] for a modern proof, details and related results. Let $K(\ell, s, n)$ denote the least $k$ satisfying $k \to (\ell)^n_s$. It is known that $K(\ell, s, n)$ is rapidly increasing as a function of $\ell$ for any fixed $n, s$, see [3]. But of course $K$ is a recursive function. It is an easy nonstandard corollary of (*) that $\kappa \xrightarrow{\mathrm{int}} (\ell)^n_s$ for all $n, s, \ell \in \mathbb{N}$ and $\kappa \in {}^*\mathbb{N} \setminus \mathbb{N}$, where int over the arrow means that the partition and the homogeneous set are assumed to be internal. A nicer nonstandard version, also well known, is $\kappa \xrightarrow{\mathrm{int}} (\infty)^n_s$ for any $n, s \in \mathbb{N}$ and $\kappa \in {}^*\mathbb{N} \setminus \mathbb{N}$, that is, any internal partition of $[\kappa]^n$ into $s$ parts admits an infinite internal homogeneous set. By the way, its quantifier structure is simpler than that of (*): $\forall \kappa, \ell, n, s\ \forall\,\text{partition}\ \exists A\ \forall u, v \in [A]^n$. The following theorem contains a much more general claim. In HST, define a function $K$ in WF as above. Then ${}^*K$ is a standard function ${}^*\mathbb{N} \to {}^*\mathbb{N}$ having in the internal universe I the same properties as $K$ in WF.

Theorem 13.1. Suppose that $U \subsetneq {}^*\mathbb{N}$ is a $\Delta^{\mathrm{ss}}_2$ cut with $\mathbb{N} \subseteq U$, closed under ${}^*K$ and exponential, $n \in \mathbb{N}$, $\kappa \in {}^*\mathbb{N} \setminus U$, and $E$ is a $\Delta^{\mathrm{ss}}_2$ equivalence relation on $[\kappa]^n$. If there are no internal pairwise $E$-inequivalent sets $Y \subseteq [\kappa]^n$ satisfying $|Y|_{\mathrm{int}} \notin U$, then the partition $E$ admits an internal homogeneous set $A \subseteq \kappa$ such that $|A|_{\mathrm{int}} \notin U$.
A similar result was obtained in [22] in the case $U = \mathbb{N}$ for countably determined equivalence relations. See Theorem 2.8 in [19] for a somewhat weaker result in the case when t in the proof of 13.1 is predefined.

Proof. Define, in WF, $f(s) = K(s, s, n)$ for each $s \in \mathbb{N}$. Then $f : \mathbb{N} \to \mathbb{N}$ and $s < f(s)$ for all $s$. The map ${}^*f$ has the same properties with respect to ${}^*\mathbb{N}$. As $U$ is ${}^*K$-closed and exponential, there exist $s, \vartheta \in {}^*\mathbb{N} \setminus U$ and $m \in {}^*\mathbb{N} \setminus \mathbb{N}$ such that ${}^*f(s) = {}^*K(s, s, n) < \kappa$ and $2^{\vartheta m} < s$. In our assumptions, (B) of Theorem 12.1 fails, hence (A) holds, that is, there exists an internal map $p$ defined on $[\kappa]^n$ such that $|\operatorname{ran} p|_{\mathrm{int}} < 2^{\vartheta m} < s$ and $p(u) = p(v) \Longrightarrow u \mathrel{E} v$ for all $u, v \in [\kappa]^n$. On the other hand, we have $\kappa \to (s)^n_s$ by the choice of $s$, therefore the partition of $[\kappa]^n$ induced by $p$ has an internal homogeneous set $A$ such that $|A|_{\mathrm{int}} = s \notin U$. Thus $p(u) = p(v)$, and hence $u \mathrel{E} v$, for all $u, v \in [A]^n$. □
References
1. C. C. Chang and H. J. Keisler, Model Theory, 3rd edition. Amsterdam: North-Holland, 1992, xiv + 650 pp.
2. K. Cuda and P. Vopenka, Real and imaginary classes in the AST, Comment. Math. Univ. Carol., 20, pp. 639-653 (1979).
3. P. Erdos, A. Hajnal, A. Mate, and R. Rado, Combinatorial set theory: partition relations for cardinals. Amsterdam: North-Holland, 1977.
4. P. Frankl, Families of finite sets satisfying an intersection condition, Bull. Austral. Math. Soc. 15, 1, pp. 73-79 (1976).
5. E. I. Gordon, A. G. Kusraev, and S. S. Kutateladze, Infinitesimal analysis, Kluwer, Dordrecht, 2002, xiv + 422 pp.
6. G. Hjorth, Orbit cardinals: on the effective cardinalities arising as quotient spaces of the form X/G where G acts on a Polish space X, Israel J. Math. 111, pp. 221-261 (1999).
7. G. Hjorth, Classification and Orbit Equivalence Relations (Mathematical surveys and monographs, 75), AMS, 2000.
8. K. Hrbacek, Axiomatic foundations for nonstandard analysis, Fund. Math. 98, pp. 1-19 (1978).
9. K. Hrbacek, Nonstandard set theory, Amer. Math. Monthly 86, pp. 659-677 (1979).
10. S. Jackson, A. S. Kechris, and A. Louveau, Countable Borel equivalence relations, J. Math. Logic, 2, 1, pp. 1-80 (2002).
11. R. Jin, Existence of some sparse sets of nonstandard natural numbers, J. Symbolic Logic 66, 2, pp. 959-973 (2001).
12. M. Kalina and P. Zlatos, Borel classes in AST, measurability, cuts, and equivalence, Comment. Math. Univ. Carol., 30, pp. 357-372 (1989).
13. V. Kanovei, Undecidable hypotheses in Edward Nelson's Internal Set Theory, Russian Math. Surveys 46, 6, pp. 1-54 (1991).
14. V. Kanovei and M. Reeken, Internal approach to external sets and universes, Studia Logica, 55, 2, pp. 229-257 (1995), 55, 3, pp. 347-376 (1995), 56, 3, pp. 293-322 (1996).
15. V. Kanovei and M. Reeken, Isomorphism property in nonstandard extensions of a ZFC universe, Ann. Pure Appl. Logic, 88, pp. 1-25 (1997).
16. V. Kanovei and M. Reeken, Borel and countably determined reducibility in nonstandard domain, Monats. fur Math., 140, 3, pp. 197-231 (2003).
17. V. Kanovei and M. Reeken, Nonstandard Analysis: Axiomatically, Springer, 2004.
18. A. S. Kechris, New directions in descriptive set theory, Bull. Symbolic Logic, 2, pp. 161-174 (1999).
19. H. J. Keisler, K. Kunen, A. Miller, and S. Leth, Descriptive set theory over hyperfinite sets, J. Symbolic Logic, 54, pp. 1167-1180 (1989).
20. H. J. Keisler and S. Leth, Meager sets on the hyperfinite time line, J. Symbolic Logic 56, pp. 71-102 (1991).
21. K. Kunen, Combinatorics, in: Handbook of mathematical logic, Studies in Logic and Foundations of Math., 90, North-Holland, Amsterdam, 1977, pp. 371-401.
22. J. Mlcek and P. Zlatos, Some Ramsey-type theorems for countably determined sets, Arch. Math. Logic 41, 7, pp. 619-630 (2002).
23. E. Nelson, Internal set theory; a new approach to nonstandard analysis, Bull. Amer. Math. Soc. 83, 6, pp. 1165-1198 (1977).
24. R. L. Panetta, A finite intersection property and the measurability of ultrafilters on hyperfinite sets, Ann. Math. Artif. Intell. 6, 1-3, pp. 267-270 (1992).
25. F. P. Ramsey, On a problem in formal logic, Proc. London Math. Soc., 30, pp. 264-286 (1930).
26. A. Robinson, Nonstandard analysis, North-Holland, Amsterdam, 1966, xi + 293 pp.
27. K. Schilling, Vanishing Borel sets, J. Symbolic Logic, 63, 1, pp. 262-268 (1998).
28. J. Silver, Counting the number of equivalence classes of Borel and coanalytic equivalence relations, Ann. Math. Log., 18, pp. 1-28 (1980).
29. V. A. Uspensky, What is nonstandard analysis? (Russian), Nauka, M., 1987.
MODEL-THEORETIC METHODS OF ANALYSIS OF COMPUTER ARITHMETIC
SERGE P. KOVALYOV
Institute of Computational Technologies, 6 Lavrentiev Ave, Novosibirsk, 630090, Russia
Email: [email protected]
Practical problems associated with engineering efficient, robust algorithms for real-world computers lie beyond the traditional scope of the mathematical theory of algorithms. Special mathematical methods are required to formally specify and verify the empirical approaches routinely used by technicians. Such methods, based on model theory and multiple-valued logics, are presented in this report. A construct of partial interpretation is elaborated for developing formal specifications of computer arithmetics that take resource limitations into account. Finite-valued Lukasiewicz logic is proven to be capable of expressing and verifying the operations used in computer implementations of integer arithmetic.
1. Introduction

One of the key problems arising in the development of computing systems is caused by restrictions on the available amount of resources. Computations performed on real devices are limited by finite amounts of time (performance) and space (memory). Due to memory limitations, computer implementations of arithmetic fail to satisfy the standard arithmetic axioms, which have only infinite models. Nevertheless, arithmetic devices are required to support numbers that behave similarly to their theoretical originals. Software engineers describe this situation as a conflict between functional and non-functional requirements on computation models. For semiconductor computers this conflict is considered to have been de facto resolved a few decades ago (although, as shown in the report, not ideally). However, when developing novel non-traditional computing devices the problem arises again. General mathematical methods are needed to solve it. Such methods, based on model theory and multiple-valued logics, are presented in this report. A construct of partial interpretation is elaborated for developing formal specifications of computer arithmetics that take resource limitations into account. Finite-valued Lukasiewicz logic is employed to express and verify the operations used in computer implementations of integer arithmetic. In fact, it is shown that Lukasiewicz logic is the "right" (natural) abstraction of various finite approximations of arithmetic.
2. Specifications of computer arithmetics

Traditionally, the Abstract Data Type technique is used to specify computer implementations of real-world objects [5]. However, it has a serious limitation: it doesn't offer tools for abstract modeling of infinite entities by finite structures. The author of the report has suggested one such tool in [4]. It is a special modification of the standard model-theoretic approach, called partial interpretation of a first-order theory T. It consists in constructing an algebraic system that must verify only those statements from T that contain only terms that can be substituted by constants from a given finite subsignature of its signature. Thus the formal specification of the resource available to represent objects described by T is constituted by the explicitly given set of constant symbols. Observe that even the usual properties of the equality relation are allowed to fail beyond this set. Isomorphic embedding of this set into the universes of models of various theories allows formalizing the concept of polymorphism. The precise definition of this construction follows.

Definition 1. We will consider first-order languages without equality. Let
Let £)
momorphism from the reduct 21 f (
$x \,|{-}|\, y \rightleftharpoons \max(0, x - y)$, $x \,|{\times}|\, y \rightleftharpoons \min(n, xy)$;
b) The (almost) symmetric modular segment of integers is the algebraic system
$MA_{n+1} \rightleftharpoons \langle E_{n+1};\ 0, 1, \ldots, [(n-1)/2], -[n/2], \ldots, -1;\ (=), (+), (-), (\times), \mathrm{Carry} \rangle$,
where $x \,(=)\, y \rightleftharpoons (x = y) \lor (x, y) \in \{(0, n), (n, 0)\}$, $x \,(+)\, y \rightleftharpoons (x + y) \bmod n$, $(-)x \rightleftharpoons n - x$, $x \,(\times)\, y \rightleftharpoons xy \bmod n$, $\mathrm{Carry}(x) \rightleftharpoons (x = n)$.

3. Arithmetics design method

Designers of computation models and algorithms use specifications of computer arithmetics as input data. They particularly need them when mapping computing algorithms to a computer architecture, i.e. binding the computation stream to the functional capabilities of the employed computing devices [8]. An abstract mathematical method of modeling computer architectures is needed here. It should offer a verification technique based on
formal proof. As a basis for such a method the author of the report has suggested employing multiple-valued Lukasiewicz logic [4]. Numerical data storage units (variables) are used as architecture elements of computer arithmetic. Their values (states) correspond to logical constants. Computation operations are described as compositions of base logic functions. Such a setting traditionally disposes one to apply multiple-valued logics. However, Lukasiewicz logic and its enrichments weren't thoroughly employed earlier. They provide rich capabilities to evaluate the efficiency of computing models against different criteria: functional power, performance, energy consumption, etc. For these purposes multiple-valued logic is represented as a matrix, that is, an algebraic system with the universe $E_{n+1}$. The following matrix corresponds to Lukasiewicz logic [3]:
$L_{n+1} \rightleftharpoons \langle E_{n+1};\ \sim, \to, \{n\} \rangle$, $\sim x \rightleftharpoons n - x$, $x \to y \rightleftharpoons \min(n, n - x + y)$.
The connectives of this matrix can be used to express many-valued disjunction and conjunction:
$x \lor y \rightleftharpoons \max(x, y) = (x \to y) \to y$, $x \land y \rightleftharpoons \min(x, y) = \sim(\sim x \lor \sim y) = \sim(x \to \sim(x \to y))$.
Let's consider properties of Lukasiewicz logic as a clone, that is, a class of functions on $E_{n+1}$ closed with respect to function composition. Every clone is a subclass of the Post logic $P_{n+1}$, which consists of all functions on $E_{n+1}$. For any nonempty set $X \subseteq E_{n+1}$ let
$C^{X}_{n+1} \rightleftharpoons \{ f \in P_{n+1} \mid f(X, \ldots, X) \subseteq X \}$,
$D^{X}_{n+1} \rightleftharpoons \{ f \in P_{n+1} \mid f(E_{n+1}, \ldots, E_{n+1}) \subseteq X \}$,
$Q^{X}_{n+1} \rightleftharpoons \{ f \in P_{n+1} \mid \text{there exists } c \in X \text{ such that } f(X, \ldots, X) = c \}$,
$T_{n+1} \rightleftharpoons C^{\{0,n\}}_{n+1}$.
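The connective identities above can be checked mechanically for small n. The following is a minimal sketch in C, assuming truth values are the integers 0..n; the function names (`luk_neg`, `luk_imp`, `luk_or`, `luk_and`, `luk_connectives_ok`) are illustrative and do not come from the report.

```c
#include <assert.h>

/* Illustrative names (not from the report); truth values are 0..n. */
static int luk_neg(int n, int x) { return n - x; }            /* ~x */

static int luk_imp(int n, int x, int y) {                     /* x -> y */
    int v = n - x + y;
    return v < n ? v : n;
}

/* x v y expressed as (x -> y) -> y */
static int luk_or(int n, int x, int y) {
    return luk_imp(n, luk_imp(n, x, y), y);
}

/* x ^ y expressed as ~(~x v ~y) */
static int luk_and(int n, int x, int y) {
    return luk_neg(n, luk_or(n, luk_neg(n, x), luk_neg(n, y)));
}

/* Returns 1 if the derived connectives equal max and min on all of E_{n+1}. */
int luk_connectives_ok(int n) {
    assert(n >= 1);
    for (int x = 0; x <= n; ++x) {
        for (int y = 0; y <= n; ++y) {
            int mx = x > y ? x : y;
            int mn = x < y ? x : y;
            if (luk_or(n, x, y) != mx) return 0;
            if (luk_and(n, x, y) != mn) return 0;
            /* second form of conjunction: ~(x -> ~(x -> y)) */
            int alt = luk_neg(n, luk_imp(n, x, luk_neg(n, luk_imp(n, x, y))));
            if (alt != mn) return 0;
        }
    }
    return 1;
}
```

Exhaustive enumeration is only feasible because the universe is finite, which is exactly the setting of the matrix $L_{n+1}$; it is a sanity check, not a proof for all n.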
Class $C^{X}_{n+1}$ is precomplete in $P_{n+1}$, i.e. it is closed and the closure of its union with an arbitrary function not expressible in it equals $P_{n+1}$. Regarding Lukasiewicz logic, Evans and Schwartz have shown in [1] that it is weakly complete [7], i.e. the system of functions obtained by uniting it with the set of all constant functions on $E_{n+1}$ is complete in $P_{n+1}$. Precisely due to functional incompleteness Lukasiewicz logic seems inadequate in modeling finite arithmetic. However, its incompleteness is overcome by addition of numbers themselves (constant functions). It means that Lukasiewicz logic allows discovering structural properties of arithmetic operations that don't depend on the values of their arguments. In [3], the following number-theoretic characterization is given for the lattice of clones that contain Lukasiewicz logic. We call a clone L-closed if it contains the connectives of $L_{n+1}$. Denote by $D(n)$ the set of all divisors of the number n. A subset $Y \subseteq D(n)$ is said to be LCM-closed iff $1 \in Y$ and $\mathrm{LCM}(x, y) \in Y$ whenever $x, y \in Y$. Henceforth, we denote by $F_{n+1}(Y)$ the class of functions f in $P_{n+1}$ that satisfy the following set of divisibility conditions: for every $m \in Y$, if all $x_1, \ldots, x_k$ are divisible by m, then $f(x_1, \ldots, x_k)$ is also divisible by m. Then $F_{n+1}(Y) = \bigcap_{u \in Y} C^{V(u)}_{n+1}$, where $V(u) \rightleftharpoons \{ v \mid v \in E_{n+1} \land u \in D(v) \}$ (thus $F_{n+1}(Y)$ is a clone). It is proven that a class $K_{n+1}$ is L-closed if and only if there exists an LCM-closed set $Y \subseteq E_{n+1}$ such that $K_{n+1} = F_{n+1}(Y)$. In particular, on the one hand $L_{n+1}$ is contained in the precomplete class $T_{n+1}$ and coincides with it if and only if n is prime. On the other hand, $L_{n+1}$ contains the class $D^{\{0,n\}}_{n+1}$ and doesn't coincide with it if n > 1 (hereinafter we assume that this condition holds). There exists a bijective correspondence between functions from $D^{\{0,n\}}_{n+1}$ and predicates on $E_{n+1}$ that interpret arithmetic relations. They are constructed via disjunction and conjunction from Rosser-Turquette functions
In the present report the apparatus of Lukasiewicz logic is employed to analyze properties of the operations of computer arithmetics described in Definition 2. The following results are obtained.

Proposition 1. An expression whose computation is determined by a formula A does not cause overflow if and only if $\sim J_n(A)$ is a tautology in $L_{n+1}$. □

Lemma 1. The following equalities hold.
a) $(x = y) = J_n((x \to y) \land (y \to x))$;
b) $x \,|{+}|\, y = \sim x \to y$;
c) $x \,|{-}|\, y = \sim(x \to y)$;
d) $x \,|{\times}|\, y = \bigvee_{i \in E_{n+1}} (0 \,|{+}|^{i}\, x) \land J_i(y)$.

Theorem 1. a) The functions $=$, $|{+}|$, $|{-}|$, $|{\times}|$ are expressible in $L_{n+1}$.
b) The system of functions $\{ |{+}|, |{-}| \}$ forms a basis in the class $L_{n+1} \cap C^{\{0\}}_{n+1}$, which is precomplete in $L_{n+1}$.
c) The systems of functions $\{ \sim, |{+}| \}$, $\{ n, |{-}| \}$, and $OAL_{n+1} \rightleftharpoons \{ =, |{-}| \}$ are bases in $L_{n+1}$.
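The equalities of Lemma 1 lend themselves to a brute-force check over $E_{n+1}$. The sketch below makes two assumptions that are not confirmed by the text at hand: the Rosser-Turquette function is taken in its usual form ($J_i(x) = n$ if $x = i$, and 0 otherwise), and item (d) is read as $x \,|{\times}|\, y = \bigvee_i (0 \,|{+}|^{i}\, x) \land J_i(y)$, where $|{+}|^{i}$ means adding x with saturation i times. All function names are hypothetical.

```c
#include <assert.h>

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Lukasiewicz connectives on 0..n */
static int neg(int n, int x) { return n - x; }
static int imp(int n, int x, int y) { return imin(n, n - x + y); }

/* Assumed standard Rosser-Turquette function: J_i(x) = n if x == i, else 0 */
static int J(int n, int i, int x) { return x == i ? n : 0; }

/* Overflow (saturating) arithmetic on 0..n */
static int sat_add(int n, int x, int y) { return imin(n, x + y); }
static int sat_sub(int x, int y) { return imax(0, x - y); }
static int sat_mul(int n, int x, int y) { return imin(n, x * y); }

/* Returns 1 if items (a)-(d) of Lemma 1, as read above, hold for all x, y in 0..n. */
int lemma1_holds(int n) {
    assert(n >= 1);
    for (int x = 0; x <= n; ++x) {
        for (int y = 0; y <= n; ++y) {
            /* (a) equality predicate, with n standing for "true" */
            int eq = J(n, n, imin(imp(n, x, y), imp(n, y, x)));
            if (eq != (x == y ? n : 0)) return 0;
            /* (b) saturating addition */
            if (imp(n, neg(n, x), y) != sat_add(n, x, y)) return 0;
            /* (c) saturating subtraction */
            if (neg(n, imp(n, x, y)) != sat_sub(x, y)) return 0;
            /* (d) saturating multiplication as a disjunction over i */
            int acc = 0;
            for (int i = 0; i <= n; ++i) {
                int t = 0;
                for (int k = 0; k < i; ++k) t = sat_add(n, t, x);
                acc = imax(acc, imin(t, J(n, i, y)));
            }
            if (acc != sat_mul(n, x, y)) return 0;
        }
    }
    return 1;
}
```

Item (d) makes the cost concern of Section 4 visible: the disjunction ranges over all n + 1 values, so the expression grows with n.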
Corollary 1.1. The set of constants and functions of the overflow arithmetic $OA_{n+1}$ is complete in $P_{n+1}$. □

Lemma 2. The following equalities hold.
a) $x \bmod n = x \,|{-}|\, J_n(x)$;
b) $x \,(+)\, y = ((x \,|{+}|\, y) \bmod n) \,|{+}|\, ((x \,|{-}|\, (n \,|{-}|\, y)) \bmod n)$;
c) $(-)x = n \,|{-}|\, x$;
d) $x \,(\times)\, y = \bigvee_{i \in E_{n+1}} (0 \,(+)^{i}\, x) \land J_i(y)$;
e) $\mathrm{Carry}(x) = J_n(x)$;
f) $x \,(=)\, y = \mathrm{Carry}((-)[x \,(+)\, ((-)y)])$.

Denote by $M_{n+1}$ the class of functions that preserve the modular equality relation $(=)$ of the system $MA_{n+1}$. It is known [7] that the class $M_{n+1}$ is precomplete but not weakly complete. Denote by $M^{4}_{n+1}$ the class of functions that preserve the quaternary relation
$\Lambda_{n+1} \rightleftharpoons \{ (x, x, x, x) \mid x \in E_{n+1} \} \cup \left( \{0, n\}^4 \cap \{ (x_1, x_2, x_3, x_4) \mid x_1 + x_2 = x_3 + x_4 \pmod{2n} \} \right)$.
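Returning to Lemma 2, its items can also be checked by enumeration. The sketch below assumes that item (b) reads $x \,(+)\, y = ((x \,|{+}|\, y) \bmod n) \,|{+}|\, ((x \,|{-}|\, (n \,|{-}|\, y)) \bmod n)$ and that item (f) reads $x \,(=)\, y = \mathrm{Carry}((-)[x \,(+)\, ((-)y)])$; both readings are reconstructions, and the helper names are hypothetical.

```c
#include <assert.h>

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

static int sat_add(int n, int x, int y) { return imin(n, x + y); }
static int sat_sub(int x, int y) { return imax(0, x - y); }

/* J_n(x): n if x == n, else 0 (this is also Carry, read as n/0) */
static int J_top(int n, int x) { return x == n ? n : 0; }

/* Item (a): x mod n on the universe 0..n, written as x |-| J_n(x) */
static int mod_n(int n, int x) { return sat_sub(x, J_top(n, x)); }

/* Returns 1 if items (a), (b), (c), (f), as read above, hold on 0..n. */
int lemma2_holds(int n) {
    assert(n >= 1);
    for (int x = 0; x <= n; ++x) {
        for (int y = 0; y <= n; ++y) {
            /* (a) */
            if (mod_n(n, x) != x % n) return 0;
            /* (b) modular sum assembled from saturating operations */
            int low = mod_n(n, sat_add(n, x, y));
            int high = mod_n(n, sat_sub(x, sat_sub(n, y)));
            if (sat_add(n, low, high) != (x + y) % n) return 0;
            /* (c) */
            if (sat_sub(n, x) != n - x) return 0;
            /* (f) modular equality through Carry */
            int s = (x + (n - y)) % n;           /* x (+) ((-)y) */
            int carry = ((n - s) == n);          /* Carry((-)s)  */
            int modeq = (x == y) || (x == 0 && y == n) || (x == n && y == 0);
            if (carry != modeq) return 0;
        }
    }
    return 1;
}
```

In item (b) at most one of the two summands is nonzero: the first is nonzero only when $x + y < n$ and the second only when $x + y > n$, so the outer saturating addition never actually saturates.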
Denote by $U_{n+1}$ the class of all unary functions on $E_{n+1}$. Let
$MP_{n+1} \rightleftharpoons \{ f \in P_{n+1} \mid f(x_1, \ldots, x_{s-1}, 0, x_{s+1}, \ldots, x_k) = f(x_1, \ldots, x_{s-1}, n, x_{s+1}, \ldots, x_k) \text{ for all } s = 1, \ldots, k \}$,
J«+i ^ {id, ~ , Jn, JQ, JQJU, JQJQ},
R„ +1 ^ ( M P B + 1 n (Dft?> u D f ; r U n } u Df;t A{0} )) u JB+1, S„+i ^ (QiTi}
n M
« + i ) u ( T « + i n un+i)
Theorem 2. a) The functions $(=)$, $(+)$, $(-)$, $(\times)$ are expressible in $L_{n+1}$.
b) The system of functions $\{ (+), (-), (\times), \mathrm{Carry} \}$ forms a base in the class
$MAC_{n+1} \subseteq L_{n+1} \cap R_{n+1} \subseteq L_{n+1} \cap S_{n+1} \subseteq L_{n+1} \cap M^{4}_{n+1} \subseteq L_{n+1} \cap M_{n+1} \subseteq L_{n+1}$,
where $MAC_{n+1}$ coincides with $L_{n+1} \cap R_{n+1}$ if and only if n is prime.
c) The system of functions $MAL_{n+1}$ forms a base in
n +
^ \ { ^ C a {{(+),(),
r r y
^ Carry, V},
U
= 2' n > 2,
Ln+\.
Corollary 2.1. The set of constants and functions of the modular arithmetic $MA_{n+1}$ is neither complete nor maximal in $P_{n+1}$ but becomes complete after enrichment by the function $\max(x, y)$, $(x, y) \in E_{n+1} \times E_{n+1}$. □

4. Digital number systems

The explicit expressions for arithmetic operations shown in Lemmas 1 and 2 demonstrate the following well-known fact: when n increases, the computational complexity of machine implementations of operations increases. The traditional approach to decreasing complexity lies in employing a digital number representation in a positional system. Arithmetic over a q-digit (q > 1) number representation in the b-ary positional system is interpreted over the universe $E_{b \uparrow q}$, where $b \uparrow q \rightleftharpoons b^q + 1$. The functional structure of digit-wise operations is induced by decomposing a number x into summands of the form $b_j(x) = x_j b^j$, $j = 0, 1, \ldots, q - 1$, each of which is computed by the formula
$b_j(x) \rightleftharpoons \begin{cases} x \bmod b^{j+1} - x \bmod b^{j}, & 0 \le j < q - 1, \\ x - x \bmod b^{q-1}, & j = q - 1. \end{cases}$
An arbitrary function $f : E^{k}_{b \uparrow q} \to E_{b \uparrow q}$ may be defined by a quasi-Post representation (q. p. r.), that is, a family of functions $\{ f_j : E^{kq}_{b+1} \to E_{b+1} \mid j = 0, \ldots, q - 1 \}$ such that, if the decomposition of every $x_i$ in base b has the form $x_i = \sum_j x_{i,j} b^j$, then the decomposition of $f(x_1, \ldots, x_k)$ has the form
$f(x_1, \ldots, x_k) = \sum_j f_j(x_{1,0}, \ldots, x_{1,q-1}, \ldots, x_{k,0}, \ldots, x_{k,q-1})\, b^j$.
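The digit-wise decomposition can be illustrated with a short C sketch. It assumes the universe $E_{b \uparrow q} = \{0, \ldots, b^q\}$ and the piecewise formula $b_j(x) = x \bmod b^{j+1} - x \bmod b^j$ for $0 \le j < q - 1$, $b_{q-1}(x) = x - x \bmod b^{q-1}$; the names `ipow`, `digit_part` and `parts_sum_to_x` are illustrative only.

```c
#include <assert.h>

/* b^e for small non-negative e (no overflow checking; illustrative only) */
static long ipow(long b, int e) {
    long r = 1;
    while (e-- > 0) r *= b;
    return r;
}

/* The summand b_j(x) = x_j * b^j for x in 0..b^q, per the piecewise formula. */
long digit_part(long x, long b, int q, int j) {
    assert(b > 1 && q >= 1 && j >= 0 && j < q);
    assert(x >= 0 && x <= ipow(b, q));
    if (j < q - 1) {
        return x % ipow(b, j + 1) - x % ipow(b, j);
    }
    /* top position: absorbs everything above b^(q-1), including x = b^q */
    return x - x % ipow(b, q - 1);
}

/* Returns 1 if b_0(x) + ... + b_{q-1}(x) == x for every x in 0..b^q. */
int parts_sum_to_x(long b, int q) {
    long top = ipow(b, q);
    for (long x = 0; x <= top; ++x) {
        long sum = 0;
        for (int j = 0; j < q; ++j) sum += digit_part(x, b, q, j);
        if (sum != x) return 0;
    }
    return 1;
}
```

Note the special role of the top position: for $x = b^q$ the top summand equals $b \cdot b^{q-1}$, so the top "digit" takes the value b, which is why the digit functions range over $E_{b+1}$ rather than over $\{0, \ldots, b - 1\}$.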
This representation is called regular (r. q. p. r.) if it is done by means of a family that maps the set £?* n ( { 0 } 9 _ 1 x {&}) into itself. If the function has an r. q. p. r. in which each /,• depends only on ( x i j , • • •, Xk,j) € E*+l, then it is called digital. Further, if it has an r. q. p. r. in which each fj depends only on (xi, ) G Eft+i1^' t n e n it i s called progressive. Progressive functions play an important role in machine arithmetics: functions bj(x), and all functions of the system MA„ + i are progressive. Consider the r. q. p. r. as a mapping of classes of functions as follows: given a class Kb+i of functions on Eb+i, for every q > 1, denote by Kb+i t 1 the class of functions on E^q that have r. q. p. r. with all terms belonging to Kb+i. We also put Kb + i f 1 ;=± Kb+i because of the equality b j 1 = 6 + 1 . Since the r. q. p. r. is concordant with superposition (i. e., for every tuple of functions, there exists an r. q. p. r. of their superposition whose terms are superpositions of the terms of r. q. p. r.'s of these functions), it follows that closedness of the class Kb+i implies closedness of Kb+i  Q The class of all digital functions from Kb+i  q will be denoted by Kf,+i  _ q, and the class of all progressive functions from K(,+i ] q  by Kb + i  + q. Analysis of functional properties of digital computer arithmetics yield the following results. Theorem 3. a) The class Kb+i coincides with Tb+i if and only if the class Kb+i  q coincides with Tb\q for every q > 1. b) For every q > 1, the class L[Lb+\  q] coincides with Tb[q. c) For every q > 1, if b = ps, with p being a prime and s > 0, then all progressive functions in Lb+\ T + q (in particular, all the functions bj, j = 0 , . . . , q — 1) are expressible in Lb\q; otherwise, none of the bj's is expressible in L^q and the set of all bj 's enriches Lb\q to the class L+u ^ L[Lb+l t+ q] = L[Lb+1 H q] = Fbu({dbi\deD(b),
j =
0,...,ql}).
d) Ifb is not a prime power then the system of functions QPLbU ^ {>, 60, bo (+) h,..., forms a base in £jt for every q > 1.
b0 ( + ) . . . (+) bq_2}
Corollary 3.1. The following statements are equivalent: a) b is a prime; b) the class $L_{b+1} \uparrow q$ coincides with $T_{b \uparrow q}$ for every q > 1; c) for every q > 1 and every function expressible in $L_{b \uparrow q}$, there exists a q. p. r. all of whose members are expressible in $L_{b+1}$.

5. Application: dataflow computations verification

Deep formal performance analysis and optimization of arithmetic algorithms is required in control engineering. Control applications, such as sound mixers or digital electricity meters, must satisfy hard real-time and reliability requirements while running on devices with limited resource capacities. One of the widely used approaches to their design is based on the dataflow paradigm [2]. Here data values are represented as flows, that is, numeric sequences $\{ x_i \mid i = 1, 2, \ldots \}$ indexed by the ticks of an infinite discrete clock. At clock tick i, the i-th values of output flows (reactions) are computed from the i-th values of input flows (stimuli) and possibly the l-th values of certain flows for some l < i (memory). The computation is: a) directed (each $x_i$ is assigned exactly once, at tick i), b) bounded (the total number of operations performed at tick i doesn't exceed a fixed constant that doesn't depend on i), c) finite (all data domains are finite sets). At the implementation phase it must be verified that the computation always ends before the next tick and that an overflow is signaled if it occurs. The presented results are used for this purpose as follows. Computation and overflow detection rules are specified as finite superpositions of base functions of a suitable clone enriching Lukasiewicz logic. Requirements a)-c) guarantee that such a specification exists. When mapping it to the target microprocessor, arithmetic unit commands are expressed using the same base. Then subexpression matching is performed, resulting in a formal definition of the algorithm in terms of target hardware capabilities. It is optimized using various techniques, e.g. caching results that are used more than once in intermediate flows.
The computation time of the resulting expressions is determined using hardware performance characteristics. Such analysis requires a large amount of routine Lukasiewicz logic reasoning when applied to real-world algorithms. For a brief illustration we give a very simple example here. Consider a sound filter described by the following finite-difference equation:
$y_i = \sum_{l=0,\ldots,N} b_l x_{i-l} - \sum_{l=1,\ldots,M} a_l y_{i-l}$.
In order to implement it using overflow and modular arithmetics (e.g. on a TMS digital signal processor from Texas Instruments Corp.), it is specified according to Lemmas 1 and 2 as follows:
$y(z_0, \ldots, z_{N+M}) = \big[ (+)_{l=0,\ldots,N} (0 \,(+)^{b_l}\, z_l) \big] \,(+)\, \big[ (-)\, (+)_{l=1,\ldots,M} (0 \,(+)^{a_l}\, z_{N+l}) \big]$,
$o(z_0, \ldots, z_{N+M}) = \mathrm{Carry}\big( |{+}|_{l=0,\ldots,N} (0 \,|{+}|^{b_l}\, z_l) \big) \lor \mathrm{Carry}\big( |{+}|_{l=1,\ldots,M} (0 \,|{+}|^{a_l}\, z_{N+l}) \big)$,
where o is a two-valued (Boolean) overflow signal. Mapping of this specification is straightforward.
6. Conclusion

To summarize the presented results we quickly review the traditional approach to the implementation of computer arithmetics. In general-purpose hardware platforms and programming environments, modular arithmetic $MA_{2 \uparrow q}$ is implemented with binary digital number representation (usually q equals 16, 32 or 64). At the hardware level, the choice of the number 2 for the base is justified by Corollary 3.1 and item (c) of Theorem 2. In addition, the q. p. r. used for constructing arithmetic operations is implemented "for free" (without extra transformations) by means of a simple commutation of the input and output lines. However, the Carry predicate (flag) doesn't have a software implementation, which complicates integer overflow control. For example, the predicate $\mathrm{Carry}(x \,|{+}|\, y)$ is implemented by the following rather verbose procedure in the C programming language:

    int carry_sum(unsigned int x, unsigned int y)
    {
        /* -y wraps to 2^q - y for unsigned y, so x >= -y exactly when x + y >= 2^q */
        return (y != 0) && (x >= -y);
    }

Moreover, the functional incompleteness of $MA_{n+1}$ makes such inefficient operations as the conditional branch necessary (cf. Corollary 2.1). An important alternative is offered by multiple-valued arithmetic-logical units (MVALU), such as a hardware implementation of the overflow subtraction $|{-}|$ by polymer conductors [6]. According to Theorem 1, it is enough to implement this operation in order to build a fully functional MVALU.
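The comparison trick in the procedure above can be cross-checked against a wider addition. The following is a minimal sketch, assuming a fixed width of 32 bits for definiteness; the names `carry_sum32`, `carry_ref32` and `check_carry_samples` are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdint.h>

/* The unsigned-comparison test from the text, fixed to 32 bits. */
int carry_sum32(uint32_t x, uint32_t y) {
    uint32_t neg_y = (uint32_t)(0u - y);   /* 2^32 - y modulo 2^32 */
    return (y != 0) && (x >= neg_y);
}

/* Reference: add in 64 bits and inspect bit 32 of the sum. */
int carry_ref32(uint32_t x, uint32_t y) {
    return (int)(((uint64_t)x + (uint64_t)y) >> 32);
}

/* Compares both versions on boundary values; aborts on the first mismatch. */
void check_carry_samples(void) {
    static const uint32_t s[] = {
        0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
        0x80000001u, 0xfffffffeu, 0xffffffffu
    };
    const int m = (int)(sizeof s / sizeof s[0]);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < m; ++j) {
            assert(carry_sum32(s[i], s[j]) == carry_ref32(s[i], s[j]));
        }
    }
}
```

The reference version relies on a wider integer type being available, which is exactly what is missing at the native word size; that is why the comparison form is needed in practice.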
References
1. Evans T., Schwartz P.B., On Slupecki T-functions. J. Symbolic Logic 23 (1958), 267-270.
2. Halbwachs N., Caspi P., Raymond P., Pilaud D., The synchronous data flow programming language LUSTRE. Proc. IEEE 79 (1991), 1305-1320.
3. Karpenko A.S., Lukasiewicz's Logics and Prime Numbers (Moscow, Nauka, 2000). (Russian)
4. Kovalyov S.P., Mathematical foundations of computer arithmetics. Siberian Advances in Mathematics 15(4) (2005), 34-70.
5. Liskov B.H., Zilles S.N., Specification techniques for data abstractions. IEEE Trans. on Software Engineering SE-1(1) (1975), 7-19.
6. Mills J.W., Polymer Processors. Indiana University, Computer Science Dept, Technical Report TR580 (Indiana University, 2003). http://www.cs.indiana.edu/pub/techreports/TR580.pdf.
7. Rosenberg I.G., Completeness properties of multiple-valued logic algebras. Computer Science and Multiple-Valued Logic (Amsterdam, North-Holland, 1977), 144-186.
8. Voevodin V.V., Mapping the problems of computational mathematics onto computer systems architecture. Russian J. Numer. Anal. Math. Modelling 15(3-4) (2000), 349-359.
THE FUNCTIONAL COMPLETENESS OF LESNIEWSKI'S SYSTEMS
FRANÇOIS LEPAGE
Département de philosophie, Université de Montréal, C.P. 6128, succ. Centre-Ville, Montréal (Québec) H3C 3J7, Canada
Email: francois.lepage@umontreal.ca

After a brief presentation of Lesniewski's systems of Protothetic and Ontology, we introduce the λ-operator and then show that the language of Protothetic is functionally complete. Furthermore, after having introduced a very simple algebra over names, we show that the language of Ontology is also functionally complete.
1. Introduction

In the first half of the last century, the Polish logician Stanislaw Lesniewski introduced three new systems of logic (Protothetic, Ontology and Mereology) that were free from logical contradictions and strong enough to be used as a foundation of mathematics. His work did not receive all the attention it deserves, most likely because it arrived too late. In this paper, I will first present a rough sketch of the first two of Lesniewski's systems. Secondly, I will expand upon these systems by adding the λ-abstractor. Lastly, I will introduce two special functors, which will engender a new system that is functionally complete in a sense to be specified in the third section.
2. Lesniewski's Systems The simplest of the three systems is Protothetic, which is incorporated into the two others. Protothetic is a generalized calculus of propositions containing variables of arbitrary syntactic categories which are defined starting with the basic category S of sentences.
Definition 2.1.
(1) S is a syntactic category.
(2) If X and Y are syntactic categories, (X/Y) is a syntactic category (see footnote a).
(3) Nothing else is a syntactic category.

The wff's of Protothetic are
(1) A variable of type S;
(2) Identity statements $\ulcorner \equiv(AB) \urcorner$ where A and B are wff's;
(3) Generalization: $\lfloor v_1 \ldots v_n \rfloor \ulcorner A \urcorner$ where A is a wff;
(4) All the expressions $N(v_1 \ldots v_n)$ where N is introduced by a definition. The general form of these definitions is
$\lfloor v_1 \ldots v_n \rfloor \ulcorner \equiv(N(v_1 \ldots v_n)\ A\{v_1 \ldots v_n\}) \urcorner$
where N is a new constant, $N(v_1 \ldots v_n)$ is of category S and $A\{v_1 \ldots v_n\}$ is an already defined wff containing the variables $v_1 \ldots v_n$.

Example 2.1.
(1) p where p is a propositional variable
(2) $\lfloor p \rfloor \ulcorner p \urcorner$ (which will be used to designate falsity $\bot$)
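(3) $\lfloor pq \rfloor \ulcorner \equiv(pq) \urcorner$
(4) $\lfloor p \rfloor \ulcorner \equiv(pp) \urcorner$ (which will be used as truth $\top$)
(5) $\lfloor p \rfloor \ulcorner \equiv(\neg(p)\ \equiv(p\bot)) \urcorner$ (this definition introduces negation)
(6) $\lfloor pq \rfloor \ulcorner \equiv(\land(pq)\ \lfloor f \rfloor \ulcorner \equiv(p\ \equiv(\lfloor r \rfloor \ulcorner \equiv(p f(r)) \urcorner\ \lfloor r \rfloor \ulcorner \equiv(q f(r)) \urcorner)) \urcorner) \urcorner$ (this definition introduces conjunction). In the last example, f is of category S/S.
(7) $\lfloor pq \rfloor \ulcorner \equiv(\lor(pq)\ \neg(\land(\neg(p)\neg(q)))) \urcorner$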
(8) bJr=Ota)vKp)
A2. l^jr=(^(M)L/Jl = (/(p)/(g))l)l A3. \pq\ \= (= (pq) = (L/J r= (f(P)f(q)) = (Pq)]))} a T h e (2) of this definition is much simpler than the one we can find in the literature on Protothetic: (2) If X and X\,..., Xn are syntactic categories, X/X\ ... Xn is a syntactic category. We can easily show that (... {XjXi)... Xn) can accomplish the same task. We will write A{BC) instead of (A)(B){C)
A4. L/JN(/(bJM) = (= (/(bj \P])[P\ \P\) = dq\ r/(w rpi)/(?)i)))i

If we add the following 5 rules:
R1. Substitution
R2. Detachment
R3. Distribution of quantifiers
R4. Extensionality (of any expression of any category)
R5. Rule of definition (every definition above is a theorem)
we have a complete system in the following sense: every closed wff of category S is either a theorem or its negation is a theorem. It is worth saying a few words about the distinction between Protothetic and the theory of propositional types. First, Protothetic is purely nominalistic: there is no formal semantics in terms of a hierarchy of functions built on truth values. Indeed, we have an implicit semantics: the theorems are taken as the true statements, and expressions of the type S/S can be seen as denoting one-place propositional functions, etc., but these considerations play no role in the theory. Protothetic is a syntactic theory and thus purely inscriptional: it deals with uninterpreted strings of symbols. This brings me to a second remark. A Protothetic system is never completely developed. One can always introduce new constants and thus produce new theorems. Each system is complete, but this completeness is relative to the constant functors already introduced. As a result, each system is a work in progress. This is the major difference from propositional type theory, in which the hierarchy of propositional functions is given once and for all. The second system is Ontology. It expands upon Protothetic by adding a second basic category, the category N of names.

Definition 2.2.
(1) S and N are syntactic categories;
(2) If X and Y are syntactic categories, (X/Y) is a syntactic category;
(3) Nothing else is a syntactic category.
We will omit parentheses when unnecessary, the default association being on the left: we will thus write X/Y for (X/Y), X/Y/Z for ((X/Y)/Z) and X/(Y/Z) for (X/(Y/Z)). Ontology contains a new constant "$\varepsilon$" of the category S/N/N. $\varepsilon(AB)$ should be read "A is a part of B". "$\varepsilon$" should not be confused with the "$\in$" of set theory. For example, "$\varepsilon$" is transitive whereas "$\in$" is not.
Let us introduce equality in Protothetic. For expressions of type S, equality is identity. Let us consider expressions A and B of type X. There are categories $X_1, \ldots, X_n$ such that for any expressions $A_1, \ldots, A_n$ respectively of category $X_1, \ldots, X_n$, $A(A_1 \ldots A_n)$ is of type S. We define identity as follows:
$\lfloor A_1 \ldots A_n \rfloor \ulcorner \equiv(=(AB)\ \equiv(A(A_1 \ldots A_n)\, B(A_1 \ldots A_n))) \urcorner$.
We will assume the rule of substitution for =: from $=(AB)$ and $\Psi(\ldots A \ldots)$ we can write $\Psi(\ldots B \ldots)$. A particular case would be the detachment rule: from $=(AB)$ and B, we can write A. A system for Ontology is obtained by adding
A5. $\lfloor Ab \rfloor \ulcorner \equiv(\varepsilon\{Ab\}\ \land(\neg(\lfloor B \rfloor \ulcorner \neg(\varepsilon\{BA\}) \urcorner)\ \land(\lfloor DC \rfloor \ulcorner \supset(\land(\varepsilon\{DA\}\varepsilon\{CA\})\,\varepsilon\{DC\}) \urcorner\ \lfloor D \rfloor \ulcorner \supset(\varepsilon\{DA\}\,\varepsilon\{Db\}) \urcorner))) \urcorner$
Identity between names is introduced by the following definition:
$\lfloor AB \rfloor \ulcorner \equiv(=(AB)\ \land(\varepsilon\{AB\}\,\varepsilon\{BA\})) \urcorner$
Equality between arbitrary functions is defined in the same way as it was in Protothetic. Finally, we introduce the λ-abstractor (which is not in the original Protothetic) with the following axiom:
$=(\lfloor \lambda x \rfloor \ulcorner A \urcorner (B)\ A[x|B])$ (λ-reduction)
where $A[x|B]$ is the formula obtained from A by substituting B for each free occurrence of x in A. The introduction of the λ-abstractor in Protothetic authorizes structural definitions, whereas definitions are always contextual in the original system. We have already introduced negation by the contextual definition
$\lfloor p \rfloor \ulcorner \equiv(\neg(p)\ \equiv(p\bot)) \urcorner$
This definition is contextual because it does not provide a complex expression X of the category S/S which gives the negation of the argument. Given the λ-abstractor we can provide such an expression:
$\lfloor \lambda p \rfloor \ulcorner \equiv(p\bot) \urcorner$
It is clear that both definitions are equivalent. Replacing $\neg$ by $\lfloor \lambda p \rfloor \ulcorner \equiv(p\bot) \urcorner$ in the contextual definition we get (changing the first p for q for clarity's sake)
$\lfloor q \rfloor \ulcorner \equiv(\lfloor \lambda p \rfloor \ulcorner \equiv(p\bot) \urcorner (q)\ \equiv(q\bot)) \urcorner$
and, by λ-reduction,
$\lfloor q \rfloor \ulcorner \equiv(\equiv(q\bot)\ \equiv(q\bot)) \urcorner$
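which is a theorem. In fact, we can prove that:

$\ulcorner \equiv(\lfloor v_1 \dots v_n \rfloor \ulcorner =(C(v_1 \dots v_n)\ A\{v_1 \dots v_n\}) \urcorner\ \ =(\lfloor \lambda v_1 \dots v_n \rfloor \ulcorner C(v_1 \dots v_n) \urcorner\ \lfloor \lambda v_1 \dots v_n \rfloor \ulcorner A\{v_1 \dots v_n\} \urcorner)) \urcorner$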
3. The Functional Completeness of λ-Protothetic

In classical propositional type theory the notion of functional completeness is very simple: does any function in the hierarchy of propositional types have a name, i.e., is there an expression of type theory that has this function as semantic value? The answer is yes. Now, the question becomes whether Protothetic is functionally complete. As we don't have an explicit semantics, we must specify what we mean by functional completeness. For example, we have already introduced ⊥, ⊤, ¬ and ∧. ¬ is a name for negation and ∧ for conjunction because the following are theorems of Protothetic:
$\lfloor p \rfloor \ulcorner \equiv(\neg(p)\ \equiv(p\,\bot)) \urcorner$, $\ulcorner \equiv(\neg(\top)\ \equiv(\top\bot)) \urcorner$ and $\ulcorner \equiv(\neg(\bot)\ \equiv(\bot\bot)) \urcorner$

For conjunction, it is much more difficult to show, but the following are also theorems:

$\ulcorner \equiv(\wedge(\top\top)\top) \urcorner$, $\ulcorner \equiv(\wedge(\top\bot)\bot) \urcorner$, $\ulcorner \equiv(\wedge(\bot\top)\bot) \urcorner$, $\ulcorner \equiv(\wedge(\bot\bot)\bot) \urcorner$

One can check that ∨ and ⊃ have the expected properties. In this line of thought, the question of the functional completeness of Protothetic can be formulated as follows: given the collection of expressions of category $C_1$ already introduced, and the collection of expressions of category $C_2$ already introduced, can we introduce a new constant X of category $C_1/C_2$ such that, for any arbitrarily chosen A and B respectively of category $C_1$ and $C_2$, $=(X(A)B)$ is a theorem? The answer is yes. The recursive formula is quite complicated even if the idea is quite simple (I take it from van Benthem 1992).

We will need the notion of projector. Let X be an expression of category C. There exists a sequence (possibly empty if C is S) of categories $C_1, \dots, C_n$ such that for any sequence of expressions $x_1, \dots, x_n$ respectively of categories $C_1, \dots, C_n$, $X(x_1, \dots, x_n)$ is of category S. $x_1, \dots, x_n$ is called a projector of X and $X(x_1, \dots, x_n)$ is a projection. For simplicity's sake, we will use $X(\vec{x})$ instead of $X(x_1, \dots, x_n)$.
Any functor can be represented by a generalized truth table:

x_1      x_2      ...   x_n      | X(x_1,...,x_n)
A_1^1    A_2^1    ...   A_n^1    | ±
...
A_1^m    A_2^m    ...   A_n^m    | ±

where ± stands for ⊤ or ⊥. There are two cases. First case: the right column contains only ⊥s. In that case, $\lfloor \lambda x_1 \dots x_n \rfloor \ulcorner \bot \urcorner$ is an expression having this generalized truth table. In the second case, the right column contains at least one ⊤. Let $pr(A_i)$ be the set of projectors of $A_i$. We will write $\vec{p} \in pr(A_i)$ for $\vec{p}$ being a projector of $A_i$ and $\bigwedge_{\vec{p} \in X_{A_i}}$ for a conjunction over a set $X_{A_i} \subseteq pr(A_i)$ of projectors of $A_i$. Similarly, we will use general disjunctions. Let $\delta_{A_i}[x_i(\vec{p})]$ stand for $x_i(\vec{p})$ if $\ulcorner \equiv(A_i(\vec{p})\,\top) \urcorner$ is a theorem and stand for $\neg x_i(\vec{p})$ if $\ulcorner \equiv(A_i(\vec{p})\,\bot) \urcorner$ is a theorem. Let T(X) be the subset of the set of projectors of X for which X has a ⊤ in the generalized truth table.

Theorem 3.1.
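$$\lfloor \lambda x_1 \dots \lambda x_n \rfloor \ulcorner \bigvee_{\vec{A}^j \in T(X)} \Big( \bigwedge_{\vec{p} \in pr(A_1^j)} \delta_{A_1^j}[x_1(\vec{p})] \wedge \dots \wedge \bigwedge_{\vec{p} \in pr(A_n^j)} \delta_{A_n^j}[x_n(\vec{p})] \Big) \urcorner$$

is an expression which has the generalized truth table of X. This formula is recursive: starting from ⊤ and ⊥ it provides a structural name for any functor given by a generalized truth table. Without the λ-abstractor, we can only provide a contextual definition

$$\lfloor x_1 \dots x_n \rfloor \ulcorner \equiv\Big( X(x_1 \dots x_n)\ \bigvee_{\vec{A}^j \in T(X)} \Big( \bigwedge_{\vec{p} \in pr(A_1^j)} \delta_{A_1^j}[x_1(\vec{p})] \wedge \dots \wedge \bigwedge_{\vec{p} \in pr(A_n^j)} \delta_{A_n^j}[x_n(\vec{p})] \Big) \Big) \urcorner$$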
which is equivalent but less elegant.

4. The Functional Completeness of λ-Ontology

The question now becomes whether we can do a similar exercise in Ontology without adding new resources. The answer is no. To see why, let's go back to the generalized truth table.
x_1      x_2      ...   x_n      | X(x_1,...,x_n)
A_1^1    A_2^1    ...   A_n^1    | ±
...
A_1^m    A_2^m    ...   A_n^m    | ±
What permits us to use the technique of generalized truth tables is the fact that the behaviour of any $A_i$ rests on the behaviour of its projections, which are of category S, and for that category we have an algebra, namely the Boolean algebra, which needs only ¬, ∧, ⊥ and ⊤, and all these expressions are definable using quantification and ≡. We just do not have a similar tool for the categories built on N. For the moment, I will restrict myself to a hierarchy based on the category N and I will treat mixed categories (e.g. (S/N)/(S/N)) at the end of the paper. Let us introduce two new constants of category N, X (up) and Y (down) and two new functors
V ( A (= (xix2) = (z3A.)) A H = ( * l S 2 ) ) = (*3Y))))l
The "meaning" is clear: ®(xy) reduces to X when x and y are equal, reduces to Y when x and y are not. For all other cases, the function is the value _L. Let us check the first case [XxxXx^xsl
\= (= i®(xiX2)x3) V ( A (= (xix2) = {x3X)) A H = (xix2)) =
(x3Y))))](aaX)
By A reduction = (= (®(aa)x) V (A(= (aa) = (XX)) A (.(= (aa)) = (AY)))) By the usual rules, the completeness and detachment = (<3>(aa)x)
The second functor is a little bit more complex. In order to simplify the formula, we will assume that © is commutative and we will write
\/{ABCD...)
instead of V ( A v ( B v ( C V ( D . . . ) ) ) ) as above.
[XxiXx2Xx3\ [~= (=
(®(xix2)x3) \J((/\(=(x2^)
= ( a * z i M = a:i A).(= n Y ) )
( / \ ( = ( x 2 Y ) = (x3X)^(=
(x1X))l(=
(nY)))
( A ( = ( ^ l A )  (X2Y) = (X3Y)) ( / \ ( = ( X l Y ) = (z 2 Y) = (rr3Y)) (/\(=( 3 : 1 A) = (x 2 X) = ( a; 3A))))l This large formula is justified because it gives the following theorems: }(aX)a)
)(ar)x) )(AY)Y) )(YY)Y) {(XX)X)
Let us consider in detail a very simple example: That of an arbitrary functor X of category N/N. It is entirely described by the following table
a2
X(x) bx b2
an
bn
X
ax
The following formula F is such that = (F(a,i)bi) is a theorem for any i < n. F is \\x\ [[©[... [©[©(6i(®(a;ai))) © (b2(®(xa2)))}...
}(®(bn(®{xan)))}]
Let us check by applying F to
which reduces to (©(6,Y))
which reduces to X.
In the second case, we have {®{bj(®(aiai))) which reduces to (®{biX)) which reduces to bt. So the whole expression reduces to [[©[...[©[•••© [A A ] . . . j ^ ] . . . ] ^ which reduces to bt. This is the elementary case for a functor of category N/N. For the general case of any functor based on the category N, we use the same strategy that was used for the functors based on S. Let us consider the following arbitrary table of category ((... ((N/Xn)/Xni)... )/Ni): X(xi,...,xn)
X2
xn
A\
A\
A1
Oi
A{
A\
A\
Oj
Am 2
AAm
Xi
A?
A
n
an
T h e o r e m 4 . 1 . The following expression corresponds exactly to the table above: m
[xXl...xxn\\ 0 [ K ® [ 0 where
® [ Z { ^ ' } ] stands for ®{Z{~A1}(...
(atec^c?))))]]! (®(Z{ A m " 1 } Z {
1m}))...))
~AieP(X) m
and &[Z{Ai}] i=0
sianrf 5 /or©(Z{A 1 }(...(©(Z{^ m _ 1 }Z{A m }))...)).
5. Conclusion

The two constructions above deal with only one basic category: S for the first one and N for the second one. What would happen in mixed cases? For example, functors of the category (S/(N/S))/((S/N)/(N/S)). It is not clear there is a simple way to merge both approaches. Fortunately, we do not need to do so because the second approach is universal and, moreover, can be generalized to any language having a finite number of basic categories. Let us suppose we have a finite number of categories $N_1, N_2, \dots, N_n$. (We can think of $N_1$ as S and $N_2$ as N.) We then generalize the definitions of ⊕ and ⊗ as being category free (or having $\oplus_{N_i}$ and $\otimes_{N_i}$ for each category $N_i$) and the second formula above can be used to provide a name for any functor. Ironically, this brings us back to the philosophical starting point: nominalism. In particular, the category S is the category of the names of truth values.
References
1. ANDREWS, P. B., "A Reduction of the Axioms for the Theory of Propositional Types", Fundamenta Mathematicae, LII, 1963, 345-350.
2. VAN BENTHEM, J., Language in Action, North-Holland, Amsterdam/MIT Press, Cambridge, 1995.
3. HENKIN, L., "A Theory of Propositional Types", Fundamenta Mathematicae, LII, 1963, 323-344.
4. LEPAGE, P., "Partial Monotonic Protothetic", Partiality and Modality, E. Thijsse, F. Lepage & H. Wansing (eds.), Special Issue of Studia Logica, Vol. 66, No. 1, 2000, 147-163.
5. MONTAGUE, R., "Universal Grammar", in Formal Philosophy, Yale University Press, New Haven, 1974, 222-246.
6. POST, E., "Introduction to a General Theory of Elementary Propositions", in From Frege to Gödel, J. van Heijenoort (ed.), Harvard University Press, Cambridge (Mass.), 1967, 264-283.
7. RICKEY, F., "A Survey of Lesniewski's Logic", Lesniewski's System: Protothetic, J. Srzednicki, Z. Stachniak (eds.), 1998, 23-42.
8. SLUPECKI, J., "St. Lesniewski's Protothetics", Studia Logica, 1, 1953, 44-112.
9. TARSKI, A., "Sur le terme primitif de la logistique", Fundamenta Mathematicae, IV, 1923, 59-74.
Acknowledgement

This work is supported by a Social Science and Humanities Research Council of Canada research grant.
ANALYSIS OF A NEW REDUCTION CALCULUS FOR THE SATISFIABILITY PROBLEM

S. NOUREDDINE
Department of Mathematics and Computer Science, University of Lethbridge, 4401 University Drive, Lethbridge, T1K 3M4, AB, Canada
Email: [email protected]

This paper proposes a new reduction calculus for propositional formulas. We prove that the presented calculus is refutation-complete. Hence, it is theoretically as powerful as resolution. Resolution-based algorithms for propositional tautology/satisfiability testing are in general optimized by reduction rules, but the main message of the paper is that by this mere optimization we are able to solve the satisfiability problem, even if this solution is still of exponential worst-case complexity. We outline an algorithm for hybrid reduction/resolution and prove its partial correctness if only reduction is used. The hybrid approach promises more efficiency in practice; however, we will not analyze its complexity in the present paper.
1. Introduction

The ubiquitous tautology problem and its sibling, the satisfiability problem, are still open problems with extreme impact on the theory and applications of computer science and related disciplines. The problem is so needed in practice that papers on approximating it are published unceasingly every year. No one knows whether the problem is efficiently solvable or not. The origin of the problem is in mathematical logic. However, hundreds of other problems stemming from other disciplines and equivalent to it have been discovered over the last decades. All these problems have in common that no efficient solution is known for them and that, if at least one of them is efficiently solvable, then all of them are. Theorists hence have a name for this kind of problem, namely, NP-complete problems. There are many approaches to tackle the tautology/satisfiability problem. The principal ones are:
(i) Manipulation of formulas.
(ii) Methods of optimization theory.
(iii) Methods of graph theory.
(iv) Probabilistic approximation methods.

Manipulation of propositional formulas is the immediate approach for the tautology/satisfiability problem, as these are originally problems of propositional logic. Prominent approaches in this respect are the method of resolution and the Davis-Putnam (DPLL) procedure. Resolution is a calculus and hence a method of (dis)proof [2]. DPLL is a manipulation algorithm that is based on Boolean-algebraic identities [1]. Both are well used in practice. Our approach in this paper is akin to the method of resolution and falls in this category.

Methods of optimization theory convert the problem to a functional optimization one. They approximate the solution by the search for the global minimum of a well-selected function under specific constraints [3]. Of course, they rather see the problem as an optimization problem and not as a decision problem. In [4], we described a method that uses exterior, exact penalty optimization with a coercive objective function. The method is partially heuristic and delivers suboptimal results. In [5], we extended the theory of geometric programming [10] to cope with the satisfiability problem. In both methods we focused on the so-called exact satisfiability problem [6], which is known to be NP-complete [7].

Finally, graph-theoretical methods [8] can be used to solve special cases of the satisfiability problem (e.g., 2-SAT). Graph theory is in this respect extremely useful for analytical purposes. The implementation of these methods, however, is often more efficient by way of formula manipulation methods.

Last but not least, probabilistic algorithms try to circumvent the hardness of the problem by stochastic reasoning [9]. These methods proved to be very successful in practice and they are very promising in theory.

2. Notation and Problem Statement

Let $\Phi$ be the set of all propositional formulas over a set of n logic variables (i.e., atomic formulas) $V = \{x_1, \dots, x_n\}$.
Let B be the set of all valuations b:

$b : V \to \{0, 1\}$

The domain of b is inductively extended to $\Phi$. A formula $F \in \Phi$ in disjunctive normal form (DNF) will be represented as a finite set of terms, where each term t is either the empty set {} or a set of literals over V. The empty term {} is equivalent to the valid terminal true. Without loss of generality, nonempty terms are assumed to be satisfiable (thus excluding terms like {false}, $\{x, \neg x, y\}$). The empty formula {} is
equivalent to the unsatisfiable terminal false. Occasionally, we will write V(F) for the set of variables occurring in F. Throughout the paper, we will use the following symbols to specify formulas in $\Phi$:
• The letter x with/without subscript is used for logic variables.
• The letter t with/without subscript is used for conjunctive terms.
• Capital letters like F, G, and X with/without subscripts are used for arbitrary formulas in DNF.

The problem we shall address is the following: given a formula $F \in \Phi$ in DNF, decide whether or not F is a tautology (i.e., a theorem of propositional logic). More precisely, we want to decide whether or not:

$\forall b \in B : b(F) = 1$

Since $|B| = 2^n$ (or in other words, F admits $2^n$ possible valuations), this problem is known to be in co-NP. It is in a sense the hardest problem in co-NP, since it is also co-NP-complete.

3. Reduction Calculus

The reduction calculus (RC) has a single rule, called the reduction rule, which operates in a purely syntactic manner.

The Reduction Rule. Two disjunctively connected conjunctive terms of the form $\{x\} \cup t$, $\{\neg x\} \cup t$ are reduced to the term t. Formally:

$\{\{x\} \cup t, \{\neg x\} \cup t\} \vdash \{t\}$
Lemma 3.1. The reduction rule preserves the validity and the satisfiability of terms.

Proof. This can be immediately seen since, by Boolean algebra, both sides of the reduction rule are logically equivalent. □

Corollary 3.1. The reduction rule is valid in both directions. We shall, however, use the specified direction only, in order to reduce the original formula.

Definition 3.1. Two terms are said to be reducible to t if they are of the form $\{x\} \cup t$, $\{\neg x\} \cup t$.
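A minimal Python sketch of the reduction rule and of Definition 3.1 follows. The encoding is an assumption made for illustration: a literal is a pair `(variable, polarity)`, a term is a `frozenset` of literals, and the helper names `lit` and `reduct` are hypothetical. Terms are assumed to be satisfiable, as stipulated in Section 2.

```python
from typing import FrozenSet, Optional, Tuple

Literal = Tuple[str, bool]          # (variable name, True for x / False for "not x")
Term = FrozenSet[Literal]


def lit(var: str, positive: bool = True) -> Literal:
    """Build a literal; lit("x", False) stands for the negation of x."""
    return (var, positive)


def reduct(t1: Term, t2: Term) -> Optional[Term]:
    """Return t if t1 and t2 are reducible to t (Definition 3.1), else None."""
    if len(t1) != len(t2):
        return None
    diff = t1 ^ t2                  # literals occurring in exactly one of the two terms
    if len(diff) != 2:
        return None
    (v1, s1), (v2, s2) = diff
    if v1 == v2 and s1 != s2:       # the terms differ only in the sign of one variable
        return t1 & t2              # the common part t
    return None
```

For instance, under this encoding the terms {x, y} and {¬x, y} reduce to {y}, and {x} and {¬x} reduce to the empty term, whereas {x} and {y} are not reducible.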
Lemma 3.2. Let $F \in \Phi$ be in DNF and represented as a set of terms. Let $t_1, t_2 \in F$ be reducible to t. Then:

$F \equiv (F \setminus \{t_1, t_2\}) \cup \{t\}$

Proof. The (semantic) equivalence of the formulas immediately follows from the basic identity:

$(x \wedge t) \vee (\neg x \wedge t) \equiv t$ □
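Definition 3.2. Let $F \in \Phi$ be represented as a set of terms. We define the following term sets:

1. $R(F) = \begin{cases} (F \setminus \{t_1, t_2\}) \cup \{t\} & \text{if } t_1, t_2 \in F \text{ reducible to } t \text{ exist} \\ F & \text{otherwise} \end{cases}$

2. $R^0(F) = F$, $R^{n+1}(F) = R(R^n(F))$ for $n \ge 0$

3. $R^*(F) = \bigcup_{n \ge 0} R^n(F)$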
The set $R^*(F)$ defined in equation (3) is called the closure set of F w.r.t. reduction.

Lemma 3.3. Let $F \in \Phi$ be represented as a set of terms. Then:

$|R^{n+1}(F)| \le |R^n(F)| \quad \forall n \ge 0$

Proof. We know: $R^{n+1}(F) = R(R^n(F))$. Hence, by definition, if $t_1, t_2 \in R^n(F)$ reducible to t exist, then $|R^{n+1}(F)| = |R^n(F)| - 1$; otherwise $|R^{n+1}(F)| = |R^n(F)|$. □
Theorem 3.1 (Soundness of Reduction Calculus). If $\{\} \in R^*(F)$ then F is valid.

Proof. Since $\{\} \in R^*(F)$, there exists an $n \ge 0$ with $\{\} \in R^n(F)$. Thus, $R^n(F)$ represents a valid formula. By Lemma 3.2 we know:

$F \equiv R^n(F)$

Thus, F is valid, too. □
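As a small worked illustration of the closure construction and of soundness, consider the formula $F = \{\{x, y\}, \{x, \neg y\}, \{\neg x\}\}$. This formula is an example chosen here for illustration and is not taken from the paper. At each step exactly one reducible pair exists, so the sequence $R^n(F)$ is determined:

```latex
\begin{align*}
R^0(F) &= \{\{x, y\},\ \{x, \neg y\},\ \{\neg x\}\} \\
R^1(F) &= \{\{x\},\ \{\neg x\}\}
  && \text{reduce } \{y\} \cup \{x\},\ \{\neg y\} \cup \{x\} \text{ to } \{x\} \\
R^2(F) &= \{\{\}\}
  && \text{reduce } \{x\} \cup \{\},\ \{\neg x\} \cup \{\} \text{ to } \{\}
\end{align*}
```

Hence $\{\} \in R^2(F) \subseteq R^*(F)$, and F is valid; this agrees with the direct computation $(x \wedge y) \vee (x \wedge \neg y) \vee \neg x \equiv x \vee \neg x$.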
The next technical lemma is obvious and is needed in the proof of Theorem 3.2.

Lemma 3.4. If F consists of only positive or only negative literals, then F is not valid.
Theorem 3.2 (Completeness of Reduction Calculus). If F is valid then $\{\} \in R^*(F)$.

Proof. We proceed by induction over $|V(F)| = n$, the number of variables in F.

Base, n = 1. Since F is valid, $F = \{\{x\}, \{\neg x\}\}$. Applying the reduction rule yields {}.

Induction hypothesis. If F is valid and $|V(F)| < n$ then $\{\} \in R^*(F)$.

Induction step. Let F be valid with $|V(F)| = n$. Choose a variable x of V(F) such that $\neg x$ occurs in F. Since F is valid, such a variable exists by Lemma 3.4. Let $F_0$ and $F_1$ be the formulas that result from fixing b(x) = 0 and b(x) = 1, respectively. Thus, by construction, F can be put in the following form without impairing its validity:

$F = x * F_1 \cup \neg x * F_0$    (1)

Here, the multiplication symbol (*) of a literal l by a term set S means adding l to each term of S. We assert that both $F_1$ and $F_0$ are valid. Suppose $F_0$ is not valid. Thus, there would exist a b such that $b(F_0) = 0$. Consider the valuation b' over V(F) defined by:

1. $b'(y) = b(y)$ if $y \in V(F_0)$
2. $b'(x) = 0$
3. $b'(y)$ arbitrary otherwise.

Obviously, by (1), $b'(F) = 0$, contradicting the validity of F. Thus, $F_0$ is valid. By a dual argument we can infer that $F_1$ is valid, too. Since both $F_0$ and $F_1$ are valid and additionally $|V(F_0)| < n$ and $|V(F_1)| < n$, we can apply the induction hypothesis on both formulas. This yields:

$\{\} \in R^*(F_0)$    (2)

$\{\} \in R^*(F_1)$    (3)

Let $n_0$ and $n_1$ be the smallest integers such that $\{\} \in R^{n_0}(F_0)$ and $\{\} \in R^{n_1}(F_1)$, respectively. Build a reduction of F to {} by the following procedure:

1. Add in each reduction step taken in building $R^{n_0}(F_0)$ the literal $\neg x$ in each term.
2. Add in each reduction step taken in building $R^{n_1}(F_1)$ the literal x in each term.
3. By (2) and (3):
   a. After $n_0$ steps the term $\{\neg x\}$ is generated.
   b. After $n_0 + n_1$ steps the term $\{x\}$ is generated.
4. Generate the empty term {} in an extra step by applying the reduction rule on the subset $\{\{x\}, \{\neg x\}\}$.

Thus, $\{\} \in R^{n_0 + n_1 + 1}(F)$, which ends the proof. □
With Theorem 3.1 and Theorem 3.2, we have the following main theorem.

Theorem 3.3 (Fundamental Theorem of Reduction Calculus). F is valid if and only if $\{\} \in R^*(F)$.
4. A Reduction Algorithm for Tautology Test

We now develop and analyze an algorithm for tautology (and thus for satisfiability) test based on the fundamental theorem of reduction calculus. The following is the reduction algorithm for tautology test.

Reduction Algorithm

Reduction (Formula F) {
    Input: A DNF formula F.
    Output: {} if F is a tautology.
    Convert F to a term set;
    Eliminate unsatisfiable terms in F;
    repeat
        G := F;
        F := R(F);
    until F = G    // or ({} ∈ F)
    if {} ∈ F then return {};
    else return F;
}

Theorem 4.1 (Partial Correctness of the Reduction Algorithm). If the output of the reduction algorithm is {} then F is a tautology.

Proof. Observe first that the algorithm does not mimic the proof of the completeness theorem (Theorem 3.2). By Lemma 3.2, we have:

$R^i(F) \equiv F$    (4)

Hence:

$\{\} \in R^i(F) \implies R^i(F)$ is a tautology (by the soundness theorem) $\implies F$ is a tautology (by (4)). □
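The reduction algorithm can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the author's implementation: literals are encoded as `(variable, polarity)` pairs and terms as `frozenset`s, and the names `reduct`, `R`, `reduction` and `is_tautology` are hypothetical. The truth-table check `is_tautology` is not part of the algorithm; it is included only as an exponential-time reference against which the output can be compared.

```python
from itertools import combinations, product


def reduct(t1, t2):
    """Return t if t1, t2 have the form {x} | t and {not x} | t, else None."""
    if len(t1) != len(t2):
        return None
    diff = t1 ^ t2
    if len(diff) != 2:
        return None
    (v1, s1), (v2, s2) = diff
    return (t1 & t2) if (v1 == v2 and s1 != s2) else None


def R(formula):
    """One step of Definition 3.2: replace one reducible pair by its reduct."""
    ordered = sorted(formula, key=sorted)      # fixed order, for reproducibility only
    for t1, t2 in combinations(ordered, 2):
        t = reduct(t1, t2)
        if t is not None:
            return (formula - {t1, t2}) | {t}
    return formula


def reduction(formula):
    """Apply R until a fixed point is reached or the empty term {} appears."""
    # Eliminate unsatisfiable terms (those containing both x and not x).
    F = frozenset(t for t in formula if not any((v, not s) in t for v, s in t))
    while frozenset() not in F:
        G = F
        F = R(F)
        if F == G:                             # no reducible pair is left
            break
    return F


def is_tautology(formula):
    """Reference check over all 2^n valuations (exponential in n)."""
    variables = sorted({v for term in formula for v, _ in term})
    for values in product([False, True], repeat=len(variables)):
        b = dict(zip(variables, values))
        if not any(all(b[v] == s for v, s in term) for term in formula):
            return False
    return True


x, nx = ("x", True), ("x", False)
y, ny = ("y", True), ("y", False)
F1 = frozenset({frozenset({x, y}), frozenset({x, ny}), frozenset({nx})})
F2 = frozenset({frozenset({x}), frozenset({nx, y}), frozenset({ny})})
```

In this sketch, `reduction(F1)` contains the empty term, so by Theorem 4.1 `F1` is a tautology. For `F2` no pair of terms is reducible, so the sketch returns `F2` unchanged although the truth-table check succeeds on it; a non-empty result is therefore inconclusive, which is why Theorem 4.1 claims only partial correctness.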
Observe that Theorem 4.1 could be made stronger if the algorithm is slightly modified so as to really mimic the proof of the completeness theorem. Though the proof of the completeness theorem is constructive per se, it yields exponential worst-case complexity in practice. It is for this reason that we tried to avoid it in the previous algorithm, though sacrificing strong correctness.

Theorem 4.2. The reduction algorithm is of polynomial-time complexity.

Proof. The algorithm just applies the reduction rule successively until no more reductions are possible. It is more efficient to quit the loop as soon as the empty term {} is generated. The concrete complexity bound is not important in our context, but it depends polynomially on the number of variables and the size of the terms in the formula F. □

The algorithm we presented is akin to the well-known resolution algorithm. In resolution, however, the term set is expanded and not reduced as in our algorithm. Resolution algorithms have exponential complexity if implemented to achieve total correctness. The immediate question that arises now is what to do when our algorithm delivers a nonempty set. It is clear that in this case the original formula F may still be a tautology. It is here where a combination of our algorithm with resolution is convenient. The idea is depicted in Figure 1. We propose to combine the two algorithms in series. First reduction is used. If reduction detects that F is a tautology, the algorithm is exited. If not, resolution takes over and expands the formula. This expansion should be restricted in size, however, to prevent exponential growth of the term set. If resolution can detect the tautology property, the algorithm is exited. If not, reduction is visited again to lower the size of the term set. This loop is repeated forever. Clearly, resolution can detect nonsatisfiability, too, in which case the algorithm is terminated. The above described reduction algorithm cannot handle this case.

Figure 1. Interplay between Reduction and Resolution.

Though reduction seems to be inferior to resolution in practice, in theory it is not. In fact, we proved in the last section that reduction calculus is refutation-complete. It is well known that resolution is refutation-complete, too. Thus, theoretically the two methods are on par in some sense. This is a very interesting theoretical insight. Some kind of formula simplification (akin to reduction) is often used in algorithms for resolution, as in the algorithm we proposed above. The major contribution of the paper is rather to show that such an apparently helper reduction procedure is theoretically not inferior to the main resolution procedure.

5. Conclusion

The paper presented a new reduction calculus for propositional formulas. The calculus is proved to be refutation-complete. We also proposed a partially correct algorithm for tautology test based on reduction calculus. The algorithm is of polynomial complexity. We tried to avoid exponential complexity even at the price of weaker correctness results. We also discussed the interplay between reduction and resolution. There is no doubt that both reduction calculus and resolution calculus serve as an algorithmic approach for checking the satisfiability of logic formulas. The unexpected result that simple reduction is sound as well as
complete with respect to refutation is exciting. This implies that resolution is not stronger than simple reduction from a theoretical point of view. On the other hand, the mechanics of resolution is more powerful in general, as reduction calculus has a simple and single rule of inference. Algorithmically, it is recommended to combine both approaches as outlined above. The combination of reduction and resolution is a straightforward approach. Apparently the process of repeated reduction(s) after each expansion step(s) is beneficial. However, a rigorous complexity analysis of this approach is needed, and this is one of our current research objectives. The analysis is expected to be complicated, as the two algorithms may influence the term set in opposite manners. Also, the algorithm we presented has the potential to be parallelized. This will be investigated in the future.
References
1. F. Bacchus: Enhancing Davis Putnam with Extended Binary Clause Reasoning, 18th National Conference on Artificial Intelligence, (2002).
2. J. A. Robinson: A Machine-Oriented Logic Based on the Resolution Principle, J. Assoc. Comput. Mach., 12, 23-41, (1965).
3. S. R. Fletcher: Practical Methods of Optimization, 2nd Edition, Wiley, (1987).
4. S. Noureddine: An Approach for the Satisfiability Problem via Exterior Penalty Optimization, Journal of Computer Science, (2005).
5. S. Noureddine: A Geometric Programming Approach for the Satisfiability Problem, Technical Report, Dept. of Mathematics & Computer Science, University of Lethbridge, Lethbridge, Canada, (2005).
6. S. Porschen et al.: Exact 3-satisfiability is Decidable in Time $O(2^{0.16254n})$, Annals of Mathematics and Artificial Intelligence, (2002).
7. T. J. Schaefer: The Complexity of Satisfiability Problems, Proceedings of the 10th Annual ACM Symposium on Theory of Computing, San Diego, California, U.S.A., 216-226, (1978).
8. B. Aspvall et al.: A Linear-time Algorithm for Testing the Truth of Certain Quantified Boolean Formulas, Information Processing Letters, 8(3): 121-123, (1979).
9. U. Schoening: A Probabilistic Algorithm for k-SAT and Constraint Satisfaction Problems, Proceedings, 40th Annu. IEEE Symposium on Foundations of Computer Science, FOCS'99, 410-414, (1999).
10. R. J. Duffin et al.: Geometric Programming: Theory and Applications, Wiley, New York, (1967).
ELEMENTARY TYPE SEMIGROUP FOR BOOLEAN ALGEBRAS WITH DISTINGUISHED IDEALS*

DMITRY PAL'CHUNOV

1. Introduction
We study model theoretical properties of Boolean algebras with distinguished ideals, further called as /algebras. We consider /algebras in the signatureCT;=± (U, H,  , I\,...,/;), where / i , . . . , /; are unary predicate symbols for the ideals. The number I of distinguished ideals is fixed. Different model theoretical and algorithmic properties of /algebras were studied in [112]. Some of results of this paper were announced in [8]. Definition 1. Denote E ;= {21  21 is an Ialgebra}/= = {[21]= 121 is an Ialgebra}, where [91] = = {03  35 is an Ialgebra and 03 = 21}. Denote E F± (E;x,e), where [21] = x [35]= ^ [21 x 03]= and e ^± [({0},
Theorem 1. Each of properties of Ialgebras to be nonvanishing, local, finitely axiomatizable and u>categorical may be represented in the semigroup E by a first order formula, i.e., there are first order formulas N(x),L(x),F(x) and C(x) of the signature {x} such that 1) Ialgebra 21 is nonvanishing if and only ifE \= iV([2l]=); 2) Ialgebra 21 is local if and only ifE \= L([2l] = ); 3) Ialgebra 21 is finitely axiomatizable if and only if E (= F([2l] = ); 4) Ialgebra 21 is uicategorical if and only ifE\= C([2l] = ). 2. Nonvanishing Ialgebras For an Ialgebra 21 and a G 21 we denote a ^ {b G 21  b < a}, for b G 21 denote b ^±br\a. Then (a; U, n , _ a , I\ n a,..., 7/ D a) is an Ialgebra, it is denoted as (a). Remark 2. If 21 is an 7algebra, then 21 = (a) x (a). We write 21 = (a) x (a). Definition 2. [7] An 7algebra 21 is called nonvanishing if for any Ialgebras 05 and £, the statement 21 = 05 x € implies 21 = 03 or 21 = £. The meaning of this concept is the following: for any direct decomposition of a nonvanishing Ialgebra its elementary theory does not vanish, and remains at least in one direct summund. Remark 3. An Ialgebra 21 is nonvanishing if and only if for any a £ 21, we have 21 = (a) or 21 = (a). In [7] it was formulated the following Question. Whether it is true, that if Ialgebra 21 is nonvanishing and 03 = 21, then 05 is nonvanishing also. To answer this question we prove the following Proposition 2. An Ialgebra 21 is nonvanishing if and only if for any Ialgebras 05 and €, 21 = 05 x € implies 21 = 03 or 21 = C. Proof. (=*) Let an Ialgebra 21 be nonvanishing and 21 = 03 x £. Assume that 21 ^ 03 and 21 ^ C Then there exist sentences tp and ip, such that *B\=
3y(y < x &
, so 21 f= y> or 21 = V But, by the hypothesis 21 ^= <£ and 21 ^= •0 — a contradiction. Hence, 21 = 03 x £ implies 21 = 03 or 21 = £. («=) Let 21 = 93 x £ imply 21 = 03 or 21 = £. If 21 = 03 x £, then 21 = 03 x £, therefore 21 = 03 or 21 = £. Corollary 1. 7/21 = 03 and 21 is nonvanishing, then 03 is nonvanishing also. Proof. In fact, let 03 = 971 x <JX and 21 = 03. Hence 21 = 971x91, so 2l = 9rt or 21 = 9T, therefore 03 = 9rt or 03 = 91 Thus, 03 is nonvanishing. Corollary 2. An Ialgebra 21 is nonvanishing if and only if E \= iV([2l]=), where N(x) ^ \/y\/z(x = y x z —> {x = y\/ x = z)). 3. Ordered semigroup Definition 3. For /algebras 21 and 03 we denote 21 < 93 if for some /algebra £ we have 03 = 21 x £, denote 21 < 03 if 21 < 03 and 21 ^ 03. For elements a,b £ E, we denote a
[*h < [»] = • Proof. 21 < 03 <£> 03 = 21 x £ for some /algebra £ <^> [03]= = [21] = x [£] = for some /algebra £ «=> [03] = [2t]= x c for some c £ [ # [21] = < [03] = . Corollary 3. Suppose that for Ialgebras 21, 03,971 and 9t we have 21 = 971 and 03 = m. Then 21 < 03 if and only i/ 9rt < 9T. Proposition 3. The relation < is a partial order in the semigroup E. Proof. Reflexive and transitive properties are obvious. Antisymmetry property follows from [11, Lemma 3]
Proposition 4. If a,b,c,d £ E, a < b and c < d, then a x c < b x d. It means that (E; <) is an ordered semigroup. Proof. Let a = [21] = , 6[Q3]= , c = [£]= and d = [M]= . Then 21 < 03 and € < 971, therefore there exist 91 and £ such that 03 = 21 x 21 x 23 = Vn(a), where we identify (a,0) with a. Definition 5. For n G IN let 2tn be a countable /algebra such that 2t„ \= Vn(l). For m £ IM we denote Vn < Vm if 2l„ < 2l m , and Vn < Vm if
By the construction, the sequence of formulas Vn, n £ IN, possesses the following properties (see [3]): Proposition 6. a) For any m,n, ifVm < Vn, then m jn. b) The number of minimal (up to the order <) formulas Vn is finite. c) For any n, the set {m \ Vm < Vn} is finite. d) For any n, the set {m \ ifVk< Vm, then k < n} is finite. Definition 6. A basic /algebra is called continuous if 21 = 21 x 21, and pointwise if 21 ^ 21 x 21. Remark 5. If an /algebra 21 is continuous (pointwise) and 03 = 21, then 03 is continuous (pointwise) also.
Proposition 7. [3, 7] 1) If an Ialgebra 21 is continuous, then there exists an element a G 21 such that 21 = (a) and 21 = (a). 2) 7/ an Ialgebra 21 is pointwise and 21 = 93 x £ (a G 21 j , £/ien either 21 ^ 93 or 21 ^ £ (either 21 ^ (a) or 21 ^ (a) respectively). Remark 6. A basic /algebra 21 is continuous if and only if [a]= is an idempotent of the semigroup E: [a]= x [a]= = [a]=. For an /algebra 21 we denote M(2t) ^ {m G N  21 = a z V ^ x ) } = {m G N  2l m < 21}, iV(2l) ^{me
M(2t)  VnG M(2l), Vm it Vn} = {mG N  2l m is a maximal
element in the set {2t„  n G M(2l)} w.r.t. the order relation < } . Definition 7. An /algebra 21 is called local if the set M(2l) is finite, i.e., the number of different, up to the elementary equivalence, basic direct summands of 21 is finite. Proposition 8. [3, 7] 1) Every Ialgebra 21 has a basic direct summand, i.e., M(2l) ^ 0. / / 21 is local, then JV(2l) ^ 0. 2) Let an Ialgebra 21 be local and N(2l) = {n\,..., rifc}. T/ien a) M(2l) = {m  Vm < Vni for some i
any
/algebra
21 we
define
a
characteristic
I 1, if 2l„ < a and 2ln x 2l„ = 2l n ; [ oo, if 2ln x 2ln ^ 2l„ and (2ln)fc < 21 for any k e IN. For a e 21 we denote r a ^ r( a ). Remark 7. An /algebra 21 is local if and only if the function ra has a finite support, i.e., there exists n e IN such that r<&(k) = 0 for any k > n. Theorem 3 . [3] For local Ialgebras 21 and 23 we have 21 = 23 if and only if r%=r<sProposition 9. [3] . _^ fmax{r2i(n),r!8(n)}, if 2l„ x 2l n = 2l„; raxaW  \ra(„) +r!8(n), z/2i„ x 2t„ ^ 2i„. {Here oo + £ = oo). Corollary 4. 7/21 < 25, i/ien r a < rtg. Proof. Let 21 < 23, then 03 = 21 x £ for some /algebra £. Therefore, r<8 = r a x c , hence r a < T  S . Definition 9. A set K C IN is called natural if for any m, n £ N, V™ < V^ and n £ K imply m £ K. Proposition 10. [3] A set K C IN is natural if and only if K = M(2l) /or some Ialgebra 21. Definition 10. An infinite natural set K C IN is called stepwise if for any n € IN there exists fc e IN such that if m, j £ K, m < n and j > k then Vm
Definition 11. An /algebra 21 is called strictly stepwise if it is stepwise and for any k < I, I s1 G If if and only if l21" G lfn for any n G M(21) (i.e., a sentence ( V ^ l ) —> (1 G Ik))) is true on the class of /algebras. Thus, for any strictly stepwise /algebra 21 truth values of the sentences Z/c(l) are determined syntactically given the set M(2l). Prom Theorem 1.2 [10] and the definition of stepwise /algebras it follows Theorem 4. For strictly stepwise Ialgebras 21 and 03 we have 21 = 23 if and only if M(2l) = M(03). Theorem 5. For any nonlocal Ialgebra 21 there exists a strictly stepwise Ialgebra 03 < 21. Proof. First we prove Proposition 12. / / an infinite set M is natural, then there exists L C M which is minimal among infinite natural sets. Proof. Let a set M be infinite and natural. We denote Mo ^ {n G M  Vn is minimal w.r.t. < } , Mj+i ^ {n G M  if Vm < Vn , then m G M*}. It is obvious that Mo C M\ C . . . . Proposition 6 implies that for any k G N the set M& is finite, because Mo is finite, and if M$ is finite, then Mi+i is finite also. Moreover, Proposition 6 implies that M = UneN ^«We choose a set Lo Q Mo which is minimal w.r.t. inclusion and such that the set {n G M  if m G M 0 and Vm < Vn, than m G L0} is infinite. Suppose that a set Lj C Mj is constructed and the set {n G M  if m G Mj and Vm < Vn , than m G //,} is infinite. Choose a set L^+i C Mj+i which is minimal w.r.t. inclusion and such that Lj+iOMj = Lj, the set {n G M  if m G M i + 1 and y m < Ki, then m G Lj + i} is infinite, and if n G Z/;+i and Vm
The lemma is proved. Lemma 2. L is minimal among infinite natural sets. Proof. Suppose that L is not minimal among infinite natural sets. Hence, there is an infinite natural set K C L with K ^ L. Therefore, K C M . Denote Ki ^ Kn Li. Then K \JiehS Ki, and for any i, Ki C Li. Hence, there exists i £ IM with Ki ^ Li. Let j be the least number such that 1) j = 0. Let n £ K, then n s M . Assume that m G Mo and Vm < Vn. Then n € L and m £ L, since the set L is natural. However, L n Mo = Lo, hence, m £ Lo, therefore m £ Ko = K D Lo. Thus, if C {n 6 ¥ if m G M 0 and Vm < Vn, then m e if 0 } and the set K is infinite. We arrive at a contradiction with LQ being minimal w.r.t. inclusion, because KQ C L 0 a n d KQ ^ LQ. 2) j > 0. Then j = i + 1 for some i. Hence Km = Lm for any m < i. Therefore Ki+i C L i + 1 C M i + i , Li = Ki C Ki+\ and Li C Mi, so L» C i f i + i n Mj and .ffi+1 n Mj C L i + 1 n Mj = Li. Hence if i+1 n M; = Lt. Let n G i^i+i and Vm < Vn, then n £ L»+i, s o m e L*. Further, assume that n £ K, then n £ L and n G M. Let m G M»+i and Vm < Vn. Then TO G Lj+i, and m £ K, because K is natural. Hence m £ K n L i + 1 = i f i + i . Thus if C {n G M  if m G M i + i and Vm < KJ, then m G ifi+i} and these sets are infinite. Besides, Ki+\ C L, + i and ifi+i ^ i»+i. We arrive at a contradiction, because Ki+\ is minimal w.r.t. inclusion. Hence, an infinite natural set K C L, with K ^ L, does not exist. Therefore L is minimal. Lemma 2, as well as Proposition 12, is proved. Proposition 13. If a set L is stepwise and L C M(2l), then there exist a strictly stepwise Ialgebra 93 < 21 with M(23) = L. Proof. Let L C M(2l) and L be stepwise. Consider a set J ^ {k < I  l a " G i f " foranyn
£ L}.
Consider a set of formulas
S(x)^{3y(y<xkVn(y))\n£L} U {3y(y < x & V^j/))  n £ L} U {/fc(a;)  k £ J}. Denote T ^ T/i2t. We show that the set of formulas T U S(x) is consistent. First we show that it is locally consistent. Let V C L and L' be
finite. We prove that Tu{3y(y<x&Vn(y))\n£L'} U {i3y{y <xk
Vn{y)) \ n <£ L) U {Ik(x)
\ k € J}
is consistent. Since L C M(2t), we have L' C M(2l), therefore, for any n S L ' , 21 = 3yVr„(y). For any n G 1/ we find a n G 21 such that 21 (= Vn(an), put a ^ IJnGL' a « Then 21 = 3y(y < a & V^(y)) for any n G L'. Moreover, if k £ J, then for any n G L a formula (Vn(x) —> Ik(%)) is true on the class of /algebras. Therefore, a n G /& for any n G L'. Hence Ik(a) for any k G J. Suppose that n £ L and there exists c < a such that 21 = V^(c). Let 1/ = { m i , . . . , m f c } . Denote &! ^ ami,b2 ^ a m 2 \ a m i , . . . ,6fc ^ a m f c \(a m i U . . . U a mib _ 1 ). Then a = &i U . . . U 6fc and fcj l~l bj = 0 for i ^ j . We denote Then (c) S ( Cl ) x . . . x (cfc) and (c) \=Vn(l). Cl^cnbi,...,Ck^cr\bk. /algebra (c) is nonvanishing, so (c;) f= K J ( 1 ) for some i < k. Moreover Ci = cC\bt — cD (ami\(ami U . . . U a m i _ J ) < am.. Hence (a m i ) = ^ ( 1 ) , ( Q ) = V„(l) and (CJ) < ( a m J . Therefore, Vn < Vm.. Furthermore, m* G 1/ C L and L is a natural set. So, n G L — a contradiction, because n g L by our assumption. Hence 21 = 3x((x < a) & K,(a;)) for any n £ L. Thus we have proved that %\=TU{3y(y
Proof. It is obvious that M(05)uM(C) C M(2t). Let n G M(2l) and a G 21. Then there exists c G 21 with 21 = Kx(c). Hence, as it was shown in the proof of Proposition 13, either 21 \= Vn(c n a) or 211= V„(c n a). Therefore, either n G M(a), or n G Af(a). Corollory is proved. Theorem 6. A strictly stepwise Ialgebra is nonvanishing. Proof. Let an /algebra 21 be strictly stepwise and 21 = 03 x <£. Then M(21) = M(03) U M(C). Therefore at least one of the sets Af (05) or M(€) is infinite, suppose that M(03) is infinite. Hence the infinite natural set M(05) C M(2l) and M(2l) is stepwise, so M(03) = M(2l). It is easy to prove that the /algebra 05 is strictly stepwise, hence 21 = 03. The theorem is proved. Corollary 7. If an Ialgebra 21 is strictly stepwise and 03 < 21, then either 03 is local, or 21 = 03. Corollary 8. In the set {a G E  a is not local }, under each element there is a minimal one. The minimal elements of this set are exactly the equivalence classes of strictly stepwise Ialgebras. Corollary 9. If an Ialgebra 21 is basic and 21 < 03 x €, then 21 < 05 or 21 <€. Proof. Let /algebra 21 be basic, 21 = 2ln for some n G IN, and 21 < 05 x £. Then 2l„ < 03 x € and 03 x £ (= 3xVn(x). Let 9Jt = 03 x £, b G SDT, 03 = (b) and € = (b). Then there exists a G 9JT such that 371 = Vn(a). Hence, as shown above, 971 = Vn(aDb) or 271 = Vn(arib). Therefore (anb) \= Vn(l) and (anb) = 2l„, or (anb) \= Vn(l) and (anb) = 2l„. However, (aHfe) < (b) = 05 and (a n 6) < (6) = £. Thus 21 < 03 or 21 < £. The corollary is proved. Corollary 10. / / an Ialgebra 21 is basic and 03 < 21, then 21 x 03 = 21. Proof. Let 21 be basic and 05 < 21. Then 03 < 21, hence 21 = 03 x £. Moreover, 21 ^ 03 and 21 is nonvanishing, therefore 21 = C, i.e., 21 = 03 x 21 = 21 x 05. The corollary is proved. We denote L'(x) ^ (V6 < x)(3c < 6)(Vd < b)(((c < d) & N(d)) *c = d). 
By $L(x)$ we denote a formula of first-order predicate logic of the signature $\{\times\}$ which is equivalent to $L'(x)$ on $E$ (such a formula is easily constructed). Theorem 7. An I-algebra $\mathfrak{A}$ is local if and only if $E \models L([\mathfrak{A}]_{\equiv})$.
Proof. (=>) Let an /algebra 21 be local, b < [21] = and [05]= = b. Then 05 < a . Let iV(03) = { n i , . . . ,n/c}. By virtue of Proposition 8(2) there exists a decomposition 05 = 05i x . . . x 05^ with iV"(05i) = {n^}.Consider the following cases. I. k > 1 and for some i < k, there exists m G N such that 05 j = ( 2 l n J m . Put c ^ [2ln<] = . Then AT(2lnJ = { n j , therefore c ^ 6 and c < 6, i.e., c < b. Let d < b, c < d, E \= N(d) and d = [£]= for some £. Then £ < 03, 2lrai < £ and £ is nonvanishing. Since N(£) C {n\,..., n^}, then N{£) = {rii}. Hence £ < (2l n i ) m , and £ = 2l„ i; because £ is nonvanishing. Thus c = d. II. k > 1 and for any i < k and m G IN, ( 2 l n J m < 05,. Put c ^± [*Bi]=. By virtue of Proposition 8(6), 03i is nonvanishing. Let d < b, c < d, E (= N(d), and d[£] = for some £. Then 051 < £ and £ < 03. Hence, N(£) = { n j and for any m S IN, (2l n i ) m < £. By the definition of the characteristic r, then r £ = r
E )= (3c < 6)(Vd < &)(((c < d) & N(d)) > c = d). Let c < 6 and c = [€] = . Then £ < 53, i.e., £ < 03 and £ ^ 03. Therefore, by virtue of Corollary 7, £ is local and M(£) is finite. Consider n = maxM(£). M(03) is stepwise, so there exists k £ IN such that if m,j £ M(53), m
n^}.
Theorem 9. An Ialgebra 21 is finitely axiomatizable if and only if Proof. (=>) Let 21 be finitely axiomatizable. Denote a ^ [21] = . We prove that E = F(a), i.e., E = F'(a). 21 is local, therefore, by virtue of Theorem 7, E = L(o). Consider b, d, e £ E. Let b = [03]= , d = [9Jt]= , e = pt] = ,2l = 9Jtx9T, 21 ^ 9t, WtxOJlsOJt,
03 < 971, 03 ^ 93 x 03 and 7algebras 23 and 971 are nonvanishing. Then 2? and 971 are local. Let iV(2t) — { n i , . . . , nk}, m i , . . . , mk G N and 21 = (0„1)mix...(2ln]k)m*. 7algebras 21 and 01 are local, and 21 j£ 91, so r
vanishing by virtue of Proposition 8 (6). Put an ^ 9li x . . . x 9 V i x Oai+i x . . . x mk, d = [9^]= and e = [9K] = . Then N(Dtt) = JV(2l)\{ni}, therefore !OT # 21. Hence a = d x e and a =£ e. Let b = [2lnJ = . Then b < d and E (= N(6), because 2l ni is nonvanishing. 2l„; x 2t„; ^ 2i ni and 91; x 9tj = 91;, so i> ^ 6 x b and d = d x d. Hence E H= (3c)(JV(c)&(6 < c)&(c < d)). Let c € E and c = [£]= . Then £ is nonvanishing, 2l n; < £ and £ < 91; which implies M(£) C M(91;). Moreover, 2l„4 < £ implies ni € M(£). Hence, 7V(£) = {«;}. Since £ is nonvanishing and 2l ni ^ £, Proposition 8 (6) implies (2l n i ) m < £ for any m S M. Then r(r(rij) = oo = r^i^i) Therefore rg = r ^ and £ = 91;, i.e., c = d — a contradiction. Thus E Y= F'(a). The theorem is proved. 6. wcategorical /algebras Remind that an 7algebra 21 is called wcategorical if the elemenary theory T/i(2t) is categorical in the countable cardinality, i.e., for any countable 25 and £ from 21 = 25 and 21 = £ it follows that 03 = £. Denote C'(x) ^ V6((b < x) > F(fo)) and C{x) ^ V6Vc ((i = b x c) * F(6)). Remark 8. E (= Vx(C'(x) <> C(x)). Theorem 10. ( [7], Theorem 9, (1) and (3)) An Ialgebra^i is tocategorical if and only if for any decomposition 21 = 25 x £, Ialgebra 25 is finitely axiomatizable. Corollary 11. Ialgebra 21 is u>categorical if and only i/E = C([2l] = ). Corollary 2, Theorem 7, Theorem 9 and Corollary 11 imply Theorem 1. 7. Nonaxiomatizability of local, finitely axiomatizable and wcategorical Jalgebras Theorem 11. Classes of local, finitely axiomatizable and u>categorical Ialgebras are not axiomatizable. Proof. Let K be a class local /algebras (a class finitely axiomatizable or a class of wcategorical /algebras). Suppose that K is axiomatizable, then K = K(T) for some set of sentences I\
From [3] it follows that the set of $\omega$-categorical basic I-algebras $\mathfrak{A}_n$ is infinite. Denote $C \rightleftharpoons \{n \mid \mathfrak{A}_n \text{ is } \omega\text{-categorical}\}$. Notice that $\mathfrak{A}_n$ is local and finitely axiomatizable for any $n \in C$. By the construction of the sequence of formulas $V_n(x)$ [3], for any finite $L \subseteq C$ there exists $n \in C$ such that $\mathfrak{A}_m \le \mathfrak{A}_n$ for any $m \in L$. In this case $\mathfrak{A}_n \models \exists x\, V_m(x)$ for any $m \in L$, and $\mathfrak{A}_n \models \Gamma$. Therefore $\Gamma \cup \{\exists x\, V_n(x) \mid n \in C\}$ is locally consistent, so it is consistent. Hence there exists $\mathfrak{B} \models \Gamma \cup \{\exists x\, V_n(x) \mid n \in C\}$. Then $\mathfrak{B} \models \Gamma$ and $C \subseteq M(\mathfrak{B})$, so $M(\mathfrak{B})$ is infinite. Consequently $\mathfrak{B}$ is not local, and hence it is neither $\omega$-categorical nor finitely axiomatizable, a contradiction. The theorem is proved.

Question. Are the classes of nonlocal, not finitely axiomatizable, and non-$\omega$-categorical I-algebras axiomatizable?

Thus, each of these classes of I-algebras, which are not even axiomatizable in the language of I-algebras, is described by one sentence of the signature $\{\times\}$ in the semigroup of I-algebra elementary types.
References
1. Yu.L. Ershov. Decidability of the theory of distributive lattices with relative complements and the filter theory. Algebra and Logic, vol. 3, N 3, 1964, p. 17-38.
2. A. Macintyre, J.G. Rosenstein. $\aleph_0$-categoricity for rings without nilpotent elements and for Boolean structures. J. Algebra, vol. 43, N 1, 1976, p. 129-154.
3. D.E. Pal'chunov. Countably-categorical Boolean algebras with distinguished ideals. Studia Logica, vol. XLVI, N 2, 1987, p. 121-135.
4. D.E. Pal'chunov. Finitely axiomatizable Boolean algebras with distinguished ideals. Algebra and Logic, vol. 26, N 4, 1987, p. 435-455.
5. Alain Touraille. Theories d'Algebres de Boole Munies d'Ideaux Distingues. I. Theories Elementaires. J. Symb. Log., vol. 52, N 4, 1987, p. 1027-1043.
6. Alain Touraille. Theories d'Algebres de Boole Munies d'Ideaux Distingues, II. J. Symb. Log., vol. 55, N 3, 1990, p. 1192-1212.
7. D.E. Pal'chunov. Direct summands of Boolean algebras with distinguished ideals. Algebra and Logic, vol. 31, N 5, 1992, p. 499-537.
8. D.E. Pal'chunov. Elementary type semigroup for Boolean algebras with distinguished ideals. 3rd International Conference on Algebra, Krasnoyarsk, 1993, p. 253.
9. D.E. Pal'chunov. Prime and countably-categorical Boolean algebras with distinguished ideals. SIBAM, 1994, N 3, p. 83-108.
10. D.E. Pal'chunov. Theories of Boolean algebras with distinguished ideals having no the prime model. SIBAM, 1994, N 4, p. 86-117.
11. D.E. Pal'chunov. The Lindenbaum-Tarski algebra for the class of Boolean algebras with one distinguished ideal. Algebra and Logic, vol. 33, N 2, 1994, p. 179-210.
12. D.E. Pal'chunov. The Lindenbaum-Tarski algebra for Boolean algebras with distinguished ideals. Algebra and Logic, vol. 34, N 1, 1995, p. 88-116.
INTERVAL FUZZY ALGEBRAIC SYSTEMS*
D. E. PAL'CHUNOV, G. E. YAKHYAEVA
Institute of Mathematics, Siberian Branch of Russian Acad. Sci., 630090, Novosibirsk, Russia

*Supported by RFBR grant N 050104003NNIOa (DFG project COMO, GZ: 436 RUS 113/829/01), and by a grant of the Russian Science Agency, project 2006PH19.0/001/269.

1. Introduction
The concept of fuzzy logic was introduced by Lotfi Zadeh [12] as a result of the development of fuzzy set theory. A fuzzy subset $A$ of a crisp set $X$ is determined by a mapping (the so-called membership function) which defines a membership degree of an element $x$ in the set $A$, for each element $x \in X$. Similarly, if $X$ is a set of sentences then a degree of truth for elements of $X$ may be defined: a statement may be "absolutely true", or "absolutely false", or may have an intermediate value belonging to some partially ordered set.

There are two meanings of fuzzy logic [13]. Fuzzy logic in a wide sense is a tool and a methodology for fuzzy management, analysis of imprecise sentences of natural language, and some other applications [5, 7, 8, 14]. Fuzzy logic in a narrow sense [4, 9, 11] is a kind of symbolic logic. It includes investigations in syntax, semantics, axiomatization, completeness and so on. Fuzzy logic in this sense may be considered as one of the fields of many-valued logic.

In the paper we fix some signature $\sigma$ which is a finite set of predicate symbols and constant symbols; that is, $\sigma$ does not contain function symbols. We also fix a set $A$. For the set $A$ and the signature $\sigma$ we denote $\sigma_A \rightleftharpoons \sigma \cup \{c_a \mid a \in A\}$. We assume that $c_a \notin \sigma$ for any $a \in A$. In the present article we consider models $\mathfrak{A} = \langle A, \sigma \rangle$ of the signature $\sigma_A$ with the universe $A$. We suppose that $c_a^{\mathfrak{A}} = a$ for any model $\mathfrak{A} = \langle A, \sigma \rangle$. By $K(A, \sigma) \rightleftharpoons \{\mathfrak{A} \mid \mathfrak{A} = \langle A, \sigma \rangle\}$ we denote the class of all such models.

We deal with sentences of first-order predicate logic of the signature $\sigma_A$ without equality. Let $S(\sigma_A)$ be the set of all sentences of the signature $\sigma_A$ and $S_a(\sigma_A) \rightleftharpoons \{P(c_1, \ldots, c_n) \mid P, c_1, \ldots, c_n \in \sigma_A\}$ be the set of all atomic sentences of the signature $\sigma_A$. Suppose that for a model $\mathfrak{A} \in K(A, \sigma)$ and a mapping $\mu : S_a(\sigma_A) \to \{0, 1\}$ we have $\mu(\varphi) = 1$ if and only if $\mathfrak{A} \models$
) =
Then
r'(^) U T"(^) =
(X'\T'(Y>)) U (X"\T»(
= (X' U
I»)\(r'(p)Ur»(V))=rM. (2) r(
The revised knowledge base consists of all the remaining clauses and $\varphi$. In addition, minimal unsatisfiable formulas have applications in formal verification, model checking, diagnosis, etc. In formal verification, an abstract model must be refined when some property fails. Refinement is achieved by identifying the cause of infeasibility. Although an unsatisfiable formula is itself an explanation of the infeasibility, we are interested in a "minimal" explanation since it excludes irrelevant information. Thus minimal unsatisfiable subformulas provide useful insight into the cause of infeasibility.

In the past decade, many breakthroughs have been made toward a deeper understanding of MU-formulas. In this paper, we report the main results on the complexity of minimal unsatisfiability. In 1988, MU was shown to be $D^P$-complete [30]. $D^P$ is the class of problems which can be described as the difference of two NP-problems. It is strongly conjectured that $D^P$ is different from NP and from coNP. There are several approaches to defining natural subclasses of MU-formulas. For example, the deficiency, i.e. the difference between the number of clauses and the number of variables, can be restricted. It is known that any minimal unsatisfiable formula over $n$ variables consists of at least $n + 1$ clauses [1, 6, 9, 27].

There exist minimal unsatisfiable formulas such that removing some literal from some clause, or adding some literal to some clause, does not destroy minimal unsatisfiability. Consider the following example. Let $F = (\neg a \vee c) \wedge (b \vee a \vee c) \wedge (\neg b \vee a) \wedge \neg c$. It is easy to see that the formula obtained by removing $c$ from the first clause, or by adding $c$ to the third clause, is still minimal unsatisfiable. This motivates us to investigate subclasses of minimal unsatisfiable formulas to which (resp. from which) we cannot add (resp. delete) any occurrence of a literal while preserving minimal unsatisfiability.
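The example above can be checked mechanically. The following is a minimal brute-force sketch, not a method from the surveyed papers: the integer encoding of literals (a = 1, b = 2, c = 3, negation as a minus sign) and the helper names `satisfiable`, `is_mu` and `deficiency` are assumptions made for illustration, and the exhaustive search is only practical for a handful of variables.

```python
from itertools import product

def variables(cnf):
    """Sorted list of the variables occurring in a CNF (literals are nonzero ints)."""
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    """Brute-force SAT test: try every assignment of the occurring variables."""
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def is_mu(cnf):
    """Minimal unsatisfiable: unsatisfiable, and satisfiable after deleting any one clause."""
    if satisfiable(cnf):
        return False
    return all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf)))

def deficiency(cnf):
    """Number of clauses minus number of variables."""
    return len(cnf) - len(variables(cnf))

a, b, c = 1, 2, 3                                   # hypothetical encoding of the variables
F = [{-a, c}, {b, a, c}, {-b, a}, {-c}]             # the example formula
F_removed = [{-a}, {b, a, c}, {-b, a}, {-c}]        # c removed from the first clause
F_added = [{-a, c}, {b, a, c}, {-b, a, c}, {-c}]    # c added to the third clause

print(is_mu(F), is_mu(F_removed), is_mu(F_added), deficiency(F))
```

All three formulas pass the check, and $F$ has three variables and four clauses, in line with the lower bound of $n + 1$ clauses.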
Another class of MU-formulas, which is closely related to unique satisfiability, is the class of MU-formulas which have exactly one satisfying truth assignment after removing any clause. A powerful tool for investigating the structure of minimal unsatisfiable formulas is splitting. We take a variable $x$, set $x = 1$ and $x = 0$, and consider the resulting formulas. For a minimal unsatisfiable formula the resulting formulas again contain minimal unsatisfiable formulas. A disjunctive splitting means that the formula can be divided into two separate minimal unsatisfiable formulas by setting the variable true resp. false. The minimal unsatisfiable formulas with a disjunctive splitting on every variable are also interesting because removing any clause from such a formula results in a uniquely satisfiable formula.

Unfortunately, the above-mentioned classes of minimal unsatisfiable formulas are not closed under splitting, that is, a formula in the class may have splitting formulas not in the class. Therefore, we investigate classes closed under splitting. Since some classes of simple minimal unsatisfiable formulas are polynomial-time solvable, it is interesting to decide unsatisfiability by testing for the existence of simple minimal unsatisfiable subformulas. We also review some results on homomorphisms between minimal unsatisfiable formulas and some generalizations of minimal unsatisfiability.

In this paper, clauses (disjunctions of literals) are considered as sets of literals, while CNF formulas (conjunctions of clauses) are multisets of clauses. The symbol "+" stands for the union operation of multisets.

2. MU-Formulas with Fixed Deficiency
Given a CNF formula $F$, the deficiency, denoted by $d(F)$, is the difference between the number of clauses of $F$ and the number of variables occurring in $F$. For any fixed natural number $k$, we denote by $MU(k)$ the class of all minimal unsatisfiable formulas with deficiency $k$. Please note that the satisfiability problem for formulas with fixed deficiency is still NP-complete. In 1996, H. Kleine Büning posed the question whether, for fixed $k$, $MU(k)$ can be solved in polynomial time.

Lemma 2.1. (G. Davydov, I. Davydova, and H. Kleine Büning [6]) The problem of determining whether a CNF formula belongs to $MU(1)$ can be solved in linear time.

The proof of Lemma 2.1 is based on the following two nice properties of MU-formulas.

Proposition 2.1. [6, 14] $MU(k)$ is closed under (1,*)-resolution. That means, if $F = \{L \vee f, \neg L \vee g_1, \neg L \vee g_2, \ldots, \neg L \vee g_s\} + F' \in MU(k)$ and $L$ and $\neg L$ do not occur in $F'$, then $\{f \vee g_1, f \vee g_2, \ldots, f \vee g_s\} + F' \in MU(k)$.

Proposition 2.2. [6, 14] Any formula $F \in MU(1)$ always contains a literal which occurs in $F$ exactly once.
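A single (1,*)-resolution step as described in Proposition 2.1 can be sketched as follows. This is an illustrative sketch under assumed conventions (literals as nonzero integers, clauses as frozensets, hypothetical helper names); the brute-force `is_mu` check is only there to confirm the closure property on a small instance.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    """Brute-force SAT test over all assignments of the occurring variables."""
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def is_mu(cnf):
    if satisfiable(cnf):
        return False
    return all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf)))

def deficiency(cnf):
    return len(cnf) - len(variables(cnf))

def one_star_resolution(cnf, lit):
    """Resolve on `lit`, which must occur in exactly one clause L v f.

    Every clause (not L) v g_i is replaced by f v g_i; clauses that mention
    neither L nor its complement are kept unchanged.
    """
    with_lit = [c for c in cnf if lit in c]
    if len(with_lit) != 1:
        raise ValueError("the literal must occur in exactly one clause")
    f = with_lit[0] - {lit}
    with_neg = [c for c in cnf if -lit in c]
    rest = [c for c in cnf if lit not in c and -lit not in c]
    return [f | (g - {-lit}) for g in with_neg] + rest

# Example formula from the introduction, with a = 1, b = 2, c = 3 (assumed encoding).
F = [frozenset(c) for c in ({-1, 3}, {2, 1, 3}, {-2, 1}, {-3})]
G = one_star_resolution(F, -1)      # the literal "not a" occurs exactly once in F
print(sorted(sorted(c) for c in G))
print(is_mu(G), deficiency(F), deficiency(G))
```

On this instance the resolvent is $\{b \vee c, \neg b \vee c, \neg c\}$; it is again minimal unsatisfiable and has the same deficiency as $F$.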
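Then Kleine Büning proved in [14] that, by iteratively applying (1,*)-resolution, each formula in $MU(2)$ can be transformed, up to renaming, into the following formula (here each column represents a clause):

$$\begin{pmatrix}
x_1 & \neg x_1 & \neg x_2 & \cdots & \neg x_{n-1} & \neg x_n & \neg x_1 \\
x_2 & x_2 & x_3 & \cdots & x_n & x_1 & \neg x_2 \\
x_3 & & & & & & \neg x_3 \\
\vdots & & & & & & \vdots \\
x_n & & & & & & \neg x_n
\end{pmatrix}$$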
Consequently, $MU(2)$ can be solved in polynomial time. In 1998, Xishun Zhao and Decheng Ding [34, 35] proved the following. Suppose $F \in MU(k)$. If $F$ contains a complete clause $f$, i.e., every variable of $F$ occurs in $f$ (either positively or negatively), then $F$ can be renamed to a formula which has at most $k$ non-Horn clauses, and the renaming can be computed efficiently. Consequently, for formulas with a complete clause, membership in $MU(k)$ can be decided in polynomial time. The question was finally solved completely by H. Fleischner, O. Kullmann, and S. Szeider in 2001.

Theorem 2.1. (H. Fleischner, O. Kullmann, and S. Szeider [10, 27]) For each fixed $k$, $MU(k)$ can be solved in polynomial time.

The following assertions and notations play the key role in their proofs.
(1) Maximum deficiency of $F$: $d^*(F) := \max\{d(F') \mid F' \subseteq F\}$.
(2) If $d^*(F) = 0$ then $F$ is satisfiable.
(3) If $F \in MU$ then $F$ is stable, i.e., $d(F') < d(F)$ for any $F' \subset F$.
(4) For any $F$ we can find in polynomial time a stable subformula $G$ of $F$ such that $d(G) = d^*(F)$ and such that $G$ and $F$ are equisatisfiable.
(5) Suppose $F$ with $d(F) = k$ is stable and satisfiable; then there is a partial truth assignment $v$ defined on $k$ variables such that $v(F)$ has maximum deficiency 0.
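The notions of deficiency, maximum deficiency and stability in items (1) to (3) can be illustrated on small formulas. The sketch below enumerates all subformulas, which is exponential and is not the polynomial-time procedure of item (4); the function names, the integer encoding of literals, and the generator `mu2_normal_form` for the $MU(2)$ normal form are assumptions made for illustration.

```python
from itertools import combinations, product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def deficiency(cnf):
    """d(F): number of clauses minus number of occurring variables."""
    return len(cnf) - len(variables(cnf))

def max_deficiency(cnf):
    """d*(F): maximum of d(F') over all subformulas F' (the empty one has d = 0)."""
    best = 0
    for r in range(1, len(cnf) + 1):
        for sub in combinations(cnf, r):
            best = max(best, deficiency(sub))
    return best

def is_stable(cnf):
    """Every proper subformula has strictly smaller deficiency than the formula."""
    d = deficiency(cnf)
    return all(deficiency(sub) < d
               for r in range(len(cnf))
               for sub in combinations(cnf, r))

def satisfiable(cnf):
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def mu2_normal_form(n):
    """Clauses x1..xn, the cycle (not x_i or x_{i+1}), and (not x1 or ... or not xn)."""
    xs = list(range(1, n + 1))
    cycle = [frozenset({-xs[i], xs[(i + 1) % n]}) for i in range(n)]
    return [frozenset(xs)] + cycle + [frozenset(-x for x in xs)]

F3 = mu2_normal_form(3)
print(deficiency(F3), max_deficiency(F3), is_stable(F3))

G = [frozenset({1, 2}), frozenset({-1, 2})]   # maximum deficiency 0, hence satisfiable
print(max_deficiency(G), satisfiable(G))
```

For $n = 3$ the normal form has five clauses over three variables, so its deficiency is 2, and no proper subformula reaches that value.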
3. Minimal Formulas with Simple Structures
Clearly, the class 2-CNF-MU, i.e. 2-CNF $\cap$ MU, is solvable in quadratic time since the satisfiability of 2-CNF formulas can be decided in linear time. Note that in a 2-CNF-MU formula each literal occurs at most twice. Hence each 2-CNF-MU formula can be reduced by iterative (1,*)-resolution in linear time to a 2-CNF-MU formula in which each literal occurs exactly twice. The linear-time solvability of 2-CNF-MU follows from the following nice structural property of 2-CNF-MU formulas [26]. Every 2-CNF-MU formula in which each literal occurs at least twice has the following form up to renaming:

$$\begin{pmatrix}
x_1 & \neg x_2 & \cdots & \neg x_{n-1} & \neg x_n & \neg x_1 & x_2 & \cdots & x_{n-1} & x_n \\
x_2 & x_3 & \cdots & x_n & x_1 & \neg x_2 & \neg x_3 & \cdots & \neg x_n & \neg x_1
\end{pmatrix}$$

where each column represents a clause. However, 3-CNF-MU is still $D^P$-complete. More generally, we have

Theorem 3.1. (Hans Kleine Büning, Xishun Zhao [26]) For any fixed $k > 4$ and $p > 2$, and for a $k$-CNF formula $F$ in which each literal occurs at least $p$ times, the problem of determining whether $F$ is minimal unsatisfiable is still $D^P$-complete.

Open Question 1. Construct a 3-CNF-MU formula in which each literal occurs at least 5 times.

Open Question 2. For a 3-CNF formula in which each literal occurs at least 4 times, what is the complexity of deciding whether it is in MU?

For any Horn formula $F$, if $F$ is minimal unsatisfiable then $F$ must be in $MU(1)$, and it has at least one positive unit clause. More precisely, $F$ is of the following form.
* *+ * * *+ \*
* * * •• •+/
where each row represents a variable, each column represents a clause, and the entries "*" are wildcards for "−" (negative occurrence) or "0" (no occurrence). Consequently, Horn-MU can be solved in linear time.

4. Maximal MU Formulas
A formula $F$ in MU is called maximal if for any clause $f \in F$ and any literal $L$ which is not in $f$, adding $L$ to $f$ yields a satisfiable formula. In a certain sense, maximal formulas are maximal extensions of MU-formulas. In this section we show the $D^P$-completeness of MAX-MU, the class of all so-called maximal minimal unsatisfiable formulas.
Definition 4.1. For a formula $F \in MU$ and a clause $f \in F$ we say $f$ is maximal in $F$ if for any literal $L$ occurring neither positively nor negatively in $f$ the formula obtained from $F$ by adding $L$ to $f$ is satisfiable.

Clearly, $F \in MU$ is maximal minimal unsatisfiable, i.e. $F \in$ MAX-MU, if and only if every clause in $F$ is maximal. That MAX-MU is in $D^P$ is not hard to see. For the $D^P$-hardness we establish a reduction from the $D^P$-complete problem MU. At first we introduce an auxiliary function which associates to a formula $F$, a clause $f \in F$, and a new variable $z$ a formula $\xi(F, f, z)$, preserving minimal unsatisfiability. Later on the formula $\xi(F, f, z)$ will be used in order to associate in polynomial time to each formula in MU a maximal formula.

Definition 4.2. For a clause $f = L_1 \vee \cdots \vee L_k$, we use $p(f)$ to denote the formula consisting of the following clauses:

$\neg L_1 \vee L_2 \vee L_3 \vee \cdots \vee L_k$,
$\neg L_2 \vee L_3 \vee \cdots \vee L_k$,
$\neg L_3 \vee \cdots \vee L_k$,
$\ldots$,
$\neg L_{k-1} \vee L_k$,
$\neg L_k$.

The formula $p(f) + \{f\}$ is a maximal minimal unsatisfiable formula, that is, $p(f) + \{f\} \in$ MAX-MU. For a formula $F = \{f\} + H$ let $z$ be a new variable. Then we define
$$\xi(F, f, z) = z \vee_{cl} H + \{f\} + \neg z \vee_{cl} p(f),$$
where $L \vee_{cl} \{f_1, \ldots, f_n\}$ denotes $\{L \vee f_1, \ldots, L \vee f_n\}$. [...] $\neg z \vee_{cl} p(f)$ is maximal in $\xi(F, f, z)$. (3) If $F \in MU$, then $g \in H$ is maximal in $F$ if and only if $z \vee g$ is maximal in $\xi(F, f, z)$.

Now run the following procedure.

Procedure MU-MAX
Input: A formula $F$ in CNF
Output: A formula $\delta(F)$ in CNF
begin
  $C$ := the set of clauses in $F$
  while $C$ is nonempty
    for a clause $f$ in $C$; for a new variable $z$
    $F := \xi(F, f, z)$
    $C := z \vee_{cl} (C - \{f\})$
  end while
  $\delta(F) := F$
end

The procedure requires not more than $O((mn)^3)$ steps, where $m$ is the number of clauses in $F$ and $n$ the number of variables of $F$. Now from the above lemma we can see that $F \in MU$ if and only if $\delta(F) \in$ MAX-MU.

Theorem 4.1. (Hans Kleine Büning, Xishun Zhao [20]) MAX-MU is $D^P$-complete.
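The construction of $p(f)$, $\xi(F, f, z)$ and the procedure MU-MAX can be tried out on the small example formula $(\neg a \vee c) \wedge (b \vee a \vee c) \wedge (\neg b \vee a) \wedge \neg c$. The following is a sketch under assumptions: literals are nonzero integers, fresh variables are numbered after the largest existing one, the order in which clauses are picked from $C$ is arbitrary, and the brute-force checks `is_mu` and `is_maximal` are illustrative helpers, not part of the reduction.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def is_mu(cnf):
    if satisfiable(cnf):
        return False
    return all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf)))

def is_maximal(cnf):
    """Adding any literal over a variable absent from a clause yields a satisfiable formula."""
    vs = variables(cnf)
    for i, clause in enumerate(cnf):
        used = {abs(l) for l in clause}
        for v in vs:
            if v in used:
                continue
            for lit in (v, -v):
                widened = cnf[:i] + [clause | {lit}] + cnf[i + 1:]
                if not satisfiable(widened):
                    return False
    return True

def p(f):
    """Clauses (not L_i) v L_{i+1} v ... v L_k for i = 1..k, for some fixed literal order."""
    lits = sorted(f)
    return [frozenset({-lits[i]}) | frozenset(lits[i + 1:]) for i in range(len(lits))]

def xi(cnf, f, z):
    """xi(F, f, z) = (z v H) + {f} + (not z v p(f)), where F = {f} + H."""
    H = list(cnf)
    H.remove(f)
    return [h | {z} for h in H] + [f] + [c | {-z} for c in p(f)]

def mu_max(cnf):
    """Procedure MU-MAX: process every clause once, with a fresh variable per step."""
    F = [frozenset(c) for c in cnf]
    C = list(F)
    z = max(variables(F))
    while C:
        f = C.pop()
        z += 1
        F = xi(F, f, z)
        C = [c | {z} for c in C]
    return F

F = [frozenset(c) for c in ({-1, 3}, {1, 2, 3}, {-2, 1}, {-3})]
D = mu_max(F)
print(is_mu(F), is_maximal(F))   # F is in MU but not maximal
print(is_mu(D), is_maximal(D))   # delta(F) is checked for both properties
```

The input formula is minimal unsatisfiable but not maximal (adding $c$ to the third clause keeps it unsatisfiable), while the output of the sketch passes both checks.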
5. Marginal MU Formulas
An MU-formula $F$ is called marginal if and only if removing an arbitrary occurrence of a literal from $F$ leads to an unsatisfiable formula which is not in MU. The class of all marginal formulas is denoted by MARG-MU. Obviously, the class MARG-MU is in $D^P$. We show the $D^P$-hardness by a reduction from the $D^P$-complete problem MU [30]. We establish a procedure running in polynomial time which generates a formula $\sigma(F)$ from a formula $F$ such that $F \in MU$ if and only if $\sigma(F) \in$ MARG-MU. The procedure is based on an iterative application of the following function $\zeta$. Let $F = \{L \vee f, L \vee g\} + H$ be a formula with at least two occurrences of the literal $L$. For new variables $y$ and $z$ we define
$$\zeta(F, L \vee f, L \vee g, y, z) = \{y \vee f,\; z \vee g,\; \neg y \vee z,\; y \vee \neg z,\; \neg y \vee \neg z \vee L\} + H.$$
The formula expresses the equivalence of $y$ and $z$, the two occurrences of $L$ are replaced by one occurrence, and $\zeta(F, L \vee f, L \vee g, y, z) \models F$. For short we write $\zeta(F)$. For a formula $F \in MU$ and a literal $L$ we say $F$ is marginal w.r.t. the literal $L$ if removing any occurrence of $L$ from $F$ results in an unsatisfiable formula which is not in MU. Clearly, $F$ is marginal if and only if $F$ is marginal w.r.t. all literals.

Lemma 5.1. (Hans Kleine Büning, Xishun Zhao [20])
(1) $F \in MU$ if and only if $\zeta(F) \in MU$.
(2) For $F \in MU$, $\zeta(F)$ is marginal w.r.t. the new literals $y, \neg y, z, \neg z$.
(3) For $F \in MU$, if $F$ is marginal w.r.t. a literal $K$ different from $L$, then $\zeta(F)$ is marginal w.r.t. the literal $K$. That is, $\zeta$ preserves the marginality.
Now we introduce the above-mentioned procedure.

Procedure MU-MARG
Input: A formula $F$ in CNF
Output: A formula $\sigma(F)$ in CNF
begin
  $\mathcal{L}$ := the set of literals occurring at least twice in $F$
  while $\mathcal{L}$ is nonempty
    for some $L \in \mathcal{L}$; for two clauses $L \vee f, L \vee g \in F$; for new variables $y, z$
    $F := \zeta(F, L \vee f, L \vee g, y, z)$
    remove from $\mathcal{L}$ the literals occurring in $F$ exactly once
  end while
  $\sigma(F) := F$
end

The running time of the procedure MU-MARG is bounded by a polynomial in the length of $F$, because within the while-loop a double occurrence of a literal $L$ is replaced by one occurrence. Please note that any literal of the input formula occurs exactly once in $\sigma(F)$. By an iterative application of the above lemma, we see that $F \in MU$ if and only if $\sigma(F) \in MU$. Now it remains to show that for a formula $F \in MU$ the formula $\sigma(F)$ is marginal.
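The function $\zeta$ and the procedure MU-MARG can be sketched in the same brute-force style. The encoding (integer literals, fresh variables numbered after the largest existing one), the order in which repeated literals and their clauses are chosen, and the helper names are assumptions for illustration; `is_marginal` tests the definition directly by removing each literal occurrence in turn.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def is_mu(cnf):
    if satisfiable(cnf):
        return False
    return all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf)))

def is_marginal(cnf):
    """In MU, and removing any single literal occurrence gives a formula that is not in MU."""
    if not is_mu(cnf):
        return False
    for i, clause in enumerate(cnf):
        for lit in clause:
            shrunk = cnf[:i] + [clause - {lit}] + cnf[i + 1:]
            if is_mu(shrunk):
                return False
    return True

def occurrences(cnf, lit):
    return [c for c in cnf if lit in c]

def zeta(cnf, c1, c2, L, y, z):
    """Replace L v f and L v g by y v f, z v g, (y <-> z) and (not y v not z v L)."""
    H = list(cnf)
    H.remove(c1)
    H.remove(c2)
    f, g = c1 - {L}, c2 - {L}
    return [f | {y}, g | {z}, frozenset({-y, z}), frozenset({y, -z}),
            frozenset({-y, -z, L})] + H

def mu_marg(cnf):
    """Procedure MU-MARG: reduce every repeated input literal to a single occurrence."""
    F = [frozenset(c) for c in cnf]
    fresh = max(variables(F)) + 1
    repeated = sorted({l for c in F for l in c if len(occurrences(F, l)) >= 2})
    for L in repeated:
        while len(occurrences(F, L)) >= 2:
            c1, c2 = occurrences(F, L)[:2]
            F = zeta(F, c1, c2, L, fresh, fresh + 1)
            fresh += 2
    return F

F = [frozenset(c) for c in ({-1, 3}, {1, 2, 3}, {-2, 1}, {-3})]
S = mu_marg(F)
print(is_marginal(F))              # the input is in MU but not marginal
print(is_mu(S), is_marginal(S))
```

On this instance the literals $a$ and $c$ occur twice in the input, so two applications of $\zeta$ are performed and four fresh variables are introduced.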
6. Unique MU Formulas
Another class of restrictions is based on a limited number of satisfying truth assignments. Besides unsatisfiability, minimal unsatisfiability means that for any clause $f$ the formula $F - \{f\}$ is satisfiable. If for any clause $f$, $F - \{f\}$ has exactly one satisfying truth assignment, that is, $F - \{f\}$ is in Unique-SAT, then $F$ is called uniquely minimal unsatisfiable. The class of these formulas is denoted by Unique-MU. At first glance, demanding that for every clause there is exactly one satisfying truth assignment seems to be very strong.
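The definition can be tested directly by counting satisfying assignments. The sketch below is a brute-force illustration with assumed names and an assumed integer encoding of literals; models are counted over the variables of the whole formula $F$, so that deleting a clause does not change the set of variables being assigned.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def count_models(cnf, vs):
    """Number of assignments of the variables vs that satisfy every clause of cnf."""
    total = 0
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            total += 1
    return total

def is_unique_mu(cnf):
    """F is unsatisfiable and F - {f} has exactly one model for every clause f."""
    vs = variables(cnf)
    if count_models(cnf, vs) != 0:
        return False
    return all(count_models(cnf[:i] + cnf[i + 1:], vs) == 1 for i in range(len(cnf)))

# All four clauses over two variables: deleting a clause leaves exactly the
# assignment that falsifies the deleted clause.
K = [frozenset(c) for c in ({1, 2}, {-1, 2}, {1, -2}, {-1, -2})]

# Example formula from the introduction (a = 1, b = 2, c = 3): in MU, but
# deleting the unit clause "not c" leaves three satisfying assignments.
F = [frozenset(c) for c in ({-1, 3}, {1, 2, 3}, {-2, 1}, {-3})]

print(is_unique_mu(K), is_unique_mu(F))
```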
It has been proved that the problem Unique-MU is as hard as the Unique-SAT problem and is therefore probably not $D^P$-complete, because it is not known whether Unique-SAT is $D^P$-complete. It is strongly conjectured that Unique-SAT is neither $D^P$-complete nor in NP or coNP. A slight modification of Unique-MU is the class Almost-Unique-MU of almost unique minimal unsatisfiable formulas. A formula $F \in MU$ is in Almost-Unique-MU if for at most one clause $f$, $F - \{f\}$ may have more than one satisfying truth assignment. Under the assumption that Unique-SAT is not $D^P$-complete, Almost-Unique-MU is harder than Unique-MU, because Almost-Unique-MU is $D^P$-complete.

Theorem 6.1. (Hans Kleine Büning, Xishun Zhao [20])
(1) Unique-MU $=_p$ Unique-SAT, that is, the unique minimal unsatisfiability problem is as hard as the unique satisfiability problem with respect to polynomial reduction.
(2) Almost-Unique-MU is $D^P$-complete.

Proof. We just present the reductions we need.
1. We first define a polynomial-time reduction $\theta$ from Unique-MU to Unique-SAT such that $F \in$ Unique-MU if and only if $\theta(F)$ is in Unique-SAT. In order to simplify the construction we demand that every variable occurs both negatively and positively in the formula. If this is not the case then obviously the formula $F$ is not in MU and therefore not in Unique-MU. For $F = \{f_1, \ldots, f_m\}$ we define
$$\theta(F) := ((F - \{f_1\}) + \{\overline{f_1}\}) \wedge \bigwedge_{1 \le i \le m} (F - \{f_i\})^{i+1}.$$
Here $(F - \{f_i\})^{i+1}$ is the formula obtained by renaming the variables of the formula $F - \{f_i\}$ such that the formulas $(F - \{f_j\})^{j+1}$ ($1 \le j \le m$) and $(F - \{f_1\}) + \{\overline{f_1}\}$ have pairwise different variables, and $\overline{f_1}$ is the conjunction of the negated literals of $f_1$.

The reduction from Unique-SAT to Unique-MU is more complicated. At first we introduce the transformation $\omega(F)$, which will be used later on as a basis for our desired reduction. Let $F = \{f_1, f_2, \ldots, f_m\}$ be a 3-CNF formula over variables $\{x_1, x_2, \ldots, x_n\}$ with clauses $f_i = L_{i1} \vee L_{i2} \vee L_{i3}$. We introduce new variables $\{y_1, y_2, \ldots, y_m\}$; $\pi_i$ ($1 \le i \le m$) denotes the clause $y_1 \vee \cdots \vee y_{i-1} \vee y_{i+1} \vee \cdots \vee y_m$. $\omega(F)$ is the conjunction of the following groups of clauses:
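(A) The clauses $f_1 \vee \pi_1,\; f_2 \vee \pi_2,\; \ldots,\; f_m \vee \pi_m$
(B) The clauses
$\neg L_{11} \vee \pi_1 \vee \neg y_1,\; \neg L_{21} \vee \pi_2 \vee \neg y_2,\; \ldots,\; \neg L_{m1} \vee \pi_m \vee \neg y_m$
$\neg L_{12} \vee \pi_1 \vee \neg y_1,\; \neg L_{22} \vee \pi_2 \vee \neg y_2,\; \ldots,\; \neg L_{m2} \vee \pi_m \vee \neg y_m$
$\neg L_{13} \vee \pi_1 \vee \neg y_1,\; \neg L_{23} \vee \pi_2 \vee \neg y_2,\; \ldots,\; \neg L_{m3} \vee \pi_m \vee \neg y_m$
(C) The clauses $\neg y_i \vee \neg y_j$ ($1 \le i < j \le m$)
(D) The clause $y_1 \vee y_2 \vee \cdots \vee y_m$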
It is not hard to see that $F$ is satisfiable if and only if $\omega(F)$ is minimal unsatisfiable. However, $\omega(F)$ is not necessarily in Unique-MU even if $F$ is uniquely satisfiable. This is because the formula resulting from deleting a clause in group (B) may have multiple satisfying truth assignments. For each $f_i \in F$, $\chi_i$ is the disjunction of all literals $\neg x$, where $x \in var(F) - var(f_i)$, and $\chi$ denotes the disjunction of all literals $\neg x$, where $x \in var(F)$. $\Omega(F)$ is the formula consisting of the following groups of clauses:
(A') For each clause $(f_i \vee \pi_i) \in \omega(F)$: $f_i \vee \pi_i \vee \chi_i$, and $f_i \vee \pi_i \vee x$ for all $x \in var(F) - var(f_i)$
(B') For each clause $(\neg L_{ik} \vee \pi_i \vee \neg y_i) \in \omega(F)$: $\neg L_{ik} \vee \pi_i \vee \neg y_i \vee \chi_i$, and $\neg L_{ik} \vee \pi_i \vee \neg y_i \vee x$ for all $x \in var(F) - var(f_i)$
(C') For each clause $(\neg y_i \vee \neg y_j) \in \omega(F)$: $\neg y_i \vee \neg y_j \vee \chi$, and $\neg y_i \vee \neg y_j \vee x$ for all $x \in var(F)$
(D') The clause $y_1 \vee y_2 \vee \cdots \vee y_m$
Then we can show that $F$ is uniquely satisfiable if and only if $\Omega(F) \in$ Unique-MU.

2. The membership in $D^P$ is easy. For the hardness we recall the $D^P$-complete problem SAT-UNSAT of determining, for a given pair of formulas, whether the first is satisfiable and the second is not [30]. Next we define a reduction from SAT-UNSAT to Almost-Unique-MU. For a pair of formulas $F_1, F_2$, we can assume that $\Omega(F_1)$ and $\Lambda(F_2) := \Omega(F_2) - \{y_1 \vee \cdots \vee y_m\}$ have different variables. Let $h_1$ be a clause in $\Omega(F_1)$ such that $\Omega(F_1) - \{h_1\} \in$ Unique-SAT (from our construction we can easily find such a clause). For a fixed clause $h_2 \in \Lambda(F_2)$ we define
$$G := (\Omega(F_1) - \{h_1\}) + \{h_1 \vee h_2\} + (\Lambda(F_2) - \{h_2\}).$$
We can show that $F_1 \in$ SAT and $F_2 \in$ UNSAT if and only if $G \in$ Almost-Unique-MU.
For a technical reason, in the above construction we assume that $F_2$ contains at least six negative clauses whose variables are distinct. Otherwise, we extend $F_2$, for new variables $x_1, \ldots, x_{18}$, to the formula
$$F_2 + \{\neg x_1 \vee \neg x_2 \vee \neg x_3,\; \ldots,\; \neg x_{16} \vee \neg x_{17} \vee \neg x_{18}\},$$
which is equisatisfiable with $F_2$. $\Box$
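The transformation $\omega(F)$ from the proof above, with its clause groups (A) to (D), can be sketched and checked on tiny inputs. The encoding is an assumption (integer literals, the new variables $y_i$ numbered after the largest input variable), and for the sake of a cheap unsatisfiable test case the sketch accepts clauses of any width rather than only 3-CNF; it covers only $\omega$, not the extended construction $\Omega$.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def is_mu(cnf):
    if satisfiable(cnf):
        return False
    return all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf)))

def omega(cnf):
    """Build the clause groups (A)-(D) of omega(F) for F = {f_1, ..., f_m}."""
    F = [frozenset(c) for c in cnf]
    m = len(F)
    n = max(variables(F))
    y = [n + i + 1 for i in range(m)]                 # new variables y_1..y_m
    pi = [frozenset(y[j] for j in range(m) if j != i) for i in range(m)]
    A = [F[i] | pi[i] for i in range(m)]
    B = [frozenset({-lit, -y[i]}) | pi[i] for i in range(m) for lit in sorted(F[i])]
    C = [frozenset({-y[i], -y[j]}) for i in range(m) for j in range(i + 1, m)]
    D = [frozenset(y)]
    return A + B + C + D

F_sat = [{1, 2, 3}, {-1, 2, 3}]      # a satisfiable 3-CNF formula
F_unsat = [{1}, {-1}]                # unsatisfiable (unit clauses, outside 3-CNF)

W_sat, W_unsat = omega(F_sat), omega(F_unsat)
print(is_mu(W_sat))                                   # F_sat satisfiable
print(satisfiable(W_unsat), is_mu(W_unsat))           # F_unsat unsatisfiable
```

In both cases $\omega(F)$ is unsatisfiable; it is minimal exactly when deleting the clause of group (D) leaves a satisfiable formula, which requires $F$ itself to be satisfiable.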
7. MU Formulas with Disjunctive Splitting
In order to characterize and analyze minimal unsatisfiable formulas, we can split formulas in MU into two minimal unsatisfiable formulas. For a variable $x$ we remove the clauses with literal $\neg x$ (set $\neg x = 1$) resp. $x$ (set $x = 1$). In the remaining clauses we delete the occurrences of the literal $x$ resp. $\neg x$. The resulting formulas are unsatisfiable and therefore contain minimal unsatisfiable subformulas, say $F_x$ and $F_{\neg x}$. More precisely, given a minimal unsatisfiable formula $F$ and a variable $x \in var(F)$, $F$ can be represented in the following form:
$$F = \{(x \vee g_1), \ldots, (x \vee g_r)\} + B_x + C + B_{\neg x} + \{(\neg x \vee f_1), \ldots, (\neg x \vee f_q)\},$$
such that the formulas $\{g_1, \ldots, g_r\} + B_x + C$, denoted by $F_x$, and $C + B_{\neg x} + \{f_1, \ldots, f_q\}$, denoted by $F_{\neg x}$, are minimal unsatisfiable. Here $B_x$, $C$, $B_{\neg x}$ are pairwise disjoint and contain no occurrence of $x$ or $\neg x$. We call $(F_x, F_{\neg x})$ a splitting of $F$ on $x$, and accordingly, $F_x$, $F_{\neg x}$ are called splitting formulas. Generally speaking, the splitting formulas $F_x$ and $F_{\neg x}$ have common clauses, that is, $C$ is nonempty. Whenever $C$ is empty we call $(F_x, F_{\neg x})$ a disjunctive splitting of $F$ on $x$.

A more detailed analysis of the class Unique-SAT leads to the class Dis-MU. A minimal unsatisfiable formula $F$ is in Dis-MU if and only if $F$ has a disjunctive splitting on every variable. That means, for any variable $x$ of $F$, $F$ can be split into two disjoint subformulas in MU. Dis-MU is of interest because it is a proper subclass of Unique-MU and because of its close relation to tree-like decision procedures [19]. For the polynomial-time reduction $\Omega$ we can also prove that $F \in$ Unique-SAT if and only if $\Omega(F) \in$ Dis-MU. Therefore, Dis-MU is at least as hard as Unique-SAT. We did not succeed in finding a reduction from a $D^P$-complete problem, and we conjecture that the problem Dis-MU is not $D^P$-complete.

Theorem 7.1. (Hans Kleine Büning, Xishun Zhao [20]) Dis-MU is at least as hard as the unique satisfiability problem with respect to polynomial reduction.
Open Question 3. Is Dis-MU $D^P$-complete?
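Splitting on a variable, as defined in this section, can be sketched as follows. The sketch makes several assumptions: literals are nonzero integers, each residual clause is tagged with the index of the clause of $F$ it comes from, and a minimal unsatisfiable subformula is extracted by simple deletion, so when the restricted formula has several MU subformulas the result depends on the deletion order. The common part $C$ is then the set of clause indices present on both sides.

```python
from itertools import product

def variables(cnf):
    return sorted({abs(lit) for clause in cnf for lit in clause})

def satisfiable(cnf):
    vs = variables(cnf)
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def restrict(cnf, var, value):
    """Set var := value; return (index, residual clause) pairs of the surviving clauses."""
    true_lit = var if value else -var
    return [(i, c - {-true_lit}) for i, c in enumerate(cnf) if true_lit not in c]

def mu_core(indexed):
    """Deletion-based extraction of an MU subformula from an unsatisfiable clause list."""
    core = list(indexed)
    for idx, _ in indexed:
        trial = [(i, c) for i, c in core if i != idx]
        if not satisfiable([c for _, c in trial]):
            core = trial
    return core

def split(cnf, var):
    """Return (F_x, F_notx, C): both splitting formulas and the indices of shared clauses."""
    f_x = mu_core(restrict(cnf, var, False))     # remove clauses with "not x", delete x
    f_notx = mu_core(restrict(cnf, var, True))   # remove clauses with x, delete "not x"
    common = {i for i, _ in f_x} & {i for i, _ in f_notx}
    return f_x, f_notx, common

def is_disjunctive(cnf, var):
    return not split(cnf, var)[2]

# Example formula from the introduction (a = 1, b = 2, c = 3): the unit clause
# "not c" (index 3) is needed on both sides when splitting on a.
F = [frozenset(c) for c in ({-1, 3}, {1, 2, 3}, {-2, 1}, {-3})]
# All four clauses over two variables: every splitting is disjunctive.
K = [frozenset(c) for c in ({1, 2}, {-1, 2}, {1, -2}, {-1, -2})]

print(split(F, 1)[2], is_disjunctive(K, 1), is_disjunctive(K, 2))
```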
8. MU Formulas Closed under Splitting
From the above results we see that the restrictions of maximality, marginality, disjunctive splitting, etc. cannot reduce the complexity greatly. One reason is probably that these features are not closed under splitting. Take maximality as an example: suppose $F$ is a maximal MU formula and $(F_x, F_{\neg x})$ a splitting of $F$ on $x$; then the splitting formulas are not necessarily maximal.

Example 1: The following is a maximal formula with a non-maximal splitting formula (split on $x$): / x x x ix >x >x <x\ •a a ^a a