WORDS, LAN
COMBINATO
This page intentionally left blank
Proceedings of the International Conference
WORDS, lR#GURGES ft COMBINATO Kyoto, Japan
14 - 18 Mavch 2000
Editors
Masarni I t o Kyoto Sangyo University, Japan
Teruo Imaoka Shimane University, Japan
b
World Scientific NewJersey London Singapore Hong Kong
Published by World Scientific Publishing Co. Re. Ltd. 5 Toh Tuck Link, Singapore 596224 USA once: Suite 202, 1060 Main Street, River Edge, NJ 07661 UK once: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-PublicationData A catalogue record for this book is available from the British Library
WORDS, LANGUAGES & COMBINATOFUCS III Proceedings of the Third InternationalColloquium Copyright 0 2003 by World Scientific Publishing Co. Re. Ltd. All rights reserved. This book, orparts thereof; may not be reproduced in any form or by any means, electronic or mechanical, includingphotocopying, recording or any information storage and retrieval system now known or to be invented, without wrirten permissionfrom the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4948-9
Printed in Singapore.
V
Preface The Third International Colloquium on Words, Languages and Combinatorics was held at Kyoto Sangyo University from March 14 to 18, 2000. The colloquium was a continuation of the previous two International Colloquiums on Words, Languages and Combinatorics held in Kyoto in 1990 and 1992. The colloquium was organized under the sponsorship of the Institute of Computer Science at Kyoto Sangyo University and with the financial support of the Asahi Glass Foundation and the Japan Society for Promotion of Science. The program committee consisted of the following members:
J. Almeida ( U . Porto, Portugal), J . Brzozowski (U. Waterloo, Canada), C. Calude ( U . Auckland, New Zealand), J . Dassow (U. Magdeburg, Germany), K. Denecke (U. Potsdam, Germany), V . Diekert ( U . Stuttgart, Germany), F. G6cseg (U. Szeged, Hungary), T . Hall (Monash U., Australia), T. Head (Binghamton U., USA), J . Howie (U. St Andrews, UK), T . Imaoka (Shimane U., Japan), M. Ito (Kyoto Sangyo U., Japan, chair), H. Jurgensen (U. Western Ontario, Canada & U. Potsdam, Germany), J . Karhumaki (U. Turku, Finland), M. Katsura (Kyoto Sangyo U., Japan), S. Marcus (U. Bucharest, Romania), J. Meakin (U. Nebraska, USA), M. Nivat (U. Paris VI, France), Gh. P5un (Ins. Mathematics, Romania), J . Reif (Duke U., USA), N . Reilly (Simon Fraiser U., Canada), G. Rozenberg (U. Leiden, Netherlands), J . Sakarovitch (ENS des Telecommunication, France), B. Schein (U. Arkansas, USA), G. Thierrin ( U . Western Ontario, Canada), P. Trotter (U. Tasmania, Australia), Do Long Van (Ins. Mathehmatics, Vietnam), M. Volkov (Ural State U., Russia) The topics of the colloquium were: ( a ) semigroups, especially free monoids and finite transformation semigroups, ( b ) codes and cryptography, ( c ) automata, ( d ) formal languages, ( e ) varieties of semigroups and languages, ( f ) word problems, ( 9 ) word- and term-rewriting systems, ( h ) ordered structures and categories, (i) combinatorics on words, ( j ) complexity and computability, ( k ) molecular computing, especially DNA computing, (/) quantum computing The number of participants was 92, from 19 different countries. There were 69 lectures (5 plenary lectures among them) during the sessions. The colloquium was arranged by the conference committee consisting of the following members:
P. Domosi (U. Debrecen, Hungary), Z. Esik ( U . Szeged, Hungary), U . Knauer (U. Oldenburg, Germany), Y. Kobayashi (Toho U., Japan), T. Imaoka (Shimane U., Japan, co-chair), B. Imreh (U. Szeged, Hungary), M. Ito (Kyoto Sangyo U., Japan, cechair), M. Katsura (Kyoto Sangyo U., Japan), M. Kudlek (U. Hamburg, Germany), C. Nehaniv (U. Hertfordshire, UK), F. Otto (U. Kassel, Germany), K. Shoji (Shimane U., Japan), K.P. Shum (Chinese U. of Hong Kong, Hong Kong)
vi
This volume contains papers based on lectures given at the colloquium. All papers have been refereed. The editors express their gratitude to all contributors of this volume including the referees. The organizers would like to express their thanks to the Institute of Computer Science, the Asahi Glass Foundation, the Japan Society for Promotion of Science and the World Scientific Publishing Company for providing the conditions to host the colloquium. We are also grateful to Ms. Yuki Yasuda, Ms. Miyuki Endo, Ms. Tomomi Hirai, Ms. Chikage Totsuka, Mr. Kenji Fujii, Mr. Taro Nakamura, Mr. Jun-ichi Nakanishi, Mr. Tetsuya Hirose and Mr. Ryo Sugiura for their help to realize the colloquium. Finally, we would like to express our appreciation for the assistance of Mr. Christopher Everett during the editing procedure.
March 2003 Masami Ito Department of Mathematics Kyoto Sangyo University Teruo Imaoka Department of Mathematics Shimane University
vi i
Scientific Program March 14, 2000 Plenary lecture 10.00-10.50 Gh. PYun (Instilute of Mathematics of the Romanian Academy),
P systems: An early survey Section A Invited lectures 11.00-11.40 J . Gruska (Masaryk University), Quantum challenges in automata theory 11.40-12.20 D.L. Van (Hanoi Institute of Mathematics), A unified approach to the embedding problem for codes defined by binary relations
12.20-13.00 N.H. Lam (Hanoi Institute of Mathematics), Finite maximal solid codes Contributed lectures 14.30-15.00 D.Y. Long & W.J. Jia (City University of Hong Kong), A new symmetric cripto-algorithm based on prefix codes 15.00-15.30 V. Brattka (FernUniversitat Hagen), The emperor’s new recursiveness 16.00-16.30 A. Yamamura (Communications Research Laboratory) & K. Kurosawa (Tokyo Institute of Technology), Key agreement protocol over a commutative group 16.30-17.00 G. Horvbth, K. Inoue, A. Ito & Y. Wang (Yamaguchi University), Closure property of probabilistic Turing machines and alternating Turing machines with subalgorithmic spaces
...
Vlll
Section B Invited lectures 11.00-11.40 M. Steinby ( University of Turku), Tree automata in term rewriting theory 11.40-12.20 G. Niemann & F. Otto (UniversitZt Kassel), Some results on deterministic restarting automata 12.20-13.00 E. Csuhaj-Varj~(Computer and Automation Research Institute, Hungarian Academy of Sciences) & A. Salomaa (Turku Centre for Computer Science), Networks of Watson-Crick DOL systems
Contributed lectures 14.30-15.00 G. Horvith (Yamaguchi University), Cs. Nagylaki (University of Debrecen) & 2. Nagylaki (Hiroshima University), Visualization of cellular automata 15.00-15.30 H. Nishio, Cellular automata with polynomials over finite fields 16.00-16.30 R. Schott (Universitd Henri Poincard) & J.-C. Spehner (Universitd de Haute Alsace) Two optimal parallel algorithms on the commutation class of a word 16.30-17.00 I. Inata (Toho University), Presentations of right unitary subiiionoids of monoids
March 15, 2000 Plenary lecture 10.00-10.50 J . Meakin (University of Nebraska), One-relator inverse monoids and rational subsets of one-relator groups
ix
Section A Invited lectures 11.00-11.40 L. Kari (University of Western Ontario), Computation in cells 11.40-12.20 S.W. Margolis (Bar-Ilan University), J.-E. Pin (Universitk Paris 7) & M.V. Volkov (Ural State University), Words guaranteeing minimal image 12.20-13.00 C. Campbell (University of St Andrews), The semigroup efficiency of groups and finite simple semigroups
Contributed lectures 14.30-15.00 T. Buchholz, A. Klein & M. Kutrib (University of Giessen), Iterative arrays with limited nondeterministic communication cell 15.00-15.30 T. Saito, Acts over right, left regular bands and semilattices types 16.00-16.30 G. Mashevitzky (Ben Gurion University of the Neger), On definability of weighted circulants by identity 16.30-17.00 M. CiriC & T. PetkoviC (University of NiS), Syntactic and semantic properties of semigroup identities
Section B Invited lectures 11.00-11.40 J. Karhumaki (University of Turku), Remarks on language equations 11.40-12.20 A. Mateescu (University of Bucharest), Routes and trajectories 12.20-13.00 J . Dassow (Otto-von-Guericke-Universitat Magdeburg), On the differentiation function of some language generating devices
Contributed lectures 14.30-15.00 M. Ogawa (NTT Communication Science Laboratories), Well-quasiorders and regular w-languages
X
15.00-15.30 J.A. Anderson (University of South Carolina) & W. Forys (Jagiellonian University), Regular languages and seniretracts 16.00-16.00 K . Hashiguchi, Y. Wada & S. Jimbo (Okayama University), Regular binoid expressions and regular binoid languages 16.30-17.00 K. Shoji (Shimane University), On a proof of Okninski and Putcha’s theorem
March 16, 2000 Plenary lecture 10.00-10.50 J. Shallit (University of Waterloo), Number theory and formal languages
Section A Invited lectures 11.00-11.40 K . Denecke (University of Potsdam), Tree-hyper recognizers and treehyper grammars 11.40-12.20 Z. Esik (University of Szeged) & W. Kuich (Technische Universitat Wien), Inductive *-semirings 12.20-13.00 V. Diekert & & C. Hagenah (Universitat Stuttgart), A remark on equations with rational constraints in free groups
Contributed lectures 14.30-15.00 F. Bassino (Universitd de Marne La Vallde), A characterization of cubic simple beta-numbers 15.00-15.30 T. Poomsa-ard (Khon Kaen University), Hyperidentities in medial graph algebras 15.30-16.00 T. PetkoviC, M. Cirid & S. Bogdanovid (University of Nis), Nonregular varieties of automata
xi
Section B Invited lectures 11.00-11.40 S. Marcus (Institute of Mathematics of the Romanian Academy), From infinite words to languages and back: an expected itinerary 11.40-12.20 C. Choffrut & S. Grigorieff (UniversitC Paris 7), Rational relations on transfinite strings 12.20-13.00 T. Yokomori (Waseda University), On approximate learning of DFAs
Contributed lectures 14.30-15.00 S. Konstantinidis (Saint Mary’s University), Error-detecting properties of languages 15.00-15.30 M. It0 (Kyoto Sangyo University) & Y . Kunimochi (Shizuoka Institute of Science and Technology), On C P N languages 15.30-16.00 S.V. Avgustinovich, D.G. Fon-Der-Flaass & A.E. &id (Sobolev Institute of Mathematics), Arithmetical complexity of infinite words
March 17, 2000 Plenary lecture 10.00-10.50 J. Almeida (University of Porto) & A. Escada (University of Coimbra), Semidirect products with the pseudovariety of all finite groups
Section A Invited lectures 11.00-11.40 K.P. Shum (Chinese University of Hong Kong), On super Hamiltonian semigroups 11.40-12.20 G. SCnizerguesu (UniversitC Bordeaux I), The equivalence problem for a subclass of Q-algebraic series
xii
12.20-13.00 M. Ozawa (Nagoya University), Computational equivalence between quantum circuits and quantum Turing machines
Contributed lectures 14.30-15.00 0. Carton (Universitk de Marne-la-vallk), R-trivial languages of words on countable ordinals 15.00-15.30 N. RuSkuc (University of St Andrews), Some (easy?) questions concerning semigroup presentations 16.00-16.30 B. Steinberg (University of Porto), Polynomial closure and topology 16.30-17.00 H. Machida (Hitotsubashi University), Some properties of hyperoperations and hyperclones 17.00-17.30 R. Matsuda (Ibaraki University), Characterization of valuation rings and valuation semigroups by semistar-operations
Section B Invited lectures 11.00-11.40 A.V. Kelarev & P.G. Trotter (University of Tasmania), A combinatorial property of automata, languages and syntactic monoids 11.40-12.20 J. Sakarovitch (ENST), Star height of rational languages: a new presentation for two old results 12.20-13.00 P. Domosi (University of Debrecen) & M. Kudlek (Universitat Hamburg), An improvement of iteration lemmata for context-free languages
Contributed lectures 14.30-15.00 P. Domosi (University of Debrecen), M. Kudlek (Universitat Hamburg) & s. Okawa (University of Aizu), A homomorphic characterization of recursively enumerable languages 15.00-15.30 B. Imreh (University of Szeged), M. Ito (Kyoto Sangyo University) & A. Pukler (Istvin Szkchenyi College), On commutative asynchronous automata
xiii
16.00-16.30 T. Imaoka (Shimane University), Some remarks on representations of orthodox *-semigroups 16.30-17.00 Z. PopoviE, S. BogdanoviC, M. CiriC & T. PetkoviC (University of NiS), On finite generalized directable automata 17.00-17.30 C. Choffrut (Universitd Paris 7), S. Horvith (Eotvos Lorind University) & M. Ito (Kyoto Sangyo University), Monoids and languages of transfinite word
March 18, 2000 Plenary lecture 10.20-11.10 J.-E. Pin (Universitb Paris VII) & P. Weil (Universitd Bordeaux I and CNRS) , Semidirect products of ordered semigroups
Section A Invited lectures 11.20-12.00 A. Atanasiu, C. Martin-Vide & V. Mitrana, On the sentence valuations in a semiring - An approach to the study of synonymy 12.00-12.40 K . Auinger (Universitat Wien), Join decompositions involving pseudovarieties of semigroups with commuting idempotents
Contributed lectures 14.10-14.40 T. Koshiba (Telecommunications Advancement Organization of Japan) & K. Hiraishi (JAIST), A note on finding one-variable patterns consistent with examples and counterexamples 14.40-15.10 M. Yasugi & M. Washihara (Kyoto Sangyo University), Rademacher functions and computability
xiv
Section B Invited lectures 11.20-12.00 J.-E. Pin (Universitd Paris VII) & P. Weil (Universiti Bordeaux 1 and CNRS), Semidirect products of ordered semigroups - Applications t o languages 12.00-12.40 C. Mauduit (Institut de Mathematiques de Luminy), Pseudorandom words
Contributed lectures 14.10-14.40 E. Moriya & T. Tada (Waseda University), Relation between the space complexity and the number of stack-head turns of pushdown automata 14.40-15.10 M. Mitrovid, S. BogdanoviC & M. CiriC (University of NiS), Iteration of matrix decompositions
xv
List of Speakers Almeida, J. (University of Porto) e-mail: j
[email protected] Anderson, J.A. (University of South Carolina) e-mail:
[email protected] Auinger, K. (Universitat Wien) e-mail:
[email protected] Bassino, F. (Universitk de Marne La Vallke) e-mail:
[email protected] Brattka, V. (FernUniversitat Hagen) e-mail:
[email protected] Campbell, C. (University of St Andrews) e-mail:
[email protected]. uk Carton, 0. (Universitk de Marne-la-Vallke) e-mail:
[email protected] Choffrut, C. (Universitk Paris 7) e-mail:
[email protected] CiriC, M. (University of Nis) e-mail: ciricmebankerinter .net Csuhaj-Varj6, E. (Computer and Automation Res. Inst. Hung. Academy) e-mail:
[email protected] Dassow, J. (0tto-von-Guericke-Universitat Magdeburg) e-mail:
[email protected] Denecke, K. (University of Potsdam) e-mail:
[email protected] Diekert, V. (Universitat Stuttgart) e-mail:
[email protected] Domosi, P. (University of Debrecen) e-mail:
[email protected] Esik, Z. (University of Szeged) e-mail:
[email protected] xvi
Frid, A. E. (Sobolev Institute of Mathemathics) e-mail:
[email protected] Gruska, J. (Masaryk University) e - mail: gruskaeinformati cs.muni. cz Hashiguchi, K. (Okayama University) e-mail:
[email protected] Horvith, S . (Eotvos L o r h d University) e-mad:
[email protected] Imaoka, T . (Shimane University) e-mail:
[email protected] himane-u .ac.jp Imreh, B. (University of Szeged) e-mail: imreh0inf.u-szeged.hu Inata, I. (Toho University) e-mail:
[email protected] Inoue, K. (Yamaguchi University) e-mail:
[email protected] Ito, M. (Kyoto Sangyo University) e-mail:
[email protected] Karhumaki, J. (University of Turku) e-mail:
[email protected] Kari, L. (University of Western Ontario) e-mail:
[email protected] Kelarev, A.V. (University of Tasmania) e-mail:
[email protected] Konstantinidis, S . (Saint Mary’s University) e-mait S.Kons t
[email protected] Koshiba, T. (Secure Computing Laboratory, Fujitsu Laboratory Ltd) e-mail: koshi baeyokohama. t ao .go .jp Kudlek, M. (Universitat Hamburg) e-mail:
[email protected] Kutrib, M. (University of Giessen) e-mail:
[email protected] Lam, N.H. (Hanoi Institute of Mathematics) e-mail:
[email protected] xvii
Long, D,Y. (City University of Hong Kong) e-mail:
[email protected] Machida, H. (Hitotsubashi University) e-mail:
[email protected] Marcus, S . (Institute of Mathematics of the Romanian Academy) e-mail:
[email protected] Mashevitzky, G. (Ben Gurion University of the Neger) e-mail:
[email protected] Mateescu, A. (University of Bucharest) e-mail:
[email protected] Matsuda, R. (Ibaraki University) e-mail:
[email protected] Mauduit, C . (Institut de Mathematiques de Luminy) e-mail:
[email protected] Meakin, J. (University of Nebraska) e-mail:
[email protected] Mitrana, V. (University of Bucharest) e-mail:
[email protected] Mitrovib, M. (University of Nis) e-mail: meli@junis .ni.ac.yu Moriya, E. (Waseda University) e-mail:
[email protected] Nagylaki, Z. (Hiroshima University) e-mail:
[email protected] Nishio, H. (Kyoto, Japan) e-mail:
[email protected] Ogawa, M. (NTT Communication Science Laboratories) e-mail:
[email protected] Otto, F. (Universitat Kassel) e-mail:
[email protected] Ozawa, M. (Tohoku University) e-mail:
[email protected] P b n , Gh. (Institute of Mathematics of the Romanian Academy) e-mail:
[email protected] xviii
PetkoviC, T. (University of NiB and TUCS) e-mail:
[email protected] Pin, J.-E. (Universitk Paris VII) . e-mail:
[email protected] Poomsa-ard, T. (Khon Kaen University) e-mail:
[email protected] Popovid, Z. (University of NiB) e-mail:
[email protected] Ruskuc, N. (University of St Andrews) e-mail:
[email protected] Saito, T. (Innoshima, Japan) e-mail:
[email protected] Sakarovitch, J. (ENST) e-mail:
[email protected] Schott, R. (Universitk Henri Poincark) e-maik
[email protected]. Senizergues, G. (Universiti Bordeaux I) e-mail:
[email protected] Shallit, J. (University of Waterloo) e-mail: shallit C3graceland.math .uwaterloo. ca Shoji, K. (Shimane University) e-mail:
[email protected] Shum, K.P. (Chinese University of Hong Kong) e-mail:
[email protected] Steinberg, B. (University of Porto) e-mail:
[email protected] Steinby, M. ( University of Turku) e-mail:
[email protected] Van, D.L. (Hanoi Institute of Mathematics) e- mail: dlvan @thevin h.ncs t .ac.vn Volkov, M.V. (Ural State University) e-mail:
[email protected] Weil, P. (Universitk Bordeaux I and CNRS) e-,mail: WeilC3labri.u-bordeaux.fr
xix
Yamamura, A. (Communications Research Laboratory) e-mail:
[email protected] Yasugi, M. (Kyoto Sangyo University) e-mail:
[email protected] p Yokoinori, T. (Waseda University) e-mail:
[email protected] This page intentionally left blank
xxi
Table of Contents Contributed Papers Semidirect Products with the Pseudovariety of All Finite Groups . J. Almeida (Porto, Portugal) and A. Escada (Coimbra, Portugal)
. . .
1
On the Sentence Valuations in a Semiring . . . . . . . . . . . . . 22 A. Atanasiu (Bucharest, Romania), C. Martin- Vide (Tarragona, Spain) and V. Mitrana (Bucharest, Romania) Join Decompositions of Pseudovarieties of the Form DH K. Auinger (Wien, Austria)
n ECom . . .
Arithmetical Complexity of Infinite Words . . . . . . . . . . . S. V. Avgustinovich (Novosibirsk, Russia), D. G. Fon-Der-Flaass (Novosibirsk, Russia) and A. E. Frid (Novosibirsk, Russia)
. .
The Emperor’s New Recursiveness: The Epigraph of the Exponential Function in Two Models of Computability . . . . . . . . . . . . V. Brattka (Hagen, Germany)
51
.
63
.
73
. . . . . . . .
88
Iterative Arrays with Limited Nondeterministic Communication Cell T. Buchholz (Giessen, Germany), A. Klein (Giessen, Germany) and M. Kutrib (Giessen, Germany) %Trivial Languagesof Words on Countable Ordinals 0. Carton (Marne-la- Vallke, fiance)
40
The Theory of Rational Relations on Transfinite Strings . . . C. Choflrut (Paris, France) and S. Grigorieff (Paris, Frunce)
. . . .
103
Networks of Watson-Crick DOL Systems . . . . . . . . . . . . . . 134 E. Csuhaj- Varjd (Budapest, Hungary) and A. Salomaa (lhrku, Finland) On the Differentiation Function of Some Language Generating Devices J. Dassow (Magdeburg, Germany)
151
xxii
Visualization of Cellular Automata . . . . . . . . . . . . . . . . 162 M. Deminy (Debrecen, Hungary), G. Horva'th (Debrecen, Hungary), Cs. Nagylaki (Debrecen, Hungary) and 2. Nagylaki (Debrecen, Hungary) On a Class of Hypercodes . . . . Do Long Van (Hanoi, Vietnam)
................
A Parsing Problem for Context-Sensitive Languages . . . . . P. Domosi (Debrecen, Hungary) and M. It0 (Kyoto, Japan)
. . . .
171
183
An Improvement of Iteration Lemmata for Context-Free Languages . . 185 P. Domosi (Debrecen, Hungary) and M. Kudlek (Hamburg, Germany) Quantum Finite Automata . . . . . . . . . . . . . . . . . . . . 192 J. Gruska (Brno, Czech Republic) and R. Vollmar (Karlsruhe, Germany) On Commutative Asynchronous Automata . . . . . B. Imreh (Szeged, Hungary), M. It0 (Kyoto, Japan) and A . Pukler (Gyor, Hungary)
. . . . . . . .
212
Presentations of Right Unitary Submonoids of Monoids . . . . . . . 222 I. Inata (Funabashi, Japan) A Combinatorial Property of Languages and Monoids . . . . . . . . 228 A. V. Kelarev (Hobart, Australia) and P. G. Trotter (Hobart, Australia) Error-Detecting Properties of Languages S. Konstantinidis (Halifax, Canada)
. . . . . . . . . . . . . .
240
A Note on Finding One-Variable Patterns Consistent with Examples and Counterexamples . . . . . . . . . . . . . . . . . . . . . . T. Koshiba (Kawasaki, Japan) and K. Hiraishi (Ishikawa, Japan)
253
On the Star Height of Rational Languages: A New Presentation for Two Old Results . . . . . . . . . . . . . . . . . . . . . . . . S. Lombardy (Paris, fiance) and J. Sakarovitch (Paris, France)
266
Some Properties of Hyperoperations and Hyperclones
H. Machida (Kunitachi, Japan)
. . . . . . . .
286
xxiii
Words Guaranteeing Minimal Image . . . . . . . . . . . . . S. W. Margolis (Ramat Gan, Israel), J.-E. Pin (Paris, fiance) and M. V. Volkov (Ekaterinburg, Russia)
. . .
297
Power Semigroups and Polynomial Closure . . . . . . . . . . . . . 311 S. W. Margolis (Ramat Gan, Israel) and B. Steinberg (Porto, Portugal)
..............
.
323
Characterization of Valuation Rings and Valuation Semigroups by Semistar-Operations . . . . . . . . . . . . . . . . . . . . . . R. Matsuda (Mito, Japan)
.
339
..
352
Routes and Trajectories . . . . . . A . Mateescu (Bucharest, Romania)
Further Results on Restarting Automata . . . . . . . . . . . . G. Niemann (Kassel, Germany) and F. Otto (Kassel, Germany) Cellular Automata with Polynomials over Finite Fields H. Nishio (Kyoto, Japan)
. . . . . . . 370
Generalized Directable Automata . . . . . . . . . . . 2. Popovic' (NiS, Serbia), S. Bogdanovic' (NiS, Serbian), T. Petkovic' (Turku, Finland) and M. CiriC (NiS, Serbia) Acts over Right, Left Regular Bands and Semilattices Types T. Saito (Innoshima, Japan)
. . . . ..
. . . . . 396
Two Optimal Parallel Algorithms on the Commutation Class of a Word . . . . . . . . . . . . . . . . . . . . . . . . . . R. Schott (Nancy, France) and J.-C. Spehner (Mulhouse, France) A Proof of Okninski and Putcha's Theorem K. Shoji (Matsue, Japan)
378
.
403
. . . . . . . . . . . . 420
Subdirect Product Structure of Left Clifford Semigroups . . . . K. P. Shum (Hong Kong, China), M. K. Sen (Calcutta, India) and Y. Q. Guo (Kunming, China) Tree Automata in the Theory of Term Rewriting M. Steinby (Turku, Finland)
. . . 428
. . . . . . . . . .
434
xxiv
Key Agreement Protocol Securer Than DLOG . . . . . . . . . . . 450 A . Yamamum (Tokyo, Japan) and K. Kurosawa (Hitachi, Japan)
A Note on Rademacher Functions and Computability . . . . . M. Yasugi (Kyoto, Japan) and M. Washihara (Kyoto, Japan) Authors Index
.
. . .
466
. . . . . . . . . . . . . . . . . . . . . . . .
477
1
Semidirect products with the pseudovariety of all finite groups* Jorge Almeida
Ana Escada
Abstract This is a survey of recent results related to semidirect products of an arbitrary pseudovariety with the pseudovariety of all finite groups. The main flavour is the establishment of links between various operators on pseudovarieties, some obviously computable, others known not to be so. This not only leads to decidability results but does so in a sort of uniform way which has a structural tint even though the arguments are mostly syntactical.
1
Introduction
Many problems in computer science lead to decidability questions on pseudovarieties of finite semigroups. Often the problem involves some sort of decomposition process which in terms of pseudovarieties translates to the calculation of a semidirect product of pseudovarieties. When just two factors are concerned, the cases in which the second factor is the pseudovariety G of all finite groups or the pseudovariety D of all finite definite semigroups have attracted the most attention [18,3, 22, 43, 461. This paper is a survey of some recent work around the theme of the semidirect product with the pseudovariety G. It uses some powerful tools to deal with such semidirect products, particularly when the second factor is the pseudovariety G , to obtain syntactic proofs of equalities of the form V * G = &V,where &Vdenotes the pseudovariety consisting of all finite semigroups 'The authors gratefully acknowledge support by FCT through the Centro de Matemc'tica da Universidade do Porto and the Centro de Matemc'tica da Universidade de Coimbm, respectively, and by the FCT and POCTI approved project POCTI/32817/MAT/2000 which is comparticipated by the European Community Fund FEDER.
2
whose idempotents generate subsemigroups from V. Subpseudovarieties V of DS are considered, including all subpseudovarieties of LI,DA, DS itself, and J. The latter of these provides a new proof of a crucial step in a result of Henckell and Rhodes [23] which is their deduction from Ash's inevitability theorem [13] of the famous equality IPG = 'BG between the pseudovariety generated by all power semigroups of finite groups and the pseudovariety of all finite semigroups in which regular elements have a unique inverse [33]. The arguments are of a syntactical/combinatorial nature. They consist in suitable formal manipulations of words in the enlarged signature with a pseudo-inversion operation which is never nested. Most proofs are only sketched here. See the full paper [6] for further details.
2
Generalities
We gather in this section the necessary notation and background for the remainder of the paper. The reader is referred to [32] for a basic introduction to finite semigroup theory and to [3] for a more comprehensive treatment based on methods which are closer to those adopted here. See also these references for any undefined terms. By a pseudovariety we mean a class of finite semigroups which contains all homomorphic images, subsemigroups, and finite direct products of members of the class. The most active and successful area of finite semigroup theory is precisely the study of pseudovarieties, particularly some natural operations on them such as the semidirect product. Such operations are often obtained by applying some natural algebraic operator to semigroups from the argument pseudovarieties and closing up to the generated pseudovariety.
2.1
Various operators on pseudovarieties
Let V and W be pseudovarieties. The semidirect product pseudovariety V * W is defined to be the pseudovariety generated by all semidirect products S* T with S E V and T E W. It turns out that rather than using general semidirect products one may use specifically the wreath product which, in a suitable context, is associative, and so the semidirect product of pseudovarieties is also associative. A few other operators will play a role in this paper. The join VVW is simply the pseudovariety generated by the class VUW or, to use an algebraic operator, by all direct products S x T with S E V andT~W. Denote by &V the class of all finite semigroups S whose idempotents
3
generate a subsemigroup which lies in V. Note that the operator E is idempotent. The Mal'cev product V @ W is the pseudovariety generated by all finite semigroups S for which there is a homomorphism cp : S -+ T with T E W and p-'e E V for every idempotent e E T . As indicated below, the Mal'cev product has important links with the semidirect product. The power operator Y associates with V the pseudovariety IPV generated by all power semigroups Y ( S )with S E V. See [3] for an extensive study of this operator and [19,201 for recent improvements and extensions. Let S be a finite semigroup and D one of its regular 'D-classes. Let be the equivalence relation on the set of group elements of D generated by the identification of elements which are either 3 or C-equivalent. A block of D is the Rees quotient of the subsemigroup of S generated by a --class modulo the ideal consisting of the elements which do not lie in D. The blocks of S are the blocks of its regular 'D-classes. The block operator associates with V the class of all finite semigroups whose blocks lie in V, which can be shown to be a pseudovariety. For a semigroup S, denote by E ( S ) the set of its idempotents. The local operator C is defined by letting LV consist of all finite semigroups S all of whose submonoids of the form eSe, with e E E ( S ) ,lie in V. Note that C is also an idempotent operator. The class DV is defined to consist of all finite semigroups whose regular 'D-classes are subsemigroups which lie in V. It is again easy to see that DV is a pseudovariety. Among the above operators, which do not exhaust those of interest for the applications, some have explicit structural definitions while others involve taking the pseudovariety generated by a subclass which itself is defined explicitly in that sense. Note that membership in classes with such explicit structural definitions is relatively easy to test and, in particular, can be done algorithmically. Call a class of finite semigroups decidable if there is an algorithm to test membership in it. It is by no means obvious how to construct an algorithm for the pseudovariety generated by a decidable class. In fact, this task is not always possible. More precisely, the join [l] and the semidirect and Mal'cev products [37] of decidable pseudovarieties may not be decidable. Recently, Auinger and Steinberg [15] have announced that the power operator also fails to preserve decidability. Thus, any connections which may be found between operators defined by generators and structurally defined operators are particularly useful and often translate in an elegant manner into algorithms for computing values of the former. We review below some such connections which are of interest for the specific topic of this paper. For this purpose, we need to introduce some
-
4
further ideas and results.
2.2
Semidirect products with D
Tilson [46] introduced the notions of pseudovariety of (finite) categories and of pseudovariety of (finite) semigroupoids (categories without the requirement of local identities) and he showed that one is led to considering them by studying semidirect products of pseudovarieties of semigroups. In this context, semigroups are seen as semigroupoids by viewing elements as edges (or morphisms) at a virtual single vertex (or object). On the other hand, the edges of a semigroupoid with both ends at a particular vertex v, assuming there is at least one, form a semigroup which is called the local semigroup at v. See Tilson’s work (and the recent continuation [41]) for precise definitions and results. It is well known that pseudovarieties of semigroups may be defined by formal equalities between members of free profinite semigroups (this is basically Reiterman’s theorem [2, 351; see [lo] for a presentation in this language), which are called pseudoidentities. For pseudovarieties of semigroupoids there is an analogous result where free profinite semigroupoids freely generated by finite graphs play the role of generating sets of free profinite semigroups [ll, 261. Thus, pseudoidentities for pseudovarieties of semigroupoids are written over finite graphs. See Theorems 2.4 and 2.6 for specific examples. The global gV of a pseudovariety V of semigroups is the pseudovariety of semigroupoids generated by V. We say that V is local if gV is defined by pseudoidentities over 1-vertex graphs. Consider some frequently encountered pseudovarieties of semigroups, where [C]denotes the class of all finite semigroups which satisfy all members of a set C of pseudoidentities. In such pseudoidentities we adopt the convention that e, f,. . . stand for idempotents and 0 for a zero. So, for instance, a semigroup satisfies the pseudoidentity ex = xe if and only if its idempotents commute with all elements. S1 = {finite semilattices} = [x2 = z,zy = y z ]
B = {finite bands} = [x2 = x ] 0 = {finite orthodox semigroups} = [ ( e f ) 2 = ef] Corn = {finite commutative semigroups} R = {finite %trivial semigroups} L = {finite C-trivial semigroups} J = {finite &trivial semigroups} = R n L
5
D = {finite semigroups in which idempotents are right zeros} D, = [ Y Z ~" - z n = 2 1 ..-z,I K = {finite semigroups in which idempotents are left zeros} K, = [ZI.*-z,Y = ~1 .-.z,] N = {finite nilpotent semigroups} = K n D = [e = 01 G = {finite groups} Ab = {finite Abelian groups} LG = {finite left groups) = D1 V G = [ e z = z] RG = {finite right groups} = K 1 V G = [ z e = z] A = {finite aperiodic semigroups} CS = {finite simple semigroups} CR = {finite completely regular semigroups} = {finite unions of groups} S = {finite semigroups} I = {singleton semigroups} We warn the reader that the notation 0 is often found in the literature with a different meaning [24]. A pseudovariety of semigroups is said to be monoidal if it is generated by its monoids. The interest of the notion of a local pseudovariety of semigroups comes from the following result which is a simplified version of the so-called delay theorem.
Theorem 2.1 ([46]). A monoidal pseudovariety V of semigroups is local if and only if V * D = CV. Theorem 2.2 (1171). The pseudovariety S1 is local. Theorem 2.3 ([18, 421). The pseudovariety R is local (therefore so is L). Theorem 2.4 ([30]). The pseudovariety J is not local, 2.
gJ = [ ( z y ) " z t ( ~ t ) "= (zy)"(Zt)"; -.
z
1.'
Y, t
Theorem 2.5 ([43]). A s pseudovarieties of monoids, nontrivial pseudovarieties of groups are local.
--
Theorem 2.6 ([45]). The pseudovariety Com is not local,
gCom = [ z y z = z y z ;
-.
2,z Y
1.
An orthogroup is an orthodox completely regular semigroup. 'See Subsection 2.4 for a definition of the w-power.
6
Theorem 2.7 ([27]). Every monoidal pseudovariety of orthogroups which does not consist entirely of groups is local. Theorem 2.8 ([28]). The pseudovariety 'DS is local. Theorem 2.9 ([4]). The pseudovariety DA is local.
2.3
Semidirect products -
* H vs other operators
It is a well-known result that, for a pseudovariety V,
and the first inclusion is an equality if V is local and monoidal. Theorem 2.10 ([42, 181). R * G = ER. Theorem 2.11 ([12]). S1* G = ESl. Theorem 2.12 ([IS]). The equality V @ G = EV holds for every nontrivial monoidal pseudovariety V of bands.
Combining with Theorem 2.7, we deduce that the equality V * G = EV holds for every nontrivial monoidal pseudovariety of bands. In the same paper the authors of Theorem 2.12 claim to establish the following result but their arguments are flawed. Based on results of Szendrei [44], P. Trotter has shown in unpublished work that indeed the result is true. Theorem 2.13. CR * G = ECR.
The following is a combination of results of Margolis and Pin [31] and Henckell and Rhodes [23], the latter depending on a deep theorem of Ash [13] which is presented further below. Theorem2.14. I P G = J * G = J @ G = E J = ' B G .
A pseudovariety of groups V is said to be arborescent if (V n Ab) * V = V. Gildenhuys and Ribes [21] had shown that, for such pseudovarieties, their free profinite groups have Cayley graphs which are profinite trees (in a natural homological sense). Almeida and Weil [9] in turn showed that the converse also holds. The following result is a combination of parts of results of Steinberg [38, 391 who has done very extensive work on pseudovarieties of the form V * H and related pseudovarieties. Theorem 2.15. For an arborescent pseudovariety H of groups, IPH = J
H=J@H.
*
7
By further refining the study of the geometry of Cayley graphs of relatively free profinite groups, Auinger and Steinberg [14] have recently obtained a characterization of all pseudovarieties of groups for which the equality J *H = J @ H holds. The situation is however different from that of G.
Theorem 2.16 ([25]). The inequality J @ H # 'BH holds for every pseudovariety H G closed under extension.
2
5
Note that IBH # EV for every pseudovariety H G and every pseudovariety V since G E l but G IBH. From a result of Karnofsky and Rhodes [29] it follows that
c
A*GS&A. For a pseudovariety H of groups, denotes the class of all finite semigroups all of whose subgroups lie in H, which is easily shown to be a pseudovariety. Theorem 2.17 ([47]).For V = CS n Ab, we have the following strict inclusions: V * G V @ G EV.
5
2
Fkom work of Rhodes [36] it follows that V = A * G is also an example in which both inclusions in ( 1 ) are strict and in fact EV = &A. Among the various questions suggested by the above results, we consider in this paper the following problem.
Problem. For which pseudovarieties V do we have V * G = EV? At this point, we add a couple of elementary observations related with the above problem.
1. The pseudovariety &Vis the largest pseudovariety W such that W * G &V.
c
2. We have &'DOE &A[6].
3. If V g &Aand (V n A) * G = E(V n A), then V * G = EV
2.4
(w- 1)-words
By an (w-1)-word we mean a term in a free unary semigroup where the unary operation is denoted (-)"-l. The height of an (w - 1)-word is h(w) = 0 if w does not involve the operation (-)"-l, and is recursively defined by letting h((~)~= - ' h(w) ) 1 and ~ ( w ~ w=zmax{h(wl), ) h(wZ)}.
+
8
The natural interpretation of the unary operation (-)"-l in a finite semigroup S associates with an element s the inverse sW-l of se in the maximal subgroup K of the subsemigroup generated by s, where sw = e is the idempotent of K . The following constitute a Noetherian system of reduction rules which preserve equality in the free group:
a(wa)w-l -+ (w)"-l (aw)w-la -+ (w)"-l (1)-1 -+ 1 The system is not confluent since, for instance, the (w-1)-word (ab)w-la(ca)w-l may be reduced to bw-l(ca)w-l and also to (ab)w-lc"-l but both of these are irreducible (w - 1)-words. For words that reduce to 1, there is a more convenient set of reduction rules. L e m m a 2.18. Let w be an (w - 1)-word of height a t most 1 which i s equal to 1 in the free group. Then it is possible to reduce w to the empty word by applying a finite number of times rules of the form U(VU)W--1V
-+ 1
(2)
where u and v are possibly empty words. L e m m a 2.19 ( P r o d u c t Inverse Formula). In a finite semigroup, the following formula holds:
where a: = (ai+l. . .anal . . .ai)w-lai+l .. .anal . . .ai-1.
(3)
Let S be a finite semigroup and let s E S. By a weak inverse of s we mean an element t E S such that tst = t. Note that sW-l is the only power of s which is a weak inverse of s. Also, the element a: of Lemma 2.19 given by (3) is a weak inverse of ai. A weak conjugate of s is an element of the form asb where one of a and b is a weak inverse of the other. The self-conjugate core D ( S ) is defined to be the smallest subsemigroup of S containing the idempotents which is closed under weak conjugation.
Corollary 2.20. Let S be a finite semigroup and let ai E S and ui E D ( S )U (1) (i = 1,.. . , n ) . Then the product
a l u l . . .a,u,(a,+l. lies in D ( S ) .
. .anal . . .
%+iG+i. . .'%an
9
By a relational morphism p : S + T between two semigroups we mean a relation with domain S which is a subsemigroup of S x T. We shall call a graph what is usually called a directed multigraph, i.e., edges are directed and there may be several edges with the same end vertices. An edge-labeling of a graph by a semigroup S is a function which associates with each edge an element of the semigroup. An edge-labeling X of a graph by a group is said to commute if, for every circuit ( e l , . . . ,en) (which is turned into an oriented cycle if some of the edges in it are reversed), we have the equality (XS,)'~ . . . (Xsn)'- = 1where ~i = -1 or E, = 1according to whether the edge ei is reversed in the circuit or not. For a pseudovariety H of groups, an edge-labeling X of a finite graph I? by a finite semigroup S is said to be H-inevitable if, for every relational morphism p : S + G into G E H, there is a commuting edge-labeling of I? by G which is p-related with A. We may now formulate Ash's inevitability theorem [13] as follows taking into account some observations in [8]. Denote by AS the free profinite semigroup freely generated by a set A.
Theorem 2.21. An edge-labeling X of a finite graph I? by a finite semigroup S i s G-inevitable if and only if, for every (or for some) onto homomorphism q : AS + S , there is an edge-labeling p of I? by (w - 1)-words of height at most 1 such that q o p = X and p commutes over G .
In the terminology of J. Rhodes, an element s of a finite semigroup S is called a type 11element if, for every relational morphism p : S + G into a finite group, (s,1) E p. In other words, the l-vertex l-edge graph labeled s is G-inevitable. The set of all type I1 elements of S is denoted K ( S ) and is called the group kernel of S. By Theorem 2.21, s E K ( S ) if and only if there is some (w - 1)-word of height at most 1 that is equal to 1 in the free group and evaluates to s in S. From this observation it is now easy to establish the following result which was known in the 1980's as the type 1 1conjecture and which was proposed by J. Rhodes. Theorem 2.22 ([13]). For every finite semigroup S , K ( S ) = D ( S ) .
Proof. Note that aba = a =+ e E E(S) a
asb = a(ba)"-'sb, e=eW.
Hence D ( S ) E K ( S ) . Conversely, if s E S is a type I1 element, then it admits an expression as an (w - 1)-word w of height at most 1 which evaluates to 1 in the free group.
10
By Lemma 2.18, the (w - 1)-word w admits a factorization into factors of the form a l u l . . . a,u,(a,+l. . .anal . . . a,)w-lu,+lar+l . . .u,a, with the ui evaluating to 1 in the free group (i = 1 , . . . ,TI). Thus, assuming inductively that all ui E D ( S ) U {l},by Corollary 2.20 we deduce that s E D(S). 0
Bases for semidirect products V * G
2.5
The following result combines Theorem 2.21 with the special case of semidirect products with G of what has come to be known as the basis theorem. It was proved by Almeida and Weil [ll]as a combination of profinite techniques with Tilson's derived category theorem [46].
Theorem 2.23. Let V be a pseudovariety of semigroups and suppose g V admits a basis of pseudoidentities E involving only a bounded number of vertices. Then the semidirect product V * G is defined by the pseudoidentities of the f o m $p = $q where p = q is a pseudoidentity from C over a finite graph I' and cp is an edge-labeling of 'I by (w - 1)-words of height at most 1 which commutes over G . We present next a simple application of this result for which a further few preliminaries are needed. Denote by B2 the 5-element multiplicative matrix semigroup consisting of the 2 x 2 matrices over 2 / 2 2 with at most one nonzero entry. Observe that B2 is locally a semilattice, i.e., B2 E LSl. Let X be an alphabet and let X - l be a disjoint set of formal inverses of the letters. For a word w over X (or, more generally, a member of OxS) using all the letters, let -W denote the equivalence relation on Y = X U X - l generated by the pairs (2-',y) such that xy is a factor of w. Then the most general X-labeled graph rwwith initial and final vertices (which may also be seen as an automaton) supporting w, in the sense that the word w may be read along the graph from the initial to the final vertex, is obtained as follows: 0
Vertices(I',) = Y/mW;
0
Edge@,):
0
initial vertex: x/-, where x is the first letter of w;
0
final vertex: x-'/-,
x/wW-%z/-,
if y
-W
x and y-'
-, z;
where x is the last letter of w.
11
Based on results of Reilly [34] in the context of inverse semigroups, Almeida, Azevedo and Teixeira [5] have observed that a semigroup pseudoidentity u = v is valid in B:!if and only if u and v use exactly the same variables and J?, = r V .This in turn allowed them to prove the following result which explains why the calculation of globals concentrates on pseudovarieties excluding B2.
Theorem 2.24. If B2 E V and V = [C], then g V is defined by the pseudoidentities in C viewed over the most general graphs supporting them. Using this result and Theorem 2.23, we may now proceed to compute some specific semidirect products with G. Proposition 2.25. Let V be a pseudovariety containing S1 and suppose {ui = v, : i E I } is a basis of pseudoidentities f o r V . Then C V is local and (CV) * G is defined by the pseudoidentities of the f o r m
u i ( z w y l x w , . .. ,xWy,zW) = v2(zWy1xW,. . . ,xWy,xW)
(4)
with i E I , x a variable, and the y j (w - 1)-words of height at most 1 which are 1 in groups. Proof. Since V 2 S1, the local semillatice B2 belongs to C V . Note that C V is defined by the pseudoidentities of the form (4)where x , y1 , , .. ,Y n are distinct variables and u i , q depend on the same n variables. By Theorem 2.24, g C V is defined by the pseudoidentities of the form (4) viewed over the corresponding most general graph supporting both sides. But, for the variables x,yk the equivalence relation defining r identifies x with 2-l and y;', and also identifies Y k with x-'. Since this holds for every k E (1, .. . ,n } , the whole graph I' has only one vertex. Hence the pseudovariety CV is local. The remainder of the result follows from Theorem 2.23 by noting that no restriction needs to be imposed on z since it only appears in (4) as an w-power . 0 Note that C I is not local. It can be easily shown that its global is defined by a single pseudoidentity on a 2-vertex graph:
Using this observation, it is also easy to construct a 2-vertex semigroupoid which fails the above pseudoidentity but satisfies all 1-vertex pseudoidentities valid in gCI, i.e., whose local semigroups lie in C I .
12
Proposition 2.26 ([4012). Let V be a pseudovariety containing Sl. T h e n EV is local. Proof. Let {ui = v, : i E I} be a basis of pseudoidentities for V. Then EV is defined by the pseudoidentities of the form
with i E I. Since B2 E ESl E EV, it suffices to verify that the most general graphs over which such pseudoidentities may be written have only one vertex. Now, the (symmetrized) relation generating the equivalence relation -W which defines the most general graph I'w over which w = xy . . .x: may be read is determined by the following connected graph 21
Xn
Hence the graph I'whas only one vertex. Since V contains S1, ui and vi must involve precisely the same variables. It follows that the most general graph in which the two sides of (5) may be read has only one vertex. By Theorem 2.24, this shows that g E V is defined by pseudoidentities over 1-vertex graphs, that is EV is local. Again, in contrast, &I is not local and in fact
Stiffler [42] has shown that, if V is monoidal, then D * V E V * D . This allows us to obtain some curious inclusions relating the operators L and -* G under suitable hypothesis.
Proposition 2.27. Let V be a monoidal pseudovariety of semigroups such that V * G is local. Then, we have the following inclusions: a) L(V * G) * G = L(V * G). b) (LV) * G g L(V * G). Proof. (a) By [3, Exercise 10.2.4(a)], V * G is a monoidal pseudovariety. By Theorem 2.1, it follows that L(V * G) = V * G * D. Taking into account Stiffler's result, this leads to the following inclusions:
L(V * G) * G = V * G * D * G C V * G * G * D = V * G * D = L(V* G) 21n [40] one also finds the statement If V * G = EV and V i s a monoidal pseudovariety of bands, then V * G is local (Proposition 10.1). However, in view of Theorems 2.7 and 2.12, the hypothesis V * G = EV is only stating that V is nontrivial and therefore the result adds no new cases of locality to Proposition 2.26.
13
and the reverse inclusion is obvious. (b) Indeed, we have CV C(V * G),and so
(LV)* G C L(V * G)* G = L(V * G ) . Note that the 6-element semigroup with zero given by the following presentation s = (e,f ;e2 = e, f 2 = f, fef = 0) lies in C&S1= L(Sl* G)but not in &LSl 2 (LSl) * G.Hence, at least for V = S1, the inclusion of Proposition 2.27(b) is strict.
Some solutions of the equation V * G = €V For the remainder of the paper, we concentrate on the equation V * G = 3
&V.We have already mentioned some important solutions and also that the equation does not hold in full generality. A syntactical approach based on Theorem 2.23 allows us t o find similarly flavoured proofs that the equation holds for many pseudovarieties contained in IDS. We only sketch here some of those proofs. The details may be found in [6]. 3.1
Locally trivial solutions
Consider the following pseudovarieties:
Kk = 1x1 - - . z n e= z1 -..x,f] LI; = i.1. . . znef = ~ 1. .znf] . LI: = [ e f q - - - z= nezl...X J I u = 82. = 0 , z y = yz]. Note that &I= [e = f] = (N n Corn) V G where the last equality is easily established and may be found in [3, Section 9.11. It is also an easy exercise in the methods of [3] to show that KL = K, V &I= K, V N V G. The nilpotency index of a nilpotent semigroup S is the least positive integer n such that S satisfies the identity 2 1 . ..xn = 0. From results of Almeida and Reilly [7] it is easy to deduce that U is the smallest pseudovariety of nilpotent semigroups with no bound on the nilpotency index of its members. Using Theorem 2.23, one may show that U * G = EU = &I.Also using the same theorem, one observes that [ e~z l . . . z , = z l . . . z , e = z l . . . [zl---zn=O]*G
zn3
(6)
(where the inclusion is actually an equality) and so the semidirect product with G of a nilpotent pseudovariety V with a bound on the nilpotency index
14
of its members is not EV (which is equal to &I) since N is not contained in the right side of (6). This proves the first part of the following result. The other parts can be established similarly.
Theorem 3.1. Let V C_ ELI. Then V * G = EV if and only i f one of the following conditions holds:
a) U C V C_ €1 = Kb in which case EV
b) V
= €1;
E EK, and V p KL for every n in which case EV = EK;
c ) left-right dual of (ii) in which case EV = ED; d) V is not contained in any of the pseudovarieties LIL and LI:, in which case EV = ELI.
3.2
Some solutions in 'DS
It is well known that DS is the largest pseudovariety that excludes the semigroup Bz.It has been the object of much attention not only since it and some of its subpseudovarieties such M J, R and DA are found in various applications in theoretical computer science, but also because they are particularly amenable to syntactical methods, that is the investigation of their properties through the study of relatively free profinite semigroups and pseudoidentities. This is the case also for our equation. The following theorem summarizes the known results in this respect. Theorem 3.2. Every pseudovariety in the following four intervals (indicated by a bold line) together with 'DS is a solution of the equation V * G = EV: IDS
For the remainder of the paper, we sketch the proof of this result for the two extreme pseudovarieties in the above diagram, namely D S and J.
15
3.3
The case of DDS
Let S be a finite semigroup. We say that v E S is good if, for every x E S, ((VZ)"V)"+l
= (V2)"V
that is (vx)"v is a group element. Observe that an idempotent is always good.
Lemma 3.3. Let S E EDS. Then a)
if a , b E S are good then so is ab;
ii) if v E S is good and aba = a or bab = b, then avb is good.
Proof. (i) Let x E S and let e = (bza)". Then, by associativity, we have the following equalities: ( a b . x)"ab = aeb, ( a . bx)"a = ae, (b . xa)"b = eb. Since (bxa)"-lbz . aeb . ax(bza)"-' = e , all these elements lie in the same 9-class of S. Here is a sketch of their distribution in the corresponding egg box picture:
From the hypothesis that both a and b are good it follows that ae and eb are group elements and therefore their %-classes contain idempotents. Since S E EBS, the %-class of aeb must also contain an idempotent and, therefore, ab is good. (ii) Let x E S. We consider here the case aba = a , the other case being handled similarly. We show that (avbx)"avb is a group element by a sequence of equalities in which in each step we either simply use associativity or we underline the factors about which some property is being used to lead to the next step. At one of the steps we use the fact that, by (i), since bav is a product of an idempotent and a good element, it is good.
16
(gvbx)"gvb = (abavbz)"abawb
= = = = =
b
a . (&bza)"&.
a( (bavbxa) "bav)"+I
b a (bavbxa)"bav ((bavbza)w bav)" b (-ababz)" abav( b(awbzab)"av)"b (avbz)" av (b(gvbz&)"gw) "b = (avbz)"avb((avbz)wuvb)" = ((avbz)"avb)"f'. 17
We may now easily complete the proof of the equality D S * G = EDS. Note that D S is defined by the pseudoidentity ((zy>"z)"+l = (XY)" 2. Since D S is local, by Theorem 2.23 we obtain the following basis of pseudoidentities for D S * G:
9s * G = ~ [ ( ( ~ ~ ) " u =y +(uv)wu l
:
G
I=
= = 11.
Given S E EDS, a homomorphism cp : AS -+ S, and a pseudoidentity u = 1 valid in G , we have cp(u) E K ( S ) = D ( S ) (cf. Theorem 2.22) and so cp(u) is good by induction on the construction of elements of D ( S ) using Lemma 3.3 for the induction steps. This shows that S verifies all the pseudoidentities in the above basis (even those without any requirement on v). Hence EDS C D S * G and the reverse inclusion is valid for any pseudovariety in the place of D S .
3.4
The case of J
In the case of the pseudovariety J , the fact that it is not local makes things somewhat harder which lets us show how the (w- 1)-words of height at most 1 come handy. Let S be a finite semigroup and let s E S. Say that s is w-central if (st)" = (ts)" for all t E S. The following is the analog of Lemma 3.3. The proof is similar and is omitted.
Lemma 3.4. For S E EJ, the set of all w-central elements contains E ( S ) and is closed under multiplication and weak conjugation. Part (i) of the following proposition improves a result of Margolis and Pin [31] which characterizes EJ as the pseudovariety defined by the pseu= (xe)" in the sense that the proposition shows that EJ doidentity satisfies apparently much stronger pseudoidentities.
17
Proposition 3.5. Let w E AS be such that G the following pseudoidentities: 2)
w = 1. Then EJ satisfies
(wz)"= (zw)";
ii) (wz)"w = (wz)"; iii)
& J + 1=
w".
Proof. For (i), note that, in a given S E E J , w evaluates to an element of K ( S ) and therefore to an element of D ( S ) . Hence w is w-central by Lemma 3.4. Part (ii) is proved similarly, that is by establishing first a suitable 0 analog of Lemma 3.4. Part (iii) follows from (ii) by taking z = w. It is now a routine matter to prove the following corollary using Proposition 3.5 and associativity.
Corollary 3.6. Let u , v,w E GAS be such that G satisfies the pseudoidentities
+ uw = v = 1. Then EJ
(uvw)wuw = UW(UzIW)" = (uvw)w. Corollary 3.6 gives the first step in an induction procedure that gives, more generally, the following result for (w 1)-words of height at most 1. The proof is omitted.
-
Proposition 3.7. Let u,v be (w - l)-urords of height at most 1 such that u = 1 holds in G and u + v by rules of the form u(vu)"-~v + 1. Then EJ satisfies the pseudoidentities vu" = uw = uwv. With these tools at hand, one may now proceed to prove that J is a solution of our equation by establishing the following result.
Proposition 3.8. Let u1, u2, v1, v2 be (w - 1)-words of height at most 1 such that u1u2 = 211212 = u1v2 = 1 in groups. Then EJ satisfies the pseudoidentity ( ~ 1 ~ 2 ) w ~ l ~ 2 ( v=1 (~~ 2 1~ )2 w) w ( v 1 v 2 ) w .
We only sketch here the proof of Proposition 3.8. By Lemma 2.18, it is possible to reduce the (w - 1)-word ulv2 to 1 by a finite sequence of applications of rules of the form u(vu)W-1v
+ 1.
(7)
18
where u and v are possibly empty words. In such a reduction, part will be done entirely within factors descending from u1 or from v2 while some may require a factor descending from u1 and a factor descending from v2. The former are handled directly using Proposition 3.7. For the latter, some care has to be taken. By associativity, we may for instance assume that the word u is shortest possible and that the factor ( V U ) ~ - ~ Vcomes entirely from a descendant of v2 and that this factor cannot be erased from that descendant of 212 by application of rules of the form (7). The factor u in turn is a product u = yz where y comes from a descendant of u1 and z from a descendant of 0 2 . So there are descendants ui of u1 and vh of v2 under the reduction rules of the form (7) and factorizations ui = uyy and vh = , Z ( V U ) ~ - ~ V Vwith ~ , IyI smallest possible. Now there is a shortest suffix t of v2 from which 06 is obtained by application of rules of the form (7) and we let 02 = st. Because u was chosen to be shortest possible, and since v1svh = v l v 2 = 1 in G , it must be possible to reduce 01s to an (w - 1)-word of which y is a suffix. Moreover, such a reduction must come from the left factor v1,that is there is a factorization vl = xjjs' such that 5 reduces to y and s's reduces to 1. This shows that the reduction ~ Z ( V U ) ~ - ' V+ 1 that has to be performed in a descendant of u 1 v 2 to reduce it to 1 may also be performed in a descendant of v102 and so again Proposition 3.7 allows us to do it using the right factor ( ~ 1 7 ~ 2 ) For ~ . the reader's benefit, we depict the various factorizations in the following picture where the appearance in the second line of a factor below another in the first line means it is a descendant of the latter. '111 '112 '111 .. . .-
...
--.
'111
212
1 '1;
73'111
v2
s
't
v1
xlgls'
4 -x Y -
v2
s
I
...
t
4 ... -
Y
The equality J * G = EJ follows now from Theorem 2.23 taking into account the basis of pseudoidentities for gJ given by Theorem 2.4.
References [l] D.Albert,
R. Baldinger, and J. Fthodes, The identity problem for finite semigroups (the undecidability of), J . Symbolic Logic 57 (1992) 179-192.
[2] J. Almeida, The algebra of implicit operations, Algebra Universalis 26 (1989)
16-32. [31
, Finite Semigroups and Universal Algebra, World Scientific, Singapore, 1995. English translation.
19
[41
, A syntactical proof of locality of D A , Int. J . Algebra and Computation 6 (1996) 165-177.
[5] J. Almeida, A, Azevedo, and L. Teixeira, O n finitely based pseudovarieties of the forms V*D and V*Dn, J . Pure and Appl. Algebra 146 (2000) 1-15. [6] J. Almeida and A. Escada, On the equation V*G=EV, J . Pure and Appl. Algebra. To appear. [7] J. Almeida and N. R. Reilly, Generalized varieties of commutative semigroups, Semigroup Forum 30 (1984) 77-98. [8] J. Almeida and B. Steinberg, O n the decidability of iterated semidirect products and applications to complexity, Proc. London Math. SOC. 80 (2000) 50-74. [9] J. Almeida and P. Weil, Reduced factorizations in free profinite groups and j o i n decompositions of pseudovarieties, Int. J. Algebra and Computation 4 (1994) 375-403.
POI
, Relatively free profinite monoids: a n introduction and examples, in Semigroups, Formal Languages and Groups, J. B. Fountain, ed., vol. 466, Dordrecht, 1995, Kluwer Academic Publ., 73-117.
[111
, Profinite categories and semidirect products, J . Pure and Appl. Algebra 123 (1998) 1-50.
[12] C. J. Ash, Finite semigroups with commuting idempotents, J . Austral. Math. SOC.,Ser. A 43 (1987) 81-90.
~ 3 1 , Inevitable graphs: a proof of the type 11 conjecture and some related decision procedures, Int. J. Algebra and Computation 1 (1991) 127-146. [14] K. Auinger e B. Steinberg, T h e geometry of profinite graphs with applications to free groups and finite monoids, Tech. Rep. CMUP 2001-06, 2001. [15] K. Auinger and B. Steinberg, O n the extension problem f o r partial permutations, Tech. Rep. CMUP 2001-08, 2001. [16] J.-C. Birget, S. Margolis, and J. Rhodes, Semigroups whose idempotents f o r m a subsemigroup, Bull. Austral. Math. SOC.41 (1990) 161-184. [17] J. A. Brzozowski and I. Simon, Characterizations of locally testable events, Discrete Math. 4 (1973) 243-271. [18] S. Eilenberg, Automata, Languages and Machines, vol. B , Academic Press, New York, 1976. [19] A. Escada, T h e power exponent of a pseudovariety, Semigroup Forum. To appear.
POI
, Contributions f o r the study of power operators over pseudovarieties of semigroups, Ph.D. thesis, Univ. Porto, 1999. In Portuguese.
[21] D. Gildenhuys and L. Ribes, Profinite groups and Boolean graphs, J . Pure and Appl. Algebra 12 (1978) 21-47.
20
[22] K. Henckell, S. Margolis, J.-E. Pin, and J. Rhodes, Ash’s type 11 theorem, profinite topology and Malcev products. Part I, Int. J. Algebra and Computation 1 (1991) 411-436. [23] K. Henckell and J. Rhodes, T h e theorem of Knast, the PG=BG and Type I1 Conjectures, in Monoids and Semigroups with Applications, J . Rhodes, ed., Singapore, 1991, World Scientific, 453-463. [24] P. M. Higgins, Pseudovarieties generated by transformation semigroups, in Semigroups with Applications, including Semigroup Rings, S. Kublanovsky, A. Mikhalev, J. Ponizovskii, and P. Higgins, eds., St Petersburg, 1999, TPO “Severny Ochag”, 85-94. [25] P. M. Higgins and S. W. Margolis, Finite aperiodic semigroups with commuting idempotents and generalizations, Israel J. Math. 116 (2000) 367-380. [26] P. R. Jones, Profinite categories, implicit operations and pseudovarieties of categories, J. Pure and Appl. Algebra 109 (1996) 61-95. [27] P. R. Jones and M. B. Szendrei, Local varieties of completely regular monoids, J. Algebra 150 (1992) 1-27. [28] P. R. Jones and P. G. Trotter, Locality of D S and associated varieties, J. Pure and Appl. Algebra 104 (1995) 275-301. [29] J. Karnofsky and 3. Rhodes, Decidability of complexity one-half for finite semigroups, Semigroup Forum 24 (1982) 55-66. [30] R. Knast, S o m e theorems o n graph congruences, RAIRO Inf. ThBor. et Appl. 17 (1983) 331-342. [31] S. W. Margolis and J.-E. Pin, Varieties of finite monoids and topology for the free monoid, in Proc. 1984 Marquette Semigroup Conference, Milwaukee, 1984, Marquette University, 113-129. [32] J.-E. Pin, Varieties of Formal Languages, Plenum, London, 1986. English translation. [331
, BG=PG: A success story, in Semigroups, Formal Languages and Groups, J. Fountain, ed., vol. 466, Dordrecht, 1995, Kluwer, 33-47.
[34] N. R. Reilly, F’ree combinatorial strict inverse semigroups, J. London Math. SOC.39 (1989) 102-120. [35] J. Reiterman, T h e Birkhofl theorem for finite algebras, Algebra Universalis 14 (1982) 1-10. [36] J. Rhodes, Kernel systems - a global study of homomorphisms o n finite semigroups, J. Algebra 49 (1977) 1-45. [371
, Undecidability, automata and pseudovarieties of finite semigroups, Int. J. Algebra and Computation 9 (1999) 455-473.
[38] B. Steinberg, Inevitable graphs and profinite topologies: some solutions to algorithmic problems in monoid and automata theory, stemming f r o m group theory, Int. J. Algebra and Computation 11 (2001) 25-71.
21
WI ~401
, A note on the equation PH
= J*H,Semigroup Forum. To appear.
, Semidirect products of categories and applications, J. Pure and Appl. Algebra 142 (1999) 153-182.
[41] B. Steinberg and B. Tilson, Categories as algebras 11,Tech. Rep. CMUP 20004, 2000. [42] P. Stiffler, Extension of the fundamental theorem of finite semagroups, Advances in Math. 11 (1973) 159-209. [43] H. Straubing, Finite semigroup varieties of the form V * D,J. Pure and Appl. Algebra 36 (1985) 53-94. [44] M. B. Szendrei, T h e bifree regular E-solid semigroups, Semigroup Forum 52 (1996) 61-82. [45] D. Therien and A. Weiss, Graph congruences and wreath products, J. Pure and Appl. Algebra 36 (1985) 205-215. [46] B. Tilson, Categories as algebra: a n essential ingredient in the theory of monoids, J. Pure and Appl. Algebra 48 (1987) 83-198. [47] S. Zhang, An infinite order operator on the lattice of varieties of completely regular semigroups, Algebra Universalis 35 (1996) 485-505.
22
On the Sentence Valuations in a Semiring Adrian ATANASIU*
Carlos MARTiN- VIDE**
Victor MITRANA*
*University of Bucharest, Faculty of Mathematics Str. Academiei 14, 70109, Bucharest, Romania e-mail:
[email protected] e-mail:
[email protected] **Research Group in Mathematical Linguistics and Language Engineering Rovira i Virgili University Pqa. Imperial TBrraco 1, 43005 Tarragona, Spain e-mail:
[email protected] Abstract. This paper proposes an algebraic way of sentence valuations in a semiring. Actually, throughout the paper only valuations in the ring of integers with usual addition and multiplication are considered. These valuations take into consideration both words and their positions within the sentences. Two synonymy relations, with respect to a given valuation, are introduced. All sentences that are synonymous form a synonymy class which is actually a formal language. Some basic problems regarding the synonymy classes are formulated in the general setting but the results presented concern only very special valuations.
1
Introduction
A series of paper, see, e.g., [l],[a], [8], [9], and the references thereof, have dealt with homomorphisms h from a free generated monoid M into the monoid ((0, m),., l), so that the sum of all homomorphical images of generators of M equals 1, called 'Supported by the Direccih General de Enseiianza Superior e Investigacibn Cientifica, SB 97-00110508
23 Bernoulli homomorphisms (distributions, measures). Besides being homomorphisms, Bernoulli homomorphisms may be viewed as probability measures on the family of all languages over a given alphabet. Furthermore, they played an important role in developing the theory of codes [l].Some authors discarded the homomorphism property keeping the probability measure property as done in [8], [9] whilst others proceeded vice versa [6], calling them valuations. These valuations were used in the study of unambiguity representations of languages. Along these lines, in [7] the equation system zI = pi which identifies a context-free grammar, is transformed, via a valuation h, into a numerical equation system h(zi)= h(pi). Solving the former system one gets the context-free language generated by the given grammar while solving the later system one gets exactly the valuation of L , h ( L ) , defined as the sum of all h ( w ) with w E L. Close relations between the unambiguity of the given context-free grammar and the value h(L) are discussed. Moreover, new characterizations of unambiguity in regular expressions based on the same concept of language valuation are proposed. This way of assigning values to a sentence remembers also some devices introduced in the area of regulated rewriting: weighted grammars and automata, see, e.g., [4,5 , 12, 10, 131, where a given number in a group is associated with each computation step (derivation or configuration). A computation is valid iff the total value assigned t o that computation, computed in accordance with the operation of the group considered, is the neutral element of that group. A consistent extension to the basic paradigm of constraint satisfaction in parsing might make use of the penalty factors assigned to syntactic, semantic, and mapping constraints. Penalty factors, which may range from zero to one, are combined multiplicatively leading to confidence scores which indicate a sort of level of constraints violation. This extension can be used to model distance effects if one takes into consideration the local distance between two consecutive constraint violations. This suggests t o compute the confidence score depending also on the position of the constraint. In this paper, we introduce a generalization of the aforementioned valuations in the following sense. The value of a sentence depends not only on its words but also on their positions within the sentence. Furthermore, the valuation is computed in a richer structure that of a semiring instead of a monoid. Moreover, we consider valuations that allow a finite set of values for each sentence. More precisely, each word in a given vocabulary has a finite set of values (attributes) and each position (a natural number) has just one attribute. For a given sentence, the value associated to a position occupied by a certain word a is obtained by considering two attributes: one is that of position itself the other being one among the attributes associated to the word a. Thus, we need an operation for computing the value of every position in the sentence and one operation for computing the value of the whole sentence.
24
The latter should be, in our opinion, an additive type one. What structure might be the most relevant one for our purposes? We have chosen a very common and widely investigated structure in semantics, that of a semiring. More precisely, all the results we present concern a particular semiring, namely the (semi)ring of integers. Based on this valuations there are defined two types of synonymy relations. Two sentences are weakly synonymous, with respect t o a given valuation, if they have a common value computed in accordance with the given valuation; they are strongly synonymous if they lead t o a common value in between any contexts. Informally speaking, two sentences are weakly synonymous if they have a common meaning. However, if one adds the same contexts t o two weakly synonymous sentences, one may get two new sentences that have no common meaning (the new sentences are not weakly synonymous anymore). This undesired feature is avoided by the definition of the strong synonymy relation. We investigate the decidability of the finiteness problem of synonymy classes as well as the possibility of algorithmically deciding whether two given sentences are strongly synonymous (as we shall see, it is always decidable whether or not they are weakly synonymous). In our approach we consider two very special types of valuations depending on their position attributing function, that is the polynomial (constant and linear) and (restricted) exponential functions, respectively.
2
Definitions and examples
A vocabulary is a finite nonempty set whose elements are called words; if V = {al, a 2 , . . . , a,} is a vocabulary, then any sequence w = ailai, . . . a i k ,1 5 ij 5 TI, 1 5 j 5 k , is called sentence over V . The length of the aforementioned sentence w is denoted by l g ( w ) and equals k . The empty sentence is denoted by E , lg(&)= 0. As a rule, the words are denoted by small letters from the beginning of the Latin alphabet and the sentences are denoted by small letters from the end of the same alphabet, ~ the sentence obtained from excepting the empty sentence. Moreover, ( x ) delivers x by removing all words not in U. The set of all sentences over V is denoted by v* and V+ = V*- { E } . Any subset of V * is called language. A structure ( A ,+, -,0 , l ) is called a semiring iff the following conditions are satisfied for all a, b, c E A: (i) ( A ,
+,0) is a commutative monoid,
(ii) ( A ,-,1) is a monoid, (iii) a . ( b
+ c ) = a . b + a . c, ( a + b) . c = a . c + b . C,
(iv) O . a = a . O = O .
25
The semiring ( A ,+, -,0 , l ) is said to be commutative iff ( A ,., 1) is a commutative monoid. For further notions we refer to [ll].If B and C are two subsets of A and q E A, one defines
B .Q = {PqlP E B } ,Q . B = {qpb E B ) , B + C = { p + r l p E B,r E C}, B - C = {prlpE B , r E C } . Let V be a vocabulary and ( A ,+, .,0 , l ) be a commutative semiring. A valuation of V * in A is a pair of mappings
4 = (.,P), where : V --+ 2f, (the word valuating function; a ( a ) is the set of all values (attributes) of a ) ,
0
a
0
p : IN +A , (the position valuating function; p(n) is the value (attribute) of the position n ) .
Here 2; denotes the set of all finite subsets of A . Given a valuation a sentence x = a 1 a 2 . . a, E V * ,ai E V, 1 5 i 5 n, we define
4 as above and
n
val+(x)= Ccr(ai). p ( i ) . i= 1
Moreover, v a l + ( ~delivers ) always 0, for any valuation q5. A valuation as above is deterministic iff card(cr(a)) = 1 for all a E V . By our intuition, the beginning and the end of a sentence offer more information then its middle part. Even so, one may argue that beginning is still predominant, but for our further results this makes no difference since for any string x as above we can consider n vaZ4(x) =
C a(ui) .p(n - i + 1).
i=l
Two sentences x,y are weakly synonymous with respect to the valuation 4, written as x "+ y , iff val+(x)n v a l @ ( y )# 8. One may easily notice that this relation is reflexive and symmetrical but not transitive. The weak s y n o n y m y class of x is defined as [XI4 = {Y E V*lZ -4 9). Two sentences x,y are strongly synonymous with respect to the valuation 4, written as x "+ y , iff val+(uxv)f l val+(uyv) # 8, for any pair of sentences u, E V * . Again,
26
this relation is reflexive and symmetrical but not transitive. The strong synonymy class of x is defined as b$4 = {Y E V * b “4 Y}. Note that the strong synonymy always implies the weak synonymy but the converse does not hold.
Example 1. Let us consider the semiring +[XI of all polynomials with only one indeterminate and coeficients in Z together with addition and multiplication. W e consider the valuation of {a, b, c, d}* in Z [ X ] ,4 = (a,p), defined as follows:
a(.) = 2X2 a ( b ) = X 2 - 1 a(.) = 1 a ( d ) = 2X2 - 1 P(i) = X
+
2,
i E IN.
It is easy t o note that
valb(dacb) = val+(aba) = 5 X 3 + 10X2- X which implies dacb
wB
Example 2. Take V with
-
2
aba.
=
{a, b, c} and the valuation of V* in (Q, +, ., 0 , l )
4 = ( a ,p)
a ( a ) = {1/2}, a(b)= {1/3}, a ( c ) =z {-1/6} P(n) = 5 , for all n 2 1. The reader may easily verify that [&I4= {xi3ig((x)a)
+ 21g((x)b)= l g ( ( x ) c ) } .
Note that [&I4 is a context-free non-regular language. Moreover, both valuations are deterministic. We proceed t o investigate mainly the synonymy classes. A natural problem concerns the finiteness of these sets as well as the possibility t o decide on this problem. As we shall see in the sequel, a closely related problem concerns the decidability status of the next problem: For a given value q, are there sentences whose valuation set contains q? Furthermore, we are concerning with the problem of finding appropriate devices (automata, grammars, etc.) which characterize the synonymy classes. Since the above definitions were given in a very general setting, we should restrict our investigation to particular valuations. To this end, in this paper we shall only consider the valuations in the ring of integers Z with the usual addition and multiplication. Of course, other semirings may be considered as well,
27
but we have chosen this semiring because it is the most natural and simple one. Even so, the problems we considered appeared t o be very difficult. A similar investigation for other semirings remain to be done. The absolute value of an integer x is denoted by 1x1. In the sequel, we shall foccus our attention on valuations whose function p is either the constant polynomial, the linear polynomial, or the exponential function
an. We start with a lemma which will be useful in the sequel. Let x = ( 3 1 ~ 2 . .. a, be a sentence in V* and 4 = ( a ,p) be a valuation of V' in Z with
p(n)= Conk + C 1 n k P 1 + . . .
+
Ck.
Denote by
4i = (a,nZ),o 5 i 5 k . Clearly,
by a direct calculation one gets the desired equality.
3
0
The constant polynomial
As [x]4= V ' , for all x E V*, providing that p is the null function, we shall consider only non-zero position valuating functions in the rest of this section. Note that the relations stated by Lemma 1 and relation (l),respectively, may be combined in
+
valg(xy) = val@(x) val4(y).
(2)
The next result is an immediate consequence of relation 2.
Proposition 1. Let 4 = ( a ,p) be a valuation of V * in Z.Then, x
~4
y iff x "4 y .
Theorem 1. Let q5 = ( a ,p) be a valuation of V' in 72. Then, the following problems are decidable: I. Given q E Z,are there sentences x E V+ such that q E va14(x)? 2. Given q E Z,are there arbitrarily many sentences x such that q E va14(x)?
28
Proof. Assume that P(n) = k , n E W, for some integer k . Moreover we take a positive k , the case k < 0 may be treated similarly. 1. Obviously, there is no sentence whose valuation contains q if q is not a multiple of k . We distinguish two cases depending on the values of the words in V . Firstly, let us suppose that all values of the words in V are nonnegative; the reasoning is the same when all of them are negative. It follows that val4(z) contains only nonnegative integers, for each x E V+, hence q has to be nonnegative. Clearly, if q = 0, the answer is affirmative if and only if there is a word in V that has a null value. For q > 0 it suffices to restrict our search to sentences of length a t most q / k . Consequently, one can algorithmically find the answer in this case. Now, let us consider that the set C = { p # Olp E a ( u ) , a E V } contains both negative and positive integers. We claim that exists z E V+ such that q E val+(z)if and only if q is a multiply of k d , where d is the greatest common divisor of all integers in C. Obviously, if q E valg(z) for some x E V+,then q must be a multiple of k d . We prove now th a t for each multiple of kd there exists a sentence that contains it in its valuation set. Let q l , q l , . . . qn be all positive integers in C and p l , p 2 , . . . p,, be all negative integers in C. I t is known that
i=l
i=l
for some integers ki,rj, 1 5 i 5 n, 1 5 j 5 m. Moreover, one can choose either ki,rj 2 0 or kj, r j 5 0, for all 1 5 i 5 n, 1 5 j 5 m. The last claim requires a short discussion: we thought that we would find a reference for it but we were not able to find such a reference. In order to keep the proof easy to follow, we prefered to prove it in an appendix at the end of the paper. Let q be an arbitrary multiple of kd; consider the sentence x given by the next algorithm:
Algorithm 1. Procedure Findsynonymy-class-representative(q); begin 2
:= E ;
if q > 0 then choose k i , r j 2 0 in (3) else choose k i , r j 5 0 in (3); endif; for i:= 1 t o n do choose a E V with qj E a(a); 2
:= x , e ~ i / ( ~ 4 ;
29 endfor; for i:= 1 to m do
choose a E V with pi E a(a); x := xaVil(k4; endfor; if p=O then choose a, b E V such that 91 E a(a),pl E a @ ) ; 2
:= a l P l l p ;
endif; end.
It is easy to notice that q E valb(z) which concludes the reasoning of the first assertion. 2. The latter item follows from the first one as follows. Find, if any, a sentence z such that q E valb(z). If the sentence z exists, detect a sentence y whose valuation contains the value 0. When no sentence y exists, only a finite number of sentences might have q in their valuation sets. Indeed, if there is no sentence y E V + with 0 E val+(y), then all values of the words in V are either positive or negative. By the first part of this proof it follows that only a finite number of sentences might have q in their valuation set. If such a sentence exists, then by equation 2 all sentences 0 zym,rn 2 0, are in [x]b,which ends the proof. From the previous theorems one can infer the next result.
Theorem 2. Let qj = (a,p) be a valuation of V * in Z.The following problems are decidable: 1. Are two given sentences weakly/strongly synonymous? 2. Given a sentence z E V * , are [XI+ and (x)4finite? We recall now an operation on sentences that will turn out to be very useful for our investigation regarding the type of languages [z]b. This operation, called shufle is a well-known operation in formal language theory and in parallel programming theory. We define this operation on sentences, recursively, as follows: for two strings z, y E V * and two symbols a, b E V we write (2)
(ii)
zLU&=&IUz=2, uz u bg = a(z LU by) u b(az Lu y).
A shuffle of two strings is an arbitrary interleaving of the substrings of the original strings We naturally extend this operations to languages: L1LULz=
u
zuy.
ZEh,YELZ
The next theorem settles the position of synonymy classes with respect to valuations whose position valuating function is a constant in the Chomsky hierarchy.
30
Theorem 3. Let q5 = ( a ,p ) be a valuation of V* in Z and x be a sentence in V * . 1. The language [x]4is context-free. 2. If q5 is deterministic, one can decide whether or not [x]g is regular.
Proof. 1. The reader may construct a nondeterministic pushdown automaton that recognizes all sentences in [ x ] g . We prefer another proof, namely we use a slightly modified version of additive valence grammar. A right-linear additive valence grammar, see [3], is a construct G = ( N ,T, P, v), where N , T , P are the parameters of a right-linear grammar and v : P -+Z is the valence function. The valence associated t o a derivation
s,
D : wo
~1
s,
ar2 ~2 3 . . . arm wm
such t h a t at each step wi-1 ari wi, 1 5 z 5 m, the applied rule is ri is
i= 1
The language generated by G with the valence q E Z is the set
L(G,q) = {x E T*l there exists a derivation S =+*
x with v(S J*
x) = q ) .
It is known [3] t h a t all languages generated by right-linear additive valence grammars are context-free. Now, given a valuation q5 = (a,P) of V * in Z,with a constant function p, we construct a right-linear additive valence grammar G+= ( N , V,S, P, v) as follows:
v,Q E 44).
0
N = { S ) u { ( a ,q)la E
0
For each nonterminal (a,q) E N we have the following rules in P:
S (a,q)
(a,q)
-+ -+ +
(a,q), with the valence q, aS, with the valence 0, E , with the valence 0.
Obviously, the equality [XI4 =
u
L(GbtIP(1))
tEWQlg( 5 )
holds. Since the family of context-free languages is closed under union the first assertion is completely proved.
31
2. Let q5 = ( a ,p ) be a deterministic valuation of V * in
Z.Denote by
v = { a E Vla(a)= 0} and U = V \ v. It is easy to note that [TI4= [(z)v]g v*.
If card(U) 5 1, then [z]4 is regular for any z E V * . Indeed, if card(U) = 0, then [z]4= V * .If card(U) = 1, then all classes [ ( z ) ~ ]are + finite, hence all languages [z]4 are regular. Let us suppose that card(U) 2 2. If a(a). a ( b ) > 0 for all pairs (a, b) E V x V , then [(z)v]4 is a finite set for all z E V* (see the proof of Theorem l),therefore [z]4 is regular. If there exist a, b E V such that a ( a ) - a ( b ) < 0, then the language [ ( z ) U ] $ is context-free but not regular. This language is [(Z)UI4
= {z E U*l
c
l g ( ( z ) a ) . a ( a ) = Val&)).
QEU
As shuffling a context-free non-regular language with a regular language, the languages being over disjoint vocabularies, one gets a context-free non-regular language, it follows that [z]4is regular iff either card(U) 5 1 or a ( a ) . a ( b ) > 0 for all pairs 0 ( a , b) E V x V , both conditions being decidable. Remark. In the view of the last theorem, the decidability of the finiteness problem for synonymy classes follows directly from the finiteness problem for context-free languages. However, the proof of Theorem 1 offers a more easily testable condition and a less complex (time and memory) implementation.
4
The linear polynomial
In this section we shall consider only valuations whose position valuating functions are linear polynomials.
Theorem 4. Let q5 = ( a ,P ) be a valuation of V * over one can decide the finiteness of
[XI$.
+. Given a sentence
3:
E V*,
Proof. Let 4 = ( a ,p) be a valuation with @(n)= k n + p . By Lemma 1 and relation 1, one may write also
wUal&y)
= wal,$(z)
+ wal+(y) + k .l g ( z ) .val+,(y).
(4)
Assume that a ( a ) contains only positive integers, for all a E V ; the case when a ( a ) contains just negative integers may be treated analogously. We analyse what happens when k < 0; the reasoning may be carried over the case k > 0 with minor
32 changes. Clearly, there exists no E IN such that valb(z) has only negative values (vaZ+(z)< 0, for short), for all sentences z in V * longer than no. Let z be such a sentence. We claim that valb(y) < valb(z), for all y E V * such that Zg(y) 2 lg(z). maz{ Irl I r E valb(z)}. Due t o the length of y, one infers that walb(y)
5 maz{lrl I r
E valb(z)} . maz{val~(w)lwis a subsentence of y of length Zg(z)}
which is smaller than valg(z) because valb(w) is negative too, providing that w is a subsentence of y of length l g ( z ) . Consequently, [z]@is finite for all z E V * . Let us consider that exist a, b E V , possibly the same, such that &(a) . a(b) contains at least one negative integer. Take q1 E a ( a ) ,q2 E a ( b ) such that q1 -q2 < 0; the sentence y = alQzlblqllsatisfies the relation 0 E valb,(y). Moreover, we claim that 0 E valb, (,aR), for all z E V* with 0 E vaZgO(z).Indeed, if z = 2122.. . z,, zi E V , and ti E a(zi),1 5 i 5 m, so that CEl ti = 0, then (2m I) Ci=,(ti)E va$(zzR) holds. Note also that 0 E val+,(zzR),too. Now, as valg(zzR) = k . Val+, (2.") p . val@,(ZZR)
+
+
one gets 0 E valg(zzR). Due to the relation 4 one concludes that all sentences 0 z(zzR)q,q 2 0, with z as above, belong to [z]d. As far as the position of languages [ z ] in ~ the Chomsky hierarchy is concerned, we have:
Theorem 5 Let 4 = (a,p ) be a valuation of V* in Z.The language sensitive, for any z E V*.
[XI+
is context-
+
Proot Let us suppose that P(n) = k n p . We give the proof for k > 0 only; the proof may be carried over the case k < 0 with the appropriate changes. Take no the minimal natural number such that kn p > 0 for all n > no. For each z E V* one constructs the phrase-structure grammar G, which works accordingly with the next nondeterministic procedure:
+
1. The grammar generates a sentential form X a l a 2 . . . a,YZ, X, Y,Z being nonterminals, ui E V,1 5 i 5 n, and n > no. 2. If no > 0, choose q E valb(ala:!.. .ano) and transform the sentential form into either X b l b z . . .bnoano+l... a,Y(-l)lqlZ, iff q < 0, or X b l b 2 . . . bnoa,o+l.. .anYlqZ, iff q 2 0,
bl, b2,.
. . , bno being nonterminals.
33
3. While the current sentential form contains words in V and no trap nonterminal do 0
0
Assume that the suffix of the current sentential form is Y d Z , for some c E {-1, l}, and q 2 0. if cq E [rnin(valb(z)), rnas(val+(z))], then
- choose a word a, in between X and Y , - transform a, into a nonterminal b,, - choose T
T
E a(a,), and write either
lr(kz+p),
if
T
2 0, or ( - l ) l r l ( k z + P ) , if
< 0, before 2,
- remove iteratively all pairs of consecutive symbols
-1,1, or 1,-1, in
between Y and 2. 0
0
if cq < rnin(vaZb(z)),then look for a word a, in between X and Y such that a(@,)contains a positive integer; if no such position exists, then block the derivation by a trap nonterminal; otherwise
a, into a nonterminal b,, choose T E a(a,),r > 0, and write lr(kzz+P) before 2, remove iteratively all pairs of consecutive symbols -1,1, in between
- transform -
Y and 2. 0
0
if cq > rnaz(valb(s)),then look for a word a, in between X and Y such that a(a,) contains a negative integer; if no such position exists, then block the derivation by a trap nonterminal; otherwise - transform a, into a nonterminal b,,
- choose T E a(a,),r < 0, and write ( - l ) l r l ( k z + p ) before 2, - remove iteratively all pairs of consecutive symbols 1,-1, in between Y and Z . 4. If the current sentential form does not contain any trap nonterminal, check whether or not its suffix YcqZ satisfies the relation cq E vul+(z),q 2 0. In the affirmative, remove all symbols c and X , Y , 2,and rewrite all nonterminals b, into a,, 1 5 a 5 n, otherwise block the derivation by a trap nonterminal. Clearly, [Zlb = q G , )
u {Y E V*ls
Yl
k A Y ) 5 720).
34
Note that the working space [14] of each z E L(G,) is bounded as follows
W S ( z ,GZ)I m a x ( l g ( z ) + 3 + m ~l g,( z ) + 3
+ + 2 . m3(k . l g ( z ) + p ) ) , m2
where
ml = max{lsl : s E v a l g ( y ) , y E V * , l g ( y ) I n o } , m2 = max{lsl : s E valg(x)}, m3 = max{ls\ : s E a ( a ) , a E V}. It follows that L(G,) is context-sensitive, hence [x]g is context-sensitive, too. 0 Note that, by relation 1 and Lemma 1, if Zg(z) = l g ( y ) , then x -g y iff z z g y . We do not have any algorithm for deciding whether two sentences are strongly synonymous with respect t o valuations whose position functions are arbitrary polynomials. However, we present below an algorithm for a large class of valuations.
Theorem 6. If 4 = ( a , @is ) a valuation with p being a non-constant polynomial and there i s a sentence a with exactly one non-zero value in a ( a ) , theiz one can decide whether x =o g , for a n y sentences x,y.
+ +
Proof. Let 4 = ( a ,p ) be a valuation of V *over Z with @(n) = conm +clnm-l . .. h.Assume that x z g y ; it follows that 2 y as well as val+(xak)n d 4 ( y a k )# 0, for all k 2 0, a being the word in V with just one attribute in a(.) which is not zero. By Lemma 1 and relation 1 one gets N~
i=O
i=O
+ i=o c cz j=1C ( l g ( 5 )+ j)"-iiY(u). m
= valg(x)
k
(5)
Analogously, m
k
Consequently, l g ( x ) = lg(y) must hold, otherwise {s1-s2Is1 E v a l g ( x ) , s2 E valg(g)} would be infinite, a contradiction. Indeed, for a ( a ) is a non-null integer, if l g ( y ) > l g ( x ) , then the relations (5) and (6) may be written as:
35 for some integers q k , t k . One infers that t k E {s1 - s2lsl E val+(y), s2 E val+(x)} for all Ic 2 0, which is contradictory. Analogously, when lg(y) < lg(z). In conclusion, x ~6 y iff (x "6 y)&(lg(x) = lg(y)), conditions that may be algorithmically checked. 0
5
The exponential case an
The subject of investigation in this section is the class of valuations q5 = (a,,@ whose position valuating function P is a particular exponential function, namely p(n)= an, n 1 1,a E Z \ {0,1}. Clearly, val+(zy) = val+(z) a'g(")valb(y). (7)
+
Theorem 7.Let q5 = ( a ,P ) be a valuation of V * in P,P(i) = ail i E IN,a E +\{O}. One can decide whether there exist sentences x E V+ such that q E val+(x),for a given q E Z?
Proof. One can distinguish three cases: a = 1, la( 2 2 and a = -1. If a = 1, we are dealing with a valuations whose position valuating function is constant; this situation has been treated in the proof of Theorem 1. Let us analyse the case a = -1. Define d as being the greatest common divisor of all integers in the set { r - s l r E a ( b ) ,s E a ( c ) } ,where b and c are words (might be the same) in V . Given an integer q, there exists x such that q E va16(x) if and only if q = t( mod d ) , t E {o}Uub,,{lSl I S E a(b)lS < 0) or d - t E u b c v { S l S E a ( b ) ,.S > o}. The argument is similar to that used in the proof of Theorem 1; the reader may easily find out the slight modifications. Assume now that la1 12. It is easy to notice that exists a sentence z E V+ such that q E vald(z) if and only if there exists a polynomial P whose coefficients are in the set C = { p E a ( b ) ( b E V } such that P ( a ) = q/a. Suppose that z = blb2.. . b,, bi E V,1 5 i 5 m.For
+ ~ a ( b 2+) . . .a"-'a(b,))
VU~+(X)= a(a(b1)
and a # 0, it follows that the required polynomial P is an element in the set of polynomials a(b1) X a ( b 2 ) . . . Xrn-'a(bm).
+
+
Let p = max({ll( I 1 E C} U { q / a } ) . The following algorithm decides, for any given la) 2 2, whether there is a polynomial P with coefficients in C such that P - q/a has the zero a.
36 Algorithm 2. Procedure ExistxPolynornial(q5,q); begin := { p E a(b)lb E V } ; D1 := { - q l a } ; D := 0; repeat D := D U D1; R := 0; for each i E D do for each j E C do if i+j mod a=U then R := R U {i j dzv a } ; if 0 E R then “THE POLYNOMIAL EXISTS”; stop; else D1 := D1 U R; until D = D1;
c
+
“THE POLYNOMIAL DOES NOT EXIST”; end.
In order to finish the proof, we need a reasoning for the correctness of the above algorithm. Termination. We claim that at each step when a number i + j div a, i E D and j E C, is added to R, this number is between -p and p. Indeed, initially the assertion is valid. Assume that at an arbitrary moment, when entering the repeat ...until loop, all elements of D are bounded by -p and p, respectively. For la1 2 2, every multiple of a of the form i j , i E D , j E C, is in the interval [-2p, 2p], hence i j div a is in [-p,p]. Consequently, either 0 E R, during the loop or D = D1 after this loop has been performed at most 2 p times. Correctness. Assume that the algorithm provides 0 in R at some step. This implies the existence of some k 2 1 such that
+
or equivalently q / a E a(bil)ak-’
+
+ a(biz)ak-2+ . . . + a(bi,).
It follows that q E ual#(bikbik--l . . . bi,). Obviously, if the algorithm ends with D = 0 D1, then there is no sentence y such that q E vald(y).
It is worth mentioning here that the problem of deciding upon the strong synonymy between two given sentences can be algorithmically solved for the same class of valuation as that stated in Theorem 6. Theorem 8. If I$ = ( a l p ) is a valuation with p being a n exponential function p(n) = an, whose base a is any integer distinct of 0 and -1, and there is a word
37 b with exactly one value in a(b), then one can decide whether x z $ y, for any sentences x, y.
The proof is an immediate consequence of relation 7 being left to the reader as an exercise.
6
Final remarks
We briefly discuss here some considerations that seem to be in order. In the present paper we have considered the semiring of integers with the addition and multiplication. It appears to be interest to replace it by other semirings (or other structures) having linguistical relevance. Our approach tries to valuate all sentences over a vocabulary. A more natural approach might be the valuation of just those sentences which belong to a given language. An attractive class of languages seems to be the context-free one. As one can easily notice, there are plenty of natural questions without answer; all of them remain to be further investigated. We provide below a list of a very few of them which appear to be more attractive from our point of view. 1. Is it decidable whether or not two given sentences are strongly synonymous with respect to valuations whose position function is an arbitrary polynomial or exponential function? 2. Can we decide the finiteness of strong synonymy classes in the arbitrary polynomial case? What about the same problem for both classes in the exponential case? In our opinion, a natural direction of further work may consider this formalism as an algebrac backbone upon which other formalisms of semantical structure can be grafted.
References [l] J. Berstel and D. Perrin, Theory of Codes, Pure and Applied Mathematics, Academic Press, 1985. [2] J. Berstel and C. Reutenauer, Rational Series and Their Languages, EATCS Monographs on Theoretical Computer Science, vol. 12, Springer, Berlin, 1988. 131 J. Dassow, Gh. P lu n , Regulated Rewriting in Formal Language Theory, SpringerVerlag, 1989. [4] J. Dassow, V. Mitrana, Finite automata over free generated groups, Intern. Journal of Algebra and Computation, 10, 6(2000), 725-737.
38
[5] S. A. Greibach, Remarks on blind and partially blind one-way multicounter machines, Theoret. Comp. Sci. 7(1978), 311-324. [6] H. Fernau, Valuation of languages, with applications t o fractal geometry, Theoret. Comput. Sci. 137 (1995) 177-217. [7] H. Fernau, L. Staiger, Valuations and unambiguity of languages, with applications t o fractal geometry, ICALP'94, LNCS 820, Springer, 11-22. [8] G. Hansel and D. Perrin, Codes and Bernoulli partitions, Math. Systems Theory 16 (1983) 133-157. [9] G. Hansel and D. Perrin, Rational probability measures, Theoret. Comput. Sci. 65 (1989) 171-188. [lo] 0. H. Ibarra, S. K. Sahni, C. E. Kim, Finite automata with multiplication, Theoret. Comp. Sci.,2(1976), 271-294. [ll]W. Kuich and A. Salomaa, Semirings, Automata, Languages, EATCS Monographs on Theoretical Computer Science, vol. 5, Springer-Verlag, 1986. [12] V. Mitrana, R. Stiebe, Extended finite automata over groups, Discrete AppJ. Math., 108, 3(2001), 247-260. [13] Gh. P b n , A new generative device:valence grammars, Rev. Roum. Math. Pures et Appl., 25, 6(1980), 911-924. [14] A. Salomaa, Formal Languages, Academic Press, 1973.
39
Appendix Let S be a finite set of positive integers; we denote by gcd(S) the greatest common divisor of all elements from S . We now proceed to prove the following claim which is equivalent with the fact used in Theorem 1:
Claim 1 Let S be a finite set of positive integers. For any partition S1, S2 of S , with both sets S1,S2 nonempty, the following two conditions are satisfied: 1. There exist coeficients c(t), t E S , such that
gcd(S) = C c ( t ) t tES
and c(t ) 2 0 for all t
E 5’1,
and c( t) 5 0 for all t E S2
2. There exist coeficients c( t) , t E S , such that gcd(S) =
C c(t)t tES
and c(t ) 5 0 for all t E S1, and c( t) 2 0 for all t E
5’2
Proof. We prove the claim by induction on the cardinality of S . Let us assume that S = { t l , t z } . It is known that gcd(S) = ctl f t z for some integers c, f. Clearly, c 2 0 and f 5 0 or c 5 0 and f 2 0. Without loss of generality we may assume that f 5 0. I f f = 0, then c = 1, gcd(S) = t l , and t2 = g t l for some g 2 1. Now we can write gcd(S) = (1 - 2g)tl 2t2 and we are done. Suppose now that f < 0; then ctl = gcd(S)- ft2. Since t 2 = g d , for some g 2 1, one gets ctl = gcd(S) - fg(ct1 f t 2 ) which implies gcd(S) = c(f g l ) t l f 2 g t 2 . For f g 1 5 0 and f 2 g 2 0 the basic step of induction is completely proved. We assume that the assertion holds for any set S of cardinality s, consider a set S’ of cardinality s 1, and a partition of S’ in two nonempty sets S i and SL. Obviously, at least one of these sets, say SL, contains at least two elements. Consider an arbitrary element of S;, say 1 and set S = SIUS~,where Sl = S i and S2 = S;\{l}. By the induction hypothesis,
+
+
+
+
+
+
+
gcd(S’) = gcd({gcd(S),l})= d C c ( t ) t
+ f‘l
tES
holds, where the coefficients c( t) can be chosen as stated above. Again, either c’ 2 0 and f’ 5 0 or c‘ 5 0 and f’ 2 0. We now choose c ( t ) 2 0, 0 for all t E S1, and c(t ) 5 0, for all t E Sz, which completes the proof.
40
JOIN DECOMPOSITIONS OF PSEUDOVARIETIES OF THE FORM DH n ECom KARL AUINGER Institut f i r Mathemat&, Universitat Wien, Strvdlhofgasse 4, A-I090 Wien, Austria E-mail:Karl.Auingerhnavie.ac.at A constructive proof of the equation DH n ECom = (J n ECom) V H is presented where H denotes any arborescent pseudovarietyof groups. In addition, a larger class of pseudovarieties of groups is found for which that equation holds.
1
Introduction
The purpose of this paper is to present a constructive proof of the equation
DH n ECom = (J n ECom) V H
(1)
where H is a “sufficiently nice” pseudovariety of groups. As usual, for a group pseudovariety H, DH denotes the pseudovariety of all (finite) semigroups all of whose regular ID-classes are groups in H, while ECom is the class of all (finite) semigroups with commuting idempotents and J stands for the pseudovariety of all J-trivial semigroups. (In this paper, all semigroups except free semigroups A+ and free monoids A* are assumed to be finite.) A syntactic proof of the equation (1) has been found by Almeida and Weil in the case H being arborescent which means that (H n Ab) * H = H (where Ab is the pseudovariety of all abelian groups and * denotes the Mal’cev product, or, equivalently, the semidirect product of the involved pseudovarieties). Moreover, also in one can find (in terms of a “unique factorisation property”) a condition characterizing the set of all pseudovarieties H satisfying equation (1). From that condition it follows that this set is closed under taking joins (within the lattice of all pseudovarieties). The arguments in are based on a careful study of the free pro-H groups and some knowledge of the free proDS semigroups. (Here DS denotes the pseudovariety of all semigroups all of whose regular ID-classes are subsemigroups). The proof thereby obtained is not constructive in the sense that for a given S E DH n ECom it would effectively construct a semigroup C E J n ECom and a group H E H such that S divides the direct product C x H . From the proof we only know that suitable C and H do exist.
41
In contrast, our approach will prove equation (1) for a larger class of pseudovarietes H which can be characterized by a certain condition (P) (see Definition 2.3) and the proof will be constructive. It is based on a discovery by Ash, Hall and Pin which provides a convenient set of generators of the pseudovariety DH n ECom. In it is shown that each S E DH n ECom divides a precisely described finite direct product of transition semigroups of automata of a very special kind (these automata will be introduced in section 3); conversely, all such transition semigroups are in DH n ECom. The main idea of our proof then will be, given an automaton d of that kind, to consider a certain quotient automaton A/- ( N essentially eliminates the non-trivial group sub-automata of d) and whose transition semigroup is aperiodic (that is, it is a member of J n ECom). Then we construct a suitable finite group H such that the transition semigroup M(d) of d divides the direct product of the transition semigroup M(d/N) of d/- and H . The group H is especially designed to outweigh the “loss in accuracy” which comes from going from d to d/N. The prerequisites to construct such a group are developed in section 2 (without giving full proofs). An expanded version of the paper, containing full proofs and several refinements will appear elsewhere. For undefined notions in the theories of semigroups, pseudovarieties, automata, etc., the reader is referred to the books of Almeida and Pin ’; for background information about varieties of groups the book of Neumann is a good reference. Throughout, for a word w E A* (on a finite alphabet A), c ( w ) stands for the content of w, that is, the set of all letters occurring in w while I wI denotes the length of the word w. For any finite set S, IS1 stands for the number of elements of S. For any A-generated (semi)group S = (A) and any word w E A+ we will write, if emphasis is necessary, w ( S ) to denote the evaluation of the word w in S. 2
Groups
Here we present an auxiliary result which will be essentially used in the next section in the proof of the main theorem. The result is about semigroup identities (not) being satisfied by certain group varieties. (However, the result holds - mutatis mutandis - for group identities as well). The notation throughout this section will be as follows. For a finite alphabet A let ( z i ) i l l be a sequence of letters of A and ( u i ) i ? ~(vi)i>o , two sequences of words in A* (some of them may be empty) such that zz
$ C (I Ui-1) uC(Vi-1)
u c(u2) u (Vi).
42
We will mainly be interested in identities of the form uo21u1... znunN voz1w1.. .2,vn.
Let U ,V be varieties of groups and let of U and V , that is,
U * V = {G I GIN
EV
U * V be the usual (Mal'cev) product
for some normal subgroup N E U}.
It is well known that * is an associative operation on the lattice of group varieties, and that U * V is generated, as a variety, by all possible semidirect products U * V with U E U and V E V (see Neumann '). Moreover, there is a well-known representation of the A-generated free object in U * V as a subgroup of a semidirect product of an appropriate member of U by an appropriate member of V . We use here the version presented in Theorem 10.2.1 in '. More precisely, let F = FAV be the A-generated free object in V and let A' = F x A. Then F acts on A' by g ( h a) , = (gh,a ) Vg, h E F, Vu E A. Consequently, if G = FAIU is the A'-generated free object in U then F acts on G by automorphisms on the left via g[(hl,U1)*1.
. . (hn,an)*l] = (gh1,Ul)*l ... (ghn,an)*I.
So we may form the semidirect product G * F subject to this action. Then the free object on A in U * V is isomorphic to the subgroup of G * F freely generated by the elements of the form ((l,a), a) where a E A. For semigroup identities u N v with u, v E A* this means that if u = a1 . . .an and v = bl . . .b, then U * V u N v if and only if (i) V k u 'v v and (ii) U k (l,al)(al,a2). .. (a1.. .an-l,an)N (l,bl)(bl,b2). .. (bl . . .b,_l,b,). In the latter identity the variables are of the form (u,a ) with u E A* and a E A, and two such variables, say ( u ,a ) and (v,b) are the same if and only if a and b are the same letters from A and V u 'v v. We are going to formulate the main result of this section. For each positive integer n let ?in be a group variety which does not satisfy any non-trivial semigroup identity u E v with ) u ((,v J5 n and u,v E A*. For each n 2 0 let
+
Vn := '?&+I
*'+I!, * . . . * 3t2 * 3c1
and for convenience put V-1 := 7, the trivial variety. In the next results we assume that the words ui,vi and the letters xi satisfy the conditions imposed at the beginning of this section. The first result can be proved by induction on n.
43
Lemma 2.1 The variety V, does not satisfy any identity uo21u1.
..2,+1u,+1
?! wo21w1
. . .xtvt
for 0 5 t 5 n. The main result of this section is an easy consequence of this lemma, again to be proved by induction on n. Theorem 2.2 If V, satisfies uox1u1.. .x,an N V o z l V l . . .Z,W, then Vn-1 satiesfies u; E wi for all i. The following property (P) of a pseudovariey H of groups will turn out to be crucial for our purpose. Definition 2.3 A pseudovariety H of groups has the property (P) if f o r each G E H and for each positive integer n there is a group F E H such that 1. F does not satisfy any non-trivial semigroup identity u 'v v f o r IuI, IvI 5 71.7
2. ( F ) * (G) C H. Property (P) has already been pointed out to be of some interest in '. Here (. . .) denotes the pseudovariety generated by the group ". . ." and * is the Mal'cev (that is, semidirect) product of the involved pseudovarieties. Observe that a pseudovariety H enjoys property (P) if the seemingly weaker condition holds: for each group G E H there exists a prime p for which the wreath product q p z 1 G is in H (see '). The following corollary is a consequence of Theorem 2.2 in a form we shall use it in the next section. Corollary 2.4 Let H be a pseudovariety of groups satisfying the property (P) and let G = ( A ) be an A-generated group in H; then f o r each positive integer n there is an A-generated group G, = ( A ) in H such that 1 . for all u,v E A+, i f u(G,) = w(G,) then also u(G) = w(G),
2. whenever (uox1u1.. . znun)(G,) = ( w o z 1 w 1 . ..xnwn)(Gn) then ui(G) = wi(G) for all i (with the assumptions on the words ui, w; and the letters x; imposed at the beginning of this section). Proof. Let Go = G E H and n E N;choose a group F1 E H not satisfying any identity u E w with 1u1, lwl 5 2 such that (F1) * (Go) C H. Notice that (F1)* (Go)is locally finite, that is, it is the finite trace of a locally finite variety. In particular, in (F1) * (Go) all finitely generated free objects exist. Let G1 be the A-generated free object in (F1) * (Go). Suppose that F,-1 and G,-1 have already been constructed. Let F, be in H such that F, does not satisfy any non-trivial identity u N w with 1211, IwI I n 1 and (F,) * (Gn-l) C H.
+
44
Now let G , be the A-generated free object in (F,) * (Gn-l). By induction one can see that for all i 5 n, Gi and (Fi)* . . . * (F1) * (Go) satisfy the same identities in [A1variables. Consequently, G , is the A-generated free object in (F,) * .. . * (F1)* (Go) and so by Theorem 2.2 has the requested property.
Remark 2.5 Note that the property (P) is not a closure property in the sense that for each pseudovariety H there would exist a least pseudovariety V such that H G V and V has (P).
3 The main result As mentioned in the introduction, a pseudovariety of groups is arborescent if (H n Ab) * H = H . It has been shown in that for H arborescent free pro-H groups enjoy certain “unique factorisation conditions” (similar to the free group). These factorisation conditions were, in turn, one essential ingre dient of the proof that such pseudovarieties satisfy the equation (1); the other ingredient was a sufficiently precise description of the implicit operations on DS. This section gives a constructive proof for the join decomposition (1) which applies to a wider class of pseudovarieties, namely those which satisfy condition (P). As a preparation for this new proof we first shall recall a result of Ash, Hall and Pin presenting a set of “nice” generators of the pseudovariety DH n ECom. We require some definitions. Throughout, let A be a finite set of letters. A permutation automaton (or group automaton) A = (&,A,-) on A consists of a finite non-empty set of states Q together with a labelling of some permutations of Q by the letters of A, denoted by a : q C) q .a. Here two different letters may label the same mapping and the m e IQI = 1 is also included; in the latter case, each letter labels the identity mapping on Q. Let r E N;for each i 5 r let Ai C A and = (Qi, Ai, -) be a permutation automaton such that Qi n Qj = 0 if j # i. Moreover, let uo,u, E A* and u1,. . . ,u,-1 E A+ be such that the first letter of ui is not in Ai (1 5 i 5 r ) and the last letter of ui is not.in Ai+l (0 5 i 5 T - 1). For each i choose pi,qi E Qi; as in 2 , p. 393, let
be the automaton depicted in Figure 1 (with appropriately chosen states inside each path ui).
45
Figure 1: The automaton A
'
A formal definition of this automaton has been given in on p. 38. For convenience, let us call such an automaton good. Note that in good automata have been characterized as those admitting a linear quasiorder on the set of states which is compatible with the action of the letters. Let n = l u o u 1 . ..url be the length of that automaton. If all transition groups of the automata belong to the group pseudovariety H then the automaton d will be called H-good. The transition semigroup M(d) of an H-good automaton A is in DH n ECom (see 5 , Corollary 2.3 or 2 , Lemma 3.9). Moreover, the class of all transition semigroups of H-good automata generates the pseudovariety DHnECom. On the one hand, this follows from the proof of Theorem 3.8 in 2 : there it is shown that the class of all such transition semigroups does not satisfy more pseudoidentities than DH n ECom itself; therefore, by Reiterman's Theorem the result follows. On the other hand, a constructive proof of this assertion has been presented in '. That is, given any S E DHnECom, a precisely described finite set of H-good automata has been constructed such that S divides the direct product of the transformation semigroups of these automata (Proposition 3.5 in '). We shall discuss that construction in more detail. Crucial for it is an important lemma by Ash which holds, if appropriately reformulated, in the more general context of semigroups with commuting idempotents and which lemma has been an important step in Ash's famous proof that each semigroup with commuting idempotents divides an inverse semigroup (see The version we shall need is the following (see Propositions 3.3 and 3.4 in '): Proposition 3.1 Let S = ( A ) E DGnECom; then there is a positive integer K = K ( S ) (depending o n S only) such that each word w E A+ admits, for some n 2 0, a factorization w = goulgl .. .ungn such that
'
314).
I . all g,(S> are group elements (go,gn may be empty),
2. for each i, the first letter of ui is not in c(gi-1) and the last letter of ui is not in c ( g i ) , 3.
lUlU2..
.un-lu,I 5 K
46
This statement and its more general version is usually proved by the use of Ramsey's Theorem. For the above mentioned case we are interested in, semigroups in DG n ECom, we shall give an elementary proof, thereby getting a better bound for the number K . Recall that each S E DG n ECom is a semilattice of unipotent semigroups, that is, if q is the least semilattice congruence on S then S/q C E ( S ) and each q-class S, is an ideal extension of the group He (the group %-class containing e ) by a certain nilpotent semigroup N , U (0). Then S, = He U Ne and the product of any INeI 1 elements of S, lies in He. Notice also that for any u , v E A+ with c(u) = c ( v ) we have u(S)r] v(S), that is, u ( S ) and v(S) are in the same subsemigroup S,. Corollary 3.2 Let S = ( A ) E DG n ECom and let
+
N = N ( S ) = max l N e l + l . eEE(S)
Then the number K ( S ) of Proposition 3.1 can be chosen to be less than NIAl. More precisely, each w E A+ admits a factorisation as in Proposition 3.1 such that Ju1. . . u,1 5 Nlc(w)I- 1. Proof. We only have to prove the existence of a factorisation satisfying conditions ( 1 ) and ( 3 ) . Namely, if g(S) is a group element and if x € c(g) then J,(s) 2 J,(s) and therefore (zg)(S)and ( g e ) ( S )are group elements, as well. Likewise, if g(S) and h ( S ) are group elements then so is ( g h ) ( S ) . Therefore, each factorisation satisfying (1) and (3) can be reduced to a factorisation satisfying (1),(2) and (3). The proof now is by induction on the size Ic(w)l of the content of w. If 1c(w)1 = 1 then the claim follows immediately from the definition of N . So let w E A+ and suppose that the claim is true for all words v with Ic(v)I < lc(w)l. We factorize w as w = 'lllx1u2x2... 'Ilkxkuk+l
such that for all i 5 k , Ic(ui)l = lc(w)l-l and xi is a letter not being contained in c(ui), that is, c(uixi) = c ( w ) ; moreover, I C ( U k + l ) l < Ic(w)I and Uk+1 may be empty. If k 2 N then we are done because in this case,
is a product of N or more elements within the same semilattice congruence class Se so that this product lies in He. Since c(uk+l) C c(u1x1. . . ukxk) the element w(S) itself is a group element. So we may assume that k 5 N - 1. By induction hypothesis, each ui admits a factorization in group and nongroup parts such that the accumulated length of the non-group parts is by induction hypothesis - at most Nlc(w)l-l- 1. Thereby we have already found a factorisation of w in group and non-group elements: each uixi admits
47
such a factorisation with the accumulated length of the non-group elements being at most Nlc(w)l-l - 1 + 1 = NIC(w)l-l.Hence the accumulated length of the non-group elements in w is at most N l " ( w ) l - y N - 1) + N I C ( 4 - 1 - 1 = N I C ( 4 l - 1,
as required. Remark 3.3 For the semigroup S = ( A ) E DG n ECom and for any a C A, a # 8 put N , = {w(S) I w(S) is non-regular and c ( w ) = a } and let N ' ( S ) = m a x ~- # IN,I+l. ~ c ~ Then Corollary 3.2 still holds if N ( S )is replaced with N'(S). The proof of Proposition 3.5 in (combined with Corollary 3.2) now shows the following. Corollary 3.4 Let S = ( A ) E DG n ECom; then S divides a direct product of transition semigroups of good automata on A such that I . each of these automata has length at most IAIN(S)lAl; = ( H e ,A,, -) where 2. the incorporated group automata are of the form each He is a maximal subgroup of S (which can be regarded to be generated by a subset A, of A); the action . is just the multiplication on the right. The transition group of each such automaton is isomorphic to the group He.
Remark 3.5 1. In item 1 we could restrict ourselves to good automata on A of length (precisely) IAIN(S)IAI; but then, as another factor in the direct product, we have to mention the least group 'fl-class of S which is A-generated and which may be needed for the division but which need not be representable as (a divisor of) the transition semigroup of any automaton described in item 2 of Corollary 3.4and having positive length. 2. Observe that the number of distinct automata satisfying i t e m 1 and 2 above and also the size of each associated transition semigroup are bounded by a primitive recursive function in the cardinality of S. 3. In the proof of Proposition 3.5 in one can argue by induction on Ic(w)U c(w')Jinstead of Ic(w)l Jc(20')).This is the reason why the factor 2 which occurs in the sentence before Proposition 3.6 in does not occur in the expression IAIN(S)IAI. Now let d = d(uO,p1,dl,q1. .. , p , , A,q,., u,) be a good automaton with group automata = ( Q i ,Ai, Let M(d) be the transition semigroup of A. We intend to show that M(d) divides the direct product C x H for some
+
0 ) .
C E J n ECom and some group H . The aperiodic semigroup C is obtained by “factoring the groups out of d”: more precisely, consider the equivalence relation on the set of states of d which identifies all states within each group automaton A. The resulting automaton, denoted by A/- is the following (as on p.394 in ’):
-
A1
A’
Ar
Figure 2: The automaton A/The associated transition semigroup C = M ( d / - ) is in J n ECom (see Corollary 2.3 in or Lemma 3.9 in ’). Now we construct the group H . Thereby we shall use the ideas developed in section 2. Let 3tl be the (non-trivial, locally finite) variety of groups generated by the transition groups of the automata d k (1 5 Ic 5 r ) . Let n = luo.. .url be the length of d and for each i E (2,. . . , n 1) let Xi be a locally finite variety of groups not satisfying any non-trivial identity u 11 w * 3t1 and let with IuI, Iwl 5 i where u, w E A*. Let V , = %,+I * . . . * 3 t ~ H = (A) be the free group on A in V,. Then we have, with the notation introduced above, the main result of the paper: Theorem 3.6 The semigroup M ( d ) divides the direct product C x H . Proof. We show that the map (a,a) e a, a E A, extends to a morphism ((a,a) 1 a E A ) C C x H -+ M ( d ) . Let u,w E A+ be such that
+
u(C) = w(C) and u ( H ) = w ( H ) . We have to show that u ( M ( d ) )= w(M(d)); therefore, it suffices to show that for each state q of d, q . u = q . w. From u(C) = v(C) and from the definition of A / - we have the following: for each state p of d / - , p . u = p . w. Consequently, for each state p of d ,p . u is defined if and only if p - w is defined. Let i j = q-; assume that i j is somewhere on the path ui and ?j. u = i j . w is somewhere on uj. Then i 5 j and we may assume that i < j . Then there are factorisations of u and w, respectively, such that
and
49
where the words U k are the words occurring in d resp. d/N, u: is a (possibly empty) su& of ui, u>is a (possibly empty) prefix of u j , and for all possible 1, the first letter of uz is not in c(gz)uc(hz)[c(gj)Uc(hj)]while the last letter of uz [u:]is not in c(gz+l)Uc(hz+l) [c(gi+l)Uc(hi+l)J(some of the words gz, hz may be empty, as well). By our assumption, u ( H )= v ( H ) ,that is V, b u E v. By Theorem 2.2, 311 gz N hi for all 1 (notice that in Theorem 2.2 some of the segments U k , vk may be empty). In particular, for each I E {i 1, . .. ,j } and each group G = (Az) in 311 we have that gI(G) = hl(G). Let GI = ( A z ) be the transition group of the automaton dz. By the above argument we have gz ( G I )= hz (G I ) ,that is, gz (G I )and hz(G I ) are the same element of the group GI and consequently, gz and hz act in the same way on &I. This applies to each I E { i 1, . . .,j } . Therefore, q .u:gi+l = q .u:hi+l, and by induction we get:
[ui]
+
+
q * u:gi+l = q . u:hi+l* q . u:gi+lui+l = q . u:hi+lUi+l Q . u:gi+lui+lgi+2 = 4 . u:hi+lui+lhi+z
* * ... * q . u = q . v
and the theorem is proved. Using Corollary 2.4 we get the next result precisely in the same way. Corollary 3.7 Let H be a pseudovariety of groups satisfying condition (P) of Definition 2.3. Let M be the transition semigroup of a n H-good automaton. Then M divides the direct product C x H for some C E J nECom and some H E H. We have already remarked that, given an arbitrary semigroup S = ( A ) in DG n ECom then we can find transition semigroups M I , . .. ,Mk of good k automata such that S divides Hi=, Mi and such that k as well as all /Mil are bounded by a primitive recursive function in JSI. Moreover, the length of the involved automata is bounded by IAIN(S)IAI.Similarly as in 6 , Corollary 3.1, it can be shown that the cardinality of the group H in Theorem 3.6 can be bounded by a primitive recursive function in the length of the involved automata and the cardinalities of the involved subgroups. Summing up, we get the next result (answering problem 25 in for that particular join decomposition): Corollary 3.8 The decomposition DG n ECom = (J n ECom) V G is effective in the following sense: each S E DG n ECom divides a direct product C x H f o r some C E J n ECom and H E G such that the cardinalities of C and H are bounded by primitive recursive functions in the cardinality of S . We remark that the latter result more generally holds for each arborescent pseudovariety H. What we actually need is that in property (P) (Definition 2.3) the cardinality of F is bounded by a primitive recursive function in n and
50
the cardinality of G. References
1. J. Almeida, Finite Semigroups and Universal Algebra, World Scientific, Singapore, 1994. 2. J. Almeida and P. Weil, Reduced factorizations in free profinite groups and join decompositions of pseudovarieties, Int. J. Algebra Comput. 4 (1994) 375-403. 3. C. J. Ash, Finite idempotent-commuting semigroups, pp 13-23 in: S. M. Goberstein and P.M.Higgins (eds.), Semigroups and Their A p plications, Reidel Publishing, Dordrecht, 1987. 4. C. J. Ash, Finite semigroups with commuting idempotents, J. Austral. Math. SOC.43 (1987) 81-90. 5. C. J. Ash, T. E. Hall and J. E. Pin, O n the varieties of languages associated with some varieties of finite monoids with commuting idempotents, Inform. Computation 86 (1990) 32-42. 6. K. Auinger, Semigroups with Central Idempotents, pp 25-33 in: J. C. Birget et a1 (eds.), Algorithmic Problems in Groups and Semigroups. Birkhauser, Boston Base1 Berlin, 2000. 7. K. Auinger and B. Steinberg, The geometry of profinite graphs with applications to free groups and finite monoids, preprint. 8. H. Neumann, Varieties of Groups, Springer-Verlag, Berlin Heidelberg New York, 1967. 9. J. E. Pin, Varieties of Formal Languages, North Oxford, London and Plenum, New York, 1986.
51
Arithmetical Complexity of Infinite Words S. V. Avgustinovich,* D. G. Fon-Der-Flaass,t A. E. Fridl Sobolev Institute of Mathematacs, pr. Koptyuga, 4, Novosibirsk, Russia Email: {avgust,flaass,frid}Qmath.nsc.ru
Abstract
We introduce a new notion of the arithmetical complexity of a word, that is, the number of words of a given length which occur in it in arithmetical progressions. The arithmetical complexity is related t o the well-known function of subword complexity and cannot be less than it. However, our main results show that the behaviour of the arithmetical complexity is not determined only by the subword complexity growth: if the latter grows linearly, the arithmetical complexity can increase both linearly and exponentially. To prove it, we consider a family of DOL words with high arithmetical complexity and a family of Toeplitz words with low complexity. In particular, we find the arithmetical complexity of the Thue-Morse word and the paperfolding word.
1
Introduction
The famous 1927 theorem by Van der Waerden [9, 71 states that for each infinite word w = wow1 . . . w, . . . on a finite alphabet C there exist arbitrarily long arithmetical progressions k, k + p , . . . , k n p such that wk = wk+p =
... - wk+np.
+
In this paper we are interested in the following generalization of the problem: what can the words WkWk+p . . . Wk+np be in general for a given w and arbitrary k, p , and n ? What are the properties of the arithmetical closure *Supported in part by INTAS (grant 97-1001) and RFBR (grant 00-01-00916). +Supported in part by Netherlandish-Russian grant NWO-047-008-006 and RFBR (grant 99-01-00581). fsupported in part by RFBR (grant 99-01-00531) and Federal Aim Program “Integration” (joint grant AO-110).
52
FA(w), i. e., of the language consisting of all such words? In particular, we make an attempt to compute or estimate the arithmetical complexity f i ( n ) of w which is defined as the number of words of FA(w)of length n. The procedure of taking arithmetical closure reminds what is called decimations in the paper [6] by J . Justin and G. Pirillo. In terms of that paper, taking an arithmetical progression is a blind decimation. Note that the arithmetical closure can be defined not only for a word but also for a language, and the term “closure” is used just because for all w the equality FA(w)= FA(FA(w)) holds. As for the arithmetical complexity f i ( n ) ,it is somehow similar to the usual subword complexity f w ( n ) ,i. e., t o the number of factors of w of length n. For example, both subword and arithmetical complexities of a periodic infinite word are ultimately constant, and both functions are bounded by (#C)n. Clearly, for all w and n
We try to show in this paper that arithmetical subwords and the function of arithmetical complexity are worth studying. In Section 2, after introducing the needed notions, we show that an arithmetical subsequence of a uniformly recurrent word is uniformly recurrent. Then we pass to studying arithmetical complexity and first show in Section 3 that it is not obliged to grow as slow as the subword complexity. We describe a class of words containing the Thue-Morse word and having linear subword complexity and arithmetical complexity equal to ( # C ) n . On the other hand, arithmetical complexity of a non-periodic word can grow linearly, and in Section 4 we validate it by an example of some family of Toeplitz words. Finally, we use the latter result to compute the arithmetical complexity of the paperfolding word, which turns out to be equal to 8n 4 for all n 2 14.
+
2
Basic Notions
Let C be a finite alphabet. As usuai, the set of all finite words on C is denoted by C*, the set of all non-empty finite words is denoted by C+, the set of all words of length n is denoted by En, and the set of all (right) infinite words is denoted by Cw. For any t E C+, the word tt . . . t . . . is denoted by tW. A (finite) word u is called a factor, or subword of a (finite or infinite) word v if v = slusz for some words $1 and s2 which may be empty. Let us consider an infinite word w E Cw:
53
where wi E C. The set of all factors of w is denoted by F(w).The the wellknown function of subword complexity of the word w (or of the language F(w)) is the number of words in F(w)of length n; we denote it by f,(n) = f F ( w ) ( n ) . Let us call the infinite word wi = WkWk+pwk+Zp. . . ~ k . . .+ the~ arith~ metical subsequence of w starting with position k and having diference p . A factor of some wi is called an arithmetical subword of w,and the set FA(w) of all arithmetical subwords of w is called its arithmetical closure:
FA(w)=
u
I P > Lk,n20},
F(w~)={X}U{~k~k+p...wk+np
P21,kB
where X denotes the empty word. In these terms, the Van der Waerden theorem can be stated as follows:
Theorem 1 (Van der Waerden 1927) For each infinite word w and positive integer n there exists a symbol a E C such that an E FA(w). In this paper, we are interested in the properties of the language FA(w) and in particular in its subword complexity f F A),( ( n ) which is denoted also by ( n ) and called the arithmetical complexity of w. Let v = W k W k + l . . .Wk+n. Formally, an occurrence of v in w is the word v together with the number k of its first letter in w.Clearly, a word may have a finite or infinite number of occurrences in a n infinite word w.If v is a prefix of w ,we call its occurrence corresponding t o k = 0 the prefix occurrence. Recall that a n infinite word w is called uniformly recurrent if each of its factors occurs in w infinitely many times with bounded gaps, i. e., if there exists a finite recurrence function R,(n) such that each factor of w of length R,(n) contains all factors of w of length n. The following lemma does not seem t o be new, but since we did not find a reference t o it, here it is given with a proof.
fi
Lemma 1 An arithmetical subsequence of a uniformly recurrent word is uniformly recurrent. PROOF. Let us consider an arithmetical subsequence wf of a uniformly recurrent word w. Since a word obtained from a uniformly recurrent word by erasing a finite prefix is uniformly recurrent, it is sufficient to consider 1 = 0 and t o show that the prefix u' of wj of length n' 1 occurs in it once more with a gap bounded by a function of n' and p. Let n = n'p and u = wo. . . w,. The word u' is an arithmetical subword of u. To find another occurrence of u' in w;,we shall find an occurrence of u in w lying at a distance dividing p from the prefix occurrence.
+
54
For an occurrence B = W k . . .W I + ~of a word v in w, we define the function c(B) = Ic mod p . Our goal is t o find an occurrence fi of u not equal t o the prefix one and having c(ii) = 0. Denote u = uo. For all i = 1,.. . , p , we define ui inductively as the minimal prefix of w containing two occurrences of ui-1 (including the prefix occurrence): ui = ui-19-i = liui-1 for some Zi,ri E C+. Since w is uniformly recurrent, all ui are well-defined and Iuil 5 R,(Iui-ll) + 1: indeed, by definition, an occurrence of ui-1 is contained in each subword of w of length Rw(lui-ll)including w1 .. . w ~ ~ ( l ~and ~ -the ~ loccurrence ), of ui-1 is not the prefix one. Thus, ui is a prefix of wo. . .WR,,,(I~~-~ 1). For all j 2 i 2 0, let us denote the last occurrence of ui in the prefix occurrence of uj by ua. By the pigeon-hole principle, at least two of the numbers c(uz) = 0, c(uk), . . . ,c(uE) are equal: c(ui) = c(uA) for some 0 5 i < j 5 p . This implies the equalities c ( u i ) = c ( u i ) for all Ic 5 i; in particular, 0 = c(ui) = c(u{). Since u = uo is a prefix of ui,we have also c(fi)= 0, where fi is the first occurrence of u in u:. Since i < j, ii is not the prefix occurrence of u in w but the needed one. Clearly, since luol = n 1 and luil 5 R,(lui-ll)
+
lupl I R,(R,(. . . (R,(n
+ 1 for all i, we have
+ 1)+ 1).. . + 1)+ 1.
P
Consequently, two occurrences of u’are contained in the prefix of w: of length at most
1 1 - ( / u p] 1) 1 5 - R,(R,(. . . (R,(n
P
+
- P
+ 1)+ 1).. . + 1)+ 1)+ 1.
P
Since this upper bound does not depend on the choice of u‘ (to consider another word, we just erase another prefix of w in the beginning of the proof), there is an occurrence of each factor of w: of length n’ + 1 in each its factor of length ;R,(R,(. . . (R,(n 1) 1).. . 1) 1). So, this is an estimate
-
+ +
+ +
P
for the recurrence function:
R,; (n’
+ 1) 5 P--1R,(R,(.
. . (R,(p’
+ 1)+ 1).. . + 1).
P
We have estimated the recurrence function of an arithmetical subsequence using its difference and the recurrence function of the initial word. The lemma is proved. 0
55
High Arithmetical Complexity In this section we consider DOL words. Let cp : C* + C* be a morphism, i.e., 3
a mapping satisfying cp(xy)= cp(x)cp(y)for all x , y E C*. Clearly, a morphism is completely defined by the images of letters. If for some a E C the image cp(a) starts with a, and Icpi(a)l-b 00, then there exists a right infinite word called the fixed point w of cp starting with a and defined by the equalities
w = cp(w) = lim @ ( a ) . n+cc
Fixed points of morphisms are called also DOL words and are widely used as examples of infinite words with given properties. Here we consider only uniform morphisms, that are morphisms with all the images of letters having the same length denoted by m. Let C = C, = {0,1,. . . ,q - l},and let a @ b denote the symbol-to-symbol addition modulo q of the words a and b of equal length. In this section, for every i E {0,1,. . . ,q- l},the expression ij denotes the word i . . . i (and not the j t h power of the number i ) . To distinguish it from
v j
the arithmetical subsequence wi, the word
Wk
.. ' w k is denoted by
(wk)j.
j
We say that a morphism cp is symmetric if for all i E C we have cp(i)= cp(0) @ i". Clearly, a symmetric morphism is determined by the image of 0. A DOL word is symmetric if it is a fixed point of a symmetric morphism starting with 0. Theorem 2 Let cp be a symmetric morphism on C , , where q is prime, and let its jixed point w be non ultimately periodic. Then for all n 2 0 we have f 3 4 = qn.
PROOF. The words of the form Oilbi, where Ibil = n - i , constitute a basis of C y under the symbol-to-symbol addition modulo q. So, it is sufficient t o prove, first, that the set of arithmetical subwords of w having length n is closed under this addition, and second, that the words O i l are contained in FA( w)for all i 2 0. First, let us consider two arithmetical subwords a and b of equal length of w and prove that a @ b also belongs t o FA(w). Let a = W k W k + p . . . ~ k and b = w k ' w k ' + p ' * . . Wk'+np'. Let r be an integer such that k n p < mr, where m is the image length of a symbol. Note that for every i , j E C we have cp(i @ j) = cp(0) @ i" @ j m = p ( i ) @ j " . Thus, cpz(i)= cp(cp(0) @ i") = ( ~ ~ (@0im2, ) etc.; by induction, we have cp'(i) = cpr(0) el3 impfor all i. The symbol Wk'mP+k is the (k + 1)st
+
+
56
symbol of cp'(wk0 = cp'(0) @ (wkOmr; thus, it can be obtained from the (k 1)st symbol of cp'(0) (equal t o W k ) by adding W k ' modulo q. This means that W k l m r + k W k @ W k ' ; analogouSly, W(kt+pl)mP+k+p = wk+p @ Wkl+pl, etC. Thus, a @ b = wk'm'+kW(k'+pl)mP+k+p . . . 'W(k~+np')mP+k+np; We see that a @ b E FA(W). Now let us prove that Oil E FA(w)for all i 2 0. Since q is prime, it is equivalent t o Oik E F A ( w )for some k E C,, k # 0: indeed, kc 1 (mod q ) for some number c E {0,1,. . . ,q - l},and adding Oik t o itself c times as it is described above, we obtain O i l . Suppose the opposite: an arithmetical subsequence of the form Oi can be prolonged in FA( w ) only by 0 if i is sufficiently large. Since the morphism is symmetric and thus 0 and any other symbol are interchangeble in FA(w), and due t o the Van der Waerden theorem, infinitely long arithmetical progressions of 0's do occur in F A ( w ) . Let us consider such progression wa = 0". Since we always can pass from cp to some its power cpk without changing the fixed point w , without loss of generality we assume that the difference p of the progression is not greater than m. So, since we know w;, starting with some point, we know at least one symbol of each image of letter. But since the morphism is symmetric, this is sufficient to determine the images of letters themselves. Since the positions modulo m of known symbols (participating in wa) change periodically, w is itself ultimately periodic. A contradiction. 0
+
Remark 1 It is possible t o characterize all ultimately periodic symmetric DOL words using the fact that they must contain arbitrarily long factors whose position modulo m depends on an occurrence. In other terms, each ultimately periodic symmetric DOL word is uncircular (an explicit definition of this notion can be found e. g. in [5]). Using the criterion of circularity obtained in [5] and adding the condition that the morphism cp is symmetric, we can conclude that a fixed point of a symmetric morphism cp can be (ultimately) periodic only if the morphism has a very special structure. Namely, up t o cyclic renaming of symbols (of the form k -+ ck mod q for a fixed integer c), we must have p(0) = (01. . . ( q - 1 ) ) l O for some I > 0. The complete proof of this statement is not at all difficult but rather cumbersome and thus is omitted.
Remark 2 If we replace the condition of non-periodicity by that of occurrence of O i l in FA(w) for all i, Theorem 2 becomes true for all cardinalities of alphabets.
57
So, at least for the case when the cardinality q of the alphabet is prime (and in fact for arbitrary alphabets too), Theorem 2 concerns most symmetric DOL words. In particular, it is applied t o the most famous one, the ThueMorse word W T M = 0110100110010110~~~ (see [3]) which is the fixed point starting with 0 of the symmetric morphism ( P T M :
{
(PTM (0) (PTM(1)
= 01, = 10.
Since q = 2, the arithmetical complexity of the Thue-Morse word is 2".
4
Low Arithmetical Complexity
In this section, we consider a subfamily of Toeplitz words and prove that each word from it has a linearly growing arithmetical complexity. Let us consider an alphabet E and a symbol ? $! C called gap. A pattern is a finite word t E C(C U {?})*. For an infinite word w E (C U {?})" and a pattern t we define an infinite word Tt(w)as a result of replacing all gaps in w by corresponding in order symbols of the periodic infinite word t W . Consider a sequence of patterns tl ,t 2 , . . . ,t , . . . and the corresponding sequence of infinite words
.. Clearly, this sequence has a limit U(t1,. . . ,t,, . . .) E C". It is called the Toeplztz word generated by the sequence (tl , .. . ,t,, . . .). In the particular case when all the ti are equal t o the same pattern t , we say that the Toeplitz word is generated by t and denote it by U ( t ) . The subword complexity of Toeplitz words generated by one pattern was found in [4] (see also [S]); it always grows as a polynomial and is linear in the case when the number of gaps in the pattern divides its length. All such words are uniformly recurrent. In this paper we consider only patterns with gaps constituting an arithmetical progression of prime difference dividing the length of the pattern; i. e., patterns of the form
58
where 1 is prime and j E (1,.. . ,I - 1). The length of such pattern t is q l , and the gaps constitute an arithmetical subsequence in tW starting with j t h symbol and having difference 1. The set of all such patterns is denoted by T(1,q , j ) . A very close notion of regular patterns together with corresponding Toeplitz words (generated in general by different patterns) and their subword complexity were considered in [8]. Example 1 Let us consider a pattern t,f = 0?2?. Clearly, t,f E T(2,2,1). We have
uo =?”
=............................
... ,
Ul = Tt, (Uo) = 0?2?0?2?0?2?0?2?0?2?0?2?0?2?.. * , ., U2 = Tt, ( U i ) = Q02?022?QQ2?022?QQ2?022?QO2?~~ U, = Ttpf(U2) = 0Q20022?0022022?0020022?0022~. . ,
...
The limit of this sequence is called the (canonical) paperfolding word and denoted by U ( t , f ) :
.. . U ( t , f ) = 002002200Q220220002002220022~ The subword complexity of U(t,,) was found in [l]and is equal t o 4n for n 2 7. By a result of [4] (and also, indirectly, of [S]), the properties of the paperfolding word are not unique: the subword complexity of every Toeplitz word generated by a pattern of T(Z,q,j)is O ( n ) . We prove that the same property holds for its arithmetical complexity: Theorem 3 For every t E T(1,q, j ) the arithmetical complexity of U ( t ) grows linearly: f&t,(n)= Wn).
PROOF. First of all, without loss of generality we can consider only canonical patterns, i. e., patterns all whose symbols except gaps are distinct. Indeed, we can obtain any other pattern t’ E T(1,q, j ) (and consequently the Toeplitz word U(t’)) from a canonical pattern t E T(Z,q,j)(respectively, from U ( t ) ) by identifying symbols. Thus, factors of arithmetical subsequences of U (t’) are also obtained from factors of corresponding arithmetical subsequences of U ( t ) by identifying symbols. Since each symbol of the canonical pattern t always maps to the same symbol of t’, a word from F l ( U ( t ) )always maps t o the same word of F l ( U ( t ’ ) ) ,but different words of FA(U(t)) can give the
59
same word of F A ( U ( ~ ' )Thus, ). the Toeplitz word generated by the canonical pattern has the maximal arithmetical complexity: fvA(t) (n) 2 f $ t , ) (n). Without loss of generality we consider the word U ( t ) generated by the pattern t = TOTI . . . ~ ~ l -such 1 that the symbol ri is equal to i if i # kl + j , 0 5 k < q, and t o ? otherwise. Clearly, t is a canonical pattern from T(1,q , j ) . Let us consider a finite or infinite word u = uoul . . 'u,.. . with ui E C. We say that a position i (and the symbol ui)have nth order if i = k l n + j e . This definition is introduced so that if u is a Toeplitz word U(t1,t 2 , . . .) for ti E T(1,q, j ) , then i is of nth order if and only if ui appears from a gap not earlier than in Un+l. This is easy to prove by induction: its base is given by the fact that all positions are of order 0, and its step uses the fact that the symbols of (n 1)st order in u are exactly the symbols of 1st order in the arithmetical subsequence of u constituted by its symbols of n th order. Let us choose an arithmetical subsequence v = vovl. . . v, . . . , where vi E C, of the word U ( t ) and study the set of factors of v. Let the difference of v be equal t o p = mql p' for some m 2 0 and p' E (0,. . . ,ql - 1); in other terms, let it be equal to p' modulo ql. Suppose first that 1 divides p'. If the position of the first symbol of v in U ( t ) is equal t o j modulo I , then v consists of symbols having 1st order in U ( t ) ,and thus is equal to another arithmetical subsequence of U ( t ) having smaller difference and consisting of appropriate symbols of 0th order. So, it does not contain factors which do not occur in subsequences of smaller differences. And otherwise, if the position of the first symbol of v in U ( t ) is not equal t o j modulo 1, then there are no symbols having 1st order in U ( t ) in it, and thus v is periodic with the period not exceeding q. Such subsequences can add only a finite number of arithmetical subwords of each length. Now consider the main case of the difference p = mql p' not divided by 1. Since I is prime, it means that exactly one of each 1 consequtive symbols of v is of first order in U ( t ) ,exactly one of each 1' symbols is of second order etc.. For each n 2 0, let us consider the set S(n) of all factors of v of length 1" whose ( w j ) t h symbol is of nth order in U ( t ) ,i. e., is situated in a position s for some k. Clearly, for all n the set S(n) is not empty. number kln j Let us show that the prefix of length 1" of a word v ( n 1) E S(n + 1) belongs t o S(n). Indeed, for all j > i, the ith and j t h symbols of a word from S(n 1)are symbols situated at distance p ( j - i) in w. In particular, since its (-j)th symbol occupies the position number a ( k ,n) = k l n + l + j 11"+'-1 -1,
+
+
+
+
+
its ( ' 2 j ) t h symbol is situated at the position number
+
60
which is of nth order. Thus, there exists an infinite word s such that for each n its prefix of length 1" belongs t o S(n). By the construction of s , each its factor is a factor of v. Vice versa, since v is uniformly recurrent according t o Lemma 1, and s contains its arbitrarily long factors, it contains all its factors. So, F(v)= F(s). Let sk be the kth symbol of s having nth order in it. In the initial Toeplitz word U ( t ) ,sk+q' is situated at the distance pqZn+l from sk. This distance is divided by qln+', and thus, if only the position of sk in V ( t )is not of ( n + l ) s t order, sk = sk+ql. We see that the set of n t h order (but not ( n + 1)st order) symbols of s is defined by a pattern t, E T ( l , q , j ) . Consequently, s is a Toeplitz word: s = U ( t o , t l , . . . ,t,, . . .). Moreover, since the (k 1)th symbol of each ti is equal t o k p + ci modulo ql for some ci, each pattern ti is uniquely determined by its first symbol (equal t o ci) and by p or, more exactly, by the remainder p' from division p t o qz. That is why we can denote
+
s = V(t1, , .. ,t,, . ..) = V(p';c1,. , . ,c,, . . ,) = V ( p ' ;c ) , where c is the sequence (c1,. . . ,en,. ..), By the definition of s = U(p';c), each word equal t o some its prefix occurs in v so that symbols which had n t h order in V(p'; c) correspond t o n t h order positions in V ( t )for all n . In particular, the position a, = je (i.e., the first position of nth order) in s = V ( p ' ; c ) is occupied by symbol c, and corresponds t o the position number d, = kqZn+l cnZn j e in U ( t ) , where k 2 0. What symbol occurs at the position number a,-l = jin V ( p ' , c ) ? By the definition, it is equal t o c,-1. On the other hand, in V ( t )it corresponds t o the symbol in the position dn-1 = dn -p(an - an-1) = dn -pln-l. Sincep=mqE+p', we haved,-l = ( k l - j m ) q l n + ( j + l c n - j p ' ) l n - l + j ~ , and thus (mod qZ). j lc, - jp' f c,-1
+
+
+
We see that c,-1 is uniquely determined by c, and p': in the example 2 below we denote this fact by c,-1 = a(c,,p'). Hence, since the sequence c is infinite, it is periodic and uniquely determined by c1 andp'. It means that the sequence s = U ( p ' , c ) and, consequently, the factorial language F ( s ) = F ( v ) depend only on p' and c1: s = V ( p ' ,c) = V ( p ' ,c l ) . Thus,
61
where P is the set of factors of finite number of periodic words given by values of p divisible by i. We see that the arithmetical closure of U(t) is the union of a finite number of languages of factors of Toeplitz or periodic words. So, computing the arithmetical complexity is reduced to computing the subword complexity of several Toeplitz words. It follows from results of [4] that the subword complexity of each of them is linear, so the arithmetical complexity of U(t) also grows linearly. D Example 2 Let us find the arithmetical complexity of the paperfolding word U(tpf) defined in Example 1 (note that the pattern tpf is canonical). To do it, we find all the U(p';c\) and use Formula (1). First of all, as proved above, c n _i is uniquely determined by cn and p' and does not depend on n or p: cn-\ = a(cn,p') = j + lcn — jp' (mod ql). Here j = 1 and q = / = 2, so we easily find a(0,1) = a(2,1) = 0 and a(0,3) - a(2,3) = 2. That is why if p' = 1, then the only possible c is ( 0 , . . . , 0 , . . . ) , and £/(p',c) = 17(1,0) = I7(0?2?) = U ( t p f ) . Analogously, if pf = 3, t h e n c = (2,... , 2 , . . . ) , and U(p',c) =U(ipf), where ipf = 2?0?. The even (i.e., divisible by / = 2) values of the difference add the set P of factors of (T, 2 W , and (02) w , so that
FA(U(tpf}) = F(U(tpf))UF(U(tpf)}
U P.
It is not difficult to show (see [2]) that a word of length n > 14 can belong to at most one of these three sets. It is clear also that f u ( t p ! ) ( n ) = /c/(t p/ )(^) — 4n, so for n > 14 we have
/#(*„)(*) = fu(trf)(n) + fu(tpf)(n) + 4 - 8n + 4. The values of f£,t -. (n) for n < 14 can be found by manual comparing the three sets F ( U ( t p f ) ) , F ( U ( t p f ) ) , and P, and are given in the following table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 n fA(n) 2 4 8 16 24 32 44 52 64 76 86 96 106 116 We have found the arithmetical complexity of the paperfolding word.
5
Concluding remark
Like most new notions, arithmetical complexity poses a series of natural problems. As always, it would be interesting to investigate the case of low
62
complexity and, for instance, t o characterize the set of infinite words whose arithmetical complexity grows linearly. Another direction is examination of known families of infinite words, like DOL words and Toeplitz words from wider classes, Sturmian words etc., and finding their arithmetical complexity.
Acknoledgements We thank professor Masami Ito who made it possible to present this work at the 3rd ICWLC, and the referee for careful reading and checking calculations in Example 2.
References [l] J.-P. Allouche, The number of factors in a paperfolding sequence, Bull. Austral. Math. SOC.46 (1992) 23-32.
[2] J.-P. Allouche, M. Bousquet-Mklou, Canonical positions for the factors in paperfolding sequences, Theoret. Comput. Sci 129 (1994), 263-278. [3] J.-P. Allouche, J. Shallit, The ubiquitous Prouhet-Thue-Morse sequence, in: C. Ding, T. Helleseth, and H. Niederreiter (eds.), Sequences and their Applications, Proc. of SETA’98, DMTCS, Springer (1999), 1-16. [4] J. Cassaigne, J . Karhumaki, Toeplitz words, generalized periodicity and periodically iterated morphisms, European J. of Combinatorics 18 (1997), 497-5 10. [5] A. E. Frid, On uniform DOL words, in: M. Morvan, C. Meinel, and D. Krob (eds.), STACS’98, LNCS 1373, Springer (1998), 544-554.
[6] J . Justin, G. Pirillo, Decimations and Sturmian words, RAIRO Informatique 31 (1997), 271-290.
[7] A. Khintchine, Three Pearls in Number Theory, Graylock Press, New York, 1948. [8] M. Koskas, Complexite‘ de suites de ToepZitz, Disc. Math. 183 (1998), 161-183. [9] B. L. Van der Waerden, Beweis einer Baudet’schen Vermutung, Nieuw. Arch. Wisk. 15 (1927), 212-216.
63
THE EMPEROR’S NEW RECURSIVENESS: THE EPIGRAPH OF THE EXPONENTIAL FUNCTION IN TWO MODELS OF COMPUTABILITY VASCO BRATTKA Theodische Informatik I, FernUniversitat Hagen, D-5808d Hagen, Germany E-mail: vasco.brattkaofernuni-hagen.de In his book “The Emperor’s New Mind” Roger Penrose implicitly defines some criteria which should be met by a reasonable notion of recursiveness for subsets of Euclidean space. We discuss two such notions with regard t o Penrose’s criteria: one originated from computable analysis, and the one introduced by Blum, Shub and Smale.
1
Introduction
In his book “The Emperor’s New Mind” Roger Penrose raises the question whether the famous Mandelbrot set A4 g lR2 can be considered as recursive in some well-defined sense. Throughout his discussion of this problem Penrose uses an intuitive notion of recursiveness and he complains about the lack of a mathematically precise meaning of this notion. On the one hand, he argues that it is insufficient to define recursiveness of a set as decidability with respect to computable points, since in this case even a simple set like the unit ball B := {(s, y) E R2: x 2 + y2 5 1) does not become recursive. Since Penrose is convinced that the unit ball should become recursive we are led to introduce the following criterion.
Penrose’s first criterion. A reasonable notion of recursiveness for subsets of Euclidean space should make the closed unit ball recursive.
Figure 1. The closed unit ball B
64
On the other hand, Penrose argues that certain other ways to define recursiveness are also inappropriate, especially, because they do not handle the border of the sets under consideration in the right way. This aspect is important since the complexity of sets is often inherent in their border, as in case of Mandelbrot’s set. For instance, a definition of recursiveness as decidability with respect to rational or algebraic numbers is insufficient, since in this case sets like the closed epigraph of the exponential function E := {(x,y) E R2 : y 2 e Z } would not be handled appropriately. The border of this set does not contain any algebraic point besides ( 0 , l ) and thus the border is irrelevant to a decision procedure which is restricted to algebraic points. Of course, Penrose is convinced that a set, easily structured like the closed epigraph of the exponential function, should be recursive. This motivates the second criterion.
Penrose’s second criterion. A reasonable notion of recursiveness for subsets of Euclidean space should make the closed epigraph of the exponential function recursive.
...
L
.r Figure 2. The closed epigraph E of the exponential function
Apparently, there are several similar conditions and Penrose’s criteria are by no means sufficient conditions for a reasonable notion of recursiveness. They are just necessary conditions; a notion of recursiveness which does not meet Penrose’s criteria would be highly suspicious since it could be doubted whether it reflects algorithmic complexity in the right way. Since Penrose did not present any notion which fulfills all his requirements, it seems as if there exists no suitable notion of recursiveness. The aim of this paper is to compare two existing notions of recursiveness for subsets of Euclidean space and to find out which comes closest to Penrose’s requirements. The first notion is based on computable analysis and has been developed and investigated by several authors. The basic idea of recursive analysis is to call a function f : R n -+ R computable, if there exists a Turing machine which transforms Cauchy sequences of rationals, rapidly converging to an input 2, into Cauchy sequences of rationals, rapidly converging to the
65
output f(z).Moreover, a set A 2 R” is called recursive, if its distance funct i o n d A : R n +. R is computable.a Here d denotes the Euclidean metric. This notion of recursiveness straightforwardly generalizes the notion of recursiveness from classical Computability theory (see Odifreddi 3 ) : if we endow the natural numbers N with the discrete metric, then the distance function of a subset A N is equal to its characteristic function and computability of the characteristic function is equivalent to recursiveness of the set A. In Euclidean space the distance function is a LLcontinuous substitute” for the characteristic function. Although recursiveness of subsets of Euclidean space in this sense does not correspond to the intuition of “decidability”, it is a formal generalization of the classical notion of recursiveness. Especially, a subset A N considered as a subset of R is recursive, if and only if it is classically decidable. Finally, it is easy to prove that this notion of recursiveness meets Penrose’s criteria. As a second notion of recursiveness we will investigate the notion which has been developed by Blum, Shub and Smale.4i5In their theory a function f : R ” +. Iw is computable (we will call it algebraically computable for the following), if there exists a real random access machine which computes f . Such a machine uses real number registers, arbitrary constants, arithmetic operations, comparisons and equality tests. Moreover, a set A C R”is called recursive by Blum, Shub and Smale (we will call it algebraically recursive for the following), if its characteristic function is algebraically computable. If we restrict the class of constants appropriately (for instance to rational numbers), then a set A C N considered as a subset of R is algebraically recursive, if and only if it is classically decidable. In this sense the notion of algebraic recursiveness is a second generalization of the classical notion of recursiveness. Obviously, the unit ball is algebraically recursive and hence Penrose’s first criterion is met. Blum and Smale have proved that Mandelbrot’s set is not algebraically recursive and hence it seems as if they have given an answer to Penrose’s original question. But with a similar technique we will prove that the closed epigraph of the exponential function is not algebraically recursive and hence it is highly questionable whether Blum and Sinale’s answer to Penrose’s question is significant. If even a simple set like the epigraph of the exponential function is not algebraically recursive, we can conclude that algebraic non-recursiveness obviously does not reflect the intrinsic algorithmic complexity of a set.
aThe idea of using distance functions to characterize “located” sets has first been used in constructive analysis, see Bishop and Bridges.2
66
2
Recursive and Recursively Enumerable Sets
In this section we give the precise definitions of several classes of recursively enumerable and recursive sets and we will give a short survey on some elementary properties. Let d : R" x R" .+ R be the Euclidean metric of R", defined by d ( x , y) := dET'l Ixi - yi12 for all x , y E R". By B ( x , r ) := {y E R" : d ( x , y ) < T - }we denote the open balls and by B ( X , T:= ) {y E R" : d(x,y) 5 r } the closed balls with respect to d. For each set A 5 IR" we denote by d A : R" -+ R the distance function of A , defined by d A ( x ) := infaEAd ( x , u). Let a : N + R" be some standard enumeration with range(0) = Q", defined for instance by c x ( ( i ~ , Icl), j ~ , ..., (i,,j,, k,)) := ..., -1. Here, (.) : N2 -+ N denotes Cantor's Pairing Function, defined by ( i , j ) := + j ) ( i + j + 1)+ j , which can inductively be extended to a function (.) : N" + N. All these pairing functions are bijective and computable, as well as their inverses. We assume that the reader is familiar with the definition of computable real functions (see, for instance, Weihrauch17Pour-El and Richards,s KO 9 ) . We briefly recall the ideas: a function f :C IR" -, Iw is called computable, if there exists a Turing machine which transforms each Cauchy sequence (qi)iEn of rational numbers qi E Q" (encoded with respect to a ) , which rapidly converges to some x E dom(f) into a Cauchy sequence ( r i ) i Eof~ rational numbers ri E Q , which rapidly converges to f(x). Here, rapid convergence means d ( q i , q k ) 5 2-k for all i > Ic (and correspondingly for (ri)iEn). Of course, a Turing machine which transforms an infinite sequence into an infinite sequence has to compute infinitely long, but in the long run the correct output sequence has to be produced. It is reasonable to assume one-way output tapes for such machines since otherwise the output after some finite time would be useless (because it could be replaced later). Functions, such as exp, sin, cos, In and max are examples of computable functions. One of the basic observations of computable analysis is that computable functions are continuous. This is because approximations of the output are computed from approximations of the input and therefore each a p proximation of the output has to depend on some approximation of the input. Computable functions of type f : N R" can be defined similarly and are called computable sequences. Now we are prepared to define the notion of recursively enumerable and recursive subsets in the sense of computable analysis (see Brattka and Weihrauch lo for a survey). These notions are explicitly defined for open or closed sets, respectively.
(w,
i(i
67
Definition 2.1 (Recursively enumerable open and closed sets) 1. An open subset A IR" is called recursively enumerable, (r.e. for short), if there is a computable function f : N 4 N2 such that A = U ( i , j ) E r a n g e ( f ) B(a(i),2 - 9
2. A closed subset A En is called recursively enumerable, (r.e. for short), if A = 0 or there is a computable sequence f : N -+ IR" such that range(f) is dense in A. 3. An open (closed) set is called co-recursively enumerable (cer.e. for short), if its complement AC is r.e. 4. An open (closed) set is called recursive, if it is r.e. and cer.e. Recursively enumerable open sets have first been introduced and investigated by Lac0mbe.l' Equivalent definitions to the given ones have been investigated by several authors (see Weihrauch and Kreitz,l27l3KO et a1.,1419715 Ge and Nerode,16 Zhou,17 Zhong,l8 Brattka 19). The following characterization gives an impression of the stability of the definition of r.e. sets. For completeness we also mention the characterizations via semi-computable d i s tance functions. These notions are not used any further in this paper and the interested reader is refered to Brattka and Weihrauch lo for the definitions and proofs.
Lemma 2.2 (Characterization of r.e. closed sets) Let A
Rn be a
closed set. T h e n the following equivalences hold:
1. % %
2.
A i s recursively enumerable { (2, j ) E N2 : A n B ( a ( i )2, - j ) # 8) is recursively enumerable d A : R n IR i s upper semi-computable, -+
A i s co-recursively enumerable
* {(i, j ) E N2 : A n B(a(i),2 - j ) = 0 ) i s recursively enumerable % d A : R n --+
%
3.
A = f-l{O}
i s lower semi-computable f o r some computable f u n c t i o n f : R"
A i s recursive
-+
R,
d~ : Rn -+ R i s computable.
Using these characterizations and the fact that the exponential function is a computable function one can easily show that the notion of recursiveness of computable analysis fulfills Penrose's criteria.
68
Proposition 2.3 (Recursive sets and Penrose’s criteria)
1. T h e closed unit ball B := ((2,y) E R 2 : x2 + y2 5 1) i s a recursive set. 2. T h e closed epzgruph E := { (x,y) E R 2 : y 2 ex} is u recursive set. Proof.
d
m
1. We obtain d ~ ( ~ , =y max(0, ) - 1) for the distance function dg : R 2 --+ R . Thus, d~ is computable and B is recursive.
2. There exists a computable function f : N R 2 such that range(f) = {(z, e” y) E R 2 : z, y E Q, y 2 0 } , since the exponential function is computable. Since range(f) is dense in E it follows that E is an r.e. closed set. The function g : R 2 -+ R with g ( z , y) := max(0, e” - y} is computable and E = g-l{O}. Thus, E is a ccer.e. closed set. Altogether, 0 E is a recursive closed set.
+
+ .
More generally, the proof of 2. shows that the closed epigraph epi(f) = { (2,y) E R 2 : y 2 f(x)} of a computable function f : R +. R is a recursive set.b It is worth noticing that the notion of computability and the notion of recursiveness of computable analysis fit together very well: a continuous function f : Iw +. R is computable, if and only if its graph is recursive a s a closed subset of R 2 (see Weihrauch 20). 3
Algebraic Recursiveness
In this section we want to prove that the notion of algebraic recursiveness does not meet Penrose’s second criterion. We start with the definition of algebraically r.e. sets as halting sets of real random access machines, as they have been used by Blum, Shub and Smale.425These real random access machines use real number registers, arbitrary constants, arithmetic operations, comparisons and equality tests. We assume that the reader is familiar with the precise definitions. F’rom the point of view of computable analysis especially the comparisons and equality tests are problematic. From the point of view of classical Computability theory also the constants are suspicious since one can code an arbitrary function f : N +. N in such a constant.
bFora general discussion of computability properties of the epigraph, see Zheng et. a1.20,21
69
Definition 3.1 (Algebraically r.e. sets) Let A
R".
1. A is called algebraically r.e., if A is the halting set of some real random access machine. 2. A is called algebraically recursive, if A as well as its complement A" is algebraically r.e. If A is the halting set of a real random access machine which does only use rational constants, then we will say that A is algebraically r.e. with rational constants. Obviously, the unit ball B := ((5, y ) E R2 : x2 + y2 5 l} is an algebraically recursive set, even with rational constants. We just have to compute z2 y 2 and test z2 y2 5 1.
+
+
Proposition 3.2 T h e closed unit ball B := {(z, y ) E R 2 : x2 algebraically recursive with rational constants.
+ y2 5 1) i s
One can easily prove that the open epigraph of the exponential function > e 2 ) is an r.e. open set and hence it is also algebraically r.e. (as any other r.e. open set). On the other hand, we will show,that the closed epigraph E of the exponential function is not algebraically recursive. Indeed, we will prove that it is not even algebraically r.e. The proof uses some standard techniques of Blum, Shub and Smale's theory, especially their Path Decomposition Theorem, which states that each algebraically r.e. set is a countable (disjoint) union of basic semi-algebraic sets (see Blum et al.5). We recall some basic definitions and facts from real algebraic geometry (which can be found in Bochnak et a1.22 and Marker et al.23). The class of semi-algebraic subsets of R" is the smallest class of subsets of R" which contains all sets {x E R" : p ( z ) > 0 ) with real polynomials p : R" -+ R,and which is closed under finite intersection, finite union and complement. Each semi-algebraic set can be written a s finite union of basic semi-algebraic sets, which have the form {z E RIB": Pl(Z)= 0,.'.,p z ( z ) = 0,q1(x) > 0, ' " 1 qJz) > O } , where p l , . . . , p i , 41,..., qj : R" + R are real polynomials. A (partial) function f :C R" + R is called semi-algebraic, if its graph graph(f) := ((2, y ) E EXnf1 : f(z)= y} is a semi-algebraic set. Using the normal form given above, it is easy to show that each semi-algebraic function is algebraic, i.e. there exists some real polynomial p : Rn+l + R , p # 0 such that p ( z , f(z))= 0 for all z E dom(f). By the Theorem of Tarski-Seidenberg semi-algebraic sets are closed under projection and one can conclude that the interior A", the closure 71 and {(z,y) E R 2 : y
70
hence the border aA = 71\ A" of a semi-algebraic set A is semi-algebraic too (additionally, one uses the fact that the Euclidean metric is a semi-algebraic function). Correspondingly, one can see that the lower border A1 := {(x,y) E R 2 : (z, y) E A and (Vz E R ) ( ( x 2) , E A +z 2 y ) } is semi-algebraic if A C R 2 is. By f l u we will denote the restriction of a function f with dom(f 1 ~ = ) dom(f) n U . Now we are prepared to prove the following result.
Proposition 3.3 T h e closed epigraph E := { ( s l y ) E R 2 : y 2 e 2 } of the exponentaal function i s not algebraically T. e. Proof. Let E := {(z,y) E R 2 : y 2 f ( x ) } be the closed epigraph of the exponential function f : R -+ R . Let us assume that E is algebraically r.e. Then, by the Path Decomposition Theorem, E is a countable union of semi-algebraic sets Ai C R2, i.e. E = U Z o A ~ Since . the closure of a semi-algebraic set is semi-algebraic too, we can m u m e w.l.0.g. that all sets Ai are closed. Especially, we obtain aE = U&(dE n Ai)and since the border dE is a complete subspace of R 2 it follows by Baire's Category Theorem that there is some i E N and a non-empty open set U C R 2 such that 8 # dE n U 5 Ai. Since 8E = graph(f) and f is continuous, there are some non-empty open intervals I , J C R such that f(1)C_ J and V := I x J C U . Hence graph( f 11) = d E n V = A! n V is semi-algebraic, since A! and V are semi-algebraic. But using the Identity Theorem for real-analytic functions and the power series expansion of the exponential function, one can prove 0 that f l is ~ not algebraic. Contradiction! This proposition proves that algebraic recursiveness does not meet Penrose's second criterion. w e will call a function f : R + everywhere transcendental, if f Iu is not algebraic for each non-empty open set U R.The proof that the exponential function is everywhere transcendental can be found in basic texts on analysis (see, for instance, Erwe 24). Besides the fact that the exponential function is everywhere transcendental and continuous, we have not used any specific properties of the exponential function in the previous proof. By symmetry we obtain the following general result.
Theorem 3.4 Iff : R -+R i s a n everywhere transcendental and continuous function, t h e n neither the closed epigraph, nor the closed hypograph, nor the graph o f f i s a l g e b r a i d l y r.e. It is worth noticing that the notions of algebraic recursiveness and algebraic computability do not fit together in the same sense as the notions of recursiveness and computability of computable analysis. The square root
71
:c
function f R --t R , 2 c-) fi is an example of a function which is not algebraically computable but whose gra.pli is algebraically recursive. Hence, the algebraic non-recursiveness of the graph of the exponential fuiiction cannot simply be deduced from the fact that the exponential function is not algebraically computable.‘
4
Conclusion
We have seen that the notion of algebraic recursiveness does not meet Penrose’s criteria, while the notion of recursiveness from computable analysis does. The latter notion describes recursiveness in terms of computability of the distance function d A of a set A. In view of the fact that equality on the real numbers is undecidable, recursiveness in this sense is the best what one could expect. Recursiveness implies “decidability up to the equality test on the real numbers”: if we only could decide whether d ~ ( z=) 0, then we could decide whether 2 E A or not. An essential question remains open. We do not know whether the Mandelbrot set is a recursive closed set or not. It is easy to see that it is a c+r.e. closed set but it is still a challenging open question to find out whether it is also an r.e. closed set or not!
Acknowledgements The main result of this paper, Proposition 3.3, has been motivated by an inspiring discussion with Peter Hertliiig in Dagstuhl 1997. This work has been supported by DFG Grant BR 1807/4-1.
References 1. R. Penrose, The Emperor’s New Mind. Concerning Computers, Minds and The Laws of Physics (Oxford University Press, New York, 1989). 2. E. Bishop and D.S. Bridges, Constructive Analysis (Springer, Berlin, 1985). 3. P. Odifreddi, Classical Recursion Theory (North-Holland, Amsterdam, 1989). 4. L. Blum, M. Shub, and S. Smale, On a theory of computation and complexity
over the real numbers: NP-completeness, recursive functions and universal machines, Bul. Amer. Math. SOC.21:l (1989) 1-46. 5. L. Blum, F. Cucker, M. Shub, and S. Smale, Complexity and Real Computation (Springer, New York, 1998). “Over algebraically closed fields a function is algebraically computable, if and only if its graph is algebraically recursive, see Ceola and L e c ~ m t e . ’ ~
72
6. L. Blum and S. Smale, The Godel incompleteness theorem and decidability over a ring, in M.W. Hirsch et al. (eds.), From Topology to Computation: Proceedings of the Smalefest (Springer, New York, 1993) 321-339. 7. K. Weihrauch, Computable Analysis (Springer, Berlin, 2000). 8. M.B. Pour-El and J.I. Richards, Computability in Analysis and Physics (Springer, Berlin, 1989). 9. K.-I KO, Complexity Theory of Real Functions (Birkhauser, Boston, 1991). 10. V. Brattka and K . Weihrauch, Computability on subsets of Euclidean space I: Closed and compact subsets. Theoret. Comput. Sci. 219 (1999) 65-93. 11. D. Lacombe, Les ensembles rkcursivement ouverts ou fermks, et leurs applications a 1'Analyse rbcursive, C.R. Acad. Sc. Paris 246 (1958) 28-31. 12. K. Weihrauch and C. Kreitz, Representations of the real numbers and of the open subsets of the set of real numbers, Ann. Pure Appl. Logic 35 (1987) 247260. 13. C. Kreitz and K. Weihrauch, Compactness in constructive analysis revisited, Ann. Pure Appl. Logic 36 (1987) 29-38. 14. K.-I KO and H. Friedman, Computational complexity of real functions, Theoret. Comput. Sci. 20 (1982) 323-352. 15. A. Chou and K . 4 KO, Computational complexity of two-dimensional regions, SIAM J. Comput 24 (1995) 923-947. 16. X. Ge and A. Nerode, On extreme points of convex compact Turing located sets, in A. Nerode and Y. V. Matiyasevich (eds.), Logical Foundations of Computer Science, vol. 813 of LNCS (Springer, Berlin, 1994) 114-128. 17. Q. Zhou, Computable real-valued functions on recursive open and closed subsets of Euclidean space, Math. Logic Quart. 42 (1996) 379-409. 18. N. Zhong, Recursively enumerable subsets of R' in two computing models: Blum-Shub-Smale machine and Turing machine, Theoret. Comput. Sci. 197 (1998) 79-94. 19. V. Brattka, Computable invariance, Theoret. Comput. Sci. 210 (1999) 3-20. 20. K. Weihrauch and X. Zheng, Computability on continuous, lower semicontinuous and upper semi-continuous real functions, Theoret. Comput. Sci. 234 (2000) 109-133. 21. X. Zheng, V. Brattka, and K. Weihrauch, Approaches to effective semicontinuity of real functions, Math. Logic Quart. 45:4 (1999) 481-496. 22. J. Bochnak, M. Coste, and M.-F. Roy, Ge'ome'trie alge'brique re'elle (Springer, Berlin, 1987). 23. D. Marker, M. Messmer, and A. Pillay, Model Theory of Fields (Springer, Berlin, 1996). 24. F. Erwe, Differential- und Integralrechnung (Bibliographisches Institut, Mannheim, 1973). 25. C. Ceola and P.B.A. Lecomte, Computability of a map and decidability of its graph in the model of Blum, Shub and Smale, Theoret. Comput. Sci. 194 (1998) 219-223,
73
ITERATIVE ARRAYS WITH LIMITED NONDETERMINISTIC COMMUNICATION CELL T. BUCHHOLZ, A. KLEIN AND M.KUTRIB Institute of Informatics, University of Giessen Arndtstr. 2, 0-35392 Giessen, Germany E-mail:
[email protected] An iterative array is a line of interconnected interacting finite automata. One distinguished automaton, the communication cell, is connected to the outside world and fetches the input serially symbol by symbol. Sometimes in the literature this model is referred to as cellular automaton with sequential input mode. We are investigating iterative arrays with a nondeterministic communication cell. All the other cells are deterministic ones. The number of nondeterministic state transitions is regarded as a limited resource which depends on the length of the input. It is shown that the limit can be reduced by a constant factor without affecting the language accepting capabilities, but for sublogarithmic limits there exists an infinite hierarchy of properly included real-time language families. Finally we prove several closure properties of these families.
1
Introduction
Devices of interconnected parallel acting automata have extensively been investigated from a language theoretic point of view. The specification of such a system includes the type and specification of the single automata, the interconnection scheme (which sometimes implies a dimension to the system), a local and/or global transition function and the input and output modes. One-dimensional devices with nearest neighbor connections whose cells are deterministic finite automata are commonly called iterative arrays (IA) if the input mode is sequential to a distinguished communication cell. Especially for practical reasons and for the design of systolic algorithms a sequential input mode is more natural than the parallel input mode of so-called cellular automata. Various other types of acceptors have been investigated under this aspect (e.g. the iterative tree acceptors in [8]). In connection with formal language recognition IAs have been introduced in [7] where it was shown that the language families accepted by real-time IAs form a Boolean algebra not closed under concatenation and reversal. Moreover, there exists a context-free language that cannot be accepted by any &dimensional IA in real-time. On the other hand, in [6] it is shown that for every context-free grammar a 2-dimensional linear-time IA parser exists. In [lo] a real-time acceptor for prime numbers has been constructed. Pattern
74
manipulation is the main aspect in [I]. A characterization of various types of IAs by restricted Turing machines and several results, especially speed-up theorems, are given in [13,14,15]. Various generalizations of IAs have been considered. In [20] IAs are studied in which all the finite automata are additionally connected to the communication cell. Several more results concerning formal languages can be found e.g. in [21,22,23]. In some cases fully nondeterministic arrays have been studied, but up to now it is not known how the amount of nondeterminism influences the capabilities of the model. In terms of Turing machines bounded nondeterminism has been introduced in [ll]. Further results concerning cellular automata, Turing machines, pushdown automata and finite automata can be found e.g. in [3,5,16,17,18,19]. Here we introduce IAs with limited nondeterminism. We restrict the ability t o perform nondeterministic transformations to the communication cell, all the other automata are deterministic ones. Moreover, we limit the number of allowed nondeterministic transitions dependent on the length of the input. The paper is organized as follows. In section 2 we define the basic notions and the model in question. Section 3 is devoted to the possibility t o reduce the number of nondeterministic transitions by a constant factor. In section 4 by varying the amount of allowed nondeterminism we prove an infinite hierarchy of properly included language families. Due t o the results in section 3 we need sublogarithmic limits for the number of nondeterministic transitions in order t o obtain the hierarchy. Finally, in section 5 several closure properties of the real-time acceptors with such limits are shown.
2
Model and Notions
We denote the rational numbers by Q,the integers by 7, the positive integers { 1 , 2 , . . .} by N, the set N U (0) by No and the powerset of a set S by 2s. The empty word is denoted by E and the reversal of a word w by w R . We use C for inclusions and C if the inclusion is strict. For a function f we denote its i-fold composition by f [ i ] , i E N, and define the set of mappings that grow strictly less than f by o(f) = { g : NO-+ N limn+, = 0). The set R ( f )
I
#
> 0). The identity is defined according t o {g : No -+ N 1 liminf,,, function n F-+ n is denoted by id. An iterative array with nondeterministic communication cell is an infinite linear array of finite automata, sometimes called cells, where each of them is
75
connected t o its both nearest neighbors (one to the left and one t o the right). For convenience we identify the cells by integers. Initially they are in the socalled quiescent state. The input is supplied sequentially to the distinguished communication cell at the origin. For this reason we have two local transition functions. The state transition of all cells but the communication cell depends on the current state of the cell itself and the current states of its both neighbors. The state transition of the communication cell additionally depends on the current input symbol (or if the whole input has been consumed on a special end-of-input symbol). The finite automata work synchronously at discrete time steps. More formally:
Definition 1 An iterative array with nondeterministic communication cell (G-IA) is a system ( S ,S , S n d , SO, #, A , F ) , where 1. S is the finite, nonempty set of states, 2. A is the finite, nonempty set of input symbols, 3. F S is the set of accepting states, 4 . so E S is the quiescent state, 5. # !$ A is the end-of-input symbol, 6. S : S3 Si is the deterministic local transition function for non-communication cells satisfying &(so,S O ,S O ) = SO, 7. 6,d : S3 x ( A U {#}) + 2’ is the nondeterministic local transition function for the communication cell satisfying VSI, s2, s3 E S, a E A U { #} : b n d ~ ls,2 , s 3 , a> # 0. Let M be a G-IA (G for guessing). A configuration of M at some time t 2 0 is a description of its global state which is actually a pair (wt, ct), where wt E A* is the remaining input sequence and ct : Z 3 S is a mapping that maps the single cells to their current states. During its course of computation a G-IA steps nondeterministically through a sequence of configurations. The configuration (WO, CO) at time 0 is defined by the input word 200 and the mapping co(i) = SO, i € Z,while subsequent configurations are chosen according t o the global transition function And: Let (wt ,ct), t 2 0, be a configuration then the possible successor configurations (wt+l, ct+l) are as follows:
where i E Z \ {0}, and a = it, wt+l = E if wt = E , and a = a l , wt+l = a2 . . . a, if wt = a1 . . . a,. Thus, the global transition function And is induced by 6
76
and
6nd.
The i-fold composition of
And is defined as follows:
If the state set is a Cartesian product of some smaller sets S = Sl x . . . x S k , we will use the notion register for the single parts of a state. The concatenation of one of the registers of all cells respectively forms a track. A G-IA is deterministic if 6 n d ( S 1 , ~ 2 , s g , u )is a singleton for all states S ~ , S Z , S QE S and all input symbols u E A U {#}. Deterministic iterative arrays are denoted by IA.
Definition 2 Let M = ( S ,6, bnd, S O , #,A, F) be a G-IA. 1. A word w E A* is accepted by M if there exists a time step i E N such that ci(0) E F for some (wi,ci) E n i i ( ( w , c 0 ) ) . 2. L ( M ) = { w E A* I w is accepted by M } is the language accepted by
M.
+
3. Let t : NO + N, t(n) 2 n 1, be a mapping and iw be the minimal time step a t which M accepts a w E L ( M ) in some computation. If all w E L ( M ) are accepted within i, 5 t(lw1) time steps, then L is said to be of time complexity t. The family of all languages which can be accepted by a G-IA with time complexity t is denoted by -Yt(G-IA). In the sequel we will use a corresponding notion for other types of acceptors. If t equals the function n 1 acceptance is said t o be in real-time and we write -YTt(G-IA). The linear-time languages -Ylt(G-IA) are defined according t o -Yit(G-IA) = UkEQ,k,lLk.n(G-IA). There is a natural way t o restrict the nondeterminism of the arrays. One can limit the number of allowed nondeterministic state transitions of the communication cell. For this reason a deterministic local transformation 6d : S3 x ( AU {#}) -+ S for the communication cell is provided and the global transformation induced by 6 and bd is denoted by Ad. Let g : No + NObe a mapping that gives the number of allowed nondeterministic transitions dependent on the length of the input. The resulting system ( S ,6, dnd, 6 d , S O , #, A,F ) is a gG-IA (g guess IA) if starting with the initial configuration (w0,co) the possible configurations at some time i are given by the global transformations
+
77
as follows: ifi=O
U
A$-g(lwl)l ((w', c'))
otherwise
( W ' , C ' ) ~ A ~ ~((w0,co)) ''"')~
Observe that in this definition the nondeterministic transitions have to be applied before the deterministic ones. This is not a serious restriction since nondeterministic transitions for later time steps can be guessed and stored in advance (cf. second part of the proof of Theorem 3). Up to now we have g not required t o be effective. Of course, for almost all applications we will have to do so but some of our general results can be developed without such requirement.
3
Guess Reduction
This section is devoted to the reduction of the number of nondeterministic transformations. In the sequel we will make extensively use of the ability of IAs t o simulate a pushdown storage [8,2] or a queue [4] on some track in real-time. The communication cell contains the symbol at the top of the stack or the queue. The left-to-right inclusion in the following theorem is not immediate since there might be computation paths of the kgG-IA that cannot appear for the gG-IA. Therefore, the kgG-IA must be able t o verify whether or not its communication cell has performed g(n) nondeterministic transitions.
Theorem 3 Let g : NO -+ No be a mapping and k E N be a constant. If t : No -+ N, t(n)2 n + 1, is a mapping such that t(n)2 k .g(n) for almost all
n E N then 2t(gG-IA) = 2 t ( l C . gG-IA) Proof. The crucial point in proving the inclusion 2t(gG-IA) C Zt(kgG-IA) is that a kgG-IA M' which is designated to simulate a given gG-IA M with the same time complexity must not simulate too many nondeterministic transitions of M . Therefore, the communication cell of M' is equipped with a pushdown storage. During its nondeterministic transitions M ' either can simulate a nondeterministic step of M whereby k - 1 specific symbols are pushed or can simulate a deterministic step of M whereby one symbol is popped.
78
Once M' decided t o simulate a deterministic transition it has to do so for its remaining nondeterministic steps, whereby again one symbol is popped respectively. In order t o accept the input M' has to pop the last symbol from the stack exactly a t time step k . g(n) which is its last nondeterministic one. Let m be the number of time steps at which symbols are pushed. Then we have m . ( k - 1) = k . g(n) - m m = g(n). To see the other inclusion D49,(kgG-IA) _49,(gG-IA)we use again a pushdown storage. The communication cell of a gG-IA M' simulating a kgG-IA M without any loss of time pushes k - 1 nondeterministically determined functions d : S3 x ( AU {#}) --t S satisfying d(s1,s2] s3, a) E bnd(Sl1s 2 , s3, a) (here bnd denotes the nondeterministic transition function for the communication cell of M ) during each of its nondeterministic transitions. Additionally] it simulates a nondeterministic transition of M . During the first deterministic transitions such a function is popped and applied to the states of the communication cell and its neighbors and the current input symbol which yields the next state of the communication cell. Hence a nondeterministic transition in M is simulated deterministically. Altogether M ' performs g(n) + (k- 1).g(n) = k.g(n) nondeterministic transitions and accepts exactly 0 the same language as M . A constant number of nondeterministic transitions does not increase the power of IAs. The principle of the proof is t o simulate all finitely many choices on different tracks.
Theorem 4 Let t : constant then
0.10
+ N,
t(n) 2 n + 1 be a mapping. If k E N is a
9t(kG-IA) zz L&(IA) The next corollary extends the previous results.
Corollary 5 Let g : NO + NO be a mapping and q E Q, 0 < q rational number such that g(n) = Lqn] for almost all n 6 N, then
5 1, be a
2Tt(gG-IA)= TTt(idG-1A) 4
Nondeterministic Hierarchy
Definition 6 Let L g A* be a language over an alphabet A and 1 E NObe a constant. 1. Two words w and w' are 1-equivalent with respect to L if wwl E L
u w'wl E L for all w1 E A'
79
2. N ( n ,1, L ) denotes the number of 1-equivalence classes of words of length n with respect to L (i.e. Iww1I = n).
+
Lemma 7 Let g : No -+ No, g(n) 5 n 1, be a mapping. If L E L%(gG-IA) then there exist constants p , q E N such that
Proof. Let M = ( S ,6,dnd,6 d , S O , #, A,F ) be a real-time gG-IA which accepts L. We define 4 = max { Ibnd(S1,s2, s 3 , .)I Is1,S 2 , 5 3 E
s A a E A}
In order t o determine an upper bound to the number of 1-equivalence classes we consider the possible configurations of M after reading all but I input symbols. The remaining computation depends on the last 1 input symbols and the states of the cells -1 - 1,.. . , 0, . . . , I 1. For the 21 3 states there are 1S12'+3 different possibilities. Let p = ISI5 then due t o lS12'+3 = IS121.1S13 = (IS12)'.1S/3 5 (lS12)'.(IS13)' = (IS12.1S13)'5 p' we haveat most p' different possibilities for at most qg(n)different computation paths. Since the number of equivalence classes is not affected by the last 1 input symbols
+
*dn)
in total there are at most ( p ' )
= p'.qg(")classes.
+
0
The following result does not follow for structural reasons since there might be accepting computation paths of the fG-IA that cannot appear for the gG-IA. Therefore, the fG-IA must be able to verify whether or not its communication cell has performed g(n) nondeterministic transitions.
Theorem 8 Let f : NO + NO,f(n) 5 ,: and g : NO -+ NO,g(n) 5 f(n), be two increasing mappings such that V m, n, E N : f(m) = f ( n )==+ g(m) = g(n). If L, = ( ~ g ( ~ ) b f ( ~ ) - gI n ( ~ E) N} belongs to the family &(IA) then 2 T t
(SG-IA)
C -%t (fG-IA)
Proof. Let M be a real-time gG-IA that accepts the language L. A real-time f G-IA M' which simulates M works as follows. Since f 2 g M' can guess the time step g(n) and therefore simulate M directly. Additionally, M' has to verify that its guess was correct. Otherwise the computation must not be accepting. It is known that deterministic linear-time IAs can be sped-up to (2 . i d ) time [14]. Thus, L, belongs to 22id(IA). Now M' simulates such an acceptor M" on an additional track. During the first g(n) time steps M' simulates M" under the assumption that M" fetches input symbols a. From the guessed
80
time step g ( n ) up t o the last nondeterministic step f(n) M ' simulates M" under the assumption that M" fetches input symbols b, respectively, and during the last n - f ( n )time steps M' simulates M" without input. altogether M' simulates at least 2 . i d Due to the condition f ( n ) 5 time steps of M " . If M ' guessed g ( n ) correctly it simulates M" for the input ~ g ( ~ ) b f ( ~ ) - and, g ( ~ )hence, an accepting computation. On the other hand, if M' simulates an accepting computation then it guessed a time step t such that the input ~ ~ b f ( " belongs )-~ to L,. It follows t E { g ( m ) I f ( m ) = f ( n ) } and due to the assumption V m, n, E N : f ( m )= f ( n )==+ g ( m ) = g ( n ) it holds t = g ( n ) . Therefore, M' can verify whether its guess was correct and, thus, accepts L in real-time. The following situation may clarify the necessity of the condition V m, n, E N : f ( m ) = f ( n ) ==+ g(m) = g(n). Let m < n and f ( m ) = f ( n ) and g(m) < g ( n ) . Since c ~ g ( ~ ) b f ( " ) - g ( ~belongs ) t o L, the word ~ g ( ~ ) b f ( ~ ) - g ( ~ ) does. Consequently, for an input of length m the word c ~ g ( ~ ) b f ( ~ ) - would g ( ~ ) lead t o an accepting computation but since g(m) < g ( n ) the time step g might be guessed wrong. Now we are going to extend the previous result to a hierarchy of properly included language families. Theorem 9 Let f : No -+ No and g : NOt NObe two mappings which meet the conditions of Theorem 8. If additionally f E o(1og) and g E o(f) then
-Z-t(gG-IA)c -%t(fG-IA) Proof. We define a mapping h : NO t N by h(n) = 2 f ( n ) . h is increasing since f is. Moreover, since f E o(1og) for all k E Q, k 2 0, it .holds 2f(n) limn-,m = limn+m -= 0 and therefore h E o ( n k ) . Especially for Ic = $ it follows that themappingm(n) = max{n' E NOI ( h ( n ) + l ) . ( n ' + l5 ) n } is unbounded, and for large n we obtain m(n) > h(n). Now we define a language L that belongs to T,t(fG-IA) but does not belong to z,t(gG-IA).
L= { $ T ~ ~ $ ~ 2 $ . . . $ ~ j ~ y ~ ) 3 n E D J : j = E {h O( ,nl )} rAn ~( ni ) , l ~ i ~ j , A T = ~7, - ( h ( n ) 1 ) .(m(n)+ 1) A 3 1 5 i' 5 j : W ~ = I yR}
+
It follows that L is not empty (cf. Example 10). Assume now L E zTt(gG-IA). Then by Lemma 7 there exist constants p , q E N such that ~(n,m(n +)1,L) 5 p(rn(n)+l).q."'"'. Since g E o ( f ) for all IC E 9, IC 2 0, it k.g n 2E.d") = lim n+m = limn+m 2 f o = 0. Thus, we obtain holds limn+m
#
81
2"g E o ( 2 f ) = o(h). Therefore, for large n the number of equivalence classes is bounded as follows: N ( n ,m(n)+ 1,L ) 5 p ( m ( n ) + l ) . P ' " ' < -p 2 4 n - 210g ( p ).2. m ( n ).2'"9(4).9(" ) -
) 4 " )
Let k be log(q), then 21°g(q)'g(") = 2"g(") E o ( h ) . Now we can find a constant no such that for all n 2 no: 2 . log(p) . 2"g(") < ah(n). It follows
< 2m(n),h(n).$
210g(p).2.m(n).2'0g(P)'9(")
On the other hand, let for all n E N and for every subset U = ( w l , . . . ,w ~ ( of (0, l } m ( na) word u be defined according to u = $'w~$. . . $ w ~ ( ~where )$ T = n - ( h ( n ) 1) . (m(n) 1). Then for all y E (0, l } m ( n ) :
+
+
yE
u
-
uyRe E L
(2h:G))
Since there exist at least 2m(n) different words wi there are different subsets U . For every pair U , V of subsets one can find a wi belonging t o U \ V or t o V \ U . It follows UW?$ E L w VW?$ fj L and, hence,
): ;(
N ( n ,m(n)+ 1,L ) 2
=
2 4 7 4 . ( 2 4 7 4 - 1) . . . . . ( 2 4 7 4 - h(n)+ 1) h(n)!
( 2 4 4 - qn))h(") h(n)W From m(n) > h(n) for large n it follows 2m(n) - h(n) > - 2m(n).5. Thus
> - 2m(n).h(").; From the contradiction we obtain L fj -Y't(gG-IA). It remains to show L E -Y.t(fG-IA). An fG-IA M which accepts L has t o check whether j = h ( n ) ,whether all the wi are of the same length, whether T < h(n) (from which now follows that lwil = m ( n ) ) and , whether there exists an i' such that wit = yR. Accordingly M performs four tasks in parallel.
82
For the first task M simulates a stack and pushes a symbol 1 at every nondeterministic transformation. After the last nondeterministic transformation the pushed string is handled as a counter which is decremented every time step a new wi appears in the input. The decrementation starts for w2. The number of wis is accepted if the counter is 0 after reading the input because is the binary number 2 f ( n )- 1 = h ( n )- 1. For the second task M uses two more stacks. The subword w1 is pushed onto one of them. When M fetches w2 it pushes w2 to the second stack and pops w1 from the first stack whereby their lengths are compared symbol by symbol. This task is repeated up to wj. The third task uses another stack on which the first T symbols $ of the input are pushed. Subsequently for each subword ‘uli one of them is popped. The last task is to find an i’ such that W ~ = I yR. Here the nondeterminism is used. During the first f(n)nondeterministic steps a binary string is guessed bit by bit and pushed onto a stack. From time f ( n ) on it is handled as a counter which is decremented for every subword wi. If it is 0 the next word is pushed onto another stack. It will be popped and compared symbol by symbol when the word y appears in the input. Thus, the i’ is guessed during the nondeterministic transformations. 0
At first glance the witness L for the proper inclusion seems to be rather complicated. But here is a natural example for a hierarchy: Example 10 Let i > 1 be a constant and f(n) = log[il(n) and g(n) = l ~ g [ ~ + l I ( nThen ). by Theorem 9 we have Trt(gG-IA) c 9?t(fG-IA). Since 9Lt(IA) is identical to the linear-time cellular automata languages [22] and {anb2n-n I n E N} is acceptable by such devices { a g ( n ) b f ( n ) - g ( n ) I n E N} E Tlt(IA) holds. Moreover, from g E log(f) follows V m , n E N : f ( m ) = f ( n ) ==+ g(m) = g(n). Thus, the conditions of Theorem 8 are met. Trivially, g is of order o(f), E.g. for i = 2 we obtain m(4) = 0, m(8) = 1, m(l6) = 2, E L. m ( 3 2 ) = 4, and $01$11$10$00~11~ 5
Closure Properties
Besides that closure properties are interesting of their own they are a powerful tool for relating families of languages. Our first results in this sections deal with Boolean operations. Lemma 11 Let g : No -+ No and t : No -+ N, t(n) 2 n + 1, be two mappings. Then the family 9t(gG-IA) is closed under union and intersection and trivially contains 9t(IA).
83
Proof. Using the same two channel technique of [9] and [22] the assertion can easily be seen. Each cell consists of two registers in which acceptors for both languages are simulated in parallel. Now we turn to more language specific closure properties. For some functions g the families 5Yrt(gG-IA) are closed under concatenation and for some others are not. At first we consider the closure under marked concatenation. Lemma 12 Let g : N o + No be an increasing mapping such that the language { a g ( m ) b m - g ( m ) I m E N} belongs to 5YTi(IA). Then the family -YTt(gG-IA) is closed under marked concatenation. Proof. Let L1 resp. L2 be formal languages over the alphabets A1 resp. A2 which are acceptable in real-time by the gG-IAs MI resp. M z . Let L denote the marked concatenation of L1 and Lz: i.e.,
L = { W I C W Z I w1 E L1 and
202
E Lz}
where c 6 A1 U A2 is a marking symbol. A gG-IA M that accepts L in real-time works as follows. ATcA; is a regular language and, therefore, belongs trivially to 5YTt(gG-IA). Since 2Zrt(gG-IA) is closed under intersection (cf. Lemma 11) it is sufficient to consider inputs of the form ATcA; only. Let w = wlcw2 with w1 E A;, w2 E A;, and n1 = 1w11, n2 = 12021. Now the idea is as follows: On input w the array M simulates the behavior of MI (on input wl)until reading the marking symbol c and subsequently the behavior of Ma (on input 2 0 2 ) . M accepts w iff both simulations are accepting. The simulation of M1 can be performed directly since g is monotonically increasing and therefore g(n) 2 g ( n 1 ) . But the time step g ( n l ) has to be guessed and verified. In order t o perform this task an acceptor for the language L' = { a g ( m ) b m - g ( m ) I m E N} is simulated on an additional track in parallel. Thereby an input symbol a is assumed for each nondeterministic step (up to the guessed time g(n1)) and an input symbol b for each deterministic step (up t o the end of simulation at time n l ) . So the number 2 resp. y of simulated nondeterministic resp. deterministic transitions corresponds to a word azbY belonging to L' iff there exists an m E N such that 2 = g(m) and y = m - g(m). Thus, iff n1 = z y = g(m) m - g(m)= m. The simulation of M Zis performed similarly. However, a problem would arise with the nondeterministic transitions if g(n) < n1 + 1 g(nz). Therefore, during its nondeterministic transitions M uses a queue into which it
+
+
+
84
pipes nondeterministically chosen local transition functions corresponing to a possible nondeterministic transition of Ma (cf. the proof of Theorem 3). During the simulation of the nondeterministic transitions of M 2 these functions are successively extracted from the queue and applied to the communication cell. 0 The assertions of the lemma can essentially be weakened. Let h be a homomorphism such that h(z) = a for z # b and h(b) = b. Then instead of requiring L = {a9(")bm-g(") I m E N} to be acceptable in real-time by some iterative array it is sufficient to require that some language L' with h(L') = L belongs to T T t (IA). By $4' we denote the set of functions g : N + No, g(n) 5 n, such that there exists a language L' E Trt(IA) whose image under h is {ag(m)bm-g(m)I m E N}. So in fact any family TTt(gG-IA) where g E $4' is closed under marked concatenation. The usage of a marking symbol can be omitted if the limiting function g allows a gG-IA to determine a possible concatenation point by its own (for instance nondeterministically by using a b-ary counter). Hence, we obtain the following corollary.
Corollary 13 Let g : No + NObe a mapping with g E R(1og). If.&(gG-IA) is closed under marked concatenation then it is closed under concatenation. On the other hand, there exist functions g for which gG-IA is not closed under concatenation. The proof follows essentially an idea presented in [7] to show that the family A$t(IA) is not closed under concatenation. Theorem 14 Let g : NO + No, g E o(loglog), be a mapping. -Y,.t(gG-IA) is not closed under concatenation.
Then
Proof. Let A be the alphabet consisting of the four symbols 0 , 1,a, and b. Further let L1 = A* and denote by L2 the language of palindromes over A, i.e. the set of all words w over A which are identical to their reversals wR.As it has been shown in [7] L1 as well as L2 are belonging to TTt(IA) and thus to TTt(gG-IA). Consider now the concatenation L = L1 L2 and assume contrarily that L belongs to A!Tt(gG-IA), too. Then let W , = {Owl I w E {a,b},} for n E OJ and define for each subset U = {wl, . . . wk} of W , the word u as
u={
E thk
if U = @ otherwise
85
where the
u1,.
. . , U k are recursively defined by uo = E ,
ui+l
15 i = wi+lwRwi, R
5 m - 1.
One easily sees that lul = n(2k - 1) and that for all w E W, it holds w E U iff u w E L. Therefore (choosing k = n ) there are especially at least different n-equivalence classes with respect to L in the set of words of length n2n over A. Hence using the assumption on g we can work out a contradiction to Lemma 7 for a sufficiently large n. So L is not acceptable in real-time by a gG-IA, i.e. -YTt(gG-IA)is not closed under concatenation. 0
(z)
Note that one can additionally show that for g E o(log1og) the corresponding family Z T t (gG-IA) is not closed under marked iteration although it might be closed under marked concatenation. Theorem 15 Let g : No -+ NO,g E o(log), be a mapping. Then the family ZTt(gG-IA)is not closed under reversal. Proof. Consider the language L consisting of all marked concatenations of binary sequences of equal length where the first sequence occurs at least twice, i.e
L = {W1$. . . w k $ I k 2 2 A 3 m E N : ‘Wi E ( 0 , ,}I A325j5k:wl
15 i
5 k,
=wj}.
We are going to show that L belongs to -YTt(IA) 2 -YTt(gG-IA), but LR 6 -YTt(gG-IA). An iterative array M that accepts L in real-time works as follows. The communication cell is equipped with a queue through which symbols can be piped in first-in-first-out manner. At the beginning of the computation M stores its input symbols to the queue until the first symbol $ appears. Afterwards at every time step one symbol is extracted from the queue and compared to the current input symbol. At the same time step it is stored in the queue again. Thus, the symbols of w1 circulate through the queue and w1 is compared with all the wi, 2 5 i 5 k , serially. It remains to show that LR does not belong to 2’Tt(gG-IA). Let us assume that LR is acceptable by some gG-IA in real-time. Let us consider the equivalence classes N ( ( m 1)2, ( m l ) ,L R ) . For every pair of different subsets { X I , . . . ,x,} and {yl, . . . ,ym} of the set (0, I}, there are words $21. . . $x, and $ y l . . . $ym which belong to different such (rn 1)-equivalence classes. W.1.o.g. let 21 $! { y l , . . . ,ym}. Then $21 ... $ x m $ x l belongs to LR
+
+
+
86
(2)
+
whereas $y1. . . $y,$z1 does not. Hence, there are at least such (m 1)equivalence classes. Since f E o(1og) we obtain a contradiction t o Lemma 7 0 for a. sufficiently large m which concludes the proof. The last two results deal with the closure under homomorphisms. Theorem 16 Let g : NO+ NObe a mapping. If 9',t(gG-IA) C 2ZTt(idG-IA) then 9',t(gG-IA) is closed under &-free homomorphism iff z',t(gG-IA) = .YTt (idG-IA) .
Proof. One can show that the family 2',t(idG-IA) coincides with the closure of 9',t(IA) under &-freehomomorphisms and forms an AFL which is closed under intersection and reversal. Consequently 2',t (idG-IA) is closed under &-freehomomorphisms, too, implying the closure of 9 T t(gG-IA) under &-free homomorphism if 2+ (gG-IA) = 2',t(idG-IA) holds. On the other hand, since 9',t(IA) C .Y',t(gG-IA) it follows that the closure of &(IA) under €-free homomorphisms (which is dP,t(idG-IA)) is contained in the closure of 9',t(gG-IA). If the latter family is T',t(gG-IA) itself then it follows 2,.t(idG-IA) C 2',L(gG-IA) C 9',t(idG-IA), i.e T',t(gG-IA) = 9',t(idG-IA) 0 Corollary 17 Let g : No + No be a mapping. If 9',t(gG-IA) c ZTt(idG-1A) then 9',t(gG-IA) is not closed under &-freehomomorphism, homomorphism and &-freesubstitution and substitution. By Theorem 9 such functions exist. References
1. Beyer, W. T . Recognition of topological invariants by iterative arrays. Technical Report TR-66, MIT, Cambridge, Proj. MAC, 1969. 2. Buchholz, Th. and Kutrib, M. Some relations between massively parallel arrays. Parallel Comput. 23 (1997), 1643-1662. 3. Buchholz, Th., Klein, A., and Kutrib, M. One guess one-way cellular arrays. In: Proc. Int. Sym. on Mathematical Foundations of Computer Science (MFCS). LNCS 1450, Springer, 1998, 807-815. 4. Buchholz, Th., Klein, A., and Kutrib, M. Iterative arrays with limited nondeterministic communication cell. IFIG Research Report 9901, Institute of Informatics, University of Giessen, Giessen, 1999. 5. Buss, J. and Goldsmith, J. Nondeterminism within P . SIAM J. Comput. 22 (1993), 560-572.
87
6. Chang, J. H., Ibarra, 0. H., and Palis, M. A. Parallel parsing on a oneway array of finite-state machines. IEEE Trans. Comput. C-36 (1987), 64-75. 7. Cole, S. N. Real-time computation b y n-dimensional iterative arrays of finite-state machines. IEEE Trans. Comput. C-18 (1969), 349-365. 8. Culik 11, K. and Yu, S. Iterative tree automata. Theoret. Comput. Sci. 32 (1984), 227-247. 9. Dyer, C. R. One-way bounded cellular automata. Inform. Control 44 (1980), 261-281. 10. Fischer, P. C. Generation of primes b y a one-dimensional real-time iterative array. J . Assoc. Comput. Mach. 12 (1965), 388-394. 11. Fischer, P. C. and Kintala, C. M. R. Real-time computations with restricted nondeterminism. Math. Systems Theory 12 (1979), 219-231. 12. Hromkovic, J. et al. Measures of nondeterminism in finite automata. In: Proc. Int. Conf. on Automata, Languages, and Programming (ICALP). LNCS 1853, Springer, 2000, 199-210. 13. Ibarra, 0. H. and Jiang, T. On one-way cellular arrays. SIAM J. Comput. 16 (1987), 1135-1154. 14. Ibarra, 0. H. and Palis, M. A. Some results concerning linear iterative (systolic) arrays. J. Parallel and Distributed Comput. 2 (1985), 182-218. 15. Ibarra, 0. H. and Palis, M. A. Two-dimensional iterative arrays: Characterizations and applications. Theoret. Comput. Sci. 57 (1988), 47-86. 16. Kintala, C. M. and Fischer, P. C. Refining nondeterminism in relativized complexity classes. SIAM J. Comput. 13 (1984), 329-337. 17. Kintala, C. M. and Wotschke, D. Amounts of nondeterminism in finite automata. Acta Inf. 13 (1980), 199-204. 18. Salomaa, K. and Yu, S. Limited nondeterminism for pushdown automata. Bulletin of the EATCS 50 (1993), 186-193. 19. Salomaa, K. and Yu, S. Measures of nondeterminism for pushdown automata. J. Comput. System Sci. 49 (1994), 362-374. 20. Seiferas, J . I. Iterative arrays with direct central control. Acta Inf. 8 (1977), 177-192. 21. Seiferas, J. I. Linear-time computation b y nondeterministic multidimensional iterative arrays. SIAM J . Comput. 6 (1977), 487-504. 22. Smith 111, A. R. Real-time language recognition b y one-dimensional cellular automata. J. Comput. System Sci. 6 (1972), 233-253. 23. Terrier, V. On real time one-way cellular array. Theoret. Comput. Sci. 141 (1995), 331-335.
88
R-TRIVIAL LANGUAGES OF WORDS ON COUNTABLE ORDINALS OLIVIER CARTON Imtitut Gaspard Monge Universite' d e Marile-la- Valle'e, F-77454 Marne-la- Valle'e C e d e x 2, Prance, Email: 01 i v i e r . C a r t o n h n i v - m l v . f r , Url: h t t p :/ / w w w - i gm .un i v-m 1v .f r / - c a r t on/ Following the recently proved variet,y theorem for transfinite words we give, in this paper, three instances of correspondence between varieties of finit,e WI -semigroups and varieties of wl-languages. We first characterize the class of languages which are recognized by automata in which overlapping limit transit,ions end in t,he same state. I t turns out. that. the corresponding variety of w~-semigroupsis defined by an equation which has a topological interpretation in the case of infinite words. It characterizes languages of infinite words in the class A2 = l I z n C z of t,he Bore1 hierarchy. This result is used t,o prove that an wl-latiguage is recognized by an extensive automaton if and only if its syntacric wl-semigroup is R-uivial and satisfies the Az-equation. This result extends Eilenberg's result, concerning Rtrivial semigroups and extensive automata. We finally characterize wl-languages recognized by extensive automata whose limit transitions are trivial.
1
Introduction
Finite semigroups are the algebraic counterpart of automata. The first deep result using semigroup recognition is due to Schtitzenberger 14. He proved that the syntactic semigroup of a recognizable language L is finite and aperiodic (i.e. group-free) if and only if L is star-free, i.e., it belongs to the s~nallestclass of languages containing the letters and closed under product a i d finite boolean operations. The idea of using algebraic properties of syntactic seinigroups to classify recognizable languages was developed by Eilenberg *, wlro showed that there exists a one-to-one correspondence between varieties of finite semigroups (class of semigroups closed under taking sub-semigroups, quotients and finite direct products) and certain classes of languages, the varieties of Iaiiguages. This theorem is known as the variety theorem. Since that time the tlieory of varieties of recognizable languages has been widely developed (see and '). For instance, it has been shown by Eilenberg that a language is recognized by an extensive automaton if and only if its syntactic semigroup is R-trivial (see Chap. 10 in *). Furthermore, such languages can he described by very special rational expressions. Automata on infinite words were introduced by Biiclii 6. A few years later, Buchi extended this notion to ordinals '. The challeirge was tlieii to
89
extend the algebraic approach to infinite words in a first step a i d to orcliiials in a second step. For infinite words, there is now a rather satisfying theory The couiiterpart of culminating in the works of Wilke 1 5 , Perrin and Pin this theory for ordinal words was a bit slower to develop. Wojciechowski l 6 defined rational expressions and proved that they are equivalent to automata. The algebraic theory was first settled for ordinals less than w" and later extended t o countable ordinals in 5 . The key algebraic notioii is tliat of an wl-semigroup which extends the notion of an w-semigroup introduced in 15,'. Roughly speaking, an wl-semigroup is a structure in which the product of any sequence of a countable number of elements is possible. The variety theorem is also extended to words on countable ordinals in 5 . In this paper, we give three instances of correspondence betweeii varieties of finite wl-semigroups and varieties of wl-languages. We first, cliaixterize the class of languages which are recognized by automata in which overlappilig limit transitions end in the same state. It turns out that tlie corresponding variety of wl-semigroups is defined by an equation which has a topological interpretation in the case of infinite words. This equation characterizes languages of infinite words in the class A2 = rIgnC.2 of the Bore1 hierarcliy 15, We use this result to characterize wl-languages recognized by extensive automata. An wl-language is indeed recognized by an extensive automaton if and only if its syntactic wl-semigroup is R-trivial and satisfies the A.L-equtitioit. This result extends Eilenberg's result concerning R-trivial semigroups and extensive automata. We finally characterize wl-languages recogiiized by extensive automata whose limit transitions are trivial. The paper is organized as follows. Basic definitions of words, autoinata, rational expressions and wl-semigroups are recalled iii Sectioii 2. The tliree instances of correspondence are given in Section 3. ' 9 " .
2
Notation and Basic Definitions
This section is devoted to basic notation and definitions on ordinals, words, rational expressions, automata and wl-semigroups. 2.1
Ordinals
We refer the reader to l 3 for a complete introduction to the theory of orcliiials. An ordinal is a class for isomorphism of well-founded linear orderings. 11-1 this paper, ordinals are usually denoted by lower Greek letters like u, 8, y. An ordinal a: is said to be a successor if a = p+ 1 for some ordinal 8. An ordinal is either 0, a successor ordinal or a limit ordinal. As usual, we identify the
90
linear order on ordinals with the membership. A n ordiiial CY is ttieii identified with the set of ordinals srrialler than a. In this paper, we maiiily use ordiiials to index sequences. Let n be an ordinal. A sequence r of lengtll u (or ail a-sequence) of elements from a set E is a function which maps any ordinal y smaller that a to an element of E. A sequence r is usually denoted by x = ( z ~ ) ~ In < ~this . paper, we only use countable ordinals, except for w1 which denotes the first uncountable ordinal.
2.2
Words
Let A be a finite set called the alphabet whose elements are called k t t e r s . For an ordinal a , an a-sequence of letters is also called a word of length n or an a - w o r d over A . The sequence of length 0 which has no element is called t,he e m p t y word and it is denoted by A. The length of a word r is denoted by 111 '. For an ordinal a , Aa denotes the set of all words of' length a. The set of' all words of countable length over A is denoted by Ah. A subset of Ah is called a language or an w1 -language. Let ( z ~ ) - , n A < w ~ + * the i-th iterate of n-trace-w-power of XU { 1) is (U, 0 or in Choueka(Q,t) (i.e. Qt) if i = 0. When ( varies among strings for which q5(c) has fixed components p-1, . . . , p k + l , such (1 2k)-th components can be compared via the lexicographic t-power of if i > o or of if i = 0. The greedy ordering +$",: is defined from +$2Jy(in case i > 0) or from i.e. < (in case i = 0) as follows. To compare two different (wi+' 1)-Choueka-continuous sequences q, t, we consider the sequences q5(5) and q5(q), look at the first component on which they differ and compare (, q according to these components. Though the (1 2k)-th component of 4(E) lies in a set depending on (, such a comparison really makes sense. In fact, if q, E cannot be compared via their first 2k components, then their 1 2k-th components lie in the very same set Choueka(Q,y 1) (where y is of the form w2.t for some t 2 0). Clearly, is a total ordering on Choueka(Q, wi+l 1).
+ +
+
+
= (E t [ O , ~ l I , E t [ m , 4 , * * > E t [ % 4 , ~ m I ) of Choueka-continuous sequences of lengths wz*, wZ2,. . ., w Z m. We define the greedy ordering +gz,y on Choueka(Q, Q 1) as follows. To compare two different ( a 1)-Choueka-continuous sequences 7, we consider the m-tuples O(q) and O ( c ) , we look at the first component on which they differ, say it has rank j , and compare 0. We show that it will be valid for t 1, too. Let us consider s ( t ) = (L1( t ) ,. . . ,L p ( t ) ) Then, . by applying morphisms gi, 1 5 i 5 T , we obtain g i ( L i ( t ) ) = Ci(t)U Bi(t) where strings of Ci(t) are the correct words (with incorrect complementary words) and strings of &(t) are the incorrect words in g i ( L i ( t ) ) . Then, by means of protocol ( b ) , the set of strings at the ith component at the next state is Li(t 1) = h,(Bi(t)) Cj(t). We can easily observe that for Li(t) = Li(t) it holds that g i ( L i ( t ) ) = Ci)(t)U Bi(t) with Ci(t)= h,(Ci(t)) and Bi(t) = h , ( B i ( t ) )where , strings of Ci(t) are incorrect words from g:(L:(t)) and strings of B:(t) are the correct ones with incorrect complementary words. Then Li(t 1) = Li(t + 1) = B:(t) h , ( C i ( t ) )for each t = 0,1,2,. . . . Observe that h; is the identity. By the above equalities, we can easily notice that Li(t 1), t = 0,1,. . . , can be obtained as the result of computation step t of I" (which works with protocol ( a ) ) ,thus for each t = 0,1,2,. . . , state s ( t ) of r is equal' to state s ' ( t ) of I". Moreover, I" is an s-type N W D O L system. Now we prove the statement for x = a and y = b. Suppose that r uses protocol ( a ) for functioning and r', defined above, works with protocol (b). We f i s t show that the state of the ith component of I', 1 5 i 5 T , a t the tth step of the derivation, where t 2 0, is equal to the state of the ith component of I" at the same derivation step of the computation. As in the above case, we use induction by t. For t = 0 the statement is obvious. Let us suppose that the equality holds for some t , t > 0, and let s ( t ) = ( L l ( t ) ., . . ,L p ( t ) ) be the state of J? at the tth step of the computation. Then the next state of r can be calculated by using equality Li(t 1) = Ci(t)) h,(Bj(t)), where sets of strings Cg(t)and Bi(t),1 5 i 5 T , are defined in the same way as above, namely, Ci(t)is the set of correct words (with incorrect complementary words) and Bi(t) is the set of incorrect words in g i( L i( t) ) .Since g:(Lk(t))= Ci(t)UBi(t)with Ci(t)= h , ( C i ( t ) )and B:(t) = h,(Bi(t)) and L:(t) = Li(t), for state s'(t+l) = ( L i ( t + l ) ,. . . ,L i ( t + l ) ) of r' we have Li(t+l) = Li(t+l), 1 5 i 5 T . Thus, r' and have the same state sequences. Moreover, I?' is an s-type N W D O L system. From the above constructions we can see that r and r' have the same Watson-Crick road, moreover, the Watson-Crick road of the ith component of I? is equal to the Watson-Crick road of the ith component of I?', 1 5 i 5 T . Hence the result. The reader can notice that the N W D O L systems of Example 1 in the previous section were constructed according to ideas of the above proof.
+
+
+
+
+
145
5
String population growth in NWDOL systems
Networks of Watson-Crick DOL systems determine string set collections changing in time. One measure describing the dynamics of these collections is the number of strings present in the network (at some nodes, at a specific node) at a certain step of the computation.
Definition 5.1 Let = (X,$,(91, { A l } ) ,. . . , (gr, {A,.})), r >_ 1, be an NrWDOL system with protocol (x),x E { a , b}. Let s ( t ) = ( L l ( t ) ,. . . , L r ( t ) ) ,t = 0,1,. . . , be the state of r' at step t ofthe computation. Then function p : N -+ N defined by
c r
P(t) =
C U N L i ( t ) ) , t L 0,
i=l
is called the string population growth function of r. Function pi : N -+ N , where pi(t) = card(Li(t))for t = 0,1,. . . , is the string population growth function of node i of r, 1 5 i 5 r. Although the nodes can receive new strings by communication, the string population growth function of an NWDOL system is not necessarily monotonously increasing. The following simple example proves this statement.
Example 2 Let r = (C, 9, (91, { a l } ) ,( 9 2 , { m ) ) ,(93,{ a 2 } ) ) be a standard NWDOL system with protocol ( a ) , where C = { a l ,a2, G I , a z } . Let g l ( b ) = a l , for b E C, 92(a2) = g3(a2) = al, gZ(a1) = g3(al) = a 2 , 9 2 ( 3 1 ) = g 3 ( a l ) = a l , and g 2 ( a 2 ) = g 3 ( & 2 ) = a 2 . Then, it is easy to see that for the state sequence s ( t ) , t = 0 , 1,2,. . . , of the network the following holds: s ( t ) = ( { a l } ,{ a l } ,{ a l } ) for t = 2k 1, k 2 0 , and s ( t ) = ( { a l ,a2}, {az},{ q }for ) t = 2k, k 2 1.
+
We continue with another example we shall use in the sequel.
Example 3 L e t r = (E,d, (91, { a l } ) ,( g 2 , {ala27i3})) be a standard NWDOL system with protocol ( a ) , where C = { a1 ,~ 2 , 1 2 3 a1 , ,G 2 , a,}. Let g l ( b ) = a1 for b E C and g2(ai) = ai, 1 5 i 5 3. Let g ~ ( a 1= ) Sit&, and g2(u3) = a:. g2(62) = The first few steps of derivation result in the following strings at the nodes:
146
Then p ( t ) = 7 for 11 5 t 5 27. Let us discuss the functioning of this network. First, we can observe that the first node never emits any string in the course of computation, it is a black hole. The growth of the string population is due t o the second node which sends at some derivation steps one string t o the first node. The Watson-Crick D0L system of the second component WZ= ((C, 9 2 , a l a z a s ) , 4) was examined in details in [6],[7] and [9]. I n [9] it was proved that the Watson-Crick road of Wz is not ultimately periodic, there is an exponentially growing sequence of 0s between subwords 11. More precisely, after the first position bit 1 occurs exactly in positions 3i+' i and 3i+' i 1, for i 2 0. By the definition of the Watson-Crick road, this property implies that communication takes only place at steps 3i+1 + i and 3i+1 + i + 1, i 2 0 , in the course of computation in r, when the number of strings is increased by 1. ( T h e length of the string at the second node is monotonously increasing and the first node does n o t change the lengths of the strings it has, so any communication implies the increment of the number of strings at the first node.) The growth of the string population in the network can be obtained by function p ( t ) , where
+
p(t
+ 1) =
p(t)+l
+ +
i f t=Y+l+i-I otherwise.
OT
t=3i+1+ili>0
+
Now let us consider function d ( t ) , t 2 0 , with d ( t ) = p ( t 1 ) - p ( t ) . It can take values 1 or 0. ( d ( t ) gaves the tth bit of the Watson-Crick road of the second node.) If d ( t ) would be a 2-rational function, then 0s would occur in an ultimately periodic fashion among its values (Skolem-Mahler-Lech theorem, [ l l ] , pp. 58., Lemma 9.10). But this is n o t the case, thus, d ( t ) is not a Zrational function, which implies that p ( t ) is n o t Z-rational either. Moreover-, we can ea-sily observe that p l ( t ) , the string population growth function of the first component is not 2-rational either, but p Z ( t ) , the string population growth function of the second component is a 2-rational function, since it equals t o 1 f o r each t , t 2 0.
147
Let us now modify the definition of r to I?’ as follows: let g:(b) = 61 and gk(b) = gJ(b) for each b E C. Let us suppose that I?‘ functions with protocol (b). Then, as in the previous case, the first node is a black hole. T h e second node behaves like a ”dual” node of the second node of r, it issues a string at any step t of the computation with t # 3i+1 + i and t # 3i+1 i 1, where i 2 0. T h e n the growth of the string population of r’,p’(t), is as follows:
+ +
+
’) =
{ i’!: + 1
i f t=2+l+i-I otherwise.
or t = 3 i + 1 + i , i > 0
Analogously t o the above case we can show that function d’(t), t 2 0 , with d’(t) = p’(t 1) - p’(t) is not a 2-rutional function, which implies that p ’ ( t ) is not Z-rational.
+
This example leads to two important observations.
Theorem 5.1 Forx E { a ,b} there exists a n N W D O L system r with protocol (x)such that the string population growth function of r is not 2-rational. Proof. N W D O L systems I? and of the claim.
r’ of
Example 2 satisfy the conditions
Theorem 5.2 L e t r = (E,q5, (91, { A l } ) ,. . . , (g,., { A T } ) ) T, 2 1, be a n N T W D O L system with protocol x, x E { a , b } , such that the components ( g i , { A i } ) , i = 2 , . . . ,T are black holes, and the Watson-Crick road of the component (91, { A l } ) is not ultimately periodic. Moreover, for each i, where i = 2 , . . . ,r, let c a r d ( l i ( t ) ) < c a r d ( l i ( t l)), if communication takes place at derivation step t and c a r d ( L i ( t ) )= card(Li(t + 1)) otherwise, where Li(t) denotes the state of the ith node at derivation step t, t 2 0. Then the population growth function p ( t ) of r is not a 2-rational function.
+
Proof. We give the proof for the case of protocol ( a ) ,the other case can be treated in a similar way. If components (gi, Ai), i = 2 , . . . , are black holes, then they do not contribute to the increment of the growth of the string population by communicating strings to other nodes. Thus, at any step of the computation the first node has exactly one string. The string population in the system increases whenever the first node isssues a string, and this takes place exactly at that derivation steps when the string obtained by morphism 91 has to turn to its complementary word. Then the number of strings in the system in the course of the computation can be calculated as follows: P ( 0 ) = r,
148
P(t
+ 1) =
{ 1:; +
T
if the tth bit of the Watson - Crick road of component (91,w1) is 0 , - 1 otherwise.
Analogously to the considerations used in Example 2, we can show that p(t) is not a Z-rational function. Again, we consider function d(t) = p(t 1) - p(t). This function assumes only two values: 0 and T - 1. If it were Zrational, then the 0-s would occur in an ultimately periodic fashion among its values (Skolem-Mahler-Lech Theorem, [ll],pp. 58., Lemma 9.10.). But this is not the case if the Watson-Crick road of the component is not ultimately periodic.
+
6
Remarks on black holes
Communication in networks of Watson-Crick DOL systems raises a lot of intriguing questions. Among them a particularly interesting problem is whether or not a given network contains a black hole, that is, a node which never emits any string in the course of the computation. For networks working with protocol (a), this question is strongly connected with the problem of stability of WDOL systems. A WDOL system W is stable if the complementarity transition never takes place in its word sequence S ( W ) . In [7] it was shown that any algorithm solving the stability of a given standard Watson-Crick DOL system can be converted to an algorithm solving the problem Z,,, and conversely. For NWDOL systems, if a component i, 1 5 i 5 T, is a black hole in an NWDOL system r = (C,+, (91,{ A l } ) ,. . . , (g,., { A , . } ) ) ,T 2 1, working with protocol ( a ) ,then for all strings u E Li(t),t 2 1, where Li(t) is the state of the ith node at the tth step of the computation, it holds that Wi = ( ( C , g i ,u ) ,+) is a stable WDOL system, that is, +(g:(u)) = 0, k 2 1. For protocol ( b ) , if node i is a black hole, then Li(t) contains only "instable" strings, that is, +(gi(u)) = 1 for each string u in Li(t). As a direct consequence of the statement concerning the stability of Watson-Crick DOL systems above, we can state the following result.
Theorem 6.1 A n y algorithm f o r deciding whether a standard NWDOL syst e m working with protocol (a) contains a black hole can be converted t o a n algorithm f o r solving problem Z,,,. Proof. Let us assume that we have an algorithm A for deciding the existence of a black hole. Let us apply A to a network where there is only
149
one component in the system. Then, A solves the stability of an arbitrary standard WDOL system and, hence by the earlier result of [7], it can be converted to settle problem Z,,,.
References [l] E. Csuhaj-Varj6: Networks of Language Processors. EATCS Bulletin 63 (1997), 120-134. Appears also in Gh. P h n , G.Rozenberg, ASalomaa (eds.) Current Trends in Theoretical Computer Science, World Scientific, Singapore, 2001, 771-790. [2] E. Csuhaj-Varjli, A. Salomaa: Networks of parallel language processors. In: New Trends in Computer Science, Cooperation, Control, Combinatorics. (Gh. Pgun, A. Salomaa, eds.), LNCS 1218, SpringerVerlag, Berlin-Heidelberg-New York, 1997, 299-318. [3] Handbook of Formal Languages. Vol. 1-111. (G. Rozenberg, A. Salomaa, eds.) Springer Verlag, Berlin-Heidelberg-New York, 1997. [4] A. Salomaa: Formal Languages. Academic Press, New York, 1973. [5] G. Rozenberg, A. Salomaa: The Mathematical Theory of L systems. Academic Press, New York, London, 1980. [6] V. Mihalache, A. Salomaa: Watson-Crick DOL systems. EATCS Bulletin 62 (1997), 160-175. [7] V. Mihalache, A. Salomaa: Language-theoretic aspects of DNA complementarity. Theoretical Computer Science 250( 1-2) (2001), 163178.
[8] A. Salomaa: Turing, Watson-Crick and Lindenmayer. Aspects of DNA Complementarity. In: Unconventional Models of Computation. (C.S. Calude, J . Casti, M.J. Dinneen, eds.) Springer Verlag, Singapore, Berlin, Heidelberg, New York, 1998, 94-107. [9] A. Salomaa: Watson-Crick Walks and Roads on DOL Graphs. Acta Cybernetica 14 (1) (1999), 179-192.
[lo] W. Kuich, A. Salomaa: Semirings, Automata, Languages. EATCS Monographs on Theoretical Computer Science, Springer Verlag, Berlin, Heidelberg, New York, Tokyo, 1986
1 50
[ll] A. Salomaa, M. Soittola, Automata-Theoretic Aspects of Formal Power Series. Text and Monographs in Computer Science. Springer Verlag, Berlin, Heidelberg, New York, 1978.
751
On the Differentiation Function of some Language Generating Devices Jurgen Dassow Otto-uon-Guericke- Uniuersitat Magdeburg Fukultat fur Informatik PSF 4120, 0-39016 Magdeburg
Abstract: The differentiation function of a language generating device counts the number of words which can be generated by the device in a given number of steps. In this paper we summarize results on the differentiation function of deterministic tabled Lindenmayer systems, evolutionary grammars and context-free grammars. We present sharp upper bounds for the differentiation function, prove the closure under some algebraic operations, relate this function with other functions studied in formal language theory and consider decision problems for the differentiation function.
1
Introduction and Definitions
In biology, differentiation means the evolution of a variety of organisms which are modifications of the species from which they originate. The differentiation function gives the numerical size of the variety at a certain moment. Lindenmayer systems form a description of the development of (lower) organisms on the basis of formal language theory. The organisms are represented by words, and a derivation step corresponds to a step of the development, as division of cells or changes of the state of cells etc. Therefore it is natural to define the differentiation function of a Lindenmayer system which gives the number of words which can be obtained in a certain number of derivation steps from an axiom (representing the basic organism) in order t o reflect the biological differentiation. Analogously, the differentiation function of an evolutionary grammar which describes an evolutionary process on the basis of formal language theory gives the number of words (representing DNA sequences) which can be obtained in a certain number of derivation steps (representing mutations) from a set of axioms. In order to formalize this idea we introduce the notion of a language generating device. N denotes the set of natural numbers including zero. By # ( M ) we designate the cardinality of a set M . Given an alphabet V , V* denotes the set of all words over V including the empty word A. We set V + = V' \ {A}. By IwI we designate the length of a word w. A morphism h : V* + V* is a mapping with h(w1wz) = h(w,)h(wz) for all wl, w~E V * . -
A language generating device is a construct D V is an alphabet,
I,
= (V,
A ) where
152 -
==+c V' x V'
A
c V'
is a binary relation over V * , is a finite subset of V'.
For n 2 0, the language L,(G) generated by G in n steps is defined inductively as follows:
Lo(G) = A , L,(G) = {w I v
+w for some v E Ln-l(G)} for n 2 1.
The language generated by G is defined as
L(G)=
U Li(G). i>O
The differentiation function d c of G is defined as dG :
N
-+
N with d c ( n ) = #(L,(G))
The aim of this paper is to summarize results on the differentiation function of (deterministic tabled) Lindenmayer systems and (context-free) evolutionary grammars. In addition, we also present results on the differentiation function of context-free grammars where the definition has to be changed slightly by the distinction between nonterminals and terminals. We present sharp upper bounds for the differentiation function, prove the closure under some algebraic operations, relate this function with growth and structure functions which have already been studied in formal language theory and consider the decidability of the equality and boundedness of the differentiation function of given systems or grammars. Mostly, we only present ideas of the proofs or give partial proofs. For complete proofs we refer to [a, 3, 41.
2
Deterministic Tabled Lindenmayer Systems
A deterministic tabled Lindenmayer system without interaction (DTOL system for short) is a tripe1 G = (V,P, w)where - V is an alphabet, - P = { h l , h 2 , .. . ,h,} is a set of T- morphisms hi : V' -+ v 'and - w is a non-empty word over V . Intuitively, V encodes cells, the morphisms hi, 1 5 i 5 n, describe the developments of the cells in a certain environment (e.g. h(a) = aa for a cell a describes the division of a cell a into two cells of the same type) and w corresponds to the organisms from which the development starts. The derivation relation of a DTOL system G is defined as foliows: for two words v and v', the relation v ==+ w' holds if and only if there is an integer i, 1 5 i 5 n, such that v' = hi(.). Thus one derivation step corresponds to the application of one of the morphisms. Thus any deterministic tabled Lindenmayer system G corresponds to a language generating device D ( G ) = (V,*, {w}).We denote the associated languages and differentiation function by L,(G) and dc instead of L,(D(G)) and d D ( G ) , respectively.
153
By definition, dG counts the number of different words which can be derived from the start word w in exactly n steps. In biological terms, d~ gives the number of different organisms which can be obtained after n steps using the given sets of developmental rules. We now present two examples. First we consider the DTOL system
G = ( { a , b, c ) , {hl, h d , a ) with h l ( a ) = abc, h l ( b ) = b, hl(c) = c, h 4 a ) = ab'c, hz(b) = b, h 4 c ) = c . Then Ln(G)={ab"lcbi2c ...b i n c I i j E { 1 , 2 ) } and hence d G ( n ) = 2"
for n 2 1 .
Second, for k 2 1, we consider the DTOL system
G' = ( { a , b , a l , az,. . . ,ak}, { h ; ,h:},al) with h ; ( a ) = ab, h i ( a ) = ab', h:(b) = b, h:(ak) = a for i E { 1 , 2 } , h:(aj) = aj+l for i E { 1 , 2 } , j E { 1 , 2 , .
. . ,k - 1}
Then L,(G')
= {a,+l} for 0 Lk(G') = { a } , L,(G') = {ab"b " . . . . b " n - A
5 n 5 k - 1, ~ij~{1,2}}={a~~n-k k .
The following theorem gives an upper bound for the differentiation function of DTOL systems and shows that it is sharp.
Theorem 1 i ) For any DTOL system G with r morphisms, d G ( n ) 5 r" for n 2 0. ii) For a n y natural number r 2 1, there is a DTOL system G with r morphisms such that d G ( 7 ~ = ) r". Proof. The first statement follows easily since from any word we can derive at most r different words. The second statement follows by a generalisation of the construction 0 given in the first example above.
154 Theorem 2 Let f and g be differentiation functions of DTOL systems, and k
2
1 be a
natural number. Then
+ g, defined b y (f + g)(O) = 1 and (f + g ) ( n ) = f ( n )+ g ( n ) f o r n L 1,
-
f
-
f .9, defined by (f .9 ) ( n )= f ( n ) . g ( n ) f o r 12 L 0,
-
f[k],
Wried b y f [ k ] ( n )= f
) for n
z 0'.
are differentiation functions of DTOL systems, too.
Proof. We only give the proof for multiplication of functions and mention that also in the other cases the proofs are constructive. Let GI = ( V I ,{ h l , hz, . . . ,h , ) , ~ )and Gz = (VZ,{gl,gZ,.. . ,gm), W Z ) be two DTOL systems where we assume without loss of generality that Yl consider the DTOL system G = (Vi U Vz,{fi,j
where, for 1 5 i
n V, = 0.
We
I 1 I i I n, 1 I j I m ) ,W W )
5 n, 1 5 j I m, the morphisms f i , j
are defined by
It is easy to see that L,(G) = L,(G1). L,(G2) for n 2 0. Hence d c ( n ) = d c , ( n ). dG,(n) for n 2 0. 0
+
+
+
Corollary 3 Let p(x) = a,xm arn-1xm-' alx a0 be a polynomial such that all coeficients a;, 0 I i I m, are integers and a , > 0 . Then there is a constant c 2 0 and a DTOL system G such that dG(n) = p ( n ) for n 2 c . Proof. We prove the statement by induction on the degree m of the polynomial. For m = 0, i.e. p ( x ) = a0 the statement follows easily by considering
G = ( { a , b l , b2,. . . ,b,,), { h l ,hz, . . . ha,), a ) with h;(a)= b; and h;(bj) = bj for 1 5 i 5 ao, 1 5 j I a,,. Let p ( z ) = amtlxm+l amxm . . . a1x ao. Assume that a0 2 0. Then we set q ( x ) = am+lxm a,xm-l . . . azx a l . By induction, there is a DTOL system G such that dG(n) = q ( n )for n 2 c. Then we construct a DTOL system H with the differentiation function dG(n) . n ao. Obviously, dH(n) = p ( n ) for n 2 c. If a0 < 0. Then we construct g(x) = arn+lzm amxm-' . . . a z x ( a l - 1) for which some G with d c ( n ) = g ( n ) for n 2 c exists. Moreover, by the above example there is a DTOL system H with dH(n) = n a0 for n 2 a0 1. Thus there is a system F with d F = dG d H and therefore dF(n) = g(n) n - a0 = p(n) for n 2 c' for some c' 2 0. 0
+ + + + + +
+
+
+
+
+
[ uJ denotes the largest integer
+
n with n
5a
+
+ +
+
+
155
A DOL system is a DTOL system with exactly one morphism. The DOL system H = (V,h , w) generates in n steps the word h"(w) only. Therefore with a DOL system G we can associate the growth function gH : N
-+ N given by gH(n) = lh"(w)I.
The following theorem relates growth function and differentiation function to each other.
Theorem 4 Let H be a DOL system with a non-erasing morphism. Then there is a DTOL 0 system G such that dG = IJH. We note that there are growth functions of DOL systems which do not belong to the types of functions given in Theorem 1 ii) and Corollary 3 (see [5, 81).
Theorem 5 For two given DTOL systems, it is undecidable whether or not their differentiation functions are equal.
The proof can be given by a reduction to the Post Correspondence Problem.
3
Evolutionary Grammars
A (context-free) evolutionary grammar is a sixtupel G = (V,C, I , T , D , A ) where V is an alphabet, - C , I, T , and D are finite subsets of V * , - A is a finite subset of V'. The derivation relation of an evolutionary grammar G is defined as follows: for two words v and v' the relation v Id holds if and only if one of the following conditions holds:
-
c,
v = v1xv2, 0' = v1v2, x E v = v1xv2, v' = v1xRv2, x E I , v = ~ 1 x 0 2 0 3V', = viv2Xv3, x E T , - v = v1vZXvz, V' = v1xV2v3, x E T , - v = v1xv2, V' = V ~ X X V x~ ,E D. -
~
Intuitively, V encodes the DNA molecules, the sets C, I , T and D correspond to (large-scale) mutations which can occur in the evolution, C is the set of sequences which can be deleted, I is the set of sequences which can be reversed, T is the set of sequences which can be shifted (translocated) in the DNA strand and D is the set of sequences which can be duplicated, and A is a finite set of DNAs from which the evolution starts. Thus, in biological terms, dG gives the number of DNAs which can be obtained from a set of start sequences by a given set of mutations. Again, any evolutionary grammar G corresponds to a language generating device D ( G ) = (V,*,A), and we denote the associated languages and differentiation function by L,(G) and dG instead of L n ( D ( G ) )and d D ( G ) , respectively. We give two examples. 'The reversal zR of a word z E V' is inductively defined by XR = A, zR = z for zFzp for q , z 2 E v*.
( ~ 1 . 2 )= ~
1:
E V and
7 56
First, let
G = ( { a , b}, 0,0,0, {a'ba, a2ba2},{a2baz}) be a context-free evolutionary grammar where only duplications are allowed. Then
L,(G) = { , z b a ~ ~ f 2 b a i 2 f 2...ba""+2ba2 1 ij E {1,2}} and & ( n ) = 2" for n 2 0. Second, we consider the evolutionary grammar
G' = ( { a , b}, 0, { a a } , { b } , { a a ) , {aab}) for which
L,(G') = {arbas I r + s = 22,1 5 i 5 k} u { a Z k f 2 b }
and dGt(n) = 3
hold for n
+ 5 + . . . + (2k + 1) + 1 = (k + 1)'
2 0.
Again we start with upper bounds.
Theorem 6 i) For any evolutionary grammar G, there are constants c1 and dG(n)
c2
such that
5 c1 . c; for n 2 0 .
ii) For any evolutionary grammar G = (V,C , I , T ,0, A ) with an empty set of duplications, there is a constant c such that dG(n) 5 c for n 2 0 . iii) For any natural number c 2 1, there is an evolutionary grammar G such that dG(72) = C". Proof. i) For a given context-free evolutionary grammar G = (V,C, I , T , D , A ) we set r = max{IzI I I E A } and s = max{lyl I y E D}. Then 2) ==+ u implies lul 5 IuI + s. Thus, for n 2 0, IzI 5 r + n . s for z E L,(G). The number of words whose length is at most r n . s can be bounded in the required form. ii) follows from the fact that evolutionary grammars without duplications only generate finite languages. iii) can be shown by a generalization of the construction in the first example above. 0
+
Theorem 7 Let f and g be differentiation functions of evolutionary grammars. Then the functions
+
(f
f(.)
-f gJ defined b y -k !?)(n)= t g ( n ) f o r 2O J - f * , defined b y f * ( n )= f ( i )f o r n 2 0 , are differentiation functions of evolutionary grammars, too.
Proof. We give the proof for f * only. Let G = (V,C, I , T , D , A ) be an arbitrary evolutionary grainmar and let a be a letter not contained in V . Then we consider the evolutionary grammar H = (Vu{a},C,Z,T,DU{a},{a}A).
157
It is easy to show that
n
L,(H) = U{a'")L,-,(G) i=O
0
from which d~ = f * follows.
It is an open problem whether the set of all diffentiation functions of evolutionary grammars is closed under product. For evolutionary grammars we have only the following weaker form of Corollary 3.
Lemma 8 For any natural number m 2 1, there is an evolutionary grammar G such that dc(n) = o(nm). Proof. The statement follows from Theorem 7 and the fact that (nm)*= O(nm+').
4
a
Context-Free Grammars
Obviously, by the biological motivation the differentiation function of DTOL systems and evolutionary grammars is of interest and importance. This does not hold for the differentiation function of grammars defined with a linguistic motivation. However, we shall see that there are some results which give hints to an interest in such a function for linguistically motivated grammars, too. We restrict to context-free grammars in this paper. A context-free grammar is a quadruple G = ( N ,T , P, S ) where - N and T are disjoint alphabets, - S E N and - P = { ( A ~ , W I ) , ( A Z , W ~.).,(A,,w,) ,. is a finite set of pairs with A; E N and w; E ( N U T ) ' for 1 5 i 5 r . A context-free grammar is called linear, if all elements of P are of the form ( A ,w B v ) or ( A ,w)with A , B E N and w, v E T*. It is called regular, if all elements of P are of the form ( A , w B )or (A,w) with A, B E N and w E T'. The derivation relation of a context-free grammar G is defined as follows: for two v' holds if and only if words v and v', the relation v v = vlAv2, v' = v1wvz and (A,w) E P. We say that a derivation is leftmost if v1 E T* holds. Since in the theory of linguistically motivated grammars one is only interested in words over the terminal alphabet T , we modify the concept of a differentiation function of a context-free grammar as follows: Let v v' if there are words vo,v l r v2,,. . ,v, such that vj vj+l for 0 5 i 5 n - 1 and v = vo and v' = v,. Then we set
*
+
3
L ( G ) = (2 I S + z), L',(G) = L , ( G ) n T * , d G ( n ) = #(Lh(G)).
A context-free grammar G is unambigous if, for any word w E La(G),there is exactly w where any derivation step is leftmost. one derivation S
158
First we consider the context-free grammar
G = (IS),{ a , b, c>,{ ( S ,Sbc), ( S ,Sb'c), ( S ,abc), ( S ,ab'c)), S ) Then .
.
.
.
L,(G) = {Sb'"cb'"-'c. . . bi'c 1 i j E { 1,2} } U {abl"cb'"-' c . . . P l c p j € { l , 2 } } . . Lk(G) = {ablncbln-' c . . . bile I ij E {I, 2) } d ~ ( 0 ) = 0 and & ( n ) = 2 " for n 2 1 . As a second example we consider the context-free grammar G' = ( { S ,A ) , { a , b}, { ( S ,S ) ,( S ,a s ) , (S,bA),( A , a A ) ,( A ,A)), S ) for which
Lk(G') = {arbas 1 r + s = i , O 5 i 5 n - 1)
and
cz = n(n2
n-1
d ~ ( 0= ) 0
and & ( n ) =
.
-
~
1)
for n 2 1
i=O
hold. Concerning the upper bounds we have essentially the same situation as for DTOL systems. c such that d G ( n ) 5 c" f o r n 2 0. ii) For any natural number c 2 1, there is a regular grammar G such that d G ( n ) = cn f o r n 2 1. 0
Theorem 9 i) For any context-free grammar GJ there is a constant
Theorem 10 Let f and g be differentiation functions of context-free grammars. Then the functions
f
+ g J defined
+
+
O, b y (.f g ) ( n ) = f ).( g(n) f o r b y f [ k ] ( n ) = f ( I:])f o r 2 O, - f + , defined b y f * ( n )= EY=o f ( 2 ) f o r n 2 0, are differentiation function of context-free grammars, too. -
- f [ k ] J defined
Proof. Let dG = f With a
We only prove the second statement. G = ( N , T , P, S ) be a context-free grammar with the differentiation function . Then we construct the context-free grammar G' = ( N ' , T , P', S:) as follows. nonterminal A E N we associate the set N A = { A I , Az, . . . ,Ak-1). Then we set
N' = { S;, S;,...,SL-l}U
IJ N a , AEN
P' =
S ) ,(s;, s;), S ) , s3,.. . I(SL2,S ) ,( S L , SLlL (GI>S ) U { ( A ,A I ) ,( A i ,Az),( A z ,A ) ,. . . , Ak-i)} U ( ( 4 - 1 , I (4 ($7
U
($7
W)
AEN
W )
E
PI
159 Intuitively, besides a start phase which ensures that S' S for 1 I j < Ic, any derivation step in G is obtained by Ic derivation steps in G'. Thus, for n 2 1 and lcn 5 i < Ic(n + 1) and z E T * , S' z holds if and only if S =f+ z. This implies the statement. 0
3
Again, it is an open problem whether the set of differentiation functions of evolutionary grammars is closed with respect to product. For a language L , we define the structure function S L by SL : N
+ N and sL(n) = # ( { z I z E L , IzI = n } ) .
For facts on the structure function we refer to [I, 6, 7, 91. The following theorem gives a connection between structure functions of context-free grammars and differentiation functions of context-free grammars.
Theorem 11 i) For any context-free language L , there is a context-free grammar G such that dG = SL. ii) For any umambigous context-free grammar G , there is a context-free language L such that SL = d c . iii) If the differentiation function of a context-free grammar G is bounded b y a constant, then the structure function of L ( G ) is bounded b y a constant, too. Proof. i) The proof can be given by the use of the Greibach normal form for context-free grammars. ii) Let G = ( N ,T , P, S) be a context-free grammar. Let P = {pl,pz,., . ,p,}. We define the morphism h : ( N U T ) *+ N* by h ( a ) = X for a E T and h ( A ) = A for A E N and set G' = ( N ,{[il I 1 i i I n ) , { ( A ,[iIh(w)) I Pi = (A,w)), S)'
Obviously, if
is a leftmost derivation of a terminal word consisting of n steps (below the arrows we have given the applied element of P ) , then
s ===2 [il]Vl * [il][i2]2)2 * . . . ===+ [il][iZ]. . . [i,-l]V,-l ===+ [il][iZ] . . . [in-l][in]
(2)
is a terminating derivation in G'. Further, by the assumption of unambiguity, there is no other left derivation of w . Thus any derivation of w in G differs from ( 1 ) only in the order in which the rules are applied. Hence any derivation of type (2) corresponds to a derivation of w produces [il][iZ]. . . [in]. On the other hand, any word [il][iZ]. . . [in] E L(G') of length n is associated with a word w E L ( G ) which is derived by the left derivation where the production p;l , p i z , . . . ,pi,, are applied in succession. This implies a one-to-one mapping of the set of words w which can be generated in G in n steps and the words of length n which can be generated in G', i.e. #({w
Iw
E L(G'),IwI = n } ) = #({w
I S9
w,w E T*}).
160
Therefore
SL(G,)
= dG.
0
From the first two statements of the preceding theorem it follows that the sets of differentiation functions of unambigous context-free grammars and of structure functions of context-free languages coincide. We close this section with two (un)decidability results. Theorem 12 For two given linear grammars, it is undecidable whether or not their differentiation functions are equal. 0 Theorem 13 Given a context-free grammar G and a natural number c 2 1, it is decidable whether or not the differentiation function of G is bounded b y a constant c, i.e., d G ( n ) 5 c for n 2 1 .
Proof. Let G = ( N ,T , P, S ) be a context-free grammar. We introduce a new symbol 6 N U T , and define the context-free grammar H = ( N ,T U { x } , P', S ) with the set of rules P' = { A + a * z : A -+ a E P } . Obviously, a word w E T* can be derived in G in n steps iff H generates a word w' with A T ( W ' ) = w and n{x}(w') = P, where TY denotes the projection morphism on the alphabet Y . Similar to the proof of the decidability of k-slenderness for matrix languages (see [lo]), one shows that the language z
I P k ' ( H ) = { u J ~ # . . . w ~ #T:V ( W ~ E) L(G)(l 5 i 5 k), TV(W1)
# w(wj),T{z}(wi) = " { z } ( W j ) ( 1 5 i < j Ik)l
is a matrix language, and a matrix grammar generating L['k](H) can be constructed effectively. Clearly, the differentiation function of G is bounded by a constant c iff L[>"+'](H) 0 is empty which is decidable. It is an open problem whether or not Theorem 13 holds for DTOL systems and evolutionary grammars, too. Moreover, for all devices the decidable status of the question whether or not the differentiation function is bounded is not known, The most interesting open problem in the area of differentiation function is the characterizetion of the set of differentiation functions of devices of a given type. Here we have presented only upper and lower bounds and some special classes of functions which occur as differentiation function (see Corollary 3, Theorems 4 and 11 i) ).
References [l] A . BERTONI, M. GOLDWURM and N. SABADINI, The complexity of computing the number of strings of a given length in context-free languages. Theor. Comp. Sci. 86 (1991) 325-342.
[2] J . DASSOW,Eine neue Funktion fur Lindenmayer-Systeme. EIK 12 (1976) 515-521. [3] J. Dassow, Numerical parameters of evolutionary grammars. In: J . KARHUMAKI, H. MAURER,GH. P ~ U and N G . ROZENBERG (eds.), Jewels are Forever, SpringerVerlag, Berlin, 1999, 171-181.
161
[4] J. DASSOW, V. MITRANA, GH. P ~ Uand N R. STIEBE,On functions and languages associated with context-free grammars. Submitted. and G. ROZENBERG, Developmental Systems and Languages. North[5] G. HERMAN Holland, Amsterdam, 1975.
M. OKAMOTO and H. ENOMOTO, Characterization of structure[6] T. KATAYAMA, generating functions of regular sets and the DOL systems. Inform. Control 36 (1978) 85-101. The structure generating function of some [7] W. KUICHand R.K. SHYAMASUNDAR, families of languages. Infomn. Control 32 (1976) 85-92. and A . SALOMAA, The Mathematical Theory of L Syste,ms. Aca[8] G. ROZENBERG demic Press, 1980. [9] A. SALOMAA and M. SOITTOLA, Automata-Theoretic Aspects of Formal Power Series. Springer-Verlag, 1978. [lo] R. STIEBE,Slender matrix languages. In: G. ROZENBERG and W. THOMAS (eds.), Developments in Language Theory, World Scientific, Singapore, 2000, 375-385.
162
Visualization of Cellular Automata MBria Demkny, G6za HorvBth, Csaba Nagylaki and ZoltBn Nagylaki
1. Abstract Cellular automaton is a special sort of automata and heavily studied in automata theory [2]. Cellular automata consist of automata are put next to each other. They can be in a line, in a grid or even higher dimensional arrangements. These automata differ from the other well-known basic automata, firstly they have not got any tape. Secondly, the transition function is different, the new state of an automaton is determined by its current state and the current states of its direct neighbouring automata. These finite state machines work synchronously. In this paper we show a visualization of cellular automata in a three dimensional environment. We represent the automata with their properties. We discuss the aspects of designing and developing of an engine which simulates the work of automata. This engine applies the automata’s transition rules in each step. We analyze how the engine works and how the simulation and visualization are accomplished. The number and form of cellular automata, their initial states and the transition rules are required for this software as input. The output consists of those states of automata which states the automata had during the execution. Every state can be assigned any special coloured and sized shape supported by the three-dimensional environment. For each arbitrary cellular automata the most appropriate assignment can be done for better understanding the result of the run.
This work was supported by the Hungarian National Science Foundation (Grant No’s.: TO19392 and T030140).
163
2. Introduction 2.1. Cellular Automata
A cellular automaton is a discrete dynamical system. Space, time, and the states of the system are discrete. Each point in a regular spatial lattice, called a cell, can have any one of a finite number of states. The states of the cells in the lattice are updated according to a local rule. That is, the state of a cell at a given time depends only on its own state one time step previously, and the states of its nearby neighbours at the previous time step. All cells on the lattice are updated synchronously. Thus the state of the entire lattice advances in discrete time steps. To define a cellular automaton precisely, we have to give the following parameters of automata: - Dimension We will examine one, two and three dimensional cases. - Size of automata In one dimensional case it is a number of cells in the line of automata. In two dimensional case it is the width and length of the grid of automata. In three dimensional case it is the width, length and height of automata. - Set of states Set of states is usually an alphabet, but sometimes this alphabet can be a set of numbers. - Set of rules Rules are transition functions, which gives one state for each combination of the states of the cells and nearby neighbours. Usually specified in the form of a rule table. - Initial state We have to give the initial state for each cell of the automata. 2.2. Life
”Life” originally began as an experiment to determine if a simple system of rules could create a universal computer [l][5] [6]. The concept ”universal computer” was invented by Alan Turing and denotes a machine that is capable of emulating any kind of information processing by implementing a small set of simple operations. The inventor of Life, John Conway, sought to create as simple a ”universe” as possible that was capable of computation. What he found after two years of experimentation was a system consisting of a rectangular grid where each square could be in one of two states: on or off. He considered of them as cells, alive and dead. The rules of the system are
164
very simple: a cell survives if it has two or three living neighbours. A new cell is created on a ”dead” square if it has exactly three living neighbours. It is one example of the two dimensional cellular automata. In 1982 Stephen Wolfram, set out to create an even simpler, one-dimensional system. The main advantage of a one-dimensional automaton is that changes over time can be illustrated in a singe, two-dimensional image and that each cell only has two neighbours.
3. Visualization of Cellular Automata To examine cellular automata’s work is very important to see how they work. To visualize automata’s states we have two possibilities, as we can see in the following. We created a universal viewer, which can show all - one, two or three dimensional - automata’s states while working, in this two way. 3.1. Higher dimensional viewing
As we were seen in Wolfram’s one dimensional system, the one dimensional automata can be seen as a two dimensional picture. In this case the automata is a line of the cells, and we draw each line under the previous, when the automata’s states are changing. When we have two dimensional automata, we have to draw two dimensional grids next to each other, and finally we will receive a three dimensional image. If we use this representation method then we can see all the previous states of automata, and finally these states together form a higher dimensional still-image. 3.2. Same dimensional viewing
For investigating how Conway’s universe works, we have to show his two dimensional automata in two dimension, and only the last states in each time. In this case we receive a two dimensional moving picture, but we can see only the latest states of automata. Similarly we can see the one dimensional moving image (a line) in one dimensional case, and a three dimensional moving image in three dimensional case.
165
4. The engine Our engine is represented in a VRML environment. VRML is a very popular three dimensional Virtual Reality Modeling Language for the Internet users [3] [4]. This language has the indispensable means for representing a real 3D environment. It includes shapes, geometry, textures, lights, sound sources, transformations, interpolators, etc. These nodes can be grouped, embedded and linked to each other for forming sophisticated objects. It makes possible to create arbitrary static 3D world. Additionally, it contains objects for representing dynamic objects of the world. It includes translators, sensors, timers and event processing mechanism. Resulting that the ’move’ and ’change’ of the world can be applied. The VRML as a universal 3D environment supports all those requirements which are laid by the problem of visualization of cellular automata, even in the three dimensional case. Unfortunately, the VRML doesn’t have some basics, important programming possibilities, such as variables, functions, etc., but fortunately we can use JavaScript functions in our VRML program, so we have these means, which are important to compute the states of automata. The main concept of our program is to give a universal viewer. The program contents the following main parts: - The computing function, which is given in JavaScript, it computes the states of automata by applying the rules and gives the results to the visualization part. - The visualization part, which is written in VRML, it shows the automata’s work in the three dimensional environment. 4.1. Input parameters
There are declaration parts of the program, these contain all the data of automata and contain the information of visualization too. These data are the input parameters of the program. The declaration part for the automata consists of the following: - The states of automata. The states of automata can be natural numbers. They are listed together with the shapes which represent the state in the visualization part, therefore this input will be described in the visualization part. - The dimension of automata. It is a number, which can be 1, 2 or 3. This value must be assigned to the variable ”dim”. - The size of the automata. These are three natural numbers, we have to give the width, length and height of the automaton to the variables ” x n u m ” , ” y n u m ” and ” z num ” .
166
- The initial states of automata. We have t o give the initial states of each automaton to the ”ca” variable. The ”ca” variable is an array, we must list the states in the order of Z, Y and X increasingly. - The rule table. We have t o give the number of rules t o the variable ”rulenum”, and we have to give the rules of the automata t o the variable ”rule”. Rules contain the states of cells, and must be given in the following form: center Xleft Xright Yleft Yright Zleft Zright newstate Naturally, in the two dimensional case the automata have not got neighbours in the Z-direction, thus we don’t give the Z neighbours’ states. Similarly, in the one dimensional case we do not specify the Y neighbours’ states neither. There are two values with special roles. Firstly, we can use the -2 instead of some input state. The -2 means that this state should not be considered, it is a universal one. For applying a rule to an automaton the state of automaton regarding to the -2 valued state of the rule is not compared. Namely that automaton can be in any state nevertheless the rule can be applied. Secondly, the automata are expanded with one cell in each direction. All these cells in this frame has the state -1 permanently. Therefore for those rules which is intended to be applied for the automata on the border -1 is supposed to be used in the external direction. - The number of steps. We have to give to the variable ”stepsnum” it specifies the number of applying the rules. These input parameters must be given to compute the states of the cells of automaton. The declaration part for the visualization contains: - The cells’ shapes. This is a list of strings, ”cell-shape” is the variable’s name. Each string is a valid VRML source of a shape. These are the shapes which can be connected t o states of automata, therefore this is how a cell is visualized, it is the specification how it appears on the screen. There are some predefined shapes, naturally defining new shapes are also possible, but it requires basic knowledge of VRML. Each shape’s size is supposed to fit in the unit-cube. If the shape X, Y or Z size is bigger than one then the ” cellsizex” , ” cell-size-y” and ” cellsizez” variables must be set accordingly for forming a bounding box around the shape. Since these shapes are specified by VRML, arbitrary VRML objects can be used. - The connection table, it assigns one shape to each state of automata. The variable is called ”connect”, and contains pairs of numbers, where the first one is the state, and the second one is the index of the ”cellshape” list. Naturally, every state must be listed but the same shape can be assigned to different states. The -1 value has a special role, it is the void shape. If a state is connected to -1 then this state will not be visualized by a shape, namely
167
its space remains empty. - The number of these number pairs in "connect" has to assign to the "connectnum" variable. The declaration part for the technical input parameters contains: - In the case of two dimensional automata we have to give x n u m zeros to the "01d..row~~ variable. - In the case of three dimensional automata we have t o give x-num*ynum zeros to the "old-plane" variable too. These parameters are fields of the "SCRIPT" node which contains the Javascript part of the engine too. 4.2. Dashboard
For controlling the visualization a dashboard is created. The execution and visualization of the automata is conducted by the dashboard. - Gap. It specifies the X, Y and Z space between the cells. - Step. In the case of higher dimensional viewing, we can set the X, Y and Z space between the new and the previous shapes of states of cells. - Moving / Staying switch button. It switches the type of viewing. Moving is Higher dimensional viewing Staying is Same dimensional viewing - Continuous / Step by step switch button. There are two modes of play, this button switches between them. In the continuous play the automata's work is played as an animation, the shapes of new states are displayed after each other as the time passes. In step b y step mode we can see each frame of the animation individually. - Play button. It starts the animation. In the case of continuous play, a real animation starts, in the case of step by step play the rules are applied and the shapes of new states are displayed. - Stop button. It has effect during continuous play only, it stops the animation. - Speed. It also has effect during continuous play only it specifies the delay in seconds for displaying the next shapes of states of automata, namely the next frame of animation. - CA Step. It specifies how often the program visualize the states of automaton. Firstly, several states can be skipped. Secondly, it makes possible to visualize the states of automata after given steps every time. This parameter takes effect in both play modes. - Follow mode On / Off switch button. It has effect in the case of higher dimensional viewing. If it is set on then the shapes of new states are centered on the screen. Namely, the viewer slides paralelly to the shapes of states in
168
each step. - Redraw. The last visualized states are redrawn. It is useful a t the begining, when we set the gap. - Board Off. Removes the dashboard from the screen. It increases the screen’s region for viewing and reduces the load of the VRML-browser. - Board On. P ut the dashboard to the screen.
5. Plans - In our program the parameters are specified in the program-file. As an
enhancement the input can be read from files. Several file formats for specifying automata rules and states are used recently. Adding filters to the program for these formats are planned. - Sometimes the transition function is not specified by a rule table, but it is specified by formulas. - The automata’s states are visualized in forward direction. Moving in backward direction, namely the revisualization of previous steps can be made possible. In the case of reversible automata this feature can be added relatively easily. - Support for partial cellular automata is also planned.
6. References 1. ftp:/ /alife.santafe.edu/pub/topics/cas/txt/general.txt 2. Andreas Ehrencrona’s Cellular automata homepage http: / /cgi.student .nada.kth.se/ cgi-bin/ d95-aeh/get/lifeeng 3. Carey, R., Bell, G., Marrin, C.: ISO/IEC 14772-1:1997, Virtual Reality Modeling Language, (VRML97), San Diego Supercomputing (SDSC), 1997. http://www.vrml.org/Specifications/VRML97 4. San Diego Supercomputing Group, The Virtual Reality Modeling Language Version 2.0, ISO/IEC CD 14772 August 4, 1996. http://vrml.sgi.com/moving-worlds/spec/part1/ 5. Morita, K.: Cellular automata and artificial life - Computation and life in reversible cellular automata -, Proc. of the 6th Summer School on Complex Systems, Santiago, 1-40, 1998. 6. Morita, K. and Harao, M.: Computation universality of one dimensional reversible (injective) cellular automata, Trans. IEICE Japan, E72, 758-762, 1989.
169
Picture 1. The dashboard
Picture 2. One dimensional automaton, higher dimensional viewing
170
Picture 3. and Picture 4. Two dimensional automata, higher dimensional viewing
171
O N A CLASS OF HYPERCODES
BY Do Long Van' Institute of Mathematics P.O.Box 631 Bo Ho, 10 000 Hanoi, Vietnam Abstract. In this note we consider a special class of hypercodes whose elements are called supercodes. Characterizations of supercodes, maximal supercodes are established. Embedding a supercode in a maximal supercode is considered.
1. Preliminaries. Let A throughout denote an alphabet, i.e. a non-empty finite set of symbols called letters. We denote by A* the free monoid generated by A whose elements are called words over A . The empty word is denoted by 1 and A+ = A* - (1). The number of all occurrences of letters in a word u is the length of u, denoted by IuI. Any set of words is a language. A non-empty language X is a code if for any positive integers n , m 2 1 and for any X I , ...,x,, y1, ...,urn E X ,
x1 ...x, = y1 ...yrnj n = m and xi = yi for all i For further details and background of the theory of codes we refer to [BP, S]. Let u,v E A*, we say that u is a subword of v if, for some n 2 1, u = ul...u,, v = ~ 0 ~ 1...xu,x, 1 with u1, ...,u,,xo,x1 ,...,x, E A*. If ~ 0 x... 1x, # 1 then u is called a proper subword of v. A subset X A+ is a hypercode if no word in X is a proper subword of another word in it. Hypercodes have been considered by many authors [Tl,ST,Val,T2,S], and they have some interesting properties, in particular one has
c
Proposition 1.1 (see [S]) Every hypercode is finite .
As has been observed by several authors, many classes of codes can be defined by a binary relation (see [IJST,S]). Given a binary relation < on A*. A subset X A* is an independent set w. r. t. the relation < if any two elements of X are not in this relation. We say that a class C of codes is defined by 4 if these codes are exactly the independent sets w.r.t. 4 . Then we denote the class C by C+. Very often, the relation < characterizes some property (Y of words. In this case, instead of < we write +, and also C,
c
le-mail: d1vanQthevinh.ncst.ac.vn
172
stands for C+-. It is obvious that the class the relation 4 h given by 21 < h V
eS 3n
ch
of hypercodes is defined by
2 1 : 21 = u~uz...U(LnA V = X O U ~ X ~ U Z . . . Uwith ~ X ~ X o X i ...X n # 1.
Let < be a binary relation on A" and u ,v E A". We say that u depends on v if either u + v or v < u holds. Otherwise, u is independent of v. These notions can be extended to subsets of words in a standard way. Namely, a word u is dependent on a subset X if it depends on some word in X . Otherwise, u is independent of X . For brevity, the following notations will be used in the sequel u4 3v E x : u + v ; < u $3v E x :v 4 u ; Next, an element u in X is minimal in X if there is no word v in X such that v < u. When X is finite, by m a x X we denote the maximal wordlength of X . Now, for every subset X C A* we denote by D x , I x , L X and Rx the sets of words dependent on X , independent of X , non-minimalin I X and minimal in I x , respectively. In notations:
x
x+
Dx={uEA* Iu<XVX,(5,219 (4,3), (3,4), (2,6), (1,7), (0,s) is a complete chain containng S . The corresponding complete set of S' is u = {(6,0>,(5,2>,(4,3>,(3,4),(2,6), (1,7>,(0,8)}. So 1' = p - l ( U ) is a maximal supercode contaning X . More explicitely, Y = ~ ( 2with ) 2 = {u6,u5b2,a4b3,a3b4,u2b6,ub7,b8}.
s
References
[BPI J. Berstel, D. Perrin, Theory of Codes, Academic Press, Orlando, 1985. [HT] T. Head, G. Thierrin, Hypercodes in deterministic and slender OL languages, Infom. and Control, 45 (1980), 251-262. [IJST] M. Ito, H. Jiirgensen, H. J . Shyr, G. Thierrin, Outfix and infix codes and related classes of languages, J. Comput. and System Sci.,43 (1991),484508. [S] H. J. Shyr, n e e Monoids and Languages, Hon Minh Book Company, Taichung, 1991. [ST] H. J. Shyr, G. Thierrin, Hypercodes, Inform. and Control, 24 (1974), 45-54. [Tl] G. Thierrin, The syntactic nionoid of a hypercode, Semigroup Forum, 6 (1973), 227-231. [T2] G. Thierrin, Hypercodes, right-convex languages and their syntactic monoids, Proc. Amer. Math. SOC.,83 (1981), 255-258. [Val] E. Valkema, Syntaktishe monoide und hypercodes, Semigroup Forum, 13 (1976/77), 119-126. [Van] D. L. Van, The enibedding problem for codes defined by binary relations, Hanoi Institute of Mathematics, Preprint 98/A22, 1998.
183
A Parsing Problem for Context-Sensitive Languages - A Correction PA1 Domosil and Masami Ito' 'Institute of Mathematics and Informatics, University of Debrecen, Egyetem tkr 1, H-4032 Debrecen, Hungary email: domosiQmat h.klte.hu 2Faculty of Science, Kyoto Sangyo University, Kyoto 603-1555, Japan email: itoQksuvx0.kyoto-su.ac.jp X * be a language over a nonempty finite alphabet X. For a word Let L u E X * over X ,we denote the length of u by 1211. Moreover, S u b ( L ) means the set { p E X' : 3u, q , r E X*, u = qpr E L}. In this note, we will prove Theorem 1 using the following two lemmas. Lemma 1 ([a, page 841) Every context-sensitive language is recursive Lemma 2 ([2, page 891) Let X be a nonempty finite alphabet. Let L' X* be a type-0 (recursively enumerable) language. Moreover, let a , b $ X where u # b . Then there is a context-sensitive language L such that (a) L consists of words of the f o m aibp where i 2 0 and p E L', and (ii) for every p E L', there is an i 2 0 such that a'bp E L'. Theorem 1 Let L be a language and let f i ; : N -+ N be a function such that for any p E S u b ( L ) there exists a pair q , r with qpr E L and lqrl 5 fL(Ip1). Then there is a context-sensitive language which has no recursive function fL having this property. Proof. Let X = { c , d, . . .} and let M be a recursively enumerable set of positive integers that is not recursive. Moreover, let L' = {cnd : n E M } . Now let L be a context-sensitive language over { a , b, c, d, . . .} defined as in Lemma 2 and let f~ : N + N be a function stated in Theorem 1. Suppose f L is recursive. Since f L is recursive, for any positive integer k we can construct the language L k = {a"bckd : m 5 f L ( k 2)). If k E M , then, by Lemma 2 and the definition of f L , L r l Lk # 0. Conversely, if L fl Lk # 0, then bckd E S u b ( L ) , which implies k E M . Consequently, for a given positive integer k , k E A4 if and only if L f l Lk # 8. On the other hand, by Lemma 1, L is recursive. Thus it
+
184
is decidable whether L n L k is empty for a given positive integer k. Therefore, A4 is recursive, a contradiction. Hence fL is not recursive. This completes the proof of Theorem 1. Acknowledgement Prof. F. Otto a t the University of Kassel indicated the mistake in our proof for Theorem 1 in [I]. Moreover, he had a diffirent proof from our correction. We would like to express our gratitude for his indication and suggestion. References [l] P. Domosi and M. Ito, Characterization of languages by lengths of their subwords, in Semigroups (edited by K.P. Shum et al.), Monograph Series (1998) (Springer, Singapore), 117 - 129
[2] A. Salomaa, Formal Languages, Academic Press, New York, London, 1973
185
An Improvement of Iteration Lemmata for Context-free Languages1 PQ DOMOSI Institute of Mathematics and Informatics, L. Kossuth University Debrecen, Egyetem t6r 1, H-4032, Hungary e-mail:
[email protected] and Manfred KUDLEK Fachbereich Informatik, Universitat Hamburg D-22527 Hamburg, Vogt-Kolln-Str. 30, D-22527 Hamburg, Germany
[email protected] Abstract: An improvement of iteration lemmata is given for context-free languages.
1. Introduction In this paper we give an improvement of iteration lemmata for contextfree languages in [l,2, 6, 7, 10, 8, 12, 111. For all notions and notations not defined here, see [8] and [9, 11, 13, 141. An alphabet is a finite nonempty set. The elements of an alphabet are called letters. A word over an alphabet X is a finite string consisting of letters of X. For any alphabet X , let X * denote the free monoid generated by X , i.e. the set of all words over X including the empty word X and X+ = X*\ {A}. The length of a word w, in symbols )wI, means the number of letters in w when each letter is counted as many times as it occurs. Therefore, w has 1201 positions. By definition, 1x1 = 0. If u and 21 are words over an alphabet X , then their catenation uu is also a word over X . Especially, for any word This work was supported by DAAD, the Hungarian National Science Foundation (Grant No’s TO19392 and T030140).
186
uvw, we say that v is a subword of uvw.Let w be a word. We put wo= X and wn = wn-lw(n > 0). Thus w k ( k 2 0) is the k - t h power of w. A ( generative unrestricted, or simply, unrestricted ) grammar is an ordered quadruple G = (V,X , S , P ) where V and X are disjoint alphabets, S E V , and P is a finite set of ordered pairs (W,2)such that 2 is a word over the alphabet V U X and W is a word over V U X containing a t least one letter from V . The elements of V are called variables and those of X terminals. S is called the start symbol. Elements (W,Z)of P are called productions and are written W + Z . If W 4 2 E P implies W E V then G is called context-free. A word Q over V U X derives directly a word R, in symbols, Q + R,if and only if there are words Q 1 , Q2,Q3,R1 such that Q = QzQlQ3,R = Q2R1Q3 and Q1 R1 belongs t o P. The language L(G) generated by a grammar G = (V,X , S, P ) is the set L ( G ) = {w I w E X* and S ~ Wwhere , 3 denotes the reflexive and transitive closure of + . L C X* is a context-free language if we have L = L ( G ) for some context-free grammar G.
2. Excluded Positions in Derivation Trees For any word z E X*, and positive integer k 5 JzJ,we will speak about the kth position of z . Moreover, if z = a1 . . .a,, with a l , . . . ,an E X, then we say that ak is in the kth position of z . In addition, sometimes we will distinguish excluded and non-excluded positions of z . Finally, if a k , . . . , uk+e are in excluded positions of z then we also say that a k . . . a k + e consists of excluded positions. Given a context-free grammar G in Chomsky normal form, let T, be a derivation tree for some z E L(G).We say that a subpath of T, is external if its initial node is the root of the tree and its terminal node is either the first or the last position of z . In the same sense, we will speak about the external subpaths of a given subtree of T,. An intermediate node of T, is said to be a branch point if each of its children has an excluded descendant. On the other hand, define a node to be free if each of its children has no excluded descendant. ( Recall that G is in Chomsky normal form. Thus every node in T, has not more than two children. ) Of course, the leaves of T, are neither branch points nor free nodes. A subpath of T, is distinguished if a) its initial node is either a branch point or the root of the tree, and its terminal node is either a branch point or a single excluded position ( i.e. the left and right neighbours are non-excluded positions ); b) non of its intermediate nodes is a branch point; c) if it has no intermediate node then its initial node is the root of the tree and simultaneously not a branch point.
187
For a context-free grammar G in Chomsky normal form and a word z E L ( G ) , let T, be a derivation tree with z = z ~ w l z l ...wnzn, where wl, . . . ,w, denote ( possibly empty ) words consisting of excluded positions, and z1 , . . . ,z, denote ( possibly empty ) words having no excluded positions. A derivation tree T, is called minimal if all of its subpaths with the following properties : a) the terminal node is either a branch point, a non-excluded position, or a single excluded position; b) no other node is a branch point; have no 2 non-terminal nodes with the same ( non-terminal ) label. We start with the following
Lemma 2.1. Let T, be a minimal derivation tree f o r z E L ( G ) , and consider a n arbitrary distinguished subpath p . T h e n the free children of the intermediate nodes o n p have not more than 2lVl-' - 1 non-excluded descendants. I
Proof. Consider a subpath p' containing all nodes of p apart from its initial node if the initial node of p is a branch point. Otherwise, ( if the initial point of p is not a branch point, and then it is the root of the tree ) let us assume p' = p. Since T, is minimal, p' is not reducible. Consider the maximal derivation subtree Tzr of T, having the root as the initial node of p'. Omitting all of the descendants of the terminal node of p' ( and p ) from T,,, we get a subtree containing no path with distinct nodes having the same nonterminal label. Therefore, the subtree T,!, has not more than 2lVl-l leaves, where one of the leaves is the terminal node of p' ( and p ). Tzl1
0
Lemma 2.2. Let k be the number of the words in ( ~ 1 ,. .. , w,} consisting of two or more letters. Suppose that T, is a minimal derivation tree. T h e n T, has not more than Iw1 . . . w,J n + k - 1 distinguished paths.
+
Proof. a) Consider any block w with (w(2 2 consisting only of excluded positions. Then this w contributes at most 1wI distinguished paths. This is proved by induction on IwI. If JwI = 2, then clearly, w contributes a t most 2 distinguished paths. Assume that w1w2 with Iw1w21 = n contributes a t most n distinguished paths. Adding a new excluded position x gives w1xw2. If w1 = X then the path from x has to be joined on the left external paths belonging to w1w2. This can be done either above the highest branch point, or within some distinguished left external path. But then no distinguished
188
path below the join point can have left free children. Thus a t most 1 new distinguished path can be added, the join point becoming a branch point. Symmetrically, if w1 = A. If w1 # A, w2 # A, then x must be joint to some interior path, without left and right free children, either to a left or to a right external distinguished path. Again, at most 1 new distinguished path can be added, the join point becoming a branch point. b) Now consider the highest branch points coming from the blocks w, with lwil 2 2 as ’excluded positions’. With the same argument which has been used for the number of distinguished paths in [6, 71 one gets a t most 2(k ( n - k)) - 1 = 2 n - 1 distinguished paths. Adding those contributed by the k blocks wii with lwiiI 2 2 gives at most
+
2n - 1
+
k
(wijI
=
lwi,
. . - wi, I + 2 ( n - k ) + 2k - 1 = Iw1. . . w,I
+n + k - 1
j =1
distinguished paths. ( n - k is the number of blocks wi:, with
I w ~ , ~ )= 1 ). 0
Theorem 2.3. Given a context-free grammar G = (V,XIS, P ) in Chomsky normal f o r m and a word zowlz1 . . . Wnzn E L(G) with A $ ( ~ 1 ,... ,wn}, let k be the number of the words in (201, . . . ,w,} consisting of two or more letters. If T, is a minimal derivation tree for z , then the following holds : 1 . ~ 1 5 (2lv1-l - l ) ( l w l . . . w,I + n + k - 1) - n - k + 1.
Proof. Let T, be a minimal derivation tree of the word z = zowlzl . . . w,z,. Exclude positions in z such that w1 , . . . ,w, are ( possibly empty ) words consisting of excluded positions, and ~ 1 ,. .. ,zn are ( possibly empty ) words having no excluded positions. Then, by Lemma 2.2, T, has not more than l w l . . . w,I n k - 1 distinguished paths. On the other hand, using Lemma 2.1, for every distinguished subpath p of T,, the free children of the intermediate nodes in p have not more than 21vl-1 - 1 excluded descendants. Therefore, we have not more than (21vl-1 - l)(lwl . . . w,J + n k - 1 ) nonexcluded ( leaf ) descendants of all free children. Now we consider an arbitrary non-excluded position. If it is a descendant of a branch point then consider the last one with this property. It should have two children. Both of them should have an excluded descendant. Therefore, one of them is an intermediate point of a distinguished path of which the considered excluded position is also a descendant. On the other hand, if the considered non-excluded position is not a descendant of any branch point
+ +
+
189
then the root of the tree should not be a branch point. Therefore] either the considered non-excluded position is a descendant of a branch point or not, and then it i s a descendant of a free node. Since there are not more than (21vl-1 - l ) ( ( w l . . w,( n k - 1) nonexcluded ( leaf ) descendants of all free children] it follows immediately that I Z I 5 1w1.. . w , ~ (2l'l-l1)(lw1.. . w , ~ n k - 1) - 21vl-1(lw1.. . w,I n k - 1 ) - n - k 1.
+ +
+
+ + +
+ +
0
3. Context-Free Languages Now we show an improvement of Theorem 1.7 in [7].
Theorem 3.1. For a context-free grammar G = (V,X , S, P ) in Chomsky normal form and a word z = z o w l z l . . . wnzn E L ( G ) with A $ ( ~ 1 ,. . ,wn} letk bethenumberof a n d ) z )> (21vl-1-1)(Jwl . . .w , J + n + k - l ) - n - k + 1 , the words in { w l , , . . ,w,} consisting of two or more letters. There are words u,u,w , x , y , with z = uuwxy, and positive integers s, t with 0 5 s < t 5 n, u = Z O W l Z 1 . . . w,-lzs-lw,z;, u = z;, w = z~w,+1z,+1 . . . W t - l Z t - l W t Z ; , I 11 111 5 = z;, y = Z : I I W t + l Z t + l . . . w,z, ( 2 , = z;z:zr, Zt = ZtZt tt ), 1uwx( 5 2 . ((2I'l-l - l ) ( l w l . . . w,( + n + k - 1 ) - n - k + 1 ) ] lux( > 0 , and uuzwxiy E L ( G ) for every nonnegative integer i.
Proof. We consider the following cases. Case I. Suppose that T,, a derivation tree for z , is not minimal and denote by p one of its subpaths. Thus, there exist distinct nodes in p having the same ( non-terminal ) label, say A, moreover, two strings of terminals] v and z, and two nonterminals, B and C , such that the derivation A + BC + W A X is represented in T,. B and C cannot both dominate the lower A, therefore lux) > 0. On the other hand, since there exists no intermediate branch point of the distinguished paths, we have that neither u nor x contains an excluded position. ( Of course, the free children of the nodes of this path do not have excluded descendants. ) In other words, we obtain for an appropriate pair s,t of positive integers, u = Z O W l Z l . . . ws-1zS-1w,z;] 21 = z;, w = zyws+1z,+1.. . W t - l Z t - l W t Z ; , z = z;,y = t ~ w t + l z t + l ...w,zn, lvzl > 0 , and uuiwxzy E L ( G ) , i 2 0. In addition, by the derivation discussed above, A 5 BC + W A X ,we may assume the existence of derivations B + * z ' , C j *z" with z't'l = uwx such that the derivation subtrees T,,, T,,, are minimal. Therefore, we get (uwz( I 2 . ((2lv1-l - l ) ( ( w l . .wn( . n k - 1)- n - k I).
+ +
+
190
Case 11. Suppose that T, is minimal with respect t o 2 0 , w l , zl,. . . , wnrz,. Therefore, by Theorem 2.3, IzI < (2l"l-l - l ) ( l w l . . . WnI n+ k - 1) - n - k I), contradicting our conditions.
+
+
0
References [l]Bader, C., Moura, A. : A Generalization of Ogden's Lemma. JACM 29, no. 2, (1982), 404-407.
[a] Bar-Hillel,Y.,
Perles, M., Shamir, E. : O n Formal Properties of Simple Phrase Structure Srammars. Zeitschrift fur Phonetik, Sprachwissenschaft, und Kommunikationsforschung, 14 (1961), 143-172.
[3] Berstel, J., Boasson, L. : Context-free Languages., in Handbook of Theoretical Computer Sciences, Vol. B : Formal Models and Semantics, van Leeuwen, J., ed., Elsevier/MIT, 1994, 60-102. [4] Domosi, P., Ito, M. : O n Subwords of Languages. RIMS Proceedings, Kyoto Univ., 910 (1995), 1-4. [5] Domosi, P., Ito, M. : Characterization of Languages b y Lengths oftheir Subwords. Proc. Int. Conf. on Semigroups and their Related Topics, (Inst. of Math., Yunnan Univ., China,) Monograph Series, SpringerVerlag, Singapore, to appear.
[6] Domosi, P., Ito, M., Katsura, M, Nehaniv ,C. : A New Pumping Property of Context-free Languages. Combinatorics, Complexity and Logic (Proc. Int. Conf. DMTCS'96), ed. D.S. Bridges et al., Springer-Verlag, Singapore, 1996, 187-193. [7] Domosi, P., Kudlek, M. : Some New Iteration Lemmata for Context-free and Linear Indexed Languages, ( accepted for Publicationes Mathematicae, No 60 ). [8] Harrison, M. A. : Introduction to Formal Language Theory. AddisonWesley Publishing Company, Reading , Massachusetts, Menlo Park, California, London, Amsterdam, Don Mils, Ontario, Sidney, 1978. [9] Hopcroft, J.E., Ullman, J.D. : Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading , Massachusetts,
191
Menlo Park, California, London, Amsterdam, Don Mils, Ontario, Sidney, 1979. [lo] Horv&th, S. : A Comparison of Iteration Conditions o n Formal Languages. In Algebra, Combinatorics and Logic in Computer Science, vol. 11, pp. 453-464, Colloquia Matematica Societatis J’anos Bolyai, 42, North Holland 1986. [ll] Nijholt, A. : An Annotated Bibliography of Pumping. Bull. EATCS, 17
(June, 1982), 34-52. [12] Ogden, W. : A Helpful Result f o r Proving Inherent Ambiguity. Math. Syst. Theory 2 (1968), 191-194 [13] Rkvksz, Gy.E. : Introduction to Formal Languages, McGraw-Hill, New York, St Louis, San Francisco, Auckland, Bogota, Hamburg, Johannesburg, London, Madrid, Mexico, Montreal, New Delhi, Panama, Paris, Siio Paulo, Singapore, Sydney, Tokyo, Toronto, 1983. [14] Salomaa, A. : Formal Languages, Academic Press, New York, London, 1973.
192
Q U A N T U M FINITE AUTOMATA Jozef Gruska* and Roland Vollmar Faculty of Informatics, Masaryk University, Botanicki 68a, 602 00 Brno, Czech Republik Fakultat fur Informatik, Universitat Karlsruhe, Am Fasanengarten 5, 76128 Germany
Abstract Various quantum versions o f the most basic models o f the classical finite automata have already been introduced and various modes of their computations have already started t o be investigated. In this paper we overview basic models, approaches, techniques and results in this promising area of quantum automata that is expected t o play an important role also in theoretical computer science. We also summarize some open problems and research directions t o pursue in this area.
1
Introduction
Once an understanding has emerged that foundation o f computing has t o be based on the laws and limitations o f quantum mechanics, it has became natural t o turn attention also t o various quantum models of automata.
1.1
Goals of the research
The research in the area o f quantum computation models has several interrelated external and internal goals. 0
0
0
To get an insight into the power o f different quantum computing models and modes, using language/automata theoretic methods. To discover very simple models o f computation a t which one can proof large (or huge) difFerence in the power between quantum and classical versions o f automata. To determine borderlines between algorithmic decidability and undecidability for key algorithmic problems. To explore how much quantumness is needed and how pure it has t o be, in order t o have (quantum) models of computation that are more powerful than classical ones.
* The paper has been written during the first author stay with University of karlsruhe, Department of Informatics, in summer 2000. Support of the grants GACR 201/98/0369, CEZ:J07/98:143300001 and VEGA 117654120 is to be acknowledged.
193 0
0
0
0
To develop quantum automata (networks, algorithms) design and analysis methodologies. To explore mutual relations between different quantum computation models and modes. To discover, in a transparent and elegant form, limitations of quantum computations and communications. To explore how much o f quantum resorces are needed t o have quantum models (provably) more powerful than classical ones.
Main models of quantum automata are a natural quantum modification of the main classical models o f automata. 1. Quantum (one-tape) Turing machines (QTM). They are used t o explore, a t the most general level o f sequential computation, the potential and limitations o f quantum computing. Using this model the main computational complexity classes are defined. ( Q T M can be seen as the main quantum abstraction o f the human computational processes.)
2. Quantum cellular automata (QCA). They are used t o model and t o explore, on very general and basic level o f parallel computation, the potential and limitations of quantum computing. (QCA can be seen as a very basic quantum abstraction of computation by nature.
3. Quantum finite automata (QFA). They are considered t o be the simplest model o f quantum processors, with "finite" quantum memory, that models well the most basic mode o f quantum computing - a quantum action is performed on each classical input. Their classical variants are usually denoted by QFA or 1FA.
4. Almost finite quantum automata They are again modifications of the classical models. The needs for introducing them are the same as in the classical case. Attempts t o generalize or simplify main models t o get models for which one could extend results or for which one could get results not obtainable or not known t o be true for more general models. In addition, the classical model of push-down automata has a very strong motivation in connection with recursive programmimng. One natural approach t o design such models is to add additional tapes (of a special access). The main models to consider so far are: quantum multi-
tape finite automata, quantum counter finite automata, quantum pushdown automata
2
Classical Reversible Finite Automata
The very basic definition of a reversible finite automaton A is that of a deterministic finite automaton a t which t o each state q and any input a there is
194
at most one state q' such t h a t under the input a the automaton A gets from the state q' into the state q . A special case are totally reversible finite automata, called also group automata, a t which t o each state q and any input symbol a there is exactly one state q' such that under the input a the automaton A gets from the state q' into the state q. The power of reversible finite automata as acceptors depends on how many input and output states we allow, whether only one or many. In none of the cases the language {0*1*} is acceptable by a reversible finite (one-way) automaton. (However, this language can be accepted by reversible twoway automata and also by reversible push-down automata.) Concerning the acceptance, the most interesting case seems t o be the one a t which several input and also output states are allowed. Let us denote such a model as RFA. A detailed study of the power of RFA was done by Pin (1987). He gives the following characterizations of languages L accepted by RSA.
Theorem 2.1 (Pin, 1987') If L is a regular language, then the following conditions are equivalent.
1. L is accepted by a reversible finite automaton (with a set of initial states and a set of final states).
n C * , where K is a subset of the free group F ( C ) , consisting of a finite union of left cosets of finitely generated subgroups of F ( C ) .
2. L = K
3. The idempotents of the syntactical monoid M ( L ) of L commute and, f o r every x , u , y E C * , xu*y E L implies xy E L. 4. The idempotents of M ( L ) commute and, for everg s, t, e E M ( L ) such that e is idempotent, set E P implies st E P , where P is the image of L in
M(L). 5. The idempotents of M ( L ) commute and L is closed in the free-group topology. Languages accepted by group automata are also well understood. They are exactly languages the syntactical monoids o f which are groups. As pointed out by K. Paschen, for some languages there are two minimal reversible automata that are not isomorphic. This results also indicates that reversible finite automata are far from being so easy t o handle as the ordinary finite automata. A method how t o design a circuit out of (reversible) Fredkin gates that implements a given RFA was developed by Morita (1990).
195
3 Abstract Approaches to Quantum (Finite State) Sequential Machines Several general approaches have been developed by Gudder (2000). The first o f them is the concept of the quantum transition machine (QTRM) M = (X, Iqo), U ) , where IFI is a Hilbert space, Iqo) is an (initial) state and U is a unitary transformation. Closely related is the concept of quantum sequential machine (QSM) M = (Q,qo,d), where Q is the set of states, qo E Q and 6 : Q x Q 4 C is a transition mapping such that the following well-formedness condition
is satisfied. This condition guarantees unitarity o f the corresponding evolution o p erator 9‘
A t this approach the corresponding Hilbert space has the basis { 14) I q E Q} and the initial quantum state is lo). The above concepts of quantum machines can naturally be extended in two ways. We can consider a set Qt of terminating states in the case of QSM and a s u b space of terminating states Ht in the case of QTRM. A computation is considered as terminating if it gets into a terminating state. The second way o f generalization is t o consider transition mapping as being dependent also on input symbols. This can then naturally be combined also with the case of having terminating states. There is a close relation between the above two concepts o f quantum sequantial machines. Indeed, t o each quantum sequential machine we can construct an equivalent quantum transition machine by considering as the corresponding Hilbert space Zz(Q) and as the unitary operator the evolution operator defined above. Conversely, suppose that M = (Q, qo, U ) is a quantum transition machine. Let B be an orthonormal basis for H that includes the state qo and let us define 6(q, 4’) = (Uqlq’). (That is a quantum transition machine can be seen as a quantum sequential machine at which computational basis is unspecified.)
3.1
Quantum finite automata - a general scheme
Several, nonequivalent, models of quantum finite automata have been introduced. Most o f them have the following basic componenets: Input has the form #wl . . . w n $ or, shortly, #w$,where w E C*, IwI = n, C is an input aphabet amd {#,$} are endmarkers. The set of states Q = Qa U QT U Qn is composed of the accepting states, Qa, the rejecting states, Q T , and the nonhalting states, Qn.
196
A configuration is a pair (QlI ) - a state and a position on the input tape. The set of configurations has the form C(Qlw ) = { ( q , i )I q E Q10 5 i 5 IwI 1) This is used t o introduce the corresponding Hilbert space: Zz(C(Q,w ) ) Transition mappings 6 are defined as follows
+
q'EQ,Wj 0, if A accepts (rejects) any z E L (z $ L) with probability a t least $ E . I f there is an E such that A accepts L with probability a t least $ E , then A is said t o accept L with bounded error probability. A language L is accepted by A with unbounded error probability, if z E L (z $ L ) is accepted (rejected) with probability a t least $. The acceptance with bounded error probability is considered t o be the main and the most realistic one because it is robust with respect t o small errors. The acceptance with respect t o unbounded errors can lead t o unrealistic conclusions. Finally, let us denote by B M O (BMM) the family of languages accepted by MO-1QFA (MM-1QFA) with the bounded error and by UMO (UMM) the family of languages accepted by MO-1QFA (MM-1QFA) with the unbounded error.
+ +
4.2
+
Example - hierarchies of languages
We present now a simple, but tricky, example, due t o Ambainis and Freivalds (1998), of an interesting lQFA A:
States: Q = ( 4 0 , qi, q 2 , qa, q r } , Qa = { q a } , Qr = { q r } . Transitions:
vo141) = (1 -P)141) VOlQ2)=
K14d
=
I%-),
+ h F i q l q 2 ) + Jirlqr),
d i F 3 I q 1 ) +P142) - Ji--plqr),
W q 2 ) = lq2),
&Id=
IQP),
&lq2) =
I&).
The remaining transitions are defined arbitrarily t o satisfy the unitarity condition for mappings V,. The automaton A can quite well recognize the following languages L,, where L1 = O* and for n > 1
L,
=
{z;l;
. . .1;
1122-1
= 0,122 = 1)
198
Indeed, as shown by Ambainis and Freivalds (1998, for the case n = 2), and by Kikusts and RasEevskis (2000, for the general case), the language L , can be accepted by the automaton A with probability p , where p s + p = 1. In addition, (Li},”=, represents a sequence of langauges such that each next of them can be recognized only with probability smaller than the previous ones and these probabilities converge, from the above, to f. More exactly, it holds,
Theorem 4.2
(Kikusts and RasSEevskis, 2000)
1. T h e language L, can be recognized with probability
f + for a constant c, $ + -3m.
but cannot be recognized with probability greater than
+
gn2k_l 2. I f we put n1 = 2 and, f o r k > 1, n k = 7 1, and we define pk = $ then f o r every k > 1 the language L,, can be recognized by a MM1QFA with the probability p k , but cannot be recognized by a MM-1QFA with probability pk-1.
+ 6,
Remark 4.3 Actually, the above theorem has been first shown, by Ambainis et al. (1999), f o r the sequance of languages L , = a;aa. . . a;, over the increasinly large alphabets (a1 azl . . . ,an}. I n the classical case, it is usually straightforward to transform a many-letter alphabet result of such a type to a two-letter alphabet case. I n the quantum case no straightforward techniques to do that are known Remark 4.4 The above result holds only f o r the MM-mode of computation on IQFA. I n the case of MO-mode, it holds, similarly as in the classical case, that once a lagauge can be cacepted with the probability p > $, then it can also be accepted with probability 1 2 p’ > p . 4.3 Limitations on the probability acceptance The folllowing is the very basic result used to show some limitations for languages that can be accepted by MM-1QFA.
Theorem 4.5 (Ambainis and Freivalds, 1998) Let L be a regular language and A a minimal DFA for L with the transition function 6 and a set of accepting states. Let there be in A states 41, q2 and an input word w such that: (a) q1 # 92; (b) 6 ( q l ,w ) = S(q2, w ) = 92; (c) 92 is neither “all-accepting” nor “all-rejecting” state. Then L cannot be accepted by a MM-1QFA with probability at least + E , f o r any E > 0.
5
The minimal automaton for the language {0*1*} clearly contains the above trouble making construction and therefore it holds.
199
Theorem 4.6 (Ambainis and Freivalds, 1998) There is a regular language that can be recognized by a MM-1QFA with probability 0.68.. ., but neither by E nor by RFA. MM-IQFA with probability at least
i+
In addition, it holds.
Theorem 4.7 (Ambainis, Freivalds, 1998) If L is a regular language and A its minimal automaton with n states. If A does not contain the “orbidden constructi on” of Theorem 4.3, Then L can be recognized by an R F A with 0 ( 2 n ) states. The basic idea behind the lower bound proofs o f Theoerm 4.3 is the fact that the minimal DFA for the langauge L , contains n - 1 o f “forbidden constructions and each one decreasers probability with which the language can be accepted by a MM-1QFA.
4.4
Characterizations
In the case o f MM-1QFA a nice characterization is know only for langauges accepted with high probability.
Theorem 4.8 (Ambainis and Freivalds, 1998) A language can be recognized E , E > 0, if and only if it is accepted by a by a MM-IQFA with probability RFA.
i+
Another quite nice characterization is known for langauges accepted by MO1QFA.
Theorem 4.9 (Brodsky and Pippenger, 1999) The class B M O is exactly the class of group languages and therefore a proper subclass of the class of regular languages.’ 4.5
Closure properties
Let us first summarize the closure properties for the class BMO. I t holds
Theorem 4.10 (Brodsky and Pippenger, 1999) The class B M O is closed under Boolean operations, inverse homomorphism and word quotients, but not under homomorphism. On the other hand, the class UMO contains also non-regular languages. For Also the class UMO is example, the language {wllwlo = IwI1,w E {O,l}*}. closed under Boolean operations, inverse homomorphism and word quotients, but It is the class of languages accepted by group finite automata, or, equivalently, the class of regular languages syntactical monoids of which are groups.
200
not under homomorphism (Moore and Crutchfield, 1997). However, no precise characterization of this class is yet known. Less is known about the classes BMM and UMM. Both of them are closed under complement, inverse homomorphism and word quotient and it is known that the class U M M is not closed under homomorphism. Moreover, it has been shown by Valdats (2000) that the class o f languages accepted by l Q F A is not closed under union and, actually under any binary Boolean operation. Namely, he has showed t h a t the languages L1 = (aa)*bb*a(b*ab*a)*b*U (aa)* and L a = aL1 can be accepted by l Q F A with probability $, but their union is not acceptable by a 1QFA. In addition, Valdatas (2000) has shown that the above example represents a border case in the following sense. If two languages L1 and L 2 can be accepted by 1QFA with probabilities p1 and p2 such that < 3,
$+&
then their union is accepted by a lQFA with probability
p l ~ ~ ~ ' l p 2 .
4.6 Succinctness results Even if recognition power of l Q F A is not impressive, quite a different situation is with their descriptional power. However, even from this point of view a comparison between quantum and classical automata is not conclusive. For some langauges we can have exponentially more succinct description using 1QFA than using lFA, but in other cases the situation can be reverse. Here are the main results in this area.
Theorem 4.11 (Ambainis, Freivalds, 1998) 1. For a n y p r i m e p any DFA recognizing the language L, = { a i I i is divided b y p )
has to have at least p states, but there is a MM -1QFA with O(1gp) states recognizing L, . 2. For any integer n, each DFA recognizing the language L, = {On}, containing the single string, has to have n states, but there is a MM-1QFA with O(lg n) states recognizing L,. The proof of the first part of the Theorem 4.6 actually contains a method how t o accept the language L, using a lQFA with only O(1gp) states. An interesting analysis o f this method has been peformed by Berzina e t al. (2000). They show that even using only a 7 qubit computer it is possible to recognize the language L1223, as the special case of the languege L,, with probability 0.651. This is quite surprising because the method involved seems t o require t o work with very large primes.
Thorem 4.12 (Ambainis et al. 1998, Nayak, 1999) For each integern there is a DFA of size O(n) recognizing the language L, = {wO I w E (0, l}*,Iw( I n}, but each MM-IQFA recognizing L, with probability greater than f has to have 2"(n) states.
20 1 We can conclude that in some cases, it seems that due t o quantum parallelism, lQFA can be much smaller than their classical counterparts and in some cases, it seems that due t o the requirement on unitarity (reversibility), this is just the opposite way. Also the following result indicates that. By Ambainis and Freivalds (1998), there is for any integer n a language t h a t can be accepted by a DFA with O(n)states but each RFA accepting t h a t language has t o have O(2,) states. For a more detailed treatment o f the problem o f succinctness in quantum computing see [16].
4.7
Lower bounds methods
As usually, it is far from trivial t o show such sharp bounds on succinctness as presented in theorems above. So far three methods have been used t o do that. 0
0
Classical computing methods. In the case of the last theorem we can argue as follows: Since the language L, is finite, there is a RFA accepting it. It is easy t o see t h a t each RFA for L , has t o have O(2,) states. Indeed, due t o t h e reversibility requirement, each state has t o encode the whole input that brings the automaton t o that state. Since there are 2, of possible inputs, the total number o f states has t o be O(2,). Probabilistic computing methods. They are methods urrr t o show lower bounds for randomized computations, especially methods used t o show lower bounds for probabilistic automata.
0
Random access coding method The above idea does not seem t o be applied easily t o lQFA for L, for a t least two reasons. A lQFA can accept an input, with a certain probability, without reading the whole input. In addition, it is not clear in which sense particular states encode the history o f computation because the automaton can be a t a given moment not in a particular state, but in a superposition o f states. However, an interesting modification o f the above idea works in a combination with new, purely quantum, ideas o f the so-called random access coding and of the serial coding. One o f the basic results o f the quantum information theory, the so-called Holevo theorem, says that no more than n bits o f information can be encoded and later faithfully retrieved from n qubits. However, quite surprisingly, if we relax, in a reasonable sense, the above strong requirement on perfect retrieval o f all encoded bits, then we can encode m bits into n < m qubits in such a way that each single bit (but not all o f them) can be retrieved with a quite high probability. This idea has been formalized as follows.
Definition 4.13 A m -% n random access coding is a mapping f : {0,1}* x R -+ C2" such that for any 1 5 i 5 m, there is a quantum
202
measurement Oi producing values 0 or 1 and such that for any w E ( 0 , ljm, Pr(0il f ( w , r ) )= wi)2 p , where R is a set of random bits, f is called to be an encoding function and Oi are said to be decoding observations. I n the i can depend on the string wi+l . . . w,, we talk case that each observable 0 about serial coding. Ambainis e t al. (1998) showed the following lower bound n = 0
a(&)
Entropy method The conceptual framework behind this entropy proof method is very different and requires a more detailed presentation. Since each computation process of a l Q F A can be seen as a sequence of unitary operations V, and of the standard accepting/rejecting/nonterminating measurements, such a computation process can, and should, be seen as producing a sequence of mixed states. When we then consider computation of a lQFA for L, on a random binary input string, then one can show easily that the quantum entropy o f the mixed states being produced during the computation can only increase for any symbol a read and the corresponding unitary operation V,, and therefore also for the standard measurement. In addition, for certain languages one can show that such an entropy increase is limited from bellow for processing each symbol. Moreover, the total information capacity o f QFA can be bounded in terms of the number of states and this way a lower bound can be obtained on the number of states of an automaton t o recognize the language. I f we now have a restricted l Q F A A,, which accepts L , with probabilityp 2 $ and with the set of states Q , then on the basis of the lfollowing lemma one can see that after reading k input symbols the resulting mixed state with the density matrix p k is such that it holds (where S ( p ) is the Shannon entropy)
Q S ( P ~2) (1- S ( P ) ) ~ .
Lemma 4.14 Let po and p1 be two density matrices and p = + ( P I + p 2 ) . If 0 is a measurement with outcomes 0 and 1 such that making a measurement on P b yields b with probability p , then
From that we get the lower bound for An: IQI 1 2 " ( ( 1 - S ( p ) ) n ) . As already discussed above, once a l Q F A 23, is given for L,, with a set of states Q , we can construct an equivalent restricted l Q F A for L , with O(nlQl) states. This leads t o the overall lower bound for the number of states IQI of 1QFA recognizing L , :
2 2(1-S(p))n-kn-O(1)
IQI
203 Theorem 4.12 can be strengthened, t o hold also for a more general class o f one-way quantum finite automata, for the so-called enhanced one-way realtime quantum automata (elQFA). Their main new feature is that each time after a new symbol g is read an arbitrary sequence of unitary operations and orthogonal measurements, that depends (this sequence) on u , is performed. In short a superoperator is performed on the density matrix representing the current mixed state. The above model is of importance for several reasons. First of all, it is a very natural generalization of the model of l Q F A and, secondly, it is in lines with recent concentration on density matrices and superoperators (as operators t h a t are applied on density matrices). I t has been shown by Nayak (1999) that the lower bound 2"(,) holds also for elQFA recognizing the language L,. The proof goes basically along the same line of reasoning as in the former case. The key new fact to be used is that an application of a measurement (and thereby also of a superoperator) increases entropy a t least by the additive term 1- S ( p ) .
5
Two-way quantum finite automata
The second very basic model is that of the two-way quantum finite automaton (2QFA) at which a state is a superposition of the basis states that can correspond t o different heads positioned on different squares of the input type. Formally, 2QFA are defined as follows [28]. A two-way quantum finite automaton A is specified again by an alphabet C , a finite set of states Q , an initial state 40, sets Qa 5 Q and QT C Q , such that Qa n QT = 0,and the transition function
6: Q x
-
r x Q x I+-, 1,-+I
C[o,ilI
r
where = CU{#, $} is the tape alphabet of A and # and $ are endmarkers not in C , which satisfies the following conditions (of well-formedness) for any 41,QZ E Q , u,m,m E d E {4-,1,4}:
r,
1. Local probability and orthogonality condition.
2. Separability condition .
204
3. Separability condition 11.
C 6 * ( 4 1 , Ul,d , +)b(q2,uZ?,q', +) = 0 . 4'
The above conditions are not easy t o verify. Fortunately there is a simpler concept of 2QFA that is equally powerful.
Definition 5.1 A 2QFA A = (C,Q,qo,Qa,Qr,6)is simple, or unidirectional, i f for each u E I' there is a unitary operator V, defined o n the Hilbert space 1 2 ( Q ) and, in addition, a function D : Q -+ {+-, 1,-1) such that for each 4- E Q - ,, u E
r,
6 ( q ,u, q', d ) =
{ f" ' ) , ,, q
if D(q') = d; otherwise.
It is straightforward t o verify that if we rewrite the well-formedness conditions using the relation (l), then we get t h a t a simple 2QFA A satisfies the wellformedness condition if and only if
for each
5.1
0
E
r,which holds if and only if every operator V, is unitary.
Power of 2QFA
There are two basic results concerning the power of 2QFA, both due t o Kondacs and Watrous (1998).
Theorem 5.2 Each regular language can be recognized by a two-way classical reversible (and therefore also quantum) finite automaton. The basic idea behind the proof is to make a reversible simulation of a given DFA. This method leads in some casse t o an exponential increase o f the number of states and it is not clear whether this is avoidable.
Theorem 5.3 2QFA can recognize, with respect to the bounded-error mode of acceptance, also non-regular languages, such as the context-free language { O i l i I i > 0 } , and non-contest-free langauges, such as { O i l i O i I i > 0 ) . It should not be difficult t o construct a 2QFA recognizing the language L , = {Oili1 i > 0 ) from the following informal description of its behaviour. Figure 1 illustrates the basic trick of such a 2QFA A(") accepting strings from the language L , (the integer n is here a parameter that ensusres probability with which strings not in L, are rejected).
205
Stage 1. QFA keeps moving right checking whethex the input has the form 0' Stage 3. After arriving at the left endmarker each state branches into a superposition of new states and if they arrive simultaneously this superposition results in a single state.
I
Stage 2. At the right endmarker a superposition of new states is created and all states move left arriving at the left endmarker simultaneously iff the input has the form d 1 I .
Stage 4. A measurement is performed.
ACCEPT
Fig. 1. QFA recognizing the language {Oili I i 2 1) - 60%
Each computation of A(") consists of three phases. In the first phase any input word not of the form O a l j is rejected (this can be done actually by a classical reversible automaton). For words of the type O z l j the phase ends in a state with the head on the rightmost endmarker $. As the first step of the second phase a superposition o f n special states is formed. This way computation in a sense "branches" into n parallel paths (actually into their superposition). In the j t h paths, the head moves, deterministically, t o the left endmarker according t o the following rules. Each time the head is on a new cell and reads 0 (1) it remains stationary for j ( n - j 1) steps and then moves one cell left. Therefore, for an input of the form O'lv the j t h head requires exactly ( j + l ) u + ( n - j + 2 ) v + l steps t o reach t h e left endmarker. I f j # j ' , then
+
+
( j 1).
+ (n - j + 2)v + 1= (j'+ 1)u+ (n - j' + 2). + 1 if and only if
u = v.
This implies that any two heads of all n different computational paths reach the left endmarker a t the same time if and only if u = v. In the third phase, consisting of only one computational step and one measurement, each computation path splits again, this time the resulting superposition is obtained by an application o f the QFT (Quantum Fourier Transform). In the case u = v all these splittings occur simultaneously and the resulting superposition equals exactly to Is,), where Is,) is the single accepting state. A t that moment an observation is performed using the measurement making a projection
206 into the state spanned either by accepting, rejecting or nonterminating configurations. In the case u = 'u, the result o f such a measurement is "accept" with the probability 1. In the case u # 'u only one head comes as the first t o the leftmarker and a measurement can then accept the string only with probability
4.
5.2
1.5-way quantum finite automata
A natural modification o f the concept of 2QFA is that o f 2QFA a t which no head "can move left". It can therefore "keep staying" or "to move right" a t a computation step. It is an important open problem t o determine whether 1.5QFA can recognize all regular languages with respect t o bounded-error mode of acceptance. The method used t o show that all regular languages are accepted by 2QFA does not work for the case of 1.5QFA, and neither the method t o show that a regular language is not acceptable by a 1QFA. There is, however, a result showing that such automata are quite powerful. It was shown by Amano and lwama (1999), by a reduction t o the halting problem of one-register machines, that the emptiness problem is undecidable for this type of automata, what is quite surprising because this problem is, in the classical case, decidable even for push-down automata. In addition, they have shown that 1.5QFA can accept the language {OilOi I i >}, actually using a small modification of the method used t o show that 2QFA can accept the language {Oili I i > 0). Let us list now some open problems for 1.2QFA. 1. Can 1.5QFA accept some languages accepted by lQFA, but with larger probability? 2. Can some 1.5QFA have less states than each lQFA recognizing the same language. 3. What is the power of 1.5QFA?
5.3
Two-way classical/quantum finite automata
The models of QFA considered so far have all been natural quantum versions o f the classical models of automata. O f a different type is the model introduced in [4], and called two-way finite automata with quantum and classical states (2QCFA). This model is more powerful than classical (probabilistic) 2FA and a t the same time it seems t o be more realistic, and really more "finite" than 2QFA (because 2QFA actually need quantum memory of size O ( n ) t o process an input o f the size
n).
A 2QCFA A is defined similarly as a classical 2FA, but, in addition, A has a fixed size quantum register (which can be in a mixed state) upon which the
207
a a unitary operatio
result of measurement --_- _ _ _ _ _ -.-.--’the determines the action of the
classical part of the automaton
Fig. 2. A model of SQCFA
automaton can perform either a unitary operation or a measurement. A 2QCFA has a classical initial state qo and an initial quantum state 160). The evolution of the classical part of the automaton and of the quantum state o f the register is specified by a mapping 0 that assigns t o each classical state q and a tape symbol 0 an action O(q,a). One possibility is that Q(q, a ) = (q’, d, U ) , where q’ is a new state, d is the next movement of the head (to left, no movement or t o right), and U is a unitary operator t o be performed on the current quantum register state. .., m k , The second possibility is that O(q,a) = (M,ml,ql,dl,mz,qz,dz,. q k , dk), where M is a measurement on the register state, ml,. . . ,mkare its possible classical outcomes and for each measurement outcome new state and new movement of the head is determined. In such a case the state transmission and the head movement are probabilistic. I t has been shown in [4] that 2QCFA with only one qubit o f quantum memory are already more powerful than 2FA. Such 2QCFA can accept the language of palindromes over the alphabet (0, l}, which cannot be accepted by probabilistic 2FA a t all, and also the language { O i l i I i 2 0}, in polynomial time. This language can be accepted by probabilistic 2FA, but only in exponential time. In the above model only projection measurements have been considered. I t is not clear whether something, especially concerning the number of the classical states, could be obtained by considering also POVM.
6
Quantum almost finite automata
Let us discuss briefly also main modifications of the models of quantum finite automata discussed above. Quantum finite multitape automata have been introduced by Ambainis et al. (1999) with the idea t o show that for such a model quantum version of the automata is provably more powerful than the classical probabilistic one. Several
208
languages have been designed that are of increasing complexity when accepted by probabilistic classical versions of automata. However, the final proof that such a quantum model is more powerful than the classsical one is still missing. For the case o f two tapes only, it has been shown by Bonner a t al. (2000b) that for such quantum automata the emptiness problem is undecidable. This has been shown actually even for a weaker model in which at each step a t least one of the heads has t o move right. Quantum finite counter automata have been introduced by Kravtsev (19990 and studied also by Yamasaki et al. (1999). There are two major results concerning this model: I t is provably more powerful than i t s classical probabilistic version (see Bonner et al., 1999a), and (see Bonner et al., 2000) the emptiness problem for this model is undecidable (this has been shown by a reduction t o the Post correspondence problem. Quantum pushdown automata have been introduced at first by Moore and Crutchfield (1997) and in a more elaborated way by Golovkins (2000). He has shown that the following languages can be recognized by RPDA: (a) L1 = {O,l}*; (b) L2 = {w 1 IwJo = lw11, w E {O,l}*}. The following languages can be recognized by QPDA: (a) L3 = {w I lwlo = lwll = Iw12,w E {0,1,2}*} with probability :; (b) L4 = {w I JwIo= lwll or lwlo = Iwl2} with probability $. The last language is known not t o be recognizable by a DPDA. It is not clear whether this langauge is recognizable by a probabilistic PDA. In general, it is not yet known whether QPDA are more powerful than classical probabilistic push-down automaa. In all these cases a nontrivial problem was t o develop proper well-formedness conditions.
209
1. Leonard M . Adleman, Jonathan DeMarrais, and Ming-Deh A. Huang Quantum computability. SIAM Journal of Computing, 26(5):1524-1540, 1997. 2. Masami Amano and Kazuo Iwama. Undecidability on quantum finite automata. In Proceedings of 31st ACM STOC, pages 368-375, 1999. 3. Andris Ambainis, Richard Bonner, and Rising Freivalds a nd Arnolds Fikusts. Probabilities t o accept languages by quantum finite automata. Technical report, quant-ph/9904066, 1999. 4. Andris Ambainis and John Watrous. Two-way finite automata with quantum and classical states. Technical report, quant-ph/9911009, 1999. 5. Andris Ambainis and Risi@ Freivalds. 1-way quantum finite automata: strengths, weaknesses and generalizations. In Proceedings of 39th IEEE FOCS, pages 332-341, 1998. quant-ph/9802062. 6. Andris Ambainis, Ashwin Nayak, Amnon Ta-Shma , and Umesh Vazirani. Dense quantum coding and a lower bound for 1-way quantum finite automata. Technical report, quant-ph/9804043, 1998. 7. Andris Ambainis, Ashwin Nayak, Amnon Ta-Shma, and Umesh Vazirani. Dense quantum coding and a lower bound for 1-way quantum finite automata. Technical report, quant-ph/9804043, 1998. 8. Charles H. Bennett Logical reversibility of computation. IBM Journal of Research and Development, 17:525-532, 1973. 9. Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. SIAM Journal of Computing, 26(5):1411-1473, 1997. 10. Aija Berzina, Richard Bonner, and Rusins Freivalds. Parameters in am bainis-freivalds algorithm. In Proceedings of the International Work-
shop on Quantum Computing and Learning, Sundbyhols Slott, Sweden, M a y 2000, pages 101-109, 2000. 11. Richard Bonner, Rising Freivalds, and Renars Gailis. Undecidability of 2tape quantum finite automata. In Proceedings of International Workshop on Quantum Computation and Lea rning, Sundbyholms, May 27-29, 2000, pages 93-100, 2000. 12. Richard Bonner, Rijsinx Freivalds, and Maxim Kravtsev. Quantum versus probabilistic 1-way finite automata with counter. In Proceedings of Inter-
national Workshop on Quantum Computation and Lea rning, Sundbyholms, May 27-29, 2000, pages 80-88, 2000a. 13. Richard Bonner, R i s i g E Freivalds, and Madars Rikards. Undecidability of quantum finite 1-counter automaton. In Proceedings of International
Workshop on Quantum Computation and Lea rning, Sundbyholms, May 27-29, 2000, pages 65-71, 2000b.
210
14. Richard Bonner, Rusins Freivalds, and Maxim Kravtsev. Quantum versus probabilistic one-way finite automata with counter. In Proceedings of
the International Workshop on Quantum Computing an d Learning, Sundbyholms Slott, Sweden, May 2000, pages 80-88, 2000a. 15. Alex Brodsky and Nicholas Pippenger. Characterization of 1-way quantum finite automata. Technical report, quant-ph/9903014, 1999. 16. Lance Fortnow. One complexity theorist's view of quantum computing. Technical report, Tech. report, NBC Research Institute, t o appear at CATS 2000 Proceedings and in ENTCS. 17. Maratas Golovkins. O n quantum pushdown automata. In Proceedings
of International Workshop on Quantum Computation and Lea rning, Sundbyholms, M a y 27-29, 2000, pages 41-51, 2000. 18. Jozef Gruska. Quantum computing. McGraw-Hill, 1999. See also additions and updatings of the book on http://www.mcgraw-hill.co.uk/gruska. 19. Jozef Gruska. Descriptional complexity issues in quantum computing. Journal of Automata, Languages and Combinatorics, 5:191-218, 2000. 20. Jozef Gruska. Mathematics unlimited, 2001 and beyond, chapter Quant u m computing challenges, pages ?-?+37. Springer, 2000. 21. Stanley Gudder. Basic properties of quantum automata. Technical report, Department o f Computer Science, University of Denver, 2000. 22. Arnolds Kikusts and Zigmars Rasscevskis. O n the accepting probabilities of 1-way quantum finite automata. In Proceedings of International
Workshop on Quantum Computation and Lea rning, Sundbyholms, M a y 27-29, 2000, pages 72-79, 2000. 23. Attila Kondacs and John Watrous. On the power of finite state automata. In Proceedings of 36th IEEE FOGS, pages 66-75, 1997. 24. Maksim Kravtsev. Quantum finite one-counter automata. Technical report, quant-ph/9905092, 1999. 25. Cristopher Moore and James P. Crutchfield Quantum automata and quant u m grammars. Technical report, Santa Fe, 1997. 26. Kenichi Morita. A simple construction method of a reversible finite automaton out of F redkin gates, and its related problems. The transactions of the IEICE, E73:978-984, 1999. 27. Ashwin Nayak. Optimal lower bounds for quantum automata and random access codes. In Proceedings of 40th IEEE FOGS, pages 369-376. ACM, 1999. 28. Jean-Erie Pin. O n the languages accepted by finite reversible automata. In Proceedings of 14th ICALP, pages 237-249. LNCS 267, Springer-Verlag, 1987.
21 1
29. Daniel R. Simon On the power of quantum computation. In Proceedings of 35th IEEE FOG'S, pages 116-123, 1994. See also SlAM Journal of Computing, V26, N5, 1474-1483, 1997. 30. Maris Valdats. T h e class of languages recognizable by 1-way quantum automata is not clos ed under union. Technical report, quant-ph/000115, 2000. 31. John Watrous. On t h e power of 2-way quantum finite automata. Technical report, University of Wisconsin, 1997.
212
On commutative asynchronous automata * B. Imreht
M. It04
A. Puklers
Abstract
The class of the commutative asynchronous automata are investigated here. By characterizing the subdirectly irreducible members of this class, it is proved that every commutative asynchronous automaton can be embedded isomorphically into a quasi-direct power of a suitable two-state automaton. We also prove that the exact bound for the maximal lengths of minimum-lengthdirecting words of an n states directable commutative asynchronous automata is equal to n - 1, moreover, it is [log,(n)] for the subclass containing all directable commutative asynchronous automata generated by one element.
1 Introduction An automaton is asynchronous if for every input sign and state the next state is stable for the input sign considered. Asynchronous automata were studied in different aspects. We mention here only the papers [6] and [7] which deal with the decomposition of a n arbitrary automaton into a serial composition of two ones having fewer states than the original automaton and one of them is asynchronous. It is said that an automaton is commutative if for every pair of its input signs, the transition of the states is independent of the order of the signs of the 'This work has been supported by the Japanese Ministry of Education, Mombusho International Scientific Research Program, Joint Research 10044098, the Hungarian National Foundation for Science Research, Grant T030143, and the Ministry of Culture and Education of Hungary, Grant FKFP 0704/1997. tDepartment of Informatics, University of Szeged, A r p a t6r 2, H-6720 Szeged, Hungary t Department of Mathematics, Faculty of Science, Kyoto Sangyo University, Kyoto 6038555, Japan §Department of Computer Science, Istvgn Szbchenyi College, H&Ierv*i tit 3., H-9026 GyBr, Hungary
213
pair. Commutative automata have been studied from different points of view. Regarding the isomorphic and homomorphic representations of commutative automata, we mention the papers [2], [3], [4], [5], [8] [12], [13]. As far as the directable commutative automata is concerned, we refer to the works [9] and P11In this paper, we deal with the intersection of these classes, namely the class of commutative asynchronous automata. After the preliminaries of Section 2, we persent in Section 3 the description of the subdirectly irreducible members of this class, and as an application of this description, we characterize the isomorphically complete systems for this class with respect to the quasi-direct product. In Section 4, the directable commutative asynchronous automata are investigated. We give the exact bound for the lengths of the minimal directing words of directable commutative asynchronous automata. Finally, we consider a subclass of the previous class, namely the class of directable commutative asynchronous automata generated by one element, and also give the exact bound for the lengths of the shortest directing words of the members of this class.
2
Preliminaries
The cardinality of a set A is denoted by ]At. The diagonal relation on A is denoted by W A , i.e., W A = {(a,a) : a E A } . Let X be a finite nonempty alphabet. The set of all finite words over X is denoted by X * and X + = X * \ {E}, where E denotes the empty word of X * . For any p E X * , let alph(p) denote the set of the letters which occur in p , i.e., z E alph(p) if and only if x occurs in p . By automaton we mean a triplet A = ( A , X , 6 ) , where A and X are finite nonempty sets, the set of states and the set of input signs, respectively, and 6 : A x X -+ A is the transition finction. An automaton can be also defined as an algebra A = ( A ,X) in which each input sign is realized as the unary operation xA : A 4 A, a -+ 6(a,z). The transition function can be extended to A x X * in the usual way. Each word p E X* defines then a A a -+ 6(a,p). If C C A and p E X * , then let unary operation p A : A CpA = {& : c E C } . In what follows, if there is no danger of confusion, then we write up and C p instead of upA and C#, respectively. A state a* of A is called a dead state if a*x = a* is valid, for all x E X . Using the second definition of automata, the notion such as subautomaton, generating element, congruence relation, subdirectly irreducible automaton can be defined in the usual way. We may associate with any nontrivial subautomaton B of A a congruence relation CTBcalled Rees congruence be-
214
longing to B as follows. For every a , b E A , let
aaBb if and only if a, b E B or a = b. An automaton A = ( A , X ) is commutative if axy = ayx is valid for all a E A and x,y E X . Another particular automata are the asynchronous automata. A = ( A , X ) is asynchronous if for every a E A and x E X, axx = ax is valid. For the sake of simplicity, let us denote by K the class of all commutative asynchronous automata. We use the notion of the connectivity defined as follows. The automaton A = ( A ,X ) is connected if for every couple of states a, b, there are input words p , q E X * such that ap = bq is valid. A word w E X* is called a directing word of an automaton A = ( A , X ) if it takes A from every state into the same state, or in other words, if [Awl = 1. An automaton is called directable if it has a directing word. Let A = ( A ,X ) now a directable automata. Furthermore, let d(A)
= min{(w[ : w
is a directing word of A).
Regarding the meaning of d(A), it gives the length of the shortest directing words of A. If w is a directing word of A and lw[ = d(A), then w is called a minimum-length directing word of A. Now, for every positive integer n, take the maximum of the lengths of the shortest directing words of all directable automata of n states, i.e., let d(n)
= max{d(A) : A
is a directable automaton of n states}.
The visual meaning of d(n) can be given as follows. For every directable automaton of n states, there exists a directing word whose length does not exceed d(n), moreover, there is such a directable automaton of n states for which the length of the shortest directing word is equal to d(n). Regarding d(n), Cernjr[l] has a famous conjecture which claims that d(n) 5 (n - 1)2. This conjecture has been neither proved nor disproved so far, and thus, it remains an open problem of the theory of automata. On the other hand, considering the directable members of special classes of automata, sometimes, a better bound can be given than (n - 1)2(see eg. [9], [lo], Ill]). The question can be restricted to particular classes of automata as follows. Let M be an arbitrary class of automata. Furthermore, let dM(n) = {d(A) : A E M and A is directable}. We shall present the values dK:(n), d p ( n ) , where K* is the subclass of containing all commutative asynchronous automata generated by one el& ment.
K
215
Let At = (At,X t ) , t = 1 , . . . ,k, be a system of automata. Moreover, let X be a finite nonempty alphabet, and 'p a mapping of X into Xt such that 'p is given in the form cp(x) = (cpl(x),.. . ,cpk(2)). Then, the automaton A= A t , X ) is called the quasi-direct product of At, t = 1,. . .,k, where ( a l , ... ,ak)xA = ( u ~ ( P I ( .z.). ,~a k~( ~, k ( z ) * k ) is valid for every ( a l , .. . , a )E At and x E X. In particular, the automaton A is called a quasi-direct power of B if A1 = .. . = A k = B for some automaton B.
n;=,
(n,"=, n,"=,
Then, the following statement can be easily proved by the definitions.
Lemma 1. If A = ( A , X ) can be embedded isomorphically into a direct product A,, where for every i, i = 1,. . . ,k, A, = ( A i , X ) can be embedded isomorphically into the quasi-direct product & Bit(X,@i), then A can be embedded isomorphically into a quasi-direct product of the automata Bit, t = l , ...,ri; i = l ,... ,k.
nbl
Now let M be an arbitrary class of automata. Furthermore, let C be a system of automata. It is said that 'c is isomorphically complete f o r M with respect to the quasi-direct product if for every automaton A E M , there are automata At E C, t = 1 , . ..k, such that A can be embedded isomorphically into a quasi-direct product of At, t = 1 , . .. ,k.
3
Isomorphic representation
First of all, we prove the following obvious statement.
Lemma 2. If A = ( A , X ) E K, then the transition graph of A can not contain any directed cycle different from loop.
Proof. Contrary, let us suppose that a E A , a # a y and ayp = a for some p E Xi. Then, since A is commutative and asynchronous, a y = (ayp)y = ayyp = ayp = a which is a contradiction. Let X = X I U X Z be a finite nonempty alphabet, where X1 and X2 are disjoint sets. Let us define the automaton Exl,x, = ( { O , l } , X 1U X Z )as follows. For every 2 1 E X1 and 2 2 E X Z , 0x1 = 0, 1x1 = 1x2 = 1, and 0x2 = 1 . The automaton EX,,X, is called the elevator ower X I and X Z . Then, we have the following characterization of the subdirectly irreducible commutative asynchronous automata.
Theorem 1. An automaton A = ( A , X ) E K with IA1 2 2 is subdirectly irreducible if and only i f there are disjoint subsets X I and X Z of X such that X1 U XZ= X and A is isomorphic to EX,,^, .
216
Proof.If IAl = 2, then the statement is obviously valid. Consequently, it is sufficient to prove that a commutative asynchronous automaton is subdirectly reducible if IAl > 2. For this purpose, let A = ( A , X ) be a commutative asynchronous automaton with IAl > 2. By the commutativity of A, the automaton A is either connected or disjoint union of its connected subautomata. In the latter case, it is easy to prove (see eg. [3]) that A is subdirectly reducible. Now, let us suppose that A is connected, and define the following relation on A. Let a 5 b if and only if there is a word p E X * such that ap = b. Then, this relation is a partial ordering on A, since the transition graph of A is cycle free. Since A is connected, there is a greatest element a* in ( A ,5 ) which is a dead state of A. Let us consider the partially ordered set ( A\ {a*},I).We distinguish two cases depending on the number of the maximal elements of (A\ {a*},5).
Case 1. (A\{a*}, 5 )has at least two maximal elements. Let bl, 4 denote two different maximal elements in ( A\ {a*},I).Then, the states a*, bl and a*,4 constitute subautomata of A, and for the corresponding nontrivial Rees congruences u{a*,b1}, d { a * , b 2 ) , we have that b { , * , b l ) n O{,*,b2) = WA. Therefore, A is subdirectly reducible. Case 2. ( A \ {a*},5 ) has one and only one maximal element denoted by al. Since A is connected and IAl 2 3, there is a maximal element in ( A\ { a * ,q } , 5 ) which is denoted by a2. Let us classify the elements of X as follows: X I = { x E X : a1x = a l } and X2 = { x E X : a1x = a*}. Since a2 I all and A is asynchronous, X I # 0. Moreover, by a1 5 a*, we get that X2 # 0 as well. Now, let us define the equivalence relation p A x A as follows. For any a, b E A, let apb if and only if a , b E {all a2} or a = b. It is proved that p is a congruence relation of A. For this purpose, let x E X2 be an arbitrary input sign. Then, a2x = a* must hold. Indeed, in the opposite case, a2x E {al,a2). If a22 = a l , then a2xx = a* # a1 which is a contradiction since A is asynchronous. If a22 = a2, then let y E X such an input sign for which a2y = al. Since a2 5 all such an input sign exists. Then, a1 = a2y = a2xy = a2yx = a1x = a* which is a contradiction again. Thus, a22 = a1x = a*, for all x E X2. On the other hand, a2xpalx, for all x E X I . If it is not so, then a22 = a* for some x E X I . Now, let y E X such an input sign for which a2y = al. Since a2 5 al, there exists such an input sign. In this case, a* = a2xy = a2yx = a1x = a1 which is a contradiction. Consequently, p is a nontrivial congruence relation of A. A further nontrivial congruence of A is the Rees congruence u{a*,al} belonging to the subautomaton {a*,a l } , and obviously, p n o{a*,al} = WA,
217
which results in the subdirect reducibility of A. Now, by Theorem 1, we can characterize the isomorphically complete systems for K with respect to the quasi-direct product as follows.
Theorem 2 . A system C of automata is isomorphically complete for K with respect to the quasi-direct product if and only if C contains an automaton A = ( A , X ) such that the elevator E{z},{v}can be embedded isomorphically into a quasi-direct product of A with a single factor.
Proof. The necessity of the condition is obvious. To prove the sufficiency, let us suppose that E{z},{v}can be embedded isomorphically into a quasidirect product of A with a single factor for some A E C. Then, it is easy to see that any elevator EX,,^, can be embedded isomorphically into a quasi-direct product of A with a single factor. Now, let A’ = ( A ’ , X ) be an arbitrary commutative asynchronous automaton. By Theorem 1, A’ can be embedded isomorphically into the direct product of some elevators; let us denote them On the other hand, every Ex,,,x~,,t = 1 , . . .k, by Ex,,J,,, . .. ,Exkl,xk2. can be embedded isomorphically into a quasi-direct product of A with a single factor. Now, by Lemma 1, we obtain that A’ can be embedded isomorphically into a quasi-direct power of A, and thus, C is an isomorphically complete system for the class K: with respect to the quasi-direct product.
4
Minimum-length directing words
Regarding the maximum of the lengths of the minimum-length directing words of commutative asynchronous automata of n states, the following statement is valid. Theorem 3. dX(n) = n - 1, for every integer n 2 1.
Pro05 It is known (cf. [9], [ll])that the maximum of the lengths of the minimum-length directing words of directable commutative automata of n states is equal to n - 1. Therefore, dx(n) 5 n - 1. To prove that the equality is possible, let n 2 1 be an arbitrary integer, and let us consider the automaton A, = ((1,. .. ,n } , (21,. .. ,z,-l}),where nxi = n, and jxi =
n ifj=i, otherwise,
j
for all zi E ( 2 1 , . . . ,3c,-1} and j E (1,. .. , n - 1). It is obvious that A, is a directable commutative asynchronous automaton, and d(A,) = n - 1. Consequently, dc(n) = n - 1 which ends the proof of Theorem 3.
218
In what follows, we show that this bound decreases drastically if we r e strict ourself to automata generated by one element. To do this, we need the following observations.
Lemma 3. If A = ( A ,X ) E K: and p E X * , then up = ax;, . . .xil,for all a E A, where {xil,. .. ,xi,} = alph(p).
Proof. Let p E X * be an arbitrary word and let us suppose that x E X occurs in p in more times. Then, there are words r , s , t E X * such that p = rxsxt. By the commutativity, up = axxrst, for all a E A. On the other hand, A is asynchronous, and thus, axx = ax, for all a E A. Therefore, up = axrst, for all a E A which yields the validity of Lemma 3. Lemma 4. If A = ( A , X ) E X* is a directable automaton and w is its directing word, then A contains one and only one dead state and w takes A from every state to the dead state. Proof. Let a0 denote the generating element of A. Since w is a directing word, Aw = {a*} for some state a* E A. We show that a* is a dead state of A. Indeed, let x E X be arbitrary. Then, a*x = aowx = aoxw = a'w = a*. Finally, let us observe that A can not contain a further dead state. Indeed, if si is a further dead state, then {a*,ii}w = { a * ,ii} which contradicts the fact that w is a directing word of A. Theorem 4. If A = ( A ,X ) E K:* is a directable automaton and w is one of its minimum-length directing words, then lwl 5 and IAI 2 214.
1x1
Proof. The inequality lwl 5 (XI immediately follows from Lemma 3. In order to prove the lower bound for the number of states, as a consequence of Lemma 3, we may suppose that w = X I . . .Xk, where XI,... ,Xk are pairwise different. We prove that IAl 2 2k by induction on k. If k = 1, then the statement is obviously valid. Let k 2 1 be an arbitrary integer, and let us suppose that the statement is valid for k. Furthermore, let A = ( A , X ) such a directable commutative asynchronous automaton generated by one element whose minimum-length directing word is w = X I . . .Xkxk+1, where 21,.. . ,xk+1 are pairwise different signs of A. Let us denote the generating element of A by ao. We show that the automaton A = (A,{XI,...,xk+l}) is also a directable commutative asynchronous automaton generated by a0 and w is a minimumlength directing word of A, where A = {ao# : p € {XI,... ,xk+l}*} and iixA 3 = iixjA, for all ii E A and xj E {XI,.. . ,xk+1}. Obviously, w is a directing word of A, and thus, we have to prove only the minimality of w. By Lemma 4, awA = a*, for all a E A, where a*
219 -
is the unique dead state of A. Then, awA = a*, for all a E A, where a* is the unique dead state of A. Let us suppose now that w is not a minimum-length directing word for A. Then, there are pairwise different letters zil,..,zitE (21,... ,Xk+1} such that 1 < k 1 and 5 = xil . ..zit is also a directing word of A. By Lemma 4, iidA = a*, for all ii E 8,since A has exactly one dead state a*. Now, we prove that d is a directing word of A as well. Let a E A be an arbitrary state. Then, a = aopA for some p E X*.Furthermore, a s A = aO(pij)A = aowA# = aowA# = a*@ = a* since a* is a dead state of A. Consequently, AWA = {a*} which contradicts the fact that w is a minimum-length directing word of A. This yields that w is a minimum-length directing word of A as well. Now, let us consider the automaton B = (B, (21,.. . ,Xk}) which is generated by a~ in A under the input signs 21,...,2k.Obviously, B is a commutative asynchronous automaton which is generated by ao. We prove that B is a directable automaton and XI.. .Zk is a minimum-length directing word of B. For this purpose, let aj # a; be two arbitrary states of B. Then, it is sufficient t o prove that
.
+
ai21...2k = ajzl.. .2k.
Since B is generated by ao, there are p, q E and aj = aOq. But then
.,
{XI,. . Zk}*
such that ai = aop
aix1...x k = aopxl . ..x k = aoxl.. .x k = aoqxl . . .x k = ajxl . . .xk.
Let us observe that z1 . ..Xk must be a minimal-length directing word of
B. Indeed, in the opposite case, if xil ...xiz would be a minimum-length directing word of B, where xit E {XI,.. .,zk}, t = 1 , . .. ,I, and 1 < k, then xil ...xi12k+1 would be a directing word for A which is a contradiction. (21,, . . ,2k+1}) is a directable Let c = {aZk+l : a E B}. Then, C = (C,
subautomaton of A with the minimum-length directing word X I ...x k and A = C U B. Now, we show that C n B = 0. Contrary, let us s u p pose that ai E C n B. Then, a; = aop for some p E (21,...,2k}* and ai = ajzk+l = aOqxk+l for some aj E B and q E {XI,.. . ,q}*. Thus, a* = aOz1.. .2k2k+1 = aO2k+lx1...Zk = aoqx&+lxl.. . x k = aopxl...x k = aox1.. .xk, where a* denotes the dead state of A. Therefore, for any at E B, at21 ... 2k = aOSZ1 ...Xk = a021 ...2k = a*, where S E (21,...,Xk}* such that a ~ = s at. Moreover, for any c E C, CXI ...zk = aouxk+1x1.. .x k = aoxk+121...2k (1021... 2k = a*, where U E (21,...,Zk}* such that c = aOuxk+l. This yields that X I . . .x k is a directing word of A which contradicts the minimality of w = X I ...Xk+1. Consequently, C n B = 0. Now, since C is a directable commutative asynchronous automaton generated by aoxk+l and 21 .. .x k is a minimum-length directing word for it, we 1
220
obtain that I C l 2 2k by the induction assumption. On the other hand, B is also a directable commutative asynchronous automaton generated by a0 whose minimum-length directing word is 21.. .xk, and thus, by the induction hypothesis again, IBI 2 2k. The obtained inequalities, A 2 A = B U C,and C n B = 0 result in (A(2 2k+1, which ends the proof of our statement. Now, we are ready to prove the following statement.
Theorem 5 . For every positive integer n, d p ( n ) = [log,(n)].
Proof. Let n 2 1 be an arbitrary integer. If n = 1 or n = 2, then the statement is obviously valid. Now, let us suppose that n 2 3. Let A = ( A , X ) E Ic* be an arbitrary directable automaton of n states. Assume that d(A) = k for some nonnegative integer k. Then, by Theorem 4, 2k 5 n, and thus, k 5 log,(n) which results in k 5 [log,(n)]. Consequently, d p (n) 5 [log,(n)]. To prove that the equality is possible, we construct an automaton A E K* such that (1) IAI = n, (2) d(A) = [log,(n)l.
To do this, let [log,(n)] = k and r = n - 2k. Let X = (21,. . . ,sk} and Y = {yl, ... ,y,.} be two disjoint sets of input signs. In particular, if r = 0, then let Y = 0. Let us denote by 0 the k-dimensional vector whose every component is equal to 0. Now, let us define the automaton A = ( A , X U Y ) = ({O,l}k U {l,.. . , r } , X U Y ) as follows. For every xj E X,
,
(il, .. . i+j
=
(21,.
..,ik) E (0, l}k, and t
E (1,.
. . , r } , let
(ii,. . .,ilk) if ij = 0, where ii = it, t = 1 , . . . ,k, t # j and i; = 1,
,
(il, .. . ik)
otherwise,
tsj = 0x3, and for every y1 E Y ,(il,. . .,ik) E (0, l}k, and t E {l,... , r } , let
0 s t 0
ift=r ift=r ift#r ift#r
and and and and
1=r, l = s for some s ~ { l..., , r-1},
~=t, l#t.
221
It is easy to see that
T generates the automaton A, and A is a commutative asynchronous automaton. In particular, if T = 0, then 0 generates A. Moreover, 2 1 ...z k is a minimum-length directing word of A. Consequently, d(A) = k = [log2(n)]which ends the proof of Theorem 5.
References [l] Clem$, J., P o z n h k a k homog6nym experimentom s konecinm automatami, Mat.-fyz. cas. SAV 14 (1964), 208-215.
[2] h i k , Z., B. Imreh, Remarks on finite commutative automata, Acta Cybernetica 5 (1981), 143-146. [3] h i k , Z., B. Imreh, Subdirectly irreducible commutative automata, Acta Cybernetica 5 (1981), 251-260.
[4] G h e g , F., On subdirect representations of finite commutative unoids, Acta Sci. Math. 36 (1974), 33-38. [5] G k g , F . , On vl-products of commutative automata, Acta Cybernetica 7 (1985), 55-59. [6] Gerace, G. B., G. Gestri, Decomposition of Synchronous Sequential Machines into Synchronous and Asynchronous Submachines, Information and Control 11 (1968), 568-591. [7] Gerace, G. B., G. Gestri, Decomposition of Synchronous Machine into an Asynchronous Submachine driving a Synchronous One, Infomation and Control 12 (1968), 538-548. [8] Imreh, B., On isomorphic representation of commutative automata with respect to a,-products, Acta Cybernetica 5 (1980), 21-32.
[9] Imreh, B., M. Steinby, Some remarks on directable automata, Acta Cybernetica 12 (1995), 23-35.
[lo] Pin,
J. E., Sur un cas particulier de la conjecture de Cerny. - Automata, languages and programming, ICALP’79 (Proc. Coll., Udine 1979), LNCS 62, Springer-Verlag, Berlin 1979, 345-352.
[ll]Rystsov, I., Exact linear bound for the length of reset words in commutative automata, Publicationes Mathematime 48 (1996), 405-409.
[12] Yoeli, M.,Subdirectly irreducible unary algebras, Amer. Math. Monthly 74 (1967), 957-960. [13] Wenzel, G. H., Subdirect irreducibility and equational compactness in unary algebras (A; f), Arch. Math., Basel 21 (1970), 256-263.
222
Presentations of right unitary submonoids of monoids ISAMUINATA Department of Information Science, Toho University, &nubashi 274-8510,Jupan
1 Introduction In the case of groups, the index of a subgroup H of a group G is the number of different right cosets of H in G . This index is equal to the number of equivalence classes of the right congruence P H = ((2, y ) E G x G Izy-' E H } on G . Using this index, Reidemeister and Schreier showed that every subgroup of a finitely presented group of finite index is also finitely presented (see [5]). Several authors considered generalizations of the above result to semigroups or monoids (see
[I,2,3,4,6,71). The purpose of this paper is to obtain a generalization of ReidemeisterSchreier theorem for right unitary submonoids of monoids. In the rest of this section we give basic definitions and notations on presentations of monoids. Let A be an alphabet and A* the free monoid on A. The empty word is denoted by A. We set A+ = A* - {A}. A (monoid) presentation is an ordered pair ( A I R), where R A* x A*. An element (u,u ) in R is called a (defining) relation and usually denoted by u = u. A monoid M is defined by a presentation ( A I R ) if M A * / q , where q is the congruence on A* generated by R. For any w1,w2E A*,we write w1 = w2 if w1 and w2 are identical as words, and write w 1 =R w2 if w1 and wp represent the same element in M ,that is w1/q = wz/q. For any subset S of M , set L(A,S) = {w E A* I w/q E S}. A monoid is called finitely presented if it can be defined by a presentation ( AI R) in which both A and R are finite.
2
The index of submonoids of monoids
A subset U of a monoid M is right (resp. left) unitary if for any u E U and x E M , ux E U (resp. xu E U ) implies x E U . A subset U of M which is both right and left unitary is unitary. Let N be a right unitary submonoid of a monoid M . A right coset N x of N is maximal if N x N y for some y E M implies Na: = Ny. The (right) index of N (in M ) is the number of different maximal right cosets of N . Remark that we can define the left index of a left unitary submonoid of a monoid in the same
223
way, but even though a submonoid is unitary, the right index and the left index are not necessarily equal.
Proposition 2.1 Let N be a right unitary submonoid of a monoid M and { N m i I i E I } the set of different maximal right cosets of N . Then, (1) M = UiE1Nmi. (2) N m ; N m j for all i # j . (3) There is i E I such that N = Nmi, that is, N is itself mazimal. Proof. Clear. Let N be a right unitary submonoid of a monoid M and { N m ; 1 i E I} the set of different maximal right cosets of N . Then a set {mi I i E I } is called a set of generalized right wset representatives of N . By the above proposition, we can choose m; = 1 for some i E I where 1 is the identity element in M . Proposition 2.2 Let {miI i E I } with mo = 1 be a set of generalized right coset representatives of N . Then, N n N m ; = 0 for all i # 0. Proof. Clear from the unitarity of N .
3 Presentations of right unitary submonoids of monoids Let M be a monoid defined by a presentation ( A I R), cp : A* + M the natural surjection and N a right unitary submonoid of M . And let {ui E A* I i E I } be a subset of A* such that {cp(ui)li E I } is a set of generalized right coset representatives of N . We choose uo G A. For any i E I and a E A, fix j 6 I such that cp(uj) is a generalized right coset representative of cp(uia). Then, for any i E I and w ala2.. .a, E A+, there exist j,,j,, . ..,&+I E I such that j1 = i and c p ( ~ i ~is+the ~ )fixed generalized right coset representative of cp(uj,ak) for all k = 1 , 2 , . . . ,r. Such j k + 1 is denoted by iw. Since for any i E I and a E A, there is n E L(A,N ) = {w E A* I cp(w) E N } such that
uia = R
mia,
we choose such n and denote it by n;,a. Using this notation, for any i E I and w = a1a2 . . .a, E A*, we have, U i W = R ni,alnia,,az . "nialaz..-a,-l,a,uiw.
The word ni,alnial,az. .-niala *...a,-l,av is simply denoted by n(i,w).
Lemma 3.1 For any w E A+, w E L(A,N ) i f and only i f Ow = 0.
224 Proof. For any w E A*, w = R n(0,w)uow. If w E L(A,N), then n(0,w)uow E L ( A , N ) . Since.N is right unitary, uow E L ( A , n ) , and hence Ow = 0. Conversely, if Ow = 0, then uow _= uo = A. Thus w E L(A,N). Now we have,
Theorem 3.2 N i s generated by the set {n;,a I i E I , a E A } . Proof. For any w
= a1a2 . - - a , E L(A,N),
Corollary 3.3 Eve? right unitary submonoid of a finitely generated monoid of finite index is finitely generated. Let M be a monoid, N a right unitary submonoid of M generated by a set X = {xi E M l i E I} and Y = {mj E M l j E J } a set of right coset representatives of N. Then it is easy to show that M is generated by X U Y. So, Corollary 3.3 can be strengthened to Corollary 3.4 Let M be a monoid and N a right unitary submonoid of M of finite index. Then, M i s finitely generated if and only if N i s finitely generated. Let B = {bi,, 1 i E I , a E A } be a new alphabet and $ : B* + A* a monoid homomorphism induced by the mapping b;,a e ni+. For i E I and w = a1a2 . ..a, 6 A+, the word bi,albial,az . . .b;ala2...aT-l,a., is denoted by b(i, w). Define a mapping 4 : L(A,N ) -+ B* by
4(A)
= A, and
$(w) = b(0,w) (w E A+). Now we have,
Lemma 3.5 N i s defined by the generators B , and the relations
where i E I , a E A, and w1, w2 E A*, u = v E R such that w1uw2 E L(A,N).
Proof. The mapping 4 is a reuniting mapping in the sense of [l].So, by [ l , Theorem 2.11, N is defined by the generators B and the relations d(ni,a) = h,a (i E 1,a E A ) ,
4(WlW2) = 4(w1)4(w2) (w1, w2 E L(A,N)), and (4) $ ( w ~ u w ~=)4(wlvw2) ( w ~ w2,E A*,u = v E R such that wluw2 E L(A,N)).
225
For any
Thus (W1W2)and (w1) (w2) are exactly equal as words over. B In this we can delete the relation of the form (4), andf we have the desired rela
4
Finitely presentability of right unitary submonoids of monoids
In the previous section, we have obtained a presentation of a right unitary submonoid of a monoid defined by some presentation. But such a presentation may be infinite even though both the presentation of M and the index of N are finite. In this section, we consider the following problem: when does a right unitary submonoid of a finitely presented monoid of finite index have a finite presentation? With the notations in the previous section, we have,
Lemma 4.1 N is defined by the generators B , and the relations
4(ni,a) = bi,a, and 4 ( U O W 1 ~ ~= 2 )4 ( U 0 W l V W 2 ) ,
where i E I , a E A, and
w l , w2 E A*, u
=uE
R such that
wluw2
E L(A,N ) ) .
Proof. To prove the assertion, it suffices to show that the relations (2) and (3) can be derived from the relations (5) and (6). The set of relations of the form (5) and (6) is denoted by R’. The relation (2) directly follows from the relation (5). Let q5(w1uw2)= c$(w1uw2)be a relation of the form (3), that is, w1, w2 E A*, u = u E R and w1uw2 E L(A,N ) . Then, we have (b(Wlaw2)
=R‘
(b(n(O,
W1)uOwl aw2)
=
4(@,
W1))4(~Ow,~W2)
=R’
d(n(0,wl))4(uOw1Vw2)
=R‘
4(WIUW2).
This completes the proof of the lemma.
Now we introduce an automaton A = (I,A, 6,0,0) associated with our relations as follows:
226 (1) I is the set of states, (2) A is the input alphabet, (3) 6 : I x A + I is a transition function defined by
d(i, a) = { j
E
I I there exists n
E L ( A ,N )
such that rnia = R nrnj}.
(4) 0 is the initial state, (5) 0 is the terminal state.
Remark that A is non-deterministic, in general. Proposition 4.2 With above notations, w E L ( A ,N)if and only if d(0, w) = 0.
Proof. It is immediate from Lemma 3.1. We say that an automaton A' is a deterministic choice of A, if A' is a deterministic subautomaton of A. And A' is called cycle-free, if there is no non-trivial directed cycle that does not contain 0.
Theorem 4.3 Let A be the automaton dejined above. Assume that both ( A1 R ) and I are finite. If there is a cycle-free deterministic choice of A, then N is finitely presented. Since there is a cycle-free deterministic choice of A, for any i E I and a E A, we can choose ia E I, which is cyclefree and deterministic. And E L ( A , N ) such that rnia = R n+rnia. Hence we obtain a we choose n+, presentation of N given in Lemma 4.1. To show the theorem, it is enough to show that the relations of the form ( 6 ) are finite. Since above choice is cyclefree, for any state i c? I, there are at most a finite number of paths from i to 0. So, for any w1 E A* and (u, w) E R, there are only a finite number of paths from Owlu to 0. Thus the relations of the form (6) are finite. This completes the proof of the theorem.
Proof.
Example 1. Let A = {a, b } , R = {aba = a, bab = b } and M the monoid defined by the presentation ( A ,R ) . And let N be the submonoid of M generated by a set {cp(ambn),cp(bnam) 1 rn,n E NO,rn n : even}. Then it is easy to see that N is a unitary submonoid of M and {A, a, b} is a set of generalized right coset representatives of N . In our automaton A, the transition function d is defined as
+
d(A,a) = a, d(A,b) = b, d(a, a) = A, d(a,b) = A, d(b,a) = A, 6(b,b)= A. It is clear that A is cycle-free and deterministic. Hence, N is finitely presented by Theorem 4.3. In fact, put aa = e, ab = f, ba = g and bb = h, then N is defined by the generators {e, f,g, h} and the relations {fe = e, f 2 = f, eg = e, g2 = g, gh = h, he = h}.
227
References [I] C.M. Campbell, E.F. Robertson, N. RuSkuc and R.M. Thomas, Reidemeister-Schreier type rewriting f o r semigroups, Semigroup Forum 51 (1995), 47-62. [2] C.M. Campbell, E.F. Robertson, N. Rudkuc and R.M. Thomas, O n subsemigroups of finitely presented semigroups, J. Algebra, 180 (1996), 1-21. [3] C.M. Campbell, E.F. Robertson, N. RuSkuc and R.M. Thomas, Presentations f o r subsemigroups - applications t o ideals of semigroups, J. Pure Appl. Algebra, 124 (1998), 47-64. [4] A. Jura, Determining ideals of a given finite index in a finitely presented semigroups, Demonstratio Math., 11 (1978), 813-827. [5] W. Magunus, A. Karrass and D. Solitar, Combinatorial Group Theory, Interscience Publishers, New York, 1966. [6] N. RuSkuc, O n large subsemigroups and finiteness conditions of semigroups, Proc. London Math. SOC.,76 (1998), 383-405. [7] N. Rudkuc, Presentations f o r Subgroups of Monoids, J. Algebra, 220 (1999), 365380. [8] N. RuSkuc and R.M. Thomas, Syntactic and Rees indices of Subsemigroups,
J. Algebra 205 (1998), 435-450.
228
A combinatorial property of languages and monoids A.V. KELAREVAND P.G. TROTTER School of Mathematics and Physics, University of Tasmania, G.P.0. Box 252-37, Hobart, Tasmania 7001, Australia Email:
[email protected] [email protected] In a 1976 paper 8 , B.H. Neumann characterized center-by-finite groups
as being groups with a particular combinatorial property; a group is centerby-finite if and only if every infinite sequence of its elements contains a pair of elements that commute. The characterization was produced as an answer to
a question by Paul Erdos and has led to a series of papers by various authors in which combinatorial properties of algebraic structures have been investigated. A survey of this direction of research, by the first author, appears in 6 . Our aim here is to investigate formal languages that satisfy particular combinatorial properties (namely, permutational properties) with respect to combinatorial and finiteness properties of their syntactic monoids. Given an alphabet A, let A+ and A* denote respectively the free semigroup and the free monoid generated by A. A subset L of A* is called a
language on A. The syntactic congruence induced b y L is the congruence p~ on A* defined by p~ = {(u,v) I avb E L H awb E L, for all a , b E A*}.
The quotient semigroup Syn(L) = A * / ~ is L called the syntactic monoid of
L (see4). It is well known that a language L is recognized by a finite state
229
automaton if and only if Syn(L) is finite. Furthermore, the property of a language L being rational, or regular, is equivalent to Syn(L) being finite. Let S , be the symmetric group on {1,2,. . . ,n} for some positive integer n. A semigroup S is said to be n-permutational if, for any elements X I ,2 2 , .
. .,x,
in S, there exists a non-identity permutation u E S, such that
A semigroup is permutational if it is n-permutational for some n. This notion generalizes commutativity and has been actively investigated (see 5 ,
for
references). In particular, by ', a group is permutational if and only if it is finiteby-abelian-by-finite. An important result of Restivo and Reutenauer
states that a finitely generated periodic semigroup is permutational if and only if it is finite; that is, a language on a finite alphabet is recognisable by a finite state automaton if and only if its syntactic monoid is periodic and
permutational. Because of the connection between a language L and its syntactic monoid via the congruence p ~it, is natural to define L to be n-permutational for some positive integer n if, for each word w E L and each factorization
of w, there exists a non-identity permutation u E S, such that
Define L to be permutational if it is n-permutational for some n. In 2, permutational semigroups are called 'permutable semigroups'. A language L is defined in2 to have the permutation property if, for some n and
230
for any words u, 2 1 , .
. . ,x,,
v in A*, there exists a non-identity permutation
u E S, such that '11x1* ..x,v E L
*
.. .XU(,)V
UZU(1)
E L.
It is clear that a language L with the permutation property is permutational. However, with A = {a, b}, there is a language L over A that is permutational but does not satisfy the permutation property. To see this, consider L = A* \ {aba2b2...anbn I n 2 2 ) .
It is easy to verify that this language is n-permutational for each n 2 3. However, with u = 1 = v and xi = aibi for 1 5 i I n, we get 21x1 . ..xnv $ L; yet for any non-identity permutation u E
S3,uxu(l)- - - X ~ ( ~ E) VL.
Hence L
does not have the permutation property. We begin with a pair of easy deductions based on the definitions and on the above mentioned result of Q .
THEOREM 1 For any language L, the monoid Syn(L) is n-permutational only if L is n-permutational.
Proof. Suppose that w
= uu1uz ---u,v, for some u,u;,v E A*, 1
5
i 5 n. Since Syn(L) = A * / ~ isL n-permutational, for some n, there exists u E S n \ (1) such that
It follows that uu1uz.. .unv E L
*
.
U U u ( ~ ) U u ( ~* )' U U ( , ) V
E L.
23 1
COROLLARY 2 Every language that is recognized by a finite state automaton is permutational.
Proof. Let L be a language over a finite alphabet that is recognized by a finite state automaton. Then by9, since Syn(L) is finite, Syn(L) is permutational, and so Theorem 1 completes the proof. 17 Corollary 2 also follows from Theorem 4; we have included both versions to show the first easier proof. The next example shows that the severing of the connection between languages and finite semigroups, as exists in Corollary 2, can result in a non-permutational language. Moreover, Corollary 2 does not generalize to context-free languages.
EXAMPLE 3 Let G be a context-free grammar in Chomsky Normal Form with alphabet 7,
A
= {a, b}, non-terminal symbols V = { a ,p, r}, start-symbol
and productions +rr, r + w r p , a + a,
O + b, r + ab.
Clearly, G generates the language M+, where
M = {anbn I n 2 1). We show that this language is not permutational. Indeed, for any positive integer n the following product p = aba2b2a3b3..-a"bn E M+
can be factorized as p =~
1 2 2 - -= - 2~(ba~)(b~a~)..-(b"-~a")b", ~
232
where
21
= a, 2 2 = ba2,
. .., xn-l
= bn-lan,
2"
= b". It is easily seen that,
for any non-identical permutation u,
Given a recognizable language L, the result of Restivo and Reutenauerg does not provide a formula for estimating the least n such that Syn(L) is n-permutational. The next theorem gives us a bound for the least n such that L is n-permutational.
THEOREM 4 Every language that is recognized by a (possibly non-deterministic and incomplete) finite state automaton with k states is 2k-permutational.
Proof. Suppose a language L is recognized by a finite state automaton d(S,X,cp,so,T)with IS1 = k. Take any word w E L and any factorization w = uulu2.. .'112kU. For each integer 0 5 i 5 2k, consider the state
where we assume u
= UO.
By the pigeonhole principle there exist 0 5
iz < i3 5 2k such that sil = siz
= si3. It
il
m. Indeed, as only insertions are permitted, z E (v), implies (v1 5 1x1; therefore, A E (v), and v E K U {A} imply v = A. On the other hand, as v E (A), implies IvI 5 m, one has that v E (A), and v E K U { A } imply v = A. Now let v1 and w2 be codewords of K such that v1 E ( ~ 2 ) ~As. only insertions are permitted, one has that lvll 2 Iw2I. In particular] 1 w l l = 12121 if and only if no insertion occurs in 212, if and only if w 1 = 212. Hence, as K is uniform, v1 = v2. Analogously, one can verify that every uniform code K is error detecting for d(m, 1),provided len K > m. Example 4 One can verify that the code KO= {000,111} is error-detecting for the channel y = (T 0 L 0S(1,3). But KOis not (y, *)-detecting. Indeed, consider the messages w2 = (000)3 and w1 = (000)2 such that w1 # w2. Then, w1 E (wg), by deleting appropriately three symbols from w2. Example 5 Consider the code K1 = {q,v2 I v1 = 00111, v2 = 0101011) and the channel y = 6(1,7). From the equalities (.I), = { v ~ , O l l l , O O l l }and
(v&
= {v2,101011,001011,011011,010011,010111,010101},
one verifies that K1 is error-detecting for y. In addition, we claim that K1 is (y,*)-detecting. Indeed, note first that A @ (w), and w @ (A), for all w E K t . Now consider two messages w1 and w2 in K t such that w1 E ( ~ 2 Then, w1 = [ I E ~ ]and w2 = [Q] for some factorizations 61 and 6 2 over K1. By property PI of the channel y,there is a factorization $ which is y-admissible for 6 2 such that [$] = w1 = [KI] and $(i) E ( ~ 2 ( i ) for ) ~ all i E I$ = In,. It is sufficient to show that $ = 61; then, as K1 is error-detecting for y,~ l ( i E) ( ~ ~ ( i implies )), ~ l ( i=) 1 c 2 ( i ) for all i in In,. So consider the word ~ l ( 0 of ) K1 which is a prefix of both, [nl] and [$I. If ~ l ( 0 = ) vl then $ ( O ) = v1 or $ ( O ) = 0011. The second case implies $(1) = 101011 which is impossible, as two deletions would occur in 62(0)n2(1) within a segment of length less than 7. Hence, $(O) = v1 as well. Similarly, one verifies that if ~ l ( 0 = ) vg then $(O) = v2 as well. Hence, $ ( O ) = nl(0) and $(1)$(2)... = ~ 1 ( 1 ) ~ 1 ( 2 ) The same argument can be applied repeatedly to obtain $(i) = ~ l ( i for ) all i in I$. The following proposition gives certain relationships between the error-
'
A code K is (?, *)-correcting if in K'.
# 0 implies W I = W Z ,for all
( ~ 1 n)(wz)-, ~
~1
and wz
246
detecting properties given in Definition 2.
Proposition 1 For every t in NO and f o r every P,-channel y, the following relationships are valid. (i) ED;+^ 2 ED;. (ii) ED; 2 ED,. (iii) ED; = nFoED;.
+
Proof: Consider a code K which is ( y , t 1)-detecting and the messages w1 E K S t and wz E K' such that w1 E (wz),. Let v E K . By property P 2 of the channel y,one has w1v E ( W Z V ) ~ .As w l v E Kit+' and wzv E K', it follows that w1v = w2v. Hence, w1 = wz and the first inclusion is correct. Obviously, the second inclusion is correct as well. For the third relationship, one can easily verify that ED; g ED; for all t in NO.Hence, ED; nEoED;. On the other hand, consider a code K in nE,,ED; and w1, w2 E K* with w1 E (wz),. Then, there is t E NO such that wl E Kt and, as K E ED;, it follows that w1 = wz. Hence, K E ED;. 0 Next it is shown that the inclusion in Proposition l ( i ) can be proper for every value of the parameter t.
Proposition 2 For every t in NO there is an SID-channel y such that ED;+,+' is properly contained in ED;.
+
Proof: For each t in No consider the SID-channel y = y(t) = 6 ( l , t 2) and the code K = K(t) = (Ot+'}. First we show that K is (y,t)-detecting and then that K is not (y, t + 1)-detecting. Let w1 E Km and 202 E Kn such that w1 E (wz),, m 5 t, and n E NO. As only deletions are permitted, lwll 5 IwgI. If lwll = 1w21 then w1 = w2 as required. On the other hand, we show that the assumption lwll < lwzl leads to a contradiction. Indeed, as IKI = 1, this assumption implies m 1 5 n. Now as wz consists of n codewords each of length t 2, at most one symbol can be deleted in each codeword and, therefore, a t most n deletions can occur in W Z . Hence, Iw1 I 2 lwzl - n which together with m 1 5 n imply
+
+
+
m(t
+ 2) 2 n(t + 2) - n + n 5 m(tt ++l 2) + m + l
1in a straightforward manner, the generalized algorithm does not seem to run in polynomial-time in the case k > 1. KO and Hua7 showed that the straightforward generalization to the two-variable case brings an NP-complete subproblem. KO and Tzeng’ have studied another important problem of finding a common pattern. That is the problem of finding a pattern consistent with given positive and negative examples. They showed that the problem of finding a pattern consistent with given positive and negative examples (i.e. given two sets S and T of constant strings, determine whether there exists -a pattern p such that S C L ( p ) and T 5 L ( p ) ) is C;-complete, where L ( p ) is the complement language of L ( p ) . KO et al.’ also stated that the complexity of the problem is not settled in the k-variable case for k 2 1. In this paper, we further investigate the modification of Angluin’s algorithm to the one-variable pattern-finding problem from given positive and negative examples. We show that the modified algorithm meets a difficult problem that is NP-complete. More precisely, in the modified algorithm, Step 3, finding a pattern that is recognized by Ai and consistent with each negative example seems to be difficult. We show that the pattern-finding problem is a subproblem of a graph problem and also show that the graph problem is NP-complete. Although this fact does not imply that the one-variable patternfinding problem from given positive and negative example is difficult, we can regard this fact as an aspect of the computational compIexity of the problem. We also give sufficient conditions that the one-variable pattern-finding problem from given positive and negative is efficiently computable. 2
Definitions
C is a finite alphabet containing at least two symbols. The set of all finite strings over C is denoted by C*. The set of all finite non-null strings over C
255 is denoted by C+. The set of all strings with the length k over C is denoted by Ck. A sample is a finite nonempty subset of C+, and each element of a sample is called an example. A pattern is a finite non-null string over C U {z}, where z is the variable symbol and not in C. Let PI denote the set of all patterns. The length of a pattern p , denoted lpl, is the number of occurrences of symbols composing it. For each set A, let I IAl I denote the cardinality of A. The concatenation of two patterns p and q is denoted by pq. The pattern that is k-times concatenation of a pattern p is denoted by p k . Let f be a non-erasing homomorphism from PI to PI with respect to concatenation. If f is an identity function when restricted on C, then f is called a substitution. We use a notation [ w / x ]for a substitution which maps the variable symbol z to the string w and every other symbol to itself. For any pattern p and for any substitution f , substituted pattern f ( p ) is denoted by P b J l X I . If p is a pattern, the language of p , denoted L ( p ) , is the set {s E Cf : s = f ( p ) for some substitution f } . A pattern p is said to be descriptive of a sample S if S C L ( p ) and for every pattern q such that S C L(q),L ( q ) is not a proper subset of L ( p ) . That is, for a descriptive pattern p of S , L ( p ) is minimal in the set-containment ordering among all pattern languages containing S . For any pattern p and any string s, if there exists a substitution f such that f ( p ) = s then we say that p generates s (by f ) . A pattern p is said to be consistent with a positive sample S and a negative sample T if S C L ( p ) and T C C* \ L ( p ) . 3
Review of the One-Variable Pattern-Finding Problem
The difficulty of the pattern-finding problem in the case of general patterns lies on that of the membership problem (i.e. given a pattern p and a constant string s, determine whether s E L ( p ) ) . The following shows the difficulty of the membership problem.
Proposition 1 (Angluinl (1980)) The problem of deciding whether s E L ( p ) f o r any string s E C* and f o r any pattern p i s NP-complete. However, in the case of one-variable patterns, the membership problem is decidable in polynomial time. This suggests that finding a common pattern in one-variable case may be solvable in polynomial time. Actually, Angluin's algorithm runs in polynomial time to find a common pattern from a given positive sample.
256
In this section, we review Angluin’s algorithm for finding a common onevariable pattern from a positive sample. We first define pattern automata. Let s be a string and let w be a nonempty substring of s. Denote P A l ( s ;w ) = { p E PI : s = p[w/3:]}. We define a (one-variable) pattern automaton A ( s ;w ) to recognize the set P A l ( s ; w ) . The states of A ( s ; w ) are ordered pairs ( i , j ) such that 0 5 i, 0 5 j , and i jlwl 5 IsI. The initial state is (0,O). The final states are all states ( i , j ) such that j 2 1 and i +jlwl = Is].The transition function 6 is defined as follows. Let b E C.
+
S((i,j),b) =
S ( ( i , j ) , 3 : )=
{
(Zt1,j)
undefined
{
(i,j
+ 1)
undefined
+ + jlwl)th symbol of s is b,
if the (1 i otherwise;
if w occurs in s beginning a t position (I i otherwise.
+ +jlwl),
The state ( i , j ) signifies that in the input string, z constant symbols and j occurrences of 3: have been read so far. Let A, = (Q,,Qo,S,,F,) for i = 1 , 2 be two finite automata over the alphabet C with the same initial state yo = (O,O), where Q , C: N x N is the set of states, S, is the transition function, and Fa is the set of final states of A,. Then we define A1 C A2 if and only if Q1 C Q2, F1 C F2 and whenever bl is defined, 62 is also defined and agrees with 61. A finite automaton A is called a one-variable pattern automaton if and only if A C A ( s ;w) for some string s and substring w. Let A, = ( Q z ,(O,O), S,, F a ) be two one-variable pattern automata, for i = 1 , 2 . Then the intersection of automata A1 and A2, denoted by A1 n A2, is the finite automaton (Q1 n Q2, (0, 0 ) ,S,F1 n F2), where b(q, a ) is defined t o be 61(q,a) whenever 6 1 ( y , a ) and 62(y,u) are both defined and equal; and is undefined Otherwise. Proposition 2 (Angluin’ (1980)) If A and A’ are one-variable pattern automata then A n A‘ as a one-variable pattern automaton, and L ( A n A’) = L ( A ) n L(A’). Next we discuss the partition of one-variable patterns into pairwise disjoint groups. For each one-variable pattern p , define ~ ( pto) be the triple of nonnegative integers ( i , j ,k ) such that the number of occurrences of constants in p is i, the number of occurrences of variables in p is j , and the position of the leftmost occurrence of 3: in p is k . Let P A ( i , j , k ) be the set of all patterns p in PI such that the number of occurrences of constants in p is i, the
257
number of occurrences of variables in p is j , and the position of the leftmost occurrence of x in p is k . Let us call a triple ( i , j , k ) feasible f o r s if 0 5 i 5 IsI, 1 5 j 5 Is], 1 5 k 5 i 1, and j just divides Is/ - i. We say a triple ( i , j ,k ) is feasible f o r a set S if it is feasible for all s in S. Let F be the set of all feasible triples for the given set S. We construct, for each string s and each triple ( z , j , k ) that is feasible for s, a pattern automaton A(s;w) where w is the unique string defined by the triple. Let an input sample S = {sl,. . . s,} be given, where each si E C+ and m 2 2. Then each triple ( i , j , k ) in F defines m automata A,(i,j, k ) , for T = 1,.. . ,m, as follows. Let w, be the substring of s, beginning at position k and with the length (Is,] - i)/j. To obtain A , ( i , j , k ) , take A(s,; w,) and remove any z transition leaving from a state (u, 0) where u < k - 1, remove the constant transition leaving from the state (0, k - l),and remove all final states except ( i , j ) .
+
Proposition 3 (Angluinl (1980)) A,(i, j , Ic) recognizes all patternsp in PI such that s, E L ( p ) and ~ ( p=) ( z , j ,Ic). Consequently,
u (fi L
(i,j,lc)EF
A,(i,j,h ) )
={p€
Pl : s
c L(p)}.
r=l
The above observation gives us the following algorithm. Angluin's One-Variable Pattern-Finding Algorithm INPUT:S = { s l , . . . , s,}; OUTPUT:a one-variable pattern p which is descriptive of S within PI; begin for each (i,j , Ic) in F do begin for T := 1 to m do construct automaton A,(i, j , k ) ;
n m
~ ( i , j , k:= )
~,(i,j,k);
T=l
end; sort F in descending order according to the value of a for each ( i , j ,Ic) in sorted F do if llL(A(i,j,k))ll # 0 then output any p E L(A(Z,j,k ) ) and exit end.
+j ;
258
In the above algorithm, it is clear that the time complexity depends on two factors: one is the number of feasible triples, and the other is the amount of time to construct A ( i , j , k ) . Let !be the input size, that is, IS,^. Since, for each feasible triple (i,j , k ) and each r , 1 5 r 5 m, the automaton A T ( i , j ,k ) can be constructed in time O(lsT12)and the intersection of automata can be constructed in linear time with respect to the size of the automata, the automaton A ( i , j ,k ) can be constructed in time O(C;==l1s,.I2). Furthermore, using a theorem (on number theory) of Dirichlet, we can show that IlFll is 0(t2log!). Therefore, the above algorithm runs in time 0(t4log!). We note that Angluin's algorithm guarantees the following property, which is useful in what follows.
xy=l
Proposition 4 (Angluin' (1980)) For any feasible tripZe ( i ,j , k ) f o r S , i j there exists a descriptive pattern of S in L ( A ( i , j ,k ) ) then all patterns in L ( A ( i , j ,k ) ) are descriptive of S . 4
Finding Patterns from Positive and Negative Examples
In general case, the pattern finding problem from positive and negative examples is not easier than the pattern finding problem from positive examples only. The following proposition suggested this observation.
Proposition 5 (KO and Tzeng' (1991)) T h e problem of deciding whether there exists a pattern p which is consistent with a positive sample S and a negative T is C;-complete. When the number of variables is fixed, whether the problem is efficiently solvable or not has been unsettled. In this section, we extend Angluin's algorithm to deal with negative examples. The following is a straightforward extended algorithm to find a one-variable pattern consistent with positive and negative examples.
One- Variable Pattern-Finding Simple Algorithm from Positive & Negative Sample INPUT:S = { s ' , . . . , s m } , T = { t l , . . . , t n } ; OUTPUT:a one-variable pattern p which is descriptive of S within PI and is consistent with S and T ; begin for each ( i , j , k ) in F do begin
259
for T := 1 to m do construct the automaton A r ( i , j ,k ) ;
n m
~ ( i , ,q j , :=
~ ~ ( i IC); , j ,
r=l
end; sort F in descending order according t o the value of i j ; for each ( i , j ,k ) in sorted F do begin if JIL(A(i,j,k))lJ = 0 then continue ;go t o the end point of the loop body if JJL(A(i,j, k))Jl = 1 then begin for T := 1 to n do check whether t, E L ( p ) ; if ‘dt E T [t $? L ( p ) ]then output p else continue ;go t o the end point of the loop body end else begin for T := 1 to n do begin construct the automaton AtT(Z,j, k ) := A(Z,.j,k ) n A t r ( i , jk, ) ; Er := { e I edge e appears in A but not in At.) end; find a pattern p which goes through a t least one edge in each E, output p and exit end end; output “none” end.
+
Obviously, time complexity of the algorithm is determined by the time complexity of execution of the step (*). We formulate the above problem into a decision problem, called the MCP problem, and show that MCP is an NP-complete problem.
260
Multiple Color Path(MCP) Problem: Given a directed acyclic graph G = (V,E ) , where V is a set of vertices and E is a set of edges, and given specified vertices s and t and subsets E l , E2,. . . ,Ek of E , find out whether there is a path from s to t which goes through at least one edge in each Ei. Theorem 6 The MCP problem is NP-complete. Proof: It is obvious that the M C P problem is an N P problem. We show that there is a polynomial time reduction of 3SAT to MCP, where 3SAT = {p I 'p is a Boolean formula in the conjunctive normal form (CNF) in which each clause contains exactly three literals and 'p is satisfiable}. Let cp be a Boolean formula in CNF with m variables X I , x i , . . . ,x , and n clauses, and with three literals per clause. That is,
where ei,j (1 5 i 5 n, 1 5 j 5 3) is either xk or PI, for some k with 1 5 k 5 m and %k denotes the negative literal of xk. We first define a graph G ( v ,E ) and specified vertices s and t as follows:
and
E
= El U Ez,
where
261 and
E2 =
Next we define path constraints, that is, subsets of E as follows: E i , j ,= ~ {fi,j,~}U {if !i,j = Xk Ei,j,F = { f i , j , F }
u {if &,j
=Zk
Ec,i = {if &,I is positive then
then
ek,F
else null },
then
ek,T
else null},
fi,l,T else fi,l,F}
U {if !i,2 is positive then
f i , 2 , ~else f i , 2 , ~ }
U {if
fi,3,~ else f i , 3 , ~ } ,
&,3
is positive then
for each i , j with 1 5 i 5 n, 1 5 j 5 3. It is easy to see that this reduction is computable in polynomial time. We have only to show that 'p is satisfiable if and only if there is a path in G from s to t which goes through at least one edge in each E i , j , ~Ei,j,T , or E c , ~ .Due to the construction of G, all paths from s to t go through either e k , T or e k , F , for all k (1 5 k 5 m). This fact corresponds to the truth assignments for variables in 'p: the fact that the path goes through e k , T means x k = 1. Due to the setting of E i , j , ~ and E i , j , ~if, x k appears in j t h clause as a positive (resp., negative) literal, the value of the literal is determined to 1 (resp., 0). If the path goes through e k , F then that means 21, = 0. Due to the setting of E i , j , ~ and E i , j , ~if, xk appears in j t h clause as a positive (resp., negative) literal, the value of the literal is
262
determined to 0 (resp., 1). Moreover, the setting of Ec,i guarantees the truth value of each clause. Therefore, we can say that 'p is satisfiable if and only if there is a path from s to t which goes through at least one edge in each E i j , ~ , Ei,j,T O r
Ec,i.
The above theorem suggests that the one-variable pattern-finding problem from positive and negative examples could be hard. However, it does not mean that the problem is negatively settled. In what follows, we consider sufficient conditions that the one-variable pattern-finding problem from positive and negative examples is efficiently computable. Theorem 7 Suppose that the number of negative examples is bounded by a constant. Then, the one-variable pattern-finding problem f r o m positive and negative examples is polynomial-time computable.
Proof: In case that the number of negative examples is bounded by a constant, it is the following subproblem of the MCP problem that we have to solve in order to execute the step (*) in the new algorithm. The subproblem is the case that k is bounded by a constant. It is easy to see that the subproblem is polynomial-time computable. Thus, the new algorithm finds a one-variable pattern from positive and negative examples in polynomial-time. 0 Pattern finding problem is related to solving word equations. Let p l and pa be one-variable patterns accepted by some pattern automaton for a positive sample S. We can regard any solution w for the equation p1 = p2 as substitution f = [w/rc],since f ( p l ) = f ( p 2 ) . Then, S is a subset of all substituted patterns f ( p l ) such that f ( p l ) = f (p2). The following is well known in the literature of the word equation problems. Proposition 8 Let p l and p2 be one-variable patterns. Any solution f o r p l = p2 belongs to either
{ ( ~ p ) :~i a2 0 } such that la1 5 lpll, 5 lp11, a and p are uniquely determined, and ap is a primitive word or
1. the set
2. some finite set whose elements are shorter than p l or of length Ipll. We call solutions in the former set long solutions and ones in the later set short ones. The analysis of the input S also brings some sufficient conditions the one-variable pattern-finding problem from positive and negative examples is efficiently computable.
263
Theorem 9 Let S = {sl,. . . , s,} be positive sample and T = { t i , .. . ,tn} be negative sample. Let smin be a n element in S of the minimum length. If ( I ) there exist s , , s ~ E S such that s, # S b and Is,I = lSbl 2 Is,in12 or (2) there exists s E S such that Is1 2 IsminI2and 'dt E T [It!2 I ~ ~ i ~then 1 ~ the ] , new algorithm find a one-variable pattern from positive and negative examples in polynomial-time.
s,
Proof: NOW we assume that S a , S b E S, # S b and = lSbl 2 ISmin12. Then the length of any pattern p which generates s,in is at most Ismini. Also the number of variable symbols in any pattern p which generates sminis at most Is,inI. Since any pattern automaton A ( i , j , k ) appeared in the new algorithm recognizes patterns of the same length, if there is a one-variable pattern p such that (1) the number of variable symbols in p is exactly Ismin[, (2) the pattern p generates smin,and (3) the pattern p is recognized by the pattern automaton, then the pattern automaton recognizes at most one onevariable pattern. So, we have to consider only the case the number of variable symbols in p is less than Ismini.We assume that there are two distinct onevariable patterns p l and p2 in L ( A ( i ,j , k ) ) . Since there exists s, E S such that Is,[ 2 Ismin12, there exists a substitution f such that f = [ w / x ]f(p1) , = f( p z ) , and IwI 2 lsminl+l.This means that the equationpl = pa has solutions whose length is more than ( p l ( .Proposition 8 says that there exists at most one long solution of the same constant length. The existence of two distinct strings s, and S b such that Is,I = lSbl 2 IsminI2,sa = fa(pl) = f a ( p 2 ) for some substitution fa and sb = fb(p1) = f b ( p 2 ) for some substitution f b means that there exist two distinct long solutions of the same length. This contradicts that there exists at most one long solution of the some constant length. So we can say that l l L ( A ( i , j , k))ll 5 1. If any pattern automaton A ( i , j , k ) appeared in the new algorithm is either a single path or the null automaton, then the new algorithm always skips the step (*), namely, the new algorithm runs in polynomial-time. Next we assume that there exists s E S such that Is1 2 Is,inI2 and W E T [It12 I ~ , i ~ 1 ~ ] Using . the similar discussion as above, we may assume that there are two distinct one-variable patterns p1 and p2 in L ( A ( i , j , k ) ) . This assumption and Is\ 2 Jsmin12 imply that the existence of long solutions for the equation p l = p2. Since all elements in T is long, it is easy to see that A t ( i , j , k ) is either the same as A(z,j , k ) , a single path automaton, or the null automaton. This fact enables us to run the step (*) in polynomial-time. Thus, we can say that the new algorithm runs in polynomial-time. 0
264
5
Conclusion
We considered the computational complexity of the one-variable patternfinding problem from given positive and negative examples. We modified the Angluin’s algorithm, which efficiently solves one-variable pattern-finding problem from positive examples only, in order to cope with negative examples. We showed that the modified algorithm involves some difficult problem (say, the MCP problem) and the problem is NP-complete. Since the modified algorithm actually involves some subproblem of the MCP problem, NP-completeness of the MCP problem does not imply the difficulty of the one-variable patternfinding problem from given positive and negative examples. We also showed some sufficient conditions that the one-variable pattern-finding problem from given positive and negative examples is computable in polynomial-time. Some conditions are obtained using properties of the word equation problem. Since pattern-finding problem is related to the word equation problem, more careful analysis on the word equation problem may affirmatively solve the onevariable pattern-finding problem from given positive and negative examples. References 1. D. Angluin. Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21( 1):46-62, 1980. 2. W. Charatonik and L. Pacholski. Word equations with two variables. In Proceedings of the 2nd International Workshop on Word Equations and Related Topics, IWWERT’91, Lecture Notes in Computer Science 677, pages 43-56. Springer-Verlag, 1991. 3. M. R. Garey and D. S. Johnson. Computers and Intractability : A Guide to the Theory of NP-Completeness. W. H. Freeman, New York, 1979. 4. L. Ilie and W. Plandowski. Two-variable word equations. In Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science, STAGS 2000, Lecture Notes in Computer Science 1770, pages 122-132. Springer-Verlag, 2000. 5. T. Jiang, A. Salomaa, K. Salomaa, and S. Yu. Decision problems for patterns. Journal of Computer and System Sciences, 50(1):53-63, 1995. 6. D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323-350, 1977. 7. K.-I KO and C.-M Hua. A note on the two-variable pattern-finding problem. Journal of Computer and System Sciences, 34( 1):75-86, 1987. 8. K.-I KOand W.-G Tzeng. Three Cr-complete problems in computational learning theory. Computational Complexity, 1(3):269-310, 1991.
265
9. K.-I KO, A. Marron, and W.-G Tzeng. Learning string patterns and tree patterns from examples. In Proceedings of the 7th International Conference on Machine Learning, pages 384-391. Morgan Kaufmann, 1990. 10. M. Lothaire. Combinatorics on Words. Addison-Wesley, Reading: Massachusetts, 1983. 11. G. S. Makanin. The problem of solvability of equations in a free semigroup. Mathematics of the USSR Sbornik, 32(2):129-198, 1977. 12. S. E. Obono, P. Goralcik, and M. Maksimenko. Efficient solving of the word equations in one variable. In Proceedings of the 19th International Symposium on Mathematical Foundations of Computer Science, MFCS’94, Lecture Notes in Computer Science 841, pages 336-341. Springer-Verlag, 1994.
266
O N THE STAR HEIGHT OF RATIONAL LANGUAGES A NEW PRESENTATION FOR TWO OLD RESULTS SYLVAIN LOMBARDY AND JACQUES SAKAROVITCH Laboratoire Traitement et Communication de l’lnformution, CNRS / ENST, 46, rue Barrault, 75 634 Paris Cedex 13, France E-mail: {lombardy ,sakarovitch}Qenst .fr T h e star height of a rational language, introduced by Eggan in 1963, has proved t o be the most puzzling invariant defined for rational languages. Here, we give a new proof of Eggan’s theorem on the relationship between the cycle rank of a n automaton and the star height of a n expression that describes the language accepted by the automaton. We then present a new method for McNaughton’s result on the star height of pure-group language. It is based on t h e definition of a (finite) automaton which can be canonically associated t o every (rational) language a n d which we call universal. In contrast with the minimal automaton, the universal automaton of a pure-group language has the property that it contains a subautomaton of minimal cycle rank that recognizes the language.
The star height of a rational language is the infimum of the star height of the rational expressions that denote the language. The star height has been defined in 1963 by Eggan who basically proved two things and asked two questions. Eggan showed first that the star height of a rational expression is related t o another quantity that is defined on a finite automaton which produces the expression, a quantity which he called rank and which we call here loop complexcity. He proved then that there are rational languages of arbitrary large star height, provided that an arbitrary large number of letters are available. And he stated the following two problems.
Is the star height of a rational language computable? 0 Does there exist, on a fixed finite alphabet, rational languages of arbitrary large star height?
For a long time, the first one was considered as one of the most difficult problems in the theory of automata and eventually solved (positively) by Hashiguchi l o in 1988. The second problem, much easier, was solved in 1966 by Dejean and Schutzenberger 7 , positively as well. Soon afterwards, in 1967, McNaughton published a paper 1 2 , entitled “The loop complexity of pure-group languages” where he gave a conceptual proof of what Dejean and Schutzenberger had established by means of coinbinatorial virtuosity (one of the ‘(jewels”in formal
267
language theory cf14). He proved that the loop complexity, and thus the star height, of a language whose syntactic monoid is a finite group is computable and that this family contains languages of arbitrary large loop complexity (the languages considered by Dejean and Schutzenberger belong to that family). The purpose of this communication is t o give a new, and hopefully enlightening, presentation of Eggan’s and McNaughton’s results. We first give a new proof of Eggan’s theorem, by describing an explicit correspondence between the computation that yields the loop complexity of an automaton and the computation of an expression that denotes the language accepted by the automaton. We then present a new method for McNaughton’s result on the star height of pure-group language; it is based on the definition of a (finite) automaton which can be canonically associated to every (rational) language and which we call universal. In contrast with the minimal automaton, the universal automaton of a pure-group language has the property that it contains a subautomaton of minimal cycle rank that recognizes the language. In a forthcoming paper, we show how this method can be extended and the result generalized from pure-group languages to reversible languages”. We mostly use the classical terminology, notation and results for automata and languages (cf.’). We give explicit notes when we depart from the standard ones.
1
Eggan’s Theorem
1.1 Star height and loop complexity
Rational expressions (over A’) are the well-formed formulae built from the atomic formulae that are 0, 1 and the elements of A and using the binary operators + and . and the unary operator *. The operator * is the one that “gives access to infinity” . Hence the idea of measuring the complexity of an expression as the largest number of nested calls to that operator in the expression. This number is called the star height of the expression, denoted by h[E] and defined recursively by: if E = 0, E = 1 or E = a E A ,
h[E] = 0 ,
if E = E’+ El’ or E = E’ . E”,
h[E] = max(h[E’], h[E”]) ,
if E = F*,
h[E] = 1 + h[F] .
+
Examples 1 i) h[(u b)*] = 1 ;h[a* ( b a * ) * ]= 2 . ii) h[u* a*b(ba*b)*bu* + u*b(bu*b)*a(b+ a(ba*b)*a)*a(ba*b)*ba*] =3 , h[(a b(ba*b)*b)*]= 3 ; h[a*b(ab*a ba*b)*ba*]= 2 .
+ +
+
268
These examples show that two equivalent expressions may have different star heights (the expressions in i) as well as those in ii) are equivalent). The following definition is then natural. The star height of a rational language L of A*, denoted by h[L], is the minimum of the star height of the expressions that denotea the language L: h[L] = min{h[E] I E E RatEA* IEI = L } .
Definition 1
The star height of an expression also reflects a structural property of an automaton (more precisely, of the underlying graph of an automaton) which corresponds to that expression. In order to state it, we define the notion of a ballb of a graph: a ball in a graph is a strongly connected component that contains at least one arc (cf. Figure 1).
Figure 1. An automaton, its strongly connected components, and its balls.
Definition 2
The loop complexity‘ of a graph recursively defined by:
G
is the integer Ic(G)
if G contains no ball (in particular, if G is empty); if 6 is not a ball itself; Ic(G) = max{lc(P) I P ball of G} Ic(G) = 1 + min{Ic(G \ {s}) I s vertex ofg} if G is a ball. lC(6)
=0
As Eggan showed, star height and loop complexity are the two faces of the same object:
’
The loop complexity of a trim automaton A is equal t o the infimum of the star height of the expressions (denoting IdI) that are obtained by the different possible runs of the McNaughton-Yamada algorithm on A.
Theorem 1
“We write IEl for the language denoted by the expression E. Similarly, we write Id1 for the language accepted by the automaton A. RatE A’ is the set of rational expressions over the alphabet A. bLike in a ball of wool. ‘Eggan calls it “cycle rank” . McNaughton l 2 calls loop complexity of a language the minimum cycle rank of an automaton that accepts the language. We have taken this terminology and made it parallel to star height, for “rank” is a word of already many different meanings.
269
There is an infimum “hidden” in the definition of the loop complexity and the theorem states that it is equal to another infimum. It proves to be adequate to make this infimum more explicit and, for that purpose, to define the loop complexity, as well as the star height, relatively to an order on the vertices of the graph (or on the states of the automaton). We shall then relate more closely the two quantities, showing that they are equal when they are taken relatively to an order. The equality of the two minima will follow then obviously. We use the following notation and convention. If w is a total order on a set Q, we denote by G the largest element of Q for w. If R is a subset of Q, we still denote by w the trace of the order w on R and, in such a context, G is the largest element of R for w.
Definition 3 Let I; be a graph and w a total order on the set of vertices of I;. T h e loop complexity of I; relative to w is the integer Ic(I;, w) recursively defined by: i f G contains no ball (in particular, if G is empty); lc(G,w) = 0 if I; is not a ball itself; lc(I;, w ) = max{lc(P, w) I P ball of I;} Ic(I;, u)= 1
Property 1
+ Ic(G \ {GI, w)
if
For any graph I;,
is a ball.
Ic(I;) = min{lc(I;,w)
I
(1) (2)
(3)
w order on I; }.
Proof. By induction on the number of vertices of G, the base being 0. We see first that, for any total order w , lc(G) 5 Ic(G,w) which clearly holds if 6 contains no ball or is empty. If it holds
G is not
a ball itself,
Ic(I;) = max{lc(P) I P ball of I;} 5 max{lc(P,w) I P ball of 6) = Ic(I;,w) since Ic(P) it holds
5 Ic(P, w)
as P has strictly less vertices than 6. And if G is a ball,
+ min{lc(G \ {s}) I s vertex of G} < 1 + Ic(G \ @}) 6 1+ Ic(G \ {G},W)
Ic(G) = 1
Conversely, the definition of the loop complexity of I; amounts to the definition of a total order w on the vertices of such that Ic(I;) = Ic(I;,w). If 6 contains no ball, any order makes the property holds. If I; is not a ball
270
itself, let w be any order such that its trace on every ball of G is the order that has been determined by the induction hypothesis. If G is a ball itself, let s be the vertex such that the loop complexity of G \ { s } is minimum. Let then 1c, be the order on G \ { s } , determined by the induction hypothesis, such that Ic(G \ { s } ) = Ic(G \ { s } , G).The order w defined on 6 by w = s and the trace of w on G \ { s } being equal to 1c, is such that Ic(G) = ic(G,u).
1.2
T h e state elimination algorithm
McNaughton-Yamada’s algorithm is probably the best known algorithm for computing a rational expression that denotes the language accepted by an automaton. For our purpose however, it is convenient to use a variant of i t , due t o Brzozowski and McCluskey 2 , which is completely equivalentd. This algorithm has been described in l5 and in 16. It uses generalized automata and processes b y deleting state after state. Let us call generalized an automaton A = (Q, A , E , I , T ) in which the labels of the transitions are not letters anymore but expressions, that is the elements of E are triples ( p , e , q ) with p and q in Q and e E RatE A * . The label of a computation is, as usual, the product of the labels of the transitions that constitute this computation and the language accepted by A is the union of the labels of the successful computations of A. Starting from a (generalized) automaton A, the state elimination algorithm consists in building a generalized automaton C which can be called trivial: an initial state i, a final state t (distinct from i) and a single transition from i to t and labelled by a rational expression E which denotes the language accepted by A ( c j . Figure 2).
Figure 2 . The result of the state elimination algorithm.
The first phase consists in building a kind of “normalized” automaton L? by adding to A = (Q, A , E , I , T ) two distinct states i and t , and a transition labelled by 1 ~ from . i to every initial state of A, and a transition labelled by 1 ~ from . every final state of A to t . The state i is the unique initial state of L?, the state t its unique final state: L? is equivalent to A. As A, and ~
~
dThis statement can be made precise and meaningful: an expression obtained by one algorithm can be transformed into an expression computed by the other by using the axiom E’ = 1 EE’ (cf. 1 3 ) . Note that this axiom preserves star height.
+
27 1
then B, are finite, one can assume - after some finite unions on the labels of the transitions - that there is at most one transition from p to q for every pair ( p , q ) of states of B. The second phase has as many steps as there are states in A. It consists in successively removing states from B (but i and t ) and to update the transitions in such a way that a t every step an equivalent automaton is computed whose labels are obtained from those of the preceding one by union, product and star. More precisely, let q be an element of Q; let p l , pa, . . . , pl be the states of B which are the origin of a transition whose end is q , and K I , K g , . . . , I
Ii'l L'
Hk
(b) After the deletion of q
(a) Before the deletion of q
Figure 3. A step of the state elimination algorithm.
The automata B and B' are equivalent. By iterating this construction n times, ( n = 11&11)", an automaton C is obtained that contains no states of the automaton A and which is of the required form. eThe cardinal of a set Q is denoted by
11Q11.
272
1.3 The Eggan-Brzozowski index In order to prove Theorem 1, we define the Eyyan-Brzomwski index of an automaton A - E B index for short - which is at the same time a generalization and a refinement of the loop complexity. And as above for the loop complexity, we shall define the E B index of A, not absolutely but relative to a total order w on the set Q of the states of A, that order which is implicit in the state elimination algorithm. If A is a generalized automaton, we first call E B zndex of a transition e of A, denoted iEB(e), the star height of the labelf of e : i ~ ~ (=e h[le11. ) If A = ( Q ,A , E , I , T ) is a “classical” automaton over A : Ve E
E
=0. We call then E B index of A rehtiue to w , and we note i E B ( d , w ) , the integer defined by the following algorithm (called E B algorithm) where we keep the above notation and convention for the order and the trace of an order on a subset: 0
iEB(e)
If A is not, a ball: iEB(d,w)
= max({igB(e) [ e does not belong to a ball of A } U { ~ E B ( P , uI)P is ball of A } )
0
= 1 +max({iEB(e) I e is adjacent to z } , i E B ( d \ ~ If A is a “classical” automaton, (4)and (5) become respectively: iEB(d,w)
0
, w ) )
(5)
If A is not a ball: iEB(d,w)
0
(4)
If A is a ball:
= max({iEB(P,w) I P is ball of A } )
(6)
If A is a ball: iEB(d,U)
=1
+iEB(d\G,w)
(7)
to which the base of the recurrence has to be added: 0 If A does not contain any ball, or is empty:
=0.
(8) Since (8), (6) and (7) define the same induction as ( l ) , (2) and (3), it directly follows from Property 1: iEB(d,U)
fThe label of the transition e is denoted by
1.
273
Property 2
Ic(A) = min{iEB(A,w)
I
w order on Q }.
We denote by Es E ( d , w ) the rational expression obtained by running the state elimination algorithm on A with the order w, that is by deleting the states of A the smallest first. It should be noted that it follows from these definitions that, once the order w is fixed, the order of deletion of states in the state elimination algorithm is the reverse order of the “deletion” of states in the computation of the E B index. Theorem 1 is then the consequence of the following. Proposition 1 Let w be a total order on the set of states of an automaton A. The E B index of A relative to w is equal to the star height of the rational exp,ression obtained by running the state elimination algorithm on A with the order w , i.e. iEB(d,W)
=h[Es~(d,w)].
Proof. By induction on the number of states of A. By convention, the states i and t that have been added are larger than all the states of A in the order w and are not deleted in the state elimination algorithm . The base of the induction is thus a generalized automaton with 3 states, like in the Figure 4 a) or b). In case a), B contains no ball and it holds: i E B ( B ) ,w
= max(h[E], h[F], h[H]) = h[E
+ F . HI = h [ E s ~ ( t ? , w ).]
In case b ) , the unique state of t? which is neither initial nor final is a ball whose E B index is 1 h[G], and it holds:
+
i E B ( B , w)
= max(h[E],h[F], h[H], (1
+ h[G]))
= h[E
+ F . G* . H] = h [ E s ~ ( B , w ).]
Figure 4. Base of the induction
+
Let now t? be an automaton of the prescribed form and with 11 2 states, q the smallest state in the order w , and B’ the automaton after the first step
274
of the state elimination algorithm applied to B - that is, after deletion of q . Since the adjacency relations (for the other states than q ) are the same in B and in B‘, and as q is the smallest element in the order w , the E B algorithm runs in the same way in B and in B’, i.e. the succession of balls build in both cases is identical, up to the processing of q in B excluded. It remains to show that the computed values are identical as well. Let P be the smallest ball of B that strictly contains q - and if such a ball does not exist, let P = B - and let P‘ be “the image” of P in B’.Two cases are possible. If q is not the origin (and the end) of a loop - case a) --, the transitions of P’ are either identical to those of P or labelled by products F . H , where F and H are labels of transitions of P. It then comes: iEB(P’,w)
= max(max{iEB(e) I e does not belong to a ball of PI}, max{iEB(Q,w) 1 & is a ball of P’}) = max(max{iEB(e) 1 e does not belong to a ball of P}, max{iEB(Q),w I & is a ball of P})
= iEB(P,W) .
(9)
If q is the origin (and the end) of a loop labelled by G - case b) -, i.e. q is a ball of B by itself, the transitions of P’ are either identical to those of P or labelled by products F . G * . H. It then comes, since i E B ( { q } , W ) = 1 h[G]:
+
iEB(p’,w)
= max(max{iEB(e) I e does not belong to a ball of P‘}, max{ iE B (&,w ) 1 Q is a ball of P’}) = max(max{igB(e) I e does not belong to a ball of P}, (1
+ h [ G ] ) , m a x { i ~ ~ ( & , wI &) is a ball of P’})
= max(max{igB(e) I e does not belong to a ball of P}, iEB({q),
‘“)I
max{iEB(Q,w) I & ball of P, different from { q } } )
=~ E B ( P , ~ ) .
(9’)
If P = B (and P’ = a’),the equalities (9) and (9’) become iEB(B’,w)
= iEB(B,W)
(10)
which yields the induction and then the proposition. Otherwise, and without any induction on the number of nested balls that contain q , (10) is obtained from (9) by noting that the transitions of B’ are either identical to those of B or correspond to transitions that are adjacent to q .
275
In case a), the labels of these transitions are products of the labels of transitions of B,their index is obtained by taking a maximum and (10) follows from the relation max(u, b , c ) = max(a, max(b, c ) ) . In case b), the labels of these transitions are, as above, of the form F.G*.H, of index max(h[F], h[H], l+h[G]). The corresponding transition in B has label F (or H); it is processed by the E B algorithm when the index of the transition of label H (or F) and the one of the ball {y}, whose index is 1 h[G], are already computed. The result, that is ( l o ) , follows then, for the same reason as above.
+
1.4
No ,rush to conclusion
After Theorem 1 that shows that the correspondance between automata and expressions can be carried on to a correspondance between loop complexity and star height, one could have thought that to the minimal automaton would correspond an expression of ,minimal star height. There is no such thing of course (or the star height of a language would not be mysterious anymore). The following example describes one of the simplest languages whose minimal automaton is not of minimal loop complexity.
Example 2 Let Fz and F3 be the languages of A’ = { u , b } ’ consisting of words whose number of u’s is congruent to the number of b’s plus 1 modulo 2 and 3 respectively and let F6 be their union:
=1
F3 = {f I l f l a - l f l b = 1 mod 3) and F+j= {f I l f l a - l f l b = 1 , 3 , 4 or 5 mod 6) . The minimal automaton of F6 is the “double ring” of length 6, whose loop complexity is 3. The minimal automata of FZ and F3 have loop complexity 1 and 2, hence the star height of F(5 is at most 2 (cf. Figure 5). FZ = {f
I Ifla
- lflb
mod 2)
Figure 5 . An automaton of minimal loop complexity (right) which is not the minimal automaton (left) for F6.
276
2
Conway’s universal automaton
The new interpretation of McNaughton’s theorem we are aiming a t makes use of a construction which is basically due to Conway. Let A = (Q, M , E , I , T ) be an automaton over a monoid M . For any state q of A let us call “past of q (in A)” the set of labels of computations that go from an initial state of A to q , let us denote it by PastA(y); i.e. Pasta(q)
= { m E M 13 E I
i
3 q] A
In a dual way, we call “future of q (in A)’’ the set of labels of computations that go from q to a final state of A , and we denote it by FutA(q); i.e. FutA(q)
= { m E M 13 E T
y
A
t}
For every q in Q it then obviously holds: [Pastd(q)] [Futd(q)]
c 1-41 .
(*)
Moreover, if one denotes by TransA(p, q ) the set of labels of computations that go from p to q , it then holds: [Pastd ( P ) ] [Transd ( P , q ) ] [Futd(Y)]
c 1.41.
(**)
It can also be observed that a state p of A is initial (resp. final) if and only if 1 ~belongs ’ to PastA(p) (resp. to FutA(p)). Hence every automaton, and in every automaton, every state induces a set of factorizations - this will be how equations such as (*) or (**) will be called - of the subset accepted by the automaton. It is an idea essentially due to J. Conway , and that proved to be extremely fruitful, to take the converse point of view, that is to build an automaton from the factorizations of a subset (in any monoid). More specifically, let It’ be any subset of a monoid M and let us ca.11 factorization of Ii‘ a pair ( L ,R) of subsets of M such that
L R S K and ( L ,R) is maximalg for that property in M x M . We denote by Q K the set of factorizations of It’. For every p , q in Q K the factor FP,, of It‘ is the maximal subset of M such that
L, FP,,R,
c It‘
where p = ( L p ,R p )and q = (L,, Rq)indeed. gMaxirnal in the order induced by the inclusion in M .
277
It is easy to verify, as it was noted in 4 , that if a : M --+N is a surjective morphism that recognizes Ii',i.e. Ii'acu-' = Ii',and if (L, R ) is a factorization and F a factor of Ii' then: i) L = L 0 a - l , R = Ra0-l , and F = Frra-' ; ii) ( L a ,Ra) is a factorization and F a is a factor of K a ; or, in other words, factorizations and factors are syntactic objects with respect t o Ii'. AS a consequence, Q K is finite if and only if I< is recognizable. In 5 , the Fp,gare organized as a Q K x QK-matrix, called the factor matrix of the language It", subset of A*. A further step consists in building an automaton over the alphabet A, which we call the universal automaton of I 0, a n (n-ary) hyperoperation h on A is a mapping from A" into PA. We denote by the set of all n-ary hyperoperations on A and put
'H2'
Z.4 =
u 'HF'. n>O
In the theory of hyperclones, trivial hyperoperations called selectors play the role of projections in the standard clone theory. For 1 5 i 5 n, a hyperoperation e l E 'HT) is the i-th n-ary selector if and only if eY(a1,.. . , a,) = {ui} for every (al,. . . ,a,) E A". We denote by 2;T-a the set of all n-ary selectors and let J ~ ,=AUn>oJLz. Composition is naturally generalized in the following way: For f E 'HLm) and g1,92,.. . ,gm E R F ) ,we define the operation h = f[gl,g2,.. . ,gm] E H '): by h(z1,x2,...,zn) =
U { f ( ~ 1 7 ~ 2 , . . . , ~Irzni)E gi(x1,z27..',zn)7 for all i = 1 , 2 , . . . , m }
for every (51,z2, . . . ,z), E A". The operation h = f [gl,g2,. . . ,g m ] is called (hyper-)composition o f f with g1,g2,'..,grn.
2 8 8
NOWwe are ready to define hyperclones. A hyperclone C on A is a subset of ' H A which contains JH,A and is closed under (hyper-)composition. The set of all hyperclones on A is denoted by & A . It is known that the set ,&A is a lattice with respect to the inclusion relation. In Section 2 we establish normal form for hyperoperations. For ordinary operations several types of normal forms are known. We exploit any one of such normal forms for ordinary operations to derive a normal form for hyperoperations. In Section 3 we discuss the problem of whether Sheffer hyperoperations exist. This is a problem posed by B. A. Romov [5]. We explicitly give ternary and quarternary Sheffer hyperoperations and also show the non-existence of binary Sheffer hyperoperations on a two-element set A. Finally, in Section 4 we consider the cardinality of the set ' H A of all the hyperclones on A where A is a two-element set and show that it has the cardinality of the continuum. This answers affirmatively to Rosenberg's problem. This article is a summary of two papers [l]and [2]: The contents of Sections 2 and 3 with full proof can be found in [2] and the contents of Section 4 with more details will appear in [l].
2
Normal Form
In this section we construct a normal form for hyperoperations in 'HA. Before we discuss the case for hyperoperations, we shall review normal forms for ordinary operations. When the base set A consists of two elements, e.g., A = {O,l}, the operations on A are more commonly called Boolean functions and it is well-known that there are several normal forms for them such as conjunctive normal form, disjunctive normal f o r m and Galois normal form. For the case where the set A is a finite set with two or more elements, we have, for example, the following normal form ([3]):
f(z1,.. .
,%I
=
V
( c f ( a l ..., , a,)
( 2 1 ) A ~ a l ( x 1A)
. . . A Xa, ( x n ) ) .
(a1,...,a,,)€A"
Here, the operations ca E
02) and xa E 0:'
for a E A are defined as
for every x E A, and the operations A E 0; and V E 0; are any operations satisfying the laws aAl=a,
aAO=O
and
OVa=aVO=a
289
for all a
E
A.
In order to construct a normal form for hyperoperations we need one particular hyperoperation.
Definition 2 . 1 Let u E 'Hy) be defined as u(z1,zz) = {z1,z2} f o r every (z1,zz)
E
A2. W e call u the union operator
Definition 2 . 2 Let k be the number of elements in A, i e . , k = (A1 ( 1 < k < m) and n > 0. Denote by 3 the lexicographic order on the set A". For a hyperoperation h E a value vector v of h is a kn-vector given as
IFl2),
where z j E h ( a j ) ,1 5 j 5 k", f o r the j - t h element the order 3.
aj
of A" with respect to
For n > 0, let ( a l ,a2, . . ., u p ) be the sequence of all elements of A" with respect to the order 3. For a hyperoperation h E 'I-& and)a value vector v = ( ~ 1 ,... , z k n ) of h, we construct an (ordinary) operation h" E in a natural way as h v ( a j )= z j for every j ( 1 5 j 5 k").
02)
With these tools, the union operator and value vectors, we can establish a normal f o r m for hyperoperations.
Theorem 2 . 1 For h E 'Ht) h(z1,.. . ,z,) =
IJ
NF(hV(z1,. . ., 2"))
V:valueuector
02)
where hv is the operation in derived from h for value vector v and NF(f) is a normal form o f f f o r f E OA. Example. Let A = {0,1} and h
E 'H?)
be a hyperoperation satisfying
h(0,O) = { l } , h ( 0 , l )= {0}, h(1,O) = { 0 , 1 } and h ( 1 , l )= { I } .
290
There are two value vectors q , z12 of h:
The corresponding ordinary operations are h"1 and hvz defined as
hvl(O,O) = 1, h"'(0,l) = 0, hwl(l,O) = 0, h v l ( l , l )
=
1
and
hvz(O,O) = 1, hvz(O,l) = 0, hwz(l,O)= 1, h v z ( l , l ) = 1. Then a normal form of h is expressed as
As a consequece of Theorem 2.1, we have: Corollary 2. 2
3
%A zs
generated b y OAU{U}~i e . , ' H A = [OAu{u}].
Sheffer Hyperoperations
A hyperoperation h is called Sheffer when all hyperoperations in 'HA can be generated by h and selectors through finite applications of hypercomposition. In what follows, [h] denotes the hyperclone generated by {h} ,&,A.
u
Definition 3. 1 A hyperoperation h E 'HA is a Sheffer hyperoperation if a n d o n l y ilf'H~ = (h]. In this section we show the existence of Sheffer hyperoperations by actually exhibiting them. Our examples are quaternary and ternary Sheffer hyperoperations. We then claim that binary Sheffer hyperoperations do not exist on a two-element set. This is one of the phenomena where the case of hyperoperations is different from the case of ordinary operations as it is well-known that binary Sheffer operations exist in Oy). We shall see another
29 1
phenominon in Section 4 where the “hyper” case differs from the ordinary case. We shall adopt the convention of identifying a hyperoperation h E whose value is always a singleton with an ordinary operation fh E in an obvious manner: h(zlI’..lGJ = {fh(21,...,4)
02)
for every
3.1
(21,.
. . ,z),
E
7f2)
A”
Existence of Sheffer Hyperoperations
First we show the existence of a quaternary] i.e., 4-variable] Sheffer hyperoperation. In t h e case of O A it is known that the operation w E 02)defined as
+
w(z1,xz) = 1 max{x11zz)
+
is a Sheffer operation where is taken modulo k(= \A\). This operation is called Webb function. When IAl = 2 (Boolean case), the operation w is identical to N O R ( z 1 , xz), which is called the Sheffer function in the narrowest sense.
Definition 3. 2 Let t E ‘Hy) be defined as follows:
We shall show that this hyperoperation t is a Sheffer hyperoperation in the following way.
Lemma 3. 1 Webb function w is generated by t , i.e., w
E
[t].
In fact, w is expressed as w = t [ e f ,e;] ef ,e f ] . Since w is Sheffer in O A , Lemma 3.1 immediately implies:
Corollary 3 . 2 OA C [t] For the union operator u E ‘HT)defined in the previous section, it is readily verified that u = t [ e f ,e i , co[ef],c l [ e f ] ]and we have:
Lemma 3. 3 Union operator u is generated by t , i.e., u E [t]. Now we establish the main result of this subsection.
292
Theorem 3 . 4 The hyperoperation t is a Sheffer hyperoperation, i.e., ' H A = [tl. Proof. This is clear from Corollary 2.2, Corollary 3.2 and Lemma 3.3.
0
Next, we show that it is possible t o modify t to obtain ternary, i.e., 3variable, Sheffer hyperoperation.
Definition 3. 3 Let s E
'HT)be defined as
s(x1)x 2 , 2 3 ) = t ( z l ,x Z , x l , 2 3 ) for every (x1,x2,x3)E A3, i.e., s =t[(e:,e$,e:,ez]
Theorem 3. 5 The hyperoperation s is a Sheffer hyperoperation, i.e., ' H A =
.I.[ Proof. It suffices t o show that w E [s]and u E [s]. The former is shown as w = s [ e f ,e;, e'$ and the latter is verified by u = s [ e f ,e?j,b[ey]]where b E H ' ): 0 is a hyperoperation defined as b(0) = (1) and b ( x ) = (0) if x # 0.
3.2 Nonexistence of a Binary Sheffer Hyperoperation on { O J ) We have shown the existence of ternary and quaternary Sheffer hyperoperations on any finite set A. In this subsection we claim the negative result that there does not exist a binary Sheffer hyperoperation on the two element set A = {0,1}.
Lemma 3. 6 Let h E 'Fly). Assume that for every a E A one of the following conditions is satisfied.
( i ) I n the multiplication table of h, entry { a } appears in at most one row. (ii) I n the multiplication table of h, entry { u } appears in at most one column. Then h is not Sheffer. For the proof of Lemma 3.6, see [2].
Proposition 3. 7 Let A = {0,1}. There does not exist a binary, i.e., 2variable, Sheffer hyperoperation on A .
293
Sketch of the Proof. Suppose that h E ?.tT) is a binary Sheffer hyperoperation on A. First, by a simple argument, we see that h ( i , i ) = { i } for each i = 0 , l . Then, since at least one member in the image of h must be a singleton, we may assume, w.l.o.g., that h ( 0 , l ) = {0,1}. Finally, consider the only remaining value h(1,O). For each case of f ( 1 , O ) = {0}, (1) and (0, l}, we can check with the help of Lemma 3.6 that h cannot generate all 0 hyperoperations on A. Hence h is not a Sheffer hyperoperation.
4
Rosenberg’s Problem
In this section we fix that A = {0,1}. Hence, an (n-ary) hyperoperation h is a mapping from (0, l}, into {{0}, {l},{0,1}}. In 1998, I. G. Rosenberg asked the following question ([7]): Is the lattice of hyperclones on {0,1} of the continuum cardinality ? It should be noted that, for the case of ordinary clones, the cardinality of the lattice L A of all clones on A is countable when /A1 = 2 ([4]) and of the continuum when IAl 2 3 ([S]). We answer Rosenberg’s problem affirmatively. Thus the situation differs between the lattice of ordinary clones and that of hyperclones for a twoelement set A. The key to the solution to Rosenberg’s problem lies in the following sequence of hyperoperations.
Definition 4 . 1 Let A = (0,l). For every n > 0 , let h, E 7-tt’be the n-variable hyperoperation o n ( 0 , l ) defined as follows:
Let G denote the set of all such hyperoperations:
G = { h, I n > 0 }.
Note that hl is a constant hyperoperation: hl(x1) = (1) for every x1 E A.
NOTATION: For any n > 0, set G, = 6 - {h,}. The hyperoperation h, and the set which is essential to our discussion.
G,
satisfy the following property,
294
Lemma 4. 1 For every n > 0 , h, # Proof. Suppose that h, [G,] such that
E
[G,].
[G,]. Then there exist some m
+ n and g1,. . . ,g,
E
h, =~,[gl1g2,...,Sm1 (1) Case 1: Let m = 1. As noted above, h l is a constant hyperoperation and so the right-hand side of the equation (1) is constant. However, m n implies h, hl and h, is not a constant hyperoperation. Thus, the equation (1) does not hold. Case 2: Let m > 1. We may assume without loss of generality that for any j = 1 , 2 , . . . , m, gj in (1) is either
+
+
or
(1 5
(Q)
gj = en ti
(PI
gj = he,[ti,. . . , t p l
ij
5 n) (lj
> 0, p > 0)
et,
where is a selector defined in Section 1. We shall say that g j is of type ( a ) if gj takes the form of the above ( a ) and gj is of type (0) if g j takes the form of the above (p). In the right-hand side of the eqution (l),if there are two or more gj's that are of type (p),it is clear that ~ ~ ~ ~ l , ~ 2 , . . . , ~ m = l ~{0,1}. ~ , ~ , . . . , ~ ~
On the other hand, h,(O, 0 , . . . ,0) = (1) holds, and so the equation (1) does not hold in this case. Next, suppose that there is only one g j (call it gj,) that is of type (p) and all the rest is of type (a).For brevity, we may write the equation (1) as
h, = h,[ e y , . . . , g j 0 , .. .I. Then, for the tuple ( 1 , 0 , 0 , .. . , O ) E (0, l},, we have h,(l,O,o , . . . ,0) = (1) and
L i e ; " , . . . ,g ~ , , . . . 1 ( ~ , 0 ,. 0. ., , 0) 2 h,({l} ,..., (1),...) =
{O]l}.
This implies that h, # h,[gl, 92,. . . ,g,] in this case. Finally, suppose that every g j in the equation (1)is of type ( a ) .Then we have (1 5 21,. . . ,i, 5 n) h, = h,[eE,. . . ,e,,] n
295
If m > n, there are some p , q satisfying 1 5 p < q 5 m and i, = i,. P u t i = i, (= i,). Then, for the tuple xi = (0,. . . , 0 , 1 , 0 , . . . ,0) where the i-th component is 1 and all the rest is 0, we have the contradiction as h,(zi) = (1) and
h m [ g l , . . . , g m ] ( s i )= (0,1). The remaining case where m < n can be handled similarly. We have checked for all possible cases that the equation (1)does not hold 0 and thus proved that h, #
[&I.
Corollary 4. 2 Non-empty subsets of G generate mutually distinct hyperclones. Proof is immediate from the above lemma. As the set is countable, the set of all non-empty subsets of G has the cardinality of the continuum. Therefore, Corollary 4.2 gives the affirmative solution t o Rosenberg’s problem:
Theorem 4. 3 The lattice L ~ , J O ,of~ }all hyperclones o n ( 0 , l ) has the cardinality of the continuum.
References [l]Machida, H., Hyperclones on a two-element set, to appear in Multiple-
Valued Logic
-
An International Journal.
[2] Machida, H., Normal form of hyperoperations and the existence of Sheffer hyperoperations, submitted. [3] Poschel, R., and Kaluinin, L. A. (1979). Funktionen- und Relationenalgebren, VEB Deutscher Verlag der Wissenschaften, Berlin.
[4]Post, E. L. (1941). The two-valued iterative systems of mathematical logic, Ann. Math. Studies, 5 , Princeton Univ. Press. [5] Romov, B. A. (1998). Hyperclones on a finite set, Multiple-Valued Logic - An International Journal, 3, 285-300. [6] Rosenberg, I. G. (1996). An algebraic approach to hyperalgebras, Proc. 26th lnt. Symp. Multiple- Valued Logic, Santiago de Compostela, IEEE, 203-207.
296
[7] Rosenberg, I. G. (1998). Multiple-valued hyperstructures, Proc. 28th Int. Symp. Multiple- Valued Logic, Fukuoka, IEEE, 326-333. [8] Yanov, Yu. I. and Muchnik, A. A. (1959). Existence of k-valued closed classes without a finite basis (Russian), Dolcl. Alcad. Nauk., 127, 44-46.
297
Words guaranteeing minimal image S .W. Margolis
J.-E. Pin
M.V. Volkov*
Abstract Given a positive integer n and a finite alphabet A , a word w over A is said to guarantee minimal image if, for every homomorphism cp from the free monoid A* over A into the monoid of all transformations of an n-element set, the range of the transformation wcp has the minimum cardinality among the ranges of all transformations of the form vcp where v runs over A'. Although the existence of words guaranteeing minimal image is pretty obvious, the problem of their explicit description is very far from being trivial. Sauer and Stone in 1991 gave a recursive construction for such a word w but the length of the word resulting from that construction was doubly exponential (as a function of n). We first show that some known results of automata theory immediately lead to an alternative construction which yields a simpler word that guarantees minimal image: it has exponential length, more precisely, its length is O(lA($("'-")). Then using a different approach, we find a word guaranteeing minimal image similar to that of Sauer and Stone but of the length O(lA[:("'-")). On the other hand, we observe that the length of any word guaranteeing minimal image cannot be less than IAln-l.
Let X be a non-empty set. A transformation of the set X is an arbitrary function f whose domain is X and whose range (denoted by Im(f)) is ' a nonempty subset of X . The rank rk(f) of the function f is the cardinality of the set lm(f). Transformations of X form a monoid under the usual composition of functions; the monoid is called the full transformation monoid over X and 'This work was initiated when the third-named author was visiting Bar-Ilan University (Ramat Gan,Israel) with the support of Department of Mathematics and Computer Science, Bar-Ilan University, of Russian Education Ministry (through its Grant Center a t St Petersburg State University, grant EOC-1.C-92) and of Russian Basic Research Foundation. The work was also partially supported by the INTAS through the Network project 991224 "Combinatorial and Geometric Theory of Groups and Semigroups and its Applications to Computer Science", by the Emmy Noether Research Institute for Mathematics and the Minerva Foundation of Germany, by the Excellency Center "Group Theoretic Methods in the study of Algebraic Varieties" of the Israel Science foundation, and by the NSF.
298
is denoted by T ( X ) . If the set X is finite with n elements, the monoid T ( X ) is also denoted by T,. Now let A be a finite set called an alphabet. The elements of A are called letters, and strings of letters are called words ouer A. The number of letters forming a word u is called the length of u and is denoted by ! ( u ) . Words over A (including the empty word) form a monoid under the concatenation operation; the monoid is called the free monoid ouer the alphabet A and is denoted by A * . Both words over a finite alphabet and transformations of a finite set are classical objects of combinatorics. On the other hand, their interaction is essentially the main subject of the theory of finite automata. One of the aims of the present paper is to demonstrate how certain quite well known facts about finite automata may be utilized to improve some recent combinatorial results concerned with words and transformations. Vice versa, we shall also apply certain purely combinatorial considerations to some questions which, as we intend to show, are rather natural from the automata viewpoint. The combinatorial results we have in sight group around the notion of a word guaranteeing minimal image introduced by Sauer and Stone in [21]. To describe it, let us first fix a positive integer n (the size of the domain X of our transformations) and a finite alphabet A . Now suppose we have a mapping 'p : A + T,. It extends in a unique way to a homomorphism of the free monoid A* into T,; we will denote the homomorphism by p as well. Now, with each word u E A * , we associate the transformation up. A word w E A* is said to guamntee minimal image if the inequality rk(wcp)
I rk(u'p)
(1)
holds for every word u E A* and for every mapping 'p : A + T,. Clearly, words guaranteeing minimal image exist [20, Proposition 2.31. Indeed, for each mapping p : A + T,, there is a word wv such that rk(w,cp)
5 rk(ucp)
(2)
for all E A * . Since there are only finitely many mappings between the finite sets A and T, and since the composition of transformations cannot increase the size of its image, we can concatenate all words w, getting an (apparently very long) word w satisfying (1). Words guaranteeing minimal image have been proved to have some interesting algebraic applications. In [20] they were used to find identities in the full transformation monoids. Recently these words have been applied for studying the structure of the free profinite semigroup, see [2]. Of course, for application purposes, the pure existence statement is not sufficient, and one seeks an explicit construction.
299
The only construction of words guaranteeing minimal image known so far was due to Sauer and Stone [21, Corollary 3.51. The construction makes an elegant use of recursion but results in very long words such that, even over a two-element alphabet, it is hardly possible to write down the Sauer-Stone word that guarantees minimal image, say, in T5. To build a word guaranteeing minimal image in T, , Sauer and Stone make use of an intermediate notion which is also of independent interest. Given a transformation f of a finite set X , we denote by df(f) its deficiency, that is, the difference 1x1 - rk(f). For a homomorphism p : A* + T ( X ) ,we denote by d f ( p ) the maximum of the deficiencies d f ( v p ) where v runs over A * ; in other words, d f ( p ) = df(w,cp) where w, is any word satisfying (2). Now we say that a word w E A* witnesses for deficiency k (has property Ak in Sauer and Stone’s terminology), provided that, for all homomorphisms p : A* + T ( X )where X is a finite set, d f ( w p ) 2 k whenever d f ( p ) 2 k . The following easy observation explains how the two properties under consideration relate: Lemma 1. If a word w witnesses for deficiency k for all 0 guarantees minimal image in T,.
5 k < n , then it
Proof. Take an arbitrary homomorphism p : A* t T, and apply it to an arbitrary word v E A* thus obtaining a transformation u p E T,. Suppose that rk(vcp) = r . Then 1 k . In [17, 181 he proved this generalized conjecture for k 5 3 , but recently-J. Kari [13] exhibited a counter example in the case k = 4.) A comparison between the generalized Cernf problem and the aforementioned problem of determining the shortest word witnessing for deficiency k immediately reveals an obvious similarity in them. In fact, the only difference between the two situations in question is that in the former case we look for the shortest rank-decreasing word for a given homomorphism of deficiency > - k while in the latter case we are interested in a word with the same properties but with respect to an arbitrary homomorphism of deficiency 2 k . In the language of automata theory, we may alternatively describe this difference by saying that in the second situation we also look for the shortest word decreasing rank by k for an automaton, but in contrast with the generalized Cernf problem situation, the automaton is a black-box about which we only know that it admits a word of deficiency k . If thinking of a real computational device as a composite made from many finite automata, each a with relatively small number of states, a reasonable construction for an input signal which would simultaneously reset all those automata and which could be generated without analyzing the structure of each particular component of the device might be of some practical interest. As far as theoretical aspects are concerned, the connection just discussed leads to the following conclusion:
1x1
Theorem 2. For each k
23
and for each finite alphabet A, there exists a 1 word of length IAlkk("+')("+2)-1 + g k ( k + l ) ( k + 2 ) - 2 o v e r A that witnesses for deficiency k . Proof. We utilize a result by the second-named author [19].This result which is based on a combinatorial theorem by Frank1 [lo]yields the best approximation to the size of the shortest reset word known so far: Proposition 3. Suppose that the automaton ( X , A , ' p ) is such that the deficiency of the mapping 'p is no less than k , where 3 5 k < Then there
1x1.
1 existsa w o r d w E A * oflength - k ( k + l ) ( k + 2 ) - 1 verifyingdf(w'p)2 k . 0 6 1 For brevity, let m = - k ( k l ) ( k 2) - 1. By a well known result of 6 DeBruijn [7],there is a cyclic sequence over A , of length [A[", such that each word over A of length m appears as a factor of the sequence. Cut this
+
+
302
cycle in an arbitrary place and make it a word u of the same length IAI”. Since our cut goes through exactly m - 1 factors of length m, the word u still contains all but m - 1 words of length m as factors. Now let be the prefix of u of length m - 1 and let w = uv. Note that the word w has length [A[“ m - 1. Clearly, this procedure restores all those factors of length m that we destroyed by cutting the initial DeBruijn sequence, and therefore each word over A of length m appears as a factor in w. We note that there is an efficient procedure that, given A and m , builds DeBruijn’s sequences so, if necessary, ,the word w may be explicitly written. By Proposition 3, for any finite set X and for any homomorphism ‘p : A* -+ T ( X )with df(’p) > - k, there exists a word wv E A* of length m such that df(w,’p) 2 k . By the above construction of the word w , the word wv must appear as a factor in w so df(w’p) 2 k as well, and thus, w witnesses for deficiency k .
+
It should be mentioned that the natural idea used in the above proof (of “gluing together” individual reset words in order to produce an “universal” reset word) first appeared in a paper by Ito and Duske, cf. [12, Theorem 3.11. Corollary 4. Over each finite alphabet A and for each n > 3, there exists a 1 1 word of length IAlg(n3-n)-1 ,(n3 - n ) - 2 that guarantees minimal image an T,. 0
+
ProoJ As in the proof of Theorem 2, we construct a word w of length 1 1 1 JAJS(~~ + --(n3 ~ ) - -T I )~ -2 that has every word of length -(n3 - n ) - 1 as a 6 6 1 factor. Then of course w has also every word of length -k(k l ) ( k 2) - 1, 6 1< - k < n, as a factor and, as such, witnesses for deficiency k for all 1 5 k < n by Proposition 3. We may also assume w witmessing for deficiency 0 as every word does so. The corollary now immediately follows from Lemma 1. 0
+
+
Obviously, the constructions to which Theorem 2 and Corollary 4 refer are asymptotically (that is, for sufficiently large values of k and respectively n ) more economic than the Sauer-Stone construction. Still, the length of the resulting words is exponential as a function of k. Can we do essentially better by finding some words of polynomial length doing the same job? The following result answers this question in the negative: Theorem 5 . A n y word over a finite alphabet A guamnteeing minimal image in Tn contains every word over A of length n- 1 as a factor and has the length at least JAJnM1n - 2.
+
303
Proof. We recall the construction of the minimal automaton of a language of the form A*wA*,where w E A*. This construction can be readily obtained from the well-known construction of the minimal automaton of A*w, which is used, for instance, in pattern matching algorithms (implicitly in [15], and explicitly in [1, 3, 61). Given two words u and IJ words of A*, we denote by overlap(u,v) the longest word z E A* such that u = u’z,IJ = zv’ for some U’,IJ’ E A*. In other terms, overlap(u,v) is the longest suffix of u which is at the same time a prefix of v. U U’
0’
Z
V
Figure 1: z = overlap(u, v) Now given a word w = a1 ...,a E A * , the minimal automaton of A’wA’ is d ( w ) = ( X , A , p ) , with the set of states X = (a1 ...ai 10 5 i 5 m } , that is, the set of all prefixes of the word w, and the function p : A + T ( X ) defined as follows: for all a E A a1...a,(up)
=u1...a,,
a l . . . a i ( a p ) = o v e r l a p ( a l . . . a i a , w ) for O < i < m .
(4) (5)
The initial state is the empty word, and the unique final state is the word w. Lemma 6 . The automaton d ( w ) is synchronizing, and u E A* is a reset word for d ( w ) if and only if the word w is a factor of u. Proof. Since the final state is stabilized by each letter, a reset word u in A ( w ) necessarily sends every state on the final state. In particular, it sends the initial state to the final state, and thus is accepted by d ( w ) . It follows that w is a factor of u. Conversely, if w is a factor of u, and x is a state, then w is a factor of xu. It follows that the word zu is accepted by d ( w ) , whence x(up) = w. Thus u is a reset word.
Now take an arbitrary word v E A* of length n - 1 and consider the automaton d ( v ) = ( X ,A , cp). By Lemma 6, the mapping cp : A + T ( X ) = T, verifies r k ( v p ) = 1. By the definition, any word w E A* that guarantees minimal image in T, should satisfy rk(wcp) 5 rk(vcp) whence rk(wcp) = 1. Thus, w should be a reset word for automaton d ( v ) . By Lemma 6, w then has the word IJ as a factor.
304
Since there are (A("-1 different words over A of length n - 1 and since a word of length m 2 n - 1 has m - n + 2 factors of length n - 1, any word over A containing every word over A of length n - 1 as a factor has the length at least (A("-1 n - 2. (This is, in fact, an exact b o u n d s e e the reasoning with the DeBruijn sequences in the proof of Theorem 2.)
+
Another natural question concerns the behavior of the constructions for small values of k and for small sizes of the alphabet A . Here the SauerStone construction is often better as the following table shows. In the table, t denotes the size of the alphabet A and we omit some of the summands in the second column to fit onto the page. Table 1: The Sauer-Stone construction vs. Theorem 2 The length of the word from:
I Theorem 2
the Sauer-S t one construction
+ 4t6 + 6t5 + lot4 + 9t3 + 7t2 + 3t + 5t13 + l l t " + 21t" + 30t" + 37t9 + . . . + 4t tZ7+ 6tZ6+ 17tZ5+ 38tZ4+ 68tZ3+ 105tZ2+ . . . + 5t tS2+ 7tS1+ 24t5' + 62t4' + 130t48+ . . . + 6t t"' + 8t"' + 32tg9+ 94tg8+ 224tg7 + . . . + 7t t7
4 6 7
t14
+ 18 + 33 t55 + 54 t"
t34
t83$ 8 2
Using the values collected in this table, one can easily calculate that, for any t > 2, the Sauer-Stone construction produces shorter words than the construction based on Proposition 3 for k = 3 , 4 , 5 , 6 . The case t = 2 deserves some special attention. Here the following table, in which all words aTe meant to be over a two-letter alphabet, collects the necessary information: Table 2: The case of a two-letter alphabet
k
The length of the word from: the Sauer-Stone construction
3 4 5 6
842 216 248 3542 987 594 237 765 870 667 058 360
1 Theorem 2 520 524 306 17 179 869 217 36 028 797 018 964 022
305
We see that, for k = 4,5, the Sauer-Stone construction over a two-letter alphabet is more economic than one arising from Theorem 2. Moreover, we recall that Sauer and Stone have found a word of length 8 that witnesses for deficiency 2. Though this is not explicitly mentioned in [all, it is pretty obvious that starting a recursion analogous to (3) with that word, one obtains a sequence of words over a two-letter alphabet such that the ( k - l)thmember of the sequence witnesses for deficiency k for each k 2 2 and is shorter than the word wk arising from (3). A straight calculation shows that this produces a word of length 346 witnessing for deficiency 3, a word of length 89 768 witnessing for deficiency 4, a word of length 1470 865 754 witnessing for deficiency 5, a word of length 98 708 129 987 190 440 witnessing for deficiency 6, etc. Comparing the data in Table 2 with these figures, we observe that the Sauer-Stone construction modified this way yields shorter words than the construction Theorem 2 for k = 3,4,5. Yet, having in mind the benchmark we mentioned above, that is, of producing, over a two-letter alphabet, a word of reasonable size that guarantees minimal image in T5, we cannot be satisfied with a word of length 89 768. A more important motivation for further efforts is provided by the crucial question if any “simultaneous” CernJ; word which resets all synchronizing automata with n states must indeed consist of all “individual” CernJ; words (one for each synchronizing automaton) somehow put together. We shall answer this question by exhibiting a better construction than one which we got from the automata-theoretical approach. The behavior of this construction for small deficiencies/alphabet sizes will be also better than that of any of the constructions above. Given a transformation f : X + X , we denote by K e r ( f ) its kernel, that is, the partition of the set X into rk(f) classes such that 2,y E X belong to the same class of the partition if and only if xf = y f . By a cross-section of a partition IT of X we mean any subset of X having a singleton intersection with each 7r-class. We need an obvious and well known lemma: Lemma 7. Let f , g : X -+ X be two transformations of rank r . T h e n the product fg has m n k r if and only if lrn(f) is a cross-section of Ker(g). 0 Let cp : A* -+ T ( X )be a homomorphism, w E A* a word with rk(wcp) = r . Suppose that there exists a word v E A* such that rk(wvwcp) < r and let u = ala2 . . . a, be a shortest word with this property. Setting, for 0 5 i 5 m,
= Ker((a,-i+l ... a,w)cp), Cj = Irn((wal...ai)cp), 7ri
we have the following proposition:
306
Proposition 8. (1) T O , r 1 , . . . , rm-l are pairwise distinct partitions of into r parts. (2) CO,c1,.. . , Cm-l are pairwise distinct subsets of X of cardinality r . (3) If i j < m, Ci is a cross-section of rj. (4) If i j = m, Ci is not a cross-section of rj.
x
+ +
ProoJ Let i
< m. If ri
has less than r classes, then rk((wam-i+l
< r,
...am~)p)
a contradiction with the choice of u. Similarly, the set Ci should consist of r elements. Thus, both ( w a l . . . a i ) p , for 0 < -i< - m - 1, and ( a j + l . . . a m w ) p , for 1 < - j 5 m, are transformations of rank r. If i < j and the set Ci is not a cross-section of the partition rm-j,then by Lemma 7, the product (wa1 . . . ai)p(aj+l . . . a m w ) p = (20611 . . ' U i U j + l . . . a m w ) p has rank < r , again a contradiction with the choice of u. Furthermore, by the same lemma, Ci cannot be a cross-section of rm-i since rk(wuwp) < r . In particular, if i < j , the set Cm-j is a cross-section for ri, but not for rj. It follows that the partitions ri and rj are different provided that i # j. Similarly, all the sets Ci, for 0 5 i 5 m - 1, are different. It is Proposition 8 that allows us to improve the Sauer-Stone construction. If we mimic the strategy of [21] and want to create a sequence of words witnessing for deficiency k by induction on k , then on each step, we may assume that we have some word w of deficiency k and we seek for a bound to the length of the shortest word v verifying df(wvwcp) > k for a given evaluation cp of deficiency > k . Proposition 8 shows that the length of such a minimal word is tightly related to the size of a specific combinatorial configuration involving subsets and partitions of an n-element set. According to a well-known method in combinatorics, we now convert this combinatorial problem into a problem of linear algebra. Let X = { 1,. . . , n}. We identify each subset C E X with its characteristic vector (c1, . . . , cn) in Rn,defined by ci =
1 ifiEC, 0 otherwise.
The notation ICI, originally used to denote the number of elements of C , extends naturally to a linear form on R" defined by
307
Finally, if C ,D we observe that
X , then denoting by C . D the scalar product
cidi,
C . D = ICnDI. It follows that a subset C of X is a cross-section of the partition ( 0 1 , . . . , D,.} if and only if C . D, = 1 for all s = 1 , .. . , r . With this notation in hand, we can prove the following bound for the size of the combinatorial configuration arising in Proposition 8:
Proposition 9. If the partitions T O , T I , . . . ,T,-I and the subsets Co, C1, . . . , C,-l of an n-element set satisfy the conditions (1)-(4) of Proposition 8, then m 5 n - r 1.
+
Proof. We first prove that the vectors Co, C1, . . . , C,-l are linearly independent. Otherwise, one of the Cj’s is a linear combination of the preceding vectors Co, CI,.. . , Cj-1, say cj
c
=
XiCi.
O k . If already d f ( u k ' p ) > k , we have nothing to prove. If d f ( u k ' p ) = k , then by Corollary 10 there exists a word v of length 5 k 1 such that d f ( U k V U k ' p ) > k . Since by (7) the word u k v u k appears as a factor in U k + l , we also have d f ( u k + l ' p ) > k , as required. 0
+
From Theorem 11 and Lemma 1 we obtain
Corollary 12. For each n Tn .
> 1, the word un-l
guarantees minimal image in 0
A comparison between the definitions (3) and (7) shows that the word is shorter than the Sauer-Stone word W k (on the same alphabet) for each k 2 3. In fact, the leading monomial in the expansion of ! ( u k ) as a polynomial
uk
o f t = IAl equals t T ( k2 - k ); this means that asymptotically the construction (7) is better than the construction from Theorem 2. Moreover, we see that the shortest word in A* that resets all synchronizing automata with a fixed number of states and with the input alphabet A does not need consisting of all shortest "individual" reset words somehow put together. The following table exhibits some data about the size of words arising from (7) for small k and/or t . The data in the last column refer to a slight modification for the construction in the case when the alphabet consists of two letters; the modification is similar to the modification of the Sauer-Stone construction discussed above. Namely, we can make the word aba2b2ab play the role of 212 and proceed by (7) for k 2 3. Viewing the data in Table 3 against the corresponding data in Tables 1 and 2 shows that the gain provided by the new construction is quite large
309
Table 3: The length of the words defined via (7) k
t t3+3t2+2t t6+4t5+6t4+9t3 +7t2 +3t t'0+5t9+llt8+20t7+27t6+29t5+. . . +4t t ' 5 + 6 t ' 4 + 1 7 t 1 3 + 3 7 t ' 2 + 6 4 t ' 1 + ~+5t ~~ t21+7t20+24t'9+61t18+125t'7+... +6t t28+8t27+32t26+93t25+218t24+.. . +7t
[A1 = 2
u? = aba2b2ab
2 24 394 12 312 775 914 98 541 720 25 128 140 138
8 154 4872 307 194 39 014 280 9 948 642 938
even for small deficiencies and alphabet sizes. As for our "benchmark", that is, a word over a two-letter alphabet that guarantees minimal image in T5, Table 3 indicates that there is such a word of length 4872. Yet too lengthy to be written down here, the word appears to be much closer to what may be called "a word of reasonable length" for its size is already well comparable with the size of the monoid T5 itself (which is 3125).
References [l] A. V. Aho, J . E. Hopcroft, and J. D. Ullman, The design and analysis of computer algorithms, Addison-Wesley, 1974.
[2] J . Almeida and M. V. Volkov, Projinite methods in finite semigroup theory, Centro de Matemlltica d a Universidade do Porto, 2001, Preprint 2001-02.
[3] D. Beauquier, 3. Berstel, and Ph. Chrktienne, Eldments d'algorithmique, Masson, 1994 [in French]. [4] J. Cernf, Pozna'mka k homoge'nnym eksperimentom s konecnymi avtomatami, Mat.-Fyz. Cas. Slovensk. Akad. Vied. 14 (1964) 208-216 [in Slovak]. [5] J. Cernf, A. Pirick6, and B. Rosenauerova, On directable automata, Kybernetika, Praha 7 (1971) 289-298. [6] M. Crochemore and W. Rytter, Text algorithms, Oxford University Press, 1994.
[7] N. G. DeBruijn, A combinatorial problem, Proc. Nederl. Akad. Wetensch. 49 (1946) 758-764; Indagationes Math. 8 (1946) 461-467.
310
[8] L. Dubuc, Les automates circulaires biaisks verifient la conjecture de Cernc, RAIRO, Inform. Theor. Appl. 30 (1996) 495-505 [in French]. [9] D. Eppstein, Reset sequences for monotonic automata, SIAM J . Comput. 19 (1990) 500-510. [lo] P. Frankl, A n extremal problem for two families of sets, Eur. J. Comb. 3 (1982) 125-127. [ll] W. Goehring, Minimal initializing word: A contribution to Cerntj conjecture, J. Autom. Lang. Comb. 2 (1997) 209-226.
[12] M. Ito and J. Duske, On cofinal and definite automata, Acta Cybernetica 6 (1983) 181-189. [13] J . Kari, A counter example to a conjecture concerning synchronizing words in finite automata, EATCS Bulletin 73 (2001) 146. [14] J . Kari, Synchronizing finite automata on Eulerian digraphs, Math. Foundations Comput. Sci.; 26th Internat. Symp., Marianske Lazne 2001, Lect. Notes Comput. Sci. 2136 (2001) 432-438. [15] D. E. Knuth, J . H. Morris, Jr, and V. R. Pratt, Fast pattern matching in strings, SIAM J. Comput. 6 (1977) 323-350. [16] J.-E. Pin, Sur un cas particulier de la conjecture de Cerntj, Automata, Languages, Programming; 5th Colloq., Udine 1978, Lect. Notes Comput. Sci. 62 (1978) 345-352 [in French]. [17] J.-E. Pin, Le problkme de la synchronisation. Contribution i I'dtude de la conjecture de tern$, Thbse 3e cycle, Paris, 1978 [in French]. [18] J.-E. Pin, Sur les mots synchronisants dans un automate fini, Elektron. Informationverarbeitung und Kybernetik 14 (1978) 283-289 [in French]. [19] J .-E. Pin, On two combinatorial problems arising from automata theory, Ann. Discrete Math. 17 (1983) 535-548. [20] R. Poschel, M. V. Sapir, N. Sauer, M. G. Stone, and M. V. Volkov, Identities in full transformation semigroups, Algebra Universalis 31 (1994) 580-588. [21] N. Sauer and M. G. Stone, Composing functions to reduce image size, Ars Combinatoria 31 (1991) 171-176.
31 1
Power Semigroups and Polynomial Closure Stuart W. Margolis Department of Computer Science Bar Ilan University 52900 Ramat Gan, Israel
Benjamin Steinberg Faculdade de Cikkcias da Universidade do Porto 4099-002 Porto, Portugal*
Abstract We show that the pseudovariety of semigroups which are locally block groups is precisely that generated by power semigroups of semigroups which are locally groups; that is P(LG) = L ( P G ) (using that PG = BG). We also will show that this pseudovariety corresponds to the Boolean polynomial closure of the LG-languages which is hence polynomial time decidable. More generally, it is shown that if H is a pseudovariety of groups closed under semidirect product with the pseudovariety of pgroups for some prime p, then the pseudovariety of semigroups associated to the Boolean polynomial closure of the LH-languages is P(LH). The polynomial closure of the LH-languages is similarly characterized.
1
Introduction
A common approach to studying rational languages is to attempt to decompose them into simpler parts. Concatenation hierarchies allow this to be done in a natural way which, in addition, has applications to logic and circuit theory [8]. A concatenation hierarchy is built up from a base variety of languages V by taking, alternately, the polynomial closure and the boolean polynomial closure of the previous half level of the hierarchy. The most famous example in the literature of such a hierarchy is the dot-depth hierarchy, introduced by Brzozowski [2], which starts of with the trivial +variety, and whose union is the +-variety of star-free (aperiodic) languages. *The second author was supported in part by NSF-NATO postdoctoral fellowship DGE9972697,and by FCT through Centm d e Matema'tica da Universidade do Porto.
312
Pin and Margolis [6] also studied the group hierarchy which takes as its base the *-variety of all group languages. In [13, 141, the author studied the levels one-half and one of the concatenation hierarchy associated to a pseudovariety of groups H . In particular, it was shown that if H is a pseudovariety of groups closed under semidirect product with the pseudovariety G, of pgroups for some prime p , then
PH = BPol(H) where BPol(H) is the pseudovariety corresponding to the Boolean polyne mial closure of the H-languages [S]. A similar equality was shown to hold between the pseudovariety corresponding to the polynomial closure of the H-languages and an ordered analog of P H . All the aforementioned pseude varieties were considered as pseudovarieties of monoids. In this paper, we prove a semigroup analog of these results; here H is replaced by LH, the pseudovariety of semigroups whose submonoids are in H ; we are then able to show that BPoZ(LH) = P ( L H ) and its ordered analog (provided, of course, H = G, * H for some prime p ) . Special cases include: G , the pseudovariety of finite groups; G,; Gsol,the pseudovariety of finite solvable groups. For the case of G , we can characterize P ( L G ) as L ( P G ) ,semigroups which are locally block groups; hence BPol(LG) has a polynomial time membership algorithm.
2
Preliminaries
As this paper extends the results of [14] to the semigroup context, it seems best to refer the reader there for basic notation and definitions, only monoids will be replaced throughout by semigroups; the reader is also referred to the general references [1, 3, 7, 81. A semigroup S is a set with an associative multiplication. An ordered semigroup (S,5 ) is a semigroup S with a partial order 5 , compatible with the multiplication; that is to say, m 5 n implies rm 5 Tn and mr 5 nr. Any semigroup S can be viewed as an ordered semigroup with the equality relation as the ordering, and free semigroups will always be regarded this way. An order ideal of an ordered semigroup (S, 0 whence we can conclude that L, is infinite. The Pumping Lemma then applies to show that there exist S I , S Z , S ~E S such that s = s1s;s3 for all n > 0. Thus, by choosing n carefully, we see that s = sles3 with e an idempotent. Then sk+' = ~ l ( e s 3 s l e ) ~for s 3 lc > 0. Since S E LG, it follows that for some m > 0, ( e s 3 ~ l e= ) ~e whence p + l =
s l ( e ~ 3 s l e ) ~= s 3sles3 = s.
319
Thus S is completely regular (and so every element is %,-equivalent to an idempotent). Thus, to finish our proof, it suffices to show that all idempotents of S are 3-equivalent. Let e, f E S be idempotents. Then ( e f e ) n = e for some n > 0 (since s E LG) so e E SfS. Dually, f E SeS so e ,7 f . The result follows. 0 We now prove a theorem which implies the converse of Corollary 6.2.
Theorem 6.4. Let V 2 LG. Then P’V’ C LJ+ @ V. firthennore, i f V contains a non-trivial monoid, then P V C BPoZ(V). Proof. The second statement follows from the first by (*). It suffices to show that if S E V,then ( P ’ ( S ) ,2 ) E LJ+ @ V. The identity map $J : P’(S) + P’(S) gives rise to a relational morphism $J : P’(S) -e+ S; in fact, X $ J Y $ J= X Y = ( X Y ) $ J Let . e E S be an idempotent. Then
eq-’ = { X E P’(S)le E x } . An idempotent of e$-’ is then a subsemigroup E C S with e E E and E 2 = E . Lemma 6.3 shows that E is completely simple, so EeE = E . It follows that if Y E eq-l, then E Y E 2 BeE = E whence the local monoid with identity E has E as its greatest element; we conclude that e$J-l E LJ+. 0 Since LH contains a non-trivial monoid whenever H is non-trivial, we immediately obtain the following theorem which is one of our main results.
Theorem 6.5. Let H be a pseudovariety of groups such that Gp * H = H for some prime p. Then PoZ(H) = P’(LH)+ and BPoZ(LH) = P(LH). In particular, these results hold for H any of G, G p (p prime), or Gsol.
7 Locally Block Groups A block group is a semigroup whose regular elements have unique inverses (or, equivalently, semigroups which do not have a non-trivial right or left zero subsemigroup). The pseudovariety of such is denote BG. We use D for the pseudovariety of semigroups whose idempotents are right zeros. We now recall some important facts whose consequences we shall use without comment:
1. P G = J * G = B G = E J [4];
320
2. L(EJ) = EJ * D [12, Proposition 10.21, [17, The Delay Theorem]; 3. LG = G * D [15,171; 4. If H is a pseudovariety of groups, then BPoZ(H) = J * H [9, 141;
5. For any pseudovariety of semigroups V, J * V is generated by semidirect products M * N with M E J+ and N E V [16, 141; 6. For a monoid M , M E J+ if and only M E LJ+.
Proposition 7.1. Let H be a pseudovariety of groups. Then
c
c
P'(LH)+ P ~ Z ( L H ) L(P~z(H)); P(LH) BPoZ(LH) L(BPoZ(H)).
c
Proof. The first containment of the first statement follows from Theorem 6.4. The second containment follows from Proposition 6.1 which shows that
PoZ(LH) = LJ+ @ LH C_ L(LJ+ @ H) = L(PoZ(H)). The second statement follows from the first by (*).
0
The following lemma will be of use. As its proof is identical to the unordered case [18, Lemma 2.21, we omit the proof. 'p : S * T + T be a semidirect product projection from a semidirect product of (ordered) semigroups, and let e E T be an idempotent. Then any submonoid of ecp-' (order) embeds in S .
Lemma 7.2. Let
Using our collection of facts and the above lemma, one deduces immediately
Corollary 7.3. Let V be a pseudovariety of semigroups. Then
c
J+ * V LJ+ @ V = PoZ(V); J * V C_ BPoZ(V). We now show that for the case of G, all the pseudovarieties in question are the same. Theorem 7.4. P(LG) = L(PG) = L(BG)
321
Proof. Proposition 7.1 shows that P(LG) L(PG) (here we are using that PG = J * G = BPoZ(G)). For the other direction, using that PG = EJ, we see that L(PG) = E J * D = J * G * D = J *LG. But, by Corollary 7.3,
J * LG
BPoZ(LG).
However, by Theorem 6.5, the righthand side is none other than P(LG). The result follows. 0
It is clear that one can verify if a semigroup is locally a block group in polynomial time whence P(LG) = BPoZ(LG) has polynomial time membership problem. Observe that we have also shown that L(BG) = J * LG. We note that an entirely similar argument would show that P’(LG)+ = PoZ(LG) = L(P’G+) if one could show that EJ+ is local (the argument of [12, Proposition 10.21 fails because (B;)+@ EJ+).
References [l] J. Almeida, Finite Semigroups and Universal Algebra, World Scientific, 1994. [2] J. A. Brzozowski, Hierarchies of aperiodic languages, RAIRO Inform. T h b r . 10 (1976), 33-49. [3] S. Eilenberg, Automata, Languages and Machines, Academic Press, New York, Vol A, 1974: Vol B, 1976. [4] K. Henckell, S. Margolis, J. -E. Pin, and J. Rhodes, Ash’s type 11 theorem, profinite topology and Malcev products. Part I, Internat. J. Algebra and Comput. 1 (1991), 411-436. [5] S. W. Margolis and J.-E. Pin, Varieties of finite monoids and topology for the free monoid, in “Proceedings of the 1984 Marquette Conference on Semigroups” (K. Byleen, P. Jones and F. Pastijn eds.), Marquette University (1984), 113-130. [6] S . W. Margolis and J.-E. Pin, Product of group languages, PTOC.FCT Conf., Lecture Notes in Computer Science, Voll99 (Springer, Berlin, 1985), 285-299. [7] J.-E. Pin, Eilenberg’s theorem for positive varieties of languages, Russian M a t h . (Iz.VUZ) 39 (1995), 7483. [8] J.-E. Pin, Syntactic semigmups, Chap. 10 in Handbook of language theory, Vol. I, G. Rozenberg and A. Salomaa (ed.), Springer Verlag, 1997, 67S746. [9] J.-E. Pin, Bridges for concatenation hierarchies, in 25th ICALP, Berlin, 1998, pp. 431-442, Lecture Notes in Computer Science 1443, Springer Verlag.
3222322322
[lo] J.-E. Pin and P. Weil, Polynomial closure and unambiguous product, Theory Comput. Systems 30 (1997), 1-39. [ll] J.-E. Pin and P. Weil, Semidirect product of ordered semigroupps, Comm. in Algebra, to appear.
[12] B. Steinberg, Semidirect products of categories and applications, J. Pure Appl. Algebra 142 (1999), 153-182. [13] B. Steinberg, A note o n the equation PH = J appear.
* H,
Semigroup Forum, to
[14] B. Steinberg, Polynomial closure and topology, Internat. J. Algebra and Comput., 10 (2000), 603-624. [15] H. Straubing, Finite semigroup varieties of the form V Algebra 36 (1985), 53-94.
* D,
J. Pure Appl.
[16] H. Straubing and D. Thkrien, Partially ordered finite monoids and a theorem of I. Simon, J. Algebra 119 (1985), 393-399. [17] B. Tilson, Categories as algebra, J. Pure and Applied Algebra 48 (1987) 83198. [18] P.Weil, Closure of varieties of languages under products with counter, J. Comput. System Sci. 45 (1992), 316-339.
323
Routes and Trajectories Alexandru Mateescu*
Abstract
This paper is an overview of some basic facts about routes and trajectories. We introduce and investigate a new operation of parallel composition of words. This operation can be used both for DNA computation as well as for parallel computation. For instance the recombination of DNA sequences produces a new sequence starting from two parent sequences. The resulting sequence is formed by starting at the left end of one parent sequence, copying a substring, crossing over t o some site in the other parent sequence, copying a substring, crossing back to some site in the first parent sequence and so on. The new method that we introduce is based on syntactic constraints on the crossover operation.
1 Introduction
We define and investigate new methods to define the parallel composition of words and languages. These operations are suitable both for concurrency and for DNA computation. The operation of splicing on routes leads to new shuffle-like operations defined by syntactic constraints on the usual crossover operation. The constraints involve the general strategy used to switch from one word to another word. Once such a strategy is defined, the structure of the words being operated on does not play any role.
Definition 1.1 A nondeterministic generalized sequential machine (gsm) is an ordered system G = (Σ, Δ, Q, δ, Q₀, F), where Σ and Δ are alphabets, Q is a finite set of states, Q₀ ⊆ Q is the set of initial states, F ⊆ Q is the set of final states, and δ : Q × Σ → P_fin(Q × Δ*) is the transition function. If L is a language, then δ(L) denotes the image of L under the gsm.
*Faculty of Mathematics, University of Bucharest, Romania, email: alexmate@pcnet.ro
Definition 1.2 Assume that Σ = {a₁, a₂, …, aₙ} is an ordered alphabet. Let w ∈ Σ* be a word. The Parikh vector of w is Ψ(w) = (|w|_{a₁}, |w|_{a₂}, …, |w|_{aₙ}), where |w|_{aᵢ} denotes the number of occurrences of aᵢ in w.
The operations are introduced using a uniform method based on the notion of route. A route defines how to skip from one word to another word during the parallel composition. These operations lead in a natural way to a large class of semirings. The approach is very flexible: various concepts from the theory of concurrency and of DNA computation can be introduced and studied in this framework. For instance, we provide examples of applications to the fairness property and to the parallelization of languages. The reader is referred to the monograph [1] for the notion of fairness. The application considered deals with the parallelization of non-context-free languages. The parallelization problem for a non-context-free language L consists in finding a representation of L as the shuffle over a set T of trajectories of two languages L₁ and L₂, such that each of the languages L₁, T and L₂ is context-free, or even regular. This problem is related to the problem of parallelization of algorithms for a parallel computer, a central topic in the theory of parallel computation.
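As a small illustration of Definition 1.2, the Parikh vector can be computed directly from the definition. The following Python sketch is my own illustration; the function name and the explicit alphabet argument are not notation from the paper.

```python
def parikh_vector(word, ordered_alphabet):
    """Return Psi(word) = (|word|_a1, ..., |word|_an) for the given letter ordering."""
    return tuple(word.count(letter) for letter in ordered_alphabet)

# Example over the ordered alphabet (a, b, c):
assert parikh_vector("abcab", ("a", "b", "c")) == (2, 2, 1)
```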
2 Operations on routes and trajectories
In this section we introduce the notions of route and of splicing on routes. For more details on these notions, the reader is referred to [4]. Consider the alphabet V = {1, 1̄, 2, 2̄}. Elements of V are referred to as versors. Denote V₊ = {1, 2}, V₋ = {1̄, 2̄}, V₁ = {1, 1̄} and V₂ = {2, 2̄}.
Definition 2.1 A route is an element t ∈ V* and a trajectory is an element t' ∈ V₊*.
Let Σ be an alphabet and let α, β be words over Σ. Assume that d ∈ V and t ∈ V*.
Definition 2.2 The splicing of α with β on the route dt, denoted α ⋈_{dt} β, is defined as follows: if α = au and β = bv, where a, b ∈ Σ and u, v ∈ Σ*, then:
au ⋈_{dt} bv = a(u ⋈_t bv), if d = 1,
               u ⋈_t bv,    if d = 1̄,
               b(au ⋈_t v), if d = 2,
               au ⋈_t v,    if d = 2̄.
If α = au and β = λ, a ∈ Σ, u ∈ Σ*, then:
au ⋈_{dt} λ = a(u ⋈_t λ), if d = 1,
              u ⋈_t λ,    if d = 1̄,
              ∅,          otherwise.
If α = λ and β = bv, b ∈ Σ, v ∈ Σ*, then:
λ ⋈_{dt} bv = b(λ ⋈_t v), if d = 2,
              λ ⋈_t v,    if d = 2̄,
              ∅,          otherwise.
Finally,
λ ⋈_t λ = λ, if t = λ,
          ∅, otherwise.
Remark 2.1 One can easily notice that if |α| ≠ |t|_{V₁} or |β| ≠ |t|_{V₂}, then α ⋈_t β = ∅.
The operation of splicing on a route is extended in a natural way to the operation of splicing on a set of routes, as well as to an operation between languages. If T is a set of routes, the splicing of α with β on the set T of routes, denoted α ⋈_T β, is:
α ⋈_T β = ⋃_{t ∈ T} α ⋈_t β.
The above operation is extended to languages in the obvious way. In the sequel we consider some particular cases of the operation of splicing on routes. This shows that splicing on routes is a very general operation with great expressive power.
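The recursive clauses of Definition 2.2 translate directly into code. The sketch below is my own illustration, not part of the paper: the tokens '1', '1-', '2', '2-' stand for the versors 1, 1̄, 2, 2̄, and the two function names are arbitrary.

```python
def splice(alpha, beta, route):
    """alpha splice_route beta, returned as a set: empty, or a singleton (a single route is deterministic)."""
    if not route:
        return {""} if alpha == "" and beta == "" else set()
    d, t = route[0], route[1:]
    if d == "1" and alpha:          # copy one letter of the first word
        return {alpha[0] + w for w in splice(alpha[1:], beta, t)}
    if d == "1-" and alpha:         # discard one letter of the first word
        return splice(alpha[1:], beta, t)
    if d == "2" and beta:           # copy one letter of the second word
        return {beta[0] + w for w in splice(alpha, beta[1:], t)}
    if d == "2-" and beta:          # discard one letter of the second word
        return splice(alpha, beta[1:], t)
    return set()                    # the route does not fit the two words

def splice_on_set(alpha, beta, routes):
    """alpha splice_T beta = union of alpha splice_t beta over t in the finite iterable T."""
    result = set()
    for t in routes:
        result |= splice(alpha, beta, t)
    return result

# The route 1 2 1 2 interleaves the two words letter by letter:
assert splice("ab", "cd", ["1", "2", "1", "2"]) == {"acbd"}
```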
2.1 Some binary operations that are particular cases of splicing on routes
Here we show that many customary binary operations on words and languages are particular cases of the operation of splicing on routes.
1. If T = {1, 1̄, 2, 2̄}* then ⋈_T is the crossover operation.
2. If T = 1*1̄*2̄*2* then ⋈_T is the simple splicing operation; see [7] for more details on this operation (the simple splicing of two words α and β is γ, where α = γ₁α₁, β = β₁γ₂ and γ = γ₁γ₂).
3. If T = { 1ⁿ2ⁿ2̄ᵐ1̄ᵐ | n, m ≥ 0 }* then ⋈_T is the equal-length crossover.
4. If T = {1, 2}* then ⋈_T = ⧢, the shuffle operation.
5. If T = (12)*(1* ∪ 2*) then ⋈_T = ⧢_l, the literal shuffle.
6. If T ⊆ {1, 2}* then ⋈_T is the shuffle on the set T of trajectories.
7. If T = 1*2* then ⋈_T = ·, the catenation operation.
8. If T = 2*1* then ⋈_T = ·', the anti-catenation operation.
9. If T = 1*2*1* then ⋈_T = ←, the insertion operation.
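To see concretely how the operations listed above arise as special cases, the following brute-force sketch (my own illustration, practical only for short words) enumerates the candidate trajectories over {1, 2} of the right length and keeps those admitted by a predicate describing T:

```python
from itertools import product

def shuffle_on_T(alpha, beta, in_T):
    """alpha shuffled with beta over the set T of trajectories, T given as a predicate on strings over {1,2}."""
    results = set()
    for t in product("12", repeat=len(alpha) + len(beta)):
        traj = "".join(t)
        if traj.count("1") != len(alpha) or traj.count("2") != len(beta):
            continue                     # the trajectory does not fit the two lengths
        if not in_T(traj):
            continue
        i = j = 0
        out = []
        for d in traj:
            if d == "1":
                out.append(alpha[i]); i += 1
            else:
                out.append(beta[j]); j += 1
        results.add("".join(out))
    return results

# T = 1*2* gives catenation, T = {1,2}* gives the ordinary shuffle:
assert shuffle_on_T("ab", "cd", lambda t: "21" not in t) == {"abcd"}
assert "cabd" in shuffle_on_T("ab", "cd", lambda t: True)
```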
2.2 Some unary operations that are particular cases of splicing on routes
Let Σ be an alphabet. Assume that T is a set of routes such that T = T'2̄* with T' ⊆ {1, 1̄}*. Note that for all languages L, L₁, L₂ ⊆ Σ*, with L₁, L₂ nonempty, it follows that:
L ⋈_T L₁ = L ⋈_T L₂.
Therefore, in this case, the operation ⋈_T does not depend on the second argument. Consequently, several well-known unary operations on words and languages are particular cases of the operation of splicing on routes. In the sequel we denote by ∇_T the unary operation defined by ⋈_T, in the case T ⊆ {1, 1̄, 2̄}*.
1. If T = 1*1̄*2̄* then ∇_T(L) = Pref(L), the prefixes of L.
2. If T = 1̄*1*2̄* then ∇_T(L) = Suf(L), the suffixes of L.
3. If T = 1̄*1*1̄*2̄* then ∇_T(L) = Sub(L), the subwords of L.
4. If T = {1, 1̄}*2̄* then ∇_T(L) = Scatt(L), the scattered subwords of L.
5. If T = {1ᵏ1̄ᵏ | k ≥ 0}2̄* then ∇_T(L) = ½(L).
6. If T = 1*1̄*1*2̄* then ∇_T(L) = L → Σ*.

3 Splicing on routes of regular and context-free languages
This section is devoted to the operation of splicing on routes applied to regular and context-free languages. We consider the situations in which the sets of routes are regular or context-free languages. The following theorem, see [4], states that the splicing of two regular languages over a regular set of routes is a regular language; the second part of the theorem involves context-free languages.
Theorem 3.1 Let L₁, L₂ and T, T ⊆ {1, 1̄, 2, 2̄}*, be three languages.
(i) If all three languages are regular, then L₁ ⋈_T L₂ is a regular language.
(ii) If two of the languages are regular and the third one is context-free, then L₁ ⋈_T L₂ is a context-free language.
In the sequel we use Theorem 3.1 to obtain some well-known closure properties, as well as some further closure properties, of regular and context-free languages under a number of operations.
Corollary 3.1 Several closure properties of the families of regular and context-free languages can be obtained from the above theorem. For instance, the family of regular languages is closed under the following operations: crossover, simple splicing, shuffle, literal shuffle, catenation, anti-catenation, insertion. Moreover, the above operations applied to a context-free language and to a regular language produce a context-free language.
Remark 3.1 The conditions of the above theorem cannot be relaxed: if two of the languages are context-free and the third one is regular, then L₁ ⋈_T L₂ is not necessarily a context-free language. Assume that T is regular, T = {1, 2}*. It is known that there are context-free languages L₁, L₂ such that L₁ ⋈_T L₂, i.e., L₁ ⧢ L₂, is not a context-free language. For the other two cases, assume that T = { 1ⁿ21²ⁿ | n ≥ 1 }, L₁ = { aⁿbⁿcᵐ | n, m ≥ 1 } and L₂ = {d}. Note that
(L₁ ⋈_T L₂) ∩ a*db*c⁺ = { aⁿdbⁿcⁿ | n ≥ 1 }.
Hence, L₁ ⋈_T L₂ is not a context-free language. If T = { 2ⁿ⁺¹12²ⁿ | n ≥ 1 }, then L₂ ⋈_T L₁ is not a context-free language.
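For small n the displayed intersection can be checked by brute force. The script below is my own illustration of the example in Remark 3.1, with small hard-coded bounds:

```python
import re

def splice_route(alpha, beta, route):
    """alpha splice_t beta for a route over {1,2}; None if the route does not fit."""
    out, i, j = [], 0, 0
    for d in route:
        if d == "1":
            if i == len(alpha):
                return None
            out.append(alpha[i]); i += 1
        else:
            if j == len(beta):
                return None
            out.append(beta[j]); j += 1
    return "".join(out) if i == len(alpha) and j == len(beta) else None

pattern = re.compile(r"a*db*c+$")
found = set()
for n in range(1, 4):
    route = "1" * n + "2" + "1" * (2 * n)        # t = 1^n 2 1^(2n)
    for p in range(1, 5):                        # words a^p b^p c^q of L1
        for q in range(1, 5):
            w = splice_route("a" * p + "b" * p + "c" * q, "d", route)
            if w is not None and pattern.match(w):
                found.add(w)
print(sorted(found))   # ['aaadbbbccc', 'aadbbcc', 'adbc']
```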
4 Trajectories
In this section we introduce the notion of the trajectory and of the shuffle on trajectories. The shuffle on trajectories is a special case of the operation of splicing on routes.
Remark 4.1 From now on we denote by "r" the versor "1" and by "u" the versor "2". In the remaining part of this paper we will not use the versors 1̄ and 2̄. Thus the alphabet V is V = {r, u}. Also, the operation ⋈_T is denoted by ⧢_T and is called the shuffle on the set T of trajectories.
Definition 4.1 A trajectory is an element t, t ∈ V*.
The following theorem is a representation result for the languages of the form L₁ ⧢_T L₂.
Theorem 4.1 For all languages L₁ and L₂, L₁, L₂ ⊆ Σ*, and for all sets T of trajectories, there exist a morphism φ, two letter-to-letter morphisms g and h, g : Σ → Σ₁ and h : Σ → Σ₂, where Σ₁ and Σ₂ are two copies of Σ, and a regular language R, such that
Consequently, we obtain the following:
Corollary 4.1 For all languages L₁ and L₂, L₁, L₂ ⊆ Σ*, and for all sets T of trajectories, there exist a gsm M and two letter-to-letter morphisms g and h such that
5 Some algebraic properties
This section is devoted to some important algebraic properties of the operation of shuffle on trajectories, which was introduced and studied in [6].
5.1 Completeness
A complete set T of trajectories has the property that, for each lattice point in the plane, i.e., for each point in the plane with nonnegative integer coordinates, there exists at least one trajectory in T that ends in this lattice point.
Definition 5.1 A set T of trajectories is complete iff α ⧢_T β ≠ ∅, for all α, β ∈ Σ*.
Definition 5.2 The balanced insertion is the following operation: if w = xy with |x| = |y|, then w ←_b z = xzy.
Example 5.1 Shuffle, catenation, insertion are complete sets of trajectories. Noncomplete sets of trajectories are, for instance, the balanced literal shuffle, the balanced insertion, and all finite sets of trajectories; see [6] for more details about balanced literal shuffle and balanced insertion.
Remark 5.1 T is complete iff Ψ(T) = N², i.e., the restriction of the Parikh mapping Ψ to T is a surjective mapping (Ψ|_T is surjective).
Proposition 5.1 If T is a set of trajectories such that T is a semilinear language with Ψ(T) effectively computable, then it is decidable whether or not T is complete.
Corollary 5.1 If T is a context-free language or if T is a simple matrix language, then it is decidable whether or not T is complete.
5.2 Determinism
A deterministic set T of trajectories has the property that, for each lattice point in the plane, there exists at most one trajectory in T that ends in this lattice point.
Definition 5.3 A set T of trajectories is deterministic iff card(α ⧢_T β) ≤ 1, for all α, β ∈ Σ*.
Example 5.2 Catenation, balanced literal shuffle, balanced insertion are deterministic sets of trajectories. Nondeterministic sets of trajectories are, for instance, shuffle and insertion.
Remark 5.2 T is deterministic iff the restriction of the Parikh mapping to T is injective.
Proposition 5.2 Let L be a class of semilinear languages, effectively closed under catenation and under gsm mappings. If T ∈ L, then it is decidable whether or not T is deterministic.
Corollary 5.2 If T is a context-free language or if T is a simple matrix language, then it is decidable whether or not T is deterministic.
Proposition 5.3 It is undecidable whether or not a context-sensitive set T of trajectories is deterministic.
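Remarks 5.1 and 5.2 reduce completeness and determinism to properties of the Parikh mapping restricted to T. The sketch below is my own illustration and only a finite sample test, not a decision procedure: it records which lattice points are reached by a finite sample of T and whether any point is reached twice.

```python
from collections import Counter

def parikh_points(sample_of_T):
    """Multiset of Parikh images (|t|_r, |t|_u) over a finite sample of trajectories."""
    return Counter((t.count("r"), t.count("u")) for t in sample_of_T)

sample = ["", "r", "u", "ru", "ur", "rru"]        # a finite sample of some T
points = parikh_points(sample)
reached_twice = [p for p, k in points.items() if k > 1]
print(points)          # lattice points hit by the sample (a gap never rules out completeness)
print(reached_twice)   # non-empty => T is certainly not deterministic; here (1, 1) is hit twice
```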
5.3 Commutativity
The property of an operation to be commutative is a well-known algebraic property.
Definition 5.4 A set T of trajectories is referred to as commutative iff the operation ⧢_T is a commutative operation, i.e., α ⧢_T β = β ⧢_T α, for all α, β ∈ Σ*.
Example 5.3 Shuffle is a commutative set of trajectories, whereas, for instance, catenation and insertion are noncommutative sets of trajectories.
Notation. The morphism sym : {r, u}* → {r, u}* is defined by sym(r) = u and sym(u) = r.
Remark 5.3 T is commutative iff T = sym(T).
Proposition 5.4 Let T be a set of trajectories.
(i) If T is a regular language, then it is decidable whether or not T is commutative.
(ii) If T is a context-free language, then it is undecidable whether or not T is commutative.
Remark 5.4 No nonempty commutative set of trajectories is deterministic.
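For a finite set of trajectories, Remark 5.3 gives a purely syntactic commutativity test, T = sym(T). The following sketch is my own illustration; the helper names are arbitrary:

```python
def sym(t):
    """The morphism sym: exchange the versors r and u."""
    return t.translate(str.maketrans("ru", "ur"))

def is_commutative(T):
    """Remark 5.3 for a finite set of trajectories: T is commutative iff T = sym(T)."""
    return set(T) == {sym(t) for t in T}

assert is_commutative({"ru", "ur", "rruu", "uurr", "ruru", "urur"})
assert not is_commutative({"ru"})     # sym("ru") = "ur" is missing from T
```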
5.4 The unit element
Again, the existence of a unit element is an important algebraic property of an operation.
Definition 5.5 A set T of trajectories has a unit element iff the operation ⧢_T has a unit element, i.e., iff there exists a word 1 ∈ Σ* such that 1 ⧢_T α = α ⧢_T 1 = α, for all α ∈ Σ*.
Remark 5.5 T has a unit element iff λ is the unit element. Moreover, T has a unit element iff (r* ∪ u*) ⊆ T. Note that the above property is decidable if T is a context-free language.
5.5 Associativity
We now start our discussion concerning associativity. After presenting a general characterization result (Proposition 5.5), we show that the property of associativity is preserved under certain transformations.
Definition 5.6 A set T of trajectories is associative iff the operation ⧢_T is associative, i.e.,
(α ⧢_T β) ⧢_T γ = α ⧢_T (β ⧢_T γ),
for all α, β, γ ∈ Σ*.
Example 5.4 The following sets of trajectories are associative:
(i) T = {r, u}*, the shuffle, ⧢.
(ii) T = { rⁱu²ʲrⁱ | i, j ≥ 0 }*, the balanced insertion, ←_b.
(iii) T = r*u*, the catenation, ·.
(iv) T = u*r*, the anti-catenation, ·'.
Examples of nonassociative sets of trajectories are:
(i') T = (ru)*(r* ∪ u*), the literal shuffle, ⧢_l.
(ii') T = (ru)*, the balanced literal shuffle, ⧢_bl.
(iii') T = r*u*r*, the insertion, ←.
Definition 5.7 Let D be the set D = {x, y, z}. Define the substitutions σ and τ as follows: σ, τ : V → P(D*),
σ(r) = {x, y}, σ(u) = {z}, τ(r) = {x}, τ(u) = {y, z}.
Consider the morphisms φ and ψ: φ, ψ : V → D*,
Proposition 5.5 Let T be a set of trajectories. The following conditions are equivalent:
(i) T is an associative set of trajectories.
(ii) σ(T) ∩ (φ(T) ⧢ z*) = τ(T) ∩ (ψ(T) ⧢ x*).
Proposition 5.6 Let T be a set of trajectories.
(i) If T is a regular language, then it is decidable whether or not T is associative.
(ii) if T is a context-free language, then it is undecidable whether or not T is associative.
Notation. Let A be the family of all associative sets of trajectories.
Proposition 5.7 The family A is an anti-AFL.
Proposition 5.8 If (Tᵢ)_{i∈I} is a family of sets of trajectories such that, for all i ∈ I, Tᵢ is an associative set of trajectories, then
T' = ⋂_{i∈I} Tᵢ
is an associative set of trajectories.
Definition 5.8 Let T be an arbitrary set of trajectories. The associative closure of T, denoted T̄, is
T̄ = ⋂_{T ⊆ T', T' ∈ A} T'.
Observe that for all T, T ⊆ {r, u}*, T̄ is an associative set of trajectories and, moreover, T̄ is the smallest associative set of trajectories that contains T.
Example 5.5 One can easily verify that the associative closure of the insertion ← is the shuffle ⧢. Similarly, the associative closure of the balanced insertion is the balanced insertion itself. This is of course also obvious because balanced insertion is associative.
-,
- : P(V*)+P(V*)defined as:
i s a closure operator. We provide now another characterization of an associative set of trajectories. This is useful in finding an alternative definition of the associative closure of a set of trajectories and also to prove some other properties related to associativity.
Definition 5.9 Let W be the alphabet W = {x,y,z} and consider the following four morphisms, pi, 1 5 i 5 4, where
pi:w-+v* ,
1_ 1. The following operations: shuffle (LU), catenation (.), insertion (+), balanced insertion ( t b ) do not have the n-fairness property for any n, n 2 1. Definition 6.2 Let n be a fixed number, n 2 1. Define the language Fn as:
Fₙ = { t ∈ V* : | |t'|_r − |t'|_u | ≤ n, for all t' such that t = t't'', t'' ∈ V* }.
Remark 6.1 Note that a set T of trajectories has the n-fairness property if and only if T ⊆ Fₙ.
Proposition 6.1 For every n, n ≥ 1, the language Fₙ is a regular language.
Corollary 6.1 Let T be a set of trajectories. If T is a context-free language and if n is fixed, n ≥ 1, then it is decidable whether or not T has the n-fairness property.
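The defining condition of Fₙ can be tested prefix by prefix. The sketch below is my own illustration: it checks membership of a single trajectory in Fₙ and, via Remark 6.1, the n-fairness property of a finite set of trajectories.

```python
def in_F_n(t, n):
    """True iff every prefix t' of t satisfies | |t'|_r - |t'|_u | <= n."""
    balance = 0
    for d in t:
        balance += 1 if d == "r" else -1
        if abs(balance) > n:
            return False
    return True

def has_n_fairness(T, n):
    """Remark 6.1 for a finite T: the n-fairness property means T is contained in F_n."""
    return all(in_F_n(t, n) for t in T)

assert has_n_fairness({"ruru", "urur"}, 1)
assert not has_n_fairness({"rrrruuuu"}, 3)      # the prefix rrrr already violates n = 3
```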
Remark 6.2 The fairness property is not a property of the set Ψ(T), as one can observe in the case when T is the set
T = { rⁱuⁱ | i ≥ 1 }.
Indeed, T does not have the n-fairness property for any n, n ≥ 1, despite the fact that Ψ(T) is the first diagonal.
Proposition 6.2 The fairness property is preserved in the transition to the commutative closure.
Proposition 6.3 The fairness property is not preserved by applying the associative closure.
6.2 On parallelization of languages using shuffle on trajectories
The parallelization of a problem consists in decomposing the problem into subproblems, such that each subproblem can be solved by a processor, i.e., the subproblems are solved in parallel and, finally, the partial results are
collected and assembled into the answer to the initial problem by a processor. Solving problems in this way increases the time efficiency. It is known that not every problem can be parallelized. Also, no general methods are known for the parallelization of problems. Here we formulate the problem in terms of languages and shuffle on trajectories, and we present some examples. Assume that L is a language. The parallelization of L consists in finding languages L₁, L₂ and T, T ⊆ V*, such that L = L₁ ⧢_T L₂ and, moreover, the complexity of L₁, L₂ and T is in some sense smaller than the complexity of L. In the sequel the complexity of a language L refers to the Chomsky class of L, i.e., regular languages are less complex than context-free languages, which are, in turn, less complex than context-sensitive languages. It is easy to see that every language L, L ⊆ {a,b}*, can be written as L = a* ⧢_T b* for some set T of trajectories. However, this is not a parallelization of L, since the complexity of T is the same as the complexity of L. In view of Corollary 3.1 there are non-context-free languages L such that L = L₁ ⧢_T L₂ for some context-free languages L₁, L₂ and T; moreover, one of those three languages can even be a regular language. Note that this is a parallelization of L. As a first example we consider the non-context-free language L ⊆ {a, b, c}*, L = { w : |w|_a = |w|_b = |w|_c }. Consider the languages L₁ ⊆ {a, b}*, L₁ = { u : |u|_a = |u|_b }, L₂ = c* and T = { t : |t|_r = 2|t|_u }. One can easily verify that L = L₁ ⧢_T L₂. Moreover, note that L₁ and T are context-free languages, whereas L₂ is a regular language. Hence this is a parallelization of L. As a consequence of Corollary 3.1 one cannot expect a significant improvement of this result, for instance to have only one context-free language and two regular languages in the decomposition of L. Finding characterizations of those languages that have a parallelization remains a challenging problem.
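The parallelization given for L = { w : |w|_a = |w|_b = |w|_c } can be spot-checked by brute force on short words. The script below is my own illustration with small hard-coded samples; it only verifies that the words produced by L₁ ⧢_T L₂ lie in L, not the converse inclusion.

```python
from itertools import product

def shuffle_words(alpha, beta, in_T):
    """Brute-force shuffle of two words over a set T of trajectories given as a predicate."""
    out = set()
    for t in product("ru", repeat=len(alpha) + len(beta)):
        traj = "".join(t)
        if traj.count("r") != len(alpha) or traj.count("u") != len(beta):
            continue
        if not in_T(traj):
            continue
        i = j = 0
        word = []
        for d in traj:
            if d == "r":
                word.append(alpha[i]); i += 1
            else:
                word.append(beta[j]); j += 1
        out.add("".join(word))
    return out

in_L = lambda w: w.count("a") == w.count("b") == w.count("c")
in_T = lambda t: t.count("r") == 2 * t.count("u")        # T = { t : |t|_r = 2|t|_u }

# L1-words with equally many a's and b's, shuffled with c^k over T, stay inside L:
for u in ["", "ab", "ba", "abab", "aabb"]:
    k = len(u) // 2
    for w in shuffle_words(u, "c" * k, in_T):
        assert in_L(w)
```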
7 Conclusion
The operation of splicing on routes and its special case, the shuffle on trajectories, offer a new topic of research in the area of parallel composition of words. For the special case of infinite words the reader is referred to [5].
Acknowledgement. The author is grateful to Kyoto Sangyo University, Professor Masami Ito, and to Rovira i Virgili University, Tarragona, Professor Carlos Martín-Vide, for offering all conditions to write this paper. Also, many thanks to the anonymous referee for his valuable comments.
References
[1] N. Francez, Fairness, Springer-Verlag, Berlin, 1986.
[2] J. S. Golan, The Theory of Semirings with Applications in Mathematics and Theoretical Computer Science, Longman Scientific and Technical, Harlow, Essex, 1992.
[3] W. Kuich and A. Salomaa, Semirings, Automata, Languages, EATCS Monographs on Theoretical Computer Science, Springer-Verlag, Berlin, 1986.
[4] A. Mateescu, "Splicing on Routes: a Framework of DNA Computation", Unconventional Models of Computation, UMC'98, Auckland, New Zealand, C. S. Calude, J. Casti and M. J. Dinneen (eds.), Springer, 1998, 273-285.
[5] A. Mateescu and G. D. Mateescu, "Associative shuffle of infinite words", Structures in Logic and Computer Science, eds. J. Mycielski, G. Rozenberg and A. Salomaa, Lecture Notes in Computer Science (LNCS) 1261, Springer, 1997, 291-307.
[6] A. Mateescu, G. Rozenberg and A. Salomaa, "Shuffle on Trajectories: Syntactic Constraints", Theoretical Computer Science 197, 1-2 (1998), 1-56, Fundamental Study.
[7] A. Mateescu, G. Păun, G. Rozenberg and A. Salomaa, "Simple splicing systems", Discrete Applied Mathematics 84, 1-3 (1998), 145-163.
[8] "Handbook of Formal Languages", eds. G. Rozenberg and A. Salomaa, Springer, 1997.
Characterization of valuation rings and valuation semigroups by semistar-operations
Ryûki Matsuda
Ibaraki University, Mito, Japan
Email: [email protected]
Let L be a commutative ring which coincides with its total quotient ring, that is, L = { a/b | a, b ∈ L and b is a non-zerodivisor of L }. Let Γ be a totally ordered abelian (additive) group. A mapping v of L onto Γ ∪ {∞} is called a valuation on L if v(ab) = v(a) + v(b) and v(a + b) ≥ inf{v(a), v(b)} for all elements a and b of L. The subring { a ∈ L | v(a) ≥ 0 } of L is called the valuation ring associated to v. The valuation ring is one of the most important notions and tools in commutative ring theory. The aim of this talk is to characterize valuation rings and valuation semigroups by semistar-operations. In §0, we state relationships between commutative ring theory and commutative semigroups (explicitly, grading monoids). In §1, we are concerned with a commutative ring without zerodivisors (that is, an integral domain) and a grading monoid. The results of §1 appeared in [M6] and [M7]. In §2, we generalize the results for integral domains to commutative rings with zerodivisors. The results of §2 appeared in [M8].
§0. Commutative ring theory and commutative semigroups
In this section, we will state relationships between commutative semigroups and commutative ring theory. Let G be a torsion-free abelian (additive) group, and let S be a subsemigroup of G which contains 0. Then S is called a grading monoid ([No]). We will call a grading monoid simply a g-monoid. For example, the direct sum Z₀ ⊕ ··· ⊕ Z₀ of n copies of the non-negative integers Z₀ is a g-monoid. Many terms in commutative ring theory may be defined analogously for S. For example, a non-empty subset I of S is called an ideal of S if S + I ⊆ I. Let I be an ideal of S with I ≠ S. If s₁ + s₂ ∈ I (for s₁, s₂ ∈ S) implies s₁ ∈ I or s₂ ∈ I, then I is called a prime ideal of S. Let Γ be a totally ordered abelian (additive) group. A mapping v of a torsion-free abelian group G onto Γ is called a valuation on G if v(x + y) = v(x) + v(y) for all x, y ∈ G. The subsemigroup { x ∈ G | v(x) ≥ 0 } of G is called the valuation semigroup of G associated to v. The maximum number n such that there exists a chain P₁ ⊊ P₂ ⊊ ··· ⊊ Pₙ of prime ideals of S is called the dimension of S. If every ideal I of S is finitely generated, that is, I = ⋃ᵢ₌₁ⁿ (S + sᵢ) for a finite number of elements s₁, …, sₙ of S, then S is called a Noetherian semigroup. Many propositions for commutative rings are known to hold for
S. For example, if S is a Noetherian semigroup, then every finitely generated extension g-monoid S[x₁, …, xₙ] = S + Z₀x₁ + ··· + Z₀xₙ is also Noetherian [M4], and the integral closure of S is a Krull semigroup [M5]. Ideal theory of S is interesting in itself and important for semigroup rings. Let R be a commutative ring, and let S be a g-monoid. There arises the semigroup ring R[S] of S over R: R[S] = R[X; S] = { Σ_finite aₛXˢ | aₛ ∈ R, s ∈ S }. If S is the direct sum Z₀ ⊕ ··· ⊕ Z₀ of n copies of Z₀, then R[S] is isomorphic to the polynomial ring R[X₁, …, Xₙ] in n variables over R. Assume that the semigroup ring D[S] over a domain D is a Krull domain. Then D. F. Anderson [A] and Chouinard [C] showed that C(D[S]) ≅ C(D) ⊕ C(S), where C denotes the ideal class group. Thus they were able to construct domains that have various ideal class groups. For another example, assume that D is integrally closed and S is integrally closed. Then we have (I₁ ∩ ··· ∩ Iₙ)^w = I₁^w ∩ ··· ∩ Iₙ^w for every finite number of finitely generated ideals I₁, …, Iₙ of D[S] if and only if (I₁ ∩ ··· ∩ Iₙ)^w = I₁^w ∩ ··· ∩ Iₙ^w for every finite number of finitely generated ideals I₁, …, Iₙ of D and (I₁ ∩ ··· ∩ Iₙ)^w = I₁^w ∩ ··· ∩ Iₙ^w for every finite number of finitely generated ideals I₁, …, Iₙ of S ([M3]), where w is the w-operation. For references on the ideal theory of S and R[S] we refer to [G].
§1. Valuation semigroups and valuation domains
Let D be an integral domain with quotient field K. Let F(D) be the set of non-zero fractional ideals of D. A mapping I ↦ I* of F(D) to F(D) is called a star-operation on D if, for all a ∈ K − {0} and I, J ∈ F(D): (1) (a)* = (a); (2) (aI)* = aI*; (3) I ⊆ I*; (4) if I ⊆ J, then I* ⊆ J*; and (5) (I*)* = I*. Let Σ(D) be the set of star-operations on D. Let F'(D) be the set of non-zero D-submodules of K. A mapping I ↦ I* of F'(D) to F'(D) is called a semistar-operation on D if, for all a ∈ K − {0} and I, J ∈ F'(D): (1) (aI)* = aI*; (2) I ⊆ I*; (3) if I ⊆ J, then I* ⊆ J*; and (4) (I*)* = I*. Let Σ'(D) be the set of semistar-operations on D. [H, Lemma 5.2] and [AA, Proposition 12] showed: Let V be a non-trivial valuation ring on a field, and M its maximal ideal. If M is principal, then |Σ(V)| = 1. If M is not principal, then |Σ(V)| = 2. [OM, Theorem 48] showed: A domain D is a discrete valuation ring of dimension 1 if and only if |Σ'(D)| = 2. [MS, Corollary 6] showed: Let D be an integrally closed quasi-local domain with dimension n. Then D is a valuation ring if and only if n + 1 ≤ |Σ'(D)| ≤ 2n + 1. Let S be a g-monoid with quotient group G; G = { s − s' | s, s' ∈ S }. Let F(S) be the set of fractional ideals of S. A mapping I ↦ I* of F(S) to F(S) is called a star-operation on S if, for all a ∈ G and I, J ∈ F(S): (1) (a)* = (a);
(2) (a + I)* = a + I*; (3) I ⊆ I*; (4) if I ⊆ J, then I* ⊆ J*; (5) (I*)* = I*. For example, if we set I^d = I for each I ∈ F(S), then d is a star-operation on S which is called the d-operation on S. Let I^v be the intersection of the principal fractional ideals containing I; then v is a star-operation on S which is called the v-operation on S. Let Σ(S) be the set of star-operations on S. Let F'(S) be the set of non-empty subsets I of G such that S + I ⊆ I. A mapping I ↦ I* of F'(S) to F'(S) is called a semistar-operation on S if, for all a ∈ G and I, J ∈ F'(S): (1) (a + I)* = a + I*; (2) I ⊆ I*; (3) if I ⊆ J, then I* ⊆ J*; (4) (I*)* = I*. For example, if we set I^{d'} = I for each I ∈ F'(S), then d' is a semistar-operation on S which is called the d'-operation on S. For each I ∈ F'(S), we set I^{v'} = I^v if I ∈ F(S), and set I^{v'} = G if I ∉ F(S). Then v' is a semistar-operation on S which is called the v'-operation on S. Let Σ'(S) be the set of semistar-operations on S. Let V be a valuation semigroup. If its value group is discrete, V is called a discrete valuation semigroup. In this section, we will prove the following four theorems:
Theorem 1. Let S be a g-monoid with dimension n. Then S is a discrete valuation semigroup if and only if |Σ'(S)| = n + 1.
Theorem 2. Let V be a valuation semigroup of dimension n, v its valuation and Γ its value group. Let M = Pₙ ⊋ Pₙ₋₁ ⊋ ··· ⊋ P₁ be the prime ideals of V, and let (0) ⊊ Hₙ₋₁ ⊊ ··· ⊊ H₁ ⊊ Γ be the convex subgroups of Γ. Let m be a positive integer such that n + 1 ≤ m ≤ 2n + 1. Then the following are equivalent:
(1) |Σ'(V)| = m.
(2) The maximal ideal of the g-monoid V_{P_i} = { s − t | s ∈ V, t ∈ V − P_i } is principal for exactly 2n + 1 − m of the indices i.
(3) The ordered abelian group Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the indices i.
Theorem 3. Let D be a domain with dimension n. Then D is a discrete valuation ring if and only if |Σ'(D)| = n + 1.
Theorem 4. Let V be a valuation domain of dimension n, v its valuation and Γ its value group. Let M = Pₙ ⊋ Pₙ₋₁ ⊋ ··· ⊋ P₁ ⊋ (0) be the prime ideals of V, and let (0) ⊊ Hₙ₋₁ ⊊ ··· ⊊ H₁ ⊊ Γ be the convex subgroups of Γ. Let m be a positive integer such that n + 1 ≤ m ≤ 2n + 1. Then the following are equivalent:
(1) |Σ'(V)| = m.
(2) The maximal ideal of V_{P_i} is principal for exactly 2n + 1 − m of the indices i.
(3) Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the indices i.
Lemma 1. (1) Let * be a semistar-operation on a g-monoid S. Then, for all I, J ∈ F'(S) we have (I + J)* = (I* + J*)*.
(2) Let * be a semistar-operation on S. If R is an oversemigroup of S, that is, if R is a subsemigroup of the quotient group of S containing S, then R* is an oversemigroup of S.
(3) Let R be an oversemigroup of S, and let * ∈ Σ'(S). If we set J^{α(*)} = J* for each J ∈ F'(R), then α(*) is a semistar-operation on R.
(4) Let R be an oversemigroup of S, and let * ∈ Σ'(R). If we set I^{δ(*)} = (I + R)* for each I ∈ F'(S), then δ(*) is a semistar-operation on S.
(5) Let V be a valuation semigroup on a torsion-free abelian group G. Then we have F'(V) = F(V) ∪ {G}.
We call α(*) of Lemma 1(3) the ascent of * to R. We call δ(*) of Lemma 1(4) the descent of * to S.
Lemma 2. Let V be a valuation semigroup on a torsion-free abelian group with maximal ideal M . (1) If M is principal, then 1 C ( V ) I= 1. (2) If M is not principal, then 1 C(V)(=2. For the proof, assume that M = ( p ) is principal. Let I E F ( V ) , and let z @ I . Then we have I c (z+ p ) and z @ (z p ) . It follows that I = I" and C(V)= { d } . If M is not principal, we have v # d. Let * be a semistar-operation on V, and let I E F ( V ) such that I I*. Take an element b E I* - I. Then we have (b) c I* C I" C (b). Hence I" = I*and * = v.
+
5
Lemma 3. A g-monoid S is a discrete valuation semigroup of dimension 1 if and only if I C'(S) I= 2. For the proof, let G be the quotient group of S, and assume that S is a discrete valuation semigroup of dimension 1 with maximal ideal M. Let * be a semistar-operation on S. If S* = G, then * = e. If S* G, we have S* = S. Lemma 2 implies that C'(S) = {d',e}. If I C'(S) I= 2, then C(S) = { d } . Suppose that there exists a valuation oversemigroup V such that S V G. Let d" be the identity mapping on F'(V), and let * be the descent of d" to S. Then Cl(S) has at least three members d', e, *; a contradiction. Then M is principal by Lemma 2.
5
5 5
An element a of the quotient group of S is called integral over S if na ∈ S for some positive integer n. The set S̄ of integral elements over S is called the integral closure of S.
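As a small illustrative check of this definition (my own example, not from the paper), consider the numerical g-monoid S = {0, 2, 3, 4, …} generated by 2 and 3: the element 1 of its quotient group Z is integral over S, while the negative integers are not. The bound n_max below is an arbitrary cut-off for the finite test.

```python
def in_S(x):
    """Membership in the g-monoid S = {0, 2, 3, 4, ...}."""
    return x == 0 or x >= 2

def is_integral_over_S(a, n_max=50):
    """a is integral over S iff n*a lies in S for some positive integer n (tested up to n_max)."""
    return any(in_S(n * a) for n in range(1, n_max + 1))

# 2*1 = 2 lies in S, so 1 is integral over S and S is not integrally closed;
# every negative integer fails the test.
assert is_integral_over_S(1) and not is_integral_over_S(-1)
```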
Lemma 4. If S is a valuation semigroup of dimension n, then n C’(s) 15 2n 1. If I C‘(S) I< 00, then is a valuation semigroup.
+
+ 1 51
For the proof, let G be the quotient group of S, and let S be a valuation semigroup of dimension n. Let M = P, Pn-l ... PI be a chain of prime ideals of S. Let di be the identity mapping on F’(Vpi),and let * i be the descent of d: to S. Then C’(S) contains at least ( n+ 1)-members d’,*1,-.. ,*“-1,e. Suppose that I C’(S) I< 2k + 1 for k = 1 , 2 , - .. ,n - 1, and let * E C’(V).If V* = G, then * = e. If V* = V, then * = d’ or * = u’. If V V * ,we have V* = Vpdfor some i. Then * is the descent to V of the semistar-operation a(*)on Vpi,and hence I C’(S) 15 2n 1. Suppose that is not a valuation semigroup. Then there exists an element u E G such that u 9 and -u $! S. Then we have S[u] S[2u]2 . .. 3 S. It follows that I C’(S) I= 00; a contradiction.
2
2
5
s
3
+ 2
s
Lemma 5. Let V be a valuation semigroup with value group r. Let P be a prime ideal of V , and let H be the associated convex subgroup of I?. Then I’/H is the value group of Vp.
For the proof, let w be the composition of u and the canonical mapping r / H . Then w is a valuation on G (the quotient group of S) with value group r / H . The valuation semigroup of w is Vp.
--+
Lemma 6. Let V be a discrete valuation semigroup of dimension n. Then we have I C’(V) I= n+ 1.
2
2
For the proof, let M = P 3 Pn-l ... PI be a chain of prime ideals “f of V. Set Ui = Vp; for each a. Then Ui is a discrete valuation semigroup of dimension i by Lemma 5. For each i, let d: be the identity mapping on F’(U+),and let *i be the descent of d: to V . We show that C’(V) = { e , * 1 , - . - ,*n-l,d’} by the induction on n. If n = 1, the assertion holds by Lemma 3. Suppose that the assertion holds for each i < n. We have C‘(Vi) = { a ( e ) ,a(*l),. . . ,a(*+)}by the induction hypothesis, where a is the ascent mapping of V to U+.Then Lemmas 1 and 2 complete the proof. Lemma 7. Let S be a g-monoid of dimension n with Then S is a discrete valuation semigroup.
I C’(S) I=
n
+ 1.
2
2
Proof. There exists a chain of prime ideals of S: S M = P,, 3 P,-1 PI. Let di be the identity mapping on F’(Spi) for each i. Let * i be the descent of d: to S for each i. By the assumption we have C’(S) = {e, * I , . . . , *,,-I, d’}. Lemma 4 implies that S is a valuation semigroup. Then Ui = Spi is a valuation semigroup of dimension i for each i. Then Lemmas 2 and 3 imply that V is discrete by the induction on n.
...
2
Theorem 1 follows from Lemmas 6 and 7. Let S be a g-monoid of dimension n with n 151 C’(S) 15 2n 1. Then S is not necessarily integrally closed. For example, let S = { 0 , 2 , 3 , 4 , . .. }. Then S is a 1-dimensional g-monoid, and has 1 C’(S) I= 3 . But S is not integrally closed.
+
+
Proof of Theorem 2. Assume that the maximal ideal of Vpi is principal for exactly 2n 1- m of i. We show that I C’(V) I= m by induction on n. Suppose that the assertion holds for n - 1. Set W = VpnPl. The case M is not principal: Since the assertion holds for W , and since 2n + 1 - m = 2 ( n - 1) 1 - ( m - 2 ) , we have 1 C’(W) I= m - 2. Let C’(W) = { * I , . - - ,*m-2}. Then 6 ( * 1 ) , - . -,6(*,,-2),d’,v’ are distinct each other by Lemma 2. Let * be a semistar-operation on V. If V* = V, then * = d’ or * = v’. If V* 2 V, we have V* = Vpi for some i. Then * is the descent of the ascent a(*)of * to Vpi, and hence I C’(V) I= m. The case that M is principal: Since the assertion hold for W , and since 2n - m = 2 ( n - 1) 1 - ( m- l), we have I C’(W)I= m - 1. Let C ’ ( W ) = { * I , . . - ,*,,-I}. Then 6(*1),-.’,6(*m- l),d’ are distinct each other. Let * be a semistar-operation on V. If V* = V, then * = d‘ by Lemma 2. If V* V , we have V* = Vpi for some i. Then * is the descent of the ascent a(*) to Vp;, and hence I C‘(V) I= m. Since the value group of Vp 0. If A has is the maximal
Theorem 2. Let V be a Marot valuation ring of r-dimension let M be its maximal ideal.
> 0,
and
(1)If M is principal, then I C(V)/ -I= 1. (2) If M is not principal, then I C(V)/ -I= 2. Proof. (1) Let M = xV, let I be a regular ideal of V, and let I' = I". If y # I is a regular element, then I is properly contained in yV. Hence I C xyV yV. Thus I' c yxV and y # I'. We conclude that I = I'. Remark 2 implies that I C(V)/ -I= 1. (2) Assume that I is a regular ideal of V which is not divisorial. Take a regular element u E I" - I. Then u-'I c M . If a-lI M ,take a regular element b E M - u-'I. Then I c abV c a M , and (u)= I" c (ub) c uM (u); a contradiction. Hence u-'I = M and hence, 1 C(V) I / -I= 2.
5
5
5
Proposition 1. Let V be a Z-valued Marot valuation ring. Then C'(V/) -I= 2.
I
For the proof, let M = ( p ) be the maximal ideal of V. We have F'(V) = { K } U {p"V I n E Z} U {infty ideals}, where K =q(V). Let * be a semistaroperation on V. If V* = K , then (p"V)* = p"V* = K . Hence * e. If V* = p"V for some n, since (V*)* = V*, we have n = 0. It follows that * d'. Therefore I C'(V)/ NI= 2. N
N
Remark 5. If I C'(A)/ NI=
2, then C'(A)/
N=
{ [ d ' ] ,[el}.
Remark 6. Let V be a Z-valued Marot valuation ring. Then need not be 2.
I C'(V) I
For example, let k be a field, w a Z-valued valuation on k, and W its valuation ring. Let M be a vactor space over k of dim ( M ) > 1, and let K = k @ M be the semidirect product of k and M ([Na]). For each element a+x ( a E k and x E M ) of K , we set v(u x) = ~ ( u )Then . v is a Z-valued valuation on K . The valuation ring V of v equals to W M . We see that V is a Marot ring, M is an infty ideal of V, and d',v' and e are distinct by Remark 4. Hence I C'(V) I# 2.
+
+
Let D be a Bezout domain, and M a D-module. Then the semidirect product D @ M is a Marot ring ([Ml, Remark 131). We may canonically define partial orders 5 on C ( R ) and on C'(R). For a domain D, Anderson-Anderson [AA]considered the partial order 5 on C ( D ) . We define the partial order 5 on C ( R ) (resp. C'(R))by *1 5 *2 if and only if I*'c I*2 for every I E F ( R ) (resp. F'(R)).
Remark 7. (1) The partial order 5 on C’(D) need not be a total order. (2) On C ‘ ( R ) ,d’ is the smallest element and e is the greatest element. (3) On C ( R ) ,d is the smallest element. (4) On C ( D ) ,v is the greatest element. Example for (1): Let D be a domain with overrings D1 and D2 which are incomparable. For each I E F ( D ) , set I*l = ID1 and I*2 = ID2. Then * 1 and *2 are incomparable.
Remark 8. If I C’(A) I= 2, then r-dim ( A ) > 0 need not hold. For example, let Ic be a field with I k I= 2, and let M be a Ic-module with I M I= 2. Then the semidirect product A = Ic @ M is a Marot ring, and I C‘(A) I= 2. However, r-dim ( A ) = 0. We note that each overring of a Marot ring is a Marot ring. And A is integrally closed if and only if A is the intersection of the set of valuation overrings of A ([HH,Corollary 2.41).
Lemma 3. If I C’( A) / NI=
2, then A is a Z-valued valuation ring.
Proof. Then A is a valuation ring. Since v’ is not similar to e, each regular fractional ideal of A is divisorial. By Theorem 1, the r-dimension of A is 1, and the maximal ideal M is principal. Suppose that A is not a Z-valued valuation ring. Let v be the valuation associated to A, and r its value group. There exists a E I? such that nv(p) < a for every n E N . Coose x E K such that v(x) = -a. We have x = a / b for a , b E A with b regular. It follows that a 5 v(b). Therefore n,Mn is an r-prime ideal of A, and hence r-dim ( A ) > 1; a contradiction. Proposition 2 and Lemma 3 imply the following,
Theorem 3. A is a Z-valued valuation ring if and only if I C ’ ( A ) / NI=
2.
Let P be a prime ideal of the ring R. Then the overring {x E q(R) I sx E R for some s E R - P } is denoted by R p ] .
Lemma 4. Let A be a valuation ring of r-dimension n. Then n + 1 51 C’(A)/ 4. For the proof, let M = P,
2 ... 2 PI be the r-prime ideals of A. Then
2
3 . .. ~ p n - l 3l V. If we set I*i = I q p i ] for each I E F’(A), we have there arise semistar-operations *I, . .. ,*,+I. Lemma 5. Let V be a Marot valuation ring of r-dimension 1. If the maximal ideal is principal, then V is Z-valued and 1 C’(V)/ NI= 2.
For, each regular ideal of V is divisorial by Theorem 2. The proof of Lemma 3 shows that V is Z-valued, and then 1 C’(V)/ w J = 2 by Proposition 1.
+
Lemma 6. Let V be a Marot valuation ring of r-dimension n 1, and let W be its valuation overring of r-dimension n for a positive integer n. Let * E C‘(V) so that * is not similar to any member of bwp(C’(W)).Then * is similar to d’ or v’. Proof. If V* = V, then we have * d’ or * 21’ by Theorem 2. Thus V* 2 V and W c V*. Let I be a regular ideal of V. Then IW c IV* c I*. Hence I*= (IW)*.It follows that * E SW,V(C’(W)). N
N
Proposition 2. Let V be a Marot valuation ring with r-dimension n. Then n 1 51 C’(V)/ -I< 2n 1.
+
+
Proof. We have n + l 51 C’(V)/ N I by Lemma 4. Assume that n = 1,and let M be the maximal ideal of V. If M is principal, then I C ’ ( V ) / NI= 2 by Lemma 5. If M is not principal, then I C’(V)/ -I= 3 by Theorem 2(2). Then repeating Lemma 6 step by step for a valuation ring V with r-dimension n, we have the required inequality. We say that a ring R has property (U), if each regular ideal of R is a (set-theoretical) union of regular principal ideals. We say that a ring R has property (FU), if Reg(I) c U y I i implies I c U y I i for each family of a finite number of regular ideals I , I l , - - -,In. If R has property (U), then R has property (FU), and if R has property (FU), then R is a Marot ring. If R is a Marot ring, then R need not have property (FU), and if R has property (FU), then R need not have property (U) ([Ml]).
Lemma 7 ([M2, (2.10)Lemma ). Let 6 , ..- ,V, be valuation overrings of A such that A = V, n . . .n V,. Then, if A has property (U), A is a Priifer ring. Theorem 4. Assume that A has property (U) and is integrally closed. If
C'(A)/ -I< 00, then A is a Priifer ring with only a finitely many r-prime ideals. Furthermore, if, in addition, A has a unique r-maximal ideal, then A is a valuation ring. Proof. A can be written in the form A = nxeAVx, where the Vx are valuation overrings of A. Let d', be the identity mapping of F'(Vx), and let *A be the descent of d i to A. Then *xl and t x z are not similar for A1 # Xa. It follows that 1 A I< 00. Then A is a Priifer ring by Lemma 7. Let {Pi I i} be the set of r-prime ideals of A. Set I*' = I A p 1 for each I E F'(A). Then *il and ti2 are not similar for il # ia. Hence A has only finitely many r-prime ideals. Corollary 1. Assume that A has property (U) which is integrally closed and has a unique r-maximal ideal with r-dimension n. Then A is a valuation ring if and only if n 1 51 C'(A)/ -15 2n 1.
+
+
Lemma 8. Let V be a Marot valuation ring with value group I'. Let P be an r-prime ideal of V , and let H be the associated convex subgroup of l?. Then I'/H is the value group of ypl. Proof. Let w be the composition of v and the canonical map {I?, 00) -+ {I'/H,co}. Then w is a valuation on K with value group r / H . Let a E K with w ( a ) >_ 0. There exists an element a E V such that .(a) - v(a) E H . Then there exists an element s E V - P such that .(a) - .(a) = w(s) or .(a) - .(a) = -v(s). Hence a E ypl. The proof is complete. Let V be a valuation ring. If the value group I? of V is discrete, then V is called discrete.
Theorem 5. Let A be a ring with r-dimension n. Then A is a discrete valuation ring if and only if I C'(A)/ -I= n 1.
+
Using Theorems 2 and 3, the proof is similar to that of Theorem 3 of 51.
Theorem 6. Let V be a Marot valuation ring of r-dimension n, and r its value group. Let M = P, 2 P,-1 3 . . . 2 PI be the r-prime ideals of V , and let (0) HnP1 -.- H I I? be the convex subgroups of I'. Let m be a positive integer such that n 1 5 m 5 2n + 1. Then the followings are equivalent: (1) I C'(V)/ -I= m. is principal for exactly 2n 1 - m of i. (2) The maximal ideal of
5
5 5 5 +
+
(3) Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the indices i.
Using Theorem 2, the proof is similar to that of Theorem 4 of §1.
REFERENCES
[A] D. F. Anderson, The divisor class group of a semigroup ring, Comm. Alg. 8 (1980), 467-476.
[AA] D. D. Anderson and D. F. Anderson, Examples of star operations on integral domains, Comm. Alg. 18 (1990), 1621-1643.
[C] L. Chouinard, Krull semigroups and divisor class groups, Can. J. Math. 33 (1981), 1459-1468.
[G] R. Gilmer, Commutative Semigroup Rings, The Univ. Chicago Press, 1984.
[H] W. Heinzer, Integral domains in which each non-zero ideal is divisorial, Mathematika 15 (1968), 164-170.
[HH] G. W. Hinkle and J. A. Huckaba, The generalized Kronecker function ring and the ring R(X), J. Reine Angew. Math. 292 (1978), 25-36.
[M1] R. Matsuda, On Marot rings, Proc. Japan Acad. 60 (1984), 134-137.
[M2] R. Matsuda, Generalizations of multiplicative ideal theory to commutative rings with zerodivisors, Bull. Fac. Sci., Ibaraki Univ. 17 (1985), 49-101.
[M3] R. Matsuda, Torsion-free abelian semigroup rings IX, Bull. Fac. Sci., Ibaraki Univ. 26 (1994), 1-12.
[M4] R. Matsuda, Some theorems for semigroups, Math. J. Ibaraki Univ. 30 (1998), 1-7.
[M5] R. Matsuda, The Mori-Nagata Theorem for semigroups, Math. Japon. 49 (1999), 17-19.
[M6] R. Matsuda, Note on the number of semistar-operations, Math. J. Ibaraki Univ. 31 (1999), 47-53.
[M7] R. Matsuda, Note on valuation rings and semistar-operations, Comm. Alg. 28 (2000), 2515-2519.
[M8] R. Matsuda, A note on the number of semistar-operations, II, Far East J. Math. Sci. 2 (2000), 159-172.
[MS] R. Matsuda and T. Sugatani, Semistar-operations on integral domains, II, Math. J. Toyama Univ. 18 (1995), 155-161.
[Na] M. Nagata, Local Rings, Interscience, 1962.
[No] D. Northcott, Lessons on Rings, Modules and Multiplicities, Cambridge Univ. Press, 1968.
[OM] A. Okabe and R. Matsuda, Semistar-operations on integral domains, Math. J. Toyama Univ. 17 (1994), 1-21.
FURTHER RESULTS ON RESTARTING AUTOMATA
GUNDULA NIEMANN    FRIEDRICH OTTO
Fachbereich Mathematik/Informatik, Universität Kassel, D-34109 Kassel, Germany
E-mail: {niemann,otto}@theory.informatik.uni-kassel.de
Jančar et al. (1995) developed the restarting automaton as a formal model for certain syntactical aspects of natural languages. Here it is shown that, with respect to its expressive power, the use of nonterminal symbols by restarting automata corresponds to the language-theoretical operation of intersection with regular languages. Further, we establish another characterization of the class of Church-Rosser languages by showing that it coincides with the class of languages accepted by the det-RRWW-automata, thus extending an earlier result presented at DLT'99. Finally, we show that the Gladkij language L_Gl is accepted by an RRWW-automaton, which implies that the class GCSL of growing context-sensitive languages is properly contained in the class L(RRWW).
1 Introduction
Jančar et al. presented the restarting automaton, which is a nondeterministic machine model processing strings that are stored in lists. These automata model certain elementary aspects of the syntactical analysis of natural languages. A restarting automaton, or R-automaton for short, has a finite control, and it has a read/write-head with a finite look-ahead working on a list of symbols (or a tape). As defined there, it can perform two kinds of operations: a move-right step, which shifts the read/write-window one position to the right and possibly changes the actual state, and a restart step, which deletes some letters in the read/write-window, places this window over the left end of the list (or tape), and puts the automaton back into its initial state. In subsequent papers Jančar and his co-workers extended the restarting automaton by introducing rewrite-steps that, instead of simply deleting some letters, replace the contents of the read/write-window by some shorter string over the input alphabet. This is the so-called RW-automaton. Later the restarting automaton was allowed to use auxiliary symbols, so-called nonterminals, in the replacement operation, which leads to the so-called RWW-automaton. Finally, the restarting operation was separated from the rewriting operation, which yields the RRW-automaton. Obviously, the latter variations can be combined, giving the so-called RRWW-automaton.
Since one can put various additional restrictions on each of these variants of the restarting automaton, a potentially very large family of automata and corresponding language classes is obtained. For example, various notions of monotonicity have been defined, and it has been shown that the monotonous and deterministic RW-automata accept the deterministic context-free languages and that the monotonous RWW-automata accept the context-free languages 6,7. However, the various forms of the non-monotonous deterministic restarting automaton were not investigated in detail until it was shown in 11 that the deterministic RWW-automata accept the Church-Rosser languages. Here we continue this work by investigating some classes of deterministic restarting automata and their relationship to each other and to the corresponding nondeterministic restarting automata. As a general result we will see that the use of nonterminals in the rewriting operation of a restarting automaton corresponds, on the part of the language accepted, to the operation of taking the intersection with a regular language. Secondly, we will show that by separating the restarting operation from the rewriting operation the descriptive power of the deterministic restarting automaton is not increased, that is, the deterministic RRWW-automata still accept the Church-Rosser languages. It should be pointed out that for the general case of nondeterministic restarting automata it is an open question whether or not this separation increases the power of the restarting automaton. Further, we will see that the Gladkij language L_Gl = { w#wᴿ#w | w ∈ {a,b}* } is accepted by some RRWW-automaton, which proves that the class GCSL of growing context-sensitive languages is properly contained in L(RRWW), as L_Gl is not growing context-sensitive.
This paper is structured as follows. In Section 2 we provide the definition of the restarting automaton and its variants. In Section 3 we analyze the effect of using auxiliary symbols in the rewrite operation, and in Section 4 we consider the inclusion relations between the language classes defined by the various classes of deterministic restarting automata. In the following section we consider the Gladkij language. The paper closes with some characterizations of the language classes considered through certain types of prefix-rewriting systems.
2 The restarting automaton and some of its variants
In this section we do not follow the historical development of the restarting automaton, but instead we introduce the most general version first and present the other variants as certain restrictions thereof.
A restarting automaton with rewriting, RRWW-automaton for short, is described by a 9-tuple M = (Q, Σ, Γ, δ, q₀, ¢, $, F, H), where Q is a finite set of states, Σ is a finite input alphabet, Γ is a finite tape alphabet containing Σ, q₀ ∈ Q is the initial state, ¢, $ ∈ Γ \ Σ are the markers for the left and right border of the work space, respectively, F ⊆ Q is the set of accepting states, H ⊆ Q \ F is the set of rejecting states, and δ is the transition relation. Here Γ^{≤n} denotes ⋃ᵢ₌₀ⁿ Γⁱ, 2^S denotes the set of subsets of the set S, and k ≥ 1 is the look-ahead of M. The look-ahead is implicitly given through the transition relation, but it is an important parameter of M. The transition relation consists of three different types of transition steps:
1. A move-right step is of the form (q', MVR) ∈ δ(q, u), where q ∈ Q \ (F ∪ H), q' ∈ Q and u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, u ≠ $, that is, if M is in state q and sees the string u in its read/write-window, then it shifts its read/write-window one position to the right and enters state q', and if q' ∈ F ∪ H, then it halts, either accepting or rejecting.
2. A rewrite-step is of the form (q', v) ∈ δ(q, u), where q ∈ Q \ (F ∪ H), q' ∈ Q, u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, and |v| < |u|, that is, the contents u of the read/write-window is replaced by the string v, which is strictly shorter than u, and the state q' is entered. Further, the read/write-window is placed immediately to the right of the string v. In addition, if q' ∈ F ∪ H, then M halts, either accepting or rejecting. However, some additional restrictions apply: the border markers ¢ and $ must neither disappear from the tape nor may new occurrences of these markers be created. Further, the read/write-window must not move across the right border marker $, that is, if u is of the form u₁$, then v is of the form v₁$, and after execution of the rewrite operation the read/write-window just contains the string $.
3. A restart-step is of the form RESTART ∈ δ(q, u), where q ∈ Q \ (F ∪ H) and u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, that is, if M is in state q seeing u in its read/write-window, it can move its read/write-window to the left end of the tape, so that the first symbol it sees is the left border marker ¢, and it reenters the initial state q₀.
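To make the interplay of move-right, rewrite and restart steps concrete, here is a toy simulation; it is my own illustration and not a construction from this paper. In every cycle the simulated machine scans right over the a-block, performs one rewrite (deleting a factor ab), checks the remaining suffix on its way to the right border marker, and restarts; it accepts exactly { aⁿbⁿ | n ≥ 0 }.

```python
def accepts_anbn(word, max_cycles=10_000):
    """Toy simulation of a (deterministic) restarting automaton for { a^n b^n | n >= 0 }."""
    tape = list(word)                        # tape contents between the markers ¢ and $
    for _ in range(max_cycles):              # each iteration simulates one cycle
        if not tape:                         # tape is ¢$ : accept
            return True
        i = 0
        while i < len(tape) and tape[i] == "a":
            i += 1                           # MVR-steps over the leading a's
        if i == 0 or i >= len(tape) or tape[i] != "b":
            return False                     # halt in a rejecting state
        del tape[i - 1:i + 1]                # rewrite-step: erase one factor "ab"
        if any(c != "b" for c in tape[i - 1:]):
            return False                     # scan to $ and reject if an 'a' follows
        # otherwise perform a RESTART: window back to ¢, initial state
    return False

assert accepts_anbn("aaabbb")
assert not accepts_anbn("aabbb") and not accepts_anbn("abab")
```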
Obviously, each computation of M proceeds in cycles. Starting from an initial configuration q₀¢w$, the head moves right, while MVR- and rewrite-steps are executed, until finally a RESTART-step takes M back into a configuration of the form q₀¢w₁$. It is required that in each such cycle exactly one rewrite-step is executed. By ⊢ᶜ_M we denote the execution of a complete cycle, that is, the above computation will be expressed as q₀¢w$ ⊢ᶜ_M q₀¢w₁$. An input w ∈ Σ* is accepted by M if there exists a computation of M which starts with the initial configuration q₀¢w$ and which finally reaches a configuration containing an accepting state q_a ∈ F. By L(M) we denote the language accepted by M. The following lemma can easily be proved by standard techniques from automata theory.
Lemma 2.1. Each RRWW-automaton M is equivalent to an RRWW-automaton M' that satisfies the following additional restrictions:
(a) M' enters an accepting or a rejecting state only when it sees the right border marker $ in its read/write-window.
(b) M' makes a restart-step only when it sees the right border marker $ in its read/write-window.
This lemma means that in each cycle, and also during the last part of a computation, the read/write-window moves all the way to the right before a RESTART is made, respectively before the machine halts. By placing certain restrictions on the transition relation of an RRWW-automaton we get various subclasses. Here we are interested in the following restrictions and the corresponding language classes: An RRWW-automaton is deterministic if its transition relation is a (partial) function
S : (Q \ (F UH ) ) x 0
0
+ ( ( Q x ({MVR} U I?_ 0, given anbnc as input, M performs an accepting computation of the following form:
tb qo&wi$k L Q O $ W Z $k b .. . t-b qo$wm$ kb $uqaV$ for some qa E F . Since w1,. . . ,w, E C*, and since M accepts starting from the initial configuration qo&wi$,we see that w1,. . . ,W m E L6. If n is qo$anbnc$
sufficiently large, then M cannot rewrite the tape contents &anbnc$ into a string of the form &aib2id$within a single cycle. Hence, 201 is of the form w1 = an-jbn-jc for some j 2 1. Now consider the input z := anb2"d. Starting from the initial configuration qo&anbnb"d$, M will perform the same rewrite-step, that is, qo$a"b"b"d$ t-b $an-jbn-jqbnd$ for some q E Q. Following this rewritestep M will either reject on encountering the symbol d , or it will make a RESTART, that is, qo$anb2"d$ l-L qo$an-jb2n-jd$. As an-jb2"-jd $2 L6, we see that in each case L ( M ) # L6. Thus, L6 is not accepted by any deter0 ministic RRW-automaton. The observations above show that the following inclusions are proper. Corollary 4.3. (a) L(det-RW) C L(det-RWW). (b) L(det-RRW) c L(det-RRWW) = L(det-RWW). However, it remains open whether or not the inclusion L(det-RW) L(det-RRW) is proper. F'rom Corollary 4.3 and the results of Section 3 we obtain the following closure and non-closure properties in analogy to Corollary 3.3. Corollary 4.4. (a) L(det-RWW) and L(det-RRWW) are closed under the operation of taking the intersection with a regular language. (b) C(det-RW) and L(det-RRW) are not closed under this operation.
363
Since Ls E L(RRW), Lemma 4.2 implies that the inclusion L(det-RRW) C L(RRW) is proper. Further, since CRL is properly contained in GCSL 2 , also the inclusion L(det-RWW) c L(RWW) is proper. Hence, in summary we have the situation depicted in Figure 2 , where a question mark indicates that it is an open problem whether the corresponding inclusion is proper, and a language given as a label of an edge means that this language is an example that shows that the corresponding inclusion is indeed a proper one. Here L2 := { endn 1 n 2 0 } U { end" 1 rn > 2 n 2 0 } and L7 := { anbm I 0 5 n 5 m 5 2 n } are taken from 516.
C(RRWW)
Figure 2.
5
The Gladkij language is in L(RRWW)
For the nondeterministic restarting automata we have the chain of inclusions GCSL C L(RWW) L(RRWW) CSL, where CSL denotes the class of context-sensitive languages. It is known that GCSL is properly contained in CSL, but it is open which of the intermediate inclusions are proper. Let LGI denote the so-called Gladkij language 3 , that is, L G ~= { w#wR#w 1 w E {a,b}* }. It is known that L G ~is a context-sensitive language that is not growing context-sensitive l. Here we will show that this language is accepted by some RRWW-automaton, thus separating the class GCSL from the class L(RRWW). Theorem 5.1. L G E~ L(RRWW). Proof. We will construct an RRWW-automaton M that accepts the lan-
c
364
gudge L G ~As . by Corollary 3.3, L(RRWW) is closed under the operation of taking intersections with regular languages, we can restrict our attention to inputs of the form u#w#w, where u, w,w E { a , b}*. Let C := { a , b, #}, and let r := C U {&,$} U { A,, B,, C, I u E { a ,b}2 }. Further, for M's look-ahead we choose the number 7. The action of A4 on an input of the form u#w#w is described by the following algorithm, where win always denotes the actual contents of M's readlwrite-window: (1.)
(2.)
(3.) (3.1) (34
(4.) (4.1) (4.2)
(5.) (5.1) (5.2)
(5.3) (6.) (7.) (7.1) (7.2)
if win = $x#y#z$ then (* The window contains the tape inscription completely. *) if x#y#z E L G ~ or z E { a , b, A}, y = x and z = xC, for some u E { a , b } 2 then ACCEPT else REJECT; repeat MVR until win E r2. # . or win E { a , b, $ } . {A,B, I u E { a , b}2} . if win = u2#w2y for some ~ 2 , 7 1 2E r2then begin if '112 $! { a ,b}2 or v2 $! { a , b}2 or u2 # :w then REJECT; (* Here v#w#w = u ~ u 2 # u f w ~ # w .*) nondeterministically goto (4.) or goto (5.); REWRITE : ~ 2 # ~H2 A,,B,, ; repeat MVR until win E r*. $; if win ends in u2$ then RESTART else REJECT; (* A RESTART is performed if the tape contents was ulu2#u3l#Wl212 ( U l , W l , W l E {a,b)*,u2 E { a , b } 2 ) . *) (* u2#uF has been discovered, but it is not yet rewritten. *) repeat MVR until win E I?* . $; if win ends in Cx$for some x E { a ,b}2 then REWRITE : Cx$I+ $ else REJECT; RESTART; (* A RESTART is performed if the tape contents was ~ l u 2 # u f ' u l # W 1 c x ( U l , W l , W l E {a,b)*,x,u2 E and the C, has just been deleted. *) end; if win = c . A,,B,, . w' for some u2 E { a , b } 2 then nondeterministically goto (7.) or goto (8.); repeat MVR until win E I?* . $; if win ends in u2$ then REWRITE : u2$ c) CUz$ else REJECT; RESTART;
365
(* A RESTART is performed if the tape contents was ~ 1 - 4 u z B u , v 1 # w u(~w , v 1 , w E { a , b } * , w E {a,b}:!),and u:!has just been replaced by C,,. *) (8.) REWRITE : A,,B,, H #; (8.1) repeat MVR until win E I?* . $; (8.2) if win ends in Cu,$ then RESTART else REJECT; (* A RESTART is performed if the tape contents was w A u 2 B u z ~ 1 # w C u(zw , v l , w E {a,b}*,u2 E { a , b } 2 ) and , in this cycle A,, B,, has been replaced by #. *) In the following we give some example computations of M in order to illustrate how it works before we turn to proving that indeed L ( M ) = L G ~In . the description of these computations we place a bar underneath the important part of the window’s contents.
Example 1. Consider the input abbb#bbba#abbb:
qo$abbb#bbba#abbb$
k $abbb#bbba#abbb$. (2)
-
Now we can continue with either (4.) or (5.). However, (5.) will lead to rejection, so we continue with (4.):
$abbb#bbba#abbb$
H
$ab&bBbbba#abbb$
k $abAbbBbbba#ab&$ (4.1)
(4)
k $ ab&,Bbbba#abbb$.
H qo$abAb&,bba#abbb$ (4.2) (2)
___
Now we can continue with either (7.) or (8.). However, it is easily seen that (8.) will lead to rejection, and so we continue with (7.):
$ abAbbB b b ba#abbb$ k $ab&bBbbba#abbb$
H #abAb&,bba#abCbb$ (7) (7.1) H qo$abAbbBbbba#abCbb$ $abAbbBbbba#abCbb$.
A
(7.2)
(2)
-
Again we can continue with either (7.) or (8). This time, however, (7.) will lead to rejection, and we continue with (8.):
$abAbbBbbba#abCbb$
H (8)
H (8.2)
$ab#ba#abCbb$ -
k $ab#ba#abCbb$ (8.1)
qo$ab#ba#abCbb$ A &ab#ba#abCbb$. (2)
-
Here we can continue with either (4.) or (5.). This time (4.) will lead to
366
rejection, and we continue with (5.):
A $ab#ba#abCbb$ e $ab#ba#ab$
$ab#ba#abCbb$
-(5.1)
(5)
qo$ab#ba#ab$.
H (5.2)
Continuing in this way we will finally obtain the configuration qo$##cab$, which leads to acceptance. 0
Example 2. Consider the input abbb#bbba#abba: qo#abbb#bbba#abba$
A &abbb#bbba#abba$. -
(2)
We can continue with either (4.) or (5.):
e &abAbbBbbba#abba$ A $abAbb&bba#abba$ (4.1) e REJECT.
(1.) $abbb#bba#abba$
(4)
~
.
I
(4.2)
(2.) $abbb#bbba#abba$
A
$abbb#bbba#abba$
(5)
I-+
REJECT.
(5.1)
Thus, this input cannot be accepted by M .
0
Example 3. Consider the input abbb#baba#abbb: qo#abbb#baba#abbb$
A $abbb#baba#abbb$ (2)
-
I+
REJECT.
(3.1)
Thus, this input is not accepted either.
0
Based on these examples we can easily complete the proof of the theorem. From Example 1 we see that each string w E L G is ~ accepted by M . On the other hand, if w = x#y#z for some x,y , z E { a , b}* such that w # L G ~then , x # yR or x # z. In a computation it is checked whether or not x = y R in step (3.1), and it is checked whether or not x = z in step (4.2). Hence, it follows that the language L ( M ) coincides with the Gladkij language L G ~ .0
As the Gladkij language does not belong to the class GCSL, we obtain the following consequence. Corollary 5.2. GCSL is properly contained in t h e class ,L(RRWW). Thus, we see that at least one of the following two inclusions is proper: GCSL
L(RWW)
C L(RRWW),
but at the moment it is not clear which one. In fact we would expect that the Gladkij language is not contained in the class L(RWW), which would show
367
that in contrast to the situation in the deterministic case, the separation of the restart from the rewrite operation does increase the power of the nondeterministic restarting automatona. Also we would like to point out that using the same technique as for the Gladkij language it can be shown that C(RRWW) contains some rather complicated languages 12. Thus, it does not seem to be easy to separate this class from the class CSL of context-sensitive languages. 6
Restarting automata and prefix-rewriting systems
Instead of working directly with the various types of restarting automata, it may be easier to work with characterizations of the corresponding language classes through certain prefix-rewriting systems. A prefix-rewriting s y s t e m P on some alphabet C is a subset of C* x C*. Its elements are called (prefix-) rewrite rules, and usually they are written as (l + T). By dom(P) we denote the set of all left-hand sides of rules of P . A prefix-rewriting system P on C induces a prefix-reduction relation +-> on C * , which is the reflexive transitive closure of the single-step prefixreduction relation JP:= { ( l z ,TZ) I (l -+ T) E P, z E C* }. If u u holds, then u is an ancestor of v and u is a descendant of u (mod P ) . By O>(v)we denote the set of all ancestors of u , and for L C_ C*, O>(L) := U O>(v).
+>
UEL
Let u E C*. If there exists some u E C* such that u ~p u , then 'u. is called reducible mod P , otherwise it is irreducible mod P . By IRR(P) we denote the set of all irreducible strings mod P . Obviously, IRR(P) = C* \ dom(P) . C*, n
that is, for a system of the form P =
U{
xui
+ X Z ) ~I x
E Ri }, where
i= 1
. . , R, are regular languages, IRR(P) is a regular language as well. Using prefix-rewriting systems of this form the languages accepted by RW-automata can be characterized as follows. Theorem 6.1. lo L e t L C_ C * . T h e n L E C(RW) if and only if there exist a
R1, .
u { xui n
prefix-rewriting s y s t e m P of t h e f o r m P =
4
xvi I x E Ri }, where
i= 1
C* for i = 1,.. . , T I , ui,ui E C* or ui,'ui E C* . $ satisfying luil > Ivil, Ri i s a regular language, and $ is a n additional symbol n o t in C , and a regular language RO2 C* . $ s u c h t h a t L . $ = V>(&). For the class C(det-RW) a corresponding characterization can be derived cooperation with Tomasz Jurdziliski and Krzysztof LoryS we have recently been able to show that in contrast to our expectations the Gladkij language is accepted by an RWWautomaton. Thus, the first of the two inclusions above is proper.
368
using confluent prefix-rewriting systems of the form described above. Here a prefix-rewriting system P is called confluent, if two strings u , u E C* that have a common ancestor also have a common descendant. Sknizergues has investigated prefix-rewriting systems of the form above, which he called strict Rat-Fin controlled rewriting systems, and he has shown that for these systems confluence is unfortunately undecidable 13. For RRW-automata similar characterizations can be obtained. Only the form of the prefix-rewriting systems used in the characterization is different. Theorem 6.2. l o Let L C C * . Then L belongs to C(RRW) if and only if n
there exist a prefix-rewritingsystem P of the form P =
U { xuiy$ + xuiy$ I
i=l x E Ri(l),y E R,!') }, where $ is an additional symbol not in C, and for i = 1,.. . , n, ui,oi E C* satisfying 1uil > IuiJ, and Ri('),Ri2)C C* are regular languages, and a regular language Ro C C* . $ such that L . $ = V>(Ro). Again the deterministic class C(det-RRW) can be characterized in the same way by confluent prefix-rewriting systems. Further, together with Theorems 3.1 and 3.2 the results above yield corresponding characterizations for the classes of languages accepted by the (deterministic) (R) RWW-automata. Corollary 6.3. Let L C * . Then L E C(RWW) ( L E C(det-RWW)) if and only if there exist an alphabet F containing C , a (confluent) prefix-rewriting
c
n
system P of the form P =
U { xui + xwi I x E Ri
}, where for i = 1,.. . ,n ,
i=l
ui,ui E F* orui,ui E I?*.$ satisfying Juil > Iuil, Ri 2 r*is a regular language, and $ is an additional symbol not in I?, and regular languages R, Ro C F* . $ such that L . $ = V $ ( R o )n R .
Corollary 6.4. Let L C C'. Then L belongs to L(RRWW) (C(det-RRWW)) if and only if there exist an alphabet l? containing C, a (confluent) prefixn
rewriting system P :=
u { xUiy$ + xviy$ I x E Ri
(1)
i= 1
,y
E Ri(2) }, where $
is an additional symbol not in F, and for i = 1 , .. . , n, ui,ui E I?* satisfying lUil > luil, and Ri(l), Rt2) C I?* are regular languages, and regular languages Ro and R such that L . $ = VF(R0)n R . Acknowledgement The authors want t o express their thanks to Gerhard Buntrock for helpful discussions concerning the results presented in Section 3.
369
References 1. G. Buntrock. Wachsende kontext-sensitive Sprachen. Habilitationsschrift, Fakultat f i r Mathematik und Informatik, Universitat Wiirzburg, July 1996. 2. G. Buntrock and F. Otto. Growing context-sensitive languages and ChurchRosser languages. Information and Computation, 141:l-36, 1998. 3. A.W. Gladkij. On the complexity of derivations for context-sensitive grammars. Algebra i Logika Sem., 3:29-44, 1964. In Russian. 4. P. JanEar, F. MrBz, M. PlBtek, and J. Vogel. Restarting automata. In H. Reichel (ed.), Fundamentals of Computation Theory, Proc., Lect. Notes in Comp. Sci. 965, pp. 283-292. Springer, Berlin, 1995. 5. P. JanEar, F. MrBz, M. PlBtek, and J. Vogel. On restarting automata with rewriting. In G. P5un and A. Salomaa (eds.), New Trends in Formal Languages, Lect. Notes in Comp. Sci. 1218, pp. 119-136. Springer, Berlin, 1997. 6. P. Jantar, F. MrBz, M. PlStek, and J . Vogel. Different types of monotonicity for restarting automata. In V. Arvind and R. Ramanujam (eds.), Foundations of Software Technology and Theoretical Computer Science, Proc., Lect. Notes in Comp. Sci. 1530, pp. 343-354. Springer, Berlin, 1998. 7. P. Jantar, F. MrBz, M. PlBtek, and J. Vogel. On monotonic automata with a restart operation. Journal of Automata, Languages and Combinatorics, 4:287311, 1999. 8. R. WIcNaughton, P. Narendran, and F. Otto. Church-Rosser Thue systems and formal languages. Journal of the Association for Computing Machinery, 35~324-344, 1988. 9. G. Niemann and F. Otto. The Church-Rosser languages are the deterministic variants of the growing context-sensitive languages. In M. Nivat (ed.), Foundations of Software Science and Computation Structures, Proc., Lect. Notes in Comp. Sci. 1378, pp. 243-257. Springer, Berlin, 1998. 10. G. Niemann and F. Otto. Restarting automata and prefix-rewriting systems. Mathematische Schriften Kassel 18/99, Universitat Kassel, Dec. 1999. 11. G. Niemann and F. Otto. Restarting automata, Church-Rosser languages, and representations of r.e. languages. In G. Rozenberg and W. Thomas (eds.), Developments an Language Theory - Foundations, Applications, and Perspectives, Proc., pp. 103-114. World Scientific, Singapore, 2000. 12. G. Niemann and F. Otto. On the power of R R W - a u t o m a t a . In M. Ito, G. PBun and S. Yu (eds.), Words, Semigroups, and Transductions. Essays in Honour of Gabriel Thierrin, On the Occasion of His 80th Birthday, pp. 341355. World Scientific, Singapore, 2001. 13. G. Sknizergues. Some decision problems about controlled rewriting systems. Theoretical Computer Science, 71 :281-346, 1990.
370
Cellular Automata with Polynomials over Finite Fields * Hidenosuke Nishio Iwakura Miyake-cho 204, Sakyo-ku, Kyoto Email: YRAO5762Qnifty.ne.jp
Abstract
Information transmission in cellular automata(CA) is studied using polynomials over finite fields. The state set is thought to be a finite field and the local function is expressed in terms of a polynomial over it. The information is expressed by an unknown variable X, which takes a value from the state set and information transmission is discussed using polynomials in X . The idea is presented for the basic one dimensional CA with neighborhood index {-l,O, +I}, although it works for general CAs. We give first the algebraic framework for the extension of CA and then show some fundamental results on extended CAs.
1
Introduction
’What is the information?’ and ’How to study the information?’ are fundamental questions in information sciences. Shannon’s pioneering work originated the mathematical study of the information, focusing on information transmission through noisy communicating channels. He introduced the numerical measure entropy based on the probability theory. Also in the study of CA, it has been investigated from various points of view. J.von Neumann firstly proposed the self-reproducing 2-D CA with 29 state cells[vN66]. In his design the information is transmitted by means of many signals. The firing squad synchronization problem and other real time computations have been solved by utilizing many signals which travel in CA spaces with various speeds and specific meanings[M+T99]. In this paper we are going to discuss another way of viewing the information in CA, which is essentially different from signals. Our approach will be called algebraic “991. ‘This is an extended abstract and the full paper will be published elsewhere.
371
Definitions
2
The 1-D CA is defined as usual with the space 2 (the set of integers), the neighborhood index N , the state set Q and the local function f and denoted as CA=(Z,N,Q,f). Throughout this paper we assume the l-D CA with N={-l,O, +l} and denote it simply as CA(Q,f).
2.1
State Set
Q is generally a finite set but is thought to be a finite field in our study. I t may possibly be a ring or an integral domain, but for the sake of simplicity we assume first the structure of field. Thus Q=GF(q), where q = with prime p and positive integer n.
2.2
Local Function
Various ways in expressing the local function have been made use by CA researchers. When Q is an arbitrary finite set, f is expressed as a table or by listing up the function vaules for every combinations of neighboring cell states. If Q is a field, however, we can express it in terms of the polynomial over GF(q). Note that the linear CA has been extensively investigated, where f is expressed in the form of the linear combination of variables[G95]. Denote the cardinality of Q as IQI. So IQI = q = p". Then the local function f : Q x Q x Q + Q can be expressed as follows:
f(z,y, z ) = UQ + u1z + u2y + uq3-2z9-1
+ . + uixhyjzk + *.
* * *
yq-1 z 9-2 +uq3-1x9-lyq--1z~-1,
where ui E Q (0 5 i 5 q3 - 1). (I) x,y and z indicate the states of the neighboring cells -l(left), O(center) and +l(right), respectively. There are q q 3 local functions in all and it will be seen that (1) is a due form for expressing them. As for the polynomial expression of functions from GF(q)" to GF(q) and other topics on finite fields, see [L+N97]. Example 1. The binary set Q = {O,l}=GF(2) and the function f ( x , y, z ) = yz t x.
372
Information Function
3 3.1
Information Expressed by
X
We are going t o present an algebraic tool for studying the information transmission in CA. Let X be a symbol different from those used in equation (1). It stands for a n unkown state of the cell in CA and has been introduced to trace the information, which is essentially different from the signal. Take Example 1 above. From the fact that f ( O , O , O ) = 0 and f ( l , O , O ) = 1, we may write as f(X,O,O) = X (i). Similarly we write f ( X , 1,1) = X 1 (mod 2) (ii), which comes from the fact that f ( O , l , 1) = 1 and f ( l , l ,1) = 0. Also we have f(O,X,O)= 0 (iii). From the information related point of view, we claim: in case (i) the information X is transmitted without loss from the left to the center. In case (ii) it is also so, since from the output X 1 we can uniquely restore the input value of X . In case (iii), however, the state information X of the center is lost.
+
+
For generalizing the above argument, we consider another polynomial form, which we call the information function.
g ( X ) = a0
+ U l X + . . . + aixz + . . . + aq-lXQ-l, where ai E
y is a function Q
Q (0 5 i 5 q - 1).
(2)
-+ Q
and the set of such functions is denoted by &[XI. Note that Q [ X ] 2 Q. The element of &[XI \ Q is Evidently IQ[X]I=qq. called i n f o r m a t i v e , while that of Q constant.
3.2
Ring &[XI
We introduce two operations in Q [ X ] addition , and multiplication, following the ring operation of Q. So we have the following equations.
pX=O
and
Xq-X=O.
Consequently Q [ X ]becomes a (commutative) ring with identity. In fact it is isomorphic to the factor ring by ( X q - X ) which is a reducible polynomial. Therefore Q [ X ]is not a field and even not an integral domain. See Example 2 below. It is not a field nor an integral domain and is proved to be isomorphic to the direct sum of two copies of GF(2).
373
Example 2. Q = GF(2).
Q[X] = {O,l,X,X
+ 1).
Example 3. Appendix 1 lists up all polynomials of Q[X] over GF(3). Each polynomial function is equivalently expressed as the coefficeint vector(first column) and the value vector(third column).
4
Extension of CA
Defining CA(Q[X],fx) We extend a CA(Q, f) to its extended CA(Q[X],fx),where the state set Q[X] is the set of polynomials over Q and the local function fx is expressed 4.1
by the same polynomial form f as in (Q, f ) . The variables x,y and z, however, move in Q [ X ]instead of Q. That is, fx : Q[XI3+ Q[X].
4.2
Dynamics of CA(Q[X],f x )
The global map Fx : C -+ C is defined as usual, where C = Q[XIz is the set of all state configurations. The configuration at time t is defined by ct = F$(co) for the initial configuration C O . The suffix X is often omitted when no confusion is expected.
Cornput er Simulation Appendix 2 shows the dynamics of a finite cyclic boundary CA(Q[X],f ) over GF(3). The first simutation is done for the initial configuration with X. The others show the dynamics with the substituted initial configurations.
5
Results
We present several results which will make clear the features of dynamics of the extended CA in contrast t o those of the original CA.
Theorem 5.1 When CA(Q[X],f) starts with a constant configuration, its dynamics is the same as that of CA(Q,f). In other words, CA(Q,f) is embedded in CA(Q[Xl, f).
374
Theorem 5.2
CA ( Q [ X ] , f is ) surjective (injective, reversible) iff CA (Q,f) is surjective (injective, reversible). We define here substitution of configuraitions: For any configuration w E Q[XIz and a E Q, let wa be a configuration obtained from w by substituting a for X of each cell state g(X). When CA is finite (with cyclic or fixed boundary condition), its trajectory enters a cycle after a finite transient part. Denote the cycle length by #(w) and the transient length by ~ ( w )when , a CA starts with w. Note that when the number of cells in CA is n then its configurations are words of length 72.
Theorem 5.3 Let w be a word in Q[XIn. Then we have,
6
Concluding Remarks
We have shown results without proofs, which will appear elsewhere. Proofs were often conceived from the computer simulation of CA(Q[X],f),for which the author is greately indebted to Takashi Saito.
References [G95] M.Garzon,Models of Mssive Parallelism, Analysis of Cellular Automata and Neural Networks,Springer, 1995. [L+N97] R.Lidl and H.Niederreiter,Finte Fields, 2nd ed. Cambridge University Press, 1997. [M+T99] J.Mazoyer and V.Terrier, Signals in onedimensional cellular automata, Theoret. Comput.Sci., ~01.217,53-80(1999). “991 H.Nishio, Algebraic Studies of Information in Cellular Automata, Kyoto University, RIMS Kokyuroku, vol. 1106, 186-195(1999).
375
("H.Nishio, I] Global Dynamics of l-D Extended Cellular Automata, Kyoto University, RIMS Kokyuroku, ~01.1166, 200-206 (2000). [vN66]J.von Neumann, Theory of Self-reproducingAutomata, Univ. of Ilinois Press, 1966.
376
Appendix 1. Polynomials over GF ( 3 ) g(X)
a0
0 0 0 0 0 0 0 0 0 1 1. 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2
a1
a0
+ u ~ +X a2X2,
dX)
a2
0 0 0 0 1 X2 2x2 0 2 1 0 X x x2 1 1 x i- 2 x 2 1 2 2 0 2x 2 1 2 x x2 2x-I 2 x 2 2 2 1 0 0 1 + x2 0 1 1 2x2 0 2 1 0 1+x 1 1 1+X+X2 1 2 1+X+2X2 1+2x 2 0 2 1 1+2x+x2 2 2 1 2 x 2x2 2 0 0 2+x2 0 1 2 2x2 0 2 1 0 2 + x 2+x+x2 1 1 1 2 2+x+2x2 2+2x 2 0 2 1 2 2 x x2 2 2 12+2X+2X2
+
+
+
+
+
+
+
+
C L ~E
GF(3).
.9(0) g(1) g(2>
0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2
0 1 2 1 2 0 2 0 1 1 2 0 2 0 1 0 1 2 2 0 1 0 1 2 1 2 0
0 1 2 2 0 1 1 2 0 1 2 0 0 1 2 2 0 1 2 0 1 1 2 0 0 1 2
I d Q )I 1 2 2 3 2 2 3 2 2 1
2 2 3 2 2 3 2 2 1 2 2 3 2 2 3 2 2
377
Appendix 2. Simulation of CA[X] Q=GF(3), cyclic boundary, n = 6, f
= zz
(A) w = XI1111 time: cell 1 to 6. 0 : ((0 1 0) (0 0 1 ) (0 0 1) (0 0 1) (0 0 1) (0 0 1)) 1 : ((0 1 1) (0 1 1 ) (0 0 2) (0 0 2) (0 0 2) (0 1 1)) 2 : ((1 0 2) (0 0 0) (0 2 1) (0 0 0) (0 2 1) (0 0 0)) 3 : ((1 0 2) (1 0 2) (0 2 1) (1 1 1) (0 2 1) (1 0 2)) 4 : ((0 0 0) (2 0 1 ) (1 2 0) (2 2 2) (1 2 0) (2 0 1)) 5 : ((2 0 1) (2 0 1) (2 2 2) (1 0 2) (2 2 2) (2 0 1)) 6 : ((1 0 2) (0 0 0) (0 2 1) (2 1 0 ) (0 2 1) (0 0 0)) 7 : ((1 0 2) (1 0 2) (0 2 1) (0 2 1) (0 2 1) (1 0 2)) 8 : ((0 0 0) (2 0 1) (1 2 0) (1 0 2) (1 2 0) (2 0 1)) 9 : ((2 0 1) (2 0 1) (2 2 2) (0 1 2) (2 2 2) (2 0 1)) 10 : ((1 0 2 ) ( 0 0 0) (0 2 1) (1 2 0) (0 2 1) (0 0 0)) 11 : ((1 0 2) (1 0 2) (0 2 1) (2 0 1) (0 2 1) (1 0 2)) 12 : ((0 0 0) (2 0 1) (1 2 0) (0 1 2 ) (1 2 0) (2 0 1)) 13 : ( ( 2 0 1) ( 2 0 1) (2 2 2) (2 2 2) (2 2 2) (2 0 1)) 14 : ((1 0 2) (0 0 0) (0 2 1) (0 0 0) (0 2 1) (0 0 0 ) )
+ y.
T
= 2 , 4 = 12
(B) ~ o = O l l l l l 0 : ((0 0 0) (0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1)) 1 : ((0 0 1) (0 0 1) (0 0 2) (0 0 2) (0 0 2) (0 0 1)) 2 : ((0 0 2) (0 0 0) (0 0 1) (0 0 0) (0 0 1) (0 0 0)) 3 : ((0 0 2) (0 0 2) (0 0 1) (0 0 1) (0 0 1) (0 0 2)) 4 : ((0 0 0) (0 0 1 ) (0 0 0) (0 0 2) (0 0 0) (0 0 1)) 5 : ( ( 0 0 1) (0 0 1) ( 0 0 2) (0 0 2) ( 0 0 2) (0 0 1))
T
= 1, = 4
(C)w1=llllll 0 : ((0 0 1) (0 0 1 ) 1 : ((0 0 2) (0 0 2) 2 : ((0 0 0 ) (0 0 0 ) 3 : ( ( 0 0 0 ) (0 0 0 )
(0 0 1) (0 0 (0 0 2) (0 0 ( 0 0 0 ) (0 0 ( 0 0 0) ( 0 0
T
= 2,4 = 1
(D) ~ 2 = 2 1 1 1 1 1 0 : ((0 0 2) (0 0 1 ) 1 : ((0 0 0 ) (0 0 0) 2 : ((0 0 0 ) (0 0 0 ) 3 : ( ( 0 0 0 ) (0 0 0) 4 : ( ( 0 0 0 ) (0 0 0 )
(0 0 1) (0 0 2) ( 0 0 2) (0 0 2) (0 0 2)
T
= 1,Cj5 = 3
1) (0 0 2) (0 0 0) (0 0 0 ) (0 0
1) (0 0 1)) 2) (0 0 2)) 0 ) (0 0 0 ) ) 0) (0 0 0))
(0 0 1) (0 0 1) (0 0 2) (0 0 2) (0 0 0) (0 0 2) (0 0 1) (0 0 2) (0 0 2) ( 0 0 2)
(0 0 1)) (0 0 0)) (0 0 0)) (0 0 0)) (0 0 0 ) )
The cell state is expressed by the coefficient vector. For example coefficient vectors (0,1,0) and (O,l,l) mean X and X X2, respectively, as is seen in Appendix 1.
+
378
GENERALIZED DIRECTABLE AUTOMATA ZARKO POPOVIC AND STOJAN BOGDANOVIC University of NiS, Faculty of Economics, Trg Kralja Alelcsandra 11, 18000 NiS, Yugoslavia E-mail:
[email protected],
[email protected] TATJANA PETKOVIC University of Turku, TUCS and Department of Mathematics, F I N - 2 0 0 1 ~Turku, Finland E-mail:
[email protected] MIROSLAV CIRIC University of NiS, Faculty of Sciences and Mathematics, Cirila i Metodija 6, 18000 NiS, Yugoslavia E-mail: ciricmObankerinter.net,
[email protected] In [16]the last three authors introduced the notion of generalized directable automata as a generalization of many already known types of automata. The algorithms for testing whether a finite automaton belongs to some of important subclasses of the class of generalized directable automata are studied by the authors in [18].In this paper structural properties of finite and infinite generalized directable automata are considered, tests for membership of a finite automaton in the pseudovarieties of generalized directable and locally directable automata are given, and the least generalized directing and locally directing congruences on a finite automaton are described.
1 Introduction and preliminaries
Directable automata were introduced in [6] and later studied by many authors (see, for example, [14], [13] or [4]), whereas trapped, trap-directable (or one-trapped), uniformly locally nilpotent, uniformly locally definite, uniformly locally directable, uniformly locally trap-directable automata were introduced in [16] and, as it was proved there, they form generalized varieties of automata properly contained in the generalized variety of all generalized directable automata, also introduced in [16]. The algorithms for testing whether a finite automaton belongs to a pseudovariety of all trapped, trap-directable or locally trap-directable automata were considered by the authors in [18]. The algorithms for construction the least congruence on a finite automaton whose corresponding factor automaton belongs to the mentioned pseudovarieties were also given in [18]. More information about all these classes of automata can be found in 141.
379
This paper presents a deeper study of generalized directable automata. Some structural properties of generalized directable automata and their transition semigroups were given in [16]. However, finite generalized directable automata have some particular properties that are described in Section 2. Those properties give rise to an algorithm for testing whether a finite automaton is generalized directable. Since uniformly locally directable automata play an important role in characterization of generalized directable automata, and finite locally directable automata are uniformly locally directable, in Section 2 special attention is devoted to testing finite automata for local directability. Directing congruences on automata were first considered in [14],where it was noted that every finite automaton has the least directing congruence, and an algorithm for finding this congruence was given in [13]. In Section 3 of this paper the existence of the least directing congruence on an arbitrary, not necessarily finite, generalized directable automaton is proved. It is shown that there are interesting mutual relations between the least directing, trapping and trap-directing congruences on a generalized directable automaton. Eventually, the least generalized directing congruence is characterized in Section 4. In addition, for an irregular pseudovariety of automata P , the least L(P)-congruence is described. Let A be any set. Then AA and VA denote the diagonal (identity) relation and the universal'relation on A, respectively. For two binary relations a and p on A , their product is the relation a . /3 defined by: ( a ,b) E a p if and only if ( a ,c) E a and (c, b) E /3, for some c E A . If a * p = /3. a , we say that (Y and /3 commute. Automata considered throughout this paper will be automata without outputs in the sense of the definition from the book by F. GQcseg and I. Peak [ll]. It is well known that automata without outputs, with the input alphabet X , can be considered as unary algebras of type indexed by X , so notions such as a congruence, homomorphism, generating set etc., have their usual algebraic meanings (see, for example, [5]). The state set and the input set of an automaton are not necessarily finite. In order to simplify notations, an automaton with the state set A is also denoted by the same letter A. For any considered automaton A , its input alphabet is denoted by X , and the free monoid over X , the input monoid of A , is denoted by X*. Under the action of an input word u E X * , the automaton A goes from a state a into the state denoted by au. A state a E A is called a trap of A if au = a for every word u E X*. The set of all traps of A is denoted by Tr(A). A state a E A is reversible if for every word v E X * there exists a word u E X " such that avu = a , and the set of all reversible states of A , called the reversible part of A, is
380
denoted by R ( A ) . If it is nonempty, R(A) is a subautomaton of A . An automaton A is reversible if every its state is reversible. If for every a, b E A there exists u E X* such that b = au, then the automaton A is strongly connected. Equivalently, A is strongly connected if it does not have proper subautomata. On the other hand, A is connected if for every a, b E A there exist u , v E X " such that au = bv. The mergeability relation P A on A is defined by (a, b) E P A if and only if au = bu, for some u E X " . If (a, b) E P A , we say that a and b are mergeable. Otherwise they are nonmergeable. For a state a E A, by (a) we denote the monogenic subautomaton of A generated by a, i.e. the subautomaton (a) = {au I u E X * } . The least subautomaton of an automaton A , if it exists, is called the kernel of A , and in this case, it is the unique strongly connected subautomaton of A . Let u E X'. An automaton A is called u-trapped if au E T r ( A ) for every a E A , and in this case u is a trapping word of A. If au = bu for every a , b E A , then A is u-directable, u is a directing word of A and the set of all directing words of A is denoted by D W ( A ) . If A is u-directable and has a trap, or equivalently, if it is u-trapped and has a unique trap, then it is called u-trap-directable. Also, an automaton A is generalized u-directable if for every state a E A and every word v E X * holds auvu = au, and then u is a generalized directing word and the set of all generalized directing words is denoted by G D W ( A ) . An automaton A is trapped (resp. directable, trapdirectable, generalized directable) if there exists a word u E X * such that A is u-trapped (resp. u-directable, u-trap-directable, generalized u-directable). It can be proved (see [19]) that a finite automaton is directable if and only if any two its states are mergeable. For a word u E X*, a state a E A is a u-neck of A if bu = a, for every b E A , and it is a neck of A if it is a u-neck, for some u E X * . An automaton A is strongly directable if every its state is a neck, or equivalently, if i t is both strongly connected and directable. Let B be a subautomaton of an automaton A. If 8 is a congruence relation on B , then the relation R ( 8 ) defined by R(8) = BUAA is a congruence relation on A and it is called the Rees extension of 8 (up to a congruence on A ) . In particular, the Rees congruence e B of a subautomaton B is the Rees extension R ( V B ) . The factor automaton AleB is denoted by A I B and the automaton A is an extension of B by an automaton C (with a trap), if A I B ? C. Let A and B be automata and let H be an automaton such that there exist homomorphisms cp from A onto H and J+!I from B onto H . Then P = {(a, 6) E A x B I acp = b$} is a subdirect product of A and B and any automaton isomorphic t o P is called a pullback product of A and B with respect t o H . By a parallel composition of automata A and B we mean any automaton isomorphic to a subautomaton of their direct product A x B .
381
An automaton A is a direct s u m of its subautomata A,, a E Y , if UaEYA , and A, n Ap = 0,for every a,p E Y such that a # p. Automata A,, a E Y , are direct summands of A. They determine a partition of A called a direct sum decomposition of A , and the corresponding equivalence relation is a congruence relation on A called a direct s u m congruence. By the greatest direct s u m decomposition of A we mean the decomposition corresponding to the least direct sum congruence on A . An automaton A is direct s u m indecomposable if the universal relation VA is the only direct sum congruence on A. More on direct sum decompositions can be found in [8]. Here we quote the following theorem from [8], which will be widely used here.
A =
Theorem 1 (CiriC and BogdanoviC [ 8 ] ) Every automaton can be uniquely represented as a direct s u m of direct s u m indecomposable automata. This decomposition is the greatest direct s u m decomposition of that automaton. In this paper special attention is devoted to finite automata. Hence we will often use the following basic result describing the structure of an arbitrary finite automaton. Theorem 2 (KovaEeviC, CiriC, PetkoviC and BogdanoviC [15]) Every finite automaton can be uniquely represented as a n extension of a direct s u m of strongly connected automata b y a trap-directable automaton. If K is a class of automata, then an automaton is a locally K - a u t o m a t o n if every its monogenic subautomaton belongs to K , and the class of all locally K-automata is denoted by L ( K ) . In such a way locally directable and locally trap-directable automata are defined. In particular, if every monogenic subautomaton of an automaton A is u-directable, for some u E X * , i.e. all monogenic subautomata of A are directable and have a common directing word u, then A is uniformly locally directable, u is a uniformly locally directing word of A and the set of all such words is denoted by U L DW(A ). Furthermore, a uniformly locally strongly directable automaton is an automaton whose every monogenic subautomaton is strongly connected and u-directable, for a fixed 1 '1
E
x*.
By a generalized variety of automata we mean any class of automata closed under formation of subautomata, homomorphic images, finite direct products and direct powers, whereas by a pseudovariety of automata we mean any class of finite automata closed under formation of subautomata, homomorphic images and finite direct products. Equivalently, a class of automata is a pseudovariety if and only if it is the class of all finite members of some generalized variety (see [l]). As was proved in [16], the classes of directable, uniformly locally directable, generalized directable, trap-directable, uniformly locally trap-directable and trapped automata are generalized varieties, and
382
hence, finite members from these classes form pseudovarieties. A pseudovariety of automata is here defined to be irregular if it is contained in the pseudovariety of all finite directable automata. Otherwise it is called regular. Many interesting algebraic properties of irregular and regular pseudovarieties are described in [3]and [4]. Here we recall a result from [3] that will play an important role in the further work. Theorem 3 (BogdanoviC, CiriC, PetkoviC, Imreh and Steinby [3]) If P is a n arbitrary pseudovariety of automata, then L ( P ) is also a pseudovariety of automata. Moreover, if P is an irregular pseudovariety of automata and A i s a finite automaton, then A E L ( P ) if and only if A is a direct s u m of automata from P . For undefined notions and notations we refer to [ll],[5] and [12]. 2
Testing for generalized and local directability
Generalized directable automata were introduced and studied by the last three authors in [16],where they proved that a generalized directable automaton can be characterized as an extension of a uniformly locally directable automaton by a trap-directable automaton. By the next theorem we give a more precise structural characterization of these automata. Theorem 4 An automaton A i s generalized directable if and only if it is a n extension of a uniformly locally strongly directable automaton B by a trapdirectable automaton C . In that case we have
D W ( C ) U L D W ( B ) C GDW(A)
D W ( C )n U L D W ( B ) .
Proof. Let A be generalized directable. Consider arbitrary a E A and u E G D W ( A ) . Then auuu = a u , for every IJ E X ' , whence it follows that au E R ( A ) . Now, if we set B = R ( A ) ,we have that B is a subautomaton of A, and by a u E B , for every a E A and u E G D W ( A ) ,it follows that C = A / B is a trap-directable automaton and GDW(A) C DW(C). Let D be an arbitrary monogenic subautomaton of B. Since B is reversible, we have that D is strongly connected. Consider arbitrary a , b E D and u E G D W ( A ) . Then a u , b E D, so auu = b, for some TJ E X * , whence it follows that bu = auuu = au. Thus, D is directable and u E D W ( D ) , so we conclude that B is uniformly locally strongly directable and G D W ( A ) E U L D W ( B ) . Conversely, let A be represented its an extension of a uniformly locally strongly directable automaton B by a trap-directable automaton C. Consider arbitrary a E A, p E D W ( C ) , q E U L D W ( B ) and u E X ' , and set u = pq.
383
Then ap,apqup E D , for some strongly directable subautomaton D of B , whence auuu = (apqup)q = (ap)q = au, because q E D W ( D ) . Therefore, A is a generalized directable automaton and D W ( C ) U L D W ( B ) C G D W ( A ) . I
-
Besides the characterization of arbitrary generalized directable automata given in Theorem 4, the following theorem contains other equivalents of that property on finite automata. Theorem 5 The following conditions on a finite automaton A are equivalent: (i) A is generalized directable; (ii) every strongly connected subautomaton of A is directable; (iii) every subautomaton of A contains a directable subautomaton; (iv) (Va E A ) ( 3 u E X*)(Vu E X * ) auuu = au; (v) (Va E A)(% E X*)(Vu E X * ) ( 3 w E X * ) auuw = auw.
Proof. (i)+(ii). This implication is an immediate consequence of Theorem 4. (ii)+(i). By Theorem 2, A is an extension of an automaton B by a trap-directable automaton C, where B is a direct sum of strongly connected automata Bi, i E [I, n ] ,and by the hypothesis it follows that Bi is a directable automaton, for every i E [1,n]. Since D W ( B i ) is an ideal of X * , for each i E [l,n],and the intersection of any finite family of ideals is nonempty, then there exists q E DW(Bi). Then the automaton B is uniformly locally strongly directable, and hence, by Theorem 4, A is a generalized directable automaton. (ii)+(iii). This is an immediate consequence of Theorem 2. (iii)+(iv). Consider an arbitrary a E A. By the hypothesis, the monogenic subautomaton ( a ) contains a directable subautomaton B , and then there exists p E X * such that ap E B . Let u = pq, where q E D W ( B ) , and let u E X * be an arbitrary word. Then as in the proof of Theorem 4 we show that auuu = au. Thus, (iv) holds. (iv)=+(v).It is clear that for every a E A there exists u E X * such that auvu = au = au2, for every u E x * . (v)+(ii). Take an arbitrary strongly connected subautomaton B of A and a , b E B. By the hypothesis, there exists u E X' such that for every u E X * there exists w E X * such that auuw = auw. Then au, bu E B and B is strongly connected so there exists p E X' such that aup = bu, and for that p there exists q E X * such that aupq = auq, so auq = buq. Therefore, we have proved that a and b are mergeable, whence it follows that B is a directable automaton.
ny=,
384
Note that condition (v) means that for each a E A there exists u E X " such that aw.and any state from ( a u ) are mergeable, whereas condition (iv) means that every state has, in some sense, its own generalized directing word. Condition (ii) of Theorem 5 gives rise to an algorithm which tests a finite automaton A with n states and m input letters for generalized directability. The algorithm is a combination of two other algorithms. The first one is an algorithm for finding the strongly connected subautomata of a finite automaton. For that purpose we can use the algorithm given by the authors in [18], which works in time O(mn n 2 ) , or adapt the algorithm from the paper by J. Demel, M. Demlovii and V. Koubek [9] for finding the strongly connected components of a directed graph, which works in time O(rnn). Immediately after an arbitrary strongly connected subautomaton is formed, it can be checked for directability, using, for example, an algorithm given by B. Imreh and M. Steinby in [13]. The total time required for checking all strongly connected subautomata for directability is bounded by O(mn2). Therefore, the total working time for the whole algorithm is bounded by O(mn2),which is the same bound as for the directability test given in [13]. Recall that an automaton A is called locally directable if every its monogenic subautomaton is directable, and it is called uniformly locally directable if all its monogenic subautomata are directable and have a common directing word. In the general case, the class of uniformly directable automata is a proper subclass of the class of locally directable automata, as well as of the class of generalized directable automata. But, finite uniformly locally directable automata and finite locally directable automata form the same class, and in the second part of this section we study several properties of automata from this class and give an algorithm for testing a finite automaton for local directability. Theorem 6 T h e following conditions o n a finite automaton A are equivalent:
+
A is locally directable; every monogenic subautomaton of A has the directable kernel; A is a direct s u m of directable automata; every summand in the greatest direct s u m decomposition of A has the directable kernel; (v) (Va E A)(3u E X*)(Vv E X ' ) uvu = au.
(i) (ii) (iii) (iv)
Proof. Note first that, according to Theorem 2, a finite automaton is directable if and only if it has the directable kernel. This fact immediately implies the equivalences (i)@(ii) and (iii)@(iv). Since finite directable automata form an irregular pseudovariety, the equivalence (i)@(iii) follows from
385
Theorem 3. Finally, the claim (v) is just statement (i) written in symbols, i.e. (i)@(v) obviously holds. Using the previous theorem we can give an algorithm which tests a finite automaton A with n states and m input letters for local directability. This algorithm is a combination of three simpler algorithms. The first one is for computing the summands in the greatest direct sum decomposition of A , for example an algorithm given by the authors in [18],which works in time O(mn). Immediately after forming any of these summands, it can be checked whether this summand has a kernel, using one of two algorithms for finding the strongly connected subautomata of A mentioned before, which can be done in time O(mn) or O(mn n2). These algorithms have to be modified to check whether the considered summand has only one strongly connected subautomaton. If it is established that this summand has the kernel, this kernel can be immediately tested for directability using the mentioned algorithm from [13]. The total time needed for checking directability of all these kernels is bounded by O(mn2). Therefore, the whole algorithm can be realized in time O(mn2).
+
3
The least directing congruence
If K is a class of automata and A is an automaton, then a congruence relation 8 on A is called a K-congruence if the related factor automaton A18 belongs to K . According to M. CiriC and S. BogdanoviC [7,2], the class K is closed under homomorphic images and finite subdirect products if and only if the partially ordered set ConK(A) of all K-congruences on A is a sublattice of the congruence lattice Con ( A ) ,for every automaton A, or equivalently, if it is a filter of Con ( A ) ,for every automaton A. Therefore, if K is a generalized variety or a pseudovariety of automata and A is a finite automaton, then ConK ( A ) is a finite lattice, so it has the least element which is called the least K-congruence on A. If 8 is a congruence relation on an automaton A such that A18 is a directable automaton, then 8 is called a directing congruence on A. Recall from [13] that a congruence relation 8 on a finite automaton A is directing if and only if any two states a, b E A are 8-mergeable, by which we mean that there exists u E X * such that (au,bu) E 0. Since the class of all directable automata is a generalized variety, then every finite automaton has the least directing congruence. An algorithm for finding the least directing congruence on a finite automaton was given by B. Imreh and M. Steinby in [13]. But, in various theoretical considerations it is often of interest to describe such a congruence
386
through some logical formula, which is the main aim of this section. Note that T. PetkoviC and M. Steinby introduced in [17] the notion of a pair automaton of an automaton A. Here we will use a special subautomaton of this automaton defined as follows. On the set
A!:!, = { { a ,b} I a, b E A , a # b, ( a ,b)
4 PA}
of all pairs of nonmergeable states of A we define transitions by
{ a ,b}x = {ax,bx}, for every x E X. The transitions defined in this way are well-defined since if a pair {a,b} is nonmergeable then the pairs { a x , bx}, for all 2 E X , are nonmergeable as well. Then A::,), is an automaton which will be called the nonmergeable pair automaton of A. It plays an important role in the proof of the following theorem which characterizes the least directing congruence on a finite automaton. Theorem 7 Let A be an arbitrary finite automaton and let 64 be the transitive closure of the relation @A defined on A by
(a,b) E
QA H
a = b or (Vv E X * ) ( 3 uE X*){avu,bvu} = { a ,b}.
Then 6~ is the least directing congruence on A. Proof. It is evident that @ A is reflexive and symmetric. Let (a,b) E Q A and w E X'. Then for each v E X' there exists u E X* such that {a(wv)u,b(wv)u} = { a ,b}, whence
{(aw)vuw,(bw)vuw}= {aw,bw},
so (aw,bw) E @ A . Thus, @ A is compatible. Being the transitive closure of a reflexive, symmetric and compatible relation, d~ has the same properties and is transitive, so it is a congruence relation on A. To prove that 64 is a directing congruence, consider arbitrary a, b E A. If aw = bw for some w E X ' , then clearly (aw,bw) E d ~ so, a and b are dA-mergeable. Suppose now that aw # bw, for every w E X*. Then {a,b} is a state of the nonmergeable pair automaton A!:?, of A, and by Theorem 2, there exists w E X* such that {aw, bw} is a reversible state of A!;:?,. By this it follows that for each E X' there exists u E X* such that
{awvu,bwvu} = {aw,bw}vu = {aw,bw}, so (aw,bw) E @ A 6 ~ Therefore, . a and b are 6.4-mergeable, and by Lemma 5.3 of [13] we have that 64 is a directing congruence on A. It remains to prove that 6~ is contained in any directing congruence 8 on A . Assume that ( a ,b) E Q A . By the hypothesis and Lema 5.3 of [13], a
387
and b are 8-mergeable, so there exists v E X " such that (av,bv) E 8. On the other hand, ( a ,b) E @ A implies { m u ,bvu} = { a , b}, for some u E X * , and by (av,bv) E 8 it follows (avu,bvu) E 8 , so we conclude that ( a ,b) E 8. Thus, @ A 5 8, whence 64 8, which was to be proved.
c
Let us observe that a and b are distinct states of an automaton A and for every v E X' there exists u E X " such that { a v u , b v u } = { a , b } if and only if { a , b} is a reversible state of the nonmergeable pair automaton A$;]. Therefore, the previous theorem has the following equivalent formulation: Theorem 8 Let A be a n arbitrary finite automaton and let &A be the transitive closure of the relation @ A on A defined by
( a ,b) E
@A
e a = b or { a ,b} E R(A1:;J.
T h e n 6~ is the least directing congruence o n A. Note that the mentioned algorithm by B. Imreh and M. Steinby [13], for finding the least directing congruence on a finite automaton, is based on a similar result given in terms of graphs. By Theorem 7 and Theorem 2 the following result holds: Corollary 1 The least directing congruence o n a finite automaton A is the Rees extension of the least directing congruence o n the reversible part of A , i.e. d~ = ~ R ( AU) A A . If A is an infinite automaton, then it does not necessarily have the least directing congruence. In the second part of this section we prove the existence of the least directing congruence on an arbitrary generalized directable automaton, even on an infinite one, and we give a characterization of this congruence different than the one given for finite automata in Theorem 7. First we introduce several notions and notations. If A is an arbitrary (not necessarily finite) automaton, then to each state a E A we can associate a language G ( a ) X * defined as follows
G(a) = { u E X * I (Vv E X ' ) uvu = a } . The main properties of so defined languages are described by the next lemma. Lemma 1 Let A be a n arbitrary automaton and a E A . Then G(a) # 0 i f and only if ( a ) is a strongly directable automaton. In that case the following conditions hold: (a) G(a) = { u E X " la is a u-neck of ( a ) }; (b) G(a) is a left ideal of X * ; (c) G(a)w G(aw),for every w E X ' .
c
388
Proof. If G(a) # 0 then clearly ( a ) is a directable automaton. On the other hand, a is reversible, whence it follows that ( a ) is strongly connected. Thus, ( a ) is strongly directable. Conversely, let ( a ) be strongly directable. Then a is a u-neck of ( a ) ,for some u E X * , and then u E G(a). The assertion (a) is evident. Further, consider arbitrary u E G(a) and w E X * . Then auwu = a, for each u E X * , so wu E G(a). Thus, G(a) is a left ideal of X*. Consider also arbitrary u E G(a) and w E X * . Then awuu = a, whence awvuw = aw, for every u E X * . Hence, uw E G(aw).
Now we are ready to describe the least directing congruence on a generalized directable automaton. Theorem 9 Let A be an arbitrary generalized directable automaton and let U A be the transitive closure of the relation U A defined on A by
( a ,b) E
VA
e a = b or
G(a) nG(b)
# 0.
Then U A is the least directing congruence on A. Proof. The relation U A is clearly reflexive and symmetric. Consider a, b E A , a # b, such that (a,b) E U A , and an arbitrary w E X * . Then there exists TA E G(a)n G(b), and by (c) of Lemma 1 we have that uw E G(aw) f l G(bw), so (aw, bw) E U A . Therefore, U A is compatible, whence it follows that U A is a congruence relation. To prove that U A is a directing congruence on A , consider an arbitrary u E GDW(A) and a , b E A . Then u E G(au) n G(bu), so (au, bu) E V A C U A . Therefore, A/uA is a u-directable automaton, so U A is a directing congruence on A . Let 0 be an arbitrary directing congruence on A. Suppose that ( a ,b) E U A and a # b. Then there exists u E G(a) n G(b). On the other hand, for an arbitrary u E D W ( A I 0 ) we have that (au,bu) E 0, whence (auu,buu) E 0. Now by u E G(a)n G(b) i t follows that
( a ,b) = (avu, buu) E 0. Therefore, V A C B, whence U A & 0, and we have proved that directing congruence on A .
UA
is the least
The previous theorem can be equivalently formulated as follows: Theorem 10 Let A be an arbitray generalized directable automaton and let U A be the transitive closure of the relation V A on A defined by (a,b) E Then
UA
UA
@
a =b
OT
(321 E X*)(VVE X * ) uuu = 12 & buu = b.
is the least directing congruence on A .
389
As we see from Theorem 10, the condition which defines the relation V A is stronger than the one from Theorem 7 that defines the relation Q A . A congruence relation 6 on an automaton A is called a trapping congruence if the factor automaton A / @is a trapped automaton, and it is called a trapdirecting congruence if A / @is a trap-directable automaton. Let A be a generalized directable automaton. Then the relation T A defined on A by
(a,b) E
TA
w a = b or ( V U ,E~ X * ) ( 3 p , qE X * ) aup = b & bvq = a
is the least trapping congruence on A. In other words, ( a ,b) E TA if and only if either a = b or a and b belong to the same strongly connected subautomaton of A. Moreover, the relation I ~ on A A defined by ( a ,b) E 1 9 ~ H a = b or (VU,v E X * ) ( 3 p ,q E X * ) aup = a & bvq = b
is the least trap-directing congruence on A. Equivalently, (a,b) E IJA if and only if either a = b or a , b E R(A), i.e. I ~ Ais the Rees congruence of the subautomaton R(A) of A. As it was proved in [18], such defined relations are the least trapping and the least trap-directing congruences on an arbitrary finite automaton, and almost the same proofs can be given in the case when A is a generalized directable (not necessarily finite) automaton. The next theorem describes certain relationships between the congruences U A , T A and t 9 on ~ a generalized directable automaton. Theorem 11 Let A be a generalized directable automaton. T h e n UA.TA=TA.UA=I~A.
Proof. Since U A C_ I ~ Aand TA C_ I ~ A then , U A * T A C_ I ~ Aand T A * U A C_ I ~ A . Therefore, it remains to prove the opposite inclusions. For that reason, consider an arbitrary pair ( a ,b) E 194. If a = b, then clearly ( a ,b) E U A . T A and ( a ,b) E T A . U A . Assume that a # b. Then a, b E R ( A ) ,so by Theorem 4, (a) and (b) are strongly directable automata, i.e. G(a) # 0 and G(b) # 0 . Take arbitrary u E G(a) and v E G(b). Then by (b) and (c) of Lemma 1 we have that
uv E X*G(b)C_ G(b) and uv E G(a)v whence ( a ,av) E
TA
and (av,b) E
VA
s G(av),
2 U A , and similarly,
v u E X * G ( a ) G ( a ) and vu E G(b)u C G(bu), which yields ( a , h ) E V A E U A and (bu,b) E T A . Therefore, (a,b) E and ( a ,b) E U A . T A , so we have proved the assertion of the theorem.
TA
.UA
390
on a generalized directable automaton The next theorem gives interesting characterizations of the structure of generalized directable automata on which the relation V A is transitive. Theorem 12 The following conditions o n an automaton A are equivalent: In the general case, the relation
A is not necessarily transitive, i.e.
VA
UA
# YA.
(i) A is generalized directable and U A is transitive; (ii) A is generalized directable and YA n TA = AA; (iii) A is a pullback product of a directabze automaton and a trapped automaton (with respect to a trap-directable automaton); (iv) A is a subdirect product of a directable automaton and a trapped automaton; (v) A is a parallel composition of a directable automaton and a trapped automaton.
Proof. (i)+(ii). If u is transitive, then V A = Y A . Consider an arbitrary pair ( a ,b) E YA f l T A . If a = b then ( a ,b) E AA is trivially satisfied, so we can further assume that a # b. By ( a ,b) E T A it follows that a, b E B , for some strongly connected subautomaton B of A , and then there exists w E X’ such that aw = b. On the other hand, ( a ,b) E YA = U A implies that there exists u E X * such that avu = a and b v u = b, for each v E X ’ . Now a = awu = b u = b. Hence, V A n TA = A A . (ii)+(iii). By the general result proved for arbitrary universal algebras by I. Fleischer in [lo] it follows that an automaton A is a pullback product of automata A1 and A2 with respect to an automaton A3 if and only if there exists a pair of congruences O1 and (32 on A such that 81 n 02 = AA, 01 and 8 2 commute and A / & 5 A1, A / & % A2 and A / & ? AS, where 83 = 81 & = & . e l . Since by Theorem 11 we have that V A . TA = TA U A = ~ Q A , then V A n TA = AA implies that A is a pullback product of a directable automaton A / v A and a trapped automaton A/TA with respect to a trapdirectable automaton A / 6 A . (iii)+(iv) and (iv)=+(v). These implications are evident. (v)+(i). Let A B x C be a parallel composition of a directable automaton B and a trapped automaton C . Then B and C are generalized directable, and since generalized directable automata form a generalized variety, then A is also a generalized directable automaton. Furthermore, it can be easily verified that
( ( b , c ) , ( b ’ , c ’ ) ) € ~ A ($ b=b’ whence it follows that
UA
is transitive.
& (c=c’
or c , c ’ E T T ( C ) ) , W
391 4
The least generalized and locally directing congruences
A congruence relation 8 on an automaton A is called a generalized directing congruence if the factor automaton A / @is generalized directable, and it is called a locally directing congruence if A / @is a locally directable automaton. In this section we describe the least generalized directing and the least Iocally directing congruences on a finite automaton, and give algorithms for finding them. First we prove the following theorem: Theorem 13 Let a f i n i t e automaton A be represented as a n extension of a n automaton B by a trap-directable automaton C , where B is a direct s u m of strongly connected automata Bi, i E [l,n]. For each i E [1,n] let bi denote the least directzng congruence o n Bi. Then the relation ^ / A defined o n A by
( a ,b) E YA w a = b or ( a ,b ) E
&,
for some i E [l,n],
is the least generalized directing congruence o n A . Proof. It can be seen easily that Y A is a congruence relation on A . As in the proof of Theorem 5 we obtain that there exists p E DW(Bi/bi). Take also arbitrary q E DW(C) and u E X * . Consider now any a E A. Then aq E Bi,for some i E [l,721, and for u = qp holds auvq = (aq)pvq E Bi,what implies (auuu,au) = ( ( a u w ) p ,( a d p ) E
bi.
Thus, (auuu,au) E Y A , for every a E A , whence it follows that A / Y A is a generalized directable automaton, i.e. YA is a generalized directing congruence on A . To prove that Y A is the least generalized directing congruence on A, consider an arbitrary generalized directing congruence 0 on A. Let 'p be the natural homomorphism of A onto A / @ ,and for any i E [ l , n ] , let 'pi denote the restriction of 'p on Bi. Then for each i E 11,n],Bi'pi is a strongly connected subautomaton of A / @ ,so by Theorem 5, Bicpi is a directable automaton. This means that ker 'pi is a directing congruence on Bi, whence 6i ker ' p i , and now we have that
u n
YA
= AA U
bi
ker'p = 8.
i=l
So we have proved that A.
YA
is the least generalized directing congruence on w
392
Using the above theorem we can give an algorithm for finding the least generalized directing congruence on a finite automaton A with ‘12 states and in input letters. The algorithm consist of two parts. In the first one we compute the strongly connected subautomata of A. As we have mentioned in Section 2, we can use one of the algorithms given in [18]or [9].They work in time C3(m,z n 2 ) and U(rnn), respectively. In the second part of the algorithm we compute the least directing congruence on each strongly connected subautomaton of A. Here we can use the algorithm given by B. Imreh and M. Steinby in [13], which can be carried out in time C3(rnn2+ n 3 ) . Therefore, the total time for realizing the whole algorithm is bounded by C3(mrz2 + n 3 ) ,the same as for the algorithm for computing the least directing congruence. Before we describe the least locally directing congruence on a finite automaton, we give a more general result. Theorem 14 Let P be a n irregular pseudovariety of automata and let a finite automaton A be represented as a direct s u m of direct s u m indecomposable automata Ai, i E [ l , n ] . For each i E [1,n] let X p , i denote the least P congruence on Ai. T h e n the relation XP,A defined o n A by
+
(u,b) E AP,A
@
(a,b) E A P , ~for some i E [ l , n ] ,
i s the least L(P)-congruence on A.
Proof. Evidently, XP,A is a congruence relation on A. Let p be the natural hoinomorphism of A onto A‘ = A / X P , A ,and for each i E [l,n] let cpi denote the restriction of cp on Ai and A: = Aipi. Then for every i E [ l , n ] we have that
( a , b ) Ekerdi
e a , b E Ai H a,b
& (a,b) E ker4 E Ai & ( ~ ~ E6X P) , A
@
(a,b) E X P , ~ ,
so kercpi = X p , i , and now we conclude that A: E Ai/Xp,i E P, because Xp,i is a P-congruence on Ai. On the other hand, if a‘ E A: n A$, for some i , j E [1,n], i # j , then a’ = aicpi = aip and a’ = aj’pj = a j q , for some ai E Ai and aj E A j , which yields (ui,uj) E XP,A. But, by the definition of XP,A it follows that ai and aj must belong to the same Ak, for some k E [l,n], i.e. that i = k = j , which leads to a contradiction. Therefore, we conclude that A: f l AS = 0 for i , j E [l,n], i # j, so A’ is a direct sum of automata A:, i E [l,n]. Using again Theorem 3 we obtain that A’ E L ( P ) , and hence, X P , A is a L(P)-congruence on 4. To prove that XP,A is the least L(P)-congruence on A, consider an arbitrary L(P)-congruence 0 on A. Let q5 be the natural homomorphism of A onto A” = A/O, and for each i E [1,n] let di he the restriction of 4 on
393
Ai and A: = Ai+ = Ai+i. We are going to prove that A? is direct sum indecomposable, for every i E I . Fix i E I and consider A?. It is easy to see that the inverse homomorphic image B+rl of every direct summand B of A? is a direct summand of Ai, and since Ai is direct sum indecomposable, we conclude that so is A:. On the other hand, 8 is an L(P)-congruence on A, whence A" = A/O E L ( P ) , and seeing that L ( P ) is a pseudovariety, then we also have that A? E L ( P ) . According to Theorem 3, the automaton A: can be decomposed into a direct sum of automata from P , and since A: is direct sum indecomposable, we conclude that A: E P . By this and by A? = Ai+i Ai/ ker+i it follows that ker+i is a P-congruence on Ai, whence X P , ~g ker +i. Therefore, X p , i g ker +i for every i E [l,n],and hence, XP,A 5 ker = 8. So we have proved that XP,A is the least L(P)-congruence on A .
+
If we assume P to be the pseudovariety of all finite directable automata, then the following consequence is obtained: Corollary 2 Let afinite automaton A be represented as a direct s u m of direct s u m indecomposable automata Ai, i E [I, n]. For each i E [l,n] let 6i be the least directing congruence on Ai. Then the relation XA on A , defined b y
( a , b ) E XA
(u,b) E 6i for some i E [ l , n ] ,
is the least locally directing congruence on A. In the case when P is assumed to be the pseudovariety of all finite trapdirectable automata, Theorem 14 gives as a consequence Theorem 5 of [18] that characterizes the least locally trap-directing congruence on a finite automaton. An algorithm for finding the least locally directing congruence on a finite automaton A with n states and m input letters, based on the previous results, can be also composed of two algorithms. The first one is the algorithm for finding the greatest direct sum decomposition of A, given by the authors in [l8], which can be done in time U(7nn). In the second one we compute the least directing congruence on every summand of this decomposition, using the mentioned algorithm from [13], and this takes time U ( m n 2+ n 3 ) . Therefore, the whole algorithm can be also realized in time U(mn2+ n3). References 1. C. J. Ash, Pseudovarieties, generalized varieties and similarly described classes, J. Algebra 92 (1985), 104-115.
394
2. S. BogdanoviC and M. CiriC, A note on congruences o n algebras, in: Proc. of I1 Math. Conf. in PriStina 1996, Lj. D. KoEinac ed., PriStina, 1997, pp. 67-72. 3. S. BogdanoviC, M. CiriC, B. Imreh, T. PetkoviC and M. Steinby, Local properties of unary algebras, (to appear). 4. S. BogdanoviC, B. Imreh, M. CiriC and T. PetkoviC, Directable automata and their generalizations - A survey, in: S . CrvenkoviC and I. Dolinka (eds.), Proc. VIII Int. Conf. "Algebra and Logic" (Novi Sad, 1998), Novi Sad J. Math 29 (2) (1999), 31-74. 5. S. Burris and H. P. Sankappanavar, A course in universal algebra, Springer-Verlag, New York, 1981. 6. J. Cernjr, Potna'mka k homoge'nnym expperimentom s konecny'mi automatrni, Mat.-fyz. cas. SAV 14 (1964), 208-215. 7. M. CiriC and S. BogdanoviC, Posets of C-congruences, Algebra Universalis 36 (1996), 423-424. 8. M. CiriC and S. BogdanoviC, Lattices of subautomata and direct s u m decompositions of automata, Algebra Colloq. 6:l (1999), 71-88. 9. J. Demel, M. DemlovA and V. Koubek, Fast algorithms constructing minimal subalgebras, congruences, and ideals in a finite algebra, Theoretical Computer Science 36 (1985), 203-216. 10. I. Fleischer, A note on subdirect products, Acta. Math. Acad. Sci. Hungar. 6 (1955), 463-465. 11. F. GCcseg and I. PeSk, Algebraic Theory of Automata, AkadCmiai Kiad6, Budapest, 1972. 12. J. M. Howie, Fundamentals of Semigroup Theory, London Mathematical Society Monographs. New Series, Oxford: Clarendon Press, 1995. 13. B. Imreh and M. Steinby, Some remarks on directable automata, Acta Cybernetica 12 (1995), 23-35. 14. M. Ito and J. Duske, O n cofinal and definite automata, Acta Cybernetica 6 (1983), 181-189. 15. J. KovaEeviC, M. Cirit, T. PetkoviC and S. BogdanoviC, Decompositions of automata and reversible states, A. Adam and P. Domosi (eds.), Proceedings of the Nineth International Conference on Automata and Formal Languages, Publ. Math. Debrecen (to appear). 16. T. PetkoviC, M. CiriC and S. BogdanoviC, Decompositions of automata and transition semigroups, Acta Cybernetica (Szeged) 13 (1998), 385403. 17. T. PetkoviC and M. Steinby, Piecewise directable automata, Journal of Automata, Languages and Combinatorics 6 (2001), 205-220.
395
18. 2. PopoviC, S. BogdanoviC, T. PetkoviC and M. CiriC, Trapped automata, A. Adam and P. Domosi (eds.), Proceedings of the Nineth International Conference on Automata and Formal Languages, Publ. Math. Debrecen(to appear). 19. M. Steinby, On definite automata and related systems, Ann. Acad. Sci. Fenn., Ser. A, I Math. 444, Helsinki 1969. 20. G . THIERRIN, Decompositions of locally transitive semiautomata, Utilitas Mathematics 2 (1972), 25-32.
396
Acts over Right, Left Regular Bands and Semilattices Types Tatsuhiko Saito Mukunoura, Innoshima 374, Hiroshima, Japan 722-2321 Email:
[email protected] Let S be a semigroup and let X be a non-empty set. Then X is called a right act over S or simply S-act if there is a mapping X x S 4 X , ( 2 ,s) C) x s with the property ( x s ) t = x ( s t ) . A semigroup S is called a band if every element in S is an idempotent. A band S is called r i g h t regular (resp. l e f t regular) if sts = st (resp. sts = ts) holds for every s,t E S. A commutative band is called a semilattice. An S-act X is said to be a right regular band type, or simply RRB-type, if x s 2 = x s and x s t s = x s t for all x E X and every s, t E S. A left regular band type (LRB-type) S-act and a semilattice type (SL-type) S-act are similarly defined. When S is a free monoid a RRB-type automaton, an LRB-type automaton and an SL-type auromaton can be similarly defined. In this case, for an automaton A = ( A ,X , a), where A is an alphabet, X is a set of states and 6 is a mapping X x A -+ X , ( 2 ,a) C) xu. we can show that, if xu2 = za and xaba = xab for all x E X , a , b E A, then x s 2 = x s and x s t s = x for all x E X, s,t E A*. This fact can be applied to LRB-type automata and SL-type automata. Our purpose is to determine all S-act which are right regular band types, left regular band types and semilattice types, respectively. To achieve the purpose, we obtain necessary and sufficient conditions, for any given set X , and any semigroup S, in order that X is S-acts which are a RRB-type, a LRB-type and a SL-type, respectively (Theorems 1,3,5). Further we obtain more concrete results to construct actually RRB-type, LRB-type and SGtype automata, respectively (Corollaries 2,4,5). Let X be a S-act. It is well-known that defining a relation p on S by spt if x s = x t for all x E X . a transformation semigroup S/p on X can be obtained. Thus, from the above results, every right regular band, left regular band and semilattce can be obtained in the full transformation semigroup T ( X ) ,respectively.
Let X be a set and let p be an equivalence on X . Then the pclass containing x E X is denoted by x p and the partition of X determined by p is denoted by ~ ( p ) For . a mapping 4 : X -+ X , x H x 4 , let im(4) = ( x 4 l x E
397
X } , ker(4) = ( ( 2 ,y ) E X x Xl.4 = y 4 } and fix(+) = { x E Xl.4 = x } , which are called the image, the kernel and the set of fixed points of 4, respectively. When X is an act over a semigroup S, im(s), ker(s) and fix(s) can be defined for s E S, since s : x c-) 2s. Since ker(s) is an equivalence on X, xker(s) denotes the ker(s)-class containing x. The symboles n and U denote the set-theoretic intersection and union, respectively, and A and V denote the lattice theoretic meet and join, respectively. If X is an act over a free monoid S, the x l = x for all x E X , where 1 denotes the empty word in S. For a set X,1x1 denotes the cardinality of X , and for a word s,1.1 denotes the length of s. Lemma 1. Let X be an act over a free monoid A*. Then (1) If xaba = xab for every a, b E A U ( 1 ) and x E X . then X is a mght regular band type act over S. ( 2 ) If xaba = xba for every a, b E A U ( 1 ) and x E X , then X is a left regular band type act over S. (3) If xa2 = xa and xab = xba for every a, b E A and x E X , then X is a semilattice type act over S. Proof. Let x E X and s,t E A*. We show (1) by induction on Is1 = n and JtJ = m, ( 2 ) and (3) can be similarly shown. If n,m 5 1, then the assertion is true by the assumption. Suppose that xata = xat holds for a E A and t E A* with It1 = k. If It1 = k + 1, then t = t'c for some c E A and t' E A* with It'l = k . Then we have xata = xat'ca = xat'aca = xat'ac = xat'c = xat. Thus xata = xat for every t E A*. Suppose that xsts = ast holds for any t E A* and s E A* with 1.1 = k. If 1.1 = k + 1, then s = s'c for some c E A and some s' E A* with ls'I = k. Then we have xsts = xs'cts'c = xs'cts' = xs'd = xst. Thus xsts = xst for every s, t E A*. Therefore we have xs2 = x s l s = x s l = 2s. Consequently, X is a right regular band type S-act. Lemma 2. Let X be a set and let 4 : X + X , x c-) x+. Then the following are equivalent : (1) (x,x$) E ker(4) for every x E X. (2) im(4)=fuc(+). (3) im(4) nxker(+) = {qb} for every x E X , which means the set im(+) n xker(4) has only one elementis x4. Proof. ( 1 ) + ( 2 ) Let x E im(+). Then x = y+ for some y E X . Since (y,y+) E ker(4), x+ = (y+)+ = y 4 = x , Thus im(+) C fix(+). The reverse inclusion is clear. (2) + (3) Let y E im(+) n xker(4). Then y = y+ = x4, since ( x , y ) E ker(+) and y E im(+) = fix(+).
398
(3) + (1) Straightforward.
L e m m a 3. Let X and 4 be as in Lemma 2. If there exist a subset Y of X and an equvalence p o n X such that Y n x p = {x4} for every x E X , then Y = im(4) = h ( 4 ) and p = ker(4). Proof. Let x E im(4). Then z = y 4 for some y E X. Since {x} = Y n yp, we have z E Y . Thus im(4) C Y . Let x E Y. Since x E Y f l xp, x = x4. Thus Y C_ h ( 4 ) im(4). Let (x,y) E p. Since Y r l x p = Y n yp, xq5 = y4. Thus (x,y) E ker(4). Let (x,y) E ker(4). Then x4 = y4. Since ( x , x 4 ) E p and (Y,Y4) E P , we have (2, 9) E P. Let (X, 5 ) be an ordered set. A subset I of X is called an o-ideal if x E I and y 5 x imply y E I. Then the set of o-ideals in ( X ,5 ) forms a lattice ordered set under U and n. For 2 E X,let I ( x ) = {y E X l y 5 x}. Then I(z) is an o-ideal, which is called the principal ideal generated by x. If a subset Y of X has the minimun element, we denote it by min(Y).
Theorem 1. Let X and S be a non-empty set and a semigroup, respectively. Then X is a right regular band type S-act i f and only i f X is an ordered set under some order-relation 5 , and f o r each s E S , there exist a subset X , of X and an equivalence p , on X which satisfy the following conditions: (1) JX, n xpsI = 1 for every z E X and s E S, ( 2 ) each X , is an o-ideal in ( X , I), (3) if X , n x p = {y}, then y E I(z), ( 4 ) X,t = X, f l X , f o r every s, t E S and (5) if (2,9) E P s , y E x, and (Y, E pt, z E xt, then (2, ). E p t . Proof. Suppose that X is a right regular band type S-act. Define a relation 5 on X by y 5 x iff y = xs for some s E S. It is easy to see that 5 is reflexive and transitive. Let y 5 x and x 5 y. Then y = xs and x = yt for some s,t E S , so that we have x = xst = x s t s = y. Thus is an ordered set. Since (25)s = xs2 = xs for every x E X and s E S , so that by Lemma 2 we have that im(s) = h ( s ) and im(s) n xker(s) = {xs} and xs E I ( z ) , since xs 5 x. Let x E im(s) and let y 5 x for x,y E X,sE S. Then y = xt for some t E S , so that we have y = xt = zst = xsts E im(s), since x E im(s) = fix(s). Thus im(s) is an o-ideal. Let x E im(st). Then, as x = xst = zsts, x E im(s) n im(t). Thus im(st) C im(s) n im(t). The reverse inclusion is clear, since fix(s)n h ( t )C fix(&). Thus im(st) = im(s) n im(t). If (x,y) E ker(s), y E im(s) and (y, z ) E ker(t), z E im(t), then xst = y t = z. Thus ( x , z ) = ( x , x s t ) E ker(st). For each s E S , put im(s) = X, and ker(a) = pa. Then X , and p , satisfy the conditions (1)-(5). Suppose coversely that, for each s , S, ~ the subset X, of X and the equivalence p , on X satisfy the conditions (1)-(5). Define the action of S on X
(2,s)
399
by xs = y if X,nxp, = {y}. Then xs 5 x by the condition (3), and by Lemma 3 we have that X, = im(s) = fix(s) and p, = ker(s), so that ( 2 , ~ s )E ker(s) by Lemma 2, i.e., zs = ( x s ) s . Let zs = y and yt = z. Then y E X s , z E Xt and z I y, so that by the condition (2) z E X, Thus (zs)t E X, nXt = Xst, and ((zs)t)s = (zs)t, since (zs)t E X, = fix(s). As (x,y) E ps,y E X, and (y,z) E p t , z E Xt, by the condition ( 5 ) ( q z ) E pst. Consequently z E X,,n zp,t. Thus z = (zs)t = z(st), so that xs2 = (zs)s = zs and xsts = ((xs)t)s = (xs)t = xst for all z E X and every s,t E S, as requiered. Thr following result is useful to construct actually right regular band type automata. Corollary 2. Let X be an act over a free monoid S = A*. Then X is a right regular band type S-act if and only if X is an ordered set under some order relation I,and for each a E A, there exists an o-deal I, in (X, 2 ) with I, n I(%) # 0 for every x E X . Proof. Suppose that X is a right regular band type S-act. As is seen in the proof of Theorm 1, X can be an ordered set under some order relation 5, and im(s) is an @ideal in (X, 5 ) with im(s) f l I(z)# 0. For each a E A, put im(a) = I,. Then the proof is complete. Suppose conversly that (X, 5 ) is an ordered set and for each a E A, there exists an 0-ideal I, in (X, 5 )with I, nI(x) # 0. Define the action of A on X by xa E I, n I(x) if z 4 I,, otherwise za = z. Then I, =im(a) = &(a) and xa 5 x. Put ker(a) = pa. Let x E X and a, b E A. Then zab = (xa)b 5 xa and za E I,. Since I, is an @ideal, we have xab E I, = fix(a). Thus xab = (xab)a = zaba. By Lemma 1, X is a right regular band type S-act.
Theorem 3. Let X and S be as in Theorm 1. Then X is a left regular band type S-act if and only if, for each s E S, there exist a subset X, of X and equivalence ps on X which satisfy the following conditions: (1) (X, n zp,l = 1for every x E X and s E S, (2) pst = ps V pt for every s,t E S and (3) if x E X,,y E Xt and (x,y) E pt, then y E XSt. Proof. Suppose that X is a left regular band type S-act. Since xs2 = xs for every x E X,s E S, by Lemma 2 we have that im(s) = fix(s) and im(s) V zker(s) = {xs}. Let x , y E X and s, t E S. If (x,y) E ker(s), then xst = yst, so that (x,y) E ker(st). If (x,y) E ker(t), then zst = ztst = ytst = yst, so that (2,y) E ker(st). Consequently, ker(s) V ker(t) ker(st). If (x,y) E ker(st), then zst = yst. Since (2,~s)E ker(s) and (zs,z s t ) E ker(t), (5, zst) E ker(s) vker(t), similarly (y, yst) E ker(s) V ker(t) so that (z,y) E ker(s) V ker(t). Consequently ker(st) G ker(s) V ker(t). Thus ker(st) = ker(s) V ker(t). If 2 E im(s),y E im(t) and (z,y) E ker(t),
400
then zs = z, since im(s) = &(s), similarly y t = y , and zt = yt, so that y s t = ytst = ztst = zst = zt = y t = y. Thus y E im(st). Put X , = im(s) and p, = ker(s). Then X , and ps satisfy the conditions (1)-(3). Suppose conversely that, for each s E S, the subset X , of X and the equivalence p, on X satisfy the conditions (1)-(3). Define the action of 5’ on X by zs = y if X , np, = { y } . Then p, = im(s) = &(s) and p, = ker(s). Let z E X and s,t E S. Then by Lemma 2 (z,zs) E p, and (zs,( z s ) t ) E pt, so that (2, ( z s ) t ) E p,Vpt = p,t. Since zs E X,, (m)t E X t and (zs,( z s ) t ) E pt, by the condition (3) we have z E XSt. Consequently, ( z s ) t E X,t n zp,t, so that (zs)t = z(st). Thus X is S a d . By Lemma 2 (zs)s = 5s so that zs2 = zs and since ( z , z t ) E pt 5 p s t , z s t = z ( s t ) = ( z t ) ( s t ) = ztst. Therefore X is a left regular band type S a d . Let X be a set. Then the set of equivalences on X forms a latticeordered set under n and V. We consider here a special set R ( X ) of equivalences on X , that is, for each p E R ( X ) ,there exists a subset M, such that (1) M, n z p = 0 for every pclass zp, (2) for every X E R ( X ) , if (2,y ) E p V A, then ( u , v ) E X for every u E M, n z p and v E M, n yp. (3) if z 4 M, and z E zp, then there exists X E R ( X ) such that ( z ,v) 4 X for every v E y p even if (2, y ) E p V A. In this case, M, is called the join-mediating set of p in R ( X ) and R ( X ) is called a JM-set, i.e., every element in R ( X ) has the join-mediating set in
R(X). ) {1,3} U Example. Let X = {1,2,3,4} and R ( X ) = { p l , p z } with ~ ( p l = {2,4},7r(pg) = {1,2,3}U{4}. Then{1,2,3}and {2,4}arethejoin-mediating set of p1 and p2, respectively, so that R ( X ) is an JM-set. If 4 p 3 ) = {1,2} u {3,4}, then R(X) U { p 3 } is not an JM-set, since (2,4) 4 p3.
The following result is useful to construct actually left regular band type automata.
Corollary 4. Let X be a n act over a free semigmup S = A*. Then X is a left regular band type S-act .if and only iJ for each a E A, there exists a n equivalence pa o n X such that R(A) = {pala E A } is an JM-set. Proof. Suppose that X is a left regular band type 5’-act. For each s E S, put ker(s) = p,. From Theorem 3, we have p,t = p, V p t . Since (2s). = xs2 = zs, (z,zs) E ker(s) = p,, so that by Lemma 2 im(s) = fix(s). Let s E S and (z,y ) E p, V pt for any t E 5’. Then (z, y ) E p,t, so that zst = yst. Since (z, zs),( y ,y s ) E p, and (zs,y s ) E pt. Thus im(s) is the join-mediating set of p, in {ptlt E S}, so that {ptlt E S} is an JM-set. Therefore R(A) = {pala E A } is also an JM-set.
401
Suppose conversely that, for each a E A, there exists an eqivalence pa on X such that R(A) = {pala E A } is an JM-set. For each pa E R(A),let M,, be the join-mediating set of pa in R(A). Define the action of A on X by z a E Mp, nzp, and ya = xa if y E zp,. Since za E %pa,we have (za)a = 20. By Lemma 2, im(a) = &(a). If (z,y) E pa, then za = ya. If (z,y) E ker(a), then zp, = ypa, since z a E xrho,, ya E ypa and z a = y a . Thus pa = ker(a). Let z E X and a, b E A, and let xab = y. Since (5, xu) E ker(a) = pa and (20,y) = (za, zab) E Pb, (2, y) E pa V Pb. As Mpb is a join-mediating set of pb in R(A), ( z b , y b ) E pa = ker(a), so that zba = yba = ya = xaba for all z E X, since y E im(b) = f k ( b ) . By Lemma 1, X is a left regular band type S-ad.
Theorem 5. Let X and S be as in Theorem 1. Then X is a semilattice type S-act i f and only i f X is an ordered set for some order relation 5 , and for each s E S, there exist a subset X, of X and an equivalence p, o n X which satisfy the following conditions: (1) IX, n zp,I = 1 f o r every 2 E X , (2) each X , is an o-ideal, (3) if z, n zp, = {y}, then y E I ( z ) (4) xst = X, n X t for every s,t E S and ( 5 ) pst = p, V pt f o r every s,t E S. Proof. Suppose that X is a semilattice type S-act. From the fact that a semilattice is a right regular band an a left regular band, by using Theorems 1,3 this follows. Suppose conversely that, ( X , I ) is an ordered set, and for each s E S, the subset X, of X and the equivalence p, on X satisfy the conditions (1)(5). Define the action of S on X by 2s = y if X, n zp, = {y}. Then by Lemma 3 we have that X , = im(s) = fix(s) and p, = ker(s). Let z E X and s,t E S. By the same argument as the proof of Theorem 1, we obtain that 2s = (zs)s, (zs)t = ((xS)t)sand ( z s ) t E Xa fl x b = Xab and by the same argument as the proof of Theorem 3, we obtain that ( z t ) s = ( ( z s ) t ) sand (zs)t E ps V Pt = pst. Thus (xs)t = ( ( z s ) t ) s= ( x t ) s and ( z s ) t E X,t n xpst, so that ( z s ) t = z ( s t ) , and we have zs2 = xs and z ( s t ) = z ( t s ) . The following result is useful to construct actually semilattice type automata.
Corollary 6. Let X be an act over a free monoid S = A*. Then X is a semilattice type S-act af and only if X is an ordered set under some order, and f o r each a € A there exists an equivalence pa which satisfies relation I the following conditions: (1) R(A) = {p,(a E A } is a M-set,
402
(2) each pa-class xp, has the minimum min(xp,) and (3) Mp, = {min(zp,)1x E S} is an o-ideal in ( X , l ) and the joinmediating set of pa in R(A). Proof. Suppose that X is a semilattice type S-act. From x s 2 = xs, by Lemma 2 we have im(s) n xker = {xs}. Since a semilattice is a right regular band and left regular band, we have that ( X ,I)is an ordered set, where y I x iff xs = y for some s E S, and that im(s) = fix(s) is an &ideal in (X, 5 ) and the join-mediating set of ker(s) in R ( S ) = {ker(t)It E S}. Since xs 5 x, we have xs = min(xker(s)), so that im(s) = {min(xker(s))lx E S} which is an &ideal in ( X ,5 ) and the join-mediating set of ker(s) in R(S). Thus R ( S ) is a M-set. For each a E A, put ker(a) = pa and R (A) = {p,(a E A } . Then im(a) = {min(xp,)lx E X } is an &ideal in ( X ,5 ) and the join-mediating of pa in R(a). Suppose conversely that ( X ,I)is an ordered set and for each a E A, the equivalence pa satisfies the conditions (1)-(3). Define the action of A on X by za = y if y = min(zp,). Then we have xa E I ( z ) . Since Mppn xp, = {min(zp,)}, by Lemma 3 M p p = im(a) = &(a) and pa = ker(a). Let x E X , a, b E A. Then xa2 = (xa)a = za. Since xab 5 xa E im(a) = Mp, and Mp,is &ideal, we have xab E Mpa= &(a), so that zaba = (xab)a = xab. Let xab = y. Then (x,y) E p,Vpb. Since (x,xb) E p b , (y,yb) E p b and Mpbis the join-mediating set of p b in R ( X ) ,we have (xb, yb) E pa = ker(a), so that xba = yba = ya = xaba. Consequently, xab = xaba = xba for all x E X and every a, b E A. By Lemma 1, X is a semilattice type S-act.
References [l]Howie, J. M. “Fundamentals of Semigroup Theory”, Oxford Science Pub-
lications, Oxford, 1995. [2] Kunze, M. and S. CrvenkoviE, Maximal subsemilattices of the full transformation semigroup on a finite set, Dissertationes Matematicae CCCXIII, Polish academy of Sciences, 1991. [3] Petrich, M., “Lectures in Semigroups”, John Wiley and Sons, London, 1977. [4] Saito, T. and M. Katsura, Maximal inverse subsemigroups of the full transfomation semigroup in “Semigroups with Applications (ed. J. M. Howie, W. D. Munn and H. J. Weinert), World Scientific,l991, 101-113.
403
Two Optimal Parallel Algorithms on the Commutation Class of a Word Extended abstract Re& Schott*
Jean-Claude Spehnerl
Abstract The free partially commutative monoid M ( A ,0 ) defined by a set of commutation relations 0 on an alphabet A can be viewed as a model for concurrent computing: indeed, the independence or the simultaneity of two actions can be interpreted by the commutation of two letters that encode them. In this context, the commutation class Co(w) of a word w of the free monoid A* plays a crucial role. In this paper we present: - A characterization of the minimal automaton Ao(w) for Co(w)with the help of the new notion of @-dissection. - A parallel algorithm which computes the minimal automaton Ao(w). This algorithm is optimal if the size of A is constant. - An optimal parallel algorithm for testing if a word belongs to the commutation class CQ(W). Our approach differs completely from the methods (based on Foata's normal form) used by C. G r i n and A. Petit [2, 31 for solving similar problems. Under some assumptions the first algorithm achieves an optimal speedup. The second algorithm achieves also an optimal speedup and has a time complexity in O(1og n) if the number of processors is in O ( n ) where n is the length of the word w, the total number of operations is in O ( n ) and does not depend on the size of the alphabet A as for the classical sequential algorithm.
Keywords: Automaton, commutation class, optimal, parallel algorithm, partially commutative monoid. ~
~~
'LORIA and IECN, Universite Henri Poincark, 54506 Vandoeuvre-l&s-Nancy,France, e-mail: schottOloria.fr t Laboratoire MAGE, FacultC des Sciences et Techniques, UniversitC de Haute Alsace, 68093, Mulhouse, France, e-mail:
[email protected] 404
1
Introduction
The free partially commutative monoid was introduced by P. Cartier and D. Foata [l]for the study of combinatorial problems in connection with word arrangements. It has particularly been investigated as a model for concurrent systems (see [4, 131) since the pioneering work of A. Mazurkiewicz [9]. In this context the computation of the commutation class of an element w (i.e. all words equivalent t o w) is of great interest since it gives all transactions equivalent to the initial one modulo the partial commutation relations. In other words, if a transaction is correct (i.e. no deadlock appears during its execution) then all elements of its commutation class are also correct. This paper is devoted to the design of - an optimal parallel algorithm which computes the minimal automaton of the commutation class of a given word on a constant size alphabet and achieves an optimal speedup under some assumptions, - an optimal parallel algorithm for testing if a word belongs to this commutation class. Our test algorithm is particularly original since its time complexity does not depend on the size of the alphabet on which the word is written. The notion of optimality of parallel algorithms used in this paper is defined as follows (see [7]): Given a computational problem Q, let the sequential time complexity of Q be Tseq(n) where n is the size of Q’s data. This assumption means that there is an algorithm to solve Q whose running time is O(Tsep(n)).A parallel algorithm to solve Q will be called optimal if the total number of operations it uses is asymptotically the same as the sequential complexity of the problem, of the parallel algorithm. regardless of the running time Tpar(n) The organization of the paper is as follows: Section 2 provides the basic notions on partial commutativity and gives a characterization of the minimal automaton of a commutation class with the help of the new notion of @-dissection. Section 3 focuses on the design of a parallel algorithm which constructs the partial automaton of a commutation class. Testing if a word belongs to a commutation class is the subject of Section 4. We give mainly sketch of proofs of our results. All details will be provided in the full version of this paper.
405
2
The partial minimal automaton of the commutation class of a word
Let A be a finite alphabet, A* the free monoid on A and 0 a partial commutation relation on A. With ( A ,0 ) we associate the smallest congruence (denoted =e) such that: ( a ,6 ) E 0 @ ab E-0 ba. Let w be an element of A*. The commutation class of w is the set C e ( w ) defined as follows: C ~ ( W=)(w' E A*/w' E@ w). For each rational language L of A*, there exists a finite minimal automaton A ( L ) which recognizes L . If L is finite, A ( L ) admits a non terminal state z such that, for each letter a E A, z.a = z and by deleting the state z , we get the partial minimal automaton A P ( L ) of L. The partial minimal automaton of the class C e ( w ) is denoted A e ( w ) .
Definition 1 A scattered subword (not a factor) m = ail ai, . . . aih of w = aoal . . . a,-l called rigid relatively t o 0 if none of the pairs of letters '22)l 2i'(
I
%S),
is
' ' ' 1 (ai(h-l) 7 ' i h )
belongs t o 0 U O-l, i.e. two consecutive letters of m are either equal or distinct and not permutable with respect t o 0 . It is easy to prove that all words of C e ( w ) have the same rigid subwords.
Definition 2 i ) For each strictly increasing sequence of integers n = ( i l l . .. ,ip) of the set (0, . . . ,n - l}, the strictly increasing sequence r = (jl, . . . ,j,) such that {jl, . . . , j q } = (0,. . . ,n - 1) - {ill... ,ip} i s called the complementary sequence of n for ( 0 , . . . , n - 1). By symmetry, n is the complementary sequence of r f o r (0,. . . , n - 1). u = ail . . . aip and v = ajl . . . aj, are then subwords of w = aio . . . a,-l and w is a shzlfJEe of u and v. The word v is then said t o be complementary of u with respect t o w . A strictly increasing sequence n admits a unique complementary sequence but this is not true f o r words since two distinct strictly increasing sequences n = ( i l l .. . , ip) and n' = (ii,. . . ,ib) can define the same word u = ail . . . ai, = ail , . . ai; (see Example 1 below). ii) Let n be a permutation of ( 0 , . . . ,n - 1) and w' = a,(o) . . . Every pair ( i ,j ) of elements of ( 0 , . . . , n-1) such that i < j and n ( j ) < n(i) is called an inversion of the sequence (n(O),. . . , n(n - 1)) and also an inversion of w' with respect to w .
406
iii) A pair (a,r ) of strictly increasing complementary sequences of the set ( 0 , . . . ,n - l} is called a @-dissection of ( 0 , . . .,n - 1) if, for each inversion (j,i) of the sequence ar = ( i l , . . . , i p , j l , .. . , j q ) , the letters ai and aj are distinct and permutable for 0 . I f (a,r ) is a @-dissection of ( 0 , . . . , n - l}, the pair (u, w) of subwords u = ail . . . aip and v = aj, . . . aj, of w is called a @-dissection of the word w.
Example 1 If w = abcdbe and 0 = { ( a ,b ) , ( a ,c ) , ( a ,d ) , ( a ,e ) , (b, d ) , ( b , e ) ,( c ,d ) } , the sequences a = (1,2,3) and r = (0,4,5) are complementary for ( 0 , . . . ,5} and (a,r ) is a @-dissection of ( 0 , . . . ,n - 1) since the inversions (0, 1), (0,2) and (0,3) of the sequence ar = (1,2,3,0,4,5) correpond to the pairs ( a , b ) , ( a ,c ) and ( a ,d ) of 0 . The pair of words ( u ,w) where u = bcd and v = abe is therefore a @-dissection of w = abcdbe. (bc,adbe), (bcde,ab), (bcdbe,a ) are also @-dissections of w. The subword u = abe of w admits two complementary subwords w = cdb and v‘ = bcd which correspond to a = (0,1,5), r = (2,3,4) and a’ = (0,4,5), r’ = (1,2,3). 24
d
32
47
Figure 1: The graph of the partial minimal automaton Ao(w) for w = abcdbe and 0 = { ( a ,b), ( a , c ) ,( a , d ) ,( a , e ) ,( b , d ) ,( b , e ) ,( c , d ) } . The states are denoted in accordance with section 3. Theorem 1 a) The function 4 which associates the state s = 1.u with each @-dissection (u, w) of w is bijective. ii) If u = bl . . . bp, the letters a for which there exists a transition towards s relative to a , are the letters bi such that, if i # p , bi permutes with the letters bi+l, . . , ,bp.
407
iii) If u = c1 . . . cq, the letters a for which there exists a transition issued f r o m s relative to a, are the letters ci such that, if i # 1, ci permutes with the letters c1,. . . ,ci-1. Proof sketch. The proof of the theorem is based on the following results: u of w there exists at most one subword u of w such that ( u , v ) is a @-dissection of w. - If u and u are subwords of w,(u,v) is a O-dissection of w if and only if uu belongs to Ce (w). - For each state s of A(w), if u and u are words such that 1.u = s and s.w = f , L(Ae(w), 1,s) = C e ( u ) and L ( A e ( w ) s, , f ) = Ce(v). 0 - For each subword
3
A parallel algorithm
In this section we design an optimal parallel algorithm which constructs the partial minimal automaton Ae(w). We give an overview of how our algorithm works. The algorithm constructs first the partial automaton A0 which recognizes only the word w. The transformation of A0 into the automaton A e ( w ) is based, essentially, on the following simple transformation: if t , u and u are states and a , b are letters of A such that t = u.a,u = t.b and (a,b) E 0 U 0-l then there exists also a state s such that s = u.b and s.a = v. If s does not exist already it has to be constructed and the transitions s = u.b and s.a = u have to be created (if they do not exist). This transformation, called the permutation of the letters a and b at t , can generate as well new permutations of some letters at u and u. If such permutations are realized in parallel, it may be possible that they try to create simultaneously the same state. In order to avoid this possibility we associate an integer with each state. This integer does not depend on its creation procedure and we distribute the states among the different processors.
3.1
The distribution of the states among the processors
Theorem 1 proves that each state s of Ae(w) is in 1 - 1 correspondence with a O-dissection ( u , v ) of w. If w = w[O]w[l]w[2].. . w [ n - 11 (from now arrays of letters are used for words) u has the form w [ i l ] w [ i ~ . .]w.[ i t ] where ( i l ,i2,. . . ,ik) is a strictly increasing sequence of integers of {0,1,. . . ,n - 1). It follows that we can identify the number 1 2i1 2i2 . . . 2i'. with the state s. In fact, if we put z = s - 1 and remove iteratively from z the smallest power of 2, we recover the sequence (il,. . . ,ik). Every state is hence an element of the universe U = { 1,. . . ,2"}.
+
+
+
+
408
Let p be the number of processors which are available on the computer and r the largest odd number which is strictly less than p . If we suppose that p is a power of 2 (a frequent situation), r and p are mutual prime numbers. We split now U in r parts U l , . . . , U , of equal size (up to 1) such that, for each processor q of { 1,.. . , r } , U, is the set of integers s such that 1+ s mod r = q. The processor q has in charge the treatment of all created states which belong to U, and to store in its local memory all data concerning these states. If a state s is created, the processor q = 1 s mod r is activated for: 1) inserting s in a stack of its local memory, 2) affecting a number num[s]to the state s thanks t o the following procedure:
+
i n s e r t ( s ,4 ) ; { if 0 < num[s]5 size then stack[num[s]]:= s else { size := size + I; stack[size]:= s; num[s]:= size
1
1
3) testing if a state s of U, has been created previously by the procedure:
e z i s t ( s ,4 ) ; { if 0 < n u m [ s ]5 size and stack[num[s]]:= s then e z i s t ( s , q ) := t r u e else e x i s t ( s ,q ) := f a l s e } The variable size, common to all these procedures, is stored in the local memory of the processor q and is not used by any other procedure. For a given state s, all these procedures are executed by the same processor q. Therefore the simultaneous execution of several of these procedures for the same state s is not possible. Nevertheless, two distinct processors can execute simultaneously these procedures since they concern then distinct states. Remark 1 These procedures are executed in time 0(1) and replace avantageously the use of an array of booleans. I n fact, the time complexity of the initialization of an array of booleans for the universe U is in 0 ( 2 n ) . Here the initialization is reduced to let size = 0 for each processor (see [lo], page 289). Its time complexity is therefore in 0(1)and the total number of operations is in O ( r ) . Remark 2 The partition used here is well-balanced for the subsets U, of the universe U but it is not necessarily the case f o r the created subsets of states which belong to the subsets U,. If the word w has no particularities, such a splitting is adapted; otherwise, a size balancing method has to be found for the sets S n U,.
409
3.2
The data structures associated with a state
Let s be a state, q = 1 + s mod T the processor associated with s and e = nurn[s].The following data structures are used for s in the local memory of the processor q: - an array transin[..,el which contains the integers i such that there exists a state u such that s = u 22 = u.w[i] (transitions towards s ) ; - the number nbin[e]of transitions towards s; - an array nurnin[..,el such that if h = nurnin[i,el then transin[h,el = i - an array transout[..,e] which contains the integers i such that there exists a state w such that w = s 2i = s.w[i] (transitions issued from s); - the number nbout[e]of transitions issued from s; - an array nurnout[..,e] such that if h = nurnout[i,el then transout[h,el = i. The procedures below are all based on the same idea which is to avoid using arrays of booleans in order to realize the initialization in constant time.
+
+
insertin(i,e, q ) ; { nbin[e]:= nbin[e]+ 1; nurnin[i,e] := nbin[e];transin[nbin[e],e] := i
1 ezistin(i,e , q ) ; { if 0 ; { e := num[u]; if e x i s t m t ( j ,e, q) = false then { insertmt(j,e, q ) ; if nbin[e] # 0 then { for lc := 1 to nbin[e]pardo { h := trans+, el; if (w[h],w[j]) E 0 U 0-1then permute(u,h , j )
1 }
1
1
412
The treatment of the state ”next” is the dual part of the treatment of the state ”previous”.
3.7 The treatment of the diagonal state This procedure is executed by the processor q = s mod r associated with s. If s is created, then we have to initialize nbin[e]and nbout[e]and to create the transitions to and from s relative to the letters w[j] and w[i].If s exists already, the transition from u to s (resp. from s to w) is created if it does not exist. Possibly, there is nothing to do.
diagonal(s,i,j,init,q ) ; { e := n u m [ s ] ; if init = 1 then { nbin[e]:= 0; insertin(j,e, q ) ; nbout[e]:= 0; insertout(i,e , q )
1
else { if existin(j,e , q ) = false then insertin(j,e , q ) ; if existout(i, e, q ) = f a l s e then insertout(i, e, q )
1 Definition 3 A0 is the automaton determined by the procedure af feet-first-states and Ah is the current automaton after executing the procedure permute h times. The procedure permute is executed only once for a triple ( t , i , j ) and there exist only a finite number of such triples. Thus our algorithm terminates. Let f i e n d be the automaton which is finally constructed by our algorithm. It’s easy to prove that: 0 The automaton Aend is deterministic and monogeneous. 0 The language recognized by the automaton Aend is L(Aend,1,2”) = Co (w). It follows that:
Theorem 2 The partial automaton constructed b y our algorithm is isomorphic to the partial minimal automaton A e ( w ) of the commutation class Co(w). Theorem 3 i) If S i z e ( A o ( w ) )is the size of the partial minimal automaton Ae(w) a n d S is his set of states, the total number of operations of our algorithm is in O ( S i z e ( A s ( w ) )* card(A))= O(card(S)* (card(A))’).
413
ii) If the alphabet A is of constant size, our algorithm is optimal. iii) I f the alphabet A is of constant size and if the distribution of the states of S among the subsets U1, . . . , U,. is uniform (i.e. balanced), then our algorithm achieves an optimal speedup. Proof. i) If there exist k ( s ) transitions towards a state s and Z(s) transitions issued from s, the treatment of the state s in the procedures previous, n e x t and diagonal requires O ( k ( s )* Z(s)) operations. Since k ( s ) 5 card(A) and l ( s ) = Size(Ao(w)),the total number of operations is in O(Size(Ae(w)) * curd(A))= O(curd(S)* ( c u ~ d ( A ) ) ~ ) . ii) If the alphabet A is of constant size, the total number of operations is in O ( S i z e ( A e(w))). But any algorithm which constructs a partial automaton recognizing Ce (w) tests necessarily all the transitions issued from each state and therefore the number of operations of such an algorithm is necessarily in O(Size(Ae(w))) and this proves that our algorithm is optimal in this case. iii) If the distribution of the states of S is uniform (i.e. balanced) among the subsets U1, .. . , U,., the T processors are load-balanced. In addition all procedures which are not affected to a processor can be distributed with priority on the processors Ur+l,. . . ,Up and then uniformly among all processors. For every q E (1,. . . , p } , let Tq(n)be the total number of operations realized by the processor q during the execution of the algorithm for a word w of length n and let T,,,(n) = m a z { T , ( n ) ; 4 E (1,.. . , p } } . Since the processors are load-balanced, there exists a strictly positive constant c1 (c1 < I) such that, for every q E (1,. . . , p } , T,(n) 2 c1 * Tmaz(n).Therefore we get: c1 * P * T m a z ( n ) 5 Tq(4 I P * Tmaz(n). Let Tpar(n) and Tseq(n) be respectively the time complexity of our parallel algorithm and the time complexity of an optimal sequential algorithm which constructs the automaton Ao(w) when w is of length n. Since our algorithm is optimal, there exist strictly positive constants c2 and c3 such that: Tpar(n) = c2*TmaZ(n)and TSe,(n) = C ~ * C T,(n). ; = ~ Therefore the speedup * p and this proves Sp(n)= T;;'p(n, T (n) (see [7]) verifies c1* * p 5 S p ( n )5 that Sp(n)is in O ( p ) and is optimal. 0
xsES
E:
(2)
4
Testing if a word belongs to a commutation class
We want to test if a given word u = u[O]. . .u[n- 11 belongs to the commutation class Ce(w) i.e. if this word is recognized by the automaton Ao(w).
414
An elementary sequential algorithm solves this problem in time O ( n ). We design a parallel algorithm which solves this problem in time O(1ogn) when the number of processors is in O ( n) . Moreover the total number of operations is in O ( n ) and does not depend on the size of the alphabet A. Hence our algorithm is optimal. We give now an overview of our algorithm. We use first a very simple test which verifies that, for every letter a E A, the numbers of occurrences of a in the two words u and w are equal. Then we determine, for every i E (0 ,..., n - l}, the value j = eta[i] E (0 , . . . ,n - 1 ) such that w [ j ]= u[i]and the numbers of occurrences of the letter w [ j ]= u[i]in the words w[O]. . . w [ j- 11 and 2401.. . u[i- 11 are equal. Since the states of A o ( w ) are identified with integers of the form 1 2i1 2 i 2 . . . 2 2 k , we can determine, by a prefix sum calculation in O(1og n) time, all the intermediate states which are necessary for recognizing the word u. In fact this computation is done on the universe U = ( 1 , . . . ,2n} and U is also the set of states of a partial automaton Al(w1) where w1 is of length n and all letters of w1 are distinct and two by two permutable. Now u E C Q ( W )if, and only if all these intermediate states are states of the automaton AQ( w ) . Our algorithm uses three well-known procedures.
+ + + +
4.1
Known used procedures
The following procedures compute respectively the sum, the prefix sum and the maximum of the elements in an array. For details see [6, 7, 81. The procedure somme(k, 1, x[k..Z],s u m ) (where k < 1 ) computes the sum of the 1 - k + 1 elements of the array z[k..l] and puts the result in the variable sum. The procedure somme-prefiz(k, 1, z[k..Z],sx[k..l]) (where k < 1 ) computes, for each index i of { k , . . . ,1 } , the prefix sum sz[i]= z [ k ] . . . x [ i ] . The result is then in the array sz[k..Z]. The procedure m a x i m u m ( k ,1, x[k..Z],m a z ) (where k < 1 ) computes the maximum of the 1 - k 1 elements of the array x[k..Z] and puts the result in the variable m a x . All these procedures are optimal and have a time complexity in O(log(1- k + 1 ) ) when the number of processors is in 0(1- k + 1).
+ +
+
4.2
Letter occurrences in a word
An alphabetic order is given on the alphabet A: the array order is such that, for all a E A, a is the order[aIth letter of the alphabet A. Let w = w [ O ] w [ l ]. . .w[n- 11 be a word of length n and for each letter a of the alphabet A let nocv[order[a]]be the number of occurrences of a in w.
415
The purpose of the procedure letter-occurrences given below is to determine, in time O(1og n ) , the number of occurrences nocv[order[a]] of a in v simultaneously for every letter a in A. We choose a number base which is bigger than the number of occurrences of every letter in v : base = n - card(A) 2 (we suppose here that every letter of A has at least one occurrence in v ) . Since card(A) 5 n, we can precompute all the powers base2,.. . baseCard(A)-'of base in O(1ogn) time by an algorithm similar to somme-prefix. card(A) The value s u m computed by this procedure is x k = 1 nocv[k].base"'. s u m permits to determine simultaneously the number of occurrences of every letter of A in v.
+
letter-occurrences(v[O..n - 11,nocv[1..card( A ) ] ;) { base := n - card(A) + 2; for i := 0 to n - 1 pardo { k = order[v[i]]; ~ [ := i ] base"' ;
1
1
somme(0, n - 1, x[O..n - 11, s u m ) ; for k := 1 to card(A) pardo { divsum := s u m div basek-'; nocv[k]:= divsum mod base; (divsum stands for the floor of sumlbase"') }
The first test
4.3
The purpose of the procedure first-test given below is to compare for the two words w = w[O]w[l] . . . w [ n - 13 and u = u[O]u[l] . . .u[n- 11 of length n and for each letter a of the alphabet A, the number of occurrences nocw[order[a]] and noczi[order[a]] of the letter a in w and u. Hence the procedure Zetteroccurrences is called for the words u and w. If these numbers are not two by two equal then the array idoc[l..card(A)]contains a zero and testl # card(A). In this case u is not in the commutation class C e ( w ) .
first-test(u[O..n - l],w[O..n- 11); { letter-occurrences(u[O. .n - 11, nocu[l..card(A)]); letter-occurrences(w[O..n - 11, nocw[1..card(A)]); for k := 1 to card(A) pardo { if nocu[k]= nocw[k]then idoc[k]:= 1 else idoc[k]:= 0
1 somme(1, card(A),idoc[l..card(A)],t e s t l ) if testl # card(A) then write ('udoes not belong to the class')
1
416
The reference word Let z = z[O]z[l] . . . z [ n - 13 be the word of length n which satisfies the following conditions: order[z[O]]5 order[z[l]]5 . . . 5 order[+ - 111 for the 4.4
alphabetic order on A and for each letter a of A the number of occurrences of a in z is equal to the number of occurrences of a in w. We call z the reference word of w . By applying the procedure somme-pre f is to the array nocw we obtain an array decal such that, for every letter a E A such that order[a] > 1, decal[order[a]- 11 is the index of the first occurrence of the letter a in z. Moreover if order[a]= 1, the index of the first occurrence of a in z is obviously 0. In the same procedure we compute a better value for the identifier base which is equal to the maximum number of occurrences of a letter in w plus one. Similarly, as in the procedure letter-occcurrences, base is used in the procedure ref erence-word below for determining the indices of the occurrences of each letter a of A .
re ference-word(1, card(A),nocw[l..card(A)]); { somme-pre f is(1, card(A),nocw [ l .card(A)], . decal [l..card(A)]); decal[O]:= 0; m a x i m u m ( 1, card(A),nocw[l..card(A)],m a z ) ; base := m a s + 1;
1 4.5
Analysis of a word
The purpose of the procedure analyze-word given below is to determine, for a given word v = v[O]w[l] . . .v [ n- I ] of length n, the array phi such that: for each i of (0,. . . , n - l}, z[i]= uCphi[i]]and for every pair ( i , j ) such that z [ i ]= z [ j ] and i < j , phi[i]< p h i [ j ] . The array phi associates, for every letter a E A and for every admissible value of T , the rth occurrence of a in JI with the rth occurrence of a in z . The array decal[O..card(A)]is used by this procedure.
analyze-word(0, n - 1, v[O..n- l],phi[O..n- 13); { for i := 0 to n - 1 pardo { k = order[v[i]]; z[i]:= base"-l ; } somme-pre f i s ( 0 ,n - 1, z[O..n- I ] , ss[O..n - I ] ) ; for i := 0 to n - 1 pardo { m [ i ]:= ss[i]div s[i]; r[i]:= rz[i]mod base; - 11 ~ [ i := ] ]i; phi[decal[order[v[i]]
+
417
4.6
The transformation of a word
The next procedure uses the procedure analyse-word for determining the arraysphiu and phiw and the array eta[O..n-1] which is such that eta[phiu[i]]= phiw[i]for every i E (0, . . . ,n - 1). Thus eta[i]= j if and only if there exists an integer r such that u[i]and w[j] are the rth occurrence of a same letter of A in the words u and w.
transf orm-word(u[O..n - 11, w[O..n - 13, eta[O..n - 11); { analyze-word(0,n - l,w[O..n - l],phiw[O..n- 11); analyze-word(0, n - l,u[O..n- l],phiu[O..n- 11); for i := 0 to n - 1 pardo eta[phiu[i]]:=phiw[i]
1 4.7
The second test
The array eta and the procedure somme-pre f i x allow to determine all states of the automaton Aa(w) which recognize the word u and all its left factors in the case where u belongs to the commutation class C e ( w ) . In the opposite case, there exists an i E (0,. . . ,n - 1 ) such that eaist(sx[i]+ 1) = f a l s e and u is not in the commutation class C e ( w ) .
the-second-test(eta[O..n - 11); { for i := 0 to n - 2 pardo x [ i ]:= 2eta[i1; smme-prefix(O..n - 2,x[O..n - 21, sx[O..n - 21); for i := 0 to n - 2 pardo { q := 1 ( s x [ i ] 1) mod r ; if ezist(sx[i] 1, q) then y[i]:= 0 else y [ i ] := 1;
+
+ +
1
s m m e ( 0 ,n - 2, y[O..n - 21, test2); if test2 = 0 then write ('ubelongs to the class') else write ('u does not belong to the class')
1 Example 2 If w and 0 are as i n Example 1 and if u = dbceba, z = abbcde, base = 3, phiu[O]= 5, phi411 = 1, phiu[2]= 4, phiu[3]= 2, phiu[4] = 0 , phiu[5]= 3, phiw[O] = 0 , phiw[l] = 1, phiw[2] = 4, phiw[3] = 2, phiw[4] = 3 and phiw[5] = 5. Hence eta[O] = 3, eta[l] = 1, eta[2] = 2 , eta[3] = 5 and
418
eta[4] = 4. Since the states sx[O] + 1 = 9, sx[l] + 1 = 11, sx[2] + 1 = 15, sx[3]+ 1 = 47 and sx[4] + 1 = 63 are states of the automaton A e ( w ) , u = dbceba is accepted. But the word v = dbecba is not accepted since the state sx[2] + 1 = 43 is not a state of the automaton Ae(w) (see Figure 1). Theorem 4 If the number of processors is in O(n), our algorithm tests if a word u belongs t o the commutation class CQ(W)an time O(1ogn). The total number of operations is in O(n) and the algorithm is optimal. Moreover if the distribution of the states of 5' among the subsets U 1 , . . . , U, is uniform, then our algorithm achieves a n optimal speedup.
Proof. For every transition i from a state s to a state t , t = s.w(i) = s+2'. Hence the determination of the states s1 = l.u[O] = 1 2et"[01, 5-2 = s1.u[1] = s1+2eta['I1.. . , sn-l = sn-2.u[n-2] = sn-2+2eta[n-21 reduces to a prefix-sum if and only if all states SI, s2,. . . ,sn-l calculation. It follows that u E CQ(W) belong to the automaton A Q ( w ) . If the number of processors is in O(n), the time complexities of the procedures s o m m e , somme-pre f ix and m a x i m u m given in [6, 7, 81 are in O(1og n) and the total number of operations is in O(n). Moreover, all our procedures have the same complexities. Our algorithm is therefore optimal. The proof of the optimal speedup achievement is the same as in Theorem 3.
+
5
Conclusion
We have presented an optimal parallel algorithm for generating the commutation class of a word and an optimal parallel algorithm for testing if a word belongs to this commutation class. Our algorithms are efficient and easy to code. The notion of @-dissection is original to the authors. Applications to parallel processing are the object of further studies.
Acknowledgments : The authors are grateful to V. Diekert and F. Otto for discussions on the contents of this paper, to M. Ito for his pertinent comments which led t o a new proof of Theorem 1and to an anonymous referee for many remarks and suggestions which permitted to improve both the contents and the presentation of this paper.
419
References [l]Cartier P. and Foata D., Problhmes combinatoires de commutation et de rQarrangements, Lecture Notes in Math., 85, Springer Verlag, 1969.
[2] CQrin C., Automatic parallelization of programs with tools of trace theory, Proceedings of the 6th International Parallel Processing Symposium (IPPS), 1992, IEEE, 374-379.
[3] CQrin C. and Petit A., Speedup of recognizable languages, Proceedings of MFCS’93, Lecture Notes in Computer Science, 711, 332-341, Springer Verlag, 1993. [4] Cori R. and Perrin D., Automates et commutations partielles, RAIRO Inf. Theor., 19, 1985, 21-32. [5] Diekert V., Combinatorics of traces, Lecture Notes in Computer Science, 454, Springer Verlag, 1990. [6] Hillis W.D. and Steele G.L., JR., Data parallel algorithms, Communications of the ACM, 29, 12 (1986) 1170-1183. [7] JAjA J., An introduction to parallel algorithms, Addison-Wesley Pub. Company, 1992. [8] Ladner R.E. and Fischer M.J., Parallel prefix computation, JournaZ of the ACM, 27, 4 (1980) 831-838. [9] Mazurkievitch A., Concurrent program schemes and their interpretations, DQIMI Rept., PB 78, Aarhus University, 1977.
[lo] Mehlhorn K., Data structures and algorithms, volume 1,Springer Verlag, 1984. [ll]MQtivierY., An algorithm for computing asynchronous automata in the case of acyclic non-commutation graphs, Proc. ICALP’87, Lecture Notes in Computer Science, 372, 637-251.
[12] Schott R. and Spehner J.-C., Efficient generation of commutation classes, Journal of Computing and Information, 2, 1, 1996, 1110-1132. Special issue: Proceedings of Eighth International Conference of Computing and Information (ICCI’96), Waterloo, Canada, June 19-22, 1996. [13] Zielonka W., Notes on finite asynchronous automata and trace languages, RAIRO Inf. Theor., 21, 1987, 99-135.
420
A PROOF OF OKNINSKIAND PUTCHA’S THEOREM KUNITAKA SHOJI DEPARTMENT OF MATHEMATICS, SHIMANE UNIVERSITY MATSUE, SHIMANE 690-8504 JAPAN
Abstract. Oknihlci and Putcha proved that any finite semigroup S is an amalgamation base for all finite semigroups if the z-classes of S are linearly ordered and the semigroup algebra R [Slover C has a zero Jacobson radical. As its consequence they proved that every h i t e inverse semigroup U whose all of the classes form a chain is an amalgamation base for finite semigroups. In this paper we give another proof of the result for finite inverse semigroups by making use of semigroup representations only.
1. INTRODUCTION AND
PRELIMINARIES
A finite semigroup U is called amalgamation base for finite semigroups if every amalgam [S,T;U] of finite semigroups S,T with U as a core is embedded in a finite semigroup. Hall and Putcha [4] proved that if a finite semigroup S is an amalgamation base for all finite semigroups, then the 3-classes of S are linearly ordered. Okniriski and Putcha 171 prove that any finite semigroup U is an amalgamation base for all finite semigroups if all of the 3-classes of U are linearly ordered and the semigroup algebra C [U] over C has a zero Jacobson radical. As its consequence, they obtained the following. OkniriSki and Putcha’s theorem (Corollary 10 of [7]). A finite inverse semigroup whose 3-classes of U are linearly ordered is an amalgamation base for finite semigroups. The purpose of this paper is to give another proof for the theorem by using results and methods introduced in the paper [5]. Okniriski and Putcha[7] used both representations of semigroups and linear representations of semigroups. Our proof uses only representations of semigroups.
Convention. Let 7 ( X ) denote the full transformation semigroup on a set X with composition being from right to left. Let S be a semigroup. Then a left [resp. right] S-set is a set with an associative operation of S on the left [resp. right]. A left [resp. right] S-set X is faithfil if for distinct s,t E S , there exists x E X with s x # t x . Thus, given a faithful left [resp. right] S-set X , we obtain a canonical embedding of S into T ( X ) and vice-versa. For undefined terms of semigroup theory, we refer readers to [l]and [5].
42 1
Result 1 (Lemma 1 of [7]). Let U be a finite semigroup and G I ,G2 be two subgroups of U with an identity element in common which are isomorphic by an isomorphism 4 : G1 -+ G2 . Let S be a finite semigroup containing U as a subsemigroup. Then there exists a finite semigroup T such that S is a subsemigroup of T and there exists t E T such that 4 ( g ) = t-’gt for all 9 E Gi(C T ) . Result 2. (cf. Lemma 1and its corollary of [5])Let U be afinite semigroup. Then the following are equivalent : (1) u is an amalgamation base for finite semigroups ; ( 2 ) FOT any two embeddings 4 l , q 5 2 of U into the full transformation semigroup T ( X ) , there exist afinite set Y and two embeddings 61,62 : T ( X ) -+ T ( Y )such that Y contains X as a subset and 6141 and 62q52 coincide on
u; (3) FOTany finite semigroups S , T , any finite faithful left S-set X and any finite faithful left T-set Y , there exist a finite faithful left S-set X’ 2 X and a finite faithful left T-set Y’ 2 Y such that the U-sets X I , Y‘ are U isomorphic to each other. 2. PROOF OF OKNIkSKI AND PUTCHA’S THEOREM
This section is devoted to semigrouptheoretical proof of the Okniriski and Putcha’s theorem. We shall prove first the following several results will be used later.
Lemma 1. Let U be a finite inverse semigroup. Let 41, 4 2 be embeddings of U into the full transfonation semigroup T ( X ) such that /Y(l)I= IY(2)I,where Y(’) = & ( u ) ( X ) ) and Y(’) = + z ( u ) ( X ) ) . Then
(u
(u
UEU
UEU
any U-isomorphism between the right U-set T ( X ) 4 1 ( U )and the right U-set T ( X ) 4 2 ( U )extends a U-isomorphism from the right 41(U)-set T ( X ) to the right U-set T ( X ) . Proof. Suppose that there exists a U-isomorphism 6 from the right U-set T(X)q51(U) to the right U-set T ( X ) $ z ( U ) . Let f E Map(Y(’),X). Then there exists uniquely f’ E Map(Y(’),X) such that O(f+l(e)) = f’42(e) for all e E Eu. In fact, we define a mapping f’ E Map(yi“),X) by f’(z) = O(fq51(e))(x)if x E q52(e)(X)for some e E Eu, where EIJ denotes the set of all idempotents of U . If x E 4 2 ( e ) ( X )n &(el), where e, e’ E Eu,then e ( f 4 l ( e ) ) ( x )= e(fq5l(e))(42(e1)(X)) = W $ l ( e M l ( e ’ ) ) ( x )= W 4 1 ( e e ’ ) ) ( x ) and Q(f41(e’))(x)= e ( f 4 l ( e r ) ) ( 4 2 ( e ) ( X=) ) W 4 1 ( e ’ ) 4 1 ( e ) ) ( z= ) W41(e1e))(z). Thus f’ is well-definedand unique. So we obtain a mapping [ : M a p ( Y ( l ) X , ) -+ M a p ( Y ( l ) , X )with [(f)= f’.
422
For any f E M a p ( Y ( l ) , X ) ,let V ( ' ) ( f )be { h 6 7 ( X )- T(X)q51(U) I hq51(u)= fq5l(u) for all u E U } . Also, Let V @ ) ( f 'be ) { h E 7 ( X )- T(X)q52(U) I hq52(u)= fq5z(u) for all u E U } .
Note here that for any e E Eu, f = fq51(e)ly(l)if only iff' = 8(fq51(e))lycz,. Actually, if f = ulY(l)for some u E U then for any 2 E Y @ ) f'(z) , = @(fq51(e'))(.) .( E q52(e)(X)(e' E Eu))= e((fq51(e)Ml(e'))= 8((fq51(e))(~z(e')(.)) = Q ( f & ( e ) ) ( z ) .Hence f ' = 8(q51(u))(y(a). The converse is true. yca Thisimplies that IV(')(f)l= I X y ( l ) l - l = IX 1-1 = IV(2)(((f)I iff = fq51(e)IY(1)for some e E Eu and IV(')(f)l= IXY(')I = IXycz)I= IV(2)(((f))l if not. Hence
Iv(')(~)I
=
Iv(')(~(f))l for any f E M a p ( Y ( ' ) , X ) .
Thus, there exists a bijection Ef from V ( l ) ( fto ) V ( ' ) ( ( ( f )and ) so, there exists a bijection E from T ( X )- T(X)q5l(U)to 7 ( X ) - 7(X)q52(U). Thus we obtain a bijection 0 : T ( X ) + 7 ( X ) by gluing 8 and E. We shall prove that 0 is a U-homomorphism of the right q51(U)-set 7(X)q51(U)to the right $Z(U)-set 7(X)q5z(U). Let h E 7 ( X ) . For any u E U , letting z E X I we have
(W
) q 5 2
( 4 )(I.
(4) = =(h)($2 (u.-l> (42(4(.>I
= (S:(hM2
= e(hq51(uu-l))(q52(u)(.))
= Wq51(uu-lu))(.) = W q 5 1 ( 4 ) ( 2 )
= Whq51(4)(.). Hence 0 ( h & ( u ) ) = 8(h)q52(u).The lemma is proved.
0
Lemma 2. Let U be a finite inverse semigroup semigroup which is a disjoint union of 1-idempotent semigroup {e} and I such that I is an ideal of U and I = I e I . Then U is an amalgamation base for finite semigroups if the semigroup I is an amalgamation base for finite semigroups. Proof. Suppose that there exist two embeddings q51,q52 of U into the full transformation semigroup T ( X ) . Since I is an amagamation base for finite semigroups, by Result 2 we can assume that q5llr = 4211. Moreover, by Lemma 9 of [5], we may assume that &(e) and q5z(e) belong to a 3-class of 7 ( X ) ,since by assumption there exists an ideal J of T ( X ) whixh contains q51(I),but neither q5l(e)nor c#q(e). Let Y = Uu,~q51(u)(X), 2 1 = q5l(e)(X)Y and 2 2 = q52(e)(X)- Y . There exist a bijection ( : q5z(e)(X)+ q51(e)(X) such that ( ( 2 1 ) = 2 2 and the restrictio of ( to Y is the identity mapping on Y. Now we shall define a mappin 8 : T(X)q51(e)U 7(X)q51(I)-+ T(X)q5z(e)u T(X)q51(1)as follows : For any f E T(X)q51(I),
423
Q(f)=
{
f
i f f ET(X)41(4
fE4de) if f E T(X)41(e) - 7-(X)41(4 Then it is clear that 0 is bijective. Next we shall prove that 0 is a U-homomorphism from the right 41(U)-set 7(X)41( e )U7(X)41(I)-set 7(X)41(V)to the right 4 2 (U)-set 7(X)4z( U )U nX)41(I). Case 1 : u = e and f E T(X)&(e) - T(X)&(I). Then
W41(.))
= @(f) = ft4de) = ( f E 4 z ( e ) ) 4 z ( e= ) W4z(e). f E T(X)&(I). Then
Case 2 : u = e and
e ( f h ( e ) )= ( f h ( e ' ) ) h ( e )(forsome e' E Er) = f h ( e ' e ) = f(42(.'.) = f(4z(e'Mz(e) = (f41(e1)42(e) = f4z(e). Case 3 : u E I and f E 7 ( X ) & ( e ) - T(X)q51(I). Then 0(f4l(u)) =
f h ( 4 = f4d.I
= fE4du) = 0(f)42(.).
Case 4 : u E I and
=
f
E
T(X)qh(I). Then O(f+l(u)) = f41(u)
=
fd~(u)
0(f)42(4.
In any case, it holds that 0 ( f & ( u ) ) = O(f)+Z(u). Finally, Let U1 be the semigroup obtained from U by adjoining an identity element 1. We can extend 41, $2 by defining 41 (1)= q52( 1)to be the identity mapping on X . Then 41~42 are regarded as homorphisms from U1 to T ( X ) . Then by Lemma 1, we can get a U'-isomorphism from the right 41(U)-set 7 ( X ) to the right &@)-set 7 ( X ) which is an extension of 0. By Result 2, the lemme is proved. 0 Lemma 3. Let G be a finite group with an identity element e and I a finite (not necessarily inverse) regular semigroup. Let U he a finite regular s e m i p u p semigroup which is a disjoint union of G and I such that I is an ideal of U and I = IeI. If there are emheddings $1~42of U into a finite semigroup S such that the restrictions to I U { e } of 41 and 4 2 are equal, then there exists an embedding t of S into a finite semigroup T sucht ) all g E G , and that there exists t 6 T such that t - l ( & ( g ) ) t = & # ~ ( g for Y&(.) = t41(eu)7 E41(u)t= E41(ue) for all u of I . Proof. By Result 1, there exists an embedding 4 of S into a finite semigroup V sucht that there exists c E V such that c-'(&$1(g))c = $C$2(g) for all g E G . Also we can assume that V is a finite full transformation semigroup. Then there is an ideal J of V which contains &$1(I) but does not contain &l(e). Let TI be the full transformation semigroup on the set J . Regarding J as a left V-set, we define a mapping p : V + TI such that p ( v ) ( a )= va for all a E J . Then p is a homomorphism and the restriction of p to the ideal J is injective, since J is regular'. Next, let T2 be the Rees factor semigroup of V by the ideal J . For any v E V, let B denote the element of Tz containing v. Let T = TI x Tz
424
be the direct product of semigroups TI and Tz.Then we define a mapfor)all s E S. It is clear that [ ping E : S + T by E ( s ) = (pq5(s),m is an injective homomorphism. Let t = (pq!&(e),c). Then t satisfies the property that t-'([+l(g))t = & 5 2 ( g ) for all g E G. Actually, t-'(&$l(g))t = ( P 4 4 l ( e ) , ~ ) ( P 4 4 ~ ( g ) , ~ ) ( P 4 4 l ( e= ) , ~ () ~ 4 4 i ( g ) , c - W i ( g ) c ) = (P44l(g),44z(g)) = ( p 4 4 d g ) ,44dg)) (since P 4 4 l ( g ) = P44dg)). Also, we have t ( [ h ( u ) ) = ( p 9 4 l ( e ) , ~ ) ( p d ~ i ( u )=, I (p&h(eu),I) ) =
&$l(eu) for all u of I. Similarly, ([41(u))t= [41(ue) for all u of I . The lemma is proved. 0
L e m m a 4. Let G be a finite group with an identity element e and I a finite inverse semigroup. Let U be a finite inverse semigroup semigroup which as a disjoint union of G and I such that I is an ideal of U. Then U is an amalgamation base for finite semigroups i f the subsemigroup I U {e} of U is an amalgamation base for finite semigroups Proof. Suppose that there exist two embeddings 41,q52 of U into the full transformation semigroup T ( X ) . Since I is an amagamation base for finite semigroups, by Result 2 and Lemma 2 we can assume that I # J ~ ( = ~ q ! ~ z l r ~ { By ~ ) . Lemma 3, We can assume that there exists t E T ( X ) such that t-'(&bi(g))t = E4z(g) for all g E G, tdi(u) = 4i(eu) and 41(u)t = 41(ue) for all u E I . Now we define a map 0 : T ( X ) & ( U ) -+ T ( X ) + z ( U )as follows : For any f E T ( X ) 4 1 ( U ) ,
f
Q(f)=
ft
if f E 7 ( X ) 4 1 ( I ) i f f E T ( X ) h ( e )- T ( X ) 4 1 ( I )
To prove that 0 is well-defined, we shall prove that E T ( X ) h ( e )- T ( X ) q h ( I ) . Actually, it follows from the property that t+l(u)= &(eu) and $l(u)t= q51(ue) for all 21 E I . Next we shall prove that 6 is a U-homomorphism from the right ~$~(U)-set T ( X ) q h ( U )to the right qb(U)-set T ( X ) q h ( U ) .
fe E T ( X ) h ( e )- T ( X ) h ( I )i f f
Case 1 : u E I and f E T ( X ) & ( I ) . Then 0(fq51(u)) = f & ( u ) = f ~ # ~ ( u ) . On the other hand, 0 ( f ) 4 z ( u ) ( z = ) (f)4z(u) = f b ~ ( u ) Hence . O(f+l(u)) = e(f)4z(.). Case 2 : u
E I and
f
E T ( X ) d l ( e )- T ( X ) 4 1 ( I ) . Since tq5l(u)= 4 l ( e u ) , we have Q ( f 4 1 ( 4 )= f4l(.) = ( f 4 1 ( e ) ) 4 1 ( 4= f ( t h ( 4 )= ( f t ) 4 Z ( U ) =
@(f)(42(4).
Case 3 : u E G and f E T ( X ) 4 1 ( I ) . Then e(f41(u))= fq!q(u) = (fq51(et))q51(u)(for some idempotent er of I) = f&(e'u) = fc$z(e'u) = (f42(e1))42('11) = ( f 4 1 ( e r ) ) 4 2 ( 4= f42(.) = W)42(4. Case 4 : u E G and f E T ( X ) $ q ( e )- 7 ( X ) q h ( I ) . Then e(fqh(u)) =
(f4l(.))t
= f(t42(4) = e(f)42(4.
425 Consequently, 8 is a U-isomorphism between the right qh(U)-set T ( X ) and the right &(U)-set T ( X ) .The lemma follows from Lemma 1and Result 2. 0
Proof of Okniriski and Putcha's theorem. Let U be a finite inverse semigroup whose 3-classes form a chain. Then there exists a chain of ideals U = U1 2 UZ 2 ... 2 U,, such that U, is a maximal subgroup and each Ui/Ui+l is a completely 0-simple inverse semigroups (15 i 5 n - 1). Also, we can index idempotents of U so that n idempotents eil (1 5 i 5 n) form a chain and for each 1 5 i 5 n, ri idempotents eij (1 5 j 5 ri) are D-related and Ui = U e i l U . For each idempotent eil (1 5 i 5 n), let G,,, denote the maximal subgroup containing e i l . Particularly, U,, = G,,, . Since U,, is a finite group, by Lemma 3, U,, is an amalgamation base for finite semigroups. By Lemma 2 and Lemma 3, it suffices to prove that if UZU Gel, is an amalgamation base for fnite semigroups, then so is U ( = U1). So we suppose that UzUG,,, is an amalgamation base for finite semigroups. By Result 2, there exist two embeddings 4 1 , 4 z of U into the full transformation T ( X ) such that 41 and 4 2 coincide on UZU Gel,. We shall prove that there exists a U-isomorphism 8 from the right &(U)-set T ( X ) to the right 4 2 ( U ) set T ( X ) . Let & ( e i j ) ( X ) = X i j , 4 2 ( e i j ) ( X )= y i j for all 1 5 i 5 n and 1 5 j 5 ri. Then Xi1 = yil for all 1 5 i 5 m and Xij =. y i j for all 2 5 i 5 n and 1 5 j 5 ri. Moreover, lXljI = IY1jI for all 1 5 J 5 r1 since + l ( e l j ) , + z ( e l j ) are D-related to q5l(ell) (= 42(e11)). Let X i = Xij and yi = y i j , where m 5 i 5 n.
u
u
iT , t +;
T}.
Ground tree transducers were introduced by Dauchet and Tison [15]. The rewrite relation JR of any ground TRS R can be defined as a GTTrelation. Since the class of GTT-relations is effectively closed under forming
445
converse relations (trivial), compositions and reflexive, transitive closures, the confluence condition
e; 0 *&
c *; +& 0
can be expressed as an inclusion between two GTT-relations. Now the inclusion problem of GTT-relations turns out to be decidable, and hence a new simple proof of the decidability of the confluence of ground TRSs is obtained. In fact, the result can be extended to concern some more general classes of TRSs. These results can be found in [15] and [14]. Of course, it should be mentioned that the decidability of the confluence of a ground TRS was shown also by Oyamaguchi [47]. In [16] ground tree transducers are used for proving an even stronger result, the decidability of the first-order theory of ground term rewriting. In [18] Engelfriet introduces derivation trees for ground TRSs. A reduction sequence s * R . . . * R t of a ground TRS R is represented by a tree r E Tx,{#), where # is a new binary symbol, so that X(r) = s and p ( r ) = t for two given linear tree homomorphisms X and p. The properties of these derivation trees are quite similar to those of the derivation trees of contextfree grammars. In particular, for any ground TRS R , the set D R of derivation is the image of DR under trees is a regular tree language and the relation the yield-mapping q(r) = ( A ( r ) , p ( r ) ) .Any GTT-relation can be defined in a similar manner, and elegant, rigorous proofs are obtained for many results concerning GTTs and ground TRSs.
Acknowledgements This work was supported by the Academy of Finland under Grant SA 863038. I thank the participants of a TUCS seminar (1999-2000) for sharing with me the pains and pleasures of studying some of the new literature on tree automata and term rewriting, and especially Eija Jurvanen and Matti Ronka for their comments on this paper. Special thanks are due t o Tatjana PetkoviC for her generous help with the preparation of the typescript.
References [l] J. Avenhaus: Reduktionssysteme, Springer, Berlin 1995.
[2] F. Baader and T. Nipkow, Term Rewriting and All That, Cambridge University Press, Cambridge 1998. [3] B. Bogaert, F. Seynhaeve and S. Tison: The recognazibility problem for tree automata with comparisons between brothers. Foundations of
446
446
446 446 446 446 446 446 446 446 446 446 446 446 446
Software Science and Computation Structures (Proc. Conf. 1999) LNCSa 1578, Springer, Amsterdam, 1999, 150-164. B. Bogaert and S. Tison: Equality and disequality constraints on direct subterms in tree automata, Theoretical Aspects of Computer Science, STACS’92 (Proc. Symp., 1992), LNCS 577, Springer, Berlin 1992, 161171. W.S. Brainerd: Tree generating regular systems, Information and Control 14 (1969), 217-231. R. Biindgen: Termersetzungssysteme, Vieweg&Sohn, Wiesbaden 1998. H. Comon: An efficient method for handling initial algebras, Algebraic an Logic Programming (Proc. Intern. Workshop, 1988), LNCS 343, Springer, Berlin, 108-118. J.L. Coquid6, M. Dauchet, R. Gilleron and S. Vfigvolgyi: Bottom-up tree pushdown automata: classification and connection with rewrite systems. Theoretical Computer Science 127 (1994), 69-98. B. Courcelle: On recognizable sets and tree automata. In: Resolution of Equations in Algebraic Structures (eds. H. Ait-Kaci and M. Nivat), Academic Press, Boston 1989, 93-126. B. Courcelle: Basic notions of universal algebra for language theory and graph grammars. Theoretical Computer Science 163 (1996), 1-54. M. Dauchet: Simulation of Turing machines by a regular rewrite rule. Theoretical Computer Science 103 (1992), 409-420. M. Dauchet, A.C. Caron and J.L. Coquid6: Automata for reduction properties solving. J. Symbolic Computation 20 (1995), 215-233. M. Dauchet, F. De Comite: A gap between linear and non linear term-rewriting systems. Rewriting Techniques and Applications, RTA87 (Proc. Conf., 1987), LNCS 256, Springer, Berlin 1987, 95-104. M. Dauchet, T. Heuillard, P. Lescanne and S. Tison: Decidability of the confluence of finite ground term rewrite systems and other related term rewrite systems. Information and Computation 88 (1990), 187-201. M. Dauchet and S. Tison: Decidability of confluence for ground term rewriting systems. Fundamentals of Computation Theory, FCT’85 (Proc. Conf., 1985), LNCS 199, Springer, Berlin 1985, 80-89. M. Dauchet and S. Tison: The theory of ground rewrite systems is decidable. 5th IEEE Symposium on Logic in Computer Science (Proc. Symp., 1990), IEEE Computer Society Press, Los Alamitos, CA 1990, 242-248. N. Dershowitz and J.-P. Jouannaud, Rewrite systems. In: J . van Leeuwen (ed.), Handbook of Theoretical Computer Science, Vol. B , Elsevier Sci-
aLNCS = Lecture Notes in Computer Science
447
ence Publisher B.V., Amsterdam 1990, 243-320. [l8] J. Engelfriet: Derivation trees of ground term rewriting systems. Information and Computation 152 (1999), 1-15. [19] Z. Fulop: Undecidable properties of deterministic top-down tree transducers. Theoretical Computer Science 134 (1994), 311-328. [20] Z. Fiilop, E. Jurvanen, M. Steinby and S. VBgvolgyi: On one-pass term rewriting. Acta Cybernetica 14 (1999), 83-98. [21] Z. Fulop and S. VBgvolgyi: Congruential tree languages are the same as recognizable tree languages. - A proof for a theorem of D. Kozen. Bulletin of the EATCS 39 (1989), 175-185. [22] Z. Fulop and S. VAgvolgyi: A characterization of irreducible sets modulo left-linear term rewriting systems by tree automata. Fundamenta Informaticae XI11 (1990), 211-226. [23] Z. Fulop and S. VBgvolgyi: Ground term rewriting rules for the word problem of ground equations. Bulletin of the EATCS 45 (1991), 186-201. [24] Z. Fulop and S. VAgvolgyi: Minimal equational representations of recognizable tree languages. Acta Inforrnatica 34 (1997), 59-84. [25] J.H. Gallier and R.V. Book: Reductions in tree replacement systems. Theoretical Computer Science 37 (1985), 123-150. [26] T. Genet: Decidable approximations of sets of descendants and sets of normal forms. Rewriting Techniques and Applications, RTA-98 (Proc. Conf., 1998), LNCS 1379, Springer 1998, 151-165 [27] F. Gkcseg and M. Steinby: Tree languages. In: G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages, Vol. 3, Springer, Berlin 1997, 1-87. [28] R. Gilleron: Decision problems for term rewriting systems and recognizable tree languages. Theoretical Aspects of Computer Science 1991, STACS 91, (Proc. Symp., 1991), LNCS 480, Springer, Berlin 1991, 148159. [29] R. Gilleron and S. Tison: Regular tree languages and rewrite systems. Fundamenta Informaticae 24 (1995), 157-175. [30] P. Gyenizse and S. Vagvolgyi: Linear generalized semi-monadic rewrite systems effectively preserve recognizability. Theoretical Computer Science 194 (1998), 87-122. 1311 P. Gyenizse and S. Vagvolgyi: A property of left-linear rewrite systems preserving recognizability. Theoretical Computer Science 242 (2000), 477498. [32] D. Hofbauer and M. Huber: Linearizing term rewriting systems using test sets. J. Symbolic Computation 17 (1994), 91-129. [33] D. Hofbauer and M. Huber: Test sets for the universal and existential
448
[34]
[35]
[36] [37]
[38] [39]
[40]
[41]
[42]
[43]
[44] [45] [46]
[47]
closure of regular tree languages. Rewriting Techniques and Applications, RTA-99 (Proc. Conf., 1999), LNCS 1631, Springer, Berlin 1999, 205-219. J. Jacquemard: Decidable approximations of term rewriting systems. Rewriting Techniques and Applications, RTA-96 (Proc. Conf., 1996), LNCS 1103, Springer, Berlin 1996, 362-376. Y. Kaji, T . Fujiwara and T. Kasami: Solving a unification problem under constrained substitutions using tree automata. J. Symbolic Computation 23 (1997), 79-117. D. Kapur, P. Narendran and H. Zhang: On sufficient completeness and related properties of term rewriting systems. Actu Informatzca 24 (1987), 395-415. E. Kounalis: Testing for inductive (co)-reducibility. Trees in Algebra and Programming, CAAP’SO (Proc. Coll., 1990), LNCS 431, Springer, Berlin 1990, 221-238. Kozen, D.: Complexity of finitely presented algebras. 9th Annual Symposium of the Theory of Computing (Proc. Conf., 1977), 164-177. G.A. Kucherov: A new quasi-reducibility testing algorithm and its application to proofs by induction. Algebraic and Logic Programming (Proc. Conf.,1988), LNCS 343, Springer, Berlin 1988, 204-213. G.A. Kucherov: On the relationship between term rewriting systems and regular tree languages. Rewriting Techniques and Applications, RTA-91 (Proc. Conf., 1991), LNCS 488, Springer, Berlin 1991, 299-311. G. Kucherov and M. Tajine: Decidability of regularity and related properties of ground normal form languages. Information and Computation 118 (1995), 91-100. D. Lugiez: A good class of tree automata. Applications to inductive theorem proving. Automata, Languages and Programming, ICALP’98 (Proc. Conf., 1998), LNCS 1443, Berlin 1998, 409-420. D. Lugiez and J.L. Moysset: Complement problems and tree automata in AC-like theories. Theoretical Aspects of Computer Science 1999, STACS 93, (Proc. Symp., 1993), LNCS 665, Springer, Berlin 1993, 515-524. D. Lugiez and J.L. Moysset: Tree automata help one to solve equational formulae in AC-theories. J. Symbolic Computation 18 (1994), 297-318. J . Mezei and J.B. Wright: Algebraic automata and context-free sets. Information and Control 11 (1967), 3-29. F. Otto: On the connections between rewriting and formal language theory. Rewriting Techniques and Applications, RTA-99 (Proc. Conf., 1999), LNCS 1631, Springer, Berlin 1999, 332-355. M. Oyamaguchi: The Church-Rosser property for ground term-rewriting systems is decidable. Theoretical Computer Science 49 (1987), 43-79.
449
[48]M. Oyamaguchi: Some results on decision problems for right-ground term-rewriting systems. Toyohashi Symposium on Theoretical Computer Science (Proc. Symp., 1990),41-42. [49]D. Plaisted: Semantic confluence tests and completion methods. Information and Control 65 (1985),182-215. [50]K. Salomaa: Deterministic tree pushdown automata and monadic tree rewriting systems. J. Computer and System Sciences 37 (1988),367-394. [51] F. Seynhaeve, S. Tison and M. Tommasi: Homomorphisms and concurrent term rewriting. Fundamentals of Computation Theory, FCT’99 (Proc. Conf. 1999),LNCS 1684,Springer, Berlin 1999,475-487. [52]H. Comon, M. Dauchet, R. Gilleron, F. Jaquemard, D. Lugiez, S. Tison and M. Tommasi: Tree Automata Techniques and Applications. http://www .grappa.univ-lille3.fr/tata/. [53] S. Tison: Tree automata and term rewrite systems. (Extended Abstract) Rewriting Techniques and Applications, RTA-2000 (Proc. Conf.) , LNCS 1833,Springer, Berlin 2000,27-30. [54]S. VAgvolgyi: A fast algorithm for constructing a tree automaton recognizing a congruential tree language. Theoretical Computer Science 115 (1993),391-399. [55] S.VAgvolgyi and R. Gilleron: For a rewrite system it is decidable whether the set of irreducible, ground terms is recognizable. Bulletin of the EATCS 48 (1992),197-209.
450
Key agreement protocol securer than DLOG Akihiro Yamamura * and
Kaorii Kurosawa t
Abstract
Our goal is to propose a key agreeInerit protocol that is secure even if the discrete logarithm problem can be efficiently solved in the underlying abelian group. The protocol is defiled over a non-cyclic finite abelim group whereas the DifFie-Hellman protocol is defined over a cyclic finite abeliari group. We analyze the generic reductions of breaking the proposed protocol to the discrete logarithm problem and show that a large number of queries to the discrete logarithm oracle are required to break the proposed protocol in the generic algorithm model.
Keg Word$: Diffie-Hellman protocol, multiple discrete logarithm problem, generic algorithm, discrete logarithm oracle
1
Introduction
In 1976 D i a e and Hellman proposed a protocol over an insecure channel to establish a secret key. Since then, their scheme has been applied to numer011s finite abelian groups like the miiltiplicative groups of finite fields and the groups of the rational points on elliptic curves and hyperelliptic curves ([3], [4]and [S]). However, the Diffie-Hellman key exchange protocol is inherently viilnerable to an adversary who can solve the discrete logarithm problem in the underlying group. The discrete logarithm problem is believed to be intractable in general, however, we cannot deny the existence of an efficient algorithm that solves the discrete logarithm problem. As a matter of fact, polynomial or subexponential time algorithms for the discrete logarithm problem have been discovered for several classes of finite abelian groups.
*Communications Research Laboratory, 42-1, Nukui-Kitamachi, Koganei, Tokyo, 1 8 4 8795 Japan email: akiOcrl.go.jp TTbaraki University, 4 1 2 1 , Nakanarusawa, Hitachi, Tbaraki, 316-8511, Japan email: kurosawaOcis.ibaraki.ac.jp
45 1
On the other hand, the quest for abelian groups appropriate to the DiffieHellman scheme made niimeroiis classes of abelian groups available to protocol designers. Some groups have potentially richer striictiires than the multiplicative groups of finite fields which are always cyclic. Several groups, for example, the multiplicative group of integers modiilo a composite number, the group of the rational points on an elliptic curve and a hyperelliptic curve and a commutative subgroup of the group of non-singular matrices over a finite ring are not necessarily cyclic. For these groups, the discrete logarithm problem does not fiilly reflect the complexity of their algebraic striictiires. In fact, in [S], it is shown that R(p) queries to the group operation oracle are required to solve the multiple discrete logarithm problem (see Section 2.3 for the definition) in a non-cyclic group isomorphic to Z, x Z, in the generic algorithm model whereas only R ( f i ) queries to the group operation oracle are required to solve the discrete logarithm problem for a group isomorphic to ZpTbfor any n 2 1. The results indicate that the multiple discrete logarithm problem is more difficult than the discrete logarithm problem. This observation motivates 11s to invent a key agreement protocol over a non-cyclic group so that we can exploit its complicated algebraic structure to enhance the security. We constriict a key agreement protocol whose seciirity is based on the intractability of the multiple discrete logarithm problem over a non-cyclic abelian group. We employ generic algorithms and generic reductions to produce evidence that the proposed protocol cannot be broken by the adversary who can solve the discrete logarithm problem. We prove that breakmg the proposed protocol requires R(fi) queries to the group operation oracles. Furthermore, we prove that breaking the proposed protocol requires R ( fi)queries if it is allowed to call the discrete logarithm oracle, which is introduced in Section 3.2, in addition to the group operation oracle. Therefore there exists no probabilistic polynomial time algorithm that breaks the proposed protocol even if the discrete logarithm problem is efficiently solved, Hence, the proposed protocol has a novel feature that it is secure against the adversary who can solve the discrete logarithm. Related works: A generic algorithm is a general purpose algorithm that does not make iise of any property of the representation of the group elements. In [8] it is proved that the computational complexity of breaking the DH protocol is also R(&. In [ 5 ] ,however, it is proved that solving the DLOG is strictly harder than breaking the DH protocol if p 2 I n.
452
2
Proposed key agreement protocol
We introduce a key agreement protocol that is defined over a general finite abelian group that is not necessarily cyclic. The protocol is called the Generalized Dz;tfie-Hellman protocol or simply the GDH protocol in this paper. We prove that the GDH protocol is at least as secure as the Diffie-Hellman (DH for short) protocol. In Section 3, we produce a stronger evidence that the GDH protocol is securer than the DH protocol. Before defining the protocol, we recall the notations in group theory. Let G be a finite abelian (multiplicative) group. The subgroup generated by the element a is denoted by < a > and similarly the subgroup generated by the elements a and b is denoted by < a, b >, that is:
< a , b > = {unb'" 1 n, rn E 2).
< a > = {an 1 n E Z},
For a E G, lul denotes the order of a, that is, the number of the elements in < a >. The order of a group G is denoted by /GI.
2.1
Proposed protocol
GDH protocol: Let G be a finite abelian group and a, b elements of G. We choose piiblic integers a, p, y,6 such that each of a , p, y,6 is relatively prime to both la1 and Ibl. step 1. Alice chooses integers
randomly. She computes
il, i2
(aaiibPiz,
a7i2b6i1
1
and sends it to Bob. step 2. Bob chooses integers
3'1, 3'2
randomly. He compiites
(aaji Pjz
7
arja b6ji
)
and sends it to Alice. step 3. Alice compiites (aajib@jz)6ii
and
(arjzb6ji
Pia
= aa6iljlbP6ilh
- aPriaja
) Then Alice computes a common key K = a a d i ~ +j P~ r i z j , by multiplying the two elements.
bPSi2ji.
bPd(ilj2 + i z j l )
453
step 4. Bob computes
and Then Bob similarly computes the common key K .
Remark Suppose that G is an abelian group and a and b are elements of G. If la1 and Ibl are relatively prime, then the siibgroiip < a, b > is in fact cyclic. Therefore the necessary condition for < a, b > to be non-cyclic is that la1 and Ibl have common prime divisors. We should note that a finite abelian group G is non-cyclic if and only if G contains a siibgroiip isomorphic to Z, x Z,for some prime p . Hence? we prefer to choose la1 and Ibl which have common prime divisors so that the scheme can be based on the striictiire of a non-cyclic group.
2.2
Security compared with the DiffieHellman protocol
Breaking the protocol is equivalent to solving the following algorithmic p r o b lem. Suppose that G is a finite abelian group and that a, b E G. Each of the parameters a, p, y, 6 is relatively prime to both la1 and Ibl. The GDH problem in G with respect to a, b is defined by:
INPUT: OUTPUT: where il, i2, j,, j o are randomly and independently chosen integers. The following resiilt guarantees that the GDH protocol is at least as secure s.a the DH protocol if the parameters are carefully chosen.
Theorem 2.1 Let a , p , y , b be integers. W e suppose that each of them is relatively prime to /GI. If there exists an efficient algorithm that solves the GDH problem (with the parameters a,0, y,6 ) in an abelian group G for all a, b E G, then there exists an efficient algorithm that solves the DH problem in G for all a E G. Proof. Suppose that there is an efficient algorithm that solves the GDH problem for all a and b. We construct an efficient algorithm that solves the DH problem, that is, an algorithm that computes ailjl for the inputs ail and
454 ajl where a is an element of G. Let b = 1 (the identity element of G). We should note that a , p, y,6 are integers relatively prime to ( a (since la( divides IGI. By our assumption, we have an efficient algorithm to solve the GDH problem for a and b. Let i2 = 3'2 = 0. We input (aai1bPi2
a7izb6il) = ( ( p i 1
, 1)= ((aily, 1)
and
pjz.
(&l
aYh
@1)
= (@l
, 1) = ((&)a, 1)
to the algorithm that solves the GDH problem with respect to a and b. Then we obtain U Ordiljl +Pyizjz
bPa(ilj2 +izj,) = p 5 i l j l
We note that we can compiite and (ajl)" because we are given a i l , ajl and a is a public information. Since both a and b are relatively prime to la[, we can find the integer m such that (aa6)" = a. Then ( a a s i l j l ) T n = 0 (aru6m)i l j l = &jl , and hence, the DH problem is efficiently solved.
2.3
MDLOG
Let G be a finite abelian group and a, b elements in G. We set H to be the subgroup of G generated by a and b. The multiple discrete logarithni problem (MDLOG for short) in the group H = < a , b > is the algorithmic problem defined by: INPUT: OUTPUT:
An element g of H
A pair (z,y) of non-negative integers such that g = axby.
Since H is generated by a and b, there exists at least one pair (z,y) of nonnegative integers satisfying g = axby. Although such a pair is not necessarily unique in general, the output is uniquely determined if H is the direct product < a > x < b >. Clearly the GDH problem is reduced to the MDLOG problem, and hence, the GDH protocol can he broken if the MDLOG is efficiently solved. We should remark that the result in [S] indicates that solving MDLOG is essentially harder than solving DLOG. On the other hand, DLOG is evidently reduced to MDLOG. We shall siimmarize the relationships among MDLOG, DLOG, GDH and DH in Section 6.
455
3
Generic reduction of breaking the proposed protocol
We discuss the security of the GDH protocol from the point of view of the generic model. Our conclusion is that the GDH protocol with carefully chosen parameters is securer thap the DH protocol in the generic model. To simplify the argument, we consider only the GDH protocol over a miiltiplicative group G isomorphic to Z, x Z, where p is a large prime in the rest of the paper. We shall show that the GDH protocol is secure even against the adversary who can solve the DLOG if we impose the condition on the parameters a, p, y, 6 as follows:
a , p, y, 6, are relatively prime to p
(1)
P6 is a quadratic nonresidue (mod p ) . -
(2)
and “7
We suppose that the conditions (1)and (2) are satisfied. The condition (1)is imposed to prevent a, 0, y, 6 from collapsing elements a, b E G. On the other hand, the condition (2) seems rather artificial. We explain why the condition (2) is imposed in Section 4.
3.1
Generic algorithms
We briefly review generic algorithms and generic reductions. A generic algorithm is a general purpose algorithm which does not rely on any property of the representation of the group (see [S] and [5] for details). Let a be a random mapping from Z, to a set S of s h e p of binary strings. The generic algorithm is allowed to make calls the group operation oracle that computes the function add and inv defined by for z, y E Z,,
add(a(z),a ( y ) ) = a ( z
+ y) and i n v ( a ( 2 ) ) = a(-.)
without any compiitational cost. A generic algorithm for the DLOG in the cyclic group Z, takes ( a ( l ) , a ( z )a)s an input and outputs z, where z E Z,. We note that in [5] the Dfie-Hellman oracle is introduced to study the generic reduction of the DLOG to the DH. Next let cr be a random mapping from Z, x Z, to a set S of size p 2 of binary strings. A generic algorithm for Z, x Z, is allowed to make calk group operation oracles which computes the function add and inv defined by a ( y i , 1~2))
=
~ ( z+i ~
i7LV(c7(Z1, 2 2 ) )
=
(T(--51,
add(a(zi, ZZ),
i 2 ,2
-Z2).
+ YZ)
456
A generic algorithm for the MDLOG in Z, x Z, takes (41,01, 4 0 , I),
as an inpiit and then outputs
(21, "2)
421,22))
where
$1, 2 2
E Z,.
3.2 Main Theorem We now investigate the hardness of breaking the (proposed) GDH protocol compared with the DLOG problem in terms of the generic reduction. First of all, a generic algorithm for GDH problem runs as follows. Let p be a large prime. The group Z, x Z, is encoded by into a set S of binary strings. A generic algorithm for the GDH problem in Z, x Z, takes a list ( 4 : 0 ) , 4 0 , 1 ) , daily P i z ) ,
C(Yi2,
ail), 4aj1,
/3j2),
4Yj2,
Sjl))
as an inpiit, compiites by calling the group operation oracles and then outputs v(aSi1j1
+ PYi2j2,
PS(i1jz
+ i2jl)).
In addition to the group operation oracles, we allow the generic algorithm to call the discrete logarithm oracle. A discrete logarithm oracle for Z, x Z, takes the pair (4i1, i 2 ) , 4j1, j 2 ) ) of the representations as an inpiit and then outputs the integer n such that n i l = jl(mod p ) and n i 2 = ja(m0d p ) without any computation cost if such TL exists. However, several plaiisible behaviors of the DLOG oracle are considered if there is no integer 7~ siich that nil = jl(mod p ) and ni2 = jn(m0d p ) . Let iis call such an inpiit illegal. We enumerate several plausible modes of the discrete logarithm oracle as follows.
Mode 1 An oracle does nothing to illegal inputs. Then the generic algorithm provides an error message and the computation proceeds to the next step. Mode 2 An oracle provides wrong answers (for example: randomly chosen integers) to illegal inpiits while it retiirns correct answers if it is given legal inputs. Mode 3 An oracle makes the entire computation stop without any output when illegal inpiits are given. In oiir study of the generic reduction of the GDH problem, we adopt Mode 1. Clearly, the other modes reduce the computational efficiency and the probability that the algorithm retiirns a correct answer.
457
Theorem 3.1 Let A be a generic algorithm that solves the GDH problem in the group Z p x Z p where p i s a. prime. We m p p o s e that the parameters a , p, y,b satisfij the conditions (1) and (2). Suppose that A niake.s at most R queries to the group operation oracle and at wjwst L queries t o the discrete logarithm, oracle, respectively. T h e n the probability 0 that the algorithm A returns the correct answer i s at most
2 L ( R + 6 ) ( R + 5 ) (n+6)(n+5) + 4(R+6) P P P2 +
where the probability is taken over il, i2, j,, j 2 and a representation o. T h e pxpected number of queries t o the discrete logarithm oracle is at least
ps
1 2
2
_ - _
2(R+6)(R+5)
p(R+5)' 0
We postpone the proof until Section 3.3 and discuss here the consequences of Theorem 3.1. Let T denote the total running time of the algorithm A . Since T >_ L R, we have T 2 L and T 2 R. Suppose that Q is a constant. By Theorem 3.1, we have
+
(2T + 1)(T+ 6)(T+ 5 ) P
+
4(T + 6 ) P2
0.
Therefore, T is in f l ( f i ) = f l ( 2 9 ) . This implies that there exists no probabilistic polynomial time algorithm that breaks the GDH protocol even if the DLOG is efficiently solved. Now we suppose that the discrete logarithm oracle is not available. The expected number of queries to the group operation oracle for solving the GDH problem is derived from Theorem 3.1 by letting L = 0. An upper bound of the success probability Q is
( R t - 6 ) ( R + 5+ ) 4(R+6) p2
P
and hence, the expected number of queries to the group operation oracle is estimated as
n(m if Q is a constant.
(n+
We also remark that the siiccess probability grows in proportion to ( 3 ) P 6)(R+5)as L grows in the boiind given in Theorem 3.1. Since ($)(R+6)(R+ 5 ) is small provided that p is large enough, the DLOG oracle does not siih stantially help to break the GDH protocol.
458
3.3
Proof of Theorem 3.1
The following is useful.
Lemma 3.1 ([7])W h e n given a non-zero polynowiial F of total degree d in Z,[X1, X 2 , . . . ,Xk] ( p is a priwie), the probability that F ( z 1 ,$ 2 , . . . ,zk) = 0 for independently and randomly chosen elewients ~ 1 ~ x. .2. ,,zk of Z, is at most d / p . Proof of Theorem 3.1. In the proof, we simulate a generic algorithm by polynomials over Z,. At the beginning, we have six pairs of polynomials (F1,Hl)=
(LO),
= (cyx1,Px2), (F5,H5) = (OYl,PK), (F37H3)
(Fz,H2)
= (0, 1 ) 7
(F47H4)
= (yx2,6x1):
(F67
H6) = (yy2,6y1)
in Z,[Xl, X z , Y1,Yz]. Each pair corresponds to the representations (of the group elements)
respectively. We compute polynomials
of representations (of the group elements). When the miiltiplication oracle is called with the inpiits corresponding to the pairs (Fk,H k ) and (Fr,Hi), we compute polynomials Fi and Hi by setting Fi = Fk Fr and Hi = Hk Hl where i > k , 1. Similarly, when the inversion oracle is called with the inpiits corresponding to the pair ( F k ,H k ) , we compute polynomials Fi = -Fk and Hi = -Hk where i > k , 1. When the discrete logarithm oracle is called with the inpiits corresponding to (Fk,H k ) and ( f i ,H l ) , it returns R (E Z,) such that *rFk( i l ,i 2 , j 1 , j z ) = 4( i l ,i 2 , j 1 , j ~ )
+
+
459
and S H k ( i 1 , i 2 , j,,3’2)
= Hl(i1, i 2 , j,, j 2 )
if such s exists. In this case, we do not produce polynomials, but we get the information that i l , i 2 , j 1 , j 2 satisfy the equations sFk = Fi and s H k = Hi. We suppose that a generic algorithm has a chance to return the correct answer only when we find non-trivial equations satisfied by il, i 2 , 3’1, j p in our simulation of the computation. Before starting the proof, we discuss more on the hehavior of the discrete logarithm oracle. When the algorithm calls the discrete logarithm oracle for inputs
g(Fk(ii,i 2 , j i , 3’21,
H k ( i i , i 2 , j i , 3’2))
~ ( F l ( i 1 , 3’1, j 2 ) ,
H ~ ( i 1i27 , j17 j 2 ) ) ,
and there are three possible events. The first possible event is that the inputs are illegal, that is, the second input is not a power of the first. The second event is that the inputs are legal but the polynomials Fk, Hk, Fl, HLsatisfy the condition FkHl = HkFl (mod p) as a polynomial over Z,. The third event is that the inputs are legal and FkHi # HkFl (mod p ) . We show that information on il, i 2 , 3’1, 3’2 can be derived only in the last event. If the first event occurs, the discrete logarithm oracle does not return anything except for an error message. We have no chance to gain the information on i l , i 2 , 3’1, j 2 other than that the second is not a power of the first. We now discuss the second event. Let 11ssuppose that
FkHl - F1Hk = 0 (modp). First we note that since Fk, Hk, Fl, Hl are polynomials of at most degree 1 over Z,,they are units or irreducible polynomials. Since the polynomial ring Z,[X1, X 2 , Yl,Yz] is a nniqiie factorimtion domain, we have either 71Fk = Fl and 71Hk = H1 for some 7 1 E Z, or 71Fk = Hk and u F ~= Hl for some 71 E Z,. In the case that 71Fk = F1 and 71Hk = Hit the discrete logarithm oracle returns 71 E Z, to the inputs a(Fk,Hk) and o(F1,H L ) ,but we do not obtain any information on il, i2, j 1 , j 2 because the equations 7LFk = Fl and f1Hk = Hl are satisfied not only by i l , i 2 , j 1 , j 2 but also b y all 2 1 , z 2 , y1, y 2 E Z,. Next we suppose that u F ~= Hk and 7 1 4 = Hl. By the definition of Fk and Hk, we can write
460
where c1, c2, c3, e4, c5, c6 E Zp. Since uFk = Hk, we have 11c1
= c2
Because we are assuming the condition (Z), the matrix
f
uff
-6
is non-singular. Hence, we have c3
= c4 = c5 = c~ = 0 (modp)
and so both Fk and Hk are constants. It follows that Fi and Hi are constants. Therefore, the oracle call with an input u(Fk,H k ) and ~ ( F Hi) L , such that FkHl = FiHk does not provide any information on i l , i 2 , j 1 , 3’2. Consequently, we can obtain information on i l , i 2 , j1, j 2 only when the third event occurs and so we say that a discrete logarithm oracle query is meaningful if it is called in the third event, otherwise it is nonsensical. We now find an upper bound of the probability that the algorithm A returns the correct answer. There are three probable cases for a generic algorithm to return the correct answer. (Case 1) At least one discrete logarithm oracle query is meaningful. (Case 2) All discrete logarithm oracle queries are nonsensical and there are ( F k ,H k ) and (Fl,H I ) such that (Fk,H k ) # ( F I :Hi) as polynomials over Z P , bllt
Fk (ii ,i 2 , j i , j 2 ) = F’ (ii,i 2 , j i : j 2 ) and Hk(ii,i2,ji,j2) = Hl(i1,i2,ji7j2).
(Case 3) All discrete logarithm oracle queries are nonsensical and we have
( F k ( i i ,i2,311,3’2), for some
( F k ,H k ) .
H k ( i i , i 2 , j i , j 2 ) ) = ( f f 6 i l j i-4- PYi2j2, Pd(i13’2
+i2j1))
46 1
We find an upper boiind on the probability in each of (Case l), (Case 2) and (Case 3). (Case 1) The probability that a discrete logarithm oracle query is meaningful is bounded by the probability that for some k and 1 with FiHk: - FkHi # 0 and there is s in Z,satisfying sFk:(21,~ 2 , 2 / 1 , 2 / 2 )= Fi(g;i,2 2 , 2/1, 9 2 )
and c9Hk:(~1,22, 2/i, 92)
= H ~ ( x 1~ , 2
~ 12/2) ~ 1 ,
for randomly chosen 2 1 , 22, y1,1~;?in Z,. Then the probability is bounded by an upper boiind of the probability that for randomly chosen z1, 2 2 , 1 ~ 11, ~ 2in Z p we have
F L ( z 22, ~ , 1 ~ 12/2)Hk(z1, , 22,2/1,2/2) = Fk(21,22,2/1, 2/2)Hl(zl,22, ! / I , 1/21 since we have
.sFk:(zi,22, 2/1, 2/2)Hk:(Z1,22, 91, Y 2 ) = F l ( z i ,22, 2/i,92)wk:(21,z2,
2/21
and s F k ( ~ i 22, , 211, 2/z)Hk(x1,22,211,2/2) = F k ( Z 1 , 2 ~2/1, , 2/2)wt(z1,%, 91, 2/2).
On the other hand, the probability that
F i ( ~ i2 2, , 2 / 1 , 2/2)Hk(21,22,2/1,2/2)= Fk(21,22,2/1, ?/2)Hl(Zl,22, 91, 2/21 for randomly chosen 21, 2 2 , y 1 , ~ 2in Z,is bounded by 2 / p by Lemma 3.1 since the total degree of the polynomials FlHk - FkHl does not exceed two and FlHk - FkHl # 0 as a polynomial. It follows that the probability that at least one discrete logarithm oracle query is meaningful is bounded by 2
L ( R + 6 ) ( R + 5) x -. P (Case 2 ) Assume that ( F k ,H k ) # (Fl,H l ) . There are three cases: (i) Fk # F1 (we do not care whether Hk: # Hl or Hk = H l ) and (ii) Fk = Fl and Hk # Hl. In the case (i), the probability that Fk:(ii7i2,j1,3’2)= F ~ ( i i , i 2 , 3 ’ i , j 2 )
for randomly chosen 21, 22, y1, 92 in Z, is at most the probability that
by Lemma 3.1. Hence,
J’k:(ii,i 2 , j1,3’2) = Fl(i1,i2,3’1, j 2 )
462 and HL(il>i 2 , j l , 3’2) = mil ,1:2: 3’1,j 2 ) for some k,1 and randomly chosen $1,
2 2 , y1, y 2
in Z, is
(R+6)(R+5) 1 x 2 P in the case (i). Similarly the probability in the case (ii) is at most
(R+6)(R+5) 1 x -. P
2
Therefore the probability in (Case 2) is at most
+
+
(R 6)(R 5) P
(Case 3) By Lemma 3.1, an upper bound is
( R +6 ) x for the probability of the event that for randomly chosen 2 1 , 2 2 , we have
for some
(Pk, H k )
y1, yp
in Z,,
because the total degrees of the polynomials Fk
- ff6x1
YI+ Pyx2 y 2
and
HI, - P W l Y 2
+X 2 Y d
are two. We note that a S # 0, 0-y # 0 and PS # 0 (mod p ) by the condition (1). Consequently, the probability that a generic algorithm outputs the correct answer is at most
463
4
Example
Let p be a large prime. In Sec.3, we have discussed the security of our GDH protocol over G =< a > x < b > wlch that \a\ = Jb(= p . That is, ap = bP = 1 and any g E G is expressed as g = ax@' uniquely for some z and g. Theorem 3.1 implies that the GDH protocol over the G is secure in the generic algorithm model even if the DLOG is solvable. In this section, we show an example of G =< a > x < b > such that JaJ= JbJ= p . Let q and r be large primes such that p 1 q - 1 and p 1 r - 1. A
Let n = qr. Let g1 be a pth root of unity in modq and g2 be a pth root of unity in modr. For some c1 E Z; and c2 E Z;,choose a E Z,, and b E Z, such as follows.
a = g1 mod q,
Q
b = g? mod q,
Theorem 4.1 Ifclc2
# 1 mod p ,
= gil mod r ,
b = g2 mod
T.
then
Z:=x.
Proof. It is enough to show that
< a > U < b >= (1). On the contrary, suppose that 1 # d E< a > U < b > . This implies that there exist z E
iZ; and g E Zi such that
d = ax = by mod n.
Then we have g?
= g?" modq,
g$lx = gg mod r . Therefore, z c1z
c2ymodp, = gmodp. =
464
Hence. ~ 1 ~ 2= x 3:y 1 ~ mod
Now since 3: # 0 mod p and
3:
p.
# 0 mod p ! we have c1c2 =
1 mod p.
This is a contradiction.
5
On the parameter condition
The condition (2) in page 5 claiming that ( p d ) / ( a y ) is a qiiadratic nonresidue (mod p ) is essentially important. If (pd)/(ayy)is a qiiadratic residue (mod p ) , that is, that (pd)/(ay) = 11' for some I L , then there exists an attack against the GDH protocol by using the DLOG oracle.
Attack against the GDH protocol in which ( p d ) / ( a ~is) quadratic residue: We siippose that ( p b ) / ( a y )= IL' for some 11. Then the matrix
is singular, and hence, the system of equations
has a nontrivial solution. Suppose that (s, t ) = ( q ,c 4 ) is a nontrivial solution. We are given group elements a, b, aailbPiz, ciyiab6i1 and so we can compute ( a a i i @ i ~ ) c B( a y i ~ b 6 ic4 ~
)
- ac&l+c4~i2 bc~Pi2+c46il
By the definition of c3, c 4 , we have we have obtained
+ ~ 4 y i 2 =) c 3 p i 2 + ~ 4 6 2 1 .Hence,
i~(c3a21
(ab")c3ail+c4yi2
We compute ab" and then call the discrete logarithm oracle with the inputs ( a b " ) C 3 a i 1 + C 4 ~and i2 ah". The oracle returns hl = c3aZ1 ~ 4 ~ 2 We 2 . then do a similar process with another ($, ck) and obtain h: = c!pil c i y 2 2 . Then we may be able to obtain il and 22. Likewise the adversary can obtain j l and j 2 .
+
+
465
References [I] W . D f i e and M.E.Hellman, New directions in cryptography, IEEE Transactions on Information Theory, 22 (1976) 644-654. [2] T.ElGama1, A piiblic key cryptosystem and a signatiire scheme based on discrete logarithms, IEEE Transactions on Information Theory, 31 (1985) 469-472.
[3] N.Koblitz, Elliptic ciirve cryptosystems, Math. Comp. 48 (1987) 203209.
[4]N.Koblitz, Hyperelliptic cryptosystems, 3. Cryptology, 1 (1989) 139150.
[5] U.M.Maiirer and S.Wolf, Lower bounds on generic algorjthms in groiips, Advances in Cryptology (Eurocrypt'98) Lecture Notes in Computer Scienc, Vol 1403, Springer-Verlag (1998) 72-84. [6] V.Miller, Uses of elliptic curves in cryptography, Advances in Cryptology (Crypto'85) Lectiire Notes in Compiiter Science, Vol 218, SpringerVerlag (1986) 417-426.
[7] J.T.Schwartz, Fast probabilistic algorithms for verification of polynomial identities, J. ACM 27 (4)(1980) 701-717. [8] V.Shoup, Lower bounds for discrete logarithms and related problems, Advances in Cryptology (Eiirocrypt'97) Lectiire Notes in Compiiter Science, Vol 1233, Springer-Verlag (1997) 256-266.
466
A NOTE ON RADEMACHER FUNCTIONS AND COMPUTABILITY * MARIKO YASUGI Faculty of Science, Kyoto Sangyo University, Motoyama, Kamigamo, Kita-ku, Kyoto, 603-8555, Japan
E-mail:
[email protected] MASAKO WASHIHARA Faculty of Science, Kyoto Sangyo University Motoyama, Kamigamo, Kita-ku, Kyoto, 603-8555, Japan E-mail: wasiharaOcc.kyoto-su.ac.jp We will speculate on some computational properties of the system of Rademacher The n-th Rademacher function is a step function on the interval functions [O, I ) , jumping at finitely many dyadic rationals of size and assuming values (1, -1) alternatingly.
{a}.
&
Keywords Rademacher functions, computability problems of discontinuous function, LP[O,11-space, computability structure, Cy-law of excluded middle, limiting recursion
1
Introduction
In [6], Pour-El and Richards proposed to treat computational aspects of some discontinuous functions by regarding them as points in some appropriate function spaces. It will then be of general interest to find examples of discontinuous functions which can be regarded as computable in Pour-El and Richards approach. We are working on the integer part function [z] in [lo] in this respect. It is not difficult to claim that it is a computable point in a function space. It is also important t o find out what sort of a principle, beside recursive algorithm, is necessary in evaluating the value of such a function at a possible point of discontinuity. In [lo], we are investigating this problem as well. In this article, we report some facts on a sequence of discontinuous functions which is computsble as a sequence of points in a function space. Let {&(z)} be the sequence of Rademacher functions, that is, for each n, &(z) is defined on [0, l),is discontinuous at the dyadic rational numbers of the form and assumes the values 1 and -1 alternatingly.
&,
'This work has been supported in part by Science Foundation No.12440031.
467
For a real number x,we call a pair ( { T ~ } a, ) an information on x if { r m }is a sequence of rational numbers which converges to x and a is a function from natural numbers t o natural numbers which serves as a modulus of convergence (of { T m } t o z). We will discuss computational aspects of this function system from two viewpoints. First, it is a computable sequence of points in the function space Lp[O, 11 (Section 2). Next, we would like to see how one might evaluate the value &(z) for a single computable number x (and for all n) (Section 3). It turns out that {q5n(x)} has a “weak computation in” the following sense: input an information on x, say ( { r m } a, ) , there is a program t o output a sequence of rational numbers, say { s n m } ,which converges to &(x). If x is a computable number, and its information ( { r m } , a )is recursive, then the output { s n m } is a recursive double sequence. In order to evaluate a modulus of convergence for { s n m } ,one has to apply the Cy-law of excluded middle (denoted by Cy-LEM), that is, we assume a formula of the form 3 x R ( x )VVx-R(x) for a recursive R. There is a counter-example for a sequence of computable reals in [0, l), say {xm}, for which the sequence of values { & ( x ~ ) is} not ~ computable, even for a single n. This shows that the Cy-LEM cannot be replaced by a recursive excluded middle. As a functional analysis treatment, we can show that {&} is a computable sequence of elements in the Banach space Lp[O, 11 for any computable real number p such that 1 5 p < CQ. See also [6],[B], [9] and [lo] for functional analysis approaches to discontinuous functions. For a quick review of computability properties in the function spaces, one can also refer t o [12]. The mathematical significance of the Rademacher function system among various discontinuous functions is that it is a subsystem of the Walsh function system, and the latter plays an important role in analysis. (We have consulted [l],[2] and [13] for Rademacher and Walsh functions.) It is only a matter of routine t o extend our discussion to the Walsh function system. For some discontinuous functions, if one changes the topology of the domain of a function, then it is possible that it becomes continuous. For that reason, it will be worthwhile to investigate computability structures in abstract topological spaces and metric spaces. (See, for example, [4], [7] and
WI.) 2
Rademacher functions and computability in a Banach space
Rademacher functions are step functions from [0, 1) to {-1,1} defined below. Definition 2.1 (Rademacher functions) Let n denote 0 , 1 , 2 , 3 , . . .. Then the
468
n t h Rademacher function
4"(z) is defined as follows. 4o(z) = 1,
2
E [O, 1)
where n 2 1 and i = 0 , 1 , 2 , .. . ,2"-' - 1. The sequence {&(x)) will be called the system of Rademacher functions, or the Rademacher system. A Rademacher function &(x) is a step function which takes a value 1 or -1, and jumps at binary fractions for k = 1 , 2 , . . . ,2" - 1. It is right continuous with left limit. As a sequence of functions, (&} is eventually constant at each binary point. Namely, let x be a binary point $, where n is the first number with respect t o which x can be expressed as such. Then k is an odd number, and &(x) = -1. For any m > n, x = for an 1, and this implies that q ! ~ ~ (=x 1. ) We will show that the function system ( 4 % )is endowed with some kind of computational attributes. Note A traditional computable real-function (on a compact interval, say [0,1]) is assumed to satisfy two conditions: it preserves sequential computability and it is uniformly continuous with a recursive modulus of continuity. Such a function is sometimes called G-computable, meaning Grzegorczykcomputable. We will first show that (4") is a computable sequence of points in a Banach space. Let ( X , 11 11) be a Banach space. According to Section 2 of Chapter 2 in [6], a family of sequences from X , say S, is called a computability structure of the space ( X ,11 11) if it satisfies the follwoing three properties: S is closed with respect to recursive linear combinations and effective limits, and the norms of a sequence from S form a computable real number sequence. A sequence in S is called S-computable, or simply computable. A sequence in S is called an efective generating set if it is a generating set (in the classical sense) for the space X. For L P [ O , 11, where p 2 1 is a computable real number, Pour-El and Richards, in Section 3 of Chapter 2 of [6], proposes to define computable sequences as follows. A sequence {fn} from LP[O, 11 is said to be LP-computable if there exists a G-computable double sequence of functions {gnk} which satisfies that J1gnk - fnllp converges t o 0 as k + 00, effectively in n and Ic, where Ilfll, = I f .1;)" Efective convergence means a convergence with a recursive modulus of convergence with respect t o the norm 1) !Ip.
&
(Jt
469
Let Sp be the computability structure for L P [ O , 11consisting of computable sequences as defined above. In [6], four kinds of effective generating sets for S p are listed: the sequence of monomials { 1,z, x 2 ,. . .}, an enumeration of all piecewise linear functions with corners of rational coordinates, an enumeration of trigonometric polynomials and an enumeration of all step functions with rational values and rational jump points. We will utilize the last generating set. Let us denote an effective enumeration of such functions by { e l } . An el contains information on the number of (finitely many) jump points, the jump points, and the corresponding values. It is a general practice that one regards a function defined on [0, 1) which is integrable over this interval as an L P [ O , 11 function, since it is equal to an L P [ O , 11 function almost everywhere. Then, for each n, 4, can be regarded as an element of P [ O , 11. Now, (4") can be obtained as a recursive subsequence of { e l } (which we call a re-enumeration of { e l } ) , and hence it is an L P [ O , 11-computable sequence. Re-enumeration can be obtained by examining el for each 1 if the number of jump points el represents is 2" - 1, if the jump points 1 2 k 2"-1 ,F ,. ' . , F ,. . ' , 7 and , finally if the corresponding values are are F 1, -1,1, - 1 , . . . , 1 , -1. We have thus obtained the Theorem 1 (LP[O,l]-computability) Let p be a computable real number such that 1 5 p < 00. The Rademacher function system ( 4 " ) is a computable sequence in the space L P [ O , 11. Having shown that ( 4 " ) is a computable sequence of points in a function space, we then question how one might evaluate the values {&(z)} for a computable z. We will observe this problem in the next section.
3
Computation within the Cy-law of excluded middle
We will introduce a weak notion of (pointwise) computability of a function, and show that the Rademacher functions form a sequence of weak computability. Let z be a real number, let { r m } be a sequence of rational numbers and let Q! be a number-theoretic function (from natural numbers t o natural numbers). When { r m } converges to z, Q is called a modulus of convergence (of { r m } to z) if the following holds. m 2 a ( p ) implies
Iz - rI,
5
The pair ( { r m }a, ) is then called an information on
1
2p
2.
470
Definition 3.1 (Weak computation) (1) We will temporarily call an algorithm to evaluate a function value f(z),say P , a weak computation if the following holds: given an information on z, say ({rm},a ) ,P outputs a sequence of rational numbers {sm} (from the information ({rm}, a ) )which (classically) converges t o f (z) . (2) The definition of weak computation P can be extended t o a sequence of functions, say {fn(z)}as follows: given an information ( { T ~ } a, ) on z, P outputs a sequence of rational numbers { S , , ~ } , which converges t o {fn(z)}, as m tends t o 00 for each n. Theorem 2 (Weak computation of {&}) The Rademacher function system (4,) has a weak computation (cf. (2) of Definition 3.1). Proof We will describe an algorithm Po which does the following job: input an information on z in [O,1), say ( { r m } , a )Po , outputs a sequence of rational numbers {snm}, which converges t o {&(z)}. Po is determined as a composition of several algorithms as below. (For simplicity, we assume x > 0. Amendment for the case z = 0 inclusive will be explained later.) 1 First, we define an algorithm P2 which, given ( { T ~ } a, ) ,outputs an integer holds. Ic (for each n) such that s) when it is true. In our case, at stage 1, P I checks whether T ~ ( _ 1 and 1 is odd. If n < j , then a k satisfying <x < can be computed. The value &(z) can be determined by &(x) = 1 if k is even and = -1 if k is odd. If n = j, then &(x) = -1. If n > j, then &(x) = 1. Notice that three cases above are recursive. 2) x is not a dyadic rational number. Then by the algorithm Pz, one can <x < find a Ic, satisfying effectively in n. The value of &(z) can be determined to be 1 or -1 according as k, is even or odd.
&
6,
%
Amendment When x = 0 is included, one may suspect that one has to start
F.
with k = -1 in searching for a k such that $ < x < We can avoid this complication by assuming that any computable number in the interval [0, 1)
472
can be effectively approximated by a computable sequence of positive rational numbers. The reason why a modulus of convergence cannot be determined for the sequence { s n p } in the proof of Theorem 2 lies in Fact 3. Namely, < z < (Case 1) and 5 2 < (Case 2). These cases can be alternatively expressed by the following conditions: T ~ (