Grammars and Automata for String Processing From Mathematics and Computer Science to Biology, and Back
TOPICS IN COMPUTER MATHEMATICS A series edited by David J.Evans, Loughborough University, UK
Volume 1
PRECONDITIONING METHODS: THEORY AND APPLICATIONS Edited by David J.Evans
Volume 2
DISCRETE MATHEMATICS FOR ENGINEERS By O.P.Kuznetsov and G.M.Adel’son Vel’skii
Volume 3
SYSTOLIC ALGORITHMS Edited by David J.Evans
Volume 4
PRECONDITIONING ITERATIVE METHODS Edited by David J.Evans
Volume 5
GRAMMAR SYSTEMS: A GRAMMATICAL APPROACH TO DISTRIBUTION AND COOPERATION By Erzsébet Csuhaj-Varjú, Jürgen Dassow, Jozef Kelemen and Gheorghe Paun
Volume 6
DEVELOPMENTS IN THEORETICAL COMPUTER SCIENCE: Proceedings of the 7th International Meeting of Young Computer Scientists, Smolenice, 16–20 November 1992 Edited by Jürgen Dassow and Alica Kelemenova
Volume 7
GROUP EXPLICIT METHODS FOR THE NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL EQUATIONS By David J.Evans
Volume 8
GRAMMATICAL MODELS OF MULTI-AGENT SYSTEMS By G.Paun and A.Salomaa
Volume 9
GRAMMARS AND AUTOMATA FOR STRING PROCESSING: FROM MATHEMATICS AND COMPUTER SCIENCE TO BIOLOGY, AND BACK Edited by Carlos Martín-Vide and Victor Mitrana
Essays in Honour of Gheorghe Păun
Edited by
Carlos Martín-Vide Rovira i Virgili University, Tarragona, Spain and Victor Mitrana University of Bucharest, Romania
Taylor & Francis Group, London and New York
First published 2003 by Taylor & Francis 11 New Fetter Lane, London EC4P 4EE Simultaneously published in the USA and Canada by Taylor & Francis Inc, 29 West 35th Street, New York, NY 10001
Taylor & Francis is an imprint of the Taylor & Francis Group
© 2003 Taylor & Francis
Printer’s note: This book was prepared from camera-ready copy supplied by the authors. Printed and bound in Great Britain by TJ International Ltd, Padstow, Cornwall. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. Every effort has been made to ensure that the advice and information in this book is true and accurate at the time of going to press. However, neither the publisher nor the authors can accept any legal responsibility or liability for any errors or omissions that may be made. In the case of drug administration, any medical procedure or the use of technical equipment mentioned within this book, you are strongly advised to consult the manufacturer’s guidelines.
British Library Cataloguing in Publication Data: a catalogue record for this book is available from the British Library.
Library of Congress Cataloging in Publication Data: a catalog record for this book is available from the Library of Congress.
ISBN 0-415-29885-7
CONTENTS

Contributing Authors viii
Preface ix
Gheorghe Păun and the Windmill Curiosity (Marloes Boon-van der Nat and Grzegorz Rozenberg) 1

I. GRAMMARS AND GRAMMAR SYSTEMS
Animal Farm: An Eco-Grammar System (Maurice H.ter Beek) 9
Towards a Brain Compatible Theory of Syntax Based on Local Testability (Stefano Crespi Reghizzi and Valentino Braitenberg) 17
The Power and Limitations of Random Context (Sigrid Ewert and Andries van der Walt) 33
Parsing Contextual Grammars with Linear, Regular and Context-Free Selectors (Karin Harbusch) 45
Linguistic Grammar Systems: A Grammar Systems Approach for Natural Language (M.Dolores Jiménez-López) 55
Multi-Bracketed Contextual Rewriting Grammars with Obligatory Rewriting (Martin Kappes) 67
Semi-Top-Down Syntax Analysis (Jaroslav Král and Michal Žemlicka) 77
Descriptional Complexity of Multi-Parallel Grammars with Respect to the Number of Nonterminals (Alexander Meduna and Dušan Kolár) 91
On the Generative Capacity of Parallel Communicating Extended Lindenmayer Systems (György Vaszil) 99

II. AUTOMATA
Cellular Automata and Probabilistic L Systems: An Example in Ecology (Manuel Alfonseca, Alfonso Ortega and Alberto Suárez) 111
On Iterated Sequential Transducers (Henning Bordihn, Henning Fernau and Markus Holzer) 121
Distributed Real-Time Automata (Cătălin Dima) 131
On Commutative Directable Nondeterministic Automata (Balázs Imreh, Masami Ito and Magnus Steinby) 141
Testing Non-Deterministic X-Machines (Florentin Ipate and Mike Holcombe) 151
Note on Minimal Automata and Uniform Communication Protocols (Galina Jirásková) 163
On Universal Finite Automata and a-Transducers (Manfred Kudlek) 171
Electronic Dictionaries and Acyclic Finite-State Automata: A State of the Art (Denis Maurel) 177
A New Recursive Incremental Algorithm for Building Minimal Acyclic Deterministic Finite Automata (Bruce W.Watson) 189

III. LOGICS, LANGUAGES AND COMBINATORICS
Syntactic Calculus and Pregroups (Wojciech Buszkowski) 203
Homomorphic Characterizations of Linear and Algebraic Languages (Virgil E.Căzănescu and Manfred Kudlek) 215
Using Alternating Words to Describe Symbolic Pictures (Gennaro Costagliola, Vincenzo Deufemia, Filomena Ferrucci, Carmine Gravino and Marianna Salurso) 225
What Is the Abelian Analogue of Dejean’s Conjecture? (James D.Currie) 237
Threshold Locally Testable Languages in Strict Sense (Pedro García and José Ruiz) 243
Characterizations of Language Classes: Universal Grammars, Dyck Reductions, and Homomorphisms (Sadaki Hirose and Satoshi Okawa) 253
On D0L Power Series over Various Semirings (Juha Honkala) 263
The World of Unary Languages: A Quick Tour (Carlo Mereghetti and Giovanni Pighizzini) 275
A New Universal Logic Element for Reversible Computing (Kenichi Morita) 285
Church-Rosser Languages and Their Relationship to Other Language Classes (Gundula Niemann and Friedrich Otto) 295
Hiding Regular Languages (Valtteri Niemi) 305
On the Difference Problem for Semilinear Power Series (Ion Petre) 317
On Spatial Reasoning via Rough Mereology (Lech Polkowski) 327
Languages and Problem Specification (Loutfi Soufi) 337
The Identities of Local Threshold Testability (Avraam N.Trakhtman) 347

IV. MODELS OF MOLECULAR COMPUTING
Soft Computing Modeling of Microbial Metabolism (Ruxandra Chiurtu, Alexandru Agapie, Manuela Buzoianu, Florin Oltean, Marius Giuclea and Roxana Vasilco) 359
DNA Hybridization, Shifts of Finite Type, and Tiling of the Integers (Ethan M.Coven and Nataša Jonoska) 369
Generalized Homogeneous P-Systems (Rudolf Freund and Franziska Freund) 381
Crossing-Over on Languages: A Formal Representation of Chromosomes Recombination (Lucian Ilie and Victor Mitrana) 391
Restricted Concatenation Inspired by DNA Strand Assembly (Carlos Martín-Vide and Alfonso Rodríguez-Patón) 403
DNA Tree Structures (George Rahonis) 413
CONTRIBUTING AUTHORS
Agapie, Alexandru; Alfonseca, Manuel; Beek, Maurice H.ter; Boon-van der Nat, Marloes; Bordihn, Henning; Braitenberg, Valentino; Buszkowski, Wojciech; Buzoianu, Manuela; Căzănescu, Virgil E.; Chiurtu, Ruxandra; Costagliola, Gennaro; Coven, Ethan M.; Crespi Reghizzi, Stefano; Currie, James D.; Deufemia, Vincenzo; Dima, Cătălin; Ewert, Sigrid; Fernau, Henning; Ferrucci, Filomena; Freund, Franziska; Freund, Rudolf; García, Pedro; Giuclea, Marius; Gravino, Carmine; Harbusch, Karin; Hirose, Sadaki; Holcombe, Mike; Holzer, Markus; Honkala, Juha; Ilie, Lucian; Imreh, Balázs; Ipate, Florentin; Ito, Masami; Jiménez-López, M.Dolores; Jirásková, Galina;
Jonoska, Nataša; Kappes, Martin; Kolár, Dušan; Král, Jaroslav; Kudlek, Manfred; Martín-Vide, Carlos; Maurel, Denis; Meduna, Alexander; Mereghetti, Carlo; Mitrana, Victor; Morita, Kenichi; Niemann, Gundula; Niemi, Valtteri; Okawa, Satoshi; Oltean, Florin; Ortega, Alfonso; Otto, Friedrich; Petre, Ion; Pighizzini, Giovanni; Polkowski, Lech; Rahonis, George; Rodríguez-Patón, Alfonso; Rozenberg, Grzegorz; Ruiz, José; Salurso, Marianna; Soufi, Loutfi; Steinby, Magnus; Suárez, Alberto; Trakhtman, Avraam N.; Vasilco, Roxana; Vaszil, György; Walt, Andries van der; Watson, Bruce W.; Žemlicka, Michal
PREFACE
The present book contains a collection of articles, clustered in sections, that are directly or indirectly related to areas where Gheorghe Păun has made major contributions. The volume opens with an essay discussing a conjecture about windmill movements formulated by Gheorghe during one of his visits to Leiden, The Netherlands. This is a good example, among many others, of his curiosity for all kinds of games and problems, not just mathematical ones.
The first section, Grammars and Grammar Systems, includes a number of papers related to an important concept of the theory of formal languages: that of grammar. Some results in “classical” areas of grammar theory are presented: the computational power of grammar systems with Lindenmayer components—Gheorghe Păun is one of the inventors of that framework—as well as a few practical applications of the theory (a grammar system approach to the study of natural language, and how an eco-grammar system might model George Orwell’s ‘Animal Farm’), new variants of contextual grammars, descriptional complexity (a survey of all the results regarding the number of nonterminals in multi-parallel grammars), parsability approaches for contextual grammars and context-free grammars with left recursive symbols, and the power and limitations of regulated rewriting in image generation.
The other classical part of formal language theory, that of Automata, is the subject of the second section. Several types of automata (cellular automata, directable automata, X-machines, finite automata, real-time automata) are investigated in search of new theoretical properties and some potential applications in software engineering, linguistics and ecology.
A large number of contributions are grouped in the third section, Logics, Languages and Combinatorics. Homomorphic characterizations of the language classes in the Chomsky hierarchy, languages for picture descriptions, results regarding semilinear power series and D0L power series, unary languages, relationships between different classes of languages and the languages associated with rewriting systems are presented. Other contributions consider logical aspects: for instance, a simple logic element,
called “rotary memory”, is shown to be reversible and logically universal, while some topological structures of spatial reasoning are considered in the framework of rough mereology.
The last section, Models of Molecular Computing, is dedicated to a very hot and exciting topic in computer science: computing with molecules. Both experiments and theoretical models are presented. Amongst other things, it is shown how a mathematical model is able to cope with metabolic reactions in bacteria, and also some relationships between the entropies of DNA-based computing models (Adleman’s model, splicing systems) and tiling shifts. Furthermore, operations inspired by gene recombination and DNA strand assembly are considered as formal operations on strings and languages.
All the papers are contributed by Gheorghe Păun’s collaborators, colleagues, friends and students from five continents, who wanted to show their recognition to him in this way for his tremendous intellectual work, on the occasion of his 50th birthday. We have collected 40 papers by 69 authors here. (Another set of 38 papers by 75 authors was recently published: C.Martín-Vide & V.Mitrana, eds. (2001), Where Mathematics, Computer Science, Linguistics and Biology Meet. Kluwer, Dordrecht.) The subtitle of the present volume intends to reflect the sequence of Gheorghe Păun’s scientific interests over a period of time. Summing up, this book makes an interdisciplinary journey from classical formal grammars and automata topics, which still constitute the core of mathematical linguistics, to some of their most recent applications, particularly in the field of molecular biological processing.
The editors would like to emphasize Loli Jiménez’s technical help in the preparation of the volume, Gemma Bel’s contribution to it in its final stage, as well as the publisher’s warm receptiveness to the proposal from the beginning. We hope this book will be a further step in the revival of formal language theory as a highly interdisciplinary field, and will be understood as an act of scientific justice and gratitude towards Gheorghe Păun.
Tarragona, November 2000
Carlos Martín-Vide
Victor Mitrana
Gheorghe Păun and the Windmill Curiosity
Marloes Boon-van der Nat, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, The Netherlands
Grzegorz Rozenberg, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, The Netherlands, and Department of Computer Science, University of Colorado at Boulder, U.S.A.
Gheorghe is a frequent visitor in Leiden—by now we are really very good friends, and we know him pretty well. The most characteristic feature of Gheorghe is his enormous (and very contagious) enthusiasm for research which comes from his natural curiosity for everything around. This curiosity is best illustrated by the following. We have had certainly hundreds of visitors to our institute in the past, but none of them has ever asked any question about the most symbolic feature of the Dutch landscape: windmills. One day, after a standard ride from Bilthoven to Leiden (passing several windmills on the way), Gheorghe commented on something unusual about the windmills: their wings always turn counterclockwise. He immediately wanted an explanation (because once he gets an explanation of anything, he can formulate a theory of it!).
This has turned out not to be an easy question to answer. We have passed it to many of our colleagues in Holland, but nobody knew the answer. In this way, an interesting question by Gheorghe has led us to some interesting research. Through this, our knowledge of something that is as Dutch as possible has increased considerably. We have understood something very Dutch (something that, in the first place, we should have already known) only because Gheorghe has asked a question. It is this posing of questions and searching for answers that makes Gheorghe such a good scientist. We feel that explaining to Gheorghe possible reasons for the counterclockwise turning of the wings of Dutch windmills will be very much appreciated by him (we are just curious how many interesting questions Gheorghe has posed during the 50 years of his life).
First of all, in order to turn this question into a truly scientific problem, we gave it a name: the windmill wings invariant problem, abbreviated as the WWI problem. It has turned out to be a genuinely interdisciplinary problem, and our solution involves a combination of historical, ergonomic, and engineering arguments. From the historical point of view, one has to realize that before the windmill was invented, the hand-operated stone-grinding mill was used. This is shown schematically in Fig. 1. Now comes the very important ergonomic argument: about 90% of people are right-handed (this must have also been true at the time that hand-operated stone-grinding mills were used), and the natural direction for the right hand to turn the handle during long periods of grinding is anticlockwise, as indicated in Fig. 1.
Figure 1: A handmill.
Since turning the grinding stones in this way for long periods of time must have been really exhausting, people sought help from the forces of nature. A scheme for converting the force of wind to turn the grinding stones, illustrated in Fig. 2, was thus invented—the engineering instinct of mankind had manifested itself once again! Because of the original (hand-operated) scheme,
it was natural to use the new scheme in such a way that the grinding stone still turned in the same direction, hence anticlockwise. This meant that the vertical “gear” in the “gear transmission” would turn clockwise if observed from the “inside” (observer a in Fig. 2). Consequently, the same gear observed from the outside (observer b in Fig. 2) turns anticlockwise, as will the wings when observed from outside (observer c in Fig. 2). Thus, here we are (if we consider the construction from Fig. 2 as the precursor of the current windmills): “THE WINGS OF THE WINDMILL TURN ANTICLOCKWISE”
Figure 2: From handmill to windmill.
Soon afterwards, someone must have observed that the force used to turn the grinding stone can be used to turn several grinding stones (it is important to note here that efficiency arguments were known long before computer science became interested in the efficiency of algorithms!). By using just one more gear wheel, one could now power several grinding stones, each having its own (“small”) gear wheel powered by (turned by) the additional (“big”) gear wheel. In this way we get the standard construction of windmills illustrated in Fig. 3. Note that now, although nothing has changed from the outside (the wings still turn anticlockwise), the situation inside the mill has changed quite dramatically…each individual grinding stone now turns in the clockwise direction! Hence the main trace of the original motivation has been wiped out! This made our research even more challenging. We are sure that Gheorghe, after reading the above solution to the WWI problem, will right away formulate a language theoretic model of the situation. But then…this is the effect that we expected from our paper
Figure 3: The inside of a modern windmill.
anyhow, and in this way our contribution becomes a gift for Gheorghe of the sort that he likes the most. Let us then conclude this article with (anticlockwise!) birthday wishes from the two of us and the rest of the Leiden group:
P.S. George: please notice that two membranes suffice to express the best birthday wishes from all of us. But to this aim one has to use words rather than arbitrary objects.
Acknowledgements
The authors are indebted to Maurice ter Beek, our local graphics wizard, for helping with the illustrations, and to “The Dutch Windmill” Society for providing a lot of valuable information.
I GRAMMARS AND GRAMMAR SYSTEMS
Animal Farm: An Eco-Grammar System
Maurice H.ter Beek, Leiden Institute for Advanced Computer Science, Leiden University, The Netherlands
Abstract. An eco-grammar system is used to model George Orwell’s Animal Farm: A Fairy Story.
1 Introduction
Eco-grammar systems were originally introduced in [3] as a framework motivated by Artificial Life (cf. [8]) and able to model life-like interactions. Subsequently, many variants were introduced and studied (for an extensive survey cf. [2]), mostly with a strong focus on their generative power. The articles in [15] give a nice overview.
This paper provides a glimpse of the modelling power of eco-grammar systems through a rather enhanced eco-grammar system that models George Orwell’s acclaimed Animal Farm: A Fairy Story. I decided to write the paper for many reasons. To begin with, I hope to inspire those working on eco-grammar systems to pursue further research with a perspective other than generative power in mind. A return to the original motivation of eco-grammar systems calls for them to be used to model issues stemming from Artificial Life. Even though I merely provide a humorous example in the style of Jürgen Dassow’s eco-grammar system modelling MIT’s Herbert as a can collecting robot ([6]), it is my belief that eco-grammar systems can also model more scientifically
challenging issues from Artificial Life. Valeria Mihalache set a good example in this direction in [10] by using eco-grammar systems to simulate games.
This article also hints that eco-grammar systems can be used to generate stories. One of the postulates of the multidisciplinary research field Narratology—the study of narrative structure (cf. [1])—is that stories within the same literary genre follow a common pattern. In [16], Vladimir Propp interpreted a hundred fairy tales in terms of their smallest narrative units, so-called “narratemes”, and found that they all displayed the same narrative structure. This led to the field Semiotic Narratology, at the crossroads of Narratology and Semiotics—the science of signs (cf. [11]). This field focuses heavily on minimal narrative units which constitute the so-called “grammar of the plot” or “story grammars”. Closer to home, and more recently, Solomon Marcus and others associated formal languages with many Romanian fairy tales (cf. [9]). Hierarchies known from Formal Language Theory consequently enabled certain fairy tales to be classified as more sophisticated than others (cf. [7]). The above considerations show that it would be interesting to model more stories by eco-grammar systems and to search for structural equivalences between them. The same naturally holds for games.
Finally, I celebrate Gheorghe Păun’s 50th birthday by bringing together two of his “hobbies”. For it is Gheorghe who is the (co-)author of many of the papers on eco-grammar systems—including the one that introduced the framework—and it is Gheorghe who wrote a sequel ([14]) to George Orwell’s other classic novel: Nineteen Eighty-Four ([13]).
2 Eco-Grammar Systems
I assume the reader to be familiar with Formal Language Theory (otherwise cf. [17])—in particular with eco-grammar systems (otherwise cf. [2])—and to have read George Orwell’s Animal Farm: A Fairy Story (otherwise read [12]). Since the specific eco-grammar system used here is based on variants that are well known from the literature—e.g. simple eco-grammar systems ([5]) and reproductive eco-grammar systems ([4])—the definition is given with little intuitive explanation.
An eco-grammar system (of degree n, n≥0) is a construct Σ=(E, A1, A2, …, An), where:
• E=(VE, PE) is the environment, where:
• VE is a finite alphabet, the environmental alphabet, and
• PE is a finite and complete set of P0L rewriting rules of the form x→y with x ∈ VE and y ∈ VE*, the environmental evolution rules,
and for 1ⱕiⱕn: •
is a multiset of animals of the i-th type, 1ⱕjⱕki, where: • •
•
are finite alphabets, the alphabets of the animals of the i-th type, is the reproduction symbol, and is the death symbol, are finite and complete sets of P0L rewriting rules of the form x→y with and , united with pure contextfree productions of the form with , p≥2, and 1ⱕfⱕp, the evolution rules of the animals of the i-th type, and is a finite set of pure context-sensitive productions of the form α→ß for the action rules of the animals of the i-th type.
A state of Σ is a construct σ=(wE, W1, W2, …, Wn), where wE is a word over VE and each Wi, 1≤i≤n, is a multiset of symbols wi,j, 1≤j≤ki. This wE is the environmental state and these wi,j are the states of the currently existing animals of the i-th type.
The state of Σ changes when the state of every animal and the environment evolves at each position, except for those positions where animals perform actions. Note that the application of a production containing the reproduction symbol results in an increase in the number of symbols in the multisets of animals. A state σ=(wE, W1, W2, …, Wn) of Σ derives a state σ′=(w′E, W′1, W′2, …, W′n) in one step, written as σ ⇒ σ′, iff:
• wE=x1α1x2α2…xsαsxs+1 and w′E=y1β1y2β2…ysβsys+1, where:
• each αl→βl, 1≤l≤s, is an action rule performed by some currently existing animal, every animal performing at most one action, and
• y1y2…ysys+1 is the result of applying rules from PE to x1x2…xsxs+1,
and for 1≤i≤n:
• W′i is the multiset obtained from Wi by, for each 1≤j≤ki, either:
• putting wi,j in W′i if that animal performed an action in the environment, or
• putting y in W′i if an evolution rule wi,j→y from Pi was applied, or
• putting the offspring states y1, y2, …, yp in W′i if a production containing the reproduction symbol with left-hand side wi,j was applied, with yf, 1≤f≤p, as above.
Given Σ and an initial state σ0, the set of state sequences of Σ is defined by Seq(Σ, σ0)={σ0σ1σ2… : σt ⇒ σt+1 for all t≥0}. This set thus contains the evolution stages (a.k.a. the developmental behaviour) of both the environment and the animals.
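The definition is easier to grasp operationally. The sketch below is a deliberately simplified rendering of one derivation step; the function names, the single-character animal states, the random rule choice, and the treatment of reproduction are my own conventions, not part of the formal model. It shows the division of labour: animals first try to act on the environment, and whatever does not act evolves in parallel by P0L rules.

```python
# Minimal, illustrative sketch of one eco-grammar derivation step.
# Names and conventions are mine; reproduction is not modelled here.
import random

def evolve(word, rules):
    """Rewrite every symbol of `word` in parallel by one applicable rule
    (the rule sets are complete, so each symbol has at least one rule)."""
    return "".join(random.choice([y for x, y in rules if x == s]) for s in word)

def step(env, animals, env_rules):
    """(environment word, [(state, P_i, R_i)]) -> next environment, animals."""
    next_animals = []
    for state, evo, act in animals:
        acted = False
        if state not in ("?", "†"):          # absent or dead animals do not act
            for lhs, rhs in random.sample(act, len(act)):
                if lhs in env:               # perform one action rule on w_E
                    env = env.replace(lhs, rhs, 1)
                    acted = True
                    break
        # an acting animal keeps its state; the others evolve by their P0L rules
        next_animals.append((state if acted else evolve(state, evo), evo, act))
    # simplification: the whole word, including action results, evolves by P_E
    return evolve(env, env_rules), next_animals
```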
3 Animal Farm
In this section, I present an eco-grammar system A modelling Animal Farm: A Fairy Story.
The environment E of A consists of a house—originally named Manor Farm—and plenty of corn, hay, and straw. Hence VE={M, c, h, s} can serve as the environmental alphabet. Naturally the quantity of corn, hay and straw continuously grows. The evolution of the environment can thus be captured well by the environmental evolution rules PE={c→c, c→cc, h→h, h→hh, s→s, s→ss}.
Next I describe the animals. They all have essentially the same basic structure, i.e. the same alphabet, the same evolution rules, and the same action rules. However, some of the leading animals—i.e. animals with a name—undertake more actions throughout the book and thus have more action rules.
Let L be an animal. This animal is either alive on the farm (modelled by L), alive outside the farm (modelled by ?), or dead (modelled by †). The alphabet of L is thus VL={L, ?}, and its mortality and mobility are guaranteed by its evolution rules PL={L→L, L→?, L→†, ?→?, ?→L, ?→†, †→†}. In the course of the book, all animals participate in harvesting corn and hay, and they use straw. I thus choose the actions of L to be RL={ccc→c, hhh→h, ss→s}. Hence L=(VL, PL, RL).
Consequently I build the multisets of animals of A by replacing the letter L by symbols modelling the animals of the book. Consider, for example, the most featured animals in the book: pigs. The vast majority of them are not leading animals and the multiset Pig thus contains quite a number of animals P=(VP, PP, RP), where VP={P, ?}, PP={P→P, P→?, P→†, ?→?, ?→P, ?→†, †→†}, and RP={ccc→c, hhh→h, ss→s}. Moreover the pigs Minimus, Pinkeye, and Squealer are also well described by P. However, the main leading animals—the pigs Snowball and Napoleon—are not. For it is Snowball who, in Chapter 2, paints out Manor Farm and in its place paints Animal Farm, and it is Napoleon who makes it Manor Farm again in Chapter 10. Hence, they are well described only after extending their action rules with actions that repaint the name of the farm.
To keep A as simple as possible I do not model birth even though I model death. The only exception is the birth of nine puppies between the dogs Bluebell and Jessie in Chapter 3, as this is an important event in the book. Therefore, the animals Bluebell and Jessie are added to the multiset Dog. Compared to other dogs D, they have an augmented set of evolution rules, viz. rules containing the reproduction symbol that turn one dog state into several puppy states. Following the basic structure of animals L sketched above, the nearly complete list of animals featuring in Animal Farm: A Fairy Story—the capitalized letter in the sort name indicates the symbol that replaces L, and the names of the leading animals of that sort are added between brackets—becomes Chicken, Dog (Bluebell, Jessie, Pincher), goosE, Goat (Muriel), Horse (Mollie, Boxer, Clover), Man (Mr. Jones, Mrs. Jones, Mr. Frederick, Mr. Whymper, Mr. Pilkington), Pig (Old Major, Snowball, Napoleon, Minimus, Pinkeye, Squealer), Raven (Moses), Sheep, coW, and donkeY (Benjamin). For reasons of space, I only mention the generic names of the animals, i.e. I do not use the specific names for the female and for the male. Finally, note that Bluebell, Jessie, and Pincher are the only dogs, Muriel is the only goat, Moses is the only raven, and Benjamin is the only donkey. Then the eco-grammar system A modelling Animal Farm: A Fairy Story is the construct
A=(E, Chicken, Dog, goosE, Goat, Horse, Man, Pig, Raven, Sheep, coW, donkeY),
with the environment E and the animal multisets as described above.
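To make the construction concrete, here is a fragment of the Animal Farm instance in the conventions of the hypothetical evolve/step helpers sketched in Section 2. The alphabets and rules follow the text; the renaming symbol A for Animal Farm, the extra M→A and A→M actions for Snowball and Napoleon, and the completeness rules for M and A are my guesses at the displays lost above, not the author’s exact encoding.

```python
# A fragment of the Animal Farm instance, using the hypothetical helpers
# from Section 2. The symbols A, M→A, A→M and the M/A identity rules
# are assumptions made for illustration.
ENV_RULES = [("c", "c"), ("c", "cc"), ("h", "h"), ("h", "hh"),
             ("s", "s"), ("s", "ss"),
             ("M", "M"), ("A", "A")]       # assumed, so the house name persists

def make_animal(letter, extra_actions=()):
    """The generic animal L: on the farm (L), outside it (?), or dead (†)."""
    evo = [(letter, letter), (letter, "?"), (letter, "†"),
           ("?", "?"), ("?", letter), ("?", "†"), ("†", "†")]
    act = [("ccc", "c"), ("hhh", "h"), ("ss", "s")] + list(extra_actions)
    return (letter, evo, act)

pigs = [make_animal("P") for _ in range(5)]                 # ordinary pigs
snowball = make_animal("P", [("M", "A")])  # repaints Manor Farm (Chapter 2)
napoleon = make_animal("P", [("A", "M")])  # repaints it back (Chapter 10)

env = "M" + "c" * 9 + "h" * 9 + "s" * 9    # Manor Farm with plenty of everything
env, animals = step(env, pigs + [snowball, napoleon], ENV_RULES)
print(env)                                 # one step of the fairy story
```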
4 A Fairy Story
In this section I show how A can generate the fairy story of the book. In the beginning there is Manor Farm and plenty—where plenty is modelled
by the presence of 9 symbols—of corn, hay, and straw. Of the humans, only Mr. Jones, Mrs. Jones and their four men are present. Hence the initial state σ0 has environmental state Mc9h9s9.
Next, I summarize Chapters 1 to 10 of the book and for some chapters I display the state modelling the situation after that chapter. I leave it to the reader to display the other states and to spell out precisely which rules must be applied to obtain them.
In Chapter 1—as in all chapters—the corn, hay, and straw naturally grow a little, while at the same time some straw is used. Hence σ1 is the same as σ0 except that the environmental state has become Mc13h13s9. From now on I will no longer mention the growth of corn, hay, and straw, nor the decrease in straw. In Chapter 2, Old Major dies. The rebellion then causes Mr. Jones, Mrs. Jones, and their four men to flee from the farm, and causes Snowball to change the name of the farm to Animal Farm. Moreover, a hay harvest takes place. In Chapter 3, Bluebell and Jessie whelp and there is a corn harvest, which yields σ3.
In Chapter 4, Mr. Jones and his four men return to the farm only to be expelled again by the animals—at the cost of only one sheep—during the Battle of the Cowshed. Around this time, Moses flies off. In Chapter 5, Mollie disappears, and at the height of Animalism Snowball is chased off the farm by Bluebell’s and Jessie’s nine puppies, which have grown tremendously under Napoleon’s control. This yields σ5.
In Chapter 6, nothing much happens. In Chapter 7, nine hens die after Comrade Napoleon orders them to starve themselves. He also orders his dogs to kill the four pigs, three hens, one goose, and three sheep that confess to have rebelled against him. This yields σ7.
In Chapter 8, there are more harvests and another battle—at the cost of one cow, three sheep, and two geese this time—after Mr. Frederick and his men attack the farm. In Chapter 9, Moses reappears and Boxer dies. This yields σ9.
In Chapter 10, Bluebell, Jessie, Pincher, Muriel, three horses, and Mr. Jones die. Furthermore, Mr. Pilkington is now an appreciated neighbour, in whose presence Napoleon changes the name of the farm back to Manor Farm. This yields the final state σ10.
The story of the book is naturally only one of the possible stories that A can generate. I leave it to the reader to play with A and to enjoy other outcomes of this fairy story.
Acknowledgements
I wish to thank Erzsébet Csuhaj-Varjú, Judit Csima, Nikè van Vugt, and Nadia Pisanti for useful comments and suggestions on a preliminary version of this paper.
References
[1] M.Bal, Narratology: Introduction to the Theory of Narrative. Toronto University Press, Toronto, 1985.
[2] E.Csuhaj-Varjú, Eco-grammar systems: recent results and perspectives. In Gh.Păun (ed.), Artificial Life: Grammatical Models. Black Sea University Press, Bucharest, 1995, 79–103.
[3] E.Csuhaj-Varjú, J.Kelemen, A.Kelemenová and Gh.Păun, Eco(grammar) systems: a generative model of artificial life, 1993, manuscript.
[4] E.Csuhaj-Varjú, J.Kelemen, A.Kelemenová and Gh.Păun, Eco(grammar) systems: a preview. In R.Trappl (ed.), Cybernetics and Systems ’94. World Scientific, Singapore, 1994, vol. 1, 941–948.
[5] E.Csuhaj-Varjú, J.Kelemen, A.Kelemenová and Gh.Păun, Eco-grammar systems: a grammatical framework for studying life-like interactions, Artificial Life, 3.1 (1997), 1–28.
[6] J.Dassow, An example of an eco-grammar system: a can collecting robot. In Gh.Păun (ed.), Artificial Life: Grammatical Models. Black Sea University Press, Bucharest, 1995, 240–244.
[7] J.Dassow and Gh.Păun, Regulated Rewriting in Formal Language Theory. Springer, Berlin, 1989.
[8] Ch.G.Langton (ed.), Artificial Life: An Overview. MIT Press, Cambridge, Mass., 1995.
[9] S.Marcus, La sémiotique formelle du folklore. Klincksieck, Paris, 1978.
[10] V.Mihalache, General artificial intelligence systems as eco-grammar systems. In Gh.Păun (ed.), Artificial Life: Grammatical Models. Black Sea University Press, Bucharest, 1995, 245–259.
[11] W.Nöth, Handbook of Semiotics. Indiana University Press, Bloomington, In., 1990.
[12] G.Orwell, Animal Farm: A Fairy Story. Martin Secker & Warburg, London, 1945.
[13] G.Orwell, Nineteen Eighty-Four. Martin Secker & Warburg, London, 1949.
[14] Gh.Păun, O mie nouă sute nouăzeci şi patru. Ecce Homo, Bucureşti, 1993. English translation published as Nineteen Ninety-Four, or The Changeless Change. Minerva, London, 1997.
[15] Gh.Păun (ed.), Artificial Life: Grammatical Models. Black Sea University Press, Bucharest, 1995.
[16] V.Ia.Propp, Morfologiia skazki. Academia, Leningrad, 1928. English translation published as Morphology of the Folktale. Indiana University and The University of Texas Press, Austin, Tx., 1968.
[17] G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997.
Towards a Brain Compatible Theory of Syntax Based on Local Testability¹
Stefano Crespi Reghizzi, Department of Electronics and Information, Polytechnical University of Milan, Italy
Valentino Braitenberg, Laboratory of Cognitive Science, University of Trento, Rovereto, Italy, and Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Abstract. Chomsky’s theory of syntax came after criticism of probabilistic associative models of word order in sentences. Immediate constituent structures are plausible but their description by generative grammars has met with difficulties. Type 2 (context-free) grammars account for constituent structure, but they go beyond the mathematical capacity required by language, because they generate unnatural mathematical sets as a result of being based on recursive function theory. Abstract associative models investigated by formal language theoreticians (Schützenberger, McNaughton, Papert, Brzozowski, Simon) are known as locally testable models. We propose a combination of locally testable and constituent structure models under the name of Associative Language Description and we argue that this combination has the same explanatory power as type 2 grammars but is compatible with brain models. We exemplify and discuss two versions of ALD, one of which is based on modulation while the other is based on pattern rules. In conclusion, we provide an outline of brain organization in terms of cell assemblies and synfire chains.

¹ This work was presented at the Workshop on Interdisciplinary Approaches to a New Understanding of Cognition and Consciousness, Villa Vigoni, Menaggio, 1997. We acknowledge the support of Forschung für anwendungsorientierte Wissensverarbeitung, Ulm, and of CNR-CESTIA.
1 Introduction
Chomsky’s theory of syntax came after his criticism of probabilistic associative models of word order in sentences, but the inadequacy of probabilistic left-to-right models (Markov process) had already been noticed by Lashley [21], who anticipated Chomsky’s arguments [7] by observing that probabilities between adjacent words in a sentence have little relation to the grammaticality of the string. Associative models provide an intuitively appealing explanation of many linguistic regularities, and they are also aligned with current views on information processing in the cortex. A classical argument pro syntax is that the choice of an element is determined by a much earlier element to be remembered across gaps filled by intervening clauses (constituents). Ambiguity too provides a strong indication that sentences carry a structure.
The established model for immediate constituent analysis relies on context-free (CF) grammars. Their rules assign names (nonterminal symbols) to different kinds of constituents (syntax classes). Such grammars cannot handle many non-elementary aspects of language, but go beyond the mathematical capacity required by language, because they generate unnatural mathematical sets. In our opinion this is a clear indication that this model is misdirected. Related, more complex models, such as context-sensitive grammars, are even more subject to the same criticism: for instance, they generate such mathematical languages as the set of strings whose length is a prime number [25], a consequence of being based on recursive function theory.
On this basis and motivated by the search for a linguistic theory that is more consistent with the findings of brain science, we propose a new model, called associative language description (ALD). This grafts the immediate constituent structure onto the old associative models. The associative theories we build upon were investigated in the 60’s by mathematicians (notably Brzozowski, McNaughton, Papert, Schützenberger, Simon, Zalcstein) and are known as locally testable (LT) models. In sect. 2 we recall the LT definitions and we present the ALD models. The plural indicates that ALD is a
sort of general approach that can be realised in different ways. The original ALD model exploits LT to specify both the structure of constituents and their permitted contexts; the bounded ALD model of [11] is a mathematically simpler version that has been used to prove formal properties and also in technical applications for programming languages [14]. Then we briefly compare ALD and CF models, and we show that ALD are easier to infer from examples of structured sentences. In sect. 3 we sketch a brain organization for ALD processing in terms of neural mechanisms. In the conclusion we refer to early related research and discuss possible avenues for future work.
2 Associative Language Descriptions
2.1 Local Testability
Certain frequent patterns of language can be described by considering pairs of items that occur next to each other in a sentence. The precise nature of the items depends on the level of language description. The items are the phonemes at the phonological level, but lexemes or word categories at the syntactical level. In our discussion we do not specify the level of language description, since what we are proposing is an abstract model of grammar that in future development will have to be instantiated to account for various linguistic phenomena. The items will accordingly be represented as characters from a terminal alphabet Σ={a, b, …}.
Looking at a string such as x=abccbcccc we note that x contains the following substrings of length 2: ab, bc, cc, cb. These are called digrams or 2-grams. We notice that ab, the prefix of the string, is the initial digram; similarly cc is the final digram or suffix. The study of the digrams (or more generally the k-grams, k≥2) that appear in phrases has been suggested many times as a technique for characterizing to some extent grammatically valid strings. In particular, the limits of a Markovian model based on the relative frequency of k-grams occurring in the language corpus have been examined by Chomsky and Miller [8].
Checking for the presence of certain k-grams in a given string to be analysed is a simple operation for a computer, and one that could very easily be performed by cortical structures, as already noticed by Wickelgren [26], who shows that serial order can be enforced by small associative memories. The recognition algorithm needs a finite memory to be used as a sliding window that is capable of storing k characters. The window is initially positioned at the left end of the string to be analysed. Its content is checked against the set of permitted k-grams. If it matches, the window advances by one position along the string, and the same check is performed, until the window reaches the right edge of the string. If the check is positive in all window positions, the string is accepted; otherwise it is rejected. Because the sliding window essentially performs a series of local
inspections, the languages thus defined have been aptly called locally testable (LT). As this algorithm uses a finite memory, independently of the length of the string, its discriminatory power cannot exceed that of a finite-state automaton. Not all finite-state languages are LT, but this loss of generative capacity primarily concerns certain periodic patterns, which are irrelevant for modeling human languages. One example would be a string over the alphabet a, b containing any number of a’s and a number of b’s that is a multiple of 3. Such a property cannot be checked by local inspections over the string. The formal properties of LT languages have been investigated by mathematicians using algebraic and automata-theoretical approaches. An early comprehensive reference is the book on non-counting languages by McNaughton and Papert [22]. Several variations to the notion of LT have been proposed by theoreticians (e.g. see [23]), but we stick to the simplest definition, since nothing is gained by considering more refined models at this early stage of our investigation. We use a special character ⊥, not present in the terminal alphabet, called the terminator, which encloses the sentences of the language. For k≥1, a k-gram x is either a string containing exactly k characters of Σ or a possibly shorter string starting or ending by the terminator. Formally:
Definition 1 For a string x of length greater than k, we consider three sets: αk(x), the initial k-gram of x; γk(x), the final k-gram of x; and, if x is longer than k+1, βk(x), the set of internal k-grams (those that occur in any position other than on the left and right edges). (Notice that internal k-grams may not contain terminators.) A locally testable description (LTD) of order k≥1 consists of three sets of k-grams: A (initial), B (internal), C (final). A LTD D=(A, B, C) defines a formal language, denoted L(D), by the next condition. A string x is in L(D) if, and only if, its initial, internal, and final k-grams are resp. included in A, B, and C. More precisely, the condition is: αk(x) ∈ A, βk(x) ⊆ B, and γk(x) ∈ C. As a consequence, if two strings x and y have the same sets α, β, and γ, either both are valid sentences of L(D) or neither one is. It is often convenient to avoid specifying the exact width of the k-grams. A language L is called locally testable if there exists a finite integer k such that L admits a LTD of order k.
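Read procedurally, Definition 1 is a one-pass sliding-window check. The sketch below is my rendering, with one simplification: following the digram example above (ab initial, cc final for x=abccbcccc), the initial and final k-grams are taken from the bare string, leaving the terminator-bearing k-grams of the full definition aside.

```python
# Sliding-window recognizer for an LTD D=(A, B, C); a simplified sketch
# that ignores the terminator-bearing k-grams of the full definition.
def accepts(x, k, A, B, C):
    if len(x) <= k:
        return False                        # Definition 1 assumes |x| > k
    grams = [x[i:i + k] for i in range(len(x) - k + 1)]
    alpha, gamma = grams[0], grams[-1]      # initial and final k-gram
    beta = set(grams[1:-1])                 # the internal k-grams
    return alpha in A and beta <= B and gamma in C

# x = abccbcccc: initial "ab", internal {"bc", "cc", "cb"}, final "cc"
print(accepts("abccbcccc", 2, {"ab"}, {"bc", "cc", "cb"}, {"cc"}))  # True
```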
2.2 From Local Testability to Structure Definition
It is obvious that an LTD falls short of the capacity required by language, as it enables only some finite-state languages to be defined. Its major weakness is its inability to define constituent structures, a necessary feature of any syntax model. A simple way to introduce constituent structures into an LTD is now proposed. A constituent has to be considered as a single item, encoded by a new terminal symbol ∆ called a place holder (PH). By this expedient a constituent containing other constituents gives rise to k-grams containing the PH. Next, we define the tree structures that are relevant for ALD. A tree (see Fig. 1) whose internal nodes are labeled by ∆ and whose leaves are labeled by characters of Σ is called a stencil tree.
Figure 1: A stencil tree T with four constituents schematised as triangles. To the right, a condensed tree.
A tree is composed of juxtaposed subtrees of height one, with leaves labeled in Σ ∪ {∆}, called constituents. The frontier of a stencil tree T is denoted by τ(T), while τ(Ki) denotes the frontier of the constituent Ki.
Definition 2 For an internal node i of a stencil tree T, labeled by ∆, let Ki and Ti be resp. the constituent and the maximal subtree of T having root i. Introduce a new terminal symbol ki, and consider the ‘condensed’ tree T′i obtained by replacing the subtree Ti in T with ki. Consider the frontier of the
condensed tree, which can be written as the concatenation of three parts: s ki t, where s and t are possibly empty terminal strings. The strings ⊥s and t⊥ are called, resp., the left/right context of the constituent Ki (or of the subtree Ti) in T. For instance, in Fig. 1 the left context of K3 is ⊥acbbbacbcbbcb and the right context of K1 is ⊥: notice that terminators are automatically prefixed/appended.
2.3 ALD Based on Local Testability
The original idea of Associative Language Descriptions (ALD) was entirely founded on the concepts of LT. Each class of constituents is associated to an LT description, i.e. a triple 〈A, B, C〉, with the provision that k-grams contain place holders if they contain nested constituents. For graduality, we start with a simpler model, where any constituent can freely occur anywhere a PH occurs.
Definition 3 A free ALD F consists of a collection {D1, D2, …, Dm} of locally testable descriptions of order k, each one of the form Di=〈Ai, Bi, Ci〉, where each one of the three symbols represents a set of k-grams possibly containing PHs. Each triple Di is called a syntax class. A free ALD F={D1, D2, …, Dm} defines a formal language, L(F), consisting of a set of stencil trees, by the following condition. A tree T is in L(F) iff for each constituent K of T the frontier τ(K) belongs to the language defined by some syntax class Di of F. Formally: T ∈ L(F) iff, for every constituent K of T, τ(K) ∈ L(Di) for some 1≤i≤m.
Example 1 Take the alphabet Σ={a, +, ×} of certain arithmetic expressions, where a stands for some numeric value. Sums, e.g. a+a+a, are defined by the LTD 〈{a+}, {a+, +a}, {+a}〉 and products are similarly defined by 〈{a×}, {a×, ×a}, {×a}〉. Now we extend the language by allowing any variable to be replaced with any expression as constituent, thus giving rise to a tree language, which is defined by the free ALD F={D1, D2}, where:
D1=〈{a+, ∆+}, {a+, +a, ∆+, +∆}, {+a, +∆}〉,
D2=〈{a×, ∆×}, {a×, ×a, ∆×, ×∆}, {×a, ×∆}〉.
For instance, the stencil tree a+a×a+a+a×a, where the products form the constituents, is valid, since the triples α, β, γ exhibited by each constituent are included in D1 or in D2. On the other hand, the tree a+a×a+a is not valid if its inner constituent mixes + and ×: such a constituent yields α, β, γ sets not included either in D1 or in D2.
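The free-ALD condition is also easy to run mechanically. In the sketch below (reusing the hypothetical accepts helper from 2.1), a stencil tree is a nested list, a nested list condenses to ∆ in its parent’s frontier, and D1, D2 are the digram sets as reconstructed above; the encoding is mine.

```python
# Free-ALD validity check (Definition 3) over nested-list stencil trees;
# D1 and D2 are the reconstructed syntax classes of Example 1.
D1 = ({"a+", "∆+"}, {"a+", "+a", "∆+", "+∆"}, {"+a", "+∆"})
D2 = ({"a×", "∆×"}, {"a×", "×a", "∆×", "×∆"}, {"×a", "×∆"})

def frontier(tree):
    """Condensed frontier: nested constituents appear as the place holder."""
    return "".join("∆" if isinstance(c, list) else c for c in tree)

def valid(tree, classes, k=2):
    """T is in L(F) iff every constituent's frontier fits some class Di."""
    ok = any(accepts(frontier(tree), k, A, B, C) for A, B, C in classes)
    return ok and all(valid(c, classes, k) for c in tree if isinstance(c, list))

# a+a×a+a+a×a with both products as nested constituents: accepted
print(valid(["a", "+", list("a×a"), "+", "a", "+", list("a×a")], [D1, D2]))
```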
The license to replace any constituent by another one would cause obvious overgeneralisation in a grammar, say, of English. For instance, if a preposition clause (in the garden) occurs in a certain position (the man in the garden), then any other clause (e.g. the noun clause the sharp knife) would be permitted in the same position, causing acceptance of the ungrammatical string the man the sharp knife. In other words, all syntax classes of a free ALD are equivalent with respect to their context of occurrence. This insufficient discriminatory capacity is remedied by the next development. The improvement consists of adding, to each syntax class, the indication of the valid contexts of occurrence. Two different manners of so doing have been considered. In the original proposal, the contexts are specified by listing the k-grams that may occur at the left and right edge of a constituent. The second manner, introduced in [11], uses patterns to specify at once a syntax class and its permitted contexts.
Definition 4 An ALD G with modulation² consists of a free ALD F={D1, D2, …, Dm}, m≥1, with the following additions. For each LT description Di, two non-empty sets of k-grams without PHs³ are given: the opening (or left) passage Li and the closing (or right) passage Ri. Therefore, G is defined by a finite collection of components: G={E1, E2, …, Em}, where Ei=〈Di, Li, Ri〉. As before, each component Ei defines a syntax class.
Before presenting the acceptance condition, we need to formalise the k-grams occurring on the borders of a constituent.
Definition 5 The left border λ(K, T) of a constituent K of stencil tree T is the set of k-grams, with no PH, such that they occur in the frontier τ(T) of T, and they partially overlap the left edge of τ(K). Reformulating the second condition more precisely: after segmenting the terminated frontier as ⊥u τ(K) v⊥, the k-grams of λ(K, T) are those that straddle the boundary between ⊥u and τ(K). The right border ρ(T, K) of a constituent is symmetrically defined.
² In music, ‘modulation’ indicates a passage leading from one section of a piece to another.
³ The reason for not allowing PHs is procedural. In order to parse a string, passages must be recognised, which would be inconvenient if the PH, itself the result of a parse, had to be detected in advance. This choice is, however, one of several possible variations.
As an example, from Fig. 1 we have the following border for k=4: λ(K3, T)={bcbb, cbbb, bbbb}.
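Definition 5 can be made concrete as follows. The formalisation is mine (one plausible reading of “partially overlap the left edge”), with the frontier flattened to a string and the constituent identified by its starting position.

```python
# Left border of a constituent (one reading of Definition 5): the k-grams
# of the terminated frontier that straddle the constituent's left edge
# and contain no place holder. The (frontier, start) encoding is mine.
def left_border(frontier, start, k):
    w = "#" + frontier + "#"                # '#' plays the role of ⊥
    s = start + 1                           # account for the prepended '#'
    return {w[i:i + k]
            for i in range(max(0, s - k + 1), s)   # windows crossing the edge
            if "∆" not in w[i:i + k]}

# the right border is symmetric: windows straddling position start + len(K)
```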
Intuitively, for a stencil tree to be valid each constituent must be preannounced by the occurrence of some k-grams belonging to the opening passage of its syntax class; a symmetrical condition applies to the closing passage.
Example 2 To illustrate the discriminative capacity of modulation, we change Ex. 1 by permitting one containment only: multiplicative expressions inside additive expressions. By inspecting a few instances of stencil trees one easily comes out with the ALD G={E1, E2}.
Notice that at least 3-grams are required to make modulation effective for L2 and R2, but for brevity 2-grams have been used in all other sets. As a test, one can verify that the tree a×a+a is rejected because ×a+ is not present in the left passage L1. On the other hand, the tree a+a×a is accepted, since the left border {a+a, +a×} of the constituent is included in L2 and its right border is included in R2.
2.4 Bounded ALD
Recently the original definition has been reshaped into a mathematically simpler model [11], which is more directly comparable with CF grammars, yet does not decrease capacity. The model will here be called bounded ALD to distinguish it from ALD with modulation. In a sense, bounded ALD stand to ALD with modulation in the same relation as CF grammars stand to CF grammars with regular expressions: stencil trees of bounded ALD are bounded in degree, because recursion is used instead of iteration to produce repetitive structures. The other difference between the two models has to do with the manner in which permitted contexts are specified.
Definition 6 A bounded ALD A consists of a finite collection of rules of the form 〈x, y, z〉, usually written x[y]z, where x and z are terminal strings and y is a string over Σ ∪ {∆}. For a rule, the string y is called the pattern and the strings x and z are called the permissible left/right contexts. If a left/right permissible context is irrelevant, it can be replaced by the “don’t care” symbol ‘–’.
A tree T is valid for a bounded ALD A iff for each constituent Ki of T there exists a rule u[τ(Ki)]v where u is a suffix of the left context of Ki in T and v is a prefix of the right context of Ki in T. The language L(A) defined by A is the set of stencil trees valid for A. Sometimes we also consider the set of strings corresponding to the frontier of a tree language. This allows us to talk of the string language defined by an ALD.
Example 3 The string language {aⁿcbⁿ | n≥1} is defined by the rules ⊥[a∆b]⊥, a[a∆b]b, a[c]b. Both contexts can be dropped from the first two rules, and the right (or left, but not both) context can be omitted from the last rule to give the equivalent ALD: –[a∆b]–, a[c]–.
The following remarks are rigorously justified for bounded ALD [11], but we expect them to hold for ALD with modulation too, with little change.
Ambiguity: The phenomenon of ambiguity occurs in ALD much as in CF grammars. For example, the following ALD A ambiguously defines the Dyck language over the alphabet {b, e}: –[∆∆]–, –[b∆e]–, –[ε]–, because a sentence like bebe has two distinct trees in L(A).
Other formal properties [11]: Bounded ALD languages (both tree and string) form a strict subfamily of CF. Actually an algorithm enables a CF grammar to be constructed that is structurally equivalent to a bounded ALD. More precisely, the ALD tree languages enjoy the Non-Counting property of CF languages [12]. This property is believed to be a linguistic universal of all natural and artificial languages intended for human communication. The family of ALD string languages is not closed with respect to the basic operations of concatenation, star, union, and complementation: a lack of nice mathematical properties that does not affect the potential uses of ALD for real languages, as witnessed by the successful definition of Pascal [14]. Indeed it is a common misconception that any language family is useless unless it is closed with respect to union and concatenation. Practical examples abound to the contrary, since very rarely is it possible to unite two languages without modification.
ALD models differ from Chomsky’s grammars in another aspect: they are not generative models, because the notion of deriving a sentence by successive application of rules is not present. On the other hand, an ALD can be used to check the validity of a string by a parsing process. Preliminary analysis of the problem indicates that the classical parsing methods for CF grammars can be adapted to our case.
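A bounded ALD can be checked in the same style as the free case. Below, a rule is a triple (x, y, z) with ‘-’ as don’t care, and Example 3’s language aⁿcbⁿ is validated with the two simplified rules; the tree encoding, helper names, and the reuse of the frontier helper are again my conventions.

```python
# Bounded-ALD validity (Definition 6) on nested-list stencil trees.
def flat(items):                             # fully expanded frontier
    return "".join(flat(c) if isinstance(c, list) else c for c in items)

def bald_valid(tree, rules, left="#", right="#"):
    """Each constituent needs a rule x[y]z: y equals its condensed frontier,
    x a suffix of its left context, z a prefix of its right context."""
    if not any(y == frontier(tree)
               and (x == "-" or left.endswith(x))
               and (z == "-" or right.startswith(z))
               for x, y, z in rules):
        return False
    return all(bald_valid(c, rules, left + flat(tree[:i]), flat(tree[i+1:]) + right)
               for i, c in enumerate(tree) if isinstance(c, list))

RULES = [("-", "a∆b", "-"), ("a", "c", "-")]  # Example 3: -[a∆b]-, a[c]-
tree = ["a", ["a", ["c"], "b"], "b"]          # a stencil tree for aacbb
print(bald_valid(tree, RULES))                # True
```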
2.5 Grammar by Example
One of the appealing features of ALD is that they are suitable for grammar inference, the learning process that enables a language to be identified in the limit [16]
from a series of positive and negative examples. We already observed that it is straightforward to construct a LTD by inspecting a given sample of strings and extracting the k-grams. If the value of k is unknown, the learner can start with k=2, and gradually increase the length if the inferred LTD is too general. Overgeneralisation means that some strings are accepted which are tagged as negative by the informant.
In the basic model of language learnability theory, the sentences presented to the learner are flat strings of words, but other studies have considered another model, in which sentences are presented in the form of structures (such as stencil trees in [9] or functor-argument structures in [20]). The availability of structure in the information has been defended on several grounds, such as the presence of semantic tagging. In practice, learning from structures is considerably simpler than learning from strings, because the problem space is more constrained.
Constructing the ALD from a sample of stencil trees is a simple, determinate task that consists of extracting and collecting the relevant sets of k-grams, much as in the LTD case, since for a given integer k the ALD which is compatible with a given sample of trees is essentially unique. In the case of programming languages such as Pascal or C, small values of k have proved sufficient, so this grammar inference approach can be applied without combinatorial explosion. As a consequence, ALD can be specified by examples, rather than by rules, because of the direct bijective relation between two sets of positive and negative stencil trees and the ALD. In synthesis, the k-gram extraction method is a good procedure for extrapolating an unbounded set of valid structures from a given sample. In contrast, for CF grammars there may exist many structurally equivalent grammars which are compatible with a sample of stencil trees, and the inference process is more complex and undetermined.
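The determinate inference step described here is essentially k-gram harvesting. A minimal rendering, collapsing all constituents of the sample into a single syntax class for brevity and reusing the frontier helper from 2.3, could look like this:

```python
# Grammar by example: extract a free-ALD syntax class from sample trees.
def infer(trees, k=2):
    A, B, C = set(), set(), set()
    stack = list(trees)
    while stack:
        t = stack.pop()
        f = frontier(t)                       # condensed frontier of t
        grams = [f[i:i + k] for i in range(len(f) - k + 1)]
        if len(grams) < 2:
            continue                          # skip constituents shorter than k+1
        A.add(grams[0]); C.add(grams[-1]); B.update(grams[1:-1])
        stack.extend(c for c in t if isinstance(c, list))
    return A, B, C

# one structured example contributes the k-grams of each of its constituents
print(infer([["a", "+", list("a×a"), "+", "a"]]))
```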
3 Mapping ALD on Brain Structures
From the point of view of neuronal modelling, associative language description has the great advantage of naturally blending into some of the most accredited theories of brain function. This is not the place to give a full account of the experimental evidence [5] on which these theories rest, and we will only sketch some of the main results. We start by recalling the basic mechanisms assumed.
Cell assemblies (CA) [18]: It seems that the things and events of our experience are represented by ensembles of neurons (=nerve cells) which are strongly connected to each other and therefore (a) tend to become active all together (‘ignite’) even if only some of them are activated, and (b) stay active even after the external excitation ceases.
Sequences of cell assemblies: Although the individual CA, when activated, may follow a certain temporal order in the activation of its component neurons,
it is not a sufficient physiological basis for the sequential order which characterizes much of behaviour: sequences of items in some animal (bird song) and human (singing, speech) vocalisations or skilled behaviour such as occurs in crafts or musical performance. There must be control mechanisms in the brain which extinguish an active CA and ignite the following one in well defined sequences, although possibly with varying rhythm. These may be genetically determined or acquired by learning.
Synfire chains [1] [2]: There is evidence of very precisely timed sequences of neural activations in chains of neurons, or groups of neurons, which conduct neural activity but cannot arrest it or store it anywhere along the way. These so-called synfire chains are probably responsible for the timing of events within a time span of a few tenths of a second (whereas sequences of cell assemblies may have a duration of several seconds). These, too, are the result of learning processes.
We may ask some questions in connection with associative grammar. First, in what way are the sequences which define the local rules of grammar (k-grams) learned and stored? The items which occur in ordered groups of k elements are words (though in the previous definitions they are denoted by single letters), composed of one or a few syllables, and therefore with a duration of between 0.2 and 1 (or at most 2) seconds (the duration of a syllable being about 0.2 seconds). A trigram composed of such elements would span a time of several seconds, too long for synfire chains but quite in the order of magnitude of various kinds of skilled behaviour (such as sports, musical performance, etc.). As in these and other performances, the ability of the brain to learn sequences of events is evident and in some cases can even be related to some detailed neurophysiology. It is not impossible to imagine neuronal networks containing representatives of such learned sequences which are only activated when the correct sequence is presented in the input. They are stored in parallel and may be prevented from being activated more than one at a time by some mechanism of reciprocal inhibition. In analysing a sequence of input events, they may also be activated in partially overlapping temporal episodes, as required by the ‘sliding window’ idea.
Of course the number of 3-grams is enormous in language, but the number of neurons involved in language processing in the brain is also very large, perhaps of the order of 10⁸, and the number of useful combinations of these neurons is possibly greater. Moreover, in reality we do not imagine that the system requires all k-grams to have the same value k (as in the formal definition of ALD); rather, we suggest that the value will vary in different contexts, in order to minimize the memory requirements. It is conceivable that the minimal values required for k are discovered in the learning phase, assuming that the grammar-by-example algorithm outlined in Sect. 2 is deployed. This mechanism for matching k-grams is absolutely essential for ALD, as it is needed not only for recognizing constituents, but
also for detecting the k-gram passage that announces the modulation between two constituents. We notice that some k-grams play a special role when they act as the initial (or final) k-gram of a constituent. The requirement of specially marking some k-grams is consistent with the so-called ‘X-bar’ theory of syntax [19], which affirms that each syntax class requires the presence of a specific lexeme. It is not difficult to imagine how the proposed neuronal scheme would detect such marked or compulsory k-grams. Another question is how sequences of items are embodied in synaptic networks and how they are learned. Connections between neurons or groups of neurons are statistically symmetrical, but again in the cortical ‘wiring’, which is mostly stochastic, there is ample opportunity for asymmetrical connections that may embody the relation of temporal order or sequence. Moreover it is clear that synaptic relations between neurons in the cortex are to a large extent determined by learning (or experience), and it is certain that much of what is learnt in the way of succession of events (causal relations etc.) will determine the unidirectional influence of one group of neurons on another. There is a technical difficulty, however, in the physiology of learning when the sequence which is learned involves considerable delays, such as in trigrams (sequences of three words) which span several seconds. We are forced to assume that there are different delays between the input and internal representatives of trigrams. The synfire chains already mentioned may provide an adequate mechanism for delays with the possibility of representing asynchronous input in a synchronous way to the internal representative. Finally the nesting rules (e.g. in Dyck’s language of parentheses) have always been a problem when translated into neurological terms. A model based on the concepts of decaying activity in CA that serve as quasi-bistable memory elements has been shown to incorporate the virtues required for ‘push-down’ memories. The problem of inserting phrases into phrases to a certain extent involves recognizing legal or unusual k-grams on the borders; this requires a brain mechanism which can interrupt the embedding phrase throughout the duration of the embedded phrase. Once the embedded phrase has finished, the embedding phrase resumes. For this there are plausible neurological models, if the elements of the phrase are represented in the brain by concatenated CAs. Due to the nature of the CA, which is stable in both its active and inactive state, the information for continuing the embedding phrase may be preserved in the activity of the last active cell assembly before the interruption. When there are several embedded constituents, it is well known that the order of closure of the various phrases is inverse to the order of opening. This is fairly easily explained if we assume that the activity of the CA, which indicates an interrupted phrase, slowly decays in time and that completion starts with the most active CA, the most recently activated one [24] [3].
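To make the sliding-window matching of k-grams concrete, here is a minimal Python sketch; the function name, the whitespace tokenisation and the toy trigram store are our own illustrative assumptions, not part of the ALD formalism:

```python
def kgram_scan(sentence, learned_kgrams, k=3):
    """Slide a window of width k over the word sequence and report
    whether every k-gram in the sentence has been learned."""
    words = sentence.split()
    if len(words) < k:
        return False  # too short to test; a real model treats borders specially
    windows = [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]
    # overlapping windows model the partially overlapping temporal episodes
    return all(w in learned_kgrams for w in windows)

learned = {("the", "dog", "barks"), ("dog", "barks", "loudly")}
print(kgram_scan("the dog barks loudly", learned))   # True
print(kgram_scan("the dog sleeps loudly", learned))  # False
```

Stored k-grams would be checked in parallel in the brain model; the sequential loop here is only a computational stand-in.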
Sequences of cell assemblies at a rate of 4–5 a second have been postulated in many contexts of behaviour (e.g. vision) and are probably related to the periodic action of a mechanism controlling the state of activity of the cortex. This mechanism also involves inhibitory links, which not only isolate individual cell assemblies from the background activity (by the principle of ‘winner takes all’) but also see to it that in the sequencing of cell assemblies one is just extinguishing when the next one is activated. This explanation essentially answers the question of how the ‘place holders’ of the theory are represented in the brain. But details are, of course, not known. It is certainly true that not all the elements that spell out the grammaticality of a sentence can be identified with the elements at the surface level of language. Traditional grammar rules involve such things as grammatical (or lexical) categories. In theory we may think of tags that are attached to words by means of the all-pervading associative mechanism typical of the cortex. But it is not at all clear how these categories are extracted from language in the early phase of learning. A final note on the non-counting property that all human languages seem to have. Modulo-counting is important in various forms of behaviour, particularly in music: complex periodic structures occur in many compositions. The brain is quite effective at processing such periodic vocalisations or motor sequences, as in percussion playing. We are therefore inclined to think that the non-counting property of language has to do with other constraints external to the brain, such as the excessive noise-sensitivity of such languages: a cough could easily transform an odd string into an even one, thus subverting the grammaticality and meaning of an utterance.
4 Conclusion We believe that the associative language description model explains fundamental syntactic phenomena more convincingly, in terms of brain behaviour, than earlier attempts with context-free grammars. In spite of obvious limitations and drastic simplification, the model should provide a good basis for extension and refinement. We summarise our findings and add a few remarks. The ALD model is based on constituent structures and associative memory. Whereas the former provides the basis of the classical Chomskian grammars, the latter has been rejected by linguists as inadequate for discriminating sentences from non-sentences. Yet associative processing is a fundamental mechanism of the brain and plays an important role in speech understanding and language production. The new model is based on the mathematical theory of local testability and counter-free languages, research into which began in the 1960s, but until now it has not been combined with constituent structures. Loosely related ideas have been studied in the area of programming languages, where the concept of precedence of operators was
introduced by Floyd [15], to make parsing deterministic. He proposed that digrams (called precedence relations) be used to detect the left and right border of a constituent. The fact that non-counting operator precedence grammars can be easily inferred from examples was noticed in [9] and the formal model was studied in [13]. Several mathematical aspects could be investigated in the vein of formal language theory. One such aspect concerns weak generative capacity: can any regular language be generated by an ALD? (the same question for CF languages had a negative answer in [11]). But a more relevant research project would be to assess the linguistic adequacy of the model, preferably by implementing the learning procedures that would allow ALD to be automatically produced from a linguistic corpus. Such a research programme would enable the local testability model to be tuned so that, for example, compulsory k-grams and long-distance relations could be specified and Boolean operations could be applied to sets. The possibility of validating the brain model depends on timing analysis and using psycholinguistic findings for purposes of comparison. Likewise, the computer simulation of the proposed cell assemblies should also be feasible. Acknowledgment We would like to thank Alessandra Cherubini, Pierluigi San Pietro, and Friedemann Pulvermüller.
References [1] M.Abeles, Local Cortical Circuits: An Electrophysiological Study. Springer, New York, 1982. [2] M.Abeles, Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge University Press, Cambridge, 1991. [3] V.Braitenberg, Il Gusto della Lingua: Meccanismi Cerebrali e Strutture Grammaticali. Alpha Beta, Merano, 1996. [4] V.Braitenberg and F.Pulvermüller, Entwurf einer neurologischen Theorie der Sprache, Naturwissenschaften, 79 (1992), 103–117. [5] V.Braitenberg and A.Schüz, Anatomy of the Cortex: Statistics and Geometry. Springer, New York, 1991. [6] J.A.Brzozowski, Hierarchies of aperiodic languages, RAIRO Informatique Théorique, 10 (1976), 33–49. [7] N.Chomsky, Syntactic Structures. Mouton, The Hague, 1957. [8] N.Chomsky and G.Miller, Finitary models of language users. In R. Luce, R.Bush and E.Galanter (eds.), Handbook of Mathematical Psychology. John Wiley, New York, 1963, 112–136.
[9] S.Crespi Reghizzi, Reduction of enumeration in grammar acquisition. In Proceedings of the Second International Conference on Artificial Intelligence, London, 1971, 546–552. [10] S.Crespi Reghizzi, An effective model for grammar inference. In Proceedings of Information Processing 71, Ljubljana, 1972, 524–529. [11] S.Crespi Reghizzi, A.Cherubini and P.L.San Pietro, Languages based on structural local testability. In C.S.Calude and M.J.Dinneen (eds.), Combinatorics, Computation and Logic. Springer, Berlin, 1999, 159–174. [12] S.Crespi Reghizzi, G.Guida and D.Mandrioli, Non-counting context-free languages, Journal of the ACM, 25 (1978), 571–580. [13] S.Crespi Reghizzi, G.Guida and D.Mandrioli, Operators precedence grammars and the non-counting property, SIAM Journal of Computing, 10 (1981), 174–191. [14] S.Crespi Reghizzi, M.Pradella and P.L.San Pietro, Conciseness of associative language descriptions. In J.Dassow and D.Wotschke (eds.), Proceedings of Descriptional Complexity of Automata, Grammars and Related Structures, Universität Magdeburg, 1999, 99–108. [15] R.W.Floyd, Syntactic analysis and operator precedence, Journal of the ACM, 10 (1963), 316–333. [16] E.M.Gold, Language identification in the limit, Information and Control, 10 (1967), 447–474. [17] S.A.Greibach, The hardest context-free language, SIAM Journal of Computing, 2 (1973), 304–310. [18] D.O.Hebb, The Organization of Behaviour: A Neuropsychological Theory. John Wiley, New York, 1949. [19] R.Jackendoff, X’ Syntax: A Study of Phrase Structure. MIT Press, Cambridge, Mass., 1977. [20] M.Kanazawa, Learnable Classes of Categorial Grammars. CSLI, Stanford, Ca., 1998. [21] K.S.Lashley, The problem of serial order in behavior. In L.A.Jeffress (ed.), Cerebral Mechanisms in Behavior. John Wiley, New York, 1951, 112–136. [22] R.McNaughton and S.Papert, Counter-Free Automata. MIT Press, Cambridge, Mass., 1971. [23] J.E.Pin, Variétés de Langages Formels. Masson, Paris, 1984. [24] F.Pulvermüller, Syntax und Hirnmechanismen: Perspektive einer multidisziplinären Sprachwissenschaft, Kognitionswissenschaft, 4 (1994), 17–31.
[25] A.Salomaa, Formal Languages. Academic Press, New York, 1973.
[26] A.Wickelgren, Context-sensitive coding, associative memory, and serial order in (speech) behavior, Psychological Review, 76 (1969).
The Power and Limitations of Random Context Sigrid Ewert1 Department of Computer Science University of Bremen Germany
Andries van der Walt Department of Computer Science University of Stellenbosch South Africa
Abstract. We use random context picture grammars to generate pictures through successive refinement. The productions of such grammars are context-free, but their application is regulated—‘permitted’ or ‘forbidden’—by contexts that are randomly distributed in the developing picture. Grammars using this relatively weak context often succeed where context-free grammars fail, e.g., in generating the typical iteration sequence of the Sierpiński carpet. We were also able to develop iteration theorems for three subclasses of these grammars; finding necessary conditions is problematic for most models of context-free picture grammars with context-sensing ability, since they consider a variable and its context as a connected unit. We give two detailed examples of picture sets generated with random context picture grammars. Then we show how to construct a picture set that cannot be generated using random context only. 1
Postdoctoral fellow with a scholarship from the National Research Foundation, South Africa.
Figure 1: Production of a random context picture grammar
1 Introduction Random context picture grammars (RCPGs), a method of syntactic picture generation, have been described and studied elsewhere [1], [3], [2], [4]. The model was generalized in [5]. We formally introduce RCPGs in Section 2. In Section 3, we give two detailed examples of picture sets generated with these grammars. Finally, in Section 4, we show how to construct a picture set that cannot be generated using random context only.
2 Definitions We generate pictures using productions such as those in Figure 1, where A is a variable, m僆{1, 2, 3,…}, x11, x12,…, xmm are variables or terminals, and and are sets of variables. The interpretation is as follows: if a developing picture contains a square labeled A and if all variables of and none of label the squares in the picture, then the square labeled A may be divided into equal squares with labels x11, x12,…, xmm. In order to cast the formulation into a more linear form, we denote the square with sides parallel to the axes, lower lefthand vertex at (s, t) and upper righthand vertex at (u, v) by ((s, t), (u, v)). We use lowercase Greek letters for such constructs. Thus, (A, ␣) denotes a square α labeled A. Furthermore, if α is such a square, ␣11, ␣12,…, ␣mm denote the equal squares into which α can be divided, with, eg., α11 denoting the bottom left one. A random context picture grammar (RCPG) G=(VN, VT, P, (S, )) has a finite alphabet V of labels, consisting of disjoint subsets VN of variables and VT of terminals. P is a finite set of productions of the form A→[x11, x12, …, xmm] , where , and . Finally, there is an initial labeled square (S, ) with .
A pictorial form is any finite set of nonoverlapping labeled squares in the plane. If Π is a pictorial form, we denote by l(Π) the set of labels used in Π. The size of a pictorial form Π is the number of squares contained in it, i.e. |Π|. For an RCPG G and pictorial forms Π and Γ, we write Π ⇒ Γ if there is a production A→[x11, x12,…, xmm] (𝒫; ℱ) in G, Π contains a labeled square (A, α), 𝒫⊆l(Π∖{(A, α)}), ℱ∩l(Π∖{(A, α)})=∅, and Γ=(Π∖{(A, α)})∪{(x11, α11), (x12, α12),…, (xmm, αmm)}. As usual, ⇒* denotes the reflexive transitive closure of ⇒. A picture is a pictorial form Π with l(Π)⊆VT. The gallery generated by a grammar G=(VN, VT, P, (S, σ)) is the set of pictures Π such that {(S, σ)} ⇒* Π. Note. To improve legibility, we write productions of the type A→[x11] as A→x11.
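As a concrete reading of these definitions, the following Python sketch performs one derivation step of an RCPG; the dictionary representation of a pictorial form, the tuple-based square identifiers and the function name are our own simplifications, and the geometry of the squares is ignored:

```python
import random

def derive_step(pictorial_form, productions):
    """One derivation step of a random context picture grammar (sketch).
    pictorial_form: dict mapping a square id (a tuple) to its label.
    A production (A, children, permitted, forbidden) applies to a square
    labeled A iff all permitted labels and no forbidden labels occur in
    the rest of the developing picture."""
    applicable = []
    for sq, label in pictorial_form.items():
        rest = {v for k, v in pictorial_form.items() if k != sq}
        for (A, children, permitted, forbidden) in productions:
            if label == A and permitted <= rest and not (forbidden & rest):
                applicable.append((sq, children))
    if not applicable:
        return False
    sq, children = random.choice(applicable)  # contexts are randomly distributed
    del pictorial_form[sq]
    for position, child in enumerate(children):  # refine into m*m equal squares
        pictorial_form[sq + (position,)] = child
    return True

# Production (1) of the carpet grammar of Example 1 below:
productions = [('S', ['T', 'T', 'T', 'T', 'w', 'T', 'T', 'T', 'T'], set(), {'U'})]
picture = {(): 'S'}
derive_step(picture, productions)
print(sorted(picture.values()))  # ['T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'w']
```

Note how the permitting and forbidding sets are tested against the labels of the developing picture as a whole; this is exactly the ‘random context’ regulation.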
3 The Power of Random Context We give an indication of the power of random context picture grammars by showing two examples of galleries that can be generated using this type of context. Example 1 (Sierpiński carpet) Consider the iteration sequence of the Sierpiński carpet, members of which are shown in Figures 2 and 3. A context-free grammar cannot generate such a sequence, because it cannot ensure that all regions of a given picture have the same degree of refinement. This gallery can be created with the RCPG Gcarpet=({S, T, U, F}, {w, b}, P, (S, ((0,0), (1,1)))), where P is the set:
S → [T, T, T, T, w, T, T, T, T] ({}; {U})  (1)
T → U ({}; {S, F})  (2)
T → F ({}; {S, U, F})  (3)
T → b ({F}; {})  (4)
U → S ({}; {T})  (5)
F → b ({}; {T})  (6)
We associate the colours ‘light’ and ‘dark’ with the terminals ‘w’ and ‘b’, respectively.
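Since the only role of the control variables U and F is to force every region to reach the same depth before terminating, the resulting iteration sequence can be reproduced directly; here is a minimal Python sketch under our own conventions ('b' for dark, 'w' for light), bypassing the grammar's control machinery:

```python
def carpet(n):
    """n-th member of the Sierpinski carpet iteration sequence, as the
    grammar's synchronized refinement guarantees: every region reaches
    the same depth n before terminating."""
    grid = [['b']]
    for _ in range(n):
        size = len(grid)
        new = [['b'] * (3 * size) for _ in range(3 * size)]
        for r in range(size):
            for c in range(size):
                for dr in range(3):
                    for dc in range(3):
                        # a light cell stays light; each dark cell gets a light centre
                        if grid[r][c] == 'w' or (dr == 1 and dc == 1):
                            new[3 * r + dr][3 * c + dc] = 'w'
        grid = new
    return grid

for row in carpet(2):
    print(''.join(row))
```

carpet(1) and carpet(2) correspond to the pictures in Figures 2 and 3.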
Figure 2: Sierpiński carpet: first refinement
Figure 3: Sierpiński carpet: second refinement
The initial square labeled S is divided into nine equally big squares, of which the middle square is labeled w and the others T (1). This pictorial form can now derive a picture or a more refined pictorial form. The decision is made by any T. If T decides to terminate, it produces an F (3). The other T’s, on sensing the F, each produce a b (4). Once there are no T’s left in the pictorial form, F also produces a b (6). Alternatively, T produces a U and all other T’s follow suit (2). Once this has been done, each U is replaced by an S (5) and the process is repeated. Example 2 Consider the following gallery. A picture consists of 2^i, i≥0, identical lanes. The upper half of a lane consists of 2^i identical isosceles triangles which are next to each other, and which are grey on a light background. The lower half of a lane is again divided into 2^j, j≥1, sub-lanes, the upper half of which is dark and the lower half light. Examples are given in Figures 4 and 5. This gallery can be created with the following RCPG, where P is the set:
(7)–(26)
We associate the colours ‘light’, ‘dark’ and ‘grey’ with the terminals ‘w’, ‘b’ and ‘f’, respectively. A picture in this gallery is generated in two phases. In the first phase, the canvas is divided into lanes (7, 9, 10). In the second phase, a row of triangles is generated in the upper half of each lane (8, 11–20), while the lower half is divided into sub-lanes (8, 21–26). The triangles and sub-lanes are generated independently of each other. The first phase proceeds as follows. At the beginning of the i-th, i≥1, iteration of the loop (7, 9, 10), the pictorial form Πi consists of 4^(i-1) equally big squares, all labeled S. This pictorial form can derive the pictorial form Πi+1 or start filling in the existing squares with triangles and stripes. The decision is made by any S. If S decides to generate more squares, the square containing it is divided in four and each quarter labeled T; all other S’s follow suit (7). Once this has been done, each T is stored in U (9), the S’s are restored (10) and an S must again make the decision described above. If, on the other hand, S decides to start decorating the existing lanes with triangles and stripes, the square containing it is divided in four and the lower two quarters labeled Istripe, while the upper left and right quarters are labeled Il-tri and Ir-tri, respectively (8). Each Ir-tri generates an isosceles right triangle with the right angle on the left, while each Il-tri generates an isosceles right triangle with the right angle on the right. The smoothness of the hypotenuse is determined by any Ir-tri. If Ir-tri decides to smoothen the hypotenuse further, the square containing it is divided in four and the lower left quarter labeled b, the upper right quarter w, and the remaining two quarters Tr-tri; all other occurrences of Ir-tri follow suit (11). Each Il-tri, on sensing Tr-tri in the pictorial form, smoothens the hypotenuse it is part of: the square containing it is divided in four and the lower right quarter labeled b, the upper left w, and the remaining two quarters Tl-tri (17). Once this has been done, each Tr-tri is stored in Ur-tri (14) and each Tl-tri in Ul-tri (19). Then Ir-tri and Il-tri are restored (15 and 20, respectively) and the aforementioned decision by an Ir-tri must be made again.
If Ir-tri decides to complete the existing triangles, the square containing it is divided in four and the two lower quarters labeled b, the upper left Ftri and the upper right w (12).
Figure 4: Triangles and stripes: medium refinement
Figure 5: Triangles and stripes: high refinement
All other Ir-tri’s, on sensing Ftri in the pictorial form, generate the final edge, which consists of a w in the upper right quarter and b’s in the rest (13). Similarly, each Il-tri, on sensing Ftri, generates a final edge consisting of a w in the upper left quarter and b’s in the rest (18). Each Istripe generates 2^i, i≥0, sub-lanes. With each execution of the loop (21, 24 and 25), the number of sub-lanes is doubled. Whether or not the loop is executed is decided by any Istripe. If Istripe decides to repeat the loop, it creates four quarters all labeled Tstripe; all Istripe’s follow suit (21). Once this has been done, each Tstripe is stored in Ustripe (24), the Istripe’s are restored (25) and the decision by an Istripe must be made again. If Istripe decides to colour the existing sub-lanes, the square containing it is divided in four and the two lower quarters are labeled w, while the upper left is labeled Fstripe and the upper right f (22). All other occurrences of Istripe, on sensing Fstripe in the pictorial form, generate four quarters, of which the two lower ones are labeled w and the upper ones f (23). Once this has been done, Fstripe produces an f (26).
4 The Limitations of Random Context We now turn to the limitations of RCPGs. We prove that every gallery generated by an RCPG of a certain type has a property which we call commutativity at level m. This enables us to construct a picture set that cannot be generated using random context only. To facilitate the description, we consider only pictures occupying the unit square ((0, 0), (1, 1)). A picture is called n-divided, for n≥1, if it consists of 4^n equal subsquares, each labeled with a terminal. A level-m subsquare of an n-divided picture, with 1≤m≤n, is a square ((x2^(-m), y2^(-m)), ((x+1)2^(-m), (y+1)2^(-m))), where x and y are integers and 0≤x, y.

But somehow we have to unify the different structures in such a way that we have only one language: the language of the system. This is where the ‘master’ comes in. The ‘master’ is a special component without axiom that can start its work just when it receives strings from every component that makes up the system. It has to unify all that information and generate the language of the system. In this way, the language generated by a Linguistic Grammar System will be the result of putting in correspondence (via the master) all the structures generated by the components of the system. Summing up, a Linguistic Grammar System is a PCGS with renaming, with separate alphabets, composed of CDGS (with output and input filters), non-returning, non-centralized and having two types of communication (request and command). Before embarking on the formal definition of a Linguistic Grammar System, let us see how it looks in a picture:
3 In PCGS with Renaming we add ‘weak codes’ to the basic model that allow us to translate the strings generated by a component before communicating them to another module.
3.1 An Attempt to Formally Define Linguistic Grammar Systems We assume that the reader is familiar with the basics of Formal Language Theory. For more information we refer to [19], [17]. Definition 1 A Linguistic Grammar System of degree n+m, with n, m≥1, is an (n+m+1)-tuple: Γ=(K, (γ1, I1), (γ2, I2, O2),…, (γn, In, On), h1,…, hm) where:
• K={Q1,…, Qn, q1,…, qn} are query symbols, their indices 1,…, n pointing to the components γ1,…, γn, respectively. Qi refers to the whole string of the i-th component, while qi refers to a substring of the i-th component.
• (γ1, I1), (γ2, I2, O2),…, (γn, In, On) are the components of the system: – γ1=(N1, T1, G1,…, Gk, f1) is the ‘master’ of the system, where: * N1 is the non-terminal alphabet. * T1 is the terminal alphabet. * γ1 has no axiom. * Gr=(N1, T1, Pr), for 1≤r≤k, is a usual Chomsky grammar, where: · N1 is the non-terminal alphabet. · T1 is the terminal alphabet.
· Pr are finite sets of rewriting rules over N1∪T1∪K∪K′, where K′={[hj, Qi] | 1≤i≤n, 1≤j≤m}, and every [hj, Qi] is a symbol. * f1 is the derivation mode of γ1. – γi=(Ni, Ti, Si, G1,…, Gk, fi), for 2≤i≤n, is a CD Grammar System where: * Ni is the non-terminal alphabet. * Ti is the terminal alphabet. * Si is the axiom. * Gr=(Ni, Ti, Pr), for 1≤r≤k, is a usual Chomsky grammar, where: · Ni is the non-terminal alphabet. · Ti is the terminal alphabet. · Pr are finite sets of rewriting rules over Ni∪Ti∪K∪K′, where K′={[hj, Qi] | 1≤i≤n, 1≤j≤m}, and every [hj, Qi] is a symbol. * fi is the derivation mode of γi. – I1 is the input filter of the master. – Ii, 2≤i≤n, is the input filter of the i-th component. – Oi, 2≤i≤n, is the output filter of the i-th component.
• hj, 1≤j≤m, are weak codes such that: – hj(A)=A, for every non-terminal A; – hj(a) is a terminal symbol or λ, for every terminal a.
We write Vi=Ni∪Ti and V=V1∪…∪Vn. Sets Ni, Ti, K, K′ are mutually disjoint for any i, 1≤i≤n. We do not require Ni∩Nj=∅ for 1≤i, j≤n, i≠j. Definition 2 Given a Linguistic Grammar System Γ=(K, (γ1, I1), (γ2, I2, O2),…, (γn, In, On), h1,…, hm), its state is described at any moment by an n-tuple (x1,…, xn), where each xi∈Vi*, 1≤i≤n, represents the string that is available at node i at that moment. Definition 3 Given a Linguistic Grammar System Γ=(K, (γ1, I1), (γ2, I2, O2),…, (γn, In, On), h1,…, hm), for two n-tuples (x1, x2,…, xn), (y1, y2,…, yn), with xi, yi∈Vi*, 1≤i≤n, we write (x1,…, xn) ⇒ (y1,…, yn) if one of the following cases holds:
1. For each i, 1≤i≤n, we have |xi|K=0 and |xi|K′=0, and xi does not pass the output filter of its CDGS γi; then, for each i, 1≤i≤n, we have xi ⇒ yi in the CDGS γi, or xi is a terminal string and xi=yi. For each γi, with xi, yi∈Vi*, we write xi ⇒ yi iff there are strings z1,…, zk+1 such that:
• xi=z1 and yi=zk+1;
• zj ⇒ zj+1 in one of the grammars of γi, working in the derivation mode fi, 1≤j≤k.
2. There is an i, 1≤i≤n, such that |xi|K>0; then, for each such i, we write xi=z1Qi1z2Qi2…ztQitzt+1, t≥1, with |zj|K=0, 1≤j≤t+1; if |xij|K=0, 1≤j≤t, then yi=z1xi1z2xi2…ztxitzt+1, providing that the communicated strings pass the input filter Ii; when, for some j, 1≤j≤t, |xij|K>0, then yi=xi; for all i, 1≤i≤n, for which yi is not specified above, we have yi=xi.
3. There is an i, 1≤i≤n, such that |xi|K′>0; then, for each such i, we write xi=z1[hj1, Qi1]z2[hj2, Qi2]…zt[hjt, Qit]zt+1, t≥1, with |zj|K′=0, 1≤j≤t+1; if |xij|K=0, 1≤j≤t, then yi=z1hj1(xi1)z2hj2(xi2)…zthjt(xit)zt+1, providing that the communicated strings pass the input filter Ii; when, for some j, 1≤j≤t, |xij|K>0, then yi=xi; for all i, 1≤i≤n, for which yi is not specified above, we have yi=xi.
4. (x1,…, xn) ⇒ (y1,…, yn) iff yi=xixj1xj2…xjk, for i=1,…, n, where j1<…<jk are exactly the indices j≠i such that xj passes the output filter Oj and the input filter Ii.
Point 1 defines a rewriting step, whereas points 2, 3 and 4 define communication steps. In 1 no query symbol Qi, qi, or [hj, Qi] is present in the current string and the string doesn’t match the output filter of the CDGS, so no communication (by request or command) can be done. In this case we perform a rewriting step. In 2 we define a communication step by request without renaming. Some query symbols, say Qi1,…, Qil (or qi1,…, qil), appear in a string xi. In this case rewriting stops and some communication steps are performed. Every symbol Qij (or qij), 1≤j≤l, must be replaced with the current string (or substring) of the component γij, assuming that no xij, 1≤j≤l, contains a query symbol. If one of the strings xij, 1≤j≤l, also contains query symbols, these symbols must be replaced with the requested strings before communicating that string. In 3, we define a communication step by request with renaming. In this case some query symbols [hj, Qi] appear in a string xi. Everything works as in the case of communication by request without renaming, with the only difference that here [hj, Qij] must be replaced not by the string xij, but by hj(xij). And finally, in 4, we define a communication step by command. In this case, copies of those strings which are able to pass the output filter of some γj and the input filter of some γi (i≠j) join (concatenated in the order of the system components) the string present at γi.
Definition 4 The language generated by a Linguistic Grammar System as above is: L(Γ)={x∈T1* | (λ, S2,…, Sn) ⇒* (x, α2,…, αn), αi∈Vi*, 2≤i≤n}. Notice that we start with the set of axioms of the components γ2,…, γn of a Linguistic Grammar System and with an empty string in the master module (γ1). We perform derivation and communication steps until the master (which has no axiom) produces a terminal string, x. 4 Final Remarks Our aim in this paper was to show the possible adequacy of Grammar Systems Theory in the field of Linguistics. We have presented some traits that may justify applying this theory to the study of language. Modularity, easy generation of non-context-free structures, parallelism, interaction, cooperation, distribution and other features have been adduced as important notions for language that can be captured by using Grammar Systems to study linguistic matters. By no means was our purpose to make an exhaustive review of all the aspects that prove that this theory is suitable in Linguistics. Nevertheless, we have attempted to show its applicability by means of a new variant of grammar systems, the so-called Linguistic Grammar Systems, which we have introduced to show how the different modules that make up a grammar work and interact with one another to generate an acceptable language structure. We have informally presented the model and attempted to give a formal definition. We know how difficult introducing a new theory in the study of natural language can be but, taking into account the properties of Grammar Systems, we think it would be worthwhile trying to apply this theory in Linguistics. This paper is just one small example of how valuable the application of Grammar Systems Theory in Linguistics could be.
Notice that we start with the set of axioms of the components γ2,…, γn of a Linguistic Grammar System and with an empty set in the master module (γ1). We perform derivation and communication steps until the master (which has not axiom) produces a terminal string, x. 4 Final Remarks Our aim in this paper was to show the possible adequacy of Grammar Systems Theory in the field of Linguistics. We have presented some traits that may justify applying this theory to the study of language. Modularity, easy generation of non-context-free structures, parallelism, interaction, cooperation, distribution and other features have been adduced as important notions for language that can be captured by using Grammar Systems to study linguistic matters. By no means was our purpose to make an exhaustive revision of all the aspects that prove that this theory is suitable in Linguistics. Nevertheless, we have attempted to show its applicability by means of a new variant of grammar systems -the so-called Linguistic Grammar Systems- which we have introduced to show how the different modules that make up a grammar work and interact with one another to generate an acceptable language structure. We have informally presented the model and attempted to give a formal definition. We know how difficult introducing a new theory in the study of natural language can be but, taking into account the properties of Grammar Systems, we think it would be worthwhile trying to apply this theory in Linguistics. This paper is just one small example of how valuable the application of Grammar Systems Theory in Linguistics could be. References [1] E.Csuhaj-Varjú, Grammar systems: a multi-agent framework for natural language generation. In Gh.Paun (ed.), Mathematical Aspects of Natural and Formal Languages. World Scientific, Singapore, 1994, 63–78.
[2] E.Csuhaj-Varjú, J.Dassow, J.Kelemen and Gh.Paun, Grammar Systems: A Grammatical Approach to Distribution and Cooperation. Gordon and Breach, London, 1994.
[3] E.Csuhaj-Varjú and M.D.Jiménez-López, Cultural eco-grammar systems: a multi-agent system for cultural change. In A.Kelemenová (ed.), Proceedings of the MFCS’98 Satellite Workshop on Grammar Systems, Silesian University, Opava, Czech Republic, 1998, 165–182.
[4] E.Csuhaj-Varjú, M.D.Jiménez-López and C.Martín-Vide, ‘Pragmatic eco-rewriting systems’: pragmatics and eco-rewriting systems. In Gh.Paun and A.Salomaa (eds.), Grammatical Models of Multi-Agent Systems. Gordon and Breach, London, 1999, 262–283.
[5] E.Csuhaj-Varjú, J.Kelemen and Gh.Paun, Grammar systems with WAVE-like communication. Computers and AI, 15/5 (1996), 419–436.
[6] J.Dassow and Gh.Paun, Regulated Rewriting in Formal Language Theory. Springer, Berlin, 1989.
[7] J.Dassow, Gh.Paun and G.Rozenberg, Grammar systems. In G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997, vol. 2, 155–213.
[8] R.Jackendoff, The Architecture of the Language Faculty. MIT Press, Cambridge, 1997.
[9] M.D.Jiménez-López, Sistemas de gramáticas y lenguajes naturales: ideas intuitivas al respecto. In C.Martín-Vide (ed.), Lenguajes Naturales y Lenguajes Formales XII, PPU, Barcelona, 1996, 223–236.
[10] M.D.Jiménez-López, Cultural eco-grammar systems: agents between choice and imposition. A preview. In G.Tatai & L.Gulyás (eds.), Agents Everywhere. Springer, Budapest, 1999, 181–187. [11] M.D.Jiménez-López, Grammar Systems: A Formal-Language-Theoretic Framework for Linguistics and Cultural Evolution, PhD Dissertation, Universitat Rovira i Virgili, Tarragona, 2000. [12] M.D.Jiménez-López and C.Martín-Vide, Grammar systems for the description of certain natural language facts. In Gh.Paun, A.Salomaa (eds.), New Trends in Formal Languages. Springer, Berlin, 1997, 288–298. [13] M.D.Jiménez-López and C.Martín-Vide, Grammar Systems and Autolexical Syntax: Two Theories, One Single Idea. In R.Freund & A.Kelemenová (eds.), Grammar Systems 2000. Silesian University, Opava, 2000, 283–296. [14] V.Mihalache, PC grammar systems with separated alphabets. Acta Cybernetica, 12/4 (1996), 397–409. [15] V.Mitrana, Gh.Paun and G.Rozenberg, Structuring grammar systems by priorities and hierarchies. Acta Cybernetica, 11/3 (1994), 189–204. [16] Gh.Paun, PC grammar systems and natural languages, private communication. [17] G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997.
[18] J.M.Sadock, Autolexical Syntax. A Theory of Parallel Grammatical Representations. University of Chicago Press, Chicago, 1991. [19] A.Salomaa, Formal Languages. Academic Press, New York, 1973.
Multi-Bracketed Contextual Rewriting Grammars with Obligatory Rewriting Martin Kappes Fachbereich Informatik Johann Wolfgang Goethe University Frankfurt am Main, Germany
Abstract. We study the generative capacity and closure properties of multi-bracketed contextual rewriting grammars with obligatory rewriting. This model is a generalization of multi-bracketed contextual rewriting grammars. They possess an induced Dyck-structure to control the derivation process and to provide derivation trees. It will be shown that this class of grammars is closed under intersection with regular sets. 1 Motivation and Introduction Contextual grammars were introduced by Marcus in [8]. They are a formalization of the linguistic idea that more complex well formed strings are obtained by inserting contexts into already well formed strings. Therefore, these grammars are based on the principle of adjoining. Multi-bracketed contextual grammars were introduced in [5]. Generally speaking, this is a class of contextual grammars working on an induced Dyck-structure which controls the derivation process and also provides context-free-like derivation trees. They were generalized to multi-bracketed contextual rewriting grammars in [7].
Here we will study a further generalization of this model called multi-bracketed contextual rewriting grammars with obligatory adjoining. In the models studied in [5] and [7], each string derived by a grammar G in a finite number of derivation steps belongs to the language generated by G. We will now “filter” these strings by imposing the restriction that only certain generated strings are in the language of G, just as only strings consisting exclusively of terminal symbols are in the language generated by a Chomsky grammar. We will briefly study the generative capacity of these grammars. Our main result is that, in contrast to the classes investigated in [5] and [7], the classes of languages generated by those grammars are closed under intersection with regular sets. All the models investigated in this paper are based on so-called internal contextual grammars which were introduced by Paun and Nguyen in [11]. Detailed information on contextual grammars can be found in the monograph [10]; a more compressed source of information is [1]. The first approach to induce a bracket structure in contextual grammars was so called bracketed contextual grammars, introduced by Martín-Vide and Paun in [9]; the generative capacity of this class was studied in [6]. 2 Definitions Let Σ* denote the free monoid generated by the finite alphabet Σ and Σ+=Σ*−{λ}, where λ denotes the empty word. FIN, REG, CF and CS denote the families of finite, regular, context-free and context-sensitive languages. We assume the reader is familiar with the common notions of formal language theory as presented in [3]. For an alphabet Γ we define the projection to Γ via prΓ(σ)=σ if σ∈Γ, and prΓ(σ)=λ otherwise. Let ∆ denote a finite set of indices. By B∆={[A, ]A | A∈∆} we define the bracket alphabet induced by ∆. By D∆ we denote the Dyck-language over B∆ (see [2]). Let Σ and ∆ denote two alphabets. The set of all Dyck-covered words over Σ with respect to the index alphabet ∆, denoted DC(Σ, ∆), is given by:
Furthermore, for each A∈∆ we define DCA as the set of all Dyck-covered words whose outermost pair of brackets is indexed by A. Throughout the paper we always assume Σ∩B∆=∅. Notice that the first and the last symbol of each non-empty Dyck-covered word is a pair of brackets belonging together. It is easy to see that each Dyck-covered word can be interpreted as a unique encoding for a tree, where ∆ is the label alphabet for the internal nodes and Σ is the label alphabet for the leaf nodes in the following way: a string
Figure 1: The derivation process in an MBICR-grammar: a context (µ, ν) may be adjoined to a tree α=α1[Aα2]Aα3, yielding a tree β=α1µ[Bα2]Bνα3, if and only if there is a production (S, C, K, H)∈P such that prΣ(α2)∈S, (µ, ν)∈C, A∈K and B∈H. In the above figure, we have prΣ(α1)=w1, prΣ(α2)=w2, prΣ(α3)=w3, prΣ(µ)=u and prΣ(ν)=v.
is identified with a tree where the root is labelled by A, and the subtrees of the root are determined by the unique decomposition of α=[Aα1α2…αn]A such that αi∈Σ∪DC(Σ, ∆), 1≤i≤n. A multi-bracketed contextual rewriting grammar (MBICR) is a tuple G=(Σ, ∆, Ω, P), where Σ is a finite set of terminals, ∆ is a finite set of indices, Ω⊆DC(Σ, ∆) is a finite set of axioms and P is a finite set of tuples (S, C, K, H), where S⊆Σ+, K, H⊆∆, and C is a finite set of pairs of strings such that, for all (µ, ν)∈C, we have µν∈DC(Σ, ∆). The derivation relation on DC(Σ, ∆) is defined by α ⇒ β if and only if α=α1[Aα2]Aα3, β=α1µ[Bα2]Bνα3, and there is a (S, C, K, H)∈P such that prΣ(α2)∈S, (µ, ν)∈C, A∈K and B∈H. The structure language generated by G (or strong generative capacity of G) is T(G)={β∈DC(Σ, ∆) | there is an α∈Ω such that α ⇒* β}, where ⇒* denotes the reflexive transitive closure of the derivation relation. The language generated by G (or weak generative capacity of G) is L(G)=prΣ(T(G)). An MBICR-grammar G=(Σ, ∆, Ω, P) is with F-choice for a family of languages F, if S∈F for all (S, C, K, H)∈P. By MBICR(F) we denote the set of all languages which can be generated by an MBICR(F)-grammar. The derivation process in an MBICR-grammar is illustrated in Figure 1. A special case occurs in MBICR(F)-grammars, where for all (S, C, K, H)∈P there is an A∈∆ such that K=H={A} and µν∈DCA for all (µ, ν)∈C. These grammars are called MBIC-grammars and were studied in [5]. In MBICR-grammars, we have β∈T(G) for each β such that there is an α∈Ω with α ⇒* β, and prΣ(β)∈L(G) for each β∈T(G). In this sense, each derivation step immediately yields a string in T(G) and L(G).
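The tree interpretation of Dyck-covered words can be made explicit with a short Python sketch; the token spelling ('[A' and ']A' for the bracket pair indexed by A) and the function name are our own conventions:

```python
def parse_dyck_covered(tokens):
    """Parse a Dyck-covered word, given as a token list, into a nested
    (label, children) tree; terminal symbols become leaves."""
    def parse(pos):
        tok = tokens[pos]
        assert tok.startswith('['), "a Dyck-covered word starts with a bracket"
        label, children, pos = tok[1:], [], pos + 1
        while not tokens[pos].startswith(']'):
            if tokens[pos].startswith('['):
                subtree, pos = parse(pos)   # recurse into a bracketed subtree
                children.append(subtree)
            else:
                children.append(tokens[pos])  # terminal leaf
                pos += 1
        assert tokens[pos] == ']' + label, "brackets must match"
        return (label, children), pos + 1
    tree, end = parse(0)
    assert end == len(tokens)
    return tree

print(parse_dyck_covered(['[A', '[B', ']B', '[B', 'b', 'b', 'c', ']B', ']A']))
# -> ('A', [('B', []), ('B', ['b', 'b', 'c'])])
```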
Analogously to the concept of terminal symbols in Chomsky grammars, we will now impose the restriction that only some indices ϒ⊆∆ are considered “valid” and that a string β derived from an axiom in a finite number of derivation steps is only in T(G) if β∈DC(Σ, ϒ), i.e. it only contains brackets with “valid” indices. A multi-bracketed contextual rewriting grammar with obligatory adjoining (MBICRO) is a tuple G=(Σ, ∆, ϒ, Ω, P), where Σ, ∆, Ω and P are defined as in an MBICR-grammar and ϒ⊆∆ is a set of permitted indices. The derivation process is defined as in an MBICR-grammar but the strong and weak generative capacities of G are given by T(G)={β∈DC(Σ, ϒ) | there is an α∈Ω such that α ⇒* β} and L(G)=prΣ(T(G)). Thus, all brackets indexed by symbols from ∆−ϒ have to be replaced during the derivation process in order to obtain a string in T(G). Let us consider the following example: G=({a, b, c, d, e}, {A, B}, {A}, {[Aa[Bbc]Bd]A}, {π1, π2}), where: π1=(Σ+, {([Aa[Bb, c]Bd]A)}, {B}, {A}), π2=(Σ+, {([Ae, e]A)}, {B}, {A}). It is not difficult to see that using π1 i times yields a derivation:
In order to obtain a string in T(G) we have to use production π2 exactly once to remove the pair of brackets indexed by B from the sentential form. After applying π2 once, no further derivation steps are possible. Hence L(G)={a^n e b^n c^n e d^n | n≥1}. 3 Generative Capacity We will now investigate the generative capacity of MBICRO-grammars. Theorem 1 The diagram in Figure 2 represents the relation between the various language classes, where a (dashed) arrow indicates a (not necessarily) strict inclusion and families not linked by a path in the diagram are not necessarily comparable. Proof. For the results about MBIC- and MBICR-grammars we refer the reader to [5] and [7]. Since each language generated by an MBICR-grammar possesses the so-called internal bounded step-property (cf. [10]) and since the language generated in the above example does not possess this property, we have for all families of languages F with
Figure 2: Generative capacity of MBICRO-Grammars.
Lemma 6 in [5] proves that every context-free language can be generated by an MBIC(FIN)-grammar. Lemma 4.5 in [7] proves that for each MBICR(FIN)-grammar G there is a context-free grammar G’ with L(G’)=T(G). Clearly, for each MBICRO(F)-grammar G=(Σ, ∆, ϒ, Ω, P) we have T(G)=T(G″)∩(Σ∪Bϒ)* for the MBICR(F)-grammar G″=(Σ, ∆, Ω, P). As context-free languages are closed under intersection with regular languages and morphisms, we also have MBICRO(FIN)⊆CF. Hence MBICRO(FIN)=CF. In Lemma 4.6 of [7] it was shown that there is a language which can be generated by an MBIC(REG)-grammar but there is no so-called tree adjoining grammar (c.f. [4] for details) generating this language. Furthermore, Lemma 3.2 in [7] presents a construction to convert a given MBICR(Σ+)-grammar into a tree adjoining grammar generating the same language. It is straightforward to modify this construction so that it yields the same result for MBICRO(Σ+)-grammars. Therefore, every language in MBICRO(Σ+) is a tree adjoining language. Let L be a context-sensitive language. We construct an MBICRO-grammar G with context-sensitive selectors generating L. Since the family of context-sensitive languages is closed under quotient with singleton sets, all selector languages are context-sensitive, and it is not difficult to prove L(G)=L. Hence MBICRO(CS)=CS. 4 Closure Properties It is easy to prove that MBICRO(F) is closed under union, concatenation and Kleene-star for suitable families of languages F, by using the same constructions as for MBIC- and MBICR-grammars (cf. [5] for details).
The classes MBIC(F) and MBICR(F) are not, in general, closed under intersection with regular languages. For MBICR-grammars this result follows from the introductory example given in this paper: the language L of that example can be generated by an MBICR(Σ+)-grammar. However, L∩a*eb*c*ed*={a^n e b^n c^n e d^n | n≥1} cannot be generated by any MBICR-grammar. In what follows, we will give a construction to prove that MBICRO(F) is closed under intersection with regular languages for arbitrary families of languages F. Theorem 2 For all families of languages F, MBICRO(F) is closed under intersection with regular languages. Proof. Let G″=(Σ, ∆, ϒ, Ω, P) be an arbitrary MBICRO(F)-grammar and R a regular language. In order to simplify the construction we will first transform G″ into a normal form. Consider the mapping t defined by t(σ)=σ if σ∈B∆ and t(σ)=[σσ]σ if σ∈Σ. It is easy to see that applying the homomorphic extension of t to an α∈DC(Σ, ∆) yields a string t(α)∈DC(Σ, ∆∪Σ), where each symbol σ∈Σ is replaced by [σσ]σ. Furthermore, for the MBICRO(F)-grammar G obtained from G″ by applying t to all axioms and contexts:
we have α∈Ω if and only if there is an α″∈Ω″ such that t(α″)=α; hence L(G)=L(G″). Figure 3 shows an example for this transformation. Since R is regular, there exists a deterministic finite automaton M=(Q, Σ, δ, q0, F) with L(M)=R (cf. [3] for notational details). We construct the index set Φ whose elements are triples (A, [p, q], [r, s]) with A∈∆∪Σ and p, q, r, s∈Q. For an intuitive explanation, let us take a look at the tree interpretation: If a node is labelled by (A, [p, q], [r, s]), then [p, q] is a value propagated from the immediate predecessor of the node stating that this node is supposed to generate a yield w such that δ(p, w)=q. The tuple [r, s] denotes that the immediate successors of the node are supposed to generate a yield w such that δ(r, w)=s. Now we construct a mapping convert which relabels the indices of the brackets in all possible ways such that for the resulting strings the following properties hold: (1) For each partition α=α1α2α3 such that α2 encodes a subtree with nonterminal successors, we have α2=[Xγ1…γn]X, where X=(A, [p, q], [p0, pn]), and the outermost brackets of γi are indexed by Yi=(Bi, [pi-1, pi], [ri, si]), 1≤i≤n. See Figure 4 for an illustration. (2) For each partition α=α1α2α3 such that α2 encodes a leaf, we have α2=[Xσ]X, where X=(σ, [p, q], [r, s]) and δ(r, σ)=s.
Figure 3: Example for the normal form used to prove the closure under intersection with regular sets: the tree α=[A[Ba]B[Bbbc]B]A is converted into the normal form tree t(α)=[A[B[aa]a]B[B[bb]b[bb]b[cc]c]B]A. Notice that prΣ(α)=prΣ(t(α)).
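Under the same token conventions as in the sketch above, the normal-form mapping t simply wraps every terminal in its own bracket pair, so the terminal projection is preserved (a hedged illustration, not the paper's formal construction):

```python
def t(tokens, terminals):
    """Normal-form mapping: each terminal sigma becomes [sigma sigma ]sigma,
    so that the projection onto the terminals is unchanged."""
    out = []
    for tok in tokens:
        if tok in terminals:
            out.extend(['[' + tok, tok, ']' + tok])
        else:
            out.append(tok)   # brackets are left untouched
    return out

alpha = ['[A', '[B', 'a', ']B', '[B', 'b', 'b', 'c', ']B', ']A']
print(t(alpha, {'a', 'b', 'c'}))
# every terminal x is now wrapped as '[x', 'x', ']x'
```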
Figure 4: Example for the mapping convert. Here, a node is shown with its immediate nonterminal successors B1,…, Bn. Applying the mapping convert with [p, q] as set of states to the above tree yields exactly all trees of the above form for arbitrary pi, ri, si, 0≤i≤n.
Formally, we define the mapping convert as follows: if α=[Aα1…αn]A with αi∈DC, 1≤i≤n, then:
convert(α, [p, q])={[Xβ1…βn]X | X=(A, [p, q], [p0, pn]), βi∈convert(αi, [pi-1, pi]), 1≤i≤n, for p0,…, pn∈Q}. If α=[σσ]σ for a σ∈Σ, then: convert(α, [p, q])={[Xσ]X | X=(σ, [p, q], [r, s]) and δ(r, σ)=s for r, s∈Q}.
Furthermore, convert(λ, [p, q])=λ if p=q, and convert(λ, [p, q])=∅ otherwise. Notice that there is exactly one decomposition of the above kind for each α. Analogously, we also have to define a mapping convert2 which relabels the contexts of the grammar G: for an arbitrary (µ, ν) such that µν∈t(DC) and arbitrary [p, q], [p′, q′], we define the mapping convert2(µ, ν, [p, q], [p′, q′])={([Xβ1…βk-1ξ, ρβk+1…βn]X) | X=(A, [p, q], [p0, pn]), βi∈convert(αi, [pi-1, pi]), 1≤i≠k≤n}, where µν=[Aα1…αn]A with αi∈DC, 1≤i≠k≤n, ξρ is the relabelled k-th gap of the context, and pi∈Q, 0≤i≤n. Notice that there is exactly one such decomposition for each (µ, ν) such that µν∈t(DC). We define the grammar G′=(Σ, Φ, Φ′, Ω′, P′), where Ω′ and P′ consist of the converted axioms and productions, respectively.
It is not difficult to prove that, if α∈Ω and α ⇒* β in G, then for all β′∈convert(β, [q0, f]) for an f∈F there is an α′∈convert(α, [q0, f]) such that α′∈Ω′ and α′ ⇒* β′ in G′. On the other hand, if α′∈Ω′ and α′ ⇒* β′ in G′, then there is an f∈F, an α∈Ω and a β such that α′∈convert(α, [q0, f]), β′∈convert(β, [q0, f]) and α ⇒* β in G. Furthermore, since properties (1) and (2) hold for all strings
derived in G′, it is not difficult to see that for each β′∈T(G′) we have prΣ(β′)∈R. Also, for each β∈T(G) such that prΣ(β)∈R there is a β′∈convert(β, [q0, δ(q0, prΣ(β))]) such that β′∈T(G′). Using these facts we can show that L(G′)=L(G)∩R=L(G″)∩R. Since the selector languages of G″ and G′ are identical, the claim follows. 5 Conclusion We have studied the generative capacity and the closure properties of multi-bracketed contextual rewriting grammars with obligatory rewriting. The family of languages generated by an MBICRO(F)-grammar is closed under intersection with regular languages for arbitrary families of languages F. The questions of whether or not the remaining inclusions in Figure 2 are strict remain open. References [1] A.Ehrenfeucht, Gh.Paun and G.Rozenberg, Contextual grammars and formal languages. In G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997, vol. 2, 237–294. [2] M.A.Harrison, Introduction to Formal Language Theory. Addison-Wesley, Reading, Mass., 1978. [3] J.E.Hopcroft and J.D.Ullman, Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, Mass., 1979. [4] A.K.Joshi and Y.Schabes, Tree-adjoining grammars. In G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997, vol. 3, 69–123. [5] M.Kappes, Multi-bracketed contextual grammars, Journal of Automata, Languages and Combinatorics, 3.2 (1998), 85–103. [6] M.Kappes, On the generative capacity of bracketed contextual grammars, Grammars, 1.2 (1998), 91–101. [7] M.Kappes, Multi-bracketed contextual rewriting grammars, Fundamenta Informaticae, 38.8 (1999), 257–280. [8] S.Marcus, Contextual grammars, Revue Roumaine des Mathématiques Pures et Appliquées, 14.10 (1969), 1525–1534. [9] C.Martín-Vide and Gh.Paun, Structured contextual grammars, Grammars, 1.1 (1998), 33–55. [10] Gh.Paun, Marcus Contextual Grammars. Kluwer, Dordrecht, 1997. [11] Gh.Paun and X.M.Nguyen, On the inner contextual grammars, Revue Roumaine des Mathématiques Pures et Appliquées, 25.4 (1980), 641–651.
Semi-Top-Down Syntax Analysis Jaroslav Král Michal Žemlička Faculty of Mathematics and Physics Charles University Prague, Czech Republic
Abstract. We show that the parsers developed for LR grammars can be modified so that they can produce outputs intuitively close to top-down parses for LR grammars. The parsers can be implemented by recursive procedures, i.e. they can behave like recursive descent parsers. We present a class of grammars with left recursive symbols that are well suited to this technique.
1 Introduction Many modern compilers use parsers that are implemented as a collection of recursive procedures (procedure driven parsers, PDP) because they have many software engineering advantages. Such parsers can be well understood by human beings, and they can be easily documented, and manually modified. The manual modification is often necessary to support effectiveness, error recovery, peculiarities of semantics, etc. PDP can be very effective. In the theory of parsing, procedure driven parsing is an implementation of recursive descent parsing. Recursive descent parsing works, however, for LL grammars only. The grammars of programming languages are not LL. PDP in compilers must therefore be adapted by ad hoc modifications so that they can work for non-LL grammars and languages.
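For illustration, here is a minimal recursive descent PDP in Python; the toy LL(1) grammar S → aSb | c and all names are our own, and a real PDP would additionally interleave calls of semantic routines and error recovery:

```python
class Parser:
    """One procedure (method) per nonterminal, as in a PDP."""
    def __init__(self, text):
        self.text, self.pos = text, 0

    def look(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def eat(self, ch):
        assert self.look() == ch, f"expected {ch!r} at position {self.pos}"
        self.pos += 1

    def parse_S(self):
        if self.look() == 'a':
            self.eat('a'); self.parse_S(); self.eat('b')
        else:
            self.eat('c')   # semantic routines would be called here

Parser("aacbb").parse_S()   # succeeds; "aacb" would raise an error
```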
On the other hand, the output of LR parsers in compilers is something more than bottom-up parses in the classical sense, i.e. more than merely the reversed right-most derivations. We show that the output of LR parsers can have a form that contains the maximal possible amount of on-line information about the syntactical structure of the input string and that LR parsers can be implemented as PDP. In order to simplify the explanation, we shall assume a lookahead of length one. We use, unless stated otherwise, the notation from [7] and [1]. All the grammars discussed below are LR(1). The generalization to the general LR(k) case is straightforward. G, G0, G1, G2,…will denote grammars. N, T, P are nonterminal and terminal alphabets and the set of productions, respectively. p1, p2,…are productions. V=N∪T. We write PG, TG, NG, VG if G cannot be understood implicitly. The procedures in procedure driven parsers (PDP) usually contain calls of semantic routines. The actions of any PDP are read (lexical) operations on tokens from the input and calls of semantic routines, all activated from the PDP procedures. This leads us to the following convention. The output of the parser ρ for the input x=x1x2…xn is the string outρ(x)=σ0x1σ1x2σ2…xnσn, where xi are input tokens and σi are sequences of semantic symbols for i=0, 1, 2,…, n. The semantic symbols are new symbols that are different from the symbols in V. We shall further assume, without any substantial loss of generality and/or applicability of the results given below, that the grammars contain no rules of the form A→λ, where λ is the empty string. We further assume that the initial symbol S′ is the left hand side of just one production and that the grammars are reduced, i.e. for each A∈N there are strings x1, x2 from V* and z from T* such that S′ ⇒* x1Ax2 and x1Ax2 ⇒* z. Under these conditions any grammar is completely defined by the set of its productions. We assume that the productions are unambiguously numbered. The productions will often be given in a modified Backus-Naur normal form. In order to present all the relevant information, the production numbers inclusive, we use the notation clear from the following example of the grammar G0:
The full syntax grammar Syn(G) of the grammar G is defined by the productions A → <A.i x >A.i for each production i: A→x of G. Ai,k are new terminal symbols called syntactical symbols. In order to enhance legibility we often write A.i instead of Ai,n. The symbols <A.i and >A.i are left and right s-brackets respectively. 1 G0 is somewhat ugly because it has to show all the cases and peculiarities of the algorithms given below in very little space.
LBG is the set of left s-brackets of G and RBG the set of right s-brackets of G. A homomorphism h on a language K is the semantic homomorphism for G if h(a)=a for all a in TG and h(K)=L(G). Let M be a set of symbols. Then hM is the homomorphism for which hM(a)=a for a∈M, hM(a)=λ otherwise. For every x∈L(G) the complete parse SynG(x) of x is the string y∈L(Syn(G)) such that hT(y)=x. Obviously, for every x∈L(G), hT∪LBG(SynG(x)) corresponds to (codes) the left-most derivation of x, i.e. the top-down parse of x. Similarly, hT∪RBG(SynG(x)) produces a bottom-up parse, i.e. reversed right-most derivations, in an appropriate coding.
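The effect of the homomorphisms hM can be illustrated with a short Python sketch; the s-bracket spellings below are illustrative and do not correspond to a particular grammar:

```python
def h_M(parse, keep):
    """The homomorphism h_M: erase every symbol outside the set 'keep'.
    With keep = terminals + left s-brackets it yields the top-down parse;
    with right s-brackets, the bottom-up one."""
    return [sym for sym in parse if sym in keep]

complete = ['<S.1', 'a', '<A.2', 'b', '>A.2', '>S.1']  # an illustrative complete parse
terminals = {'a', 'b'}
print(h_M(complete, terminals | {'<S.1', '<A.2'}))   # top-down view
print(h_M(complete, terminals | {'>A.2', '>S.1'}))   # bottom-up view
```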
2 Syntax Directed Semantics Semantics Sem on L(G) is a function from L(G) into a language (set of strings) Sem(G) such that hT(Sem(x))=x for every x∈L(G). SynG is a semantics by definition. A semantics sem1 on L(G) covers a semantics sem2 on L(G) if there is a semantic homomorphism h such that for every x∈L(G), sem2(x)=h(sem1(x)). A semantics sem is syntax directed by a grammar G if it is covered by SynG. We say that sem1 covers sem2 via h. A grammar G strongly covers a grammar G1 if L(G)=L(G1) and SynG covers SynG1. The relation strongly covers is incomparable with the relations left covers and right covers [6]. A semantics sem on L(G) is on-line computable if x=x1x2…xsu∈L(G) and y=x1x2…xsw∈L(G), s>0, xi∈T for i=1, 2,…, s, implies sem(x)=σ0x1σ1x2…σsxsu′ and sem(y)=σ0x1σ1x2…σsxsw′; i.e. σi depends only on the left context (the read-off prefix) and the coming symbol. On-line computable semantics can be generated on line with the input. A production may be added for some of the s-brackets; the added s-brackets are then renamed to the original ones: >S.20→>S.2, <S.50→<S.5, >S.50→>S.5
Table 3: The normalized parser; n denotes a state, s a slice. Added items have numbers greater than 22.
Table 4: Run of the parser
R60.1→S5.2. The symbols Q.30, R.60 are replaced with empty strings. The parser can be mechanically transformed into a parser with the same functionality, implementable as a PDP. The stack symbols of are the symbols of slices. n.m is the symbol of the m-th slice of the state n. If performs a read action according to the item i: A→x1.ax2, u belonging to the slice (n.m) and moves to the state t, then pushes the symbols n.1, n.2,…, n.m on the stack replaces the top symbol m.n by the symbol t.1 and gives an output that is identical to the output of . The actions can be viewed as m call operations of procedures slicen_1, slicen_2,…, slicen_m and the first action of the procedure slicen_m. If performs the reduce move RDi for the item that has a production with the left hand side A, then pops the top stack symbol. Now let the be the symbol k.s and let the Go move under new top stack symbol of A in lead from state k to state t. Then the top symbol k.s in is replaced by t.1 and the symbol >A.i is output. This can be interpreted as the operation RTi, which is equivalent to the return from a procedure and the first operation after return. At the start of , the stack of contains only the symbol 1.1, the input string is on the input tape. In order to make it easier to implement in the form of a PDP we modify the way the output of is computed. We assume that calPP has a memory . Let the content of be . With each call and read action A of in the state S the set SA is associated. Before the action A the operation is performed. The content of is output during the read operation and is emptied. The closing action of the read operation is the output of the just read up symbol. The read operation is implemented as the function read returning the read symbol. Let a be the symbol on input. The ο operation on the expressions of the sets theory language defining the sets M and K produces the expression defining the set M, x=x⬘ s.50’) ; return (’S’) END ELSE error; ’B’: BEGIN (* 10.1 *) remember (’{S4,1}’); read; (* 11.1 *) IF on_input ( ‘’ ) THEN BEGIN out (’>S.4’); return (’S’) END ELSE error END © 2003 Taylor & Francis
Table 6: Run of the parser for G1
END END ELSE IF on_input ('b') THEN BEGIN … on_input('a') returns true if the character a is equal to the just read symbol. remember(S) performs the ο operation of S with the memory contents. The description of read is given above. out outputs the string given by the argument. The lookahead is given statically in the procedure bodies. It is also possible to pass it into the procedure via a parameter with the form of a list of pairs: nonterminal, its possible right contexts. The procedure code is needed only once for a given LR(0) kernel (see parsers for LALR grammars [1]). The possible right context is computed dynamically at the place of the call.
5 Kind Grammars The PDP for general LR grammars are quite complex, not easy to modify manually and not easily extensible in the sense proposed in [8]. The grammars of programming languages are “almost” LL. The main problem is that the grammars have left recursive symbols that are usually used to
Figure 1: Parsing structure for E-productions.
define the syntax of arithmetic expressions and lists. A closer look shows that a limited form of left recursiveness is used (see the grammar G3):
This observation led to the definition of kind grammars.
Definition 1 (k-kind grammar) A context-free grammar G that only has productions without left recursion and productions with direct left recursion is called kind if: 1. for every two productions A→αXβ, , and 2. for every nonterminal
and X≠Y then .
Productions are grouped by the nonterminal on the left hand side of each production. Kind grammars produce LL languages such as those shown in [11], which also shows how to mechanically generate graphs that are easily transformable into programs. An example for E-group is in Figure 1. For this structure it is easy to generate a parsing procedure like in the following listing: PROCEDURE Parse_E; BEGIN CASE LookAhead OF term_num, term_left: Parse_T; term_minus: BEGIN ReadTerm(term_minus); Parse_T END; END; WHILE LookAhead IN [term_plus, term_minus] DO CASE LookAhead OF
© 2003 Taylor & Francis
88
J.Král, M.Zemlicka
term_minus: BEGIN ReadTerm(term_minus); Parse_T END; term_plus: BEGIN ReadTerm(term_plus); Parse_T END; END; END; Note that the program contains loops. This is a substantial difference from the previous case. Kind grammars form a proper superclass of LL grammars. It is possible to design an extensible parser for kind grammars that may be extended during parsing time—even by a parallel process.
6 Conclusions LR parsing is implementable in the procedure driven form and can produce information close to top-down parses. So top-down parsing is sometimes possible using tools derived for bottom-up parsing. We prepare an implementation of a dynamically modifiable parser based on the concept of kind grammars.
References [1]
A.V.Aho, R.Sethi and J.D.Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Mass., 1986.
[2]
J.Drózd, Semi-Top-Down Parsing, Master thesis, Faculty of Mathematics and Physics, Charles University, Prague, 1985 (in Czech).
[3]
J.Drózd, Recursive Descent Parsing for LR(k) Grammars, PhD dissertation, Faculty of Mathematics and Physics, Charles University, Prague, 1990 (in Czech).
[4]
J.Král, Almost top-down analysis for generalized LR (k) grammars. In Trudy Vsesojuznogo sympoziuma po metodam realizacii algoritmiceskich jazykov I, SO AN SSSR Novosibirsk, 1976, 230–247.
[5]
J.Král, A top-down no backtrack parsing of general context-free languages. In Proceedings of Mathematical Foundations of Computer Science, MFCS’77, Springer, Berlin, 1977, 333–341.
[6]
J.Král, Parsing and Syntax Directed Compiling, Technical Report, Institute of Computing Techniques, Prague, 1982 (in Czech).
[7]
Gh.Paun, Marcus Contextual Grammars. Kluwer, Dordrecht, 1998.
[8]
M.Zemlicka, Extensible Language Compiler, Master thesis, Faculty of Mathematics and Physics, Charles University, Prague, 1994 (in Czech).
[9]
M.Zemlicka, Extensible LL(1) parsing. In Proceedings of SOFSEM’95, Milovy, 1995.
© 2003 Taylor & Francis
Semi-Top-Down Syntax Analysis
89
[10] M.Zemlicka, Parsing of Extensible Languages, Technical Report, Faculty of Mathematics and Physics, Charles University, Prague, 1996 (in Czech). [11] M.Zemlicka and J.Král, Run-time extensible deterministic top-down parsing, Grammars, 2.3 (1999), 283–293.
© 2003 Taylor & Francis
Descriptional Complexity of Multi-Parallel Grammars with Respect to the Number of Nonterminals Alexander Medunar Dušan Kolár Department of Computer Science and Engineering Technical University of Brno Czech Republic {meduna, kolar}@dcse.fee.vutbr.cz
Abstract. The present paper discusses multi-parallel grammars and their descriptional complexity with respect to the number of nonterminals. It proves that eight-nonterminal multi-parallel grammars characterize the family of recursively enumerable languages. It also summarizes all the important results on the descriptional complexity of multi-grammars with respect to the number of nonterminals.
1 Introduction The theory of selective substitution grammars classified multi grammars into three basic types –multi-sequential grammars, multi-continuous grammars, and multi-parallel grammars (see [2]). The descriptional complexity of the first two types with respect to the number of nonterminals was investigated in [4] and [5]. The present paper completes this investigation by discussing this complexity regarding multi-parallel grammars. More specifically, the present paper proves that eight-nonterminal multi-parallel grammars characterize the family of recursively enumerable languages. It also concludes by summarizing all the important results on the
© 2003 Taylor & Francis
92
A.Meduna, D.Kolá
descriptional complexity of multi-grammars with respect to the number of nonterminals.
2 Definitions This paper assumes that the reader is familiar with formal language theory, including selective substitution grammars (see [6] and Chapter 10 in [1]). Let be an alphabet. The cardinality of is denoted by card(). * represents the free monoid generated by under the operation of concatenation. The unit of * is denoted by ε. Let +=*-{ε}; algebraically, + is the free semigroup generated by under the operation of concatenation. For , |w| denotes the length of w. Next, we give the definition of a multi-parallel grammar. Compared to the definition of a multi-parallel grammar given in [2], the following definition is simpler; however, it is easy to see that both definitions are equivalent. Let m be a positive integer. An m-parallel grammar, G, is a quintuple: G=(, P, S, T, K), where is an alphabet, and:
, P is a finite substitution on *, K={1,…, n},
where for i=1,…, n, i is a language of the form: i=F1F2…Fm, with:
. G directly derives from u, symbolically
for all h=1,…, m. Let u, denoted as: if either u=S and 1. u=a1…an with 2. for some 3. υ=x1…xn with
or there exists a natural number, n, so: for all i=1,…, n; , where
;
.
Instead of , this paper writes a→x hereafter. In the standard manner, extend to n(where n≥0) +, and *. The language of G, L(G), is defined as:
© 2003 Taylor & Francis
Descriptional Complexity of Multi-Parallel Grammars with…
93
G is a multi-parallel grammar if G represents an m-parallel grammar for some m1. A queue grammar (see [3]) is a sixtuple, Q=(V, T, W, F, R, g), where V and W are alphabets satisfying , and is a finite relation such that for any , there exists an element If there exist , and such that u=arb, and =rzc, then Q directly derives from u, denoted by . In the standard manner, define n, +, and *. A derivation of the form with and is a successful derivation. The language of Q, L(Q), is defined as .
3 Results The present section demonstrates that the family of recursively enumerable languages is equal to the family of languages generated by eight-nonterminal multi-parallel grammars. Lemma 1 Let: Q=(V, T, W, F, R, g) be a queue grammar. Then, there exists an eight-nonterminal multi-parallel grammar, G, satisfying: L(G)=L(Q) Proof. Let: Q=(V, T, W, F, R, g) be a queue grammar. Without any loss of generality, assume that: Construction: For a natural number, mappings—, , ß, and :
, introduce the following four
1. Define an injection, , from to ({4, 5}{3})n. In the standard manner, extend so it is defined from * to (({4, 5}{3})n)*. 2. Define the bijection, , from {4, 5, 3} to {0, 1, 3} as (4)=0, (5)=1, and (3)=3. In the standard manner, extend so it is defined from {4, 5, 3}* to {0, 1, 3}*. 3. Define the injection, ß, from to ({0, 1}{3})n so that for every , . In the standard manner, extend so it is defined from to (({0, 1} {3})n)*.
© 2003 Taylor & Francis
94
A.Meduna, D.Kolá
4. Define the relation, , from that:
to
so
In the standard manner, extend so it is defined from .
to
Let m be any natural number satisfying:
Construct the following m-parallel grammar:
with:
Initially, set A. For every
to K, where:
© 2003 Taylor & Francis
; then, extend K in the following three-step way: where
and
, add:
Descriptional Complexity of Multi-Parallel Grammars with…
B. For every every , add:
where
95
and
, and
to K, where:
C. For every
, add:
to K, where:
The construction of G is completed. Next, we outline a proof demonstrating , but leave it up to the reader to provide a rigorous version of this proof. Examine the construction of G to make the following three observations: I. G derives no sentential form containing two consecutive identical nonterminals from {0, 1, 2, 3, 4, 5, 6}.
© 2003 Taylor & Francis
96
A.Meduna, D.Kolá
II. If some terminals precede some occurrences of symbols 4, 5, or 3 in a sentential form, f, derived by G, then these occurrences can never be removed, so G cannot derive a member of L(G) from f at this point. III. G uses a selector introduced in C only during the last derivation step of any successful derivation. Based on properties A through C, observe that every successful derivation actually simulates a successful derivation in Q. To give an insight into this simulation in greater detail, consider:
according to G simulates ab
, in Q. By using selectors constructed in A and B, xc by making these two steps:
(X acts as a “filling” symbol). More precisely, every successful derivation, 2 * with , has this form:
where: h1, and: for j=1,…, h:
Then, in Q:
© 2003 Taylor & Francis
Descriptional Complexity of Multi-Parallel Grammars with…
Therefore, the reader. As
A proof demonstrating and
97
is left to Because G has
only eight nonterminals -0, 1, 2, 3, 4, 5, 6, and X-, Lemma 1 holds.
ⵧ
Theorem 1 The family of languages generated by eight-nonterminal multiparallel grammars coincides with the family of recursively enumerable languages. Proof. Obviously, every language generated by an eight-nonterminal multiparallel grammar represents a recursively enumerable language. By Lemma 1, for every queue grammar, Q, there exists an eightnonterminal multi-parallel grammar, G, satisfying L(G)=L(Q). Recall that the family of languages generated by queue grammars coincides with the family of recursively enumerable languages (see [3]). Consequently, every recursively enumerable language is generated by an eight-nonterminal multiparallel grammar. Therefore, Theorem 1 holds. ⵧ The following theorem summarizes all the fundamental results on the descriptional complexity of multi-grammars with respect to the number of nonterminals. Theorem 2 The following four families are identical: A. the family of languages generated by eight-nonterminal multi-parallel grammars; B. the family of languages generated by six-nonterminal multi-continuous grammars; C. the family of languages generated by eight-nonterminal multi-sequential grammars; D. the family of recursively enumerable languages. Proof. This theorem follows from Theorem 1 above, Theorem 1 in [4], and Theorem 1 in [5]. ⵧ
© 2003 Taylor & Francis
98
A.Meduna, D.Kolá
References [1] [2] [3] [4]
[5] [6]
J.Dassow and Gh. Paun, Regulated Rewriting in Formal Language Theory. Springer, New York, 1989. H.C.M.Kleijn and G.Rozenberg, Multi grammars, International Journal of Computer Mathematics, 12 (1983), 177–201. H.C.M.Kleijn and G.Rozenberg, On the generative power of regular pattern grammars, Acta Informatica, 20 (1983), 391–411. A.Meduna, Eight-nonterminal multi-sequential grammars characterize the family of recursively enumerable languages, International Journal of Computer Mathematics, 65 (1997), 179–189. A.Meduna, Descriptional complexity of multi-continues grammars, Acta Cybernetica, 1998, 375–384. A.Meduna, Automata and Languages: Theory and Applications. Springer, London, 2000.
© 2003 Taylor & Francis
On the Generative Capacity of Parallel Communicating Extended Lindenmayer Systems1 György Vaszil Computer and Automation Research Institute Hungarian Academy of Sciences Budapest, Hungary
[email protected] Abstract. We investigate the generative power of parallel communicating systems with extended Lindenmayer systems as components. First we prove that, like the context-free case, non-returning systems can be simulated by returning systems. Then we demonstrate the power of non-returning systems by showing that they are able to generate the twin-shuffle language over arbitrary alphabets.
1 Introduction Parallel communicating Lindenmayer systems were introduced in [3], the paper which initiated the study of their generative power (see also [1]). It was shown that three, or even two, components are enough for a system to be strictly more powerful then the components it contains. The study of parallel communicating extended Lindenmayer systems continued in [9], where it was shown that systems that have components with or without tables generate the same class of languages. A normal form for the production rules was also given. Different modes of derivation were investigated in [7] and in [10], and the equivalence of several different ways of communication was shown in [8]. 1 Research supported by the Hungarian Scientific Research Fund “OTKA” Grant no. T 029615.
© 2003 Taylor & Francis
100
G.Vaszil
In this paper we continue to investigate the generative power of parallel communicating extended Lindenmayer systems. We show that non-returning systems, with or without tables, can be simulated by returning systems, and we demonstrate the power of non-returning systems by showing that they are able to generate the twin-shuffle language over arbitrary alphabets.
2 Preliminaries The reader is assumed to be familiar with the basics of formal language theory. Further details can be found in [6]. An alphabet V is a finite set of symbols and a string over V is a finite sequence of elements of V. The set of all nonempty strings over V is denoted by V+. The empty string is denoted by ε , V* stands for the union of V+ and {ε}. A language L over V is a subset of V*. |w| and |w|X denotes the length of a word w and the number of occurences of symbols from set X in w. For details about Lindenmayer systems, consult [5]. Bellow only the basic definitions about extended Lindenmayer systems are presented. An E0L system is a quadruple G=(N,T, P,), where N and T are disjoint sets; N is the nonterminal alphabet and T is the terminal alphabet. P is a finite set of rewriting rules (productions) over (N傼T) with rules of the form a→v, where and , and is the axiom of the system. Furthermore, P is complete, that is, all letters of (N傼T) can be rewritten by at least one rule of P. A string x directly derives a string y in G, denoted by , if x=a1a2…an, y=␣1␣2…␣n, and for every and , 1ⱕiⱕn. The language generated by an E0L system G is , where * denotes the reflexive and transitive closure of . A tabled E0L system or an ET0L system is a quadruple G=(N, T, P, ω), where N and T are the disjoint alphabets of nonterminals and terminals, is the axiom and P is a finite set of E0L production sets over , P={P(1),…, P(t)}. The elements of P are called tables. In a direct derivation step only productions from one of the tables in P can be chosen, but in different steps of the same derivation, different production sets can be used. PC grammar systems with Chomsky type components were introduced in [4], with Lindenmayer type components in [3]. Definition 1 A parallel communicating extended Lindenmayer system with n components (a PC E0L or PC ET0L system in short), nⱖ1, is an (n+3)tuple ⌫=(N, K, T, G1,…, Gn), where N is a nonterminal alphabet, T is a terminal alphabet, and K={Q1, Q2,…, Qn} is an alphabet of query symbols. N, T, and K are pairwise disjoint sets. , 1ⱕiⱕn, called a
© 2003 Taylor & Francis
On the Generative Capacity of Parallel Communicating…
101
component of ⌫, is an extended Lindenmayer system as above, G1 is said to be the master of ⌫. An n-tuple (x1,…, xn), where , 1ⱕiⱕn, is called a configuration of ⌫, (1,…, n) is said to be the initial configuration. PC Lindenmayer systems change their configurations by performing direct derivation steps. Definition 2 Let ⌫=(N, K, T, G1,…, Gn), nⱖ1, be a parallel communicating Lindenmayer system. We say that a configuration (x1,…, xn) directly derives (y1,…, yn), denoted by , if one of the following two cases holds: 1. There is no xi, which contains any query symbol, that is, for 1ⱕiⱕn. Then for each i, 1ⱕiⱕn, (yi is obtained from xi by a direct derivation step in Gi). 2. There is some xi 1ⱕiⱕn, which contains at least one occurrence of a query symbol. In this case (y1,…, yn) is obtained from (x1,…, xn) as follows: For each xi with |xi|K⫽0 we write , where , 1ⱕjⱕt+1, and , 1ⱕlⱕt. If for each il, 1ⱕlⱕt, then and in returning systems , in non-returning systems 1ⱕlⱕt. If for some il, 1ⱕlⱕt, then yi=xi. For all j, 1ⱕjⱕn, for which yj is not specified above, yj=x j. Let denote the reflexive and transitive closure of ⇒. The first case is the description of a rewriting step. If no query symbol is present in any of the sentential forms, then each component uses its rewriting rules. The second case describes a communication. If a query symbol appears in a sentential form, the rewriting process is interrupted and one or more communication steps must be performed. Each query symbol must be replaced by the sentential form of the component with the same index, provided that the replacing strings do not contain further query symbols. If this condition cannot be fulfilled, a circular query has appeared and the derivation is blocked. In returning systems, after they have communicated their sentential form to another one, components must return to their axioms and begin to generate a new string. In non-returning systems they continue to rewrite the current string. Let and denote a rewriting and a communication step, respectively. Definition 3 The language generated by a parallel communicating system of extended Lindenmayer systems ⌫=(N, K, T, G 1 ,…,G n ), , 1ⱕiⱕn, is:
© 2003 Taylor & Francis
102
G.Vaszil
where G1 is the master of ⌫. Thus, the generated language consists of terminal strings appearing as sentential forms of the master. Let us denote the classes of languages generated by returning and nonreturning PC Lindenmayer systems with at most n components of type by and , respectively. When an arbitrary number of components is considered, we use * in the subscript instead of n. Let also , X as above, denote the class of languages generated by extended Lindenmayer systems, and the class of recursively enumerable languages.
3 About Generative Capacity First we recall a theorem from [9], which shows that returning or nonreturning PC E0L systems and PC ET0L systems generate the same class of languages. Theorem 1 [9]
.
Now we continue by showing that non-returning PC E0L or PC ET0L systems can be simulated by returning systems. Theorem 2
.
Proof. By Theorem 1 above, it is sufficient to show that non-returning PC E0L systems can be simulated by returning PC E0L systems. Let ⌫=(N, K, T, G1,…, Gn) be a non-returning PC E0L system with , 1ⱕiⱕn. Let us also assume that , 1ⱕiⱕn. This does not involve any loss in generality, since a nonterminal S can be added to N and new rules S→i can be added to Pi, 1ⱕiⱕn, without changing the generated language. Now we construct a returning PC E0L system ⌫’ which generates the same language as ⌫. Let:
where
is the master grammar, , 1ⱕi, jⱕn,
, 1ⱕiⱕ2n, , and:
The set of productions are as follows. If rules for a symbol not explicitly given, we assume the presence of “chain” rules, x→x.
© 2003 Taylor & Francis
are
On the Generative Capacity of Parallel Communicating…
103
for 1ⱕiⱕn,
for n+1ⱕiⱕ2n,
for 1ⱕi, jⱕn, and:
Let us now follow the derivations in ⌫’ to see that it generates the same language as ⌫. In the following, for any string , 1ⱕiⱕt, [␣] denotes the string [x 1 ][x 2 ]…[x n ]. If is the initial step of ⌫, then in ⌫’ after the first rewriting step we get:
where and ␣i, 1ⱕiⱕn differ only in the indices of the query symbols they contain; if ␣i contains Qj then contains Qij. Otherwise, the system is blocked after the next rewriting step. If no communication follows the first rewriting step in ⌫, then the sentential forms are sent to and the the simulation of the next rewriting step starts. If holds in ⌫, then this communication is simulated with the aid of the components G11,…, Gnn in the following way. If for some j, 1ⱕjⱕn, a sentential form does not contain query symbols, it is transmitted to and also to the components Gij, 1ⱕiⱕn, where n copies are saved. The k-th of these saved copies can then be transmitted to a sentential form , if it contains Qkj (Qkj is contained by , if ␣k in ⌫ contains Qj). This way none of the query symbols in the , 1ⱕiⱕn, sentential forms are replaced by start symbols, but they all receive a different copy of the string they have requested. This way we get:
through a series of communication steps. The ␦jk, 1ⱕj, kⱕn “garbage” will be removed after the next rewriting step by Ga.
© 2003 Taylor & Francis
104
G.Vaszil
The further rewriting steps and communications of ⌫ are simulated in a similar way. Let us assume that holds in ⌫. Now ⌫’ starts from a configuration:
where step we get:
, 1ⱕj, kⱕn. After a rewriting and a communication
and then: where and i, 1ⱕiⱕn, differ only in the indices of the query symbols they contain. Otherwise, the system is blocked after the next rewriting step. If holds in ⌫ then it is simulated in the same way as explained above, and we get: Now the simulation of the next rewriting step can start. 䊐 Next we focus on the power of non-returning systems. We recall a lemma from [2], which states that every recursively enumerable language is the morphic image of the intersection of the so called twin-shuffle language over some alphabet and a regular language. First we recall the shuffle operation and the notion of the twin-shuffle language. Let V be a finite alphabet and let . The shuffle of strings ␣ and  denoted by is the set . Consider now an alphabet V, denote by the set , and define the coding by , . Let h(w), be denoted by . The twin-shuffle language over alphabet V, denoted by TS(V), is . In [2] the following characterisation of recursively enumerable languages is given. Lemma 1 [2] For every recursively enumerable language L, there is a twinshuffle language TS(V), a regular language R, and a weak coding h, such that L=h(TS(V)傽R). Based on this lemma, we obtain the following theorem. Theorem 3 For every recursively enumerable language L, there is a nonreturning PC E0L system ⌫, a regular language R, and a weak coding h, such that L=h(L(⌫)傽R). Proof. By Lemma 1, it is sufficient to construct a non-returning PC E0L system ⌫ which generates TS(V), the twin-shuffle language over an arbitray
© 2003 Taylor & Francis
On the Generative Capacity of Parallel Communicating…
105
alphabet V. Let V={a(1), a(2), …, a(m)} be an alphabet with m elements and let Let . Let: ⌫=(N, K, T, G, G0, G1,…, Gm, Gm+1, Gm+2), where are
is the master grammar, the other components , 1ⱕiⱕm+2, with:
The production sets are as follows. If for a symbol no rule is explicitely given, we assume the presence of “chain” rules, x→x:
Let also:
for 1ⱕiⱕm, and:
© 2003 Taylor & Francis
106
G.Vaszil
Let us now follow the derivations in this system to see how it generates the twin-shuffle language over V. It starts with:
and after a rewriting step we get:
where ␦ is either S or Q0, ␦j, m+1ⱕjⱕm+2 is either S1 or Q0. If ␦=Q0 the system generates ε after the next rewriting step, so let us assume that ␦=S in order to continue. The string A1A1[AA]1 is transferred to components G1, …, Gm, and possibly to components Gm+1, or Gm+2. We get: where , m+1ⱕjⱕm+2 is either S1 or A1A1[AA]1. Now each Gi starts deriving a different word of the twin shuffle language by rewriting A1 and A1 to and , 1ⱕiⱕm. If Gm+1 has received A1A1[AA]1, it changes the order of the two “key” As and the nonterminal marking this order by producing A 2A2[AA]2 (if Gm+2 has received A1A1[AA]1 the system is going to be blocked; this component is designed to change the order of the As in the opposite way). Meanwhile, G0 erases its string and introduces a query symbol Qi for some i, 1ⱕiⱕm+2. This process produces: where 1ⱕiⱕm+2 and is either A2A2[AA]2 or S2. Now one of the m+2 strings is received by G0, then G1, …, Gm+2 erase their sentential forms and introduce Q0 (or possibly S1 in the case of Gm+1, Gm+2). Thus, after the next rewriting step we get: where ␦0 is for some i, 1ⱕiⱕm, or A1A1[A A]1, and ␦j m+1ⱕjⱕm+2, is S1 or Q0. Now the process can continue in the same way by adding more symbols to the subwords of the twin-shuffle string in G1, …, Gm, or by changing the order of A and A in Gm+1, Gm+2. If G, the master, introduces Q0, it receives the string generated so far, erases the nonterminals Aj, Aj, [AA]j, or [AA]j, 1ⱕjⱕ2, and produces a word of the twin-shuffle language over V. Any other way of functioning leads to a blocking configuration, so our proof is complete. 䊐 Corollary 1 For each family of languages , with and being closed under arbitrary morphisms and intersection with regular languages holds.
© 2003 Taylor & Francis
On the Generative Capacity of Parallel Communicating…
107
Proof. By Theorem 3 and the properties of the inclusion 䊐 would imply , a contradiction. Since is closed under intersection with regular languages and arbitrary morphisms (see [5]), this corollary combined with Theorem 1 again implies the inclusion , which is also the consequence of earlier results from [3] (see also [1]) combined with Theorem 1.
References [1] E.Csuhaj-Varjú, J.Dassow, J.Kelemen and Gh. Paun, Grammar Systems: A Grammatical Approach to Distribution and Cooperation. Gordon and Breach, London, 1994. [2] J.Engelfriet and G.Rozenberg, Fixed point languages and representations of recursively enumerable languages. Journal of the ACM, 27/3 (1980), 499–518. [3] Gh. Paun, Parallel communicating systems of L systems. In G.Rozenberg and A.Salomaa (eds.), Lindenmayer Systems: Impacts on Theoretical Computer Science, Computer Graphics, and Developmental Biology. Springer, Berlin, 1992, 405–418. [4] Gh. Paun and L.Sântean, Parallel communicating grammar systems: The regular case. Annals of the University of Bucharest, Mathematics-Informatics Series, 38/2 (1989), 55–63. [5] G.Rozenberg and A.Salomaa, The Mathematical Theory of L Systems. Academic Press, New York, 1980. [6] A.Salomaa, Formal Languages. Academic Press, New York, 1973. [7] Gy. Vaszil, Parallel communicating grammar systems without a master. Computers and Artificial Intelligence, 15/2–3 (1996), 185–198. [8] Gy. Vaszil, Communication in parallel communicating Lindenmayer systems. Grammars, 1/3 (1999), 255–270. [9] Gy. Vaszil, On parallel communicating Lindenmayer systems. In Gh. Paun and A.Salomaa (eds.), Grammatical Models of Multi-Agent Systems. Gordon and Breach, London, 1999, 99–112. [10] Gy. Vaszil, Further remarks on parallel communicating grammar systems without a master. Journal of Automata, Languages and Combinatorics, accepted.
© 2003 Taylor & Francis
II AUTOMATA
© 2003 Taylor & Francis
Cellular Automata and Probabilistic L Systems: An Example in Ecology Manuel Alfonseca Alfonso Ortega Alberto Suárez Department of Computer Science Engineering Autonomous University of Madrid Spain {manuel.alfonseca, alfonso.ortega}@ii.uam.es
Abstract. This paper revisits the formal definition of deterministic and probabilistic cellular automata, with special attention to the problem of updating the probabilistic information of each automaton in the grid. An example is given. On the other hand, we introduce a formal notation for probabilistic L systems and the language generated by them. Several examples are given. We propose a new equivalence between both fields: the step-equivalence between a probabilistic L system and a probabilistic cellular automaton. The paper includes a constructive proof of this result and its application to a bi-dimensional probabilistic cellular automaton that models an ecosystem.
1 Cellular Automata A cellular automaton is defined as six-fold (G, G0, N, Q, f, T), where G is a matrix of automata. G0 is the initial state of the grid and is a mapping G0:G→Q, an injective function that assigns an initial state to each automaton in
© 2003 Taylor & Francis
112
M. Alfonseca, A. Ortega A. Suárez
the grid. N (neighborhood) is a function that assigns to each automaton in the grid the set of its neighbors. Q is the set of possible states of every automaton in the grid. f is the transition mapping f: Q×Qn→Q, where f(q0, (q1,…, qn)), n=#Q is the next state of any automaton in the grid if its current state is q0 and whose neighborhood’s states are (q1,…, qn). T⊆Q is the set of final or target states. Every automaton in the grid has the same number of neighbors, transition mappings and set of possible and final states. It is obvious that each finite automaton in the grid is defined by a= (Qn, Q, f, G0 (a), (T). 2 Probabilistic Cellular Automata Cellular automata are probabilistic if each automaton in the grid is a probabilistic finite automaton. In probabilistic cellular automata, the automata on the grid choose their next state from a set of options by assigning probabilities to each transition while the pure non-deterministic approach only establishes the set of options. A probabilistic cellular automaton is the six-fold (G, G0, N, Q, M, T), where G, N, Q, T are defined as in a cellular automaton, and G0 is both the initial state of the grid and the mapping:
This mapping is an injective function that assigns an initial state vector to each automaton in the grid. The state vector of an automaton shows the probability of the automaton being in each state. The following notations will be used indistinctly in the following pages: Πi (G0 (x))=Πqi(G0 (x))=probability that the automaton x is in state qi at the initial moment. M is the transition matrix, a matrix of probabilities of transition between states, with dimension #Qn×#Q×#Q. In order to simplify the notation, that M is considered a family of #Qn square matrices #Q×#Q (there is a matrix for each particular neighborhood configuration). Each finite probabilistic automaton a in the grid is defined as a=(Qn, Q, M, G0(a), T). Example 1 Assume an infinite square grid. The concatenation of the row and column indices identifies the automaton at this position in the grid. The Von Neumann neighborhood will be used. Automata are binary, that is Q={0, 1}. The cellular automaton is pca 1=(G1,G 0,N N,Q,M,T), where is an infinite square matrix of automata around position (0, 0). , i.e. each state is initially equiprobable.
© 2003 Taylor & Francis
Cellular Automata and Probabilistic L Systems: An Example…
113
where the matrices are disposed from left to right and from top to bottom. The first matrix is M0000 and the last is M1111. Let us choose an automaton in the grid and name it x:
Assume that the five automata have the following probability vectors at a given moment (the indices identify the automata in the previous figure, the first position in the vector is the probability of being 0): p1=(0.2, 0.8), p2= (0.6, 0.4), p 3=(0.3, 0.7), p 4=(0.9, 0.1), p x=(0.1, 0.9). If the neighborhood configuration of automaton x were, for instance (0, 1, 0, 0), the following matrix operation computes the next state vector for automaton x: px×M0100. The probability of this situation is p1[0] p2[1] p3[0] p4[0]. We have to compute the equivalent probabilities for all possible neighborhood configurations and add the results, thus getting: , which can be expressed by means of the tensor product, where the dot operator represents the element by element matrix product:
2.1 A Configuration of a Probabilistic Cellular Automaton A configuration C of a probabilistic cellular automaton is a time dependent mapping C(t): F→Q that assigns a state to each automaton in the grid. The probability that a probabilistic cellular automaton (A) is in a given configuration (C) at a given moment (t) will be denoted pt,A(C), where t and A will be omitted whenever they are obvious from the context. Each automaton in the grid has a state vector that shows the probability for it being in each possible state. So, the event “the automaton is in configuration C” could be expressed as “the automaton is in configuration C(a)”. If va is the state vector for automaton a, then πC(a)(va) represents the probability that automaton a is in state C(a):
© 2003 Taylor & Francis
114
M. Alfonseca, A. Ortega A. Suárez
The sum of these probabilities over the set of all possible configurations must equal 1: This expression assumes that the set of possible configurations is ordered.
3 Bidimensional IL Systems A bidimensional L System is an L System whose words are matrices of characters instead of linear strings. In order to clarify the notation, the following conventions will be followed: the context will always be written before the symbol changed by the production rule; the context will be determined by a function c that generates the horizontal and vertical displacements of the context symbols with respect to the current symbol. Formally, a bidimensional System is defined as the five-fold where Σ, P, g, ω are defined in the usual way and c: [1, k]傽N→ {-1, 0, +1}. Example 2 A bidimensional IL System with a von Neumann neighborhood is an extended system whose c function is defined as follows: c(1)=(0, +1), c(2)=(+1, 0), c(3)=(0, -1), c(l)=(-1, 0):
If a Moore neighborhood is used we get an 〈8, 0〉 bidimensional IL System with the following c function: c(1)=(-1, +1), c(2)=(0, +1), c(3)=(+1, +1), c(4)=(+1, 0), c(5)=(+1, -1), c(6)=(0, -1), c(7)=(-1, -1), c(8)=(-1, 0):
4 Probabilistic L Systems We define a probabilistic L System as an L System in which each production rule has an associated probability with the restriction that the sum of the probabilities associated to all the rules applicable to a symbol at any time must be 1. A deterministic L System (DL) can be seen as probabilistic with a probability of 1 associated to every rule.
© 2003 Taylor & Francis
Cellular Automata and Probabilistic L Systems: An Example…
115
In a DL, System, a derivation is linear. In a probabilistic L System (S), it is a tree, with a probability associated to each branch and the sum of the probabilities associated to all the branches with the same origin being 1. This tree will be called Tn(S), and n is the depth of the tree. Formally, a probabilistic L System is an L System in which the rule set P has been replaced by a set of pairs (R, p(R)), where R is a derivation rule and p(R) its probability, with the restriction that if P’⊂ P is the set of rules applicable to a symbol at a given context, ΣRinP’=1. Example 3 Assume the probabilistic where: Σ2={0, 1, g} (g is the end marker)
IL System
,
i.e. the first symbol in the string becomes its right neighbor, the last becomes its left neighbor, intermediate 0s become its left/right context with probability 0.3/0.7 and intermediate 1s do the same with probability 0.7/0.3.
Figure 1 shows the first three derivations of system S2 and indicates the probability of each branch.
Observe that the probability of reaching a node by a given path is the product of the probabilities of the branches that go from the axiom to that
© 2003 Taylor & Francis
116
M. Alfonseca, A. Ortega A. Suárez
node along that path. The sum of the probabilities of all the nodes in the tree at a given derivation level is 1. If a node at derivation level k may be reached by more than one path, the probability that the word is generated by a derivation of depth k is the sum of the probabilities of all the paths, computed as above. Formally, the probability that word x is generated by system S in n derivations is where Di is a path in tree Tn(S); bj is a branch in path Di; rk is a production rule applied at branch bj; ρ(Di) is the result of derivation Di. Each path in the tree is a derivation. The expression could be written alternatively in Lindenmayer notation as follows:
where p(pi((l, k))) is the probability associated with the production rule pi((l, k)). Let S be a probabilistic L System. Let n be a natural number . Let θ be a real number . The language generated by SΘ, is defined as: In the previous example, the language for threshold 0 is: L(S2, 0)={0001, 0011, 0101, 0111, 0000, 0010, 1010, 1011, 1111}.
5 Step-Equivalence Between Probabilistic L Systems and Probabilistic Cellular Automata Let A be a probabilistic cellular automaton. Let S be a probabilistic bidimensional IL System. Definition 1 S is step-equivalent to A if and only if:
Theorem 1 Given a probabilistic cellular automaton A=(G, G0, N, M, Q), there is an equivalent probabilistic bidimensional IL System that is stepequivalent to the cellular automaton. Proof. (Constructive proof) Consider the bidimensional IL System , where and si expresses the axiom of the L System; g is a symbol not in Σ; P is the set of the pairs (rules, probability):
© 2003 Taylor & Francis
Cellular Automata and Probabilistic L Systems: An Example…
117
; axiom w is a matrix with the same dimensions as G and whose elements are all equal to si; and c and N refer to the same elements in their matrices. Rules with g use , where is obtained from with replacing g with the appropriate boundary symbol in the automaton. It is easy to see that S is step equivalent to A. ⵧ Assume an automaton whose mean-field evolution follows the Lotka-Volterra equations for a predator (species Y, carnivorous) and a prey (species X, herbivorous), with a slight modification that accounts for the saturation of the herbivorous species:
where is the saturation level of species X. The saturation term is necessary since the automaton cannot represent the unlimited growth of species X in the absence of individuals of the species Y. The territory in which the population dynamics takes place is a regular two-dimensional square lattice with periodic boundary conditions. Only nearest neighbors displacements are allowed. The solution to the inverse problem of finding the reactive rules that yield a specified set of mean-field equations was given by Boon et al. in their extensive review on reactive lattice-gas automata. The reactive rules are encoded into a reaction probability matrix, whose entries are the probability of obtaining an outgoing from a given incoming configuration In particular, one possible prescription leading to the previous expressions in the mean-field limit is:
(1) where δ(n, n’) is a Kronecker delta (an indicator equal to 1 if n=n’ and 0 otherwise), the inverse of h represents the reaction time-scale, and m is the maximum number of particles of a single species at a given node, which coincides with the number of channels associated to a node. In the present model m=4, meaning that this is the maximum number of individuals of each species that may occupy a given node. The condition that p(nin→ nout) be a
© 2003 Taylor & Francis
118
M. Alfonseca, A. Ortega A. Suárez
probability (i.e. a non-negative number in the interval [0, 1]) imposes restrictions on the possible values of the reaction constants Ki, h and that should be smaller or equal can be used in the simulations. In particular, to m (this upper limit corresponds to full occupation of an automaton node). Figure 2
Figure 3
Figure 2 depicts the time-evolution of the automaton for K1=K2= K3=K4=1, h=0.0313 (close to the maximum possible value of h). For hese values of the reaction constants, there are configurations (in particular those where ) for which Eq. (1) yields negative values. These negative values are set to zero. This procedure does not alter the mean-field behavior of the automaton in a significant manner, since the configurations affected appear rather infrequently. The same plot also shows the dynamics corresponding to the solution of the mean-field equations. The mean-field equations in this automaton provide a very good approximation to the evolution of the species’ node densities, even for the largest possible values of h, even though, as the figure shows, the frequency of the damped oscillations predicted by the meanfield approximation is slightly lower than the frequency of the actual simulated time-series. These small discrepancies can once again be accounted for by the (limited) influence of correlations on the dynamics. The observation that the influence of correlations is small in this automaton is corroborated by the absence of spatial structure in the species’ populations. Figure 3 compares the results of simulations in an automaton with the same characteristics as the previous one, except that the inverse reaction time-scale is h=0.00313. In this case, the mean-field approximation provides an excellent description of the global population dynamics in the ecosystem.
© 2003 Taylor & Francis
Cellular Automata and Probabilistic L Systems: An Example…
119
5.1 The Step-Equivalent IL System Figure 4
Each automaton in the grid contains several individuals of each species. Let us call 1=x(t) and k=y(t) the number of individuals of species x and y at time t. The previous probability prescription can be represented by means of the state diagram in figure 4, where:
This probability distribution is shown in the following table, which represents the rules of the step-equivalent IL system:
© 2003 Taylor & Francis
120
M. Alfonseca, A. Ortega A. Suárez
The alphabet of the step-equivalent Bidimensional IL System is the set of all possible configurations of the nodes (the number of individuals of species x and y in the node). The table above gives the rules. The axiom is a matrix of symbols si. The context is the current symbol itself. References [1]
M.Alfonseca, Teoría de Lenguajes, Gramáticas y Autómatas. Promosoft, Madrid, 1997.
[2]
J.P.Boon, D.Dab, R.Kapral and A.Lawniczak, Lattice gas automata for reactive systems. Physics Reports, 273 (1996), 55–147.
[3]
G.T.Herman and G.Rozenberg, Developmental Systems and Languages. North Holland/American Elsevier, Amsterdam, 1975.
[4]
G.Rozenberg and A.Salomaa (eds.), Lindenmayer Systems: Impacts on Theoretical Computer Science, Computer Graphics, and Developmental Biology. Springer, Berlin, 1992.
© 2003 Taylor & Francis
On Iterated Sequential Transducers Henning Bordihn Faculty of Informatics Otto-von-Guericke University Magdeburg, Germany
[email protected] Henning Fernau Wilhelm-Schickard Institute of Informatics University of Tübingen Germany
[email protected] Markus Holzer I.R.O. Department University of Montréal Québec, Canada
[email protected] Abstract. We continue to explore the relationships between Lindenmayer Systems and iterated sequential transducers introduced by Manca, Martín-Vide, and P a un in [10]. We investigate nondeterministic as well as deterministic transducers. The latter case was stated as an open question by Manca, Martín-Vide, and P aun.
1 Introduction and Definitions Iterated transducers are a natural extension of Lindenmayer Systems, as already observed by Wood in [18]. Independently of this, there has been
© 2003 Taylor & Francis
122
H.Bordihn, H.Fernau, M.Holzer
steady interest in this topic in the formal language community, as can be seen by studying [1, 2, 11, 12, 13, 16]. Recently, this field of research has been revived by the carving paradigm for computing [9]. In this paper, we continue to explore the relationships between Lindenmayer Systems and iterated sequential transducers introduced by Manca, Martín-Vide, and Paun in [10]. We investigate both nondeterministic and deterministic transducers. The latter case was stated as an open question in [10]. We assume the reader to be familiar with the basic notions of formal languages, as contained in [15]. In general, we have the following conventions: denotes inclusion, while denotes strict inclusion. The empty word is denoted by . For , where V is some alphabet, and for , |x|a denotes the number of occurrences of the letter a in x. The families of languages generated by regular, context-free, contextsensitive, general type-0 Chomsky grammars, D0L, 0L, F0L, ED0L, E0L, and ET0L systems are denoted by REG, CF, CS, RE, D0L, 0L, F0L, ED0L, E0L, and ET0L, respectively. Details about these families can be found in [6, 15]. Recall that an ED(1, 0)L system is given by a quadruple G=(V, ⌺, P, ω), where V and ⌺ are the total alphabet and the terminal alphabet, respectively, is the axiom, and P is a mapping from into V*. We write (␣, a)→w instead of P(␣, a)=w and call it production in P. A word x directly yields the word y, in symbols , if and only if x=a1a2…ak, y=w1w2…wk, kⱖ1, , and (ai-1, ai)→wi is a production in P, 1ⱕiⱕk. Here, we set a0=. Let * be the reflexive transitive closure of the relation . By definition, G generates the language . Intuitively, an ED(1, 0)L system is a parallel rewriting system with one-sided context. We denote the corresponding family of languages by ED(1, 0)L. Similarly, ED2L is the family of languages generated by parallel rewriting systems with two-sided context. Vitányi has shown in [17, Section 3.2] that ED2L equals the family of recursively enumerable languages RE. An iterated (finite state) sequential transducer (IFT) [10] is a construct ␥=(K, V, s0, a0, F, P), where K, V are disjoint alphabets (the set of states and the alphabet of ␥), (the initial state), (the starting symbol), (the set of final states), and P is a finite set of transition rules of the form sa→xs′, for (in state s, the device reads the symbol a, passes to state s′, and produces the string x). For and , we define:
This is a direct transition step with respect to ␥. We denote by transitive closure of the relation . Then, for , we define:
© 2003 Taylor & Francis
the reflexive
On Iterated Sequential Transducers
123
We say that w derives w′; note that this means that w′ is obtained by translating the string w, starting from the initial state of ␥ and ending in any state of ␥, not necessarily a final one. We denote by the reflexive transitive closure of . If s0w w′s, for some , i.e. a derivation stops in a final state, we write . The language generated by ␥ is:
In other words, we iteratively translate the strings obtained by starting from a0, without caring about the states we reach at the end of each translation; we necessarily stop in a final state, only after the last step. The IFT’s, as defined above, are nondeterministic. If for each pair , there is at most one transition rule sa→xs′ in P, then we say that γ is deterministic. For n≥1, let IFTn denote the family of languages of the form L(γ), where γ is a nondeterministic IFT with at most n states; similarly, DIFTn is the family of languages generatable by DIFT’s with at most n states. Let IFTn and DIFTn.
2 Nondeterministic Language Families In [10] it was shown that the hierarchy of nondeterministic finite state sequential transducers induced by the number of states collapses, and that the fourth level characterizes the recursively enumerable languages. Moreover, the following chains of inclusions were exhibited in [6, 10]:
and . The next theorem corrects and shows the relationships of the first two levels of the IFT hierarchy, hence solving a problem posed after [10, Theorem 3]. Theorem 1
.
Proof. The strictness of the trivial inclusion follows from [6, Theorem 2.1]. Since the finite axiom set {ω1,…, ωk} of an F0L system can be generated by rules of the form s0a0→wis0, the inclusion is clear. On the other hand, since the mappings defined by one-state finite state transducers are exactly the substitutions, is obvious. The strict inclusion follows from [6, Theorem 7.1], together with the fact that E0L is closed under finite union. The inclusion relation has been shown in [10, Lemma 3]. Finally, consider the IFT γ=({s0, s1}, {a, b, c, a0}, s0, a0, {s0}, P), with the set of transition rules:
© 2003 Taylor & Francis
124
H.Bordihn, H.Fernau, M.Holzer
In the following, is shown. Hence, L(␥) is non-E0L according to [3, Example 3]. The first step s0a0→ cbs1 introduces the word cb. Each subsequent translation s0cbn *cb2ns1 starting with the rule s0c→cs1 doubles the number of b′s. After removing the leading letter c with the rule s0c→s0 or when starting with a word without a leading c, the iteration continues by introducing a′s on arbitrary positions, while the number of b′s remains unchanged. ⵧ Theorem 2 For every , there is an IFT ␥ with at most three states and a regular set R, such that L=L(γ)傽R. Proof. The proof parallels the construction RE=IFT4 given in [10]. Let be a recursively enumerable language. Then:
Since L is in RE, the right derivative is also in RE due to the well-known closure properties of RE. Thus, is generated by a grammar Ga=(Na, Σ, Sa, Pa) in Geffert normal form [4], i.e. Na= {Sa, Aa, Ba, Ca} and the rules are of the forms Sa→␣, for , and AaBaCa→. Assume , for with a⫽b. We consider the IFT , where and:
We add s0a0→s0 if . The first step s0a0 → Saas0 introduces the axiom of Ga or the empty word by s0a0→s0. Each subsequent translation s0w w′s0 corresponds to an equivalent derivation step sequence in Ga. Note that derivation steps from different grammars Ga and Gb, for a⫽b cannot be mixed due to the distinct nonterminals. Moreover, the presence of the rightmost symbol (introduced in the first step) ensures that ␥ does not reach the end of the string in state sA or sB. Consequently, L(␥) is the language of all sentential forms induced by the grammars Ga, for . Thus, L=L(␥)傽Σ*. ⵧ Note that if L is an undecidable language, then the constructed language of all sentential forms in the previous proof is also undecidable. Since we have seen that the latter language belongs to IFT3, we have:
© 2003 Taylor & Francis
On Iterated Sequential Transducers
125
Corollary 1 The family IFT3 contains non-recursive languages. Due to [10, Lemma 7], ET0L is included in IFT3. The previous corollary implies that the inclusion is strict, as already claimed in [10], because ET0L languages are recursive. Corollary 2
.
The following theorem answers an open question posed in the report version of [10] after Theorem 6, since the operation “intersection with regular languages” is a sequential transducer mapping,1 and it can be seen as a sort of generalization of that theorem, since morphisms are sequential transducer mappings. We state our theorem without proof, because the construction is quite similar to the well-known triple construction. and a sequential transducer ␥, we have
Theorem 3 Let n≥1. For .
3 Deterministic Language Families Here, we consider deterministic IFT′s. This case was left almost completely open in [10]. Firstly, we show an analogue to Theorem 1, using the deterministic variants of E0L and IFT2. This solves an open problem stated in the report version of [10], where, with respect to DIFT 2, only and was shown. Lemma 1
.
Proof. Firstly, we show the inclusion. Consider a given ED0L system G=(V, Σ, P, ω). Without loss of generality, we can assume that . Construct the deterministic IFT , where a0 is a new symbol not contained in and:
The first step, s0a0→ws1, introduces the axiom of G. Each subsequent translation s0w w′si, with i=0 if and i=1 otherwise, corresponds precisely to an equivalent derivation step in G. Thus, the final state 1 If, in the definition of an IFT, we distinguish between an input and an output alphabet, we call those devices (finite state) sequential transducers, FT for short. If ␥ is an FT and L a language over the input alphabet of ␥, then the FT image of L under . ␥ is defined as
© 2003 Taylor & Francis
126
H.Bordihn, H.Fernau, M.Holzer
s0 is reached whenever a terminal string in the sense of G is derived. Consequently, L(G)=L(␥). Secondly, we prove the strictness of the inclusion. We construct the deterministic IFT ␥=({s0, s1}, {a, b, a0}, s0, a0, {s0 s1}, P⬘⬘), where: P⬘⬘={S0a0→aas0, s0a→bs1, s1a→as0, s0b→as0}. It is easily verified that γ generates the language {a2, ab, ba}, which is a nonED0L language [6, Exercise 4.9]. ⵧ With the lemma given above and the fact that DIFT1 equals the family of D0L languages, we state without proof: Corollary 3
and
.
In the following, we shall prove that any recursively enumerable language with and end-marker, i.e. L{#} with can be generated by some deterministic IFT. From the equality RE=ED2L, Vitányi concludes in [17, Theorem 3.47]: Lemma 2 If
is a recursively enumerable language, then , where # is a new symbol, i.e. .
Lemma 2 is also true when taking DIFT′s instead of ED(1, 0)L systems. More precisely, we find: Theorem 4
.
Proof. Consider an ED(1, 0)L system G=(V, Σ, P, ω), generating L. We shall construct a DIFT ␥ for L. Let be the set of states of ␥, () its initial state, and its set of final states. The alphabet of ␥ equals , where a0 is a new letter. The transition rules are the following:
Obviously, ␥ is deterministic and it simulates the derivation of G. Therefore, [17, Theorem 3.42] yields:
ⵧ
Corollary 4 The family DIFT contains non-recursive languages. Clearly, this implies that almost all DIFTn classes contain non-recursive languages.2 Due to [17, Theorem 3.46], we can state: 2 It would be of interest to determine the smallest n such that DIFTn contains a non-recursive language.
© 2003 Taylor & Francis
On Iterated Sequential Transducers
127
Corollary 5 The closure of DIFT under letter-to-letter homomorphisms is equal to RE. Finally, we mention that the proofs of the closure properties for IFT′s given in the report version of [10] are not transferable to the deterministic case, but we are able to give a result concerning the closure with respect to intersections with regular sets. ⱖ 1. For Lemma 3 Let nⱖ and a regular language R which is accepted by a deterministic finite automaton with k states and m accepting states, we have , where ᐍ=max{n, k}. Proof. Consider a DIFT ␥=(K, V, s0, a0, F, P) generating L and a deterministic finite automaton accepting R, where Z={z0, z1,…, zk-1} is the set of states, z0 the initial state, the set of accepting states, X the input alphabet and ␦: Z×X→Z the transition function. Moreover, let K=(s0, s 1,…, sn-1} and ᐍ =max{n, k}. Let . Furthermore, set and consider the morphism h defined by h(a)=a′ for ; we simply write x′ instead of h(x) for strings . Now, construct the DIFT:
␥′ is deterministic and has max{k, n}+m states. ␥′ works as follows. It simulates a derivation step of ␥ essentially by , but a symbol ⌳ is written whenever ␥ performs an erasing step sia→sj, and an additional symbol # is written whenever ␥ is led into a final state. In those phases, the states are interpreted as renamed states of ␥, and the last symbol of the output is # if and only if the corresponding output belongs to L(␥). In the next step, ␥′ rewrites any symbol a′ by a and erases all occurrences of ⌳ and #. Meanwhile, it simulates the automaton , interpreting the states as states of . Whenever a letter # is read and, at the same time, ␥′ is led to a
© 2003 Taylor & Francis
128
H.Bordihn, H.Fernau, M.Holzer
state corresponding to an accepting state of , then ␥′ enters a final state yif instead of yi. Thus, at the end of such a phase, ␥′ is in a final state if and only if a word in L(␥)傽R is generated. This proves our assertion. ⵧ
4 Extended Iterated Sequential Transducers Iterated sequential transducers are a sort of “pure” rewriting mechanism. Thus, “extended versions” such as Lindenmayer languages can also be meaningfully considered. Let ␥ be an IFT with input/output alphabet V and let be a “terminal alphabet.” Then L(␥, ⌬)=L(␥)傽⌬* is an extended language generated by ␥. The corresponding language families are denoted by EIFTn and EDIFTn, respectively. Some easy consequences of our above findings are: Theorem 5 1. 2.
for all nⱖ1 and for all nⱖ1.
Looking more carefully, we also find ED0L=EDIFT 1, E0L=EIFT1 and RE=EIFT3 because of Theorem 2. Finally, we state: Theorem 6
.
Proof. Let L be an ET0L language. By using [15, Theorems V.1.3 and V.1.4], we can assume that L is generated by an ET0L system G=(V, Σ, (P1, P2}, ω) with . Let a 0 and c be new symbols, i.e. . Define , where:
It is easy to see that L=L(␥)傽Σ* holds.
ⵧ
5 Conclusions and Further Research Proposals In this paper, we obtained further results about the relationships between iterated transducers and Lindenmayer systems. Similarly, the relationships with models that are obviously akin such as restart automata (see e.g. [7]), clog automata [5] and “sequentialized versions” of programmed 0L systems [14] are of interest, particularly as far as the descriptional complexity is concerned (similar to the number of states as studied in the present note for IFT′s). Furthermore, (E)T0L systems with regular context conditions (which are specific to each table and which determine the applicability of a table)
© 2003 Taylor & Francis
On Iterated Sequential Transducers
129
can be easily simulated by IFT′s and, to our knowledge, have not been investigated before, although they seem to be quite natural and interesting in their own right. Besides the power of iterated “propagating” (i.e. -free) IFT′s and DIFT′s, the main open problem concerning iterated transducer languages is whether the hierarchy is infinite. We conjecture that the hierarchy is infinite, but it is very hard to give a satisfying proof. Many derivation steps might not contribute to the candidate language, because the transducer is not in a final state after the present sentential form has been scanned.
References [1] P.R.J.Asveld, On controlled iterated gsm mappings and related operations, Revue Roumaine des Mathématiques Pures et Appliquées, 25 (1980), 136–145. [2] J.M.Autebert and J.Gabarró, Iterated gsm’s and co-cfl, Acta Informatica, 26 (1989), 749–769. [3] A.Ehrenfeucht and G.Rozenberg, The number of occurrences of letters versus their distribution in some E0L languages, Information and Control, 26 (1974), 256–271. [4] V.Geffert, Normal forms for phrase structure grammars, RAIRO Informatique Théorique et Applications, 25 (1991), 473–496. [5] L.Haines, Representation theorems for context-sensitive languages, Notices of the American Mathematical Society, 16.3 (1969), 527. [6] G.T.Herman and G.Rozenberg, Developmental Systems and Languages. NorthHolland, Amsterdam, 1975. [7] P.Jancar et al., On restarting automata with rewriting. In Gh. Paun and A.Salomaa (eds.), New Trends in Formal Languages, Springer, Berlin, 1997, 119–136. [8] J.van Leeuwen, The membership problem for ET0L-languages is polynomially complete, Information Processing Letters, 3 (1975), 138–143. [9] V.Manca, C.Martín-Vide and Gh. Paun, New computing paradigms suggested by DNA computing: computing by carving. In L.Kari, H.Rubin and D.H.Wood (eds.), Preliminary Proceedings of the 4th DIMACS Workshop on DNA Based Computers. University of Pennysylvania, Philadelphia, Pa., 1998, 41–56. [10] V.Manca, C.Martín-Vide and Gh. Paun, Iterated gsm-mappings: a collapsing hierarchy. In J.Karhumäki, H.Maurer, Gh. Paun and G.Rozenberg (eds.), Jewels are Forever. Springer, Berlin, 1999, 182–193. [11] Gh. Paun, On the iteration of gsm mappings, Revue Roumaine des Mathématiques Pures et Appliquées, 23 (1978), 921–937. [12] Gh. Paun, Classes of iterated gsm’s suggested by suspicious communication questions, Revue Roumaine de Linguistique, Cahiers de Linguistique Théorique et Appliquée, XXIV.2 (1987), 139–144. [13] Gh. Paun, The complexity of language translation by gsm’s, Revue Roumaine de Linguistique, Cahiers de Linguistique Théorique et Appliquée, XXV.1 (1988), 49–58.
© 2003 Taylor & Francis
130
H.Bordihn, H.Fernau, M.Holzer
[14] K.S.Rajasethupathy and R.K.Shyamasundar, Programmed 0L-systems, Information Sciences, 20 (1980), 137–150. [15] G.Rozenberg and A.Salomaa, The Mathematical Theory of L Systems. Academic Press, New York, 1980. [16] H.Takahashi, The maximum invariant set of an automaton system, Information and Control, 32 (1976), 307–354. [17] P.M.B.Vitányi, Lindenmayer Systems: Structure, Languages, and Growth Functions, Technical Report 96, Mathematisch Centrum, Amsterdam, 1980. [18] D.Wood, Iterated a-NGSM maps and ⌫ systems, Information and Control, 32 (1976), 1–26.
© 2003 Taylor & Francis
Distributed Real-Time Automata C t lin Dima Department of Fundamentals of Computer Science University of Bucharest Romania
[email protected] Abstract. We introduce a class of automata with real-time constraints, automata which can be seen as tuples of real-time automata working on their own tape and synchronized on their transitions. Though the general class is equivalent to timed automata and hence is not closed under language complementation, the subclass consisting of automata with a so-called stuttering-free condition is shown to be closed under language complementation.
1 Introduction Automata theory remains one of the most fruitful and influential domains in theoretical computer science. It is used in such diverse fields as linguistics, semantics and formal verification and it has influenced such distant disciplines as logic and control theory. In the past decade much interest has been shown in adapting automata theory to the problems of real-time systems, and particular emphasis has been put on generalizing finite automata by taking time into account. A few types of automata have proved to be useful, but unfortunately their properties are rather complementary. Nevertheless, no class of automata can
© 2003 Taylor & Francis
132
C. Dima
claim to be the most natural and expressive extension of finite automata. Timed automata of [1] have an undecidable universality problem; event-clock automata [2, 7], though complementable, have rather complicated algebraic properties and are not closed under substitution; and real-time automata [4] cannot support semantics of real-time concurrent programming languages. In this paper, we build upon an idea that arose in [6]: a tuple of real-time automata working on their own tape but synchronized on their transitions might still be closed under complementation. This idea has its roots in the semantics of real-time synchronous programming languages, which (when working on variables with finite domains) can be given in the so-called Simple Duration Calculus [5]. From the very beginning it is apparent that the automata built this way are too powerful–they are equivalent to the timed automata of [1] and the complementation procedure from [6] is therefore lost. However, we find that by adding the so-called stuttering freeness condition automata become determinizable and hence complementable. It is the same condition that on real-time automata required some properties of the Kleene algebra of sets of intervals; see [6]. The paper is organized as follows. In the next section we recall the definitions and properties of signals and real-time automata. The third section contains the definitions of distributed real-time automata and the observation that stuttering implies nonclosure under complementation. The fourth section contains the constructions for determinization and complementation of the automata being discussed. We end by formalizing the problem of constructing the semantics of real-time synchronous programs from the semantics of each component.
2 Preliminaries Let us first fix some notations: IR+ denotes the set of nonnegative numbers while denotes the set of intervals with nonnegative rational bounds. We denote by πi the usual projection of a Cartesian product on the i-th component and Diag(A) denotes the diagonal relation on A, i.e. . For a function f: and for some α>0, if there exist some a, , a≠b and >0 such that f(x)=a for all and f(x)= b for all , then we call a the left limit of f at α and b the right limit of f at α and denote them f(α-0) and f(α+0), respectively. We say that f has a discontinuity at α iff f(α-0)≠f(α+0). The discontinuity is left iff we also have that f(α)=f(α+0). Definition 1 A signal over an alphabet Σ is a function σ : [0, α)→Σ whose domain is an initial interval of the positive numbers and which has finitely many discontinuities, all being left discontinuities.
© 2003 Taylor & Francis
Distributed Real-Time Automata
133
Given n-sets of symbols Σ1,…, Σn (not necessarily pairwise disjoint), an nsignal is a signal over Σ1×…×Σn We assume throughout this paper that the alphabets are finite. We denote the domain of σ as dom(σ) and the endpoint of this domain as endp(σ). The set of all signals over Σ is denoted Sig(Σ) and subsets of it are called real-time languages, in analogy with sets of words over Σ. Also the set of n-signals is denoted Sig(Σ1,…, Σn). Note that for each i 僆 [n], the projection πi: Σ1×…×Σn→Σi extends to a function that is also denoted πi from Sig(Σ1,…, Σn) to Sig(Σ) and defined as endp(πi(σ))=endp(σ ) and πi(σ)(t)=πi(σ (t)) for all . Now let σ Σ Sig(Σ). For each discontinuity α of σ we define the duration of the symbol after α as the difference d(σ , α)=α-ß, where ß is the next discontinuity after α, if it exists, or endo(σ ) otherwise. Similarly, define d(σ , α) for α=0 too. Then we extend this definition for n-signals: the duration of the i-symbol after α in an n-signal σ is di(σ , α)=d(πi(σ ), α), if α is a discontinuity of πi(σ ) and undefined otherwise. Sig(Σ) can be endowed with a noncommutative monoidal structure by defining a concatenation operation: given two signals σ1 and σ2, their concatenation can be defined as the signal σ with domain [0, endp(σ 1)+endp (σ2)) and whose value is:
We denote the concatenation of σ 1 with σ 2 as σ 1; σ 2. The unit of concatenation is the unique signal with empty domain σ e: [0,0)→Σ Then (Sig(Σ)), the powerset of Sig(Σ), can be given a structure that is similar to (Σ*), namely a Kleene algebra structure [3] by defining the star of a real-time language
, as
where L0={ σ ε} and
. Here, concatenation on languages is the natural extension of concatenation on words. Definition 2 A real-time automaton (RTA for short) is a tuple A=(Q, Σ, λ, l, δ, S, F) where Q is the (finite) set of states, is the transition relation, are the sets of initial, resp. final states, λ: Q→S is the state labeling function and is the interval labeling function. RTA works over signals: a run of length n is a sequence of locations connected by δ, i.e. . A run is called accepting iff it starts in S and ends in F. The run accepts a signal σ iff the signal can be split into n parts such that the state within the i-th part equals λ(qi) while the length of this
© 2003 Taylor & Francis
134
C. Dima
i-th part is in the interval (qi). Formally there exist such that dom and σ (t)= (qi) for all and all i ∈ [n]. This definition may also be extended to any runs and signals: we say that a run is associated to a signal iff the above properties hold for the respective run and signal. Clearly the splitting points must contain all the time points where state changes occur but there might be more splitting points than discontinuities, as two consecutive locations in the run might have the same state label. This is a form of nondeterminism specific to automata with real-time constraints. Definition 3 A RTA is language deterministic iff each signal in L( ) is associated to a unique run. is stuttering-free iff it does not have a transition (q, r) with λ(q)=λ(r). is state-deterministic iff initial locations have disjoint labels and transitions starting in the same locations have disjoint labels too, i.e. whenever r≠s and either r, s, ∈ S or (q, r),(q, s) ∈ δ then (r)≠λ(s) or (r)≠(s)=θ. is simply called deterministic whenever it is both statedeterministic and stuttering-free. Hence, for stuttering-free RTA, the splitting points are unique for each signal. Notice also that there might be state-deterministic RTA that are not language deterministic due to stuttering steps. The important property that stutteringfree RTA have is that the usual determinization procedure of finite automata can be adapted for them: Theorem 1 The class of languages which are accepted by stuttering-free realtime automata is closed under complementation. This theorem is a corollary of the results proven in [6].
3 Distributed Real-Time Automata Definition 4 An n-distributed real-time automaton (n-DRTA) is a tuple:
where n≥1, Q is the set of states, λi: Q→Σi are the state labeling functions, are the interval labeling functions, is the transition relation and are the sets of initial, resp. final, states. The transitions (q, K, r)∈ δ also obey the following condition: if λi(q)≠λi(r) then i ∈ K. The second component of each transition is called the resetting component. Its use will be apparent in defining the language of these automata.
© 2003 Taylor & Francis
Distributed Real-Time Automata
135
A run is some alternating sequence (q1, K1, q2,…, Km-1, qm) consisting of states qi ∈ Q and sets of indices with the property that (qj, Kj, qj+1) δ for all . An accepting run, therefore, is a run that begins in an initial state and ends in a final state. We denote the set of runs of by Runs( ) and the set of accepting runs of by Runs( ). Intuitively, in n-DRTA there are n real-time automata which work in parallel but which are synchronized on certain transitions. The resetting component is used by each RTA to determine where its current state has to be changed. These ideas are formalized as follows: the i-th underlying RTA of the above nDRTA is the RTA where: (1) Then, given a run ρ=(q1, K1, q2,…, Km-1, qm) we may define the i-th projection of this run as the set of runs in induced by ρ:
Note that there might be more than one run in pri(ρ) because definition 1 implies that a single transition in δ induces a whole range of transitions in δi. We also call the runs in pri(ρ) unidimensional. Then, given an n-signal , we say it is accepted by the accepting run ρ=(q1, K1, q2 …, Km-1, qm) (and hence accepted by the n-DRTA) iff for each i ∈ [n] the signal is accepted by some unidimensional run in the i-th projection of ρ. The language accepted by is the set of n-signals accepted by . We say two n-DRTA are equivalent iff they have the same language. We will denote by DRTA the class of all n-DRTA, for all n ∈ IN. We should point out here that “some” may be replaced by “any” in the definition of acceptance of a signal. Proposition 1 The language emptiness problem is decidable for DRTA. This follows since n-DRTA are special cases of timed automata with n-clocks which have a decidable emptiness problem [1]. Definition 5 A n-DRTA S, F) is called language deterministic iff each n-signal σ ∈ L( ) is associated to a unique accepting run. is called state deterministic iff the following conditions are met:
· for each pair of transitions (q, K, r), (q, K, s) δ with the property that λi(r)=λi(s) for all i ∈ [n], there is some j ∈ [n]such that ij(r)≠ij(s);
© 2003 Taylor & Francis
136
C. Dima
· for each pair of initial states r, s ∈ S with the property that λ (r)= λ (s) for all i ∈ [n], there is some j ∈ [n] such that ij (r)≠ij [s].
i
i
The automaton is called stuttering free iff, for each transition (q, K, r) ∈ δ, K is the set of indices i such that λi(q)≠λi(r). If is both stuttering-free and state deterministic then it is simply called deterministic. Finally, the automaton is called complete iff for each n-signal (not necessarily in the language of ) there exists a unique associated run that starts in an initial state. Hence, in a stuttering-free n-DRTA the transition relation could just be given as a subset of Q2 with the property that if (q, r) ∈ δ then there exists some i ∈ [n]such that λi(q)≠λi(r). We will use this definition of the transition relation for these automata. This also has the advantage that runs can be described then simply as (finite) sequences of states (qi)i∈[m]. It is clear that a deterministic n-DRTA is language deterministic. It is also easy to find some language nondeterministic automata that are still deterministic or some state deterministic automata that are not language deterministic. All these notions are important to prove the closure under complementation: for the usual complementation construction, a language deterministic automaton is needed which is also complete and its set of final states must then be complemented. Note that if the n-DRTA is stuttering-free then all its projections are stuttering-free too. 3.1 Comments on the Stuttering-Free Condition The stuttering-free condition looks rather strong and it can easily be seen that it does not allow DRTA to be closed under concatenation of languages. However, it is essential in the determinization procedure: without the stuttering condition, n-DRTA have essentially the same expressive power as timed automata of [1], hence they have an undecidable universality problem. It is mentioned in [1] that the following language1 is accepted by some timed automaton, while its complement cannot be accepted:
1 Actually, this is a modification of the example in [1], which was stated for the timed words semantics of timed automata.
© 2003 Taylor & Francis
Distributed Real-Time Automata
137
The language consists of signals in which there exist two (possibly non consecutive) discontinuities where the signal jumps from a to b which are separated by a unit interval. Consider the language which consists of 2-signals over {a, b}×{c} and whose first projection is L. We can easily find a 2-DRTA for this language:
Hence, 2-DRTA cannot be closed under complementation, otherwise some timed automata would accept the complement of L.
4 Complementation of Stuttering-Free n-DRTA The aim of the determinization construction is for each signal to have a single run that accepts the signal. For stuttering-free n-DRTA this means that at each discontinuity there must be only one choice to make between the different transitions enabled at that point. Let’s call a transition (q, r) enabled at some discontinuity α in an n-signal σ iff the right limit of each projection of the signal is consistent with the labels of r. It is clear that even for deterministic n-DRTA at each discontinuity of σ more transitions might be enabled. But the length of the next i-symbol uniquely determines the transition which is to be taken. Therefore, in the classical determinization procedure we need to take into account not only sets of states which have the same state label, but also those which share a specific interval label that is not available to any other (similarly state-labeled) set of states. This idea is formalized in the sequel. Start with a n-DRTA S, F) which is stuttering free. Construct the set of subsets of Q consisting of identically labeled states: Also for each T ∈ Id(Q) and T´ ⊆ T denote Ii(T, T´) the intersection of all intervals which label T´ together with the complements of the intervals which label T\T´:
© 2003 Taylor & Francis
138
C. Dima
As usual, the intersection of an empty family of intervals is IR. Note that the collection of all sets Ii(T, T´), where T is fixed but T´ ranges over all subsets of T, forms a partition or IR. This property is essential in the determinization construction and it also assures that the deterministic automaton is complete. On the other hand, it is clear that Ii(T, T´) may not be an interval, but rather a finite union of intervals. So int(Ii(T, T´)) denotes the pairwise disjoint and minimal (in cardinality) set of intervals of which Ii(T, T´) is composed and pi(T, T´) denotes the cardinal of int(Ii(T, T´)). Consider also some enumeration of the intervals in int(Ii(T, T´)):
The states of the deterministic n-DRTA which is equivalent to are (n+2)uples (T, T´, k1,…, kn), where T ∈ Id (Q), T´⊆T and for all i∈[n]. Denote this set of tuples . The state labeling functions are then clear: (2) The interval labeling functions are defined using int(Ii(T, T´)):
The set of initial states consists of all the tuples (S, S´, k1,…, kn) in which the set of initial states S is paired with all its nonempty subsets S´ and some tuple of indices . Similarly, the set of final states consists of all the tuples (U, U’, k1,…, kn), where U’ has a nonempty intersection with the set of final states F (and therefore is itself nonempty) and . To define the transitions of we need another notation: for each denote by R(T, T’, k1,…, kn) the element of Id(Q) which collects the destinations of all transitions that start from T´ and lead to identically labeled states:
Then
will have as its transitions all the tuples of the kind: ((T,T’, k1,…,kn), (R,R’,l1,…,ln)),
where R=R(T, T’, k1, …, kn),
and
]. Hence:
Theorem 2 Stuttering-free n-DRTA and deterministic n-DRTA have the same expressive power.
© 2003 Taylor & Francis
Distributed Real-Time Automata
139
Yet another important outcome of the above construction is that the nDRTA obtained is complete. The n-DRTA which accepts L( )will be, then, the n-DRTA whose components are the same as for with the exception of the set of final states which is \ . Hence: Theorem 3 The class of languages which are accepted by n-DRTA is closed under complementation.
5 Constructing the Semantics of Real-Time Synchronous Programs As noted in [4], RTA may be used to provide semantics for sequential realtime programs, i.e. programs in which real-time constraints are put on the delays between the executions of instructions. We can then define the semantics of an n-component synchronous real-time program as a suitable n-DRTA whose projections are the n RTAs that are the semantics of each component but also whose transition relation models the synchronizations between the components. This idea is formalized as follows: Definition 6 Given n stuttering-free RTA a synchronization pattern on is a set of tuples with the following property:
A synchronization pattern δ is just a special case of stuttering-free n-DRTA:
Therefore, the accepting condition and language of a synchronization pattern are particular cases of the respective definitions for n-DRTA. However, the reverse relationship is unclear because it is unclear whether each n-DRTA is equivalent to some synchronization pattern of its underlying RTAs. The choice of a synchronization pattern consisting of all 2n-uples of the form ((q,…, q), (r,…, r)) is wrong since it is possible that (q, r) ∉ δ1 because it is also possible that i ∉ K, for any transition (q, K, r) of the given n-DRTA. Hence, we conjecture that synchronization patterns are less expressive than n-DRTA. References [1]
R.Alur and D.L.Dill, A theory of timed automata, Theoretical Computer Science, 126 (1994), 183–235.
© 2003 Taylor & Francis
140 [2] [3] [4] [5] [6] [7] [8]
C. Dima R.Alur, L.Fix and T.A.Henzinger, A determinizable class of timed automata. In Computer-Aided Verification. Springer, Berlin, 1994, 1–13. J.H.Conway, Regular Algebra and Finite Machines. Chapman and Hall, London, 1971. C.Dima, Automata and regular expressions for real-time languages. In Proceedings of the AFL’99 workshop, Vasszeczeny, Hungary, 1999. C.Dima, Simple Duration Calculus Semantics of a Real-Time Synchronous Programming Language, 1999, unpublished ms. C.Dima, Real-time automata and the Kleene algebra of sets of real numbers. In Proceedings of STACS’2000. Springer, Berlin, 2000. T.A.Henzinger, J.F.Raskin and P.Y.Schobbens, The regular real-time languages. In Proceedings of the 25th ICALP. Springer, Berlin, 1998. J.E.Hopcroft and J.D.Ullman, Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, Mass., 1992.
© 2003 Taylor & Francis
On Commutative Directable Nondeterministic Automata1 Balázs Imreh Department of Informatics József Attila University Szeged, Hungary
[email protected] Masami Ito Department of Mathematics Faculty of Science Kyoto Sangyo University Japan
[email protected] Magnus Steinby Department of Mathematics University of Turku Finland
[email protected] Abstract. In [8] an input word w of a nondeterministic automaton (nda) was called: (1) D1-directing if the set of states aw in which may be after reading w is the same singleton set {b} for all initial states a; 1 This work has been supported by the Hungarian-Finnish S & T Co-operation Programme for 1997–1999, Grant SF-10/97, the Japanese Ministry of Education, Mombusho International Scientific Research Program, Joint Research 10044098, the Hungarian National Foundation for Science Research, Grant T030143, and the Ministry of Culture and Education of Hungary, Grant FKFP 0704/1997.
© 2003 Taylor & Francis
142
B.Imreh, M.Ito, M.Steinby
(2) D2-directing if the set aw is the same for every initial state a; (3) D3-directing if some state b appears in every set aw. Here we consider these notions for commutative nda. Commutativity is a very strong assumption which virtually eliminates the distinction between general nda and complete nda (cnda). Moreover, the sets of Di-directing words of a given nda are essentially of the same form in all six cases considered. We also give bounds for the maximal lengths of minimum-length Di-directing words of an n-state nda or cnda (i= 1, 2, 3).
1 Introduction An input word of an automaton is directing if it takes from every state to the same fixed state. An automaton is called directable if it has a directing word. Directability has been studied extensively from several points of view and for various types of automata (cf. [3], [7], [8] for some references). In particular, the directability of nondeterministic automata (nda) has also received some attention. In [8] an input word w of an nda was said to be: (1) D1-directing if the set of states in which may be after reading w consists of the same single state b regardless of the starting state, i. e. if aw={b} for all states a, (2) D2-directing if the set aw is the same for all states a, and (3) D3-directing if there is a state b which appears in every set aw. Similar notions have been considered by Goral ik et al. [4] in connection with a game with binary relations on a finite set. Moreover, D1-directable complete nda (cnda) were explicitly studied by Burkhard [1]. Here we consider D1-, D2- and D3-directing words of commutative nda and commutative cnda. Commutativity turns out to be a very strong property in this context. In particular, the considerable differences between general nda and cnda observed in [8] are mostly eliminated. A D1- or D3-directing word of a commutative nda cannot contain any incomplete letters, that is to say, letters for which there are no transitions from some states. Also, incomplete letters may appear only in relatively short minimal D2-directing words, which do not affect the bounds we are interested in. Section 2 introduces some basic notions. In Section 3 we consider the commutative equivalence of words, the commutative subword relation, and commutative closures and cones. A commutative cone is a set which with any word w also contains every word in which w is a commutative subword. It follows from a well-known theorem by König [9] that any commutative cone is finitely generated and, hence, a regular language. This fact is used in Section 4, where we describe the sets of Di-directing words of a given nda or
© 2003 Taylor & Francis
On Commutative Directable Nondeterministic Automata
143
cnda (i=1, 2, 3); in all six cases these sets are commutative cones. This also applies to the sets of directing words of ordinary commutative automata, and altogether we get just two different families of languages as opposed to the five families obtained in [6] without assuming commutativity. In Section 5, we give upper bounds for the length of a minimum-length Di-directing word of an n-state commutative nda or cnda (n≥1, i=1, 2, 3). For deterministic commutative automata, the general lower bound (n-1)2 can be replaced by the exact linear bound n-1 (cf. [7], [10]), and here too commutativity has a considerable effect. In particular, the bounds for Di-directing words of complete nda and general nda are the same. The exact bound 2n-n-1 for Burkhard’s D1-directing words [1] can be replaced by the exact linear bound n-1. A considerable improvement also takes place for for D2-directing words. For D3-directing words the bounds given in [8] can be lowered considerably for general nda. However, it seems that commutativity has not yet been fully utilized for D2- and D3-directability. 2 Preliminaries In what follows, X is always a finite nonempty alphabet. We denote by X* the set of all finite words over X and by lg(w) the length of a word w. The symbol ε represents the empty word. The set of nonempty words over X is denoted by X+. The number of the occurrences of a given letter in a word w∈X* is denoted by lgx(w). A deterministic finite automaton with input alphabet X is a system (A, X, δ), where A is a finite nonempty set of states and δ: A×X→A is the transition function. The transition function is extended to A×X* as usual. An automaton (A, X, δ) may also be viewed as the finite algebra =(A, X), where each is realised as the unary operation . For any a∈A and w∈X*, we also denote δ(a, w) by aw or aw. An automaton =(A, X) is commutative if axy=ayx for all a∈A and x, y ∈ X. The class of commutative automata is denoted by Com. A word w∈X* is a directing word of an automaton =(A, X) if aw=bw for all a,b∈A, and is called directable if it has a directing word. The set of all directing words of an automaton is denoted by DW( ) and the class of all directable automata by Dir. A recognizer over X is a system A=(A, X, δ, a0, F), where (A, X, δ) is an automaton, is the initial state, and is the set of final states. The language recognized by A is the set .A language L is recognizable (or regular) if L=L(A) for some recognizer A. The set of all recognizable languages over X is denoted by Rec(X). We define a nondeterministic automaton (an nda for short) as a generalized automaton =(A, X), where each letter is realised as a binary relation on A. For any a∈A and
© 2003 Taylor & Francis
144
B.Imreh, M.Ito, M.Steinby
is the set of states which may assume when it receives the input x in state a. For any C ⊆ A and x∈X, we set The set of states reachable from some state in C ⊆ A by reading the input word w can now be defined as follows: (1) (2)
if w=xv for some x∈X and v∈X*.
For any w=x1x2…xk, where k≥0 and xi∈X, we may view as the relational product If there is no danger of confusion, we write simply aw and Cw for a and C , respectively. A complete nda (a cnda for short) is an nda such that for all a∈A and x∈X. An nda is commutative if for all x,y∈X. The classes of all commutative nda and commutative cnda are denoted by COM and cCOM, respectively. Since any commutative automaton can be regarded as a commutative cnda, we have An nda
is called the Y-reduct of an nda for every y∈Y. Let us recall the three notions of directability studied in [8]. For any nda and any w ∈Σ∗, we consider the following three conditions: (D1) (D2) (D3) If w satisfies (Di), then it is a Di-directing word of (i=1, 2, 3). For each i=1, is denoted by Di( ), and is Di2, 3, the set of Di-directing words of directable if it has a Di-directing word. The class of Di-directable nda is denoted by Dir(i) and the class of Di-directable cnda by CDir(i). 3 Commutative Equivalence, Closures and Cones A word u ∈ X∗ is a commutative subword of a word v ∈X∗, and we express this by writing u≤cv, if lgx(u)≤lgx(v) for every x ∈X. Similarly, the words u and v are said to be commutatively equivalent, u≡c v in symbols, if lgx(u)=lgx(v) for every x ∈X. The following facts are obvious. Lemma 1 Let
=(A, X) be a commutative nda and let u, v ∈X∗:
(a) If u≡cv, then au=av and Cu=Cv for all a∈A and C⊆A. (b) If u≤cv, then Av⊆Au. (c) If u≤cv and au=bu for some a, b∈A, then also av=bv. The commutative closure of a language L⊆X* is the language:
© 2003 Taylor & Francis
On Commutative Directable Nondeterministic Automata
145
and the commutative cone generated by L in X* is defined as the language:
For L={w}, we write simply c(w) and [w)c for c(L) and [L)c, respectively. A language L⊆X* is called commutative if c(L)=L. The following facts are easily verified. Lemma 2 The mappings L哫c(L) and L哫(L)c are algebraic closure operators on X*. Moreover, for any L⊆X*: (a) and , (b) , (c) c([L)c)=[L)c, and (d) [c(L))c=[L)c. Furthermore, for any words The (internal) shuffle of two languages K, L⊆X* is the set of all words u 0 v 1 u 1 v 2 …u m-1 v m, where m≥1, u i , and (cf. [2], for example). It is easy to see that commutative cones can be expressed in terms of the shuffle operation as follows. For any If the language L is commutative, then . Clearly, ≤c is a quasi-order on X* and ≡c is the corresponding equivalence: for any The proper part , where S is the initial symbol and P contains the productions:
© 2003 Taylor & Francis
Using Alternating Words to Describe Symbolic Pictures
233
The concept of consistency can be extended to DSP grammars in the natural way. A DSP grammar , is consistent if any is consistent. The DSP grammar RM of Example 3 is a consistent DSP grammar. Indeed, any word is consistent as can be easily verified by observing that any string on the righthand side of any production is consistent and the subpictures generated by any nonterminal can never overlap. It can be easily verified that when is in canonical form, any sentential form sf is a sentence in , where N is the set of nonterminals. As a consequence, by applying the function dspic() to sf, we obtain a drawn symbolic picture called pictorial sentential form. Now, we provide the definition of SP grammars and SP languages. Definition 8 A symbolic picture grammar (SP grammar, for short) G is a pair < , spic ()>, where is a -grammer and spic() is the b function that translates a -word into a symbolic picture. Given an SP grammar , the SP language L generated by G, denoted by L(G), is:
In the next example we provide an SP grammar in canonical form which describe some simple arithmetic expressions, and show some pictorial sentential forms. Example 4 Let , spic ()>, be the symbolic picture grammar in canonical form with P={E→ E rb+r b T, E→T, F→n, T→T r b *rb F, , T→F,}. A rightmost pictorial derivation of G is:
In [2] the canonical normal form of DSP and SP grammars has been exploited to prove their equivalence with some subclasses of Positional Grammars. Such formalism has been proposed to specify visual languages and has successfully been used for the automatic generation of visual programming environments [1]. The most appealing feature of the model is its capability of inheriting and extending to the visual field concepts and techniques of a traditional string languages, because it represents a natural extension of contextfree grammars. We believe that the study of symbolic picture languages can be usefully exploited to provide insight into the features of the Positional grammar model.
© 2003 Taylor & Francis
234
G.Costagliola, V.Deufemia, F.Ferrucci, C.Gravino, M.Salurso
6 Conclusions In this chapter, we have presented models of drawn symbolic pictures and symbolic pictures. Several interesting questions may be studied in this context. In particular, the descriptions of symbolic pictures involve possible conflicts in the assignment of symbols to positions. The proposed definition solves the conflict by giving priority to the first visit of a position. So, it is interesting to investigate whether or not the lack of conflicts is decidable for a given (drawn) symbolic picture grammar. The problem has been partially addressed in [2], where it was shown that it is always possible to decide whether or not a -grammer generates only consistent descriptions for a drawn symbolic picture. The hypothesis of consistency plays an important role in the analysis of some decidability and complexity properties of (drawn) symbolic picture languages. As a matter of fact, in [2] it was shown that the membership problem for consistent regular drawn symbolic pictures within a stripe is decidable in linear time deterministically. Another interesting issue concerns the analysis of a variant of the proposed models where the conflicts can be resolved by taking into account priority values which could be assigned to the alphabet symbols.
References [1] G.Costagliola, A.De Lucia, S.Orefice and G.Tortora, Automatic generation of visual programming environments, IEEE Computer, 28 (1995), 56–66. [2] G.Costagliola and F.Ferrucci, Symbolic picture languages and their decidability and complexity properties, Journal of Visual Languages and Computing, Special Issue on the Theory of Visual Languages, 10 (1999), 381–419. [3] G.Costagliola and M.Salurso, Drawn symbolic picture languages. In Proceedings of the International Workshop on Theory of Visual Languages, Capri, 1997, 1–14. [4] J.Feder, Plex languages, Information Sciences, 3 (1971), 225–241. [5] F.Ferrucci et al., Symbol-relation grammars: a formalism for graphical languages, Information and Computation, 131 (1996), 1–46. [6] H.Freeman, Computer processing of line-drawing images, Computer Surveys, 6 (1974), 57–97. [7] D.Giammarresi and A.Restivo, Two-dimensional languages. In G.Rozenberg, A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997, vol. 3, 215–267. [8] J.E.Hopcroft and J.D.Ullman, Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass. 1979. [9] C.Kim, Complexity and decidability for restricted class of picture languages, Theoretical Computer Science, 73 (1990), 295–311. [10] C.Kim, Picture iteration and picture ambiguity, Journal of Computer and System Sciences, 40 (1990), 289–306.
© 2003 Taylor & Francis
Using Alternating Words to Describe Symbolic Pictures
235
[11] C.Kim and I.H.Sudborough, The membership and equivalence problems for picture languages, Theoretical Computer Science, 52 (1987), 177–191. [12] K.Marriott, Constraint multiset grammars. In Proceedings of the IEEE Symposium on Visual Languages, St. Louis, Mi., 1994, 118–125. [13] H.A.Maurer, G.Rozenberg and E.Welzl, Using string languages to describe picture languages, Information and Control, 54 (1982), 155–185. [14] K.Slowinski, Picture words with invisible lines, Theoretical Computer Science, 108 (1993), 357–363. [15] I.H.Sudborough and E.Welzl, Complexity and decidability for chain code picture languages, Theoretical Computer Science, 36 (1985), 173–202.
© 2003 Taylor & Francis
What Is the Abelian Analogue of Dejean’s Conjecture?1 James D.Currie Department of Mathematics and Statistics University of Winnipeg Manitoba, Canada
[email protected] Abstract. We motivate the study of Abelian repetitive thresholds and give interesting open problems and some (weak) bounds.
We say that a word w encounters yy if we can write w=XY1Y2Z, where For example, banana=b an an a encounters yy, with X=b, Y1=Y2=na, Z=a. We say that w avoids yy if w does not encounter yy. Thue [14] proved that the language is infinite. We define an equivalence relation ~ on words by saying that Y1~Y2 if Y2 can be obtained from Y1 by reordering the letters. For example stops~posts. We say that a word w encounters yy in the Abelian sense if we can write w=XY1Y2Z, where . For example, disproportionate=dis pro por tionate encounters yy in the Abelian sense, with X=dis, Y1=pro, Y2=por, Z=tionate. We say that w avoids yy in the Abelian sense if w does not encounter yy. In 1961, Erdös [9] asked whether the language was infinite for some finite alphabet ⌺. Evdokimov [10] answered this question in the affirmative in 1
Research supported by an NSERC Operating Grant.
© 2003 Taylor & Francis
238
J.D.Currie
1968 for |⌺|=25. Erdös had mentioned the possibility that L might be infinite even with |⌺| as small as 4, and the problem of showing that 4 letters sufficed to avoid yy in the Abelian sense became well-known. Finally, in 1992, Keränen [11] showed that 4 letters sufficed. These problems of avoiding yy generalize into interesting decision problems. Let p=y1y2…yn be a word, where the yi are letters. We say that word w encounters p if we can write w=XY1Y2…YnZ, where if yi=yj. Thus (departing from English words, alas): 12345672347568=1 234 56 7 234 7 56 8 encounters p=abcacb with X=1, Y1=234=Y4, Y2=56=Y6, Y3=7=Y5, Z=8. We say that p is k-avoidable if there is some alphabet ⌺, |⌺|=k, such that the language is infinite. Evidently this depends only on k. If p is k-avoidable for some finite k, then we say that p is avoidable. In 1979, Bean, Ehrenfeucht & McNulty [2], and independently Zimin [15] gave a decision procedure for determining whether p is avoidable. The problem of deciding, given k and p, whether p is k-avoidable has remained open, but a solution to it has been pursued vigorously. For example, Cassaigne’s PhD thesis [4] includes an inventory of words over {a, b, c}, which are classified according to whether they are known to be 2-avoidable, 3-avoidable or 4-avoidable. Here is one intriguing open problem: Problem 1 Is there an avoidable pattern which is not 4-avoidable? [1] As in the case of yy, we have natural Abelian analogues: We say that word w encounters p in the Abelian sense if we can write w=XY1Y2… YnZ, where if yi=yj. Thus, for example: 12345672437658=1 234 56 7 243 7 65 8 encounters p=abcacb in the Abelian sense with X=1, Y1=234~243=Y4, Y2=56~65=Y6, Y 3=7=Y5, Z=8. Proceeding in the obvious way, define k-avoidability (avoidability) in the Abelian sense. In 1993 [6], I asked which patterns are avoidable or k-avoidable in the Abelian sense. From Keränen’s work we know that yy is 4-avoidable in the Abelian sense, but what more can be said? At this point we should point out that Abelian k-avoidability can have useful implications for ‘ordinary’ k-avoidability: Lemma 1 The pattern p=abcacb is 4-avoidable. Proof. Any word avoiding yy in the Abelian sense must avoid abcacb in the ordinary sense. Since yy is 4-avoidable in the Abelian sense, abcacb is 4-avoidable. 䊐
© 2003 Taylor & Francis
What Is the Abelian Analogue of Dejean’s Conjecture?
239
Remark. In fact, abcacb is 3-avoidable. Whether abcacb is 2-avoidable is an open problem [4]. Ordinary and Abelian avoidability are nevertheless not identical, as shown by the following lemma from [7]: Lemma 2 The pattern p=abcabdabcba is avoidable, but not in the Abelian sense. The problem of ordinary k-avoidability has been studied since 1979. Another tool which can be brought to bear on this problem is the study of repetitive thresholds, introduced by Dejean [8]. Fix r, 17/5. Suppose then that w contains some subword ABACBAB. Our proof can be broken into cases depending on which of |A| and |B| is greater: 1. |A|>|B|: In this case, w contains the word uvu, where u=A, v=B. Then, |uvu|/|uv|=|ABA|/|AB|=1+|A|/|AB|ⱖ1+1/2>7/5. 2. |B|>|A|: In this case, w contains the word uvu, where u=B, v=A. Again, |uvu|/|uv|=|BAB|/|BA|=1+|B|/|BA|ⱖ1+1/2>7/5.
© 2003 Taylor & Francis
240
J.D.Currie
In either case, w encounters yr, and r>7/5.䊐 The problem of which words p are k-avoidable, or avoidable in the Abelian sense, is obviously of intrinsic interest. We see also that Abelian avoidability gives insight into the open problem of ‘ordinary’ k-avoidability. Since repetitive thresholds also give insight into k-avoidability, it is natural to seek the Abelian version of Dejean’s conjecture. Fix r, 14. The behaviour of a sort of dual repetitive threshold function would also be useful: Define the dual Abelian repetitive threshold function on (1, 2] by letting:
For example, by [11], D ART(2)=4. Only very weak information about D ART(r) is known for most r. In [5] it is shown that: (1) Considering the number of years it took to show that D ART(2)=4, it seems unlikely that showing ART(4) such that: (1) and, furthermore: supp(ahi(w)) ≠ supp(ahj(w)) whenever 0≤i<j. (Note that, if A is a field, this condition is equivalent to the local finiteness of the family {hn(w)}n≥0). Consider the series r given in (1) and denote: ahn(w)=cnwn, where
and wn∈X* for n≥0. Then we have: (2)
In what follows, the righthand side of (2) is called the normal form of r. A sequence (cn)n≥0 of elements of A is called an D0L multiplicity sequence over A if there exists an D0L power series r over A such that (2) is the normal form of r.
© 2003 Taylor & Francis
On D0L Power Series over Various Semirings
265
Now, suppose A and B are commutative semirings and B債A. We say that A is a Fatou extension of B with respect to D0L power series if whenever r∈A> is an D0L power series over A such that , r is an D0L power series over B. Similarly, A is a Fatou extension of B with respect to D0L multiplicity sequences if whenever (cn)n≥0 is an D0L multiplicity sequence over A such that for all n≥0, (cn)n≥0 is an D0L multiplicity sequence over B. For later use we recall the characterization of D0L multiplicity sequences from Honkala [7]. A sequence (an)n≥0 of nonnegative integers is called a modified PD0L length sequence if there exists a nonnegative integer t such that a0=a1=…=at-1=0 and (an+t)n≥0 is an PD0L length sequence. A sequence (an)n≥0 of nonnegative integers is a modified PD0L length sequence if and only if the sequence (an+1-an)n≥0 is N-rational (see Rozenberg and Salomaa [9]). Theorem 1 Suppose A is a commutative semiring. A sequence (cn)n≥0 of nonzero elements of A is an D0L multiplicity sequence over A if and only if there exists a positive integer k, nonzero and modified PD0L length sequences (sin)n≥0 for 1≤i≤k such that: (3) for all n≥0. We will also need the following result from Honkala [7]. Theorem 2 Suppose A is a field. A sequence (cn)n≥0 of nonzero elements of A is an D0L multiplicity sequence over A if and only if there exists a positive integer t and integers β1,…, βt such that: (4) for n≥0.
3 Fatou Properties The positive Fatou properties given below follow easily from the definitions and Theorem 2. A direct proof of Theorem 4 would be much more difficult. Theorem 3 Suppose A債R is a semiring. Then A is a Fatou extension of A+ with respect to D0L multiplicity sequences and D0L power series. In particular, Z is a Fatou extension of N with respect to D0L multiplicity sequences and D0L power series.
© 2003 Taylor & Francis
266
Proof. Suppose
J.Honkala
is an D0L power series over A given by:
where h: A<X*>→A<X*> is a monomial morphism, a∈A and w∈X*. Define the monomial morphism h1: A+<X*>→A+<X*> by h1(x)=h(x) if the coefficient of h(x) is positive and h1(x)=-h(x) if the coefficient of h(x) is negative, x∈X. Then:
is an D0L power series over A+. Hence, A is a Fatou extension of A+ with respect to D0L power series. It follows that A is a Fatou extension of A+ with respect to D0L multiplicity sequences. Theorem 4 Suppose E and F are fields and F⊆E. Then E is a Fatou extension of F with respect to D0L multiplicity sequences. Proof. Suppose (cn)n≥0 is an D0L multiplicity sequence over E such that for all n≥0. Then the only if-part of Theorem 2 implies the existence of a positive integer t and integers β1,…, βt such that: (5) for n≥0. Now (5) implies by the if-part of Theorem 2 that (cn)n≥0 is an D0L multiplicity sequence over F. Theorem 5 R+ is a Fatou extension of Q+ with respect to D0L multiplicity sequences. Proof. Suppose (cn)n≥0 is an D0L multiplicity sequence over R+ such that for all n≥0. Then (cn)n≥0 is an D0L multiplicity sequence over R. By Theorem 4, (cn)n≥0 is an D0L multiplicity sequence over Q. Hence, by Theorem 3, (cn)n≥0 is an D0L multiplicity sequence over Q+. In general, a field is not a Fatou extension of its subfield with respect to D0L power series. Example 1 Let E<X*>→E<X*> by
© 2003 Taylor & Francis
, X={b} and define the monomial morphism h: . Then:
On D0L Power Series over Various Semirings
267
is an D0L power series over E. Because:
we have . However, r is not an D0L power series over Q. Indeed, if g: Q<X*>→Q<X*> were a monomial morphism such that:
we would have g(b2)=2b4 which is not possible. Hence extension of Q with respect to D0L power series.
is not a Fatou
Example 1 also shows that R+ is not a Fatou extension of Q+ with respect to D0L power series. Example 2 Let A=Q+ and X={b, c, d, e}. Define the monomial morphism h: A<X*>→A<X*> by:
Then:
is an D0L power series over Q+. The associated D0L multiplicity sequence (cn)n≥0 is given by:
for all n≥0. However, by (see Example 2.4 in Honkala [6]). Hence, Theorem 1, (cn)n≥0 is not an D0L multiplicity sequence over N (resp. Z). It follows that Q is not a Fatou extension of N or Z and Q+ is not a Fatou extension of N with respect to D0L multiplicity sequences or D0L power series.
4 Decidability Questions In this section, we discuss various decidability questions which are closely related to Fatou properties of D0L multiplicity sequences and D0L power series. Theorem 6 Suppose A債R is a semiring. It is decidable whether or not cn∈A+ for all n≥0 if (cn)n≥0 is an D0L multiplicity sequence over A.
© 2003 Taylor & Francis
268
J.Honkala
Proof. By Theorem 2 there exists a positive integer t and integers ß1,…, ßt such that: (6) for n≥0. Therefore, cn∈A+ for all n≥0 if and only if cn∈A+ for all 0≤n→A<X*> is a monomial morphism. Denote for x∈X. We claim that (7) holds for all n≥0. First, (7) is clear if n=0. Then, if (7) holds for n≥0, we have:
Hence (7) holds for all n≥0. This concludes the proof of Lemma 1 in one direction. Suppose then that (7) holds. Let wn=gn(w0), n≥0, where g: X*→X* is a morphism. Define the monomial morphism h: A<X*>→A< X*> by h(x)=axg(x), x∈X. Then it follows that:
is an D0L power series over A. Hence (cn)n≥0 and (wn)n≥0 are compatible. Lemma 2 Suppose (tin)n≥0, 0≤i≤m, are Z-rational (resp. N-rational) sequences. It is decidable whether or not there exist such that: (8) for all n≥0.
© 2003 Taylor & Francis
On D0L Power Series over Various Semirings
271
Proof. Suppose (tin)n≥0, 0≤i≤m, are Z-rational sequences. Then there exist a positive integer k and integers ß1,…, ßk such that: (9) for all 0≤j≤m, n≥0. Next, decide whether or not there exist integers a1,…, am such that (8) holds for 0≤n0, and let Vr, u be the set of their representations with respect to r. By König’s lemma, we find that the set of minimal elements in Vr, u is finite. We denote it by . Using lemma 3(i), it can be seen that a vector of nonnegative integers t is a representation of some un with respect to r, if and only if t is a linear combination of the vectors in . This implies that, denoting by the set of words represented by the vectors in , n n , if and only if u is a linear combination we have (r, u )>0, for some of words in .
© 2003 Taylor & Francis
On the Difference Problem for Semilinear Power Series
321
Furthermore, let us denote . From the above, we finally obtain that (r, un)>0 if and only if n is a linear combination of integers in . The claim follows now by lemma 1. 䊐 Theorem 1 Let r be a quasisemilinear series and u be a word in a nonnegative integer n0 such that:
. There is
Pr,u={nⱖn0|(r,un)>0} is a semilinear set of nonnegative integers. Proof. Clearly, and thus, it is enough to prove the theorem for a quasilinear series , where p0 is a monomial, and p1,…, pm are proper polynomials. As in the proof of lemma 4, we consider the words un, with (r, un)>0 and Vr,u the set of their representations with respect to r. By König’s lemma, the set of minimal elements in Vr,u is finite and, hence, by lemma 3(iii), , for some nonnegative integers n1,…, nk and some elementary quasilinear series r1,…, rk. Moreover, (r0, un)=0, for all . The claim follows now using lemma 4. 䊐
3 The Result The first step towards the main result of this paper is to characterize those ⺞-semilinear power series that have bounded coefficients. Lemma 5 If s is a semilinear series with bounded coeficients, then s is of the form , for some monomials m1,…, mk and some words u1,…, uk. Proof. Since s is a semilinear series, then it is of the form:
for some monomials m1,…, mk, and some polynomials p1,…, pk. Consider the polynomial p1=␣1u1+…+␣iui, with , and . All the coefficients ␣1,…, ␣i must be equal to 1, since otherwise s does not have bounded coefficients. Moreover, if p1 is not a monomial, i.e. iⱖ2, then , and this is not bounded for j1, j2ⱖ0. The lemma is thus proved. 䊐 We are now ready for the main result of this paper. Theorem 2 If r and s are semilinear series such that rⱖs, and s has bounded multiplicities, then r-s is a rational series.
© 2003 Taylor & Francis
322
I.Petre
Proof. We prove slightly more generally that if r is a quasisemilinear series, s=v⬘ v*, with and rⱖs, then r-s is a quasisemilinear series. Using lemma 5, the theorem obviously follows from this. We begin by making several reductions of the general problem to simpler instances of it. Let the quasisemilinear form of the series r be r=r1+…+ri, with r1,…, ri quasilinear series. Our first claim is that we can assume v⬘=1, without any loss of generality. Assume that this is not the case, i.e. v⬘⫽1, and consider the linear series r1, , where p0 is monomial, and p1,…, pm are proper polynomials. Take then the representations with respect to r1 of the words v⬘vn, for all nⱖ0. By König’s lemma, the set of minimal elements in this set of vectors is finite, and then, by lemma 3(iii), we obtain that:
is a quasisemilinear series such that , is a quasilinear series. We proceed similarly with r2,…, ri, and obtain that:
where
, for all nⱖ0, and
r=r⬘+v⬘r ⬙ , where r⬘ and r⬙ are quasisemilinear series, and (r⬘, v⬘ vn)=0, for all nⱖ0. Since rⱖs, we then have that v⬘r⬙ ⱖs, and hence r⬙ ⱖv*. Moreover, r-s=r⬘+v⬘(r⬙ -v*). It is enough now to prove that the difference r⬙ -v* is quasisemilinear. The first claim is thus proved. Our second goal is to reduce the problem to the case when r is a linear series. Let us assume that this is not the case. Due to our first reduction, we can assume that r is of the form , with r1,…, ri elementary quasilinear series. By theorem 1, the set Pk of powers of v in support of each of the series , 1ⱕkⱕi, is a linear set, modulo a finite number of elements. Hence, Pk is of the form , with Ak linear and Bk finite sets of nonnegative integers, for all 1ⱕkⱕi. Let . Since rⱖv*, the set covers the entire set . By lemma 2, can thus be covered also by a disjoint union of linear sets , with , for all 1ⱕkⱕi. From this, one can easily obtain that the series v* can be written as: v*=s0+s1+…+si, with a linear series, for all 1ⱕkⱕi, and a polynomial. We need not be concerned with the polynomial s0 since clearly, by lemma 3(i), t-s0 is a quasisemilinear series, for any quasisemilinear series t such that tⱖs0. So, we can assume that s0=0.
© 2003 Taylor & Francis
On the Difference Problem for Semilinear Power Series
323
Since , we have that , for all 1ⱕkⱕi. To obtain the claim of the theorem, it is enough to prove that is quasisemilinear, where rk is an elementary quasilinear series, say , and sk is of the form . This completes the second reduction. Before computing the difference in this case, we make one last reduction by observing that it is enough to assume that nk=jk=0. To see this, we need only point out that n kⱕj k , since otherwise , which is impossible. Hence, , and thus, we can assume that nk=0. In this hypothesis, we now have:
By lemma 4, there is an integer d such that (rk, vn)>0 if and only if n is a multiple of d. This and the fact that rkⱖsk shows that both jk and dk must be multiples of d. In particular this implies that . We then have:
Hence, we can indeed assume that nk=jk=0. In this case, the difference can be computed as follows:
and, thus, , which is quasisemilinear. The proof of the theorem is now complete. 䊐 We now give an example of how to compute the difference of two semilinear power series. Example 1 Let r=(a2+a3)* and let s=(a2)*+a3(a3)*. We first compute the difference r⬘=r-(a2)* as follows:
Hence, r’=a3 (a2)*(a2+a3)*. We continue as follows:
© 2003 Taylor & Francis
324
I.Petre
Let us denote now r⬙=(a2)*(a2+a3)*, and s⬙=(a3)*. Since the representation of a3 with respect to r⬙ is the vector (0, 0, 1), we have:
The difference r⬙-s⬙ is now:
All that remains to be done is to compute the difference (a2+a3)*-(a3)*, which is:
This proves the rationality of the difference.
4 Conclusions In this paper, we have continued the research initiated in Petre [7] and we have shown once again that the semilinear formal power series in commuting variables behave nicely under basic operations. Namely, we have extended one of Eilenberg’s results for rational series in noncommuting variables, and proved that the difference of two semilinear power series is rational, provided that one of them has bounded coefficients. It is an open problem whether this is semilinear or not, and we conjecture that the answer to this question is negative.
References [1] S.Eilenberg, Automata, Languages and Machines. Academic Press, New York, 1974. [2] S.Eilenberg and M.P.Schützenberger, Rational sets in commutative monoids, Journal of Algebra, 13 (1969), 173–191. [3] S.Ginsburg, The Mathematical Theory of Context-Free Languages. McGrawHill, New York, 1966. [4] W.Kuich, The Kleene and the Parikh theorem in complete semirings. In Automata, Languages and Programming. Springer, Berlin, 1987, 211–215.
© 2003 Taylor & Francis
On the Difference Problem for Semilinear Power Series
325
[5] W.Kuich, Semirings and formal power series: their relevance to formal languages and automata. In G.Rozenberg and A.Salomaa (eds.), Handbook of Formal Languages. Springer, Berlin, 1997, vol. 1, 609–677. [6] W.Kuich and A.Salomaa, Semirings, Automata, Languages. Springer, Berlin, 1986. [7] I.Petre, On semilinearity in formal power series. In Proceedings of Developments in Language Theory, DLT 1999, to appear. [8] A.Salomaa and M.Soittola, Automata-Theoretic Aspects of Formal Power Series. Springer, Berlin, 1978.
© 2003 Taylor & Francis
On Spatial Reasoning via Rough Mereology Lech Polkowski Polish-Japanese Institute of Information Technology and Institute of Mathematics Warsaw University of Technology Poland email:
[email protected] Abstract. The content of this note is a demonstration that Mereotopology, one of the principal ingredients of spatial reasoning toolery, may be developed in the framework of Rough Mereology. We introduce Rough Mereology into Stanislaw Le niewski’s Ontological and Mereological framework and we reveal topological structures which may be defined therefrom. These structures turn out to be closely related to those studied in the Calculus of Individuals [20]. 1 Introduction In this paper, we study some properties of reasoning devices applicable in spatial reasoning and founded on rough mereology. Rough mereology was proposed by Polkowski and Skowron [22], [21], [25] as a paradigm for reasoning under uncertainty. In particular, they discussed logics for synthesizing approximate solutions by distributed systems [21] as well as problems of design, analysis and control in those systems [25]. This theory is rooted in rough set theory proposed by Pawlak [18] and in Stanislaw Le niewski’s mereological theory [15], [27], [26] and may be regarded as an extension of the latter. We underline here this aspect of rough mereology by proposing a
© 2003 Taylor & Francis
328
L.Polkowski
simple and convenient formalization of this theory within Leśniewski's ontological framework [16], [17], [13], [28]. As spatial reasoning is by its very nature concerned with continuous objects, it is convenient to develop it in terms of part-whole relations, i.e. in a mereological framework such as the one indicated above.

Spatial reasoning is an important aspect of reasoning under uncertainty and is related to logical and philosophical investigations into the nature of space and time [5], [23], [29], [30], [31], as well as to research in natural language processing (cf. [2], [3]), geographic information systems (cf. [9]), etc. Although often not mentioned explicitly, important aspects of spatial reasoning are present in the field of mobile robotics under the guise of navigation, obstacle avoidance and other related techniques [1], [8], [10].

Our study of rough mereology as a vehicle for spatial reasoning leads us to mereotopological structures immanent to spatial reasoning. We begin with a formal introduction to Stanisław Leśniewski's Ontology and Mereology, continue with a section on Rough Mereology, and then discuss the mereotopological features of Rough Mereology.

2 Ontology: An Introduction

The theory of part-whole relations may be conveniently presented in an ontological language intended by Stanisław Leśniewski [15] as a formulation of the general principles of being (cf. [26], [11], [28], [13]). This language is an alternative to the standard language of set theory. The only primitive notion of Leśniewski's Ontology is the copula "is", denoted by the symbol ε. The original axiom of Ontology, defining the meaning of ε, is as follows.

2.1 The Ontological Axiom

XεY ⇔ (∃Z)(ZεX) ∧ (∀Z)(∀W)((ZεX ∧ WεX) → ZεW) ∧ (∀Z)(ZεX → ZεY).

In this axiom, the defined copula ε happens to occur on both sides of the equivalence; however, the definiendum XεY belongs to the left side only, and we may perceive the axiom as a definition of the meaning of XεY via the meaning of the "lower level" terms ZεX, ZεY, etc. According to this reading of the axiom, the proposition XεY is true if and only if the conjunction of the following three propositions holds:

(I) (∃Z)(ZεX).

This proposition asserts the existence of an object (name) Z which is X, so that X is not an empty name.
(II) (∀Z)(∀W)((ZεX ∧ WεX) → ZεW).
This proposition asserts that any two objects which are X are each other ('a fortiori', they will be identified later on): X is an individual name (or: X is an individual object, a singleton).

(III) (∀Z)(ZεX → ZεY).

This proposition asserts that every object which is X is also Y (or: X is contained in Y). The meaning of XεY can now be made clear: X is an individual, and this individual is Y. Identity of individual objects is introduced via:

X = Y ⇔ XεY ∧ YεX.
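The Ontological Axiom admits a simple set-theoretic illustration (a sketch of ours, not part of Leśniewski's formalism): interpret every name as the set of atomic objects it denotes, so that an individual name denotes exactly one atom. The following Python fragment, with all names invented for the example, checks XεY by testing the three conjuncts (I)-(III) directly:

    # A finite set-theoretic model of the copula (illustration only):
    # a name is interpreted as the set of atoms it denotes.

    def epsilon(X: frozenset, Y: frozenset) -> bool:
        """X eps Y: X is a non-empty individual name contained in Y."""
        some_is_x = len(X) > 0      # (I)   some object is X
        individual = len(X) <= 1    # (II)  any two objects that are X coincide
        contained = X <= Y          # (III) every object that is X is Y
        return some_is_x and individual and contained

    socrates = frozenset({"socrates"})          # an individual name
    man = frozenset({"socrates", "plato"})      # a general name
    assert epsilon(socrates, man)               # "Socrates is a man"
    assert not epsilon(man, man)                # a general name is not an individual
    assert not epsilon(frozenset(), man)        # the empty name fails (I)

In this model, identity of individuals is simply equality of singleton denotations, in agreement with the definition above.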
The universal name V and the empty name Λ are introduced by:

XεV ⇔ XεX,    XεΛ ⇔ XεX ∧ non(XεX).

3 Mereology: An Introduction

Mereology is a theory of collective classes, unlike Ontology, which is a theory of distributive classes. Mereology may be based on any of a few notions, such as those of a part, an element, a class, etc. Historically, it was conceived by Stanisław Leśniewski as a theory of the relation of being a part, and we follow this line of development. We assume that the copula ε is given and that the Ontological Axiom holds. Under these assumptions, we introduce the notion of the name-forming functor pt of part, subject to the following axioms.

3.1 Mereology Axioms

(A0) XεptY → XεX ∧ YεY.
(A1) (XεptY ∧ YεptZ) → XεptZ.
(A2) non(XεptX).

On the basis of the notion of a part, we define the notion of an element (a possible improper part) as a name-forming functor el.

Definition 1
XεelY ⇔ XεptY ∨ X = Y.

We may now introduce the notion of a (collective) class via a functor Kl.

Definition 2
XεKlY ⇔ XεX ∧ (∃Z)(ZεY) ∧ (∀Z)(ZεY → ZεelX) ∧ (∀Z)(ZεelX → (∃W)(∃T)(WεY ∧ TεelW ∧ TεelZ)).
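Continuing the same finite model (again only an illustration of the axioms, not the paper's formalism), individual objects can be taken as non-empty sets of atoms, part as proper inclusion, element as inclusion, and the class of a non-empty collection as its union; the aggregation reading of Kl is then immediate:

    # Mereology in the finite set model: pt is proper inclusion, so
    # (A1) transitivity and (A2) non(X pt X) hold; Kl is the union.

    from functools import reduce

    def pt(X: frozenset, Y: frozenset) -> bool:
        return X < Y                      # proper part

    def el(X: frozenset, Y: frozenset) -> bool:
        return pt(X, Y) or X == Y         # Definition 1

    def Kl(collection) -> frozenset:
        # Definition 2 in this model: the one object that every member
        # of the collection is an element of, and whose every part
        # overlaps some member -- the union.
        assert collection                 # a class exists only for non-empty names
        return reduce(frozenset.union, collection)

    left, right = frozenset({"a", "b"}), frozenset({"b", "c"})
    whole = Kl([left, right])             # frozenset({'a', 'b', 'c'})
    assert el(left, whole) and el(right, whole)

The remaining postulates on Kl, stated next, hold in this model as well.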
The class functor is subject to additional postulates.

(A4) (XεKlY ∧ ZεKlY) → X = Z.

Hence, KlY is an individual name.

(A5) (∃Z)(ZεY) → (KlY)ε(KlY).

The class operator may thus be regarded as an aggregation operator which turns any non-empty collection of objects (a general name) into an individual object (an individual name); we show below that it may serve as an efficient neighborhood-forming operator.

3.2 Subset, Complement

We define the notions of a subset and a complement, and we look at relations and functions in a mereological context. We first define the notion of a subset as a name-forming functor sub of an individual variable.

Definition 3
XεsubY ⇔ XεX ∧ (∀Z)(ZεelX → ZεelY).
We now define the notion of being external as a binary proposition-forming functor ext of individual variables.

Definition 4
ext(X, Y) ⇔ non((∃Z)(ZεelX ∧ ZεelY)).
The notion of a complement is rendered as a name-forming functor comp of two individual variables.

Definition 5
Zεcomp(X, Y) ⇔ ZεKl(N(X, Y)),

where N(X, Y) is the auxiliary name defined by TεN(X, Y) ⇔ TεelY ∧ ext(T, X).
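In the finite set model the three functors just defined have transparent readings (a sketch under the same illustrative assumptions): sub is inclusion, ext is disjointness, and comp(X, Y) aggregates the parts of Y external to X, i.e. it is the set difference:

    # Definitions 3-5 in the finite set model (illustration only).

    from itertools import chain, combinations
    from functools import reduce

    def el(X, Y):  return X <= Y                         # element = inclusion, as above
    def Kl(coll):  return reduce(frozenset.union, coll)  # class = union, as above
    def sub(X, Y): return X <= Y                         # Definition 3
    def ext(X, Y): return not (X & Y)                    # Definition 4: nothing in common

    def comp(X, Y):
        # Definition 5 in this model: the class of all parts of Y that
        # are external to X; for sets this is just the difference Y - X.
        parts = (frozenset(c)
                 for c in chain.from_iterable(combinations(Y, n)
                                              for n in range(1, len(Y) + 1)))
        return Kl([Z for Z in parts if el(Z, Y) and ext(Z, X)])

    table, top = frozenset({"top", "leg"}), frozenset({"top"})
    assert comp(top, table) == frozenset({"leg"})

Note that in this model sub and el coincide, in accord with the known mereological theorem that the subset functor coincides with the element functor.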
We may now go on to Rough Mereology.

4 Rough Mereology: An Introduction

Rough Mereology is an extension of Mereology based on the predicate of being a part in a degree; this predicate is rendered here as a family of name-forming functors µr, parameterized by a real parameter r in the interval [0, 1], with the intent that XεµrY reads "X is a part of Y in degree at least r". We begin with the set of axioms, and we construct the axiom system as an extension of the systems for Ontology and Mereology. We assume, therefore, that a functor el of an element, satisfying the Mereology axiom system within a given ontology of ε, is given; around it, we develop a system of axioms for Rough Mereology.
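Before the axiom system, a concrete example may help fix intuitions (our illustration; the axioms below do not presuppose it). The standard rough inclusion of rough set theory measures the degree of inclusion of one finite non-empty set in another by the fraction of its elements they share:

    # The standard rough inclusion on finite non-empty sets:
    # mu(X, Y) = |X & Y| / |X|, with "X eps mu_r Y" read as mu(X, Y) >= r.

    def mu(X: frozenset, Y: frozenset) -> float:
        return len(X & Y) / len(X)

    def mu_r(X: frozenset, Y: frozenset, r: float) -> bool:
        return mu(X, Y) >= r              # X is a part of Y in degree at least r

    region = frozenset({"a", "b", "c", "d"})
    room = frozenset({"a", "b", "c"})
    assert mu_r(room, region, 1.0)        # degree 1 is plain inclusion (cf. RM1)
    assert mu_r(region, room, 0.75)       # region is in room in degree 3/4
    assert mu_r(region, room, 0.5)        # hence in any smaller degree (cf. RM4)

Here µ1 coincides with set inclusion, i.e. with el in the set model, as the axiom system below requires.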
4.1 The Axiom System

The following is the list of basic postulates:

(RM1) Xεµ1Y ⇔ XεelY.
(RM2) Xεµ1Y → (∀Z)(ZεµrX → ZεµrY).
(RM3) (X = Y ∧ XεµrZ) → YεµrZ.
(RM4) (XεµrY ∧ s ≤ r) → XεµsY.

It follows that the functor µ1 coincides with the given functor el, and thus establishes a link between Rough Mereology and Mereology, while functors µr with r