Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos New York University, NY, USA Doug Tygar University of California, Berkeley, CA, USA Moshe Y. Vardi Rice University, Houston, TX, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
3302
Wei-Ngan Chin (Ed.)
Programming Languages and Systems Second Asian Symposium, APLAS 2004 Taipei, Taiwan, November 4-6, 2004 Proceedings
13
Volume Editor Wei-Ngan Chin National University of Singapore Department of Computer Science School of Computing 3, Science Drive 2, Singapore 117543 E-mail:
[email protected] Library of Congress Control Number: 2004113831 CR Subject Classification (1998): D.3, D.2, F.3, D.4, D.1, F.4.1 ISSN 0302-9743 ISBN 3-540-23724-0 Springer Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springeronline.com © Springer-Verlag Berlin Heidelberg 2004 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 11341635 06/3142 543210
Foreword
On behalf of the organizing committee I would like to welcome you all to the second Asian Symposium on Programming Languages and Systems (APLAS 2004) held in Taipei on November 4–6, 2004. Since the year 2000, researchers in the area of programming languages and systems have been meeting annually in Asia to present their most recent research results, thus contributing to the advancement of this research area. The last four meetings were held in Singapore (2000), Daejeon (2001), Shanghai (2002), and Beijing (2003). These meetings were very fruitful and provided an excellent venue for the exchange of research ideas, findings and experiences in programming languages and systems. APLAS 2004 is the fifth such meeting and the second one in symposium setting. The first symposium was held in Beijing last year. The success of the APLAS series is the collective result of many people’s contributions. For APLAS 2004, first I would like to thank all the members of the Program Committee, in particular the Program Chair Wei-Ngan Chin, for their hard work in putting together an excellent program. I am most grateful to invited speakers, Joxan Jaffar, Frank Pfenning, and Martin Odersky, who have traveled a long way to deliver their speeches at APLAS 2004. I would like to thank all the referees, who helped review the manuscripts, the authors, who contributed to the proceedings of APLAS 2004, the members of the Organizing Committee, who made considerable effort to organize this event, and all the participants present at this meeting. Without your support this symposium would not have been possible. Finally I would like to acknowledge the support of the Asian Association for Foundation of Software and Academia Sinica, Taiwan. I am sure you will enjoy this meeting, and I hope you will also find time to do some sightseeing in Taipei and take back some fond memories of your visit after this meeting is over.
September 2004
D. T. Lee
Preface
This volume contains the proceedings of the 2nd Asian Symposium on Programming Languages and Systems (APLAS 2004). The symposium was held in Taipei, Taiwan and was sponsored by the Asian Association for Foundation of Software (AAFS) and the Academia Sinica. Following our call for papers, 97 full submissions were received. Almost all the papers were reviewed by three (or more) program committee members with the help of external reviewers. The program committee met electronically over a 10-day period and accepted 26 papers after careful deliberations. I would like to thank members of the APLAS 2004 Program Committee, for the tremendous effort they put into their reviews and deliberations, and all the external reviewers for their invaluable contributions. The final program covered both foundational and practical issues in programming languages and systems. Apart from the 26 accepted papers, the symposium also featured invited talks from three distinguished speakers, Joxan Jaffar (National University of Singapore), Frank Pfenning (Carnegie Mellon Univer´ sity, USA) and Martin Odersky (Ecole Polytechnique F´ed´erale de Lausanne, Switzerland). Many people helped to promote APLAS as a high-quality forum in Asia to serve programming language researchers worldwide. Following a series of wellattended workshops that were held in Singapore (2000), Daejeon (2001), and Shanghai (2002), the first formal symposium was successfully held in Beijing in 2003. The present symposium benefited from the past momentum, and was also due to the contributions of many people. Foremost, I am grateful to the General Chair, D. T. Lee, for his invaluable support and guidance, making our symposium in Taipei possible. I am also indebted to our Local Arrangements Chair, Tyng-Ruey Chuang, for the considerable effort he put into planning and organizing the meeting itself. Hidehiko Masuhara kindly agreed to act as Poster Chair, and Shengchao Qin helped with publicity matters. From the AAFS Committee, I would like to especially thank Atsushi Ohori and Tetsuo Ida for providing sound advice. Last but not least, I thank Florin Craciun for his dedication in handling the CyberChair submissions system and other administrative matters.
September 2004
Wei-Ngan Chin
Organization
General Chair D.T. Lee (Academia Sinica, Taiwan)
Program Chair Wei-Ngan Chin (National University of Singapore)
Program Committee Jifeng He (United Nations University, Macau) Thomas Henzinger (University of California, Berkeley, USA) Yuh-Jzer Joung (National Taiwan University, Taiwan) Gabriele Keller (University of New South Wales, Australia) Jenq-Kuen Lee (National Tsinghua University, Taiwan) Luc Maranget (INRIA, France) Hidehiko Masuhara (University of Tokyo, Japan) Luke Ong (University of Oxford, UK) Tamiya Onodera (IBM Research, Japan) Zongyan Qiu (Peking University, China) Martin Rinard (Massachusetts Institute of Technology, USA) David Sands (Chalmers University of Technology, Sweden) Akihiko Takano (National Institute of Informatics, Japan) Kazunori Ueda (Waseda University, Japan) Chengyong Wu (Chinese Academy of Science, China) Hongwei Xi (Boston University, USA) Kwangkeun Yi (Seoul National University, Korea)
Local Arrangements Chair Tyng-Ruey Chuang (Academia Sinica, Taiwan)
Poster Chair Hidehiko Masuhara (University of Tokyo, Japan)
Publicity Shengchao Qin (National University of Singapore)
X
Organization
External Referees Seika Abe Joonseon Ahn Ki-yung Ahn Wolfgang Ahrendt C. Scott Ananian Stefan Andrei Thomas Arts Martin Berger Manuel Chakravarty Byeong-Mo Chang Rong-Guey Chang Chiyan Chen Chung-Kai Chen Gang Chen Woongsik Choi Tyng-Ruey Chuang Koen Claessen Florin Craciun Huimin Cui Xie Haibin Fritz Henglein Kohei Honda Gwan-Hwang Hwang Chung-Wen Hwang D. Doligez Derek Dreyer Kai Engelhardt Hyunjun Eo Jacques Garrigue Martin Giese
William Greenland Hai-Feng Guo Hwansoo Han Pao-Ann Hsiung Ming-Yu Hung Tatsushi Inagaki Wu Jiajun Jang-Wu Jo Hyun-Goo Kang Paul Kennedy Siau-Cheng Khoo Youil Kim Jaejin Lee Oukseh Lee James Leifer Young-Jia Lin Tao Liu Zhanglin Liu Tom Melham Francois Metayer Oege de Moor Andrzej Murawski Keisuke Nakano Huu Hai Nguyen Susumu Nishimura Jeff Polakow Corneliu Popeea Shengchao Qin Julian Rathke Masahiko Sakai
Sponsoring Institutions Asian Association for Foundation of Software (AAFS) Academia Sinica, Taiwan
Masataka Sassa Sean Seefried Sunae Seo Rui Shi K. Y. Shieh Yuan-Shin Donald Bruce Stewart Eijiro Sumii Josef Svenningsson Munehiro Takimoto Feng Tang C. L. Tang Akihiko Tozawa Yih-Kuen Tsay Jerome Vouillon Ken Wakita Bow-Yaw Wang Jason Wu Dana N. Xu Hongseok Yang Wuu Yang Masahiro Yasugi Handong Ye Yi-Ping You Shoji Yuen Patrick Zadarnowski Naijun Zhan Xiaogang Zhang Dengping Zhu
Table of Contents
Invited Talk A CLP Approach to Modelling Systems Joxan Jaffar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
Session 1 An Algebraic Approach to Bi-directional Updating Shin-Cheng Mu, Zhenjiang Hu, Masato Takeichi . . . . . . . . . . . . . . . . . . .
2
Network Fusion Pascal Fradet, St´ephane Hong Tuan Ha . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
Session 2 Translation of Tree-Processing Programs into Stream-Processing Programs Based on Ordered Linear Type Koichi Kodama, Kohei Suenaga, Naoki Kobayashi . . . . . . . . . . . . . . . . .
41
An Implementation of Subtyping Among Regular Expression Types Kenny Zhuo Ming Lu, Martin Sulzmann . . . . . . . . . . . . . . . . . . . . . . . . . .
57
An Implementation Scheme for XML Transformation Languages Through Derivation of Stream Processors Keisuke Nakano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
74
Session 3 Detecting Software Defects in Telecom Applications Through Lightweight Static Analysis: A War Story Tobias Lindahl, Konstantinos Sagonas . . . . . . . . . . . . . . . . . . . . . . . . . . . .
91
History Effects and Verification Christian Skalka, Scott Smith . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Controlled Declassification Based on Intransitive Noninterference Heiko Mantel, David Sands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Session 4 A Concurrent System of Multi-ported Processes with Causal Dependency Tatsuya Abe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
XII
Table of Contents
Concurrency Combinators for Declarative Synchronization Pawel T. Wojciechowski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 A Uniform Reduction Equivalence for Process Calculi Zining Cao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
Invited Talk Substructural Operational Semantics and Linear Destination-Passing Style Frank Pfenning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Session 5 PType System: A Featherweight Parallelizability Detector Dana Na Xu, Siau-Cheng Khoo, Zhenjiang Hu . . . . . . . . . . . . . . . . . . . . 197 A Type Theory for Krivine-Style Evaluation and Compilation Kwanghoon Choi, Atsushi Ohori . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 Region-Based Memory Management for a Dynamically-Typed Language Akihito Nagata, Naoki Kobayashi, Akinori Yonezawa . . . . . . . . . . . . . . . 229
Session 6 Protocol Specialization Matthias Neubauer, Peter Thiemann . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246 Automatic Generation of Editors for Higher-Order Data Structures Peter Achten, Marko van Eekelen, Rinus Plasmeijer, Arjen van Weelden . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262 A MATLAB-Based Code Generator for Sparse Matrix Computations Hideyuki Kawabata, Mutsumi Suzuki, Toshiaki Kitamura . . . . . . . . . . . . 280
Session 7 D-Fusion: A Distinctive Fusion Calculus Michele Boreale, Maria Grazia Buscemi, Ugo Montanari . . . . . . . . . . . . 296 A Functional Language for Logarithmic Space Peter Møller Neergaard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311 Build, Augment and Destroy, Universally Neil Ghani, Tarmo Uustalu, Varmo Vene . . . . . . . . . . . . . . . . . . . . . . . . . 327 Free Σ-Monoids: A Higher-Order Syntax with Metavariables Makoto Hamana . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Table of Contents
XIII
Invited Talk The Scala Experiment – Can We Provide Better Language Support for Component Systems? Martin Odersky . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Session 8 Pointcuts as Functional Queries Michael Eichberg, Mira Mezini, Klaus Ostermann . . . . . . . . . . . . . . . . . . 366 Formal Design and Verification of Real-Time Embedded Software Pao-Ann Hsiung, Shang-Wei Lin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Session 9 McJava – A Design and Implementation of Java with Mixin-Types Tetsuo Kamina, Tetsuo Tamai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398 A Relational Model for Object-Oriented Designs He Jifeng, Zhiming Liu, Xiaoshan Li, Shengchao Qin . . . . . . . . . . . . . . 415 Exploiting Java Objects Behavior for Memory Management and Optimizations Zoe C.H. Yu, Francis C.M. Lau, Cho-Li Wang . . . . . . . . . . . . . . . . . . . . 437
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
A CLP Approach to Modelling Systems Joxan Jaffar School of Computing, National University of Singapore, Republic of Singapore 117543
[email protected] We present a formal method for modelling the operational behavior of various kinds of systems of concurrent processes. A first objective is that the method be broadly applicable. A system can be described in terms of its processes written in a traditional syntax-based manner, or in some non-traditional form such as a timed automaton. The number of processes may be fixed, or parameterized, or, because of dynamic process creation, unbounded. The communication and synchronization between processes may be synchronous or not, and via shared variables or some form of channels. We may have a traditional interleaving of processes, or use a specific scheduling strategy. The observables modelled should not be restricted to just the values of the program variables, but possibly other attributes of the system such as its registers and cache, its clock and battery values, etc. An example application area which touches upon these characteristics is that of determining worst-case execution time. We choose to model a generic system S in the form of a CLP program P . The model-theoretic semantics of P shall characterize the “collecting” semantics of S, that is, those states that are observable. The proof-theoretic semantics of P , on the other hand, further characterize the “trace” semantics of S. An advantage of this CLP approach is that intricate details of the system can be captured in a familiar logical framework. We then present a specification language for an extensive class of system behaviors. In addition to the traditional safety and liveness properties which specify the universality or eventuality of certain predicates on states, we introduce the notions of relative safety and relative progress. The former extends traditional safety assertions to accommodate non-behavioral properties such as symmetry, serializability and commutativity between processes. The latter provides for specifying progress properties. Our specification method is not just for stating the property of interest, but also for the assertion of properties held at various program points. Finally, we present an inference method, based upon a notion of inductive tabling, for proving an assertion A. This method can use assertions that have already been proven, use the assertion A itself, in a manner prescribed by induction principles, and dynamically generate new assertions. All these properties are shown to be useful in preventing redundant computations, which then can lead to efficient proofs. Our proof method thus combines the search characteristic of model-checking and abstract interpretation, and methods of inductive assertions. We demonstrate a prototype implementation on some benchmark examples.
Joint work with Andrew Santosa and R˘ azvan Voicu.
W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, p. 1, 2004. c Springer-Verlag Berlin Heidelberg 2004
An Algebraic Approach to Bi-directional Updating Shin-Cheng Mu, Zhenjiang Hu, and Masato Takeichi Department of Information Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan {scm,hu,takeichi}@ipl.t.u-tokyo.ac.jp Abstract. In many occasions would one encounter the task of maintaining the consistency of two pieces of structured data that are related by some transform — synchronising bookmarks in different web browsers, the source and the view in an editor, or views in databases, to name a few. This paper proposes a formal model of such tasks, basing on a programming language allowing injective functions only. The programmer designs the transformation as if she is writing a functional program, while the synchronisation behaviour is automatically derived by algebraic reasoning. The main advantage is being able to deal with duplication and structural changes. The result will be integrated to our structure XML editor in the Programmable Structured Document project.
1
Introduction
In many occasions would one encounter the task of maintaining consistency of two pieces of structured data that are related by some transform. In some XML editors, for example [3, 15], a source XML document is transformed to a userfriendly, editable view through a transform defined by the document designer. The editing performed by the user on the view needs to be reflected back to the source document. Similar techniques can also be used to synchronise several bookmarks stored in formats of different browsers, to maintain invariance among widgets in an user interface, or to maintain the consistency of data and view in databases. As a canonical example, consider the XML document in Figure 1(a) representing an article. When being displayed to the user, it might be converted to an HTML document as in Figure 1(b), with an additional table of contents. The conversion is defined by the document designer in some domain-specific programming language. We would then wish that when the user, for example, adds or deletes a section in (b), the original document in (a) be updated correspondingly. Further more, the changes should also trigger an update of the table of contents in (a). We may even wish that when an additional section title is added to the table of contents, a fresh, empty section will be added to the article bodies in both (a) and (b). All these are better done without too much effort, other than specifying the transform itself, from the document designer. W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 2–20, 2004. c Springer-Verlag Berlin Heidelberg 2004
An Algebraic Approach to Bi-directional Updating
3
View-updating [5, 7, 10, 14, 1] has been intensively studied in the database community. Recently, the problem of maintaining the consistency of two pieces of structured data was brought to our attention again by [12] and [11]. Though developed separately, their results turn out to be surprisingly similar, with two important features missing. Firstly, it was assumed that the transform is total and surjective, which ruled out those transforms that duplicate data. Secondly, structural changes, such as inserting to or deleting from a list or a tree, were not sufficiently dealt with. <article> Program inversion Program inversion <section>
Our first effort Our first effort
Our second effort ...
<section> Our second effort Our first effort
...
...
Our second effort
...
(a)
(b)
Fig. 1. An XML article and its HTML view with a table of contents
In this paper we will address these difficulties using a different approach inspired by previous studies of program inversion [2, 8]. We extend the injective functional language designed in [13], in which only injective functions are definable and therefore every program is trivially invertible. The document designer specifies the transform as if she were defining an injective function from the source to the view. A special operator for duplication specifies all element-wise dependency. To deal with inconsistencies resulting from editing, however, we define an alternative semantics, under which the behaviour of programs can be reasoned by algebraic rules. It will be a good application of program inversion [8] and algebraic reasoning, and the result will soon be integrated into our XML editor in the Programmable Structured Document project [15]. In Section 2 we give a brief introduction of the injective functional language in which the transforms are specified, and demonstrate the view-updating problem more concretely. An alternative semantics of the language is presented in Section 3, where we show, by algebraic reasoning, how to solve the view-updating problem. Section 4 shows some more useful transform, before we conclude in Section 5.
4
2
S.-C. Mu, Z. Hu, and M. Takeichi
An Injective Language for Bi-directional Updating
Assume that a relation X , specifying the relationship between the source and the view, is given. In [11], the updating behaviour of the editor is modelled by two functions getX :: S → V and putX :: (S × V ) → S . The function getX transforms the source to the view. The function putX , on the other hand, returns an updated source. It needs both the edited view and the original source, because some information might have been thrown away. For example, if the source is a pair and getX simply extracts the first component, the second component is lost. The cached original source is also used for determining which value is changed by the user. A more symmetrical approach was taken in [12], where both functions take two arguments. The relation X is required to be bi-total (total and surjective), which implies that duplicating data, which would make the relation non-surjective, is not allowed. In this paper we will explore a different approach. We make getX :: S → V and putX :: V → S take one argument only, and the transform has got to be injective — we shall lose no information in the source to view transform. A point-free language allowing only injective functions has been developed in [13] with this as one of the target applications. Duplication is an important primitive of the language. Restricting ourselves to injective functions may seem like a severe limitation, but this is not true. In [13], it was shown that for all possibly non-injective functions f :: A → B , we can automatically derive an injective function f :: A → (B , H ) where H records book-keeping information necessary for inversion. The extra information can be hidden from the user (for example, by setting the CSS visibility if the output is HTML). In fact, one can always make a function injective by copying the input to the output, if duplication is allowed. Therefore, the key extension here is duplication, while switching to injective functions is merely a change of presentation – rather than separating the original source and the edited view as two inputs to putX , we move the burden of information preserving to X . This change, however, allows putX itself to be simpler, while making it much easier to expose expose its properties, limitation, and possibly ways to overcome the limitation. In this section we will introduce the language, Inv with some examples, and review the view-updating problem in our context. Extensions to the language and its semantics to deal with the view-updating problem will be discussed in Section 3. Some readers may consider the use of a point-free language as “not practical”. We will postpone our defend to Section 5. 2.1
Views
The View datatype defines the basic types of data we deal with. View ::= Int | String | () | (View × View ) | L View | R View | List View | Tree View List a ::= [ ] | a : List a Tree a ::= Node a (List (Tree a))
An Algebraic Approach to Bi-directional Updating
5
The atomic types include integer, string, and unit, the type having only one value (). Composite types include pairs, sum (L View and R View ), lists and rose trees. The (:) operator, forming lists, associates to the right. We also follow the common convention writing the list 1 : 2 : 3 : [ ] as [1, 2, 3]. More extensions dealing with editing will be discussed later. For XML processing we can think of XML documents as rose trees represented by the type Tree. This very simplified view omits features of XML which will be our future work. In fact, for the rest of this paper we will be mostly talking about lists, since the techniques can easily be generalised to trees. 2.2
An Injective Language Inv
The syntax of the language Inv is defined as below. We abuse the notation a bit by using XV to denote the union of X and the set of variable names V . The ∗ operator denotes “a possibly empty sequence of”. X ::= X ˘ | nil | zero | C | δ | dup P | cmp B | inl | inr | X ; X | id | X ∪ X | X × X | assocr | assocl | swap | µ(V : XV ) C ::= succ | cons | node B ::= < | ≤ | = | ≥ | > P ::= nil | zero | str String | (S ; )∗ id S ::= C ˘ | fst | snd The semantics of each Inv construct is given in Figure 2. A relation of type A → B is a set of pairs whose first components have type A and second components type B , while a function1 is one such that a value in A is mapped to at most one value in B . A function is injective if all values in B are mapped to at most one value in A as well. The semantics of every Inv program is an injective function from View to View . That is, the semantics function [[ ]] has type Inv → View → View . For example, nil is interpreted as a constant function always returning the empty list, while zero always returns zero. Their domain is restricted to the unit type, to preserve injectivity. The function id is the identity function, the unit of composition. The semicolon (;) is overloaded both as functional composition and as an Inv construct. It is defined by (f ; g) a = g (f a). Union of functions is simply defined as set union. To avoid non-determinism, however, we require in f ∪ g that f and g have disjoint domains. To ensure injectivity, we require that they have disjoint ranges as well. The domain of a function f :: A → B , written dom f , is the partial function (and a set) {(a, a) ∈ A | ∃b ∈ B :: (a, b) ∈ f }. The range of f , written ran f , is defined symmetrically. The product (f × g) is a function taking a pair and applying f and g to the two components respectively. We make composition bind tighter than product. Therefore (f ; g × h) means ((f ; g) × h). 1
For convenience, we refer to possibly partial functions when we say “functions”.
6
S.-C. Mu, Z. Hu, and M. Takeichi
[[nil ]] () [[zero]] () [[succ]] n [[cons]] (a, x ) [[node]] (a, x ) [[inl ]] a [[inr ]] a [[id ]] a
= = = = = = = =
[] 0 n +1 a: x Node a x La Ra a
[[swap]] (a, b) = (b, a) [[assocr ]] ((a, b), c) = (a, (b, c))
[[assocl ]] (a, (b, c)) = ((a, b), c) [[cmp ]] (a, b) = (a, b), if a b [[δ]] a = (a, a) [[f ; g]] x = [[g]] ([[f ]] x ) [[f × g]] (a, b) = ([[f ]] a, [[g]] b) [[f ∪ g]] = [[f ]] ∪ [[g]], if dom f ∩ dom g = ran f ∩ ran g = ∅ [[f ˘]] = [[f ]]◦ [[µF ]] = [[F µF ]]
Fig. 2. Functional semantics of Inv constructs apart from dup
The fixed-point of F , a function from Inv expressions to Inv expressions, is denoted by µF . We will be using the notation (X : expr ) to denote a function taking an argument X and returning expr . The converse of a relation R is defined by (b, a) ∈ R ◦ ≡ (a, b) ∈ R The reverse operator ˘ corresponds to converses on relations. Since all functions here are injective, their converses are functions too. The reverse of cons, for example, decomposes a non-empty list into the head and the tail. The reverse of nil matches only the empty list and maps it to the unit value. The reverse operator distributes into composition, products and union by the following rules, ◦ all implied by the semantics definition [[f ˘]] = [[f ]] : [[(f ; g)˘]] = [[g˘]]; [[f ˘]] [[(f × g)˘]] = [[(f ˘ × g˘)]] [[(f ∪ g)˘]] = [[f ˘]] ∪ [[g˘]]
[[f ˘˘]] = [[f ]] [[(µF )˘]] = [[µ(X : (F X ˘)˘)]]
The δ operator is worth our attention. It generates an extra copy of its argument. Written as a set comprehension, we have δA = {(n, (n, n)) | n ∈ A}, where A is the type δ gets instantiated to. We restrict A to atomic types (integers, strings, and unit) only, and from now on use variable n and m to denote values of atomic types. To duplicate a list, we can always use map δ; unzip, where map and unzip are to be introduced in the sections to come. Taking its reverse, we get: δA ˘ = {((n, n), n) | n ∈ A} That is, δ˘ takes a pair and lets it go through only if the two components are equal. That explains the observation in [8] that to “undo” a duplication, we have to perform an equality test. In many occasions we may want to duplicate not all but some sub-component of the input. For convenience, we include another Inv construct dup which takes a sequence of “labels” and duplicates the selected sub-component. The label is
An Algebraic Approach to Bi-directional Updating
7
either fst, snd , cons˘, and node˘. Informally, think of the sequence of labels as the composition of selector functions (fst and snd ) or deconstructors, and dup can be understood as: [[dup f ]] x = (x , [[f ]] x ) If we invert it, (dup f )˘ becomes a partial function taking a pair (x , n), and returns x unchanged if f x equals n. The second component n can be safely dropped because we know its value already. We write (dup f )˘ as eq f . For example, dup (fst; snd ) ((a, n), b) yields (((a, n), b), n), while eq (fst; snd ) (((a, n), b), m) returns ((a, n), b) if n = m. Formally, dup is defined as a function taking a list of function names and returns a function: dup id =δ dup (fst; P ) = (dup P × id ); subl dup (snd ; P ) = (id × dup P ); assocl dup (cons˘; P ) = cons˘; dup P ; (cons × id ) dup (node˘; P ) = node˘; dup P ; (node × id ) Here, [[subl ]] ((a, b), c) = ((a, c), b), whose formal definition is given in Section 2.3. Another functionality of dup is to introduce constants. The original input is kept unchanged but paired with a new constant: [[dup nil ]] a = (a, [ ]) [[dup zero]] a = (a, 0) [[dup (str s)]] a = (a, s) Their reverses eliminates a constant whose value is known. In both directions we lose no information. The cmp construct takes a pair of values, and let them go through only if they satisfy one of the five binary predicates given by non-terminal B . 2.3
Programming Examples in Inv
All functions that move around the components in a pair can be defined in terms of products, assocr , assocl , and swap. We find the following functions useful: subr = assocl ; (swap × id ); assocr subl = assocr ; (id × swap); assocl trans = assocr ; (id × subr ); assocl Their semantics, after expanding the definition, is given below: [[subr ]] (a, (b, c)) = (b, (a, c)) [[subl ]] ((a, b), c) = ((a, c), b) [[trans]] ((a, b), (c, d )) = ((a, c), (b, d ))
8
S.-C. Mu, Z. Hu, and M. Takeichi mktoc h1 cont extract enlist body
= = = = = =
denode article; cons˘; (h1 × cont); cons; ennode html denode title; ennode h1 extract; (enlist × body); cons map (denode section; cons˘; (denode title × id ); dup fst; swap); unzip map (ennode li); ennode ol map ((ennode h3 × id ); cons; ennode div)
denode s = node˘; swap; eq (str s) ennode s = (denode s)˘ Fig. 3. An Inv program performing the transform from Figure 1(a) to Figure 1(b). String constants are written in typewriter font
Many list-processing functions can be defined recursively on the list. The function map applies a function to all elements of a list; the function unzip takes a list of pairs and splits it into a pair of lists. They can be defined by: map f = µ(X : nil ˘; nil ∪ cons˘; (f × X ); cons) unzip = µ(X : nil ˘; δ; (nil × nil ) ∪ cons˘; (id × X ); trans; (cons × cons)) This is what one would expect when we write down their usual definition in a point-free style. The branches starting with nil ˘ are the base cases, matching empty lists, while cons˘ matches non-empty lists. It is also provable from the semantics that (map f )˘ = map f ˘. The function merge takes a pair of sorted lists and merges them into one. However, by doing so we lose information necessary to split them back to the original pair. Therefore, we tag the elements in the merged list with labels indicating where they were from. For example, merge ([1, 4, 7], [2, 5, 6]) = [L 1, R 2, L 4, R 5, R 6, L 7]. It can be defined in Inv as below: merge = µ(X : eq nil ; map inl ∪ swap; eq nil ; map inr ∪ (cons˘ × cons˘); trans; ((leq × id ); assocr ; (id × subr ; (id × cons); X ); (inl × id ) ∪ ((gt; swap) × id ); assocr ; (id × assocl ; (cons × id ); X ); (inr × id )); cons)
where leq = cmp (≤) and gt = cmp (>). As a final example, the program in Figure 3 performs the transform from Figure 1(a) to Figure 1(b). It demonstrates the use of map, unzip and dup. For brevity, the suffixing id in dup (fst; id ) will be omitted. 2.4
The View-Updating Problem
Now consider the scenario of an editor, where a source document is transformed, via an Inv program, to a view editable by the user. Consider the transform toc = map (dup fst); unzip, we have:
An Algebraic Approach to Bi-directional Updating
9
toc [(1, “a”), (2, “b”), (3, “c”)] = ([(1, “a”), (2, “b”), (3, “c”)], [1, 2, 3]) Think of each pair as a section and the numbers as their titles, the function toc is a simplified version of the generation of a table of contents, thus the name. Through a special interface, there are several things the user can do: change the value of a node, insert a new node, or delete a node. Assume that the user changes the value 3 in the “table of contents” to 4: ([(1, “a”), (2, “b”), (3, “c”)], [1, 2, 4]) Now we try to perform the transformation backwards. Applying the reverse operator to toc, we get (map (dup fst); unzip)˘ = unzip˘; map (eq fst). Applying it to the modified view, unzip˘ maps the modified view to: [((1, “a”), 1), ((2, “b”), 2), ((3, “c”), 4)] pairing the sections and the titles together, to be processed by map (eq fst). However, ((3, “c”), 4) is not in the domain of eq fst because the equality check fails. We wish that eq fst would return (4, “c”) in this case, answering the user’s wish to change the section title. Now assume that the user inserts a new section title in the table of contents: ([(1, “a”), (2, “b”), (3, “c”)], [1, 2, 4, 3]) This time the changed view cannot even pass unzip˘, because the two lists have different lengths. We wish that unzip˘ would somehow know that the two 3’s should go together and the zipped list should be [((1, “a”), 1), ((2, “b”), 2), (⊥, 4), ((3, “c”), 3)] where ⊥ denotes some unconstrained value, which would be further constrained by map (dup fst) to (4, ⊥). The Inv construct eq fst should also recognise ⊥ and deal with it accordingly. In short, we allow the programmer to write Inv transforms that are not surjective. Therefore it is very likely that a view modified by the user may fall out of the range of the transform. This is in contrast of the approach taken in [12] and [11]. The two problems we discussed just now are representative of the view-updating problem. There are basically two kinds of dependency we have to deal with: element-wise dependency, stating that two pieces of primary-typed data have the same value, and structural dependency, stating that two pieces of data have the same shape. One possible solution is to provide an alternative semantics that extends the ranges of Inv constructs in a reasonable way, so that the unmodified, or barely modified programs can deal with the changes. We will discuss this in detail in the next section.
10
3
S.-C. Mu, Z. Hu, and M. Takeichi
Alternative Semantics
We will need some labels in the view, indicating “this part has been modified by the user.” We extend the View data type as described below: View ::= . . . | ∗Int | ∗String List a ::= . . . | a ⊕ List a | a List a Here the ∗ mark applies to atomic types only, indicating that the value has been changed. The view a⊕x denotes a list a: x whose head a was freshly inserted by the user, while a x denotes a list x which used to have a head a but was deleted. The deleted value a is still cached for future use. The two operators associate to the right, like the cons operator (:). A similar set of operators can be introduced for Tree but they are out of the scope of this paper. The original semantics of each Inv program is an injective function. When the tags are involved, however, we lost the injectivity. Multiple views may be mapped to the same source. For example, the value 1 is mapped to (1, 1) by δ. In the reverse direction, (n, ∗1) and (∗1, n), for all numerical n, are all mapped to 1. Similarly, all these views are mapped back to [1, 2, 3] when the transform is map succ: [2, 3, 4], a [2, 3, 4], 2 : a [3, 4], 2 ⊕ [3, 4], 2 : 3 ⊕ [4] for all a. We define two auxiliary functions notag? and ridtag. The former is a partial function letting through the input view unchanged if it contains no tags. The latter gets rid of the tags in a view, producing a normal form. Their definitions are trivial and omitted. The behaviour of the editor, on the other hand, is specified using two functions getX and putX , both parameterised by an Inv program X: getX = notag?; [[X ]] ˙ [[X ˘]]; ridtag putX ⊆ The function getX maps the source to the view by calling X . The function putX , on the other hand, maps the (possibly edited) view back to the document ˙ denotes by letting it go though X ˘ and removing the tags in the result. Here ⊆ ˙ “functional refinement”, defined by f ⊆g if and only if f ⊆ g and dom f = dom g. In general [[X ˘]]; ridtag is not a function since [[X ˘]] may leave some values unspecified. However, any functional refinement of [[X ˘]]; ridtag would satisfy the properties we want. The implementation can therefore, for example, choose an “initial value” for each unspecified value according to its type. The initial view is obtained by a call to getX . When the user performs some editing, the editor applies putX to the view, obtaining a new source, before generating a new view by calling getX again. In the original semantics of Inv, the ˘ operator is simply relational converse. In the extended semantics, the ˘ operator deviates from relational converse for three constructs: δ, cons and sync, to be introduced later. For other cases we still ◦ have [[f ˘]] = [[f ]] . The distributivity rules of ˘ given in Section 2.2 are still true. In the next few sub-sections we will introduce extensions to the original semantics in the running text. A summary of the resulting semantics will be given in the end of Section 3.2.
An Algebraic Approach to Bi-directional Updating
3.1
11
Generalised Equality Test
The simple semantics of δA˘, where A is an atomic type, is given by the set {((n, n), n) | n ∈ A}. To deal with editing, we generalise its semantics to: [[δ˘]] (n, n) = n [[δ˘]] (∗n, m) = ∗n
[[δ˘]] (∗n, ∗n) = ∗n [[δ˘]] (m, ∗n) = n
When the two values are not the same but one of them was edited by the user, the edited one gets precedence and goes through. Therefore (∗n, m) is mapped to ∗n. If both values are edited, however, they still have to be the same. Note that the semantics of δ does not change. Also, we are still restricted to atomic types. One will have to call map δ; unzip to duplicate a list, thereby separate the value and structural dependency. The syntax of dup can be extended to allow, a possibly non-injective function. The results of the non-injective function, and those derive from them, are supposed to be non-editable. It is a useful functionality but we will not go into its details. 3.2
Insertion and Deletion
Recall unzip defined in Section 2.3. Its reverse, according to the distributivity of ˘, is given by: unzip˘ = µ(X : (nil ˘ × nil ˘); δ˘; nil ∪ (cons˘ × cons˘); trans; (id × X ); cons) The puzzle is: how to make it work correctly with the presence of and ⊕ tags? We introduce several new additional operators and types: – two new Inv operators, del and ins, both parameterised by a view. The function del a :: [A] → [A] introduces an (a ) tag, while ins a :: [A] → [A] introduces an (a ⊕ ) tag. – two kinds of pairs in View : positive (a, b)+ and negative (a, b)- . They are merely pairs with an additional label. They can be introduced only by the reverse of fstb± and snda± functions to be introduced below. The intention is to use them to denote pairs whose components are temporary left there for some reason. – six families of functions fsta2 and snda2 , where 2 can be either +, −, or nothing, defined by fstb2 (a, b)2 = a snda2 (a, b)2 = b That is, fstb+ eliminates the second component of a positive pair only if it equals b. Otherwise it fails. Similarly, snda eliminates the first component of an ordinary pair only of it equals a. When interacting with existing operators, they should satisfy the algebraic rules in Figure 4. In order to shorten the presentation, we use 2 to match +, − and nothing, while ± matches only + and −. The 2 and ± in the same rule must match the same symbol.
12
S.-C. Mu, Z. Hu, and M. Takeichi
With the new operators and types, an extended unzip capable of dealing with deletion can be extended from the original unzip by (here “. . .” denotes the original two branches of unzip): unzip˘ = µ(X : . . . ∀a, b· ((ins a)˘ × (ins b)˘); X ; ins (a, b) ∪ ((ins a)˘ × isList); X ; ins (a, b) ∪ (isList × (ins b)˘); X ; ins (a, b) ∪ ((del a)˘ × (del b)˘); X ; del (a, b) ∪ ((del a)˘ × cons˘; sndb- ); X ; del (a, b) ∪ (cons˘; snda- × (del b)˘); X ; del (a, b)) where a and b are universally quantified, and isList = nil ˘; nil ∪ cons˘; cons, a subset of id letting through only lists having no tag at the head. Look at the branch starting with ((ins a)˘ × (ins b)˘). It says that, given a pair of lists both starting with insertion tags a ⊕ and b ⊕, we should deconstruct them, pass the tails of the lists to the recursive call, and put back an ((a, b) ⊕ ) tag. If only the first of them is tagged (matching the branch starting with ((ins a)˘ × isList)), we temporarily remove the a ⊕ tag, recursively process the lists, and put back a tag ((a, b) ⊕ ) with a freshly generated b. The choice of b is non-deterministic and might be further constrained when unzip is further composed with other relations. The situation is similar with deletion. In the ◦ branch starting with (del a × sndb+ ; cons) where we encounter a list with an a deleted by the user, we remove an element in the other list and remember its value in b. Here universally quantified b is used to match the value — all the branches with different b’s are unioned together, with only one of them resulting in a successful match. It would be very tedious if the programmer had to explicitly write down these extra branches for all functions. Luckily, these additional branches can be derived automatically using the rules in Figure 4. In the derivations later we will omit the semantics function [[ ]] and use the same notation for the language and its semantics, where no confusion would occur. This is merely for the sake of brevity. In place of ordinary cons, we define two constructs addressing the dependency of structures. Firstly, the bold cons is defined by:: cons = cons ∪ a::A (snda- ; del a) ∪ a::A (snda+ ; ins a) 2 2 (f × g); fst(g b) = fstb ; f , if g total
(f
× g); snd(f2 a) swap; snda2 snda2 ˘; eq nil
= =
snda2 ; g, fsta2
if f total
assocl ; (fstb2 × id ) = (id × sndb2 ) assocl ; (snda2 × id ) = (snda2 ∪ snda ) 2 = snda2 ; (sndb2 ∪ sndb ) assocl ; snd(a,b)
= (λ [ ] → a)
Fig. 4. Algebraic rules. Here (λ [ ] → a) is a function mapping only empty list to a. Only rules for assocl are listed. The rules for assocr can be obtained by pre-composing assocr to both sides and use asscor ; assocl = id . Free identifiers are universally quantified
An Algebraic Approach to Bi-directional Updating
13
Secondly, we define the following sync operator: sync = (cons × cons) sync˘ = (cons˘ × cons˘) ∪ a,b∈A (((del a)˘; snda- ˘ × (del b)˘; sndb- ˘) ∪ ((del a)˘; snda- ˘ × cons˘; sndb ; sndb- ˘) ∪ (cons˘; snda ; snda- ˘ × (del b)˘; sndb- ˘)) ∪ a,b∈A (((ins a)˘; snda+ ˘ × (ins b)˘; sndb+ ˘) ∪ ((ins a)˘; snda+ ˘ × isList; sndb+ ˘) ∪ (isList; sndb+ ˘ × (ins b)˘; sndb+ ˘)) In the definition of unzip, we replace every singular occurence of cons with cons, and every (cons × cons) with sync. The definition of sync˘ looks very complicated but we will shortly see its use in the derivation. Basically every product corresponds to one case we want to deal with: when both the lists are cons lists, when one or both of them has a tag, or when one or both of them has a ⊕ tag. After the substitution, all the branches can be derived by algebraic reasoning. The rules we need are listed in Figure 4. To derive the first branch for insertion, for example, we reason: unzip˘ ⊇ {fixed-point} sync˘; trans; (id × unzip); cons ⊇ ⊇
{since sync˘ ⊇ ((ins a)˘; snda+ ˘ × (ins b)˘; sndb+ ˘) for all a, b} ((ins a)˘ × (ins b)˘); (snda+ ˘ × (ins b)˘); trans; (id × unzip); cons + {claim: (snda+ ˘ × sndb+ ˘); trans ⊇ (snd(a,b) )˘} + )˘; (id × unzip); cons ((ins a)˘ × (ins b)˘); (snd(a,b)
=
{since (f × g); sndf+a = snda+ ; g for total f } + )˘; cons ((ins a)˘ × (ins b)˘); unzip; (snd(a,b)
⊇
+ ; ins (a, b)} {since cons ⊇ snd(a,b) + + )˘; snd(a,b) ; ins (a, b) ((ins a)˘ × (ins b)˘); unzip; (snd(a,b)
=
{since sndx+ ˘; sndx+ = id } ((ins a)˘ × (ins b)˘); unzip; ins (a, b)
2 We get the first branch. The claim that trans˘; (snda2 × sndb2 ) = snd(a,b) can be verified by the rules in Figure 4 and is left as an exercise. The introduction of two kinds of pairs was to avoid the suffix being reduced to (del (a, b))˘ in the last two steps. To derive one of the branches for deletion, on the other hand, one uses the inclusion sync˘ ⊇ ((del a)˘; snda- ˘ × cons˘; sndb ; sndb- ˘) for the first step, and cons ⊇ snd(a,b) ; del (a, b) and (snd(a,b) )˘; snd(a,b) = id fort the last step. All the branches can be derived in a similar fashion.
14
S.-C. Mu, Z. Hu, and M. Takeichi [[nil ]] () [[zero]] () [[succ]] n [[cons]] (a, x ) [[node]] (a, x ) [[inl ]] a [[inr ]] a [[id ]] a
= = = = = = = =
[] 0 n +1 a: x Node a x La Ra a
[[cmp ]] (a, b)2 = (a, b)2 , if a b [[f ; g]] x = [[g]] ([[f ]] x ) [[f × g]] (a, b)2 = ([[f ]] a, [[g]] b)2 [[f ∪ g]] = [[f ]] ∪ [[g]], if dom f ∩ dom g = ran f ∩ ran g = ∅ [[µF ]] = [[F µF ]] [[f ˘]] [[f ; g˘]] [[(f × g)˘]] [[(f ∪ g)˘]] [[µF ˘]]
= = = = =
[[f ]]◦ [[g˘]]; [[f ˘]] [[(f ˘ × g˘)]] [[f ˘]] ∪ [[g˘]] [[µ(X → (F X ˘)˘]]
[[swap]] (a,b)2 = (b,a)2 [[assocr ]] ((a,b)± , c)±= (a,(b,c)± )± [[assocr ]] ((a,b)± , c) = (a,(b,c)± ) [[assocr ]] ((a,b),c)± = (a,(b,c))± 2 [[fst 2 assocl = assocr ˘ =b a ]] (a, b) 2 [[snd 2 ]] (a, b) =a b (f ˘)˘ = f [[del a]] (a x ) = (a, x )[[ins a]] (a ⊕ x ) = (a, x )+ [[δ]] n = (n, n) [[δ˘]] (n, n)2 = n dup id =δ [[δ˘]] (∗n, ∗n)2 = ∗n dup (fst; P ) = (dup P × id ); subl dup (snd ; P ) = (id × dup P ); assocl [[δ˘]] (∗n, m)2 = ∗n dup (cons˘; P ) = cons˘; dup P ; (cons × id ) [[δ˘]] (m, ∗n)2 = ∗n dup (node˘; P ) = node˘; dup P ; (node × id ) [[dup nil ]] a = (a, [ ]) sync = (cons × cons) [[(dup nil )˘]] (a, [ ])2 =a sync˘ [[dup zero]] a = (a, 0) = (cons˘ × cons˘) ∪ a,b∈A (((del a)˘; snda- ˘ × (del b)˘; sndb- ˘) [[(dup zer 0)˘]] (a, 0)2 = a [[dup (str s)]] a = (a, s) ∪ ((del a)˘; snda- ˘ × cons˘; sndb ; sndb- ˘) 2 [[(dup (str s))˘]] (a, s) = a a ˘×(del b)˘; sndb ˘)) ∪ (cons˘; snda ; snd + + ∪ a,b∈A (((ins a)˘; snda ˘ × (ins b)˘; sndb ˘) cons = cons ∪ ((ins a)˘; snda+ ˘ × isList; sndb+ ˘) ∪ a::A (snda- ; del a) ∪ (isList; sndb+ ˘ × (ins b)˘; sndb+ ˘)) ∪ a::A (snda+ ; ins a) Fig. 5. Summary of the alternative semantics. The patterns should be matched from the top-left to bottom-left, then top-right to bottom-right
3.3
The Put-Get-Put Property and Galois Connection
A valid Inv program is one that does not use fsta2 and sndb2 apart from in cons and sync. The domain of getX , for a valid X , is restricted to tag-free views, so is its range. In fact, notag?; [[X ]] reduces to the injective function defined by the original semantics. Therefore, getX ; getX ◦ = dom getX . Furthermore, notag?; ridtag = notag?. As a result, for all valid Inv programs X we have the following get-put property: getX ; putX = dom getX
(1)
An Algebraic Approach to Bi-directional Updating
15
This is a desired property for our editor: mapping an unedited view back to the source always gives us the same source document. On the other hand, putX ; getX ⊆ id is not true. For example, (putδ ; getδ ) (∗a,b) = (a, a) = (∗a, b). This is one of the main differences between our work and that of [12] and [11]. They both assume the relation X to be bi-total, and that the put-get property putX ; getX = id holds. It also implies that duplication cannot be allowed in the language. Instead, we have a weaker property. First of all, for all valid X we have dom getX ⊆ ran putX . That is, every valid source input to getX must be a result of putX for at least one view, namely, the view the source get mapped to under the original semantics. Pre-composing put X to (1) and use putX ; dom getX ⊆ putX ; ran putX = putX , we get the following put-get-put property: putX ; getX ; putX ⊆ putX
(2)
When the user edits the view, the editor calls the function put X to calculate an updated source, and then calls getX to update the view as well. For example, (∗a, b) is changed to (a, a) after putδ ; getδ . With the put-get-put property we know that another putX is not necessary, because it is not going to change the view — the result of putX ; getX ; putX , if anything, is the same as that of putX . It is desirable to have putX ; getX ; putX = putX . However, this is not true, and dom getX = ran putX . For a counter-example, take X = (δ × id ); assocr ; (id × δ). The function getX takes only pairs with equal components and returns it unchanged. Applying putX to (∗b, a) results in (b, a), which is not in the domain of getX . Such a result is theoretically not satisfactory, but does not cause a problem for our application. The editor can signal an error to the user, saying that such a modification is not allowed, when the new source is not in the domain of getX . The domain check is not an extra burden since we have to call getX anyway. A Galois connection is a pair of functions f :: A → B and g :: B → A satisfying f x y ≡x gy
(3)
Galois connected functions satisfy a number of properties, including f ; g; f = f . For those X that dom getX = ran putX do hold, getX and putX satisfy (3), if ◦ we take to be equality on tag-free View s and to be (putX ; getX ) . That is, s s if and only if the two sources s and s are exactly the same, while a view v is no bigger than v under if there exists a source s such that v = getX s and s = put v . For example, (n, n) is no bigger than (∗n, m), (m, ∗n), (∗n, ∗n), and (n, n) itself under , when the transform is δ. The only glitch here is that is not reflexive! In fact it is reflexive only in the range of getX — the set of tag-free views. However, this is enough for getX and putX to satisfy most properties of a Galois connection. 3.4
Implementation Issues
In our experimental implementation, we have a simple interpreter for Inv. One way to incorporate the algebraic rules in the previous section in the imple-
16
S.-C. Mu, Z. Hu, and M. Takeichi
mentation is to call a pre-processor before the program is interpreted. Another possibility is to build the rules implicitly in the interpreter. In this section we will talk about how. The abstract syntax tree of Inv is extended with new constructs cons and sync. The “intermediate” functions introduced in the last section, namely ins,del , fst ± s and snd ± s, are not actually represented in the abstract syntax. Instead, we extend the value domain View with additional constructs: View ::= . . . | (View , +View ) | (+View , View ) | (View , -View ) | (-View , View ) | ⊥ | NilTo View Conceptually, after we apply snda+ ˘ to a value b, we get (+a, b), while (-a, b) is the result of applying snda- ˘ to b. The reader can think of them as a note saying “the value should have been b only, but we temporarily pair it with an a, just to allow the computation to carry on.” Or one can think of it as a pending application of snda+ or snda- . The ⊥ symbol denotes an unconstrained value. Finally, NilTo a denotes a function taking only [ ] and returns a. To implement the sync operator, we add the following definitions (some cases are omitted): [[sync˘]] (a: x , b: y) = ((a, x ), (b, y)) [[sync˘]] (a ⊕ x , y) = ((+a, x ), (+⊥, y)) [[sync˘]] (a x , b: y) = ((-a, x ), (-b, y)) [[sync˘]] (x , b ⊕ y) = ((+⊥, x ), (+b, y)) [[sync˘]] (a: x , b y) = ((-a, x ), (-b, y)) ◦
The first clause is simply what (cons×cons) would do. The second clause shows that when there is a deletion in the first list, we throw away an element in the second list as well, while keeping note of the fact by the (- , ) tag. It cor◦ ˆ responds to the (del a ◦ ; snda- ◦ × cons ◦ ; sndb- ; sndb- ◦ ) branch of (cons ×cons) . The ◦ ◦ ◦ fourth branch, on the other hand, corresponds to ((ins a) ; snda+ ×isList; sndb+ ). The newly introduced, unconstrained value b is represented by ⊥. Now we build in some extra rules for cons and cons˘: [[cons]] (-a, x ) = a x [[cons˘]] (a x ) = (-a, x ) [[cons]] (+a, x ) = a ⊕ x [[cons˘]] (a ⊕ x ) = (+a, x ) They correspond to the fact that snd ˘; cons = del and snda ˘; cons = ins a. Also, some additional rules for assocr : [[assocr ]] ((a, +b), c) = (a, (+b, c)) [[assocr ]] ((+a, b), c) = (+a, (b, c))
[[assocr ]] (+(a, b), c) = (+a, (+b, c))
The three clauses correspond to the rules for assocl in the left column of Figure 4. Finally we need some rules for dup nil and its inverse eq nil : [[(eq nil )]] (-a, [ ]) = NilTo a which corresponds to the rule
4
[[(dup nil )]] (NilTo a) = (-a, [ ]) snda2 ˘; eq
nil = (λ [ ] → a) in Figure 4.
More Examples
In this section we will show more transforms defined in Inv that do satisfy dom getX = ran putX and how they react to user editing.
An Algebraic Approach to Bi-directional Updating
4.1
17
Snoc and List Reversal
The function snoc :: (a, List a) → List a, appending an element to the end of a list, can be defined recursively as: snoc = µ(X : eq nil ; dup nil ; cons ∪ (id × cons◦ ); subr ; (id × X ); cons) For example [[snoc]] (4, [1, 2, 3]) = [1, 2, 3, 4]. Conversely, snoc˘ extracts the last element of a list. But what is the result of extracting the last element of a list whose last element was just removed? We expand the base case: snoc˘ ⊇ {fixed-point} cons˘; eq nil ; dup nil ⊇ {specialising cons ⊇ snda- ; del a} (del a)˘; snda- ˘; ea nil ; dup nil =
{since snda- ˘; eq nil = (λ[ ] → a)} (del a)˘; (λ[ ] → a); dup nil
=
{since snda- ˘; eq nil = (λ[ ] → a) ⇒ snda- ˘ = (λ[ ] → a); dup nil } (del a)˘; snda- ˘
That is, for example, eval snoc˘ (4 [ ]) = (-4, [ ]). Inductively, we have eval snoc˘ (1 : 2 : 3 : 4 [ ]) = (-4, 1 : 2 : 3 : [ ]), which is reasonable enough: by extracting the last element of a list whose last element, 4, is missing, we get a pair whose first element should not have been there. The ubiquitous fold function on lists can be defined by fold f g = µ(X : nil ˘; g ∪ cons˘; (id × X ); f ) The function reverse, reverting a list, can be defined in terms of fold as reverse = fold snoc nil . Unfolding its definition, we can perform the following refinement: reverse˘ ⊇
{unfolding the definitions} snoc˘; (id × reverse˘); cons ⊇ {by the reasoning above, snoc˘ ⊇ (del a)˘; snda- ˘} (del a)˘; snda- ˘; (id × reverse˘); cons = {since (f × g); sndf- a = snda- ; g for total f } (del a)˘; reverse˘; snda- ˘; cons ⊇
{since cons ⊇ snda- ; del a and snda- ˘; snda- = id } (del a)˘; reverse˘; del a
which shows that reverse˘ regenerates the tags (and, similiarly, ⊕ tags) upon receipt of the “partial” pairs returned by snoc. For example, we have eval reverse (1 : 2 : 3 4 : [ ]) = 4 : 3 2 : 1 : [ ] which is exactly what we want. A lesson is that to deal with lists, we have to first learn to deal with pairs.
18
4.2
S.-C. Mu, Z. Hu, and M. Takeichi
Merging and Filtering
Recall the function merge defined in Section 2.3, merging two sorted lists into one, while marking the elements with labels remembering where they were from: merge ([1, 4, 7], [2, 5, 6]) = [L 1, R 2, L 4, R 5, R 6, L 7] Filtering is an often needed feature. For example, in a list of (author , article) pairs we may want to extract the articles by a chosen author. The Haskell Prelude function filter :: (a → Bool ) → List a → List a, returning only the elements in the list satisfying a given predicate, however, is not injective because it throws away some items. A common scenario of filtering is when we have a list of sorted items to filter. For example, the articles in the database may be sorted by the date of creation, and splitting the list retains the order. If we simplify the situation a bit further, it is exactly the converse of what merge does, if we think of L and R as true and false! To make merge work with editing tags, we simply replace every occurrence of cons with cons, including the cons in (cons × cons). This time the latter shall not be replaced by sync because we certainly do not want to delete or invent elements in one list when the user edits the other! This merge does behave as what we would expect. For example, when an element is added to the split list: merge (1 : 3 ⊕ 4 : 7 : [ ], [2, 5, 6]) = L 1 : R 2 : L 3 ⊕ [L 4, R 5, R 6, L 7] the new element is inserted back to the original list as well.
5 Conclusion
Bi-directional updating, though an old problem [5, 7, 10, 14, 1], has recently attracted much interest, with each line of work taking a slightly different approach according to its target application. We have developed a formalisation of bi-directional updating which is able to deal with duplication and with structural changes like insertion and deletion. From a specification X, written as an injective function, we induce two functions getX and putX that satisfy the important get-put and put-get-put properties. To find out how putX reacts to user editing, one can make use of algebraic reasoning, which also provides a hint of how the formalisation can be implemented in an interpreter. Our formalisation deals with duplication and structural changes at the cost of introducing editing tags, which is acceptable for our application, namely integrating it into our structural editor [15]. The complementary approach taken by [11], on the other hand, chooses not to use any information about how the new view was constructed. Upon encountering an inconsistency, the system generates several possible ways to resolve it, for the user to choose from. It would be interesting to see whether there is a general framework covering both approaches. Another feature of our work is the use of an injective language, and of various program derivation and inversion techniques. The injective language Inv was introduced in [13], where it is also described how to automatically derive
an injective variant for every non-injective program. So far we have only a primitive implementation. For an efficient implementation, however, the parsing-based techniques described in [9] may be of help. The history of point-free functional languages may be traced back to Backus's FP [4], although our style in this paper is more influenced by [6]. A number of well-established libraries adopt a point-free style, such as the XML processing library HaXml [16]. Furthermore, there is a tedious, uninteresting way of converting a certain class of pointwise functional programs into Inv. The class of programs is essentially the same as the source language in [9], that is, first-order functional programs with linear patterns in case expressions, where every variable is used at least once.

Acknowledgements. The idea of using algebraic rules and program transformation to guide the processing of editing tags was proposed by Lambert Meertens during the first author's visit to Kestrel Institute, CA. The authors would like to thank Johan Jeuring for useful improvements to an earlier draft of this paper, and the members of the Programmable Structured Documents project, Information Processing Lab, Tokyo University, for valuable discussions. This research is partly supported by the e-Society Infrastructure Project of the Ministry of Education, Culture, Sports, Science and Technology, Japan.
References

1. S. Abiteboul. On views and XML. In Proceedings of the 18th ACM SIGPLAN-SIGACT-SIGART Symposium on Principles of Database Systems, pages 1-9. ACM Press, 1999.
2. S. M. Abramov and R. Glück. The universal resolving algorithm and its correctness: inverse computation in a functional language. Science of Computer Programming, 43:193-299, May-June 2002.
3. Altova Co. XmlSpy. http://www.xmlspy.com/products_ide.html.
4. J. Backus. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Communications of the ACM, 21(8):613-641, August 1978.
5. F. Bancilhon and N. Spyratos. Update semantics of relational views. ACM Transactions on Database Systems, 6(4):557-575, December 1981.
6. R. S. Bird and O. de Moor. Algebra of Programming. International Series in Computer Science. Prentice Hall, 1997.
7. U. Dayal and P. A. Bernstein. On the correct translation of update operations on relational views. ACM Transactions on Database Systems, 7(3):381-416, September 1982.
8. R. Glück and M. Kawabe. A program inverter for a functional language with equality and constructors. In A. Ohori, editor, Programming Languages and Systems. Proceedings, number 2895 in Lecture Notes in Computer Science, pages 246-264. Springer-Verlag, 2003.
9. R. Glück and M. Kawabe. Derivation of deterministic inverse programs based on LR parsing (extended abstract). In Y. Kameyama and P. J. Stuckey, editors, Proceedings of Functional and Logic Programming, number 2998 in Lecture Notes in Computer Science, pages 291-306, Nara, Japan, 2004. Springer-Verlag.
10. G. Gottlob, P. Paolini, and R. Zicari. Properties and update semantics of consistent views. ACM Transactions on Database Systems, 13(4):486-524, December 1988.
11. M. B. Greenwald, J. T. Moore, B. C. Pierce, and A. Schmitt. A language for bi-directional tree transformations. Technical Report MS-CIS-03-08, University of Pennsylvania, August 2003.
12. L. Meertens. Designing constraint maintainers for user interaction. ftp://ftp.kestrel.edu/pub/papers/meertens/dcm.ps, 1998.
13. S.-C. Mu, Z. Hu, and M. Takeichi. An injective language for reversible computation. In Seventh International Conference on Mathematics of Program Construction. Springer-Verlag, July 2004.
14. A. Ohori and K. Tajima. A polymorphic calculus for views and object sharing. In Proceedings of the 13th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 255-266. ACM Press, 1994.
15. M. Takeichi, Z. Hu, K. Kakehi, Y. Hayashi, S.-C. Mu, and K. Nakano. TreeCalc: towards programmable structured documents. In The 20th Conference of Japan Society for Software Science and Technology, September 2003.
16. M. Wallace and C. Runciman. Haskell and XML: generic combinators or type-based translation? In Proceedings of the 1999 ACM SIGPLAN International Conference on Functional Programming, pages 148-159. ACM Press, September 1999.
Network Fusion

Pascal Fradet¹ and Stéphane Hong Tuan Ha²

¹ INRIA Rhône-Alpes, 655, av. de l'Europe, 38330 Montbonnot, France
[email protected]
² IRISA/INRIA Rennes, Campus de Beaulieu, 35042 Rennes, France
[email protected]

Abstract. Modular programming enjoys many well-known advantages, but the composition of modular units may also lead to inefficient programs. In this paper, we propose an invasive composition method which strives to reconcile modularity and efficiency. Our technique, network fusion, automatically merges networks of interacting components into equivalent sequential programs. We provide the user with an expressive language to specify scheduling constraints which can be taken into account during network fusion. Fusion allows internal communications to be replaced by assignments and removes most of the time overhead. We present our approach in a generic and unified framework based on labeled transition systems, static analysis and transformation techniques.
1 Introduction
Modular programming enjoys many well-known advantages: readability, maintainability, and separate development and compilation. However, the composition of modular units (components) gives rise to efficiency issues. Sequential composition poses space problems: the producer delivers its complete output before the consumer starts. Parallel composition relies on threads, synchronization and context switches, which introduce time overhead. In this paper, we propose an invasive composition method, network fusion, which strives to reconcile modularity and efficiency. Our technique automatically merges networks of interacting components into equivalent sequential programs. Our approach takes two source inputs: a network of components and user-defined scheduling constraints. Networks are formalized as Kahn Process Networks (Kpns) [7], a simple formal model expressive enough to specify component programming and assembly. Scheduling constraints allow the user to choose the scheduling strategy by specifying a set of desired executions. The operational semantics of Kpns and scheduling constraints are both formalized as guarded labeled transition systems (Lts).
This work has been supported in part by the ACI DISPO and Région Bretagne.
Network fusion is an automatic process which takes a Kpn and scheduling constraints and yields a sequential program respecting the constraints. Note that constraints may introduce artificial deadlocks, in which case the user is warned. The resulting program must be functionally equivalent to the Kpn, modulo the possible deadlocks introduced by the constraints. Fusion removes most of the time overhead by allowing the suppression of context switches, the replacement of internal communications by assignments to local variables, and optimizations of the resulting sequential code using standard compiling techniques. Network fusion can be seen as a generalization of filter fusion [12] to general networks, using ideas from aspect-oriented programming [8] (scheduling constraints can be seen as an aspect, and their enforcement as weaving). The four main steps of the fusion process are represented in Figure 1.
[Fig. 1. Main steps of network fusion. The two source inputs are a network of components (a Kpn, Sec. 2.2) and scheduling constraints (Sec. 4.1). Abstraction (Sec. 3) yields an abstract execution graph (Aeg, Sec. 3); enforcing the constraints (Sec. 4.2) gives a constrained Aeg; scheduling (Sec. 4.3) gives a scheduled Aeg; and concretization (Sec. 3) produces the sequential program.]
– The first step is the abstraction of the network into a finite model called an abstract execution graph (Aeg). An Aeg over-approximates the set of possible execution traces. We do not present this step in detail since it relies on very standard analysis techniques (e.g., abstract interpretation), and many different abstractions are possible depending on the desired level of precision. Instead, we focus on the properties that an Aeg must satisfy.
– The second step consists in enforcing the constraints. This is expressed as a synchronized product between guarded Lts (the Aeg and the constraints).
In general, this step does not completely sequentialize the execution and leaves scheduling choices.
– The third step completes the scheduling of the constrained Aeg. Several strategies can be used as long as they are fair. Again, these strategies can be expressed as guarded Lts and scheduling as a synchronized product.
– The fourth step, concretization, maps the scheduled (serialized) Aeg to a single sequential program. Further transformations (e.g., standard optimizations) can then be carried out on the resulting program.

We have chosen to present fusion in an intuitive and mostly informal way. In particular, we do not provide any correctness proofs; they would require a complete description of the operational semantics of Kpns, which would be too long for the available space. This paper is organized as follows. Section 2 presents the syntax and semantics of Kpns. Section 3 describes Aegs and defines the abstraction and concretization steps, which both relate Aegs to concrete models (programs and Kpns). Section 4 presents the language of constraints and the two main transformation steps of fusion: constraint enforcement and scheduling. We propose three extensions of the basic technique in Section 5 and, finally, we review related work and conclude in Section 6.
2 Networks

We start by providing the syntax of components and networks. We just outline their semantics and provide some intuition using an example. A complete structural operational semantics for Kpns can be found in [6].

2.1 Basic Components
Components are made of commands c of the form l1 : g | a ; l2, where l1 and l2 denote labels, g a guard and a an action. An action is either a read operation on an input channel (f?x), a write operation on an output channel (f!x), or an internal action i (left unspecified). A component (or process) p is a set of commands {c1, . . . , cn}. If the current program point of a component p is l1, if l1 : g | a ; l2 is a command of p, and the guard g is true, then the action a can be executed and the program point becomes l2. The components we consider in this paper represent valid, sequential and deterministic programs. They have the following restrictions:
– A component has a unique entry point denoted by the label l0.
– All the labels used in p are defined in the lhs of commands.
– Two commands with the same label have mutually exclusive guards.

The program P in Figure 2 sends the set N in increasing order on channel f.
Components:

  P = { p0 : x := x + 1 ; p1,
        p1 : f!x ; p0 }

  C = { c0 : a < b | f?a ; c1,
        c0 : a ≥ b | ι?b ; c1,
        c1 : o!(a + b) ; c0 }

Network: the output f of P is connected to the input f of C; the network reads from ι and writes to o.

[Fig. 2. A simple Kpn and its trace semantics: a fragment of the transition graph from the initial configuration (p0, c0) with x → 0, a → 0, b → 0 and an empty fifo f, assuming inputs ι → 1 : . . ., showing the interleavings of the commands x := x + 1, f!x, a ≥ b | ι?b and o!(a + b).]
Program C assigns a the value read on channel f if a < b, and assigns b the value read on channel ι otherwise. Then it sends a + b on channel o and loops. Note that guards are omitted when they are true. The semantics of a component p is expressed as an Lts (Σp, (l0, s0), Ep, −→p) where:
– Σp is an (infinite) set of states (l, s), where l is a label and s a store mapping variables to their values.
– (l0, s0) is the initial state, made of the initial label l0 and store s0. We assume that the initial label is always indexed by 0 and that the initial store initializes integer variables to the value 0.
– Ep is the set of commands of p.
– −→p is the transition relation (actually, a function) on states, labeled with the current command.

The initial labels of programs P and C (Figure 2) are p0 and c0 respectively, and the variables x, a and b are initialized to 0. In the remainder, we use c|g and c|a to denote the guard and the action of the command c respectively. To simplify the presentation, we consider only non-terminating programs. Termination could always be represented by a final looping command such as l_end : skip ; l_end.
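As an illustration, components can be encoded as sets of guarded commands. The representation below is our own sketch (the names Cmd and steps are hypothetical, values are simplified to integers, and internal actions are restricted to assignments); it is not the paper's formalization.

  import Data.Maybe (mapMaybe)

  type Label = String
  type Store = [(String, Int)]

  data Act = In String String              -- f?x
           | Out String String             -- f!x
           | Assign String (Store -> Int)  -- internal action, e.g. x := x + 1

  data Cmd = Cmd { src :: Label, grd :: Store -> Bool, act :: Act, dst :: Label }

  -- One local step of a component: from state (l, s), any enabled command may
  -- fire.  Reads and writes are only exposed here; matching them against the
  -- channels is the network's job.
  steps :: [Cmd] -> (Label, Store) -> [(Act, (Label, Store))]
  steps cmds (l, s) =
    mapMaybe (\c -> if src c == l && grd c s
                      then Just (act c, (dst c, exec (act c) s))
                      else Nothing)
             cmds
    where
      exec (Assign x f) st = (x, f st) : filter ((/= x) . fst) st
      exec _            st = st

  -- The producer P of Figure 2: p0 : x := x + 1 ; p1   and   p1 : f!x ; p0.
  pComp :: [Cmd]
  pComp =
    [ Cmd "p0" (const True) (Assign "x" (\s -> 1 + maybe 0 id (lookup "x" s))) "p1"
    , Cmd "p1" (const True) (Out "f" "x") "p0" ]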
2.2 Networks of Components
A Kpn k is made of a set of processes {p1, . . . , pn} executed concurrently. Networks are built by connecting output channels to input channels of components. Such channels are called internal channels, whereas the remaining (unconnected) channels are the input and output channels of the network. Communication on internal channels is asynchronous (non-blocking writes, blocking reads) and is modeled using unbounded fifos. In order to guarantee a deterministic behavior, Kpns require the following conditions [7]:
– An internal channel is written by and read from exactly one process.
– An input channel is read from exactly one component (and written by none).
– An output channel is written by exactly one component (and read from none).
– A component cannot test the absence of values on channels.

In order to simplify technical developments, we assume that networks have a single input channel and a single output channel, denoted by ι and o respectively, and that the input channel never remains empty. The global execution state of a Kpn is called a configuration. It is made of the local state of each component and the internal channel states, i.e., finite sequences of values v1 : . . . : vn. The operational semantics of a Kpn is expressed as an Lts (Σk, α0, Ek, −→k) where:
– Σk is an (infinite) set of configurations,
– the initial configuration α0 is such that each component is in its initial state and each internal channel is empty,
– Ek is the union of the sets of commands of the components; these sets are supposed disjoint,
– the transition relation −→k is defined as performing (non-deterministically) any enabled command of any process. A command is enabled when the current program point is its lhs label, its guard is true in the current configuration/state, and it is not a blocking read (i.e., a read on an empty channel).

The transition relation gives rise to an infinite graph representing all the possible execution traces. A small part of the transition relation −→k for our example is depicted in Figure 2. Here, no global deadlock is possible and all traces are infinite. An infinite execution trace is said to be fair if any action enabled at any point in the trace is eventually executed. The denotational semantics of a Kpn is the function from the input values (on the input channel) to the output values (on the output channel) generated by fair executions. We write Traces(k) and IO(k) to denote the set of traces and the denotational semantics of the Kpn k respectively. Kpns of deterministic components are deterministic [7]. Also, all fair executions with the same input yield the same output [6]. An important corollary for us is that Kpns are serializable: they can always be implemented sequentially.
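The internal channels can be sketched as unbounded fifos with non-blocking writes and blocking reads. This again is our own minimal rendering, not the paper's: a read on an empty fifo returns Nothing, i.e., the command is simply not enabled.

  import qualified Data.Map as M

  type Fifo  = [Int]               -- v1 : ... : vn, head = oldest value
  type Chans = M.Map String Fifo

  -- Non-blocking write: append to the (unbounded) fifo.
  writeCh :: String -> Int -> Chans -> Chans
  writeCh f v = M.insertWith (\new old -> old ++ new) f [v]

  -- Blocking read: a read on an empty fifo is simply not enabled.
  readCh :: String -> Chans -> Maybe (Int, Chans)
  readCh f cs = case M.findWithDefault [] f cs of
    []     -> Nothing
    v : vs -> Just (v, M.insert f vs cs)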
3 Abstract Execution Graphs
Network fusion requires finding, statically, a safe and sequential scheduling. This step relies upon an abstract execution graph (Aeg), a finite model upper-approximating all the possible executions of the Kpn. We present in this section the key properties that an Aeg should satisfy, together with an example. An Aeg k′ is a finite Lts (Σk′, α0′, Ek, −→k′) with:
– Σk′ a finite set of abstract configurations,
– α0′ the initial abstract configuration,
– Ek a (finite) set of commands,
– −→k′ a labeled transition relation.
The idea behind abstraction is to summarize in an abstract configuration a (potentially infinite) set of concrete configurations [10]. This set is given by the function conc : Σk′ → P(Σk) defined as conc(α′) = {α | α ≈ α′}, where ≈ is a safety relation relating k and k′ (and we write k ≈ k′). There can be many possible abstractions, differing in size and accuracy. Network fusion is generic w.r.t. abstraction as long as the Aeg respects two key properties: safety and faithfulness. To be safe, the initial abstract configuration of an Aeg must safely approximate the initial concrete configuration. Furthermore, if a configuration α1 is safely approximated by α1′ and the network evolves into the configuration α2, then there must exist a transition from α1′ to some α2′ in the Aeg such that α2 is safely approximated by α2′. These two points ensure that any execution trace of the Kpn is safely simulated by one in the Aeg. Formally:

Definition 1 (Safety). Let k ≈ k′; then k′ is a safe approximation of k iff
  α0 ≈ α0′
  α1 ≈ α1′ ∧ α1 −c→k α2  ⇒  ∃α2′. α2 ≈ α2′ ∧ α1′ −c→k′ α2′

A key property of safe abstractions is that they preserve fairness. Of course, since they are upper approximations, they include false paths (abstract traces whose concretization is empty). However, for abstract traces representing feasible concrete traces, fair abstract traces represent fair concrete traces. Safety also implies that all fair concrete traces are represented by fair abstract traces. An Aeg is said to be faithful if each abstract transition corresponds to a concrete transition, modulo the non-satisfiability of guards or a blocking read. In other words, faithfulness confines approximations to values: a false path can only be an abstract trace with a transition whose concrete image would be a transition with a false guard or a blocking read. Formally:

Definition 2 (Faithfulness). Let k ≈ k′; then k′ is a faithful approximation of k iff
  α1′ −c→k′ α2′ ∧ α1 ≈ α1′  ⇒  (∃α2. α2 ≈ α2′ ∧ α1 −c→k α2) ∨ ¬G[[c|g]]α1 ∨ (c|a = f?x ∧ α1[f → ε])
where α1[f → ε] means that the fifo f is empty in α1.
Faithfulness rules out, for instance, the (highly imprecise but safe) abstraction made of a unique abstract state representing all concrete states. In practice, the same abstract state cannot represent different program points (label configurations). In order to provide some intuition, we give here a crude but straightforward abstraction:
– Each process state is abstracted into the program point it is associated with; variables are not taken into account and process stores are completely abstracted away.
– Each internal channel state is represented by an interval approximating the length of its fifo.
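This crude abstraction can be illustrated as follows. The interval type and the function names are ours; Nothing as an upper bound stands for +∞, as in the intervals [0, +∞[ of Figure 3.

  -- Abstract fifo lengths as intervals [lo, hi], with hi = Nothing meaning +∞.
  data Interval = Iv Int (Maybe Int) deriving Show

  -- Abstract effect of a write and of a read on a fifo-length interval.
  pushLen, popLen :: Interval -> Interval
  pushLen (Iv lo hi) = Iv (lo + 1) (fmap (+ 1) hi)
  popLen  (Iv lo hi) = Iv (max 0 (lo - 1)) (fmap (\h -> max 0 (h - 1)) hi)

  -- A read on f can be enabled in an abstract state only if f may be non-empty.
  mayBeNonEmpty :: Interval -> Bool
  mayBeNonEmpty (Iv _ hi) = maybe True (> 0) hi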
[Fig. 3. The Aeg of the example: abstract configurations pair the program points (pi, cj) with an interval f → [0, +∞[ approximating the length of the fifo f, with transitions labeled x := x + 1, f!x, a < b | f?a, a ≥ b | ι?b and o!(a + b).]

[Fig. 4. Examples of Scheduling Constraints: small automata over states λ0, λ1, . . . whose transitions are guarded by actions such as f!, f? and their negations; the last automaton specifies a demand-driven strategy (ΛDD).]
Figure 4 gathers a few examples of constraints for a network with (at least) two components P (writing a channel f) and C (reading the channel f).
– The constraint ΛFF summarizes in a small automaton the strategy used by filter fusion [12]. The producer P runs until it writes on f; control is then passed to the consumer C until it reads f, and so on. This strategy bounds the size of the fifo f to at most 1, and therefore it may introduce artificial deadlocks for some networks. ΛFF completely sequentializes the execution of P and C (no scheduling choice remains).
– The constraint ΛIS is similar to ΛFF except that both P and C can be executed between writes and reads on f. ΛIS leaves some scheduling choices.
– The constraint ΛFB is a generalization of ΛFF to a buffer f with k places (i.e., P writes k times before control is passed to C). This is the formalization of the extension of filter fusion proposed in [3].
– A demand-driven strategy is specified by ΛDD. The consumer C is executed until it blocks, i.e., is about to read the empty channel f. Then, P is executed until it produces a value in f. Control is passed back to C, which immediately reads f and continues.

These constraints can be applied to any network as long as it has two components P and C connected at least by a channel f. Of course, constraints can be specified for any number of components and channels.
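For instance, the part of ΛFF that bounds the fifo to one place, namely the strict alternation of f! and f?, can be sketched as a two-state automaton. This simplified rendering (the names FF and stepFF are ours) ignores guards and the remaining actions of P and C.

  data FF = L0 | L1 deriving (Eq, Show)

  stepFF :: FF -> String -> Maybe FF   -- Nothing = action forbidden here
  stepFF L0 "f!" = Just L1             -- the producer may write once ...
  stepFF L1 "f?" = Just L0             -- ... then the consumer must read
  stepFF L0 "f?" = Nothing
  stepFF L1 "f!" = Nothing
  stepFF st _    = Just st             -- actions the constraint does not mention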
4.2 Enforcing Constraints
Enforcing a constraint Λ = (Σλ, λ0, Eλ, −→λ) on an abstract execution graph k′ = (Σk′, α0′, Ek, −→k′) can be expressed as a parallel composition (k′ ∥ Λ). This operation can be defined formally as follows. We assume that all shorthands (like (¬?)p) used in constraints are replaced by the actions of the Aeg they represent.

k′ ∥ Λ = (Σk′ × Σλ, (α0′, λ0), Ek, −→k′λ) with

  α′ −g|a→k′ α″  and  λ −g′|a→λ λ′   ⟹   (α′, λ) −g∧g′|a→k′λ (α″, λ′)
  α′ −c→k′ α″  and  c|a ∈ Ek \ Eλ    ⟹   (α′, λ) −c→k′λ (α″, λ)
If an action a is taken into account by the constraints, the execution can proceed only if both Lts can execute a (i.e., they can both execute commands made of a and a true guard). The actions not taken into account by the constraints can be executed independently whenever possible. Constraints do not introduce new actions (Eλ ⊆ Ek). To simplify the presentation, we assumed in the above inference rules that the guards did not use the condition Bp. We now present the rule corresponding to this condition in isolation. The Bp construct serves to pass control to another component when one is blocked. The condition Bp is easily defined w.r.t. Kpns: p is blocked in a configuration α if there is no outgoing transition labeled with a command of p. However, Aegs are approximations with false paths; a component p can be blocked even if the corresponding abstract state has outgoing transitions labeled with commands of p. Actually, p is blocked in an abstract state if every outgoing p-transition either has a false guard or is a read on an empty channel (i.e., is not enabled). Formally, let c1, . . . , cn be all the commands of p such that α′ −ci→k′ αi′, and let

  gi = ¬(ci|g) ∨ f = 0   if ci|a = f?x
  gi = ¬(ci|g)           otherwise

Then the necessary and sufficient condition for p to be blocked in α′ is bp(α′) = ⋀ i=1,...,n gi.
The product of an Aeg with a transition guarded by Bp is defined as follows:

  α′ −g|a→k′ α″  and  λ −Bp|a→λ λ′   ⟹   (α′, λ) −g∧bp(α′)|a→k′λ (α″, λ′)
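A simplified version of this synchronized product, with guards omitted and transitions represented as triples, can be sketched as follows (the names Lts and synchProduct are ours):

  import Data.List (nub)

  -- Transitions as (state, action, state) triples; guards are omitted in this
  -- simplified rendering of the synchronized product.
  type Lts s = [(s, String, s)]

  synchProduct :: (Eq a, Eq b)
               => [String]      -- the actions the constraint mentions (E_lambda)
               -> Lts a -> Lts b -> Lts (a, b)
  synchProduct sigma aeg lam =
    [ ((a, l), x, (a', l')) | (a, x, a') <- aeg, (l, y, l') <- lam, x == y ]
    ++
    [ ((a, l), x, (a', l)) | (a, x, a') <- aeg, l <- states lam
                           , x `notElem` sigma ]   -- unconstrained actions
    where
      states ts = nub (concatMap (\(s, _, s') -> [s, s']) ts)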
Figure 5 represents the product of the Aeg of Figure 3 with ΛFF. The component P is executed until it produces a value on f; then C is executed until it reads f.
[Fig. 5. The product of the Aeg of Figure 3 with ΛFF: states (λi, pj, ck) carry exact fifo bounds f → [0, 0] or f → [1, 1], with transitions x := x + 1, f!x, a < b | f?a, a ≥ b | ι?b and o!(a + b).]
[Fig. 6. Round-Robin Scheduling: an automaton with states δ0, δ1, δ2 and transitions guarded by ¬BP | P, BC ∧ ¬BP | P, ¬BC | C and BC ∧ BP | deadlock.]
The schedule is fair and ensures a complete serialization of the execution. It starts by enforcing the execution of one instruction of P, then one instruction of C, and so on. If one of the two processes is blocked when its turn comes, an instruction of the other process is executed instead. When both processes are blocked, there is a global deadlock, denoted by the special instruction deadlock. Constrained Aegs are composed in parallel with the automaton of Figure 6 to obtain sequential programs. The composition is the same as before, except that the deadlock action does not belong to the set of actions of the components. The product therefore introduces a new deadlock transition, along with a new state, in the Aeg. This new transition, which detects a global deadlock, will be implemented by printing an error message and terminating the program. When such a transition appears in the result of fusion, the user is warned of a possibility of deadlock. Let us consider the scheduling of the original Aeg of Figure 3; this situation would arise if the user did not provide any constraint. The Aeg obtained after the product (and simplifications) is given in Figure 7. Simplifications are needed
[Fig. 7. The Aeg of Figure 3 scheduled round-robin (after simplifications): states (δi, pj, ck) with f → [0, +∞[ and transitions x := x + 1, f!x, a ≥ b | ι?b and o!(a + b).]
a 0 Γ; ∅; ∅ Ti : ti for all i < |T | (T-State) T : t0 S : t0 S : t0 S + S : t0
(T-Choice)
Γ; b; p fi : ti Γ; b ; p fj : tj i < j (T-Thread) Γ; b; p fi ◦ fj : ti Γ; b; p c : t Γ; b; p A{ c } : t
(T-InService)
Fig. 5. Additional judgments and rules for typing states
Lemma 3 (Progress). Suppose S is a closed, well-typed state (that is, S : t for some t and policy SP). Then either S is a value, or there is some state S′ with S −→ S′.

We conclude that, for a given policy SP, well-typed programs satisfy the combinators in SP. An expression e is a well-typed program if it is closed and has a type t in the empty type environment, written e : t.

Theorem 1 (Combinator Satisfiability). Given a policy SP, if e : t, then all runs e −→∗ v0, where v0 is some value, satisfy the combinators in SP.
6 Related Work
There have recently been many proposals of concurrent languages with novel synchronization primitives, e.g. Polyphonic C# [2] and JoCaml [5], which are based on the join-pattern abstraction [7], and Concurrent Haskell [14], Concurrent ML [20], and (Nomadic) Pict [21, 24], which have synchronization constructs based on channel abstractions. They make it possible to encode complex concurrency control more easily than with standard constructs such as monitors and locks. Flow Java [4] extends Java with single-assignment variables, which allow programmers to defer the binding of objects to these variables. Threads accessing an unbound variable are blocked; e.g., the method call c.m() will block until c has been bound to an object (by another thread). This mechanism can be used to implement barrier synchronization in concurrent programs. The above work is orthogonal to the goals of this paper. We are primarily focused on a declarative way of encoding and verifying synchronization through separation of concerns (see [11, 16, 8, 23, 18, 19] among others), with higher-level transactional facilities that provide automatic concurrency control. Below we discuss example work in these two areas.

Separation of Concurrency Aspects. The previous work which sets goals closest to our own is by Ren and Agha [23], on the separation of an object's functional behaviour from the timing constraints imposed on it. They propose an
actor-based language for specifying and enforcing at runtime real-time relations between events in a distributed system. Their work builds on the earlier work of Frølund and Agha [8], who developed language support for specifying multi-object coordination, expressed in the form of constraints that restrict invocation of a group of objects. For a long time, the object-oriented community has been pointing out, under the term inheritance anomaly [17], that concurrency control code interwoven with the application code of classes can represent a serious obstacle to class inheritance, even in very simple situations. Milicia and Sassone [18, 19] address the inheritance anomaly problem, and present an extension of Java with a linear temporal logic to express synchronization constraints on method calls. This approach is similar to ours; however, we focus on verifying static constraints between code fragments. The Aspect-Oriented Programming (AOP) approach is based on separately specifying the various concerns of a program and some description of their relationship, and then relying on the AOP tools to weave [12] or compose them together into a coherent program. Hürsch and Lopes [11] identify various concerns, including synchronization. Lopes [16] describes a programming language, D, which allows thread synchronization to be expressed as a separate concern. More recently, AOP tools have been proposed for Java, such as AspectJ [15]; they allow aspect modules to be encoded using traditional languages, and weaved at the intermediate level of Java bytecode. We are not aware of much work on formalizing combinator-like operations. Achermann and Nierstrasz [1] describe Piccola, which allows software components to be composed (although not isolated) using connectors, with rules governing their composition.

Isolation-Only Transactions. A number of researchers describe ways to decompose transactions and to provide support for the isolation property in ordinary programming. For instance, Venari/ML [9] implements higher-order functions in ML to express modular transactions, with concurrency control factored out into a separate mechanism that the programmer can use to ensure isolation. Flanagan and Qadeer [6] developed a type system for specifying and verifying the atomicity of methods in multithreaded Java programs (their notion of "atomicity" is equivalent to isolation in this paper). The type system is a synthesis of Lipton's theory of left and right movers (for proving properties of parallel programs) and type systems for race detection. Harris and Fraser [10] have been investigating an extension of Java with atomic code blocks that implement Hoare's conditional critical regions (CCRs). However, both Flanagan and Qadeer's atomic methods and Harris and Fraser's atomic blocks must be sequential, while our isolated (composite) services can themselves be multithreaded. Black et al. [3] defined an equational theory of atomic transaction operators, where an operator corresponds to an individual ACID (Atomicity, Consistency, Isolation, and Durability) property. The operators can be composed, giving different semantics to transactions. The model is, however, presented abstractly, without being integrated with any language or calculus. Vitek et al. [25] (see also Jagannathan and Vitek [13]) have recently proposed a calculi-based model of (standard) ACID transactions. They formalized the optimistic and two-phase locking concurrency control strategies.
7 Conclusions

Our small, typed calculus may be a useful basis for work on different problems of declarative synchronization. One problem that we have identified in this paper, and solved using a type system, is the satisfiability of combinators and scheduling policies. Such a combination of static typing with runtime support would be helpful for implementing concurrency combinators. It may also be worthwhile to investigate algorithms for inferring the typing annotations. We have focused on the simplest language that allows us to study the core problem of §1, rather than attempting to produce an industrial-strength language. We think, however, that analogous work could be carried out for other languages too. We hope that having abstractions similar to our concurrency combinators in mainstream programming languages would facilitate the development of concurrent, service-oriented systems, especially those that need to deal with unanticipated evolution.

Acknowledgments. Research supported by the Swiss National Science Foundation under grant number 21-67715.02 and the Hasler Stiftung under grant number DICS-1825.
References

1. F. Achermann and O. Nierstrasz. Applications = Components + Scripts. A Tour of Piccola. In M. Aksit, editor, Software Architectures and Component Technology, pages 261-292. Kluwer, 2001.
2. N. Benton, L. Cardelli, and C. Fournet. Modern concurrency abstractions for C#. In Proc. ECOOP '02, LNCS 2374, June 2002.
3. A. P. Black, V. Cremet, R. Guerraoui, and M. Odersky. An equational theory for transactions. In Proc. FSTTCS '03 (23rd Conference on Foundations of Software Technology and Theoretical Computer Science), Dec. 2003.
4. F. Drejhammar, C. Schulte, P. Brand, and S. Haridi. Flow Java: Declarative concurrency for Java. In Proc. ICLP '03 (Conf. on Logic Programming), 2003.
5. F. Le Fessant and L. Maranget. Compiling join-patterns. In Proc. HLCL '98 (Workshop on High-Level Concurrent Languages), 1998.
6. C. Flanagan and S. Qadeer. A type and effect system for atomicity. In Proc. PLDI '03 (Conf. on Programming Language Design and Implementation), June 2003.
7. C. Fournet, G. Gonthier, J.-J. Lévy, L. Maranget, and D. Rémy. A calculus of mobile agents. In Proc. of CONCUR '96, LNCS 1119, Aug. 1996.
8. S. Frølund and G. Agha. A language framework for multi-object coordination. In Proc. ECOOP '93, LNCS 627, July 1993.
9. N. Haines, D. Kindred, J. G. Morrisett, S. M. Nettles, and J. M. Wing. Composing first-class transactions. ACM TOPLAS, 16(6):1719-1736, Nov. 1994.
10. T. Harris and K. Fraser. Language support for lightweight transactions. In Proc. OOPSLA '03, Oct. 2003.
11. W. Hürsch and C. Lopes. Separation of concerns. Technical Report NU-CCS-95-03, College of Computer Science, Northeastern University, Feb. 1995.
12. R. Jagadeesan, A. Jeffrey, and J. Riely. A calculus of untyped aspect-oriented programs. In Proc. ECOOP 2003, LNCS 2743, July 2003.
13. S. Jagannathan and J. Vitek. Optimistic concurrency semantics for transactions in coordination languages. In Proc. COORDINATION '04, Feb. 2004.
14. S. P. Jones, A. Gordon, and S. Finne. Concurrent Haskell. In Proc. POPL '96 (23rd ACM Symposium on Principles of Programming Languages), Jan. 1996.
15. G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. Griswold. Getting started with AspectJ. Communications of the ACM, 44(10):59-65, Oct. 2001.
16. C. V. Lopes. D: A Language Framework for Distributed Programming. PhD thesis, College of Computer Science, Northeastern University, Dec. 1997 (1998).
17. S. Matsuoka and A. Yonezawa. Analysis of inheritance anomaly in object-oriented concurrent programming languages. In Research Directions in Concurrent Object-Oriented Programming, pages 107-150. MIT Press, 1993.
18. G. Milicia and V. Sassone. Jeeg: A programming language for concurrent objects synchronization. In Proc. ACM Java Grande/ISCOPE Conference, Nov. 2002.
19. G. Milicia and V. Sassone. Jeeg: Temporal constraints for the synchronization of concurrent objects. Tech. Report RS-03-6, BRICS, Feb. 2003.
20. P. Panangaden and J. Reppy. The Essence of Concurrent ML. In F. Nielson, editor, ML with Concurrency: Design, Analysis, Implementation, and Application, pages 5-29. Springer, 1997.
21. B. C. Pierce and D. N. Turner. Pict: A programming language based on the pi-calculus. In G. Plotkin, C. Stirling, and M. Tofte, editors, Proof, Language and Interaction: Essays in Honour of Robin Milner. MIT Press, 2000.
22. G. D. Plotkin. Call-by-name, call-by-value and the λ-calculus. Theoretical Computer Science, 1:125-159, 1975.
23. S. Ren and G. A. Agha. RTsynchronizer: Language support for real-time specifications in distributed systems. In Proc. ACM Workshop on Languages, Compilers, & Tools for Real-Time Systems, 1995.
24. P. Sewell, P. T. Wojciechowski, and B. C. Pierce. Location-independent communication for mobile agents: A two-level architecture. In Internet Programming Languages, LNCS 1686, pages 1-31, 1999.
25. J. Vitek, S. Jagannathan, A. Welc, and A. L. Hosking. A semantic framework for designer transactions. In Proc. ESOP '04, Mar./April 2004.
26. W3C. Web Services Architecture, 2004. http://www.w3.org/TR/ws-arch/.
27. P. Wojciechowski, O. Rütti, and A. Schiper. SAMOA: A framework for a synchronisation-augmented microprotocol approach. In Proc. IPDPS 2004 (18th International Parallel and Distributed Processing Symposium), Apr. 2004.
A Uniform Reduction Equivalence for Process Calculi

Zining Cao

Department of Informatics, School of Mathematical Sciences, Peking University, Beijing 100871, P. R. China
[email protected]

Abstract. We present a new uniform definition of reduction-based semantics for different process calculi, called indexed reduction equivalence (or congruence). We prove that early bisimulation coincides with indexed reduction equivalence for the π-calculus, that context bisimulation coincides with indexed reduction equivalence for the higher order π-calculus, and that indexed reduction congruence is strictly finer than contextual barbed congruence for Safe Mobile Ambients.
1 Introduction
The study of the π-calculus started with Milner, Parrow and Walker's paper in 1992 [7]. Roughly speaking, the π-calculus is an extension of CCS where channels can be exchanged along channels. Several notions of bisimulation have been proposed to describe the equivalence of processes in the π-calculus [10], such as early bisimulation, late bisimulation, open bisimulation and so on. In Sangiorgi's Ph.D. dissertation [8], the higher order π-calculus and some interesting equivalences, such as context bisimulation, normal bisimulation and barbed equivalence, were presented. Recently, several new process calculi have been proposed which allow us to describe the mobility of software. Among them, a famous one is Mobile Ambients (MA) [2], whose computational model is based on the notion of movement. In [4], Levi and Sangiorgi argued that the basic operational semantics of Mobile Ambients leads to the phenomenon of 'grave interference', which makes reasoning about programs much more difficult [4]. To solve this problem, a variant of Mobile Ambients, called Safe Mobile Ambients (SA), was proposed in [4]. In SA, the concept of co-action was introduced, and a reduction is executed by the cooperation of an action and a co-action. Unlike the π-calculus, the operational semantics of SA was first given by a reduction system; therefore the definition of equivalence for SA cannot be given in the usual style of labeled bisimulation of the π-calculus. Several researchers have studied labeled transition systems for SA. In [5], a labeled-transition-system-based
This work was supported by the National Science Fund of China.
operational semantics for SA and a labeled-bisimulation-based equivalence were given. Another labeled-transition-system-based operational semantics was given in [6], where the notions of late bisimulation and early bisimulation were also presented. Up to now, all proposed labeled transition systems for SA seem somewhat complicated, which makes labeled bisimulation for SA not very convenient in applications. On the other hand, barbed equivalence is an equivalence based on reduction semantics, so it seems more natural for SA than labeled bisimulations. Work comparing barbed congruence with labeled-bisimulation-based equivalences has also been done. In [5], a labeled bisimulation was proved to coincide with contextual barbed congruence. In [6], the notions of late bisimulation and early bisimulation were proved to be equivalent to contextual barbed congruence. In [11], the authors defined different formulations of barbs and, following Honda and Yoshida's approach [3], proved that these different formulations lead to the same contextual barbed congruence in the context of SA. In this paper, we give another notion of reduction-based semantics for mobile processes, named indexed reduction equivalence/congruence. Like barbed equivalence/congruence, indexed reduction equivalence/congruence is suitable for several well-known calculi, such as the π-calculus, the higher order π-calculus and SA. The new notion improves on barbed equivalence/congruence in several respects. First, indexed reduction equivalence coincides with early bisimulation for π-calculi with or without the match prefix. By contrast, in [10], a counterexample shows that in the case of the π-calculus without the match prefix, weak barbed equivalence does not coincide with weak early bisimulation. Second, for SA, if we take into account the interactions inside nested ambients, contextual barbed congruence seems extremely coarse: only top-level free ambients can be observed, and the contents of an ambient may not be observable because of the lack of co-capabilities inside it. By contrast, indexed reduction congruence can keep track of what happens internally in an ambient by using the indices, which allows us to distinguish some processes that are viewed as equivalent by contextual barbed congruence. To see the difference for SA, consider the following two SA processes: m[in̄ m.p[in̄ p.0]] and m[in̄ m.q[in̄ q.0]]. Intuitively, m[in̄ m.p[in̄ p.0]] is different from m[in̄ m.q[in̄ q.0]] if we care about the interactions inside nested ambients, because the testing process n[in m.in p.0] can enter ambients m and p of the first process, but only ambient m of the second one. However, from the viewpoint of contextual barbed congruence the two processes are equivalent (see Lemma 2). On the other hand, we prove that indexed reduction congruence can distinguish these two processes, so indexed reduction congruence is finer than contextual barbed congruence. Third, for the π-calculus and the higher order π-calculus, the meaning of 'barb' is 'channel', but for SA, the meaning of 'barb' is 'ambient'. Hence the intuitive
meaning of barbed equivalence/congruence actually changes across calculi, whereas in the definition of indexed reduction equivalence/congruence the concept of index, which can be viewed as the name or location of a component, is uniform for the various calculi. This paper is organized as follows. In Section 2 we introduce the syntax and labeled transition system of the indexed π-calculus and present the notion of indexed reduction equivalence; we prove that indexed reduction equivalence coincides with early bisimulation for the π-calculus. In Section 3 we present the indexed higher order π-calculus and its indexed reduction equivalence, and prove that indexed reduction equivalence is equivalent to context bisimulation for the higher order π-calculus. In Section 4 we introduce the syntax and reduction system of the 'indexed version' of SA, and present the notion of indexed reduction congruence for SA; we then prove that indexed reduction congruence is strictly finer than contextual barbed congruence for SA. The paper is concluded in Section 5.
2 π-Calculus
In this section, a new notion called 'indexed reduction equivalence' for the π-calculus is presented, and the equivalence between indexed reduction equivalence and early bisimulation is proved. We first briefly review the π-calculus.

2.1 Syntax and Labeled Transition System of π-Calculus
We use a, b, c, . . . , x, y, z, . . . to range over the class of names. The class Prπ of π-calculus processes is built up using the operators of prefixing, sum, parallel composition, restriction and replication in the grammar below:

  P ::= 0 | x(y).P | x̄y.P | τ.P | P1 + P2 | P1|P2 | (νx)P | !P

The actions are given by

  α ::= xy | x̄y | (νy)x̄y | τ

We write bn(α) for the set of names bound in α, which is {y} if α is (νy)x̄y and ∅ otherwise. n(α) denotes the set of names of α. In each process of the form (νy)P or x(y).P, the occurrence of y is bound within the scope of P. An occurrence of y in a process is said to be free iff it does not lie within the scope of a bound occurrence of y. The set of names occurring free in P is denoted fn(P). An occurrence of a name in a process is said to be bound if it is not free; we write the set of bound names as bn(P). Processes P and Q are α-convertible, P ≡α Q, if Q can be obtained from P by a finite number of changes of bound names. The operational semantics of the π-calculus is presented in Table 1. We have omitted the symmetric versions of the rules for summation, parallelism and communication.
Table 1.

  ALP:  P −α→ P′  ⟹  Q −α→ Q′    (P ≡α Q, P′ ≡α Q′)
  TAU:  τ.P −τ→ P
  OUT:  x̄y.P −x̄y→ P
  IN:   x(y).P −xz→ P{z/y}
  SUM:  P −α→ P′  ⟹  P + Q −α→ P′
  PAR:  P −α→ P′  ⟹  P|Q −α→ P′|Q    (bn(α) ∩ fn(Q) = ∅)
  COM1: P −x̄y→ P′ and Q −xy→ Q′  ⟹  P|Q −τ→ P′|Q′
  COM2: P −(νy)x̄y→ P′ and Q −xy→ Q′  ⟹  P|Q −τ→ (νy)(P′|Q′)    (y ∉ fn(Q))
  RES:  P −α→ P′  ⟹  (νx)P −α→ (νx)P′    (x ∉ n(α))
  OPEN: P −x̄y→ P′  ⟹  (νy)P −(νy)x̄y→ P′    (x ≠ y)
  REP:  P|!P −α→ P′  ⟹  !P −α→ P′

2.2
This paper presents a uniform behaviour equivalence for π-calculus, higher order π-calculus and Safe Mobile Ambients, based on indexed labeled transition systems. Indices are added to the labels of the transition system. Those indices are used in the indexed reduction bisimulation, equivalence and congruence to identify between which components (tested process, testing processes) a transition takes place. In order to give indexed reduction equivalence, we first introduce the notion of indexed process. Given an index set I, w.l.o.g., let I be the set of natural numbers, the class of the first order indexed processes IP rπ is built similar to P rπ , expect that every prefix is assigned to an index: M ::= 0 | {x(y)}i .M | {xy}i .M |{τ }i .M | M1 + M2 | M1 |M2 | (νx)M | !M, here i ∈ I. In the following, we need the notation {P }i which denotes indexed process with the same index i on every prefix in its scope. The formal definition can be given inductively as follows: {0}i ::= 0; {x(y).P }i ::= {x(y)}i .{P }i ; {xy.P }i ::= {xy}i .{P }i ; {τ.P }i ::= {τ }i .{P }i ; {P1 + P2 }i ::= {P1 }i + {P2 }i ; {P1 |P2 }i ::= {P1 }i |{P2 }i ; {(νx)P }i ::= (νx){P }i ; {!P }i ::=!{P }i . Intuitively, the index set I can be viewed as set of names of components or set of locations, then correspondingly, {P }i denotes a process P whose name is i or a process P located at i. The indexed actions are given by Iα ::= {xy}i | {xy}i | {(νy)xy}i | {τ }i,j
A Uniform Reduction Equivalence for Process Calculi
183
Similar to Table 1, we give the operational semantics of indexed processes in Table 2. The main difference between Table 1 and Table 2 is that the label Iα on the transition arrow is of the form {α}i or {τ }i,j , here α is an input or output action, i and j are indices. If we adopt the distributed view, {α}i can be regarded as an input or output action performed by component i, and {τ }i,j can be regarded as a communication between components i and j. In the following, we view {α}i as a distributed input or output action, and {τ }i,j a distributed communication. Table 2. Iα
ALP :
M −→ M Iα
N −→ N {xy}i OU T : {xy}i .M −→ M M −→ M
{xz}i
Iα
P AR :
Iα
M + N −→ M
M −→ M Iα
M |N −→ M |N
COM 2 :
M
M −→ M
N −→ N
{τ }i,j
M |N −→ M |N
{(νy)xy}i
−→
M
{xy}j
N −→ N
{τ }i,j
M |N −→ (νy)(M |N )
M −→ M Iα
(νx)M −→ (νx)M
x∈ / n(Iα)
y∈ / f n(N ) {xy}i
Iα
RES :
bn(Iα) ∩ f n(N ) = Ø
{xy}j
{xy}i
COM 1 :
T AU : {τ }i .M −→ M
IN : {x(y)}i .M −→ M {z/y}
Iα
SU M :
{τ }i,i
M ≡α N, M ≡α N
M −→ M
OP EN :
(νy)M Iα
REP :
{(νy)xy}i
−→
M
x = y
M |!M −→ M Iα
!M −→ M
Remark. Since {τ}i,j and {τ}j,i have the same meaning, namely a communication between components i and j, in the above transition system {τ}i,j and {τ}j,i are considered the same label; i.e., M −{τ}i,j→ M′ is viewed as the same as M −{τ}j,i→ M′.

2.3
Indexed Reduction Equivalence for π-Calculus
In the following, we propose a new uniform framework for defining equivalences on processes, based on the movement of messages among indexed processes rather than on observables such as barbs.

Definition 1 (Strong indexed reduction bisimulation). Let K, L ∈ IPrπ. We write K ∼ired L if there is a symmetric relation R such that whenever K R L, then for any indexed process M, K|M −{τ}i,j→ K′ implies L|M −{τ}i,j→ L′ for some L′ with K′ R L′.

Definition 2 (Strong indexed reduction equivalence).
Let P, Q ∈ Prπ. We write P ∼red Q if {P}l ∼ired {Q}l for some index l. From a distributed view, P and Q are strong indexed reduction equivalent if, locating P and Q at location l, any distributed communication {τ}i,j between {P}l and an indexed testing process M can be matched by the same distributed communication {τ}i,j between {Q}l and M, where {τ}i,j denotes a communication between locations i and j. Before giving the weak indexed reduction equivalence, let us first compare the communication {τ}i,i with {τ}i,j. In [1], a distributed variant of CCS is proposed in which τ-transitions are considered, as usual, to be invisible; this view takes {τ}i,i and {τ}i,j to be the same. For weak indexed reduction equivalence we adopt another view, where {τ}i,i is regarded as different from {τ}i,j: for an observer who can distinguish between sites, {τ}i,i is an internal communication at location i, while {τ}i,j represents an external communication between two different locations i and j. In other words, we regard {τ}i,i as a private event at location i, and {τ}i,j as a visible event between locations i and j. We now give the weak indexed reduction equivalence, which neglects {τ}i,i since, from a distributed view, {τ}i,i happens internally at location i.

Definition 3 (Weak indexed reduction bisimulation).
whenever K R L then for any indexed process M, (1) K|M −→ *K implies {τ }i,i
{τ }i,i
L|M −→ *L for some L with K R L , here −→ * is the reflexive and transitive {τ }i,i
{τ }i,j
{τ }i,j
closure of −→ ; (2) K|M −→ K , here i = j, implies L|M −→ L for some L with K R L . Definition 4. Weak indexed reduction equivalence Let P , Q ∈ P rπ , we write P ≈red Q if {P }l ≈ired {Q}l for some index l. 2.4
Indexed Reduction Equivalence Coincides with Early Bisimulation
In this section, we study the relation between indexed reduction equivalence and early bisimulation. For the sake of space, we only discuss the case of weak bisimulation, and the same proposition holds for strong bisimulation. We first review the definition of weak early bisimulation. Definition 5. A symmetric relation R ∈ P rπ × P rπ is a weak early bisimulation τ τ if whenever P R Q, (1) P −→*P implies there exists Q s.t. Q −→*Q and P τ τ α R Q ; here −→* is the reflexive and transitive closure of −→; (2) P −→ P with α bn(α)∩fn(P, Q)= and α = τ implies there exists Q s.t. Q −→ Q and P R Q . We write P ≈e Q if P and Q are weak early bisimilar. Now we can give the equivalence between weak early bisimulation and weak indexed reduction equivalence as follows: Proposition 1. For any P, Q ∈ P rπ , P ≈e Q ⇔ P ≈red Q.
A Uniform Reduction Equivalence for Process Calculi
185
Proof. See Appendix A. Remark. For the π-calculus obtained by omitting the match prefixes, weak early bisimulation is not equivalent to weak barbed equivalence, an example was given in [10, page 102, Exercise 2.4.30] as follows: Let P = ax|Ex,y , Q = ay|Ex,y , here Ex,y =!x(z).yz|!y(z).xz, then ¬(P ≈e Q) but for each context C of the π-calculus without the match prefixes, C[P ] and C[Q] are weak barbed bisimilar. Since Proposition 1 holds, this example indicates that weak indexed reduction equivalence is different from weak barbed equivalence for π-calculus without match prefixes. In fact, Proposition 1 can be extended to π-calculus with match prefixes. Because of space limitation, we do not give the detailed proofs.
Higher Order π-Calculus
3
In this section, we first review the syntax and labeled transition system of the higher order π-calculus, then extend it by introducing the notion of ”index”. The definitions and propositions in the following are parallel with the case of π-calculus. 3.1
Syntax and Labeled Transition System of Higher Order π-Calculus
In this section we briefly recall the syntax and labeled transition system of the higher order π-calculus. We only focus on a second-order fragment of the higher order π-calculus [9], i.e. there is no abstraction in this fragment. We assume a set N of names, ranged over by x, y, z,... and a set V ar of process variables, ranged over by X, Y, Z, U, .... We use E, F, P, Q,... to stand for processes. The class of processes is denoted as P rHO . The grammar for the higher order π-calculus processes are given by P ::= 0 | U | x(U ).P | xQ.P | τ.P | P1 |P2 | (νx)P | !P A restriction (νx)P binds the free occurrences of x in P. Higher order input prefix x(U ).P binds all free occurrences of U in P . The set of names occurring free in P is denoted as fn(P ). Alpha conversion relates expressions which are syntactically identical modulo renaming of bound names and bound variables. A c process is closed if it has no free variable. P rHO is the set of all closed processes. We write bn(α) for the set of names bound in action α, which is { y } if α is (ν y)xE and otherwise. n(α) denotes the set of names that occur in α. The operational semantics of higher order process is given in Table 3. 3.2
Syntax and Labeled Transition System of Indexed Higher Order π-Calculus
We firstly introduce the concept of indexed process. Given an index set I, w.l.o.g., let I be the set of natural numbers, the class of the indexed processes IP rHO is built similar to P rHO , except that every prefix is assigned to an index. We usually use K, L, M, N to denote indexed processes.
186
Z. Cao Table 3. α
ALP :
P −→ P α
Q −→
Q
P ≡α Q, P ≡α Q
τ
T AU : τ.P −→ P
xE
xE
OU T : xE.P −→ P IN : x(U ).P −→ P {E/U } α P −→ P bn(α) ∩ f n(Q) = Ø P AR : α P |Q −→ P |Q COM :
P
(ν y )xE
−→
τ
P
xE
Q −→ Q
P |Q −→ (ν y)(P |Q ) P −→ P RES : x∈ / n(α) α (νx)P −→ (νx)P
y ∩ f n(Q) = Ø α
α
OP EN :
P
(ν z )xE
(νy)P
−→
P
(νy,z )xE
−→
REP :
P |!P −→ P α
!P −→ P
x = y, y ∈ f n(E) − z P
The formal definition of indexed process is given as follows: M ::= 0 | U | {x(U )}i .M | {xK}i .M | {τ }i .M | M1 |M2 | (νx)M | !M, here i ∈ index set I and K is an indexed process. In each process of the form (νx)M the occurrence of x is a bound within the scope of M . An occurrence of x in a process is said to be free iff it does not lie within the scope of a bound occurrence of x. The set of names occurring free in M is denoted as fn(M ), and the set of bound names is denoted as bn(M ). Indexed input prefix {x(U )}i .M binds all free occurrences of U in M . An indexed c process is closed if it has no free variable. IP rHO is the set of all closed indexed processes. Process M and N are α-convertible, M ≡α N , if N can be obtained from M by a finite number of changes of bound names and bound variables. We use {P }i to denote indexed process with the same given index i on every prefix in its scope. The formal definition can be given inductively as follows: {0}i ::= 0; {U }i ::= U ; {τ.P }i ::= {τ }i .{P }i ; {x(U ).P }i ::= {x(U )}i .{P }i ; {xE.P }i ::= {x{E}i }i .{P }i ; {P1 |P2 }i ::= {P1 }i |{P2 }i ; {!P }i ::=!{P }i ; {(νx)P }i ::= (νx){P }i . The indexed actions are given by Iα ::= {xK}i | {xK}i | {(ν y)xK}i | {τ }i,j We write bn(Iα) for the set of names bound in Iα, which is { y } if Iα is {(ν y)xK}i and otherwise. n(Iα) denotes the set of names that occur in Iα. The operational semantics of indexed processes is given in Table 4. 3.3
Indexed Reduction Equivalence for Higher Order π-Calculus
The definition of strong and weak indexed reduction equivalence for higher order π-calculus is same as Definition 1-4 in the case of π-calculus. For the limitation of space, we only discuss weak bisimulation and equivalence.
A Uniform Reduction Equivalence for Process Calculi
187
Table 4.

  ALP:  M −Iα→ M′  ⟹  N −Iα→ N′    (M ≡α N, M′ ≡α N′)
  TAU:  {τ}i.M −{τ}i,i→ M
  OUT:  {x̄K}i.M −{x̄K}i→ M
  IN:   {x(U)}i.M −{xK}i→ M{K/U}
  PAR:  M −Iα→ M′  ⟹  M|N −Iα→ M′|N    (bn(Iα) ∩ fn(N) = ∅)
  COM:  M −{(νỹ)x̄K}i→ M′ and N −{xK}j→ N′  ⟹  M|N −{τ}i,j→ (νỹ)(M′|N′)    (ỹ ∩ fn(N) = ∅)
  RES:  M −Iα→ M′  ⟹  (νx)M −Iα→ (νx)M′    (x ∉ n(Iα))
  OPEN: M −{(νz̃)x̄K}i→ M′  ⟹  (νy)M −{(νy,z̃)x̄K}i→ M′    (x ≠ y, y ∈ fn(K) − z̃)
  REP:  M|!M −Iα→ M′  ⟹  !M −Iα→ M′
M
Definition 6. Weak indexed reduction bisimulation c Let K, L ∈ IP rHO , we write K ≈ired L, if there is a symmetric relation R, {τ }i,i
s.t. whenever K R L then for any indexed process M, (1) K|M −→ *K implies {τ }i,i
{τ }i,j
L|M −→ *L for some L with K R L ; (2) K|M −→ K , here i = j, implies {τ }i,j
L|M −→ L for some L with K R L . Definition 7. Weak indexed reduction equivalence c , we write P ≈red Q if {P }l ≈ired {Q}l for some index l. Let P , Q ∈ P rHO 3.4
Indexed Reduction Equivalence Coincides with Context Bisimulation
In this section we give the equivalence between context bisimulation and indexed reduction equivalence for higher order π-calculus. We first review the definition of context bisimulation. Definition 8. A symmetric relation R is a weak context bisimulation if P R Q τ τ implies: (1) whenever P −→*P , then there exists Q such that Q −→*Q and xE
xE
P R Q ; (2) whenever P −→ P , there exists Q s.t. Q −→ Q and P R Q ; (3) (ν b)xE (ν c)xF c, s.t. Q −→ Q and for all C(U ) whenever P −→ P , there exist Q , F , with fn(C(U ))∩{b, c}=∅, (νb)(P |CE) R (ν c)(Q |CF ). We write P ≈ct Q if P and Q are weak context bisimilar. The following proposition shows that weak indexed reduction equivalence coincides with weak context bisimulation. c , P ≈ct Q ⇔ P ≈red Q. Proposition 2. For any P , Q ∈ P rHO The proof of this proposition includes two steps: firstly an indexed version of context bisimulation is proved to be equivalent to indexed reduction equivalence;
188
Z. Cao
secondly, the equivalence between this indexed version of context bisimulation and the original context bisimulation is proved. For the limitation of space, we do not give the detail proof.
4
Safe Mobile Ambients
For π-calculus, we have proved that the indexed reduction equivalence coincides with the early bisimulation. For higher order π-calculus, the equivalence between context bisimulation and indexed reduction equivalence is proved. But things are different for the Safe Mobile Ambients. For this formalism, the indexed reduction congruence is strictly finer than the contextual barbed congruence. 4.1
Syntax and Reduction System of Safe Mobile Ambients
The class P rSA of the safe ambients is built using the operators of prefixing, parallel composition, ambient, restriction and recursion in the grammar below: P ::= 0 | α.P | P1 |P2 | n[P ] | (νn)P | X | recX.P, here n ∈ set N of names, X ∈ set V ar of process variables. α is called capability and of one of the following forms: α ::= in n | in n | out n | out n | open n | open n Recursive operator recX.P binds all free occurrences of X in P . A process is c is closed if it has no free variable; it is open if it may have free variables. P rSA the set of all closed processes. A context is a term with a hole [] in it: C ::= [] | α.C | C|P | P |C | n[C] | (νn)C | recX.C The operational semantics of Safe Mobile Ambients is reported in Table 5. Table 5. ST RU C :
P −→ P P ≡ Q, P ≡ Q Q −→ Q
IN : n[in m.P1 |P2 ]|m[in m.Q1 |Q2 ] −→ m[n[P1 |P2 ]|Q1 |Q2 ] OU T : m[n[out m.P1 |P2 ]|out m.Q1 |Q2 ] −→ n[P1 |P2 ]|m[Q1 |Q2 ] OP EN : open n.P |n[open n.Q1 |Q2 ] −→ P |Q1 |Q2 P AR : AM B :
P −→ P P |Q −→ P |Q
P −→ P n[P ] −→ n[P ]
RES : REC :
P −→ P (νn)P −→ (νn)P
P {recX.P/X} −→ P recX.P −→ P
Structural congruence is a congruence relation including the following rules: P |Q ≡ Q|P ; (P |Q)|R ≡ P |(Q|R); P |0 ≡ P ; (νn)0 ≡ 0; (νm)(νn)P ≡ (νn)(νm)P ; (νn)(P |Q) ≡ P |(νn)Q if n ∈ / f n(P ); (νn)(m[P ]) ≡ m[(νn)P ] if n = m.
A Uniform Reduction Equivalence for Process Calculi
4.2
189
Syntax and Reduction System of Indexed Safe Mobile Ambients
The class IP rSA of the indexed processes is built similar to P rSA , expect that an index is assigned to every prefix: M ::= 0 | Iα.M | M1 |M2 | n[M ] | (νn)M | X | recX.M, here n ∈ N, X ∈ V ar. Iα is called indexed capability and is one of the following forms. Iα ::= {in n}i | {in n}i | {out n}i | {out n}i | {open n}i | {open n}i , here i ∈ set I of indices. Operator recX.M binds all free occurrences of X in M . An indexed process is c is the set of all closed indexed processes. closed if it has no free variable. IP rSA We need the notation {P }i which denotes indexed process with the same index i on every capability in its scope. The formal definition is given inductively as follows: {0}i ::= 0; {α.P }i ::= {α}i .{P }i ; {P1 |P2 }i ::= {P1 }i |{P2 }i ; {n[P ]}i ::= n[{P }i ]; {(νn)P }i ::= (νn){P }i ; {recX.P }i ::= recX.{P }i . The formal definition of indexed context is given below: C ::= [] | Iα.C | C|M | M |C | n[C] | (νn)C | recX.C The operational semantics of indexed processes is given in Table 6. Table 6. i,j
ST RU C :
M −→ M i,j
N −→
N
M ≡ N, M ≡ N i,j
IN : n[{in m}i .M1 |M2 ]|m[{in m}j .N1 |N2 ] −→ m[n[M1 |M2 ]|N1 |N2 ] i,j
OU T : m[n[{out m}i .M1 |M2 ]|{out m}j .N1 |N2 ] −→ n[M1 |M2 ]|m[N1 |N2 ] i,j
OP EN : {open n}i .M |n[{open n}j .N1 |N2 ] −→ M |N1 |N2 i,j
P AR :
M −→ M i,j
M |N −→ M |N
i,j
RES :
M −→ M i,j
n[M ] −→ n[M ]
i,j
(νn)M −→ (νn)M i,j
i,j
AM B :
M −→ M
REC :
M {recX.M/X} −→ M i,j
recX.M −→ M i,j
Remark. Similar to the case of π-calculus, M −→ M is viewed to be same as j,i M −→ M . Structural congruence for indexed processes is similar to the one for original Safe Mobile Ambients, and we do not give the formal definition here. Example: For indexed process {p[in p.P ]}0 |{q[in p.Q]}1 , indices 0 and 1 can be viewed as names of components p[in p.P ] and q[in p.Q] respectively and
190
Z. Cao 0,1
{p[in p.P ]}0 |{q[in p.Q]}1 −→ p[q[{Q}1 ]|{P }0 ] represents the reduction between components 0 and 1. 4.3
Indexed Reduction Congruence for Safe Mobile Ambients
In fact, the contextual barbed congruence is considered appropriate if we do not have to care about what happens inside of nested ambients. But when we take into account the interactions inside of nested ambients, contextual barbed congruence seems too coarse. In this section, we give the concept of indexed reduction congruence for SA and show that this equivalence can distinguish processes which have different behaviours inside of nested ambients but are considered to be same with respect to contextual barbed congruence. Let us first review the definition of contextual barbed congruence for SA in [5, 11]: Definition 9. Contextual barbed congruence c c A symmetric relation R ⊆ P rSA ×P rSA is a contextual barbed bisimulation if whenever P R Q then for any context C: (1) Whenever C[P ] −→*P then C[Q] −→*Q and P R Q , here −→* is the reflexive and transitive closure of −→; (2) For each ambient n, if C[P ] ⇓ n, then also C[Q] ⇓ n. Here P ⇓ n means ∃P , P −→*P , P ≡ (ν m)(n[µ.Q / m. 1 |Q2 ]|Q3 ) where µ ∈ {in n, open n} and n ∈ Two closed processes P and Q are contextual barbed congruent, written as P ≈sa Q, if there is a contextual barbed bisimulation R, P R Q. A difference between π-calculus and Safe Mobile Ambients is that if a testing process interact with a π-calculus process, the interaction must happen at the top-level of process, whereas for Safe Mobile Ambients, the interaction may happen between the inner processes that are located in nested ambients. Therefore in the case of π-calculus, barbed testing reflect the interaction capability of process with context, but for Safe Mobile Ambients, barbed testing is not enough if we care the behaviour of testing process inside of nested ambients. On the contrary, by using indices, indexed reduction congruence can provide more information about what happens internally in ambients. Definition 10. Weak indexed reduction bisimulation c Let K, L ∈ IP rSA , we write K ≈ired L, if there is a symmetric relation R, i,i
s.t. whenever K R L then for any indexed context M, (1) M [K] −→*K implies i,i
i,j
M [L] −→*L for some L with K R L ; (2) M [K] −→ K , here i = j, implies i,j
M [L] −→ L for some L with K R L . Definition 11. Weak indexed reduction congruence c Let P , Q ∈ P rSA , we write P ≈red Q if {P }l ≈ired {Q}l for some index l.
A Uniform Reduction Equivalence for Process Calculi
4.4
191
Comparison Between Indexed Reduction Congruence and Contextual Barbed Congruence
Now we show that indexed reduction congruence is strictly finer than contextual barbed congruence for SA. c , P ≈red Q ⇒ P ≈sa Q. Proposition 3. For any P , Q ∈ P rSA
Proof. See Appendix B. To show that the inverse proposition does not hold, let us see two processes: m[in m.p[in p.0]] and m[in m.q[in q.0]]. Lemma 1. ¬(m[in m.p[in p.0]] ≈red m[in m.q[in q.0]]). Proof. Let indexed context M [] = n[{in m}1 .{in p}2 .0]|[], then it is clear that 0,1 0,2
M [m[{in m}0 .p[{in p}0 .0]]] −→−→ m[p[n[0]]], and M [m[{in m}0 . q[{inq}0 .0]]] 0,1
−→ m[q[{in q}0 .0]|n[{in p}2 .0]], but m[q[{in q}0 .0]|n[{in p}2 ]] can not perform 0,2
reduction −→. On the contrary, from the view of contextual barbed congruence, these two processes are equivalent. Lemma 2. m[in m.p[in p.0]] ≈sa m[in m.q[in q.0]]. Proof. See Appendix C. The key point in proof of Lemma 2 is that after cooperating with capability in m of context E[], m[in m.p[in p.0]] and m[in m.q[in q.0]] reduce to F [m[n[S]|p[in p.0]]] and F [m[n[S]|q[in q.0]]] respectively. Since there is no capability open m, in m or out m in ambient m, we have F [m[n[S]|p[in p.0]]] ⇓ k iff F [m[n[S]|q[in q.0]]] ⇓ k for any ambient k. Hence m[in m.p[in p.0]] ≈sa m[in m. q[in q.0]]. From the above Lemma 1 and 2, we have: c , P ≈sa Q does not imply P ≈red Q. Proposition 4. For some P , Q ∈ P rSA
5
Conclusions
In this paper, a notion of bisimulation equivalence called ”indexed reduction equivalence/congruence” is presented, in terms of ”indexed contexts” where components of a context can be given indices which represent ”components” of communications. This approach is a unifying treatment of equivalences on process calculi. Its relationship with existing notions of equivalences on the πcalculus, higher order π-calculus and Safe Mobile Ambients are studied. Other ”uniform” formulation is [3], which does not use barbs. Perhaps that is the only one which can be related to the present work as a uniform approach, though it is quite different, both in conception and in formal nature. The idea of indexed reduction equivalence/congruence can also be extended to other process calculi.
192
Z. Cao
References 1. G. Boudol, I. Castellani, M. Hennessy and A. Kiehn. Observing localities, Theoretical Computer Science, 114: 31-61 1993. 2. L. Cardelli and A. D. Gordon. Mobile Ambients. Theoretical Computer Science, 240(1):177-213, 2000. 3. K. Honda and N. Yoshida. On reduction-based process semantics. Theoretical Computer Science, 152(2): 437-486 1995. 4. F. Levi and D. Sangiorgi. Controlling interference in Ambients. In Proc. POPL’00, pages 352–364, Boston, Massachusetts, Jan. 19-21, 2000. 5. M. Merro and M. Hennessy. Bisimulation congruences in Safe Ambients. Computer Science Report 5/01. An extended abstract appear in Proc. POPL’02. 6. M. Merro and F. Zappa Nardelli. Bisimulation proof methods for Mobile Ambients. Technical Report COGS 01:2003. 7. R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, (Part I and II). Information and Computation, 100:1-77, 1992. 8. D. Sangiorgi. Expressing mobility in process algebras: first-order and higher-order paradigms, Ph.D thesis, Department of Computer Science, University of Einburgh, 1992. 9. D. Sangiorgi. Bisimulation in higher-order calculi, Information and Computation, 131(2), 1996. 10. D. Sangiorgi, D. Walker. The π-calculus: a theory of mobile processes, Cambridge University Press, 2001. 11. M. G. Vigliotti and I. Phillips. Barbs and congruences for Safe Mobile Ambients. In: Foundations of Wide Area Network Computing, July 2002.
Appendix A. Proof of Proposition 1 Proposition 1. For any P, Q ∈ P rπ , P ≈e Q ⇔ P ≈red Q. Proof. ⇒ It is easy. ⇐ Let R = {(P, Q) : {P }0 ≈ired {Q}0 }. Suppose (P, Q) ∈ R, we consider the following cases: xy
τ
xy
(1) P −→*P ; (2) P −→ P ; (3) P −→ P ; (4) P
(νy)xy
−→ P .
τ
{τ }0,0
Case (1): P −→*P . For indexed process {0}1 , we have {P }0 |{0}1 −→ *{P }0 {τ }0,0
|{0}1 . Since {P }0 ≈ired {Q}0 , {Q}0 |{0}1 −→ *{Q }0 |{0}1 and {P }0 ≈ired τ {P }0 |{0}1 ≈ired {Q }0 |{0}1 ≈ired {Q }0 . Hence Q −→*Q and (P , Q ) ∈ R. {τ }0,1
xy
Case (2): P −→ P . Since {P }0 |{xy.0}1 −→ {P }0 |{0}1 and {P }0 ≈ired {τ }0,1
{Q}0 , we have {Q}0 |{xy.0}1 −→ {Q }0 |{0}1 and {P }0 ≈ired {Q }0 . By the xy
construction of {xy.0}1 , we have Q −→ Q , and (P , Q ) ∈ R. xy
{τ }0,1
Case (3): P −→ P . Since {P }0 |{x(w).wt.0}1 −→ {P }0 |{yt.0}1 , by {τ }0,1
the definition of ≈ired , we have {Q}0 |{x(w).wt.0}1 −→ {Q }0 |{zt.0}1 and
{τ }1,2
{P }0 |{yt.0}1 ≈ired {Q }0 |{zt.0}1 . Furthermore {P }0 |{yt.0}1 |{y(w)}2 −→
A Uniform Reduction Equivalence for Process Calculi
193
{τ }1,2
{P }0 , by the definition of ≈ired , we have {Q }0 |{zt.0}1 |{y(w)}2 −→ {Q }0 xy
and {P }0 ≈ired {Q }0 . Hence y = z, Q −→ Q and (P , Q ) ∈ R. Case (4): P
{τ }0,1
(νy)xy
−→ P . Since {P }0 |{x(w).wt.0}1 −→ (νy)({P }0 |{yt.0}1 ) {τ }0,1
for any w, t, by the definition of ≈ired , we have {Q}0 |{x(w).wt.0}1 −→ QC and (νy)({P }0 |{yt.0}1 ) ≈ired QC. So Q must perform an output action xy
(νy)xy
through channel x, and there are two cases: Q −→ Q or Q −→ Q . xy
Now we prove the former case holds, suppose not, Q −→ Q . Since (νy)({P }0 |{yt.0}1 )|{z(s).0}2 can not perform {τ }1,2 for any z, we xy
have QC|{z(s).0}2 can not perform {τ }1,2 for any z. But since Q −→ Q , we have {τ }0,1
{τ }1,2
{Q}0 |{x(w).wt.0}1 |{y(s).0}2 −→ {Q }0 |{yt.0}1 |{y(s).0}2 −→ {Q }0 , here {Q }0 |{yt.0}1 |{y(s).0}2 can perform {τ }1,2 . (νy)xy
It is contrary to {P }0 ≈ired {Q}0 , so Q −→ Q . Hence {Q}0 |{x(w).wt.0}1
{τ }0,1
−→ (νy)({Q }0 |{yt.0}1 ) and (νy)({P }0 |{yt.0}1 ) ≈ired (νy)({Q }0 |{yt.0}1 ) which implies {P }0 ≈ired {Q }0 , hence we have (P , Q ) ∈ R.
Appendix B. Proof of Proposition 3 Definition B1. Let C be a context (or process if disregarding hole []), we say that M is an indexed context (or indexed process) w.r.t. C if one of the following cases holds: (1) C = [] and M = []. (2) C = 0 and M = 0. (3) C = X and M = X. (4) C = α.C1 and M = {α}i .M1 , here M1 is an indexed context w.r.t. C1 . (5) C = C1 |C2 and M = M1 |M2 , here M1 (M2 ) is an indexed context w.r.t. C1 (C2 ). (6) C = n[C1 ] and M = n[M1 ], here M1 is an indexed context w.r.t. C1 . (7) C = (νn)C1 and M = (νn)M1 , here M1 is an indexed context w.r.t. C1 . (8) C = recX.C1 and M = recX.M1 , here M1 is an indexed context w.r.t. C1 . Example: m[{in m.0}1 .p[{in p.0}2 ]]|{open n.0}1 is an indexed process w.r.t. m[in m.0.p[in p.0]]|open n.0. Definition B2. Let P −→ P be an one-step reduction from P to P , and i,j IP (IP ) is indexed process w.r.t. P (P ), then we say that IP −→ IP is a corresponding one-step indexed reduction w.r.t. P −→ P if one of the following cases holds: (1) P = n[in m.P1 |P2 ]|m[in m.Q1 |Q2 ] −→ m[n[P1 |P2 ]|Q1 |Q2 ] = P and i,j
IP = n[{in m}i .IP1 |IP2 ]|m[{in m}j .IQ1 |IQ2 ] −→ m[n[IP1 |IP2 ]|IQ1 |IQ2 ] = IP , here IP1 (IP2 , IQ1 , IQ2 ) is an indexed process w.r.t. P1 (P2 , Q1 , Q2 ).
194
Z. Cao
(2) P = m[n[out m.P1 |P2 ]|out m.Q1 |Q2 ] −→ n[P1 |P2 ]|m[Q1 |Q2 ] = P and i,j
IP = m[n[{out m}i .IP1 |IP2 ]|{out m}j .IQ1 |IQ2 ] −→ n[IP1 |IP2 ]|m[IQ1 |IQ2 ] = IP , here IP1 (IP2 , IQ1 , IQ2 ) is an indexed process w.r.t. P1 (P2 , Q1 , Q2 ). (3) P = open n.P1 |n[open n.Q1 |Q2 ] −→ P1 |Q1 |Q2 = P and IP = {open n}i . i,j
IP1 |n[{open n}j .IQ1 |IQ2 ] −→ IP1 |IQ1 |IQ2 = IP , here IP1 (IQ1 , IQ2 ) is an indexed process w.r.t. P1 (Q1 , Q2 ). i,j
(4) IQ −→ IQ is a corresponding one-step indexed reduction w.r.t. Q −→ Q , here P ≡ Q, P ≡ Q , IP ≡ IQ, IP ≡ IQ . i,j (5) P = R|Q −→ R |Q = P , IR −→ IR is a corresponding one-step indexed i,j
reduction w.r.t. R −→ R , IQ is an indexed process w.r.t Q and IP = IR|IQ −→ IR |IQ = IP . i,j
(6) P = (νn)Q −→ (νn)Q = P , IQ −→ IQ is a corresponding one-step i,j
indexed reduction w.r.t. Q −→ Q and IP = (νn)IQ −→ (νn)IQ = IP . i,j
(7) P = n[Q] −→ n[Q ] = P , IQ −→ IQ is a corresponding one-step i,j
indexed reduction w.r.t. Q −→ Q and IP = n[IQ] −→ n[IQ ] = IP . i,j
(8) P = recX.Q −→ Q = P , IQ{recX.IQ/X} −→ IQ is a corresponding i,j
one-step indexed reduction w.r.t. Q{recX.Q/X} −→ Q and IP = recX.IQ −→ IQ = IP . i1 ,j1
i2 ,j2
in−1 ,jn−1
Definition B3. We say that M1 −→ M2 −→... −→ Mn is a corresponding indexed reduction w.r.t. P1 −→ P2 −→ ... −→ Pn , if for every k, Mk is an ik ,jk indexed process w.r.t. Pk and Mk −→ Mk+1 is a corresponding one-step indexed reduction w.r.t. Pk −→ Pk+1 . c , P ≈red Q ⇒ P ≈sa Q. Proposition 3. For any P , Q ∈ P rSA
Proof. Let R = {(P, Q) : IP ≈ired IQ, here IP (IQ) is an indexed process w.r.t. P (Q)}. We need to prove that for any context C, (1) if C[P ] R C[Q] and C[P ] −→*P , then C[Q] −→*Q and P R Q ; (2) if C[P ] R C[Q] and C[P ] ⇓ n then C[Q] ⇓ n. (1): For any context C, if C[P ] −→*P , then we have a corresponding indexed i1 ,j1
in ,jn
reduction w.r.t. C[P ] −→*P as follows: IC[IP ] −→... −→ IP , here IP is an indexed process w.r.t. P . i i 1 j1 n jn Since IP ≈ired IQ, we have IC[IQ] =⇒...=⇒ IQ and IP ≈ired IQ , here i,j i1 ,i1 i,j in ,in i1 ,i1 in ,in =⇒ means −→...−→... −→ if i = j, otherwise −→... −→. Therefore there is Q s.t. C[Q] −→*Q and P R Q . (2): For arbitrary context C, if C[P ] ⇓ n, then C[P ] −→*P , P ≡ (ν k)(n[µ.P1 |P2 ]|P3 ) where µ ∈ {in n, open n} and n ∈ / { k}. (a) Suppose P ≡ (ν k)(n[in n.P1 |P2 ]|P3 ), we have a corresponding indexed reduction w.r.t. C[P ]|m[in n.0] −→* P |m[in n.0] −→ (ν k)(n[m[0]|P1 |P2 ]|P3 ):
A Uniform Reduction Equivalence for Process Calculi
195
i1 ,j1 in ,jn IC[IP ]|m[{in n}i .0] −→... −→ (ν k)(n[{in n}j .IP1 |IP2 ]|IP3 )|m[{in n}i .0] i,j −→ (ν k)(n[m[0]|IP1 |IP2 ]|IP3 ), here IP1 (IP2 , IP3 ) is an indexed process w.r.t. P1 (P2 , P3 ), index i is different from indices in IC and IP . i i i,j 1 ,j1 n ,jn Since IP ≈ired IQ, we have IC[IQ]|m[{in n}i .0] =⇒ ... =⇒ N =⇒ N . Since the unique occurrence of index i is in m[{in n}i .0], by the reduction k1 ,k1 kn ,kn from N to N , we have N −→ ... −→ (ν k)(n[{in n}j .IQ1 |IQ2 ]|m[{in n}i .0]. Therefore C[Q] ⇓ n. (b) Suppose P ≡ (ν k)(n[open n.P1 |P2 ]|P3 ), then we have a corresponding indexed reduction w.r.t. C[P ]|open n.0 −→*P |open n.0 −→ (ν k)(P1 |P2 |P3 |0): i1 ,j1 in ,jn k)(n[{open n}j .IP1 |IP2 ]|IP3 )|{open n}i .0 IC[IP ]|{open n}i .0 −→... −→ (ν i,j
−→ (ν k)(IP1 |IP2 |IP3 |0), here IP1 (IP2 , IP3 ) is an indexed process w.r.t. P1 (P2 , P3 ), index i is different from indices in IC and IP . i i i,j 1 ,j1 n ,jn Since IP ≈ired IQ, we have IC[IQ]|{open n}i .0 =⇒ ... =⇒ N =⇒ N . Since the unique occurrence of index i is in {open n}i .0, by the reduction from k1 ,k1 kn ,kn k)(n[{open n}j .IQ1 |IQ2 ]|{open n}i .0. N to N , we have N −→ ... −→ (ν Therefore C[Q] ⇓ n.
Appendix C Proof of Lemma 2 Lemma 2. m[in m.p[in p.0]] ≈sa m[in m.q[in q.0]]. Proof. Let R = {(C[m[in m.p[in p.0]]], C[m[in m.q[in q.0]]]): here C is an arbitrary context} ∪ {(C[m[s1 [S1 ]|...|si [Si ]]], C[m[t1 [T1 ]|...|tj [Tj ]]]): here C is an arbitrary context, S1 , ..., Si , T1 , ..., Tj are arbitrary processes and s1 , ..., si , t1 , ..., tj are arbitrary ambients}. We want to prove that R is a contextual barbed bisimulation, i.e., if P R Q then for any context D, (1) D[P ] −→*P implies D[Q] −→*Q and P R Q ; (2) For each ambient n, D[P ] ⇓ n implies D[Q] ⇓ n. We only prove claim (1), proof of claim (2) is similar. Given an arbitrary context D, let E[] ≡ D[C[]]. Case (a): If E[] −→*E [], then E[m[in m.p[in p]]] −→*E [m[in m. p[in p]]] implies E[m[in m.q[in q]]] −→*E [m[in m.q[in q]]] and E [m[in m. p[in p]]] R E [m[in m.q[in q]]]. Case (b): If E[] −→*F [n[in m.S1 |S2 ]|[]], then we have E[m[in m.p[in p]]] −→* F [m[n[S ]|p[in p]]], which implies E[m[in m.q[in q]]] −→*F [m[n[S ]|q[in q]]] and F [m[n[S ]|p[in p]]] R F [m[n[S ]|q[in q]]]. Case (c): If E[] −→*E [] and m[s1 [S1 ]|...|si [Si ]] −→*m[u1 [U1 ]|...|ue [Ue ]], then E[m[s1 [S1 ]|...|si [Si ]]] −→*E [m[u1 [U1 ]|...|ue [Ue ]]] implies E[m[t1 [T1 ]|...|tj [Tj ]]] −→*E [m[v1 [V1 ]|...|vf [Vf ]]] and E [m[u1 [U1 ]|...|ue [Ue ]]] R E [m[v1 [V1 ]|...|vf [Vf ]]].
Substructural Operational Semantics and Linear Destination-Passing Style (Invited Talk) Frank Pfenning Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
[email protected] We introduce substructural operational semantics (SSOS), a presentation form for the semantics of programming languages. It combines ideas from structural operational semantics and type theories based on substructural logics (such as linear logic) in order to obtain a rich, uniform, and modular framework. We illustrate SSOS with a sequence of specifications, starting from a simple functional language presented in linear destination-passing style (LDPS). Next we show how to extend the first specification modularly (that is, by adding new rules for new constructs without changing earlier rules) to treat imperative and concurrent constructs. We briefly compare our means of achieving modularity with that of modular structural operational semantics [1] and contextual semantics [2]. We then discuss how structural properties of configurations (on which the operational semantics is defined) are related to structural properties of various forms of hypothetical judgments originating in the study of linear logic and type theory. Ordered, linear, affine, and unrestricted hypothetical judgments can be used to characterize and classify semantic specifications. We are currently investigating the meta-theory of SSOS, and to what extent modularity in specifications carries over to modularity in the proof of properties such as type preservation and progress. Many SSOS specifications can be realized immediately in the concurrent logical framework (CLF). In fact, SSOS arose from the early specifications of Concurrent ML and the π-calculus in CLF [3].
References 1. Mosses, P.D.: Modular structural operational semantics. Journal of Logic and Algebraic Programming 60–61 (2004) 195–228 2. Wright, A.K., Felleisen, M.: A syntactic approach to type soundness. Information and Computation 115 (1994) 38–94 3. Cervesato, I., Pfenning, F., Walker, D., Watkins, K.: A concurrent logical framework II: Examples and applications. Technical Report CMU-CS-02-102, Department of Computer Science, Carnegie Mellon University (2002) Revised May 2003. W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, p. 196, 2004. c Springer-Verlag Berlin Heidelberg 2004
PType System: A Featherweight Parallelizability Detector Dana N. Xu1 , Siau-Cheng Khoo1 , and Zhenjiang Hu2,3 1 School of Computing, National University of Singapore {xun,khoosc}@comp.nus.edu.sg 2 University of Tokyo, 3 PRESTO 21, Japan Science and Technology Corporation
[email protected] Abstract. Parallel programming is becoming an important cornerstone of general computing. In addition, type systems have significant impact on program analysis. In this paper, we demonstrate an automated typebased system that soundly detects parallelizability of sequential functional programs. Our type inference system discovers the parallelizability property of a sequential program in a modular fashion, by exploring a ring structure among the program’s operators. It handles self-recursive functions with accumulating parameters, as well as a class of non-linear mutual-recursive functions. Programs whose types are inferred to be parallelizable can be automatically transformed to parallel code in a mutumorphic form – a succint model for parallel computation. Transforming into such a form is an important step towards constructing efficient data parallel programs.
1
Introduction
Many computational or data-intensive applications require performance level attainable only on parallel architectures. As multiprocessor systems have become increasingly available and their price/performance ratio continues to improve, interest has grown in parallel programming. While sequential programming is already a challenging task for programmers, parallel programming is much harder as there are many more issues to consider, including available parallelism, task distribution, communication overheads, and debugging. A desirable approach for parallel program development is to start with a sequential program, test and debug the sequential program and then systematically transform the program to its parallel counterpart. In the functional programming community, functions are usually defined recursively, and it is an open problem whether a general and formal method exists to parallelize any sequential recursive definition. One practically useful approach is the skeletal approach [20, 9], where two restrictions have been imposed on function definitions: W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 197–212, 2004. c Springer-Verlag Berlin Heidelberg 2004
198
D.N. Xu, S.-C. Khoo, and Z. Hu
1. The operators used in the higher order functions should satisfy the associative property. 2. Programs should be expressed in some restrictive recursive forms captured by the higher order functions such as map, reduce, scan, etc. In this paper, we propose a parallelizability detection methodology that alleviates these restrictions. Specifically, we demonstrate a system, called Parallelizable Type System (PType system in short), in which parallelizability of sequential recursive code can be detected through automatic program analysis. By parallelizability, we mean that there exists a parallel code with time complexity that is of order O(log m / m) faster than its sequential counterpart, where m is the size of the input data. To alleviate the first restriction, we introduce a type inference system that discovers the extended-ring property of the set of operators used in a program. We show that this property ensures parallelization of a program. Through our system, users need not know how associative operators are combined to enable parallelization. This separation of concern will greatly facilitate parallelization process. To remove the second restriction, our system accepts any first-order functional programs with strict semantics. If a program passes the type checking phase, it can be automatically converted to parallel codes. Otherwise, the program will remain as it is. For example, consider the following polynomial function definition: poly [a] c = a poly (a : x ) c = a + c × (poly x c)
In the skeletal approach, we have to introduce a (non-intuitive) combining operator comb2 (which is associative). Thus, the revised definition of poly is: poly xs c = fst (polytup xs c) polytup [a] c = (a, c) polytup (a : x ) c = (a, c) ‘comb2‘ (polytup x c) where comb2 (p1 , u1 ) (p2 , u2 ) = (p1 + p2 ∗ u1 , u2 ∗ u1 )
As this revised definition matches the following skeleton, parallelization is thus guaranteed. poly xs c = fst (reduce comb2 (map (\ x → (x , c)) xs))
On the other hand, our PType system can detect that the sequential definition of poly is parallelizable. It infers that the expression (a + c × (poly x c)) has the type R[+,×] . This implies that + and × in R[+,×] exhibit an extended-ring property. The corresponding parallel code for poly is as follows. poly [a] c = a poly (xl ++ xr ) c = poly xl c + (prod xl c) × (poly xr c) prod [a] c = c prod (xl ++ xr ) c = (prod xl c) × (prod xr c)
PType System: A Featherweight Parallelizability Detector
199
An algorithm that automatically transforms a well-PTyped sequential program to an efficient homomorphism, a desired parallel computation model [21], can be found in [23]. In our implementation, the system handles first-order functional programs. It is able to parallelize a wide class of recursively-defined functions with accumulating parameters and with non-linear recursion. For clarity of the presentation, we first illustrate the system without these two features in Section 4.1 and discuss them separately in Section 4.2. The main technical contributions of this work are as follows: 1. We propose an extended ring property of operators used in sequential programs, which guarantees the parallelizability of these programs. This frees programmers from the burden of finding a skeleton form. 2. We propose a novel and featherweight type inference system for detecting parallelizability of sequential programs in a modular fashion. We believe this is the first work on capturing parallelism in a type inference context. The outline of the paper is as follows. In the next section, we describe the syntax of the language used, and the background of our work. Section 3 provides our account of the parallelizability property. The discovery of parallelizability using a type system is described in Section 4. We illustrate the working of the PType system through examples in Section 5. Section 6 describes our implementation. Finally, we discuss the related work and conclude the paper in Section 7.
2
Background
The PType system operates on a first-order typed functional language with strict semantics. The syntax of our source language is given in Figure 1. To aid the type inference, programmers are required to provide as annotatations properties of user-defined binary operators used in a program. Such requirements are typical for achieving reduction-style parallelism. For example, the system-defined annotation #(Int, [+, ×], [0, 1]) is needed for the function definition poly . The annotation tells the system that, for all integers, operators + and × satisfy the extended-ring property with 0 and 1 as their respective identities. Function definitions in this paper are written in Haskell syntax [15]. For the remainder of the paper, we shall discuss detection of parallelism for recursive functions of the form f (a : x) = E[ti m i=1 , q x, f x]
where f is inductively defined on a list and E [ ] denotes an expression context with three groups of holes, denoted by . The context itself contains no occurrence of references to a , x and f . ti m i=1 is a group of m terms, each of which is allowed to contain occurrences of a , but not those of references to (f x ). The q x denotes
200
D.N. Xu, S.-C. Khoo, and Z. Hu τ ∈ Typ Types n ∈ Cons Constants c ∈ Con Data Constructors v ∈ Var Variables ⊕ ∈ Op Binary Primitive Operators γ ∈ Ann Annotations γ ::= #(τ, [⊕1 , . . . , ⊕n ], [ι⊕1 , . . . , ι⊕n ]) e, t ∈ Exp Expressions e, t ::= n | v | c e1 . . . en | e1 ⊕ e2 | if e0 then e1 else e2 | f e1 . . . en | let v = e1 in e2 p ∈ Pat Patterns σ ∈ Prog Programs σ ::= γi∗ , (fi p1 . . . pn = e)∗ ∀ i. i ≥ 1 p ::= v | c v1 . . . vn where f1 is the main function. Fig. 1. Syntax of the source language
an application of a parallelizable auxiliary function.1 Lastly, f x is the selfrecursive call. For example, given the function definition f1 (a : x ) = if a > 0 then length x + f1 x else 1 + f1 x
we have f1 (a : x ) = E [a > 0, 1, length x , f x ] where E [t1 , t2 , t3 , t4 ] = if t1 then t3 + t4 else t2 + t4
As our analysis focuses on the syntactic expressions consisting of recursive calls, all variables directly or indirectly referencing an expression consisting of recursive call(s) need to be traced. We call such variables references to a recursive call, which is formally defined below: Definition 1 (Reference to a Recursive Call). A variable v is a reference to a recursive call if the evaluation of v leads to an invocation of that call. Consider the following two function definitions: f2 (a : x ) = let v2 = 1 + f2 x in a + v2 f3 (a : x ) = let v3 = 1 + f3 x in let u = 2 + v3 in a + u,
Variable v2 is a reference to the recursive call (f2 x ) as it names an expression which encloses a recursive call. In f3 , variables u and v3 are references to the recursive call (f3 x ). Variable u indirectly references the recursive call since it contains v3 . For ease of the presentation, we focus our attention on recursive function definitions that are linear self-recursive (and discuss the handling of non-linear 1
It is possible to consider applications of multiple parallelizable auxiliary functions in an expression, as in qj x nj=1 . These functions are examples of mutumorphism [14]. Their calls can be tupled to obtain a single (q x ) via the technique described in [4, 13].
PType System: A Featherweight Parallelizability Detector
201
and mutually recursive functions in Section 4.2.) Furthermore, we do not consider functions with self-recursive calls occurring in the test of a conditional. Parallelization of such functions requires these functions to be annotated with a special (constraint) form of the extended-ring property [6], which are not described in this paper. Context Preservation. Our parallelization process is inspired from a program restructuring technique known as context preservation [8]. We briefly describe the technique here. Consider the polynomial function definition again. Context preservation is performed primarily on the recursive equation of poly : poly (a : x) c = a + c × (poly x c)
A contextual function (or context, for short) will extract away the recursive subterm of the RHS of this equation. It can be written as λ (•) . α + β × (•). Here, the symbol • denotes a recursive subterm containing an occurrence of a self-recursive call, while α and β denote subterms that do not contain any recursive call. Such a context is said to be context preserving modulo replication (or context preserving, in short) if after composing the context with itself, we can obtain (by transformation) a resulting context that has the same form as the original context. Context preservation guarantees that the underlying function can be parallelized. Theorem 1 (Context Preservation Theorem [8, 14]). Given is a recursive function f of the form f (a : x ) = e where expression e consists of recursive call(s). If e is context preserved, then f can be parallelized. For function poly , let its context be denoted by λ (•) . α1 + β1 × (•)). We compose this context with its renamed copy, (λ(•) . α2 + β2 × (•)), and simplify the composition through a sequence of transformation steps: (λ (•) . α1 + β1 × (•)) ◦ (λ(•) . α2 + β2 × (•)) — function composition = λ (•) . α1 + β1 × (α2 + β2 × (•)) — × is distributive over + = λ (•) . α1 + (β1 × α2 + β1 × (β2 × (•))) — +, × being associative = λ (•) . (α1 + β1 × α2 ) + (β1 × β2 ) × (•) = λ (•) . α + β × (•) where α = α1 + β1 × α2 and β = β1 × β2
Since the simplified form matches the original context, poly is context preserving. However, this transformation process, which is informally described in [5], is more expensive than our type-based approach. Moreover, context preservation checking is not modular, and thus lack of reusability.
3
Parallelizability
Given that context preservation leads to parallelizability, we focus on detecting context preservation of sequential programs, but in a modular fashion. Our first technical contribution is to introduce an extended ring property of the operators which guarantees automatic detection of context preservation.
202
D.N. Xu, S.-C. Khoo, and Z. Hu
sv ∈ S−Values sv ::= bv | if ζa then ζb else bv bv ::= • | (ζ1 ⊕1 . . . ⊕n−1 ζn ⊕n •) where [⊕1 , . . . , ⊕n ] possesses the extended-ring property
ζ ∈ C−Exp ζ ::= C[a, (q x)] where C is an arbitrary expression context not involving references to •
Fig. 2. Skeletal Values
Definition 2. Let S = [⊕1 , . . . , ⊕n ] be a sequence of n binary operators. We say that S possesses the extended-ring property iff 2 1. all operators are associative; 2. each operator ⊕ has an identity, ι⊕ , such that ∀ v : ι⊕ ⊕ v = v ⊕ ι⊕ = v ; 3. ⊕j is distributive over ⊕i ∀ 1 ≤ i < j ≤ n . As an example, in the non-negative integer domain, operators max , + and ×, in that order form an extended ring. Their identities are 0, 0 and 1 respectively. We now describe a set of “skeletons” (of expressions) which are constructed using a sequence of binary operators with the extended-ring property. We will show that expressions expressible in this “skeletal” form are guaranteed to be context preserving. We call them skeletal values (or s-values, in short). These are defined in Figure 2. We use • to denote a self-recursive call in a function definition. An s-value of the form (ζ1 ⊕1 . . . ⊕n−1 ζn ⊕n •)3 is said to be composed directly by the sequence of operators [⊕1 , . . . , ⊕n ] with the extended-ring property. An s-value of the form if ζ0 then ζ1 else bv is said to be in conditional form. Its self-recursive call occurs only in its alternate branch. The following lemma states that all s-values are context preserving. Consequently, any expression that can be normalized to an s-value can be parallelized. Lemma 1 (S-Values Are Context Preserved). Given a recursive part of a function definition f (a : x ) = e , if e is an s-value, then e can be context preserved. The proof is done by a case analysis on the syntax of s-values. Details can be found in [23]. It is worth-mentioning that s-values cover a wide class of recursive function definitions that are parallelizable. In the remainder of the paper, we will provide many practical sequential programs that can be expressed in, or normalized to an s-value, and thus be directly parallelized. 2
3
We can also extend this property to include semi-associative operators and their corresponding left or right identities. Such extension enables more sequential programs to be parallelized. By default, it is equivalent to (ζ1 ⊕1 (· · · ⊕n−1 (ζn ⊕n •) . . . )).
PType System: A Featherweight Parallelizability Detector
4
203
PType System
The main focus of the PType system is a type-inference system that enables discovery of parallelizability of sequential programs. Operationally, the type system aims to deduce the extended-ring property of a sequential program in a modular fashion. To this end, it associates each sub-expression in a recursive function definition with a type term from the type language PType. ρ ∈ PType ρ ::= ψ | φ
ψ ∈ NType ψ ::= N
φ ∈ RType φ ::= RS where S is a sequence of operators
Fig. 3. PType Expressions
The set of PType terms are defined in Figure 3. It comprises two categories: NType and RType. We write [[ρ]] to denote the semantics of PType ρ. Thus, [[N ]] = C−Exp,
where C−Exp is defined in Figure 2. Given that S = [op1 , . . . , opn ] with the extended-ring property, we have: [[RS ]] = {e | e ∗ e ∧ e is an s-value ∧ e is composable by operators in S},
where ∗ represents a normalization process that we have defined to obtain s-values. The core set of rules for the normalization process is in [23]. Since expressions of type RS (for some S ) can be normalized to an s-value, any expression containing a self-recursive call but could not be translated to an s-value is considered ill-typed in our PType system. As an illustration, the RHS of the self-recursive equation of the following function definition has ptype R[max ,+,×] . f6 (a : x) = 5 ‘max‘ (a + 2 × (f6 x)),
Note that in the definition of [[ RS ]], the expression e is said to be composable, rather than to be composed directly, by a set of operators. There are two reasons for saying that: 1. e need not simply be an s-value of bv category; it can also include conditionals and local abstractions, but its set of operators must be limited to S. 2. As operators in S have identities, we allow e to contain just a subset of operators in S . We can always extend e to contain all operators in S using their respective identities. The last point implies that the RType semantics enjoys the following subset relation: Lemma 2. Given two sequences of operators S1 and S2 , both with the extendedring property, if S1 is a subsequence of S2 , then [[RS1 ]] ⊆ [[RS2 ]].
204
D.N. Xu, S.-C. Khoo, and Z. Hu
The above lemma leads to the following subtyping relation: Definition 3 (Subtyping of RType). Given two sequences of operators S1 and S2 , both with the extended-ring property, we say RS1 is a subtype of RS2 , denoted by RS1 5 then a + f7 x else a × f7 x
Under the type assumption Γ = {a :: N , x :: N }, the types for each of the branches are R[+] and R[×] . By the rules (if-merge) and (sub), the type of the conditional becomes R[+,×] .
PType System: A Featherweight Parallelizability Detector v = κ (var − N) Γ ∪ {v :: N } κ v :: N (con)
Γ κ n :: N Γ κ e1 :: N Γ κ e0 :: N
v = κ Γ ∪ {v :: RS } κ v :: RS Γ (f x) (f x) :: RS
205
(var − R)
(rec)
Γ κ e2 :: ρ (ρ = N ) ∨ (ρ = RS ∧ ⊕ ∈ S) Γ κ (e1 ⊕ e2 ) :: ρ
(op)
Γ κ e1 :: ρ1 Γ κ e2 :: ρ2 if (ρ, ρ1 , ρ2 ) Γ κ (if e0 then e1 else e2 ) :: ρ
(if)
Γ κ e1 :: N Γ ∪ {v :: N } κ e2 :: ρ Γ κ (let v = e1 in e2 ) :: ρ
(let − N)
Γ κ e1 :: RS Γ ∪ {v :: RS } v e2 :: RS Γ κ (let v = e1 in e2 ) :: RS
(let − R)
Γ κ e :: N g ∈ F V (κ) Γ κ (g e) :: N if (ρ, ρ, ρ)
(g)
if (RS , N, RS )
Γ κ e : ρ ρ 0 ∧ sbp x ((−1) + c) else sbp x c
Two annotations are needed to type-check this program. The annotation for operators of Bool is meant for type checking the function sbp , and that for operators of Int is for type checking the context of the accumulating parameter c . The context is computed as follows: C[[ RHS of sbp ]]c = if (a == ( )then 1 + c else if (a == ) )then (−1) + c else c
The PType inferred are : sbp :: N , c :: R[+] and sbp :: R[∧] . Note that, when we type check the function body of sbp , the PType of c is set to N .
208
D.N. Xu, S.-C. Khoo, and Z. Hu
Non-linear Mutual Recursion. We extend the PType system to cover a subset of non-linear recursive functions with an additional requirement that the binary operators must be commutative. This additional requirement is typical for research in the parallelization of non-linear recursive functions. To parallelize a set of non-linear mutual recursive functions, we group these functions into a tuple and type-check them together. Thus, we extend κ in κ to become a set of mutual-recursive calls. Consider the following mutually defined recursive functions: fi (a : x ) = ei ∀ i ∈ {1, . . . , m} where ∀ i ∈ {1, . . . , m} : ei = pi1 ⊕ (pi2 ⊗ f1 x ) ⊕ . . . ⊕ (pim ⊗ fm x ) ∀ j ∈ {1, . . . , m} : pij = gij a (qj x )
Here, functions gij are arbitrary functions (i.e., arbitrary contexts) involving a and (qj x ), ∀i, j ∈ {1, . . . , m}. Before type checking, we group the function definitions into a tuple: (f1 , . . . , fm ) = (e1 , . . . , em ). For all j ∈ {1, . . . , m}, type check ej with rules defined in Figure 4, together with the (op-RR) rule and type check the tuple (e1 , . . . , em ) using the (nonlinear) rule. S = ⊕ : S (length S) ≤ 2 ⊕ is commutative Γ {(f1 x),...,(fm x)} e2 :: RS Γ {(f1 x),...,(fm x)} e1 :: RS (op − RR) Γ {(f1 x),...,(fm x)} (e1 ⊕ e2 ) :: RS Γ {(f1 x),...,(fm x)} ej :: RS ∀ j ∈ {1, . . . , m} (nonlinear) Γ {(f1 x),...,(fm x)} (e1 , . . . , em ) :: RS
Example: Fibonacci. For the following non-linear recursive definition of the Fibonacci function, lfib [ ] = 1 lfib (a : x ) = lfib x + lfib x
lfib [ ] = 0 lfib (a : x ) = lfib x
we sketch below the type checking process: Γ ∪ {a :: N , x :: N } {(lfib x ),(lfib x )} Γ ∪ {a :: N , x :: N } {(lfib x ),(lfib x )} {(lfib x ),(lfib x )} Γ ∪ {a :: N , x :: N } {(lfib x ),(lfib x )}
(lfib x + lfib x ) :: R[+] (lfib x ) :: R[ ] (lfib x ) :: R[+] — since R[ ] .
Automatic Generation of Editors for Higher-Order Data Structures
265
must be created. The second is the initial value of type t of the editor. The third is a callback function of type t → env → env. This callback function tells the editor which parts of the program need to be informed of user actions. The editor uses this function to respond to changes to the value of the editor. ::3 GECDef t env :==4 (String, t , CallBackFunction t env) :: CallBackFunction t env :== t → env → env
The (GECInterface t env) is a record that contains all methods of the newly created GECt . :: GECInterface t env = { gecGetValue :: env → (t , env) , gecSetValue :: t → env → env }5
The gecGetValue method returns the current value, and gecSetValue sets the current value of the associated GECt object. Programs can be constructed combining editors by tying together the various gecSetValues and gecGetValues. We are working on an arrow combinator library that abstracts from the necessary plumbing [5]. For the examples in this paper, it is sufficient to use the following tying function: selfGEC :: String (t → t) t (PSt ps) → (PSt ps) |6 gGEC{||} t selfGEC s f v env = env1 where ({gecSetValue} ,env1) = gGEC{||} (s , f v ,λx → gecSetValue (f x)) env
Given an f of type t → t on the data model of type t and an initial value v of type t, selfGEC gui f v creates the associated GECt using gGEC (hence the context restriction). selfGEC creates a feedback loop that sends every edited output value back as an input to the same editor, after applying the function f. Example 1: The standard appearance of a GEC is given by the following program that creates an editor for a self-balancing binary tree: module Editor import StdEnv, StdIO, StdGEC Start :: *World → *World Start world = startIO MDI Void myEditor world myEditor :: (PSt ps) → (PSt ps) myEditor = selfGEC "Tree" balance (Node Leaf 1 Leaf) :: Tree a = Node (Tree a) a (Tree a) | Leaf
In this example, we create a GECTree Int which displays the indicated initial value Node Leaf 1 Leaf (upper screen shot). The user can manipulate this 3 4 5 6
Type definitions are preceded by ::. :== introduces a synonym type. {f0 :: t0 , . . . , fn :: tn } denotes a record with field names fi and types ti . In a function type, | introduces all overloading class restrictions.
266
P. Achten et al.
value in any desired order, producing new values of type Tree Int (e.g., turning the upper Leaf into a Node with the pull-down menu). Each time a new value is created or edited, the feedback function balance is applied. balance takes a argument of type Tree a and returns the tree after balancing it. The shape and lay-out of the tree being displayed adjusts itself automatically. Default values are generated by the editor when needed. Note that the only things that need to be specified by the programmer are the initial value of the desired type, and the feedback function. In all remaining examples, we only modify myEditor and the type for which an instance of gGEC is derived. The tree example shows that a GECt explicitly reflects the structure of type t. For the creation of GUI applications, we need to model both specific GUI elements (such as buttons) and layout control (such as horizontal, vertical layout). This has been done by specializing gGEC [6] for a number of types that either represent GUI elements or layout. Here are the types and their gGEC specialization that are used in the examples in this paper: :: Display a = Display a // a non-editable GUI: e.g., . :: Hide a = Hide a // an invisible GUI, useful for state. :: UpDown = UpPressed | DownPressed | Neutral // a spin button:
3
.
Dynamically Typed Higher-Order GECs
In this section we show how to extend GECs with the ability to deal with functions and expressions. Because functions are opaque, the solution requires a means of interpreting functional expressions as functional values. Instead of writing our own parser/interpreter/type inference system we use the Esther shell [22] (Sect. 3.1). Esther enables the user to enter expressions (using a subset of Clean) that are dynamically typed, and transformed into values and functions using compiled code. It is also possible to reuse earlier created functions, which are stored on disk. Its implementation relies on the dynamic type system [1, 19, 23] of Clean. The shell uses a text-based interface, and hence it makes sense to create a special string-editor (Sect. 3.2), which converts any string into the corresponding dynamically typed value. This special editor has the same power as the Esther command interpreter and can deliver any dynamic value, including higher-order polymorphic functions. 3.1
Dynamics in Clean
A dynamic is a value of static type Dynamic, which contains an expression as well as a representation of its static type, e.g., dynamic 42 :: Int, dynamic map fst :: ∀a b: [ ( a , b ) ] → [ a ] . Basically, dynamic types turn every (first and higherorder) data structure into a first-order structure, while providing run-time access to the original type and value.
Automatic Generation of Editors for Higher-Order Data Structures
267
Function alternatives and case patterns can match on values of type Dynamic. Such a pattern match consists of a value pattern and a type pattern, e.g., [4 , 2] :: [ Int ] . The compiler translates a pattern match on a type into a run-time type unification. If the unification is successful, type variables in a type pattern are bound to the offered type. Applying dynamics at run-time will be used to create an editor that changes according to the type of entered expressions (Sect. 3.2, Example 2). dynamicApply :: Dynamic Dynamic → Dynamic dynamicApply (f :: a → b) (x :: a) = dynamic f x :: b dynamicApply df dx = dynamic "Error" :: String dynamicApply tests if the argument type of the function f, inside its first argument, can be unified with the type of the value x, inside the second argument. dynamicApply can safely apply f to x, if the type pattern match succeeds. It yields a value of the type that is bound to the type variable b by unification, wrapped in a dynamic. If the match fails, it yields a string in a dynamic. Type variables in type patterns can also relate to type variables in the static type of a function. A ^ behind a variable in a pattern associates it with the same type variable in the static type of the function. matchDynamic :: Dynamic → t | TC t matchDynamic (x :: t^) = x
The static type variable t, in the example above, is determined by the static context in which it is used, and imposes a restriction on the actual type that is accepted at run-time by matchDynamic. The function becomes overloaded in the predefined TC (type code) class. This makes it a type dependent function [19]. The dynamic run-time system of Clean supports writing dynamics to disk and reading them back again, possibly in another program or during another execution of the same program. This provides a means of type safe communication, the ability to use compiled plug-ins in a type safe way, and a rudimentary basis for mobile code. The dynamic is read in lazily after a successful run-time unification. The amount of data and code that the dynamic linker links is, therefore, determined by the evaluation of the value inside the dynamic. writeDynamic :: String Dynamic env → (Bool, env) | FileSystem env readDynamic :: String env → (Bool, Dynamic, env) | FileSystem env
Programs, stored as dynamics, have Clean types and can be regarded as a typed file system. We have shown that dynamicApply can be used to type check any function application at run-time using the static types stored in dynamics. Combining both in an interactive ‘read expression – apply dynamics – evaluate and show result’ loop, already gives a simple shell that supports the type checked run-time application of programs to documents. The composeDynamic function below, taken from the Esther shell, applies dynamics and infers the type of an expression. composeDynamic :: String env → (Dynamic, env) | FileSystem env showValueDynamic :: Dynamic → String
268
P. Achten et al.
composeDynamic expr env parses expr. Unbound identifiers in expr are resolved by reading them from the file system. In addition, overloading is resolved. Using the parse tree of expr and the resolved identifiers, the dynamicApply function is used to construct the (functional) value v and its type τ . These are packed in a dynamic v :: τ and returned by composeDynamic. In other words, if env expr :: τ and [[expr]]env = v then composeDynamic expr env = (v :: τ , env). The showValueDynamic function yields a string representation of the value inside a dynamic.
3.2
Creating a GEC for the Type Dynamic
With the composeDynamic function, an editor for dynamics can easily be constructed. This function needs an appropriate environment to access the dynamic values and functions (plug-ins) that are stored on disk. The standard (PSt ps) environment used by the generic gGEC function (Sect. 2) is such an environment. This means that we can simply use composeDynamic in a specialized editor to offer the same functionality as the command line interpreter. Instead of Esther’s console we use a String editor as interface to the application user. In addition we need to convert the provided string into the corresponding dynamic. We therefore define a composite data type DynString and a specialized gGEC-editor for this type (a GECDynString ) that performs the required conversions. :: DynString = DynStr Dynamic String
The choice of the composite data type is motivated mainly by simplicity and convenience: the string can be used by the application user for typing in the expression. It also stores the original user input, which cannot be extracted from the dynamic when it contains a function. Now we specialize gGEC for this type DynString. The complete definition of gGEC{|DynString|} is given below. gGEC{|DynString|} (gui , DynStr _ expr, dynStringUpdate) env 7 (stringGEC, env) = gGEC{||} (gui , expr, stringUpdate dynStringUpdate) env = ({ gecSetValue = dynSetValue stringGEC.gecSetValue , gecGetValue = dynGetValue stringGEC.gecGetValue } ,env) where dynSetValue stringSetValue (DynStr _ expr) env = stringSetValue expr env dynGetValue stringGetValue env (nexpr, env) = stringGetValue env (ndyn, env) = composeDynamic nexpr env = (DynStr ndyn nexpr, env) stringUpdate dynStringUpdate nexpr env (ndyn, env) = composeDynamic nexpr env = dynStringUpdate (DynStr ndyn nexpr) env
The created GECDynString displays a box for entering a string by calling the standard generic gGEC{||} function for the value expr of type String, yielding a 7
This is Clean’s ‘do-notation’ for environment passing.
Automatic Generation of Editors for Higher-Order Data Structures
269
stringGEC. The DynString-editor is completely defined in terms of this Stringeditor. It only has to take care of the conversions between a String and a DynString. This means that its gecSetValue method dynSetValue simply sets the string component of a new DynString in the underlying String-editor. Its gecGetValue method dynGetValue retrieves the string from the String-editor, converts it to the corresponding Dynamic by applying composeDynamic, and combines these two values in a DynString-value. When a new string is created by the application user, the callback function stringUpdate is evaluated, which invokes the callback function dynStringUpdate (provided as an argument upon creation of the DynString-editor), after converting the String to a DynString. It is convenient to define a constructor function mkDynStr that converts any input expr, that has value v of type τ , into a value of type DynString guaranteeing that if v :: τ and [[expr]] = v, then (DynStr (v::τ ) expr) :: DynString. mkDynStr :: a → DynString | TC a mkDynStr x = let dx = dynamic x in DynStr dx (showValueDynamic dx)
Example 2: We construct an interactive editor that can be used to test functions. It can be a newly defined function, say λx → x^2, or any existing function stored on disk as a Dynamic. Hence the tested function can vary from a small function, say factorial, to a large complete application. :: MyRecord = { function :: DynString , argument :: DynString , result :: DynString } myEditor = selfGEC "test" guiApply (initval id 0) where initval f v = { function = mkDynStr f , argument = mkDynStr v , result = mkDynStr (f v) } guiApply r=:8 { function = DynStr (f::a → b) _ , argument = DynStr (v::a) _ } = {r &9 result = mkDynStr (f v)} guiApply r = r
The type MyRecord is a record with three fields, function, argument, and result, all of type DynString. The user can use this editor to enter a function definition and its argument. The selfGEC function will ensure that each time a new string is created with the editor "test", the function guiApply is applied that provides a new value of type MyRecord to the editor. The function guiApply tests, in a similar way as the function dynamicApply (see Sect. 3.1), whether the type of the supplied function and argument match. If so, a new result is calculated. If not, nothing happens. This editor can only be used to test functions with one argument. What happens if we edit the function and the argument in such a way that the result is not a plain value but a function itself? Take, e.g., as function the twice function 8 9
x =:e binds x to e. {r & f0 =v0 ,..., fn =vn } is a record equal to r, except that fields fi have value vi .
270
P. Achten et al.
λf x → f (f x) , and as argument the increment function ((+) 1) . Then the result is also a function λx → ((+) 1) ((+) 1 x) . The editor displays as
result. There is no way to pass an argument to the resulting function. With an editor like the one above, the user can enter expressions that are automatically converted into the corresponding Dynamic value. As in the shell, unbound names are expected to be dynamics on disk. Illegal expressions result in a Dynamic containing an error message. To have a properly higher-order dynamic application example, one needs an editor in which the user can type in functions of arbitrary arity, and subsequently enter arguments for this function. The result is then treated such that, if it is a function, editors are added dynamically for the appropriate number of arguments. This is explained in the following example. Example 3: We construct a test program that accepts arbitrary expressions and adds the proper number of argument editors, which again can be arbitrary expressions. The number of arguments cannot be statically determined and has to be recalculated each time a new value is provided. Instead of an editor for a record, we therefore create an editor for a list of tuples. Each tuple consists of a string used to prompt to the user, and a DynString-value. The tuple elements are displayed below each other using the predefined list editor vertlistAGEC and access operator ^^, which will be presented in Sect. 4.1. The selfGEC function is used to ensure that each change made with the editor is tested with the guiApply function and the result is shown in the editor. myEditor = selfGEC "test" (guiApply o (^^)) (vertlistAGEC [ show "expression " 0]) where guiApply [ f=:(_ , ( DynStr d _)):args ] = vertlistAGEC [ f:check (fromDynStr d) args ] where check (f::a → b) [ arg=:(_ , DynStr (x::a) _):args ] = [ arg : check (dynamic f x) args ] check (f::a → b) _ = [ show "argument " "??" ] check (x::a) _ = [ show "result " x] show s v = (Display s , mkDynStr v)
The key part of this example is the function check, which calls itself recursively on the result of the dynamic application. As long as the function and argument match and the resulting type is still a function, it requires another argument, which is checked for type consistency in turn. If function and argument do not match, "??" is displayed, and the user can try again. As soon as the result is a plain value, it is evaluated and shown using the data constructor Display, which creates a non-editable editor that just displays its value. With this editor, any higher-order polymorphic function can be entered and tested.
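To make the behavior concrete, consider a session with this editor (an illustrative walkthrough, not taken from the paper). The user enters the expression λf x → f (f x); since its type is a function type, an argument editor is added. Entering ((+) 1) makes a second argument editor appear, because the partial application is still a function. Entering 3 then produces a plain value, and the editor displays result 5, since ((+) 1) ((+) 1 3) = 5.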
4 Statically Typed Higher-Order GECs
The editors presented in the previous section are flexible because they deliver a Dynamic (packed into the type DynString). Their disadvantage is that the programmer has to program a check on the type consistency of the resulting Dynamics, such as the check function in the previous example.
In many applications it is statically known what the type of a supplied function must be. In this section we show how the run-time type check can be replaced by a compile-time check, using the abstraction mechanism for GECs. This gives us a second solution for higher-order data structures, one that is statically typed and therefore allows type-directed generic GUI creation.
4.1 Abstract Graphical Editor Components
The generic function gGEC derives a GUI for its instance type. Because it is a generic function, the appearance of the GUI is completely determined by that type. In some cases this is much too rigid: one cannot use different visual appearances of the same type within a program. For this purpose abstract GECs (AGEC) [7] have been introduced. An instance of gGEC for AGEC has been defined. Therefore, an AGEC d can be used as a GEC d, i.e., it behaves as an editor for values of a certain domain, say of type d. However, an AGEC d never displays nor edits values of type d, but rather a view on values of this type, say of type v. Values of type v are shown and edited, and internally converted to values of domain d. The view is again generated automatically as a GEC v. To make this possible, the ViewGEC d v record is used to define the relation between the domain d and the view v.

:: ViewGEC d v = { d_val       :: d                   // initial domain value
                 , d_oldv_to_v :: d → (Maybe v) → v   // convert domain value to view value
                 , update_v    :: v → v               // correct view value
                 , v_to_d      :: v → d }             // convert view value to domain value
It should be noted that the programmer does not need to be knowledgeable about Object I/O programming to construct an AGEC d with a view of type v. The specification is only in terms of the data domains involved. The complete interface to AGECs is given below.

:: AGEC d                             // abstract data type
mkAGEC :: (ViewGEC d v) → AGEC d | gGEC{||} v
(^^)        :: (AGEC d) → d           // read current domain value
(^=) infixl :: (AGEC d) d → AGEC d    // set new domain value
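To give a feel for this interface, the following minimal sketch defines a trivial abstract editor whose view simply equals its domain (the name idAGEC is ours, for illustration; it is not part of the library of this paper):

idAGEC :: d → AGEC d | gGEC{||} d
idAGEC x = mkAGEC { d_val       = x
                  , d_oldv_to_v = λd _ → d   // show the domain value itself
                  , update_v    = id         // accept every edited view value as-is
                  , v_to_d      = id }       // the view value is the domain value

Reading ^^ (idAGEC 42) then yields 42, and (idAGEC 42) ^= 43 yields an abstract editor holding 43.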
The ViewGEC record can be converted to the abstract type AGEC using the function mkAGEC above. Because AGEC is an abstract data type, we need access functions to read (^^) and write (^=) its current value. AGECs allow us to define arbitrarily many editors gec_i :: AGEC d that each have a private implementation of type GEC v_i. Because AGEC is abstract, code that has been written for editors that manipulate some type containing AGEC d does not change when the value
of type AGEC d is exchanged for another AGEC d. This facilitates experimenting with various designs for an interface without changing any other code. We built a collection of functions creating abstract editors for various purposes. Below, we summarize only those functions of the collection that are used in the examples in this paper:

vertlistAGEC :: [a] → AGEC [a] | gGEC{||} a       // all elements displayed in a column
counterAGEC  :: a → AGEC a | gGEC{||}, IncDec a   // a special number editor
hidAGEC      :: a → AGEC a                        // identity, no editor
displayAGEC  :: a → AGEC a | gGEC{||} a           // identity, non-editable editor
The counter editor below is a typical member of this library.
counterAGEC :: a → AGEC a | gGEC{||}, IncDec a
counterAGEC j = mkAGEC { d_val       = j
                       , d_oldv_to_v = λi _ → (i, Neutral)
                       , update_v    = updateCounter
                       , v_to_d      = fst }
where
    updateCounter (n, UpPressed)   = (n+one, Neutral)
    updateCounter (n, DownPressed) = (n-one, Neutral)
    updateCounter (n, Neutral)     = (n, Neutral)
A programmer can use the counter editor as an integer editor, but because of its internal representation it presents the application user with an edit field combined with an up-down, or spin, button. The updateCounter function is used to synchronize the spin button and the integer edit field. The right part of the tuple is of type UpDown (Sect. 2), which is used to create the spin button.
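As a usage sketch (our own example, assuming the library above; the record type Person and its fields are hypothetical), such an abstract editor can be dropped into any data structure that is handled generically:

:: Person = { name :: String
            , age  :: AGEC Int }

myPersonEditor = selfGEC "person" id { name = "Alice", age = counterAGEC 0 }

The derived editor shows an ordinary text field for name and a spin-button editor for age, while the rest of the program can read the current age with ^^ without knowing which view is used.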
4.2 Adding Static Type Constraints to Dynamic GECs
The abstraction mechanism provided by AGECs is used to build type-directed editors for higher-order data structures, which check the type of the entered expressions dynamically. These statically typed higher-order editors are created using the function dynamicAGEC. The full definition of this function is specified and explained below.

dynamicAGEC :: d → AGEC d | TC d
dynamicAGEC x = mkAGEC { d_val       = x
                       , d_oldv_to_v = toView
                       , update_v    = updView x
                       , v_to_d      = fromView x }
where
    toView newx Nothing     = let dx = mkDynStr newx in (dx, hidAGEC dx)
    toView _    (Just oldx) = oldx

    fromView :: d (DynString, AGEC DynString) → d | TC d
    fromView _ (_, oldx) = case ^^oldx of DynStr (x::d^) _ → x

    updView :: d (DynString, AGEC DynString) → (DynString, AGEC DynString) | TC d
    updView _ (newx=:(DynStr (x::d^) _), _) = (newx, hidAGEC newx)
    updView _ (_, oldx)                     = (^^oldx, oldx)
The abstract Dynamic editor, which is the result of the function dynamicAGEC, initially takes a value of some statically determined type d. It converts this value
into a value of type DynString, such that it can be edited by the application user as explained in Sect. 3.2. The application user can enter an expression of arbitrary type, but now it is ensured that only expressions of type d are approved. The function updView, which is called in the abstract editor after any edit action, checks, using a type pattern match, whether the newly created dynamic can be unified with the type d of the initial value (using the ^-notation in the pattern match as explained in Sect. 3.1). If the type of the entered expression is different, it is rejected (see note 10) and the previous value is restored and shown. To do this, the abstract editor has to remember the previously accepted, correctly typed value. Clearly, we do not want to show this part of the internal state to the application user. This is achieved using the abstract editor hidAGEC (Sect. 4.1), which creates an invisible editor, i.e., a store, for any type.

Example 5: Consider the following variation of Example 2:

:: MyRecord a b = { function :: AGEC (a → b)
                  , argument :: AGEC a
                  , result   :: AGEC b }

myEditor = selfGEC "test" guiApply (initval ((+) 1.0) 0.0)
where
    initval f v = { function = dynamicAGEC f
                  , argument = dynamicAGEC v
                  , result   = displayAGEC (f v) }
    guiApply myrec=:{ function = af, argument = av }
        = {myrec & result = displayAGEC ((^^af) (^^av))}
The editor above can be used to test functions of a certain statically determined type. Due to the particular choice of the initial values ((+) 1.0 :: Real → Real and 0.0 :: Real), the editor can only be used to test functions of type Real → Real applied to arguments of type Real. Notice that it is now statically guaranteed that the provided dynamics are correctly typed. The dynamicAGEC editors take care of the required checks at run-time and reject ill-typed expressions. The programmer therefore does not have to perform any checks anymore. The abstract dynamicAGEC editor delivers a value of the proper type just like any other abstract editor. The code in the above example is not only simple and elegant, but also very flexible. The dynamicAGEC abstract editor can be replaced by any other abstract editor, provided that the statically derived type constraints (concerning f and v) are met. This is illustrated by the next example.

Example 6: If one prefers a counter as input editor for the argument value, one only has to replace dynamicAGEC by counterAGEC in the definition of initval:
Note 10: There is currently no feedback on why the type is rejected. Generating good error messages, as in [15], would certainly improve the user interface.
initval f v = { function = dynamicAGEC f
              , argument = counterAGEC v
              , result   = displayAGEC (f v) }
The dynamicAGEC editor is typically used when expression editors are preferred over value editors for a type, and when application users need to be able to enter functions of a statically fixed monomorphic type. One can create an editor for any higher-order data structure τ, even if it contains polymorphic functions. It is required that all higher-order parts of τ are abstracted by wrapping them in an AGEC type. Basically, this means that each part of τ of the form a → b must be changed into AGEC (a → b), as illustrated by the sketch below. For the resulting type τ an edit dialog can be created automatically, e.g., by applying selfGEC. However, the initial value that is passed to selfGEC must be monomorphic, as usual for any instantiation of a generic function. Therefore, editors for polymorphic types cannot be created automatically using this statically typed generic technique. As explained in Sect. 3.2, polymorphic types can be handled with dynamic type checking.
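As an illustration of this wrapping rule, consider the following sketch (our own example, not from the paper; the types Plotter and PlotterE and their fields are hypothetical):

// A type with a higher-order part:
:: Plotter  = { fun  :: Real → Real,        lo  :: Real, hi  :: Real }
// Its abstracted counterpart, for which an editor can be derived:
:: PlotterE = { funE :: AGEC (Real → Real), loE :: Real, hiE :: Real }

myPlotEditor = selfGEC "plot" id { funE = dynamicAGEC ((+) 1.0), loE = 0.0, hiE = 1.0 }

The initial value dynamicAGEC ((+) 1.0) fixes the monomorphic type Real → Real of the function field, as required.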
5 Applications of Higher-Order GECs
The ability to generate editors for higher-order data structures greatly enhances the applicability of GECs. Firstly, it becomes possible to create applications in which functions can be edited as part of a complex data structure. Secondly, these functions can be composed dynamically from previously compiled functions stored on disk. Both are particularly useful for rapid prototyping purposes, as they can add real-life functionality. In this section we discuss one small and one somewhat larger application. Even the code for the latter application is still rather small (just a few pages). The code is omitted in this paper due to space limitations, but it can be found at http://www.cs.kun.nl/∼clean/gec. Screen shots of the running applications are given in Appendix A.

An Adaptable Calculator. In the first example we use GEC to create a ‘more or less’ standard calculator. The default look of the calculator was adapted using the aforementioned AGEC customization techniques. What is special about this calculator is that its functionality can easily be extended at run-time: the application user can add his or her own buttons with user-defined functionality. In addition to the calculator editor, a GEC editor is created, which enables the application user to maintain a list of button definitions consisting of button names with corresponding functions. Since the type of the calculator functions is statically known, a statically typed higher-order GEC is used in this example. The user can enter a new function definition using a lambda expression, but it is also possible to open and use a previously created function from disk. Each time the list is changed with the list editor, the calculator editor is updated and adjusted accordingly. For a typical screen shot see Fig. 1.
A Form Editor. In the previous example we have shown that one can use one editor to change the look and functionality of another. This principle is also used in a more serious example: the form editor. The form editor is an editor with which electronic forms can be defined and changed. This is achieved using a meta-description of a form. This meta-description is itself a data structure, and therefore we can generate an editor for it. One can regard a form as a dedicated spreadsheet, and with the form editor one can define the actual shape and functionality of such a spreadsheet. With the form editor one can create and edit fields. Each field can be used for a certain purpose: to show a string, to edit a value of a certain basic type, to display a value in a certain way by assigning an abstract editor to it (e.g., a counter or a calculator), or to calculate and show new values depending on the contents of other fields. For the latter purpose, the application user has to be able to define functions that take the contents of other fields as arguments. The form editor uses a mixed-mode strategy: the contents of some fields can be checked statically (e.g., a field for editing an integer value), but the form editor can only dynamically check whether the argument fields of a specified function are indeed of the right type. The output of the form editor is used to create the actual form in another editor, which is part of the same application. By filling in the form fields with actual values, the application user can test whether the corresponding form behaves as intended. For a typical screen shot see Fig. 2.
6 Related Work
In the previous sections we have shown that we can create editors that can deal with higher-order data structures. We can create dynamically typed higher-order editors, which have the advantage that they can deal with polymorphic higher-order data structures and overloading; the disadvantage is that the programmer has to check type safety in the editor. The compiler can ensure type correctness of higher-order data structures in statically typed editors, but these can only edit monomorphic types. Related work can be sought in three areas:

Grammars Instead of Types: Taking a different perspective on the type-directed nature of our approach, one can argue that it is also possible to obtain editors by starting from a grammar specification instead of a type. Such toolkits require a grammar as input and yield an editor GUI as result. Projects in this flavor are, for instance, the recent Proxima project [21], which relies on XML and its DTD (Document Type Definition language), and the Asf+Sdf Meta-Environment [10], which uses an Sdf syntax specification and an Asf semantics specification. The major difference with such an approach is that these systems need both a grammar and some kind of interpreter. In our system, higher-order elements are immediately available as functional values that can be applied and passed to other components.
GUI Programming Toolkits: From the abstract nature of the GEC toolkit it is clear that we need to look at GUI toolkits that also offer a high level of abstraction. Most GUI toolkits are concerned with the low-level management of widgets in an imperative style. One well-known example of an abstract, compositional GUI toolkit based on a combinator library is Fudgets [11]. These combinators are required for plumbing when building complex GUI structures from simpler ones. In our system far less plumbing is needed; most work is done automatically by the generic function gGEC. The only plumbing needed in our system is for combining the GEC editors themselves. Furthermore, the Fudget system does not provide support for editing function values or expressions. Because a GEC t is a t-stateful object, it makes sense to have a look at object-oriented approaches. The power of abstraction and composition in our functional framework is similar to mixins [13] in object-oriented languages. One can imagine an OO GUI library based on compositional and abstract mixins in order to obtain a similar toolkit. Still, such a system would lack higher-order data structures.

Visual Programming Languages: Due to the extension of the GEC programming toolkit with higher-order data structures, visual programming languages have come within reach as an application domain. One interesting example is the Vital system [14], in which Haskell-like scripts can be edited. Both systems allow direct manipulation of expressions and custom types, allow customization of views, and have guarded data types (like the selfGEC function). In contrast with the Vital system, which is a dedicated system implemented in Java, our system is a general-purpose toolkit. We could use our toolkit to construct a visual environment in the spirit of Vital.
7 Conclusions
With the original GEC toolkit one can construct GUI applications without much programming effort. This is done on a high level of abstraction, in a fully compositional manner, and in a type-directed way. It can be used for any monomorphic first-order data type. In this paper we have shown how the programming toolkit can be extended in such a way that GECs can be created for higher-order data structures. We have presented two methods, each with its own advantages and disadvantages. We can create an editor for higher-order data structures using dynamic typing, which has the advantage that it can deal with polymorphism and overloading, but the disadvantage that the programmer has to ensure type safety at run-time. Alternatively, we can create an editor for higher-order data structures using static typing, such that type correctness of entered expressions or functions is guaranteed at compile-time. In that case we can only cope with monomorphic types, but we can generate type-directed GUIs automatically. As a result, applications constructed with this toolkit can manipulate the same set of data types as modern functional programming languages. The system is type-directed and type safe, and so are the GUI applications constructed with it.
References

1. M. Abadi, L. Cardelli, B. Pierce, G. Plotkin, and D. Rémy. Dynamic typing in polymorphic languages. In Proceedings of the ACM SIGPLAN Workshop on ML and its Applications, San Francisco, June 1992.
2. P. Achten. Interactive Functional Programs - models, methods, and implementations. PhD thesis, University of Nijmegen, The Netherlands, 1996.
3. P. Achten and S. Peyton Jones. Porting the Clean Object I/O library to Haskell. In M. Mohnen and P. Koopman, editors, Proceedings of the 12th International Workshop on the Implementation of Functional Languages, IFL'00, Selected Papers, volume 2011 of LNCS, pages 194–213. Aachen, Germany, Springer, Sept. 2001.
4. P. Achten and R. Plasmeijer. Interactive Functional Objects in Clean. In C. Clack, K. Hammond, and T. Davie, editors, Proc. of the 9th International Workshop on the Implementation of Functional Languages, IFL 1997, Selected Papers, volume 1467 of LNCS, pages 304–321. St. Andrews, UK, Springer, Sept. 1998.
5. P. Achten, M. van Eekelen, R. Plasmeijer, and A. van Weelden. Arrows for Generic Graphical Editor Components. 2004. Under submission; available as Nijmegen Technical Report NIII-R0416, http://www.niii.kun.nl/research/reports/full/NIII-R0416.pdf.
6. P. Achten, M. van Eekelen, and R. Plasmeijer. Generic Graphical User Interfaces. In G. Michaelson and P. Trinder, editors, Selected Papers of the 15th Int. Workshop on the Implementation of Functional Languages, IFL'03, LNCS. Edinburgh, UK, Springer, 2003. To appear; draft version available via ftp://ftp.cs.kun.nl/pub/Clean/papers/2004/achp2004-GenericGUI.pdf.
7. P. Achten, M. van Eekelen, and R. Plasmeijer. Compositional Model-Views with Generic Graphical User Interfaces. In Practical Aspects of Declarative Programming, PADL04, volume 3057 of LNCS, pages 39–55. Springer, 2004.
8. A. Alimarine and R. Plasmeijer. A Generic Programming Extension for Clean. In T. Arts and M. Mohnen, editors, The 13th International Workshop on the Implementation of Functional Languages, IFL'01, Selected Papers, volume 2312 of LNCS, pages 168–186. Älvsjö, Sweden, Springer, Sept. 2002.
9. E. Barendsen and S. Smetsers. Graph Rewriting Aspects of Functional Programming, chapter 2, pages 63–102. World Scientific, 1999.
10. M. v. d. Brand, A. van Deursen, J. Heering, H. de Jong, M. de Jonge, T. Kuipers, P. Klint, L. Moonen, P. Olivier, J. Scheerder, J. Vinju, E. Visser, and J. Visser. The Asf+Sdf Meta-Environment: a Component-Based Language Development Environment. In R. Wilhelm, editor, Compiler Construction 2001 (CC'01), pages 365–370. Springer-Verlag, 2001.
11. M. Carlsson and T. Hallgren. Fudgets - a graphical user interface in a lazy functional language. In Proceedings of the ACM Conference on Functional Programming and Computer Architecture, FPCA '93, Copenhagen, Denmark, 1993.
12. D. Clarke and A. Löh. Generic Haskell, Specifically. In J. Gibbons and J. Jeuring, editors, Generic Programming. Proceedings of the IFIP TC2 Working Conference on Generic Programming, pages 21–48, Schloss Dagstuhl, July 2003. Kluwer Academic Publishers. ISBN 1-4020-7374-7.
13. M. Flatt, S. Krishnamurthi, and M. Felleisen. Classes and mixins. In The 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL98), pages 171–183, San Diego, California, 1998. ACM, New York, NY.
14. K. Hanna. Interactive Visual Functional Programming. In S. Peyton Jones, editor, Proc. Intl. Conf. on Functional Programming, pages 100–112. ACM, October 2002.
15. B. Heeren, J. Jeuring, D. Swierstra, and P. Alcocer. Improving type-error messages in functional languages. Technical Report UU-CS-2002-009, Utrecht University, Institute of Information and Computing Sciences, 2002.
16. R. Hinze. A new approach to generic functional programming. In The 27th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 119–132. Boston, Massachusetts, January 2000.
17. R. Hinze and S. Peyton Jones. Derivable Type Classes. In G. Hutton, editor, 2000 ACM SIGPLAN Haskell Workshop, volume 41(1) of ENTCS. Montreal, Canada, Elsevier Science, 2001.
18. S. Peyton Jones, J. Hughes, et al. Report on the programming language Haskell 98. University of Yale, 1999. http://www.haskell.org/definition/.
19. M. Pil. Dynamic types and type dependent functions. In K. Hammond and C. Clack, editors, Implementation of Functional Languages (IFL '98), LNCS, pages 169–185. Springer-Verlag, 1999.
20. R. Plasmeijer and M. van Eekelen. Concurrent Clean Language Report (version 2.0), December 2001. http://www.cs.kun.nl/∼clean/contents/contents.html.
21. M. Schrage. Proxima, a presentation-oriented XML editor. PhD thesis, University of Utrecht, 2004 (to appear).
22. A. van Weelden and R. Plasmeijer. A functional shell that dynamically combines compiled code. In P. Trinder and G. Michaelson, editors, Selected Papers Proceedings of the 15th International Workshop on Implementation of Functional Languages, IFL'03. Heriot-Watt University, Edinburgh, Sept. 2003. To appear; draft version available via ftp://ftp.cs.kun.nl/pub/Clean/papers/2004/vWeA2004Esther.pdf.
23. M. Vervoort and R. Plasmeijer. Lazy dynamic input/output in the lazy functional language Clean. In R. Peña and T. Arts, editors, The 14th International Workshop on the Implementation of Functional Languages, IFL'02, Selected Papers, volume 2670 of LNCS, pages 101–117. Springer, Sept. 2003.
A Screen Shots of Example Applications
Fig. 1. A screen shot of the adaptable calculator. Left: the editor for defining button names with the corresponding function definitions. Right: the resulting calculator editor
Fig. 2. A screen shot of the form editor. The form editor itself is shown in the upper left window; the corresponding editable spreadsheet-like form is shown in the other window
A MATLAB-Based Code Generator for Sparse Matrix Computations

Hideyuki Kawabata, Mutsumi Suzuki, and Toshiaki Kitamura

Department of Computer Engineering, Hiroshima City University
{kawabata,suzuki,kitamura}@arch.ce.hiroshima-cu.ac.jp
Abstract. We present a matrix language compiler, CMC, which translates annotated MATLAB scripts into Fortran 90 programs. Distinguishing features of CMC include its applicability to programs with sparse matrix computations and its capability of source-level optimization in the MATLAB language. Unlike other existing translators, CMC can generate code based on information about the shapes of matrices, such as triangular and diagonal. Integrating these functionalities, CMC provides the user with a simple way to develop fast large-scale numerical computation codes beyond prototyping. Experimental results show that the programs of the SOR and CG methods generated by CMC can run several times as fast as the original MATLAB scripts.
1 Introduction
The MATLAB system [15] is one of the most popular matrix computation environments for rapid prototyping. The MATLAB language is typeless and equipped with a rich set of intrinsic functions, so programs in the language tend to be compact and portable. MATLAB also has the facility to handle data structures for sparse matrices, which allows the user to execute large-scale computations in the interpreter-based environment [12]. MATLAB's execution speed on sparse computations is, however, somewhat limited due to its dynamic nature; run-time checks for types and array bounds prevent a MATLAB code from competing with a program written in a compiled language like Fortran 90.

Among the measures for alleviating MATLAB's speed problem, translating a MATLAB script into a program in a compiled language seems to be a promising approach. There are, of course, trade-offs between how well the semantics of the original MATLAB code is preserved after translation and how much run-time overhead is removed. The MathWorks provides the MATLAB Compiler (MCC), which can translate MATLAB code into C [16]. Unfortunately, the run-time overhead is hardly removed by using MCC, because the procedures of the run-time libraries are almost the same as those invoked by interpretation. As a result, although code compiled by MCC can be used as a drop-in substitute for its MATLAB source script, its execution time is not shortened very much, especially when a large-scale computation with sparse matrices is carried out.
The FALCON project [8] has proposed a translation system from MATLAB programs into Fortran 90. The system translates MATLAB programs into Fortran 90 using a powerful static type inference mechanism. FALCON's optimization capability, based on its symbolic analysis facility, is reported to be very effective for loop-intensive codes. However, the execution time of a code cannot always be reduced by the translation when matrix operations in loop-free notation dominate the overall execution time of the code. In addition, the FALCON system does not have the functionality to handle sparse matrix structures, which limits the applicability of the system to large-scale computations.

In this article, we present a MATLAB-based code generator for sparse matrix computations. The system aims at letting the user make the most of MATLAB's intelligible syntax for developing large-scale numerical computation codes. The compiler translates annotated MATLAB scripts into code in a compiled language. We believe that the introduction of annotations in MATLAB scripts might not only help the translator to output highly optimized static codes, but also enhance the maintainability of the source code.

We have developed a system named CMC (a Compiler for Matrix Computations) which compiles annotated MATLAB scripts into optimized Fortran 90 programs. Like FALCON, CMC translates MATLAB scripts into Fortran 90 in order to reduce dynamic overheads. Unlike FALCON, CMC

– is able to maintain sparse matrices using several types of sparse structures,
– is capable of source-level optimizations, and
– can produce optimized codes using detailed shape information of matrices.

The rest of this paper is organized as follows: Section 2 describes the basic design and details of each functionality of CMC [13]. In Section 3, we discuss issues on using multiple sparse structures. Experimental results are presented in Sect. 4. Related work and conclusions are presented in Sects. 5 and 6, respectively.
2 CMC: A Compiler for Sparse Matrix Computations
2.1 Overview of Our Approach
(a) A function written in MATLAB:

function [r] = res(A,b,x)
r = b - A*x;

(b) Annotated code for CMC:

function [r] = res(A,b,x)
%cmc integer, auxarg:: s
%cmc real:: A(s,s)
%cmc real, colvec:: b(s),x(s)
r = b - A*x;

(c) Translated dense code:

subroutine res(A,x,b,r,s)
implicit none
integer s
real*8 r(s)
real*8 A(s,s)
real*8 x(s)
real*8 b(s)
r = b - MATMUL(A,x)
end

Fig. 1. An example of dense code translation by CMC
(a) Annotated code in MATLAB:

function [r] = res(A,b,x)
%cmc integer, auxarg:: s,n
%cmc real, ccs(n):: A(s,s)
%cmc real, colvec:: b(s),x(s)
r = b - A*x;

(b) Translated sparse code:

subroutine res(A_val,A_colptr,A_rowind,x,b,r,s,n)
implicit none
...
real*8 A_val(s*n)
integer A_colptr(s+1), A_rowind(s*n)
...
r = b
do Xk = 1, s
    do Xiptr = A_colptr(Xk), A_colptr(Xk+1)-1
        Xi = A_rowind(Xiptr)
        r(Xi) = r(Xi) - A_val(Xiptr) * (x(Xk))
    enddo
enddo
end

Fig. 2. An example of sparse code translation by CMC

The code shown in Fig. 1 (a) is a complete description of a function written in MATLAB. In the code, the variables r, A, b, and x are not typed, and the user
can pass values of any attribute, e.g., integer scalars or complex matrices, as arguments. Figure 1 (b) shows an example of annotated code for CMC. Directives are lines beginning with %cmc, which are taken as comment lines by the MATLAB interpreter. The syntax of the inserted annotations is similar to that of variable declaration statements in Fortran 90. There are newly defined declarators, including shape declarators: colvec means that the variables are column vectors. Other shape declarators not shown in Fig. 1 (b) include rowvec and diag, which indicate a row vector and a diagonal matrix, respectively. Figure 1 (c) shows the Fortran 90 subroutine translated by CMC from the code of Fig. 1 (b). Variables for return values are listed after the dummy arguments; the call-by-address facility of Fortran allows this. Additional variables indicated by auxarg, if any, are appended to the end of the argument list. The translated code is fairly similar to the source code thanks to the fact that Fortran 90 supports some array operations, although this is not the case for sparse matrices.

Figure 2 (a) shows an example of annotations for sparse matrices. For indicating the structures of variables, we use a set of structure declarators which currently consists of ccs, crs, and md. ccs indicates that each listed variable is of the CCS (Compressed Column Storage) format [3]. ccs(n) means that the amount of memory to be allocated for the matrix is as large as n nonzero entries per column. crs and md are for the CRS (Compressed Row Storage) format and the MD (multi-diagonal) format, respectively. Details of these sparse matrix structures are described in Sect. 3. Figure 2 (b) shows the sparse code translated by CMC from Fig. 2 (a). Matrix A is stored in CCS format using three one-dimensional arrays. Operations are rewritten using indirect accesses of elements in loops. The fact that sparse matrix structures are not available in Fortran 90, unlike in MATLAB, makes it tough to write and maintain programs which deal with sparse matrices in that language. CMC is designed to remove this burden from the user who wants to develop fast sparse Fortran 90 programs in a simple manner.

The Syntax of Annotations for CMC. Figure 3 summarizes the syntax of annotations for the current implementation of CMC.
annotation     → anno_var | anno_param | anno_auxarg
anno_var       → %CMC attribute_list :: variable_list
attribute_list → attribute | attribute , attribute_list
attribute      → type | shape | structure
type           → logical | integer | real | complex
shape          → _scalar | _rowvec | _colvec | _diag | _tril | _triu | _full
structure      → _dense | _ccs ( cexpr ) | _crs ( cexpr ) | _md ( cexpr )
variable_list  → variable | variable , variable_list
variable       → identifier | identifier ( cexpr ) | identifier ( cexpr , cexpr )
anno_param     → %CMC type , parameter :: assign_list
assign_list    → identifier = cexpr | identifier = cexpr , assign_list
anno_auxarg    → %CMC type , auxarg :: scalar_list
scalar_list    → identifier | identifier , scalar_list
Fig. 3. The syntax of annotations for CMC
The rule annotation corresponds to an annotation line in a source code. CMC supports parameters and auxiliary variables for size declarations of vectors and matrices. The former are the same as parameters in Fortran 90, and the latter are additional arguments of the Fortran 90 codes generated by CMC. In Fig. 3, cexpr means an expression over constants, parameters, and auxiliary variables. Currently, CMC does not support multidimensional arrays.
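To make the grammar concrete, here is a small annotated function in the style of Figs. 1 and 2 that exercises the parameter and shape declarators (our own illustrative fragment, not taken from the paper):

function [y] = scale(D,x)
%cmc integer, parameter:: n=100
%cmc real, diag:: D(n,n)
%cmc real, colvec:: x(n)
y = D*x;

Given the diag declarator, CMC can store D as a one-dimensional array (cf. Sect. 2.4) and translate the product D*x into an element-wise loop over n entries.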
2.2 On the Introduction of Annotations in MATLAB Programs
The function in Fig. 1 (a) can be used, giving a matrix and two column vectors to it, to get a residual column vector in an iterative solver for linear systems. The user can also use the code passing matrices for b and x and a scalar for A as arguments, without receiving any run-time error from the MATLAB interpreter. Although the user is free to use programs in any way, there could be usages which the programmer did not anticipate. In such cases, it is preferable for the programmer to assert the intended usage of the function in some way, both for the maintainability of the code and for the elimination of causes of trouble. From this viewpoint, the introduction of annotations for the arguments of a function in MATLAB scripts might not only help CMC to generate highly optimized static codes, but also enhance the maintainability of the source code.
2.3 The Structure of CMC
CMC's structure, together with the flow of building a sparse executable using CMC, is shown in Fig. 4. CMC reads an annotated MATLAB code and generates a Fortran 90 subroutine which can be linked with other Fortran 90 subroutines to make an executable. CMC also has a facility to output optimized code as a MATLAB function M-file, so the user can enjoy CMC's source-level optimizations in the MATLAB interpreter as well.
Fig. 4. Building a sparse code using CMC (CMC performs syntax analysis, control flow analysis, data dependence analysis, attribute analysis, optimization, and code generation on an annotated MATLAB script prepared by the user; the generated Fortran 90 code, and optionally an optimized MATLAB code, is compiled and linked with other routines and libraries into an executable)

Fig. 5. Inferencing attributes for variables in the expression M=D-w*L where the attributes of D, w and L are available
2.4 Attribute Inference Mechanisms of CMC
Type inference mechanisms are needed for translating a MATLAB program into efficient Fortran 90 code. CMC follows FALCON [8] on this issue, except that CMC expects the arguments' attributes, such as shapes, types, and sizes, to be given by the user through annotations. CMC automatically determines the attributes of all other variables in the program, including intermediate ones. CMC uses an iterative algorithm for forward data-flow analysis. For each operation in a program, the result's attributes are determined from the attributes of the operands and the kind of the operation. An example is depicted in Fig. 5. When the operands' attributes are not yet available at some point in the course of the iterative attribute analysis, the determination of the result's attributes is postponed to subsequent iterations. CMC expects programs in which all variables' attributes can eventually be determined in this way. CMC deals with detailed shape information for matrices, such as triangular and diagonal matrices, as shown in Table 1. For example, when parsing the expression C=A*B, where the variables A and B are upper triangular and diagonal matrices, respectively, CMC decides the shape of the variable C to be upper triangular. The detailed information for each matrix is used for generating efficient code; for example, diagonal matrices are treated as one-dimensional arrays. Currently, CMC does not make use of the sparsity of dense triangular matrices, but it does omit computations on zeros. CMC also decides whether each variable is sparse or dense using the rules shown in Table 2. Sparse structures are only used for matrices (not for vectors). Details on the sparse structures currently available in CMC are described in Sect. 3. CMC generates Fortran 90 programs based on the annotations supplied by the programmer. A code generated by CMC does not support dynamic expansion of matrices and does not check array bounds at run-time. If the annotations are written incorrectly, the program generated by CMC may cause run-time errors.
Table 1. Resulting shapes for binary operations between A and B. Symbols S, R, C, U, L, D, and F denote scalar, row vector, column vector, upper triangular matrix, lower triangular matrix, diagonal matrix, and full matrix, respectively

(a) shape of A*B

                            shape of B
shape of A           S    R    C    U    L    D    F
Scalar               S    R    C    U    L    D    F
Row vec.             R    —    S    R    R    R    R
Column vec.          C    F    —    —    —    —    —
Upper tri. mat.      U    —    C    U    F    U    F
Lower tri. mat.      L    —    C    F    L    L    F
Diagonal mat.        D    —    C    U    L    D    F
Full mat.            F    —    C    F    F    F    F

(b) shape of A+B, A-B, A./B, and A.*B

                            shape of B
shape of A           S    R    C    U    L    D    F
Scalar               S    R    C    U†   L†   D†   F
Row vec.             R    R    —    —    —    —    —
Column vec.          C    —    C    —    —    —    —
Upper tri. mat.      U†   —    —    U    F    U    F
Lower tri. mat.      L†   —    —    F    L    L    F
Diagonal mat.        D†   —    —    U    L    D    F
Full mat.            F    —    —    F    F    F    F

†: For operations + and -, the resulting shape is F when the scalar value is not zero.

Table 2. Resulting structure of a binary operation between A and B

                      shape of B
shape of A      scalar      dense    sparse
scalar          (scalar)    dense    sparse†
dense           dense       dense    dense
sparse          sparse†     dense    sparse

†: The resulting structure is dense when the operation is an addition or a subtraction.
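As a concrete reading of the dagger cases (our own illustration): if A is sparse and c is a nonzero scalar, then A+c has no zero entries left, so CMC must assign the result a dense structure, whereas the product c*A preserves the zero pattern of A and can therefore remain sparse.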
2.5 Source-Level Optimization Mechanisms of CMC
As discussed in [17], source-level optimization is effective for matrix computation codes. CMC is equipped with the following optimization functionalities, which are simpler than those mentioned in [17] but also effective:

– classical techniques [1] such as loop-invariant code motion (LICM), copy propagation, common subexpression elimination (CSE), and dead-code elimination, and
– matrix-language-oriented optimizations, such as strength reduction of operations considering the ranks of operands.

Figures 6 and 7 show programs of the SOR method and the CG method, respectively. Figures 6 (a) and 7 (a) are source codes, and Figs. 6 (b) and 7 (b) are the codes optimized by CMC. The plainness of the sources compared to the optimized codes is apparent even for these relatively small examples. However, the execution speed of such naively written codes is prohibitively low compared to that of the optimized codes. Unfortunately, the MATLAB system, including MCC, does not have these fairly simple optimization capabilities. As for strength reduction for matrix computations, CMC currently supports reordering of the associative operator *. CMC does not evaluate the computational load of each matrix operation exactly; it is difficult to do that for sparse
function [x,i] = sor0s(A,x0,b,tol)
%cmc integer, parameter :: s=50*50
%cmc real, ccs(5) :: A(s,s)
%cmc real, colvec :: x0(s), b(s)
%cmc real, scalar :: tol
w = 1.8;
D = diag(diag(A));
L = -tril(A,-1);
U = -triu(A,1);
r0 = norm(b-A*x0);
x = x0;
i = 0;
while 1
    i = i+1;
    x = (D-w*L)\(((1-w)*D+w*U)*x+w*b);
    if norm(b-A*x)/r0