Lecture Notes in Artificial Intelligence Subseries of Lecture Notes in Computer Science Edited by J. G. Carbonell and J. Siekmann
Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen
1453
Marie-Laure Mugnier Michel Chein (Eds.)
Conceptual Structures" Theory, Tools and Applications 6th International Conference on Conceptual Structures, ICCS'98 Montpellier, France, August 10-12, 1998 Proceedings
Springer
Series Editors Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Volume Editors Marie-Laure Mugnier Michel Chein LIRMM 161 rue Ada, F-34392 Montpellier Cedex 5, France E-mail: {mugnier, chein}@lirmm.fr Cataloging-in-Publication Data applied for
Die Deutsche Bibliothek - CIP-Einheitsaufnahme Conceptual structures : theory, tools and applications ; proceedings / 6th International Conference on Conceptual Structures, ICCS '98, Montpellier, France, August 10-12, 1998. Marie-Laure Mugnier ; Michel Chein (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Budapest ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 1998 (Lecture notes in computer science ; Vol. 1453 : Lecture notes in artificial intelligence) ISBN 3-540-64791-0
CR Subject Classification (1991): I.2, G.2.2, F.4.1, F.2 ISBN 3-540-64791-0 Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1998 Printed in Germany Typesetting: Camera ready by author SPIN 10638203 06/3142 - 5 4 3 2 1 0
Printed on acid-free paper
In memory of our colleague and friend, Jean Fargues
Preface

Since 1993 the International Conference on Conceptual Structures (ICCS) has been the primary forum for reporting progress in conceptual structures research, with the main emphasis on conceptual graphs. Conceptual graphs are a knowledge representation model introduced by John F. Sowa in 1984, stemming from semantic networks and the existential graphs of C.S. Peirce. Their formal aspects rest on two mathematical bases, logic and graph theory. Conceptual graphs are used in several domains, such as natural language processing, knowledge-based systems, knowledge engineering, and database design, among others. Research teams around the world have developed a sizeable software base and have built applications upon it. In recent years, ICCS conferences have expanded to other knowledge representation formalisms related to conceptual graphs. Our desire for ICCS'98 was to develop this trend. The call for papers, the composition of the editorial board and of the program committee, and the choice of the invited speakers were guided by the desire to broaden the range of the conference. We firmly believe that efforts must be strengthened to bring the conceptual graphs community and the knowledge representation community closer together. These proceedings gather 30 papers (20 long papers and 10 research notes), carefully selected from 66 submissions to the Sixth International Conference on Conceptual Structures, held in the city of Montpellier, France. These papers are broadly classified into the following categories: knowledge representation and knowledge engineering, tools, conceptual graphs and other formalisms, relationships with logics, algorithms and complexity, epistemology and ontologies, natural language processing, and applications. It is a pleasure to thank the invited speakers, Franz Baader, Daniel Kayser, and John F. Sowa, for accepting our invitation to give talks and to contribute papers to the proceedings.
We thank the authors, the editorial board, the program committee, and the auxiliary reviewers for making this book, the result of their cooperative work, a valuable contribution to the knowledge representation research field. Very special thanks are due to Eric Salvat, who spared no effort to make this conference a success (among his numerous activities for ICCS'98, he created our Web site). Finally, on behalf of the organizing committee, we thank the institutions that contributed to this conference: AFIA (French Association of AI), CNRS (National Center of Scientific Research), INRIA (National Institute of Automatics and Informatics), LIRMM (Laboratory of Informatics, Robotics, and Microelectronics of Montpellier), Université Montpellier 2, District de Montpellier, and Région Languedoc-Roussillon.

Montpellier, August 1998
Michel CHEIN and Marie-Laure MUGNIER
Organizing Committee

Honorary Chair: John F. Sowa, SUNY at Binghamton, USA
General Chair: Michel Chein, LIRMM, Université Montpellier 2, France
Program Chair: Marie-Laure Mugnier, LIRMM, Université Montpellier 2, France
Publicity Chair: Eric Salvat, LIRMM, Université Montpellier 2, France
Local Arrangements:
Rose Dieng, INRIA Sophia Antipolis, France
David Genest, LIRMM, Université Montpellier 2, France
Pascal Jappy, LIRMM, Université Montpellier 2, France
Corine Zicler, LIRMM, Université Montpellier 2, France
Editorial Board

Brian Gaines, The University of Calgary, Canada
Fritz Lehmann, Cycorp, USA
Dickson Lukose, The University of New England, Australia
Guy Mineau, Université Laval, Canada
Leroy Searle, University of Washington, USA
Stefano Spaccapietra, Swiss Federal Institute of Technology, Switzerland
Rudolf Wille, Technische Universität Darmstadt, Germany

Program Committee
Harmen van den Berg, Telematics Research Centre, The Netherlands
Jean Bézivin, Univ. de Nantes, France
Bernard Botella, Dassault Electronique, St-Cloud, France
Marc Champesme, Univ. Paris-Nord, France
Peter Creasy, Univ. of Queensland, Australia
Walling R. Cyre, Virginia Polytechnic Inst. & State Univ., USA
Harry S. Delugach, Univ. of Alabama in Huntsville, USA
Judy Dick, ActE, Toronto, Canada
Rose Dieng, INRIA Sophia Antipolis, France
Peter Eklund, Griffith Univ., Australia
Gerard Ellis, Peirce Holdings International, Australia
Bruno Emond, Univ. du Québec à Hull, Canada
John Esch, Lockheed Martin, USA
Norman Foo, Univ. of New South Wales, Australia
Christophe Fouqueré, Univ. Paris-Nord, France
Brian Garner, Deakin Univ., Australia
Robert Godin, Univ. du Québec à Montréal, Canada
Michel Habib, Univ. Montpellier 2, France
Ollivier Haemmerlé, INAPG, Paris, France
Roger Hartley, New Mexico State Univ., Las Cruces, USA
Mary Keeler, Univ. of Washington, USA
Robert Kremer, Univ. of Calgary, Canada
Michel Leclère, Univ. de Nantes, France
Maurizio Lenzerini, Univ. degli Studi di Roma "La Sapienza", Italy
Robert Levinson, Univ. of California at Santa Cruz, USA
Bernard Levrat, Univ. d'Angers, France
Graham A. Mann, Univ. of New South Wales, Australia
Rokia Missaoui, Univ. du Québec à Montréal, Canada
Jens-Uwe Moeller, Univ. of Hamburg, Germany
Bernard Moulin, Univ. Laval, Canada
Maurice Pagnucco, Univ. of New South Wales, Australia
Mike P. Papazoglou, Univ. of Tilburg, The Netherlands
Heike Petermann, Univ. Hamburg, Germany
Heather Pfeiffer, New Mexico State Univ., Las Cruces, USA
Daniel Rochowiak, Univ. of Alabama in Huntsville, USA
Geneviève Simonet, Univ. Montpellier 2, France
Rudi Studer, AIFB, Univ. Karlsruhe, Germany
Gerd Stumme, Technische Universität Darmstadt, Germany
William Tepfenhart, AT&T Research, USA
Eric Tsui, CSC Financial Services Group & Univ. of Sydney, Australia
Michel Wermelinger, Univ. Nova de Lisboa, Portugal
Mark Willems, Cycorp, USA
Vilas Wuwongse, Asian Institute of Technology, Thailand
Pierre Zweigenbaum, DIAM, SIM/AP-HP & Univ. Paris 6, France

External Referees
Tassadit Amghar, Pierre Brezellec, Steve R. Callaghan, Tru H. Cao, Alexander Friedmann, David Genest, Gwen Kerdiles, Thérèse Libourel, Stéphane Loiseau, Lhouari Nourine, Abdellatif Obaid, Eric Salvat, Henry Soldano, Guy Tremblay
Table of Contents

Invited Talks

Conceptual Graph Standard and Extensions ........................... 3
John F. Sowa

Matching in Description Logics: Preliminary Results ............... 15
Franz Baader, Alex Borgida, Deborah L. McGuinness

Ontologically, Yours .............................................. 35
Daniel Kayser

Knowledge Representation and Knowledge Engineering

Executing Conceptual Graphs ....................................... 51
Walling R. Cyre

From Actors to Processes: The Representation of Dynamic Knowledge Using Conceptual Graphs ... 65
Guy W. Mineau

A Semantic Validation of Conceptual Graphs ........................ 80
Juliette Dibie, Ollivier Haemmerlé, Stéphane Loiseau

Using Viewpoints and CG for the Representation and Management of a Corporate Memory in Concurrent Engineering ... 94
Myriam Ribière

Tools

WebKB-GE - A Visual Editor for Canonical Conceptual Graphs (Research Note) ... 111
Simon Pollitt, Andrew Burrow, Peter W. Eklund

Mapping of CGIF to Operational Interfaces (Research Note) ........ 119
Arno Puder

TOSCANA-Systems Based on Thesauri ................................ 127
Bernd Groh, Selma Strahringer, Rudolf Wille

MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts ..... 139
Rose Dieng, Stefan Hug

A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT (Research Note) ... 154
David Genest, Eric Salvat

Conceptual Graphs and Other Models

Towards Correspondences between Conceptual Graphs and Description Logics ... 165
Pascal Coupey, Catherine Faron

Piece Resolution: Towards Larger Perspectives .................... 179
Stéphane Coulondre, Eric Salvat

Triadic Concept Graphs .......................................... 194
Rudolf Wille

Powerset Trilattices ............................................ 209
Klaus Biedermann

Relationships with Logics

Simple Concept Graphs: A Logic Approach ......................... 225
Susanne Prediger

Two FOL Semantics for Simple and Nested Conceptual Graphs ....... 240
Geneviève Simonet

Peircean Graphs for the Modal Logic S5 .......................... 255
Torben Braüner

Fuzzy Order-Sorted Logic Programming in Conceptual Graphs with a Sound and Complete Proof Procedure ... 270
Tru H. Cao, Peter N. Creasy

Algorithms and Complexity

Knowledge Querying in the Conceptual Graph Model: The RAP Module (Research Note) ... 287
Olivier Guinaldo, Ollivier Haemmerlé

Stepwise Construction of the Dedekind-MacNeille Completion (Research Note) ... 295
Bernhard Ganter, Sergei O. Kuznetsov

PAC Learning Conceptual Graphs .................................. 303
Pascal Jappy, Richard Nock

Epistemology and Ontologies

Procedural Renunciation and the Semi-automatic Trap ............. 319
Graham A. Mann

Ontologies and Conceptual Structures ............................ 334
William M. Tepfenhart

Natural Language Processing

Manual Acquisition of Uncountable Types in Closed Worlds (Research Note) ... 351
Galia Angelova

A Logical Framework for Modeling a Discourse from the Point of View of the Agents Involved in It (Research Note) ... 359
Bernard Moulin

Computational Processing of Verbal Polysemy with Conceptual Structures (Research Note) ... 367
Karim Chibout, Anne Vilnat

Word Graphs: The Second Set ..................................... 375
C. Hoede, X. Liu

Tuning Up Conceptual Graph Representation for Multilingual Natural Language Processing in Medicine (Research Note) ... 390
Anne-Marie Rassinoux, Robert H. Baud, Christian Lovis, Judith C. Wagner, Jean-Raoul Scherrer

Applications

Conceptual Graphs for Representing Business Processes in Corporate Memories ... 401
Olivier Gerbé, Rudolf K. Keller, Guy W. Mineau

Handling Specification Knowledge Evolution Using Context Lattices ... 416
Aldo de Moor, Guy W. Mineau

Using CG Formal Contexts to Support Business System Interoperations (Research Note) ... 431
Hung Wing, Robert M. Colomb, Guy W. Mineau

Author Index .................................................... 439
Conceptual Graph Standard and Extensions John F. Sowa
Abstract. The proposed ANSI standard for conceptual graphs codifies the basic core that has formed the basis for most CG implementations since 1984. Standards are essential for building and sharing development tools and for supporting commercial applications. But the CG community has traditionally been a coalition of researchers who have been actively extending conceptual graphs with a variety of experimental features that often go far beyond the core and are sometimes incompatible with it. This paper discusses the recently developed CG standard and its approach to standardizing the CG core while allowing innovative extensions.

1 Accommodating Standardization and Innovation

A draft proposed ANSI standard (dpANS) for conceptual graphs has been developed by the NCITS T2 Committee on Information Interchange and Interpretation. It is now being circulated for comments and approval; barring unforeseen difficulties, it should be approved before the end of 1998. The dpANS is based on the original CG syntax and semantics (Sowa 1984) with some extensions and modifications that have been requested or suggested by the community of CG users and implementers. But the number of possible extensions and modifications is greater than can be accommodated by the 1998 CG standard. This conference is the sixth in a series of International Conferences on Conceptual Structures (ICCS), which have followed seven annual workshops on conceptual graphs. Three journals, published in the U.S., U.K., and France, devoted special issues to papers on conceptual graphs (Way 1992, Sowa 1992, Chein 1996). Many of the papers in these conferences and journals have presented new ways of using the basic features of conceptual graphs, but others have developed novel extensions to the syntax and semantics. Growth and change are inevitable and essential in any language or system that is widely used, but practical applications require a stable platform.
The stability of the CG core is ensured by the generality of first-order logic (FOL). Frege (1879) and Peirce (1885) independently developed equivalent versions of FOL, even though they had no knowledge of each other's work and they used widely divergent notations. Since then, first-order logic has proved to be a solid foundation that is general enough to describe and represent anything that can be computed by any digital computer, including the computer itself. Making the common core of CGs general enough to represent classical first-order logic thus ensures its stability. To promote interoperability with other computer systems, the CG standard has been developed in parallel with the proposed ANSI standard for the Knowledge Interchange Format (KIF). Both CGs and KIF express FOL in a form that can be automatically translated from one to the other while preserving the semantics.
Despite the generality of classical FOL, there are many reasons why people who use a knowledge representation language may want to diverge from FOL:

- Readability. The traditional algebraic notation for predicate calculus, which is based on Peirce's notation of 1885 with a change of symbols by Giuseppe Peano, has been widely criticized as unreadable. Peirce himself later developed a graph notation, which he believed was a "more iconic" representation for "the atoms and molecules of logic." Conceptual graphs, which are based on Peirce's graphs, are more readable than predicate calculus for many applications. But there is an infinite number of ways in which logical notations can be specialized or tailored to an application, especially when a graphical display is available.

- Computability. Theorem proving in full FOL can often take an exponential amount of time. Many variations of logic restrict the expressibility so that only efficiently computable problems can be expressed. Such restrictions can be accommodated in any notation for FOL by stating constraints on the combinations of operators that may be used. An example is the Horn-clause subset used in Prolog, which does not allow disjunctions in the conclusion of an implication. Such constraints can be imposed on CGs in the same way they are imposed on the algebraic notations.

- Convenience. Many extensions to the logical notations make them more convenient for expressing certain frequently occurring features. Such extensions can be accommodated by definitional facilities that allow new symbols to be defined by combinations of the old symbols.

- Surprises. Although logic can express anything that can be computed, it cannot express exceptions that are unknown or unknowable to the person who is using it. Computer input/output facilities are a primary source of "surprises" that cannot be anticipated by a programmer or other computer user.

- Context dependence. Natural languages are often more concise than formal languages because they omit much of the information that can be inferred from the context. Linguists have been analyzing and codifying the ways that languages use context-dependent information, but the complete specification of those methods remains an active area of research.

- Extended logics. Because of its generality, first-order logic can be used as a metalanguage to define new kinds of logics that go beyond FOL. By quantifying over predicates or relations, for example, higher-order logic (HOL) can summarize in one statement an infinite number of first-order statements.
These issues have been actively explored in the past dozen years of CG workshops and conferences, and many different approaches to them have been suggested, implemented, and published. To accommodate innovation while standardizing the core, the dpANS for CGs defines the syntax and semantics of those features that are based on well-understood logical principles. That includes all of classical first-order logic and some extensions, such as the use of FOL at the metalevel and the ability to quantify over relations for HOL. These are the same features that are defined in the dpANS for KIF. To accommodate the open-ended research areas, the
dpANS for CGs also defines syntax for certain features whose semantics have not yet been standardized; those features include the surprises and context dependencies. The dpANS gives a formal specification for the conceptual graph interchange format (CGIF), which is designed for communication between computer systems. To allow further experimentation with the syntax designed for human readability, the methods for displaying CGs to humans are suggested as informal guidelines rather than normative standards. The semantics of those displays is determined by their mapping to CGIF and KIF.
2 Actors

Functions are relations that behave like one-way streets in guiding a computation. In graphic form, they correspond to dataflow diagrams, which have a preferred directionality. By definition, a function is a relation that has one argument called the output, which has a single value for each combination of values of the other arguments, called the inputs. In conceptual graphs, the output concept of a function is attached to its last arc, whose arrow points away from the circle. If f is a function from type T1 to type T2, the constraint of a single output for each input can be stated by an axiom in a conceptual graph or predicate calculus:
[T1: ∀x] → <f> → [T2: ∃!y]

(∀x:T1)(∃!y:T2) y = f(x).

Both the graph and the formula say that for every input of type T1, f has exactly one output (∃!y) of type T2. Since the distinction between functions and other relations is important for many applications, it is useful to have a special notation that marks a relation as a function. In Figure 1, the diamond nodes indicate that the relations Sum, Pred, and CS2N are functional.
[Figure 1. A dataflow diagram in which the diamond nodes Sum, Pred, and CS2N mark functional relations linking Number and String concepts]
The condition tests whether 2 > n; if true, the output of Cond is the second argument, 1; otherwise, the output of Cond is the third argument, which results from the recursive call. The keyword functional is a metalanguage abbreviation for the axiom that there exists exactly one output x for every input n. The next three lines show how the conceptual graph would be translated to a lambda expression in predicate calculus, a function in C, and a function in LISP:

- Predicate calculus. facto = (λn:Integer) cond(2 > n, 1, n·facto(n−1)).

- C. int facto(int n) { return ((2 > n) ? 1 : n*facto(n-1)); }

- LISP. (defun facto (n) (if (> 2 n) 1 (* n (facto (sub1 n)))))
These translations illustrate the equivalence between a specification in logic and a purely functional specification in a programming language. For convenience, efficiency, and the ability to control multiple processes, languages like C and LISP support many other features, but the functional mechanisms alone are sufficient to simulate a Turing machine. For parallel machines, functional languages can often achieve higher efficiency than conventional languages because all the dependencies are specified before compile time; no surprises can occur that would interrupt the parallel processes. Sowa (1984) also defined token-passing methods for evaluating the actors in dataflow diagrams. A question mark was used to trigger a backward-chaining or lazy method of evaluating dataflow diagrams, and an exclamation mark was used to trigger a forward-chaining or eager method of evaluation. Those methods of evaluation are not specified in the dpANS for conceptual graphs.
3 Indexicals

An important feature of Peirce's logical graphs is their explicit marking of contexts, which map directly to the way contexts are represented in natural languages. Peirce also coined the term indexical for context-dependent references, such as pronouns and words like here, there, and now. In CGs, the symbol # represents the general indexical, which is usually expressed by the definite article the. More specific indexicals are marked by a qualifier after the # symbol, as in #here, #now, #he, #she, or #it. Figure 3 shows two conceptual graphs for the sentence If a farmer owns a donkey, then he beats it. The CG on the left represents the original pronouns with indexicals, and the one on the right replaces the indexicals with the coreference labels ?x and ?y.
[Figure 3. Two conceptual graphs for "If a farmer owns a donkey, then he beats it": left, with indexicals; right, with indexicals resolved]
In the concept Animate: #he, the label Animate indicates the semantic type, and the indexical #he indicates that the referent must be found by a search for some type of Animate entity for which the masculine gender is applicable. In the concept Entity: #it, the label Entity is synonymous with T, which may represent anything, and the indexical #it indicates that the referent has neuter gender. The search for referents starts in the inner context and proceeds outward to find concepts of an appropriate type and gender. The CG on the right of Figure 3 shows the result of resolving the indexicals: the concept for he has been replaced by ?x to show a coreference to the farmer, and the concept for it has been replaced by ?y to show a coreference to the donkey. Predicate calculus does not have a notation for indexicals, and its syntax does not show the context structure explicitly. Therefore, the CG on the left of Figure 3 cannot be translated directly to predicate calculus. After the indexicals have been resolved, the CG on the right can be translated to the following formula:

(∀x:Farmer)(∀y:Donkey)((∃z:Own)(expr(z,x) ∧ thme(z,y)) ⊃ (∃w:Beat)(agnt(w,x) ∧ ptnt(w,y))).

The dpANS for conceptual graphs specifies # as the syntactic marker for indexicals, but it does not define their semantics, the rules for resolving them to their referents, or their translation to predicate calculus and KIF. It allows CGs with indexicals to be sent between different CG systems, but their semantics is left undefined.
The complete determination of all the rules for resolving indexicals is still an active area of research in linguistics and logic. As an example, consider the sentence You can lead a horse to water, but you can't make him drink. The first step would be the generation of a logical form with indexicals, such as the CG in Figure 4, which may be read literally It is possible (Psbl) for you to lead a horse to water, but it is not possible (¬Psbl) for you to cause him to drink the liquid. The relation ¬Psbl is defined as a combination of ¬ and Psbl:

relation ¬Psbl(*p) is ¬[Proposition: (Psbl)→[Proposition: ?p]]
[Figure 4. CG for "You can lead a horse to water, but you can't make him drink."]

A parser and semantic interpreter that did a purely local or context-free analysis of the English sentence could generate the four concepts marked as indexicals by # symbols in Figure 4:
- The two occurrences of you would map to the two concepts of the form Person: #you.

- The pronoun him represents a masculine animate indexical in the objective case, whose concept is Animate: #he.

- The missing object of the verb drink is presupposed by the concept type Drink, which requires a patient of type Liquid. The implicit concept Liquid: # is marked as an indexical, so that the exact referent can be determined from the context.

The indexicals would have to be resolved by a context-dependent search, proceeding outward from the context in which each indexical is nested.
Conversational implicatures. Sometimes no suitable referent for an indexical can be found. In such a case, the person who hears or reads the sentence must make further assumptions about implicit referents. The philosopher Paul Grice (1975) observed that such assumptions, called conversational implicatures, are often necessary to make sense out of the sentences in ordinary language. They are justified by the charitable assumption that the speaker or writer is trying to make a meaningful statement, but for the sake of brevity, may leave some background information unspoken. To resolve the indexicals in Figure 4, the listener would have to make the following kinds of assumptions to fill in the missing information:

1. The two concepts of the form Person: #you would normally be resolved to the listener or reader of the sentence. Since no one is explicitly mentioned in any containing context, some such person must be assumed. That assumption corresponds to drawing an if-then nest of contexts with a hypothetical reader x coreferent with you:

If:   Person: *x - - - Person: #you
Then: . . .
The entire graph in Figure 4 would be inserted in place of the three dots in the then part, and every occurrence of #you would be replaced by ?x. The resulting graph could be read If there exists a person x, then x can lead a horse to water, but x can't make him drink the liquid.

2. The concept Animate: #he might be resolved to either a human or a beast. Since the reader is referred to as you, the most likely referent is the horse. But in both CGs and DRSs, coreference links can only be drawn between concepts under one of the following conditions:

- The antecedent concept Horse and the indexical Animate: #he both occur in the same context.

- The antecedent occurs in a context that includes the context of the indexical.

In Figure 4, neither of these conditions holds. To make the second condition true, the antecedent Horse can be exported or lifted to some containing context, such as the context of the hypothetical reader x. This assumption has the effect of treating the horse as hypothetical, like the person x. After a coreference label is assigned to the concept Horse: *y, the indexical #he could be replaced by ?y.

3. The liquid, which had to be assumed to make sense of the verb drink, might be coreferent with the water. But in order to draw a coreference link, another assumption must be made to lift the antecedent concept Water: *z to the same hypothetical context as the reader and the horse. Then the concept Liquid: # would become Liquid: ?z. The result would be the following CG with all indexicals resolved:
If:   Person: *x   Horse: *y   Water: *z
Then: Proposition:
        (Psbl)→[Proposition:
            [Person: ?x]←(Agnt)←[Lead]→(Thme)→[Horse: ?y]
                                 →(Dest)→[Water: ?z]]
        (¬Psbl)→[Proposition:
            [Person: ?x]←(Agnt)←[Make]→(Rslt)→[Situation:
                [Animate: ?y]←(Agnt)←[Drink]→(Thme)→[Liquid: ?z]]]
This CG may be read If there exist a person x, a horse y, and water z, then the person x can lead the horse y to water z, but the person x can't make the animate being y drink the liquid z. Before the indexicals are resolved, the type labels are needed to match the indexicals to their antecedents. Afterwards, the bound concepts Person: ?x, Horse: ?y, Animate: ?y, Water: ?z, and Liquid: ?z could be simplified to just ?x, ?y, or ?z. As this example illustrates, indexicals occur in the intermediate stage of translating language to logic, and their correct resolution may require nontrivial assumptions. Many programs in AI and computational linguistics follow rules of discourse representation to resolve indexicals. The problem of making the correct assumptions about conversational implicatures is more difficult. The kinds of assumptions needed to understand ordinary conversation are similar to the assumptions that are made in nonmonotonic reasoning. Both of them depend partly on context-independent rules of logic and partly on context-dependent background knowledge.
4 Metalevel and Higher-Order Logic

Metalevel reasoning can be treated as first-order reasoning with the object-level language as part of the domain of discourse. As an example, consider the sentence The greater-than relation is transitive. That sentence uses English as a metalanguage for talking about logic. It could be translated to the following CG:

[Relation: greaterThan]→(Attr)→[Transitive]

This metalevel CG talks about some attribute (Attr) of a relation. At the object level, that attribute implies an axiom of a particular form about how the greaterThan relation may be used. The translation from the metalevel to the object level can be specified by the metametalevel CG in Figure 5.
[Figure 5. Metametalevel CG mapping from the metalevel to the object level]
The If-part of Figure 5 is the following conceptual graph, written in the linear notation:

[Relation: *r]→(Attr)→[Transitive]
This graph matches the previous CG, with the variable *r matching the name greaterThan. The Then-part of Figure 5 is another If-Then rule, which contains conceptual relations whose label is ρr. The Greek letter ρ represents an operator that maps a name like greaterThan to a label of a conceptual relation. The following statement defines the symbol > as the value of ρ when applied to the name greaterThan:
ρgreaterThan = '>'

With this definition, the If-Then rule in Figure 5 would generate the conclusion in Figure 6.
If:   [*x]→(>)→[*y]→(>)→[*z]
Then: [?x]→(>)→[?z]

Figure 6. Object-level CG derived by applying the rule in Figure 5
The If-Then rule in Figure 6 is the axiom that defines what it means for the greaterThan relation to be transitive. In English, Figure 6 may be read, If x is greater than y and y is greater than z, then x is greater than z. The symbols *x, *y, and *z in the If-part of Figure 6 represent the defining occurrences of the
variables x, y, and z. The symbols ?x and ?z in the Then-part represent references back to the defining occurrences. KIF uses the same form ?x, ?y, and ?z for both the defining occurrences and the repeated references. Following is the KIF form of Figure 6:
(-> (and (> ?x ?y) (> ?y ?z)) (> ?x ?z))

As these examples illustrate, conceptual graphs and KIF can be used like E-R diagrams to represent metalevel information. But they can also state something that no E-R diagram ever could: the object-level axiom for transitivity in Figure 6. Furthermore, they can even state metametarules like Figure 5, which map from the metalevel to the object level. With such power, CGs and KIF can be used as the object language, the metalanguage, or even the metametalanguage. In fact, they can be used at a potentially infinite number of levels.

5 Pictures as Literals

Figure 7 shows a concept of type Situation, which is linked by two image relations (Imag) to two different kinds of images of that situation: a picture and the associated sound. The description relation (Dscr) links the situation to a proposition that describes some aspect of it. It is equivalent to the entailment operator, x |= p, which means that p is a true proposition about x. The proposition is linked by three statement relations (Stmt) to statements of the proposition in three different languages: an English sentence, a conceptual graph, and a formula in the Knowledge Interchange Format (KIF). As the diagram illustrates, the sound image and the picture image capture information that is not stated in the propositional forms, but even they are only partial representations.
Figure 7. A CG representing a situation of a plumber carrying a pipe
The sound and the picture, which are displayed graphically in Figure 7, would be stored in some conventional representation, such as a GIF or JPEG file. The CG could be mapped to the following formula in predicate calculus:
(∃s:Situation)(∃p:Proposition)(∃g:CG)(∃x:Sound)(∃y:Picture)(∃z:English)(∃w:KIF)
(dscr(s,p) ∧ imag(s,x) ∧ imag(s,y) ∧ stored(x,clankety.wav) ∧ stored(y,plumber.gif) ∧
stmt(p,z) ∧ stmt(p,g) ∧ stmt(p,w) ∧
literal(z,"A plumber is carrying a pipe.") ∧
literal(g,"[Plumber]<-(Agnt)<-[Carry]->(Thme)->[Pipe]") ∧
literal(w,"(exists ((?x plumber) (?y carry) (?z pipe)) (and (agnt ?y ?x) (thme ?y ?z)))")).

The stored relation links images to the names of files in which they are stored, and the literal relation links linguistic entities to the character strings used to express them. A multimedia system can display them in any form that is convenient for the users. Those forms, however, are not specified by the dpANS for conceptual graphs.

References
Matching in Description Logics: Preliminary Results

Franz Baader¹, Alex Borgida², and Deborah L. McGuinness³

¹ Theoretical Computer Science, RWTH Aachen, 52074 Aachen, Germany
baader@informatik.rwth-aachen.de
² Dept. of Computer Science, Rutgers University, New Brunswick, NJ, USA
borgida@cs.rutgers.edu
³ Information Systems and Services Research, AT&T Labs - Research, Florham Park, NJ, USA
dlm@research.att.com
Abstract. Matching of concepts with variables (concept patterns) is a relatively new operation that has been introduced in the context of concept description languages (description logics), originally to help filter out unimportant aspects of large concepts appearing in industrial-strength knowledge bases. This paper proposes a new approach to performing matching, based on a "concept-centered" normal form, rather than the more standard "structural subsumption" normal form for concepts. As a result, matching can be performed (in polynomial time) using arbitrary concept patterns of the description language FL¬, thus removing restrictions from previous work. The paper also addresses the question of matching problems with additional "side conditions", which were motivated by practical experience.
1 Introduction
Knowledge representation systems based on Description Logics (DL systems) can be used to represent the terminological knowledge of an application domain in a structured and formally well-understood way [12, 4, 11, 32, 8]. With the help of these languages, the important notions of the domain can be described by concept descriptions, i.e., expressions that are built from atomic concepts (unary predicates) and atomic roles (binary predicates) using the concept constructors provided by the description logic language (DL language) of the system. The atomic concepts and the concept descriptions represent sets of individuals, whereas roles represent binary relations between individuals. For example, using the atomic concept Woman and the atomic role child, the concept of all women having only daughters (i.e., women such that all their children are again women) can be represented by the concept description Woman ⊓ ∀child.Woman.
DL systems provide their users with various inference capabilities that allow them to deduce implicit knowledge from the explicitly represented knowledge.
For instance, the subsumption algorithm allows one to determine subconcept-superconcept relationships: C is subsumed by D (C ⊑ D) iff all instances of C are also instances of D, i.e., the first description is always interpreted as a subset of the second description. For example, the concept description Woman obviously subsumes the concept description Woman ⊓ ∀child.Woman. With the help of the subsumption algorithm, a newly introduced concept description can automatically be placed at the correct position in the hierarchy of the already existing concept descriptions. Two concept descriptions C, D are equivalent (C ≡ D) iff they subsume each other, i.e., if they always represent the same set of individuals. For example, the descriptions Woman ⊓ ∀child.Woman and (∀child.Woman) ⊓ Woman are equivalent since ⊓ is interpreted as set intersection, which is obviously commutative. The traditional inference problems for DL systems (like subsumption) are now well-investigated, which means that algorithms are available for solving the subsumption problem and related inference problems in a great variety of DL languages of differing expressive power (see, e.g., [22, 31, 28, 20, 1, 3, 19, 13, 10, 6, 2, 7]). In addition, the computational complexity of these inference problems has been investigated in detail [22, 27, 29, 15, 14, 17, 30, 16]. It has turned out, however, that building and maintaining large DL knowledge bases requires support by additional inference capabilities, which have not been considered in the DL literature until very recently. The present paper is concerned with such a new inference service, namely, matching of concept descriptions. Matching of Description Logic concepts was introduced in the CLASSIC system (version 2), under the name of "filtering", as a technique for specifying which aspects of a concept should be selected for printing or explanation.
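To make the subsumption test concrete, here is a minimal Python sketch of the FL0 case (conjunction and value restrictions only); the word-set representation is an assumption on our part, anticipating the concept-centered normal form used later in the paper:

```python
# A minimal illustration (not the paper's algorithm) of subsumption in the
# FL0 fragment. A description is represented by a map from each atomic
# concept A to the finite set of role words w such that the description
# entails "forall w. A".

def subsumes(d, c):
    """True iff c is subsumed by d (c ⊑ d): every constraint imposed
    by d is among the constraints imposed by c."""
    return all(words <= c.get(atom, set()) for atom, words in d.items())

# Woman ⊓ ∀child.Woman, with role words written as strings of role names:
c = {"Woman": {"", "child"}}
# Woman alone:
d = {"Woman": {""}}

print(subsumes(d, c))  # True: Woman subsumes Woman ⊓ ∀child.Woman
print(subsumes(c, d))  # False: the converse does not hold
```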
The need for this facility became apparent when dealing with large knowledge bases, involving concepts whose description spans multiple pages of output: in many cases, such concepts carried details that either were obviously true (e.g., the age of a person is a number) or were intended for some internal function (e.g., graphical display) rather than domain modeling. In either case, both the printing and the explaining of results provided by the more traditional inference services [24, 23] required pruning. In projects using CLASSIC, pruning of the description resulted in concepts that were approximately an order of magnitude smaller. In small applications such as [25], this actually saved 3-5 pages of printout; in larger applications such as [33, 26] it might save up to 30 pages. This pruning mechanism was first formalized [23] as a purely syntactic match involving terms/concepts with variables, and then given a semantics and a syntactic implementation in [9]. Given a concept pattern D (i.e., a concept description containing variables) and a concept description C without variables, the matching problem introduced in [9] asks for a substitution σ (of the variables by concept descriptions) such that C ⊑ σ(D). More precisely, one is interested in a "minimal" solution of the matching problem, i.e., σ should satisfy the property that there does not exist a substitution δ such that C ⊑ δ(D) ⊏ σ(D). For example, the minimal matcher of the pattern D := ∀research-interests.X against
the description
C := ∀pets.Cat ⊓ ∀research-interests.AI ⊓ ∀hobbies.Gardening
assigns AI to the variable X, and thus finds the scientific interests (in this case Artificial Intelligence) described in the concept. (The concept pattern can be thought of as a "format statement", describing what information is to be displayed (or explained), if the pattern matches successfully against a specific concept. If there is no match, nothing is displayed.) In some cases, this pruning effect can be improved by imposing additional side conditions on the solutions of matching problems. For example, the information that the research interests lie in the area of Artificial Intelligence may not provide interesting information if our knowledge base is concerned only with AI researchers. A side condition stating that the solutions for the variable X must be subsumed by KR would make sure that matching succeeds only if the research interests belong to (a subfield of) Knowledge Representation. Thus, the description C from above no longer matches the pattern D, whereas

C′ := ∀pets.Cat ⊓ ∀research-interests.DL ⊓ ∀hobbies.Gardening
would still yield a solution (provided that DL is defined by a description that is subsumed by KR). In some cases we would like to have a matching process which succeeds only if the variable X is substituted for by a value that is strictly subsumed by some description (or pattern). The utility of such strict side conditions might be more clearly seen in an example where the concept Person is known to have number restrictions on the age attribute, and we are interested in seeing the value restriction for age only if it represents some additional (i.e., stricter) constraint. Another point worth noting is that according to the standard Description Logic semantics, every description is subsumed by all concepts of the form ∀R.⊤, where ⊤ denotes the universal concept. Hence the pattern D above (concerning research interests) matches every concept. Side conditions requiring the value substituted for a variable to be strictly subsumed by ⊤ prevent such "trivial" matches.

Matching algorithms for a DL containing most of the constructs available in CLASSIC are introduced in [23] and [9]. These algorithms are based on the role-centered normal form¹ of concept descriptions usually employed by structural subsumption algorithms. The main drawback of these algorithms is that they cannot treat arbitrary matching problems since they require the concept pattern to be in structural normal form. In [5], Baader and Narendran consider unification of concept descriptions in FL0, which allows for conjunction (⊓), value restriction (∀R.C), and the top concept (⊤). Matching modulo equivalence, i.e., the question whether, for a given pattern D and a description C, there exists a substitution σ such that C ≡ σ(D),

¹ We call this normal form "role-centered" since it groups sub-descriptions by role names, whereas the concept-centered normal form used in this paper groups value restrictions by concept names.
can be seen as a special case of unification where one of the descriptions (namely C) does not contain variables. Since C ⊑ σ(D) iff C ≡ σ(C ⊓ D), matching modulo subsumption (as introduced above) is an instance of matching modulo equivalence. The polynomial matching algorithm described in [5] does not impose restrictions on the form of the patterns. However, it is restricted to the small language FL0. In the present paper, we show that this algorithm can be extended to treat matching in languages allowing for inconsistent concept descriptions, namely FL⊥, which extends FL0 by the bottom concept (⊥), and FL¬, which extends FL⊥ by primitive negation (¬A, where A is an atomic concept). In addition, we consider matching under additional conditions on the variable bindings, which also arose in examples in [25, 23] and were responsible for about 25% of our space savings in our deployed example. In this paper, we consider two different variants of these "side conditions": subsumption conditions and strict subsumption conditions. Subsumption conditions are of the form X ⊑ E, where X is a variable and E is a pattern (i.e., it may contain variables), and they restrict the matchers to substitutions σ satisfying σ(X) ⊑ σ(E). It should be noted that such a side condition is not a matching problem since variables may occur on both sides. We shall see, however, that in many cases matching under subsumption conditions can be reduced in polynomial time to matching without subsumption conditions. In contrast, strict subsumption conditions may increase the complexity of the matching problem. Such conditions are of the form X ⊏ E, where X is a variable and E is a pattern, and they restrict the matchers to substitutions σ satisfying σ(X) ⊑ σ(E) and σ(X) ≢ σ(E). We shall show that, even for the small language FL0, matching under strict subsumption conditions is NP-hard.
2 Formal Preliminaries
In this section, we first introduce syntax and semantics of the description languages considered in this paper. Then, we formally introduce matching problems, and state some simple results about matching problems and their solutions.
Definition 1. Let C and R be disjoint finite sets representing the set of atomic concepts and the set of atomic roles. The set of all FL¬-concept descriptions over C and R is inductively defined as follows:

- Every element of C is a concept description (atomic concept).
- The symbols ⊤ (top concept) and ⊥ (bottom concept) are concept descriptions.
- If A ∈ C, then ¬A is a concept description (atomic negation).
- If C and D are concept descriptions, then C ⊓ D is a concept description (concept conjunction).
- If C is a concept description and R ∈ R is an atomic role, then ∀R.C is a concept description (value restriction).
In the sublanguage FL0 of FL¬, atomic negation and ⊥ may not be used, whereas in FL⊥ only atomic negation is disallowed. The following definition provides a model-theoretic semantics for FL¬ and its sublanguages:

Definition 2. An interpretation I consists of a nonempty set Δ^I, the domain of the interpretation, and an interpretation function that assigns to every atomic concept A ∈ C a set A^I ⊆ Δ^I, and to every atomic role R ∈ R a binary relation R^I ⊆ Δ^I × Δ^I. The interpretation function is extended to complex concept descriptions as follows:

⊤^I := Δ^I,
⊥^I := ∅,
(¬A)^I := Δ^I \ A^I,
(C ⊓ D)^I := C^I ∩ D^I,
(∀R.C)^I := {d ∈ Δ^I | ∀e ∈ Δ^I : (d, e) ∈ R^I → e ∈ C^I}.
Based on this semantics, subsumption and equivalence of concept descriptions are defined as follows: Let C and D be FL¬-concept descriptions.

- C is subsumed by D (C ⊑ D) iff C^I ⊆ D^I for all interpretations I.
- C is equivalent to D (C ≡ D) iff C^I = D^I for all interpretations I.
- C is strictly subsumed by D (C ⊏ D) iff C ⊑ D and C ≢ D.
In order to define matching of concept descriptions, we must introduce the notion of a concept pattern and of substitutions operating on patterns. For this purpose, we introduce an additional set of symbols X (concept variables), which is disjoint from C ∪ R.

Definition 3. The set of all FL¬-concept patterns over C, R, and X is inductively defined as follows:

- Every concept variable X ∈ X is a pattern.
- Every FL¬-concept description over C and R is a pattern.
- If C and D are concept patterns, then C ⊓ D is a concept pattern.
- If C is a concept pattern and R ∈ R is an atomic role, then ∀R.C is a concept pattern.
Thus, concept variables can be used like atomic concepts, with the only difference being that atomic negation may not be applied to variables. A substitution σ is a mapping from X into the set of all FL¬-concept descriptions. This mapping is extended to concept patterns in the obvious way, i.e.,

- σ(A) := A and σ(¬A) := ¬A for all A ∈ C,
- σ(⊤) := ⊤ and σ(⊥) := ⊥,
- σ(C ⊓ D) := σ(C) ⊓ σ(D), and
- σ(∀R.C) := ∀R.σ(C).
For example, applying the substitution σ := {X ↦ A ⊓ ∀R.A, Y ↦ B} to the pattern X ⊓ Y ⊓ ∀R.X yields the description A ⊓ (∀R.A) ⊓ B ⊓ ∀R.(A ⊓ ∀R.A). Obviously, the result of applying a substitution to an FL¬-concept pattern is an FL¬-concept description.² An FL0-substitution maps concept variables to FL0-concept descriptions. FL⊥-substitutions are defined analogously. Subsumption can be extended to substitutions as follows: the substitution σ is subsumed by the substitution τ (σ ⊑ τ) iff σ(X) ⊑ τ(X) for all variables X ∈ X.
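A sketch of how such a substitution can be applied mechanically, with patterns encoded as nested tuples (a representation invented here for illustration; the paper works abstractly):

```python
# Patterns as nested tuples: ("and", p, q), ("all", R, p),
# ("var", X), ("atom", A). Conjunction is binary, so the pattern
# X ⊓ Y ⊓ ∀R.X is written with nested "and" nodes.

def apply_subst(sigma, p):
    kind = p[0]
    if kind == "var":
        return sigma[p[1]]                      # replace variable by its image
    if kind == "atom":
        return p
    if kind == "and":
        return ("and", apply_subst(sigma, p[1]), apply_subst(sigma, p[2]))
    if kind == "all":
        return ("all", p[1], apply_subst(sigma, p[2]))
    raise ValueError(f"unknown node: {kind}")

# sigma := {X -> A ⊓ ∀R.A, Y -> B}, as in the example above
sigma = {"X": ("and", ("atom", "A"), ("all", "R", ("atom", "A"))),
         "Y": ("atom", "B")}
pattern = ("and", ("var", "X"),
           ("and", ("var", "Y"), ("all", "R", ("var", "X"))))
result = apply_subst(sigma, pattern)  # A ⊓ ∀R.A ⊓ B ⊓ ∀R.(A ⊓ ∀R.A)
```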
Definition 4. An FL¬-matching problem is of the form C ≡? D where C is an FL¬-concept description and D is an FL¬-concept pattern. A solution or matcher of this problem is a substitution σ such that C ≡ σ(D). A subsumption condition in FL¬ is of the form X ⊑? E where X is a concept variable and E is an FL¬-concept pattern. The substitution σ satisfies this condition iff σ(X) ⊑ σ(E). A strict subsumption condition in FL¬ is of the form X ⊏? E where X is a concept variable and E is an FL¬-concept pattern. The substitution σ satisfies this condition iff σ(X) ⊏ σ(E).

Matching problems and (strict) subsumption conditions in FL0 and FL⊥ are defined analogously. Note that the solutions are then constrained to belong to the respective sublanguage. Instead of a single matching problem, we may also consider a finite system {C1 ≡? D1, ..., Cm ≡? Dm} of such problems. The substitution σ is a solution of this system iff it is a solution of all the matching problems Ci ≡? Di contained in the system. However, it is easy to see that solving systems of matching problems can be reduced (in linear time) to solving a single matching problem.

Lemma 1. Let R1, ..., Rm be distinct atomic roles. Then σ solves the system {C1 ≡? D1, ..., Cm ≡? Dm} iff it solves the single matching problem ∀R1.C1 ⊓ ... ⊓ ∀Rm.Cm ≡? ∀R1.D1 ⊓ ... ⊓ ∀Rm.Dm.
Consequently, we may (without loss of generality) restrict our attention to single matching problems with or without finite sets of (strict) subsumption conditions. In [9, 23], a different type of matching problem has been considered. We will refer to those problems as matching problems modulo subsumption in order to distinguish them from the matching problems modulo equivalence introduced above.

Definition 5. A matching problem modulo subsumption is of the form C ⊑? D where C is a concept description and D is a pattern. A solution of this problem is a substitution σ satisfying C ⊑ σ(D).

² Note that this would not be the case if we had allowed the application of negation to concept variables.
For any description language allowing conjunction of concepts, matching modulo subsumption can be reduced (in linear time) to matching modulo equivalence:

Lemma 2. The substitution σ solves the matching problem C ⊑? D iff it solves C ≡? C ⊓ D.
For FL¬, and more generally for any description language in which variables in patterns may only occur in the scope of "monotonic" operators, solvability of matching problems modulo subsumption can be reduced to subsumption:

Lemma 3. Let C ⊑? D be a matching problem modulo subsumption in FL¬, and let σ⊤ be the substitution that replaces each variable by ⊤. Then C ⊑? D has a solution iff σ⊤ solves C ⊑? D.

Thus, solvability of matching problems modulo subsumption in FL¬ and its sublanguages is not an interesting new problem. This changes, however, if we consider such matching problems together with additional (strict) subsumption conditions. In fact, these conditions may exclude the trivial solution σ⊤. In addition, one is usually not interested in an arbitrary solution of the matching problem C ⊑? D, but rather in computing a "minimal" solution:

Definition 6. Let C ⊑? D be a matching problem modulo subsumption. The solution σ of C ⊑? D is called minimal iff there does not exist a substitution δ such that C ⊑ δ(D) ⊏ σ(D).

Lemma 4. Let C ⊑? D be an FL¬-matching problem modulo subsumption. If σ is the least solution of C ⊑? D w.r.t. subsumption of substitutions, i.e., σ ⊑ δ for all solutions δ, then σ is also a minimal solution.

Proof. This is an immediate consequence of the following fact, which can easily be proved by induction on the structure of FL¬-concept patterns: if σ ⊑ δ, then σ(D) ⊑ δ(D) for any FL¬-concept pattern D.

It should be noted that talking about the least solution is a slight abuse of language since the least solution of a given matching problem is unique only up to equivalence: if σ and τ are both least solutions of the same matching problem then they subsume each other, which means that σ(X) ≡ τ(X) for all variables X ∈ X.

The converse of Lemma 4 need not hold. For example, for the matching problem ∀R.A ⊑? ∀R.A ⊓ ∀R.X, the substitutions σ := {X ↦ A} and τ := {X ↦ ⊤} are both minimal solutions, but τ obviously cannot be a least solution. This example also demonstrates that minimal solutions of a given matching problem need not be unique up to equivalence.
3 Matching in FL⊥
The purpose of this section is to show that solvability of FL⊥-matching problems can be decided in polynomial time. In addition, for matching problems modulo subsumption we can compute a minimal solution in polynomial time. Our algorithm is based on a "concept-centered" normal form for FL⊥-concept descriptions.

First, let us recall the concept-centered normal form for FL0-concept descriptions introduced in [5]. It is easy to see that any FL0-concept description can be transformed into an equivalent description that is either ⊤ or a (nonempty) conjunction of descriptions of the form ∀R1.···.∀Rm.A for m ≥ 0 (not necessarily distinct) atomic roles R1, ..., Rm and an atomic concept A ≠ ⊤. We abbreviate ∀R1.···.∀Rm.A by ∀R1...Rm.A, where R1...Rm is considered as a word over the alphabet Σ := R of all atomic roles. In addition, instead of ∀w1.A ⊓ ... ⊓ ∀wl.A we write ∀L.A where L := {w1, ..., wl} is a finite set of words over Σ. The term ∀∅.A is considered to be equivalent to ⊤. Using these abbreviations, any pair of FL0-concept descriptions C, D containing the atomic concepts A1, ..., Ak can be rewritten as

C ≡ ∀U1.A1 ⊓ ... ⊓ ∀Uk.Ak and D ≡ ∀V1.A1 ⊓ ... ⊓ ∀Vk.Ak,

where Ui, Vi are finite sets of words over the alphabet of all atomic roles. This normal form provides us with the following characterization of equivalence of FL0-concept descriptions [5]:

Lemma 5. Let C, D be FL0-concept descriptions with normal forms as introduced above. Then C ≡ D iff Ui = Vi for all i, 1 ≤ i ≤ k.

This characterization can in turn be used to reduce matching of FL0-concept descriptions to a certain formal language problem, which can easily be shown to be solvable in polynomial time (see [5] for details). If we treat ⊥ like an arbitrary atomic concept, FL⊥-concept descriptions C, D can still be represented in the form³

C ≡ ∀U0.⊥ ⊓ ∀U1.A1 ⊓ ... ⊓ ∀Uk.Ak and D ≡ ∀V0.⊥ ⊓ ∀V1.A1 ⊓ ... ⊓ ∀Vk.Ak.
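The concept-centered normal form can be computed by a single traversal that accumulates the role word on the path to each atomic concept; a Python sketch (the tuple encoding is assumed here for illustration):

```python
# Descriptions as nested tuples: ("atom", A), ("and", p, q), ("all", R, p).
# normal_form returns {atomic concept: set of role words}, i.e. the sets
# U_A of the concept-centered normal form.

def normal_form(desc):
    nf = {}
    def walk(d, word):
        if d[0] == "atom":
            nf.setdefault(d[1], set()).add(word)
        elif d[0] == "and":
            walk(d[1], word)
            walk(d[2], word)
        elif d[0] == "all":
            walk(d[2], word + d[1])  # append the role name to the word
    walk(desc, "")
    return nf

# Woman ⊓ ∀child.Woman versus (∀child.Woman) ⊓ Woman:
c = ("and", ("atom", "Woman"), ("all", "child", ("atom", "Woman")))
d = ("and", ("all", "child", ("atom", "Woman")), ("atom", "Woman"))
# By Lemma 5, the two are equivalent iff the normal forms coincide.
print(normal_form(c) == normal_form(d))  # True
```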
However, equivalence of the descriptions no longer corresponds to equality of the languages Ui and Vi. The reason is that ∀R1...Rm.⊥ is subsumed by any value restriction of the form ∀R1...Rm.∀Rm+1...∀Rm+n.A. This fact is taken into account by the following characterization of equivalence of FL⊥-concept descriptions:

Lemma 6. Let C, D be FL⊥-concept descriptions with FL0-normal forms as introduced above. Then C ≡ D iff

U0·Σ* = V0·Σ* and Ui ∪ U0·Σ* = Vi ∪ V0·Σ* for all i, 1 ≤ i ≤ k,

³ We shall call this the FL0-normal form of the descriptions.
where Σ* is the set of all words over the alphabet of all atomic roles and · stands for concatenation.

Proof. Assume that the right-hand side of the equivalence stated in the lemma holds. It is sufficient to show that this implies C ⊑ D (since D ⊑ C then follows by symmetry). Considering the normal form of D, this means that we must show that for all w ∈ V0 we have (1) C ⊑ ∀w.⊥, and for all i, 1 ≤ i ≤ k, and all w ∈ Vi we have (2) C ⊑ ∀w.Ai. Thus, let w ∈ V0. By assumption, V0 ⊆ V0·Σ* = U0·Σ*, which implies that there exist a word u ∈ U0 and v ∈ Σ* such that w = uv. Thus, the normal form for C contains the conjunct ∀u.⊥. Since ∀u.⊥ ⊑ ∀uv.⊥ for any word v, we have established that (1) holds. Property (2) can be shown similarly.

Conversely, assume that the right-hand side of the equivalence stated in the lemma does not hold, i.e., (1) U0·Σ* ≠ V0·Σ*, or for some i, 1 ≤ i ≤ k, (2)
Ui ∪ U0·Σ* ≠ Vi ∪ V0·Σ*.

First, we assume that (1) holds. Without loss of generality we may assume that there exists a word w := R1...Rm ∈ Σ* such that w ∈ U0·Σ* and w ∉ V0·Σ*. We claim that this implies D ⋢ C, and thus C ≢ D. In order to prove this claim, we construct an interpretation I as follows: the domain Δ^I := {d0, ..., dm} consists of m + 1 distinct individuals; the interpretation of the atomic concepts is given by Ai^I := Δ^I; finally, each atomic role S is interpreted as S^I := {(di−1, di) | 1 ≤ i ≤ m, S = Ri}. It is easy to see that this interpretation satisfies d0 ∈ (∀u.Ai)^I for all words u ∈ Σ* (since Ai^I = Δ^I), and d0 ∈ (∀u.⊥)^I for all words u that are not a prefix of w = R1...Rm. Consequently, d0 ∈ (∀u.Ai)^I for all u ∈ Vi. In addition, w ∉ V0·Σ* implies that no word in V0 is a prefix of w, and thus d0 ∈ (∀u.⊥)^I for all words u ∈ V0. This shows that d0 ∈ D^I. However, by construction, d0 ∉ (∀w.⊥)^I, which implies d0 ∉ C^I.

Second, we assume that (1) does not hold, i.e., U0·Σ* = V0·Σ*, and that (2) holds. Without loss of generality we may assume that there exists a word w := R1...Rm ∈ Σ* such that w ∈ Ui and w ∉ Vi ∪ V0·Σ*. Again, we claim that this implies D ⋢ C, and thus C ≢ D. In order to prove this claim, we construct an interpretation I as follows: the domain Δ^I := {d0, ..., dm} consists of m + 1 distinct individuals; the interpretation of the atomic concepts Aj for j ≠ i is given by Aj^I := Δ^I; the interpretation of Ai is Ai^I := Δ^I \ {dm}; finally, each atomic role S is interpreted as S^I := {(dj−1, dj) | 1 ≤ j ≤ m, S = Rj}. By construction d0 ∉ (∀w.Ai)^I, and thus d0 ∉ C^I. On the other hand, it is easy to show (using arguments that are similar to the ones employed in the first case) that d0 ∈ D^I.

If D is an FL⊥-concept pattern containing the variables X1, ..., Xl, then its FL0-normal form is of the form

D ≡ ∀V0.⊥ ⊓ ∀V1.A1 ⊓ ... ⊓ ∀Vk.Ak ⊓ ∀W1.X1 ⊓ ... ⊓ ∀Wl.Xl.
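Membership in a language of the form L·Σ* is just a prefix test, so the characterization of Lemma 6 is directly executable. A Python sketch (our representation, not the paper's): once U0·Σ* = V0·Σ* has been checked, the remaining equations reduce to inclusions between the finite parts.

```python
def in_lang(w, gen):
    """w ∈ gen·Σ* iff some generator word is a prefix of w."""
    return any(w.startswith(g) for g in gen)

def equivalent(U0, Us, V0, Vs):
    """Equivalence test of Lemma 6 on FL⊥ normal forms: U0, V0 generate
    the ⊥-languages; Us[i], Vs[i] are the word sets for atomic concept i."""
    # U0·Σ* = V0·Σ* iff the generator sets cover each other.
    if not (all(in_lang(u, V0) for u in U0) and
            all(in_lang(v, U0) for v in V0)):
        return False
    # Given equal ⊥-languages, Ui ∪ U0·Σ* = Vi ∪ V0·Σ* reduces to
    # Ui ⊆ Vi ∪ V0·Σ* and Vi ⊆ Ui ∪ U0·Σ*.
    for Ui, Vi in zip(Us, Vs):
        if not all(u in Vi or in_lang(u, V0) for u in Ui):
            return False
        if not all(v in Ui or in_lang(v, U0) for v in Vi):
            return False
    return True

# ∀{RR,SS}.⊥ ⊓ ∀{RS}.A1 versus ∀{RR,SS}.⊥ ⊓ ∀{RS,RRS}.A1:
# the extra word RRS is redundant, since its prefix RR already forces ⊥.
print(equivalent({"RR", "SS"}, [{"RS"}], {"RR", "SS"}, [{"RS", "RRS"}]))  # True
```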
If we want to match D with the description C (with normal form as above), we must solve the following "formal language" equations (where the Xj,i are interpreted as variables for finite sets of words):
(⊥) U0·Σ* = V0·Σ* ∪ W1·X1,0·Σ* ∪ ... ∪ Wl·Xl,0·Σ*,

and for all i, 1 ≤ i ≤ k,

(Ai) Ui ∪ U0·Σ* = Vi ∪ W1·X1,i ∪ ... ∪ Wl·Xl,i ∪ U0·Σ*.
Theorem 1. Let C be an FL⊥-concept description and D an FL⊥-concept pattern with FL0-normal forms as introduced above. Then the matching problem C ≡? D has a solution iff the formal language equations (⊥) and (A1), ..., (Ak) are each solvable.
Proof. Let

σ := {X1 ↦ ∀L1,0.⊥ ⊓ ∀L1,1.A1 ⊓ ... ⊓ ∀L1,k.Ak, ..., Xl ↦ ∀Ll,0.⊥ ⊓ ∀Ll,1.A1 ⊓ ... ⊓ ∀Ll,k.Ak}

be a substitution.⁴ By employing elementary equivalences between concept descriptions we can show that the FL0-normal form of σ(D) is

σ(D) ≡ ∀(V0 ∪ W1·L1,0 ∪ ... ∪ Wl·Ll,0).⊥ ⊓ ∀(V1 ∪ W1·L1,1 ∪ ... ∪ Wl·Ll,1).A1 ⊓ ... ⊓ ∀(Vk ∪ W1·L1,k ∪ ... ∪ Wl·Ll,k).Ak.

Lemma 6 implies that C ≡ σ(D) iff

(1) U0·Σ* = (V0 ∪ W1·L1,0 ∪ ... ∪ Wl·Ll,0)·Σ*,

and for all i, 1 ≤ i ≤ k,

(2) Ui ∪ U0·Σ* = Vi ∪ W1·L1,i ∪ ... ∪ Wl·Ll,i ∪ (V0 ∪ W1·L1,0 ∪ ... ∪ Wl·Ll,0)·Σ*.

Since concatenation distributes over union, (1) corresponds to the fact that the assignment X1,0 := L1,0, ..., Xl,0 := Ll,0 solves equation (⊥). In addition, if we already know that (1) holds, then (2) corresponds to the fact that the assignment X1,i := L1,i, ..., Xl,i := Ll,i solves equation (Ai). This shows how a solution σ of the matching problem C ≡? D yields solutions of the equations (⊥), (A1), ..., (Ak), and conversely how solutions of these equations can be used to construct a matcher σ.
Example 1. As a running example, we will consider the problem of matching the pattern

D := X1 ⊓ (∀R.X1) ⊓ (∀S.X2)

⁴ Without loss of generality we restrict our attention to the images of variables occurring in D, and assume that σ introduces only atomic concepts occurring in C or D.
against the description

C := ∀R.((∀S.A1) ⊓ (∀R.⊥)) ⊓ ∀S.∀S.⊥.

The FL0-normal forms of C and D are

C ≡ ∀{RR, SS}.⊥ ⊓ ∀{RS}.A1 and D ≡ ∀∅.⊥ ⊓ ∀∅.A1 ⊓ ∀{ε, R}.X1 ⊓ ∀{S}.X2.

Thus, the matching problem C ≡? D is translated into the following two equations:
(• {RR, S S } . Z * = 0.E* U {~, R}.Xl,0.~* U {S}.X2,0.E*, (A1) { R S } U { R R , S S ) . ~ * = $ U { e , R } . X 1 j U {S}.X2j U { R R , S S ) . ~ * . If we want to utilize Theorem 1 for deciding matching problems in 5rs177we must show how solvability of the equations ( l ) , (A1), ..., (Ak) can be tested. First, we address the problem of solving equation (_l_). Lemma
7. Equation ( I ) has a solution iff replacing Xj,o'~* by the sets
N
weW~
solves equation (1). 5 Proof. To show the only-if direction, we assume that the assignment X1;o := M l , o , . . . , X~,o := Mt,o solves equation (3_). First, we prove that Mj,o.~* C_ nwew~ w-1 .(Uo-E*) ho2ds for all j, 1 < j _ t. Thus, let v E Mj,o-Z* and w e Wj. Since Wj.Mj,o.Z C_ Uo.Z*, we know that wv E U0.Z*, and thus v E w-l.(Uo.Z*). This shows that Mj,o.E* C_ w - l . ( U o . Z *) for all w E Wj, and thus Mj,o.~* C_ n ~ e w j w-l"(Uo'Z*) 9 As an immediate consequence, we obtain Uo.Z* = Vo.~* U W1.MI,o.~* U . . . U Wt.Ml,o.Z*
VoZ. u
N
wEW1
u... u w,
N
wEWl
It remains to be shown that the inclusion in the other direction holds as well. Obviously, we have Vo.E* C_ U0.Z* since there exists a solution of ( l ) . To conclude the proof of the only-if direction, assume that u E Wj and v E We must show that uv e Uo.Z*. Obviously, u E Wj implies
v E u-l.(Uo.E*), and thus uv E Uo.Z*. To prove the if direction, it is sufficient to show that there exist finite sets of words Lj,0 (j = 1 , . . . , t ) such that Lj,0"E 9 = n ~ e w . w - 1 . ( u 0 . z * ). This is an immediate consequence of the fact that languages of t~e form L.E* for finite L are closed under (binary) intersection and left quotients (see (1) and (2) of Lemma 8 below). 5 For a word w and a set of words L we have w - l . L := {u wu E L}. This language is called a left quotient of L.
Lemma 8. Let U, V be finite languages and w a word.

1. There exists a finite language L1 such that L1·Σ* = w⁻¹·(U·Σ*).
2. There exists a finite language L2 such that L2·Σ* = U·Σ* ∩ V·Σ*.
3. U·Σ* ∪ V·Σ* = (U ∪ V)·Σ* and U·(V·Σ*) = (U·V)·Σ*.

Proof. (1) Since (uv)⁻¹·L = v⁻¹·(u⁻¹·L) for all languages L, it is sufficient to consider the case where w has length 1, i.e., w ∈ Σ. We distinguish two cases:

- If the empty word ε belongs to U, then U·Σ* = Σ* = w⁻¹·Σ*, and thus we can take L1 := {ε}.
- If ε ∉ U, then our assumption that the length of w is 1 implies that w⁻¹·(U·Σ*) = (w⁻¹·U)·Σ*, and thus we can take L1 := w⁻¹·U, which is finite since U is finite.
(2) It is easy to see that we can take L_2 := (U ∩ V·Σ*) ∪ (V ∩ U·Σ*). (3) is trivial.

For the matching problem of Example 1, we replace X_1·Σ* by

R⁻¹·({RR, SS}·Σ*) ∩ ε⁻¹·({RR, SS}·Σ*) = {R}·Σ* ∩ {RR, SS}·Σ* = {RR}·Σ*

and X_2·Σ* by

S⁻¹·({RR, SS}·Σ*) = {S}·Σ*.

It is easy to see that this replacement solves the equation. The finite languages L_{j,0} are defined as L_{1,0} := {RR} and L_{2,0} := {S}.

Now, let us consider the equations (A_i) for 1 ≤ i ≤ k.

Lemma 9. Equation (A_i) has a solution iff replacing the variables X_{j,i} by the sets L̂_{j,i} := ∩_{w∈W_j} w⁻¹·(U_i ∪ U_0·Σ*) yields a solution of (A_i).
Proof. The proof of the only-if direction is very similar to the proof of this direction for Lemma 7. In particular, one can show that any assignment X_{1,i} := M_{1,i}, …, X_{t,i} := M_{t,i} that solves (A_i) satisfies M_{j,i} ⊆ L̂_{j,i}. To prove the if direction, it is sufficient to show that there exist finite sets of words L_{j,i} such that W_j·L̂_{j,i} ∪ U_0·Σ* = W_j·L_{j,i} ∪ U_0·Σ*. We have L̂_{j,i} = ∩_{w∈W_j} (w⁻¹·U_i ∪ w⁻¹·(U_0·Σ*)). By applying distributivity of intersection over union, this intersection of unions can be transformed into a union of intersections. Except for the intersection ∩_{w∈W_j} w⁻¹·(U_0·Σ*), all the intersection expressions in this union contain at least one language w⁻¹·U_i for a word w ∈ W_j. Since U_i is finite, this shows that ∩_{w∈W_j} w⁻¹·(U_0·Σ*) is the only (possibly) infinite language in the union. Consequently, if we define L_{j,i} := L̂_{j,i} \ ∩_{w∈W_j} w⁻¹·(U_0·Σ*), then L_{j,i} is a finite language.

In order to prove that W_j·L̂_{j,i} ∪ U_0·Σ* = W_j·L_{j,i} ∪ U_0·Σ*, it is sufficient to show that u ∈ W_j and v ∈ L̂_{j,i} \ L_{j,i} implies uv ∈ U_0·Σ*. By definition of L_{j,i}, we know that v ∈ ∩_{w∈W_j} w⁻¹·(U_0·Σ*), and thus u ∈ W_j implies uv ∈ U_0·Σ*.
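The computations with languages of the form L·Σ* used in Lemmas 7–9 can be made concrete in a small sketch. The representation and function names below are ours, not the paper's: a language L·Σ* is stored as the finite set L of its generators, and the Example 1 data (U_0 = {RR, SS}, W_1 = {R, ε}, W_2 = {S}) come from the text above.

```python
# A (possibly infinite) language L·Σ* is represented by the finite set L of
# its generators; function names are ours, not the paper's.

def member(word, L):
    # word ∈ L·Σ* iff some prefix of word lies in L
    return any(word[:i] in L for i in range(len(word) + 1))

def quotient(letter, L):
    # Lemma 8(1): a finite L1 with L1·Σ* = letter⁻¹·(L·Σ*)
    if "" in L:                  # ε ∈ L means L·Σ* = Σ*
        return {""}
    return {w[1:] for w in L if w.startswith(letter)}

def intersect(U, V):
    # Lemma 8(2): (U ∩ V·Σ*) ∪ (V ∩ U·Σ*) generates U·Σ* ∩ V·Σ*
    return {u for u in U if member(u, V)} | {v for v in V if member(v, U)}

# Equation (⊥) in Example 1: U0 = {RR, SS}, W1 = {R, ε}, W2 = {S}
U0 = {"RR", "SS"}
L10 = intersect(quotient("R", U0), U0)   # ε⁻¹·(U0·Σ*) is U0·Σ* itself
L20 = quotient("S", U0)
print(L10, L20)   # {'RR'} {'S'}
```

The two printed sets reproduce the candidate languages L_{1,0} = {RR} and L_{2,0} = {S} computed above.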
For the matching problem of Example 1, we have

L̂_{1,1} = R⁻¹·({RS} ∪ {RR, SS}·Σ*) ∩ ε⁻¹·({RS} ∪ {RR, SS}·Σ*)
        = ({S} ∪ {R}·Σ*) ∩ ({RS} ∪ {RR, SS}·Σ*) = {RS} ∪ {RR}·Σ*,

L̂_{2,1} = S⁻¹·({RS} ∪ {RR, SS}·Σ*) = {S}·Σ*.
Again, it is easy to see that replacing the variables X_{j,1} by L̂_{j,1} yields a solution of equation (A_1). The finite languages L_{j,1} are defined as L_{1,1} := {RS} and L_{2,1} := ∅.

Lemma 7 and 9 provide us with a polynomial algorithm for deciding solvability of matching problems in FL⊥.

Theorem 2. Solvability of matching problems in FL⊥ can be decided in polynomial time.

Proof. Obviously, Lemma 7 and 9 provide us with an effective method for testing matching problems in FL⊥ for solvability. It remains to be shown that this test can be realized in polynomial time. First, note that the combined size⁶ of the finite languages U_i and V_i is linear in the size of the concept description and the pattern. Thus, the size of the equations (⊥) and (A_i) is polynomial in the size of the original matching problem. Both for equation (⊥) and for equation (A_i) we compute a "candidate" for a solution and then test whether it really is a solution.

First, let us consider equation (⊥). Given the finite language U_0, we can construct (in polynomial time) a deterministic finite automaton that accepts the left-hand side U_0·Σ* of equation (⊥), and whose size is polynomial in the size of U_0. Regarding the right-hand side of equation (⊥), it is easy to see that computing the candidate and inserting it into the right-hand side can be done in polynomial time. To be more precise, we can compute (in polynomial time) a deterministic finite automaton accepting the (regular) language obtained by inserting the candidate solution into the right-hand side of equation (⊥), and the size of this automaton is polynomial in the size of the equation. In fact, in order to construct this automaton, we start with deterministic finite automata for U_0·Σ* and V_0·Σ*, and then apply the closure properties stated in Lemma 8 a polynomial number of times. It is easy to see that each closure operation can be realized as a polynomial operation on deterministic finite automata. Since equivalence of regular languages given by deterministic finite automata can be decided in time polynomial in the size of the automata,⁷ this shows that solvability of equation (⊥) can be tested in polynomial time.

⁶ As size of a finite language we take the sum of the lengths of the words occurring in the language.
⁷ Note that this would not be the case for nondeterministic finite automata.
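The candidate sets L̂_{j,i} for the equations (A_i) combine a finite part with a part of the form L·Σ*. Under the representation (F, G) for F ∪ G·Σ* (an encoding of ours, not the paper's), the computation of L̂_{1,1} in Example 1 can be replayed:

```python
# A pair (F, G) of finite sets represents the language F ∪ G·Σ*.
# Helper names are ours; the paper only asserts the closure properties.

def member(w, FG):
    F, G = FG
    return w in F or any(w[:i] in G for i in range(len(w) + 1))

def quotient(a, FG):
    # a⁻¹·(F ∪ G·Σ*) for a single letter a
    F, G = FG
    if "" in G:                 # ε ∈ G makes the Σ*-part all of Σ*
        return (set(), {""})
    return ({w[1:] for w in F if w.startswith(a)},
            {w[1:] for w in G if w.startswith(a)})

def intersect(X, Y):
    # finite part: words of either finite part lying in the other language;
    # Σ*-part: generators of each Σ*-part lying in the other Σ*-part
    F = {w for w in X[0] if member(w, Y)} | {w for w in Y[0] if member(w, X)}
    G = {u for u in X[1] if member(u, (set(), Y[1]))} | \
        {v for v in Y[1] if member(v, (set(), X[1]))}
    return (F, G)

# Example 1, equation (A1): U1 = {RS}, U0 = {RR, SS}, W1 = {R, ε}
U = ({"RS"}, {"RR", "SS"})
L11_hat = intersect(quotient("R", U), U)   # ε⁻¹ leaves the language unchanged
print(L11_hat)   # ({'RS'}, {'RR'}), i.e. {RS} ∪ {RR}·Σ*
```

The result matches the hand computation of L̂_{1,1} = {RS} ∪ {RR}·Σ* given in the text.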
The equations (A_i) can be treated similarly. We just have to extend our argument regarding closure properties and deterministic finite automata from languages of the form L·Σ* for finite L to languages of the form L ∪ L'·Σ* for finite L, L'.

The proofs of Lemma 7 and 9 also show how to compute a matcher of a given solvable FL⊥-matching problem in polynomial time. In fact, if the matching problem is solvable, then the following substitution σ is a matcher:

σ := {X_1 ↦ ∀L_{1,0}.⊥ ⊓ ⊓_{i=1}^{k} ∀L_{1,i}.A_i, …, X_t ↦ ∀L_{t,0}.⊥ ⊓ ⊓_{i=1}^{k} ∀L_{t,i}.A_i},
where the languages L_{j,0} (1 ≤ j ≤ t) are defined as in the proof of Lemma 7, and the languages L_{j,i} (1 ≤ j ≤ t, 1 ≤ i ≤ k) are defined as in the proof of Lemma 9. It should be noted that the language L_{j,i} = L̂_{j,i} \ ∩_{w∈W_j} w⁻¹·(U_0·Σ*) is a subset of ∪_{w∈W_j} w⁻¹·U_i, and thus its size is polynomial in the size of the matching problem. For the matching problem of Example 1, we thus obtain the matcher

{X_1 ↦ (∀R.∀R.⊥) ⊓ (∀R.∀S.A_1), X_2 ↦ ∀S.⊥}.
Lemma 10. Assume that the given FL⊥-matching problem C ≡? D is solvable. Then the substitution σ defined above is the least solution of C ≡? D.

Proof. Assume that

δ := {X_1 ↦ ∀M_{1,0}.⊥ ⊓ ⊓_{i=1}^{k} ∀M_{1,i}.A_i, …, X_t ↦ ∀M_{t,0}.⊥ ⊓ ⊓_{i=1}^{k} ∀M_{t,i}.A_i}
is another solution of C ≡? D. Consequently, the assignment X_{1,0} := M_{1,0}, …, X_{t,0} := M_{t,0} solves equation (⊥), and the assignment X_{1,i} := M_{1,i}, …, X_{t,i} := M_{t,i} solves (A_i). As shown in the proofs of Lemma 7 and 9, this implies that M_{j,0}·Σ* ⊆ L_{j,0}·Σ* (1 ≤ j ≤ t) and M_{j,i} ⊆ L̂_{j,i} (1 ≤ j ≤ t, 1 ≤ i ≤ k). As a consequence of Lemma 6, we can infer σ(X_j) ⊑ δ(X_j) from M_{j,0}·Σ* ⊆ L_{j,0}·Σ* and M_{j,i} ∪ M_{j,0}·Σ* ⊆ L_{j,i} ∪ L_{j,0}·Σ*. We already know that the first inclusion holds. For the second inclusion, it remains to be shown that M_{j,i} ⊆ L_{j,i} ∪ L_{j,0}·Σ*. This is an immediate consequence of M_{j,i} ⊆ L̂_{j,i} since L̂_{j,i} = L_{j,i} ∪ ∩_{w∈W_j} w⁻¹·(U_0·Σ*) and L_{j,0}·Σ* = ∩_{w∈W_j} w⁻¹·(U_0·Σ*).

This lemma, together with Lemma 4, immediately implies the following theorem:

Theorem 3. Let C ⊑? D be a solvable matching problem modulo subsumption. Then the least solution of C ≡? C ⊓ D is a minimal solution of C ⊑? D, and this solution can be computed in polynomial time.
4 Matching in FL¬
The results for matching in FL⊥ can easily be extended to the language FL¬. In principle, negated atomic concepts are treated like new atomic concepts. The fact that A ⊓ ¬A is inconsistent (i.e., equivalent to ⊥) is taken care of by extending the language in the value restriction for the concept ⊥ appropriately.

To be more precise, let C, D be FL¬-concept descriptions, and A_1, …, A_k the atomic concepts occurring in C, D. By treating the negated atomic concepts ¬A_i like new atomic concepts, we can transform C and D into their FL₀-normal forms:

C ≡ ∀U_0.⊥ ⊓ ∀U_1.A_1 ⊓ … ⊓ ∀U_k.A_k ⊓ ∀U_{k+1}.¬A_1 ⊓ … ⊓ ∀U_{2k}.¬A_k,
D ≡ ∀V_0.⊥ ⊓ ∀V_1.A_1 ⊓ … ⊓ ∀V_k.A_k ⊓ ∀V_{k+1}.¬A_1 ⊓ … ⊓ ∀V_{2k}.¬A_k.

If we define

Û_0 := U_0 ∪ ∪_{i=1}^{k} (U_i ∩ U_{k+i})   and   V̂_0 := V_0 ∪ ∪_{i=1}^{k} (V_i ∩ V_{k+i}),
then Lemma 6 can be generalized to FL¬ as follows:

Lemma 11. Let C, D be FL¬-concept descriptions with FL₀-normal forms as introduced above. Then C ≡ D iff

Û_0·Σ* = V̂_0·Σ*   and   U_i ∪ Û_0·Σ* = V_i ∪ V̂_0·Σ* for all i, 1 ≤ i ≤ 2k,

where Σ* is the set of all words over the alphabet of all atomic roles.

Consequently, all the results for matching in FL⊥ carry over to FL¬: we simply have to replace k by 2k and the sets U_0, V_0 by Û_0, V̂_0.

Theorem 4. Let C ≡? D be an FL¬-matching problem. Solvability of C ≡? D can be tested in polynomial time. If C ≡? D is solvable, then a least solution of C ≡? D can be computed in polynomial time.
5 Matching under side conditions
In the following, we consider strict subsumption conditions in more detail. For (non-strict) subsumption conditions we just briefly mention some results.
Strict subsumption conditions. Recall that a strict subsumption condition is of the form X ⊏? E, where X is a concept variable and E is a concept pattern. If the concept patterns of a set of strict subsumption conditions do not contain variables (i.e., the expressions E on the right-hand sides of the strict subsumption conditions are concept descriptions), then it is sufficient to compute a least solution of the matching problem, and then test whether this solution also solves the strict subsumption conditions.
Theorem 5. Let C ≡? D be an FL¬-matching problem, and X_1 ⊏? E_1, …, X_n ⊏? E_n strict subsumption conditions such that E_1, …, E_n are FL¬-concept descriptions. Then solvability of C ≡? D under these conditions is decidable in polynomial time.

If the right-hand sides of strict subsumption conditions may contain variables, then solvability becomes NP-hard, even for the language FL₀. This will be shown by reducing 3SAT [18] to the matching problem under strict subsumption conditions. Recall that matching modulo subsumption can be reduced to matching modulo equivalence, and that a system of matching problems can be coded into a single matching problem. For this reason, we may, without loss of generality, construct a problem that consists of matching problems modulo subsumption, matching problems modulo equivalence, and strict subsumption conditions.

Theorem 6. Matching under strict subsumption conditions is NP-hard, even for the small language FL₀.
Proof. Let A be an arbitrary concept name. For every propositional variable p occurring in the 3SAT problem, we introduce three concept variables, namely X_p, X_p̄, and Z_p, and two roles R_p and R_p̄. Using these concept variables and roles, we construct the matching statement

∀R_p.A ⊓ ∀R_p̄.A ⊑? Z_p,   (3)

and the strict subsumption condition

Z_p ⊏? ∀R_p.X_p ⊓ ∀R_p̄.X_p̄.   (4)
It is easy to see that the subsumption relationship between ∀R_p.A ⊓ ∀R_p̄.A and ∀R_p.X_p ⊓ ∀R_p̄.X_p̄ enforced by (3) and (4) implies that any solution σ of (3) and (4) satisfies

(σ(X_p) ≡ A ∨ σ(X_p) ≡ ⊤) ∧ (σ(X_p̄) ≡ A ∨ σ(X_p̄) ≡ ⊤).

In addition, the fact that this subsumption relationship must be strict implies

σ(X_p) ≡ ⊤ ∨ σ(X_p̄) ≡ ⊤.
Finally, if this solution also satisfies the matching statement

A ≡? X_p ⊓ X_p̄,   (5)

then we know that not both variables can be replaced by ⊤, i.e.,

σ(X_p) ≡ A ∨ σ(X_p̄) ≡ A.

This shows that, if we take ⊤ as the truth value 1 and A as the truth value 0, then any solution assigns either 0 or 1 to X_p, and the opposite truth value to X_p̄.
It remains to be shown that 3-clauses and the corresponding truth conditions can be encoded. We introduce a concept variable Z_c and three roles R_{c,1}, R_{c,2}, R_{c,3} for every 3-clause c in our 3SAT problem, and represent the clause by a matching problem together with a strict subsumption condition. For example, assume that c := p ∨ ¬q ∨ r. Then c is represented by the matching statement
∀R_{c,1}.A ⊓ ∀R_{c,2}.A ⊓ ∀R_{c,3}.A ⊑? Z_c,   (6)

and the strict subsumption condition

Z_c ⊏? ∀R_{c,1}.X_p ⊓ ∀R_{c,2}.X_q̄ ⊓ ∀R_{c,3}.X_r.   (7)
Obviously, the strict inclusion implied by these two statements can only be satisfied by a substitution σ if it assigns ⊤ to at least one of the variables X_p, X_q̄, and X_r. This completes our reduction.

Note that (5) is a matching problem modulo equivalence that cannot be represented by a matching problem modulo subsumption. Thus, it is still open whether the NP-hardness result also holds for matching modulo subsumption under strict subsumption conditions. Theorem 6 only provides a hardness result for matching under strict subsumption conditions. Thus, another open question is how to extend the matching algorithm for FL¬ to an algorithm that can also handle strict subsumption conditions.
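The reduction can be spelled out mechanically as a generator of matching statements and side conditions. The sketch below is hypothetical glue code with our own naming (R_~p and X_~p stand for the "barred" role and variable of p); a literal is encoded as a pair (variable, positive?):

```python
# Hypothetical generator spelling out the reduction of Theorem 6 as strings.
def reduce_3sat(clauses):
    stmts = []
    for p in sorted({v for c in clauses for v, _ in c}):
        stmts.append(f"∀R_{p}.A ⊓ ∀R_~{p}.A ⊑? Z_{p}")              # (3)
        stmts.append(f"Z_{p} ⊏? ∀R_{p}.X_{p} ⊓ ∀R_~{p}.X_~{p}")     # (4)
        stmts.append(f"A ≡? X_{p} ⊓ X_~{p}")                        # (5)
    for n, clause in enumerate(clauses, 1):
        xs = [f"X_{v}" if pos else f"X_~{v}" for v, pos in clause]
        lhs = " ⊓ ".join(f"∀R_c{n},{j}.A" for j in (1, 2, 3))
        rhs = " ⊓ ".join(f"∀R_c{n},{j}.{x}" for j, x in zip((1, 2, 3), xs))
        stmts.append(f"{lhs} ⊑? Z_c{n}")                            # (6)
        stmts.append(f"Z_c{n} ⊏? {rhs}")                            # (7)
    return stmts

# the single clause p ∨ ¬q ∨ r over three variables
stmts = reduce_3sat([[("p", True), ("q", False), ("r", True)]])
print(len(stmts))   # 11: three statements per variable, two per clause
```

The output size is linear in the 3SAT instance, as the hardness argument requires.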
Subsumption conditions. If the subsumption conditions do not introduce cyclic variable dependencies, then a matching problem with subsumption conditions can be reduced to an ordinary matching problem. To be more precise, the sequence of subsumption conditions X_1 ⊑? E_1, …, X_n ⊑? E_n is acyclic iff for all i, 1 ≤ i ≤ n, the pattern E_i does not contain the variables X_i, …, X_n. Given such an acyclic sequence of subsumption conditions, we can define a substitution⁸ σ inductively as follows:

σ(X_1) := Y_1 ⊓ E_1 and σ(X_i) := Y_i ⊓ σ(E_i)
(l> way. Consider the ~genuine)r car-crash rej~ort below: J'ai vu I'arriSre du vthicule B s'approcher. J'ai appuy6 h fond sur le frein, le vthicule B a continu6 h s'approcher. J'ai cru sur le moment que mes freins n'avaient pas r~pondu et nous avons rempli ce constat. J'ai immediatement apr~s essay6 rues freins qui fonclionnaient tr~s bien. J'ai mainlenant la certitude, la rue 6lant Idg~.rement en penle, que c'est le camion qui a recul6 et est venu me percuter (...)
I saw the back of vehicle B coming nearer. I pushed on the brake to the bottom, but vehicle B kept coming nearer. I believed, on the spot, that my brakes had failed and we filled the present report. Immediately after, I checked the brakes, which work quite well. I am now absolutely convinced that, as the street has a slight slope, it is the truck who was moving back and struck me (...)
Independently of the words used, the representation of what happened requires a rather complex description of some notions, e.g. beliefs and how to confirm them, in order to infer that the author now disagrees with what (s)he wrote in the report; on the other hand, the comprehension of this text requires only a very crude description of vehicles and of their dynamics (there is a brake; when it works, it prevents you from coming nearer the vehicle you see). In most other texts, the opposite holds: confirmation of beliefs is irrelevant, but considerations on the effect of speed and dampness on braking distances play an essential role. Having a single ontology to shelter all the knowledge that may come into play entails:
• reasoning in every case in the most complex setting (a very inefficient strategy), and
• handling a number of concepts presumably larger by an order of magnitude than the size of the vocabulary, hence the need to use e.g. STREET#1 (a line), STREET#2 (a surface, the causeway), STREET#3 (a skew surface, the causeway plus the sidewalks), STREET#4 (a volume, including the buildings opening onto the sidewalks), and so on.
The alternative is to use a simple ontology (e.g. STREET defined as a line) and to refine it when and where the need arises. In the above example, "brake" can at first be defined as a part of a vehicle, but clearly we need later to distinguish among
BRAKE#1 (the pedal), BRAKE#2 (the brake shoe), and BRAKE#3 (the whole device by which pressing on the pedal results in the shoe rubbing on the wheels). Having all sorts of brakes and streets cluttering the ontology of the domain of car-crashes is a very hard constraint to cope with, and a very useless one too! Having ontologies at different levels, with bridges between them, and reasoning at the simplest level, except when it is inconsistent to do so, seems much more attractive. But the difficulties must not be underestimated:
• designing a strategy to select the ontological level appropriate for a given situation is difficult,
• having efficient cooperation between levels which, by definition, do not share the same ontology is even harder,
• deciding when to start and stop a new level of reasoning cannot be grounded on any formal principle, but only on an empirical basis.
Even if these obstacles are fearsome and hence this "promising track" not really attractive, I believe that the relatively easy way explored by philosophers and logicians has shown its intrinsic limitations, and nothing really less fearsome might solve the problem. More technical developments of this idea can be found in previous papers [8, 5].
5 Summary

The less controversial examples of conceptual structures use mathematically perfectly defined notions (not going back as far in history as Port Royal's definition of comprehension and extension on triangles, it is worth noticing that e.g. Chein & Mugnier's paper [1] uses an ontology of polygons!). This is so because only in that case the quest for a unique ontology is admissible and successful. The acceptance of ontological multiplicity, which goes necessarily with the need for a process of conceptual adaptation, does not mean setting the fox (I mean the scruffiness inherent to natural languages) to mind the geese (I mean the neatness of the conceptual level); it is essential if we want to develop tools truly adapted to real-life domains, and not to their idealisation, but it can be done with a high standard of rigour. This idea provides by itself no methodology; at least, endorsing it means to stop looking for oversimplistic solutions where they obviously do not work.
References

1. Chein, M., Mugnier, M.-L.: Conceptual Graphs: Fundamental Notions. Revue d'Intelligence Artificielle, vol. 6 (1992) 365-406
2. Computational Intelligence. Special Issue on Non-Literal Language (D. Fass, J. Martin, E. Hinckelmann, eds.), vol. 8 (Aug. 1992)
3. Gayral, F., Kayser, D., Lévy, F.: Quelle est la couleur du feu rouge du Boulevard Henri IV ? In: Référence et anaphore, Revue VERBUM, tome XIX (1997) 177-200
4. Kayser, D.: Une sémantique qui n'a pas de sens. Langages (septembre 1987) 33-45
5. Kayser, D.: Le raisonnement à profondeur variable. Actes des journées nationales du GRECO-P.R.C. d'Intelligence Artificielle, Éditions Teknea, Toulouse (1988) 109-136
6. Kayser, D.: Terme et dénotation. La Banque des Mots, n° spécial 7-1995 (1996) 19-34
7. Kayser, D.: La sémantique lexicale est d'abord inférentielle. Langue Française, "Aux sources de la polysémie nominale" (P. Cadiot et B. Habert, eds.) (mars 1997) 92-106
8. Kayser, D., Coulon, D.: Variable-Depth Natural Language Understanding. Proc. 7th IJCAI, Vancouver (1981) 64-66
9. Lesniewski, S.: Sur les fondements de la mathématique (1927) (trad. fr. par G. Kalinowski, Hermès, Paris, 1989)
10. Montague, R.: Formal Philosophy (R. Thomason, ed.), Yale University Press (1974)
11. Nunberg, G.D.: The Pragmatics of Reference. Indiana University Linguistics Club, Bloomington, Indiana (June 1978)
12. Pustejovsky, J.: The Generative Lexicon. The MIT Press (1995)
13. Unger, P.: There are no ordinary things. Synthese, vol. 41 (1979) 117-154
M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 3-14, 1998 Springer-Verlag Berlin Heidelberg 1998
Conceptual Graph Standard and Extensions
J.F. Sowa
Executing Conceptual Graphs

Walling R. Cyre
The Bradley Department of Electrical and Computer Engineering
Virginia Tech
Blacksburg, VA 24061-0111
[email protected] This paper addresses the issue of directly executing conceptual graphs by developing an execution model that simulates interactions among behavioral concepts and with attributes related to object concepts. While several researchers have proposed various mechanisms for computing or simulating conceptual graphs, but these usually rely on extensions to conceptual graphs. The simulation algorithm described in this paper is inspired by digital logic simulators and reactive systems simulators. Behavior in conceptual graphs is described by action, event and state concept types along with all their subtypes. Activity in such concepts propagates over conceptual relations to invoke activity or changes in other behavioral concepts or to affect the attributes related to object type concepts. The challenging issues of orderly simulation of behavior recursively described by another graphs, and of combinational relations are also addressed. Abstract.
1 Introduction

Conceptual graphs have been used to represent a great variety of knowledge, including dynamic knowledge, or knowledge which describes the behavior of people, animate beings, displays and mechanisms. In simple graphs it is not difficult to mentally trace possible activities and decide if the situation is modeled correctly and the anticipated behavior is described. In complex and hierarchical graphs, however, this may not be practical, so that validation of the description must be carried out by simulation or reasoning. From another viewpoint, simulation is useful to predict the consequences of behavior described by a conceptual graph. In this paper, mechanics for simulation by executing conceptual graphs are presented.

As an example of conceptual graph execution, consider the graph of Figure 1, which describes the throwing of a ball. Suppose one wishes to execute or simulate this graph to observe the behavior it describes. As discussed in the next section, most researchers have suggested extending conceptual graphs with special nodes called actors or demons to implement behavior. Instead, consider what is necessary to execute this graph as it is. First, the throw action must be associated with some underlying procedure or type definition that knows what types of concepts may be related to it, and how the execution of the throw action affects these adjacent
concepts. In this case, executing a throw should only change the position attribute of the ball from matching that of Joe (100,35) to that of Hal (200,0) and change the tense of throw from future to past. Note that this situation has been modeled so that only attributes of concepts are modified. Had the position concept of the ball been coreferent with Joe's position, then it would be necessary for the throw procedure to modify the structure of the graph by moving the position link of the ball from Joe's position to Hal's position. Since the graph is a data structure, modifying it is not a problem. In the discussions that follow, it will be assumed that action procedures only act on the values of attributes, and that these values may be rewritten multiple times.
Fig. 1. A Description of a Behavior

In this discussion, throw was assumed to have a procedure defined for it in a library of concept procedures. Such a library can easily get out of hand, so it is more desirable to have a type definition for throw in terms of a modest set of primitive actions which copy, replace, combine or operate only on values of attributes. Here, the type definition would have only a copy operation from the agent position concept to the object position concept.

Figure 1 only has one isolated behavioral concept: throw. The conceptual graph of Figure 2 provides a more interesting example in which the execution of actions and events have effects on one another. One would like to execute or simulate this description by specifying an initial condition, say that action 1 is active, the state is true and the variable has the initial value of zero. Action 1 generates event 2, which in turn deactivates action 1 and initiates the increment action. The increment action reads its variable's value attribute, increments it and returns the sum as the result. Starting the increment also generates event 4 if it is enabled by the state being true. Firing event 4 terminates the increment and re-initiates action 1.

The example of Figure 1 considered a single action, and its procedure or definition needed to know what to do with the possible adjacent concepts. Figure 2 has relations such as generator, terminator, initiator and enabler, which relate actions, events and states. As described later, to simplify the simulator, these and similar types of relations can be processed uniformly regardless of the subtypes of
actions, events and states they are incident to. Only actions need to have 'personalized' procedures. In the following sections, related work by other researchers is considered. Following that, a simulation algorithm is described, including its supporting mechanisms (data types), its simulation cycle, and how conceptual relations among behavioral concept types affect the execution of graphs.
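The attribute-only execution discipline for actions described above can be sketched as follows. The data structures and names here are ours, not the paper's; the point is that an action procedure rewrites attribute values without touching the graph structure:

```python
# Minimal sketch (data structures and names ours, not the paper's) of
# executing the throw action of Fig. 1: the procedure only rewrites
# attribute values, never the graph structure.
class Concept:
    def __init__(self, ctype, **attrs):
        self.ctype = ctype
        self.attrs = attrs

def execute_throw(throw, agent, obj, recipient):
    # copy the recipient's position attribute onto the thrown object, and
    # mark the action as completed by rewriting its tense attribute
    obj.attrs["position"] = recipient.attrs["position"]
    throw.attrs["tense"] = "past"

joe = Concept("person", position=(100, 35))
hal = Concept("person", position=(200, 0))
ball = Concept("ball", position=(100, 35))
throw = Concept("throw", tense="future")
execute_throw(throw, joe, ball, hal)
print(ball.attrs["position"], throw.attrs["tense"])   # (200, 0) past
```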
Fig. 2. An Example Graph (action #1, event #2, increment #3, event #4, a state, and variable #6, connected by generator, terminator, initiator and enabler relations)
2 Related Work

In 1984, Sowa stated that "Conceptual Graphs represent declarative information." and then went on to describe "... a formalism for graphs of actors that are bound to conceptual graphs." These actor nodes form (bipartite) dataflow graphs [7] with value type concepts of conceptual graphs. The relations between the actors and concepts are input_arc and output_arc to the actors. Sowa described an execution model for computing with conflict-free, acyclic dataflow graphs using a system of tokens. Actors can be defined recursively or by other graphs with actors. This model only allows single assignments (of referents) to (generic) value concepts. While this view of executing conceptual graphs has the computational power of dataflow models, functional dataflow graphs are not the most popular model of computation, and they require the appendage of a new node type (actors) to conceptual graphs. In addition, dataflow models are not easily derived from most natural language descriptions. In the present paper we show an execution model for general conceptual graphs without special actors.

Other models of computation were considered by Delugach, including state transition models [5, 6]. Another type of node, called 'demon', was added to conceptual graphs to account for the problem of states. The argument is that if a conceptual graph is to be true, and a system can be in only one state at a time, then
(state) concepts must be created and destroyed when transitions occur. This approach was extended recently by the introduction of assertion types and assertion events [12]. While demon nodes or assertion events offer the attractive capability of modifying the structure of a conceptual graph, they are external to conceptual graphs, as actors are. Here, we avoid having to create and destroy states by simply marking them as being true or false (negated).

The actor graphs and problem maps of Lukose [10] are object-oriented extensions of conceptual graphs to provide executability. An actor graph consists of the type definition of an act concept supplemented with an actor whose methods are invoked by messages. Execution of an actor graph terminates in a goal state which may be a condition for the execution of other actor graphs. Control of sequences and nesting of actor graphs is provided by problem maps. While this approach does associate executable procedures with act concept types, no mention is given to event and proposition types.

The application of conceptual graphs to governing agents, such as robots, was introduced by Mann [11]. A conceptual database includes schemata of actions the agent can perform. Commands in the form of conceptual graphs derived from natural language expressions are unified with schemata to evoke behavior. The behavior itself is produced by programs associated with demon concepts in the schemata. At the same time, the present author considered the visual interpretation of conceptual graphs, including the pictorial depiction of their knowledge [3]. Comments were included on animating displays generated from conceptual graphs. Animation would be produced by animator procedures associated with action and event type concepts. In this case, a conceptual graph would govern a display engine rather than a robot. While both of these proposals execute conceptual graphs, each is rather specialized.
Recently, simulation of conceptual graphs has been considered more generally [1]. This approach considers actions, states and events, where the underlying execution model is a state transition system. Action concepts are preconditioned by states and events. These actions can be recursively defined as sets of actions joined by temporal relations and conditioned by states and events. Events are changes in the truth of states. These state changes are described by links to transition concepts which, in turn, are invoked by actions through 'effect' relations. A simulation step consists of identifying the set of actions whose preconditions are true, and selecting one for execution to advance the simulation. Time apparently advances with each time step. The user is consulted to resolve indeterminacy due to multiple enabled actions in a step. Simulation also detects unreachable actions and inconsistencies. The execution model we describe treats states and events more generally and considers the interaction between behavior concepts (actions, events, states) and values or entities. Our set of general concept types and the simulation algorithm were developed through an examination of modeling notations for computer systems rather than from considering human behavior. As discussed later, behavioral models are not limited to computer system behavior.

Since the present execution model is based on computer systems [2], it is appropriate to review here some approaches used in computer simulators. This will
be limited to event-driven simulators. In computer logic simulators, the only possible events are changes in values (signals). At a given point in time, one or more signals may change. The behavior of each circuit having a signal change event on any input is simulated to determine output signal events. An output event consists of a value change and the time the event will occur due to delay in the circuitry. These events are posted to a queue. Once all output events due to current events have been determined, the next future event(s) is found from the event queue and simulation time is advanced to that time so the event can be processed. If the next event occurs at the present time, the simulation time is not advanced. In a simple simulator, the code that simulates behaviors of circuits is contained in a library. In more general digital simulators [9], the user may write processes which respond to input events and produce output events. The procedures of all processes execute once upon initial startup of the simulator. This is appropriate since hardware is always processing its inputs as long as power is applied. After startup, procedures execute only when stimulated. In a simulation cycle, all stimulated processes complete before the next cycle begins.

A very general modeling notation called Statecharts is supported by a more elaborate simulation procedure [8]. Our simulation algorithm was inspired by this approach. Statecharts are founded on finite state machines. The states may be hierarchical, consisting in turn of transition diagrams. Parallel compound machines are supported so the system can be in multiple sub-states at a time. Each transition can be triggered by an event and predicated on a condition. Both conditions and events may be Boolean combinations of other conditions and events, respectively. When a transition occurs, an action may be stimulated. Other actions may be stimulated by events as long as the system is in a particular (enabling) state.
Such action invocations may be predicated on other conditions. In addition, actions and events may generate new events. The objective of the present paper is to show how conceptual graphs can be executed or simulated by an approach such as this.
3 Simulation Approach

3.1 Execution Semantics of Concept Types

To begin to develop a general conceptual graph simulator, it is necessary to first consider the types of concepts that will participate in the simulation. In the present discussion, we consider the top-level type hierarchy of Figure 3. The behavior types actively participate in a simulation. Objects are passive elements whose attributes, such as value for variable and position for entity, can be affected by the execution of action types. State concepts describe the status of what is described by a conceptual graph, and may be pre-conditions on the execution of actions or the occurrence of events. States may be defined by the activity of actions or by relationships among attributes of objects, such as the values of variables and the positions of objects. Events are instantaneous and mark changes in activity of actions or the attributes of objects.
[Fig. 3 shows the top-level concept type hierarchy: behavior (with subtypes action, event, state), object (with subtypes entity, variable), and attribute (with subtypes value, position, delay).]
Fig. 3. Top-Level Concept Type Hierarchy

Since conceptual graph theory allows concepts to be defined in terms of conceptual graphs, such recursively defined concepts must be accounted for in executing a graph. Consider an action that is defined by a graph that includes actions, events and states. When the defined action is executed, its graph must be executed completely before the action is completed and its consequences may be propagated on. In order to show that our approach is quite general, some traditional concept types [13] are interpreted in terms of the simulator type hierarchy of Figure 3, as shown in Table 1. Note that some concept type names conflict. For example, Sowa's event type is classified as an action here because it has an agent and does not exclude a duration. Our events have no duration and are only discontinuities in actions. The type believe is an action (operation) whose operand is a belief or proposition (state). In this paper, we will only discuss the value attribute of variables and the position attribute of entities. Note, however, that all discussions extend to any attribute, such as the age or the color of an entity. Conceptual relations describe the interactions among these concept types and direct the simulator in assigning attributes and generating events during simulation. The interpretation of conceptual relations during simulation is described in detail in a later section.
3.2 Simulation Support Mechanisms

The simulation algorithm described here is event-driven. Since event is a concept type in the type hierarchy, it is useful to define incident as the element processed by
the simulation algorithm. Each incident is associated with a specific concept of the conceptual graph(s) being simulated, and has five attributes which specify: the concept type, the concept identifier, the simulator time, the level of recursion and the type of operation to be performed. The simulator time is measured with respect to the beginning of the simulation, which occurs at time zero. The simulator may perform many cycles at a given simulator time. That is, a cycle does not necessarily advance the simulation time, and simulation time cannot be reversed. Often, incidents generated at the present time must be processed in the present time (without delay). Such incidents are assigned a zero for simulation time. The types of incidents generated and processed by the simulator are listed in Table 2. The effects of these types of incidents are described later.

Table 1. Traditional Concept Types
Traditional Type   Simulator Type
act                action
age                attribute
animal             entity
believe            action
belief             state
message            variable
color              attribute
communicate        action
contain            state
event              action
proposition        state
teacher            entity
think              action
warm               attribute
The simulator uses the collection of lists shown in Table 3 to keep track of incidents and the state of the conceptual graph. The Queue contains incidents that have not yet been processed by the simulator. The Current list contains the incidents to be processed during the current simulation cycle, and the Future list contains incidents generated during the current cycle and to be processed in a future cycle. The Current incidents are selected from the Queue at the beginning of a cycle, and the Future incidents will be added to the Queue at the end of the cycle and before the next cycle begins. The Activity list keeps track of which action concepts are active. Activity describes the status of an action; an action may be active, but may not do anything during a given cycle. Those actions which must be executed during the current cycle are listed in the Execute list. The True_states list keeps track of which states are true at the present time. A state concept may reflect
the activity of an action, or may represent conditions defined by relationships among attributes of objects, such as values of variables or positions of entities. The Value and Position lists keep track of the current values of variable concepts and the positions of entity concepts. In Prolog, lists are convenient structures for maintaining the status of the conceptual graph. In other implementation languages, lists of pointers might be used instead, and some lists might be eliminated entirely. For example, values and positions can be left in referent fields of concepts, but then the graph would have to be scanned each cycle. Similarly, the status of action concepts and the truth of state concepts can be represented by the unary relations (active) and (not), respectively, incident to the concepts.
Table 2. Types of Simulation Incidents

Incident Type   Parameters                          Function
action          type, id, time, level, operation    Starts, stops or resumes an action.
event           type, id, time, level, none         Fires an event.
state           type, id, time, level, operation    Enters or exits a state (makes true or false).
variable        type, id, time, level, new_value    Assigns a new value.
entity          type, id, time, level, new_position Assigns a new position.
Table 3. Working Lists used by the Simulator

List          Contents
Queue         Pending incidents
Current       Incidents to be processed in the current cycle
Future        Future incidents produced in the current cycle
Activity      Action concepts that are currently active
True_states   State concepts that are true
Values        Pairs of variable concepts and their current values
Positions     Pairs of entity concepts and their current positions
Execute       Action concepts to be executed in the current cycle
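The incident attributes and working lists of Tables 2 and 3 can be sketched as plain data structures. The class and field names below are assumptions for illustration; the paper's own implementation used Prolog lists.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Incident:
    """One simulation incident (Table 2): concept type, concept
    identifier, simulator time, recursion level, and operation."""
    ctype: str      # 'action', 'event', 'state', 'variable' or 'entity'
    cid: str
    time: int
    level: int
    operation: str  # e.g. 'start', 'stop', 'enter', 'exit', a new value...

@dataclass
class SimulatorState:
    """The working lists of Table 3."""
    queue: list = field(default_factory=list)      # pending incidents
    current: list = field(default_factory=list)    # this cycle's incidents
    future: list = field(default_factory=list)     # produced this cycle
    activity: set = field(default_factory=set)     # active action concepts
    true_states: set = field(default_factory=set)  # states currently true
    values: dict = field(default_factory=dict)     # variable -> value
    positions: dict = field(default_factory=dict)  # entity -> position
    execute: list = field(default_factory=list)    # actions to run now
```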
3.3 The Simulation Cycle

A simulation cycle consists of the following steps:
1) Get current incidents: The list of incidents, Current, to be processed during the current cycle is extracted from the incidents Queue. All incidents at the current level of recursion and having a zero time are extracted for processing in the current cycle. If no incidents with zero time exist at the current level, then the level is raised. If no incidents with zero time exist at the top level, simulation time must be advanced. Then, the incident(s) with the earliest time and the deepest level is extracted for processing during this cycle. The simulation level is set to that level and simulator time is advanced to that time. Current incidents are deleted from the Queue.
2) Update attributes: Process attribute (value and position) change incidents by changing these attributes of the affected concepts. This may result in an immediate change in whether some states (conditions) are true, so the True_states list may be affected. Incidents are deleted from the Current list as they are processed.
3) Process remaining incidents: Process state, action and event incidents from the Current list. The order of processing these incidents is immaterial, since any consequences are placed in the Future list for processing during a later simulation cycle.
4) Execute actions: Finally, action concepts to be executed during the current cycle are executed in this step. This must be last, since event occurrences and state changes stimulate and precondition activities, respectively.
5) Update Queue: Append the Future list to the Queue.
The manner of processing the various types of incidents is described in the following paragraphs. Note again that an incident is processed only if the simulator time has caught up with the incident time. Attribute (value and position) incidents specify changes in values of variables and positions of entities, and the time they are to occur.
In each case, the affected concept is located and its appropriate attribute is changed, both in the graph and in the Values and Positions lists. These incidents must be processed before action and event incidents, since value and position changes can affect the truth of preconditions for executing actions and firing events. Two types of state incidents are defined. An enter state incident causes the state concept to be added to the True_states list, and an exit incident removes the state concept from this list. State incidents are generated explicitly by actions, and so do not account for changes in status of state concepts due to changes in activity of actions and the attributes of variables or entities. For example, the state incidents
incident(service, #, _, exit) and incident(idle, #, _, enter) are indicated by the expression, "Reset changes the mode from service to idle." An event incident fires the indicated event concept in a conceptual graph. The consequences of firing an event are determined by the conceptual relations incident with the event. Incidents generated through conceptual relations are described shortly. Action incidents change the activity of action concepts by adding them to or removing them from the Activity list. A stop incident removes the action. A start incident or resume incident places the action onto the Activity list. A start incident invokes an execution of the action with initial values and positions for associated variables and entities. A resume incident invokes an execution of the action using the last values or positions. This supports persistence in actions. Once incidents have been processed and changes in objects and states are completed, the action concepts stimulated into execution are processed. Action concepts may have operands they operate on to produce results. These will be value or position attributes of other concepts. If an action is executed by a start incident, it is reset to initial values/positions before execution begins. Otherwise, the current values/positions are used. Execution may not only generate value or position incidents with respect to result concepts, but may also generate event, state and other action incidents. To generate value and position events, a procedure for transforming inputs to outputs must be available. Rather than defining actor or demon nodes as part of an extended conceptual graph to account for these procedures, we follow the pattern of digital simulators that have libraries of procedures for simulating the behavior of action concepts. That is, there is a collection of primitive actions which the simulator knows how to process.
Other actions can be recursively defined in terms of schemata employing these primitive actions. So, to execute a complex action, its schema is executed. But a schema may take multiple cycles to execute. To satisfy this requirement, the simulator has levels of cycles. That is the function of the level parameter of the incidents. All current incidents at the deepest level of recursion must be completed before any higher-level incidents are processed. It is possible that some primitive actions may be invoked at different levels at the same simulation time. The present simulation strategy executes the action at the deepest level first. The simulation cycle is not completed until all action concepts have completed their execution, that is, suspended themselves. Self-suspension here means the computation is complete, and has nothing to do with terminating the activity of the action, unless the action generates a stop incident to itself.
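The five-step cycle of this section can be sketched as follows. This is a simplified model under stated assumptions: recursion levels and entity positions are ignored, incidents are plain dicts, and primitive action behaviors are supplied as a library of callbacks, as the text suggests for digital simulators.

```python
def one_cycle(sim):
    """One simplified simulation cycle (steps 1-5 of Section 3.3).
    'sim' has keys 'queue', 'values', 'true_states', 'activity' and
    'behaviors' (action id -> callback producing new incidents)."""
    queue = sim["queue"]
    if not queue:
        return False
    # 1) Get current incidents: extract those at the earliest pending time
    t = min(i["time"] for i in queue)
    current = [i for i in queue if i["time"] == t]
    sim["queue"] = [i for i in queue if i["time"] != t]
    future = []
    # 2) Update attributes first: value changes may affect state truth
    for i in (i for i in current if i["ctype"] == "variable"):
        sim["values"][i["cid"]] = i["op"]
    # 3) Process state and action incidents; consequences go to Future
    execute = []
    for i in current:
        if i["ctype"] == "state":
            if i["op"] == "enter":
                sim["true_states"].add(i["cid"])
            else:
                sim["true_states"].discard(i["cid"])
        elif i["ctype"] == "action":
            if i["op"] == "stop":
                sim["activity"].discard(i["cid"])
            else:                          # start or resume
                sim["activity"].add(i["cid"])
                execute.append(i["cid"])
    # 4) Execute stimulated actions last
    for cid in execute:
        future.extend(sim["behaviors"].get(cid, lambda s: [])(sim))
    # 5) Update queue with the Future incidents
    sim["queue"].extend(future)
    return True
```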
3.4 Conceptual Relations and the Production of New Incidents

As described thus far, the simulator consumes incidents but produces none, so a simulation would soon die out. New incidents are produced by the conceptual relations incident with firing events and executing actions. Table 4 shows a collection of relation types among behaviors and objects. Although their names may be
unfamiliar, these relations account for most interactions among concepts, with the exception of attributes. Only binary relations are shown in the table; some ternary relations will be considered later. The challenge in developing a model for executing conceptual graphs is to determine which incidents are generated by the various relations, and how combinations of relations incident with concepts interact.
Table 4. Signatures of Selected Conceptual Relations

Has      | entity                              | variable        | action                                   | event                   | state                     | attribute
entity   | part                                |                 |                                          |                         | status                    | position, color, age
variable |                                     | part            |                                          |                         |                           | value, structure
action   | agent, patient, source, destination | operand, result | cause, deactivator, temporal, part       | generator               | make_true, make_false, if |
event    |                                     |                 | initiator, resumor, terminator, temporal | trigger, temporal, part | entrance, exit            |
state    |                                     |                 | enabler                                  |                         | part                      |
In Table 4, a relation is interpreted as the row concept 'has relation' with the column concept, such as an event has initiator action. This interpretation yields some unusual relation type names, but is traditional in conceptual graphs, and is necessary when considering combinations of relations incident with behavior concepts. First, consider relations actions have with behaviors (Row 3 in Table 4). When an event incident to an initiator relation fires, it will generate an action start incident for the related action, with zero time and the current level of recursion, e.g. incident(action, Id, 0, L, start). Similarly, the suspendor and resumor relations will generate stop and resume action incidents for their related actions. A single action incident may not be sufficient to stimulate the execution of the action. If the action has one or more if relations with states and the states are not true (on the True_states list), then the action will not execute. In addition, an action must have an action incident on each of its initiator or resumor relations to execute, since conceptual graph theory interprets multiple relations of the same type incident to a concept as conjoined (ANDed). Disjunction of relations is not conveniently represented in conceptual graphs. For this purpose, we define new relations or and xor to synthesize artificial disjunctive concepts. The relation xor indicates exclusive-or. Since complex combinations cannot be expressed this way, introduction of artificial concepts is necessary, as in the example graph in Figure 4, which indicates that action A executes only if states S1 and S2 are true and if a start
incident was produced from event E5 as well as event E3 or E4 or both E1 and E2. That is, the condition (S1 and S2 and E5 and ((E1 and E2) or E3 or E4)). Event event: *1 was artificially introduced to represent the event that E3 or E4 or event: *2 occurred. Event event: *2 is the event that E1 and E2 occur simultaneously.
[action: A] -
  (initiator) -> [event: *1] -
    (or) -> [event: *2] -
      (part) -> [event: E1]
      (part) -> [event: E2],
    (or) -> [event: E3]
    (or) -> [event: E4]
  (initiator) -> [event: E5]
  (if) -> [state: S1]
  (if) -> [state: S2].

Fig. 4. Complex Conditioning of Action Execution.
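The enabling condition of Figure 4 can be evaluated directly. The sketch below hard-codes that one condition for illustration, computing the artificial events *1 and *2 from the set of fired events.

```python
def action_enabled(fired, true_states):
    """Evaluates the enabling condition of action A in Fig. 4:
    artificial event *2 fires when E1 and E2 fire simultaneously
    (the conjoined part relations); *1 fires when E3, E4 or *2 fires
    (the or relations); A then needs both initiators (*1 and E5)
    plus its if-related states S1 and S2 to be true."""
    star2 = {"E1", "E2"} <= fired                      # E1 and E2 together
    star1 = star2 or "E3" in fired or "E4" in fired    # disjunction
    return star1 and "E5" in fired and {"S1", "S2"} <= true_states
```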
Table 5 shows which types of incidents are generated by some of the relations identified in Table 4. Thus far, the generation of the time parameter of incidents has not been addressed, so all incidents generated with the above relations have zero (present) time and the simulation time never advances. To introduce time, it is necessary to add a set of ternary relations comparable to the relations of Table 4. For example, the relation initiator_after has the signature shown in Figure 5.
[action] -> (initiator_after) -
  1 -> [event]
  2 -> [delay].
Fig. 5. Signature for the initiator_after relation.

During firing of the event, then, incident(action, #, T, start) is added to the Future list, where the value of T is the current simulation time plus the delay. Table 4 also shows temporal relations among actions and events. These may include interval relations, endpoint relations and point relations [4]. Although temporal relations seem to imply causality, we interpret them here as constraints. For example, [action: A1] -> (starts_when_finishes) -> [action: A2]
does not cause a start action incident for action A1 to be generated when action A2 terminates. Instead, the simulator must check, when action A1 is initiated, that action A2 has terminated, and post an exception if this is not the case. Alternatively, temporal relations could be interpreted as causal, in which case the temporal and other behavioral relations can be checked statically for consistency. Similarly, duration relations incident to actions can be used as constraints to check whether a stop incident occurs with the appropriate delay after a start incident, or the duration can be used when the action starts to generate a stop incident that terminates the action.
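A minimal sketch of how a ternary relation such as initiator_after could schedule a delayed incident follows; the dict layout of an incident is an assumption for illustration.

```python
def fire_initiator_after(now, action_id, delay):
    """When the related event fires, post an action-start incident to
    the Future list with time = current simulation time + delay, as
    specified by the ternary initiator_after relation of Fig. 5."""
    return {"ctype": "action", "cid": action_id,
            "time": now + delay, "op": "start"}
```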
Table 5. Incidents Produced by Conceptual Relations

Concept activity   Incident Relation   Consequent Incident
Fire event         initiator           action start
                   resumor             action resume
                   terminator          action stop
                   trigger             event
                   entrance            state enter
                   exit                state exit
Execute action     cause               action start
                   deactivator         action stop
                   generator           event
                   make_true           state enter
                   make_false          state exit
4 Conclusions

Mechanisms for simulating hierarchical conceptual graphs without introducing special nodes such as actors or demons have been described. To support execution of graphs, concept types are classified as behavior (action, event, state), object (entity, variable) and attribute. Execution is performed by procedures associated with action types that operate on object attributes, and by procedures associated with conceptual relations among behavior concepts. Although the simulation strategy was inspired by digital system simulators, the approach has been shown to be applicable to general concept types and relations.
5 Acknowledgments

This work was funded in part by the National Science Foundation, Grant MIP-9707317.
References

1. C. Bos, B. Botella and P. Vanheeghe, "Modelling and Simulating Human Behaviours with Conceptual Graphs," Proc. 5th Int'l Conf. on Conceptual Structures, Seattle, WA, 275-289, August 3-8, 1997.
2. Walling Cyre, "A Requirements Language for Automated Analysis," International Journal of Intelligent Systems, 10(7), 665-689, July 1995.
3. W. R. Cyre, S. Balachandar, and A. Thakar, "Knowledge Visualization from Conceptual Structures," Proc. 2nd Int'l Conf. on Conceptual Structures, College Park, MD, 275-292, August 16-20, 1994.
4. W. R. Cyre, "Acquiring Temporal Knowledge from Schedules," in G. Mineau, B. Moulin, J. Sowa, eds., Conceptual Graphs for Knowledge Representation, Springer-Verlag, NY, 328-344, 1993. (ICCS'93)
5. H. Delugach, "Dynamic Assertion and Retraction of Conceptual Graphs," Proc. 7th Workshop on Conceptual Structures, Binghamton, NY, July 11-13, 1991.
6. H. Delugach, "Using Conceptual Graphs to Analyze Multiple Views of Software Requirements," Proc. 6th Workshop on Conceptual Structures, Boston, MA, July 29, 1990.
7. J. Dennis, "First Version of a Data Flow Procedure Language," Lecture Notes in Computer Science, Springer-Verlag, NY, 362-376, 1974.
8. D. Harel and A. Naamad, The STATEMATE Semantics of Statecharts, i-Logix, Inc., Andover, MA, June 1995.
9. R. Lipsett, C. F. Schaefer and C. Ussery, VHDL: Hardware Description and Design, Kluwer Academic, Boston, 1989.
10. D. Lukose, "Executable Conceptual Structures," Proc. 1st Int'l Conf. on Conceptual Structures, Quebec City, Canada, 223-237, August 4-7, 1993.
11. G. Mann, "A Rational Goal-Seeking Agent using Conceptual Graphs," Proc. 2nd Int'l Conf. on Conceptual Structures, College Park, MD, 113-126, August 16-20, 1994.
12. R. Raban and H. S. Delugach, "Animating Conceptual Graphs," Proc. 5th Int'l Conf. on Conceptual Structures, Seattle, WA, 431-445, August 3-8, 1997.
13. J. Sowa, Conceptual Structures, Addison-Wesley, Reading, MA, 1984.
From Actors to Processes: The Representation of Dynamic Knowledge Using Conceptual Graphs

Guy W. Mineau
Department of Computer Science, Université Laval, Quebec City, Canada
tel.: (418) 656-5189, fax: (418) 656-2324
email:
[email protected]

Abstract. The conceptual graph formalism provides all necessary representational primitives needed to model static knowledge. As such, it offers a complete set of knowledge modeling tools, covering a wide range of knowledge modeling requirements. However, the representation of dynamic knowledge falls outside the scope of the current theory. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To allow the representation of processes, we need a way to represent state changes in a conceptual graph based system. Consequently, the theory should be extended to include the description of processes based on the representation of assertions and retractions about the world. This paper extends the conceptual graph theory in that direction, taking into account the implementation considerations that such an extension entails.
1 Introduction

This paper introduces a second-order knowledge description primitive into the conceptual graph theory, the process statement, needed to define dynamic processes. It explains how such processes can be described, implemented and executed in a conceptual graph based environment. To achieve this goal, it also introduces assert and retract operations. The need for the description of processes came from a major research and development project conducted at DMR Consulting Group in Montreal, where a corporate memory was being developed using conceptual graphs as its representation formalism. Among other things, the company's processes needed to be represented. Although they are currently described in a static format using first-order conceptual graphs, advanced user support capabilities will eventually require them to be explained, taught, updated and validated. For that purpose, we need to provide for their execution, and thus, for their representation as dynamic knowledge. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To represent processes, we need a way to describe state changes in a conceptual graph based system. We decided to use assertion and retraction operations as a means to describe state changes. Therefore, the definition of a process that we put forth in this paper is based on such operations. Generally, processes can be described using algorithmic languages. These languages are mapped onto state transition machines, such as computers. So, a
process can be described as a sequence of state transitions. A transition transforms a system in such a way that its previous state gives way to a new state. These previous and new states can be described minimally by conditions, called respectively pre and postconditions, which characterize them. The preconditions of a transition form the smallest set of conditions that must conjunctively be true in order for the transition to occur; its postconditions can be described in terms of assertions to and retractions from the previous state. Thus, transitions can be represented by pairs of pre and postconditions. Processes can be defined as sequences of transitions, where the postconditions of a transition match the preconditions of the next transition. The triggering of a transition may be controlled by different mechanisms; usually, it depends solely on the truth value of the preconditions of the transition. This simplifies the control mechanism which needs to be implemented for the execution of processes; therefore, this is the approach that we advocate. Section 2 reviews the current conceptual graph (cg) literature on processes. Section 3 presents an example that shows how a simple process can be translated into a set of transitions. Section 4 describes the process statement that this paper introduces. Finally, because of its application-oriented nature, this paper also addresses the implementation issues related to the engineering of such a representation framework; section 5 covers these issues.
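The transition model described above can be sketched as a tiny engine over sets of facts: a transition is a (preconditions, retractions, assertions) triple, and it fires only when its preconditions conjunctively hold in the current state. All names are illustrative assumptions, not the paper's notation.

```python
def run_process(transitions, state, max_steps=100):
    """Executes a process given as (pre, retract, assert_) triples of
    fact sets. A transition fires when all its preconditions are true;
    its postconditions are applied as retractions from and assertions
    to the current state. Stops when no transition can fire."""
    state = set(state)
    for _ in range(max_steps):
        for pre, retract, assert_ in transitions:
            if pre <= state:                     # preconditions all hold
                state = (state - retract) | assert_
                break
        else:
            return state                         # quiescent: process done
    return state
```

For example, two transitions chaining "idle + coin -> service" and "service + reset -> idle" run until no precondition set is satisfied.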
2 The CG Literature on Processes

Delugach introduced a primitive form of demons in [1]. Demons are processes triggered by the evaluation of some preconditions. Delugach's demons take concepts as input parameters, but assert or retract concepts as the result of their actions, contrary to actors, which only compute individuals of a predetermined type. Demons are thus a generalization of actors. They can be defined using other demons as well. Consequently, they allow the representation of a broader range of computation. We extended these ideas by allowing a demon to have any cg as input and output parameters. Consequently, our processes are a generalization of Delugach's demons. We kept the same graphical representation as Delugach's, using a labeled double-lined diamond box. However, we had to devise a new parameter passing mechanism. We chose context boxes, since what we present here is totally compatible with the definition of contexts as formalized in [2]. Similarly to [3], we chose state transitions as a basis for representing a process, except that we do not require all execution paths (1) to be explicitly defined; the execution of a process will create this path dynamically. This simplifies the process description activity. There is much work in the cg community on extending the cg formalism to include process-related primitives [4, 5, 6, 7, 8]. One of the main motivations behind these efforts is the development of an object-oriented architecture on top of a cg-based system [12]. As we advocate, [7] focuses on simple primitives to allow the modeling of processes. The process description language that we foresee could be extended to include high-level concepts as proposed in [7]. In [9], we explain how our two approaches complement each other. Also, [12] uses transitions as a basis for describing behaviour and [11] uses contexts as pre and postconditions for modeling behaviour.
The work presented here is totally compatible with what is presented in these two papers; furthermore, 1) it adds a packaging facility, 2) it deals with the implementation details that yield a full definition and execution framework for processes, and 3) it is completely compatible with the definition of contexts as formalized in [2], providing a formal environment for using contexts as state descriptions. In what follows, we present a framework to describe processes in such a way that: 1) both dynamic and static knowledge can be defined using simple conceptual graphs (in a completely integrated manner), 2) they can be easily executed using a simple execution engine, and 3) inferences on processes are possible in order to validate them, produce explanations and support other knowledge-dependent tasks.

(1) An execution path is defined as a possible sequence of operations, i.e., of state transitions, according to some algorithm.
3 From Algorithms to Pre and Postcondition Pairs

We believe that a small example will be sufficient to show how a simple process, the iterative factorial algorithm, can be automatically translated into a set of pre/postcondition pairs. From this example, the definitions and explanations given in sections 4 and 5 below will then become easier to present and justify. Let Figure 1 illustrate the process that we want to represent as a set of pre/postcondition pairs. Since a process relies on a synchronization mechanism to properly sequence the different transitions that it is composed of, and since we wish to represent transitions only in terms of pre and postconditions (for implementation simplicity), we decided to include the sequencing information in the pre/postconditions themselves (2). With an algorithmic language such as C, variable dependencies and boolean values determine the proper sequence of instructions. It is then rather easy to determine the additional conditions that must be inserted in the pre and postconditions for them to represent the proper sequence structure of the algorithm. Without giving a complete algorithm that extracts this sequence structure, Figure 2 provides the synchronization graph of the algorithm of Figure 1. The reader will find it easy to verify its validity, knowing that all non-labeled arcs determine variable dependencies between different instructions, that arcs marked with T and F indicate a dependency on a boolean value, and that arcs marked L indicate a loop.

10: int fact(int n)
11: {  int f;
12:    int i;
13:    f = 1;
14:    i = 2;
15:    while (i

[Figure residue from a later paper: necessary conditions for Domain(x), with attributes name and activity, and for Objective(x), with attributes object and focus.]
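Under the transition model of the preceding sections, the iterative factorial of Figure 1 can be executed as guarded pre/postcondition pairs, with the sequencing information folded into an explicit program-counter condition, as this section proposes. The encoding below is an illustrative assumption, not the paper's notation.

```python
def fact_by_transitions(n):
    """Iterative factorial executed as guarded state transitions: each
    transition has a precondition (including a 'pc' sequencing
    condition) and a postcondition that rewrites part of the state."""
    transitions = [
        (lambda s: s["pc"] == "init",
         lambda s: {**s, "f": 1, "i": 2, "pc": "test"}),     # f = 1; i = 2
        (lambda s: s["pc"] == "test" and s["i"] <= s["n"],
         lambda s: {**s, "pc": "body"}),                     # loop guard true
        (lambda s: s["pc"] == "test" and s["i"] > s["n"],
         lambda s: {**s, "pc": "done"}),                     # loop guard false
        (lambda s: s["pc"] == "body",
         lambda s: {**s, "f": s["f"] * s["i"],
                    "i": s["i"] + 1, "pc": "test"}),         # f *= i; i += 1
    ]
    state = {"n": n, "pc": "init"}
    while state["pc"] != "done":
        for pre, post in transitions:
            if pre(state):                # the only transition whose
                state = post(state)       # preconditions hold fires
                break
    return state["f"]
```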
In the context of a CE project, this proposition for viewpoints is not sufficient, so we have to clarify this integration of viewpoints and our aim in using viewpoints.

Second step in the characterization of viewpoints and their use with conceptual graphs

The integration of viewpoints proposed in [10] is based on a simple definition of viewpoints, namely: "a viewpoint is the explicit expression of a particular subtype relation existing between two concept types". This definition takes into account the difficulty of elaborating a unique model using different terminologies. It allows us to describe a complex object under different perspectives, and allows the possibility of comparing two descriptions by verifying whether they use equivalent concept types. But our aim is to compare two descriptions on the same level of abstraction, written from different angles, but focusing on the same problem or task. The determination of such graphs is not easy with only viewpoints and terminology. In fact, it is very difficult to automatically determine the viewpoint in a description (represented by a conceptual graph) from the concept types used. So we must distinguish viewpoints used in descriptions of complex objects from viewpoints used in descriptions of an expertise, a proposition or a belief.
[Fig. 3 (residue). Description viewpoints: (Vpt exploitation_system) relates TYPE: Computer to subtypes such as Computer_unix and Computer_VMS; (Vpt protocol_network) relates it to Computer_TCP-IP and Computer_PPP; (Vpt usability_in_network) relates it to Server, client_terminal and client_Computer. Expertise viewpoints: (Vpt electronics_engineer) relates TYPE: topology_of_network to cable_topology, and (Vpt network_designer) relates it to Machine_topology. Descriptions of complex objects: the individual primo is described as Computer_unix: primo, Computer_TCP-IP: primo and Server: primo; the individual M2 as Computer_unix: M2 and Computer_TCP-IP: M2. The topologic description of the network by the network designer defines Machine_topology(x) as (composed_by) Computer_Unix, with (nb_computer) Number: 10, and (composed_by) Server: primo.]
Fig. 3. Example of description and expertise viewpoints

We consider our first definition of viewpoint, dedicated to the description of complex objects, as the "viewpoint of description", and we clarify in this paper the notion of "viewpoint of expertise". The distinction is essential: it determines the nature of the knowledge that we index with viewpoints, and what each viewpoint makes possible in terms of knowledge management.
Definition: projection operation (as a reminder)
Let G and G' be two conceptual graphs. The viewpoint projection of G into G' is an application π: G → G', where πG is a subgraph of G', such that:
- For each concept c in G, πc is a concept in G' and type(πc) is a subtype of type(c).
- For each conceptual relation r in G, type(πr) = type(r). If the ith arc of r is linked to a concept c in G, the ith arc of πr must be linked to πc in G'.
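The projection definition above can be checked mechanically on a simple graph encoding; the dict-based representation and function names below are assumptions for illustration.

```python
def is_projection(g, g_prime, pi, subtype):
    """Checks that pi is a viewpoint projection from G to G' per the
    definition above. A graph is a dict: 'concepts' maps a concept id
    to its type; 'relations' is a list of (rel_type, [arg ids]).
    subtype(a, b) is true when type a is a subtype of type b."""
    # each concept maps to a concept of a subtype of its own type
    for c, ctype in g["concepts"].items():
        if not subtype(g_prime["concepts"][pi[c]], ctype):
            return False
    # each relation is preserved, same type, mapped arguments in order
    for rtype, args in g["relations"]:
        if (rtype, [pi[c] for c in args]) not in g_prime["relations"]:
            return False
    return True
```

For instance, a graph mentioning [Computer] projects onto one mentioning [Computer_unix], but not the other way around.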
Definition: viewpoints of expertise (see example in Fig. 3)
Suppose the experts have a common objective, described by a basic concept type F1. Their expertises, according to this common objective F1, will be expressed by v-oriented subtypes of F1, according to the expertise viewpoints V1, ..., Vn. An expertise viewpoint relation V is a viewpoint relation where the concept type Objective is replaced by F1 in its definition.
[Graph residue: the sufficient conditions for expertise_Vpt(F1, TC) relate [person: *x] through the (Vpt) and (described_by) relations to [F1: *y] and a [Domain: *z].]

4 Methodology to Build and Manage CE Project Memory with CG and Viewpoints
As we said in (Section 2) a part of a project memory in CE must contain all states of the artefact during the project. So our first interest is to represent an artefact. According to the definition of viewpoints in CE and the integration of such viewpoints in the conceptual graph formalism, we propose a method to build the artefact with a multiview approach and a method to facilitate the management (information retrieval and up-date) of this part of the corporate memory. We proposed also algorithms 4.1
4.1 Methodology to Build the Artefact
To describe a design object or product, we must follow several steps: a) decompose the object into several components, b) describe each component, c) describe the relations among components and their interactions. [12] proposes a multi-view product model based on the life cycle of the product. Each object must be described under several views by only one expert. The different views, or steps in the life cycle, are the "skeletal structure or topologic view", "geometric view", "manufacturing view", "material view", "thermal view", "mechanical view" and "dynamic/kinematic view". We can note, according to our definition of viewpoint, that those views represent the different focuses or objectives taken by experts to describe objects. Each expert can also have his own description according to the focus. This model allows only one expert per view, but we can generalize it and apply it to several experts on one view. Indeed, two experts can be in the same domain and express their descriptions under the same focus, yet have two different descriptions that correspond to their levels of competence and experience in the domain (see the example in Fig. 5).
Method to represent the artefact with viewpoints and CGs (Fig. 4). We can note that those different views constitute the different focuses on description that all participants can take. So we can characterize the objective of all "expertise viewpoints" with the different steps of the life cycle of the product.
1. Declaration of concept types in the lattice:
   - introduction of all basic concept types corresponding to all components of the object,
   - introduction of all oriented concept types corresponding to the different expertise for each focus on components.
2. Construction of the CG for the decomposition of the product.
3. Introduction of all expertise viewpoints (according to life cycle and participants).
4. Description, through second-order conceptual graphs, of all viewpoint relations existing between basic concept types and v-oriented concept types.
5. Definition via a CG of all oriented concept types.
6. Instantiation of the concept types corresponding to the different objects and their different description(s), to express interactions or relations between components.
[Figure 4 graphics: a knowledge base of CGs comprising the concept type lattice, the set of relations, the viewpoint base and the instantiation base, linked by the viewpoint management steps.]
Fig. 4. Steps in the building of the artefact

Description of viewpoint relations and expertise viewpoint relations.

Description viewpoints:
[TYPE: Beam] <- (Vpt_mechanic) <- [TYPE: arrow]
[TYPE: Beam] <- (Vpt_shape) <- [TYPE: structural_element]
[TYPE: Beam] <- (Vpt_geometric) <- [TYPE: parallelepiped]

Expertise viewpoints:
[TYPE: Description_geometric_tree] <- (Vpt_geometric&E1) <- [TYPE: Tree_geometric]
[TYPE: Description_topologic_tree] <- (Vpt_topologic&E2) <- [TYPE: Tree_skeletal]
[TYPE: Description_topologic_tree] <- (Vpt_topologic&E3) <- [TYPE: Tree_skin]
[TYPE: Description_thermal_tree] <- (Vpt_thermal&E4) <- [TYPE: Tree_flux]
[TYPE: Description_material_tree] <- (Vpt_material&E1) <- [TYPE: Tree_surface]

Definition via a CG of oriented concept types:
definition graph of Tree_geometric(x):
[Tree: *x] -
  -> (composed_of) -> [cylinder]
  -> (composed_of) -> [parallelepiped]
  -> (has_for) -> [size: *y]
  -> (has_for) -> [quotation: *z]

(Other oriented concept types include TYPE: Tree_mechanic and TYPE: Tree_thermal.)

Fig. 5. Example of a part of an artefact represented with viewpoints and CGs
Algorithms for consistency checking during the building of the artefact. In this section we present two of several algorithms (the sub-procedures of the main programs are not described in this paper) for checking consistency during the building of the artefact. We cannot deal with all cases in this paper, so the assumption is that the concept type lattice and the set of relations (conceptual relations, viewpoint relations, expertise relations) are already built. We denote input variables with I: and output variables with O:.
- Creation of a viewpoint relation Vpt between two concept types T1 and T2:
Program create_viewpoint_relation(I: Vpt, T1, T2)
G := [TYPE:T2] -> (Vpt) -> [TYPE:T1]
if, in the concept type lattice, T2 < T1
then add G in the viewpoint base
else you cannot create a viewpoint relation between T2 and T1
EndProgram
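A minimal executable rendering of this program might look as follows (a Python sketch; the representation of the lattice as a set of transitively closed (subtype, supertype) pairs and the triple encoding of the viewpoint graph are assumptions):

```python
# Sketch of create_viewpoint_relation, assuming the concept type lattice
# is a set of (subtype, supertype) pairs and the viewpoint base stores
# graphs [TYPE:T2] -> (Vpt) -> [TYPE:T1] as (T2, Vpt, T1) triples.

lattice = {("Server", "Computer"), ("Computer_unix", "Computer")}
viewpoint_base = []

def create_viewpoint_relation(vpt, t1, t2):
    """Add G := [TYPE:t2] -> (vpt) -> [TYPE:t1] to the viewpoint base,
    but only if t2 < t1 in the concept type lattice."""
    if (t2, t1) not in lattice:
        raise ValueError(f"cannot create a viewpoint relation between {t2} and {t1}")
    g = (t2, vpt, t1)
    viewpoint_base.append(g)
    return g

create_viewpoint_relation("Vpt_topologic", "Computer", "Server")
print(viewpoint_base)  # [('Server', 'Vpt_topologic', 'Computer')]
```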
- Instantiation of a concept type Tc by a referent ref:

Program instantiation(I: Tc, ref)
if Tc:ref already present in the instantiation base
then ok for the instantiation
else if Tc is a basic concept type
then creation_description(I: Tc, ref, instantiation base)
else if Tc is a v_oriented concept type
then Tb := basic_concept_type_associated(I: viewpoint base)
  if Tb:ref already present in the instantiation base
  then creation_viewpoint_instance(I: Tb, Tc, ref, instantiation base, O: List_types_instantiated)
       creation_other_instanciation(I: Tb, Tc, ref, List_types_instantiated, viewpoint base, instantiation base)
  else add [T:ref] -> (Repr) -> ...

Relations between concepts C1 and C2 of different conceptual graphs:
- is_generalized_viewpoint(C1, C2) iff is_specialized_viewpoint(C2, C1)
- is_equivalent_concept(C1, C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃ G in the viewpoint base such that G: [TYPE: type(C1)] -> (equiv) -> [TYPE: type(C2)]
- is_inclusion_concept(C1, C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃ G in the viewpoint base such that G: [TYPE: type(C1)] -> (incl) -> [TYPE: type(C2)]
- is_exclusion_concept(C1, C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃ G in the viewpoint base such that G: [TYPE: type(C1)] -> (excl) -> [TYPE: type(C2)]
- is_more_generalized_concept(C1, C2) iff is_generalization(C1, C2) ∨ is_generalized_viewpoint(C1, C2) ∨ is_generalization&conceptualization(C1, C2)
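The concept-level predicates above can be sketched as simple lookups in the viewpoint base. In this Python sketch the triple encoding of the base and the `lub` (least upper bound) stand-in for the type lattice are assumptions made for illustration:

```python
# Sketch of the concept-compatibility predicates over the viewpoint base.
# The base is assumed to store second-order graphs as (type1, rel, type2)
# triples; `lub` stands in for the least upper bound of two types in the
# lattice ("TOP" meaning no common supertype below the universal type).

viewpoint_base = [("Tree_geometric", "equiv", "Tree_skeletal"),
                  ("Tree_skin", "incl", "Tree_surface")]

def lub(t1, t2):
    # Hypothetical: all Tree_* types share the supertype Tree.
    return "Tree" if t1.startswith("Tree") and t2.startswith("Tree") else "TOP"

def _related(t1, rel, t2):
    # Both conditions of the definition: comparable types, and a graph
    # asserting the relation present in the viewpoint base.
    return lub(t1, t2) != "TOP" and (t1, rel, t2) in viewpoint_base

def is_equivalent_concept(c1, c2):
    return _related(c1, "equiv", c2)

def is_inclusion_concept(c1, c2):
    return _related(c1, "incl", c2)

def is_exclusion_concept(c1, c2):
    return _related(c1, "excl", c2)

print(is_equivalent_concept("Tree_geometric", "Tree_skeletal"))  # True
print(is_exclusion_concept("Tree_skin", "Tree_flux"))            # False
```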
New relations among elementary links of different conceptual graphs. Let CG1 = (C1, R1, A1) and CG2 = (C2, R2, A2) be the conceptual graphs to be compared. We define additional kinds of relations possible between elementary links of CG1 and CG2, respectively denoted link1 = rel1(C11...C1n) and link2 = rel2(C21...C2n), where rel1 and rel2 have the same arity:
- is_concept_total_viewpoint_specialization(link1, link2) iff type(rel1) = type(rel2) ∧ ∀ i ∈ 1..n, is_specialized_viewpoint(adj(i, rel1), adj(i, rel2))
- is_concept_partial_viewpoint_specialization(link1, link2) iff type(rel1) = type(rel2) ∧ ∀ i ∈ 1..n, (is_specialized_viewpoint(adj(i, rel1), adj(i, rel2)) ∨ ...)

(Footnote 2: in this article we do not define the same relation if type(rel1) ...)

- CG2 is "a totally equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2 -> C1, hr: R2 -> R1, ha: A2 -> A1) from CG2 to CG1 such that ∀ link2 ∈ A2, is_concept_total_equivalent(link1, ha(link2))
- CG2 is "a partially equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2 -> C1, hr: R2 -> R1, ha: A2 -> A1) from CG2 to CG1 such that ∀ link2 ∈ A2, (is_concept_partial_equivalent(link1, ha(link2)) ∨ is_same_link(link1, ha(link2))) ∧ ∃ link2 ∈ A2 such that is_concept_partial_equivalent(link1, ha(link2))
- CG2 is "a totally included graph" of CG1 iff ∃ a graph morphism (hc: C2 -> C1, hr: R2 -> R1, ha: A2 -> A1) from CG2 to CG1 such that ∀ link2 ∈ A2, is_concept_total_inclusion(link1, ha(link2))
- CG2 is "a partially included graph" of CG1 iff ∃ a graph morphism (hc: C2 -> C1,
hr: R2 -> R1, ha: A2 -> A1) from CG2 to CG1 such that ∀ link2 ∈ A2, (is_concept_partial_inclusion(link1, ha(link2)) ∨ is_same_link(link1, ha(link2))) ∧ ∃ link2 ∈ A2 such that is_concept_partial_inclusion(link1, ha(link2))
- CG2 is "an exclusion graph" of CG1 iff ∃ a graph morphism (hc: C2 -> C1, hr: R2 -> R1, ha: A2 -> A1) from CG2 to CG1 such that ∃ link2 ∈ A2, is_concept_exclusion(link1, ha(link2))
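As an illustration, a brute-force Python sketch of testing one of these graph-level relations ("totally included graph"). The link encoding and the inclusion table are assumptions, and concept-node consistency of the morphism is ignored for brevity:

```python
from itertools import product

# Links are (relation_type, (concept_type_1, concept_type_2)).
# `inclusion` stands in for the viewpoint-base lookup behind
# is_inclusion_concept: (t1, t2) means t1 is included in t2.
inclusion = {("Tree_skin", "Tree_surface")}

def is_concept_total_inclusion(link1, link2):
    r1, args1 = link1
    r2, args2 = link2
    return r1 == r2 and all(a1 == a2 or (a1, a2) in inclusion
                            for a1, a2 in zip(args1, args2))

def totally_included(cg2_links, cg1_links):
    """CG2 is 'totally included' in CG1 if some assignment ha of CG2's
    links to CG1's links satisfies the concept-level test everywhere."""
    for image in product(cg1_links, repeat=len(cg2_links)):
        if all(is_concept_total_inclusion(l1, l2)
               for l1, l2 in zip(image, cg2_links)):
            return True
    return False

cg1 = [("has_for", ("Tree_skin", "cylinder"))]
cg2 = [("has_for", ("Tree_surface", "cylinder"))]
print(totally_included(cg2, cg1))  # True
```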
4.2.0.1 Strategies of integration of a proposition in the artefact. When all relations between elementary links are given, [3] proposes several strategies to integrate the two compared conceptual graphs. In the artefact, the information or knowledge must be as precise as possible. So we detail the different cases of relations that could exist between two conceptual graphs:
- If there is one of the different relations of specialization, detailed in [3], then we apply the "strategy of the highest direct specialization": if the proposition is more precise than a description in the artefact, and uses more precise expressions, we prefer to restrict what was expressed in the artefact.
- If there is one of the different relations of instantiation, then we apply the "strategy of the highest direct instantiation".
- If there is a relation of equivalence or partial equivalence, then we can keep the two graphs in the different viewpoints they belong to, or choose one of them.
- If there is a relation of inclusion or partial inclusion, we keep the graph that includes the other one, because the information in the included graph is already present in the including graph.
- If there is an exclusion relation, the proposition is not valid. This case cannot happen in this context of integration of a solution into the artefact, but it is useful in the "evaluate task".
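The case analysis above amounts to a simple dispatch on the strongest relation found. A hedged Python sketch (the relation names and return strings are illustrative labels, not part of the papers' formalism):

```python
# Sketch of the integration decision, assuming `relation` is the strongest
# relation found between the proposition and an artefact description.

def integration_strategy(relation):
    if relation in ("total_specialization", "partial_specialization"):
        return "highest direct specialization"   # keep the more precise graph
    if relation in ("total_instantiation", "partial_instantiation"):
        return "highest direct instantiation"
    if relation in ("equivalence", "partial_equivalence"):
        return "keep both viewpoints (or choose one)"
    if relation in ("inclusion", "partial_inclusion"):
        return "keep the including graph"        # included info is redundant
    if relation == "exclusion":
        return "reject the proposition"          # only used in the evaluate task
    return "add as new description"

print(integration_strategy("inclusion"))  # keep the including graph
```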
5 Conclusion

In this paper, we have presented an approach to the construction of a corporate memory in CE, taking into account the different states of the artefact during the design process. We propose a representation of the artefact with viewpoints and conceptual graphs, based on previous work on CGs and an adaptation of viewpoints and viewpoint management to this problem. We introduce the notions of description viewpoints and expertise viewpoints. We propose several algorithms (not detailed in this paper) for the building and maintenance of the CG base representing the artefact. Related work has been done in Design Rationale: [9] proposes to use case libraries to represent past experiences and [6] focuses on the usability of design rationale documents, but they do not use the conceptual graph formalism. Our approach using viewpoints and CGs can be extended to the different elements of a corporate memory detailed in [6], and it takes care of the variety of information sources and the context in which knowledge or information must be understood. The implementation of this work is in progress, realized with the COGITO platform. In further work we have to organize the answers of the algorithms according to the different levels of viewpoints. Our aim is to propose a support system allowing the management of a CE project memory, based first on the artefact, but which must also be
extended to the propositions (for the Design Rationale part) of [11], i.e., take into account not only the history of the artefact but also the different possible choices during the CE process, represented by the different design propositions. In this way, we could efficiently use the comparison algorithm to see the differences between several propositions.
References

1. Carbonneill, B., Haemmerlé, O.: Rock: Un système de question/réponse fondé sur le formalisme des graphes conceptuels. In: Actes du 9ième Congrès Reconnaissance des Formes et Intelligence Artificielle, Paris, pp. 159-169, 1994.
2. Cointe, C., Matta, N., Ribière, M.: Design Propositions Evaluation: Using Viewpoint to Manage Conflicts in CREoPS2. In: Proceedings of ISPE/CE, Concurrent Engineering: Research and Applications, Rochester, August 1997.
3. Dieng, R., Hug, S.: MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts. In: Proceedings of ICCS'98, Springer-Verlag, Montpellier, France, August 1998.
4. Finch, I.: Viewpoints - Facilitating Expert Systems for Multiple Users. In: Proceedings of the 4th International Conference on Database and Expert Systems Applications, DEXA'93, Springer-Verlag, 1993.
5. Gerbé, O.: Conceptual Graphs for Corporate Knowledge Repositories. In: Proceedings of ICCS'97, Springer-Verlag, Seattle, Washington, USA, August 1997.
6. Karsenty, L.: An Empirical Evaluation of Design Rationale Documents. In: Electronic Proceedings of CHI'96, http://www.acm.org/sigchi/chi96/proceedings/papers/Karsenty/lk_txt.htm, 1996.
7. Leite, J.: Viewpoints on Viewpoints. In: Proceedings of Viewpoints 96: An International Workshop on Multiple Perspectives in Software Development, San Francisco, USA, 14-15 October 1996.
8. Marino, O., Rechenmann, F., Uvietta, P.: Multiple Perspectives and Classification Mechanism in Object-Oriented Representation. In: Proc. 9th ECAI, Stockholm, Sweden, pp. 425-430, Pitman Publishing, London, August 1990.
9. Prasad, M.V.N., Plaza, E.: Corporate Memories as Distributed Case Libraries. In: Proceedings of the 10th Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Alberta, Canada, November 9-14, pp. 40-1 - 40-19, 1996.
10. Ribière, M., Dieng, R.: Introduction of Viewpoints in Conceptual Graph Formalism. In: Proceedings of ICCS'97, Springer-Verlag, Seattle, USA, August 1997.
11. Ribière, M., Matta, N.: Guide for the Elaboration of a Corporate Memory in CE. Submitted to the 5th European Concurrent Engineering Conference, Erlangen-Nuremberg, Germany, April 26-29, 1998.
12. Tichkiewitch, S.: Un modèle multi-vues pour la conception intégrée. In: Summer School "Entreprises communicantes: Tendances et Enjeux", Modane, France, 1997.
13. Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA, 1984.
WebKB-GE - A Visual Editor for Canonical Conceptual Graphs

S. Pollitt¹, A. Burrow¹, and P.W. Eklund²

¹ sepollitt/alburrow@cs.adelaide.edu.au
Department of Computer Science, University of Adelaide, AUSTRALIA 5005
² p.eklund@gu.edu.au
School of Information Technology, Griffith University, Parklands Drive, Southport, AUSTRALIA 9276

Abstract. This paper reports a CG editor implementation which uses canonical formation as the direct manipulation metaphor. The editor is written in Java and embedded within the WebKB indexation tool. The user's mental map is explicitly supported by a separate representation of a graph's visual layout. In addition, co-operative knowledge formulation is supported by network-aware work-sharing features. The layout language and its implementation are described, as well as the design and implementation features.

1 Introduction
Display form conceptual graphs (CGs) provide information additional to the graph itself. An editing tool should therefore preserve layout information. For aesthetic reasons a regular layout style is also preferred. However, one consideration of good CG layout (as opposed to general graph layout) is that understandability is the primary goal rather than attractiveness [2]. The editor we describe (WebKB-GE) limits manipulation of the graph to the canonical formation rules [16] (copy, restrict, join, simplify). Atomic graphs are also canonical, and therefore any CG constructed using WebKB-GE will be canonical. WebKB [12] is a public domain experimental knowledge annotation toolkit. It allows indices of any Document Elements (DEs) on the WWW to be built using annotations in CGs. This permits the semantic content, and relationships to other DEs, to be precisely described. Search is initiated remotely, via a WWW browser and/or a knowledge engine. This enables construction of documents using inference within the knowledge engine to assemble DEs. Additionally, the knowledge base provides an alternate index through which both query and direct hyperlink navigation can occur. WebKB has been built using JavaScript and Java for the WWW-based interface and C and C++ for the inference engines. One of the goals of the WebKB toolkit is to aid Computer Supported Co-operative Work (CSCW). WebKB-GE is integrated into WebKB and therefore multi-user/distributed features are implemented.
2 Design Goals
WebKB-GE is designed to be used by domain experts in a distributed co-operative environment. This means: (i) domain-dependent base languages must
be distributed; (ii) co-operation depends on a shared understanding of a base language; (iii) domain experts are not necessarily experts in CG theory; (iv) large, collaborative domain knowledge bases are difficult to navigate; (v) a medium for collaborative communications must be provided. The design of WebKB-GE supports the construction of accurate, well-formed CGs, allowing the user to experience a CG canon's expressiveness. This is achieved through a direct manipulation interface. The properties of a graph's depiction are explicitly stored between sessions. WebKB-GE is designed to operate as a client tool in a distributed environment.

2.1 Direct Manipulation
Direct manipulation (DM) allows complex commands to be activated by direct and intuitive user actions. DM means the visibility of the object of interest; rapid, reversible, incremental actions; and replacement of a complex command language by direct manipulation of the object of interest [15]. It should allow the user to form a model of the object represented in the interface [4]. A well recognised subclass of DM interfaces is the graphical object editor [17], where the subject is edited through interaction with a graphical depiction. Unidraw [18] is a toolkit explicitly designed to support the construction of graphical object editors. WebKB-GE is an example of a graphical object editor handling CGs. Central to the DM interface is the Editing/Manipulation Pane. This contains a number of Objects manipulated by Tools. Relations between objects are also indicated. Objects provide visual representations of complex entities. The Concepts and Relations of CGs are graphical objects in WebKB-GE. A concept may contain additional information (an individual name, for example) but only the Type is displayed in the visual representation. A palette of tools is provided to manipulate objects. The behaviour of a tool may be a function of the manipulated object, so each tool expresses an abstract operation. "Operation animation" is an essential feature of a DM interface. An operation like "move" allows the user to see the object dragged from the start to the finish point. Visual feedback about the success or failure of an operation must be provided.

2.2 Canonical Graphs
WebKB is aimed at the domain expert. It is important to restrict the graphs to those derivable from a canon. To ensure canonical graphs, the only operations allowed are the canonical formation rules [16]: (i) Copy - a copy of a canonical graph is canonical; (ii) Restrict - a more general type may be restricted to a more specific type (as defined in the Type hierarchy); also, a generic concept may be replaced by an individual object of that type; (iii) Join - two canonical sub-graphs containing an identical concept are joined at that concept; (iv) Simplify - when a relation is duplicated between identical concepts the duplicates are redundant and removed. A distinction is made between operations that affect the graph and operations that affect the representation of the graph. Each of the four canonical operations operates on the underlying graph, and the visual representation is updated accordingly. These are the only operations allowed on the graph itself. Operations on the representation of the graph, such as a "move" operation, are also allowed.
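The four formation rules can be sketched on a toy graph representation. This Python sketch is illustrative only: the dictionary-based CG encoding and the `subtypes` table are assumptions, not WebKB-GE's actual Java data structures:

```python
import copy

# Toy sketch of the four canonical formation rules, under an assumed CG
# encoding: "concepts" is a dict id -> (type, referent); "relations" is a
# list of (relation_type, (concept_id, ...)) tuples. `subtypes` stands in
# for the canon's type hierarchy.

subtypes = {"Entity": {"Entity", "User"}, "User": {"User"}}

def copy_graph(g):
    # Copy: any copy of a canonical graph is canonical.
    return copy.deepcopy(g)

def restrict(g, cid, new_type=None, referent=None):
    # Restrict: specialize a concept's type, or replace a generic
    # concept by an individual of that type.
    t, r = g["concepts"][cid]
    if new_type is not None:
        assert new_type in subtypes[t], "restriction must follow the hierarchy"
        t = new_type
    if referent is not None:
        r = referent
    g["concepts"][cid] = (t, r)

def join(g1, g2, cid1, cid2):
    # Join: merge g2 into g1 at two identical concepts.
    assert g1["concepts"][cid1] == g2["concepts"][cid2]
    g = copy_graph(g1)
    for cid, c in g2["concepts"].items():
        if cid != cid2:
            g["concepts"][cid] = c
    remap = lambda x: cid1 if x == cid2 else x
    g["relations"] += [(t, tuple(remap(a) for a in args))
                       for t, args in g2["relations"]]
    return g

def simplify(g):
    # Simplify: remove duplicate relations between identical concepts.
    g["relations"] = list(dict.fromkeys(g["relations"]))

g1 = {"concepts": {"a": ("Entity", "*")}, "relations": []}
restrict(g1, "a", new_type="User", referent="sue")
print(g1["concepts"]["a"])  # ('User', 'sue')
```

Since atomic graphs are canonical and each operation preserves canonicity, any graph built only through these functions stays canonical, which is exactly the guarantee the editor relies on.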
Fig. 1. Visual feedback in a WebKB-GE join operation. The left hand side shows an unsuccessful join. The system starts in a stable state (a); a concept is dragged in (b). The join is invalid, so the dragged object turns red. The mouse is released at this point. The operation is undone by snapping the concept back to its previous position (c). The right hand side shows a successful join. The system starts in (d); the lower left concept is moved towards an identical concept (e). The mouse is released and the two concepts snap together as the join is performed (f).
2.3 Distributed Multi-User Application
CSCW tools such as WebKB allow members of a group to access a shared canon. The canon is received each time the user starts the application, to ensure changes are propagated from the central site. The server is not fixed and the user chooses from several. Graphs created by distributed users should be available to the other members of the workgroup. The ideal way is for clients to send graphs back to the central server. When users share information it is important that the original creator be acknowledged. These features are implemented in WebKB-GE.

2.4 Layout Retention
In preceding local implementations of CG editors [1, 3] only the graph operations were considered important. No display or linear form editors in the literature have the capacity to maintain display form representations [11, 13]. In these tools the formation rules are sometimes implemented for graph manipulation, but only the final linear form of the graph is stored between editing sessions. The visual representation of a conceptual graph contains additional meta-information significant to the graph's creator. Layout should be disturbed as little as possible by operations performed on the underlying CG. This allows the user's mental map [14] to be preserved. Additionally, the user will want to alter the visual representation while not altering the underlying graph. The only such operation permitted is moving objects to new spatial locations. These moving operations are constrained: a regularity is imposed by the layout language describing the visual representations. The method chosen for layout storage is implemented according to Burrow and Eklund [2] and described below in Section 3.4.
3 Implementation

3.1 Architecture

A general multiple server/multiple client architecture allows communication over a network such as the Internet. The network is not an essential part of the system: both server and client can run on a single machine. The server code controls distribution of a canon to clients. In addition, initial layouts for a canon are sent to a client. The server can handle database access for a shared CG canon. The client requests a copy of a canon from a server, as well as the layout of any new subgraphs the user adds to the editing pane.
3.2 Implementation Language: Java
One important issue is the difference between Java Applets and Java Applications. An Applet is a Java program [5] that runs within a restricted environment, normally a Web browser. This restricted environment does not allow certain operations for security reasons: (i) making a network connection to a machine other than that from which the applet was loaded. This restricts each instance of the client to contacting a single server; changing servers requires the whole applet to be re-loaded from the new server; (ii) accessing the local disk. The client is unable to read/write to the local disk and unable to save/load CGs locally. This is not a large problem, depending on how saving graphs is handled: if all graphs are saved via the server, no local access is required. Each time an applet is loaded in a web browser, all code for that applet is downloaded from the server to the client machine. This ensures the user is receiving continuously updated software, but can be slow if the applet is large. With a Java application, the Java Run-time Environment (JRE) is downloaded separately for the appropriate platform, and the application code is executed using the JRE. Applet restrictions do not apply to Java Applications, and for this reason both the client and server code are written as Java Applications. A number of DM toolkits are available for use with Java. Most provide a layer on top of the Abstract Windowing Toolkit (AWT) to make interface creation straightforward. Some of the more widely known toolkits are subArctic [6], Sgraphics [9] and the Internet Foundation Classes [7]. The DM toolkit for WebKB-GE is subArctic [6], a constraint-based layout toolkit. At the time WebKB-GE was written it was the most stable of the supported toolkits. It is also available free for both commercial and non-commercial use. ObjectSpace has created the "Generic Collection Library" (JGL) for Java [8]. This library was also used.
3.3 Communication
For communication between client and server a simple protocol was implemented. Currently only two operations are implemented: (i) Canon Request - the server responds by returning a copy of the canon being served, read from its local disk; (ii) Layout Request - the server reads the relation name sent by the client and returns the layout specification for that relation. Additional operations, such as canon modifications and database access, can be added to the protocol if required. The client application contains two parsers to process information from a server. One parser reads the linear CGs and other information
from the canon to build the internal CG data store. The second reads the layout and builds the visual representation of each graph.
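The two-operation protocol can be sketched as a simple request dispatcher. This Python sketch abstracts away the network layer; the request keywords, the canon text and the layout table are assumptions, not the actual WebKB-GE wire format:

```python
# Sketch of the two-operation client/server protocol. CANON-REQUEST
# returns the served canon; LAYOUT-REQUEST <relation> returns the layout
# specification for that relation. Contents are placeholders.

CANON = "lattice UNIVERSAL, ABSURD is type: *."
LAYOUTS = {"Spatial_binaryRel": "[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!7]."}

def handle_request(line):
    """Dispatch one request line to the matching server operation."""
    if line == "CANON-REQUEST":
        return CANON                      # a copy of the canon being served
    if line.startswith("LAYOUT-REQUEST "):
        relation = line.split(" ", 1)[1]  # relation name sent by the client
        return LAYOUTS.get(relation, "ERROR: no layout for " + relation)
    return "ERROR: unknown operation"

print(handle_request("CANON-REQUEST"))
print(handle_request("LAYOUT-REQUEST Spatial_binaryRel"))
```

Further operations (canon modification, database access) would be added as new branches of the dispatcher, which is consistent with the extensibility noted above.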
3.4 Layout Language
The layout language devised to display CGs is the feature that differentiates this editor from others [11, 13, 3]. The language originates in Kamada and Kawai [10], who developed a constraint-based layout language (COOL) to generate pictorial representations of text. Graphical relations in COOL are one of either geometric relations (constraints between variables characterising the objects) and drawing relations (lines and labels on and between objects). Specifying layout is performed in two stages: (i) constraining the (defined) reference points to lie on a geometric curve (line, arc, spline); (ii) connecting the reference points of the object by a geometric curve with an arrowhead. Burrow and Eklund [2] devised a canon that represents the visual structure of conceptual graphs in the language of CGs. Following that work, the actual physical locations of objects are not stored in WebKB-GE. Instead, spatial relationships between the objects are captured in the language. A container-based approach is used. All objects must be stored in horizontal and vertical containers. The ordering of the objects within containers is preserved as objects are moved. If a container in either orientation does not exist at the final move location, a new container is created in that direction. Moving the final object out of a container dissolves the container. Horizontal containers are defined to extend to the width of the editing pane with the height of the tallest object contained. Vertical containers are defined to extend to the height of the editing pane with the width of the widest object. In WebKB, as in any co-operative work environment, it is important to record the original source of knowledge and data. Because layout information is saved separately from the linear form, a mapping from linear form to display objects is also required. This is achieved by generating a quasi-unique identifier for every graph component.
This identifier is created using the Internet Protocol (IP) address of the machine on which the graph was created, along with a time-stamp. This results in a twenty-four digit identifier. Inside the client application the linear and display forms of the graph are stored and manipulated separately. The two forms are synchronised using the twenty-four digit identifier discussed above. The abstract (linear) graph information is stored in a series of data structures descending from the editing pane. The canon currently in use is also stored in the editing sheet and constructed from the Type and Relation hierarchies. Each entry in the Relation hierarchy contains a graph describing the concept types to be connected by the relation. Both graphs in relation definitions and user-constructed graphs have their abstract information stored internally in a CG map: one for each graph. The graphical layout is stored in a series of containers, with the top level being the "Container Constrainer". This handles alignment of horizontal and vertical containers and creates/destroys containers. Each container is responsible for managing the layout of the objects contained within it. When objects change their dimensions or are added and removed, the container resizes appropriately. The code
ensures containers are not too close together. Each object in the graphical representation maintains a connection with the abstract graph object it displays. When graph operations are targeted on a graphical object, the abstract object is retrieved. Links between abstract graph objects (edges) are not stored in object containers but maintained within the user interface. Due to the simplicity of the layout language there are only a very small number of definitions that may appear.

[HBOX: hboxnum] -> (HCONTAINS) -> [ELEMENT: !uniqueid].
[VBOX: vboxnum] -> (VCONTAINS) -> [ELEMENT: !uniqueid].

hboxnum and vboxnum denote the box into which to insert the element. The box numbers do not have to indicate any sort of ordering. The uniqueid is the placeholder for the corresponding element in the linear description.

[ELEMENT: !uniqueid] -> (ELTRIGHT) -> [ELEMENT: !uniqueid].
[ELEMENT: !uniqueid] -> (ELTBELOW) -> [ELEMENT: !uniqueid].

The second uniqueid indicates that the corresponding element is to the right of (below) the first uniqueid.
[HBOX: hboxnum] -> (BOXBELOW) -> [HBOX: hboxnum].
[VBOX: vboxnum] -> (BOXRIGHT) -> [VBOX: vboxnum].

The second box specified is below (to the right of) the first. With these definitions the relative layout is stored. When a relation is added to the editing pane by the user, or a graph is loaded from a file, the following process occurs:
1. the linear form is loaded into a temporary CG map; this restricts the search space for resolving unique IDs and allows rollback if an error occurs (if a relation from the canon is being added, this occurred when the canon was retrieved);
2. the layout script of the graph is processed and objects are placed in the containers; container objects are reordered correctly;
3. dummy objects are resolved using the appropriate abstract objects from the linear form; graphical representations of the links between objects are created and stored in the interface;
4. each graphical container is resized to fit the largest contained object; containers are assigned an initial starting position which accounts for spacing between the containers; positions of the graphical links are updated as the containers (and consequently, the objects) move;
5. if the previous stages occur successfully, the layout and abstract objects are merged in the editing pane.

[!1:User] -
{ -> (!2:Role) -> [!3:WN_expert],
  -> (!4:Chrc) -> [!5:Property]
}.
Fig. 2. The augmented linear form of the CG.

Fig. 2 shows the linear form augmented with unique identifiers. The graphical layout script is shown in Fig. 3 and the corresponding screen-shot in Fig. 4. The format of the canon used by the editor is a simple series of definitions giving the concept and relation hierarchies. The lattices containing the definitions must be defined first, with a separate lattice for every relation arity required. For example (from the default canon):
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!4].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!2].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!5].
[HBOX:3] -> (HCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTRIGHT) -> [ELEMENT:!4].
[ELEMENT:!2] -> (ELTRIGHT) -> [ELEMENT:!5].
[HBOX:1] -> (BOXBELOW) -> [HBOX:2].
[HBOX:2] -> (BOXBELOW) -> [HBOX:3].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!1].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!2].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!4].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!5].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTBELOW) -> [ELEMENT:!2].
[ELEMENT:!4] -> (ELTBELOW) -> [ELEMENT:!5].
[ELEMENT:!5] -> (ELTBELOW) -> [ELEMENT:!3].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].

Fig. 3. The layout description of the graph.
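A script of this shape can be interpreted by collecting elements into their containers and ordering them with the ELTRIGHT/ELTBELOW constraints. The following Python sketch is an assumption-laden stand-in for the editor's Java parser (the regex assumes the cleaned triple syntax shown in Fig. 3):

```python
import re

# Sketch: parse layout triples, group elements by container, and order
# each container's members using the declared left/above constraints.

script = """
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!4].
[ELEMENT:!1] -> (ELTRIGHT) -> [ELEMENT:!4].
"""

triple = re.compile(r"\[([^\]]+)\] -> \(([A-Z]+)\) -> \[([^\]]+)\]\.")

def interpret(text):
    boxes, order = {}, []
    for src, rel, dst in triple.findall(text):
        if rel in ("HCONTAINS", "VCONTAINS"):
            boxes.setdefault(src, []).append(dst)   # element joins a container
        elif rel in ("ELTRIGHT", "ELTBELOW"):
            order.append((src, dst))                # src precedes dst
    # Sort each container: an element's rank is how many constraints
    # place something before it (works for the chains in these scripts).
    for members in boxes.values():
        members.sort(key=lambda e: sum(1 for a, b in order if b == e))
    return boxes

print(interpret(script))  # {'HBOX:1': ['ELEMENT:!1', 'ELEMENT:!4']}
```

Because only relative positions are stored, physical coordinates can be recomputed freely, which is what lets the editor preserve the user's mental map across sessions.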
Fig. 4. The top level window with the example graph loaded.

lattice UNIVERSAL, ABSURD is type: *.
lattice TOP-T1-T1, BOT-T1-T1 is relation (UNIVERSAL, UNIVERSAL).

The concept hierarchy is then defined:

type Entity(x) is [!1:UNIVERSAL:*x].
type Situation(x) is [!2:UNIVERSAL:*x].
type Something_playing_a_role(x) is [!3:UNIVERSAL:*x].

Finally the relation hierarchies are defined:

relation Attributive_binaryRel(x, y) is
  [!1:UNIVERSAL:*x] -> (!2:TOP-T1-T1) -> [!3:UNIVERSAL:*y].
relation Spatial_binaryRel(x, y) is
  [!7:UNIVERSAL:*x] -> (!8:TOP-T1-T1) -> [!9:UNIVERSAL:*y].
relation Component_binaryRel(x, y) is
  [!4:UNIVERSAL:*x] -> (!5:TOP-T1-T1) -> [!6:UNIVERSAL:*y].
Once the linear sections of the canon have been defined, the initial layouts of the relations must be defined. Each relation layout is specified in a file with the naming form: -.layout. A layout script for the graph is contained in that file. For example, the layout for Spatial_binaryRel is:

[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!7].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!8].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!9].
[ELEMENT:!7] -> (ELTRIGHT) -> [ELEMENT:!8].
[ELEMENT:!8] -> (ELTRIGHT) -> [ELEMENT:!9].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!7].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!8].
[VBOX:3] -> (VCONTAINS) -> [ELEMENT:!9].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].
[VBOX:2] -> (BOXRIGHT) -> [VBOX:3].
Conclusion
A visual-form CG editor has been designed and implemented. Editing operations are restricted to the four canonical formation rules. This guarantees well-formed CGs. The key feature of this editor is the use of a graphical scripting language to capture relevant details of the graph layout. This information is stored along with the linear information.
The application is written to be used for co-operative work by network-connected users and in particular for use in the WebKB indexation toolkit. Only simple graphs are currently supported. The ability to extend the framework to nested graphs, along with extending the layout language with different containers, is inherent in the design of the editor. WebKB and WebKB-GE may be obtained from http://www.int.gu.edu.au/kvo.
References

1. A.L. Burrow. Meta tool support for a GUI for conceptual structures. Hons. thesis, Dept. of Computer Science, www.int.gu.edu.au/kvo/reports/andrew.ps.gz, 1994.
2. A.L. Burrow and P.W. Eklund. A visual structure representation language for conceptual structures. In Proceedings of the 3rd International Conference on Conceptual Structures (Supplement), pages 165-171, 1995.
3. Peter W. Eklund, Josh Leane, and Chris Nowak. GRIT: An implementation of a graphical user interface for conceptual structures. Technical Report TR94-03, University of Adelaide, Dept. of Computer Science, Feb. 1994.
4. P.W. Eklund, J. Leane, and C. Nowak. GRIT: A GUI for conceptual structures. In Proceedings of the 2nd International Workshop on PEIRCE, ICCS-93, 1993.
5. James Gosling and Henry McGilton. The Java Language Environment: A White Paper. Technical report, Sun Microsystems, 1996.
6. Scott Hudson and Ian Smith. subArctic User Manual. Technical report, GVU Center, Georgia Institute of Technology, 1997.
7. IFC Dev. Guide. Technical report, Netscape Communications Corp., 1997.
8. JGL User Manual. Technical report, ObjectSpace Inc., 1997.
9. Mike Jones. Sgraphics Design Documentation. Technical report, Mountain Alternative Systems, http://www.mass.com/software/sgraphics/index, 1997.
10. T. Kamada and S. Kawai. A general framework for visualizing abstract objects and relations. ACM Transactions on Graphics, 10(1):1-39, January 1991.
11. R.Y. Kamath and Walling Cyre. Automatic integration of digital system requirements using schemata. In 3rd International Conference on Conceptual Structures ICCS'95, number 954 in LNAI, pages 44-58, Berlin, 1995. Springer-Verlag.
12. P.H. Martin. The WebKB set of tools: a common scheme for shared WWW annotations, shared knowledge bases and information retrieval. In Proceedings of the CGTools Workshop at the 5th International Conference on Conceptual Structures ICCS'97, pages 588-595. Springer-Verlag, LNAI 1257, 1997.
13. Jens-Uwe Möller and Detlev Wiesse. Editing conceptual graphs. In Proceedings of the 4th International Conference on Conceptual Structures ICCS'96, pages 175-187, Berlin, 1996. Springer-Verlag, LNAI 1115.
14. George G. Robertson, Stuart K. Card, and Jock D. Mackinlay. Information visualization using 3D interactive animation. CACM, 36(4):57-71, Apr. 1993.
15. B. Shneiderman. Direct manipulation. Computer, 16(8):57-68, Aug. 1983.
16. J. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
17. J. Vlissides. Generalized graphical object editing. Technical Report CSL-TR-90-427, Dept. of Elec. Eng. and Computer Science, Stanford University, 1990.
18. John M. Vlissides and Mark A. Linton. Unidraw: A framework for building domain-specific graphical editors. ACM Trans. on Info. Systems, 8(3):237-268, Jul. 1990.
Mapping of CGIF to Operational Interfaces

A. Puder
International Computer Science Institute
1947 Center St., Suite 600
Berkeley, CA 94704-1198, USA
puder@icsi.berkeley.edu
Abstract. The Conceptual Graph Interchange Format (CGIF) is a notation for conceptual graphs which is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations". In this paper we argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and of runtime characteristics. We propose to employ the widespread middleware platform based on CORBA to allow interoperability within a heterogeneous environment. The major result of this paper is a specification of an operational interface, written in CORBA's Interface Definition Language (IDL), that is equivalent to CGIF yet better suited for the efficient implementation of applications in distributed systems.

Keywords: CGIF, CORBA, IDL.
1 Introduction
Conceptual Graphs (CG) are abstract information structures that are independent of a notation (see 5). Various notations have been developed for different purposes (see Figure 1). Among these are the display form (graphical notation) and the linear form (textual notation). These two notations are intended for human-computer interaction. Another notation, called the Conceptual Graph Interchange Format (CGIF), is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations" (i.e. the format of the data transmitted over the network). The reason for developing CGIF was to support interoperability for CG-based applications that needed to communicate with other CG-based applications. We argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and of runtime characteristics. Applications that need to interoperate are written by different teams of programmers, in different programming languages, using different communication protocols. A generalization of this problem is addressed by so-called middleware platforms. As the name suggests, these platforms reside between the operating system and the application. One prominent middleware platform is defined through the Common Object Request Broker Architecture (CORBA), which allows interoperability within a heterogeneous environment (see 4). In this paper we will show how to use CORBA for CG-based applications.
[Figure 1 diagram: the Linear Form and the Display Form serve human-computer interaction, while CGIF and CGIDL serve computer-computer interaction; all of these notations represent the intension of conceptual graphs.]
Fig. 1. Different notations represent the intension of conceptual graphs.
The outline of this paper is as follows: in Section 2 we give a short overview of CORBA. In Section 3 we discuss some drawbacks of using CGIF for distributed applications. In Section 4 we present our mapping of CGIF to CORBA IDL, which is further explained in Section 5 through an example. It should be noted that we describe work-in-progress. The following explanations emphasize the potential of using CORBA technology for CG-based applications. A complete mapping of CGIF to CORBA IDL is subject to further research.
2 Overview of CORBA
Modern programming languages employ the object paradigm to structure computation within a single operating system process. The next logical step is to distribute a computation over multiple processes on a single machine or even on different machines. Because object orientation has proven to be an adequate means for developing and maintaining large-scale applications, it seems reasonable to apply the object paradigm to distributed computation as well: objects are distributed over the machines within a networked environment and communicate with each other. As a fact of life, the computers within a networked environment differ in hardware architecture, operating system software, and the programming languages used to implement the objects. That is what we call a heterogeneous distributed environment. To allow communication between objects in such an environment, one needs a rather complex piece of software called a middleware platform. The Common Object Request Broker Architecture (CORBA) is a specification of such a middleware platform. The CORBA standard is issued by the Object Management Group (OMG), an international organization of over 750 software vendors, software developers, and users. The goal of the OMG is the establishment of industry guidelines and object management specifications to provide a common framework for application development. CORBA addresses the following issues:
- Object orientation: Objects are the basic building blocks of CORBA applications.
- Distribution transparency: A caller uses the same mechanisms to invoke an object whether it is located in the same address space, on the same machine, or on a remote machine.
- Hardware, OS, and language independence: CORBA components can be implemented using different programming languages on different hardware architectures running different operating systems.
- Vendor independence: CORBA-compliant implementations from different vendors interoperate, and applications are portable between different vendors.

One important aspect of CORBA is that it is a specification and not an implementation. CORBA just provides a framework allowing applications to interoperate in a distributed and heterogeneous environment, but it does not prescribe any specific technology for implementing the CORBA standard. The standard is freely available via the World Wide Web at http://www.omg.org/. Currently there exist many implementations of CORBA focusing on different market segments.
Fig. 2. Basic building blocks of a CORBA-based middleware platform.
Figure 2 gives an overview of the components of a CORBA system (depicted in gray), as well as the embedding of an application in such a platform (white components). The Object Request Broker (ORB) is responsible for transferring operations from clients to servers. This requires the ORB to locate a server implementation (and possibly activate it), transmit the operation and its parameters, and finally return the results back to the client.
An Object Adapter (OA) offers various services to a server such as the management of object references, the activation of server implementations, and the instantiation of new server objects. Different OAs may be tailored for specific application domains and may offer different services to a server. The ORB is responsible for dispatching between different OAs. One mandatory OA is the so-called Basic Object Adapter (BOA). As its name implies, it offers only very basic services to a server. The interface between a client and a server is specified with an Interface Definition Language (IDL). According to the object paradigm, an IDL specification separates the interface of a server from its implementation. This way a client has access to a server's operational interface without being aware of the server's implementation details. An IDL compiler generates a stub for the client and a skeleton for the server which are responsible for marshalling and unmarshalling the parameters of an operation. The Dynamic Invocation Interface (DII) and the Dynamic Skeleton Interface (DSI) allow the sending and receiving of operation invocations. They represent the marshalling and unmarshalling API offered to stubs and skeletons by the ORB. Different ORB implementations can interoperate through the Internet Inter-ORB Protocol (IIOP), which describes the on-the-wire representations of basic and constructed IDL data types as well as the message formats needed for the protocol. In that respect it defines a transfer syntax, just like CGIF does. The design of IIOP was driven by the goal to keep it simple, scalable, and general.
3 Using CGIF in a heterogeneous environment
In this section we explain how CGIF might be used in constructing distributed applications and the disadvantages this has. Figure 3 depicts a typical configuration. The application consists of a client and a server, communicating via a transport layer over the network. The messages being exchanged between the client and the server contain CGs as defined by CGIF. First note that this is not sufficient for a distributed application. CGIF only allows one to encode parameters for operations, but the kind of operations to be invoked at the server is out of the scope of CGIF. The distinction between parameters and operations corresponds to the distinction between KIF and KQML (see 2). In that respect there is no equivalent to KQML in the CG-world. One of our premises is that the client and server can be written in different programming languages, running on different operating systems using different transport media. A programmer would most certainly define some data structures to represent a CG in his/her programming language. In order to transmit a CG, the internal data structure needs to be translated to CGIF. This is accomplished by a stub. On the server side the CG coded in CGIF needs to be translated back to an internal data structure again. This is done by a skeleton. The black nodes in Figure 3 show the important steps in the translation process.

Fig. 3. Marshalling code is contained in stubs and skeletons.

At step 1 the CG exists as a data structure in the programming language used to implement the client. The stub, which is also written in the same programming language as the client, translates this data structure to CGIF (step 2). After transporting the CG over the network, it arrives at the server (step 3). The skeleton translates the CG back into a data representation in the programming language used for the implementation of the server. At step 4 the CG can finally be processed by the server. CGIF does not prescribe an internal data structure in a programming language; i.e., using CGIF for transmitting CGs, a programmer must first make such a definition based on his/her programming language, followed by manual coding of the stub and skeleton. This imposes a high overhead on the application creation process. The main benefit of using CORBA IDL is that stubs and skeletons are automatically generated by an IDL compiler and there is a well-defined mapping from IDL to different programming languages. Furthermore, an IDL specification allows the specification not only of parameters but also of operations, which makes it suitable for the specification of operational interfaces between application components. Although an IDL specification induces a transfer syntax through IIOP similar to CGIF, an IDL specification is better suited for the design of distributed applications. An IDL specification hides the transfer syntax and focuses on the user-defined types, which are mapped to different programming languages by an IDL compiler. CGIF on the other hand exposes the complexity of the transfer syntax to an application programmer, who is responsible for coding stubs and skeletons.
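To make this manual-translation burden concrete, here is a minimal sketch of a hand-written client-side stub. All data structures and the simplified CGIF-like output syntax are invented for illustration; they are not the actual CGIF grammar.

```python
# Hypothetical internal CG representation (step 1) and a hand-written stub
# that flattens it to a CGIF-like wire string (step 2). A matching skeleton
# would have to parse this string back into the server's own data structures
# (steps 3 and 4) -- all of which CGIF leaves to the programmer.

class Concept:
    def __init__(self, ctype, referent):
        self.ctype, self.referent = ctype, referent

class Relation:
    def __init__(self, rtype, args):
        self.rtype, self.args = rtype, args  # ordered Concept arguments

def marshal(relations):
    """Client-side stub: internal graph -> textual wire form."""
    parts = []
    for r in relations:
        args = " ".join("[%s: %s]" % (c.ctype, c.referent) for c in r.args)
        parts.append("(%s %s)" % (r.rtype, args))
    return " ".join(parts)

cat, mat = Concept("Cat", "Felix"), Concept("Mat", "*")
print(marshal([Relation("On", [cat, mat])]))  # (On [Cat: Felix] [Mat: *])
```

With CORBA, by contrast, this translation code is generated automatically from the IDL specification.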
4 CG interface through IDL
In this section we show how to translate some of the basic definitions of the proposed CG standard to CORBA IDL. The explanations presented here should
be seen as a proof of concept. A more thorough approach including all definitions of the CG standard is still a research topic. The mapping we explain in the following covers definitions 3.1 (Conceptual Graphs), 3.2 (Concept) and 3.3 (Conceptual Relation) of the proposed CG standard (see 6). The basic design principle of the operational interface is to exploit some common features of the object paradigm and the CG definitions. A conceptual graph is a bipartite graph consisting of concept and conceptual relation nodes. This definition resembles an object graph, where objects represent the nodes of a CG and object references the arcs between the nodes. Therefore it seems feasible to model the nodes of a CG through CORBA objects and the links between the nodes through object references.

This way of modelling a CG through CORBA IDL has several advantages. Since object references are used to connect relation nodes with concept nodes, one CG can be distributed over several hosts. The objects which denote the nodes of a CG are not required to remain in the same address space, since an object reference can span host boundaries in a heterogeneous environment. Furthermore, a CG does not necessarily need to be sent by value, but rather by reference: a node is transferred over the network only if the receiving side of an operation actually accesses it. This scheme enables a lazy evaluation strategy for CG operations.

It is common to place all IDL definitions related to a particular service in a separate namespace to avoid name conflicts with other applications. Therefore, we assume that the following IDL definitions are embraced by an IDL module:

module CG {
    // Here come all definitions related to
    // the CG module
};

Using the inheritance mechanism of CORBA IDL we first define a common base interface for concepts and conceptual relations. This interface is called Node and defined through the following specification:

typedef string Label;

interface Node {
    attribute Label type;
};

The interface Node contains all the definitions which are common to concept and relation nodes. The common property shared by those two kinds of nodes is the type label, which is represented through an attribute of type Label. Note that we made Label a synonym for string through a typedef declaration. Next, we define the interface for all concept nodes:

interface Concept : Node {
    attribute Label referent;
};

typedef sequence<Concept> ConceptSeq;
The interface Concept inherits all properties (i.e., the attribute type) from interface Node and adds a new property, namely an attribute for the referent. The following typedef defines an unbounded sequence of concepts. This is necessary for the definition of the relation node:

interface Relation : Node {
    attribute ConceptSeq links;
};

Just as the interface Concept, the interface Relation inherits all properties from Node. The new property added here is a list of neighboring concept nodes, represented by the attribute links. Note that ConceptSeq is an ordered sequence. The length of this sequence corresponds to the arity of the relation. The first item in a sequence refers to the concept node pointing towards the relation. The final definition gives the IDL abstraction of a CG:

typedef sequence<Node> Graph;

A CG is represented by a list of interface Node. For brevity the IDL type is called Graph. The order of appearance of the individual nodes is of no importance. This data structure suffices to transmit a simple CG over the network. In the following section we provide a little example of how this definition might be used in a real application context.

5 Example
Given the basic specifications from the previous section, how does a programmer develop an application using CGs? Using the CORBA framework, the programmer would have to design the interface to the application to be written based on the definitions from the previous section. E.g., consider a simple application which would offer database functionality for CGs, such as save, restore, etc. The resulting IDL specification would look something like the following:

#include "cg.idl"

interface DB {
    typedef string Key;

    exception Duplicate {};
    exception NotFound {};
    exception IllegalKey {};

    Key save( in CG::Graph c ) raises( Duplicate );
    CG::Graph retrieve( in Key k ) raises( NotFound, IllegalKey );
    void delete( in CG::Graph c ) raises( NotFound, IllegalKey );
};
This example assumes that the basic definitions for CGs from Section 4 are stored in a file called "cg.idl". The definitions are made known through the #include directive. Access to the database is defined through interface DB. The database stores CGs and assigns a unique key to each CG. The database allows one to save, retrieve and delete CGs. If a CG is saved, the database returns a unique key for this CG. The operations retrieve and delete need a key as an input parameter. Errors are reported through exceptions. The IDL definition of interface DB is all that a client program will need in order to access an implementation. As pointed out before, the language and precise technology that was used to implement the database are irrelevant to the client.
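To illustrate how these definitions fit together, the following plain-Python analogue mirrors the Node/Concept/Relation/Graph types of Section 4 and the DB interface above as an in-process toy. This is not CORBA: in a real setting an IDL compiler would generate the stubs and the DB object would live in another process, reached through an object reference.

```python
# Plain-Python stand-ins for the IDL types; the class names mirror the IDL,
# everything else is invented for illustration.

class Node:
    def __init__(self, type_):
        self.type = type_                 # attribute Label type

class Concept(Node):
    def __init__(self, type_, referent):
        super().__init__(type_)
        self.referent = referent          # attribute Label referent

class Relation(Node):
    def __init__(self, type_, links):
        super().__init__(type_)
        self.links = links                # attribute ConceptSeq links (ordered)

class NotFound(Exception):
    """Stand-in for the IDL exception NotFound."""

class DB:
    """In-process toy version of the IDL interface DB."""
    def __init__(self):
        self._store, self._next = {}, 0

    def save(self, graph):
        self._next += 1
        key = str(self._next)
        self._store[key] = graph
        return key                        # unique key for the stored CG

    def retrieve(self, key):
        if key not in self._store:
            raise NotFound()
        return self._store[key]

# A one-relation CG as an object graph (typedef sequence<Node> Graph).
cat, mat = Concept("Cat", "Felix"), Concept("Mat", "*")
graph = [cat, mat, Relation("On", [cat, mat])]

db = DB()
key = db.save(graph)
print(db.retrieve(key)[2].links[0].referent)  # Felix
```

Note that the relation node holds direct references to its concept nodes, which is exactly the property that lets a CORBA implementation pass a CG by reference and fetch nodes lazily.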
6 Conclusion
The construction of CG-based applications can benefit from the usage of middleware platforms. In this paper we have shown how to translate the basic definitions for a conceptual graph to CORBA IDL. A conceptual graph is represented through a set of CORBA objects which do not need to reside in the same address space. Besides language independence, this has the advantage of supporting lazy evaluation strategies for CG operations. Once a proper mapping from CGIF to CORBA IDL has been accomplished, the existing CG applications should be re-structured using CORBA (see 1). By doing so, those applications could more easily exploit specific services they offer among each other and to other applications. There are several CORBA implementations available, including free ones (see 3). Applications which use CORBA as a middleware platform can be easily accessed in a standardized fashion from any CORBA implementation, such as the one included in the Netscape Communicator.
References

1. CGTOOLS. Conceptual Graphs Tools homepage, http://cs.une.edu.au/~cgtools/, School of Mathematical and Computer Science, University of New England, Australia, 1997.
2. M.R. Genesereth and S.P. Ketchpel. Software agents. Communications of the Association for Computing Machinery, 37(7):48-53, July 1994.
3. MICO. A free CORBA 2.0 compliant implementation, http://www.vsb.informatik.uni-frankfurt.de/~mico, Computer Science Department, University of Frankfurt, 1997.
4. Object Management Group (OMG). The Common Object Request Broker: Architecture and Specification, Revision 2.2, February 1998.
5. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley Publishing Company, 1984.
6. J.F. Sowa. Standardization of Conceptual Graphs. ANSI draft, 1998.
TOSCANA-Systems Based on Thesauri

Bernd Groh¹, Selma Strahringer², and Rudolf Wille²

¹ School of Information Technology, Griffith University, PMB 50 Gold Coast Mail Centre, QLD 9726, Australia
e-mail: [email protected]
² Fachbereich Mathematik, Technische Universität Darmstadt, Schloßgartenstr. 7, 64289 Darmstadt, Germany
e-mail: {strahringer, wille}@mathematik.tu-darmstadt.de
Abstract. TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. Such interaction uses conceptual data systems which are based on formal contexts consisting of relationships between objects and attributes. Those formal contexts often have attributes taken from a thesaurus, which may be understood as an ordered set and be completed to a join-semilattice (if necessary). The join of thesaurus terms indicates the degree of resemblance of the terms and should therefore be included in the formal contexts containing those terms. Consequently, the formal contexts of a conceptual data system based on a thesaurus should have join-closed attribute sets. A problem arises for the TOSCANA-system implementing such a conceptual data system because the attributes in a nested line diagram produced by TOSCANA might not be join-closed, although its components have join-closed attribute sets. In this paper we offer a solution to this problem by developing a method for extending line diagrams to those whose attributes are join-closed. This method allows one to implement TOSCANA-systems based on thesauri which respect the join-structure of the thesauri.
Keywords: Conceptual Knowledge Processing, Formal Concept Analysis, Drawing Line Diagrams, Thesauri

1 TOSCANA
TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. TOSCANA realizes conceptual data systems 17, 20, which are mathematically specified systems consisting of a data context and a collection of formal contexts, called conceptual scales, together with line diagrams of their concept lattices. There is a connection between formal objects of the conceptual scales and the objects in the data context that can be activated to conceptually represent the data objects within the line diagrams of the conceptual scales. This allows thematic views into the database (underlying the data context) via graphically presented concept lattices showing networks of conceptual relationships. The views may even be combined, interchanged, and refined so that a flexible and informative navigation through a conceptual landscape derived from the database can be performed (cf. 21).

For an elementary understanding of conceptual data systems, it is best to assume that the data are given by a larger formal context K := (G, M, I). A conceptual scale derived from the data context K can then be specified as a subcontext (G, Mj, I ∩ (G × Mj)) with Mj ⊆ M. A basic proposition of Formal Concept Analysis states that the concept lattice of K can be represented as a ∨-subsemilattice within the direct product of the concept lattices of the subcontexts (G, Mj, I ∩ (G × Mj)) (j ∈ J) if M = ⋃_{j∈J} Mj (3, p. 77). This explains how every concept lattice can be represented by a nested line diagram of smaller concept lattices. Figure 1 shows a nested line diagram produced with the assistance of TOSCANA from a database about environmental literature. One concept lattice is represented by the big circles with their connecting line segments, while the line diagram of the second is inserted into each big circle. In such a way the direct product of any two lattices can be diagrammed, where a non-nested line diagram of the direct product may be obtained from a nested diagram by replacing each line segment between two big circles by line segments between corresponding elements of the two line diagrams inside the two big circles. In Fig. 1, the little black circles represent the combined concept lattice which has the two smaller concept lattices as ∨-homomorphic images.
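The derivation operators and formal concepts behind conceptual scales can be sketched in a few lines (a brute-force illustration over an invented toy context; this is not TOSCANA code):

```python
from itertools import combinations

# Toy formal context K = (G, M, I); all names invented for illustration.
G = ["river", "pond", "reservoir"]
M = ["natural", "artificial", "standing"]
I = {("river", "natural"),
     ("pond", "natural"), ("pond", "standing"),
     ("reservoir", "artificial"), ("reservoir", "standing")}

def intent(objs):
    """Attributes shared by all objects in objs (derivation operator)."""
    return {m for m in M if all((g, m) in I for g in objs)}

def extent(attrs):
    """Objects having all attributes in attrs (derivation operator)."""
    return {g for g in G if all((g, m) in I for m in attrs)}

def concepts():
    """All formal concepts (A, B) with extent(intent(A)) = A, brute force."""
    seen, result = set(), []
    for r in range(len(G) + 1):
        for combo in combinations(G, r):
            a = extent(intent(set(combo)))   # closure of the object set
            if frozenset(a) not in seen:
                seen.add(frozenset(a))
                result.append((sorted(a), sorted(intent(a))))
    return result

for a, b in concepts():
    print(a, b)
```

For this toy context the concept lattice has six elements, including the top concept (all objects, no common attributes) and the bottom concept (no object carries all three attributes); ordering these concepts by extent inclusion yields the kind of lattice TOSCANA draws as a line diagram.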
Since larger data contexts usually give rise to extensive object labelling, TOSCANA first attaches to a node the number of objects which generate the concept represented by that node; after clicking on that number, TOSCANA presents the names of all the counted objects. TOSCANA-systems have been successfully elaborated for many purposes in different research areas, but also on the commercial level. For example, TOSCANA-systems have been established: for analyzing data of children with diabetes 17, for investigating international cooperations 11, for exploring laws and regulations concerning building constructions 13, for retrieving books in a library 12, 15, for assisting engineers in designing pipings 19, for inquiring into flight movements at Frankfurt Airport 10, for inspecting the burner of an incinerating plant 9, for developing qualitative theories in music esthetics 14, for studying semantics of speech-act verbs 8, for examining the medical nomenclature system SNOMED 16, etc. In applying the TOSCANA program, the desire often arises to extend the program by additional functionalities, so that TOSCANA is still in a process of further development (cf. 18).
3 Translation of the labels in Fig. 1: Fluss/river, Oberflaechengewaesser/surface waters, Talsperre/impounded dam, Stausee/impounded lake, Staugewaesser/back water, Stauanlage/reservoir facilities, Staudamm/storage dam, Staustufe/barrage weir with locks, Seen/lakes, Teich/pond
130
2 Thesaurus Terms as Attributes
Data contexts, which are supposed to be analysed and explored conceptually, often have attributes that are terms of a thesaurus. Those terms are already hierarchically structured by relations between the terms. We discuss in this section how the hierarchical structure of the thesaurus terms should be respected in a TOSCANA-system with additional functionality (compare other approaches in 2, 5, 6). Let us first describe a formalization of thesauri that meets our purpose. Mathematically, we understand a thesaurus as a set T of terms together with an order relation ≤
- most-specialized(l1, l2) =
  l1 iff ∃ a specialization relation between l1 and l2,
  l2 iff ∃ a specialization relation between l2 and l1,
  0 otherwise.
- common-specialization(l1, l2) =
  most-specialized(l1, l2) iff most-specialized(l1, l2) ≠ 0,
  r(c1, ..., cn) iff type(r) ∈ T ∧ ∀ i ∈ [1, n], ci ≠ 0 (with r = r2 ∩ r1 and ∀ i ∈ [1, n], ci = common-specialization(adj(i, r2), adj(i, r1))),
  0 otherwise.
- most-generalized(l1, l2) =
  l2 iff ∃ a specialization relation between l1 and l2,
  l1 iff ∃ a specialization relation between l2 and l1,
  0 otherwise.
- common-generalization(l1, l2) =
  most-generalized(l1, l2) iff most-generalized(l1, l2) ≠ 0,
  r(c1, ..., cn) iff type(r) ∈ Ttr ∧ ∀ i ∈ [1, n], ci ≠ 0 (with r = r1 ∪ r2 and ∀ i ∈ [1, n], ci = common-generalization(adj(i, r2), adj(i, r1))),
  0 otherwise.
- most-instantiated(l1, l2) =
  l1 iff ∃ an instantiation relation between l1 and l2,
  l2 iff ∃ an instantiation relation between l2 and l1,
  0 otherwise.
- common-instantiation(l1, l2) = most-instantiated(l1, l2).
- most-conceptualized(l1, l2) =
  l1 iff ∃ an instantiation relation between l2 and l1,
  l2 iff ∃ an instantiation relation between l1 and l2,
  0 otherwise.
- common-conceptualization(l1, l2) =
  most-conceptualized(l1, l2) iff most-conceptualized(l1, l2) ≠ 0,
  r(c1, ..., cn) iff r1 = r2 ∈ Ttr ∧ ∀ i ∈ [1, n], ci ≠ 0 (with r = r1 = r2 and ∀ i ∈ [1, n], ci = common-conceptualization(adj(i, r2), adj(i, r1))),
  0 otherwise.
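As a hypothetical illustration of these operations (names invented, and simplified to bare concept types rather than full links), most-specialized can be sketched over a toy type hierarchy:

```python
# Toy concept-type hierarchy (child -> parent); invented for illustration.
PARENT = {"Cat": "Animal", "Dog": "Animal", "Animal": "Entity"}

def is_specialization(t1, t2):
    """True iff t1 lies below (or equals) t2 in the hierarchy."""
    while t1 != t2:
        if t1 not in PARENT:
            return False
        t1 = PARENT[t1]
    return True

def most_specialized(t1, t2):
    """The more specific of two comparable types; None plays the paper's 0."""
    if is_specialization(t1, t2):
        return t1
    if is_specialization(t2, t1):
        return t2
    return None

print(most_specialized("Cat", "Animal"))   # Cat
print(most_specialized("Cat", "Dog"))      # None
```

The full definitions above additionally recurse through relation nodes and their adjacent concepts, which this sketch omits.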
With the example of Fig. 3, common-specialization(l11, l21) = l21, common-generalization(l12, l22) = l12, common-instantiation(l13, l25) = l25, common-conceptualization(l14, l24) = l14, and common-generalization(l22, l23) = GTI → (incites-to) → Action-error.

6 Global Relations among Conceptual Graphs

6.1 Definitions
Let G1 = (C1, R1, Λ1) and G2 = (C2, R2, Λ2) be two CGs of Expert 1 and of Expert 2, corresponding to the same viewpoint v and based on the common support Scom. We adapt as follows the definition of graph morphism proposed by [1].

Ψ-morphism. A 3-tuple of functions (hc: C1 → C2, hr: R1 → R2, ha: Λ1 → Λ2) is called a Ψ-morphism from G1 to G2 iff ∀ l1 = r1(c11, ..., c1n) ∈ Λ1:
ha(l1) = hr(r1) (hc(c11), ..., hc(c1n)) ∧ type(hr(r1)) ≤ type(r1) ∧ ∀ i ∈ 1..n, type(hc(c1i)) ≤ type(c1i) ∧ referent(hc(c1i)) ≤ referent(c1i).
We define several binary relations on G1 × G2.

Same graph, Subgraph, Supergraph, Incompatibility.
- IsSameGraph (G1, G2) iff ∃ a bijective Ψ-morphism Ψ(G1 → G2) = (hc, hr, ha) such that ∀ l1 ∈ Λ1, is-same-link (l1, ha(l1)).
- IsSubgraph (G2, G1) iff ∃ a Ψ-morphism Ψ(G2 → G1) = (hc, hr, ha) such that ∀ l2 ∈ Λ2, is-same-link (ha(l2), l2).
- IsSupergraph (G2, G1) iff IsSubgraph (G1, G2).
- IncompatibleGraphs (G1, G2) iff ∃ l1 ∈ Λ1, ∃ l2 ∈ Λ2 such that are-incompatible-links (l1, l2).

Specialization¹. We define the relations IsConceptTotalSpecialization, IsConceptPartialSpecialization, IsRelationTotalSpecialization, IsRelationPartialSpecialization, IsRelation&ConceptPartialSpecialization and IsRelation&ConceptTotalSpecialization. For example:
- IsRelation&ConceptTotalSpecialization (G2, G1) iff ∃ a Ψ-morphism Ψ(G2 → G1) = (hc, hr, ha) such that ∀ l2 ∈ Λ2, l1 = ha(l2) ∈ Λ1 satisfies is-relation&concept-total-specialization (l2, l1).

Generalization. The generalization relations are defined w.r.t. the specialization ones.

Instantiation. We define the relations IsTotalInstantiation, IsPartialInstantiation, IsRelationSpecialization&TotalInstantiation and IsRelationSpecialization&PartialInstantiation. For example:
- IsRelationSpecialization&TotalInstantiation (G2, G1) iff ∃ Ψ(G2 → G1) such that ∀ l2 ∈ Λ2: is-relation-specialization&total-instantiation (l2, ha(l2)).

Conceptualization. The conceptualization relations are defined w.r.t. the instantiation relations. For lack of room, we do not detail the remaining relations (see [4], [12] for more details). According to the clustering of local relations among elementary links, the global relations between graphs are clustered as shown in Fig. 4. Based on this clustering, we
¹ Notice that our specialization and generalization relations differ from the terminology usually adopted in the Conceptual Graph community.
define the following functions:
- most-specialized (G1, G2) =
  G1 iff ∃ a specialization relation between G1 and G2,
  G2 iff ∃ a specialization relation between G2 and G1,
  0 otherwise.
- most-generalized (G1, G2) =
  G2 iff most-specialized (G1, G2) = G1,
  G1 iff most-specialized (G1, G2) = G2,
  0 otherwise.
- most-instantiated (G1, G2) =
  G1 iff ∃ an instantiation relation between G1 and G2,
  G2 iff ∃ an instantiation relation between G2 and G1,
  0 otherwise.
- most-conceptualized (G1, G2) =
  G2 iff most-instantiated (G1, G2) = G1,
  G1 iff most-instantiated (G1, G2) = G2,
  0 otherwise.

Remark: The previous Ψ-morphisms need not be injective. The interest of using surjective Ψ-morphisms or not is discussed in [12].
Fig. 4: Clustering of global relations among conceptual graphs.

Example. In the example of Fig. 3, the following relations hold: IsRelation&ConceptPartialGeneralization&PartialConceptualization (G1, G2) and IsRelation&ConceptPartialSpecialization&PartialInstantiation (G2, G1). Therefore, most-generalized (G1, G2) = G1, most-specialized (G1, G2) = G2, most-instantiated (G1, G2) = G1, most-conceptualized (G1, G2) = G2.
7 Construction of the Base of Integrated Conceptual Graphs

7.1 Strategies of Integration
Unless the relation IncompatibleGraphs (G1, G2) was set up between both graphs, their integration is possible. The building of the integrated graph Gcom from both graphs G1 and G2 is guided by a strategy for solving conflicts. With each strategy, two functions, fglobal and flocal, are associated. If there exists a global relation between G1 and G2, the result of the function fglobal associated with the current strategy will be included in Gcom. If no such global relation exists, the local relations among comparable elementary links are exploited: in case of a choice between two such comparable links l1 ∈ Λ1 and l2 ∈ Λ2, the result of the function flocal associated with the current strategy will be included in Gcom. If the function flocal gives no result, both links l1 and l2 are included in Gcom. The connection of the partial graphs thus obtained is performed with strategy-dependent join operations: we implemented another version of the maximal join operator [12]. Once the integrated graph has been created for each viewpoint, the common knowledge model is obtained. According to the integration strategy chosen by the KE, the graphs of this common knowledge model may or may not represent the complete knowledge of both initial graph bases. MULTIKAT supplies the KE with 9 predefined strategies among which she can make a choice. These strategies allow the integration to be carried out according to various criteria. The «direct» strategies are better when the KE prefers to always restrict to what was explicitly expressed by at least one expert and takes no initiative for modifying the knowledge expressed by an expert. The «indirect» strategies are useful when the KE prefers to exploit the knowledge expressed by one expert in order to take the initiative of modifying (e.g. generalizing, specializing, conceptualizing or instantiating) the other expert's knowledge.
Strategy of the highest direct generalization. Preconditions: an expert focuses on particular cases, while the other expert expresses general knowledge, valid in more general cases. fglobal (G1, G2) = most-generalized (G1, G2); flocal (l1, l2) = most-generalized (l1, l2).

Strategy of the highest indirect generalization. Preconditions: the characteristics of the experts are the same as in the previous case. fglobal (G1, G2) = most-generalized (G1, G2); flocal (l1, l2) = common-generalization (l1, l2).

Strategy of the highest direct specialization. Preconditions: an expert is more specialized than the other on a given aspect and uses more precise expressions. fglobal (G1, G2) = most-specialized (G1, G2); flocal (l1, l2) = most-specialized (l1, l2).

Strategy of the highest indirect specialization. Preconditions: the preconditions on the experts are the same as in the previous case. fglobal (G1, G2) = most-specialized (G1, G2); flocal (l1, l2) = common-specialization (l1, l2).

Strategy of the highest direct conceptualization. Preconditions: an expert focuses on too particular cases and on too specific examples, while the other expert expresses general knowledge, at a better level of abstraction. fglobal (G1, G2) = most-conceptualized (G1, G2); flocal (l1, l2) = most-conceptualized (l1, l2). If the function gives no result, both links l1 and l2 are included in Gcom.

Strategy of the highest indirect conceptualization. Preconditions: the characteristics of the experts are the same as in the previous case. fglobal (G1, G2) = most-conceptualized (G1, G2); flocal (l1, l2) = common-conceptualization (l1, l2).
Strategy of the highest direct instantiation. Preconditions: an expert gives useful and precise examples. fglobal (G1, G2) = most-instantiated (G1, G2); flocal (l1, l2) = most-instantiated (l1, l2).

Strategy of the greatest competence. Preconditions: an expert is known as having a higher level of competence in a given field. flocal (l1, l2) = specialist (l1, l2).

Strategy of experts' consensus. Preconditions: (1) both experts have the same level of competence in the considered field and the KE has no criterion for choosing one rather than the other. (2) Or, for

References

- … Represented through Conceptual Graphs. To appear in Prade ed., Proc. of ECAI'98, John Wiley & Sons, Ltd, Brighton, UK (1998) 341-345.
- Easterbrook, S. Handling conflict between domain descriptions with computer-supported negotiation. Knowledge Acquisition, 3(3):255-289 (1991).
- Gaines, B., Shaw, M. Comparing the Conceptual Systems of Experts. In Proc. of IJCAI'89, Detroit, USA (1989) 633-638.
- Garcia, C. Co-Operative Building of an Ontology within Multi-Expertise Framework. In Proc. of the 2nd Int. Conf. on the Design of Cooperative Systems (COOP'96), Juan-les-Pins (1996) 435-454.
- Garner, B. J., Lukose, D. Knowledge Fusion. In H. D. Pfeiffer et al. eds, Conceptual Structures: Theory and Implementation, Springer-Verlag, LNAI 754, Las Cruces, NM, USA (1992) 158-167.
- Guinaldo, O. Conceptual Graph Isomorphism: Algorithm and Use. In Eklund et al. eds, Conceptual Structures: Knowledge Representation as Interlingua, Springer-Verlag, LNAI 1115, Sydney, Australia (1996) 160-174.
- Haemmerlé, O. CoGITo : une plate-forme de développement de logiciels sur les graphes conceptuels. Ph.D. thesis, Montpellier II University, France (1995).
- Hug, S. Vergleich von Begriffsgraphen für die Modellierung von Wissen mehrerer Experten. Master Thesis, Univ. of Karlsruhe, Germany (1997).
- Mineau, G. W., Allouche, M. Establishing a Semantic Basis: Toward the Integration of Vocabularies. In Gaines et al. eds, Proc. of KAW'95, pp. 2-1 - 2-16, Banff, Canada (1995).
- Murray, K., Porter, B. Developing a Tool for Knowledge Integration: Initial Results. Int. Journal of Man-Machine Studies, 33:373-383 (1990).
- Poole, J., Campbell, J. A. A Novel Algorithm for Matching Conceptual and Related Graphs. In G. Ellis et al. eds, Conceptual Structures: Applications, Implementation and Theory, Springer-Verlag, LNAI 954, Santa Cruz, CA, USA (1995) 293-307.
- Ribière, M., Dieng, R. Introduction of Viewpoints in Conceptual Graph Formalism. In Lukose et al. eds, Fulfilling Peirce's Dream, ICCS'97, Springer-Verlag, LNAI 1257, Seattle, USA (1997) 168-182.
- Shaw, M. L. G., Gaines, B. R. A methodology for recognizing conflict, correspondence, consensus and contrast in a knowledge acquisition system. Knowledge Acquisition, 1(4):341-363 (1989).
- Sowa, J. F. Conceptual Structures: Information Processing in Mind and Machine. Reading, Addison-Wesley (1984).
- Tennison, J., Shadbolt, N. APECKS: a Tool to Support Living Ontologies. In Gaines et al. eds, Proc. of the 11th Banff Workshop on Knowledge Acquisition, Modeling and Management (KAW'98), Banff, Canada (1998). Also in http://ksi.cpsc.ucalgary.ca/KAW/KAW98/tennisolaJ
- Willems, M. Projection and Unification for Conceptual Graphs. In Ellis et al. eds, Conceptual Structures: Applications, Implementation and Theory, Springer-Verlag, LNAI 954, Santa Cruz, CA, USA (1995) 278-292.
A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT

David Genest and Eric Salvat
LIRMM (CNRS and Université Montpellier II), 161 rue Ada, 34392 Montpellier Cedex 5, France.

Abstract. This paper presents CoGITaNT, a software development platform for applications based on conceptual graphs. CoGITaNT is a new version of the CoGITo platform, adding simple graph rules and typed nested graphs with coreference links.
1 Introduction
The goal of the CORALI project (Conceptual graphs at Lirmm) is to build a theoretical formal model, to search for algorithms for solving problems in this model, and to develop software tools implementing this theory. Our research group considers the CG model [16] as a declarative model where knowledge is solely represented by labeled graphs and reasoning can be done by labeled graph operations [3]. In a first stage, such a model, the "simple conceptual graph" (SCG) model, has been defined [4], [11]. This model has sound and complete semantics in first order logic. It has been extended in several ways, such as rules [12], [13] and positive nested graphs with coreference links [5]. As for the SCG model, a sound and complete semantics has been proposed for these extensions [6], [15]. The software platform CoGITo (Conceptual Graphs Integrated Tools) [9], [10] had been developed on the SCG model. This paper presents the updated version of this platform, CoGITaNT (CoGITo allowing Nested Typed graphs), which is based on these extensions: graph rules and nested conceptual graphs with coreference links.

2 CoGITaNT
CoGITaNT is a software platform: it enables an application developer to manage graphs and to apply the operations of the model. For portability and maintenance reasons, the object oriented programming paradigm was chosen and CoGITaNT was developed as a set of C++ classes. Each element of the theoretical model is represented by a class (conceptual graph, concept vertex, concept type, support, ...). Hence, the use of object programming techniques allows graphs to be represented in a "natural" way (close to the model): for example, a graph is a set of concept vertices plus a set of relation vertices having an ordered set of
neighbors. The methods associated with each class correspond to the usual handling functions (graph creation, type deletion, edge addition, ...) and specific operations of the model (projection, join, fusion, ...). CoGITaNT is compiled using the free C++ compiler GNU C++ on Unix systems. CoGITaNT is available for free on request (for further information, please send an e-mail to cogito@lirmm.fr). CoGITaNT is an extension of CoGITo: each functionality of the previous version is present in the new one. Hence, applications based upon CoGITo should be able to use CoGITaNT without important source file modifications. CoGITo has been presented in [10], [2]; we only describe here some distinctive characteristics. The platform manages SCGs, and implements algorithms that have been developed by the CORALI group, such as a backtracking projection algorithm, a polynomial projection algorithm for the case when the graph to be projected is a tree, and a maximal join (isojoin) algorithm. The new version extends the operations available on simple graphs by implementing graph rules. CoGITaNT also introduces a new set of classes that allows handling of typed nested graphs with coreference links and some other new features: graphs are no longer necessarily connected, and the set of concept types and the set of relation types need not be ordered by a "kind of" relation in a lattice but may be partially ordered in a poset.

3 Graph Rules
Conceptual graph rules have been presented in [13] and [12]. These rules are of the form "IF G1 THEN G2" (noted G1 ⇒ G2), where G1 and G2 are simple conceptual graphs with co-reference links between concepts of G1 and G2. Such vertices are called the connection points of the rule. Conceptual graph rules are implemented as a new CoGITaNT class. An instance of this class consists of two SCGs, the hypothesis and the conclusion, and a list of couples of concept vertices (c1, c2), where c1 is a vertex of the hypothesis and c2 is a vertex of the conclusion. This list represents the list of connection points of the rule.
Fig. 1. An example of a graph rule.

Figure 1 may be interpreted as the following sentence: "if a person X is the brother of a person Y, and Y is the father of a person Z, then X is the uncle of Z, and there exists a person which is the father of X and Y". In this figure, x, y, z
are used to represent coreference links. Vertices whose referents are *x, *y and *z are the connection points of this rule. CG rules can be used in forward chaining and backward chaining, both mechanisms being defined with graph operations. Furthermore, these two mechanisms are sound and complete with respect to deduction on the corresponding subset of FOL formulae. Forward chaining is used to explicitly enrich facts with knowledge which is implicitly present in the knowledge base by way of rules. When a fact fulfills the hypothesis of a rule (i.e. when there is a projection from the hypothesis to the fact), the conclusion of the rule can be "added" to the fact (i.e. joined on connection points). This basic operation of forward chaining is implemented in CoGITaNT. This method allows a rule R to be applied on a CG G, following a projection π from the hypothesis of R to G. The resulting graph is R[G, π], obtained by joining the conclusion of R on G. The join is made on the connection points of the conclusion and the vertices of G which are images of the corresponding connection points in the hypothesis. This method is the basic method of forward chaining, since it only applies a given rule on a given graph following a given projection. But, since CoGITaNT allows all projections between two graphs to be computed, it is easy to compute all applications of a rule on a graph. Backward chaining is used to prove a request (a goal) on a knowledge base without applying rules on the facts of the knowledge base. We search for a unification between the conclusion of a rule and the request. If such a unification exists, a new request is built by deleting the unified part of the previous request and joining the hypothesis of the rule to the new request. We defined the backward chaining method in two steps: first, we compute all unifications between the conclusion of the rule and the request graph; then, given a rule R, a request graph Q and a unification u(Q, R), another procedure builds the new request. As in forward chaining, the implemented methods are the basic methods for a backward chaining mechanism. The user can manage the resolution procedure, for example by implementing heuristics that choose, at each step, the unifications to use first.
4 Typed Nested Graphs with Coreference Links

4.1 Untyped Nested Graphs and Typed Nested Graphs
In some applications, the SCG model does not allow a satisfactory representation of the knowledge involved. An application for the validation of stereotyped behavior models in human organizations [1] uses CGs to represent behaviors: such CGs represent situations, pre-conditions and post-conditions of actions, each of these elements being described with CGs. Another application uses CGs for document retrieval [7]: a document is described with a CG, and such a CG represents the author, the title, and the subject (which is itself described by a CG). In these applications, knowledge can be structured by hierarchical levels, but this structure cannot be represented easily in the SCG model. A satisfactory way of representing this structure is to put CGs within CGs.
An extension of the model, called the (untyped) "nested conceptual graph" (NCG) model [5], allows the representation of this hierarchical structure by adding to each concept vertex a "partial internal description" which may be either a set of NCGs or generic (i.e. an empty set, denoted **).
Fig. 2. An untyped nested conceptual graph.

The (untyped) nested graph of figure 2 may represent the knowledge "A person is watching television. This television has a power-on button and its screen displays an uninteresting TV show where a presenter talks about someone". In figure 2, the same notion of nesting is used to represent that the button and the screen are "components" of the TV, the screen "represents" a TV show, and the scene where the presenter talks about someone is a "description" of the show. Hence, untyped NCGs fail to distinguish these various nesting semantics. However, in some applications, specifying the nesting semantics is useful. Typed NCGs may represent semantics more specific than "the nested graph is the partial internal description" (see figure 3). The typed NCG model is not precisely described here; please refer to [5] for a more complete description. The typed NCG extension adds to the support a new partially ordered type set, called the nesting type set, which is disjoint from the other type sets and has a greatest element called "Description". A typed NCG G can be denoted G = (R, C, E, l) where R, C and E are respectively the relation, concept and edge sets of G, and l is the labeling function of R and C such that ∀ r ∈ R, l(r) = type(r) is an element of the relation type set, and ∀ c ∈ C, l(c) = (type(c), ref(c), desc(c)), where type(c) is an element of the concept type set, ref(c) is an individual marker or *, and desc(c) (called the "partial internal description") is either **, called the generic description, or a non-empty set of couples (ti, Gi), where ti is a nesting type and Gi is a typed NCG.

4.2 Typed Nested Graphs in CoGITaNT
One of the main new characteristics of CoGITaNT is that it allows handling typed NCGs. Data structures used by CoGITaNT for representing such graphs are a natural implementation of this structure.
Fig. 3. A typed nested conceptual graph and a coreference link (dashed line).

As described in figure 4, a typed NCG is composed of a list of connected components and a list of coreference classes (see further details). A connected component is constituted of a list of relation vertices and a list of concept vertices. A concept vertex is composed of a concept type, a referent (individual marker or *) and a list of nestings (** is represented by a NULL pointer). A nesting is constituted of a nesting type and a typed NCG. As for simple graphs, the available methods implement the usual handling functions and operations of the extended model. The projection structure and the projection operation defined on SCGs have been adapted to the recursive structure of NCGs, and the projection operation follows the constraints induced by the types of nestings [5]. A "projection between typed NCGs" from H to G is represented by a list of "projections between connected components"¹. A "projection between connected components" from ccH to ccG is represented by a list of pairs (r1, r2) of relations² and a list of structures (c1, c2, n)³ where c1 is a concept vertex of ccH, c2 is a concept vertex of ccG and n is an empty list (if c1 has a generic description) or a list of structures (n1, n2, g) where n1 is a nesting of c1, n2 is a nesting of c2 and g is a "projection between typed NCGs" from the graph nested in n1 to the graph nested in n2. Thus, the representation of a projection between typed NCGs has a recursive structure which is comparable with the structure of typed NCGs. The projection algorithm between NCGs first computes every projection between level-0 graphs without considering nestings. The obtained projections are then filtered: let Π be one of these projections; if a concept vertex c of H has a nesting (t, G) such that Π(c) does not contain a nesting (t', G') such

¹ For each connected component in H there is one "projection between connected components" in the list.
² For each relation vertex r in ccH there is one pair (r, Π(r)) in the list, where Π(r) is the image of r.
³ For each concept vertex c in ccH there is one structure (c, Π(c), n) in the list, where Π(c) is the image of c.
Fig. 4. Data structures: typed NCG.

that t ≥ t' and there is a projection from G to G', then Π is not an acceptable projection. Acceptable projections are then completed (third part of the (c1, c2, n) structures: projections between nestings). SCGs are also typed NCGs (without any nesting) and untyped NCGs are also typed NCGs (the type of each nesting is "Description"); hence CoGITaNT can also handle SCGs and untyped NCGs. Even if the data structures and operations are not optimal when handling such graphs, there is no sensible loss of performance compared to the SCG operations of CoGITo. Useless comparisons of (generic) descriptions for SCGs handled as typed NCGs (e.g. comparison of NULL pointers) and useless comparisons of nesting types (equal to "Description") for untyped NCGs handled as typed NCGs do not influence the overall performance of the system.

4.3 Coreference Links
CoGITaNT also allows handling coreference links. A set of non-trivial coreference classes is associated with each graph. These classes are represented by lists of (pointers to) concept vertices. Trivial coreference classes (such as one-element classes and classes constituted by the set of concept vertices having the same type and the same individual marker) are not represented, for memory saving reasons. The coreference link in figure 3 is represented by a (non-trivial) 2-element coreference class that contains the two "person" concept vertices. Of course, the projection method makes use of coreference classes: the images of all vertices of a given coreference class must belong to the same (trivial or non-trivial) coreference class. The projection algorithm from H to G currently first computes every projection from H to G without coreference constraints.
Let Π be one of these projections. In order to return only those conforming to coreference constraints, projections are then filtered: for every non-trivial coreference class coH of H, ∀ c1 ∈ coH, ∀ c2 ∈ coH, if Π(c1) and Π(c2) are not in a same (trivial or non-trivial) coreference class, then Π is not an acceptable projection.
5 The BCGCT Format
A simple extension of the BCGCT format [9] is the CoGITaNT native file format: it is a textual format allowing supports, rules and graphs to be saved in permanent memory. Files in this format can easily be written and understood by a user; BCGCT is indeed a structured file format which represents every element of the model in a natural way. For example, the representation of a graph is constituted of three parts: the first describes concept vertices, the second describes relation vertices, and the third represents edges between these vertices. The representation of a concept vertex is a 3-tuple constituted of a concept type, an individual marker or an optional "*" (generic concept), and a set of couples (ti, Gi), where ti is a nesting type and Gi is a graph identifier, or an optional "**" (generic description). Ex: c1= television : * : (Component, gtvdescr) ; ("gtvdescr" is a graph identifier). Coreference links are represented using the same variable symbol for each concept vertex belonging to the same coreference class. Ex: c3= person : $x1 : ** ; and c9= person : $x1 ; represent that these concept vertices belong to the same coreference class.
6 Conclusion
The platform has been provided to about ten research centers and firms. In particular, two collaborations of our research group, one with Dassault Electronique and the other with the ABES (Agency of French university libraries), have led to evolutions of the model and the platform. The research center of the firm Dassault Electronique uses the platform for building software that provides assistance for the acquisition and validation of stereotyped behavior models in human organizations [1]. This application led our research group to the definition of the typed NCG model and its implementation in CoGITaNT. The other project of our group is on document retrieval: a first approach [8] convinced the ABES to continue our collaboration, to study the efficiency of a document retrieval system based on CGs and to develop a prototype of such a system. These collaborations make us consider many improvement perspectives for CoGITaNT. The efficiency of some operations can be improved by algorithmic studies. In particular, the optimization of the unification procedure for graph rules will improve the efficiency of the backward chaining mechanism. Moreover, the processing of coreference links during projection computation can be improved by an algorithm that considers as soon as possible the restrictions induced by these links.
We plan to extend the expressiveness of CoGITaNT with nested graph rules. A first theoretical study has been done in [14], [12], but the nested graphs considered in this study are without coreference links. The extension of this work with coreference links will be implemented in the platform. In the long term, negation (or the limited types of negation required by the applications we are involved in) will be introduced in CoGITaNT as required by the users of the platform.
References

1. Corinne Bos, Bernard Botella, and Philippe Vanheeghe. Modelling and simulating human behaviours with conceptual graphs. In Proceedings of ICCS'97, volume 1257 of LNAI, pages 275-289. Springer, 1997.
2. Boris Carbonneill, Michel Chein, Olivier Cogis, Olivier Guinaldo, Ollivier Haemmerlé, Marie-Laure Mugnier, and Eric Salvat. The COnceptual gRAphs at LIrmm project. In Proceedings of the first CGTools Workshop, pages 5-8, 1996.
3. Michel Chein. The CORALI project: From conceptual graphs to conceptual graphs via labelled graphs. In Proceedings of ICCS'97, volume 1257 of LNAI, pages 65-79. Springer, 1997.
4. Michel Chein and Marie-Laure Mugnier. Conceptual graphs: Fundamental notions. RIA, 6(4):365-406, 1992.
5. Michel Chein and Marie-Laure Mugnier. Positive nested conceptual graphs. In Proceedings of ICCS'97, volume 1257 of LNAI, pages 95-109. Springer, 1997.
6. Michel Chein, Marie-Laure Mugnier, and Geneviève Simonet. Nested graphs: A graph-based knowledge representation model with FOL semantics. To be published in proceedings of KR'98, 1998.
7. David Genest. Une utilisation des graphes conceptuels pour la recherche documentaire. Mémoire de DEA, Université Montpellier II, 1996.
8. David Genest and Michel Chein. An experiment in document retrieval using conceptual graphs. In Proceedings of ICCS'97, volume 1257 of LNAI, pages 489-504. Springer, 1997.
9. Ollivier Haemmerlé. CoGITo : une plate-forme de développement de logiciels sur les graphes conceptuels. PhD thesis, Université Montpellier II, France, 1995.
10. Ollivier Haemmerlé. Implementation of multi-agent systems using conceptual graphs for knowledge and message representation: the CoGITo platform. In Supplementary Proceedings of ICCS'95, pages 13-24, 1995.
11. Marie-Laure Mugnier and Michel Chein. Représenter des connaissances et raisonner avec des graphes. RIA, 10(1):7-56, 1996.
12. Eric Salvat. Raisonner avec des opérations de graphes : graphes conceptuels et règles d'inférence. PhD thesis, Université Montpellier II, France, 1997.
13. Eric Salvat and Marie-Laure Mugnier. Sound and complete forward and backward chaining of graph rules. In Proceedings of ICCS'96, volume 1115 of LNAI, pages 248-262. Springer, 1996.
14. Eric Salvat and Geneviève Simonet. Règles d'inférence pour les graphes conceptuels emboîtés. RR LIRMM 97013, 1997.
15. Geneviève Simonet. Une sémantique logique pour les graphes emboîtés. RR LIRMM 96047, 1996.
16. John F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Towards Correspondences Between Conceptual Graphs and Description Logics

P. Coupey† and C. Faron‡
†LIPN-CNRS UPRESA 7030, Université Paris 13, Av. J.B. Clément, 93430 Villetaneuse, France
[email protected]
‡LIFO, Université d'Orléans, rue Léonard de Vinci, BP 6759, 45067 Orléans cedex 2, France
[email protected], [email protected]

Abstract. We present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, we consider the Simple Conceptual Graphs model provided with type definitions (which we call TSCG) and the ALEOI standard Description Logic. We prove an equivalence between a subset of TSCG and a subset of ALEOI. Based on this equivalence, we suggest extensions of both formalisms while preserving the equivalence. In particular, with respect to standard Description Logics, where a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing type definitions from the "conjunction" of any two types and consequently partial type definitions. Symmetrically, with respect to generalization/specialization operations in Conceptual Graphs, we conclude by suggesting how Description Logics could take advantage of these correspondences to improve the explanation of subsumption computation.
1 Introduction
Conceptual Graphs (CGs) [21] and Description Logics (DLs) [3] are both knowledge representation formalisms descended from semantic networks. They are dedicated to the representation of assertional knowledge (i.e. facts) and terminological knowledge: hierarchies of concept types and relation types in CGs, hierarchies of concepts and roles in DLs. Subsumption is central in both formalisms: between graphs or between types in CGs (generalization), between concepts in DLs. Similarities between these formalisms have often been pointed out [1, 11] but up to now, to our knowledge, no formal study has ever been carried out about correspondences between CGs and DLs. Beyond an interesting theoretical result, such work would offer the CGs and DLs communities the mutual benefit of about 15 years of research. More precisely, numerous formal results in DLs about semantics and subsumption complexity could easily be adapted to CGs. Symmetrically, the specialization/generalization and graph projection operations defined in CGs would help in explaining the computation of subsumption in DLs and thus contribute to current research in this community [14].
In this paper, we present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, focusing on the terminological level, we consider the standard Description Logic ALEOI [18] and the Simple Conceptual Graphs model [7, 8, 5] provided with type definitions [6, 12, 13], which we call TSCG. We outline fundamental differences between both formalisms and set up restrictions on them, thus defining two subsets: G and L. Then we show, regarding their formal semantics, that G and L are two notational variants of the same formalism and that the subsumption definitions in G and L are equivalent. Based on this result, we make both formalisms vary while preserving the equivalence. In particular, regarding standard DLs, where a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing concept type definitions from the "conjunction" of any two types and consequently partial type definitions. Symmetrically, regarding generalization/specialization operations in CGs, we suggest how DLs could take advantage of their correspondences with CGs to improve the explanation of subsumption computation. First we present ALEOI and TSCG in sections 2 and 3. Then we prove in section 4 the equivalence between the sub-formalisms G of TSCG and L of ALEOI. In section 5 we examine extensions of G and L that preserve the equivalence.
2 Description Logics
Description Logics are a family of knowledge representation formalisms descended from the KL-ONE language [3]. They mostly formalize the idea of concept definition and reasoning about these definitions. A DL includes a terminological language and an assertional language. The assertional language is dedicated to the statement of facts and assertional rules, the terminological language to the construction of concepts and roles (binary relations). A concept definition states necessary and sufficient conditions for membership in the extension of that concept. Concepts in DLs are organized in a taxonomy. The two fundamental reasoning tasks are thus subsumption computation between concepts, and classification. The classifier automatically inserts a new defined concept in the taxonomy, linking it to its most specific subsumers and to the most general concepts it subsumes. Among standard Description Logics we have selected ALEOI [18] since it is the largest subset of the best-known DL, CLASSIC [2], and the necessary description language for which a correspondence with the Simple Conceptual Graphs model holds true. ALEOI is inductively defined from a set Pc of primitive concepts, a set Pr of primitive roles, a set I of individuals, the constant concept ⊤, and two abstract syntax rules; one for concepts (P is a primitive concept, R a role and the ai elements
of I):

C, D →  ⊤            most general concept
      | P            primitive concept
      | ∀R.C         universal restriction on role
      | ∃R.C         existential restriction on role
      | C ⊓ D        concept conjunction
      | {a1 ... an}  concept in extension

and one for roles (Q is a primitive role):

R →  Q     primitive role
   | Q⁻¹   inverse primitive role
Figure 1 presents examples of ALEOI formulae. The first one describes the females who are researchers; the second one, the males who have at least one child; the third, the females all of whose children are graduates; the fourth, the boys whose mother has a sister who is a researcher; the last one, the males whose (only) friends are Pamela and Claudia, who are female.
Female ⊓ Researcher
Male ⊓ ∃child.⊤
Female ⊓ ∀child.Graduate
Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
Male ⊓ ∀Friend.({Pamela, Claudia} ⊓ Female)
Fig. 1. Examples of ALEOI formulae
A concept may be fully defined¹ (i.e. necessary and sufficient conditions) from a term C of ALEOI; this is noted A ≡ C. It may also be partially defined (i.e. necessary but not sufficient conditions) from a term C of ALEOI; this is noted A ⊑ C. A terminological knowledge base (T-KB) is thus a set of concepts and their definitions (partial or full). Note that all partial definitions can be converted into full definitions by using new primitive concepts (cf. [17]). Let an atomic concept A be partially defined w.r.t. a term C: A ⊑ C. This can be converted into a full definition by adding to Pc a new primitive concept A′: A ≡ A′ ⊓ C. A′ implicitly describes the remainder of the necessary additional conditions for a C to be an A. From a theoretical point of view, this means that one can always consider that a T-KB has no partial definitions. Figure 2 presents the definition of the concept RNephew.
RNephew ≡ Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
Fig. 2. Definition of concept RNephew
¹ We assume that concept definitions are non-recursive.
The formal meaning of concept descriptions built according to the above rules is classically given as an extensional semantics by an interpretation I = (D, ‖·‖ᴵ). The domain D is an arbitrary non-empty set of individuals and ‖·‖ᴵ is an interpretation function mapping each concept onto a subset of D, each role onto a subset of D × D, and each individual ai onto an element of D (if ai and aj are different individual names then ‖ai‖ᴵ ≠ ‖aj‖ᴵ). The denotation of a concept description is given by:
‖⊤‖ᴵ = D
‖C ⊓ D‖ᴵ = ‖C‖ᴵ ∩ ‖D‖ᴵ
‖∀R.C‖ᴵ = {a ∈ D | ∀b, if (a, b) ∈ ‖R‖ᴵ then b ∈ ‖C‖ᴵ}
‖∃R.C‖ᴵ = {a ∈ D | ∃b, (a, b) ∈ ‖R‖ᴵ and b ∈ ‖C‖ᴵ}
‖{a1 ... an}‖ᴵ = ∪ᵢ₌₁ⁿ {‖ai‖ᴵ}
‖Q⁻¹‖ᴵ = {(a, b) | (b, a) ∈ ‖Q‖ᴵ}
An interpretation I is a model for a concept C if ‖C‖ᴵ is non-empty. Based on this semantics, C is subsumed by D, noted C ⊑ D, iff ‖C‖ᴵ ⊆ ‖D‖ᴵ for every interpretation I. C is equivalent to D iff (C ⊑ D) and (D ⊑ C). The reader will find complete theoretical and practical developments in [17].
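As an aside, the denotation rules above can be executed directly on a finite interpretation. The following sketch is our own encoding, not from the paper: concept descriptions are nested tuples, and the names (`ext`, `role_ext`, the sample individuals and roles) are assumptions made for illustration. Checking ‖C‖ᴵ ⊆ ‖D‖ᴵ on one finite interpretation gives a necessary condition for subsumption only, since subsumption quantifies over all interpretations.

```python
# Hedged sketch: extensional semantics of ALEOI over one finite interpretation
# I = (D, ||.||^I).  Concepts are nested tuples, e.g. ("and", C, D) for C ⊓ D.

def role_ext(role, I):
    """||R||^I as a set of pairs; roles are primitive or inverse primitive."""
    if role[0] == "prim":
        return frozenset(I["roles"][role[1]])
    if role[0] == "inv":                       # Q^-1
        return frozenset((b, a) for (a, b) in I["roles"][role[1]])
    raise ValueError(role)

def ext(c, I):
    """||C||^I, following the denotation rules above."""
    D = I["D"]
    op = c[0]
    if op == "top":                            # ⊤
        return frozenset(D)
    if op == "prim":                           # primitive concept P
        return frozenset(I["concepts"][c[1]])
    if op == "and":                            # C ⊓ D
        return ext(c[1], I) & ext(c[2], I)
    if op == "all":                            # ∀R.C
        R, C = role_ext(c[1], I), ext(c[2], I)
        return frozenset(a for a in D if all(b in C for (x, b) in R if x == a))
    if op == "some":                           # ∃R.C
        R, C = role_ext(c[1], I), ext(c[2], I)
        return frozenset(a for a in D if any(b in C for (x, b) in R if x == a))
    if op == "oneof":                          # {a1 ... an}
        return frozenset(c[1])
    raise ValueError(op)

def subsumed_in(C, D_concept, I):
    """C ⊑ D restricted to this interpretation (necessary condition only)."""
    return ext(C, I) <= ext(D_concept, I)
```

For instance, evaluating the RNephew description of Fig. 2 on a three-element interpretation returns exactly the boys with a female parent who has a researcher sister.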
3 Conceptual Graphs
Conceptual Graphs were first introduced by J. Sowa in [21]. In this paper, we consider the Simple Conceptual Graphs model SCG defined in [7, 8, 5] and extended with type definitions in [12, 13]. We call it TSCG. It is a formal system appropriate for comparisons with Description Logics. We focus on the terminological level of TSCG, i.e. the canon. A canon mainly consists of two type hierarchies: the concept type hierarchy Tc and the relation type hierarchy Tr, each one provided with a most general type, respectively ⊤c and ⊤r. A canon also contains a canonical base of conceptual graphs, called star graphs, expressing constraints on the maximal types of the concept vertices adjacent to relation vertices [16]. Conceptual graphs are built according to the canon, i.e. they are made of concept and relation nodes whose types belong to Tc and Tr and respect the constraints expressed in the canonical base. Both type hierarchies are composed of atomic and defined types. A concept type definition² is a monadic abstraction, i.e. a conceptual graph whose single generic concept is considered as a formal parameter. It is noted tc(x) ⇔ D(x). The formal parameter concept node of D(x) is called the head of tc, its type the genus of tc, and D(x) the differentia from tc to its genus [13]. A relation type definition is an n-ary abstraction, i.e. a conceptual graph with n generic concepts considered as formal parameters. It is noted tr(x1, ..., xn) ⇔ D(x1, ..., xn). Figure 3 presents the definition of the concept type RNephew and its logical interpretation.
² We assume that type definitions are non-recursive.
∀x (RNephew(x) ⇔ (∃y ∃z Boy(x) ∧ child(y, x) ∧ Female(y) ∧ sister(y, z) ∧ Researcher(z)))
Fig. 3. Definition of concept type RNephew
Tc and Tr are provided with the order relations ≤ ... A lambda-abstraction λx1...xn G, n ≥ 0, is composed of a graph G and n special generic vertices of G.
Definition 1 (Conceptual graph rule [13]). A conceptual graph rule R : G1 ⇒ G2 is a couple of lambda-abstractions (λx1...xn G1, λx1...xn G2). x1, ..., xn are called connection points. In the following, we will denote by xi¹ (resp. xi²) the vertex xi of G1 (resp. G2). For each i ∈ 1..n, xi¹ and xi² are coreferent.
Fig. 1. A CG rule.
The rule in figure 1 informally means the following: "If an employee z is in an office y managed by a manager x, then z has got a desk which is inside the office y, and x employs z".
2.2 Logical interpretation of a rule
Φ associates to every lambda-abstraction λx1...xn G a first-order formula in which all the variables are existentially quantified, except the variables x1...xn, which are free. Let R : G1 ⇒ G2 be a CG rule; if → is the logical implication connector, then Ψ(R) = (Φ(λx1...xn G1)) → (Φ(λx1...xn G2)). To every couple of vertices xi¹ and xi² is associated the same variable. Φ(R) is the universal closure of Ψ(R): Φ(R) = ∀x1...∀xn Ψ(R). For example, consider the rule R of figure 1:
Φ(R) = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → (∃u∃v Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))).
A knowledge base is composed of a support S, a set of CGs (facts), and a set of rules.
2.3 Piece resolution procedure
Consider a goal Q to be proven. The piece resolution procedure allows us to determine, given a knowledge base KB, whether Φ(KB) ⊨ Φ(Q). The basic operation of piece resolution is piece unification which, given a goal Q and a rule, builds, if possible, a new goal Q′ to be proven such that if Q′ is proven on KB, then Q is proven on KB.
Definition 2 (Piece and cut points [13]). Let R : G1 ⇒ G2 be a rule. A cut point of G2 is either a connection point (i.e. a generic concept shared with G1) or a concept with an individual marker (which may be common with G1 or not). A cut point of G1 is either a connection point of G1 or a concept with an individual marker which is common with G2. The pieces of G2 are obtained as follows: remove from G2 all cut points; one obtains a set of connected components; some of them are not CGs since some edges have lost an extremity. Complete each incomplete edge with a concept with the same label as the former cut point. Each connected component is a piece. Equivalently, two vertices of G2 belong to the same piece if and only if there exists a path from one vertex to the other that does not go through a cut point. In figure 1, G2 has two pieces. The first one includes all vertices from y to z and the second one includes all vertices from z to x. Indeed, x, y and z are cut points, and G2 is split at the z vertex. Instead of splitting the goal into subgoals to be treated separately [7, 8], as Prolog does [5] for first-order logic, piece resolution processes as much of the graph as possible as a whole. For example, the request Q of figure 2 can unify with the rule of figure 1. Indeed, we can unify the subgraph of Q containing the vertices from Manager
Fig. 2. A request.
Fig. 3. A new request built by unification of the request and the rule.
to Employee with the piece of G2 from the concept vertex with the marker x to the concept vertex with the marker z. We obtain a new request Q′. A piece resolution is a sequence of piece unifications. It ends successfully if the last produced goal is the empty graph. It requires the definition of a tree exploration strategy (breadth-first, or depth-first search as in Prolog, for example). For more details on piece resolution, we refer the reader to [13, 12]. The piece resolution procedure is clearly a top-down (or backward chaining) procedure. Indeed, rules are applied to a conclusion (the goal is actually a rule with an empty hypothesis), thus generating new goals to be proven. This is done until we obtain the empty graph. Thus the procedure uses the goal to guide the process.
Theorem 1 (Soundness and completeness of piece resolution [13]). Let KB be a knowledge base and Q be a CG request. Then:
- If a piece resolution of Q on KB ends successfully, then Φ(KB) ⊨ Φ(Q).
- If Φ(KB) ⊨ Φ(Q), then there is a piece resolution of Q on KB that ends successfully.
3 Piece resolution within first-order logic
Resolution in backward chaining for conceptual graph rules was defined in section 2. This mechanism is sound and complete relative to deduction in first-order logic. The underlying idea is to determine subgraphs, as large as possible, that can be processed as a whole. As already mentioned, conceptual graph rules can be expressed as first-order sentences. Therefore it is legitimate to express graph resolution using first-order formulae. This leads us to a new kind of resolution called piece resolution. A unification between a goal graph and a graph rule produces a new request which is also a conceptual graph. This new goal can be expressed as a first-order logic sentence, and can also contain existential quantifiers. Now, classical proof procedures for first-order logic generally use clausal form [10] or specific forms, obtained in every case by taking out
existential quantifiers. This is called Skolemisation [3]. This process modifies the knowledge base, so further inferences do not use the original one. On the contrary, graph resolution uses the original base and allows existential quantifiers to take part in the proof mechanism (in the associated logical interpretation of rules) by using the graph structure. The inference is thus more "natural". Moreover, we will see that graph resolution provides ways to improve effectiveness, theoretically as well as practically. We will now clarify the idea of piece in first-order logic.
3.1 Pieces
Definition 3 (Logical rule). A logical rule is a formula of the form F = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1, universally closed. There are no functional symbols. We will omit universal quantifiers for the sake of readability. The hypothesis of F, ∧{Bi | i ∈ 1..n}, is denoted by hyp(F), and the conclusion of F, ∃x1...∃xs ∧{Ai | i ∈ 1..p}, is denoted by conc(F). Each Ai, i ∈ 1..p, and each Bi, i ∈ 1..n, is an atom, and the xi, i ∈ 1..s, are variables appearing only in the Ai, i ∈ 1..p. All other variables in the Ai also appear in the hypothesis and are universally quantified. If n = 0, F is also called a logical fact. When we want to know whether a logical fact is logically implied by a set of logical rules, we talk about a logical goal.
Example 1 (Logical rule). Consider the following logical rule: R = ∃u(t4(u) ∧ r4(x, u)) ← t1(x) ∧ t1(y) ∧ r1(x, y). We have hyp(R) = t1(x) ∧ t1(y) ∧ r1(x, y) and conc(R) = ∃u(t4(u) ∧ r4(x, u)).
Remark 1. If s = 0, then F is equivalent to a set of Horn clauses {Ai ← B1 ∧ ... ∧ Bn | i ∈ 1..p}.
Definition 4 (Piece). Let C = A1 ∧ ... ∧ Ap be a conjunction of atoms and V = {x1, ..., xs} be a set of variables. Pieces of C in relation to V are defined in the following way: for all atoms A and A′ members of {A1, ..., Ap}, A and A′ belong to the same piece if and only if there is a sequence of atoms (P1, ..., Pm) of {A1, ..., Ap} such that P1 = A and Pm = A′ and ∀i = 1, ..., m−1, Pi and Pi+1 share at least one variable of V. By construction, the set of pieces is a partition of C.
Definition 5 (Logical rule pieces). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule. Pieces of R are pieces of conc(R) = A1 ∧ ... ∧ Ap in relation to {x1, ..., xs}.
Example 2. Let R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → (∃u∃v Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))) be a logical rule. R has five pieces, which are:
- {Desk(u), loc(u, y), poss(z, u)}, which contains the atoms that share the existentially quantified variable u,
- {Employ(v), obj(v, z), agt(x, v)}, which contains the atoms that share the existentially quantified variable v,
- {Employee(z)}, {Manager(x)} and {Office(y)}, which contain atoms in which no existentially quantified variable appears, thus giving one atom in each piece.
Definition 6 (Piece splitting). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule, and t be the number of pieces of R. R can be split, giving several logical rules R1, ..., Rt. Each Ri, i ∈ 1..t, has the same hypothesis as R, and piece number i as conclusion. Ri = ∃xi1...∃xisi(Ai1 ∧ ... ∧ Aipi) ← B1 ∧ ... ∧ Bn, i ∈ 1..t. pi is the number of atoms in piece number i, si is the number of existentially quantified variables concerning the piece, and the Aij, j ∈ 1..pi, are the atoms of piece number i. Each variable not among the xij is universally quantified. The construction of R1 ∧ ... ∧ Rt is called the piece splitting of R. We will denote in the same way the splitting operation and its result R1 ∧ ... ∧ Rt.
Remark 2. ∪i∈1..t(∪j∈1..pi Aij) = ∪k∈1..p Ak. Indeed, the union of the atoms in the conclusions of the new rules is exactly the set of atoms in the conclusion of the initial rule. An atom belongs to only one piece, thus it can appear in only one of the newly generated rules. Hence Σi∈1..t pi = p. No atom is created and none is deleted. The set of pieces is a partition of the set of atoms in the conclusion of the initial rule.
Proposition 1. R is logically equivalent to R1 ∧ ... ∧ Rt.
Proof (Sketch). In R, we can group the existential quantifiers together in front of the sets of atoms concerned. Each group of existential quantifiers together with the corresponding set of atoms is by definition a piece. We rewrite the obtained formula by eliminating implication. Then, by distributivity of ∨ over ∧, we obtain as many formulae as pieces, connected by conjunction, each of them having the same hypothesis, which is the initial logical rule hypothesis. The variables common to these formulae are all universally quantified. As ∀x(F ∧ G) and (∀xF ∧ ∀xG) are equivalent, we can decompose R into R1 ∧ ... ∧ Rt.
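Definitions 4 to 6 are easy to make concrete. The sketch below uses our own encoding, not the paper's: atoms are tuples ("pred", "arg1", ...), a rule is a triple (hypothesis, conclusion, existential variables), and the function names are assumptions. Pieces are computed with a union-find over atoms sharing an existential variable, and piece splitting produces one rule per piece with the common hypothesis.

```python
# Hedged sketch of Definitions 4-6: pieces of a conjunction of atoms and
# piece splitting of a logical rule.  Atoms are tuples ("pred", "arg1", ...).

def pieces(atoms, ex_vars):
    """Partition atoms: two atoms share a piece iff they are linked by a chain
    of atoms sharing a variable of ex_vars (Definition 4)."""
    atoms = list(atoms)
    parent = list(range(len(atoms)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path compression
            i = parent[i]
        return i
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if set(atoms[i][1:]) & set(atoms[j][1:]) & ex_vars:
                parent[find(i)] = find(j)      # merge the two pieces
    groups = {}
    for i, a in enumerate(atoms):
        groups.setdefault(find(i), set()).add(a)
    return [frozenset(g) for g in groups.values()]

def split(rule):
    """Piece splitting (Definition 6): one rule per piece, same hypothesis,
    keeping only the existential variables occurring in that piece."""
    hyp, conc, ex_vars = rule
    return [(hyp, piece, ex_vars & {t for a in piece for t in a[1:]})
            for piece in pieces(conc, ex_vars)]
```

On the conclusion of Example 2, with existential variables {u, v}, this yields exactly the five pieces listed above.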
Definition 7 (Trivial logical rules). Let R be a logical rule. R is trivial if and only if every interpretation of R satisfies R (R is valid).
Example 3 (Trivial logical rules). The following logical rules are trivial:
t(x) ← t(x) ∧ r(x, y)
∃u(t(u)) ← t(y)
∃v(r(x, y, v)) ← t(z) ∧ r(x, y, z)
Piece splitting can generate trivial, and therefore useless, logical rules. It is the case, among others, for rules in which each atom of the conclusion also appears in the hypothesis. Let R1, ..., Ri, ..., Rt be the result of the piece splitting of R. We showed that R and R1, ..., Ri, ..., Rt are logically equivalent. Let Ri be a trivial logical rule. R and R1, ..., Ri−1, Ri+1, ..., Rt are logically equivalent. Hence Ri can be deleted.
3.2 Basic operations
Definition 8 (Piece unification). Let Q be a logical goal and R be a logical rule. There is a unification between Q and R, say σu, if and only if:
1. there are two substitutions σ1 and σ2 defined respectively on a subset of the variables of Q and on a subset of the universally quantified variables of R;
2. σu = σ1 ∪ σ2. Pieces of σu(Q) are defined as pieces of conc(σu(Q)) in relation to the set of existentially quantified variables of σu(R). There must exist at least one piece of σu(Q) appearing entirely in the conclusion of σu(R).
Once a piece unification σu has been found, a new request Q′ is built from Q. Q′ becomes the new logical goal.
Definition 9 (Construction of a new logical goal). Let Q be a logical goal, R be a logical rule and σu be a piece unifier between Q and R; we obtain the new goal Q′ in the following way:
1. Delete from σu(Q) the pieces that appear in σu(R) and add the atoms of hyp(σu(R)).
2. Update the existential quantifiers of σu(Q); more specifically:
(a) Delete the existential quantifiers in σu(Q) that corresponded to deleted existentially quantified variables in σu(R).
(b) Add existential quantifiers corresponding to the variables appearing in the atoms of hyp(σu(R)) added to σu(Q) (corresponding to universally quantified variables of σu(R)).
These steps may need variable renaming, to avoid variable capture. Indeed, an atom of hyp(σu(R)) added to σu(Q) must not contain a variable already quantified in σu(Q), because it would be "captured" by a wrong quantifier. Therefore, we must rename the variables common to Q and R.
Definition 10 (Logical piece resolution). A logical piece resolution of a logical goal Q is a sequence of piece unifications. It ends successfully if the last produced request is empty. In this case, the last used rule has an empty hypothesis (i.e. a logical fact).
Example 4 (Piece unification). Let R be the following logical rule:
R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → (∃u∃v Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))).
Let Q be the following request (note that Q and R must not have variables in common):
Q = ∃i∃j∃k Manager(tom) ∧ Employ(i) ∧ Employee(j) ∧ Car(k) ∧ agt(tom, i) ∧ obj(i, j) ∧ poss(j, k).
A unifier is σu = {(x, tom), (i, v), (j, z)}. After applying σu, we construct the new goal Q′:
Q′ = ∃w∃z∃y∃k(Manager(tom) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(tom, w) ∧ obj(w, y) ∧ loc(z, y) ∧ Car(k) ∧ poss(z, k))
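Example 4 can be replayed mechanically. The sketch below is our own encoding and helper naming (the paper defines the operation on graphs, not on Python sets): we apply σu, delete the pieces of σu(Q) that appear entirely in the conclusion of σu(R), and add the hypothesis atoms, which reproduces the atoms of Q′. Variable renaming (Definition 9) is omitted here because Q and R share no variables.

```python
# Hedged replay of Definitions 8-9 on Example 4.  Atoms are tuples
# ("pred", args...); the piece computation follows Definition 4.

def subst(atoms, sigma):
    """Apply a substitution to every argument of every atom."""
    return {(a[0],) + tuple(sigma.get(t, t) for t in a[1:]) for a in atoms}

def pieces(atoms, ex_vars):
    """Grow each piece to a fixpoint over shared existential variables."""
    remaining, out = set(atoms), []
    while remaining:
        piece = {remaining.pop()}
        grown = True
        while grown:
            linked = {a for a in remaining
                      if any(set(a[1:]) & set(b[1:]) & ex_vars for b in piece)}
            grown = bool(linked)
            piece |= linked
            remaining -= linked
        out.append(frozenset(piece))
    return out

def new_goal(Q, sigma, R_hyp, R_conc, R_ex):
    """Definition 9, step 1: delete the pieces of sigma(Q) appearing entirely
    in the conclusion of sigma(R), then add the atoms of hyp(sigma(R))."""
    q, conc = subst(Q, sigma), subst(R_conc, sigma)
    kept = {a for p in pieces(q, R_ex) if not p <= conc for a in p}
    return kept | subst(R_hyp, sigma)
```

Run on the data of Example 4, `new_goal` returns exactly the nine atoms of Q′.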
Remark 3. It is possible to simulate a piece resolution in backward chaining with conceptual graphs by a piece resolution with logical rules, and vice versa. Thus these problems are equivalent. The first reduction (graphs → logic) is trivial. For the second reduction (logic → graphs), we need to construct a support and to add concept vertices with the universal type for terms that do not appear in a predicate of arity one. In the support, each concept type is covered by the universal type, thus two concept types are incomparable. The relation type set is partitioned into subsets of relation types of the same arity, where each relation type is covered by the greatest element, thus two relation types are incomparable.
3.3 Soundness and completeness of logical piece resolution
Lemma 1. Let Q be a logical goal and R be a logical rule. If Q′ is a new goal built by a piece unification between Q and R, then Q′, R ⊨ Q.
Theorem 2 (Soundness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If a logical piece resolution of Q on Γ ends successfully, then Γ ⊨ Q.
Proof (Sketch). The way we build Q′ allows us to prove Q′, R ⊨ Q. To prove the theorem, we then proceed by induction on the number of piece unifications.
Theorem 3 (Completeness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If Γ ⊨ Q then there is a logical piece resolution of Q that ends successfully.
Proof (Sketch). We first prove by refutation the existence of a piece unification between the logical goal Q and a logical rule R, giving a logical goal Q′ (assuming that Q′, R ⊨ Q, but not Q′ ⊨ Q). Secondly, we prove by induction that the implication of a logical goal by a set of logical rules can be "decomposed" into a sequence of logical implications, each involving the previous goal and only one rule to give the next goal. Then we proceed by induction on the number of piece unifications.
3.4 From fact goals to rule goals
So far, goals were only rules without hypothesis. But it is useful to consider rules as goals. Indeed, it would allow us to decide whether a rule is logically implied by a set of rules, for example to know whether the rule is worth being added to the set. We could also want to compute the minimal cover of a set (that is, the minimal set that allows us to generate the initial set), or the transitive closure of a set (see section 5).
Definition 11. Let S be a set of logical rules, and R be a logical rule. The operation ω, defined on R with respect to S and noted ωS(R), replaces every universally quantified variable of R by a new and unique constant (appearing neither in S nor in R).
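A minimal sketch of the operation ω, in our own encoding (a rule as a triple of hypothesis atoms, conclusion atoms, and existential variables; the fresh-constant naming scheme `c0`, `c1`, ... is an assumption made for illustration):

```python
# Hedged sketch of Definition 11: omega_S(R) replaces every universally
# quantified variable of R by a fresh constant appearing neither in S nor in R.

def omega(rule, forbidden):
    """rule = (hyp, conc, ex_vars); forbidden = constants occurring in S."""
    hyp, conc, ex_vars = rule
    terms = {t for a in hyp | conc for t in a[1:]}
    universal = terms - ex_vars - forbidden    # non-existential variables
    fresh, i = {}, 0
    for v in sorted(universal):                # deterministic fresh names
        while f"c{i}" in forbidden or f"c{i}" in terms:
            i += 1                             # skip names already in use
        fresh[v] = f"c{i}"
        i += 1
    rename = lambda a: (a[0],) + tuple(fresh.get(t, t) for t in a[1:])
    return ({rename(a) for a in hyp}, {rename(a) for a in conc}, ex_vars)
```

On the rule of Example 1, x and y become fresh constants while the existentially quantified u is left untouched, which is exactly what Theorem 4 below needs.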
Theorem 4 ([12]). Let Γ be a set of logical rules, and R be a logical rule. R is of the form R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1. Let R′ = ωΓ(R). Then Γ ⊨ R if and only if Γ, hyp(R′) ⊨ conc(R′).
Proof (Sketch). ⇒ Let I be a model of Γ and hyp(R′). We show that I is also a model of conc(R′). ⇐ We prove that Γ ∧ ¬R is inconsistent. To do that, we show that the set of ground instances of Γ ∧ hyp(R′) ∧ ¬conc(R′), which is inconsistent, is included in the set of ground instances of Γ ∧ ¬R. The compactness theorem allows us to conclude.
4 Comparison and statistical analysis
In order to estimate how much piece resolution can reduce the resolution tree, we built a random generator of logical rules. We present in this section the results of this first series of tests, comparing the number of unitary backtracks, for the same rule base, in the case of piece resolution and of SLD-resolution [9, 11] (used in Prolog)¹. The rule base is translated into Horn clauses in order to fit with Prolog, whose effect is to multiply the base size. Therefore, the piece resolution algorithm deals with the original base, and Prolog deals with the translated base. To prevent the procedures from looping, there is no cycle in the dependency graph of the logical rules. We made 3011 tests, varying each parameter (base size, maximal number of atoms in hypothesis and conclusion, ...). A quick analysis of the results shows that the number of backtracks in piece resolution is always lower than 310,000, whereas it is lower than 5,000,000 for Prolog. Table 4 gives the detailed distribution for each method. It shows that piece resolution reduces the number of backtracks. Indeed, the number of cases with more than 10,000 backtracks is 73 (2.42%) for piece resolution, whereas it is 279 (9.19%) for Prolog.
x = number of backtracks     Prolog             Piece resolution
x ≤ 100                      2047   67.98%      2504   83.16%
101 ≤ x ≤ 1000                407   13.52%       294    9.76%
1001 ≤ x ≤ 10000              280    9.30%       140    4.65%
10001 ≤ x ≤ 100000            153    5.08%        66    2.19%
100001 ≤ x ≤ 1000000           82    2.72%         7    0.23%
1000001 ≤ x                    42    1.39%         0    0%
Fig. 4. Repartition of the tests, according to the number of backtracks.
Table 5 gives, for the same base, the repartition of the cases in which piece resolution reduces the number of backtracks (number of backtracks
¹ We used SWI-Prolog (Copyright (C) 1990 Jan Wielemaker, University of Amsterdam).
for Prolog − number of backtracks for piece resolution). The percentage is given relative to the number of cases that show an improvement for piece resolution.
x = improvement     number of cases     percentage
x ≤ 100             696                 47.54%
101 ≤ x ...
... ≥i z. In particular, the boundary elements already determine the ordered structures (Lk, ≤k) and hence z^(k,i) ∈ b(L). Moreover z^(k,i) ∼k z with z_k = z^(k,i)_k, showing that L_k = b(L)_k := {b_k | b ∈ b(L)}.
Definition 4.2: A set c := {c1, c2, c3, c4, c5, c6} ⊆ L of a bounded trilattice L is called a cycle if c1 ∼3 c2 ∼2 c3 ∼1 c4 ∼3 c5 ∼2 c6 ∼1 c1, c1 ∼2 c4, c2 ∼1 c5, and c3 ∼3 c6 (cf. Fig. 4). A cycle of (triadic) complements is a cycle c in which for each x ∈ c there are elements y, z ∈ c such that for all i ∈ {1, 2, 3} the join condition x_i ∨ y_i = y_i ∨ z_i = x_i ∨ z_i = 1_i and the meet condition x_i ∧ y_i ∧ z_i = 0_i are satisfied.
Fig. 4. A cycle with six elements.
Lemma 4.3: A cycle c of a trilattice L has one, three or six elements. A cycle of complements has one element if and only if |L| = 1.
Proof: Two elements ck, cl ∈ c of a cycle c := {c1, c2, c3, c4, c5, c6} with k ≠ l can either be i-equivalent for some i ∈ {1, 2, 3} or not. If ck = cl it follows that |c| ≤ 3 in the first and |c| = 1 in the second case (apply the uniqueness condition). There are no cycles with two elements. If a cycle of complements has one element then |L| = 1 because of the meet and the join conditions.
Definition 4.4: A bounded trilattice L := (L, ... ) ... ≥k x and, since a ≥j x, also a ...
• κ: V ∪ E → C ∪ R is a mapping such that κ(V) ⊆ C and κ(E) ⊆ R, and all e ∈ E with ν(e) = (v1, ..., vk) satisfy κ(e) ∈ Rk,
• ρ: V → P(G)\{∅} is a mapping.
For an edge e ∈ E with ν(e) = (v1, ..., vk), we define |e| := k, and we write ν(e)|i := vi and ρ(e) := ρ(v1) × ... × ρ(vk). Apart from some little differences, the concept graphs correspond to the simple conceptual graphs defined in [8] or [1]. We only use multi-hypergraphs instead of bipartite graphs in the mathematization. The application ν assigns to every edge the tuple of all its incident vertices. The function κ labels the vertices and edges by concept and relation names, respectively, and the mapping ρ describes the references of every vertex. In contrast to Sowa, we allow references with more than one object name (i.e. individual marker) but no generic markers, i.e. existential quantifiers, yet. They can be introduced into the syntax easily (cf. [6]), but in this paper we want to put emphasis on the elementary language. That is why we can omit coreference links here, which are only relevant in connection with generic markers.
3 Semantics for Concept Graphs
We agree with J. F. Sowa when he writes about the importance of a semantics: "To make meaningful statements, the logic must have a theory of reference that
determines how the constants and variables are associated with things in the universe of discourse." [9, p. 27] Usually, the semantics for conceptual graphs is given by the translation of conceptual graphs into first-order logic (cf. [8] or [1]). For some notions and proofs, a set-theoretic, extensional semantics was developed (cf. [4]), but it is rarely used. We define a semantics based on relational contexts. That means we interpret the syntactical elements (concept, object and relation names) by concepts, objects and relations of a relational context. We prefer this contextual semantics for several reasons. As the basic elements of concept graphs are concepts, we want a semantics in which concepts are considered in a formal, but manifold way. Therefore, it is convenient to use Formal Concept Analysis, which is a mathematization of the philosophical understanding of concepts as units of thought constituted by their extension and intension (cf. [11]). Furthermore, it is essential for Formal Concept Analysis that these two components of a concept are unified on the basis of a specified context. This contextual view is supported by Peirce's pragmatism, which claims that we can only analyze and argue within restricted contexts, where we always rely on preknowledge and common sense (cf. [12]). Experience has shown that formal contexts are a useful basis for knowledge representation and communication because, on the one hand, they are close enough to reality and, on the other hand, their formalizations allow an efficient formal treatment. As formal contexts do not formalize relations on the objects, the contexts must be enriched with relational structures. Therefore, R. Wille introduced power context families in [12], where relations are described as concepts with extensions consisting of tuples of objects. Using relational contexts in this paper, we have chosen a slightly simpler formalism. Nevertheless, this formalism can be transformed into power context families and vice versa.
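To make the Formal Concept Analysis background concrete, the following sketch computes the two derivation operators and all formal concepts of a small formal context (G, M, I). The encoding (I as a set of object-attribute pairs) and the function names are our own illustrative assumptions, not notation from the paper; it relies on the standard fact that every intent is an intersection of object intents.

```python
# Hedged FCA sketch: derivation operators and all formal concepts of a
# formal context (G, M, I), with I given as a set of (object, attribute) pairs.

def intent(objs, I, M):
    """Attributes shared by all the given objects."""
    return frozenset(m for m in M if all((g, m) in I for g in objs))

def extent(attrs, I, G):
    """Objects having all the given attributes."""
    return frozenset(g for g in G if all((g, m) in I for m in attrs))

def concepts(G, M, I):
    """All formal concepts (extent, intent), by incremental intersection:
    every intent is an intersection of object intents (plus M itself)."""
    intents = {frozenset(M)}
    for g in G:
        gi = intent([g], I, M)
        intents |= {gi & b for b in intents} | {gi}
    return {(extent(b, I, G), b) for b in intents}
```

On a toy context with three objects and three attributes this enumerates the whole concept lattice; ordering the concepts by inclusion of extents would give the lattice itself.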
This is explained in detail in [5] and will not be discussed in this paper. Let us start with the formal definition of a relational context (originally introduced in [7]).

Definition 3. A formal context (G, M, I) is a triple where G and M are finite sets whose elements are called objects and attributes, respectively, and I is a binary relation between G and M which is called an incidence relation. A formal context, together with a set R := ⋃_k R_k of sets of k-ary relations on G, is called a relational context and denoted by K := ((G, R), M, I). The concept lattice B(G, M, I) := (B(G, M, I), ≤)

depth(G). The level in G of a node c of G is the level in A(G) of the node K of A(G) containing c (i.e. the number of edges of the path in A(G) from R(G) to K). An NG^ref G is k-normal iff:
1. every node of A(G) is a normal SG^ref,
2. for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G) such that c is a complex node: if the level of c′ in G is less than k then c′ is a complex node, and if c′ is a complex node then R(Desc(c′)) is exactly equivalent to R(Desc(c)).

Note that a normal NG^ref is k-normal for any natural integer k.

Theorem 2. Let G and H be two NG^refs and k ≥ depth(G).
If there is a projection from G to H then Φ(S), Φ(H) ⊨ Φ(G). Conversely, if H is k-normal and Φ(S), Φ(H) ⊨ Φ(G), then there is a projection from G to H.

Proof: We use the same notations as in the proof of Theorem 1. A Φ-substitution from an NG^ref G to an NG^ref H is defined in a similar way as a Ψ-substitution from an SG^ref G to an SG^ref H. Lemma 1 holds for NG^refs instead of SG^refs and the semantics Φ instead of Ψ. Lemma 2 holds for SG^refs and for the semantics Φ instead of Ψ, provided that H is a normal SG^ref (the condition σ(ē_c) = ρ(ē_{Π(c)}) disappears, as the term ē_c does not exist in the semantics Ψ).
Let us suppose that there is a projection from G to H. Then there is an A-projection Π = (Π₀, (Π_K)_{K node of A(G)}) from A(G) to A(H). From Lemma 2, for any node K of A(G), there is a Φ-substitution σ_K from K to Π₀(K) such that for any concept node c of K, σ_K(ē_c) = ρ(ē_{Π_K(c)}). If a variable x is assigned to two co-referent nodes c and d appearing in different nodes K and K′ of A(G), then σ_K(x) = σ_K(ē_c) = ρ(ē_{Π_K(c)}) and σ_K′(x) = σ_K′(ē_d) = ρ(ē_{Π_K′(d)}). As Π preserves co-identity, Π_K(c) and Π_K′(d) are co-identical, so σ_K(x) = σ_K′(x). Therefore there is a substitution σ of the variables of Φ(G) by U_H constants such that for any node K of A(G), the restriction of σ to the variables of Φ(K) is σ_K. The atoms of Φ(G) are obtained from those of the formulas Φ(K), K node of A(G), by adding the context argument e_K. To show that σ is a Φ-substitution from G to H, it remains to show that for any node K of A(G), σ(e_K) = ρ(e_{Π₀(K)}). If K = R(G) then Π₀(K) = R(H) and σ(e_K) = ρ(e_{Π₀(K)}) = a₀; otherwise, let (J, c)K be the edge of A(G) into K, so e_K = ē_c. From the definition of an A-projection, (Π₀(J), Π_J(c))Π₀(K) is an edge of A(H), hence e_{Π₀(K)} = ē_{Π_J(c)}. It follows that σ(e_K) = σ(ē_c) = ρ(ē_{Π_J(c)}) = ρ(e_{Π₀(K)}). Then σ is a Φ-substitution from G to H and we conclude with Lemma 1. Conversely, let us suppose that H is k-normal with k ≥ depth(G) and Φ(S), Φ(H) ⊨ Φ(G). From Lemma 1, there is a Φ-substitution σ from G to H. We construct an A-projection Π from A(G) to A(H) such that for any node K of A(G) and any concept node c of K, σ(ē_c) = ρ(ē_{Π_K(c)}). We define Π₀(K) and Π_K for any node K of A(G) and prove the preceding property by induction on the level l of K in A(G). For l = 0, K is R(G). The atoms of Φ(G) (resp. Φ(H)) associated with the nodes of R(G) (resp. R(H)) are those with a₀ as first argument. Then the restriction of σ to the variables of Φ(R(G)) is a Φ-substitution from R(G) to R(H).
From Lemma 2, there is a projection Π₀ from R(G) to R(H) such that for any concept node c of R(G), σ(ē_c) = ρ(ē_{Π₀(c)}). We define Π₀(R(G)) = R(H) and Π_{R(G)} = Π₀. Suppose Π is defined up to level l. Let (J, c)K be an edge of A(G) with K at level l + 1. Let s be a node of K and t(e₁, ..., e_n) be the atom of Φ(G) associated with s. Let t′(e′₁, ..., e′_n) be an atom of Φ(H) such that t′ ≤ t and for any i in {1, ..., n}, σ(e_i) = ρ(e′_i). Let K′ be a node of A(H) and s′ a concept node of K′ such that the atom t′(e′₁, ..., e′_n) is associated with s′. We have e₁ = e_K = ē_c and e′₁ = e_{K′}, then σ(ē_c) = ρ(e_{K′}), then e_{K′} ≠ a₀, so there is an edge (J′, c′)K′ into K′ and e_{K′} = ē_{c′}. By induction hypothesis, σ(ē_c) = ρ(ē_{Π_J(c)}). We have
ρ(ē_{c′}) = ρ(e_{K′}) = σ(ē_c) = ρ(ē_{Π_J(c)}), then c′ ≡ Π_J(c), i.e. c′ and Π_J(c) are co-identical nodes. H is k-normal, c′ and Π_J(c) are co-identical, c′ is a complex node and level_H(Π_J(c)) = level_G(c) < depth(G).

E.g. the 2-normal form of the graph H₂ of Figure 6 is H₂ and its 3-normal form is G₂. In Figure 7, for any k ≥ 2, G and K are their own k-normal forms and the k-normal form of H is K. Corollary 1 of Theorem 2 shows that reasoning on NG^refs may be performed using graph operations without any restriction on the NG^refs.

Corollary 1. Let G and H be two NG^refs and H_k be the k-normal form of
H, with k ≥ max(depth(G), depth(H)). Φ(S), Φ(H) ⊨ Φ(G) iff there is a projection from G to H_k. Note that, from a practical point of view, it is sufficient to construct H_k up to level depth(G), since projection preserves levels. The semantics Φ is unable to express not only that an entity is represented by several concept nodes in a node of A(G), but also that several concept nodes representing the same entity have distinct descriptions. It is pertinent in applications in which the meaning of a graph is not changed when merging co-identical concept nodes of an SG^ref or replacing the description of co-identical concept nodes by the union of their descriptions (in particular in applications where graphs are naturally normal). But in some applications, each concept node has a specific situation in its SG^ref and a specific description; merging these nodes or mixing their descriptions would destroy some information. For instance, let c and c′ be two concept nodes appearing in an NG^ref G and representing a given lake. c appears in the context of a biological study (i.e. is related in an SG^ref to nodes, and appears in the description graphs of nodes, concerning a biological study) and contains biological information about the lake in its description graph: animals and plants living in the lake. c′ appears in the context of a touristic study and contains touristic information about the lake in its description graph: possibilities of bathing, sailing and walking at the lake. The formulas Φ(G), Φ(H) and Φ(K) are equivalent, where H is obtained from G by exchanging the description graphs of c and c′, and K is the k-normal form of G
and H (with k = max(depth(G), depth(H))), in which the description of c and c′ is the union of the biological and touristic descriptions. Such an equivalence is obviously undesirable. In those applications, the semantics to be used is the semantics Ψ.

3.3 The semantics Ψ
The semantics Ψ is extended from SG^refs to NG^refs in the same way as the semantics Φ, except that the context argument is the variable e_c assigned to the concept node c representing the context, instead of the term ē_c. Thus a description is specific to a concept node and not to a co-identity class. We add a context argument to each predicate (concept type predicates become 3-adic). For any NG^ref G and any node K of A(G), let n_c(K) be the number of concept nodes of K. Assign r(G) variables y₁, ..., y_{r(G)} to the r(G) co-ref_G classes and, for any node K of A(G), n_c(K) variables x₁^K, ..., x_{n_c(K)}^K to the concept nodes of K. All variables y_i and x_i^K are distinct. The variables of Ψ(K) are x₁^K, ..., x_{n_c(K)}^K and, if K contains generic nodes, some of the variables y₁, ..., y_{r(G)}. For any node K of A(G), Ψ′(e, K) is the conjunction of the atoms obtained from those of Ψ(K) by adding the first argument e. Ψ′(e, G′) is defined by induction on the depth of G′. For any node K of A(G) and any concept node c of K, e_c denotes the variable assigned to c in Ψ(K).
Ψ′(e, G′) = ∃x₁^{R(G′)} ... x_{n_c(R(G′))}^{R(G′)} (Ψ′(e, R(G′)) ∧ (⋀_{c ∈ D(G′)} Ψ′(e_c, Desc(c))))

The formula
Ψ(G) = ∃y₁ ... y_{r(G)} Ψ′(a₀, G)

is associated with any NG^ref G defined on the support S. E.g. the formula associated with the graph G₂ of Figure 6 is
Ψ(G₂) = ∃x₁(t(a₀, a, x₁) ∧ ∃x₂(t(x₁, a, x₂) ∧ ∃x₃ t(x₂, a, x₃))). Ψ(G) may be defined from A(G) as the existential closure of the conjunction of the formulas Ψ′(e_K, K), K node of A(G), where e_K is a₀ if K is R(G) and otherwise, (J, c)K being the edge of A(G) into K, e_K is e_c. E.g. the formula associated with the graph G₂ of Figure 6 may be written as Ψ(G₂) =
∃x₁x₂x₃(t(a₀, a, x₁) ∧ t(x₁, a, x₂) ∧ t(x₂, a, x₃)). Projection is sound and complete with respect to Ψ.
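As a concrete illustration of how each nesting level contributes its enclosing node's variable as a context argument, the flattened formula for the chain graph G₂ of Figure 6 can be generated mechanically. The sketch below is a hypothetical encoding (the chain shape, predicate name t and marker a₀ come from the example; the function and its representation are invented for illustration):

```python
# Hedged sketch (hypothetical encoding): building the flattened Psi formula
# for a chain of nested concept nodes, as in Psi(G2) of Figure 6. Each
# level contributes one atom t(e, a, x) whose context argument e is the
# variable of the enclosing concept node (a0 at the root).

def psi_chain(depth, marker='a0'):
    """Chain of `depth` nested concept nodes, each carrying predicate t,
    individual marker a, and one nested description."""
    variables = ['x%d' % (i + 1) for i in range(depth)]
    atoms = []
    context = marker
    for x in variables:
        atoms.append('t(%s, a, %s)' % (context, x))
        context = x  # the node's own variable contextualises its description
    return 'exists %s (%s)' % (' '.join(variables), ' & '.join(atoms))

print(psi_chain(3))
# exists x1 x2 x3 (t(a0, a, x1) & t(x1, a, x2) & t(x2, a, x3))
```

The output matches the prenex form of Ψ(G₂) above, with the chained context arguments made explicit.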
Theorem 3. Let G and H be two NG^refs. Ψ(S), Ψ(H) ⊨ Ψ(G) iff there is a projection from G to H.
E.g. in Figure 6, for any i in {1, 2}, Ψ(S), Ψ(H_i) ⊭ Ψ(G_i) and there is no projection from G_i to H_i. Sketch of proof: A technical proof is given in [9], similar to that concerning Φ. A more intuitive one is given here, which would not be available for Φ. Let S₀ be the support obtained from S by adding the individual marker a₀, the universal concept type ⊤ (if it does not already exist) and a binary relation type t_context (context relation). For any NG^ref G on S, let G₀ be the NG^ref on S₀ reduced to a concept node labelled (⊤, a₀, G) and Simple(G) be
the SG^ref on S₀ obtained from the union of the nodes of A(G₀) by adding, for each edge (J, c)K of A(G₀), n_c(K) relation nodes of type t_context relating c to each concept node of K. It can be shown that (1) there is a projection from G to H iff there is a projection from Simple(G) to Simple(H), and (2) Ψ(S), Ψ(H) ⊨ Ψ(G) iff Ψ(S₀), Ψ(Simple(H)) ⊨ Ψ(Simple(G)) (any substitution of the variables of Ψ(G) leading to the empty clause by the resolution method is available for Ψ(Simple(G)), and conversely). We conclude with the soundness and completeness result on SG^refs. This proof would not be available for the semantics Φ, because the SG^ref Simple(H) obtained from a k-normal NG^ref H containing distinct co-identical nodes is not normal.
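The Simple(G) construction just described can be pictured operationally. The following is a hedged sketch under invented data structures (a context is a list of concept nodes; a complex node carries the id of its description graph); it only illustrates the idea of adding t_context relation nodes from a complex node to every concept node of its description:

```python
# Hedged sketch (hypothetical data structures): flattening a nested graph
# into a simple graph by adding t_context relation nodes from each complex
# concept node c to every concept node of its description, in the spirit
# of the Simple(G) construction.

def simple(context_id, graph, out_nodes, out_edges):
    """graph maps a context id to a list of (concept node, description id)
    pairs; the description id is None for simple (non-complex) nodes."""
    for c, desc in graph[context_id]:
        out_nodes.add(c)
        if desc is not None:  # c is a complex node: link it to its description
            for d, _ in graph[desc]:
                out_edges.append(('t_context', c, d))
            simple(desc, graph, out_nodes, out_edges)

# Root graph R has one complex node c1 whose description D contains c2, c3.
g = {'R': [('c1', 'D')], 'D': [('c2', None), ('c3', None)]}
nodes, edges = set(), []
simple('R', g, nodes, edges)
print(sorted(nodes), edges)
```

The recursion visits each description exactly once, so the number of added relation nodes matches the sum of the n_c(K) over the context edges.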
4 Conclusion
I have presented two FOL semantics for Simple and Nested Graphs. Projection is sound and complete with respect to both semantics. This result shows that reasoning on Conceptual Graphs may be performed using graph operations instead of logical provers (e.g. reasoning with Graph Rules [8,7]). Non-FOL formalisms, with "nested" formulas as in the logics of contexts, have been proposed for Nested Graphs. It remains to compare these formalisms, and the semantics that may be associated with them, to the FOL semantics presented here.
References

1. M. Chein and M.L. Mugnier. Conceptual Graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365-406, 1992. Hermès, Paris.
2. M. Chein, M.L. Mugnier, and G. Simonet. Nested graphs: A graph-based knowledge representation model with FOL semantics. In Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR'98), Trento, Italy, June 1998.
3. R.V. Guha. Contexts: a formalization and some applications. Technical Report ACT-CYC-42391, MCC, December 1991. PhD Thesis, Stanford University.
4. J. McCarthy. Notes on Formalizing Context. In Proc. IJCAI'93, pages 555-560, 1993.
5. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7-56, 1996. Hermès, Paris.
6. A. Preller, M.L. Mugnier, and M. Chein. Logic for Nested Graphs. Computational Intelligence Journal, (CI 95-02-558), 1996.
7. E. Salvat. Raisonner avec des opérations de graphes : graphes conceptuels et règles d'inférence. PhD thesis, Montpellier II University, France, December 1997.
8. E. Salvat and M.L. Mugnier. Sound and Complete Forward and Backward Chainings of Graph Rules. In ICCS'96, Lecture Notes in A.I., Springer Verlag, 1996.
9. G. Simonet. Une autre sémantique logique pour les graphes conceptuels simples ou emboîtés. Research Report 96-048, L.I.R.M.M., 1996.
10. G. Simonet. Une sémantique logique pour les graphes conceptuels emboîtés. Research Report 96-047, L.I.R.M.M., 1996.
11. J.F. Sowa. Conceptual Structures - Information Processing in Mind and Machine. Addison Wesley, 1984.
Peircean Graphs for the Modal Logic S5

Torben Braüner*
Centre for Philosophy and Science-Theory
Aalborg University
Langagervej 6
9220 Aalborg East, Denmark
torbenb@hum.auc.dk
Abstract. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic, but left unfinished similar systems for various modal logics. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. It is proved that our graph-based formulation of S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated, as the rules are graph-based analogues of Gentzen style rules as appropriate for S5. Compared to the system of Peircean graphs for S5 suggested in [17], our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic.
1 Introduction
It was a major event in the history of diagrammatic reasoning when Charles Sanders Peirce (1839-1914) developed graphical methods for reasoning within propositional and predicate logic [18]. This line of work was taken up again in 1984, when conceptual graphs, which are generalisations of Peircean graphs, were introduced in [15]. Since then, conceptual graphs have gained widespread use within Artificial Intelligence. The recent book [1] witnesses a general interest in logical reasoning with diagrams within the areas of logic, philosophy and linguistics. Furthermore, the book witnesses a practically motivated interest in diagrammatic reasoning related to the increasing use of visual displays within such diverse areas as hardware design, computer aided learning and multimedia. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic but left unfinished similar systems for various modal logics - see the account given in [12]. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. The importance of this logic is recognised within many areas, notably philosophy, mathematical logic, Artificial Intelligence and computer science. It is proved that our graph-based formulation of the modal logic S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated as the rules are graph-based analogues of Gentzen

* The author is supported by the Danish Natural Science Research Council.
style rules as appropriate for S5. Gentzen style is one way of formulating a logic which is characterised by particularly appealing proof-theoretic properties. It should be mentioned that a system of Peircean graphs for S5 was also suggested in [17]. However, our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic. In the second section of this paper we give an account of propositional and modal logic in Hilbert-Frege style. Graph-based systems for reasoning within propositional logic and the modal logic S5 are given in the third and the fourth sections, respectively. In section five it is proved that our graph-based formulation of S5 is equivalent to the Hilbert-Frege formulation. In section six we discuss possible further work.
2 Propositional and Modal Logic
In this section we shall give an account of propositional logic and the modal logic S5 along the lines of [14]. See also [8, 9]. We formulate the logics in the traditional Hilbert-Frege style. Formulae for propositional logic are defined by the grammar

s ::= p | s ∧ ... ∧ s | ¬(s)

where p is a propositional letter. Parentheses are left out when appropriate. Given formulae φ and ψ, we abbreviate ¬(φ ∧ ¬ψ) and ¬(¬φ ∧ ¬ψ) as respectively φ ⇒ ψ and φ ∨ ψ.

Definition 1. The axioms and proof-rules for propositional logic are as follows:

A1 ⊢ φ ⇒ (ψ ⇒ φ).
A2 ⊢ (φ ⇒ (ψ ⇒ θ)) ⇒ ((φ ⇒ ψ) ⇒ (φ ⇒ θ)).
A3 ⊢ (¬φ ⇒ ¬ψ) ⇒ (ψ ⇒ φ).
Modus Ponens If ⊢ φ and ⊢ φ ⇒ ψ then ⊢ ψ.
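To illustrate how derivations in this Hilbert-Frege system proceed, here is a hedged sketch (formulas as nested tuples is an invented encoding, not part of the paper) of the classic derivation of p ⇒ p using only instances of A1, A2 and Modus Ponens:

```python
# Hedged sketch: a Hilbert-style derivation of p => p from axiom schemas
# A1 and A2 plus Modus Ponens, as in Definition 1. Formulas are nested
# tuples; ('->', a, b) encodes a => b.

def imp(a, b):
    return ('->', a, b)

def A1(phi, psi):          # phi => (psi => phi)
    return imp(phi, imp(psi, phi))

def A2(phi, psi, theta):   # (phi=>(psi=>theta)) => ((phi=>psi)=>(phi=>theta))
    return imp(imp(phi, imp(psi, theta)),
               imp(imp(phi, psi), imp(phi, theta)))

def modus_ponens(p, p_implies_q):
    # From |- p and |- p => q, conclude |- q.
    op, antecedent, consequent = p_implies_q
    assert op == '->' and antecedent == p
    return consequent

p = 'p'
step1 = A2(p, imp(p, p), p)         # instance of A2
step2 = A1(p, imp(p, p))            # instance of A1
step3 = modus_ponens(step2, step1)  # (p => (p => p)) => (p => p)
step4 = A1(p, p)                    # p => (p => p)
step5 = modus_ponens(step4, step3)  # p => p
print(step5)
# ('->', 'p', 'p')
```

The assertion inside modus_ponens enforces that the rule is only applied when the antecedent really matches, so an invalid derivation step fails loudly.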
Given the definition of derivability for propositional logic, it is folklore that one can prove soundness and completeness in the sense that a formula is derivable using the axioms and rules if and only if it is valid with respect to the standard truth-functional semantics. Formulae for modal logic are defined by extending the grammar for propositional logic with the additional clause
s ::= ... | □(s)
The connective □ symbolises "it is necessary that". Given a formula φ, we abbreviate ¬□¬φ as ◊φ.

(ATTR3) ⇒ QUALITY: @X
II (lub{not high+ G, not low+q1} Within the angle brackets, the first element is a procedure name, the last element is a place holder for the returned value, and all other elements are an ordered list of input parameters. <START KEY:*k *r> The input parameters may optionally have a type label-like prefix, which behaves as a kind of type checking at instantiation. An attempt to bind a parameter of this kind with a concept which does not match will result in an error message. Actors may be connected to concepts using the dashed arrow convention. Semantically, the procedure name must appear in a catalogue of actors, and must be defined in a suitable execution language, such as C++, Java, Prolog or Lisp, running on hardware connected to the car's electrical system. The Φ operator can translate actors by a process akin to Skolemisation. A procedural call with parameters can be introduced into a first-order logic expression as a named function, provided that its name is unique in the expression, which is already necessary for actors. Skolemisation is used to eliminate existential quantifiers in first-order logic. An existentially quantified variable is replaced by a term which consists of a Skolem function applied to all variables which are universally quantified outside the existential quantifier in question. With the existential quantifier eliminated, the universal quantification can be ignored,
since any variable must be universally quantified. This means that actors could appear in first-order logic expressions without interfering with their quantifications. Within the predicate for the concept connected to the output of the actor, Φ places a second element: a function with the input variables as arguments. Thus the DIVI actor in Figure 1 would appear as ... ∧ MAX-DUR(DIVI(a, b)) ∧ ... If desired, the output variable *c could be equated to the function elsewhere in the expression. The above arrangement is simple to understand and allows the graph's declarative nodes to interact with the actors. Suppose it was learned that the engine of the car has been replaced with a different engine. When the graph is updated to reflect this, the new instance will have a new characteristic fuel consumption. Because of the actor connections, the maximum duration and range will automatically alter to reflect this change; yet it is difficult to imagine natural relations which could be used to link FUEL-CONS with those two concepts. From a goal-oriented perspective, if an engine was stationary and the goal was a non-zero speed, this could trigger a search for a key, which also could not be done naturally with a relation. Notice, however, that only the actor tokens can influence and be influenced by the declarative nodes: the elements of the procedure itself are beyond reach. For this reason, we may still wish for code which has explicit statement inside a graph, perhaps as chains of linked actors representing the subroutines of complex actions. In human beings, only our highest-level plans are available to our reflection; lower-level acts, those which have become automatised, as psychologists say, are not available.
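The Skolemisation step described above can be sketched concretely. The following is a minimal, hypothetical illustration (the formula encoding and helper names are invented; MAX-DUR and DIVI are borrowed from the example): an existentially quantified output variable is replaced by a Skolem function of the universally quantified inputs in whose scope it lies.

```python
# Hedged sketch (hypothetical encoding): replacing an existentially
# quantified actor output by a Skolem function of the enclosing
# universally quantified input variables.

def subst(f, v, t):
    """Replace every occurrence of variable v in formula f by term t."""
    if f == v:
        return t
    if isinstance(f, tuple):
        return tuple(subst(x, v, t) for x in f)
    return f

def skolemise(formula):
    """Eliminate ('exists', v, body) by substituting a Skolem term for v."""
    counter = [0]
    def walk(f, universals):
        if isinstance(f, tuple) and f and f[0] == 'forall':
            _, v, body = f
            return ('forall', v, walk(body, universals + [v]))
        if isinstance(f, tuple) and f and f[0] == 'exists':
            _, v, body = f
            counter[0] += 1
            # Skolem function of all enclosing universal variables
            term = ('sk%d' % counter[0],) + tuple(universals)
            return walk(subst(body, v, term), universals)
        if isinstance(f, tuple):
            return tuple(walk(x, universals) for x in f)
        return f
    return walk(formula, [])

# forall a, b . exists c . MAX-DUR(c)   (illustrative only)
f = ('forall', 'a', ('forall', 'b', ('exists', 'c', ('MAX-DUR', 'c'))))
print(skolemise(f))
```

Here the output variable c becomes the term sk1(a, b), exactly as DIVI(a, b) stands in for the actor's output above.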
3 The Semi-automatic Trap

Let us turn now to another risk facing our community: the semi-automatic trap. This is our tendency to keep building systems of logical symbols that depend at least partially on human involvement for their creation, manipulation and interpretation. That is a serious problem for builders of intelligent systems, and to a lesser degree for information-system builders and logicians. The semi-automatic trap is best understood in its historical perspective. Logic has a long history of being done by humans on paper. The symbols which are used in a typical logic have a long aetiology. Perhaps they include Greek letters, betraying their Platonic ancestry. Essentially, these symbols are passive, mnemonic devices, designed to signify objects, classes or operations by triggering the associated ideas inside human brains. They are created by human beings for human manipulation and human interpretation. For most of their history, and in most of their applications, this role for logical symbols as manual tools has been uncontroversial and unproblematic. The decades-long revolution which has placed a computer on every desk offered rich opportunities to practitioners of logic, builders of information systems and artificial intelligence researchers, among countless others. Yet the advent of computers can now be seen, with the benefit of hindsight, to have loaded the semi-automatic trap, ready to be sprung. In their ubiquitous incarnation as semi-automatic machines - that is, devices which accept symbolic input from users via a keyboard, process it, and display symbolic output for human eyes via a screen - desktop (and laptop and palmtop) computers have human dependence built in as a basic affordance. For the greater part, the software which has been developed on such machines also tends to depend on human intervention via this interface. This too seems natural enough.
But this built-in, natural dependence can work against us when we try to use those machines and logic symbols to build symbolic structures which can behave like human conceptual knowledge, for the following reasons. First, the culture of logical mnemonics carries certain unstated assumptions with it. For example, although a symbol could in principle have an extensional grain size of anything from a highly specific and nameless microfeature to a broad metaphysical category, the symbols used in practice tend to cluster around a narrow range of possible grains associated with convenient words used for their labels. The fact that concepts have to be named could push hand-built ontologies away from the microfeatural to a coarser resolution. Second, mnemonic symbols can easily be parasitic for their true meaning on the human viewer; that is, they could merely draw on associations, inference mechanisms and the like in the head of a developer, or user, instead of supplying them themselves. It is surprising how much this blinds us to the need for an interpretative component inside the system. Third, the tradition of logic still leads us to think of conceptual knowledge in terms of verifying propositions, and that truth-conditional semantics might be sufficient as a denotation for any conceptual structures. This again leads to the neglect of active procedural output as part of meaning, as discussed in Section 2. Let me make the problem clearer by taking it to extremes. Consider for a moment the fact that natural existence proofs of intelligence - humans and (more problematically) animals - have no screens and keyboards. If, as we wish to maintain, symbolic conceptual structures exist within their brains, then keyboards and screens cannot be a necessary condition for intelligence. But think of the disservice that would be done to most intelligent machines - and surely every CG program - if the screen and keyboard were removed.
This will be my basic test for a system caught in the semi-automatic trap - will it continue to function and be useful with the human-dependent I/O devices disconnected? The reader may well object that this test is uncharitable. After all, animals (including humans) and computers are really very different. Animals arrive with a genetic legacy, which might include innate conceptual structures and the machinery to support them. Computers are general-purpose machines to which any structure must be supplied - and this is ultimately a screen and keyboard job. To make a fair comparison, then, the machine does need a screen and keyboard. Very well - let the original test be modified as follows: the builder may stand in for Nature, supplying any programs and data that the system needs, but during the design phase only. Once the program is finished (born), the screen and keyboard must come off. The test still seems absurd because we have not yet exhausted the differences between animals and computers. Animals have sense organs and bodies with actuators, and these serve as I/O channels to the world. But - and this is important to a full understanding of the issue - those channels do not depend on the manipulation or scrutiny of an external human, as screens and keyboards do. Even in a human being, they do not: they serve a human brain, and could be influenced by or influence an external human in various ways, but they do not depend on it. Note that if the computer had other I/O devices with this property, such as cameras, microphones, sound generators and motors, they must remain connected during our test. We only wish to disable the machine in a specific way, so that it reveals its autonomy or lack thereof. Am I serious when I demand that CG-based devices be useful without their screen and keyboard? Not entirely.
To disqualify screens and keyboards on the grounds that they permit a kind of epistemological cheating is to oversimplify, because it ignores the fact that these devices can also be used to stand in for more complicated, and possibly less useful, "natural" I/O such as speech and hearing. We may be very satisfied, for example, with a natural language program that communicates using ASCII text,
provided we stay vigilant against other uses of the screen and keyboard which commit humans to too much of the wrong kind of involvement. This is the kind that does not relate to natural communication with the functioning program and would not be possible if the program were a person.1 Furthermore, it oversimplifies because some purposes of logicians and information-system builders may correctly demand no more than semi-automation. A successful logic visualiser might be essentially a tool for human work, like a calculator or spreadsheet. In these cases, it could be argued, the semi-automatic trap is really no trap at all. Even here, though, much of the motivation of tool developers lies in the potential they see in computerising conceptual structures to free human users of the need to carry out the time-consuming, repetitive or difficult parts of logic work - and this means automation. Of course, individual systems might pass the test, with their designers successfully avoiding the seductive powers of the screen and keyboard, and using their computers to try to write truly automatic programs that collect their own data, form their own goals, build their own representations and draw their own conclusions for action. The trouble is that we can seem to be building systems capable of human-like conceptual processing, when what we are really doing is only building systems that help people to do so. Disguising this mistake is the fact that, at least in information-retrieval systems, even a semi-automatic program could be quite useful. In fact, some information-system builders argue that a semi-automatic "intelligence amplification" (IA) approach is the best we can hope for at present, or even that this is better than AI altogether, since it preserves a human role in these affairs.
But builders of autonomous agents, intelligent reasoning systems and true (that is, non-teleoperated) robots must avoid the semi-automatic trap to build automatic machines, because these devices must function independently. These builders cannot get away with thinking that representations are all there is to knowledge. Again, we can visualise this in terms of the ability of such devices to operate without a keyboard and screen plugged in, even if they were needed to start the programs. A final barb that may hold us in the semi-automatic trap is that of power. I mean this in a more specific sense than notions of technological dominance or "knowledge is power". If responsibility and power are opposite sides of the same coin, it follows that human involvement in the knowledge process grants us power over it. In the artificial realm of symbols on a blackboard, the logician presides as god, letting this or that assumption be true, assigning variables, defining axioms and applying stepwise procedures. In the methodology of conceptual graphs, we proclaim the existence of types which divide aspects of reality into meaningfully distinct atoms, forming the basis of an ontology. The authority for this act is that of the knowledge specialist, or domain expert. Such persons naturally occupy positions of power in our technocratic society. And if the ontologies that are derived from their proclamations were in turn to form an obligatory language to which all artistic, scientific, commercial or military participants in a knowledge-mediated discourse have to conform, the focus of power would become very tangible indeed.

1. One might argue that language is an exception, because it is essentially about communication with other human beings, and thus dependent on them. But human speech is based on vocal apparatus which can produce sounds for other purposes, and auditory sensors which are used to detect many kinds of sounds besides speech.
And of course, both can operate meaningfully even when there is no external human present. The same cannot be said of a keyboard and screen.
A number of critics have warned that society could find itself mired in a kind of ideological determinism instrumented by information technology (e.g. [7]). Particularly applicable here is the risk of opportunism on the part of an organisation or company that establishes a comprehensive ontology defining all the concepts, relations, contexts and actors for an industry as a bid for control of that industry. If one powerful organisation controlled these elements, it could be difficult or impossible for outsiders to get new ideas recognised, or to communicate ideas which fell outside those terms of reference. The concern is that the organisation could disguise its attempt to monopolise the discourse by claiming that it was simply establishing a useful knowledge standard. But engineering standards for knowledge interchange should not be permitted to lead towards standardised knowledge content. Our community should be on its guard about this, so that our innovations do not become yet another pillar of inequality. If, as we oversee the construction of large, shareable ontologies, we are to avoid the twin evils of hard labour and ideological monopoly, we must become willing to take our hands off the levers of power to some degree. Since humans and their institutions tend to seek to consolidate power, not relinquish it, this barb can be expected to be the sharpest of all.
4 Fully Automated Acquisition

In the CG community, we tend to neglect knowledge acquisition. In a set of 148 papers from ICCS meetings of the last five years, only 17 mentioned knowledge acquisition at all, and fewer discussed the topic in any detail. Concepts and relations appear in the catalogues of practically all CG systems as a matter of design-time proclamation, which is to say, human judgement. Neglecting other means of gathering knowledge tends to lock in proclamation as the only method of establishing knowledge using CGs. That may discourage experienced knowledge engineers looking for improved technologies for their craft, who know that manual knowledgebase creation and maintenance is potentially so labour-intensive that it may make their systems unacceptably costly. Perhaps CG developers are beginning to see the cruciality of this issue: whereas only 3 of the 56 papers in 1992 concerned this topic, this had risen to 9 out of 48 papers by 1997. Now consider what the semi-automatic trap has to teach us about knowledge acquisition. If, as in the above reductio ad absurdum, conceptual graph systems were not permitted to use a screen and keyboard for more than the initial system specification, would knowledge acquisition become impossible? Evidently not, because humans and animals can learn. But, for the sake of argument, how would it change under this restriction? The term "knowledge acquisition", as it is conventionally used, means encoding knowledge which has come from asking human beings (experts). Sometimes it takes on a slightly broader sense, in which the knowledge may come from other sources like models or books. "Asking a human" sounds like a natural process, available to a person and not likely to be disabled by removing the screen and keyboard from a computer.
But of course, what almost always actually happens is that another person elicits the knowledge from the expert, casts the knowledge into the representation formalism by hand and then types it into the system by screen and keyboard. This would not be possible in a system in which these two devices were disconnected, that is, once the system was deployed. We could, of course, imagine a system in which deployment was delayed until all the knowledge it would ever need was built-in at design time. This would be a model of
totally innate knowledge. New knowledge could be had within the deployed system, but only by derivation from the innate supply. Perhaps, with great foresight on the part of the designer, such a knowledge system would be adequate. By analogy, most commercial programs are sold without their source code, with all information they need sealed into the executable code. But this strategy seems inherently risky. A system closed off from external change seems to lack an essential flexibility. It is precisely this sort of inflexibility that makes us seek more advanced forms of software than conventional programs. The alternative would be to use I/O devices which are allowed to remain connected (or, in the liberal interpretation of the rule, to use the screen and keyboard for natural exchange only). To do that, the human knowledge elicitor must be replaced. This means creating not only a natural language interface, but also a program to conduct the elicitation process, automatically generating the graphs for expression as questions, and a method for automatically dealing with the conceptual graphs that result from the parsing of the expert's responses. Since I have written elsewhere about parsing and learning by asking 10, I will focus here on this method.

(Diagram: a Syllabus feeds an interpreter which updates the SKB, EKB and PKB.)
Figure 2. A teachable CG knowledge machine.

Figure 2 describes a CG machine containing three knowledgebases:
• SKB, the semantic knowledgebase, consisting of conceptual, relational and actor hierarchies.
• EKB, the episodic knowledgebase, a sequential list of conceptual graphs, representing the conceptual history of the system.
• PKB, the procedural knowledgebase, containing the source for actors defined in the system. This could be as simple as a set of listings of Lisp functions.
The acquisition process depends on a special teaching language in which a range of operations on these databases may be expressed in simple fashion. Such a language would be something like the Structured English Interface for the Deakin Toolset 11, except that additions and changes to the conceptual hierarchies as well as the construction of graphs based on them would be allowed. The procedural knowledgebase would need a theoretically different set of techniques for skill learning, which will not be addressed here. For example, the following sequence of expressions sets up a
situation in which a hungry bear is inside a cave, beginning with no concept of bear or cave:
NEWCON  BEAR < ANIMAL
DEFCON  BEAR (ATTR FURRY ATTR COLOUR:Brown LOC PLACE EATS ANIMAL)
NEWCON  CAVE < LANDMARK
DEFCON  CAVE (CHRC DARK CONT ROCKS CONT BATS)
BEAR1   JOIN (BEAR EXPR HUNGER)
G105    JOIN (CAVE CONT BEAR1)
ASSERT  G105
The "interpreter" (in the specific technical sense of a parser-executor of strings in an artificial language) of Figure 2 chooses appropriate operators such as copy, restrict, or join and decides how to apply these to update the SKB and EKB. In most cases, the opcode of an instruction should be enough to select the correct knowledgebase, since the operations appropriate to hierarchy-building and episodic memory are different. Instructions concerning operations on individual graphs would use local variables to hold the graphs until a command sent the graph to a specified knowledgebase. The course of knowledge to be learned will be introduced in a specially prepared sequence of expressions in this language, called a syllabus. The syllabus would be written by hand, which might at first glance seem to defeat the notion of automating acquisition. However, until we are prepared to construct much more sophisticated perceptual devices (such as a camera which returns conceptual graphs describing objects and events in its field of view), the data which informs the learning process must come from such human-mediated sources. Since advanced raw-data-to-CG converters seem far off, our efforts to reduce human involvement are compromised. It might instead be hoped that the simple teaching language, beginning from a form like that described above, would evolve with experience, so that frequently recurring patterns of operations were eventually chunked up into powerful elements of a higher-level language. Ideally, the high-level form would decouple the content of the material from the operations, allowing syllabuses to focus primarily on content and to be shorter and easier to write.
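To make the interpreter's job concrete, the dispatch of teaching-language opcodes to the SKB and EKB might be sketched as follows. The class, method names, and the treatment of differentiae as raw text are illustrative assumptions, not part of the proposal above.

```python
# A minimal sketch of the syllabus interpreter of Figure 2. The opcode alone
# selects the target knowledgebase; JOIN results are held in local variables
# until ASSERT moves a graph into episodic memory.
class CGMachine:
    def __init__(self):
        self.skb = {"ANIMAL": None, "LANDMARK": None}  # type -> supertype
        self.defs = {}   # type -> differentia (kept as raw text in this sketch)
        self.ekb = []    # sequential list of asserted graphs (episodic memory)
        self.vars = {}   # local variables holding graphs before ASSERT

    def execute(self, opcode, *args):
        if opcode == "NEWCON":            # e.g. NEWCON BEAR < ANIMAL
            subtype, _, supertype = args
            if supertype not in self.skb:
                raise ValueError(f"unknown supertype {supertype}")
            self.skb[subtype] = supertype
        elif opcode == "DEFCON":          # attach a differentia to a type
            self.defs[args[0]] = args[1]
        elif opcode == "JOIN":            # build a graph, hold it in a variable
            var, graph = args
            self.vars[var] = graph
        elif opcode == "ASSERT":          # move a held graph into the EKB
            self.ekb.append(self.vars.pop(args[0]))
        else:
            raise ValueError(f"unknown opcode {opcode}")

m = CGMachine()
m.execute("NEWCON", "BEAR", "<", "ANIMAL")
m.execute("DEFCON", "BEAR", "(ATTR FURRY ATTR COLOUR:Brown LOC PLACE EATS ANIMAL)")
m.execute("NEWCON", "CAVE", "<", "LANDMARK")
m.execute("DEFCON", "CAVE", "(CHRC DARK CONT ROCKS CONT BATS)")
m.execute("JOIN", "G105", "(CAVE CONT BEAR1)")
m.execute("ASSERT", "G105")
print(len(m.ekb))  # -> 1
```

Under this reading, NEWCON and DEFCON update only the SKB, while ASSERT is the single command that commits a locally held graph to the EKB.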
4 Fully Automated Interpretation Given that the goal is creating a system for intelligent reasoning and not a tool for manipulating logic, one way to test that a system is not semi-automatic-trapped is to systematically replace all the mnemonics in the graphs - the type labels in the concepts, relations, actors and contexts - with "blind" labels such as random combinations of characters. If this disadvantages or disables the run-time system, it means that the symbols in the graphs are parasitic on the meanings in the user's head, and so the system is not fully automated. In practice, it would be useful to be able to switch between the arbitrary labels and mnemonic ones, since the mnemonics are legitimate and useful for design and debugging. Being able to turn off the mnemonics at run-time would force attention away from the graphs and onto the human-read/writable forms with which users of the system will deal. Let us assume, for convenience, that the form for users of a hypothetical system is natural language, and use that to explain what is needed to automatically interpret graphs. Section 2 argued that a truth-conditional semantics is not enough. In 10, I
expressed this as the need to move beyond truth-preserving algorithms to plausibility-preserving heuristics. What does this mean? Imagine that the natural language system is asked two questions:
1. "Can a rabbit fly?"
2. "Can you arrange the names in file "customers.txt" alphabetically and send that to printer1?"
To answer 1, a truth-conditional, truth-preserving approach can be tried. Once the pragmatic component of the system has recognised the form as a question, an attempt to join the definitions of RABBIT and FLY can be made, and since selectional constraints in the definition graphs should reject an attempt to fit two incompatible graphs together, an answer of "no", based on whether the join algorithm was successful, can be returned. Perhaps, though, it would be more convivial to return either the successfully joined graph or an error message that reported what had blocked the join. This would avoid the embarrassment of a simple "yes" answer, in the event that the system had deduced that a rabbit was a suitable patient for transport by aeroplane. It is easy to show that properly answering 2 requires a plausibility-preserving, procedural heuristic approach. First, assume that the only procedure available which could possibly alphabetise the file is a generalised Sort function. The word "arrange" is insufficient to choose an operation, so the system searches all available acts for a suitable match. The way the match is performed is crucial to success here; it must be quite liberal if it is to cope with the many possible ways in which it might properly be summoned. Assuming the Sort function was found to have an optional parameter called "alphabetical", then it might be appropriate to use Sort(alphabetical, customers.txt). We could not know this kind of relationship with the certainty required for a truth-preserving algorithm. A heuristic enforcing only plausibility, on the other hand, could take such a liberty, and thus only it could succeed in this case. Second, in order to avoid the pragmatic howler of returning a yes-or-no answer to 2, the system must actively carry out first the sort and then the print operation. The "and" linking the two is not a Boolean conjunction, but a conditional link ordering two tasks.
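The liberal matching just described can be sketched as a simple scoring heuristic. The act table, parameter names and scoring rule below are hypothetical illustrations of a plausibility-preserving match, not the author's implementation.

```python
# Hypothetical sketch: liberally match the request "arrange ... alphabetically"
# onto the generalised Sort act by scoring how many request modifiers each
# available act can absorb as parameters.
available_acts = {
    "Sort":  {"params": {"alphabetical", "numeric", "reverse"}},
    "Print": {"params": {"printer", "copies"}},
}
request = {"verb": "arrange", "modifiers": {"alphabetical"}}

def choose_act(request, acts):
    # No act is literally called "arrange", so accept the most plausible
    # candidate rather than demanding a provably correct one.
    best, best_score = None, 0
    for name, spec in acts.items():
        score = len(request["modifiers"] & spec["params"])
        if score > best_score:
            best, best_score = name, score
    return best

print(choose_act(request, available_acts))  # -> Sort
```

A truth-preserving algorithm would refuse this step, since nothing guarantees that "arrange" denotes Sort; only the plausibility heuristic can licence Sort(alphabetical, customers.txt).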
Successfully recognising the two clauses as a pipelined print operation and performing it is the interpretation of the question. If unsuccessful, some kind of error message representing an explanation would then be appropriate, as in 1.
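The blind-label substitution proposed at the start of this section can also be sketched directly; the token scheme and the triple encoding of graphs below are assumptions for illustration.

```python
# Sketch of the blind-label test: replace every mnemonic type label in a
# graph with an arbitrary, meaning-free token, keeping a reverse map so the
# mnemonics can be switched back on for design and debugging.
def blind(labels):
    forward = {lab: f"Q{i:03d}" for i, lab in enumerate(sorted(labels))}
    backward = {tok: lab for lab, tok in forward.items()}
    return forward, backward

# Toy graph as concept-relation-concept triples.
graph = [("RABBIT", "AGNT", "FLY")]
fwd, back = blind({"RABBIT", "AGNT", "FLY"})

blinded = [(fwd[a], fwd[r], fwd[b]) for (a, r, b) in graph]
# If the run-time system behaves identically on `blinded`, the meanings are
# genuinely in the system; if it degrades, they were in the user's head.

restored = [(back[a], back[r], back[b]) for (a, r, b) in blinded]
assert restored == graph and blinded != graph
```

Since the mapping is invertible, the mnemonics can be toggled at will, exposing any parasitism without losing the human-readable form.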
5 Conclusions
By making provision for actors, CG theory has already prepared the way for progressing beyond the notion that description is all there is to representation. To address the shortcomings of formal knowledge representations which do not recognise the significance of procedural aspects of knowledge, more attention must be paid to both the tokens which mark their presence and to the active processes which substantiate those tokens. These processes cannot be completely divorced from the pragmatics of a CG system. It will not be enough to continue developing algorithms which manipulate CGs, without somehow recognising these explicitly within the graphs themselves. Ideally, the entire codification of an active process would appear in a conceptual graph, but this complicates the notation. At least actor nodes with the same names as coded procedures outside the system should be allowed for.
Those interested in the adoption of this formalism as a knowledge standard must also now progress beyond the notion that representations are all there is to knowledge. To create working knowledge for intelligent systems, it is important not to perpetuate passive symbols designed for human use. This engenders a kind of introspection which is compelling to system builders, and potentially aversive to commercial developers. More seriously, it carries the risk of building too much human involvement into the system. The semi-automatic trap is a gedankenexperiment designed to reveal this risk. By asking how knowledge systems would function without their screens and keyboards, it reminds us that these conduits can sometimes work against the development of true automated reasoning. Two escapes from the semi-automatic trap are briefly discussed: fully automated acquisition and fully automated interpretation. In the case of knowledge acquisition, too much human involvement is problematic because it commits the system builder to large amounts of collection and maintenance work on the knowledgebase. It also risks granting a great deal of power to any highly-resourced organisation which is able to manually create a large ontology. It would be preferable to eliminate the human elicitor in knowledge acquisition. Although the suggested teaching experiment does not accomplish this, it may be a step in the right direction. In the case of interpretation, the ease with which humans take on the role of interpreter of symbols makes the human-readability of CGs a double-edged sword. Therefore I suggested that the mnemonic labels inside the nodes be able to be switched off, so that any parasitism of the system may be exposed. Our artificial reasoning systems will be better able to cope with the vagaries of real tasks when they use plausibility-preserving heuristics instead of truth-preserving algorithms.
Truth preservation is important for maintaining the canonicity of true graphs during arbitrary transformations, but it might block sensible but unsound steps in the reasoning process. Such steps could be ubiquitous in commonsense thinking.
References
1. Brachman, R.J. et al. Krypton: A Functional Approach to Knowledge Representation. IEEE Computer, 1983, 16, 10, 67-74.
2. Buchler, J. Charles Peirce's Empiricism. New York: Harcourt, Brace & Co., 1939.
3. Burks, A.W. The Collected Papers of Charles Sanders Peirce. Vol. 5.
4. Ellis, G. Object-oriented Conceptual Graphs. In G. Ellis, R. Levinson, W. Rich and J.F. Sowa (Eds.) Conceptual Structures: Applications, Implementation and Theory. Lecture Notes in AI 954, Springer-Verlag, Berlin, 1995, 114-157.
5. Hewitt, C. et al. Knowledge Embedding in the Description System Omega. Proceedings of the First National Conference on Artificial Intelligence, Stanford, CA, 1980, 157-164.
6. Hiebert, J. Conceptual and Procedural Knowledge in Mathematics: An Introductory Analysis. In J. Hiebert (Ed.) Conceptual and Procedural Knowledge: The Case of Mathematics. Hillsdale, NJ: Lawrence Erlbaum Assoc., 1986, pp. 1-27.
7. Lacroix, G. Technical Domination and Techniques of Domination in the New Bureaucratic Processes. In L. Yngstrom et al. (Eds.) Can Information Technology Result in Benevolent Bureaucracies? The Netherlands: Elsevier Science Publishing Co., 1985, 173-178.
8. Lenat, D. et al. CYC: Towards Programs with Common Sense. Communications of the ACM, 1990, 33, 30-49.
9. McCarthy, J. Recursive Functions of Symbolic Expressions and their Computation by Machine, Part 1. Communications of the ACM, 1960, 3, 4.
10. Mann, G.A. Control of a Navigating Rational Agent by Natural Language. PhD thesis, University of New South Wales, 1996.
11. Munday, C., Sobora, F. & Lukose, D. UNE-CG-KEE: Next Generation Knowledge Engineering Environment. Proceedings of the 1st Australian Knowledge Structures Workshop, Armidale, Australia, 1994, 103-117.
12. Nebel, B. & von Luck, K. Hybrid Reasoning in BACK. In Z.W. Ras and L. Saitta (Eds.) Methodologies for Intelligent Systems, Vol. 3. North-Holland, Amsterdam, The Netherlands, 1988.
13. Rochowiak, D. A Pragmatic Understanding of "Knowing That" and "Knowing How": The Pivotal Role of Conceptual Structures. In D. Lukose, H. Delugach, M. Keeler, L. Searle & J.F. Sowa (Eds.) Conceptual Structures: Fulfilling Peirce's Dream. Lecture Notes in AI 1257, Springer-Verlag, Berlin, 1997, 25-40.
14. Sowa, J.F. Conceptual Structures: Information Processing in Mind and Machine. Menlo Park, CA: Addison-Wesley, 1984.
15. Sowa, J.F. Conceptual Graph Summary. In T.E. Nagle et al. (Eds.) Conceptual Structures: Current Research and Practice. Chichester: Ellis Horwood, 1992, 339-348.
16. Sowa, J.F. Logical Foundations for Representing Object-Oriented Systems. Journal of Theoretical and Experimental Artificial Intelligence, 1993, 5.
17. von Neumann, J. The Computer & the Brain. New Haven: Yale University Press, 1958.
Ontologies and Conceptual Structures William M. Tepfenhart AT&T Laboratories 480 Red Hill Rd Middletown, NJ 07748 William.Tepfenhart@att.com
Abstract. This paper addresses an issue associated with representing information using conceptual graphs: the great variability in the approaches that individuals take to the conceptual graph representation and the ontologies employed. This great variability makes it difficult for individual authors to use the results of other authors. This paper lays out these differences and their consequences for the ontologies. It compares the ontologies and representations used in papers presented at the International Conference on Conceptual Structures in 1997. This comparison illustrates the diversity of approaches taken within the CG community.
1 Introduction One of the problems with reading papers on conceptual structures is that there are almost as many different approaches to conceptual structures as there are authors. In his original book 1, Sowa described three basic representational elements: concepts, conceptual relations, and actors. Since then, other authors have modified concepts, conceptual relations, and actors in very different manners -- different in terms of how they are defined and used. In addition, there are at least four graph types: simple graphs, nested graphs, positive nested graphs, and actor graphs. These differences make comparison between papers difficult and at times impossible. There is, however, an even worse problem: these differences are fracturing the conceptual graph community along multiple lines. This paper does not attempt to unify all of the different approaches. Such a task is difficult and the effort involved tremendous. It is not even clear that the result would be of value to any but a few. Instead, this paper lays out certain fundamental differences in the various approaches to conceptual graphs. Using the results of this paper, the interested reader will understand how to interpret papers based on very different sets of premises and perhaps be more forgiving to those who have chosen a different approach.
The basic elements for which differences are identified in this paper are: the descriptive emphasis, the definitional information, the conceptual grounding, the processing approaches, the ontological structures, and the knowledge structures. As will be shown, there are many degrees of freedom in how one can combine all of these different elements. This paper will not argue which combinations of these elements are meaningful, although it might seem to some readers that some are not. The next six sections of this paper address the following topics:
• descriptive emphasis - what aspects about the world are stressed most in the ontology and how that affects where and how concepts are defined.
• definitional information - what information is captured in the definition of concepts and how that information is to be used.
• conceptual groundings - the semantic basis on which the meaning of the concept is founded.
• processing approaches - how information captured within conceptual graphs is processed and the implications in terms of how concepts are defined.
• ontological structures - how concepts are arranged in a type structure and the kinds of processing that can be performed over it.
• knowledge structures - the graph structures which individuals use to express information and how that structure influences the ontology.
Each section describes the element and gives examples of the approach. This is followed by a section that classifies individual papers according to the ontological assumptions on which they are based. The paper concludes by giving a summary of the results presented here.
2 Descriptive Emphasis One element contributing to an ontology is the descriptive emphasis. The descriptive emphasis is the part of the physical world that is stressed most within the ontology and knowledge structures. Some descriptions focus on the state while others focus on the act. The distinction between the two is significant in terms of the kind of information captured, the types of operations that are performed over them, and the kinds of inferences that can be achieved. In fact, the different emphasis controls what kinds of information must be derived from a given graph and knowledge base versus what information is trivially extracted.
2.1 State An ontology that emphasizes state concentrates on things and the relationships among them. Actions are expressed as changes in state and are characterized by an initial state, a final state, and the act that links the two. The ontology, of course, supports this kind of treatment directly. An example of this for 'A Cat Sat On A Mat' is,
Cat: * -> (on) -> Mat: *
       -> (posture) -> Standing: *
   -> <Sit> ->
Cat: * -> (on) -> Mat: *
       -> (posture) -> Sitting: *

In this example, the initial state is one in which a cat is on a mat in a standing position; the final state is one in which a cat is on a mat in a sitting position; and the link between the two is an actor that represents the movement of the cat into a sitting position. The use of an actor <Sit> expresses the semantics of an active relation although, as will be discussed in a later section, actors are not the only (computational) mechanism to express the changes that are taking place. The ontology reflects this way of viewing the physical world by having states and the objects within them captured as concepts. The concepts are defined within the concept type structure. The relationships between objects within a state are captured as conceptual relations which are defined within the relation type structure. Relationships between states are captured as active relations which can be defined as relations within the relation type structure or as actors within an actor type structure.
2.2 Act An ontology that emphasizes acts concentrates on the transitions and the roles that things play within them. Actions that take place are characterized by the subject of the act, the recipient of the act, the location in which it took place, and the manner in which the subject executed the act. An example of this is,

Sat: * -> (agent) -> Cat: *
       -> (location) -> Mat: *

In this case, the act is expressed by Sat: *, where the agent of the act is given by Cat: * and the location is given by Mat: *. The ontology, of course, supports this kind of treatment directly. Here the objects involved and the act are expressed as concepts which are defined within the concept type structure. The relationships between the act and the participants are expressed as conceptual relationships which are defined within the relation type structure.
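The contrast between the two emphases can be made concrete with toy data structures. The triple encoding and all names below are illustrative assumptions, not part of any particular CG system.

```python
# The same fact, "a cat sat on a mat", under the two descriptive emphases.
# Tuples are (concept, relation, concept).

# State emphasis: an initial state, a final state, and a <Sit> actor
# linking them.
initial_state = [("Cat:*", "on", "Mat:*"), ("Cat:*", "posture", "Standing:*")]
final_state   = [("Cat:*", "on", "Mat:*"), ("Cat:*", "posture", "Sitting:*")]
transition    = ("<Sit>", initial_state, final_state)

# Act emphasis: the act itself is a concept, with participants attached
# through case relations.
act = [("Sat:*", "agent", "Cat:*"), ("Sat:*", "location", "Mat:*")]

# With the state emphasis, "what changed?" is trivially extracted...
changed = [t for t in final_state if t not in initial_state]
# ...while with the act emphasis, "who did it?" is the trivial query.
agents = [c for (a, rel, c) in act if rel == "agent"]
```

This illustrates the point made above: the chosen emphasis controls which information is trivially extracted (the posture change versus the agent) and which must be derived.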
3 Definitional Information Sowa states that a concept type is defined by: type a(x) is u, where a is the type label, the body u is the differentia of a, and type(x) is called the genus of a. An example of this is, type CIRCUS-ELEPHANT(x) is
ELEPHANT: *X (has_components) -> Type_Of (physical_object) {*} (has_internal_structure) -> Structure ...... 'A substance is a set containing an uncountable number of physical objects of various types and the members within the set have some internal structure'. Tel proposes to perform computations over sets; substance properties can then be interpreted as consequences of the relationships that exist among the instances. Tel explains compositions, states, viscosity, and reactivity of substances. This is a conceptualisation of substances at a micro-level, providing an extremely detailed description of the micro-changes that can take place over time. The NL-oriented approaches, however, always take into consideration (i) the linguistic facts, and (ii) (to some extent) the object denoted by the noun and the properties of this object. According to Lyons, the count-mass distinction is primarily a linguistic one, which is clearly seen in cases of multilinguality (remember the example of news, uncountable in English and countable in Bulgarian). R. Dale Dal investigates the domain of cooking and discusses conflicts between a naive ontology and linguistic facts. In the domain of cooking, rice and lentil are rather similar (small
objects of roughly the same size), whose individuals are not considered separately in recipes. The linguistic expressions of ingredients, however, represent rice as a mass noun while lentil behaves like a count noun: e.g. four ounces of rice, four ounces of lentils. 'If the count/mass distinction was ontologically based, we would expect these descriptions to be either both count or both mass' Dal. We found particularly interesting the observation that 'physical objects are not inherently count or mass, but are viewed as being count or mass' in some domains. So Dal considers physical objects from either a mass or a count perspective. 'Thus, a specific physical object can be viewed one time as a mass, and another time as a countable object: when cooking, I will in all likelihood view a quantity of rice as a mass, but if I am a scientist examining rice grains for evidence of pesticide use, I may view the same quantity of rice as a countable set of individuals'. In Dal, exactly one perspective at a time is allowed: each object in the closed domain of cooking is either mass or count. Comparing Tel and Dal, we see that at a micro-level all objects can be treated as sets of instances; but natural language does not work at the level of atomic components. In most realistic domains the basic objects have much bigger granularity and they are denoted by words that make us treat them as count or mass. Unfortunately, the context-dependent NL usage provides flexible shifts of granularity (see Hol): 'A road can be viewed as a line (planning a trip), as a surface (driving on it), and as a volume (hitting a pothole)... Many concepts are inherently granularity-dependent.' Probably we could say that a closed domain is a domain in which a pre-fixed number of perspectives on each object exists and the relevant kinds of granularity are pre-fixed as well.
To summarize, in NL we refer to the conceptual entities as follows: (i) when the stuff itself is referred to, an uncountable noun is typically used; in other cases the emphasis is on the object form and shape and then we refer to particular instances by means of countable nouns; (ii) in some domains the entities are treated as compositions at the level of micro-ingredients, while in other domains we see them as compositions of ingredients at a much higher level.
4 A Mixed Count-Mass Taxonomy in Closed Domains
We acquire CGs in the domain of admixture separation from polluted water in order to generate domain explanations in several NLs from the underlying KB AB1. In the context of the present considerations, we try to satisfy the following requirements: (i) an adequate internal conceptualisation providing easy surface verbalisation in different NLs with proper usage of singular-plural and count-mass nouns; (ii) a clear separation between conceptual data (in the single KB) and linguistic data (in the system lexicons, one lexicon per language); (iii) conceptual structures allowing for easy integration of the closed domain into more universal ontologies. In the type hierarchy we define important domain types and integrate them under an upper model. Figure 2 represents a simplified view of the taxonomy
where countable and mass entities are classified as subtypes of PHYSICAL-OBJECT. We acquire as concept types two important objects in the domain: oil drop and oil particle. Note that this decision is domain-dependent, since in this domain the polluting oil exists as particles and drops in the polluted water. Furthermore, OIL-DROP and OIL-PARTICLE are subtypes of the substance OIL. These two types show the borderline where the domain taxonomy is integrated into a universal taxonomy of physical objects. The ISA-KIND relation, introduced in AB2, defines the perspective of looking at particles and drops as OIL: they are typical 'quantities of stuff'. To conform to some standard, we adopted the keyword PACKAGER from Dal. The ISA-KIND relation indicates that the classifications OIL → OIL-PARTICLE and OIL → OIL-DROP are partitions into role subtypes, because these are subtypes that can be changed during the lifetime of the physical object (similarly to PROFESSION for PERSONS), while the partitions OIL → MINERAL-OIL and OIL → SYNTHETIC-OIL are classifications into natural subtypes according to the usual type-of relation. Note that the PACKAGER perspective covers the two cases of part-whole relations mentioned in WCH: it denotes the relations portion-mass and stuff-object, which are not distinguished in the current domain and consequently are not treated as different in the conceptual model. Such a conceptual solution provides flexible links to the lexicons in the case of count-mass nouns. Imagine that in some natural language only the word for the stuff exists (e.g. grape is a mass noun in Bulgarian and Russian); then this word is linked to the stuff-concept. But in some other languages, the related words can name typical 'packaged' quantities (like grape in English); if we acquire such quantities as types, the respective words will be connected to these special concept types.
Note that it is not obligatory to have 'naming' lexicon elements in any language and for any domain types; the explanations are constructed for the existing concepts in the corresponding grammatical forms. Since the inheritance works for type-of relations, in DBR-MAT we can, for instance, generate the explanation: Each oil particle has dimension less than 0.05 mm. Here we use particle in singular, since it is a countable object and inherits the characteristic features of PARTICLE. As a subtype of PARTICLE, OIL-PARTICLE has SHAPE and DIMENSION. From the perspective of OIL, however, we talk about particles always in plural, and thus we connect countable and mass nouns in the generated explanations. For instance: Oil appears as oil particles and oil drops. Viewed as oil,
oil particles have density and relative weight. Additionally, in the KB, when instances of the types OIL-PARTICLE and OIL-DROP appear in conceptual graphs with an unspecified plural referent, i.e. as OIL-PARTICLE: {*} and OIL-DROP: {*}, we can make a generalisation and replace these types by OIL. Then they are verbalised as mass nouns. It is obvious that the natural language we generate is not as flexible as a human's, but this solution is an opportunity to mix the countable and uncountable perspectives in one utterance. Note that we cannot produce phrases like two oil particles, because in our domain-specific KB the unspecified plural referent sets
(Diagram: PHYSICAL-OBJECT is partitioned into COUNTABLE and SUBSTANCE; DROP and PARTICLE are countable types and OIL a substance; OIL-DROP and OIL-PARTICLE are PACKAGER role subtypes of OIL, while MINERAL-OIL and SYNTHETIC-OIL are its natural subtypes.)
Fig. 2. Taxonomy mixing countable and mass objects by classification into role and natural subtypes.
from the type definition of SUBSTANCE would not be instantiated with counted sets. To conclude, such crosspoint types between countable and mass types require very careful elaboration of: (i) the type hierarchy, (ii) the type definitions of the supertypes and the 'crosspoint' type, so as to assure correct generalisation and specialisation, and (iii) the characteristics of both supertypes, to assure correct inheritance.
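The generalisation step described above can be sketched as follows. The hierarchy dictionaries and function names are assumptions for illustration, not the actual knowledgebase code.

```python
# Sketch of the count-mass generalisation: instances of OIL-PARTICLE or
# OIL-DROP with an unspecified plural referent {*} are replaced by their
# ISA-KIND supertype OIL and subsequently verbalised as a mass noun.
ISA_KIND = {"OIL-PARTICLE": "OIL", "OIL-DROP": "OIL"}       # role subtypes (PACKAGER)
TYPE_OF  = {"MINERAL-OIL": "OIL", "SYNTHETIC-OIL": "OIL"}   # natural subtypes

def generalise(concept):
    """[OIL-PARTICLE: {*}] -> [OIL], only for unspecified plural referents."""
    ctype, referent = concept
    if referent == "{*}" and ctype in ISA_KIND:
        return (ISA_KIND[ctype], None)   # now verbalised as a mass noun
    return concept

# Unspecified plural referents are generalised to the stuff-concept...
assert generalise(("OIL-PARTICLE", "{*}")) == ("OIL", None)
# ...but a counted set must not be: "two oil particles" stays blocked.
assert generalise(("OIL-PARTICLE", "{*}@2")) == ("OIL-PARTICLE", "{*}@2")
```

The guard on the referent is what prevents counted sets from crossing the borderline into the mass perspective, mirroring the restriction stated in the text.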
5 Conclusion
This paper discusses difficulties in classifying world objects as countable and uncountable and presents a (more or less) empirical solution applied in an ongoing project. We see that KA requires extremely detailed analysis of the source texts and deep understanding of the CG structures and idiosyncrasies. It is somewhat risky to consider only the NL level, since the language phenomena in the acquisition texts are often misleading (clearly seen in a multilingual paradigm). Actually the countable-mass distinction should be defined at a deeper conceptual level. We try to keep the two perspectives closely related: the uncountable stuff and the countable individual objects made of this stuff. Fig. 2 shows that we allow both count-mass perspectives together, but only for specially acquired, domain-dependent types. In this sense, our approach addresses the closed world of one domain. To 'open' this closed world and to integrate another closed world for another domain will probably require the addition of new 'crosspoint' types between countable and mass objects. In our view, the classification countable-uncountable is not less meaningful than the partition abstract-material. Fig. 1 differs from many upper-level ontological classifications, where abstract-real is the highest partition of the top (see FH1). But a careful analysis of the taxonomy in Fig. 1 shows that it is easy to replace the two upper layers, i.e. the classification concrete-abstract can easily become the topmost partition. Despite the problems discussed in this paper, it seems worthwhile to consider at least the countable-mass distinction of physical object as one of the unifying principles for top-level ontologies.
Acknowledgements The author is grateful to the three anonymous referees for the fruitful discussion and the suggestions.
References
AB1 Angelova, G., Boncheva, K.: DB-MAT: Knowledge Acquisition, Processing and NL Generation using Conceptual Graphs. ICCS-96, LNAI 1115 (1996) 115-129.
AB2 Angelova, G., Boncheva, K.: Task-Dependent Aspects of Knowledge Acquisition: a Case Study in a Technical Domain. ICCS-97, LNAI 1257 (1997) 183-197.
Dal Dale, R.: Generating Referring Expressions in a Domain of Objects and Processes. Ph.D. Thesis, University of Edinburgh (1988).
FH1 Fridman, N., Hafner, C.: The State of the Art in Ontology Design. A Survey and Comparative Review. AI Magazine Vol. 18(3) (1997) 53-74.
Gul Guarino, N.: Some Organizing Principles for a Unified Top-Level Ontology. In: Working Notes, AAAI Spring Symp. on Ontological Engineering, Stanford (1997).
Hal Hayes, P.: The Second Naive Physics Manifesto. In: Brachman, Levesque (eds.) Readings in Knowledge Representation, Morgan Kaufmann Publ. (1985) 468-485.
Hol Hobbs, J.: Sketch of an ontology underlying the way we talk about the world. Int. J. Human-Computer Studies 43 (1995) 819-830.
Mol Molhova, J.: The Noun: a Contrastive English-Bulgarian Study. Publ. House of the Sofia University "St. Kl. Ohridski", Sofia (1992).
Sol Sowa, J.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA (1984).
So2 Sowa, J.: Conceptual Graphs Summary. In: Nagle, Nagle, Gerholz, Eklund (Eds.) Conceptual Structures: Current Research and Practice, Ellis Horwood (1992) 3-52.
Tel Tepfenhart, W.: Representing Knowledge about Substances. LNAI 754 (1993) 59-71.
WCH Winston, M., Chaffin, R., Herrmann, D.: A Taxonomy of Part-Whole Relations. Cognitive Science 11 (1987) 417-444.
A Logical Framework for Modeling a Discourse from the Point of View of the Agents Involved in It¹
Bernard Moulin, Professor
Computer Science Department and Research Center in Geomatics, Laval University, Pouliot Building, Ste-Foy (QC) G1K 7P4, Canada
Phone: (418) 656-5580, E-mail:
[email protected]
Abstract. The way people interpret a discourse in real life goes well beyond the traditional semantic interpretation based on predicate calculus, as is currently done in approaches such as Sowa's Conceptual Graph theory or Kamp's DRT. From a cognitive point of view, understanding a story is not a mere process of identifying the truth conditions of a series of sentences, but a construction process of building several partial models such as a model of the environment in which the story takes place, a model of mental attitudes for each character and a model of the verbal interactions taking place in the story. On this cognitive basis, we propose a logical framework differentiating three components in an agent's mental model: a temporal model which simulates an agent's experience of the passing of time; the agent's memory model which records the explicit mental attitudes the agent is aware of; and the agent's attentional model containing the knowledge structures that the agent manipulates in its current situation.
1 Introduction
Conceptual Graphs (CG) [16] have been applied in several natural language research projects, and the CG notation can be used to represent fairly complex sentences, including several interesting linguistic phenomena such as attitude reports, anaphora, indexicals and subordinate sentences. However, most researchers have overlooked the importance of modeling the context in which sentences are uttered by locutors. Even modeling a simple sentence such as "Peter saw the girl who was playing in the park with a red ball" requires a proper representation of the context of utterance. For instance, the preterit "saw" cannot be modeled without referring to the time when the locutor uttered the sentence: hence, we need to represent the locutor and the context of utterance in addition to representing the sentence itself. To this end, we proposed to model the contents of whole discourses using an approach in which it is possible to explicitly represent the context of utterance of speech acts
¹ An extended version of this paper can be found in reference [11]. This research is sponsored by the Natural Sciences and Engineering Research Council of Canada and FCAR. My apologies to the reviewers because I could not answer all their questions in such a short paper.
[8, 9, 10]. Any sentence is thought of as resulting from the action of a locutor performing a speech act, which determines the context of utterance of that sentence. A major contribution of that approach was the explicit introduction of temporal coordinate systems into the discourse representation, using three kinds of constructs: the narrator's perspective, the temporal localizations and the agents' perspectives. As an example, let us consider the story displayed in Figure 1. It is told by an unidentified narrator using the past tense, which indicates that the reported events occurred in the past relative to the narrator's time, and hence to the moment when the reader reads the story. When the narrator reports the characters' words, the verb tense changes to the present or the future. These tenses are relative to the characters' temporal perspectives, which differ from the narrator's temporal perspective, temporally located after that date. This example shows the necessity of explicitly introducing into the discourse representation the contexts of utterance of the different speech acts performed by the narrator and the characters. The complete representation of this story can be found in [11].

Monday October 20 1997, Quebec city (S1). Peter wanted to read Sowa's book (S2), but he did not have it (S3). He recalled that Mary bought it last year (S4). He phoned her (S5) and asked her (S6): "Can you lend me Sowa's book for a week?" (S7) Mary answered (S8): "Sure! (S9) Come and pick it up! (S10)". John replied (S11): "Thanks! (S12) I will come tomorrow" (S13).
Figure 1: A sample story
Note: the numbers Si are used to identify the various sentences of the text
However, even such a representation is not sufficient if we want to enable software agents to reason about the discourse content. We aim at creating a system which will be able to manipulate the mental models of the characters involved in the discourse and simulate certain mechanisms related to story understanding. We based our approach on cognitive studies which have shown that readers adopt a point of view "within the text or discourse" [2]. The Deictic Shift Theory (DST) argues that the metaphor of the reader "getting inside the story" is cognitively valid. The reader often takes a cognitive stance within the world of the narrative and interprets the text from that perspective [14]. Segal completes this view by discussing the mimesis mechanism: "A reader in a mimetic mode is led to experience the story phenomena as events happening around him or her, with people to identify with and to feel emotional about... The reader is often presented a view of the narrative world from the point of view of a character... We propose that this can occur by the reader cognitively situating him or herself in or near the mind of the character in order to interpret the text" ([15], pp. 67-68). Hence, from a cognitive point of view, understanding a story is not a mere process of identifying the truth conditions of a series of sentences, but a construction process of building several partial models such as a model of the environment in which the story takes place, a model of mental attitudes for each character and a model of the verbal interactions taking place in the story. Hence the following assumption: when understanding a discourse, a reader creates several mental models that contain the mental attitudes (beliefs, desires, emotions, etc.) that she attributes to
each character as well as the communicative and non-communicative actions performed by those characters.
Hence, when using CGs to model the semantic content of a discourse, we need: 1) a way of representing the context of utterance of agents' speech acts; 2) the underlying temporal structure; 3) a way of representing the mental models of each character involved in the discourse. In the next sections we address point 3) and propose an approach based on a logical framework that differentiates three components in an agent's mental model: a temporal model which simulates an agent's experience of the passing of time (Section 2); the agent's memory model which records the explicit mental attitudes the agent is aware of and the attentional model containing the knowledge structures that the agent manipulates in its current situation (Section 3).
2 An Agent's Mental Model Based on Awareness

In order to find an appropriate approach to represent agents' mental models, we can consider the different formalisms that have been proposed to model and reason about mental attitudes² [1], among which the so-called BDI approach [13, 5] is widely used to formalize agents' knowledge in multi-agent systems. These formalisms use a possible-worlds approach [6] for modeling the semantics of agents' attitudes. For example, in the BDI approach [13, 5], an agent's mental attitudes such as beliefs, goals and intentions are modeled as sets of accessible worlds associated with an agent and a time index, thanks to accessibility relations typical of each category of mental attitudes. However, such logical approaches are impaired by the problem of logical omniscience, according to which agents are supposed to know all the consequences of their beliefs. This ideal framework is impractical when dealing with discourses that reflect human behaviors, simply because people are not logically omniscient [7]. In addition, it is difficult to imagine a computer program that will practically and efficiently manipulate sets of possible worlds and accessibility relations. In order to overcome this theoretical problem, Fagin et al. [4] proposed to explicitly model an agent's knowledge by augmenting the possible-worlds approach with a syntactic notion of awareness, considering that an agent must be aware of a concept before being able to have beliefs about it. In a more radical approach, Moore suggested partitioning the agent's memory into different spaces, each corresponding to one kind of propositional attitude (one space for beliefs, another for desires, another for fears, etc.), "these spaces being functionally differentiated by the processes that operate on them and connect them to the agent's sensors and effectors" [7].
The approach that we propose in this paper tries to reconcile these various positions, while providing a practical framework for an agent
² In AI literature, elements such as beliefs, goals and intentions are usually called "mental states". However, we use the term "mental attitudes" to categorize those elements. An agent's mental model evolves through time and we use the term "agent's mental state" to characterize the current state of the agent's mental model. Hence, for us an agent's mental state is composed of several mental attitudes.
362
in order to manipulate knowledge extracted from a discourse. The proposed agent framework is composed of three layers: the agent's inner time model, which simulates its experience of the passing of time; the agent's memory model, which records the explicit mental attitudes the agent is aware of; and the attentional model, containing the knowledge structures that the agent manipulates in its current situation. In order to formalize the agent's inner time model, we use a first-order, branching time logic, largely inspired by the logical language proposed in [5] and [13]. It is a first-order variant of CTL*, Emerson's Computational Tree Logic [3], extended to a possible-worlds framework [6]. In such a logic, formulae are evaluated in worlds modeled as time-trees having a single past and a branching future. A particular time index in a particular world is called a world position. The agent's actions transform one world position into another. A primitive action is an action that is performable by the agent and uniquely determines the world position in the time tree. The branches of a time tree can be viewed as representing the choices available to the agent at each moment in time. CTL* provides all the necessary operators of a temporal logic. It is quite natural to model time using the possible-worlds approach because the future is naturally thought of as a branching structure and because the actions performed by the agent move its position within this branching structure. The agent's successive world positions correspond to the evolution of the agent's internal mental state through time as a result of its actions (reasoning, communicative and non-communicative acts). The agent does not need to be aware of all the possible futures reachable from a given world position: this is a simple way of modeling the limited knowledge of future courses of events that characterizes people.
An agent's successive world positions specify a temporal path that implements the agent's experience of the passing of time: this characterizes its "inner time". This inner time must be distinguished from what we will call the "calendric time", which corresponds to the official external measures of time that are available to agents and users (dates, hours, etc.).
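As an illustration only (the paper defines a logic, not an implementation), the time-tree structure described above can be sketched as a small data structure in which primitive actions label the branches between world positions. All names here (WorldPosition, perform, the action labels) are assumptions of this sketch, not the author's notation:

```python
# Illustrative sketch of a time tree with a single past and a branching
# future. A primitive action uniquely determines the next world position;
# the agent's "inner time" is its path of successive world positions.

class WorldPosition:
    def __init__(self, name, date=None):
        self.name = name
        self.date = date          # link to "calendric time", may be unknown
        self.branches = {}        # primitive action -> next world position

    def perform(self, action):
        """A primitive action uniquely determines the next world position."""
        return self.branches[action]

w_t1 = WorldPosition("w1/t1", date="1997-10-20")
w_t2 = WorldPosition("w2/t2", date="1997-10-20")
w_alt = WorldPosition("w2'/t2'")              # a future the agent never takes
w_t1.branches["Creates(Peter, P.g2, active)"] = w_t2
w_t1.branches["DoNothing"] = w_alt

# The agent's inner time is the path of successive world positions:
path = [w_t1, w_t1.perform("Creates(Peter, P.g2, active)")]
print([w.name for w in path])   # ['w1/t1', 'w2/t2']
```

The branching dictionary mirrors the idea that the branches of the time tree represent the choices available to the agent at each moment.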
3 The Agent's Memory Model and Attentional Model
The agent's mental attitudes are recorded in what we call the agent's memory model. Following Fagin [4], we consider that the definition of the various mental attitudes in terms of accessibility relations between possible worlds corresponds to the characterization of an implicit knowledge that cannot be reached directly by the agent. At each world position the agent can only use the instances of mental attitudes it is aware of. Following Moore's proposal of partitioning the agent's memory into different spaces, the awareness dimension is captured by projecting an agent's current world position onto so-called knowledge domains. The projection of agent Ag's world position w_t0 onto the knowledge domain Attitude-D defines the agent's range of awareness relative to domain Attitude-D at time index t0 in world w. The agent's range of awareness is the subset of predicates contained in knowledge domain Attitude-D which characterize the particular instances of attitudes the agent is aware of at world position w_t0. The
knowledge domains that we consider in this paper are the belief domain Belief-D and the goal domain Goal-D. But an agent can also use other knowledge domains, such as the emotion domain Emotion-D, which can be partitioned into sub-domains such as Fear-D, Hope-D and Regret-D. In addition to knowledge domains, which represent an agent's explicit recognition of mental attitudes, we use domains to represent an agent's explicit recognition of relevant elements in its environment, namely the situational domain Situational-D, the propositional domain Propositional-D, the calendric domain Calendric-D and the spatial domain Spatial-D. The situational domain contains the identifiers of any relevant situation that an agent can explicitly recognize in the environment. Situations are categorized into States, Processes, Events, and other sub-categories which are relevant for processing temporal information in discourse [8, 9]: these sub-categories characterize the way an agent perceives situations. A situation is specified by three elements: a propositional description found in the propositional domain Propositional-D, temporal information found in the calendric domain Calendric-D and spatial information found in the spatial domain Spatial-D. Hence, for each situation there is a corresponding proposition in Propositional-D, a temporal interval in Calendric-D and a spatial location in Spatial-D. Propositions are expressed in a predicative form which is equivalent to conceptual graphs. The elements contained in the calendric domain are time intervals which agree with a temporal topology. The elements contained in the spatial domain are points or areas which agree with a spatial topology. Figure 2 illustrates how worlds and domains are used to model agent Peter's mental attitudes obtained after reading sentences S1 to S4 in the text of Figure 1.
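The domain structure just described can be pictured with a small sketch (mine, not the authors' notation): a world position carries ranges of awareness over knowledge domains, and each situation identifier ties a proposition to a calendric interval and a spatial location. All field names are illustrative assumptions:

```python
# Sketch of the domain structure: each situation id maps to a proposition
# (Propositional-D), an interval (Calendric-D) and a location (Spatial-D);
# a world position projects onto knowledge domains as ranges of awareness.

from dataclasses import dataclass, field

@dataclass
class Situation:
    sid: str
    category: str        # State, Process, Event, ...
    proposition: str     # key into Propositional-D
    interval: tuple      # element of Calendric-D, e.g. (start, end)
    location: str        # element of Spatial-D

@dataclass
class WorldPosition:
    world: str
    time_index: str
    # projection onto knowledge domains: domain name -> attitude ids
    awareness: dict = field(default_factory=dict)

propositional_d = {"p29": "Possess(AGNT-PERSON:Peter; OBJ-BOOK:Sowa's.Book)"}
situational_d = {
    "s1": Situation("s1", "State", "p29", (None, "Now"), "Peter's.home"),
}
w_t1 = WorldPosition("W1", "t1",
                     awareness={"Peter.Belief-D": ["P.b1", "P.b2"],
                                "Peter.Goal-D": ["P.g1"]})
# Peter's range of awareness relative to Belief-D at t1:
print(w_t1.awareness["Peter.Belief-D"])   # ['P.b1', 'P.b2']
```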
Worlds are represented by rectangles embedding circles representing time indexes, related together by segments representing possible time paths. Ovals represent knowledge domains. Curved links represent relations between world positions and elements of domains (such as Spatial-D, Calendric-D, Peter's Belief-D, etc.) or relations between elements of different domains (such as Situational-D and Spatial-D, Calendric-D or Propositional-D). After reading sentences S1 to S3, we can assume that Peter is in a world position represented by the left rectangle in Figure 2, at time index t1 in world W1. This world position is associated with a spatial localization Peter's.home in the domain Spatial-D and a date d1 in the domain Calendric-D. This information is not mentioned in the story but is necessary to structure Peter's knowledge. Time index t1 is related to beliefs P.b1 and P.b2 in Peter's Belief-D. P.b1 is related to situation s1 in Situational-D, which is in turn related to proposition p29, to location Peter's.home in Spatial-D and to a time interval [-, Now] in Calendric-D. Now is a variable which takes the value of the date associated with the current time index. Proposition p29 is expressed as a conceptual graph represented in a compact linear form: Possess (AGNT-PERSON:Peter; OBJ-BOOK:Sowa's.Book). Notice that in Calendric-D we symbolize the temporal topological properties using a time axis: only the dates d1 and d2 associated with time indexes t1 and t2 have been represented, as included in the time interval named October 20 1997.
Specification of propositions:
PROP(p29, Possess(AGNT-PERSON:Peter; OBJ-BOOK:Sowa's.Book))
PROP(p30, Read(AGNT-PERSON:Peter; OBJ-BOOK:Sowa's.Book))
PROP(p31, Buy(AGNT-PERSON:Mary; OBJ-BOOK:Sowa's.Book))
PROP(p32, Lend(AGNT-PERSON:Mary; PINT-PERSON:Peter; OBJ-BOOK:Sowa's.Book))
Figure 2: Worlds and Domains

From sentence S2 we know that Peter wants to read Sowa's book, which is represented by the link between time index t1 and the goal P.g1 in Peter's Goal-D. P.g1 is related to situation s4 in Situational-D, which is in turn related to proposition p30 in Propositional-D. P.g1 is related to a date in Calendric-D and a location in Spatial-D, but those links have not been represented completely in order to simplify the figure (only dotted segments indicate the existence of those links). In world W1, agent Peter can choose to move from time index t1 to various other time indexes, shown by different circles in the world rectangle in Figure 2. Moving from one time index to another is the result of performing an elementary operation. From our little story we can imagine that Peter wanted Mary to lend him the book: the corresponding elementary operation is the creation of a goal P.g2 with the status active. In Figure 2 this is represented by the large arrow linking the rectangles of worlds W1 and W2, on which appears the specification of the elementary operation Creates(Peter, P.g2, active). When this elementary operation is performed, agent Peter moves into a new world W2 at time index t2, associated with the spatial localization Peter's.home and date d2. Time index t2 is still related to beliefs P.b1 and P.b2 in Belief-D, but also to goals P.g1 and P.g2 in Peter's Goal-D. In an agent Ag1's mental model certain domains may represent mental attitudes of another agent Ag2: they represent the mental attitudes that Ag1 attributes to Ag2. As an example, Goal
P.g2 in Peter's Goal-D is related to Goal mg6 in Mary's Goal-D, which is contained in Peter's mental model. Goal mg6 is associated with situation s2, which is itself related to proposition p32. Beliefs and goals are formally expressed using predicates which hold for an agent Ag, a world w and a time index t, as for example Peter's beliefs and goals at time index t2:
Peter, W2, t2 |= BEL_P.b1(Peter, STATE_s1(NOT p29, [-, Now], Peter's.home))
Peter, W2, t2 |= BEL_P.b2(Peter, EVENT(p31, dx, -))
Peter, W2, t2 |= GOAL_P.g1(Peter, PROCESS_s4(p30, -, Quebec), active)
Peter, W2, t2 |= GOAL_P.g2(Peter, GOAL_mg6(Mary, PROCESS_s2(p32, -, Quebec), active), active)
The agent's memory model gathers the elements composing the agent's successive mental states. The amount of information contained in the agent's memory model may increase considerably over time, resulting in efficiency problems. This is similar to what is observed with human beings. They record "on the fly" lots of visual, auditory and tactile information, but they usually do not consciously remember this detailed information over long periods of time. They remember information they pay attention to. Similarly, in our framework, the agent's attentional model gathers some information extracted from the agent's memory model because of its importance or relevance for the agent's current activities. The attentional model is composed of a set of knowledge bases that structure the agent's knowledge and enable it to perform the appropriate reasoning, communicative and non-communicative actions. Among those knowledge bases, we consider the Belief-Space, Decision-Space, Conversational-Space and Action-Space. The Belief-Space contains a set of beliefs extracted from the memory model and a set of rules enabling the agent to reason about those beliefs. Each belief is marked by the world position that was the agent's current world position when the belief was acquired or inferred.
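A minimal sketch of the elementary operation Creates(Peter, P.g2, active) discussed above, assuming a memory model that simply maps time indexes to the attitude instances the agent is aware of there; the function name `creates` and the dictionary layout are mine, not the paper's:

```python
# Sketch: an elementary operation adds a mental attitude and moves the
# agent to a new world position; attitudes already held are carried
# forward. Attitude identifiers follow the paper's running example.

def creates(memory, time_index, new_time_index, attitude, status):
    """Copy the attitudes forward and add the newly created one."""
    attitudes = dict(memory.get(time_index, {}))
    attitudes[attitude] = status
    memory[new_time_index] = attitudes
    return memory

memory = {"t1": {"P.b1": "held", "P.b2": "held", "P.g1": "active"}}
creates(memory, "t1", "t2", "P.g2", "active")
print(sorted(memory["t2"]))   # ['P.b1', 'P.b2', 'P.g1', 'P.g2']
```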
The Decision-Space contains a set of goals extracted from the memory model and a set of rules enabling the agent to reason about those goals. Each goal is marked by the world position that was the agent's current world position when the goal was acquired or inferred. The Conversational-Space models the agents' verbal interactions in terms of exchanges of mental attitudes and of the agents' positionings relative to these mental attitudes. In our approach a conversation is thought of as a negotiation game in which agents negotiate about the mental attitudes that they present to their interlocutors: they propose certain mental attitudes and other locutors react to those proposals, accepting or rejecting the proposed attitudes, asking for further information or justification, etc. [12]. The Action-Space records all the communicative, non-communicative and inference actions that are performed by the agent. Details and examples about the attentional model can be found in [11].
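The memory/attentional split can be sketched as a simple filter: the attentional model keeps only the attitudes judged relevant, each marked with the world position at which it was acquired. The relevance test and all names here are stand-ins, not the paper's mechanism:

```python
# Sketch of building a Belief-Space from the memory model: keep only
# relevant beliefs, each marked with its acquisition world position.

def build_belief_space(memory_model, relevant):
    """memory_model: list of (attitude_id, world_position) pairs."""
    return [(a, wp) for (a, wp) in memory_model if a in relevant]

memory_model = [("P.b1", "w1/t1"), ("P.b2", "w1/t1"), ("P.b7", "w2/t2")]
belief_space = build_belief_space(memory_model, {"P.b1", "P.b7"})
print(belief_space)   # [('P.b1', 'w1/t1'), ('P.b7', 'w2/t2')]
```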
4 Conclusion
This paper is a contribution to the debate about the notion of context which has taken place for several years in the CG community. Whereas our temporal model [8, 9, 10] presented a static representation of the various contexts of utterance found in a
discourse, the present approach considers that the context is built up from the accumulation of knowledge in the various knowledge bases (the various spaces of the attentional model) that compose the agent's mental model. The proposed logical framework provides the temporal model of discourse with a semantics that can be practically implemented in an agent system. However, a comparison of this framework with other approaches to context modeling would deserve another entire paper.
References
1. Cohen P. R. & Levesque H. J. (1990), Rational Interaction as the Basis for Communication, in Cohen P. R., Morgan J. & Pollack M. E. (eds.) (1990), Intentions in Communication, MIT Press, 221-255.
2. Duchan J. F., Bruder G. A. & Hewitt L. E. (1995), Deixis in Narrative, Hillsdale: Lawrence Erlbaum Ass.
3. Emerson E. A. (1990), Temporal and modal logic, in van Leeuwen J. (ed.), Handbook of Theoretical Computer Science, North Holland, Amsterdam, NL.
4. Fagin R., Halpern J. Y., Moses Y. & Vardi M. Y. (1996), Reasoning about Knowledge, MIT Press.
5. Haddadi A. (1995), Communication and Cooperation in Agent Systems, Springer Verlag, Lecture Notes in AI n. 1056.
6. Kripke S. (1963), Semantical considerations on modal logic, Acta Philosophica Fennica, vol. 16, 83-89.
7. Moore R. C. (1995), Logic and Representation, CSLI Lecture Notes, n. 39.
8. Moulin B. (1992), A conceptual graph approach for representing temporal information in discourse, Knowledge-Based Systems, vol. 5, n. 3, 183-192.
9. Moulin B. (1993), The representation of linguistic information in an approach used for modelling temporal knowledge in discourses, in Mineau G. W., Moulin B. & Sowa J. F. (eds.), Conceptual Graphs for Knowledge Representation, Lecture Notes in Artificial Intelligence, Springer Verlag, 182-204.
10. Moulin B. (1997), Temporal contexts for discourse representation: an extension of the conceptual graph approach, Journal of Applied Intelligence, vol. 7, n. 3, 227-255.
11. Moulin B. (1998), A logical framework for modeling a discourse from the point of view of the agents involved in it, Res. Rep. DIUL-RR 98-03, Laval Univ., 16 p.
12. Moulin B., Rousseau D. & Lapalme G. (1994), A Multi-Agent Approach for Modelling Conversations, Proc. of the International Conference on Artificial Intelligence and Natural Language, Paris, 35-50.
13. Rao A. S. & Georgeff M. P. (1991), Modeling rational agents within a BDI architecture, in Proceedings of the KR'91 Conference, Cambridge, Mass., 473-484.
14. Segal E. M. (1995a), Narrative comprehension and the role of Deictic Shift Theory, in [2], 3-17.
15. Segal E. M. (1995b), A cognitive-phenomenological theory of fictional narrative, in [2], 61-78.
16. Sowa J. F. (1984), Conceptual Structures. Reading, Mass.: Addison Wesley.
Computational Processing of Verbal Polysemy with Conceptual Structures
Karim Chibout and Anne Vilnat
Language and Cognition Group, LIMSI-CNRS, B.P. 133, 91403 Orsay cedex, France.
{chibout, vilnat}@limsi.fr
Abstract. Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we are interested in an adequate modelling of verb descriptions that allows the interpretation of semantic incoherences due to verbal polysemy. The main goal is to realise a module which is able to detect and deal with figurative meanings. To this end, we first propose a lexico-semantic knowledge base; then we present the processes that determine the different meanings which may be associated with a given predicate, and discriminate among these meanings for a given sentence. Each verb is defined by a basic action (its supertype) specified by case relations (its definition graph), that is, the object, mean, manner, goal and/or result relations which distinguish the described verb meaning from the specified basic action. This description is recursive: the basic actions are in turn defined using more general actions. To help interpret the different meanings conveyed by a verb and its hyponyms, we have determined three major types of heuristics, consisting of searching the type lattice and/or examining the associated definitions.
1 Introduction
Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we are interested in an adequate modelling of verb descriptions that allows the interpretation of semantic incoherences due to verbal polysemy. The main goal is to realise a module which is able to detect and deal with figurative meanings. We studied two complementary thrusts to model polysemy: (a) lexico-semantic knowledge representation, (b) processes to interpret the different meanings which may be associated with a given predicate, and to discriminate among these meanings for a given sentence. After an outline of some metaphor examples that justified our approach, we will present the links between verbs within a lexical network and the semantic structure associated with each of them. We will finally define the model of polysemy derived from this representation formalism, in particular the selection rules within the network used to process figurative meanings of verbs.
2 Polysemy, metaphors and figurative meanings
Polysemy and homonymy must be clearly distinguished: two homonyms only share the same orthography, whereas two polysemes share semantic elements. In the case of homonymy, the solution is to create as many concepts as necessary. In the case of polysemy, on the other hand, the different senses of a word are mostly figurative meanings and derive from its core meaning. Thus, it is necessary to be able to determine this core meaning and the derivation rules involved in the elaboration of these senses. The following examples illustrate this phenomenon. Lexical metaphors are created by selecting specific semantic features of words. In the following example, both nouns share the /spherical/ feature. 1) La terre est une orange. (Earth is an orange.)¹ The characteristic feature used in the comparison may also be taken in a figurative meaning, and the phrase is twice non-literal. 2) Sally is a block of ice. The feature /cold/ of the ice (literal meaning) is assimilated to the feature /cold/ of human character (figurative meaning). This characteristic may be true or assumed (in the beliefs): 3) John is a gorilla, to signify a man with violent, brutal manners; whereas the ethologist will insist on the peaceful manners of this animal. The same metaphor may convey multiple meanings depending on the context, by selecting different semantic features. In 3), gorilla may also mean /hairy/ or even /(physically) strong/. Interpreting a figure of speech consists in selecting one of the semantic characteristics among those which define the metaphoric term. When multiple meanings are possible, they are semantically related as they come from a unique representation. Polysemy is also considered as an in-context (re-)building process from a unique representation. We pay most attention to the resolution of semantic incoherences due to verbal polysemy. Analysing the multiple meanings of French verbs allows us to make precise the semantic representations from which they are elaborated.
4) Les vagues couraient jusqu'aux rochers. (The waves ran towards the rocks.) The verbal metaphor in 4) may be interpreted by replacing to run with to move quickly, both of which are implied by the semantic description of this verb. Roughly speaking, the semantic features associated with to run are: /to move/ + /with feet/ + /quickly/ ... The hyperonym is not the only part of the meaning which is used to build a figurative meaning. 5) La robe étrangle la taille de la jeune fille. (The dress constricts the young girl's waist, with the French verb étrangler, which literally means to strangle, translating the notion conveyed by to constrict.) In this example to strangle is defined as "to suffocate by grasping the neck"; the semantic incoherence cannot be resolved by the hyperonym but by the selection of the feature /grasp/ (a synonym of to constrict), which specifies the method used to do the action.

¹ In this paper, the examples illustrating polysemy will generally be given in French, with an English translation, to maintain their polysemous nature, which would probably be lost in English. Not being a native English speaker, it was difficult to always find analogous examples and to be sure that they convey the same senses.

To find
the different meanings of a predicate, always semantically related, it is necessary to select its hyperonym, or to extract another part of its definition, or to combine both mechanisms (see for example the interpretation of to run in 4)). These polysemous behaviours lead us to propose a hierarchical representation of the concepts, completed for each concept by a precise semantic description expressing the semantic features that differentiate it from its father (its direct hyperonym) and from its brothers in the hierarchy. This representation is implemented in the conceptual graph formalism ?: the hierarchy takes place in the type lattice, and the semantic descriptions are translated into type definitions.
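The two interpretation mechanisms just described (selecting the hyperonym, or extracting another part of the definition) can be sketched as follows. The feature sets and the `interpret` function are invented for this illustration and are not the authors' algorithm; the animacy test is a crude stand-in for real incoherence detection:

```python
# Sketch of the interpretation strategy in examples 4) and 5): resolve a
# semantic incoherence either by climbing to the hyperonym or by keeping
# only the definition features compatible with the subject.

verb_defs = {
    "to run":      {"hyperonym": "to move", "features": ["with feet", "quickly"]},
    "to strangle": {"hyperonym": "to suffocate", "features": ["grasp", "neck"]},
}

def interpret(verb, subject_is_animate, compatible):
    """Return a literal reading, or a figurative reconstruction."""
    d = verb_defs[verb]
    if subject_is_animate:
        return verb                      # literal meaning is coherent
    # figurative: hyperonym plus any feature compatible with the subject
    kept = [f for f in d["features"] if f in compatible]
    return d["hyperonym"] + (" + " + " + ".join(kept) if kept else "")

# "The waves ran": waves cannot literally run, but they can move quickly.
print(interpret("to run", False, {"quickly"}))   # 'to move + quickly'
```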
3 Ontology, definitions and conceptual graphs
From the preceding study, it is clear that the best suited tool to build the ontology is a dictionary, from which the different meaning components can be obtained. The entries constitute rather precise descriptions, generally including the hyperonym and referring to verbs whose meanings are related to the one being defined. The different relations (such as mean, manner, goal, ...) that specify the verb with respect to its hyperonym are also given. Yet, the definitions in a dictionary are not always homogeneous in structure and content ?, ?. Thus it is impossible to rely only on a dictionary to build our network. The method used to organise verbal concepts has been divided into two steps. First, we analysed about a hundred French verbs. These concepts have been categorised in detail following precise criteria, such as a systematic definition of the kind of relations between the verb and its semantic features. The representation of each lexical item is called a conceptual schema, and corresponds to an enhanced dictionary definition. Each verbal concept is given in terms of its nearest hyperonym (the central event) and some semantic cases which specify it. In addition to the classical case relations (agent, object, mean, ...), four cases are essential for a complete verb description: the manner in which an event is realised, the method used to realise it, the result of the event and its intrinsic goal. For example, the verbs to cut and to cover have the following semantic descriptions:
to cut: to divide (nearest hyperonym) a solid object (object) into several pieces (result) using an edge tool (mean) by going through the object (method)
to cover: to place (nearest hyperonym) something (object) over something else (support) in order to hide it (goal1) or protect it (goal2).
The verb hierarchy is organised following the associated case relations.
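The two verb descriptions above can be encoded, for illustration, as simple conceptual schemas; the dictionary layout is an assumption of this sketch, not the authors' implementation:

```python
# The conceptual schemas of "to cut" and "to cover" as plain dictionaries:
# nearest hyperonym plus the case relations that specify it.

schemas = {
    "to cut": {
        "hyperonym": "to divide",
        "object": "solid object",
        "result": "several pieces",
        "mean": "edge tool",
        "method": "going through the object",
    },
    "to cover": {
        "hyperonym": "to place",
        "object": "something",
        "support": "something else",
        "goal": ["to hide", "to protect"],   # multiple goal values
    },
}
print(schemas["to cut"]["hyperonym"])   # 'to divide'
```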
A verbal concept is hyperonym (respectively hyponym) of another one if they share a common hyperonym and if the case structure of this verb presents one of the following features: (a) lack (respectively presence) of a value defined for a given case; for example, to divide is hyperonym of to cut, because the definition of to cut includes a mean and a method; (b) presence of a case with multiple values (respectively a unique value); for example, to cover is hyperonym of to plate and to veil, which each include only one of the possible goals (respectively to protect and to hide); (c) presence of a case with a generic value (respectively a specific value); for example, to decapitate is hyperonym of to guillotine: the mean case of the first (edge tool) is a super-type of the one of the second (guillotine). This bottom-up building of the hierarchy allows us to define some large semantic classes. Thus, we determine about fifteen primitives appearing to be similar to the classical case relations, and mostly corresponding to state verbs: Owner (ex: to own), Location (ex: to live in), Container (ex: to contain), Support (ex: to support), Patient (ex: to suffer), Time (ex: to exist), Experiencer (ex: to know), ... At the
Fig. 1. Extract of the hierarchy (a) and description of to feed: (b) canonical graph and (c) type definition graph
higher level of the hierarchy, the process verbs and the action verbs derive from these state primitives. So processes are expressed as Devenir (become)/Cesser (cease) + primitive, and action verbs as FaireDevenir/FaireCesser + primitive:
- DevenirContenant (to fill (up), to eat, to drink), DevenirTemps (to be born), DevenirExpérienceur (to learn)
- CesserContenant (to (become) empty), CesserTemps (to die), CesserExpérienceur (to forget)
- FaireDevenirContenant (to fill, to water), FaireDevenirTemps (to create, to give birth to, to calve), FaireDevenirExpérienceur (to teach)
- FaireCesserContenant (to empty), FaireCesserTemps (to interrupt, to kill)
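The compositional naming scheme (operator + primitive) lends itself to a direct sketch. The operators, primitives and word/concept associations below come from the text; the encoding itself is ours:

```python
# Sketch of the operator + primitive naming scheme; the operators and
# primitives come from the text, the encoding is ours.
PRIMITIVES = ["Contenant", "Temps", "Experienceur"]
OPERATORS = ["Devenir", "Cesser", "FaireDevenir", "FaireCesser"]

# Every process or action concept is an operator applied to a primitive.
concepts = [op + prim for op in OPERATORS for prim in PRIMITIVES]

# Word/concept associations taken from the examples above:
lexicon = {"to die": "CesserTemps",
           "to teach": "FaireDevenirExperienceur",
           "to interrupt": "FaireCesserTemps"}

print(all(c in concepts for c in lexicon.values()))  # True
```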
Our study has covered about 2000 French verbs. From these we have constituted a large lexico-semantic network corresponding to more than 1000 verbal concepts,
organised inside a hierarchy (see Fig. 1). For each node, the description consists of: a sub-categorisation frame, a case structure (represented in a canonical graph) and a definition (represented in a definition graph) describing how it specifies its hyperonym (i.e. its super-type in the hierarchy) (see Fig. 1). The association word/concept is defined in a global table. These descriptions have allowed us to build the lexico-semantic knowledge base used by the Natural Language Processing system developed at LIMSI 8. The definition graphs have been built following the method described in the first part of this section. We thus make sure that the definitions are not ad-hoc definitions following a top-down building of the hierarchy, but respect the meanings of the words they represent. However, at this time we cannot verify that the constraints due to the inheritance mechanism are satisfied: this constitutes the next step of our work.

4 Verbal polysemy: elements of interpretation
As we have said, we built the hierarchy presented above in order to interpret the lexical polysemy of French verbs. We assume that there exists a (proto)typical meaning associated to each verb. The prototype is the most often attested meaning. The so-called figurative meanings derive from this prototypical meaning. These figurative meanings are related to the semantic structure of the verb by taking into account either the nearest hyperonym (as in example 4), or other parts of the conceptual schema (as in example 5). But other processes are necessary to interpret metaphoric senses, based on primitive substitutions. Let us consider the interpretation of the following example:
6) La fermière nourrit le feu avec des branchages. (The farmer's wife feeds the fire with lopped branches (to keep the fire going))
The constraints expressed in the canonical graph of to feed are violated. But the definition graph (see Fig. 1) includes the particular meaning of this word (in example 6): it is given in the sub-graph corresponding to to maintain (in French entretenir, which means to keep going in this context). The semantic constraints expressed in the canonical graph of to maintain are then fulfilled (particularly for the type of the object fire). Selecting parts of the semantic representation of lexical items is not the only process used to build figurative meanings; nevertheless, as we show below, it is a necessary condition to be able to elaborate these meanings. Some figurative meanings rely on conceptual metaphors 2. Thus, nourrir quelqu'un de ragots (to feed someone with ill-natured gossip), abreuver quelqu'un de connaissances (to swamp someone with knowledge) may be interpreted from the metaphor Mind is Container. Such conceptual metaphors are interesting because they are rather general: they apply not only to a given verb, but to a whole class of verbs.
Thus, gaver de connaissances (to fill one's mind with knowledge), se nourrir de lectures (to improve the mind with reading), dévorer un livre (to devour a book) will be interpreted using the metaphor Mind is Container. These interpretation processes do not invalidate our model for polysemy. The semantic representation
we propose is based on a hierarchy, so it is possible to go up to the semantic primitives on which verbal items depend. The extracts of the type lattice in Fig. 2 clarify the way the interpretation is done. Semantic primitives play a central role in interpreting figurative meanings due to conceptual metaphor. Thus, all the metaphoric uses, such as abreuver quelqu'un de connaissances (to swamp someone with knowledge), for which the analogy Mind is Container is relevant, will be interpreted by substituting the primitive EXPERIENCER for the primitive CONTAINER. From the metaphoric verb, the semantic class corresponding to the primitive is reached.

Fig. 2. Interpretation examples

Then the substitution rule (Container -> Experiencer) is applied, and the pertinent concept (FaireApprendre for Abreuver (to water) and Lire (to read) for Dévorer (to devour)) is searched for in this branch using canonical graphs. The searched concept must be described with a canonical graph expressing constraints that are verified by the arguments of the metaphoric sentence (in our examples knowledge and book). The most difficult point is due to the fact that canonical graphs do not always discriminate the pertinent concept from the others belonging to the same branch (for example FaireApprendre, Inculquer (to inculcate), Enseigner (to teach), etc. have identical canonical graphs). This technical difficulty does not invalidate the model proposed for conventional metaphoric senses. Conceptual metaphors are not applied casually; it is the fact that a verbal item is related to a given primitive in its semantic structure which justifies the use of this or that metaphor. There is no "spontaneous generation" of meaning; verbs contain in their conceptual schemas the entry points towards other meanings. The following examples illustrate the classic metaphor SPACE -> TIME:
7) couper la parole (to interrupt someone): interrompre (to interrupt)
8) briser une conversation (to break off a conversation): interrompre brusquement (to suddenly interrupt)
9) entrecouper ses phrases de sanglots (to interrupt one's phrases with sobs): interrompre fréquemment (to frequently interrupt)
All the verbs couper, briser, entrecouper depend on the class FaireCesserEspace (expressing spatial discontinuity). Interpreting the metaphors consists in substituting the class FaireCesserTemps (temporal discontinuity) for this class. For the last two examples, the adverbs attached in the interpretations belong to the definitions of the verbs (briser: to break suddenly ...; entrecouper: to cut frequently ...). About ten substitution rules of this kind have been determined. They are probably not exhaustive, but have nevertheless been tested on 1000 verbs. Moreover, the fact that common meanings are shared by a class of verbs (regrouped because they belong to a same branch) partially validates our ontology. Such figurative meaning interpretations are rather poor. The following example illustrates this point.
10) Le paysan coupait souvent par le champ de blé. (The farmer often cut through the wheatfield)
Polysemy is solved by selecting the case method, i.e. to go through, associated to cut in its definition graph (see Sect. ??). But replacing to cut by to go through loses information: the implicit notion of spatial reduction (or more exactly moving distance reduction) conveyed by to cut in this sentence is lost. A more precise interpretation would need to determine which meaning parts of the metaphoric verb have to be transferred to the inferred meaning.
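The substitution step just described can be sketched as a table of rules applied to a verb's primitive class. The class and rule names follow the text; the tiny lexicon and the function are our own hypothetical reconstruction, not the authors' system:

```python
# Illustrative sketch of the substitution step: rules map a verb's
# literal primitive class to the class searched for the figurative
# reading. Class and rule names follow the text; the lexicon and the
# function are hypothetical.
SUBSTITUTIONS = {"Contenant": "Experienceur",              # Mind is Container
                 "FaireCesserEspace": "FaireCesserTemps"}  # Space -> Time

VERB_CLASS = {"couper": "FaireCesserEspace",   # to cut
              "briser": "FaireCesserEspace",   # to smash
              "abreuver": "Contenant"}         # to water

def metaphoric_class(verb):
    """Go up to the verb's primitive class, then apply a substitution
    rule; the pertinent concept would then be searched in that branch
    using canonical graphs (not modelled here)."""
    base = VERB_CLASS[verb]
    return SUBSTITUTIONS.get(base, base)

print(metaphoric_class("couper"))    # FaireCesserTemps
print(metaphoric_class("abreuver"))  # Experienceur
```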
5 Related works
Lexical classifications based on semantic criteria are relatively numerous in artificial intelligence. Semantic nets are the main lexical knowledge representation modes, but they essentially concern nouns. However, Levin 3 proposes a summary classification of English verbs founded on syntactico-semantic criteria. The hierarchy lacks accuracy: Levin gets main classes (occasionally sub-classes) in which verbs are listed; but on no account are there semantic relations defined between verbs within a same class (or sub-class). Verbs within a same class are assimilated to syntactic synonyms; therefore, from a semantic point of view, these verbs may convey rather distant meanings. Some authors try to build precise semantic hierarchies of English verbs on a large scale 5. They emphasise the complexity of this task, notably due to the different semantic fields implied in relations between verbs having connected meanings. Miller and his collaborators determine a particularisation relation allowing the different semantic components that distinguish a verb from its hyperonym to be grouped. This relation between two verbs V1 and V2 (V1 hyponym of V2) is named troponymy and is expressed by the formula "to accomplish V1 is to accomplish V2 in a particular manner". For example, battle, war, tourney, duel, ... are troponyms of the verbal predicate fight. Troponyms of communication verbs imply the speaker's intention or his motivation to communicate, as in examine, confess, preach, ..., or the communication
media used: fax, email, phone, telex, .... Relying on these principles, they implemented a large lexical network (WordNet) that organises verbs and other lexical categories in terms of signified. Our work is in part inspired by their approach, but we systematically specify the kind of relation between troponym concepts (case relations).
6 Conclusion
We have presented linguistic and computational aspects of verbal polysemy. The figurative meanings of a predicate represent as many shifts in meaning from the unique semantic structure that defines this verb. The resolution processes consist in a simple selection of the pertinent elements of this representation (i.e. the pertinent subgraph of the definition graph), or in the same selection plus an inference from one of these elements (conceptual metaphors). Our knowledge base has been implemented in conceptual graphs but still needs to be verified concerning possible incoherence due to the inheritance mechanism. The psychological validity of the representation and of the treatment we propose is also being tested in an experiment. We do not claim that our work is exhaustive, either in the semantic representation proposed or in the processes for understanding the different acceptions. We define a general framework in which the links between the multiple senses of a verb may be expressed. Polysemy is one of the major characteristics of natural language; it is also one of the most complex to apprehend. Moreover, like the other contextual phenomena (anaphora, the implicit, etc.), polysemy is one of the major difficulties in Natural Language Processing.
References
1. Chibout, K., Masson, N.: Un réseau lexico-sémantique de verbes construit à partir du dictionnaire pour le traitement informatique du français. Actes du colloque LTT-AUPELF-UREF Lexicomatique et Dictionnairique, Lyon, Septembre 1995.
2. Lakoff, G., Johnson, M.: Les métaphores dans la vie quotidienne. Collection "Propositions", les éditions de Minuit (1985).
3. Levin, B.: English Verb Classes and Alternations. University of Chicago Press, 1993.
4. Martin, R.: Pour une logique du sens. Linguistique nouvelle. Presses Universitaires de France (1983).
5. Miller, G. A., Fellbaum, C., Gross, D.: WORDNET: a Lexical Database Organised on Psycholinguistic Principles. In Zernik (Ed.): Proceedings of the First International Lexical Acquisition Workshop, I.J.C.A.I., Détroit (1989).
6. Searle, J.R.: Metaphor. In Andrew Ortony (Ed.): Metaphor and Thought. Cambridge University Press (1979), pp. 284-324.
7. Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts (1984).
8. Vapillon, J., Briffault, X., Sabah, G., Chibout, K.: An object oriented linguistic engineering using LFG and CG. ACL/EACL Workshop: Computational Environments for Grammar Development and Linguistic Engineering, Madrid (1997).
Word Graphs: The Second Set

C. Hoede¹ and X. Liu²

¹ University of Twente, Faculty of Mathematical Sciences, P.O. Box 217, 7500 AE Enschede, The Netherlands. c.hoede@math.utwente.nl
² Department of Applied Mathematics, Northwestern Polytechnical University, 710072 Xi'an, P.R. China
Abstract. In continuation of the paper of Hoede and Li on word graphs for a set of prepositions, word graphs are given for adjectives, adverbs and Chinese classifier words. It is argued that these three classes of words belong to a general class of words that may be called adwords. These words express the fact that certain graphs may be brought into connection with graphs that describe the important classes of nouns and verbs. Some subclasses of adwords are discussed as well as some subclasses of Chinese classifier words.
Key words: knowledge graphs, word graphs, adjectives, adverbs, classifiers.
AMS Subject Classifications: 05C99, 68F99.
1 Introduction
We refer to the paper of Hoede and Li 1 for an introduction to knowledge graphs as far as needed for this paper. We only recall the following. Words are considered to be representable by directed labeled graphs. The vertices, or tokens, are indicated by squares and represent somethings. The arcs have certain types that are considered to represent the relationships between somethings as recognizable by the mind. The graphs that we will discuss are therefore considered to be subgraphs of a huge mind graph, representing the knowledge of a mind and therefore also called knowledge graph. These knowledge graphs are very similar to conceptual graphs, but are restricted as far as the number of types of relationship is concerned. There are two types of relationships. The binary relationships, the usual arcs, may have the following labels:
EQU : Identity
SUB : Inclusional part-ofness
ALI : Alikeness
DIS : Disparateness
CAU : Causality
ORD : Ordering
PAR : Attribution
SKO : Informational dependency
The SKO-relationship is used as a loop to represent universal quantification. Next to the binary relationships there are the n-ary frame-relations. There are four of these:
FPAR : Relationship of constituting elements with a concept, being a subgraph of the mind graph
NEGPAR : Negation of a certain subgraph
POSPAR : Possibility of a certain subgraph
NECPAR : Necessity of a certain subgraph
These four frame relationships generalize the well-known logical operators. If a certain subgraph of the mind graph is the representation of a well-formed proposition p, this proposition is represented by the frame; ¬p is represented by the same subgraph framed with the NEGPAR relationship, and the modal propositions ◊p and □p are represented by the same subgraph framed with the POSPAR and the NECPAR relationship respectively. In this way logical systems can be represented by different types of frames of very specific subgraphs. We refer to Van den Berg 2 for a knowledge graph treatment of logical systems. So logic is described by frames of propositions. If a subgraph of the mind graph does not correspond to a proposition, the framing, and the representation of the frame by a token, may still take place. Any such frame may be baptized, i.e. labeled with a word. The directed ALI-relationship is used between a word and the token to type the token. Thus

□ --ALI--> STONE
is to be read as "something like a stone". Note that the token may represent a large subgraph of the mind graph. In particular verbs may have large frame contents. Verbs are represented in the same way. So

□ --ALI--> HIT

is the way the verb HIT is represented. The directed EQU-relationship is used between a word and a token to valuate or instantiate the token. So

PLUTO --EQU--> □
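As a rough illustration of the machinery above, a mind graph can be held as a list of labeled arcs between tokens. Only the relation labels and the two examples come from the text; the data layout is ours:

```python
# Minimal sketch of a mind graph as labeled arcs between tokens; only
# the relation labels come from the text, the layout is ours.
BINARY = {"EQU", "SUB", "ALI", "DIS", "CAU", "ORD", "PAR", "SKO"}

graph = []  # arcs: (source, label, target)

def add_arc(src, label, dst):
    assert label in BINARY, f"unknown arc label: {label}"
    graph.append((src, label, dst))

# "something like a stone": token t1 typed by the word STONE via ALI
add_arc("t1", "ALI", "STONE")
# instantiation: a token t2 valuated with the name PLUTO via EQU
add_arc("PLUTO", "EQU", "t2")

print(len(graph))  # 2
```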
C'1 = <{i1, i2}, {g1, g2}>
C'2 = <{i1, i2}, {g1, g2}>
C'3 = <{i3}, {g3}>
For C'1 and C'2 the same formal context is produced.
3. Calculate the context lattice
Each formal context should occur only once, so redundant formal contexts (i.e. the above C'2) must be removed. Furthermore, in order to create a lattice, we must also add a formal context including all extension graphs, as well as a context including all intention graphs. After renumbering the formal contexts, the resulting context lattice is as represented in Fig. 1:
C = <{i1, i2, i3}, {}>
Fig. 1. The Context Lattice for the Example
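Steps 2-3 of the construction (collapse duplicate formal contexts, then add the top and bottom of the lattice) can be replayed in a few lines. The graph identifiers follow the example; the encoding as frozenset pairs is our own sketch:

```python
# Replay of steps 2-3: collapse duplicate formal contexts, then add
# the top (all extension graphs) and bottom (all intention graphs).
# A context is an <intention, extension> pair.
raw = [
    (frozenset({"i1", "i2"}), frozenset({"g1", "g2"})),  # from C1
    (frozenset({"i1", "i2"}), frozenset({"g1", "g2"})),  # from C2, redundant
    (frozenset({"i3"}), frozenset({"g3"})),
]
contexts = set(raw)  # duplicates collapse automatically

all_i = frozenset().union(*(i for i, _ in contexts))
all_g = frozenset().union(*(g for _, g in contexts))
contexts.add((frozenset(), all_g))  # top: all extension graphs
contexts.add((all_i, frozenset()))  # bottom: all intention graphs

print(len(contexts))  # 4
```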
4 Structuring Composition Norm Management
On top of the context lattice structure, a mechanism that allows for its efficient querying has been defined 10. One of its main advantages is that complex queries consisting of sequences of steps and involving both intentions and extensions can be formulated. This mechanism can be used for a structured yet flexible approach to norm management in specification discourse, by automatically determining which users to involve in discussions about specifications and changes, and what they are to discuss (their agendas). There are two main ways in which a context lattice can be used in a user-driven specification process. First, it can be used to assess whether a new knowledge definition is legitimate by checking if a particular specification process is covered by some composition norm. As this is a relatively simple task of projecting the specification process on the composition norm base, we do not work out this application here. The second application of context lattices in the specification process is applying (new) specification constraints (constraints on the relations that hold between different knowledge definitions) to the existing (type, norm, and state) knowledge bases. This differs from the first application as, after a constraint has been applied, originally legitimate knowledge base definitions may become illegitimate, and would then need to be respecified. In this section we will first give a brief summary of how context lattices can be queried. Then, it will be illustrated how this query mechanism can play a role in composition norm management, by applying it to our example in the resolution of one realistic specification constraint.
4.1 Querying Concept Lattices
In order to make series of consecutive queries where the result of one query is the input for the embedding query, which is needed for navigating a context lattice, we need two more constructs. First, we need to be able to query a particular context extension or intention. Second, we must be able to identify the context which matches the result of an extension- or intention-directed query. For the first purpose, two query functions have been defined that allow respectively extension or intention graphs to be retrieved from a specified formal context 10:
σE*(C*, q) = {g ∈ E(C*) | g ≤ q}   (3)
σI*(C*, q) = {g ∈ I(C*) | g ≤ q}   (4)
Furthermore, Mineau and Gerbé have constructed two context-identifying functions:

C_E(G) = <I*(G), E*(I*(G))>   (5)
C_I(T) = <I*(E*(T)), E*(T)>   (6)
Space does not permit describing the inner workings of these functions in detail (see 10 for further explanation). Right now, it suffices to understand that these functions allow the most specific context related to respectively a set of extension graphs G or a set of intention graphs T to be found. Together, these functions can be used to produce embedded queries by alternately querying and identifying contexts, thus enabling navigation through the context lattice.
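Assuming g ≤ q reads "graph g is an answer to query q", and mocking graph subsumption as label-set containment purely for illustration, the two query functions (3) and (4) can be sketched as:

```python
# Sketch of query functions (3) and (4); a context is an
# <intention, extension> pair of graph sets, and subsumption (g <= q)
# is mocked as label-set containment for illustration only.
def subsumed(g, q):
    return q <= g  # g answers q if it carries all of q's labels

def sigma_E(context, q):
    _, extension = context
    return {g for g in extension if subsumed(g, q)}

def sigma_I(context, q):
    intention, _ = context
    return {i for i in intention if subsumed(i, q)}

ctx = (frozenset({frozenset({"ACTOR", "SPECIFY"})}),
       frozenset({frozenset({"INFO_TOOL", "MAILING_LIST"}),
                  frozenset({"PERM_ACTION"})}))

print(sigma_E(ctx, frozenset({"INFO_TOOL"})))  # the mailing-list graph
```

A real implementation would replace `subsumed` with conceptual graph projection.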
4.2 Supporting the Specification Process
In the applications of context lattices, discussed in the previous section, the following general steps apply: 1) Check either the specification of a new knowledge definition against the composition norm base or the specifications of an existing knowledge base against a specification constraint. 2) Identify the resulting illegitimate knowledge definition(s). 3) Identify appropriate 'remedial composition norms' (i.e. composition norms in which the illegitimate knowledge definition is in the extension) 4) Build discourse agendas (overviews of the specifications to discuss) for the users identified by those remedial composition norms, so that they can start resolving the illegitimate definitions. These processes consist of sequences of queries that switch their focus between what is being defined and who is defining. For this purpose, the functions provided by context lattice theory are concise and powerful, at least from a conceptual point of view. One way in which we can apply context lattices is by formulating specification constraints, which constrain possible specifications and can be expressed as (sequences of) composition norm queries. Note that the example of the resolution of a specification constraint presented next is simple, and the translation into context lattice queries is not yet very elegant. However, what we try to present here is the general idea that flattening queries using context lattices is a powerful tool for simplifying and helping to understand queries with respect to the contexts where they apply. In future work, we aim to develop a more standardized approach that can apply to different situations.
4.3 An Example
One specification constraint could be: "Only actors involved in the definition of permitted actions are to be involved in the definition of (the functionality of) information tools". The constraint guarantees that enabling technical functionality is defined only by those who are also involved in defining the use that is made of at least some of these tools. This specification constraint, and much more complex ones, can be helpful to realize more user-driven specification, tailored to the unique characteristics of a professional community. The power of the approach developed in this paper is that it allows such constraints to be easily checked against any existing norm base, identifying (now) illegitimate knowledge definitions, and providing the contextual information necessary for their resolution. We will illustrate these rather abstract notions by translating the above mentioned informal specification constraint into a concrete sequence of composition norm queries. Decomposing the specification constraint, we must answer the following questions:
1. Which actors control the specification of which information tools?
2. Are there illegitimate composition norms (because some of these norm actors are not also involved in the specification of any permitted actions)?
3. Which actors are to respecify these illegitimate norms, on the basis of what agendas?
Questions 1-3 can be decomposed into the following steps (this decomposition is not trivial; in future research we aim at providing guidelines to achieve it):
1a. Determine which specializations gj of information tools have been defined. The query s1 should start at the top of the context lattice, as this context contains all extension graphs:
s1 = σE*(C_top, q1) = {g1, g2} = {MAILING_LIST, PRIVATE_MAILING_LIST}
where q1 = [INFO_TOOL]
1b. For each of these information tools gj, determine which actors ai control its specification:
s2 = σI*(C_E(g1), q2) = {i1, i2}
s3 = σI*(C_E(g2), q2) = {i1, i2}
where q2 = [ACTOR: ?] <- [SPECIFY] -> (rslt) -> [TYPE]
and a2 = a3 = {ACTOR, LIST_OWNER}
2a. Determine which actors ai are involved in the specification of permitted actions. This query should be directed toward the bottom of the context lattice, as this context contains all intention graphs (which in turn include the desired actor concepts):
s4 = σI*(C_bottom, q4) = {i2}
where q4 = [ACTOR: ?] <- [SPECIFY] -> (obj) -> [PERM_ACTION]
and a4 = {LIST_OWNER}
2b. Using a4, determine, for each type of information tool gj (see 1a), its corresponding si and actors ai (see 1b), which actors a'i currently illegitimately control its specification process:
g1: MAILING_LIST: a'2 = a2 - (a2 ∩ a4) = {ACTOR}
g2: PRIVATE_MAILING_LIST: a'3 = a3 - (a3 ∩ a4) = {ACTOR}
2c. For each tool identified by the gj having illegitimate controlling actors a'i, define the illegitimate composition norms c'k = <i't, gj> by selecting from the si from 1b those i't which contain a'i:
c'1 = <i1, g1>
c'2 = <i1, g2>
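The set algebra of steps 1b-2b can be replayed directly. The values of a2, a3 and the results are those stated in the example; a4 is inferred from the stated results of step 2b:

```python
# The set computations of steps 1b-2b, spelled out. a2, a3 and the
# results follow the example; a4 is inferred from those results.
a2 = {"ACTOR", "LIST_OWNER"}  # actors controlling MAILING_LIST specs
a3 = {"ACTOR", "LIST_OWNER"}  # actors controlling PRIVATE_MAILING_LIST specs
a4 = {"LIST_OWNER"}           # actors involved in permitted-action specs

# Step 2b: actors illegitimately controlling each tool's specification
a2_illegit = a2 - (a2 & a4)
a3_illegit = a3 - (a3 & a4)
print(a2_illegit, a3_illegit)  # {'ACTOR'} {'ACTOR'}
```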
3. In the previous two steps we identified the illegitimate norms. Now we will prepare the stage for the specification discourse in which these norms are to be corrected. A composition norm does not just need to be seen as a context. It is itself a knowledge definition which needs to be covered by the extension graph of at least one other composition norm, which in that case acts as a meta-norm. In order to correct the illegitimate norms we need to (a) identify which actors are permitted to do this and (b) what items should be on their specification agenda. This step falls outside the scope of this paper but is presented here to provide the reader with the whole picture. A forthcoming paper will elaborate on meta-norms and contexts of contexts.
3a. For each illegitimate composition norm c'k, select the actors ai from the permitted (meta) composition norms cm which allow that c'k to be modified:¹
cm = <im, gm>
where im = [ACTOR: ?] <- [PERM_COMP] -> (obj) -> [MODIFY]
and gm = c'k
3b. For each of these actors ai, build an agenda Ai. Such an agenda could consist of (1) all illegitimate norms c'k that each actor is permitted to respecify, and (2) contextual information from the most specific context in which these norms are represented, or from other contexts related to this context in some significant way. The exact contextual graphs to be included in these agendas are determined by the way in which the specification discourse is being supported, which is not covered in this paper and needs considerable future research. However, we would like to give some idea of the direction we are investigating. In our example, we identified the illegitimate (derived) composition norm 'any actor is permitted to control (i.e. initiate, execute, and evaluate) the specification of a private mailing list' (<i1, g2>). From its formal context it also appears that a list owner, on the other hand, is permitted to at least execute the modification of this type (<i2, g2>). If another specification constraint said that one permitted composition for each control process category per knowledge definition suffices, then only the initiation and evaluation of the modification would remain to be defined (as the execution of the modification of the private mailing list type is already covered by the norm referring to the list owner). Thus, the specification agendas Ai for the actors ai identified in 3a could include: 'you can be involved in the respecification of the initiation and the evaluation of the modification of the type private mailing list', as well as 'there is also actor-such-and-such (e.g. the list owner) who has the same (or more general/specific) specification rights, with whom you can negotiate or whom you can ask for advice.'
Of course, in a well-supported discourse these kinds of agendas would be translated into statements and queries much more readable to their human interpreters, but such issues are of a linguistic nature and are not dealt with here.

5 Conclusions
Rapid change in work practices and supporting information technology is becoming an ever more important aspect of life in many distributed professional communities. One of their critical success factors therefore is the continuous involvement of users in the (re)specification of their network information system. In this paper, the conceptual graph-based approach for the navigation of context lattices developed by Mineau and Gerbé 10 was used to structure the handling of user-driven specification knowledge evolution. In virtual professional communities, the various kinds of norms and the knowledge definitions to which they apply, as well as the specification constraints that apply to these norms, are prone to change. The formal context lattice approach can be used to guarantee that specification processes result in

¹ For lack of space, we have not included such composition norms in our example, but since they are also represented in a context lattice, the same mechanisms apply. The only difference is that the extension graphs are themselves contexts (as defined in Sect. 3).
legitimate knowledge definitions, which are both meaningful and acceptable to the user community. Extracting the context to which a query is applied provides simpler graphs that can more easily be understood by the user when he interacts with the CG base. It also provides a hierarchical path that guides the matching process between CGs, which would otherwise not be there to guide the search. Even though the computational cost of matching graphs would be the same, overall performance would be improved by these guidelines as the search is more constrained. But the most interesting part about using a context lattice is that it provides a structuring of different contexts that helps conceptualize (and possibly visualize) how different contexts ('micro-worlds') relate to one another, adding to the conceptualization power of conceptual graphs. In future research, we plan to further formalize and standardize the still quite conceptual approach presented here, and also look into issues regarding its implementation.
References
1. A. De Moor. Applying conceptual graph theory to the user-driven specification of network information systems. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3-8, 1997, pages 536-550. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
2. B.R. Gaines. Dimensions of electronic journals. In T.M. Harrison and T. Stephen, editors, Computer Networking and Scholarly Communication in the Twenty-First Century, pages 315-339. State University of New York Press, 1996.
3. L.J. Arthur. Rapid Evolutionary Development - Requirements, Prototyping & Software Creation. John Wiley & Sons, 1992.
4. T.W. Malone, K.-Y. Lai, and C. Fry. Experiments with Oval: A radically tailorable tool for cooperative work. ACM Transactions on Information Systems, 13(2):177-205, 1995.
5. P. Holt. User-centred design and writing tools: Designing with writers, not for writers. Intelligent Tutoring Media, 3(2/3):53-63, 1992.
6. G. Fitzpatrick and J. Welsh. Process support: Inflexible imposition or chaotic composition? Interacting with Computers, 7(2):167-180, 1995.
7. L.J. Arthur. Quantum improvements in software system quality. Communications of the ACM, 40(6):46-52, 1997.
8. I. Hawryszkiewycz. A framework for strategic planning for communications support. In Proceedings of the Inaugural Conference of Informatics in Multinational Enterprises, Washington, October 1997.
9. F. Dignum, J. Dietz, E. Verharen, and H. Weigand, editors. Proceedings of the First International Workshop on Communication Modeling 'Communication Modeling - The Language/Action Perspective', Tilburg, The Netherlands, July 1-2, 1996. Springer eWiC series, 1996. http://www.springer.co.uk/eWiC/Workshops/CM96.html.
10. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3-8, 1997, pages 80-94. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
11. F. Dignum and H. Weigand. Communication and deontic logic. In R. Wieringa and R. Feenstra, editors, Working Papers of the IS-CORE Workshop on Information Systems - Correctness and Reusability, Amsterdam, 26-30 September, 1994, pages 401-415, September 1994.
12. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Using CG Formal Contexts to Support Business System Interoperation
Hung Wing 1, Robert M. Colomb 1 and Guy Mineau 2
1 CRC for Distributed Systems Technology, Department of Computer Science, The University of Queensland, Brisbane, Qld 4072, Australia
2 Dept. of Computer Science, Université Laval, Canada
Abstract. This paper describes a standard interoperability model based on a knowledge representation language such as Conceptual Graphs (CGs). In particular, it describes how an Electronic Data Interchange (EDI) mapping facility can use CG contexts to integrate and compare different trade documents by combining and analysing different concept lattices derived from formal concept analysis theory. In doing this, we hope to provide a formal construct which will support the next generation of EDI trading concerned with corporate information.
1 Introduction
There have been several attempts to overcome the semantic heterogeneity existing between two or more business systems. The simplest is a paper-based system in which purchase orders generated from a purchasing program are faxed (or communicated by telephone) to a human coordinator, whose job is to extract and transcribe the information from an order into the format required by an order entry program. In general, the coordinator has specific knowledge that is necessary to handle the various inconsistencies and missing information associated with exchanged messages. For example, the coordinator should know what to do when information is provided that was not requested (unused item) or when information that was requested is not provided (null item). This interoperation technique is simple and relatively inexpensive to implement since it does not require the support of EDI software. However, it is not flexible enough to support a complex and dynamic trade environment where time-critical trade transactions (e.g. a foreign exchange deal) may need to be interoperated on the fly, without prior negotiation (about standardised trading terms), and without relying on the human ability to quickly and correctly transcribe complex trade information. To facilitate system interoperation, other more sophisticated systems include general service discovery tools like the trader service of Open Distributed Processing [5], schema integration tools in multidatabase systems [2], context-based interchange tools of heterogeneous systems [7], [1], email message filtering
tools of Computer Supported Cooperative Work (CSCW) [3], or EDI trade systems [6]. The above systems are similar in the sense that they all rely on commonly shared structures (ontologies) of some kind to compare and identify semantic heterogeneity associated with underlying components. However, what seems lacking in these systems is a formal construct which can be used to specify and compare the different contexts associated with trade messages. Detailed descriptions of these systems and their pros and cons can be found in [9]. In this paper, we describe an enhanced approach to support business system interoperation using Conceptual Graph formal contexts [4], derived from Formal Concept Analysis theory [8]. The paper is organised as follows: Section 2 overviews some of the relevant formal methods. Section 3 describes how we can overcome the so-called 1st trade problem (which refers to the initial high cost of establishing a collection of commonly agreed trading terms).
Fig. 1. EDI Mapping Facility (EMF)
2 Relevant formal methods
In designing the EDI Mapping Facility (EMF) (shown in Figure 1), we aim to facilitate the following: 1) Systematic interoperation: allow business systems to dynamically and systematically collaborate with each other with minimal human intervention; 2) Unilateral changes: allow business systems to change and extend trade messages with minimal consensus from other business systems; and 3) Minimising upfront coordination: eliminate the so-called one-to-one bilateral trade agreements imposed by traditional EDI systems. To support these aims, we need to be able to express the various message concepts and the relationships among them, which requires a logical notation of some kind. In general, a formal notation such as first order logic, Object Z, or CGs is considered useful because: 1) it is an unambiguous logical notation, 2) it is an expressive specification language, and 3) specification properties can be demonstrated using mathematical proof techniques.
However, we choose CGs to specify EDI messages for the following added benefits. First, the graphical notation of CGs is designed for human readability. Second, the canonical formation rules of CGs allow a collection of conceptual graph expressions to be composed (using join and copy) and decomposed (using restrict and simplify) to form new conceptual graph expressions. In this sense, the formation rules are a kind of graph grammar which can be used to specify EDI messages; they can also be used to enforce certain semantic constraints. The canonical formation rules define the syntax of the trade expressions, but they do not necessarily guarantee that these expressions are true. To derive correct expressions from other correct expressions we need rules of inference. Third, aiming to support reasoning with graphs, Peirce defined a set of five inference rules (erasure, insertion, iteration, de-iteration, double negation) and an axiom (the empty sheet) based on primitive operations of copying and reasoning about graphs in various contexts. These rules of inference allow a new EDI trade expression to be derived from an existing trade expression, allowing an Internet-based trade model to be reasoned about and analysed. Furthermore, to facilitate systematic interoperation, we need to be able to formalise the various trade contexts (assumptions and assertions) associated with EDI messages. According to Mineau and Gerbé, informally: 'A context is defined in two parts: an intention, a set of conceptual graphs which describe the conditions which make the asserted graphs true, and an extension, composed of all the graphs true under these conditions' [4]. Formally, a context Ci can be described as a tuple of two sets of CGs, Ti and Gi. Ti defines the conditions under which Ci exists, represented by a single intention graph; Gi is the set of CGs true in that context.
So, for a context Ci, Ci = <Ti, Gi> = <I(Ci), E(Ci)>, where I(Ci), a single CG, is the intention graph of Ci, and E(Ci), the set of graphs conjunctively true in Ci, are the extension graphs. Based on Formal Concept Analysis theory [8], Mineau and Gerbé further define the formal context, named C*, as a tuple <Ti, Gi> where Gi = E*(Ti) and Ti = I*(Gi) = I(C*). With these definitions, the context lattice L can be computed automatically by applying the algorithm given in formal concept analysis theory, described below. This lattice provides an explanation and access structure to the knowledge base, and relates different worlds of assertions to one another. Thus, L is defined as: L = <{C*}, ≤>. In the next section, we describe how these formal methods can be applied to solve the so-called 1st trade problem.
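As an illustration, the pairing of intention and extension and the lattice ordering ≤ can be sketched in a few lines of Python. This is a toy model under our own assumptions, not the authors' implementation: CGs are stood in for by atomic string labels, and a context C1 sits below C2 when it assumes at least C2's conditions.

```python
# Toy model of a context C_i = <T_i, G_i>; CGs are replaced by string labels.
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    intention: frozenset  # T_i: conditions under which C_i exists
    extension: frozenset  # G_i: graphs asserted true in C_i

def leq(c1: Context, c2: Context) -> bool:
    """c1 <= c2 in the lattice L: c1 assumes at least c2's conditions."""
    return c2.intention <= c1.intention

fx = Context(frozenset({"fx-standard"}), frozenset({"c2", "c3", "c4"}))
usd = Context(frozenset({"fx-standard", "currency=USD"}), frozenset({"c4"}))

assert leq(usd, fx) and not leq(fx, usd)
```

The more specialised context (more assumptions, fewer asserted graphs) sits lower in the lattice, which is exactly the access structure exploited in the next section.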
3 An example: Overcoming the 1st trade problem
This example is based on the following trade scenario: in a foreign exchange deal, a broker A uses a foreign exchange standard provided by a major New York bank to compose a purchase order. Similarly, a broker B uses another standard provided by a major Tokyo bank to compose an order entry. Note that these two standards are both specialisations of the same EDI standard. The key idea here
is to deploy an approach in which we can specify the different assumptions and assertions relevant to the trade messages, so that we can use these formalised specifications to systematically identify the different mismatches (null, unused, and missing items). As an example, Figure 2 shows how we may use CG contexts to model the various trade assumptions (intents i1, ..., i7) and concept assertions (extents c1, ..., c12).
Fig. 2. Sample CG contexts relevant to a purchase order (left) and an order entry (right)
There are several steps involved in the systematic interoperation of EDI messages. The following steps are based on the EMF shown in Figure 1.
Step 1. Prepare and forward specs: Brokers A and B can interact with the Customer Application Program and the Supplier Application Program, respectively, to compose a purchase order and an order entry based on the standardised vocabularies provided by a foreign exchange standard. Figure 2 shows two possible specifications: a purchase order and an order entry. Once formal specifications have been defined (using CG formation rules and contexts), they can be forwarded to either a Purchase Order Handler or an Order Entry Handler for processing. Upon receiving an order request, the Purchase Order Handler checks its internal record stored in the Supplier Log to see whether or not this order spec has been processed before. If not, this '1st trade' spec is forwarded to the EMF Server for processing. Otherwise, based on the previously established trade information stored in the Supplier Log, the relevant profile can be retrieved and forwarded with the relevant order data to an appropriate Order Entry Handler for processing. In order to identify the discrepancy between a purchase order
and an order entry, the Order Entry Handler needs to forward an order entry spec to an EMF Server for processing.
Step 2. Integrate and compare specs: To effectively compare two specs from different sources, the EMF server needs to do the following: 1) formalise the specs and organise their formal contexts into two separate type hierarchies known as context lattices (note that in order to compare two different specs, an agreement on an initial ontology must exist); 2) by navigating and comparing the structures of these context lattices, identify and integrate contexts of one source with contexts of another source, forming an integrated lattice; and 3) access and navigate this integrated lattice to identify equivalent and/or conflicting intentions (or assumptions). From these identified and matched intentions, the extents can be compared in order to identify the matched, unused, null, and conflicting assertions. The result of the comparison steps can then be used to generate the necessary mapping profiles. In the following, we describe how the above steps can be formally carried out.
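The '1st trade' dispatch of Step 1 can be sketched as follows. This is a sketch under our own assumptions: the handler, log, and server names are illustrative, and the EMF Server's lattice comparison is stubbed out.

```python
# Hypothetical sketch of Step 1: a Purchase Order Handler consults the
# Supplier Log; only a '1st trade' spec is sent to the EMF Server.
calls = {"emf": 0}
supplier_log = {}  # spec id -> previously generated mapping profile

def emf_server_process(spec):
    calls["emf"] += 1
    # Stub: the real server integrates and compares context lattices.
    return {"matched": [], "null": [], "unused": [], "conflict": []}

def handle_purchase_order(spec_id, spec):
    if spec_id not in supplier_log:                # '1st trade' case
        supplier_log[spec_id] = emf_server_process(spec)
    return supplier_log[spec_id]                   # reuse stored profile

handle_purchase_order("po-42", "spec text")
handle_purchase_order("po-42", "spec text")
assert calls["emf"] == 1   # the second request reuses the logged profile
```

The point of the log is precisely that the expensive comparison is paid once per spec, which is how the approach amortises the 1st trade cost.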
Fig. 3. FCA formal contexts representing two different sets of assumptions (about standard, currency and scale factor)
Generating the Context Lattices: Based on FCA theory, the above formal CG contexts can be systematically re-arranged to form the corresponding FCA contexts of the purchase order and order entry (denoted as Kp and Ko, respectively). These contexts are illustrated as the cross-tables shown in Figure 3. The cross-table on the left depicts the formal context (Kp) of the purchase order spec, representing a query graph, while the cross-table on the right depicts the formal context (Ko) of the order entry spec, representing a type graph. To simplify our example, all asserted conceptual relations shown in Figure 2 have been ignored in the cross-tables. If the application is required to query and compare the asserted relations, they can easily be included in the cross-tables prior to generating the context lattice. Recall from FCA theory that for a given context K we can systematically find its formal concepts (Xi, Bi). By ordering these formal concepts based on the sub/superconcept relation (≤), we can generate the context lattices B(Kp) and B(Ko),
Fig. 4. Context lattices generated from cross-tables shown in Figure 3
respectively. The context Kp has five formal concepts {C1, C2, C3, C4, C5} and the context Ko also has five formal concepts {C6, C7, C8, C9, C10}.
Integrating the Context Lattices: At this point, the context lattices B(Kp) and B(Ko) represent the important hierarchical conceptual clustering of the asserted concepts (via the extents) and a representation of all implications between the assumptions (via the intents). With these context lattices we can then proceed to query and compare the asserted concepts based on the previously specified assumptions. However, before we can compare the individual concepts, we need to combine the context lattices to form an integrated context lattice. In doing this we ensure that only those concepts that are based on the same type of assumptions (or intention type) can be compared with each other. Otherwise, the comparison would make no sense.
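Deriving the formal concepts of a cross-table can be sketched in Python. The incidence relation below is hypothetical (the crosses in Figure 3 did not survive reproduction), so it yields four concepts rather than the paper's five; the derivation itself, however, is the standard FCA closure.

```python
from itertools import combinations

# Hypothetical stand-in for the purchase-order cross-table K_p.
attributes = ("i1", "i2", "i3")
incidence = {
    "c1": {"i1"}, "c2": {"i1"}, "c3": {"i1", "i2"},
    "c4": {"i1", "i2", "i3"}, "c5": {"i1", "i3"},
}

def extent(attrs):   # B': objects whose row contains every attribute in B
    return frozenset(o for o, row in incidence.items() if attrs <= row)

def intent(objs):    # X': attributes common to every object in X
    rows = [incidence[o] for o in objs]
    return frozenset(set(attributes).intersection(*rows)) if rows else frozenset(attributes)

# A formal concept is a pair (X, B) with X = B' and B = X'; closing every
# attribute subset enumerates them all.
concepts = {(extent(set(b)), intent(extent(set(b))))
            for r in range(len(attributes) + 1)
            for b in combinations(attributes, r)}
assert len(concepts) == 4
```

Ordering these (extent, intent) pairs by extent inclusion gives exactly the sub/superconcept relation from which the lattice B(Kp) is drawn.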
Fig. 5. Integrated Context Lattice
Based on the information given by the individual contexts shown in Figure 2, we can derive that i1 is equivalent to i4 (i.e. both contexts C2 and C7 are based on the same foreign exchange standard). Thus, we can integrate and compare context C2's individual concepts (c2, c3, c4) against node C7's individual concepts (c7, c8, c12). By comparing these individual concepts we find that c2 = c7, c3 = c8, and c4 = c12. Note that this comparison is true only when the above concepts are defined according to the convention specified by intents
i1 and i4. This integration step is illustrated in the top part of the integrated lattice shown in Figure 5. Similarly, we can integrate and compare contexts C3 and C8. In this case, we find that concept c8 = c3 (quantity), according to the assumption that c8 and c3 are based on i4 (foreign exchange standard) and i5 (factor = 1). This integration step is illustrated in the left part of the integrated lattice shown in Figure 5. While integrating and comparing C5 and C9 we find some discrepancies in the intentions i2 (factor = 1) and i6 (factor = 1000), and also in i3 (currency = USD) and i7 (currency = JPY). These discrepancies in the intent parts suggest that c4 and c12 (CostPerUnit) are based on conflicting assumptions. This integration and comparison step is illustrated in the right part of the integrated lattice shown in Figure 5. The results of the integration process can be used to form the result profiles which identify the null, unused and mismatched items of a purchase order and an order entry. This profile is then forwarded to the relevant handlers to control the data flow between business systems. In general, a mapping profile can systematically be generated by inference over the integrated context lattice.
Step 3. Forward relevant data: Upon receiving the mapping profiles from an EMF Server, the Purchase Order Handler and Order Entry Handler store these profiles in the Supplier Log and Customer Log, respectively. In doing so, subsequent order requests can use these profiles to coordinate and transport the purchase order data to the appropriate order entry programs without having to integrate and compare the Purchase Order's and Order Entry's specs. It is important to point out that by navigating the context lattice, analysis software would be able to identify the reasons behind the mismatching results. Some mismatches (e.g.
unknown concepts, those which cannot be identified by a particular standard) can be impossible for another system to interpret without the intervention of a human coordinator. However, some other mismatches (e.g. those exchanged concepts that were based on a shared ontology but were not provided or asked for) can be systematically appended to the original specs and forwarded back to the Purchase Order Handler or Order Entry Handler for re-processing.
An open research agenda: Previously, we described an approach to identify discrepancies among different CG concepts. It is important to note that discrepancies may come from relations and not just from concepts. They may come from the way concepts are connected by relations, or from within nested concepts (or nested relations). For example, the message 'Broker A delivers a quote to Broker B' may have different interpretations depending on whether A calls (or emails) B to deliver a quote on the spot (which may not be secure), or A requests a quote specialist (by using a server) to make a delivery (in which case a quote can be securely delivered by using encryption, certification, and non-repudiation techniques). If the application does not care how the quote is delivered, as long as B receives the quote, then it is not necessary to analyse or reason about the relevant nested concepts (or relations). However, if the security associated with the delivery is of concern, we need to find a way to compare and
identify the potential conflicts embedded in nested concepts. Discrepancies associated with relations can be solved by using the approach described above. For example, we can substitute relations (instead of concepts) as the extents of the cross-table shown in Figure 3. In doing so, we can then generate the necessary context lattices and integrated lattices based on relations rather than concepts, and can then compare and identify discrepancies among relations. If we view the different ways in which concepts are connected by relations as a collection of 'super concepts', then to identify the discrepancies among these super concepts, a partial common ontology (which describes how concepts may be connected by relations) must be used. The level of matching between different ontologies will have a direct impact on the comparison heuristic. The problem here is to discover heuristics to guide a search through two lattices in order to integrate them. In doing so, we can find enough similarity to discover dissimilarities. To conclude, by using formal concept analysis and the conceptual graph formalism we can systematically create context lattices to represent complex message specifications and their assumptions. In doing so, message specs can be effectively navigated and compared, making a formal EDI mapping approach feasible.
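As a closing illustration, the result-profile generation of Step 2 can be sketched as follows. This is a sketch under our own assumptions: the pairings fed in stand for the equivalences and conflicts discovered during lattice integration, and all names are hypothetical.

```python
# Hypothetical sketch: classify asserted concepts into the four profile
# categories once the lattice integration has paired up the extents.
def build_profile(po_extents, oe_extents, equal_pairs, conflict_pairs):
    used_po = {a for a, _ in equal_pairs + conflict_pairs}
    used_oe = {b for _, b in equal_pairs + conflict_pairs}
    return {
        "matched":  list(equal_pairs),
        "conflict": list(conflict_pairs),
        "unused":   sorted(set(po_extents) - used_po),  # sent, not asked for
        "null":     sorted(set(oe_extents) - used_oe),  # asked for, not sent
    }

profile = build_profile(
    po_extents=["c2", "c3", "c4", "c5"],
    oe_extents=["c7", "c8", "c12"],
    equal_pairs=[("c2", "c7"), ("c3", "c8")],
    conflict_pairs=[("c4", "c12")],
)
assert profile["unused"] == ["c5"] and profile["null"] == []
```

The resulting dictionary plays the role of the mapping profile stored in the Supplier and Customer Logs in Step 3.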
References
1. C. Goh, S. Bressan, S. Madnick, and M. Siegel. Context interchange: Representing and reasoning about data semantics in heterogeneous systems. ACM Sigmod Record, 1997.
2. V. Kashyap and A. Sheth. Semantic and schematic similarities between database objects: a context-based approach. The VLDB Journal, 5, 1996.
3. J. Lee and T. Malone. Partially Shared Views: A Scheme for Communication among Groups that Use Different Type Hierarchies. ACM Transactions on Information Systems, 8(1), 1990.
4. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Dickson Lukose et al., editor, International Conference on Conceptual Structures (ICCS'97), 1997.
5. A. Puder and K. Römer. Generic trading service in telecommunication platforms. In Dickson Lukose et al., editor, International Conference on Conceptual Structures (ICCS'97), 1997.
6. L. Raymond and F. Bergeron. EDI success in small and medium-sized enterprises: A field study. Journal of Organizational Computing and Electronic Commerce, 6(2), 1996.
7. G. Wiederhold. Mediators in the architecture of future information systems. IEEE Computer, 25(3), 1992.
8. R. Wille. Restructuring Lattice Theory: An Approach Based on Hierarchies of Concepts. In I. Rival, editor, Ordered Sets. Reidel, Dordrecht, Boston, 1982.
9. Hung Wing. Managing Complex and Open Web-deployable Trade Objects. PhD Thesis, University of Queensland, Qld 4072, Australia, 1998.
Ontologically, Yours Daniel Kayser L.I.P.N., Institut Galilée Université Paris-Nord Avenue Jean-Baptiste Clément F-93430 Villetaneuse, France
[email protected] Abstract. The word ‘ontology’ belongs to the vocabulary of Philosophy, and it is now commonly used in Artificial Intelligence with a rather different meaning. There is nothing wrong here, so long as what is meant remains clear. AI typically claims to use ontology for building ‘conceptual structures’. Now concepts are named by words, and in the relation concept-word, the goals of AI imply not to move too far away from words. Therefore, it is not at all certain that ‘conceptual structures’ should be handled in the way that logicians and philosophers consider to be adequate for concepts. Thinking about the actual use of words in reasoning may therefore open some new perspectives in the theory of ‘conceptual structures’.
1 Introduction Knowledge Representation (KR) aims at expressing the knowledge concerning a domain in terms of its concepts, relations, individuals or whatever; however, the choice of these entities is not considered to be a genuine KR question, but rather an ontological one. Ontology is more and more frequently referred to in Artificial Intelligence (AI), and there is an active and scientifically very creative research community currently interested in ‘ontologies’. Now, the word ‘ontology’ belongs to the vocabulary of Philosophy, where it is deeply related with problems of existence. When AI considers some questions as being ontological, does it really mean that we are concerned with metaphysical problems? Most of us will certainly answer negatively. We are considering models of a reality; these models are designed for a given purpose, and the main problem we are interested in is their adequacy to this purpose. Being adequate is not grounded on being “real”; furthermore being realistic, i.e. aiming at the representation of “all” aspects of reality, goes in the opposite direction to being adequate, for obvious complexity reasons. What “AI ontology” wants is to find the basis for efficient models, whereas what “philosophical ontology” wants is to discuss the problem of real M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 35-48, 1998. © Springer-Verlag Berlin Heidelberg 1998
existence, whatever this may mean (see below). Their goals are not comparable, and their methods can be completely contradictory. Although it may distract me from my main points, I would like to digress on what "to exist" means for the layman.
1.1 Trivial Considerations about Existence The most "natural" way of existing is precisely to exist in Nature. Trees exist. Ghosts do not exist. But we often forget that tree is a word, that people use to refer to a category of objects, and the “existence” of trees is nothing more than an agreement that this oak, that pine, and so on belong to the same category, agreement based on the fact that these objects share common properties that we consider as important. The other “way of existence” is the existence inside of a model. In the standard propositional calculi, there exist tautologies; in euclidean geometry, for each point there exists a unique line including the point and having a given direction, etc. In some sense, the distinction between both “ways of existence” is ill-defined, because as we have noticed, rather than to Nature itself, the first one refers to an agreement about the "obvious model” of Nature, hence existence in Nature is also existence inside of a model. But there is another deep problem to take into consideration, namely whether objects (e.g. this oak, etc.) do exist in Nature or only in the "obvious model”, presumably innate, by which we perceive the external world. At a very different scale from ours, this oak is not an object at all: it is a collection of molecules or, alternatively, an indistinguishable part of the vegetal cover of the earth. This question may at first sight appear to be more relevant to “philosophical ontology” than to AI, but we will see that it is not the case.
1.2 Plan of the Paper Section 2 explains the difference between "philosophical" ontology and AI in terms of their different view of what a concept actually is. Of course, only a small part of such a subject is covered, and we try to focus on what is relevant for the basic idea defended in this paper. We have already underlined that even if there are fundamental discrepancies between a concept and the lexical unit that names it, it was nonetheless impossible to ignore the behaviour of words in a discussion about concepts. Therefore, section 3 develops and exemplifies some linguistic considerations, in the domain of technical terms as in the general use of the language. We will then be ready to present, in section 4, the main thesis of the paper: the inadequacy of current AI ontologies — as soon as they concern a domain dealing with concepts which are not perfectly defined logically — originates in an idealistic view which does not meet AI needs. This view should be replaced by a much more flexible one, taking into account variability, i.e. the fact that ontological uniqueness is a wrong
requirement for AI, and borrowing from the linguistic processes by which the meaning of a word gets adapted to its context of use.
2 Philosophical vs. AI View of a Concept Philosophy, as for that matter Mathematical Logic, has a rather strict notion of a concept. In its extensional version, a concept corresponds exactly to a unary predicate, i.e. to a function which, for every object, tells whether or not it "falls under" the concept. The other view of a concept, i.e. its intension, can either be seen as a kind of generalization (intension as extension in multiple worlds, cf. [10]), as a mere variant (intension as a set of properties, each property being nothing more than a unary predicate) or as ontologically distinct (properties "exist", their having an extension being a contingent, less important, feature). The intensional view has interesting applications in AI (equational calculus based on intensional normal forms has proven useful in Description Logics), but identifying an object with the set of its properties is counterintuitive, and when the properties are allowed to change with time, the identity of the object is hard to maintain. We therefore concentrate on the extensional view. There are a number of difficult questions lurking behind this seemingly simple notion, some of which are discussed below.
2.1 Vague Concepts The first problem is well known. Whether an object "falls" or not under a concept does not always come as a yes / no decision. Under the designation of vagueness (Russell), fuzziness (Zadeh) or whatever, this issue has been investigated by various means. None of them appear to be completely satisfactory. Consider for instance the popular fuzzy-set representation of a vague concept: if a concept c is vague, it is hard to decide which real number µ (i.e. a number, each digit of which has a well-defined value!) corresponds to the degree of membership of every object o to c. The meaning of setting µ to 1 must also be clear, i.e. there must exist a criterion telling what a perfect instance of c is, and this seems somehow contradictory to the fact that c is only vaguely defined. Several other technical or mathematical proposals have been put forward in order to cope with vagueness, and we will not discuss them here, because we want to focus on a more difficult issue, which seems to have been often overlooked.
2.2 Vague Objects? In order to give either a yes / no answer, or a more finely shaded one as to whether object o falls under concept c, it is not only important to know what we mean by c, but
also what o precisely is. All traditional logical approaches tend to consider that objects "are there", and to ignore the problem of deciding what counts as an object.¹ There are well known philosophical difficulties, e.g. whether a heap is an object or a set of objects, yielding possible paradoxes (sorites); see also [13]. There may also be physical difficulties in defining objects, e.g. at the wave/corpuscle level. None of them is really relevant to AI: when we represent entities in knowledge bases, they usually behave as “classical” objects. Yet even so, what we consider in the real world to be an object tends to be much more problematic than most of us are ready to admit. Language adds a lot of difficulties that we cannot ignore (we will therefore discuss them in the following section), but language is far from being responsible for all the problems. Ostensive designation, for instance, also introduces ambiguity.² If, pointing in some direction, you utter or make a gesture suggesting “this is awkward”, you may mean that some object, or all objects of some category, or a situation in which some (set of) object(s) play(s) a role, or many other things, are unpleasant to you. A "conceptual" translation such as:
(F)
requires to define over which domain D (physical objects, classes of physical objects, sets, situations, etc.) the variable x is quantified, or in other terms what counts as an object, independently of any "sorites" or "quantum physics" kind of consideration.
2.3 Variable Ontology
Even if we are able to make a reasonable choice for D under some given circumstances, what seems to have been completely neglected up to now is the fact that this choice must be revisable. As a matter of fact, the ability to modify an ontology during a reasoning process seems to be a fundamental property of intelligence. My favourite example here (but an amazing number of various situations exhibit the same pattern) consists in planning a journey. Suppose you have to go by car from A to B; it is often enough to take A and B as points, and the route between them as a line. But the map on which you plan your route may give indications of narrow sections of the road, or of low bridges, in which case you have to take into account, temporarily, two- or three-dimensional clearance considerations. Moreover, if the place B where you are heading is located in a town T, it is convenient to reason as if T were a point, up to T's outskirts, where you might have to choose between directions T-West and T-East, a very strange choice if T continues to be handled as a point!
¹ Mereology [9] can be counted as different in this respect, but this approach adds specific difficulties which we cannot develop here.
² See also [11].
Ontologically, Yours
39
We will discuss in Section 4 some implications of this kind of phenomenon. The only consequence that we draw for now is that the solution of an "ontological problem" should not be a structure in which the concepts of the application rigidly map to a unique entity of the structure. The degrees of freedom that are needed should answer the problems of vagueness or fuzziness, as well as the problem of the identity of objects.
3 Some Linguistic Considerations
3.1 General
Natural languages add a lot of specific difficulties to the problem of concept representation; just to mention a few of them: the distinction between homonymy and polysemy, the distinction between proper (literal) meaning and various metonymic or metaphorical extensions, the assignment of a syntactic category, the variation of the meaning of an expression when it gets translated from one language to another, … It may therefore seem wise for "knowledge engineers" to be careful not to tread on the dangerous ground of language. But this cautiousness would lead us to evade the real problems. After all, it is not by chance that, in order to make clear what concept we are interested in, we always designate it by a natural language word or phrase. The widespread belief that precautions — such as writing concepts with capital letters and words in lowercase — are sufficient to prevent concepts from borrowing the intricate behaviour of words is groundless. Let us consider a couple of examples.
3.2 What Kind of Thing is a Unix Command?
"Command" is a word that has various meanings according to context. However, in the narrow scope of the Unix Operating System, it seems that this word corresponds to a single, well-defined concept.³ Looking through a User's Guide tells us how the word is used; because of some of the linguistic phenomena listed in §3.1, and especially metonymy, the uses may not be very helpful in determining exactly where to put the concept in an ontology. However, they should provide useful hints. The first occurrence of the word, and the only one which has the form of a definition, is very misleading:
³ We refer in this section to "SunOS User's Guide: Getting Started ©1990 Sun Microsystems Inc." The numbers following the sentences refer to the page numbers in that Guide.
40
D. Kayser
Most commands are executable files; that is, they are files that you can run.
(9)
Later on, it appears that the Guide tries to carefully distinguish between the command and the command line (i.e. command + options + arguments). But quickly the tendency to conciseness seems to override this distinction: The mkdir command creates directory.
(23)
Is this sentence a metonymy, i.e. should it be read as "command lines having mkdir as their command create directories"? Or is it still the proper sense, i.e. the command does create the directory, but the way to do it requires that you type a command line including the command and some other stuff? A similar doubt arises with the very frequent occurrences of sentences "the syntax of xxx is …", where xxx has been introduced as a command. The (non-ambiguous) meaning of such sentences can be understood both as metonymic ("the syntax of a command line starting with xxx is …") and as proper (commands have a syntax, explaining how command lines using them should be formed). Sentences like the following one are obviously inconsistent with the policy expressed in the Guide, since options are said to be part of command lines, not of commands: Use the ls -l command to determine what permissions files and directories have.
(37)
The above remarks only illustrate the very trivial fact that, despite its attempts to name the concepts consistently, the Guide, like every technical or non-technical piece of text, cannot completely avoid sloppiness in its wording. However, this leaves completely open our initial question: where should we put the concept command in an ontology? Under "string of characters"? Under "class of processes"? Under "implements"? Or in still a different part of the ontology? An even more important question is "what is the criterion allowing us to choose among these possibilities?". Suppose for a moment that we select (more or less arbitrarily) one of these options; for instance, we decide that "basically" a command is a kind of string of characters; then all the other concepts that may be called a command should be derived from this "basis". Once this decision is taken, a careful reading of the Guide allows us to endow this "basic" concept with roles, e.g. commands are strings of characters:
• typed by a user,
• in order to fulfill one of his/her goals,
• being interpreted by a program,
• thereby creating a process,
• etc.
The Guide could then be rewritten, albeit in a clumsy style, in a way consistent with this ontological decision. As we have seen above, this rewriting may not be unique, but if the meaning of the sentence is clear, the various translations should be equivalent.
But some cases remain problematic. Consider the sentence: kill provides you with a direct way to stop commands that you no longer want.
(85)
The clear meaning of this sentence is: « if you type at some time t a command line starting with the command (string of characters) c, in order to fulfill goal g, thereby creating process p, if
at some later time t’, g is not fulfilled, p is still active, and you have the new goal g’ consisting in cancelling p,
then you should type at t’ a command line l obeying the syntax "kill PIDs", where … ». The least that can be said here is that the translation is not easy. Similar difficulties will be encountered with: running a command in the background.
(88)
restarting a command
(88)
These difficulties are related to the spatio-temporal dimension, which was not taken into account in the choice of our "basic" concept. They belong to the type/token kind of difficulties that arise more or less everywhere. As a matter of fact, if a command is identified with a string of characters, two different users typing the same string will issue the same command, as will a single user typing the same string at two different times. The correspondence between command and process will therefore become rather problematic. In technical domains, the story generally ends well: there is in principle one (or several) "ontologically correct" representation(s) of each occurrence of a term that can be expressed as formula(s) using the "basic concept", whatever it is, that has been selected for that term. However, the relationship can be very hard to find (in the (85) example above, commands stands for processes initiated by command lines starting with "commands" in the "basic" sense of the term; in a similar study [6] on the Hypercard User's Manual, an even more far-fetched solution was needed: it turned out that field was used for the way the content of the "basic" field concept was displayed on the printer!).
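The type/token difficulty just described can be made concrete in a small Python sketch. The class names and fields below are purely illustrative (nothing of the kind appears in the Guide): identifying the command with a string makes two uses of "mkdir" one and the same command, while the invocation tokens remain distinct and map to distinct processes.

```python
from dataclasses import dataclass, field
import itertools

@dataclass(frozen=True)
class Command:                 # the type: identified with a string of characters
    name: str

_pid = itertools.count(1)      # toy stand-in for process identifiers

@dataclass
class Invocation:              # a token: one concrete use of the command
    command: Command
    user: str
    time: float
    pid: int = field(default_factory=lambda: next(_pid))

mkdir = Command("mkdir")
a = Invocation(mkdir, user="joe", time=10.0)
b = Invocation(mkdir, user="hal", time=10.0)   # same string, other user

assert a.command == b.command  # one command (type) ...
assert a.pid != b.pid          # ... two processes (tokens)
```

Here two invocations of the same command type yield distinct process identifiers, which is exactly the command/process correspondence that the text calls problematic.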
3.3 What Kind of Thing is a Traffic Light?
In everyday life, it is far less obvious how to determine a "basic" concept, and, more radically, whether this idea of a "basic concept" makes sense at all. In a previous work [4], I argued that book should basically refer to an equivalence class of physical objects, the equivalence relation being under-determined (depending on the context, e.g. two different editions of a book count as equivalent or not). Therefore, even in a well-defined situation, there may remain, for ordinary language, an under-determination as to how a given occurrence of a word such as book should be translated by means of "basic" concepts. In a more recent paper [3], my colleagues and I tried to investigate how the concept of traffic light was to be ontologically classified when car-crash reports had to be analyzed.⁴ Two sensible choices appear possible: either a traffic light is a physical object, or it is a process. Of course, there is a relation between them: the process is the purpose of the object, and the object is the place where the process takes place. In several sentences, both concepts must coexist in a single occurrence of the term. Consider for example: (…) car d’ordinaire se trouve un feu à ce carrefour. (hors fonctionnement ce jour-là)
A6
(…) as usually there is a traffic light at this crossroads (out of use this very day) What is located at the crossroads is the physical object, and what is not functioning is the process. Maybe, more strikingly: Étant à l’arrêt au feu tricolore (rouge)
B4
being stopped at the three-coloured traffic light (red)
The physical object is not three-coloured, nor is the process! The process allows the light emitted by the object to take different colours at different times; the parenthesis reflects the author's difficulty in assigning simultaneously different types to a single word. An "ontologically correct" paraphrase of the text would be something like: being stopped in the vicinity of the location of a physical object sheltering a process emitting light that at different times takes three different colours, at a phase of the process where the colour was red!! A similar case can be made about: A Nanterre, arrêtée à un feu rouge, un automobiliste n'a pas vu le feu
B66
in Nanterre, while I was stopped at a red traffic light, a car-driver did not see the traffic light
⁴ We are grateful to M.A.I.F. Insurance Co. for having put a number of reports at our disposal. The numbers following the sentences correspond to a numbering of these texts.
the author is stopped in the vicinity of the object, at a specific phase of the process, and what happened is presumably not that the other car-driver did not see the object, but that he did not pay attention to the current phase of the process. The temporal interpretation is even sharper in: Alors que je redémarrais au feu vert, (…)
B6
While I was getting going again at the green traffic light, (…)
If the "basic concept" is the object, we have to jump from it to the process, and then to the instant at which the process is changing phase. The conclusion is that we may arbitrarily assign a "basic" concept to the term traffic light. But then nearly every occurrence of the term must be treated as a metonymy, which our knowledge of usual driving situations allows us to decode, more or less accurately, in order to get an expression using the "basic" meaning. However, this view entails:
• a choice of the "basis", for which no solid ground seems available,
• a tedious, sometimes reckless, translation of the occurrences of the term,
• and an overall impression of artificiality and inadequacy of the conceptual representation.
3.4 Pustejovsky's Generative Lexicon
In [12] and various other writings, J. Pustejovsky suggests dealing with the variability of semantic interpretation of a given lexical unit by a process called type coercion. Lexical units get a type, but when they are used in sentences where this type yields ill-formed expressions, the process operates and produces an entity of a new type. This process relies on a qualia structure associated with each lexical unit; four facets are defined: CONST, FORMAL, TELIC, AGENTIVE, but the method for filling these facets remains intuitive, and the examples provided show that arbitrariness cannot be eliminated. To take once more the example of book, FORMAL is filled with physobj (standing for physical object) and AGENTIVE with write, while it would have been possible to consider the "agent" of the physical-object book to be the printer, or alternatively the "formal" facet of the written book to be its semantic content rather than a physical object. To cope with the plurality of views on a single word, Pustejovsky more recently introduced "complex types", e.g. book has type physical-object.info, and according to the context one, the other, or both members of this couple is/are activated. Unfortunately, this step goes in the direction of a finite enumeration of word senses, which is precisely what the "generative" approach is supposed to avoid.
3.5 (Provisional) Conclusion
The difficulties arising in the above examples can be blamed on language alone: using natural languages would then necessarily entail delicate transpositions into a neat conceptual structure, but the existence and importance of such structures would remain unquestioned. Alternatively, they can be taken as a hint of the pointlessness of keeping concepts pure from any linguistic contamination.
4 Some Promising Tracks
4.1 Semantic Adaptation
The meanings that a single word can take in various contexts form a virtually unlimited set. Dictionaries keep, in an often peculiar organization⁵, the most typical semantic values. Even if this fact is seldom acknowledged, no text understanding is possible if the understander has access only to the dictionary meaning of each word of the text: every human unconsciously twists in some way or another the sense(s) found in the dictionary in order to find an adequate value for a word in its context. As an illustration among billions of others, the temporal value of feu in example (B6), §3.3, is not given in any dictionary — and should probably not be given if we are to keep their size compatible with human use. There is therefore a process of semantic adaptation, about which surprisingly little is known, either psychologically or computationally⁶, whose role is to transform the lexical information on a word into a reasonably adequate semantic value for that word in its context. This process is, to some degree, regular. The regularities have been listed in Linguistics, and usually considered as rules governing metonymy. However, linguists focus their study on "interesting" metonymies rather than on the most common ones; moreover, their rules correspond to a flat list of possible shifts, little being said concerning the actual applicability of a rule in a given situation. Now a blind application of the rules generates a huge number of interpretations, most of which would never come to the mind of a human understander. This describes the situation of words, but what is the relevance of the phenomenon of semantic adaptation, and of its regularity, when concepts are considered?
⁵ As no semantic organizational principle has been spelled out, most dictionaries organize the senses according to the (syntactic) constructions allowed by each meaning; as a result, intuitively very similar senses are remote in the dictionary (see [7]).
⁶ See however the special issue of Computational Intelligence devoted to non-literal meaning [2].
As we have already insisted, concepts are not that different from words; not so different, at any rate, that a treatment having interesting features for words would be completely irrelevant for concepts.
4.2 Conceptual Adaptation
The idea is thus to take advantage of the study of the process of semantic adaptation in words, in order to investigate to what extent it might be useful for concepts too. At first sight, such an idea seems both counterproductive and … absurd:
• counterproductive, because concepts have been invented, as it were, to get rid of the peculiarities of words when speaking about content; imitating at the level of concepts the seemingly erratic behaviour of lexical semantics is a sort of step backwards with respect to the advances obtained by working with concepts instead of words;
• absurd, because semantic adaptation navigates from concept to concept in order to find which one meets the constraints imposed by the context on the occurrence of a word; it makes no sense at all to allow a concept to wander around in a structure of concepts which themselves wander around!
However, there might be a more reasonable picture, introducing different levels. At each level, the domain of interest is structured with stable concepts. Before going further, let us observe that the same concept can be found at different levels, but with no easy correspondence. As a metaphor, the black circle representing Montpellier on the 1/5,000,000 map that I have in my notebook does not simply yield a huge black dot on the 1/50,000 map they sell you for cycling in the outskirts of the city. Should it correspond to a typical point in the city (e.g. the town hall), to the area delimited by the city limits, by the limits of the conurbation, …? The answer depends on the situation, if any, in which the question makes sense. Consider now a concept occurring in a conceptual representation; even if it bears the same name as a "stable" concept appearing at some level, this does not necessarily entail a full coincidence with it. A process of conceptual adaptation may prove as necessary here as the process of semantic adaptation was inescapable for text understanding.
More precisely, I argue that any formula (F) representing some piece of knowledge (or some question to be answered, e.g. if the formula has free variables) in a conceptual language should not be matched against (for consistency checking) or fed into (for inference) a single conceptual structure, identifying the concept names appearing in the formula with those occurring in the pre-existing structure. Instead, the formula should be confronted with several such structures, without being misled by the identity of concept names. In each case, a process of conceptual adaptation should be invoked, in order to discard the levels where (F) makes no sense, and to modify (F), following regular patterns, in order to get meaningful results from it. I know very well that this opinion should be backed up at least by an illustrative example. The trouble is that every simple example, say of a size compatible with this
paper, will more easily be solved with a one-level conceptual architecture, and if the agreement between the concept names of the structure and (F) is not good, the normal reaction is to blame the knowledge engineer for inconsistently using his/her own predicate names. Therefore, I will only hint at what such an example should look like; of course, it will always be possible to solve it in a « logically correct » way. Consider the (genuine) car-crash report below: J'ai vu l'arrière du véhicule B s'approcher. J'ai appuyé à fond sur le frein, le véhicule B a continué à s'approcher. J'ai cru sur le moment que mes freins n'avaient pas répondu et nous avons rempli ce constat. J'ai immediatement après essayé mes freins qui fonctionnaient très bien. J'ai maintenant la certitude, la rue étant légèrement en pente, que c'est le camion qui a reculé et est venu me percuter (…)
B68
I saw the back of vehicle B coming nearer. I pushed the brake pedal to the floor, but vehicle B kept coming nearer. I believed, on the spot, that my brakes had failed, and we filled in the present report. Immediately afterwards, I tried my brakes, which worked quite well. I am now absolutely convinced that, as the street has a slight slope, it was the truck that moved back and struck me (…)
Independently of the words used, the representation of what happened requires a rather complex description of some notions, e.g. beliefs and how to confirm them, in order to infer that the author now disagrees with what (s)he wrote in the report; on the other hand, the comprehension of this text requires only a very crude description of vehicles and of their dynamics (there is a brake; when it works, it prevents you from coming nearer to the vehicle you see). In most other texts, the opposite holds: confirmation of beliefs is irrelevant, but considerations on the effect of speed and dampness on braking distances play an essential role. Having a single ontology to shelter all the knowledge that may come into play entails:
• reasoning in every case in the most complex setting — a very inefficient strategy —, and
• handling a number of concepts presumably larger by an order of magnitude than the size of the vocabulary, hence the need to use e.g. STREET#1 (a line), STREET#2 (a surface, the causeway), STREET#3 (a skew surface, the causeway plus the sidewalks), STREET#4 (a volume, including the buildings opening onto the sidewalks), and so on.
The alternative is to use a simple ontology (e.g. STREET defined as a line) and to refine it when and where the need arises. In the above example, "brake" can at first be defined as a part of a vehicle, but later we clearly need to distinguish among BRAKE#1 (the pedal), BRAKE#2 (the brake shoe), and BRAKE#3 (the whole device by which pressing on the pedal results in the shoe rubbing on the wheels).
Having all sorts of brakes and streets cluttering the ontology of the domain of car-crashes is a very hard constraint to cope with, and a quite useless one too!
Having ontologies at different levels, with bridges between them, and reasoning at the simplest level except when it is inconsistent to do so, seems much more attractive. But the difficulties must not be underestimated:
• designing a strategy to select the ontological level appropriate for a given situation is difficult,
• achieving efficient cooperation between levels which, by definition, do not share the same ontology is even harder,
• deciding when to start and stop a new level of reasoning cannot be grounded on any formal principle, but only on an empirical basis.
Even if these obstacles are fearsome, and hence this "promising track" not really attractive, I believe that the relatively easy way explored by philosophers and logicians has shown its intrinsic limitations, and that nothing really less fearsome can solve the problem. More technical developments of this idea can be found in previous papers [8] [5].
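The idea of reasoning at the simplest level and bridging to a finer one only when forced can be sketched in a few lines of Python. This is a deliberately naive illustration: the function names and the refinement trigger are my own invention, and the STREET/BRAKE senses are taken from the car-crash example above.

```python
# Level 1: the coarse ontology used by default.
COARSE = {"street": "line", "brake": "part-of-vehicle"}

# Level 2: finer senses, reached through a "bridge" only on demand.
FINE = {
    "street": ["STREET#1 (line)", "STREET#2 (causeway surface)",
               "STREET#3 (causeway + sidewalks)", "STREET#4 (volume)"],
    "brake":  ["BRAKE#1 (pedal)", "BRAKE#2 (brake shoe)",
               "BRAKE#3 (whole device)"],
}

def interpret(concept, needs_refinement=False):
    """Reason at the simplest level; bridge to the finer one when forced."""
    if needs_refinement:
        return FINE[concept]       # the bridge exposes the finer senses
    return COARSE[concept]

# Route planning: a street as a line suffices ...
assert interpret("street") == "line"
# ... until clearance or sidewalk questions force the bridge.
assert len(interpret("street", needs_refinement=True)) == 4
```

The hard problems listed above (when to refine, how levels cooperate) are of course exactly what this toy leaves out: here the caller must already know that refinement is needed.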
5 Summary
The least controversial examples of conceptual structures use mathematically perfectly defined notions (without going back as far in history as Port Royal's definition of comprehension and extension on triangles, it is worth noticing that e.g. Chein & Mugnier's paper [1] uses an ontology of polygons!). This is so because only in that case is the quest for a unique ontology admissible and successful. The acceptance of ontological multiplicity, which necessarily goes with the need for a process of conceptual adaptation, does not mean setting the fox (I mean the scruffiness inherent in natural languages) to mind the geese (I mean the neatness of the conceptual level); it is essential if we want to develop tools truly adapted to real-life domains, and not to their idealisation, and it can be done with a high standard of rigour. This idea provides by itself no methodology; at least, endorsing it means ceasing to look for oversimplistic solutions where they obviously do not work.
References
1. Chein, M., Mugnier, M.-L.: Conceptual Graphs: Fundamental Notions. Revue d'Intelligence Artificielle vol. 6 n° 4 (1992) 365-406
2. Computational Intelligence. Special Issue on Non-Literal Language (D. Fass, J. Martin, E. Hinckelmann, eds.) vol. 8 n° 3 (Aug. 1992)
3. Gayral, F., Kayser, D., Lévy, F.: Quelle est la couleur du feu rouge du Boulevard Henri IV ? In: Référence et anaphore, Revue VERBUM tome XIX nos 1-2 (1997) 177-200
4. Kayser, D.: Une sémantique qui n'a pas de sens. Langages n° 87 (Septembre 1987) 33-45
5. Kayser, D.: Le raisonnement à profondeur variable. Actes des journées nationales du GRECO-P.R.C. d'Intelligence Artificielle, Éditions Teknea, Toulouse (1988) 109-136
6. Kayser, D.: Terme et dénotation. La Banque des Mots, n° spécial 7-1995 (1996) 19-34
7. Kayser, D.: La sémantique lexicale est d'abord inférentielle. Langue Française n° 113 "Aux sources de la polysémie nominale" (P. Cadiot et B. Habert, eds.) (Mars 1997) 92-106
8. Kayser, D., Coulon, D.: Variable-Depth Natural Language Understanding. Proc. 7th I.J.C.A.I., Vancouver (1981) 64-66
9. Lesniewski, S.: Sur les fondements de la mathématique (1927) (trad. fr. par G. Kalinowski, Hermès, Paris, 1989)
10. Montague, R.: Formal Philosophy (R. Thomason, ed.) Yale University Press (1974)
11. Nunberg, G.D.: The Pragmatics of Reference. Indiana University Linguistics Club, Bloomington (Indiana) (June 1978)
12. Pustejovsky, J.: The Generative Lexicon. The MIT Press (1995)
13. Unger, P.: There are no ordinary things. Synthese vol. 41 (1979) 117-154
Executing Conceptual Graphs
Walling R. Cyre
The Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24061-0111
[email protected]
Abstract. This paper addresses the issue of directly executing conceptual graphs by developing an execution model that simulates interactions among behavioral concepts and with attributes related to object concepts. Several researchers have proposed various mechanisms for computing or simulating conceptual graphs, but these usually rely on extensions to conceptual graphs. The simulation algorithm described in this paper is inspired by digital logic simulators and reactive-systems simulators. Behavior in conceptual graphs is described by the action, event and state concept types, along with all their subtypes. Activity in such concepts propagates over conceptual relations to invoke activity or changes in other behavioral concepts, or to affect the attributes related to object-type concepts. The challenging issues of the orderly simulation of behavior recursively described by other graphs, and of combinational relations, are also addressed.
1 Introduction
Conceptual graphs have been used to represent a great variety of knowledge, including dynamic knowledge, i.e. knowledge which describes the behavior of people, animate beings, displays and mechanisms. In simple graphs it is not difficult to mentally trace possible activities and decide whether the situation is modeled correctly and the anticipated behavior is described. In complex and hierarchical graphs, however, this may not be practical, so that validation of the description must be carried out by simulation or reasoning. From another viewpoint, simulation is useful to predict the consequences of behavior described by a conceptual graph. In this paper, mechanics for simulation by executing conceptual graphs are presented. As an example of conceptual graph execution, consider the graph of Figure 1, which describes the throwing of a ball. Suppose one wishes to execute or simulate this graph to observe the behavior it describes. As discussed in the next section, most researchers have suggested extending conceptual graphs with special nodes called actors or demons to implement behavior. Instead, consider what is necessary to execute this graph as it is. First, the throw action must be associated with some underlying procedure or type definition that knows what types of concepts may be related to it, and how the execution of the throw action affects these adjacent concepts.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 51-64, 1998. © Springer-Verlag Berlin Heidelberg 1998
In this case, executing a throw should only change the position attribute of the ball from matching that of Joe (100,350) to that of Hal (200,0), and change the tense of throw from future to past. Note that this situation has been modeled so that only attributes of concepts are modified. Had the position concept of the ball been coreferent with Joe's position, then it would be necessary for the throw procedure to modify the structure of the graph by moving the position link of the ball from Joe's position to Hal's position. Since the graph is a data structure, modifying it is not a problem. In the discussions that follow, it will be assumed that action procedures act only on the values of attributes, and that these values may be rewritten multiple times.
[Figure 1: a conceptual graph in which the action throw (tense: future) is linked by agnt to person: Joe, by obj to ball, and by dest to person: Hal; Joe and the ball each have a position attribute of (100,350), and Hal has a position attribute of (200,0).]
Fig. 1. A Description of a Behavior
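The attribute-only execution of the throw of Fig. 1 can be illustrated with a short Python sketch. This is not the paper's implementation: the dictionary layout and the function name are mine, and the graph is flattened into a map from concept names to attribute maps.

```python
# A toy rendering of the Fig. 1 graph: concepts and their attributes.
graph = {
    "throw": {"agnt": "Joe", "obj": "ball", "dest": "Hal", "tense": "future"},
    "Joe":   {"position": (100, 350)},
    "Hal":   {"position": (200, 0)},
    "ball":  {"position": (100, 350)},
}

def execute_throw(g):
    """Execute the throw by rewriting attribute values only."""
    act = g["throw"]
    # primitive copy: the destination's position overwrites the object's
    g[act["obj"]]["position"] = g[act["dest"]]["position"]
    act["tense"] = "past"      # the action has now happened

execute_throw(graph)
assert graph["ball"]["position"] == (200, 0)
assert graph["throw"]["tense"] == "past"
```

Note that only attribute values are rewritten; the links of the graph itself are left untouched, as assumed in the text.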
In this discussion, throw was assumed to have a procedure defined for it in a library of concept procedures. Such a library can easily get out of hand, so it is more desirable to have a type definition for throw in terms of a modest set of primitive actions which copy, replace, combine or otherwise operate only on values of attributes. Here, the type definition would have only a copy operation from the agent position concept to the object position concept. Figure 1 has only one isolated behavioral concept: throw. The conceptual graph of Figure 2 provides a more interesting example, in which the executions of actions and events have effects on one another. One would like to execute or simulate this description by specifying an initial condition, say that action 1 is active, the state is true, and the variable has the initial value of zero. Action 1 generates event 2, which in turn deactivates action 1 and initiates the increment action. The increment action reads its variable's value attribute, increments it and returns the sum as the result. Starting the increment also generates event 4, if it is enabled by the state being true. Firing event 4 terminates the increment and re-initiates action 1. The example of Figure 1 considered a single action, whose procedure or definition needed to know what to do with the possible adjacent concepts. Figure 2 has relations such as generator, terminator, initiator and enabler, which relate actions, events and states. As described later, to simplify the simulator, these and similar types of relations can be processed uniformly, regardless of the subtypes of the actions, events and states they are incident to. Only actions need to have 'personalized' procedures.
In the following sections, related work by other researchers is considered. Following that, a simulation algorithm is described, including its supporting mechanisms (data types), its simulation cycle, and how conceptual relations among behavioral concept types affect the execution of graphs.
[Figure 2: a conceptual graph in which action #1 is linked by generator to event #2; event #2 is linked by terminator to action #1 and by initiator to increment #3; increment #3 is linked by generator to event #4, a generation enabled (enabler) by a state; event #4 is linked by terminator to increment #3 and by initiator to action #1; the increment's operand and result relations connect it to variable #6, whose value is "0". The figure also shows a ">" comparison and the referent "7" associated with the state.]
Fig. 2. An Example Graph
2 Related Work
In 1984, Sowa stated that "Conceptual Graphs represent declarative information," and then went on to describe "... a formalism for graphs of actors that are bound to conceptual graphs." These actor nodes form (bipartite) dataflow graphs [7] with the value-type concepts of conceptual graphs. The relations between the actors and concepts are input arcs and output arcs of the actors. Sowa described an execution model for computing with conflict-free, acyclic dataflow graphs using a system of tokens. Actors can be defined recursively or by other graphs with actors. This model allows only single assignments (of referents) to (generic) value concepts. While this view of executing conceptual graphs has the computational power of dataflow models, functional dataflow graphs are not the most popular model of computation, and they require the appendage of a new node type (actors) to conceptual graphs. In addition, dataflow models are not easily derived from most natural language descriptions. In the present paper we show an execution model for general conceptual graphs without special actors. Other models of computation were considered by Delugach, including state transition models [5,6]. Another type of node, called a 'demon', was added to conceptual graphs to account for the problem of states. The argument is that if a conceptual graph is to be true, and a system can be in only one state at a time, then (state) concepts must be created and destroyed when transitions occur. This approach was extended recently by the introduction of assertion types and assertion events [12].
While demon nodes or assertion events offer the attractive capability of modifying the structure of a conceptual graph, they are external to conceptual graphs, as actors are. Here, we avoid having to create and destroy states by simply marking them as being true or false (negated). The actor graphs and problem maps of Lukose [10] are object-oriented extensions of conceptual graphs to provide executability. An actor graph consists of the type definition of an act concept supplemented with an actor whose methods are invoked by messages. Execution of an actor graph terminates in a goal state, which may be a condition for the execution of other actor graphs. Control of sequences and nesting of actor graphs is provided by problem maps. While this approach does associate executable procedures with act concept types, no attention is paid to event and proposition types. The application of conceptual graphs to governing agents, such as robots, was introduced by Mann [11]. A conceptual database includes schemata of actions the agent can perform. Commands in the form of conceptual graphs derived from natural language expressions are unified with schemata to evoke behavior. The behavior itself is produced by programs associated with demon concepts in the schemata. At about the same time, the present author considered the visual interpretation of conceptual graphs, including the pictorial depiction of their knowledge [3]. Comments were included on animating displays generated from conceptual graphs. Animation would be produced by animator procedures associated with action and event type concepts. In this case, a conceptual graph would govern a display engine rather than a robot. While both of these proposals execute conceptual graphs, each is rather specialized. Recently, simulation of conceptual graphs has been considered more generally [1]. This approach considers actions, states and events, where the underlying execution model is a state transition system.
Action concepts are preconditioned by states and events. These actions can be recursively defined as sets of actions joined by temporal relations and conditioned by states and events. Events are changes in the truth of states. These state changes are described by links to transition concepts which, in turn, are invoked by actions through ‘effect’ relations. A simulation step consists of identifying the set of actions whose preconditions are true, and selecting one for execution to advance the simulation. Time apparently advances with each time step. The user is consulted to resolve indeterminacy due to multiple enabled actions in a step. Simulation also detects unreachable actions and inconsistencies. The execution model we describe treats states and events more generally and considers the interaction between behavior concepts (actions, events, states) and values or entities. Our set of general concept types and the simulation algorithm were developed through an examination of modeling notations for computer systems rather than from considering human behavior. As discussed later, behavioral models are not limited to computer system behavior. Since the present execution model is based on computer systems [2], it is appropriate to review here some approaches used in computer simulators. This will be limited to event-driven simulators. In computer logic simulators, the only possible events are changes in values (signals). At a given point in time, one or more signals may change. The behavior of each circuit having a signal change event on any input
Executing Conceptual Graphs
55
is simulated to determine output signal events. An output event consists of a value change and the time the event will occur due to delay in the circuitry. These events are posted to a queue. Once all output events due to current events have been determined, the next future event(s) is found from the event queue and simulation time is advanced to that time so the event can be processed. If the next event occurs at the present time, the simulation time is not advanced. In a simple simulator, the code that simulates the behaviors of circuits is contained in a library. In more general digital simulators [9], the user may write processes which respond to input events and produce output events. The procedures of all processes execute once upon initial startup of the simulator. This is appropriate since hardware is always processing its inputs as long as power is applied. After startup, procedures execute only when stimulated. In a simulation cycle, all stimulated processes complete before the next cycle begins. A very general modeling notation called Statecharts is supported by a more elaborate simulation procedure [8]. Our simulation algorithm was inspired by this approach. Statecharts are founded on finite state machines. The states may be hierarchical, consisting in turn of transition diagrams. Parallel compound machines are supported, so the system can be in multiple sub-states at a time. Each transition can be triggered by an event and predicated on a condition. Both conditions and events may be Boolean combinations of other conditions and events, respectively. When a transition occurs, an action may be stimulated. Other actions may be stimulated by events as long as the system is in a particular (enabling) state. Such action invocations may be predicated on other conditions. In addition, actions and events may generate new events. The objective of the present paper is to show how conceptual graphs can be executed or simulated by an approach such as this.
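The event-queue discipline described above (events carry a value change and an occurrence time; time advances to the earliest pending event and never runs backward) can be sketched as follows. The class and method names are illustrative, not taken from any cited simulator.

```python
import heapq

# Sketch of an event-driven queue: output events carry a value change and the
# time they will occur; the queue is ordered by time, and simulation time is
# advanced to the earliest pending event, never backward.

class EventQueue:
    def __init__(self):
        self.now = 0
        self._heap = []
        self._seq = 0  # tie-breaker so simultaneous events keep posting order

    def post(self, time, signal, value):
        assert time >= self.now, "cannot schedule into the past"
        heapq.heappush(self._heap, (time, self._seq, signal, value))
        self._seq += 1

    def next_events(self):
        # advance to the earliest pending time and return all events at it;
        # if that time equals the present time, no advance happens
        if not self._heap:
            return []
        self.now = self._heap[0][0]
        batch = []
        while self._heap and self._heap[0][0] == self.now:
            _, _, signal, value = heapq.heappop(self._heap)
            batch.append((signal, value))
        return batch

q = EventQueue()
q.post(5, "clk", 1)
q.post(2, "reset", 0)
q.post(2, "enable", 1)
first = q.next_events()   # both time-2 events; time advances to 2
second = q.next_events()  # the time-5 event; time advances to 5
```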
3 Simulation Approach 3.1 Execution Semantics of Concept Types To begin to develop a general conceptual graph simulator, it is necessary to first consider the types of concepts that will participate in the simulation. In the present discussion, we consider the top-level type hierarchy of Figure 3. The behavior types actively participate in a simulation. Objects are passive elements whose attributes, such as value for variable and position for entity, can be affected by the execution of action types. State concepts describe the status of what is described by a conceptual graph, and may be pre-conditions on the execution of actions or the occurrence of events. States may be defined by the activity of actions or their relationships among attributes of objects, such as the values of variables and the positions of objects. Events are instantaneous and mark changes in activity of actions or the attributes of objects.
[Figure 3 depicts the top-level concept type hierarchy: the universal type T has subtypes object, behavior and attribute; object subsumes entity and variable; behavior subsumes action, event and state; attribute subsumes value, position and delay.]
Fig. 3. Top-Level Concept Type Hierarchy
Since conceptual graph theory allows concepts to be defined in terms of conceptual graphs, such recursively defined concepts must be accounted for in executing a graph. Consider an action that is defined by a graph that includes actions, events and states. When the defined action is executed, its graph must be executed completely before the action is completed and its consequences may be propagated on. In order to show that our approach is quite general, some traditional concept types [13] are interpreted in terms of the simulator type hierarchy of Figure 3, as shown in Table 1. Note that some concept type names conflict. For example, Sowa's event type is classified as an action here because it has an agent and does not exclude a duration. Our events have no duration and are only discontinuities in actions. The type believe is an action (operation) whose operand is a belief or proposition (state). In this paper, we will only discuss the value attribute of variables and the position attribute of entities. Note, however, that all discussions extend to any attribute, such as the age or the color of an entity. Conceptual relations describe the interaction among these concept types and direct the simulator in assigning attributes and generating events during simulation. The interpretation of conceptual relations during simulation is described in detail in a later section.
3.2 Simulation Support Mechanisms
The simulation algorithm described here is event-driven. Since event is a concept type in the type hierarchy, it is useful to define incident as the element processed by the simulation algorithm. Each incident is associated with a specific concept of the conceptual graph(s) being simulated, and has five attributes which specify: the concept
type, the concept identifier, the simulator time, the level of recursion and the type of operation to be performed. The simulator time is measured with respect to the beginning of the simulation, which occurs at time zero. The simulator may perform many cycles at a given simulator time. That is, a cycle does not necessarily advance the simulation time, and simulation time cannot be reversed. Often, incidents generated at the present time must be processed in the present time (without delay). Such incidents are assigned a zero for simulation time. The types of incidents generated and processed by the simulator are listed in Table 2. The effects of these types of incidents are described later.
Table 1. Traditional Concept Types
Traditional Type   Simulator Type
act                action
age                attribute
animal             entity
believe            action
belief             state
message            variable
color              attribute
communicate        action
contain            state
event              action
proposition        state
teacher            entity
think              action
warm               attribute
The simulator uses the collection of lists shown in Table 3 to keep track of incidents and the state of the conceptual graph. The Queue contains incidents that have not yet been processed by the simulator. The Current list contains the incidents to be processed during the current simulation cycle, and the Future list contains incidents generated during the current cycle and to be processed in a future cycle. The Current incidents are selected from Queue at the beginning of a cycle, and the Future incidents will be added to the Queue at the end of the cycle and before the next cycle begins. The Activity list keeps track of which action concepts are active. Activity describes the status of an action; an action may be active, but may not do anything during a given cycle. Those actions which must be executed during the current cycle appear in the Execute list. The True_states list keeps track of which states are true at the present time. A state concept may reflect the activity of an action, or may represent conditions defined by relationships among attributes of objects, such as values of variables or positions of entities. The Value and Position lists keep track of
the current values of variable concepts and the positions of entity concepts. In Prolog, lists are convenient structures for maintaining the status of the conceptual graph. In other implementation languages, lists of pointers might be used instead, and some lists might be eliminated entirely. For example, values and positions can be left in referent fields of concepts, but then, the graph would have to be scanned each cycle. Similarly, the status of action concepts and the truth of state concepts can be represented by the unary relations (active) and (not), respectively, incident to the concepts.
Table 2. Types of Simulation Incidents

Incident Type   Parameters                            Function
action          type, id, time, level, operation      Starts, stops or resumes an action.
event           type, id, time, level, none           Fires an event.
state           type, id, time, level, operation      Enters or exits a state (makes true or false).
variable        type, id, time, level, new_value      Assigns a new value.
entity          type, id, time, level, new_position   Assigns a new position.
Table 3. Working Lists used by the Simulator

List          Contents
Queue         Pending incidents
Current       Incidents to be processed in the current cycle
Future        Future incidents produced in the current cycle
Activity      Action concepts that are currently active
True_states   State concepts that are true
Values        Pairs of variable concepts and their current values
Positions     Pairs of entity concepts and their current positions
Execute       Action concepts to be executed in the current cycle
3.3 The Simulation Cycle
A simulation cycle consists of the following steps:
1) Get current incidents: The list of incidents, Current, to be processed during the current cycle is extracted from the incidents Queue. All incidents at the current level of recursion and having a zero time are extracted for processing in the current cycle. If no incidents with zero time exist at the current level, then the level is raised. If no incidents with zero time exist at the top level, simulation time must be advanced. Then, the incident(s) with the earliest time and the deepest level are extracted for processing during this cycle. The simulation level is set to that level and simulator time is advanced to that time. Current incidents are deleted from the Queue.
2) Update attributes: Process attribute (value and position) change incidents by changing these attributes of the affected concepts. This may result in an immediate change in whether some states (conditions) are true, so the True_states list may be affected. Incidents are deleted from the Current list as they are processed.
3) Process remaining incidents: Process state, action and event incidents from the Current list. The order of processing these incidents is immaterial since any consequences are placed in the Future list for processing during a later simulation cycle.
4) Execute actions: Finally, action concepts to be executed during the current cycle are executed in this step. This must be last, since event occurrences and state changes stimulate and precondition activities, respectively.
5) Update Queue: Append the Future list to the Queue.
The manner of processing the various types of incidents is described in the following paragraphs. Note again that an incident is processed only if the simulator time has caught up with the incident time. Attribute (value and position) incidents specify changes in values of variables and positions of entities, and the time they are to occur.
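The five-step cycle can be sketched compactly. This is a hedged reading of the text, not the authors' implementation: incidents are `(type, id, time, level, op)` tuples, `handlers` is an assumed table of per-type processing functions that return consequent incidents, and action execution (step 4) is elided.

```python
# Minimal sketch of one simulation cycle over a list-based incident queue.

def simulation_cycle(queue, now, level, handlers):
    # 1) Get current incidents: zero-time incidents at the current level;
    #    otherwise move to the deepest level holding zero-time incidents;
    #    otherwise advance simulation time to the earliest pending incident.
    current = [i for i in queue if i[2] == 0 and i[3] == level]
    if not current:
        if any(i[2] == 0 for i in queue):
            level = max(i[3] for i in queue if i[2] == 0)
            current = [i for i in queue if i[2] == 0 and i[3] == level]
        else:
            now = min(i[2] for i in queue)
            level = max(i[3] for i in queue if i[2] == now)
            current = [i for i in queue if i[2] == now and i[3] == level]
    for i in current:
        queue.remove(i)
    future = []
    # 2) Update attributes first: value/position changes may alter states.
    for i in [i for i in current if i[0] in ("variable", "entity")]:
        future += handlers[i[0]](i)
    # 3) Process the remaining state, action and event incidents.
    for i in [i for i in current if i[0] in ("state", "action", "event")]:
        future += handlers[i[0]](i)
    # 4) Execution of stimulated actions would go here.
    # 5) Update Queue: consequences wait for a later cycle.
    queue.extend(future)
    return now, level

queue = [("variable", "v", 0, 0, ("new_value", 3)),
         ("action", "a", 4, 0, "start")]
handlers = {"variable": lambda i: [], "action": lambda i: []}
now, level = simulation_cycle(queue, 0, 0, handlers)   # zero-time incident
now, level = simulation_cycle(queue, now, level, handlers)  # time advances to 4
```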
In each case, the affected concept is located and its appropriate attribute is changed, both in the graph and in the Values and Positions lists. These incidents must be processed before action and event incidents, since value and position changes can affect the truth of preconditions for executing actions and firing events. Two types of state incidents are defined. An enter state incident causes the state concept to be added to the True_states list, and an exit incident removes the state concept from this list. State incidents are generated explicitly by actions, and so do not account for changes in status of state concepts due to changes in activity of actions and the attributes of variables or entities. For example, the state incidents
incident(service, #, _, exit) and incident(idle, #, _, enter) are indicated by the expression "Reset changes the mode from service to idle." An event incident fires the indicated event concept in a conceptual graph. The consequences of firing an event are determined by the conceptual relations incident with the event. Incidents generated through conceptual relations are described shortly. Action incidents change the activity of action concepts by adding them to or removing them from the Activity list. A stop incident removes the action. A start incident or resume incident places the action onto the Activity list. A start incident invokes an execution of the action with initial values and positions for associated variables and entities. A resume incident invokes an execution of the action using the last values or positions. This supports persistence in actions. Once incidents have been processed and changes in objects and states are completed, the action concepts stimulated into execution are processed. Action concepts may have operands they operate on to produce results. These will be value or position attributes of other concepts. If an action is executed by a start incident, it is reset to initial values/positions before execution begins. Otherwise, the current values/positions are used. Execution may not only generate value or position incidents with respect to result concepts, but may also generate event, state and other action incidents. To generate value and position events, a procedure for transforming inputs to outputs must be available. Rather than defining actor or demon nodes as part of an extended conceptual graph to account for these procedures, we follow the pattern of digital simulators that have libraries of procedures for simulating the behavior of action concepts. That is, there is a collection of primitive actions which the simulator knows how to process.
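The two state incidents derived from "Reset changes the mode from service to idle" simply move state concepts on and off the True_states list. A small sketch (the function name is an assumption):

```python
# Processing enter/exit state incidents against the True_states set.

def process_state_incident(true_states, concept, op):
    if op == "enter":
        true_states.add(concept)       # make the state true
    elif op == "exit":
        true_states.discard(concept)   # make the state false (negated)
    return true_states

true_states = {"service"}
process_state_incident(true_states, "service", "exit")
process_state_incident(true_states, "idle", "enter")
```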
Other actions can be recursively defined in terms of schemata employing these primitive actions. So, to execute a complex action, its schema is executed. But the schemata may take multiple cycles to execute. To satisfy this requirement, the simulator has levels of cycles. That is the function of the level parameter of the incidents. All current incidents at the deepest level of recursion must be completed before any higher-level incidents are processed. It is possible that some primitive actions may be invoked at different levels at the same simulation time. The present simulation strategy executes the action at the deepest level first. The simulation cycle is not completed until all action concepts have completed their execution, that is, suspended themselves. Self-suspension here means the computation is complete, and has nothing to do with terminating the activity of the action, unless the action generates a stop incident to itself.
3.4 Conceptual Relations and the Production of New Incidents
As described thus far, the simulator consumes incidents but produces none, so a simulation would soon die out. New incidents are produced by the conceptual relations incident to firing events and executing actions. Table 4 shows a collection of relation types among behaviors and objects. Although their names may be unfamiliar, these relations account for most interactions among concepts, with the exception of attributes. Only binary relations are shown in the table; some ternary relations will be
considered later. The challenge in developing a model for executing conceptual graphs is to determine which incidents are generated by the various relations, and how combinations of relations incident with concepts interact.
Table 4. Signatures of Selected Conceptual Relations
(each row concept 'has' the listed relations with the column concept given in parentheses)

entity:   part (entity); position, color, age, structure (attribute)
variable: value (attribute)
action:   agent, patient, source, destination (entity); operand, result (variable); cause, deactivator, temporal, part (action); generator (event); make_true, make_false, if, enabler (state); status (attribute)
event:    initiator, resumor, terminator (action); trigger, temporal, part (event); entrance, exit (state)
state:    part (state)
In Table 4, a relation is interpreted as the row concept 'has relation' with the column concept, such as an event has initiator action. This interpretation yields some unusual relation type names, but is traditional in conceptual graphs, and is necessary when considering combinations of relations incident with behavior concepts. First, consider relations actions have with behaviors (Row 3 in Table 4). When an event incident to an initiator relation fires, it will generate an action start incident for the related action, with zero time and the current level of recursion, e.g. incident(action, Id, 0, L, start). Similarly, the suspendor and resumor relations will generate stop and resume action incidents for their related actions. A single action incident may not be sufficient to stimulate the execution of the action. If the action has one or more if relations with states and the states are not true (on the True_states list), then the action will not execute. In addition, an action must have an action incident on each of its initiator or resumor relations to execute, since conceptual graph theory interprets multiple relations of the same type incident to a concept as conjoined (ANDed). Disjunction of relations is not conveniently represented in conceptual graphs. For this purpose, we define new relations or and xor to synthesize artificial disjunctive concepts. The relation xor indicates exclusive-or. Since complex combinations cannot be expressed this way, introduction of artificial concepts is necessary, as in the example graph in Figure 4, which indicates that action A executes only if states S1 and S2 are true and if a start incident was produced from event E5 as well as event E3 or E4 or both E1 and E2. That is, the condition (S1 and S2 and E5 and ((E1 and E2) or
E3 or E4)). Event [event: *1] was artificially introduced to represent the event that E3 or E4 or [event: *2] occurred. Event [event: *2] is the event that E1 and E2 occur simultaneously.
[action: A]
  (initiator) -> [event: *1]
    (or) -> [event: *2]
      (part) -> [event: E1]
      (part) -> [event: E2],
    (or) -> [event: E3]
    (or) -> [event: E4]
  (initiator) -> [event: E5]
  (if) -> [state: S1]
  (if) -> [state: S2].
Fig. 4. Complex Conditioning of Action Execution.
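The enabling condition encoded by Figure 4 can be evaluated with a few lines of code. This is a sketch of the condition only; the function name and the use of sets for fired events and true states are assumptions.

```python
# Evaluating the Figure 4 condition: S1 and S2 and E5 and ((E1 and E2) or E3 or E4).
# The artificial events *1 and *2 become intermediate boolean values.

def action_enabled(fired, true_states):
    star2 = "E1" in fired and "E2" in fired          # [event: *2]: E1 and E2 together
    star1 = star2 or "E3" in fired or "E4" in fired  # [event: *1]: E3 or E4 or *2
    ifs = "S1" in true_states and "S2" in true_states
    return star1 and "E5" in fired and ifs

ok = action_enabled({"E3", "E5"}, {"S1", "S2"})   # enabled via E3
blocked = action_enabled({"E3", "E5"}, {"S1"})    # S2 is false, so not enabled
```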
Table 5 shows which types of incidents are generated by some of the relations identified in Table 4. Thus far the generation of the time parameter of incidents has not been addressed, so all incidents generated with the above relations have zero (present) time and the simulation time never advances. To introduce time, it is necessary to add a set of ternary relations comparable to the relations of Table 4. For example, the relation initiator_after has the signature shown in Figure 5.
[ action ] -> (initiator_after) [1] -> [event] [2] -> [delay],. Fig. 5. Signature for initiator_after relation.
During firing of the event, then, incident(action,#,T,start) is added to the Future list, where the value of T is the current simulation time plus the delay. Table 4 also shows temporal relations among actions and events. These may include interval relations, endpoint relations and point relations [4]. Although temporal relations seem to imply causality, we interpret them here as constraints. For example, [action: A1] -> (starts_when_finishes) -> [action: A2]
does not cause a start action incident for action A1 to be generated when action A2 terminates. Instead, the simulator must check, when action A1 is initiated, that action A2 has terminated, and post an exception if this is not the case. Alternatively, temporal relations could be interpreted as causal, in which case the temporal and other behavioral relations can be checked statically for consistency. Similarly, duration relations incident to actions can be used as constraints to check if a stop incident occurs with the appropriate delay after a start incident, or the duration can be used when the action starts to generate a stop incident that terminates the action.
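The constraint reading of temporal relations can be sketched as follows; the function and list names are illustrative, not the paper's.

```python
# Temporal relations as constraints, not causes: initiating A1 checks that A2
# has already terminated, and records an exception otherwise.

def check_starts_when_finishes(activity, exceptions, a1, a2):
    if a2 in activity:  # A2 still active when A1 is initiated
        exceptions.append(f"{a1} started before {a2} finished")
    activity.add(a1)    # A1 becomes active regardless; the violation is logged

activity, exceptions = {"A2"}, []
check_starts_when_finishes(activity, exceptions, "A1", "A2")
```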
Table 5. Incidents Produced by Conceptual Relations

Concept Activity   Incident Relation   Consequent Incident
Fire event         initiator           action start
                   resumor             action resume
                   terminator          action stop
                   trigger             event
                   entrance            state enter
                   exit                state exit
Execute action     cause               action start
                   deactivator         action stop
                   generator           event
                   make_true           state enter
                   make_false          state exit
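Table 5 is naturally rendered as a lookup table in an implementation. The sketch below assumes the dictionary and function names; only the relation-to-incident pairs come from the table.

```python
# Table 5 as a lookup: each conceptual relation incident to a firing event or
# an executing action yields a consequent incident type (and operation).

CONSEQUENT = {
    "fire_event": {
        "initiator": ("action", "start"),
        "resumor": ("action", "resume"),
        "terminator": ("action", "stop"),
        "trigger": ("event", None),
        "entrance": ("state", "enter"),
        "exit": ("state", "exit"),
    },
    "execute_action": {
        "cause": ("action", "start"),
        "deactivator": ("action", "stop"),
        "generator": ("event", None),
        "make_true": ("state", "enter"),
        "make_false": ("state", "exit"),
    },
}

def consequents(activity, relations):
    # map the relations incident to a concept onto the incidents they produce
    return [CONSEQUENT[activity][r] for r in relations]

out = consequents("fire_event", ["initiator", "entrance"])
```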
4 Conclusions
Mechanisms for simulating hierarchical conceptual graphs without introducing special nodes such as actors or demons have been described. To support execution of graphs, concept types are classified as behavior (action, event, state), object (entity, variable) and attribute. Execution is performed by procedures associated with action types that operate on object attributes, and by procedures associated with conceptual relations among behavior concepts. Although the simulation strategy was inspired by digital system simulators, the approach has been shown to be applicable to general concept types and relations.
5 Acknowledgments This work was funded in part by the National Science Foundation, Grant MIP-9707317.
References
1. C. Bos, B. Botella and P. Vanheeghe, "Modelling and Simulating Human Behaviours with Conceptual Graphs," Proc. 5th Int'l Conf. on Conceptual Structures, Seattle, WA, 275-289, August 3-8, 1997.
2. W. Cyre, "A Requirements Language for Automated Analysis," International Journal of Intelligent Systems, 10(7), 665-689, July 1995.
3. W. R. Cyre, S. Balachandar, and A. Thakar, "Knowledge Visualization from Conceptual Structures," Proc. 2nd Int'l Conf. on Conceptual Structures, College Park, MD, 275-292, August 16-20, 1994.
4. W. R. Cyre, "Acquiring Temporal Knowledge from Schedules," in G. Mineau, B. Moulin, J. Sowa, eds., Conceptual Graphs for Knowledge Representation, Springer-Verlag, NY, 328-344, 1993. (ICCS'93)
5. H. Delugach, "Dynamic Assertion and Retraction of Conceptual Graphs," Proc. 7th Workshop on Conceptual Structures, Binghamton, NY, July 11-13, 1991.
6. H. Delugach, "Using Conceptual Graphs to Analyze Multiple Views of Software Requirements," Proc. 6th Workshop on Conceptual Structures, Boston, MA, July 29, 1990.
7. J. Dennis, "First Version of a Data Flow Procedure Language," Lecture Notes in Computer Science, Springer-Verlag, NY, 362-376, 1974.
8. D. Harel and A. Naamad, The STATEMATE Semantics of Statecharts, i-Logix, Inc., Andover, MA, June 1995.
9. R. Lipsett, C. F. Schaefer and C. Ussery, VHDL: Hardware Description and Design, Kluwer Academic, Boston, 1989.
10. D. Lukose, "Executable Conceptual Structures," Proc. 1st Int'l Conf. on Conceptual Structures, Quebec City, Canada, 223-237, August 4-7, 1993.
11. G. Mann, "A Rational Goal-Seeking Agent using Conceptual Graphs," Proc. 2nd Int'l Conf. on Conceptual Structures, College Park, MD, 113-126, August 16-20, 1994.
12. R. Raban and H. S. Delugach, "Animating Conceptual Graphs," Proc. 5th Int'l Conf. on Conceptual Structures, Seattle, WA, 431-445, August 3-8, 1997.
13. J. Sowa, Conceptual Structures, Addison-Wesley, Reading, MA, 1984.
From Actors to Processes: The Representation of Dynamic Knowledge Using Conceptual Graphs Guy W. Mineau Department of Computer Science Université Laval Quebec City, Canada tel.: (418) 656-5189 fax: (418) 656-2324 email:
[email protected]
Abstract. The conceptual graph formalism provides all necessary representational primitives needed to model static knowledge. As such, it offers a complete set of knowledge modeling tools, covering a wide range of knowledge modeling requirements. However, the representation of dynamic knowledge falls outside the scope of the current theory. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To allow the representation of processes, we need a way to represent state changes in a conceptual graph based system. Consequently, the theory should be extended to include the description of processes based on the representation of assertions and retractions about the world. This paper extends the conceptual graph theory in that direction, taking into account the implementation considerations that such an extension entails.
1 Introduction
This paper introduces a second-order knowledge description primitive into the conceptual graph theory, the process statement, needed to define dynamic processes. It explains how such processes can be described, implemented and executed in a conceptual graph based environment. To achieve this goal, it also introduces assert and retract operations. The need for the description of processes came from a major research and development project conducted at DMR Consulting Group in Montreal, where a corporate memory was being developed using conceptual graphs as its representation formalism. Among other things, the company's processes needed to be represented. Although they are currently described in a static format using first-order conceptual graphs, advanced user support capabilities will eventually require them to be explained, taught, updated and validated. For that purpose, we need to provide for their execution, and thus, for their representation as dynamic knowledge. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To represent processes, we need a way to describe state changes in a conceptual graph based system. We decided to use assertion and retraction operations as a means to describe state changes. Therefore, the definition of a process that we put forth in this paper is based on such operations. Generally, processes can be described using algorithmic languages. These languages are mapped onto state transition machines, such as computers. So, a process
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 65-79, 1998. © Springer-Verlag Berlin Heidelberg 1998
66
G.W. Mineau
can be described as a sequence of state transitions. A transition transforms a system in such a way that its previous state gives way to a new state. These previous and new states can be described minimally by conditions, called respectively pre and postconditions, which characterize them. The preconditions of a transition form the smallest set of conditions that must conjunctively be true in order for the transition to occur; its postconditions can be described in terms of assertions to and retractions from the previous state. Thus, transitions can be represented by pairs of pre and postconditions. Processes can be defined as sequences of transitions, where the postconditions of a transition match the preconditions of the next transition. The triggering of a transition may be controlled by different mechanisms; usually, it depends solely on the truth value of the preconditions of the transition. This simplifies the control mechanism which needs to be implemented for the execution of processes; therefore, this is the approach that we advocate. Section 2 reviews the existing conceptual graph (cg) literature on processes. Section 3 presents an example that shows how a simple process can be translated into a set of transitions. Section 4 describes the process statement that this paper introduces. Finally, because of its application-oriented nature, this paper also addresses the implementation issues related to the engineering of such a representation framework; section 5 covers these issues.
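The transition model just described (a transition fires as soon as its preconditions all hold; its postconditions are assertions and retractions against the current state) can be sketched as a tiny engine. This is an illustrative reading, not the paper's implementation; the door example and all names are assumptions.

```python
# A transition is (preconditions, assertions, retractions), each a set of
# facts. A transition fires when its preconditions are conjunctively true;
# firing retracts and asserts facts, producing the next state.

def run_process(state, transitions):
    remaining = list(transitions)
    progress = True
    while progress:
        progress = False
        for t in remaining:
            pre, assertions, retractions = t
            if pre <= state:  # all preconditions hold in the current state
                state = (state - retractions) | assertions
                remaining.remove(t)  # each transition fires once in sequence
                progress = True
                break
    return state

# door example: closing the door, then locking it
transitions = [
    (frozenset({"open"}), frozenset({"closed"}), frozenset({"open"})),
    (frozenset({"closed", "key"}), frozenset({"locked"}), frozenset()),
]
final = run_process(frozenset({"open", "key"}), transitions)
```

Note how the postconditions of the first transition (asserting "closed") establish the preconditions of the second, which is exactly the chaining the text describes.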
2 The CG Literature on Processes
Delugach introduced a primitive form of demons in [1]. Demons are processes triggered by the evaluation of some preconditions. Delugach's demons take concepts as input parameters, but assert or retract concepts as the result of their actions, contrary to actors, which only compute individuals of a predetermined type. Demons are thus a generalization of actors. They can be defined using other demons as well. Consequently, they allow the representation of a broader range of computation. We extended these ideas by allowing a demon to have any cg as input and output parameters. Consequently, our processes are a generalization of Delugach's demons. We kept the same graphical representation as Delugach's, using a labeled double-lined diamond box. However, we had to devise a new parameter passing mechanism. We chose context boxes since what we present here is totally compatible with the definition of contexts as formalized in [2]. Similarly to [3], we chose state transitions as a basis for representing a process, except that we do not require all execution paths 1 to be defined explicitly; the execution of a process will create this path dynamically. This simplifies the process description activity. There is much work in the cg community on extending the cg formalism to include process related primitives [4, 5, 6, 7, 8]. One of the main motivations behind these efforts is the development of an object oriented architecture on top of a cg-based system [12]. As we advocate, [7] focuses on simple primitives to allow the modeling of processes. The process description language that we foresee could be extended to include high-level concepts as proposed in [7]. In [9], we explain how our two approaches complement each other. Also, [12] uses transitions as a basis for describing behaviour and [11] uses contexts as pre and postconditions for modeling behaviour. The work presented here is
1
An execution path is defined as a possible sequence of operations, i.e., of state transitions, according to some algorithm.
totally compatible with what is presented in these two papers; furthermore, 1) it adds a packaging facility, 2) it deals with the implementation details that yield a full definition and execution framework for processes, and 3) it is completely compatible with the definition of contexts as formalized in [2], providing a formal environment for using contexts as state descriptions. In what follows, we present a framework to describe processes in such a way that: 1) both dynamic and static knowledge can be defined using simple conceptual graphs (in a completely integrated manner), 2) they can be easily executed using a simple execution engine, and 3) inferences on processes are possible in order to validate them, produce explanations and support other knowledge-dependent tasks.
3 From Algorithms to Pre and Postcondition Pairs
We believe that a small example will be sufficient to show how a simple process, the iterative factorial algorithm, can be automatically translated into a set of pre/postcondition pairs. From this example, the definitions and explanations given in sections 4 and 5 below will then become easier to present and justify. Let Figure 1 illustrate the process that we want to represent as a set of pre/postcondition pairs. Since a process relies on a synchronization mechanism to properly sequence the different transitions that it is composed of, and since we wish to represent transitions only in terms of pre and postconditions (for implementation simplicity), we decided to include the sequencing information into the pre/postconditions themselves 2. With an algorithmic language such as C, variable dependencies and boolean values determine the proper sequence of instructions. Then it is rather easy to determine the additional conditions that must be inserted in the pre and postconditions for them to represent the proper sequence structure of the algorithm. Without giving a complete algorithm that extracts this sequence structure, Figure 2 provides the synchronization graph of the algorithm of Figure 1. The reader will find it easy to verify its validity, knowing that all non-labeled arcs determine variable dependencies between different instructions, that arcs marked with T and F indicate a dependency on a boolean value, and that arcs marked as L indicate a loop.

l0: int fact(int n)
l1: { int f;
l2:   int i;
l3:   f = 1;
l4:   i = 2;
l5:   while (i

[TYPE:T1]
If in the concept type lattice T2 < T1
then add G in the viewpoint base
else you cannot create a viewpoint relation between T2 and T1
EndProgram
• instantiation of a concept type Tc by a referent ref

Program instantiation (I: Tc, ref)
if [Tc:ref] is already present in the instantiation base
then ok for the instantiation
else if Tc is a basic concept type
then creation_description(I: Tc, ref, instantiation base)
else if Tc is a v_oriented concept type
then Tb := basic_concept_type_associated(I: viewpoint base)
  if [Tb:ref] is already present in the instantiation base
  then creation_viewpoint_instance(I: Tb, Tc, ref, instantiation base, O: List_types_instantiated)
       creation_other_instantiation(I: Tb, Tc, ref, List_types_instantiated, viewpoint base, instantiation base)
  else add [T:ref][C2]
• is_generalized_viewpoint(C1,C2) iff is_specialized_viewpoint(C2,C1)
• is_equivalent_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ Τ ∧ ∃G in viewpoint base such that G:[TYPE:type(C1)]->(equiv)->[TYPE:type(C2)]
• is_inclusion_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ Τ ∧ ∃G in viewpoint base such that G:[TYPE:type(C1)]->(incl)->[TYPE:type(C2)]
• is_exclusion_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ Τ ∧ ∃G in viewpoint base such that G:[TYPE:type(C1)]->(excl)->[TYPE:type(C2)]
• is_more_generalized_concept(C1,C2) iff is_generalization(C1,C2) ∨ is_generalized_viewpoint(C1,C2) ∨ is_generalization&conceptualization(C1,C2)
New relations among elementary links of different conceptual graphs

Let CG1 = (C1, R1, A1) and CG2 = (C2, R2, A2) be the conceptual graphs to be compared. We define additional kinds of relations possible between elementary links of CG1 and CG2, respectively denoted link1 = rel1(C11...C1n) and link2 = rel2(C21...C2n), where rel1 and rel2 have the same arity:
• is_concept_total_viewpoint_specialization(link1,link2) iff type(rel1)=type(rel2) ∧ ∀i∈[1..n], is_specialized_viewpoint(adj(i,rel1), adj(i,rel2))
• is_concept_partial_viewpoint_specialization(link1,link2) iff type(rel1)=type(rel2) ∧ ∀i∈[1..n], (is_specialized_viewpoint(adj(i,rel1), adj(i,rel2)) ∨
2. In this article we do not define the same relation if type(rel1) ≠ type(rel2).

• CG2 is "a totally equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2->C1, hr: R2->R1, ha: A2->A1) from CG2 to CG1 such that ∀link2∈A2, is_concept_total_equivalent(link1, ha(link2))
• CG2 is "a partially equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2->C1, hr: R2->R1, ha: A2->A1) from CG2 to CG1 such that ∀link2∈A2, (is_concept_partial_equivalent(link1, ha(link2)) ∨ is_same_link(link1, ha(link2))) ∧ ∃link2∈A2 such that is_concept_partial_equivalent(link1, ha(link2))
• CG2 is "a totally included graph" of CG1 iff ∃ a graph morphism (hc: C2->C1, hr: R2->R1, ha: A2->A1) from CG2 to CG1 such that ∀link2∈A2, is_concept_total_inclusion(link1, ha(link2))
• CG2 is "a partially included graph" of CG1 iff ∃ a graph morphism (hc: C2->C1,
Management of a Corporate Memory in Concurrent Engineering
hr: R2->R1, ha: A2->A1) from CG2 to CG1 such that ∀link2∈A2, (is_concept_partial_inclusion(link1, ha(link2)) ∨ is_same_link(link1, ha(link2))) ∧ ∃link2∈A2 such that is_concept_partial_inclusion(link1, ha(link2))
• CG2 is "an exclusion graph" of CG1 iff ∃ a graph morphism (hc: C2->C1, hr: R2->R1, ha: A2->A1) from CG2 to CG1 such that ∃link2∈A2, is_concept_exclusion(link1, ha(link2))
4.2.0.1 Strategies for integrating a proposition into the artefact

Once all relations between elementary links are known, [3] proposes several strategies to integrate the two compared conceptual graphs. In the artefact, the information or knowledge must be as precise as possible, so we detail the different cases of relations that can hold between two conceptual graphs:
• If one of the relations of specialization, detailed in [3], holds, then we apply the "strategy of the highest direct specialization": if the proposition is more precise than a description in the artefact, and uses a more precise expression, we prefer to restrict what was expressed in the artefact.
• If one of the relations of instantiation holds, then we apply the "strategy of the highest direct instantiation".
• If a relation of equivalence or partial equivalence holds, then we can either keep the two graphs in the different viewpoints they belong to, or choose one of them.
• If a relation of inclusion or partial inclusion holds, we keep the graph that includes the other, because the information in the included graph is already present in the including graph.
• If an exclusion relation holds, the proposition is not valid. This case cannot happen in this context of integrating a solution into the artefact, but it is useful in the "evaluate" task.
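The case analysis above amounts to a dispatch on the kind of relation detected between the proposition and the artefact. A minimal Java sketch, with the strategy names paraphrased from the bullets (the actual COGITO implementation is not shown in the paper):

```java
// Illustrative sketch only: map each detected relation kind to the
// integration strategy described in the text. Names are invented.
public class Integrator {
    enum Relation { SPECIALIZATION, INSTANTIATION, EQUIVALENCE, INCLUSION, EXCLUSION }

    static String strategy(Relation r) {
        switch (r) {
            case SPECIALIZATION: return "highest direct specialization";
            case INSTANTIATION:  return "highest direct instantiation";
            case EQUIVALENCE:    return "keep both graphs in their viewpoints, or choose one";
            case INCLUSION:      return "keep the including graph";
            case EXCLUSION:      return "reject the proposition";
            default:             throw new IllegalArgumentException("unknown relation");
        }
    }
}
```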
5 Conclusion

In this paper, we have presented an approach to the construction of a corporate memory in CE that takes into account the different states of the artefact during the design process. We propose a representation of the artefact with viewpoints and conceptual graphs, based on previous work on CGs and on an adaptation of viewpoints and viewpoint management to this problem. We introduce the notions of description viewpoints and expertise viewpoints. We propose several algorithms (not detailed in this paper) for building and maintaining the CG base representing the artefact. Related work exists in Design Rationale: [9] proposes to use case libraries to represent past experiences, and [6] focuses on the usability of design rationale documents, but neither uses the conceptual graph formalism. Our approach using viewpoints and CGs can be extended to the different elements of a corporate memory detailed in [6], and it takes care of the variety of information sources and of the context in which knowledge or information must be understood. The implementation of this work is in progress, using the COGITO platform. In further work we will organize the answers of the algorithms according to the different levels of viewpoints. Our aim is to propose a support system allowing the management of a CE project memory, based first on the artefact, but it must also be
extended to the propositions (for the Design Rationale part) of [11], i.e. it must take into account not only the history of the artefact but also the different choices possible during the CE process, represented by the different design propositions. In this way, we could use the comparison algorithm efficiently to see the differences between several propositions.
References
1. Carbonneill, B., Haemmerlé, O., Rock: Un système de question/réponse fondé sur le formalisme des graphes conceptuels, In Actes du 9ème Congrès Reconnaissance des Formes et Intelligence Artificielle, Paris, p. 159-169, 1994.
2. Cointe, C., Matta, N., Ribière, M., Design Propositions Evaluation: Using Viewpoints to Manage Conflicts in CREoPS2, In Proceedings of ISPE/CE, Concurrent Engineering: Research and Applications, Rochester, August 1997.
3. Dieng, R., Hug, S., MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts, In Proceedings of ICCS'98, Springer-Verlag, Montpellier, France, August 1998.
4. Finch, I., Viewpoints - Facilitating Expert Systems for Multiple Users, In Proceedings of the 4th International Conference on Database and Expert Systems Applications, DEXA'93, Springer-Verlag, 1993.
5. Gerbé, O., Conceptual Graphs for Corporate Knowledge Repositories, In Proceedings of ICCS'97, Springer-Verlag, Seattle, Washington, USA, August 1997.
6. Karsenty, L., An Empirical Evaluation of Design Rationale Documents, Electronic Proceedings of CHI'96, [http://www.acm.org/sigchi/chi96/proceedings/papers/Karsenty/lk_txt.htm], 1996.
7. Leite, J., Viewpoints on Viewpoints, In Proceedings of Viewpoints 96: An International Workshop on Multiple Perspectives in Software Development, San Francisco, USA, 14-15 October 1996.
8. Marino, O., Rechenmann, F., Uvietta, P., Multiple Perspectives and Classification Mechanism in Object-Oriented Representation, In Proceedings of the 9th ECAI, Stockholm, Sweden, p. 425-430, Pitman Publishing, London, August 1990.
9. Prasad, M.V.N., Plaza, E., Corporate Memories as Distributed Case Libraries, In Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Alberta, Canada, November 9-14, p. 40-1 - 40-19, 1996.
10. Ribière, M., Dieng, R., Introduction of Viewpoints in Conceptual Graph Formalism, In Proceedings of ICCS'97, Springer-Verlag, Seattle, USA, August 1997.
11. Ribière, M., Matta, N., Guide for the Elaboration of a Corporate Memory in CE, submitted to the 5th European Concurrent Engineering Conference, Erlangen-Nuremberg, Germany, April 26-29, 1998.
12. Tichkiewitch, S., Un modèle multi-vues pour la conception intégrée, In Summer School on "Entreprises communicantes: Tendances et Enjeux", Modane, France, 1997.
13. Sowa, J.F., Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, Reading, 1984.
WebKB-GE — A Visual Editor for Canonical Conceptual Graphs

S. Pollitt (1), A. Burrow (1), and P.W. Eklund (2)

(1) Department of Computer Science, University of Adelaide, Australia 5005
sepollitt/[email protected]
(2) School of Information Technology, Griffith University, Parklands Drive, Southport, Australia 9276
[email protected]

Abstract. This paper reports a CG editor implementation which uses canonical formation as the direct manipulation metaphor. The editor is written in Java and embedded within the WebKB indexation tool. The user's mental map is explicitly supported by a separate representation of a graph's visual layout. In addition, co-operative knowledge formulation is supported by network-aware work-sharing features. The layout language and its implementation are described, as well as the design and implementation features.
1 Introduction
Display form conceptual graphs (CGs) provide information additional to the graph itself. An editing tool should therefore preserve layout information. For aesthetic reasons a regular layout style is also preferred. However, one consideration of good CG layout (as opposed to a general graph layout) is that understandability is the primary goal rather than attractiveness [2]. The editor we describe (WebKB-GE) limits manipulation on the graph to the canonical formation rules [16] (copy, restrict, join, simplify). Atomic graphs are also canonical and therefore any CG constructed using WebKB-GE will be canonical. WebKB [12] is a public domain experimental knowledge annotation toolkit. It allows indices of any Document Elements (DEs) on the WWW to be built using annotations in CGs. This permits the semantic content, and relationships to other DEs, to be precisely described. Search is initiated remotely, via a WWW-browser and/or a knowledge engine. This enables construction of documents using inference within the knowledge engine to assemble DEs. Additionally, the knowledge base provides an alternate index through which both query and direct hyperlink navigation can occur. WebKB has been built using Javascript and Java for the WWW-based interface and C and C++ for the inference engines. One of the goals of the WebKB toolkit is to aid Computer Supported Co-operative Work (CSCW). WebKB-GE is integrated into WebKB and therefore multi-user/distributed features are implemented.
2 Design Goals
WebKB-GE is designed to be used by domain experts in a distributed co-operative environment. This means: (i) domain dependent base languages must be distributed; (ii) co-operation depends on a shared understanding of a base language; (iii) domain experts are not necessarily experts in CG theory; (iv) large, collaborative domain knowledge bases are difficult to navigate; (v) a medium for collaborative communications must be provided. The design of WebKB-GE supports the construction of accurate, well-formed CGs, allowing the user to experience a CG canon's expressiveness. This is achieved through a direct manipulation interface. The properties of a graph's depiction are explicitly stored between sessions. WebKB-GE is designed to operate as a client tool in a distributed environment.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 111-118, 1998.
© Springer-Verlag Berlin Heidelberg 1998

2.1 Direct Manipulation
Direct manipulation (DM) allows complex commands to be activated by direct and intuitive user actions. DM is the visibility of the object of interest; rapid, reversible, incremental actions; and replacement of a complex command language by direct manipulation of the object of interest [15]. It should allow the user to form a model of the object represented in the interface [4]. A well-recognised subclass of DM interfaces is the graphical object editor [17], where the subject is edited through interaction with a graphical depiction. Unidraw [18] is a toolkit explicitly designed to support the construction of graphical object editors. WebKB-GE is an example of a graphical object editor handling CGs. Central to the DM interface is the Editing/Manipulation Pane. This contains a number of Objects manipulated by Tools. Relations between objects are also indicated. Objects provide visual representations of complex entities. The Concepts and Relations in CGs are graphical objects in WebKB-GE. A concept may contain additional information (an individual name for example) but only the Type is displayed in the visual representation. A palette of tools is provided to manipulate objects. The behaviour of a tool may be a function of the manipulated object, so each tool expresses an abstract operation. "Operation animation" is an essential feature of a DM interface. An operation like "move" allows the user to see the object dragged from the start to the finish point. Visual feedback about the success or failure of an operation must be provided.

2.2 Canonical Graphs
WebKB is aimed at the domain expert. It is important to restrict the graphs to those derivable from a canon. To ensure canonical graphs, the only operations allowed are the canonical formation rules [16]: (i) Copy – a copy of a canonical graph is canonical; (ii) Restrict – a more general type may be restricted to a more specific type (as defined in the type hierarchy); also, a generic concept type may be replaced by an individual object of that type; (iii) Join – two canonical sub-graphs containing an identical concept are joined at that concept; (iv) Simplify – when a relation is duplicated between identical concepts the duplicates are redundant and removed. A distinction is made between operations that affect the graph and operations that affect the representation of the graph. Each of the four canonical operations operates on the underlying graph, and the visual representation is updated accordingly. These are the only operations allowed on the graph itself. Operations on the representation of the graph, such as a "Move" operation, are also allowed.
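The restrict rule can be made concrete with a toy type hierarchy. The following Java sketch is illustrative only (it is not the WebKB-GE code; the hierarchy and names are invented): restriction succeeds exactly when the new type lies below the old one, so any graph built this way stays canonical.

```java
import java.util.*;

// Toy type hierarchy (child -> parent) and the "restrict" formation rule.
public class Canon {
    static final Map<String, String> parent = Map.of(
        "Cat", "Animal",
        "Animal", "Entity");

    // walk up the hierarchy from sub, looking for sup
    static boolean isSubtype(String sub, String sup) {
        for (String t = sub; t != null; t = parent.get(t))
            if (t.equals(sup)) return true;
        return false;
    }

    // restriction is only permitted when the new type is below the old one
    static String restrict(String conceptType, String newType) {
        if (!isSubtype(newType, conceptType))
            throw new IllegalArgumentException("restriction would not be canonical");
        return newType;
    }
}
```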
Fig. 1. Visual feedback in a WebKB-GE join operation. The left hand side shows an unsuccessful join. The system starts in a stable state (a); a concept is dragged in (b). The join is invalid, so the dragged object turns red. The mouse is released at this point. The operation is undone by snapping the concept back to its previous position (c). The right hand side shows a successful join. The system starts in (d); the lower left concept is moved towards an identical concept (e). The mouse is released and the two concepts snap together as the join is performed (f).
2.3 Distributed Multi-User Application
CSCW tools, such as WebKB, allow members of a group to access a shared canon. The canon is retrieved each time the user starts the application, to ensure changes are propagated from the central site. The server is not fixed: the user chooses from several. Graphs created by distributed users should be available to the other members of the workgroup. The ideal way is for clients to send graphs back to the central server. When users share information it is important that the original creator be acknowledged. These features are implemented in WebKB-GE.

2.4 Layout Retention
In preceding local implementations of CG editors [1,3] only the graph operations were considered important. No display or linear form editors in the literature have the capacity to maintain display form representations [11,13]. In these tools the formation rules are sometimes implemented for graph manipulation, but only the final linear form of the graph is stored between editing sessions. The visual representation of a conceptual graph contains additional meta-information significant to the graph's creator. Layout should be disturbed as little as possible by operations performed on the underlying CG. This allows the user's mental map [14] to be preserved. Additionally, the user will want to alter the visual representation while not altering the underlying graph. The only such operation permitted is moving objects to new spatial locations. These moving operations are constrained — a regularity is imposed by the layout language describing the visual representations. The method chosen for layout storage is implemented according to Burrow and Eklund [2] and described below in Section 3.4.
3 Implementation
3.1 Architecture

A general multiple server/multiple client architecture allows communication over a network such as the Internet. The network is not an essential part of the system — both server and client can run on a single machine. The server code controls distribution of a canon to clients. In addition, initial layouts for a canon are sent to a client. The server can handle database access for a shared CG canon. The client requests a copy of a canon from a server, as well as the layout of any new subgraphs the user adds to the editing pane.

3.2 Implementation Language: Java

One important issue is the difference between Java Applets and Java Applications. An Applet is a Java program [5] that runs within a restricted environment, normally a Web browser. This restricted environment does not allow certain operations for security reasons: (i) making a network connection to a machine other than that from which the applet was loaded; this restricts each instance of the client to contacting a single server, and changing servers requires the whole applet to be re-loaded from the new server; (ii) accessing the local disk; the client is unable to read/write the local disk and unable to save/load CGs locally. This is not a large problem, depending on how saving graphs is handled: if all graphs are saved via the server, no local access is required. Each time an applet is loaded in a web browser all code for that applet is downloaded from the server to the client machine. This ensures the user is receiving continuously updated software, but it can be slow if the applet is large. With a Java application, the Java Run-time Environment (JRE) is downloaded separately for the appropriate platform. The application code is executed using the JRE. Applet restrictions do not apply to Java Applications, and for this reason both the client and server code are written as Java Applications. A number of DM toolkits are available for use with Java.
Most provide a layer on top of the Abstract Windowing Toolkit (AWT) to make interface creation straightforward. Some of the more widely known toolkits are subArctic [6], Sgraphics [9] and the Internet Foundation Classes [7]. The DM toolkit used for WebKB-GE is subArctic [6], a constraint-based layout toolkit. At the time WebKB-GE was written it was the most stable of the supported toolkits. It is also available free for both commercial and non-commercial use. Objectspace has created the "Generic Collection Library" (JGL) for Java [8]. This library was also used.

3.3 Communication

For communication between client and server a simple protocol was implemented. Currently only two active operations are implemented: (i) Canon Request — the server responds by returning a copy of the canon being served, read from its local disk; (ii) Layout Request — the server reads the relation name sent by the client and returns the layout specification for that relation. Additional operations, such as canon modifications and database access, can be added to the protocol if required. The client application contains two parsers to process information from a server. One parser reads the linear CGs and other information from the canon to build the internal CG data store. The second reads layout and builds the visual representation of each graph.
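The two-operation protocol can be sketched as a single dispatch function. The wire format below ("CANON" and "LAYOUT <relation>" request lines) is invented for illustration; the paper does not specify the actual message syntax:

```java
import java.util.Map;

// Toy server-side dispatch for the two protocol operations described above.
public class CanonServer {
    // stand-ins for data the real server would read from its local disk
    static final String CANON = "type Entity(x) is [!1:UNIVERSAL:*x].";
    static final Map<String, String> layouts = Map.of(
        "Spatial_binaryRel", "[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!7].");

    static String handle(String request) {
        if (request.equals("CANON"))                 // Canon Request
            return CANON;
        if (request.startsWith("LAYOUT "))           // Layout Request
            return layouts.getOrDefault(request.substring(7), "");
        return "";                                   // unknown operations ignored
    }
}
```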
3.4 Layout Language

The layout language devised to display CGs is the feature that differentiates this editor from others [11,13,3]. The language originates with Kamada and Kawai [10], who developed a constraint-based layout language (COOL) to generate pictorial representations of text. Graphical relations in COOL are either geometric relations — constraints between variables characterising the objects — or drawing relations — lines and labels on and between objects. Specifying layout is performed in two stages: (i) constraining the (defined) reference points to lie on a geometric curve (line, arc, spline); (ii) connecting the reference points of the object by a geometric curve with an arrowhead. Burrow and Eklund [2] devised a canon that represents the visual structure of conceptual graphs in the language of CGs. Following that work, the actual physical locations of objects are not stored in WebKB-GE. Instead, spatial relationships between the objects are captured in the language. A container-based approach is used. All objects must be stored in horizontal and vertical containers. The ordering of the objects within containers is preserved as objects are moved. If a container in either orientation does not exist at the final move location, a new container is created in that direction. Moving the final object out of a container dissolves the container. Horizontal containers are defined to extend to the width of the editing pane with the height of the tallest object contained. Vertical containers are defined to extend to the height of the editing pane with the width of the widest object. In WebKB, as in any co-operative work environment, it is important to record the original source of knowledge and data. Because layout information is saved separately from the linear form, a mapping from linear form to display objects is also required. This is achieved by generating a quasi-unique identifier for every graph component.
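The identifier described next in the text is built from the creating machine's IP address and a time-stamp, giving twenty-four digits. The exact digit layout is not specified in the paper, so the Java sketch below shows one plausible encoding: three digits per IPv4 octet (twelve digits) followed by a twelve-digit millisecond timestamp.

```java
// Hypothetical encoding of the quasi-unique identifier: IPv4 octets
// zero-padded to three digits each, then a 12-digit timestamp.
public class UniqueId {
    static String make(int[] octets, long timestampMillis) {
        StringBuilder sb = new StringBuilder();
        for (int o : octets)
            sb.append(String.format("%03d", o));              // 4 x 3 = 12 digits
        sb.append(String.format("%012d", timestampMillis % 1_000_000_000_000L));
        return sb.toString();                                 // 24 digits total
    }
}
```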
The identifier is created using the Internet Protocol (IP) address of the machine on which the graph was created, along with a time-stamp. This results in a twenty-four-digit identifier. Inside the client application the linear and display forms of the graph are stored and manipulated separately. The two forms are synchronised using the twenty-four-digit identifier discussed above. The abstract (linear) graph information is stored in a series of data structures descending from the editing pane. The canon currently in use is also stored in the editing sheet and constructed from the Type and Relation hierarchies. Each entry in the Relation hierarchy contains a graph describing the concept types to be connected by the relation. Both graphs in relation definitions and user-constructed graphs have their abstract information stored internally in a CG map: one for each graph. The graphical layout is stored in a series of containers, with the top level being the "Container Constrainer". This handles alignment of horizontal and vertical containers and creates/destroys new/old containers. Each container is responsible for managing the layout of the objects contained within it. When objects change their dimensions or are added and removed the container resizes appropriately. The code ensures containers are not too close together. Each object in the graphical representation maintains a connection with the abstract graph object it displays. When graph operations are targeted on a graphical object the abstract object is
retrieved. Links between abstract graph objects (edges) are not stored in object containers but maintained within the user interface. Due to the simplicity of the layout language, only a very small number of definitions may appear.

[HBOX:hboxnum] -> (HCONTAINS) -> [ELEMENT:!uniqueid].
[VBOX:vboxnum] -> (VCONTAINS) -> [ELEMENT:!uniqueid].

hboxnum and vboxnum denote the box into which to insert the element. The box numbers do not have to indicate any sort of ordering. The uniqueid is the placeholder for the corresponding element in the linear description.

[ELEMENT:!uniqueid] -> (ELTRIGHT) -> [ELEMENT:!uniqueid].
[ELEMENT:!uniqueid] -> (ELTBELOW) -> [ELEMENT:!uniqueid].

The second uniqueid indicates the corresponding element is to the right of (below) the first uniqueid.

[HBOX:hboxnum] -> (BOXBELOW) -> [HBOX:hboxnum].
[VBOX:vboxnum] -> (BOXRIGHT) -> [VBOX:vboxnum].
The second box specified is below (to the right of) the first. With these definitions the relative layout is stored. When a relation is added to the editing pane by the user, or a graph is loaded from a file, the following process occurs:
1. the linear form is loaded into a temporary CG map. This restricts the search space for resolving unique IDs and allows rollback if an error occurs. If a relation from the canon is being added, this occurred when the canon was retrieved;
2. the layout script of the graph is processed and objects are placed in the containers. Container objects are reordered correctly;
3. dummy objects are resolved using the appropriate abstract objects from the linear form. Graphical representations of the links between objects are created and stored in the interface;
4. each graphical container is resized to fit the largest contained object. Containers are assigned an initial starting position which accounts for spacing between the containers. Positions of the graphical links are updated as the containers (and consequently, the objects) move;
5. if the previous stages occur successfully, the layout and abstract objects are merged in the editing pane.

[!1:User]{->(!2:Role)->[!3:WN_expert],
         ->(!4:Chrc)->[!5:Property]}.

Fig. 2. The augmented linear form of the CG.
Fig. 2 shows the linear form augmented with unique identifiers. The graphical layout script is shown in Fig. 3 and the corresponding screen-shot in Fig. 4. The format of the canon used by the editor is a simple series of definitions giving the concept and relation hierarchies. The lattices containing the definitions must be defined first, with a separate lattice for every relation arity required. For example (from the default canon):
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!4].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!2].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!5].
[HBOX:3] -> (HCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTRIGHT) -> [ELEMENT:!4].
[ELEMENT:!2] -> (ELTRIGHT) -> [ELEMENT:!5].
[HBOX:1] -> (BOXBELOW) -> [HBOX:2].
[HBOX:2] -> (BOXBELOW) -> [HBOX:3].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!1].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!2].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!4].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!5].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTBELOW) -> [ELEMENT:!2].
[ELEMENT:!4] -> (ELTBELOW) -> [ELEMENT:!5].
[ELEMENT:!5] -> (ELTBELOW) -> [ELEMENT:!3].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].

Fig. 3. The layout description of the graph.
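Every statement in such a layout script shares one shape: a bracketed node, a parenthesised relation, and a second bracketed node. The Java sketch below parses a single statement; the grammar is inferred from the examples in the text, not taken from the WebKB-GE sources:

```java
import java.util.regex.*;

// Parse one layout statement, e.g. "[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1]."
public class LayoutParser {
    static final Pattern STMT = Pattern.compile(
        "\\[(\\w+):(!?\\w+)\\]\\s*->\\s*\\((\\w+)\\)\\s*->\\s*\\[(\\w+):(!?\\w+)\\]\\.");

    // returns {fromType, fromId, relation, toType, toId}
    static String[] parse(String line) {
        Matcher m = STMT.matcher(line.trim());
        if (!m.matches())
            throw new IllegalArgumentException("not a layout statement: " + line);
        return new String[]{ m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }
}
```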
Fig. 4. The top level window with the example graph loaded.

lattice UNIVERSAL, ABSURD is type : *.
lattice TOP-T1-T1, BOT-T1-T1 is relation ( UNIVERSAL, UNIVERSAL ).
The concept hierarchy is then defined:
type Entity(x) is [!1:UNIVERSAL:*x].
type Situation(x) is [!2:UNIVERSAL:*x].
type Something_playing_a_role(x) is [!3:UNIVERSAL:*x].
Finally the relation hierarchies are defined:

relation Attributive_binaryRel(x, y) is [!1:UNIVERSAL:*x] -> (!2:TOP-T1-T1) -> [!3:UNIVERSAL:*y].
relation Component_binaryRel(x, y) is [!4:UNIVERSAL:*x] -> (!5:TOP-T1-T1) -> [!6:UNIVERSAL:*y].
relation Spatial_binaryRel(x, y) is [!7:UNIVERSAL:*x] -> (!8:TOP-T1-T1) -> [!9:UNIVERSAL:*y].
Once the linear sections of the canon have been defined, the initial layouts of the relations must be defined. Each relation layout is specified in a file with the naming form: -.layout. A layout script for the graph is contained in that file. For example, the layout for Spatial_binaryRel is:

[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!7].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!8].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!9].
[ELEMENT:!7] -> (ELTRIGHT) -> [ELEMENT:!8].
[ELEMENT:!8] -> (ELTRIGHT) -> [ELEMENT:!9].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!7].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!8].
[VBOX:3] -> (VCONTAINS) -> [ELEMENT:!9].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].
[VBOX:2] -> (BOXRIGHT) -> [VBOX:3].

4 Conclusion
A visual-form CG editor has been designed and implemented. Editing operations are restricted to the four canonical formation rules. This guarantees well-formed CGs. The key feature of this editor is the use of a graphical scripting language to capture relevant details of the graph layout. This information is stored along with the linear information.
The application is written to be used for co-operative work by network-connected users, and in particular for use in the WebKB indexation toolkit. Only simple graphs are currently supported. The ability to extend the framework to nested graphs, along with extending the layout language with different containers, is inherent in the design of the editor. WebKB and WebKB-GE may be obtained from http://www.int.gu.edu.au/kvo.
References
1. A.L. Burrow. Meta tool support for a GUI for conceptual structures. Hons. thesis, Dept. of Computer Science, www.int.gu.edu.au/kvo/reports/andrew.ps.gz, 1994.
2. A.L. Burrow and P.W. Eklund. A visual structure representation language for conceptual structures. In Proceedings of the 3rd International Conference on Conceptual Structures (Supplement), pages 165-171, 1995.
3. Peter W. Eklund, Josh Leane, and Chris Nowak. GRIT: An implementation of a graphical user interface for conceptual structures. Technical Report TR94-03, University of Adelaide, Dept. Computer Science, Feb. 1994.
4. P.W. Eklund, J. Leane, and C. Nowak. GRIT: A GUI for conceptual structures. In Proceedings of the 2nd International Workshop on PEIRCE, ICCS-93, 1993.
5. James Gosling and Henry McGilton. The Java Language Environment: A White Paper. Technical report, Sun Microsystems, 1996.
6. Scott Hudson and Ian Smith. subArctic User Manual. Technical report, GVU Center, Georgia Institute of Technology, 1997.
7. IFC Dev. Guide. Technical report, Netscape Communications Corp., 1997.
8. JGL User Manual. Technical report, Objectspace Inc., 1997.
9. Mike Jones. Sgraphics Design Documentation. Technical report, Mountain Alternative Systems, http://www.mass.com/software/sgraphics/index, 1997.
10. T. Kamada and S. Kawai. A general framework for visualizing abstract objects and relations. ACM Transactions on Graphics, 10(1):1-39, January 1991.
11. R.Y. Kamath and Walling Cyre. Automatic integration of digital system requirements using schemata. In 3rd International Conference on Conceptual Structures ICCS'95, number 954 in LNAI, pages 44-58, Berlin, 1995. Springer-Verlag.
12. P.H. Martin. The WebKB set of tools: a common scheme for shared WWW annotations, shared knowledge bases and information retrieval. In Proceedings of the CGTools Workshop at the 5th International Conference on Conceptual Structures ICCS'97, pages 588-595. Springer-Verlag, LNAI 1257, 1997.
13. Jens-Uwe Möller and Detlev Wiesse. Editing conceptual graphs. In Proceedings of the 4th International Conference on Conceptual Structures ICCS'96, pages 175-187, Berlin, 1996. Springer-Verlag, LNAI 1115.
14. George G. Robertson, Stuart K. Card, and Jock D. Mackinlay. Information visualization using 3D interactive animation. CACM, 36(4):57-71, Apr 1993.
15. B. Shneiderman. Direct manipulation. Computer, 16(8):57-68, Aug 1983.
16. J. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
17. J. Vlissides. Generalized graphical object editing. Technical Report CSL-TR-90-427, Dept. of Elec. Eng. and Computer Science, Stanford University, 1990.
18. John M. Vlissides and Mark A. Linton. Unidraw: A framework for building domain-specific graphical editors. ACM Transactions on Information Systems, 8(3):237-268, Jul 1990.
Mapping of CGIF to operational interfaces

A. Puder
International Computer Science Institute, 1947 Center St., Suite 600, Berkeley, CA 94704-1198, USA
[email protected]

Abstract. The Conceptual Graph Interchange Format (CGIF) is a notation for conceptual graphs which is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations". In this paper we argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and in terms of runtime characteristics. We propose to employ the widespread middleware platform based on CORBA to allow interoperability within a heterogeneous environment. The major result of this paper is a specification of an operational interface, written in CORBA's Interface Definition Language (IDL), that is equivalent to CGIF, yet better suited for the efficient implementation of applications in distributed systems.

Keywords: CGIF, CORBA, IDL.
1 Introduction
Conceptual Graphs (CG) are abstract information structures that are independent of a notation (see [5]). Various notations have been developed for different purposes (see Figure 1). Among these are the display form (graphical notation) and the linear form (textual notation). These two notations are intended for human-computer interaction. Another notation, called the Conceptual Graph Interchange Format (CGIF), is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations" (i.e. the format of the data transmitted over the network). The reason for developing CGIF was to support interoperability for CG-based applications that need to communicate with other CG-based applications. We argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and in terms of runtime characteristics. Applications that need to interoperate are written by different teams of programmers, in different programming languages, using different communication protocols. A generalization of this problem is addressed by so-called middleware platforms. As the name suggests, these platforms reside between the operating system and the application. One prominent middleware platform is defined through the Common Object Request Broker Architecture (CORBA), which allows interoperability within a heterogeneous environment (see [4]). In this paper we will show how to use CORBA for CG-based applications.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 119-126, 1998. © Springer-Verlag Berlin Heidelberg 1998
Fig. 1. Different notations represent the intension of conceptual graphs. (The figure contrasts the display form and linear form, used for human-computer interaction, with CGIF and CG IDL, used for computer-computer interaction.)
The outline of this paper is as follows: in Section 2 we give a short overview of CORBA. In Section 3 we discuss some drawbacks of using CGIF for distributed applications. In Section 4 we present our mapping of CGIF to CORBA IDL, which is further explained in Section 5 through an example. It should be noted that we describe work-in-progress. The following explanations emphasize the potential of using CORBA technology for CG-based applications. A complete mapping of CGIF to CORBA IDL is subject to further research.
2 Overview of CORBA

Modern programming languages employ the object paradigm to structure computation within a single operating system process. The next logical step is to distribute a computation over multiple processes on a single machine or even on different machines. Because object orientation has proven to be an adequate means for developing and maintaining large-scale applications, it seems reasonable to apply the object paradigm to distributed computation as well: objects are distributed over the machines within a networked environment and communicate with each other. As a fact of life, the computers within a networked environment differ in hardware architecture, operating system software, and the programming languages used to implement the objects. That is what we call a heterogeneous distributed environment. To allow communication between objects in such an environment, one needs a rather complex piece of software called a middleware platform. The Common Object Request Broker Architecture (CORBA) is a specification of such a middleware platform. The CORBA standard is issued by the Object Management Group (OMG), an international organization with over 750 information software vendors, software developers, and users. The goal of the OMG is the establishment of industry guidelines and object management specifications to provide a common framework for application development. CORBA addresses the following issues:
- Object orientation: Objects are the basic building blocks of CORBA applications.
- Distribution transparency: A caller uses the same mechanisms to invoke an object whether it is located in the same address space, on the same machine, or on a remote machine.
- Hardware, OS, and language independence: CORBA components can be implemented using different programming languages on different hardware architectures running different operating systems.
- Vendor independence: CORBA-compliant implementations from different vendors interoperate, and applications are portable between different vendors.
One important aspect of CORBA is that it is a specification and not an implementation. CORBA just provides a framework allowing applications to interoperate in a distributed and heterogeneous environment, but it does not prescribe any specific technology for implementing the CORBA standard. The standard is freely available via the World Wide Web at http://www.omg.org/. Currently there exist many implementations of CORBA focusing on different market segments.
Fig. 2. Basic building blocks of a CORBA-based middleware platform. (The figure shows a client with its IDL-compiler-generated stub and the DII, a server with its skeleton and the DSI, an object adapter, and the ORB.)
Figure 2 gives an overview of the components of a CORBA system (depicted in gray), as well as the embedding of an application in such a platform (white components). The Object Request Broker (ORB) is responsible for transferring operations from clients to servers. This requires the ORB to locate a server implementation (and possibly activate it), transmit the operation and its parameters, and finally return the results back to the client.
An Object Adapter (OA) offers various services to a server, such as the management of object references, the activation of server implementations, and the instantiation of new server objects. Different OAs may be tailored for specific application domains and may offer different services to a server. The ORB is responsible for dispatching between different OAs. One mandatory OA is the so-called Basic Object Adapter (BOA). As its name implies, it offers only very basic services to a server. The interface between a client and a server is specified with an Interface Definition Language (IDL). According to the object paradigm, an IDL specification separates the interface of a server from its implementation. This way a client has access to a server's operational interface without being aware of the server's implementation details. An IDL compiler generates a stub for the client and a skeleton for the server, which are responsible for marshalling and unmarshalling the parameters of an operation. The Dynamic Invocation Interface (DII) and the Dynamic Skeleton Interface (DSI) allow the sending and receiving of operation invocations. They represent the marshalling and unmarshalling API offered to stubs and skeletons by the ORB. Different ORB implementations can interoperate through the Internet Inter-ORB Protocol (IIOP), which describes the on-the-wire representations of basic and constructed IDL data types as well as the message formats needed for the protocol. In that respect it defines a transfer syntax, just like CGIF does. The design of IIOP was driven by the goal to keep it simple, scalable, and general.
3 Using CGIF in a heterogeneous environment

In this section we explain how CGIF might be used in constructing distributed applications and the disadvantages this has. Figure 3 depicts a typical configuration. The application consists of a client and a server, communicating via a transport layer over the network. The messages exchanged between the client and the server contain CGs as defined by CGIF. First note that this is not sufficient for a distributed application: CGIF only allows the coding of parameters for operations, but the kind of operations to be invoked at the server is out of the scope of CGIF. The distinction between parameters and operations corresponds to the distinction between KIF and KQML (see [2]). In that respect there is no equivalent to KQML in the CG world. One of our premises is that the client and server can be written in different programming languages, running on different operating systems using different transport media. A programmer would most certainly define some data structures to represent a CG in his/her programming language. In order to transmit a CG, the internal data structure needs to be translated to CGIF. This is accomplished by a stub. On the server side the CG coded in CGIF needs to be translated back to an internal data structure again. This is done by a skeleton. The black nodes in Figure 3 show the important steps in the translation process.

Fig. 3. Marshalling code is contained in stubs and skeletons.

At step 1 the CG exists as a data structure in the programming language used to implement the client. The stub, which is also written in the same programming language as the client, translates this data structure to CGIF (step 2). After transporting the CG over the network, it arrives at the server (step 3). The skeleton translates the CG back into a data representation in the programming language used for the implementation of the server. At step 4 the CG can finally be processed by the server. CGIF does not prescribe an internal data structure in a programming language. I.e., when using CGIF for transmitting CGs, a programmer must first make such a definition based on his/her programming language, followed by manual coding of the stub and skeleton. This imposes a high overhead on the application creation process. The main benefit of using CORBA IDL is that stubs and skeletons are automatically generated by an IDL compiler, and there is a well-defined mapping from IDL to different programming languages. Furthermore, an IDL specification allows the specification not only of parameters but also of operations, which makes it suitable for specifying operational interfaces between application components. Although an IDL specification induces a transfer syntax through IIOP similar to CGIF, an IDL specification is better suited for the design of distributed applications. An IDL specification hides the transfer syntax and focuses on the user-defined types, which are mapped to different programming languages by an IDL compiler. CGIF, on the other hand, exposes the complexity of the transfer syntax to an application programmer, who is responsible for coding stubs and skeletons.
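To make the overhead concrete, the following sketch hand-codes such a stub in Python. Both the internal representation and the exact output syntax are illustrative assumptions (a CGIF-like linear form, not definitions taken from the draft standard); this is exactly the kind of translation code an IDL compiler would otherwise generate automatically.

```python
# Hand-written "stub": translating an ad-hoc internal CG representation
# into a CGIF-like wire string. The internal structure (dicts and tuples)
# and the output syntax are invented for illustration.
def marshal(concepts, relations):
    """concepts: {var: (type, referent)}; relations: [(type, [vars])]."""
    parts = []
    for var, (ctype, referent) in concepts.items():
        # a named referent is emitted directly; otherwise a coreference variable
        label = f"{ctype}: {referent}" if referent else f"{ctype}: *{var}"
        parts.append(f"[{label}]")
    for rtype, args in relations:
        parts.append("(%s %s)" % (rtype, " ".join("?" + v for v in args)))
    return " ".join(parts)

wire = marshal({"x": ("Cat", ""), "y": ("Mat", "")}, [("On", ["x", "y"])])
print(wire)   # [Cat: *x] [Mat: *y] (On ?x ?y)
```

The matching skeleton would have to parse this string back into the server's own data structures, doubling the hand-written code.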
4 CG Interface through IDL
In this section we show how to translate some of the basic definitions of the proposed CG standard to CORBA IDL. The explanations presented here should
be seen as a proof of concept. A more thorough approach including all definitions of the CG standard is still a research topic. The mapping we explain in the following covers definitions 3.1 (Conceptual Graphs), 3.2 (Concept) and 3.3 (Conceptual Relation) of the proposed CG standard (see [6]). The basic design principle of the operational interface is to exploit some common features of the object paradigm and the CG definitions. A conceptual graph is a bipartite graph consisting of concept and conceptual relation nodes. This definition resembles an object graph, where objects represent the nodes of a CG and object references the arcs between the nodes. Therefore it seems feasible to model the nodes of a CG through CORBA objects and the links between the nodes through object references. This way of modelling a CG through CORBA IDL has several advantages. Since object references are used to connect relation nodes with concept nodes, one CG can be distributed over several hosts. The objects which denote the nodes of a CG are not required to remain in the same address space, since an object reference can span host boundaries in a heterogeneous environment. Furthermore, a CG does not necessarily need to be sent by value, but rather by reference: a node is transferred over the network only if the receiving side of an operation actually accesses it. This scheme enables a lazy evaluation strategy for CG operations. It is common to place all IDL definitions related to a particular service in a separate namespace to avoid name conflicts with other applications. Therefore, we assume that the following IDL definitions are embraced by an IDL module:

module CG {
  // Here come all definitions related to
  // the CG module
};
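The pass-by-reference and lazy-evaluation idea can be illustrated without any ORB machinery. In the following Python sketch, a registry dictionary stands in for the ORB's reference resolution; all names and the registry itself are invented for illustration:

```python
# Illustration of pass-by-reference with lazy fetching (not CORBA):
# a node crosses the "network" only when it is first dereferenced.
REGISTRY = {}          # object reference -> node data (simulated remote store)
FETCHES = []           # records which references were actually transferred

class NodeRef:
    """A remote reference; dereferencing triggers the (simulated) transfer."""
    def __init__(self, ref):
        self.ref = ref
        self._node = None
    def get(self):
        if self._node is None:
            FETCHES.append(self.ref)      # the network round-trip happens here
            self._node = REGISTRY[self.ref]
        return self._node

REGISTRY["c1"] = {"type": "Cat", "referent": "Felix"}
REGISTRY["c2"] = {"type": "Mat", "referent": ""}
graph = [NodeRef("c1"), NodeRef("c2")]    # a CG sent "by reference"

print(FETCHES)                 # [] -- nothing transferred yet
print(graph[0].get()["type"])  # Cat
print(FETCHES)                 # ['c1'] -- only the accessed node was fetched
```

A server receiving such a graph pays the transfer cost only for the nodes an operation actually touches.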
Using the inheritance mechanism of CORBA IDL, we first define a common base interface for concepts and conceptual relations. This interface is called Node and is defined through the following specification:

typedef string Label;
interface Node {
  attribute Label type;
};
The interface Node contains all the definitions which are common to concept and relation nodes. The common property shared by those two types of nodes is the type label, which is represented through an attribute of type Label. Note that we made Label a synonym for string through a typedef declaration. Next, we define the interface for all concept nodes:

interface Concept : Node {
  attribute Label referent;
};
typedef sequence<Concept> ConceptSeq;
The interface Concept inherits all properties (i.e., the attribute type) from interface Node and adds a new property, namely an attribute for the referent. The following typedef defines an unbounded sequence of concepts. This is necessary for the definition of the relation node:

interface Relation : Node {
  attribute ConceptSeq links;
};
Just as the interface Concept, the interface Relation inherits all properties from Node. The new property added here is a list of neighboring concept nodes, represented by the attribute links. Note that ConceptSeq is an ordered sequence. The length of this sequence corresponds to the arity of the relation. The first item in the sequence refers to the concept node pointing towards the relation. The final definition gives the IDL abstraction of a CG:

typedef sequence<Node> Graph;
A CG is represented by a list of Node interfaces. For reasons of brevity the IDL type is called Graph. The order of appearance of the individual nodes is of no importance. This data structure suffices to transmit a simple CG over the network. In the following section we provide a small example of how this definition might be used in a real application context.
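Transcribed into a programming-language data model, the IDL definitions above might look as follows. This is an illustrative Python sketch: only the names Label, Node, Concept, Relation, and Graph come from the IDL; in a real CORBA setting the IDL compiler would generate the actual language bindings.

```python
# Python mirror of the CG module's IDL: Node carries the type label,
# Concept adds a referent, Relation adds an ordered link sequence,
# and a Graph is an (unordered) list of nodes.
from typing import List

Label = str                      # typedef string Label;

class Node:
    def __init__(self, type_label: Label):
        self.type = type_label   # attribute Label type;

class Concept(Node):
    def __init__(self, type_label: Label, referent: Label = ""):
        super().__init__(type_label)
        self.referent = referent # attribute Label referent;

class Relation(Node):
    def __init__(self, type_label: Label, links: List[Concept]):
        super().__init__(type_label)
        # attribute ConceptSeq links; the first link is the concept
        # pointing towards the relation
        self.links = links

Graph = List[Node]               # typedef sequence<Node> Graph;

cat, mat = Concept("Cat", "Felix"), Concept("Mat")
on = Relation("On", [cat, mat])  # arity 2; "Cat" points towards "On"
g: Graph = [cat, mat, on]
```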
5 Example

Given the basic specifications from the previous section, how does a programmer develop an application using CGs? Using the CORBA framework, the programmer would have to design the interface of the application to be written, based on the definitions from the previous section. E.g., consider a simple application which offers database functionality for CGs, such as save, restore, etc. The resulting IDL specification would look something like the following:

#include "cg.idl"
interface DB {
  typedef string Key;
  exception Duplicate {};
  exception NotFound {};
  exception IllegalKey {};

  Key save( in CG::Graph c ) raises( Duplicate );
  CG::Graph retrieve( in Key k ) raises( NotFound, IllegalKey );
  void delete( in CG::Graph c ) raises( NotFound, IllegalKey );
};
This example assumes that the basic definitions for CGs from Section 4 are stored in a file called "cg.idl". The definitions are made known through the #include directive. Access to the database is defined through the interface DB. The database stores CGs and assigns a unique key to each CG; it allows CGs to be saved, retrieved and deleted. If a CG is saved, the database returns a unique key for this CG. The operations retrieve and delete need a key as an input parameter. Errors are reported through exceptions. The IDL definition of interface DB is all that a client program will need in order to access an implementation. As pointed out before, the language and precise technology used to implement the database are irrelevant to the client.
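The promised behaviour of this interface can be sketched with a toy in-process stand-in, written here in Python without any ORB. Only the operation names and exception names come from the IDL above (following the printed signature, delete takes a graph); everything else is an assumption:

```python
# Toy stand-in for the DB interface (no CORBA involved): save assigns a
# unique key; retrieve and delete report errors through exceptions,
# mirroring the raises clauses of the IDL.
import itertools

class Duplicate(Exception): pass
class NotFound(Exception): pass

class DB:
    def __init__(self):
        self._store = {}
        self._keys = itertools.count(1)
    def save(self, graph):
        if any(g == graph for g in self._store.values()):
            raise Duplicate()
        key = str(next(self._keys))
        self._store[key] = graph
        return key
    def retrieve(self, key):
        if key not in self._store:
            raise NotFound()
        return self._store[key]
    def delete(self, graph):
        for key, g in list(self._store.items()):
            if g == graph:
                del self._store[key]
                return
        raise NotFound()

db = DB()
k = db.save(["[Cat: *x]"])
print(db.retrieve(k))   # ['[Cat: *x]']
```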
6 Conclusion
The construction of CG-based applications can benefit from the usage of middleware platforms. In this paper we have shown how to translate the basic definitions for a conceptual graph to CORBA IDL. A conceptual graph is represented through a set of CORBA objects which do not need to reside in the same address space. Besides language independence, this has the advantage of supporting lazy evaluation strategies for CG operations. Once a proper mapping from CGIF to CORBA IDL has been accomplished, the existing CG applications should be re-structured using CORBA (see [1]). By doing so, those applications could more easily exploit specific services they offer among each other and to other applications. There are several CORBA implementations available, including free ones (see [3]). Applications that use CORBA as a middleware platform can easily be accessed in a standardized fashion from any CORBA implementation, such as the one included in the Netscape Communicator.

References
1. CGTOOLS. Conceptual Graphs Tools homepage. http://cs.une.edu.au/cgtools/, School of Mathematical and Computer Science, University of New England, Australia, 1997.
2. M.R. Genesereth and S.P. Ketchpel. Software Agents. Communications of the ACM, 37(7):48-53, July 1994.
3. MICO. A free CORBA 2.0 compliant implementation. http://www.vsb.informatik.uni-frankfurt.de/mico, Computer Science Department, University of Frankfurt, 1997.
4. Object Management Group (OMG). The Common Object Request Broker: Architecture and Specification, Revision 2.2, February 1998.
5. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
6. J.F. Sowa. Standardization of Conceptual Graphs. ANSI Draft, 1998.
TOSCANA-Systems Based on Thesauri

Bernd Groh¹, Selma Strahringer², and Rudolf Wille²

¹ School of Information Technology, Griffith University, PMB 50 Gold Coast Mail Centre, QLD 9726, Australia. E-mail: [email protected]
² Fachbereich Mathematik, Technische Universität Darmstadt, Schloßgartenstr. 7, 64289 Darmstadt, Germany. E-mail: {strahringer, wille}@mathematik.tu-darmstadt.de
Abstract. TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. Such interaction uses conceptual data systems, which are based on formal contexts consisting of relationships between objects and attributes. Those formal contexts often have attributes taken from a thesaurus, which may be understood as an ordered set and completed to a join-semilattice (if necessary). The join of thesaurus terms indicates the degree of resemblance of the terms and should therefore be included in the formal contexts containing those terms. Consequently, the formal contexts of a conceptual data system based on a thesaurus should have join-closed attribute sets. A problem arises for the TOSCANA-system implementing such a conceptual data system because the attributes in a nested line diagram produced by TOSCANA might not be join-closed, although its components have join-closed attribute sets. In this paper we offer a solution to this problem by developing a method for extending line diagrams to ones whose attribute sets are join-closed. This method allows the implementation of TOSCANA-systems based on thesauri that respect the join-structure of the thesauri.
Keywords: Conceptual Knowledge Processing, Formal Concept Analysis, Drawing Line Diagrams, Thesauri
1 TOSCANA
TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. TOSCANA realizes conceptual data systems [17], [20], which are mathematically specified systems consisting of a data context and a collection of formal contexts, called conceptual scales, together with line diagrams of their concept lattices. There is a connection between the formal objects of the conceptual scales and the objects in the data context that
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 127-138, 1998. © Springer-Verlag Berlin Heidelberg 1998
can be activated to conceptually represent the data objects within the line diagrams of the conceptual scales. This allows thematic views into the database (underlying the data context) via graphically presented concept lattices showing networks of conceptual relationships. The views may even be combined, interchanged, and refined, so that a flexible and informative navigation through a conceptual landscape derived from the database can be performed (cf. [21]). For an elementary understanding of conceptual data systems, it is best to assume that the data are given by a larger formal context K := (G, M, I). A conceptual scale derived from the data context K can then be specified as a subcontext (G, Mj, I ∩ (G × Mj)) with Mj ⊆ M. A basic proposition of Formal Concept Analysis states that the concept lattice of K can be represented as a ∨-subsemilattice within the direct product of the concept lattices of the subcontexts (G, Mj, I ∩ (G × Mj)) (j ∈ J) if M = ⋃j∈J Mj [3; p. 77]. This explains how every concept lattice can be represented by a nested line diagram of smaller concept lattices. Figure 1 shows a nested line diagram produced with the assistance of TOSCANA from a database about environmental literature. One concept lattice is represented by the big circles with their connecting line segments, while the line diagram of the second is inserted into each big circle. In such a way the direct product of any two lattices can be diagrammed, where a non-nested line diagram of the direct product may be obtained from a nested diagram by replacing each line segment between two big circles by line segments between corresponding elements of the two line diagrams inside the two big circles. In Fig. 1¹, the black little circles represent the combined concept lattice, which has the two smaller concept lattices as ∨-homomorphic images.
Since larger data contexts usually give rise to extensive object labelling, TOSCANA first attaches to a node the number of objects which generate the concept represented by that node; after clicking on that number, TOSCANA presents the names of all the counted objects. TOSCANA-systems have been successfully elaborated for many purposes in different research areas, but also on the commercial level. For example, TOSCANA-systems have been established: for analyzing data of children with diabetes [17], for investigating international cooperations [11], for exploring laws and regulations concerning building constructions [13], for retrieving books in a library [12], [15], for assisting engineers in designing pipings [19], for inquiring into flight movements at Frankfurt Airport [10], for inspecting the burner of an incinerating plant [9], for developing qualitative theories in music esthetics [14], for studying the semantics of speech-act verbs [8], for examining the medical nomenclature system SNOMED [16], etc. In applying the TOSCANA program, the desire often arises to extend the program by additional functionalities, so that TOSCANA is still in a process of further development (cf. [18]).
¹ Translation of the labels in Fig. 1: Fluss/river, Oberflaechengewaesser/surface waters, Talsperre/impounded dam, Stausee/impounded lake, Staugewaesser/back water, Stauanlage/reservoir facilities, Staudamm/storage dam, Staustufe/barrage weir with locks, Seen/lakes, Teich/pond
Fig. 1. Nested line diagram of a data context derived from a database about environmental literature
2 Thesaurus Terms as Attributes
Data contexts, which are supposed to be analysed and explored conceptually, often have attributes that are terms of a thesaurus. Those terms are already hierarchically structured by relations between the terms. We discuss in this section how the hierarchical structure of the thesaurus terms should be respected in a TOSCANA-system with additional functionality (compare other approaches in [2], [5], [6]). Let us first describe a formalization of thesauri that meets our purpose. Mathematically, we understand a thesaurus as a set T of terms together with an order relation ≤, i.e. a reflexive, transitive, and anti-symmetric binary relation on T. Note that we do not assume that a thesaurus has a tree structure, i.e., we allow that, for an element x ∈ T, there are elements y and z in T with x ≤ y, x ≤ z and y ≰ z, z ≰ y. We assume that a thesaurus has a unique maximal element; if this is not made explicit, we add a new top element to the structure. For t1, t2 ∈ T, the element t1 ∨ t2 is defined as the unique minimal upper bound of t1 and t2. If there is no unique minimal upper bound of t1 and t2, we add a new element to T which is understood as the conjunction of the minimal upper bounds of t1 and t2. We interpret t1 ∨ t2 as the smallest common generalization of the terms t1 and t2. The extension of (T, ≤) obtained by completing the ∨-operation leads mathematically to a ∨-semilattice. This allows us to assume for the sequel that the considered thesaurus is formalized by a finite ∨-semilattice. A formal context (G, M, I), where M := (M, ≤) is a finite ∨-semilattice, is called a context with compatible attribute-semilattice if m1 ≤ m2 and gIm1 imply gIm2 for all m1, m2 ∈ M and g ∈ G. Interesting examples of contexts with compatible attribute-semilattice can be derived from document databases where the documents have been indexed by thesaurus terms to describe their content.
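The compatibility condition can be checked mechanically. The following Python sketch is only an illustration; the miniature context and order are invented (gIm is encoded as a set of pairs, and ≤ as its reflexive-transitive set of pairs):

```python
# Check that a formal context has a compatible attribute-semilattice:
# gIm1 and m1 ≤ m2 must imply gIm2 (attributes are inherited upwards).
def is_compatible(objects, leq_pairs, incidence):
    """leq_pairs: set of (m1, m2) with m1 ≤ m2, reflexively-transitively closed."""
    return all(
        (g, m2) in incidence
        for (g, m1) in incidence
        for (a, m2) in leq_pairs if a == m1
    )

# A document indexed by "Teich" must also be indexed by the broader "Seen".
leq = {("Teich", "Teich"), ("Seen", "Seen"), ("Teich", "Seen")}
I_good = {("d1", "Teich"), ("d1", "Seen"), ("d2", "Seen")}
print(is_compatible({"d1", "d2"}, leq, I_good))   # True
```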
For a data context K := (G, M, I) with compatible attribute-semilattice M := (M, ≤), the question arises how a conceptual data system based on K may respect the semilattice structure on the attributes of K. This question especially suggests discussing desirable properties of conceptual scales derived from K. As mentioned above, the join m1 ∨ m2 of the attributes m1 and m2 is interpreted as the smallest common generalization of m1 and m2 and therefore indicates the degree of resemblance of the attributes m1 and m2. Of course, for a conceptual scale Kj := (G, Mj, I ∩ (G × Mj)) with Mj ⊆ M, it is desirable to code this indication of resemblance into the scale; hence, the attribute set Mj of the scale Kj should be join-closed, i.e. m1 ∨ m2 ∈ Mj for all m1, m2 ∈ Mj. We say that a subcontext (conceptual scale) Kj is join-closed if its attribute set Mj is join-closed. Now, we conclude our consideration with the claim that the conceptual scales derived from a data context with compatible attribute-semilattice should be join-closed. Since TOSCANA-systems work with nested line diagrams, we are confronted with the question of how the resemblance of attributes is visible in a nested line diagram which represents conceptual scales derived from a data context with compatible attribute-semilattice. An investigation of this question uncovers the problem that, for join-closed conceptual scales Kj := (G, Mj, I ∩ (G × Mj))
(j = 1, 2) of the data context K := (G, M, I), the join m1 ∨ m2 for m1 ∈ M1 and m2 ∈ M2 may not have a corresponding concept node to be attached to within the nested line diagram which represents the union of the conceptual scales K1 and K2; in particular, this would yield m1 ∨ m2 ∉ M1 ∪ M2. Thus, we have to extend the union M1 ∪ M2 by those problematic attributes, which yields the join-closed attribute set M12 := M1 ∪ M2 ∪ (M1 ∨ M2) with M1 ∨ M2 := {m1 ∨ m2 | m1 ∈ M1 and m2 ∈ M2}. This attribute set determines the join-closed subcontext K1 ∨ K2 := (G, M12, I ∩ (G × M12)), which should be the contextual basis for the desired extension of the nested line diagram. The problem is how to derive graphically that extension from the already drawn diagram. For solving this problem, we propose to insert into the nested line diagram node after node for the problematic attributes and to complete the diagram after each positioning of a new node, to obtain a drawing for the concept lattice of the respective extended subcontext. This procedure, which finally reaches a line diagram for the concept lattice of K1 ∨ K2, reduces the problem to the case of extending a subcontext by one attribute. How this elementary case can be treated will be explained in the next section.
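The join and the join-closure M12 can be computed directly from the order. The following Python sketch is illustrative; the thesaurus fragment uses the translated labels from the figures but is only an approximation of the actual Umwelt-Thesaurus (up[t] lists all terms ≥ t, including t itself):

```python
# Joins in a thesaurus formalized as a finite ∨-semilattice, and the
# join-closure M12 := M1 ∪ M2 ∪ (M1 ∨ M2). Invented order fragment.
up = {
    "pond":             {"pond", "lakes", "standing waters", "surface waters", "T"},
    "lakes":            {"lakes", "standing waters", "surface waters", "T"},
    "back water":       {"back water", "standing waters", "surface waters", "T"},
    "standing waters":  {"standing waters", "surface waters", "T"},
    "river":            {"river", "surface waters", "T"},
    "surface waters":   {"surface waters", "T"},
    "storage dam":      {"storage dam", "reservoir facilities", "T"},
    "reservoir facilities": {"reservoir facilities", "T"},
    "T":                {"T"},
}

def join(t1, t2):
    """Smallest common generalization: the u whose upper set equals up[t1] ∩ up[t2]."""
    common = up[t1] & up[t2]
    for u in common:
        if up[u] == common:
            return u
    raise ValueError("no unique minimal upper bound: add a conjunction term")

def join_closure(m1, m2):
    """M12 := M1 ∪ M2 ∪ {a ∨ b | a ∈ M1, b ∈ M2}."""
    return set(m1) | set(m2) | {join(a, b) for a in m1 for b in m2}

print(join("lakes", "back water"))    # standing waters
```

This mirrors the example below: "Seen" ∨ "Staugewaesser" (lakes ∨ back water) yields the new attribute "Stehendes Gewaesser" (standing waters).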
3 Extending Lattice Diagrams
We consider a formal context K := (G, M, I) and a subcontext K0 := (G, M0, I ∩ (G × M0)) with M0 ⊆ M; furthermore, let m ∈ M \ M0 such that m^I is not an extent of K0, and let K0^m := (G, M0 ∪ {m}, I ∩ (G × (M0 ∪ {m}))). In this section we describe how to extend a line diagram of the concept lattice of K0 to a line diagram of the concept lattice of K0^m. Recall that U(K0^m) denotes the set of extents of the context K0^m and that (U(K0^m), ⊆) ≅ B(K0^m). This means that it suffices to consider only the closure system of extents together with the set-theoretic inclusion for our drawing problem.
[C) := {X ∈ P |C ⊆ X} (mI ] := {X ∈ P |X ⊆ mI } (C] \ ((mI ] ∪ {C}) with (C] := {X ∈ P |X ⊆ C} P \ ([C) ∪ (C])
We say that S ∈ U(K0 ) is a minimal ∩-generator if mI ∩ S 6∈P and if R ⊆ S with mI ∩ R = mI ∩ S (R ∈ U(K0 )) implies R = S. Since minimal ∩-generators have to be incomparable to mI , they cannot be in the class (1) or (2). For an element S of class (4), the equality (S ∩ C) ∩ mI = S ∩ mI shows that S cannot be a minimal ∩-generator. Thus, the minimal ∩-generators form a subset gm of I the class (3). Obviously, we have (U(Km 0 ) = P ∪ {m ∩ S | S ∈ gm } and each I minimal ∩-generator S has S ∩ m as lower neighbour in (U(Km 0 ).
Fig. 2. Partition of the ordered set P into four classes
Now, we are able to describe how to derive a line diagram for (U(Km 0 ) from a line diagram of (U(K0 ). First we insert a node for mI and join this node with the node of C by a line segment. Then we move the subdiagram representing (C]0 := {X ∈ U(K0 ) | X ⊆ C} along that line segment until the node of C coincides with the node of mI , but keep a copy of the nodes representing (C]0 \ ((mI ] ∪ {C}), i.e. class (3), still at its original place so that the nodes representing elements in (C]0 \ ((mI ] ∪ {C}) double. Each pair of doubled nodes is joined by a new line segment. The resulting diagram has eventually too many nodes because only the intersection of mI with the minimal ∩-generators yields new elements. Therefore we replace all moved nodes of the elements in (C]0 \ ((mI ] ∪ {C}) that are not in gm by a dummy; but we keep all line segments in place. In this way we obtain a line diagram for (U(Km 0 ) where the nodes represent the lattice elements and the dummies guarantee a satisfying arrangement of the line segments. Let us now illustrate our drawing procedure by an example. The documents in the database ULIDAT of the Umweltbundesamt (Federal Environmental Agency of Germany) are indexed by terms of the Umwelt-Thesaurus (Environmental Thesaurus) to describe their content for the purpose of document retrieval (cf. [1]). Hence we may understand the documents as objects and the thesaurus terms as attributes of a formal context K := (G, M, I) with ∨-compatible attributesemilattice. We have chosen the scales K1 and K2 induced by the sets of terms M1 and M2 , respectively, visualized in Fig. 3 together with the order relation constituted by the thesaurus. The ∨-closure M12 := M1 ∪ M2 ∪ (M1 ∨ M2 ) consists of the thesaurus terms shown in Fig. 42 . 
We can see that the term "Stehendes Gewaesser", which is a join of "Seen" and "Staugewaesser", and the term "T" (for the top element of the thesaurus), which is a join of "Stauanlage" and "Oberflaechengewaesser", are the new attributes. Since all objects have the
(Footnote: translation of the new label in Fig. 4: Stehendes Gewaesser / standing waters.)
TOSCANA-Systems Based on Thesauri
Fig. 3. Two ordered sets of thesaurus terms: (M1, ≤) with Oberflaechengewaesser, Seen, Teich, Fluss; (M2, ≤) with Stauanlage, Staudamm, Staugewaesser, Stausee, Staustufe, Talsperre
attribute "T", it does not carry any information and can therefore be omitted in the line diagram. The line diagram of the concept lattice of K1 ∨ K2 in Fig. 5 will be extended to a line diagram of B(K1 ∨ K2). For ease of comprehension, we use a non-nested line diagram first and will later extend a nested line diagram. The line diagram of B(K1 ∨ K2) in Fig. 6 has been obtained from the line diagram in Fig. 5 by doubling six lattice elements, which yields the new elements in Fig. 6 represented by the black circles with a white dot. In Fig. 7, a line diagram of B(K1 ∨ K2) is shown that has been derived from the nested line diagram in Fig. 1. In the line diagram in Fig. 1, each of the three top elements in the two large circles labelled with "Oberflaechengewaesser" and "Fluss" has been doubled. The resulting lattice elements are encircled by an ellipse in Fig. 7.
4 Discussion
The study of TOSCANA-systems based on thesauri has been motivated by different investigations of thesauri and classification systems from which the desire arose to develop methods for representing the resemblance of classification terms within line diagrams of concept lattices. The understanding of the join of two classification terms as the smallest generalization of those terms suggests an algebraic description method for the degree of resemblance of thesaurus terms. From this suggestion we derived the claim that a TOSCANA-system based on a thesaurus should have join-closed conceptual scales. Since TOSCANA represents combined conceptual scales by nested line diagrams, it is a consequence of our claim that the attributes in those diagrams should be join-closed too. This leads
Fig. 4. The join-closure (M1 ∪ M2 ∪ (M1 ∨ M2), ≤) of the thesaurus terms shown in Fig. 3
to the problem of how to extend a line diagram of a concept lattice whose attributes are taken from a ∨-semilattice to a line diagram with join-closed attributes. Our solution of this problem offers an incremental procedure for inserting new ∧-irreducible attribute concepts into line diagrams node by node. Each new attribute forces an extension of the actual concept lattice which doubles a convex subset of the lattice (cf. [4]). This extension need not be the new concept lattice generated by the old one and the new attribute, but it contains the new lattice. Thus the line diagram of the extension yields a graphical representation of the new concept lattice if we replace the elements of the extension which are not in that lattice by dummies. Of course, one might even erase those elements, but then one would have to erase line segments too and possibly insert new line segments, which could cause serious problems. An advantage of keeping the superfluous nodes and only changing them to dummies lies in the possibility of automating the drawing procedure. Thus, with our method, we especially offer an incremental procedure for drawing concept lattices automatically (an incremental algorithm for determining concept lattices has already been published in [7]).
Fig. 5. A non-nested line diagram of the concept lattice visualized in Fig. 1
References

1. W. Batschi: Environmental thesaurus and classification of the Umweltbundesamt (Federal Environmental Agency), Berlin. In: S. Stancikova and I. Dahlberg (eds.): Environmental Knowledge Organization and Information Management. Indeks, Frankfurt/Main 1994.
2. C. Carpineto and G. Romano: A Lattice Conceptual Clustering System and Its Application to Browsing Retrieval. Machine Learning 24 (1996), 95–122.
3. B. Ganter and R. Wille: Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin-Heidelberg 1996. (English translation to appear)
4. W. Geyer: The generalized doubling construction and formal concept analysis. Algebra Universalis 32 (1994), 341–367.
5. R. Godin and H. Mili: Building and Maintaining Analysis-Level Class Hierarchies Using Galois Lattices. In: A. Paepcke (ed.): Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'93), Washington, DC. ACM Press 1993, 394–410.
6. R. Godin, G. W. Mineau, and R. Missaoui: Incremental structuring of knowledge bases. In: Proceedings of the International Knowledge Retrieval, Use, and Storage
Fig. 6. The line diagram in Fig. 5 extended by six new lattice elements
for Efficiency Symposium (KRUSE'95), Santa Cruz. Lecture Notes in Artificial Intelligence, Springer 1995, 179–198.
7. R. Godin, R. Missaoui, and H. Alaoui: Incremental Concept Formation Algorithms Based on Galois (Concept) Lattices. Computational Intelligence 11(2) (1995), 246–267.
8. A. Grosskopf and G. Harras: A TOSCANA-system for speech-act verbs. FB4-Preprint, TU Darmstadt 1998.
9. E. Kalix: Entwicklung von Regelungskonzepten für thermische Abfallbehandlungsanlagen. Diplomarbeit, TU Darmstadt 1997.
10. U. Kaufmann: Begriffliche Analyse über Flugereignisse - Implementierung eines Erkundungs- und Analysesystems mit TOSCANA. Diplomarbeit, TH Darmstadt 1996.
11. B. Kohler-Koch and F. Vogt: Normen und regelgeleitete internationale Kooperationen. FB4-Preprint 1632, TU Darmstadt 1994.
12. W. Kollewe, C. Sander, R. Schmiede, and R. Wille: TOSCANA als Instrument der bibliothekarischen Sacherschließung. In: H. Havekost and H.-J. Wätjen (eds.): Aufbau und Erschließung begrifflicher Datenbanken. (BIS)-Verlag, Oldenburg 1995, 95–114.
Fig. 7. The nested line diagram in Fig. 1 extended by six new lattice elements
13. W. Kollewe, M. Skorsky, F. Vogt, and R. Wille: TOSCANA - ein Werkzeug zur begrifflichen Analyse und Erkundung von Daten. In: R. Wille and M. Zickwolff (eds.): Begriffliche Wissensverarbeitung - Grundfragen und Aufgaben. B.I.-Wissenschaftsverlag, Mannheim 1994, 267–288.
14. K. Mackensen and U. Wille: Qualitative text analysis supported by conceptual data systems. Preprint, ZUMA, Mannheim 1997.
15. T. Rock and R. Wille: Ein TOSCANA-System zur Literatursuche. In: G. Stumme and R. Wille (eds.): Begriffliche Wissensverarbeitung: Methoden und Anwendungen. Springer, Berlin-Heidelberg (to appear)
16. M. Roth-Hintz, M. Mieth, T. Wetter, S. Strahringer, B. Groh, and R. Wille: Investigating SNOMED by Formal Concept Analysis. Submitted to: Artificial Intelligence in Medicine.
17. P. Scheich, M. Skorsky, F. Vogt, C. Wachter, and R. Wille: Conceptual data systems. In: O. Opitz, B. Lausen, and R. Klar (eds.): Information and Classification. Springer, Berlin-Heidelberg 1993, 72–84.
18. G. Stumme and K. E. Wolff: Computing in conceptual data systems with relational structures. In: G. Mineau and A. Fall (eds.): Proceedings of the Second International Symposium on Knowledge Retrieval, Use, and Storage for Efficiency. Simon Fraser University, Vancouver 1997, 206–219.
19. N. Vogel: Ein begriffliches Erkundungssystem für Rohrleitungen. Diplomarbeit, TU Darmstadt 1995.
20. F. Vogt and R. Wille: TOSCANA - a graphical tool for analyzing and exploring data. In: R. Tamassia and I. G. Tollis (eds.): Graph Drawing '94. Lecture Notes in Computer Science 894. Springer, Berlin-Heidelberg-New York 1995, 226–233.
21. R. Wille: Conceptual landscapes of knowledge: a pragmatic paradigm for knowledge processing. In: G. Mineau and A. Fall (eds.): Proceedings of the Second International Symposium on Knowledge Retrieval, Use, and Storage for Efficiency. Simon Fraser University, Vancouver 1997, 2–13.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 139–153, 1998. © Springer-Verlag Berlin Heidelberg 1998
R. Dieng and S. Hug
MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts
A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT

David Genest and Eric Salvat

LIRMM (CNRS and Université Montpellier II), 161 rue Ada, 34392 Montpellier Cedex 5, France.
Abstract. This paper presents CoGITaNT, a software development platform for applications based on conceptual graphs. CoGITaNT is a new version of the CoGITo platform, adding simple graph rules and typed nested graphs with coreference links.
1 Introduction
The goal of the CORALI project (Conceptual graphs at Lirmm) is to build a theoretical formal model, to search for algorithms for solving problems in this model, and to develop software tools implementing this theory. Our research group considers the CG model [16] as a declarative model where knowledge is solely represented by labeled graphs and reasoning can be done by labeled graph operations [3]. In a first stage, such a model, the "simple conceptual graph" (SCG) model, has been defined [4,11]. This model has sound and complete semantics in first order logic. It has been extended in several ways, such as rules [12,13] and positive nested graphs with coreference links [5]. As for the SCG model, sound and complete semantics have been proposed for these extensions [6,15]. The software platform CoGITo (Conceptual Graphs Integrated Tools) [9,10] had been developed on the SCG model. This paper presents the updated version of this platform, CoGITaNT (CoGITo allowing Nested Typed graphs), which is based on these extensions: graph rules and nested conceptual graphs with coreference links.
2 CoGITaNT
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 154–161, 1998. © Springer-Verlag Berlin Heidelberg 1998

CoGITaNT is a software platform: it enables an application developer to manage graphs and to apply the operations of the model. For portability and maintenance reasons, the object-oriented programming paradigm was chosen and CoGITaNT was developed as a set of C++ classes. Each element of the theoretical model is represented by a class (conceptual graph, concept vertex, concept type, support, ...). Hence, the use of object programming techniques allows graphs to be represented in a "natural" way (close to the model): for example, a graph is a set of concept vertices plus a set of relation vertices having an ordered set
of neighbors. The methods associated with each class correspond to the usual handling functions (graph creation, type deletion, edge addition, ...) and specific operations on the model (projection, join, fusion, ...). CoGITaNT is compiled using the free C++ compiler GNU C++ on Unix systems. CoGITaNT is available for free on request (for further information, please send an e-mail to [email protected]). CoGITaNT is an extension of CoGITo: each functionality of the previous version is present in the new one. Hence, applications based upon CoGITo should work with CoGITaNT without major source file modifications. CoGITo has been presented in [10,2]; we only describe here some distinctive characteristics. The platform manages SCGs and implements algorithms that have been developed by the CORALI group, such as a backtracking projection algorithm, a polynomial projection algorithm for the case when the graph to be projected is a tree, and a maximal join (isojoin) algorithm. The new version extends the available operations on simple graphs by implementing graph rules. CoGITaNT also introduces a new set of classes that allows handling of typed nested graphs with coreference links, and some other new features: graphs are no longer necessarily connected, and the set of concept types and the set of relation types need not be ordered by a "kind of" relation in a lattice but may be partially ordered in a poset.
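As a rough illustration of this design, a simple conceptual graph can be modelled as a set of concept vertices plus a set of relation vertices whose ordered neighbour lists index into the concept set. The class and member names below are invented for the sketch; they are not the actual CoGITaNT API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much simplified mirror of the design described above:
// each element of the model is a class, and a graph is a set of concept
// vertices plus a set of relation vertices with ordered neighbour lists.
struct ConceptVertex {
    std::string type;      // concept type, e.g. "person"
    std::string referent;  // individual marker, or "*" for a generic concept
};

struct RelationVertex {
    std::string type;            // relation type, e.g. "agent"
    std::vector<int> neighbors;  // ordered indices into the concept list
};

struct SimpleConceptualGraph {
    std::vector<ConceptVertex> concepts;
    std::vector<RelationVertex> relations;

    int addConcept(const std::string& t, const std::string& ref = "*") {
        concepts.push_back({t, ref});
        return static_cast<int>(concepts.size()) - 1;
    }
    void addRelation(const std::string& t, const std::vector<int>& args) {
        relations.push_back({t, args});
    }
};
```

A usual handling function such as projection would then be a method traversing these vectors; the real platform additionally ties vertices to the support (type hierarchies).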
3 Graph Rules
Conceptual graph rules have been presented in [13] and [12]. These rules are of the form "IF G1 THEN G2" (noted G1 ⇒ G2), where G1 and G2 are simple conceptual graphs with coreference links between concepts of G1 and G2. Such vertices are called the connection points of the rule. Conceptual graph rules are implemented as a new CoGITaNT class. An instance of this class consists of two SCGs, the hypothesis and the conclusion, and a list of couples of concept vertices (c1, c2), where c1 is a vertex of the hypothesis and c2 is a vertex of the conclusion. This list represents the list of connection points of the rule.
Fig. 1. An example of a graph rule.
Figure 1 may be interpreted as the following sentence: "if a person X is the brother of a person Y, and Y is the father of a person Z, then X is the uncle of Z, and there exists a person who is the father of X and Y". In this figure, x, y, z
D. Genest and E. Salvat
are used to represent coreference links. Vertices whose referents are ∗x, ∗y and ∗z are the connection points of this rule. CG rules can be used in forward chaining and backward chaining, both mechanisms being defined with graph operations. Furthermore, these two mechanisms are sound and complete with respect to deduction on the corresponding subset of FOL formulae. Forward chaining is used to explicitly enrich facts with knowledge which is implicitly present in the knowledge base by way of rules. When a fact fulfills the hypothesis of a rule (i.e. when there is a projection from the hypothesis to the fact), the conclusion of the rule can be "added" to the fact (i.e. joined on the connection points). This basic operation of forward chaining is implemented in CoGITaNT. This method applies a rule R to a CG G, following a projection π from the hypothesis of R to G. The resulting graph is R[G, π], obtained by joining the conclusion of R on G. The join is made on the connection points of the conclusion and the vertices of G which are the images of the corresponding connection points of the hypothesis. This is the basic method of forward chaining, since it only applies a given rule to a given graph following a given projection. But since CoGITaNT can compute all projections between two graphs, it is easy to compute all applications of a rule on a graph. Backward chaining is used to prove a request (a goal) on a knowledge base without applying rules to the facts of the knowledge base. We search for a unification between the conclusion of a rule and the request. If such a unification exists, then a new request is built by deleting the unified part of the previous request and joining the hypothesis of the rule to the new request.
We defined the backward chaining method in two steps: first, we compute all unifications between the conclusion of the rule and the request graph; then, given a rule R, a request graph Q and a unification u(Q, R), another procedure builds the new request. As in forward chaining, the implemented methods are the basic methods of a backward chaining mechanism. The user can manage the resolution procedure, for example by implementing heuristics that choose, at each step, which unifications to use first.
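As an illustration of the forward-chaining step, here is a deliberately simplified model (not the CoGITaNT implementation): facts and both sides of a rule are flattened to triples (relation, arg1, arg2), and terms beginning with '?' play the role of the coreference variables, i.e. the connection points. A projection of the hypothesis into the facts is found by backtracking, and the conclusion is then joined on the images of the connection points, with fresh nodes for variables that appear only in the conclusion.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <vector>

using Triple = std::tuple<std::string, std::string, std::string>;
using Binding = std::map<std::string, std::string>;

bool isVar(const std::string& t) { return !t.empty() && t[0] == '?'; }

bool unifyTerm(const std::string& pat, const std::string& val, Binding& b) {
    if (!isVar(pat)) return pat == val;
    auto it = b.find(pat);
    if (it != b.end()) return it->second == val;
    b[pat] = val;
    return true;
}

// Find one projection (homomorphism) of `hypo` into `facts`, by backtracking.
bool project(const std::vector<Triple>& hypo, std::size_t i,
             const std::set<Triple>& facts, Binding& b) {
    if (i == hypo.size()) return true;
    for (const auto& f : facts) {
        Binding saved = b;
        if (unifyTerm(std::get<0>(hypo[i]), std::get<0>(f), b) &&
            unifyTerm(std::get<1>(hypo[i]), std::get<1>(f), b) &&
            unifyTerm(std::get<2>(hypo[i]), std::get<2>(f), b) &&
            project(hypo, i + 1, facts, b))
            return true;
        b = saved;  // undo the bindings of this candidate
    }
    return false;
}

// Apply the rule once: join the conclusion on the images of the connection
// points; unbound conclusion variables become fresh nodes. Relation names
// are assumed to be constants.
void applyRule(const std::vector<Triple>& hypo,
               const std::vector<Triple>& concl, std::set<Triple>& facts) {
    Binding b;
    if (!project(hypo, 0, facts, b)) return;
    int fresh = 0;
    auto subst = [&](const std::string& t) {
        if (!isVar(t)) return t;
        if (!b.count(t)) b[t] = "_new" + std::to_string(fresh++);
        return b[t];
    };
    for (const auto& c : concl)
        facts.insert({std::get<0>(c), subst(std::get<1>(c)),
                      subst(std::get<2>(c))});
}
```

With the rule of Fig. 1 encoded this way, applying it to the facts brother(Tom, Bob) and father(Bob, Ann) adds uncle(Tom, Ann) plus a fresh common father for Tom and Bob.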
4 Typed Nested Graphs with Coreference Links

4.1 Untyped Nested Graphs and Typed Nested Graphs
In some applications, the SCG model does not allow a satisfactory representation of the knowledge involved. An application for the validation of stereotyped behavior models in human organizations [1] uses CGs to represent behaviors: such CGs represent situations, pre-conditions and post-conditions of actions, each of these elements being described with CGs. Another application uses CGs for document retrieval [7]: a document is described with a CG representing the author, the title, and the subject (which is itself described by a CG). In these applications, knowledge can be structured by hierarchical levels, but this structure cannot be represented easily in the SCG model. A satisfactory way of representing this structure is to put CGs within CGs.
An extension of the model, called the (untyped) "nested conceptual graph" (NCG) model [5], allows the representation of this hierarchical structure by adding to each concept vertex a "partial internal description" which may be either a set of NCGs or generic (i.e. an empty set, denoted ∗∗).
Fig. 2. An untyped nested conceptual graph.
The (untyped) nested graph of figure 2 may represent the knowledge "A person is watching television. This television has a power-on button and its screen displays an uninteresting TV show where a presenter talks about someone". In figure 2, the same notion of nesting is used to represent that the button and the screen are "components" of the TV, that the screen "represents" a TV show, and that the scene where the presenter talks about someone is a "description" of the show. Hence, untyped NCGs fail to distinguish these various nesting semantics. However, in some applications, it is useful to specify the nesting semantics. Typed NCGs may represent semantics more specific than "the nested graph is the partial internal description" (see figure 3). The typed NCG model is not precisely described here; please refer to [5] for a more complete description. The typed NCG extension adds to the support a new partially ordered type set, called the nesting type set, which is disjoint from the other type sets and has a greatest element called "Description". A typed NCG G can be denoted G = (R, C, E, l) where R, C and E are respectively the relation, concept and edge sets of G, and l is the labeling function of R and C such that ∀r ∈ R, l(r) = type(r) is an element of the relation type set, and ∀c ∈ C, l(c) = (type(c), ref(c), desc(c)), where type(c) is an element of the concept type set, ref(c) is an individual marker or ∗, and desc(c) (called the "partial internal description") is either ∗∗, called the generic description, or a non-empty set of couples (ti, Gi), where ti is a nesting type and Gi is a typed NCG.

4.2 Typed Nested Graphs in CoGITaNT
One of the main new characteristics of CoGITaNT is that it allows handling typed NCGs. Data structures used by CoGITaNT for representing such graphs are a natural implementation of this structure.
Fig. 3. A typed nested conceptual graph and a coreference link (dashed line).
As described in figure 4, a typed NCG is composed of a list of connected components and a list of coreference classes (see further details below). A connected component is constituted of a list of relation vertices and a list of concept vertices. A concept vertex is composed of a concept type, a referent (individual marker or ∗) and a list of nestings (∗∗ is represented by a NULL pointer). A nesting is constituted of a nesting type and a typed NCG. As for simple graphs, the available methods implement the usual handling functions and operations on the extended model. The projection structure and the projection operation defined on SCGs have been adapted to the recursive structure of NCGs, and the projection operation follows the constraint induced by the types of nestings [5]. A "projection between typed NCGs" from H to G is represented by a list of "projections between connected components"¹. A "projection between connected components" from ccH to ccG is represented by a list of pairs (r1, r2) of relations² and a list of structures (c1, c2, n)³ where c1 is a concept vertex of ccH, c2 is a concept vertex of ccG, and n is an empty list (if c1 has a generic description) or a list of structures (n1, n2, g) where n1 is a nesting of c1, n2 is a nesting of c2, and g is a "projection between typed NCGs" from the graph nested in n1 to the graph nested in n2. Thus, the representation of a projection between typed NCGs has a recursive structure which is comparable with the structure of typed NCGs. The projection algorithm between NCGs first computes every projection between level-0 graphs without considering nestings. The obtained projections are then filtered: let Π be one of these projections; if a concept vertex c of H has a nesting (t, G) such
¹ For each connected component in H there is one "projection between connected components" in the list.
² For each relation vertex r in ccH there is one pair (r, Π(r)) in the list, where Π(r) is the image of r.
³ For each concept vertex c in ccH there is one structure (c, Π(c), n) in the list, where Π(c) is the image of c.
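The recursive data structures of Fig. 4 can be transcribed roughly as follows. This is a hypothetical sketch with invented names, not the actual CoGITaNT classes; the only point is the recursion, a nesting containing a typed NCG.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TypedNCG;  // forward declaration: the structure is recursive

struct Nesting {
    std::string nestingType;          // e.g. "Component", "Description"
    std::shared_ptr<TypedNCG> graph;  // the nested typed NCG
};

struct Concept {
    std::string type;               // concept type
    std::string referent;           // individual marker or "*"
    std::vector<Nesting> nestings;  // an empty list encodes "**"
};

struct Relation {
    std::string type;
    std::vector<int> neighbors;  // ordered indices of concept vertices
};

struct ConnectedComponent {
    std::vector<Relation> relations;
    std::vector<Concept> concepts;
};

struct TypedNCG {
    std::vector<ConnectedComponent> components;
    // non-trivial coreference classes: each is a list of
    // (component index, concept index) pairs
    std::vector<std::vector<std::pair<int, int>>> corefClasses;
};
```

Building the television example of Fig. 3 then means creating an inner TypedNCG for the button and hanging it under the "television" concept as a "Component" nesting.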
Fig. 4. Data structures: typed NCG.
that t ≥ t′ and there is a projection from G to G′, then Π is not an acceptable projection. Acceptable projections are then completed (the third part of the (c1, c2, n) structures: projections between nestings). SCGs are also typed NCGs (without any nesting) and untyped NCGs are also typed NCGs (the type of each nesting is "Description"), hence CoGITaNT can also handle SCGs and untyped NCGs. Even if the data structures and operations are not optimal when handling such graphs, there is no noticeable loss of performance compared to the SCG operations of CoGITo. Useless comparisons of (generic) descriptions for SCGs handled as typed NCGs (e.g. comparisons of NULL pointers) and useless comparisons of nesting types (equal to "Description") for untyped NCGs handled as typed NCGs do not influence the overall performance of the system.

4.3 Coreference Links
CoGITaNT also allows handling coreference links. A set of non-trivial coreference classes is associated with each graph. These classes are represented by lists of (pointers to) concept vertices. Trivial coreference classes (such as one-element classes and classes constituted by the set of concept vertices having the same type and the same individual marker) are not represented, for memory-saving reasons. The coreference link in figure 3 is represented by a (non-trivial) 2-element coreference class that contains the two "person" concept vertices. Of course, the projection method makes use of coreference classes: the images of all vertices of a given coreference class must belong to the same (trivial or non-trivial) coreference class. The projection algorithm from H to G currently computes first every projection from H to G without coreference constraints.
Let Π be one of these projections. In order to return only those conforming to the coreference constraints, the projections are then filtered: for every non-trivial coreference class coH of H and all c1, c2 ∈ coH, if Π(c1) and Π(c2) are not in the same (trivial or non-trivial) coreference class, then Π is not an acceptable projection.
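This filter can be sketched as follows (signatures and representation are assumptions made for the example: concept vertices are plain indices and a projection is a map from concept indices of H to concept indices of G):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using CorefClass = std::set<int>;

// Index of the non-trivial class containing c, or -1 if c only belongs
// to a trivial (e.g. singleton) class.
int classOf(int c, const std::vector<CorefClass>& classes) {
    for (std::size_t i = 0; i < classes.size(); ++i)
        if (classes[i].count(c)) return static_cast<int>(i);
    return -1;
}

// A projection is acceptable iff, for every non-trivial coreference
// class of H, the images of its vertices all lie in one coreference
// class of G (possibly a trivial one, i.e. all images are equal).
bool acceptable(const std::map<int, int>& pi,
                const std::vector<CorefClass>& classesH,
                const std::vector<CorefClass>& classesG) {
    for (const auto& coH : classesH) {
        std::set<int> images;
        for (int c : coH) images.insert(pi.at(c));
        if (images.size() <= 1) continue;  // one image: trivially coreferent
        const int target = classOf(*images.begin(), classesG);
        if (target == -1) return false;
        for (int img : images)
            if (classOf(img, classesG) != target) return false;
    }
    return true;
}
```
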
5 The BCGCT Format
A simple extension of the BCGCT format [9] is the CoGITaNT native file format: it is a textual format allowing supports, rules and graphs to be saved in permanent storage. Files in this format can easily be written and understood by a user; BCGCT is indeed a structured file format which represents every element of the model in a natural way. For example, the representation of a graph is constituted of three parts: the first describes concept vertices, the second describes relation vertices, and the third represents the edges between these vertices. The representation of a concept vertex is a 3-tuple constituted of a concept type, an individual marker or an optional "∗" (generic concept), and a set of couples (ti, Gi) where ti is a nesting type and Gi is a graph identifier, or an optional "∗∗" (generic description). Ex: c1=[television:*:(Component,gtvdescr)]; ("gtvdescr" is a graph identifier). Coreference links are represented using the same variable symbol for each concept vertex belonging to the same coreference class. Ex: c3=[person:$x1:**]; and c9=[person:$x1]; represent that these concept vertices belong to the same coreference class.
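A formatter producing concept-vertex lines in the shape of these examples can be written in a few lines. This helper is only an assumption-based sketch following the examples above; the real BCGCT grammar is defined in [9].

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Emits a concept-vertex line: id=[type:referent:(t1,G1),...]; where an
// empty nesting list stands for the generic description "**".
std::string conceptLine(
    const std::string& id, const std::string& type,
    const std::string& referent,
    const std::vector<std::pair<std::string, std::string>>& nestings) {
    std::string s = id + "=[" + type + ":" + referent + ":";
    if (nestings.empty()) {
        s += "**";  // generic description
    } else {
        for (std::size_t i = 0; i < nestings.size(); ++i) {
            if (i) s += ",";
            // (nesting type, identifier of the nested graph)
            s += "(" + nestings[i].first + "," + nestings[i].second + ")";
        }
    }
    return s + "];";
}
```
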
6 Conclusion
The platform has been provided to about ten research centers and firms. In particular, two collaborations of our research group, one with Dassault Electronique and the other with the Abes (Agency of French university libraries), have led to evolutions of the model and the platform. The research center of the firm Dassault Electronique uses the platform to build software that provides assistance for the acquisition and validation of stereotyped behavior models in human organizations [1]. This application led our research group to the definition of the typed NCG model and its implementation in CoGITaNT. The other project of our group is on document retrieval: a first approach [8] convinced the Abes to continue the collaboration, to study the efficiency of a document retrieval system based on CGs, and to develop a prototype of such a system. These collaborations suggest many improvement perspectives for CoGITaNT. The efficiency of some operations can be improved by algorithmic studies. In particular, optimizing the unification procedure for graph rules will improve the efficiency of the backward chaining mechanism. Moreover, coreference link processing during projection computation can be improved by an algorithm that takes into account, as early as possible, the restrictions induced by these links.
A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT
161
We plan to extend the expressiveness of CoGITaNT with nested graph rules. A first theoretical study has been done in [14,12], but the nested graphs considered in this study are without coreference links. The extension of this work with coreference links will be implemented in the platform. In the long term, negation (or the limited types of negation required by the applications we are involved in) will be introduced in CoGITaNT, as required by the users of the platform.
References

1. Corinne Bos, Bernard Botella, and Philippe Vanheeghe: Modelling and simulating human behaviours with conceptual graphs. In: Proceedings of ICCS '97, volume 1257 of LNAI, pages 275–289. Springer, 1997.
2. Boris Carbonneill, Michel Chein, Olivier Cogis, Olivier Guinaldo, Ollivier Haemmerlé, Marie-Laure Mugnier, and Eric Salvat: The COnceptual gRAphs at LIrmm project. In: Proceedings of the first CGTools Workshop, pages 5–8, 1996.
3. Michel Chein: The CORALI project: From conceptual graphs to conceptual graphs via labelled graphs. In: Proceedings of ICCS '97, volume 1257 of LNAI, pages 65–79. Springer, 1997.
4. Michel Chein and Marie-Laure Mugnier: Conceptual graphs: Fundamental notions. RIA, 6(4):365–406, 1992.
5. Michel Chein and Marie-Laure Mugnier: Positive nested conceptual graphs. In: Proceedings of ICCS '97, volume 1257 of LNAI, pages 95–109. Springer, 1997.
6. Michel Chein, Marie-Laure Mugnier, and Geneviève Simonet: Nested graphs: A graph-based knowledge representation model with FOL semantics. To be published in proceedings of KR'98, 1998.
7. David Genest: Une utilisation des graphes conceptuels pour la recherche documentaire. Mémoire de DEA, Université Montpellier II, 1996.
8. David Genest and Michel Chein: An experiment in document retrieval using conceptual graphs. In: Proceedings of ICCS '97, volume 1257 of LNAI, pages 489–504. Springer, 1997.
9. Ollivier Haemmerlé: CoGITo : une plateforme de développement de logiciels sur les graphes conceptuels. PhD thesis, Université Montpellier II, France, 1995.
10. Ollivier Haemmerlé: Implementation of multi-agent systems using conceptual graphs for knowledge and message representation: the CoGITo platform. In: Supplementary Proceedings of ICCS '95, pages 13–24, 1995.
11. Marie-Laure Mugnier and Michel Chein: Représenter des connaissances et raisonner avec des graphes. RIA, 10(1):7–56, 1996.
12. Eric Salvat: Raisonner avec des opérations de graphes : Graphes conceptuels et règles d'inférence.
PhD thesis, Université Montpellier II, France, 1997.
13. Eric Salvat and Marie-Laure Mugnier: Sound and complete forward and backward chaining of graph rules. In: Proceedings of ICCS '96, volume 1115 of LNAI, pages 248–262. Springer, 1996.
14. Eric Salvat and Geneviève Simonet: Règles d'inférence pour les graphes conceptuels emboîtés. RR LIRMM 97013, 1997.
15. Geneviève Simonet: Une sémantique logique pour les graphes emboîtés. RR LIRMM 96047, 1996.
16. John F. Sowa: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Towards Correspondences Between Conceptual Graphs and Description Logics

P. Coupey¹ and C. Faron²

¹ LIPN-CNRS UPRESA 7030, Université Paris 13, Av. J.B. Clément, 93430 Villetaneuse, France. [email protected]
² LIFO, Université d'Orléans, rue Léonard de Vinci, BP 6759, 45067 Orléans cedex 2, France. [email protected], [email protected]

Abstract. We present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, we consider the Simple Conceptual Graphs model provided with type definitions (which we call T SCG) and the ALEOI standard Description Logic. We prove an equivalence between a subset of T SCG and a subset of ALEOI. Based on this equivalence, we suggest extensions of both formalisms while preserving the equivalence. In particular, regarding standard Description Logics, where a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing type definitions from the "conjunction" of any two types and consequently partial type definitions. Symmetrically, regarding generalization/specialization operations in Conceptual Graphs, we conclude by suggesting how Description Logics could take advantage of these correspondences to improve the explanation of subsumption computation.
1 Introduction
Conceptual Graphs (CGs) [21] and Description Logics (DLs) [3] are both knowledge representation formalisms descended from semantic networks. They are dedicated to the representation of assertional knowledge (i.e. facts) and terminological knowledge: hierarchies of concept types and relation types in CGs, hierarchies of concepts and roles in DLs. Subsumption is central in both formalisms: between graphs or between types in CGs (generalization), between concepts in DLs. Similarities between these formalisms have often been pointed out [1,11], but up to now, to our knowledge, no formal study has ever been carried out about correspondences between CGs and DLs. Beyond an interesting theoretical result, such a work would offer the CG and DL communities the mutual benefit of about 15 years of research. More precisely, numerous formal results in DLs about semantics and subsumption complexity could easily be adapted to CGs. Symmetrically, the specialization/generalization and graph projection operations defined in CGs would help in explaining the computation of subsumption in DLs and thus contribute to current research in this community [14].

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 165–178, 1998. © Springer-Verlag Berlin Heidelberg 1998
P. Coupey and C. Faron
In this paper, we present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, focusing on the terminological level, we consider the standard Description Logic ALEOI [18] and the Simple Conceptual Graphs model [7,8,5] provided with type definitions [6,12,13], which we call T SCG. We outline fundamental differences between the two formalisms and set up restrictions on them, thus defining two subsets: G and L. Then we show, with regard to their formal semantics, that G and L are two notational variants of the same formalism and that the subsumption definitions in G and L are equivalent. Based on this result, we make both formalisms vary while preserving the equivalence. In particular, regarding standard DLs, where a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing concept type definitions from the "conjunction" of any two types, and consequently partial type definitions. Symmetrically, regarding generalization/specialization operations in CGs, we suggest how DLs could take advantage of their correspondences with CGs to improve the explanation of subsumption computation. We first present ALEOI and T SCG in sections 2 and 3. Then we prove in section 4 the equivalence between the sub-formalism G of T SCG and the sub-formalism L of ALEOI. In section 5 we examine extensions of G and L that preserve the equivalence.
2 Description Logics
Description logics are a family of knowledge representation formalisms descended from the KL-ONE language [3]. They mostly formalize the idea of concept definition and of reasoning about these definitions. A DL includes a terminological language and an assertional language. The assertional language is dedicated to the statement of facts and assertional rules, the terminological language to the construction of concepts and roles (binary relations). A concept definition states necessary and sufficient conditions for membership in the extension of that concept. Concepts in DLs are organized in a taxonomy. The two fundamental reasoning tasks are thus subsumption computation between concepts, and classification. The classifier automatically inserts a new defined concept in the taxonomy, linking it to its most specific subsumers and to the most general concepts it subsumes. Among standard Description Logics we have selected ALEOI [18], since it is the largest subset of the best-known DL CLASSIC [2] and the necessary description language for which a correspondence with the Simple Conceptual Graphs model holds true. ALEOI is inductively defined from a set Pc of primitive concepts, a set Pr of primitive roles, a set I of individuals, the constant concept ⊤, and two abstract syntax rules; one for concepts (P is a primitive concept, R a role and the ai elements of I):

C, D → ⊤              (most general concept)
     | P              (primitive concept)
     | ∀R.C           (universal restriction on role)
     | ∃R.C           (existential restriction on role)
     | C ⊓ D          (concept conjunction)
     | {a1 … an}      (concept in extension)

and one for roles (Q is a primitive role):

R → Q                 (primitive role)
  | Q⁻¹               (inverse primitive role)
Figure 1 presents examples of ALEOI formulae. The first one describes the females who are researchers; the second one, the males who have at least one child; the third, the females all of whose children are graduates; the fourth, the boys whose mother has a sister who is a researcher; the last one, the males whose (only) friends are Pamela and Claudia, who are female.

Female ⊓ Researcher
Male ⊓ ∃child.⊤
Female ⊓ ∀child.Graduate
Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
Male ⊓ ∀friend.({Pamela Claudia} ⊓ Female)

Fig. 1. Examples of ALEOI formulae
A concept may be fully defined¹ (i.e. with necessary and sufficient conditions) from a term C of ALEOI, which is noted A ≡ C, or partially defined (i.e. with necessary but not sufficient conditions) from a term C of ALEOI, which is noted A ⊑ C. A terminological knowledge base (T-KB) is thus a set of concepts and their definitions (partial or full). Note that all partial definitions can be converted into full definitions by using new primitive concepts (cf. [17]). Let an atomic concept A be partially defined w.r.t. a term C: A ⊑ C. This can be converted into a full definition by adding to Pc a new primitive concept A′: A ≡ A′ ⊓ C. A′ implicitly describes the remainder of the necessary additional conditions for a C to be an A. From a theoretical point of view, one can thus always consider that a T-KB has no partial definitions. Figure 2 presents the definition of the concept RNephew.
RNephew ≡ Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
Fig. 2. Definition of concept RNephew

¹ We assume that concept definitions are non-recursive.
The formal meaning of concept descriptions built according to the above rules is classically given as an extensional semantics by an interpretation I = (D, ||.||^I). The domain D is an arbitrary non-empty set of individuals, and ||.||^I is an interpretation function mapping each concept onto a subset of D, each role onto a subset of D × D, and each individual ai onto an element of D (if ai and aj are different individual names then ||ai||^I ≠ ||aj||^I). The denotation of a concept description is given by:

||⊤||^I = D
||C ⊓ D||^I = ||C||^I ∩ ||D||^I
||∀R.C||^I = {a ∈ D | ∀b, if (a, b) ∈ ||R||^I then b ∈ ||C||^I}
||∃R.C||^I = {a ∈ D | ∃b, (a, b) ∈ ||R||^I and b ∈ ||C||^I}
||{a1 … an}||^I = {||a1||^I} ∪ … ∪ {||an||^I}
||Q⁻¹||^I = {(a, b) | (b, a) ∈ ||Q||^I}

An interpretation I is a model for a concept C if ||C||^I is non-empty. Based on this semantics, C is subsumed by D, noted C ⊑ D, iff ||C||^I ⊆ ||D||^I for every interpretation I. C is equivalent to D iff (C ⊑ D) and (D ⊑ C). The reader will find complete theoretical and practical developments in [17].
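To make the semantics above concrete, the following sketch evaluates concept extensions over one finite interpretation. It is our own illustration, not code from the paper: all class and variable names are invented, it covers only the fragment used in the running examples (⊤, primitive concepts, conjunction, existential restriction, inverse roles), and checking a single model can only refute subsumption, which is defined over all interpretations.

```python
from dataclasses import dataclass

class Concept: pass

@dataclass(frozen=True)
class Top(Concept): pass                              # the concept ⊤

@dataclass(frozen=True)
class Prim(Concept): name: str                        # primitive concept P

@dataclass(frozen=True)
class And(Concept): left: Concept; right: Concept     # C ⊓ D

@dataclass(frozen=True)
class Exists(Concept):                                # ∃R.C, or ∃R⁻¹.C if inverse
    role: str
    inverse: bool
    filler: Concept

def extension(c, domain, prims, roles):
    """||c||^I for I = (domain, ||.||^I); prims maps primitive concept
    names to sets, roles maps role names to sets of pairs."""
    if isinstance(c, Top):
        return set(domain)
    if isinstance(c, Prim):
        return set(prims.get(c.name, set()))
    if isinstance(c, And):
        return (extension(c.left, domain, prims, roles)
                & extension(c.right, domain, prims, roles))
    if isinstance(c, Exists):
        pairs = roles.get(c.role, set())
        if c.inverse:                  # ||Q⁻¹|| = {(a, b) | (b, a) ∈ ||Q||}
            pairs = {(b, a) for (a, b) in pairs}
        filler = extension(c.filler, domain, prims, roles)
        return {a for (a, b) in pairs if b in filler}
    raise TypeError(c)

# RNephew ≡ Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)  (figure 2)
rnephew = And(Prim("Boy"),
              Exists("child", True,
                     And(Prim("Female"),
                         Exists("sister", False, Prim("Researcher")))))

domain = {"tom", "mary", "ann"}
prims = {"Boy": {"tom"}, "Female": {"mary", "ann"}, "Researcher": {"ann"}}
roles = {"child": {("mary", "tom")}, "sister": {("mary", "ann")}}
print(extension(rnephew, domain, prims, roles))       # {'tom'}
```

On this single model one can also observe ||RNephew||^I ⊆ ||Boy||^I, as subsumption requires; the converse direction, of course, cannot be established from one model.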
3 Conceptual Graphs
Conceptual Graphs were first introduced by J. Sowa in [21]. In this paper, we consider the Simple Conceptual Graphs model SCG defined in [7,8,5] and extended with type definitions in [12,13]. We call it T SCG. It is a formal system appropriate for comparisons with Description Logics. We focus on the terminological level of T SCG, i.e. the canon. A canon mainly consists of two type hierarchies: the concept type hierarchy Tc and the relation type hierarchy Tr, each provided with a most general type, respectively ⊤c and ⊤r. A canon also contains a canonical base of conceptual graphs, called star graphs, expressing constraints on the maximal types of the concept vertices adjacent to relation vertices [16]. Conceptual graphs are built according to the canon, i.e. they are made of concept and relation nodes whose types belong to Tc and Tr and which respect the constraints expressed in the canonical base. Both type hierarchies are composed of atomic and defined types. A concept type definition² is a monadic abstraction, i.e. a conceptual graph one generic concept of which is taken as formal parameter. It is noted tc(x) ⇔ D(x). The formal parameter concept node of D(x) is called the head of tc, its type the genus of tc, and D(x) the differentia from tc to its genus [13]. A relation type definition is an n-ary abstraction, i.e. a conceptual graph with n generic concepts taken as formal parameters. It is noted tr(x1, …, xn) ⇔ D(x1, …, xn). Figure 3 presents the definition of concept type RNephew and its logical interpretation.
² We assume that type definitions are non-recursive.
RNephew(x) ⇔ [Boy: *x] ←(child)← [Female] →(sister)→ [Researcher]
(definition graph rendered in linear notation)

∀x (RNephew(x) ⇔ ∃y ∃z (Boy(x) ∧ child(y, x) ∧ Female(y) ∧ sister(y, z) ∧ Researcher(z)))
Fig. 3. Definition of concept type RNephew
Tc and Tr are provided with the order relations ≤c and ≤r. Between two atomic concept types, the order relation is given by the user. In the extended model provided with type definitions, a defined concept type tc is by definition introduced as a sub-type of its genus. The ≤c relation is thus extended with tc ≤c genus(tc). By contrast, relation type definitions do not extend the ≤r relation: between any two relation types, the order relation is given by the user. Any conceptual graph can be converted into an equivalent conceptual graph by expanding its defined types, i.e. replacing each one by its definition graph. The atomic form of a conceptual graph is an expanded graph all of whose types are atomic. In the following, for type definitions, we will only consider the expanded form of definition graphs. The reader will find a complete development of type definitions in [13].
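The expansion step just described can be sketched on a simplified, tree-shaped encoding of definition graphs. Everything below is our own illustrative assumption rather than the paper's notation: a graph is a pair (concept type, list of (relation, subtree) edges), and the dict maps each defined type to its definition graph, whose root type is the genus and whose edges are the differentia.

```python
def expand(node, definitions):
    """Atomic form of `node`: recursively replace every defined type
    by its (expanded) definition graph."""
    ctype, edges = node
    edges = [(rel, expand(sub, definitions)) for rel, sub in edges]
    if ctype in definitions:
        # splice in the expanded definition: its genus replaces the type,
        # its differentia edges are added to the existing ones
        genus, diff = expand(definitions[ctype], definitions)
        return (genus, diff + edges)
    return (ctype, edges)

# RNephew defined from genus Boy; "child-of" stands for the incoming
# child edge of figure 3 (a purely illustrative label).
definitions = {
    "RNephew": ("Boy", [("child-of",
                         ("Female", [("sister", ("Researcher", []))]))]),
}
graph = ("RNephew", [("friend", ("Female", []))])
print(expand(graph, definitions))
```

Expanding the graph replaces the RNephew vertex by a Boy vertex carrying both the definition's edges and the original friend edge, i.e. the atomic form.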
4 Correspondences Between T SCG and ALEOI
Let us now consider the terminological parts of T SCG and ALEOI. Regarding the definition of concept RNephew (figure 2) and of type RNephew (figure 3), which obviously describe the same set of individuals, one may easily guess that a conceptual graph made of a relation R between two concepts [C: *x] and [D] is translated into ALEOI by the formula C ⊓ ∃R.D. In this section, our aim is to formalize this intuition and prove an equivalence between a subset G of T SCG and a subset L of ALEOI. In the following subsection we define G and L by restricting T SCG and ALEOI.

4.1 Restrictions of T SCG and ALEOI
To state an equivalence between T SCG and ALEOI, some restrictions are needed: some correspond to real differences between CGs and DLs, while others are merely useful to keep definitions and properties from becoming unnecessarily complex. In section 5 we show that most of these restrictions may be relaxed while preserving the equivalence.

Necessary Restrictions
1. In ALEOI, and more generally in all DLs, complex role descriptions are not allowed. In T SCG, consider two relations R and R′ from a concept vertex [C: *x] to a concept vertex [D], whose logical formula is ∃y C(x) ∧ R(x, y) ∧ R′(x, y) ∧ D(y). This cannot be translated into ALEOI, since it is impossible to describe properties on roles and thus to describe the conjunction R(x, y) ∧ R′(x, y). This kind of description corresponds to cycles (cf. [21, page 78]). Consequently, we only consider in T SCG definition graphs which do not contain cycles, i.e. trees (as in [15]).
2. Since there are no constraints on roles in ALEOI, we consider the canonical base of T SCG to be empty. Concept vertices adjacent to a relation vertex can thus have any type.
3. Since there is no connective to define complex roles in ALEOI, we consider that there are no defined relation types in the set Tr of T SCG.
4. Since there are only binary relations in standard DLs, we do not consider n-ary relation types in T SCG. As outlined in section 5, this is not a real restriction, since any n-ary relation can be translated into a set of binary relations.
5. Since CGs are existential graphs, we do not consider in ALEOI the universal quantification on roles, i.e. the ∀ connective, which is standard in DLs.
6. Since there is no way in CGs to define³ a type as the conjunction of two types, we limit in ALEOI the scope of the ⊓ connective of standard DLs: conjunction is restricted to the form C ⊓ ∃R.D.
We will show in section 5.2 how to relax all these restrictions, the fifth excepted, by extending T SCG and ALEOI.

Useful Restrictions
1. Even though descriptions of sets of individuals in ALEOI and individual referents in T SCG are both allowed, they would unnecessarily complicate the proof of the equivalence. Consequently, we consider no concept descriptions in extension in ALEOI ({a1 … an}) and no individual concepts in T SCG (only generic concepts).
2. Since any concept in ALEOI is defined, except primitive ones which cannot be compared, we consider type hierarchies as only made of incomparable types we call primitive types: the most specialized common super-type of any two atomic types is the type ⊤c in Tc and the type ⊤r in Tr.
3. Since in DLs subsumption between concepts relative to a terminology T is equivalent to subsumption with an empty terminology (cf. section 2), we only consider unordered atomic types in the canon of T SCG (i.e. no defined types in the canon).

We will show in section 5 how to relax these useful but not necessary restrictions.

4.2 G and L
We call G and L the sub-formalisms of T SCG and ALEOI corresponding to the above restrictions.

³ We only consider "fully" defined types (cf. the second useful restriction below).
Concept definitions in L may be stated as in ALEOI (cf. section 2), with no concept in extension, no universal restriction on roles, and a limited version of conjunction: C ⊓ ∃R.D. Here we give an equivalent formulation relying on an inductive definition of the set 𝒞 of concepts:

𝒞0 = Pc ∪ {⊤}
𝒞n = 𝒞n−1 ∪ { C0 ⊓ ∃R1.(C1) ⊓ … ⊓ ∃Rp.(Cp) ⊓ ∃Rp+1⁻¹.(Cp+1) ⊓ … ⊓ ∃Rq⁻¹.(Cq) | C0 ∈ 𝒞0, ∀i = 1…q, Ci ∈ 𝒞n−1, Ri ∈ Pr }

Concepts in L are then defined as follows: A ≡ C, C ∈ 𝒞.

Concept type definitions in G are inductively defined by considering the following inductive definition of the set 𝒞𝒢 of conceptual graphs:

𝒞𝒢0 = { [C], C ∈ Tc }
𝒞𝒢n = 𝒞𝒢n−1 ∪ { trees with a root [C] having, for i = 1…p, an outgoing edge labelled (Ri) to the root [Ci] of a tree Gi, and, for i = p+1…q, an incoming edge labelled (Ri) from the root [Ci] of a tree Gi, with C ∈ Tc, ∀i = 1…q, Ri ∈ Tr, Ci ∈ Tc, Gi ∈ 𝒞𝒢n−1 }

Concept types in G are then defined as follows: t ⇔ D(x), D(x) ∈ 𝒞𝒢, its head being the concept vertex [C: *x]. The canon of G is made of two unordered sets Tc and Tr of types we call primitive types, since they correspond to primitive concepts and roles in L.

4.3 Equivalence Between G and L
Theorem 1. G and L are two notational variants of the same formalism, and the subsumption definitions in G and L are equivalent.

Proof. Let us first show that any concept type definition of G can be translated into L and that any concept definition of L can be translated into G. To do so, consider the two functions f : G → L and g : L → G (with Pc ∪ {⊤} = Tc and Pr = Tr). f is defined as follows:

if G ∈ 𝒞𝒢0, f(G) = f([C: *x]) = C;
if G ∈ 𝒞𝒢n, with root [C: *x], outgoing edges (Ri) to subtrees Gi for i = 1…p, and incoming edges (Ri) from subtrees Gi for i = p+1…q, then f(G) = C ⊓ ∃R1.(C1) ⊓ … ⊓ ∃Rp.(Cp) ⊓ ∃Rp+1⁻¹.(Cp+1) ⊓ … ⊓ ∃Rq⁻¹.(Cq), with Ci = f(Gi), ∀i = 1…q.
A concept vertex [C] of G is translated into a concept C of L, a relation Ri (i = 1…p) into a role Ri, and a relation Rj (j = p+1…q) into a role Rj⁻¹. Symmetrically, g is defined as follows:

if D ∈ 𝒞0, g(D) = [D];
if D ∈ 𝒞n, g(D) = g(C0 ⊓ ∃R1.(D1) ⊓ … ⊓ ∃Rp.(Dp) ⊓ ∃Rp+1⁻¹.(Dp+1) ⊓ … ⊓ ∃Rq⁻¹.(Dq)) is the tree with root [C0], outgoing edges (Ri) to the subtrees Gi for i = 1…p and incoming edges (Ri) from the subtrees Gi for i = p+1…q, with Gi = g(Di) and Di = Ci ⊓ …, ∀i = 1…q.
It is obvious that g = f⁻¹ and that f is bijective, meaning that any concept type definition of G can be translated into a concept definition of L and vice-versa. Let us now consider the semantics of concept type definitions in G. Let t(x) ⇔ G(x), G ∈ 𝒞𝒢n. Its logical interpretation is the following: ∀x (t(x) ↔ F(x)), F(x) being an existentially closed conjunction of unary and binary predicates interpreting the concepts and relations of G respectively. The extensional semantics of such a concept type definition G in G is given in [8]. It is exactly the same as that of the equivalent concept definition f(G) in L (cf. section 2). In other words, G and L have the same semantics; G and L are thus two notational variants of the same logical formalism. Moreover, since the definitions of subsumption between two concept types in G and of subsumption between two concepts in L both rely on the inclusion of the extensions of concept types and concepts respectively, they are equivalent.

Remark: our proof of the equivalence is based on the semantics. It could also have been obtained by examining the connection between the equational system of ALEOI (properties of its connectives), which characterizes equivalence classes [10,9], and the notion of irredundant graphs introduced in [7,8].
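The translation f of the proof can be sketched directly on a tree encoding of G. The encoding below is our own assumption, not the paper's: a tree is (concept type, list of (relation, direction, subtree)), with "out" for edges leaving the root and "in" for edges arriving at it, and the L term is produced as a plain string.

```python
def f(graph):
    """Translate a tree-shaped definition graph of G into an L concept term."""
    ctype, edges = graph
    term = ctype
    for rel, direction, sub in edges:
        role = rel if direction == "out" else rel + "⁻¹"  # incoming edge: inverse role
        term += " ⊓ ∃" + role + ".(" + f(sub) + ")"
    return term

# The RNephew definition graph of figure 3:
rnephew = ("Boy", [("child", "in",
                    ("Female", [("sister", "out", ("Researcher", []))]))])
print(f(rnephew))   # Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.(Researcher))
```

The inverse g would parse such a term back into a tree, choosing the edge direction from the presence of ⁻¹; together the two functions realize the bijection used in the proof.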
5 Extensions
The equivalence between G and L is a first (but necessary) step towards broader correspondences between CGs and DLs. Research in DLs largely consists in studying, formally and in practice, different Description Logics obtained by varying the description language. Our aim is to apply this principle to CGs by jointly extending G and L while preserving the equivalence, then taking advantage of these correspondences to transfer results from DLs to CGs and vice-versa. In this section, we succinctly present some of these extensions. We do not give the equivalence proofs, but only the intuitive elements which support these conjectures. Some of these extensions are still subsets of T SCG (section 5.1), while others propose an enrichment of CGs (section 5.2).
5.1 Subsets of T SCG
Individuals  Whereas both T SCG and ALEOI allow the description of individuals, for the sake of simplicity we have defined G with no individual referents in conceptual graphs, and L with no concept definitions in extension ({a1 … an}). Extending G with individual referents in conceptual graphs while preserving the equivalence with L raises no major problem. To do so, the terminological language of L must be extended with P(ai) and ⊤(ai), which are respectively equivalent to P ⊓ {ai} and ⊤ ⊓ {ai}, {ai} being a concept in extension (cf. section 2) restricted to a single element. As a result, a limited form of cycles can be tolerated in CGs: cycles "closed" at an individual concept vertex can be translated into DLs. Figure 4 presents an example of such a conceptual graph of G and its translation into L.
(The graph of figure 4, in linear notation: [Male: *x] →(friend)→ [Female: Claudia] →(own)→ [Cat], and [Male: *x] →(chief)→ [BusDriver] →(sister)→ [Female: Claudia]; the cycle is closed at the individual vertex [Female: Claudia].)

Male ⊓ ∃friend.(Female(Claudia) ⊓ ∃own.Cat) ⊓ ∃chief.(BusDriver ⊓ ∃sister.Female(Claudia))
Fig. 4. A definition graph including individuals and its translation in DLs
Relation Type Definition  T SCG is provided with a relation type definition mechanism (cf. section 3). Some DLs are also provided with role connectives allowing the description of complex roles:

R, Q → R ⊓ Q           (role conjunction)
     | domain R C      (domain restriction)
     | range R C       (co-domain restriction)
For example, childR ≡ (domain child (Male ⊓ ∃own.Cat)) ⊓ (range child (Female ⊓ ∃member.FootballTeam)) defines a new role childR which describes a particular child relation between father and daughter, such that the father owns a cat and the daughter is a member of a football team. The translation of this example into CGs is given in figure 5. The equivalence between L and G could thus be extended to role and relation type definitions⁴.
⁴ Even if intuition suggests such an extension, an in-depth study still has to be done to make explicit a precise equivalence between relation type definitions in CGs and role definitions in DLs; this is outside the scope of this paper.
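The effect of the domain/range connectives can be illustrated on one finite interpretation. The sketch below is our own (all names are invented); it simply filters the pairs of a role by a domain extension and a range extension, which is why a defined role such as childR denotes fewer pairs than child.

```python
def restrict(role_pairs, domain_ext=None, range_ext=None):
    """||(domain R C) ⊓ (range R D)||: keep the pairs of R whose first
    element lies in ||C|| and second element in ||D|| (None = unrestricted)."""
    return {(a, b) for (a, b) in role_pairs
            if (domain_ext is None or a in domain_ext)
            and (range_ext is None or b in range_ext)}

child = {("tom", "lea"), ("tom", "max"), ("sam", "zoe")}
cat_owner = {"tom"}                # stands for ||Male ⊓ ∃own.Cat||
team_member = {"lea", "zoe"}       # stands for ||Female ⊓ ∃member.FootballTeam||
print(restrict(child, cat_owner, team_member))   # {('tom', 'lea')}
```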
childR(x, y) ⇔ [Male: *x] →(own)→ [Cat], [Male: *x] →(child)→ [Female: *y] →(member)→ [FootballTeam]
(definition graph rendered in linear notation)
Fig. 5. Definition of relation type childR
Canonical Base  Translating into DLs the star graphs of the canonical base of T SCG also requires the addition of role connectives to L. From a theoretical point of view, a star graph associated with a relation type R of Tr defines constraints on the concept vertices adjacent to a relation vertex of type R in a conceptual graph (cf. section 3). In DLs, this is viewed as the definition of a new, more specific, relation type. Let [Female: *x] →(married)→ [Male: *y] be the star graph associated with the married relation type. Its translation into DLs would be the definition of a new role marriedFM ≡ (domain married Female) ⊓ (range married Male), the role marriedFM being thus equivalent to the married relation type with its star graph. More generally, the constraints described by a star graph in T SCG are translated into DLs by using the range and domain restrictions in role definitions.

Cycles in CGs  Translating into DLs conceptual graphs not limited to trees also requires the addition of role connectives to L. As an example, let R and R′ be two relation vertices from a concept vertex [C: *x] to a concept vertex [D]. The translation into ALEOI provided with role connectives would be the following: C ⊓ ∃(R ⊓ R′).D.

N-ary Relation Types  In standard DLs roles are binary, while relation types in CGs are n-ary. However, it is known that any n-ary relation can be described with a set of binary relations [4,20]. From a theoretical point of view, adding n-ary relations to L and G can therefore be considered without any change.

5.2 Extending Type Definitions in T SCG
Concept Conjunction  The conjunction of any two concepts is standard in DLs, while it is impossible in CGs. As an example, let NonSmoker and BusDriver be two primitive concepts; NonSmoker ⊓ BusDriver describes all the individuals who are non-smokers and bus drivers. As shown in figure 6a, the problem that arises when translating such a conjunction into CGs relates to the non-connectedness of definition graphs. A "non-connected" graph being a set of connected graphs, from a logical point of view (as shown in [8]) it is interpreted as the conjunction of the formulae interpreting its connected components. Such a conjunction is exactly the logical expression underlying the concept conjunction we want to translate in figure 6a. However, authorizing this kind of definition graph in CGs would induce technical graph-writing problems when expanding defined concept types: as shown in figure 6c, to which concept should the friend relation be connected when expanding concept C in the graph of figure 6b?

(Figure 6, in linear notation: (a) the non-connected definition graph C(x) ⇔ [NonSmoker: *x], [BusDriver: *x]; (b) a graph [Researcher] →(friend)→ [C]; (c) its ambiguous expansion, where the friend relation could attach to either component; (d) the connected definition C(x) ⇔ [NonSmoker: *x] →(id)→ [BusDriver].)
Fig. 6. Concept type conjunction
For this representational problem in CGs, we propose a solution drawn from DLs. Some DLs are provided with a specific role id (for identity) describing all the pairs (x, x) [14,19]. Let id be an equivalent relation type in T SCG: it allows concept types to be defined as a conjunction of concept types while preserving the connectedness of the definition graph. For instance, concept type C in figure 6d describes all the individuals which are NonSmoker and BusDriver. The logical interpretation of this definition graph is the following: ∀x (C(x) ⇔ ∃y (NonSmoker(x) ∧ BusDriver(y) ∧ id(x, y))). Since x is equal to y (thanks to the id relation type), x is a BusDriver and a NonSmoker, which is exactly the semantics of NonSmoker ⊓ BusDriver. In other words, concept type definitions in G may be extended while preserving the equivalence with L (extended with the id role)⁵.

Uniformizing Type Definition  In DLs, one can always consider that a T-KB contains only fully defined concepts by adding new primitive concepts (cf. section 2). This may be transposed to CGs by using the id relation. More precisely, a partial concept type definition may be converted into a full definition by adding a relation id from the head of the definition graph to a concept whose type is a new primitive concept type (cf. figures 7b and 7c). This improvement would give a homogeneous semantics for all types in the canon. In CGs there is a distinction between atomic types which are sub-types of other type(s) and those which are fully defined with a conceptual graph. Regarding correspondences with DLs, one may envision an extension of CGs with partial definitions. Provided with partial definitions and the id relation, it would be possible to associate an equivalent full definition with any atomic or partially defined concept type, by building a conceptual graph using the id relation for the conjunction of super-types and a new type, as in DLs (cf. section 2).
⁵ Of course, this extension would imply adapting the graph operations (for example, by taking into account the commutativity of id), which is outside the scope of this paper.

For example, in figure 7a, C1 and C2 are partially defined by the conjunction
of P1 and P2 for C1, and the conjunction of C1 and P3 for C2 (P1, P2 and P3 are primitive types). Figure 7b presents the partial definitions of C1 and C2, and figure 7c their full definitions (P4 and P5 are two new primitive types). The expanded form of the definition of C2 is presented in figure 7d.

(Figure 7, in linear notation: (a) the hierarchy placing C1 below P1 and P2, and C2 below C1 and P3; (b) the partial definitions C1(x) ⇒ [P1: *x] →(id)→ [P2] and C2(x) ⇒ [C1: *x] →(id)→ [P3]; (c) the full definitions C1(x) ⇔ [P1: *x] with id relations to [P2] and [P4], and C2(x) ⇔ [C1: *x] with id relations to [P3] and [P5]; (d) the expanded form of C2, in which [P1: *x] is linked by id relations to [P2], [P4], [P3] and [P5].)
Fig. 7. Definition graphs for non-primitive concept types
As a result, from a theoretical point of view, one would now consider a terminological knowledge base for CGs as a set of (full) concept type definitions, the canon containing only primitive concept types, as is the case in DLs. This result should be adapted to relation type definitions. Such an extension would allow the formal semantics for types, definition graphs and subsumption to be unified. Thus, from a theoretical point of view, the whole of terminological reasoning in CGs could be circumscribed to the graph specialization/generalization operations alone⁶.
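The uniformization described above is mechanical enough to sketch. The code below is our own illustration (all names invented): each partially defined type A with super-types S1…Sn receives a fresh primitive type A′ and the full definition A ≡ S1 ⊓ … ⊓ Sn ⊓ A′; on the CG side, each conjunct would hang off the head via an id relation.

```python
def uniformize(partials):
    """partials: type name -> list of super-types (a partial definition).
    Returns full definitions: type name -> list of conjuncts, where the
    last conjunct is the fresh primitive type standing for the remainder."""
    full = {}
    for name, supers in partials.items():
        fresh = name + "'"            # new primitive type added to the canon
        full[name] = supers + [fresh]
    return full

# Figure 7a: C1 below P1 and P2, C2 below C1 and P3.
partials = {"C1": ["P1", "P2"], "C2": ["C1", "P3"]}
print(uniformize(partials))
# {'C1': ['P1', 'P2', "C1'"], 'C2': ['C1', 'P3', "C2'"]}
```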
6 Conclusion and Perspectives
In this paper, our aim was to provide formal correspondences between Conceptual Graphs and Description Logics. We have proved that a subset G of T SCG and a subset L of the ALEOI Description Logic are two notational variants of the same formalism at the terminological level, and that their subsumption definitions are equivalent. This theoretical work is a necessary prerequisite for taking mutual advantage of research in CGs and DLs. As a first step in this transposition process, we have proposed an extension of T SCG with a special relation type id. This extension makes it possible to introduce partial type definitions without any change from a theoretical point of view, as in DLs, and to associate a definition graph with any "non-primitive" concept type (atomic or not), again as in DLs. As a result, it syntactically and semantically unifies subsumption between concept types, since subsumption can then be defined by a specialization between two expanded graphs.
⁶ Of course, we do not claim that, from a practical point of view, the canon and partial definitions are not useful, only that they can be set aside for the formal study of terminological reasoning.
In this paper, we have focused on correspondences between the terminological levels of DLs and CGs. A first perspective of this work consists in studying formal correspondences between their assertional levels. The most important perspective concerns adapting results from CGs to DLs. In particular, it would be fruitful to use graph operations to solve a crucial problem in DLs: how to explain subsumption? Indeed, in DLs, a terminological knowledge base is split into formulae, and the normalization process simplifies and transforms the initial descriptions before computing subsumption. The problem is that users are often confused, since they do not understand the connection between their initial descriptions and the results given by the system, or how the system has obtained these results. It has been observed many times in practice that results are often unexpected by users [14]: subsumption computation is a "black box". Thanks to a correspondence between CGs and DLs, graph operations, which are intuitive, easily comprehensible and readable, may be used in DLs to explain the subsumption computation step by step.
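The graph operation alluded to here is projection: a mapping from the more general graph into the more specific one that preserves relations and may specialize concept types, and whose elementary steps can be shown to a user one by one. The sketch below is our own, restricted to the tree-shaped graphs of G and to an encoding we invent for the occasion: trees are (type, list of (relation, direction, subtree)), and a dict gives each type its direct super-types.

```python
def leq(t1, t2, supers):
    """t1 ≤c t2: reflexive-transitive closure of the direct super-type links."""
    if t1 == t2 or t2 == "Top":
        return True
    return any(leq(s, t2, supers) for s in supers.get(t1, []))

def projects(h, g, supers):
    """Is there a projection from tree h (more general) into tree g (more
    specific), mapping root to root? Every edge of h must be matched by an
    edge of g with the same relation label and direction, recursively."""
    (ht, hedges), (gt, gedges) = h, g
    if not leq(gt, ht, supers):        # image type must specialize the source type
        return False
    return all(any(hr == gr and hd == gd and projects(hs, gs, supers)
                   for (gr, gd, gs) in gedges)
               for (hr, hd, hs) in hedges)

supers = {"Boy": ["Male"]}
# h: [Male] ←(child)← [Female]
# g: [Boy] ←(child)← [Female] →(sister)→ [Researcher]   (figure 3)
h = ("Male", [("child", "in", ("Female", []))])
g = ("Boy", [("child", "in",
              ("Female", [("sister", "out", ("Researcher", []))]))])
print(projects(h, g, supers))   # True: g specializes h
print(projects(g, h, supers))   # False
```

Each successful recursive call is one elementary justification ("this vertex maps there because Boy ≤c Male"), which is precisely the kind of step-by-step trace that a normalize-and-compare DL subsumption algorithm does not expose.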
References

1. B. Biébow and G. Chaty. A comparison between conceptual graphs and KL-ONE. In ICCS'93, First International Conference on Conceptual Structures, LNAI 699, pages 75–89. Springer-Verlag, Berlin, Québec, Canada, 1993.
2. R.J. Brachman, D.L. McGuinness, P.F. Patel-Schneider, L.A. Resnick, and A. Borgida. Living with CLASSIC: When and how to use a KL-ONE-like language. In J. Sowa, editor, Principles of Semantic Networks, pages 401–456. Morgan Kaufmann, San Mateo, Cal., 1991.
3. R.J. Brachman and J.G. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9(2):171–216, 1985.
4. B. Carbonneill and O. Haemmerlé. Standardizing and interfacing relational databases using conceptual graphs. In ICCS'94, Second International Conference on Conceptual Structures, LNAI 835, pages 311–330. Springer-Verlag, Berlin, Maryland, USA, 1994.
5. M. Chein. The CORALI project: from conceptual graphs to conceptual graphs via labelled graphs. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 65–79. Springer-Verlag, Berlin, Seattle, USA, 1997.
6. M. Chein and M. Leclère. A cooperative program for the construction of a concept type lattice. In supplement proceedings of ICCS'94, Second International Conference on Conceptual Structures, pages 16–30. Maryland, USA, 1994.
7. M. Chein and M.L. Mugnier. Conceptual graphs: Fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992.
8. M. Chein and M.L. Mugnier. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
9. P. Coupey and C. Fouqueré. Extending conceptual definitions with default knowledge. Computational Intelligence Journal, 13(2):258–299, 1997.
10. R. Dionne, E. Mays, and F.J. Oles. The equivalence of model-theoretic and structural subsumption in description logics. In 13th International Joint Conference on Artificial Intelligence, pages 710–716, Chambéry, France, 1993.
11. C. Faron and J.G. Ganascia. Representation of defaults and exceptions in conceptual graphs formalism. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 153–167. Springer-Verlag, Berlin, Seattle, USA, 1997.
12. M. Leclère. Les connaissances du niveau terminologique du modèle des graphes conceptuels : construction et exploitation. PhD thesis, Université de Montpellier 2, France, 1995.
13. M. Leclère. Reasoning with type definitions. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 401–415. Springer-Verlag, Berlin, Seattle, USA, 1997.
14. D.L. McGuinness and A.T. Borgida. Explaining subsumption in description logics. In 14th International Joint Conference on Artificial Intelligence, pages 816–821, Montreal, Canada, 1995.
15. M.L. Mugnier. On generalization/specialization for conceptual graphs. Journal of Experimental & Theoretical Artificial Intelligence, 7(3):325–344, 1995.
16. M.L. Mugnier and M. Chein. Characterization and algorithmic recognition of canonical conceptual graphs. In ICCS'93, First International Conference on Conceptual Structures, LNAI 699, pages 294–311. Springer-Verlag, Berlin, Québec, Canada, 1993.
17. B. Nebel. Reasoning and Revision in Hybrid Representation Systems. Number 422 in Lecture Notes in Computer Science. Springer-Verlag, 1990.
18. A. Schaerf. Query Answering in Concept-Based Knowledge Representation Systems: Algorithms, Complexity and Semantics Issues. PhD thesis, Università di Roma "La Sapienza", Roma, Italy, 1994.
19. K. Schild. A correspondence theory for terminological logics: Preliminary report. In 12th International Joint Conference on Artificial Intelligence, pages 466–471, Sydney, Australia, 1991.
20. J.G. Schmolze. Terminological knowledge representation systems supporting n-ary terms. In Principles of Knowledge Representation and Reasoning: 1st International Conference, pages 432–443. Toronto, Ont., 1989.
21. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts, 1984.
Piece Resolution: Towards Larger Perspectives

Stéphane Coulondre and Eric Salvat

L.I.R.M.M. (U.M.R. 9928 Université Montpellier II / C.N.R.S.)
161 rue Ada, 34392 Montpellier cedex 5, France
e-mail: {coulondre,salvat}@lirmm.fr
Abstract. This paper focuses on two aspects of piece resolution in backward chaining for conceptual graph rules [13]. First, as conceptual graphs admit a first-order logic interpretation, inferences can be proven by classical theorem provers. Nevertheless, these provers do not use the notion of piece, which is a graph notion. So we define piece resolution over a class of first-order logical formulae: the logical rules. We have implemented this procedure, and we compare the results with those of classical SLD-resolution (i.e. Prolog). We point out several interesting results: it appears that the number of backtracks is strongly reduced. Second, we point out the similarities between these rules and database data dependencies. The implication problem for dependencies is to decide whether a given dependency is logically implied by a given set of dependencies. A proof procedure for the implication problem, called the "chase", has already been studied. The chase is a bottom-up procedure: from hypotheses to conclusion. This paper introduces a new proof procedure which is top-down: from conclusion to hypotheses. Indeed, we show that the implication problem for dependencies can be reduced to the existence of a piece resolution.
1 Introduction
This paper focuses on two aspects of piece resolution in backward chaining for conceptual graph rules. This procedure was originally presented in [13] and is summarized in section 2. Since conceptual graphs admit a first-order logic interpretation, inferences can be proven by classical theorem provers. Nevertheless, we point out that the clausal form of logic prevents us from using the notion of piece. So we define in section 3 piece resolution over a class of first-order logical formulae: the logical rules. We implemented this procedure, and we compare in section 4 the results with those of classical SLD-resolution (i.e. Prolog), on the same benchmarks. We show several interesting results: it appears that the number of backtracks is strongly reduced. Second, we point out in section 5 the similarities between logical rules and data dependencies. Data dependencies are well known in the context of relational databases. They specify constraints that the data must satisfy to model correctly the part of the world under consideration. The implication problem for dependencies is to decide whether a given dependency is logically implied by a given set of dependencies. A proof procedure for the implication problem, called the “chase”, was already studied in [2], in the case of tuple-generating and equality-generating dependencies. This class of dependencies generalizes most cases of interest, including the well-known functional and multivalued dependencies. The chase is a bottom-up procedure: from hypotheses to conclusion. This paper introduces a new proof procedure which is top-down: from conclusion to hypothesis. To do that, we use the logical rules and show that the implication problem for dependencies can be reduced to the existence of a piece resolution. We end this paper with several concluding remarks pointing out some possible generalizations of our results and further work.

M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 179–193, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Piece Resolution within Conceptual Graphs
This section summarizes backward chaining of conceptual graph rules by way of graph operations. The framework we consider here is composed of simple conceptual graphs (non-nested), without negation. They are not necessarily connected.

2.1 The Rules
Rules are of the type “If G1 then G2”, where G1 and G2 are conceptual graphs with possible coreference links between some concept vertices of G1 and G2. A coreference link indicates that two vertices denote the same individual. In other words, Φ associates to these vertices the same variable or constant. A rule is denoted R : G1 ⇒ G2, where G1 and G2 are called the hypothesis and the conclusion of R. A useful notation is that of the lambda-abstraction. A lambda-abstraction λx1...xn G, with n ≥ 0, is composed of a graph G and n special generic vertices of G.

Definition 1 (Conceptual graph rule [13]). A conceptual graph rule R : G1 ⇒ G2 is a couple of lambda-abstractions (λx1...xn G1, λx1...xn G2). x1, ..., xn are called connection points. In the following, we will denote by x1i (resp. x2i) the vertex xi of G1 (resp. G2). For each i ∈ [1..n], x1i and x2i are coreferent.
Fig. 1. A CG rule R : G1 ⇒ G2. [Figure: the hypothesis G1 links Manager:*x –agt– Manage –obj– Office:*y and Employee:*z –loc– Office:*y; the conclusion G2 links Desk –loc– Office:*y, Employee:*z –poss– Desk, and Manager:*x –agt– Employ –obj– Employee:*z.]
The rule in figure 1 informally means the following: “If an employee z is in an office y managed by a manager x, then z has a desk which is inside the office y, and x employs z”.

2.2 Logical Interpretation of a Rule
Φ associates to every lambda-abstraction λx1...xn G a first-order formula in which all the variables are existentially quantified, except the variables x1...xn, which are free. Let R : G1 ⇒ G2 be a CG rule; if → is the logical implication connector, then Ψ(R) = (Φ(λx1...xn G1)) → (Φ(λx1...xn G2)). To every couple of vertices x1i and x2i is associated the same variable. Φ(R) is the universal closure of Ψ(R): Φ(R) = ∀x1...∀xn Ψ(R). For example, consider the rule R of figure 1:
Φ(R) = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))).
A knowledge base is composed of a support S, a set of CGs (facts), and a set of rules.

2.3 Piece Resolution Procedure
Consider a goal Q to be proven. The piece resolution procedure allows us to determine, given a knowledge base KB, whether Φ(KB) |= Φ(Q). The basic operation of piece resolution is piece unification which, given a goal Q and a rule, builds, if possible, a new goal Q′ to be proven such that if Q′ is proven on KB, then Q is proven on KB.

Definition 2 (Piece and cut points [13]). Let R : G1 ⇒ G2 be a rule. A cut point of G2 is either a connection point (i.e. a generic concept shared with G1) or a concept with an individual marker (which may be common with G1 or not). A cut point of G1 is either a connection point of G1 or a concept with an individual marker which is common with G2. The pieces of G2 are obtained as follows: remove from G2 all cut points; one obtains a set of connected components; some of them are not CGs since some edges have lost an extremity. Complete each incomplete edge with a concept with the same label as the former cut point. Each connected component is a piece. Equivalently, two vertices of G2 belong to the same piece if and only if there exists a path from one vertex to the other that does not go through a cut point.

In figure 1, G2 has two pieces. The first one includes all vertices from y to z and the second one includes all vertices from z to x. Indeed, x, y and z are cut points, and G2 is split at the vertex z. Instead of splitting the goal into subgoals to be treated separately, as in [7] [8] and also in Prolog [5] for first-order logic, piece resolution processes the graph as a whole as far as possible. For example, the request Q of figure 2 can unify with the rule of figure 1. Indeed, we can unify the subgraph of Q containing the vertices from Manager
Fig. 2. A request Q. [Figure: Manager:Tom –agt– Employ –obj– Employee:*, and Employee:* –poss– Car:*.]

Fig. 3. A new request Q′ built by unification of the request and the rule. [Figure: Manager:Tom –agt– Manage –obj– Office:*, Employee:* –loc– Office:*, and Employee:* –poss– Car:*.]
to Employee with the piece of G2 from the concept vertex with marker x to the concept vertex with marker z. We obtain a new request Q′. A piece resolution is a sequence of piece unifications. It ends successfully if the last produced goal is the empty graph. It requires a tree exploration strategy (breadth-first or depth-first search, as in Prolog for example). For more details on piece resolution, we refer the reader to [13] [12]. The piece resolution procedure is clearly a top-down (or backward chaining) procedure. Indeed, rules are applied to a conclusion (the goal is actually a rule with an empty hypothesis), thus generating new goals to be proven. This is done until we obtain the empty graph. Thus the procedure uses the goal to guide the process.

Theorem 1 (Soundness and completeness of piece resolution [13]). Let KB be a knowledge base and Q be a CG request. Then:
– If a piece resolution of Q on KB ends successfully, then Φ(KB) |= Φ(Q).
– If Φ(KB) |= Φ(Q), then there is a piece resolution of Q on KB that ends successfully.
3 Piece Resolution within First-Order Logic
Resolution in backward chaining for conceptual graph rules has been defined in section 2. This mechanism is sound and complete with respect to deduction in first-order logic. The underlying idea is to determine subgraphs, as large as possible, that can be processed as a whole. As already mentioned, conceptual graph rules can be expressed as first-order sentences. Therefore it is legitimate to express graph resolution using first-order formulae. This leads us to a new kind of resolution called piece resolution. A unification between a goal graph and a graph rule produces a new request which is also a conceptual graph. This new goal can be expressed as a first-order logic sentence, and can also contain existential quantifiers. Now, classical proof procedures for first-order logic generally use clausal form [10] or specific forms, obtained in every case by taking out
existential quantifiers. This is called Skolemisation [3]. This process modifies the knowledge base, so further inferences do not use the original one. On the contrary, graph resolution uses the original base and allows existential quantifiers to take part in the proof mechanism (in the associated logical interpretation of rules) by using the graph structure. The inference is thus more “natural”. Moreover, we will see that graph resolution provides ways to improve efficiency, theoretically as well as practically. We now clarify the idea of piece in first-order logic.

3.1 Pieces
Definition 3 (Logical rule). A logical rule is a formula of the form F = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1, universally closed. There are no function symbols. We will omit the universal quantifiers for the sake of readability. The hypothesis of F, ⋀{Bi | i ∈ [1..n]}, is denoted by hyp(F), and the conclusion of F, ∃x1...∃xs ⋀{Ai | i ∈ [1..p]}, is denoted by conc(F). Each Ai, i ∈ [1..p], and each Bi, i ∈ [1..n], is an atom, and the xi, i ∈ [1..s], are variables appearing only in the Ai, i ∈ [1..p]. All other variables of the Ai, i ∈ [1..p], also appear in the hypothesis and are universally quantified. If n = 0, F is also called a logical fact. When we want to know whether a logical fact is logically implied by a set of logical rules, we speak of a logical goal.

Example 1 (Logical rule). Consider the following logical rule: R = ∃u(t4(u) ∧ r4(x, u)) ← t1(x) ∧ t1(y) ∧ r1(x, y). We have hyp(R) = t1(x) ∧ t1(y) ∧ r1(x, y) and conc(R) = ∃u(t4(u) ∧ r4(x, u)).

Remark 1. If s = 0, then F is equivalent to a set of Horn clauses {Ai ← B1 ∧ ... ∧ Bn, i ∈ [1..p]}.

Definition 4 (Piece). Let C = A1 ∧ ... ∧ Ap be a conjunction of atoms and V = {x1, ..., xs} be a set of variables. The pieces of C in relation to V are defined in the following way: for all atoms A and A′ of {A1, ..., Ap}, A and A′ belong to the same piece if and only if there is a sequence of atoms (P1, ..., Pm) of {A1, ..., Ap} such that P1 = A, Pm = A′ and, for all i = 1, ..., m − 1, Pi and Pi+1 share at least one variable of V. By construction, the set of pieces is a partition of C.

Definition 5 (Logical rule pieces). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule. The pieces of R are the pieces of conc(R) = A1 ∧ ... ∧ Ap in relation to {x1, ..., xs}.

Example 2.
Let R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))) be a logical rule.
R has five pieces, which are:
– {Desk(u), loc(u, y), poss(z, u)}, which contains the atoms that share the existentially quantified variable u,
– {Employ(v), obj(v, z), agt(x, v)}, which contains the atoms that share the existentially quantified variable v,
– {Employee(z)}, {Manager(x)} and {Office(y)}, which contain atoms in which no existentially quantified variable appears, thus giving one atom per piece.

Definition 6 (Piece splitting). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule, and t be the number of pieces of R. R can be split, giving several logical rules R1, ..., Rt. Each Ri, i ∈ [1..t], has the same hypothesis as R, and piece number i as conclusion: Ri = ∃xi1...∃xisi(Ai1 ∧ ... ∧ Aipi) ← B1 ∧ ... ∧ Bn, i ∈ [1..t], where pi is the number of atoms in piece number i, si is the number of existentially quantified variables concerning the piece, and Aij, j ∈ [1..pi], are the atoms of piece number i. Each variable not among the xij is universally quantified. The construction of R1 ∧ ... ∧ Rt is called the piece splitting of R. We denote in the same way the splitting operation and its result R1 ∧ ... ∧ Rt.

Remark 2. ⋃i∈[1..t](⋃j∈[1..pi] Aij) = ⋃k∈[1..p] Ak. Indeed, the union of the atoms in the conclusions of the new rules is exactly the set of atoms in the conclusion of the initial rule. An atom belongs to only one piece, thus it can appear in only one of the newly generated rules. Hence Σi∈[1..t] pi = p. No atom is created and none is deleted. The set of pieces is a partition of the set of atoms in the conclusion of the initial rule.

Proposition 1. R is logically equivalent to R1 ∧ ... ∧ Rt.

Proof (Sketch). In R, we can group the existential quantifiers together in front of the concerned sets of atoms. Each group of existential quantifiers together with the corresponding set of atoms is by definition a piece. We rewrite the obtained formula by eliminating implication.
Then, by distributivity of ∨ over ∧, we obtain as many formulae as pieces, connected by conjunction, each of them having the same hypothesis, which is the hypothesis of the initial logical rule. The variables common to these formulae are all universally quantified. As ∀x(F ∧ G) and (∀xF ∧ ∀xG) are equivalent, we can decompose R into R1 ∧ ... ∧ Rt. □

Definition 7 (Trivial logical rules). Let R be a logical rule. R is trivial if and only if every interpretation of R satisfies R (R is valid).

Example 3 (Trivial logical rules). The following logical rules are trivial:
t(x) ← t(x) ∧ r(x, y)
∃u(t(u)) ← t(y)
∃v(r(x, y, v)) ← t(z) ∧ r(x, y, z)

Piece splitting can generate trivial logical rules, which are therefore useless. This is the case, among others, for rules in which each atom of the conclusion also appears in the hypothesis. Let R1, ..., Ri, ..., Rt be the result of the piece splitting of R. We showed that R and R1 ∧ ... ∧ Rt are logically equivalent. Let Ri be a trivial logical rule. Then R and R1, ..., Ri−1, Ri+1, ..., Rt are logically equivalent. Hence Ri can be deleted.
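The piece computation of Definitions 4–6 amounts to a connected-components search over atoms, linking two atoms whenever they share an existentially quantified variable. Below is a minimal Python sketch (the tuple-based atom encoding and all names are ours, not from the paper), replayed on the rule of Example 2:

```python
def pieces(atoms, exist_vars):
    """Definition 4: partition a conjunction of atoms; two atoms belong to
    the same piece iff a chain of atoms links them, each consecutive pair
    sharing at least one variable of exist_vars."""
    parent = list(range(len(atoms)))  # union-find over atom indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if set(atoms[i][1:]) & set(atoms[j][1:]) & exist_vars:
                parent[find(i)] = find(j)  # merge the two pieces

    groups = {}
    for i, atom in enumerate(atoms):
        groups.setdefault(find(i), set()).add(atom)
    return [frozenset(g) for g in groups.values()]

# Conclusion of the rule R of Example 2; an atom is (predicate, arg1, ...).
conc = [("Office", "y"), ("Desk", "u"), ("Employee", "z"), ("Employ", "v"),
        ("Manager", "x"), ("loc", "u", "y"), ("poss", "z", "u"),
        ("obj", "v", "z"), ("agt", "x", "v")]
hyp = [("Manager", "x"), ("Manage", "w"), ("Office", "y"), ("Employee", "z"),
       ("agt", "x", "w"), ("obj", "w", "y"), ("loc", "z", "y")]

ps = pieces(conc, {"u", "v"})           # the five pieces of Example 2
split = [(hyp, piece) for piece in ps]  # Definition 6: one rule per piece
```

Keeping the hypothesis unchanged in each split rule follows Definition 6, and Proposition 1 guarantees that the conjunction of the split rules is equivalent to R.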
3.2 Basic Operations
Definition 8 (Piece unification). Let Q be a logical goal and R be a logical rule. There is a unification between Q and R, say σu, if and only if:
1. there are two substitutions σ1 and σ2, defined respectively on a subset of the variables of Q and on a subset of the universally quantified variables of R;
2. σu = σ1 ∪ σ2. The pieces of σu(Q) are defined as the pieces of conc(σu(Q)) in relation to the set of existentially quantified variables of σu(R). There must exist at least one piece of σu(Q) appearing entirely in the conclusion of σu(R).

Once a piece unification σu has been found, a new request Q′ is built from Q. Q′ becomes the new logical goal.

Definition 9 (Construction of a new logical goal). Let Q be a logical goal, R be a logical rule and σu be a piece unifier between Q and R. We obtain the new goal Q′ in the following way:
1. Delete from σu(Q) the pieces that appear in σu(R) and add the atoms of hyp(σu(R)).
2. Update the existential quantifiers of σu(Q); more specifically:
a) Delete the existential quantifiers in σu(Q) that corresponded to deleted existentially quantified variables of σu(R).
b) Add existential quantifiers for the variables appearing in the atoms of hyp(σu(R)) added to σu(Q) (corresponding to universally quantified variables of σu(R)).

These steps may need variable renaming, to avoid variable capture. Indeed, an atom of hyp(σu(R)) added to σu(Q) must not contain a variable already quantified in σu(Q), because it would be “captured” by the wrong quantifier. Therefore, we must rename the variables common to Q and R.

Definition 10 (Logical piece resolution). A logical piece resolution of a logical goal Q is a sequence of piece unifications. It ends successfully if the last produced request is empty. In this case, the last used rule has an empty hypothesis (i.e. it is a logical fact).

Example 4 (Piece unification).
Let R be the following logical rule:
R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))).
Let Q be the following request (note that Q and R must not have variables in common):
Q = ∃i∃j∃k(Manager(tom) ∧ Employ(i) ∧ Employee(j) ∧ Car(k) ∧ agt(tom, i) ∧ obj(i, j) ∧ poss(j, k)).
A unifier is σu = {(x, tom), (i, v), (j, z)}. After applying σu, we construct the new goal Q′:
Q′ = ∃w∃z∃y∃k(Manager(tom) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(tom, w) ∧ obj(w, y) ∧ loc(z, y) ∧ Car(k) ∧ poss(z, k)).
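The atom-level part of Definition 9 can be replayed mechanically on Example 4. The sketch below (our own encoding; quantifier bookkeeping and variable renaming are omitted) applies σu, deletes the pieces of σu(Q) contained in the conclusion of σu(R), and adds the hypothesis atoms:

```python
def substitute(atom, sigma):
    # Apply a substitution to an atom encoded as (predicate, arg1, ...).
    return (atom[0],) + tuple(sigma.get(t, t) for t in atom[1:])

def pieces(atoms, exist_vars):
    # Definition 4: connected components of atoms chained by shared
    # existentially quantified variables (naive union-find).
    parent = list(range(len(atoms)))
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if set(atoms[i][1:]) & set(atoms[j][1:]) & exist_vars:
                parent[find(i)] = find(j)
    groups = {}
    for i, atom in enumerate(atoms):
        groups.setdefault(find(i), set()).add(atom)
    return [frozenset(g) for g in groups.values()]

def new_goal(goal, rule_hyp, rule_conc, rule_exist, sigma):
    # Definition 9, step 1: delete the pieces of sigma(Q) that appear
    # entirely in conc(sigma(R)), then add the atoms of hyp(sigma(R)).
    g = [substitute(a, sigma) for a in goal]
    conc = {substitute(a, sigma) for a in rule_conc}
    kept = set()
    for piece in pieces(g, rule_exist):
        if not piece <= conc:  # this piece is not unified away
            kept |= piece
    return kept | {substitute(a, sigma) for a in rule_hyp}

# Example 4: the goal Q, the rule R, and the unifier sigma_u.
Q = [("Manager", "tom"), ("Employ", "i"), ("Employee", "j"), ("Car", "k"),
     ("agt", "tom", "i"), ("obj", "i", "j"), ("poss", "j", "k")]
hyp = [("Manager", "x"), ("Manage", "w"), ("Office", "y"), ("Employee", "z"),
       ("agt", "x", "w"), ("obj", "w", "y"), ("loc", "z", "y")]
conc = [("Office", "y"), ("Desk", "u"), ("Employee", "z"), ("Employ", "v"),
        ("Manager", "x"), ("loc", "u", "y"), ("poss", "z", "u"),
        ("obj", "v", "z"), ("agt", "x", "v")]
sigma_u = {"x": "tom", "i": "v", "j": "z"}

q_prime = new_goal(Q, hyp, conc, {"u", "v"}, sigma_u)
```

This reproduces exactly the atoms of Q′ in Example 4; step 2 of Definition 9 (updating the quantifiers) would additionally track which variables are existential.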
Remark 3. It is possible to simulate a piece resolution in backward chaining with conceptual graphs by a piece resolution with logical rules, and vice versa. Thus these problems are equivalent. The first reduction (graphs → logic) is trivial. For the second reduction (logic → graphs), we need to construct a support and to add concept vertices with the universal type for the terms that do not appear in a predicate of arity one. In the support, each concept type is covered by the universal type, and thus two concept types are incomparable. The set of relation types is partitioned into subsets of relation types of the same arity, where each relation type is covered by the greatest element, and thus two relation types are incomparable.

3.3 Soundness and Completeness of Logical Piece Resolution
Lemma 1. Let Q be a logical goal and R be a logical rule. If Q′ is a new goal built by a piece unification between Q and R, then Q′, R |= Q.

Theorem 2 (Soundness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If a logical piece resolution of Q on Γ ends successfully, then Γ |= Q.

Proof (Sketch). The way we build Q′ allows us to prove Q′, R |= Q. To prove the theorem, we then proceed by induction on the number of piece unifications. □

Theorem 3 (Completeness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If Γ |= Q, then there is a logical piece resolution of Q that ends successfully.

Proof (Sketch). We first prove by refutation the existence of a piece unification between the logical goal Q and the logical rule R, giving a logical goal Q′ (assuming that Q′, R |= Q, but not Q′ |= Q). Secondly, we prove by induction that the implication of a logical goal by a set of logical rules can be “decomposed” into a sequence of logical implications, each involving the previous goal and only one rule to give the next goal. Then we proceed by induction on the number of piece unifications. □

3.4 From Fact Goals to Rule Goals
So far, goals were only rules without hypothesis. But it is useful to consider rules as goals. Indeed, it would allow us to decide whether a rule is logically implied by a set of rules, for example to know whether the rule is worth being added to the set. We could also want to compute the minimal cover of a set (that is, the minimal set that allows the initial set to be generated), or the transitive closure of a set (see section 5).

Definition 11. Let S be a set of logical rules, and R be a logical rule. The operation ω, defined on R with respect to S and noted ωS(R), replaces every universally quantified variable of R by a new and unique constant (appearing neither in S nor in R).
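Operationally, ω freezes the universally quantified variables of R into fresh constants, much like Skolemizing a goal. A small Python sketch (the names and the constant scheme are ours), run on the rule of Example 1:

```python
import itertools

def omega(universal_vars, atoms, forbidden):
    """Definition 11: map each universally quantified variable to a new,
    unique constant not occurring in S or R (the 'forbidden' symbols)."""
    candidates = (f"c{i}" for i in itertools.count())
    sigma = {}
    for v in universal_vars:
        const = next(c for c in candidates if c not in forbidden)
        sigma[v] = const
        forbidden = forbidden | {const}
    return [(a[0],) + tuple(sigma.get(t, t) for t in a[1:]) for a in atoms]

# Rule of Example 1: ∃u(t4(u) ∧ r4(x, u)) ← t1(x) ∧ t1(y) ∧ r1(x, y);
# x and y are universal, u stays existential and is left untouched.
hyp = [("t1", "x"), ("t1", "y"), ("r1", "x", "y")]
conc = [("t4", "u"), ("r4", "x", "u")]
frozen_hyp = omega(["x", "y"], hyp, {"u"})
frozen_conc = omega(["x", "y"], conc, {"u"})
```

The frozen hypothesis can then be added to the rule base and the frozen conclusion taken as the goal, as used in Theorem 4 below.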
Theorem 4 ([12]). Let Γ be a set of logical rules, and R be a logical rule of the form R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1. Let R′ = ωΓ(R). Then Γ |= R if and only if Γ, hyp(R′) |= conc(R′).

Proof (Sketch). (⇒) Let I be a model of Γ and hyp(R′). We show that I is also a model of conc(R′). (⇐) We prove that Γ ∧ ¬R is inconsistent. To do that, we show that the set of ground instances of Γ ∧ hyp(R′) ∧ ¬conc(R′), which is inconsistent, is included in the set of ground instances of Γ ∧ ¬R. The compactness theorem allows us to conclude. □
4 Comparison and Statistical Analysis
In order to estimate how much piece resolution can reduce the resolution tree, we built a random generator of logical rules. We present in this section the results of this first series of tests, comparing the number of unitary backtracks, for the same rule base, in the case of piece resolution and of SLD-resolution [9] [11] (used in Prolog)¹. The rule base is translated into Horn clauses in order to fit with Prolog, whose effect is to multiply the base size. Therefore, the piece resolution algorithm deals with the original base, and Prolog deals with the translated base. To prevent the procedures from looping, there is no cycle in the dependency graph of the logical rules. We made 3011 tests, varying each parameter (base size, maximal number of atoms in hypothesis and conclusion, . . . ). A quick analysis of the results shows that the number of backtracks in piece resolution is always lower than 310,000, whereas it is lower than 5,000,000 for Prolog. Fig. 4 gives the detailed distribution for each method. It shows that piece resolution reduces the number of backtracks. Indeed, the number of cases with more than 10,000 backtracks is 73 (2.42%) for piece resolution, whereas it is 279 (9.19%) for Prolog.

x = number of backtracks     Prolog            Piece resolution
x ≤ 100                      2047   67.98%     2504   83.16%
101 ≤ x ≤ 1000                407   13.52%      294    9.76%
1001 ≤ x ≤ 10000              280    9.30%      140    4.65%
10001 ≤ x ≤ 100000            153    5.08%       66    2.19%
100001 ≤ x ≤ 1000000           82    2.72%        7    0.23%
1000001 ≤ x                    42    1.39%        0    0%

Fig. 4. Repartition of tests, according to the number of backtracks.
Fig. 5 gives, for the same base, the repartition of the cases in which piece resolution reduces the number of backtracks (as a function of the number of backtracks for Prolog minus the number of backtracks for piece resolution). The percentage is given relative to the number of cases that show an improvement for piece resolution.

¹ We used SWI-Prolog (Copyright © 1990 Jan Wielemaker, University of Amsterdam).

x = improvement              number of cases   percentage
x ≤ 100                      696               47.54%
101 ≤ x ≤ 1000               290               19.81%
1001 ≤ x ≤ 10000             232               15.85%
10001 ≤ x ≤ 100000           127                8.67%
100001 ≤ x ≤ 1000000          78                5.33%
1000001 ≤ x                   41                2.80%

Fig. 5. Repartition of tests in which piece resolution is more efficient than Prolog, as a function of the decrease in the number of backtracks.
In the same way, Fig. 6 gives, for the same base, the repartition of the cases in which piece resolution increases the number of backtracks (as a function of the number of backtracks for piece resolution minus the number of backtracks for Prolog). The percentage is given relative to the number of cases that show a degradation for piece resolution.

x = degradation              number of cases   percentage
x ≤ 100                      228               76.77%
101 ≤ x ≤ 1000                36               12.12%
1001 ≤ x ≤ 10000              21                7.07%
10001 ≤ x ≤ 100000            10                3.37%
100001 ≤ x                     2                0.67%

Fig. 6. Repartition of tests in which piece resolution is less efficient than Prolog, as a function of the increase in the number of backtracks.
Let us point out the 42 cases where the number of backtracks for Prolog is greater than 1 million (Fig. 4). In these cases, the number of backtracks is decreased by more than 1 million with piece resolution, except in one case (a decrease of 931,850), the maximum decrease being 4,878,882. Therefore, even considering the time parameter, piece resolution shows a better efficiency in 33 cases out of 42, in spite of an inefficient algorithm. Indeed, we have implemented a “brute force” algorithm that computes every possible unification between the terms of a goal and a rule, keeping only the piece unifications. We can certainly decrease the execution time by studying more efficient algorithms. One may argue that the number of backtracks for piece resolution is lower because a part of the backtracks made by Prolog is moved inside the piece unification operation. Indeed, the decision problem associated with piece unification between two formulae is NP-complete. Several arguments run counter to this objection. First, the practical results show that for 33 of the 42 cases that are difficult for Prolog, piece resolution takes less time to finish, even though its algorithm is very rough.
In particular, in 4 cases, piece resolution terminates within 1 second and in fewer than 10 backtracks, while Prolog makes between 1.4 and 3.3 million backtracks, in 6.5 to 16 minutes. Thus the backtracks have not all been moved into the unification process. Theoretically speaking, the piece idea keeps part of the formula structure intact and allows failures to be detected sooner. Indeed, substitutions that are not piece unifications are rejected, whereas Prolog will use them to unify and develop its solution tree, so the failure will take place later (and never sooner). As stated before, we can improve piece unification by studying efficient piece unification algorithms, using results from graph theory or Constraint Satisfaction Problems (CSP). Indeed, there is a strong correspondence between a CSP and projection checking on two conceptual graphs [4]. In particular, unification has the same complexity as a projection between two graphs. It is polynomial when the corresponding piece has a tree form.
5 An Application to Relational Database Theory
Data dependencies are well known in the context of relational databases. They aim to formalize constraints that the data must satisfy. A dependency is a statement to the effect that when certain tuples are present in the database, so are certain others, or some values of the first ones are equal. The formalism usually used to define the TGDs and EGDs and to describe the chase procedure is either that of tableaux or that of first-order logic with identity [2] [1]. In this paper we use the latter formalism. In this section, we show how dependency theory can benefit from logical piece resolution.

5.1 Dependencies
For the sake of simplicity, we assume, as in [2], several restrictions on the model. These restrictions can be lifted with minor modifications. We assume that the database is under the universal relation assumption (a single relation symbol, noted R), that distinct attributes have disjoint domains (typed dependencies), and that dependencies do not contain constants.

Definition 12 (Tuple-Generating Dependency [1]). A TGD is a first-order logic sentence of the form ∀x1...∀xn[ϕ(x1, ..., xn) → ∃z1...∃zk ψ(y1, ..., ym)], where {z1, ..., zk} = {y1, ..., ym} − {x1, ..., xn}, where ϕ is a conjunction (possibly empty) of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables x1, ..., xn, and where ψ is a non-empty conjunction of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables z1, ..., zk.

Example 5 (TGD). The following first-order sentence is a TGD:
F = ∀x∀y∀z∀t∀u∀v∀w[R(x, y, z, t) ∧ R(x, u, v, w) → ∃s (R(s, y, v, t) ∧ R(s, u, z, w))].
F means that for every pair of tuples having the same value for the first attribute in the instance of the relation R, there are two other tuples in the instance with values taken from these two tuples, and a new value for the first attribute.
Definition 13 (Equality-Generating Dependency [1]). An EGD is a first-order logic sentence of the form ∀x1...∀xn[ϕ(x1, ..., xn) → ψ(x1, ..., xn)], where ϕ is a conjunction (possibly empty) of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables x1, ..., xn, and where ψ is of the form w = w′ (where w and w′ are variables of x1, ..., xn). As dependencies are typed, the equality atom involves a pair of variables assigned to the same position. Let A be the attribute associated with this position. Then this EGD is called an A-EGD. As dependencies are typed, we assume that variables in atoms occur only in their assigned positions.

Example 6 (EGD). The following first-order sentence is an EGD:
G = ∀x∀y∀z∀t∀u∀v[R(x, y, z, t) ∧ R(u, y, z, v) → (x = u)].
G means that all pairs of tuples that have the same value for the second and third attributes in the instance of the relation R also have the same value for the first attribute.

The Implication Problem for Dependencies and the Chase Procedure. Let D be a set of dependencies, and d be a dependency. The implication problem is to decide whether D |= d. The chase procedure [2] has been designed to solve the implication problem. We will not describe it in detail here. Informally speaking, the chase procedure takes the hypothesis of d and treats it as if it formed a set of tuples, thus replacing variables with symbols. Then it applies the dependencies of D repeatedly, following two distinct rules: a T-rule for a TGD, whose effect is to add tuples to the relation, and an E-rule for an EGD, whose effect is to “identify” two symbols. If we obtain the conclusion of d, then we are done. When d is a TGD, we stop when obtaining the tuples satisfying the conclusion of d. If d is an EGD, we stop when obtaining the identification of the two symbols involved in the equality in the conclusion of d. This mechanism has been shown sound and complete in [2].
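As an illustration, a single T-rule application can be sketched as follows. Tuples are plain Python tuples over the single relation R, a TGD is given by its hypothesis and conclusion atoms written as tuples of variable names, and the assignment of the universal variables is supplied explicitly; a real chase would search for such assignments, iterate to a fixpoint, and also need E-rules. All names and the data are ours:

```python
def t_rule(tuples, hyp, conc, exist_vars, assignment, fresh):
    """One T-rule step of the chase: check that the assignment maps every
    hypothesis atom of the TGD onto an existing tuple, then add the
    conclusion tuples, inventing a new symbol for each existentially
    quantified variable."""
    assert all(tuple(assignment[v] for v in atom) in tuples for atom in hyp)
    m = dict(assignment)
    for v in sorted(exist_vars):
        m[v] = next(fresh)  # a brand-new symbol for each existential variable
    return tuples | {tuple(m[v] for v in atom) for atom in conc}

# The TGD F of Example 5, applied to two tuples sharing their first attribute.
hyp = [("x", "y", "z", "t"), ("x", "u", "v", "w")]
conc = [("s", "y", "v", "t"), ("s", "u", "z", "w")]
tuples = {("a", "b1", "c1", "d1"), ("a", "b2", "c2", "d2")}
assignment = {"x": "a", "y": "b1", "z": "c1", "t": "d1",
              "u": "b2", "v": "c2", "w": "d2"}

chased = t_rule(tuples, hyp, conc, {"s"}, assignment, iter(["s1"]))
```

The step adds the two tuples required by the conclusion of F, with the fresh symbol s1 in the first attribute.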
Note that the implication problem for TGDs and EGDs is semi-decidable [14]; thus this procedure may not terminate. Nevertheless, some cases have been shown decidable. The chase procedure is clearly a bottom-up (or forward chaining) procedure. Indeed, the rules are applied to hypotheses, thus generating consequences (new tuples or identifications of symbols). This is executed until the desired conclusion is obtained. The initial goal (i.e. the conclusion of d) is not used to guide the process. We now show that we can solve the implication problem with the piece resolution method presented in section 3. In other words, we provide a top-down (or backward chaining) procedure.

5.2 From One Model to Another
In this section, we reduce the implication problem for dependencies to the existence of a piece resolution within logical rules. Our model of logical piece resolution does not handle equality. This is not a problem for TGDs, which are included in the set of logical rules, but EGDs contain equality predicates. We
have two ways of dealing with this problem. First, we could add equality treatment to our formalism, but we want to stay within the scope of CG piece resolution. Second, we can use some results of [2] to eliminate equality for the treatment of the implication problem. That is the solution chosen here.

Suppressing Equality in Dependencies. In this section, we use several results previously shown in [2]. Informally, instead of dealing with equality and identifying symbols within an EGD, we can say that they “look the same” within the relation. Expressing the results in the first-order logic formalism, we obtain the following transformation. Let e = ∀x1...∀xn(JR → (xi = xj)) be an EGD. We associate to e two TGDs e1 and e2, as follows:
e1 = ∀x1...∀xn∀y1...∀yl−1(JR ∧ R(xj, y1, ..., yl−1) → R(xi, y1, ..., yl−1))
e2 = ∀x1...∀xn∀y1...∀yl−1(JR ∧ R(xi, y1, ..., yl−1) → R(xj, y1, ..., yl−1))
where l is the arity of the only relation R and the yk, k ∈ [1..l − 1], are new variables not appearing among x1, ..., xn.

Example 7. The two TGDs associated with the EGD of example 6 are:
F1 = ∀x∀y∀z∀t∀u∀v∀n∀r∀s[R(x, y, z, t) ∧ R(u, y, z, v) ∧ R(u, n, r, s) → R(x, n, r, s)]
F2 = ∀x∀y∀z∀t∀u∀v∀n∀r∀s[R(x, y, z, t) ∧ R(u, y, z, v) ∧ R(x, n, r, s) → R(u, n, r, s)]
In the following, we will denote by D∗ the set of dependencies D in which every EGD has been replaced by its associated TGDs.

Theorem 5 ([2]). Let D be a set of TGDs and EGDs, d be a TGD and e be a non-trivial A-EGD (i.e. e = ∀x1...∀xn(JR → (xi = xj)) and i ≠ j). D |= d if and only if D∗ |= d. D |= e if and only if D∗ |= e1 and there is a non-trivial A-EGD in D.
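The transformation of an EGD into its two associated TGDs is purely syntactic and easy to mechanize. A sketch in Python (the atom encoding and names are ours), checked against Example 7:

```python
def egd_to_tgds(hyp, equality, arity, new_vars):
    """Replace an EGD (hyp -> x_i = x_j) by the two TGDs e1, e2 of
    Section 5.2: instead of identifying x_i and x_j, each TGD states that
    the two symbols 'look the same' in every tuple of R."""
    xi, xj = equality
    ys = [next(new_vars) for _ in range(arity - 1)]  # fresh y1..y(l-1)
    e1 = (hyp + [(xj, *ys)], [(xi, *ys)])  # hypothesis atoms, conclusion atoms
    e2 = (hyp + [(xi, *ys)], [(xj, *ys)])
    return e1, e2

# EGD G of Example 6: R(x,y,z,t) ∧ R(u,y,z,v) -> x = u, with arity l = 4.
hyp = [("x", "y", "z", "t"), ("u", "y", "z", "v")]
e1, e2 = egd_to_tgds(hyp, ("x", "u"), 4, iter(["n", "r", "s"]))
```

The resulting pairs of hypothesis and conclusion atom lists correspond to F1 and F2 of Example 7.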
D |= d if and only if D∗, hyp(ωD∗(d)) |= conc(ωD∗(d)). D |= e if and only if D∗, hyp(ωD∗(e1)) |= conc(ωD∗(e1)) and there is a non-trivial A-EGD in D.

Proof. TGDs are included in the set of logical rules. We simply apply theorems 4 and 5. □

Theorem 7 ([6]). Let D be a set of TGDs and EGDs, d be a TGD, and e be a non-trivial A-EGD. D |= d if and only if there is a logical piece resolution of conc(ωD∗(d)) on {D∗, hyp(ωD∗(d))} that ends successfully. D |= e if and only if there is a logical piece resolution of conc(ωD∗(e1)) on {D∗, hyp(ωD∗(e1))} that ends successfully and there is a non-trivial A-EGD in D.
S. Coulondre and E. Salvat
Proof. By theorem 6 and the soundness and completeness of logical piece resolution (theorems 2 and 3). □

Therefore, we showed that logical piece resolution can be applied in backward chaining to solve the implication problem for TGDs and EGDs. Since this problem is semi-decidable, the procedure may never terminate, as with the chase process. But it has the advantage that the inference is guided by the goal. We hope to develop heuristics to help the exploration of the solution tree.
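The equality-elimination step behind Theorems 5–7 is mechanical enough to sketch in code. The following is our own illustration, not an implementation from the paper: atoms over the single l-ary relation R are encoded as tuples of variable names, and the helper name `egd_to_tgds` is hypothetical.

```python
# Sketch (assumption: a single l-ary relation R, variables as strings).
# An EGD is (body_atoms, xi, xj); a TGD is (hypothesis_atoms, conclusion_atom).

def egd_to_tgds(body, xi, xj, arity):
    """Return the two TGDs e1, e2 associated with the EGD body -> xi = xj.

    Fresh variables y1..y_{l-1} fill the remaining positions of R, as in
    the transformation of [2] recalled above.
    """
    ys = [f"y{k}" for k in range(1, arity)]               # l-1 fresh variables
    e1 = (body + [tuple([xj] + ys)], tuple([xi] + ys))    # JR ∧ R(xj,y..) -> R(xi,y..)
    e2 = (body + [tuple([xi] + ys)], tuple([xj] + ys))    # JR ∧ R(xi,y..) -> R(xj,y..)
    return e1, e2

# Pattern of examples 6/7: R(x,y,z,t) ∧ R(u,y,z,v) -> x = u, with arity 4.
body = [("x", "y", "z", "t"), ("u", "y", "z", "v")]
f1, f2 = egd_to_tgds(body, "x", "u", 4)
```

Up to the renaming of n, r, s into y1, y2, y3, the pair (f1, f2) matches F1 and F2 of Example 7.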
6
Conclusion
In this paper we presented the contributions of the piece resolution mechanism in three points. The underlying idea was to adapt a graph notion to first-order logic. This resolution method was originally defined on conceptual graph rules [13]. The central notion is that of a piece, which allows whole subgraphs to be unified at once instead of splitting graphs into trivial subgraphs (restricted to a relation and its neighbors). In section 3, we translated the piece notion, which comes from graph theory, into first-order logic. As for conceptual graph rules, the piece resolution method is sound and complete on the set of FOL formulae defined by logical rules. In section 4, we compared SLD-resolution and logical piece resolution with respect to the number of backtracks of each method. This comparison shows that pieces can considerably reduce the number of backtracks. We must keep in mind, nevertheless, that unification in Prolog is polynomial, whereas the decision problem of piece unification between two formulae is NP-complete. The efficiency of the whole piece resolution mechanism depends on unification, which is the central operation. Thus, an algorithmic study of unification should improve the whole procedure. Then, in section 5, we pointed out the similarities between logical rules and data dependencies and presented a new proof procedure for the implication problem of dependencies, derived from piece resolution on logical rules. This procedure is top-down. We assumed several restrictions on the model. As already stated, these restrictions can be lifted with minor modifications. The logical rules model can deal with several predicates, thus lifting the universal relation assumption. It can also handle non-disjoint domains, hence unsorted (untyped) dependencies, and constants. That is, we can express a constraint like "Employee #23 can have only one desk".
Our theorems still hold in these cases, provided that the reduction of EGDs to TGDs also works in the unrestricted model, which is stated in [2]. We think this procedure fills a gap in the processing of data dependencies.

Acknowledgments

We are grateful to M. C. Rousset for showing us the similarity between the FOL formulae associated with CG rules and TGDs, to M. L. Mugnier, who read the manuscript very carefully, and to M. Y. Vardi for his useful additional information.
References

1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, Reading, Mass., 1995.
2. C. Beeri and M. Y. Vardi. A proof procedure for data dependencies. Journal of the ACM, 31(4):718–741, October 1984.
3. C.-L. Chang and R. C.-T. Lee. Symbolic Logic and Mechanical Theorem Proving. Academic Press, New York, 1973.
4. M. Chein and M.-L. Mugnier. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
5. A. Colmerauer. Prolog in 10 figures. Communications of the ACM, 28(12):1296–1310, December 1985.
6. S. Coulondre. Exploitation de la notion de pièce des règles de graphes conceptuels en programmation logique et en bases de données. Master's thesis, Université Montpellier II, 1997.
7. J. Fargues, M.-C. Landau, A. Dugourd, and L. Catach. Conceptual graphs for semantics and knowledge processing. IBM Journal of Research and Development, 30(1):70–79, January 1986.
8. B. C. Ghosh and V. Wuwongse. A direct proof procedure for definite conceptual graph programs. Lecture Notes in Computer Science, 954, 1995.
9. R. A. Kowalski. Predicate logic as a programming language. Proc. IFIP 4, pages 569–574, 1974.
10. J. W. Lloyd. Foundations of Logic Programming, Second Edition. Springer-Verlag, 1987.
11. D. W. Loveland. A simplified format for the model elimination procedure. Journal of the ACM, 16(3):233–248, July 1969.
12. E. Salvat. Raisonner avec des opérations de graphes : graphes conceptuels et règles d'inférence. PhD thesis, Université Montpellier II, Montpellier, France, December 1997.
13. E. Salvat and M.-L. Mugnier. Sound and complete forward and backward chainings of graph rules. In Proceedings of the Fourth International Conference on Conceptual Structures (ICCS-96), volume 1115 of LNAI, pages 248–262, Berlin, August 19–22, 1996. Springer.
14. M. Y. Vardi. The implication and finite implication problems for typed template dependencies. Journal of Computer and System Sciences, 28(1):3–28, February 1984.
Triadic Concept Graphs

Rudolf Wille

Technische Universität Darmstadt, Fachbereich Mathematik, Schloßgartenstr. 7, D–64289 Darmstadt, [email protected]

Abstract. In the paper "Conceptual Graphs and Formal Concept Analysis", the author presented a first attempt at unifying the Theory of Conceptual Graphs and Formal Concept Analysis. This context-based approach, which is philosophically supported by Peirce's pragmatic epistemology, is grounded on families of related formal contexts whose formal concepts allow a mathematical representation of the concepts and relations of conceptual graphs. Such a representation of a conceptual graph is called a "concept graph" of the context family from which it is derived. In this paper the theory of concept graphs is extended to allow a mathematical representation of nested conceptual graphs by "triadic concept graphs". As in the preceding paper, our focus lies on the mathematical structure theory, which could later be used for extending the already developed logical theory of simple concept graphs. The overall aim of this research is to contribute to the development of a contextual logic as a basis of Conceptual Knowledge Processing.
1
Contextual Logic
"Contextual Logic" is understood as a mathematization of traditional philosophical logic, which is based on "the three essential main functions of thinking: concept, judgment and conclusion" [Ka88; p. 6]. For mathematizing the doctrine of concepts we offer Formal Concept Analysis [GW96], which formalizes concepts on the basis of formal contexts. For mathematizing the doctrine of judgments and conclusions we use the Theory of Conceptual Graphs [So84],[So98]. Contextual Logic shall primarily be developed as a basis of Conceptual Knowledge Processing as discussed in [Wi94]. First ideas toward a contextual logic were presented in the paper "Restructuring mathematical logic: an approach based on Peirce's pragmatism" [Wi96a], which proposes Formal Concept Analysis for establishing more connections between the logic-mathematical theory and reality. In the paper "Conceptual Graphs and Formal Concept Analysis" [Wi97b] a first attempt is made at unifying Formal Concept Analysis and the Theory of Conceptual Graphs for the foundation of a contextual logic. This context-based approach, which is philosophically supported by Peirce's pragmatic epistemology, is grounded on families of related formal contexts whose formal concepts allow a mathematical representation of the concepts and relations of conceptual graphs. Such a representation of a conceptual graph is called a "concept graph" of the context family from which it is derived.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 194–208, 1998. © Springer-Verlag Berlin Heidelberg 1998
While the approach to concept graphs in [Wi97b] concentrates on the foundation of a mathematical structure theory, the paper "Simple concept graphs: a logic approach" [Pr98b] presents a logical theory of concept graphs, whose syntax defines concept graphs as syntactical constructs over an alphabet of object names, concept names, and relation names, and whose semantics adapt the mathematical structure theory of [Wi97b] for the interpretation of those syntactical constructs (see also [Pr98a]). Up to now, the developed theory only allows conceptual graphs without nesting to be represented mathematically. Therefore the theory should be extended so that nested conceptual graphs can be treated within Contextual Logic too. In this paper we approach such an extended theory of concept graphs by using Triadic Concept Analysis (see [LW95]). Again, our focus lies on founding the mathematical structure theory, which could later be used for extending the logical theory of concept graphs. In Section 2 we discuss, through triadic contexts of tonal music, how subdivisions of concept graphs can be described using triadic concepts. This discussion motivates the foundation of a theory of "triadic concept graphs" in Section 3. In Section 4, we explain how positive nested conceptual graphs can be mathematically represented by triadic concept graphs. Finally, in Section 5, we sketch ideas of further research toward a comprehensive foundation of Contextual Logic.
2
Triadic Contexts of Tonal Music
The discussion of concept graphs in music theory offered in this section shall provide us with an alternative view of the potential application of concept graphs, which especially suggests allowing not only nestings but also subdivisions with overlappings. Concept graphs can be used to represent the harmonic analysis of chord sequences. For example, the sequence of major triads C – F – G – D – E – A
is analysed in [He49; p. 38] as a diatonic modulation from c major to a major via d major, which can be represented by the concept graph visualized in Figure 1: each small box represents the concept of a chroma and each disc represents the concept of a major triad, which is understood as a ternary relation between tones of specific chroma; the functional meaning of a major triad is determined by the containment of its disc in the large boxes, which represent major keys, respectively. As in the given example, objects and relationships might have different meanings in different surroundings. Therefore we propose a triadic approach for the extended theory of concept graphs, which allows the different meanings of the considered relationships to be represented mathematically. To keep this paper (as much as possible) self-contained, we recall the basic notions of Triadic Concept Analysis from [Wi95] and [LW95]. A triadic context is defined as a quadruple (G, M, B, Y) where G, M, and B are sets and Y is a ternary relation between G, M, and B, i.e. Y ⊆ G × M × B; the elements of G, M, and B are called
Fig. 1. Harmonic analysis of a diatonic modulation
(formal) objects, attributes, and modalities, respectively, and (g, m, b) ∈ Y is read: the object g has the attribute m in the modality b. Formal objects, attributes, and modalities may formalize entities in a wide range, but in the triadic context they are understood in the role of the corresponding Peircean category. In particular, the formal modalities as abstract instances of the third category may formalize relations, mediations, representations, interpretations, evidences, evaluations, modalities, meanings, reasons, purposes, conditions etc. If real data are described by a triadic context, the names of the formal objects, attributes, and modalities yield the elementary bridges to reality which are basic for interpretations (cf. [Wi92]). A triadic concept of a triadic context (G, M, B, Y) is defined as a triple (A1, A2, A3) with A1 ⊆ G, A2 ⊆ M, and A3 ⊆ B such that the triple (A1, A2, A3) is maximal with respect to component-wise set inclusion in satisfying A1 × A2 × A3 ⊆ Y; i.e., for X1 ⊆ G, X2 ⊆ M, and X3 ⊆ B with X1 × X2 × X3 ⊆ Y, the containments A1 ⊆ X1, A2 ⊆ X2, and A3 ⊆ X3 always imply (A1, A2, A3) = (X1, X2, X3). If (G, M, B, Y) is described by a three-dimensional cross table, this means that, under suitable permutations of rows, columns, and layers of the cross table, the triadic concept (A1, A2, A3) is represented by a maximal rectangular box full of crosses. For a particular triadic concept c := (A1, A2, A3), the components A1, A2, and A3 are called the extent, the intent, and the modus of c, respectively; they are also denoted by Ext(c), Int(c), and Mod(c). For the description of derivation operators, it is convenient to denote the underlying triadic context alternatively by K := (K1, K2, K3, Y).
For {i, j, k} = {1, 2, 3} with j < k and for X ⊆ Ki and Z ⊆ Kj × Kk, the (i)-derivation operators are defined by

X ↦ X^(i) := {(aj, ak) ∈ Kj × Kk | ai, aj, ak are related by Y for all ai ∈ X},
Z ↦ Z^(i) := {ai ∈ Ki | ai, aj, ak are related by Y for all (aj, ak) ∈ Z}.

It can easily be seen that a triple (A1, A2, A3) with Ai ⊆ Ki for i = 1, 2, 3 is a triadic concept of K if and only if Ai = (Aj × Ak)^(i) for {i, j, k} = {1, 2, 3} with j < k.
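The derivation operators and the concept characterization above translate directly into an executable sketch. This is our own rendering, not code from the paper; a triadic context is encoded as a set Y of (g, m, b) triples, and the function names are hypothetical.

```python
from itertools import product

# Sketch: triadic context as (K1, K2, K3, Y) with Y a set of triples.

def derive_pairs(Z, i, Y, K):
    """Z^(i): the elements of K_i related by Y to every pair (a_j, a_k) in Z."""
    j, k = [n for n in range(3) if n != i]

    def related(ai, aj, ak):
        t = {i: ai, j: aj, k: ak}
        return (t[0], t[1], t[2]) in Y

    return {ai for ai in K[i] if all(related(ai, aj, ak) for (aj, ak) in Z)}

def is_triadic_concept(A, Y, K):
    """(A1, A2, A3) is a triadic concept iff A_i = (A_j x A_k)^(i) for each i."""
    for i in range(3):
        j, k = [n for n in range(3) if n != i]
        Z = set(product(A[j], A[k]))
        if A[i] != derive_pairs(Z, i, Y, K):
            return False
    return True

# Tiny context: two objects sharing one attribute in one modality.
K = ({"g1", "g2"}, {"m1"}, {"b1"})
Y = {("g1", "m1", "b1"), ("g2", "m1", "b1")}
```

Here ({g1, g2}, {m1}, {b1}) is a triadic concept, while ({g1}, {m1}, {b1}) is not maximal and is rejected by the characterization.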
Triadic Concept Graphs
197
The set T(K) of all triadic concepts of the triadic context K := (K1, K2, K3, Y) is structured by set inclusion considered in each of the three components of the triadic concepts. For each i ∈ {1, 2, 3}, one obtains a quasiorder ≲i and its corresponding equivalence relation ∼i, defined by

(A1, A2, A3) ≲i (B1, B2, B3) :⇐⇒ Ai ⊆ Bi and
(A1, A2, A3) ∼i (B1, B2, B3) :⇐⇒ Ai = Bi (i = 1, 2, 3).
The relational structure T(K) := (T(K), ≲1, ≲2, ≲3) is called the concept trilattice of the triadic context K. As contextual and conceptual basis for concept graphs we define a power context family of triadic contexts by K⃗ := (K1, . . . , Kn) (n ≥ 2) with Kk := (Gk, Mk, B, Yk) (k = 1, . . . , n) such that Gk ⊆ (G1)^k. How specific contextual logics may be grounded on such power context families shall be demonstrated by a mathematical model for tuning logics of tonal music. Since 1980 the computer organ MUTABOR has been developed within the research project on "Mathematical Music Theory" at Darmstadt University of Technology (see [GHW85],[MW88],[ARW92],[Wi97a]). The special feature of MUTABOR is its real-time computing of pitches for the keys, which is performed immediately after a key is touched. This allows one to play on a regular keyboard with arbitrary pitches of the continuum of audible frequencies in almost arbitrary combinations. The computation of pitches is performed by specific computer programs called "tuning logics". A challenging problem is to design tuning logics which allow music to be played on MUTABOR in just intonation. For solving this problem, mathematical models of tone systems for just intonation are needed. The most common system for this purpose is the Harmonic Tone System, which is generated by the intervals octave (2 : 1), perfect fifth (3 : 2), and perfect third (5 : 4). An unbounded mathematical model of the Harmonic Tone System is given by the integer set T := {2^p 3^q 5^r | p, q, r ∈ Z} where Z is the set of all integers (cf. [Wi80]). As we have already seen in our preceding example of a diatonic modulation, a harmonic analysis (necessary for treating the tuning problem) is of triadic nature. Therefore we build from the integer set T a power context family of triadic contexts (Gk, Mk, B, Yk) (k = 1, . . . , 12) by defining Gk := T^k, M1 as the set of all keys, octave ranges, tone letters and chromas (with indices for syntonic commas), Mk as the set of all k-ary harmonies, chordal and harmonic forms, and B as the set of all tonalities, major and minor keys (for precise mathematical definitions of all those musical notions and of the relations Yk we refer to [Wi76],[Wi80],[NW90]). The musicologist M. Vogel has proposed a tuning logic for just intonation (see [Vo75], pp. 343–345) which has been implemented on MUTABOR by using the triadic mathematization of the Harmonic Tone System underlying the described power context family. In discussing Vogel's tuning logic through an example concerned with the so-called Problem of the Harmony of Second Degree, we demonstrate the use of concept graphs derivable from the power context family of the Harmonic Tone System. Figure 2 shows at the top a concept graph representing a sequence of major and minor triads whose tone concepts are described by MIDI numbers of keys.
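The model T = {2^p 3^q 5^r} can be sketched concretely. This is our own encoding, not from the paper: a tone is an exponent triple (p, q, r), so the generating intervals become simple translations of the triple.

```python
from fractions import Fraction

# Sketch (our encoding): a tone 2^p * 3^q * 5^r of the Harmonic Tone System
# as the exponent triple (p, q, r).

OCTAVE, FIFTH, THIRD = (1, 0, 0), (-1, 1, 0), (-2, 0, 1)  # ratios 2:1, 3:2, 5:4

def ratio(tone):
    """Frequency ratio of a tone relative to the reference tone (0, 0, 0)."""
    p, q, r = tone
    return Fraction(2) ** p * Fraction(3) ** q * Fraction(5) ** r

def step(tone, interval):
    """Transpose a tone by one of the generating intervals."""
    return tuple(a + b for a, b in zip(tone, interval))

c = (0, 0, 0)          # reference tone, ratio 1
g = step(c, FIFTH)     # perfect fifth above c
e = step(c, THIRD)     # perfect third above c
```

Because tones are exponent triples, the intervals compose by addition, which mirrors the unboundedness of the model in all three directions.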
Fig. 2. A sequence of major and minor chords transformed into just intonation
Fig. 3. The chord sequence of Figure 2 represented in Euler's speculum musicum
The question is: which pitches should MUTABOR compute for those keys? The concept graph in the middle of Figure 2 yields a partial answer to this question in showing chroma concepts instead of the key concepts (e.g. the keys 60, 52, and 55 of the c major triad are replaced by the general chromas c, e, and g, respectively; notice that the keys 52 and 55 are in the range of the small octave and that the key 60 is in the range of the one-line octave). Figure 2 shows at the bottom a subdivided concept graph with precise chroma letters of the Harmonic Tone System which, together with the octave range, uniquely determine the pitch, respectively. How Vogel's tuning logic reaches those pitches may be understood best with the aid of Euler's "speculum musicum", which graphically represents the chroma system of the Harmonic Tone System (cf. [Vo75], p. 102); a detail of the speculum musicum is presented in Figure 3 together with the harmonic analysis of our chord sequence obtained by Vogel's tuning logic. Mathematically, Euler's net represents the numbers 3^q 5^r (q, r ∈ Z), where c stands for 3^0 5^0 and 3^q 5^r leads to 3^(q+1) 5^r by one step to the right and to 3^q 5^(r+1) by one step to the top (3^q 5^r is the chroma of the tones 2^p 3^q 5^r with p ∈ Z). Our chord sequence is indicated in Figure 3 by the hatched triangles from right to left. The hexagon on the right represents the tonality c, which comprises all chromas contained in this hexagon. The other similar hexagons represent the tonalities f, a−1, g−1, and c−1. The key idea of Vogel's tuning logic is to determine the pitch of a touched chord by the (possibly new) tonality which has the same denotation as the chroma of the reference tone of the touched chord with respect to the actual tonality. For our example, this yields a surprising tonality modulation from c to c−1 via f, a−1, and g−1. It is still an open problem in which circumstances Vogel's solution is adequate and in which not.
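The two step directions of Euler's net can be sketched in the same exponent encoding used above for the Harmonic Tone System (again our own illustration, not code from the paper; `chroma` forgets the octave exponent p):

```python
# Sketch: Euler's net as the chroma lattice (q, r) <-> 3^q 5^r.

def chroma(tone):
    """The chroma 3^q 5^r of a tone 2^p 3^q 5^r: drop the octave exponent p."""
    p, q, r = tone
    return (q, r)

def right(ch):
    """One step to the right in Euler's net: 3^q 5^r -> 3^(q+1) 5^r."""
    q, r = ch
    return (q + 1, r)

def up(ch):
    """One step to the top in Euler's net: 3^q 5^r -> 3^q 5^(r+1)."""
    q, r = ch
    return (q, r + 1)
```

For instance, all tones 2^p with p ∈ Z share the chroma (0, 0) of c, and one step to the top from c yields the chroma of the perfect third e.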
Viewing a harmonic analysis as a judgment, which can be mathematized by a concept graph, helps in understanding how concept graphs of power context families of triadic contexts should be defined. Of course, the main purpose of such a definition is to allow an adequate mathematization of nested conceptual
graphs which are understood as logical abstractions of verbal judgments. But the examples from music show that subdivisions with overlappings should be permitted. To cover such a broad understanding of formal judgments, we define in the next section triadic concept graphs as a generalization of the concept graphs introduced in [Wi97b].
3
Triadic Concept Graphs
For the definition of triadic concept graphs, we first define an abstract concept graph with subdivision as a structure G := (V, E, ν, C, κ, θ, σ) for which

1. V and E are sets and ν is a mapping of E to V ∪ V^2 ∪ · · · ∪ V^n (2 ≤ n ∈ N), so that (V, E, ν) can be considered as a directed multi-hypergraph with vertices from V and edges from E (we define |e| = k :⇔ ν(e) = (v1, . . . , vk)),
2. C is a set and κ is a mapping of V ∪ E to C such that κ(e1) = κ(e2) always implies |e1| = |e2| (the elements of C may be understood as abstract concepts),
3. θ is an equivalence relation on V,
4. σ is a partial mapping of the vertex set V to the power set of V × C such that, for the partial mappings σ1 and σ2 with σ1(w) := {v ∈ V | (v, c) ∈ σ(w) for some c ∈ C} and σ2(w) := {c ∈ C | (v, c) ∈ σ(w) for some v ∈ V} for w ∈ dom(σ), we have σ1(V) ≠ ∅ ≠ σ2(V) and v ∉ (σ1)^m(v) for all v ∈ V and m ∈ N.

Abstract concept graphs with subdivision are quite common. Figure 4 (taken from [Ka79]) shows examples of such graphs (without edges) representing the lexical fields "Gewässer" and "waters". Now, let K⃗ be a power context family of triadic contexts K1, . . . , Kn with Kk := (Gk, Mk, B, Yk) (1 ≤ k ≤ n) and let C_K⃗ := T(K1) ∪ · · · ∪ T(Kn). Then an abstract concept graph G with subdivision, specified by G := (V, E, ν, C, κ, θ, σ), is called a triadic concept graph over the power context family K⃗ if

1. C = C_K⃗,
2. κ(V) ⊆ T(K1),
3. κ(e) ∈ T(Kk) for all e ∈ E with |e| = k,
4. σ2(w) ⊆ G1 for all w ∈ dom(σ),
5. c1 ∈ Ext(c2), c2 ∈ Ext(c3), . . . , cm−1 ∈ Ext(cm) for c1, c2, . . . , cm ∈ σ2(V) imply c1 ≠ cm.

A realization of such a triadic concept graph G over K⃗ is defined to be a mapping ρ of V to the power set of G1 satisfying, for all v ∈ V, w ∈ dom(σ), and e ∈ E with ν(e) = (v1, . . . , vk),

1. ∅ ≠ ρ(v) ⊆ Ext(κ(v)) if v ∉ σ1(w̃) for all w̃ ∈ dom(σ),
2. ρ(v1) × · · · × ρ(vk) ⊆ Ext(κ(e)) if {v1, . . . , vk} ⊈ σ1(w̃) for all w̃ ∈ dom(σ),
3. v1 θ v2 always implies ρ(v1) = ρ(v2),
4. ∅ ≠ ρ(v) ⊆ (Int(κ(v)) × Mod(c))^(1) if (v, c) ∈ σ(w),
Fig. 4. Abstract concept graphs with subdivision for two lexical fields
5. ρ(v1) × · · · × ρ(vk) ⊆ (Int(κ(e)) × Mod(c))^(1) if (v1, c), . . . , (vk, c) ∈ σ(w),
6. σ2(w) ⊆ ρ(w).

The pair G := (G, ρ) is called a realized triadic concept graph of the power context family K⃗ or, shortly, a triadic concept graph of K⃗. In discussing the definition of triadic concept graphs, let us first remark that the triadic concept graphs of a power context family K⃗ with only one modality and with dom(σ) = ∅ can be considered as the concept graphs of K⃗ as introduced in [Wi97b]; in this way, triadic concept graphs can be understood as a generalization of (dyadic) concept graphs. The basic idea of triadic concept graphs is to describe the parts of a subdivision by specific triadic concepts which are also elements of the object set G1 and are therefore called "general objects". Condition 5 of the definition of triadic concept graphs over a power context family guarantees (under a suitable finite chain condition) that the general objects can be determined inductively. Conditions 4 and 5 of the definition of realizations explicate the idea that the general object c ∈ σ2(w) (w ∈ dom(σ)) represents the part of the subdivision that is constituted by the vertex set {v ∈ V | (v, c) ∈ σ(w)}; for a vertex v and an edge e in that part, the extents of κ(v) and κ(e) are modified to the extents (Int(κ(v)) × Mod(c))^(1) and (Int(κ(e)) × Mod(c))^(1) to respect
the modalities of c, while the intents of κ(v) and κ(e) keep the identity of the triadic concepts κ(v) and κ(e), respectively, through changing modalities.
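The carrier structure (V, E, ν, C, κ, θ, σ) with its projections σ1 and σ2 can be sketched as a small data structure. This is our own rendering for illustration only; the class and field names are hypothetical and no attempt is made to enforce the axioms of the definition.

```python
from dataclasses import dataclass

# Sketch: the carrier of an abstract concept graph with subdivision,
# G = (V, E, nu, C, kappa, theta, sigma), with sigma1/sigma2 as used above.

@dataclass
class AbstractConceptGraph:
    V: set        # vertices
    E: set        # edges
    nu: dict      # edge -> tuple of vertices
    C: set        # abstract concepts
    kappa: dict   # vertex or edge -> concept in C
    theta: set    # equivalence relation on V, as a set of pairs
    sigma: dict   # partial map: vertex -> set of (vertex, concept) pairs

    def sigma1(self, w):
        """Vertices occurring in sigma(w) (empty if w is not in dom(sigma))."""
        return {v for (v, c) in self.sigma.get(w, set())}

    def sigma2(self, w):
        """Concepts (general objects) occurring in sigma(w)."""
        return {c for (v, c) in self.sigma.get(w, set())}

g = AbstractConceptGraph(
    V={"v1", "v2", "w"}, E=set(), nu={}, C={"c1"},
    kappa={}, theta={("v1", "v1")},
    sigma={"w": {("v1", "c1"), ("v2", "c1")}},
)
```

In this toy instance, the vertex w subdivides off the part {v1, v2}, described by the single general object c1.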
4
Contextual Semantics of Nested Conceptual Graphs
In this section we show how a nested conceptual graph can be represented mathematically by a triadic concept graph. As an example we choose the nested conceptual graph discussed in [CM97]. This conceptual graph (with added individual markers) is shown in Figure 5. To obtain the representing triadic concept graph, we first derive the power context family K⃗ := (K1, K2) which represents the knowledge coded in the nested conceptual graph; K1 := (G1, M1, B, Y1) and K2 := (G2, M2, B, Y2) are described in Figure 6 by cross tables in which columns without crosses are omitted. The names of the formal objects, attributes, and modalities head the rows, the columns, and the vertical sections of the tables, respectively. The general objects u, a, b, and c, which represent the large boxes, are triadic concepts of K1 generated by their corresponding attributes and modalities u, a, b, and c, i.e.

u := ({Peter, #1, a}, {u}, {u}),
a := ({#2, b}, {a}, {a}),
b := ({#3, #4, c}, {b}, {b}),
c := ({#4, #5, #6, #7, #8}, {c}, {c}).

The concepts and the binary relations of the nested conceptual graph are represented by triadic concepts of K1 and K2 according to the following list:

Person ↔ ({Peter, #5, #7}, {Person}, {u, c}),
Think ↔ ({#1}, {Think, u}, {u}),
Painting ↔ ({a}, {Painting, u}, {u}),
Bucolic ↔ ({#2}, {Bucolic, a}, {a}),
Scene ↔ ({b}, {Scene, a}, {a}),
Boat ↔ ({#3}, {Boat, b}, {b}),
Lake ↔ ({#4}, {Lake, b, c}, {b, c}),
Couple ↔ ({c}, {Couple, b}, {b}),
Fish ↔ ({#6}, {Fish, c}, {c}),
Sleep ↔ ({#8}, {Sleep, c}, {c}),
agent ↔ ({(#1, Peter), (#6, #5), (#8, #7)}, {agent}, {u, c}),
object ↔ ({(#1, a)}, {object, u}, {u}),
attribute ↔ ({(b, #2)}, {attribute, a}, {a}),
in ↔ ({(#3, c), (#6, #4)}, {location, in}, {b, c}),
on ↔ ({(#4, #3)}, {location, on}, {b}).
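The claim that u is generated by its attribute and modality u can be checked mechanically. The following is a hypothetical mini-extract of the cross table K1, containing only a few of the triples of Figure 6, and the variable names are ours; the point is just that the extent of the concept generated by (attribute u, modality u) comes out as {Peter, #1, a}.

```python
# Hypothetical mini-extract of the cross table K1 (not the full Figure 6):
# triples (object, attribute, modality).
Y1 = {
    ("Peter", "Person", "u"), ("Peter", "u", "u"),
    ("#1", "Think", "u"),     ("#1", "u", "u"),
    ("a", "Painting", "u"),   ("a", "u", "u"),
    ("#2", "Bucolic", "a"),   ("#2", "a", "a"),
    ("b", "Scene", "a"),      ("b", "a", "a"),
}

# Extent of the triadic concept generated by attribute u and modality u:
# all objects g with a cross (g, u, u).
extent_u = {g for (g, m, b) in Y1 if m == "u" and b == "u"}

# The same pattern for the general object a of the box "Painting".
extent_a = {g for (g, m, b) in Y1 if m == "a" and b == "a"}
```

These extents match the first components of u := ({Peter, #1, a}, {u}, {u}) and a := ({#2, b}, {a}, {a}) in the list above.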
All triadic concepts of K1 and their triadic relationships can be read off from the triadic diagram of the concept trilattice T(K1) depicted in Figure 7; the line diagram on the right, visualizing the containment of the extents, represents the hierarchy of concept types indicated by the conceptual graph in Figure 5 (see [LW95],[Bi97] for reading conventions of triadic diagrams). The listed triadic concepts form a triadic concept graph over K⃗, and the assignment of individual markers to concept types in Figure 5 yields a realization and hence a triadic concept graph of K⃗ which mathematically represents the nested conceptual graph of Figure 5. The example makes clear that a triadic approach is necessary if the concepts and relations of nested conceptual graphs shall be represented by formal concepts. A dyadic representation is not flexible enough to capture precisely the knowledge coded in a nested conceptual graph.
Fig. 5. Example of a positive nested conceptual graph
Fig. 6. A power context family derived from the conceptual graph in Figure 5

Fig. 7. A triadic diagram of the concept trilattice of K1 in Figure 6

The triadic approach presented in this section can be applied to positive nested conceptual graphs as they are mathematized in [CM96] and [CM97] by labelled bipartite graphs G := (R, C, E, l), for which R and C are disjoint sets of vertices labelled via l by relation types and concept types, respectively, and E is a set of edges between R and C labelled in such a way that adjacent edges of
a vertex of R get the consecutive numbers 1, 2, . . .; further labels of a vertex of C may be individual markers and also descriptions of subgraphs of G which can be considered to be nested into that vertex. Clearly, the graph G := (R, C, E, l) without its individual markers may be understood as an abstract concept graph G with subdivision. For establishing the power context family K⃗ := (K1, . . . , Kn) with Kk := (Gk, Mk, B, Yk) (k = 1, . . . , n) which can be associated with G, we complete the individual markers of G so that each of its concept nodes carries at least one individual marker; then we build with those markers and with the descriptions of the distinguished subgraphs the set G̃1 (which will be changed later to the set G1 by replacing the subgraph descriptions by suitable general objects). The subgraph descriptions form the modality set B and shall also be elements of the attribute sets Mk; the other elements of Mk are the relation types assigned to a relation vertex with k adjacent concept vertices and, in case
of k = 1, also the assigned concept types. Ỹ1 is then taken as the smallest ternary relation satisfying the following conditions:

1. (g, a, a) ∈ Ỹ1 for all subgraph descriptions a and for all individual markers and subgraph descriptions g assigned within the subgraph described by a,
2. the triadic context K̃1 := (G̃1, M1, B, Ỹ1) permits a mapping ρ of V to the power set of G̃1 for a suitably chosen κ : C → T(K̃1) satisfying the conditions 1 and 4 of the definition of a realization in Section 3.

For each subgraph description a, a triadic concept a can be defined by

a := (({a} × {a})^(1), (({a} × {a})^(1) × {a})^(2), (({a} × {a})^(1) × (({a} × {a})^(1) × {a})^(2))^(3)),

starting an inductive procedure from the innermost subgraphs and the triadic context (G̃1, M1, B, Ỹ1) and, after each step, replacing in the object set of the actual context a subgraph description by the corresponding triadic concept. Finally we obtain the triadic context K1 := (G1, M1, B, Y1) whose object set G1 contains no more subgraph descriptions but, as general objects, the triadic concepts defined for the subgraph descriptions. The other triadic contexts of the power context family K⃗ may be similarly defined as K1, so that the abstract concept graph G can be concretized to a triadic concept graph of K⃗ which mathematically represents the positive nested conceptual graphs described by the labelled bipartite graph G := (R, C, E, l).
5 Further Research
The theory of triadic concept graphs begun here shall be elaborated toward a comprehensive theory of formal judgments and conclusions as an essential part of Contextual Logic. This will be performed in two directions: the elaboration of a mathematical structure theory which treats triadic concept graphs as realizations of abstract concept graphs with subdivision, and the development of a logic theory which understands triadic concept graphs within contextual models of a syntactical language. For the mathematical structure theory, a challenging problem is to find suitable structures and representations which combine concept lattices and concept graphs effectively and communicatively. For the logic theory, the first task consists in extending the existing theory for simple concept graphs (see [Pr98a], [Pr98b]) to a theory of syntax and semantics for triadic concept graphs which fits the mathematical structure theory. Then, of course, a main goal is to widen the expressibility of the developed theory by activating larger parts of predicate logic which are still decidable and allow effective algorithms. The integration of modal logics should also be tackled. Overall, the pragmatic meaning of these theoretical developments has to be continuously reflected and examined on the basis of concrete applications.
References
1. V. Abel, P. Reiss, R. Wille: MUTABOR II – Ein computergesteuertes Musikinstrument zum Experimentieren mit Stimmungslogiken und Mikrotönen. FB4-Preprint Nr. 1513, TU Darmstadt 1992.
2. K. Biedermann: How triadic diagrams represent triadic structures. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin-Heidelberg-New York 1997, 304–317.
3. M. Chein, M.-L. Mugnier: Représenter des connaissances et raisonner avec des graphes. Revue d'intelligence artificielle 10 (1996), 7–56.
4. M. Chein, M.-L. Mugnier: Positive nested conceptual graphs. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin-Heidelberg-New York 1997, 95–109.
5. B. Ganter, H. Hempel, R. Wille: MUTABOR – Ein rechnergesteuertes Musikinstrument zur Untersuchung von Stimmungen. ACUSTICA 57 (1985), 284–289.
6. B. Ganter, R. Wille: Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin-Heidelberg 1996. (English translation to appear)
7. R. Hernried: Systematische Modulation. de Gruyter, Berlin 1949.
8. I. Kant: Logic. Dover, New York 1988.
9. G. L. Karcher: Kontrastive Untersuchung von Wortfeldern im Deutschen und Englischen. Peter Lang, Frankfurt 1979.
10. F. Lehmann, R. Wille: A triadic approach to formal concept analysis. In: G. Ellis, R. Levinson, W. Rich, J. F. Sowa (eds.): Conceptual Structures: Applications, Implementations and Theory. Springer, Berlin-Heidelberg-New York 1995, 32–43.
11. C. Misch, R. Wille: Eine Programmiersprache für MUTABOR. In: F. Richter Herf (ed.): Mikrotöne II. Edition Helbing, Innsbruck 1988, 87–94.
12. W. Neumaier, R. Wille: Extensionale Standardsprache der Musiktheorie – eine Schnittstelle zwischen Musik und Informatik. In: H.-P. Hesse (ed.): Mikrotöne III. Edition Helbing, Innsbruck 1990, 149–167.
13. S. Prediger: Einfache Begriffsgraphen: Syntax und Semantik. FB4-Preprint Nr. 1962, TU Darmstadt 1998.
14. S. Prediger: Simple concept graphs: a logic approach. FB4-Preprint, TU Darmstadt 1998 (this volume).
15. M. Vogel: Die Lehre von den Tonbeziehungen. Verlag für systematische Musikwissenschaft, Bonn-Bad Godesberg 1975.
16. J. F. Sowa: Conceptual structures: information processing in mind and machine. Addison-Wesley, Reading 1984.
17. J. F. Sowa: Knowledge representation: logical, philosophical, and computational foundations. PWS Publishing Co., Boston (to appear).
18. R. Wille: Mathematik und Musiktheorie. In: G. Schnitzler (ed.): Musik und Zahl. Verlag für systematische Musikwissenschaft, Bonn-Bad Godesberg 1976, 233–264.
19. R. Wille: Mathematische Sprache in der Musiktheorie. Jahrbuch Überblicke Mathematik 1980. Bibl. Institut, Mannheim 1980, 167–184.
20. R. Wille: Begriffliche Datensysteme als Werkzeug der Wissenskommunikation. In: H. H. Zimmermann, H.-D. Luckhardt, A. Schulz (eds.): Mensch und Maschine – Informationelle Schnittstellen der Kommunikation. Universitätsverlag Konstanz, Konstanz 1992, 63–73.
21. R. Wille: Plädoyer für eine philosophische Grundlegung der Begrifflichen Wissensverarbeitung. In: R. Wille, M. Zickwolff (eds.): Begriffliche Wissensverarbeitung – Grundfragen und Methoden. B.I.-Wissenschaftsverlag, Mannheim 1994, 11–25.
22. R. Wille: The basic theorem of triadic concept analysis. Order 12 (1995), 149–158.
23. R. Wille: Restructuring mathematical logic: an approach based on Peirce's pragmatism. In: A. Ursini, P. Agliano (eds.): Logic and Algebra. Marcel Dekker, New York 1996, 267–281.
24. R. Wille: Conceptual structures of multicontexts. In: P. W. Eklund, G. Ellis, G. Mann (eds.): Conceptual Structures: Knowledge Representation as Interlingua. Springer, Berlin-Heidelberg-New York 1996, 23–39.
25. R. Wille: MUTABOR – ein Medium für musikalische Erfahrungen. In: M. Warnke, W. Coy, G. C. Tholen (eds.): Hyperkult: Geschichte, Theorie und Kontext digitaler Medien. Stroemfeld Verlag, Basel 1997, 383–391.
26. R. Wille: Conceptual Graphs and Formal Concept Analysis. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin-Heidelberg-New York 1997, 290–303.
Powerset Trilattices
K. Biedermann
Technische Universität Darmstadt, Fachbereich Mathematik, Schloßgartenstr. 7, D-64289 Darmstadt,
[email protected]

Abstract. The Boolean lattices are fundamental algebraic structures in Lattice Theory and Mathematical Logic. Since the triadic approach to Formal Concept Analysis gave rise to the triadic generalization of lattices, the trilattices, it is natural to ask for the triadic analogue of Boolean lattices, the Boolean trilattices. The first step in establishing Boolean trilattices is the study of powerset trilattices, which are the triadic generalization of powerset lattices. In this paper, an order-theoretic characterization of the powerset trilattices as certain B-trilattices is presented. In particular, the finite B-trilattices are (up to isomorphism) just the finite powerset trilattices; they have 3^n elements. Further topics are the triadic de Morgan laws, the cycles of triadic complements, which constitute the triadic complementation, and the atomic cycles, which take over the role of the atoms in the theory of Boolean lattices.
1 Introduction
The idea of a formalization of traditional philosophical logic, which is based on concepts, judgments, and conclusions as the three essential main functions of thinking, led to a unification of the Theory of Conceptual Graphs and the Theory of Formal Concept Analysis (cf. [Wi97]). It has turned out that for simple conceptual graphs without nesting the dyadic setting of Formal Concept Analysis is appropriate, but conceptual graphs with nesting require the triadic approach to Formal Concept Analysis, Triadic Concept Analysis. This is elaborated in the paper "Triadic Concept Graphs" by R. Wille in this volume. The triadic approach to Formal Concept Analysis, on the other hand, gave rise to a new class of algebraic structures, the so-called trilattices (cf. [Wi95] and [Bi98]), which are the triadic generalization of lattices. Since Boolean lattices are fundamental algebraic structures in Lattice Theory and Mathematical Logic, it is natural to ask for the triadic analogue of Boolean lattices, the Boolean trilattices, which play a role in the triadic case similar to that of Boolean lattices in the dyadic case. The first step in establishing Boolean trilattices, done in this paper, is the study of powerset trilattices as the triadic generalization of powerset lattices. In the following section, the basic notions and results about trilattices are given. Some first properties of powerset trilattices will be presented in the third section; most of them correspond to familiar results concerning powersets. In the fourth section, B-trilattices are defined as special (triadically) complemented trilattices, and it is shown that, as for Boolean lattices, the triadic complementation in B-trilattices is unique. The atomic cycles, which correspond to the atoms in the theory of Boolean lattices, enable us to give a first characterization of the powerset trilattices as certain B-trilattices. In particular, the finite B-trilattices turn out to be the finite powerset trilattices, just as the finite Boolean lattices are (up to isomorphism) the finite powerset lattices. Before motivating powerset trilattices, we must introduce trilattices as special triordered sets – just as lattices are special ordered sets. More about Formal Concept Analysis and Lattice Theory can be found in [GW96] and [DP90]. For basic results about Boolean lattices, the reader is referred, for example, to the article by R. W. Quackenbush in [Gr79] and to [MB89].

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 209–221, 1998.
© Springer-Verlag Berlin Heidelberg 1998
2 Trilattices
A triordered set is a relational structure (P, ≲1, ≲2, ≲3) for which the relations ≲i are quasiorders with ∼i := ≲i ∩ ≳i for i = 1, 2, 3, such that the following conditions hold for all x, y ∈ P and {i, j, k} = {1, 2, 3}:
(1) x ∼i y, x ∼j y implies x = y (uniqueness condition),
(2) x ≲i y, x ≲j y implies x ≳k y (antiordinality).
A triordered set P := (P, ≲1, ≲2, ≲3) gives rise to the ordered sets (Pi, ≤i) with i = 1, 2, 3, where Pi := {[x]i | x ∈ P} with [x]i := {y ∈ P | x ∼i y} and [x1]i ≤i [x2]i :⇔ x1 ≲i x2. They will be called the ordered structures of the triordered set P. Triordered sets are represented by so-called triadic diagrams as depicted in Fig. 1. The elements of P correspond to the little circles in the interior triangular net, showing the so-called geometric structure (P, ∼1, ∼2, ∼3). Following the parallel lines to one of the three side diagrams, which represent the ordered structures (Pi, ≤i), i = 1, 2, 3, yields the corresponding classes of i-equivalent elements. So, from the three ordered structures, one can read off whether two elements are comparable with respect to some quasiorder, while from the interior net one can see whether two elements are i-equivalent to one another. More about triadic diagrams can be found in [LW95] and [Bi97]. A mapping ϕ : P → Q between triordered sets P := (P, ≲1, ≲2, ≲3) and Q := (Q, ≲1, ≲2, ≲3) is i-isotone for i ∈ {1, 2, 3} if x ≲i y ⇒ ϕ(x) ≲i ϕ(y) holds for all x, y ∈ P. If these and also the converse implications x ≲i y ⇐ ϕ(x) ≲i ϕ(y) are satisfied for all i ∈ {1, 2, 3}, then ϕ is a triorder-embedding, which is automatically injective (use the uniqueness condition). A triorder-isomorphism is a surjective triorder-embedding. According to the order-theoretic approach, trilattices are special triordered sets in which certain operations, the so-called ik-joins, exist.
These operations correspond to meet and join in lattice theory and are defined in the following way: Let (P, ≲1, ≲2, ≲3) be a triordered set, let X and Y be subsets of P, and let {i, j, k} = {1, 2, 3}. An element b ∈ P is called an ik-bound of (X, Y) if x ≲i b for
all x ∈ X and y ≲k b for all y ∈ Y. An ik-bound l of (X, Y) is called an ik-limit of (X, Y) if l ≳j b for all ik-bounds b of (X, Y). An ik-limit l̃ of (X, Y) is called the ik-join of (X, Y) and denoted by X∇ik Y if l̃ ≲k l for all ik-limits l of (X, Y). It is easy to see that there is at most one such ik-limit of (X, Y). A trilattice is a triordered set L := (L, ≲1, ≲2, ≲3) in which all the ik-joins {x1, x2}∇ik {y1, y2} exist, where x1, x2, y1, y2 ∈ L and {i, j, k} = {1, 2, 3}. If X = {x} and Y = {y} we write shortly x∇ik y for {x}∇ik {y}. A triordered set (L, ≲1, ≲2, ≲3) is called a complete trilattice if, for arbitrary sets X, Y ⊆ L, all the ik-joins X∇ik Y exist. In particular, a complete trilattice is bounded by Oj := L∇ik L = L∇ki L where {i, j, k} = {1, 2, 3}. For a bounded trilattice L we define 0i := [Oi]i and 1i := [Oj]i = [Ok]i where {i, j, k} = {1, 2, 3}, and also the boundary of L by b(L) := {x ∈ L | x ∼i Oi for some i ∈ {1, 2, 3}}. The best way to motivate in which sense the powerset trilattices are the triadic generalization of powerset lattices is to present them within the framework of Triadic Concept Analysis. So we briefly recall some basic definitions and results, which can be found in [Wi95] and [WZ98]. For an immediate study of the powerset trilattices, the reader is directly referred to the first definition in the following section. A triadic context is a quadruple K := (K1, K2, K3, Y) consisting of a set K1 of objects, a set K2 of attributes, a set K3 of conditions, and a ternary relation Y ⊆ K1 × K2 × K3 indicating under which condition b an object g has an attribute m or, in symbols, (g, m, b) ∈ Y. For two subsets X1 ⊆ K1 and X2 ⊆ K2 a derivation operator can be defined in the following way: ⟨X1, X2⟩^(3) := {a3 ∈ K3 | (x1, x2, a3) ∈ Y for all x1 ∈ X1 and x2 ∈ X2}, and similarly for ⟨X1, X3⟩^(2) and ⟨X2, X3⟩^(1).
A triple A := (A1, A2, A3) ∈ P(K1) × P(K2) × P(K3) is said to be a triadic concept of K if Aj = ⟨Ai, Ak⟩^(j) for all {i, j, k} = {1, 2, 3} with i < k. It follows that the triadic concepts (A1, A2, A3) of K are precisely those triples in P(K1) × P(K2) × P(K3) which are maximal with respect to the component-wise set inclusion. With respect to the triadic context K, which can be understood as a three-dimensional cross table, a triadic concept can be understood as a maximal, not necessarily connected box which is completely filled with entries of the relation Y. The set T(K) of all triadic concepts of K can be endowed with three quasiorders ≲i, i ∈ {1, 2, 3}, by (A1, A2, A3) ≲i (B1, B2, B3) :⇔ Ai ⊆ Bi. It is easy to see that (T(K), ≲1, ≲2, ≲3) is a triordered set. From the Basic Theorem of Triadic Concept Analysis it even follows that (T(K), ≲1, ≲2, ≲3) is a complete trilattice, which is called the concept trilattice of K. The ik-joins can be described similarly to the following 13-join for subsets X, Y ⊆ T(K):

X∇13 Y = (⟨⟨X̃, Ỹ⟩^(2), Ỹ⟩^(1), ⟨X̃, Ỹ⟩^(2), ⟨⟨⟨X̃, Ỹ⟩^(2), Ỹ⟩^(1), ⟨X̃, Ỹ⟩^(2)⟩^(3)),

where X̃ := ⋃{X1 | (X1, X2, X3) ∈ X} and Ỹ := ⋃{Y3 | (Y1, Y2, Y3) ∈ Y}.
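The derivation operators and the concept condition above can be checked by brute force on a tiny context. The following sketch (our own illustration, not from the paper; the function names are ours) enumerates the triadic concepts of the triadic powerset context K_M of the following section for M = {0, 1}:

```python
from itertools import combinations, product

def subsets(s):
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

# Triadic powerset context K_M for M = {0, 1}: objects = attributes =
# conditions = M, and Y_M = M^3 minus the diagonal triples (a, a, a).
M = {0, 1}
Y = {t for t in product(M, repeat=3) if len(set(t)) > 1}

def derive(Xa, Xb, slot):
    """Derivation operators: slot=3 gives <X1, X2>^(3), slot=2 gives
    <X1, X3>^(2), slot=1 gives <X2, X3>^(1)."""
    def ok(u, x, y):
        t = {3: (x, y, u), 2: (x, u, y), 1: (u, x, y)}[slot]
        return t in Y
    return frozenset(u for u in M
                     if all(ok(u, x, y) for x in Xa for y in Xb))

# A triple (A1, A2, A3) is a triadic concept iff A_j = <A_i, A_k>^(j)
# for all three choices of j.
concepts = [(A1, A2, A3) for A1 in subsets(M)
            for A2 in subsets(M) for A3 in subsets(M)
            if A3 == derive(A1, A2, 3)
            and A2 == derive(A1, A3, 2)
            and A1 == derive(A2, A3, 1)]
print(len(concepts))  # 9 = 3^2 concepts, cf. Proposition 3.3 below
```

The three bounding elements, e.g. (M, M, ∅), appear among the nine concepts found.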
3 Powerset Trilattices
According to the ordinary (monadic) understanding, the powerset P(M) of a given set M consists of all subsets of M and is naturally ordered by set inclusion. We will call (P(M), ⊆) the powerset lattice of M. Within the dyadic setting of Formal Concept Analysis, the powerset P(M) is represented by the (dyadic) powerset context KM := (M, M, ≠) and its complete concept lattice B(KM) := (B(KM), ≤), which has as dyadic concepts the pairs (X1, X2) with X1 ∪ X2 = M and X1 ∩ X2 = ∅ or, equivalently, X1 = X2^C, and which is ordered by (X1, X2) ≤ (Y1, Y2) :⇔ X1 ⊆ Y1 (⇔ X2 ⊇ Y2). Consequently, one component already determines the other component, and (P(M), ⊆) is obviously isomorphic to B(KM). Structurally, the monadic and the dyadic understanding of a powerset are closely linked. In Triadic Concept Analysis, the idea of a powerset is represented by the triadic powerset context KM := (M, M, M, YM) with YM := M^3 \ {(a, a, a) | a ∈ M} and its complete concept trilattice T(KM). In [Wi95], it has already been observed that the triadic concepts (X1, X2, X3) of KM are characterized by X1 ∩ X2 ∩ X3 = ∅ and Xi ∪ Xj = M for distinct i, j ∈ {1, 2, 3}. We use these conditions to define the triadic generalization of powersets.

Definition 3.1: The powertriset of a set M is defined as

T(M) := {(X1, X2, X3) ∈ P(M)^3 | X1 ∩ X2 ∩ X3 = ∅ and Xi ∪ Xj = M for i ≠ j ∈ {1, 2, 3}}.

It can be equipped with three quasiorders by (X1, X2, X3) ≲i (Y1, Y2, Y3) :⇔ Xi ⊆ Yi where i = 1, 2, 3. The powerset trilattice is defined as T(M) := (T(M), ≲1, ≲2, ≲3) and denoted by T(n) if M is finite with |M| = n. Fig. 1 shows the powerset trilattice T(4). The triple represented by a little circle in the triangular net can be obtained by following straight lines to circles in the three side diagrams. There the numbers labelled "below" these circles are collected, where "below" in a side diagram is the direction opposite to that indicated by the arrow.
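Definition 3.1 is directly machine-checkable for small M. A brute-force sketch (illustrative only; the function names are ours):

```python
from itertools import combinations

def powerset(M):
    M = sorted(M)
    return [frozenset(c) for r in range(len(M) + 1)
            for c in combinations(M, r)]

def powertriset(M):
    """T(M) as in Definition 3.1: triples of subsets of M with empty
    triple intersection and pairwise unions equal to M."""
    Mf = frozenset(M)
    P = powerset(M)
    return [(X1, X2, X3) for X1 in P for X2 in P for X3 in P
            if not (X1 & X2 & X3)
            and X1 | X2 == Mf and X1 | X3 == Mf and X2 | X3 == Mf]

T4 = powertriset({1, 2, 3, 4})
print(len(T4))  # 81 = 3^4 elements, as Proposition 3.3 below asserts

# The element X highlighted in Fig. 1 is indeed in T(4):
X = (frozenset({1, 2}), frozenset({1, 3, 4}), frozenset({2, 3, 4}))
assert X in T4
# Each component is the symmetric difference of the other two
# (cf. Proposition 3.2 below; ^ is symmetric difference on sets):
assert all(X3 == (X1 ^ X2) for (X1, X2, X3) in T4)
```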
Since obviously T(M ) = T(KM ), powerset trilattices are complete trilattices, which will also follow from Proposition 3.2 and Proposition 3.5. Moreover, the ordered structures (T(M )i , ≤i ), i ∈ {1, 2, 3}, are isomorphic to (P(M ), ⊆) (cf. Proposition 3.4). Note the three bounding elements O1 = (∅, M, M ), O2 = (M, ∅, M ), and O3 = (M, M, ∅). Note also that a triple (X1 , X2 , X3 ) ∈ T(M ) is not determined by one of its components.
Fig. 1. The powerset trilattice T(4) with the cycle cX of complements of X = ({1, 2}, {1, 3, 4}, {2, 3, 4}).
Proposition 3.2: Let (X1, X2, X3) ∈ P(M)^3. Define the symmetric difference of Xi, Xj ⊆ M by Xi ∆ Xj := (Xi \ Xj) ∪ (Xj \ Xi). Then (X1, X2, X3) ∈ T(M) if and only if Xk = (Xi ∩ Xj)^C = Xi ∆ Xj for all {i, j, k} = {1, 2, 3}.
Proof: If (X1, X2, X3) ∈ T(M) and {i, j, k} = {1, 2, 3}, then X1 ∩ X2 ∩ X3 = ∅ is equivalent to Xk = (Xi ∩ Xj)^C. Moreover, Xi ∪ Xj = M implies (Xi ∩ Xj)^C = (Xi \ Xj) ∪ (Xj \ Xi). Conversely, we obtain Xi ∪ Xk = Xi ∪ (Xi ∩ Xj)^C = M and also X1 ∩ X2 ∩ X3 = ∅. □
So a triple in T(M) is determined by two of its components. It immediately follows that T(M) satisfies the uniqueness condition and the condition of antiordinality and is therefore a triordered set. Next we consider the finite case:
Proposition 3.3: If M is finite with |M| = n then |T(M)| = 3^n.
Proof: Assume X ⊆ M with |X| = k ≤ n as the first component of a triple in T(M). As the second component we can choose any Y ⊆ M with X ∪ Y = M; there are |P(X)| = 2^k such sets, each yielding the triple (X, Y, X ∆ Y) ∈ T(M). Altogether, there are ∑_{k=0}^{n} (n choose k) 2^k 1^{n−k} = (2 + 1)^n = 3^n triples in T(M). □
From the preceding proof it also becomes clear that, in general, any subset X ⊆ M can be enlarged to a triple in T(M) and can therefore occur in any of the three components. If we identify [(X1, X2, X3)]i with Xi for all (X1, X2, X3) ∈ T(M) and i ∈ {1, 2, 3}, it immediately follows:
Proposition 3.4: The ordered structures (T(M)i, ≤i), i ∈ {1, 2, 3}, of powerset trilattices are isomorphic to (P(M), ⊆).
The ik-joins in T(M) can be determined explicitly, which makes the triordered set T(M) a complete trilattice:
Proposition 3.5: Let X, Y ⊆ T(M) be arbitrary subsets. Define X̃ := ⋃{Xi | (X1, X2, X3) ∈ X} and Ỹ := ⋃{Yk | (Y1, Y2, Y3) ∈ Y} for {i, j, k} = {1, 2, 3}. Then X∇ik Y = (Z1, Z2, Z3) where
Zi = X̃ ∪ Ỹ^C,
Zj = (X̃ ∩ Ỹ)^C,
Zk = Ỹ.
In particular, the 13-join of ((X1, X2, X3), (Y1, Y2, Y3)) ∈ T(M)^2 is determined by (X1, X2, X3)∇13 (Y1, Y2, Y3) = (X1 ∪ Y3^C, (X1 ∩ Y3)^C, Y3).
Proof: We ensure first that Z := (Z1, Z2, Z3) ∈ T(M). Obviously, Zi ∪ Zk = M and, since Zj = X̃^C ∪ Ỹ^C, we also have Zi ∪ Zj = M and Zj ∪ Zk = M. Intersecting the three components yields Zi ∩ Zj ∩ Zk = (X̃ ∪ Ỹ^C) ∩ (X̃^C ∪ Ỹ^C) ∩ Ỹ = ((X̃ ∩ X̃^C) ∪ Ỹ^C) ∩ Ỹ = Ỹ^C ∩ Ỹ = ∅ and hence Z ∈ T(M). But Z is also an ik-bound of (X, Y) because (X1, X2, X3) ≲i Z for all (X1, X2, X3) ∈ X and (Y1, Y2, Y3) ≲k Z for all (Y1, Y2, Y3) ∈ Y. Among the triples in T(M) having X̃ in their i-th and Ỹ in their k-th component, Z has the greatest possible j-th component Zj = (X̃ ∩ Ỹ)^C such that Z1 ∩ Z2 ∩ Z3 = ∅ is still satisfied. Therefore Z is also an ik-limit of (X, Y). Since Z also has the smallest possible k-th component, it is the ik-join of (X, Y). □
Many equations valid in all powerset trilattices can now be deduced, but we will restrict ourselves to the triadic de Morgan laws after introducing the triadic complementation.
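For singleton arguments, the 13-join formula of Proposition 3.5 is a one-line set computation. A sketch (our own function name) on two elements of T({1, 2, 3}):

```python
def join13(x, y, M):
    """13-join of two elements of T(M) per Proposition 3.5:
    (X1 u Y3^C, (X1 n Y3)^C, Y3), complements taken within M."""
    M = frozenset(M)
    X1, Y3 = x[0], y[2]
    return (X1 | (M - Y3), M - (X1 & Y3), Y3)

M = {1, 2, 3}
x = (frozenset({1}), frozenset({2, 3}), frozenset({1, 2, 3}))
y = (frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3}))
z = join13(x, y, M)
# z = ({1}, {1, 2, 3}, {2, 3}): the third component equals that of y,
# illustrating x nabla_13 y ~3 y, and z again satisfies Definition 3.1.
print(z)
```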
Definition 3.6: For an element X := (X1, X2, X3) ∈ T(M) and a permutation σ ∈ S3, the σ-complement of X is defined by X^σ := (X_{σ⁻¹(1)}, X_{σ⁻¹(2)}, X_{σ⁻¹(3)}) ∈ T(M). The set cX := {X^σ | σ ∈ S3} will be called the cycle of (triadic) complements of X. Obviously, X^σ can be obtained from X by moving the i-th component Xi of X to the σ(i)-th position. Note that the cycle of complements of a boundary element
Powerset Trilattices
215
such as ({4}, {1, 2, 3, 4}, {1, 2, 3}) in Fig. 1 lies entirely on the boundary.
Proposition 3.7: In powerset trilattices T(M) the triadic de Morgan laws

(X∇ik Y)^σ = X^σ ∇_{σ(i)σ(k)} Y^σ

hold for all X, Y ∈ T(M), {i, j, k} = {1, 2, 3}, and σ ∈ S3.
Proof: Let X = (X1, X2, X3) and Y = (Y1, Y2, Y3). Then X∇ik Y = (Z1, Z2, Z3) with Zi := Xi ∪ Yk^C, Zj := (Xi ∩ Yk)^C, and Zk := Yk by Proposition 3.5. Since Xi is the σ(i)-th component of X^σ and Yk is the σ(k)-th component of Y^σ, it follows that X^σ ∇_{σ(i)σ(k)} Y^σ has Xi ∪ Yk^C as σ(i)-th component, (Xi ∩ Yk)^C as σ(j)-th component, and Yk as σ(k)-th component. This implies X^σ ∇_{σ(i)σ(k)} Y^σ = (Z1, Z2, Z3)^σ = (X∇ik Y)^σ. □
Note that the cycle of complements of a bounding element equals {O1, O2, O3} and has therefore three elements. Cycles of complements have the following properties:
Proposition 3.8: Let cX be the cycle of complements of X ∈ T(M) and let {i, j, k} = {1, 2, 3}. Then all Y ∈ cX satisfy
1. Y ∼k Y^{(ij)} and
2. [Y]i ∨ [Y^{(ij)}]i = 1i and [Y]i ∧ [Y^{(ij)}]i ∧ [Y^{(ik)}]i = 0i.
Proof: 1.: The k-th component of Y^{(ij)} = (Y_{(ij)(1)}, Y_{(ij)(2)}, Y_{(ij)(3)}) is unchanged. 2.: This follows immediately from Definition 3.1, Definition 3.6, and (the proof of) Proposition 3.4. □
In the following section it is shown that cX has six elements if X ∉ {O1, O2, O3} (cf. Proposition 4.6). Note also that (Y^σ)^τ = Y^{σ∘τ} for all Y ∈ cX and σ, τ ∈ S3.
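The σ-complements and the de Morgan laws of Proposition 3.7 can be verified exhaustively on a small example. A sketch (function names are ours; the ik-join follows the singleton case of Proposition 3.5):

```python
from itertools import permutations

M = frozenset({1, 2, 3})

def ik_join(x, y, i, k):
    """x nabla_ik y per Proposition 3.5 (singleton case): the i-th
    component is X_i u Y_k^C, the j-th (X_i n Y_k)^C, the k-th Y_k."""
    j = ({1, 2, 3} - {i, k}).pop()
    Xi, Yk = x[i - 1], y[k - 1]
    z = [None, None, None]
    z[i - 1] = Xi | (M - Yk)
    z[j - 1] = M - (Xi & Yk)
    z[k - 1] = Yk
    return tuple(z)

def sigma_complement(x, sigma):
    """x^sigma (Definition 3.6): component i of x moves to position
    sigma(i); sigma is given as a dict on {1, 2, 3}."""
    z = [None, None, None]
    for i in (1, 2, 3):
        z[sigma[i] - 1] = x[i - 1]
    return tuple(z)

x = (frozenset({1}), frozenset({2, 3}), M)
y = (frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3}))
for p in permutations((1, 2, 3)):
    sigma = dict(zip((1, 2, 3), p))
    for i, k in permutations((1, 2, 3), 2):
        lhs = sigma_complement(ik_join(x, y, i, k), sigma)
        rhs = ik_join(sigma_complement(x, sigma),
                      sigma_complement(y, sigma), sigma[i], sigma[k])
        assert lhs == rhs  # (x nabla_ik y)^sigma = x^sigma nabla y^sigma
print("triadic de Morgan laws verified for all sigma and all ik")
```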
Fig. 2. The smallest non-trivial trilattice B_3, isomorphic to T(1).
Powerset lattices are order-isomorphic to powers of the smallest non-trivial lattice having two elements. We finish this section with the corresponding triadic
result and define the smallest non-trivial trilattice B_3 := (B3, ≲1, ≲2, ≲3) by B3 := {O1, O2, O3} with Oj ∼i Ok and Oi ≲i Oj for all {i, j, k} = {1, 2, 3} (cf. Fig. 2).
Theorem 1: For any powerset trilattice T(M) the mapping ϕ : T(M) → B_3^M with ϕ(X1, X2, X3)(x) := Oi if x ∉ Xi defines a triorder-isomorphism, i.e., any powerset trilattice is (triorder-)isomorphic to a power of B_3.
Proof: Let (X1, X2, X3) ∈ T(M) and let x ∈ M. Since x ∉ Xi for some i ∈ {1, 2, 3} implies x ∈ Xj and x ∈ Xk for {j, k} = {1, 2, 3} \ {i}, it follows that ϕ is well-defined and ϕ(X1, X2, X3) ∈ B3^M. But ϕ is also a triorder-embedding: Let X, Y ∈ T(M) with X ≲i Y, i.e. Xi ⊆ Yi where i ∈ {1, 2, 3}. If x ∈ Xi then ϕ(X)(x) ∼i ϕ(Y)(x), if x ∈ Yi \ Xi then ϕ(X)(x) = Oi ≲i ϕ(Y)(x), and if x ∉ Yi then ϕ(X)(x) = Oi = ϕ(Y)(x). Thus ϕ(X) ≲i ϕ(Y). Conversely, let ϕ(X) ≲i ϕ(Y) and let x ∈ Xi. Then Oi ≠ ϕ(X)(x) and hence ϕ(Y)(x) ≠ Oi, such that x ∈ Yi. It follows X ≲i Y. It remains to show that ϕ is surjective. So let f ∈ B3^M and define X := (X1, X2, X3) by Xi := {x ∈ M | f(x) ≠ Oi} for i ∈ {1, 2, 3}. Then obviously X ∈ T(M) with ϕ(X) = f, which completes the proof. □
Note that the powerset trilattice T(4) in Fig. 1 is isomorphic to B_3^4.
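Theorem 1 can also be confirmed computationally for a small M. In the following sketch (our own illustration), a function M → B_3 is encoded as a tuple over {1, 2, 3}; ϕ sends each m ∈ M to the unique index i with m ∉ Xi:

```python
from itertools import combinations

M = [1, 2, 3]

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def powertriset(M):
    """T(M) per Definition 3.1, by brute force."""
    Mf = frozenset(M)
    P = powerset(M)
    return [(A, B, C) for A in P for B in P for C in P
            if not (A & B & C)
            and A | B == Mf and A | C == Mf and B | C == Mf]

def phi(x):
    """phi(X1, X2, X3)(m) := O_i for the unique i with m not in X_i
    (Theorem 1), encoded as the tuple of those indices i over M."""
    return tuple(next(i for i in (1, 2, 3) if m not in x[i - 1]) for m in M)

T3 = powertriset(M)
images = {phi(x) for x in T3}
print(len(T3), len(images))  # 27 27: phi is a bijection onto B_3^M
```

That the index i in `phi` is unique for each m rests on the conditions of Definition 3.1, exactly as in the well-definedness part of the proof.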
Fig. 3. A trilattice with isomorphic Boolean lattices as side diagrams which is not a powerset trilattice.
4 B-Trilattices
According to Proposition 3.4, powerset trilattices have isomorphic powerset lattices as ordered structures. The trilattice in Fig. 3 has this property but is different from T(3). If we add two further conditions, namely a property of the ik-joins in powerset trilattices (Lemma 4.1) and the triadic complementation (Definition 4.2), and in this way define B-trilattices, the powerset trilattices can be characterized as certain B-trilattices. In particular, the finite B-trilattices are (up to isomorphism) just the powerset trilattices T(n).
Lemma 4.1: Let L := (L, ≲1, ≲2, ≲3) be a bounded trilattice in which x∇ik y ∼k y holds for all i ≠ k in {1, 2, 3}. If z ∈ L then z^{k,i} := Ok ∇ik z is the only boundary element b ∈ b(L) satisfying b ∼k z and b ≳i z. In particular, the boundary elements already determine the ordered structures (Lk, ≤k), k ∈ {1, 2, 3}.
Proof: Obviously z^{k,i} ∼i Ok ≳i z and hence z^{k,i} ∈ b(L). Moreover z^{k,i} ∼k z such that [z]k = [z^{k,i}]k, showing that Lk = b(L)k := {[b]k | b ∈ b(L)}. □
Definition 4.2: A subset c := {c1, c2, c3, c4, c5, c6} ⊆ L of a bounded trilattice L is called a cycle if c1 ∼3 c2 ∼2 c3 ∼1 c4 ∼3 c5 ∼2 c6 ∼1 c1, c1 ∼2 c4, c2 ∼1 c5, and c3 ∼3 c6 (cf. Fig. 4). A cycle of (triadic) complements is a cycle c in which for each x ∈ c there are elements y, z ∈ c such that for all i ∈ {1, 2, 3} the join condition [x]i ∨ [y]i = [y]i ∨ [z]i = [x]i ∨ [z]i = 1i and the meet condition [x]i ∧ [y]i ∧ [z]i = 0i are satisfied.
Fig. 4. A cycle with six elements.
Lemma 4.3: A cycle c of a trilattice L has one, three, or six elements. A cycle of complements has one element if and only if |L| = 1.
Proof: Two elements ck, cl ∈ c of a cycle c := {c1, c2, c3, c4, c5, c6} with k ≠ l can either be i-equivalent for some i ∈ {1, 2, 3} or not. If ck = cl it follows that |c| ≤ 3 in the first and |c| = 1 in the second case (apply the uniqueness condition). There are no cycles with two elements. If a cycle of complements has one element then |L| = 1 because of the meet and the join condition. □
Definition 4.4: A bounded trilattice L := (L, ≲1, ≲2, ≲3) is (triadically) complemented if any x ∈ L belongs to a cycle of complements.
Definition 4.5: A complemented trilattice B := (B, ≲1, ≲2, ≲3) is called a B-trilattice if its ordered structures are isomorphic Boolean lattices and x∇ik y ∼k y holds for all {i, j, k} = {1, 2, 3}.
The powerset trilattices are obviously B-trilattices. By Lemma 4.1, the boundary elements already determine the Boolean side diagrams. Moreover, as for Boolean lattices, the triadic complementation in B-trilattices is unique:
Proposition 4.6: Let c be a cycle of a B-trilattice B. If the join and the meet condition hold for x, y, z ∈ c for an index i ∈ {1, 2, 3}, then the (ordinary) complement [x]′i of [x]i in (Bi, ≤i) satisfies [x]′i = [y]i ∧ [z]i. Moreover, x belongs to exactly one cycle of complements, denoted by cx, and all cycles of complements except {O1, O2, O3} have six elements if |B| > 1.
Proof: By definition, the elements x, y, z satisfy [x]i ∧ ([y]i ∧ [z]i) = 0i. On the other hand, it follows that [x]i ∨ ([y]i ∧ [z]i) = ([x]i ∨ [y]i) ∧ ([x]i ∨ [z]i) = 1i ∧ 1i = 1i, and thus [y]i ∧ [z]i is the (unique) complement of [x]i in (Bi, ≤i). To show the uniqueness of the triadic complementation for x ∈ B we start with c1 := x ∈ c of a cycle c of complements, using the same notation as in Fig. 4, and determine the position of c2 ∈ c in the triangular net: Let x1 := O1 ∇31 c1. Then x1 ∼1 c1 ∼1 c6 and x1 ≳3 c1, c6, from which x1 ≲2 c1, c6 follows by antiordinality. Thus [x1]2 ≤2 [c1]2 ∧ [c6]2 = [c2]′2, which implies [x1]′2 ≥2 [c2]2. Thus, for x2 ∈ B with x2 ∼3 x1 and [x2]2 = [x1]′2 we obtain x2 ≳2 c2, and it easily follows that also [x2]1 = [x1]′1 holds. As for x1 we similarly deduce that x3 := O2 ∇32 c2 ≲1 c2, c3, such that [x3]1 ≤1 [c2]1 ∧ [c3]1 = [c1]′1 = [x1]′1 = [x2]1, i.e. x3 ≲1 x2.
On the other hand, x3 ≳1 x2 since x3 ∼2 c2 ≲2 x2 and x3 ∼3 x2, and hence x3 ∼1 x2, which implies x3 = x2 because of the uniqueness condition. Therefore [x1]′2 = [c2]2, which fixes the position of c2 ∼3 c1. Now, repeating this argument for c2 instead of c1 yields [c4]3 = [O2 ∇12 c2]′3, fixing the positions of c4 ∼2 c1 and c5 ∼1 c2 and, because c is a cycle, also c3 and c6. In this way, we have determined the unique cycle of complements cx of x. If a cycle c of complements has three elements then it follows that c = {O1, O2, O3} because of the join and the meet condition of cycles of complements. □
Since two cycles of complements of a B-trilattice are either disjoint or identical, we can define the triadic complementation as a (unary) operation:
Definition 4.7: Let B be a B-trilattice, let {i, j, k} = {1, 2, 3} and let (ij) ∈ S3 be a transposition. Then we define Ok^{(ij)} := Ok and Oi^{(ij)} := Oj, and for all non-bounding elements x ∈ B we fix x^{(ij)} ∈ cx by x^{(ij)} ≠ x and x^{(ij)} ∼k x, and define x^{σ∘τ} := (x^σ)^τ if σ, τ ∈ S3. The element x^σ is called the σ-complement of x.
According to the preceding proof, x^{(ij)} can be determined as the intersection of [x]k and [Oi ∇ki x]′j, because {x^{(ij)}} = [x]k ∩ [Oi ∇ki x]′j, and therefore as the intersection point of the corresponding lines in the triangular net of the triadic diagram. Crucial for the characterization of powerset lattices as atomic and complete Boolean lattices are the so-called atoms, which are the upper covers of the smallest element 0. (Recall that a lattice (L, ≤) with 0 is called atomic if for any x ∈ L \ {0} there is an atom a with a ≤ x.) Here special cycles of complements, the so-called atomic cycles, will be needed to characterize the powerset trilattices.
Definition 4.8: An element a ∈ L of a trilattice L is called an i-atom (i-coatom) for i ∈ {1, 2, 3} if [a]i is an atom (coatom) of (Li, ≤i), i.e., b ≲i a and b ≁i a implies b = Oi (a ≲i b and b ≁i a implies b ∈ 1i).
Proposition 4.9: Let a ∈ B be a k-atom of a B-trilattice B for some k ∈ {1, 2, 3}. Then a is a boundary element and, for all i ∈ {1, 2, 3} and σ ∈ S3, the σ-complements of a are either i-atoms, i-coatoms, or in 1i. Moreover ca ⊆ b(B).
Proof: Suppose there is a k-atom x with x ∉ b(B). With the notation as in Lemma 4.1 it follows that x^{j,i} ∼j x and x^{j,i} ≳i x where {i, j, k} = {1, 2, 3}. By antiordinality we obtain x^{j,i} ≲k x, which implies x^{j,i} = Ok because x is a k-atom. But then x ∼j Ok follows, which contradicts x ∉ b(B). Therefore a k-atom a is a boundary element, and hence there are just two k-atoms, namely a^{k,i} and a^{k,j} with {i, j, k} = {1, 2, 3}. Moreover, a^{k,j} is an i-coatom because y ≳i a^{k,j} implies y^{i,j} ≲k a^{k,j} by antiordinality, such that y^{i,j} = Ok and hence y ∈ 1i. Since a^{k,i} ∈ 1i, it follows from Proposition 4.6 that ([a^{k,i}]i ∧ [a^{k,j}]i)′ is an atom in (Bi, ≤i), i.e. the (ik)-complements of the k-atoms are i-atoms. Repeating this argument finally yields that for all i ∈ {1, 2, 3} the cycle of complements ca consists of i-atoms, i-coatoms, and elements in 1i, and that ca ⊆ b(B). □
Definition 4.10: A cycle of complements as in the previous proposition is called an atomic cycle (of complements) and abbreviated by a. If i ∈ {1, 2, 3} then ai ⊆ a denotes the subset of i-atoms in a, and A(B) is defined as the set of all atomic cycles of the B-trilattice B.
The Characterization Theorem: Let B be a B-trilattice with atomic and complete Boolean lattices as side diagrams. Then a triorder-isomorphism ϕ : B → T(A(B)) can be defined by ϕ(x) := (ϕ(x)1, ϕ(x)2, ϕ(x)3) with ϕ(x)i := {a ∈ A(B) | a ≲i x for some a ∈ ai}, i.e. the B-trilattices with ordered structures isomorphic to powerset lattices are (up to isomorphism) exactly the powerset trilattices.
Proof: First of all we show that ϕ is well-defined, i.e. ϕ(x) ∈ T(A(B)). Obviously, ϕ({O1, O2, O3}) ⊆ T(A(B)). So, let x ∈ B \ {O1, O2, O3}, let {i, j, k} = {1, 2, 3}
and assume a ∉ ϕ(x)k. Then, for a ∈ ai with a = a^{i,j}, it follows that a ≳k x and, since a ≳j x, also a ≲i x. Thus a ∈ ϕ(x)i, showing that ϕ(x)i ∪ ϕ(x)k = A(B). To ensure that also ϕ(x)1 ∩ ϕ(x)2 ∩ ϕ(x)3 = ∅, assume a ∈ ϕ(x)i for some i ∈ {1, 2, 3}, i.e. a ≲i x for some a ∈ ai. Because of the meet condition, there is an index k ∈ {1, 2, 3} \ {i} such that x^{(ik)} ≵i a. If we choose a ∈ ai with a = a^{i,j} for j ∈ {1, 2, 3} \ {i, k}, then we get x^{(ik)} ≲i a^{(ik)}. Moreover, since x^{(ijk)} = (x^{(ik)})^{(jk)} ∼i x^{(ik)} we also have x^{(ijk)} ≲i a^{(ik)} and, with x^{(ik)}, x^{(ijk)} ≲j a^{(ik)}, it follows that x^{(ik)}, x^{(ijk)} ≳k a^{(ik)}. Because x, x^{(ik)}, and x^{(ijk)} satisfy the meet condition for k, we obtain x ≵k a^{(ik)}, i.e. a ∉ ϕ(x)k and ϕ(x)1 ∩ ϕ(x)2 ∩ ϕ(x)3 = ∅. (Note that a ∈ ϕ(x)j because of ϕ(x)j ∪ ϕ(x)k = A(B).) Next we ensure that ϕ is a triorder-embedding. Let x, y ∈ B with x ≲i y and i ∈ {1, 2, 3}. Then it is obvious that ϕ(x) ≲i ϕ(y). Conversely, let ϕ(x) ≲i ϕ(y), i.e. ϕ(x)i ⊆ ϕ(y)i. Since (Bi, ≤i) is an atomic and complete Boolean lattice it follows that [x]i = ⋁{[a]i | a ∈ ai, a ∈ ϕ(x)i} ≤i ⋁{[a]i | a ∈ ai, a ∈ ϕ(y)i} = [y]i and thus x ≲i y. Therefore it remains to show that ϕ is surjective. So let (A1, A2, A3) ∈ T(A(B)), let x1 ∈ ⋁{[a]1 | a ∈ a1, a ∈ A1} and x3 ∈ ⋁{[a]3 | a ∈ a3, a ∈ A3}, and define y := x1 ∇13 x3 and z := x3 ∇31 x1. Then y ∼2 z and z ≲1 y. We will show that y = z. Suppose y ≁1 z. Then there is a 1-atom a with a ≲1 y but a ≴1 z, and we can choose a such that a = a^{1,3}. The first inequality then implies a := ca ∈ ϕ(y)1 and the latter yields a^{(12)} ≳1 z. But since also a^{(12)} ≳3 z we obtain a^{(12)} ≲2 z ∼2 y and hence a ∈ ϕ(y)2. But we can also show a ∈ ϕ(y)3 as follows. Since (B1, ≤1) is an atomic and complete Boolean lattice and z = x3 ∇31 x1 ∼1 x1, we conclude A1 = ϕ(x1)1 = ϕ(z)1 and, since a ≴1 z with a as the already chosen 1-atom, it follows that a ∉ ϕ(z)1 = A1, which implies a ∈ A3 because of A1 ∪ A3 = A(B).
But, since also y = x1 ∇13 x3 ∼3 x3 implies A3 = ϕ(x3)3 = ϕ(y)3, we get a ∈ ϕ(y)3. Therefore a ∈ ϕ(y)1 ∩ ϕ(y)2 ∩ ϕ(y)3, which contradicts ϕ(y) ∈ T(A(B)). Consequently y ∼1 z and hence y = z. But then A1 = ϕ(y)1 and A3 = ϕ(y)3, and therefore ϕ(y) = (A1, A2, A3), showing that ϕ is a surjective triorder-embedding, i.e. an isomorphism. □

Theorem 1 and the Characterization Theorem immediately yield:

Corollary: Any finite B-trilattice B with n atomic cycles satisfies B ≅ T(n) ≅ B3ⁿ.
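The counting behind the corollary can be checked for small n. The sketch below is illustrative only; it assumes the description of T(A) used in the proof, namely triples (A1, A2, A3) of subsets of A with pairwise unions equal to A and empty total intersection, so that each atomic cycle lies in exactly two components and |T(n)| = 3^n:

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def powerset_trilattice(A):
    """Triples (A1, A2, A3) with Ai ∪ Aj = A for i != j and A1 ∩ A2 ∩ A3 = ∅."""
    A = frozenset(A)
    subsets = powerset(A)
    return [(A1, A2, A3)
            for A1 in subsets for A2 in subsets for A3 in subsets
            if A1 | A2 == A and A1 | A3 == A and A2 | A3 == A
            and not (A1 & A2 & A3)]

# Each element lies in exactly two of the three components, so |T(n)| = 3^n.
for n in range(4):
    assert len(powerset_trilattice(range(n))) == 3 ** n
```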
Powerset Trilattices
221

5 Discussion

Should we now identify the B-trilattices with the Boolean trilattices? In fact, the B-trilattices have properties similar to those of Boolean lattices, such as unique complementation. They also play an analogous role in the Characterization Theorem. In this sense, the presented order-theoretic approach to B-trilattices corresponds to the definition of Boolean lattices as complemented and distributive lattices. From an algebraic point of view, Boolean lattices can equivalently be understood as algebras in which certain equations hold: the lattice equations, the distributive laws, and identities concerning the (ordinary) complement. Trilattices, on the other hand, can be characterized algebraically by certain trilattice equations (cf. [Bi98]). But there is not yet an algebraic way to understand the B-trilattices. In fact, it has not been possible to express the defining properties of B-trilattices equivalently by suitable triadic equations. It becomes, for example, quite complicated to describe the distributive laws of the ordered structures as terms with ik-joins. In general, there seems to be no immediate and simple connection between the triadic operations of B-trilattices – the ik-joins and the σ-complements – and the Boolean operations for their side diagrams. So the next step is to find such unifying and simplifying triadic equations as the distributive laws and also identities concerning the triadic complements.
References

Bi97. K. Biedermann: How Triadic Diagrams Represent Conceptual Structures. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.), Conceptual Structures: Fulfilling Peirce's Dream. Lecture Notes in Artificial Intelligence 1257. Springer-Verlag, Berlin-Heidelberg-New York 1997, 304–317.
Bi98. K. Biedermann: An Equational Theory for Trilattices. FB4-Preprint, TU Darmstadt 1998.
DP90. B.A. Davey, H.A. Priestley: Introduction to Lattices and Order. Cambridge University Press, Cambridge 1990.
Gr79. G. Grätzer: Universal Algebra. Springer-Verlag, Berlin-Heidelberg-New York 1979.
GW96. B. Ganter, R. Wille: Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin-Heidelberg 1996.
LW95. F. Lehmann, R. Wille: A Triadic Approach to Formal Concept Analysis. In: G. Ellis, R. Levinson, W. Rich, J.F. Sowa (eds.), Conceptual Structures: Applications, Implementations and Theory. Lecture Notes in Artificial Intelligence 954. Springer-Verlag, Berlin-Heidelberg-New York 1995, 32–43.
MB89. J.D. Monk, R. Bonnet (eds.): Handbook of Boolean Algebras. Elsevier Science Publishers B.V., Amsterdam-New York 1989.
Wi95. R. Wille: The Basic Theorem of Triadic Concept Analysis. Order 12 (1995), 149–158.
Wi97. R. Wille: Conceptual Graphs and Formal Concept Analysis. In: D. Lukose, H. Delugach, M. Keeler, L. Searle (eds.), Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin-Heidelberg-New York 1997, 290–303.
WZ98. R. Wille, M. Zickwolff: Grundlegung einer Triadischen Begriffsanalyse. In: G. Stumme, R. Wille (eds.), Wissensverarbeitung: Methoden und Anwendungen. Springer, Berlin-Heidelberg 1998.
Simple Concept Graphs: A Logic Approach

Susanne Prediger

Technische Universität Darmstadt, Fachbereich Mathematik
Schloßgartenstr. 7, D-64289 Darmstadt
[email protected]

Abstract. Conceptual Graphs and Formal Concept Analysis are combined by developing a logical theory for concept graphs of relational contexts. To this end, concept graphs are introduced as syntactical constructs, and their semantics is defined based on relational contexts. For this contextual logic, a sound and complete system of inference rules is presented, and a standard graph is introduced that entails all concept graphs valid in a given relational context. A possible use for conceptual knowledge representation and processing is suggested.
1 Introduction
The first approach combining the theory of Conceptual Graphs and Formal Concept Analysis was described by R. Wille in [12]. To connect the conceptual structures of both theories, the concept types appearing in conceptual graphs were considered to be formal concepts of formal contexts, the constituents of Formal Concept Analysis. To facilitate this connection, concept graphs, appropriate mathematizations of conceptual graphs, were introduced. The theory of concept graphs of formal contexts was developed as a mathematical structure theory in which concept graphs of formal contexts are realizations of abstract concept graphs. In this paper, a logic approach is presented by developing a logical theory for concept graphs. To this end, concept graphs are defined as syntactical constructs over an alphabet of object names, concept names and relation names (Section 2). Then, a contextual semantics is specified: we interpret the syntactical names by objects, concepts and relations of a relational context (Section 3). In this way, we can profit from all the notions for concepts that have been developed in Formal Concept Analysis. The introduced contextual logic is carried on by the study of inferences (Section 4). Based on a model-theoretic notion of entailment for concept graphs, a sound and complete set of inference rules is established and compared to the notion of projections between concept graphs. With the standard model, we can present another interesting tool for reasoning with concept graphs. In Section 5, we introduce a standard graph for a given relational context; it gives a basis of all concept graphs valid in the relational context. In the last section, we suggest how this approach can be used for knowledge representation and processing.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 225–239, 1998.
© Springer-Verlag Berlin Heidelberg 1998
2 Syntax for the Language of Concept Graphs
We want to introduce concept graphs as syntactical constructs. For this, we need an alphabet of concept names, relation names and object names. As the theory of conceptual graphs provides an order-sorted logic, we start with ordered sets of names that are not necessarily lattices. These orders are determined by the taxonomies of the domains in view; they formalize ontological background knowledge.

Definition 1. An alphabet of concept graphs is a triple (C, G, R), where (C, ≤C) is a finite ordered set whose elements are called concept names, G is a finite set whose elements are called object names, and (R, ≤R) is a set, partitioned into finite ordered sets (Rk, ≤Rk)k=1,...,n, whose elements are called relation names.

Now we can introduce concept graphs as statements formulated with these syntactical names. That means, we consider the concept graphs to be the well-formed formulas of our formal language. In accordance with the first mathematization of conceptual graphs in [12], the structure of (simple) concept graphs is described by means of directed multi-hypergraphs and labeling functions.

Definition 2. A (simple) concept graph over the alphabet (C, G, R) is a structure G := (V, E, ν, κ, ρ), where
• (V, E, ν) is a (not necessarily connected) finite directed multi-hypergraph, i.e. a structure where V and E are finite sets whose elements are called vertices and edges respectively, and ν : E → ⋃_{k=1}^{n} V^k is a mapping (n ≥ 2),
• κ : V ∪ E → C ∪ R is a mapping such that κ(V) ⊆ C and κ(E) ⊆ R, and all e ∈ E with ν(e) = (v1, . . . , vk) satisfy κ(e) ∈ Rk,
• ρ : V → P(G)\{∅} is a mapping.
For an edge e ∈ E with ν(e) = (v1, . . . , vk), we define |e| := k, and we write ν(e)|i := vi and ρ(e) := ρ(v1) × . . . × ρ(vk).

Apart from some little differences, the concept graphs correspond to the simple conceptual graphs defined in [8] or [1]. We only use multi-hypergraphs instead of bipartite graphs in the mathematization.
The mapping ν assigns to every edge the tuple of all its incident vertices. The function κ labels the vertices and edges by concept and relation names, respectively, and the mapping ρ describes the references of every vertex. In contrast to Sowa, we allow references with more than one object name (i.e. individual marker), but no generic markers, i.e. existential quantifiers, yet. They can be introduced into the syntax easily (cf. [6]), but in this paper we want to put emphasis on the elementary language. That is why we can omit coreference links here, which are only relevant in connection with generic markers.
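Definition 2 translates directly into a data structure. The sketch below is purely illustrative (the paper prescribes no implementation): the class name and the well-formedness check are our own, and relation names are assumed here to be grouped by arity.

```python
from dataclasses import dataclass

@dataclass
class ConceptGraph:
    """A simple concept graph G = (V, E, nu, kappa, rho) over (C, G, R)."""
    V: set        # vertex identifiers
    E: set        # edge identifiers
    nu: dict      # edge -> tuple of incident vertices
    kappa: dict   # vertex/edge -> concept name or relation name
    rho: dict     # vertex -> non-empty set of object names

    def check_wellformed(self, concept_names, relations_by_arity):
        # kappa sends vertices to C and each edge e to a |nu(e)|-ary relation name
        assert all(self.kappa[v] in concept_names for v in self.V)
        assert all(self.kappa[e] in relations_by_arity[len(self.nu[e])]
                   for e in self.E)
        # every edge is incident only with existing vertices
        assert all(set(self.nu[e]) <= self.V for e in self.E)
        # references are non-empty
        assert all(self.rho[v] for v in self.V)
        return True

# The graph G3 from the paper's example: WOMAN: Witch --THREATEN--> CHILD: Hansel
g3 = ConceptGraph(
    V={"v1", "v2"}, E={"e1"},
    nu={"e1": ("v1", "v2")},
    kappa={"v1": "WOMAN", "v2": "CHILD", "e1": "THREATEN"},
    rho={"v1": {"Witch"}, "v2": {"Hansel"}},
)
g3.check_wellformed({"WOMAN", "CHILD", "ADULT", "HUMAN"}, {2: {"THREATEN"}})
```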
3 Semantics for Concept Graphs
We agree with J. F. Sowa when he writes about the importance of a semantics: "To make meaningful statements, the logic must have a theory of reference that
determines how the constants and variables are associated with things in the universe of discourse." [9, p. 27] Usually, the semantics for conceptual graphs is given by the translation of conceptual graphs into first-order logic (cf. [8] or [1]). For some notions and proofs, a set-theoretic, extensional semantics was developed (cf. [4]), but it is rarely used. We define a semantics based on relational contexts. That means, we interpret the syntactical elements (concept, object and relation names) by concepts, objects and relations of a relational context. We prefer this contextual semantics for several reasons. As the basic elements of concept graphs are concepts, we want a semantics in which concepts are considered in a formal, but manifold way. Therefore, it is convenient to use Formal Concept Analysis, which is a mathematization of the philosophical understanding of concepts as units of thought constituted by their extension and intension (cf. [11]). Furthermore, it is essential for Formal Concept Analysis that these two components of a concept are unified on the basis of a specified context. This contextual view is supported by Peirce's pragmatism, which claims that we can only analyze and argue within restricted contexts where we always rely on preknowledge and common sense (cf. [12]). Experience has shown that formal contexts are a useful basis for knowledge representation and communication because, on the one hand, they are close enough to reality and, on the other hand, their formalizations allow an efficient formal treatment. As formal contexts do not formalize relations on the objects, the contexts must be enriched with relational structures. For this purpose, R. Wille introduced power context families in [12], where relations are described as concepts with extensions consisting of tuples of objects. Using relational contexts in this paper, we have chosen a slightly simpler formalism.
Nevertheless, this formalism can be transformed to power context families and vice versa. This is explained in detail in [5] and will not be discussed in this paper. Let us start with the formal definition of a relational context (originally introduced in [7]).

Definition 3. A formal context (G, M, I) is a triple where G and M are finite sets whose elements are called objects and attributes, respectively, and I is a binary relation between G and M which is called an incidence relation. A formal context, together with a set R := ⋃_{k=1}^{n} Rk of sets of k-ary relations on G, is called a relational context and denoted by K := ((G, R), M, I). The concept lattice B(G, M, I) := (B(G, M, I), ≤) is also called the concept lattice of K and denoted by B(K).

For the basic notions in Formal Concept Analysis, like the definition of the concept lattice, please refer to [3]. We just mention the notation g^I := {m ∈ M | (g, m) ∈ I} for g ∈ G (and dually for m ∈ M), which will be used in the following paragraphs. Now, we can specify how the syntactical elements of (C, G, R) are interpreted in relational contexts by context-interpretations. The object names are interpreted by objects of the context, the concept names by its concepts and the relation
names by its relations. In this way, we can embed the order given on C into the richer structure of the concept lattice. Order-preserving mappings are required because the interpretation shall respect the subsumptions given by the orders on C and R.

Definition 4. For an alphabet A := (C, G, R) and a relational context K := ((G, R), M, I), we call the union ι := ιC ∪̇ ιG ∪̇ ιR of the mappings ιC : (C, ≤C) → B(K), ιG : G → G and ιR : (R, ≤R) → (R, ⊆) a K-interpretation of A if ιC and ιR are order-preserving and ιR(Rk) ⊆ Rk for all k = 1, . . . , n. The tuple (K, ι) is called a context-interpretation of A.

Having defined how the syntactical elements are related to elements of a relational context, we can explain formally how to distinguish valid statements from invalid ones. Due to our contextual view, the notion of validity also depends on the specified relational context. That means, a concept graph is called valid in a context-interpretation if the assigned objects belong to the extensions of the assigned concepts, and if the assigned relations conform with the labels of the edges. Let us make these conditions precise in a formal definition.

Definition 5. Let (K, ι) be a context-interpretation of A. The concept graph G := (V, E, ν, κ, ρ) over A is called valid in (K, ι) if
• ιG ρ(v) ⊆ Ext(ιC κ(v)) for all v ∈ V (vertex condition),
• ιG ρ(e) ⊆ ιR κ(e) for all e ∈ E (edge condition).
If G is valid in (K, ι), then (K, ι) is called a model for G and G is called a concept graph of (K, ι).

Note that, theoretically, any formal context could be completed to a model if the relations and the interpretation were chosen in the right way. For a given concept graph G := (V, E, ν, κ, ρ), a formal context (G, M, I) and a given mapping ιG : G → G, we can always define an order-preserving mapping ιC : C → B(G, M, I) satisfying the vertex condition, for example the mapping with ιC(c) := ⋁{(ιG ρ(v)^{II}, ιG ρ(v)^{I}) | v ∈ V, κ(v) ≤C c}.
Thus, we can obtain a model by choosing appropriate relations and a mapping ιR. This shows that looking for an adequate model is not only a matter of formalism; it always depends on the specific purpose. There is one interesting model for every concept graph, namely its standard model. It codes exactly the information given by the concept graph.

Definition 6. Let G := (V, E, ν, κ, ρ) be a concept graph over the alphabet (C, G, R). We define the standard model of G to be the relational context KG := ((G, RG), C, I^G) together with the KG-interpretation ι^G := ιC ∪̇ ιG ∪̇ ιR, where ιC : C → B(KG) with ιC(c) := (c^{I^G}, c^{I^G I^G}), ιG := idG, and RG := ιR(R). The incidence relation I^G ⊆ G × C and the mapping ιR are defined in such a way that for all g ∈ G, c ∈ C, (g1, . . . , gk) ∈ G^k and R ∈ R, we have the following conditions:

(g, c) ∈ I^G ⟺ ∃ v ∈ V : κ(v) ≤C c and g ∈ ρ(v),
(g1, . . . , gk) ∈ ιR(R) ⟺ ∃ e ∈ E : κ(e) ≤R R and (g1, . . . , gk) ∈ ρ(e).
We can read this definition as an instruction for constructing the standard model. As objects of the context, we take all object names in G; as attributes of the context, we take all concept names in C; and we relate the object name g to the concept name κ(v) (i.e., set (g, κ(v)) ∈ I^G) if the object name g belongs to the reference ρ(v) of the vertex v. To preserve the order, we additionally relate g to every concept name c satisfying κ(v) ≤C c. Similarly for the relations. It is proved in [5] that this standard model is indeed a model for G. Constructing a standard model for a given concept graph is the easiest way to find a relational context that codes exactly the information formalized in the concept graph. Thus, it allows us to translate knowledge expressed on the graphical level into knowledge on the contextual level. In the following section, we will see how the standard model helps to characterize inferences of concept graphs on the contextual level.
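Read as a construction recipe, Definition 6 can be sketched as follows. All names are illustrative; `leq_C` and `leq_R` stand for the orders ≤C and ≤R, passed in as predicates, and the graph is given by its raw components:

```python
from itertools import product

def standard_model(V, E, nu, kappa, rho, C, R_names, leq_C, leq_R):
    """Standard model of a concept graph: context (G, C, I) plus relations.

    Objects are the object names occurring in the graph; attributes are the
    concept names C; leq_C(a, b) and leq_R(a, b) decide the name orders.
    """
    G_objs = set().union(*(rho[v] for v in V)) if V else set()
    # (g, c) in I  iff  some vertex v has g in rho(v) and kappa(v) <=_C c
    I = {(g, c) for c in C for v in V if leq_C(kappa[v], c) for g in rho[v]}
    # iota_R(R) collects rho(e) = rho(v1) x ... x rho(vk) over edges labelled below R
    iota_R = {R: {t for e in E if leq_R(kappa[e], R)
                  for t in product(*(sorted(rho[v]) for v in nu[e]))}
              for R in R_names}
    return G_objs, I, iota_R
```

For example, a single vertex CHILD: Hansel with CHILD ≤C HUMAN yields the incidences (Hansel, CHILD) and (Hansel, HUMAN), as the order-preservation clause of the definition demands.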
4 Reasoning with Concept Graphs

4.1 Entailment and Validity in the Standard Model
Having specified a formal semantics, we can easily describe inferences on the semantical level by entailments. For this, we only consider concept graphs over the same alphabet in the whole section. We recall the usual definition. Definition 7. Let G1 and G2 be two concept graphs over the same alphabet. We say that G1 entails G2 if G2 is valid in every model for G1 . We denote this by G1 |= G2 . The following proposition explains how the entailment can be characterized by standard models (for the proof see appendix). Proposition 1. The concept graph G1 entails the concept graph G2 if and only if G2 is valid in the standard model (KG1 , ιG1 ) of G1 . That means, using the contextual language, we obtain an effective method for deciding whether a concept graph entails another one or not. Beyond this, we could theoretically concentrate completely on the context level and describe the relation |= by means of inclusion in the standard models as the following lemma shows. Lemma 1. Let G1 and G2 be two concept graphs over the same alphabet with standard models (KG1 , ιG1 ) and (KG2 , ιG2 ), respectively. They satisfy G1 |= G2
⟺ I^{G1} ⊇ I^{G2} and ι_R^{G1}(R) ⊇ ι_R^{G2}(R) for all R ∈ R.
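Assuming standard models are represented as a pair (incidence set, relation interpretation), the lemma reduces entailment to two containment checks. A minimal, hypothetical sketch:

```python
def entails(std1, std2):
    """Decide G1 |= G2 from standard models (per Lemma 1).

    Each std is a pair (I, iota_R): the incidence relation as a set of
    (object name, concept name) pairs, and a dict mapping each relation
    name R to its set of interpreted tuples.
    """
    I1, iR1 = std1
    I2, iR2 = std2
    return I2 <= I1 and all(iR2[R] <= iR1[R] for R in iR2)

# A graph stating CHILD: Hansel (with CHILD below HUMAN) entails one
# stating only HUMAN: Hansel, but not conversely.
std1 = ({("Hansel", "CHILD"), ("Hansel", "HUMAN")}, {"THREATEN": set()})
std2 = ({("Hansel", "HUMAN")}, {"THREATEN": set()})
assert entails(std1, std2) and not entails(std2, std1)
```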
Although the lemma is not very practical for reasoning in general, it has important consequences. Firstly, we can see easily that the relation |= is reflexive and transitive, i. e., it is a preorder. Secondly, it implies that equivalent concept graphs (i. e. concept graphs with G1 |= G2 and G2 |= G1 ) have identical standard models.
Finally, we can characterize the order that is induced by the preorder |= on the equivalence classes of concept graphs: the lemma shows how it can be characterized by the inclusions in the corresponding standard models. In particular, this allows us to describe the infimum and supremum of concept graphs by join and intersection in the standard model. The infimum of the two equivalence classes of the concept graphs G1 and G2 is the equivalence class of the juxtaposition G1 ⊕ G2 (cf. [2]). It is not difficult to see that the standard model of this juxtaposition is exactly the standard model one obtains by "joining the standard models": for (K_{G1⊕G2}, ι^{G1⊕G2}), we have I^{G1⊕G2} = I^{G1} ∪ I^{G2} and ι_R^{G1⊕G2}(R) = ι_R^{G1}(R) ∪ ι_R^{G2}(R) for all R ∈ R. Whereas it is a difficult task to construct the supremum of two equivalence classes (if it exists at all), we can deduce immediately from Lemma 1 that its standard model (K, ι) is the intersection of the standard models. That means, we have I = I^{G1} ∩ I^{G2} and ι_R(R) = ι_R^{G1}(R) ∩ ι_R^{G2}(R) for all R ∈ R. We conclude that describing the order induced by |= is much easier on the context level than on the graph level. In particular, suprema and infima can be characterized easily. This shows that for some purposes it is convenient to translate the information given in concept graphs to the context level. For other purposes, it is interesting to do reasoning only on the syntactical level. But how can we characterize inferences on the syntactical level? It is usually done in two ways: by inference rules that were inspired by Peirce's inference rules for existential graphs, and by projections, i.e. graph morphisms that can be supported by graph-theoretical methods and algorithms (cf. [8], [4]).

4.2 Projections
Projections describe inferences from the perspective of graph morphisms. We recall the definition of projections as graph morphisms respecting the labeling functions (cf. [1]); it is slightly modified for concept graphs.

Definition 8. For two concept graphs G1 := (V1, E1, ν1, κ1, ρ1) and G2 := (V2, E2, ν2, κ2, ρ2) over the same alphabet, a projection from G2 to G1 is defined as the union πV ∪̇ πE of mappings πV : V2 → V1 and πE : E2 → E1 such that |e| = |πE(e)|, πV(ν2(e)|i) = ν1(πE(e))|i and κ1(πE(e)) ≤R κ2(e) for all edges e ∈ E2 and all i = 1, . . . , |e|, and κ1(πV(v)) ≤C κ2(v) and ρ1(πV(v)) ⊇ ρ2(v) for all vertices v ∈ V2. We write G1 ≲ G2 if there exists a projection from G2 to G1.

The relation ≲ defines a preorder on the class of all concept graphs, i.e., it is reflexive and transitive but not necessarily antisymmetric. It can be proved that this relation is characterized by the following inference rules (cf. [1], for concept graphs see [5]):

1. Double a vertex.
2. Delete an isolated vertex.
3. Double an edge.
4. Delete an edge.
[Fig. 1. Counter-examples to the Completeness of the Projection: over a concept hierarchy C on HUMAN, ADULT, CHILD and WOMAN, with G := {Hansel, Gretel, Witch} and R := {THREATEN}, the figure shows four concept graphs G1, . . . , G4, among them G1 = [CHILD: Hansel, Gretel]; G3 and G4 each connect a vertex referring to Witch (labeled WOMAN and ADULT, respectively) via a THREATEN edge to the vertex [CHILD: Hansel].]
5. Generalize a concept name.
6. Generalize a relation name.
7. Restrict a reference.
8. Copy the concept graph.
That means, two concept graphs G1 and G2 satisfy G1 ≲ G2 if and only if G2 can be derived from G1 by applying these rules (which are elaborated more precisely in the appendix). Note that Rule 7 (restrict a reference) differs from the restriction rule defined in [1]: here, we cannot replace an individual marker by a generic marker, but delete an individual marker from the set of objects forming the reference. It can be proved that these rules are sound (cf. [5]). But as the examples in Figure 1 show, they are not complete. It is easy to see that G2 must be valid in every model for G1. On the other hand, it cannot be derived from G1 because, by these rules, references can only be restricted, not extended or joined. Even for concept graphs with references of only one element, the rules are not complete when we consider redundant graphs. This is shown by the concept graphs G3 and G4. We have G3 |= G4, but the concept name WOMAN cannot be replaced by ADULT with the given rules.
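Definition 8 can be checked mechanically for given candidate mappings πV and πE. The sketch below uses illustrative names only; graphs are passed as plain (V, E, ν, κ, ρ) tuples and `leq_C`, `leq_R` are the name orders:

```python
def is_projection(g2, g1, piV, piE, leq_C, leq_R):
    """Check that (piV, piE) is a projection from g2 to g1 (cf. Definition 8)."""
    V1, E1, nu1, k1, r1 = g1
    V2, E2, nu2, k2, r2 = g2
    for e in E2:
        f = piE[e]
        if len(nu2[e]) != len(nu1[f]):                       # |e| = |piE(e)|
            return False
        if any(piV[v] != w for v, w in zip(nu2[e], nu1[f])):  # incidences respected
            return False
        if not leq_R(k1[f], k2[e]):                          # kappa1(piE(e)) <=_R kappa2(e)
            return False
    for v in V2:
        w = piV[v]
        # kappa1(piV(v)) <=_C kappa2(v) and rho1(piV(v)) ⊇ rho2(v)
        if not (leq_C(k1[w], k2[v]) and r1[w] >= r2[v]):
            return False
    return True

# [HUMAN: Gretel] projects onto [CHILD: Hansel, Gretel] when CHILD <=_C HUMAN.
g1 = ({"v"}, set(), {}, {"v": "CHILD"}, {"v": {"Hansel", "Gretel"}})
g2 = ({"w"}, set(), {}, {"w": "HUMAN"}, {"w": {"Gretel"}})
leq = lambda a, b: a == b or (a, b) == ("CHILD", "HUMAN")
assert is_projection(g2, g1, {"w": "v"}, {}, leq, leq)
```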
4.3 A Sound and Complete Calculus for all Concept Graphs
There are two ways to treat the incompleteness: restricting the class of considered concept graphs (e.g. to concept graphs in normal form, as in [4]) or extending the rules. As it is convenient for conceptual knowledge processing to allow all concept graphs instead of restricting them to normal forms, we decided to modify and extend the rules. (Note that the introduced rules are those usually needed to transform a concept graph into normal form.)

Definition 9. Let G1 and G2 be two concept graphs over the same alphabet. We call G2 derivable from G1, and write G1 ⊢ G2, if it can be derived by the following inference rules (which are elaborated in the appendix):

1. Double a vertex. Double a vertex v and its incident edges (several times if v occurs more than once in ν(e)). Extend the mappings κ and ρ to the doubles.
2. Delete an isolated vertex. Delete a vertex v and restrict κ and ρ accordingly.
3. Double an edge. Double an edge e and extend the mappings κ and ρ to the double.
4. Delete an edge. Delete an edge e and restrict the mappings κ and ρ accordingly.
5.∗ Exchange a concept name. Replace the assignment v ↦ κ(v) by v ↦ c for a concept name c ∈ C for which there is a vertex w ∈ V with κ(w) ≤C c and ρ(v) ⊆ ρ(w).
6.∗ Exchange a relation name. Replace the assignment e ↦ κ(e) by e ↦ R for a relation name R ∈ R for which there is an edge f ∈ E with κ(f) ≤R R and ρ(e) ⊆ ρ(f).
7. Restrict a reference. Replace the assignment v ↦ ρ(v) by v ↦ A for a subset ∅ ≠ A ⊆ ρ(v).
8. Copy the concept graph. Construct a concept graph that is identical to the first concept graph up to the names of vertices and edges.
9.∗ Join vertices with equal references. Join two vertices v, w ∈ V satisfying ρ(v) = ρ(w) into a vertex v ∨ w with the same incident edges and references, and set κ(v ∨ w) = c for a c ∈ C with κ(v) ≤C c and κ(w) ≤C c.
10.∗ Join vertices with corresponding edges. Join two vertices v, w ∈ V which have corresponding but uncommon edges (i.e. for every edge e ∈ E incident with v there exists an edge e′ incident with w, and vice versa, with equal label and equal references, whose incident vertices differ only in v and w once) into a vertex v ∨ w with the same incident edges, κ(v ∨ w) = c for a c ∈ C with κ(v) ≤C c and κ(w) ≤C c, and ρ(v ∨ w) = ρ(v) ∪ ρ(w).
We will state these inference rules more precisely in the appendix and prove formally that they are sound and complete. Note that Rule 8 is redundant because it can be replaced by applying Rules 1 and 4.
Proposition 2 (Soundness and Completeness). Let G1 and G2 be two concept graphs over the same alphabet. Then we have

G1 |= G2 ⟺ G1 ⊢ G2.
Let us sum up what has been achieved. There are three ways of characterizing inferences on concept graphs. The usual model-theoretic way is entailment (cf. Def. 7). On the syntactical level, we have a sound and complete set of inference rules (cf. Def. 9), whereas the projections cannot be used in the general case due to their incompleteness. With their graphical character, the inference rules can visualize inferences and can be used intuitively to derive closely related concept graphs by hand. That is why they can support communication about reasoning to a certain degree. For implementation and general questions of decidability, it seems more convenient to use the third notion to characterize inferences, namely validity in the standard model (cf. Prop. 1).
5 The Standard Graph of a Relational Context
The construction of a standard model for a given concept graph provides not only an efficient mathematical method for reasoning, but also a mechanism to translate the knowledge given in concept graphs into knowledge formalized in relational contexts. This possibility to translate from the graphical level to the contextual level is important for the development of conceptual knowledge systems. For such a conceptual knowledge system, the opposite direction is equally important: how can we translate knowledge given in relational contexts into the language of concept graphs? Obviously, we can state many different valid concept graphs for a given relational context. If we look for a so-called standard graph that codes the same information as the relational context, we have to look for a concept graph that entails all other valid concept graphs. For a similar purpose, R. Wille proposed a procedure to construct a canonical concept graph in [12]. We will modify this procedure for our purpose here.

We start with a relational context, say K := ((G, R), M, I). For constructing a concept graph, we need an alphabet (C, G, R). We define C := B(K), G := G and R := R. For every index k = 1, . . . , n, we determine for every relation R ∈ Rk all maximal k-tuples (A1, . . . , Ak) of non-empty subsets of G whose cross-product is included in R. All those (k + 1)-tuples (R, A1, . . . , Ak) are collected in the set EK. That means, we define for R ∈ Rk the set

Refmax(R) := {A1 × . . . × Ak ⊆ R | B1 × . . . × Bk ⊆ R implies B1 × . . . × Bk ⊅ A1 × . . . × Ak}

and obtain the set of edges

EK := ⋃_{k=1,...,n} {(R, A1, . . . , Ak) | R ∈ Rk, A1 × . . . × Ak ∈ Refmax(R)}.
Now we define

VK := {A ⊆ G | there exists (R, A1, . . . , Ak) ∈ EK with A = Ai for some i ≤ k} ∪ {g^{II} ⊆ G | g ∈ G},

and set νK : EK → ⋃_{k=1}^{n} VK^k with νK(R, A1, . . . , Ak) := (A1, . . . , Ak), and κK : VK ∪ EK → B(K) ∪ R where κK(R, A1, . . . , Ak) := R and κK(A) := (A^{II}, A^{I}). Finally, we choose ρK(A) := A for all A ∈ VK. In this way, we obtain a concept graph G(K) := (VK, EK, νK, κK, ρK) that is valid in (K, id) and is called the standard graph of K.

Proposition 3. The standard graph G(K) of a relational context K entails every concept graph G′ that is valid in (K, id).

This proposition (which is proved in the appendix) guarantees the demanded property of the standard graph. It is an irredundant graph that entails all concept graphs which are valid in its context. Thus, the standard graph is the counterpart to the standard model. With the standard model, we gather all the information given in the concept graph and have a tool to translate it from the graph level to the context level. Vice versa, we can translate information from the context level to the graph level by constructing the standard graph. The relationship between a context K and the context KG(K), belonging to the standard model (KG(K), ι^{G(K)}) of G(K), can also be described: the proof of Prop. 3 shows that the context KG(K) differs from K only in that its set of attributes is not reduced and the attributes have different names. Their concept lattices are isomorphic. Vice versa, starting with a concept graph G and constructing the standard model (KG, ι^G), we cannot say that the standard graph of KG is isomorphic to G in the formal sense because it is not a concept graph over the same alphabet. Nevertheless, it encodes the same information in an irredundant form.
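The set Refmax(R) of maximal cross-products contained in R can be computed by brute force for small object sets. The following sketch is illustrative only; it enumerates all candidate tuples of non-empty subsets and keeps the maximal ones:

```python
from itertools import combinations, product

def nonempty_subsets(G):
    """All non-empty subsets of G, as frozensets."""
    G = sorted(G)
    return [frozenset(c) for r in range(1, len(G) + 1) for c in combinations(G, r)]

def ref_max(R, G, k):
    """Maximal tuples (A1, ..., Ak) of non-empty subsets of G with A1 x ... x Ak ⊆ R."""
    subs = nonempty_subsets(G)
    # all "boxes" whose cross-product lies entirely inside the relation R
    boxes = [As for As in product(subs, repeat=k)
             if all(t in R for t in product(*As))]
    def strictly_below(As, Bs):
        return all(a <= b for a, b in zip(As, Bs)) and As != Bs
    # keep only boxes with no strictly larger box
    return [As for As in boxes
            if not any(strictly_below(As, Bs) for Bs in boxes)]

# For THREATEN = {(Witch, Hansel), (Witch, Gretel)} the single maximal box
# is ({Witch}, {Hansel, Gretel}), yielding one edge of the standard graph.
threaten = {("Witch", "Hansel"), ("Witch", "Gretel")}
print(ref_max(threaten, {"Hansel", "Gretel", "Witch"}, 2))
```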
6 Contextual Logic for Knowledge Representation
With the approach to contextual logic presented in this paper, we have proposed a logic for concept graphs that is equipped with a model-theoretic foundation and in which inferences can be characterized in multiple ways. From a computational point of view, an efficient method has been presented to do reasoning by checking validity in the corresponding standard models. The major domain of application we have in mind for this logic is conceptual knowledge representation and processing. In particular, the contextual semantics allows an integration of concept graphs into conceptual knowledge systems like TOSCANA that are based on Formal Concept Analysis. Vice versa, an integration of concept lattices and various methods of Formal Concept Analysis into tools for conceptual graphs is possible. For this purpose, the separation of syntax and semantics is less important than the possibility of expressing knowledge on two different levels, the graph level and the context level. With the
standard model and the standard graph, we have developed two notions that help to translate knowledge from one level to the other. With it, the foundation is laid for conceptual knowledge systems which combine the advantages of both languages. For example, we can imagine a system that codes knowledge in relational contexts and provides, with the concept graphs, a graphical language as interface and representation tool for knowledge. In such a system, the knowledge engineer could extend a given knowledge base by constructing new concept graphs over the existing alphabet. Then, implemented algorithms on the graph level or on the context level (whatever is more convenient for the special situation) could check whether the new concept graph is already valid in the context (i. e., the information is redundant) or whether it represents additional information. Concept lattices could be used to find the conceptual hierarchy on the concepts and to determine the conceptual patterns and dependencies of concepts and objects. Obviously, we could profit from all the methods and algorithms already existent for conceptual graphs. The architecture of conceptual knowledge systems including relational contexts and concept graphs should be discussed, and the role of the different languages should be further explored. As the expressivity of the developed language is still quite limited, the extensions by quantifiers and nested concept graphs are considered in current research.
7 Appendix: Formal Proofs
Proof of Proposition 1. We only have to prove that G2 is valid in an arbitrary model (K, λ) for G1, with K := ((G, R), M, J) and λ := λG ∪̇ λC ∪̇ λR, if G2 is valid in the standard model (KG1, ι^{G1}) of G1 with KG1 = ((G, R^{G1}), C, I^{G1}). As a result of the vertex condition for G1 in the model (K, λ), we have λG ρ1(v) ⊆ Ext λC κ1(v) ⊆ Ext λC(c) for all concept names c ∈ C and for all vertices v ∈ V1 with κ1(v) ≤C c (because λC is order-preserving). It follows that λG(⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C c}) ⊆ Ext λC(c) for all c ∈ C. As a result of the vertex condition for G2 in the standard model (KG1, ι^{G1}), we have ρ2(w) ⊆ Ext ι_C^{G1}(κ2(w)) := ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)} for all vertices w ∈ V2. This implies for all w ∈ V2 the vertex condition λG(ρ2(w)) ⊆ λG(⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)}) ⊆ Ext λC(κ2(w)). For the edge condition, one can proceed similarly. □

Proof of Soundness: G1 ⊢ G2 ⇒ G1 |= G2. Due to the transitivity of |=, it suffices to show soundness for each single inference rule. To this end, we give the exact definition of every inference rule by describing the derived concept graph G2. Then we can prove the entailment by using Prop. 1 and checking that G2 is valid in the standard model (KG1, ι^{G1}) of G1 := (V1, E1, ν1, κ1, ρ1). Because of ι_G^{G1} := idG, Ext(ι_C^{G1} c) = ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C c} for all c ∈ C and ι_R^{G1}(R) = ⋃{ρ1(e) | e ∈ E1, κ1(e) ≤R R} for all R ∈ R (cf. Def. 6), we only have to convince ourselves that G2 satisfies the following vertex and edge conditions:
236
S. Prediger
∀w ∈ V2: ρ2(w) ⊆ ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)} (vertex condition),
∀f ∈ E2: ρ2(f) ⊆ ⋃{ρ1(e) | e ∈ E1, κ1(e) ≤R κ2(f)} (edge condition).
1. Double a vertex. The concept graph derived by doubling the vertex v ∈ V1 is G2 := (V2, E2, ν2, κ2, ρ2), defined by
• V2 := V1\{v} ∪̇ {(v, 1), (v, 2)},
• E2 := E1\Ev ∪̇ E^v with Ev := {e ∈ E1 | ν1(e)|i = v for some i = 1, ..., |e|} and E^v := {(e, δ) | e ∈ Ev, δ ∈ {1, 2}^{[e,v]}}, where [e, v] := {i | ν1(e)|i = v},
• ν2|E1\Ev := ν1|E1\Ev and, for all (e, δ) ∈ E^v, ν2(e, δ)|i := ν1(e)|i if i ∉ [e, v] and ν2(e, δ)|i := (v, δ(i)) if i ∈ [e, v],
• κ2 : V2 ∪ E2 → C ∪ R with x ↦ κ1(x) for all x ∈ (V1\{v}) ∪ (E1\Ev), (v, j) ↦ κ1(v) for j = 1, 2, and (e, δ) ↦ κ1(e) for all (e, δ) ∈ E^v,
• ρ2|V1\{v} := ρ1|V1\{v} and ρ2(v, j) := ρ1(v) for j = 1, 2.
For this derived concept graph G2, the vertex and edge conditions can be checked easily; this is left to the reader.
2. Delete an isolated vertex. If v ∈ V1 is an isolated vertex of G1 (i.e., there is no edge e ∈ E1 and no i = 1, ..., |e| with ν1(e)|i = v), the components of the concept graph G2 derived by deleting the isolated vertex v are defined as follows: V2 := V1\{v}, E2 := E1, ν2 := ν1, κ2 := κ1|V2 ∪ E1 and ρ2 := ρ1|V2. These components obviously satisfy the vertex and edge conditions.
3. Double an edge. The concept graph G2 derived by doubling the edge e ∈ E1 is defined by V2 := V1, E2 := E1\{e} ∪ {(e, 1), (e, 2)} where (e, 1), (e, 2) ∉ E1, ν2|E1\{e} := ν1|E1\{e} and ν2(e, j) := ν1(e) for j = 1, 2, κ2|V1∪(E1\{e}) := κ1|V1∪(E1\{e}) and κ2(e, j) := κ1(e) for j = 1, 2, and ρ2 := ρ1. It satisfies the vertex and edge conditions.
4. Delete an edge. Deleting the edge e ∈ E1, one obtains the concept graph G2 := (V1, E1\{e}, ν1|E1\{e}, κ1|V1∪(E1\{e}), ρ1), which satisfies the vertex and edge conditions.
5∗. Exchange a concept name. The concept graph derived by substituting for the concept name κ1(v) a c ∈ C for which there is a vertex w ∈ V1 with κ1(w) ≤C c and ρ1(v) ⊆ ρ1(w) is defined by G2 := (V1, E1, ν1, κ2, ρ1) with κ2|(V1\{v})∪E1 := κ1|(V1\{v})∪E1 and κ2(v) := c.
The edge condition is obviously satisfied, and the vertex condition is satisfied because κ1(w) ≤C c implies ρ1(v) ⊆ ρ1(w) ⊆ Ext ι_C^{G1}(c).
6∗. Exchange a relation name. The concept graph derived by substituting for the relation name κ1(e) an R ∈ R for which there is an edge f ∈ E1 with κ1(f) ≤R R and ρ1(e) ⊆ ρ1(f) is defined by G2 := (V1, E1, ν1, κ2, ρ1) with κ2|V1∪(E1\{e}) := κ1|V1∪(E1\{e}) and κ2(e) := R. It satisfies the edge condition because κ1(f) ≤R R implies ρ1(e) ⊆ ρ1(f) ⊆ ι_R^{G1} R.
Simple Concept Graphs: A Logic Approach
237
7. Restrict references. The concept graph derived by restricting the reference ρ1(v) of the vertex v ∈ V1 to the reference A with ∅ ≠ A ⊆ ρ1(v) is defined by G2 := (V1, E1, ν1, κ1, ρ2) with ρ2|V1\{v} := ρ1|V1\{v} and ρ2(v) := A. From A ⊆ ρ1(v) we deduce the vertex condition.
8. Copy the concept graph. For a copied concept graph G2, there exist two bijections ϕV : V1 → V2 and ϕE : E1 → E2 such that κ1(v) = κ2(ϕV(v)) and ρ1(v) = ρ2(ϕV(v)) for all v ∈ V1, and ϕV(ν1(e)) = ν2(ϕE(e)) and κ1(e) = κ2(ϕE(e)) for all e ∈ E1. It trivially satisfies the vertex and edge conditions.
9∗. Join vertices with equal references. The concept graph derived from G1 by joining the two vertices v and w with equal references (i.e. with ρ1(v) = ρ1(w)) is G2 := (V2, E1, ν2, κ2, ρ2) with
• V2 := V1\{v, w} ∪̇ {v ∨ w},
• ν2(e)|i := v ∨ w if ν1(e)|i = v or ν1(e)|i = w, and ν2(e)|i := ν1(e)|i otherwise, for all e ∈ E1, i = 1, ..., |e|,
• κ2|(V1\{v,w})∪E1 := κ1|(V1\{v,w})∪E1 and κ2(v ∨ w) := c for a c ∈ C with κ1(v) ≤C c and κ1(w) ≤C c,
• ρ2|V1\{v,w} := ρ1|V1\{v,w} and ρ2(v ∨ w) := ρ1(v).
The vertex and edge conditions are satisfied: from ρ1(v) = ρ2(v ∨ w), κ1(v) ≤C c and κ1(w) ≤C c, we deduce Ext(κ1(v)) ∪ Ext(κ1(w)) ⊆ Ext(κ2(v ∨ w)).
10∗. Join vertices with corresponding edges. Let us assume that the vertices v, w ∈ V1 have corresponding but uncommon edges, which means that for every edge e ∈ Ev (i.e., incident with v) there exists an edge e′ ∈ Ew, and vice versa, with κ1(e) = κ1(e′), ν1(e)|i = v for exactly one i ∈ {1, ..., |e|} and ν1(e′)|i = w, ν1(e)|j ≠ w and ν1(e′)|j ≠ v for all j = 1, ..., |e|, and ρ1(ν1(e)|j) = ρ1(ν1(e′)|j) if ν1(e)|j ≠ v. Then the concept graph derived from G1 by joining the two vertices v and w is G2 := (V2, E1, ν2, κ2, ρ2), where V2 and κ2 are defined as in Rule 9∗, and ρ2 is defined by ρ2|V1\{v,w} := ρ1|V1\{v,w} and ρ2(v ∨ w) := ρ1(v) ∪ ρ1(w).
The vertex and edge conditions are satisfied because κ1(v) ≤C c and κ1(w) ≤C c imply Ext κ1(v) ∪ Ext κ1(w) ⊆ Ext κ2(v ∨ w). □
Proof of Completeness: G1 |= G2 ⇒ G1 ⊢ G2. We will prove completeness by using so-called stars, which are concept graphs with only one edge and its incident vertices. For a given concept graph G := (V, E, ν, κ, ρ), the stars of G are all those stars which are subgraphs of G, i.e. all concept graphs G′ := (V′, E′, ν|V′∪E′, κ|V′∪E′, ρ|V′∪E′) where E′ := {e} for an edge e ∈ E and V′ := {ν(e)|i | i = 1, ..., |e|}. The stars are interesting because we can derive a concept graph from the set of all its stars and its isolated vertices using Rule 9∗ (join vertices with equal references). Consequently, it suffices to prove that every star A of G2 can be derived from G1 if G1 entails G2. Using
Rule 8 (copy the concept graph), we obtain enough copies to derive all stars of G2, from which we can derive G2.
Let G1 and G2 be two concept graphs with G1 |= G2, and let A be a star of G2 with edge f and vertices w1, w2, ..., wk. For deriving A from G1, we proceed in three steps. (i) First, we derive stars from G1 such that, for every tuple (g1, ..., gk) of objects in ρ2(f), there is a star A_{g1,...,gk} with edge e_{g1,...,gk} and ρ(e_{g1,...,gk}) = (g1, ..., gk). (ii) Then, we join these stars in several steps by joining the corresponding vertices. We obtain a star B with an edge f′ that has the same references as the star A of G2, but it does not necessarily have the same concept and relation names. (iii) In order to adapt the concept and relation names by Rules 5∗ and 6∗, we first have to derive isolated vertices vi for every vertex wi of A with κ1(vi) = κ2(wi) and ρ1(vi) ⊇ ρ2(wi). Then, we can finally deduce a copy of A from B.
i) As A is valid in the standard model (K_{G1}, ι^{G1}) and κ2(f) ∈ R, there exists a set T ⊆ κ1(E1) of relation names such that ι_R^{G1} κ2(f) = ⋃{ι_R^{G1} R | R ∈ T}. Consequently, for all (g1, ..., gk) ∈ ρ2(f), there exists an R ∈ T such that ι_G^{G1}(g1, ..., gk) = (g1, ..., gk) ∈ ι_R^{G1}(R). Because of R ∈ κ1(E1), we can find an edge e_{g1,...,gk} ∈ E1 with (g1, ..., gk) ∈ ρ1(e_{g1,...,gk}). By means of Rules 2 and 4 (delete vertices and edges), we can derive, for all tuples (g1, ..., gk) ∈ ρ2(f), the corresponding star of G1 with the edge e_{g1,...,gk}. Using Rule 7 (restrict references), we restrict the references to g1, ..., gk. In this way, we derive stars denoted by A_{g1,...,gk} with vertices denoted by v_{g1}, ..., v_{gk}.
ii) In the first substep, we join the k-th vertices of all stars A_{g1,...,gk} whose first k−1 references are identical. For every tuple (ḡ1, ..., ḡ_{k−1}) ∈ ρ2(w1) × ... × ρ2(w_{k−1}), we consider all stars A_{ḡ1,...,ḡ_{k−1},gk} with gk ∈ ρ2(wk) and unify the relation names κ(e_{ḡ1,...,ḡ_{k−1},gk}) by Rule 6∗ (exchange relation names) into a common relation name R_{ḡ1,...,ḡ_{k−1}}. As all gk belong to ρ2(wk), they satisfy κ(e_{ḡ1,...,ḡ_{k−1},gk}) ≤R κ(f). Thus, we find a common relation name R_{ḡ1,...,ḡ_{k−1}} ≤R κ(f). Thereafter, we join the k-th vertices of all changed concept graphs A_{ḡ1,...,ḡ_{k−1},gk} by Rule 10∗ (join vertices with corresponding edges). Then, we join their first, then second, and finally (k−1)-th vertices. After deleting the doubled edges (Rule 4), we obtain a star with k vertices that we denote by A_{ḡ1,...,ḡ_{k−1}}. It has an edge e_{ḡ1,...,ḡ_{k−1}}, and we have ρ(e_{ḡ1,...,ḡ_{k−1}}) = {ḡ1} × ... × {ḡ_{k−1}} × ρ2(wk) and κ(e_{ḡ1,...,ḡ_{k−1}}) ≤R κ(f). In the second substep, we join the vertices of all those stars A_{ḡ1,...,ḡ_{k−1}} (which all have the same k-th reference ρ2(wk)) that coincide in all but the (k−1)-th reference. Applying Rules 6∗, 10∗ and 4, we obtain concept graphs A_{ḡ1,...,ḡ_{k−2}} with the edge e_{ḡ1,...,ḡ_{k−2}} satisfying ρ(e_{ḡ1,...,ḡ_{k−2}}) = {ḡ1} × ... × {ḡ_{k−2}} × ρ2(w_{k−1}) × ρ2(wk). After k steps of joining, we obtain a star B with edge f′ that has the same references as the edge f of A.
iii) As A is valid in the standard model (K_{G1}, ι^{G1}), every vertex wi of A satisfies ρ2(wi) ⊆ Ext(ι_C^{G1} κ2(wi)) = ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(wi)}.
Thus, for every vertex wi of A, we can use Rule 4 (delete edges) and derive all isolated vertices v ∈ V1 with κ1(v) ≤C κ2(wi). By means of Rule 10∗ (join vertices with corresponding edges), they can be joined into an isolated vertex vi with κ1(vi) = κ2(wi) and ρ1(vi) ⊇ ρ2(wi). Finally, we can exchange the concept and relation names (Rules 5∗ and 6∗) and, by means of Rule 2 (delete all isolated vertices), we obtain a concept graph that is isomorphic to A. Taken as a whole, this proves G1 ⊢ A. □
Proof of Proposition 3. Let G0 be valid in (K, id). We prove the assertion by showing that G0 is valid in the standard model of the standard graph G(K) := G and by using Prop. 1. The standard model (K_G, ι^G) of the concept graph G with K_G = ((G, R), B(K), I^G) satisfies ι_G^G = id_G and ι_R^G(R) = ⋃{ρ_K(e) | e ∈ E, κ_K(e) ≤ R} = R for all R ∈ R. This implies ι_R^G = id_R. Consequently, the edge condition for G0 in (K_G, ι^G) is satisfied. Furthermore, we have ι_C^G(c) := (c^{I^G}, c^{I^G I^G}) and, according to the definition of the incidence relation in the standard model, we have for every concept name c ∈ C the equations c^{I^G} := ⋃{ρ_K(A) | κ_K(A) ≤ c} = {g | (g^{II}, g^I) ≤ c} = ⋃{g^{II} | g ∈ c^I} = c^I. Thus, the vertex condition is also satisfied. □
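The extents of the standard model, Ext(ι_C^{G1} c) = ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C c}, are used throughout these proofs. As an illustration only, a minimal sketch of this computation, with an invented toy concept-name order (all names here are hypothetical, not from the paper):

```python
def std_extent(vertices, leq_c, c):
    """Extent of concept name c in the standard model of G1:
    the union of the references rho1(v) of all vertices v whose
    concept name kappa1(v) specializes c.  'vertices' maps
    v -> (kappa1(v), set rho1(v)); leq_c encodes the order on C."""
    ext = set()
    for kappa, rho in vertices.values():
        if leq_c(kappa, c):
            ext |= rho
    return ext

# toy order: CAT <= ANIMAL <= TOP (illustrative encoding)
order = {"CAT": {"CAT", "ANIMAL", "TOP"},
         "ANIMAL": {"ANIMAL", "TOP"},
         "TOP": {"TOP"}}
leq = lambda a, b: b in order[a]
V = {"v1": ("CAT", {"felix"}), "v2": ("ANIMAL", {"rex"})}
print(sorted(std_extent(V, leq, "ANIMAL")))  # ['felix', 'rex']
print(sorted(std_extent(V, leq, "CAT")))     # ['felix']
```

The union over all specializing vertices is exactly what makes the vertex condition of Prop. 1 a pure containment check.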
References

1. Chein, M., Mugnier, M.-L.: Conceptual Graphs: Fundamental Notions. Revue d'Intelligence Artificielle 6 (1992) 365–406
2. Chein, M., Mugnier, M.-L.: Conceptual Graphs are also Graphs. Rapport de Recherche 95003, LIRMM, Université Montpellier II (1995)
3. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer, Berlin–Heidelberg, to appear 1998
4. Mugnier, M.-L., Chein, M.: Représenter des Connaissances et Raisonner avec des Graphes. Revue d'Intelligence Artificielle 10 (1996) 7–56
5. Prediger, S.: Einfache Begriffsgraphen: Syntax und Semantik. Preprint, FB Mathematik, TU Darmstadt (1998)
6. Prediger, S.: Existentielle Begriffsgraphen. Preprint, FB Mathematik, TU Darmstadt (1998)
7. Priß, U.: The Formalization of WordNet by Methods of Relational Concept Analysis. In: Fellbaum, C. (ed.): WordNet – An Electronic Lexical Database and Some of its Applications. MIT Press (1996)
8. Sowa, J. F.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading (1984)
9. Sowa, J. F.: Knowledge Representation: Logical, Philosophical, and Computational Foundations. PWS Publishing Co., Boston, to appear 1998
10. Wermelinger, M.: Conceptual Graphs and First-Order Logic. In: Ellis, G. et al. (eds.): Conceptual Structures: Applications, Implementations and Theory, Proceedings of ICCS '95. Springer, Berlin–New York (1995) 323–337
11. Wille, R.: Restructuring Mathematical Logic: an Approach based on Peirce's Pragmatism. In: Ursini, A., Agliano, P. (eds.): Logic and Algebra. Marcel Dekker, New York (1996) 267–281
12. Wille, R.: Conceptual Graphs and Formal Concept Analysis. In: Lukose, D. et al. (eds.): Conceptual Structures: Fulfilling Peirce's Dream, Proceedings of ICCS '97. Springer, Berlin–New York (1997) 290–303
Two FOL Semantics for Simple and Nested Conceptual Graphs

G. Simonet

LIRMM (CNRS and Université Montpellier II), 161, rue Ada, 34392 Montpellier Cedex 5, France, tel: (33) 0467418543, fax: (33) 0467418500, email: [email protected]

Abstract. J.F. Sowa has defined a FOL semantics for Simple Conceptual Graphs and proved the soundness of the graph operation called projection with respect to this semantics. M. Chein and M.L. Mugnier have proved the completeness result, with a restriction on the form of the target graph of the projection. I propose here another FOL semantics for Simple Conceptual Graphs, corresponding to a slightly different interpretation of a Conceptual Graph. Soundness and completeness of the projection with respect to this semantics hold without any restriction. I extend the definitions and results on both semantics to Conceptual Graphs containing co-reference links and to Nested Conceptual Graphs.
1 Introduction
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 240–254, 1998. © Springer-Verlag Berlin Heidelberg 1998

The Conceptual Graphs model has been proposed by J.F. Sowa [11] as a Semantic Network model for Knowledge Representation. J.F. Sowa has defined a FOL (First Order classical Logic) semantics, denoted by Φ, for Simple Conceptual Graphs and proved the soundness of the graph operation called projection with respect to this semantics. M. Chein and M.L. Mugnier [1,5] have studied the basic model of Conceptual Graphs and several of its extensions. Among other results, they have proved the completeness of the projection with respect to the semantics Φ in Simple Graphs, with a restriction on the form of the target graph of the projection. This result shows that reasoning on Conceptual Graphs may be performed using graph operations instead of logical provers. The semantics Φ has already been extended to Nested Conceptual Graphs in non-classical logics with “nested” formulas, i.e. formalisms in which a formula may appear as an argument of a predicate [11]. Thus the structure of a Nested Graph is directly translated into the structure of a nested formula. These logics are similar to the logics of contexts of [4] and [3]. A. Preller et al. [6] have defined a sequent formal system with nested formulas and proved the soundness and completeness of this formal system with respect to the projection in the Simple and Nested Graphs models. More details about related works may be found in [2]. I propose here a slightly different interpretation of a Conceptual Graph which leads to the need for co-reference links in Simple Graphs and to another FOL semantics, denoted by Ψ. Projection is shown sound and complete with respect to
the semantics Ψ without any restriction. Then the definitions and results about the semantics Φ and Ψ are extended to Nested Graphs. More details about these extensions may be found in [10,9].
2 Simple Graphs

2.1 Basic Notions
I briefly recall the basic (simplified) definitions: support, SG, projection, semantics Φ, normal graph, normal form of a graph, as well as the soundness and completeness result. For more details on these definitions, see [11,1,5]. I first specify some notations. Any partial order is denoted by ≤. |= is the entailment relation symbol in FOL. Basic ontological knowledge is encoded in a support S = (TC, TR, I). TC and TR are type sets, respectively a set of concept types and a set of relation types, partially ordered by an A-Kind-Of relation. Relation types may have any arity greater than or equal to 1, and two comparable relation types must have the same arity. I is a set of individual markers. All supports also possess a generic marker denoted by ∗. The following partial order is defined on the set of markers I ∪ {∗}: ∗ is the greatest element and the elements of I are pairwise non-comparable. Asserted facts are encoded by Simple Graphs. A SG (Simple Graph) on a support S is a labelled bipartite graph G = (R, C, E, l). R and C are the node sets, respectively the relation node set and the concept node set. E is the set of edges. Edges incident on a relation node are totally ordered; they are numbered from 1 to the degree of the relation node. The i-th neighbor of relation r is denoted by Gi(r). Each node has a label given by the mapping l. A relation node r is labelled by a relation type type(r), and the degree of r is equal to the arity of type(r). A concept node c is labelled by a pair (type(c), ref(c)), where type(c) is a concept type and ref(c), the referent of c, either belongs to I — then c is an individual node — or is the generic marker ∗ — then c is a generic node. The SG of Figure 1 represents the information: a person looks at the photograph A (the generic referent ∗ is omitted).
Fig. 1. A Simple Graph
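As an illustration, the support and SG notions above might be encoded as follows. This is a minimal sketch, not an implementation from the paper; all class and field names are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class ConceptNode:
    ctype: str
    ref: str = "*"   # an individual marker, or "*" for a generic node

@dataclass
class RelationNode:
    rtype: str
    args: tuple      # concept-node ids in neighbor order (1st, 2nd, ...)

@dataclass
class SimpleGraph:
    concepts: dict   # id -> ConceptNode
    relations: dict  # id -> RelationNode

# The SG of Figure 1: a person looks at the photograph A
g = SimpleGraph(
    concepts={"c1": ConceptNode("Person"), "c2": ConceptNode("Look"),
              "c3": ConceptNode("Photograph", "A")},
    relations={"r1": RelationNode("agent", ("c2", "c1")),
               "r2": RelationNode("object", ("c2", "c3"))})
```

The ordered `args` tuple captures the total order on edges incident to a relation node.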
A projection from a SG G = (RG, CG, EG, lG) to a SG H = (RH, CH, EH, lH) is a mapping Π from RG to RH and from CG to CH which
1. preserves adjacency and order on edges: ∀rc ∈ EG, Π(r)Π(c) ∈ EH, and if c = Gi(r) then Π(c) = Hi(Π(r));
2. may decrease labels: ∀x ∈ RG ∪ CG, lH(Π(x)) ≤ lG(x).
The partial order on relation labels is that of TR. The partial order on concept labels is the product of the partial orders on TC and on I ∪ {∗}. SGs are given a semantics in FOL, denoted by Φ [11]. Given a support S, a constant is assigned to each individual marker and an n-adic (resp. unary) predicate is assigned to each n-adic relation (resp. concept) type. For simplicity, we consider that each constant or predicate has the same name as the associated element of the support. To S is assigned a set of formulas Φ(S) which corresponds to the interpretation of the partial orderings of TR and TC: Φ(S) is the set of formulas ∀x1...xp (t(x1, ..., xp) → t′(x1, ..., xp)), where t and t′ are types such that t ≤ t′ and p is the arity of t and t′. Φ maps any graph G on S into a formula Φ(G) in the following way. First, assign to each concept node c a term, which is a variable if c is generic and otherwise the constant corresponding to ref(c). Two distinct generic nodes receive distinct variables. Then assign an atom to each node of G: the atom tc(e) to a concept node c, where tc stands for the type of c and e is the term associated with c; the atom tr(e1, ..., ep) to a relation node r, where tr stands for the type of r and ei is the term associated with the i-th neighbor of r. Φ(G) is the existential closure of the conjunction of these atoms. E.g. the formula associated with the graph of Figure 1 is Φ(G) = ∃y1y2 (Person(y1) ∧ Look(y2) ∧ photograph(A) ∧ agent(y2, y1) ∧ object(y2, A)). Projection is sound and complete with respect to the semantics Φ, i.e.: given two SGs G and H defined on S, if there is a projection from G to H then Φ(S), Φ(H) |= Φ(G) (soundness [11]); conversely, if H is normal and Φ(S), Φ(H) |= Φ(G), then there is a projection from G to H (completeness [5]). A graph is normal iff each individual marker appears at most once in concept node labels.
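The atom-building step of the mapping Φ just described can be sketched as follows (a hedged illustration: the graph encoding and the variable-naming scheme are mine, not the paper's; the existential closure is left implicit as a list of atoms):

```python
def phi_atoms(concepts, relations):
    """Build the atoms of Phi(G).  'concepts' maps a node id to
    (type, ref) with ref == "*" for a generic node; 'relations' is a
    list of (type, ordered list of concept-node ids).  Each generic
    node gets a fresh variable, each individual node its constant."""
    term = {}
    fresh = (f"y{i}" for i in range(1, len(concepts) + 1))
    for cid, (ctype, ref) in concepts.items():
        term[cid] = next(fresh) if ref == "*" else ref
    atoms = [f"{ctype}({term[cid]})" for cid, (ctype, _) in concepts.items()]
    atoms += [f"{rtype}({', '.join(term[c] for c in args)})"
              for rtype, args in relations]
    return atoms

# Figure 1: a person looks at the photograph A
concepts = {"c1": ("Person", "*"), "c2": ("Look", "*"),
            "c3": ("photograph", "A")}
relations = [("agent", ["c2", "c1"]), ("object", ["c2", "c3"])]
print(phi_atoms(concepts, relations))
# ['Person(y1)', 'Look(y2)', 'photograph(A)', 'agent(y2, y1)', 'object(y2, A)']
```

The printed conjuncts match the formula Φ(G) given in the text, up to the omitted quantifier prefix ∃y1y2.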
A graph G can be transformed into a normal graph G′ (called the normal form of G) by merging all concept nodes having the same individual referent, provided that these nodes have the same type. The formulas Φ(G) and Φ(G′) are trivially equivalent. Figure 2 is a counterexample to the completeness result when H is not normal: Φ(G) = t(a) ∧ r(a, a) and Φ(H) = t(a) ∧ t(a) ∧ r(a, a).
Fig. 2. Counterexample to the completeness result in SGs with Φ
Φ(S), Φ(H) |= Φ(G) (as Φ(H) and Φ(G) are equivalent formulas), but there is no projection from G to H. The normal form of H is G.

2.2 Co-Reference Links
According to the semantics Φ, merging several individual nodes having the same referent (i.e. representing the same entity) does not change the meaning of the graph. However, in some applications, it may be useful to represent the same entity with different concept nodes corresponding to different aspects or viewpoints
on this entity, so that merging these nodes would destroy some information. Moreover, the different concept nodes representing the same entity may have different types (and therefore different labels in the graph) according to the different aspects of the entity that they represent. When a specified individual is represented by several nodes, the individual marker suffices for expressing that these nodes represent the same entity. But in the case of an unspecified entity represented by several generic nodes, an additional structure, called a co-reference link, is needed. A SGref (SG with co-reference links) is a graph G = (R, C, E, l, co-ref), where co-ref is an equivalence relation on the set of generic nodes of G (the co-reference relation). The intuitive semantics of co-ref is “represents the same entity as”. Any SG may be considered as a SGref in which the co-reference relation is reduced to the identity one. The relation co-ref is naturally extended to an equivalence relation co-ident on the set C of all concept nodes of G (the co-identity relation). Every equivalence class for the co-identity relation is a set of concept nodes representing the same entity, which are either co-referent generic nodes (in that case they are explicitly linked by a co-reference link in a graphical representation of G) or individual nodes having the same referent:
∀c, c′ ∈ C, co-ident(c, c′) iff (co-ref(c, c′) or ref(c) = ref(c′) ∈ I).
The definitions and results on SGs are modified as follows by the introduction of co-reference links. A projection from a SGref G = (RG, CG, EG, lG, co-refG) to a SGref H = (RH, CH, EH, lH, co-refH) is a projection Π from the SG (RG, CG, EG, lG) to the SG (RH, CH, EH, lH) that preserves co-identity, i.e. ∀c, c′ ∈ CG, if co-refG(c, c′) then co-identH(Π(c), Π(c′)). Note that the co-identity of individual nodes is already preserved: for any c, c′ ∈ CG, if ref(c) = ref(c′) ∈ I then ref(Π(c)) = ref(Π(c′)) = ref(c) ∈ I.
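For tiny graphs, the existence of a projection can be decided by brute force over all vertex mappings. The sketch below checks the two defining conditions for plain SGs (it ignores co-reference links, and the `subtypes` encoding of the type orders is an invented convention, not from the paper):

```python
from itertools import product

def leq_label(lab_g, lab_h, subtypes):
    """lH(Pi(x)) <= lG(x): the type may specialize (subtypes maps a
    type to the set of its subtypes, itself included), and the generic
    referent '*' may specialize to any marker, a marker only to itself."""
    (tg, rg), (th, rh) = lab_g, lab_h
    return th in subtypes.get(tg, {tg}) and (rg == "*" or rg == rh)

def has_projection(G, H, subtypes):
    """Exhaustive search for a projection from SG G to SG H.  A graph is
    (concepts, relations): concepts maps id -> (type, ref); relations is
    a list of (type, ordered arg ids).  Exponential; toy sizes only."""
    cg, ch = list(G[0]), list(H[0])
    for images in product(ch, repeat=len(cg)):
        pi = dict(zip(cg, images))
        if not all(leq_label(G[0][c], H[0][pi[c]], subtypes) for c in cg):
            continue
        # each relation node of G must map to one of H with <= type,
        # preserving the order of its neighbors
        if all(any(th in subtypes.get(tg, {tg}) and
                   [pi[a] for a in ag] == list(ah)
                   for th, ah in H[1])
               for tg, ag in G[1]):
            return True
    return False

# The graphs of Figure 2
G = ({"c": ("t", "a")}, [("r", ["c", "c"])])
H = ({"c1": ("t", "a"), "c2": ("t", "a")}, [("r", ["c1", "c2"])])
print(has_projection(G, H, {}))  # False
print(has_projection(H, G, {}))  # True
```

The two prints reproduce the Figure 2 situation: H projects onto its normal form G, but G does not project onto the non-normal H.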
Φ may be extended to SGrefs by assigning the same variable to co-referent generic nodes. A SGref is normal iff it contains no co-reference links and each individual marker appears at most once in concept node labels (i.e. iff the co-identity relation is reduced to the identity one). The normal form of a SGref G is obtained from G by merging its co-identical nodes (provided that they have the same type). Therefore co-reference links in a SG G have no real interest when G is interpreted according to the semantics Φ. Their interest will appear with the semantics Ψ and in Nested Graphs for both semantics.

2.3 Another FOL Semantics
The goal here is to modify the semantics Φ into a semantics Ψ which would express the fact that the same entity is represented by several concept nodes. The structure of a graph G should be fully represented in Ψ(G), which is not the case in Φ(G). Ψ(G) is defined as follows. Two terms are assigned to each concept node c of G. The first term ēc represents the co-identity class c̄ of c: it is the constant corresponding to ref(c) if c is an individual node, and otherwise the variable assigned to the co-reference class of c (ēc is the term assigned to c in Φ(G)). The second term ec is a variable representing the node
itself. All these variables are distinct. Assign the atom tc(ēc, ec) to each concept node c and the atom tr(ec1, ..., ecp) to each relation node r of G, where ci is the i-th neighbor of r. Ψ(G) is the existential closure of the conjunction of these atoms. E.g. in Figure 2, Ψ(G) = ∃x1 (t(a, x1) ∧ r(x1, x1)) and Ψ(H) = ∃x1x2 (t(a, x1) ∧ t(a, x2) ∧ r(x1, x2)). Ψ(S) is defined as Φ(S), except that the predicates associated with concept types become binary. Projection is sound and complete with respect to this semantics without any restriction.

Theorem 1. Let G and H be two SGrefs. Ψ(S), Ψ(H) |= Ψ(G) iff there is a projection from G to H.

E.g. in Figure 2, Ψ(S), Ψ(H) ⊭ Ψ(G) and there is no projection from G to H.
Proof: the proof is similar to that with the semantics Φ. However, it is given here because the complete proof with Φ (for SGs) is only available in French [5] and because Lemmas 1 and 2 will be used for NGrefs. For any formula F, let C(F) denote the clausal form of F. C(Ψ(H)) is the set of atomic clauses obtained from the atoms of Ψ(H) by substituting Skolem constants for the variables. Let ρ be this substitution. C(¬Ψ(G)) contains a unique clause whose literals are the negations of the atoms of Ψ(G). The Herbrand universe UH of the formula Ψ(S) ∧ Ψ(H) ∧ ¬Ψ(G) is the set of constants appearing in C(Ψ(H)) or in Ψ(G). A Ψ-substitution from G to H w.r.t. ρ (in short, a Ψ-substitution from G to H) is a substitution σ of the variables of Ψ(G) by UH constants such that for any atom t(e1, ..., en) of Ψ(G), there is t′ ≤ t such that σ(t′(e1, ..., en)) is an atom of C(Ψ(H)) (i.e. for any atom t(e1, ..., en) of Ψ(G), there is an atom t′(e′1, ..., e′n) of Ψ(H) such that t′ ≤ t and, for any i in {1, ..., n}, σ(ei) = ρ(e′i)). Theorem 1 immediately follows from Lemmas 1 and 2. □

Lemma 1: Ψ(S), Ψ(H) |= Ψ(G) iff there is a Ψ-substitution from G to H.
Proof: let C = C(Ψ(S) ∧ Ψ(H) ∧ ¬Ψ(G)). Ψ(S), Ψ(H) |= Ψ(G) iff C is unsatisfiable.
If there is a Ψ -substitution from G to H then the empty clause can be obtained from C by the resolution method, so C is unsatisfiable. Conversely, let us suppose that C is unsatisfiable. Let v be the Herbrand interpretation defined by: for any predicate t of arity n and any constants a1 , ..., an of UH , v(t)(a1 , ..., an ) is true iff there is t0 ≤ t such that t0 (a1 , ..., an ) is an atom of C(Ψ (H)). v is a model of C(Ψ (S) ∧ Ψ (H)). Then v is not a model of C(¬Ψ (G)), which provides a Ψ -substitution from G to H. 2 Lemma 2: If there is a projection Π from G to H then there is a Ψ substitution σ from G to H such that for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). Conversely, if there is a Ψ -substitution σ from G to H then there is a projection Π from G to H such that for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). Proof: let us suppose that there is a projection Π from G to H. For any variable x of Ψ (G), let c be the concept node of G such that x = ec (resp. ec ) and let σ(x) be the UH constant ρ(eΠ(c) ) (resp. ρ(eΠ(c) )). Then for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). For any atom t(e1 , ..., en ) of Ψ (G), let s be a node of G associated with this atom. The atom of Ψ (H) associated with Π(s) is in the form t0 (e01 , ..., e0n ), with t0 ≤ t and for any i in {1, ..., n}, σ(ei ) = ρ(e0i ). σ is a Ψ -substitution from G to H.
Conversely, let us suppose that there is a Ψ-substitution σ from G to H. For any node s of G, let t(e1, ..., en) be the atom of Ψ(G) associated with s, and let Π(s) be a node of H associated with an atom of the form t′(e′1, ..., e′n), with t′ ≤ t and, for any i in {1, ..., n}, σ(ei) = ρ(e′i). Then for any node s of G, type(Π(s)) ≤ type(s), and for any concept node c of G, σ(ēc) = ρ(ēΠ(c)) and σ(ec) = ρ(eΠ(c)). Π fulfills the label-decreasing condition. Let us show that it preserves adjacency and order on edges. If c = Gi(r), then σ(ec) = ρ(eΠ(c)) = ρ(eHi(Π(r))), hence Π(c) = Hi(Π(r)). Note that with the semantics Φ, we only have σ(ēc) = ρ(ēΠ(c)) = ρ(ēHi(Π(r))), and we need the normality condition to ensure that Π(c) = Hi(Π(r)). Let us show that Π preserves co-identity. If co-refG(c, c′), then c̄ = c̄′, hence σ(ēc) = σ(ēc′), hence ρ(ēΠ(c)) = ρ(ēΠ(c′)), hence co-identH(Π(c), Π(c′)). Π is a projection from G to H. □
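The contrast between Φ and Ψ on the Figure 2 graphs can be reproduced with a small sketch of the Ψ atom construction. This is an illustrative encoding of mine: explicit co-reference links are ignored (each generic node is taken as its own co-reference class), and the variable names are arbitrary:

```python
def psi_atoms(concepts, relations):
    """Atoms of Psi(G): each concept node c gets a class term (the
    constant ref(c) for an individual node, a class variable z_i
    otherwise) and its own node variable x_i; relation atoms use the
    node variables, so the graph structure is fully represented."""
    node_var = {cid: f"x{i}" for i, cid in enumerate(concepts, 1)}
    cls_term = {cid: (ref if ref != "*" else f"z{i}")
                for i, (cid, (_, ref)) in enumerate(concepts.items(), 1)}
    atoms = [f"{t}({cls_term[cid]}, {node_var[cid]})"
             for cid, (t, _) in concepts.items()]
    atoms += [f"{t}({', '.join(node_var[c] for c in args)})"
              for t, args in relations]
    return atoms

# The graphs of Figure 2
G = ({"c": ("t", "a")}, [("r", ["c", "c"])])
H = ({"c1": ("t", "a"), "c2": ("t", "a")}, [("r", ["c1", "c2"])])
print(psi_atoms(*G))  # ['t(a, x1)', 'r(x1, x1)']
print(psi_atoms(*H))  # ['t(a, x1)', 't(a, x2)', 'r(x1, x2)']
```

Unlike Φ(H), the two t-atoms of Ψ(H) no longer collapse into one, so Ψ(H) ⊭ Ψ(G), matching the absence of a projection from G to H.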
3 Nested Graphs

3.1 The Model
The Nested Graphs model allows one to associate with any concept node a partial internal description in the form of a Nested Graph. E.g. the Nested Graph (with co-reference links) of Figure 3 is obtained from the SG of Figure 1 by adding a partial internal description to the concept node labelled (photograph, A): the photograph represents a person on a boat (and adding a co-reference link to express that the person on the boat is the person looking at the photograph). A NG (Nested Graph) G, its depth depth(G) and the set UCG of concept nodes
Fig. 3. A Nested Graph with co-reference links
appearing in G are defined by structural induction.
1. A basic NG G′ is obtained from a SG G by adding to the label of each concept node c a third field, denoted by Desc(c), equal to ∗∗ (the empty description); depth(G′) = 0 and UCG′ = CG′ = CG.
2. Let G be a basic NG, c1, c2, ..., ck concept nodes of G, and G1, G2, ..., Gk NGs. The graph G′ obtained by substituting Gi for the description ∗∗ of ci, for i = 1, ..., k, is a NG; depth(G′) = 1 + max{depth(Gi), 1 ≤ i ≤ k} and UCG′ = CG ∪ (∪1≤i≤k UCGi).
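The inductive definition of depth(G) translates directly into a recursive sketch (the dictionary encoding below is an assumption for illustration; `None` stands for the empty description ∗∗):

```python
def depth(ng):
    """depth(G) by structural induction: a basic NG (every description
    empty, i.e. None) has depth 0; otherwise 1 + the maximum depth of
    the nested description graphs.  'ng' is a dict mapping a node id
    to (type, ref, desc), desc being None or another such dict."""
    nested = [d for (_, _, d) in ng.values() if d is not None]
    return 0 if not nested else 1 + max(depth(d) for d in nested)

# Figure 3 style: the photograph A is described by a person-on-a-boat graph
inner = {"p": ("Person", "*", None), "b": ("Boat", "*", None)}
g = {"c1": ("Person", "*", None), "c2": ("Look", "*", None),
     "c3": ("Photograph", "A", inner)}
print(depth(inner))  # 0
print(depth(g))      # 1
```

The set UCG of all concept nodes appearing in G could be collected by the same recursion.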
It is important to note (for the following definition of A(G)) that if a SG or NG H is used several times in the construction of a NG G, we consider that several copies of H (and not the graph H itself several times) are used in the construction of G. A NG can be denoted by G = (R, C, E, l) like a SG, except that the label l(c) of any concept node c has a third field Desc(c) (in addition to the fields type(c) and ref(c)), which is either ∗∗ or a NG. A complex node is a concept node c such that Desc(c) is a NG. The set of complex nodes of G is denoted by D(G). A NGref (NG with co-reference links) is a graph G = (R, C, E, l, co-ref), where co-ref is an equivalence relation on the set of generic nodes appearing in G. NGrefs are defined by structural induction from SGrefs as NGs are from SGs, with the following definition of co-ref. If G′ is the basic NGref obtained from a SGref G, then co-refG′ = co-refG. If G′ is the NGref obtained from a basic NGref G, concept nodes ci of G and NGrefs Gi, then co-refG′ is an equivalence relation on the set of generic nodes of UCG′ such that its restriction to the set of generic nodes of CG (resp. UCGi) is co-refG (resp. co-refGi). The relation co-ref is extended to an equivalence relation co-ident on the set UCG as in the SGref model. Any NGref G has an associated rooted tree A(G) whose nodes are the SGrefs used in the construction of G and whose edges are of the form (J, c)K, where J and K are nodes of A(G) and c is a concept node of J. Its root is denoted by R(G). E.g. the rooted tree associated with the NGref G of Figure 3 is represented in Figure 4: it has two nodes R(G) and K1 and one edge (R(G), c1)K1 (co-reference links are not edges of A(G)). A(G) is defined by structural induction. If G′ is the
Fig. 4. Rooted tree associated with the NGref of Figure 3
basic NGref obtained from a SGref G, then A(G′) is reduced to its root G. If G′ is obtained from a basic NGref G, concept nodes ci of G and NGrefs Gi, then A(G′) is obtained from the rooted trees A(Gi) by adding R(G) as the root and the edges (R(G), ci)R(Gi). The sets CG and CR(G) will be identified, as well as the set UCG and the union of the sets CK for K a node of A(G). The relation co-refG is translated in A(G) into an equivalence relation co-refA(G) on the union of the generic node sets of the nodes of A(G). A projection Π from a NGref G to a NGref H, and the image by Π of any node c of UCG, denoted by Π(c), are defined by induction on the depth of G.
A projection from a NGref G to a NGref H is a family Π = (Π0, (Πc)c∈D(G)) which satisfies:
1. Π0 is a projection from the SGref R(G) to the SGref R(H),
2. ∀c ∈ D(G), Π0(c) ∈ D(H) and Πc is a projection from Desc(c) to Desc(Π0(c)),
3. Π preserves co-identity: ∀c, c′ ∈ UCG, if co-refG(c, c′) then co-identH(Π(c), Π(c′)), where Π(c) = Π0(c) if c ∈ CG, and otherwise, letting c1 be the node of D(G) such that c ∈ UCDesc(c1), Π(c) = Πc1(c).
Projection can be formulated in terms of the rooted trees A(G) and A(H) as follows. Let A1 = (X1, U1) and A2 = (X2, U2) be two rooted trees of SGrefs with co-reference links, and let R1 and R2 be their respective roots. An A-projection from A1 to A2 is a family ϕ = (ϕ0, (ϕK)K∈X1) which satisfies:
1. ϕ0 is a mapping from X1 to X2 that maps R1 to R2,
2. for any node K of A1, ϕK is a projection from the SGref K to the SGref ϕ0(K),
3. ϕ preserves adjacency in rooted trees: ∀(J, c)K ∈ U1, (ϕ0(J), ϕJ(c))ϕ0(K) ∈ U2,
4. ϕ preserves co-identity: ∀K, K′ ∈ X1, ∀(c, c′) ∈ CK × CK′, if co-refA1(c, c′) then co-identA2(ϕK(c), ϕK′(c′)).
Condition 3 on the image of an edge (J, c)K by an A-projection ϕ is represented in Figure 5. It can be shown by induction on the depth of G that for any NGrefs
Fig. 5. Edge (J, c)K and its image by an A-projection ϕ
G and H, there is a projection from G to H iff there is an A-projection from A(G) to A(H).

3.2 The Semantics Φ
In this section I extend the semantics Φ to NGrefs. I mentioned earlier that there is little interest in introducing co-reference links in a SG interpreted in the
248
G. Simonet
semantics Φ because of the possibility of merging nodes representing the same entity. Note that such nodes appearing in a NG G may be merged if they appear in the same SG of A(G) (the resulting node having as description the union of the descriptions of the original nodes), but not otherwise. Thus the introduction of co-reference links in NGs interpreted in any semantics does improve their expressive power. In order to extend Φ to NGrefs, we add a first argument, called the context argument, to each predicate. Thus a binary predicate is associated with each concept type, and an (n+1)-adic predicate is associated with each n-adic relation type. For instance, if z (resp. x, y) is the variable assigned to a generic node of type Photograph (resp. Person, Boat), then the interpretation of the atom Person(z, x) is “x is a person in the context of the photograph z” and that of the atom on(z, x, y) is “the person x is on the boat y in the context of the photograph z”. With S is associated the set Φ(S) of formulas ∀z x1 ... xp (t(z, x1, ..., xp) → t′(z, x1, ..., xp)), where t and t′ are types such that t ≤ t′ and p is the arity of t and t′ in the semantics Φ for SGs. For any NGref G, let r(G) be the number of equivalence classes of the relation co-refG. Assign r(G) distinct variables y1, ..., yr(G) to the co-refG classes. For any node K of A(G), if K contains generic nodes then the variables of Φ(K) are some of the variables y1, ..., yr(G). A subgraph of G is a NGref that is either equal to G or to the description graph of a concept node appearing in G. A formula Φ′(e, K) (resp. Φ′(e, G′)) is associated with any term e and any node K of A(G) (resp. subgraph G′ of G). Φ′(e, K) (resp. Φ′(e, G′)) is the formula associated with K (resp. G′) when it is in the context represented by the term e, or more precisely, when K is the root of the description graph (resp. G′ is the description graph) of a concept node associated with the term e.
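The context-argument idea can be made concrete with a small Python sketch (the encoding and function names are ours, not the paper's): every atom of a simple graph is prefixed with the term of its context, a0 for the outermost level and the term of the enclosing concept node for a description graph, using the photograph example above.

```python
A0 = "a0"  # constant for the general context induced by the support

def phi_atoms(atoms, context, descriptions):
    """atoms: list of (predicate, args) for the current level;
    descriptions: maps the term of a complex concept node to the
    atom list of its description graph."""
    out = [(p, (context, *args)) for p, args in atoms]
    for term, sub_atoms in descriptions.items():
        # atoms of a description graph take the enclosing node's term
        # as their context argument
        out.extend(phi_atoms(sub_atoms, term, {}))
    return out

# Figure 3 example: a person looks at photograph A; inside A, the person
# is on a boat.
root = [("Person", ("y1",)), ("Look", ("y2",)), ("photograph", ("A",)),
        ("agent", ("y2", "y1")), ("object", ("y2", "A"))]
desc = {"A": [("Person", ("y1",)), ("Boat", ("y3",)), ("on", ("y1", "y3"))]}

for atom in phi_atoms(root, A0, desc):
    print(atom)
```

The same variable y1 is used at both levels: in Φ, a description is attached to a co-identity class, not to an individual concept node.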
Φ′(e, K) is defined as the conjunction of the atoms obtained from those of Φ(K) by adding the first argument e. Φ′(e, G′) is defined by induction on the depth of G′. For any node K of A(G) and any concept node c of K, ec denotes the term assigned to c in Φ(K) (which is in fact the term assigned to the co-identity class of c).

Φ′(e, G′) = Φ′(e, R(G′)) ∧ (∧c∈D(G′) Φ′(ec, Desc(c)))

The formula Φ(G) = ∃y1 ... yr(G) Φ′(a0, G) is associated with any NGref G defined on the support S, where a0 is a constant representing the general context induced by the support S, so that the same constant a0 is used for all NGrefs defined on the support S. E.g. the formula associated with the graph G of Figure 3 is

Φ(G) = ∃y1 y2 y3 (Person(a0, y1) ∧ Look(a0, y2) ∧ photograph(a0, A) ∧ agent(a0, y2, y1) ∧ object(a0, y2, A) ∧ Φ′(A, Desc(c))),

where c is the concept node with referent A and Φ′(A, Desc(c)) = Person(A, y1) ∧ Boat(A, y3) ∧ on(A, y1, y3). Φ(G) may also be defined from A(G):

Φ(G) = ∃y1 ... yr(G) (∧K node of A(G) Φ′(eK, K)),

where eK = a0 if K = R(G), and otherwise, letting (J, c)K be the edge of A(G) into K, eK = ec. The normality notion and the soundness and completeness results extend to NGrefs. Let us first consider two counterexamples to the completeness result
(presented in Figure 6). Φ(G1) = t(a0, a) ∧ r(a0, a, a) and Φ(H1) = t(a0, a) ∧ t(a0, a) ∧ r(a0, a, a). Φ(G1) and Φ(H1) are equivalent formulas, but there is no projection from G1 to H1. The problem here is that a node of A(H1) is not a normal SGref. Φ(G2) = t(a0, a) ∧ t(a, a) ∧ t(a, a) and Φ(H2) = t(a0, a) ∧ t(a, a). Φ(G2) and Φ(H2) are equivalent formulas, but there is no projection from G2 to H2. The problem here is that there are two co-identical concept nodes appearing in H2 such that one of them is a complex node and the other one is not. We
Fig. 6. Counterexamples to the completeness result in NGrefs with Φ
have seen that the semantics Φ does not distinguish co-identical nodes in a SGref. In the same way, it does not distinguish which one(s) of several co-identical nodes appearing in a NGref contain(s) a given piece of information in its (their) description graph(s), as the description graphs of these nodes have the same context argument (the term assigned to the co-identity class of the nodes). A strong definition of normality could be the following. A NGref G is strongly normal iff (1) every node of A(G) is a normal SGref and (2) for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G), if c is a complex node then c′ is a complex node and Desc(c′) is an exact copy of Desc(c), i.e. Desc(c′) is a copy of Desc(c) such that each generic node appearing in Desc(c) is co-referent to its copy appearing in Desc(c′). The fact that Desc(c′) is an exact copy of Desc(c), and not only a copy of Desc(c), is important. E.g. consider the
Fig. 7. Strong normality
NGrefs G, H and K in Figure 7. G and K are strongly normal, but H is not. Φ(G), Φ(H) and Φ(K) are equivalent formulas, but there is no projection from
G or K to H. Condition (2) of strong normality may be weakened as follows. Let G be a NGref and G′ and H′ be two SGrefs nodes of A(G) (resp. two NGrefs subgraphs of G). An exact projection from G′ to H′ is a projection from G′ to H′ mapping each generic node c of CG′ (resp. UCG′) to a generic node of CH′ (resp. UCH′) co-referent to c in G. G′ and H′ are exactly equivalent iff there is an exact projection from G′ to H′ and from H′ to G′. A NGref G is normal iff:
1. every node of A(G) is a normal SGref,
2. for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G), if c is a complex node then c′ is a complex node and R(Desc(c′)) is exactly equivalent to R(Desc(c)).
It can be shown by induction on the depth of Desc(c) that for any co-identical complex nodes c and c′ appearing in a normal NGref, not only the SGrefs R(Desc(c)) and R(Desc(c′)) but the whole NGrefs Desc(c) and Desc(c′) are exactly equivalent. Moreover, Desc(c) and Desc(c′) are exact copies of each other except for redundant relation nodes (a relation node r appearing in a node K of A(G) is redundant iff there is another relation node r′ of K with the same respective neighbors as r and type(r′) ≤ type(r)). However, putting a NGref G into normal form (i.e. transforming G into a normal NGref G′ such that Φ(G) ≡ Φ(G′)) is not always possible. E.g. consider the graph H2 of Figure 6. If H2′ were a normal NGref such that Φ(H2) ≡ Φ(H2′), then Φ(H2′) would contain the atom t(a, a), then any concept node of UCH2′ with referent a would have a concept node with referent a in the root of its description graph, and there would be an infinite chain in A(H2′), which is impossible. The definition of normality is weakened into that of k-normality in such a way that any NGref G may be put into k-normal form for any k ≥ depth(G). The level in G of a node c of UCG is the level in A(G) of the node K of A(G) containing c (i.e.
the number of edges of the path in A(G) from R(G) to K). A NGref G is k-normal iff:
1. every node of A(G) is a normal SGref,
2. for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G) such that c is a complex node: if the level of c′ in G is less than k then c′ is a complex node, and if c′ is a complex node then R(Desc(c′)) is exactly equivalent to R(Desc(c)).
Note that a normal NGref is k-normal for any natural number k.

Theorem 2. Let G and H be two NGrefs and k ≥ depth(G). If there is a projection from G to H then Φ(S), Φ(H) |= Φ(G). Conversely, if H is k-normal and Φ(S), Φ(H) |= Φ(G) then there is a projection from G to H.

Proof: We use the same notations as in the proof of Theorem 1. A Φ-substitution from a NGref G to a NGref H is defined in a similar way as a Ψ-substitution from a SGref G to a SGref H. Lemma 1 holds for NGrefs instead of SGrefs and the semantics Φ instead of Ψ. Lemma 2 holds for SGrefs and the semantics Φ instead of Ψ provided that H is a normal SGref (the condition σ(ec) = ρ(eΠ(c)) disappears, as the term ec does not exist in the semantics Φ).
Let us suppose that there is a projection from G to H. Then there is an A-projection ϕ = (ϕ0, (ϕK)K node of A(G)) from A(G) to A(H). From Lemma 2, for any node K of A(G), there is a Φ-substitution σK from K to ϕ0(K) such that for any concept node c of K, σK(ec) = ρ(eϕK(c)). If a variable x is assigned to two co-referent nodes c and c′ appearing in different nodes K and K′ of A(G), then σK(x) = σK(ec) = ρ(eϕK(c)) and σK′(x) = σK′(ec′) = ρ(eϕK′(c′)). As ϕ preserves co-identity, ϕK(c) = ϕK′(c′), hence σK(x) = σK′(x). Therefore there is a substitution σ of the variables of Φ(G) by UH constants such that for any node K of A(G), the restriction of σ to the variables of Φ(K) is σK. The atoms of Φ(G) are obtained from those of the formulas Φ(K), K node of A(G), by adding the context argument eK. To show that σ is a Φ-substitution from G to H, it remains to show that for any node K of A(G), σ(eK) = ρ(eϕ0(K)). If K = R(G) then ϕ0(K) = R(H) and σ(eK) = ρ(eϕ0(K)) = a0; otherwise, let (J, c)K be the edge of A(G) into K, so that eK = ec. From the definition of an A-projection, (ϕ0(J), ϕJ(c))ϕ0(K) is an edge of A(H), then eϕ0(K) = eϕJ(c). It follows that σ(eK) = σ(ec) = ρ(eϕJ(c)) = ρ(eϕ0(K)). Then σ is a Φ-substitution from G to H and we conclude with Lemma 1.
Conversely, let us suppose that H is k-normal with k ≥ depth(G) and Φ(S), Φ(H) |= Φ(G). From Lemma 1, there is a Φ-substitution σ from G to H. We construct an A-projection ϕ from A(G) to A(H) such that for any node K of A(G) and any concept node c of K, σ(ec) = ρ(eϕK(c)). We define ϕ0(K) and ϕK for any node K of A(G) and prove the preceding property by induction on the level l of K in A(G). For l = 0, K is R(G). The atoms of Φ(G) (resp. Φ(H)) associated with the nodes of R(G) (resp. R(H)) are those with a0 as first argument. Then the restriction of σ to the variables of Φ(R(G)) is a Φ-substitution from R(G) to R(H). From Lemma 2, there is a projection Π0 from R(G) to R(H) such that for any concept node c of R(G), σ(ec) = ρ(eΠ0(c)). We define ϕ0(R(G)) = R(H) and ϕR(G) = Π0. Suppose ϕ is defined up to level l. Let (J, c)K be an edge of A(G) with K at level l + 1. Let s be a node of K and t(e1, ..., en) be the atom of Φ(G) associated with s. Let t′(e′1, ..., e′n) be an atom of Φ(H) such that t′ ≤ t and for any i in {1, ..., n}, σ(ei) = ρ(e′i). Let K′ be a node of A(H) and s′ a concept node of K′ such that the atom t′(e′1, ..., e′n) is associated with s′. Since e1 = eK = ec and e′1 = eK′, we have σ(ec) = ρ(eK′), then eK′ ≠ a0, then there is an edge (J′, c′)K′ into K′ and eK′ = ec′. By induction hypothesis, σ(ec) = ρ(eϕJ(c)). We have ρ(ec′) = ρ(eK′) = σ(ec) = ρ(eϕJ(c)), then c′ and ϕJ(c) are co-identical nodes. H is k-normal, c′ and ϕJ(c) are co-identical, c′ is a complex node and levelH(ϕJ(c)) = levelG(c) < depth(G) ≤ k, then ϕJ(c) is a complex node and K′ is exactly equivalent to R(Desc(ϕJ(c))). Let s″ be the image of s′ by an exact projection from K′ to R(Desc(ϕJ(c))). The atom of Φ(R(Desc(ϕJ(c)))) associated with s″ is of the form t″(e′2, ..., e′n) with t″ ≤ t′ ≤ t. It follows that the restriction of σ to the variables of Φ(K) is a Φ-substitution from K to R(Desc(ϕJ(c))). From Lemma 2, there is a projection ΠK from K to R(Desc(ϕJ(c))) such that for any concept node c″ of K, σ(ec″) = ρ(eΠK(c″)). We define ϕ0(K) = R(Desc(ϕJ(c))) and ϕK = ΠK. The equality ϕ0(K) = R(Desc(ϕJ(c)))
may be written as: (ϕ0(J), ϕJ(c))ϕ0(K) is an edge of A(H); hence ϕ preserves adjacency in rooted trees. Let us show that ϕ preserves co-identity. For any nodes K and K′ of A(G) and any concept nodes c and c′ of K and K′ respectively, if co-refA(G)(c, c′) then c and c′ belong to the same co-identity class, then σ(ec) = σ(ec′) = ρ(eϕK(c)) = ρ(eϕK′(c′)), then ϕK(c) = ϕK′(c′), i.e. co-identA(H)(ϕK(c), ϕK′(c′)). ϕ is an A-projection from A(G) to A(H), hence there is a projection from G to H. □

The k-normal form G′k of G is defined for any k ≥ depth(G) as follows. A rooted tree Ak of SGrefs is built level by level from its root to level k. Its root is a copy of R(G). Let J be a node of Ak at level l < k and c a concept node of J (c is the copy of a node c′ of UCG). If there is at least one complex node of UCG co-identical to c′, then add the edge (J, c)K to Ak, where K is a copy of the union of the SGrefs R(Desc(c″)), for c″ a complex node of UCG co-identical to c′. Two generic nodes appearing in nodes of Ak are co-referent iff they are copies of co-referent nodes of UCG. Let G′k be the NGref such that A(G′k) is obtained from Ak by replacing each node of Ak by its normal form (when merging several co-identical concept nodes, only one of the subtrees issued from these nodes is kept, as these subtrees are exact copies of each other in Ak). G′k is k-normal and Φ(G) ≡ Φ(G′k) for any k ≥ depth(G). E.g. the 2-normal form of the graph H2 of Figure 6 is H2 and its 3-normal form is G2. In Figure 7, for any k ≥ 2, G and K are their own k-normal forms and the k-normal form of H is K. Corollary 1 of Theorem 2 shows that reasoning on NGrefs may be performed using graph operations without any restriction on the NGrefs.

Corollary 1: Let G and H be two NGrefs and H′k be the k-normal form of H, with k ≥ max(depth(G), depth(H)). Φ(S), Φ(H) |= Φ(G) iff there is a projection from G to H′k.

Note that from a practical point of view, it is sufficient to construct H′k up to level depth(G), since projection preserves levels. The semantics Φ is unable to express not only that an entity is represented by several concept nodes in a node of A(G), but also that several concept nodes representing the same entity have distinct descriptions.
It is pertinent in applications in which the meaning of a graph is unchanged when co-identical concept nodes of a SGref are merged or the descriptions of co-identical concept nodes are replaced by the union of their descriptions (in particular in applications where graphs are naturally normal). But in some applications, each concept node has a specific situation in its SGref and a specific description; merging these nodes or mixing their descriptions would destroy information. For instance, let c and c′ be two concept nodes appearing in a NGref G and representing a given lake. c appears in the context of a biological study (i.e. it is related in a SGref to nodes, and appears in the description graphs of nodes, concerning a biological study) and contains biological information about the lake in its description graph: animals and plants living in the lake. c′ appears in the context of a tourist study and contains tourist information about the lake in its description graph: possibilities of bathing, sailing and walking at the lake. The formulas Φ(G), Φ(H) and Φ(K) are equivalent, where H is obtained from G by exchanging the description graphs of c and c′, and K is the k-normal form of G and H (with k = max(depth(G), depth(H))), in which the description of c and c′
is the union of the biological and tourist descriptions. Such an equivalence is obviously undesirable. In those applications, the semantics to be used is the semantics Ψ.

3.3 The Semantics Ψ
The semantics Ψ is extended from SGrefs to NGrefs in the same way as the semantics Φ, except that the context argument is the variable ec assigned to the concept node c representing the context, instead of the term assigned to its co-identity class. Thus a description is specific to a concept node and not to a co-identity class. We add a context argument to each predicate (concept type predicates become 3-adic). For any NGref G and any node K of A(G), let nc(K) be the number of concept nodes of K. Assign r(G) variables y1, ..., yr(G) to the r(G) co-refG classes and, for any node K of A(G), nc(K) variables x^K_1, ..., x^K_{nc(K)} to the concept nodes of K. All the variables yi and x^K_j are pairwise distinct. The variables of Ψ(K) are x^K_1, ..., x^K_{nc(K)} and, if K contains generic nodes, some of the variables y1, ..., yr(G). For any node K of A(G), Ψ′(e, K) is the conjunction of the atoms obtained from those of Ψ(K) by adding the first argument e. Ψ′(e, G′) is defined by induction on the depth of G′. For any node K of A(G) and any concept node c of K, ec denotes the variable assigned to c in Ψ(K).

Ψ′(e, G′) = ∃x^{R(G′)}_1 ... x^{R(G′)}_{nc(R(G′))} (Ψ′(e, R(G′)) ∧ (∧c∈D(G′) Ψ′(ec, Desc(c))))

The formula Ψ(G) = ∃y1 ... yr(G) Ψ′(a0, G) is associated with any NGref G defined on the support S. E.g. the formula associated with the graph G2 of Figure 6 is Ψ(G2) = ∃x1 (t(a0, a, x1) ∧ ∃x2 (t(x1, a, x2) ∧ ∃x3 t(x2, a, x3))). Ψ(G) may also be defined from A(G) as the existential closure of the conjunction of the formulas Ψ′(eK, K), K node of A(G), where eK is a0 if K is R(G), and otherwise, letting (J, c)K be the edge of A(G) into K, eK is ec. E.g. the formula associated with the graph G2 of Figure 6 may then be written as Ψ(G2) = ∃x1 x2 x3 (t(a0, a, x1) ∧ t(x1, a, x2) ∧ t(x2, a, x3)). Projection is sound and complete with respect to Ψ.

Theorem 3. Let G and H be two NGrefs. Ψ(S), Ψ(H) |= Ψ(G) iff there is a projection from G to H.

E.g. in Figure 6, for any i in {1, 2}, Ψ(S), Ψ(Hi) ⊭ Ψ(Gi) and there is no projection from Gi to Hi.

Sketch of proof: A technical proof, similar to that concerning Φ, is given in [9]. A more intuitive one is given here, which would not be available for Φ. Let S0 be the support obtained from S by adding the individual marker a0, the universal concept type ⊤ (if it does not already exist) and a binary relation type tcontext (context relation). For any NGref G on S, let G0 be the NGref on S0 reduced to a concept node labelled (⊤, a0, G), and let Simple(G) be the SGref on S0 obtained from the union of the nodes of A(G0) by adding, for each edge
(J, c)K of A(G0), nc(K) relation nodes of type tcontext relating c to each concept node of K. It can be shown that (1) there is a projection from G to H iff there is a projection from Simple(G) to Simple(H), and (2) Ψ(S), Ψ(H) |= Ψ(G) iff Ψ(S0), Ψ(Simple(H)) |= Ψ(Simple(G)) (any substitution of the variables of Ψ(G) leading to the empty clause by the resolution method is available for Ψ(Simple(G)), and conversely). We conclude with the soundness and completeness result on SGrefs. This proof would not be available for the semantics Φ, because the SGref Simple(H) obtained from a k-normal NGref H containing distinct co-identical nodes is not normal. □
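The flattening Simple(G) used in this sketch of proof can be illustrated as follows. The encoding and names are ours, and the sketch is simplified: it only records which concept nodes each context relates to, omitting the relation nodes of the original graphs.

```python
def simple(concepts, descriptions, context="a0:Top"):
    """Flatten a nested graph into one level.
    concepts: concept names of the current SGref level;
    descriptions: concept name -> (concepts, descriptions) of its
    description graph.
    Returns (all_concepts, tcontext_edges), where each tcontext edge
    links a context node to a concept node appearing in it."""
    nodes = [context] + list(concepts)
    edges = [(context, c) for c in concepts]   # tcontext relation nodes
    for c, (sub_c, sub_d) in descriptions.items():
        # recurse with the complex node c as the new context
        sub_nodes, sub_edges = simple(sub_c, sub_d, context=c)
        nodes += sub_nodes[1:]                 # c itself is already listed
        edges += sub_edges
    return nodes, edges

# A person and a photograph; inside the photograph, a person and a boat.
nodes, edges = simple(["Person", "Photo:A"],
                      {"Photo:A": (["Person", "Boat"], {})})
print(len(edges))  # 4: two from the outer context, two into the description
```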
4 Conclusion
I have presented two FOL semantics for Simple and Nested Graphs. Projection is sound and complete with respect to both semantics. This result shows that reasoning on Conceptual Graphs may be performed using graph operations instead of logical provers (e.g. reasoning with Graph Rules [8,7]). Non-FOL formalisms, with “nested” formulas as in the logics of contexts, have been proposed for Nested Graphs. It remains to compare these formalisms, and the semantics that may be associated with them, to the FOL semantics presented here.
References
1. M. Chein and M.L. Mugnier. Conceptual Graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992. Hermès, Paris.
2. M. Chein, M.L. Mugnier, and G. Simonet. Nested graphs: A graph-based knowledge representation model with FOL semantics. In Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR'98), Trento, Italy, June 1998.
3. R.V. Guha. Contexts: a formalization and some applications. Technical Report ACT-CYC-42391, MCC, December 1991. PhD Thesis, Stanford University.
4. J. McCarthy. Notes on Formalizing Context. In Proc. IJCAI'93, pages 555–560, 1993.
5. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996. Hermès, Paris.
6. A. Preller, M.L. Mugnier, and M. Chein. Logic for Nested Graphs. Computational Intelligence Journal, (CI 95-02-558), 1996.
7. E. Salvat. Raisonner avec des opérations de graphes : graphes conceptuels et règles d'inférence. PhD thesis, Montpellier II University, France, December 1997.
8. E. Salvat and M.L. Mugnier. Sound and Complete Forward and Backward Chainings of Graph Rules. In ICCS'96, Lecture Notes in A.I. Springer Verlag, 1996.
9. G. Simonet. Une autre sémantique logique pour les graphes conceptuels simples ou emboîtés. Research Report 96-048, L.I.R.M.M., 1996.
10. G. Simonet. Une sémantique logique pour les graphes conceptuels emboîtés. Research Report 96-047, L.I.R.M.M., 1996.
11. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison Wesley, 1984.
Peircean Graphs for the Modal Logic S5

Torben Braüner⋆
Centre for Philosophy and Science-Theory
Aalborg University
Langagervej 6
9220 Aalborg East, Denmark
[email protected]

Abstract. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic, but left unfinished similar systems for various modal logics. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. It is proved that our graph-based formulation of S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated, as the rules are graph-based analogues of Gentzen style rules as appropriate for S5. Compared to the system of Peircean graphs for S5 suggested in [17], our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic.
1 Introduction
It was a major event in the history of diagrammatic reasoning when Charles Sanders Peirce (1839–1914) developed graphical methods for reasoning within propositional and predicate logic, [18]. This line of work was taken up again in 1984, when conceptual graphs, which are generalisations of Peircean graphs, were introduced in [15]. Since then, conceptual graphs have gained widespread use within Artificial Intelligence. The recent book [1] witnesses a general interest in logical reasoning with diagrams within the areas of logic, philosophy and linguistics. Furthermore, the book witnesses a practically motivated interest in diagrammatic reasoning related to the increasing use of visual displays within such diverse areas as hardware design, computer aided learning and multimedia. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic but left unfinished similar systems for various modal logics; see the account given in [12]. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. The importance of this logic is recognised within many areas, notably philosophy, mathematical logic, Artificial Intelligence and computer science. It is proved that our graph-based formulation of the modal logic S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated as the rules are graph-based analogues of Gentzen
⋆ The author is supported by the Danish Natural Science Research Council.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 255–269, 1998. © Springer-Verlag Berlin Heidelberg 1998
256
T. Braüner
style rules as appropriate for S5.¹ Gentzen style is one way of formulating a logic which is characterised by particularly appealing proof-theoretic properties. It should be mentioned that a system of Peircean graphs for S5 was also suggested in [17]. However, our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic. In the second section of this paper we give an account of propositional and modal logic in Hilbert-Frege style. Graph-based systems for reasoning within propositional logic and the modal logic S5 are given in the third and the fourth section, respectively. In section five it is proved that our graph-based formulation of S5 is equivalent to the Hilbert-Frege formulation. In section six we discuss possible further work.
2 Propositional and Modal Logic
In this section we shall give an account of propositional logic and the modal logic S5 along the lines of [14]. See also [8, 9]. We formulate the logics in the traditional Hilbert-Frege style. Formulae for propositional logic are defined by the grammar

s ::= p | s ∧ ... ∧ s | ¬(s)

where p is a propositional letter. Parentheses are left out when appropriate. Given formulae φ and ψ, we abbreviate ¬(φ ∧ ¬ψ) and ¬(¬φ ∧ ¬ψ) as φ ⇒ ψ and φ ∨ ψ, respectively.

Definition 1. The axioms and proof-rules for propositional logic are as follows:
A1 ⊢ φ ⇒ (ψ ⇒ φ).
A2 ⊢ (φ ⇒ (ψ ⇒ θ)) ⇒ ((φ ⇒ ψ) ⇒ (φ ⇒ θ)).
A3 ⊢ (¬φ ⇒ ¬ψ) ⇒ (ψ ⇒ φ).
Modus Ponens If ⊢ φ and ⊢ φ ⇒ ψ then ⊢ ψ.

Given the definition of derivability for propositional logic, it is folklore that one can prove soundness and completeness in the sense that a formula is derivable using the axioms and rules if and only if it is valid with respect to the standard truth-functional semantics. Formulae for modal logic are defined by extending the grammar for propositional logic with the additional clause

s ::= ... | □(s)

The connective □ symbolises ”it is necessary that”. Given a formula φ, we abbreviate ¬□¬φ as ♦φ. It follows that ♦ symbolises ”it is possible that”. It is a
¹ This gives rise to a highly interesting discussion on whether graphs should be considered as a generalisation of Gentzen or/and Natural Deduction style, or perhaps Hilbert-Frege style. (Note that there is a danger of confusion here, as Gentzen discovered Natural Deduction style as well as what is usually called Gentzen style, [6].)
notable property of the new connective that it is not truth-functional. Whereas the truth-value of, for example, the proposition ¬φ is determined by the truth-value of the proposition φ, it is not the case that the truth-value of □φ is determined by the truth-value of φ. For example, the truth of φ does not necessarily imply the truth of □φ. This can be illustrated by taking φ to symbolise either ”all bachelors are unmarried” or ”Bill Clinton is president of USA”. The proposition φ is true in both cases, but □φ is true in the first case whereas it is false in the second.

Definition 2. The axioms and proof-rules for S5 are constituted by all axioms and proof-rules for propositional logic together with the following:
K ⊢ □(φ ⇒ ψ) ⇒ (□φ ⇒ □ψ).
T ⊢ □φ ⇒ φ.
S5 ⊢ ♦φ ⇒ □♦φ.
Necessitation If ⊢ φ then ⊢ □φ.

It may be worth pointing out the relation between S5 and other well known modal logics. We get the modal logic S4 if axiom S5 is replaced by the axiom ⊢ □φ ⇒ □□φ, called S4. It is straightforward to show that the logic S5 is stronger than the logic S4 in the sense that any formula provable in S4 is provable in S5 also. We get the modal logic T if axiom S5 is left out, and the modal logic K is obtained by leaving out S5 as well as T. The various modal logics described correspond to different notions of necessity. We shall here concentrate on S5, where □ can be considered as symbolising ”it is under all possible circumstances the case that”. Thus, the notion of necessity in question here is concerned with possible ways in which things might have been; it is not concerned with what is known or believed. This constitutes a philosophical motivation of the modal logic S5. From a mathematical point of view, S5 is interesting as it is a modal version of monadic predicate logic (which is the usual predicate logic equipped with the restriction that every predicate has exactly one argument). Also, the modal logic S5 is used within the areas of Artificial Intelligence and computer science.
In what follows, we shall give an account of the traditional possible-worlds semantics for S5. Formally, we need a non-empty set W together with a function V which assigns a truth-value V(w, p) to each pair consisting of an element w of W and a propositional letter p. The elements of W are to be thought of as possible worlds or possible circumstances. By induction, we extend the range of the valuation operator V to arbitrary formulae as follows:

V(w, φ ∧ ψ) iff V(w, φ) and V(w, ψ)
V(w, ¬φ) iff not V(w, φ)
V(w, □φ) iff for all w′ in W, V(w′, φ)

So V(w, φ) is to be thought of as φ being true in the world w. A formula φ is said to be S5-valid if and only if φ is S5-true in any model (W, V), that is, we have V(w, φ) for any world w in W.
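The possible-worlds semantics just given translates directly into a small evaluator. This is an illustrative Python encoding of ours (formulas as nested tuples); the □ clause quantifies over all worlds of the model, exactly as in the definition of V above.

```python
def V(W, val, w, f):
    """W: set of worlds; val: (world, letter) -> bool; f: formula."""
    op = f[0]
    if op == "lit":                      # propositional letter
        return val[(w, f[1])]
    if op == "and":
        return V(W, val, w, f[1]) and V(W, val, w, f[2])
    if op == "not":
        return not V(W, val, w, f[1])
    if op == "box":                      # true iff true in every world
        return all(V(W, val, u, f[1]) for u in W)
    raise ValueError(op)

# A two-world model where p holds at w0 but not at w1.
W = {"w0", "w1"}
val = {("w0", "p"): True, ("w1", "p"): False}
p = ("lit", "p")
box_p = ("box", p)
dia_p = ("not", ("box", ("not", p)))     # possibility as ¬□¬

print(V(W, val, "w0", p))        # True: p holds at w0
print(V(W, val, "w0", box_p))    # False: p fails at w1
print(V(W, val, "w1", dia_p))    # True: p holds somewhere
```

Note the non-truth-functional behaviour discussed above: p is true at w0, yet □p is false there.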
Given the definitions of derivability and validity in S5, it is well known that one can prove soundness and completeness, that is, a formula is derivable using the Hilbert-Frege axioms and rules for S5 if and only if it is S5-valid.
3 Graphs for Propositional Logic
In this section we shall give an account of propositional logic using Peircean graphs in linear notation. Graphs for propositional logic are defined by the grammar

s ::= p | s...s | ¬(s)

where p is a propositional letter. The number of occurrences of s in a string s...s might be zero, in which case we call the resulting graph empty. A graph can be rewritten into a formula of propositional logic by adding conjunction symbols as appropriate. We shall often blur the distinction between graphs and formulas when no confusion can occur. Given graphs φ and ψ together with a propositional letter p occurring exactly once in φ, the expression φ[ψ] denotes φ where ψ has been substituted for the occurrence of p. This notation will be used to point out one single occurrence of a graph in an enclosing graph. The ¬(...) part of a graph ¬(φ) is called a negation context. We say that a graph ψ is positively enclosed in φ[ψ] if and only if it occurs within an even number of negation contexts. Negative enclosure is defined in an analogous fashion. A graph can be written in non-linear style by drawing φ inside a negation context instead of writing ¬(φ). A notion of derivation for propositional graphs is introduced in the definition below.

Definition 3. A list of graphs ψ1, ..., ψn constitutes a derivation of ψn from ψ1 if and only if each ψi+1 can be obtained from ψi by using one of the following rules:
Insertion Any graph may be drawn anywhere which is negatively enclosed.
Erasure Any positively enclosed graph may be erased.
Iteration A copy of any graph φ may be drawn anywhere which is not within φ, provided that the only contexts crossed are negation contexts which do not enclose φ.
Deiteration Any graph which could be the result of iteration may be erased.
Double Negation A double negation context may be drawn around any graph, and a double negation context around any graph may be erased.

The rules of the graph-based formulation of propositional logic have the notable property that they can be applied at the top level as well as inside graphs. This is contrary to other formulations of propositional logic, which allow only top-level applications of rules. The point here is that the only global conditions on
applying the inference rules for graphs concern the notion of sign of a graph all the other conditions are purely local. This gives rise to the theorem below. Theorem 1 (Cut-and-Paste). Let a list of graphs ψ1 , ..., ψn be given which constitutes a derivation of ψn from ψ1 . Also, assume that a graph φ[ψ1 ] is given in which ψ1 is positively enclosed. Then the list of graphs φ[ψ1 ], ..., φ[ψn ] constitutes a derivation of φ[ψn ] from φ[ψ1 ]. Proof. Induction on n.
∎
This theorem and its name are taken from [16]. The justification for the name is that a derivation from the empty graph can be "cut out" and "pasted into" anywhere which is positively enclosed.

In what follows, we shall give a proof that the graph-based formulation of propositional logic is equivalent to the Hilbert-Frege formulation in the sense that the same graphs/formulae are derivable.

Theorem 2. A graph is derivable from the empty graph if and only if the corresponding formula is derivable using the Hilbert-Frege style axioms and rules.

Proof. To see that a Hilbert-Frege derivable formula corresponds to a derivable graph the following two observations suffice: Firstly, the axioms A1, A2 and A3 all correspond to derivable graphs. Secondly, given derivations of graphs φ and φ ⇒ ψ, a derivation of ψ can be constructed as follows: The Cut-and-Paste Theorem enables us to combine the derivations of the graphs φ and ¬(φ ¬(ψ)) into a derivation of the graph φ ¬(φ ¬(ψ)), on which we apply Deiteration to get φ ¬(¬(ψ)), which by Erasure yields ¬(¬(ψ)), which by Double Negation finally yields the graph ψ. This corresponds to the Modus Ponens rule. To see that the rules for graphs do not prove too much, the following argument suffices: The empty graph corresponds to the unit for conjunction, which is obviously valid with respect to the standard truth-functional semantics, and furthermore, the rules for derivability of graphs correspond to validity-preserving operations on formulae. Hence, the formula corresponding to a graph derivable from the empty graph is valid and therefore Hilbert-Frege derivable according to the previously mentioned completeness result. ∎
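The appeal to the standard truth-functional semantics in the argument above can be replayed mechanically. A small sketch (representation and names ours): a graph is read as the conjunction of its items, a negation context ¬(...) as negation, and validity is checked by truth tables over the graph's letters.

```python
from itertools import product

# A graph is a list of items; an item is a letter (str) or ("not", graph).

def evaluate(graph, assignment):
    """Truth value of a graph (juxtaposition = conjunction; the empty
    graph is the unit for conjunction, i.e. true)."""
    result = True
    for item in graph:
        if isinstance(item, tuple) and item[0] == "not":
            result = result and not evaluate(item[1], assignment)
        else:
            result = result and assignment[item]
    return result

def letters(graph):
    """The set of propositional letters occurring in a graph."""
    out = set()
    for item in graph:
        if isinstance(item, tuple):
            out |= letters(item[1])
        else:
            out.add(item)
    return out

def valid(graph):
    """Truth-table validity of the formula corresponding to a graph."""
    ps = sorted(letters(graph))
    return all(evaluate(graph, dict(zip(ps, vs)))
               for vs in product([False, True], repeat=len(ps)))
```

Note that valid([]) holds, matching the observation that the empty graph corresponds to the unit for conjunction.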
4 Graphs for the Modal Logic S5
In this section we shall give a graph-based version of the modal logic S5. Graphs for modal logic are defined by extending the grammar for defining graphs of propositional logic with the additional clause

s ::= ... | (s)

The (...) part of a graph (φ) is called a modal context². Note that modal contexts do not matter for whether a graph is positively or negatively enclosed. We say that a graph is modally enclosed if it occurs within a modal context. In non-linear style we write □φ instead of (φ). A notion of derivation for modal graphs is introduced in the definition below.
² A historical remark should be made here: Peirce's modal contexts correspond to ¬((...)) in our system. This definition of modal contexts is adopted by some authors, for example those of the papers [12] and [17]. Our choice of definition for modal contexts deviates from Peirce's because we want to keep the notions of negation and necessity distinct. This is in accordance with Sowa's definition of modal contexts for conceptual graphs, [15]. It is straightforward to restate our graph-rules in terms of Peirce's modal contexts by adding negation contexts as appropriate and by defining positive and negative enclosure such that negation contexts as well as modal contexts are taken into account.
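The extended grammar only needs a second kind of node. Here is a sketch (representation ours) of the two enclosure notions: modal contexts are counted for modal enclosure but ignored for the sign, exactly as stipulated above.

```python
# An item is a letter (str), ("not", graph) for a negation context ¬(...),
# or ("box", graph) for a modal context (...).

def enclosure(graph, path):
    """Return (sign, modally_enclosed) for the occurrence reached by the
    index path.  Modal contexts do not affect the sign."""
    negations = 0
    modal = 0
    for i in path:
        item = graph[i]
        if isinstance(item, tuple):
            if item[0] == "not":
                negations += 1
            else:                              # "box": a modal context
                modal += 1
            graph = item[1]
    return ("positive" if negations % 2 == 0 else "negative", modal > 0)
```

In (¬(p ¬(q))), i.e. □(p ⇒ q), both letters are modally enclosed; p is negatively and q positively enclosed.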
Definition 4. A list of graphs ψ1, ..., ψn constitutes a derivation of ψn from ψ1 if and only if each ψi+1 can be obtained from ψi by using either one of the rules for propositional logic or one of the following:
Negative □-Introduction: A modal context may be drawn around any negatively enclosed graph.
Positive □-Introduction: A modal context may be drawn around any positively enclosed graph ψ which is not modally enclosed, provided that each propositional letter which is within a context enclosing ψ, but which is not within ψ, is modally enclosed.

Note that the condition regarding propositional letters in the rule Positive □-Introduction is vacuous if the graph ψ is not enclosed by any contexts at all. Also, in the rule for iteration note that only negation contexts can be crossed when copying a graph (modal contexts cannot be crossed). Our choice of rules is proof-theoretically well motivated as the rules are graph-based analogues of the Gentzen rules for S5 originally given in [11] (and also considered in [2] and elsewhere). Gentzen rules for classical logic were introduced in [6]. In Gentzen style, proof-rules are used to derive sequents

φ1, ..., φn ⊢ ψ1, ..., ψm

Derivability of such a sequent corresponds to derivability of

⊢ (φ1 ∧ ... ∧ φn) ⇒ (ψ1 ∨ ... ∨ ψm)

Note that the left hand side formulae φ1, ..., φn are negatively enclosed whereas the right hand side formulae ψ1, ..., ψm are positively enclosed. Generally, a rule in Gentzen style either introduces a formula on the left of the turnstile or it introduces a formula on the right of the turnstile (that is, a formula is introduced either as negatively or positively occurring). In the S5 case, the Gentzen rules are as follows:
□-Left: If Γ, φ ⊢ Δ then Γ, □φ ⊢ Δ.
□-Right: If Γ ⊢ φ, Δ then Γ ⊢ □φ, Δ, provided that any propositional letter occurring in Γ or Δ is modally enclosed.
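The stated polarity of antecedent and succedent formulae can be checked on the graph encoding of a sequent. A sketch (representation ours): Γ ⊢ Δ becomes the graph of (φ1 ∧ ... ∧ φn) ⇒ (ψ1 ∨ ... ∨ ψm), with the disjunction rendered via De Morgan, so each φi ends up under an odd number of negation contexts (one) and each ψj under an even number (four).

```python
def sequent_to_graph(gamma, delta):
    """Graph of (φ1 ∧ ... ∧ φn) ⇒ (ψ1 ∨ ... ∨ ψm) in the nested-list
    representation: an item is a letter (str) or ("not", graph)."""
    disjunction = ("not", [("not", [d]) for d in delta])   # ψ1 ∨ ... ∨ ψm
    return [("not", list(gamma) + [("not", [disjunction])])]

def negation_depth(graph, path):
    """Number of negation contexts around the occurrence at `path`."""
    depth = 0
    for i in path:
        item = graph[i]
        if isinstance(item, tuple) and item[0] == "not":
            depth += 1
            graph = item[1]
    return depth
```

For p ⊢ q, the antecedent p sits at depth 1 (negatively enclosed) and the succedent q at depth 4 (positively enclosed).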
Clearly, the □-Left rule corresponds to the rule Negative □-Introduction for graphs and the □-Right rule corresponds to the rule Positive □-Introduction. Note that in the rule Positive □-Introduction, the restriction that ψ must not be modally enclosed cannot be left out. The two graphs ♦q ⇒ ♦q and ♦q ⇒ ♦□q constitute a counter-example, as the first graph is derivable (as it is obviously valid) whereas the second graph is not derivable at all (as it is invalid). Here, the modal context enclosing the graph in question is negatively enclosed. Similarly, the two graphs □(q ∨ p) ⇒ □(q ∨ p) and □(q ∨ p) ⇒ □(q ∨ □p) constitute a counter-example where the modal context enclosing the graph in question is positively enclosed. It should be mentioned that an equivalent system can be obtained by replacing the Negative □-Introduction rule by the following:
Positive □-Elimination: A modal context around any positively enclosed graph may be erased.

The equivalence is straightforward to prove using the second modal cut-and-paste theorem, Theorem 4, given below³. We clearly have to compare our system to the system for S5 given in the paper [17]. Our rule Negative □-Introduction is also a rule of this paper, but our rule Positive □-Introduction replaces three rules given there, namely graph-based analogues of the Hilbert-Frege axioms and proof-rules K, S5 and Necessitation. So, compared to the rules of [17] we have fewer rules (two instead of five). Furthermore, our rules seem more in line with the graph-rules for propositional logic⁴ and they are proof-theoretically well motivated, as is made clear in the discussion above.
³ The fact that the two rules give rise to equivalent systems leaves us with a highly interesting question: Which of the rules should we take, Negative □-Introduction or Positive □-Elimination? The first is suggested if we consider graphs to be a generalisation of Gentzen style. But the second rule is suggested if we consider graphs to be a generalisation of Natural Deduction style where in general a rule either introduces or eliminates a formula on the right of the turnstile (that is, a formula is either introduced or eliminated as positively occurring). Note, however, that Natural Deduction lends itself towards intuitionistic logic (as the asymmetry between assumptions and conclusion in Natural Deduction proofs is reflected in the asymmetry between input and output in the Brouwer-Heyting-Kolmogorov interpretation) rather than classical logic (as the asymmetry between assumptions and conclusion in Natural Deduction is not reflected in the standard truth-functional interpretation where the truth-values, true and false, are perfectly symmetric) whereas the converse seems to be the case with graphs. On the other hand, Natural Deduction proofs correspond in a certain sense to intuitive, informal reasoning. To quote from Prawitz's classic on Natural Deduction: "The inference rules of systems of natural deduction correspond closely to procedures common in intuitive reasoning, and when informal proofs - such as are encountered in mathematics for example - are formalised within these systems, the main structure of the informal proofs can often be preserved." ([13], p. 7) Also Peircean graphs can be said to correspond to intuitive and informal reasoning. But it is not clear in which sense this can be said about Gentzen's calculus of sequents. Rather: "The calculus of sequents can be understood as meta-calculi for the deducibility relation in the corresponding systems of natural deduction." ([13], p. 90)
The answer to the question concerning whether graphs generalise Gentzen or/and Natural Deduction style may be hidden within the problem of finding an appropriate notion of reduction for graph-derivations analogous to cut-elimination in Gentzen systems or/and normalisation in Natural Deduction systems; see [7]. We shall leave this issue to future work.
⁴ This point is clear if we consider graphs as a generalisation of Gentzen or/and Natural Deduction style rather than Hilbert-Frege style (the latter seems unnatural compared to the former).
We do not have the full Cut-and-Paste Theorem, Theorem 1, when the rules for the modal logic S5 are taken into account⁵. However, it is possible to add restrictions in two different ways such that the theorem holds also in the modal case. In the first case we have added the restriction that ψ1 is not enclosed by any contexts at all.

Theorem 3 (Cut-and-Paste). Let a list of graphs ψ1, ..., ψn be given which constitutes a derivation of ψn from ψ1. Also, assume that a graph φ[ψ1] is given in which ψ1 is not enclosed by any contexts. Then the list of graphs φ[ψ1], ..., φ[ψn] constitutes a derivation of φ[ψn] from φ[ψ1].

Proof. Induction on n.
∎
In the second case we have added the restriction that the rule Positive □-Introduction is not applied in the derivation of ψn from ψ1.

Theorem 4 (Cut-and-Paste). Let a list of graphs ψ1, ..., ψn be given which constitutes a derivation of ψn from ψ1 in which the rule Positive □-Introduction is not applied. Also, assume that a graph φ[ψ1] is given in which ψ1 is positively enclosed. Then the list of graphs φ[ψ1], ..., φ[ψn] constitutes a derivation of φ[ψn] from φ[ψ1].

Proof. Induction on n.
∎

5 The Equivalence
In this section, we shall prove that the graph-based formulation of S5 is equivalent to the traditional Hilbert-Frege formulation. The lemma below says that a graph is S5-derivable from the empty graph if the corresponding formula is Hilbert-Frege derivable.

Lemma 1. A graph is S5-derivable from the empty graph if the corresponding formula is derivable using the Hilbert-Frege style axioms and rules for S5.
⁵ In passing, we shall mention another theorem which holds for propositional logic but which does not hold for the modal logic S5 (and neither for other modal logics). This is the so-called Deduction Theorem, which says that the derivability of ψ from φ implies the derivability of φ ⇒ ψ from the empty graph. Using the Cut-and-Paste Theorem, it is straightforward to prove it for propositional logic. But here is a counter-example for S5: The graph □p is derivable from the graph p, but p ⇒ □p is not derivable from the empty graph (as it is not valid). In the paper [17] it is claimed that the Deduction Theorem holds for the graph-based system for S5 proposed there (and for graph-based systems proposed for other modal logics as well). But this is not true; in fact, the counter-example just given works also for the systems of that paper. However, in the mentioned paper the Deduction Theorem does hold in the cases where it is used.
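The validity claims used in this section - □p ⇒ p (the axiom T) is valid, while p ⇒ □p and ♦q ⇒ ♦□q are not, and ♦q ⇒ □♦q (the axiom S5) is - can be checked by brute force. Since S5 accessibility is universal, a model is determined by its set of worlds, duplicate valuations are irrelevant, and it suffices to enumerate all non-empty sets of valuations over the formula's letters. A sketch (the formula encoding is ours):

```python
from itertools import combinations

# Formulas: ("var", p), ("not", f), ("and", f, g), ("box", f).
def implies(a, b):
    return ("not", ("and", a, ("not", b)))

def dia(a):
    return ("not", ("box", ("not", a)))

def holds(f, world, worlds):
    """Evaluate f at `world` (a frozenset of true letters) in the S5 model
    whose worlds are `worlds` (universal accessibility)."""
    op = f[0]
    if op == "var":
        return f[1] in world
    if op == "not":
        return not holds(f[1], world, worlds)
    if op == "and":
        return holds(f[1], world, worlds) and holds(f[2], world, worlds)
    if op == "box":
        return all(holds(f[1], w, worlds) for w in worlds)
    raise ValueError(op)

def s5_valid(f, letters):
    """True iff f holds at every world of every S5 model over `letters`."""
    vals = [frozenset(c) for r in range(len(letters) + 1)
            for c in combinations(letters, r)]
    return all(holds(f, w, ws)
               for r in range(1, len(vals) + 1)
               for ws in combinations(vals, r)
               for w in ws)
```

The checker confirms, for instance, that the Deduction Theorem counter-example p ⇒ □p is indeed invalid while □p ⇒ p is valid.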
Proof. To see that a Hilbert-Frege derivable formula corresponds to a derivable graph the following four observations suffice.

Firstly, the axioms A1, A2 and A3 all correspond to derivable graphs. This is analogous to the propositional case, Theorem 2.

Secondly, given derivations of graphs φ and φ ⇒ ψ, a derivation of ψ can be constructed. This is straightforward to show in a way analogous to the propositional case, Theorem 2, by using the first modal cut-and-paste theorem, Theorem 3. This corresponds to the rule Modus Ponens.

Thirdly, the axioms K, T and S5 all correspond to derivable graphs. With the aim of making clear the role of our modal rules, we shall give the derivations in linear notation. By Double Negation, Insertion and Iteration we get

¬( ¬(φ ¬(ψ)) ¬(¬(φ ¬(ψ))) )

which by Negative □-Introduction (applied to the antecedent and to the copy of φ) yields

¬( (¬(φ ¬(ψ))) ¬(¬( (φ) ¬(ψ) )) )

which by Positive □-Introduction finally yields the graph

¬( (¬(φ ¬(ψ))) ¬(¬( (φ) ¬((ψ)) )) )

that corresponds to □(φ ⇒ ψ) ⇒ (□φ ⇒ □ψ), that is, the axiom K. By Double Negation, Insertion and Iteration we get

¬( φ ¬(φ) )

which by Negative □-Introduction yields the graph

¬( (φ) ¬(φ) )

that corresponds to □φ ⇒ φ, that is, the axiom T. By Double Negation, Insertion and Iteration we get

¬( ¬((¬(φ))) ¬(¬((¬(φ)))) )

which by Positive □-Introduction yields the graph

¬( ¬((¬(φ))) ¬( (¬((¬(φ)))) ) )
that corresponds to ♦φ ⇒ □♦φ, that is, the axiom S5. Fourthly, given a derivation of a graph φ, a derivation of □φ can be constructed by using the rule Positive □-Introduction. This corresponds to the rule Necessitation. ∎

The following lemma says that Negative □-Introduction preserves validity.

Lemma 2. Let a graph φ[ψ] be given. For every S5-model (W, V) and any w in W it is the case that
1. if ψ is negatively enclosed then V(w, φ[ψ]) implies V(w, φ[(ψ)]),
2. if ψ is positively enclosed then V(w, φ[(ψ)]) implies V(w, φ[ψ]).

Proof. Induction on the structure of φ[ψ]. We proceed on a case by case basis where symmetric cases are omitted.
– The case where φ[ψ] = α ∧ β[ψ] for some α and β. If V(w, α ∧ β[ψ]) then V(w, α) and V(w, β[ψ]). But V(w, β[ψ]) implies V(w, β[(ψ)]) by induction. Hence, V(w, α ∧ β[(ψ)]).
– The case where φ[ψ] = ¬β[ψ] for some β. If V(w, ¬β[ψ]) then V(w, β[ψ]) is false. But this implies the falsity of V(w, β[(ψ)]) by contraposition of the induction hypothesis. Hence, V(w, ¬β[(ψ)]).
– The case where φ[ψ] = (β[ψ]) for some β. If V(w, (β[ψ])) then V(w′, β[ψ]) for any w′ in W. Thus V(w′, β[(ψ)]) for any w′ in W by induction. Hence, V(w, (β[(ψ)])).
– The case where φ[ψ] = ψ. In this case, ψ cannot be negatively enclosed. Clearly, V(w, (ψ)) implies V(w, ψ). ∎

With the aim of proving that Positive □-Introduction preserves validity, we shall prove a small proposition. Recall that truth in a model amounts to truth in every world of the model.

Proposition 1. Let a graph φ be given in which every propositional letter is modally enclosed. In a given S5-model (W, V), either φ is true or ¬(φ) is true.

Proof. Induction on the structure of φ. We proceed on a case by case basis.
– The case where φ = α ∧ β for some α and β. If ¬(α ∧ β) is not true in the model then V(w, α ∧ β) for some world w in W, and hence also V(w, α) and V(w, β). By induction, this implies the truth of α and β in the model, and hence the truth of α ∧ β in the model.
– The case where φ = ¬β for some β. By induction, either β is true or ¬β is true in the model.
– The case where φ = (β) for some β. Clearly ok. ∎

The following lemma essentially says that the rule Positive □-Introduction preserves validity.

Lemma 3. Let a graph φ[ψ] be given in which ψ is positively enclosed but not modally enclosed. Furthermore, assume that any propositional letter not within ψ is modally enclosed. If φ[ψ] is true in a given S5-model then φ[(ψ)] is also true in this model.

Proof. Induction on the structure of φ[ψ]. We proceed on a case by case basis.
– The case where φ[ψ] = α ∧ β[ψ] for some α and β. If α ∧ β[ψ] is true in the model then also α and β[ψ] are true. But the truth of β[ψ] implies the truth of β[(ψ)] by induction. Hence, α ∧ β[(ψ)] is true.
– The case where φ[ψ] = ¬(α ∧ ¬β[ψ]) for some α and β. If ¬(α ∧ ¬β[ψ]) is true in the model then either ¬α is true or β[ψ] is true according to Proposition 1. But the truth of β[ψ] implies the truth of β[(ψ)] by induction. Hence, ¬(α ∧ ¬β[(ψ)]) is true.
– The case where φ[ψ] = ψ. Clearly, the truth of ψ in the model implies the truth of (ψ). ∎

The lemma above can be generalised in the following sense.
Lemma 4. Let a graph φ[ψ] be given in which ψ is positively enclosed but not modally enclosed. Furthermore, assume that any propositional letter within a context enclosing ψ, but not within ψ, is modally enclosed. If φ[ψ] is valid then φ[(ψ)] is also valid.

Proof. There are two cases. If ψ is not enclosed by any negation contexts in φ[ψ] then φ[ψ] = α ∧ ψ for some α. If α ∧ ψ is valid then α and ψ are also valid. The validity of ψ implies the validity of (ψ). Hence, α ∧ (ψ) is valid. If ψ is enclosed by a non-zero number of negation contexts in φ[ψ] then φ[ψ] = α ∧ ¬β[ψ] for some α and β. If α ∧ ¬β[ψ] is valid then also α and ¬β[ψ] are valid. By Lemma 3 the validity of ¬β[ψ] implies the validity of ¬β[(ψ)]. Hence, α ∧ ¬β[(ψ)] is valid. ∎

The following lemma says that a graph is S5-derivable from the empty graph only if the corresponding formula is Hilbert-Frege derivable.

Lemma 5. A graph is S5-derivable from the empty graph only if the corresponding formula is derivable using the axioms and rules for S5.

Proof. The formula corresponding to the empty graph is the unit for conjunction, which is valid as it is true in any world of any model. The rules for derivability of graphs in propositional logic correspond to validity-preserving operations on formulae. It follows from Lemma 2 and Lemma 4 that the two rules for derivability of graphs in S5 correspond to validity-preserving operations on formulae. We conclude that the formula corresponding to any graph derivable from the empty graph is valid and therefore derivable using axioms and rules according to the previously mentioned completeness result. ∎

The theorem below says that the graph-based formulation of S5 is equivalent to the traditional Hilbert-Frege formulation.

Theorem 5. A graph is S5-derivable from the empty graph if and only if the corresponding formula is derivable using the axioms and rules for S5.

Proof. Lemma 1 and Lemma 5.
∎

6 Further Work
It is natural to ask whether what we have done in this paper is also possible with S5 replaced by S4, T or K. It is obviously possible to use graph-based analogues of the Hilbert-Frege axioms and proof-rules for each of the mentioned modal logics. This is what is done in [17]. But is it possible to find proof-theoretically well motivated graph-based formulations of S4, T and K? Clearly, this is related to the possibility of finding Gentzen and Natural Deduction systems for the logics in question. Gentzen proof-rules for S4 and S5 were introduced in [10, 11] and Natural Deduction rules for these logics were given in [13]. But such formulations
have not been found for other modal logics⁶. In Handbook of Philosophical Logic, the following remark is made on Prawitz's Natural Deduction systems for S4 and S5:

"However, it has proved difficult to extend this sort of analysis to the great multitude of other systems of modal logic. It seems fair to say that a deductive treatment congenial to modal logic is yet to be found, for Hilbert systems are not suited for actual deduction, ...." ([3], p. 27–28)

The problem of finding proof-theoretically well motivated graph-based formulations of modal logics is analogous. This suggests that S4 is amenable to a graph-based formulation along the same lines as the one for S5, but it also suggests that this is not the case for other modal logics. This deficiency calls for explanation. The handbook continues:

"The situation has given rise to various suggestions. One is that the Gentzen format, which works so well for truth-functional operators, should not be expected to work for intensional operators, which are far from truth-functional. (But then Gentzen works well for intuitionistic logic which is not truth-functional either.) Another suggestion is that the great proliferation of modal logics is an epidemy from which modal logic ought to be cured: Gentzen methods work for the important systems, and the other should be abolished. 'No wonder natural deduction does not work for unnatural systems!'" ([3], p. 28)

It is not clear to the author of this paper whether one of these suggestions provides a way out of the trouble. We shall leave it to further work.

Acknowledgements: Thanks to Peter Øhrstrøm for comments at various stages of writing this paper.
References
[1] G. Allwein and J. Barwise, editors. Logical Reasoning with Diagrams. Oxford University Press, 1996.
[2] T. Braüner. A cut-free Gentzen formulation of the modal logic S5. 12 pages. Manuscript, 1998.
[3] R. Bull and K. Segerberg. Basic modal logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, Vol. II, Extensions of Classical Logic, pages 1–88. D. Reidel Publishing Company, 1984.
[4] M. Fitting. Tableau methods of proof for modal logics. Notre Dame Journal of Formal Logic, 13:237–247, 1972.
⁶ It should be mentioned that many modal logics can be given formulations which more or less diverge from ordinary Gentzen systems. Notable here are the formulations of T, S4 and S5 given in [9]. Rather than sequents in the usual sense, they are based on indexed sequents, that is, sequents where each formula is indexed by a string of natural numbers. Also the Prefixed Tableau Calculus of [4, 5] should be mentioned. See the discussion in [2].
[5] M. Fitting. Basic modal logic. In D. Gabbay et al., editors, Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 1, Logical Foundations, pages 365–448. Oxford University Press, Oxford, 1993.
[6] G. Gentzen. Untersuchungen über das logische Schliessen. Mathematische Zeitschrift, 39, 1934.
[7] J.-Y. Girard, Y. Lafont, and P. Taylor. Proofs and Types. Cambridge University Press, 1989.
[8] G. E. Hughes and M. J. Cresswell. An Introduction to Modal Logic. Methuen, 1968.
[9] G. Mints. A Short Introduction to Modal Logic. CSLI, 1992.
[10] M. Ohnishi and K. Matsumoto. Gentzen method in modal calculi. Osaka Mathematical Journal, 9:113–130, 1957.
[11] M. Ohnishi and K. Matsumoto. Gentzen method in modal calculi, II. Osaka Mathematical Journal, 11:115–120, 1959.
[12] P. Øhrstrøm. C. S. Peirce and the quest for gamma graphs. In Proceedings of Fifth International Conference on Conceptual Structures, volume 1257 of LNCS. Springer-Verlag, 1997.
[13] D. Prawitz. Natural Deduction. A Proof-Theoretical Study. Almqvist and Wiksell, 1965.
[14] D. Scott, editor. Notes on the Formalisation of Logic. Sub-faculty of Philosophy, University of Oxford, 1981.
[15] J. F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, 1984.
[16] J. F. Sowa. Knowledge Representation: Logical, Philosophical, and Computational Foundations. PWS Publishing Company, Boston, 1998.
[17] H. van den Berg. Modal logics for conceptual graphs. In Proceedings of First International Conference on Conceptual Structures, volume 699 of LNCS. Springer-Verlag, 1993.
[18] J. Zeman. Peirce's graphs. In Proceedings of Fifth International Conference on Conceptual Structures, volume 1257 of LNCS. Springer-Verlag, 1997.
Fuzzy Order-Sorted Logic Programming in Conceptual Graphs with a Sound and Complete Proof Procedure
Tru H. Cao and Peter N. Creasy
Department of Computer Science and Electrical Engineering, University of Queensland, Australia 4072
{tru, peter}@csee.uq.edu.au
Abstract. This paper presents fuzzy conceptual graph programs (FCGPs) as a fuzzy order-sorted logic programming system based on the structure of conceptual graphs and the approximate reasoning methodology of fuzzy logic. On one hand, it refines and completes a currently developed FCGP system that extends CGPs to deal with the pervasive vagueness and imprecision reflected in natural languages of the real world. On the other hand, it overcomes a shortcoming of previous wide-sense fuzzy logic programming systems by dealing with uncertainty about the types of objects. FCGs are reformulated with the introduction of fuzzy concept and relation types. The syntax of FCGPs based on the new formulation of FCGs and their general declarative semantics based on the notion of ideal FCGs are defined. Then, an SLD-style proof procedure for FCGPs is developed and proved to be sound and complete with respect to their declarative semantics. The procedure selects reductants rather than clauses of an FCGP in resolution steps and involves lattice-based constraint solving, which supports more expressive queries than the previous FCGP proof procedure did. The results could also be applied to CGPs as special FCGPs and are useful for extensions adding lattice-based annotations to CGs to enhance their knowledge representation and reasoning power.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 270–284, 1998. © Springer-Verlag Berlin Heidelberg 1998

1. Introduction

It is a matter of fact that uncertainty is frequently encountered in the real world. Uncertain knowledge representation and reasoning have therefore gained growing importance in artificial intelligence research. So far, research on extensions of CGs ([29]) for dealing with uncertainty has mainly clustered into two groups. One is on application of CGs to information retrieval (e.g., [11, 27, 13]) and the other is on FCGs (e.g., [25, 33, 17]), with the latter being our current research interest. Among several theories and methodologies dealing with different kinds of uncertainty, fuzzy logic ([37]), which originated from the theory of fuzzy sets ([35]), is an essential one for representing and reasoning with vague and imprecise information, which is pervasive in the real world as reflected in natural languages. It is significant that, whilst a smooth mapping between logic and natural language has been regarded as the main motivation of CG ([31]), a methodology for computing with words has been regarded as the main contribution of fuzzy logic ([38]). Interestingly, for example, whilst quantifying words in natural languages such as many, few or most can be represented in CGs ([30]), the vagueness and imprecision of
these words can be handled by fuzzy logic ([36]). It shows that these two logic systems, although so far developed quite separately, have a common target of natural language. Their merger then promises a powerful knowledge representation language, where CG offers a structure for placing words in and fuzzy logic offers a methodology for approximate reasoning with them.

Fuzzy logic programming systems can be roughly classified into two groups with respect to (w.r.t.) the narrow and the wide senses of fuzzy logic. Systems of the first group have formulas associated with real numbers in the interval [0,1] (e.g. [26, 21]). Those of the second involve computations with fuzzy sets as data in programs (e.g. [32, 2]). There are two common shortcomings of the previous systems of the second group. First, model-theoretic semantics and theorem-proving fundamentals were not established, whence the soundness and the completeness of the systems could not be proved. Second, they did not deal with type hierarchies as classical order-sorted logic programming systems did. In the fuzzy case, there is the problem of uncertainty about types of objects. Overcoming the first shortcoming, annotated fuzzy logic programs (AFLPs) have been developed as an essential formalism for fuzzy logic programming systems computing with fuzzy sets as soft data ([9, 7]).

The purpose of our current work on FCGPs is two-fold. On one hand, it extends CGPs ([14, 28, 19]) to deal with vague and imprecise knowledge. On the other hand, it provides a wide-sense fuzzy logic programming system that can handle uncertainty about types of objects. The system is in the spirit of possibility theory and possibilistic logic ([12]), where fuzzy set membership functions are interpreted as possibility distributions, in contrast to probability distributions in a probability framework. For that purpose, this paper refines and completes the previous work [34, 4] on FCGPs in the following respects:
1. FCGs are reformulated with the introduction of fuzzy concept and relation types, providing a unified structure and treatment for both FCGs and CGs. The syntax of FCGPs is refined on the new formulation of FCGs.
2. The general declarative semantics of FCGPs is defined on the notion of ideal FCGs, ensuring finite proofs of logical consequences of programs. The fixpoint semantics of FCGPs, as the bridge between their declarative and procedural semantics, is studied.
3. A sound and complete SLD-style proof procedure for FCGPs is developed. For completeness, the procedure selects reductants rather than clauses of an FCGP in resolution steps. Also, it involves solving constraints on lattice-based fuzzy value terms, supporting more expressive queries which are possibly about not only fuzzy attribute-values but also fuzzy types.

In fact, FCGPs and CGPs can be studied in the lattice-based reasoning framework of annotated logic programs ([20, 7]), as CGPs compute with concept types and FCGPs compute with fuzzy types and fuzzy attribute-values as lattice-based data. From this point of view, the results obtained in this paper could also be applied to CGPs as special FCGPs and are useful for extensions adding to CGs lattice-based annotations to enhance their knowledge representation and reasoning power.
The paper is organized as follows. Section 2 presents a framework of fuzzy types, the new formulation of FCGs and the notion of ideal FCGs. Section 3 defines the syntax and general declarative semantics of FCGPs and studies their fixpoint semantics. More details for these two sections can be found in [8]. In Section 4, the definitions of FCGP reductants and FCGP constraints are presented. Then, the new FCGP proof procedure is developed and proved to be sound and complete w.r.t. the FCGP declarative semantics. Finally, Section 5 gives conclusions and suggestions for future research.
2. FCG Formulation with Fuzzy Types
Throughout this paper, the conventional notations ∩ and ∪ are respectively used for the ordinary/fuzzy set intersection and union operators, and lub stands for the least upper bound operator of a lattice. We use ≤ι as a common notation for all orderings used in the current work, under the same roof of information ordering, whereby A ≤ι B means that B is more informative, or more specific, than A. In particular, we write A ≤ι B if B is a fuzzy sub-set of A, or B is a sub-type of A. It will be clear in a specific context which ordering this common notation denotes.

2.1. Fuzzy Types

In the previous formulation of FCGs ([34, 4]), fuzzy truth-values, defined by fuzzy sets on [0,1], are used to represent the compatibility of a referent to a concept type, or referents to a relation type. The formulation of a fuzzy type as a pair of a basic type and a fuzzy truth-value was first proposed in [5], providing a unified structure and treatment for both FCGs and CGs. The intended meaning of an assertion "x is of fuzzy type (t, v)" is "(x is of t) is v". For example, an assertion "John is of fuzzy type (AMERICAN-MAN, fairly true)" says "It is fairly true that John is an AMERICAN-MAN", where AMERICAN-MAN is a basic type and fairly true is the linguistic label of a fuzzy truth-value. The intuitive idea of the fuzzy sub-type ordering is similar to that of the ordinary one. That is, if τ2 is a fuzzy sub-type of τ1, then an assertion "x is of τ2" entails an assertion "x is of τ1". For example, given BIRD ≤ι EAGLE and true ≤ι very true, one has (BIRD, true) ≤ι (EAGLE, very true), on the basis that "It is very true that x is an EAGLE" entails "It is true that x is a BIRD". In [8], the notion of matchability with a mismatching degree of a fuzzy type to another is introduced, on the basis of the fuzzy sub-type partial ordering and the mismatching degree of a fuzzy set to another.
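As a concrete instance of the information ordering, here is a small sketch (the dictionary representation is ours, not the paper's) of ≤ι on discrete fuzzy sets, represented as mappings from elements to membership grades: A ≤ι B holds when B is a fuzzy sub-set of A, i.e. B's grade never exceeds A's.

```python
def le_info(A, B):
    """A ≤ι B: B is more specific than A, i.e. a fuzzy sub-set of A
    (every membership grade of B is bounded by the corresponding grade
    of A; elements absent from a fuzzy set have grade 0)."""
    return all(grade <= A.get(x, 0.0) for x, grade in B.items())
```

For example, a fuzzy set assigning "hot" the grade 0.4 is a fuzzy sub-set of one assigning "warm" 0.9 and "hot" 0.6, but not vice versa.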
Given two fuzzy types τ1 and τ2, the mismatching degree of τ1 to τ2, where τ1 is matchable to τ2, is a value in [0,1] denoted by md(τ1/τ2). Then, τ1 ≤ι τ2 if and only if (iff) md(τ1/τ2) = 0. When md(τ1/τ2) ≠ 0, an assertion "x is of τ2" does not fully entail an assertion "x is of τ1"; rather, 1 − md(τ1/τ2) measures the relative necessity degree of "x is of τ1" given "x is of τ2". Further, for the fact that an object may belong to more than one (fuzzy) type, we apply the conjunctive type construction technique of [1, 10] to define conjunctive fuzzy types. A conjunctive fuzzy type is defined to be a finite set of pairwise incomparable fuzzy types. For example, {(BIRD, very true), (EAGLE, fairly false)} is a conjunctive fuzzy type. An assertion "Object #1 is of type {(BIRD, very true), (EAGLE, fairly
Fuzzy Order-Sorted Logic Programming in Conceptual Graphs
273
false)}” says “It is very true that Object #1 is a BIRD and it is fairly false that it is an EAGLE”. Given two conjunctive fuzzy types T1 and T2, T1 is said to be matchable to T2 iff ∀τ1∈T1 ∃τ2∈T2: τ1 is matchable to τ2. The mismatching degree of T1 to T2 is then defined by md(T1/T2) = Max_{τ1∈T1} Min_{τ2∈T2} {md(τ1/τ2) | τ1 is matchable to τ2}. When md(T1/T2) = 0, T2 is said to be a conjunctive fuzzy sub-type of T1 and one writes T1 ≤ι T2. As proved in [8], the set of all conjunctive fuzzy types, defined over a basic type lattice and a fuzzy truth-value lattice, forms an upper semi-lattice under the conjunctive fuzzy sub-type partial ordering. Note that a basic type t and a fuzzy type (t, absolutely true) are conceptually equivalent, so for the sake of expressive simplicity we write, for instance, APPLE instead of (APPLE, absolutely true). Also, one may view a fuzzy type as a conjunctive one that contains only one element, and vice versa. For the following formulation of FCGs, we assume basic concept and relation type lattices, on which fuzzy concept and fuzzy relation types are defined. However, for simplicity, we use the term fuzzy type to mean either a fuzzy concept type or a fuzzy relation type when the distinction is not necessary.

2.2. FCGs and Ideal FCGs

The formulation of FCGs is refined accordingly with the introduction of fuzzy types. An FCG is defined as a conceptual graph (not necessarily connected) whose nodes are fuzzy concepts and fuzzy relations, and whose directed edges link the relation nodes to their neighbor concept nodes. Concept nodes may be joined by coreference links indicating that the concepts refer to the same individual. A fuzzy concept is either (1) a fuzzy entity concept, which consists of a conjunctive fuzzy concept type and a referent, or (2) a fuzzy attribute concept, which consists of a conjunctive fuzzy concept type, a referent and a fuzzy attribute-value defined by a fuzzy set.
A fuzzy relation consists of a conjunctive fuzzy relation type. For an FCG g, we denote the set of all concept nodes and the set of all relation nodes in g by V_C(g) and V_R(g), respectively. For a fuzzy attribute concept c, we denote the fuzzy attribute-value in c by aval(c). FCG projection defined in [34] is also modified with the introduction of fuzzy types. As in [34], given a projection π from an FCG u to an FCG v, 1−επ measures the relative necessity degree of u given v, where επ∈[0,1] is the mismatching degree of π. When επ = 0, the necessity degree of u given v is 1, that is, v fully entails u. We now present the notion of ideal FCGs, first introduced in [8], which is based on the notion of ideals in lattice theory ([16]) and is used to define the general FCGP declarative semantics. An ideal of an upper semi-lattice L is any sub-set S of L such that (1) S is downward closed, i.e., if a∈S, b∈L and b ≤ι a then b∈S, and (2) S is closed under finite least upper bounds, i.e., if a, b∈S then lub{a, b}∈S. The set of all ideals of an upper semi-lattice forms a complete lattice under the ordinary sub-set ordering, that is, given two ideals s and t, s ≤ι t iff s is a sub-set of t. For each element a∈L, the set {x∈L | x ≤ι a} is called a principal ideal. An important property is that, given a principal ideal p and a set of ideals J, if p ≤ι lub(J) then p ≤ι lub(F) for some finite sub-set F of J.
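The two defining conditions of an ideal can be checked by brute force on a small finite lattice. The following sketch uses a hypothetical four-element diamond lattice and is illustrative only:

```python
# A sketch of the two ideal conditions on a finite upper semi-lattice.
# The diamond lattice below is a made-up example; `leq` is its order
# relation and `lub` its join, computed by brute force.

ELEMENTS = ("bot", "x", "y", "top")
ORDER = {("bot", e) for e in ELEMENTS} | {(e, "top") for e in ELEMENTS} | \
        {(e, e) for e in ELEMENTS}

def leq(a, b):
    return (a, b) in ORDER

def lub(a, b):
    uppers = [e for e in ELEMENTS if leq(a, e) and leq(b, e)]
    # the least upper bound is the upper bound below every other upper bound
    return next(e for e in uppers if all(leq(e, f) for f in uppers))

def is_ideal(s):
    down_closed = all(b in s for a in s for b in ELEMENTS if leq(b, a))
    lub_closed = all(lub(a, b) in s for a in s for b in s)
    return down_closed and lub_closed

print(is_ideal({"bot", "x"}))       # True: the principal ideal of x
print(is_ideal({"bot", "x", "y"}))  # False: lub{x, y} = top is missing
```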
274
T.H. Cao and P.N. Creasy
An ideal FCG is defined like an FCG, except that its fuzzy values are ideals of fuzzy attribute-value lattices or of conjunctive fuzzy type upper semi-lattices. Given an ideal FCG g, a principal instance of g is an FCG derived from g by replacing each fuzzy value ideal in g by an element of that ideal. For a concept node c in a principal instance of g, we denote by origin(c) the corresponding concept node in g from which c is derived. We write norm(g), called a normal ideal FCG, to denote g after being normalized, whereby no individual marker occurs in more than one concept node of norm(g) (cf. CG normal form in [14, 28]). Ideal FCG projection, as the subsumption ordering over a set of ideal FCGs, is defined similarly to CG projection. The only difference is that CG projection is based on basic type lattices, whilst ideal FCG projection is based on conjunctive fuzzy type ideal lattices. Given ideal FCGs u and v, we write u ≤ι v iff there exists an ideal FCG projection from u to v. Since each principal ideal can be represented by its greatest element, one may view an FCG as an ideal FCG whose fuzzy values are principal ideals, and vice versa. Also, a single ideal FCG can be viewed as a set of separate ideal FCGs, and vice versa.
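The conjunctive fuzzy sub-type test underlying this projection, md(T1/T2) from Section 2.1, can be sketched as follows; the pairwise mismatching degree md_pair stands in for the fuzzy-type md of [8], and the type names and degrees are hypothetical:

```python
# Hedged sketch of the conjunctive fuzzy type mismatching degree:
# md(T1/T2) = Max over t1 in T1 of Min over matchable t2 in T2 of md(t1/t2).

def md_conjunctive(t1, t2, md_pair):
    worst = 0.0
    for a in t1:
        degrees = [md_pair(a, b) for b in t2 if md_pair(a, b) is not None]
        if not degrees:          # some element of T1 matches nothing in T2
            return None          # so T1 is not matchable to T2
        worst = max(worst, min(degrees))
    return worst

# Hypothetical pairwise mismatching degrees between fuzzy types:
TABLE = {(("BIRD", "true"), ("EAGLE", "very true")): 0.0,
         (("PET", "true"), ("EAGLE", "very true")): 0.3}
md_pair = lambda a, b: TABLE.get((a, b))

t1 = [("BIRD", "true"), ("PET", "true")]
t2 = [("EAGLE", "very true")]
print(md_conjunctive(t1, t2, md_pair))  # 0.3; a value of 0 would mean T1 <=_i T2
```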
3. Syntax and Declarative Semantics of FCGPs

3.1. FCGP Syntax

Definition 3.1 An FCGP clause is defined to be of the form if u then v, where u and v are finite FCGs; v is called the head and u the body (possibly empty) of the clause, and there may be coreference links between u and v. Some concept and relation nodes may be defined to be the firm nodes of the clause. An FCGP is a finite set of FCGP clauses.

The intended meaning of firm nodes in FCGP rules is that the firm nodes in the body of a rule require full matching for the rule to be fired, whilst the firm nodes in the head of a rule are not subject to change by mismatching degrees when the rule is fired. For the examples in this paper, we adopt the convention that concept and relation nodes without linguistic labels of fuzzy sets are firm nodes.

Example 3.1 The FCGP in Figure 3.1 consists of one fact saying “Apple #1 is fairly red”, and one rule saying “If an apple is red, then it is ripe”. Here, [APPLE: #1], [APPLE: *], (ATTR1) and (ATTR2) are firm nodes, and the others are not. Note that ATTR1 and ATTR2 are functional relation types, where the concept nodes linked to a functional relation node by double-lined arcs are the dependent concepts and the others are the determining concepts of the relation ([24, 6]).

[APPLE: #1]→(ATTR1)⇒[COLOR:@fairly red]
if [APPLE: *]→(ATTR1)⇒[COLOR:@red] then [APPLE: *]→(ATTR2)⇒[RIPENESS:@ripe]

Fig. 3.1. An FCGP
3.2. FCGP Interpretations and Models

There are two notions of FCGP declarative semantics: restricted and general. For the restricted semantics, an FCGP interpretation is a normal FCG whose fuzzy values are elements of fuzzy attribute-value lattices or conjunctive fuzzy type upper semi-lattices. For the general semantics, an FCGP interpretation is a normal ideal FCG whose fuzzy values are ideals of these lattices or upper semi-lattices. As shown in [20, 9], for logic programs computing with data based on lattices that may be infinite, the general semantics has to be used to guarantee finite proofs of logical consequences of programs.

Definition 3.2 An FCGP interpretation is a normal ideal FCG (possibly infinite).

As in [34, 9], the satisfaction relation between an FCGP interpretation and an FCGP is based on the fuzzy modus ponens model of [23]. The model is consistent with classical modus ponens: when the body of a rule fully matches a fact, the head of the rule can be derived. When the body mismatches the fact by some degree, one has a degree of indetermination in reasoning, and the conclusion should be more ambiguous and less informative than when there is no mismatching. This is obtained by adding the mismatching degree to the fuzzy values in the head. For a fuzzy value A, A+ε represents A being overall pervaded with an indetermination degree ε∈[0,1]. If A is a fuzzy set on a domain U, the membership function of A+ε is defined by µ_{A+ε}(u) = Min{µ_A(u) + ε, 1}, for every u∈U. If A is a fuzzy type (t, v), then A+ε = (t, v+ε). If A is a conjunctive fuzzy type T, then A+ε = {τ+ε | τ∈T}.

Definition 3.3 Let P be an FCGP and I be an FCGP interpretation. The satisfaction relation is defined as follows:
1. I |= P iff I |= C, for every clause C in P,
2.
I |= if u then v iff the existence of an FCG projection π from u to a principal instance g of I implies the existence of an ideal FCG projection π* from v+επ to I such that (1) the mismatching degree of each mapping from a firm node in u to a node in g is 0, (2) v+επ is derived from v by adding επ to the fuzzy values in all concept and relation nodes that are not firm nodes of v, and (3) for every c∈V_C(u), c*∈V_C(v), if coref{c, c*} then π*c* = origin(πc).

I is a model of P iff I |= P. A program Q is said to be a logical consequence of a program P iff, for every FCGP interpretation I, I |= P implies I |= Q.

3.3. FCGP Fixpoint Semantics

As in [22, 20, 14, 9], each FCGP P is associated with an interpretation mapping T_P, which provides the link between the declarative and procedural semantics of P. The upward iteration of T_P is then defined with T_P↑0 being the empty ideal FCG.

Definition 3.4 Let P be an FCGP and I be an FCGP interpretation. Then T_P(I) is defined to be norm(S_P(I)∪I) where S_P(I) = {v+επ | if u then v is a clause in P and π is an FCG projection from u to a principal instance g of I such that (1) the mismatching
degree of each mapping from a firm node in u to a node in g is 0, (2) v+επ is derived from v by adding επ to the fuzzy values in all concept and relation nodes that are not firm nodes of v, and (3) for every c∈V_C(u), c*∈V_C(v), if coref{c, c*} then coref{c*, origin(πc)}}.

Example 3.2 Let P be the program in Example 3.1 and, to illustrate the calculation, suppose the following relations between the fuzzy sets denoted by the linguistic labels: fairly red = red+ε ≤ι red and fairly ripe = ripe+ε ≤ι ripe, which imply md(red / fairly red) = md(ripe / fairly ripe) = ε. Here, given two fuzzy sets A and A* on a domain U, md(A/A*) = Sup_{u∈U} Max{µ_{A*}(u) − µ_A(u), 0} denotes the mismatching degree of A to A* ([34]). Then one has:
T_P↑ω = lub{T_P↑n | n∈N} = T_P↑2 = [APPLE: #1]→(ATTR1)⇒[COLOR:@fairly red] →(ATTR2)⇒[RIPENESS:@fairly ripe].

Theorem 3.1 ([8]) Let P be an FCGP. Then T_P↑ω is the least model of P.

The significance of Theorem 3.1 is that it ensures a finite, sound and complete mechanical proof procedure for FCGPs. Indeed, if g is a finite FCG and g ≤ι T_P↑n for some n∈N, then g ≤ι T_P↑ω ≤ι I for every model I of P, which means g is a logical consequence of P. Conversely, if g is a logical consequence of P, then g must be satisfied by T_P↑ω as a model of P, i.e., g ≤ι T_P↑ω = lub{T_P↑n | n∈N}, whence g ≤ι T_P↑n for some n∈N, due to g being finite and the fuzzy values in g being principal ideals (cf. [22, 20, 14, 9]).
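The two operations at work in Example 3.2 can be checked numerically over a discretized domain; the ramp membership function for red and the value of ε below are hypothetical, chosen so the arithmetic is exact:

```python
# Numeric sketch of the two operations in Example 3.2: A+eps lifts the
# membership function by eps (clipped at 1), and md(A/A*) is the largest
# positive gap mu_A*(u) - mu_A(u) over the domain.

def pervade(a, eps):
    return {u: min(mu + eps, 1.0) for u, mu in a.items()}

def md(a, a_star):
    return max(max(a_star[u] - a[u], 0.0) for u in a)

domain = [0.0, 0.25, 0.5, 0.75, 1.0]
red = {u: u for u in domain}         # hypothetical ramp membership for "red"
eps = 0.25
fairly_red = pervade(red, eps)       # fairly red = red + eps

print(md(red, fairly_red))           # 0.25, i.e. md(red / fairly red) = eps
print(md(red, red))                  # 0.0, i.e. red <=_i red
```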
4. Procedural Semantics of FCGPs

4.1. FCGP Reductants

Definition 4.1 An FCGP annotation term is recursively defined to be of one of the following forms:
1. A fuzzy value constant, which is a fuzzy attribute-value or a conjunctive fuzzy type, or
2. A+ξ, where A is a fuzzy value constant and ξ is a variable whose value is a real number in [0,1], or
3. A fuzzy value variable, whose value is a fuzzy attribute-value or a conjunctive fuzzy type, or
4. f(τ1, τ2, ..., τm), where each τi (1 ≤ i ≤ m) is an FCGP annotation term and f is a computable ([18]) and monotonic (f(τ1, τ2, ..., τm) ≤ι f(τ′1, τ′2, ..., τ′m) whenever τi ≤ι τ′i for every i from 1 to m) function from L1 × L2 × ... × Lm to L, with L and the Li's being lattices of fuzzy attribute-values or upper semi-lattices of conjunctive fuzzy types.

FCGP annotation terms of the first three forms are called simple FCGP annotation terms.
We denote fuzzy value variables by X, Y, ..., and real number variables by ξ, ψ, ..., all of which are called annotation variables, to be distinguished from individual variables denoted by x, y, ... . An expression without annotation variables is called annotation variable-free.

Definition 4.2 Let P be an FCGP and C1, C2, ..., Cm be distinct clauses in P, where each Ck (1 ≤ k ≤ m) is of the form if uk then vk. Suppose that some concept nodes in v1, v2, ..., vm can be joined by a coreference partition operator ϖ. Then the clause:
if ϖ[u1+ξ1 u2+ξ2 ... um+ξm] then norm(ϖ[v1+ξ1 v2+ξ2 ... vm+ξm])
is called a reductant of P, where each uk+ξk (or vk+ξk) is derived from uk (or vk) by adding ξk to the fuzzy values in all concept and relation nodes that are not firm nodes of uk (or vk).

Note that in Definition 4.2 each real number variable ξk represents an unknown mismatching degree of the body of clause Ck to some fact, for Ck taking part in the reductant; ξk = 0 if the body of Ck is empty. This, on the other hand, corresponds to a tolerance degree for the head of Ck in backward chaining in [4]. Moreover, if ϖ(uk+ξk) then norm(ϖ(vk+ξk)) is also an FCGP reductant, constructed from Ck alone; in this case, ϖ corresponds to a coreference partitioning in [28] on cut-point concept nodes of vk in a unification with a goal.

Example 4.1 Figure 4.1 illustrates a reductant constructed from the two rules of the FCGP P. The first rule says “If the demand on a product is not high, then its price is not expensive”. The second rule says “If the demand on a product is not low, then its price is not cheap”. The fact says “The demand on product #2 is normal and its quality is quite good”.

program P:
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@not high] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@not expensive]
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@not low] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@not cheap]
[PRODUCT: #2]→(ATTR1)⇒[DEMAND:@normal] →(ATTR3)⇒[QUALITY:@quite good]

a reductant of P:
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@lub{not high+ξ, not low+ψ}] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@lub{not expensive+ξ, not cheap+ψ}]

Fig. 4.1. A reductant of an FCGP
Property 4.1 Let P be an FCGP. Any annotation variable-free instance, obtained by a substitution for real number variables, of a reductant of P is a logical consequence of P.

Proof. The proof is based on the FCGP declarative semantics and is similar to the proof of the analogous property of AFLP reductants (Property 4.1 in [7]).

4.2. FCGP Constraints

Definition 4.3 An FCGP constraint is defined to be of the form:
σ1 ≤ι φ1 & σ2 ≤ι φ2 & ... & σm ≤ι φm
where, for each i from 1 to m, σi and φi are two FCGP annotation terms evaluated to fuzzy values of the same domain. The constraint is said to be normal iff (1) for each i from 1 to m, σi is a simple FCGP annotation term and, if σi contains a variable, then this variable does not occur in φ1, φ2, ..., φi, and (2) for every pair σi and σj (i ≠ j), σi and σj are not the same conjunctive fuzzy type variable.

Definition 4.4 A solution for an FCGP constraint C is a substitution ϕ for the annotation variables in C such that every annotation variable-free instance of ϕC holds. An FCGP constraint is said to be solvable iff there is an algorithm to decide whether the constraint has a solution and to identify a solution if one exists.

Note that in Definition 4.4, ϕ does not necessarily contain a binding for every annotation variable. Also, a constraint having a solution is not necessarily solvable, because there may not exist an algorithm to identify what a solution is. Conversely, a solvable constraint may not have a solution.

Property 4.2 Any normal FCGP constraint is solvable.

Proof. The proof is similar to the proof of the solvability of normal AFLP constraints (Property 4.2 in [7]). The algorithm for testing satisfiability of a normal FCGP/AFLP constraint is adapted from [20] for constraints on fuzzy value terms.

Example 4.2 The constraint:
(X ≤ι quite good) & (not high+ξ ≤ι normal) & (not low+ψ ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ})
is a normal FCGP constraint.
The function lub on fuzzy sets is their intersection, which is computable and monotonic. Suppose the relations between the fuzzy sets denoted by the linguistic labels in the constraint are as follows: normal = not high∩not low = lub{not high, not low}, moderate = not expensive∩not cheap = lub{not expensive, not cheap}. Applying the algorithm presented in [7], one obtains X = quite good, ξ = md(not high / normal) = 0 and ψ = md(not low / normal) = 0, which satisfy the constraint. So, a solution for the constraint is {X/quite good, ξ/0, ψ/0}.
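This constraint solving can be reproduced numerically under the same assumptions; the membership shapes below are hypothetical, with normal constructed as the intersection (pointwise minimum) of not high and not low:

```python
# Sketch of Example 4.2's constraint solving over discretized fuzzy sets.
# lub on fuzzy sets under the information ordering is their intersection.

def md(a, a_star):
    return max(max(a_star[u] - a[u], 0.0) for u in a)

def intersection(a, b):
    return {u: min(a[u], b[u]) for u in a}

def leq_info(a, b):                    # A <=_i B: B is a fuzzy subset of A
    return all(b[u] <= a[u] for u in a)

dom = [0.0, 0.25, 0.5, 0.75, 1.0]
not_high = {u: 1.0 - u for u in dom}
not_low = {u: u for u in dom}
normal = intersection(not_high, not_low)   # normal = lub{not high, not low}

not_expensive = {u: 1.0 - u for u in dom}
not_cheap = {u: u for u in dom}
moderate = intersection(not_expensive, not_cheap)

# An inequality A+xi <=_i phi is solved by the least value xi = md(A / phi):
xi = md(not_high, normal)
psi = md(not_low, normal)
# With xi = psi = 0, the last inequality moderate <=_i lub{...} must hold:
ok = leq_info(moderate, intersection(not_expensive, not_cheap))
print(xi, psi, ok)   # 0.0 0.0 True
```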
4.3. FCGP Proof Procedure

Definition 4.5 An FCGP goal G is defined to be of the form QG || CG, where QG is the query part, a finite FCG whose fuzzy values are simple FCGP annotation terms or terms of the form lub{σ1, σ2, ..., σm} where the σi's are simple FCGP annotation terms, and CG is the constraint part, which is an FCGP constraint. The goal is said to be normal iff (1) CG is a normal FCGP constraint and (2) for a conjunctive fuzzy type variable X, there is at most one occurrence of X in QG or in an inequality X ≤ι φ in CG (there is no such restriction on occurrences of X in the right-hand sides of the inequalities in CG).

Definition 4.6 Let P be an FCGP and G be an FCGP goal. An answer for G w.r.t. P is a triple ⟨ρ, ϖ, ϕ⟩, where ρ and ϖ are respectively a referent specialization operator and a coreference partition operator on G, and ϕ is a substitution for the annotation variables in G. The answer is said to be correct iff ϕ is a solution for CG and every annotation variable-free instance of ρϖϕQG is a logical consequence of P.

The following definition of FCG unification is modified from the one in [4] with the introduction of FCGP annotation terms and constraints on them. As defined in [4], a VAR generic marker is one whose concept occurs in an FCGP query, or in the body of an FCGP rule, or in the head of an FCGP rule coreferenced with a concept in the body of the rule; a NON-VAR generic marker is one whose concept occurs in an FCGP fact, or in the head of an FCGP rule but is not coreferenced with any concept in the body of the rule.

Definition 4.7 Let u and v be finite normal FCGs. An FCG unification from u to v is a mapping θ: u → v such that:
1. ∀c∈V_C(u): type(c) is matchable to type(θc) and referent-unified(c, θc), and
2. ∀r∈V_R(u): type(r) is matchable to type(θr), and ∀i∈{1, 2, ..., arity(r)}: neighbor(θr, i) = θneighbor(r, i), and
3. No VAR generic marker is unified with different individual markers or with non-coreferenced NON-VAR generic markers.
The constraint produced by θ is denoted by Cθ and defined by the set {aval(c) ≤ι aval(θc) | c∈V_C(u) and c is a fuzzy attribute concept} ∪ {type(c) ≤ι type(θc) | c∈V_C(u)} ∪ {type(r) ≤ι type(θr) | r∈V_R(u)}. The referent specialization operator, the coreference partition operator and the resolution operator defined by θ ([4]) are denoted by ρθ, ϖθ and δθ, respectively.

Note that in Definition 4.7, if aval(c) is of the form lub{σ1, σ2, ..., σm}, where the σi's are simple FCGP annotation terms, then the constraint lub{σ1, σ2, ..., σm} ≤ι aval(θc) is equivalent to the constraint σ1 ≤ι φ & σ2 ≤ι φ & ... & σm ≤ι φ, where φ = aval(θc). This also applies to type(c) and type(r).
Definition 4.8 Let G be an FCGP goal QG || CG and C be an FCGP reductant if u then v (G and C have no variables in common). Suppose that there exists an FCG unification θ from a normalized sub-graph g of QG to v. Then the corresponding resolvent of G and C is a new FCGP goal, denoted by Rθ(G, C) and defined to be ρθϖθ[δθQG u] || Cθ & CG, where δθ deletes g from QG.

Property 4.3 Let G and C be respectively an FCGP goal and an FCGP reductant. If G is a normal FCGP goal, then any resolvent of G and C is also a normal FCGP goal.

Proof. The proof is similar to the proofs of the analogous properties of annotated logic programs (Lemma 2 in [20]) and AFLPs (Property 4.3 in [7]).

Note that the order of the inequalities within Cθ is not significant, but the order of Cθ and CG in Definition 4.8 is.

Definition 4.9 Let P be an FCGP and G be an FCGP goal. A refutation of G and P is a finite sequence G, G1, ..., Gn−1, Gn such that:
1. For each i from 1 to n, Gi = Rθi(Gi−1, Ci), where G0 = G and Ci is a reductant of P, and
2. QGn is empty, and
3. CGn is solvable and has a solution.
Example 4.3 Let P be the program in Example 4.1 and G be the following FCGP goal querying “Which product has a moderate price and what is its quality like?”:
[PRODUCT:*x]→(ATTR2)⇒[PRICE:@moderate] →(ATTR3)⇒[QUALITY:@X]
Assuming the relations between the linguistic labels in P and G as in Example 4.2, a refutation of G and P can be constructed as follows:
g0 = [PRODUCT:*x]→(ATTR2)⇒[PRICE:@moderate] (a sub-graph of QG)
C1 = if [PRODUCT:*y]→(ATTR1)⇒[DEMAND:@lub{not high+ξ, not low+ψ}] (u)
then [PRODUCT:*y]→(ATTR2)⇒[PRICE:@lub{not expensive+ξ, not cheap+ψ}] (v)
θ1: g0 → v, ρθ1 = {}, ϖθ1 = {{[PRODUCT:*x], [PRODUCT:*y]}}
G1 = [PRODUCT:*y]→(ATTR1)⇒[DEMAND:@lub{not high+ξ, not low+ψ}] →(ATTR3)⇒[QUALITY:@X] || (moderate ≤ι lub{not expensive+ξ, not cheap+ψ})
g1 = [PRODUCT:*y]→(ATTR1)⇒[DEMAND:@lub{not high+ξ, not low+ψ}] (a sub-graph of QG1)
C2 = [PRODUCT: #2]→(ATTR1)⇒[DEMAND:@normal] →(ATTR3)⇒[QUALITY:@quite good]
θ2: g1 → C2, ρθ2 = {([PRODUCT:*y], #2)}, ϖθ2 = {}
G2 = [#2]→(ATTR3)⇒[QUALITY:@X] || (lub{not high+ξ, not low+ψ} ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ})
C3 = C2
θ3: QG2 → C3, ρθ3 = {}, ϖθ3 = {}
G3 = || (X ≤ι quite good) & (lub{not high+ξ, not low+ψ} ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ})
= || (X ≤ι quite good) & (not high+ξ ≤ι normal) & (not low+ψ ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ})
As in Example 4.2, one has X = quite good and ξ = ψ = 0 as a solution for the constraint above, whence the corresponding answer for G w.r.t. P is the triple ⟨ρ, ϖ, {X/quite good, ξ/0, ψ/0}⟩, with ρ and ϖ composed from the referent specialization and coreference partition operators of the refutation steps. Note that, if a clause rather than a reductant of P were used to resolve g0, there would be no refutation of G and P because, in general, neither moderate ≤ι not expensive nor moderate ≤ι not cheap holds. Also, early type resolution ([4]) is applied when g1 is deleted from QG1.

The following theorems state the soundness and the completeness of this FCGP proof procedure. Due to space limitations, we omit the proofs of these theorems, which were presented in the submitted version of this paper.

Theorem 4.4 (Soundness) Let P be an FCGP and G be an FCGP goal. If G, G1, ..., Gn−1, Gn is a refutation of G and P, and ϕ is a solution for CGn, then the corresponding triple ⟨ρ, ϖ, ϕ⟩ is a correct answer for G w.r.t. P.

Theorem 4.5 (Completeness) Let P be an FCGP and G be a normal FCGP goal. If there exists a correct answer for G w.r.t. P, then there exists a refutation of G and P.

4.4. Remarks

A crisp attribute-value is a special fuzzy one, defined by a fuzzy set whose membership function takes only the values 0 and 1. A basic type is a special fuzzy one whose fuzzy truth-value is absolutely true. So a CG can be considered as a special FCG, and a CGP as a special FCGP with all nodes in each of its rules being firm nodes. We now give remarks on proof procedures for both FCGPs and CGPs from the viewpoint of lattice-based reasoning.
For discussion, let us consider the following CGP as a special FCGP:
if [PERSON: *x]→(WORK_FOR)→[PERSON: *] then [EMPLOYEE: *x]
if [PERSON: *y]→(ATTEND)→[UNIVERSITY: *] then [STUDENT: *y]
[PERSON: John]→(WORK_FOR)→[PERSON: *] →(ATTEND)→[UNIVERSITY: Queensland]
[PERSON: John]→(BROTHER_OF)→[GIRL: Mary]
where the first rule says “A person is an employee if working for another”, the second rule says “A person is a student if attending a university”, the first fact says “John works for some person and attends the university Queensland”, and the second fact says “John is a brother of Mary”. Note the rules in this CGP, where two coreferenced concepts occur in the body
and the head of a rule with two different concept types. These rules realize a close coupling between the concept type hierarchy and the axiomatic part of a knowledge base ([3, 4]). A backward chaining proof procedure based on clause selection, like the ones in [15, 28, 4], is not complete when such rules are present or when facts are not in CG normal form. For example, it cannot satisfy the goal [EMPLOYEE-STUDENT: John] where EMPLOYEE-STUDENT = lub{EMPLOYEE, STUDENT}, because it does not combine the rule-defined concept types of an individual, which are EMPLOYEE and STUDENT in this case. Meanwhile, CG normal form is required to combine the fact-defined concept types of an individual. Such combinations are inherent in a forward chaining method (cf. [28, 19]). Actually, the significance of FCGP reductants is to combine concept types, as well as other lattice-based data, for backward chaining. In this example, a reductant of the CGP is:
if [PERSON: *x]→(WORK_FOR)→[PERSON: *] →(ATTEND)→[UNIVERSITY: *] then [EMPLOYEE-STUDENT: *x]
which together with the first fact resolves the goal [EMPLOYEE-STUDENT: John]. Moreover, in the light of annotated logic programming ([20, 7]), types have been viewed as annotations which can also be queried about. This reveals an advantage of the CG notation that makes this view possible, which has not been addressed in previous works on CGPs/FCGPs ([15, 28, 4, 19]). With classical first-order logic notation, this view is hindered because types are encoded in predicate symbols, whence queries about the sorts of objects or the relations among objects have not been considered (cf. [1, 3]). For example, the following CG query asks “What is John and what is John’s relation with Mary?”:
[X: John]→(Y)→[PERSON: Mary]
where X is a concept type variable and Y is a relation type variable.
Applying the presented FCGP proof procedure, one obtains X = EMPLOYEE-STUDENT and Y = BROTHER_OF, which say “John is an employee and a student, and John is a brother of Mary”, as the most informative answer w.r.t. the given CGP.
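The type combination behind such reductants can be sketched as a join of conjunctive types that keeps only the most specific elements of their union; the tiny hierarchy below is hypothetical:

```python
# Sketch of conjunctive type join: lub{EMPLOYEE, STUDENT} is represented
# by the set of both types, since they are incomparable, while comparable
# types collapse to the more specific one. (a, b) in SUBTYPE reads
# "a <=_i b", i.e. b is a subtype of a.

SUBTYPE = {("PERSON", "EMPLOYEE"), ("PERSON", "STUDENT")}

def leq(a, b):
    return a == b or (a, b) in SUBTYPE

def join(t1, t2):
    union = set(t1) | set(t2)
    # drop every type strictly below another one in the union
    return {a for a in union if not any(a != b and leq(a, b) for b in union)}

print(sorted(join({"EMPLOYEE"}, {"STUDENT"})))  # ['EMPLOYEE', 'STUDENT']
print(sorted(join({"PERSON"}, {"STUDENT"})))    # ['STUDENT']
```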
5. Conclusions

The syntax of FCGPs with the introduction of fuzzy types has been presented, providing a unified structure and treatment for both FCGPs and CGPs. The general declarative semantics of FCGPs, based on the notion of ideal FCGs, has been defined, ensuring finite proofs of logical consequences of programs. The fixpoint semantics of FCGPs has been studied as the bridge between their declarative and procedural semantics. Then, a new SLD-style proof procedure for FCGPs has been developed and proved to be sound and complete w.r.t. their declarative semantics. The two main new points of the presented FCGP proof procedure are that it selects reductants rather than clauses of an FCGP in resolution steps, and that it involves solving constraints on fuzzy value terms. As has been analysed, a CGP/FCGP SLD-style proof procedure based on clause selection is not generally complete. The constraint solving supports more expressive queries, possibly about not only fuzzy attribute-values but also fuzzy types. Since a CGP can be considered as a special FCGP, the results obtained here for
FCGPs could be applied to CGP systems. They could also be useful for any extension that adds lattice-based annotations to CGs to enhance their knowledge representation and reasoning power. The presented FCGP system, on the one hand, extends CGPs to deal with the vague and imprecise information pervading the real world as reflected in natural languages. On the other hand, to our knowledge, it is the first fuzzy order-sorted logic programming system for handling uncertainty about the types of objects. When only fuzzy sets of special cases are involved, FCGPs could become possibilistic CGPs, where concept and relation nodes in a CG are weighted only by values in [0,1] interpreted as necessity degrees. These are less expressive than general FCGPs but have simpler computation and are still very useful for CG-based systems dealing with uncertainty. Besides, FCGPs could be extended further to represent and reason with other kinds of uncertain knowledge, such as imprecise temporal information or vague generalized quantifiers. These are among the topics currently being investigated.

Acknowledgment. We would like to thank Marie-Laure Mugnier and the anonymous referees for the comments that helped us revise the paper for readability.

References
1. Aït-Kaci, H. & Nasr, R. (1986), Login: A Logic Programming Language with Built-In Inheritance. J. of Logic Programming, 3: 185-215.
2. Baldwin, J.F. & Martin, T.P. & Pilsworth, B.W. (1995), Fril - Fuzzy and Evidential Reasoning in Artificial Intelligence. John Wiley & Sons, New York.
3. Beierle, C. & Hedtstück, U. & Pletat, U. & Schmitt, P.H. & Siekmann, J. (1992), An Order-Sorted Logic for Knowledge Representation Systems. J. of Artificial Intelligence, 55: 149-191.
4. Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Unification and Resolution Proof Procedure for Fuzzy Conceptual Graph Programs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce’s Dream, LNAI No. 1257, Springer-Verlag, pp. 386-400.
5.
Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Types and Their Lattices. In Proc. of the 6th IEEE International Conference on Fuzzy Systems, pp. 805-812.
6. Cao, T.H. & Creasy, P.N. (1997), Universal Marker and Functional Relation: Semantics and Operations. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce’s Dream, LNAI No. 1257, Springer-Verlag, pp. 416-430.
7. Cao, T.H. (1997), Annotated Fuzzy Logic Programs. Int. J. of Fuzzy Sets and Systems. To appear.
8. Cao, T.H. & Creasy, P.N. (1997), Fuzzy Conceptual Graph Programs and Their Fixpoint Semantics. Tech. Report No. 424, Department of CS&EE, University of Queensland.
9. Cao, T.H. (1998), Annotated Fuzzy Logic Programs for Soft Computing. In Proc. of the 2nd International Conference on Computational Intelligence and Multimedia Applications, World Scientific, pp. 459-464.
10. Carpenter, B. (1992), The Logic of Typed Feature Structures with Applications to Unification Grammars, Logic Programs and Constraint Resolution. Cambridge University Press.
11. Chevallet, J-P. (1992), Un Modèle Logique de Recherche d’Informations Appliqué au Formalisme des Graphes Conceptuels. Le Prototype ELEN et Son Expérimentation sur un Corpus de Composants Logiciels. PhD Thesis, Université Joseph Fourier.
12. Dubois, D. & Lang, J. & Prade, H. (1994), Possibilistic Logic. In Gabbay, D.M. et al. (Eds.): Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Oxford University Press, pp. 439-514.
13. Genest, D. & Chein, M. (1997), An Experiment in Document Retrieval Using Conceptual Graphs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce’s Dream, LNAI No. 1257, Springer-Verlag, pp. 489-504.
14. Ghosh, B.C. & Wuwongse, V. (1995), Conceptual Graph Programs and Their Declarative
Semantics. IEICE Trans. on Information and Systems, Vol. E78-D, No. 9, pp. 1208-1217.
15. Ghosh, B.C. (1996), Conceptual Graph Language - A Language of Logic and Information in Conceptual Structures. PhD Thesis, Asian Institute of Technology.
16. Grätzer, G. (1978), General Lattice Theory. Academic Press, New York.
17. Ho, K.H.L. (1994), Learning Fuzzy Concepts By Examples with Fuzzy Conceptual Graphs. In Proc. of the 1st Australian Conceptual Structures Workshop.
18. Hopcroft, J.E. & Ullman, J.D. (1979), Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Massachusetts.
19. Kerdiles, G. & Salvat, E. (1997), A Sound and Complete CG Proof Procedure Combining Projections with Analytic Tableaux. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce’s Dream, LNAI No. 1257, Springer-Verlag, pp. 371-385.
20. Kifer, M. & Subrahmanian, V.S. (1992), Theory of Generalized Annotated Logic Programming and Its Applications. J. of Logic Programming, 12: 335-367.
21. Klawonn, F. (1995), Prolog Extensions to Many-Valued Logics. In Höhle, U. & Klement, E.P. (Eds.): Non-Classical Logics and Their Applications to Fuzzy Subsets, Kluwer Academic Publishers, Dordrecht, pp. 271-289.
22. Lloyd, J.W. (1987), Foundations of Logic Programming. Springer-Verlag, Berlin.
23. Magrez, P. & Smets, P. (1989), Fuzzy Modus Ponens: A New Model Suitable for Applications in Knowledge-Based Systems. Int. J. of Intelligent Systems, 4: 181-200.
24. Mineau, G.W. (1994), Views, Mappings and Functions: Essential Definitions to the Conceptual Graph Theory. In Tepfenhart, W.M. & Dick, J.P. & Sowa, J.F. (Eds.): Conceptual Structures - Current Practices, LNAI No. 835, Springer-Verlag, pp. 160-174.
25. Morton, S. (1987), Conceptual Graphs and Fuzziness in Artificial Intelligence. PhD Thesis, University of Bristol.
26. Mukaidono, M. & Shen, Z. & Ding, L. (1989), Fundamentals of Fuzzy Prolog. Int. J. of Approximate Reasoning, 3: 179-194.
27. Myaeng, S.H. & Khoo, C.
(1993), On Uncertainty Handling in Plausible Reasoning with Conceptual Graphs. In Pfeiffer, H.D. & Nagle, T.E. (Eds.): Conceptual Structures - Theory and Implementation, LNAI No. 754, Springer-Verlag, pp. 137-147. 28. Salvat, E. & Mugnier, M.L. (1996), Sound and Complete Forward and Backward Chainings of Graph Rules. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 248-262. 29. Sowa, J.F. (1984), Conceptual Structures: Information Processing in Mind and Machine. AddisonWesley, Massachusetts. 30. Sowa, J.F. (1991), Towards the Expressive Power of Natural Languages. In Sowa, J.F. (Ed.): Principles of Semantic Networks - Explorations in the Representation of Knowledge, Morgan Kaufmann Publishers, San Mateo, CA, pp. 157-189. 31. Sowa, J.F. (1997), Matching Logical Structure to Linguistic Structure. In Houser, N. & Roberts, D.D. & Van Evra, J. (Eds.): Studies in the Logic of Charles Sanders Peirce, Indiana University Press, pp. 418-444. 32. Umano, M. (1987), Fuzzy Set Prolog. In Preprints of the 2nd International Fuzzy Systems Association Congress, pp. 750-753. 33. Wuwongse, V. & Manzano, M. (1993), Fuzzy Conceptual Graphs. In Mineau, G.W. & Moulin, B. & Sowa, J.F. (Eds.): Conceptual Graphs for Knowledge Representation, LNAI No. 699, SpringerVerlag, pp. 430-449. 34. Wuwongse, V. & Cao, T.H. (1996), Towards Fuzzy Conceptual Graph Programs. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 263-276. 35. Zadeh, L.A. (1965), Fuzzy Sets. J. of Information and Control, 8: 338-353. 36. Zadeh, L.A. (1978), PRUF - A Meaning Representation Language for Natural Languages. Int. J. of Man-Machine Studies, 10: 395-460. 37. Zadeh, L.A. (1990), The Birth and Evolution of Fuzzy Logic. Int. J. of General Systems, 17: 95105. 38. Zadeh, L.A. (1996), Fuzzy Logic = Computing with Words. IEEE Trans. 
on Fuzzy Systems, 4: 103111.
Knowledge Querying in the Conceptual Graph Model: The RAP Module

Olivier Guinaldo¹ and Ollivier Haemmerlé²

¹ LIMOS – U. d'Auvergne, IUT de Clermont-Ferrand, BP 86, F-63172 Aubière Cedex, France
[email protected]
² INA-PG, Département OMIP, 16, rue Claude Bernard, F-75231 Paris Cedex 05, France
[email protected]

Abstract. The projection operation can be used to query a CG knowledge base by searching for all the specializations of a particular CG, the question. But in some cases this may not be enough, particularly when the knowledge allowing an answer is distributed among several graphs of the fact base. We define two knowledge-querying mechanisms that work by means of graph operations. These mechanisms are equivalent and logically founded. The first one modifies the knowledge base, while the second modifies the question. The Rap module is an implementation of the latter algorithm on CoGITo.
1 Introduction
The specialization relation, which can be computed by means of the projection operation, is the basis for reasoning in the CG model. It expresses that one graph represents more specific knowledge than another. One of the strong points of the CG model is that it has a sound and complete logical semantics, which means that graphical reasoning is equivalent to logical deduction on the formulae associated with the graphs [1, 2]. In this paper, we consider a knowledge-based system in which the fact base and the query are both represented as CGs. An important step is to provide methods for querying such a fact base. The projection operation can be used for such a search, but the knowledge allowing one to answer a query can be distributed among several graphs of the knowledge base, in which case no projection exists. We propose two algorithms designed to avoid that drawback. These algorithms are based on those of the Rock system [3, 4]. In the Rock system we used heuristics in order to limit the combinatorial explosion. Following our work on the management of large knowledge bases, we implemented these algorithms and showed that they are equivalent to logical deduction [5, 6]. This paper results from that work, which was never published in an international venue.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 287-294, 1998. © Springer-Verlag Berlin Heidelberg 1998
We propose a first reasoning algorithm which is sound and complete with respect to logical deduction. This algorithm works on the knowledge base: a first step merges the knowledge in order to allow the projection to run. We then propose a second algorithm that uses a dual mechanism: the knowledge base is not modified, but the question is split and partial answers are searched for. We proved in [6] that this second mechanism is equivalent to the first one, and that it is sound and complete with respect to logical deduction. We then show that this second algorithm offers several advantages over the first one. The last section of this article presents the Rap module, which is based on the second algorithm. Rap is implemented on the CoGITo platform [7].
2 Knowledge Querying
In the following, we call fact base a set of CGs in normal form¹ representing the facts of our system. Our goal is to provide CG answers to a CG question asked on a fact base. The projection operation is the ground operation used to make deductions.

2.1 The reasoning
In this paper, we focus exclusively on reasoning in terms of graph operations. However, in order to clarify the notions of fact base, question and answers, we propose an intuitive definition of these notions; a formal definition in first-order logic is given in [6].

Definition 1. Let FB = {F1, ..., Fk} be a set of CGs in normal form defined on the support S, let Φ(F1), ..., Φ(Fk) be the associated formulae, and let Φ(S) be the set of formulae associated with S. Let Q be a CG defined on S and Φ(Q) its associated formula. We say that there exists an answer A to Q on FB iff Φ(S), Φ(F1), ..., Φ(Fk) ⊢ Φ(Q). The construction of such an answer is presented in the next section.

2.2 Composition: The first algorithm
Our goal is to propose a sound and complete algorithm computing CG answers according to the previous definition. In other words, we want to show that it is possible to give a CG answer A to the CG question Q without using a logical theorem prover. We could define the notion of answer as a "specialization of the CG question belonging to the CG fact base". The algorithm would then be "project the CG question onto each CG fact". But such a definition cannot solve the following problem: the knowledge relative to an individual marker and allowing one to answer a question may be distributed among several CG facts of the base. In that case it is impossible to answer, because no projection can be found. For example, consider the question Q and the fact base FB presented in Fig. 1. Q cannot be projected onto any graph of FB, but the logical formula associated with Q can be deduced from the part of FB in bold lines. Graph F in Fig. 1 is obtained by the disjunctive sum of F1 and F2 (a disjunctive sum is a specialization operation of [8] consisting in the juxtaposition of two CGs), followed by the join of the two individual vertices Mike. This graph obviously admits a projection of Q onto its grey subgraph, and that subgraph of F is an answer to Q.

¹ According to [8], a CG is in normal form if it does not have two concept vertices with the same individual marker; the normal form of a graph is computed by merging the concept vertices with the same individual marker.
[Figure 1 here: the facts F1 and F2 of the fact base FB (involving [Man: Mike], [Car: CLT63], Bike, Adult, Blue and the relations agt, obj, poss, carac), the question Q, and the graph F obtained by joining F1 and F2 on the individual vertex Mike.]

Fig. 1. Example of a CG fact base and a CG question. No graph of FB admits a projection of Q. But graph F, the logical interpretation of which is equivalent to the conjunction of the formulae associated with the graphs in FB, admits a projection of Q. There exists an answer to Q in FB.
So the first algorithm consists in considering the fact base FB = {F1, ..., Fk} as a unique graph F resulting from the disjunctive sum of F1, ..., Fk, then normalizing that graph by merging all the vertices with the same individual marker, and finally projecting Q onto F. The Composition algorithm is presented in detail in [6].
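The merging step of Composition can be sketched with a minimal CG data structure: a graph is a dictionary of concept vertices and a list of relation edges. This representation and the helper names are our own illustrative assumptions, not the CoGITo implementation; in particular, a full implementation would label a merged vertex with the infimum of the merged types in the type lattice, while this sketch simply keeps the first type seen.

```python
from itertools import count

def disjunctive_sum(graphs):
    """Juxtapose CGs: union of their concept and relation sets, with fresh ids."""
    fresh = count()
    concepts, relations = {}, []
    for g in graphs:
        renaming = {v: next(fresh) for v in g["concepts"]}
        for v, label in g["concepts"].items():
            concepts[renaming[v]] = label          # label = (type, marker or None)
        for rtype, args in g["relations"]:
            relations.append((rtype, [renaming[v] for v in args]))
    return {"concepts": concepts, "relations": relations}

def normalize(g):
    """Merge concept vertices carrying the same individual marker.

    Simplification (assumption of this sketch): the type of the first vertex
    seen for a marker is kept, instead of the infimum of the merged types.
    """
    representative = {}   # individual marker -> chosen vertex id
    merge = {}            # vertex id -> vertex id it is merged into
    concepts = {}
    for v, (ctype, marker) in g["concepts"].items():
        if marker is not None and marker in representative:
            merge[v] = representative[marker]
        else:
            if marker is not None:
                representative[marker] = v
            merge[v] = v
            concepts[v] = (ctype, marker)
    relations = [(r, [merge[v] for v in args]) for r, args in g["relations"]]
    return {"concepts": concepts, "relations": relations}

# The two facts of Fig. 1, reduced to their "poss" edges:
F1 = {"concepts": {0: ("Man", "Mike"), 1: ("Car", "CLT63")},
      "relations": [("poss", [0, 1])]}
F2 = {"concepts": {0: ("Man", "Mike"), 1: ("Bike", None)},
      "relations": [("poss", [0, 1])]}
F = normalize(disjunctive_sum([F1, F2]))
```

After normalization, F contains a single vertex [Man: Mike] carrying both poss edges, so a question mentioning both possessions can now be projected onto F.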
2.3 Blues: The second algorithm
Two drawbacks of the Composition algorithm can be noted. Firstly, the knowledge base is modified by the disjunctive sum and the normalization. This is not a problem in terms of the complexity of these operations, but the original form of the knowledge is lost. This can be a problem, for instance, if the knowledge represents sentences from a text and one wants to know easily which sentences an answer comes from. Secondly, Composition involves projecting the CG question into a graph whose size is that of the whole knowledge base. In terms of implementation this can be detrimental, since the whole knowledge base has to be loaded into memory in order to compute the projection.

That is why we propose the Blues² algorithm, which does not modify the fact base: Blues splits the question instead of merging the CG base. The main idea of this algorithm was proposed by Carbonneill and Haemmerlé [4]. It is an equivalent alternative to the Composition algorithm. From now on, we assume that all the CG facts are individually in normal form: the combinatorics increase significantly when we work with arbitrary graphs, while putting a graph into normal form (an equivalent form in terms of logic) has linear complexity in the size of the graph.

Presentation and definitions. In the Composition algorithm, we have seen that the answers to a question Q on a fact base FB are graphs resulting from the projection of Q into the graph produced by the disjunctive sum of the CG facts, put under normal form. The Blues algorithm simulates these operations in two stages:

1. It splits the question instead of merging the facts, then tries to match each generated sub-question into the base in order to obtain partial answers.
2. It expresses conditions on the recombination of these partial answers in order to generate exact answers.

More precisely, the splitting of a question Q gives us a set Q = {Q1, ..., Qi, ..., Qn} of CGs that we call sub-questions, and a set C = {C1, ..., Cj, ..., Cm} of sets of concept vertices. Each sub-question Qi is a copy of a connected subgraph of Q which has at most one relation vertex (together with its neighbouring concept vertices). Each Cj is composed of all the concept vertices in Q which result from the splitting of the same concept vertex of Q, called a cut vertex. The cut vertices of Q are the concept vertices that have at least two distinct neighbouring relation vertices. Moreover, we denote by ci the concept vertex of the sub-question Qi generated by the cut vertex c of Q (see Fig. 2).

Definition 2. We call partial answer to a CG question Q a graph that results from the projection Πi of a sub-question Qi into a graph of the base FB. We denote it Πi(Qi).
Building alL the solUtions aftEr splitS.
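The splitting phase of Blues can be sketched on the same kind of CG encoding (an illustrative representation of our own: a graph is a dictionary of concept vertices and a list of relation edges). Each relation vertex yields one star-shaped sub-question, and every concept vertex with at least two neighbouring relations becomes a cut vertex whose copies are recorded.

```python
def split_question(q):
    """Split a CG question into one sub-question per relation vertex.

    Returns (subquestions, cut_classes): cut_classes maps each cut vertex of q
    to the list of (sub-question index, vertex id) pairs identifying its copies.
    """
    degree = {}
    for _, args in q["relations"]:
        for v in set(args):
            degree[v] = degree.get(v, 0) + 1
    cut_vertices = {v for v, d in degree.items() if d >= 2}
    subquestions, cut_classes = [], {v: [] for v in cut_vertices}
    for i, (rtype, args) in enumerate(q["relations"]):
        star = {"concepts": {v: q["concepts"][v] for v in args},
                "relations": [(rtype, list(args))]}
        subquestions.append(star)
        for v in set(args):
            if v in cut_vertices:
                cut_classes[v].append((i, v))
    return subquestions, cut_classes

# The question Q of Fig. 2: an adult person who drives a car and possesses a bike.
Q = {"concepts": {0: ("Drive", None), 1: ("Car", None), 2: ("Person", None),
                  3: ("Adult", None), 4: ("Bike", None)},
     "relations": [("obj", [0, 1]), ("agt", [0, 2]),
                   ("carac", [2, 3]), ("poss", [2, 4])]}
subs, cuts = split_question(Q)
```

Here [Drive] and [Person] play the roles of the two cut vertices c and c′: the split yields four sub-questions, with two copies of c and three copies of c′, matching the sets Q and C of Fig. 2.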
[Figure 2 here: the question Q, asking for a person who drives a car, is characterized as an adult, and possesses a bike, is split into four star-shaped sub-questions Q1 (obj), Q2 (agt), Q3 (carac) and Q4 (poss).]

Fig. 2. Split of a CG question Q. c and c′ are two cut vertices of Q that generate four sub-questions. We obtain the sets Q = {Q1, Q2, Q3, Q4} and C = {{c1, c2}, {c′2, c′3, c′4}}.
Our goal is to know whether a set of partial answers P = {Π1(Q1), ..., Πn(Qn)} can be recombined into an exact answer to Q. We denote by Π the sum of the projections Π1, ..., Πn, such that ∀i, 1 ≤ i ≤ n, if v ∈ Qi then Π(v) = Πi(v).

Definition 3. Let Cj = {ci1, ..., cil} be the set of concept vertices resulting from the splitting of the cut vertex c of Q, and let Πik(cik), 1 ≤ k ≤ l, be the image under Πik of the vertex cik of the sub-question Qik. The set ΠCj = {Πi1(ci1), ..., Πil(cil)} is recombinable if each pair (Πiu(ciu), Πiv(civ)), with u and v between 1 and l, satisfies one of the following conditions:

a) Πiu(ciu) and Πiv(civ) are the same vertex; the splitting was not necessary.
b) Πiu(ciu) and Πiv(civ) are distinct but have the same individual marker.

Definition 4. Let Π1(Q1), ..., Πn(Qn) be partial answers obtained by projecting the sub-questions Q1, ..., Qn of Q by Π1, ..., Πn on the CG fact base FB. Let C = {C1, ..., Cj, ..., Cm} be the set of concept vertex sets due to the splitting of the CG question Q, let Π be the sum of the projections Π1, ..., Πn as above, and let Q1,...,n be the graph resulting from the disjunctive sum of Q1, ..., Qn. If ∀j, 1 ≤ j ≤ m, ΠCj is recombinable, then we call answer to Q the normal form of the CG Π(Q1,...,n).

Figure 3 shows an example of the recombination of a CG answer by the Blues algorithm. Note that Blues computes all the CG answers to a CG question Q on a CG fact base. The Blues algorithm is presented in detail in [6]. We also proved in that article that the Blues algorithm is a sound and complete reasoning mechanism with respect to our logical definition of reasoning (Definition 1), and that it is equivalent to the Composition algorithm. Moreover, it is important to note that both algorithms rely on the projection operation, whose underlying problem is NP-complete [9].
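Conditions a) and b) of Definition 3 translate directly into code. In this sketch (the data layout is our own assumption, not taken from the paper), a projection of sub-question Qi is a mapping from its vertices to pairs (target vertex of the fact base, individual marker of that target, possibly None).

```python
def recombinable(cut_class, projections):
    """Check Definition 3 for one cut-vertex class.

    cut_class:   list of (sub-question index i, vertex id) copies of a cut vertex.
    projections: projections[i] maps vertices of sub-question i to
                 (target vertex id, individual marker or None).
    """
    images = [projections[i][v] for i, v in cut_class]
    for target_u, marker_u in images:
        for target_v, marker_v in images:
            same_vertex = target_u == target_v                           # condition a)
            same_marker = marker_u is not None and marker_u == marker_v  # condition b)
            if not (same_vertex or same_marker):
                return False
    return True

# Partial answers in the spirit of Fig. 3: Q1 and Q2 project into fact F1,
# Q3 and Q4 into fact F2; "c" copies the Drive cut vertex, "p" the Person one.
projections = [
    {"c": ("f1:drive", None)},
    {"c": ("f1:drive", None), "p": ("f1:mike", "Mike")},
    {"p": ("f2:mike", "Mike")},
    {"p": ("f2:mike", "Mike")},
]
```

The class of the Drive copies is recombinable by condition a) (same image vertex), the class of the Person copies by condition b) (distinct vertices, same marker Mike); two generic vertices in different facts would not be recombinable.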
[Figure 3 here: the four partial answers Π1(Q1), ..., Π4(Q4) found in the facts F1 and F2, and the answer A obtained by duplication and normalization: Mike, an adult man, drives the blue car CLT63 and possesses a bike.]

Fig. 3. Recombination of answers. A is a CG answer to the CG question Q of Figure 2 on the CG fact base FB = {F1, F2}. Each Πi(Qi) is the partial answer to sub-question Qi on FB. a and b symbolize the different cases of recombination on ΠC1 = {Π1(c1), Π2(c2)} and ΠC2 = {Π2(c′2), Π3(c′3), Π4(c′4)}.
3 The RAP Module

3.1 Presentation
The Rap module (seaRch for All Projections) is essentially an implementation of the Blues algorithm, and thus provides sound and complete reasoning. The Rap module is implemented on the CoGITo platform [10, 7], and can therefore take advantage of its graph base management system [5, 11]. This management system is based on the hierarchy of CGs induced by the specialization relation. In addition to the usual techniques of classification and indexing [12, 13], the CoGITo platform implements hash-coding and filtering mechanisms that reduce the search domain without using any projection operation. Moreover, the projection algorithms that are used are efficient [9, 11].

As far as we know [14], the Rap module is the only reasoning module that works in a general framework (the projection is not restricted to an injective projection) and that is sound and complete with respect to logical deduction. Among the systems close to ours, we can cite the PEIRCE system [15]. PEIRCE ignores the possibility of "split" knowledge, and it uses injective projection; but it proposes "close match" CGs, which are pertinent answers. Rock (Reasoning On Conceptual Knowledge) [3] is another system close to ours. It searches for exact and pertinent answers by modifying the question graph, and it takes type definitions into account (Rap does not). But its major drawback is that it is not complete: Rock can fail to find exact answers to a question. This drawback was at the origin of the development of the Blues algorithm by Carbonneill and Haemmerlé [4, 6].

3.2 Optimization of the Blues algorithm
Two optimizations of the Blues algorithm have been implemented in the Rap module. The first one consists in using the set of specializations of each sub-question graph as the search domain, instead of the whole fact base. This is done by means of the specialization search functionalities of CoGITo, which use hierarchical structuring and indexing. The second optimization consists in recombining the answers only when it is certain that the sets of concept vertices of ΠC are recombinable, instead of attempting a hypothetical recombination for every combination of partial answers. More precisely, this optimization is based on the following property: the choice of a partial answer to a sub-question Qi restricts the choice of the partial answers to the sub-questions located in the immediate neighbourhood of Qi. If no partial answer is possible for one of these sub-questions, then the chosen partial answer of Qi cannot lead to an exact answer to Q. Two kinds of restriction are used in this optimization. In the example of Figures 2 and 3, when the algorithm chooses a partial answer of Q1 in F1, it must choose a partial answer also belonging to F1 as the partial answer of Q2, in order to make a recombination of type a. Then, for Q3 and Q4, the algorithm has to choose partial answers containing the individual vertex [Man: Mike], in order to make a recombination of type b. The indexed management of graphs in the CoGITo platform makes such a restriction of the search domain easy.
4 Conclusion and Perspectives
We have studied two reasoning algorithms based only on graph operations: the Composition algorithm, which modifies the CG fact base in order to "compose" knowledge that may be split across several distinct CGs, and the Blues algorithm, which works by splitting the question graph, searching for partial answers, and then recombining a complete answer. These algorithms are sound and complete with respect to logical deduction. This work was primarily theoretical, but we have implemented the Blues algorithm in order to test it and observe its behaviour on large CG bases. The first test concerns a base of 10000 randomly generated graphs. The CPU times of Blues are close to those of Composition. This is a valuable result, because it shows that using an algorithm that respects the original form of a CG fact base incurs no penalty.
This first step in the development of a reasoning module for the CoGITo platform should lead us to a more complete module taking extensions of the CG model into account (for instance nested CGs [8]). Another direction of study is the exploration of techniques for providing pertinent answers, as was done in the Rock system. The Blues algorithm could easily be adapted to heuristic reasoning during its recombination phase. This would allow one to combine the completeness of the Rap module with the Rock system's ability to provide pertinent answers.
References

1. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts, 1984.
2. M. Chein and M.-L. Mugnier. Conceptual graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992.
3. B. Carbonneill and O. Haemmerlé. ROCK: un système de question/réponse fondé sur le formalisme des graphes conceptuels. In Actes du 9ème Congrès Reconnaissance des Formes et Intelligence Artificielle, pages 159–169, Paris, 1994.
4. B. Carbonneill. Vers un système de représentation de connaissances et de raisonnement fondé sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, January 1996.
5. O. Guinaldo. Étude d'un gestionnaire d'ensembles de graphes conceptuels. PhD thesis, Université Montpellier 2, December 1996.
6. O. Guinaldo and O. Haemmerlé. Algorithmes de raisonnement dans le formalisme des graphes conceptuels. In Actes du XIe congrès RFIA, volume 3, pages 335–344, Clermont-Ferrand, 1998.
7. O. Guinaldo and O. Haemmerlé. CoGITo: une plate-forme logicielle pour raisonner avec des graphes conceptuels. In Actes du XVe congrès INFORSID, pages 287–306, Toulouse, June 1997.
8. M.-L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
9. M.-L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In H.D. Pfeiffer, editor, Proceedings of the 7th Annual Workshop on Conceptual Graphs, pages 49–58, New Mexico State University, 1992.
10. O. Haemmerlé. CoGITo: une plate-forme de développement de logiciels sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, January 1995.
11. O. Guinaldo. Conceptual graph isomorphism: algorithm and use. In Proceedings of the 4th Int. Conf. on Conceptual Structures, LNAI, Springer-Verlag, pages 160–174, Sydney, Australia, August 1996.
12. R. Levinson. Pattern associativity and the retrieval of semantic networks. Computers Math. Applic., 23(6-9):573–600, 1992.
13. G. Ellis. Compiled hierarchical retrieval. In E. Way, editor, Proceedings of the 6th Annual Workshop on Conceptual Graphs, pages 187–207, Binghamton, 1991.
14. D. Lukose, editor. Proceedings of the First CGTOOLS Workshop, University of New South Wales, Sydney, Australia, August 1996.
15. G. Ellis, R. Levinson, and P. Robinson. Managing complex objects in PEIRCE. International Journal of Human-Computer Studies, 41:109–148, 1994.
Stepwise Construction of the Dedekind-MacNeille Completion

Bernhard Ganter¹ and Sergei O. Kuznetsov²

¹ Technische Universität Dresden, Institut für Algebra, D-01062 Dresden
² Department of Theoretical Foundations of Informatics, All-Russia Institute for Scientific and Technical Information (VINITI), ul. Usievicha 20a, 125219 Moscow, Russia
Abstract. Lattices are mathematical structures which are frequently used for the representation of data. Several authors have considered the problem of incremental construction of lattices. We show that with a rather general approach, this problem becomes well-structured. We give simple algorithms with satisfactory complexity bounds.
For a subset A ⊆ P of an ordered set (P, ≤), let A↑ denote the set of all upper bounds, that is, A↑ := {p ∈ P | a ≤ p for all a ∈ A}. The set A↓ of lower bounds is defined dually. A cut of (P, ≤) is a pair (A, B) with A, B ⊆ P, A↑ = B, and A = B↓. It is well known that these cuts, ordered by

(A1, B1) ≤ (A2, B2) :⇔ A1 ⊆ A2 (⇔ B2 ⊆ B1),

form a complete lattice, the Dedekind-MacNeille completion (or, for short, completion) of (P, ≤). It is the smallest complete lattice containing a subset order-isomorphic with (P, ≤). The size of the completion may be exponential in |P|.

The completion can be computed in steps: first complete a small part of (P, ≤), then add another element, complete again, et cetera. Each such step increases the size of the completion only moderately and is, moreover, easy to perform. We shall demonstrate this by describing an elementary algorithm that, given a (finite) ordered set (P, ≤) and its completion (L, ≤), constructs the completion of any one-element extension of (P, ≤) in O(|L| · |P| · ω(P)) steps, where ω(P) denotes the width of (P, ≤). The special case where (P, ≤) is itself a complete lattice, and thus isomorphic to its completion, has been considered as the problem of minimal insertion of an element into a lattice, see e.g. Valtchev [4]. We obtain that the complexity of inserting an element into a lattice (L, ≤) and then forming its completion is bounded by O(|L|² · ω(L)).

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 295–302, 1998. © Springer-Verlag Berlin Heidelberg 1998
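For concreteness, the bound operators and the completion itself can be written down directly from the definitions. The following brute-force sketch (our own; it enumerates all subsets and is for illustration only) represents ≤ as a set of pairs:

```python
from itertools import chain, combinations

def upper_bounds(P, leq, A):
    """A↑ = {p in P | a <= p for all a in A}; leq is the set of pairs (a, b) with a <= b."""
    return frozenset(p for p in P if all((a, p) in leq for a in A))

def lower_bounds(P, leq, A):
    """A↓, the dual of upper_bounds."""
    return frozenset(p for p in P if all((p, a) in leq for a in A))

def completion(P, leq):
    """All cuts (A, B) with A↑ = B and B↓ = A.

    Every cut arises as (S↑↓, S↑) for some subset S, so enumerating all
    subsets suffices; this is exponential, matching the worst-case size
    of the completion noted above.
    """
    cuts = set()
    for S in chain.from_iterable(combinations(P, r) for r in range(len(P) + 1)):
        B = upper_bounds(P, leq, S)
        cuts.add((lower_bounds(P, leq, B), B))
    return cuts

# A two-element antichain {a, b}: its completion has 4 cuts (bottom, a, b, top).
antichain = completion(["a", "b"], {("a", "a"), ("b", "b")})
# A two-element chain a <= b is already a complete lattice: 2 cuts.
chain2 = completion(["a", "b"], {("a", "a"), ("b", "b"), ("a", "b")})
```

The antichain example shows the completion adding a new bottom and top, while a chain, being already a complete lattice, is left unchanged up to isomorphism.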
The elementary considerations on the incidence matrix of (P, ≤), which we use in the proof, do not utilize any of the order properties. Our result therefore generalizes to arbitrary incidence matrices. In the language of Formal Concept Analysis this may be interpreted as inserting a preconcept into a concept lattice.
1 Computing the Completion
Let us define a precut of an ordered set to be a pair (S, T), where S is an order filter and T is an order ideal such that S ⊆ T↓ and T ⊆ S↑. We consider the following construction problem:

Instance: A finite ordered set (P, ≤), its completion, and a precut (S, T) of (P, ≤).
Output: The completion of (P ∪ {x}, ≤), where x ∉ P is some new element with p ≤ x ⇔ p ∈ S and x ≤ p ⇔ p ∈ T for all p ∈ P.¹

(P, ≤) may be given by its incidence matrix (of size O(|P|²)). The completion may be represented as a list of cuts, that is, of pairs of subsets of P. With a simple case analysis we show how the cuts of (P ∪ {x}, ≤) can be obtained from those of (P, ≤).

Proposition 1. Each cut of (P ∪ {x}, ≤), except (S ∪ {x}, T ∪ {x}), is of the form (C, D), (C ∪ {x}, D ∩ T), or (C ∩ S, D ∪ {x}) for some cut (C, D) of (P, ≤). If (C, D) is a cut of (P, ≤), then

1. (C ∪ {x}, D ∩ T) is a cut of (P ∪ {x}, ≤) iff S ⊂ C = (D ∩ T)↓,
2. (C ∩ S, D ∪ {x}) is a cut of (P ∪ {x}, ≤) iff T ⊂ D = (C ∩ S)↑,
3. (C, D) is a cut of (P ∪ {x}, ≤) iff C ⊈ S and D ⊈ T.

For a proof of this result and of the following see the next section.

Proposition 2. The number of cuts of (P ∪ {x}, ≤) does not exceed twice the number of cuts of (P, ≤), plus two.

A natural embedding of the completion of (P, ≤) into that of (P ∪ {x}, ≤) is given by the next proposition:

Proposition 3. For each cut (C, D) of (P, ≤), exactly one of

(C, D),  (C ∪ {x}, D),  (C, D ∪ {x}),  (C ∪ {x}, D ∪ {x})

is a cut of (P ∪ {x}, ≤).

These cuts can be considered to be the "old" cuts, up to a modification. "New" cuts are obtained only from cuts (C, D) that satisfy 3) and simultaneously 1) or 2). An algorithm can now be given:

¹ For elements of P different from x, the order remains as it was.
Algorithm to construct the completion of (P ∪ {x}, ≤). Let L denote the set of all cuts of (P, ≤).

– Output (S ∪ {x}, T ∪ {x}).
– For each (C, D) ∈ L do:
  1. If C ⊆ S and D ⊈ T, then output (C, D ∪ {x}).
  2. If C ⊈ S and D ⊆ T, then output (C ∪ {x}, D).
  3. If C ⊈ S and D ⊈ T, then
     a) output (C, D);
     b) if C = (D ∩ T)↓, then output (C ∪ {x}, D ∩ T);
     c) if D = (C ∩ S)↑, then output (C ∩ S, D ∪ {x}).
– End.

It follows from the above propositions that this algorithm outputs every cut of (P ∪ {x}, ≤) exactly once. Each step of the algorithm involves operations on subsets of P. The most time-consuming one is the computation of (D ∩ T)↓ and of (C ∩ S)↑. Note that (D ∩ T)↓ = (min(D ∩ T))↓, where min(D ∩ T) is the set of minimal elements of D ∩ T, which can be computed in O(|P| · ω(P)) steps. Since |min(D ∩ T)| ≤ ω(P) and, moreover,

(min(D ∩ T))↓ = ⋂ {p↓ | p ∈ min(D ∩ T)},

we conclude that (D ∩ T)↓ can be obtained with an effort of O(|P| · ω(P)). The dual argument for (C ∩ S)↑ leads to the same result. So if L is the set of cuts of (P, ≤), then the algorithm can be completed in O(|L| · |P| · ω(P)) steps. Let us mention that computing an incidence matrix of the completion can be done in O(|L|²) additional steps, once the completion has been computed; see Proposition 6.
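The algorithm transcribes almost literally into code. This sketch is our own (sets of pairs for ≤, frozensets for cuts, and the new element represented by the string "x", assumed not to occur in P); it recomputes (D ∩ T)↓ and (C ∩ S)↑ naively instead of going through min(D ∩ T), so it does not achieve the O(|L| · |P| · ω(P)) bound, but it outputs exactly the cuts described above.

```python
def upper_bounds(P, leq, A):
    """A↑; repeated here so the sketch is self-contained."""
    return frozenset(p for p in P if all((a, p) in leq for a in A))

def lower_bounds(P, leq, A):
    """A↓, dually."""
    return frozenset(p for p in P if all((p, a) in leq for a in A))

def complete_extension(P, leq, cuts, S, T):
    """Cuts of (P ∪ {x}, <=), given the cuts of (P, <=) and a precut (S, T)."""
    S, T, x = frozenset(S), frozenset(T), frozenset({"x"})
    out = [(S | x, T | x)]
    for C, D in cuts:
        if C <= S and not D <= T:
            out.append((C, D | x))                       # step 1
        elif not C <= S and D <= T:
            out.append((C | x, D))                       # step 2
        elif not C <= S and not D <= T:
            out.append((C, D))                           # step 3a
            if C == lower_bounds(P, leq, D & T):
                out.append((C | x, D & T))               # step 3b
            if D == upper_bounds(P, leq, C & S):
                out.append((C & S, D | x))               # step 3c
    return out

# Insert x above a into the antichain {a, b}: the completion grows from 4 to 5 cuts.
P, leq = ["a", "b"], {("a", "a"), ("b", "b")}
cuts = [(frozenset(), frozenset("ab")), (frozenset("a"), frozenset("a")),
        (frozenset("b"), frozenset("b")), (frozenset("ab"), frozenset())]
new_cuts = complete_extension(P, leq, cuts, S={"a"}, T=set())
```

In the example, the precut S = {a}, T = ∅ makes x an upper neighbour of a, incomparable to b; the five resulting cuts are the bottom, a, b, the new element x (above a), and the top.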
2 Inserting a Preconcept
A triple (G, M, I) is called a formal context if G and M are sets and I ⊆ G × M is a relation between G and M. For each subset A ⊆ G let

A^I := {m ∈ M | (g, m) ∈ I for all g ∈ A}.

Dually, we define for B ⊆ M

B^I := {g ∈ G | (g, m) ∈ I for all m ∈ B}.

A formal concept of (G, M, I) is a pair (A, B) with A ⊆ G, B ⊆ M, A^I = B, and A = B^I. The formal concepts, ordered by

(A1, B1) ≤ (A2, B2) :⇔ A1 ⊆ A2 (⇔ B2 ⊆ B1),

form a complete lattice, the concept lattice of (G, M, I). Most of the arguments given below become rather obvious if one visualizes a formal context as a G × M cross table, where the crosses indicate the incidence
relation I. The concepts (we sometimes omit the word "formal") then correspond to maximal rectangles in such a table. Note that if A = B^I for some set B ⊆ M, then (A, A^I) automatically is a concept of (G, M, I). A pair (A, B) with A ⊆ G, B ⊆ M, A ⊆ B^I, and B ⊆ A^I is called a preconcept of (G, M, I). In order to change a preconcept into a concept, one may extend each of the sets G and M by one element with the appropriate incidences. So, as a straightforward generalization of the above, we consider the following construction problem:

Instance: A finite context (G, M, I), its concept lattice, and a preconcept (S, T) of (G, M, I).
Output: The concept lattice of (G ∪ {x}, M ∪ {x}, I^+), where x ∉ G ∪ M is a new element and I^+ := I ∪ ((S ∪ {x}) × ({x} ∪ T)).

The special case of Section 1 is obtained by letting G = M := P and (g, m) ∈ I :⇔ g ≤ m.
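The derivation operators and the preconcept test are one-liners over the cross table. The sketch below (function names are our own) represents I as a set of pairs:

```python
def attributes_of(G, M, I, A):
    """A^I = {m in M | (g, m) in I for all g in A}."""
    return frozenset(m for m in M if all((g, m) in I for g in A))

def objects_of(G, M, I, B):
    """B^I = {g in G | (g, m) in I for all m in B}."""
    return frozenset(g for g in G if all((g, m) in I for m in B))

def is_concept(G, M, I, A, B):
    """(A, B) is a formal concept iff A^I = B and B^I = A."""
    return attributes_of(G, M, I, A) == frozenset(B) and \
           objects_of(G, M, I, B) == frozenset(A)

def is_preconcept(G, M, I, A, B):
    """(A, B) is a preconcept iff A ⊆ B^I and B ⊆ A^I."""
    return frozenset(A) <= objects_of(G, M, I, B) and \
           frozenset(B) <= attributes_of(G, M, I, A)

# A 2x2 context: object 1 has attributes a and b, object 2 only a.
G, M = [1, 2], ["a", "b"]
I = {(1, "a"), (1, "b"), (2, "a")}
```

Here ({1}, {a}) is a preconcept but not a concept; extending the context by x with the incidences I^+ described above is precisely what turns such a preconcept into a concept.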
Proposition 4. Each formal concept of (G ∪ {x}, M ∪ {x}, I^+), with the exception of (S ∪ {x}, T ∪ {x}), is of the form

(C, D),  (C ∪ {x}, D ∩ T),  or  (C ∩ S, D ∪ {x})

for some formal concept (C, D) of (G, M, I). With the obvious modifications, the conditions given in Proposition 1 hold.

Proof. Each formal concept (A, B) of (G ∪ {x}, M ∪ {x}, I^+) belongs to one of the following cases:

1. x ∈ A, x ∈ B. Then A = S ∪ {x}, B = T ∪ {x}.
2. x ∈ A, x ∉ B. Then B ⊆ T and B^I = A \ {x}. Therefore (C, D) := (A \ {x}, (A \ {x})^I) is a formal concept of (G, M, I) satisfying

   S ⊂ C = (D ∩ T)^I.    (1)

   Conversely, if (C, D) is a formal concept of (G, M, I) satisfying (1), then (A, B) := (C ∪ {x}, D ∩ T) is a formal concept of (G ∪ {x}, M ∪ {x}, I^+).
3. x ∉ A, x ∈ B, dual to 2. Then (C, D) := ((B \ {x})^I, B \ {x}) is a concept of (G, M, I) with

   T ⊂ D = (C ∩ S)^I.    (2)

   Conversely, each formal concept (C, D) with (2) yields a formal concept (A, B) := (C ∩ S, D ∪ {x}) of (G ∪ {x}, M ∪ {x}, I^+).
4. x ∉ A, x ∉ B. Then (C, D) := (A, B) is a formal concept also of (G, M, I), satisfying

   C ⊈ S,  D ⊈ T.    (3)

   Conversely, each pair with (3) is also a concept of (G ∪ {x}, M ∪ {x}, I^+).

If both (C ∪ {x}, D ∩ T) and (C ∩ S, D ∪ {x}) happen to be concepts, then S ⊆ C and T ⊆ D, which implies C ∪ {x} = T^I and D ∪ {x} = S^I. Thus, apart from perhaps one exceptional case, these two possibilities exclude each other. From each concept of (G, M, I) we therefore obtain at most two concepts of (G ∪ {x}, M ∪ {x}, I^+), except in a single exceptional case, which may lead to three. On the other hand, each concept of (G ∪ {x}, M ∪ {x}, I^+), except (S ∪ {x}, T ∪ {x}), is obtained in this manner. This proves Proposition 2.

To see that Proposition 3 holds in the general case, note that each formal concept (C, D) of (G, M, I) belongs to one of the following cases:

1. C = S, D = T. Then (C ∪ {x}, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I^+).
2. C ⊆ S, T ⊂ D. Then D = C^I and condition (2) (from the proof of Proposition 4) is fulfilled. Thus (C, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I^+).
3. S ⊂ C, D ⊆ T. Then C = D^I and condition (1) is satisfied. Therefore (C ∪ {x}, D) is a concept of (G ∪ {x}, M ∪ {x}, I^+).
4. C ⊈ S, D ⊈ T. Then (C, D) is a concept of (G ∪ {x}, M ∪ {x}, I^+).

It is clear that each of the possible outcomes determines (C, D), and that therefore the possibilities are mutually exclusive. It is a routine matter to check that these formal concepts are ordered in the same way as those of (G, M, I). The construction thus yields a canonical order embedding of the small concept lattice into that of the enlarged context. Since all details have carried over to the more general case, we may summarize:

Proposition 5. The algorithm given in Section 1, when applied to the concept lattice L of (G, M, I), computes the concept lattice of (G ∪ {x}, M ∪ {x}, I^+).

The above complexity considerations apply as well, but it is helpful to introduce a parameter for contexts that corresponds to the width.
The incidence relation induces a quasiorder on G by g1 ≤ g2 :⇐⇒ {g1}^I ⊆ {g2}^I. Let ω(G) be the width of this quasiorder, and let ω(M) denote the width of the corresponding quasiorder on M. Let τ(G, M, I) := (ω(G) + ω(M)) · (|G| + |M|). Of course, τ(G, M, I) ≤ (|G| + |M|)². Provided the induced quasiorders on G and M are given as incidence matrices (these can be obtained in O(|G| · |M| · (|G| + |M|)) steps), we have a better bound on the complexity of the derivation operators: the set A^I can be computed from A with complexity O(τ(G, M, I)).
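The parameter τ can be computed directly from the context. The sketch below (our own helper names; the widths are found by brute force for clarity, whereas a Dilworth-style matching computation would be used for large contexts) builds both induced quasiorders and the bound.

```python
from itertools import combinations

def tau(G, M, I):
    """(ω(G) + ω(M)) · (|G| + |M|), with widths found by brute force."""
    rows = {g: frozenset(m for m in M if (g, m) in I) for g in G}
    cols = {m: frozenset(g for g in G if (g, m) in I) for m in M}

    def width(elems, sets):
        # Largest antichain of the quasiorder u ≤ v :⟺ sets[u] ⊆ sets[v].
        best = 1
        for r in range(2, len(elems) + 1):
            for sub in combinations(elems, r):
                if all(not (sets[u] <= sets[v]) and not (sets[v] <= sets[u])
                       for u, v in combinations(sub, 2)):
                    best = max(best, r)
        return best

    return (width(list(G), rows) + width(list(M), cols)) * (len(G) + len(M))

G, M = {1, 2, 3}, {'a', 'b', 'c'}
I = {(1, 'a'), (1, 'b'), (2, 'b'), (3, 'c')}
assert tau(G, M, I) <= (len(G) + len(M)) ** 2
```

Here ω(G) = ω(M) = 2, so τ = 24, well below the trivial bound of 36.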
B. Ganter and S.O. Kuznetsov
Computing A^I was the most time-consuming step in the algorithm of Section 1. Thus computing the new concept lattice can be performed with O(|L| · τ(G, M, I)) bit operations. Each concept of (G ∪ {x}, M ∪ {x}, I⁺), except (S ∪ {x}, T ∪ {x}), is generated by exactly one of the steps 1, 2, 3a, 3b, 3c of the algorithm, and precisely 3b) and 3c) lead to "new" concepts (other than (S ∪ {x}, T ∪ {x})). When performing the algorithm, we may note down how the concepts were obtained. These data can be used later to construct an incidence matrix of the new lattice:

Proposition 6. The order relation of the new lattice can be computed in additional O(|L|²) steps.

Proof. (S ∪ {x}, T ∪ {x}) is the largest concept containing x in its extent and the smallest concept containing x in its intent. In other words, (S ∪ {x}, T ∪ {x}) is greater than all concepts generated in steps 2) and 3b) and smaller than all concepts generated by steps 1) and 3c). It is incomparable to the other elements. So we may exclude this concept from further considerations. The order relation between the "old" concepts, i.e. between those generated in steps 1), 2), and 3a), is the same as before.

For the remaining case, we consider w.l.o.g. a concept (C ∪ {x}, D ∩ T) which was generated in step 3b) from a concept (C, D) of (G, M, I). Now (C ∪ {x}, D ∩ T) ≤ (E, F) if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})^I) ≥ (C, D) of (G, M, I). If x ∈ E, then similarly (E, F) ≤ (C ∪ {x}, D ∩ T) is true if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})^I) ≤ (C, D) of (G, M, I). Suppose x ∉ E. If (E, F) was obtained in steps 1) or 3a) of the algorithm, then (E, E^I) is a concept of (G, M, I) and (E, F) ≤ (C ∪ {x}, D ∩ T) is equivalent to (E, E^I) ≤ (C, D). If (E, F) was obtained in step 3c), then S^I ⊆ F, which implies D ∩ T ⊆ S^I ⊆ F. So in this case (E, F) ≤ (C ∪ {x}, D ∩ T) always holds.
Summarizing these facts, we obtain all comparabilities of a concept (C ∪ {x}, D ∩ T) of (G ∪ {x}, M ∪ {x}, I⁺) which was derived from a concept (C, D) of (G, M, I) in step 3b): concepts greater than (C ∪ {x}, D ∩ T) are those obtained in steps 2) or 3b) from concepts greater than (C, D); concepts smaller than (C ∪ {x}, D ∩ T) are those obtained in steps 1), 2), 3a) or 3b) from those smaller than (C, D), and all those obtained in step 3c). Thus the comparabilities of (C ∪ {x}, D ∩ T) can be obtained from those of (C, D) using only a bounded number of elementary operations in each case. Filling the corresponding row of the incidence matrix is of complexity O(|L|). The argument for concepts obtained by 3c) is analogous.

The generalized algorithm may be applied to the context (P, P, ≯) obtained from an arbitrary ordered set (P, ≤). The concept lattice is then the lattice of maximal antichains of (P, ≤) (see Wille [5]). Our result therefore relates to that of Jard, Jourdan and Rampon [2].
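Wille's correspondence can be checked directly on a small ordered set. The sketch below (our own brute-force helpers, not the generalized algorithm) builds the context (P, P, ≯) and compares its concepts with the maximal antichains of the order.

```python
from itertools import combinations

# A small ordered set (P, ≤) with strict order pairs a < c, b < c, b < d.
P = {'a', 'b', 'c', 'd'}
below = {('a', 'c'), ('b', 'c'), ('b', 'd')}

# Context (P, P, ≯): p is incident with q iff not p > q.
I = {(p, q) for p in P for q in P if (q, p) not in below}

def concepts(P, I):
    """All formal concepts of (P, P, I), by brute force."""
    out = set()
    for r in range(len(P) + 1):
        for A in combinations(sorted(P), r):
            B = frozenset(q for q in P if all((p, q) in I for p in A))
            A2 = frozenset(p for p in P if all((p, q) in I for q in B))
            out.add((A2, B))
    return out

def maximal_antichains(P, below):
    """Antichains of (P, ≤) not contained in any larger antichain."""
    comparable = below | {(q, p) for p, q in below}
    anti = [frozenset(s) for r in range(1, len(P) + 1)
            for s in combinations(sorted(P), r)
            if all((p, q) not in comparable for p, q in combinations(s, 2))]
    return {a for a in anti if not any(a < b for b in anti)}

# Each concept (A, B) corresponds to the maximal antichain A ∩ B.
assert {A & B for A, B in concepts(P, I)} == maximal_antichains(P, below)
```

For this four-element order there are three maximal antichains, {a, b}, {a, d}, and {c, d}, and exactly three formal concepts.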
Stepwise construction of the Dedekind-MacNeille Completion
3 A Non-Incremental Procedure May Be More Convenient
In practice, a strategy suggests itself that may be more time-consuming, but is nevertheless simpler than the algorithm presented in Section 1. Rather than pursuing an incremental algorithm, it may be easier to compute the lattice "from scratch" (i.e. from the formal context, or, in the special case, from the ordered set (P, ≤)) each time. For this task there is an algorithm that is remarkably simple (it can be programmed in a few lines) and at the same time not dramatically slower than the incremental approach: it computes the concept lattice L of a formal context (G, M, I) in O(|L| · |G|² · |M|) steps. Using the parameter introduced above, this can be improved to O(|L| · |G| · τ(G, M, I)). This algorithm generates the formal concepts inductively and does not require a list of concepts to be stored.

Let us exemplify the advantage of this by a simple calculation: a formal context (G, M, I) with |G| = |M| = 50 may have as many as 2^50 formal concepts in the extreme case. But even if the lattice is "small" and has only, say, 10^10 elements, it would require almost a hundred gigabytes of storage space. Generating such a lattice with the inductive algorithm appears to be time-consuming, but not out of reach; the storage space required would be less than one kilobyte. Moreover, this algorithm admits modifications that allow specific parts of the lattice to be searched. For details and proofs we refer to the literature (see [1]), but the algorithm itself is so simple that it can be recalled here. For simplicity assume G := {1, . . . , n}, and define for subsets A, B ⊆ G

[cl_performance](rel_isEnactmentOf)->[cl_GeneralisedProcess: x]\\.
From concepts to CG:
[ [SurgicalDeed](isMainlyCharacterisedBy)->[performance](isEnactmentOf)->[[Inspecting](playsClinicalRole)->[SurgicalRole]\](actsSpecificallyOn)->[ArbitraryBodyConstruct](hasArbitraryComponent)->[RenalPelvis] (hasArbitraryComponent)->[CalixOfKidney]\ (hasPhysicalMeans)->[Endoscope] (hasSpecificSubprocess)->[SurgicalApproaching](hasPhysicalMeans)->[[Route](passesThrough)->[SkinAsOrgan]\]\\\\].

Language annotations:
en: scopy; fr: scopie
en: surgical; fr: chirurgical
en: pyelo; fr: pyélo
en: calico; fr: calico
en: endoscopic; fr: endoscopique
en: by; fr: par
en: percutaneous route; fr: voie percutanée

Relational contraction (reld_byRouteOf):
[cl_GeneralisedProcess: x](rel_hasSpecificSubprocess)->[cl_SurgicalApproaching](rel_hasPhysicalMeans)->[cl_Route: y]\\.

Type contraction (cl_PercutaneousRoute):
[cl_Route: x](rel_passesThrough)->[cl_SkinAsOrgan]\.

Output of the generation tool for English and French:
en: 'endoscopic surgical pyelocalicoscopy by percutaneous route'
fr: 'pyélocalicoscopie chirurgicale endoscopique par voie percutanée'

Fig. 1. Operations applied during the generation task on the CG representing the French rubric "Pyélo-calicoscopie par voie percutanée"(4)

(4) Internally, a concept is prefixed by cl_, a simple relationship by rel_ and a composite relationship by reld_. For the sake of simplicity, these prefixes are omitted in the above CG.
Tuning Up Conceptual Graph Representation
3 Modeling Medical Language
Modeling medical language requires taking into account the variations in the description of medical terms, while supporting a uniform representation expressing the medical concepts characterized by attributes and values. However, the way information is modeled does not always correspond to the way information is expressed in natural language. Different compromises must therefore be made.

3.1 Annotation of Medical Information
In order to make the semantic content of the CORE model available and operational for NLP, the major task has consisted in annotating the model in the languages treated (mainly French, English, German, and Italian). These linguistic annotations are performed at two levels. First, conceptual entities are annotated with 'content words' that correspond mostly to the syntactic categories of nouns and adjectives. Single words, parts of words such as prefixes and suffixes, and multiword expressions are all permitted as annotations (see the language annotation part in Fig. 1). Second, annotations of relationships are more frequently achieved through 'function words'. The latter are conveyed either through grammatical structures, such as the adjectival structure or the noun complement (as in the examples pyelic calculus or calculus of the renal pelvis for the relationship rel_hasSpecificLocation), or directly through grammatical words such as prepositions (as in the example urethroplasty for perineal hypospadias, where the preposition 'for' denotes the relationship rel_hasSpecificGoal).

The annotation process, which only applies to named concepts, enables the creation of multilingual dictionaries that establish a direct bridge between concepts and language words. Every meaningful primitive concept belonging to the GALEN CORE model needs to be annotated. In addition, composite concepts, for which a definition is maintained at the conceptual level, may be annotated depending on the availability and conciseness of words in the language treated. The verbosity of medical language and the complexity of the modeling style can then be tuned by annotating composite concepts and composite relationships, respectively. For example, the concept cl_PercutaneousRoute is directly annotated with concise expressions, and the relationship reld_byRouteOf, especially created for NLP purposes, allows the nested concept cl_SurgicalApproaching to be masked during linguistic treatments (see Fig. 1).
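The concept-to-word bridge can be pictured as a small multilingual lexicon. In the sketch below, the dictionary entries are taken from the annotations in Fig. 1, but the helper names are ours and the morphotactics are drastically simplified (plain concatenation, no sandhi rules).

```python
# Multilingual annotations: one entry per concept, one morpheme per language.
lexicon = {
    'cl_RenalPelvis':   {'en': 'pyelo',  'fr': 'pyélo'},
    'cl_CalixOfKidney': {'en': 'calico', 'fr': 'calico'},
    'cl_Inspecting':    {'en': 'scopy',  'fr': 'scopie'},
}

def compound(concept_ids, lang):
    """Concatenate the morphemes annotating each concept of the description."""
    return ''.join(lexicon[c][lang] for c in concept_ids)

parts = ['cl_RenalPelvis', 'cl_CalixOfKidney', 'cl_Inspecting']
assert compound(parts, 'en') == 'pyelocalicoscopy'
assert compound(parts, 'fr') == 'pyélocalicoscopie'
```

The point of the illustration is that the compound never needs its own dictionary entry: it is assembled from the annotations of the primitive concepts in its semantic description.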
However, the combinatorial nature of the compositional approach, as well as the continual creation of new medical terms, make the annotation task unbounded and time-consuming. This has led to the implementation of procedural treatments at the linguistic level that map syntactic structures onto semantic representations, and that also include the management of the standard usage of prefixes and suffixes. The latter is especially important for surgical procedures, which are commonly expressed through compound word forms [12]. This means that the word pyelocalicoscopy is never described as an entry in the English dictionary (as the concept denoting a pyelocalicoscopy is not explicitly named in the model), but is automatically generated according to its corresponding semantic description. Automated morphosemantic treatment also implies that the linguistic module be aware of abstract constructions used at the conceptual level to handle the
A.-M. Rassinoux et al.
enumeration of constituents. In Fig. 1, both the abstract concept cl_ArbitraryBodyConstruct and the relationship rel_hasArbitraryComponent are used to clarify the different body parts on which the inspection occurs.

3.2 The Relevance of the Focus for NLP
A recognized property of the CG formalism, compared with other formalisms such as frame systems, is its ability to easily turn over a representation, i.e. to draw the same graph from a different head concept. For example, the conceptual graph [cl_Pain]->(rel_hasLocation)->[cl_Abdomen] can be rewritten as [cl_Abdomen]->(rel_isLocationOf)->[cl_Pain]. Even if these two graphs appear equivalent at first sight from the conceptual viewpoint, some subtleties can be pointed out when shifting to NL. The former graph is naturally translated into abdominal pain or pain in the abdomen, whereas the second one tends to be translated into painful abdomen. In medical practice, the interpretations underlying these two clinical terms differ significantly. The key issue here is that, for NLP purposes, the head concept of a graph (such as cl_Pain in the first graph and cl_Abdomen in the second one) is precisely the focus of the message to be communicated. The rest of the graph, therefore, is only there to characterize this main concept in more detail. Such an observation questions the focus-neutral property of the CG formalism insofar as linguistic tools add special significance to the head concept, or focus, of a graph. Indeed, the latter is interpreted as the central wording upon which the rest of the sentence is built.
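A minimal way to picture this inversion, with the graph reduced to a single relation triple and an inverse-relation table (the encoding and names are our own, purely illustrative):

```python
# Inverse pairs for relations, as assumed for this illustration.
INVERSE = {'rel_hasLocation': 'rel_isLocationOf',
           'rel_isLocationOf': 'rel_hasLocation'}

def refocus(triple):
    """Turn the graph over: make the former tail the new head concept."""
    head, rel, tail = triple
    return (tail, INVERSE[rel], head)

g = ('cl_Pain', 'rel_hasLocation', 'cl_Abdomen')
assert refocus(g) == ('cl_Abdomen', 'rel_isLocationOf', 'cl_Pain')
assert refocus(refocus(g)) == g   # conceptually, the two forms are equivalent

# ...yet the focus drives the wording chosen by the generator.
rendering = {'cl_Pain': 'abdominal pain', 'cl_Abdomen': 'painful abdomen'}
assert rendering[refocus(g)[0]] == 'painful abdomen'
```

The two asserts on `refocus` capture the conceptual equivalence; the last one captures why the equivalence breaks down linguistically: the head concept selects the phrasing.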
3.3 Contexts or Nested Conceptual Graphs
Contexts constitute a major topic of interest for both the linguistic and conceptual graph communities. Recent attempts to formally define the notion of context have come to light [13, 14], and the use of context for increasing the expressiveness of NL representation is clearly asserted [15]. For GALEN modeling, contexts appear as a major way of avoiding ambiguity when representing medical language. In particular, associating a specific role with a concept (for example, cl_SurgicalRole for identifying a surgical procedure, or cl_InfectiveRole or cl_AllergicRole for specifying a pathological role) allows for reasoning, and thus restricts the inference process to what is sensible to say about this concept. Such packaging of information is graphically represented through brackets delimiting the nested graph (see the CG displayed in Fig. 1, where two contexts are surrounded by bold brackets). Handling these contexts at the linguistic level mainly results in enclosing the scope of the nested graph in a single proposition, which can be expressed by a simple noun phrase or through a more complex sentence.
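Removing context boundaries, as done later in the generation process when no composite concept matches a nested graph, can be sketched as follows. Nested graphs are encoded here as (head, triples) pairs; the encoding and helper names are ours, not GALEN's.

```python
def merge_contexts(graph):
    """Lift every nested (bracketed) subgraph into the main graph."""
    out = []

    def unwrap(node):
        if isinstance(node, str):          # plain concept
            return node
        head, nested = node                # context: head concept + nested triples
        out.extend(merge_contexts(nested))
        return head

    for head, rel, tail in graph:
        out.append((unwrap(head), rel, unwrap(tail)))
    return out

# [[Inspecting]->(playsClinicalRole)->[SurgicalRole]] acting on a body part:
nested = ('cl_Inspecting',
          [('cl_Inspecting', 'rel_playsClinicalRole', 'cl_SurgicalRole')])
graph = [(nested, 'rel_actsSpecificallyOn', 'cl_RenalPelvis')]

assert set(merge_contexts(graph)) == {
    ('cl_Inspecting', 'rel_playsClinicalRole', 'cl_SurgicalRole'),
    ('cl_Inspecting', 'rel_actsSpecificallyOn', 'cl_RenalPelvis'),
}
```

After merging, the former context boundary is gone and the role assertion sits in the main graph, which mirrors how the surgical role of Fig. 1 is flattened during linguistic treatment.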
4 Formal Operations to Mediate between KR and NL
The previous section emphasized the gaps that exist between KR and medical language phrases. Indeed, as KR aims to describe, in an unambiguous way, the meaning carried by NL, such a structure is naturally more complete and accurate than what is simply expressed in NL. Mediating between these two means of communication implies setting up specific formal operations for readjusting KR and NL expressions.
4.1 Basic Operations on CGs
Balancing the degree of granularity, and thus the complexity, of a conceptual representation can be achieved in two different ways. On the one hand, a conceptual graph can be contracted in order to display information in a more concise manner. The contraction operation, which consists in replacing a connected portion of a graph by an explicit entity, is basically grounded in the projection operation (see [10], p. 99). On the other hand, a conceptual graph can be expanded in order to add, and thus make explicit, precise information on the semantic content of the graph. The expansion operation, which consists in replacing a composite entity by its full definition, is based on the join operation (see [10], p. 92). As the general guideline for the generation task in the GALEN project is to produce phrases 'as detailed as necessary but as concise as possible', the projection operation appears as the central means for coping with the complexity of KR.

In order to adapt this operation for particular usage, it has been necessary to provide, in addition to the projected graph, the hanging graph, the list of cut points, and finally the specializations performed during the projection. The hanging graph only embeds the remaining portions of the original graph that were connected to formal parameters (i.e. it is composed of one or two parts, depending on the number of formal parameters, x and y, present in the definition). All other hanging subgraphs are considered cut points. Each specialization performed in the original graph, whether it concerns a relationship or a concept, is recorded in a list. Each of these components is then checked, as explained in the following section, for its particular behavior in the three following situations: the setting-up of the focus, the contraction of conceptual definitions, and the management of contexts.
4.2 Refining Basic Operations for NLP
In the simplest case, the focus of a graph is defined as the head concept of the graph. However, in KR, this solution is frequently abandoned in favor of a representation that allows the focus, as well as other concepts mentioned at the same level, to be represented uniformly. This is the case for the example shown in Fig. 1, where the general concept cl_SurgicalDeed, representative of the type of concept to be modeled, is taken as the head concept of the graph. Specific relationships, such as rel_isMainlyCharacterisedBy and rel_isCharacterisedBy, are then used to identify the 'primary' procedure and a number of possibly optional additional procedures, respectively. Establishing the focus of the graph in this case consists in restoring the
primary procedure as the head concept of the graph, by projecting the corresponding abstraction shown in Fig. 1 onto the initial graph. The projected graph is then directly replaced by the specialization of the concept identified by the formal parameter x. In the example, the latter corresponds to the concept cl_Inspecting, which is a descendant of cl_GeneralisedProcess in the conceptual hierarchy. Moreover, this operation prohibits the presence of cut points, as well as any specialization on concepts other than the formal parameter x. The two hanging graphs (if not empty) are then appended to the new head of the graph.

In order to retain the level of detail of the conceptual representation, the type contraction does not allow specialization. Moreover, it normally prohibits the presence of cut points, which are signs of a resulting disconnected graph. However, for the generation task, such a rule can be bypassed. For example, let us consider the following graph: [cl_SurgicalExcising](rel_actsSpecificallyOn)->[cl_Adenoma](rel_hasLocativeAttribute)->[cl_ProstateGland]\\. Assuming that the composite concept cl_Adenomectomy exists in the model, the contraction of its corresponding definition would produce the cut point (rel_hasLocativeAttribute)->[cl_ProstateGland]. But, as the type contraction in the generation process is intended to ensure the conciseness of the produced NL expressions by translating a portion of the graph into precise words, the cut point can be joined to the hanging graph. This contributes to the generation of the valid NL expression prostatic adenomectomy from the above graph.

For the relational contraction, the projection operation permits specialization on concepts and relationships, as these relational definitions, specifically introduced for NLP purposes, are commonly expressed in the most general way possible. However, cut points are not permitted.
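The bypassed cut-point rule can be sketched as follows: the matched portion of the graph is replaced by the composite concept, and any dangling triple (a cut point) is reattached to it rather than rejected. The triple encoding and helper names are ours; the definition body of cl_Adenomectomy is assumed for the example.

```python
def contract(graph, matched, composite):
    """Replace the triples `matched` by `composite`, reattaching cut points."""
    inner = {n for h, _, t in matched for n in (h, t)}
    contracted, cut_points = [], []
    for h, r, t in graph:
        if (h, r, t) in matched:
            continue
        nh = composite if h in inner else h
        nt = composite if t in inner else t
        if (nh, nt) != (h, t):
            cut_points.append((h, r, t))   # dangling: joined to the hanging graph
        contracted.append((nh, r, nt))
    return contracted, cut_points

graph = [('cl_SurgicalExcising', 'rel_actsSpecificallyOn', 'cl_Adenoma'),
         ('cl_Adenoma', 'rel_hasLocativeAttribute', 'cl_ProstateGland')]
definition = [graph[0]]                    # assumed body of cl_Adenomectomy

contracted, cuts = contract(graph, definition, 'cl_Adenomectomy')
assert contracted == [('cl_Adenomectomy', 'rel_hasLocativeAttribute',
                       'cl_ProstateGland')]
assert cuts == [('cl_Adenoma', 'rel_hasLocativeAttribute', 'cl_ProstateGland')]
```

Rendering the contracted graph then yields a phrase of the shape "prostatic adenomectomy": the composite concept carries the concise word, and the reattached cut point contributes the modifier.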
Finally, contexts are treated first of all by looking for the definition of a composite concept, already described in the model, that can be successfully projected onto the nested graph. This is the case in Fig. 1 for the contextual information describing the percutaneous route, which is replaced by the concise concept cl_PercutaneousRoute. In all other cases, the boundaries of the context are simply removed and the nested graph is merged into the main graph, as performed for the surgical role in Fig. 1.
5 Conclusion
Our experience with managing the conceptual graph formalism for NLP has reinforced our belief that a logical, expressive, and tractable representation of medical concepts is a prerequisite for dealing with the intricacies of medical language. In spite of the effort undertaken to manage conceptual knowledge (which in this case is mainly modeled within the GALEN project) and linguistic knowledge (which is handled by linguistic tools) independently, it clearly appears that fine-tuning of both sources of knowledge is required for building concrete multilingual applications. Such an adjustment affects both the KR and the multilingual NLP tools, and is realized through declarative as well as procedural processes. On the one hand, it has been necessary to add declarative knowledge through the specification of both multilingual annotations and language-independent definitions. On the other hand, the procedural adjustment has been mainly achieved through the implementation of morphosemantic
treatment at the linguistic level, and through the refinement of conceptual operations for accommodating the modeling style at the KR level. All these compromises have proved adequate for smoothly and methodically counterbalancing the granularity and complexity of KR with the implicitness and expressiveness of NL.
References
1. McCray, A.T., Scherrer, J.-R., Safran, C., Chute, C.G. (eds.): Special Issue on Concepts, Knowledge, and Language in Health-Care Information Systems (IMIA). Meth Inform Med 34 (1995).
2. Spyns, P.: Natural Language Processing in Medicine: An Overview. Meth Inform Med 35(4/5) (1996) 285-301.
3. Cawsey, A.J., Webber, B.L., Jones, R.B.: Natural Language Generation in Health Care. JAMIA 4 (1997) 473-482.
4. Rector, A.L., Nowlan, W.A., Glowinski, A.: Goals for Concept Representation in the GALEN Project. In: Safran, C. (ed.): Proceedings of SCAMC'93. New York: McGraw-Hill, Inc. (1993) 414-418.
5. Rogers, J.E., Rector, A.L.: Terminological Systems: Bridging the Generation Gap. In: Masys, D.R. (ed.): Proceedings of the 1997 AMIA Annual Fall Symposium. Philadelphia: Hanley & Belfus, Inc. (1997) 610-614.
6. Wagner, J.C., Baud, R.H., Scherrer, J.-R.: Using the Conceptual Graphs Operations for Natural Language Generation in Medicine. In: Ellis, G. et al. (eds.): Proceedings of ICCS'95. Berlin: Springer-Verlag (1995) 115-128.
7. Wagner, J.C., Solomon, W.D., Michel, P.-A. et al.: Multilingual Natural Language Generation as Part of a Medical Terminology Server. In: Greenes, R.A., Peterson, H.E., Protti, D.J. (eds.): Proceedings of MEDINFO'95. North-Holland: HC&CC, Inc. (1995) 100-104.
8. Rassinoux, A.-M., Wagner, J.C., Lovis, C. et al.: Analysis of Medical Texts Based on a Sound Medical Model. In: Gardner, R.M. (ed.): Proceedings of SCAMC'95. Philadelphia: Hanley & Belfus, Inc. (1995) 27-31.
9. Rassinoux, A.-M., Baud, R.H., Scherrer, J.-R.: A Multilingual Analyser of Medical Texts. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 84-96.
10. Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison-Wesley Publishing Company (1984).
11. Rodrigues, J.-M., Trombert-Paviot, B., Baud, R. et al.: Galen-In-Use: An EU Project Applied to the Development of a New National Coding System for Surgical Procedures: NCAM. In: Pappas, C., Maglaveras, N., Scherrer, J.-R. (eds.): Proceedings of MIE'97. Amsterdam: IOS Press (1997) 897-901.
12. Norton, L.M., Pacak, M.G.: Morphosemantic Analysis of Compound Word Forms Denoting Surgical Procedures. Meth Inform Med 22(1) (1983) 29-36.
13. Sowa, J.F.: Peircean Foundations for a Theory of Context. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 41-64.
14. Mineau, G.W., Gerbé, O.: Contexts: A Formal Definition of Worlds of Assertions. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 80-94.
15. Dick, J.P.: Using Contexts to Represent Text. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 196-213.
Conceptual Graphs for Representing Business Processes in Corporate Memories

Olivier Gerbé(1), Rudolf K. Keller(2), and Guy W. Mineau(3)

(1) DMR Consulting Group Inc., 1200 McGill College, Montréal, Québec, Canada H3B 4G7. [email protected]
(2) Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7. [email protected]
(3) Université Laval, Québec, Québec, Canada G1K 7P4. [email protected]

Abstract. This paper presents the second part of a study conducted at DMR Consulting Group during the development of a corporate memory. It presents a comparison of four major formalisms for the representation of business processes: UML (Unified Modeling Language), PIF (Process Interchange Format), the WfMC (Workflow Management Coalition) framework, and conceptual graphs. This comparison shows that conceptual graphs are the best-suited formalism for representing business processes in the given context. Our ongoing implementation of the DMR corporate memory – used by several hundred DMR consultants around the world – is based on conceptual graphs, and preliminary experience indicates that this formalism indeed offers the flexibility required for representing the intricacies of business processes.
1 Introduction
Charnel Havens, EDS (Electronic Data Systems) Chief Knowledge Officer, presents in [5] the issues of knowledge management. With a huge portion of a company's worth residing in the knowledge of its employees, the time has come to get the most out of that valuable corporate resource – by applying management techniques. The challenge companies will have to meet is the memorization of knowledge, as well as its storage and its dissemination to employees throughout the organization. Knowledge may be capitalized on and managed in corporate memories in order to ensure standardization, consistency, and coherence. Knowledge management requires the acquisition, storage, evolution, and dissemination of knowledge acquired by the organization [14], and computer systems are certainly the only way to realize corporate memories [15] which meet these objectives.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 401-415, 1998. © Springer-Verlag Berlin Heidelberg 1998
DMR Consulting Group Inc. has initiated the IT Macroscope project [7], a research project that aims to develop methodologies allowing organizations: i) to use IT (Information Technology) for increasing competitiveness and innovation in both the service and product sectors; ii) to organize and manage IT investments; iii) to implement information system solutions both practically and effectively; and iv) to ensure that IT investments are profitable. In parallel with methodology development, tools were designed for building and maintaining these methodologies, designing training courses, and managing and promoting IT Macroscope products. These tools implement the concept of corporate memory. This corporate memory, called the Method Repository, plays a fundamental role. It captures, stores [3], retrieves, and disseminates [4] throughout the organization all the consulting and software engineering processes and the corresponding knowledge produced by the experts in the IT domain.

During the early stages of the development of the Method Repository, the choice of a knowledge representation formalism was identified as a key issue. That led us to define specific requirements for corporate memories, to identify suitable knowledge representation formalisms, and to compare them in order to choose the most appropriate one. We identified two main aspects: knowledge structure, and dynamics – business processes, together with activities, events, and participants. The first part of the study [2] led us to adopt the conceptual graph formalism for structural knowledge. Uniformity of the formalism used in the Method Repository was one issue, but not the all-decisive one, in adopting conceptual graphs for the dynamic aspect, too. Rather, our decision is based on the comparison framework presented in this paper.
In our comparison, we studied four major business modeling formalisms or exchange formats – UML (Unified Modeling Language), PIF (Process Interchange Format), the WfMC (Workflow Management Coalition) framework, and conceptual graphs – against our specific requirements. The choice of these four formalisms for our study was motivated by the requirement of building our solution on existing or de facto standards. Our study demonstrates that conceptual graphs are particularly well suited for representing business processes in corporate memories since they support: (i) shared activities, and (ii) management of instances.

The paper is organized as follows. Section 2 introduces the basic notions of business processes as used in this paper. Section 3 defines specific requirements for the representation of business processes in corporate memories. Section 4 compares the four formalisms. Finally, Section 5 reports on the ongoing implementation of the Method Repository and discusses future work.
2 Basic Notions
In this section, we present basic notions relevant to the representation of business processes. The main notions for representing the dynamics of an enterprise are processes, activities, participants (inputs, outputs, and agents), events (preconditions and postconditions), and the notions of sequence and parallelism of activity
executions. These notions build upon some commonly used definitions in enterprise modeling, as summarized in the following paragraph.

A process is seen as a set of activities. An activity is a transformation of input entities into output entities by agents. An event marks the end of an activity; the event corresponds to the fulfilment of both the activity's postcondition and the precondition of its successor activity. An agent is a human or material resource that enables an activity. An input or output is a resource that is consumed or produced by an activity. The notions of sequence and parallelism define the possible order of activity executions. Sequence specifies an order of executions, and parallelism specifies independence between executions.

Figure 1 presents the notions of activity, agent, input, and output. Activities are represented by circles, and participants of activities are represented by rectangles linked to their respective activities by arcs; the direction of an arc defines the kind of participation: input, output, or agent. There is no notational distinction between input and agent. Note that this simple process representation serves exclusively for introducing terminology and for illustrating our requirements.
[Figure: the activity Cut Window Panes, with input Fabrication Order, agent Glazier, and output Window Panes]
Fig. 1. Activity with input, output and agent.
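These notions can be captured by a small data model. The sketch below is our own illustration (class and field names are ours, taken from none of the compared formalisms); it encodes the activity of Fig. 1 together with its three kinds of participants.

```python
from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    inputs: list = field(default_factory=list)    # consumed resources
    outputs: list = field(default_factory=list)   # produced resources
    agents: list = field(default_factory=list)    # enabling human/material resources

@dataclass
class Process:
    name: str
    activities: list = field(default_factory=list)

cut = Activity('Cut Window Panes',
               inputs=['Fabrication Order'],
               outputs=['Window Panes'],
               agents=['Glazier'])
fabrication = Process('Window Fabrication', activities=[cut])

assert 'Glazier' in cut.agents
assert cut in fabrication.activities
```

A process is simply a set of such activities; sequence and parallelism constraints would be layered on top of this structure.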
Figure 2 illustrates the notions of sequence and parallelism by a process composed of five activities. The activity Write Production Order is the first activity of the process, the activities Build Frame and Cut Panes are executed in parallel, Assemble Window follows the activities Build Frame and Cut Panes, and finally Deliver Window terminates the process. Note that we only have to consider the representation of parallel activities or sequential activities; all other cases can be represented by these two cases by splitting activities into sub-activities.
3 Requirements
This section introduces the two main requirements underlying our study: the representation of processes sharing activities, and the management of the instances that are involved in a process. Obviously, there exist many other requirements for representing a business process in a corporate memory. Since these other requirements are mostly met by all the formalisms studied, we decided to focus on the two main requirements mentioned above.
[Figure: the activity Write Production Order, followed by Build Frame and Cut Panes in parallel, followed by Assemble Window, followed by Deliver Window]
Fig. 2. A process as a set of activities.
3.1 Sharing Activities
Let us consider the case of two processes that share the same activity. Figure 3 illustrates this setting.
[Figure: Design Software and Design Hardware both lead to the shared activity Validate Specifications, which is followed by Write Software Code and Build Hardware, respectively]
Fig. 3. Sharing Activities.
The example depicted in Fig. 3 deals with the fabrication of a product made out of two components: a software component and a hardware component, as in a cellular phone, a microwave oven, or a computer. The first process describes the development of the software component. It is composed of the activities Design Software, Validate Specifications, and Write Software Code. The second process describes the development of the hardware component. It is composed of the activities Design Hardware, Validate Specifications, and Build Hardware. The activity Validate Specifications is a synchronization activity and is shared by the two processes. The problem in this example is the representation and identification of the two processes. Each process is composed of three activities, with one of them being in common. Therefore the formalism must offer reuse of parts of process definitions or support some kind of shared-variable mechanism. To support the representation of business processes in a corporate memory, a formalism must offer features to represent processes sharing the same activities.
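Sharing can be expressed by letting two process objects reference the same activity object, rather than each holding a copy. The sketch below is our own encoding, not taken from any of the four formalisms; the point is that the shared activity exists once.

```python
class Activity:
    def __init__(self, name):
        self.name = name

class Process:
    def __init__(self, name, activities):
        self.name = name
        self.activities = activities      # ordered list of Activity references

# The synchronization activity exists once and is referenced twice.
validate = Activity('Validate Specifications')
software = Process('Software Development',
                   [Activity('Design Software'), validate,
                    Activity('Write Software Code')])
hardware = Process('Hardware Development',
                   [Activity('Design Hardware'), validate,
                    Activity('Build Hardware')])

shared = ({id(a) for a in software.activities}
          & {id(a) for a in hardware.activities})
assert shared == {id(validate)}           # exactly one shared activity
```

A formalism that can only inline activity descriptions inside each process definition cannot express this identity, which is exactly the reuse requirement stated above.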
Conceptual Graphs for Representing Business Processes
3.2 Instance Management
To illustrate the problem of instance management, consider the example of a window manufacturer that has a special department for building non-standard-size windows. Figure 4 presents the window fabrication process.
(Figure: Write Fabrication Order takes a Client Order and produces a Fabrication Order, which feeds Build Frame and Cut Window Panes; the resulting Window Frame and Window Panes feed Assemble Window, which produces a Window.)
Fig. 4. The Window Problem.
A fabrication order is established from a client order. A fabrication order defines the size and material of the frame and the size and thickness of the glasses to insert into the frame. The fabrication order is sent to the frame builder and glass cutter teams, which execute the order. The frame and glasses are then transmitted to the window assembly team, which inserts the glasses into the frame. The problem for this team is to insert the right glasses (size and thickness) into the right frames (size and material). Some frames take more time to build than others, so the frames may be finished in a different order than the glasses. The assembly team can solve this problem by assembling frames and glasses in conformity with the fabrication order. At the notational level, this requires the possibility of specifying instances of input and output participants. To support the representation of business processes in a corporate memory, the formalism must offer features to represent and manage the related instances needed by different processes.
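The assembly team's matching step can be sketched as follows (an illustration of ours; the field names and order identifiers are hypothetical): frames and panes are paired by the fabrication order instance that produced them, regardless of completion order.

```python
# Frames finished out of order; panes finished in order.
frames = [{"order": "FO-2", "size": "120x90"},
          {"order": "FO-1", "size": "60x40"}]
panes  = [{"order": "FO-1", "thickness": 4},
          {"order": "FO-2", "thickness": 6}]

def match_by_order(frames, panes):
    """Pair each frame with the panes cut for the same fabrication order."""
    panes_by_order = {p["order"]: p for p in panes}
    return [(f, panes_by_order[f["order"]]) for f in frames]

for frame, pane in match_by_order(frames, panes):
    assert frame["order"] == pane["order"]
```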
4 Formalisms
This section presents the four business process modeling formalisms of our study. These formalisms offer representation features for describing, exchanging, and executing business processes. Each of the studied formalisms supports the representation of the basic notions introduced in Section 2, so we concentrate on the specific requirements discussed above. Against these requirements we have evaluated the four formalisms UML [1] (Unified Modeling Language), PIF [8] (Process Interchange Format), the WfMC framework [6] (Workflow Management Coalition), and conceptual graphs. Other formalisms, Petri nets [16] and CML [11,12], were considered but not included in this study because they are either not well suited to representing business processes or not formal enough.

4.1 Unified Modeling Language
In [2] we presented how to represent static structure in UML [1] (Unified Modeling Language). Let us recall that UML, developed by Grady Booch, Jim Rumbaugh, and Ivar Jacobson from the unification of the Booch method, OMT, and OOSE, is considered a de facto standard. UML provides several kinds of diagrams that show different aspects of the dynamics of processes. Use Case diagrams show interrelations between functions provided by a system and the external agents that use these functions. Sequence diagrams and Collaboration diagrams present interactions between objects by specifying the messages exchanged among them. State diagrams describe the behavior of the objects of a class, or the behavior of a method in response to a request. A state diagram shows the sequence of states an object may have during its lifetime, together with the requests responsible for state transitions and the responses and actions of objects corresponding to requests. Activity diagrams have recently been introduced in UML. They are used to describe processes that involve several types of objects. An activity diagram is a special case of state diagram where states represent the completion of activities. In the context of corporate memory, activity diagrams are the most relevant, and we present their main concepts in what follows. In UML, there are two types of execution of activities: execution of an activity that represents an atomic action, called an ActionState, and execution of a non-atomic sequence of actions, called an ActivityState. Exchanges of objects among actions are modeled by object flows, called ObjectFlowStates. ObjectFlowStates implement the notions of inputs and outputs. Agents are represented by Swimlanes in activity diagrams. However, it is also possible to define an agent as a participant in an activity and to explicitly establish a relationship between agent and activity. Figure 5 shows how to model the cut window pane activity with participants.
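A rough object model of the activity-diagram notions just named (ActionState, ObjectFlowState, Swimlane) can be sketched as follows; the field names are ours, not UML's metamodel:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectFlowState:
    """An object exchanged between actions (an input or output)."""
    name: str

@dataclass
class ActionState:
    """An atomic action, placed in the swimlane of its responsible agent."""
    name: str
    swimlane: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

cut = ActionState("Cut Window Panes", swimlane="glazier",
                  inputs=[ObjectFlowState("fabrication order")],
                  outputs=[ObjectFlowState("panes")])
```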
(Figure: the Cut Window Panes activity with the participants :fabrication order and :glazier, producing :panes.)
Fig. 5. UML - The cut window pane Activity.

Activity diagrams show possible scenarios; this means that activity diagrams show objects instead of classes. Dashed arrows link inputs and outputs to activities. Processes may be represented using activity diagrams in UML, and Fig. 6 shows an example of the window building process. Solid arrows between activities represent the control flow.
(Figure: Write Fabrication Order, followed by Build Frame and Cut Window Panes in parallel, then Assemble Window, then Deliver Window.)
Fig. 6. UML - The whole Process.
Sharing Activities As detailed in [1], UML does not support adequate representation features for sharing activities. However, activity diagrams are new in the definition of the language, and not all cases have yet been covered.

Instance Management In contrast to the representation of structure [2], process representation is done at the instance level. Activity diagrams involve objects, not classes, and therefore it is possible to represent the window problem by using the object fabrication order, which specifies frame and panes. Figure 7 shows a representation for the window problem.

(Figure: the objects :client order, :fabrication order, :frame, :panes, and :window flow between the activities Write Fabrication Order, Build Frame, Cut Window Panes, and Assemble Window; the :fabrication order specifies both :frame and :panes.)
Fig. 7. UML - The Window Problem.
4.2 Process Interchange Format (PIF)
The PIF (Process Interchange Format) workgroup, composed of representatives from companies and universities, developed a format for exchanging process specifications [8]. A PIF process description is a set of frame definitions; each frame specifies an instance of one class of the PIF metamodel. Figure 8 shows the PIF metamodel. It is composed of a generic class ENTITY, from which all other classes are derived, and of four core classes: ACTIVITY, OBJECT, TIMEPOINT, and RELATION.

(Figure: the core classes Activity, Object, and Time Point linked by the relations creates, modifies, uses, performs, successor, before, and status, with Decision and Agent as subclasses and if/then and begin/end links.)
Fig. 8. PIF - Metamodel.

Subclasses of ACTIVITY and OBJECT are respectively DECISION and AGENT. Class RELATION has seven subclasses: the subclasses CREATES, MODIFIES, PERFORMS, and USES define relationships between ACTIVITY and OBJECT; the subclass BEFORE defines a predecessor relationship between two points in time; the subclass SUCCESSOR defines a successor relationship between two activities; and ACTIVITY-STATUS defines the status of an activity at a point in time. Figure 9 shows the representation of an activity using the PIF format.

(define-frame ACT1 :own-slots ((Instance-Of ACTIVITY) (Name "Cut Window Panes") (End END-ACT1)))
(define-frame PRFRMS1 :own-slots ((Instance-Of PERFORMS) (Actor AGT1) (Activity ACT1)))
(define-frame OUTPUT1 :own-slots ((Instance-Of OBJECT) (Name ”panes")))
(define-frame END-ACT1 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame INPUT1 :own-slots ((Instance-Of OBJECT) (Name ”Fabrication Order")))
(define-frame CRTS1 :own-slots ((Instance-Of CREATES) (Activity ACT1) (Object OUTPUT1)))
(define-frame AGT1 :own-slots ((Instance-Of AGENT) (Name ”Glazier")))
(define-frame USES1 :own-slots ((Instance-Of USES) (Activity ACT1) (Object INPUT1)))
Fig. 9. PIF - Activity with participants.
The frame ACT1 defines the cut window panes activity as an instance of ACTIVITY with a name and a relation to END-ACT1. END-ACT1 represents the end of the activity and is defined as a point in time. Then come the definitions of the three participants; each participant is defined in two parts: a definition of the participant itself and a definition of the relationship between the activity and the participant. With the PIF process interchange format and framework, there is no explicit definition of a process: a process is the set of defined activities. The example shown in Fig. 10 shows how two activities ACT1 and ACT2 are linked by a BEFORE relationship.

(define-frame ACT1 :own-slots ((Instance-Of ACTIVITY) (Name "Write Fabrication Order") (End END-ACT1)))
(define-frame ACT2 :own-slots ((Instance-Of ACTIVITY) (Name ”Build Frame") (End END-ACT2)))
(define-frame END-ACT1 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame END-ACT2 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame ACT1-ACT2 :own-slots ((Instance-Of BEFORE) (Preceding-Timepoint END-ACT1) (Succeeding-Timepoint END-ACT2)))
Fig. 10. PIF - Process.
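The frame syntax of Figs. 9 and 10 is regular enough to be generated mechanically. The following sketch (the helper function is ours, not part of PIF) emits a define-frame expression from keyword slots:

```python
def define_frame(frame_id, **slots):
    """Render a PIF-style frame definition; underscores in slot names
    become hyphens (e.g. Instance_Of -> Instance-Of)."""
    body = " ".join(f"({k.replace('_', '-')} {v})" for k, v in slots.items())
    return f"(define-frame {frame_id} :own-slots ({body}))"

print(define_frame("ACT1", Instance_Of="ACTIVITY",
                   Name='"Cut Window Panes"', End="END-ACT1"))
```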
Sharing Activities The PIF format supports the representation of several sequences of activities. It is possible to define more than one sequence of activities in one file by a set of frames that are instances of BEFORE. However, it is not possible to explicitly identify several processes.

Instance Management With the PIF format, activities and the participants involved in them are described at the type level. Therefore, it is not possible to identify instances in PIF activity definitions.

4.3 Workflow Reference Model
The Workflow Management Coalition (WfMC) defines in the Workflow Reference Model [6] a basic metamodel that supports process definition. The Workflow Reference Model defines six basic object types to represent relatively simple processes. These types are: Workflow Type Definition, Activity, Role, Transition Conditions, Workflow Relevant Data, and Invoked Application. Figure 11 shows the basic process definition metamodel.

(Figure: a Workflow Type Definition consists of Activities; an Activity has Transition Conditions, may have Roles, may refer to Invoked Applications, and uses Workflow Relevant Data, to which Invoked Applications may also refer.)
Fig. 11. WfMC - Basic Process Definition MetaModel.

The Workflow Management Coalition has also published a Process Definition Interchange in version 1.0 beta [17] that describes a common interface for the exchange of process definitions between workflow engines. Figure 12 presents the definition of the activity Cut Window Panes using this exchange format. Participants (inputs or agents) in an activity are defined explicitly. Data that are created or modified by an activity are defined in the postconditions of the activity or as output parameters of applications invoked during activity execution. In the WfMC Process Definition Interchange format, a process is defined as a list of activities and a list of transitions that specify in which order activities are executed. In Fig. 13 of the following section, examples of activity definitions in the WfMC interchange format are shown.

ACTIVITY Cut_Window_Panes
  PARTICIPANT Glazier, Fabrication_Order
  POST CONDITION Window_Panes exists
END ACTIVITY

PARTICIPANT Glazier
  TYPE HUMAN
END PARTICIPANT

DATA Fabrication_Order
  TYPE COMPLEX DATA
END DATA

DATA Window_Panes
  TYPE REFERENCE
END DATA

Fig. 12. WfMC - Activity with Participants.
Sharing Activities Processes are defined using the keywords WORKFLOW and END_WORKFLOW, which respectively begin and end a process definition. In a process definition, it is possible to use activities or participants that have been defined in another process definition. In the example shown in Fig. 13, two processes are defined with a common activity. The common activity is defined in process 1 and reused in process 2.

Instance Management Process definitions are given at the type level. However, the conditions that fire an activity, or that are realized at the end of an activity, are expressed using Boolean expressions with variables. In theory, it is possible to represent the window problem, but version 1.0 beta of the Process Definition Interchange [17] gives few indications of how to realize it.
WORKFLOW PROCESS1
  ACTIVITY Design_Software ... END_ACTIVITY
  ACTIVITY Validate_Specifications ... END_ACTIVITY
  ACTIVITY Write_Software_Code ... END_ACTIVITY
  TRANSITION FROM Design_Software TO Validate_Specifications END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Write_Software_Code END_TRANSITION
END_WORKFLOW

WORKFLOW PROCESS2
  ACTIVITY Design_Hardware ... END_ACTIVITY
  ACTIVITY Build_Hardware ... END_ACTIVITY
  TRANSITION FROM Design_Hardware TO Validate_Specifications END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Build_Hardware END_TRANSITION
END_WORKFLOW

Fig. 13. WfMC - Processes Sharing Activities.
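Detecting which activities two such process definitions have in common is a simple set intersection once the definitions are parsed. A sketch (the parsed representation is our assumption, not WfMC's):

```python
# Activities referenced by each workflow, as they would appear after
# parsing the WORKFLOW definitions and their transitions.
processes = {
    "PROCESS1": ["Design_Software", "Validate_Specifications",
                 "Write_Software_Code"],
    "PROCESS2": ["Design_Hardware", "Validate_Specifications",
                 "Build_Hardware"],
}

# An activity shared by both processes appears in both reference sets.
shared = set(processes["PROCESS1"]) & set(processes["PROCESS2"])
# shared == {"Validate_Specifications"}
```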
4.4 Conceptual Graphs and Processes
In conceptual graph theory, there is no standard way to represent processes. Processes have not been extensively studied, and only a few works relate to their representation. John Sowa [13] presents some directions for representing processes. Dickson Lukose [9] and Guy Mineau [10] have proposed executable conceptual structures. We present below a possible metamodel to represent processes that fulfills the corporate memory requirements expressed in Section 3. The metamodel (Fig. 14) is composed of three basic concepts: ACTIVITY, PROCESS, and EVENT. An activity
TYPE ACTIVITY(x) IS TYPE EVENT(x) IS [T:*x][T:*x](INPUT)[EVENT:*ev1] (OUTPUT) [TYPE] a2 = [ACTOR:?] = {[ACTOR],[LIST_OWNER]} and a3 = [ACTOR:?] = {[ACTOR],[LIST_OWNER]}
2a. Determine which actors ai are involved in the specification of permitted actions. This query should be directed toward the bottom of the context lattice, as this context contains all intention graphs (which in turn include the desired actor concepts):

s4 = δI*(C4*, q4) = {i3}
where q4 = [ACTOR:?] [SPECIFY] (rslt) -> [PERM_ACTION]
a4 = [ACTOR:?] = {[LIST_OWNER]}
2b. Using a4, determine, for each type of information tool gj (see 1a), its corresponding si and actors ai (see 1b), and which actors ai′ currently illegitimately control its specification process.

g1: [MAILING_LIST]
a2′ = a2 − (a2 ∩ a4) = {[ACTOR]}

g2: [PRIVATE_MAILING_LIST]
a3′ = a3 − (a3 ∩ a4) = {[ACTOR]}

2c. For each tool identified by the gj having illegitimate controlling actors ai′, define the illegitimate composition norms ck′ = < il, gj > by selecting from the si from 1b those il which contain ai′:

c1′ = < i1, g1 >
c2′ = < i1, g2 >
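The set computations of steps 2b–2c can be sketched directly, with Python sets standing in for the sets of actor concepts (an illustration of ours, not part of the formalism):

```python
# Actors currently controlling each tool's specification (step 1b),
# and actors actually permitted to specify (step 2a).
a2 = {"ACTOR", "LIST_OWNER"}   # controls [MAILING_LIST] spec
a3 = {"ACTOR", "LIST_OWNER"}   # controls [PRIVATE_MAILING_LIST] spec
a4 = {"LIST_OWNER"}            # permitted specifiers

# Step 2b: illegitimate controllers = controllers minus permitted ones.
a2_illegit = a2 - (a2 & a4)    # {"ACTOR"}
a3_illegit = a3 - (a3 & a4)    # {"ACTOR"}
```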
3. In the previous two steps we identified the illegitimate norms. Now we prepare the stage for the specification discourse in which these norms are to be corrected. A composition norm does not just need to be seen as a context: it is itself a knowledge definition, which needs to be covered by the extension graph of at least one other composition norm, which in that case acts as a meta-norm. In order to correct the illegitimate norms we need to (a) identify which actors are permitted to do this and (b) determine what items should be on their specification agenda. This step falls outside the scope of this paper but is presented here to provide the reader with the whole picture. A forthcoming paper will elaborate on meta-norms and contexts of contexts.
Handling Specification Knowledge Evolution Using Context Lattices
3a. For each illegitimate composition norm ck′, select the actors ai from the permitted (meta) composition norms cm which allow that ck′ to be modified:1

cm = < im, gm >
where im = [ACTOR:?] [MODIFY] -> (rslt) -> [PERM_COMP: #]
and gm = ck′
3b. For each of these actors ai, build an agenda Ai. Such an agenda could consist of (1) all illegitimate norms ck′ that each actor is permitted to respecify and (2) contextual information from the most specific context in which these norms are represented, or from other contexts related to this context in some significant way. The exact contextual graphs to be included in these agendas are determined by the way in which the specification discourse is being supported, which is not covered in this paper and needs considerable future research. However, we would like to give some idea of the direction we are investigating. In our example, we identified the illegitimate (derived) composition norm 'any actor is permitted to control (i.e. initiate, execute, and evaluate) the specification of a private mailing list' (< i1, g2 >). From its formal context C2* it also appears that a list owner, on the other hand, is permitted to at least execute the modification of this type (< i2, g2 >). If another specification constraint were to say that one permitted composition for each control process category per knowledge definition suffices, then only the initiation and evaluation of the modification would remain to be defined (as the execution of the modification of the private mailing list type is already covered by the norm referring to the list owner). Thus, the specification agendas Ai for the actors ai identified in 3a could include: 'you can be involved in the respecification of the initiation and the evaluation of the modification of the type private mailing list', as well as 'there is also actor-such-and-such (e.g. the list owner) who has the same (or more general/specific) specification rights, with whom you can negotiate or whom you can ask for advice.'
Of course, in a well-supported discourse these kinds of agendas would be translated into statements and queries much more readable to their human interpreters, but such issues are of a linguistic nature and are not dealt with here.
5 Conclusions
Rapid change in work practices and supporting information technology is becoming an ever more important aspect of life in many distributed professional communities. One of their critical success factors is therefore the continuous involvement of users in the (re)specification of their network information system. In this paper, the conceptual graph-based approach for the navigation of context lattices developed by Mineau and Gerbé [1997] was used to structure the handling of user-driven specification knowledge evolution. In virtual professional communities, the various kinds of norms and the knowledge definitions to which they apply, as well as the specification constraints that apply to these norms, are prone to change. The formal context lattice approach can be used to guarantee that specification processes result in legitimate knowledge definitions, which are both meaningful and acceptable to the user community. Extracting the context to which a query is applied provides simpler graphs that can more easily be understood by the user when he interacts with the CG base. It also provides a hierarchical path that guides the matching process between CGs, which would otherwise not be there to guide the search. Even though the computational cost of matching graphs would be the same, overall performance would be improved by these guidelines, as the search is more constrained. But the most interesting aspect of using a context lattice is that it provides a structuring of different contexts that helps conceptualize (and possibly visualize) how different contexts ('micro-worlds') relate to one another, adding to the conceptualization power of conceptual graphs. In future research, we plan to further formalize and standardize the still quite conceptual approach presented here, and also to look into issues regarding its implementation.

1 For lack of space, we have not included such composition norms in our example, but since they are also represented in a context lattice, the same mechanisms apply. The only difference is that the extension graphs are themselves contexts (as defined in Sect. 3).

A. de Moor and G. Mineau
References

1. A. De Moor. Applying conceptual graph theory to the user-driven specification of network information systems. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3–8, 1997, pages 536–550. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
2. B.R. Gaines. Dimensions of electronic journals. In T.M. Harrison and T. Stephen, editors, Computer Networking and Scholarly Communication in the Twenty-First Century, pages 315–339. State University of New York Press, 1996.
3. L.J. Arthur. Rapid Evolutionary Development - Requirements, Prototyping & Software Creation. John Wiley & Sons, 1992.
4. T.W. Malone, K.-Y. Lai, and C. Fry. Experiments with Oval: A radically tailorable tool for cooperative work. ACM Transactions on Information Systems, 13(2):177–205, 1995.
5. P. Holt. User-centred design and writing tools: Designing with writers, not for writers. Intelligent Tutoring Media, 3(2/3):53–63, 1992.
6. G. Fitzpatrick and J. Welsh. Process support: Inflexible imposition or chaotic composition? Interacting with Computers, 7(2):167–180, 1995.
7. L.J. Arthur. Quantum improvements in software system quality. Communications of the ACM, 40(6):46–52, 1997.
8. I. Hawryszkiewycz. A framework for strategic planning for communications support. In Proceedings of the Inaugural Conference of Informatics in Multinational Enterprises, Washington, October 1997.
9. F. Dignum, J. Dietz, E. Verharen, and H. Weigand, editors. Proceedings of the First International Workshop on Communication Modeling 'Communication Modeling - The Language/Action Perspective', Tilburg, The Netherlands, July 1-2, 1996. Springer eWiC series, 1996. http://www.springer.co.uk/eWiC/Workshops/CM96.html.
10. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3–8, 1997, pages 80–94. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
11. F. Dignum and H. Weigand. Communication and deontic logic. In R. Wieringa and R. Feenstra, editors, Working Papers of the IS-CORE Workshop on Information Systems - Correctness and Reusability, Amsterdam, 26-30 September, 1994, pages 401–415, September 1994.
12. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Using CG Formal Contexts to Support Business System Interoperation

Hung Wing¹, Robert M. Colomb¹, and Guy Mineau²

¹ CRC for Distributed Systems Technology, Department of Computer Science, The University of Queensland, Brisbane, Qld 4072, Australia
² Dept. of Computer Science, Université Laval, Canada
Abstract. This paper describes a standard interoperability model based on a knowledge representation language such as Conceptual Graphs (CGs). In particular, it describes how an Electronic Data Interchange (EDI) mapping facility can use CG contexts to integrate and compare different trade documents by combining and analysing different concept lattices derived from formal concept analysis theory. In doing this, we hope to provide a formal construct which will support the next generation of EDI trading concerned with corporate information.
1 Introduction
There have been several attempts to overcome the semantic heterogeneity existing between two or more business systems. A simple approach is a paper-based system in which purchase orders generated from a purchasing program are faxed (or communicated via telephone) to a human coordinator, whose job is to extract and transcribe the information from an order into the format required by an order entry program. In general, the coordinator has the specific knowledge necessary to handle the various inconsistencies and missing information associated with exchanged messages. For example, the coordinator should know what to do when information is provided that was not requested (an unused item) or when information that was requested is not provided (a null item). This interoperation technique is simple and relatively inexpensive to implement, since it does not require the support of EDI software. However, it is not flexible enough to really support a complex and dynamic trade environment, where time-critical trade transactions (e.g. a foreign exchange deal) may need to be interoperated on the fly, without prior negotiation (about standardised trading terms), while relying on the human ability to quickly and correctly transcribe complex trade information. To facilitate system interoperation, other more sophisticated systems include general service discovery tools like the trader service of Open Distributed Processing [5], schema integration tools in multidatabase systems [2], context-based interchange tools of heterogeneous systems [7,1], and email message filtering

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 431–438, 1998.
© Springer-Verlag Berlin Heidelberg 1998
H. Wing, R.M. Colomb, and G. Mineau
tools of Computer Systems for Collaborative Work (CSCW) [3], and EDI trade systems [6]. The above systems are similar in the sense that they all rely on commonly shared structures (ontologies) of some kind to compare and identify the semantic heterogeneity associated with the underlying components. However, what seems lacking in these systems is a formal construct which can be used to specify and compare the different contexts associated with trade messages. Detailed descriptions of these systems and their pros and cons can be found in [9]. In this paper, we describe an enhanced approach to supporting business system interoperation by using the Conceptual Graph Formal Context [4] derived from Formal Concept Analysis theory [8]. The paper is organised as follows: Section 2 overviews some of the relevant formal methods. Section 3 describes how we can overcome the so-called 1st trade problem (which refers to the initial high cost of establishing a collection of commonly agreed trading terms).
(Figure: customer application programs and supplier application programs exchange formal specs and data through a Purchase Order Handler and an Order Entry Handler; an EMF Server, with specification analysis tools and a human coordinator, produces revised specs.)
Fig. 1. EDI Mapping Facility (EMF)
2 Relevant Formal Methods
In designing the EDI Mapping Facility (EMF) (shown in Figure 1), we aim to facilitate the following: 1) systematic interoperation: allow business systems to dynamically and systematically collaborate with each other with minimal human intervention; 2) unilateral changes: allow a business system to change and extend trade messages with minimal consensus from other business systems; and 3) minimal up-front coordination: eliminate the one-to-one bilateral trade agreements imposed by traditional EDI systems. To support the above aims, we need to be able to express the various message concepts and the relationships among these concepts. In doing so we need a logical notation of some kind. In general, a formal notation such as first order logic, Object Z, or CGs is considered useful for the following reasons: 1) it is an unambiguous logical notation, 2) it is an expressive specification language, and 3) specification properties can be demonstrated by using mathematical proof techniques.
However, we choose CGs to specify EDI messages due to the following added benefits. First, the graphical notation of CGs is designed for human readability. Second, the Canonical Formation Rules of CGs allow a collection of conceptual graph expressions to be composed (by using join, copy) and decomposed (by using restrict, simplify) to form new conceptual graph expressions. In this sense, the formation rules are a kind of graph grammar which can be used to specify EDI messages. In addition, they can also be used to enforce certain semantic constraints. Here, the canonical formation rules define the syntax of the trade expressions, but they do not necessarily guarantee that these expressions are true. To derive correct expressions from other correct expressions we need rules of inference. Third, aiming to support reasoning with graphs, Peirce defined a set of five inference rules (erasure, insertion, iteration, de-iteration, double negation) and an axiom (the empty set) based on primitive operations of copying and reasoning about graphs in various contexts. Thus, rules of inference allow a new EDI trade expression to be derived from an existing trade expression, allowing an Internet-based trade model to be reasoned about and analysed. Furthermore, to facilitate systematic interoperation we need to be able to formalise the various trade contexts (assumptions and assertions) associated with EDI messages. According to Mineau and Gerbé, informally: 'A context is defined in two parts: an intention, a set of conceptual graphs which describe the conditions which make the asserted graphs true, and an extension, composed of all the graphs true under these conditions' [4]. Formally, a context Ci can be described as a tuple of two sets of CGs, Ti and Gi. Ti defines the conditions under which Ci exists, represented by a single intention graph; Gi is the set of CGs true in that context.
So, for a context Ci, Ci = < Ti, Gi > = < I(Ci), E(Ci) >, where I(Ci), a single CG, is the intention graph of Ci, and E(Ci), the set of graphs conjunctively true in Ci, are the extension graphs. Based on Formal Concept Analysis theory [8], Mineau and Gerbé further define the formal context, named Ci*, as a tuple < Ti, Gi > where Gi = E*(Ti) and Ti = I*(Gi) = I(Ci*). With these definitions, the context lattice, L, can be computed automatically by applying the algorithm given in the formal concept analysis theory described below. This lattice provides an explanation and access structure to the knowledge base, and relates different worlds of assertions to one another. Thus, L is defined as: L = < {Ci*}, ≤ >. In the next section, we describe how these formal methods can be applied to solve the so-called 1st trade problem.
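The formal-concept machinery assumed here can be sketched concretely (our illustration, with a toy incidence relation): a context is an incidence relation between objects and attributes, the two derivation operators map between object sets and attribute sets, and the pairs closed under both are the formal concepts from which the lattice is built.

```python
from itertools import chain, combinations

# Toy incidence relation: which object carries which attribute.
incidence = {("c1", "i1"), ("c2", "i1"), ("c3", "i1"),
             ("c3", "i2"), ("c4", "i2"), ("c4", "i3"), ("c5", "i3")}
objects = {o for o, _ in incidence}
attrs   = {a for _, a in incidence}

def intent(objs):
    """Attributes common to all objects in objs."""
    return {a for a in attrs if all((o, a) in incidence for o in objs)}

def extent(atts):
    """Objects having all attributes in atts."""
    return {o for o in objects if all((o, a) in incidence for a in atts)}

# A formal concept is a pair (extent, intent) closed under both
# operators; enumerate them by closing every subset of objects.
concepts = {(frozenset(extent(intent(s))), frozenset(intent(s)))
            for s in map(set, chain.from_iterable(
                combinations(sorted(objects), r)
                for r in range(len(objects) + 1)))}
```

Ordering these concepts by extent inclusion yields the concept lattice; this brute-force enumeration is exponential and only suitable for small cross-tables like those of Fig. 3.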
3 An Example: Overcoming the 1st Trade Problem
This example is based on the following trade scenario: in a foreign exchange deal, a broker A uses a foreign exchange standard provided by a major New York bank to compose a purchase order. Similarly, a broker B uses another standard, provided by a major Tokyo bank, to compose an order entry. Note that these two standards are both specialisations of the same EDI standard. The key idea here
is to deploy an approach in which we can specify the different assumptions and assertions relevant to the trade messages, so that these formalised specifications can be used to systematically identify the different mismatches (null, unused, and missing items). As an example, Figure 2 shows how we may use CG contexts to model the various trade assumptions (intents i1, ..., i7) and concept assertions (extents c1, ..., c12).
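Using the mismatch terminology of the introduction, the classification step can be sketched as plain set arithmetic (the item names below are illustrative, not from the paper):

```python
def classify(provided, requested):
    """Classify items as matched, unused (provided but not requested),
    or null (requested but not provided)."""
    provided, requested = set(provided), set(requested)
    return {"matched": provided & requested,
            "unused":  provided - requested,
            "null":    requested - provided}

result = classify(provided={"part", "quantity", "colour"},
                  requested={"part", "quantity", "cost_per_unit"})
# result["unused"] == {"colour"}; result["null"] == {"cost_per_unit"}
```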
(Figure: two CONTEXT: Foreign exchange boxes, one for the purchase order and one for the order entry, containing nested contexts such as 'Foreign exchange, Factor=1', 'Foreign exchange, Factor=1000', 'Foreign exchange, Currency=USD', and 'Foreign exchange, Currency=JPY', which relate the intents i1, ..., i7 to asserted concepts such as P/Order, O/Entry, Product#, Quantity, CostPerUnit, and DiscountRqst.)
Fig. 2. Sample CG contexts relevant to a purchase order (left) and an order entry (right)
There are several steps involved in the systematic interoperation of EDI messages. The following steps are based on the EMF shown in Figure 1.

• Step 1. Prepare and forward specs: Brokers A and B interact with the Customer Application Program and the Supplier Application Program, respectively, to compose a purchase order and an order entry based on the standardised vocabularies provided by a foreign exchange standard. Figure 2 shows two possible specifications: a purchase order and an order entry. Once the formal specifications have been defined (by using CG formation rules and contexts), they can be forwarded to either a Purchase Order Handler or an Order Entry Handler for processing. Upon receiving an order request, the Purchase Order Handler checks its internal record, stored in the Supplier Log, to see whether or not this order spec has been processed before. If not, this '1st trade' spec is forwarded to the EMF Server for processing. Otherwise, based on the previously established trade information stored in the Supplier Log, the relevant profile can be retrieved and forwarded, with the relevant order data, to an appropriate Order Entry Handler for processing. In order to identify the discrepancy between a purchase order and an order entry, the Order Entry Handler needs to forward an order entry spec to an EMF Server for processing.
Using CG Formal Contexts to Support Business System Interoperation
• Step 2. Integrate and compare specs: To effectively compare two specs from different sources, the EMF Server needs to do the following: 1) formalise the specs and organise their formal contexts into two separate type hierarchies known as context lattices (note that in order to compare two different specs, an agreement on an initial ontology must exist); 2) by navigating and comparing the structures of these context lattices, identify and integrate contexts of one source with contexts of the other source, thereby forming an integrated lattice; and 3) access and navigate this integrated lattice to identify equivalent and/or conflicting intentions (or assumptions). From these identified and matched intentions, the extents can be compared in order to identify matched, unused, null, and conflicting assertions. The results of the comparison steps can then be used to generate the necessary mapping profiles. In the following, we describe how the above steps can be formally carried out.
[Figure: two FCA cross-tables, Gpo (objects c1–c5 against attributes Mpo = i1–i3) and Goe (objects c6–c13 against attributes Moe = i4–i7), with crosses marking which assumption applies to which concept.]
Fig. 3. FCA Formal Contexts represent two different sets of assumptions (about standard, currency and scale factor)
Generating the Context Lattices: Based on FCA theory, the above formal CG contexts can be systematically re-arranged to form the corresponding FCA contexts of the purchase order and order entry (denoted as KP and KO, respectively). These contexts are illustrated as cross-tables in Figure 3. The cross-table on the left depicts the formal context (KP) of the purchase order spec representing a query graph, while the cross-table on the right depicts the formal context (KO) of the order entry spec representing a type graph. To simplify our example, all asserted conceptual relations shown in Figure 2 have been ignored in the cross-tables. If the application is required to query and compare the asserted relations, they can easily be included in the cross-tables prior to generating the context lattice. Recall from FCA theory that for a given context K we can systematically find its formal concepts (Xi, Bi). By ordering these formal concepts based on the sub/superconcept relation (≤), we can systematically determine the concept lattice B(K) based on the join and meet operations of FCA theory. Thus, from our example, the contexts KP and KO shown in Figure 3 can be systematically processed to generate the concept lattices B(KP) and B(KO) shown in Figure 4, respectively. The context KP has five formal concepts {C1, C2, C3, C4, C5} and the context KO also has five formal concepts {C6, C7, C8, C9, C10}.
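The derivation of formal concepts from a cross-table can be sketched with a small brute-force routine. This is a minimal FCA illustration under our own assumptions (the cross-table is a dict mapping each object to its set of attributes; concepts are pairs (X, B) with X'' = X), not the paper's implementation:

```python
from itertools import combinations

def derive_intent(objs, table, attributes):
    """B = the set of attributes shared by all objects in objs."""
    result = set(attributes)
    for o in objs:
        result &= table[o]
    return result

def derive_extent(attrs, table):
    """X = the set of objects possessing all attributes in attrs."""
    return {o for o, a in table.items() if attrs <= a}

def formal_concepts(table):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Every subset of objects is closed via X -> X' -> X''; duplicate
    closures collapse because concepts are collected in a set.
    """
    attributes = set().union(*table.values())
    objects = list(table)
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = derive_intent(objs, table, attributes)
            extent = frozenset(derive_extent(intent, table))
            concepts.add((extent, frozenset(intent)))
    return concepts
```

Ordering the resulting concepts by extent inclusion then yields the concept lattice B(K); the exponential enumeration is acceptable only because EDI spec contexts of this kind are small.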
H. Wing, R.M. Colomb, and G. Mineau
[Figure: the P/Order context lattice over formal concepts C1–C5 and the O/Entry context lattice over formal concepts C6–C10; the extent/intent labels of each node are omitted here.]
Fig. 4. Context lattices generated from cross-tables shown in Figure 3
Integrating the Context Lattices: At this point, the context lattices B(KP ) and B(KO ) represent the important hierarchical conceptual clustering of the asserted concepts (via the extents) and a representation of all implications between the assumptions (via its intents). With these context lattices we can then proceed to query and compare the asserted concepts based on the previously specified assumptions. However, before we can compare the individual concepts, we need to combine the context lattices to form an integrated context lattice. In doing this we ensure that only those concepts that are based on the same type of assumptions (or intention type) can be compared with each other. Otherwise, the comparison would make no sense.
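The constraint that only concepts resting on the same intention types may be compared can be sketched as follows. The equivalence map between assumption ids (here i1 ~ i4, etc.) is our own stand-in for the agreed initial ontology, and the flat list-of-pairs lattice representation is likewise an assumption of this sketch:

```python
# Hypothetical sketch: pair up formal concepts from two context lattices
# whose intents correspond under an agreed equivalence of assumption types.

def comparable_pairs(lattice_po, lattice_oe, intent_equiv):
    """Yield (Cp, Co) pairs whose intents match under intent_equiv.

    Each lattice is a list of (extent, intent) pairs; intent_equiv maps
    purchase-order assumption ids to order-entry assumption ids.
    """
    pairs = []
    for ext_p, int_p in lattice_po:
        # Translate the PO intent into the OE vocabulary.
        translated = {intent_equiv[i] for i in int_p if i in intent_equiv}
        for ext_o, int_o in lattice_oe:
            # Only concepts based on corresponding assumption types
            # are comparable; anything else would make no sense.
            if translated and translated == set(int_o):
                pairs.append(((ext_p, int_p), (ext_o, int_o)))
    return pairs
```

The extents of each comparable pair can then be examined for matched or conflicting assertions, as described next.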
[Figure: the integrated context lattice, with matching pairs (e.g. C2/C7, C3/C8) and a conflicting pair (C5/C9) linked across the two lattices.]
Fig. 5. Integrated Context Lattice
Based on the information given by the individual contexts shown in Figure 2, we can derive that i1 is equivalent to i4 (i.e. both contexts C2 and C7 are based on the same foreign exchange standard). Thus, we can integrate and compare context C2’s individual concepts (c2, c3, c4) against context C7’s individual concepts (c7, c8, c12). By comparing these individual concepts we find that c2 = c7, c3 = c8, and c4 = c12. Note that this comparison holds only when the above concepts are defined according to the convention specified by intents i1 and i4. This integration step is illustrated in the top part of the integrated lattice shown in Figure 5.
Similarly, we can integrate and compare contexts C3 and C8. In this case, we find that concept c8 = c3 (quantity), under the assumption that c8 and c3 are based on i4 (foreign exchange standard) and i5 (factor = 1). This integration step is illustrated in the left part of the integrated lattice shown in Figure 5. While integrating and comparing C5 and C9 we find discrepancies in the intentions i2 (factor = 1) and i6 (factor = 1000), and also in i3 (currency = USD) and i7 (currency = JPY). These discrepancies in the intent parts suggest that c4 and c12 (CostPerUnit) are based on conflicting assumptions. This integration and comparison step is illustrated in the right part of the integrated lattice shown in Figure 5. The results of the integration process can be used to form the result profiles which identify the null, unused and mismatched items of a purchase order and an order entry. These profiles are then forwarded to the relevant handlers to control the data flow between business systems. In general, a mapping profile can be systematically generated by inference over the integrated context lattice.
• Step 3. Forward relevant data: Upon receiving the mapping profiles from an EMF Server, the Purchase Order Handler and Order Entry Handler store these profiles in the Supplier Log and Customer Log, respectively. In doing so, subsequent order requests can use these profiles to coordinate and transport the purchase order data to the appropriate order entry programs without having to integrate and compare the Purchase Order’s and Order Entry’s specs. It is important to point out that, by navigating the context lattice, analysis software would be able to identify the reasons behind the mismatching results. Some mismatches (e.g. unknown concepts, i.e. those which cannot be identified by a particular standard) can be impossible for another system to interpret without the intervention of a human coordinator. However, some other mismatches (e.g. those exchanged concepts that were based on a shared ontology but were not provided or asked for) can be systematically appended to the original specs and forwarded back to the Purchase Order Handler or Order Entry Handler for re-processing.
An open research agenda: We have described above an approach to identifying discrepancies among different CG concepts. It is important to note that discrepancies may come from relations and not just from concepts. They may come from the way concepts are connected by relations, or from within nested concepts (or nested relations). For example, the message ‘Broker A delivers a quote to Broker B’ may have different interpretations depending on whether A calls (or emails) B to deliver a quote on the spot (which may not be secure), or A requests a quote specialist (by using a server) to make the delivery (in which case the quote can be securely delivered by using encryption, certification, and non-repudiation techniques). If the application does not care how the quote is delivered, as long as B receives the quote, then it is not necessary to analyse or reason about the relevant nested concepts (or relations). However, if the security associated with the delivery is of concern, we need to find a way to compare and identify the potential conflicts embedded in nested concepts.
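The extent comparison that produces a mapping profile can be sketched as a simple classification. This is a hedged illustration: the profile keys (matched, conflict, unused, null), the `concept_equiv` map and the `assumptions_agree` flag are our own encoding of the paper's informal description, not its notation.

```python
# Illustrative sketch: classify purchase-order items against order-entry
# items once their concepts are known to rest on corresponding assumptions.

def mapping_profile(extent_po, extent_oe, concept_equiv, assumptions_agree):
    """Build a profile of matched, conflicting, unused and null items.

    concept_equiv maps PO concept ids to OE concept ids (the agreed
    initial ontology); assumptions_agree records whether the two intents
    were found equivalent or conflicting during lattice integration.
    """
    profile = {"matched": [], "conflict": [], "unused": [], "null": []}
    mapped = set()
    for c in extent_po:
        target = concept_equiv.get(c)
        if target is None or target not in extent_oe:
            profile["null"].append(c)  # asked for but not provided
        else:
            mapped.add(target)
            key = "matched" if assumptions_agree else "conflict"
            profile[key].append((c, target))
    # Items provided by the order entry but never asked for.
    profile["unused"] = [c for c in extent_oe if c not in mapped]
    return profile
```

Forwarding such a profile to the handlers is what allows later trades to bypass the full integrate-and-compare cycle.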
438
H. Wing, R.M. Colomb, and G. Mineau
Discrepancies associated with relations can be addressed by the approach described above. For example, we can substitute relations (instead of concepts) as the extents of the cross-table shown in Figure 3. In doing so, we can generate the necessary context lattices and integrated lattices based on relations rather than concepts, and then compare and identify discrepancies among relations. If we view the different ways in which concepts are connected by relations as a collection of ‘super concepts’, then to identify the discrepancies among these super concepts, a partial common ontology (which describes how concepts may be connected by relations) must be used. The level of matching between different ontologies will have a direct impact on the comparison heuristic. The problem here is to discover heuristics to guide a search through two lattices in order to integrate them: by finding enough similarity, we can discover the dissimilarities. To conclude, by using formal concept analysis and the conceptual graph formalism we can systematically create context lattices to represent complex message specifications and their assumptions. In doing so, message specs can be effectively navigated and compared, making a formal EDI mapping approach feasible.
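The substitution of relations for concepts as extents amounts to re-keying the cross-table by relation instances. A tiny hedged sketch, with invented relation labels and assumption ids:

```python
# Illustrative only: the same cross-table machinery, with relation
# instances (not concepts) as the objects of the formal context.
relation_table = {
    "(agnt: deliver -> broker)": {"i1"},
    "(obj: deliver -> quote)": {"i1", "i2"},
}

def shared_assumptions(table):
    """Assumptions common to every relation row (the top concept's intent)."""
    rows = iter(table.values())
    common = set(next(rows))
    for attrs in rows:
        common &= attrs
    return common
```

Feeding such a table through the same lattice-generation and integration steps would expose relation-level discrepancies in the same way as concept-level ones.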
Author Index
Angelova, G. 351
Baader, F. 15
Baud, R.H. 390
Biedermann, K. 209
Borgida, A. 15
Bräuner, T. 255
Burrow, A. 111
Cao, T.H. 270
Chibout, K. 367
Colomb, R.M. 431
Coulondre, S. 179
Coupey, P. 165
Creasy, P.N. 270
Cyre, W.R. 51
Dibie, J. 80
Dieng, R. 139
Eklund, P.W. 111
Faron, C. 165
Ganter, B. 295
Genest, D. 154
Gerbé, O. 401
Groh, B. 127
Guinaldo, O. 287
Haemmerlé, O. 80, 287
Hoede, C. 375
Hug, S. 139
Jappy, P. 303
Kayser, D. 35
Keller, R.K. 401
Kuznetsov, S.O. 295
Liu, X. 375
Loiseau, S. 80
Lovis, C. 390
McGuinness, D.L. 15
Mann, G.A. 319
Mineau, G.W. 65, 401, 416, 431
Moor, A. de 416
Moulin, B. 359
Nock, R. 303
Pollitt, S. 111
Prediger, S. 225
Puder, A. 119
Rassinoux, A.-M. 390
Ribière, M. 94
Salvat, E. 154, 179
Scherrer, J.-R. 390
Simonet, G. 240
Sowa, J.F. 3
Strahringer, S. 127
Tepfenhart, W.M. 334
Vilnat, A. 367
Wagner, J.C. 390
Wille, R. 127, 194
Wing, H. 431
Lecture Notes in Artificial Intelligence (LNAI)
Vol. 1314: S. Muggleton (Ed.), Inductive Logic Programming. Proceedings, 1996. VIII, 397 pages. 1997. Vol. 1316: M. Li, A. Maruoka (Eds.), Algorithmic Learning Theory. Proceedings, 1997. XI, 461 pages. 1997. Vol. 1317: M. Leman (Ed.), Music, Gestalt, and Computing. IX, 524 pages. 1997. Vol. 1319: E. Plaza, R. Benjamins (Eds.), Knowledge Acquisition, Modelling and Management. Proceedings, 1997. XI, 389 pages. 1997. Vol. 1321: M. Lenzerini (Ed.), AI*IA 97: Advances in Artificial Intelligence. Proceedings, 1997. XII, 459 pages. 1997. Vol. 1323: E. Costa, A. Cardoso (Eds.), Progress in Artificial Intelligence. Proceedings, 1997. XIV, 393 pages. 1997. Vol. 1325: Z.W. Raś, A. Skowron (Eds.), Foundations of Intelligent Systems. Proceedings, 1997. XI, 630 pages. 1997. Vol. 1328: C. Retoré (Ed.), Logical Aspects of Computational Linguistics. Proceedings, 1996. VIII, 435 pages. 1997. Vol. 1342: A. Sattar (Ed.), Advanced Topics in Artificial Intelligence. Proceedings, 1997. XVIII, 516 pages. 1997. Vol. 1348: S. Steel, R. Alami (Eds.), Recent Advances in AI Planning. Proceedings, 1997. IX, 454 pages. 1997. Vol. 1359: G. Antoniou, A.K. Ghose, M. Truszczyński (Eds.), Learning and Reasoning with Complex Representations. Proceedings, 1996. X, 283 pages. 1998. Vol. 1360: D. Wang (Ed.), Automated Deduction in Geometry. Proceedings, 1996. VII, 235 pages. 1998. Vol. 1365: M.P. Singh, A. Rao, M.J. Wooldridge (Eds.), Intelligent Agents IV. Proceedings, 1997. XII, 351 pages. 1998. Vol. 1371: I. Wachsmuth, M. Fröhlich (Eds.), Gesture and Sign Language in Human-Computer Interaction. Proceedings, 1997. XI, 309 pages. 1998. Vol. 1374: H. Bunt, R.-J. Beun, T. Borghuis (Eds.), Multimodal Human-Computer Communication. VIII, 345 pages. 1998. Vol. 1387: C. Lee Giles, M. Gori (Eds.), Adaptive Processing of Sequences and Data Structures. Proceedings, 1997. XII, 434 pages. 1998. Vol. 1394: X. Wu, R. Kotagiri, K.B. Korb (Eds.), Research and Development in Knowledge Discovery and Data Mining.
Proceedings, 1998. XVI, 424 pages. 1998. Vol. 1395: H. Kitano (Ed.), RoboCup-97: Robot Soccer World Cup I. XIV, 520 pages. 1998. Vol. 1397: H. de Swart (Ed.), Automated Reasoning with Analytic Tableaux and Related Methods. Proceedings, 1998. X, 325 pages. 1998.
Vol. 1398: C. Nédellec, C. Rouveirol (Eds.), Machine Learning: ECML-98. Proceedings, 1998. XII, 420 pages. 1998. Vol. 1400: M. Lenz, B. Bartsch-Spörl, H.-D. Burkhard, S. Wess (Eds.), Case-Based Reasoning Technology. XVIII, 405 pages. 1998. Vol. 1404: C. Freksa, C. Habel, K.F. Wender (Eds.), Spatial Cognition. VIII, 491 pages. 1998. Vol. 1409: T. Schaub, The Automation of Reasoning with Incomplete Information. XI, 159 pages. 1998. Vol. 1415: J. Mira, A.P. del Pobil, M. Ali (Eds.), Methodology and Tools in Knowledge-Based Systems. Vol. I. Proceedings, 1998. XXIV, 887 pages. 1998. Vol. 1416: A.P. del Pobil, J. Mira, M. Ali (Eds.), Tasks and Methods in Applied Artificial Intelligence. Vol. II. Proceedings, 1998. XXIII, 943 pages. 1998. Vol. 1418: R. Mercer, E. Neufeld (Eds.), Advances in Artificial Intelligence. Proceedings, 1998. XII, 467 pages. 1998. Vol. 1421: C. Kirchner, H. Kirchner (Eds.), Automated Deduction - CADE-15. Proceedings, 1998. XIV, 443 pages. 1998. Vol. 1424: L. Polkowski, A. Skowron (Eds.), Rough Sets and Current Trends in Computing. Proceedings, 1998. XIII, 626 pages. 1998. Vol. 1433: V. Honavar, G. Slutzki (Eds.), Grammatical Inference. Proceedings, 1998. X, 271 pages. 1998. Vol. 1434: J.-C. Heudin (Ed.), Virtual Worlds. Proceedings, 1998. XII, 412 pages. 1998. Vol. 1435: M. Klusch, G. Weiß (Eds.), Cooperative Information Agents II. Proceedings, 1998. IX, 307 pages. 1998. Vol. 1437: S. Albayrak, F.J. Garijo (Eds.), Intelligent Agents for Telecommunication Applications. Proceedings, 1998. XII, 251 pages. 1998. Vol. 1441: W. Wobcke, M. Pagnucco, C. Zhang (Eds.), Agents and Multi-Agent Systems. Proceedings, 1997. XII, 241 pages. 1998. Vol. 1446: D. Page (Ed.), Inductive Logic Programming. Proceedings, 1998. VIII, 301 pages. 1998. Vol. 1453: M.-L. Mugnier, M. Chein (Eds.), Conceptual Structures: Theory, Tools and Applications. Proceedings, 1998. XIII, 439 pages. 1998. Vol. 1454: I. Smith (Ed.), Artificial Intelligence in Structural Engineering. XI, 497 pages. 1998. Vol. 1456: A. Drogoul, M. Tambe, T. Fukuda (Eds.), Collective Robotics. Proceedings, 1998. VII, 161 pages. 1998. Vol. 1458: V.O. Mittal, H.A. Yanco, J. Aronis, R. Simpson (Eds.), Assistive Technology in Artificial Intelligence. X, 273 pages. 1998.
Lecture Notes in Computer Science
Vol. 1415: J. Mira, A.P. del Pobil, M. Ali (Eds.), Methodology and Tools in Knowledge-Based Systems. Vol. I. Proceedings, 1998. XXIV, 887 pages. 1998. (Subseries LNAI). Vol. 1416: A.P. del Pobil, J. Mira, M. Ali (Eds.), Tasks and Methods in Applied Artificial Intelligence. Vol. II. Proceedings, 1998. XXIII, 943 pages. 1998. (Subseries LNAI). Vol. 1417: S. Yalamanchili, J. Duato (Eds.), Parallel Computer Routing and Communication. Proceedings, 1997. XII, 309 pages. 1998. Vol. 1418: R. Mercer, E. Neufeld (Eds.), Advances in Artificial Intelligence. Proceedings, 1998. XII, 467 pages. 1998. (Subseries LNAI). Vol. 1419: G. Vigna (Ed.), Mobile Agents and Security. XII, 257 pages. 1998. Vol. 1420: J. Desel, M. Silva (Eds.), Application and Theory of Petri Nets 1998. Proceedings, 1998. VIII, 385 pages. 1998. Vol. 1421: C. Kirchner, H. Kirchner (Eds.), Automated Deduction - CADE-15. Proceedings, 1998. XIV, 443 pages. 1998. (Subseries LNAI). Vol. 1422: J. Jeuring (Ed.), Mathematics of Program Construction. Proceedings, 1998. X, 383 pages. 1998. Vol. 1423: J.P. Buhler (Ed.), Algorithmic Number Theory. Proceedings, 1998. X, 640 pages. 1998. Vol. 1424: L. Polkowski, A. Skowron (Eds.), Rough Sets and Current Trends in Computing. Proceedings, 1998. XIII, 626 pages. 1998. (Subseries LNAI). Vol. 1425: D. Hutchison, R. Schäfer (Eds.), Multimedia Applications, Services and Techniques - ECMAST'98. Proceedings, 1998. XVI, 532 pages. 1998. Vol. 1427: A.J. Hu, M.Y. Vardi (Eds.), Computer Aided Verification. Proceedings, 1998. IX, 552 pages. 1998. Vol. 1430: S. Trigila, A. Mullery, M. Campolargo, H. Vanderstraeten, M. Mampaey (Eds.), Intelligence in Services and Networks: Technology for Ubiquitous Telecom Services. Proceedings, 1998. XII, 550 pages. 1998. Vol. 1431: H. Imai, Y. Zheng (Eds.), Public Key Cryptography. Proceedings, 1998. XI, 263 pages. 1998. Vol. 1432: S. Arnborg, L. Ivansson (Eds.), Algorithm Theory - SWAT '98. Proceedings, 1998. IX, 347 pages. 1998. Vol. 1433: V. Honavar, G. Slutzki (Eds.), Grammatical Inference. Proceedings, 1998. X, 271 pages. 1998. (Subseries LNAI). Vol. 1434: J.-C. Heudin (Ed.), Virtual Worlds. Proceedings, 1998. XII, 412 pages. 1998. (Subseries LNAI). Vol. 1435: M. Klusch, G. Weiß (Eds.), Cooperative Information Agents II. Proceedings, 1998. IX, 307 pages. 1998. (Subseries LNAI).
Vol. 1436: D. Wood, S. Yu (Eds.), Automata Implementation. Proceedings, 1997. VIII, 253 pages. 1998. Vol. 1437: S. Albayrak, F.J. Garijo (Eds.), Intelligent Agents for Telecommunication Applications. Proceedings, 1998. XII, 251 pages. 1998. (Subseries LNAI). Vol. 1438: C. Boyd, E. Dawson (Eds.), Information Security and Privacy. Proceedings, 1998. XI, 423 pages. 1998. Vol. 1439: B. Magnusson (Ed.), System Configuration Management. Proceedings, 1998. X, 207 pages. 1998.
Vol. 1441: W. Wobcke, M. Pagnucco, C. Zhang (Eds.), Agents and Multi-Agent Systems. Proceedings, 1997. XII, 241 pages. 1998. (Subseries LNAI). Vol. 1443: K.G. Larsen, S. Skyum, G. Winskel (Eds.), Automata, Languages and Programming. Proceedings, 1998. XVI, 932 pages. 1998. Vol. 1444: K. Jansen, J. Rolim (Eds.), Approximation Algorithms for Combinatorial Optimization. Proceedings, 1998. VIII, 201 pages. 1998. Vol. 1445: E. Jul (Ed.), ECOOP'98 - Object-Oriented Programming. Proceedings, 1998. XII, 635 pages. 1998.
Vol. 1446: D. Page (Ed.), Inductive Logic Programming. Proceedings, 1998. VIII, 301 pages. 1998. (Subseries LNAI). Vol. 1448: M. Farach-Colton (Ed.), Combinatorial Pattern Matching. Proceedings, 1998. VIII, 251 pages. 1998. Vol. 1449: W.-L. Hsu, M.-Y. Kao (Eds.), Computing and Combinatorics. Proceedings, 1998. XII, 372 pages. 1998. Vol. 1452: B.P. Goettl, H.M. Halff, C.L. Redfield, V.J. Shute (Eds.), Intelligent Tutoring Systems. Proceedings, 1998. XIX, 629 pages. 1998. Vol. 1453: M.-L. Mugnier, M. Chein (Eds.), Conceptual Structures: Theory, Tools and Applications. Proceedings, 1998. XIII, 439 pages. (Subseries LNAI). Vol. 1454: I. Smith (Ed.), Artificial Intelligence in Structural Engineering. XI, 497 pages. 1998. (Subseries LNAI). Vol. 1456: A. Drogoul, M. Tambe, T. Fukuda (Eds.), Collective Robotics. Proceedings, 1998. VII, 161 pages. 1998. (Subseries LNAI). Vol. 1457: A. Ferreira, J. Rolim, H. Simon, S.-H. Teng (Eds.), Solving Irregularly Structured Problems in Parallel. Proceedings, 1998. X, 408 pages. 1998. Vol. 1458: V.O. Mittal, H.A. Yanco, J. Aronis, R. Simpson (Eds.), Assistive Technology in Artificial Intelligence. X, 273 pages. 1998. (Subseries LNAI). Vol. 1464: H.H.S. Ip, A.W.M. Smeulders (Eds.), Multimedia Information Analysis and Retrieval. Proceedings, 1998. VIII, 264 pages. 1998.