INFORMATION MODELLING AND KNOWLEDGE BASES XIX
Frontiers in Artificial Intelligence and Applications FAIA covers all aspects of theoretical and applied artificial intelligence research in the form of monographs, doctoral dissertations, textbooks, handbooks and proceedings volumes. The FAIA series contains several sub-series, including “Information Modelling and Knowledge Bases” and “Knowledge-Based Intelligent Engineering Systems”. It also includes the biennial ECAI, the European Conference on Artificial Intelligence, proceedings volumes, and other ECCAI – the European Coordinating Committee on Artificial Intelligence – sponsored publications. An editorial panel of internationally well-known scholars is appointed to provide a high quality selection. Series Editors: J. Breuker, R. Dieng-Kuntz, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong
Volume 166

Recently published in this series

Vol. 165. A.R. Lodder and L. Mommers (Eds.), Legal Knowledge and Information Systems – JURIX 2007: The Twentieth Annual Conference
Vol. 164. J.C. Augusto and D. Shapiro (Eds.), Advances in Ambient Intelligence
Vol. 163. C. Angulo and L. Godo (Eds.), Artificial Intelligence Research and Development
Vol. 162. T. Hirashima et al. (Eds.), Supporting Learning Flow Through Integrative Technologies
Vol. 161. H. Fujita and D. Pisanelli (Eds.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the sixth SoMeT_07
Vol. 160. I. Maglogiannis et al. (Eds.), Emerging Artificial Intelligence Applications in Computer Engineering – Real World AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies
Vol. 159. E. Tyugu, Algorithms and Architectures of Artificial Intelligence
Vol. 158. R. Luckin et al. (Eds.), Artificial Intelligence in Education – Building Technology Rich Learning Contexts That Work
Vol. 157. B. Goertzel and P. Wang (Eds.), Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms – Proceedings of the AGI Workshop 2006
Vol. 156. R.M. Colomb, Ontology and the Semantic Web
Vol. 155. O. Vasilecas et al. (Eds.), Databases and Information Systems IV – Selected Papers from the Seventh International Baltic Conference DB&IS’2006
Vol. 154. M. Duží et al. (Eds.), Information Modelling and Knowledge Bases XVIII
Vol. 153. Y. Vogiazou, Design for Emergence – Collaborative Social Play with Online and Location-Based Media
ISSN 0922-6389
Information Modelling and Knowledge Bases XIX
Edited by
Hannu Jaakkola Tampere University of Technology, Finland
Yasushi Kiyoki Keio University, Japan
and
Takahiro Tokuda Tokyo Institute of Technology, Japan
Amsterdam • Berlin • Oxford • Tokyo • Washington, DC
© 2008 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-58603-812-0
Library of Congress Control Number: 2007940891

Publisher
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: [email protected]

Distributor in the UK and Ireland
Gazelle Books Services Ltd.
White Cross Mills
Hightown
Lancaster LA1 4XS
United Kingdom
fax: +44 1524 63232
e-mail: [email protected]

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: [email protected]

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Preface

In the last decades information modelling and knowledge bases have become hot topics not only in academic communities related to information systems and computer science but also in business areas where information technology is applied.

The 17th European-Japanese Conference on Information Modelling and Knowledge Bases, EJC 2007, continues the series of events that originally started as a cooperation between Japan and Finland as far back as the late 1980’s. Later (1991) the geographical scope of these conferences expanded to cover all of Europe as well as countries outside Europe other than Japan.

The EJC conferences constitute a world-wide research forum for the exchange of scientific results and experiences achieved in computer science and other related disciplines using innovative methods and progressive approaches. In this way a platform has been established drawing together researchers as well as practitioners dealing with information modelling and knowledge bases. Thus the main topics of the EJC conferences target the variety of themes in the domain of information modelling, conceptual analysis, design and specification of information systems, ontologies, software engineering, knowledge and process management, data and knowledge bases. We also aim at applying new progressive theories. To this end much attention is also being paid to theoretical disciplines including cognitive science, artificial intelligence, logic, linguistics and analytical philosophy.

In order to achieve the EJC targets, an international programme committee selected 19 full papers, 8 short papers, 4 position papers and 3 poster papers in the course of a rigorous reviewing process including 34 submissions. The selected papers cover many areas of information modelling, namely theory of concepts, database semantics, knowledge representation, software engineering, WWW information management, context-based information retrieval, ontological technology, image databases, temporal and spatial databases, document data management, process management, and many others.

The conference would not have been a success without the effort of many people and organizations. In the Programme Committee, 37 reputable researchers devoted a good deal of effort to the review process in order to select the best papers and create the EJC 2007 programme. We are very grateful to them. Professors Yasushi Kiyoki and Takehiro Tokuda acted as co-chairs of the programme committee. The Tampere University of Technology in Pori, Finland, promoted the conference in its capacity as organizer: Professor Hannu Jaakkola acted as conference leader and Ms. Ulla Nevanranta as conference secretary. They took care both of the various practical aspects necessary for the smooth running of the conference and of arranging the conference proceedings in the form of a book.

The conference is sponsored by the City of Pori, Satakunnan Osuuskauppa, Satakunnan Puhelin, Secgo Software, Nokia, Ulla Tuominen Foundation and Japan Scandinavia Sasakawa Foundation. We gratefully appreciate the efforts of everyone who lent a helping hand.

We are convinced that the conference will prove to be productive and fruitful toward advancing the research and application of information modelling and knowledge bases.

The Editors
Hannu Jaakkola
Yasushi Kiyoki
Takahiro Tokuda
Programme Committee

Co-Chairs
Hannu Jaakkola, Tampere University of Technology, Pori, Finland
Hannu Kangassalo, University of Tampere, Finland
Yasushi Kiyoki, Keio University, Japan
Takahiro Tokuda, Tokyo Institute of Technology, Japan

Members
Akaishi Mina, University of Tokyo, Japan
Bielikova Maria, Slovak University of Technology, Slovakia
Brumen Boštjan, University of Maribor, Slovenia
Carlsson Christer, Åbo Akademi, Finland
Charrel Pierre-Jean, Université Toulouse2, France
Chen Xing, Kanagawa Institute of Technology, Japan
Ďuráková Daniela, VSB – Technical University Ostrava, Czech Republic
Duží Marie, VSB – Technical University of Ostrava, Czech Republic
Funyu Yutaka, Iwate Prefectural University, Japan
Haav Hele-Mai, Institute of Cybernetics, Estonia
Heimbürger Anneli, University of Jyväskylä, Finland
Henno Jaak, Tallinn Technical University, Estonia
Hosokawa Yoshihide, Nagoya Institute of Technology, Japan
Iivari Juhani, University of Oulu, Finland
Jaakkola Hannu, Tampere University of Technology, Pori, Finland
Kalja Ahto, Tallinn Technical University, Estonia
Kawaguchi Eiji, Kyushu Institute of Technology, Japan
Leppänen Mauri, University of Jyväskylä, Finland
Link Sebastian, Massey University, New Zealand
Mikkonen Tommi, Tampere University of Technology, Finland
Mirbel Isabelle, Université de Nice Sophia Antipolis, France
Multisilta Jari, Tampere University of Technology, Pori, Finland
Nilsson Jørgen Fischer, Denmark Technical University, Denmark
Oinas-Kukkonen Harri, University of Oulu, Finland
Palomäki Jari, Tampere University of Technology, Pori, Finland
Pokorny Jaroslav, Charles University Prague, Czech Republic
Richardsson Ita, University of Limerick, Ireland
Roland Hausser, Erlangen University, Germany
Sasaki Hideyasu, Ritsumeikan University, Japan
Suzuki Tetsuya, Shibaura Institute of Technology, Japan
Thalheim Bernhard, Kiel University, Germany
Tyrväinen Pasi, University of Jyväskylä, Finland
Vojtas Peter, Charles University Prague, Czech Republic
Wangler Benkt, Skoevde University, Sweden
Watanabe Yoshimichi, Yamanashi University, Japan
Yoshida Naofumi, Komazawa University, Japan
Yu Jeffery Xu, Chinese University of Hong Kong, Hong Kong

Organizing Committee
Professor Hannu Jaakkola, Tampere University of Technology, Pori, Finland
Dept. secretary Ulla Nevanranta, Tampere University of Technology, Pori, Finland
Professor Eiji Kawaguchi, Kyushu Institute of Technology, Japan

Steering Committee
Professor Eiji Kawaguchi, Kyushu Institute of Technology, Japan
Professor Hannu Kangassalo, University of Tampere, Finland
Professor Hannu Jaakkola, Tampere University of Technology, Pori, Finland
Professor Setsuo Ohsuga, Japan
Professor Marie Duží, VSB – Technical University of Ostrava, Czech Republic
Contents

Preface
    Hannu Jaakkola, Yasushi Kiyoki and Takahiro Tokuda
Programme Committee
Comparing the Use of Feature Structures in Nativism and in Database Semantics
    Roland Hausser
Multi-Criterion Search from the Semantic Point of View (Comparing TIL and Description Logic)
    Marie Duží and Peter Vojtáš
A Semantic Space Creation Method with an Adaptive Axis Adjustment Mechanism for Media Data Retrieval
    Xing Chen, Yasushi Kiyoki, Kosuke Takano and Keisuke Masuda
Storyboarding Concepts for Edutainment WIS
    Klaus-Dieter Schewe and Bernhard Thalheim
A Model of Database Components and Their Interconnection Based upon Communicating Views
    Stephen J. Hegner
Creating Multi-Level Reflective Reasoning Models Based on Observation of Social Problem-Solving in Infants
    Heikki Ruuska, Naofumi Otani, Shinya Kiriyama and Yoichi Takebayashi
CMO – An Ontological Framework for Academic Programs and Examination Regulations
    Richard Hackelbusch
Reusing and Composing Habitual Behavior in Video Browsing
    Akio Takashima and Yuzuru Tanaka
Concept Modeling in Multidisciplinary Research Environment
    Jukka Aaltonen, Ilkka Tuikkala and Mika Saloheimo
Extensional and Intensional Aspects of Conceptual Design
    Elvira Locuratolo and Jari Palomaki
Emergence of Language: Hidden States and Local Environments
    Jaak Henno
Frameworks for Intellectual Property Protection on Multimedia Database Systems
    Hideyasu Sasaki and Yasushi Kiyoki
Wavelet and Eigen-Space Feature Extraction for Classification of Metallography Images
    Pavel Praks, Marcin Grzegorzek, Rudolf Moravec, Ladislav Válek and Ebroul Izquierdo
Semantic Knowledge Modeling in Medical Laboratory Environment for Drug Usage: CASE Study
    Anne Tanttari, Kimmo Salmenjoki and Lorna Uden
Towards Automatic Construction of News Directory Systems
    Bin Liu, Pham Van Hai, Tomoya Noro and Takehiro Tokuda
A System Architecture for the 7C Knowledge Environment
    Teppo Räisänen and Harri Oinas-Kukkonen
Inquiry Based Learning Environment for Children
    Marjatta Kangassalo and Eva Tuominen
A Perspective Ontology and IS Perspectives
    Mauri Leppänen
The Improvement of Data Quality – A Conceptual Model
    Tatjana Welzer, Izidor Golob, Boštjan Brumen, Marjan Družovec, Ivan Rozman and Hannu Jaakkola
Knowledge Cluster Systems for Knowledge Sharing, Analysis and Delivery Among Remote Sites
    Koji Zettsu, Takafumi Nakanishi, Michiaki Iwazume, Yutaka Kidawara and Yasushi Kiyoki
A Formal Ontology for Business Process Model TAP: Tasks-Agents-Products
    Souhei Ito, Shigeki Hagihara and Naoki Yonezaki
A Proposal for Student Modelling Based on Ontologies
    Angélica de Antonio, Jaime Ramírez and Julia Clemente
Ontology-Based Support of Knowledge Evaluation in Higher Education
    Andrea Kő, András Gábor, Réka Vas and Ildikó Szabó
When Cultures Meet: Modelling Cross-Cultural Knowledge Spaces
    Anneli Heimbürger
Process Dimension of Concepts
    Vaclav Repa
E-Government: On the Way Towards Frameworks for Application Engineering
    Marie-Noëlle Terrasse, Marinette Savonnet, Eric Leclercq, George Becker, Thierry Grison, Laurence Favier and Carlo Daffara
A Personal Web Information/Knowledge Retrieval System
    Hao Han and Takehiro Tokuda
A Personal Information Protection Model for Web Applications by Utilizing Mobile Phones
    Michiru Tanaka, Jun Sasaki, Yutaka Funyu and Yoshimi Teshigawara
Manufacturing Roadmaps as Information Modelling Tools in the Knowledge Economy
    Augusta Maria Paci
Metadata Extraction and Retrieval Methods for Taste-Impressions with Bio-Sensing Technology
    Hanako Kariya and Yasushi Kiyoki
An Ontological Framework for Modeling Complex Cooperation Contexts in Organizations
    Bendoukha Lahouaria
Information Modelling and Knowledge Bases for Interoperability Solution in Security Area
    Ladislav Buřita and Vojtĕch Ondryhal
On the Construction of Ontologies Based on Natural Language Semantic
    Terje Aaberge
Author Index
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Comparing the Use of Feature Structures in Nativism and in Database Semantics

Roland Hausser
Universität Erlangen-Nürnberg
Abteilung Computerlinguistik (CLUE)
[email protected]

Abstract

Linguistics has always been a field with a great diversity of schools and sub-schools. This has naturally led to the question of whether different grammatical analyses of the same sentence are in fact equivalent or not. With the formalization of grammars as generative rule systems, beginning with the “Chomsky revolution” in the late nineteen fifties, it became possible to answer such questions in those fortunate instances in which the competing analyses were sufficiently formalized. An early example is the comparison of Context-Free Phrase Structure Grammar (CFPSG) and Bidirectional Categorial Grammar (BCG), which were shown to be weakly equivalent by Gaifman 1961. More recently, the question arose with respect to the language classes and the complexity hierarchies of Phrase Structure Grammar (PS-grammar) and of Left-Associative Grammar (LA-grammar), which were shown to be orthogonal to each other (TCS’92). Here we apply the question to the use of feature structures in contemporary schools of Nativism on the one hand, and in Database Semantics (DBS) on the other. The practical purpose is to determine whether or not the grammatical analyses of Nativism based on constituent structure can be used in Database Semantics.
1 Introduction: Constituent Structure in Nativism

In contemporary linguistics, most schools are based on constituent structure analysis. Examples are GB (Chomsky 1981), LFG (Bresnan ed. 1982), GPSG (Gazdar et al. 1985), and HPSG (Pollard and Sag 1987, 1994). Related schools are DCG (Pereira and Warren 1980), FUG (Kay 1992), TAG (Vijay-Shanker and Joshi 1988), and CG (Kay 2002). For historical reasons and because of their similar goals and methods, these schools may be jointly referred to as variants of Nativism.1 Constituent structure is defined in terms of phrase structure trees which fulfill the following conditions:

1.1 DEFINITION OF CONSTITUENT STRUCTURE

1. Words or constituents which belong together semantically must be dominated directly and exhaustively by a node.
2. The lines of a constituent structure may not cross (non-tangling condition).

1 Nativism is so-called because it aims at characterizing the speaker-hearer’s innate knowledge of language (competence) – excluding the use of language in communication (performance).
R. Hausser / Comparing the Use of Feature Structures in Nativism and in Database Semantics
According to this definition, the first of the following two phrase structure trees is a linguistically correct analysis, while the second is not:

1.2 CORRECT AND INCORRECT CONSTITUENT STRUCTURE ANALYSIS

correct:    [S [NP Julia] [VP [V knows] [NP John]]]
incorrect:  [S [SP [NP Julia] [V knows]] [NP John]]
There is common agreement among Nativists that the words knows and John belong more closely together semantically than the words Julia and knows.2 Therefore, only the tree on the left is accepted as a correct grammatical analysis. Formally, however, both phrase structure trees are equally well-formed. Moreover, the number of possible trees grows exponentially with the length of the sentence.3 The problem is that such a multitude of phrase structure trees for the same sentence would be meaningless linguistically, if they were all equally correct. It is for this reason that constituent structure as defined in 1.1 is crucial for phrase structure grammar (PS-grammar): constituent structure is the only known principle4 for excluding most of the possible trees. Yet it has been known at least since 1960 (cf. Bar-Hillel 1964, p. 102) that there are certain constructions of natural language, called “discontinuous elements,” which do not fulfill the definition of constituent structure. Consider the following examples:
1.3 CONSTITUENT STRUCTURE PARADOX: VIOLATING CONDITION 1

[S [NP Suzy] [VP [V looked] [NP [DET the] [N word]] [DE up]]]

Here the lines do not cross, satisfying the second condition of Definition 1.1. The analysis violates the first condition, however, because the semantically related expressions looked – up, or rather the nodes V (verb) and DE (discontinuous element) dominating them, are not exhaustively dominated by a node. Instead, the node directly dominating V and DE also dominates the NP the word.

2 To someone not steeped in Nativist linguistics, these intuitions may be difficult to follow. They are related to the substitution tests of Z. Harris, who was Chomsky’s teacher.
3 If loops like A → ... A are permitted in the rewrite rules, the number of different trees over a finite sentence is infinite!
4 Historically, the definition of constituent structure is fairly recent, based on the movement and substitution tests of American Structuralism in the nineteen thirties and forties.
1.4 CONSTITUENT STRUCTURE PARADOX: VIOLATING CONDITION 2

[S [NP Suzy] [VP [VP [V looked] [DE up]] [NP [DET the] [N word]]]]
(surface order: Suzy looked the word up; the line from the inner VP to DE crosses the line to the NP the word)
Here the semantically related subexpressions looked and up are dominated directly and exhaustively by a node, thus satisfying the first condition of Definition 1.1. The analysis violates the second condition, however, because the lines in the tree cross. Rather than giving up constituent structure as the intuitive basis of their analysis, the different schools of Nativism extended the formalism of context-free phrase structure with additional structures and mechanisms like transformations (Chomsky 1965), f-structures (Bresnan ed. 1982), meta-rules (Gazdar et al. 1985), constraints (Pollard and Sag 1987, 1994), the adjoining of trees (Vijay-Shanker and Joshi 1988), etc. In recent years, these efforts to extend the descriptive power of context-free phrase structure grammar have converged in the widespread use of recursive feature structures with unification. Consider the following example, which emphasizes what is common conceptually to the different variants of Nativism.
1.5 RECURSIVE FEATURE STRUCTURES AND UNIFICATION

phrase structure derivation:
    [S [NP Julia] [VP [V knows] [NP John]]]

lexical lookup:
    Julia:  [noun: Julia, num: sg, gen: fem]
    knows:  [verb: know, tense: pres, subj: , obj: ]
    John:   [noun: John, num: sg, gen: masc]

unification:
    [verb: know, tense: pres, subj: , obj: [noun: John, num: sg, gen: masc]]

result:
    [verb: know, tense: pres, subj: [noun: Julia, num: sg, gen: fem], obj: [noun: John, num: sg, gen: masc]]
As in 1.2 (correct tree), the analysis begins with the start symbol S, from which the phrase structure tree is derived by substituting NP and VP for S, etc., until the terminal nodes Julia, knows, and John are reached (phrase structure derivation). Next the terminal nodes are replaced by feature structures via lexical lookup. Finally, the lexical feature structures are unified (indicated by the dotted arrows), resulting in one big recursive feature structure (result). The order of unification mirrors the derivation of the phrase structure tree. On the one hand, the use of feature structures provides for many techniques which go beyond the context free phrase structure tree, such as a differentiated lexical analysis, structure sharing (a.k.a. token identity), a truth-conditional semantic interpretation based on lambda calculus, etc. On the other hand, this method greatly increases the mathematical complexity from polynomial to exponential or undecidable. Also, the constituent structure paradox, as a violation of Definition 1.1, remains.
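The unification step of 1.5 can be illustrated with a minimal Python sketch. It is not the formalism of any particular Nativist school: the function name unify, the encoding of feature structures as nested dicts, and the convention that an empty string marks an unfilled value are all assumptions made for illustration.

```python
def unify(a, b):
    """Unify two feature structures encoded as nested dicts (assumed encoding).

    An empty string stands for an unfilled attribute value.
    Returns the unified structure, or None if two atomic values clash.
    """
    if a == "":
        return b
    if b == "":
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None  # clash somewhere below this attribute
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else None


# Lexical feature structures as in 1.5.
julia = {"noun": "Julia", "num": "sg", "gen": "fem"}
john = {"noun": "John", "num": "sg", "gen": "masc"}
know = {"verb": "know", "tense": "pres", "subj": "", "obj": ""}

# The order of unification mirrors the phrase structure derivation:
# first V with the object NP (VP), then the VP with the subject NP (S).
vp = unify(know, {"obj": john})
s = unify(vp, {"subj": julia})
print(s)
```

The result is one recursive structure in which the noun structures are embedded as values of subj and obj, which is the property that the flat proplets of Database Semantics avoid.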
2 Elimination of Constituent Structure in LA-grammar

Instead of maintaining constituent structure analysis when it is possible (e.g. 1.2, correct tree) and taking exception to it when it is not (e.g. 1.3), Left-Associative Grammar completely abandoned constituent structure as defined in 1.1 by adopting another, more basic principle. This principle is the time-linear structure of natural language – in accordance with de Saussure’s 1913/1972 second law (principe seconde). Time-linear means linear like time and in the direction of time. Consider the following reanalysis of Example 1.2 within Left-Associative Grammar (LA-grammar) as presented in NEWCAT’86:
2.1 TIME-LINEAR ANALYSIS OF Julia knows John IN LA-GRAMMAR

Julia knows John (v)
    Julia knows (a’ v)  +  John (nm)
        Julia (nm)  +  knows (s3’ a’ v)
Given an input sentence or a sequence of input sentences (text), LA-grammar always combines a “sentence start,” e.g. Julia, and a “next word,” e.g. knows, into a new sentence start, e.g. Julia knows. This time-linear procedure starts with the first word and continues until there is no more next word available in the input. In LA-grammar, the intuitions about what “belongs semantically together” (which underlie the definition of constituent structure 1.1) are reinterpreted in terms of functor-argument structure and coded in categories which are defined as lists of one or more category segments. For example, in 2.1 the category segment nm (for name) of Julia cancels the first valency position s3’ (for nominative singular third person) of the category (s3’ a’ v) of knows, whereby Julia serves as the argument and knows as the functor. Then the resulting sentence start Julia knows of the category (a’ v) serves as the functor and John as the argument. The result is a complete sentence, represented as a verb without unfilled valency positions, i.e., as the category (v). Next consider the time-linear reanalysis of the example with a discontinuous element (cf. 1.3 and 1.4):
2.2 TIME-LINEAR ANALYSIS OF Suzy looked the word up

Suzy looked the word up (v)
    Suzy looked the word (up’ v)  +  up (up)
        Suzy looked the (nn’ up’ v)  +  word (nn)
            Suzy looked (a’ up’ v)  +  the (nn’ np)
                Suzy (nm)  +  looked (n’ a’ up’ v)
Here the discontinuous element up is treated like a valency filler for the valency position up’ in the lexical category (n’ a’ up’ v) of looked. Note the strictly time-linear addition of the “constituent” the word: the article the has the category (nn’ np) such that the category segment np cancels the valency position a’ in the category (a’ up’ v) of Suzy looked, while the category segment nn’ is added in the result category (nn’ up’ v) of Suzy looked the. In this way, the obligatory addition of a noun after the addition of a determiner is ensured. The time-linear analysis of LA-grammar is based on formal rules which compute possible continuations. Consider the following example (explanations in italics):
2.3 EXAMPLE OF AN LA-GRAMMAR RULE APPLICATION

(i) rule name    (ii) ss    (iii) nw            (iv) ss’       (v) RP
Nom+Fverb:       (NP)       (NP’ X V)     ⇒     (X V)          {Fverb+Main, ...}
                  |          |                   |             matching and binding
                 (nm)       (s3’ a’ v)          (a’ v)
                 Julia      knows               Julia knows
An LA-grammar rule consists of (i) a rule name, here Nom+Fverb, (ii) a pattern for the sentence start ss, here (NP), (iii) a pattern for the next word nw, here (NP’ X V), (iv) a pattern for the resulting sentence start ss’, here (X V), and (v) a rule package RP, here {Fverb+Main, ...}. The patterns for (ii) ss, (iii) nw, and (iv) ss’ are coded by means of restricted variables, which are matched and vertically bound with corresponding category segments of the language input. For example, in 2.3 the variable NP at the rule level is bound to the category segment nm at the language level, the variable NP’ is bound to the category segment s3’, etc. If the matching of variables fails with respect to an input (because a variable restriction is violated), the rule application fails. If the matching of variables is successful, the categorial operation (represented by (ii) ss, (iii) nw, and (iv) ss’) is performed and a new sentence start is derived. That the categorial operation defined at the rule level can be executed at the language level is due to the vertical binding of the rule level variables to language level constants. After the successful application of an LA-grammar rule, the rules in its (v) rule package RP are applied to the resulting sentence start (iv) ss’ and a new next word. A crucial property of LA-grammar rules is that they have an external interface, defined in terms of the rule level variables and their vertical matching with language level category segments. This is in contradistinction to the rewrite rules of phrase structure grammar: they do not have any external interface because all phrase structure trees are derived from the same initial S node, based on the principle of possible substitutions.
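The time-linear derivations 2.1 and 2.2 can be reproduced with a small Python sketch. It is a simplification under stated assumptions: the table FILLS stands in for the variable restrictions, which the text does not list; the second rule fill_valency and its pattern are hypothetical; rule packages are omitted, so every rule is tried at every step; and the valency marker ’ is written as an ASCII apostrophe.

```python
# Assumed variable restrictions: which filler segments may cancel which
# valency positions. Not taken from the paper; illustrative only.
FILLS = {
    "s3'": {"nm"},
    "n'": {"nm"},
    "a'": {"nm", "np"},
    "nn'": {"nn"},
    "up'": {"up"},
}

# Lexical categories as lists of category segments (from 2.1 and 2.2).
LEXICON = {
    "Julia": ("nm",), "John": ("nm",), "Suzy": ("nm",),
    "knows": ("s3'", "a'", "v"),
    "looked": ("n'", "a'", "up'", "v"),
    "the": ("nn'", "np"),
    "word": ("nn",),
    "up": ("up",),
}


def nom_fverb(ss, nw):
    """(NP) + (NP' X V) => (X V): ss fills the first valency of the verb."""
    if len(ss) == 1 and len(nw) > 1 and ss[0] in FILLS.get(nw[0], ()):
        return nw[1:]
    return None


def fill_valency(ss, nw):
    """Hypothetical rule (A' X) + (Y A) => (Y X): nw fills the first valency of ss."""
    if ss and nw[-1] in FILLS.get(ss[0], ()):
        return nw[:-1] + ss[1:]
    return None


RULES = (nom_fverb, fill_valency)


def parse(words):
    """Combine sentence start and next word strictly from left to right."""
    surface, cat = words[0], LEXICON[words[0]]
    steps = [(surface, cat)]
    for word in words[1:]:
        nw = LEXICON[word]
        for rule in RULES:
            new_cat = rule(cat, nw)
            if new_cat is not None:
                break
        else:
            raise ValueError(f"no rule applies at {word!r}")
        surface, cat = f"{surface} {word}", new_cat
        steps.append((surface, cat))
    return steps


for surface, cat in parse("Suzy looked the word up".split()):
    print(surface, cat)
```

Each step yields a new sentence start with its category, ending in ("v",) for a complete sentence. A fuller version would attach a rule package to each rule so that only the rules licensed by the previous step are tried.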
3 From LA-grammar to Database Semantics

The external interfaces of LA-grammar rules, originally introduced for computing the possible continuations of a time-linear derivation, open the transition from a sign-oriented approach to an agent-oriented approach of natural language analysis.5 While a sign-oriented approach analyses sentences in isolation, an agent-oriented approach analyses sentences as a means to transfer information from the mind of the speaker to the mind of the hearer.

In Database Semantics, LA-grammar is used for an agent-oriented approach to linguistics which aims at building an artificial cognitive agent (talking robot). This requires the design of (i) interfaces for recognition and action, (ii) a data structure suitable for storing and retrieving content, and (iii) an algorithm for (a) reading content in during recognition, (b) processing content during thought, and (c) reading content out during action. Moreover, the data structure must represent non-verbal cognition at the context level as well as verbal cognition at the language level. Finally, the two levels must interact in such a way as to model the speaker mode (mapping from the context level to the language level) and the hearer mode (mapping from the language level to the context level). Consider the representation of these requirements in the following schema:
3.1 STRUCTURING CENTRAL COGNITION IN AGENTS WITH LANGUAGE

[Figure: schema of a cognitive agent and external reality. Labels: Cognitive Agent; External Reality; central cognition; peripheral cognition; sign recognition; sign synthesis; context recognition; context action; language component; pragmatics; context component; theory of grammar; theory of language.]
The interfaces of recognition and action are based on pattern matching. At the context level, the patterns are defined as concepts, which are also used for coding and storing content. At the language level, the concepts of the context level are reused as the literal meanings of content words. In this way, the lexical semantics is based on procedurally defined concepts rather than the metalanguage definitions of a truth-conditional semantics (cf. NLC’06, Chapter 2 and Section 6.2). The data structure for coding and storing content at the context level is based on flat (nonrecursive) feature structures called proplets (in analogy to “droplets”). Proplets are so-called because they serve as the basic elements of concatenated propositions. Consider the following example showing the simplified proplets representing the content resulting from an agent perceiving a barking dog (recognition) and running away (action):
3.2 CONTEXT PROPLETS REPRESENTING dog barks. (I) run.

[sur: , noun: dog, fnc: bark, prn: 22]
[sur: , verb: bark, arg: dog, nc: 23 run, prn: 22]
[sur: , verb: run, arg: moi, pc: 22 bark, prn: 23]

5 Clark 1996 distinguishes between the language-as-product and the language-as-action traditions.
The semantic relation between the first two proplets is intrapropositional functor-argument structure, and is coded as follows: The first proplet with the core feature [noun: dog] specifies the associated functor with the intrapropositional continuation feature [fnc: bark], while the second proplet with the core feature [verb: bark] specifies its associated argument with [arg: dog] (bidirectional pointering). That the first and the second proplet belong to the same proposition is indicated by their having the same prn (proposition number) value, namely 22.

The semantic relation between the second and the third proplet is extrapropositional coordination. That these two proplets belong to different propositions is indicated by their having different prn values, namely 22 and 23, respectively. Their coordination relation is coded in the second proplet by the extrapropositional continuation feature [nc: 23 run] and in the third proplet by [pc: 22 bark], whereby the attributes nc and pc stand for “next conjunct” and “previous conjunct,” respectively. The values of the nc and pc attributes are the proposition number and the core value of the verb of the coordinated proposition.

By coding the semantic relations between proplets solely in terms of attributes and their values, proplets can be stored and retrieved according to the needs of one’s database, without any of the graphical restrictions induced by phrase structure trees. Furthermore, by using similar proplets at the levels of language and context, the matching between the two levels during language interpretation (hearer mode) and language production (speaker mode) is structurally straightforward. Consider the following example in which the context level content of 3.2 is matched with corresponding language proplets containing German surfaces:
3.3 MATCHING BETWEEN THE LANGUAGE AND THE CONTEXT LEVEL

language level (horizontal relations):
    [sur: Hund, noun: dog, fnc: bark, prn: 122]
    [sur: bellt, verb: bark, arg: dog, nc: 123 run, prn: 122]
    [sur: fliehe, verb: run, arg: moi, pc: 122 bark, prn: 123]

internal matching (vertical relations)

context level (horizontal relations):
    [sur: , noun: dog, fnc: bark, prn: 22]
    [sur: , verb: bark, arg: dog, nc: 23 run, prn: 22]
    [sur: , verb: run, arg: moi, pc: 22 bark, prn: 23]
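As a rough illustration of why flat proplets are easy to store and retrieve, the context proplets of 3.2 can be sketched in Python as plain dicts held in a list. The encoding (an empty string for an empty value, integers for prn) and the helper names core, proposition, functor_of and next_conjunct are assumptions for this sketch, not part of Database Semantics as published.

```python
# Context proplets of 3.2 as flat, non-recursive records (assumed encoding).
CONTEXT = [
    {"sur": "", "noun": "dog", "fnc": "bark", "prn": 22},
    {"sur": "", "verb": "bark", "arg": "dog", "nc": "23 run", "prn": 22},
    {"sur": "", "verb": "run", "arg": "moi", "pc": "22 bark", "prn": 23},
]


def core(proplet):
    """Return the core value, i.e. the value of the noun or verb attribute."""
    return proplet.get("noun") or proplet.get("verb")


def proposition(store, prn):
    """All proplets sharing one proposition number."""
    return [p for p in store if p["prn"] == prn]


def functor_of(store, noun_proplet):
    """Follow the fnc value of a noun to the verb proplet of the same proposition."""
    for p in store:
        if p["prn"] == noun_proplet["prn"] and p.get("verb") == noun_proplet.get("fnc"):
            return p
    return None


def next_conjunct(store, verb_proplet):
    """Follow the nc value ('<prn> <verb>') to the coordinated verb proplet."""
    nc = verb_proplet.get("nc", "")
    if not nc:
        return None
    prn, verb = nc.split()
    for p in store:
        if p["prn"] == int(prn) and p.get("verb") == verb:
            return p
    return None


dog = CONTEXT[0]
bark = functor_of(CONTEXT, dog)      # intrapropositional relation via fnc
run = next_conjunct(CONTEXT, bark)   # extrapropositional relation via nc
print(core(dog), core(bark), core(run))
```

Because every relation is an attribute value rather than a position in a tree, the order of the records in the store does not matter; any lookup reduces to comparing attribute values.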
The proplets at the language and the context level are alike except that the sur (surface) attributes of context proplets have an empty value, while those of the language proplets have a language-dependent surface, e.g. Hund, as value. On both levels, the intra- and extrapropositional relations are coded by means of attribute values (horizontal relations, indicated by dotted lines). The reference relation between corresponding proplets at the two levels, in contrast, is based on matching (vertical relations, indicated by double arrows). Simply speaking, the matching between a language and a context proplet is successful if they have the same attributes and their values are compatible. Even though the vertical matching takes place between individual proplets, the horizontal semantic relations holding between the proplets at each of the two levels are taken into account as well. Assume, for example, that the noun proplet dog at the language level has the fnc value bark, while the corresponding proplet at the context level had the fnc value sleep. In this case, the two proplets would be vertically incompatible – due to their horizontal relations to different verbs, coded as different values of their respective fnc attributes. Having described the data structure of Database Semantics, let us turn next to its algorithm. For natural language communication, the time-linear algorithm of LA-grammar is used in three different variants: (i) in the hearer mode, an LA-hear grammar interprets sentences of natural language as sets of proplets ready to be stored in the database of the cognitive agent,
R. Hausser / Comparing the Use of Feature Structures in Nativism and in Database Semantics
(ii) in the think mode, an LA-think grammar navigates along the semantic relations between proplets, and (iii) in the speaker mode an LA-speak grammar verbalizes the proplets traversed in the think mode as surfaces of a natural language. Consider the following LA-hear derivation of Julia knows John in Database Semantics.
3.4 TIME-LINEAR HEARER-MODE ANALYSIS OF Julia knows John
lexical lookup
  Julia: [noun: Julia | fnc: | prn: ]
  knows: [verb: know | arg: | prn: ]
  John:  [noun: John | fnc: | prn: ]
syntactic-semantic parsing
  1  sentence start: [noun: Julia | fnc: | prn: 1]
     next word:      [verb: know | arg: | prn: ]
  2  sentence start: [noun: Julia | fnc: know | prn: 1] [verb: know | arg: Julia | prn: 1]
     next word:      [noun: John | fnc: | prn: ]
result of syntactic-semantic parsing
  [noun: Julia | fnc: know | prn: 1] [verb: know | arg: Julia John | prn: 1] [noun: John | fnc: know | prn: 1]
This derivation is similar to 2.1 in that it is strictly time-linear. The differences are mostly in the format. While 2.1 must be read bottom up, 3.4 starts with the lookup of lexical proplets and must be read top down. Furthermore, while the ss and nw in 2.1 each consist of a surface and a category defined as a list, the ss and nw in 3.4 consist of proplets. Finally, while the output of 2.1 is the whole derivation (like a tree in a sign-oriented approach), the output of 3.4 is a set of proplets (cf. result) ready to be stored in the database. The rules of an LA-hear grammar have patterns for matching proplets rather than categories (as in 2.3). This is illustrated by the following example (explanations in italics):
3.5 EXAMPLE OF AN LA-hear RULE APPLICATION
rule level
  (i) rule name: NOM+FV
  (ii) ss-pattern: [noun: α | fnc: ]
  (iii) nw-pattern: [verb: β | arg: ]
  (iv) operations: copy α nw.arg; copy β ss.fnc
  (v) rule package: {FV+OBJ, ...}
matching and binding
proplet level
  ss: [noun: Julia | fnc: | prn: 1]
  nw: [verb: know | arg: | prn: ]
This rule resembles the one illustrated in 2.3 in that it consists of (i) a rule name, (ii) a pattern for the ss, (iii) a pattern of the nw, and (v) a rule package. It differs from 2.3, however, in that the resulting sentence start (iv) ss’ is replaced by a set of operations. During matching, the variables, here α and β, of the rule level are vertically bound to corresponding values at the proplet level. This is the basis for executing the rule level operations at the proplet level. In 3.5, the operations code the functor-argument relation between the subject and the verb by copying the core value of the noun into the arg slot of the verb and the core value of the verb into the fnc slot of the noun. In the schematic derivation 3.4, the copying is indicated by the arrows. The result of the rule application 3.5 is as follows:
3.6 RESULT OF THE LA-hear RULE APPLICATION SHOWN IN 3.5
  [noun: Julia | fnc: know | prn: 1] [verb: know | arg: Julia | prn: 1]
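The step from 3.5 to 3.6 can be sketched in a few lines. This is a minimal illustration, not the LA-hear implementation: proplets are assumed to be dictionaries, the arg slot is assumed to be a list, and the function name nom_fv is hypothetical. Assigning the prn value of the sentence start to the verb is an assumption read off the result 3.6; it is not among the operations listed in 3.5.

```python
def nom_fv(ss, nw):
    """Sketch of a NOM+FV-style rule: ss is a noun proplet, nw a verb proplet.

    Returns the new sentence start as a list of two proplets, or None if the
    patterns do not match."""
    # Matching: the ss-pattern needs a noun, the nw-pattern needs a verb.
    if "noun" not in ss or "verb" not in nw:
        return None
    # Binding: alpha and beta are bound to the core values.
    alpha, beta = ss["noun"], nw["verb"]
    new_ss, new_nw = dict(ss), dict(nw)
    new_nw["arg"] = list(nw["arg"]) + [alpha]  # copy alpha nw.arg
    new_ss["fnc"] = beta                       # copy beta ss.fnc
    # Assumption: the verb joins the proposition of the sentence start (cf. 3.6).
    new_nw["prn"] = ss["prn"]
    return [new_ss, new_nw]


JULIA = {"noun": "Julia", "fnc": "", "prn": 1}
KNOW = {"verb": "know", "arg": [], "prn": None}
RESULT_3_6 = nom_fv(JULIA, KNOW)
```

Under these assumptions, RESULT_3_6 contains the noun proplet with fnc value know and the verb proplet with arg value Julia, both with prn value 1.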
In the next time-linear combination, the current result serves as the sentence start, while lexical lookup provides the proplet John as the next word (cf. 3.4, line 2). The example with a discontinuous element (cf. 2.2 and 2.3) is reanalyzed in the hearer mode of Database Semantics as follows:
3.7 HEARER MODE ANALYSIS OF Suzy looked the word up
lexical lookup
  Suzy:   [noun: Suzy | fnc: | prn: ]
  looked: [verb: look a_1 | arg: | prn: ]
  the:    [noun: n_1 | fnc: | prn: ]
  word:   [noun: word | fnc: | prn: ]
  up:     [adj: up | mdd: | prn: ]
syntactic-semantic parsing
  1  sentence start: [noun: Suzy | fnc: | prn: 2]
     next word:      [verb: look a_1 | arg: | prn: ]
  2  sentence start: [noun: Suzy | fnc: look a_1 | prn: 2] [verb: look a_1 | arg: Suzy | prn: 2]
     next word:      [noun: n_1 | fnc: | prn: ]
  3  sentence start: [noun: Suzy | fnc: look a_1 | prn: 2] [verb: look a_1 | arg: Suzy n_1 | prn: 2] [noun: n_1 | fnc: look a_1 | prn: 2]
     next word:      [noun: word | fnc: | prn: ]
  4  sentence start: [noun: Suzy | fnc: look a_1 | prn: 2] [verb: look a_1 | arg: Suzy word | prn: 2] [noun: word | fnc: look a_1 | prn: 2]
  5  sentence start: [noun: Suzy | fnc: look a_1 | prn: 2] [verb: look a_1 | arg: Suzy word | prn: 2] [noun: word | fnc: look a_1 | prn: 2]
     next word:      [adj: up | mdd: | prn: ]
result of syntactic-semantic parsing
  [noun: Suzy | fnc: look up | prn: 2] [verb: look up | arg: Suzy word | prn: 2] [noun: word | fnc: look up | prn: 2]
One difference from the earlier LA-grammar analysis 2.2 is the handling of the determiner the. In its lexical analysis, the core value is the substitution value n_1. In line 2, this value is copied into the arg slot of look and the core value of look is copied into the fnc slot of the. In line 3, the core value of word is used to substitute all occurrences of the substitution value n_1, after which the nw proplet is discarded. This method is called function word absorption.
An inverse kind of function word absorption is the treatment of the discontinuous element up. It is lexically analyzed as a standard preposition with the core attribute adj (cf. NLC’06, Chapter 15). In line 5, this preposition is absorbed into the verb, based on a suitable substitution value. Thus, a sentence consisting of five words is represented by only three proplets.
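The substitution step of function word absorption (line 3 of 3.7) can be sketched as follows. This is an illustration under assumptions, not the DBS implementation: proplets are dictionaries, multi-valued slots are lists, and the helper name substitute is hypothetical. The absorption of up into the verb in line 5 is not covered by this sketch.

```python
def substitute(proplets, placeholder, value):
    """Return copies of the proplets in which every occurrence of the
    substitution value (e.g. n_1) is replaced by the given core value."""
    def swap(item):
        if isinstance(item, list):
            return [swap(element) for element in item]
        return value if item == placeholder else item

    return [{attr: swap(val) for attr, val in p.items()} for p in proplets]


# Sentence start of line 3 in 3.7, after the determiner has been added.
SS_LINE_3 = [
    {"noun": "Suzy", "fnc": "look a_1", "prn": 2},
    {"verb": "look a_1", "arg": ["Suzy", "n_1"], "prn": 2},
    {"noun": "n_1", "fnc": "look a_1", "prn": 2},
]
NW_WORD = {"noun": "word", "fnc": "", "prn": None}

# The core value of the next word replaces n_1; the nw proplet itself is
# then discarded, so the number of proplets does not grow.
SS_LINE_4 = substitute(SS_LINE_3, "n_1", NW_WORD["noun"])
```

The result still has three proplets, with word in the arg slot of the verb and as the core value of the former determiner proplet.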
4 The Cycle of Natural Language Communication
In Database Semantics, the proplets resulting from an LA-hear derivation are stored in alphabetically ordered token lines, called a word bank. Each token line begins with a concept, corresponding to the owner record of a classic network database, followed by all proplets containing this concept as their core value in the order of their occurrence, serving as the member records of a network database (cf. Elmasri and Navathe 1989). Consider the following example.
4.1 TRANSFER OF CONTENT FROM THE SPEAKER TO THE HEARER
sign: Julia knows John
hearer (left): key-word-based storage
  John:  [noun: John | fnc: know | prn: 1]
  Julia: [noun: Julia | fnc: know | prn: 1]
  know:  [verb: know | arg: Julia John | prn: 1]
speaker (right): retrieval-based navigation
  John:  [noun: John | fnc: know | prn: 1]
  Julia: [noun: Julia | fnc: know | prn: 1]
  know:  [verb: know | arg: Julia John | prn: 1]
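The key-word-based storage of 4.1 can be sketched as a dictionary of token lines. The class below is a hypothetical illustration, not the word bank of Database Semantics: the class name WordBank, its methods, and the tuple CORE_ATTRS are assumptions, and the owner records are simply the dictionary keys.

```python
from collections import defaultdict

# Assumption: the core attribute of a proplet is one of these three.
CORE_ATTRS = ("noun", "verb", "adj")


def core_value(proplet):
    """Return the value of the proplet's core attribute."""
    for attr in CORE_ATTRS:
        if attr in proplet:
            return proplet[attr]
    raise ValueError("proplet without core attribute")


class WordBank:
    """Token lines keyed by core value; each line keeps order of arrival."""

    def __init__(self):
        self.token_lines = defaultdict(list)

    def store(self, proplet):
        self.token_lines[core_value(proplet)].append(proplet)

    def owners(self):
        # Alphabetical ordering of the token lines.
        return sorted(self.token_lines)

    def retrieve(self, core, prn):
        for proplet in self.token_lines.get(core, []):
            if proplet["prn"] == prn:
                return proplet
        return None


BANK = WordBank()
for p in (
    {"noun": "Julia", "fnc": "know", "prn": 1},
    {"verb": "know", "arg": ["Julia", "John"], "prn": 1},
    {"noun": "John", "fnc": "know", "prn": 1},
):
    BANK.store(p)
```

Although the proplets are stored in the order Julia, know, John, the token lines are listed alphabetically, and each proplet can be retrieved by its core value and prn value.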
The word bank of the agent in the hearer mode (left) shows the token lines resulting from the LA-hear derivation 3.4. Due to the alphabetical ordering of the token lines, the sequencing of the proplets resulting from the LA-hear derivation is lost. Nevertheless, the semantic relations between them are maintained, due to their common prn value and the coding of the functor-argument structure in terms of attributes and values. The word bank of the agent in the speaker mode (right) contains the same proplets as the word bank on the left. Here a linear order is reintroduced by means of a navigation along the semantic relations defined between the proplets. This navigation from one proplet to the next serves as a model of thought and as the conceptualization of the speaker, i.e., as the specification of what to say and how to say it. The navigation from one proplet to the next is powered by an LA-think grammar. Consider the following rule application:
4.2 EXAMPLE OF AN LA-think RULE APPLICATION
rule level
  (i) rule name: V_N_V
  (ii) ss pattern: [verb: β | arg: X α Y | prn: k]
  (iii) nw pattern: [noun: α | fnc: β | prn: k]
  (iv) operations: output position ss; mark α ss
  (v) rule package: {V_N_V, ...}
matching and binding
proplet level
  ss: [verb: know | arg: Julia John | prn: 1]
By binding the variables β, α, and k to know, Julia, and 1, respectively, the next word pattern is specified at the rule level such that the retrieval mechanism of the database can retrieve (navigate to, traverse, activate, touch) the correct continuation at the proplet level:
4.3 RESULT OF THE LA-think RULE APPLICATION
  [verb: know | arg: !Julia John | prn: 1] [noun: Julia | fnc: know | prn: 1]
In order to prevent repeated traversal of the same proplet,⁶ the arg value currently retrieved is marked with “!” (cf. NLC’06, p. 44). The autonomous navigation through the content of a word bank, powered by the rules of an LA-think grammar, is used not only for conceptualization in the speaker mode, but also for inferencing and reasoning in general.
Providing a data structure suitable to (i) support navigation was one of the four main motivations for changing from the NEWCAT’86 notation of LA-grammar illustrated in 2.1, 2.2, and 2.3 to the NLC’06 notation illustrated in 3.4, 3.5, and 3.7. The other three motivations are (ii) the matching between the levels of language and context (cf. 3.3), (iii) a more detailed specification of lexical items, and (iv) a descriptively more powerful and more transparent presentation of the semantic relations, i.e., functor-argument structure, coordination, and coreference.
A conceptualization defined as a time-linear navigation through content makes language production relatively straightforward: if the speaker decides to communicate a navigation to the hearer, the core values of the proplets traversed by the navigation are translated into their language-dependent counterparts and realized as external signs. In addition to this language-dependent lexicalization of the universal navigation, the language production system must provide language-dependent
1. word order,
2. function word precipitation (as the inverse of function word absorption), and
3. word form selection for proper agreement.
These tasks are handled by language-dependent LA-speak grammars in combination with language-dependent word form production. As an example of handling word order, consider the production of the sentence Julia knows John from the set of proplets derived in 3.4:
4.4 PROPLETS UNDERLYING LANGUAGE PRODUCTION
  [verb: know | arg: Julia John | prn: 1] [noun: Julia | fnc: know | prn: 1] [noun: John | fnc: know | prn: 1]
⁶ Relapse, see tracking principles, FoCL’99, p. 454.
Assuming that the navigation traverses the set by going from the verb to the subject noun to the object noun, the resulting sequence may be represented abstractly as VNN. The navigation starts with the verb rather than the subject because the connection between propositions is coded by the nc and pc features of the verb (cf. 3.2 and NLC’06, Appendix A2). Assuming that n stands for a name, fv for a finite verb, and p for punctuation, the time-linear derivation of an abstract n fv n p surface from a VNN proplet sequence is based on the following incremental realization:
4.5 SCHEMATIC PRODUCTION OF Julia knows John.
       activated sequence    realization
i      V
i.1         n                n
       V    N
i.2    fv   n                n fv
       V    N
i.3    fv   n    n           n fv n
       V    N    N
i.4    fv p n    n           n fv n p
       V    N    N
In line i.1, the derivation begins with a navigation from V to N, based on LA-think. Also, the N proplet is realized as the n Julia by LA-speak. In line i.2, the V proplet is realized as the fv knows by LA-speak. In line i.3, LA-think continues the navigation to the second N proplet, which is realized as the n John by LA-speak. In line i.4, finally, LA-speak realizes the p . from the V proplet. This method can be used to realize not only a subject–verb–object surface (SVO) as in the above example, but also an SOV and (trivially) a VSO surface. It is based on the following principles:
4.6 PRINCIPLES FOR REALIZING SURFACES FROM A PROPLET SEQUENCE
• Earlier surfaces may be produced from later proplets. Example: The initial n surface is achieved by realizing the second proplet in the activated VN sequence first (cf. line i.1 in 4.5 above).
• Later surfaces may be produced from earlier proplets. Example: The final punctuation p (full stop) is realized from the first proplet in the VNN sequence (cf. line i.4 in 4.5 above).
Next consider the derivation of Suzy looked the word up., represented as an abstract n fv d nn de p surface, whereby n stands for a name, fv for a finite verb, d for a determiner, nn for a noun, de for a discontinuous element, and p for punctuation.
4.7 SCHEMATIC PRODUCTION OF Suzy looked the word up.
       activated sequence     realization
i      V
i.1             n             n
       V        N
i.2    fv       n             n fv
       V        N
i.3    fv       n    d        n fv d
       V        N    N
i.4    fv       n    d nn     n fv d nn
       V        N    N
i.5    fv de    n    d nn     n fv d nn de
       V        N    N
i.6    fv de p  n    d nn     n fv d nn de p
       V        N    N
This derivation of an abstract n fv d nn de p surface from an underlying VNN navigation shows two⁷ instances of function word precipitation: (i) of the determiner the from the second N proplet, and (ii) of the discontinuous element up from the initial V proplet.
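The bookkeeping behind the schematic productions 4.5 and 4.7 can be sketched as follows. This is not an LA-speak grammar: the realization steps are simply listed by hand as pairs of a proplet position in the VNN sequence and a surface category, which is a hypothetical encoding chosen for illustration. The sketch only shows how the time-linear surface order and the grouping of surfaces by source proplet come apart.

```python
def realize(steps, n_proplets):
    """Replay realization steps given as (proplet index, surface category).

    Returns the surface in time-linear order and, for each proplet of the
    activated sequence, the surfaces realized from it."""
    surface = []
    per_proplet = [[] for _ in range(n_proplets)]
    for proplet_index, category in steps:
        surface.append(category)
        per_proplet[proplet_index].append(category)
    return surface, per_proplet


# Positions in the VNN sequence: 0 = V, 1 = first N, 2 = second N.
STEPS_4_5 = [(1, "n"), (0, "fv"), (2, "n"), (0, "p")]
STEPS_4_7 = [(1, "n"), (0, "fv"), (2, "d"), (2, "nn"), (0, "de"), (0, "p")]

SURFACE_4_7, GROUPS_4_7 = realize(STEPS_4_7, 3)
```

For 4.7, the surface order is n fv d nn de p, while the grouping by proplet is fv de p for V, n for the first N, and d nn for the second N.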
5 “Constituent Structure” in Database Semantics?
The correlation of the activated VNN sequence and the associated surfaces shown in line i.6 (left) of 4.7 may be spelled out more specifically as follows:
5.1 SURFACES REALIZED FROM PROPLETS IN A TRAVERSED SEQUENCE
proplet                                      categories   surfaces
[verb: look up | arg: Suzy word | prn: 1]    fv de p      look up .
[noun: Suzy | fnc: look up | prn: 1]         n            Suzy
[noun: word | fnc: look up | prn: 1]         d nn         the word
This structure is like a constituent structure insofar as what belongs together semantically (cf. 1.1, condition 1) is realized from a single proplet. Like a deep structure in Chomsky 1965, however, the sequence fv de p n d nn of 5.1 does not constitute a well-formed surface. What is needed here is a transition to the well-formed surface sequence n fv d nn de p:
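5.2 SURFACE ORDER RESULTING FROM AN INCREMENTAL REALIZATION
category   surface   realized from
n          Suzy      [noun: Suzy | fnc: look up | prn: 1]
fv         look      [verb: look up | arg: Suzy word | prn: 1]
d          the       [noun: word | fnc: look up | prn: 1]
nn         word      [noun: word | fnc: look up | prn: 1]
de         up        [verb: look up | arg: Suzy word | prn: 1]
p          .         [verb: look up | arg: Suzy word | prn: 1]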
Instead of using a direct mapping like a transformation, Database Semantics establishes the correlation between the “deep” fv de p n d nn sequence 5.1 and the “surface” n fv d nn de p sequence 5.2 by means of a time-linear LA-think navigation with an associated incremental LA-speak surface realization, as shown schematically in 4.7 (for the explicit definition of the complete DBS1 and DBS2 systems of Database Semantics see NLC’06, Chapters 11–14). Note, however, that this “rediscovery” of constituent structure in the speaker mode of Database Semantics applies to the intuitions supported by the substitution and movement tests by Bloomfield 1933 and Harris 1951 (cf. FoCL’99, p. 155 f.), but not to the formal Definition 1.1 based on phrase structure trees. Nevertheless, given the extensive linguistic literature within phrase-structure-based Nativism, let us consider the possibility of translating formal constituent structures into proplets of Database Semantics.
⁷ Actually, there is a third instance, namely the precipitation of the punctuation p from the V proplet.
6 On Mapping Phrase Structure Trees into Proplets
Any context-free phrase structure tree may be translated into a recursive feature structure. A straightforward procedure is to define each node in the tree as a feature structure with the attributes node, up, and down. The value of the attribute node is the name of a node in the tree, for example node: S. The value of the attribute up specifies the next higher node, while the value of the attribute down specifies the next lower nodes. The linear precedence in the tree is coded over the order of the down values. Furthermore, the root node S is formally characterized by having an empty up value, while the terminal nodes are formally characterized by having empty down values. Consider the following example of systematically recoding the phrase structure tree 1.2 (correct) as a recursive feature structure:
6.1 RECODING A TREE AS A RECURSIVE FEATURE STRUCTURE
[node: S
 up:
 down: [node: NP
        up: S
        down: [node: Julia
               up: NP
               down: ]]
       [node: VP
        up: S
        down: [node: V
               up: VP
               down: [node: knows
                      up: V
                      down: ]]
              [node: NP
               up: VP
               down: [node: John
                      up: NP
                      down: ]]]]
The translation of a phrase structure tree into a recursive feature structure leaves ample room for additional attributes, e.g., phon or synsem, as used by the various schools of Nativism. Furthermore, the recursive feature structure may be recoded as a set of non-recursive feature structures, i.e., proplets. The procedure consists in recursively replacing each value consisting of a feature structure by its elementary node value, as shown below:
6.2 RECODING 6.1 AS A SET OF PROPLETS
non-terminal nodes
  [node: S | up: | down: NP VP] [node: NP | up: S | down: Julia] [node: VP | up: S | down: V NP] [node: V | up: VP | down: knows] [node: NP | up: VP | down: John]
terminal nodes
  [node: Julia | up: NP | down: ] [node: knows | up: V | down: ] [node: John | up: NP | down: ]
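The two recoding steps, from a tree to the recursive feature structure 6.1 and from there to the flat proplets 6.2, can be sketched as follows. The nested-tuple encoding of the tree and the function names are assumptions made for illustration; the tree itself is read off 6.1.

```python
# Hypothetical encoding of the tree: (node name, list of subtrees).
TREE = (
    "S",
    [
        ("NP", [("Julia", [])]),
        ("VP", [("V", [("knows", [])]), ("NP", [("John", [])])]),
    ],
)


def to_feature_structure(tree, up=""):
    """Recode a tree as a recursive feature structure (cf. 6.1)."""
    name, children = tree
    return {
        "node": name,
        "up": up,
        "down": [to_feature_structure(child, name) for child in children],
    }


def to_proplets(fs):
    """Flatten a recursive feature structure into proplets (cf. 6.2) by
    replacing each embedded feature structure with its node value."""
    flat = [{
        "node": fs["node"],
        "up": fs["up"],
        "down": [child["node"] for child in fs["down"]],
    }]
    for child in fs["down"]:
        flat.extend(to_proplets(child))
    return flat


PROPLETS_6_2 = to_proplets(to_feature_structure(TREE))
```

The result contains eight non-recursive proplets; the order of the down values preserves the linear precedence of the tree, the root has an empty up value, and the terminal nodes have empty down values.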
Formally, these proplets may be stored and retrieved in a word bank like the ones shown in Example 4.1. The mapping from phrase structure trees to recursive feature structures (e.g., 6.1) to sets of proplets (e.g., 6.2) is not symmetric, however, because there are structures which can be easily coded as a set of proplets, but have no natural representation as a phrase structure tree. This applies, for instance, to a straight line, as in the following example:
6.3 GRAPHICAL REPRESENTATION OF A LINE
H --- I --- J --- K
Such a line has no natural representation as a phrase structure tree, but it does as a set of proplets, as in the following definition:
6.4 RECODING THE LINE 6.3 AS A SET OF PROPLETS
start:         [line: H | prev: | next: I]
intermediate:  [line: I | prev: H | next: J] [line: J | prev: I | next: K]
finish:        [line: K | prev: J | next: ]
The beginning of the line is characterized by the unique proplet with an empty prev attribute, while the end is characterized by the unique proplet with an empty next attribute.⁸ Proplets of this kind are used in Database Semantics for the linguistic analysis of coordination.
The asymmetry between the expressive power of phrase structure trees and proplets must be seen in light of the fact that the language and complexity hierarchy of substitution-based phrase structure grammar (also called the Chomsky hierarchy) is orthogonal to the language and complexity hierarchy of time-linear LA-grammar (cf. TCS’92 and FoCL’99, Part II). For example, while the formal languages a^k b^k and a^k b^k c^k are in different complexity classes in phrase structure grammar, namely polynomial versus exponential, they are in the same class in LA-grammar, namely linear. Conversely, while the formal languages a^k b^k and HCFL are in the same complexity class in phrase structure grammar, namely polynomial, they are in different classes in LA-grammar, namely linear versus exponential.
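Returning to the line of 6.4, the role of the empty prev and next values can be sketched in code. The dictionary encoding and the function name traverse are assumptions made for illustration. The second data set encodes a closed ring of four proplets in which no prev or next value is empty, so that no unique starting point exists.

```python
LINE = [
    {"line": "H", "prev": "", "next": "I"},
    {"line": "I", "prev": "H", "next": "J"},
    {"line": "J", "prev": "I", "next": "K"},
    {"line": "K", "prev": "J", "next": ""},
]

# Hypothetical ring: every proplet has both a predecessor and a successor.
CIRCLE = [
    {"arc": "H", "prev": "K", "next": "I"},
    {"arc": "I", "prev": "H", "next": "J"},
    {"arc": "J", "prev": "I", "next": "K"},
    {"arc": "K", "prev": "J", "next": "H"},
]


def traverse(proplets, key):
    """Follow the next values from the unique proplet with an empty prev.

    Returns the names in order, or None if there is no unique start."""
    by_name = {p[key]: p for p in proplets}
    starts = [p for p in proplets if not p["prev"]]
    if len(starts) != 1:
        return None
    order, seen, current = [], set(), starts[0]
    while current is not None and current[key] not in seen:
        order.append(current[key])
        seen.add(current[key])
        current = by_name.get(current["next"])  # empty next ends the walk
    return order
```

For the line, the traversal yields H, I, J, K; for the ring, the function returns None because no proplet has an empty prev value.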
⁸ Another structure unsuitable for representation as a phrase structure is a circle (figure: a circle through the points H, I, J, and K). There is no natural beginning and no natural end, as shown by the following definition as a set of proplets:
  [arc: H | prev: K | next: I] [arc: I | prev: H | next: J] [arc: J | prev: I | next: K] [arc: K | prev: J | next: H]
In this set, none of the proplets has an empty prev or next attribute, thus aptly characterizing the essential nature of a circle as compared to a line (cf. Example 6.4).
7 Possibilities of Constructing Equivalences
Regarding the use of feature structures, the most obvious difference between Nativism and Database Semantics is that Nativism uses recursive feature structures (cf. 1.5), while Database Semantics uses flat feature structures (cf. results in 3.4 and 3.7). The recursive feature structures of Nativism are motivated by the constituent structure of the associated phrase structure trees, while the flat feature structures (proplets) of Database Semantics are motivated by the task of providing (i) a well-defined matching procedure between the language and the context level (cf. 3.3) and (ii) a time-linear storage of content in the hearer mode, a time-linear navigation in the think mode, and a time-linear production in the speaker mode (cf. 4.1).
These differences do not in themselves preclude the possibility of equivalences between the two systems, however. Given our purpose to discover common ground, we found that phrase structure trees and the associated recursive feature structures (cf. 6.1) can be systematically translated into equivalent sets of proplets (cf. 6.2), thus providing Nativism with a data structure originally developed for matching, indexing, storage, and retrieval in Database Semantics. Furthermore, we have seen that something like constituent structure is present in Database Semantics, namely the correlation of semantically related surfaces to the proplet from which they are realized in the speaker mode (cf. 5.1). How then should we approach the possible construction of equivalences between the two systems? From a structural point of view, there are two basic possibilities: to either look for an equivalence between corresponding components of the two systems (small solution), or to make the two candidates more equal by adding or subtracting components (large solution). Regarding a possible equivalence of corresponding components (small solution), a comparison is difficult. Relative to which parameters should the equivalence be defined: Complexity? Functionality? Grammatical insight? Data coverage? Language acquisition? Typology? Neurology? Ethology? Robotics? Some of these might be rather difficult to decide, requiring lengthy arguments which would exceed the limits of this paper. So let us see if there are some parts in one system which are missing in the other. This would provide us with the opportunity to add the component in question to the other system, thus moving inadvertently to a large solution for constructing an equivalence. 
Beginning with Nativism, we find the components of a universal base generated by the rules of a context-free phrase structure grammar, constrained by constituent structure, and mapped by transformations or similar mechanisms into the grammatical surfaces of the natural language in question. These components have taken different forms and are propagated by different linguistic schools. Their absence in Database Semantics raises the question of how to take care of what the components of Nativism have been designed to do. Here, two aspects must be distinguished: (i) the characterization of wellformedness and (ii) the characterization of innateness. For Chomsky, these are inseparable because without a characterization of innateness there are too many ways to characterize wellformedness.⁹ For Database Semantics, in contrast, the job of characterizing syntactic and semantic wellformedness is treated as a side-effect which results naturally from a well-functioning mechanism of interpreting and producing natural language during communication.
⁹ This problem is reminiscent of selecting the “right” phrase structure tree from a large number of possible trees (cf. 1.2), using the principle of constituent structure.
8 Can Nativism be Turned into an Agent-oriented Approach?
Next let us turn to components which are absent in Nativism.¹⁰ Their presence in DBS follows from the purpose of building a talking robot. The components, distinctions, and procedures in question are the external interfaces for recognition and action (cf. 3.1), a data structure with an associated algorithm modeling the hearer mode and the speaker mode (cf. 4.1), a systematic distinction between the language and the context level as well as their correlation in terms of matching (cf. 3.3), inferences at the context level (cf. NLC’06, Chapter 5), turntaking, etc., all of which are necessary in addition to the grammatical component proper.
¹⁰ They are also absent in truth-conditional semantics relative to a set-theoretical model defined in a metalanguage, which has been adopted as Nativism’s favorite method of semantic interpretation.
Extending Nativism by adding these components raises two challenges: (i) the technical problem of connecting the historically grown phrase structure system with the new components and (ii) finding a meaningful functional interaction between the original system and the new components. Regarding (i), there is the familiar problem of the missing external interfaces: how should a phrase structure system with transformations or the like be integrated into a computational model of the hearer mode and the speaker mode? Regarding (ii), it must be noted that Chomsky and others have emphasized again and again that Nativism is not intended to model the use of language in communication.
Nevertheless, an extension of Nativism to an agent-oriented system would have great theoretical and practical advantages. For the theory, it would substantially broaden the empirical base,¹¹ and for the practical applications, it would provide a wide range of much needed new functionalities such as procedures modeling the speaker mode and the hearer mode.
Let us therefore return to the possibility of translating phrase structure trees systematically into proplets (cf. 6.1 and 6.2). Is this formal possibility enough to turn Nativism into an agent-oriented system? The answer is simple: while the translation in question is a necessary condition for providing Nativism with an effective method for matching, indexing, storage, and retrieval, it is not a sufficient condition. What is needed in addition is that the connections between the proplets (i) characterize the basic semantic relations of functor-argument structure and coordination as simply and directly as possible and (ii) support the navigation along these semantic relations in a manner which is as language-independent as possible.
For these requirements, constituent structure presents two insuperable obstacles, namely (a) the proplets representing non-terminal nodes and (b) the proplets representing function words. Regarding (a), consider the set of proplets shown in 6.2 and the attempt to navigate from the terminal node Julia to the terminal node knows.
Because there is no direct relation between these two proplets in 6.2, such a navigation would have to go from the terminal proplet Julia to the non-terminal proplet NP to the non-terminal proplet S to the non-terminal proplet VP to the non-terminal proplet V and finally to the terminal proplet knows. Yet eliminating these non-terminal nodes¹² would destroy the essence of constituent structure as defined in 1.1 and thus the intuitive basis of Nativism.
¹¹ As empirical proof for the existence of a universal grammar, Nativism offers language structures claimed to be learned error-free. They are explained as belonging to that part of the universal grammar which is independent from language-dependent parameter setting. Structures claimed to involve error-free learning include
1. structural dependency
2. C-command
3. subjacency
4. negative polarity items
5. that-trace deletion
6. nominal compound formation
7. control
8. auxiliary phrase ordering
9. empty category principle
After careful examination of each, MacWhinney 2004 has shown that there is either not enough evidence to support the claim of error-freeness, or that the evidence shows that the claim is false, or that there are other, better explanations.
¹² In order to provide for a more direct navigation, as in Example 2.1 (result).
The other crucial ingredient of constituent structure, besides the non-terminal nodes, are the function words. They are important insofar as the words belonging together semantically are in large part the determiners with their nouns, the auxiliaries with their non-finite verbs, the prepositions with their noun phrases, and the conjunctions with their associated clauses. Regarding problem (b) raised by proplets representing function words, let us return to the example Suzy looked the word up, analyzed above in 1.3, 1.4, 2.2, 3.7, and 4.7.
Given that this sentence does not have a well-formed constituent structure in accordance with Definition 1.1, let us look for a way to represent it without non-terminal nodes, but with proplets for the function words the and up. Consider the following tentative proposal, which represents each terminal symbol (word) as a proplet and concatenates the proplets using the attributes previous and next, in analogy to 6.4:
8.1 TENTATIVE REPRESENTATION WITH FUNCTION WORD PROPLETS
  [noun: Suzy | prev: | next: look | prn: 2]
  [verb: look | prev: Suzy | next: the | prn: 2]
  [det: the | prev: look | next: word | prn: 2]
  [noun: word | prev: the | next: up | prn: 2]
  [prep: up | prev: word | next: | prn: 2]
For the purposes of indexing, this analysis allows the proplets to be stored in, and retrieved from, locations in a database which are not subject to any of the graphical constraints induced by phrase structure trees, and provides for a time-linear navigation, forward and backward, from one proplet to the next.¹³ For a linguistic analysis within Nativism or Database Semantics, however, the analysis 8.1 is equally unsatisfactory. What is missing for Nativism is a specification of what belongs together semantically. What is missing for Database Semantics is a specification of the functor-argument structure. For constructing an equivalence between Nativism and Database Semantics we would need to modify the attributes and their values in 8.1 so as to
1. retain the proplets for the function words,
2. characterize what belongs semantically together in the surface, and
3. specify the functor-argument structure.
Of these three desiderata, the third one is the most important: without functor-argument structure the semantic characterization of content in Database Semantics would cease to function and the extension of Nativism to an agent-oriented approach would fail.
For specifying functor-argument structure, the proplets for function words are an insuperable obstacle insofar as they introduce the artificial problem of choosing whether the connection between a functor and an argument should be based on the function words (modifiers) or on the content words (heads). For example, should the connection between looked and the word be defined between looked and the, or between looked and word? Then there follows the question of how the connection between word and the should be defined, and how the navigation should proceed. These questions are obviated in Database Semantics by defining the grammatical relations directly between the content words.
Consider the following semantic representation of Suzy looked the word up, repeating the result line of 3.7, though with the additional attribute sem to indicate the contribution of the determiner the after function word absorption:
8.2 SEMANTIC REPRESENTATION WITH FUNCTION WORD ABSORPTION
  [noun: Suzy | sem: nm | fnc: look up | prn: 2]
  [verb: look up | sem: pres | arg: Suzy word | prn: 2]
  [noun: word | sem: def sg | fnc: look up | prn: 2]
¹³ The navigation would be powered by rules like that illustrated in 4.2, modified to apply to the attributes of 8.1. For a complete DBS-system handling Example 8.1, consisting of an LA-hear grammar and an LA-think/speak-grammar, see NLC’06, Section 3.6.
Compared to the five proplets of Example 8.1, this analysis consists of only three. The attributes prev and next have been replaced by the attributes sem (for semantics), fnc (for the functor of a noun), and arg (for the argument(s) of a verb). The functor-argument structure of the sentence is coded by the value look up of the fnc slot of the nouns Suzy and word, and the values Suzy word of the arg slot of the verb look up (bidirectional pointering).
During the time-linear LA-hear analysis, shown in 3.7, the function words are treated as full-fledged lexical items (proplets). The resulting semantic representation 8.2 provides grammatical relations which support forward as well as backward navigation. These navigations, in turn, are the basis of the production of different language surfaces. For example, while forward navigation would be realized in English as Suzy looked the word up, the corresponding backward navigation would be realized as The word was looked up by Suzy.¹⁴
In 8.2, the contribution of the absorbed function word the is the value def of the sem attribute of the proplet word, while the contribution of the absorbed function word up is the corresponding value of the verb attribute of the proplet look up.¹⁵ Defining the grammatical relations solely between content words is motivated not only by the need to establish semantic relations suitable for different kinds of navigation, but also by the fact that function words are highly language-dependent, like morphological variation and word order.
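The bidirectional pointering of 8.2 can be checked mechanically. The sketch below assumes a dictionary encoding of the three proplets and a list-valued arg slot; the function name bidirectional is hypothetical. It tests only whether the fnc and arg values point at each other, not how a navigation would be realized as a surface.

```python
PROPLETS_8_2 = [
    {"noun": "Suzy", "sem": "nm", "fnc": "look up", "prn": 2},
    {"verb": "look up", "sem": "pres", "arg": ["Suzy", "word"], "prn": 2},
    {"noun": "word", "sem": "def sg", "fnc": "look up", "prn": 2},
]


def bidirectional(proplets):
    """Return True if every argument of a verb names a noun whose fnc value
    is that verb, and every noun is listed in the arg slot of its functor."""
    verbs = {p["verb"]: p for p in proplets if "verb" in p}
    nouns = {p["noun"]: p for p in proplets if "noun" in p}
    verb_to_noun = all(
        arg in nouns and nouns[arg]["fnc"] == verb["verb"]
        for verb in verbs.values()
        for arg in verb["arg"]
    )
    noun_to_verb = all(
        noun["fnc"] in verbs and noun["noun"] in verbs[noun["fnc"]]["arg"]
        for noun in nouns.values()
    )
    return verb_to_noun and noun_to_verb
```

Because each relation is coded at both ends, a navigation can start from the verb and reach its arguments, or start from a noun and reach its functor, without passing through proplets for function words.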
9 Conclusion
Nativism and Database Semantics both developed originally without feature structures; these were added later for a more detailed grammatical analysis. This paper describes the different functions of feature structures in Nativism and Database Semantics, and investigates the possible establishment of equivalences between the two systems. Establishing equivalences means overcoming apparent differences. The most basic difference between Nativism and Database Semantics is that Nativism is sign-oriented while Database Semantics is agent-oriented. Ultimately, this difference may be traced to the respective algorithms of the two systems: the rewrite rules of PS-grammar (Nativism) do not have an external interface, while the time-linear rules of LA-grammar (Database Semantics) do. It is for this reason that Nativism cannot be extended into an agent-oriented approach, thus blocking the most promising possibility for constructing an equivalence with Database Semantics. This result complements the formal non-equivalence between the complexity hierarchies of PS-grammar and LA-grammar proven in TCS’92. The argument in this paper has been based on only two language examples, namely Julia knows John and Suzy looked the word up. For wider empirical coverage see NLC’06. There, functor-argument structure (including subordinate clauses), coordination (including gapping constructions), and coreference (including ‘donkey’ and ‘Bach-Peters’ sentences) are analyzed in the hearer and the speaker mode, based on more than 100 examples.
Acknowledgments This paper benefited from comments by Airi Salminen, University of Toronto; Kiyong Lee, Korea University; Haitao Liu, Communication University of China; and Emmanuel Giguet, Université de Caen. All remaining mistakes are those of the author.

14 For a more detailed analysis see NLC’06, Section 6.5.
15 In analogy to 2.2, the value up could also be stored as a third valency filler in the arg slot of the verb.
References
Bar-Hillel, Y. (1964) Language and Information. Selected Essays on Their Theory and Application. Reading, MA: Addison-Wesley
Bloomfield, L. (1933) Language, New York: Holt, Rinehart, and Winston
Bresnan, J. (ed.) (1982) The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press
Chomsky, N. (1965) Aspects of a Theory of Syntax, The Hague: Mouton
Chomsky, N. (1981) Lectures on Government and Binding, Dordrecht: Foris
Clark, H. H. (1996) Using Language. Cambridge: Cambridge Univ. Press
Elmasri, R. & S.B. Navathe (1989) Fundamentals of Database Systems, Redwood City, CA: Benjamin-Cummings
Gaifman, C. (1961) Dependency Systems and Phrase Structure Systems, P-2315, Santa Monica, CA: Rand Corporation
Gazdar, G., E. Klein, G. Pullum, and I. Sag (1985) Generalized Phrase Structure Grammar. Cambridge, MA: Harvard Univ. Press
Harris, Z. (1951) Methods in Structural Linguistics, Chicago: Univ. of Chicago Press
Hausser, R. (1986) NEWCAT: Parsing Natural Language Using Left-Associative Grammar, LNCS 231, Berlin Heidelberg New York: Springer (NEWCAT’86)
Hausser, R. (1992) “Complexity in Left-Associative Grammar,” Theoretical Computer Science, Vol. 106.2:283–308, Amsterdam: Elsevier (TCS’92)
Hausser, R. (1999) Foundations of Computational Linguistics, 2nd ed. 2001, Berlin Heidelberg New York: Springer (FoCL’99)
Hausser, R. (2001) “Database Semantics for natural language,” Artificial Intelligence, Vol. 130.1:27–74, Amsterdam: Elsevier (AIJ’01)
Hausser, R. (2006) A Computational Model of Natural Language Communication, Berlin Heidelberg New York: Springer (NLC’06)
Kay, M. (1992) “Unification,” in M. Rosner and R. Johnson (eds) Computational Linguistics and Formal Semantics, pp. 1–30, Cambridge: Cambridge Univ. Press
Kay, P. (2002) “An informal sketch of a formal architecture for construction grammar,” Grammars, Vol. 5:1–19, Dordrecht: Kluwer
MacWhinney, B. (2004) “A multiple process solution to the logical problem of language acquisition,” Journal of Child Language, Vol. 31:883–914, Cambridge: CUP
Pereira, F., and D. Warren (1980) “Definite clause grammars for language analysis – a survey of the formalism and a comparison with augmented transition networks,” Artificial Intelligence, Vol. 13:231–278, Amsterdam: Elsevier
Pollard, C., and I. Sag (1987) Information-based Syntax and Semantics, Vol. I: Fundamentals, Stanford: CSLI
Pollard, C., and I. Sag (1994) Head-Driven Phrase Structure Grammar, Stanford: CSLI
Saussure, F. de (1913/1972) Cours de linguistique générale, Édition critique préparée par Tullio de Mauro, Paris: Éditions Payot
Shankar, V., and A. Joshi (1988) “Feature-structure based tree adjoining grammar,” in Proceedings of 12th International Conference on Computational Linguistics (Coling’88)
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Multi-Criterion Search from the Semantic Point of View (Comparing TIL and Description Logic)
Marie DUŽÍ
VSB-Technical University Ostrava, 17. listopadu 15, 708 33 Ostrava, Czech Republic
[email protected]
Peter VOJTÁŠ
Charles University Prague, Malostranské náměstí 25, 118 00 Praha 1, Czech Republic
[email protected]

Abstract In this paper we discuss two formal models suited to search and communication in a ‘multi-agent world’, namely TIL and EL@. Specifying their intersection, we are able to translate and switch between them. Using their union, we extend their functionalities. The main asset of using TIL is a fine-grained rigorous analysis and specification close to natural language. The additional contribution of EL@ consists in modelling multi-criterion aspects of user preferences. Using a simple example throughout the paper, we illustrate the aspects of a multi-criterion search and communication by their analysis and specification in both systems. The paper is an introductory study aiming at a universal logical approach to the ‘multi-agent world’, which at the same time opens new research problems and trends.
1. Introduction and motivation. In this paper we discuss two formal models that are relevant in the area of search and communication in the multi-agent world, namely Transparent Intensional Logic (TIL) and a fuzzy variant EL@ of the existential description logic EL (see [2]). Since TIL has been introduced and discussed in the EJC proceedings and EL is a well-known logical system, we do not present their technicalities in detail. Instead, we provide only the minimal introduction needed to keep the paper self-contained, and concentrate on the analytic and specification role of these systems in the area of a semantic web search that takes into account user-specific fuzzy criteria. By comparing the two formalisms we aim at providing a clue to their integration. Last but not least, we would like to illustrate the assets of a rigorous logical approach to the problem. The main asset of using TIL is a fine-grained rigorous analysis and specification close to natural language. The additional contribution of EL@ consists in modelling multi-criterion aspects of user preferences. The paper is an introductory study aiming at a universal logical approach to the ‘multi-agent world’, which at the same time opens new research problems and trends. The EL@ logic is a many-valued version of the existential description logic EL (see [2]) where fuzzification concerns only concepts and the logic is enriched with aggregation (see [21]). Specifying the intersection of TIL and EL@, viz. the TIE@L, we are able to translate
M. Duží and P. Vojtáš / Multi-Criterion Search from the Semantic Point of View
and switch between the two systems. Using their union, TI+E@L, we extend their functionalities. Throughout the paper we use a simple example in order to illustrate basic principles, common features, as well as differences of the two systems. Example Consider a simple communication between three agents, A, B and C. The agents can be computational, like web services, database engines, query engines, pieces of software, or even human ones. The agent A sends a message to B asking to find a hotel suitable for A (the structure of the message and the meaning of ‘suitable’ will be discussed later). After obtaining an answer the agent A chooses a hotel and sends another message to the agent C asking to seek a suitable parking place close to the chosen hotel. The criteria of A are: hotel price (e.g., as low as possible), hotel distance to a beach (should be as close as possible), hotel year of building (not too old), parking place price and parking place distance (to the hotel). We are going to describe this scenario simultaneously in two formal models: TIL (Transparent Intensional Logic) and DL (Description Logic). Of course, the model can be made more realistic by considering a larger number of agents searching for specific attribute values (this approach is motivated by Fagin in [10]). When needed, we will switch between the levels of granularity in order to go into more details. Using the DL and/or database notation we are thus going to consider agents of the type User, and the attributes Hotel_Price, Hotel_Beach_Distance, Hotel_Year_of_Construction, Parking_Price, Parking_Distance. Let the values of the attributes (results of the search) be:
Particular attribute preferences of a user U can be evaluated by assigning the preference degree, a real number in the interval [0,1], to the attribute values. For instance, cheap_U(150) = 0.75, close_U (300) = 0.6, new_U (1980) = 0.2, and similarly for the other values. In this way we obtain fuzzy subsets cheap_U, close_U, new_U of the attribute domains, which can be recorded in a fuzzy database operation table (see [15]):
Our reasoning and decision making is driven not only by the preferences we assign to the values of attributes, but also by the weight we assign to the very criteria of interest. For instance, when being short of money, the low price of the hotel is much more important than its closeness to the beach. When being rich we may prefer a modern high-tech equipped hotel situated on the beach. On the other hand a hotel close to the beach may become totally unattractive in a tsunami-affected area. The multi-criterion decision is thus seldom based on a simple conjunctive or disjunctive combination of the respective factors, and we need an algorithm to compute global user preferences as a composition of particular weighted fuzzy values of the selection criteria. The algorithm can be rather sophisticated. However, for the sake of simplicity, let it be just a weighted average:
$$@_U(\mathit{cheap}_U, \mathit{close}_U, \mathit{new}_U) = \frac{2 \cdot \mathit{cheap}_U + 3 \cdot \mathit{close}_U + \mathit{new}_U}{6}$$
Computing the global degree of preferences of the hotel h1 for the user U, we obtain:
$$@_U(0.75, 0.6, 0.2) = \frac{2 \cdot 0.75 + 3 \cdot 0.6 + 0.2}{6} = \frac{3.5}{6} = 0.58\ldots$$
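The weighted average above can be checked with a minimal Python sketch. The function name `weighted_average` and the variable names are illustrative assumptions; only the weights (2, 3, 1) and the degrees (0.75, 0.6, 0.2) of the hotel h1 come from the text.

```python
def weighted_average(degrees, weights):
    """Aggregate preference degrees from [0, 1] by a weighted average."""
    if len(degrees) != len(weights):
        raise ValueError("one weight per degree is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * d for w, d in zip(weights, degrees)) / total_weight


# Weights of user U for cheap_U, close_U, new_U, as in the formula above.
WEIGHTS_U = (2, 3, 1)
# Preference degrees of hotel h1: cheap_U(150), close_U(300), new_U(1980).
H1_DEGREES = (0.75, 0.6, 0.2)

score_h1 = weighted_average(H1_DEGREES, WEIGHTS_U)
print(f"global preference degree of h1: {score_h1:.2f}")
```

Keeping the weights separate from the degrees makes it easy to model another user, or the same user at a later time, by changing only the weight tuple.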
Since this value is higher than the value of the hotel h2, the user U is going to choose h1. Of course, another user can have different preferences, and the preferences of one and the same user may also change dynamically over time. Besides the fact that in a multi-agent world we work with vague, fuzzy or uncertain information, we also have to take into account the demand for robustness and distribution of the system. The system has to be fully distributed, and we have to deal with value gaps because particular agents may fail to supply the requested data. On the other hand, in critical and emergency situations, which tend towards chaotic behaviour, the need for adequate data becomes crucial. Therefore the classical systems based on the Closed World Assumption are not plausible here. We have to work under the Open World Assumption (OWA), and a lack of knowledge must not cause a collapse of the system. For instance, it may happen that we are not able to retrieve the distance of the hotel h1 to the beach, and the available data are as follows:
There are several ways of dealing with missing data. We may use default values (e.g., average, the best or the worst ones), or treat the missing values as value gaps of partial functions. From the formal point of view, TIL is a hyper-intensional partial λ-calculus. By ‘hyper-intensional’ we mean the fact that the terms of the ‘language of TIL constructions’ are not interpreted as the denoted functions, but as algorithmically structured procedures, known as TIL constructions, producing the denoted functions as outputs. Thus we can rigorously and naturally handle the terms that are in classical logics ‘non-denoting’, or undefined;1 in TIL each term denotes a fully-fledged entity, namely a construction. Hence (well-typed) terms never lack semantics. It may just happen (in well defined cases) that the denoted procedure fails to produce an output function. And if it does not fail, it may happen that the produced function fails to have a value at an argument. These features of TIL are naturally combined with and completed by the EL@ fuzzy tools, in particular the aggregation algorithms.
The paper is organized as follows: Chapter 2 contains brief introductory remarks on TIL. Chapter 3 introduces the EL@ description logic, and Chapter 4 is devoted to the formal description of our motivating examples, which gives us a flavour of the common features of both the models. As a result, in concluding Chapter 5 we outline a possible hybrid system and specify the trends of future research.
1 For the logic of definedness see [11].
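The two ways of handling a missing attribute value mentioned in the Introduction (default values versus value gaps of partial functions) can be contrasted in a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, a value gap is modelled by `None`, and the two known degrees of h1 (0.75 and 0.2) are assumed to be unchanged when the distance to the beach cannot be retrieved.

```python
from typing import Optional, Sequence


def aggregate_with_default(degrees: Sequence[Optional[float]],
                           weights: Sequence[float],
                           default: float) -> float:
    """Replace every missing degree (None) by a default, then average."""
    filled = [default if d is None else d for d in degrees]
    return sum(w * d for w, d in zip(weights, filled)) / sum(weights)


def aggregate_partial(degrees: Sequence[Optional[float]],
                      weights: Sequence[float]) -> Optional[float]:
    """Treat the aggregation as a partial function: a value gap in any
    argument yields a value gap (None) in the result."""
    if any(d is None for d in degrees):
        return None
    return sum(w * d for w, d in zip(weights, degrees)) / sum(weights)


WEIGHTS_U = (2, 3, 1)
# Assumption: the beach distance of h1 is missing, the other degrees are known.
H1_DEGREES = (0.75, None, 0.2)

worst_case = aggregate_with_default(H1_DEGREES, WEIGHTS_U, default=0.0)
best_case = aggregate_with_default(H1_DEGREES, WEIGHTS_U, default=1.0)
value_gap = aggregate_partial(H1_DEGREES, WEIGHTS_U)

print(worst_case, best_case, value_gap)
```

With defaults, the system always returns a score, bracketed here between the worst-case and best-case default. With the partial-function reading, the gap is propagated explicitly, and the caller decides what to do with it.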
2. TIL in brief.
In this Chapter we provide just a brief introductory explanation of the main notions of Transparent Intensional Logic (TIL). For exact definitions and details see, e.g., [5], [7], [8], [19], [20]. The TIL approach to knowledge representation can be characterised as the ‘top-down approach’. TIL ‘generalises to the hardest case’ and obtains the ‘less hard cases’ by lifting various restrictions that apply only higher up. This way of proceeding is opposite to how semantic theories tend to be built up. The standard approach consists in beginning with atomic sentences, then proceeding to molecular sentences formed by means of truth-functional connectives or by quantifiers, and from there to sentences containing modal operators and, finally, attitudinal operators. Thus, to use a simple case for illustration, once a vocabulary and rules of formation have been laid down, a semantics gets off the ground by analysing an atomic sentence as follows:
(1) “Charles selected the hotel h”: S(a,h)
And further upwards:
(2) “Charles selected the hotel h, and Thelma is happy”: S(a,h) ∧ H(b)
(3) “Somebody selected the hotel h”: ∃x S(x,h)
(4) “Possibly, Charles selected the hotel h”: ◇S(a,h)
(5) “Thelma believes that Charles selected the hotel h”: B(b,S(a,h)).
In non-hyperintensional (i.e., non-procedural) theories of formal semantics, attitudinal operators are swallowed by the modal ones. But when they are not, we have three levels of granularity: the coarse level of truth-values, the fine-grained level of truth-conditions (propositions, truth-values-in-intension), and the very fine-grained level of hyper-propositions, i.e., constructions of propositions. TIL operates with these three levels of granularity. We start out by analysing sentences from the uppermost end, furnishing them with a hyperintensional2 semantics, and working our way downwards, furnishing even the lowest-end sentences (and other empirical expressions) with a hyperintensional semantics. That is, the sense of a sentence such as “Charles selected the hotel h” is a hyper-proposition, namely the construction of the denoted proposition (i.e., the instruction how to evaluate the truth-conditions of the sentence in any state of affairs). When assigning a construction to an expression as its meaning, we specify a procedural know-how, which must not be confused with the respective performatory know-how. Distinguishing performatory know-how from procedural know-how, the latter can be characterised by saying “that a knower x knows how A is done in the sense that x can spell out instructions for doing A.”3 For instance, to know what the Goldbach Conjecture means is to understand the instruction to find whether ‘all positive even integers ≥ 4 can be expressed as the sum of two primes’. It does not include either actually finding out (whether it is true or not by following a procedure or by luck) or possessing the skill to do so.4 Furthermore, the sentence “Charles selected the hotel h” is an ‘intensional context’, in the sense that its logical analysis must involve reference to empirical parameters, in this case both possible worlds and instants of time. Charles only contingently selected the hotel; i.e., he did so only at some worlds and only sometimes.
2 The term ‘hyperintensional’ has been introduced by Max Cresswell, see [4].
3 See [16, p.6].
4 For details on TIL handling knowledge see [8].

The other reason is that the analysans must
be capable of figuring as an argument for functions whose domain are propositions rather than truth-values. Construing ‘S(a,h)’ as a name of a truth-value works only in the case of (1) and (2). It won’t work in (5), since truth-values are not the sort of thing that can be believed. Nor will it work in (4), since truth-values are not the sort of thing that can be possible.
Constructions are procedures, or instructions, specifying how to arrive at less-structured entities. Being procedures, constructions are structured from the algorithmic point of view, unlike set-theoretical objects. The TIL ‘language of constructions’ is a modified hyperintensional version of the typed λ-calculus, where Montague-like λ-terms denote, not the functions constructed, but the constructions themselves. Constructions qua procedures operate on input objects (of any type, even on constructions of any order) and yield as output (or, in well defined cases fail to yield) objects of any type; in this way constructions construct partial functions, and functions, rather than relations, are basic objects of our ontology. The choice of types and of constructions is not given once and for all: it depends on the area to be analyzed. By claiming that constructions are algorithmically structured, we mean the following: a construction C, being an instruction, consists of particular steps, i.e., sub-instructions (or, constituents) that have to be executed in order to execute C. The concrete/abstract objects an instruction operates on are not its constituents, they are just mentioned. Hence objects have to be supplied by another (albeit trivial) construction. The constructions themselves may also be only mentioned: therefore one should not conflate using constructions as constituents of composed constructions and mentioning constructions that enter as input into composed constructions; we have to strictly distinguish between using and mentioning constructions.
Just briefly: Mentioning is, in principle, achieved by using atomic constructions. A construction is atomic if it is a procedure that does not contain any other construction as a used subconstruction (a constituent). There are two atomic constructions that supply objects (of any type) on which complex constructions operate: variables and trivializations. Variables are constructions that construct an object dependently on valuation: they v-construct, where v is the parameter of valuations. When X is an object (including constructions) of any type, the Trivialization of X, denoted ⁰X, constructs X without the mediation of any other construction. ⁰X is the atomic concept of X: it is the primitive, non-perspectival mode of presentation of X. There are two compound constructions, which consist of other constructions: Composition and Closure. Composition is the procedure of applying a function f to an argument A, i.e., the instruction to apply f to A to obtain the value (if any) of f at A. Closure is the procedure of constructing a function by abstracting over variables, i.e., the instruction to do so. Finally, higher-order constructions can be used twice over as constituents of composed constructions. This is achieved by a fifth construction called Double Execution. TIL constructions, as well as the entities they construct, all receive a type. The formal ontology of TIL is bi-dimensional. One dimension is made up of constructions, the other dimension encompasses non-constructions. On the ground level of the type-hierarchy, there are entities unstructured from the algorithmic point of view belonging to a type of order 1.
Given a so-called epistemic (or ‘objectual’) base of atomic types (ο-truth values, ι-individuals, τ-time moments / real numbers, ω-possible worlds), mereological complexity is increased by the induction rule for forming partial functions: where α, β1,…,βn are types of order 1, the set of partial mappings from β1 × … × βn to α, denoted (α β1…βn), is a type of order 1 as well.5

5 TIL is an open-ended system. The above epistemic base {ο, ι, τ, ω} was chosen, because it is apt for natural-language analysis, but the choice of base depends on the area to be analysed.
Constructions that construct entities of order 1 are constructions of order 1. They belong to a type of order 2, denoted by *1. This type *1 together with atomic types of order 1 serves as a base for the induction rule: any collection of partial functions, type (α β1…βn), involving *1 in their domain or range is a type of order 2. Constructions belonging to a type *2 that identify entities of order 1 or 2, and partial functions involving such constructions, belong to a type of order 3. And so on ad infinitum.

Definition (Constructions)
i) Variables x, y, z, … are constructions that construct objects of the respective types dependently on valuations v; they v-construct.
ii) Trivialization: Where X is any object whatsoever (an extension, an intension or a construction), ⁰X is a construction called trivialization. It constructs X without any change.
iii) Composition: If X v-constructs a function F of a type (α β1…βm), and Y1,…,Ym v-construct entities B1,…,Bm of types β1,…,βm, respectively, then the composition [X Y1 … Ym] is a construction that v-constructs the value (an entity, if any, of type α) of the (partial) function F on the argument ⟨B1,…,Bm⟩. Otherwise the composition [X Y1 … Ym] does not v-construct anything: it is v-improper.
iv) Closure: If x1, x2, …, xm are pairwise distinct variables that v-construct entities of types β1, β2, …, βm, respectively, and Y is a construction that v-constructs an entity of type α, then [λx1…xm Y] is a construction called closure, which v-constructs the following function F of the type (α β1…βm), mapping β1 × … × βm to α: Let B1,…,Bm be entities of types β1,…,βm, respectively, and let v(B1/x1,…,Bm/xm) be a valuation differing from v at most in associating the variables x1,…,xm with B1,…,Bm, respectively. Then F associates with the m-tuple ⟨B1,…,Bm⟩ the value v(B1/x1,…,Bm/xm)-constructed by Y. If Y is v(B1/x1,…,Bm/xm)-improper (see iii), then F is undefined on ⟨B1,…,Bm⟩.
v) Double execution: If X is a construction that v-constructs a construction X’, then ²X is a construction called double execution. It v-constructs the entity (if any) v-constructed by X’. Otherwise the double execution ²X is v-improper.
vi) Nothing is a construction, unless it so follows from i) through v).

The notion of construction is the most misunderstood of the notions used in TIL. Some logicians ask: Are constructions formulae of type-logic? Our answer: No! Another question: Are they denotations of closed formulae? Our answer: No! So a pre-formal, ‘pre-theoretical’ characterisation is needed: constructions are abstract procedures.
Question: Procedures are time-consuming, how can they be abstract? Answer: The realization of an algorithm is time-consuming; the algorithm itself is timeless and spaceless.
Question: So what about your symbolic language? Why do you not simply say that its expressions are constructions? Answer: These expressions cannot construct anything; they serve only to represent (or encode) constructions.
Question: But you could do it like Montague6 did: translate expressions of natural language into the language of intensional logic, and then interpret the result in the standard manner. What you achieve using ‘constructions’ you would get using a metalanguage. Answer(s): First, Montague’s and other intensional logics interpret terms of their language as the respective functions, i.e., set-theoretical mappings. However, these mappings are the outputs of executing the respective procedures. Montague does not make it possible to mention the
6 For details on the Montague system see, e.g., [12, pp. 117-220].
procedures as objects sui generis, and thus to make a semantic shift to hyperintensions. Yet we do need a hyperintensional semantics. Notoriously well-known are attitudinal sentences which no intensional semantics can properly handle, because its finest individuation is equivalence.7 Second, our logic is universal: we do not need to work as part-time linguists. Using the ‘language of constructions’ we directly encode constructions.

Definition ((α-)intension, (α-)extension) (α-)intensions are members of a type (αω), i.e., functions from possible worlds to the arbitrary type α. (α-)extensions are members of the type α, where α is not equal to (βω) for any β, i.e., extensions are not functions from possible worlds.

Remark on notational conventions: An object A of a type α is called an α-object, denoted A/α. That a construction C v-constructs an α-object is denoted C →v α. We will often write ‘∀x A’, ‘∃x A’ instead of ‘[⁰∀^α λx A]’, ‘[⁰∃^α λx A]’, respectively, when no confusion can arise. We also often use an infix notation without trivialisation when using constructions of truth-value functions ∧ (conjunction), ∨ (disjunction), ⊃ (implication), ≡ (equivalence) and negation (¬), and when using a construction of an identity.

Intensions are frequently functions of a type ((ατ)ω), i.e., functions from possible worlds to chronologies of the type α (in symbols: α_τω), where a chronology is a function of type (ατ). We will use variables w, w1, w2, … as v-constructing elements of type ω (possible worlds), and t, t1, t2, … as v-constructing elements of type τ (times). If C → α_τω v-constructs an α-intension, the frequently used composition of a form [[C w] t], v-constructing the intensional descent of the α-intension, will be abbreviated as C_wt.

Some important kinds of intensions are:
Propositions, type ο_τω. They are denoted by empirical (declarative) sentences.
Properties of members of a type α, or simply α-properties, type (οα)_τω.8 General terms (some substantives, intransitive verbs) denote properties, mostly of individuals.
Relations-in-intension, type (οβ1…βm)_τω. For example transitive empirical verbs, also attitudinal verbs denote these relations. Omitting τω we get the type (οβ1…βm) of relations-in-extension (to be met mainly in mathematics).
α-roles, offices, type α_τω, where α ≠ (οβ). Frequently ι_τω. Often denoted by concatenation of a superlative and a noun (“the highest mountain”). Individual roles correspond to what Church in [3] called “individual concept”.

The role of the above defined constructions in a communication between agents will be illustrated in Chapter 4, in particular in Paragraph 4.5. Just a note to elucidate the role of Trivialisation and empirical parameters w → ω, t → τ: The TIL language is not based on a fixed alphabet: the role of formal constants is here played by Trivialisations of non-constructional entities, i.e., the atomic concepts of them. Each agent has to be equipped with a basic ontology, namely the set of atomic concepts the agent knows. Thus the upper index ‘0’ serves as a marker of the atomic concept (like a ‘key-word’) that the agent should know. If the agent does not know it, the agent has to learn it. The lower index ‘wt’ can be understood as an instruction to execute an empirical inquiry (search) in order to obtain the actual current value of an intension, for instance by searching the agent’s database or by asking the other agents, or even by means of the agent’s sense perception.
7 See [12, p.73].
8 Collections, sets, classes of ‘α-objects’ are members of type (οα); TIL handles classes (subsets of a type) as characteristic functions. Similarly relations (-in-extension) are of type(s) (οβ1…βm).
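The five kinds of constructions defined in this chapter can be imitated by a small interpreter. The following Python sketch is an illustration only, not the authors' implementation: the class names, the `Improper` exception, the use of `None` for "undefined on this argument" and the example function `divide` are all assumptions introduced here, and types are not checked.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


class Improper(Exception):
    """Signals that a construction is v-improper: it constructs nothing."""


class Construction:
    """Base class; execute(v) returns the object v-constructed."""

    def execute(self, v: Dict[str, Any]) -> Any:
        raise NotImplementedError


@dataclass(frozen=True)
class Variable(Construction):
    name: str

    def execute(self, v):
        return v[self.name]  # the valuation v supplies the object


@dataclass(frozen=True)
class Trivialization(Construction):
    obj: Any  # any object, possibly itself a construction (mentioning)

    def execute(self, v):
        return self.obj  # constructs the object without any change


@dataclass(frozen=True)
class Composition(Construction):
    function: Construction
    args: Tuple[Construction, ...]

    def execute(self, v):
        f = self.function.execute(v)
        values = tuple(a.execute(v) for a in self.args)
        result = f(*values)
        if result is None:  # the partial function has a value gap here
            raise Improper("function undefined on the given argument")
        return result


@dataclass(frozen=True)
class Closure(Construction):
    params: Tuple[str, ...]
    body: Construction

    def execute(self, v):
        def constructed_function(*values):
            local = {**v, **dict(zip(self.params, values))}
            try:
                return self.body.execute(local)
            except Improper:
                return None  # F is undefined on this tuple of arguments
        return constructed_function


@dataclass(frozen=True)
class DoubleExecution(Construction):
    inner: Construction

    def execute(self, v):
        constructed = self.inner.execute(v)
        if not isinstance(constructed, Construction):
            raise Improper("the inner construction did not yield a construction")
        return constructed.execute(v)


def divide(x, y):
    """A partial function: undefined (None) when dividing by zero."""
    return None if y == 0 else x / y


# [⁰divide x y]: a Composition that uses two variables.
quotient = Composition(Trivialization(divide), (Variable("x"), Variable("y")))
# [λy [⁰divide ⁰6 y]]: a Closure constructing a partial function.
six_over = Closure(("y",), Composition(Trivialization(divide),
                                       (Trivialization(6), Variable("y")))).execute({})
# ²⁰(⁰5): the outer Trivialization only mentions ⁰5; Double Execution runs it.
five = DoubleExecution(Trivialization(Trivialization(5))).execute({})
```

In this sketch an improper Composition raises `Improper`, while a function constructed by a Closure returns `None` where its body is improper, mirroring clauses iii) and iv) of the Definition.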
3. The EL@ description logic
In a multi-agent world like the semantic web we need to retrieve, process, share or reuse information which is often vague or uncertain. The applications have to work with procedures that deal with the degree of relatedness, similarity or ranking. These motivations led to the development of fuzzy description logic (see [18]). In this chapter we briefly describe a variant of fuzzy description logic, namely EL@ (see [21]). One of the principal sources of fuzziness is user evaluation (preference) of crisp values of attributes. For instance, the hotel price is crisp but user evaluation may lead to a fuzzy predicate like a cheap, moderate, or expensive hotel. User preferences are modelled by a linearly ordered set of degrees T = [0,1] extending classical truth-values. Thus we have: 0 = False = ⊥ = the worst element of T, and 1 = True = ⊤ = the best element of T. Now when searching for a suitable object we have to order the set of available objects according to the user degrees assigned to object-attribute values. Practical experience has shown that the ordering is seldom based on a conjunctive or disjunctive combination of particular scores. Rather, we need to work with a fuzzy aggregation function that combines generally incomparable sets of values. The EL@ logic is in some aspects a weakening of Straccia’s fuzzy description logic and in some other aspects a strengthening.9 The restrictions concern using just crisp roles and not using negation. Moreover, quantification is restricted to existential quantifiers. The extension concerns the application of aggregation functions. Thus we lose the ability to describe fuzziness in roles but gain the ability to compute a global user score. The EL@ alphabet consists of (mutually disjoint) sets NC of concept names containing ⊤, NR of role names, NI of instance names, and constructors containing ∃ and a finite set C of combination functions with an arity function ar : C → {n ∈ N : n ≥ 2}.
Concept descriptions in EL@ are formed according to the following syntax rules (where @ ∈ C)

The interpretation structures of our description logic EL@ are parameterized by an ordered set of truth-values T (the degrees of membership to a domain of a fuzzy concept) and a set of n-ary aggregation functions over T. An interpretation structure T is thus an algebra T = {T, ≤, {@•T : @ ∈ C}}, where (T, ≤, ⊤) is an upper complete semilattice with the top element ⊤, and @•T : T^ar(@) → T is a lattice of totally continuous (order-preserving) aggregation functions. A T-interpretation is then a pair I = ⟨Δ^I, •^I⟩, with a nonempty domain Δ^I and the interpretation of language elements
a^I ∈ Δ^I, for a ∈ NI
A^I : Δ^I → T, for A ∈ NC (concepts can be fuzzy, like a suitable hotel)
r^I ⊆ Δ^I × Δ^I, for r ∈ NR
9 For details on fuzzy description logic see [18].
(Roles remain crisp; however, users may interpret these data in a fuzzy way. We assume that fuzziness should not be attached to the data from the very beginning.) The extension of the T-interpretation I to the composed EL@ concepts is given by
$$(@(C_1, \ldots, C_n))^I(x) = @^{\bullet}(C_1^I(x), \ldots, C_n^I(x))$$
and
$$(\exists r.C)^I(x) = \sup\{C^I(y) : (x, y) \in r^I\}$$
EL@ is a surprisingly expressive language with good mathematical properties. It opens a possibility to define declarative as well as procedural semantics of an answer to a user query formulated by means of a fuzzy concept definition. The discussion on the complexity of particular problems, like satisfiability, the instance problem, the problem of deciding subsumption and the proof of soundness and completeness are, however, out of scope of the present paper. For details, see, e.g., [21].
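The two semantic clauses for composed EL@ concepts can be evaluated over a finite interpretation as in the following Python sketch. It is an illustration under stated assumptions: T is taken to be [0, 1], the supremum over a finite role is computed as a maximum (with 0 for an empty set), and the role `has_parking` and the parking degrees are hypothetical data, not taken from the paper. Only the hotel degrees 0.75, 0.6, 0.2 and the weighted average come from the Introduction.

```python
from typing import Callable, Dict, Sequence, Set, Tuple

FuzzyConcept = Callable[[str], float]  # A^I : Δ^I → T, with T = [0, 1]
Role = Set[Tuple[str, str]]            # r^I ⊆ Δ^I × Δ^I (crisp)


def from_table(table: Dict[str, float]) -> FuzzyConcept:
    """Turn a table of membership degrees into a fuzzy concept."""
    return lambda x: table[x]


def aggregate(agg: Callable[..., float],
              concepts: Sequence[FuzzyConcept]) -> FuzzyConcept:
    """(@(C1, ..., Cn))^I(x) = @•(C1^I(x), ..., Cn^I(x))."""
    return lambda x: agg(*(c(x) for c in concepts))


def exists(role: Role, concept: FuzzyConcept) -> FuzzyConcept:
    """(∃r.C)^I(x) = sup{C^I(y) : (x, y) ∈ r^I}.

    Over a finite role the supremum is a maximum; the supremum of the
    empty set is the least degree 0.
    """
    return lambda x: max((concept(y) for (s, y) in role if s == x),
                         default=0.0)


def at_u(cheap: float, close: float, new: float) -> float:
    """The weighted average @_U of user U from the Introduction."""
    return (2 * cheap + 3 * close + new) / 6


# Degrees of hotel h1 as given in the Introduction.
cheap = from_table({"h1": 0.75})
close = from_table({"h1": 0.6})
new = from_table({"h1": 0.2})
suitable_hotel = aggregate(at_u, [cheap, close, new])

# Hypothetical data: two parking places related to h1 by a crisp role.
has_parking: Role = {("h1", "p1"), ("h1", "p2")}
cheap_parking = from_table({"p1": 0.4, "p2": 0.9})
hotel_with_cheap_parking = exists(has_parking, cheap_parking)

print(suitable_hotel("h1"), hotel_with_cheap_parking("h1"))
```

Because composed concepts are again functions from individuals to degrees, they can be nested freely, for example by aggregating `suitable_hotel` with `hotel_with_cheap_parking`.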
4. TIL and EL@ combined
Using the example from the outset we are now going to outline the way of integrating the two systems. We illustrate the work with a typed and / or non-typed language, and the role of basic pre-concepts like a type, domain, concept and role. As stated above, TIL is a typed system. The basic types serve as the pre-concepts.
4.1 Pre-concepts
a) Basic types
(TIL): The epistemic base is a collection of: ο – the set of truth-values {T, F}; ι – the universe of discourse (the set of individuals); τ – the set of times (temporal factor) and/or real numbers; ω – the set of possible worlds (modal factor).
(EL@): Basic pre-concepts are T and Δ^I, as specified in Chapter 3. The description logic does not work explicitly with the temporal and modal factors. However, it is possible to distinguish between what is necessary ex definitione (T-boxes) and the contingency of attribute values (A-boxes). Moreover, EL@ contributes the means for handling user preference structures – the preference factor.
(TIL): The universe of discourse is the (universal) set of individuals. EL@ works with varying domains of interpretation Δ^I.

b) Functions and relations
(TIL): TIL is a functional system: composed (functional) types are collections of partial functions; α-sets and (αβ)-relations are modelled by their characteristic functions, objects of types (οα) and (οαβ), respectively.
(EL@): Being a variant of description logic, EL@ is based on first-order predicate logic, where n-ary predicates are interpreted as n-ary relations over the universe. However, in EL@ this is true only for n = 2: binary predicates are crisp roles. In other respects EL@ is actually functional; it deals with (crisp) n-ary aggregation functions, and unary predicates (concepts) are interpreted as fuzzy sets by their fuzzy characteristic functions Δ^I → T.
4.2 Assortment of the individuals in the universe
(TIL) Properties. In order to classify individuals into particular sorts, we use properties of individuals. They are intensions, namely functions that, depending on the state of affairs (the modal parameter ω) and time (the parameter τ), yield the population of individuals (οι) that actually and currently have the property in question.

Example: h1, h2, h3 / ι are individuals with the property Hotel / (οι)_τω of being a hotel. In the database setting these individuals belong to the domain of the attribute “hotel”, or, these individuals may instantiate the entity set HOTEL. That h1, h2, h3 are hotels is in TIL represented by the constructions of the respective propositions

λwλt [⁰Hotel_wt ⁰h1], λwλt [⁰Hotel_wt ⁰h2], λwλt [⁰Hotel_wt ⁰h3],

where the property Hotel / (οι)_τω is first intensionally descended (⁰Hotel_wt) and then ascribed to an individual: [⁰Hotel_wt ⁰hi]. Finally, to complete the meaning of ‘hi is a hotel’, we have to abstract over the modal and temporal parameters in order to construct a proposition of type ο_τω that hi is a hotel: λwλt [⁰Hotel_wt ⁰hi]. Gloss the construction as an instruction for evaluating the truth-conditions: in any state of affairs of evaluation (λwλt) check whether the individual (⁰hi) currently belongs to the actual population of hotels ([⁰Hotel_wt ⁰hi]).

(EL@) Equivalents. Names of properties correspond to the elements of NC and NR. The above propositions are represented by membership assertions: Hotel(h1), Hotel(h2), Hotel(h3).

The example continued. Let A, B, C / ι be individuals with the property of being an agent. In the database setting these individuals belong to the domain of the attribute “user”, or, these individuals may instantiate the entity set AGENT. However, in order to be able to represent n-ary properties of individuals by means of binary ones, we need to identify particular users. Of course, in the case of a big and varying set of users it is not in general possible to identify each user, and we often have to consider (a smaller number of) user profiles.

(TIL): That A, B, C are agents is represented by the constructions of the respective propositions:
λwλt [⁰Agent_wt ⁰A], λwλt [⁰Agent_wt ⁰B], λwλt [⁰Agent_wt ⁰C],

where the property Agent / (οι)_τω is intensionally descended and then ascribed to an individual, e.g. [⁰Agent_wt ⁰A]. Finally, in order to construct a proposition, we have to abstract over the parameters w, t: λwλt [⁰Agent_wt ⁰A]. Gloss: in any state of affairs of evaluation check whether the individual A currently belongs to the actual population of agents.
(EL@): The above propositions are represented by membership assertions: Agent(A), Agent(B), Agent(C).

(TIL): Parking / (οι)_τω; the property of an individual of being a parking place. For instance, for individuals p1, p2, …, pn / ι, the proposition that pi has the property of being a parking place is constructed by λwλt [⁰Parking_wt ⁰pi].
(EL@): These individuals belong to the extension of the concept: Parking(pi).
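The reading of a property as a function from states of affairs and times to populations can be mimicked directly with ordinary functions. The following is a minimal sketch, not a TIL implementation: the worlds "w1" and "w2", the time 2007 and the toy populations are hypothetical, and intensional descent is modelled simply as applying the function to a pair (w, t).

```python
from typing import Callable, Dict, FrozenSet, Tuple

World = str
Time = int
Population = FrozenSet[str]
Property = Callable[[World, Time], Population]   # plays the role of type (οι)_τω
Proposition = Callable[[World, Time], bool]      # plays the role of type ο_τω

# Hypothetical toy facts: which individuals have which property in which state of affairs.
_FACTS: Dict[Tuple[World, Time], Dict[str, Population]] = {
    ("w1", 2007): {"Hotel": frozenset({"h1", "h2", "h3"}), "Parking": frozenset({"p1", "p2"})},
    ("w2", 2007): {"Hotel": frozenset({"h1"}), "Parking": frozenset({"p1", "p2", "h2"})},
}


def make_property(name: str) -> Property:
    """Build an intension: given (w, t) it yields the current population."""
    def prop(w: World, t: Time) -> Population:
        return _FACTS[(w, t)].get(name, frozenset())
    return prop


def is_a(prop: Property, individual: str) -> Proposition:
    """Counterpart of λwλt [⁰P_wt ⁰a]: abstract over w, t after the descent P_wt."""
    return lambda w, t: individual in prop(w, t)


Hotel = make_property("Hotel")
h2_is_hotel = is_a(Hotel, "h2")
```

Evaluating h2_is_hotel in different states of affairs gives different truth-values, which is the contingency that an A-box assertion Hotel(h2) leaves implicit.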
4.3 Attributes – criteria
In general, attributes are empirical functions, i.e. intensions of a type (αβ)_τω. For instance, ‘the President of (something)’ denotes a (singular) attribute. Depending on the modal factor ω and time τ, the function in question associates the respective country with the unique individual playing the role of its President. But, for instance, George W. Bush might not have been the President of the USA (the modal dependence), and he has not always been and will soon not be the President (the temporal dependence; see footnote 10).

(TIL) Price / (τι)_τω; an empirical function associating an individual (of type ι) with a τ-number (its price). To obtain the price of a hotel hi, we have to execute an empirical procedure: λwλt [⁰Price_wt ⁰hi].
(EL@) The value of the attribute Price can be obtained, e.g., by the SQL query

SELECT Price FROM Hotel WHERE Hotel.Name = hi

or by using a crisp atomic role hotel_price.

(TIL) Distance / (τιι)_τω; an empirical function assigning a τ-number (the distance) to a pair of individuals, for instance: λwλt [⁰Distance_wt ⁰hi ⁰pi].
(TIL) DistE / (τι)_τω; the empirical function assigning to an individual a τ-number (its distance to another chosen entity E – a beach, a hotel, …).
(EL@) Database point of view: Assuming we have a schema Distance(Source, Target, Value), this is the value of the attribute Distance.Value. It can be obtained, e.g., by the SQL query

SELECT Distance.Value FROM Distance, Hotel WHERE Hotel.Name = hi AND Distance.Source = Hotel.Address AND Distance.Target = E

DL point of view: In DL we meet a problem here, because the relation Distance is of arity 3 and DL is a binary conceptual model. For each individual E we can consider an atomic role hotel_distance_from_E. (Of course, in practical applications we can combine these approaches.)

(TIL) Year / (τι)_τω; an empirical function assigning to an individual a τ-number (its year of building).
(EL@) The database and DL points of view are similar to the above.

(TIL) Appertain-to / (οιι)_τω; a binary relation between individuals. For example, a parking place pi belonging to a hotel hi:

λwλt [[⁰Parking_wt ⁰pi] ∧ [⁰Hotel_wt ⁰hi] ∧ [⁰Appertain-to_wt ⁰hi ⁰pi]].

(EL@) The relation between a particular hotel and a parking place is a crisp role.
10 Written in January 2007.
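The database point of view on the attributes Price and DistE can be tried out with an in-memory SQLite database. This is a sketch under assumptions: only the schema Distance(Source, Target, Value) is given in the text; the Hotel columns beyond Name and Address, the sample rows, and the helper names price and dist_e are hypothetical, and the hotel name and the entity E are passed as query parameters.

```python
import sqlite3

# In-memory database with an assumed Hotel table and the Distance schema from the text.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Hotel (Name TEXT PRIMARY KEY, Address TEXT, Price REAL);
    CREATE TABLE Distance (Source TEXT, Target TEXT, Value REAL);
    INSERT INTO Hotel VALUES ('h1', 'addr1', 80.0);
    INSERT INTO Hotel VALUES ('h2', 'addr2', 120.0);
    INSERT INTO Distance VALUES ('addr1', 'beach', 0.3);
    INSERT INTO Distance VALUES ('addr2', 'beach', 1.5);
    """
)


def price(hotel: str):
    """Crisp value of the attribute Price for a given hotel, or None if unknown."""
    row = conn.execute(
        "SELECT Price FROM Hotel WHERE Hotel.Name = ?", (hotel,)
    ).fetchone()
    return None if row is None else row[0]


def dist_e(hotel: str, entity: str):
    """Crisp distance of a hotel from the chosen entity E, or None if unknown."""
    row = conn.execute(
        "SELECT Distance.Value FROM Distance, Hotel "
        "WHERE Hotel.Name = ? AND Distance.Source = Hotel.Address "
        "AND Distance.Target = ?",
        (hotel, entity),
    ).fetchone()
    return None if row is None else row[0]
```

Both helpers return crisp values; any fuzziness enters only later, when a user scale is applied to them.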
4.4 Evaluation of criteria by combining user preferences
The procedural semantics of TIL makes it possible to easily model the way particular agents can learn by experience. An agent may begin with a small ontology of atomic primitive concepts (trivialisations of entities) and gradually obtain pieces of information on more detailed definitions of the entities. In TIL terminology each composed construction yielding an entity E is an ontological definition of E. For instance, the agent A may specify the property of being a Suitable (for A) hotel by restricting the property Hotel. To this end the property Suitable hotel is defined by a construction composing the price, distance and year attribute values and yielding a degree greater than 0.5.

(TIL): Suitable-for / ((οι)_τω ι (οι)_τω)_τω – an empirical (parameters τ, ω) function that, applied to an individual (of type ι) and a property (of type (οι)_τω), returns a property (of type (οι)_τω). For instance, the property of being a suitable hotel for the agent A can be defined by:

λwλt [⁰Suitable-for_wt ⁰A ⁰Hotel] = λwλt λx [[⁰Hotel_wt x] ∧ [[⁰Evaluate_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]] ≥ ⁰0.5]].

By way of further refinement, we can again define the atomic concept ⁰Evaluate. To this end we enrich the ontology by ⁰Aggregate and ⁰Apt-for, which can again be refined. And so on, theoretically ad infinitum.

Evaluate / (τ ι τττ)_τω – an empirical function that, applied to an individual a and a triple of τ-parameters (e.g., price, distance, year), returns a τ-number in [0,1], which is the preference degree of a particular hotel for the agent a:

[⁰Evaluate_wt a par1 par2 par3] = [⁰Aggregate [⁰Apt-for_wt a par1] [⁰Apt-for_wt a par2] [⁰Apt-for_wt a par3]].

Aggregate / (τ τττ) – the aggregation function that, applied to a triple of τ-numbers, returns a τ-number = the degree of appropriateness.
Apt-for / (τ ιτ)_τω – an empirical function that, applied to an individual a / ι and a τ-parameter par_i (e.g., price, distance, and the like), returns a preference scale of the respective parameter par_i for the user a. The scale is a τ-number in [0,1].

For instance,

[⁰Evaluate_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]] = [⁰Aggregate [⁰Apt-for_wt ⁰A [⁰Price_wt x]] [⁰Apt-for_wt ⁰A [⁰DistE_wt x]] [⁰Apt-for_wt ⁰A [⁰Year_wt x]]].

The empirical function Evaluate is the key function here. Applied to an individual agent (user) and particular criteria it returns the agent's preference degree of a particular object. Each agent may dynamically (parameter τ) choose (parameter ω) its own function Evaluate. The algorithm computing the preference degree of an object consists of two independent subprocedures:

i) the user preference scale Apt-for_wt of the TIL-type (τ ιτ), or using the EL@ notation: ⟨user, par_i⟩ → [0,1], where par_i is the value of a particular criterion (for instance price, distance, etc.). Here the additional role of EL@ comes into play. The EL@ logic makes it possible to choose an appropriate scale algorithm. It can be a specific function for a particular user U1.
ii) the aggregation function Aggregate of the TIL-type (τ τττ), or in the EL@ notation (understood as a many-valued connective) @ : [0,1]^3 → [0,1], computing the global preference degree. Here we consider the Aggregate function as not being user-dependent, but rather system-dependent (therefore, in TIL notation there is no τω-parameter). In other words, it is a system algorithm for computing the general user preference. Of course, we might let each user specify his/her/its own algorithm, but in practice it suffices to consider different user profiles associated with each aggregation function. Thus the system may test several aggregation algorithms, e.g., those that were used for users with a similar profile, in order to choose a suitable aggregation. It does not seem to be necessary to further refine the specification in TIL. Instead we either call a software module at this point, or make use of the EL@ logic. In the example above we used the weighted average.
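The two subprocedures i) and ii) can be sketched as a small scoring pipeline. Everything numeric below is an assumption made for illustration: the piecewise-linear scale shapes, their breakpoints, the weights (3, 2, 1) of the weighted average and the toy hotel data are hypothetical and are not the scales or weights of the original example. The threshold test uses ≥ 0.5, following the construction defining Suitable-for.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scale = Callable[[float], float]


def descending(best: float, worst: float) -> Scale:
    """Scale for 'smaller is better' criteria: 1 at or below best, 0 at or above worst."""
    def scale(v: float) -> float:
        if v <= best:
            return 1.0
        if v >= worst:
            return 0.0
        return (worst - v) / (worst - best)
    return scale


def ascending(worst: float, best: float) -> Scale:
    """Scale for 'larger is better' criteria: 0 at or below worst, 1 at or above best."""
    def scale(v: float) -> float:
        if v >= best:
            return 1.0
        if v <= worst:
            return 0.0
        return (v - worst) / (best - worst)
    return scale


# i) Apt-for: per-user preference scales (hypothetical shapes for the user A).
APT_FOR: Dict[str, Dict[str, Scale]] = {
    "A": {
        "price": descending(50.0, 150.0),
        "dist": descending(0.2, 2.0),
        "year": ascending(1950.0, 2000.0),
    }
}


# ii) Aggregate: a system-level weighted average (the weights are an assumption).
def aggregate(degrees: Sequence[float], weights: Sequence[float] = (3, 2, 1)) -> float:
    return sum(w * d for w, d in zip(weights, degrees)) / sum(weights)


def evaluate(agent: str, price: float, dist: float, year: float) -> float:
    """Counterpart of [⁰Evaluate_wt a par1 par2 par3]: scale each criterion, then aggregate."""
    s = APT_FOR[agent]
    return aggregate((s["price"](price), s["dist"](dist), s["year"](year)))


def suitable_for(agent: str, hotels: Dict[str, Tuple[float, float, float]]) -> List[Tuple[str, float]]:
    """Hotels whose degree is at least 0.5, ordered from the best one."""
    scored = [(name, evaluate(agent, *attrs)) for name, attrs in hotels.items()]
    return sorted((p for p in scored if p[1] >= 0.5), key=lambda p: p[1], reverse=True)


# Hypothetical data: name -> (price, distance to the beach, year of building).
HOTELS = {"h1": (80.0, 0.5, 1990.0), "h2": (140.0, 1.8, 1960.0), "h3": (100.0, 0.2, 2005.0)}
ranking = suitable_for("A", HOTELS)
```

Keeping APT_FOR per user and aggregate global mirrors the design choice in the text: scales are user-dependent, while the aggregation is treated as system-dependent and could be swapped per user profile.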
4.5 Communication of agents; messages
The communication aspects are not elaborated in EL@ from the semantic point of view. Hence this is the added value of TIL when integrating with EL@. However, in SQL we have the ORDER BY command, and when dealing with preferences we work with the notion of the best and the top-k answers, respectively. The EL@ many-valued logic setting, understood as a comparative tool (numerical values do not matter), is an appropriate tool for evaluating fuzzy predicates. It provides a good semantics for ordering preferences of answers (see [13]).

The TIL philosophy is driven by the fact that natural language is a perfect logical language. Hence the TIL specification is close to ordinary human reasoning and natural-language communication. On the other hand, however, the high expressive power of the TIL language of constructions may sometimes be an obstacle to an effective implementation. This problem is dealt with by the step-by-step refinement discussed above. At the first step we specify just a coarse-grained logical form of a message; the execution is left to particular Java modules. Then a more fine-grained specification makes it possible to increase the agent's “intelligence” by letting him dynamically decide which finer software modules should be called. To this end we combine Java modules, Prolog, fuzzy Prolog Ciao, etc.

(TIL): The general scheme of a message is: Message / (ο ι ι ο_τω)_τω

λwλt [⁰Message_wt ⁰Who ⁰Whom λwλt [⁰Type_wt ⁰What]], where
Who / ι, Whom / ι, What (content) / α_τω, Type / (οα_τω)_τω; What, the subject of the message, is a specification of an intension (usually a proposition of type ο_τω).

(EL@): The description logic does not incorporate a specific semantic logical description of messages. It is usually handled by an implementation component (generally in the Software Engineering part) by dealing with exceptions, deadlocks, etc.

(TIL): There are three basic types of messages that concern propositions; i.e., these types are properties of propositions, namely Order, Inform, Query / (οο_τω)_τω. In an ordinary communication act we implicitly use the type Inform, affirming that the respective proposition is true. But using an interrogative sentence we ask whether the proposition is true (Query), or using an imperative sentence we wish that the proposition were true (Order). The content of a message is then the construction of a proposition, the scheme of which is given by:

λwλt [⁰Type_wt ⁰What] → ο_τω.

In what follows we specify in more detail the typical types of messages. Type = {Seek, Query (Yes-No), Answer, Order, Inform, Unrecognised, Refine, …}, where Type_i / (οα_τω)_τω or Type_i / (ο*n)_τω. Examples of the content of a message:

[⁰Seek_wt ⁰What]; What / α_τω → send me an answer = the actual α-value of What in a given state of affairs w, t of evaluation.
[⁰Query_wt ⁰What]; What / ο_τω → send me an answer = the actual ο-truth-value of What in a given state of affairs w, t.
[⁰Order_wt ⁰What]; What / ο_τω → manage What to be actually True (in a state of affairs w, t).
[⁰Inform_wt ⁰What]; What / ο_τω → informing that What is actually True.
[[⁰Answer_wt ⁰What] = a / α]; where a = ⁰What_wt; the answer to a preceding query or seek.
[⁰Unrecognised_wt ⁰⁰What]; the atomic concept ⁰What has not been recognised; a request for refinement. Note that Unrecognised is of type (ο*n)_τω, the property of a construction (usually an atomic concept). Therefore the content of the message is not the intension What constructed by ⁰What, but the construction ⁰What itself. The latter is mentioned here by trivialisation, therefore ⁰⁰What.
[[⁰Refine_wt ⁰⁰What] = ⁰C], C → α_τω; an answer to the message on an unrecognised atomic concept. The construction C is the respective composed specification (definition) of What, i.e., C and ⁰What are equivalent, they construct the same entity: C = ⁰What.

For instance, the set of prime numbers can be defined as the set of numbers with two factors: [[⁰Refine_wt ⁰⁰Prime] = ⁰[λx [[⁰Card λy [⁰Div x y]] = ⁰2]]], where x, y → Nat (the type of natural numbers), Div / (ο Nat Nat) – the relation of being divisible by, Card / (Nat (ο Nat)) – the number of elements of a set.
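The message scheme and the Prime refinement can be illustrated with a small data structure. This is a sketch under assumptions: the names MessageType, Message, card, div and prime are ours, the content What is held as an arbitrary Python object rather than a TIL construction, and the cardinality of the set of divisors is computed over the finite range 1..x.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Iterable


class MessageType(Enum):
    SEEK = "Seek"
    QUERY = "Query"
    ANSWER = "Answer"
    ORDER = "Order"
    INFORM = "Inform"
    UNRECOGNISED = "Unrecognised"
    REFINE = "Refine"


@dataclass(frozen=True)
class Message:
    """Counterpart of [⁰Message_wt ⁰Who ⁰Whom λwλt [⁰Type_wt ⁰What]]."""
    who: str
    whom: str
    type: MessageType
    what: Any


def card(pred: Callable[[int], bool], universe: Iterable[int]) -> int:
    """Card: the number of elements of the set given by its characteristic function."""
    return sum(1 for y in universe if pred(y))


def div(x: int, y: int) -> bool:
    """Div: x is divisible by y."""
    return y != 0 and x % y == 0


def prime(x: int) -> bool:
    """The refined concept: numbers with exactly two factors."""
    return card(lambda y: div(x, y), range(1, x + 1)) == 2


# B reports an unknown atomic concept; A answers with its composed definition.
m_unrecognised = Message("B", "A", MessageType.UNRECOGNISED, "Prime")
m_refine = Message("A", "B", MessageType.REFINE, ("Prime", prime))
```

The Refine message carries both the name of the atomic concept and its definition, so the receiver can store the pair in its ontology and use the definition from then on.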
4.6 Example of communication
Now we continue the simple example from the outset. We will analyse a part of the dialogue of the three agents A, B, C. Sentences will first be written in ordinary English, then analysed using TIL, transformed into the standardised message and, if needed, provided with a gloss. For the sake of simplicity we will omit the specification of the TIL-types of particular objects contained in a message. However, since the TIL-type is an inseparable part of the respective TIL-construction, we do not omit it in a real communication of agents. For instance, when building an agent's ontology, each concept is inserted with its typing.

Message 1 (A to B): ‘I wish B to seek a suitable hotel for me.’
A(TIL):
λwλt [⁰Wish_wt ⁰A λwλt [⁰Seek_wt ⁰B [⁰Suitable-for_wt ⁰A ⁰Hotel]]], Wish / (οιο_τω)_τω, Seek / (οι(οι)_τω)_τω
A(TIL m1):
λwλt [⁰Message_wt ⁰A ⁰B λwλt [⁰Seek_wt [⁰Suitable-for_wt ⁰A ⁰Hotel]]]
Gloss: The agent A is sending a message to B asking him to seek a suitable hotel for A.
Message 2 (B to A): However, the agent B does not understand the sub-instruction [⁰Suitable-for_wt ⁰A ⁰Hotel], because he does not have the atomic concept ⁰Suitable-for in his ontology. Therefore, he replies with a message to A, asking for an explanation: ‘I did not recognise ⁰Suitable-for.’
B(TIL m2):
λwλt [⁰Message_wt ⁰B ⁰A λwλt [⁰Unrecognised_wt ⁰⁰Suitable-for]]

Remark. Thus the lower index wt can be understood as an instruction to execute an empirical inquiry (search) in order to obtain the actual current value of an intension, here the property of being a suitable hotel (for instance by searching the agent's database, by asking the other agents, or even by means of the agent's sense perception). The upper index 0 serves as a marker of a primitive (atomic) concept belonging to the agent's ontology. If it does not, i.e., if the agent does not know the concept, he has to ask the others in order to learn by experience.

Message 3 (A to B): The agent A replies by specifying the restriction of the property Hotel to those hotels which are evaluated with respect to price, distance and the year of building with a degree higher than 0.5:
A(TIL):
Suitable-for / ((οι)_τω ι (οι)_τω)_τω, a → ι, p → (οι)_τω;
⁰Suitable-for = λwλt λap λwλt λx [[p_wt x] ∧ [[⁰Evaluateh_wt a [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]] ≥ ⁰0.5]].

Gloss: A's answer message should refine the atomic concept ⁰Suitable-for. Now there is a problem, however. The agent B would have to remember the respective message asking for the refinement in order to apply the property to the proper arguments (namely A and Hotel). This would not be plausible in practice, because it is A who asks for help, not B. Therefore the answer message contains the smallest constituent containing the refined concept:
A(TIL m3):
λwλt [⁰Message_wt ⁰A ⁰B λwλt [[⁰Refine_wt ⁰[⁰Suitable-for_wt ⁰A ⁰Hotel]] = ⁰[λwλt λx [[⁰Hotel_wt x] ∧ [[⁰Evaluateh_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]] ≥ ⁰0.5]]]]]
In this way the agent B obtains a piece of knowledge of what he should look for. Another possibility would be for A to send the original message 1 refined, i.e., with the constituent [⁰Suitable-for_wt ⁰A ⁰Hotel] replaced by the new specification:
A(TIL m3’):
λwλt [⁰Message_wt ⁰A ⁰B λwλt [⁰Seek_wt λwλt λx [[⁰Hotel_wt x] ∧ [[⁰Evaluateh_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]] ≥ ⁰0.5]]]]

However, we prefer the former, because in this way B learns what a suitable hotel for A means. Or rather, he would learn it if he understood ⁰Evaluateh, which may not be the case if he receives the request for the first time. Thus if B does not have the concept in his ontology, he again sends a message asking for an explanation:

Message 4 (B to A):
B:
‘I did not recognise ⁰Evaluateh.’
B(TIL m4):
λwλt [⁰Message_wt ⁰B ⁰A λwλt [⁰Unrecognised_wt ⁰⁰Evaluateh]]

Message 5 (A to B):
A(TIL m5):
λwλt [⁰Message_wt ⁰A ⁰B λwλt [[⁰Refine_wt ⁰[⁰Evaluateh_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x] [⁰Year_wt x]]] = ⁰[⁰Aggregate [⁰Apt-for_wt ⁰A [⁰Price_wt x]] [⁰Apt-for_wt ⁰A [⁰DistE_wt x]] [⁰Apt-for_wt ⁰A [⁰Year_wt x]]]]]

And so on; the refinement may continue and the agents may learn new concepts (from the theoretical point of view ad infinitum). Anyway, finally B fully understands the message and attempts to fulfil the task; recall that he is to seek a suitable hotel for A. Note that the whole process is dynamic, including the agents' learning by refining particular atomic concepts. B now knows that actually and currently a hotel suitable for A is a hotel whose price, distance from the beach and year of building evaluate with respect to A's scaling [⁰Apt-for_wt ⁰A] with a degree higher than 0.5. But he also knows that it might have been otherwise (the modal parameter w / ω) and that it will not always have to be so (the temporal parameter t / τ). In other words, A and B now share common knowledge of the composed concept defining the property of being a suitable hotel for A. When B eventually accomplishes his search, he sends an answer to A:

Message 6 (B to A):
B(TIL m6):
λwλt [⁰Message_wt ⁰B ⁰A λwλt [[⁰Answer_wt [⁰Suitable-for_wt ⁰A ⁰Hotel]] = {⟨h1, 0.7⟩, ⟨h5, 0.53⟩}]] (see footnote 11)

Gloss: B found out that there are two instances of the property v-constructed by the construction [⁰Suitable-for_wt ⁰A ⁰Hotel], namely the hotel h1 that has been evaluated with the degree 0.7 and h5 with the degree 0.53. Since h1 has been evaluated as better than h5, A chooses the former.

At this point the communication can continue as a dialogue between A and C in a similar way as above. The aim is now to find a suitable parking place close to the chosen hotel h1 and then to ask for navigation to the chosen parking place:

λwλt [⁰Message_wt ⁰A ⁰C λwλt [⁰Seek_wt [⁰Suitablep_wt ⁰A ⁰Parking]]]

11 Here we use the classical set-theoretic notation without trivialisation, for the sake of simplicity.
λwλt [⁰Message_wt ⁰C ⁰A λwλt [⁰Unrecognised_wt ⁰⁰Suitablep]]

λwλt [⁰Message_wt ⁰A ⁰C λwλt [[⁰Refine_wt ⁰[⁰Suitablep_wt ⁰A ⁰Parking]] = ⁰[λwλt λx [[⁰Parking_wt x] ∧ [[⁰Evaluatep_wt ⁰A [⁰Price_wt x] [⁰DistE_wt x]] ≥ ⁰0.5]]]]]

λwλt [⁰Message_wt ⁰C ⁰A λwλt [[⁰Answer_wt [⁰Suitablep_wt ⁰A ⁰Parking]] = {⟨p2, 0.93⟩, ⟨p1, 0.53⟩}]]

The message closing the dialogue might be sent from A to C:

λwλt [⁰Message_wt ⁰A ⁰C λwλt [⁰Order_wt λwλt [⁰Navigate-to_wt ⁰p2]]].

At this point the agent C must have ⁰Navigate-to in his/her ontology (if he/she does not, then the learning process described above begins); C thus knows that he/she has to call another agent D, which is a GIS-agent that provides navigation facilities (see [6]).

Concluding this section we again compare the TIL approach with EL@. An analogy to the means of communication described above can be found in the DL community. There are heuristics for the top-k search (see [13]). However, these facilities lack any formal / logical / semantic specification. The development of description logic and its variants can be considered a step towards the development of languages which extend W3C standards. In [9] a step in this direction is described. In particular, the EL@ variant of description logic can be embedded into classical two-valued description logic with concrete domains (see [1]), and thus also into OWL (or a slight extension of it). Using the results described in this paper, especially the added value of TIL, we can expect an extension of the W3C-based specification of web service languages using the OWL representation.
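The Seek / Unrecognised / Refine / Answer loop of the dialogue can be simulated in a few lines. This is only a sketch of the protocol under our own assumptions: the classes Msg and Agent and the function dialogue are hypothetical names, a concept definition is modelled as a Python function from an object's attributes to a degree in [0,1], and the degrees are precomputed toy values (0.7 and 0.53 are taken from the answer message above, the value for h2 is invented).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Msg:
    who: str
    whom: str
    type: str      # "Seek", "Unrecognised", "Refine" or "Answer"
    what: object


@dataclass
class Agent:
    name: str
    # Ontology: concept name -> definition (attributes -> degree in [0, 1]).
    ontology: Dict[str, Callable[[dict], float]] = field(default_factory=dict)

    def handle_seek(self, msg: Msg, data: Dict[str, dict]) -> Msg:
        concept = msg.what
        if concept not in self.ontology:
            # The atomic concept is unknown: ask the sender to refine it.
            return Msg(self.name, msg.who, "Unrecognised", concept)
        scored = [(x, self.ontology[concept](attrs)) for x, attrs in data.items()]
        hits = sorted((p for p in scored if p[1] >= 0.5), key=lambda p: -p[1])
        return Msg(self.name, msg.who, "Answer", hits)

    def handle_unrecognised(self, msg: Msg) -> Msg:
        # Reply with the concept together with its definition.
        return Msg(self.name, msg.who, "Refine", (msg.what, self.ontology[msg.what]))

    def handle_refine(self, msg: Msg) -> None:
        concept, definition = msg.what
        self.ontology[concept] = definition   # learning by experience


def dialogue(a: Agent, b: Agent, concept: str, data: Dict[str, dict]) -> List[Msg]:
    """Run the exchange until b can answer; return the log of messages."""
    seek = Msg(a.name, b.name, "Seek", concept)
    log = [seek]
    reply = b.handle_seek(seek, data)
    log.append(reply)
    while reply.type == "Unrecognised":
        refine = a.handle_unrecognised(reply)
        log.append(refine)
        b.handle_refine(refine)
        reply = b.handle_seek(seek, data)
        log.append(reply)
    return log


# Toy setting: A knows the concept, B initially does not.
agent_a = Agent("A", {"Suitable-for-A-Hotel": lambda attrs: attrs["degree"]})
agent_b = Agent("B")
hotels = {"h1": {"degree": 0.7}, "h2": {"degree": 0.2}, "h5": {"degree": 0.53}}
log = dialogue(agent_a, agent_b, "Suitable-for-A-Hotel", hotels)
```

After the run B keeps the refined concept in its ontology, so a second Seek for the same concept is answered without any further Unrecognised message.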
5. Conclusion: A hybrid system
In the previous chapters, especially by using the parallel description of our motivating example in Chapter 4, we tried to show that TIL and EL@ have many features in common. Both systems can share some basic types, functions, concepts and roles; both systems distinguish extensional and intensional contexts (the former being modelled by the intensional descent in TIL and by A-Boxes in DL, the latter illustrated here by the (user) definition or specification of a multi-criterion search). These features can form the intersection TIE@L. On the other hand, both systems can be enhanced by accommodating features of the other system, thus forming a union TI+E@L. The main contribution of EL@ is the method of modelling multi-criterion aspects of user preferences (some heuristics have been tested in separate works), and computing global user preferences by means of aggregation functions and scaling. TIL contributes to this union the method of a very fine-grained and rigorous knowledge specification close to natural language, including procedural hyperintensional semantics. We are convinced that these aspects are crucial for smooth communication and reasoning of agents in the multi-agent world.

Artificial Intelligence is sometimes characterised as a ‘struggle for consistency’. To put it slightly metaphorically, reality is consistent. Only our ‘making it explicit’ in language may lead to paradoxes and inconsistencies due to misinterpretations that are caused by a too coarse-grained analysis of assumptions. The specification of the formal model of the hybrid system is, however, still a subject of further research. Currently we plan to perform experiments and tests on real data using the hints described in Chapter 4.
In the team led by M. Duží, working on the project “Logic and Artificial Intelligence for multi-agent systems” (see http://labis.vsb.cz/), we pursue research on multi-agent systems based on TIL. We have so far implemented software modules simulating the behaviour of mobile agents in a traffic system. The agents can choose particular realisations of their predefined processes; moreover, they are able to dynamically adjust their behaviour depending on changing states of affairs in the environment. They communicate by a message-exchange system. To this end the TIL-Script language (see [14]) has been designed and is currently being implemented. We also plan to test some modules with EL@ features.

The project in which P. Vojtáš is involved (see [17]) deals with theoretical models compatible with W3C standards and experimental testing of multi-criterion search dependent on user preferences. We believe that the TIL features will enhance the system with a rigorous semantic description and specification of the software / implementation parts.

When pursuing the research we soon came to the conclusion that the area of the semantic web and the multi-agent world in general is so broad that it is almost impossible to create a universal development method. Instead we decided to develop a methodology comprising and integrating particular existing and/or newly developed methods as well as our fine-grained rigorous logic. The paper is an introductory study aiming at a more universal logical approach to the ‘multi-agent world’, which at the same time opens new research problems and trends. The main challenges are formal measures (soundness and completeness) and implementation measures of the integrated hybrid system.

ACKNOWLEDGEMENTS
This work has been supported by the project No. 1ET101940420 “Logic and Artificial Intelligence for multi-agent systems” within the program “Information Society” of the Czech Academy of Sciences, and by the “Semantic Web” project No. 1ET100300419 of the Czech IT agency.
REFERENCES
[1] Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.) (2002): Description Logic Handbook. Cambridge University Press.
[2] Brandt, S. (2004): Polynomial Time Reasoning in a Description Logic with Existential Restrictions, GCI Axioms, and What Else? In Proc. ECAI-2004, R. López de Mántaras et al. (eds.), pp. 298-302. IOS Press.
[3] Church, A. (1956): Introduction to Mathematical Logic I. Princeton.
[4] Cresswell, M.J. (1985): Structured Meanings. MIT Press, Cambridge, Mass.
[5] Duží, M. (2004): Concepts, Language and Ontologies (from the logical point of view). In Information Modelling and Knowledge Bases XV. Y. Kiyoki, H. Kangassalo, E. Kawaguchi (eds.), IOS Press Amsterdam, Vol. XV, 193-209.
[6] Duží, M., Ďuráková, D., Děrgel, P., Gajdoš, P., Müller, J. (2007): Logic & Artificial Intelligence for Multi-Agent Systems. In Information Modelling and Knowledge Bases XVIII. M. Duží, H. Jaakkola, Y. Kiyoki, H. Kangassalo (eds.), IOS Press Amsterdam, 236-244.
[7] Duží, M., Heimburger, A. (2006): Web Ontology Languages: Theory and practice, will they ever meet? In Information Modelling and Knowledge Bases XVII. Y. Kiyoki, J. Henno, H. Jaakkola, H. Kangassalo (eds.), IOS Press Amsterdam, Vol. XVII, 20-37.
[8] Duží, M., Jespersen, B., Müller, J. (2005): Epistemic Closure and Inferable Knowledge. In The Logica Yearbook 2004. Libor Běhounek, Marta Bílková (eds.), Filosofia Praha, Vol. 2004, 1-15.
[9] Eckhardt, A., Pokorný, J., Vojtáš, P. (2006): Integrating user and group preferences for top-k search from distributed web resources. Technical report, 2006.
[10] Fagin, R. (1999): Combining fuzzy information from multiple systems. Journal of Comput. System Sci. 58, 83-99.
[11] Feferman, S. (1995): ‘Definedness’. Erkenntnis 43, pp. 295-320.
[12] Gamut, L.T.F. (1991): Logic, Language and Meaning. Volume II. Intensional Logic and Logical Grammar. The University of Chicago Press, Chicago, London.
[13] Gurský, P., Lencses, R., Vojtáš, P. (2005): Algorithms for user dependent integration of ranked distributed information. In TCGOV 2005 Poster Proceedings, M. Boehlen et al. (eds.), IFIP – Universitaetsverlag Rudolf Trauner, Laxenburg, ISBN 3-85487-787-0, pp. 123-130.
[14] The TIL-Script language description is available at the VSB-TIL homepage: http://www.cs.vsb.cz/TIL
[15] Pokorný, J., Vojtáš, P. (2001): A data model for flexible querying. In Proc. ADBIS'01, A. Caplinskas and J. Eder (eds.), Lecture Notes in Computer Science 2151, Springer Verlag, Berlin, 280-293.
[16] Rescher, N. (2002): ‘Epistemic logic’. In A Companion to Philosophical Logic, D. Jacquette (ed.), Oxford: Blackwell, pp. 478-91.
[17] ‘Semantic Web’ project of the Czech IT agency, 1ET100300419.
[18] Straccia, U. (2001): Reasoning within Fuzzy Description Logics. Journal of Artificial Intelligence Research 14, pp. 137-166.
[19] Tichý, P. (1988): The Foundations of Frege’s Logic. De Gruyter.
[20] Tichý, P. (2004): Pavel Tichý’s Collected Papers in Logic and Philosophy. Svoboda, V., Jespersen, B., Cheyne, C. (eds.), Filosofia Prague and University of Otago Press.
[21] Vojtáš, P. (2006): EL Description Logics with Aggregation of User Preference Concepts. In Information Modelling and Knowledge Bases XVIII. M. Duží, H. Jaakkola, Y. Kiyoki, H. Kangassalo (eds.), IOS Press Amsterdam, 154-165.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
A Semantic Space Creation Method with an Adaptive Axis Adjustment Mechanism for Media Data Retrieval

Xing Chen 1, Yasushi Kiyoki 2, Kosuke Takano 3 and Keisuke Masuda 4

1,3 Department of Information & Computer Sciences, Kanagawa Institute of Technology, 1030 Simo-Ogino, Atsugi-shi, Kanagawa 243-0292, Japan
2 Department of Environmental Information, Keio University, Fujisawa, Kanagawa 252-8520, Japan
4 Graduate Courses of Information & Computer Sciences, Kanagawa Institute of Technology, 1030 Simo-Ogino, Atsugi-shi, Kanagawa 243-0292, Japan

Abstract. This paper presents a new semantic space creation method with an adaptive axis adjustment mechanism for media data retrieval. The semantic space is essentially required to search for semantically related and appropriate information resources in media databases. In the method, data in the media databases are mapped as vectorized metadata onto the semantic space. The distribution of the metadata on the semantic space is the main factor affecting the accuracy of the retrieval results. In the method, an adaptive axis adjustment mechanism is used to rotate and combine the semantically correlated axes of the semantic space, and to remove axes from the semantic space. We demonstrated by experiments that when the semantic space is created and adjusted based on the semantically correlated factors, the metadata are appropriately and sharply distributed on the semantic space.
1. Introduction
Large numbers of heterogeneous databases are spreading across wide-area computer network environments to meet the growing needs of their users. People have opportunities to obtain significant information from those heterogeneous databases through the wide-area computer network [1], [2], [7]. However, it is still difficult for users to extract appropriate information without knowledge of the contents and structures of those databases. The development of sophisticated retrieval methods for
X. Chen et al. / A Semantic Space Creation Method with an Adaptive Axis Adjustment Mechanism
realizing an intelligent multimedia database environment is an important issue in the database research field. Semantic information retrieval methods [3], [5], [6] have been proposed for realizing intelligent information retrieval. In these methods, computing semantic relationships is the essential function for extracting semantically related and appropriate information resources from databases [8], [9], [11], [12].
We have proposed two fundamental frameworks for computing semantic relationships between retrieval candidates and queries in multimedia database environments [3], [4], [6], [8]. In these methods, semantic expression vectors [8] are used as metadata of media data (retrieval-candidate media data) to express attributes, contents and impressions of media data. We have also proposed several fundamental methods and systems for extracting semantically correlated factors from data resources [3], [4].
One of the important issues in semantic associative retrieval is selecting appropriate media data according to the requirements of queries. The key to selecting the appropriate media data is that the metadata are appropriately distributed on the semantic space. A learning mechanism is one way to adjust the metadata distribution on the semantic space [3].
In this paper, we present a method to create a well-structured semantic space on which vectorized metadata are appropriately and sharply distributed. In the method, a vector space is created based on the factors of the metadata. The basic idea of the method is to rotate and combine the axes of the space that correlate to the same semantic factors, and to remove from the space the axes that reduce the precision of the retrieval results. An adaptive axis adjustment mechanism implements the ‘rotating’, ‘combining’ and ‘removing’ operations.
Sets of objects in the databases are given as the training data for extracting the semantically correlated factors from the metadata. When the axes of the vector space are adjusted by the adjustment mechanism based on the semantically correlated factors, a well-structured semantic space is created. On this space, metadata are appropriately and sharply distributed. Related work and the issues in creating a semantic space are reviewed in Section 2. In Section 3, we present the new semantic space creation method for realizing appropriate and precise semantic associative retrieval. Several experimental results are presented in Section 4 to clarify the feasibility and effectiveness of the proposed
method.
2. Related works and issues about the semantic space
In this section, we first review a simple model, the vector space model (VSM) [14], [15]. We then review Latent Semantic Indexing (LSI) [5], a method used to create the semantic space. We also present the issues that arise when spaces created with these methods are used for information retrieval.
2.1 The vector space model
In the vector space model (VSM), retrieval candidates and queries are modeled as vectors of a vector space. If an object (a retrieval candidate or a query) is represented by n features, for example, n index terms, it is assumed that each feature is a vector of unit length and that each feature (unit vector) is weighted by its importance. All objects on the vector space are represented as vectors that are linear combinations of weighted features. For example, if a document is represented by n index terms, each term is assumed to be a unit-length vector and the document is represented as a vector of weighted terms. A common weighting scheme for terms within a document is the frequency of occurrence of the terms in the document. Retrieval processing in vector space models is performed by determining the similarity between the query and the retrieval candidates. Associative coefficients based on the inner product of the retrieval candidate vectors and the query vector are used to determine the similarity. The most popular similarity measure is the cosine coefficient, which measures the angle between the retrieval candidate vector and the query vector; the retrieval candidates are ranked in decreasing order of this measure.
2.2 The issue of the VSM on information retrieval
The standard vector space model assumes that the features of retrieval candidates are not correlated, that is, they are pair-wise orthogonal. If index terms are used to represent documents as vectors, the index terms are assumed to be pair-wise orthogonal.
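The retrieval processing described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the index terms, the three toy documents and the function names `tf_vector`, `cosine` and `rank` are hypothetical and do not come from the paper, and term frequency is used as the weighting scheme.

```python
import math
from collections import Counter

def tf_vector(text, terms):
    """Weight each index term by its frequency of occurrence in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[t]) for t in terms]

def cosine(a, b):
    """Cosine coefficient: inner product divided by the product of the vector lengths."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def rank(query, docs, terms):
    """Return (score, document index) pairs in decreasing order of cosine similarity."""
    q = tf_vector(query, terms)
    scored = [(cosine(q, tf_vector(d, terms)), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda s: -s[0])

# Hypothetical index terms and documents, used only for illustration.
terms = ["robot", "automaton", "image"]
docs = ["robot robot image", "automaton automaton", "image image robot"]
ranking = rank("robot", docs, terms)
```

In this toy run the document that uses only "automaton" receives a cosine of zero for the query "robot", because the two index terms are treated as orthogonal axes even if they are meant as synonyms.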
The issue of the vector space created by index terms for information retrieval is explained by Example 1.
Example 1. Suppose documents are represented by two index terms, t1 and t2. A vector space with two dimensions is created and all documents are represented as two-dimensional vectors. Each document vector di can be expressed as:
$$d_i = d_{i1} t_1 + d_{i2} t_2, \quad (i = 1, \ldots, 6).$$
In Figure 1, let t1 and t2 represent the term basis vectors, that is, t1 and t2 are orthogonal. Let t1 and $t_2^{(r)}$ represent the term vectors in the case that they are not orthogonal. In the first case, where t1 and t2 are orthogonal, a document is represented as a vector d; in the second case, where t1 and $t_2^{(r)}$ are not orthogonal, the document is represented as the vector d(r).
Fig. 1. Two-dimensional vector space in the cases that (1) t1 and t2 are orthogonal and (2) they are not orthogonal. (Labels in the figure: axes t1, t2 and $t_2^{(r)}$; document vectors d and d(r); components d11 and d12; differences δ1 and δ2.)
As shown in Figure 1, if the index term vectors t1 and t2 are not orthogonal, the components of the document vector differ from those obtained when t1 and t2 are orthogonal. The differences can be written as follows:
$$d^{(r)} = (d_{11} + \delta_1) t_1 + (d_{12} - \delta_2) t_2,$$
where δ1 is the increase of the document vector's component on the t1 vector and δ2 is the decrease of its component on the t2 vector.
Consider the simplified situation in which, when the index terms t1 and t2 represent the same meaning, we define t1 = t2 as shown in Figure 2(a), and when the index terms differ in meaning, we define t1 and t2 to be orthogonal as shown in Figure 2(b). Let document d1 be represented by index term t1 and document d2 by index term t2. We have
$$d_1 = d_{11} t_1, \qquad d_2 = d_{22} t_2.$$
In the vector space model, the vectors t1 and t2 are assumed to be orthogonal and each of them is normalized. Therefore, a 2-dimensional space is represented with t1 and t2 as the orthonormal basis. When t1 and t2 represent different meanings, as shown in Figure 2(b), this assumption is acceptable. When t1 and t2 represent the same meaning, however, they should be represented as overlapping vectors, yet the model still represents them as orthogonal vectors. If the index term t1 is then used as the keyword in a query, the retrieval candidates most closely related to the query are not selected. The query can be represented as:
$$q = q_1 t_1.$$
The cosine measure, which is used as the ranking function to measure the similarity between the query and the retrieval candidate vectors, can be defined as
$$\cos(d_i, q) = \frac{d_i' \times q}{\sqrt{d_i' \times d_i}\,\sqrt{q' \times q}}.$$
When a query vector and a retrieval candidate vector are orthogonal, their scalar product is zero and so is their cosine value. As shown in Figure 2(b), when t2 is represented orthogonally to t1, the scalar product of q and d2 is zero; therefore, d2 is not selected. This can be written as:
$$\cos(q, d_1) = 1, \qquad \cos(q, d_2) = 0.$$
When t1 and t2 represent the same meaning, they are required to be represented as overlapping vectors, as shown in Figure 2(a), because basis vectors are either pair-wise orthogonal or overlapped.
Fig. 2. (a) The situation that index terms t1 and t2 represent a same meaning. (b) The situation that index terms t1 and t2 represent different meanings. Two documents are represented as two vectors d1 and d2. In the situation (a) they overlap with each other and in the situation (b), they are orthogonal.
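The two cosine values quoted for Example 1 follow directly from the definitions of q, d1 and d2. A short worked derivation is given here, assuming positive weights q1, d11, d22 > 0 (the positivity is an assumption made only to fix the sign) and orthonormal t1, t2 as in Figure 2(b):

```latex
\begin{aligned}
\cos(q, d_1) &= \frac{(q_1 t_1)'(d_{11} t_1)}
                     {\sqrt{(q_1 t_1)'(q_1 t_1)}\,\sqrt{(d_{11} t_1)'(d_{11} t_1)}}
              = \frac{q_1 d_{11}}{q_1 d_{11}} = 1, \\
\cos(q, d_2) &= \frac{q_1 d_{22}\,(t_1' t_2)}{q_1 d_{22}} = 0
              \quad \text{since } t_1' t_2 = 0 .
\end{aligned}
```

In the situation of Figure 2(a), where t1 = t2, the same computation gives $t_1' t_2 = 1$ and hence $\cos(q, d_2) = 1$, so d2 would be selected.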
2.3 Creating the semantic space
As explained with Figure 1, when a basis vector semantically correlates with other basis vectors of the vector space, these basis vectors must be rotated in order to obtain the correct distribution of media data on the space. We refer to the vector space on which media data are mapped according to semantic factors as the semantic vector space, or simply, the semantic space. Methods for creating semantic spaces have been proposed [3], [5], [6]. The basic idea of these methods is illustrated in Example 2.
Example 2. Consider a two-dimensional vector space with five document vectors. Each document is represented by two index terms t1 and t2:
$$d_i = d_{i1} t_1 + d_{i2} t_2.$$
Let D be a document-term matrix defined as:
$$D = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_5 \end{bmatrix}
    = \begin{bmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \\ \vdots & \vdots \\ d_{51} & d_{52} \end{bmatrix}.$$
Let R be a matrix with the property $R \times R' = I$, where R' is the transposed matrix of R and I is the identity matrix. When R is a two-row and two-column matrix, a projection matrix P can be calculated by the following equation:
$$P = D \times R.$$
Each column of R is a vector:
$$R = \begin{bmatrix} r_1 & r_2 \end{bmatrix}.$$
Each row of P is a vector representing a document on the vector space with the basis of r1 and r2:
$$P = \begin{bmatrix} d_1^{(r)} \\ d_2^{(r)} \\ \vdots \\ d_5^{(r)} \end{bmatrix},$$
where
$$d_i^{(r)} = \begin{bmatrix} d_{i1} & d_{i2} \end{bmatrix} \times R.$$
It is expected that the document vectors on the space with the basis of R are in the correct locations. Singular Value Decomposition (SVD) is one of the methods for rotating basis vectors. In the following, we use F and R to denote the vector space-F and the vector space-R, respectively. We use D(f) to denote the metadata on the space-F and D(r) to denote the metadata on the space-R. An axis of the space-F is denoted fi and an axis of the space-R is denoted ri. When SVD is applied to the matrix D(f), an orthogonal space R is obtained:
$$D^{(f)} = L S R', \qquad D^{(f)} R = L S = D^{(r)},$$
where S is a diagonal matrix that contains the singular values, L and R are the left and right matrices of the matrix S, and R' is the transposed matrix of R. The diagonal matrix S has the following properties:
$$S' = S, \qquad S S^{-1} = I,$$
where I is the identity matrix. Both the matrices L and R have orthonormal columns, that is,
$$R'R = RR' = I, \qquad L'L = I.$$
Therefore,
$$D^{(f)} \times R = L S R' \times R = L S = D^{(r)},$$
that is, the metadata on the space F are projected onto the space R.
2.4 The issue of the space created by SVD for information retrieval
As mentioned above, the basis vectors of R are the rotated basis vectors of the original space F. Because RR' = I, two vectors q and d that are orthogonal on the space F are also orthogonal on the space R, as we demonstrate below. That is, the required vector d cannot be selected merely by rotating the basis vectors. The distribution of metadata on the vector spaces F and R created by SVD is illustrated in Example 3.
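Example 3. Consider a matrix D(f) whose five row vectors represent five retrieval candidates, each represented with two features, f1 and f2:
$$D^{(f)} = \begin{bmatrix} 3 & 0 \\ 0 & 5 \\ 1 & 2 \\ 3 & 2 \\ 2 & 3 \end{bmatrix}.$$
By performing SVD on D(f), the matrix R is obtained:
$$R = \begin{bmatrix} 0.46824 & 0.8836 \\ 0.8836 & -0.46824 \end{bmatrix}.$$
(Each column of R is determined only up to its sign; the signs shown here are one consistent choice.)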
Fig. 3. Vectors of retrieval candidates on two vector spaces. One of the spaces is created with the vectors f1 and f2 as the basis and the other is created with the vectors r1 and r2 as the basis. (Labels in the figure: axes f1, f2, r1 and r2; points d1 to d5; projected components $d_{11}^{(r)}$, $d_{12}^{(r)}$, $d_{21}^{(r)}$ and $d_{22}^{(r)}$.)
The matrix D(r), which represents the vectors of the retrieval candidates on the space created with R, is
$$D^{(r)} = D^{(f)} \times R = \begin{bmatrix} 1.4047 & 2.6508 \\ 4.418 & -2.3412 \\ 2.2354 & -0.05288 \\ 3.1719 & 1.7143 \\ 3.5873 & 0.3625 \end{bmatrix}.$$
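The numbers of Example 3 can be checked with a small sketch. Because D(f) has only two columns, its right singular vectors are the eigenvectors of the 2 × 2 matrix D(f)'D(f), which has a closed form; the sketch below uses that closed form with the Python standard library only. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, the signs of the singular vectors are an arbitrary but consistent choice, and in practice a library SVD routine would normally be used instead.

```python
import math

def right_singular_vectors_2col(D):
    """Right singular vectors and singular values of a matrix with two columns.

    They are the eigenvectors and square-rooted eigenvalues of the 2 x 2 matrix D'D.
    """
    a = sum(row[0] * row[0] for row in D)
    b = sum(row[0] * row[1] for row in D)
    c = sum(row[1] * row[1] for row in D)
    half_gap = math.hypot((a - c) / 2.0, b)
    lam1 = (a + c) / 2.0 + half_gap          # larger eigenvalue
    lam2 = (a + c) / 2.0 - half_gap          # smaller eigenvalue
    # Eigenvector of [[a, b], [b, c]] for lam1 (assumes b != 0), then a perpendicular unit vector.
    x, y = b, lam1 - a
    n = math.hypot(x, y)
    r1 = (x / n, y / n)
    r2 = (r1[1], -r1[0])
    R = [[r1[0], r2[0]], [r1[1], r2[1]]]     # columns are r1 and r2
    return R, (math.sqrt(lam1), math.sqrt(lam2))

def project(D, R):
    """Compute D x R, the coordinates of each row of D on the rotated axes."""
    return [[row[0] * R[0][j] + row[1] * R[1][j] for j in range(2)] for row in D]

D_f = [[3, 0], [0, 5], [1, 2], [3, 2], [2, 3]]
R, singular_values = right_singular_vectors_2col(D_f)
D_r = project(D_f, R)
```

With this sign choice the first column of `R` is approximately (0.46824, 0.8836), which belongs to the larger singular value, and the rows of `D_r` agree in magnitude with the matrix D(r) given above.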
The distribution of the vectors of retrieval candidates on the two vector spaces is shown in Figure 3. One space is created with f1 and f2 as the basis and the other is created with r1 and r2 as the basis. If a query vector q and a retrieval candidate vector di are orthogonal, the scalar product of them is zero:
$$q' \times d_i = 0.$$
When the vectors q and di are projected onto a new orthogonal space R ($RR' = I$), their scalar product on the new space is also zero:
$$q^{(r)} = q' \times R, \qquad d_i^{(r)} = d_i' \times R,$$
$$q^{(r)} \times d_i^{(r)\prime} = q' \times R \times (d_i' \times R)' = q' \times R \times R' \times d_i = q' \times I \times d_i = q' \times d_i = 0.$$
That is, the required retrieval candidate cannot be selected on the new space R. This can be illustrated with Figure 3: the vectors d1 and d2 are orthogonal on the space with f1 and f2 as the basis, and this orthogonality is unchanged on the new space with r1 and r2 as the basis.
Latent Semantic Indexing (LSI) [5] is a method that uses SVD to create an orthogonal space. After the orthogonal space is created, a compression operation reduces the space from n dimensions to k dimensions, where k < n. In LSI, a space R is created by applying SVD to the matrix D(f). Based on the above illustration of SVD as a rotation of basis vectors, and differently from the view in [5], we regard the space R as the space created by rotating the axes of the space F. The space compression is the operation that removes basis vectors from the space R. In LSI, the vectors corresponding to the small singular values are removed. That is, if R is composed of n basis vectors,
$$R = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix},$$
the vector rn is the first candidate to be removed, followed by rn-1, rn-2, and so on. However, it is not clear whether the removed basis vectors really represent semantic factors that are already represented by the other, unremoved vectors. In Example 3, LSI removes the axis r2 because it relates to the smallest singular value. However, in the case of Example 3, it is better to remove the axis r1.
3. The semantic space creation method with an adaptive axis adjustment mechanism
3.1 The basic idea of the method
As mentioned in the previous section, when two feature vectors fi and fj semantically correlate with each other, they are not pair-wise orthogonal and must be rotated into the correct position. When semantically correlated vectors q and d are mapped onto the vectors fi and fj of the space-F, q and d are orthogonal on the space-F. If q represents a query vector and d represents a retrieval candidate vector, the retrieval candidate cannot be found even though the two semantically correlate with each other, because the scalar product of q and d is zero. Our idea is to rotate the axes that represent the same semantic factor and to combine them into new vectors. When two feature vectors fi and fj represent the same semantic factor, our method rotates and combines them into a new vector ri:
$$r_i = \frac{1}{\left\| f_i + f_j \right\|} \left( f_i + f_j \right).$$
Fig. 4. The query vector q and the retrieval candidate vector d are overlapped on the vector ri which is created by combining the vectors fi and fj.
When the newly created vector ri is used as a basis vector of the vector space instead of the vectors fi and fj, the vector d can be selected by the query because both the query vector
q and the retrieval candidate vector d are overlapped on the vector ri as shown in Figure 4.
In general, if a subspace with the basis vectors fi, fj, …, fk represents the same feature on the space-F, a new semantic vector space, space-R, is created by rotating and combining those basis vectors of the subspace into a new basis vector ri. Let B denote the set of the vectors fi, fj, …, fk. The basis vectors of the newly created semantic vector space can be represented as
$$r_l = f_l, \quad \text{if } f_l \notin B,$$
$$r_i = \frac{1}{\left\| \sum_{f_i \in B} f_i \right\|} \left( \sum_{f_i \in B} f_i \right), \quad f_i \in B.$$
Fig. 5. When the vectors fi and fj represent the same feature, they are rotated and combined into a new vector on the space-R. The vector rl equals the vector fl on the space-R. The created space-R is a 2-dimensional space while the space-F is a 3-dimensional space.
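The effect of combining the axes in B can be illustrated with a minimal sketch for the 3-dimensional case of Figure 5. The concrete vectors q and d, the choice of axes 0 and 1 for fi and fj, and the function names are assumptions made for illustration only.

```python
import math

def combined_basis(n, B):
    """Basis vectors of space-R: the normalized sum of the axes in B, then every axis not in B."""
    length = math.sqrt(len(B))                   # norm of a sum of len(B) orthonormal axes
    r_i = [1.0 / length if l in B else 0.0 for l in range(n)]
    others = [[1.0 if m == l else 0.0 for m in range(n)] for l in range(n) if l not in B]
    return [r_i] + others

def to_space_r(v, basis):
    """Coordinates of v on the space-R basis (inner product with each basis vector)."""
    return [sum(x * y for x, y in zip(v, r)) for r in basis]

def cosine(a, b):
    """Cosine of the angle between a and b (0.0 if either vector is zero)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

q = [1.0, 0.0, 0.0]          # query mapped onto f_i (axis 0), hypothetical values
d = [0.0, 2.0, 0.0]          # retrieval candidate mapped onto f_j (axis 1)
basis = combined_basis(3, {0, 1})
cos_on_f = cosine(q, d)
cos_on_r = cosine(to_space_r(q, basis), to_space_r(d, basis))
```

On the space-F the cosine of q and d is zero, while on the 2-dimensional space-R both vectors lie on the combined axis and their cosine is one, which is the situation drawn in Figure 4.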
On the newly created space, space-R, the vectors rl and ri are pair-wise orthogonal because ri lies in the subspace [fi, fj, …, fk] and the vector rl is orthogonal to that subspace, as illustrated in Figure 5. In the following, we present our semantic retrieval space creation method, which is referred to as the Optimal Semantic Space Creation Method (OPTSS). The main purpose of the OPTSS method is to create optimal semantic spaces. This method requires that
learning data sets exist. An axis adjustment mechanism is used to rotate, combine and remove axes based on the semantic factors obtained from the learning data sets. The rotating, combining and removing operations are referred to as the OPTSS operations. The outline of our method is as follows. At first, an n × n unit matrix R is created as the retrieval space, where n is the number of the feature vectors. After that, basis vectors representing the same factors are searched for in the learning data sets. When two basis vectors fi and fj are found to represent the same semantic factor, they are rotated and combined into a single basis vector. After the rotating and combining operations, the distribution of the learning data is checked to see whether there are ‘noise’ data among them. If there are, the basis vectors that represent the ‘noise’ factors are searched for and removed from the space. The rotating, combining and removing operations on the basis vectors are the basic OPTSS operations for creating the optimal semantic spaces. These operations are illustrated in Figure 6.
Fig. 6. The feature vectors fi and fj which represent a same feature are combined into one vector. The vector fk which represents a ‘noise’ feature is removed. (Labels in the figure: axes fi, fj and fk; "Rotating and Combining fi and fj"; "Removing fk".)
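The outline above can be sketched as follows. This is a simplified reading, not the authors' implementation, and it rests on several assumptions that are also marked in the code: the space is represented only by the combined axis (a 0/1 list over the features), features are picked by thresholding the learning documents with a hypothetical threshold `e`, documents are scored by their coordinate on the combined axis, and a tentatively dropped feature is restored whenever any learning document falls in the ranking. The four toy documents are invented for illustration.

```python
def index_vector(learning_set, e):
    """Mark with 1 every feature whose value exceeds the threshold e in some learning document."""
    n = len(learning_set[0])
    return [1 if any(doc[k] > e for doc in learning_set) else 0 for k in range(n)]

def scores(docs, combined):
    """Coordinate of each document on the combined axis (sum of its still-combined features)."""
    return [sum(x for x, c in zip(doc, combined) if c) for doc in docs]

def rank_positions(values):
    """Rank position of each value, 1 being the top; ties keep their original order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    positions = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        positions[i] = pos
    return positions

def remove_noise_axes(docs, learning_idx, combined):
    """Tentatively drop each combined feature; undo the drop if a learning document falls in rank."""
    combined = list(combined)
    for k in [k for k, c in enumerate(combined) if c]:
        before = rank_positions(scores(docs, combined))
        combined[k] = 0
        after = rank_positions(scores(docs, combined))
        if any(after[i] > before[i] for i in learning_idx):
            combined[k] = 1                      # f_k is needed, so it is kept
    return combined

# Toy data (assumption): documents 0 and 1 are the learning data; feature 2 acts as 'noise'.
docs = [[3, 0, 1, 0], [0, 3, 1, 0], [0, 0, 5, 0], [0, 0, 2, 5]]
learning_idx = [0, 1]
c_r = index_vector([docs[i] for i in learning_idx], e=0)
ranks_before = rank_positions(scores(docs, c_r))
kept = remove_noise_axes(docs, learning_idx, c_r)
ranks_after = rank_positions(scores(docs, kept))
```

In this toy run the features 0, 1 and 2 are first combined, the learning documents are ranked second and third behind the noise-heavy document, and after feature 2 is removed they move to the first and second positions.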
3.2 Technical details of the semantic space creation method with the adaptive axis adjustment mechanism
In our method, the space creation operations are divided into two processing steps. The first step is to rotate and combine basis vectors. This step is effective for improving the recall of queries. The second step is to remove the basis vectors representing the ‘noise’ factors. This step is effective for improving the precision of queries. The semantic factors are searched for in the learning data sets, and the basis vectors representing the ‘noise’ factors are searched for by checking the distribution of the learning
data to see whether there are ‘noise’ data among them. In the case that each item of media data is represented by an n-feature vector as its metadata,
$$d_i = d_{1,i} f_1 + d_{2,i} f_2 + \cdots + d_{n,i} f_n,$$
an n × n unit matrix R is created at first. After that, an index vector Cr is generated from a learning data set L; Cr contains the index of all the features in the learning data set L. The elements of the learning data set L are the vectorized metadata of the media data. The index vector Cr is created through the following steps:
Step-1: Set Cr as an n-dimensional vector. Each element of the vector Cr is set to ‘0’.
Step-2: For each element dj in the learning data set L, if the feature value dk,j is greater than a threshold e, the k-th element of the vector Cr is set to ‘1’.
The non-zero elements of the index vector Cr indicate the feature vectors that should be rotated and combined. After the index vector Cr is created, the rotating and combining operations are implemented through the following steps:
Step-3: If the k-th element is the first non-zero element of the index vector Cr, the k-th column of the matrix R is replaced by the vector Cr. This column is identified as RCr.
Step-4: Remove all the columns of the matrix R that are indicated by the non-zero elements of the index vector Cr, except the column RCr, which was replaced by the vector Cr.
After the above four processing steps, the rotating and combining operations are finished. The removing operation is implemented through the following steps:
Step-1: Set q as an n-dimensional zero vector.
Step-2: If the k-th element is the first non-zero element of the index vector Cr, set the k-th element of the vector q to ‘1’.
Step-3: For each metadata di, calculate the inner product of di with the vector q on the semantic space R:
$$p_i = (d_i \times R) \times (q' \times R)'.$$
This step is repeated until the inner products of all the metadata are calculated.
Step-4: Sort the metadata in descending order of the inner products calculated in Step-3. The ranked position of di is stored into a variable Ranki, where i is the index used to indicate the metadata di, its inner product and its ranking position value Ranki. If the ranking value of di is one, Ranki = 1, di is ranked at the top position.
Step-5: For each combined feature vector in the semantic space R, remove one of the combined feature vectors, fk, from the space. Store all the ranking values of the metadata in the learning data set L: if di is an element of the set L, Ranki is stored into the variable Prev_Ranki. Execute Step-3 and Step-4 again. For each element di in the set L, if the new ranking position of di is lower than its previous ranking position after the feature vector fk is removed from the space R, that is, Ranki is greater than Prev_Ranki, the feature vector fk cannot be removed from the space. Otherwise, the feature vector fk is removed from the space R. This step is continued until all the combined feature vectors have been tested for removal from the space R.
After the above removing processing, the optimal semantic retrieval space R is created.
4. Experiments
Experiments were performed, and their results show that an optimal semantic retrieval space can be created by using the OPTSS method proposed in this paper. When the semantic space is optimally created, retrieval candidate documents are appropriately and sharply distributed on the space. Therefore, the recall and precision of queries are greatly improved compared with those before the OPTSS operations are performed. We also compare the recall and precision of the OPTSS method with those of VSM and LSI (Latent Semantic Indexing).
4.1 Evaluation method
In our experiments, 8,557 English documents are randomly extracted as the retrieval candidate documents from the “Test collection 1 (NTCIR-1)”¹, which contains about 160,000 English documents. These documents are summaries of papers presented at academic or professional conferences hosted by Japanese academic societies.
¹ NTCIR is a project organized by the National Center for Science Information Systems in Japan. http://research.nii.ac.jp/ntcir/
The 8,557 documents used in the experiments are extracted from ten query categories, such as “Robot” (identified as Q-1), “Document image understanding” (identified as Q-4) and “Feature dimension reduction” (identified as Q-5). In NTCIR-1, sets of correct documents are prepared for the query categories. We randomly extracted five correct documents from each of the ten query categories. That is, for each query there exist only five correct documents among the 8,557 documents. In the experiments, stop-words, such as articles and conjunctions, are removed and stemming is performed as preprocessing. 811 English words are extracted as a term set from the 50 correct documents. An 8557 × 811 document-term matrix M is created, in which each element is the frequency with which a term appears in a document. Each row of the matrix M represents a document vector di, which is an 811-dimensional vector. In the matrix, d1, d2, …, d5 are the vectors representing the correct documents of the query category Q-1; d1006, d1007, …, d1010 are the vectors representing the correct documents of the query category Q-4; and d1978, d1979, …, d1982 are the vectors representing the correct documents of the query category Q-5.
Ten queries, q1, q2, …, q10, are used to search for the correct documents of Q-1, Q-2, …, Q-10, respectively. Each query contains one keyword. For example, the keywords of q1, q2 and q3 are “robot”, “image” and “dimension”, respectively. Figure 7 shows the precision rate and the recall rate of the retrieval result of q1 before the OPTSS operations are performed. The precision goes down from 0.0625 to 0.00058432 as the recall increases.
Fig. 7. The precision and the recall before the performing of the OPTSS operations. (Plot title: Recall-precision; horizontal axis: No. of the correct documents, 1 to 5; vertical axis: 0 to 1.2; series: Recall, Precision.)
Figure 8 shows the recall and the precision on the OPTSS space that is created by adding the five correct documents of Q-1 to the learning data set L. Our experimental result shows that 104 feature vectors are rotated and combined, and 42 feature vectors are removed, to create the optimal semantic space. Figure 8 shows that the precision is greatly improved: it reaches 1.0 and does not fall below 0.8 as the recall increases from 0.2 to 1.0.
Fig. 8. The precision rate and the recall rate of the retrieval result of q1 on the semantic space created by the proposed method. (Plot title: Recall-precision; horizontal axis: No. of the correct documents, 1 to 5; vertical axis: 0 to 1.2; series: Recall, Precision.)
Such high precision supports the conclusion that the created semantic retrieval space is an optimal one. This conclusion is also supported by the experimental results shown in Table 1. Table 1 shows the recall and precision rates of the three queries q1, q2 and q3 obtained by using OPTSS. The table shows that very high precision is obtained by creating the semantic space with our OPTSS method. The precision of q2 stays at 1.0 as the recall reaches 1.0. The lowest precision is 0.625, in the result of q3. The table also shows that all the correct documents are ranked in the top 6, 5 and 8, respectively.
Table 1. The recall and the precision obtained by using OPTSS
Query  Rank  Document No.  Recall rate  Precision rate
q1     1     4             0.2          1
q1     2     3             0.4          1
q1     3     1             0.6          1
q1     5     5             0.8          0.8
q1     6     2             1            0.83333
q2     1     1009          0.2          1
q2     2     1006          0.4          1
q2     3     1010          0.6          1
q2     4     1008          0.8          1
q2     5     1007          1            1
q3     1     1978          0.2          1
q3     2     1982          0.4          1
q3     3     1980          0.6          1
q3     4     1981          0.8          1
q3     8     1979          1            0.625
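The recall and precision rates in Table 1 are consistent with the usual rank-based definitions: at the rank of the m-th correct document, recall is m divided by the number of correct documents and precision is m divided by that rank. A minimal sketch, assuming these definitions (the function name `recall_precision` is hypothetical):

```python
def recall_precision(correct_ranks):
    """Recall and precision at the rank of each correct document.

    correct_ranks holds the ranking positions (1 = top) of all correct documents.
    """
    total = len(correct_ranks)
    points = []
    for found, rank in enumerate(sorted(correct_ranks), start=1):
        points.append((found / total, found / rank))
    return points

q1 = recall_precision([1, 2, 3, 5, 6])   # ranks of the correct documents of q1 in Table 1
q3 = recall_precision([1, 2, 3, 4, 8])   # ranks of the correct documents of q3 in Table 1
```

Under the same definitions, the precision values 0.0625 and 0.00058432 reported for q1 before the OPTSS operations would correspond to 1/16 and 5/8557, that is, to the first correct document appearing at rank 16 and the fifth at the very end of the 8,557 documents.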
Table 2. The recall and the precision obtained by using VSM
(Columns, for each of q1, q2 and q3: Rank, Document No., Recall rate, Precision rate)
Table 2 shows the recall and the precision before the OPTSS operations are performed. Comparing Table 1 with Table 2 clarifies that the space created by using OPTSS is the optimal one.
Fig. 9. The recall precision rates obtained by OPTSS, LSI and VSM. (Horizontal axis: Recall, 0 to 1.2; vertical axis: Precision, 0 to 1.2; series: Precision on OPTSS, Precision on LSI, Precision on VSM.)
Experiments on VSM and LSI are also performed. In the experiments on LSI, forty-two basis vectors are removed from the space, that is, the space is compressed from 811 dimensions to 769. Figure 9 shows the recall and precision rates obtained by OPTSS, LSI and VSM, respectively. It is clear that when the semantic space is optimally created, very high precision can be obtained.
5. Conclusion
In this paper, we have presented a method for creating a semantic search space for media data with an adaptive axis adjustment mechanism. The mechanism adjusts the semantic expression vectors that represent semantically correlated factors and the vectors that represent the ‘noise’ factors, which lower the precision of queries. By using this method, we can create an optimal semantic space for extracting semantically related and appropriate information adapted to individual query requirements. The basic idea of our method is to rotate and combine the semantic expression vectors representing the semantically correlated factors and to remove the vectors representing the ‘noise’ factors. The mechanism for finding the semantically correlated vectors from the learning data set is described in the technical details, as is the mechanism for finding the vectors correlated with ‘noise’. Experimental results are presented to demonstrate the effectiveness of the proposed method. The experimental results also clarify that optimal semantic retrieval spaces are created by using our method. Another important feature of our method is that it can find the keywords (related to the recall) and the ‘noise’ terms (related to the precision). In future work, we will further improve the processing quality of the learning mechanism.
References
[1] Batini, C., Lenzerini, M. and Navathe, S.B., “A comparative analysis of methodologies for database schema integration,” ACM Computing Surveys, Vol. 18, pp. 323-364, 1986.
[2] Bright, M.W., Hurson, A.R., and Pakzad, S.H., “A Taxonomy and Current Issues in Multidatabase Systems,” IEEE Computer, Vol. 25, No. 3, pp. 50-59, 1992.
[3] Chen, X. and Kiyoki, Y., “A query-meaning Recognition Method with a Learning Mechanism for Document Information Retrieval,” Information Modelling and Knowledge Bases XV (IOS Press), Vol. 105, pp. 37-54, 2004.
[4] Chen, X. and Kiyoki, Y., “A Dynamic Retrieval Space Creation Method for Semantic Information Retrieval,” Information Modelling and Knowledge Bases XVI (IOS Press), Vol. 121, pp. 46-63, 2005.
[5] Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W. and Harshman, R. A., “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, Vol. 41, No. 6, pp. 391-407, 1990.
[6] Kitagawa, T. and Kiyoki, Y., “A mathematical model of meaning and its application to multidatabase systems,” Proceedings of 3rd IEEE International Workshop on Research Issues on Data Engineering: Interoperability in Multidatabase Systems, pp. 130-135, April 1993.
[7] Kiyoki, Y., Kitagawa, T. and Hitomi, Y., “A fundamental framework for realizing semantic interoperability in a multidatabase environment,” Journal of Integrated Computer-Aided Engineering, Vol. 2, No. 1 (Special Issue on Multidatabase and Interoperable Systems), pp. 3-20, John Wiley & Sons, Jan. 1995.
[8] Kiyoki, Y., Kitagawa, T. and Hayama, T., “A metadatabase system for semantic image search by a mathematical model of meaning,” ACM SIGMOD Record, Vol. 23, No. 4, pp. 34-41, Dec. 1994.
[9] Kiyoki, Y. and Kitagawa, T., “A semantic associative search method for knowledge acquisition,” Information Modelling and Knowledge Bases (IOS Press), Vol. VI, pp. 121-130, 1995.
[10] Kolodner, J.L., “Retrieval and organizational strategies in conceptual memory: a computer model,” Lawrence Erlbaum Associates, 1984.
[11] Krikelis, A., Weems C.C., “Associative processing and processors,” IEEE Computer, Vol. 27, No. 11, pp. 12-17, Nov. 1994.
[12] Ogden, C.K., “The General Basic English Dictionary,” Evans Brothers Limited, 1940.
[13] Potter J.L., “Associative Computing,” Frontiers of Computer Science Series, Plenum, 1992.
[14] Raghavan, V. V. and Wong, S. K. M., “A critical analysis of vector space model for information retrieval,” Journal of the American Society for Information Science, Vol. 37 (5), pp. 279-87, 1986.
[15] Salton, G., “Introduction to Modern Information Retrieval,” McGraw-Hill, 1983.
[16] Sheth, A. and Larson, J.A., “Federated database systems for managing distributed, heterogeneous, and autonomous databases,” ACM Computing Surveys, Vol. 22, No. 3, pp. 183-236, 1990.
[17] Williams, Lippincott and Wilkins, “Stedman's Electronic Medical Dictionary VERSION 5.0,” A Wolters Kluwer Company, 2000.
[18] “Fifteenth Edition Harrison's Principles of Internal Medicine CD-ROM VERSION 1.0,” McGraw-Hill, 2001.
[19] “Longman Dictionary of Contemporary English,” Longman, 1987.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Storyboarding Concepts for Edutainment WIS

Klaus-Dieter Schewe¹ and Bernhard Thalheim²
¹ Massey University, Department of Information Systems & Information Science Research Centre, Private Bag 11 222, Palmerston North, New Zealand
² Christian Albrechts University Kiel, Department of Computer Science, D-24098 Kiel, Germany

Abstract. Edutainment web information systems must be adaptable to the user, to the content currently available, to the technical environment of the user, and to the skills, abilities and needs of the learner. This paper provides conceptions that support this kind of adaptivity and sophisticated support.
1 Introduction

1.1 Edutainment Web Information Systems

E-learning is now nearly as popular as e-commerce and e-business. The providers range from universities and non-profit organisations to professional training institutions. Furthermore, sites of museums, exhibitions, etc. can be counted as learning web information systems. In general, every data-intensive information system that is realised in such a way that users can access it via web browsers will be called a web information system (WIS).

The major intention of a learning WIS should be to support learning. In the case of so-called edutainment systems the provided knowledge is usually easy to grasp. The message is that learning can be fun. The usage of a learning system depends on whether the control of the learning process is left to the user or to the system. In both cases, however, it is assumed that the users are willing to learn and meet the required prerequisites.

The content of a learning site depends on the area that is to be taught. Learning sessions are used as structural means, and navigation through these sessions may be organised in linear form or as a directed acyclic graph [ST05]. Systems may be completely passive, allowing material only to be read or downloaded. Other systems may involve upload mechanisms for assessment of the learning progress. According to the progress made, a system may even provide feedback to its user. We concentrate our work on active learning.

The functionality of learning sites mainly supports the navigation through the site, i.e., the navigation through the learning material. In contrast to other systems, this navigation is a long-term process with usually many interruptions. More sophisticated systems provide system-driven repetition and feedback. Apparently, the functionality of such systems is still a matter of research. Also, personal information needs can be supported by providing an interface to e-mail.
K.-D. Schewe and B. Thalheim / Storyboarding Concepts for Edutainment WIS
1.2 Achievements and Challenges of Edutainment WIS

A large number of websites that provide learning and edutainment services have already been developed. Examples of technology-supported learning include computer-based training systems, interactive learning environments [DN03], intelligent computer-aided instruction systems [RK04, BGW92], distance learning systems, and collaborative learning environments.

Developers have learned a number of lessons. Edutainment should be considered neither yet another presentation form for classical instructionistic learning nor a means for presenting content that has been used for lecturing. Edutainment should not be understood as the enrichment of learning by multimedia facilities but must concentrate on content that is presented in a pleasing form, that is easy to use and to understand, that is enriched by functionality which corresponds to the content, and that allows the progress in learning to be controlled.

Edutainment might enable everybody to participate in learning activities at any place, at any time and within any team. Learning cultures will, however, limit this globalisation claim. Learners have their own profile and portfolio. The main kinds of learning stories are complementary learning, self-organised learning and continuing education on demand. Learners must thus be supported in developing their own learning space and in understanding and developing their learning abilities. Controlling and assessment cannot be completely automated. Each user is different from the others. The context differs. The learning history is different. At the same time, learners often need immediate feedback.

Edutainment can be supported by a number of devices ranging from computers and PDAs to mobile phones. We thus distinguish between electronic learning (e-learning) systems and mobile learning (m-learning) systems. Mobile phones and mobility in general are changing people’s ways of working, communicating and learning. The differences between formal learning, informal learning and ways of working will diminish. The real added value of m-learning is in the area of informal learning, whereas e-learning may cover formal and informal learning.

Scenarios and content are multifaceted, depend on the presentation device, and must be adaptable to the learner and to the learning community. The type of learning will be different from classical ones. The style of learning changes to pro-active formal learning with self-study content, virtual classrooms and trainer-based facilitation. Learning is social: it involves arguing, reflecting, articulating, and debating with others. The majority of learning on the web is informal learning. It is based on ad-hoc information sharing and on communication and collaboration. The functionality of edutainment WIS is also based on functions such as participation, contribution, and annotation.

1.3 Scope of this Paper

This paper introduces some new conceptions that we have already used in our edutainment WIS projects: DaMiT (a data mining tutoring workbench), KoPra (cooperative learning of database programming), and Learning Lausitia (continuing life-long learning for engineers and alumni). We extend the classical concept of learning objects [LOM00, LTS00] to open learning objects, show how scenario development [For94] leads to sophisticated support for didactics, and provide an insight into the development of appropriate functionality. Additionally, we show how brands of edutainment WIS, actor specification, learning scenarios, content and functionality can be developed for sophisticated edutainment WIS.
2 The Five General Characteristics of Edutainment WIS

2.1 The Brand of Edutainment WIS

The pattern or brand of the edutainment WIS, $P^{W}2U^{A}$ (provider P, knowledge W, user U, activities A), generalises the classical who-to-whom pattern (e.g. B2B). It is specialized for edutainment WIS:

Provider: Providers are currently mainly educational institutions or educational communities. In the future, we will observe a commercialisation of education, so the main providers are going to be companies. If a provider is a single person then this provider plays the role of the teacher.

Product dimension: Since control and assessment of learning progress is still an unsolved issue and an appropriate presentation of complex information is often not feasible, edutainment WIS concentrate on easy-to-understand information or easy-to-grasp knowledge. The associated auxiliary scenarios are based on functions such as validate, control and advice. Therefore, the brand can be characterised by $P^{K}2C^{\text{learn, validate, control, advice}}$.

User dimension: Users of edutainment WIS are mainly private people. They may be pupils or students, people seeking continuing education, workers in companies with a specific portfolio, or just people interested in auxiliary information. Users can also be groups. The main behaviour of users is characterised by the role of the learner or of the student.

Activity dimension: Activities are currently centered around learning, searching for content, collecting content, and solving exercises. Activities also include asking questions, acting in teams for problem solving, and discussing issues associated with the learning material.

Edutainment (learning) sites ($B^{K}2C^{\text{learn}}$, $C^{K}2C^{\text{learn}}$) are then specialized to the brands
$\text{Teacher}^{\text{content chunks}}2\text{Student}^{\text{receive, respond, solve in teams, raise questions, possibly apply}}$,
$\text{Teacher}^{\text{content chunks}}2\text{Student}^{\text{recognise, listen, work on it, solve exercises, ask urgent questions}}$,
$\text{Teacher}^{\text{Knowledge}}2\text{Student}^{\text{discuss, get feedback, work on it}}$,
$\text{Teacher}^{\text{Knowledge}}2\text{Student Group}^{\text{discuss, get feedback, work on it}}$, and
$\text{Teacher}^{\text{Wisdom}}2\text{Student}^{\text{discuss, get feedback, work on it}}$.

2.2 Actors in Edutainment

Edutainment WIS currently support mainly or exclusively the pupil or student actor. The behaviour of actors might, however, be more complex:

Pupils obtain knowledge through teachers, their schedules, and their abilities. They need guidance, motivation, and control.
Collaborating or cooperating students act in a collaboration depending on a cooperation profile and on rights and roles.
Communication partners exchange content, discuss and resolve questions, and seek hints or a better understanding.
Supporting and motivating partners are users with control, motivation and supporting functions.
Teachers act in various roles, with various obligations and rights, and with varying degrees of involvement. The nature of the activities that constitute teaching depends more on the age of the persons being taught than on any other single factor.

Users differ in their history. We distinguish between learners who initially seek to increase their knowledge and skills and users who are seeking continuing education. The first group of learners needs some guidance. Therefore, their learning style is based on pedagogic teacher-based learning. The latter are self-organised. We call them andragogic learners. The differences between the two learning styles are given in the following table:

andragogic self-organized learning | pedagogic teacher-based learning
independent learner                | dependent learner
self-regulated learning            | not necessarily self-regulated learning
self-motivated learning            | needs to be motivated
reflective                         | needs help for learning
arguably                           | needs help for learning
analytical                         | needs help for learning
situated in real world context     | structured
engaged with knowledge             |

2.3 Learning Scenarios and Stories

Modelling of e-learning scenarios is more difficult owing to the variety of didactic approaches, of learners, of tasks, and of site contexts. For this reason, modelling of dialogue scenes and dialogue steps is more generic than modelling of simple interaction steps that usually lead to deterministic runs through the story space. We thus distinguish a number of associations among steps and scenes depending on the content of the learning element.

Our solution to this challenge is based on generic parameters that are instantiated depending on the learner, the history, the context etc. Each learning unit is specified by a context-free expression with a set of parameters. These parameters are instantiated depending on the learner profile, the learner task portfolio, the media objects, the learner's computational environment, the data to be used in exercises or algorithms, the presentation environment, and the available and accessible learning elements.

Learning scenarios (learn) are classically based on general learning styles:

Sequenced learning is based on curriculum sequencing in active or passive scenarios based on classical pedagogical approaches. Typical established pedagogical approaches are conditioning, operated learning, model learning, and cognitive approaches. Sequencing has been studied extensively on the basis of classical didactics. Active scenarios model the behaviour of active actors. A typical metaphor to be applied is the shopping bag. Passive scenarios did not achieve the success that was expected during the 1980s and 1990s. Tutoring systems are applicable for advisory systems, for help desks, or for training physical skills.

Interactive learning is based on self-organized, content-sequenced, habit-regulated or publish-subscribe scenarios.

Group learning is based either on a cooperative setting (integrated sub-tasks, blackboarding) or on a collaborative setting (cooperation, discussion, development of a solution by all members).
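The instantiation of a parameterised learning unit described earlier in this section can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the step names (explain, exercise, feedback), the parameter names and the layout of the learner description are hypothetical, and the expression is a flat sequence rather than a full context-free expression.

```python
from string import Template

# Hypothetical generic learning unit: a sequence of learning steps with
# named parameters ($topic, $dataset, $channel) that are still unbound.
GENERIC_UNIT = Template(
    "explain($topic) ; exercise($topic, data=$dataset) ; feedback(via=$channel)"
)


def instantiate(unit, learner):
    """Bind the parameters of a generic unit from a learner description.

    The keys of `learner` (portfolio, exercise_data, environment) are
    assumptions made for this illustration only.
    """
    parameters = {
        "topic": learner["portfolio"]["current_topic"],
        "dataset": learner["exercise_data"],
        # The presentation depends on the learner's technical environment.
        "channel": "short text" if learner["environment"] == "mobile"
        else "annotated solution",
    }
    # substitute() raises KeyError if a parameter of the unit stays unbound.
    return unit.substitute(parameters)


learner = {
    "portfolio": {"current_topic": "association rules"},
    "exercise_data": "retail_baskets",
    "environment": "mobile",
}
print(instantiate(GENERIC_UNIT, learner))
```

The same generic unit yields a different concrete unit for a learner whose environment is not mobile, which is the kind of adaptation the generic parameters are meant to capture.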
Sequenced learning is currently mainly based on didactic approaches, i.e. on instructions, classical content and classical hermeneutics. Owing to the sequential order, dependencies among learning units are rather simple. Learning scenes use reception techniques, explication of the learning content, context association, derivation of understanding and interpretation, and finally integration of the essence of the content into the knowledge of the learner.

2.4 Content in Edutainment

Content used in edutainment WIS is mainly easy-to-grasp and easy-to-understand information or knowledge. The most important associated scenarios are validate, control and advice. Content comes in a large variety. We therefore need to consider the kind of content, the activities that can be associated with the content, the characterisation and annotation of content, and finally the quality characteristics of content. Edutainment content delivery, storage, retrieval, and extraction are still open research issues. Edutainment WIS provide content to learners depending on their learning task, their personality, their working environment, their learning history and the policy of the content provider. This challenge is more complex than the challenge to generate “content”. Content can be represented by media objects. Content for learning must be highly adaptable. We distinguish a variety of content or, more generally, knowledge:
• knowledge and abilities for orientation, e.g., explanation, presentation, history, facts, surveys, overview;
• knowledge and abilities for application, e.g., skills, abilities, rules, procedures, principles, strategy, and laws;
• knowledge and abilities for explanation, e.g., ‘why’-knowledge (proof, causal) and ‘what’-knowledge (definition, description, argument, assumption, reflection);
• knowledge and abilities on sources, e.g., archives, documents, citation, reference, and links;
• knowledge and abilities for solving problems, e.g., sample solutions, analogs, training solutions, discovery solution, and examination.

All media objects can be provided independently of the learner. However, we also need to consider differences among learners. Background knowledge leads to different speed and reception by the learner. Work abilities and habits influence current work. The learning style must be considered in many facets. The social environment is based on cultural and psychological differences. The history of the learning process should be considered if we want to avoid annoying repetitions. The learning portfolio influences occasion, intention and motivation. The learning object presentation is or is not acceptable depending on the profile of the learner. The learning environment is modelled by many technical facets. Content change management allows content to be provided with or without refresh. The payment profile may result in content reduction.

Learning objects are elements of a new type of computer-based instruction influenced by the object-oriented paradigm of computer science. Learning objects are defined here as any media object which can be used, re-used or referenced during technology-supported learning. Examples of learning objects include multimedia content, instructional content, learning objectives, instructional software and software tools, and persons, organizations, or events referenced during technology-supported learning. The IEEE's Learning Technology Standards
Committee has developed an internationally recognized definition of “learning object”: “any entity, digital or non-digital, that can be used, re-used, or referenced during technology supported learning”. This definition is extraordinarily broad.

A large variety of learning elements is used in edutainment WIS. Course elements are lecture notes and comments. Exercise material may be given in a textual or in an interactive form. Illustration material is often based on animations, complex multimedia elements or on links. Algorithms can be provided in an executable form, as a virtual machine or via an interface to the server. Input data for algorithms can be provided with the learning elements or through other elements in the network.

2.5 General Functionality Based on Word Fields

During requirements capture in WIS design and development, we use natural language descriptions for analysing the activities of the users: What are these activities? Does the description indicate any sequencing or continuation? What data is needed for or used by the activities? What are the relationships between these data? As user activities can be described by verbs, we suggest analysing the corresponding word fields. A word field [Kun92, SSS90, BZKK+05] is a linguistic system in which similar words that describe the “same” basic seme and are used in the same context are combined into a common structure and data set. In contrast to a common synonym dictionary, word fields define the possible and necessary actors, the actions and the context. Word fields can be used for verbs, nouns and adjectives.

The functionality of edutainment WIS is also determined by the main word fields we observe for learning. The main word fields applied in learning are:

Learn: Learning is a very complex activity. It includes gaining knowledge or understanding of, or skill in, a subject by study, instruction, or experience. Additionally, learning is associated with memorizing, with coming to be able to perform some task, and with knowing of this ability. Learning is based on obtaining content and discovering the concepts behind it. It is also based on facilities for annotation, ordering, and integrating. A user obtains the role of a learner or student. Learners are usually supported by other actors who teach and instruct. Learners determine content with certainty, usually by making an inquiry or other effort. They check the content and find out whether it is useful or whether they need additional content.

Know: Learning is based on skills, abilities, and knowledge. It targets their improvement. The improvement should be measurable. The learning success is then examined. To know means to be cognizant or aware of a fact or a specific piece of information and to possess knowledge or information about it. It may also include knowing how to do or perform something. The learner obtains firsthand knowledge of states, situations, emotions, or sensations. The change in knowledge is acknowledged and recognized by other actors, who can accept it to be what is claimed. The word field know used in learning is different from the one used in identity or information WIS.

Master: Learning usually intends to master problems and to become completely proficient or skilled in an area. This mastership is closely related to practising and experimenting with the new knowledge. The learner has a firm understanding or knowledge of a problem and is on top of it.
Skill: Skills are abilities that have been acquired by training, e.g. abilities to produce solutions for some problem. A user acting as a learner is possibly trained until he/she has obtained these skills.

Study: The learner is engaged in study and undertakes formal study of a subject. Studying requires an endeavor and a try. In learning WIS, studying is based on reading in detail, especially with the intention of learning. Therefore, the presentation of the material and the storyboard are essential parts. Studying does not only mean to view content but to check over, to check up, to con, to examine, to inspect, and to survey it. This activity is performed attentively and in detail. Learners need to mind, to perpend, to think (out or over), and to weigh. Therefore, users need time and workplaces. Studying can be performed by oneself without any teacher, supporter, or observer. Studying is often based on the existence of a specific workplace and of a specific workspace. Learners may thus have a study place and workplaces which are given to them, which might be rented and which can be exported to the user.

We observe that these word fields have a more complex structure. They lead to functions which require support functions. The difference from the word fields discussed for business WIS is their iterative application. This kind of behaviour has already been observed for community services. Additionally, learning can be performed within temporal communities.

Other related word fields are discover, ascertain, catch on, determine, teach, educate, judge, evaluate, advise, innovate and discuss. Learning word fields can be combined and thus form a didactic story. Learning word fields are going to be combined with reasoning word fields such as analogise, analyse, cause, classify, conjecture, counter-example, formalise, generalise, sharpen, specialise, unknown, and weaken. General activity word fields that appear in any story are folded into our learning word fields. Typical general activities are characterised by answer, comment, compose, decompose, define, draw, effect, example, extend, fact, inquire, instantiate, and reduce.

3 Novel Concepts of the Edutainment WIS DaMiT, KoPra, and Learning Lausitia

The DaMiT system [JMR+03] supports sequenced and interactive learning. Interactive learning is still an open research issue. We have developed and extensively used an implementation in the KoPra project [SBZ04]. In the project Learning Lausitia, learners act in a collaborative setting depending on their needs and the goals of the learning program.

3.1 Supporting Didactics by SiteLang

The website description language SiteLang has been introduced in [TD01]. A scenario is an application-oriented view of parts of the storyboard. The storyboard consists of a set of scenarios, one of which is the main scenario, whereas the others define scenes, a plot specified by a SiteLang process, a set of roles, a set of user types, a set of tasks each associated with a goal, and a set of constraints comprising deontic constraints for the rights and obligations of roles, preference rules for user types, and other dependencies on the plot.

We may distinguish four main kinds of general scenarios for the three learning styles:

Content scenarios are centered around the content chunks or content suites to be delivered to the learner, to be received by the learner, and to be integrated into his/her knowledge.
Control scenarios are based on assessment or control of the success of learning. They involve at least two different actors: learners and controllers. A third actor involved may be the advisor.
Workspace scenarios are auxiliary scenarios that help the learner to organise the learning material and that enhance the learner's content space with memos, excerpts, own solutions, and collaboration notes.
Collaboration scenarios support learners in learning in groups, in communicating with their partners and with other actors such as teachers and advisors, and in coordinating activities and cooperating during the solution of exercises, content reception and the development of solutions.

These four general scenarios can be seen as relatively independent scenarios that can be combined with each other depending on the learning style. We develop the story space by the didactic scenario quadruple in Figure 1, first on the basis of relatively independent scenarios and in a second step through their combination.

Figure 1: The didactic scenario quadruple (workspace scenario, collaboration scenario, content scenario, control scenario)
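The storyboard components listed in Section 3.1 and the four scenario kinds of Figure 1 can be summarised in a small data structure. The following Python sketch is only an illustration and not SiteLang syntax: all class and field names are hypothetical, the plot is kept as an uninterpreted string, and the two consistency checks (exactly one main scenario, a goal for every task) are read off the description above rather than taken from a SiteLang specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ScenarioKind(Enum):
    """The four general scenario kinds of the didactic scenario quadruple."""
    CONTENT = "content"
    CONTROL = "control"
    WORKSPACE = "workspace"
    COLLABORATION = "collaboration"


@dataclass
class Scenario:
    name: str
    kind: ScenarioKind
    is_main: bool = False


@dataclass
class Task:
    name: str
    goal: str  # each task is associated with a goal


@dataclass
class Storyboard:
    scenarios: List[Scenario]
    plot: str  # placeholder for a SiteLang process expression
    roles: List[str] = field(default_factory=list)
    user_types: List[str] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return human-readable violations of the two assumed checks."""
        found = []
        mains = [s.name for s in self.scenarios if s.is_main]
        if len(mains) != 1:
            found.append(
                f"expected exactly one main scenario, found {len(mains)}"
            )
        for task in self.tasks:
            if not task.goal:
                found.append(f"task '{task.name}' has no goal")
        return found


board = Storyboard(
    scenarios=[
        Scenario("lecture", ScenarioKind.CONTENT, is_main=True),
        Scenario("quiz", ScenarioKind.CONTROL),
        Scenario("notes", ScenarioKind.WORKSPACE),
    ],
    plot="lecture ; quiz",
    roles=["learner", "controller"],
    tasks=[Task("pass quiz", goal="assess learning success")],
)
print(board.problems())
```

Tagging each scenario with its kind keeps the four general scenarios independent while still allowing them to be combined in one storyboard.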
Edutainment didactics is currently developed for most learning WIS from scratch and by doing. We decided to develop such didactics systematically. Edutainment didactics can be based on general learning storyboards. Since learning is one of the most complex activities, we support a number of additional general stories within the story space:

Teaching: Teaching is a complex activity that goes beyond instruction, that includes a process of formal training, that is based on a body of specialized knowledge, and that satisfies a set of standards of performance (intellectual, practical, and ethical). Teaching intends the learner to know something, to know how to do something, to become accustomed to some action or attitude, and to know the consequences of some action.

Educating: Education is similar to teaching but more intention-based. It includes training by formal instruction and supervised practice, especially in a skill, trade, or profession. It aims to persuade or condition learners to feel, believe, or act in a desired way. Learners are mentally, morally, or aesthetically educated, especially by instruction.

Judging and evaluating: Edutainment is also often oriented towards forming an opinion about a matter through some weighing of evidence and testing of premises, and towards deciding a matter as a judge does. Learners are trained in determining or fixing the value of a matter and in determining its significance, worth, or condition, usually by some kind of appraisal and study.

Advising and discussing: Advising and discussion scenarios are similar to those we have already considered for community WIS. Advising means to be able to use the background
knowledge of the learner or the knowledge given by the content chunks for the generation of advice or information or for consulting. Discussion techniques are applied in edutainment whenever we need to investigate by reasoning, to argue, or to discourse about a topic in order to reach conclusions or to convince. Discussion is an exploration technique that implies a sifting of possibilities, especially by presenting considerations pro and con.

Innovating: Innovation introduces a new idea, new content or a new activity in order to effect a change.

These different learning scenarios can be compiled or integrated within the story space. For the integration, ordering and hierarchical presentation of scenarios we use the theory of KATs (Kleene algebras with tests). This combination is necessary whenever we need to consider different didactic approaches:

Critical-constructive didactics see learning through interaction via goals.
Learn-theoretical didactics consider learning through dialogues between actors.
Goal-oriented learning is based on a separation of the intention space into subspaces with sub-processes.
Cybernetical didactics uses regulated processes for story development.
Critical-constructive didactics are based on interactions, repetitions, and obstruction.
Curriculum planning extends classical sequenced or blended learning by schedules, goals, and steps.

3.2 Storyboard Pattern

Scenarios typically proceed in one or (usually) more scenes. Scenarios describe the ways in which the work is performed and are based on didactics, goals, user purposes, and content. A storyboard contributes to achieving a purpose or goals. Scenes are composed of other scenes. We assume that scenes can be hierarchically formed based on basic scenes. Typical basic scenes are the information seeking scene, the collaboration scene, the assessment scene, the result integration scene and the problem solution scene. Scenes can be composed with other scenes by means of sequential, parallel, alternative etc. composition operators.
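The hierarchical composition of scenes from basic scenes can be sketched as a small expression tree. The Python fragment below is a minimal illustration, not the SiteLang notation: the class names Basic, Seq, Par and Alt and the textual operator symbols (";", "||", "|") are assumptions chosen for this example, and only the three composition operators named above are covered.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Basic:
    """A basic scene, e.g. the information seeking scene."""
    name: str


@dataclass(frozen=True)
class Seq:
    """Sequential composition: the parts are performed one after another."""
    parts: Tuple["Scene", ...]


@dataclass(frozen=True)
class Par:
    """Parallel composition: the parts may be performed in any order."""
    parts: Tuple["Scene", ...]


@dataclass(frozen=True)
class Alt:
    """Alternative composition: one of the parts is chosen."""
    parts: Tuple["Scene", ...]


Scene = Union[Basic, Seq, Par, Alt]


def basic_scenes(scene: Scene) -> List[str]:
    """Collect the names of all basic scenes in order of first occurrence."""
    if isinstance(scene, Basic):
        return [scene.name]
    names: List[str] = []
    for part in scene.parts:
        for name in basic_scenes(part):
            if name not in names:
                names.append(name)
    return names


def render(scene: Scene) -> str:
    """Render a composed scene as a bracketed expression."""
    if isinstance(scene, Basic):
        return scene.name
    symbol = {Seq: " ; ", Par: " || ", Alt: " | "}[type(scene)]
    return "(" + symbol.join(render(part) for part in scene.parts) + ")"


story = Seq((
    Basic("information seeking"),
    Par((Basic("collaboration"), Basic("assessment"))),
    Alt((Basic("result integration"), Basic("problem solution"))),
))
print(render(story))
```

Because composed scenes are themselves scenes, the same three operators can be nested to any depth, which mirrors the assumption that scenes are formed hierarchically from basic scenes.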
It seems obvious that these scenes have a common structure and a common behaviour. We can thus extract general stories or scenarios. These general stories use general scenes or scene patterns. Patterns represent recurring solutions to software development problems within a particular context. These scene patterns are refined to concrete scenes.

Let us consider a typical scene pattern that involves a number of learners. Problem solution scenes are among the most complex scenes in edutainment WIS. They typically consist of an orientation or review subscene, of a problem solution scene performed in teams, and of a finalisation subscene. Problem solution scenes are composed of a number of subscenes that can be classified into:
• Review of the state of affairs: The state of the problem solution is reviewed, evaluated, and analysed. Obligations are derived. Open problem tasks can be closed, rephrased or prioritised.
• Study of documents and resources: Available documents and resources are checked as to whether they are available, adequate and relevant for the current scene, and whether they form a basis for the successful completion of the scene.
• Discussions and elicitation with other partners: Discussions may be informal, interview-based, or systematic. The result of such discussions is measured by some quality criteria such as trust or confidence. The results are spread among the partners with some intention such as asking for revision, confirmation, or extension of the discussion.
• Recording and documentation of solutions: The result of the scene is usually recorded in one solution proposal.
• Classification of solutions, requirements, results: Each result developed is briefly examined individually and in relation to the other results on which it depends and on which it has an impact.
• Review of the problem solution process: Once the result to be achieved is going to be recorded, the solution is examined as to whether it has the necessary and sufficient quality, whether it must be revised, updated or rejected, whether there are conflicts, inconsistencies or incompleteness, or whether more may be needed. If the evaluation results in requiring additional scenes or subscenes then the scene or the subscene is going to be extended by them.

This classification is based on the general problem solution framework discussed in [PP45]. Problem solution scenes are usually iterative and thus cyclic, as displayed in Figure 2.

Figure 2: The subscenes in the edutainment problem solution scene (application area understanding, phenomenon understanding, solution preparation, solution development, evaluation of solutions, deployment of solutions)

The general frame for problem solution is then based on the problem solution or search problem: Problem characterisation with abstraction from non-essential parts; Context injection for simplification of the problem and of the solution;
Tools and instruments for solution based on constructors, associations, collections, and classification; Specification/prescription/description as the results of the problem solution.

Analysis and solution formation, and problem solution, are two specific activities that depend on each other. Development aims at forming solutions as well as discovering inconsistencies and conflicts and resolving incompleteness. Solution formation is based on the mapping of properties, requirements or phenomena into WIS solutions. Solutions may have alternative solutions. These alternative solutions can be used in later scenes instead of the given one. Typical techniques for problem solution and formation are exploration and experimentation, skeptical evaluation, conjecturing and refuting. Additional techniques are investigative ones that use resources.

In the evaluation phase we show whether the problem solution result is correct, and if so what kind of tests etc. ought to be devised. We do not require completeness since this is a relative issue. Solution validation is the process of inspecting each solution with reference to the solutions it is based on. It is based on the application area description and checks whether the right solutions are developed. Validation produces a report that establishes the correctness and that produces the extensions and updates necessary for the base solutions. Verification analyses solutions in order to ascertain whether what has been developed satisfies a number of required properties. It checks whether the solutions developed so far are correct according to some correctness criteria and according to proof or verification techniques that are currently applicable. Techniques may be informal, i.e. based on verbal arguments or tests, qualitative, i.e. based on abstraction and qualitative reasoning, or formal, i.e. based on model checking and proof techniques.
Each of the subscenes can be refined depending on the application, the user, the content, collaboration and the control:

User refinement: Each user has a personal profile and a task portfolio. The website should only use those dialogue scenes the learner is assigned to.

Learning objects refinement: Learning objects are under constant change. Whether this change is shown to the user depends on the user profile. In general, however, refinement to available content is generated for the user.

Learning objects refinement: Learning objects have prerequisites, support learning goals, and are associated to other learning objects. These associations must be made available by the system.

Scenario refinement: The edutainment WIS also supports an adaptation of the entire story currently requested by the user depending on the user and the completion of tasks.

Usage refinement: Users are annoyed when they have not completed a task and must begin from scratch after resuming. For this reason, the edutainment WIS must also support refinement to the current usage.

It is obvious that such refinement cannot be generated by random application of refinement rules. We observe, however, that it can be layered. The approach to layering used in the
[Figure 3: Layering of generation and filtering against learning units. Labels recoverable from the figure: learning object space; learning object correlates; explicit story refinement; user refinement; current usage refinement; edutainment WIS content; edutainment WIS context; edutainment WIS story enrichment; edutainment WIS user profile and portfolio enrichment; edutainment WIS on demand enrichment.]
system is displayed in Figure 3. We first generate the learning scene together with its learning objects for each of the dialogue scenes. Next we extend the set of learning objects by all associated objects. This set of learning objects cannot be delivered to the learner in an arbitrary order, e.g., prerequisites must be shown first. Some of the learning objects may be shown in any order, i.e., in parallel. Next we filter this set against the storyboard and generate learning object sequences depending on the scenes. We are then able to take into consideration the user refinement such as the technical equipment, the channel information such as capacity, and the web browser currently in use. Based on this information, we can adapt the website to the current refinement. Finally, we may enhance the scenes to the specific demands.

3.3 Open Learning Units for Content

Classically, learning objects are composed elements. Comparing the existing standards such as LOM and SCORM we extract the following basic units:

Learning elements are basic components providing the content for singleton learning steps. Typical learning elements are definitions, remarks, proofs, lemmata, illustrations, motivational remarks etc. Learning elements may be associated with learning intentions and require basic skills and knowledge from the learner. These requirements form the context of learning elements. Semantic context specifies the prerequisites and the pieces of knowledge that can be learned. Execution context restricts the utilisation of learning elements, e.g., the environment.

Learning modules are the main supporting media objects for lectures. They may be enhanced by indexing, annotation or search functionality. Actors may only call an entire module. Modules may consist of learning elements. Typical composition expressions are regular expressions. The utilisation of learning modules may be based on a number of execution styles such as blackboard execution.
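The remark that module composition is typically expressed by regular expressions can be made concrete with a small sketch. The one-letter encoding of element kinds and the example composition expression below are illustrative assumptions; they are not taken from LOM, SCORM or the system described here.

```python
import re

# Hypothetical one-letter codes for kinds of learning elements.
KIND_CODES = {
    "motivation": "m",
    "definition": "d",
    "remark": "r",
    "lemma": "l",
    "proof": "p",
    "illustration": "i",
}

# Assumed composition expression for one module: an optional motivation,
# then one or more blocks, each a definition followed by any number of
# remarks or illustrations, then any number of lemma/proof pairs.
MODULE_PATTERN = re.compile(r"m?(d[ri]*)+(lp)*")


def encode(kinds):
    """Translate a sequence of element kinds into its code string."""
    return "".join(KIND_CODES[k] for k in kinds)


def conforms(kinds):
    """Return True if the element sequence matches the composition expression."""
    return MODULE_PATTERN.fullmatch(encode(kinds)) is not None
```

A module assembled as motivation, definition, remark, lemma, proof conforms to this expression, whereas one that opens with a proof does not. Checking a candidate sequence this way is one possible reading of "composition expression"; other encodings are equally conceivable.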
Learning modules are typically not materialised. This distinction is too rough for practical usage. Learners may stop learning in any module, may resume learning at a later stage and may define their own way of visiting modules. We thus need a sophisticated, flexible and adaptable mechanism for structuring and composition of learning modules. We distinguish between learning elements that are simply specific media objects with an additional characterisation, and learning units that can be understood
as combined learning elements, which are dependent on the learning element and thus only updateable through the learning element. Learning elements can be understood as simple media objects such as text elements, video clips, images etc. Learning units are composed of learning elements, cf. Figure 4.

[Figure 4: Hierarchical composition of learning elements to learning units. A unit u carries: NameOfUnit u, Identification, HeaderContent, Associations to Units {u}, Meta-data on Unit u, and Contained Elements. Each contained element e1, ..., ek carries: NameOfElement, Identification, Content of Element, Associations to Elements {e}, and Meta-data on Element.]
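The layout of Figure 4 can be read as a nested record structure. The following is a minimal sketch under that reading; the class and field names follow the labels of the figure, while the two helper methods are hypothetical additions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LearningElement:
    """A simple media object with an additional characterisation."""
    name: str
    identification: str
    content: str
    associations: list = field(default_factory=list)  # ids of related elements
    metadata: dict = field(default_factory=dict)


@dataclass
class LearningUnit:
    """A unit composed of learning elements, mirroring Figure 4."""
    name: str
    identification: str
    header_content: str
    associations: list = field(default_factory=list)  # ids of related units
    metadata: dict = field(default_factory=dict)
    elements: list = field(default_factory=list)      # contained LearningElement objects

    def element_ids(self):
        """Identifiers of the contained elements, in composition order."""
        return [e.identification for e in self.elements]

    def dangling_associations(self):
        """Element associations that point outside this unit (hypothetical helper)."""
        known = set(self.element_ids())
        return sorted(
            {a for e in self.elements for a in e.associations if a not in known}
        )
```

Keeping the elements inside the unit reflects the statement that a unit depends on its elements; a check such as `dangling_associations` is one way to see which associated elements a system would still have to make available.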
For composition of learning units we extend the specification by metadata, additional functionality, specific scenarios, specific representation styles, filters for playout, a characterisation of learning unit spaces that can be associated, and context. The onion playout facility based on XSL rules [TD01] supports the generation of the right content, at the right time, in the right representation, at the right cost, and within the right learning history for each learner.

Learning elements and learning units are commonly characterised by
• a name of the unit,
• a general annotation called header content,
• metadata that provide additional information on the unit, and
• associations among the media objects.

Additionally, learning units are characterised by
• an expression hierarchically combining learning elements or learning units into the given one,
• reusability conditions describing the degree of ease with which constituent media objects may be individually accessed and reused,
• common function describing the manner in which the unit is generally used,
• extra-object dependence describing whether the unit needs information (such as location on the network) about learning units other than itself,
• functions of algorithms and procedures within the media object,
• potential for inter-contextual reuse describing the number of different instructional contexts in which the learning unit may be used, that is, the unit’s potential for reuse in different content areas or domains, and
• potential for intra-contextual reuse describing the number of times the unit may be used within the same content area or domain.

The theory of learning elements, learning units and learning modules discussed above is now becoming the state of the art in most edutainment WIS. We are additionally interested in support of learning scenarios which are adaptable in most of the forms discussed above. In order to cope with such requirements we introduce the concept of open learning units. These objects are learning units extended by
• parameters instantiable by information used in general learning modules such as prerequisites, supporting knowledge for better understanding of the unit, and associated units,
• parameters for links to other content elements, comments,
• parameters for replacement strategies depending on availability of content objects,
• parameters for associating units to ontology objects for flexible classification of learning objects and bottom-up collection of learning scenarios,
• parameters for learner profile integration through which the open learning unit can be replaced by objects that are more appropriate for the current learner type,
• parameters keeping track of the learning history, which can be used for adaptation of the unit to the learner’s history, and
• parameters used to keep track of the payment profile of the learner.

Open learning units can be specialized to learning units by instantiating all parameters. In this process, user-specific learning units are generated by step-wise instantiation, extension and specialization of the given open learning unit. This procedure is based on rules which can be specified as attribute grammar rules and which can be transformed to XSL rules.
These XSL rules are used to transform the learning unit, given as an XML document, into more specific XML documents. We need, however, to clarify whether an arbitrary specialization order is applicable. Furthermore, the representation style can be added. This adaptation approach has been extensively discussed in [TD01]. Since we change the learning unit to be transferred to the learner step by step, we call this generation approach the onion generation style.

The delivery of open learning units is based on container functions that support derived learning units by filters which support
• enrichment of each unit by other learning elements or learning units,
• contraction of units to essential material, e.g., for repetition of units already visited, and
• cut of learning units to new units according to the restrictions that are applicable.
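Step-wise specialisation and the first two kinds of filter can be sketched as follows. This is a plain-Python illustration under simplifying assumptions, not the XSL-based mechanism of [TD01]: parameters are a dictionary whose unbound entries are `None`, elements are `(kind, text, essential)` triples, and all names are hypothetical. The cut filter is omitted because its restrictions are not specified here.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class OpenLearningUnit:
    """A learning unit with named parameters that may still be unbound (None)."""
    name: str
    elements: tuple                      # (kind, text, essential) triples
    parameters: dict = field(default_factory=dict)

    def instantiate(self, **bindings):
        """One specialisation step: bind some parameters, return a new unit."""
        unknown = set(bindings) - set(self.parameters)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        return replace(self, parameters={**self.parameters, **bindings})

    def is_closed(self):
        """True once every parameter is bound, i.e. an ordinary learning unit."""
        return all(v is not None for v in self.parameters.values())


def contract(unit):
    """Contraction filter: keep only the elements marked as essential."""
    kept = tuple(e for e in unit.elements if e[2])
    return replace(unit, elements=kept)


def enrich(unit, extra_elements):
    """Enrichment filter: append further elements to the unit."""
    return replace(unit, elements=unit.elements + tuple(extra_elements))
```

Each call to `instantiate` returns a new, more specific unit and leaves the original untouched, which mirrors the layer-by-layer character of the onion generation style. Whether the steps may be applied in an arbitrary order is exactly the open question raised above; the sketch does not settle it.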
This approach has already been used in the DaMiT system. Since DaMiT was centered around data mining material and has been mathematically oriented, a number of specific filters have been developed:
• Repetition filters contracted the units to essential definitions and to the informal statements of theorems in the unit.
• Definitions filters provided the definitions of the unit together with all associated definitions used in the statements and theorems of the unit.
• Examples filters compiled examples in the unit to consecutive examples with references to statements and definitions.
• Quick reminder filters extracted those elements of the units that are marked as absolutely essential.
• Proposition-and-theorems filters summarised all theorems and statements of a unit together with sketches or ideas of their proofs.

3.4 Enhancements of Edutainment Stories

Edutainment WIS authoring is considered to be one of the most difficult tasks. This difficulty is increased by the current approach of developing content and stories from scratch. Classroom teaching does not follow this approach. It is mainly based on sequenced or curriculum-ruled learning approaches. These approaches are easier to use due to the high level of reuse, due to the high level of similarity within the teaching scenario, and due to the homogeneity of teams after the class has been formed. The last advantage cannot be used in edutainment WIS since the audience is very heterogeneous, changes over time, does not follow any timed schedule, has a broad variety of preliminary knowledge and is self-organised. The first two advantages may however be incorporated into edutainment as well.

Meanwhile it is well acknowledged that general purpose, general content and general storyboard systems are not feasible. Since any area of knowledge has its specific approaches to learning that knowledge, we need specific storyboards for edutainment systems.
General purpose learning systems are replaced by specific purpose systems, e.g., for learning the application of certain knowledge. Any content needs its specific representation. Therefore, we must specialize on the one hand and must remain very general for any kind of story required on the other. We shall show that our approach supports these requirements.

Authors of content for edutainment may base their storyboard on general “patterns” of scenarios that might be useful for the given topic. Such a general scenario can then be refined by the author. We are interested in scenarios that can be composed from given scenarios. So, the author first selects a general scenario or a storyboard. Next he/she uses patterns for refinement of its scenes. Educational material is assigned to scenes based on the storyboard. This material is modelled by media types [ST04]. Finally, collaboration, control and workspace scenarios are folded into the developed scenario.

Let us demonstrate the integration based on the experience we gained in one of our edutainment WIS projects. Data mining is one of the challenging topics. First, users must learn in a rather interactive form. Second, the outcome of the learning process must be evaluated and interpreted. Third, the data used for data mining are often private or secured data. Therefore,
the user should not transfer the data. Instead the user must understand how to prepare for the data mining process. Additionally, the demand for data mining appears as a part of the daily business of users. In a large project integrating research groups in ten German universities and application groups in half a dozen German software companies we developed the edutainment system DaMiT (data mining tutoring) that educates users to such an extent that they can evaluate whether decision support systems are generating correct results and whether these results will lead to correct conclusions. The system supports decision learning
• by learning basic and advanced topics on data mining on demand,
• by applying data mining algorithms to test data for training of users, and
• by finally applying these data mining algorithms to data of the application engineer.

Surprises, data warehousing, and complex applications often require sophisticated data analysis. The most common approach to data analysis is to use data mining software or reasoning systems based on artificial intelligence. These applications allow data to be analysed based on the data at hand. At the same time data are often observational or sequenced data, noisy data, null-valued data, incomplete data, of wrong granularity, of wrong precision, of inappropriate type or coding, etc. Therefore, brute-force application of analysis algorithms leads to wrong results, to losses of semantics, to misunderstandings etc.

The storyboard in Figure 2 gives only a general description of the main part of the data mining storyboard. It must be enhanced to become a complete story for learning data mining. We thus need general frameworks for data analysis beyond the framework used for data mining. The enhancement procedure may be based on general story descriptions.
In our application, approaches known from mathematics for general mathematical problem solving have been used for derivation of a data analysis framework:

Elaboration of needs: Modelling of the tasks and problems and their data requirements.

Elaboration of opportunities: Selection of possible appropriate analysis algorithms, categorisation of their outcome and pitfalls within the task and problem scope, development of an application frame for application of the chosen algorithms.

Extraction, transformation, and loading of data: Categorisation, extraction of macro- and meta-data, adaptation of the data to the analysis needs and modelling of data semantics and pragmatics.

Problem solution: Extraction, transformation and loading of macro-data for the chosen analysis algorithms, including cleansing and adaptation of the data.

Refinement of problem solution: Application, stepwise refinement and correction of the analysis algorithms.

Interpretation of analysis: Modelling of the obtained analysis results with their semantics and pragmatics.
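The cleansing named under "Problem solution" can be illustrated by a very small profiling step that is run before any analysis algorithm is applied. The sketch assumes that records are plain dictionaries and that the expected type of each field is known; the function names and the record layout are hypothetical and are not part of DaMiT.

```python
def profile(records, expected_types):
    """Count null-valued and wrongly typed entries per field.

    records: list of dicts; expected_types: field name -> Python type.
    A missing key is treated like a null value.
    """
    report = {name: {"null": 0, "wrong_type": 0} for name in expected_types}
    for row in records:
        for name, expected in expected_types.items():
            value = row.get(name)
            if value is None:
                report[name]["null"] += 1
            elif not isinstance(value, expected):
                report[name]["wrong_type"] += 1
    return report


def cleanse(records, expected_types):
    """Keep only the rows in which every field is present and well typed."""
    return [
        row for row in records
        if all(isinstance(row.get(n), t) for n, t in expected_types.items())
    ]
```

Dropping defective rows is only one of several possible cleansing policies; repairing or imputing values may be preferable when the data are scarce. The profiling report at least makes visible how much of the input would be affected before an algorithm is applied by brute force.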
The general data analysis framework has been developed for data mining tasks. It considers modelling of data analysis algorithms, description of their requirements to data, meta-data of their functionality for analysis and transformations of the data. This general framework can now be refined in a number of ways. A typical refinement is displayed in Figure 5. Users learn the opportunities of data mining based on case studies. They explore good and bad studies, get an explanation based on the theory background, and become familiar with modelling techniques.

[Figure 5: The scenes for elaboration of opportunities for active learning in the DaMiT system. Scene and transition labels recoverable from the figure: Elaboration of opportunities by analogy; Disselect case; Empty list of cases; Deficits in the application domain; Survey on case studies; Selected case study; Hall of fame for solutions; Related cases; Theory behind; Model survey.]
The framework is classically enhanced by scenarios that support training of users in data mining based on exercises, examples, case studies etc.

3.5 Context Space Enhancements

When determining context we already know the edutainment scenarios we would like to support, the intentions associated with the WIS, the user and learner characterisation on the basis of profiles and portfolios, and the technical environment we are going to use. These restrictions enable a more refined understanding of context within a WIS. [MST05] characterises a WIS by six intertwined dimensions: the intentions, the usage, the content, the functionality, the context, and the presentation. We must thus relate context to the other dimensions. As presentation resides on a lower level of abstraction, it does not have an impact on context. Content and functionality will be used for context refinement. The user model, the specified edutainment scenarios, and the intention can be used for a disambiguation of the meaning and an injection of context. In doing so we distinguish the following facets of context:

Learner context: The WIS is used by learners for a number of tasks in a variety of involvements and well understood collaboration. These learners impose their quality requirements on the WIS usage as described by their security and privacy profiles. They need additional auxiliary data and auxiliary functions. The variability of use is restricted by the learner’s context, which covers the learner’s specific tasks and specific data and
function demand, and by chosen involvement, while the profile of learners imposes exceptions. The involvement and collaboration of learners is based on assumptions of social behaviour and restrictions due to organisational decisions. These assumptions and restrictions are components of the learner’s context.

Storyboard context: The meaning of content and functionality to users depends on the stories, which are based on scenarios that reflect learning scenarios and the portfolios of users or learners. According to the profile of these users a number of quality requirements such as privacy, security and availability must be satisfied. The learner’s scenario context describes what the learner needs to understand in order to efficiently and effectively solve his/her tasks in the actual portfolio. The learners determine the policy for following particular stories.

System context: The edutainment WIS is developed to support a number of intentions. The purposes and intents lead to a number of decisions on the WIS architecture, the technical environment, and the implementation. The WIS architecture has an impact on its utilisation, which often is only implicit and thus leads to incomprehensible system behaviour. The technical environment restricts the user due to restrictions imposed by server, channel and client properties. Adaptation to the current environment is defined as context adaptation to the current channel, to the client infrastructure and to the server load. At the same time a number of legal decisions based on regulations, laws and business rules have been incorporated into the WIS.

Temporal context: The utilisation of a scene by a learner depends on his/her history of utilisation. Learners may interrupt and resume their activities at any moment of time. As they may not be interested in repeating all previous actions they have already successfully completed, the temporal context must be taken into account.
Due to availability of content and functionality the current utilisation may lead to a different story within the same scenario.

This entire information forms the context space, which brings together the storyboard specification and the contextual information. Typical questions that are answered on the basis of the context space are:
• What content is required by the context space?
• What functionality is required by the context space?
• What has to be changed for the life cases, the storyboard, etc., if context is considered?

As outlined above, the context space is determined by the learners, the scenarios, the WIS itself, and the time. It leads to a specialisation of the content, structuring and functionality of the scenes.

Context is associated with desirable properties of the WIS such as quality criteria and security and privacy requirements. Quality criteria such as suitability for the users or learnability provide obligations for the WIS development process. Though these criteria are rather fuzzy, they lead directly to a number of implementation obligations that must be fulfilled at later stages, i.e. within the development on the implementation layer. For instance, learnability means comprehensibility, i.e. the WIS must be easy to use, remember, capture and forecast. This requires clarity of the visual representation, predictability,
directness and intuitiveness. These properties allow the user to concentrate on the tasks. The workflows and the discourse structure correspond to the expectations of the users and do not lead to surprising situations. They can be based on metaphors and motives taken from the application domain. In the same way other quality criteria can also be mapped to development obligations.

Other properties that may be associated with context refer to the potential utilisation for other tasks outside the scope of the storyboard. In this case we do not integrate the additional tasks into the storyboard, but instead support these tasks, if this is in accordance with our intentions. For instance, we might expect further visits targeting core concerns of the edutainment WIS.

4 Open Issues and Challenges for Edutainment WIS

Learning content must be properly handled. It turns out that this task is one of the most difficult ones. For this reason, any edutainment WIS must be combined with an authoring WIS that supports authors during appropriate development of content. Quality control of learning objects is necessary since low quality data harms the learning success. We may distinguish a number of reasons for low quality and development strategies for improvement of quality: incomplete content, mixed content, wrong content, complex associations among content, and mutated content.

Functionality development for edutainment WIS also includes the development of very sophisticated supporting facilities for the actors, the content, the context, and the presentation. Most edutainment systems are currently based on scenarios of sequenced learning. Third-generation systems aim at providing best-suited content just in time to the right user, place and device with the best pricing. They challenge current technology. Research is sought on didactics, content integration and delivery, storyboarding, adaptation and context integration, and success control.
Open learning objects provide a sophisticated facility for content management. The theory of open learning units should be integrated with didactics based on storyboarding, content adaptation and delivery, and content development. In future, they will be extended with context, e.g., story space, actor, user, payment, portfolio, association, history, etc. Control functionality should be provided for open learning units in the same fashion as we know it already for exercises, tests, and exams for self-control or certification.

References

[BGW92] R. M. Briggs, L. J. Gagne, and W. W. Wager. Principles of Instructional Design. Thomson Learning, 1992.
[BZKK+05] A. Binemann-Zdanowicz, R. Kaschek, T. Kuss, K.-D. Schewe, B. Thalheim, and B. Tschiedel. A conceptual view of electronic learning systems. Education and Information Technologies, 2005.
[DN03] S. Dohi and S. Nakamura. The development of the dynamic syllabus for school of information environment. In ITHET03, pages 505–510, 2003.
[For94] H. I. Forsha. The Complete Guide to Storyboarding and Problem Solving. ASQ Quality Press, 1994.
[JMR+03] Klaus P. Jantke, M. Memmel, O. Rostanin, B. Thalheim, and B. Tschiedel. Decision support by learning-on-demand. In Proc. of the 15th Conf. on Advanced Information Systems Engineering (CAiSE ’03), Workshops Proc., Information Systems for a Connected Society. CEUR Workshop Proc. 75, pages 317–328. Technical University Aachen (RWTH), 2003.
[Kun92] J. Kunze. Generating verb fields. In Proc. KONVENS, Informatik Aktuell, pages 268–277. Springer, 1992. In German.
[LOM00] LOM. Working draft v4.1. http://ltsc.ieee.org/doc/wg12/LOMv4.1.htm, 2000.
[LTS00] LTSC. Learning technology standards committee website. http://ltsc.ieee.org/, 2000.
[MST05] T. Moritz, K.-D. Schewe, and B. Thalheim. Strategic modelling of web information systems. International Journal on Web Information Systems, 1(4):77–94, 2005.
[PP45] G. Polya and G. Polya. How to solve it: A new aspect of mathematical method. Princeton University Press, Princeton, 1945.
[RK04] W. J. Rothwell and H. C. Kazanas. Mastering the Instructional Design Process: A Systematic Approach (Third Edition). Pfeiffer, 2004.
[SBZ04] J. Sonnberger and A. Binemann-Zdanowicz. Kopra - ein adaptives Lehr-Lernsystem für kooperatives Lernen. In GMW’2004, Graz, Austria, Sept. 2004, pages 274–285, 2004.
[SSS90] H. Schreiber, K.-E. Sommerfeld, and G. Starke. Deutsche Wortfelder für den Sprachunterricht: Verbgruppen. VEB Verlag Enzyklopädie, Leipzig, 1990.
[ST04] K.-D. Schewe and B. Thalheim. Web Information Systems, chapter Structural media types in the development of data-intensive web information systems, pages 34–70. IDEA Group, 2004.
[ST05] K.-D. Schewe and B. Thalheim. Conceptual modelling of web information systems. Data and Knowledge Engineering, 54:147–188, 2005.
[TD01] B. Thalheim and A. Düsterhöft. Sitelang: Conceptual modeling of internet sites. In Proc. ER’01, volume 2224 of LNCS, pages 179–192. Springer, 2001.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
A Model of Database Components and their Interconnection Based upon Communicating Views

Stephen J. HEGNER†
Umeå University, Department of Computing Science, SE-901 87 Umeå, Sweden
Abstract A formalism for constructing database schemata from simple components is presented in which the components are coupled to one another via communicating views. The emphasis is upon identifying the conditions under which such components can be interconnected in a conflict-free fashion, and a characterization of such, based upon the acyclicity of an underlying hypergraph, is obtained. The work is furthermore oriented towards an understanding of how updates can be supported within the component-based framework, and initial ideas of so-called canonical liftings are presented.
1. Introduction
Large systems typically possess a complex structure; database systems are no exception. Schemata with thousands of relations are not uncommon; the largest have tens of thousands. If not designed properly, such systems are certain to be unmanageable intellectually, leading to redundancy, design errors, and difficulties as the schema evolves over time. In many engineering settings, including in particular computer hardware, the problems associated with complexity are addressed, at least in part, by designing a large system as the interconnection of simpler (often prefabricated) components. In the field of logic design, for example, this approach has become completely standard [16]. For a variety of reasons, it has not seen as much success in software engineering [17], and in database systems in particular. Although the idea that formal objects which describe computations, such as automata, can be the basic modules of an interconnection calculus dates back to relatively early days of theoretical computer science (see, for example, [2, Sec. 3.2]), corresponding ideas have not found great success in application to more concrete problems. To address some of the shortcomings of more classical approaches, Broy [5] [6] has proposed a formal interconnection calculus for components, including software components in particular. His approach has two flavors. The first is based upon input and output streams with otherwise stateless components, the state being effectively recaptured in the stream. The second is based upon a more conventional state-transition approach.

† Much of the research leading to this paper was completed while the author was a visitor at the Information Systems Engineering Group, Department of Computer Science, Christian-Albrechts University of Kiel, Germany.
S.J. Hegner / A Model of Database Components and Their Interconnection
Thalheim [24] [22] [25] has recently forwarded the idea of basing database design upon components. While his formal calculus is based upon the state-transition model of Broy, the emphasis is solidly upon a less formal approach to the design and synthesis of schemata within the context of the higher-order entity-relationship model (HERM) [23], which is in turn based upon the classical entity-relationship (ER) model [8]. The present work was motivated by the desire to understand the various aspects of updates, in particular their propagation, in the context of the database components of Thalheim. As the investigation evolved, however, it gradually became apparent that a somewhat different model of database component was necessary. Although database systems are a form of software system, and therefore it is not unreasonable to use similar formalisms for both, database systems also have special characteristics not shared by all software systems. In particular, in many database modelling scenarios, the central notions are not sequences or streams of data, but rather database schemata and their views. This is particularly true in the context of updates. Furthermore, while the ER model and its extensions are widely used in database modelling and design, few if any actual systems are based upon it. Rather, ER-based designs are invariably converted to operational models, such as the relational model or an object-relational model, for final realization. That an understanding of database updates is closely tied to database views has been recognized since the seminal work of Bancilhon and Spyratos [3], and has been elaborated greatly in [12] [13]. Thus, in an attempt to understand updates in the context of database components, it seems most natural to seek a model of components in which views play a central rôle. In this work, an alternative notion of database component, based upon schemata and views, is forwarded.
Roughly speaking, the components are schemata which are interconnected via common views, called ports. Communication is then not via sequences or streams, but rather by applying updates to these ports. An approach to components which is based upon the notions of database schema and view must begin with a choice for the representation of these notions. It turns out that there is relatively little to be gained by restricting the investigation to a particular data model, such as the relational model and its relatives. Rather, much more abstract models, in which a database schema is represented by a (possibly structured) set which represents its databases, and a database morphism is a structure-preserving morphism, suffice completely. The formalizations surrounding the constant-complement strategy [3] [12] [13] adapt naturally to the component framework. In particular, the notions developed in these works for view updates provide, perhaps somewhat surprisingly, precisely the notions which are necessary for the representation of component interconnection.

The organization of this paper is as follows. In Section 2, the fundamental notions of view-based database components and their interconnections are presented. In Section 3, the notion of the hypergraph of a component is presented, and characterizations of “good” components, in terms of properties of their underlying hypergraphs, are presented. In Section 4, the topic of updates to interconnections of components is explored briefly. Finally, in Section 5, some conclusions and further directions are presented.

2. The Basic Ideas of View-Based Database Components
The component-based approach is compositional, not decompositional. In other words, instead of beginning with a large main schema and breaking it into its constituent parts, the starting point is a collection of smaller schemata, together with information on how they may be combined. Since the decomposition of relational schemata is a topic which should be familiar to most readers, it is helpful to begin with a simple example of such decomposition,
S.J. Hegner / A Model of Database Components and Their Interconnection
and then identify the conditions under which it may be reversed to yield a component-based composition theory.
2.1 Example: Reconstructing simple components from a decomposition
Let $E_0$ be the relational schema comprised of the single relation name $R_0[B_1B_2B_3]$, constrained by the set $F_0 = \{B_1 \to B_2,\ B_2 \to B_3\}$ of functional dependencies (FDs), and let $\mathrm{LDB}(E_0)$ denote the (finite) legal databases (i.e., relations) of this schema (relative to appropriately chosen domains). The view $\Pi^{E_0}_{B_1B_2} = (E_1, \pi^{E_0}_{B_1B_2})$ of $E_0$ is the projection of $E_0$ onto the attributes $B_1B_2$. More precisely, $E_1$ denotes the schema whose single relation is $R_1[B_1B_2]$ and whose databases, denoted $\mathrm{LDB}(E_1)$, are precisely the projections of members of $\mathrm{LDB}(E_0)$ onto the attributes $B_1B_2$, with the view mapping

$$\pi^{E_0}_{B_1B_2} : r \mapsto \{(b_1, b_2) \mid (\exists b_3)((b_1, b_2, b_3) \in r)\}.$$

By construction, $\pi^{E_0}_{B_1B_2}$ is surjective. Define the views $\Pi^{E_0}_{B_2B_3} = (E_2, \pi^{E_0}_{B_2B_3})$ and $\Pi^{E_0}_{B_1B_3} = (E_2, \pi^{E_0}_{B_1B_3})$ similarly, with $R_2[B_2B_3]$ and $R_3[B_1B_3]$ their relation schemes, respectively.

It follows from one of the earliest and most widely known results in relational database theory [10, p. 31] that the decomposition of $E_0$ into the views $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_2B_3}\}$ is lossless, in the sense that the decomposition mapping $\Delta_{(B_1B_2,\,B_2B_3)} : \mathrm{LDB}(E_0) \to \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2)$ defined by $r \mapsto (\pi^{E_0}_{B_1B_2}(r), \pi^{E_0}_{B_2B_3}(r))$ is injective. Similarly, the decomposition of $E_0$ into $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_1B_3}\}$ is lossless. However, the decomposition $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_2B_3}\}$ enjoys a second property which $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_1B_3}\}$ lacks. Specifically, define

$$(\ast)\qquad \mathrm{LDB}(E_1) \otimes \mathrm{LDB}(E_2) = \{(r_1, r_2) \in \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2) \mid \pi^{E_1}_{B_2}(r_1) = \pi^{E_2}_{B_2}(r_2)\}$$

with $\pi^{E_1}_{B_2}$ and $\pi^{E_2}_{B_2}$ the obvious projections onto attribute $B_2$. Then $\Delta_{(B_1B_2,\,B_2B_3)}(\mathrm{LDB}(E_0)) = \mathrm{LDB}(E_1) \otimes \mathrm{LDB}(E_2)$. Viewed another way, there is a bijective correspondence between the legal databases of $E_0$ and those pairs of databases of $E_1$ and $E_2$ which agree on the common column $B_2$. Rissanen [21, Sec. 3] calls this property independence, and shows that it holds precisely in the case that the decomposition is lossless and dependency preserving; the latter meaning that a cover of the FDs of the original schema embeds into the views. In this case, the embedding is particularly simple, with $B_1 \to B_2$ lying in $E_1$ and $B_2 \to B_3$ lying in $E_2$. Indeed, $\mathrm{LDB}(E_1)$ (resp. $\mathrm{LDB}(E_2)$) is exactly the set of relations which satisfy $B_1 \to B_2$ (resp. $B_2 \to B_3$). On the other hand, there is no embedded cover of $F_0$ into $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_1B_3}\}$, and so this independence property does not hold for that context. In [12, 2.17], this property is generalized to much more general classes of constraints.

What is remarkable about independence is that the question of whether a pair $(r_1, r_2) \in \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2)$ arises from a database of the main schema may be answered with reference only to a combination of local constraints on the component schemata and a view-based compatibility condition between them; no other knowledge of the main schema is necessary. This leads naturally to the component-based philosophy. Define the components $K_1 = (E_1, \{\Pi^{E_1}_{B_2}\})$ and $K_2 = (E_2, \{\Pi^{E_2}_{B_2}\})$. The set $\{\Pi^{E_1}_{B_2}\}$ identifies the ports of $K_1$, and $\{\Pi^{E_2}_{B_2}\}$ likewise for $K_2$, with $\Pi^{E_i}_{B_2}$, for $i \in \{1, 2\}$, the obvious view defined by projection. The underlying schemata of these two ports are identical. Denote this common schema by $G_1$; it has $T_1[B_2]$ as its sole relational symbol. The interconnection of $K_1$ and $K_2$ is depicted graphically in Figure 1.

[Figure 1: The interconnection of $K_1$ and $K_2$. The components $K_1$ (relation $R_1[B_1B_2]$) and $K_2$ (relation $R_2[B_2B_3]$) are joined at the common port schema $G_1$.]
This property, that the view schemata of two ports are identical, is called star compatibility, and is central to the interconnection of these components. The star interconnection of $K_1$ and $K_2$ is in effect a join of $K_1$ and $K_2$; its schema is the “union” of the schemata of $K_1$ and $K_2$, subject to the constraint that the two relations agree on the views defined by the components. In more detail, define the schema $E_{12}$ to have the two relation symbols $R_1[B_1B_2]$ and $R_2[B_2B_3]$, constrained by the FDs $B_1 \to B_2$ and $B_2 \to B_3$ respectively, as well as the port constraint which stipulates that for any $(r_1, r_2) \in \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2)$, $\pi^{E_1}_{B_2}(r_1) = \pi^{E_2}_{B_2}(r_2)$. Formally, this join of $K_1$ and $K_2$, the compound component defined by the star interconnection $\{\Pi^{E_1}_{B_2}, \Pi^{E_2}_{B_2}\}$, is denoted $\mathrm{Cpt}\langle\{K_1, K_2\}, \{\Pi^{E_1}_{B_2}, \Pi^{E_2}_{B_2}\}\rangle$, and is given explicitly by $(E_{12}, \{\Pi^{E_{12}}_{B_2}\})$, with $\Pi^{E_{12}}_{B_2}$ the view whose schema is $E_{12}$ and whose view mapping projects either of the two relations of $E_{12}$ onto attribute $B_2$. Note that $E_{12}$ is the schema which is obtained by decomposing $E_0$ into $E_1$ and $E_2$, and since $E_{12}$ and $E_0$ are isomorphic in a natural way, essentially the same information is represented. However, the component-based approach is compositional: the components $K_1$ and $K_2$ make no reference whatever to any main schema.

This same construction does not work for $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_1B_3}\}$. Upon defining $\Delta_{(B_1B_2,\,B_1B_3)} : \mathrm{LDB}(E_0) \to \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2)$ by $r \mapsto (\pi^{E_0}_{B_1B_2}(r), \pi^{E_0}_{B_1B_3}(r))$, with

$$(\ast')\qquad \mathrm{LDB}(E_1) \otimes \mathrm{LDB}(E_2) = \{(r_1, r_2) \in \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2) \mid \pi^{E_1}_{B_1}(r_1) = \pi^{E_2}_{B_1}(r_2)\}$$

it is not the case that $\Delta_{(B_1B_2,\,B_1B_3)}(\mathrm{LDB}(E_0)) = \mathrm{LDB}(E_1) \otimes \mathrm{LDB}(E_2)$. Rather, to determine whether a pair $(r_1, r_2) \in \mathrm{LDB}(E_1) \times \mathrm{LDB}(E_2)$ arises as the projection of some $r \in \mathrm{LDB}(E_0)$, it is necessary first to compute the join of that pair, since the constraint $B_2 \to B_3$ cannot be checked within either of the projections alone. In other words, upon defining the component $K_2 = (E_2, \{\Pi^{E_0}_{B_1B_3}\})$, it is not the case that the interconnection of $K_1$ and $K_2$ will have a schema which is isomorphic to $E_0$. Mathematically, the congruences of the pair $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_2B_3}\}$ commute, while those of $\{\Pi^{E_0}_{B_1B_2}, \Pi^{E_0}_{B_1B_3}\}$ do not. For details of these ideas, as well as their connection to the constant-complement update strategy for views, consult [12].
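The contrast drawn in 2.1 can be checked by brute force on a very small instance. The following is a minimal sketch, assuming a two-element domain for every attribute (the paper allows arbitrary domains); the function names `image` and `matched_pairs` are hypothetical helpers introduced only for this illustration. It enumerates the legal relations of $E_0$, computes the image of each decomposition mapping, and compares it with the set of pairs of view states that agree on the shared attribute.

```python
from itertools import combinations, product

DOM = (0, 1)  # assumed tiny domain, for illustration only


def relations(arity):
    """Yield every relation (set of tuples) of the given arity over DOM."""
    tuples = list(product(DOM, repeat=arity))
    for size in range(len(tuples) + 1):
        for subset in combinations(tuples, size):
            yield frozenset(subset)


def satisfies_fd(rel, lhs, rhs):
    """Check the functional dependency column[lhs] -> column[rhs]."""
    seen = {}
    return all(seen.setdefault(t[lhs], t[rhs]) == t[rhs] for t in rel)


def project(rel, cols):
    """Project a relation onto the given column positions."""
    return frozenset(tuple(t[c] for c in cols) for t in rel)


# LDB(E0): relations on B1 B2 B3 (columns 0, 1, 2) with B1 -> B2 and B2 -> B3.
LDB_E0 = [r for r in relations(3)
          if satisfies_fd(r, 0, 1) and satisfies_fd(r, 1, 2)]


def image(cols_a, cols_b):
    """Image of the decomposition mapping r -> (proj_a(r), proj_b(r))."""
    return {(project(r, cols_a), project(r, cols_b)) for r in LDB_E0}


def matched_pairs(cols_a, cols_b, shared_a, shared_b):
    """Pairs of view states that agree on the shared column, as in (*)."""
    views_a = {project(r, cols_a) for r in LDB_E0}
    views_b = {project(r, cols_b) for r in LDB_E0}
    return {(ra, rb) for ra in views_a for rb in views_b
            if project(ra, (shared_a,)) == project(rb, (shared_b,))}


# Decomposition into B1B2 and B2B3 (shared attribute B2).
chain_image = image((0, 1), (1, 2))
chain_pairs = matched_pairs((0, 1), (1, 2), shared_a=1, shared_b=0)

# Decomposition into B1B2 and B1B3 (shared attribute B1).
fork_image = image((0, 1), (0, 2))
fork_pairs = matched_pairs((0, 1), (0, 2), shared_a=0, shared_b=0)

print(chain_image == chain_pairs)  # independence holds
print(fork_image < fork_pairs)     # some matching pairs have no source in LDB(E0)
```

Both decompositions are lossless on this instance, since each image has as many elements as `LDB_E0`. Only the first image coincides with the set of matching pairs, which is the independence property that the component construction relies upon.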
2.2 Database contexts
As noted in the introduction, the ideas developed here are not limited to a specific data model, such as the relational model or the ER model. Rather, they apply to virtually any database model. Formally, a database context is a pair $S = (\mathbf{D}, \mathrm{LDB})$ in which $\mathbf{D}$ is a class of database schemata and their morphisms, and $\mathrm{LDB}$ is a function which associates to each schema $D$ of $\mathbf{D}$ a set $\mathrm{LDB}(D)$ of legal states, and to each database morphism $f : D_1 \to D_2$ of $\mathbf{D}$ a function $\mathrm{LDB}(f) : \mathrm{LDB}(D_1) \to \mathrm{LDB}(D_2)$. The idea is that a database schema $D$ is modelled by its legal states alone, and that a database morphism is modelled by its underlying function. This is precisely the framework which is used in the original work on the constant-complement update strategy [3]. Suitable examples for $\mathbf{D}$ include the context of all relational schemata with morphisms defined by the relational algebra, nested relational models [20, Ch. 7], and the HERM model [23] together with suitably defined morphisms.

This modelling assumption must be faithful to the more structured framework which it represents. In particular, joins, as exemplified in equation $(\ast)$ of 2.1, must translate back and forth between the full data model and the $\mathrm{LDB}$-based abstraction. These conditions are so natural that it is difficult to envision any reasonable formalization of database schemata and morphisms which would not satisfy them. Therefore, the straightforward but lengthy list of conditions which must be satisfied is not given here. However, for those readers familiar with the basic language of category theory [1] [15], these conditions can be characterized
succinctly by requiring that the database schemata and morphisms form a concrete category $\mathbf{D}$ with finite limits over the category $\mathbf{Set}$, with the further condition that the grounding functor $\mathrm{LDB} : \mathbf{D} \to \mathbf{Set}$ both preserve and reflect limits.

As a notational convenience, since the $\mathrm{LDB}$ notation on morphisms can become quite cumbersome, for a database morphism $f : D_1 \to D_2$ in $\mathbf{D}$, $\mathring{f}$ or $(f)^{\circ}$ will often be used as shorthand for $\mathrm{LDB}(f)$.
2.3 Notational convention
Throughout the rest of this paper, unless stated specifically to the contrary, take S to be a database context. Unless stated specifically to the contrary (e.g., in examples), all database schemata and morphisms will be assumed to be based in S. Since views are central to the approach to components given here, a precise definition within the context S is essential. The following definition is based in large part upon those found in [12] and [13], to which the reader is referred for more detail.
2.4 Views
Let $D$ be a database schema. A view of $D$ is a pair $\Gamma = (V, \gamma)$ with $V$ a database schema and $\gamma : D \to V$ a database morphism with the property that $\mathring{\gamma} : \mathrm{LDB}(D) \to \mathrm{LDB}(V)$ is surjective. The zero view on $D$ is $\mathrm{ZView}_D = (\mathrm{ZSchema}, \mathrm{ZMor}_D)$, with $\mathrm{ZSchema}$ a database schema with the property that $\mathrm{LDB}(\mathrm{ZSchema})$ consists of exactly one element, and $(\mathrm{ZMor}_D)^{\circ} : \mathrm{LDB}(D) \to \mathrm{LDB}(\mathrm{ZSchema})$ the function which sends every $M \in \mathrm{LDB}(D)$ to the unique element of $\mathrm{LDB}(\mathrm{ZSchema})$. In other words, $\mathrm{ZSchema}$ is a constant schema with exactly one state, and $\mathrm{ZView}_D$ is the view of $D$ which maps every $M \in \mathrm{LDB}(D)$ to that state. Under the conditions identified in 2.2, a zero view exists for every schema $D$, and is furthermore unique up to isomorphism.

Views occur very frequently in this work. To avoid the need to spell out the complete definition every time, the following convention will be followed. If a view is named by the Greek letter $\Gamma$, then the view morphism will be denoted by $\gamma$, and $\mathrm{Schema}(\Gamma)$ will be used as an alias for the underlying view schema. This convention furthermore extends to all subscripted and superscripted variants. For example, for the view $\Gamma$ defined above, $\mathrm{Schema}(\Gamma)$ is an alias for $V$. Similarly, for a view named $\Gamma_i$, the full definition is $\Gamma_i = (\mathrm{Schema}(\Gamma_i), \gamma_i)$. Additionally, when $\mathrm{Schema}(X)$ appears as the argument of another notation, $X$ will frequently be used in its stead when no confusion can result. In particular, $\mathrm{LDB}(\Gamma_i)$ will be used as an abbreviation for $\mathrm{LDB}(\mathrm{Schema}(\Gamma_i))$.
2.5 Components
The formal definition of a component follows the pattern introduced in the example of 2.1, but is based not upon the relational model but rather upon the more general model of database schemata and views in the context S. Specifically, a component is an ordered pair C = (Schema(C), Ports(C)) which satisfies the following two conditions.
(cpt-i) Schema(C) is a database schema.
(cpt-ii) Ports(C) is a finite set of views of Schema(C), called the ports of C, with the property that none of the ports are zero views. LDB(C) will be used as notational shorthand for LDB(Schema(C)).
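The definitions of 2.4 and 2.5 can be mirrored in a few lines of code when every schema is represented by a finite set of legal states. The following is only a sketch under that assumption: the class names `View` and `Component`, the helper `powerset`, and the two-element domain are hypothetical choices for illustration, not part of the formal model. The example component corresponds to $K_1$ of 2.1, with its single port projecting onto $B_2$.

```python
from dataclasses import dataclass
from itertools import combinations, product
from typing import Callable, FrozenSet, Hashable, Tuple


@dataclass(frozen=True)
class View:
    states: FrozenSet[Hashable]               # legal states of the view schema
    mapping: Callable[[Hashable], Hashable]   # underlying function of the view morphism

    def is_view_of(self, source_states):
        """A view mapping must be surjective onto the view's legal states."""
        return {self.mapping(m) for m in source_states} == set(self.states)

    def is_zero(self):
        """A zero view has exactly one state."""
        return len(self.states) == 1


@dataclass(frozen=True)
class Component:
    states: FrozenSet[Hashable]   # LDB(Schema(C))
    ports: Tuple[View, ...]       # Ports(C)

    def is_well_formed(self):
        """Each port is a view of the schema, and none is a zero view (cpt-ii)."""
        return all(p.is_view_of(self.states) and not p.is_zero()
                   for p in self.ports)


def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


DOM = (0, 1)  # assumed two-element domain for B1 and B2

# Legal states of K1: relations over B1 B2 satisfying B1 -> B2
# (no B1 value occurs in two distinct tuples).
K1_STATES = frozenset(r for r in powerset(product(DOM, DOM))
                      if len({b1 for b1, _ in r}) == len(r))

# Port of K1: projection onto B2; its schema admits every unary relation over DOM.
PORT_B2 = View(states=frozenset(powerset(DOM)),
               mapping=lambda r: frozenset(b2 for _, b2 in r))

# A zero view: one state, and every source state is sent to it.
ZERO = View(states=frozenset({()}), mapping=lambda r: ())

K1 = Component(K1_STATES, (PORT_B2,))
BAD = Component(K1_STATES, (PORT_B2, ZERO))

print(K1.is_well_formed())   # the projection port is a non-zero view
print(BAD.is_well_formed())  # rejected, since one of its ports is a zero view
```

Representing a schema by its set of legal states alone follows the modelling assumption of 2.2; a richer implementation would carry the structured schema as well.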
2.6 Name-normalized components
In describing interconnections of components, it is essential to be able to recover the identity of the embodying component from the name of a port. While an elaborate tagging formalism could be developed, the solution proposed here is much simpler; namely, that for a given set of components, all port names are globally unique. Since this is only a naming convention, there is no loss of generality in such an assumption. Specifically, let $X$ be a finite set of components.
(a) $X$ is called name normalized if for distinct $C, C' \in X$, $\mathrm{Ports}(C) \cap \mathrm{Ports}(C') = \emptyset$.
(b) For $Y \subseteq X$, define $\mathrm{Ports}\langle Y\rangle = \bigcup\{\mathrm{Ports}(C) \mid C \in Y\}$. Thus, $\mathrm{Ports}\langle Y\rangle$ is just the set of all ports which are associated with some component in $Y$.
(c) If $X$ is name normalized and $\Gamma \in \mathrm{Ports}\langle X\rangle$, then $\mathrm{SrcCpt}(\Gamma)$ denotes the source component of $\Gamma$, which is the unique $C \in X$ for which $\Gamma \in \mathrm{Ports}(C)$.
2.7 Star interconnections and interconnection families
Let $X$ be a finite set of components.
(a) For $I \subseteq \mathrm{Ports}\langle X\rangle$, $\mathrm{Components}(I)$ denotes $\{\mathrm{SrcCpt}(\Gamma) \mid \Gamma \in I\}$. Thus, $\mathrm{Components}(I)$ is just the set of all components which are associated with a port in $I$.
(b) A star-compatible set for $X$ is an $I \subseteq \mathrm{Ports}\langle X\rangle$ with the property that for all $\Gamma, \Gamma' \in I$, $\mathrm{Schema}(\Gamma) = \mathrm{Schema}(\Gamma')$.
(c) A star interconnection over $X$ is a star-compatible set $I$ for $X$ with the property that for distinct $\Gamma, \Gamma' \in I$, $\mathrm{SrcCpt}(\Gamma)$ and $\mathrm{SrcCpt}(\Gamma')$ are distinct as well. In this case, $I$ is called a star interconnection of $\mathrm{Components}(I)$.
(d) An interconnection family for $X$ is a finite set of star interconnections over $X$.

The above definitions are critical to the interconnection model. For a set $\{C_1, C_2, \ldots, C_k\}$ of components to be coupled in a single star configuration using ports $\Gamma_1, \Gamma_2, \ldots, \Gamma_k$, respectively, the schemata of the ports must be identical (and not just isomorphic). This is necessary because in the interconnection, the states of these views must always be the same. The resulting star interconnection, as defined formally below, is illustrated for $k = 4$ in Figure 2. This is somewhat distinct from the notion of a star component, as defined in [24]. Here it is the interconnection, and not the component itself, which has the star property.

It is possible to construct a maximal interconnection family, from which all others can be obtained via appropriate subset operations.
(e) For $\Gamma \in \mathrm{Ports}\langle X\rangle$, define $\mathrm{MaxStar}_X(\Gamma) = \{\Gamma' \in \mathrm{Ports}\langle X\rangle \mid \mathrm{Schema}(\Gamma') = \mathrm{Schema}(\Gamma)\}$, and define $\mathrm{MaxStar}(X) = \{\mathrm{MaxStar}_X(\Gamma) \mid \Gamma \in \mathrm{Ports}\langle X\rangle\}$. Each member of $\mathrm{MaxStar}(X)$ is a maximal star-compatible set for $X$. Thus, $J$ is an interconnection family for $X$ iff $J$ is the union of disjoint subsets of members of $\mathrm{MaxStar}(X)$.
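The definitions of 2.7 amount to grouping ports by their view schema. The following sketch illustrates this on the ports of the running example of 2.12, with the assignment of ports to port schemata read off Table 2. Ports are encoded here as hypothetical pairs (source component, projected attributes), which makes port names globally unique in the sense of 2.6; this encoding and the function names are assumptions made for the illustration.

```python
from collections import defaultdict

# (source component, projected attributes) -> port schema, as listed in Table 2.
PORT_SCHEMA = {
    ("L1", "A1"): "H1", ("L2", "A1"): "H1", ("L3", "A1"): "H1",
    ("L3", "A4A5"): "H2", ("L4", "A4A5"): "H2",
    ("L4", "A7"): "H3", ("L5", "A7"): "H3",
    ("L4", "A8"): "H4", ("L7", "A8"): "H4",
    ("L5", "A0"): "H5", ("L6", "A0"): "H5",
    ("L6", "A9"): "H6", ("L7", "A9"): "H6",
    ("L3", "A4"): "H7", ("L4", "A4"): "H7",
}


def src_cpt(port):
    """SrcCpt: the unique component to which a port belongs."""
    return port[0]


def components(ports):
    """Components(I): all components associated with a port in I."""
    return {src_cpt(p) for p in ports}


def is_star_compatible(ports):
    """All ports in the set share one and the same view schema."""
    return len({PORT_SCHEMA[p] for p in ports}) <= 1


def is_star_interconnection(ports):
    """Star compatible, and distinct ports come from distinct components."""
    return is_star_compatible(ports) and len(components(ports)) == len(ports)


def max_star():
    """MaxStar: the maximal star-compatible sets, one per port schema."""
    groups = defaultdict(set)
    for port, schema in PORT_SCHEMA.items():
        groups[schema].add(port)
    return {frozenset(group) for group in groups.values()}


MAX_STAR = max_star()
for star in sorted(MAX_STAR, key=lambda s: sorted(s)):
    print(sorted(star), "couples", sorted(components(star)))
```

In this example every member of `MAX_STAR` happens to be a star interconnection, because no component has two ports with the same schema. In general a maximal star-compatible set need not satisfy condition (c).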
[Figure 2: The star-compatibility condition and the interconnection of four components. Left: the components $C_1, \ldots, C_4$ with ports $\Gamma_1, \ldots, \Gamma_4$ and port schemata $V_i = \mathrm{Schema}(\Gamma_i)$. Right: the same components coupled at the single schema $V = V_1 = V_2 = V_3 = V_4$.]

2.8 Notational convention
Unless specifically stated to the contrary, for the rest of this paper, take X to be a name-normalized set of components with J an interconnection family for X.
2.9 Annotated examples
In 2.12 and 2.13, examples are presented which illustrate many of the ideas which have been developed thus far, as well as those which will be developed in the rest of this section. Rather than distributing fragments of these examples throughout the text, it seems more appropriate to present them in unified fashion. The reader is therefore encouraged to look ahead to these examples for clarification of the concepts which are developed.
2.10 The compound component defined by an interconnection family
A central idea surrounding the component philosophy is that simple components may be combined to form more complex ones. While this idea is simple in principle, there are nevertheless some complicating details which mandate that not every possible interconnection family is suitable for the formation of a complex component. These issues are now investigated in more detail.
(a) The schema $\mathrm{Schema}\langle X, J\rangle$ defined by $J$ on $X$ is given as follows.

$$\mathrm{LDB}(\mathrm{Schema}\langle X, J\rangle) = \Big\{ \langle M_B\rangle_{B \in X} \in \prod_{B \in X} \mathrm{LDB}(B) \;\Big|\; (\forall I \in J)(\forall \{\Gamma_1, \Gamma_2\} \subseteq I)\big(\mathring{\gamma}_1(M_{\mathrm{SrcCpt}(\Gamma_1)}) = \mathring{\gamma}_2(M_{\mathrm{SrcCpt}(\Gamma_2)})\big) \Big\}$$

(b) For $C \in X$, define the natural projection morphism $\mathrm{NatProj}^{J;X}_{C} : \mathrm{Schema}\langle X, J\rangle \to \mathrm{Schema}(C)$ on elements by $\langle M_B\rangle_{B \in X} \mapsto M_C$.

It is clear that $\mathrm{Schema}\langle X, J\rangle$ is the natural schema for the interconnection of the elements of $X$ into a single component, using $J$ as the interconnection family. There is, however, a complication. For this complex component, it is necessary to identify the ports. The obvious solution is to define one common port for each star interconnection. In light of the definition of (a) above, the view of this common port will not depend upon which of the components from $X$ is used to define it. Unfortunately, the combined ports may no longer be views, because the underlying mapping is no longer surjective. In general, the constraints on the schemata of the constituents which arise by combining components may limit the values
which may appear on the ports. It is therefore essential to identify conditions under which these problems do not occur. The following definitions lay the framework for studying this question in more detail.
(c) The component $C \in X$ is free in $X$ if for any $N \in \mathrm{LDB}(C)$, there is an $\langle M_B\rangle_{B \in X} \in \mathrm{LDB}(\mathrm{Schema}\langle X, J\rangle)$ with the property that $M_C = N$.
(d) Let $C \in X$ and let $\Gamma \in \mathrm{Ports}(C)$. The port $\Gamma$ is said to be free in $X$ with respect to $J$ if for any $N \in \mathrm{LDB}(\Gamma)$, there is an $\langle M_B\rangle_{B \in X} \in \mathrm{LDB}(\mathrm{Schema}\langle X, J\rangle)$ with $\mathring{\gamma}(M_C) = N$.
(e) $X$ is free for ports with respect to $J$ if for every $C \in X$ and $\Gamma \in \mathrm{Ports}(C)$, $\Gamma$ is free in $X$ with respect to $J$.

The condition of the component $C$ being free in $X$ is the stronger of the two; it essentially states that the interconnection does not further constrain $C$. It is very easy to find examples of situations in which constituent components are not free; see 2.12. The condition of a view $\Gamma$ being free in $X$ with respect to $J$ is strictly weaker, since the entire component $C$ need not be unconstrained, but rather only each of its ports must be unconstrained individually. In the example of 2.12, all ports are free with respect to $\mathrm{MaxStar}(L)$. Nevertheless, it is possible to construct relatively simple relational examples in which this condition is not satisfied, as illustrated in 2.13.

The weaker condition of $X$ being free for ports with respect to $J$ is sufficient to admit a consistent definition for the ports of a compound component. By its very nature, it guarantees that the view mapping for the combined port will be surjective; indeed, it is both necessary and sufficient. The formal details are as follows. For parts (f)-(i), assume further that $X$ is free for ports with respect to $J$.
(f) Let $C \in X$ and let $\Gamma \in \mathrm{Ports}(C)$. Define the lifting of $\Gamma$ to $\mathrm{Schema}\langle X, J\rangle$ to be the pair ${}^{[X,J]}\Gamma = (\mathrm{Schema}\langle X, J\rangle, {}^{[X,J]}\gamma)$ with ${}^{[X,J]}\gamma : \mathrm{Schema}\langle X, J\rangle \to \mathrm{Schema}(\Gamma)$ given as the composition

$$\mathrm{Schema}\langle X, J\rangle \xrightarrow{\ \mathrm{NatProj}^{J;X}_{C}\ } \mathrm{Schema}(C) \xrightarrow{\ \gamma\ } \mathrm{Schema}(\Gamma).$$
(g) Let $I \subseteq J$, and let $\Gamma, \Gamma' \in \mathrm{Ports}\langle X\rangle$. Define the equivalence relation $\equiv_{X,J}$ on liftings of such views by ${}^{[X,J]}\Gamma \equiv_{X,J} {}^{[X,J]}\Gamma'$ iff $\Gamma = \Gamma'$ or else there is an $I \in J$ with $\{\Gamma, \Gamma'\} \subseteq I$. Let $[{}^{[X,J]}\Gamma]$ denote the equivalence class of ${}^{[X,J]}\Gamma$ under this equivalence relation.
(h) Define $\mathrm{Ports}\langle X, J\rangle = \{[{}^{[X,J]}\Gamma] \mid \Gamma \in \mathrm{Ports}\langle X\rangle\}$.
(i) The compound component defined by $\langle X, J\rangle$ is given as follows.

$$\mathrm{Cpt}\langle X, J\rangle = (\mathrm{Schema}\langle X, J\rangle, \mathrm{Ports}\langle X, J\rangle)$$

It is worth repeating that the above notion of compound component is well defined only in the case that $X$ is free for ports with respect to $J$.

It should perhaps be noted that it is not always necessary (or desirable) to include all ports from the constituents in the compound component. However, the choice of which ones to include and which ones to exclude cannot be made on a formal level; rather, it must be a modelling decision. For example, consider forming a compound component from $\{L_1, L_2\}$ of 2.12. If one decides to exclude from the compound component those ports which have already been matched, then it would be impossible to connect $L_3$ to the compound of $L_1$ and $L_2$, since the necessary port has been removed. This must be a design decision, not a mathematical one.
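The notions of 2.10(a) and 2.10(c)-(e) can be made concrete when every component has finitely many legal states. The following is a minimal sketch on a hypothetical pair of toy components, invented purely for illustration and not taken from the paper: component `A` holds a pair $(x, y)$ with $x \le y$, component `B` holds a pair $(u, v)$ with $u \ne v$, and each coordinate is exposed as a port. Coupling one pair of ports leaves every port free, while coupling both pairs does not.

```python
from itertools import product

# Hypothetical toy components: finite sets of legal states.
STATES = {
    "A": [(0, 0), (0, 1), (1, 1)],   # pairs (x, y) with x <= y
    "B": [(0, 1), (1, 0)],           # pairs (u, v) with u != v
}

# Port name -> (source component, underlying function of the view mapping).
PORTS = {
    "A.x": ("A", lambda m: m[0]),
    "A.y": ("A", lambda m: m[1]),
    "B.u": ("B", lambda m: m[0]),
    "B.v": ("B", lambda m: m[1]),
}


def port_value(port, assignment):
    """Value seen on a port, given one state per component."""
    comp, fn = PORTS[port]
    return fn(assignment[comp])


def compound_states(family):
    """Legal states of the schema defined by the family: one state per
    component, such that all ports within each star carry the same value."""
    names = sorted(STATES)
    result = []
    for combo in product(*(STATES[n] for n in names)):
        assignment = dict(zip(names, combo))
        if all(len({port_value(p, assignment) for p in star}) <= 1
               for star in family):
            result.append(assignment)
    return result


def component_is_free(comp, family):
    """Every legal state of the component occurs in some compound state."""
    return {a[comp] for a in compound_states(family)} == set(STATES[comp])


def port_is_free(port, family):
    """Every state of the port view occurs in some compound state."""
    comp, fn = PORTS[port]
    reachable = {port_value(port, a) for a in compound_states(family)}
    return reachable == {fn(m) for m in STATES[comp]}


def free_for_ports(family):
    return all(port_is_free(p, family) for p in PORTS)


ONE_STAR = [{"A.x", "B.v"}]
TWO_STARS = [{"A.x", "B.v"}, {"A.y", "B.u"}]

print(free_for_ports(ONE_STAR))    # every port value remains attainable
print(free_for_ports(TWO_STARS))   # the second coupling restricts the ports
print(compound_states(TWO_STARS))  # a single compound state survives
```

With both stars in place, the constraints $x \le y$ and $u \ne v$ combine with $x = v$ and $y = u$ to force $x < y$, so port `A.x` can only show the value 0. This is the same kind of interaction between local constraints and port matching that 2.13 exhibits for relational components.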
2.11 Subcomponents of compound components
Just as simple components may be combined to construct more complex ones, so too may simpler components be extracted from complex ones. For this extraction process to yield well-defined subcomponents, certain conditions must be met, which are now explored in more detail. In the following, let $Y \subseteq X$.
(a) For $J$ an interconnection family for $X$, the relativization of $J$ to $Y$ is $\mathrm{Relt}\langle J, Y\rangle = \{I \cap (\bigcup\{\mathrm{Ports}(C) \mid C \in Y\}) \mid I \in J\}$. Thus, $\mathrm{Relt}\langle J, Y\rangle$ is obtained from $J$ by removing all ports which are not associated with components in $Y$.
(b) The relative schema $\mathrm{Schema}\langle Y, J\rangle$ defined by $J$ on $Y$ is given as follows.

$$\mathrm{LDB}(\mathrm{Schema}\langle Y, J\rangle) = \Big\{ \langle M_B\rangle_{B \in Y} \in \prod_{B \in Y} \mathrm{LDB}(B) \;\Big|\; (\forall I \in \mathrm{Relt}\langle J, Y\rangle)(\forall \{\Gamma_1, \Gamma_2\} \subseteq I)\big(\mathring{\gamma}_1(M_{\mathrm{SrcCpt}(\Gamma_1)}) = \mathring{\gamma}_2(M_{\mathrm{SrcCpt}(\Gamma_2)})\big) \Big\}$$

Thus, $\mathrm{Schema}\langle Y, J\rangle = \mathrm{Schema}\langle Y, \mathrm{Relt}\langle J, Y\rangle\rangle$. In other words, the relative schema $\mathrm{Schema}\langle Y, J\rangle$ is precisely the schema of the compound component $\mathrm{Cpt}\langle Y, \mathrm{Relt}\langle J, Y\rangle\rangle$. On the other hand, one can also consider the schema obtained by projecting from $\mathrm{Schema}\langle X, J\rangle$ the constituent schemata which arise from components in $Y$.
(c) The projected schema $\mathrm{ProjSch}^{X}\langle Y, J\rangle$ is defined as follows.

$$\mathrm{LDB}(\mathrm{ProjSch}^{X}\langle Y, J\rangle) = \Big\{ \langle M_B\rangle_{B \in Y} \in \prod_{B \in Y} \mathrm{LDB}(B) \;\Big|\; (\exists \langle N_B\rangle_{B \in X} \in \mathrm{LDB}(\mathrm{Schema}\langle X, J\rangle))(\forall B \in Y)(N_B = M_B) \Big\}$$

(d) Call $Y$ closed in $X$ with respect to $J$ if $\mathrm{ProjSch}^{X}\langle Y, J\rangle = \mathrm{Schema}\langle Y, J\rangle$.

The above closure condition is very important, because it states that $\mathrm{Cpt}\langle Y, \mathrm{Relt}\langle J, Y\rangle\rangle$ is embedded in $\mathrm{Cpt}\langle X, J\rangle$ without the latter imposing any additional constraints. Clearly $\mathrm{LDB}(\mathrm{ProjSch}^{X}\langle Y, J\rangle) \subseteq \mathrm{LDB}(\mathrm{Schema}\langle Y, J\rangle)$; the reverse inclusion $\mathrm{LDB}(\mathrm{Schema}\langle Y, J\rangle) \subseteq \mathrm{LDB}(\mathrm{ProjSch}^{X}\langle Y, J\rangle)$ holds precisely when $Y$ is closed in $X$ with respect to $J$. The definition of a subcomponent now proceeds similarly to that of a compound component, as given in 2.10(i).
(f) Under the condition that $Y$ is closed in $X$ with respect to $J$ and that $Y$ is free for ports with respect to $\mathrm{Relt}\langle J, Y\rangle$, define the subcomponent of $\langle X, J\rangle$ generated by $Y$ as follows.

$$\mathrm{SubCpt}^{X,J}\langle Y\rangle = (\mathrm{ProjSch}^{X}\langle Y, J\rangle, \mathrm{Ports}\langle Y, \mathrm{Relt}\langle J, Y\rangle\rangle)$$

An illustration of why the closure condition is essential for this definition is given at the end of 2.12.

As noted at the end of 2.10, the choice of which ports to include and which to exclude in a compound component is a design decision. This is equally true for subcomponents. However, in the latter case, there is a classification which is useful: to partition the ports of $\mathrm{SubCpt}^{X,J}\langle Y\rangle$ into those which connect it to other parts of $X$ and those which do not. The formalization is as follows. Again, for this definition to make sense, it must be assumed that $Y$ is free for ports with respect to $\mathrm{Relt}\langle J, Y\rangle$.
(g) Define the set of all ports, the external ports, and the internal ports of $Y$ with respect to $J$, respectively, as follows.

$$\mathrm{AllPorts}\langle Y, J\rangle = \{[{}^{[Y,\mathrm{Relt}\langle J,Y\rangle]}\Gamma] \mid \Gamma \in \mathrm{Ports}\langle Y\rangle\}$$
$$\mathrm{ExtPorts}\langle Y, J\rangle = \{[{}^{[Y,\mathrm{Relt}\langle J,Y\rangle]}\Gamma] \mid (\exists I \in J)(\exists \Gamma')((\{\Gamma, \Gamma'\} \subseteq I) \wedge (\Gamma \in \mathrm{Ports}\langle Y\rangle) \wedge (\Gamma' \notin \mathrm{Ports}\langle Y\rangle))\}$$
$$\mathrm{IntPorts}\langle Y, J\rangle = \mathrm{AllPorts}\langle Y, J\rangle \setminus \mathrm{ExtPorts}\langle Y, J\rangle$$

Finally, there is a natural projection morphism, whose underlying function is guaranteed to be surjective, defined as follows.
(h) For $Z \subseteq Y$, define the natural projection morphism $\mathrm{NatProj}^{X;J;Y}_{Z} : \mathrm{ProjSch}^{X}\langle Y, J\rangle \to \mathrm{ProjSch}^{X}\langle Z, J\rangle$ on elements by $\langle M_B\rangle_{B \in Y} \mapsto \langle M_B\rangle_{B \in Z}$. Define the natural projection view of $\mathrm{ProjSch}^{X}\langle Y, J\rangle$ to $Z$ to be $\mathrm{ProjView}^{J;Y}_{Z} = (\mathrm{ProjSch}^{X}\langle Z, J\rangle, \mathrm{NatProj}^{X;J;Y}_{Z})$.

2.12 Example: An illustrative set of components
The purpose of this example is to provide a setting in which many of the concepts which are introduced in this paper may be illustrated. It is not intended to model a “real” database situation, but rather to illustrate a wide variety of possibilities. All components are based upon the relational model, and all ports are defined via projections, although neither of these limitations is inherent to the model. By using the familiar relational model, the key ideas can be illustrated in a relatively compact fashion, and certain modelling pitfalls can be highlighted.

Table 1 summarizes the key information for each atomic component. For the port names, the superscript identifies the component, while the subscript identifies the attributes which are projected. Since each attribute name is used at most once in each component, this convention is unambiguous. For simplicity, it will be assumed that with each attribute $A_i$ is associated a countably infinite domain $\mathrm{dom}(A_i)$, while the legal relations themselves must be finite. Table 2 summarizes information about the ports, grouped by those which have identical underlying schemata. This is a natural grouping, since ports with identical schemata are precisely those which may be coupled to one another.
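Table 1: The atomic components of the running example

| Comp. name | Schema name | Relations | Schema constraints | Ports |
|---|---|---|---|---|
| $L_1$ | $F_1$ | $R_1[A_1A_2]$ | | $\Pi^{F_1}_{A_1}$ |
| $L_2$ | $F_2$ | $R_2[A_1A_3]$ | | $\Pi^{F_2}_{A_1}$ |
| $L_3$ | $F_3$ | $R_3[A_1A_4A_5]$ | $A_4 \to A_5$ | $\Pi^{F_3}_{A_1}$, $\Pi^{F_3}_{A_4A_5}$ |
| $L_4$ | $F_4$ | $R_{4a}[A_4A_5A_6]$, $R_{4b}[A_7A_8]$ | $A_4 \to A_5$, $A_7 \to A_8$ | $\Pi^{F_4}_{A_4A_5}$, $\Pi^{F_4}_{A_7}$, $\Pi^{F_4}_{A_8}$ |
| $L_5$ | $F_5$ | $R_5[A_7A_0]$ | $A_0 \to A_7$ | $\Pi^{F_5}_{A_7}$, $\Pi^{F_5}_{A_0}$ |
| $L_6$ | $F_6$ | $R_6[A_9A_0]$ | $A_9 \to A_0$ | $\Pi^{F_6}_{A_9}$, $\Pi^{F_6}_{A_0}$ |
| $L_7$ | $F_7$ | $R_7[A_8A_9]$ | $A_8 \to A_9$ | $\Pi^{F_7}_{A_8}$, $\Pi^{F_7}_{A_9}$ |

Table 2: The port schemata of the running example

| Schema name | Schema relations | Schema constraints | Associated ports | View mappings |
|---|---|---|---|---|
| $H_1$ | $T_1[A_1]$ | | $\Pi^{F_1}_{A_1}$, $\Pi^{F_2}_{A_1}$, $\Pi^{F_3}_{A_1}$ | $\pi^{F_1}_{A_1} : F_1 \to H_1$, $\pi^{F_2}_{A_1} : F_2 \to H_1$, $\pi^{F_3}_{A_1} : F_3 \to H_1$ |
| $H_2$ | $T_2[A_4A_5]$ | $A_4 \to A_5$ | $\Pi^{F_3}_{A_4A_5}$, $\Pi^{F_4}_{A_4A_5}$ | $\pi^{F_3}_{A_4A_5} : F_3 \to H_2$, $\pi^{F_4}_{A_4A_5} : F_4 \to H_2$ |
| $H_3$ | $T_3[A_7]$ | | $\Pi^{F_4}_{A_7}$, $\Pi^{F_5}_{A_7}$ | $\pi^{F_4}_{A_7} : F_4 \to H_3$, $\pi^{F_5}_{A_7} : F_5 \to H_3$ |
| $H_4$ | $T_4[A_8]$ | | $\Pi^{F_4}_{A_8}$, $\Pi^{F_7}_{A_8}$ | $\pi^{F_4}_{A_8} : F_4 \to H_4$, $\pi^{F_7}_{A_8} : F_7 \to H_4$ |
| $H_5$ | $T_5[A_0]$ | | $\Pi^{F_5}_{A_0}$, $\Pi^{F_6}_{A_0}$ | $\pi^{F_5}_{A_0} : F_5 \to H_5$, $\pi^{F_6}_{A_0} : F_6 \to H_5$ |
| $H_6$ | $T_6[A_9]$ | | $\Pi^{F_6}_{A_9}$, $\Pi^{F_7}_{A_9}$ | $\pi^{F_6}_{A_9} : F_6 \to H_6$, $\pi^{F_7}_{A_9} : F_7 \to H_6$ |
| $H_7$ | $T_7[A_4]$ | | $\Pi^{F_3}_{A_4}$, $\Pi^{F_4}_{A_4}$ | $\pi^{F_3}_{A_4} : F_3 \to H_7$, $\pi^{F_4}_{A_4} : F_4 \to H_7$ |

Figure 3 shows all possible (star) interconnections of these components. That is, it connects all ports with identical underlying schemata. The components are shown as rectangles, while the port schemata are displayed as circles. Letting $L = \{L_1, L_2, L_3, L_4, L_5, L_6, L_7\}$, the associated interconnection family is

$$\mathrm{MaxStar}(L) = \{\{\Pi^{F_1}_{A_1}, \Pi^{F_2}_{A_1}, \Pi^{F_3}_{A_1}\},\ \{\Pi^{F_3}_{A_4A_5}, \Pi^{F_4}_{A_4A_5}\},\ \{\Pi^{F_4}_{A_7}, \Pi^{F_5}_{A_7}\},\ \{\Pi^{F_4}_{A_8}, \Pi^{F_7}_{A_8}\},\ \{\Pi^{F_5}_{A_0}, \Pi^{F_6}_{A_0}\},\ \{\Pi^{F_6}_{A_9}, \Pi^{F_7}_{A_9}\},\ \{\Pi^{F_3}_{A_4}, \Pi^{F_4}_{A_4}\}\}$$

Of course, there is no requirement that when a set of components is interconnected, all possible connections must be included. Any subset of a member of $\mathrm{MaxStar}(L)$ is a valid star interconnection. Thus, any set consisting of disjoint subsets of members of $\mathrm{MaxStar}(L)$ is a valid interconnection family over $L$.

Observe that the components $L_4$, $L_5$, $L_6$, and $L_7$ are not free in $L$ with respect to $\mathrm{MaxStar}(L)$. Indeed, the interconnection forces the additional constraints $A_8 \to A_7$, $A_7 \to A_0$, $A_0 \to A_9$, and $A_9 \to A_8$ on $L_4$, $L_5$, $L_6$, and $L_7$, respectively. On the other hand, $L$ is free for ports with respect to $\mathrm{MaxStar}(L)$.

Finally, an illustration of the need for the closure condition in the definition of subcomponent is given. Let $X_{567} = \{L_5, L_6, L_7\}$, $J_{567} = \{\{\Pi^{F_5}_{A_0}, \Pi^{F_6}_{A_0}\}, \{\Pi^{F_6}_{A_9}, \Pi^{F_7}_{A_9}\}\}$, and $Y_{57} = \{L_5, L_7\}$. Then $\mathrm{Relt}\langle J_{567}, Y_{57}\rangle = \{\{\Pi^{F_5}_{A_0}\}, \{\Pi^{F_7}_{A_9}\}\}$. Operationally, $\mathrm{Relt}\langle J_{567}, Y_{57}\rangle$ is equivalent to $\emptyset$; that is, it imposes no constraints at all. This implies that the subcomponent $\mathrm{SubCpt}^{X_{567},J_{567}}\langle Y_{57}\rangle$, with schema $\mathrm{ProjSch}^{X_{567}}\langle Y_{57}, J_{567}\rangle$, is not well defined. To illustrate this directly, let $M_{L_5} = \{(a_7, a_0)\} \in \mathrm{LDB}(F_5)$ and $M_{L_7} = \{(a_8, a_9), (a_8', a_9')\} \in \mathrm{LDB}(F_7)$ with $a_8 \neq a_8'$. In view of the FD $A_9 \to A_0$ on $F_6$, there can be no $M_{L_6} \in \mathrm{LDB}(F_6)$ such that $(M_{L_5}, M_{L_6}, M_{L_7}) \in \mathrm{LDB}(\mathrm{ProjSch}^{X_{567}}\langle Y_{57}, J_{567}\rangle)$, since the fact that $M_{L_5}$ contains only one tuple implies that $M_{L_7}$ can consist of only one tuple as well.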
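The closure condition of 2.11(d), whose role is illustrated at the end of 2.12, can be explored mechanically on finite state sets. The following sketch uses a hypothetical three-component chain, invented for illustration and much simpler than the relational components above: `A` and `C` each expose a single value, and the middle component `B` forces its two ports to carry equal values. The function names mirror the relativization, relative schema, and projected schema of 2.11(a)-(c) but are otherwise assumptions.

```python
from itertools import product

# Hypothetical toy chain A - B - C with finite sets of legal states.
STATES = {
    "A": [0, 1],
    "B": [(0, 0), (1, 1)],   # B forces its two ports to carry equal values
    "C": [0, 1],
}

# Port name -> (source component, underlying function of the view mapping).
PORTS = {
    "A.p": ("A", lambda m: m),
    "B.l": ("B", lambda m: m[0]),
    "B.r": ("B", lambda m: m[1]),
    "C.p": ("C", lambda m: m),
}

J = [frozenset({"A.p", "B.l"}), frozenset({"B.r", "C.p"})]


def relativize(family, subset):
    """Remove from every star the ports of components outside subset."""
    keep = {p for p, (comp, _) in PORTS.items() if comp in subset}
    return [star & keep for star in family]


def schema_states(subset, family):
    """Tuples of states (ordered by component name) on which all ports
    within each star agree."""
    names = sorted(subset)
    states = set()
    for combo in product(*(STATES[n] for n in names)):
        m = dict(zip(names, combo))
        if all(len({PORTS[p][1](m[PORTS[p][0]]) for p in star}) <= 1
               for star in family):
            states.add(combo)
    return states


def relative_schema(subset, family):
    """Relative schema: constrain subset by the relativized family only."""
    return schema_states(subset, relativize(family, subset))


def projected_schema(subset, family):
    """Projected schema: restrict the states of the full compound to subset."""
    all_names = sorted(STATES)
    idx = [all_names.index(n) for n in sorted(subset)]
    return {tuple(full[i] for i in idx)
            for full in schema_states(STATES.keys(), family)}


def is_closed(subset, family):
    return projected_schema(subset, family) == relative_schema(subset, family)


print(is_closed({"A", "B"}, J))  # the pair {A, B} keeps its only coupling
print(is_closed({"A", "C"}, J))  # dropping B loses the constraint it carried
```

For the subset `{"A", "C"}` the relativized family imposes nothing, so the relative schema admits all four pairs, whereas the projected schema admits only the two pairs with equal values. The omitted middle component carries a constraint that the relativization cannot see, which is the situation that the closure condition rules out.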
[Figure 3: Graphical depiction of all possible interconnections. The components $L_1, \ldots, L_7$ are drawn as rectangles labelled with their relations, and the port schemata $H_1, \ldots, H_7$ as circles joining the components whose ports they couple.]

2.13 Example: A pair of components whose interconnection is not free for ports
It is useful to show how a simple interconnection of relational components can violate the condition 2.10(e) of being free for ports. To this end, let $A_1$ and $A_2$ be attributes with the same countably infinite domain; $\mathrm{dom}(A_1) = \mathrm{dom}(A_2)$. There are three components over these domains, as identified in Table 3. Each component has two ports, one for the projection of its relation on $A_1$, and a second for $A_2$. For the ports $\Pi^{F_\alpha}_{A_1}$, $\Pi^{F_\beta}_{A_1}$, and $\Pi^{F_\delta}_{A_1}$, let the port schema have the single relation symbol $T_{A_1}[A_1]$, and for the ports $\Pi^{F_\alpha}_{A_2}$, $\Pi^{F_\beta}_{A_2}$, and $\Pi^{F_\delta}_{A_2}$, let the port schema have the single relation symbol $T_{A_2}[A_2]$. There are no constraints associated with these port schemata, other than the domain constraints. However, if $L_\alpha$ and $L_\beta$ are interconnected via $J_{\alpha\beta} = \{\{\Pi^{F_\alpha}_{A_1}, \Pi^{F_\beta}_{A_1}\}, \{\Pi^{F_\alpha}_{A_2}, \Pi^{F_\beta}_{A_2}\}\}$, then it is easy to see that $\mathrm{LDB}(\mathrm{Schema}\langle\{L_\alpha, L_\beta\}, J_{\alpha\beta}\rangle) = \emptyset$. Similarly, letting $J_{\alpha\delta} = \{\{\Pi^{F_\alpha}_{A_1}, \Pi^{F_\delta}_{A_1}\}, \{\Pi^{F_\alpha}_{A_2}, \Pi^{F_\delta}_{A_2}\}\}$, it follows that $\mathrm{LDB}(\mathrm{Schema}\langle\{L_\alpha, L_\delta\}, J_{\alpha\delta}\rangle) = \{\emptyset\}$. Thus, $\{L_\alpha, L_\beta, L_\delta\}$ is not free for ports with respect to either $J_{\alpha\beta}$ or $J_{\alpha\delta}$.

Table 3: The atomic components for 2.13

| Comp. name | Schema name | Relations | Schema constraints | Ports |
|---|---|---|---|---|
| $L_\alpha$ | $F_\alpha$ | $R_\alpha[A_1A_2]$ | $R_\alpha[A_1] \subseteq R_\alpha[A_2]$ | $\Pi^{F_\alpha}_{A_1}$, $\Pi^{F_\alpha}_{A_2}$ |
| $L_\beta$ | $F_\beta$ | $R_\beta[A_1A_2]$ | $R_\beta[A_1] \subseteq R_\beta[A_2]$ | $\Pi^{F_\beta}_{A_1}$, $\Pi^{F_\beta}_{A_2}$ |
| $L_\delta$ | $F_\delta$ | $R_\delta[A_1A_2]$ | $R_\delta[A_1] \cap R_\delta[A_2] = \emptyset$ | $\Pi^{F_\delta}_{A_1}$, $\Pi^{F_\delta}_{A_2}$ |
3. Acyclic Interconnections of Components
As observed in 2.10 and 2.11, the property of a set X of components being free for ports with respect to an interconnection family J is critical for the definition of both compound components and subcomponents. Therefore, it is crucial to identify conditions under which this property holds. Fortunately, by requiring the acyclicity of a hypergraph derived from the interconnection, it is possible to guarantee the even stronger condition that each constituent component is free in X.
The reader is perhaps familiar with the use of hypergraphs in characterizing the structure of relational decompositions [9]. However, there are very substantial differences between the hypergraph of a compound component, as defined here, and the hypergraph of a relational decomposition. The use of hypergraphs in this paper is closer to that found in more general characterizations of desirable schemata, as studied in [11]. In any case, it seems appropriate to give a self-contained presentation of the ideas.
3.1 Hypergraphs and acyclicity
To begin, a very brief summary of the key notions from the theory of hypergraphs is given. The standard reference on this subject is the monograph of Berge [4], to which the reader is referred for details. A hypergraph is a pair $G = (V, H)$ in which $V$ is a finite set of vertices and $H \subseteq \mathcal{P}(V)$ (the set of all subsets of $V$), with each $h \in H$ containing at least two distinct elements.¹ The members of $H$ are called hyperedges. A path from $v_1$ to $v_n$ in $G$ is a sequence $v_1, h_1, v_2, h_2, \ldots, v_{n-1}, h_{n-1}, v_n$ in which the following conditions hold:
(i) $v_i \in V$ for $1 \le i \le n$ with $\{v_i \mid 1 \le i \le n-1\}$ all distinct, and $\{v_i \mid 2 \le i \le n\}$ all distinct. It may be the case that $v_1 = v_n$, but this is not necessary.
(ii) $h_i \in H$ for $1 \le i \le n-1$ with $\{h_i \mid 1 \le i \le n-1\}$ all distinct.
(iii) $\{v_i, v_{i+1}\} \subseteq h_i$ for $1 \le i \le n-1$.
The number $n-1$ is called the length of the path. A (Berge)² cycle in $G$ is a path of length at least two from a vertex $v$ to itself. $G$ is called (Berge) acyclic if it does not contain any (Berge) cycles.

For $V' \subseteq V$, the full subhypergraph of $G$ generated by $V'$ is $\mathrm{SubHGraph}\langle G, V'\rangle = (V', \{h \cap V' \mid h \in H\})$. $V' \subseteq V$ is closed in $G$ if whenever $v_1, v_n \in V'$ and $v_1, h_1, v_2, h_2, \ldots, v_{n-1}, h_{n-1}, v_n$ is a path from $v_1$ to $v_n$ in $G$, then $v_1, h_1 \cap V', v_2, h_2 \cap V', \ldots, v_{n-1}, h_{n-1} \cap V', v_n$ is a path from $v_1$ to $v_n$ in $\mathrm{SubHGraph}\langle G, V'\rangle$.
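Berge acyclicity as defined in 3.1 is easy to test mechanically. The sketch below relies on the standard observation that a hypergraph has a Berge cycle exactly when its incidence graph (vertices on one side, hyperedges on the other, with an edge for each membership) contains a cycle, and detects such a cycle with a union-find structure. The function name `is_berge_acyclic` is a hypothetical choice, and the sample hyperedges are the sets of components coupled by the members of MaxStar(L) in 2.12.

```python
def is_berge_acyclic(hyperedges):
    """Return True iff the incidence graph of the hypergraph is a forest."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Hyperedges form a set, so repeated hyperedges are collapsed first.
    for index, edge in enumerate(set(map(frozenset, hyperedges))):
        edge_node = ("edge", index)
        for vertex in edge:
            a, b = find(("vertex", vertex)), find(edge_node)
            if a == b:
                # This membership closes a cycle in the incidence graph.
                return False
            parent[a] = b
    return True


# Component sets coupled by the seven members of MaxStar(L) in 2.12.
MAXSTAR_EDGES = [
    {"L1", "L2", "L3"},   # ports over H1
    {"L3", "L4"},         # ports over H2
    {"L4", "L5"},         # ports over H3
    {"L4", "L7"},         # ports over H4
    {"L5", "L6"},         # ports over H5
    {"L6", "L7"},         # ports over H6
    {"L3", "L4"},         # ports over H7 (same component set as H2)
]

# The same family with the L4-L7 coupling removed.
OPENED = [e for e in MAXSTAR_EDGES if e != {"L4", "L7"}]

print(is_berge_acyclic(MAXSTAR_EDGES))  # L4, L5, L6, L7 lie on a cycle
print(is_berge_acyclic(OPENED))         # removing one coupling opens the cycle
```

Two hyperedges that share two or more vertices already form a Berge cycle of length two, so the check is stricter than the acyclicity notions commonly used for relational schemata.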
3.2 The hypergraph of an interconnection family
Interconnection hypergraphs are defined as follows.
(a) The interconnection hypergraph of $X$ defined by $J$, denoted $\mathrm{IntGraph}_X(J)$, has $X$ as its set of vertices and $\{\mathrm{Components}(I) \mid I \in J\}$ as its hyperedges. The interconnection hypergraph for $\mathrm{MaxStar}(L)$ of 2.12 is shown in Figure 4. Each hyperedge is represented as an ellipse, with its members the component names which it encircles.

[Figure 4: The interconnection hypergraph of $\mathrm{MaxStar}(L)$ of 2.12, drawn over the vertices $L_1, \ldots, L_7$.]

(b) For $Y \subseteq X$, the interconnection hypergraph of $Y$ defined by $J$, denoted $\mathrm{IntGraph}_Y(J)$, is the full subhypergraph of $\mathrm{IntGraph}_X(J)$ generated by $Y$.

The key result is the following.
1 In [4, Ch. 17, §1], hyperedges (arêtes) are allowed to have only one member, implying that a hyperedge may connect a vertex to itself. Such edges are not allowed in the formalism presented here. 2 In [4], such entities are called simply cycles, but in other contexts, such as that of [9], many different types of cycles for hypergraphs are investigated, so the qualifier “Berge” is appended.
S.J. Hegner / A Model of Database Components and Their Interconnection
3.3 Proposition
Assume that IntGraphX(J) is acyclic.
(a) Every C ∈ X, and hence every Γ ∈ PortsX, is free in X with respect to J.
(b) If Y ⊆ X is a closed set of vertices of IntGraphX(J), then Y is closed in X with respect to J.
PROOF OUTLINE: The idea is very simple. Assume that IntGraphX(J) is acyclic, and choose any C ∈ X and M_C ∈ LDB(C). For each I ∈ J, Γ ∈ I ∩ Ports(C), and Γ′ ∈ I with Γ′ ≠ Γ, choose any M_{SrcCpt(Γ′)} ∈ LDB(SrcCpt(Γ′)) with γ˚(M_C) = γ′˚(M_{SrcCpt(Γ′)}). Since IntGraphX(J) is acyclic, it is guaranteed that this construction will not result in any conflicts of other port matchings. Now choose another C′ ∈ X from those which were included in the previous step, and repeat the process. The formal proof proceeds by induction. This establishes (a). Part (b) is almost identical, except that the starting point is a member of LDB(SchemaY,J) instead of a member of LDB(C). □
Thus, whenever IntGraphX(J) is acyclic, the notion of compound component (2.10(i)) is well defined. Furthermore, for all Y ⊆ X with the property that Y is a closed set of vertices in IntGraphX(J), the subcomponent SubCptX,J Y, as given in 2.11(f), is well defined as well.
3.4 Attribute hypergraphs
Since Berge acyclicity is not viewed as the appropriate notion for the characterization of good schema construction in the relational context [9], it is worthwhile to present a short example to show how the notion of hypergraph for a relational schema differs from that of component interconnection. Consider just the components L3 and L4 of 2.12. The attribute hypergraph for this pair, as well as the component hypergraph, are shown in Figure 5 to the right. It is easy to see that the attribute hypergraph is cyclic. Indeed, (A4, {A1, A4, A5}, A5, {A4, A5, A6, A7, A8}, A4) is a path from A4 to itself. However, no such corresponding path occurs in the component hypergraph. The key distinction is that in the relational representation, as studied in [9], each attribute is a vertex of the hypergraph, while in the model of this paper, each component is a single vertex, regardless of how many attributes the relations of its schemata may have. In the relational representation, any port view with more than one attribute will result in a Berge-cyclic hypergraph. Thus, the fundamental properties of the underlying hypergraphs in the two representations can be completely different.
Figure 5: The attribute hypergraph (above) and interconnection hypergraph (below) for L3 and L4 of 2.12
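The cycle claimed for the attribute hypergraph can be checked directly against conditions (i) to (iii) of 3.1. The sketch below is illustrative only: the function names `is_path` and `is_berge_cycle` and the list encoding of a path (alternating vertices and hyperedges) are assumptions made here, not notation from the paper.

```python
def is_path(seq, hyperedges):
    """Check conditions (i)-(iii) of 3.1 for seq = [v1, h1, v2, ..., h_{n-1}, v_n]."""
    if len(seq) < 3 or len(seq) % 2 == 0:
        return False
    vertices = seq[0::2]
    edges = [frozenset(h) for h in seq[1::2]]
    all_edges = {frozenset(h) for h in hyperedges}
    # (i) v1..v_{n-1} pairwise distinct, and v2..v_n pairwise distinct.
    if len(set(vertices[:-1])) != len(vertices) - 1:
        return False
    if len(set(vertices[1:])) != len(vertices) - 1:
        return False
    # (ii) every h_i is a hyperedge of the graph, and the h_i are pairwise distinct.
    if any(h not in all_edges for h in edges) or len(set(edges)) != len(edges):
        return False
    # (iii) consecutive vertices both lie in the hyperedge between them.
    return all({vertices[i], vertices[i + 1]} <= edges[i] for i in range(len(edges)))


def is_berge_cycle(seq, hyperedges):
    """A Berge cycle is a path of length at least two from a vertex to itself."""
    return is_path(seq, hyperedges) and len(seq) // 2 >= 2 and seq[0] == seq[-1]


h1 = {"A1", "A4", "A5"}
h2 = {"A4", "A5", "A6", "A7", "A8"}
attribute_cycle = ["A4", h1, "A5", h2, "A4"]
print(is_berge_cycle(attribute_cycle, [h1, h2]))
```

In the component hypergraph the only hyperedge is {L3, L4}, so any attempt to return to the starting vertex would reuse that hyperedge and violate condition (ii).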
4. Updates to components
The initial motivation for developing the ideas reported here was to study how updates to components propagate throughout the interconnection, and in particular to look for canonical ways to extend an update on a subcomponent to the entire network. While a complete presentation of these ideas must be deferred to a separate article, it is nevertheless instructive to illustrate the key ideas.
4.1 Liftings and feasible environments
Let Z ⊆ Y ⊆ X, and let (M1, M2) be an update on ProjSchY, J.
(a) (M1, M2) is internal to SubCptX,J Z if γ˚(M1) = γ˚(M2) for all Γ ∈ ExtPortsZ,J. Thus, if (M1, M2) is internal, the update can be made without involving any components not in Z. If the desired update is not internal, then it must be lifted to a larger subcomponent.
(b) An update (M1′, M2′) on SubCptX,J Y is called an internal lifting of (M1, M2) to SubCptX,J Y if (M1′, M2′) is internal to SubCptX,J Y and (NatProjX;J;Y Z)˚(Mi′) = Mi for i ∈ {1, 2}.
(c) Y is called a uniformly feasible environment for (M1, M2) relative to J if for every M1′ ∈ LDB(SubCptX,J Y) with (NatProjX;J;Y Z)˚(M1′) = M1, there is an M2′ ∈ LDB(SubCptX,J Y) with the property that (M1′, M2′) is an internal lifting of (M1, M2) to SubCptX,J Y.
Uniform feasibility is crucial because it says that the update (M1, M2) on SubCptX,J Z has an internal lifting to SubCptX,J Y regardless of the actual state of that subcomponent.
4.2 Order and canonical updates
There is one additional issue regarding the lifting of updates which is not captured in the formalism of 4.1. In general, there are many possible liftings of an update (M1, M2) from SubCptX,J Z to SubCptX,J Y, even in the case that Y is a least uniformly feasible environment for it. The further goal is to find the canonical such lifting, characterized as the one which adds the least amount of additional information to the database. The problem of capturing such minimality has received significant attention in the context of view updates, particularly in the context of logic programming [19]; however, here the databases do not generally consist of sets of clauses, and so a different approach is necessary. For the component context developed here, the approach which is currently being developed is to regard the best liftings as those defined by free updates.
The details will be reported in a separate article, but the following example provides a glimpse of the main ideas.
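As a small aside on 4.1(a) before that example, the test for an internal update can be sketched in Python. The encoding is an assumption made purely for illustration: a database state is a dictionary from relation names to sets of tuples, a port view is a projection onto a list of attributes, and `None` stands for the null marker n. The data is the relation R3[A1 A2] of K3 from the example of 4.3, whose only port projects onto A2.

```python
N = None  # stands for the null marker n of the paper

def project(rows, columns, attrs):
    """Project a set of tuples onto the listed attributes."""
    positions = [columns.index(a) for a in attrs]
    return {tuple(row[p] for p in positions) for row in rows}

def is_internal(before, after, ext_ports, schema):
    """An update (before, after) is internal if every external port view is unchanged.

    ext_ports is a list of (relation name, attribute list) pairs (hypothetical encoding).
    """
    return all(
        project(before[rel], schema[rel], attrs) == project(after[rel], schema[rel], attrs)
        for rel, attrs in ext_ports
    )

SCHEMA = {"R3": ["A1", "A2"]}
EXT_PORTS = [("R3", ["A2"])]
m1 = {"R3": {("a11", "a21"), ("a12", "a22"), ("a13", "a23"), ("a14", "a23"), (N, N)}}

# Inserting (a15, a25) changes the A2 view, so the update is not internal to K3.
m2_new_value = {"R3": m1["R3"] | {("a15", "a25")}}
# Inserting (a15, a23) leaves the A2 view unchanged, so the update is internal.
m2_old_value = {"R3": m1["R3"] | {("a15", "a23")}}

print(is_internal(m1, m2_new_value, EXT_PORTS, SCHEMA))
print(is_internal(m1, m2_old_value, EXT_PORTS, SCHEMA))
```

The first update is the one that must be lifted to a larger subcomponent in 4.3; the second can be carried out inside K3 alone.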
4.3 Example: Canonical liftings of updates to components
In this example, there is a total of eight components, including the two which were introduced in 2.1. The overall format is similar to that employed in 2.12; therefore, only significant differences will be elaborated. Information about the atomic components and the ports is given in Tables 4 and 5, respectively. The interconnection family I1 which will be used is shown below; Figure 6 illustrates this family in graphical form.
I1 = {{Π^{E1}_{B2}, Π^{E2}_{B2}}, {Π^{E3}_{A2}, Π^{E4}_{A2}}, {Π^{E4}_{A4A5}, Π^{E5}_{A4A5}}, {Π^{E5}_{A6}, Π^{E6}_{A6}, Π^{E7}_{A6}}, {Π^{E6}_{A7}, Π^{E8}_{A7}}}
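To connect this family with 3.2 and 3.3, the following Python sketch derives the hyperedges Components(I) for each member of I1 and checks Berge acyclicity by repeatedly stripping leaves from the incidence graph. The encoding of a port as a (schema, attributes) pair, the mapping `SRC_CPT`, and the function names are assumptions introduced for illustration; the pairing of schemata Ei with components Ki follows Table 4.

```python
from collections import Counter

# Source component of each schema, as in Table 4 (E1 belongs to K1, and so on).
SRC_CPT = {f"E{i}": f"K{i}" for i in range(1, 9)}

# The interconnection family I1; each port is encoded as (schema, attributes).
I1 = [
    [("E1", "B2"), ("E2", "B2")],
    [("E3", "A2"), ("E4", "A2")],
    [("E4", "A4A5"), ("E5", "A4A5")],
    [("E5", "A6"), ("E6", "A6"), ("E7", "A6")],
    [("E6", "A7"), ("E8", "A7")],
]

def components(star):
    """Components(I): the set of components owning the ports of I."""
    return frozenset(SRC_CPT[schema] for schema, _ in star)

def int_graph(family):
    """Hyperedges of the interconnection hypergraph defined by the family."""
    return {components(star) for star in family}

def is_berge_acyclic(hyperedges):
    """Leaf stripping: the incidence graph is a forest iff everything can be pruned."""
    edges = [set(h) for h in {frozenset(h) for h in hyperedges}]
    while edges:
        degree = Counter(v for h in edges for v in h)
        # Drop vertices lying in a single hyperedge, then hyperedges with < 2 vertices.
        pruned = [{v for v in h if degree[v] > 1} for h in edges]
        pruned = [h for h in pruned if len(h) > 1]
        if pruned == edges:
            return False  # nothing left to prune: a cycle remains
        edges = pruned
    return True

print(sorted(sorted(h) for h in int_graph(I1)))
print(is_berge_acyclic(int_graph(I1)))
```

Under this encoding the hypergraph of I1 is acyclic, so Proposition 3.3 applies to it. Adding a hypothetical extra connection between K3 and K5 would close the cycle K3, K4, K5.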
For simplicity, this example will be limited to the characterization of canonical least liftings for insertions. For the realization of such liftings to be nontrivial, it must be possible to insert partial information into relations, and to pad out the remainder with nulls. However, this must be done in a systematic way, paying careful attention to constraints which specify where nulls may appear, and how functional dependencies (FDs) behave in their presence.
Table 4: The atomic components of the example of 4.3

Comp. Name | Schema Name | Relations | Constraints | Ports
K1 | E1 | R1[B1 B2] | B1 → B2; ForbidNulls(B1 B2) | Π^{E1}_{B2}
K2 | E2 | R2[B2 B3] | B2 → B3; ForbidNulls(B2 B3) | Π^{E2}_{B2}
K3 | E3 | R3[A1 A2] | A1 → A2; ReqNullTup(A1 A2); NoPartNulls(A1 A2) | Π^{E3}_{A2}
K4 | E4 | R4[A2 A3 A4 A5], S4[B1 B2] | A2 → A3 A4 A5; R4[A3] ⊆ S4[B1]; NoPartNulls(A2 A3 A4 A5); ReqNullTup(A2 A3 A4 A5); ForbidNulls(B2) | Π^{E4}_{A2}, Π^{E4}_{A4A5}, Π^{E4}_{B2}
K5 | E5 | R5[A4 A5 A6] | A4 A5 →ⁿ A6; NoPartNulls(A4 A5); ReqNullTup(A4 A5 A6) | Π^{E5}_{A4A5}, Π^{E5}_{A6}
K6 | E6 | R6[A6 A7] | A6 →ⁿ A7; ReqNullTup(A6 A7) | Π^{E6}_{A6}, Π^{E6}_{A7}
K7 | E7 | R7[A6 A8] | A6 →ⁿ A8; ReqNullTup(A6 A8) | Π^{E7}_{A6}
K8 | E8 | R8[A7 A9] | A7 →ⁿ A9; ReqNullTup(A7 A9) | Π^{E8}_{A7}, Π^{E8}_{A9}
Figure 6: Graphical representation of the interconnection family I1

Table 5: The port schemata of the example of 4.3

Schema Name | Schema Relations | Schema Constraints | Associated Ports | View Mappings
G1 | T1[B2] | ForbidNulls(B2) | Π^{E1}_{B2}, Π^{E4}_{B2}, Π^{E2}_{B2} | π^{E1}_{B2}: E1 → G1; π^{E4}_{B2}: E4 → G1; π^{E2}_{B2}: E2 → G1
G2 | T2[A2] | | Π^{E3}_{A2}, Π^{E4}_{A2} | π^{E3}_{A2}: E3 → G2; π^{E4}_{A2}: E4 → G2
G3 | T3[A4 A5] | NoPartNulls(A4 A5); ReqNullTup(A4 A5) | Π^{E4}_{A4A5}, Π^{E5}_{A4A5} | π^{E4}_{A4A5}: E4 → G3; π^{E5}_{A4A5}: E5 → G3
G4 | T4[A6] | | Π^{E6}_{A6}, Π^{E5}_{A6}, Π^{E7}_{A6} | π^{E6}_{A6}: E6 → G4; π^{E5}_{A6}: E5 → G4; π^{E7}_{A6}: E7 → G4
G5 | T5[A7] | | Π^{E6}_{A7}, Π^{E8}_{A7} | π^{E6}_{A7}: E6 → G5; π^{E8}_{A7}: E8 → G5
G6 | T6[A9] | | Π^{E8}_{A9} | π^{E8}_{A9}: E8 → G6

Nulls: There is a distinguished null marker, denoted n, with n ∈ dom(A) for every attribute A. This null marker is similar to the placeholder described in [18, Sec. 12.5.2]. There are three associated constraint types, each of which takes a list of attribute names as its argument. Since attribute names can occur in only one relation name of a schema, the semantics described below are unambiguous.
ForbidNulls(−): This constraint specifies that no tuple may have the value n in any of the attribute positions listed.
NoPartNulls(−): (No partial nulls) This constraint specifies that if a tuple has the value n in one of the attribute positions listed, then it must have the value n in every position listed.
ReqNullTup(−): (Require null tuple) This constraint specifies that a relation must have at least one tuple which has the value n in every position listed in the argument.

Functional dependencies and nulls: In addition to the usual semantics for an FD 𝐀 → 𝐁, there are several variations which further specify the special way in which the null marker n is handled. In all descriptions below, assume (without loss of generality) that R[U] is a relation scheme on attribute set U with 𝐀, 𝐁 ⊆ U, and that r is a relation over R[U].
n-FDs: The relation r satisfies the n-FD 𝐀 →ⁿ 𝐁 iff the following two conditions are satisfied.
Null extension: For every t ∈ r there is a t′ ∈ r with t′[𝐀] = t[𝐀] and t′[B] = n for every B ∈ 𝐁.
Quasi-functionality: If t, t′ ∈ r with the property that t[𝐀] = t′[𝐀], then at least one of the following three conditions must hold: (i) t = t′; (ii) t[B] = n for every B ∈ 𝐁; or (iii) t′[B] = n for every B ∈ 𝐁.
In words, for an n-FD 𝐀 →ⁿ 𝐁 to be satisfied, there may be at most two tuples associated with each distinct value for 𝐀, one with all nulls in 𝐁 (required) and possibly one other which is not all null on 𝐁. There are three extensions of the notion of an n-FD, which are identified below.
Null preservation: The n-FD 𝐀 →ⁿ 𝐁 is null preserving if whenever t ∈ r with t[A] = n for some A ∈ 𝐀, then t[B] = n for every B ∈ 𝐁. A correspondingly decorated arrow is used to indicate that the n-FD 𝐀 →ⁿ 𝐁 is null preserving.
Null reflection: The n-FD 𝐀 →ⁿ 𝐁 is null reflecting if whenever t ∈ r with t[B] = n for every B ∈ 𝐁, then t[A] = n for every A ∈ 𝐀. A correspondingly decorated arrow is used to indicate that the n-FD 𝐀 →ⁿ 𝐁 is null reflecting.
Simultaneous preservation and reflection: A further decorated arrow is used to indicate that the n-FD 𝐀 →ⁿ 𝐁 is both null preserving and null reflecting.
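The null constraints and the two conditions defining an n-FD are simple enough to check by brute force. The Python sketch below is an illustration under stated assumptions: a relation is a set of tuples, `None` stands for the null marker n, and the function names are invented here. The sample data is the relation R5[A4 A5 A6] of the example database shown in Figure 7.

```python
N = None  # stands for the null marker n

def proj(t, columns, attrs):
    """Restriction of tuple t to the listed attributes."""
    return tuple(t[columns.index(a)] for a in attrs)

def forbid_nulls(r, columns, attrs):
    return all(N not in proj(t, columns, attrs) for t in r)

def no_part_nulls(r, columns, attrs):
    # On the listed positions a tuple is either null everywhere or null nowhere.
    return all(
        all(x is N for x in p) or all(x is not N for x in p)
        for p in (proj(t, columns, attrs) for t in r)
    )

def req_null_tup(r, columns, attrs):
    return any(all(x is N for x in proj(t, columns, attrs)) for t in r)

def satisfies_nfd(r, columns, lhs, rhs):
    """Null extension plus quasi-functionality for the n-FD lhs -> rhs."""
    all_null = tuple(N for _ in rhs)
    for t in r:
        # Null extension: some tuple agrees with t on lhs and is null on all of rhs.
        if not any(
            proj(u, columns, lhs) == proj(t, columns, lhs)
            and proj(u, columns, rhs) == all_null
            for u in r
        ):
            return False
    for t in r:
        for u in r:
            # Quasi-functionality: two distinct tuples agreeing on lhs
            # cannot both be non-null on rhs.
            if (
                t != u
                and proj(t, columns, lhs) == proj(u, columns, lhs)
                and proj(t, columns, rhs) != all_null
                and proj(u, columns, rhs) != all_null
            ):
                return False
    return True

COLS = ["A4", "A5", "A6"]
R5 = {
    ("a41", "a51", "a61"), ("a41", "a51", N),
    ("a42", "a52", "a62"), ("a42", "a52", N),
    ("a43", "a53", N), (N, N, N),
}
print(satisfies_nfd(R5, COLS, ["A4", "A5"], ["A6"]))
print(satisfies_nfd(R5 | {("a45", "a55", N)}, COLS, ["A4", "A5"], ["A6"]))
```

The second call corresponds to the insertion of (a45, a55, n) used in the canonical lifting of this example. Inserting a tuple such as (a45, a55, a61) without its null-extended companion would fail the null extension condition.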
E1, R1[B1 B2]: c11 c12 c11 c12
E2, R2[B2 B3]: c22 c23
E3, R3[A1 A2]: (a11, a21), (a12, a22), (a13, a23), (a14, a23), (n, n)
E4, R4[A2 A3 A4 A5]: (a21, a31, a41, a51), (a22, a32, a42, a52), (a23, a33, a43, a53), (n, n, n, n)
E4, S4[B1 B2]: (a31, b21), (a32, b22), (a33, b23), (b11, b21), (n, b22)
E5, R5[A4 A5 A6]: (a41, a51, a61), (a41, a51, n), (a42, a52, a62), (a42, a52, n), (a43, a53, n), (n, n, n)
E6, R6[A6 A7]: (a61, a71), (a61, n), (a62, a72), (a62, n), (n, n)
E7, R7[A6 A8]: (a61, a81), (a61, n), (a62, a82), (a62, n), (n, n)
E8, R8[A7 A9]: (a71, a91), (a71, n), (a72, n), (n, n)
Figure 7: An example database for the interconnection I1

Figure 7 shows an example of a consistent database for the interconnection I1. To illustrate the ideas of canonical updates, a few selected examples will now be considered. Suppose that it is desired to insert the tuple (a15, a25) into the relation of R3[A1 A2] of the schema E3 of K3. The goal is to identify the canonical lifting of this update from K3 to a compound component under which this update may be supported naturally, without making any arbitrary choices.
First of all, it is important to clarify what is meant by an arbitrary choice. Consider a solution which lifts the update to the compound component Cpt⟨{K3, K4}, {{Π^{E3}_{A2}, Π^{E4}_{A2}}}⟩ by inserting the tuple (a25, a31, a41, a51) into R4[A2 A3 A4 A5], leaving the state of all other components unchanged. This lifting makes an arbitrary choice for the values for attributes A3 A4 A5; note that (a25, a32, a42, a52) or (a25, a33, a43, a53) would work just as well, and there is no reason to prefer one over the other. All involve parts of tuples which already occur in the database, and so make arbitrary semantic choices for the information associated with (a15, a25).
The canonical solution involves using completely new values in these positions, and so is independent of the other values in these relations. More precisely, let a35 ∈ dom(A3) \ {n}, a45 ∈ dom(A4) \ {n}, a55 ∈ dom(A5) \ {n}, and b25 ∈ dom(B2) \ {n} be distinct domain values which have not already been used in any relation. First insert (a25, a35, a45, a55) into the relation of R4[A2 A3 A4 A5] and (a35, b25) into the relation of S4[B1 B2]. Note that the second tuple is mandated by the inclusion dependency R4[A3] ⊆ S4[B1]. This further mandates an insertion into E5, so the update must be extended to Cpt⟨{K3, K4, K5}, {{Π^{E3}_{A2}, Π^{E4}_{A2}}, {Π^{E4}_{A4A5}, Π^{E5}_{A4A5}}}⟩. Insert (a45, a55, n) into the relation of K5.
The use of the null is a recognition that the constraints do not force one to select a specific value for A6. Note that this insertion does not violate the n-FD A4 A5 →ⁿ A6, so that the resulting state is legal. Furthermore, it does not make use of any arbitrary values which already occur in other relations, and it is least, in the sense that no subset of the specified insertions will do, and there is no smaller set of components which will support such an insertion using new values.
One issue still remains; namely, that arbitrary choices for a35, a45, a55, and b25 were made. Formally, this is reconciled by noting that all other such solutions are isomorphic, up to a renaming of the values. A solution to the update problem is thus not a single update, but rather an equivalence class of isomorphic updates. Replacing a35, a45, a55, and b25 by different values which do not occur elsewhere, say a′35, a′45, a′55, and b′25, results in a solution which is structurally indistinguishable. The resulting solution is canonical, and identifies the natural scope for lifting an insertion to K3 as {K3, K4, K5}.
A formal justification of the canonicity of this solution is rooted in the construction of free objects [15, §31] [1, 8.22] over a suitable category of updates, and is beyond the scope of this paper; the details will appear in a forthcoming report. However, it is possible to give an informal justification. The idea is that every other possible lifting of the desired update to K3 can be obtained from the canonical one via a combination of adding additional tuples and forcing the “free” values in the canonical update to take on specific values. For example, the lifting which inserts the tuple (a25, a31, a41, a51) can be obtained from the canonical one identified above, which inserts the tuples (a25, a35, a45, a55), (a45, a55, n), and (a35, b25), by mapping the “free” values a35, a45, a55, and b25 to the existing values a31, a41, a51, and b21, respectively.
The case of deletions is handled similarly, although it turns out to be somewhat simpler, since there is no need to group isomorphic solutions (as no new values need be inserted).
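The informal justification above can be replayed on the data of Figure 7. The Python sketch below is only an illustration of that argument, not the categorical construction the paper defers to a later report: the dictionary encoding of updates, the helper names, and the use of `None` for the null marker n are assumptions. It specializes the canonical insertions by the stated mapping of free values and shows that only the R4 tuple of the arbitrary lifting remains as a genuinely new insertion.

```python
N = None  # stands for the null marker n

# Relevant part of the example database of Figure 7.
db = {
    "R4": {("a21", "a31", "a41", "a51"), ("a22", "a32", "a42", "a52"),
           ("a23", "a33", "a43", "a53"), (N, N, N, N)},
    "S4": {("a31", "b21"), ("a32", "b22"), ("a33", "b23"), ("b11", "b21"), (N, "b22")},
    "R5": {("a41", "a51", "a61"), ("a41", "a51", N), ("a42", "a52", "a62"),
           ("a42", "a52", N), ("a43", "a53", N), (N, N, N)},
}

# Canonical insertions accompanying the insertion of (a15, a25) into R3.
FREE = {"a35", "a45", "a55", "b25"}
canonical = {
    "R4": {("a25", "a35", "a45", "a55")},
    "S4": {("a35", "b25")},
    "R5": {("a45", "a55", N)},
}

def values_of(database):
    """All non-null values occurring anywhere in the database."""
    return {x for rows in database.values() for t in rows for x in t if x is not N}

def rename(update, mapping):
    """Apply a value substitution to every tuple of an update."""
    return {rel: {tuple(mapping.get(x, x) for x in t) for t in rows}
            for rel, rows in update.items()}

def net_insertions(update, database):
    """Tuples of the update that are not already present."""
    result = {}
    for rel, rows in update.items():
        new_rows = rows - database[rel]
        if new_rows:
            result[rel] = new_rows
    return result

# The free values are indeed new to the database.
print(FREE.isdisjoint(values_of(db)))

# Forcing the free values onto existing ones yields the arbitrary lifting.
specialized = rename(canonical, {"a35": "a31", "a45": "a41", "a55": "a51", "b25": "b21"})
print(net_insertions(specialized, db))
```

Renaming the free values to other unused values, rather than to existing ones, leaves all three insertions in place and changes nothing but the names, which is the sense in which such solutions are isomorphic.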
5. Closing Remarks
5.1 Conclusions
The foundations of a component-based model of database schemata, with the interconnection of components realized via communicating views, have been presented. Particular attention has been paid to the question of when such interconnections are well behaved, and a characterization in terms of the acyclicity of an underlying hypergraph has been presented. Furthermore, the way in which updates propagate through such components has been illustrated, although not fully formalized.
5.2 Further directions
This paper is only a beginning, and many topics remain to be studied. Among them are the following.
Updates in the component-based framework: This work began as a study of updates in the context of components. Consequently, an important future direction is to complete the formalization of canonical liftings, as discussed in 4.3. A related direction is update via cooperation, in which the realization of a proposed update requires that other components be updated as well, not as a canonical update but rather as one chosen by a user who has update rights for that component. First results on this topic, including a formal model for the update process, are reported in [14]. Upon following the communication between ports that update by cooperation entails, it is possible to infer much about the necessary workflow patterns behind such updates. Initial investigations on this latter topic are now being pursued.
Component-based HERM to relational design theory: It is a standard design technique to begin by modelling the enterprise using a flavor of ER, such as the HERM model, and then to translate that design to a relational schema [23, Ch. 10]. A future direction of this research is to extend this design theory to components; that is, to develop systematic tools for the translation of a HERM design based upon components, such as elaborated in [25], to a relational design which preserves the component structure, using the component model developed here for this final schema.
Rapprochement with the behavioral theory: As already noted in the introduction, the work presented here is motivated by the database-component model of Thalheim [25], which is in turn based upon the more general component model of Broy [6]. It is important to pursue an understanding of the degree to which these two models can be unified, and to understand their fundamental differences as well.
5.3 Remarks on the literature
Nearly thirty years ago, Weber [26] suggested that modular design techniques could be applied fruitfully to database systems as well, although no detailed formalization was presented. In [7], a software tool for modular database design is presented. In that approach, the emphasis is not so much upon building systems by interconnecting components as it is upon refining the design, specifically by combining and even redefining the so-called conceptual modules via subsumption modules. As such, it does not emphasize basic communication between components as does the framework presented here. Rather, it has much more of a software-engineering flavor. As noted in the introduction, the flavor of component-based modelling of database systems upon which this paper is based has its roots in the approach of Thalheim [24] [22] [25].
6. Acknowledgments
Much of this research was completed while the author was a visitor at the Information Systems Engineering Group at the University of Kiel during parts of 2005 and 2006. He is indebted to Bernhard Thalheim for suggesting the idea that his ideas of database components and the author’s work on views and view updates could have a fruitful intersection, as well as for inviting him to work with his group on this problem. He is furthermore indebted to Bernhard, as well as to the other members of the Information Systems Engineering Group, particularly Hans-Joachim Klein and Peggy Schmidt, for many helpful discussions during the course of this work. Peggy Schmidt also read a preliminary draft of this paper and made numerous suggestions to improve the presentation.

References
[1] Adámek, J., Herrlich, H., and Strecker, G. Abstract and Concrete Categories. Wiley-Interscience, 1990.
[2] Arbib, M. A. Theories of Abstract Automata. Prentice-Hall, 1969.
[3] Bancilhon, F., and Spyratos, N. Update semantics of relational views. ACM Trans. Database Systems 6 (1981), 557–575.
[4] Berge, C. Graphes et Hypergraphes. Dunod, 1970.
[5] Broy, M. A logical basis for modular software and systems engineering. In SOFSEM (1998), B. Rovan, Ed., vol. 1521 of Lecture Notes in Computer Science, Springer, pp. 19–35.
[6] Broy, M. Model-driven architecture-centric engineering of (embedded) software intensive systems: modeling theories and architectural milestones. Innovations Syst. Softw. Eng. 6 (2007), in press.
[7] Casanova, M. A., Furtado, A. L., and Tucherman, L. A software tool for modular database design. ACM Trans. Database Systems 16, 2 (1991), 209–234.
[8] Chen, P. P. The entity-relationship model - toward a unified view of data. ACM Trans. Database Systems 1, 1 (1976), 9–36.
[9] Fagin, R. Degrees of acyclicity for hypergraphs and relational database schemes. J. Assoc. Comp. Mach. 30, 3 (1983), 514–550.
[10] Heath, I. J. Unacceptable file operations in a relational data base. In Proceedings of the ACM SIGFIDET Workshop on Data Description, Access, and Control (1971), pp. 19–33.
[11] Hegner, S. J. Characterization of desirable properties of general database decompositions. Ann. Math. Art. Intell. 7 (1993), 129–195.
[12] Hegner, S. J. An order-based theory of updates for database views. Ann. Math. Art. Intell. 40 (2004), 63–125.
[13] Hegner, S. J. The complexity of embedded axiomatization for a class of closed database views. Ann. Math. Art. Intell. 46 (2006), 38–97.
[14] Hegner, S. J., and Schmidt, P. Update support for database views via cooperation. In Advances in Databases and Information Systems, 11th East European Conference, ADBIS 2007, Varna, Bulgaria, September 29 - October 3, 2007, Proceedings (2007), Y. Ioannis and B. Novikov, Eds., Lecture Notes in Computer Science, Springer-Verlag. In press.
[15] Herrlich, H., and Strecker, G. E. Category Theory. Allyn and Bacon, 1973.
[16] Katz, R. H., and Borriello, G. Contemporary Logic Design, second ed. Pearson Education, 2005.
[17] Krueger, C. W. Software reuse. ACM Comput. Surveys 24, 2 (1992), 131–183.
[18] Maier, D. The Theory of Relational Databases. Computer Science Press, 1983.
[19] Mayol, E., and Teniente, E. A survey of current methods for integrity constraint maintenance and view updating. In Proc. ER ’99 Workshops, Paris, Nov. 15-18, 1999 (1999), vol. 1727 of Springer LNCS, Springer-Verlag.
[20] Paredaens, J., De Bra, P., Gyssens, M., and Van Gucht, D. The Structure of the Relational Database Model. Springer-Verlag, 1989.
[21] Rissanen, J. Independent components of relations. ACM Trans. Database Systems 2, 4 (1977), 317–325.
[22] Schmidt, P., and Thalheim, B. Component-based modeling of huge databases. In Advances in Databases and Information Systems: 8th East European Conference, ADBIS 2004, Budapest, Hungary, September 22-25, 2004, Proceedings (2004), A. Benczúr, J. Demetrovics, and G. Gottlob, Eds., no. 3255 in Lecture Notes in Computer Science, Springer-Verlag, pp. 113–128.
[23] Thalheim, B. Entity-Relationship Modeling. Springer-Verlag, 2000.
[24] Thalheim, B. Database component ware. In ADC (2003), K.-D. Schewe and X. Zhou, Eds., vol. 17 of CRPIT, Australian Computer Society, pp. 13–26.
[25] Thalheim, B. Component development and construction for database design. Data Knowl. Eng. 54, 1 (2005), 77–95.
[26] Weber, H. Modularity in data base system design: A software engineering view of data base systems. In Issues in Data Base Management, Proceedings of the 4th VLDB, September 13-15, 1978, West Berlin, Germany (1978), H. Weber and A. I. Wasserman, Eds., North-Holland, pp. 65–91.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Creating Multi-Level Reflective Reasoning Models Based on Observation of Social Problem-Solving in Infants Heikki RUUSKA, Naofumi OTANI, Shinya KIRIYAMA, Yoichi TAKEBAYASHI Shizuoka University, Johoku 3-5-1, Hamamatsu, Japan
Abstract. We have created an infant learning environment that has capacity for effective behavioral analysis while providing new views on infant learning, and serves as a basis for creating a corpus of infant behavior. We are gathering behavioral data including but not limited to data in social, spatial and temporal domains, while providing inspiring learning experiences to infants. We have used the corpus for constructing and combining novel models on social interaction and problem solving.
1. Introduction
Current intelligent systems are usually limited to producing an effective solution for a single problem type and fail when presented with a problem outside of their scope. We believe that in order to create more flexible systems, ones that can handle the many kinds of everyday tasks labeled as commonsensical, we need advanced methods that can select solution methods according to problem types. In order to develop such methods, we are investigating how humans identify and define problems and solve them. Most commonsensical problems are solved without us being aware of them, which makes tracing the process difficult. It has been proposed that this is because an adult mind is very complex and already equipped with error handlers that switch to a different solution type quickly, without conscious notice [1]. Therefore, we are focusing our research on infants, who often fail at producing solutions, thus giving us valuable data on where processes are going wrong and what is missing when compared to an adult mind.
In this paper, we describe the creation of a multimodal infant behavior corpus and how we have used it for creating models of social problem-solving that might describe what is actually happening in infants’ minds. Furthermore, we present some ideas on how the corpus data and the problem-solving models can be used for creating larger scale reflective reasoning systems, and for developing inspiring learning environments.
H. Ruuska et al. / Creating Multi-Level Reflective Reasoning Models
2. Building Multimodal Infant Behavior Corpus
2.1 Learning Environment
We have an experimental parent-child learning environment for infants [2 to 5]. It has two purposes: first, to provide the participating parents a good environment for the wholesome growth of their children and inspiring experiences to the infants, and second, to provide us with a setting where we can regularly monitor the infants’ behavior and development.
Figure 1 Two infants, their mothers, and a teacher engaged in a learning task
Three sixty-minute classes are held weekly, each class consisting of three infant-parent pairs, where the infants are of the same age. There is one teacher per class. The first half of a class takes place in a classroom setting where the teacher utilizes various materials, such as clay, crayons, and paper, and has the infants complete various tasks, such as building, drawing or identifying things, usually with the parents’ aid. For the second half of a class, the teacher and parents discuss child care and child learning. During this time, the infants are given various toys and are allowed to play freely. The program also includes reports on what is happening at home, including parents’ observations on the child’s development. The whole sixty-minute sessions are recorded by four cameras placed at different angles and multiple microphones, including rucksack microphones worn by each infant. The positioning of the cameras and an overview of the classroom and studio is shown in Figure 2. At the time of writing, we have footage of 51 learning sessions over a year and a half’s time.
Figure 2 Infant learning environment layout
2.2 Annotating Infant Behavior
The audiovisual data recorded during the classes described in the previous subchapter is annotated with meta-knowledge after the class by multiple members of the research team. The annotations include descriptions of infants’ activities, as well as descriptions of actual actions, such as speech, movement, and grasping things. Information on obvious and inferred goals is also recorded. These annotations are currently recorded in a natural language (Japanese) and they form the core of the behavioral corpus. Figure 3 shows an annotation tool used for the purpose.
Figure 3 Annotation tool.
3. Preliminary Modeling Experiments
3.1 Modeling Based on Observations in Corpus
Shortly after starting to build the behavioral corpus described in Chapter 2, we noticed many developmental changes in infants. Some infants that used to be by themselves were becoming more cooperative and intrigued by their surroundings. There was a perceivable increase in their social awareness. For example, we observed many cases of social collaboration and resolution of conflicts of interest between infants in the playroom setting where they were left to themselves. These observations led us to wonder how we might construct a simulated setting where these changes, and the processes themselves, could be replicated, i.e. how to construct models of social problem-solving in infants. A central requirement was that there be a model for each individual, rather than a model for each case. This is necessary for achieving better independence from particular situations. The models should also have internal models of themselves and of other people. The models thus constructed could then be checked and compared with multiple recorded scenarios in the corpus, and perhaps even applied to older children or adults.
Our approach is similar to that of Piaget [6] in that we do long-term in-depth observation of relatively few cases, as opposed to statistically analyzing a large number of cases. But where Piaget was more concerned with describing developmental stages occurring in children, we are aiming at building and testing a variety of computable commonsense models. Put in computer terms, our learning environment is a development and debugging platform for models constructed by others and ourselves. Some related models we are testing and applying include the Territory of Information theory by Kamio [7], logical models such as those proposed by McCarthy [8] and Mueller [9], models for creating large knowledge-base systems such as Cyc [10], probabilistic reasoning, rule-based reasoning, and methods for combining all of these, as proposed by Minsky [11].
3.2 Emotion Machine Commonsense Model
The behavioral corpus gives us a framework for examining factors behind infant behavior and learning, but is by itself not enough for creating working models. For this, we need to convert the natural language annotations to a computable form, and build models and testing environments for deciding which theories and models are relevant. We took parts of the commonsensical reasoning model presented by Minsky [11] as a starting point for building computational models of infant behavior. In particular, we found the ideas proposed by him and Sloman [12,13] on a multi-level approach to problem solving useful for describing social problem-solving.
3.3 Script Language for Computable Models
In order to create the basis for computational models, we needed to implement a scripting language for describing infant behavior in a computable form. Whilst a natural language is easiest to use for annotation, natural language processing is not yet advanced enough for creating usable computational models based on natural language alone. To begin with, we utilized Narrative-L, a simple description language using Common Lisp syntax that was developed by Push Singh for his EM-ONE architecture [14]. This had the advantage of having an already existing testing environment [15]. We then started developing variants more suited to describing infant behavior. Chapter 4 includes some examples of a scripting language we currently use in its listings. In the following subchapter (3.4), we present an example of the early scripting language.
3.4 Early Example: “Which Is the Driver?”

This example is based on a corpus entry of a playroom setting. Figure 4 contains still shots of the scene. In the scenario, a 4-year-old boy and a girl both want to play the driver. The girl communicates this to the boy. The boy understands this, but the girl doesn’t understand that the boy also wants to be the driver. The boy recognizes that there is a conflict of interest and pushes the girl off. The girl is hurt and starts crying. The boy tries to cheer her up. In the end, they stop trying to play the driver, as both know there can only be one in a car.

(def narrative play-driver
  (sequential
    (observes boy (is-sitting girl block) [1])
    (expects boy (play-role boy driver) [2])
    (causes [1] [2])))

(def narrative move-and-sit
  (desires boy (play-role boy driver) [1])
  (sequential
    (moves-close-to boy block [2])
    (sit boy block [3]))
  (causes [1] [2])
  (follows [2] [3]))
…

Parts of “Which is the Driver”, an early narrative script.
Figure 4 Still pictures of the “Which is the Driver” scene. From the left: the girl shows she wants to play the driver by pantomime. The boy wants to be the driver and pushes her off. The girl starts crying, and the boy tries to cheer her up.
4. A Reflective Problem-Solver: Simulating the Behavioral Corpus Data

4.1 Creating Models for Multi-level Reflective Reasoning

In this chapter, we present a new model that can be used to analyze behavior such as in the following description recorded by an external observer:

A boy and a girl are in a room with toys. Boy is holding a block. Boy is looking at the girl, and it looks like he wants to give the block to the girl. Boy is just about to throw the block to the girl, but then he stops for some reason. Boy walks over to the girl. Boy hands the block to the girl. Girl is now holding the block.

Our model tries to answer the following questions: If the boy has a goal of giving the block to the girl, how does he know how to do this? Furthermore, if two ways of accomplishing the goal are available (throwing the block, or walking to the girl and handing it over), how does he choose between them? In Chapter 3.4, we described an early example of how infant social behavior might be described and modeled in narratives. However, to create more robust models, we need to divide the actions and reasoning processes into more atomic units, so that they can be used in more than one scenario. From this viewpoint, we pursue flexible models that can be combined in novel ways, to allow for interpretation and reproduction of many kinds of infant behavior. Our model uses discrete sub-processes, divided into various levels of reasoning, for creating solutions to simple problems such as the ones mentioned above.
Figure 5 A three level model that creates an action plan based on scripted memory data
The generic framework of the model is illustrated in Figure 5. We have divided the reasoning process for the problem into three levels [16]: one where plans for solving the problem are created, one where it is inspected whether the plans are possible to implement, and one where it is decided whether they are socially or morally viable. An error in a plan causes the creation of a new process with the goal of fixing it. Each of the levels carries a selection of sub-processes, or programs, which we call “critics” after Minsky [11]. Each critic is essentially a way of thinking in its own right: a thought process that is triggered under certain conditions and usually has inputs and outputs. Critics can also activate other critics or set new sub-goals. As the “playing the driver” case is too complicated for the scope of this paper, we will instead focus on the above example of passing an object to another person. Here, we utilize only a limited number of critics: one that identifies the problem as a “current-state -> desired-state” type of problem, and activates another critic that searches the memory for existing solutions for this problem type; one that launches subprograms for calculating the required time for each action, and adds them up to determine how long carrying out each solution takes; one that checks whether carrying out all the actions in a solution is possible in the current situation; and one that checks whether carrying out the actions in a solution is socially acceptable, by searching the memory for socially negative elements preceded by those actions.
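The layered arrangement of critics described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation used in the paper: the `Critic` class, the `run_levels` function, the shared `state` dictionary, and the placeholder critic bodies are all assumptions introduced here to show how a critic that is "triggered under certain conditions" and "has inputs and outputs" might be wired into three levels.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Critic:
    """Hypothetical critic: a trigger condition plus an action on shared state."""
    name: str
    level: int                        # 1 = planning, 2 = physical, 3 = social
    applies: Callable[[dict], bool]   # trigger condition
    run: Callable[[dict], None]       # reads inputs from / writes outputs to state


def run_levels(critics, state):
    """Fire every triggered critic, level by level; return the names that fired."""
    fired = []
    for level in (1, 2, 3):
        for critic in critics:
            if critic.level == level and critic.applies(state):
                critic.run(state)
                fired.append(critic.name)
    return fired


# Placeholder critics: the bodies are stand-ins, not real reasoning.
critics = [
    Critic("identify-problem", 1,
           lambda s: "desired" in s and "plans" not in s,
           lambda s: s.update(plans=["plan1", "plan2"])),
    Critic("check-possibility", 2,
           lambda s: bool(s.get("plans")),
           lambda s: s.update(possible=list(s["plans"]))),
    Critic("check-sociality", 3,
           lambda s: bool(s.get("possible")),
           # Stand-in for a memory search: plan2 is simply assumed to fail.
           lambda s: s.update(accepted=[p for p in s["possible"] if p != "plan2"])),
]

state = {"current": "boy has block", "desired": "girl has block"}
fired = run_levels(critics, state)
```

Each critic only sees the shared state, so adding a critic to a level does not require changing the others, which is one possible reading of the "atomic units" the text asks for.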
4.2 First Level of Reasoning: Action-Planners

First, we make some basic assumptions. We assume that the boy has knowledge of the following actions: hold, walk, hand over and throw. These assumptions are justified by past data. We also assume that the boy has experience data of similar cases, as listed in the memory scripts below. Each script is given first in natural language and then in the computational script language. We shall use the same syntax throughout the text.

Memory Script 1. Boy has block. Boy walks to girl. Boy hands over block. Boy observes girl has block.

(setnarrativeMemory script1
  (have boy1 block1)
  (walk-to boy1 girl1)
  (hand-over boy1 girl1 block1)
  (observes boy1 (have girl1 block1)))

Memory Script 2. Boy has block. Boy throws block to girl. Boy observes girl has block. Boy observes girl starts to cry. Boy is scolded by mother.

(setnarrativeMemory script2
  (have boy1 block1)
  (throw-at boy1 girl1 block1)
  (observes boy1 (have girl1 block1))
  (observes boy1 (start-cry girl1))
  (reprimand mother boy1))
When presented with a problem of the form “make current state into desired state”, the boy’s first action is to search for an existing solution. He does this through searching for analogous cases.
Current state: Boy has block. Desired state: Boy observes girl has block. Solve problem (current state to desired state). Check for analogous situations in memory.

(setstate current (have boy1 block1))
(setstate desired (observes boy1 (have girl1 block1)))
(createplan current desired)
=>(findanalogies current desired)
After searching for scripts where “Girl has block” is preceded by “Boy has block”, the action planner returns scripts 1 and 2, giving Plan 1: “walk to girl and hand over the block” and Plan 2: “throw the block to girl” [17]. Either of these plans will bridge the gap between the current situation and the desired situation. Plan 1. Boy walks to girl. Boy hands over block.
=>(define-plan plan1 (walk-to boy1 girl1) (hand-over boy1 girl1 block1))
Plan 2. Boy throws block to girl.
=>(define-plan plan2 (throw-at boy1 girl1 block1))
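The analogy search that yields Plan 1 and Plan 2 can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: script events are encoded as nested Python tuples, and `find_analogies` is a hypothetical stand-in for the `findanalogies` step, returning the events that lie strictly between the current state and the desired state in each matching memory script.

```python
# Events mirror the script syntax as nested tuples (an encoding assumed here).
MEMORY = {
    "script1": [
        ("have", "boy1", "block1"),
        ("walk-to", "boy1", "girl1"),
        ("hand-over", "boy1", "girl1", "block1"),
        ("observes", "boy1", ("have", "girl1", "block1")),
    ],
    "script2": [
        ("have", "boy1", "block1"),
        ("throw-at", "boy1", "girl1", "block1"),
        ("observes", "boy1", ("have", "girl1", "block1")),
        ("observes", "boy1", ("start-cry", "girl1")),
        ("reprimand", "mother", "boy1"),
    ],
}


def find_analogies(memory, current, desired):
    """Return {script name: plan} for scripts where current precedes desired.

    The plan is the list of events strictly between the two states.
    """
    plans = {}
    for name, events in memory.items():
        if current in events and desired in events:
            start, end = events.index(current), events.index(desired)
            if start < end:
                plans[name] = events[start + 1:end]
    return plans


current = ("have", "boy1", "block1")
desired = ("observes", "boy1", ("have", "girl1", "block1"))
plans = find_analogies(MEMORY, current, desired)
```

Note that the sketch matches events exactly; it does not generalize over different boys, girls, or blocks, which corresponds to the limitation mentioned in footnote [18].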
However, before taking any action, the boy would need to know if any of the plans are practical. Determining this is handled by the second level of critics.
4.3 Second Level of Reasoning: Physical Viability Analysis

For the second, reflective level, we assume two critics: one that scans the action plans for flaws, and one that calculates the time they take. A plan is flawed if it includes an action impossible to implement in the current situation. For example, there might be a wall between the boy and the girl, so the boy couldn’t simply walk over to the girl. Alternatively, the block might be too heavy or the girl too far away for the boy to throw it to the girl. Here we assume that there are no such problems.
Compare total time for Plan 1 and Plan 2. Plan 2 takes less time. Prefer Plan 2.
Check for validity of Plan 1. “Boy walks to girl” is possible. “Boy hands over the block” is possible.
Check for validity of Plan 2. “Boy throws block to girl” is possible.

(compare-times plan1 plan2)
=> (plan2 plan1)
=> (planpref-speed (plan2 plan1))
(actions-possibility plan1)
(possibility (walk-to boy1 girl1))
=>true
(possibility (hand-over boy1 girl1 block1) after (walk-to boy1 girl1) after (have boy1 block1))
=>true
(actions-possibility plan2)
(possibility (throw-at boy1 girl1 block1) after (have boy1 block1))
=>true
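The two second-level critics can be sketched as below. The sketch is hedged in two ways: the per-action durations are invented purely for illustration (the paper gives no numbers), and `actions_possible` models a flaw simply as membership in a set of blocked actions, whereas a fuller model would inspect the situation itself (walls, distance, weight).

```python
# Hypothetical durations in seconds; illustrative values only, not from the paper.
DURATION = {"walk-to": 4.0, "hand-over": 1.0, "throw-at": 1.5}

PLANS = {
    "plan1": [("walk-to", "boy1", "girl1"),
              ("hand-over", "boy1", "girl1", "block1")],
    "plan2": [("throw-at", "boy1", "girl1", "block1")],
}


def total_time(plan):
    """Add up the assumed duration of every action in the plan."""
    return sum(DURATION[action[0]] for action in plan)


def compare_times(plans):
    """Return plan names ordered from fastest to slowest."""
    return sorted(plans, key=lambda name: total_time(plans[name]))


def actions_possible(plan, blocked):
    """A plan is flawed if any of its actions is blocked in the current situation."""
    return all(action not in blocked for action in plan)


ranking = compare_times(PLANS)
viable = [name for name in ranking if actions_possible(PLANS[name], blocked=set())]

# A wall between the children would block walking but not throwing.
wall = {("walk-to", "boy1", "girl1")}
```

With no obstacles, both plans survive and Plan 2 is ranked first, matching the outcome described in the text.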
In the end, both plans are deemed all right by second level critics and are passed on to the third level, with the distinction that Plan 2 is preferable because it takes less time.
4.4 Third Level of Reasoning: Social Viability Analysis

On the third, self-reflective level, the plans are evaluated by only one critic in this example case: one that examines whether the actions in them are socially acceptable. This heavily simplified [18] critic does this by searching for any scripts in memory where socially unfavorable elements are preceded by actions included in the plans.

Check for social compliance of Plan 1. “Boy walks to girl” no problems. “Boy hands over the block” no problems.
Check for social compliance of Plan 2. “Boy throws block to girl” not good.

(actions-sociality plan1)
(sociallyOK (walk-to boy1 girl1))
=>true
(sociallyOK (hand-over boy1 girl1 block1))
=>true
(actions-sociality plan2)
(sociallyOK (throw-at boy1 girl1 block1))
=>false
Plan 2 failed because in Memory Script 2, the action “boy throws block to girl” is later followed by “boy is scolded by mother”, which the critic regards as a sign of social non-compliance. Thus, even though Plan 2 was preferred after the second level of critics, it failed at the third level. Therefore, as the result of the reasoning process, the boy carries out the first step of Plan 1. If the situation hasn’t changed once the first step is completed, he carries out the second step and then checks whether the desired result was achieved.
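The third-level check and the final choice can be sketched as follows. As in the text, only the existence of a later negative event is tested, not an actual causal link. The tuple encoding, the set of negative predicates (here just `reprimand`), and the function names are assumptions made for this sketch.

```python
MEMORY = {
    "script1": [
        ("have", "boy1", "block1"),
        ("walk-to", "boy1", "girl1"),
        ("hand-over", "boy1", "girl1", "block1"),
        ("observes", "boy1", ("have", "girl1", "block1")),
    ],
    "script2": [
        ("have", "boy1", "block1"),
        ("throw-at", "boy1", "girl1", "block1"),
        ("observes", "boy1", ("have", "girl1", "block1")),
        ("observes", "boy1", ("start-cry", "girl1")),
        ("reprimand", "mother", "boy1"),
    ],
}

NEGATIVE = frozenset({"reprimand"})  # assumed marker of social disapproval


def socially_ok(action, memory, negative=NEGATIVE):
    """False if any remembered script has a negative event after the action."""
    for events in memory.values():
        if action in events:
            later = events[events.index(action) + 1:]
            if any(event[0] in negative for event in later):
                return False
    return True


def choose_plan(ranked_plans, memory):
    """Return the first plan, in preference order, whose actions are all acceptable."""
    for name, plan in ranked_plans:
        if all(socially_ok(action, memory) for action in plan):
            return name
    return None


ranked = [  # preference order handed over by the second level
    ("plan2", [("throw-at", "boy1", "girl1", "block1")]),
    ("plan1", [("walk-to", "boy1", "girl1"),
               ("hand-over", "boy1", "girl1", "block1")]),
]
chosen = choose_plan(ranked, MEMORY)
```

Keeping the speed ranking and the social filter separate mirrors the text: Plan 2 stays the preferred plan until the third level rejects it.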
4.5 Applying the Constructed Model to a Separate Case

In this subchapter, we show how some elements of the model constructed in the previous subchapters can be applied to a different social problem. For simplicity, we will skip the detailed analysis of the previous subchapters and only discuss the main points for this case. First, as before, we give a brief description of the events. It should be noted here that the blocks used are soft, furry toy blocks.

A girl and a boy are in a room with toys. The girl wants the boy to play with her. The boy is looking the other way. The girl grabs a block. The girl throws the block at the boy. The boy turns around to face the girl. The girl grabs another block. The girl waves the block around. The boy picks up a block. The boy comes to the girl. The boy and the girl start waving the blocks around.

The questions raised here are, for example: Why does the girl throw a block instead of making noise or saying something? How does the girl know that she can make the boy play with her this way? Why is the boy not angry at having a block thrown at him? We will not try to answer all the possible questions here, but try to explain what might have caused the girl’s behavior. For describing the girl’s behavior, we need to add some critics to the ones introduced in (4.1). First, as the girl wants the boy to play with her, she needs a first-level critic that recognizes that she needs the boy’s agreement to play with her. For gaining agreement, she could use script knowledge to know that she needs to make a request. Strictly speaking, using script knowledge for such a basic communication model as making a request is probably not necessary, as babies seem to know this innately, when asking for nutrition or a change of diapers, for example. As the ultimate goal for the girl is to get the boy to agree to play with her, she uses her communication model for requests to create the following sub-goals:

Basic communication model | Derived sub-goal for situation | Script expression
1. Hold other’s attention | Boy pays attention to girl. | (observe girl1 (look-at boy1 girl1))
2. Deliver message / make request | Boy sees girl wants to play. | (observe boy1 (desire girl1 (play-together girl1 boy1)))
3. Infer whether agreement was gained | Boy agrees to play. | (observe girl1 (agree boy1 (play-together girl1 boy1)))
The girl first runs the second and third level critics on this plan; finding no problems (there are no physical or social constraints that would directly prevent any of the sub-goals from taking place), she sets out to determine how to fulfill each sub-goal. The girl first needs to gain the boy’s attention. For this case, we assume that, based on her experience data, she can choose among 1) shouting to the boy, 2) throwing something at the boy, or 3) moving in front of the boy. Second-level speed analysis would rank them in the order of preference 1-2-3. As the girl doesn’t have social inhibitors on throwing things at people in this case, at least light-weight and soft things (maybe her mother isn’t present, or she has no experience of being scolded for throwing things, though this is unlikely), all the plans are valid for the first sub-goal. Now that the girl has a plan for gaining the boy’s attention, she considers the next step: how to convey the message that she wants to play. We assume she has a memory where she, after waving a toy to someone, has played with that someone using that toy. Her verbal skills are insufficient to convey the request by speech, so she doesn’t have that option. Thus, she comes up with a plan of picking up a block, holding it and waving it around to fulfill the second sub-goal. Regarding the third step, the girl recognizes that she can’t draw any conclusions or make plans before seeing how the boy reacts. We can assume that she has a few ways to identify whether the boy agreed or not: for example, the boy walking to the girl and starting to do something with a toy while facing her, or, if the girl chose the plan to throw the block, the boy throwing the block back, could be taken for agreement; the boy walking away, or turning away and continuing whatever he was doing, could be taken for disagreement. The point is that the boy has too many possible ways to respond, so she will have to draw her conclusions afterwards. As the result of this reasoning process, the girl has made the following plan.

Step 1: Shout to the boy, or throw something at the boy, or walk to the boy.
Step 2: Wave a toy around.
Step 3: See how the boy reacts.

However, we are missing a piece: why does she decide to throw a block in the end, when the order of preference based on our previous model would cause her to shout? We conclude that she probably throws the block because she wants to play with the block. Moreover, the act of throwing is itself an act of play, which the girl takes as priming the boy for the suggestion. Therefore, when making her plans she reflects on her intentions, concluding that as she wants to play with the block, she should use the strongest way possible of communicating this.
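The girl's choice among the three attention-getting options can be sketched as a speed ranking that is overridden when an option involves the object she intends to play with. This is one possible reading of the explanation above, written as a sketch: the option list, the `uses` field, and `pick_attention_action` are hypothetical names introduced here.

```python
# Options listed in the second-level speed order 1-2-3 given in the text.
OPTIONS = [
    {"action": "shout", "uses": None},
    {"action": "throw", "uses": "block"},
    {"action": "move-in-front", "uses": None},
]


def pick_attention_action(options, inhibited, intended_object=None):
    """Pick an uninhibited option, preferring one that uses the intended object.

    Without an intended object, the fastest uninhibited option wins.
    """
    allowed = [o for o in options if o["action"] not in inhibited]
    if intended_object is not None:
        for option in allowed:
            if option["uses"] == intended_object:
                return option["action"]
    return allowed[0]["action"] if allowed else None


by_speed_only = pick_attention_action(OPTIONS, inhibited=set())
with_intention = pick_attention_action(OPTIONS, inhibited=set(),
                                       intended_object="block")
with_inhibition = pick_attention_action(OPTIONS, inhibited={"throw"},
                                        intended_object="block")
```

Under these assumptions, speed alone selects shouting, reflection on the intention to play with the block selects throwing, and a social inhibitor on throwing falls back to shouting.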
4.6 Recording Mental States of Individuals

With our corpus-based approach, we can create different models for each individual based on our observations. First, we record their behavioral data and tendencies. We can then use these records to decide which particular critics and memory scripts apply to whom in which cases. For example, the boy not throwing the block in the first example case and the girl throwing the block in the second could be handled by the same social inhibition critic, with different memories (whether there is a memory of having been scolded linked to throwing) or different situations (whether the person who did the scolding is present) causing the differences in behavior. In our approach, we can describe the mental status and available resources of each individual separately, allowing us to create models that adapt flexibly to a variety of patterns of social interaction.
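The idea of one critic producing different behavior for different individuals can be sketched as below. The records for the boy and the girl, their field names, and the `throwing_inhibited` function are assumptions for illustration; in particular, whether the mother was present in either recorded scene is not stated in the text and is varied here only to show the effect.

```python
def throwing_inhibited(memories, people_present):
    """True if a remembered scolding for throwing came from someone now present."""
    return any(
        m["action"] == "throw" and m["scolded_by"] in people_present
        for m in memories
    )


# Hypothetical per-individual records.
boy = {"memories": [{"action": "throw", "scolded_by": "mother"}]}
girl = {"memories": []}

boy_with_mother = throwing_inhibited(boy["memories"], {"mother", "girl"})
boy_without_mother = throwing_inhibited(boy["memories"], {"girl"})
girl_with_mother = throwing_inhibited(girl["memories"], {"mother", "boy"})
```

The critic itself is identical in all three calls; only the individual's memories and the situation differ.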
4.7 Using Scripted Corpus Data in Simulated Environments The previous subchapters illustrate some models which can be used as a base for analyzing infant behavior in different social situations where he or she is handing things over, or throwing them to, someone else, and communicating intentions. However, whereas the behavioral corpus has an abundance of examples where singular actions such as handing things over or throwing them take place, for complex scenarios the number of real-world examples is usually limited. Therefore, we are building simulated environments for analyzing causality in more complex cases, like the case described in Chapter 4.5, where we can combine individual actions in ways that have not come up in corpus data, for further testing of the models. Our current simulated environment is a three dimensional space with robots and other physical objects, where their behavior is governed by Newtonian physics. The robots act according to pre-defined scripts. This simulated environment is realistic enough to allow us to construct a wide variety of believable scenes.
5. Future Prospects

5.1 Validity Considerations

Models such as the simplified ones described in Chapter 4 naturally give rise to questions about their validity. How can we be sure that anything like these processes exists in the human mind for problem-solving? From observations alone, how can we be sure that a boy who threw a block at the girl actually had any goal of passing the block to the girl, rather than just deciding to throw the block that way while the girl happened to be there? This problem is accentuated because the infants’ ability to communicate their goals verbally is limited. Our approach is to use multiple observers, who naturally create slightly different interpretations of both the goals and the ways of reasoning behind particular actions. The models thus created are then applied to many different scenarios in the corpus to see if and where contradictions arise. By having long-term data on individual subjects, we can also check what kind of experiences they have had in the past, and how those would seem to affect their reasoning. Ultimately, we plan to integrate the reasoning models into a simulated environment, such as the one described in Chapter 4.7, and link the sensory inputs of the virtual actors to this virtual world. In effect, we would be replacing the predefined scripts described before with a virtual intelligence system. We can do this because all parts of the model described in Chapter 4, the script memory and the layered processes, are computable. In this way, we can better debug the reasoning models: decide which parts of the virtual actors’ behavior are humanlike and which parts need improving.
5.2 Extending the Domains for the Models and Other Further Developments

We have mainly discussed how a multimodal infant behavior corpus can be used in creating models for social problem-solving. However, it is possible to broaden the field to include models for the formation and acquisition of spatial, temporal and structural concepts, for example. We have started with the social domain in the interest of simplicity and testability, as the changes are relatively easily observed. It should be noted that the problem-solving models created in the way presented in this paper are not intended to be the only representation for problem solving. Our multi-layer reflective model using simple scripts as a memory format should be thought of as a “kernel” that can launch a wider variety of approaches. For example, a second-level critic described in section 4.3 could use an advanced route-finding algorithm for determining whether walking over to the girl is possible; similarly, a physics simulation could be run to calculate whether throwing the block to the girl is possible. For analyzing causality, we could first use simple probabilities and stochastic operators, and eventually semantic nets combined with fuzzy logic could conceivably be used in deciding whether the fact that the boy was scolded by the mother really had anything to do with the boy throwing the block. These are all areas for future development. As a next step, upon gathering a representative collection of models, we are considering testing their compatibility with larger knowledge bases. As the function of the layered processes described in this paper greatly depends on what kinds of scripts they are fed, it should be fruitful to investigate how they work when supplied with broader chains of commonsense knowledge, such as those stored by the Cyc [19] and Open Mind [20] projects. Finally, we hope to use the corpus as a tool for analyzing the processes of learning, helping us to invent more effective education methods for the benefit of later generations.
6. Conclusion In this paper, we have described an experimental infant learning environment, a multimodal infant behavior corpus formed from observations of that environment, and some models for reflective reasoning that are constructed based on the corpus. These models by their atomic and flexible nature can be applied to various problem solving scenarios. The models developed by our method can be used for launching different types of subordinate problem solvers for even greater flexibility, bringing us closer to realization of comprehensive human-like problem solving systems.
Acknowledgements First and foremost we would like to thank Saki Kawaguchi, Yutaka Sakane and Goh Yamamoto for their participation in the project, and their advice and critical comments. Further, we especially thank Marvin Minsky and the late Push Singh for providing a theoretical basis for many of the ideas which we have built our system on.
Footnotes and References

[1] This has been discussed in many works. P.166 of [6] contains an illustrating example.
[2] Yoichi Takebayashi: Multimodal Knowledge Contents Design From the Viewpoint of Commonsense Reasoning. Proceedings of GSIS International Symposium on Information Sciences of New Era: Brain, Mind and Society. Sendai, Japan 2005.
[3] Atsuyo Shirai, Saki Kawaguchi, Shinichi Sakane, Yutaka Sakane, Yoichi Takebayashi: Design of Infant-Parent Learning Environments with Privacy Protection. Proceedings of The 20th Annual Conference of the Japanese Society for Artificial Intelligence, 2G2-6, 2006. [Japanese]
[4] Kota Otake, Naofumi Otani, Saki Kawaguchi, Yutaka Sakane, Shinya Kiriyama, Yoichi Takebayashi: A Consideration on Infant Behavior Corpus. E-007, FIT2005, Japan. [Japanese]
[5] Saki Kawaguchi, Kota Otake, Goh Yamamoto, Shogo Ishikawa, Shinya Kiriyama, Shinichi Sakane, Yutaka Sakane, Yoichi Takebayashi: Multimodal Knowledge Authoring System for Empowering Parent-Child Co-Education Environments. Proceedings of The 19th Annual Conference of the Japanese Society for Artificial Intelligence, 3D2-03, 2005. [Japanese]
[6] Jean Piaget: The Language and Thought of the Child. New York: Routledge, 1923/2001.
[7] Akio Kamio: Territory of Information. Philadelphia: John Benjamins, 1998.
[8] John McCarthy: Mathematical logic in artificial intelligence. Daedalus, 117(1), 1988.
[9] Erik Mueller: Commonsense Reasoning. Morgan Kaufmann, 2006.
[10] Douglas Lenat, R. Guha: Building Large Knowledge-based Systems. Addison-Wesley, 1990.
[11] Marvin Minsky: The Emotion Machine. Simon & Schuster, 2006.
[12] Aaron Sloman: Beyond Shallow Models of Emotion. Cognitive Processing, 1(1), 2001.
[13] Marvin Minsky, Push Singh, Aaron Sloman: The St. Thomas common sense symposium: designing architectures for human-level intelligence. AI Magazine, summer, 25(2), 2004.
[14] Push Singh: EM-ONE: An Architecture for Reflective Commonsense Thinking. PhD thesis, MIT, 2005.
[15] The Roboverse virtual world simulator is described in [9]. It is currently developed by Bo Morgan at MIT.
[16] In [6] and [8], 6 levels for reasoning are described. Levels described in this paper correspond to them roughly as follows: action-planners 1&2, physical evaluators 3, social evaluators 4-6, respectively. The levels described in this paper are by no means meant to be comprehensive: we have simply left out the parts which we currently have no data for.
[17] If there were no ready solutions in memory, there could be another critic that suggests the boy moves closer to the girl, for example, in order to bring the block nearer to the girl, and see what happens.
[18] Only existence of actions is checked here: for a better model, we need to include subprograms that can test for whether there are any actual causal relationships. Also, using classes of objects instead of instances, we wouldn’t be limited to specific objects. However, such classification has its own problems and is out of the scope of this paper.
[19] Douglas Lenat: CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38, No.11 (1995)
[20] Push Singh, Erik T. Mueller, Grace Lim, Travell Perkins, Wan Li Zhu: Open Mind Common Sense: Knowledge Acquisition from the General Public. Proceedings of the First International Conference on Ontologies, Databases, and Applications of Semantics for Large Scale Information Systems. Irvine, California 2002
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
CMO – An Ontological Framework for Academic Programs and Examination Regulations

Richard HACKELBUSCH
Universität Oldenburg, Escherweg 2, D-26121 Oldenburg, Germany

Abstract. Academic institutions release teaching and examination regulations in order to form the statutory framework of academic programs. Because these regulations are worded using legal terminology and are often very complicated, students often do not know how to satisfy the program requirements they lay down. This can lead to needlessly long study times. In addition, academic boards have to supply a number of courses that fits the students’ actual demand, which is a difficult task because often only little information is available. Frequent changes of those regulations and the coexistence of different, concurrently valid regulations for programs leading to the same degrees may aggravate these problems. In order to be able to offer software support for handling these problems, a computer-understandable representation of academic programs and their examination regulations is needed. In this paper, we present and explain our approach, which is based upon ontologies. It defines a meta-model that allows such semantic representations. Instantiations of the ontology can be used within a framework, e.g., to implement decision support systems that can help students decide how they can satisfy the corresponding program requirements, or that can help academic boards forecast the students’ demand for certain courses or examinations.
Introduction

In order to achieve certain degrees, students have to pass educational programs offered by academic institutions. The statutory framework of those academic programs is formed by legally binding teaching and examination regulations. Like laws, these examination regulations are worded using legal terminology and are often complicated. That is one reason why they are often hard for students (and often for lecturers, too) to comprehend. As a result, many students do not even try to read them. This can lead to a large demand for course guidance (see [4]). Yet another reason is the prevailing heterogeneity of the examination regulations of different programs. Often, the examination regulations within a single academic institution already differ a lot. This can aggravate the problem, e.g., if courses of different programs must be integrated into a single curriculum, for example in the case of a “minor subject”. This leads to the questions of which “external” courses of other academic programs can or must be taken in order to satisfy the corresponding program requirements, and if or how grades have to be annualized. In addition, the fact that different valid examination regulations can exist in parallel, forming the statutory framework of different programs leading to the same degrees, is another problem. This can happen, e.g., after the introduction of a new version of examination regulations. That is why students of the same academic institution who are aiming at the same degrees may nevertheless have to satisfy different program requirements. To address those problems, subsidiary documents like study guides are often introduced that are intended to describe possible variations of correct curricula. Those documents are intended to be used as a basis for students planning their curricula. But a major disadvantage of those documents is that they often only describe situations in general. Thus, they often cannot do justice to the individual situations of all students.
In those cases, they do not bring much benefit. Another attempt to solve these problems is the offer of individual study guidance (which is mandatory in some countries, e.g., in Germany [13]), but that can be very expensive. Reforms of academic programs that are to be implemented in the course of the so-called Bologna Process reduce the problems only to a limited extent. Often, only the required minimum of the Bologna guidelines (like modularization, see [9]) is implemented. On the contrary, one result of the reforms is that the problems described above now become obvious. From the point of view of academic boards, there are problems concerning examination regulations, too. Information about the current state of the students’ progress is missing. That is why supply and demand of courses might be poorly balanced. Academic boards normally only receive information about the number of students who have started their studies in a certain term aiming at certain degrees under certain examination regulations, and, in some cases, how many students have already finished or aborted their studies. Thus, it can only be clearly determined how many students are studying within a certain term. But it is not clear how far those students have progressed in satisfying their program requirements to get their degrees. From the point of view of lecturers, it is often not clear how many of the students who want to take a certain course have to satisfy a given version of the program requirements. Mostly, the expected number of students who want to take that course can only be forecast from the number of students studying in certain terms, based on knowledge of past terms (assuming that certain courses are taken by students studying in certain terms). This can lead to a poor balance of supply and demand of courses. But these problems do not only concern the task of deciding whether a course should be offered or not. They also concern the task of deciding how many resources should be provided in conjunction with the offer of certain courses.
Until there is no more information available concerning the progress of the students’ studies, the corresponding academic boards are not able to create an adequate supply to fit the students’ demand. In addition, collecting and analyzing of that information have to be compliant with privacy guidelines (see [19]). This paper is the reengineering of our previous published work (see [7]). 1. Approach In order to be able to offer computer-assisted help to solve the above-described problems, one precondition are computer-readable represented examination regulations. For this purpose, the relevant part of their semantics is the description of requirements that have to be fulfilled in order to get a certain degree. These requirements describe processes that represent possible curricula (see [4]). Such processes could be modeled, e.g., using Event Driven Process Chains (EPKs) introduced by ARIS (see [15]), Unified Markup Language (UML, see [12]), or PetriNets/Workflow-Nets (see [17]). Approaches that allow a direct modeling of processes have the advantage that they offer a more human-engineered way to model examination regulations in comparison to approaches that work exclusively on rule basis, like [8]. The way to convert examination regulations into a computer readable model can be done easier and more intuitively. Beyond the semantic representation of processes described by examination regulations, a further semantic representation of them is preferable, too. The definition of concepts like modules, examinations, lessions, etc., with different attributes like workload, fields, etc., that can be very heterogeneous comparing different academic programs, should be modelable. This also would be one precondition to define, e.g., that courses can be used as substitutes for others. Such semantic representations are mostly difficult to model using classic modeling approaches in particular if there are conditions to be modeled that are related to such
R. Hackelbusch / CMO – An Ontological Framework
concepts or attributes (like “sum of workload > 8”, or “of special field of technical computing science”). That is why our approach uses ontological concepts in order to define concepts for the semantic representation of examination regulations and their possible curricula in a process view. We call this ontology the Curricula Mapping Ontology (CMO). Gruber defines an ontology as “an explicit specification of a conceptualization” [3]. A conceptualization is an abstract model of a defined domain, including the relevant identifying vocabulary, that has reached consensus within a certain community. “In such an ontology, definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions, or other objects) with human-readable text describing what the names are meant to denote, and formal axioms that constrain the interpretation and well-formed use of these terms”.

A framework for the interpretation of models is implemented that can be used as a basis to implement, e.g., decision support systems. It is able to interpret models of the CMO that allow nearly freely defined concepts and attributes, without needing to be adapted (see below).

2. The Ontology

The main goal of our ontological approach is to offer a meta-model that can represent the possible curricula of academic programs, each of which is originally regulated by examination regulations. The concept is to represent a process view of those academic programs and examination regulations. The meta-model is the conceptual level of the ontology. Examination regulations and academic programs have to be modeled on its instance level. The concrete interpretation of academic programs and examination regulations that are modeled using the ontology’s meta-model can be done using the framework by assigning the students’ individual results.

2.1. Main concepts

Two main entity types can be identified in such processes: First, there are (process) steps that have to be successfully taken by students in order to get their degrees. A process step can stand for an examination, a course, a class, or even a thesis. Second, there are conditions that regulate whether a certain student is allowed to attempt such a step.
Figure 1. Basic concepts (class diagram relating Process, Process_Element, Process_Step, and Condition; labels include fulfilled, Predecessor, Successor, hasPostcondition, and subClassOf)
In order to represent a process that stands for an academic program, the meta-model allows process steps and conditions to be arranged using links to their predecessors and successors in the process, each of which can in turn be a process step or a condition. Therefore, the meta-model defines the entity types Process_Step and Condition (abstract), each of which is a subclass of an entity representing an abstract general Process_Element (see figure 1; abstract concepts are intended to be subclassed and not to be instantiated directly, and they are visualized using shadows). A Process consists of a set of such specializations of Process_Element. The framework interpreting such a modeled process has to determine
a boolean value for each element of the process (fulfilled). TRUE represents a successfully taken process step or a fulfilled condition. FALSE, in contrast, stands for a failed or not yet taken process step, or for a condition that is not fulfilled.
Figure 2. Logical conditions (Logical_Condition as a specialization of Condition, with the specializations And, Or, Xor, Nand, Nor, and Xnor)
Entities of the type Process_Step can have one or no predecessor and an arbitrary number of successors. A Condition can have an arbitrary number of predecessors and successors. Processes modeled using this meta-model are interpreted such that a process step may be taken if its predecessor has the boolean value TRUE or if it has no predecessor. Whether the boolean value of a Process_Step instance itself has to be interpreted as TRUE or FALSE depends on the result of the corresponding student’s individual attempt at the examination, course, etc., assigned to the Process_Step instance.

Of course, the framework interpreting the modeled examination regulations has to determine the boolean value of the Condition instances of the process, too. The value of a condition depends on the boolean values of its predecessors and on the type of the condition itself (see below). Thus, no cycles of conditions are allowed to be modeled. The type of a condition depends on the specialization of Condition that is instantiated and used inside the process. Specializations of Condition have to be interpreted either as simple logical terms (Logical_Condition) or as more or less complex conditions based on the comparison of numerical values (Value_Condition).

Logical conditions are the specializations And, Or, Xor, Nand, etc. (see figure 2). Instances of those conditions in a process are interpreted similarly to logic gates (see [14]): an And condition, for example, has to be interpreted as TRUE if all its predecessors (Process_Step instances as well as Condition instances) have to be interpreted as TRUE, and as FALSE otherwise.
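The interpretation rules for process steps and logical conditions can be illustrated with a small Python sketch. It is not the framework's implementation: the class names only mirror the ontology's concepts, while the attribute names (`predecessor`, `predecessors`, `passed`), the helper `can_take`, and the reading of Xor and Xnor as odd and even parity for more than two inputs are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Gate semantics for the Logical_Condition specializations. For more than
# two inputs, Xor/Xnor are read as odd/even parity (assumption of this sketch).
GATES = {
    "And": lambda values: all(values),
    "Or": lambda values: any(values),
    "Nand": lambda values: not all(values),
    "Nor": lambda values: not any(values),
    "Xor": lambda values: sum(values) % 2 == 1,
    "Xnor": lambda values: sum(values) % 2 == 0,
}


@dataclass
class ProcessElement:
    """Abstract element; a boolean value is derived for each element."""
    name: str

    def fulfilled(self) -> bool:
        raise NotImplementedError


@dataclass
class ProcessStep(ProcessElement):
    predecessor: Optional[ProcessElement] = None  # one or no predecessor
    passed: bool = False  # hypothetical flag, set from the student's result

    def fulfilled(self) -> bool:
        return self.passed

    def can_take(self) -> bool:
        # A step may be taken if it has no predecessor or the predecessor is TRUE.
        return self.predecessor is None or self.predecessor.fulfilled()


@dataclass
class LogicalCondition(ProcessElement):
    kind: str = "And"
    predecessors: List[ProcessElement] = field(default_factory=list)

    def fulfilled(self) -> bool:
        return GATES[self.kind]([p.fulfilled() for p in self.predecessors])


a = ProcessStep("A", passed=True)
b = ProcessStep("B", passed=False)
both = LogicalCondition("both", "And", [a, b])
either = LogicalCondition("either", "Or", [a, b])
c = ProcessStep("C", predecessor=either)
d = ProcessStep("D", predecessor=both)
print(c.can_take(), d.can_take())  # True False
```

Because conditions are evaluated recursively over their predecessors, the sketch relies on the rule stated above that no cycles of conditions are modeled.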
Figure 3. Value conditions (Value_Condition as a specialization of Condition, with the specializations Equal, Unequal, Greater, Greater_Equals, Smaller, and Smaller_Equals; further concepts shown are Value, Term, Achievement_Value, and Operator with Addition, Subtraction, Multiplication, and Division)
With logical conditions alone, only very simple academic programs and examination regulations could be modeled easily. Therefore, the meta-model defines the abstract concept
of a Value_Condition. An instance of Value_Condition represents a comparison of values, each of which can depend on values of process elements in the set of predecessors of the condition itself. The type of comparison (equals, smaller, greater, ...) depends on the specialization of Value_Condition that is instantiated (see figure 3). Each instance of a specialization of Value_Condition references two instances of Value. An instance of Value can be interpreted as a numerical value or as a Term of two Value instances. A Term instance combines two Value instances with a mathematical operator like +, -, *, or /, representing the calculation of a numerical value from the two values. A numerical value can be modeled directly as a constant on the instance level of the ontology by instantiating Value directly. Using the concept Achievement_Value, it can also be modeled as a value that depends on values of the elements of the predecessor set of the condition (like the number of successfully passed process steps) and that has to be calculated by the framework interpreting the model (see below).
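As an illustration of how constants, terms, and comparisons fit together, the following Python sketch evaluates a Value_Condition over nested terms. The comparator names follow figure 3; everything else (the `evaluate` and `fulfilled` methods, the string keys for operators, and modeling Term next to Value rather than as its subclass) is an assumption of the sketch and not taken from the framework.

```python
import operator
from dataclasses import dataclass
from typing import Union

OPERATORS = {
    "+": operator.add,      # Addition
    "-": operator.sub,      # Subtraction
    "*": operator.mul,      # Multiplication
    "/": operator.truediv,  # Division
}

COMPARATORS = {
    "Equal": operator.eq,
    "Unequal": operator.ne,
    "Greater": operator.gt,
    "Greater_Equals": operator.ge,
    "Smaller": operator.lt,
    "Smaller_Equals": operator.le,
}


@dataclass
class Value:
    """A constant numerical value modeled on the instance level."""
    number: float

    def evaluate(self) -> float:
        return self.number


@dataclass
class Term:
    """Combines two values (constants or nested terms) with one operator."""
    left: "Operand"
    op: str
    right: "Operand"

    def evaluate(self) -> float:
        return OPERATORS[self.op](self.left.evaluate(), self.right.evaluate())


Operand = Union[Value, Term]


@dataclass
class ValueCondition:
    kind: str  # name of the instantiated specialization, e.g. "Greater"
    first: Operand
    second: Operand

    def fulfilled(self) -> bool:
        return COMPARATORS[self.kind](self.first.evaluate(), self.second.evaluate())


# "sum of workload > 8" for two hypothetical workloads of 6 and 3 credits
condition = ValueCondition("Greater", Term(Value(6), "+", Value(3)), Value(8))
print(condition.fulfilled())  # True
```

An Achievement_Value would plug into the same structure as a third kind of operand whose `evaluate` aggregates over the predecessor set.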
Figure 4. Concepts for aggregated numerical values (Achievement_Value, used by Value_Condition, with Achievement_Type, e.g., Passed and Failed, and Aggregator, e.g., Count)
The basic idea behind value conditions is to offer concepts for modeling conditions on values derived from the predecessor elements of the condition. A simple example is a condition that can only be interpreted as TRUE if at least a certain number of its predecessor elements have to be interpreted as TRUE as well. Such an examination regulation could be that a student is only allowed to attempt a certain course if he has already successfully passed, e.g., three of four specific courses. Figure 4 shows an extract of the concepts for modeling aggregated values derived from a certain attribute of the predecessor elements of a condition. For this purpose, an instance of Achievement_Value has to reference a specialized instance of Achievement_Type and a specialized instance of Aggregator (like Count, Average, Smallest, etc.) that determines the type of aggregation. The specialization of the Achievement_Type instance determines the attribute that will be aggregated. This can be a countable attribute, like the number of considered Process_Step instances that are to be interpreted as Passed or as Failed, or an attribute like the grade (Grade) that can be aggregated to an Average value.
Figure 5. An example of a Value_Condition instance (the Greater_Equals instance Min3 between the Process_Step instances A, B, C, D and the Process_Step instance E; Min3 compares the Achievement_Value PC, which references Passed:P and Count:C, with the constant Value 3)
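The condition of figure 5 (E may only be taken once at least three of A, B, C, and D are passed) can be sketched in Python as follows. The record layout with a `result` field, the function names, and the dictionary-based dispatch are assumptions made for illustration; only the concept names Passed, Failed, and Count come from the ontology.

```python
# Achievement_Type: which attribute of a predecessor is looked at.
ACHIEVEMENT_TYPES = {
    "Passed": lambda step: step["result"] == "passed",
    "Failed": lambda step: step["result"] == "failed",
}

# Aggregator: how the extracted attribute values are combined.
AGGREGATORS = {
    "Count": lambda values: sum(1 for v in values if v),
}


def achievement_value(predecessors, achievement_type, aggregator):
    """Aggregate one attribute over the predecessor set of a condition."""
    extract = ACHIEVEMENT_TYPES[achievement_type]
    return AGGREGATORS[aggregator]([extract(p) for p in predecessors])


def min3(predecessors):
    """Greater_Equals: the Achievement_Value PC must be >= the constant 3."""
    pc = achievement_value(predecessors, "Passed", "Count")
    return pc >= 3


# Hypothetical student record: result is "passed", "failed", or None (not taken).
steps = [
    {"name": "A", "result": "passed"},
    {"name": "B", "result": "passed"},
    {"name": "C", "result": "failed"},
    {"name": "D", "result": "passed"},
]
print(min3(steps))  # True, so E may be taken
```

Other aggregators such as Average or Smallest would be further entries in `AGGREGATORS`, paired with an achievement type that extracts a numerical attribute.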
The above-mentioned example of three of four courses that must be passed before further steps can be taken is modeled in figure 5: The predecessor elements of the condition Min3 are A, B, C, and D. Min3 is an instance of Greater_Equals, which is a specialization of Value_Condition (see figure 3). It has to be interpreted as TRUE if the first value PC is greater than or equal to the second value 3. The second value (3) is a simple instance of Value that assigns the constant numerical value 3. The instance that represents the first value (PC) is the interesting one: PC is an instance of Achievement_Value referencing P (an instance of Passed, and thus of a specialization of Achievement_Type, see figure 4) and the counting aggregation C. It has to be interpreted as the number of elements among A, B, C, and D that have to be interpreted as Passed. Hence, the condition Min3 is TRUE if the number of elements among A, B, C, and D that have to be interpreted as Passed is greater than or equal to three, and FALSE otherwise. Thus, E can only be taken if at least three of A, B, C, and D have been passed successfully.

Whether a condition must be interpreted as TRUE or FALSE can easily be determined by interpreting the modeled representation. To be able to do this, however, it might be necessary to determine whether certain Process_Step instances must be interpreted as TRUE or FALSE, which is also a key question for determining the current progress of a student towards his degree. An instance of Process_Step has to be interpreted as TRUE if the assigned course or examination has been passed successfully. This decision depends on the result of the corresponding course or examination and on the assigned grade scale. A result might be ‘A’, ‘B’, ‘C’, ‘D’, or ‘E’, with ‘A’ to ‘D’ interpreted as passed and ‘E’ as failed.
It is necessary to represent those grade scales because there might be rules that specify how to calculate the final grade of the degree. There might also be rules that depend on the average grade or that allow a certain number of failed examinations if there is also a certain number of “good” grades. A closer look at grades and results is taken in section 2.2.

These concepts of Value_Condition in conjunction with Achievement_Value already allow very complex conditions to be modeled. However, concepts are also needed that allow elements to be chosen from the set of predecessors of a Condition instance. An example would be a condition that requires the average result of the three best courses of a set of courses to be smaller than 2.5.
Figure 6. Concepts for dynamically choosing a subset of predecessor elements of a condition (Chooser and its specialization Choosing_Term; Boundary with B_Equal, B_Unequal, B_Smaller, and B_Greater; Boundary_Type; Type; Ordered with Ascending and Descending; Set_Operator with Set_Union, Intersection, Difference, and Complement)
To allow the dynamic choice of predecessor elements of a condition to be modeled, the ontology defines the concept Chooser, which is shown in figure 6. An instance of
Chooser can be referenced by an instance of Condition or Achievement_Value. It stands for a selection of a subset of the predecessor elements of the Condition instance itself or, in the case of Achievement_Value, of the associated Condition instance.

There are several concepts for selecting the subset. First, an instance of Chooser can reference a set of instances of Boundary, each of which represents a boundary in a specific (self-defined) dimension (e.g., grade or workload). By instantiating a specialization of Boundary, the type of selection (equals, smaller, greater, ...) can be set. In addition, a reference to an instance of Value specifies the value of the boundary. The dimension is specified by referencing an instance of the corresponding Boundary_Type (see below). The second concept is the possibility to reference up to one instance of Type. Like Process_Step, Type can define wildcards that restrict the set to those elements that match these wildcards (for the “wildcard” concept see section 2.2). The third concept allows an ordered extract of the set of predecessor elements of the Condition to be selected. This is modeled by a reference to up to one instance of a specialization of Ordered. This instance (of Descending or Ascending) references one instance of Boundary_Type that represents the dimension by which the elements should be sorted. Finally, there can be up to one reference to an instance of Value that limits the maximum number of elements of the subset chosen using the represented sorting.
Figure 7. Specializations of Achievement_Value and Boundary_Type (Type, Taken, Passed, and Failed; Workload, Rated, and Dated)
These three concepts for restricting the subset of a Condition instance can be combined. The selection operators are applied in a fixed order of priority: the Boundary concept has the highest priority, followed by Type and Ordered. In order to change this priority, Chooser instances can be combined with set operators. To do so, instances of Condition or Achievement_Value can reference up to one instance of Choosing_Term, which is a specialization of Chooser, instead of referencing an instance of Chooser directly. An instance of Choosing_Term represents the combination of two subsets, each represented by a Chooser (or even a Choosing_Term) instance, using a set operator (union, intersection, complement, ...) that is chosen by instantiating a specialization of Set_Operator, which is referenced by the Choosing_Term instance, too.

In general, specializations of Achievement_Type can be divided into two sets (see figure 7): One set consists of concepts that represent types that can only be counted (like Type, Taken, Passed, Failed). The other set contains concepts that represent numerical values that can be aggregated (like Workload, Rated, Dated). Elements of this second set are specializations of Boundary_Type (see above). Achievement_Type and Boundary_Type are intended to be subclassed in order to define special types that can be used for modeling conditions. These types have to be connected with attribute definitions of, e.g., specializations of Availability (see below) in order to define what each type stands for and to allow the generic model-interpreting framework to understand that concept.
Figure 8. Selection by defining an ordered number of elements and a boundary (Condition01 with Chooser01, the B_Greater boundary B01 on Workload:W with Value 4, and the Ascending ordering O01 on Rated:R with Value 3)
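The selection of figure 8 (all predecessors with a workload greater than four, and of those the three with the smallest grades) can be sketched in Python as below. The `Course` record, the `choose` function with its parameters, and the example courses are hypothetical; the Type filter and the set operators of Choosing_Term are left out, so the sketch only covers the Boundary step followed by the Ordered step.

```python
import operator
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

BOUNDARY_OPS = {
    "B_Equal": operator.eq,
    "B_Unequal": operator.ne,
    "B_Smaller": operator.lt,
    "B_Greater": operator.gt,
}


@dataclass
class Course:
    """Hypothetical stand-in for a predecessor element with an assigned result."""
    name: str
    workload: float
    grade: float


def choose(
    elements: Sequence[Course],
    boundaries: Sequence[Tuple[str, str, float]] = (),
    order_by: Optional[str] = None,
    descending: bool = False,
    limit: Optional[int] = None,
) -> List[Course]:
    # 1. Boundaries have the highest priority: keep elements inside all of them.
    chosen = [
        e for e in elements
        if all(BOUNDARY_OPS[kind](getattr(e, dim), value)
               for kind, dim, value in boundaries)
    ]
    # 2. (A Type wildcard filter would be applied here; omitted in this sketch.)
    # 3. Ordered: sort by the given dimension and keep at most `limit` elements.
    if order_by is not None:
        chosen.sort(key=lambda e: getattr(e, order_by), reverse=descending)
    if limit is not None:
        chosen = chosen[:limit]
    return chosen


courses = [
    Course("Algebra", 6, 2.0),
    Course("Logic", 4, 1.0),
    Course("Databases", 6, 1.3),
    Course("Networks", 8, 3.0),
    Course("Compilers", 5, 1.7),
]
best = choose(courses, boundaries=[("B_Greater", "workload", 4)],
              order_by="grade", limit=3)
print([c.name for c in best])  # ['Databases', 'Compilers', 'Algebra']
average = sum(c.grade for c in best) / len(best)
print(average < 2.5)  # True
```

Note that "Logic" has the smallest grade but is excluded first, because the boundary is applied before the ordering.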
Due to a lack of space, only a simple example of the selection of a subset is described in this paper. Figure 8 shows the representation of a selection of all predecessor elements of a condition that have a workload greater than four and, if there are any, of the three among them that have the smallest grades (see section 2.2 for the representation of grades and results). In order to model that selection, the Condition instance Condition01 references the Chooser instance Chooser01. Chooser01 itself references the Boundary instance B01, which represents a boundary that includes only those predecessors of Condition01 to which a course with a workload of more than four credits is assigned. Secondly, B01 references the Ordered instance O01, which represents the selection of the first three predecessors of Condition01 (from the first selection) to which the course results with the smallest ratings are assigned. By instantiating Condition01 as a specialization of Value_Condition, one can represent, e.g., a condition that compares the average grade of the selection with a constant value.

2.2. Results

Before the representation of grades and of rules for (re)trying to pass a course is discussed (see section 2.3), the association of courses or examinations with Process_Step instances is described. Because the types and names of courses or examinations are very heterogeneous across different academic programs, they are not intended to be modeled as concrete subclasses of instances of Process_Step. Instead, each instance of Process_Step can define a wildcard that represents the possible courses or examinations that can be assigned to the corresponding Process_Step instance. These wildcards can be descriptions of available courses or examinations including blank values. For example, it is defined that the Process_Step instance CS has to be assigned a course named “computing science” with a workload of six credits.
Then, e.g., the teacher or the term are not relevant values and can therefore be left blank. This representation has to be interpreted such that all courses named “computing science” with a workload of six credits can be used by a student to “take” the Process_Step instance CS of the process representing the academic program. Figure 9 shows the concepts for modeling possible assignments of Process_Step instances using a wildcard. Each instance of Process_Step can reference up to one Availability instance. Subclasses of this abstract concept are concepts for real types of courses or examinations of a concrete academic institution or academic program, like “module” or “master thesis”. As an example, a concept for representing modules is shown as a subclass of Availability. A Module is a specialization of Availability and has, in this example, several datatype properties like Field, Name, Workload, and Type.
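The wildcard idea can be made concrete with a short Python sketch: a course fits a Process_Step if every value that the wildcard defines explicitly is identical, while blank values are ignored. Representing a wildcard as a dictionary with `None` for blanks, the function name `matches`, and the sample course data are assumptions of this example; negative wildcards are not covered.

```python
from typing import Any, Dict, Optional


def matches(wildcard: Dict[str, Optional[Any]], course: Dict[str, Any]) -> bool:
    """True if all explicitly defined wildcard values equal the course's values."""
    return all(
        course.get(attribute) == wanted
        for attribute, wanted in wildcard.items()
        if wanted is not None  # blank values place no restriction
    )


# Wildcard of the Process_Step instance CS: name and workload are fixed,
# the remaining Module properties are left blank.
cs_wildcard = {"Name": "computing science", "Workload": 6, "Field": None, "Type": None}

# Hypothetical offered courses
offered = {"Name": "computing science", "Workload": 6, "Type": "lecture", "Term": "winter"}
too_small = {"Name": "computing science", "Workload": 4, "Type": "lecture"}

print(matches(cs_wildcard, offered))    # True
print(matches(cs_wildcard, too_small))  # False
```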
Figure 9. Wildcard and result concepts (Process_Step with definesWildcard and definesNegativeWildcard references to Availability; Module as a subclass of Availability with Field, Name, Workload, and Type; Result with Date and Grade; Grade_Scale)
In order for the framework to determine the current individual progress of a student through his academic program, each course or examination that the student has attempted should be associated with a Process_Step instance. It can then be determined whether the condition that represents the requirements for the degree has to be interpreted as TRUE, because each Process_Step instance can be interpreted as TRUE or FALSE. If no attempt at an examination or course can be associated with a Process_Step instance, this instance must be interpreted as FALSE.

In order to be able to use these self-defined attributes as, e.g., values of conditions, they have to be connected with specializations of Achievement_Type. By instantiating Availability_Concept_Connector, it can be modeled that self-defined concepts can be used by referencing that specialization of Achievement_Type or Boundary_Type (see figure 10). The already introduced concept Workload, e.g., is connected by the connector instance C1 with the corresponding attribute of Module (Workload was already used in the example of figure 8). How these connectors are referenced is shown in section 2.5.
Figure 10. Concepts for connecting self-defined concepts with Achievement_Type specializations (Availability_Concept_Connector with the instances C1 and C2; Availability with its subclass Module and the attributes Field, Name, Workload, and Type; Achievement_Type and Boundary_Type with Workload)
The concepts for representing grades are discussed next. Figure 9 shows the concepts for representing results of attempts. To represent results, each instance of Process_Step references one Result instance, each of which references a Grade_Scale instance (see below). If a student has made an attempt at an examination or a course, the corresponding information can be assigned to a Result instance of an applicable Process_Step instance. An applicable Process_Step instance is an instance to which the corresponding course or examination can be assigned (see above). The information about a result contains the course information, which must be applicable to the desired Process_Step instance. That means that all values that are explicitly defined by a wildcard of that Process_Step instance must be identical. Among other things, information about the time stamp (Date) of the attempt and the result itself (Grade) is also needed. All of this information has to be acquired by the framework that uses the ontology.

Each Process_Step instance can be associated with a different type of grade scale. For example, for some courses, students just get a certification that they have successfully participated in that course. Thus, grade scales for those Process_Step instances just need
a differentiation between “passed” and “failed”. For other courses, students might get a more differentiated grade. That’s why the ontology offers concepts to represent diverse types of grade scales.

Figure 11. Concepts for representing grade scales (Grade_Scale with Nominal_Scale, Ordinal_Scale, Cardinal_Scale, and Cardinal_Scale_Selection; Grade with Nominal_Grade, Ordinal_Grade, and Cardinal_Grade; labels include fulfilled, notFulfilled, best, worst, inferiorTo, superiorTo, allPossible, average_calculable, Value, and label)
These concepts are shown in figure 11: A Grade_Scale can be a Nominal_Scale, an Ordinal_Scale, or a Cardinal_Scale. Each of these scales references corresponding specializations of Grade, each of which can have a Label (like ‘A’). A Nominal_Scale instance references two sets of Nominal_Grade instances. One set contains the grades that stand for a successful pass; the other set contains the grades that represent a failed attempt. An Ordinal_Scale instance references a set of Ordinal_Grade instances, each referencing up to one predecessor and one successor. An Ordinal_Scale instance also references one single Ordinal_Grade instance that stands for the worst grade that still represents a successful pass.

Finally, there are concepts for two types of cardinal scales: A “normal” Cardinal_Scale instance references three instances of Cardinal_Grade that are mostly different. One stands for the best grade, one for the worst grade, and one for the worst grade that stands for a successful pass. Each of these instances references one instance of Value that represents a numerical value (for Value see figure 3). These up to three grades mark the borders that delimit the possible cardinal grades. The second concept of a cardinal scale is a specialization of Cardinal_Grade named Cardinal_Grade_Selection. In addition to these Cardinal_Grade instances, an instance of that specialization references a set of Cardinal_Grade instances that stand for all possible grades. Some examination regulations might define that lecturers may only assign marks from a set of cardinal grades and are not allowed to assign any mark between those grades. An instance of Cardinal_Grade_Selection also references a constant boolean value that represents the possibility of calculating the average grade of a set of cardinal grades. This calculation can lead to a grade that is not part of the set of allowed grades.
If the boolean value is set to TRUE, calculated average grades do not have to be part of the set of allowed grades. This boolean value has to be modeled on the instance level of the ontology as a constant. A process step can also refer to up to one postcondition (see figure 1). In that case, a result can only be assigned to such a Process_Step instance if this Condition instance would have to be interpreted as fulfilled after such an assignment. The concepts of grade scales and grades allow the framework that uses the ontology to determine whether the boolean value of a Process_Step instance has to be interpreted as TRUE or as FALSE, or as Passed or as Failed, if a result is assigned to this instance.
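How the three kinds of grade scales could decide between passed and failed is sketched below in Python. This is only an illustration under several assumptions: the `is_passed` methods and field names are invented for the example, the ordinal scale is stored as a list ordered from best to worst instead of linked predecessor and successor grades, and the cardinal example uses a 1.0 to 5.0 scale with 4.0 as the worst passing grade, which is not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class NominalScale:
    fulfilled: Set[str]      # grades that stand for a successful pass
    not_fulfilled: Set[str]  # grades that stand for a failed attempt

    def is_passed(self, label: str) -> bool:
        if label in self.fulfilled:
            return True
        if label in self.not_fulfilled:
            return False
        raise ValueError(f"{label!r} is not a grade of this scale")


@dataclass
class OrdinalScale:
    grades: List[str]   # ordered from best to worst
    worst_passing: str  # worst grade that still counts as a pass

    def is_passed(self, label: str) -> bool:
        return self.grades.index(label) <= self.grades.index(self.worst_passing)


@dataclass
class CardinalScale:
    best: float
    worst: float
    worst_passing: float

    def is_passed(self, value: float) -> bool:
        low, high = sorted((self.best, self.worst))
        if not low <= value <= high:
            raise ValueError(f"{value} lies outside the scale")
        # Passing grades lie between the best and the worst passing grade,
        # regardless of whether small or large numbers are better.
        low_pass, high_pass = sorted((self.best, self.worst_passing))
        return low_pass <= value <= high_pass


certificate = NominalScale({"passed"}, {"failed"})
letters = OrdinalScale(["A", "B", "C", "D", "E"], worst_passing="D")
numeric = CardinalScale(best=1.0, worst=5.0, worst_passing=4.0)

print(certificate.is_passed("passed"), letters.is_passed("E"), numeric.is_passed(2.3))
# True False True
```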
2.3. Internal Processes

Examination regulations define different types of rules that regulate the possibilities of retrying an attempt. Rules are conceivable that allow a retry under certain circumstances, independent of the result itself. Rules are also conceivable that define the number of reattempts after failed attempts. Rules of this kind mostly apply to a number of Process_Step instances and not just to one. Thus, it would be impractical to model those rules inside the main process of the modeled academic program individually for each Process_Step instance. The whole modeled process would swell needlessly and become very hard to follow. In addition, the model would be very redundant. That is why the ontology offers a concept to model processes inside a Process_Step.
Figure 12. Concepts for representing internal processes (Internal_Process as a subclass of Process with a pattern Process_Step; Connectable_Element; Connector with Switcher, Selector, and Extractor; Select_Best; Type_Extractor and Value_Extractor; labels include input, inputFalse, output, fromType, toType, and isFulfilled)
The main concept for modeling a Process that must be used “instead” of a Process_Step is Internal_Process, which is shown in figure 12. An Internal_Process instance represents the process that must be used instead of a Process_Step instance. It references one Process_Step instance that represents a pattern, and it references several Connector instances for connecting the Process with the pattern (see below). This has to be interpreted such that the Process instance can in principle be used instead of all Process_Step instances that are equal to or defined more concretely than the pattern itself. Finally, an Internal_Process instance must explicitly reference all Process_Step instances that are to be replaced by the Internal_Process instance (and that must be equal to or defined more concretely than the pattern). The concreteness of a Process_Step instance is determined by the way its wildcard is represented. For example, if the pattern Process_Step instance references a wildcard that defines only that there must be six credits to achieve, the pattern is in principle usable for all Process_Step instances that reference wildcards with at least the definition of exactly six credits to achieve. In that case, all other values of the wildcards are not relevant for identifying applicable Process_Step instances.

The concept Connector is used to represent two aspects: On the one hand, it has to be represented how the results of process steps inside the process that replaces a certain process step have to be mapped to the result of the replaced process step. Concretely: If the Process_Step instance PS1 is the predecessor of the Process_Step instance PS2, and PS1 must be replaced by the Process instance P, it has to be modeled what the result of PS1 would look like if PS1 is replaced by P, in order to determine whether an attempt to take PS2 is possible.
On the other hand, it might be necessary to connect the pattern Process_Step instance with Process_Step instances inside the replacing Process, so that values of the corresponding replaced Process_Step instance can be used inside the Process. This makes it possible to model
generic internal processes that can be used to replace different instances of Process_Step. If, for example, a Process_Step instance M1 that can only be associated with a course named “Mathematics 1” should be replaced by a Process instance P, all Process_Step instances of P should be exclusively associable with courses named “Mathematics 1”, too. The two directions to use connectors are explained next.

Figure 13. An (internal) process regulating the possibilities of a retrial of a failed course or examination (Attempt01, the Greater instance Condition01 comparing the Achievement_Value AV, which references Failed:F and Count:C, with the constant Value 0, and Attempt02)
A simple rule regulating the possibilities of a retrial of a failed course or examination would be that a failed course or examination can be repeated once. Modeled as a process on the instance level, it can be represented as shown in figure 13: Condition01 has to be interpreted such that Attempt02 can only be taken if the attempt at Attempt01 has failed (the number of failed elements must be greater than zero).

Next, a pattern has to be defined and, if needed, connected with elements of the process. As already mentioned, the concept Connector is used for this purpose. Figure 12 shows the Connector concept: A general instance of Connector can connect two entities of Connectable_Element, each of which references the instance to be connected by its URI. If these two entities are not Connector instances or instances of its specializations, they have to be of the same type (like Result or Process_Step) or, when an Extractor is used, of a compatible type (see below). It is also possible to concatenate several Connector instances. In this case, all elements that are not Connector instances or instances of its specializations must be of the same type (or of a compatible type), too. A general Connector simply connects two instances in a directed way. For example, it is possible to make a connection from the pattern Process_Step instance to each Process_Step instance of the replacing Process (the substitute). That means that, apart from conditions defined inside the Process, all those instances inside the Process can only be taken if the Process_Step that is replaced by the Process can be taken, too. It also means that for each of those Process_Step instances inside the Process, the same wildcards are defined as for the pattern (and therefore the same wildcards as for the replaced Process_Step).
If two entities are connected by a Connector instance, all entities referenced by the “source” then have to be treated as (virtually) referenced by the “target”, too, in place of the references of the “target” itself. There is one exception: If the attribute definition of the connected entities is optional (“0..1”), then the reference of the “target”, if it is set, still has to be used instead of the reference of the “source”. Thus, it is possible to model differentiated wildcards within the internal process. For example, all values of the wildcard of the pattern have to be used, and in addition applicable Availability instances have to reference the type “laboratory”. This concept is needed rather for representing substitutions, e.g., for representing minor subjects (see below).

To represent the final grade of a process, it is not sufficient to connect just one entity with another. For example, it might be required that the best grade or the last grade of certain trials is used as the result for the replaced Process_Step instances. A general Connector
R. Hackelbusch / CMO – An Ontological Framework
entity only allows an explicit, hard-coded connection between two entities. To allow the representation of connections that aggregate values or depend on conditions, specializations of the concept Connector are defined. These are the concepts Switcher, Selector, and Extractor (see figure 12). Switcher has, in addition to the input of Connector, a second input (inputFalse) and references an instance of Process_Element. An instance of Switcher has to be interpreted such that it connects exactly one of the two referenced input entities with the entities of the output set. Which one is connected depends on the interpretation of the boolean value of the referenced Process_Element instance. The abstract concept Selector references a set of entities as input and connects a filtered value of that set with the output entities. Specialized instances of Selector can be used to connect the Result instances of the Process_Step instances inside the Process with the Result instance of the pattern Process_Step instance. With these concepts it can be defined how the Result of a replaced Process_Step instance has to be interpreted. Examples of specializations of Selector are concepts that stand for filtering the best (Select_Best), worst (Select_Worst), or last (Select_Last) Result instance to connect it with the output entities of the connector; best and worst each depend on the associated Grade_Scale. Using Value_Extractor in conjunction with Achievement_Value, a calculation, e.g., the average value of a set of grades, can be modeled. Value_Extractor is a specialization of Extractor that references up to one URI of an attribute definition of the source (fromType) and of the target (toType) of the connection. That means that, if fromType is set, the attribute of the source is used as source instead of the source itself. The same holds for the target and toType.
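The behavior of Switcher and of the Selector specializations described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the CMO implementation: the classes GradeScale and Result, the flag lower_is_better, and all function names are hypothetical, grades are assumed to be plain numbers, and the input of select_last is assumed to be in chronological order.

```python
from dataclasses import dataclass
from typing import Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class GradeScale:
    """Hypothetical stand-in for Grade_Scale: it only states which direction is better."""
    lower_is_better: bool = True  # assumption: e.g. 1.0 is the best grade


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for a Result instance with a numeric grade."""
    name: str
    grade: float


def select_best(results: Sequence[Result], scale: GradeScale) -> Result:
    # Select_Best: pass on the best Result with respect to the grade scale.
    pick = min if scale.lower_is_better else max
    return pick(results, key=lambda r: r.grade)


def select_worst(results: Sequence[Result], scale: GradeScale) -> Result:
    # Select_Worst: the mirror image of Select_Best.
    pick = max if scale.lower_is_better else min
    return pick(results, key=lambda r: r.grade)


def select_last(results: Sequence[Result]) -> Result:
    # Select_Last: the most recent Result (input assumed chronological).
    return results[-1]


def switcher(condition: bool, input_true: T, input_false: T) -> T:
    # Switcher: connect exactly one of the two inputs, chosen by the
    # boolean value of the referenced process element.
    return input_true if condition else input_false


trials = [Result("gradeA01", 3.7), Result("gradeA02", 2.3)]
best = select_best(trials, GradeScale())
```

With the assumed scale, best is the second trial. A scale on which higher values are better would only need GradeScale(lower_is_better=False); the selectors themselves stay unchanged.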
If no aggregation or calculation is needed, the extraction can be represented by instantiating the specialization Type_Extractor of Extractor. An example is shown in figure 16 within the next section.
[Figure 14 elements: Process_Step instances Attempt01, Attempt02, and Pattern; Result instances gradeA01, gradeA02, and gradeP; GREATER:Condition01; Achievement_Value:AV; Failed:F; Count:C; Value:0; Select_Best:C2; Rated:R; relation ofType]
Figure 14. Connecting the internal process regulating the possibilities of a retrial of a failed course or examination with the pattern
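The retrial rule that figure 14 connects with the pattern can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the framework's interpreter: the class Attempt, the constant PASS_LIMIT (grades above it count as failed), and the convention that lower grades are better are all hypothetical choices made for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PASS_LIMIT = 4.0  # assumption: a grade above 4.0 counts as failed


@dataclass
class Attempt:
    """Hypothetical stand-in for a Process_Step of the internal process."""
    name: str
    grades: List[float] = field(default_factory=list)  # results recorded so far

    def failed_count(self) -> int:
        # Corresponds to counting the failed results of this step.
        return sum(1 for g in self.grades if g > PASS_LIMIT)


def can_take_attempt01(pattern_takable: bool) -> bool:
    # Attempt01 can be taken whenever the pattern itself can be taken.
    return pattern_takable


def can_take_attempt02(pattern_takable: bool, attempt01: Attempt) -> bool:
    # Condition01 (GREATER): the number of failed results of Attempt01
    # must be greater than zero, and the pattern must be takable.
    return pattern_takable and attempt01.failed_count() > 0


def pattern_result(attempt01: Attempt, attempt02: Attempt) -> Optional[float]:
    # The selector passes the best grade of both attempts on to the pattern
    # (lower is better by assumption); None means no result exists yet.
    grades = attempt01.grades + attempt02.grades
    return min(grades) if grades else None


first = Attempt("Attempt01", [5.0])   # failed first attempt
second = Attempt("Attempt02")
retry_allowed = can_take_attempt02(True, first)
```

After a failed first attempt, retry_allowed is true; once a passing grade is recorded for the second attempt, pattern_result returns that grade because it is better than the failed one.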
Figure 14 shows an exemplary connection of the internal process (already introduced in figure 13) with a pattern. In this simple example, the Process_Step instances Attempt01 and Attempt02 have the same structure as the pattern Process_Step instance Pattern. Furthermore, an attempt to take a course or an examination that is associated with Attempt01 or Attempt02 is only possible if the attempt to take Pattern itself is possible. In addition, the attempt to take Attempt02 is only possible if the attempt to take Attempt01 has failed (Condition01). The result of Pattern is the Result of Attempt01 or Attempt02 that has the best grade. If, for example,
Pattern is usable for a Process_Step instance Mathematics and all courses named “Mathematics I” that also have six credits can be associated with Mathematics, the internal process with Attempt01 and Attempt02 can be used instead of Mathematics, and Attempt01 and Attempt02 can likewise only be taken with courses named “Mathematics I” that have six credits. The result of Mathematics would then have to be interpreted as the result of Attempt01, unless an attempt to take Attempt02 with a better result has happened (in that case, it would be interpreted as the result of Attempt02). An assignment of an internal process to a process step means that connectors between the step and the pattern, and in the other direction between their results, are set implicitly. As already mentioned, academic programs regulated by examination regulations are represented on the instance level of the ontology. For each academic program or set of examination regulations, a separate set of instances of the ontology has to be modeled that represents the main process of the academic program. In addition to the main process, rules like those shown in figure 14 have to be modeled, too. Each of those rules can be associated with several Process_Step instances of the main process or even be used recursively. The requirement is that the pattern Process_Step instance of the rule can be used instead of the Process_Step that should be replaced and that the possibility of this substitution is explicitly modeled (usableFor). For each student, the framework that uses a model of an academic program has to assign the student’s achievements to the Process_Step instances to calculate the student’s progress. The fact that an internal process can be used for several Process_Step instances implies that the Process_Step instances inside those processes might each have to be assigned independently to multiple achievements.
2.4. Substitutions
For the same reason that the rules for retrials should be modeled outside the main process of the academic program, different paths through the program, like different minor subjects, should be modeled outside the main process, too. Otherwise, the main process of the academic program would become very opaque. Two types of substitutions have to be distinguished: One type is the definition of substitutions for single Process_Step instances. These substitutions can be used, e.g., for modeling the possibility to replace a Process_Step instance applicable for courses with six credits by a Process with two Process_Step instances each applicable for courses with three credits (like seminars), or for defining multiple possibilities of assigning courses to a step (like one choice out of three possibilities). The second type is the definition of substitutions for processes having more than one Process_Step instance. This type of substitution can be used, e.g., for modeling processes for minor subjects. The concepts for modeling substitutions are shown in figure 15. They are very similar to the Internal_Process concepts: A Process_Substitution instance references a Process instance that stands for the pattern (substitutes) and a Process instance that stands for the substitution (bySubstitute). The Process instances that can be substituted are referenced by usableFor. Finally, a set of Connector instances is referenced, too. Element_Substitution is another specialization of Substitution. The difference between these two concepts is that Element_Substitution defines a pattern that is a single Process_Step instance and not a Process instance, and, of course, that it is usable for single Process_Step instances instead of Process instances.
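The relationship between Substitution, Process_Substitution, and Element_Substitution described above can be pictured as a small class hierarchy. The following Python sketch is an illustration only: the ontology itself is modeled in OWL, and the Python class names, the credits attribute, and the type check in __post_init__ are assumptions introduced here to show that the two specializations differ only in the type of their pattern and of the elements they are usable for.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessStep:
    name: str
    credits: int  # hypothetical attribute, used only for this example


@dataclass
class Process:
    name: str
    steps: List[ProcessStep] = field(default_factory=list)


@dataclass
class Connector:
    source: str  # name standing in for the URI of the source element
    target: str  # name standing in for the URI of the target element


@dataclass
class Substitution:
    substitutes: object      # the pattern
    by_substitute: Process   # the replacing process (bySubstitute)
    usable_for: list         # elements the pattern may stand in for (usableFor)
    connectors: List[Connector] = field(default_factory=list)
    pattern_type = object    # class attribute, narrowed by the specializations

    def __post_init__(self) -> None:
        # The pattern and every usableFor element must have the expected type.
        for element in [self.substitutes, *self.usable_for]:
            if not isinstance(element, self.pattern_type):
                raise TypeError(
                    f"{type(self).__name__} expects {self.pattern_type.__name__}"
                )


class ProcessSubstitution(Substitution):
    pattern_type = Process       # pattern is a whole Process


class ElementSubstitution(Substitution):
    pattern_type = ProcessStep   # pattern is a single Process_Step


# One six-credit step replaced by a process with two three-credit steps.
pattern = ProcessStep("Pattern", 6)
substitute = Process("Substitute", [ProcessStep("A", 3), ProcessStep("B", 3)])
mathematics = ProcessStep("Mathematics", 6)
seminar_option = ElementSubstitution(pattern, substitute, [mathematics])
```

Passing a single step as the pattern of a ProcessSubstitution raises a TypeError in this sketch, which mirrors the statement that the two concepts differ in the kind of pattern they define.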
[Figure 15 elements: concepts Substitution, Process_Substitution, Element_Substitution, Process, Process_Element, and Connector; relations subClassOf, substitutes, bySubstitute, usableFor, uses, and has]
Figure 15. Concepts for representing substitutions
The way to model substitutes is the same as described above for modeling internal processes. But there are two substantial differences between the concepts Element_Substitution and Internal_Process: The first difference is that an Internal_Process instance can be used for several Process_Step instances, but for each Process_Step instance, at most one Internal_Process instance is applicable. An instance of Element_Substitution can be used for several Process_Step instances, too. But, in contrast to the concept Internal_Process, more than one Element_Substitution instance can be applicable for each Process_Step instance. The second difference is the main one: Using a substitution that is defined by an Element_Substitution instance for a Process_Step instance is only an option. If an instance of Internal_Process is defined for a Process_Step instance, the internal process must be used. The only exception to this rule is the use of a substitution. Then, for each Process_Step instance of the Process instance that substitutes the original Process_Step instance, it has to be checked whether Internal_Process instances are defined, and so on.
[Figure 16 elements: Element_Substitution:S; Process_Step:Pattern; Result:RPattern; Module:M1; Process:Substitute; Process_Step:A; Process_Step:B; Result:RB; Module:S; Connector:C3; attribute labels Field and Workload; values 6 and 3]
Figure 16. A simple example of an element substitution definition
An example of a simple element substitution definition is shown in figure 16. The instance of Element_Substitution S references the pattern Process_Step instance Pattern and the Process instance Substitute. The elements of Substitute are the two Process_Step instances A and B. They are not connected with each other. Thus, the only precondition to take one of the two is the fulfilled precondition to take the Process_Step element that has been substituted by Substitute (Connector C3). Using the usableFor link, the possibility to replace a Process_Step instance with Substitute can be modeled for each Process_Step instance with six credits. In principle, the pattern process step Pattern can be used instead of all Process_Step instances for which a similar or more concrete wildcard is defined. The Process_Step instances A and B can each be associated with courses with three credits. The field is in principle irrelevant (C2, an instance of the Connector specialization Type_Extractor, connects the attribute Field of the source M1 with the attribute Field of the target S). However, if a “Field” is defined by a Process_Step instance that is substituted by Substitute (the Process_Step instance is defined “more concretely” than the pattern), courses or examinations associated with A and B have to be of the same Field as the wildcard definition of the substituted process. Finally, this Element_Substitution is defined such that, in the opposite direction, the result RB of B has to be used as the result of the pattern (Connector C1).
[Figure 17 elements: Process_Step instances A, B, C, D, E, X, and Y; Condition:C1]
Figure 17. Problems while substituting a process
The substitution of processes is considerably more complicated. If a process is to be substituted by another, problems can arise because some elements of the process that is to be substituted are predecessors of elements outside this process, and these relations must be mapped by the substitution. These problems are shown by way of example in figure 17: The process on the left side should be replaced by the process on the right side. The problem is the element E outside both processes, which originally is a successor of the Process_Step instance B of the left process. To overcome this problem, each element of the pattern Process must be mapped via Connector instances by the substitute Process. The second problem is that all elements of the pattern process must be mapped to elements of the processes that should be replaced. The mapping between the process that should be substituted and the pattern process does not have to be unique, because the process that is intended to be substituted can contain more than one Process_Step instance that is equal to, or defined more concretely than, a specific Process_Step instance of the pattern process. That is why it is defined that each Process_Step instance of the pattern Process instance must reference, via a Connector instance, one Process_Step instance of each of the Process instances that are intended to be substituted (usableFor). A typical use case of instances of Substitution is the definition of minor subjects. A minor subject is a part of the academic program that is, regarding the content, mostly independent from the rest of the program, and that typically can be chosen from a set of minor subjects. A minor subject of a computing science academic program could, e.g., be economics.
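The mapping requirement stated above (every Process_Step of the pattern process must be connected to a Process_Step of each process listed under usableFor) lends itself to a simple consistency check. The following Python sketch shows one possible check under stated assumptions: elements are identified by plain names instead of URIs, connectors are given as name pairs whose direction is ignored, and the function name missing_mappings as well as the example names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def missing_mappings(
    pattern_steps: Iterable[str],
    usable_for: Dict[str, Set[str]],
    connectors: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Return, per target process, the pattern steps that have no connector
    to any step of that process. An empty dict means the mapping is complete."""
    # Index the connectors in both directions (direction ignored by assumption).
    linked: Dict[str, Set[str]] = {}
    for a, b in connectors:
        linked.setdefault(a, set()).add(b)
        linked.setdefault(b, set()).add(a)

    pattern = list(pattern_steps)
    missing: Dict[str, List[str]] = {}
    for process_name, steps in usable_for.items():
        gaps = [p for p in pattern if not (linked.get(p, set()) & steps)]
        if gaps:
            missing[process_name] = gaps
    return missing


# Hypothetical example: a pattern with two steps and one process it is usable for.
pattern = ["PatternStep1", "PatternStep2"]
targets = {"ProcessToReplace": {"Step1", "Step2"}}
complete = missing_mappings(
    pattern, targets, [("Step1", "PatternStep1"), ("Step2", "PatternStep2")]
)
incomplete = missing_mappings(pattern, targets, [("Step1", "PatternStep1")])
```

Here complete is an empty dictionary, while incomplete reports that PatternStep2 is not mapped for ProcessToReplace. The check only requires at least one connected step per process, so it accepts the non-unique mappings that the text explicitly allows.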
Another use case is the definition of rules that regulate that a number of Process_Step instances must be associated with courses or examinations from a larger set of courses or examinations (e.g., two courses out of “Mathematics I”, “Physics I”, and “Economics I”). In order to model the possibilities of choosing a minor subject in the course of an academic program, some free choice Process_Step instances can be modeled in the main process. None of these free choice Process_Step instances references a wildcard Availability instance. As a result, no course or examination can be associated with those free choice Process_Step instances. In addition, all free choice Process_Step instances that together build a minor subject part of the academic program are referenced by another Process instance that represents the process that is intended to be substituted. For each minor subject, a Substitution instance has to be defined that references a substitute process in which the concrete Process_Step instances are modeled, each referencing a wildcard Availability instance so that it can be associated with real courses or examinations. In addition to the main process of the academic program, the substitution process can define additional conditions that define whether a certain Process_Step instance can be taken.
[Figure 18 elements: main process with Process_Step instances PS1, PS2, FreeChoice1, and FreeChoice2 and And:Condition01; Process:Minor_Subject; Process:Pattern with Process_Step instances PFreeChoice1 and PFreeChoice2; Process:MSMathematics with Process_Step instances Mathematics1, Mathematics2, and Mathematics3; Process_Substitution:SMathematics; Module instances M1 to M5; Result instances gradeFC1, gradeFC2, grade1, grade2, and grade3; Select_Best:C1; Connector instances C2 to C6; Rated:R; relations usableFor, substitutes, substituteFor, and ofType]
Figure 18. A simple example of a minor subject substitution
A very simple example of a process substitution representing a minor subject is shown in figure 18: An extract of the main process describing the academic program is at the top of the figure. It contains the Process_Step instances PS1, PS2, FreeChoice1, and FreeChoice2, and the Condition instance Condition01. Additionally, the Process instance Minor_Subject also contains the Process_Step instances FreeChoice1 and FreeChoice2. Neither of these two elements of Minor_Subject references a wildcard Availability instance (unlike PS1 and PS2, which reference M1 and M2). That is why no course or examination can be associated with these elements. The Process_Substitution instance SMathematics references a pattern Process instance Pattern and a substitute Process instance MSMathematics that contains the model of the minor subject. In this example, the minor subject process contains more Process_Step instances than the pattern process. Each of the Process_Step instances Mathematics1, Mathematics2, and Mathematics3 of MSMathematics references a wildcard Availability instance defining the associable courses or examinations (M3, M4, and M5). The Process_Step instance Mathematics1 can be taken if PFreeChoice1 can be taken (Connector instance C5). In addition, C5 would also connect the wildcard of PFreeChoice1 with Mathematics1 if there were one. Mathematics2 can be taken after Mathematics1 has been successfully passed. The results of Mathematics1 and Mathematics2 are modeled such that the best value is used
and connected to the result of PFreeChoice1 by C1. Finally, the Process_Step instance Mathematics3 can be taken if PFreeChoice2 can be taken. In this example, the pattern process elements PFreeChoice1 and PFreeChoice2 are connected with FreeChoice1 and FreeChoice2 by C3 and C4. Thus, if the process substitution SMathematics is used to replace Minor_Subject, Mathematics1 can be taken if FreeChoice1 can be taken, and Mathematics3 can be taken if FreeChoice2 can be taken (that is the case if FreeChoice1 and PS1 have both been passed successfully, see Condition01). The result of FreeChoice1 would be the best of the results of Mathematics1 and Mathematics2 (selected by C1); the result of FreeChoice2 would be the result of Mathematics3.
2.5. Frame
In this final subsection, the way to bring it all together is explained. Unfortunately, there are some additional important concepts and issues that could not be addressed in this paper. These include, e.g., concepts concerning grouping conditions, regulations depending on time (for example the time span between two attempts, or the standard period of study), or the status of a student (for example “full-time” or “part-time”), expressed by freely definable annotations (again, without the need of adapting the framework). Other concepts not explained here allow the representation of rules that permit an unlimited number of attempts, or represent general rules for processes (e.g., that all associated courses of Process_Step instances of a process must differ in, or be equal in, certain values). Finally, concepts representing learned knowledge (permitted special rules, exceptions, etc.) could not be explained here, either.
[Figure 19 elements: concepts Academic_Program, Process_Step, Rule_Set, Availability_Concept_Connector, Internal_Process, and Substitution]
Figure 19. Concepts for representing the frame of an academic program
In order to bring it all together, figure 19 shows the elements on the conceptual level of the ontology that represent an academic program. To model a program, the concept Academic_Program has to be instantiated. An instance of Academic_Program references one instance of Process_Step that represents the step to get the degree. Like other instances of Process_Step, that instance can have successors. Thus, an academic program can be modeled, e.g., as a two-step program like a diploma program with an intermediate examination (“pre-degree”). Each of these directly referenced Process_Step instances should reference one instance of Internal_Process. This Internal_Process instance then stands for possible curricula of the academic program. Secondly, an instance of Academic_Program references one instance of Rule_Set. Rule_Set is a concept whose instance represents all examination regulations of an academic program. These regulations are in turn represented by Process instances that are referenced by Substitution instances and Internal_Process instances (each of which has to be referenced by that Rule_Set instance). Finally, a Rule_Set instance can reference several Availability_Concept_Connector instances. These instances
allow value conditions to “access” values, e.g., of instances of attributes of specializations of Availability (see section 2.2).
3. Conclusions and future work
Our approach is validated using the ontology language OWL-DL¹ and by representing examination regulations for several academic programs. Using JENA², we developed a framework containing a model interpreter that can interpret CMO models in connection with a set of result assignments without the need of adapting it after new Availability specializations or attributes are defined. Our approach is applicable, but the modeled ontologies representing academic programs and their examination regulations become very large and difficult to handle without explicit software support. Thus, we plan to develop a software tool that allows an abstract modeling of the process view of academic programs without the need of explicitly handling the CMO itself (e.g., via Protégé³). Conceivable concepts are the use of templates and a graphical process editor based upon the ontological concepts. Currently, the decision support system EUSTEL (introduced in [5]), which uses the concepts of the ontology and their instantiated program models, is under development. It integrates the individual data of the students and the course offerings of the corresponding academic institution. The system will be connected to the learning management system Stud.IP⁴ in order to allow students to plan their curricula at the same place where they already can check their individual results and the university calendar. In particular, using EUSTEL, students will be able to run through different settings of their individual curricula (e.g., choosing or changing certain courses, or choosing or changing the primary or minor subject).
One key element of the support that will be offered by EUSTEL is the possibility of visualizing the individual curricula and the options for continuing the studies under certain settings of the corresponding curricula. In addition, lecturers will be supported by EUSTEL in retrieving predictions of the demand for their lessons in certain terms, broken down by the examination regulations that apply to the students concerned. EUSTEL itself is intended to be part of the system described in [6]. The aim of that system is to allow a comparison of academic courses and curricula of different academic institutions.
¹ http://www.w3.org/TR/owl-features/
² http://jena.sourceforge.net/
³ http://protege.stanford.edu/
⁴ http://www.studip.de/
4. Related work
Other approaches that offer computer-assisted decision support in questions of examination regulations are, for example, those of Hanus [8] and Gumhold/Weber [4]. Hanus exclusively uses a rule-based representation of examination regulations and offers no process view on the conceptual level. In contrast, Gumhold/Weber define a process-based representation, but it has very restricted possibilities to represent special examination regulations. Neither approach provides a semantic representation of the contents, and neither supports academic boards. Other formats are, e.g., CDM, XCRI, and IMS LD (see [16]), but they are mostly tailored to very specific types of academic programs. Most of the approaches that are intended to support the target group of academic boards aim at financial aspects. For example, Goeken/Burmester [2] provide a business intelligence solution for the controlling of schools. On the other hand, there are different approaches that are intended to represent legal sources, like law, using ontological concepts. These approaches are mostly more generic and detached from a specific legal domain. One of the first ambitious attempts is the Language for Legal Discourse by McCarty [11]. Other related work of that kind is analyzed in Visser/Bench-Capon [18]. Another ambitious attempt is that of Boer/van Engers/Winkels [1], which is intended to offer ontological concepts for comparing and harmonizing legislation. Technically related is also the work of, e.g., Malone/Crowston/Herman [10], which contains concepts for an ontological representation of workflows.
References
[1] A. Boer, T. van Engers, and R. Winkels. Using Ontologies for Comparing and Harmonizing Legislation. Proceedings of the 9th international conference on Artificial intelligence and law, pages 60–69, 2003.
[2] M. Goeken and L. Burmester. Entwurf und Umsetzung einer Business-Intelligence-Lösung für ein Fakultätscontrolling. Multikonferenz Wirtschaftsinformatik (MKWI), pages 137–152, 2004.
[3] T. Gruber. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2):199–220, 1993.
[4] M. Gumhold and M. Weber. Internetbasierte Studienassistenz am Beispiel von SASy. In doIT Software-Forschungstag, Stuttgart, November 2003. Fraunhofer IRB Verlag.
[5] R. Hackelbusch. EUSTEL – Entscheidungsunterstützung im Technology Enhanced Learning. In: Christian Hochberger, Rüdiger Liskowsky (Eds.): INFORMATIK 2006 – Informatik für Menschen – Band 1, Gesellschaft für Informatik, Bonn, pages 65–69, 2006.
[6] R. Hackelbusch. Handling Heterogeneous Academic Curricula. In: A Min Tjoa, Roland R. Wagner (Eds.): Proceedings of the Seventeenth International Conference on Databases and Expert Systems Applications (DEXA 2006), 4-8 September 2006, Krakow, Poland, IEEE, IEEE Computer Society Press, Los Alamitos, Washington, Tokyo, pages 344–348, 2006.
[7] R. Hackelbusch. Ontological Representation of Examination Regulations and Academic Programs. In: Hannu Jaakkola, Yasushi Kiyoki, Takahiro Tokuda (Eds.): Proceedings of the 17th European-Japanese Conference on Information Modelling and Knowledge Bases EJC 2007, Tampere University of Technology, Pori, Juvenes, Tampere, pages 115–134, 2007.
[8] M. Hanus. An Open System to Support Web-based Learning. Proceedings of the 12th International Workshop on Functional and (Constraint) Logic Programming (WFLP 2003), 2003.
[9] Kultusministerkonferenz. Rahmenvorgaben für die Einführung von Leistungspunktsystemen und die Modularisierung von Studiengängen. 2004.
[10] T. W. Malone, K. Crowston, and G. A. Herman. Organizing Business Knowledge: The MIT Process Handbook. MIT Press, Cambridge, MA, 2003.
[11] L. T. McCarty. A Language for Legal Discourse – I. Basic Features. Proceedings of the second international conference on Artificial intelligence and law, ACM Press, pages 180–189, 1989.
[12] Object Management Group. Unified Modeling Language Specification, version 1.5. OMG document formal/03-03-01, 2003.
[13] A. Reich. Hochschulrahmengesetz. Bock Verlag, 2005.
[14] Z. Salcic and A. Smailagic. Digital Systems Design and Prototyping: Using Field Programmable Logic and Hardware Description Languages. Springer-Verlag, 2000.
[15] A.-W. Scheer. ARIS – Modellierungsmethoden, Metamodelle, Anwendungen. Springer-Verlag, 1998.
[16] J. Tattersall, J. Janssen, B. van den Berg, and R. Koper. Using IMS Learning Design to Model Curricula. Proceedings of the International Workshop in Learning Networks for Lifelong Competence Development, 2006.
[17] W. M. P. van der Aalst and T. Basten. Inheritance of Workflows: An Approach to Tackling Problems Related to Change. Theoretical Computer Science, 270:125–203, 2002.
[18] P. R. Visser and T. J. Bench-Capon. A Comparison of Four Ontologies for the Design of Legal Knowledge Systems. Artificial Intelligence and Law 6, pages 27–57, 1998.
[19] B. C. Witt. Datenschutz an Hochschulen. LegArtis Verlag, 2004.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Reusing and Composing Habitual Behavior in Video Browsing
Akio TAKASHIMA, Yuzuru TANAKA
Meme Media Laboratory, Hokkaido University, N-13, W-8, Sapporo, Hokkaido, 060-8628, Japan
Abstract. We have increasingly more opportunities to use video for our knowledge work, such as monitoring events, reflecting on physical performances, learning subject matter, or analyzing scientific experimental phenomena. In such ill-defined situations, users often create their own browsing styles to explore the videos because domain knowledge of the content is not useful, and then interact with the videos according to their browsing style. However, this kind of tacit knowledge, which is acquired through the user’s experience, has not been well managed. The goal of our research is to share and reuse tacit knowledge, and then to create new knowledge by composing it in video browsing. This paper describes the notion of reusing habitual behavior in video browsing, and presents examples of composing these behaviors to create new video browsing styles.
1. Introduction
How people interact with text and images in everyday life involves not simple, naive information-receiving processes but complex knowledge-construction processes. Videos as knowledge materials are no exception. As technology advances, we have increasingly more opportunities to use video for our knowledge work, such as monitoring events, reflecting on physical performances, learning subject matter, or analyzing scientific experimental phenomena. In such ill-defined situations, users often create their own browsing styles to explore the videos because domain knowledge of the content is not useful, and then interact with the videos according to their browsing style [1]. However, this kind of tacit knowledge, which is acquired through the user’s experience [2], has not been well managed. The goal of our research is to share and reuse tacit knowledge, and then to create new knowledge by composing it in video browsing. This paper describes the notion of reusing habitual behavior in video browsing, and gives examples of composing these behaviors to create new video browsing styles. In what follows, we first discuss the video browsing process for knowledge work. Section 3 describes our approach to associating a user’s habitual browsing behavior with video data, and then describes the system that generates the associations to identify video browsing styles. Section 4 presents how to reuse browsing behaviors and compose them.
2. Habitual Behavior in Video Browsing
2.1. Knowledge in Video Browsing Process
We consider that at least two types of knowledge exist in the video browsing process: knowledge of content and knowledge of browsing. Numerous studies that focus on content-based analysis for retrieving or summarizing video have been reported [3][4]. These studies are based on knowledge of content, that is, semantic information about the video data; for example, people tend
A. Takashima and Y. Tanaka / Reusing and Composing Habitual Behavior in Video Browsing
to pay attention to the goal scenes of a soccer game, or captions on news include the summary or location of the news topic. Thus, these approaches only work for specific purposes that are assumed beforehand (e.g., extracting goal scenes of soccer games as important scenes). In contrast, knowledge of browsing has the potential to be used for identifying various kinds of scenes. In knowledge work, people watch video more actively and may therefore have their own browsing styles.
2.2. Active Watching
By applying Adler’s notion of active reading [5] to the video viewing experience, we have developed the concept of active watching and support tools based on the concept [6]. One of the key points about active watching is that, in knowledge work, users need to manipulate video to experience it in various ways. This leads us to the assumption that people in knowledge work often possess habitual browsing behavior; in other words, they develop their own video browsing styles.
2.3. Approach
In this research, we assume a situation in which users solve problems through an active watching process in knowledge work. Our approach to supporting users is:
• Identifying associations between the user’s habitual browsing behavior and video data
• Allowing users to manage (reuse and compose) habitual browsing behavior
Several studies have been reported that use users’ behavior to estimate their preferences in the web browsing process [7]. In contrast, little has been reported for the video browsing process. Mahmood et al. modeled users’ browsing behavior using an HMM and developed a system that generates video previews without any knowledge of the video content [8]; however, the system cannot generate new browsing styles. With our approach, users can browse videos through their own or other users’ browsing styles and create new browsing styles. Details follow in the next two sections.
3. Associating Browsing Behavior with Video
3.1. Association Elements
We assume the following characteristics of video browsing for knowledge work:
• People often browse video in consistent and specific manners
• User interaction with video can be associated with low-level features of the video
To avoid including domain knowledge, we do not deal with semantic video features in identifying browsing styles. While a user’s manipulation of a video depends on the meaning of the content and on the user’s thinking, these aspects are hard to observe. In this work, we tried to estimate associations between video features and user manipulations (see [9] for details). We treat low-level features (e.g., color distribution, optical flow, and sound level) as the features that are associated with user manipulation. User manipulation here means changing the playing speed (e.g., fast-forwarding, rewinding, and slow playing). Identifying associations from these aspects, which can be easily observed, means that the user can capture tacit knowledge without any domain knowledge of the content of the video.
A. Takashima and Y. Tanaka / Reusing and Composing Habitual Behavior in Video Browsing
Fig. 1. The User Experience Reproducer and the overview of the process
3.2. The User Experience Reproducer
To extract associations and reproduce a browsing style on other videos, we have developed a system called the User Experience Reproducer. It consists of the Association Extractor and the Behavior Applier (Fig. 1).
The Association Extractor for generating a classifier
The Association Extractor identifies relationships between low-level features of videos and the user's manipulation of those videos. It takes as input several training videos and the browsing logs of a particular user on these videos. To record the browsing logs, the user browses the training videos using a simple video browser,
Fig. 2. Examples of the low-level video features
which enables the user to control the playing speed. We categorized the patterns of changing playing speed into three types, based on patterns frequently seen in informal user observation [6]. The three types are skip, re-examine, and others. Skip means browsing the video at a speed higher than the normal playing speed (1.0x). Re-examine means playing a video forward at less than normal speed after rewinding. All other browsing manipulations are categorized as others. The browsing logs consist of pairs of a video frame number and the categorized manipulation with which the user actually played that frame.
As low-level features, the system analyzes more than sixty properties of each frame, such as color dispersion, mean color value, number of moving objects, optical flow, and sound frequency. Fig. 2 shows examples of the low-level features. Fig. 2(a) shows statistical data on the representative color in one frame. Fig. 2(b) shows optical flow data for moving objects in a video (see [9] for details). The system never recognizes the semantics of the content (e.g., out-of-play scenes or shooting scenes in a soccer game). Using the browsing logs and the low-level features, the Association Extractor then generates a classifier that determines the speed at which each frame of a video should be played. We employ the C4.5 decision tree algorithm to build the classifier [10], using the WEKA data mining software [11].
The Behavior Applier for scheduling and playing a target video
The Behavior Applier automatically plays the frames of a target video at the speeds given by the play schedule. The play schedule is determined by applying the classifier to the low-level features of the target video. It is represented as a list of associations between each video frame number (i.e., #1, #2, ...) in the target video and a behavior (i.e., Skip, Re-examine, or Others).
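The categorization of manipulations and the frame-to-behavior list form described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the log format (a time-ordered sequence of frame number and playing speed, with negative speeds for rewinding), the function name `label_log`, and the rule that slow playback counts as re-examine until normal or faster playback resumes are all hypothetical.

```python
SKIP, REEXAMINE, OTHERS = "skip", "re-examine", "others"


def label_log(events):
    """Turn a raw speed log into (frame number, behavior) pairs.

    events: iterable of (frame_number, speed) in the order the user played
    the frames; a negative speed means rewinding (assumed log format).
    """
    labels = {}
    after_rewind = False
    for frame, speed in events:
        if speed < 0:
            # Rewinding itself labels no frame; it only arms re-examine.
            after_rewind = True
            continue
        if speed > 1.0:
            label, after_rewind = SKIP, False
        elif 0 < speed < 1.0 and after_rewind:
            label = REEXAMINE
        else:
            label = OTHERS
            if speed == 1.0:
                after_rewind = False
        # A frame played more than once keeps its most recent label.
        labels[frame] = label
    return sorted(labels.items())


# Made-up log: normal play, fast-forward, normal play, rewind, slow replay.
log = [(1, 1.0), (2, 1.0), (3, 3.0), (4, 3.0), (5, 1.0), (6, 1.0),
       (6, -2.0), (5, -2.0), (5, 0.5), (6, 0.5), (7, 1.0)]
pairs = label_log(log)
```

The resulting list has the same shape as a play schedule: one entry per frame number, each associated with one of the three behaviors. In the system described here, the labels for a target video come from the classifier rather than from a log.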
The Behavior Applier smooths outliers in the frame sequence, that is, isolated frames that should be played at the same speed as the frames before and after them. It can also visualize the behavior applied to each frame of the whole video. In the current implementation, the three types of browsing behavior (skip, re-examine, others) are played at pre-defined speeds (3.0x, 0.5x, and 1.0x, respectively). A preliminary user study on reusing habitual behavior has been conducted and is described in Section 4.1.
4. Managing Habitual Browsing Behavior
As described above, managing users’ browsing styles offers many possibilities for knowledge work. In this section, we describe an evaluation of reusing habitual behavior and a way to compose these behaviors.
4.1. Browsing Style Reuse
Using the User Experience Reproducer, we conducted a preliminary user study to extract and reuse a user’s browsing style. In this study, we used ten 5-minute soccer game videos for training a classifier, and two 5-minute soccer game videos to which the browsing style was applied for automatic playing. We observed two subjects, so the process described above was conducted twice. In the training phase, SubjectA appeared to re-examine (rewind, then play at less than normal speed) particular scenes that showed players gathering in front of the goalpost or a player kicking the ball toward the goal. In addition, he skipped out-of-play scenes and in-play scenes that did not show the goalpost. SubjectB tended to skip out-of-play scenes of the games.
Qualitative Analysis in Reusing
Through the Behavior Applier, each subject watched the two target videos played automatically in accordance with their own browsing style. The trial of SubjectA played nearly 80% of the scenes that were important to SubjectA at a slower speed. The trial of SubjectB skipped nearly 70% of the out-of-play scenes. These percentages were calculated by manually measuring the duration of these scenes. In informal interviews, both subjects said they were satisfied with the automatically played target videos. It is not easy to say whether the browsing behavior applied by the system is a perfect fit for the user's particular needs. However, the user study suggests that it is possible to reuse tacit knowledge in video browsing without any domain knowledge of the content.
4.2. Browsing Style Composition
We associated a user’s manipulation with video features; in other words, we decomposed a browsing style into rules. We then tried to compose these rules to create other browsing styles. Composition is executed after the play schedules have been generated (Fig. 3). As described above, a play schedule is a list of associations between each video frame in a video and a user’s behavior. The Behavior Applier generates one play schedule from one classifier; thus, in the case of Fig. 3, two classifiers are built in order to generate two play schedules. To compose browsing styles, we compose these play schedules with several operations. We defined a few simple operations: intersection $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$, complement $A \setminus B = \{x \mid x \in A \text{ and } x \notin B\}$, and union $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$, where A and B are sets of video frames that are associated with
Fig. 3. Composing play schedules
specific behavior such as fast-forwarding or re-examining, and x is a specific video frame. Some examples made by using these operations are as follows:
$S_{USER1} \cap S_{USER2} \cap S_{YOU}$ (ex.1)
$(S_{USER1} \cap S_{USER2}) \setminus S_{YOU}$ (ex.2)
$(S_{USER1} \cap S_{USER2}) \cup S_{YOU}$ (ex.3)
Fig. 4 shows these examples visually. The upper three belts in Fig. 4 indicate the behavior estimated for a video by the User Experience Reproducer based on three persons’ browsing styles. For instance, the first belt shows that User1 may browse the video at normal speed first, then skip (fast-forward) the second part, re-examine the next scenes, skip the fourth part, and then browse the last part normally. The second and third belts are described in the same manner. The lower three belts correspond to the three examples of composing browsing behavior. The details are as follows:
• ex.1 describes the intersection of the three browsing styles. In this case, the system estimates that all three persons will skip the earlier scenes. This operation detects manipulations that are meaningful for all users. In other words, the operation works like a social filtering system if the number of users is large enough.
• ex.2 shows the complement of $S_{YOU}$ in the intersection of $S_{USER1}$ and $S_{USER2}$. This operation finds habitual behavior of other users that you do not tend to select. It can be regarded as an active help system [12].
• ex.3 describes the union of $S_{YOU}$ and the intersection of $S_{USER1}$ and $S_{USER2}$. You can experience your own habitual behavior while taking other users' habitual behavior into consideration. (Note that this union operation needs a priority to be defined to avoid conflicts between behaviors.)
These examples show that simple compositions of associations between video frames and browsing behavior can create other meaningful browsing styles.
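The three compositions can be sketched with ordinary set operations on play schedules. The following Python sketch is illustrative only: the compact string encoding of a schedule (S for skip, R for re-examine, O for others), the helper names, the example schedules, and in particular the priority order that resolves conflicts in a union (re-examine before skip) are assumptions, not part of the described system.

```python
SKIP, REEXAMINE, OTHERS = "skip", "re-examine", "others"
CODES = {"S": SKIP, "R": REEXAMINE, "O": OTHERS}


def schedule_from(codes):
    """Build a play schedule {frame number: behavior} from a compact string."""
    return {i: CODES[c] for i, c in enumerate(codes, start=1)}


def frames_with(schedule, behavior):
    """Set of frames that a schedule associates with one behavior."""
    return {frame for frame, b in schedule.items() if b == behavior}


def compose(schedules, combine, priority=(REEXAMINE, SKIP)):
    """Apply a set expression per behavior, then rebuild one schedule.

    combine receives one frame set per input schedule. When a frame ends
    up in more than one composed set, the first behavior in `priority`
    wins (an assumed conflict rule). Unmatched frames become "others".
    """
    composed = {b: combine(*(frames_with(s, b) for s in schedules))
                for b in priority}
    n_frames = len(schedules[0])  # all schedules cover the same frames
    return {frame: next((b for b in priority if frame in composed[b]), OTHERS)
            for frame in range(1, n_frames + 1)}


def render(schedule):
    """Inverse of schedule_from, for compact display."""
    initial = {behavior: code for code, behavior in CODES.items()}
    return "".join(initial[schedule[f]] for f in sorted(schedule))


# Made-up schedules for an eight-frame video.
user1 = schedule_from("OSSRRSSO")
user2 = schedule_from("OSSSRRSO")
you = schedule_from("SSRORROO")
styles = [user1, user2, you]

ex1 = compose(styles, lambda u1, u2, me: u1 & u2 & me)    # intersection
ex2 = compose(styles, lambda u1, u2, me: (u1 & u2) - me)  # complement
ex3 = compose(styles, lambda u1, u2, me: (u1 & u2) | me)  # union
```

With these made-up schedules, frame 3 falls into both the composed skip set and the composed re-examine set of ex.3, so the priority order decides its behavior; this is the kind of conflict that the note on the union operation refers to.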
Fig. 4. Examples of estimated browsing behaviors and their compositions
Qualitative Analysis in Composing
We conducted another user study on composition, using the same classifiers described in Section 4.1. SubjectA and SubjectB watched two target videos. Each video was played automatically in accordance with the following three types of composition:
$S_{SubA} \cap S_{SubB}$ (cmp.1)
$S_{SubB} \setminus S_{SubA}$ or $S_{SubA} \setminus S_{SubB}$ (cmp.2)
$S_{SubA} \cup S_{SubB}$ (cmp.3)
As described in Section 4.1, SubjectA tended to re-examine exciting scenes near the goalposts and skip out-of-play scenes. SubjectB tended to skip out-of-play scenes of the games. As a result of applying cmp.1, the system fast-forwarded the scenes that were expected to be skipped by both subjects and played the other scenes at normal speed. This automatic play was almost the same as the browsing style of SubjectB. In the interview after the browsing, SubjectA said, “Although some scenes (in-play scenes which he wanted to skip) are played at the normal speed, the browsing style is acceptable”. SubjectB regarded this composition as almost the same as his own browsing style. In applying cmp.2, the result of $S_{SubB} \setminus S_{SubA}$ was shown to SubjectA, and vice versa. As a result, each subject experienced the browsing behaviors that appeared only in the other's style. SubjectB did not like skipping in-play scenes that do not show the goalpost. On the other hand, he became interested in the scenes that were exciting for SubjectA, which were played at a slower speed. SubjectA did not like skipping the replay scenes, which he usually browses at normal speed. Both subjects were irritated by the out-of-play scenes that had not been skipped by the system. In applying cmp.3, SubjectA seemed to find the browsing style acceptable because the automatic play was similar to his own browsing style. SubjectB said, “It looks like a digest video of a soccer game.”
5. Discussion / Future Work
This paper describes the notion of reusing habitual behavior in video browsing, and presents examples of composing these behaviors to create new video browsing styles.
Findings from the user studies
Although our user studies had only two subjects, the results suggest that it is possible to reuse tacit knowledge in video browsing without any domain knowledge of the content. In the composition trial, we found several positive aspects, such as subjects accepting the other user’s browsing style. However, there were also negative aspects caused by unexpected browsing behaviors being forced on the subjects. Reproducing a person’s browsing style for that same person seems to work well; in contrast, reproducing a person’s browsing style for others requires some mechanism that reduces the cost of applying unfamiliar browsing styles. Such mechanisms should allow users to grasp an overview of browsing styles when they reuse or compose them. In our current implementation, users must choose the browsing style before they encounter a video in order to play it automatically. This is another reason why new mechanisms are required. Specific interaction patterns based on user groups might be found if we conduct user studies with more subjects. Conducting more user studies is our future work.
Habitual browsing behavior
In contrast with research that employs content-based domain knowledge, little has been reported on composing tacit knowledge, such as video browsing style, in knowledge work. The fact that video data inherently has a temporal aspect might make users browse video more passively than other media such as text or images. On the other hand, the fact that we have more and more opportunities to use video in our knowledge work might make us browse video more actively. We believe that novices will become able to operate video freely and develop their own browsing styles. To support these users, we need to clarify not only the semantic understanding of video content, but also the habitual behavior of each user.
Timing of composing
Although in this paper we described using play schedules for composition, there are other options for when composition is executed. We plan to compose the browsing logs or classifiers of each user as other types of composition. The timing of composition may lead to much better results.
Composing tacit knowledge
Social navigation techniques, which support a user's activity by using information about the past, are said to be useful [13]. The contribution of our study is not only to present the notion of reusing information about the past but also to give examples of creating new browsing styles. We presented three types of composing manipulation in this paper, and composing manipulation has further potential to generate new and meaningful browsing styles. Refining the composing manipulation is part of our future work.
References
[1] Y. Yamamoto, K. Nakakoji, A. Takashima, The Landscape of Time-based Visual Presentation Primitives for Richer Video Experience, Human-Computer Interaction: INTERACT 2005, M.F. Costabile, F. Paterno (Eds.), Rome, Italy, Springer, pp. 795-808, September 2005.
[2] M. Polanyi, Tacit Dimension, Peter Smith Pub Inc., 1983.
[3] Y. Nakamura, T. Kanade, Semantic analysis for video contents extraction - spotting by association in news video, Proceedings of the fifth ACM international conference on Multimedia, Seattle, pp. 393-401, 1997.
[4] A. Ekin, A.M. Tekalp, R. Mehrotra, Automatic soccer video analysis and summarization, IEEE Trans. on Image Processing, vol. 12, no. 7, pp. 796-807, July 2003.
[5] M.J. Adler, C.V. Doren, How to Read a Book, Simon and Schuster, New York, 1972.
[6] A. Takashima, Y. Yamamoto, K. Nakakoji, A Model and a Tool for Active Watching: Knowledge Construction through Interacting with Video, Proceedings of INTERACTION: Systems, Practice and Theory, Sydney, Australia, pp. 331-358, 2004.
[7] Y. Seo, B. Zhang, Learning user's preferences by analyzing web-browsing behaviors, Proceedings of International Conference on Autonomous Agents, pp. 381-387, 2000.
[8] T. Syeda-Mahmood, D. Ponceleon, Learning video browsing behavior and its application in the generation of video previews, Proceedings of the ninth ACM international conference on Multimedia, pp. 119-128, 2001.
[9] A. Takashima, Sharing Video Browsing Style by Associating Browsing Behavior with Low-level Features of Videos, Proceedings of the HCI International Conference (HCII), Beijing, July 2007 (in print).
[10] J.R. Quinlan, C4.5: Programs for machine learning, Morgan Kaufmann Publishers, CA, 1993.
[11] WEKA: http://www.cs.waikato.ac.nz/ml/weka/
[12] G. Fischer, A.C. Lemke, T. Schwab, Knowledge-Based Help Systems, in L. Borman & B. Curtis (Eds.), Proceedings of CHI'85 Conference on Human Factors in Computing Systems, ACM, New York, pp. 161-167, 1985.
[13] A. Dieberger, P. Dourish, K. Höök, P. Resnick, A. Wexelblat, Social navigation: Techniques for building more usable systems, interactions, Vol. 7, No. 6, pp. 36-45, 2000.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
[Figure: Expression example of a one-time token. Labels: NSP, Command, Identifier, Name, TokenEntity]
Nijigen codes mean matrix codes in Japanese.
M. Tanaka et al. / A Personal Information Protection Model for Web Applications
The PIMSP notifies the result to the UA.
[Figure: Process flow of STEP. Labels: UA, NSP, PIMSP, UA id, NSP id, T, Linking, Issued Token, Result]
[Figure: Process flow of STEP. Labels: UA, T, Matching, Verification, Result]
[...] PC. In addition, personal information management [...] service providers can be reduced because the personal information is encrypted at users’ [...]
Manufacturing Roadmaps as Information Modelling Tools in the Knowledge Economy
Augusta Maria PACI
EPPLab ITIA - CNR, Via dei Taurini 19, 00185 Rome, Italy
Abstract. Roadmaps are the authoritative medium-high tech viewpoints for the competitiveness and sustainability of industrial and public organizations. The paper provides an overview of possibilities for using roadmaps as virtual tools which, coupled with knowledge management, help support industrial innovation and new industry processes. In the age of digitization, virtual roadmapping as a full participatory process supports new conceptual modelling of the manufacturing domain and at the same time enables knowledge workers to share and elaborate innovative concepts. Hence roadmaps enable the design of new collaborative knowledge management environments. Over a medium time horizon, considering the global dimension of manufacturing, roadmaps can be reference tools for cooperation agreements in bilateral and multilateral projects. The paper provides a case study of roadmaps for advanced manufacturing and an example of the open innovation model.
1. Roadmaps in the Age of Digitization
In the last ten years, public research organizations and manufacturing industries have developed roadmaps to foster industrial innovation and new industry. Roadmapping is often connected with foresight studies [1,2,3,5]. In the industrial economy, firms exploit technology roadmaps without previously contributing to their development. Conversely, in the emerging European knowledge economy, technology roadmapping means a full participatory process of both research organizations and firms to identify R&D solutions to industrial targets for innovation. This process is a continuous cycle that assesses short-, medium- and long-term development plans through feedback. This paper explores the use of roadmaps as tools for investigating industrial innovation through research-based innovation. In the age of digitization [6], virtual roadmaps go beyond the description of time-scaled plans and priorities, which are the main result of traditional paper roadmaps¹ [3,4,14].
2. Virtual Roadmaps for Knowledge Management
Roadmaps are a new type of predictive tool that facilitates new communication and organizational learning processes “on doing things right first time”. Roadmapping, as a full participatory process, supports the transformation of an organization's culture, enabling inflows and outflows of knowledge from individuals and groups to the organization level and the creation of new knowledge. Virtual roadmaps are tools that allow people to share knowledge and communicate through a common language in any type of organization. As such, these tools are essential in the global market for new solutions
¹ According to IMTI Roadmapping methodology: “Roadmaps define the desired future vision, identify the goals required to achieve the vision, and define requirements and tasks to achieve the goals. This approach serves to develop programmes that involve the organization to respond to challenges.”
A.M. Paci / Manufacturing Roadmaps as Information Modelling Tools in the Knowledge Economy
responding to the needs of virtual and networked enterprises and research institutions. In the industrial technology domain, the new needs are: definition of the knowledge domains and multiple tiers, consensus building among relevant communities, influence on decision makers, policy convergence among different stakeholders, prioritization of interventions, time horizon of targets, investment planning, dissemination of best practices and flagship projects. Digital roadmaps are becoming a new means of communication in any type of organization and community: industry, research, education, public institutions, national and local government, market. In particular, virtual and networked enterprises intensify collaboration with partners, suppliers, advisors and others towards the future. Roadmapping, as a full participatory process, facilitates information modelling according to the SECI model [7]:
• socialization: sharing tacit knowledge that is built upon existing experiences
• externalization: articulating knowledge and developing the organization's “intellectual capital” through dialogue
• combination: making explicit the borders of expectations in terms of “competitive advantage”
• internalization: participating in a learning process.
In this way, new roadmaps support the development of collaborative knowledge management and dynamically manage knowledge sharing. Roadmaps therefore become a “social process” and a “collective framing” that encapsulate the intangible elements transmitting the tacit and explicit elements of organizational knowledge; they bridge the dualism currently existing in ICT-based knowledge management between the “organization of knowledge” and the “business strategies”. They also provide expectations of market impact and support the increasing role of performance measurement and scientific management. Virtual roadmaps expand the field of knowledge management and serve as a meta-description of future industrial and technology areas.
Looking at roadmaps as new tools implies that they:
• are authoritative medium-high tech viewpoints, detailing future vision, goals, requirements and targets filtered through a complex participatory process, for the competitiveness and sustainability of industrial and public organizations;
• represent a hierarchy of complex contents in macro-areas, sub-areas, detailed topics and detailed technologies, which are relevant for forward-looking approaches to foster and sustain the organization’s new products and services;
• communicate through different styles and channels, from complex schemas for technical analysis to simple presentations for effective and immediate impact;
• are agents for the diffusion of priorities for innovation among people, bridging high-level business decisions and practical high-tech work, and fostering the application of research results to practice in the knowledge economy innovation chain;
• describe and disseminate concepts and goals in a uniform language;
• contribute to the collection of data and to value creation that can be measured within organizations in terms of efficiency and market impact.
According to the specific industry’s high-tech development plan and strategy, any roadmap represents the targets of the organization’s medium- and long-term strategy: quantitative values, time horizons, high-level requirements, market diffusion and impact, resource allocation, supporting activities, infrastructures, facilities and best practices inside and outside the organization.
These new roadmaps predict how to dynamically create the conditions for intelligent business in industrial domains. These new tools can leverage the Seven Knowledge levers², facilitating the knowledge creation process, handling daily situations within turbulent environments, and managing the human dimension and sense-making interpretations.
3. Case study on advanced manufacturing
Referring to the above-mentioned main principles, the manufacturing high-tech domains have been studied as a case study of manufacturing roadmaps. The most recent and comprehensive new roadmapping concept in manufacturing technologies is the authoritative high-level representation of the five pillars of the ManuFuture industrial transformation reference model [8,9]. These macro-domains concern trans-sectoral RTD areas that require solutions based on key and emerging technologies for new production systems and business models. The ManuFuture roadmaps aim to achieve European industrial innovation for high added value products and services, providing time scales and prioritized topics (Fig. 1) [8].
[Figure 1 diagram. Labels: Transformation of Industry; Transformation of R&D; Agenda objectives; Drivers (Competition, Rapid Technology Renewal, Eco-sustainability, Socio-economic Environment, Regulation, Values - public acceptability); Goals (Make/Delivery HVA Products-Services, Innovating Production, Innovating Research); New Added Value Products and Services; New Business Models; Advanced Industrial Engineering; Emerging Manufacturing Sciences and Technologies; Infrastructures and Education; Time scale (Continuous, Short-Medium Term, Medium Term, Long Term, Long Term)]
Fig. 1: ManuFuture industrial transformation reference model (source: ManuFuture Strategic Research Agenda, September 2006)
The stakeholders who contributed to the ManuFuture Platform have set out plans to use these trans-sectoral technology macro-domains and the corresponding roadmaps. Many other strategic sources, such as the platforms’ Strategic Research Agendas, roadmaps and studies, have been analysed to set the targets of knowledge-based industrial development for European manufacturing. Within the European manufacturing community, wide consultations were carried out with industrial and research bodies to gain relevant contributions. Later, after intensive work, further roadmapping for ManuFuture [10,12,13] developed specific trans-sectoral technology roadmaps that were presented at the ManuFuture Conference in Tampere for further validation and comments (http://manufuture2006.fi/) [11].
4. Towards a Collaborative Knowledge Management
The roadmapping process in virtual environments supports the design of new collaborative knowledge management, consolidating, exploiting and maintaining the knowledge produced in the process. This new collaborative knowledge management may exploit the SECI modalities, fostering:
² The Seven Knowledge levers are customer knowledge, stakeholder relationships, business environment insights, organizational memory, knowledge in processes, knowledge in products and services, and knowledge in people.
• the combination modality, enabling knowledge conversion through two-way interaction between high-level management of public and private organizations, who aim to develop technology policy to win the market competition, and people who learned in the process which knowledge and target goals are envisaged by the organization. This modality consolidates the transfer of roadmap concepts among knowledge workers (individuals and groups) through social interactions based on ICT.
• the internalization modality, enabling knowledge workers to internalize and practice the roadmapping concepts. This avoids passive acceptance by knowledge workers, and triggers a participative and continuous validation process with verification procedures and control measures.
• the socialization modality, enabling widespread understanding and use of the roadmaps as agents for the diffusion of culture and innovation.
5. Open Model for Collaborative Knowledge Management
The new collaborative knowledge management, which integrates the roadmapping process, provides an example of the Open innovation model [14]. In this example (Fig. 2), input from roadmaps provides specific elements for innovation, while information modelling provides specific elements for knowledge management. The combination of roadmaps and information modelling, operating a convergence between prediction and responsiveness, permits the creation of a new collaborative environment.
Collaborative knowledge management (source: An Open Innovation Paradigm, CHESBROUGH, 2006; elaboration: EPPLab, 2006)
[Figure 2 diagram. Labels: Business/Technologies Policy Strategies; Roadmaps; Value Creation; Performance Measurement; Information Modeling; Knowledge Creation; Knowledge Workers; Prediction; Responsiveness]
Fig. 2: Collaborative knowledge management based on Open model
Therefore expectations and future goals are integrated with inflows and outflows through a continuous participatory process. This process responds to a fast-changing environment and to the need to align people's capacity toward innovation.
6. Global Dimension
Over a medium time horizon, considering the global dimension of manufacturing and innovation strategies, the new roadmaps can be applied as virtual reference tools within cooperation agreements and bilateral and multilateral projects. In the knowledge economy, these roadmaps will represent the high-tech manufacturing language. Like superhighways, new roadmaps are the communication infrastructure for industrial innovation and new industry. They will allow the info-mobility of knowledge workers along complex high-tech concepts and innovation projects.
In this spirit, Japanese public research organizations state: “By combining the knowledge of industry, government, and academic fields, METI established our country's first "Strategic Technology Roadmap" in 20 different fields. Strategic Technology Roadmap indicates the technical goals and demands of products/services necessary for the production of new industry. Hereafter, Strategic Technology Roadmap will be offered to industry, government, and academic fields to promote cooperation of one another, and also to be used for managing METI research & development.” [15]
7. Conclusion
In the knowledge economy, new roadmaps as reference tools support new ICT-based knowledge management, playing a role in achieving successful results in industrial innovation and new industry. They contribute to the concept design of collaborative environments for global knowledge creation and sharing. In the age of digitization, roadmaps make it possible to optimize learning and knowledge transfer, allowing knowledge workers to cooperate remotely around common and strategic innovation goals.
References
[1] DREHER C., ManVis Main Report, Fraunhofer-ISI, 2005.
[2] FUTMAN PROJECT, The future of manufacturing in Europe 2015-2020: Main report, 2004.
[3] IMTR/IMTI, Roadmapping Methodology, http://www.imti21.org/resources/docs/roadmapping.htm.
[4] INSTITUTE OF MANUFACTURING, Informan EUREKA Project, 2003.
[5] MANUFUTURE HIGH LEVEL GROUP, ManuFuture A Vision for 2020. Assuring the future of manufacturing in Europe, Report of the High Level Group, EU DG Industrial research, 2004.
[6] MACKENZIE OWEN J., The scientific article in the age of digitization, Springer, 2007.
[7] NONAKA I., The Knowledge-Creating Company, in: Harvard Business Review on Knowledge Management, Harvard Business School Press, pp. 21-45, 1998.
[8] EUROPEAN COMMISSION MANUFUTURE PLATFORM, ManuFuture Strategic Research Agenda: September 2006, ISBN 92-79-01026-3 (www.manufuture.org).
[9] TOKAMANIS C., Improve the competitiveness of European Industry, ManuFuture Conference, Tampere, Oct. 2006, http://manufuture2006.fi/presentations/.
[10] PACI A.M., A collaborative industry-research frame for roadmapping, in: Production Engineering Conference, Wroclaw, 7-8 December, pp. 5-10, 2006.
[11] WESTKAEMPER E., Manufuture RTD Roadmaps: from vision to implementation, ManuFuture Conference, Tampere, Oct. 2006, http://manufuture2006.fi/presentations/.
[12] JOVANE F., PACI A.M., et al., Area Tecnologie di gestione e produzione sostenibile, in: II Rapporto sulle priorità nazionali della ricerca scientifica e tecnologica, Fondazione Rosselli (ed.), Milano, Guerini, pp. 310-349, 2005.
[13] WILLIAMS D., Road mapping - A personal perspective, in Seminar: “Supporto alla ricerca in collaborazione con l’industria nell’area Sistemi di Produzione: strumenti e metodologie”, CNR, Rome, 28 Nov. 2006.
[14] CHESBROUGH H., Open innovation researching: a new paradigm, Oxford University Press, 2006, http://www.openinnovation.eu/.
[15] NEDO (New Energy and Industrial Technology Development Organization), Roadmap, http://www.nedo.go.jp/roadmap/index.html.
Technical support provided by dr. Cecilia Lalle (EPPLab ITIA - CNR)
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
Metadata Extraction and Retrieval Methods for Taste-impressions with Bio-sensing Technology
Hanako Kariya† Yasushi Kiyoki††
†Graduate School of Media and Governance, Keio University
††Faculty of Environmental Information, Keio University
5322 Endoh, Fujisawa, Kanagawa 252-8520, Japan
{karihana,kiyoki}@sfc.keio.ac.jp
Abstract. In this paper, we present a new generation of food information retrieval by Taste-impression, equipped with bio-sensing technology. The aim of our method is to realize a computing environment for one of the basic human perceptions that has received little attention so far: the sense of taste. Our method extracts Taste-impression metadata automatically from the outputs of a taste sensor, according to 1) the user's desired abstraction level of the terms expressing Taste-impression and 2) characteristic features of foods, such as a type, a nationality, and a theme. We call these characteristics of foods and drinks the “Taste Scope”. By extracting Taste-Scope-dependent metadata with bio-sensing technology, our method transforms sensor outputs expressing primitive taste elements into meaningful Taste-impression metadata and computes correlations between target foods (or drinks) and a query described as a Taste-impression. Users can intuitively search for any kind of information regarding foods and drinks on the basis of abstract Taste-impression preferences at their desired granularity. We clarify the feasibility and effectiveness of our method by showing several experimental results.
1 Background Issues
In recent years, many recipe and drink databases have become accessible through global computer networks. These information resources are rapidly added and deleted in step with the dynamic changes in the food industry. There are currently two significant issues in the way food consumers and creators (the target users of our retrieval method) search for foods and drinks.
The first issue lies on the consumer side. Consumers do not have any attractive way to find their favorite food products on the basis of their taste preferences. Existing food data retrieval systems help users find their favorite products or recipes merely by providing searches on product names and brands. Users therefore have to rely on their own experience to reach foods with their favorite tastes among numerous, rapidly replaced data.
The second issue lies on the creator side. For example, food developers struggle daily to design new products. In order to design reputable and sustainable products, food developers need to understand the food or drink images that consumers find desirable. This can be achieved not by existing means such as advertising, but only by the taste design itself. Furthermore, product development needs to be performed for various foods and drinks concurrently within a limited period of time. Search environments that integrate anonymous food data according to the developer's objective taste design vision are therefore essential for developing competitive food products.
To solve these difficulties, an information retrieval system for “Impression” based on the user's taste preferences should be of clear benefit to the food business as a whole and to consumers. In this paper, we propose metadata extraction and retrieval methods for Taste-impressions with bio-sensing technology, focusing on the metadata extraction method.
Our impression-based retrieval is realized by a query expressed verbally as impression keywords
H. Kariya and Y. Kiyoki / Metadata Extraction and Retrieval Methods for Taste-Impressions
Figure 1: Target User Categories with Query Expressions of Our Method
such as “rich” or “fresh”, or as a taste pattern, as shown in Figure 6. Our metadata extraction method automatically generates Taste-impression metadata for target data according to the features of each food or drink, such as a type, a nationality, and a theme, by applying the sensor outputs retrieved from the taste sensor.
2 Basic Concept
Our approach to Taste-impression-based retrieval is based on two concepts (Figure 2).
1. Adopting the “Taste Scope” in the metadata extraction mechanism to optimize sensor outputs makes it possible to handle the complexity of Taste-impression.
2. Applying bio-sensing technology to the metadata extraction method, i.e. transforming cognitive data into metadata that can be matched against verbal queries, allows impression-based retrieval at the user's optimal granularity of taste expression.
Figure 2: Basic Approach and Concept of our Method
[Figure 3 diagram labels: target data with sensor outputs; automatic metadata extraction for target data from sensor outputs; metadata for Soup Scope (umami, salty, ..., acid) and metadata for Beer Scope (bitter, sweet, ..., body) as individual feature sets (metadata) representing different Taste Scopes; the query “Beer & Rich” submitted to the search engine.]
Figure 3: Overview of Taste Scope adoption to Metadata Extraction
Feature 1: Definition of Taste Scope in order to utilize anonymous contexts for terms expressing Taste-impression.
One of the most important premises of Taste Scope is that foods and drinks showing exactly the same taste patterns in their sensor outputs do not necessarily give the same Taste-impression, and vice versa. The impression word “rich”, for instance, is a common expression for both soups and Japanese sake. However, the main feature (taste element) behind the impression differs: it is “bitterness” in one case and “umami (palatability)” in the other. These taste elements are completely different, yet each has a significant impact on the underlying meaning of the same Taste-impression word. To deal with this Taste-impression-specific complexity, it is indispensable to define verbal Taste-impression metadata by transforming sensor outputs according to these viewpoints, that is, the “Taste Scope”. To reflect the Taste Scope, our metadata extraction method for food and drink data introduces two modules, the “Taste-impression Metadata Generation Module” and the “Standardization Module”, which perform optimization operations that reflect Taste Scope intelligence in our metadata processing (Figure 3).
Feature 2: Application of bio-sensing technology to metadata extraction, in order to transform sensory information into verbal queries expressing Taste-impression with the user's desired granularity.
A multi-channel taste sensor, also known as an electronic tongue [8] [6], quantitatively computes and outputs taste senses for various foods and drinks and provides an objective scale for human sensory expression in food development and quality control.
Unlike existing sensors such as temperature and pressure sensors, which respond to a single physical quantity, a multi-channel taste sensor can measure many kinds of chemical substances in each food synthetically and transform them into meaningful quantities of basic tastes such as saltiness and sweetness, together with their continued stimuli (hereinafter called after taste). This sensor has been developed on the basis of mechanisms found in biological systems, such as parallel processing of multidimensional information or the use of biomaterials, and is hence called bio-sensing technology (Figure 4). By applying bio-sensing technology to the Taste Scope intelligence in our metadata extraction, our impression-based retrieval makes it possible to compute the correlation between the basic components of sensory information and a verbal query expressing a Taste-impression at the user's desired granularity of taste expression.
Figure 4: Taste Sensor developed by bio-sensing technology [8, 5]
3 Related Work
While the main objective of our study is to extract metadata focusing on the interpretation of the sense of taste and its expression as Taste-impression keywords, related work can be found in the fields of sensor databases and Kansei databases. In this section, we classify related work into two categories, 1) Kansei and impression-based retrieval systems and 2) sensor database systems, and present the main differences between these studies and our method.
3.1 Kansei and Impression-based retrieval system
Kansei databases are studied in various fields to realize intuitive search environments for images, music [9], video streams [7] and so on. To name a few, “A Metadata System for Semantic Search by a Mathematical Model of Meaning” [15] realizes impression-based retrieval for images by automatically computing the color scheme and its correlated impression words. The aim of these studies is to deal with the global impression expressed by impression words for digital images in a database. The paper [14] presents a method that uses color information and N-grams to extract the boundaries at which the impression of a video stream changes. These approaches are applicable and effective for impression-based retrieval of images and video streams, whose impressions are uniquely identifiable. In contrast to these solutions for extracting global impressions of media data, our method extracts Taste-Scope-dependent impression metadata to address the complexity and diversity of the sense of taste, as shown in Table 1.
3.2 Sensor Database systems
The concept of applying sensory information to a database system has been popular in numerous fields, and new applications are constantly being explored [10, 11, 3].
Table 1: Conventional Kansei and Impression-based Retrieval and our method

                                      Retrieval by Impression Query   Retrieval by Scopes
Pattern Matching                      Non-compliant                   Non-compliant
Existing Impression-based retrieval   Compliant                       Non-compliant
Our Method                            Compliant                       Compliant
For instance, database techniques have been widely applied to fingerprint verification. The aim of these studies is to realize exact pattern matching of fingerprints against sensor information stored in heterogeneous databases, obtained from devices such as optical sensors and thermal sweeping sensors [12]. Location-based sensor database studies have been very popular in the ubiquitous computing field as well [1]. There are successful applications of location-aware mobile computing, most notably navigation systems based on GPS sensors (e.g. [2]). Other examples are the NaviCam [16] and active badge systems [4]. Generally speaking, the objectives of existing applications are detection of the presence of an object or a condition, object recognition, object identification/classification, tracking, monitoring, and change detection. Additionally, conventional approaches have relied on simple physical sensor outputs, such as those of distance and temperature sensors, to achieve these objectives. Our method, on the other hand, applies sensory data obtained by bio-sensing technology to a database application as the raw information resource for metadata extraction, in order to realize Taste-impression expression. Sensory information combined with the Taste Scope enables automatic and meaningful information provision in our metadata processing.
4 An Example for Query Processing
In this section, we first demonstrate the actual usage of our system with user scenarios in order to present the significance of our method. Next, we present an example of the metadata extraction and query processing method in order to show the data flow of our method.
4.1 User Scenarios
Two types of query options are available in our Taste-impression search in order to satisfy the needs of different kinds of target users. Beer information retrieval is shown as an example here.
Assume that several beer makers offer local databases to introduce their products, and that general consumers (user scenario with Search Option1) and drink developers (user scenario with Search Option2) are using our system. Our query processing and system architecture are shown in Figure 5, and the user interface is shown in Figure 6.
Query example for general consumers (Search Option1)
A consumer unfamiliar with alcoholic beverages is seeking a beer for refreshment. Search Option1 is designed to satisfy the needs of general consumers with an elementary familiarity with taste flavors. Since the user does not have any detailed knowledge of taste preferences, the user has only an elementary ability to express the desired taste pattern. Such a user submits a query with the Taste Scope “beer” and the Taste-impression “fresh” to express his/her abstract image of a favorite taste.
Figure 5: Query Processing and System Architecture
Query example for foods and drinks developers (Search Option2)
A drink developer needs to design a marketable beer product for the next season. Since the user does not have enough time to seek a desirable taste by trial and error for just one product in the portfolio, the user needs a system that strongly supports the taste design of products in an intuitive manner. In this case, the user is expected to hold not merely ambiguous Taste-impression images but more concrete taste design images, though to a lesser degree than an expert sake sommelier or a buyer at a specialized food importer. Search Option2 is designed to meet the needs of such taste design professionals with an intermediate familiarity with taste. The user submits a query with the Taste Scope “beer” and directly specifies the objective taste pattern. The user is able to find similar beer items, which could be future rival products, before the product is physically implemented. By using this information in our system to differentiate the product from others, the user is able to adjust the direction of product development in a cost-effective way.
4.2 Data flow of our method
A Taste Scope, such as the “beer scope”, is described and submitted as part of the query, and it is used to manage the overall scenario of query processing (Figure 7). According to this Taste Scope, our method selects a candidate set of Taste-impression words as well as features, subsets of sensor outputs, functions to calculate the sensor outputs, and aggregation functions for the intermediate values of sensor outputs. These functions are evaluated through the following steps.
Step-1 Mapping of a set of retrieval candidates for target data: The metadata extraction method maps a set of retrieval candidates for target data in the beer scope to the database, which consists of IDs of beer items and sensor outputs, with URLs as local information regarding beers.
Figure 6: User Interface
[Figure 7 diagram labels: target data; automatic metadata extraction for target data from sensor outputs; for each impression (“Refreshing”, “Heavy”), metadata (MD) for Beer Scope, Soup Scope, Wine Scope, ...; query; the metadata space for each Taste-impression integrates anonymous contexts of different Taste Scopes.]
Figure 7: Overview of Taste Scope
Step-2 Standardizing sensor outputs: Optimizations for sensor outputs are automatically processed by the operation P2 in the standardization module Pz for the beer scope.
Step-3 Extracting metadata from sensor data: Sensor outputs processed in Step-2 (intermediate values) are converted to metadata for target data, which consist of the important feature sets for the Taste-impression definition in the beer scope, by the operations G1 and G3 in the Taste-impression metadata generation module Ge.
Step-4 Calculating correlation: The query processing method measures correlation values between the metadata for beer items extracted in Step-3 and the keyword “fresh” (Search Option1) or the specified taste pattern (Search Option2), and outputs URLs as ranking results.
5 Metadata Extraction and Retrieval Methods for Taste-impressions with Bio-sensing Technology
In this section, we present a framework of the metadata extraction and retrieval method for Taste-impressions with bio-sensing technology. The main functions of our method consist of a metadata extraction function for Taste-impression and its query processing function. The execution model and the basic algorithm outline of our method are shown in Figures 8 and 9. The execution model of our method is described in the following order.
1. The overall execution model
2. The metadata extraction method
[Figure 8 diagram labels: Query processing Λ(T, Wn, Cx) → R, with Taste Domain Wn (W1) and Taste Impression Cx (C1) given by Query1. Step-1, data extraction: the set of retrieval candidates R (R1, R2) belonging to W1 is selected from the target data T (T1, T2). Step-2, Pz : R → S with operators P1 and P2: yields the set of standardized retrieval candidates S (S1, S2). Step-3, Ge : S → M with operators G1 and G2: yields metadata M1 belonging to W1; schema example: acidic-bitterness, base-bitterness, ..., acidic-astringency are converted to astringency, bitterness, ..., sourness. Metadata extraction for target data: α(T, Wn) → M. Correlation: β(M, Cx) → R1, the set of results.]
Figure 8: Execution Model
3. The query processing method
The distinguishing feature of our method lies in the metadata extraction, where sensor outputs are transformed into meaningful Taste-impression metadata at the user's requested granularity. This feature brings two contributions.
The first contribution is the adoption of the Taste Scope in our metadata extraction method. Our method interprets the information obtained from the Taste Scope and develops metadata for Taste-impression. Our metadata extraction method makes it possible to recognize and express abstract and subtle impressions of the sense of taste by processing sensor outputs through two modules: the Taste-impression Metadata Generation Module Ge and the Standardization Module Pz. These modules are implemented so as to reflect the specialized knowledge needed to define the subtle flavors of the target Taste Scope, and they make it possible to integrate diversified Taste-impression definitions from anonymous Taste Scopes.
The second contribution is a query expression that deals with heterogeneous abstraction levels of verbal expressions regarding the sense of taste. We refer to the abstraction level of a verbal expression as its granularity. We have implemented our method to meet the desired granularity of different target users with low to intermediate knowledge of foods and the sense of taste. Namely, our method provides information to users by balancing their level of familiarity with taste against the abstraction level used to express target data verbally. The less familiar a user is with taste knowledge, the higher the abstraction level set for the query (Figure 1).
5.1 The overall Query Execution Model
In this section, we present the overall query processing procedure and the basic functions for the metadata extraction and retrieval methods. In our method, the meaning of a Taste-impression is determined by the indicated Taste Scope.
The Taste Scope is specified by the Wn of the query, which is reflected in the metadata generation for Cx (Cx ∈ C(Wn)) and in the selection of target data. Wn (Wn ∈ W) consists of the feature sets appropriate for defining the impression in each Taste Scope. As for query
selection by scope(T, Wn) → T
selection by (T) → R;
for each z · z + n {
    selection by format(R, z) → Rz
    normalization(Pz, Rz) → S
    for each Rzl in Rz {
        Pz(Rzl) → Sl
        Append(S, Sl);
    }
    MetadataExtraction(S) → m
    for each Sl in S {
        Ge(Sl) → Ml;
        Append(m, Ml);
    }
}
Union(M, m);
Figure 9: Algorithm Outline
options, we present Search Option1 as Q1 and Search Option2 as Q2. The structures of a query are defined as:
$$Q_1 = (W_n, C_x) \tag{1}$$
$$Q_2 = (W_n, \{d_1, d_2, \cdots, d_n\}) \tag{2}$$
$$W_n = \{SID, \{f_1, f_2, \cdots, f_n\}\} \tag{3}$$
$$C_x = \{SID, \{d_1, d_2, \cdots, d_n\}\} \tag{4}$$
The execution model F of our method is performed only on queries described with this data structure. The overall query processing F takes the retrieval candidates T^in as its target and outputs the retrieved results T^out by computing the correlation between Wn and Cx (Q1) and sorting T^in on the basis of the calculated correlation values. Alternatively, a user who knows the exact taste pattern to be expected can directly specify the feature values, as shown in Q2. Since the data selection of the operation F is indicated by Wn, the retrieval result T^out is a subset of T^in. The overall query processing operation F is defined as:
$$F(T^{in}, W_n, C_x) \rightarrow T^{out} \mid T^{out} \subset T^{in} \tag{5}$$
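The query structures (1)-(4) and the operation F in (5) can be illustrated with a minimal sketch. It assumes that metadata have already been extracted for each candidate and that the correlation is a plain inner product; the class names, the dictionary keys (`sid`, `oid`, `metadata`) and the function `run_query` are hypothetical and do not come from the implemented system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass(frozen=True)
class TasteScope:
    """W_n = {SID, {f_1, ..., f_n}}: scope identifier plus feature names."""
    sid: str
    features: Sequence[str]


@dataclass(frozen=True)
class TasteImpression:
    """C_x = {SID, {d_1, ..., d_n}}: feature values defining one impression word."""
    sid: str
    values: Sequence[float]


@dataclass(frozen=True)
class Query:
    """Q1 carries an impression C_x; Q2 carries a taste pattern {d_1, ..., d_n}."""
    scope: TasteScope
    impression: Optional[TasteImpression] = None  # Search Option1 (Q1)
    pattern: Optional[Sequence[float]] = None     # Search Option2 (Q2)

    def target_values(self) -> List[float]:
        if self.impression is not None:
            return list(self.impression.values)
        if self.pattern is not None:
            return list(self.pattern)
        raise ValueError("a query needs an impression (Q1) or a taste pattern (Q2)")


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed correlation measure; the text does not fix a specific one."""
    return sum(x * y for x, y in zip(a, b))


def run_query(
    t_in: List[Dict],
    query: Query,
    correlate: Callable[[Sequence[float], Sequence[float]], float] = inner_product,
) -> List[Dict]:
    """F(T_in, W_n, C_x) -> T_out: keep candidates of the scope, sort by correlation."""
    target = query.target_values()
    in_scope = [t for t in t_in if t["sid"] == query.scope.sid]
    return sorted(in_scope, key=lambda t: correlate(t["metadata"], target), reverse=True)
```

Because `run_query` only filters and reorders its input, the result is always a subset of the candidates, which mirrors the condition T^out ⊂ T^in in (5).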
5.2 The Metadata Extraction Method for Taste-impression
In this section, we present the overall outline of our metadata extraction method for Taste-impression. Our metadata extraction method consists of three functions, which are executed in the following order.
Step-1 Mapping a set of retrieval candidates to Rl
Step-2 Feature Value Optimization by the Standardization Module Pz
Step-3 Schema Optimization by the Taste-impression metadata generation module Ge
First, we present and formalize the functions and data structures of our method. Second, we show our metadata processing procedure by demonstrating typical operation examples for the Beer Scope. We clarify our method by introducing 1) its metadata schema at each stage and 2) the reflected Scope knowledge, which serve as the basis for each operation.
1. Mapping a set of retrieval candidates
In this step, a set of retrieval candidates for target data Tl is extracted from all candidates T on the basis of the selected scope identifier SID. Tl consists of the SID, its own identifier OID, and the entity data (information resources in the network). Each extracted Tl is joined with the sensor data Rl, which is likewise extracted from all candidates R by SID. Sensor data are also described as a set of SID, OID, and sensor output data. These data are mapped and treated as the baseline data for metadata generation. The data structures of the target data Tl and the sensor outputs Rl are defined as:
$$T_l = \{SID, OID, data\} \tag{6}$$
$$R_l = \{SID, OID, data\} \tag{7}$$
Since each tuple in Tl has the SID, our mapping process consists of:
Step-1: Selection of the target data Tl with the Scope ID 1, which corresponds to the Beer Scope,
Step-2: Join of Tl with R1, the sensor data with the same SID,
Step-3: Mapping of the selected R1 as raw data for creating metadata.
2. Standardization Module Pz
The mapped sensor outputs Rl are automatically pre-processed by the standardization module Pz in order to optimize the feature values for the target Taste Scope. Pz 1) selects adequate functions for the target Taste Scope, 2) receives Rl and 3) outputs standardized values Sl. We can therefore regard the retrieved Sl as intermediate, pre-processed values for metadata. The function of the standardization module Pz and the data structure of Sl are defined as:
$$P_z : R_l \rightarrow S_l \tag{8}$$
$$S_l = \{OID, data\} \tag{9}$$
$$\mathrm{normalization}\{P_z, R_l \mid z \in \{1, 2, 3, \cdots, n\},\ l \in \{1, 2, 3, \cdots, n\}\} \tag{10}$$
Figure 10 shows an example of the sensor output optimization procedure activated by the Taste Scope “Beer”. Our feature value optimization process consists of the operator P1, which reflects Beer-Scope-specific intelligence. One of the actual operators is:
• The threshold adjustment operation: In the Taste Scope for beers, it is widely known that slight differences in feature values contribute significantly to the flavor composition. For instance, only a small increase in bitterness drastically changes the impression from “fresh” to “mild”. To deal with this issue, threshold values adjusted by a specialist, which describe the typical baseline values of Japanese beers (the taste pattern of a reference solution), are subtracted from each feature value.
Figure 10: Sensor Outputs Optimization process with threshold adjustment example
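The mapping step and the threshold adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions: tuples are plain dictionaries, the join between Tl and Rl is made on OID within one SID (the text only says the data are selected by SID and joined), and the baseline values are hypothetical placeholders for the specialist-adjusted thresholds. The function names are not taken from the implemented system.

```python
from typing import Dict, List


def map_candidates(targets: List[Dict], sensors: List[Dict], sid: str) -> List[Dict]:
    """Step 1: select T_l and R_l of one Taste Scope and join them.

    Joining on OID is an assumption made for this sketch.
    """
    sensor_by_oid = {r["oid"]: r["data"] for r in sensors if r["sid"] == sid}
    return [
        {"oid": t["oid"], "data": t["data"], "sensor": sensor_by_oid[t["oid"]]}
        for t in targets
        if t["sid"] == sid and t["oid"] in sensor_by_oid
    ]


def threshold_adjust(sensor: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """P_1: subtract the scope-specific baseline from every feature value."""
    return {name: value - baseline.get(name, 0.0) for name, value in sensor.items()}


def standardize(mapped: List[Dict], baseline: Dict[str, float]) -> List[Dict]:
    """P_z : R_l -> S_l, where each S_l = {OID, data}."""
    return [
        {"oid": m["oid"], "data": threshold_adjust(m["sensor"], baseline)}
        for m in mapped
    ]


# Example tuples; the URLs and the baseline are hypothetical.
TARGETS = [
    {"sid": "beer", "oid": "b1", "data": "http://example.com/beer/b1"},
    {"sid": "soup", "oid": "s1", "data": "http://example.com/soup/s1"},
]
SENSORS = [
    {"sid": "beer", "oid": "b1", "data": {"tartness": 9.76, "acidic_bitterness": 9.79}},
    {"sid": "soup", "oid": "s1", "data": {"tartness": 1.0}},
]
BEER_BASELINE = {"tartness": 9.0, "acidic_bitterness": 9.0}

standardized_beers = standardize(map_candidates(TARGETS, SENSORS, "beer"), BEER_BASELINE)
```

Subtracting a baseline turns small absolute differences between beers into values centred on the reference taste pattern, which is the effect the threshold adjustment is meant to achieve.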
3. Taste-impression metadata generation module Ge
The standardized sensor outputs Sl are automatically processed by the Taste-impression metadata generation module Ge in order to extract adequate feature sets for the Taste-impression definition in the target Taste Scope. Ge selects adequate functions for the target Taste Scope; the selected functions receive Sl and output the metadata Ml. Sl is converted to Ml by 1) the extraction and composition of features and 2) the weighting of feature values on the basis of the denominator for the target Taste Scope. The function of the Taste-impression metadata generation module Ge and the data structure of Ml are defined as follows, where each Ml is composed of the same features as Wn.
$$G_e : S_l \rightarrow M_l \tag{11}$$
$$M_l = \{OID, v(W_n)\} \tag{12}$$
$$v = \{(f_1, d_1), (f_2, d_2), \cdots, (f_n, d_n)\} \tag{13}$$
Figure 11 presents the Taste-impression metadata generation phase with Taste Scope knowledge regarding “Beer”, whose intelligence is reflected in the operators G1 to G3.
• The schema integration operation G1: In the Taste Scope for beers, one of the features of the sensor outputs, salinity, neither harms nor benefits the impression definition and is irrelevant to the impression composition. This feature is therefore omitted from the correlation matching target by multiplying its feature values by 0. Acerbity (c5) and the after taste of acerbity (c6) are merged with the union operator in order to transform the abstraction level of the feature words into a verbal expression suitable for users.
• The weighting operation for Acidic-bitterness (sensor outputs from Channel ID3) G2: In the Taste Scope for beers, Acidic-bitterness plays a crucial role in impression
Figure 11: The taste-impression metadata generation example for Beer Data
composition. Since this feature can be described as one of the most essential taste elements for the metadata definition, it should naturally be emphasized with a strong weight. In this experiment, we have tentatively set the weighting coefficient to 10.
• The weighting operation for Base-bitterness (sensor outputs from Channel ID4) G3: In the Taste Scope for beers, Base-bitterness has a negative effect on the impression composition. Humans interpret and feel this taste element in beers like the bitterness of medicines, whereas it functions as “umami” if an adequate amount is added to tomato juice, for instance. To reflect this fact, the weighting operation here turns the feature values of Base-bitterness negative. The aim of this operation is to realize point-deduction scoring for the feature bitterness when it is merged (union) with other outputs related to bitterness.
The standardization module Pz and the Taste-impression metadata generation module Ge serve as recognition and expression mechanisms for defining the diversified impression representations of the sense of taste. Taste Scopes are eventually expressed as metadata for each Taste-impression and are able to express the anonymous meanings of a Taste-impression in different Taste Scopes. Note that while these operation examples are realized as beer-specific operations, the operations themselves are applicable to and re-usable for several Taste Scopes if the same constraints apply. That is, each function reflects Taste Scope intelligence for defining the characteristic features of a Taste-impression keyword that is applicable to several Taste Scopes. Applying modules in this way realizes a search environment that covers various heterogeneous food and drink data in a comprehensive manner.
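The three beer-specific operators can be sketched in a few lines. The weight of 10 for Acidic-bitterness and the sign inversion for Base-bitterness follow the description above; treating the union (merge) of features as a simple sum is an assumption of this sketch, and the feature keys and the function name are hypothetical.

```python
from typing import Dict

ACIDIC_BITTERNESS_WEIGHT = 10.0  # tentative coefficient stated for G2
BASE_BITTERNESS_WEIGHT = -1.0    # G3 turns Base-bitterness negative


def generate_beer_metadata(s: Dict[str, float]) -> Dict[str, float]:
    """G_e : S_l -> M_l for the Beer Scope (sketch; merging is assumed to be a sum)."""
    m = dict(s)  # work on a copy so the standardized values stay untouched

    # G1: salinity is irrelevant for beer impressions (equivalent to multiplying by 0).
    if "salinity" in m:
        m["salinity"] = 0.0
    # G1: merge acerbity and its after taste into a single feature.
    m["acerbity"] = m.pop("acerbity", 0.0) + m.pop("acerbity_aftertaste", 0.0)

    # G2: emphasize Acidic-bitterness; G3: deduct Base-bitterness.
    acidic = ACIDIC_BITTERNESS_WEIGHT * m.pop("acidic_bitterness", 0.0)
    base = BASE_BITTERNESS_WEIGHT * m.pop("base_bitterness", 0.0)
    m["bitterness"] = acidic + base
    return m


example = generate_beer_metadata(
    {
        "salinity": 4.0,
        "acidic_bitterness": 1.5,
        "base_bitterness": 2.0,
        "acerbity": 3.0,
        "acerbity_aftertaste": 0.5,
        "tartness": 1.0,
    }
)
```

For the example input, the merged bitterness is 10 × 1.5 − 2.0 = 13.0 and the merged acerbity is 3.5, while tartness passes through unchanged. Another Taste Scope would supply its own weights and merge rules in the same shape.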
Figure 12: Correlation Calculation
5.3 The correlation calculation operator
The correlation calculation operator β 1) computes correlations between the Taste-impression metadata Me (Me ∈ M) of target data and each Taste-impression word Cx (Cx ∈ C) and 2) outputs semantically close taste data Rl as retrieval results according to the user's query (an impression word or a taste pattern, and a target scope). By employing the operation β, our method sorts the target data Tl in descending order of the calculated correlation values and thus ranks the target data according to impression words that comply with the Taste Scope. The function of the correlation calculation operator β is defined as:
$$\beta(M_l, C_x) \rightarrow R_l \tag{14}$$
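A minimal sketch of the operator β in (14) is given below. The text does not specify the correlation measure, so cosine similarity is used here purely as an assumption; the function names and the feature keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def beta(
    metadata: Dict[str, Dict[str, float]],
    impression: Dict[str, float],
    features: Sequence[str],
) -> List[Tuple[str, float]]:
    """beta(M_l, C_x) -> R_l: score every item and rank in descending order.

    metadata maps an OID to its feature values; impression holds the feature
    values of C_x (Query1) or a directly specified taste pattern (Query2).
    """
    target = [impression.get(f, 0.0) for f in features]
    scored = [
        (oid, cosine([values.get(f, 0.0) for f in features], target))
        for oid, values in metadata.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = beta(
    {
        "b1": {"bitterness": 1.0, "sweetness": 0.0},
        "b2": {"bitterness": 0.0, "sweetness": 1.0},
    },
    {"bitterness": 1.0, "sweetness": 0.2},
    ("bitterness", "sweetness"),
)
```

Both query types reduce to the same call: only the source of the target vector differs, which is why the two search options can share one operator.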
Our method provides two types of Taste-impression search options, both of which are eventually incorporated into the operator β. Whereas Query1 correlates Wn with a Taste-impression keyword whose feature values have been given by professionals for the impression definition (the most abstract impression expression in our method), Query2 provides a less abstract search by directly specifying the user's desired taste pattern, as shown in Figure 12.
6 An application to the Beer and Japanese Food Scopes
By realizing Taste-impression-based retrieval with our metadata extraction method, users can intuitively search for any kind of information regarding foods and drinks on the basis of abstract Taste-impression preferences. To extract target data for beer and Japanese foods by Taste-impression, we have applied our metadata extraction method to local Japanese recipe and drink databases.
6.1 A Metadata Extraction Method for Taste-impression
In this section, we present the implementation of our metadata extraction method. We have applied experimental data and defined functions for each module.
Figure 13: Principle of Taste Sensor (Offered by Insent, Inc.)
1. Sensor Outputs
We show the information resources applied as raw data for our experimental system. We present the generation principle for sensor outputs (see footnote 1) and an extraction method for experimental sensor data. In order to realize metadata extraction for the beer and Japanese food Taste Scopes, we have applied real sensor outputs for beer data (28 tuples) and virtual outputs for Japanese food data (25 tuples).
• Principle of Taste Sensor
We have implemented our program by applying the taste sensing system proposed in [6] [8]. In a narrow definition of taste, the human tongue receives taste as electric signals from foods and drinks composed of numerous chemical substances, 1) whose interactions are not yet clear and 2) for which explanations are still under development. To deal with this difficulty in analyzing and evaluating taste, taste sensing technology that applies the mechanism of the human tongue has been developed as a multi-channel taste sensor and is widely used in the food industry. The transducers of the sensor are composed of lipids immobilized with polyvinyl chloride. The multi-channel electrode is connected to a channel scanner through high-input impedance amplifiers. The electric signals are converted to a digital code by a digital voltmeter and then transferred to a computer, as shown in Figure 13.
• Sensor Outputs of Taste Sensor
The sensor output is not the amount of specific taste substances but the taste quality and intensity. The sensor has a concept of “global selectivity”, which is the ability to classify an enormous variety of chemical substances into primitive taste elements such as saltiness, sourness, bitterness, umami and sweetness, and their after taste (flavor stability on the tongue). Electric signals obtained from the sensor are converted to taste quality on the basis of the Weber-Fechner law, which gives an approximately accurate generalization of the intensity of sensation. The base of the logarithm is defined as 1.2.
For example, 12.5 units corresponds to a concentration about 10 times higher than that of the original sample, and 25 units to a concentration about 100 times higher. The sensor outputs consist of 16 feature attributes. An excerpt of the sensor outputs for beers is shown in Table 2.
1 The descriptions of the Principle of Taste Sensor and the Sensor Outputs of Taste Sensor are excerpted and summarized from [6] and [8].
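The relation between sensor units and concentration can be checked numerically. The sketch below assumes that the output in units equals the base-1.2 logarithm of the concentration ratio relative to the original sample, which is one reading of the Weber-Fechner conversion described above; the function names are hypothetical.

```python
import math

LOG_BASE = 1.2  # base of the logarithm stated for the taste sensor


def units_to_ratio(units: float) -> float:
    """Concentration ratio relative to the original sample for a given output."""
    return LOG_BASE ** units


def ratio_to_units(ratio: float) -> float:
    """Sensor units corresponding to a concentration ratio (ratio must be > 0)."""
    return math.log(ratio, LOG_BASE)


if __name__ == "__main__":
    for units in (1.0, 12.5, 25.0):
        print(f"{units:5.1f} units -> ratio {units_to_ratio(units):.2f}")
    for ratio in (10.0, 100.0):
        print(f"ratio {ratio:5.0f} -> {ratio_to_units(ratio):.2f} units")
```

Under this assumption one unit corresponds to a 20% change in concentration, a tenfold concentration to roughly 12.6 units, and a hundredfold concentration to roughly 25.3 units.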
beer brands            tartness  salinity  other bitterness  …  acidic bitterness  acerbity  acerbity (after taste)  astringency
Kirin Lager            15.15     -4.87     -1.2              …  11.6               21.77     2.24                    12.22
Kirin Ichiban-shibori  13.37     -5.34     -1.09             …  10.13              19.41     1.93                    11.48
Sapporo Black Label    14.33     -6.49     -1.08             …  10.33              20.42     1.8                     10.85
Suntory Malts          17.16     -4.83     -0.83             …  8.84               19.55     1.62                    11.08
Asahi Super Dry        16.02     -8.73     -0.33             …  10.36              18.58     1.64                    11.25
Kirin Tanrei           9.76      -8.55     0.16              …  9.79               16.26     1.63                    12.47

Table 2: Subset data of Sensor Outputs for Beers
• Experimental Sensor Data Generated for Japanese Foods
Similar virtual data were created by questionnaire for the Japanese food data and tentatively applied to our experimental system. To generate the virtual outputs, we recruited 50 test subjects, prepared 48 typical Japanese food items as experimental objects, and conducted a questionnaire using the Semantic Differential method [13]. The questionnaire included a free-text space for each target item so that the test subjects could write down the impression words that came to mind. We also used these results for performance evaluation. The support rate is calculated as the ratio of the number of subjects who wrote a given impression word for a food item to the total number of test subjects. We set the support-rate threshold to 39% and eliminated 23 food items, because the main aim of this experiment is to search target data by impression: foods and drinks that users only weakly associate with an impression are less meaningful as target data for our system.

2. The standardization module Pz
For sensor output conversion, we implemented the following functions for Pz.
P1 is defined as the comparative assessment for the target Taste Scope. It converts the original feature values into values suitable for defining the target Taste Scope. From each original feature value we subtract a specific number that a specialist has adjusted for the target Taste Scope. This function makes slight differences between feature values visible and reflects the subtle taste balance of the flavor representation.
P2 is defined as a pre-processing function for feature values. It converts some of the original feature values (sensor outputs) into absolute values. Since the sensor outputs are expressed as electric potentials, several features, such as salinity and other bitterness, are negative in the original data. To comply with the semantics of the vector expression for metadata, we apply P2 to such features.
P3 is defined as a normalization function for feature values. Feature values are normalized so that they lie between 0 and 1. This function resolves the large gaps in average magnitude between features while maintaining the balance of the original important feature values.

3. Taste-impression metadata extraction module Ge
To extract adequate feature sets for the Taste-impression definition in the target Taste Scope, we realized the following functions for module Ge.
G1 is defined as an integration function for feature values. Feature values are extracted and composed to define the target Taste Scope. With this function, we can produce suitable features for the target Taste Scope.
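The standardization functions P1 to P3 described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the offset values are hypothetical, and P3 is interpreted here as per-feature min-max scaling across all target data, which is only one possible reading of "normalized between 0 and 1". The sample values are taken from two rows of Table 2.

```python
from typing import Dict, Iterable, List

Vector = Dict[str, float]  # feature name -> feature value


def p1_comparative(vec: Vector, offsets: Dict[str, float]) -> Vector:
    """P1: subtract a specialist-chosen offset from each feature (offsets are hypothetical)."""
    return {name: value - offsets.get(name, 0.0) for name, value in vec.items()}


def p2_absolute(vec: Vector, features: Iterable[str]) -> Vector:
    """P2: replace the listed features by their absolute values."""
    targets = set(features)
    return {name: abs(value) if name in targets else value for name, value in vec.items()}


def p3_normalize(vectors: List[Vector]) -> List[Vector]:
    """P3 (assumed reading): min-max scale every feature to [0, 1] across all vectors."""
    names = list(vectors[0].keys())  # assumes all vectors share the same features
    lo = {n: min(v[n] for v in vectors) for n in names}
    hi = {n: max(v[n] for v in vectors) for n in names}
    return [
        {n: (v[n] - lo[n]) / (hi[n] - lo[n]) if hi[n] > lo[n] else 0.0 for n in names}
        for v in vectors
    ]


if __name__ == "__main__":
    beers = [
        {"tartness": 15.15, "salinity": -4.87},  # first row of Table 2
        {"tartness": 9.76, "salinity": -8.55},   # last row of Table 2
    ]
    offsets = {"tartness": 9.0}  # hypothetical specialist adjustment
    step1 = [p1_comparative(b, offsets) for b in beers]
    step2 = [p2_absolute(b, ["salinity"]) for b in step1]
    print(p3_normalize(step2))
```

Keeping each function separate mirrors the modular description in the text, so a different Taste Scope could apply only some of the steps or change their order.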
G2 is defined as the emphasis assessment for feature values. We tentatively multiply feature values by 10 to emphasize the key feature values for the impression composition of each Taste Scope.
G3 is defined as another pre-processing function for feature values. It converts original feature values (sensor outputs) into negative values. As described in the previous section, some taste elements, such as base-bitterness in the beer case, have a negative effect on the impression composition for beers. The aim of this operation is to realize point-deduction scoring for a feature when it is merged (union) with G1.
In addition to the functions implemented above, new functions can be combined into Pz and Ge in order to convert sensor outputs into suitable features and feature values for a target Taste Scope.

6.2 The Query Processing Method
To realize an experimental information retrieval system, we implemented the query processing method. It measures correlation values between the metadata of the target data (Ty) and either a keyword described as a Taste-impression (Cx, in the case of Search Option 1) or a directly entered taste pattern (x1 to xm, in the case of Search Option 2), and outputs the ranking results with URLs as local information for foods and drinks. Correlations between the query vector and the target data vectors can be measured in various ways, such as the inner product, cosine correlation and comparison of vectors. In this paper, we implemented the inner product. The inner product is a technique for calculating the amount of correlation between the query keyword and the target data. Both the query keyword and the target data are expressed as vectors that have the same elements. The correlation function (Cx, Ty) is defined by the following formula:
$$(C_x, T_y) = \sum_{i=1}^{m} W_{xi} \cdot W_{yi} \tag{15}$$

$$C_x = (W_{x1}, W_{x2}, \cdots, W_{xm}) \tag{16}$$

$$T_y = (W_{y1}, W_{y2}, \cdots, W_{ym}) \tag{17}$$
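The inner-product correlation between a query vector and the target data vectors can be sketched as below. The vectors and item names are hypothetical placeholders, not data from the experiments, and the function names are illustrative only.

```python
from typing import Dict, List, Sequence, Tuple


def inner_product(query: Sequence[float], target: Sequence[float]) -> float:
    """Correlation between query and target: sum of element-wise products."""
    if len(query) != len(target):
        raise ValueError("query and target must have the same number of elements")
    return sum(q * t for q, t in zip(query, target))


def rank_targets(
    query: Sequence[float], targets: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Score every target against the query and sort by descending correlation."""
    scored = [(name, inner_product(query, vec)) for name, vec in targets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical three-feature metadata vectors.
    query = (1.0, 0.5, 0.0)
    targets = {
        "item_a": (0.2, 0.4, 0.9),
        "item_b": (0.8, 0.6, 0.1),
        "item_c": (0.5, 0.0, 1.0),
    }
    for name, score in rank_targets(query, targets):
        print(name, round(score, 2))
```

Because the score is a plain sum of products, a feature that has been negated (as G3 does) lowers the score of targets that are strong in that feature, which is how point deduction enters the ranking.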
7 Experimental Studies
To evaluate the feasibility of our system and its application, we performed four experiments with the following objectives.
Experiment 1: Feasibility evaluation for different Taste Scopes
Experiment 2: Performance evaluation for the Japanese food scope
Experiment 3: Performance evaluation for the Beer scope
Experiment 4: Functions adjustment evaluation for the Beer scope
The overall experimental results show that our method is applicable to anonymous Taste Scopes, as shown in Experiment 1. The performance evaluation on the data in the beer and Japanese food scopes confirmed that the retrieval results are reasonable (Experiments 2 and 3). Furthermore, the function adjustments in Experiment 4 allowed improvements in ranking
Results for the JAPANESE FOOD Scope                   Results for the BEER Scope
rank  target data ID              correlation         rank  target data ID  correlation
[1]   kyuuritowakameno-sunomono   14.32               [1]   beer data1      26.54
[2]   ma-bo-toufu                 12.93               [2]   beer data10     25.04
[3]   chinjaoro-su                12.75               [3]   beer data11     24.44
[4]   yaki-gyouza                 12.29               [4]   beer data15     24.09
[5]   chirashi-sushi              12.24               [5]   beer data16     23.14
[6]   butaniku-noshouga-yaki      11.63               [6]   beer data17     22.52
[7]   ajino-shioyaki              11.52               [7]   beer data2      21.77
[8]   mi-toso-supasuta            11.13               [8]   beer data3      20.77
[9]   bi-fu-shichu-               11.04               [9]   beer data4      20.42
[10]  buri-no-teriyaki            10.99               [10]  beer data8      19.55
[11]  toriniku-no-teriyaki        10.88               [11]  beer data12     19.41

Figure 14: Retrieval Results for different Taste Scopes

rank  correlation  target data                 support rate
1     20.97        chinjaoro-su                73%
2     20.93        ma-bo-toufu                 57%
3     20.01        yaki-gyouza                 55%
4     18.38        butaniku-noshouga-yaki      59%
5     17.77        ebi-furai                   61%
6     16.74        ika-no-bata-sote-           52%
7     16.61        kaki-furai                  64%
8     16.47        karubona-rapasuta           61%
9     16.46        toriniku-no-teriyaki        39%
10    15.50        buri-no-teriyaki            50%
11    15.31        mi-toso-supasuta            55%
12    15.16        ajino-shioyaki
13    15.11        bi-fu-shichu-
14    14.08        saba-no-misoni              39%
15    14.05        niku-jaga
16    13.35        karei-no-nituke
17    13.30        ro-ru-kyabetsu
18    12.44        kyuuritowakameno-sunomono
19    11.87        kabochano-nimono
20    11.74
21    11.70        chirashi-sushi
22    10.37        ingen-no-gomaae
23    10.21
24    9.12         chawan-mushi
25    8.08         houren-sou-no-ohitashi

Figure 15: Retrieval results ("Rich" for Japanese foods)
performance when the implemented modules are applied to the Beer Scope, which verifies the pluggability of the modules in our experimental system.

7.1 Experiment 1: Feasibility evaluation for anonymous Taste Scopes
• Evaluation Method: Experiment 1 evaluates the applicability of our method to several Taste Scopes. We submitted a query with both Taste Scopes, "Japanese food" and "beer", and the selected Taste-impression "fresh".
• Experimental Results and Analysis: In this experiment, we observed that our experimental system 1) selected the appropriate target data for each Taste Scope concurrently, and 2) ranked the target data with reasonable accuracy. For instance, "Kyuri-to-wakame-no-sunomono" (vinegared cucumber and brown seaweed) and "Chirashizushi" (vinegared rice topped with various kinds of sliced raw fish) rank 1st and 5th in the Japanese food scope. This result suggests that the Taste Scope-based metadata extraction method for impression retrieval is promising. A detailed performance evaluation for each scope is given in the following two experiments.
7.2 Experiment 2: Performance Evaluation for Japanese Food Scope
• Evaluation Method: Experiment 2 evaluates performance in the Japanese food scope. As experimental objects, we created 25 virtual sensor outputs from a questionnaire answered by 50 test subjects. In this experiment, we submitted the query with the impression word "rich" and the Taste Scope "Japanese food". The results are shown in Figure 15. As impression words for queries, we implemented three impression expressions: "maroyaka" (mellow or mild in Japanese), "sappari" (fresh) and "kotteri" (heavy or rich). We selected these impression words because they were used frequently in our questionnaire. As target data, we selected the 25 food items based on the support rate of the keywords.
• Experimental Results and Analysis: Target data with a support rate of 39% or more for the impression word "rich" are indicated in boldface. An overall comparison of the support rates with the actual ranking results shows a reasonable correlation between our impression-based retrieval and the keyword. The impression word "rich" defines the attribute "fat" as the most significant feature for the flavor definition (2.73), followed by "salinity" (2.17). These attribute values are the 2nd and 3rd
Figure 16: Factor Analysis results (correct answers)
Figure 17: Function Combinations
greatest values among all attribute values of the impression-word feature values. The other values are also relatively large compared with the metadata of other impression words. These facts demonstrate that the overall impression for the query "rich" for Japanese food is heavy, especially in terms of fat and salinity. Since target data with globally large attribute values are easier to retrieve with inner product query processing, the support rate for this query example is very promising.

7.3 Experiment 3: Performance Evaluation for Beer Scope
• Evaluation Method: In Experiment 3, we conducted a performance evaluation for the Taste Scope "beer" on the basis of an extensive survey, comparing the retrieval results of our method with prepared correct answers. As experimental objects, we applied the 28 sensor outputs for the beer scope. The criterion for the performance evaluation is whether our experimental system ranks the correct answers highly in the ranking results. To prepare the correct answers, the beer data for each Taste-impression were defined by a marketing analysis survey for beer data. These answers were generated by factor analysis for 178 test subjects, 30 beer data and 32 Taste-impression words². We sorted the beer data by factor rating value in descending order and selected the top 4 as the correct answers for each impression, as shown in Figure 16.
• Experimental Results and Analysis: The results are shown in Figure 18 (Exp. A). Among the real beer data, the correct answers are ranked 1st, 2nd, 6th and 11th, giving a recall ratio of 50% in the top 5 and 75% in the top 10 target data. These experimental results indicate the feasibility of our metadata generation method. We discuss the effect of module adoption on our method in the next experiment.

7.4 Experiment 4: Functions Adjustment
• Evaluation Method
Experiment 4 evaluates the pluggability of modules in our experimental system. We implemented several functions of the standardization module Pz and the Taste-impression metadata extraction module Ge and applied them to metadata generation for the sensor data in the beer scope.

² The marketing data were provided by Masayuki Goto, Faculty of Environmental Information, Musashi Institute of Technology.
Query : Beer, Rich

      Exp.A (original)               Exp.B                          Exp.C
rank  target data ID    correlation  target data ID    correlation  target data ID    correlation
[1]   correct answer 4  26.54        correct answer 4  2.14         correct answer 4  72.5
[2]   correct answer 2  25.04        correct answer 1  2.00         correct answer 1  70.9
[3]   beer data1        24.44        beer data6        1.30         beer data6        66.4
[4]   beer data2        24.09        beer data1        1.29         correct answer 3  64.9
[5]   beer data11       23.14        correct answer 2  1.28         beer data1        63.9
[6]   correct answer 1  22.52        correct answer 3  1.09         correct answer 2  63.8
[7]   beer data4        21.77        beer data11       1.01         beer data15       63.8
[8]   beer data5        20.77        beer data4        1.00         beer data4        63.6
[9]   beer data15       20.42        beer data15       0.99         beer data11       61.8
[10]  beer data6        19.55        beer data16       0.77         beer data16       58.8
[11]  correct answer 3  19.41        beer data5        0.61         beer data5        57.2
[12]  beer data9        18.58        beer data9        -0.33        beer data9        50.4
[13]  beer data10       17.34        beer data20       -0.39        beer data20       50
[14]  beer data16       17.2         beer data10       -0.56        beer data10       48.1
[15]  beer data17       16.27        beer data2        -0.73        beer data19       44.1
[16]  beer data20       16.26        beer data19       -0.76        beer data2        43.9
[17]  beer data14       14.54        beer data7        -1.56        beer data7        38.3
[18]  beer data19       14.31        beer data13       -1.72        beer data13       33.8
[19]  beer data18       13.74        beer data17       -1.99        beer data17       31.8
[20]  beer data21       13.33        beer data21       -2.34        beer data21       30.4
[21]  beer data8        13.02        beer data12       -2.56        beer data12       26.2
[22]  beer data22       12.62        beer data14       -2.60        beer data14       25.6
[23]  beer data12       12.55        beer data22       -3.08        beer data22       23.2
[24]  beer data13       12.29        beer data18       -3.14        beer data18       20.2
[25]  beer data7        12.19        beer data3        -3.83        beer data3        13.3
[26]  beer data3        11.74        beer data8        -4.18        beer data8        9.9

Figure 18: Function Adjustments Results for Beers
Figure 19: Recall Ratio Improvements (Exp.4)
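The recall ratios reported for Experiment 3 follow directly from the ranks of the four correct answers (1, 2, 6 and 11). A minimal sketch of that calculation, with a hypothetical helper name, is:

```python
from typing import Sequence


def recall_at_k(correct_ranks: Sequence[int], k: int) -> float:
    """Fraction of correct answers whose rank is within the top k results."""
    if not correct_ranks:
        raise ValueError("at least one correct answer is required")
    hits = sum(1 for rank in correct_ranks if rank <= k)
    return hits / len(correct_ranks)


if __name__ == "__main__":
    experiment3_ranks = [1, 2, 6, 11]  # ranks of the correct answers reported in Experiment 3
    print(recall_at_k(experiment3_ranks, 5))   # 0.5
    print(recall_at_k(experiment3_ranks, 10))  # 0.75
```

The same helper can be applied to the ranks obtained under any other function combination to compare recall at a fixed cutoff.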
We compared the retrieval ranking results of Experiment A (fundamental operations only, very close to the original sensor output data), Experiment B (threshold adjustment with G3) and Experiment C (Experiment A with weighting), as shown in Figure 17. We submitted the query with the impression word "rich" under each of these three optimization patterns.
• Experimental Results
The results are promising, as shown in Figure 18. Adjusting the feature values with these functions yielded better ranking results than those obtained without the standardization functions. More specifically, adopting the beer-specific Taste Scope intelligence in these modules (Experiments B and C) achieves a 40% recall ratio improvement in the top 5 and 25% in the top 10 target data compared with Experiment A (Figure 19). These results suggest that function adjustment in our metadata generation method is effective for optimizing feature values for target Taste Scopes.

8 Conclusion and Future Work
In this paper, we have presented metadata extraction and retrieval methods for Taste-impressions with bio-sensing technology. The features of our metadata extraction method are 1) a metadata extraction method that automatically transforms sensor outputs into meaningful Taste-impression metadata and 2) the definition of the Taste Scope, which utilizes the anonymous meanings of each Taste-impression. We have shown the application of our methods to media data of the beer and Japanese food scopes and examined the feasibility of our system through several experimental studies. We are currently developing new Taste Scopes that deal with the viewpoints of user groups in order to handle the diverse taste sensitivity of people, such as the elderly and the young. These functions will be added to our proposed method and will allow further flexibility in extracting Taste-impressions. Eventually, we hope to realize a sensor-based metadata extraction method that uses several bio-sensing technologies, such as odor sensors, in order to improve the quality of metadata from the perspective of the other senses.

Acknowledgements
We would like to thank Shuichi Kurabayashi and Dr. Naofumi Yoshida of the Graduate School of Media and Governance, Keio University, for valuable discussions and helpful comments on this study. We would also like to express our gratitude to the researchers of the Taste Sensor, Dr. Hidekazu
Ikezaki of Intelligent Sensor Technology, Inc. and Prof. Kiyoshi Toko of the Graduate School of Information Science and Electrical Engineering, Kyushu University, for valuable comments on implementing the experimental system.

References
[1] Albrecht Schmidt, Michael Beigl, Hans-W. Gellersen, "There is more to Context than Location – Environment Sensing Technologies for Adaptive Mobile User Interfaces", Proceedings of International Workshop on Interactive Applications of Mobile Computing (IMC98).
[2] BMW. The BMW navigation system. BMW compass. http://www.bmw.com/compass/htdocs/BMWe/backissue/FORSCH2E.shtml, 1998.
[3] B. Dasarathy, "Sensor Fusion Potential Exploitation - Innovative Architectures and Illustrative Applications", Proc. of the IEEE, Vol. 85, pp. 24-38, Jan. 1997.
[4] Beadle, P., Harper, B., Maguire, G.Q. and Judge, J. Location Aware Mobile Computing. Proceedings of IEEE International Conference on Telecommunications, Melbourne, Australia, April 1997.
[5] Charles Zuker, "A Matter of Taste: Candidates for Taste Receptors Identified", Howard Hughes Medical Institute Bulletin, 2003.
[6] H. Ikezaki, Y. Kobayashi, R. Toukubo, Y. Naito, A. Taniguchi, and K. Toko: "Techniques to Control Sensitivity and Selectivity of Multichannel Taste Sensor Using Lipid Membranes", Digest Tech. Papers Transducers '99, pp. 1634-1637, June 1999.
[7] Ijichi, A. and Kiyoki, Y.: "A Kansei Metadata Generation Method for Music Data Dealing with Dramatic Interpretation", Information Modeling and Knowledge Bases, Vol. XVI, IOS Press.
[8] K. Toko, "Biomimetic sensor technology", Cambridge University Press, 2000.
[9] Nobuko Miura, Shuichi Kurabayashi, and Yasushi Kiyoki: An Automatic Extraction Method of Time-Series Impression-Metadata for Color Information of Video Streams. International Special Workshop on Databases For Next Generation Researchers (SWOD2005) in conjunction with ICDE2005, 2005, 54-57.
[10] Pramod K. Varshney, "Multisensor Data Fusion", Lecture Notes in Computer Science (Springer-Verlag Heidelberg), Vol. 1821, 2000.
[11] R. Antony, "Database Support to Data Fusion Automation", Proc. of the IEEE, Vol. 85, pp. 39-53, Jan. 1997.
[12] R. Cappelli, "SFinGe: an Approach to Synthetic Fingerprint Generation", in Proceedings of International Workshop on Biometric Technologies (BT2004), Calgary, Canada, pp. 147-154, June 2004.
[13] Snider, J.G. and Osgood, C.E.: "Semantic Differential Technique - A Sourcebook", Aldine Pub. Company, 1969.
[14] Tanizawa, K. and Uehara, K.: "Automatic Detection of the Semantic Structure from Video by Using Ngram".
[15] Y. Kiyoki, T. Kitagawa, T. Hayama, "A Metadatabase System for Semantic Image Search by a Mathematical Model of Meaning", Multimedia Data Management: using metadata to integrate and apply digital media, McGraw-Hill, Amit Sheth and Wolfgang Klas (editors), Chapter 7, 1998.
[16] Nagao, K., Rekimoto, J. Agent Augmented Reality: A Software Agent Meets the Real World. Proceedings of the 2nd Conference on Multiagent Systems (ICMAS-96), Dec 1996.
Information Modelling and Knowledge Bases XIX H. Jaakkola et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved.
An Ontological Framework for Modeling Complex Cooperation Contexts
B. Lahouaria

Position paper
In this paper we propose a cooperation modeling approach that aims at aligning system development with the organizational change in which the system will operate. It consists of a semantic enrichment of the cooperation documentation so that the intertwined interactions between the organization, human and system views can be represented explicitly in the system development process. The proposed ontological framework plays a crucial role as a communication, learning and design artifact for different stakeholders.
1. Cooperation capturing and modeling
Cooperation modeling is decisive for the system development process when the application domain is characterized by complex cooperation. It is necessary not only to identify and understand the actual work practices but also to capture and predict the changes the future system will initiate, so that the system remains adaptable to a permanently changing environment. These changes can be explicitly known, such as those of a technological nature, or not as easily identifiable, such as those of a social nature. The literature witnesses the emergence of manifold models and technologies supporting cooperation, such as CSCW (groupware and workflows), business process re-engineering, etc. On the one hand, the approaches on which the models are based have different origins (theories of situated action, communities of practice, distributed cognition, studies on coordination mechanisms and "articulation work", etc.), so there is no consensus regarding the set of concepts and abstraction levels underlying cooperation modeling. On the other hand, approaches to change mainly take an organizational point of view (organization work-oriented, system-oriented, collaborators-oriented and process-oriented approaches) and thus deal with the nature of the work practices, for instance whether they are structurally open or closed to change [10], after the embedding of the cooperative system.
… they do not explicitly consider the two levels of requirements dealing with the nature of the cooperative work as well as what will be changed and the way to change it.
B. Lahouaria / An Ontological Framework for Modeling Complex Cooperation Contexts
The alignment of organization, human and system views for cooperation modeling
Participatory and evolutionary system development approaches such as STEPS [12] are well suited to cooperation support, because considering organization, human and system at the same level is a long-standing tradition in such approaches. Modeling the complex cooperation that characterizes work practices in organizations requires, however, not only the alignment of these analytical dimensions but also their explicit integration into the system development and embedding processes. This appears to be a very difficult task for methods grounded on participation and evolution principles, where learning processes are based on project-oriented software artifacts (such as scenarios, glossaries, prototypes, etc.). The project itself is unique and limited in time, so that the focus is ultimately only on the software as a product. We claim that a practical solution for the alignment of system development with organizational change should ensure that:
• the participation of the users in the development process means that new, previously unknown participants with different interests can continuously be introduced in order to deal with …
• the evolution of the system goes beyond the project’s context so that t
An ontology-based cooperative work representation
• Concretizing the evolution by creating an organizational memory information system, in order to have a common learning artifact.
Ontological framework for cooperative processes
The whole environment (organizational, human and system) in which the system is embedded does not exist physically but is represented by means of the cooperation ontologies at the cognitive context (see Fig. 1).
Fig. 2 Top-level cooperation ontology
Fig. 3 Organizational, human and system cooperation foundational ontologies
Table 1. Cooperation modeling levels in the system development process
Analyse
Operationalization    Extensional level (instances)
The horizontal representation of cooperation views through the foundational ontologies (see Fig. 3) guides the development team in taking into account different stakeholders with different interests, understandings and terminology about the cooperation, whereas the vertical representation of cooperation levels (see Table 1) guides them in analyzing and generating contextual cooperative process metamodels that are adequate to the application domain at hand.

5. Application of OFCP to a hospital research project
… for the cooperation analysis process. Indeed, a cooperative process can be characterized in terms of a network of dependencies among entities annotated through the set of concepts in OFCP. The process of analysis can begin with, or alternate between, any type of entity (task-oriented, object-oriented, actor-oriented, resource-oriented analysis, etc.).

References
[1] Y. Engeström. Developmental work research: Reconstructing expertise through expansive learning. In: Nurminen, M. I., Järvinen, P. & Weir, G. (eds.), Conference on Human Jobs and Computer Interfaces, Tampere, Finland, June 26-28, 1991, pp. 124-143.
[2] C. Floyd, Y. Dittrich, R. Klischewski (Eds.). Social Thinking - Software Practice. Relating Software Development, Work and Organizational Change, Dagstuhl-Report Nr. 99361. Wadern: IBFI, 1999.
[3] I. Wetzel. Information Systems Development with Anticipation of Change Focusing on Professional Bureaucracies. In Proc. of Hawaii International Conference on System Sciences, HICSS-34, Maui, January 2000.
[4] J. Ziegler. Modeling cooperative work processes - A multiple perspectives framework. Int. Journal of Human-Computer Interaction, 14(2), 139-157, 2002.
[5] C. Floyd. Software development as reality construction. In: Floyd, C. et al. (eds.): Software Development and Reality Construction. Springer Verlag, Berlin, 1992.
[6] D. Hensel: Relating Ontology Languages and Web Standards. In: J. Ebert, U. Frank (Hrsg.): Modelle und Modellierungssprachen in Informatik und Wirtschaftsinformatik. Proc. "Modellierung 2000", Föllbach-Verlag, Koblenz 2000, pp. 111-128.
[7] N. Guarino. Foundational ontologies for humanities: the Role of Language and Cognition, in First Int. Workshop "Ontology Based Modeling in Humanities", University of Hamburg, 7-8 April 2006.
[8] E. Falkenberg, W. Hesse, P. Lindgreen, B.E. Nilsson, J.L.H. Oei, C. Rolland, R.K. Stamper, F.J.M. Van Assche, A.A. Verrijn-Stuart, K. Voss: FRISCO - A Framework of Information System Concepts - The FRISCO Report. IFIP WG 8.1 Task Group FRISCO. Web version: ftp://ftp.leidenuniv.nl/pub/rul/frifull.zip (1998)
[9] W.J. Orlikowski & D. Robey. Information Technology and the Structuring of Organizations. Information Systems Research, Vol. 2(2): 143-169, 1991.
[10] E* H : ; : ; +