ECAI 2008
Frontiers in Artificial Intelligence and Applications
FAIA covers all aspects of theoretical and applied artificial intelligence research in the form of monographs, doctoral dissertations, textbooks, handbooks and proceedings volumes. The FAIA series contains several sub-series, including “Information Modelling and Knowledge Bases” and “Knowledge-Based Intelligent Engineering Systems”. It also includes the proceedings volumes of the biennial ECAI, the European Conference on Artificial Intelligence, and other publications sponsored by ECCAI, the European Coordinating Committee for Artificial Intelligence. An editorial panel of internationally well-known scholars is appointed to provide a high quality selection.
Series Editors: J. Breuker, R. Dieng-Kuntz, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong
Volume 178
Recently published in this series:
Vol. 177. C. Soares et al. (Eds.), Applications of Data Mining in E-Business and Finance
Vol. 176. P. Zaraté et al. (Eds.), Collaborative Decision Making: Perspectives and Challenges
Vol. 175. A. Briggle, K. Waelbers and P.A.E. Brey (Eds.), Current Issues in Computing and Philosophy
Vol. 174. S. Borgo and L. Lesmo (Eds.), Formal Ontologies Meet Industry
Vol. 173. A. Holst et al. (Eds.), Tenth Scandinavian Conference on Artificial Intelligence – SCAI 2008
Vol. 172. Ph. Besnard et al. (Eds.), Computational Models of Argument – Proceedings of COMMA 2008
Vol. 171. P. Wang et al. (Eds.), Artificial General Intelligence 2008 – Proceedings of the First AGI Conference
Vol. 170. J.D. Velásquez and V. Palade, Adaptive Web Sites – A Knowledge Extraction from Web Data Approach
Vol. 169. C. Branki et al. (Eds.), Techniques and Applications for Mobile Commerce – Proceedings of TAMoCo 2008
Vol. 168. C. Riggelsen, Approximation Methods for Efficient Learning of Bayesian Networks
Vol. 167. P. Buitelaar and P. Cimiano (Eds.), Ontology Learning and Population: Bridging the Gap between Text and Knowledge
Vol. 166. H. Jaakkola, Y. Kiyoki and T. Tokuda (Eds.), Information Modelling and Knowledge Bases XIX
Vol. 165. A.R. Lodder and L. Mommers (Eds.), Legal Knowledge and Information Systems – JURIX 2007: The Twentieth Annual Conference
Vol. 164. J.C. Augusto and D. Shapiro (Eds.), Advances in Ambient Intelligence
Vol. 163. C. Angulo and L. Godo (Eds.), Artificial Intelligence Research and Development
Vol. 162. T. Hirashima et al. (Eds.), Supporting Learning Flow Through Integrative Technologies
Vol. 161. H. Fujita and D. Pisanelli (Eds.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the sixth SoMeT_07
Vol. 160. I. Maglogiannis et al. (Eds.), Emerging Artificial Intelligence Applications in Computer Engineering – Real World AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies
Vol. 159. E. Tyugu, Algorithms and Architectures of Artificial Intelligence
Vol. 158. R. Luckin et al. (Eds.), Artificial Intelligence in Education – Building Technology Rich Learning Contexts That Work
Vol. 157. B. Goertzel and P. Wang (Eds.), Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms – Proceedings of the AGI Workshop 2006
Vol. 156. R.M. Colomb, Ontology and the Semantic Web
Vol. 155. O. Vasilecas et al. (Eds.), Databases and Information Systems IV – Selected Papers from the Seventh International Baltic Conference DB&IS’2006
Vol. 154. M. Duží et al. (Eds.), Information Modelling and Knowledge Bases XVIII
Vol. 153. Y. Vogiazou, Design for Emergence – Collaborative Social Play with Online and Location-Based Media
Vol. 152. T.M. van Engers (Ed.), Legal Knowledge and Information Systems – JURIX 2006: The Nineteenth Annual Conference
Vol. 151. R. Mizoguchi et al. (Eds.), Learning by Effective Utilization of Technologies: Facilitating Intercultural Understanding
Vol. 150. B. Bennett and C. Fellbaum (Eds.), Formal Ontology in Information Systems – Proceedings of the Fourth International Conference (FOIS 2006)
Vol. 149. X.F. Zha and R.J. Howlett (Eds.), Integrated Intelligent Systems for Engineering Design
Vol. 148. K. Kersting, An Inductive Logic Programming Approach to Statistical Relational Learning
Vol. 147. H. Fujita and M. Mejri (Eds.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the fifth SoMeT_06
Vol. 146. M. Polit et al. (Eds.), Artificial Intelligence Research and Development
Vol. 145. A.J. Knobbe, Multi-Relational Data Mining
Vol. 144. P.E. Dunne and T.J.M. Bench-Capon (Eds.), Computational Models of Argument – Proceedings of COMMA 2006
ISSN 0922-6389
ECAI 2008 18th European Conference on Artificial Intelligence July 21–25, 2008, Patras, Greece Including
Prestigious Applications of Intelligent Systems (PAIS 2008)
Proceedings Edited by
Malik Ghallab INRIA, France
Constantine D. Spyropoulos NCSR Demokritos, Greece
Nikos Fakotakis University of Patras, Greece
and
Nikos Avouris University of Patras, Greece
Organized by the European Coordinating Committee for Artificial Intelligence (ECCAI) and the Hellenic Artificial Intelligence Society (EETN) Hosted by the University of Patras, Greece
Amsterdam • Berlin • Oxford • Tokyo • Washington, DC
© 2008 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher. ISBN 978-1-58603-891-5 Library of Congress Control Number: 2008905319 Publisher IOS Press Nieuwe Hemweg 6B 1013 BG Amsterdam Netherlands fax: +31 20 687 0019 e-mail:
[email protected] Distributor in the UK and Ireland Gazelle Books Services Ltd. White Cross Mills Hightown Lancaster LA1 4XS United Kingdom fax: +44 1524 63232 e-mail:
[email protected] Distributor in the USA and Canada IOS Press, Inc. 4502 Rachael Manor Drive Fairfax, VA 22032 USA fax: +1 703 323 3668 e-mail:
[email protected] LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
ECCAI Member Societies
ACIA (Spain) Catalan Association for Artificial Intelligence (Associació Catalana d’Intelligència Artificial)
ADUIS (Ukraine) Association of Developers and Users of Intelligent Systems
AEPIA (Spain) Spanish Association for Artificial Intelligence (Asociación Española para la Inteligencia Artificial)
AFIA (France) French Association for Artificial Intelligence (Association Française pour l’Intelligence Artificielle)
AIAI (Ireland) Artificial Intelligence Association of Ireland
AIIA (Italy) Italian Association for Artificial Intelligence (Associazione Italiana per l’Intelligenza Artificiale)
AISB (United Kingdom) Society for the Study of Artificial Intelligence and the Simulation of Behaviour
APPIA (Portugal) Portuguese Association for Artificial Intelligence (Associação Portuguesa para a Inteligência Artificial)
BAIA (Bulgaria) Bulgarian Artificial Intelligence Association
BCS-SGAI (United Kingdom) British Computer Society Specialist Group on Artificial Intelligence
BNVKI (Belgium/Netherlands) Belgian-Dutch Association for Artificial Intelligence (Belgisch-Nederlandse Vereniging voor Kunstmatige Intelligentie)
CSKI (Czech Republic) Czech Society for Cybernetics and Informatics (Ceská spolecnost pro kybernetiku a informatiku)
DAIS (Denmark) Danish Artificial Intelligence Society
EETN (Greece) Hellenic Artificial Intelligence Society
FAIS (Finland) Finnish Artificial Intelligence Society (Suomen Tekoälyseura ry)
GI/KI (Germany) German Informatics Association (Gesellschaft für Informatik; Sektion KI e.V.)
IAAI (Israel) Israeli Association for Artificial Intelligence
LANO (Latvia) Latvian National Organisation of Automatics (Latvijas Automatikas Nacionala Organizacija)
LIKS-AIS (Lithuania) Lithuanian Computer Society–Artificial Intelligence Section (Lietuvos Kompiuterininku Sajunga)
NJSZT (Hungary) John von Neumann Society for Computing Sciences (Neumann János Számítógéptudományi Társaság)
ÖGAI (Austria) Austrian Society for Artificial Intelligence (Österreichische Gesellschaft für Artificial Intelligence)
RAAI (Russia) Russian Association for Artificial Intelligence
SAIS (Sweden) Swedish Artificial Intelligence Society
SGAICO (Switzerland) Swiss Group for Artificial Intelligence and Cognitive Science (Schweizer Informatiker Gesellschaft)
SLAIS (Slovenia) Slovenian Artificial Intelligence Society (Slovensko drustvo za umetno inteligenco)
SSKI SAV (Slovak Republic) Slovak Society for Cybernetics and Informatics at Slovak Academy of Sciences (Slovenská spolocnost pre kybernetiku a informatiku pri Slovenskej akadémii vied)
ECAI 2008 Conference Chair Constantine D. Spyropoulos, Greece
Programme Committee Chair Malik Ghallab, France
Organizing Committee Chairs Nikos Fakotakis, Greece Nikos Avouris, Greece
Workshops Chairs Boi Faltings, Switzerland Ioannis Vlahavas, Greece
Demonstration Systems Chair Nikos Karacapilidis, Greece
Area Chairs Antoniou, Grigoris, Greece Benhamou, Frédéric, France Bessiere, Christian, France Console, Luca, Italy Cordier, Marie-Odile, France Dague, Philippe, France De Raedt, Luc, Belgium Flach, Peter, UK Geffner, Hector, Spain Horrocks, Ian, UK Ingrand, Felix, France Lakemeyer, Gerhard, Germany Lang, Jérôme, France Milano, Michela, Italy
Myllymaki, Petri, Finland Oliveira, Eugenio, Portugal Pazienza, Maria Teresa, Italy Saffiotti, Alessandro, Sweden Struss, Peter, Germany Thiébaux, Sylvie, Australia Torasso, Pietro, Italy Traverso, Paolo, Italy Trousse, Brigitte, France Uszkoreit, Hans, Germany Van Harmelen, Frank, The Netherlands Van Someren, Maarten, The Netherlands Verfaillie, Gérard, France
PAIS 2008 Chairs Nick Jennings, United Kingdom Alex Rogers, United Kingdom
PAIS Programme Committee Stuart Aitken, UK Joachim Baumeister, Germany Jeremy Baxter, UK Riccardo Bellazzi, Italy Michael Berger, Germany Stefan Bussmann, Germany Andrew Byde, UK Monique Calisti, Switzerland Simon Case, UK Pádraig Cunningham, Ireland Ian Dickinson, UK Partha Dutta, UK
Floriana Esposito, Italy Robert Ghanea-Hercock, UK Josep Lluis Arcos, Spain Simon Maskell, UK David Nicholson, UK Michal Pechoucek, Czech Republic Nicola Policella, Germany Sarvapali Ramchurn, UK Oliviero Stock, Italy Jerome Thomas, France Simon Thompson, UK Franz Wotawa, Austria
ECAI Programme Committee Agirre, Eneko, ES Ågotnes, Thomas, NO Ait-Mokhtar, Salah, FR Alechina, Natasha, UK Alonso, Carlos, ES Alonso, Eduardo, UK Amgoud, Leila, FR Ananiadou, Sophia, UK Antunes, Luis, PT Ardissono, Liliana, IT Areces, Carlos, FR Assayag, Gerard, FR Avesani, Paolo, IT Baldwin, Timothy, AU Baroglio, Cristina, IT Bartak, Roman, CZ Basili, Roberto, IT Battiti, Roberto, IT Beaufils, Bruno, FR Beck, Christopher, CA Beetz, Michael, DE Beldiceanu, Nicolas, FR Ben Naim, Jonathan, FR Bertoli, Piergiorgio, IT Besnard, Philippe, FR Biau, Gérard, FR Biswas, Gautam, US Blockeel, Hendrik, BE Boella, Guido, IT Boissier, Olivier, FR Bonet, Blai, VE Bonnefon, J.-F., FR Booth, Richard, TH Bordeaux, Lucas, UK Borrajo, Daniel, ES Bouchon-Meunier, B., FR Bouillon, Pierrette, CH Bouquet, Paolo, IT Bourreau, Eric, FR Bozzano, Marco, IT Brafman, Ronen, IL Brazdil, Pavel, PT Brown, Ken, IE Brugali, Davide, IT Buffet, Olivier, FR Buntine, Wray, AU Busquets, Didac, ES Cali, Andrea, UK
Camps, Valerie, FR Cancedda, Nicola, FR Cardoso, Amilcar, PT Carlsson, Mats, SE Carroll, John, US Ceberio, Martine, US Chades, Iadine, FR Charpillet, Francois, FR Chevaleyre, Yann, FR Cholvy, Laurence, FR Christie, Marc, FR Coelho, Helder, PT Coghill, George, UK Cohen, David, UK Collet, Jacques, FR Comet, Jean-Paul, FR Conitzer, Vincent, US Cornet, Ronald, NL Cortes, Juan, FR Cortés, Ulises, ES Coste-Manière, Eve, FR Coste-Marquis, Sylvie, FR Crowley, James, FR Cuenca Grau, Bernardo, UK Cussens, James, UK David, Bertrand, FR De Giacomo, Giuseppe, IT De Jong, Hidde, FR De Kleer, Johan, US De Ruyter, Boris, NL de Vries, Gerben Klaas Dirk, NL Dechter, Rina, US Delgrande, James, CA Demazeau, Yves, FR Devy, Michel, FR Dignum, Frank, NL Dignum, Virginia, NL Dimitrakakis, Christos, NL Dombre, Etienne, FR Domingue, John, UK Domshlak, Carmel, IL Dousson, Christophe, FR Dressler, Oskar, DE Duckett, Tom, UK Dutech, Alain, FR Edelkamp, Stefan, DE Eisele, Andreas, DE Eiter, Thomas, AT
El Fallah, S. Amal, FR Elkind, Edith, UK Endriss, Ulle, NL Erdem, Esra, TR Esteva, Marc, ES Euzenat, Jérôme, FR Eveillard, Damien, FR Ferber, Jacques, FR Faltings, Boi, CH Fargier, Hélène, FR Feelders, Ad, NL Fern, Alan, US Fernandez-Madrigal, J.-A, ES Ferrane, Isabelle, FR Ferré, Sébastien, FR Finzi, Alberto, IT Fischer, Klaus, DE Fisher, Michael, UK Forbus, Ken, US Fornara, Nicoletta, CH Fox, Maria, UK Frank, Eibe, NZ Frasconi, Paolo, IT Friedrich, Gerhard, AT Fuernkranz, Johannes, DE Gama, Joao, PT Gebhard, Patrick, DE Gent, Ian, UK Ghidini, Chiara, IT Giordana, Attilio, IT Giordano, Laura, IT Giovannucci, Andrea, ES Giunchiglia, Enrico, IT Gleizes, Marie-Pierre, FR Glimm, Birte, UK Godo, Lluis, ES Goethals, Bart, BE Gordillo, Jose-Luis, MX Governatori, Guido, AU Grastien, Alban, AU Gribonval, Rémi, FR Grobelnik, Marko, SI Gros, Patrick, FR Grosclaude, Irene, FR Grossi, Davide, LU Grunwald, Peter, NL Guéré, Emmanuel, FR Haarslev, Volker, CA
Haase, Peter, DE Habet, Djamal, FR Hajicova, Eva, CZ Hansen, Eric, US Harrenstein, Paul, DE Haslum, Patrik, AU Haton, Jean-Paul, FR Hayes, Pat, US Helmert, Malte, DE Hernandez, Daniel, DE Hernandez-Orallo, Jose, ES Hertzberg, Joachim, DE Herzig, Andreas, FR Hitzler, Pascal, DE Hofbaur, Michael, AT Hoffmann, Joerg, AT Hollink, Vera, NL Hoos, Holger, CA Hosobe, Hiroshi, JP Hu, Wei, CN Huang, Jinbo, AU Huang, Zhisheng, NL Huget, Marc-Philippe, FR Hunter, Aaron, CA Hunter, Anthony, UK Hustadt, Ullrich, UK Infantes, Guillaume, US Ironi, Liliana, IT Isaac, Antoine, NL Jaeger, Manfred, DK Jaffar, Joxan, SG Jannin, Pierre, FR Jonsson, Anders, ES Julio, Alferes Jose, PT Junker, Ulrich, FR Jéron, Thierry, FR Kayser, Daniel, FR Kalech, Meir, US Kalfoglou, Yannis, UK Kalyanpur, Aditya, US Kaplunova, Alissa, DE Karlsson, Lars, SE Kaski, Samuel, FI Kazakov, Yevgeny, UK Kern-Isberner, Gabriele, DE Kersting, Kristian, DE Klein, Michel, NL Koehn, Philipp, UK Koivisto, Mikko, FI Kok, Joost, NL
Konieczny, Sébastien, FR Koubarakis, Manolis, GR Krose, Ben, NL Krüger, Antonio, DE Kudenko, Daniel, UK Kuesters, Ralf, DE Lachiche, Nicolas, FR Lacroix, Simon, FR Lafortune, Stephane, US Lallouet, Arnaud, FR Lamperti, Gianfranco, IT Lanfranchi, Vitaveska, UK Larranaga, Pedro, ES Lavrac, Nada, Slovenia Lechevallier, Yves, FR Lecoutre, Christophe, FR Lembo, Domenico, IT Lesperance, Yves, CA Levene, Mark, UK Lima, Pedro, PT Liz, Sonenberg, AU Long, Derek, UK Longin, Dominique, FR Lorini, Emiliano, FR Lucas, Peter, NL Luis, Correia, PT Lukasiewicz, Thomas, UK Lutz, Carsten, DE López de Mántaras, R., ES Mackay, Wendy, FR Magro, Diego, IT Malerba, Donato, IT Manya, Felip, ES Marchand, Hervé, FR Marquis, Pierre , FR Martelli, Alberto, IT Massa, Paolo, IT Massimo, Zanzotto F., IT Maudet, Nicolas, FR McNeill, Fiona, UK Meisels, Amnon, IL Mendes, Rui, PT Mengin, Jerome, FR Meo, Rosa, IT Meseguer, Pedro, ES Meyer, Tommie, ZA Michel, Laurent, US Milicic, Maja, DE Mille, Alain, FR Mobasher, Bamshad, US
Moeller, Ralf, DE Monfroy, Eric, CL Mosterman, Pieter, US Motik, Boris, UK Mouaddib, Abdel-Illah, FR Muggleton, Stephen, UK Màrquez, Lluís, ES Napoli, Amedeo, FR Narasimhan, Sriram, US Nardi, Daniele, IT Nayak, Abhaya, AU Neumann, Guenter, DE Niemela, Ilkka, FI Nijholt, Anton, NL Nijssen, Siegfried, BE Nivre, Joakim, SE Noirhomme, Monique, BE Nunes, Luís, PT Nyberg, Mattias, SE O’Sullivan, Barry, IE Oddi, Angelo, IT Oepen, Stephan, NO Omicini, Andrea, IT Oriolo, Giuseppe, IT Ossowski, Sascha, ES Ozturk, Escoffier M., FR Pagnucco, Maurice, AU Palacios, Hector, ES Paliouras, Georgios, GR Pan, Jeff, UK Paolucci, Mario, IT Paquet, Thierry, FR Parsia, Bijan, UK Paternò, Fabio, IT Patino Vilchis, Jose Luis, FR Paula, Rocha Ana, PT Payne, Terry, UK Peek, Niels, NL Peischl, Bernhard, AT Pena, Jose, SE Pencolé, Yannick, FR Peppas, Pavlos, GR Perini, Anna, IT Perron, Laurent, FR Petrelli, Daniela, UK Pfahringer, Bernhard, NZ Pianesi, Fabio, IT Picardi, Claudia, IT Pirri, Fiora, IT Poesio, Massimo, IT
Poibeau, Thierry, FR Portinale, Luigi, IT Pralet, Cédric, FR Price, Chris, UK Provan, Gregory, IE Pulido, Junquera B., ES Pulman, Stephen, UK Putnik, Goran, PT Pélachaud, Catherine, FR Quiniou, René, FR Quinou, Rene, FR Regin, Jean-Charles, FR Reis, Luis Paulo, PT Remondino, Marco, IT Renz, Jochen, AU Retore, Christian, FR Ricci, Francesco, IT Rintanen, Jussi, AU Robertson, Dave, UK Rochart, Guillaume, FR Roli, Andrea, IT Roos, Teemu, FI Rosati, Riccardo, IT Rosec, Olivier, FR Rossi, Francesca, IT Rousset, Marie-Christine, FR Rudova, Hana, CZ Ruml, Wheeler, US Sabbadin, Régis, FR Sabou, Marta, UK Sabouret, Nicolas, FR Sachenbacher, Martin, DE Salido, Miguel, ES Sanchez, Daniel, ES Sanner, Scott, AU Sattler, Uli, UK Saubion, Frederic, FR Sauro, Luigi, IT
Saïs, Lakhdar, FR Schaub, Torsten, DE Schiex, Thomas, FR Schlobach, Stefan, NL Schmid, Helmut, DE Schulte, Christian, SE Schulte, im Walde S., DE Schumann, Anika, AU Schwind, Camilla, FR Sellmann, Meinolf, US Semeraro, Giovanni, IT Serafini, Luciano, IT Serrurier, Mathieu, FR Shapiro, Steven, CA Shvaiko, Pavel, IT Sidobre, Daniel, FR Siegel, Anne, FR Simeon, Nicola, FR Simon, Laurent, FR Simonis, Helmut, IE Simov, Kiril, Bulgaria Smith, Barbara, UK Sprinkhuizen-Kuyper I., NL Stamou, Giorgos, GR Stede, Manfred, DE Stergiou, Kostas, GR Stuckenschmidt, Heiner, DE Stumme, Gerd, DE Stumptner, Markus, AU Stylianou, Yannis, GR Teichteil-Königsbuch, F., FR Ten Teije, Annette, NL Terenziani, Paolo, IT Terna, Pietro, IT Terrioux, Cyril, FR Tessaris, Sergio, IT Theseider Dupré, Daniele, IT Thielscher, Michael, DE
Thonnat, Monique, FR Torta, Gianluca, IT Trave-Massuyes, L., FR Trombettoni, Gilles, FR Truszczynski, Miroslaw, US Tsoukias, Alexis, FR Van Atteveldt, Wouter, NL Van Beek, Peter, CA Van Ditmarsch, Hans, NZ Van Hage, Willem, NL Van Hentenryck, Pascal, US Van Hoeve, Willem-Jan, US Van den Bosch, Antal, NL Van der Torre, Leon, LU Verhagen, Harko, SE Viappiani, Paolo, CA Vidal, Thierry, FR Vidal, Vincent, FR Vincent, Nicole, FR Volz, Raphael, DE Wallace, Mark, AU Wang, Kewen, AU Wang, Shenghui, NL Webb, Nick, US Weibelzahl, Stephan, IE Weydert, Emil, LU Widmer, Gerhard, AT Wilks, Yorick, UK Williams, Mary-Anne, AU Wilson, Nic, IE Wotawa, Franz, AT Wrobel, Stefan, DE Yangarber, Roman, FI Yap, Roland, SG Yokoo, Makoto, JP Yu, Huizhen, FI Zancanaro, Massimo, IT Zanella, Marina, IT
Preface
Artificial Intelligence is a highly creative field. Numerous research areas in Computer Science that originated over the past fifty years within AI laboratories and were discussed at AI conferences are now completely independent and mature research domains, whose young practitioners may not even be acquainted with the AI affiliation. It is fortunate to see that, while disseminating and spreading out, the AI field per se remains very active. This is particularly the case in Europe. The ECAI series of conferences keeps growing. This 18th edition received more submissions than the previous ones. About 680 papers and posters were registered in the ECAI 2008 conference system, out of which 518 papers and 43 posters were actually reviewed. The program committee decided to accept:
• 121 full papers, an acceptance rate of 23%, and
• 97 posters.
Several submitted full papers have been accepted as posters. All posters, presented in these Proceedings as short papers, will have formal presentation slots in the technical sessions of the main program of the conference, as well as poster presentations within a specific session. The 561 reviewed submissions originated from 51 different countries, 35 of which are represented in the final program. The following table shows the number of submitted and accepted papers or posters per country, based on the contact author's affiliation. Country Australia Austria Belgium Brazil Bulgaria Canada Chile China Cyprus Czech Republic Denmark Egypt Finland France Germany Greece Hungary
Sub. Acc. 26 12 12 6 4 3 13 1 1 1 13 6 1 6 3 1 1 6 1 1 1 1 4 3 116 42 49 20 34 14 1
Country India Iran Ireland Israel Italy Japan Korea Luxembourg Malaysia Malta Mexico Morocco Netherlands New Zealand Norway Pakistan Poland
Sub. Acc. 2 5 1 13 6 6 2 43 19 9 4 2 4 2 2 1 1 1 1 1 1 23 11 1 2 1 1 4
Country Sub. Acc. Portugal 17 6 Romania 4 1 Russia 4 Saudi Arabia 1 Singapore 1 Slovenia 4 3 South Africa 2 Spain 35 12 Sweden 9 5 Switzerland 2 Taiwan 2 1 Thailand 1 Tunisia 5 1 Turkey 3 1 United Kingdom 46 19 United States 15 6 Venezuela 1
The distribution of the 561 submitted and the 218 accepted papers or posters over the reviewing areas (based on the first keyword chosen by the authors) is given below. With respect to previous ECAI conferences, one may notice a relative growth of the Machine Learning and Cognitive Modeling & Interaction areas. The rest of the distribution remains roughly stable, with marginal fluctuations, given that areas overlap and their frontiers are not sharp.
ECAI 2008 Conference Areas                      Papers Submitted   Papers Accepted
KR&R                                                   102                42
Machine Learning                                       102                32
Distributed & Multi-agents Systems                      92                37
Cognitive Modeling & Interaction                        57                17
Constraints and search                                  51                20
Model-based Reasoning and Diagnosis                     51                26
NLP                                                     47                18
Planning and scheduling                                 33                13
Perception, Sensing and Cognitive Robotics              14                 6
Uncertainty in AI                                       12                 7
Total                                                  561               218
The Prestigious Applications of Intelligent Systems (PAIS) conference, the subconference associated with ECAI, has also been very successful this year in terms of the number and quality of submitted papers. Its program committee received 35 submissions in total and accepted 11 full papers, plus 4 additional papers with short presentations. In conclusion, we are very happy to introduce you to the Proceedings of this 18th edition of ECAI, a conference that is growing and maintaining a high standard of quality. The success of this edition is due to the contribution and support of many colleagues. We would like to gratefully thank all those who helped make ECAI 2008 a tremendous success. The area chairs, the PAIS and workshop chairs, the workshop organizers, as well as the Systems Demonstration Chair were the key actors of this success; they managed a heavy workload in a timely and efficient manner. Many thanks in particular to Felix Ingrand, who acted not only as an area chair but also as a program co-chair throughout the overall process. PC members provided high quality reviews and contributed to detailed discussions of several papers before a decision was reached. Finally, to all the persons involved in the local organization of the conference, many thanks for a tremendous amount of excellent work and much appreciated help.
June 2008
Malik Ghallab Constantine Spyropoulos Nikos Fakotakis Nikos Avouris
Contents ECCAI Member Societies
v
Conference Organization
vii
ECAI Programme Committee
ix
Preface Malik Ghallab, Constantine D. Spyropoulos, Nikos Fakotakis and Nikos Avouris
xiii
I. Invited Talks Semantic Activity Recognition Monique Thonnat
3
Bayesian Methods for Artificial Intelligence and Machine Learning Zoubin Ghahramani
8
The Impact of Constraint Programming Pascal Van Hentenryck
9
Web Science George Metakides
10
II. Papers 1. Knowledge Representation and Reasoning Advanced Preprocessing for Answer Set Solving Martin Gebser, Benjamin Kaufmann, André Neumann and Torsten Schaub
15
A Generic Framework for Comparing Semantic Similarities on a Subsumption Hierarchy Emmanuel Blanchard, Mounira Harzallah and Pascale Kuntz
20
Complexity of Subsumption in the EL Family of Description Logics: Acyclic and Cyclic TBoxes Christoph Haase and Carsten Lutz
25
Reasoning About Dynamic Depth Profiles Mikhail Soutchanski and Paulo Santos
30
Comparing Abductive Theories Katsumi Inoue and Chiaki Sakama
35
Privacy-Preserving Query Answering in Logic-Based Information Systems Bernardo Cuenca Grau and Ian Horrocks
40
Optimizing Causal Link Based Web Service Composition Freddy Lécué, Alexandre Delteil and Alain Léger
45
Extending the Knowledge Compilation Map: Closure Principles Hélène Fargier and Pierre Marquis
50
Semantic Modularity and Module Extraction in Description Logics Boris Konev, Carsten Lutz, Dirk Walther and Frank Wolter
55
New Results for Horn Cores and Envelopes of Horn Disjunctions Thomas Eiter and Kazuhisa Makino
60
Belief Revision with Reinforcement Learning for Interactive Object Recognition Thomas Leopold, Gabriele Kern-Isberner and Gabriele Peters
65
A Formal Approach for RDF/S Ontology Evolution George Konstantinidis, Giorgos Flouris, Grigoris Antoniou and Vassilis Christophides
70
Modular Equivalence in General Tomi Janhunen
75
Description Logic Rules Markus Krötzsch, Sebastian Rudolph and Pascal Hitzler
80
Conflicts Between Relevance-Sensitive and Iterated Belief Revision Pavlos Peppas, Anastasios Michael Fotinopoulos and Stella Seremetaki
85
Conservativity in Structured Ontologies Oliver Kutz and Till Mossakowski
89
Removed Sets Fusion: Performing off the Shelf Julien Hué, Eric Würbel and Odile Papini
94
A Coherent Well-Founded Model for Hybrid MKNF Knowledge Bases Matthias Knorr, José Júlio Alferes and Pascal Hitzler
99
2. Machine Learning Prototype-Based Domain Description Fabrizio Angiulli
107
Online Rule Learning via Weighted Model Counting Frédéric Koriche
112
Focused Ensemble Selection: A Diversity-Based Method for Greedy Ensemble Selection Ioannis Partalas, Grigorios Tsoumakas and Ioannis Vlahavas
117
MTForest: Ensemble Decision Trees Based on Multi-Task Learning Qing Wang, Liang Zhang, Mingmin Chi and Jiankui Guo
122
Many-Valued Concept Lattices for Conceptual Clustering and Information Retrieval Nizar Messai, Marie-Dominique Devignes, Amedeo Napoli and Malika Smail-Tabbone
127
Online Optimization for Variable Selection in Data Streams Christoforos Anagnostopoulos, Dimitris K. Tasoulis, David J. Hand and Niall M. Adams
132
Sub Node Extraction with Tree Based Wrappers Stefan Raeymaekers and Maurice Bruynooghe
137
Automatic Recurrent ANN Development for Signal Classification: Detection of Seizures in EEGs Daniel Rivero, Julian Dorado, Juan Rabuñal and Alejandro Pazos
142
A Method for Classifying Vertices of Labeled Graphs Applied to Knowledge Discovery from Molecules Frédéric Pennerath, Géraldine Polaillon and Amedeo Napoli
147
Nonnegative Decompositions with Resampling for Improving Gene Expression Data Biclustering Stability Liviu Badea and Doina Ţilivea
152
Exploiting Locality of Interactions Using a Policy-Gradient Approach in Multiagent Learning Francisco S. Melo
157
A Fast Method for Property Prediction in Graph-Structured Data from Positive and Unlabelled Examples Susanne Hoche, Peter Flach and David Hardcastle
162
VCD Bounds for Some GP Genotypes José Luis Montaña
167
Robust Division in Clustering of Streaming Time Series Pedro Pereira Rodrigues and João Gama
172
3. Model-Based Diagnosis and Reasoning Generating Diagnoses from Conflict Sets with Continuous Attributes Emmanuel Benazera and Louise Travé-Massuyés
179
A Compositional Mathematical Model of Machines Transporting Rigid Objects Peter Struss, Axel Kather, Dominik Schneider and Tobias Voigt
184
Model-Based Diagnosis of Discrete Event Systems with an Incomplete System Model Xiangfu Zhao and Dantong Ouyang
189
Chronicles for On-Line Diagnosis of Distributed Systems Xavier Le Guillou, Marie-Odile Cordier, Sophie Robin and Laurence Rozé
194
Test Generation for Model-Based Diagnosis Gregory Provan
199
Observation-Subsumption Checking in Similarity-Based Diagnosis of Discrete-Event Systems Gianfranco Lamperti and Marina Zanella
204
Local Consistency and Junction Tree for Diagnosis of Discrete-Event Systems Priscilla Kan John and Alban Grastien
209
Hierarchical Explanation of Inference in Bayesian Networks that Represent a Population of Independent Agents Peter Šutovský and Gregory F. Cooper
214
Coupling Continuous and Discrete Event System Techniques for Hybrid System Diagnosability Analysis Mehdi Bayoudh, Louise Travé-Massuyès and Xavier Olive
219
A Probabilistic Analysis of Diagnosability in Discrete Event Systems Farid Nouioua and Philippe Dague
224
Temporal Logic Patterns for Querying Qualitative Models of Genetic Regulatory Networks Pedro T. Monteiro, Delphine Ropers, Radu Mateescu, Ana T. Freitas and Hidde de Jong
229
Fighting Knowledge Acquisition Bottleneck with Argument Based Machine Learning Martin Možina, Matej Guid, Jana Krivec, Aleksander Sadikov and Ivan Bratko
234
4. Cognitive Modeling and Interaction Automatic Page Turning for Musicians via Real-Time Machine Listening Andreas Arzt, Gerhard Widmer and Simon Dixon
241
CDL: An Integrated Framework for Context Specification and Recognition
246
Fulvio Mastrogiovanni, Antonello Scalmato, Antonio Sgorbissa and Renato Zaccaria Web Page Prediction Based on Conditional Random Fields Yong Zhen Guo, Kotagiri Ramamohanarao and Laurence A.F. Park
251
A Formal Model of Emotions: Integrating Qualitative and Quantitative Aspects Bas R. Steunebrink, Mehdi Dastani and John-Jules Ch. Meyer
256
Modeling Collaborative Similarity with the Signed Resistance Distance Kernel Jérôme Kunegis, Stephan Schmidt, Şahin Albayrak, Christian Bauckhage and Martin Mehlitz
261
Modeling the Dynamics of Mood and Depression Fiemke Both, Mark Hoogendoorn, Michel Klein and Jan Treur
266
Groovy Neural Networks Axel Tidemann and Yiannis Demiris
271
An Efficient Student Model Based on Student Performance and Metadata Arndt Faulhaber and Erica Melis
276
5. Natural Language Processing Reducing Bias Effects in DOP Parameter Estimation Evita Linardaki
283
Multilingual Evidence Improves Clustering-Based Taxonomy Extraction Hans Hjelm and Paul Buitelaar
288
Unsupervised Grammar Induction Using a Parent Based Constituent Context Model Seyed Abolghasem Mirroshandel and Gholamreza Ghassem-Sani
293
Word Sense Induction Using Graphs of Collocations Ioannis P. Klapaftis and Suresh Manandhar
298
Learning Context-Free Grammars to Extract Relations from Text Georgios Petasis, Vangelis Karkaletsis, Georgios Paliouras and Constantine D. Spyropoulos
303
Talking Points in Metaphor: A Concise Usage-Based Representation for Figurative Processing Tony Veale and Yanfen Hao
308
Semantic Decomposition for Question Answering Sven Hartrumpf
313
Finding Key Bloggers, One Post at a Time Wouter Weerkamp, Krisztian Balog and Maarten de Rijke
318
Why Is This Wrong? – Diagnosing Erroneous Speech Recognizer Output with a Two Phase Parser Bernd Ludwig and Martin Hacker
323
Task Driven Coreference Resolution for Relation Extraction Feiyu Xu, Hans Uszkoreit and Hong Li
328
WWW Sits the SAT: Measuring Relational Similarity on the Web Danushka Bollegala, Yutaka Matsuo and Mitsuru Ishizuka
333
Improved Statistical Machine Translation Using Monolingual Paraphrases Preslav Nakov
338
Orthographic Similarity Search for Dictionary Lookup of Japanese Words Lars Yencken and Timothy Baldwin
343
6. Uncertainty and AI From Belief Change to Preference Change Jérôme Lang and Leendert van der Torre
351
A General Model for Epistemic State Revision Using Plausibility Measures Jianbing Ma and Weiru Liu
356
Structure Learning of Markov Logic Networks Through Iterated Local Search Marenglen Biba, Stefano Ferilli and Floriana Esposito
361
Single-Peaked Consistency and Its Complexity Bruno Escoffier, Jérôme Lang and Meltem Öztürk
366
Belief Revision Through Forgetting Conditionals in Conditional Probabilistic Logic Programs Anbu Yue and Weiru Liu
371
Mastering the Processing of Preferences by Using Symbolic Priorities in Possibilistic Logic Souhila Kaci and Henri Prade
376
7. Distributed and Multi-Agents Systems Interaction-Oriented Agent Simulations: From Theory to Implementation Yoann Kubera, Philippe Mathieu and Sébastien Picault
383
Optimal Coalition Structure Generation in Partition Function Games Tomasz Michalak, Andrew Dowell, Peter McBurney and Michael Wooldridge
388
Coalition Structures in Weighted Voting Games Edith Elkind, Georgios Chalkiadakis and Nicholas R. Jennings
393
Agents Preferences in Decentralized Task Allocation Mark Hoogendoorn and Maria L. Gini
398
Game Theoretical Insights in Strategic Patrolling: Model and Algorithm in Normal-Form Nicola Gatti
403
Monitoring the Execution of a Multi-Agent Plan: Dealing with Partial Observability Roberto Micalizio and Pietro Torasso
408
A Hybrid Approach to Multi-Agent Decision-Making Paulo Trigo and Helder Coelho
413
Coalition Formation Strategies for Self-Interested Agents Thomas Génin and Samir Aknine
418
Of Mechanism Design and Multiagent Planning Roman van der Krogt, Mathijs de Weerdt and Yingqian Zhang
423
IAMwildCAT: The Winning Strategy for the TAC Market Design Competition Perukrishnen Vytelingum, Ioannis A. Vetsikas, Bing Shi and Nicholas R. Jennings
428
Multi-Agent Reinforcement Learning Algorithm with Variable Optimistic-Pessimistic Criterion Natalia Akchurina
433
As Safe as It Gets: Near-Optimal Learning in Multi-Stage Games with Imperfect Monitoring Danny Kuminov and Moshe Tennenholtz
438
A Heuristic Based Seller Agent for Simultaneous English Auctions Patricia Anthony and Edwin Law
443
A Truthful Two-Stage Mechanism for Eliciting Probabilistic Estimates with Unknown Costs Athanasios Papakonstantinou, Alex Rogers, Enrico H. Gerding and Nicholas R. Jennings
448
Goal Generation and Adoption from Partially Trusted Beliefs Célia da Costa Pereira and Andrea G.B. Tettamanzi
453
Adaptive Play in Texas Hold’em Poker Raphaël Maîtrepierre, Jérémie Mary and Rémi Munos
458
Theoretical and Computational Properties of Preference-Based Argumentation Yannis Dimopoulos, Pavlos Moraitis and Leila Amgoud
463
Norm Defeasibility in an Institutional Normative Framework Henrique Lopes Cardoso and Eugénio Oliveira
468
8. Constraints and Search SLIDE: A Useful Special Case of the CARDPATH Constraint Christian Bessiere, Emmanuel Hebrard, Brahim Hnich, Zeynep Kiziltan and Toby Walsh
475
Frontier Search for Bicriterion Shortest Path Problems L. Mandow and J.L. Pérez de la Cruz
480
Heuristics for Dynamically Adapting Propagation Kostas Stergiou
485
Near Admissible Algorithms for Multiobjective Search Patrice Perny and Olivier Spanjaard
490
Compressing Pattern Databases with Learning Mehdi Samadi, Maryam Siabani, Ariel Felner and Robert Holte
495
A Decomposition Technique for Max-CSP Hachémi Bennaceur, Christophe Lecoutre and Olivier Roussel
500
Fast Set Bounds Propagation Using BDDs Graeme Gange, Vitaly Lagoon and Peter J. Stuckey
505
A New Approach for Solving Satisfiability Problems with Qualitative Preferences Emanuele Di Rosa, Enrico Giunchiglia and Marco Maratea
510
Combining Binary Constraint Networks in Qualitative Reasoning Jason Jingshi Li, Tomasz Kowalski, Jochen Renz and Sanjiang Li
515
Solving Necklace Constraint Problems Pierre Flener and Justin Pearson
520
Vivifying Propositional Clausal Formulae Cédric Piette, Youssef Hamadi and Lakhdar Saïs
525
Hybrid Tractable CSPs Which Generalize Tree Structure Martin C. Cooper, Peter G. Jeavons and András Z. Salamon
530
Justification-Based Non-Clausal Local Search for SAT Matti Järvisalo, Tommi Junttila and Ilkka Niemelä
535
Multi-Valued Pattern Databases Carlos Linares López
540
Using Abstraction in Two-Player Games Mehdi Samadi, Jonathan Schaeffer, Fatemeh Torabi Asr, Majid Samar and Zohreh Azimifar
545
9. Planning and Scheduling A Practical Temporal Constraint Management System for Real-Time Applications Luke Hunsberger
553
Towards Efficient Belief Update for Planning-Based Web Service Composition Jörg Hoffmann
558
Genetic Optimization of the Multi-Location Transshipment Problem with Limited Storage Capacity Nabil Belgasmi, Lamjed Ben Saïd and Khaled Ghédira
563
Regression for Classical and Nondeterministic Planning Jussi Rintanen
568
Combining Domain-Independent Planning and HTN Planning: The Duet Planner Alfonso Gerevini, Ugur Kuter, Dana Nau, Alessandro Saetti and Nathaniel Waisbrot
573
Learning in Planning with Temporally Extended Goals and Uncontrollable Events André A. Ciré and Adi Botea
578
A Simulation-Based Approach for Solving Generalized Semi-Markov Decision Processes Emmanuel Rachelson, Gauthier Quesnel, Frédérick Garcia and Patrick Fabiani
583
Heuristics for Planning with Action Costs Revisited Emil Keyder and Héctor Geffner
588
Diagnosis of Simple Temporal Networks Nico Roos and Cees Witteveen
593
10. Perception, Sensing and Cognitive Robotics An Attentive Machine Interface Using Geo-Contextual Awareness for Mobile Vision Tasks Katrin Amlacher and Lucas Paletta
601
Learning Functional Object-Categories from a Relational Spatio-Temporal Representation Muralikrishna Sridhar, Anthony G. Cohn and David C. Hogg
606
Sequential Spatial Reasoning in Images Based on Pre-Attention Mechanisms and Fuzzy Attribute Graphs Geoffroy Fouquier, Jamal Atif and Isabelle Bloch
611
Automatic Configuration of Multi-Robot Systems: Planning for Multiple Steps Robert Lundh, Lars Karlsson and Alessandro Saffiotti
616
Structure Segmentation and Recognition in Images Guided by Structural Constraint Propagation Olivier Nempont, Jamal Atif, Elsa Angelini and Isabelle Bloch
621
Theoretical Study of Ant-Based Algorithms for Multi-Agent Patrolling Arnaud Glad, Olivier Simonin, Olivier Buffet and François Charpillet
626
Incremental Component-Based Construction and Verification of a Robotic System Ananda Basu, Matthieu Gallien, Charles Lesire, Thanh-Hung Nguyen, Saddek Bensalem, Félix Ingrand and Joseph Sifakis
631
Salience-Driven Contextual Priming of Speech Recognition for Human-Robot Interaction Pierre Lison and Geert-Jan Kruijff
636
III. Prestigious Applications of Intelligent Systems (PAIS) A New CBR Approach to the Oil Spill Problem Juan Manuel Corchado, Aitor Mata, Juan Francisco De Paz and David Del Pozo
643
QuestSemantics – Intelligent Search and Retrieval of Business Knowledge Ian Blacoe, Ignazio Palmisano, Valentina Tamma and Luigi Iannone
648
Intelligent Adaptive Monitoring for Cardiac Surveillance Lucie Callens, Guy Carrault, Marie-Odile Cordier, Elisa Fromont, François Portet and René Quiniou
653
A Decision Support System for Breast Cancer Detection in Screening Programs Marina Velikova, Peter J.F. Lucas, Nivea Ferreira, Maurice Samulski and Nico Karssemeijer
658
The Design, Deployment and Evaluation of the AnimalWatch Intelligent Tutoring System Paul R. Cohen, Carole R. Beal and Niall M. Adams
663
AI on the Move: Exploiting AI Techniques for Context Inference on Mobile Devices Adolfo Bulfoni, Paolo Coppola, Vincenzo Della Mea, Luca Di Gaspero, Danny Mischis, Stefano Mizzaro, Ivan Scagnetto and Luca Vassena
668
Two Stage Knowledge Discovery for Spatio-Temporal Radio-Emission Data Matthias Haringer, Lothar Hotz and Vera Kamp
673
Using Natural Language Generation Technology to Improve Information Flows in Intensive Care Units James Hunter, Albert Gatt, François Portet, Ehud Reiter and Somayajulu Sripada
678
Application and Evaluation of a Medical Knowledge System in Sonography (SONOCONSULT) Frank Puppe, Martin Atzmueller, Georg Buscher, Matthias Huettig, Hardi Luehrs and Hans-Peter Buscher
683
Automating Accreditation of Medical Web Content Vangelis Karkaletsis, Pythagoras Karampiperis, Konstantinos Stamatakis, Martin Labský, Marek Růžička, Vojtěch Svátek, Enrique Amigó Cabrera, Matti Pöllä, Miquel Angel Mayer, Angela Leis and Dagmar Villarroel Gonzales
688
Pattern Classification Techniques for Early Lung Cancer Diagnosis Using an Electronic Nose Rossella Blatt, Andrea Bonarini, Elisa Calabró, Matteo Matteucci, Matteo Della Torre and Ugo Pastorino
693
A BDD Approach to the Feature Subscription Problem T. Hadzic, D. Lesaint, D. Mehta, B. O’Sullivan, L. Quesada and N. Wilson
698
Continuous Plan Management Support for Space Missions: The RAXEM Case Amedeo Cesta, Gabriella Cortellessa, Michel Denis, Alessandro Donati, Simone Fratini, Angelo Oddi, Nicola Policella, Erhard Rabenau and Jonathan Schulster
703
The i-Walker: An Intelligent Pedestrian Mobility Aid R. Annicchiarico, C. Barrué, T. Benedico, F. Campana, U. Cortés and A. Martínez-Velasco
708
Mixture of Gaussians Model for Robust Pedestrian Images Detection Dymitr Ruta
713
IV. Short Papers 1. Knowledge Representation and Reasoning Deriving Explanations from Causal Information Ph. Besnard, M.-O. Cordier and Y. Moinard
723
A Hybrid Tableau Algorithm for ALCQ Jocelyne Faddoul, Nasim Farsinia, Volker Haarslev and Ralf Möller
725
Semantic Relatedness in Semantic Networks Laurent Mazuel and Nicolas Sabouret
727
HOOPO: A Hybrid Object-Oriented Integration of Production Rules and OWL Ontologies Georgios Meditskos and Nick Bassiliades
729
Rule-Based OWL Ontology Reasoning Using Dynamic ABOX Entailments Georgios Meditskos and Nick Bassiliades
731
Computability and Complexity Issues of Extended RDF Anastasia Analyti, Grigoris Antoniou, Carlos Viegas Damásio and Gerd Wagner
733
Automated Web Services Composition Using Extended Representation of Planning Domain Mohamad El Falou, Maroua Bouzid, Abdel-Illah Mouaddib and Thierry Vidal
735
Propositional Merging Operators Based on Set-Theoretic Closeness Patricia Everaere, Sébastien Konieczny and Pierre Marquis
737
Partial and Informative Common Subsumers in Description Logics Simona Colucci, Eugenio Di Sciascio, Francesco Maria Donini and Eufemia Tinelli
739
Prime Implicate-Based Belief Revision Operators Meghyn Bienvenu, Andreas Herzig and Guilin Qi
741
Approximate Structure Preserving Semantic Matching Fausto Giunchiglia, Mikalai Yatskevich, Fiona McNeill, Pavel Shvaiko, Juan Pane and Paolo Besana
743
Discovering Temporal Knowledge from a Crisscross of Timed Observations Nabil Benayadi and Marc Le Goc
745
Fred Meets Tweety Antonis Kakas, Loizos Michael and Rob Miller
747
Definability in Logic and Rough Set Theory Tuan-Fang Fan, Churn-Jung Liau and Duen-Ren Liu
749
WikiTaxonomy: A Large Scale Knowledge Resource Simone Paolo Ponzetto and Michael Strube
751
Computing ∈-Optimal Strategies in Bridge and Other Games of Sequential Outcome Pavel Cejnar
753
2. Machine Learning Classifier Combination Using a Class-Indifferent Method Yaxin Bi, Shenli Wu, Pang Xiong and Xuhui Shen
757
Reinforcement Learning with Classifier Selection for Focused Crawling Ioannis Partalas, Georgios Paliouras and Ioannis Vlahavas
759
Intuitive Action Set Formation in Learning Classifier Systems with Memory Registers L. Simões, M.C. Schut and E. Haasdijk
761
An Ensemble of Classifiers for Coping with Recurring Contexts in Data Streams Ioannis Katakis, Grigorios Tsoumakas and Ioannis Vlahavas
763
Content-Based Social Network Analysis Paola Velardi, Roberto Navigli, Alessandro Cucchiarelli and Mirco Curzi
765
Efficient Data Clustering by Local Density Approximation Marc-Ismaël Akodjènou and Patrick Gallinari
767
Gas Turbine Fault Diagnosis Using Random Forests Manolis Maragoudakis, Euripides Loukis, Panayotis-Prodromos Pantelides
769
How Many Objects?: Determining the Number of Clusters with a Skewed Distribution Satoshi Oyama and Katsumi Tanaka
771
Active Concept Learning for Ontology Evolution Murat Şensoy and Pınar Yolum
773
Determining Automatically the Size of Learned Ontologies Elias Zavitsanos, Sergios Petridis, Georgios Paliouras and George A. Vouros
775
Dynamic Multi-Armed Bandit with Covariates Nicos G. Pavlidis, Dimitris K. Tasoulis, Niall M. Adams and David J. Hand
777
Reinforcement Learning with the Use of Costly Features Robby Goetschalckx, Scott Sanner and Kurt Driessens
779
Data-Driven Induction of Functional Programs Emanuel Kitzelmann
781
CTRNN Parameter Learning Using Differential Evolution Ivanoe De Falco, Antonio Della Cioppa, Francesco Donnarumma, Domenico Maisto, Roberto Prevete and Ernesto Tarantino
783
3. Model-Based Diagnosis and Reasoning Incremental Diagnosis of DES by Satisfiability Alban Grastien and Anbulagan
787
Characterizing and Checking Self-Healability Marie-Odile Cordier, Yannick Pencolé, Louise Travé-Massuyès and Thierry Vidal
789
Improving Robustness in Consistency-Based Diagnosis Using Possible Conflicts Belarmino Pulido, Anibal Bregon and Carlos Alonso-González
791
Dependable Monitoring of Discrete-Event Systems with Uncertain Temporal Observations Gianfranco Lamperti and Marina Zanella
793
Distributed Repair of Nondiagnosability Anika Schumann, Wolfgang Mayer and Markus Stumptner
795
From Constraint Representations of Sequential Code and Program Annotations to Their Use in Debugging Mihai Nica and Franz Wotawa
797
Compressing Binary Decision Diagrams Esben Rune Hansen, S. Srinivasa Rao and Peter Tiedemann
799
Dependent Failures in Consistency-Based Diagnosis Jörg Weber and Franz Wotawa
801
Cost-Sensitive Iterative Abductive Reasoning with Abstractions Gianluca Torta, Daniele Theseider Dupré and Luca Anselma
803
Computation of Minimal Sensor Sets for Conditional Testability Requirements Gianluca Torta and Pietro Torasso
805
Combining Abduction with Conflict-Based Diagnosis Ildikó Flesch and Peter J.F. Lucas
807
4. Cognitive Modeling and Interaction An Activity Recognition Model for Alzheimer’s Patients: Extension of the COACH Task Guidance System B. Bouchard, P. Roy, A. Bouzouane, S. Giroux and A. Mihailidis
811
Not So New: Overblown Claims for ‘New’ Approaches to Emotion Dylan Evans
813
Emergence of Rules in Cell Assemblies of fLIF Neurons Roman V. Belavkin and Christian R. Huyck
815
ERS: Evaluating Reputations of Scientific Journals Émilie Samuel and Colin de la Higuera
817
Personal Experience Acquisition Support from Blogs Using Event-Depicting Images Keita Sato, Yoko Nishihara and Wataru Sunayama
819
Object Configuration Reconstruction from Descriptions Using Relative and Intrinsic Reference Frames H. Joe Steinhauer
821
Probabilistic Reinforcement Rules for Item-Based Recommender Systems Sylvain Castagnos, Armelle Brun and Anne Boyer
823
An Efficient Behavior Classifier Based on Distributions of Relevant Events Jose Antonio Iglesias, Agapito Ledezma, Araceli Sanchis and Gal Kaminka
825
ContextAggregator: A Heuristic-Based Approach for Automated Feature Construction and Selection Robert Lokaiczyk and Manuel Goertz
827
A Pervasive Assistant for Nursing and Doctoral Staff Alexiei Dingli and Charlie Abela
829
5. Natural Language Processing Author Identification Using a Tensor Space Representation Spyridon Plakias and Efstathios Stamatatos
833
Categorizing Opinion in Discourse Nicholas Asher, Farah Benamara and Yvette Yannick Mathieu
835
A Dynamic Approach for Automatic Error Detection in Generation Grammars Tim vor der Brück and Holger Stenzhorn
837
Answering Definition Question: Ranking for Top-k Chao Shen, Xipeng Qiu, Xuanjing Huang and Lide Wu
839
Ontology-Driven Human Language Technology for Semantic-Based Business Intelligence Thierry Declerck, Hans-Ulrich Krieger, Horacio Saggion and Marcus Spies
841
Evaluation Evaluation David M.W. Powers
843
6. Uncertainty and AI Using Decision Trees as the Answer Networks in Temporal Difference-Networks Laura-Andreea Antanas, Kurt Driessens, Jan Ramon and Tom Croonenborghs
847
An Efficient Deduction Mechanism for Expressive Comparative Preferences Languages Nic Wilson
849
An Analysis of Bayesian Network Model-Approximation Techniques Adamo Santana and Gregory Provan
851
7. Distributed and Multi-Agents Systems Verifying the Conformance of Agents with Multiparty Protocols Laura Giordano and Alberto Martelli
855
Simulated Annealing for Coalition Formation Helena Keinänen and Misa Keinänen
857
A Default Logic Based Framework for Argumentation Emanuel Santos and João Pavão Martins
859
An Empirical Investigation of the Adversarial Activity Model Inon Zuckerman, Sarit Kraus, Jeffrey S. Rosenschein
861
Addressing Temporal Aspects of Privacy-Related Norms Guillaume Piolle and Yves Demazeau
863
Evaluation of Global System State Thanks to Local Phenomenona Jean-Michel Contet, Franck Gechter, Pablo Gruer and Abder Koukam
865
Experience and Trust — A Systems-Theoretic Approach Norman Foo and Jochen Renz
867
Trust-Aided Acquisition of Unverifiable Information Eugen Staab, Volker Fusenig and Thomas Engel
869
BIDFLOW: A New Graph-Based Bidding Language for Combinatorial Auctions Madalina Croitoru, Cornelius Croitoru and Paul Lewis
871
Multi-Agent Reinforcement Learning for Intrusion Detection: A Case Study and Evaluation Arturo Servin and Daniel Kudenko
873
GR-MAS: Multi-Agent System for Geriatric Residences Javier Bajo, Juan M. Corchado and Sara Rodriguez
875
Agent-Based and Population-Based Simulation of Displacement of Crime (extended abstract) Tibor Bosse, Charlotte Gerritsen, Mark Hoogendoorn, S. Waqar Jaffry and Jan Treur
877
Organizing Coherent Coalitions Jan Broersen, Rosja Mastop, John-Jules Ch. Meyer and Paolo Turrini
879
A Probabilistic Trust Model for Semantic Peer-to-Peer Systems Gia-Hien Nguyen, Philippe Chatalic and Marie-Christine Rousset
881
Conditional Norms and Dyadic Obligations in Time Jan Broersen and Leendert van der Torre
883
Trust Aware Negotiation Dissolution Nicolás Hormazábal, Josep Lluis de la Rosa i Esteva and Silvana Aciar
885
On the Role of Structured Information Exchange in Supervised Learning Ricardo M. Araujo and Luis C. Lamb
887
Magic Agents: Using Information Relevance to Control Autonomy B. van der Vecht, F. Dignum and J.-J.Ch. Meyer
889
Infection-Based Norm Emergence in Multi-Agent Complex Networks Norman Salazar, Juan A. Rodriguez-Aguilar and Josep Ll. Arcos
891
Opponent Modelling in Texas Hold’em Poker as the Key for Success Dinis Félix and Luís Paulo Reis
893
8. Constraints and Search LRTA* Works Much Better with Pessimistic Heuristics Aleksander Sadikov and Ivan Bratko
897
Thinking Too Much: Pathology in Pathfinding Mitja Luštrek and Vadim Bulitko
899
Dynamic Backtracking for Distributed Constraint Optimization Redouane Ezzahir, Christian Bessiere, Imade Benelallam, El Houssine Bouyakhf and Mustapha Belaissaoui
901
Integrating Abduction and Constraint Optimization in Constraint Handling Rules Marco Gavanelli, Marco Alberti and Evelina Lamma
903
Symbolic Classification of General Multi-Player Games Peter Kissmann and Stefan Edelkamp
905
Redundancy in CSPs Assef Chmeiss, Vincent Krawczyk and Lakhdar Sais
907
Reinforcement Learning and Reactive Search: An Adaptive MAX-SAT Solver Roberto Battiti and Paolo Campigotto
909
A MAX-SAT Algorithm Porfolio Paulo Matos, Jordi Planes, Florian Letombe, João Marques-Silva
911
On the Practical Significance of Hypertree vs. Tree Width Rina Dechter, Lars Otten and Radu Marinescu
913
9. Planning and Scheduling A New Approach to Planning in Networks Jussi Rintanen
917
Detection of Unsolvable Temporal Planning Problems Through the Use of Landmarks E. Marzal, L. Sebastia and E. Onaindia
919
A Planning Graph Heuristic for Forward-Chaining Adversarial Planning Pascal Bercher and Robert Mattmüller
921
10. Perception, Sensing and Cognitive Robotics Vector Valued Markov Decision Process for Robot Platooning Matthieu Boussard, Maroua Bouzid and Abdel-Illah Mouaddib
925
Learning to Select Object Recognition Methods for Autonomous Mobile Robots Reinaldo A.C. Bianchi, Arnau Ramisa and Ramón López de Mántaras
927
Robust Reservation-Based Multi-Agent Routing Adriaan ter Mors, Xiaoyu Mao, Jonne Zutt, Cees Witteveen and Nico Roos
929
Automatic Animation Generation of a Teleoperated Robot Arm Khaled Belghith, Benjamin Auder, Froduald Kabanza, Philippe Bellefeuille and Leo Hartman
931
Planning, Executing, and Monitoring Communication in a Logic-Based Multi-Agent System Martin Magnusson, David Landén and Patrick Doherty
933
Author Index
935
I. Invited Talks
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-3
Semantic Activity Recognition
Monique Thonnat 1
Abstract. Automatically extracting semantics from visual data is a real challenge. We describe in this paper how recent work in cognitive vision leads to significant results in activity recognition for videosurveillance and video monitoring. In particular we present work performed in the domain of video understanding in our PULSAR team at INRIA in Sophia Antipolis. Our main objective is to analyse in real time video streams captured by static video cameras and to recognize their semantic content. We present a cognitive vision approach mixing 4D computer vision techniques and activity recognition based on a priori knowledge. Applications in videosurveillance and healthcare monitoring are shown. We conclude with current issues in cognitive vision for activity recognition.
1 INTRODUCTION
This paper is focused on activity recognition. Activity recognition is a hot topic in the academic field, not only due to scientific motivations but also due to strong demands coming from industry and society, in particular for videosurveillance and healthcare. In fact, there is an increasing need to automate the recognition of activities observed by visual sensors (usually CCD cameras, omnidirectional cameras, infrared cameras). More precisely we are interested in the real-time semantic interpretation of dynamic scenes observed by video cameras. We thus study spatio-temporal activities performed by mobile objects (e.g. human beings, animals or vehicles) interacting with the physical world.
What does it mean to understand a video? Is it just to perform statistics on the appearance of images and to recognize an image from a set of already seen images? If we really want to understand the activities performed by the physical objects, 2D analysis is not sufficient. We need to locate the physical objects in the 3D real world. The dynamics of the physical objects is a major cue for activity recognition. The computer vision community is very active in the domain of motion detection, mobile object tracking and, more recently, trajectory analysis. Very often these analyses are performed in the image plane and are thus dependent on the sensor parameters such as its field of view, position and orientation. However, for reliable activity recognition the dynamics of the physical objects must be computed in the 4D space.
Is there a unique objective interpretation of a dynamic scene? For instance the scenes shown in figures 1 and 2 can be interpreted more or less precisely as a function of the a priori knowledge of the observer. In the first case (shown in figure 1), without information on the location of the scene, one can recognize an indoor scene where two men are walking together towards a door; a videosurveillance expert knowing the location (a bank agency), its spatial configuration as well as the security rules will interpret the same scene as a bank attack, with the unauthorized person, together with an employee, accessing a forbidden area. In the second case (shown in figure 2), without information on the location of the scene, one can recognize a woman standing alone; a medical expert knowing the patient will interpret the same scene as an active elderly person preparing a meal in her kitchen. In fact, the interpretation of a video sequence is not unique but depends on the a priori knowledge of the observer and on his/her goal.

1 INRIA, France, email: [email protected]

Figure 1. A scene with different valid interpretations: two people walking together towards a door, or a bank attack with access to a forbidden area by an unauthorized person and an employee.

Figure 2. A scene with different valid interpretations: a person standing in a room, or an active elderly person preparing a meal in a kitchen.
2 4D APPROACH
We present a cognitive vision approach mixing 4D computer vision techniques and activity recognition based on a priori knowledge. The major issue in semantic interpretation of dynamic scenes is the gap between the subjective interpretation of data and the objective measures provided by the sensors.
Figure 3. From sensor data to high level interpretation; global structure of an activity monitoring system built with VSIP [1].
Our approach to address this problem is to keep a clear boundary between the application-dependent subjective interpretations and the objective analysis of the videos. We thus define a set of objective measures which can be extracted in real time from the videos, we propose formal models to enable users to express their activities of interest and we build matching techniques to bridge the gap between the objective measures and the activity models. Figure 3 shows the global structure of a videosurveillance system built with this approach. First, a motion detection step followed by frame-to-frame tracking is performed for each video camera. Then the tracked mobile objects coming from different video cameras with overlapping fields of view are fused into a unique 4D representation for the whole scene. Depending on the chosen application, a combination of one or more of the available trackers (individual, group and crowd trackers) is used. Then scenario recognition is performed by a combination of one or more of the available recognition algorithms (automaton based, Bayesian-network based, AND/OR tree based and temporal constraints based). Finally the system generates the alerts corresponding to the predefined recognized scenarios. For robust semantic interpretation of mobile object behaviour it is mandatory to rely on correct physical object type classification. It can be based on simple 3D models like parallelepipeds [12] or complex 3D human body configurations with posture models as in [2]. Figure 4 shows examples of such postures.
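To make the staged structure just described more concrete (per-camera motion detection and frame-to-frame tracking, fusion of cameras with overlapping fields of view into a single 4D scene representation, long-term tracking, and scenario recognition producing alerts), here is a minimal sketch in Python. It only illustrates the data flow; it is not the actual VSIP implementation, and every class, method and parameter name is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class MobileObject:
    """One tracked physical object in the fused 4D scene representation."""
    object_id: int
    object_type: str                        # e.g. "person", "vehicle"
    position_3d: Tuple[float, float, float]
    timestamp: float

class CameraPipeline:
    """Per-camera processing: motion detection followed by frame-to-frame tracking."""
    def __init__(self, camera_id: str, detector, tracker):
        self.camera_id = camera_id
        self.detector = detector            # returns candidate moving regions for one frame
        self.tracker = tracker              # links regions across consecutive frames
    def process(self, frame, timestamp: float) -> List[MobileObject]:
        regions = self.detector.detect(frame)
        return self.tracker.update(regions, timestamp)

class ActivityMonitoringSystem:
    """Fuses per-camera tracks into one scene and recognizes predefined scenarios."""
    def __init__(self, camera_pipelines, fusion, long_term_trackers, recognizers):
        self.camera_pipelines = camera_pipelines
        self.fusion = fusion                           # combines cameras with overlapping fields of view
        self.long_term_trackers = long_term_trackers   # e.g. individual, group and crowd trackers
        self.recognizers = recognizers                 # e.g. automaton, Bayesian network, AND/OR tree,
                                                       # or temporal-constraint based recognizers
    def step(self, frames: Dict[str, object], timestamp: float) -> List[str]:
        per_camera = [p.process(frames[p.camera_id], timestamp) for p in self.camera_pipelines]
        scene_objects = self.fusion.fuse(per_camera)   # unique 4D representation of the whole scene
        for tracker in self.long_term_trackers:
            scene_objects = tracker.update(scene_objects)
        alerts: List[str] = []
        for recognizer in self.recognizers:
            alerts.extend(recognizer.recognize(scene_objects, timestamp))
        return alerts                                  # alerts for the recognized scenarios
```

The detector, tracker, fusion and recognizer collaborators are deliberately left abstract: each corresponds to one box of Figure 3 and can be swapped depending on the application.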
Figure 4. Different 3D models of human body postures
3 3D MAP
We use 3D maps as a means to model the a priori knowledge of the physical environment captured by the sensors. More precisely, the 3D maps contain the a priori knowledge of the empty scenes:
• Video cameras: 3D position of the sensors, calibration matrix, fields of view, etc.
• 3D geometry: the geometry of the static structure of the empty scene (for instance the buildings and road structure for outdoor scenes or the walls and doors for indoor scenes) as well as the main static 3D objects (for instance the furniture in indoor scenes) and the 2D zones of interest. This geometry is defined in terms of 3D position, shape and volume.
• Semantic information: for each part of the map, semantic information is added, such as its type (e.g. 3D object, 2D zone), its characteristics (e.g. yellow, fragile) or its function (e.g. entrance zone, seat).
We can see in figure 5 a 2D map of an indoor flat and in figure 6 two partial views of the 3D map built for monitoring elderly at home. In this map, in addition to the main structure of the rooms (walls, doors, etc.), the equipment and the furniture are defined, as well as the information related to the sensors.
Figure 5. Top view of the flat
Figure 6. 3D map: the kitchen area and the top view of a flat for monitoring elderly at home
4 ACTIVITY MODELLING
In order to express the semantics of the activities a modelling effort is needed. The models correspond to all the knowledge needed by the system to recognize video events occurring in the scene. To allow security operators to easily define and modify their models, the description of the knowledge is declarative and intuitive
(in natural terms). We propose a video event ontology to share common concepts in video understanding and to decrease the effort of knowledge modelling.
4.1 The Video Event Ontology
The event ontology is a set of concepts for describing physical objects, events and relations between concepts. The physical objects are all the concepts used to describe objects of the real world in the scene observed by the sensors. The attributes of a physical object are those pertinent for the recognition; these attributes characterize the physical object. There are two types of physical objects: contextual objects (which are usually static and whose movement, whenever they are in motion, can be predicted using contextual information) and mobile objects (which can be perceived as moving in the scene and as initiating their motions, without the possibility to predict their movement). The events are all the concepts used to describe mobile object evolutions and interactions in a scene. The terms used to describe these concepts are grouped into two categories: state (including primitive/composite state) and event (including primitive/composite event and single/multi-agent event). A primitive state is a spatio-temporal property valid at a given instant or stable on a time interval which is directly inferred from audiovisual attributes of physical objects computed by low-level signal processing algorithms. A composite state is a combination of states. A primitive event is a change of states. A composite event is a combination of states and events. A single-agent event is an event involving a single mobile object. A multi-agent event is a composite event involving several (at least two) mobile objects with different motions. Currently this ontology contains 151 concepts used for different applications in video understanding. The ontology is implemented in Protégé so as to be independent of a particular activity recognition formalism.
4.2 Activity Models
The formalism for expressing an activity is directly based on the concepts of the video event ontology. A composite event model is composed of five parts: "physical objects" involved in the event (e.g. person, equipment, zones of interest); "components" corresponding to the sub-events composing the event; "forbidden components" corresponding to the events which should not occur during the main event; "constraints", i.e. conditions between the physical objects and/or the components (including symbolic, logical, spatial and temporal constraints, the latter with Allen interval algebra operators); and "alarms" describing the actions to be taken when the event is recognized. Primitive states, composite states and primitive events can be described using the same formalism. Please see [10] and [9] for more details of the formalism.
5 ACTIVITY RECOGNITION
The algorithm proposed in [9] and [10] makes it possible to process a data flow efficiently (i.e. in real time) and to recognize pre-defined activities. Alternative approaches based on probabilistic methods [6] or [7] can also be used. In the following we concentrate on the first approach because it is directly based on the formalism and the ontology presented in the previous section. The video event recognition algorithm recognizes which events are occurring using the primitive video events. To recognize an event composed of sub-events, given the event model, the recognition algorithm selects a set of physical objects matching the remaining physical object variables of the event model. The algorithm then looks back in the past for any previously recognized state/event that matches the first component of the event model. If these two recognized components verify the event model constraints (e.g. temporal constraints), the event is said to be recognized. In order to facilitate complex event recognition, after each event recognition, event templates are generated for all composite events whose last component corresponds to this recognized event. For more details see [9].
6 APPLICATIONS
This approach has been applied to a large set of applications in videosurveillance.
6.1 Videosurveillance
A typical example of complex activities in which we are interested is aircraft monitoring in apron areas (see figure 7). In this example the duration of the servicing activities around the aircraft is about one hour and the activities involve interactions between several ground vehicles and human operators. The goal is to recognize these activities through formal activity models (such as the one shown in figure 9) and data captured by a network of video cameras (such as the ones shown in figure 7). For more details, refer to [3] and the related European project website http://www.avitrack.net/.
Figure 7. Different views (a, b, c, d) of an apron area captured by video cameras for aircraft monitoring
6.2 Healthcare monitoring
In this application the objective is to monitor elderly people at home (see figure 10). In collaboration with gerontologists, we have modeled several primitive states, primitive events and composite events.
Figure 8. Activity recognition problem in airport: the main servicing operations around an aircraft (refuelling, baggage loading, power supply, etc...) and the location of the 8 video cameras (in blue)
First, we are interested in modelling events characteristic of critical situations such as falling down. Second, these events aim at detecting abnormal changes of behavior patterns such as depression. Given these objectives we have selected the activities that can be detected using video cameras [11]. We have modeled thirty-four video events. In particular, we have defined fourteen primitive states; four of them are related to the location of the person in the scene (e.g. inside the kitchen, inside the living room) and the ten remaining ones are related to the proposed 3D key human postures. We have also defined four primitive events corresponding to combinations of these primitive states: "standing up", which represents a change of state from sitting or slumping to standing; "sitting down", which represents a change of state from standing or bending to sitting on a chair; "sitting up", which represents a change of state from lying to sitting on the floor; and "lying down", which represents a change of state from standing or sitting on the floor to lying. We have also defined six primitive events such as "stay in the kitchen" and "stay in the living room". These primitive states and events are used to define further composite events. For this study, we have modeled ten composite events. In this paper, we present just two of them: "feeling faint" and "falling down". The model of the "feeling faint" event is shown below. It involves one physical object (one person), and it contains three 3D human posture components and constraints between these components.
CompositeEvent(PersonFeelingFaint,
  PhysicalObjects( (p: Person) )
  Components(
    (pStand: PrimitiveState Standing(p))
    (pBend: PrimitiveState Bending(p))
    (pSit: PrimitiveState Sitting Outstretched Legs(p)) )
  Constraints(
    (Sequence pStand; pBend; pSit)
    (pSit's Duration >= 10) )
  Alarm(
    AText("Person is Feeling Faint")
    AType("URGENT") ) )
"Feeling faint" model.
Figure 9. Activity recognition problem in airport: example of an activity model enabling to describe an unloading operation with a high-level language
We have also modelled the "falling down" event. There are different ways of describing a person falling down; thus, we have modelled the "falling down" event with three models. Falling down 1: a change of state through standing, sitting on the floor (with flexed or outstretched legs) and lying (with flexed or outstretched legs). Falling down 2: a change of state from standing to lying (with flexed or outstretched legs). Falling down 3: a change of state through standing, bending and lying (with flexed or outstretched legs). An example of the definition of the model "falling down 1" is shown below.
Figure 10. Healthcare monitoring
CompositeEvent(PersonFallingDown1,
  PhysicalObjects( (p: Person) )
  Components(
    (pStand: PrimitiveState Standing(p))
    (pSit: PrimitiveState Sitting Flexed Legs(p))
    (pLay: PrimitiveState Lying Outstretched Legs(p)) )
  Constraints(
    (pSit before meet pLay)
    (pLay's Duration >= 50) )
  Alarm(
    AText("Person is Falling Down")
    AType("VERYURGENT") ) )
"Falling down 1" model.
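To make the constraint part of such a model concrete, here is a rough Python sketch of how the temporal constraints of "falling down 1" could be checked over already-recognized primitive states. It is only an illustration under simplifying assumptions (one interval per posture, given as frame numbers); the names and thresholds are invented for the example and are not those of the actual platform.

recognized = [("Standing", 0, 120), ("Sitting Flexed Legs", 121, 140),
              ("Lying Outstretched Legs", 141, 400)]

def falling_down_1(states, min_lying_duration=50):
    # index the recognized primitive states by posture name
    seq = {name: (start, end) for name, start, end in states}
    needed = ["Standing", "Sitting Flexed Legs", "Lying Outstretched Legs"]
    if not all(name in seq for name in needed):
        return False
    (s1, e1), (s2, e2), (s3, e3) = (seq[name] for name in needed)
    ordered = e1 <= s2 and e2 <= s3              # "before/meet"-style sequencing
    long_enough = (e3 - s3) >= min_lying_duration  # pLay's duration constraint
    return ordered and long_enough

print(falling_down_1(recognized))   # True for this toy trace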
Figure 11 and figure 12 show respectively the camera view and the 3D visualization of the recognition of the ”feeling faint” event.
Figure 11. Recognition of the "feeling faint" event
Figure 12. 3D visualization of the recognition of the "feeling faint" event
Figure 13 and figure 14 show respectively the camera view and the 3D visualization of the recognition of the "falling down" event.
Figure 13. Recognition of the "falling down" event
Figure 14. 3D visualization of the recognition of the "falling down" event
7 CONCLUSION
We have presented a 4D semantic approach for activity recognition in dynamic scenes. There are still many open issues, among which a full theory of visual data interpretation and reliable techniques for 4D analysis able to deal with changing observation conditions and scene content. From an activity recognition point of view, the three main points are the development of shared operational ontologies, of formalisms for activity modelling with good properties such as scalability, and of learning techniques for model refinement. In particular, a large set of learning issues is raised by this 4D semantic approach, for instance: learning contextual variations for physical object detection and image segmentation [5], learning the structure of the activity models [8] or learning the visual concept detectors [4].
REFERENCES
[1] A. Avanzi, F. Bremond, C. Tornieri, and M. Thonnat, 'Design and assessment of an intelligent activity monitoring platform', EURASIP Journal on Applied Signal Processing, special issue on "Advances in Intelligent Vision Systems: Methods and Applications", 2005(14), 2359–2374, (August 2005).
[2] B. Boulay, F. Bremond, and M. Thonnat, 'Applying 3D human model in a posture recognition system', Pattern Recognition Letters, Special Issue on Vision for Crime Detection and Prevention, 27(15), 1788–1796, (2006).
[3] Florent Fusier, Valery Valentin, François Bremond, Monique Thonnat, Mark Borg, David Thirde, and James Ferryman, 'Video understanding for complex activity recognition', Machine Vision and Applications Journal, 18, 167–188, (2007).
[4] N. Maillot and M. Thonnat, 'Ontology based complex object recognition', Image and Vision Computing Journal, Special Issue on Cognitive Computer Vision, 26(1), 102–113, (2008).
[5] V. Martin and M. Thonnat, 'Learning contextual variations for video segmentation', in The 6th International Conference on Vision Systems (ICVS08), Santorini, Greece, (2008).
[6] G. Medioni, I. Cohen, F. Brémond, S. Hongeng, and G. Nevatia, 'Activity Analysis in Video', Pattern Analysis and Machine Intelligence (PAMI), 23(8), 873–889, (2001).
[7] N. Moenne-Loccoz, F. Brémond, and M. Thonnat, 'Recurrent Bayesian network for the recognition of human behaviors from video', in Third International Conference on Computer Vision Systems (ICVS 2003), volume LNCS 2626, pp. 44–53, Graz, Austria, (2003). Springer.
[8] A. Toshev, F. Brémond, and M. Thonnat, 'An a priori-based method for frequent composite event discovery in videos', in Proceedings of the 2006 IEEE International Conference on Computer Vision Systems, New York, USA, (January 2006).
[9] V-T. Vu, F. Brémond, and M. Thonnat, 'Automatic video interpretation: A novel algorithm for temporal scenario recognition', in The Eighteenth International Joint Conference on Artificial Intelligence (IJCAI'03), Acapulco, Mexico, (2003).
[10] V-T. Vu, F. Brémond, and M. Thonnat, 'Automatic video interpretation: A recognition algorithm for temporal scenarios based on pre-compiled scenario models', in The 3rd International Conference on Vision Systems (ICVS'03), Graz, Austria, (2003).
[11] N. Zouba, B. Boulay, F. Brémond, and M. Thonnat, 'Monitoring activities of daily living (ADLs) of elderly based on 3D key human postures', in The 4th International Cognitive Vision Workshop (ICVW08), Santorini, Greece, (2008).
[12] M. Zúniga, F. Brémond, and M. Thonnat, 'Fast and reliable object classification in video based on a 3D generic model', in The 3rd International Conference on Visual Information Engineering (VIE2006), pp. 433–441, Bangalore, India, (September 26–28, 2006).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-8
Bayesian Methods for Artificial Intelligence and Machine Learning
Zoubin Ghahramani
Department of Engineering, University of Cambridge, UK
Machine Learning Department, Carnegie Mellon University, USA
http://learning.eng.cam.ac.uk/zoubin
Abstract. Bayesian methods provide a framework for representing and manipulating uncertainty, for learning from noisy data, and for making decisions that maximize expected utility – components which are important to both AI and Machine Learning. However, although Bayesian methods have become more popular in recent years, there remains a good degree of skepticism with respect to taking a fully Bayesian approach. This talk will introduce fundamental topics in Bayesian statistics as they apply to machine learning and AI, and address some misconceptions about Bayesian approaches. I will then discuss some current work on non-parametric Bayesian machine learning, particularly in the area of unsupervised learning.
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-9
The Impact of Constraint Programming Pascal Van Hentenryck Brown University
Abstract. Constraint programming is a success story for artificial intelligence. It quickly moved from research laboratories to industrial applications and is in daily use to solve complex optimization problems throughout the world. At the same time, constraint programming continued to evolve, addressing new needs and opportunities. This talk reviews some recent progress in constraint programming, including its hybridization with other optimization approaches, the quest for more autonomous search, and its applications in a variety of nontraditional areas.
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-10
Web Science George Metakides
Abstract not available at time of printing.
II. Papers
1. Knowledge Representation and Reasoning
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-15
Advanced Preprocessing for Answer Set Solving
Martin Gebser and Benjamin Kaufmann and André Neumann and Torsten Schaub 1 2
Abstract. We introduce the first substantial approach to preprocessing in the context of answer set solving. The idea is to simplify a logic program while identifying equivalences among its relevant constituents. These equivalences are then used for building a compact representation of the program (in terms of Boolean constraints). We implemented our approach as well as a SAT-based technique to reduce Boolean constraints. This allows us to empirically analyze both preprocessing types and to demonstrate their computational impact.
1 INTRODUCTION
Answer Set Programming (ASP; [3]) has become an attractive paradigm for declarative problem solving. This is partly due to the availability of efficient off-the-shelf ASP solvers [9, 19]. In fact, modern ASP solvers rely on Boolean constraint solving technology [1, 8, 7], leading to a similar performance as advanced SAT solvers [17]. On the other hand, the attractiveness of ASP stems from its rich modeling language, allowing for an easy and elaborationtolerant handling of knowledge-intensive applications. In practice, an input program is usually run through multiple preprocessing steps. At first, a so-called grounder instantiates all variables, thus producing a ground logic program. Classical ASP solvers, such as smodels [19], more or less take the resulting program as is without doing further optimizations. In contrast, modern ASP solvers translate a ground program into a set of Boolean constraints (e.g., clauses) in order to exploit advanced SAT solving technology. Such translations necessitate the introduction of extra propositions (see below) in order to avoid an exponential blow-up. Also, this addition may result in exponentially smaller search spaces [16] and permits more succinct representations of loop constraints [14]. Nonetheless, the question arises in how far the introduced redundancy can be trimmed. While ASP solvers still lack full-fledged preprocessing techniques, they already constitute an integral part of many SAT solvers [2, 20, 10]. There are two principal ways to address preprocessing in ASP solving: the external one, aiming at the reduction of a ground program, and the internal one, (recurrently) optimizing its inner representation. Within modern ASP solvers, the latter can be done by adapting corresponding techniques from SAT. Hence, we concentrate in the sequel on the former approach, being specific to ASP. Thereby, we build upon work on program transformations and equivalence [4, 5, 11]. To be precise, we develop preprocessing techniques for ground logic programs under answer set semantics. The idea is to transform a program into a simpler one, along with an assignment and a relation expressing equivalences among the assignable constituents of the program. These equivalences are subsequently exploited when transforming the resulting program into Boolean constraints, represented as clauses. We implemented both our external and a SATbased internal reduction strategy within the ASP solver clasp [7]. This makes clasp the first ASP solver incorporating advanced pre-
processing techniques. Furthermore, our implementation allows us to empirically assess both the external and the internal approach to preprocessing, thus demonstrating their computational impact.
1 Affiliated with SFU, Canada, and Griffith University, Australia.
2 Universität Potsdam, August-Bebel-Str. 89, D-14482 Potsdam, Germany
2 BACKGROUND
A (normal) logic program over an alphabet A is a finite multiset³ of rules of the form a ← b1, . . . , bm, ∼cm+1, . . . , ∼cn, where a, bi, cj ∈ A are atoms for 0 < i ≤ m, m < j ≤ n. A literal is an atom a or its (default) negation ∼a. Furthermore, let ∼A = {∼a | a ∈ A} and ¬A = {¬a | a ∈ A}, where ¬a is used for (classical) negation in propositional formulas. For a rule r, let head(r) = a be the head of r and the multiset body(r) = {b1, . . . , bm, ∼cm+1, . . . , ∼cn} be the body of r. Given a (multi)set B of literals, let B+ = {a ∈ A | a ∈ B} and B− = {a ∈ A | ∼a ∈ B}. The set of atoms occurring in a logic program Π is denoted by atom(Π), and body(Π) = {body(r) | r ∈ Π}. Also, we define body(a) = {body(r) | r ∈ Π, head(r) = a}.
Following [18], we characterize the answer sets of a logic program Π by the models of the completion [6] and loop formulas of Π. As mentioned above, in practice, this involves introducing extra propositions pB for bodies B. Given a program Π over A, its completion formula is then defined as follows:
CF(Π, A) = { a ↔ ⋁_{B ∈ body(a)} pB | a ∈ A } ∪ { pB ↔ ⋀_{b ∈ B+} b ∧ ⋀_{c ∈ B−} ¬c | B ∈ body(Π) }.   (1)
A loop is a (nonempty) set of atoms that circularly depend upon each other in a program's positive atom dependency graph [18]. The set of all loops of Π is denoted by loop(Π). If loop(Π) = ∅, then Π is said to be tight [12]. The loop formula of some L ∈ loop(Π) is
LF(Π, L) = (⋁_{a ∈ L} a) → (⋁_{a ∈ L, B ∈ body(a), B+ ∩ L = ∅} pB),
and LF(Π) = {LF(Π, L) | L ∈ loop(Π)}. The bodies contributing to the consequent of a loop formula provide external support for the antecedent's atoms. An atom is said to be unfounded if it belongs to the antecedent of a loop formula whose consequent is ⊥, expressing the absence of external support. We represent (classical) models by their set of entailed propositions, and let M(F) stand for the set of all models of F. For some alphabet A, we define M(F)|A = {M ∩ A | M ∈ M(F)}. Then, a set X ⊆ A is an answer set of a logic program Π over A if X ∈ M(CF(Π, A) ∪ LF(Π))|A. We let AS(Π) denote the set of all answer sets of Π. Note that, whenever Π is tight, we have X ∈ AS(Π) iff X ∈ M(CF(Π, A))|A.
Consider the following program Π over A = {a, . . . , f}: {a ←; b ← a, ∼c; c ← ∼b, ∼d; e ← ∼c; e ← f; f ← a, e}. We get the following completion formula, CF(Π, A):
{a ↔ p0; b ↔ p1; c ↔ p2; d ↔ ⊥; e ↔ p3 ∨ p4; f ↔ p5} ∪ {p0 ↔ ⊤; p1 ↔ a ∧ ¬c; p2 ↔ ¬b ∧ ¬d; p3 ↔ ¬c; p4 ↔ f; p5 ↔ a ∧ e}.
The usage of multisets is motivated by the syntactic nature of our approach and the fact that grounders produce duplicates. For simplicity, we keep standard set notation for multiset operations.
CF(Π, A) has three models: {a, b, e, f, p0, p1, p3, p4, p5}, {a, c, p0, p2}, and {a, c, e, f, p0, p2, p4, p5}. Furthermore, program Π has one loop, {e, f}, yielding LF(Π) = {e ∨ f → p3}. This loop formula is falsified by {a, c, e, f, p0, p2, p4, p5}, thus {a, c, e, f} is not an answer set of Π. The other two models of CF(Π, A) satisfy LF(Π) and correspond to the answer sets {a, b, e, f} and {a, c} of Π. Finally, a (partial) Boolean assignment A over A ∪ 2^(A ∪ ∼A) is a set of possibly negated elements of its domain. We write Ā = {a ∈ A | ¬a ∈ A} ∪ {B ⊆ A ∪ ∼A | ¬B ∈ A} for the elements assigned false. For instance, A = {a, ¬d, ¬{a, ∼c}} assigns true to a and false to d as well as to the body {a, ∼c}, and Ā = {d, {a, ∼c}} contains all elements assigned false by A.
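For illustration, the completion of equation (1) can be built mechanically from the ground rules. The following Python sketch is written for this running example only (it is not taken from clasp or any ASP system); it constructs the body propositions p0, ..., p5 and the atom equivalences for the program above.

from itertools import count

# ground rules of the example, as (head, positive body, negative body)
rules = [("a", [], []), ("b", ["a"], ["c"]), ("c", [], ["b", "d"]),
         ("e", [], ["c"]), ("e", ["f"], []), ("f", ["a", "e"], [])]
atoms = ["a", "b", "c", "d", "e", "f"]

ids = count()
body_prop = {}        # (positive body, negative body) -> proposition name
completion = []       # textual rendering of CF(Pi, A), for illustration only
for head, pos, neg in rules:
    key = (tuple(pos), tuple(neg))
    if key not in body_prop:
        body_prop[key] = "p%d" % next(ids)
        lits = pos + ["not " + c for c in neg]
        completion.append(body_prop[key] + " <-> " + (" and ".join(lits) or "true"))
for a in atoms:
    props = [body_prop[(tuple(p), tuple(n))] for h, p, n in rules if h == a]
    completion.append(a + " <-> " + (" or ".join(props) or "false"))
print("\n".join(completion))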
3 PREPROCESSING
Our initial goal is to turn a given program Π over an alphabet A into a simplified program Π , a partial assignment A, and an equivalence relation E on the atoms and bodies in Π . More formally, we transform a triple (Π, ∅, ∅) into (Π , A, E). Thereby, Π is obtained from Π by program transformations, mainly involving rule eliminations and body modifications. The semantics of the original program Π is captured by Π along with assignment A and E, where the latter is also exploited to generate a compact representation of Π in terms of Boolean constraints. Our transformation rules, shown in Table 1, are grouped into four building blocks: s = {(s0 ), . . . , (s15 )}, e = {(e16 ), . . . , (e27 )}, a = {(a28 ), . . . , (a35 )}, and u = {(u36 )}. (Note that many of them are subject to conditions, given in the rightmost column.) Roughly, the rules in s permit elementary simplifications, while e partitions atoms and bodies into equivalence classes. As a byproduct of this, all unclassified atoms are unfounded and set to false via (u36 ). Finally, the rules in a substitute the atoms in an equivalence class by a unique representative for that class. Note that s, e, a, and u are intended to be applied till saturation before proceeding to another block of transformations. In what follows, we gradually explain the different transformations and also provide examples. To begin with, rules (s0 ) to (s10 ) build upon well-known program transformations [4, 5, 11]. Let T →∗ T represent the computation of a fixpoint T by repeated applications of → to T . Then, s →∗ amounts to computing the fixpoint of Fitting’s operator [13]. s In addition, →∗ makes assignments to bodies and simplifies the program at hand. Finally, rules (s11 ) to (s15 ) preserve the correspondence between the program Π and its associated assignment A. s For Π0 = {a ←; b ← a, ∼c; c ← ∼b, ∼d}, we get (Π0 , ∅, ∅) →∗ (Π1 , A1 , ∅), where Π1 = {b ← ∼c; c ← ∼b} and A1 = {a, d}. s In general, a fixpoint of → has the following syntactic properties. s Proposition 1 Let (Π, ∅, ∅) →∗ (Π , A, ∅), for logic program Π over alphabet A. Then, we have: 1. body(r) = ∅, for all r ∈ Π ; 2. body(a) = ∅, for all a ∈ atom(Π ); 3. (atom(Π ) ∪ body(Π )) ∩ (A ∪ A) = ∅; 4. A ∩ A = ∅; 5. {B S ⊆ A ∪ ∼A | B ∈ A ∪ A} ⊆ A; 6. B∈A\A (B + ∪ B − ) ⊆ atom(Π ). W W Using BF (Y ) = {( b∈B + b ∨ c∈B − c) | B ∈ Y }, we can capture the relationship between the original program Π and the reduced program Π along with assignment A as follows. s Proposition 2 Let (Π, ∅, ∅) →∗ (Π , A, ∅), for logic program Π over alphabet A. Then, we have AS (Π) = M(CF (Π , A\A)∪LF (Π )∪(A∩A)∪BF (A\A))|A . Rules (e16 ) to (e27 ) comprise the heart of our approach and build an equivalence relation on atoms and bodies. We represent equivalence classes as triples, viz., E = [a, B, C], where a is an atom
representative for E, B is a body (externally) supporting E, and C contains all atoms and bodies belonging to E. We denote the components of E by aE = a, BE = B, and CE = C. Thereby, ∅ denotes a null value, where aE = ∅ means that CE ∩ A = ∅ and BE = ∅ expresses that E is not (externally) supported. For a set E of equivalence classes, define:4 S S s EC = EC = [a,B,C]∈E,B =∅ C [a,B,C]∈E C S s + = B . EB [a,B,C]∈E,B =∅ Some classes in E are defined as dual to each other (and are finally represented by complementary propositional literals). In Table 1, the rules (e16 ) and (e17 ) each introduce a new equivalence class E along e and we assume both classes to be correlated via with its dual class E, e1 ; E2 , E e2 ; . . . ). Finally, we use E e to some unique name (e.g., E1 , E e = E. denote the dual class of E, and let E e Let us illustrate →∗ starting from (Π1 , A1 , ∅): e
→
E
Rule
(e16 ) b ← ∼c (e17 ) b ← ∼c (e18 ) b ← ∼c (e16 ) c ← ∼b (e20 ) (e17 ) c ← ∼b (e18 ) c ← ∼b
E1 E2 E3 E4 E5 E6 E7
= {E1 = E1 ∪ {E2 = {E1 = E3 ∪ {E3 e1 = {E = E5 ∪ {E4 e1 = {E E1 e2 E
e1 = [∅, ∅, ∅]} = [∅, {∼c}, {{∼c}}], E e2 = [∅, ∅, ∅] = [b, {∼c}, {b}], E } e1 , E e2 = [b, {∼c}, {b, {∼c}}], E } e3 = [∅, ∅, ∅]} = [∅, {∼b}, {{∼b}}], E e2 , E e3 = [∅, {∼b}, {{∼b}}], E1 , E } e4 = [∅, ∅, ∅] = [c, {∼b}, {c}], E } = [c, {∼b}, {c, {∼b}}], = [b, {∼c}, {b, {∼c}}], e3 = E e4 = [∅, ∅, ∅] =E }
e1 . We get two non-trivial, dual equivalence classes: E1 and E e1 is repreClass E1 is represented by b and supported by {∼c}; E sented by c and supported by {∼b}. Observe that (e16 ) and (e17 ) introduce equivalence classes and their duals, while (e18 ) and (e20 ) merge different classes. (For simplicity, trivial dual classes are kept.) e The overall proceeding of →∗ is support-driven, that is, rules are only taken into account if their positive body atoms have been classified. Moreover, each (vital) class [a, B, C] must be supported by some body B = ∅. To illustrate this, consider Π0 ∪ Π1 , where Π1 = {e ← ∼c; e ← f ; f ← e; g ← e, ∼f ; g ← h, ∼f ; h ← f, g} . s
We get (Π0 ∪ Π1 , ∅, ∅) →∗ (Π1 ∪ Π1 , A1 , ∅) and continue by ape plying →∗ to (Π1 ∪ Π1 , A1 , E7 ): e
→ (e17 ) (e16 ) (e21 ) (e17 ) (e16 ) (e21 ) (e19 )
E
Rule e ← ∼c f ←e f ←e e←f f ←e
E1 E2 E3 E4 E5 E6 E7
(e22 ) g ← e,∼f E7
= E7 ∪ = E1 ∪ = E7 ∪ = E3 ∪ = E4 ∪ = E3 ∪ = E7 ∪
{E1 {E2 {E1 {E3 {E4 {E3 {E1 e E 1
e = [∅, ∅, ∅] } = [e, {∼c}, {e}], E 1 e = [∅, ∅, ∅] } = [∅, {e}, {{e}}], E 2 e , E e = [e, {∼c}, {e, {e}}], E } 1 2 e = [f, {e}, {f }], E3 = [∅, ∅, ∅] } e = [∅, ∅, ∅]} = [∅, {f }, {{f }}], E 4 e , E e = [f, {e}, {f, {f }}], E } 3 4 = [e, {∼c}, {e, {e}, f, {f }}], e = E e = E e = [∅, ∅, ∅] =E } 2 3 4
We thus get (Π2 , A1 , E7 ), where Π2 = Π1 ∪ (Π1 \ {g ← e, ∼f }). Set E7 augments E7 with E1 , revealing that e and f can be treated as equals. Note that the supporting body {∼c} does not belong to CE1 , given that bodies {e} and {f } in CE1 are involved in loop {e, f }. Notably, the application of (e22 ) to g ← e,∼f allows us to stop without classifying g and h, which are unfounded relative to Π2 . However, by delaying the removal of g ← e,∼f , an equivalence relation E7 such that g and h belong to classes E satisfying BE = ∅ 4
The superscript s indicates supporting bodies B ≠ ∅.
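As a rough illustration of the elementary simplifications (s0)-(s10), the following Python sketch iterates Fitting-style propagation to a fixpoint on ground rules given as (head, positive body, negative body) triples. It is a simplified stand-in written for this note, not the clasp implementation, and it omits the bookkeeping of assignments to bodies.

def simplify(rules):
    true_atoms, false_atoms, changed = set(), set(), True
    while changed:
        changed, kept = False, []
        for h, pos, neg in rules:
            if set(pos) & false_atoms or set(neg) & true_atoms:
                changed = True                       # body can never hold: drop the rule
                continue
            pos = [x for x in pos if x not in true_atoms]
            neg = [x for x in neg if x not in false_atoms]
            if not pos and not neg:                  # body holds: the head is a fact
                changed |= h not in true_atoms
                true_atoms.add(h)
                continue
            kept.append((h, pos, neg))
        heads = {h for h, _, _ in kept}
        occurring = {x for _, p, n in kept for x in p + n}
        for x in occurring - heads - true_atoms - false_atoms:
            false_atoms.add(x)                       # no rule left that can derive x
            changed = True
        rules = kept
    return rules, true_atoms, false_atoms

# For Pi0 = {a <-; b <- a,~c; c <- ~b,~d} this yields {b <- ~c; c <- ~b},
# with a assigned true and d assigned false, as in the example above.
print(simplify([("a", [], []), ("b", ["a"], ["c"]), ("c", [], ["b", "d"])]))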
s
(s0 ) (s1 ) (s2 ) (s3 ) (s4 ) (s5 ) (s6 ) (s7 ) (s8 ) (s9 ) (s10 ) (s11 ) (s12 ) (s13 ) (s14 ) (s15 )
(Π ∪ {r, r}, A, E) (Π ∪ {a ← , , B}, A, E) (Π ∪ {a ← b, ∼b, B}, A, E) (Π ∪ {a ← a, B}, A, E) (Π ∪ {a ←}, A, E) (Π, A, E) (Π ∪ {a ← ∼a, B}, A, E) (Π ∪ {a ← B}, A ∪ {a}, E) (Π ∪ {a ← B}, A ∪ {B}, E) (Π ∪ {a ← , B}, A ∪ {}, E) (Π ∪ {a ← ∼, B}, A ∪ {}, E) (Π, A ∪ {{, } ∪ B}, E) (Π, A ∪ {{b, ∼b} ∪ B}, E) (Π, A ∪ {, {} ∪ B}, E) (Π, A ∪ {, {} ∪ B}, E) (Π, A ∪ {B}, E)
→ s → s → s → s → s → s → s → s → s → s → s → s → s → s → s →
(e16 ) (e17 ) (e18 ) (e19 ) (e20 ) (e21 ) (e22 ) (e23 ) (e24 ) (e25 ) (e26 )
(Π ∪ {a ← B}, A, E) → e (Π ∪ {a ← B}, A, E) → e (Π ∪ {a ← B}, A, E ∪ {E, [a, B, C]}) → e (Π ∪ {a ← B}, A, E ∪ {E, [a, B, C]}) → e e (Π, A, E ∪ {E, E, [a, B, C]}) → e e [a, B, C]}) (Π, A, E ∪ {E, E, → e e (Π ∪ {a ← B}, A, E ∪ {E, E}) → e (Π, A, E ∪ {[a, B, C]}) → e (Π, A, E ∪ {[a, B, C]}) → e (Π ∪ {a ← B}, A, E ∪ {[a, ∅, C]}) → e (Π ∪ {a ← B}, A, E ∪ {[a , ∅, C ]}) →
e
e
(e27 ) (Π ∪ {a ← B}, A, E ∪ {[∅, ∅, C]})
→
e (a28 ) (Π ∪ {a ← B}, A, E ∪ {E, E})
→
e (a29 ) (Π ∪ {a ← b, B}, A, E ∪ {E, E})
→
(a30 ) (a31 ) (a32 ) (a33 ) (a34 ) (a35 )
e (Π ∪ {a ← b, B}, A, E ∪ {E, E}) e (Π ∪ {a ← ∼c, B}, A, E ∪ {E, E}) e (Π ∪ {a ← ∼c, B}, A, E ∪ {E, E}) e (Π, A ∪ {B}, E ∪ {E, E}) (Π, A ∪ {{b} ∪ B}}, E ∪ {E}) (Π, A ∪ {{∼c} ∪ B}}, E ∪ {E})
a
a
a
→ a → a → a → a → a → u
(u36 ) (Π, A, E)
→
(Π ∪ {r}, A, E) (Π ∪ {a ← , B}, A, E) (Π, A, E) (Π, A, E) (Π, A ∪ {a}, E) (Π, A ∪ {a}, E) (Π, A ∪ {{∼a} ∪ B}, E) (Π, A ∪ {a}, E) (Π, A ∪ {B}, E) (Π ∪ {a ← B}, A ∪ {}, E) (Π, A ∪ {}, E) (Π, A ∪ {{} ∪ B}, E) (Π, A, E) (Π, A ∪ {, B}, E) (Π, A ∪ {}, E) (Π, A ∪ {a, B}, E)
` ´ a ∈ (B + ∪ B − ) \ (atom(Π) ∪ A ∪ A) ´ ` + s ,B ∈ e = [∅, ∅, ∅]}) (Π ∪ {a ← B}, A, E ∪ {E = [∅, B, {B}], E B ∪ Es ⊆ EC / EC ´ ` + Bs s ,a ∈ e = [∅, ∅, ∅]}) (Π ∪ {a ← B}, A, E ∪ {E = [a, B, {a}], E B ∪ EB ⊆ EC / EC ` ´ (Π ∪ {a ← B}, A, E ∪ {E = [a, B, C ∪ CE ]}) body(a) ⊆ CE , CE ∩ atom(Π) = ∅ ` ´ (Π ∪ {a ← B}, A, E ∪ {E = [aE , BE , CE ∪ C]}) body(a) ⊆ CE , CE ∩ atom(Π) = ∅ ` ´ e (Π, A, E ∪ {E = [a, B, C ∪ CE ], E}) B ∈ C, B + = ∅, B − ⊆ CEe , CE ∩ atom(Π) = ∅ ` ´ e (Π, A, E ∪ {E = [aE , BE , CE ∪ C], E}) B ∈ C, B + ⊆ CE , B − ⊆ CEe , CE ∩ atom(Π) = ∅ ` + ´ e (Π, A, E ∪ {E, E}) (B ∩ CE ) ∪ (B − ∩ CEe ) = ∅, (B + ∩ CEe ) ∪ (B − ∩ CE ) = ∅ ` ´ (Π, A, E ∪ {[a, ∅, C]}) B = ∅, B ∈ / body(Π) ´ ` s (Π, A, E ∪ {[a, ∅, C]}) B = ∅, B + ⊆ EC ´ ` + s s (Π ∪ {a ← B}, A, E ∪ {[a, B, C]}) B ∪ EB ⊆ EC ` s ⊆ Es , (Π ∪ {a ← B}, A, E ∪ {[a, B, C]}) {a, a } ⊆ C , a = a , B + ∪ EB C´ C = ({a, B} ∩ C ) ∪ (C \ (atom(Π) ∪ body(Π))) ´ ` s ⊆ Es (Π ∪ {a ← B}, A, E ∪ {[∅, B, C]}) B ∈ C, B + ∪ EB C ` e (Π, A, E ∪ {E, E}) a ∈ CE \ {aE }, {(a ← B ) ∈ Π S ∪ {a ← B} | a ∈ CE \ {aE },´ B + = ∅, a ∈ r∈Π∪{a←B} body(r)+ } = ∅ ` ← B ) ∈ Π | a ∈ C \ {a }, e (Π ∪ {a ← aE , B}, A, E ∪ {E, E}) b ∈ CE \ {aE }, {(aS E E ´ B + = ∅, a ∈ r∈Π∪{a←b,B} body(r)+ } = ∅ ` ´ e (Π ∪ {a ← ∼aEe , B}, A, E ∪ {E, E}) b ∈ CE \ {aE }, (b ← B ) ∈ Π, B + = ∅ ` ´ e (Π ∪ {a ← B}, A, E ∪ {E, E}) c ∈ CE , B + ∩ CEe = ∅ ` ´ e (Π ∪ {a ← ∼aE , B}, A, E ∪ {E, E}) c ∈ CE \ {aE }, B + ∩ CEe = ∅ ` + ´ e (Π, A, E ∪ {E, E}) (B ∩ CE ) ∪ (B − ∩ CEe ) = ∅, (B + ∩ CEe ) ∪ (B − ∩ CE ) = ∅ ` ´ (Π, A ∪ {{aE } ∪ B}}, E ∪ {E}) b ∈ CE \ {aE } ` ´ (Π, A ∪ {{∼aE } ∪ B}}, E ∪ {E}) c ∈ CE \ {aE } ` ´ s ∪ A) (Π, A ∪ {a}, E) a ∈ atom(Π) \ (EC
Transformation rules for preprocessing (where ∈ A ∪ A, ∼a = a, ∼a = a, and a = a).
Table 1.
could have been obtained as well. The latter again signals that g and h are unfounded, as in the case that they remain unclassified. The next results shed some light on the syntactic properties of the s e s e consecutive application of →∗ and →∗ , abbreviated by →∗ →∗ . s
e
Proposition 3 Let (Π, ∅, ∅) →∗ →∗ (Π , A, E), for logic program Π over alphabet A. Then, we have: 1. 2. 3. 4. 5.
` ´ a ∈ atom(Π) \ (A ∪ A), body(a) = ∅
s s EB ⊆ EC ⊆ atom(Π ) ∪ body(Π ); EC ∩ (A ∪ A) = ∅; CE ∩ CE = ∅, for all E, E ∈ E such that E = E ; (aE ← BE ) ∈ Π , for all E ∈ E such that aE = ∅, BE = ∅; s s body(r)+ ⊆ EC , for all r ∈ Π such that head (r) ∈ / EC .
We next show that our transformations preserve answer sets and that duality among equivalence classes carries forward to answer sets. s
e
Proposition 4 Let (Π, ∅, ∅) →∗ →∗ (Π , A, E), for logic program Π over alphabet A, and let X ∈ AS (Π). Then, we have: s 1. A ∩ A ⊆ X ⊆ (A ∩ A) ∪ EC ;
2. CE ∩A ⊆ X and CEe ∩X = ∅ or CEe ∩A ⊆ X and CE ∩X = ∅, e ⊆ E. for all {E, E} Equivalences and implicit or explicit unfoundedness of atoms (cf. E7 and E7 above) are exploited by the remaining transformations: (a28 ) to (a35 ) substitute equivalent atoms by the representative aE (or ∼aEe via rule (a30 )) for their class E, while (u36 ) assigns false to unfounded atoms. a u Although → and → leave program Π1 unchanged, they allow for further reducing Π2 in view of the obtained equivalence classes. We a u obtain (Π2 , A1 , E7 ) →∗ (Π3 , A1 , E7 ) →∗ (Π3 , A2 , E7 ), where Π3 = Π1 ∪ {e ← ∼c; e ← e; g ← h, ∼e; h ← e, g} and A2 = A1 ∪ {g, Sh} = {a, d, g, h}. Using E[X] = [a,B,C]∈E,C∩X =∅ (C ∩ A) for accumulating all atoms equivalent to members of X, we obtain the following result. s e a u Proposition 5 Let (Π, ∅, ∅) →∗ →∗ →∗ →∗ (Π , A, E), for logic program Π over alphabet A. Then, we have AS (Π) = {X ∪E[X]∪(A∩A) | X ∈ AS (Π )∩M(BF (A\A))} .
18
M. Gebser et al. / Advanced Preprocessing for Answer Set Solving
Finally, we consider the saturated result of preprocessing, where s e a u ∗ Π → (Π , A, E) stands for (Π, ∅, ∅) ( →∗ →∗ →∗ →∗ )∗ (Π , A, E). Let σ = {y1 /y1 , . . . , yn /yn } denote a substitution, and let Yσ be Y with every occurrence of yi replaced by yi for 1 ≤ i ≤ n. This allows us to formulate the following termination and confluence result. Theorem 6 Let Π be a logic program over A. Then, we have: ∗ 1. Every derivation → from Π terminates with some (Π , A, E) such that no transformation rule in Table 1 is applicable to (Π , A, E); ∗ ∗ 2. If Π → (Π1 , A1 , E1 ) and Π → (Π2 , A2 , E2 ), then (A1 ∩ A) ∪ E[A1 ] = (A2 ∩A)∪E[A2 ], Π1 σ = Π2 , and (A1 \A)σ = A2 \A, where σ = {a/aE | E ∈ E2 , a ∈ CE ∩ A}; ∗ ∗ e1 } ⊆ E1 3. If Π → (Π1 , A1 , E1 ), Π → (Π2 , A2 , E2 ), and {E1 , E e such that BE1 = ∅, then {E2 , E2 } ⊆ E2 such that BE2 = ∅, CE1 σ = CE2 σ, and CEe1 σ = CEe2 σ, where σ = {a/aE | E ∈ E2 , a ∈ CE ∩ A}. ∗
Reconsidering Π0 ∪ Π1 , we get (Π0 ∪ Π1 ) → (Π1 , A2 , E ), where E contains two vital classes, viz., E = [b, {∼c}, {b, {∼c}, e = [c, {∼b}, {c, {∼b}}], while all other e, {e}, f, {f }}] and E classes E ∈ E are such that BE = ∅. This outcome is independent from the order in which transformations are applied. Also note that all six rules of Π1 are removed by preprocessing, thus transforming non-tight program Π0 ∪ Π1 into tight program Π1 . Notably, the result of our transformations goes beyond the wellfounded model [21] of a logic program. ∗ Proposition 7 Let Π → (Π , A, E), for logic program Π over A, and let I ⊆ A ∪ A be the well-founded model of Π. Then, we have I ∩A ⊆ (A∩A)∪E[A] and I ∩A ⊆ (A \ (A ∪ E[A ∪ atom(Π )])). Similar to the known algorithms for computing a program’s well∗ founded model, → can be computed in quadratic time. In fact, if no program rule is removed (via rules other than (a28 )) after the initial s s e a u application of →∗ , a linear pass of →∗ →∗ →∗ →∗ suffices to coms∗ e∗ a∗ u∗ ∗ ∗ pute →, while iteration, viz., ( → → → → ) , is needed otherwise. We now take advantage of the result of our initial preprocessing phase, (Π , A, E), for obtaining a compact completion formula. To this end, we use E to induce a variable mapping ν : atom(Π ) ∪ {pB | B ∈ body(Π )} → V ∪ V, where V is an alphae ⊆ E such that BE = ∅, we bet of variable names. For each {E, E} e as follows: select a unique v ∈ V and map the elements of E and E 1. ν(y) = v iff y ∈ (CE ∩ atom(Π )) ∪ {pB | B ∈ CE ∩ body(Π )}; 2. ν(y) = v iff y ∈ (CEe ∩ atom(Π )) ∪ {pB | B ∈ CEe ∩ body(Π )}. Practically, ν amounts to an abstraction of the original program, as used for the internal representation within ASP solvers. We then use ν for inducing a substitution σν = {y/ν(y) | y ∈ atom(Π ) ∪ {pB | B ∈ body(Π )}}. For (Π1 , A2 , E ), we get mapping ν1 = {b → v; c → v; p{∼c} → v; p{∼b} → v}, using only one variable v. Having mapping ν induced by (Π , A, E), we express the completion and loop formulas of Π using the variables in V: ` VFν (Π , A, E) = LF (Π ) ∪ BF (A \ A) ∪ ´ CF (Π , atom(Π ) ∪ (A \ (A ∪ E[A ∪ atom(Π )]))) σν . Note that applying σν leaves the introduction of body propositions (cf. (1)) implicit. In our example, we get VFν1 (Π1 , A2 , E ) = CF (Π1 , {b, c, d, g, h})σν1 = {v ↔ v; v ↔ v; d ↔ ⊥; g ↔ ⊥; h ↔ ⊥} . Note that LF (Π1 ) is empty (since Π1 is tight), and so is BF (A2 \A). Clearly, CF (Π1 , {b, c, d, g, h})σν1 possesses the models ∅ and {v}. Such models are linked to the atoms in an original program Π by
appeal to EFν (E) = {a ↔ ν(aE ) | E ∈ E, BE = ∅, a ∈ CE ∩ A}; e.g., EFν1 (E ) = {b ↔ v; e ↔ v; f ↔ v; c ↔ v}. Formally, we have the following result. ∗ Theorem 8 Let Π → (Π , A, E), for logic program Π over A, and let ν be a variable mapping induced by (Π , A, E). Then, we have AS (Π) = M((A ∩ A) ∪ E[A] ∪ VFν (Π , A, E) ∪ EFν (E))|A . For instance, for (Π1 , A2 , E ), ν1 , and A = {a, . . . , h}, we obtain M({a} ∪ ∅ ∪ VFν1 (Π1 , A2 , E ) ∪ EFν1 (E ))|A = {{a, b, e, f }, {a, c}}, which are the two answer sets of Π0 ∪ Π1 . Finally, note that our implementation within clasp takes advantage of the preprocessing result only for the initial construction of a compact completion formula, while loop formulas are not computed a priori, but only if they are used for propagation or conflict analysis.
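To give a flavour of how an equivalence relation E over atoms and bodies can be turned into the variable mapping ν, the following sketch merges equivalent constituents with a union-find structure and hands out one Boolean variable per pair of dual classes. The data and helper names are invented for the example and do not come from clasp.

class UnionFind:
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

uf = UnionFind()
# equivalences as in the running example: b, e, f and their bodies collapse
# into one class; c and the body {~b} form the dual class
for x, y in [("b", "e"), ("e", "f"), ("e", "{e}"), ("f", "{f}"),
             ("b", "{~c}"), ("c", "{~b}")]:
    uf.union(x, y)
dual = {uf.find("b"): uf.find("c"), uf.find("c"): uf.find("b")}

variables = {}
def nu(x):
    # map an atom or body to (variable, sign): dual classes share one variable
    root = uf.find(x)
    if root in variables:
        return variables[root], True
    if dual.get(root) in variables:
        return variables[dual[root]], False
    variables[root] = "v%d" % len(variables)
    return variables[root], True

for item in ["b", "e", "f", "{~c}", "c", "{~b}"]:
    print(item, "->", nu(item))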
4 EXPERIMENTS
We conducted systematic experiments on the benchmark sets used in the categories SCore and SLparse of the ASP competition [15]. Our comparison considers the ASP solver clasp in four modes: (1) no elaborated preprocessing, only elementary simplifications as in (s0 ) to (s15 ); (2) external program reduction (as described in Section 3); (3) internal reduction, extending SatELite-like techniques [10];5 and (4) both types of preprocessing. Table 2 summarizes results in seconds (t), indicating the number of timeouts via a superscript. Each line averages over n runs on n/3 instances, each shuffled three times. Furthermore, |r|, |a|, and |b| give the average number of rules, atoms, and bodies, respectively, in the original programs of each class; |v| and |c| give the average number of variables and Boolean constraints in the internal representation. The number of variables |v| is the same for variant (1) and (3) as well as for (2) and (4), respectively, and thus not duplicated in Table 2. At the bottom of Table 2, all individual runs are summed up, not taking averages. Full details are provided at [7]. In total, we see that variant (4) performs best, even though SatELite-like techniques are currently not applied to so-called extended rules (allowed within SLparse instances, shown in the second part of Table 2), while we have generalized external program reduction to work on such rules too. Furthermore, SatELite-like techniques work best on tight examples, being released from unfounded set checking. (Note that 2/3 of the benchmark classes are tight.) Unlike this, the approach in Section 3 is advantageous on non-tight programs due to its support-driven strategy. Another factor is the size of s e a u input programs. While our external technique ( →∗ →∗ →∗ →∗ ) is implemented in a linear fashion, SatELite-like techniques involve subsumption tests yielding a quadratic worst case behavior. Regarding the number of variables, one has to compare |a|+|b| with |v|. In the worst case, both would be equal. However, we sometimes see significant reductions of more than one order of magnitude. Given that the elementary simplifications already cut down the number of variables, the speed-ups of version (2) over (1) are mainly due to the reduced completion formula (reflected by |c|). Also, the number |c| of constraints is often much smaller than the original number |r| of rules.
5 DISCUSSION
We provided the first ASP-specific approach to preprocessing logic programs, aiming at reducing an input program as well as the number of variables in its internal representation. The latter goal is also pursued by smodels [19], where choices rely on atoms occurring negatively in bodies, and by cmodels [8], where heuristics are used to 5
Note that a straightforward application of SatELite-like techniques is insufficient since it interferes with unfounded set detection.
Problem Name (n) 15-Puzzle (30) BlockedN-Queens (42) EqTest (15) Factoring (15) HamiltonianPath (42) RLP-150 (42) RLP-200 (42) RandomNonTight (42) SchurNumbers (15)
|r| 17203 308796 6901 6974 4228 728 1184 839 12014
|a| |b| 5161 13029 5503 155646 434 2996 4965 6782 1533 2542 151 715 201 1165 55 806 736 4391
|v| 3100 53716 1143 3637 1358 288 455 287 1005
clasp (1) clasp (2) |c| t |v| |c| t 24348 0.3 2930 23942 0.3 69281 18 285.8 50613 2988 16 254.5 12338 16.0 999 11514 14.4 13407 5.6 2244 9524 3.9 5533 0.1 748 2987 0.1 3002 0.3 286 2992 0.3 4850 0.9 453 4838 0.9 5286 32.3 283 5267 32.8 4862 2.3 829 3971 1.4
clasp (3) |c| t 13497 0.3 2720 18 265.1 9866 16.4 3791 1.8 2974 0.1 2994 0.3 4835 1.0 5286 31.3 2451 2.6
clasp (4) |c| t 13296 0.3 2720 18 265.7 9419 14.7 3765 1.9 1277 0.1 2986 0.3 4826 0.9 5252 33.4 1602 1.0
15-Puzzle (15) 38250 11385 37498 15694 116321 1 213.2 15298 115173 96.3 79624 104.1 79624 112.8 BlockedN-Queens (15) 5024 4699 2726 2472 331 17.1 894 331 9.1 331 9.5 331 13.5 BoundedSpanningTree (15) 206557 2359 203226 68524 201427 3.7 67796 198432 3.7 190486 16.5 190486 16.8 CarSequencing (15) 1582 2303 1263 1189 630 15 600.0 695 630 15 600.0 630 15 600.0 630 13 566.3 Factoring (12) 7685 5470 7472 4006 14803 8.6 2473 10525 4.1 4196 2.2 4170 2.1 HamiltonianCycle (15) 10502 7003 4955 3986 12236 0.3 1925 7916 0.2 4676 1.4 4641 1.3 HamiltonianPath (15) 4924 1623 2920 1514 6102 0.1 864 3387 0.1 3364 0.1 1560 0.1 Hashiwokakero (12) 738726 149926 717900 227596 2163406 3 125.2 217954 1912400 3 125.2 1915809 3 125.4 1912400 3 125.3 KnightsTour (15) 58062 10968 37996 14866 16518 0.5 11383 10559 0.5 5317 0.7 3402 0.7 RLP-150 (15) 735 151 721 290 3030 0.4 288 3019 0.3 3023 0.4 3014 0.3 RLP-200 (15) 793 199 781 326 3309 1.1 319 3269 1.0 3276 1.0 3244 1.0 RandomNonTight (15) 848 55 816 290 5380 9.0 287 5361 5.8 5380 9.0 5347 5.5 SchurNumbers (15) 85319 1713 43097 7570 11438 2 129.3 7307 11438 1 164.0 10705 1 129.0 10705 1 97.8 SearchTest-plain (15) 690808 4339 522045 34753 160494 3 122.9 31869 148922 2 114.1 114633 3 124.4 105102 1 81.5 SearchTest-verbose (15) 802803 4959 606804 40320 165791 12.3 36964 152633 13.8 97379 37.5 88708 34.9 SocialGolfer (15) 31506 11269 31108 12500 119754 3 120.6 11857 119754 3 121.3 108148 3 124.4 108148 3 124.2 SolitaireBackward (15) 20508 8381 9305 5473 39345 1.9 2545 18017 1.1 13980 1.7 11740 0.7 SolitaireBackward2 (15) 27435 4397 25517 8713 14323 4 260.4 8366 14323 6 312.8 10008 4 179.1 10009 3 177.7 SolitaireForward (15) 19606 8020 8858 5153 29835 3 120.3 3602 23819 3 120.3 18448 2 90.3 15253 3 120.2 Su-Doku (9) 1003593 17053 502502 173185 12772 7.1 165897 12772 7.9 12772 11.0 12772 11.3 TowersOfHanoi (15) 18340 7215 15028 7294 15903 24.1 5500 13527 24.4 8665 24.7 8664 16.0 TravelingSalesperson (15) 3825 3065 1588 1448 3588 0.4 583 2356 0.2 2356 0.3 2339 1.5 VerifyTest-variableSearchSpace (15) 12914 2296 9134 1061 4285 0.1 608 3088 0.1 1273 0.1 806 0.1 WeightBoundedDominatingSet (15) 3163 2879 798 1187 2048 6 245.9 264 910 4 165.1 453 3 128.2 453 2 105.4 WeightedLatinSquare (15) 997 770 446 405 222 0.0 146 222 0.0 222 0.0 222 0.0 WeightedSpanningTree (15) 112034 2185 108934 36998 81210 2.3 36294 78426 2.2 78052 4.5 78052 4.4 Total time/timeouts 44116.9/58 40774.2/53 38641.0/52 37139.0/47 variables/constraints 10954406/46339719 10172081/39117132 -/35997972 -/35438242 Table 2. Experiments with clasp (1.0.5) on a 2.2GHz PC under Linux; each run restricted to 600s time and 1GB RAM.
eliminate body variables. However, up to now clasp is the only ASP solver integrating advanced preprocessing techniques. Neither ASP-specific (external) nor SatELite-like (internal) preprocessing has yet been implemented elsewhere in the context of ASP. Our experiments show that investments in preprocessing are well spent. In fact, the best results are obtained when combining ASP-specific with SatELite-like preprocessing. Instead of integrating preprocessing into clasp, it could be performed by a dedicated front-end, beneficial also to other solvers. The development of such a tool is left as a future issue.
REFERENCES
[1] http://assat.cs.ust.hk.
[2] F. Bacchus, 'Enhancing Davis Putnam with extended binary clause reasoning', in Proceedings AAAI'02, pp. 613–619. AAAI Press, (2002).
[3] C. Baral, Knowledge Representation, Reasoning and Declarative Problem Solving. Cambridge University Press, (2003).
[4] S. Brass and J. Dix, 'Semantics of (disjunctive) logic programs based on partial evaluation', Journal of Logic Programming, 40(1), 1–46, (1999).
[5] S. Brass, J. Dix, B. Freitag, and U. Zukowski, 'Transformation-based bottom-up computation of the well-founded model', Theory and Practice of Logic Programming, 1(5), 497–538, (2001).
[6] K. Clark, 'Negation as failure', in Logic and Data Bases, eds., H. Gallaire and J. Minker, pp. 293–322. Plenum Press, (1978).
[7] http://www.cs.uni-potsdam.de/clasp.
[8] http://www.cs.utexas.edu/users/tag/cmodels.
[9] http://www.dlvsystem.com.
[10] N. Eén and A. Biere, 'Effective preprocessing in SAT through variable and clause elimination', in Proceedings SAT'05, eds., F. Bacchus and T. Walsh, pp. 61–75. Springer, (2005).
[11] T. Eiter, M. Fink, H. Tompits, and S. Woltran, 'Simplifying logic programs under uniform and strong equivalence', in Proceedings LPNMR'04, eds., V. Lifschitz and I. Niemelä, pp. 87–99. Springer, (2004).
[12] F. Fages, 'Consistency of Clark's completion and the existence of stable models', Journal of Methods of Logic in Computer Science, 1, 51–60, (1994).
[13] M. Fitting, 'Tableaux for logic programming', Journal of Automated Reasoning, 13(2), 175–188, (1994).
[14] M. Gebser, B. Kaufmann, A. Neumann, and T. Schaub, 'Conflict-driven answer set solving', in Proceedings IJCAI'07, ed., M. Veloso, pp. 386–392. AAAI Press/MIT Press, (2007).
[15] M. Gebser, L. Liu, G. Namasivayam, A. Neumann, T. Schaub, and M. Truszczyński, 'The first answer set programming system competition', in Proceedings LPNMR'07, eds., C. Baral, G. Brewka, and J. Schlipf, pp. 3–17. Springer, (2007).
[16] M. Gebser and T. Schaub, 'Tableau calculi for answer set programming', in Proceedings ICLP'06, eds., S. Etalle and M. Truszczyński, pp. 11–25. Springer, (2006).
[17] C. Gomes, H. Kautz, A. Sabharwal, and B. Selman, 'Satisfiability solvers', in Handbook of Knowledge Representation, eds., V. Lifschitz, F. van Harmelen, and B. Porter. Elsevier, (2008).
[18] F. Lin and Y. Zhao, 'ASSAT: computing answer sets of a logic program by SAT solvers', Artificial Intelligence, 157(1–2), 115–137, (2004).
[19] http://www.tcs.hut.fi/Software/smodels.
[20] S. Subbarayan and D. Pradhan, 'NiVER: Non-increasing variable elimination resolution for preprocessing SAT instances', in Proceedings SAT'04, eds., H. Hoos and D. Mitchell, pp. 276–291. Springer, (2005).
[21] A. Van Gelder, K. Ross, and J. Schlipf, 'The well-founded semantics for general logic programs', Journal of the ACM, 38(3), 620–650, (1991).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-20
A generic framework for comparing semantic similarities on a subsumption hierarchy Emmanuel Blanchard1 and Mounira Harzallah1 and Pascale Kuntz1 Abstract. Defining a suitable semantic similarity between concept pairs of a subsumption hierarchy is becoming a generic problem for many applications in knowledge engineering exploiting ontologies. In this paper, we define a generic framework which can guide the proposition of new measures by making explicit the information on the ontology which has not been integrated into existing definitions yet. Moreover, this framework allows us to rewrite numerous measures, originally proposed in various contexts, which are in fact closely related to each other. From this observation, we show some metrical and ordinal properties. Experimental comparisons on WordNet and on collections of human judgments complete the theoretical results and confirm the relevance of our propositions.
1 Introduction
Semantic similarity is a generic issue in a variety of applications in the areas of computational linguistics, artificial intelligence and biology, both in the academic community and the industry. Examples include word sense disambiguation [20], detection and correction of word spelling errors (malaproprisms) [4], image retrieval [23], information retrieval [13] and biological issues [25]. Similarities have been widely studied for set representations. The similarity σ(A, B) between two subsets of elements A and B is often defined as a function of the elements common to A and B and as a function of the distinct ones. The Jaccard’s coefficient [12] and the Dice’s coefficient [7], which have originally been defined for ecological studies, are probably the most commonly used similarities among a large family of coefficients [11][24]. Their theoretical properties have been carefully studied [10][6]. Another important issue is the evaluation of semantic similarity in a network structure. With a long history in psychology [27][21], the problem of evaluating semantic similarity in a network structure has known a noticeable renewed interest linked to the development of the semantic web. In the 1970’s many studies on categorization were influenced by a theory which stated that, from an external point of view, the categories in a set of objects were organized in a taxonomy according to an abstraction process. It is a common principle of the current knowledge representation systems to describe proximity relationships between domain concepts by a hierarchy, or more generally by a graph, i.e. by the ontologies associated with the new languages of the semantic Web –in particular OWL [1]. The tree-based similarities defined on a subsumption hierarchy contain two categories of similarities: those which, like the Wu and Palmer’s similarity [28], only depend on the hierarchical structure (e.g., path lengths between concept pairs), and those which, like the Lin’s similarity [14], additionally incorporate statistics on a corpus 1
University of Nantes, France, email:
[email protected] (e.g., concept occurrence frequencies). Some recent work has tried to extend the tree-based definitions to graphs by simultaneously taking into account different semantic relationships [15]. But, despite its pertinence, this attempt is faced with many open problems, and in practice the set-based and the tree-based similarities still remain the most widely used. Our main purpose here is to show that these measures, which have originally been proposed in various contexts, are closely related to each other. Most set-based similarities σ (A, B) can be re-written as functions f (|A| , |B| , |A ∩ B|) of the cardinalities of sets A and B and of their intersection set A ∩ B. In data analysis, a classification attempt, not widely used in knowledge engineering, has permitted to gather numerous similarity definitions into two parametrized functions that we denote by fα and fβ [6]. In this paper, we extend the definitions of these functions to the tree-based similarities: we define two generic functions feα and feβ with the same schema as fα and fβ . Each function depends on a real parameter α or β, and on the “information content” ψ(ci ) = − log P (ci ) initially introduced by Resnik [19], where P (ci ) is the probability of encountering an instance of the concept ci . The operational computation of the theoretical probability P(ci ) may vary according to the available information (e.g., a corpus). We show that numerous published tree-based similarities are associated with a α or β value and an approximation of P. The interests of this work are threefold. First, some partial pairwise comparisons have already been presented in the literature, but our unified framework allows to precisely identify the theoretical differences and commonalities of a large set of measures. Second, an analysis of the combinatorics of the subsumption hierarchy has led us to define new approximations of the probability P which exploit information on the subsumption hierarchy which has not been integrated into existing measures yet. Third, we show that ordinal and metrical properties can be straightforwardly deduced from this unified framework. We complete this theoretical study by numerical experiments on WordNet samples (version 2.0) and on benchmarks on which human judgments have been collected.
2 A typology of set-based similarities
In this section, we denote by S a finite set of elements and A, B, C some subsets of S. We briefly recall that a similarity σ on P(S) is a function σ : P(S) × P(S) → IR+ which satisfies two properties: symmetry (σ(A, B) = σ(B, A)) and maximality (σ(A, A) ≥ σ(B, C)). Most of the set-based similarities can be grouped into two parametrized families. The first one σα has been proposed by Caillez and Kuntz [6]. It is defined by a ratio between the cardinality of the intersection |A ∩ B|
and the Cauchy’s mean [5] of the cardinalities of the respective sets |A| and |B|: σα (A, B) = fα (|A| , |B| , |A ∩ B|) =
|A∩B| μα (|A|,|B|)
(1)
”1 “ α α α where μα (|A| , |B|) = |A| +|B| for α ∈ IR. 2 Note that the case α = 1 concides with the classical arithmetic mean. The second family σβ has been studied by Gower and Legendre [10]: σβ (ci , cj ) = fβ (|A| , |B| , |A ∩ B|) =
β·|A∩B| |A|+|B|+(β−2)·|A∩B|
Table 1. Correspondence between different parameter values and well-known set-based similarities α Mean μα Similarity σα −∞ minimum Simpson β Similarity σβ −1 harmonic Kulczinsky 1/2 Sokal&Sneath 0 geometric Ochia¨ı 1 Jaccard 1 arithmetic Dice 2 Dice +∞ maximum Braun&Blanquet
It is easy to check that the values of the similarities σα and σβ are in the interval [0; 1].
3 A new formulation of tree-based similarities

In the following, we denote by C = {c1, c2, . . . , cn} a finite set of concepts. Formally, an ontology can be modeled by a directed graph where the nodes represent concepts and the arcs represent labeled relationships. Here, as is often done in the literature, we restrict ourselves to the subsumption relationship "is-a" on C × C. This relationship is common to every ontology, and different papers have confirmed that it is the most structuring one (e.g., [18]). In this case, if we assume that each concept ci has no more than one parent (direct subsumer), the ontology can be modeled by a rooted tree T(C) where the root c0 is either an informative concept or a "dummy" concept added just for connectivity. We denote by cij the most specific common subsumer of the concepts ci and cj in T(C). In this section, we adapt definitions (1) and (2) above to define new tree-based similarity families using the information content notion [19]. We also propose different ways to compute the information content of a concept which aim at better exploiting the hierarchy. Moreover, we show how our framework supports the rediscovery of existing tree-based similarities. Our proposition makes it possible to better understand both the relationships between the set-based and the tree-based similarities and the relationships between the tree-based similarities themselves.

3.1 Two new generic functions

Like Lin in his seminal paper [14], let us suppose that a concept ci references a subset Ii of an instance set I. By analogy with Shannon's information theory, the information content of the concept ci is measured by ψ(ci) = −log P(ci), where P(ci) ∈ [0, 1] is the probability for a generic instance of ci to belong to Ii. Similarly, the common information associated with a concept pair {ci, cj} is the information content ψ(cij) = −log P(cij) of their most specific common subsumer cij. Consequently, from definitions (1) and (2), we deduce two new parametrized functions which define tree-based similarities:

  σ̃α(ci, cj) = f̃α(ψ(ci), ψ(cj), ψ(cij)) = ψ(cij) / μα(ψ(ci), ψ(cj))    (3)

where μα is the Cauchy mean and α ∈ IR, and

  σ̃β(ci, cj) = f̃β(ψ(ci), ψ(cj), ψ(cij)) = β·ψ(cij) / (ψ(ci) + ψ(cj) + (β − 2)·ψ(cij))    (4)

where β ∈ IR*+. Let us remark that σ̃α(ci, cj) = σ̃β(ci, cj) when α = 1 and β = 2. The parameter α makes it possible to choose different definitions of the mean (e.g., arithmetic, geometric). Formulation (4) explicitly shows that the parameter β weights the importance of the common information associated with the most specific common subsumer. The logarithm base has no influence on this similarity measure due to the use of a ratio.
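A direct transcription of equations (3) and (4) shows how the two generic functions behave once the information contents ψ(ci), ψ(cj) and ψ(cij) are available. The sketch below is ours (invented names, illustrative values only):

```python
def sim_alpha(psi_i, psi_j, psi_ij, alpha=1.0):
    """Tree-based similarity (3): psi(c_ij) / mu_alpha(psi(c_i), psi(c_j))."""
    if alpha == 0:
        mean = (psi_i * psi_j) ** 0.5                       # geometric mean
    else:
        mean = ((psi_i ** alpha + psi_j ** alpha) / 2) ** (1 / alpha)
    return psi_ij / mean

def sim_beta(psi_i, psi_j, psi_ij, beta=2.0):
    """Tree-based similarity (4)."""
    return beta * psi_ij / (psi_i + psi_j + (beta - 2) * psi_ij)

# With alpha = 1 and beta = 2 the two definitions coincide (Dice-like form):
print(sim_alpha(3.0, 2.0, 1.5, alpha=1.0))   # 0.6
print(sim_beta(3.0, 2.0, 1.5, beta=2.0))     # 0.6
```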
3.2 Information content computation

Let us remark that in practice the instance set I is never completely described in extension. Consequently, the operational computation of the probability P(ci) depends both on the information at our disposal and on the hypotheses carried by the construction of the ontology. We denote by P̂(ci) the approximation of P(ci) in practice. The approximation P̂r proposed by Resnik is computed by the formula P̂r(ci) = n(ci)/n(c0), where n(ci) is the number of occurrences of ci plus the number of occurrences of the concepts which are subsumed by ci in T(C). This approximation considers the root as virtual (P̂r(c0) = 1).
The probability P(ci) can also be approximated without considering any additional information. We propose some approximations deduced from various hypotheses on the extension of the concepts. We distinguish three approaches associated with different hypotheses:
• descending approach
  – Hypothesis 1: exponential decrease of the instance number with concept depth in T(C) (P̂d)
  – Hypothesis 2: uniform distribution of the father's instances on its sons (P̂s)
• ascending approach
  – Hypothesis 3: exponential increase of the instance number with concept height in T(C) (P̂h)
  – Hypothesis 4: uniform distribution of the root's instances on the leaves (P̂g)
• combined approach
  – P̂dh: aggregation of P̂d and P̂h
  – P̂sg: aggregation of P̂s and P̂g

3.2.1 Approximation P̂d (Hypothesis 1)

The probability for an instance to be associated with a concept ci decreases exponentially with the depth di of ci in T(C). Then,

  P̂d(ci) = P̂d(parent(ci)) / k = P̂(c0) / k^di    (5)
where k is a fixed integer and parent(ci) is the parent (direct subsumer) of ci. Let us remark that when the logarithm base is set to k, the information content of a concept ci is equivalent to its depth plus the information content of the root:

  ψd(ci) = −log_k P̂d(ci) = di + ψ(c0)    (6)

3.2.2 Approximation P̂s (Hypothesis 2)

We consider a uniform distribution of the instances of a father concept on its son concepts:

  P̂s(ci) = P̂s(parent(ci)) / |Children(parent(ci))|    (7)

where Children(ci) corresponds to the set of sons of ci. The information content (ψs) deduced from this approximation corresponds to the specificity degree in comparison with the root; the depth takes into account a part of the information exploited by this specificity degree. This approximation refines P̂d by considering the number of sons of each subsumer.

3.2.3 Approximation P̂h (Hypothesis 3)

Each leaf has the same instance number, and the probability of an instance to be associated with a concept ci increases exponentially with the height of ci. A leaf concept has a minimal probability which depends on the height of the hierarchy and on the instance number of the root. We can approximate P(ci) by:

  P̂h(ci) = P̂(c0) / k^(h0 − hi)    (8)

In the particular case of a logarithm base equal to k, the information content of a concept ci is defined by:

  ψh(ci) = −log_k P̂h(ci) = h0 − hi + ψ(c0)    (9)

3.2.4 Approximation P̂g (Hypothesis 4)

We consider a uniform distribution of the instances of the root concept on the leaf concepts:

  P̂g(ci) = P̂(c0) · |Leaves(ci)| / |Leaves(c0)|    (10)

where Leaves(ci) corresponds to the leaf set subsumed by ci (when ci is a leaf, Leaves(ci) = {ci}). This case is dual to the previous P̂s case. Here, the information content (ψg) deduced from this approximation corresponds to the generality degree in comparison with the leaves; the height takes into account a part of the information exploited by this generality degree. This approximation refines P̂h by considering the number of sons of the concept and of its subsumed concepts.

3.2.5 Approximations P̂sg and P̂dh

We consider an alternative which simultaneously takes into account the specificity and the generality degrees:

  P̂sg(ci) = (P̂s(ci) + P̂g(ci)) / 2    (11)

The definition of P̂sg is based on the arithmetic mean of P̂s and P̂g. This choice is forced by the preservation of the recursivity: P̂sg(ci) = Σ_{cx ∈ Children(ci)} P̂sg(cx). A dual case is the aggregation of P̂d and P̂h:

  P̂dh(ci) = (P̂d(ci) + P̂h(ci)) / 2    (12)

3.3 Similarity definitions deduced from the approximations

In this subsection, we show that the generic functions σ̃α and σ̃β describe a set of semantic similarities (e.g., Lin, Wu & Palmer). We show that, in some cases, the approximations of P(ci) coincide with known measures of the literature.

3.3.1 Lin's similarity

Lin's similarity [14] is analogous to the Dice coefficient with Resnik's approximation:

  lin(ci, cj) = 2·ψr(cij) / (ψr(ci) + ψr(cj))    (13)

Due to Resnik's approximation, the root concept is considered as virtual (P̂(c0) = 1).

3.3.2 Wu & Palmer's similarity

The Wu & Palmer similarity [28] is analogous to the Dice coefficient with the approximation P̂d:

  wup(ci, cj) = 2·ψd(cij) / (ψd(ci) + ψd(cj))    (14)

3.3.3 Stojanovic's similarity

The approximation P̂d allows us to rewrite Stojanovic's similarity [26], which is analogous to the Jaccard coefficient:

  sto(ci, cj) = ψd(cij) / (ψd(ci) + ψd(cj) − ψd(cij))    (15)

3.3.4 Proportion of Shared Specificity

The Proportion of Shared Specificity (pss) proposed by Blanchard et al. [2] coincides with the Dice coefficient with the P̂s approximation:

  pss(ci, cj) = 2·ψs(cij) / (ψs(ci) + ψs(cj))    (16)
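To illustrate how the corpus-free approximations and the derived similarities interact, here is a small Python sketch. It is ours: the toy "is-a" tree, the constant K and all names are invented, and ψ(c0) is taken to be 0 (the root is treated as virtual, as in Resnik's approximation).

```python
import math

# A toy subsumption tree: child -> parent (root is 'entity').
parent = {"animal": "entity", "plant": "entity",
          "dog": "animal", "cat": "animal", "oak": "plant"}
children = {}
for c, p in parent.items():
    children.setdefault(p, []).append(c)

def depth(c):
    return 0 if c not in parent else 1 + depth(parent[c])

def leaves(c):
    if c not in children:
        return {c}
    return set().union(*(leaves(x) for x in children[c]))

def lcs(ci, cj):
    """Most specific common subsumer in the tree."""
    anc = set()
    while True:
        anc.add(ci)
        if ci not in parent:
            break
        ci = parent[ci]
    while cj not in anc:
        cj = parent[cj]
    return cj

K = 2  # logarithm base (the k of Hypothesis 1)

def psi_d(c):   # information content under approximation P_d, eq. (6), with psi(c0) = 0
    return depth(c)

def psi_g(c):   # information content under approximation P_g, eq. (10)
    return -math.log(len(leaves(c)) / len(leaves("entity")), K)

def wup(ci, cj):   # Wu & Palmer, eq. (14)
    c = lcs(ci, cj)
    return 2 * psi_d(c) / (psi_d(ci) + psi_d(cj))

print(wup("dog", "cat"))                 # 0.5: lcs is 'animal' at depth 1, both concepts at depth 2
print(psi_g("dog"), psi_g("animal"))     # leaf concepts are more informative than inner ones
```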
4 Metrical and ordinal properties
Most of the work on the mathematical properties of similarities focuses on their metrical aspect [18]. It usually resorts to preliminary transformations of the similarity into a dissimilarity of the form δ = Max_σ − σ, where Max_σ is the maximal value reached by σ, or δ = 1/σ when Max_σ is not finite, in order to check the triangular inequality δ(ci, cj) ≤ δ(ci, ck) + δ(ck, cj). Here, Max_σα = Max_σβ = 1 and we can consider the transformations δα = 1 − σα and δβ = 1 − σβ. By studying the set-based similarities, Caillez et al. [6] and Gower et al. [10] have proved that the triangular inequality holds for α → +∞ and β ∈ [0, 1]. From a formal point of view, these questions are interesting; however, for practical applications in knowledge engineering, the developed approaches do not generally require this constraining property. When comparing results obtained with different similarities, we can remark that specialists are more often concerned with the ordering associated with the obtained values than with the intrinsic values. Indeed, they order the concept pairs according to the proximities quantified by these measures.
Proposition 1. The similarities of the family {σ̃β}_{β ∈ IR*+} follow the same ordering: for any ci, cj, ck, cl in C, σ̃β(ci, cj) ≤ σ̃β(ck, cl) ⇔ σ̃β′(ci, cj) ≤ σ̃β′(ck, cl) for any β and β′ ∈ IR*+.
We show that σ̃β(ci, cj) ≤ σ̃β(ck, cl) ⇐⇒ σ̃1(ci, cj) ≤ σ̃1(ck, cl) for any β ∈ IR*+. When ψ(ci) + ψ(cj) − 2·ψ(cij) = 0 then σ̃1(ci, cj) = σ̃β(ci, cj) for any β > 0. Otherwise, it is easy to check that, for ψ(ci) + ψ(cj) − 2·ψ(cij) ≠ 0,

  σ̃β(ci, cj) = β·σ̃1(ci, cj) / (1 + (β − 1)·σ̃1(ci, cj))

Consequently, σ̃1(ci, cj) ≥ σ̃1(ck, cl) ⇐⇒ σ̃β(ci, cj) ≥ σ̃β(ck, cl).

Proposition 2. The similarities of the family {σ̃α}_{α ∈ IR} do not follow the same ordering.
Let us consider the following counter-example on a set C = {c1, c2, c3, c4}. We suppose that c1 is a subsumer of c2, and that ψ(c1) = 1, ψ(c2) = 3, ψ(c3) = ψ(c4) = 2 and ψ(c34) = 2. In this case, the Cauchy means are μα(ψ(c1), ψ(c2)) = ((1 + 3^α)/2)^(1/α) and μα(ψ(c3), ψ(c4)) = 2. Due to the convexity of the power function when α > 1, μα(ψ(c1), ψ(c2)) > μα(ψ(c3), ψ(c4)) and consequently σ̃α(c1, c2) < σ̃α(c3, c4). When α < 1, the inequality is inverted.

Proposition 3. The similarities of the family {σ̃α}_{α ∈ IR} are decreasing functions of α.
This is due to the fact that the α-means are increasing functions of α (e.g., [5]).
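The relation used in the proof of Proposition 1 and the monotonicity stated in Proposition 3 can be checked numerically; the following short sketch is ours, with arbitrary illustrative values:

```python
def sim_beta(pi, pj, pij, beta):
    return beta * pij / (pi + pj + (beta - 2) * pij)

def sim_alpha(pi, pj, pij, alpha):
    mean = ((pi ** alpha + pj ** alpha) / 2) ** (1 / alpha)
    return pij / mean

# Proposition 1: sigma_beta is a monotone transform of sigma_1.
pi, pj, pij, beta = 5.0, 3.0, 2.0, 4.0
s1 = sim_beta(pi, pj, pij, 1.0)
print(sim_beta(pi, pj, pij, beta),
      beta * s1 / (1 + (beta - 1) * s1))   # both print the same value

# Proposition 3: sigma_alpha decreases as alpha grows (alpha-means increase).
print([round(sim_alpha(5.0, 3.0, 2.0, a), 4) for a in (-2, -1, 1, 2, 4)])
```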
5 Experimental results
In this section, we present two complementary comparisons based on the subsumption hierarchy of WordNet 2.0 [8]. First, we compare the information content restricted to the structural information with the well-known Resnik’s information content which additionally requires a corpus. This allows us to quantify the information deduced from the corpus. Second, we use three well-known benchmarks (Rubenstein & Goodenough [22], Miller & Charles [16], Finkelstein et al. [9]) which gather human judgments on some concept pairs. This allowed us to evaluate the relevance of the different approximations.
5.1 Comparison on WordNet

This subsection presents a comparison between the information content based on different approximations. We restrict ourselves to nouns and to the subsumption hierarchy (hyperonymy/hyponymy) of WordNet. This hierarchy, which contains 146690 nodes, constitutes the backbone of the noun subnetwork, accounting for close to 80% of the links [3]. The computations have been performed with the Perl modules of Pedersen et al. [17], which allowed us to adapt tree-based measures to the WordNet structure. Hence, although a synset could have more than one hyperonym, we have represented it as a tree model T_WordNet(C). We have also added some Perl modules to take into account all the new approximations presented in this paper. The main interest of T_WordNet(C) is to be large enough to allow the computation of robust statistics, and we do not enter here into the discussion between experts concerning the ontological nature of WordNet. We have computed the information content for four different concept sets: the whole set of WordNet (146690 concepts) and three subsets of WordNet composed of the concept sets used respectively in the Miller & Charles [16], Rubenstein & Goodenough [22] and Finkelstein & Gabrilovich [9] benchmarks. We have compared the approximations P̂d, P̂g and P̂r. The correlations ρ(ψd, ψr) and ρ(ψg, ψr) are reported in Figure 1 (the rank correlations, not reported here, give similar results).

Figure 1. Correlation of the ψd and ψg information content with that of Resnik (ψr) on WordNet concepts and four subsets

The approximation P̂r, which serves as a yardstick, has been computed with the British National Corpus, with the Resnik counting method and a smoothing by 1 [17]. We can remark that each benchmark uses a sample of concepts which is not very representative of the whole set of concepts. Indeed, the corpus effect on the information content is more important on the whole set than on the three samples. From this point of view, the Finkelstein & Gabrilovich benchmark is the worst one. Unsurprisingly, the information content based on the approximation P̂d is the least correlated with P̂r. However, the positive correlations show the relationship between the ascending and descending approximations: the depth tends to be inversely proportional to the height. The correlations between ψg and ψr show that the information quantity deduced from the corpus is limited compared to the information deduced from the hierarchical structure. Nevertheless, these results depend on the corpus and on the structure of WordNet. Hence, further work is required to generalize this conclusion to a large set of ontologies.
5.2 Comparisons with human judgments

As shown in Section 3.1, two components are essential when comparing two concepts ci and cj: the shared information content ψ∩(ci, cj) = ψ(cij) and the distinguishing information content ψ(ci, cj) = ψ(ci) + ψ(cj) − 2·ψ(cij). To measure the specific influence of these two components, we have computed the correlation of each of them with the human judgment. The considered human judgment evaluations are taken from the Miller & Charles [16], Rubenstein & Goodenough [22] and Finkelstein & Gabrilovich [9] experiments, and the approximation of P is Resnik's approximation. The results (Figure 2) closely depend on the test sets. The contribution of ψr is more important than that of ψr∩ for the benchmarks of Miller & Charles and Rubenstein & Goodenough,
contrary to the Finkelstein & Gabrilovich benchmark. This tends to reflect the variability of human sensibility, which may be due to the evaluation processes of the three benchmarks. Moreover, the previous experiments have shown that P̂g seems to be the most efficient approximation (the best correlated with human judgments) compared to Resnik's approximation, which uses a corpus. Hence, we have computed the correlations of the two components ψ∩ and ψ with the human judgment with P̂g (Figure 3). The results are very similar to those obtained with Resnik's approximation. This tends to suggest that the information deduced from the corpus contains as much noise as information.

Figure 2. Contribution of ψ∩ and ψ with P̂r to simulate human judgment
Figure 3. Contribution of ψ∩ and ψ with P̂g to simulate human judgment

6 Conclusion

The concept of similarity is fundamental in numerous fields (e.g., classification, AI, psychology). Originally, the definitions were often built to fulfill precise objectives in specific domains. However, several measures (e.g., [12, 7]) have shown their relevance to very different applications. Nowadays, similarities are enjoying a significant renewal of interest associated with the expansion of ontologies in knowledge engineering. In this framework, the measures most often used to quantify proximities between concept pairs are tree-based similarities, whose definitions may or may not integrate additional information from a textual corpus. In practice, the choice of a similarity is a critical step since the results of the algorithms often closely depend on this choice. In this paper, we have built a new theoretical framework which makes it possible to rewrite, in a homogeneous way, numerous similarity functions used in knowledge engineering. We believe that such an approach, in the spirit of the pioneering work of Lin, is important for two major reasons. First, this rewriting highlights relationships, both semantic and structural, between a large set of measures which were originally defined for very different purposes, and it has allowed us to deduce mathematical properties. Second, it can guide the proposition of new measures by making explicit the information on the ontology which has not been integrated into the definitions yet. In this way, we have proposed new approximations which better exploit the information associated with the hierarchical structure of the ontology. We have restricted ourselves to similarities for subsumption hierarchies without multiple inheritance; we have started to extend our approach to subsumption hierarchies with multiple inheritance.

ACKNOWLEDGEMENTS

We would like to thank the referees for their comments which helped improve this paper.
REFERENCES
[1] S. Bechhofer, F. van Harmelen, J. Hendler, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider, and L. A. Stein, OWL Web Ontology Language reference, 2004. http://www.w3.org/TR/owl-ref/.
[2] E. Blanchard, P. Kuntz, M. Harzallah, and H. Briand, 'A tree-based similarity for evaluating concept proximities in an ontology', in Proc. 10th Conf. Int. Federation Classification Soc., pp. 3–11. Springer, (2006).
[3] A. Budanitsky, 'Lexical semantic relatedness and its application in natural language processing', Technical report, Univ. of Toronto, (1999).
[4] A. Budanitsky and G. Hirst, 'Evaluating WordNet-based measures of semantic distance', Computational Linguistics, 32(1), 13–47, (2006).
[5] P.S. Bullen, D. S. Mitrinovic, and P. M. Vasics, Means and their Inequalities, Reidel, 1988.
[6] F. Caillez and P. Kuntz, 'A contribution to the study of the metric and euclidean structures of dissimilarities', Psychometrika, 61(2), 241–253, (1996).
[7] L. R. Dice, 'Measures of the amount of ecologic association between species', Ecology, 26(3), 297–302, (1945).
[8] WordNet: An Electronic Lexical Database, ed., C. Fellbaum, MIT Press, 1998.
[9] L. Finkelstein, E. Gabrilovich, Y. Matias, G. Wolfman, E. Rivlin, Z. Solan, and E. Ruppin, 'Placing search in context: The concept revisited', ACM Trans. Information Systems, 20(1), 116–131, (2002).
[10] J.C. Gower and P. Legendre, 'Metric and euclidean properties of dissimilarity coefficients', J. of Classification, 3, 5–48, (1986).
[11] Z. Hubalek, 'Coefficient of association and similarity based on binary (presence, absence) data: an evaluation', Biological Reviews, 57(4), 669–689, (1982).
[12] P. Jaccard, 'Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines', Bulletin de la Société Vaudoise de Sciences Naturelles, (37), 241–272, (1901). (In French.)
[13] J. H. Lee, M. H. Kim, and Y. J. Lee, 'Information retrieval based on conceptual distance in is-a hierarchies', J. Documentation, 49(2), 188–207, (1993).
[14] D. Lin, 'An information-theoretic definition of similarity', in Proc. 15th Int. Conf. Machine Learning, pp. 296–304. Morgan Kaufmann, (1998).
[15] A. G. Maguitman, F. Menczer, H. Roinestad, and A. Vespignani, 'Algorithmic detection of semantic similarity', in Proc. 14th Int. Conf. World Wide Web, pp. 107–116. ACM Press, (2005).
[16] G.A. Miller and W.G. Charles, 'Contextual correlates of semantic similarity', Language and Cognitive Processes, 6(1), 1–28, (1991).
[17] T. Pedersen, S. Patwardhan, and J. Michelizzi, 'WordNet::Similarity – measuring the relatedness of concepts', in Proc. 5th Ann. Meet. North American Chapter Assoc. Comp. Linguistics, pp. 38–41, (2004).
[18] R. Rada, H. Mili, E. Bicknell, and M. Blettner, 'Development and application of a metric on semantic nets', IEEE Trans. Syst., Man, Cybern., 19(1), 17–30, (1989).
[19] P. Resnik, Selection and Information: A Class-based Approach to Lexical Relationships, Ph.D. dissertation, University of Pennsylvania, 1993.
[20] P. Resnik, 'Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language', J. Artificial Intell. Research, 11, 95–130, (1999).
[21] E. Rosch, 'Cognitive representations of semantic categories', Experimental Psychology: Human Perception and Performance, 1, 303–322, (1975).
[22] H. Rubenstein and J.B. Goodenough, 'Contextual correlates of synonymy', Comm. ACM, 8(10), 627–633, (1965).
[23] A.W. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, 'Content-based image retrieval at the end of the early years', IEEE Trans. Pattern Anal. Machine Intell., 22(12), 1349–1380, (2000).
[24] R. R. Sokal and P. H. Sneath, Principles of Numerical Taxonomy, W. H. Freeman, 1963.
[25] O. Steichen, C. Daniel-Le Bozec, M. Thieu, E. Zapletal, and M.-C. Jaulent, 'Computation of semantic similarity within an ontology of breast pathology to assist inter-observer consensus', Computers in Biology and Medicine, 36(7-8), 768–788, (2006).
[26] N. Stojanovic, A. Maedche, S. Staab, R. Studer, and Y. Sure, 'Seal: a framework for developing semantic portals', in Proc. Int. Conf. Knowledge Capture, pp. 155–162, (2001).
[27] A. Tversky, 'Features of similarity', Psychological Review, 84(4), 327–352, (1977).
[28] Z. Wu and M. Palmer, 'Verb semantics and lexical selection', in Proc. 32nd Annual Meeting Assoc. Computational Linguistics, pp. 133–138, (1994).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-25
Complexity of Subsumption in the EL Family of Description Logics: Acyclic and Cyclic TBoxes

Christoph Haase¹ and Carsten Lutz²

Abstract. We perform an exhaustive study of the complexity of subsumption in the EL family of lightweight description logics w.r.t. acyclic and cyclic TBoxes. It turns out that there are interesting members of this family for which subsumption w.r.t. cyclic TBoxes is tractable, whereas it is ExpTime-complete w.r.t. general TBoxes. For other extensions that are intractable w.r.t. general TBoxes, we establish intractability already for acyclic and cyclic TBoxes.
1 MOTIVATION
Description logics (DLs) are a popular family of KR languages that can be used for the formulation of and reasoning about ontologies [5]. Traditionally, the DL research community has strived to identify more and more expressive DLs for which reasoning is still decidable. In recent years, however, there have been two lines of development that have led to significant popularity also of DLs with limited expressive power. First, a number of novel and useful lightweight DLs with tractable reasoning problems have been identified, see e.g. [3, 8]. And second, many large-scale ontologies that are formulated in such lightweight DLs have emerged from practical applications. Prominent examples include the Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT), which underlies the systematized medical terminology used in the health systems of the US, the UK, and other countries [19]; and the Gene Ontology (GO), which aims at consistent descriptions of gene products in different databases [20]. In this paper, we are concerned with the EL family of lightweight DLs, which consists of the basic DL EL and its extensions. Members of this family underlie many large-scale ontologies including SNOMED CT and GO. The DL counterpart of an ontology is called a TBox, and the most important reasoning task in DLs is subsumption. In particular, computing subsumption makes it possible to classify the concepts defined in the TBox/ontology according to their generality [5]. In the DL literature, different kinds of TBoxes have been considered. In decreasing order of expressive power, the most common ones are general TBoxes, (potentially) cyclic TBoxes, and acyclic TBoxes. For the EL family, the complexity of subsumption w.r.t. general TBoxes has been exhaustively analyzed in [3] and its recent successor [4]. In all of the considered cases, subsumption is either tractable or ExpTime-complete. However, the study of general TBoxes does not reflect common practice of ontology design, as most ontologies from practical applications correspond to cyclic or acyclic TBoxes. For example, SNOMED CT and GO both correspond to so-called acyclic TBoxes. Since cyclic and acyclic TBoxes are often preferable in terms of computational complexity [7, 14], the question arises whether there are useful extensions of EL for which reasoning w.r.t. such TBoxes is computationally cheaper than reasoning w.r.t. general TBoxes.
The goal of the current paper is to analyse the computational complexity of subsumption in the EL family of description logics w.r.t. acyclic TBoxes and cyclic TBoxes, with a special emphasis on the border of tractability. In our analysis, we omit extensions of EL for which tractability w.r.t. general TBoxes has already been established. Our results exhibit a more varied complexity landscape than in the case of general TBoxes: we identify cases in which reasoning is tractable, co-NP-complete, PSpace-complete, and ExpTime-complete. Notably, we identify two maximal extensions of EL for which subsumption w.r.t. cyclic TBoxes is tractable, whereas it is ExpTime-complete w.r.t. general TBoxes. In particular, these extensions include primitive negation and at-least restrictions. They also include concrete domains, but fortunately do not require the strong convexity condition that was needed in the case of general TBoxes to guarantee tractability [3]. For other extensions of EL such as inverse roles and functional roles, we show intractability results already w.r.t. acyclic TBoxes. Compared to the case of general TBoxes, it is often necessary to develop new approaches to lower bound proofs. We also show that the union of the two identified tractable fragments is not tractable. Detailed proofs are provided in [10].

¹ University of Oxford, UK, [email protected]
² TU Dresden, Germany, [email protected]
2 DESCRIPTION LOGICS
The two types of expressions in a DL are concepts and roles, which are built inductively starting from infinite sets NC and NR of concept names and role names, and applying concept constructors and role constructors. The basic description logic EL provides the concept constructors top (⊤), conjunction (C ⊓ D) and existential restriction (∃r.C), and no role constructors. Here and in what follows, we denote the elements of NC with A and B, the elements of NR with r and s, and concepts with C and D. The semantics of concepts and roles is given in terms of an interpretation I = (ΔI, ·I), with ΔI a non-empty set called the domain and ·I the interpretation function, which maps every A ∈ NC to a subset AI of ΔI and every role name r to a binary relation rI over ΔI. Extensions of EL are characterized by the additional concept and role constructors that they offer. Figure 1 lists all relevant constructors, concept constructors in the upper part and role constructors in the lower part. The left column gives the syntax, and the right column shows how to inductively extend interpretations to composite concepts and roles. In the presence of role constructors, composite roles can be used inside existential restrictions. In at-least restrictions (≥ n r) and at-most restrictions (≤ n r), we use n to denote a nonnegative integer.
  Syntax                Semantics
  ⊤                     ΔI
  ¬C                    ΔI \ CI
  C ⊓ D                 CI ∩ DI
  C ⊔ D                 CI ∪ DI
  (≤ n r)               {x | #{y | (x, y) ∈ rI} ≤ n}
  (≥ n r)               {x | #{y | (x, y) ∈ rI} ≥ n}
  ∃r.C                  {x | ∃y : (x, y) ∈ rI ∧ y ∈ CI}
  ∀r.C                  {x | ∀y : (x, y) ∈ rI → y ∈ CI}
  p(f1, . . . , fk)     {x | ∃d1, . . . , dk : f1I(x) = d1 ∧ . . . ∧ fkI(x) = dk ∧ (d1, . . . , dk) ∈ pD}
  r ∩ s                 rI ∩ sI
  r ∪ s                 rI ∪ sI
  r−                    {(x, y) | (y, x) ∈ rI}
  r+                    ∪_{i>0} (rI)^i

Figure 1. Syntax and semantics of concept and role constructors.
The concrete domain constructor p(f1, . . . , fk) deserves further explanation, to be given below. To denote extensions of EL, we use the symbols of the added constructors in superscript. For example, EL⊔,∪,− denotes the extension of EL with concept disjunction (C ⊔ D), role disjunction (r ∪ s), and inverse roles (r−).
The concrete domain constructor permits reference to concrete data objects such as strings and integers. It provides the interface to a concrete domain D = (ΔD, ΦD), which consists of a domain ΔD and a set of predicates ΦD [13]. Each p ∈ ΦD is associated with a fixed arity n and a fixed extension pD ⊆ ΔD^n. In the presence of a concrete domain D, we assume that there is an infinite set NF of feature names disjoint from NR and NC. In Figure 1 and in general, f1, . . . , fk are from NF and p ∈ ΦD. An interpretation I maps every f ∈ NF to a partial function fI from ΔI to ΔD. We use EL(D) to denote the extension of EL with the concrete domain D.
In this paper, a TBox T is a finite set of concept definitions A ≡ C, where A ∈ NC and C is a concept. We require that the left-hand sides of all concept definitions in a TBox are unique. A concept name A ∈ NC is defined if it occurs on the left-hand side of a concept definition in T, and primitive otherwise. A TBox T is acyclic if there are no concept definitions A1 ≡ C1, . . . , Ak ≡ Ck ∈ T such that Ai+1 occurs in Ci for 1 ≤ i ≤ k, where Ak+1 := A1. An interpretation I is a model of T iff AI = CI for all A ≡ C ∈ T.
The main reasoning task considered in this paper is subsumption. A concept C is subsumed by a concept D w.r.t. a TBox T, written T |= C ⊑ D, if CI ⊆ DI for all models I of T. If T is empty or missing, we simply write C ⊑ D. Sometimes, we also consider satisfiability of concepts. A concept C is satisfiable w.r.t. a TBox T if there is a model I of T such that CI ≠ ∅. For many extensions of EL, satisfiability is trivial because there are no unsatisfiable concepts.
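For plain EL concepts and the empty TBox, subsumption can be decided by the standard structural (homomorphism) check. The following Python sketch is ours, not taken from the paper; it shows the idea for concepts built from concept names, conjunction and existential restrictions only:

```python
from dataclasses import dataclass

@dataclass
class ELConcept:
    """An EL concept: a conjunction of concept names and existential restrictions."""
    names: frozenset = frozenset()
    exists: tuple = ()          # tuple of (role_name, ELConcept) pairs

def subsumed(c, d):
    """Decide c ⊑ d for EL concepts w.r.t. the empty TBox (structural check)."""
    if not d.names <= c.names:
        return False
    return all(any(r2 == r and subsumed(c2, d2) for r2, c2 in c.exists)
               for r, d2 in d.exists)

# Example: Human ⊓ ∃has_child.Human  is subsumed by  ∃has_child.⊤
child_of_human = ELConcept(frozenset({"Human"}),
                           (("has_child", ELConcept(frozenset({"Human"}))),))
some_child = ELConcept(frozenset(), (("has_child", ELConcept()),))
print(subsumed(child_of_human, some_child))   # True
print(subsumed(some_child, child_of_human))   # False
```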
3 TRACTABLE EXTENSIONS
We identify two extensions of EL for which subsumption w.r.t. TBoxes is tractable: EL∪,(¬)(D) and EL≥,∪. This should be contrasted with the results in [3], which imply that subsumption w.r.t. general TBoxes is ExpTime-complete in both extensions. In Section 4.1, we show that taking the union of the two extensions results in intractability already w.r.t. acyclic TBoxes.
(C1) LT(B) ⊆ LT(A)
(C2) For each ∃rB.B′ ∈ ET(B) there is ∃rA.A′ ∈ ET(A) such that rA ⊆ rB and (A′, B′) ∈ S
(C3) ConD(A) implies ConD(B)

Figure 2. EL∪,(¬)(D): Conditions for adding (A, B) to S.

3.1 Role Disjunction, Primitive Negation, and Concrete Domains
We show that subsumption in EL∪,(¬)(D) w.r.t. (acyclic and cyclic) TBoxes is tractable. The superscript ·(¬) indicates primitive negation, i.e., negation can only be applied to concept names. The following is an example of an EL∪,(¬)(D)-TBox, where has_age is a feature, and ≥13 and ≤19 are unary predicates of the concrete domain D:

  Parent   ≡ Human ⊓ ∃(has_child ∪ has_adopted).⊤
  Mother   ≡ Parent ⊓ Female ⊓ ¬Male
  Teenager ≡ Human ⊓ ≥13(has_age) ⊓ ≤19(has_age)
To guarantee tractability, we require the concrete domain D to satisfy a standard condition. Namely, we require D to be p-admissible, i.e., satisfiability of, and implication between, concrete domain expressions of the form p1(v_1^1, . . . , v_{n1}^1) ∧ · · · ∧ pm(v_1^m, . . . , v_{nm}^m) are decidable in polynomial time, where the v_j^i are variables that range over ΔD. In [3], it is shown that a much stronger condition is required to achieve tractability in EL(D) with general TBoxes. This condition is convexity, which requires that if a concrete domain atom p(v1, . . . , vn) implies a disjunction of such atoms, then it implies one of the disjuncts. For our result, there is no need to impose convexity.
When deciding subsumption, we only consider concept names instead of composite concepts. This is sufficient since T |= C ⊑ D iff T′ |= A ⊑ B, where T′ := T ∪ {A ≡ C, B ≡ D} and A and B do not occur in T. The subsumption algorithm requires the input TBox T to be in the following normal form. In each A ≡ C ∈ T, C is of the form

  ⊓_{1≤i≤k} Li  ⊓  ⊓_{1≤i≤ℓ} ∃ri.Bi  ⊓  ⊓_{1≤i≤m} pi(f_1^i, . . . , f_{ni}^i)

where the Li are primitive literals, i.e., possibly negated primitive concept names; the ri are of the form r1 ∪ . . . ∪ rn; and the Bi are defined concept names. In the following, we refer to the set of literals occurring in C with LT(A), to the set of existential restrictions as ET(A), and define the following concrete domain expression, which for simplicity uses features as variables: ConD(A) := p1(f_1^1, . . . , f_{n1}^1) ∧ · · · ∧ pm(f_1^m, . . . , f_{nm}^m). To ease notation, we confuse a role ri = r1 ∪ . . . ∪ rn with the set {r1, . . . , rn}. It is easy to see how to adapt the algorithm given in [2] to convert an EL∪,(¬)(D)-TBox into normal form in quadratic time. During the normalization, we check for unsatisfiable concepts. This is easy since a defined concept name A with A ≡ C ∈ T is unsatisfiable w.r.t. T iff one of the following three conditions holds: (i) there is a primitive concept P with {P, ¬P} ⊆ LT(A); (ii) ConD(A) is unsatisfiable; or (iii) there is an ∃r.B ∈ ET(A) with B unsatisfiable.
Suppose we want to decide whether A is subsumed by B w.r.t. a TBox T in normal form. If A is unsatisfiable, the algorithm answers
"yes". Otherwise, if B is unsatisfiable, it answers "no". If A and B are both satisfiable, it computes a binary relation S on the defined concept names of T. The relation S is initialized with the identity relation and then completed by exhaustively adding pairs (A, B) for which the conditions in Figure 2 are satisfied. It is easily seen that the algorithm runs in time polynomial w.r.t. the size of the input TBox. Let S0, . . . , Sn be the sequence of relations that it produces. To show soundness, it suffices to prove that if (A, B) ∈ Si, i ≤ n, then T |= A ⊑ B. This is straightforward by induction on i. To prove completeness, we have to exhibit a model I of T with AI \ BI ≠ ∅. Such a model is constructed in a two-step process. First, we start with an instance of A, and then "apply" the concept definitions in the TBox as implications from left to right, constructing a potentially infinite, tree-shaped interpretation. In the second step, we apply the concept definitions from right to left, filling up the interpretation of defined concepts. Both steps involve some careful bookkeeping which ensures that the constructed instance of A is not an instance of B.

Theorem 1 Subsumption in EL∪,(¬)(D) w.r.t. TBoxes is in PTime.

This result still holds if we additionally allow role conjunction (r ∩ s) and require that composite roles are in disjunctive normal form (without DNF, subsumption becomes co-NP-hard). It is worth mentioning that, in the presence of general TBoxes, extending EL with each single one of (i) primitive negation, (ii) role disjunction, and (iii) any non-convex concrete domain results in ExpTime-hardness [3]. Note that convexity of a concrete domain is a rather strong restriction, and it is pleasant that we do not need it to achieve tractability. We point out that it should be possible to enhance the expressive power of EL∪,(¬)(D) by enriching it with additional constructors of the DL EL++ [3]. Examples include nominals and transitive roles.
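The completion procedure just described can be sketched in a few lines of Python. The code below is ours and is deliberately simplified: it covers conditions (C1) and (C2) for literals and existential restrictions and omits the concrete-domain condition (C3); the example TBox is a renamed, simplified variant of the Parent/Mother example, with an invented filler concept Person.

```python
def subsumes(L, E, A, B):
    """Return True iff the normalized TBox entails A ⊑ B.
    L[X]: set of primitive literals of X;  E[X]: set of (roles, Y) pairs,
    where roles is a frozenset standing for a role disjunction r1 u ... u rn.
    The concrete-domain condition (C3) is omitted in this sketch."""
    defined = set(L) | set(E)
    S = {(X, X) for X in defined}            # start with the identity relation
    changed = True
    while changed:                            # exhaustively add pairs
        changed = False
        for X in defined:
            for Y in defined:
                if (X, Y) in S:
                    continue
                c1 = L.get(Y, set()) <= L.get(X, set())
                c2 = all(any(ra <= rb and (Xp, Yp) in S
                             for ra, Xp in E.get(X, set()))
                         for rb, Yp in E.get(Y, set()))
                if c1 and c2:
                    S.add((X, Y))
                    changed = True
    return (A, B) in S

# Parent ≡ ∃(has_child u has_adopted).Person,  Mother ≡ Female ⊓ ∃has_child.Person
L = {"Parent": set(), "Mother": {"Female"}, "Person": set()}
E = {"Parent": {(frozenset({"has_child", "has_adopted"}), "Person")},
     "Mother": {(frozenset({"has_child"}), "Person")},
     "Person": set()}
print(subsumes(L, E, "Mother", "Parent"))   # True: has_child ⊆ has_child u has_adopted
print(subsumes(L, E, "Parent", "Mother"))   # False: the literal Female is missing
```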
3.2 Role Disjunction and At-Least Restrictions

In EL≥,∪, we allow role disjunction only in existential restrictions, but not in number restrictions. To show that subsumption w.r.t. TBoxes is tractable, we use a variation of the algorithm in the previous section. In the following, we only list the differences. A TBox is in normal form if, in each A ≡ C ∈ T, C is of the form

  ⊓_{1≤i≤k} Pi  ⊓  ⊓_{1≤i≤ℓ} ∃ri.Bi  ⊓  ⊓_{1≤i≤m} (≥ ni si)

where the Pi are primitive concept names, the ri are of the form r1 ∪ . . . ∪ rn, the Bi are defined concept names, and the si are role names. We use PT(A) to refer to the set of primitive concept names occurring in C, ET(A) is as in the previous section, and NT(A) is the set of number restrictions in C. The conditions for adding a pair (A, B) to the relation S are given in Figure 3.

(C1) PT(B) ⊆ PT(A)
(C2) For each ∃rB.B′ ∈ ET(B) there is ∃rA.A′ ∈ ET(A) such that rA ⊆ rB and (A′, B′) ∈ S
(C3) For each (≥ m r) ∈ NT(B), there is (≥ n r) ∈ NT(A) such that n ≥ m.

Figure 3. EL≥,∪: Conditions for adding (A, B) to S.

Theorem 2 Subsumption in EL≥,∪ w.r.t. TBoxes is in PTime.

In the extension of EL with only at-least restrictions (≥ n r), subsumption w.r.t. general TBoxes is ExpTime-complete [3]. As we will show in Section 4.3, EL extended with at-most restrictions (≤ n r) is intractable already w.r.t. acyclic TBoxes.

4 INTRACTABLE EXTENSIONS

We identify extensions of EL for which subsumption is intractable w.r.t. acyclic and cyclic TBoxes.

4.1 Primitive Negation and At-Least Restrictions

We show that taking the union of the DLs EL∪,(¬)(D) and EL≥,∪ from Sections 3.1 and 3.2 results in intractability. To this end, we consider EL≥,(¬) and show that subsumption w.r.t. the empty TBox is co-NP-complete. It is easy to establish the lower bound also for EL≥(D) as long as there are two concepts p(f1, . . . , fn) and p′(f1, . . . , fm) that are mutually exclusive. This is the case for most practically useful concrete domains D.
For the lower bound, we reduce 3-colorability of graphs to non-subsumption. Given an undirected graph G = (V, E), reserve one concept name Pv for each node v ∈ V, and a single role name r. Then, G is 3-colorable iff CG ⋢ (≥ 4 r), where

  CG := ⊓_{v∈V} ∃r.( Pv ⊓ ⊓_{{v,w}∈E} ¬Pw )

Intuitively, if d ∈ CG^I \ (≥ 4 r)^I, then d has at most three r-successors, each describing one of the three colors. The use of primitive negation in CG ensures that no two adjacent nodes have the same color. A matching upper bound can be derived from the co-NP upper bound for subsumption in ALUN, which has the concept constructors top, bottom (⊥), value restriction (∀r.C), conjunction, disjunction, primitive negation, number restrictions, and unqualified existential restriction [11]. Given two EL≥,(¬)-concepts C, D, we have C ⊑ D iff ¬D ⊑ ¬C. It remains to observe that bringing ¬C and ¬D into negation normal form yields two ALUN-concepts.

Theorem 3 Subsumption in EL≥,(¬) is co-NP-complete.

4.2 Inverse Roles

In [1], it is shown that subsumption w.r.t. the empty TBox is tractable in (an extension of) EL−. We prove that, w.r.t. acyclic TBoxes, subsumption in EL− is PSpace-complete. Since the upper bound follows from PSpace-completeness of subsumption in ALCI [5], we concentrate on the lower bound. We reduce validity of quantified Boolean formulas (QBFs). Let ϕ = Q1 v1 · · · Qk vk.ψ be a QBF, where Qi ∈ {∀, ∃} for 1 ≤ i ≤ k. W.l.o.g., we may assume that ψ = c1 ∧ · · · ∧ cn is in conjunctive normal form. We construct an acyclic TBox Tϕ and select two concept names L0 and E0 such that ϕ is valid iff Tϕ |= L0 ⊑ E0. Intuitively, a model of L0 and Tϕ is a binary tree of depth k that is used to evaluate ϕ. In the tree, a transition from a node at level i to its left successor corresponds to setting vi+1 to false, and a transition to the right successor corresponds to setting vi+1 to true. Thus, each node on level i corresponds to a truth assignment to the variables v1, . . . , vi. In Tϕ, we use a single role name r and the following concept names:

• L0, . . . , Lk represent the level of nodes in the tree model;
• Ci,j, 1 ≤ i ≤ n and 1 ≤ j ≤ k, represents truth of the clause ci on level j of the tree model;
• E0, . . . , Ek are used for evaluating ψ, and the index again refers to the level.

For 1 ≤ j ≤ k, we use Pj to denote the conjunction of all concept names Ci,j, 1 ≤ i ≤ n, such that vj occurs positively in ci; similarly, Nj denotes the conjunction of all concept names Ci,j, 1 ≤ i ≤ n, such that vj occurs negatively in ci. Now, the TBox Tϕ is as follows:

  L0   ≡ ∃r.(L1 ⊓ P1) ⊓ ∃r.(L1 ⊓ N1)
  ···
  Lk−1 ≡ ∃r.(Lk ⊓ Pk) ⊓ ∃r.(Lk ⊓ Nk)
  Ci,j ≡ ∃r−.Ci,j−1                                  for 1 ≤ i ≤ n and 1 < j ≤ k
  Ek   ≡ C1,k ⊓ · · · ⊓ Cn,k
  Ei   ≡ ∃r.Ei+1                                     for 0 ≤ i < k where Qi+1 = ∃
  Ei   ≡ ∃r.(Pi+1 ⊓ Ei+1) ⊓ ∃r.(Ni+1 ⊓ Ei+1)         for 0 ≤ i < k where Qi+1 = ∀

The definitions for L0, . . . , Lk−1 build up the tree. The use of P1 and N1 in these definitions together with the definition of Ci,j sets the truth value of the clause ci according to a partial truth assignment of length j. Finally, the definitions of E0, . . . , Ek evaluate ϕ according to its matrix formula ψ and quantifier prefix. It can be checked that ϕ is valid iff Tϕ |= L0 ⊑ E0.

Theorem 4 Subsumption in EL− w.r.t. acyclic TBoxes is PSpace-complete.

We leave the case of cyclic TBoxes as an open problem. In this case, the lower bound from Theorem 4 is complemented only by the ExpTime upper bound for subsumption in EL− w.r.t. general TBoxes from [3].
4.3 Functional Roles

Let ELf be EL extended with functional roles, i.e., there is a countably infinite subset NF ⊆ NR such that all elements of NF are interpreted as partial functions. It is shown in [3] that subsumption in ELf w.r.t. general TBoxes is ExpTime-complete. We show that it is co-NP-complete w.r.t. acyclic TBoxes and PSpace-complete w.r.t. cyclic ones. We use ELF to denote the variation of ELf in which all role names are interpreted as partial functions. It has been observed in [3] that there is a close connection between ELF and FL0, which provides the concept constructors conjunction and value restriction. It is easy to exploit this connection to transfer the known co-NP-hardness (PSpace-hardness) of subsumption in FL0 w.r.t. acyclic (cyclic) TBoxes, as proved in [16, 12], to ELF. We omit details for brevity.
Since the described approach is not very illuminating regarding the source of intractability, however, we give a dedicated proof of co-NP-hardness of subsumption in ELF w.r.t. acyclic TBoxes using a reduction from 3-SAT to non-subsumption. Let ϕ = c1 ∧ . . . ∧ ck be a 3-formula in the propositional variables p1, . . . , pn and with cj = ℓ_1^j ∨ ℓ_2^j ∨ ℓ_3^j for 1 ≤ j ≤ k. We construct a TBox Tϕ and select concept names Aϕ and B1 such that ϕ is satisfiable iff Tϕ ⊭ Aϕ ⊑ B1. In the reduction, we use two role names r0 and r1 to represent falsity and truth of variables. More precisely, a path r_{v1} · · · r_{vn} with r_{vi} ∈ {r0, r1} corresponds to the valuation pi → vi, 1 ≤ i ≤ n. Additionally, we use a number of auxiliary concept names. The TBox Tϕ is as follows:

  A_i^j ≡ ∃r0.A_{i+1}^j                      if pi ∈ {ℓ_1^j, ℓ_2^j, ℓ_3^j}
  A_i^j ≡ ∃r1.A_{i+1}^j                      if ¬pi ∈ {ℓ_1^j, ℓ_2^j, ℓ_3^j}
  A_i^j ≡ ∃r0.A_{i+1}^j ⊓ ∃r1.A_{i+1}^j      otherwise
  A_{n+1}^j ≡ ⊤
  Aϕ ≡ ⊓_{1≤j≤k} A_1^j
  Bi ≡ ∃r0.B_{i+1} ⊓ ∃r1.B_{i+1}
  B_{n+1} ≡ ⊤

If I is a model of Tϕ and d ∈ (A_1^j)^I, 1 ≤ j ≤ k, then d is the root of a tree in I whose edges are labelled with r0 and r1 and whose paths are the valuations that make the clause cj false. Due to functionality of r0 and r1, each d ∈ Aϕ^I is thus the root of a (single) tree whose paths are precisely the valuations that make some clause in ϕ false. Finally, d ∈ B1^I means that d is the root of a full binary tree of depth n whose paths describe all valuations. It follows that ϕ is satisfiable iff Tϕ ⊭ Aϕ ⊑ B1.
To prove matching upper bounds for ELf, we exploit the fact that, due to the FL0-connection, subsumption in ELF is easily shown to be in co-NP w.r.t. acyclic TBoxes and in PSpace w.r.t. cyclic ones. We give an algorithm for subsumption in ELf that uses subsumption in ELF as a subprocedure. Like the algorithms in Section 3, it computes a binary relation S on the set of defined concept names by repeatedly adding pairs (A, B) such that the input TBox entails A ⊑ B. The algorithm works for both acyclic and cyclic TBoxes, giving us the desired upper bound in both cases. We assume the input TBox T to be in the same normal form as described in Section 3.2, but without concepts of the form (≥ n r). Let S be a binary relation on the defined concept names in T. For every concept ∃r.A occurring in T with r ∉ NF, introduce a fresh concept name Xr,A such that Xr,A = Xr′,A′ iff r = r′, (A, A′) ∈ S, and (A′, A) ∈ S. Now let the ELF-TBox TS be obtained from T by (i) replacing every concept ∃r.A where r ∉ NF with Xr,A, and (ii) for each ∃r.A in T with r ∉ NF, adding the concept definition

  Xr,A ≡ Xr,B1 ⊓ · · · ⊓ Xr,Bn ⊓ Zr,A

where B1, . . . , Bn are all concept names with (A, Bi) ∈ S and (Bi, A) ∉ S, and Zr,A is a fresh concept name. The algorithm starts with S as the identity relation and then exhaustively performs the following step: add (A, B) to S if TS |= A ⊑ B. It returns "yes" if the input concepts form a pair in S, and "no" otherwise. Additionally, we can show that subsumption in ELf without TBoxes is in PTime by a reduction to subsumption in EL.

Theorem 5 Subsumption in ELf is in PTime, co-NP-complete w.r.t. acyclic TBoxes and PSpace-complete w.r.t. cyclic TBoxes.

It is not hard to see that the lower bounds carry over to EL≤.
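The 3-SAT reduction above can be made concrete with a small generator for Tϕ. The sketch below is ours, not the authors': it encodes clauses in DIMACS style, renders the leaf definitions A_{n+1}^j and B_{n+1} as empty conjunctions (reading them as ⊤ is an assumption of this reconstruction), and uses (None, X) pairs for plain conjuncts.

```python
def elf_tbox(n_vars, clauses):
    """Build the acyclic ELF TBox T_phi of the 3-SAT reduction.
    clauses: list of clauses, each a set of non-zero ints (+i for p_i, -i for ¬p_i).
    Returns a dict: name -> list of (role, name) conjuncts; role None means a
    plain concept-name conjunct, and an empty list stands for the top concept."""
    T = {}
    for j, clause in enumerate(clauses, start=1):
        for i in range(1, n_vars + 1):
            succ = f"A{i+1}_{j}"
            if i in clause:            # p_i occurs positively: falsify it with r0
                T[f"A{i}_{j}"] = [("r0", succ)]
            elif -i in clause:         # p_i occurs negatively: falsify it with r1
                T[f"A{i}_{j}"] = [("r1", succ)]
            else:                      # p_i does not occur: branch both ways
                T[f"A{i}_{j}"] = [("r0", succ), ("r1", succ)]
        T[f"A{n_vars+1}_{j}"] = []
    T["A_phi"] = [(None, f"A1_{j}") for j in range(1, len(clauses) + 1)]
    for i in range(1, n_vars + 1):     # B_1, ..., B_{n+1}: the full binary tree
        T[f"B{i}"] = [("r0", f"B{i+1}"), ("r1", f"B{i+1}")]
    T[f"B{n_vars+1}"] = []
    return T

# phi = (p1 v ¬p2) over two variables is satisfiable, so T_phi should not entail A_phi ⊑ B1.
print(elf_tbox(2, [{1, -2}]))
```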
4.4 Booleans
We consider extensions of EL with Boolean constructors, starting with negation. Since EL¬ is a notational variant of ALC, we obtain the following from the results in [17, 18].

Theorem 6 Satisfiability and subsumption in EL¬ are PSpace-complete without TBoxes and w.r.t. acyclic TBoxes, and ExpTime-complete w.r.t. cyclic TBoxes.

Now for disjunction. It has been shown in [6] that subsumption in EL⊔ is co-NP-complete without TBoxes. In order to establish lower
bounds for subsumption w.r.t. TBoxes, we reduce satisfiability in EL¬ to non-subsumption in EL⊔. An EL¬-TBox T is in normal form if for each A ≡ C ∈ T, C is of the form ⊤, P, ¬B, ∃r.B, or B1 ⊓ B2 with P primitive and B, B1, B2 defined. It is straightforward to show that any EL¬-TBox T can be transformed into normal form in linear time such that all (non-)subsumptions are preserved. Thus, let T = {A1 ≡ C1, . . . , An ≡ Cn} be an EL¬-TBox in normal form. Since the proofs underlying Theorem 6 use only a single role name, we may assume w.l.o.g. that T contains only a single role name r. We convert T into an EL⊔-TBox T′ by introducing fresh concept names Ā1, . . . , Ān representing the negations of A1, . . . , An, and replacing every A ≡ ¬Aj ∈ T with A ≡ Āj and every Ai ≡ ∃r.Aj ∈ T with
  Ai ≡ ∃r.(Aj ⊓ ⊓_{1≤k≤n} (Ak ⊔ Āk)),    M ≡ ⊓_{0≤i<…} ∃r. · · · ∃r.(Aj ⊔ Āj) …

… u > 0 ∧ z > 0 ∧ d > 0 ∧ facing(θ, loc(xr, yr), s) ∧ fieldView(β) ∧ visible(loc(xr, yr), b, β, θ, s) ∧
  /* there are no invisible peaks in p */
  (¬∃bI, uI, zI, dI) ( pk(bI, uI, zI, dI) ∈ p ∧ ¬visible(loc(xr, yr), bI, β, θ, s) ),
or in English, sensing a profile p is a possible action if p includes a peak (with positive attributes) from a visible object and has no peaks from objects that are currently not visible (given the robot's orientation and aperture). The predicate visible(v, b, β, θ, s) means that a body b is visible from the current viewpoint v if the field of view is β and the robot is facing a direction θ in the situation s. This predicate …
Below, only the SSA for depth is shown; those for size and dist are analogous.
depth(pk(b, u, z, d), u, loc(xr, yr), do(a, s)) ≡
  (∃t, p) a = sense(p, loc(xr, yr), t) ∧ pk(b, u, z, d) ∈ p
  ∨ (∃t, x, y, x1, y1, r, e) ( a = endMove(R, loc(x1, y1), loc(xr, yr), t) ∧
      location(b, loc(x, y), s) ∧ location(R, loc(x1, y1), s) ∧
      radius(b, r) ∧ euD(loc(x, y), loc(xr, yr), e) ∧ (u = e − r) )
  ∨ (∃t, x1, y1, x2, y2, r, e) ( a = endMove(b, loc(x1, y1), loc(x2, y2), t) ∧
      location(R, loc(xr, yr), s) ∧ location(b, loc(x1, y1), s) ∧
      radius(b, r) ∧ euD(loc(xr, yr), loc(x2, y2), e) ∧ (u = e − r) )
  ∨ depth(pk(b, u, z, d), u, loc(xr, yr), s) ∧
      location(R, loc(xr, yr), s) ∧ (∃x, y) location(b, loc(x, y), s) ∧
      (¬∃t, l, p′, u′, z′, d′, x1, y1) ( a = endMove(R, loc(xr, yr), l, t)
        ∨ a = endMove(b, loc(x, y), loc(x1, y1), t)
        ∨ a = sense(p′, loc(xr, yr), t) ∧ pk(b, u′, z′, d′) ∈ p′ ∧ u′ ≠ u ).
extending(peak, viewpoint, do(a, s)) iff a is a sensing action which measured that the angular size of peak is currently larger than it was at s or a is an endM ove action terminating the process of robot’s motion resulting in the viewpoint such that a computed size of peak from the viewpoint is larger than it was at s or a is an endM ove action terminating the motion of an object to a new position such that from robot’s viewpoint a computed size of peak became larger than it was at s or extending(peak, viewpoint, s) and % frame axiom % a is none of those actions which have effect of decreasing the perceived angular size of peak
33
One of the predicates referring to the transition between pairs of peaks is approaching(pk(b1 , u1 , z1 , d1 ), pk(b2 , u2 , z2 , d2 ),loc(xr , yr ), s), which represents that peaks pk(b1 , u1 , z1 , d1 ) and pk(b2 , u2 , z2 , d2 ) (related, respectively, to objects b1 and b2 ) are approaching each other in situation s as perceived from the viewpoint loc(xr , yr ). (The following relations have analogous arguments to those of approaching, they were omitted here for brevity.) Similarly, receding, states that two peaks are receding from each other. The predicate coalescing, states that two peaks are coalescing. Analogously to coalescing, the relation hiding represents the case of a peak coalescing completely with another peak (corresponding to total occlusion of one body by another). The predicate splitting, states the case of one peak splitting into two distinct peaks; finally, two peak static, states that the two peaks are static. Axioms constraining the transitions between pairs of peaks are straightforward, but long and tedious (due to involved geometric calculations). Therefore, for simplicity, we discuss only a high-level description of the SSA for approaching (the axioms for receding, coalescing, shrinking and hiding are analogous). The axiom for approaching expresses that two depth peaks are approaching iff an apparent angle between them obtained by a sensing action is smaller at the situation do(a, s) than at s or, the observer (or an object) moved to a position such that a calculated apparent angle is smaller at do(a, s) than at s. In the latter case, the apparent angle between peaks from b1 , b2 is calculated by the predicate angle(loc(xb1 , yb1 ), loc(xb2 , yb2 ), loc(xν , yν ), rb1 , rb2 , γ) that has as arguments, respectively, the location of the centroids of objects b1 and b2 , the location of viewpoint ν, the radii of b1 and b2 and γ is an angle that we want to compute. The computations accomplished by angle include the straightforward solution (in time O(1)) of a system of equations (including quadratic equations for the circles representing the perimeter of the objects and linear equations for the tangent rays going from the viewpoint to the circles). Similarly to the threshold L used in the SSA for extending above, the SSA for approaching uses a pre-defined (hardware dependent) threshold Δ (roughly, the number of pixels between peaks) that differentiates approaching (receding) from coalescing (splitting). Another threshold is used in an analogous way to differentiate coalescing from hiding. Figure 1 also exemplifies a case where approaching can be entailed. Consider for instance a robot going from viewpoint ν1 to ν2 , in this case, the angular distance (k − j) between peaks p and q in Fig. 1(d) is less than (e − n) in Fig. 1(b). Moving from viewpoint ν2 to ν1 would result in the entailment of receding. If it was the case that the apparent distance between the objects was less than Δ, coalescing or splitting could be entailed. 
approaching(peak1, peak2, viewpoint, do(a, s)) iff a is a sensing action that measured the angle between peak1 and peak2 and this angle is smaller than it was at s or a is an endM ove action terminating the process of robot’s motion resulting in the viewpoint such that a computed angle between peak1 and peak2 is currently smaller than it was at s or a is an endM ove action terminating the motion of an object to a new position such that from robot’s viewpoint a computed angle between peaks decreased in comparison to what it was at s or approaching(peak1, peak2, viewpoint, s) and % frame axiom% a is none of those actions which have an effect of increasing the perceived angle between peak1 and peak2.
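The geometric computation behind the angle predicate can be approximated with elementary trigonometry when the objects are modelled as circles. The following Python sketch is ours (it is not the system of tangent-line equations used in the paper, and the threshold handling for Δ is omitted): it computes the apparent angular half-width of a circular object and the apparent gap between the facing edges of two objects as seen from a viewpoint.

```python
from math import asin, atan2, hypot, pi

def angular_halfwidth(view, centre, radius):
    """Half of the apparent angular size of a circle seen from `view`."""
    d = hypot(centre[0] - view[0], centre[1] - view[1])
    return asin(min(1.0, radius / d))        # clamp in case the viewpoint touches the circle

def apparent_gap(view, c1, r1, c2, r2):
    """Apparent angle between the facing edges of two circles, the kind of quantity
    used to decide approaching/receding (a non-positive value would mean the peaks
    overlap, i.e. coalescing/hiding territory)."""
    a1 = atan2(c1[1] - view[1], c1[0] - view[0])
    a2 = atan2(c2[1] - view[1], c2[0] - view[0])
    between = abs((a2 - a1 + pi) % (2 * pi) - pi)    # angle between the two centroid directions
    return between - angular_halfwidth(view, c1, r1) - angular_halfwidth(view, c2, r2)

view = (0.0, 0.0)
print(apparent_gap(view, (5.0, 2.0), 0.5, (5.0, -2.0), 0.5))   # gap in radians
```

Comparing this value at s and at do(a, s) for the same pair of objects is, under these assumptions, what decides between approaching and receding.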
We name Theory of Depth and Motion (T DM) a theory consisting of the precondition axioms Dap for actions introduced in this section, SSA Dss for all fluents in this section, an initial theory DS0 (with at least two objects and the robot), together with Duna and Σ.
5 Perception and Motion in T DM

The previous section introduced SSA for depth profiles constraining the fluents on depth peaks to hold when either a particular transition in the attributes of a depth peak was sensed, or the robot (or an object) moved to a position such that a particular transition happens. It is easy to see that the axioms presented above define the conceptual neighbourhood diagram (CND) for depth profiles (Fig. 2). It is worth noting also that the vertices in the conceptual neighbourhood diagram (and the edges connecting them) in Figure 2 represent all the percepts that can be sensed given the depth profile calculus in a domain where objects and the observer can move. Therefore, we can say that perception in T DM is sound and complete wrt motion, in the sense that the vertices and edges of the CND in Fig. 2 result from objects' motion (i.e. perception is sound) and that every motion in the world is accounted for by a fluent or by an edge between fluents in this CND (i.e. it is complete). Our first result is a schema applying to each fluent in T DM that represents perception of relations between peaks.

Theorem 1 (Perception is sound wrt motion). For any fluent F in the CND the following holds:
  T DM |= a ≠ sense(p, loc(xr, yr), t′) ⊃ ( ¬F(x̄, s) ∧ F(x̄, do(a, s)) ⊃ (∃b, l1, l2, t) a = endMove(b, l1, l2, t) )
  T DM |= a ≠ sense(p, loc(xr, yr), t′) ⊃ ( F(x̄, s) ∧ ¬F(x̄, do(a, s)) ⊃ (∃b, l1, l2, t) a = endMove(b, l1, l2, t) ).
For any fluents F and F′ in T DM, if there is an edge between F and F′ in the CND, then the following holds:
  T DM |= a ≠ sense(p, loc(xr, yr), t′) ⊃ ( F(x̄, s) ∧ ¬F′(x̄, s) ∧ ¬F(x̄, do(a, s)) ∧ F′(x̄, do(a, s)) ⊃ (∃b, l1, l2, t) a = endMove(b, l1, l2, t) ).

Proof sketch: The proof of this theorem rephrases the explanation closure axiom that follows from the corresponding SSA (see [11] for details). For every vertex in the CND (i.e., for every perception-related fluent F of T DM), if the last action that the robot did is not a sense action, then the change in the value of this fluent can happen only due to an action endMove. In addition, we show that for every edge linking two distinct fluents F and F′ of the CND in Fig. 2, the transition is due to a move action such that in the resulting situation the fluent F ceases to hold, but F′ becomes true. □

The next theorem states that every motion in the domain is accounted for by a vertex or by an edge of the CND in Fig. 2. We denote by Fi, Fj all perception-related fluents (Fi and Fj can be different vertices or can be the same).

Theorem 2 (Perception is complete wrt motion). For any moving action a in T DM there is a fluent Fi or an edge between two fluents Fi and Fj in the CND:
  T DM |= (∃b, l1, l2, t) a = endMove(b, l1, l2, t) ⊃
    [ ⋁_i Fi(x̄, do(a, s)) ∨ ⋁_{i,j} ( Fi(x̄, s) ∧ ¬Fj(x̄, s) ∧ ¬Fi(x̄, do(a, s)) ∧ Fj(x̄, do(a, s)) ) ]

Proof sketch: The proof follows from the geometric fact that the twelve numbered regions defined by the bi-tangents between two objects (Figure 3) define all possible qualitatively distinct viewpoints from which to observe these objects. It is easy to see that for every motion of the observer within each region or across adjacent regions in Figure 3 there is an action A mentioned in the SSAs that corresponds to this motion. Therefore, it follows from the SSAs that either a vertex of the CND (a fluent F) describes the perception resulting from the motion, or there are two fluents F and F′ such that F ceases to hold after doing a, but F′ becomes true. For instance, take a robot in Region 5 (Fig. 3)
3) facing the two objects a and b, but moving backward from them. The SSAs would allow the conclusion that the peaks referring to a and b would be approaching and shrinking. On the other hand, a robot (still facing a and b) crossing from Region 5 to 6 would be able to en-
tail the transition from approaching to coalescing by using SSAs.

Figure 3. Bi-tangents between two visible objects.
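The bookkeeping behind Theorems 1 and 2 can be pictured as a simple graph check. The sketch below is only an illustration of the idea under stated assumptions: the fluent names and the adjacency relation are hypothetical stand-ins for the actual CND of Fig. 2, and the action encoding is an invention of this sketch, not part of T_DM.

```python
# Hypothetical stand-in for the CND of Fig. 2: vertices are perception fluents,
# edges are the transitions licensed by the depth profile calculus.
CND_EDGES = {
    ("approaching", "coalescing"),
    ("coalescing", "splitting"),
    ("splitting", "receding"),
    ("receding", "approaching"),
}
CND_VERTICES = {f for edge in CND_EDGES for f in edge}

def sound_wrt_motion(action, fluent_before, fluent_after):
    """Theorem 1 as a runtime check: if a non-sense action changed a perception
    fluent, that action must be an endMove and the change must follow a CND edge."""
    if action[0] == "sense" or fluent_before == fluent_after:
        return True                       # nothing to account for
    return action[0] == "endMove" and (fluent_before, fluent_after) in CND_EDGES

def complete_wrt_motion(action, fluent_before, fluent_after):
    """Theorem 2 as a runtime check: every endMove is reflected in a vertex
    (some fluent still holds afterwards) or in an edge (a licensed transition)."""
    if action[0] != "endMove":
        return True
    return fluent_after in CND_VERTICES and \
           (fluent_before == fluent_after or (fluent_before, fluent_after) in CND_EDGES)

# Example: crossing from Region 5 to Region 6 (Fig. 3) while both objects stay visible.
move = ("endMove", "robot", "region5", "region6")
print(sound_wrt_motion(move, "approaching", "coalescing"))      # True
print(complete_wrt_motion(move, "approaching", "coalescing"))   # True
```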
6 Discussion and conclusion

We propose a logical theory built within the situation calculus for reasoning about depth perception and motion of a mobile robot amidst moving objects. The resulting formalism, called Theory of Depth and Motion (T_DM), is a rich language that allows both sensor data assimilation and reasoning about motion in the world, where their effects are calculated with Euclidean geometry. We show that reasoning about perception of depth in T_DM is sound and complete with respect to actual motion in the world. This result proves the conjecture made in [12], which hypothesises that the transitions in the conceptual neighbourhood diagrams of the depth profile calculus are logical consequences of a theory about actions and change. Note that T_DM relies on standard models of dense orders, computational geometry and other quantitative abstractions, but this pays off in the end: we can obtain logical consequences about purely qualitative phenomena (e.g., objects approaching each other) from T_DM. This theory is an important contribution of our paper. Future research includes the implementation of the proposed formalism in a simulator of a dynamic traffic scenario. We expect that the theory presented in this paper will allow the reasoning system to recognize and summarize (in simple sentences) the plans of other vehicles based on knowledge about its own motion and its perceptions. Acknowledgements: Thanks to Joshua Gross, Frédo Durand, and Sherif Ghali for comments about computing visibility efficiently in dynamic 2D scenes. This research has been partially supported by the Canadian Natural Sciences and Engineering Research Council (NSERC) and FAPESP, São Paulo, Brazil.
REFERENCES [1] A. G. Cohn and J. Renz, 'Qualitative spatial representation and reasoning', in Handbook of Knowledge Representation, 551–596, (2008). [2] M. de Berg et al., Computational Geometry, Algorithms and Applications (Chapter 15), 2nd Edition, Springer, 2000. [3] A. Goultiaeva and Y. Lespérance, 'Incremental plan recognition in an agent programming framework', in Cognitive Robotics, Papers from the 2006 AAAI Workshop, pp. 83–90, Boston, MA, USA, (2006). [4] Gerd Herzog, VITRA: Connecting Vision and Natural Language Systems, http://www.dfki.de/vitra/, Saarbrücken, Germany, 1986-1996. [5] H. Levesque and G. Lakemeyer, 'Cognitive robotics', in Handbook of Knowledge Representation, 869–886, Elsevier, (2008). [6] R. Mann, A. Jepson, and J. M. Siskind, 'The computational perception of scene dynamics', CVIU, 65(2), 113–128, (1997). [7] A. Miene, A. Lattner, U. Visser, and O. Herzog, 'Dynamic-preserving qualitative motion description for intelligent vehicles', in IEEE Intelligent Vehicles Symposium (IV-04), pp. 642–646, Parma, Italy, (2004). [8] Hans-Hellmut Nagel, 'Steps toward a cognitive vision system', AI Magazine, 25(2), 31–50, (2004). [9] R. P. A. Petrick, A Knowledge-level approach for effective acting, sensing, and planning, Ph.D. dissertation, University of Toronto, 2006. [10] D. Randell, M. Witkowski, and M. Shanahan, 'From images to bodies: Modeling and exploiting spatial occlusion and motion parallax', in Proc. of IJCAI, pp. 57–63, Seattle, U.S., (2001). [11] Raymond Reiter, Knowledge in Action. Logical Foundations for Specifying and Implementing Dynamical Systems, MIT, 2001. [12] Paulo Santos, 'Reasoning about depth and motion from an observer's viewpoint', Spatial Cognition and Computation, 7(2), 133–178, (2007). [13] M. Soutchanski, 'A correspondence between two different solutions to the projection task with sensing', in Proc. of the 5th Symposium on Logical Formalizations of Commonsense Reasoning, pp. 235–242, New York, USA, May 20-22, (2001).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-35
Comparing Abductive Theories Katsumi Inoue 1 and Chiaki Sakama 2 Abstract. This paper introduces two methods for comparing explanation power of different abductive theories. One is comparing explainability for observations, and the other is comparing explanation contents for observations. Those two measures are represented by generality relations over abductive theories. The generality relations are naturally related to the notion of abductive equivalence introduced by Inoue and Sakama. We also analyze the computational complexity of these relations.
1
Introduction
Abduction has been used in many applications of AI including diagnosis, design, updates, and discovery. Abduction is incorporated in problem-solving and programming technologies as abductive logic programming [11]. In the process of building knowledge bases, we need to update an abductive theory in accordance with situation change and discovery of surprising facts. For example, to refine an incomplete description, one may need to add more details to a part of the current theory. Such a refinement is expected to ensure that the revised theory is more powerful in abductive reasoning than the previous one. Then, it is important to evaluate abductive theories by comparing abductive power of each theory in such processes. In predicate logic, comparison of information contents between theories is done by comparing their logical consequences. For example, given two first-order theories T1 and T2 , T1 is considered more informative than T2 if T2 |= ψ implies T1 |= ψ for any formula ψ, i.e., T1 |= T2 . In this case, it is also said T1 is more general than T2 [13, 14]. On the other hand, T1 and T2 are equally informative if T1 |= T2 and T2 |= T1 , that is, if T1 and T2 are logically equivalent (T1 ≡ T2 ). Recently, Inoue and Sakama considered the generality conditions for answer set programming (ASP) [9] and for Reiter’s default logic [10]. These generality/equivalence relations compare monotonic/nonmonotonic theories in terms of deduction. The topic of our interest in this paper is how to compare abductive theories. That is, we seek conditions under which an abductive theory has more explanation power than another abductive theory. As far as the authors know, no answer to this question is given in the literature of abduction. To understand the problem, suppose that an abductive theory A1 is defined to be stronger than another abductive theory A2 . This might imply that there is a formula which can be explained in the former but cannot be in the latter. Then, we would expect that A1 has more background knowledge than A2 or A1 has more hypotheses than A2 . However, the situation is not so simple because addition of background knowledge may violate the consistency of some combination of hypotheses. Hence, relationships between 1 2
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan. email:
[email protected] Wakayama University, Sakaedani, Wakayama 640-8510, Japan. email:
[email protected] amounts of background theories and hypotheses need to be analyzed in depth to compare abductive theories precisely. In this paper, we consider two logical frameworks for abduction, first-order abduction and abductive logic programming (ALP). Then, we introduce two methods for comparing explanation power of different abductive theories, which were originally introduced by Inoue and Sakama [8] to identify equivalence of two abductive theories. The first one is aimed at comparing explainability for observations in different theories, while the second one is aimed at comparing explanation contents for observations. Those two comparison measures are represented by generality relations over abductive theories. Moreover, the generality relations can naturally be related to the notion of abductive equivalence in [8]. Note that the proposed techniques for first-order abduction can also be applied to comparing frameworks for explanatory induction in inductive logic programming. The rest of this paper is organized as follows. Section 2 introduces two generality relations for comparing abductive first-order theories. Section 3 applies the similar techniques to ALP. Section 4 relates the abductive generality relations to abductive equivalence. Section 5 discusses the complexity issues. Section 6 gives concluding remarks.
2
Generality Relations in First-order Abduction
In this section, we consider abductive theories represented in firstorder logic, which have often been used in abduction in AI, e.g., [17]. In this setting, abductive theories are compared by two measures. Definition 1 Suppose that B and H are sets of first-order formulas, where B represents background knowledge and H is a set of (candidate) hypotheses. We call a pair (B, H) a (first-order) abductive theory. Given a formula O as an observation, a set E of formulas belonging to H 3 is an explanation of O in (B, H) if B ∪ E |= O and B ∪ E is consistent. We say that O is explainable in (B, H) if it has an explanation in (B, H).
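Definition 1 can be prototyped directly for small propositional theories. The sketch below is an illustration only: it uses a brute-force truth-table check for entailment and consistency and encodes formulas as Python predicates over an assignment dictionary; neither the encoding nor the enumeration is part of the paper's framework.

```python
from itertools import combinations, product

def assignments(atoms):
    """All truth assignments over the given atoms, as dicts."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def satisfiable(atoms, formulas):
    return any(all(f(v) for f in formulas) for v in assignments(atoms))

def entails(atoms, theory, goal):
    """theory |= goal, checked by brute-force truth tables (propositional case only)."""
    return all(goal(v) for v in assignments(atoms) if all(f(v) for f in theory))

def explanations(atoms, B, H, O):
    """All E ⊆ H such that B ∪ E is consistent and B ∪ E |= O (Definition 1).
    H maps hypothesis names to formulas; explanations are tuples of names."""
    found = []
    for r in range(len(H) + 1):
        for E in combinations(sorted(H), r):
            theory = list(B) + [H[h] for h in E]
            if satisfiable(atoms, theory) and entails(atoms, theory, O):
                found.append(E)
    return found

# Toy instance in the spirit of Example 1 in Section 2.1: B = {sprinkler ⊃ wet}.
atoms = ["sprinkler", "rained", "wet"]
B = [lambda v: (not v["sprinkler"]) or v["wet"]]       # sprinkler_was_on ⊃ grass_is_wet
H = {"sprinkler": lambda v: v["sprinkler"],
     "rained":    lambda v: v["rained"]}
O = lambda v: v["wet"]                                  # observation: grass is wet

print(explanations(atoms, B, H, O))
# [('sprinkler',), ('rained', 'sprinkler')]
```

Note that, as in the paper, no minimality of explanations is required, so supersets of an explanation that remain consistent are reported as well.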
2.1
Comparing Explainability
We first consider a measure for comparing explainability between abductive theories. Definition 2 An abductive theory A1 = (B1 , H1 ) is more (or equally) explainable than an abductive theory A2 = (B2 , H2 ), written as A1 ≥ A2 , if every observation explainable in A2 is also explainable in A1 . 3
In this paper we do not specify how H is constructed. For example, when hypotheses contain variables, we could just assume that the set H is closed under instantiation. In another case, we could specify the language of H with a bias and then define that any formula which is constructed from H and satisfies the bias belongs to H. This latter treatment enables us to deal with comparing theories for inductive logic programming (ILP) [14] within the same logical framework as abduction. In any case, we simply denote as E ⊆ H when E is a set of formulas belonging to H.
Example 1 Consider three abductive theories A1 = (B1, H1), A2 = (B2, H2) and A3 = (B3, H3), where
B1 = { sprinkler was on ⊃ grass is wet },
H1 = { sprinkler was on, rained last night },
B2 = B1 ∪ { rained last night ⊃ grass is wet },
H2 = H1 ∪ { ¬(sprinkler was on ⊃ grass is wet) },
B3 = B2 ∪ { grass is wet ⊃ shoes are wet },
H3 = H1 ∪ { ¬(sprinkler was on ⊃ shoes are wet) }.
Then, A3 ≥ A2 ≥ A1 holds. In fact, every observation explainable in Ai is explainable in Ai+1 for i = 1, 2. Notice that A1 ≥ A2 also holds because rained last night can be explained by itself in both A1 and A2. By contrast, shoes are wet is explainable in A3, but is not in either A1 or A2, i.e., A2 ≱ A3. Note that each additional hypothesis in Hj \ H1 for j = 2, 3 has no effect in explaining any formula, as it cannot be added to Bj without violating consistency. We provide a necessary and sufficient condition for the explainable generality relation. In the following, Th(Σ) denotes the set of logical consequences of a set Σ of first-order formulas.
Proof: For any abductive theory (B, H), we can associate a prerequisite-free normal default theory Δ = (D_H, B), where D_H = { :h / h | h ∈ H }. Then there is a 1-1 correspondence between the extensions of Δ (in the sense of Reiter [18]) and Ext((B, H)) [17, Theorem 4.1]. By the semi-monotonicity of normal default theories [18, Theorem 3.2], H1 ⊇ H2 implies that, for any extension F of Δ2 = (D_H2, B), there is an extension E of Δ1 = (D_H1, B) such that F ⊆ E. By Theorem 2, the result holds. ✷ For abductive theories A1 = (B1, H) and A2 = (B2, H) with the same hypotheses, B1 |= B2 implies neither A1 ≥ A2 nor A2 ≥ A1. This explains the name of semi-monotonicity in Proposition 4. Example 2 Suppose the abductive theories A = (B, H) and A′ = (B′, H) where B = {a ∧ b ⊃ p}, B′ = B ∪ {¬b}, and H = {a, b}. Then, A′ ≱ A because p has the explanation {a, b} in A but is not explainable in A′. On the other hand, A ≱ A′ because ¬b has the explanation ∅ in A′ but is not explainable in A.
2.2
Comparing Explanations
We next provide a second measure for comparing abductive theories. This time we compare explanation contents.
Definition 3 An extension of an abductive theory A = (B, H) is Th(B ∪ S), where S is a maximal set of formulas belonging to H such that B ∪ S is consistent. The set of all extensions of A is denoted as Ext(A).
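Extensions are infinite theories Th(B ∪ S), so a finite prototype can only manipulate their generating sets. The sketch below (propositional, brute force, using the same hypothesis-dictionary encoding as the previous sketch) enumerates the maximal consistent S ⊆ H of Definition 3 and tests the containment between extensions that Theorem 2 below turns into a characterisation of ≥; it is an illustration under those assumptions, not machinery from the paper.

```python
from itertools import combinations, product

def _models(atoms, formulas):
    """Truth assignments over `atoms` satisfying every formula (propositional only)."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in formulas):
            yield v

def extensions(atoms, B, H):
    """Generating sets of the extensions of (B, H): maximal S ⊆ H with B ∪ S
    consistent (Definition 3), returned as tuples of hypothesis names."""
    consistent = [S for r in range(len(H) + 1)
                  for S in combinations(sorted(H), r)
                  if any(True for _ in _models(atoms, list(B) + [H[h] for h in S]))]
    return [S for S in consistent if not any(set(S) < set(F) for F in consistent)]

def th_contained(atoms, B2, H2, S2, B1, H1, S1):
    """Th(B2 ∪ S2) ⊆ Th(B1 ∪ S1): every generator of the former holds in every
    model of the latter (the finite reformulation of the test used by Theorem 2)."""
    generators = list(B2) + [H2[h] for h in S2]
    return all(all(g(m) for g in generators)
               for m in _models(atoms, list(B1) + [H1[h] for h in S1]))
```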
Definition 4 An abductive theory A1 = (B1, H1) is more (or equally) explanatory than an abductive theory A2 = (B2, H2), written as A1 ⊒ A2, if, for any observation O, every explanation of O in A2 is also an explanation of O in A1.
Lemma 1 ([17]) Let O be a (possibly infinite) set of formulas. There is an explanation that explains every formula in O in (B, H) iff there is an extension X of (B, H) such that O ⊆ X.
Example 3 For three abductive theories in Example 1, A3 ⊒ A2 ⊒ A1 holds. Although A1 ≥ A2 holds, we see that A1 ⋢ A2 because {rained last night} is an explanation of grass is wet in A2 but is not in A1.
Theorem 2 Let A1 = (B1 , H1 ) and A2 = (B2 , H2 ) be abductive theories. Then, A1 ≥ A2 holds iff for any extension X2 of A2 , there is an extension X1 of A1 such that X2 ⊆ X1 .
It is easy to see that the relation ⊒ is stronger than the relation ≥, that is, A1 ⊒ A2 implies A1 ≥ A2. Now we show the necessary and sufficient condition for explanatory generality.
Proof: (⇐) By Lemma 1, if an observation O is explainable in A2, there is X2 ∈ Ext(A2) such that O ∈ X2. For any such X2, there is X1 ∈ Ext(A1) such that X2 ⊆ X1. Then, O ∈ X1 and O is explainable in (B1, H1) by Lemma 1. Hence, A1 ≥ A2. (⇒) Assume that there is X2 ∈ Ext(A2) such that X2 ⊄ X1 for any X1 ∈ Ext(A1). Pick a formula ψ_i for each X1_i ∈ Ext(A1) such that ψ_i ∈ (X2 \ X1_i) (≠ ∅), and let O be the set of ψ_i's from every X1_i. Then, O ⊆ X2 but O ⊄ X1 for any X1 ∈ Ext(A1). By Lemma 1, ⋀_{F∈O} F is explainable in A2 but is not explainable in A1. Hence, A1 ≱ A2. ✷ There are several classes of abductive theories in which we can see explainable generality holds under some simple conditions. Proposition 3 (Assumption-freeness) Suppose two abductive theories (B1, L) and (B2, L), where L is the set of all literals in the underlying language. Then, (B1, L) ≥ (B2, L) iff B2 |= B1. Proof: Any extension of an abductive theory (Bi, L) is logically equivalent to a (complete) model of Bi. By Theorem 2, (B1, L) ≥ (B2, L) iff, for any model M of B2, there is a model N of B1 such that M ⊆ N. Because both M and N are complete, M ⊆ N implies M = N. Hence, any model of B2 is a model of B1. ✷ Proposition 4 (Semi-monotonicity) Suppose that (B, H1) and (B, H2) are two abductive theories with the same background knowledge. If H1 ⊇ H2, then (B, H1) ≥ (B, H2).
Theorem 5 Let A1 = (B1, H1) and A2 = (B2, H2) be abductive theories. Then, A1 ⊒ A2 holds iff B1 |= B2 and H1′ ⊇ H2′ hold, where Hi′ = { E ⊆ Hi | Bi ∪ E is consistent } for i = 1, 2. Proof: Note that any explanation E of an observation O in (Bi, Hi) satisfies that (1) Bi ∪ E |= O and (2) E ∈ Hi′. (⇐) Suppose A1 ⋢ A2. Then there exist a formula O and a set E of formulas such that B2 ∪ E |= O and E ∈ H2′ while B1 ∪ E ⊭ O or E ∉ H1′. If B1 ∪ E ⊭ O holds, we have B1 ⊭ E ⊃ O and B2 |= E ⊃ O, which implies B1 ⊭ B2. If E ∉ H1′ holds, by E ∈ H2′ we have H2′ ⊄ H1′. Hence, the result holds. (⇒) Suppose A1 ⊒ A2. Then for any formula O and any set E of formulas, B2 ∪ E |= O and E ∈ H2′ imply B1 ∪ E |= O and E ∈ H1′. By the fact that B2 ∪ E |= O implies B1 ∪ E |= O for any O, we have B1 ∪ E |= B2 ∪ E for any E ∈ H2′ ∩ H1′. Then, B1 |= B2 holds when E = ∅. By the fact that E ∈ H2′ implies E ∈ H1′, we also have H2′ ⊆ H1′. Hence, the result holds. ✷ Corollary 6 Let A1 = (B1, H1) and A2 = (B2, H2) be abductive theories. Then, A1 ⊒ A2 holds iff B1 |= B2 and A1 ≥ A2 hold. Proof: The set Hi′ in Theorem 5 contains every subset E of Hi such that Bi ∪ E is consistent. Hi′ can be characterized by Ext(Ai), as each consistent theory is a subset of some extension. Then, it can be proved that H1′ ⊇ H2′ iff for any X2 ∈ Ext(A2), there is X1 ∈ Ext(A1) such that X2 ⊆ X1. Hence, the result follows from Theorem 2. ✷ Corollary 7 If H1 ⊇ H2, then (B, H1) ⊒ (B, H2) holds.
3
Generality Relations in Abductive Logic Programming
In this section, we turn our attention to generality relations in abductive logic programming (ALP) [11]. The most significant difference between abduction in first-order logic and ALP is that ALP allows the nonmonotonic negation-as-failure operator not in a background program. When the background program P is nonmonotonic, the fact that P ∪ E is consistent for some set E of hypotheses does not necessarily imply that P ∪ E′ is consistent for E′ ⊂ E. Hence abductive power in ALP has to be compared in a more naive manner, upon each subset of hypotheses.
Definition 8 Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs, and G an observation. A1 is more (or equally) explainable than A2, written as A1 ≥ A2, if every observation explainable in A2 is also explainable in A1. On the other hand, A1 is more (or equally) explanatory than A2, written as A1 ⊒ A2, if, for any observation G, every explanation of G in A2 is also an explanation of G in A1. Example 4 Let A1 = ⟨P1, Γ⟩ and A2 = ⟨P2, Γ⟩ be abductive programs, where P1 = { p ← a, a ← b }, P2 = { p ← a, p ← b }, and Γ = {a, b}. Then, A1 ≥ A2 and A2 ≥ A1, while A1 ⊒ A2 but A2 ⋣ A1. In fact, {b} is an explanation of a in A1, but is not in A2. The following results hold for the two generality relations.
Definition 5 An abductive (logic) program is a pair ⟨P, Γ⟩ where • P is a (logic) program, which is a set of rules of the form:
L1 ; · · · ; Lk ; not Lk+1 ; · · · ; not Ll ← Ll+1 , . . . , Lm , not Lm+1 , . . . , not Ln   (1)
where each Li is a literal (n ≥ m ≥ l ≥ k ≥ 0), and not represents negation as failure (NAF). The symbol ; represents disjunction. The left-hand side of the rule is the head, and the right-hand side is the body. A program containing variables is a shorthand of its ground instantiation. • Γ is a set of literals, called abducibles. Any instance of an abducible is also an abducible. Logic programs mentioned above belong to the class of general extended disjunctive programs (GEDPs) [6]. If any rule of the form (1) in a program P does not contain not in its head, i.e., k = l, P is called an extended disjunctive program (EDP) [4]. Moreover, if the head of any rule in an EDP P contains no disjunction, i.e., k = l ≤ 1, P is called an extended logic program (ELP). A semantics of a logic program is given by the answer set semantics [4, 6]. We denote the set of all ground literals in the language of a program as Lit. For a program P , the set of answer sets of P is denoted as AS(P ). When P is an EDP, AS(P ) is an antichain in 2Lit , that is, for any two answer sets S1 , S2 ∈ AS(P ), S1 ⊆ S2 implies S1 = S2 [4], but this is not the case for a GEDP. A semantics for ALP is given by extending answer sets of the background program with addition of abducibles. Such an extended answer set is called a belief set, which has also been called a generalized stable model [11]. Definition 6 Let A = P, Γ be an abductive program, and E ⊆ Γ. A belief set of A (with respect to E) is a consistent answer set of the logic program P ∪ E. The set of all belief sets of A is denoted as BS(A). A set S ∈ BS(A) is often denoted as SE when S is a belief set with respect to E. Definition 7 Let A = P, Γ be an abductive program, and G a conjunction of ground literals called an observation. We will often identify a conjunction G with the set of literals in G. A set E ⊆ Γ is an explanation of G in A if every ground literal in G is true in a belief set of A with respect to E.4 When G has an explanation in A, G is explainable in A. Note that restrictions in ALP can be removed so that not only literals but rules can be allowed as abducibles and that observations can contain NAF formulas as well as literals. As in the case of first-order abduction, two generality relations are defined for ALP as follows. 4
This definition provides credulous explanations. Alternatively, skeptical explanations are defined as E ⊆ Γ such that G is true in every belief set of A with respect to E.
Theorem 8 Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs. Then, A1 ≥ A2 holds iff for any belief set S2 of A2, there is a belief set S1 of A1 such that S2 ⊆ S1. Proof: (⇐) If G is explainable in A2, there is S2 ∈ BS(A2) such that G ⊆ S2. For any such S2, there is S1 ∈ BS(A1) such that S2 ⊆ S1. Then, G ⊆ S1 and G is explainable in A1. Hence, A1 ≥ A2. (⇒) Assume that there is S2 ∈ BS(A2) such that S2 ⊄ S1 for any S1 ∈ BS(A1). For each S1_i ∈ BS(A1), pick a literal Li such that Li ∈ (S2 \ S1_i) (≠ ∅), and let G be the set of Li's from every S1_i. Then, G ⊆ S2 but G ⊄ S1 for any S1 ∈ BS(A1). That is, G is explainable in A2 but is not in A1, i.e., A1 ≱ A2. ✷ Theorem 9 Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs. Then, A1 ⊒ A2 holds iff for any E ⊆ Γ2 and any S_E ∈ BS(A2), there is T_E ∈ BS(A1) such that E ⊆ Γ1 and S_E ⊆ T_E. Proof: (⇒) Suppose A1 ⊒ A2. Then, for any observation G and any E ⊆ Γ2, the fact that G ⊆ S_E for some S_E ∈ BS(A2) implies that G ⊆ T_E for some T_E ∈ BS(A1). Thus, S_E ⊆ T_E. (⇐) Suppose S_E ∈ BS(A2) for any E ⊆ Γ2 implies the existence of T_E ∈ BS(A1) with E ⊆ Γ1 such that S_E ⊆ T_E. Then, for any observation G, G ⊆ S_E implies G ⊆ T_E. That is, if G has an explanation E in A2, G has the same explanation E in A1. ✷ Theorem 8 and Theorem 9 might look similar, but the condition of the latter is finer-grained than that of the former. In fact, as in the case of first-order abduction, A1 ⊒ A2 implies A1 ≥ A2.
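The belief-set machinery of Definitions 6-8 can be prototyped by brute force for small ground programs. The sketch below is illustrative only, assuming normal programs without classical negation or disjunction and a naive guess-and-check over the reduct; it is not the decision procedure analysed in Section 5.

```python
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def least_model(rules):
    """Least model of a definite (negation-free) ground program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(program, atoms):
    """Stable models of a ground normal program by guess-and-check via the reduct.
    A rule is (head, positive_body, negative_body) over atom names."""
    stable = []
    for guess in map(set, powerset(atoms)):
        reduct = [(h, set(pos)) for h, pos, neg in program if not set(neg) & guess]
        if least_model(reduct) == guess:
            stable.append(frozenset(guess))
    return stable

def belief_sets(program, abducibles, atoms):
    """Belief sets of <P, Γ> (Definition 6): answer sets of P ∪ E, indexed by E ⊆ Γ."""
    return {frozenset(E): answer_sets(program + [(a, set(), set()) for a in E], atoms)
            for E in powerset(abducibles)}

# Example 4: P1 = {p ← a, a ← b}, P2 = {p ← a, p ← b}, Γ = {a, b}.
P1 = [("p", {"a"}, set()), ("a", {"b"}, set())]
P2 = [("p", {"a"}, set()), ("p", {"b"}, set())]
atoms, gamma = ["a", "b", "p"], ["a", "b"]
bs1, bs2 = belief_sets(P1, gamma, atoms), belief_sets(P2, gamma, atoms)
# {b} is an explanation of a in <P1, Γ> but not in <P2, Γ>:
print(any("a" in S for S in bs1[frozenset({"b"})]))   # True
print(any("a" in S for S in bs2[frozenset({"b"})]))   # False
```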
4
Connection to Abductive Equivalence
In this section, we consider the relationship between the generality relations in abduction proposed in this paper and the equivalence relations in abduction proposed in the literature. Inoue and Sakama [8] study different types of equivalence relations in abduction: explainable/explanatory equivalence of abductive theories under both first-order abduction and ALP. Pearce et al. [16] characterize a part of these problems in the context of equilibrium logic. In the following, an abductive framework A means either a first-order abductive theory A = (B, H) or an abductive logic program A = P, Γ . Definition 9 ([8]) Let A1 and A2 be abductive frameworks. 1. A1 and A2 are explainably equivalent if, for any observation O,5 O is explainable in A1 iff O is explainable in A2 . 2. A1 and A2 are explanatorily equivalent if, for any observation O, E is an explanation of O in A1 iff E is an explanation of O in A2 . 5
This definition of explainable equivalence for ALP is not exactly the same as that in [8, Definition 4.3]. In [8] an observation is a single ground literal, while we allow a conjunction of ground literals as an observation.
Explainable equivalence requires that two abductive frameworks have the same explainability for any observation. Explainable equivalence may reflect a situation that two programs have different knowledge to derive the same goals. On the other hand, explanatory equivalence assures that two abductive frameworks have the same explanation contents for any observation. Explanatory equivalence is stronger than explainable equivalence: if two abductive frameworks are explanatorily equivalent then they are explainably equivalent. By Definitions 2, 4, 8, and 9, it is obvious that all generality relations defined in this paper are “anti-symmetric”6 in the sense that two abductive frameworks are explainably/explanatorily equivalent iff one is both more (or equally) and less (or equally) explainable/explanatory than another at the same time.
there is T ∈ max(AS(P1 ∪ E)) such that T ⊆ T . By A2 A1 , there is S ∈ AS(P2 ∪ E) such that T ⊆ S , and then there is S ∈ max(AS(P2 ∪ E)) such that S ⊆ S . Then S ⊆ S holds and both belong to max(AS(P2 ∪ E)), which imply S = T = S , and thus S ∈ max(AS(P1 ∪ E)). Hence, (1) if E ⊆ Γ2 and P2 ∪ E is consistent then E ⊆ Γ1 and P1 ∪ E is consistent, and (2) max(AS(P2 ∪ E)) ⊆ max(AS(P1 ∪ E)) for any E ⊆ Γ2 . Similarly, (3) if E ⊆ Γ1 and P1 ∪ E is consistent then E ⊆ Γ2 and P2 ∪ E is consitent, and (4) max(AS(P1 ∪ E)) ⊆ max(AS(P2 ∪ E)) for any E ⊆ Γ1 . By (1) and (3), C1 = C2 holds. By (2) and (4), max(AS(P1 ∪ E)) = max(AS(P2 ∪ E)) holds for any E ⊆ Γ1 and for any E ⊆ Γ2 . Hence, the result follows. (⇐) can be proved in a similar way. 2
Proposition 10 Let A1 and A2 be abductive frameworks.
Two logic programs P1 and P2 are strongly equivalent with respect to a rule set R if AS(P1 ∪ R) = AS(P2 ∪ R) for any logic program R ⊆ R [7]. This equivalence notion is a restricted version of strong equivalence [12], and is called relative strong equivalence [7].7 The next result was originally shown in [8]8 and then was discussed in [16] for EDPs. Now it can be simply proved by the antichain property of AS(P ) for any EDP P .
1. A1 and A2 are explainably equivalent iff A1 ≥ A2 and A2 ≥ A1. 2. A1 and A2 are explanatorily equivalent iff A1 ⊒ A2 and A2 ⊒ A1. With this correspondence and results in previous sections, we can derive either new characterizations of abductive equivalence or new (and simple) proofs of previously presented results. For first-order abduction, the following results can be verified with new proofs. Proposition 11 Two first-order abductive theories A1 and A2 are explainably equivalent iff Ext(A1) = Ext(A2) holds. Proposition 12 For first-order abductive theories A1 = (B1, H1) and A2 = (B2, H2), the following four statements are equivalent.
1. A1 and A2 are explanatorily equivalent.
2. A1 and A2 are explainably equivalent and B1 ≡ B2.
3. B1 ≡ B2 and H1 = H2.
4. B1 ≡ B2 and H1′ = H2′, where Hi′ = { h ∈ Hi | Bi ∪ {h} is consistent } for i = 1, 2.
For ALP, the next results can be newly obtained. In the following, for any set X, let max(X) = { x ∈ X | ¬∃y ∈ X. x ⊂ y }. Theorem 13 Let A1 = P1 , Γ1 and A2 = P2 , Γ2 be abductive programs. Then, A1 and A2 are explainably equivalent iff max(BS(A1 )) = max(BS(A2 )). Proof: (⇒) By Theorem 8, A1 ≥ A2 implies that, for any S2 ∈ max(BS(A2 )) there exists S1 ∈ BS(A1 ) such that S2 ⊆ S1 , and then there exists S1 ∈ max(BS(A1)) such that S1 ⊆ S1 . By A2 ≥ A1 , there exists S2 ∈ BS(A2 ) such that S1 ⊆ S2 , and then there exists S2 ∈ max(BS(A2 )) such that S2 ⊆ S2 . Then S2 ⊆ S2 holds, but because both belong to max(BS(A2 )), S2 = S2 holds. Hence, S2 (= S1 ) also belongs to max(BS(A1 )), and thus the result holds. (⇐) can be proved by tracing the above proof backward. 2 Theorem 14 Let A1 = P1 , Γ1 and A2 = P2 , Γ2 be abductive programs. A1 and A2 are explanatorily equivalent iff C1 = C2 holds and max(AS(P1 ∪ E)) = max(AS(P2 ∪ E)) for any E ∈ Ci , where Ci = { E ⊆ Γi | Pi ∪ E is consistent } for i = 1, 2. Proof: (⇒) Suppose that A1 and A2 are explanatorily equivalent. By Theorem 9, A1 A2 implies that, for any E ⊆ Γ2 and any SE ∈ BS(A2 ), there is TE ∈ BS(A1 ) such that E ⊆ Γ1 and SE ⊆ TE . Then, for any E ⊆ Γ2 and any S ∈ max(AS(P2 ∪ E)), E ⊆ Γ1 and there is T ∈ AS(P1 ∪ E) such that S ⊆ T , and then 6
The relations ≥ and ⊒ are also preorders, i.e., reflexive and transitive, for both first-order abduction and ALP.
Corollary 15 Let A1 = P1 , Γ and A2 = P2 , Γ be abductive programs with the same hypotheses such that both P1 and P2 are EDPs. Also, let Pi = Pi ∪{ ← L, ¬L | L ∈ Lit} for i = 1, 2. Then, A1 and A2 are explanatorily equivalent iff P1 and P2 are strongly equivalent with respect to Γ.
5
Complexity Results
We show that the computational complexity of deciding generality between abductive theories becomes more complex in general than that of abductive equivalence presented in [8]. Theorem 16 Let A1 and A2 be two propositional abductive theories. Deciding if A1 ≥ A2 is Π^P_3-complete. Proof: Let A1 = (B1, H1) and A2 = (B2, H2). We here identify Ext(Ai) with the extensions of the prerequisite-free normal default theory (D_Hi, Bi) for i = 1, 2, as in the proof of Proposition 4. For any subset S ⊆ H2, checking if E = Th(B2 ∪ S) is an extension of A2 is coNP-complete [19]. If E ∈ Ext(A2), then deciding if there does not exist F ∈ Ext(A1) such that E ⊆ F can be determined by checking if the formula ⋀B2 ∧ ⋀S belongs to some extension of A1, which is Σ^P_2-complete [5]. Thus, we can choose S ⊆ H2 in nondeterministic polynomial time with a Σ^P_2-oracle to decide if A1 ≱ A2 holds. Hence, the original problem is the complement of this, and belongs to Π^P_3. We omit the proof of Π^P_3-hardness because of the space limitation. ✷ Theorem 17 Let A1 and A2 be two propositional abductive theories. Deciding if A1 ⊒ A2 is Π^P_3-complete. Proof: Follows from Corollary 6 and Theorem 16. ✷
This definition is due to [7], and is slightly different from the notion of relativized equivalence in [20, 16]. In [20], P1 and P2 are defined as strongly equivalent relative to a literal set U iff AS(P1 ∪ R) = AS(P2 ∪ R) for any set R of rules that are constructed using literals in U . 8 The condition of EDPs was missing in [8, Theorem 4.4]. In fact, only Theorem 14 holds for GEDPs. Moreover, to characterize inconsistent programs in ALP, an EDP having the answer set Lit should be translated to an EDP without an answer set in Corollary 15.
Theorem 18 Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs. Deciding if A1 ≥ A2 is (i) Π^P_2-complete when P1 and P2 are ELPs, and is (ii) Π^P_3-complete when P1 and P2 are GEDPs. Proof: A computation problem in GEDPs reduces in polynomial time to the corresponding problem in EDPs [6], so we here consider the cases that each Pi is either an ELP or an EDP. (Membership) For any guess S ⊆ Lit, deciding if S ∈ BS(A2) is NP-complete for an ELP P2 (resp. Σ^P_2-complete for an EDP P2) [2]. For such an S, deciding if there does not exist T ∈ BS(A1) such that S ⊆ T can be determined by credulous reasoning that contains S, which is NP-complete for an ELP P1 (resp. Σ^P_2-complete for an EDP P1) [2]. Hence, by Theorem 8, A1 ≱ A2 can be decided nondeterministically with two calls to an NP-oracle (resp. a Σ^P_2-oracle). Therefore, the complement is in Π^P_2 (resp. Π^P_3). (Hardness) We prove the ELP case. Let Φ = ∀X∃Y.φ be a closed QBF, where φ = C1 ∨ · · · ∨ Cn is a DNF formula, that is, each Cj is a conjunction of literals. Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs such that P1 = {g ← Cj | 1 ≤ j ≤ n}, Γ1 = X ∪ ¬X ∪ Y ∪ ¬Y, P2 = {g ← }, and Γ2 = X ∪ ¬X, where ¬X = {¬x | x ∈ X} and ¬Y = {¬y | y ∈ Y}. Note that both P1 and P2 are ELPs. We prove that: A1 ≥ A2 ⇔ Φ is valid. (⇒) Suppose A1 ≥ A2. By Theorem 8, for any S ∈ BS(A2), there is T ∈ BS(A1) such that S ⊆ T. In particular, for any IX ⊆ X, there is a belief set S ∈ BS(A2) with respect to IX ∪ ¬(X \ IX), and hence IX ∪ ¬(X \ IX) ⊆ T for some T ∈ BS(A1). Since g ∈ S, g must be in T too. Then, some Cj (1 ≤ j ≤ n) must be true under IX ∪ ¬(X \ IX) and IY ∪ ¬(Y \ IY) for some IY ⊆ Y. Hence, φ is true under such an interpretation. Since IX was arbitrary, Φ is valid. (⇐) Suppose Φ is valid. Then for any IX ⊆ X, φ is true under IX ∪ ¬(X \ IX) and IY ∪ ¬(Y \ IY) for some IY ⊆ Y. Then some Cj is true under this interpretation, and hence g holds. It is easy to see for any S ∈ BS(A2) that there is T ∈ BS(A1) such that S ⊆ T. By Theorem 8, A1 ≥ A2 holds. For the EDP case, we can apply a transformation of a QBF ∀X∃Y∀Z.φ into a disjunctive program, which is analogous to the one presented in [1, Theorem 3.1] and [2, Lemma 2]. ✷ Theorem 19 Let A1 = ⟨P1, Γ1⟩ and A2 = ⟨P2, Γ2⟩ be abductive programs. Deciding if A1 ⊒ A2 is (i) Π^P_2-complete when P1 and P2 are ELPs, and is (ii) Π^P_3-complete when P1 and P2 are GEDPs. Proof: Like Theorem 18, we can assume that each Pi is either an ELP or an EDP. For any guess S ⊆ Lit, deciding if S_E ∈ BS(A2) for some E ⊆ Γ2 is NP-complete for an ELP P2 (resp. Σ^P_2-complete for an EDP P2) [2]. For any such E, deciding if AS(P1 ∪ E) ≠ ∅ is NP-complete for an ELP P2 (resp. Σ^P_2-complete for an EDP P2) [1]. For S_E, deciding if there does not exist T ∈ AS(P1 ∪ E) such that S_E ⊆ T can be determined by credulous reasoning that contains S_E, which is NP-complete for an ELP P1 (resp. Σ^P_2-complete for an EDP P1) [2]. Hence, by Theorem 9, A1 ⋢ A2 can be decided nondeterministically with three calls to an NP-oracle (resp. a Σ^P_2-oracle). Therefore, the complement is in Π^P_2 (resp. Π^P_3). The hardness can be shown in the same way as in Theorem 18. ✷
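The hardness construction of Theorem 18 is purely syntactic and easy to mechanise. The sketch below only builds the two abductive programs from a ∀X∃Y DNF formula; the encoding of literals as (atom, polarity) pairs, of classical negation as a "-" prefix, and of rules as (head, body) pairs is an assumption of this sketch, and no validity checking is attempted.

```python
def qbf_reduction(X, Y, dnf):
    """Build A1 = <P1, Γ1> and A2 = <P2, Γ2> from Φ = ∀X∃Y.φ (φ in DNF), following
    the hardness proof of Theorem 18: P1 = {g ← Cj}, Γ1 = X ∪ ¬X ∪ Y ∪ ¬Y,
    P2 = {g ←}, Γ2 = X ∪ ¬X. An empty body denotes a fact."""
    lit = lambda a, pol: a if pol else "-" + a
    P1 = [("g", [lit(a, pol) for (a, pol) in conj]) for conj in dnf]   # g ← Cj
    gamma1 = list(X) + ["-" + x for x in X] + list(Y) + ["-" + y for y in Y]
    P2 = [("g", [])]                                                    # g ←
    gamma2 = list(X) + ["-" + x for x in X]
    return (P1, gamma1), (P2, gamma2)

# Φ = ∀x ∃y. (x ∧ y) ∨ (¬x ∧ ¬y) is valid, so Theorem 18 predicts A1 ≥ A2 here.
(P1, G1), (P2, G2) = qbf_reduction(["x"], ["y"],
                                   [[("x", True), ("y", True)],
                                    [("x", False), ("y", False)]])
print(P1)   # [('g', ['x', 'y']), ('g', ['-x', '-y'])]
print(G2)   # ['x', '-x']
```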
6
Discussion
The relation ≥ introduced in this paper can be represented by generality relations defined by Inoue and Sakama [9, 10]. We briefly sketch the relationships here. For first-order abductive theories A1 = (B1 , H1 ) and A2 = (B2 , H2 ), by identifying Ext(Ai) with the extensions of the prerequisite-free normal default theory (DHi , Bi ) for
i = 1, 2, we can prove that A1 ≥ A2 iff A1 |=dt A2 , where |=dt is a Hoare order defined on the class of default theories [10]. On the other hand, for abductive logic programs A1 = P1 , Γ1 and A2 = P2 , Γ2 , let Pi (i = 1, 2) be the GEDP defined by Pi = Pi ∪ { l; not l ← | l ∈ Γi }. Then, BS(Ai ) = AS(Pi ) holds [6]. With this result, we can see that A1 ≥ A2 iff P1 |=lp P2 , where |=lp is a Hoare order defined on the class of GEDPs (originally defined on the class of EDPs in [9]). Besides work on generality relations in ASP [9], a general correspondence framework has been proposed in [3, 15] to compare logic programs. This framework is defined to compare equivalence and inclusion between the semantics of logic programs instead of generality, but the notions of projection and contexts are also introduced to enable a variety of equivalence comparison. Incorporating these notions into our generality framework is a topic of future work.
REFERENCES [1] T. Eiter and G. Gottlob. On the computational cost of disjunctive logic programs: propositional case. Annals of Mathematics and Artificial Intelligence, 15:289–323, 1995. [2] T. Eiter, G. Gottlob and N. Leone. Abduction from logic programs: semantics and complexity. Theoretical Computer Science, 189:129– 177, 1997. [3] T. Eiter, H. Tompits and S. Woltran. On solution correspondences in answer-set programming. In: Proc. IJCAI-05, pp. 97–102, 2005. [4] M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Computing, 9:365–385, 1991. [5] G. Gottlob. Complexity results for nonmonotonic logics. J. Logic and Computation, 2:397–425, 1992. [6] K. Inoue and C. Sakama. Negation as failure in the head. J. Logic Programming 35, pp. 39–78, 1998. [7] K. Inoue and C. Sakama. Equivalence of logic programs under updates. In: Proc. 9th European Conference on Logics in Artificial Intelligence, LNAI 3229, pp. 174–186, Springer, 2004. [8] K. Inoue and C. Sakama. Equivalence in abductive logic. In: Proc. IJCAI-05, 2005, pp. 472–477. [9] K. Inoue and C. Sakama. Generality relations in answer set programming. In: Proc. 22nd International Conference on Logic Programming, LNCS 4079, pp. 211–225, Springer, 2006. [10] K. Inoue and C. Sakama. Generality and equivalence relations in default logic. In: Proc. 22nd Conference on Artificial Intelligence (AAAI07), pp. 434–439, 2007. [11] A. Kakas, R. Kowalski and F. Toni. The role of abduction in logic programming. In: D. Gabbay, C. Hogger and J. Robinson, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 5, pp. 235–324, Oxford University Press, 1998. [12] V. Lifschitz, D. Pearce and A. Valverde. Strongly equivalent logic programs. ACM Transactions on Computational Logic, 2:526–541, 2001. [13] T. Niblett. A study of generalization in logic programs. In: Proc. 3rd European Working Sessions on Learning, pp. 131–138, Pitman, 1988. [14] S.-H. Nienhuys-Cheng and R. De Wolf. Foundations of Inductive Logic Programming. LNAI 1228, Springer, 1997. [15] J. Oetsch, H. Tompits and S. Woltran. Facts do not cease to exist because they are ignored: relativised uniform equivalence with answer-set projection. In: Proc. 22nd Conference on Artificial Intelligence (AAAI07), pp. 458–464, 2007. [16] D. Pearce, H. Tompits and S. Woltran. Relativised equivalence in equilibrium logic and its applications to prediction and explanation: preliminary report. In: Proc. LPNMR’07 Workshop on Correspondence and Equivalence for Nonmonotonic Theories, pp. 37–48, 2007. [17] D. Poole. A logical framework for default reasoning. Artificial Intelligence, 36:27–47, 1988. [18] R. Reiter. A logic for default Reasoning. Artificial Intelligence, 13:81– 132, 1980. [19] R. Rosati. Model checking for nonmonotonic logics: algorithm and complexity. In: Proc. IJCAI-99, pp. 76–81, 1999. [20] S. Woltran. Characterizations for relativized notions of equivalence in answer set programming. In: Proc. 9th European Conference on Logics in Artificial Intelligence, LNAI 3229, pages 161–173, Springer, 2004.
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-40
Privacy-Preserving Query Answering in Logic-based Information Systems Bernardo Cuenca Grau and Ian Horrocks 1 Abstract. We study privacy guarantees for the owner of an information system who wants to share some of the information in the system with clients while keeping some other information secret. The privacy guarantees ensure that publishing the new information will not compromise the secret one. We present a framework for describing privacy guarantees that generalises existing probabilistic frameworks in relational databases. We also formulate different flavors of privacy-preserving query answering as novel, purely logic-based reasoning problems and establish general connections between these reasoning problems and the probabilistic privacy guarantees.
1
Motivation
Privacy protection is an important issue in modern information systems. The digitalization of data on the Web has dramatically increased the risks of private information being either accidentally or maliciously disclosed. These risks have been witnessed by numerous cases of personal data theft from systems that were believed to be secure. The design of information systems that provide provable privacy guarantees is, however, still an open problem—in fact, the notion of privacy is itself still open to many interpretations [2]. This paper addresses the problem of privacy-preserving query answering. In this setting it is assumed that the information itself is kept secret, but that the owner of the information wants to allow some query access to it while at the same time preventing private information from being revealed. For example, a hospital may want to allow researchers studying prescribing practices to query the patients’ records database for information about medicines dispensed in the hospital, but they want to ensure that no information is revealed about the medical conditions of individual patients. To make this more precise, the hospital wants to check whether answering specified legal queries could augment knowledge (from whatever source) that an attacker may have about the answer to a query for patient names and their medical conditions (the so-called sensitive query). Taking into account that an attacker may have previous knowledge about the system is of crucial importance, as such knowledge may connect the answers to legal and sensitive queries, and lead to the (partial) revelation of the latter. For example, allowing a query for drugs and the dates on which they were prescribed may seem harmless, but if the attacker knows the dates on which patients have been in hospital and drugs that are used to treat AIDS, then he may deduce that there must be an AIDS patient amongst the group known to be in hospital on a date when AIDS drugs were dispensed. This problem has been recently investigated in the context of relational databases (DBs) [9, 10, 6]. In these privacy frameworks, the knowledge and/or beliefs about the system of a potential attacker are 1
Oxford University Computing Laboratory, UK
modeled as a probability distribution over possible states of the information system. Privacy checking then amounts to verifying whether publishing new information, such as the answer to a legal query, could change the probability (from an attacker’s perspective) of any particular answer to the sensitive query. In the first part of this paper, we extend the probabilistic notions of privacy explored in the DB literature to cover a very general class of logic-based languages which includes, for example, ontology languages [12]. Furthermore, since these notions are too strict in practice, we propose ways to weaken them. In the second part, we formulate privacy-preserving query answering in terms of novel, purely logic-based reasoning problems. We show that our logic-based notions have natural probabilistic counterparts. Finally, we argue that these reasoning problems are related to existing ones; to illustrate this fact, we point out a connection with the notion of a conservative extension, an important concept in modular ontology design [8, 7]. Given the generality of our notion of an information system, we do not make claims concerning computational properties. Our results, however, provide an excellent formal base for studying such properties for particular languages.
2
Logic-based Information Systems
We adopt a general framework for describing logic-based information systems that captures any language whose formal semantics is based on First Order (FO) models; the framework is open toward different mechanisms for selecting admissible models and thus comprises a wide range of languages. We distinguish between intensional knowledge (background knowledge about the application domain) and extensional knowledge (data involving specific objects of the domain). This allows us to make the usual distinction in KR between schema knowledge and data. The framework here has been adapted from existing general frameworks in the literature [5, 1]. An Information System Formalism (ISF) is a tuple F = (Σ, LS , LD , Sem) where Σ is a countably infinite FO-signature, LS , LD are FO-languages over Σ, called the schema and dataset language respectively, and Sem is a specification of the semantics (of which more below). A schema S (respectively a dataset D ) is a set of LS -sentences (respectively a set of LD -sentences) over Σ. For example, in relational DBs, Σ is a set of relations and constants; LD only allows for ground atomic formulas, and LS is the language of FO Predicate Logic with equality. Datasets and schemas are called relational instances and relational schemas respectively. In the case of description logic (DL) ontologies, Σ contains unary relations, binary relations and constants; LS is a DL, such as SH I Q [12], and LD again only allows for ground atomic formulas over the predicates in Σ; Datasets are called ABoxes and schemas TBoxes.
The semantics is given by a pair Sem = (δ, ◦); δ is a function that assigns to each FO-interpretation I over Σ and each possible set S of LS-sentences (respectively LD-sentences D) a truth value δ(I, S) ∈ {true, false} (respectively δ(I, D) ∈ {true, false}); ◦ is a binary operation on sets of interpretations, such that for each pair of sets M1, M2, ◦ returns a set of interpretations M3 = M1 ◦ M2. An information system (IS) in F is a pair ℑ = (S, D), with S an LS-schema, and D an LD-dataset. The set of models of ℑ is Mod(ℑ) = Mod(S) ◦ Mod(D), with Mod(S) = {I | δ(I, S) = true} and Mod(D) = {I | δ(I, D) = true}. ℑ is satisfiable if Mod(ℑ) ≠ ∅. For example, in both ontologies and relational DBs, schemas are interpreted in the usual way in FOL: δ(I, S) = true iff I |=FOL S. In SHIQ ontologies, datasets are also interpreted in the usual way: δ(I, D) = true iff I |=FOL D, and ◦ is the intersection between the schema and the dataset models. In relational DBs, however, the data usually has a single model—that is, δ(I, D) = true iff I = ID, where ID is the minimal Herbrand model of D; the operation ◦ is also defined differently: I1 ◦ I2 ∈ Mod(ℑ) iff I2 = ID and ID |=FOL S. We are also very permissive w.r.t. query languages. A query language for F is an FO-language LQ over Σ. A boolean query Q is an LQ-sentence. The semantics is given by a function δLQ that assigns to each interpretation I and boolean query Q a truth value δLQ(I, Q) ∈ {true, false}. A system ℑ entails Q, written ℑ |=F Q, if, for each I ∈ Mod(ℑ), δLQ(I, Q) = true. A general query Q is an LQ-formula, where x is the vector of free variables in Q. Let σ[x/o] be a function that, when applied to a general query Q, yields a new boolean query σ[x/o](Q) by replacing in Q the variables in x by the constants in o. The answer set for Q in ℑ is the following set of tuples of constants: ans(Q, ℑ) = {o | ℑ |=F σ[x/o](Q)}. An example of a query language could be the language of conjunctive queries in both DBs and ontologies. Given a query language LQ, a view over ℑ is a pair V = (V, v), with V—the definition of the view—an LQ-query, and v—the extension of the view—a finite set of tuples of constants, such that v = ans(V, ℑ).
Table 1. Conditions on Information Systems
Condition [S↑]: Syst([S↑]) = {ℑ = (S′, D) | ℑ ∈ IS and S ⊆ S′}
Condition [S∗]: Syst([S∗]) = {ℑ = (S, D) | ℑ ∈ IS}
Condition [V]: Syst([V]) = {ℑ ∈ IS | each V ∈ V is a view over ℑ}
Condition [Q = q]: Syst([Q = q]) = {ℑ ∈ IS | ans(Q, ℑ) = q}
Given F = (Σ, LS , LD , Sem), we denote by IS, D the set of all satisfiable systems and datasets respectively in F , and by Tup the set of all tuples of constants over Σ. We also consider systems in IS that satisfy certain conditions; the conditions we consider are given in Table 1. Given a schema S , the first and second rows in the table represent respectively the set of ISs whose schemas extend S and are equal to S ; given a set of views V, the third row represents the set of ISs over which every V ∈ V is a view; finally, given a query Q and an answer set q, the last row represents the ISs for which q is the answer to Q. We denote with [C1 , . . . ,Cn ] the conjunction of conditions [C1 ], . . . , [Cn ], and with Syst([C1 , . . . ,Cn ]) the subsets of IS that satisfy all of [C1 ], . . . , [Cn ].
3
The Privacy Problems
Given F = (Σ, LS , LD , Sem) and a query language LQ , our goal is to study privacy guarantees for Bob —the owner of a system ℑ = (S , D ) in IS— against the actions of Alice— a potential attacker. Existing privacy frameworks for DBs[9, 10, 6] assume that the actual data D is kept hidden. The data to be protected is defined by
a query Q, called the sensitive query, whose definition is known by Alice. As an external user, Alice can only access the system through a query interface which allows her to ask certain "legal" queries; these legal queries, together with their answers, are represented as a set V of views over ℑ. Bob wants to extend the set of legal queries, i.e., to publish new views. The problem of interest is the following: The publishing problem: Given ℑ = (S, D), an initial set of views V and a final set of views W over ℑ with V ⊆ W, verify that no additional information about the answers to Q is disclosed.2
Table 2. Example Hidden Dataset
R(x, y): (dis1, drug1), (dis2, drug1), (dis3, drug2), (dis4, drug2)
S(z, y): (pat1, drug1), (pat2, drug1), (pat3, drug2), (pat4, drug2)
T(z, w, x): (pat1, male, dis1), (pat2, male, dis2), (pat3, fem, dis3), (pat4, male, dis4)
F(z, t): (pat1, flo1), (pat2, flo2), (pat3, flo3), (pat4, flo2)
Example 1 The IS of a hospital, modeled in FO-logic, contains data about the following predicates: R(x, y), which relates diseases to drugs, S(z, y), which relates patients to their prescribed drugs, T(z, w, x), which relates patients, their gender, and their diagnosed disease, and F(z,t) which specifies the floor of the hospital where each patient is located. Their extension in the hidden dataset D is given in Table 2. The schema S is public and contains FO-sentences such as ∀x, y : [R(x, y) ⇒ Disease(x)∧Drug(y)], which ensures that R only relates diseases to drugs, and sentences like ∀x : [Disease(x) ⇒ ¬Drug(x))], which ensures disjointness between drugs, diseases, patients, genders and floors. S also models other common-sense knowledge, e.g. that the gender of a patient is unique. Bob does not want to reveal any information about which patients suffer from dis1, i.e., the answer to the query Q(z) = ∃w : [T(z, w, dis1)] should be secret; however, Bob also wants to publish views V1 = (V1 , v1 ), and V2 = (V2 , v2 ) with V1 (x, y) ← F(z,t) and V2 (z, w) ← ∃x : [T(z, w, x)], and where v1 , v2 are their respective extensions w.r.t. D . Publishing these views could lead to a privacy breach w.r.t. Q. For example, if S contains a sentence α stating that all the patients in f lo1 suffer from dis1 then, by publishing V1 , Alice could deduce that pat1 suffers from dis1 and thus belongs to the answer to Q1 , which clearly causes a privacy breach. Even if the identity of patients suffering from dis1 is not revealed, the views could still provide useful information to Alice. Suppose that S contains β stating that dis1 is a kind of disease that only affects men; then by publishing V2 Alice could infer that pat3, a woman, cannot be in the answer to Q1 , which would permit Alice to discard possible answers. Such privacy breaches are datasetdependent: if all patients in D were male and none of them is on the first floor, then publishing V1 and V2 would be harmless. 3 Existing DB frameworks assume that the schema is static and fully known by Alice, which are not always reasonable assumptions. For inferential systems like ontologies [12], where the schema participates in query answering by allowing the deduction of new data, Bob may prefer to hide a part of the schema. In fact, some widely used ontologies, such as SNOMED-CT—a component of the Care Record Service in the British Health System—are not fully available. Furthermore, the schema may undergo continuous modifications; indeed many ontologies are updated on a daily basis. To overcome these limitations, we propose to formalise and study the following problems: The generalised publishing problem: New views or schema axioms are published, but the IS ℑ = (S , D ) remains static. Given an initial public schema S1 and a final public schema S2 with S1 ⊆ S2 ⊆ S , 2
Note that this generalises the "standard" case where V = ∅.
initial views V and final views W with V ⊆ W, Bob wants to verify that no additional information about the answers to Q is disclosed. The system evolution problem: The IS ℑ = (S , D ) evolves to ℑ = (S , D ). Bob wants to ensure that, if it was possible to safely publish certain information before the change, then the same information can be safely published after the change. DB frameworks are probabilistic and apply to the publishing problem [10, 6, 11]. In the next section, we generalise them. Our presentation differs from [10, 6, 11] in two aspects: we consider arbitrary ISFs instead of relational DBs; and we consider the generalised publishing problem: instead of assuming that the schema is fixed and known, we allow for partially secret schemas. We show that known results for DBs can be naturally lifted to our more general setting.
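For concreteness, the hidden dataset of Table 2, the sensitive query of Example 1, and the two views can be phrased directly over the relations. The relation names and constants below transcribe Table 2; the query and view encodings as Python comprehensions are an illustrative choice of this sketch, not part of the framework.

```python
# Table 2 as plain relations (sets of tuples).
R = {("dis1", "drug1"), ("dis2", "drug1"), ("dis3", "drug2"), ("dis4", "drug2")}
S_rel = {("pat1", "drug1"), ("pat2", "drug1"), ("pat3", "drug2"), ("pat4", "drug2")}  # S(z, y)
T = {("pat1", "male", "dis1"), ("pat2", "male", "dis2"),
     ("pat3", "fem", "dis3"), ("pat4", "male", "dis4")}
F = {("pat1", "flo1"), ("pat2", "flo2"), ("pat3", "flo3"), ("pat4", "flo2")}

# Sensitive query Q(z) = ∃w : T(z, w, dis1) -- which patients are diagnosed with dis1?
sensitive_answer = {z for (z, w, x) in T if x == "dis1"}
print(sensitive_answer)                       # {'pat1'}

# Legal views: V1 over the floor relation and V2(z, w) <- ∃x : T(z, w, x).
view_v1 = set(F)                              # patients and their floors
view_v2 = {(z, w) for (z, w, x) in T}         # patients and their genders
print(sorted(view_v2))
```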
4
Probabilistic Frameworks
The framework by Miklau & Suciu [10] is based on Shannon's information-theoretic notion of perfect secrecy. As mentioned before, we present the framework in a more general form. Alice's (additional) knowledge about the IS being attacked is given as a distribution P : IS → [0, 1] over all possible ISs. Given P, the probability that an IS satisfies a condition [C] in Table 1 is as follows: P([C]) = ∑_{ℑ∈Syst([C])} P(ℑ). Given [C1], [C2], P([C1] | [C2]) represents the probability, according to Alice's knowledge, that an IS satisfies [C1] given that it satisfies [C2]; this can be computed using the Bayes formula: P([C1] | [C2]) = P([C1, C2]) / P([C2]). Let ℑ = (S, D) be the system to be protected. Alice initially knows part of the schema S1 ⊆ S and views V over ℑ. After publication, she observes the new schema S2 with S1 ⊆ S2 and views W = V ∪ U; she is also aware that the real schema S extends both S1 and S2. The a-priori and a-posteriori probabilities, according to Alice's knowledge, that q is the answer to Q are respectively given as follows:3
P([Q = q] | [S1↑, V])   (a-priori)   (1)
P([Q = q] | [S2↑, W])   (a-posteriori)   (2)
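Over a finite toy universe of candidate information systems, the conditional probabilities (1) and (2) reduce to the Bayes computation above. The sketch below is illustrative only: candidate systems are plain records, conditions are Python predicates, and the attacker distribution is an arbitrary dictionary; none of these choices come from the paper.

```python
def prob(dist, condition):
    """P([C]) = sum of P(ℑ) over the systems satisfying the condition."""
    return sum(p for system, p in dist.items() if condition(system))

def conditional(dist, c1, c2):
    """P([C1] | [C2]) = P([C1, C2]) / P([C2]) (undefined if P([C2]) = 0)."""
    denom = prob(dist, c2)
    if denom == 0:
        raise ValueError("conditional probability undefined")
    return prob(dist, lambda s: c1(s) and c2(s)) / denom

# Toy universe: each candidate system is just the hidden answer to Q plus a view value.
# dist encodes Alice's beliefs P : IS -> [0, 1].
dist = {("ans_q1", "view_a"): 0.25, ("ans_q1", "view_b"): 0.25,
        ("ans_q2", "view_a"): 0.25, ("ans_q2", "view_b"): 0.25}

answer_is_q1 = lambda s: s[0] == "ans_q1"            # condition [Q = q1]
initial_info = lambda s: True                        # nothing published yet
published_view = lambda s: s[1] == "view_a"          # condition after publishing a view

a_priori = conditional(dist, answer_is_q1, initial_info)         # 0.5
a_posteriori = conditional(dist, answer_is_q1, published_view)   # still 0.5 for this P
print(a_priori, a_posteriori)
```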
The privacy condition under consideration is called perfect privacy: intuitively, Alice should not learn anything about the possible outcomes of Q, whatever her additional knowledge or beliefs (i.e., for any P). Note that the condition is trivially satisfied if S1 and V already reveal the answer to Q, i.e., if each ℑ ∈ Syst([S1 ↑, V]) yields the same outcome to Q; in this case we say that Q is trivial. Example 2 Suppose that in Example 1, the schema S with β ∈ S is known, and V2 —the relation between patients and their genders— is published. Suppose that Alice has only vague knowledge about the IS and considers all datasets consistent with S equally likely. Consider an answer set q containing pat3. Before publishing the view, the probability (1) is non-zero for q, whereas, after publishing V2 , (2) is zero. Intuitively, Alice’s knowledge about Q has increased. 3 Definition 1 (Perfect Privacy). Perfect privacy holds if, for each P : IS → [0, 1] and q ∈ Tup with (1) well-defined, (2) equals (1).
In Example 1, Alice may believe that the answer to Q is q1 = {pat1} with P(q1) = 2/3, q2 = {pat1, pat2} with P(q2) = 1/6 and q3 = {pat1, pat3} with P(q3) = 1/6. Note the difference with [10], where Alice had prior knowledge about the possible ISs themselves. The distribution P induces possible compatible distributions P′ : IS → [0, 1] over ISs as follows: P′ is compatible with P, written P′ ∈ Comp(P), if, for each q, the sum of the probabilities of the ISs for which ans(Q, ℑ) = q is precisely P(q) (i.e., ∑_{ℑ∈Syst([Q=q])} P′(ℑ) = P(q)). Alice's a-priori and a-posteriori knowledge is given respectively by (1) and (2) over P′, and the privacy condition is the following: Definition 2 (Safety). Safety holds if, for each P : Tup → [0, 1], P′ ∈ Comp(P), and q ∈ Tup with (1) well-defined, (2) equals (1). Triviality of Perfect Privacy and Safety: In the relational DB literature, it has been observed that, on the one hand, safety and perfect privacy are closely related [6] and that, on the other hand, they are too strict in practice: revealing any new information, even if apparently irrelevant to Q, causes perfect privacy and safety not to hold; intuitively, this is because the attacker's beliefs can establish a (possibly spurious) connection between any revealed information and the answer to the sensitive query. We show that these results can be naturally lifted to the generalised publishing problem for arbitrary ISFs as follows: Theorem 1 For given ℑ, Q, S1, S2, and V, W: (i) Safety ⇔ Perfect Privacy, and (ii) Perfect Privacy ⇔ Syst([S1↑, V]) ⊆ Syst([S2↑, W]). Relaxing Perfect Privacy and Safety: A number of recent papers have tried to weaken these notions. Miklau and Suciu [10] proposed to place constraints on P and consider only product distributions; this amounts to assuming that the tuples in the DB are independent. This assumption, however, is not reasonable if the schema is nontrivial: schema constraints can impose arbitrary correlations between tuples. Other proposals, e.g. [3], involve making (1) only approximately equal to (2). In this paper, we propose two novel notions, quasi-safety and quasi-privacy, that significantly relax Definitions 1 and 2 respectively; we show later on that both notions are equivalent and have a nice logical counterpart in terms of purely logic-based reasoning problems. Consider the notion of safety. Given P : Tup → [0, 1], Definition 2 requires (1) and (2) to coincide for all its compatible distributions. Definition 2 can be relaxed by requiring, for each P, only the existence of a compatible distribution P′ for which (1) and (2) coincide. Moreover, such a distribution must be "reasonable" given the public information S1, V: that is, if P assigns non-zero probability to q1, then P′ cannot assign zero probability to all ISs that satisfy [S1, V] and yield q1. Formally, we say that P′ ∈ Comp(P) is admissible for S1, V if, for each q such that P(q) ≠ 0, there is an IS ℑ ∈ Syst([S1, V, Q = q]) such that P′(ℑ) ≠ 0. Definition 3 (Quasi-Safety). Quasi-safety holds if, for each P : Tup → [0, 1] there is an admissible P′ ∈ Comp(P) s.t., for each q ∈ Tup for which (1) is well-defined, (2) equals (1).
The framework by Deutsch and Papakonstantinou [6, 11] models Alice’s knowledge or beliefs as a distribution P : Tup → [0, 1] over the possible outcomes of the sensitive query. Here, we present the framework in a more general form.
That is, whatever Alice’s knowledge or beliefs about the answers to Q, there is always a compatible opinion about the hidden IS that is “reasonable” given the public information and that would not cause her to revise her beliefs after the new information is published. A similar principle can be used for weakening perfect privacy:
These probabilities are well-defined if P([S1 ↑, V]) and P([S2 ↑, W]) are non-zero; that is, if there is an IS with non-zero probability that is compatible with the available information.
Definition 4 (Quasi-Privacy). Quasi-privacy holds if, for each P : IS → [0, 1], there is a P′ : IS → [0, 1] s.t., for each q ∈ Tup for which (1) is well-defined over P, (2) over P′ equals (1) over P.
That is, whatever Alice’s initial beliefs about the hidden IS, she can always revise them such that her opinion about the answers to Q does not change when the new information is published.
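To make the notions of compatibility and admissibility concrete, the following brute-force sketch (not taken from the paper; the toy space of ISs, the flag names and the probabilities are illustrative assumptions) checks whether a distribution over ISs is compatible with a distribution over query outcomes and admissible w.r.t. the published information.

```python
# Toy space of information systems: each IS records the outcome it yields
# for the sensitive query Q and whether it satisfies the published [S1, V].
# All names and numbers are illustrative, not taken from the paper.
ISS = {
    "I1": {"outcome": frozenset({"pat1"}), "satisfies_S1_V": True},
    "I2": {"outcome": frozenset({"pat1", "pat2"}), "satisfies_S1_V": True},
    "I3": {"outcome": frozenset({"pat1", "pat3"}), "satisfies_S1_V": False},
}

P_outcomes = {  # Alice's beliefs over the outcomes of Q
    frozenset({"pat1"}): 2 / 3,
    frozenset({"pat1", "pat2"}): 1 / 6,
    frozenset({"pat1", "pat3"}): 1 / 6,
}

def is_compatible(P_prime):
    """P' over ISs is compatible with P: for each outcome q, the total
    probability of the ISs yielding q equals P(q)."""
    for q, p in P_outcomes.items():
        mass = sum(P_prime[i] for i, d in ISS.items() if d["outcome"] == q)
        if abs(mass - p) > 1e-9:
            return False
    return True

def is_admissible(P_prime):
    """For each q with P(q) != 0 there must be an IS satisfying [S1, V]
    that yields q and has non-zero probability under P'."""
    for q, p in P_outcomes.items():
        if p == 0:
            continue
        if not any(P_prime[i] > 0 and d["satisfies_S1_V"] and d["outcome"] == q
                   for i, d in ISS.items()):
            return False
    return True

P_prime = {"I1": 2 / 3, "I2": 1 / 6, "I3": 1 / 6}
print(is_compatible(P_prime), is_admissible(P_prime))  # True False (I3 violates [S1, V])
```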
5 A Logic-based Framework
In this section, we formalise privacy from a purely logic-based perspective as a guarantee that the published information will not “change the meaning” of the sensitive query. We propose a collection of privacy conditions that model this notion of meaning change, and consider both the publishing and the evolution problems.
5.1 The Generalised Publishing Problem
The most basic information about Q is obviously its answer. The most dangerous privacy breach occurs when publishing new information reveals part of such answer. In Example 1, before publishing any views, Alice cannot deduce the name of any patient suffering from dis1; after publication of V1 , Alice learns that pat1 does have dis1 and therefore belongs to the answer of Q. We will then say that the set of certain answers to Q has changed. Furthermore, as seen in Example 1, a privacy breach could also occur if Alice can discard possible answers and therefore formulate a “better guess”, even if part of the actual answer has not been disclosed. Initially, all possible sets of patients (e.g. q3 = {pat2, pat3}) are possible. Upon publication of V2 , all answers including pat3 (e.g. q3 = {pat2, pat3}) become impossible. We will then say that the set of possible outcomes of Q has changed. Possible outcomes and certain answers: Given Q and a condition [C] (see Table 1), the possible outcomes of Q given [C] are as follows: out([C]) = {q ∈ Tup | ∃ℑ ∈ Syst([Q = q,C])}
(3)
The set of certain answers of Q given [C] is defined as the common subset of all the possible outcomes: cert([C]) = ⋂ out([C]). As argued before, a privacy condition should at least guarantee that the set of certain answers given the initial schema and views stays the same after publishing the new information:4
cert([S1 ↑, V]) = cert([S2 ↑, W])   (4)
A stronger privacy condition can be obtained if we require the set of possible outcomes not to change as follows:
out([S1 ↑, V]) = out([S2 ↑, W])   (5)
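Conditions (4) and (5) can be illustrated with a small brute-force computation over an explicitly enumerated set of candidate ISs; the sketch below is illustrative only, and the finite enumeration and field names are assumptions made to keep the example small.

```python
# Each candidate information system is represented by the answer set it
# yields for Q; a "condition" is a flag saying whether the IS remains
# consistent with the published schema and views.  Illustrative only.
CANDIDATE_ISS = [
    {"answer": frozenset({"pat1"}), "consistent_before": True, "consistent_after": True},
    {"answer": frozenset({"pat1", "pat2"}), "consistent_before": True, "consistent_after": True},
    {"answer": frozenset({"pat1", "pat3"}), "consistent_before": True, "consistent_after": False},
]

def out(key):
    """Possible outcomes of Q under the given condition (Equation (3))."""
    return {s["answer"] for s in CANDIDATE_ISS if s[key]}

def cert(key):
    """Certain answers: the intersection of all possible outcomes."""
    outcomes = out(key)
    return frozenset.intersection(*outcomes) if outcomes else frozenset()

# Condition (4): certain answers unchanged; Condition (5): possible outcomes unchanged.
print("condition (4):", cert("consistent_before") == cert("consistent_after"))  # True
print("condition (5):", out("consistent_before") == out("consistent_after"))    # False
```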
It is ultimately up to the data owner to decide which condition is most appropriate for his application needs. Monotonicity for answer sets: Sometimes in this section we will focus only on ISFs and query languages that have a monotonic behavior with respect to answer sets—that is, if new schema axioms and/or views are published, the set of possible answers to a query Q can only decrease. In the limit, if the whole system is published, then only one answer remains possible, namely the “real” answer for Q against the IS . This property can be formalized as follows:
S1 ⊆ S2 and V ⊆ W ⇒ out([S2∗ , W]) ⊆ out([S1∗ , V])   (6)
Many languages currently used in practice, such as relational DBs and DL ontologies satisfy this property. Checking Condition (5) in ISFs that satisfy Property (6) just requires to consider the initial and final schemas, instead of all their super-sets. 4
It can be easily seen that Condition (5) implies Condition (4).
Proposition 1 If F satisfies Property (6), then Condition (5) holds iff out([S1∗ , V]) ⊆ out([S2∗ , W]), In what follows, if a result depends on Property (6), it will be explicitly stated; otherwise, we assume general ISFs and queries. Bridges between probability and logic: At this stage, we can establish a first general bridge between our logic-based conditions and the probabilistic ones. In particular, it turns out that Condition (5) is equivalent to both quasi-privacy and quasi-safety: Theorem 2 Quasi-safety ⇔ Quasi-privacy ⇔ Condition (5). Note that Theorem 2, on the one hand, implies that quasi-safety and quasi-privacy are indeed equivalent notions; on the other hand, it provides a natural logical interpretation to our probabilistic weakening of safety and perfect privacy. Breaches in logic privacy: Condition (5) may still lead to potential security breaches if new schema axioms are published, as shown by the following example: Example 3 Suppose LS is FO predicate logic, LD only allows for ground atomic formulas, and LQ is the language of conjunctive queries. Let A, B be unary predicates and R a binary predicate; consider a Σ with two constants: a, b. The sensitive query is A(x). Suppose that Bob publishes V1 with definition B(x) and extension {a, b}. Initially, S1 = 0/ and hence all outcomes Tup = {{}, {a}, {b}, {a, b}} are possible. Suppose that Bob publishes S2 = {∀x : [A(x) ↔ ∃y : [R(x, y) ∧ B(y)]]}. Upon publication of S2 , no possible outcome is ruled out, but S2 has introduced a correlation between V1 and Q. These correlations could potentially lead to a security breach. 3 Indeed, even if Alice cannot discard any possible outcome of Q, Bob may want to prevent the new information from establishing potentially dangerous correlations; to this end, we introduce a stronger notion of logic-based privacy. Strengthening logic privacy: We propose an additional condition in case new schema axioms are published. Our condition is only defined for ISs satisfying Property (6) and it ensures that for each possible dataset D , Alice obtains the same answer for Q independently of whether she considers the initial schema S1 or the final one S2 . That is, for each ℑ = (S2 , D ) ∈ Syst([S2∗ , W]), the following should hold: ans(Q, ℑ) = ans(Q, ℑ )
(7)
where ℑ = (S1 , D ). If we enforce this condition in the example above, we would have that publishing S2 yields a privacy breach. Indeed, consider D = {R(a, b), B(a), B(b)}; we have ans(Q, S1 = {}) = {}, whereas ans(Q, S2 ) = {a}. These intuitions motivate the following notion of privacy for ISFs satisfying Property (6): Definition 5 (Strong Logic-based Privacy). Given Q, S1 , S2 , V, W, strong logic-based privacy holds if Conditions (5) and (7) hold. The above establishes a middle ground between too strict privacy notions (Definitions 1, 2) and rather permissive ones (Definitions 3, 4). Definition 5 implies that a privacy breach may only occur if the new information correlates the public one to the answers of Q; that is, publishing information that is completely unrelated to Q will not break privacy. Note, however, that if S1 = S2 , then Definition 5 reduces to Condition (5) since Condition (7) trivially holds. A connection with conservative extensions: Definition 5 is close to conservative extensions, a well-established notion in mathematical logic, and an important concept in ontology design and reuse [8, 4, 7].
Conservative extensions have been recently proposed as the basic notion for defining modules in ontologies—independent parts of a given theory— and safe refinements—extensions of a theory that do not affect certain aspects of the meaning of the original theory. In the context of privacy-preserving query answering, the notion of a query conservative extension [7] for monotonic ISFs is of special relevance: Definition 6 (Query Conservative Extension). 5 Given S1 ⊆ S2 , sets Q, D of queries and datasets respectively, S2 is a query conservative extension of S1 w.r.t. Q, D if, for each Q ∈ Q and D ∈ D, we have that ans(Q, ℑ = (S2 , D )) = ans(Q, ℑ = (S1 , D )). In order to establish a connection between Definitions 5 and 6, let us introduce the following notation. Given [C], we denote the set of datasets that an IS that satisfies [C] can have as follows: Data([C]) = {D ∈ D | ∃ℑ ∈ Syst([C]), ℑ has dataset D }. If D = Data([S2∗ , W]), then Definition 6 corresponds precisely to Condition (7). If V = W, and D = Data([S1∗ , V]), then Definition 6 is a sufficient condition for strong logic-based privacy.
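Under the assumption that the sets Q of queries and D of datasets are finite and that an answering function is available, Definition 6 can be checked by brute force; the following sketch is illustrative and uses a toy forward-chaining answerer rather than the authors' setting.

```python
def is_query_conservative_extension(ans, S1, S2, queries, datasets):
    """S2 (a superset of S1) is a query conservative extension of S1
    w.r.t. the given queries and datasets iff every query gets the same
    answer over (S2, D) and (S1, D) for every dataset D (Definition 6)."""
    return all(ans(Q, S2, D) == ans(Q, S1, D)
               for Q in queries for D in datasets)

# Tiny propositional illustration: schemas are sets of implications (a, b)
# meaning "a implies b"; datasets are sets of facts; a query is a single
# atom answered by forward chaining.  Illustrative only.
def ans(query, schema, dataset):
    facts = set(dataset)
    changed = True
    while changed:
        changed = False
        for a, b in schema:
            if a in facts and b not in facts:
                facts.add(b)
                changed = True
    return query in facts

S1 = set()
S2 = {("HasR", "A")}           # a new axiom correlating HasR with the query atom A
queries = ["A"]
datasets = [frozenset(), frozenset({"HasR"})]
print(is_query_conservative_extension(ans, S1, S2, queries, datasets))  # False
```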
5.2 The System Evolution Problem
Suppose that the privacy of ℑ = (S , D ) w.r.t. a query Q and a set V of published views has been tested and the system evolves to ℑ = (S , D ). We want to ensure that ℑ behaves in the same way as ℑ w.r.t. the secrecy of Q given V. Such notion of robustness under changes can be characterized as follows. Let ℑ = (S , D ), ℑ = (S , D ) be ISs, and let Q be a sensitive query. Consider a notion of security characterized by a predicate Privacy(ℑ, Q, V), e.g. (strong) logic-based privacy, which is evaluated to true if, given the IS ℑ = (S , D ), with S being public, Q is secure for the publication of V. Definition 7 (Secure Evolution). The evolution of ℑ = (S , D ) to ℑ = (S , D ) is secure w.r.t. Q and V if Privacy(ℑ, Q, V) implies Privacy(ℑ , Q, V ) with V being the views over ℑ with the same view definitions as V. We distinguish two situations: (i) the data changes during the evolution of the system, but the schema remains constant, and (ii) the data remains constant, but the schema changes. Varying the data: We first formulate the notion of data independence, which ensures robust evolution w.r.t. changes in the data. Definition 8 (Data Independence). A notion of privacy is dataindependent w.r.t. S , Q and V if, for each ℑ, ℑ ∈ Syst([S ∗ ]) the evolution of ℑ to ℑ is secure w.r.t. Q, V. It is not hard to see that, given any non-trivial Q and any S , Perfect privacy and safety are data-independent w.r.t. S , Q. In contrast, the notion of privacy derived from Condition 5 is not data-independent for all S . Consider Example 1 and suppose that the schema S contains the sentence β and that the dataset D only contains male patients. In this case, Condition (5) holds since no possible outcome of Q can be ruled out when publishing V2 ; however, if D evolves to D containing a female patient, then the condition is violated. As a consequence, strong logic-based privacy is not data-indepedent and, given Theorem 2, nor are quasi-privacy and quasi-safety. Data independence for any schema is, indeed, a strict requirement. For ISFs satisfying Property (6), certain schemas and certain views, it is possible to obtain data-independence results: 5
In [7], D and Q are the sets of all datasets and all queries respectively over a given signature.
Proposition 2 Let S be a query conservative extension of S = {} w.r.t. Q = {Q} and D = D; let V, V be s.t. out([V]) = out([V ]). Then (strong) logic-based privacy is data-independent w.r.t. S , Q. Proposition 2 guarantees that data independence is obtained for schemas and views that are uncorrelated with the sensitive query. Varying the schema: we now assume that the data remains constant and the schema changes. Suppose that, in Example 1, the initial schema S does not contain β; let S = S ∪ {β} and let the dataset D contain a female patient. Publishing the names and gender of the patients (view V2 ) does not cause a privacy breach since S does not introduce any correlation between diseases and the gender of patients; however, when ℑ = (S , D ) evolves to ℑ = (S , D ) then such correlation does exist and the publication of V2 is no longer safe. Note that, given Q, D , we have that S is not a query conservative extension of S . This observation suggests the following sufficient condition for secure evolution of ISFs satisfying Property (6): Proposition 3 Let S is a query conservative extension of S w.r.t Q = {Q} and D = Data([S ∗ ]); let out([S∗ , V]) = out([S∗ , V ]). Then, the evolution of ℑ = (S , D ) to ℑ = (S , D ) is secure w.r.t. Q, V for both privacy as in Condition (5) and strong logic-based privacy. Propositions 2 and 3 establish a bridge between the notions of conservative extension and secure evolution and show that the former can be used to provide sufficient conditions for the latter.
6 Conclusion
In this paper, we have generalised existing results for privacy in databases, and proposed novel privacy conditions. We have proposed a novel logic-based approach and established bridges with existing information-theoretic approaches. Our results provide a deeper fundamental understanding of privacy-preserving query answering and can be used as a starting point for studying the decidability and complexity of the different privacy guarantees for particular languages.
REFERENCES [1] F. Baader, C. Lutz, H. Sturm, and F. Wolter, ‘Fusions of Description Logics and Abstract Description Systems’, JAIR, 16, 1–58, (2002). [2] E. Bertino, S. Jajodia, and P. Samarati, ‘Database security: Research and practice’, Inf. Syst., 20(7), 537–556, (1995). [3] A. Blum, C. Dwork, F. McSherry, and K. Nissim, ‘Practical privacy: the sulq framework’, in PODS, pp. 128–138. ACM, (2005). [4] B. Cuenca Grau, I. Horrocks, Y. Kazakov, and U. Sattler, ‘A logical framework for modularity of ontologies’, in IJCAI-07, pp. 298–304. AAAI, (2007). [5] G. De Giacomo E. Franconi I. Horrocks A. Kaplunova D. Lembo M. Lenzerini C. Lutz D. Martinenghi R. Moeller R. Rosati S. Tessaris A.Y. Turhan D. Calvanese, B. Cuenca Grau. Common framework for representing ontologies. TONES Project Deliverable, 2007. [6] A. Deutsch and Y. Papakonstantinou, ‘Privacy in database publishing’, in ICDT-2005, volume 3363 of LNCS, pp. 230–245. Springer, (2005). [7] R. Kontchakov, F. Wolter, and M. Zakharyaschev, ‘Modularity in dl lite’, in DL-2007. [8] C. Lutz, D. Walther, and F. Wolter, ‘Conservative extensions in expressive description logics’, in IJCAI-07, pp. 453–459. AAAI, (2007). [9] A. Machanavajjhala and J. Gehrke, ‘On the efficiency of checking perfect privacy’, in PODS-2006, pp. 163–172. ACM, (2006). [10] G. Miklau and D. Suciu, ‘A formal analysis of information disclosure in data exchange’, J. Comput. Syst. Sci., 73(3), 507–534, (2007). [11] A. Nash and A. Deutsch, ‘Privacy in GLAV information integration’, in ICDT, pp. 89–103, (2007). [12] P.F. Patel-Schneider, P. Hayes, and I. Horrocks. Web ontology language OWL Abstract Syntax and Semantics. W3C Recommendation, 2004. [13] L. Sweeney, ‘K-anoniminity: a model for protecting privacy’, Int. J. on Uncertainty, Fuzziness and Knowledge-based Systems., 10(5), (2002).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-45
Optimizing Causal Link Based Web Service Composition
Freddy Lécué1,2, Alexandre Delteil2 and Alain Léger2
Abstract. Automation of Web service composition is one of the most interesting challenges facing the Semantic Web today. Since Web services have been enhanced with formal semantic descriptions, it becomes conceivable to exploit causal links, i.e., semantic matching between their functional parameters (i.e., outputs and inputs). The semantic quality of the causal links involved in a composition can then be used as an innovative and distinguishing criterion to estimate its overall semantic quality. Therefore, non-functional criteria such as quality of service (QoS) are no longer considered as the only criteria to rank compositions satisfying the same goal. In this paper we focus on the semantic quality of causal link based semantic Web service composition. First of all, we present a general and extensible model to evaluate the quality of both elementary causal links and compositions of causal links. From this, we introduce a global causal link selection based approach to retrieve the optimal composition. This problem is formulated as an optimization problem which is solved using efficient integer linear programming methods. The preliminary evaluation results showed that our global selection based approach is not only more suitable than the local approach but also outperforms the naive approach.
1 Introduction
The semantic web [6] is considered to be the future of the current web. Web services in the semantic web are enhanced using rich description languages such as the Web Ontology Language (OWL) [19]. Formally the latter semantic descriptions are expressed by means of Description Logics concepts [4] in ontologies. An ontology is defined as a formal conceptualization of a domain we require to describe the semantics of services e.g., their functional input, output parameters. Intelligent software agents can, then, use these descriptions to reason about web services and automate their use to accomplish intelligent tasks e.g., selection, discovery, composition. In this work we focus on web service composition and more specifically on its functional level (aka causal link composition). Starting from an initial set of web services, such a level of composition aims at selecting and inter-connecting web services by means of their (semantic) causal links according to a goal to achieve. The functional criterion of causal link, first introduced in [14], is defined as a semantic connection between an output of a service and an input parameter of another service. Since the quality of the latter links are valued by a semantic matching between their parameters, causal link compositions could be estimated and ranked as well. From their estimation results, some compositions can be considered as unsuitable in case of under specified causal links. Indeed a composite service that does not provide acceptable quality of causal links might be as useless as a service not providing the desired functionality. Unlike most of approaches [5, 22, 23] which focus on the quality of composition by means of non functional parameters i.e., quality of 1 2
1 Ecole de Mines de Saint-Etienne, France, email: [email protected]. 2 Orange Labs, France, email: {firstname.lastname}@orange-ftgroup.com
service (QoS), the quality of causal links can be considered as a distinguishing functional criterion for semantic web service compositions. Here we address the problem of optimization in service composition with respect to this functional criterion. Retrieving such a composition is defined as the global selection of causal links maximizing the quality of the composition, taking into account preferences and constraints defined by the end-user. To this end, an objective function maximizing the overall quality subject to causal links constraints is introduced. This leads to an NP-hard optimization problem [8] which is solved using integer linear programming methods. The remainder of this paper is organised as follows. In the next section we briefly review i) causal links, ii) a distinguishing criterion i.e., their robustness and iii) the causal link composition model. Section 3 defines the causal link quality criteria we require during the global selection phase. Section 4 formulates the problem of global causal link selection and describes an integer linear programming method to efficiently solve it. Section 5 presents its computational complexity and some experimentations. Section 6 briefly comments on related work. Finally section 7 draws some conclusions and talk about possible future directions.
2 Background
First of all, we present causal links. Then we recall the definition of their robustness, and finally describe causal link composition.
2.1 Web Service Composition & its Causal Links
In the semantic web, parameters (i.e., input and output) of services refer to concepts in a common ontology3 or Terminology T, where the OWL-S profile [1] or SA-WSDL [18] can be used to describe them (through semantic annotations). At the functional level, web service composition consists in retrieving some semantic links between output parameters Out si ∈ T of services si and input parameters In sj ∈ T of other services sj. Such a link, i.e., a causal link [14] cl_{i,j} (Figure 1) between two functional parameters of si and sj, is formalized as ⟨si, SimT(Out si, In sj), sj⟩. Thereby si and sj are partially linked according to a matching function SimT. This function expresses which matching type is employed to chain services. The range of SimT is reduced to the four well-known matching types introduced by [16] and the extra type Intersection [15]:
• Exact If the output parameter Out si of si and the input parameter In sj of sj are equivalent; formally, T |= Out si ≡ In sj.
• PlugIn If Out si is a sub-concept of In sj; formally, T |= Out si ⊑ In sj.
• Subsume If Out si is a super-concept of In sj; formally, T |= In sj ⊑ Out si.
• Intersection If the intersection of Out si and In sj is satisfiable; formally, T ⊭ Out si ⊓ In sj ⊑ ⊥.
Distributed ontologies are not considered here but are largely independent of the problem addressed in this work.
• Disjoint Otherwise, Out si and In sj are incompatible, i.e., T |= Out si ⊓ In sj ⊑ ⊥.

Figure 1. Illustration of a Semantic Causal Link cl_{i,j} (a service si with output parameters linked through cl_{i,j} = SimT(Out si, In sj) to the input parameters of a service sj).
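The five matching types amount to a case analysis over subsumption tests. The sketch below is illustrative; the subsumption oracle, the helper names and the toy hierarchy are assumptions, not part of the paper.

```python
def match_type(out_c, in_c, subsumes, disjoint):
    """Classify the causal link from output concept out_c to input concept in_c.
    subsumes(x, y) means T |= x is subsumed by y; disjoint(x, y) means
    T |= the intersection of x and y is unsatisfiable."""
    if subsumes(out_c, in_c) and subsumes(in_c, out_c):
        return "Exact"
    if subsumes(out_c, in_c):
        return "PlugIn"
    if subsumes(in_c, out_c):
        return "Subsume"
    if not disjoint(out_c, in_c):
        return "Intersection"
    return "Disjoint"

# Toy hierarchy: SlowNetworkConnection is a sub-concept of NetworkConnection.
SUB = {("SlowNetworkConnection", "NetworkConnection")}
subsumes = lambda x, y: x == y or (x, y) in SUB
disjoint = lambda x, y: False  # nothing is declared disjoint in this toy example

print(match_type("NetworkConnection", "SlowNetworkConnection", subsumes, disjoint))  # Subsume
```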
2.2 Robust Causal Link
The latter matching function SimT enables, at design time, finding some levels of semantic compatibilities (i.e., Exact, PlugIn, Subsume, Intersection) and incompatibilities (i.e., Disjoint) among independently defined web service descriptions. However, as emphasized by [13], the matching types Intersection and Subsume need some refinements to be fully efficient for causal link composition. Example 1. (Causal Link & Subsume Matching Type) Let s1 and s2 be two services such that the output parameter NetworkConnection of s1 is (causally) linked to the input parameter SlowNetworkConnection of s2 (cl^1_{1,2} in Figure 3). This causal link is valued by a Subsume matching type since NetworkConnection ⊒ SlowNetworkConnection (Figure 2). It is obvious that such a causal link should not be directly applied in a service composition since NetworkConnection is not specific enough to be used by the input SlowNetworkConnection. Indeed the output parameter NetworkConnection requires some Extra Descriptions to ensure a composition of s1 and s2.
Example 2. (Robustness, Extra & Common Description) Suppose the causal link presented in Example 1. Such a link is not robust enough (Definition 1) to be applied in a composition. The description missing in NetworkConnection to be used by the input parameter SlowNetworkConnection is defined by the Extra Description SlowNetworkConnection \ NetworkConnection, i.e., ∀netSpeed.Adsl1M. However, the Common Description is not empty since it is defined by SlowNetworkConnection ⊓ NetworkConnection, i.e., ∀netPro.Provider. Robust causal links can be obtained by retrieving an Extra Description that changes an Intersection into a PlugIn matching type, and a Subsume into an Exact matching type.
2.3 Causal Link Composition Model
In this work, the process model of web service composition and its causal links is specified by a statechart [10]. Its states refer to services whereas its transitions are labelled with causal links. In addition some basic composition constructs such as sequence, conditional branching (i.e., OR-Branching), structured loops, concurrent threads (i.e., AND-Branching), and inter-thread synchronization can be found. To simplify the presentation, we assume that all considered statecharts are acyclic and consists of only sequences, OR-Branching and AND-Branching. In case of cycle, a technique for unfolding statechart into its acyclic form needs to be applied beforehand. Details about this unfolding process are omitted for space reasons. Example 3. (Process Model of a Causal Link Composition) Suppose si,3≤i≤8 be six services extending Example 1 in a more complex composition. The process model of this composite service is illustrated in Figure 3. The composition consists in an OR-Branching and AND-Branching wherein nine causal links are involved.
A causal link valued by the Intersection matching type requires a comparable refinement. From this, [13] defined a robust causal link.

Figure 2. Sample of an ALE domain ontology T: NetworkConnection ≡ ∀netPro.Provider ⊓ ∀netSpeed.Speed; SlowNetworkConnection ≡ NetworkConnection ⊓ ∀netSpeed.Adsl1M; Adsl1M ≡ Speed ⊓ ∀mBytes.1M.
Definition 1. (Robust Causal link) A causal link si , SimT (Out si , In sj ), sj is robust iff the matching type between Out si and In sj is either Exact or PlugIn. Property 1. (Robust Web Service Composition) A composition is robust iff all its causal links are robust. A possible way to replace a link si , SimT (Out si , In sj ), sj valued by Intersection or Subsume in its robust form consists in computing the information contained in the input In sj and not in the output Out si . To do this, the difference or subtraction operation [7] for comparing ALE DL descriptions is adapted in [13]. Even if [20] previously presented an approach to capture the real semantic difference, the [7]’s difference is preferred since its result is unique. From this, in case a causal link si , SimT (Out si , In sj ), sj is neither valued by a Disjoint matchmaking nor robust, Out si and In sj are compared to obtain two kinds of information, a) the Extra Description In sj \Out si that refers to the information required but not provided by Out si to semantically link it with the input In sj of sj , and b) the Common Description Out si In sj that refers to the information required by In sj and effectively provided by Out si .
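Definition 1 and Property 1 translate directly into two small predicates; in the illustrative sketch below, the representation of a link as a (service, matching type, service) triple is an assumption.

```python
ROBUST_TYPES = {"Exact", "PlugIn"}

def is_robust_link(link):
    """A causal link (s_i, matching_type, s_j) is robust iff its matching
    type is Exact or PlugIn (Definition 1)."""
    _, matching_type, _ = link
    return matching_type in ROBUST_TYPES

def is_robust_composition(links):
    """A composition is robust iff all of its causal links are robust (Property 1)."""
    return all(is_robust_link(l) for l in links)

links = [("s1", "Subsume", "s2"), ("s2", "Exact", "s3")]
print(is_robust_link(links[0]), is_robust_composition(links))  # False False
```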
Figure 3. Illustration of an (Executable) Causal Link Composition (tasks T1–T8 performed by services s1–s8, connected by nine causal links, with an OR-Branching and an AND-Branching).
Example 3 illustrates an executable composition wherein tasks Ti have been concretized by one of their candidate services, e.g., here si. Indeed, some services with common functionality, preconditions and effects, although different input and output parameters, are given and can be used to perform a target task in the composition. In this way we address the issue of composing a large and changing collection of semantic web services. In our approach the choice of services is done at composition time, based only on their causal links with other services. Thus each abstract causal link cl^A_{i,j} between two tasks Ti, Tj of an abstract composition needs to be concretized. Ideally, a relevant link is selected among its n candidate causal links cl^k_{i,j}, 1 ≤ k ≤ n, between two of their services to obtain an executable composition. Example 4. (Tasks, Candidate Services & Causal Links) Let s2 be a candidate service for T2 with NetworkConnection as input parameter. The causal link cl^2_{1,2} between s1 and s2 is then more robust than cl^1_{1,2}. Indeed cl^2_{1,2} is valued by an Exact matching type whereas cl^1_{1,2} is valued by a Subsume matching type.
3 Causal Link Quality Model
As previously presented, several candidate services are grouped together in every task of an abstract composition. A way to differentiate their causal links (e.g., cl^1_{1,2} and cl^2_{1,2} in Example 4) consists in considering their different functional quality criteria. To this end, we adopt a causal link quality model applicable to any causal link. In this section, we first present the quality criteria used for elementary causal links, before turning our attention to composite causal links. For each criterion, we provide a definition and indicate rules to compute its value for a given causal link.
3.1 Quality Criteria for Elementary Causal Links
We consider three generic quality criteria for elementary causal links cl_{i,j} defined by ⟨si, SimT(Out si, In sj), sj⟩: i) Robustness, ii) Common Description rate, and iii) Matching Quality.
• Robustness. The Robustness q_r of a causal link cl_{i,j} is defined as 1 in case the link cl_{i,j} is robust (see Definition 1), and 0 otherwise.
• Common Description rate. This rate⁴ q_cd ∈ (0, 1] is defined by:
q_cd(cl_{i,j}) = |Out si ⊓ In sj| / (|In sj \ Out si| + |Out si ⊓ In sj|)   (1)
This criterion estimates the rate of description which is well specified for upgrading a non-robust causal link into its robust form. In (1), Out si ⊓ In sj is supposed to be satisfiable since only relevant links between two services are considered in our model.
• Matching Quality. The Matching Quality q_m of a link cl_{i,j} is a value in (0, 1] defined by SimT(Out si, In sj), i.e., either 1 (Exact), 3/4 (PlugIn), 1/2 (Subsume) or 1/4 (Intersection). The Disjoint match type is not considered since Out si ⊓ In sj is satisfiable. In case we consider Out si ⊓ In sj to be unsatisfiable, it is straightforward to extend and adapt our quality model by computing contraction [9] between Out si and In sj.
Given the above quality criteria, the quality vector of a causal link cl_{i,j} is defined as follows:
q(cl_{i,j}) = (q_r(cl_{i,j}), q_cd(cl_{i,j}), q_m(cl_{i,j}))   (2)
In case services si and sj are related by more than one causal link, the value of each criterion is retrieved by computing their average.
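A direct transcription of the three criteria and of the quality vector (2) could look as follows; this is a sketch in which the sizes of the common and extra descriptions are assumed to be given as integers (as produced by the difference operation), and the example sizes are illustrative.

```python
MATCH_QUALITY = {"Exact": 1.0, "PlugIn": 0.75, "Subsume": 0.5, "Intersection": 0.25}

def quality_vector(matching_type, common_size, extra_size):
    """Return q(cl) = (q_r, q_cd, q_m) for an elementary causal link.
    common_size = |Out s_i ⊓ In s_j|, extra_size = |In s_j \\ Out s_i|."""
    q_r = 1.0 if matching_type in {"Exact", "PlugIn"} else 0.0
    q_cd = common_size / (extra_size + common_size)          # Equation (1)
    q_m = MATCH_QUALITY[matching_type]
    return (q_r, q_cd, q_m)

# Illustrative sizes for a Subsume-valued link.
print(quality_vector("Subsume", 2, 2))  # (0.0, 0.5, 0.5)
```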
3.2 Quality Criteria for Causal Link Composition
The above quality criteria are also applied to evaluate the quality of any causal link composition c. To this end, Table 1 provides aggregation functions for such an evaluation. A brief explanation of each criterion’s aggregation function follows (here cl stands for cli,j ): • Robustness. On the one hand the robustness Qr of both a sequential and an AND-Branching composition c is defined as the average of its causal link cl’s robustness qr (cl). On the other hand the robustness of an OR-Branching causal link composition is a sum of qr (cl) weighted by pr i.e., the probability that causal link cl be chosen at run time. • Common Description rate. This Description rate Qcd of c is defined as its robustness, by simply changing qr (cl) by qcd (cl). • Matching Quality. The matching quality Qm of a sequential and AND-Branching causal link composition c is defined as a product of qm (cl). The matching quality of an OR-Branching causal link composition c is defined as Qr (c), by changing qr (cl) by qm (cl). 4
|·| refers to the size of ALE concept descriptions ([12], p. 17), i.e., |⊤|, |⊥|, |A|, |¬A| and |∃r| is 1; |C ⊓ D| = |C| + |D|; |∀r.C| and |∃r.C| is 1 + |C|. For instance |Adsl1M| is 3 in Figure 2.
Using the above aggregation functions, the quality vector of an executable causal link composition is defined by (3). For each criterion l ∈ {r, cd, m}, the higher the value Q_l for c, the higher its l-th quality.
Q(c) = (Q_r(c), Q_cd(c), Q_m(c))   (3)
Even if the criteria q_r, q_m used to value a single causal link are correlated, their aggregated values Q_r, Q_m for Sequential and AND-Branching compositions are independent, since they are computed from different functions, i.e., linear for Q_r but not for Q_m. Thus a composition c with a high robustness may have either a high or a low overall matching quality. We have the same conclusion for the other criteria.

Table 1. Quality Aggregation Rules for Causal Link Composition.
Construct | Robustness Q_r | Com. Desc. rate Q_cd | Match. Qual. Q_m
Sequential / AND-Branching | (1/|cl|) Σ_cl q_r(cl) | (1/|cl|) Σ_cl q_cd(cl) | Π_cl q_m(cl)
OR-Branching | Σ_cl q_r(cl)·p_cl | Σ_cl q_cd(cl)·p_cl | Σ_cl q_m(cl)·p_cl
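Table 1 can be implemented with a few folds. The sketch below is illustrative and aggregates per-link quality vectors for a sequential/AND-Branching block and for an OR-Branching block with given branch probabilities.

```python
from math import prod

def aggregate_sequential_or_and(vectors):
    """Sequential / AND-Branching row of Table 1: average robustness and
    common description rate, product of matching qualities."""
    n = len(vectors)
    q_r = sum(v[0] for v in vectors) / n
    q_cd = sum(v[1] for v in vectors) / n
    q_m = prod(v[2] for v in vectors)
    return (q_r, q_cd, q_m)

def aggregate_or_branching(vectors, probs):
    """OR-Branching row of Table 1: probability-weighted sums."""
    return tuple(sum(p * v[i] for v, p in zip(vectors, probs)) for i in range(3))

vectors = [(1.0, 1.0, 1.0), (0.0, 0.5, 0.5)]
print(aggregate_sequential_or_and(vectors))         # (0.5, 0.75, 0.5)
print(aggregate_or_branching(vectors, [0.7, 0.3]))  # (0.7, 0.85, 0.85)
```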
4 Global Causal Link Selection
In the following we study the optimal composition⁵ as the selection of causal links that optimizes the overall quality of the composition. On the one hand, the selection can be locally optimized at each abstract causal link cl^A_{i,j} of the composition, but two main issues arise. First, the local selection of a candidate link cl^k_{i,j} enforces a specific service for both tasks Ti and Tj. Thus, these constraints can no longer ensure selecting either the best links for its closest abstract links cl^A_{α,i} and cl^A_{j,β} or the optimal composition (e.g., the best local selection in cl^A_{1,2}, i.e., cl^1_{1,2}, does not lead to the optimal composition in Figure 4). Secondly, quality constraints may not be satisfied, leading to a sub-optimal composition, e.g., a constraint requiring a robustness of more than 70% cannot be enforced. On the other hand, the naive global approach considers an exhaustive search for the optimal composition among all the executable compositions. Let |cl^A_{i,j}| be the number of abstract links in a composition and n be the number of candidate services per task; the total number of executable causal link compositions is n^(2·|cl^A_{i,j}|), making this approach impractical for large scale composition. Here, we address these issues by presenting an integer linear programming (IP) [21] based global causal link selection, which i) further constrains causal links, and ii) meets a given objective.
4.1 IP Based Global Selection & Objective Function
There are 3 inputs in an IP problem: an objective function, a set of integer decision variables (restricted to value 0 or 1), and a set of constraints (equalities or inequalities), where both the objective function and the constraints must be linear. IP attempts to maximize or minimize the value of the objective function by adjusting the values of the variables while enforcing the constraints. The problem of retrieving an optimal executable composition is mapped into an IP problem. Here we suggest to formalize its objective function. To this end, the robustness, common description rate and matching values Q^λ_l (1 ≤ λ ≤ p, l ∈ {r, cd, m}) of the p potential executable compositions have first been determined by means of the aggregation functions in Table 1. Then, the latter quality values Q^λ_r, Q^λ_cd, Q^λ_m have been scaled according to (4):
Q̃^λ_l = (Q^λ_l − Q^min_l) / (Q^max_l − Q^min_l) if Q^max_l − Q^min_l ≠ 0, and 1 otherwise, for l ∈ {r, cd, m}.   (4)
In (4), Q^max_l is the maximal value of the l-th quality criterion whereas Q^min_l is the minimal value of the l-th quality criterion. The complexity of this scaling phase is linear in the number of abstract links in the composition. Finally, the objective function (5) of the IP problem follows.⁵
The relation and combination with quality of services is not addressed here.
max_{1≤λ≤p} ( Σ_{l∈{r,cd,m}} Q̃^λ_l × ω_l )   (5)
where ω_l ∈ [0, 1] is the weight assigned to the l-th quality criterion and Σ_{l∈{r,cd,m}} ω_l = 1. In this way preferences on the quality of the desired executable compositions can be done by simply adjusting ω_l, e.g., the Common Description rate could be weighted higher.
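Equations (4) and (5) amount to a min-max normalisation followed by a weighted maximisation; a minimal sketch with illustrative values follows.

```python
def scale(values):
    """Equation (4): min-max scaling of one quality criterion over the
    p candidate compositions; returns 1.0 for a constant criterion."""
    lo, hi = min(values), max(values)
    if hi - lo == 0:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def best_composition(quality, weights):
    """Equation (5): pick the composition maximising the weighted sum of
    the scaled criteria.  quality[l] lists Q_l for each composition."""
    scaled = {l: scale(vs) for l, vs in quality.items()}
    p = len(next(iter(quality.values())))
    scores = [sum(weights[l] * scaled[l][i] for l in quality) for i in range(p)]
    return max(range(p), key=scores.__getitem__), scores

quality = {"r": [0.5, 1.0], "cd": [0.6, 0.9], "m": [0.5, 0.25]}
weights = {"r": 0.4, "cd": 0.4, "m": 0.2}
print(best_composition(quality, weights))  # (1, [0.2, 0.8])
```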
4.2 Integer Variables & Constraints of IP Problem
For every candidate link cl^k_{i,j} (1 ≤ k ≤ n) of an abstract link cl^A_{i,j}, we include an integer variable y^k_{i,j} in the IP problem indicating the selection or exclusion of link cl^k_{i,j}. By convention, y^k_{i,j} is 1 if the k-th candidate link cl^k_{i,j} is selected to concretize cl^A_{i,j} between tasks Ti and Tj, and 0 otherwise. The selected links will form an optimal executable composition satisfying (5) and meeting the following constraints:
Allocation Constraint. Only one candidate link should be selected for each abstract link cl^A_{i,j} between tasks Ti and Tj. This constraint is formalized in (6) by exploiting the integer variables y^k_{i,j}, 1 ≤ k ≤ n:
Σ_{k=1}^{n} y^k_{i,j} = 1,  ∀cl^A_{i,j}   (6)
Example 5. (Allocation Constraint) Suppose the sequential composition of tasks T1, T2, T3 in Figure 4. Two candidate causal links can be applied between tasks T1 and T2, i.e., cl^1_{1,2} and cl^2_{1,2}. Since only one candidate between two tasks will be selected, we have y^1_{1,2} + y^2_{1,2} = 1. Similarly, y^1_{2,3} + y^2_{2,3} = 1 for cl^A_{2,3}.
Incompatibility Constraint. Since the selection of a candidate causal link cl^k_{i,j} for cl^A_{i,j} enforces a specific service for both tasks Ti (e.g., si) and Tj (e.g., sj), the number of candidate links concretizing its closest abstract links cl^A_{α,i} and cl^A_{j,β} is highly reduced. Indeed the candidate links for cl^A_{j,β} (cl^A_{α,i}) have to use only input (output) parameters of sj (si). Thus, a constraint (7) for each pair of incompatible candidate links (cl^k_{i,j}, cl^l_{j,β}) is required in our IP problem:
y^k_{i,j} + y^l_{j,β} ≤ 1,  ∀cl^A_{i,j} ∀cl^A_{j,β}   (7)
Example 6. (Incompatibility Constraint) Suppose the composition in Figure 4. According to (7), the incompatibility constraints are i) y^1_{1,2} + y^2_{2,3} ≤ 1, ii) y^2_{1,2} + y^1_{2,3} ≤ 1. Indeed (cl^1_{1,2}, cl^2_{2,3}) and (cl^2_{1,2}, cl^1_{2,3}) are pairs of incompatible candidate links since task T2 cannot be performed by two distinct services sa and sb.
Besides (6) and (7), IP constraints on the quality criteria of the whole abstract composition are required. Here, we focus on sequential and AND-Branching compositions, but a similar formalization for OR-Branching compositions and a fortiori their combinations is required.
Robustness Constraint. Let r^k_{i,j} be a function of (i, j, k) representing the robustness quality of a causal link cl^k_{i,j}. Constraint (8) is required to capture the robustness quality of a causal link composition:
Q_r = (1 / |cl^A_{i,j}|) Σ_{cl^A_{i,j}} Σ_{k=1}^{n} r^k_{i,j} · y^k_{i,j}   (8)
An additional constraint (9) can be used to constrain the robustness quality of the executable composition to not be lower than L:
(1 / |cl^A_{i,j}|) Σ_{cl^A_{i,j}} Σ_{k=1}^{n} r^k_{i,j} · y^k_{i,j} ≥ L,  L ∈ [0, 1]   (9)
Common Description Rate Constraint. Let cd^k_{i,j} be a function of (i, j, k) representing the Common Description rate of a link cl^k_{i,j}. Its constraint is defined as (8) and (9), replacing Q_r by Q_cd and r^k_{i,j} by cd^k_{i,j}.
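The allocation and incompatibility constraints, together with the weighted objective, can be handed to any off-the-shelf IP solver. The sketch below uses the PuLP modelling library as an assumption (the experiments reported later rely on a commercial solver); the link names, scores and service pairs are illustrative.

```python
# A minimal IP model for global causal link selection, assuming the PuLP
# package is installed (pip install pulp).  Illustrative data only.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

# Candidate links per abstract link, each with an aggregated (scaled, weighted)
# score and the pair of concrete services it enforces for its two tasks.
candidates = {
    ("T1", "T2"): [("cl12_1", 0.9, ("s1", "sa")), ("cl12_2", 0.4, ("s1", "sb"))],
    ("T2", "T3"): [("cl23_1", 0.3, ("sa", "s3")), ("cl23_2", 0.9, ("sb", "s3"))],
}

prob = LpProblem("causal_link_selection", LpMaximize)
y = {name: LpVariable(name, cat=LpBinary)
     for links in candidates.values() for name, _, _ in links}

# Objective: maximise the total score of the selected links.
prob += lpSum(score * y[name] for links in candidates.values() for name, score, _ in links)

# Allocation constraint (6): exactly one candidate per abstract link.
for links in candidates.values():
    prob += lpSum(y[name] for name, _, _ in links) == 1

# Incompatibility constraint (7): links sharing task T2 must agree on its service.
for n1, _, (_, svc1) in candidates[("T1", "T2")]:
    for n2, _, (svc2, _) in candidates[("T2", "T3")]:
        if svc1 != svc2:
            prob += y[n1] + y[n2] <= 1

prob.solve()
print({name: var.value() for name, var in y.items()})
```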
Figure 4. Tasks, Candidate Services & Causal Links.
Matching Quality Constraint. Among the criteria used to select causal links, the Matching quality is associated with a non-linear aggregation function (see Table 1). A transformation into a linear function is then required to capture it in the IP problem. Let m^k_{i,j} be a function of (i, j, k) representing the Matching quality of causal link cl^k_{i,j}. The overall Matching quality of the executable composition is:
Q_m = Π_{cl^A_{i,j}} Π_{k=1}^{n} (m^k_{i,j})^(y^k_{i,j})   (10)
The Matching quality constraints can be linearised by applying the logarithm function ln. Equation (10) then becomes:
ln(Q_m) = Σ_{cl^A_{i,j}} Σ_{k=1}^{n} ln(m^k_{i,j}) · y^k_{i,j}   (11)
since Σ_{k=1}^{n} y^k_{i,j} = 1 and y^k_{i,j} = 1 or 0 for each causal link cl^A_{i,j}; ln(Q_m) is thus formalized to capture the Matching quality in our work. Changing a non-linear constraint into its linear form also requires linearising the objective function. Thus, (12) is replaced by (13) in (4):
(Q^λ_m − Q^min_m) / (Q^max_m − Q^min_m)   (12)
(ln(Q^λ_m) − ln(Q^min_m)) / (ln(Q^max_m) − ln(Q^min_m))   (13)
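Because each y^k_{i,j} is 0 or 1 and exactly one candidate is selected per abstract link, taking logarithms turns the product (10) into the linear expression (11); the following short numerical check uses illustrative values.

```python
from math import log, prod, isclose

m = {("A", 1): 0.75, ("A", 2): 1.0, ("B", 1): 0.5}   # matching qualities m^k_{i,j}
y = {("A", 1): 1, ("A", 2): 0, ("B", 1): 1}          # one candidate selected per abstract link

q_m = prod(m[k] ** y[k] for k in m)                  # Equation (10)
lin = sum(log(m[k]) * y[k] for k in m)               # Equation (11)
print(isclose(log(q_m), lin))                        # True
```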
Local Constraint. The IP problem can also include local selection and encompass local constraints. Such constraints can then predicate on properties of a single link and can be formally included in the model. In case a target causal link cl^A_{i,j} requires its local robustness to be higher than a given value v, this constraint is defined by (14):
Σ_{k=1}^{n} r^k_{i,j} · y^k_{i,j} > v,  v ∈ [0, 1]   (14)
Local constraints are enforced during the causal links selection. Those which violate the local constraints are filtered from the list of candidate links, reducing the number of variables of the model. The proposed method for translating the problem of selecting an optimal execution composition into an IP problem is generic and, although it has been illustrated with criteria introduced in Section 3, other semantic criteria to value causal links can be accommodated.
5 Computational Complexity & Experimentation
The optimization problem formulated in Section 4, which is equivalent to an IP problem, is NP-hard [17]. In case the number of abstract and candidate causal links is very high, finding the exact optimal solution to such a problem takes exponential run time in the worst case, and is thus not practical. However, our approach scales well by running a heuristic-based IP solver wherein hundreds of abstract and candidate causal links are involved. This is a suitable upper bound for practicable industrial applications. We conducted experiments on an Intel(R) Core(TM)2 CPU, 1.86GHz with 512 RAM. Compositions with up to 500 abstract causal links and 100 candidates for each abstract link have been considered. In our experiments we assumed that robustness, common
description rate and matching quality of each causal link have been inferred in a pre-processing step of semantic reasoning. From these, the IP model formulation is computed, and the optimization problem is solved by running CPLEX, a state-of-the-art integer linear programming solver based on the branch and cut technique⁶ [21]. The experimentation (Figure 5) aimed at comparing the global selection based approach by IP with the local optimization and naive global selection (i.e., exhaustive search). We measured the computation cost (in ms) of selecting causal links to create an optimal executable composition under the three different selection approaches.
Figure 5. Number of Abstract Causal Links vs. Computation Cost for Optimal Executable Composition (100 candidates for each abstract causal link; curves for global selection using exhaustive search, global selection using IP, and local optimization based selection).
The computation cost of global selection by exhaustive search is very high, even at a very small scale in terms of the number of abstract causal links and their candidates. Although the computation cost of global selection by IP is higher than that of local optimization, it is still acceptable. Finding the optimal solution to the optimization problem takes 10 seconds for a composition of 450 abstract causal links with 100 candidate links (i.e., 10 candidate services per task). In case of a higher number of links, the problem can, for instance, be divided into several global selection problems. Alternatively, sub-optimal solutions satisfying revisited quality thresholds can be sufficient.
6 Related Work
Despite considerable work in the area of service composition, few efforts have specifically addressed optimization in 'causal link'-based service composition. Even if [13] introduces validity and robustness in causal link composition, no quality model is explicitly supported. In addition, the most valid and robust compositions are only addressed in their future work. In contrast, we present a model with various types of quality criteria used for optimizing the composition. Unlike our work, which considers the quality of causal links, [23, 2] focused on QoS-aware service composition. To this end, they suggest a QoS-driven approach to select candidate services valued by non-functional criteria such as price, execution time, and reliability. In the same way as our approach, they consider their problem as an optimization problem. To address this issue, different optimization strategies can be adopted, e.g., Integer Programming [23], Genetic Algorithms (GAs) [8], or Constraint Programming [11]. As discussed in [8], GAs better handle non-linearity of aggregation functions, and better scale up when the number of candidate services for each abstract service is high. In IP-based approaches all quality criteria are used for specifying both constraints and the objective function. In contrast to our problem, the incompatibility constraints are not required since they assume independence between the services of any task. The global selection problem is also modelled as a knapsack problem [22], wherein [3] performed dynamic programming to solve the problem. Unfortunately, all the previous QoS-aware service composition approaches consider only causal links valued by an Exact match. The causal link quality is thus disregarded by these approaches.
LINDO API version 5.0, Lindo Systems Inc. http://www.lindo.com/
7 Conclusion and Future Work
In this work we study causal link based semantic web service composition. Our approach has been directed to meet the main challenge facing this problem, i.e., how to effectively retrieve optimal compositions of causal links. To this end we have first presented a general and extensible model to evaluate the quality of both elementary causal links and compositions of causal links. Since the global causal link selection is formalized as an optimization problem, IP techniques are used to compute optimal executable compositions of services. Our global selection based approach is not only more suitable than the local approach but also outperforms the naive approach. Moreover, the experimental results show an acceptable computation cost of the IP-based global selection for a high number of abstract and candidate causal links. Since several executable compositions maximizing the overall quality of causal links may be retrieved, the main direction for future work is to consider optimality for quality of service (driven by empirical analysis of composition usage) to further optimize them.
REFERENCES [1] Anupriya Ankolenkar, Massimo Paolucci, Naveen Srinivasan, and Katia Sycara, ‘The owl-s coalition, owl-s 1.1’, Technical report, (2004). [2] Danilo Ardagna and Barbara Pernici, ‘Adaptive service composition in flexible processes’, IEEE Trans. Software Eng., 33(6), 369–384, (2007). [3] Ismailcem Budak Arpinar, Ruoyan Zhang, Boanerges Aleman-Meza, and Angela Maduko, ‘Ontology-driven web services composition platform’, Inf. Syst. E-Business Management, 3(2), 175–199, (2005). [4] F. Baader and W. Nutt, in The Description Logic Handbook: Theory, Implementation, and Applications, (2003). [5] Rainer Berbner, Michael Spahn, Nicolas Repp, Oliver Heckmann, and Ralf Steinmetz, ‘Heuristics for qos-aware web service composition’, in ICWS, pp. 72–82, (2006). [6] Tim Berners-Lee, James Hendler, and Ora Lassila, ‘The semantic web’, Scientific American, 284(5), 34–43, (2001). [7] S. Brandt, R. Kusters, and A. Turhan, ‘Approximation and difference in description logics’, in KR, pp. 203–214, (2002). [8] Gerardo Canfora, Massimiliano Di Penta, Raffaele Esposito, and Maria Luisa Villani, ‘An approach for qos-aware service composition based on genetic algorithms’, in GECCO, pp. 1069–1075, (2005). [9] Simona Colucci, Tommaso Di Noia, Eugenio Di Sciascio, Francesco M. Donini, and Marina Mongiello, ‘Concept abduction and contraction in description logics’, in DL, (2003). [10] David Harel and Amnon Naamad, ‘The statemate semantics of statecharts’, ACM Trans. Softw. Eng. Methodol., 5(4), 293–333, (1996). [11] Ahlem Ben Hassine, Shigeo Matsubara, and Toru Ishida, ‘A constraintbased approach to horizontal web service composition’, in ISWC, pp. 130–143, (2006). [12] Ralf K¨usters, Non-Standard Inferences in Description Logics, volume 2100 of Lecture Notes in Computer Science, Springer, 2001. [13] Freddy L´ecu´e and Alexandre Delteil, ‘Making the difference in semantic web service composition.’, in AAAI, pp. 1383–1388, (2007). [14] Freddy L´ecu´e and Alain L´eger, ‘A formal model for semantic web service composition’, in ISWC, pp. 385–398, (2006). [15] L. Li and I. Horrocks, ‘A software framework for matchmaking based on semantic web technology’, in WWW, pp. 331–339, (2003). [16] M. Paolucci, T. Kawamura, T.R. Payne, and K. Sycara, ‘Semantic matching of web services capabilities’, in ISWC, pp. 333–347, (2002). [17] Christos H. Papadimitriou, ‘On the complexity of integer programming’, J. ACM, 28(4), 765–768, (1981). [18] K. Sivashanmugam, K. Verma, A. Sheth, and J. Miller, ‘Adding semantics to web services standards’, in ICWS, pp. 395–401, (2003). [19] Michael K. Smith, Chris Welty, and Deborah L. McGuinness, ‘Owl web ontology language guide’, W3c recommendation, W3C, (2004). [20] Gunnar Teege, ‘Making the difference: A subtraction operation for description logics’, in KR, pp. 540–550, (1994). [21] L. Wolsey, Integer Programming, John Wiley and Sons, 1998. [22] Tao Yu, ‘Service selection algorithms for composing complex services with multiple qos constraints’, in ICSOC, pp. 130–143, (2005). [23] Liangzhao Zeng, Boualem Benatallah, Marlon Dumas, Jayant Kalagnanam, and Quan Z. Sheng, ‘Quality driven web services composition’, in WWW, pp. 411–421, (2003).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-50
Extending the Knowledge Compilation Map: Closure Principles
Hélène Fargier1 and Pierre Marquis2
Abstract. We extend the knowledge compilation map introduced by Darwiche and Marquis with new propositional fragments obtained by applying closure principles to several fragments studied so far. We investigate two closure principles: disjunction and implicit forgetting (i.e., existential quantification). Each introduced fragment is evaluated w.r.t. several criteria, including the complexity of basic queries and transformations, and its spatial efficiency is also analyzed.
1 INTRODUCTION
This paper is concerned with knowledge compilation (KC). The key idea underlying KC is to pre-process parts of the available data (i.e., turning them into a compiled form) for improving the efficiency of some computational tasks (see among others [2, 1, 10, 4]). A research line in KC [7, 3] addresses the following important issue: How to choose a target language for knowledge compilation? In [3], the authors argue that the choice of a target language for a compilation purpose in the propositional case must be based both on the set of queries and transformations which can be achieved in polynomial time when the data to be exploited are represented in the language, as well as the spatial efficiency of the language (i.e., its ability to represent data using little space). Thus, the KC map reported in [3] is an evaluation of dozen of significant propositional languages (called propositional fragments) w.r.t. several dimensions: the spatial efficiency (i.e., succinctness) of the fragment and the class of queries and transformations it supports in polynomial time. The basic queries considered in [3] include tests for consistency, validity, implicates (clausal entailment), implicants, equivalence, sentential entailment, counting and enumerating theory models (CO, VA, CE, EQ, SE, IM, CT, ME). The basic transformations are conditioning (CD), (possibly bounded) closures under the connectives ∧, ∨, and ¬ ( ∧ C, ∧BC, ∨C, ∨BC, ¬C) and (possibly bounded) forgetting which can be viewed as a closure operation under existential quantification (FO, SFO). The KC map reported in [3] has already been extended to new propositional languages, queries and transformations in [12, 5, 11]. In this paper, we extend the KC map with new propositional fragments obtained by applying closure principles to several fragments studied so far. Intuitively, a closure principle is a way to define a new propositional fragment from a previous one. In this paper, we investigate in detail two disjunctive closure principles, disjunction (∨) 1 2
1 IRIT-CNRS, Université Paul Sabatier, France, email: [email protected]. 2 Université Lille-Nord de France, Artois, CRIL UMR CNRS 8188, France, email: [email protected].
and implicit forgetting (∃), and their combinations. Roughly speaking, the disjunction principle when applied to a fragment C leads to a fragment C[∨] which allows disjunctions of formulas from C, while implicit forgetting applied to a fragment C leads to a fragment C[∃] which allows existentially quantified formulas from C. Obviously enough, whatever C, C[∨] satisfies polytime closure under ∨ (∨C) and C[∃] satisfies polytime forgetting (FO). Applying any/both of those two principles may lead to new fragments, which can prove strictly more succinct than the underlying fragment C; interestingly, this gain in efficiency does not lead to a complexity shift w.r.t. the main queries and transformations; indeed, among other things, our results show that whenever C satisfies CO (resp. CD), then C[∨] and C[∃] satisfy CO (resp. CD). The remainder of this paper is organized as follows. In Section 2, we define the language of quantified propositional DAGs. In Section 3, we extend the usual notions of queries, transformations and succinctness to this language. In Section 4, we introduce the general principle of closure by a connective or a quantification before focusing on the disjunctive closures of the fragments considered in [3] and studying their attractivity for KC, thus extending the KC map. In Section 5, we discuss the results. Finally, Section 6 concludes the paper.
2 A GLIMPSE AT QUANTIFIED PDAGS
All the propositional fragments we consider in this paper are subsets of the following language of quantified propositional DAGs QPDAG: Definition 1 (quantified PDAGs) Let P S be a denumerable set of propositional variables (also called atoms). • QPDAG is the set of all finite, single-rooted DAGs α (called formulas) where each leaf node is labeled by a literal over P S or one of the two Boolean constants or ⊥, and each internal node is labeled by ∧ or ∨ and has arbitrarily many children or is labeled by ¬, ∃x or ∀x (where x ∈ P S) and has just one child. • Qp PDAG is the subset of all proper formulas of QPDAG, where a formula α is proper iff for every literal l = x or l = ¬x labelling a leaf of α, at most one path from the root of α to this leaf contains quantifications of the form ∃x or ∀x, and if such a path exists, it is the unique path from the root of α to the leaf. Restricting the language QPDAG to proper formulas α ensures that every occurrence of a variable x corresponding to a literal at a leaf of α depends on at most one quantification on x, and is either free or bound. As a consequence (among others), conditioning a proper formula can be achieved as usual (without requiring any duplication of nodes).
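A direct data-structure reading of Definition 1 might look as follows; the class and field names are ours, not the authors', and the encoding is only a sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """One node of a (quantified) propositional DAG.
    label is a literal ('x', '-x'), 'T', 'F', a connective ('and', 'or', 'not')
    or a quantifier ('exists', 'forall'); quantified nodes carry a variable
    and have exactly one child, 'not' has one child, 'and'/'or' any number."""
    label: str
    variable: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

# Exists x. (x or not y) -- a proper formula: the only quantification on x dominates its leaf.
formula = Node("exists", "x", [Node("or", children=[Node("x"), Node("-y")])])
print(formula.label, formula.variable, len(formula.children))  # exists x 1
```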
PDAG [12] is the subset of Qp PDAG obtained by removing the possibility to have internal nodes labeled by ∃ or ∀; PDAG-NNF [3] (resp. ∃PDAG-NNF, resp. ∀PDAG-NNF) is the subset of Qp PDAG obtained by removing the possibility to have internal nodes labeled by ¬, ∃ or ∀ (resp. ¬, ∀, resp. ¬, ∃). Distinguished formulas from QPDAG are the literals over P S; if V is any subset of P S, LV denotes the set of all literals built over V , i.e., {x, ¬x | x ∈ V }. If a literal l of LP S is an atom x from P S, it is said to be a positive literal; otherwise it has the form ¬x with x ∈ P S and it is said to be a negative literal. If l is a literal built up from the atom x, we have var(l) = x. A clause (resp. a term) is a (finite) disjunction (resp. conjunction) of literals or the constant ⊥ (resp. ). The size |α| of any QPDAG formula α is the number of nodes plus the number of arcs in α. The set V ar(α) of free variables of a Qp PDAG formula α is defined in the standard way. Let I be an interpretation over P S (i.e., a total function from P S to BOOL = {0, 1}). The semantics of a QPDAG formula α in I is the truth value from BOOL defined inductively in the standard way; the notions of model, logical consequence (|=) and logical equivalence (≡) are also as usual. Finally, if α ∈ QPDAG and X = {x1 , . . . , xn } ⊆ P S, then ∃X.α (resp. ∀X.α) is a short for ∃x1 .(∃x2 .(...∃xn .α)...) (resp. ∀x1 .(∀x2 .(...∀xn .α)...)) (this notation is well-founded since whatever the chosen ordering on X, the resulting formulas are logically equivalent).
3 QUERIES, TRANSFORMATIONS, AND SUCCINCTNESS
The following queries CO, VA, CE, EQ, SE, IM, CT, ME for PDAG-NNF formulas have been considered in [3]; their importance is discussed in depth in [3], so we refrain from recalling it here; we extend them to Qp PDAG formulas and add to them the MC query (model checking), which is trivial for PDAG formulas (every formula from PDAG satisfies MC), but not for Qp PDAG formulas. Definition 2 (queries)
Let C denote any subset of Qp PDAG.
• C satisfies CO (resp. VA) iff there exists a polytime algorithm that maps every formula α from C to 1 if α is consistent (resp. valid), and to 0 otherwise. • C satisfies MC iff there exists a polytime algorithm that maps every formula α from C and every interpretation I over V ar(α) to 1 if I is a model of α, and to 0 otherwise. • C satisfies CE iff there exists a polytime algorithm that maps every formula α from C and every clause γ to 1 if α |= γ holds, and to 0 otherwise. • C satisfies EQ (resp. SE) iff there exists a polytime algorithm that maps every pair of formulas α, β from C to 1 if α ≡ β (resp. α |= β) holds, and to 0 otherwise. • C satisfies IM iff there exists a polytime algorithm that maps every formula α from C and every term γ to 1 if γ |= α holds, and to 0 otherwise. • C satisfies CT iff there exists a polytime algorithm that maps every formula α from C to a nonnegative integer that represents the number of models of α over V ar(α) (in binary notation). • C satisfies ME iff there exists a polynomial p(., .) and an algorithm that outputs all models of an arbitrary formula α from C in time p(n, m), where n is the size of α and m is the number of its models (over V ar(α)). The following transformations for PDAG-NNF formulas have been considered in [3]; again, we extend them to Qp PDAG formulas:
The following transformations for PDAG-NNF formulas have been considered in [3]; again, we extend them to Qp PDAG formulas:

Definition 3 (transformations) Let C denote any subset of Qp PDAG.
• C satisfies CD iff there exists a polytime algorithm that maps every formula α from C and every consistent term γ to a formula from C that is logically equivalent to the conditioning α | γ of α on γ, i.e., the formula obtained by replacing each free occurrence of a variable x of α by ⊤ (resp. ⊥) if x (resp. ¬x) is a positive (resp. negative) literal of γ.
• C satisfies FO iff there exists a polytime algorithm that maps every formula α from C and every subset X of variables from PS to a formula from C equivalent to ∃X.α. If the property holds for each singleton X, we say that C satisfies SFO.
• C satisfies ∧C (resp. ∨C) iff there exists a polytime algorithm that maps every finite set of formulas α1, . . . , αn from C to a formula of C that is logically equivalent to α1 ∧ . . . ∧ αn (resp. α1 ∨ . . . ∨ αn).
• C satisfies ∧BC (resp. ∨BC) iff there exists a polytime algorithm that maps every pair of formulas α and β from C to a formula of C that is logically equivalent to α ∧ β (resp. α ∨ β).
• C satisfies ¬C iff there exists a polytime algorithm that maps every formula α from C to a formula of C logically equivalent to ¬α.

Finally, the following notion of succinctness (modeled as a preorder over propositional fragments) has been considered in [3]; we also extend it to QPDAG formulas:

Definition 4 (succinctness) Let C1 and C2 be two subsets of QPDAG. C1 is at least as succinct as C2, denoted C1 ≤s C2, iff there exists a polynomial p such that for every formula α ∈ C2, there exists an equivalent formula β ∈ C1 with |β| ≤ p(|α|). ∼s is the symmetric part of ≤s, defined by C1 ∼s C2 iff C1 ≤s C2 and C2 ≤s C1. <s is the asymmetric part of ≤s, defined by C1 <s C2 iff C1 ≤s C2 and C2 ≰s C1.
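To illustrate the CD transformation of Definition 3, the following continuation of the node sketch above (again purely illustrative, not the authors' code) conditions a quantifier-free PDAG formula on a consistent term γ, given as a map from variables to truth values; each affected leaf is simply replaced by a constant, so the result stays in the same fragment and is computed in linear time.

def condition(node, gamma, cache=None):
    """Return a DAG equivalent to node | gamma (free occurrences replaced by constants)."""
    if cache is None:
        cache = {}
    if id(node) in cache:
        return cache[id(node)]
    if node.kind == 'const':
        new = node
    elif node.kind == 'lit':
        if node.var in gamma:     # x (resp. ¬x) in gamma: the leaf becomes a constant
            new = Node('const', value=(gamma[node.var] == node.sign))
        else:
            new = node
    else:
        new = Node(node.kind,
                   children=tuple(condition(c, gamma, cache) for c in node.children))
    cache[id(node)] = new
    return new

# Conditioning (x1 ∨ ¬x2) ∧ x3 on the term x1 yields a formula equivalent to x3.
print(mc(condition(phi, {'x1': True}), {'x2': True, 'x3': True}))   # True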
4 EXTENDING THE KC MAP BY DISJUNCTIVE CLOSURES

4.1 Closure Principles
Intuitively, a closure principle is a way to define a new propositional fragment starting from a previous one, through the application of “operators” (i.e., connectives or quantifications):3

Definition 5 (closures) Let C be a subset of QPDAG and let Δ be any finite subset of {∨, ∧, ¬, ∃, ∀}. C[Δ] is the subset of QPDAG inductively defined as follows:4
• if α ∈ C, then α ∈ C[Δ],
• if δ ∈ Δ ∩ {∨, ∧} and αi ∈ C[Δ] for i ∈ {1, . . . , n} with n > 0, then δ(α1, . . . , αn) ∈ C[Δ],
• if ¬ ∈ Δ and α ∈ C[Δ], then ¬α ∈ C[Δ],
• if δ ∈ Δ ∩ {∀, ∃}, α ∈ C[Δ], and x ∈ PS, then δx.α ∈ C[Δ].

Observe that if C ⊆ Qp PDAG then C[Δ] ⊆ Qp PDAG: closure does not question properness. We also have the following easy proposition, which makes precise the interplay between elements of Δ in the general case:
3 Other closure principles could be defined in a similar way, were the underlying propositional language to contain other connectives.
4 In order to alleviate the notation, when Δ = {δ1, . . . , δn}, we write C[δ1, . . . , δn] instead of C[{δ1, . . . , δn}].
Proposition 1 For every subset C of QPDAG and all finite subsets Δ1, Δ2 of {∨, ∧, ¬, ∃, ∀}, we have:
• C[∅] = C.
• If Δ1 ⊆ Δ2, then C[Δ1] ⊆ C[Δ2].
• (C[Δ1])[Δ2] ⊆ C[Δ1 ∪ Δ2].
• If Δ1 ⊆ Δ2 or Δ2 ⊆ Δ1, then (C[Δ1])[Δ2] = C[Δ1 ∪ Δ2].
Before focusing on some specific “operators”, we add to succinctness the following notions of polynomial translation and polynomial equivalence, which prove helpful in the following evaluations:

Definition 6 (polynomial translation) Let C1 and C2 be two subsets of QPDAG. C1 is said to be polynomially translatable into C2, noted C1 ≥P C2, iff there exists a polytime algorithm f such that for every α ∈ C1, we have f(α) ∈ C2 and f(α) ≡ α.

Like ≥s, ≥P is a preorder (i.e., a reflexive and transitive relation) over the power set of QPDAG. It refines the spatial efficiency preorder ≥s over QPDAG in the sense that for any two subsets C1 and C2 of QPDAG, if C1 ≥P C2, then C1 ≥s C2 (but the converse does not hold in general). Thus, if C1 is polynomially translatable into C2, then C2 is at least as succinct as C1. Furthermore, whenever C1 is polynomially translatable into C2, every query which is supported in polynomial time in C2 is also supported in polynomial time in C1; conversely, every query which is not supported in polynomial time in C1 unless the polynomial hierarchy collapses cannot be supported in polynomial time in C2, unless the polynomial hierarchy collapses. The corresponding indifference relation ∼P, given by C1 ∼P C2 iff C1 ≥P C2 and C2 ≥P C1, is an equivalence relation; when C1 ∼P C2, C1 and C2 are said to be polynomially equivalent. Obviously enough, polynomially equivalent fragments are equally efficient (and succinct) and possess the same set of tractable queries and transformations.

Before presenting some useful polynomial equivalences, we first need to introduce the notion of stability under uniform renaming. It characterizes the subsets C of Qp PDAG for which, intuitively, the choice of variable names does not really matter; technically, it allows one to rename (bound) variables in a formula α of C without leaving the fragment.

Definition 7 (stability under uniform renaming) Let C be any subset of Qp PDAG. C is stable under uniform renaming iff for every α ∈ C, there exist arbitrarily many distinct bijections r from Var(α) to subsets V of fresh variables from PS (i.e., not occurring in α) such that the formula r(α), obtained by replacing in α (in a uniform way) every free occurrence of x ∈ Var(α) by r(x), belongs to C as well.

We are now ready to present more specific results:

Proposition 2 Let C be any subset of Qp PDAG, s.t. C is stable under uniform renaming. We have:
• (C[∃])[∨] ∼P (C[∨])[∃] ∼P C[∨, ∃].
• (C[∀])[∧] ∼P (C[∧])[∀] ∼P C[∧, ∀].

It is important to note that such polynomial equivalences, showing in some sense that the “sequential” closure of a propositional fragment stable under uniform renaming by a set of “operators” among {∨, ∃} (resp. among {∧, ∀}) is equivalent to its “parallel” closure, cannot be systematically guaranteed for any choices of fragments
and “operators”. For instance, if C is the set LPS ∪ {⊤, ⊥}, then (C[∨])[∧] is the set of all CNF formulas, (C[∧])[∨] is the set of all DNF formulas, and C[∨, ∧] is the set of all PDAG-NNF formulas. From the succinctness results reported in [3], it is easy to conclude that those three fragments are not pairwise polynomially equivalent. Similarly, if C is the set of all clauses over PS, then (C[∧])[∃] and C[∧, ∃] are polynomially equivalent to CNF[∃], but (C[∃])[∧] is polynomially equivalent to CNF, which is not polynomially equivalent to CNF[∃] (this follows from the forthcoming Proposition 8).
4.2 Disjunctive Closures
In the rest of this paper, we focus on the two disjunctive closure principles [∨] (closure by disjunction) and [∃] (closure by forgetting), and their combinations. At the start, this choice was motivated by the fact that any closure C[∃] obviously satisfies forgetting, which is an important transformation for a number of applications, including planning, diagnosis, reasoning about action and change, and reasoning under inconsistency (see e.g. [2, 8, 9] for details), while any closure C[∨] clearly preserves the crucial query CO and transformation CD. Our purpose is now to locate on the KC map all languages obtained by applying the disjunctive closure principles to the eight languages PDAG-NNF, DNNF, CNF, OBDD<, DNF, PI, IP, MODS considered (among others) in [3]; all those languages are subsets of PDAG:
• PDAG-NNF is the subset of PDAG consisting of negation normal form formulas.
• DNNF is the subset of PDAG-NNF consisting of decomposable negation normal form formulas.
• CNF is the subset of PDAG-NNF consisting of conjunctive normal form formulas.
• OBDD< is the subset of DNNF consisting of ordered binary decision diagrams, where < is a strict and complete ordering over PS and we assume the ordered set (PS, <)

ϕ1NH(ψ) = {c ∈ ϕ | |P(c)| > 1, |DIψ(c)| = 1},
ϕ≠1NH(ψ) = {c ∈ ϕ | |P(c)| > 1, |DIψ(c)| ≠ 1}.
The following example illustrates the definitions.

Example 4 Suppose that ϕ = c1 ∧ c2 ∧ c3, where
c1 = (¬x1 ∨ ¬x3 ∨ x4 ∨ x6),
c2 = (¬x1 ∨ ¬x2 ∨ x3 ∨ ¬x5 ∨ x6),
c3 = (¬x1 ∨ ¬x2 ∨ ¬x3 ∨ ¬x5 ∨ x6),
and ψ = (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6). Then
DIψ(c1) = {¬x1 ∨ ¬x3 ∨ x4},
DIψ(c2) = {¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6},
DIψ(c3) = {c3},
and ϕH = {c3}, ϕ1NH(ψ) = {c1, c2}, ϕ≠1NH(ψ) = ∅. As predicted, ϕ = ϕH ∧ ϕ1NH(ψ) ∧ ϕ≠1NH(ψ) = c3 ∧ c1 ∧ c2.
Using the above concepts, we now characterize Horn cores.

Definition 6 (ϕ*ψ) Given a CNF ϕ and a formula ψ, define the CNF
ϕ*ψ = ϕH ∧ μϕ(ψ) ∧ ϕ≠1NH(ψ),
where μϕ(ψ) = {c′ ∈ DIψ(c) | c ∈ ϕ1NH(ψ)}. That is, each non-Horn clause c in the CNF for ϕ is replaced by its strengthening to a definite clause c′, if c′ is the only such clause implied by ψ. We now have the following result.

Theorem 1 (Horn Core Characterization) A given Horn CNF ψ is a Horn core of a CNF ϕ if and only if ψ ≡ ϕ*ψ.

The formal proof is omitted here. Intuitively, by construction of ϕ*ψ, any Horn core ψ of ϕ must fulfill ψ ≤ ϕ*ψ. On the other hand, ϕ*ψ ≤ ϕ; thus if ϕ*ψ is equivalent to ψ, it must be a Horn core.

Example 5 (cont'd) In Example 4, we had ϕH = {c3}, ϕ1NH(ψ) = {c1, c2}, and ϕ≠1NH(ψ) = ∅. We thus obtain μϕ(ψ) = DIψ(c1) ∪ DIψ(c2) = {¬x1 ∨ ¬x3 ∨ x4, ¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6}, and thus ϕ*ψ = ϕH ∧ μϕ(ψ) = (¬x1 ∨ ¬x2 ∨ ¬x3 ∨ ¬x5 ∨ x6) ∧ (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6). Now ψ = (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6) ≡ ϕ*ψ; hence, ψ is a Horn core of ϕ.
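To make Definition 6 concrete, here is an illustrative Python sketch (not the authors' code). Clauses are pairs (N, P) of frozensets of variables; entailment of a clause by a Horn CNF is checked by unit propagation in the spirit of [4]. Since the definition of DIψ(c) lies outside this excerpt, the sketch assumes, as the use of DIψ in Examples 4–6 and Figure 3 suggests, that DIψ(c) collects the definite strengthenings N(c) ∪ {xj}, xj ∈ P(c), of c that are implied by ψ.

def horn_entails(psi, clause):
    """Return True iff the Horn CNF psi (a list of (N, P) clauses) implies clause."""
    n_c, p_c = clause
    true_vars = set(n_c)      # assume the negated variables of the clause to be true...
    false_vars = set(p_c)     # ...and its positive variables to be false, then propagate
    changed = True
    while changed:
        changed = False
        for n, p in psi:
            if n <= true_vars:                 # the body of this Horn clause is satisfied
                if not p:
                    return True                # empty head: contradiction, so psi |= clause
                head = next(iter(p))
                if head in false_vars:
                    return True                # head forced true although assumed false
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    return False                               # a countermodel exists

def definite_strengthenings(c):
    """All definite clauses N(c) ∪ {xj}, xj ∈ P(c)."""
    n, p = c
    return [(n, frozenset({x})) for x in p]

def DI(psi, c):
    """Assumed reading of DI_psi(c): the definite strengthenings of c implied by psi."""
    return [d for d in definite_strengthenings(c) if horn_entails(psi, d)]

def phi_star(phi, psi):
    """phi*_psi as in Definition 6 (phi and psi given as lists of (N, P) clauses)."""
    result = []
    for c in phi:
        n, p = c
        if len(p) <= 1:
            result.append(c)                   # Horn clause: part of phi_H, kept
        else:
            di = DI(psi, c)
            result.append(di[0] if len(di) == 1 else c)   # mu_phi(psi) resp. unchanged
    return result

# Example 4/5 (variables numbered 1..6): c1 and c2 get strengthened, c3 is kept.
c1 = (frozenset({1, 3}), frozenset({4, 6}))
c2 = (frozenset({1, 2, 5}), frozenset({3, 6}))
c3 = (frozenset({1, 2, 3, 5}), frozenset({6}))
psi = [(frozenset({1, 3}), frozenset({4})), (frozenset({1, 2, 5}), frozenset({6}))]
print(phi_star([c1, c2, c3], psi))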
5 Computation
We now turn to computing a Horn core of a Horn disjunction, for which we exploit the characterization in the previous section. Our strategy is to increase an initial Horn CNF repeatedly, until we arrive at a Horn CNF that satisfies the condition in Theorem 1. To this end, we first consider recognizing a Horn core, and show that the problem is polynomial if a CNF for ϕ is constructible in polynomial time.
5.1 Recognizing Horn Cores

We observe the following facts.

Lemma 2 Let ϕ1 ∨ · · · ∨ ϕl be a disjunction of l ≥ 2 Horn CNFs ϕi, and let ϕ be a CNF for it. Let ψ be a Horn CNF. Then,
1. ϕ*ψ is constructible from ϕ and ψ in polynomial time;
2. checking whether ψ ≤ ϕ*ψ is feasible in polynomial time;
3. checking whether ϕ*ψ ≤ ψ is feasible in polynomial time.
Proof Items 1 and 2 are clearly feasible in polynomial time (note that ϕ*ψ is a CNF). For item 3, we rewrite ϕ*ψ as a Horn disjunction:
ϕ*ψ = ϕH ∧ μϕ(ψ) ∧ ϕ≠1NH(ψ)
    ≡ ϕH ∧ μϕ(ψ) ∧ ϕ≠1NH(ψ) ∧ ϕ1NH(ψ)
    ≡ ϕ ∧ μϕ(ψ) ≡ (ϕ1 ∧ μϕ(ψ)) ∨ · · · ∨ (ϕl ∧ μϕ(ψ)).
As α ∨ β ≤ γ iff α ≤ γ and β ≤ γ, we can check for i = 1, . . . , l that ϕi ∧ μϕ(ψ) ≤ ψ; this is feasible in polynomial time. □

In particular, if l is bounded by a constant, a CNF ϕ for ϕ1 ∨ · · · ∨ ϕl is computable in polynomial time by simple means (e.g., ϕ := S(ϕ1, . . . , ϕl)). We thus obtain the following result.

Theorem 2 Deciding whether a given Horn CNF ψ is a Horn core of a given Horn disjunction ϕ = ϕ1 ∨ · · · ∨ ϕl, l ≥ 2, is feasible in polynomial time, if a CNF for ϕ is computable in polynomial time. In particular, if l is bounded by a constant, this is decidable in time O(max{n, l} · n · |ψ| · Πi=1..l |ϕi|) (here |γ| is the number of clauses in γ).

Here and later, we assume in the time analysis that clauses c are represented by bitmaps (of size n) for P(c) and N(c).
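Continuing the sketch above, the recognition test behind Theorem 2 can be spelled out directly (illustrative only). Here S(ϕ1, . . . , ϕl) is assumed to be the standard product CNF of a disjunction of CNFs, taking one clause from each disjunct; its actual definition lies outside this excerpt, but this reading is consistent with the bound m̂ = Πi |ϕi| used in the time analysis.

from itertools import product

def S(*phis):
    """A CNF for phi_1 ∨ ... ∨ phi_l: all disjunctions of one clause per phi_i."""
    alpha = []
    for choice in product(*phis):
        n = frozenset().union(*(c[0] for c in choice))
        p = frozenset().union(*(c[1] for c in choice))
        if not (n & p):                        # drop tautological clauses
            alpha.append((n, p))
    return alpha

def is_horn_core(psi, phis):
    """Check psi ≡ phi*_psi (Theorem 1) for phi = phi_1 ∨ ... ∨ phi_l."""
    alpha = S(*phis)
    star = phi_star(alpha, psi)
    # psi ≤ phi*_psi: psi implies every clause of phi*_psi.
    if not all(horn_entails(psi, c) for c in star):
        return False
    # phi*_psi ≤ psi: by Lemma 2, check phi_i ∧ mu ≤ psi for every i.  Taking the
    # whole Horn part of phi*_psi as mu is harmless here, since every clause of
    # alpha is already implied by each phi_i.
    mu = [c for c in star if len(c[1]) <= 1]
    return all(all(horn_entails(list(phi_i) + mu, d) for d in psi) for phi_i in phis)

# Example 6: psi0 below satisfies psi0 ≤ phi, but it is not yet a Horn core.
phi1 = [(frozenset({1, 3}), frozenset({4})), (frozenset({2, 5}), frozenset({6}))]
phi2 = [(frozenset({1, 2}), frozenset({3})), (frozenset({1, 3}), frozenset({6}))]
psi0 = [(frozenset({1, 3}), frozenset({4})), (frozenset({1, 2}), frozenset())]
print(is_horn_core(psi0, [phi1, phi2]))        # False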
5.2 Constructing a Horn Core
Algorithm NEWCORE
Input: Horn CNFs ψ, ϕ1, . . . , ϕl, l ≥ 2.
Output: A Horn core ψ′ of ϕ = ϕ1 ∨ · · · ∨ ϕl such that ψ ≤ ψ′ ≤ ϕ, or "no" if none exists.
Step 1. convert ϕ to a CNF α (e.g., α := S(ϕ1, . . . , ϕl));
  if ψ ≰ α then return "no";
  S≠1 := {c ∈ α | |P(c)| > 1, |DIψ(c)| ≠ 1};
  β := {N(c) ∪ {xj} | c ∈ S≠1, xj ∈ P(c)};
  μ := {c′ ∈ DIψ(c) | c ∈ α, |P(c)| > 1, |DIψ(c)| = 1};
  ψ′ := ψ;
Step 2. while ϕi ∧ μ ≰ ψ′ for some i ∈ {1, . . . , l} do // (ϕ*ψ′ ≰ ψ′)
  begin
    select v ∈ {0, 1}^n witnessing ϕi ∧ μ ≰ ψ′;
    β := β − {c′ ∈ β | c′(v) = 0};
    for each c ∈ S≠1 do
      if a single clause c′ ∈ β fulfills N(c′) = N(c), P(c′) ⊆ P(c) then
        begin S≠1 := S≠1 − {c}; μ := μ ∪ {c′}; end
    ψ′ := {c ∈ α | |P(c)| ≤ 1} ∪ μ ∪ β;
  end{while};
Step 3. Output ψ′.

Figure 3. New algorithm for Horn core computation
We now present our algorithm to construct a Horn core of a Horn disjunction ϕ = ϕ1 ∨ · · · ∨ ϕl that contains a given Horn CNF ψ. If ψ ≰ ϕ, then obviously there is no Horn core ψ′ of ϕ such that ψ ≤ ψ′ ≤ ϕ. Otherwise, we can construct some such ψ′ by iteratively increasing ψ, exploiting the characterization in Theorem 1. The following lemma is crucial.

Lemma 3 Suppose ψ ≤ ϕ and ψ ≢ ϕ*ψ. Then, there exists some v ∈ {0, 1}^n such that (i) ψ(v) = 0 and ϕ*ψ(v) = 1 (i.e., ϕ*ψ ≰ ψ), and (ii) for every such v and Horn CNF ψ′ = ϕH ∧ μϕ(ψ) ∧ β, where β contains for each clause c ∈ ϕ≠1NH(ψ) at least one clause c′ ∈ DIψ(c) such that c′(v) = 1, it holds that ψ < ψ′ ≤ ϕ.

The algorithm NEWCORE, shown in Figure 3, proceeds as follows. After converting ϕ to a CNF α and testing ψ ≤ α, it initializes auxiliary variables and a candidate Horn core ψ′. In Step 2, ψ′ is tested using Lemma 2; if it is not a Horn core yet, ψ′ is repeatedly updated according to Lemma 3.

Example 6 (cont'd) Reconsider ϕ = ϕ1 ∨ ϕ2 in Example 3, where ϕ1 = (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x2 ∨ ¬x5 ∨ x6) and ϕ2 = (¬x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ ¬x3 ∨ x6), and let ψ = (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2). (As seen from Example 4, ψ ≤ ϕ but ψ is not a Horn core of ϕ.) In Step 1 of NEWCORE, α = c1 ∧ c2 ∧ c3 and
DIψ(c1) = {¬x1 ∨ ¬x3 ∨ x4},
DIψ(c2) = {¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x3, ¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6},
DIψ(c3) = {¬x1 ∨ ¬x2 ∨ ¬x3 ∨ ¬x5 ∨ x6}.
Thus,
S≠1 = {c2},
β = {¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x3, ¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6},
μ = {¬x1 ∨ ¬x3 ∨ x4}, and
ψ′ = (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2).
In Step 2, the test of the while loop succeeds, as ϕ1 ∧ μ ≡ ϕ1 ≰ ψ′ holds; e.g., for v = (110011), we have ϕ1(v) = 1 and ψ′(v) = 0. The set β is then updated to β = {¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6}, and for c2 the updates S≠1 := ∅ and μ = {¬x1 ∨ ¬x3 ∨ x4, ¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6} are performed; finally, ψ′ is updated to
ψ′ = (¬x1 ∨ ¬x2 ∨ ¬x3 ∨ ¬x5 ∨ x6) ∧ (¬x1 ∨ ¬x3 ∨ x4) ∧ (¬x1 ∨ ¬x2 ∨ ¬x5 ∨ x6).
The test for the next while-iteration fails, since ψ′ ≡ μ; hence, ψ′ is output. Note that ψ′ is indeed a Horn core of ϕ such that ψ ≤ ψ′.

The following result states that the new algorithm is correct (the formal proof is omitted here).

Theorem 3 NEWCORE correctly computes a Horn core ψ′ of ϕ = ϕ1 ∨ · · · ∨ ϕl such that ψ ≤ ψ′ ≤ ϕ. Moreover, it can be implemented to run in time O(nlm̂(lm̂ + |ψ|)), where m̂ = Πi=1..l |ϕi|.

In particular, if l is bounded by a constant, NEWCORE runs in polynomial time: In Step 1, building α = S(ϕ1, . . . , ϕl) is feasible in time O(nlm̂), and the test ψ ≤ α in time O(n|ψ|m̂). Each DIψ(c) is computable in time O(nl|ψ|), and thus the initial S≠1, β, and μ in time O(m̂nl|ψ|). In total, Step 1 is feasible in time O(nlm̂|ψ|). In Step 2, the while loop is executed at most (l−1)|S≠1| + 1 ≤ (l−1)m̂ + 1 times. Using appropriate data structures, the loop body is executable in time O(lnm̂), and the loop tests need, throughout the computation, total time O(nlm̂(lm̂ + |ψ|)): since μ only increases, for each potential clause c′ in ψ′ all tests ϕi ∧ μ ≤ c′ are feasible in total time O(Σi=1..l n(m̂ + |ϕi|)) = O(lnm̂). There are at most lm̂ such clauses c′ from the initial β and at most |ψ| many from ψ′ \ β. In total, Step 2 is feasible in time O(nlm̂(lm̂ + |ψ|)). In summary, we obtain a bound of O(nlm̂(lm̂ + |ψ|)). However, in practice better behavior is plausible, as |α|, |S≠1| etc. are likely to be smaller than m̂, and far fewer than (l−1)m̂ + 1 loop executions are expected; furthermore, simple optimizations can be incorporated.

Important features of algorithm NEWCORE, which distinguish it from the algorithms CORE and CORE*, are that it can compute targeted Horn cores ψ′ above a given ψ, and that it is nondeterministically complete: upon proper choices of v, each Horn core ψ′ such that ψ ≤ ψ′ ≤ ϕ is obtainable.
6 Horn Envelope of a Horn Disjunction
We now turn to the question of whether a Horn envelope of a Horn disjunction can be computed efficiently. As we show, it has a negative answer, which is a consequence of the intractability of recognizing the Horn envelope. More precisely, the following holds.

Theorem 4 Given Horn CNFs ψ, ϕ1, and ϕ2, deciding whether ψ is a Horn envelope of ϕ = ϕ1 ∨ ϕ2 is co-NP-complete.

Proof (Sketch) As for the membership in co-NP, ψ is not a Horn envelope of ϕ if and only if either (a) ϕ ≰ ψ, or (b) there exists a Horn clause c such that (b.1) ϕ ≤ ψ ∧ c and (b.2) ψ ≰ c (thus ϕ ≤ (ψ ∧ c) < ψ). Such a clause c can be guessed, and the tests (a), (b.1), and (b.2) are feasible in polynomial time. The co-NP-hardness is shown by a reduction from the complement of SAT. Let α = c1 ∧ · · · ∧ cm be a CNF of nonempty clauses ci on variables x1, . . . , xn. Let y, z, x′1, . . . , x′n be fresh variables. Define
=
β2
=
c∗1 ∧ · · · ∧ c∗m ,
V (¬y ∨ ¬x1 ∨ · · · ∨ ¬xn ) ∧ n i=1 (¬y ∨ xi ∨ ¬xi ), W W where c∗i = y ∨ x ∈ N (c ) ¬xj ∨ x ∈ P (c ) ¬xj , and let ϕ 1 = z ∧ β1 ∧ β 2 ,
ϕ2 = y ∧ β2 , and
ϕ = ϕ 1 ∨ ϕ2 .
Note that ϕ1 and ϕ2 are Horn CNFs and that ψ* ≤ β1 ∧ β2 must hold, where ψ* is any Horn envelope of ϕ, as ϕ ≡ (y ∨ z) ∧ β1 ∧ β2. Intuitively, ϕ2 generates the models v for the variables x1, . . . , xn, which are encoded by prime implicates of β2 of the form pv = ∨vi=0 ¬x′i ∨ ∨vi=1 ¬xi, while ϕ1 generates, via interaction (resolution) of clauses in β1 and β2, all models v resp. clauses pv such that α(v) = 0. Now if α is unsatisfiable, then each pv is an implicate of both ϕ1 and ϕ2, thus of ψ*, and ψ* ≡ β1 ∧ β2 follows. Otherwise, some pv is not a joint implicate, and ψ* < β1 ∧ β2 holds. Thus ψ = β1 ∧ β2 is a Horn envelope of ϕ iff α is unsatisfiable. □

Armed with this result, we now derive that most likely we cannot efficiently construct a compact Horn envelope of a Horn disjunction.

Theorem 5 There is no algorithm that constructs, given Horn CNFs ϕ1 and ϕ2, a prime irredundant Horn envelope ψ for ϕ1 ∨ ϕ2 in time polynomial in the size of ψ, ϕ1, and ϕ2, unless P = NP.

Proof We show that if such an algorithm existed, then the co-NP-complete problem of recognizing the Horn envelope in Theorem 4 could be solved in polynomial time. The proof makes use of the following lemma, which states an important property of Horn CNFs. Denote by ‖α‖ the representation size of a CNF α.

Lemma 4 All prime irredundant (Horn) CNFs for a Horn CNF ϕ differ at most polynomially in size, i.e., there exists a polynomial p(·) such that for every two irredundant prime CNFs ϕ1 and ϕ2 equivalent to ϕ, ‖ϕ1‖ ≤ p(‖ϕ2‖) and ‖ϕ2‖ ≤ p(‖ϕ1‖).

This lemma follows, e.g., by combining results in [8] on prime irredundant Horn CNFs (especially, the number of clauses c with P(c) = ∅ in them) and in [7] on FD-covers, which correspond to sets of definite clauses. Now suppose that an algorithm A existed that computes a prime irredundant Horn envelope for ϕ = ϕ1 ∨ ϕ2 in polynomial total time, i.e., in time bounded by a polynomial q(os, is) in the output size os = ‖A(ϕ1, ϕ2)‖ and the input size is = ‖ϕ1‖ + ‖ϕ2‖. We then use A to decide, given Horn CNFs ψ, ϕ1 and ϕ2, whether ψ is a Horn envelope of ϕ1 ∨ ϕ2 in polynomial time (which implies P = NP) as follows. We run A for at most q(os*, is) steps,
where os* = p(‖ψ‖) is a (polynomial) upper bound on the size of A(ϕ1, ϕ2) from Lemma 4 (note that ψ need not be prime). If A halts, then we check whether the output of A is equivalent to ψ; this is feasible in polynomial time. Otherwise, A will compute a Horn CNF ψ* such that ψ* ≢ ψ, and hence ψ is not the Horn envelope of ϕ. This algorithm works in polynomial time in the size of ψ, ϕ1, and ϕ2. □

We remark that in the hardness proof of Theorem 4 neither ϕ1 nor ϕ2 may be replaced by a small Horn CNF. In fact, we can show that the problem is tractable if in some ϕi the number of variables that occur positively, i.e., |⋃{P(c) | c ∈ ϕi}|, is bounded by a constant. Moreover, if both ϕi have this property, a Horn envelope of ϕ1 ∨ ϕ2 is computable in input-polynomial time. This holds since a CNF ϕ*i with all prime implicates of ϕi is computable in polynomial time (in the size of ϕi) and S(ϕ*1, ϕ*2) contains all prime implicates of ϕ1 ∨ ϕ2. More generally, we have the following result.

Proposition 2 Given arbitrary CNFs ϕi such that |⋃{P(c) | c ∈ ϕi}| ≤ k for a constant k, where 1 ≤ i ≤ l and l is bounded by a constant, a prime irredundant Horn envelope for ϕ1 ∨ · · · ∨ ϕl is computable in time polynomial in the size of ϕ1 ∨ · · · ∨ ϕl.
7 Conclusion
Horn cores and Horn envelopes are important concepts for propositional formulas that have appealing properties. We have obtained both positive results, like a novel characterization of Horn cores for CNFs and a new algorithm to compute Horn cores for a Horn disjunction, and a negative result in terms of the intractability of computing the Horn envelope of a Horn disjunction wrt. polynomial total-time. These results provide a computational basis for crafting implementations in the context of knowledge bases. Several issues remain for future work. One is to explore consequences and applicability of the present results to other combinations of Horn theories than disjunctions. Another is to further delineate the (in)tractability frontier for Horn envelopes that was briefly discussed here. Finally, efficient enumeration of multiple or all Horn cores would be interesting (a suitable variant of algorithm NEWCORE is non-obvious).
REFERENCES
[1] Y. Boufkhad, ‘Algorithms for Propositional KB Approximation’, in Proc. National Conference on AI (AAAI ’98), pp. 280–285, AAAI Press.
[2] M. Cadoli and F. Scarcello, ‘Semantical and Computational Aspects of Horn Approximations’, Artificial Intelligence, 119(1–2), 1–17, (2000).
[3] A. del Val, ‘An Analysis of Approximate Knowledge Compilation’, in Proc. IJCAI ’95, pp. 830–836, (1995).
[4] W. Dowling and J. H. Gallier, ‘Linear-time Algorithms for Testing the Satisfiability of Propositional Horn Theories’, Journal of Logic Programming, 3, 267–284, (1984).
[5] T. Eiter, K. Makino, and T. Ibaraki, ‘Disjunctions of Horn Theories and their Cores’, SIAM Journal on Computing, 31(1), 269–288, (2001).
[6] G. Gogic, Ch. Papadimitriou, and M. Sideri, ‘Incremental Recompilation of Knowledge’, J. Artif. Intell. Res., 8, 23–37, (1998).
[7] G. Gottlob, ‘On the size of nonredundant FD-covers’, Information Processing Letters, 24(6), 355–360, (1987).
[8] P. Hammer and A. Kogan, ‘Horn functions and their DNFs’, Information Processing Letters, 44, 23–29, (1992).
[9] P. Hammer and A. Kogan, ‘Optimal Compression of Propositional Horn Knowledge Bases: Complexity and Approximation’, Artificial Intelligence, 64, 131–145, (1993).
[10] D. Kavvadias, Ch. Papadimitriou, and M. Sideri, ‘On Horn Envelopes and Hypergraph Transversals’, in Proc. 4th Int’l Symp. Algorithms and Computation (ISAAC-93), LNCS 762, pp. 399–405, Springer.
[11] B. Selman and H. Kautz, ‘Knowledge Compilation and Theory Approximation’, Journal of the ACM, 43(2), 193–224, (1996).
Belief revision with reinforcement learning for interactive object recognition
Thomas Leopold1 and Gabriele Kern-Isberner2 and Gabriele Peters3

Abstract. From a conceptual point of view, belief revision and learning are quite similar. Both methods change the belief state of an intelligent agent by processing incoming information. However, for learning, the focus is on the exploitation of data to extract and assimilate useful knowledge, whereas belief revision is more concerned with the adaption of prior beliefs to new information for the purpose of reasoning. In this paper, we propose a hybrid learning method called SPHINX that combines low-level, non-cognitive reinforcement learning with high-level epistemic belief revision, similar to human learning. The former represents knowledge in a sub-symbolic, numerical way, while the latter is based on symbolic, non-monotonic logics and allows reasoning. Beyond the theoretical appeal of linking methods of very different disciplines of artificial intelligence, we will illustrate the usefulness of our approach by employing SPHINX in the area of computer vision for object recognition tasks. The SPHINX agent interacts with its environment by rotating objects, depending on past experiences and newly acquired generic knowledge, to choose those views which are most advantageous for recognition.
1 INTRODUCTION
One of the most challenging tasks of computer vision systems is the recognition of known and unknown objects. An elegant way to achieve this is to show the system some samples of each object class and thereby train the system, so that it can recognize objects that it has not seen before, but which look similar to some objects of the training phase (due to some defined features). Several methods to do so have been successfully used and analyzed. One of them is to set up a rule-based system and have it reason; another one is to use numerical learning methods such as reinforcement learning. Both of them have advantages, but also disadvantages. Reinforcement learning yields good results in different kinds of environments, but its training is time consuming, since it is a trial-and-error method and the agent has to learn from scratch. The possibilities to introduce background knowledge (e.g., by the choice of the initial values of the QTable) are more limited than, for example, with knowledge representation techniques. Another disadvantage is the limited ability to generalize experiences and so to act appropriately in unfamiliar situations. Though some generalization can be obtained by the application of function approximation techniques, the possibilities to generalize from learned rules to unfamiliar situations are again more varied with, for example, knowledge representation techniques.

1 University of Technology Dortmund, Germany
2 University of Technology Dortmund, Germany
3 University of Applied Sciences and Arts Dortmund, Germany

Knowledge representation and belief revision techniques have the advantage that the belief of the agent is represented quite clearly and allows reasoning about actions. The belief can be extended by new information, but needs to be revised when the new information contradicts the current belief. One drawback is that it is difficult to decide which parts of the belief should be given up, so that the new belief state is consistent, i.e., without inherent contradictions. In this paper, we present our hybrid learning system SPHINX, named after the Egyptian statue of a hybrid between a human and a lion. It combines the advantages of both Q-Learning and belief revision and diminishes the disadvantages, so that synergy effects can emerge. SPHINX agents, on the one hand, are intelligent agents equipped with epistemic belief states, which allows them to build a model of the world and to apply reasoning techniques to focus on the most plausible actions. On the other hand, they use QTables to determine which action should be carried out next, and are able to process reward signals from the environment. Moreover, SPHINX agents can learn situational as well as generic knowledge which is incorporated into their epistemic states via belief revision. In this way, they are able to adjust faster and more thoroughly to the environment, and to improve their learning capabilities considerably. This will be illustrated in detail by experiments in the field of computer vision. This paper is organized as follows: Chapter 2 summarizes related work. In chapter 3 we recall basic facts on Q-Learning, ordinal conditional functions and revision. Chapter 4 contains the main contribution of this paper, the presentation of the SPHINX system. Chapter 5 summarizes results from experiments in computer vision carried out in different environments. Finally, we conclude in chapter 6.
2 RELATED WORK
Psychological findings propose a two-level learning model for human learning [1], [6], [3], [10]. On the so-called bottom level, humans learn implicitly and acquire procedural knowledge. They are not aware of the relations they have learned and can hardly put them into words. On the other level, the top level, humans learn explicitly and acquire declarative knowledge. They are aware of the relations they have learned and can express them, e.g., in the form of if-then rules. A special form of declarative knowledge is episodic knowledge. This kind of knowledge is not of a general nature, but refers to specific events, situations or objects. Episodic knowledge makes it possible to remember specific situations where general rules do not hold. These two levels do not work separately. Depending on what is learned, humans learn top-down or bottom-up [11]. It has been found [8] that in completely unfamiliar situations mainly implicit learning
takes place and procedural knowledge is acquired. The declarative knowledge is formed afterwards. This indicates that the bottom-up direction plays an important role. It is also advantageous to continually verbalize, to a certain extent, what one has just learned, and so speed up the acquisition of declarative knowledge and thereby the whole learning process. Sun, Merrill and Peterson developed the learning model CLARION [9]. It is a two-level, bottom-up learning model which uses Q-Learning for the bottom level and a set of rules for the top level. The rules have the form 'Premise ⇒ Action', where the premise can be met by the current state signal of the environment. For the maintenance of the set of rules (i.e., adding, changing and deleting rules) the authors have conceived a specific technique. They have shown their model, which works similarly to human learning, to be successful in a mine field navigation task. Cang Ye, N. H. C. Yung and Danwei Wang propose a neural fuzzy system [2]. Like CLARION, this is a two-level learning model, combining reinforcement learning and fuzzy logic. The system has successfully been applied to a mobile robot navigation task.
3 BASICS AND BACKGROUND
In this section, we will recall basic facts on the two methodologies that are used and combined in this paper. First, we briefly describe Q-Learning, a popular approach used for solving Markov Decision Processes (MDPs) (see e.g. [12]). The scenario is the usual one for agents, where one or more agents interact with an environment. Normally, the environment starts in an initial state and stops when a terminal state is reached; this timespan is called an episode. For each action, the agent is rewarded. The more reward it collects during an episode, the better. Episodes consist of steps in which the agent first perceives the current state s of the environment via a (numerical) state signal, e.g., an ID. It looks up in its memory, called the QTable, which action a seems to be the best in this situation and performs it. The environment reacts to this action by changing its state to s′. After this change, the agent gets a reward r for its choice and updates its QTable. Q(λ)-learning is an enhanced Q-Learning method that not only takes the expected rewards into account but also considers the state-action pairs that have led to a state s. Let Q(s, a) represent the sum of rewards the agent expects to receive until the end of the episode if it performs action a in situation s, and let A(s) be the set of actions the agent can perform in state s. The update formula for a state-action pair (s̃, ã) in Q(λ)-learning is Q(s̃, ã) := Q(s̃, ã) + α · e(s̃, ã) · δ, where e(s̃, ã) is an eligibility factor, expressing how much influence on (s, a) is conceded to (s̃, ã) (the longer ago, the smaller the value), and δ := r + max a′∈A(s′) Q(s′, a′) − Q(s, a).
Before updating the (s̃, ã)-values, the eligibility factor of the current state-action pair (s, a) is increased by 1. After the update, the parameter λ is used to decrease the e(s̃, ã)-values to e(s̃, ã) := λ · e(s̃, ã). For λ = 0, we get the basic Q-Learning approach. The decision which action to take in a situation s is usually made by choosing the one with the greatest Q(s, a)-value. To make the discovery of new solutions possible, the agent chooses a random action with a small probability ε.
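For concreteness, the update rule and the eligibility bookkeeping just described can be realized with a simple table (an illustrative sketch, not the SPHINX implementation; it uses no discount factor, matching the formula for δ given here, and omits refinements such as resetting traces on exploratory moves).

from collections import defaultdict

class QLambdaTable:
    def __init__(self, alpha=0.1, lam=0.9):
        self.q = defaultdict(float)    # (state, action) -> expected sum of rewards
        self.e = defaultdict(float)    # (state, action) -> eligibility factor
        self.alpha = alpha
        self.lam = lam

    def update(self, s, a, r, s_next, actions_next):
        """One Q(lambda) step after performing a in s and observing reward r and state s_next."""
        best_next = max((self.q[(s_next, a2)] for a2 in actions_next), default=0.0)
        delta = r + best_next - self.q[(s, a)]
        self.e[(s, a)] += 1.0                       # current pair becomes fully eligible
        for key in list(self.e):
            self.q[key] += self.alpha * self.e[key] * delta
            self.e[key] *= self.lam                 # lambda = 0 yields plain Q-learning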
Now, the concept of ordinal conditional functions (OCFs) and appropriate revision techniques will be explained. OCFs will serve as representations of epistemic states of agents in this paper. Ordinal conditional functions [7] are also called ranking functions, as they assign a degree of plausibility, in the form of a degree of disbelief or surprise, respectively, to each possible world. We will work within a propositional framework, making use of multi-valued propositional variables di with domains {vi,1, . . . , vi,mi}. Possible worlds are simply interpretations here, assigning exactly one value to each di, and thus correspond to complete elementary conjunctions of multi-valued literals (di = vi,j), mentioning each di. Let Ω be the set of all possible worlds. Formally, an ordinal conditional function (OCF) is a mapping κ : Ω → N ∪ {∞} with κ⁻¹(0) ≠ ∅. The lower κ(ω), the more plausible is ω; hence the most plausible worlds have κ-value 0. A degree of plausibility can be assigned to formulas A by setting κ(A) := min{κ(ω) | ω |= A}, so that κ(A ∨ B) = min{κ(A), κ(B)}. This means that a formula is considered as plausible as its most plausible models. Therefore, due to κ⁻¹(0) ≠ ∅, at least one of κ(A), κ(¬A) must be 0. A proposition A is believed if κ(¬A) > 0 (which implies particularly κ(A) = 0). Moreover, degrees of plausibility can also be assigned to conditionals by setting κ(B|A) = κ(AB) − κ(A). A conditional (B|A) is accepted in the epistemic state represented by κ, or κ satisfies (B|A), written as κ |= (B|A), iff κ(AB) < κ(A¬B), i.e., iff AB is more plausible than A¬B. OCFs represent the epistemic attitudes of agents in quite a comprehensible way and offer simple arithmetics to propagate information. Therefore, they can be revised by new information in a straightforward manner, making use of the idea of so-called c-revisions [4], which are capable of revising ranking functions even by sets of new conditional beliefs. Here, we will only consider revisions by one conditional belief, so we will present the technique for this particular case. Given a prior epistemic state in the form of an OCF κ and a new conditional belief (B|A), the revision κ* = κ ∗ (B|A) is defined by
κ*(ω) = κ0 + κ(ω) + λ, if ω |= A¬B,   (1)
κ*(ω) = κ0 + κ(ω), otherwise,
where κ0 is a normalizing additive constant and λ is the least natural number to ensure that κ*(AB) < κ*(A¬B). Although c-revisions are defined in [4] for logical languages built from binary atoms, the approach can easily be generalized to multi-valued propositional variables. Note that c-revision by facts is also covered, as facts are identified with degenerate conditionals with tautological premises, i.e., A ≡ (A|⊤). OCFs and c-revisions provide a framework to carry out high-quality belief revision meeting all standards which are known to date, even going beyond that [4].
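As an illustration (again not the authors' implementation), an OCF over a small set of possible worlds can be stored as a list of (world, rank) pairs and revised by a single conditional (B|A) exactly as in equation (1); the worlds, A and B below are plain Python stand-ins.

def rank(kappa, formula):
    """kappa(F): minimal rank of a world satisfying F (a predicate on worlds); infinity if none."""
    ranks = [r for world, r in kappa if formula(world)]
    return min(ranks) if ranks else float('inf')

def c_revise(kappa, A, B):
    """Revise kappa by the conditional (B|A) following equation (1); assumes A ∧ B is satisfiable."""
    # Least natural lambda ensuring kappa*(A ∧ B) < kappa*(A ∧ ¬B):
    lam = max(0, rank(kappa, lambda w: A(w) and B(w))
                 - rank(kappa, lambda w: A(w) and not B(w)) + 1)
    shifted = [(w, r + lam) if A(w) and not B(w) else (w, r) for w, r in kappa]
    kappa0 = -min(r for _, r in shifted)        # normalization: most plausible worlds get rank 0
    return [(w, r + kappa0) for w, r in shifted]

# Example: two Boolean variables, all four worlds initially maximally plausible.
worlds = [{'p': x, 'q': y} for x in (0, 1) for y in (0, 1)]
kappa = [(w, 0) for w in worlds]
kappa = c_revise(kappa, A=lambda w: w['p'] == 1, B=lambda w: w['q'] == 1)
# Now the world p=1, q=0 is less plausible than p=1, q=1, so (q|p) is accepted.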
4 THE SPHINX LEARNING METHOD
Similar to the cognitive model, our learning method consists of two levels. For the bottom level we use Q(λ)-Learning, and for the top level, ordinal conditional functions (OCFs) are employed to represent the epistemic state of an agent and perform belief revision. This brings together two powerful methodologies from rather opposite ends of the scale of cognitive complexity, meeting the challenge of combining learning and belief revision in a particularly extreme case. To combine belief revision and reinforcement learning, each (subsymbolic) state s is described by a logical formula from a language defined over propositional variables di with domains {vi,1 , . . . , vi,mi }. The symbolic representation of a specific state is a conjunction of literals mentioning all di and reflects the logical perception of s by the agent. Furthermore, we define a variable action having as domain the set Actions of possible actions. Hence, the possible worlds on which ranking functions are defined here correspond to elementary conjunctions of the form (d1 = v1,k1 ) ∧ . . . ∧ (dn = vn,kn ) ∧ (action = a).
Figure 1. The SPHINX system
The SPHINX system interlinks Q-learning, the epistemic state and belief revision in two ways: First, it uses current beliefs to restrict the search space of actions for Q-Learning. Second, direct feedback to an action in the form of a reward is processed to acquire specific or generic symbolic knowledge from the most recent experience, by which the current epistemic state is revised. It is displayed in figure 1 and works as follows:

Algorithm 'Sphinx-Learning':
While the current state s is not a terminal state
1. The Sphinx agent perceives the signal of the state s coming from the environment and its logical description d(s).
2. The agent queries its current epistemic state κ which actions Aκ(s) = {a1, . . . , ak} are most plausible in s.
3. The agent looks up the Q-values of these actions and determines the set Abest(s) ⊆ Aκ(s) of those actions in Aκ(s) that have the greatest Q-value.
4. The agent chooses a random action a ∈ Abest(s) and performs it.
5. The environment changes to the successor state.
6. The agent receives the reward r from the environment.
7. The agent updates the QTable as described in section 3.
8. The new Q-values for actions in s are read, and the new best actions for s are determined.
9. The agent tries to find new rules that relate d(s) to best actions (according to the updated QTable) and revises κ with this information in the form of conditionals.
End While

We will now explain the algorithm step by step. When a state s is perceived (step 1), then κ is browsed for the most plausible worlds satisfying d(s). Aκ(s) (step 2) is the set of actions occurring in the most plausible d(s)-worlds:
Aκ(s) = {a ∈ Actions | κ(d(s) ∧ action = a) = κ(d(s))}.
Then, the actions in Aκ(s) are filtered according to their Q-values (step 3), and one of these actions is carried out (step 4); a small sketch of this selection step is given at the end of this section. It is particularly in these two steps that the enhancement of reinforcement learning with epistemic background pays off, since an ordinary Q-agent determines the set of best actions from the set of all possible actions. Steps 5 to 7 are pure Q-Learning. In step 8, the best actions for s due to the new Q-values are determined. This is done to exploit the experience gained by the received reward for future situations and make it usable on the epistemic level in step 9. The operations performed in step 9 are quite complex and are described in the following. The aim of the mentioned revision of κ is to make those actions most plausible in d(s) that have the greatest Q-value in s. As inputs for this revision, the agent tries to find patterns in the state descriptions for which certain actions are generally
better than others. This is done by a frequency-based heuristic. For each pattern (i.e., a conjunction of literals of some of the variables) p and each action a, the agent remembers how often a was a best resp. a poor action by using counters. If the agent finds in step 8 that an action a is a best action in s and has not been among the best actions before, then the counters for a of all patterns covered by d(s) are increased by 1. If a was a best action in s before but is no longer, the counters are decreased by 1. Negative experiences where a was a poor action are handled in an analogous manner. With these counters, probabilities can be calculated, expressing whether a is usually a best resp. a poor action when a situation s for which d(s) satisfies p is perceived. If such a relation between a pattern and a set of actions is found, a revision of κ with a conditional encoding such newly acquired strategic knowledge is performed; basically, the following four different types of revision occur:
1. Revision with information about a poor action in a specific state (episodic knowledge).
2. Revision with information about a poor action in several, similar states (generalization).
3. Revision with information about best actions in a specific state (episodic knowledge).
4. Revision with information about best actions in several, similar states (generalization).
Q-values) in s. W 4. ( action = ai |p), where each ai is a best action in at least one i
of the states covered by the pattern p. ai needs not to be a best action in all states covered by p. The last form of revision should exclude not best actions from being plausible when p is perceived, so the agent has to find the best action for a specific state covered by p only among the actions ai . Since revisions and especially revisions with generalized rules have a strong influence on the choice of actions, they have to be handled carefully, i. e., the agent should be quite sure about the correctness of a rule before adding it to its belief. Therefore, the agent uses several counters counting, how often an action has been poor, not poor, a best or not a best one under certain circumstances. With these counters some probabilities can be calculated which can be used to evaluate the certainty about the correctness of a specific rule. However, since all rules are merely plausible but not correct in a logical sense, further revisions may alleviate or even cancel the effects of erroneously acquired rules. Our learning model also supports background knowledge. If the user knows some rules that might be helpful for the agent and its task, he can formulate them as conditionals and let the agent revise κ with them before starting to learn.
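The action-selection steps 2–4 announced above can be sketched as follows (illustrative only; it reuses rank(...) from the OCF sketch and the QLambdaTable sketch above and assumes that possible worlds carry an 'action' entry, as described at the beginning of this section).

import random

def plausible_actions(kappa, d_s, actions):
    """A_kappa(s): actions a with kappa(d(s) ∧ action = a) = kappa(d(s))."""
    base = rank(kappa, d_s)
    return [a for a in actions
            if rank(kappa, lambda w, a=a: d_s(w) and w['action'] == a) == base]

def choose_action(kappa, qtable, s, d_s, actions):
    """Steps 2-4: filter by plausibility, then by Q-value, then pick randomly."""
    candidates = plausible_actions(kappa, d_s, actions)
    best_q = max(qtable.q[(s, a)] for a in candidates)
    best = [a for a in candidates if qtable.q[(s, a)] == best_q]
    return random.choice(best)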
5 INTERACTIVE OBJECT RECOGNITION
We tested our learning method in a navigation environment and in two different simulations of object recognition environments. In this
paper, we present the results of the latter in two different scenarios.
5.1 Recognition of Geometric Objects
In this test environment, the agent has to learn to recognize the following objects: sphere, ellipsoid, cylinder, cone, tetrahedron, pyramid, prism, cube, cuboid. By interacting with the environment, the agent can look at the object from the front, from the side or from the top, or it can choose to try to name the current object. The possible front, side, and top views are represented by five elementary shapes, namely: circle, ellipse, triangle, square, and rectangle. For example, the cone has the front view 'triangle', the side view 'triangle', and the top view 'circle'. The prism is given by the front view 'triangle', the side view 'rectangle', and the top view 'rectangle'. This leads to the following domains for this environment:
• FrontView = {Unknown, Circle, Ellipse, Triangle, Square, Rectangle}
• SideView = {Unknown, Circle, Ellipse, Triangle, Square, Rectangle}
• TopView = {Unknown, Circle, Ellipse, Triangle, Square, Rectangle}
• Action = {LookAtFront, LookAtSide, LookAtTop, RecognizeUnknown, RecognizeSphere, RecognizeEllipsoid, RecognizeCylinder, RecognizeCone, RecognizeTetrahedron, RecognizePyramid, RecognizePrism, RecognizeCube, RecognizeCuboid}
At the beginning of each episode, the environment chooses one of the nine geometric objects and generates the state signal 'FrontView = Unknown ∧ SideView = Unknown ∧ TopView = Unknown'. If the agent's action is LookAtFront, LookAtSide, resp. LookAtTop, the FrontView, SideView, resp. TopView is revealed in the new state signal following the agent's action. If the agent's action is a 'Recognize' action, the episode ends. The reward function returns −1 if one of the 'Look' actions has been performed. Otherwise, the agent is rewarded with 10 if it has recognized the object correctly, and with −10 if not. After ten steps the running episode is forced to end. Figure 2 shows the recognition rates after each training phase. In each training phase, each object is shown ten times to the current agent. The values result from 1000 independent agents. If the agents are provided with the background knowledge "If no view has been perceived yet, then look at the front, the side, or the top of the object" via the conditional (action = LookAtFront ∨ action = LookAtSide ∨ action = LookAtTop | FrontView = Unknown ∧ SideView = Unknown ∧ TopView = Unknown), the recognition rates improve, as can also be seen from figure 2. In the following, we list some of the rules that the agents learned by exploring the effects of updating the QTables on the cognitive (i.e., logical) level (a sketch of this environment follows the list):
• If FrontView = Circle, then action = RecognizeSphere
• If FrontView = Unknown ∧ SideView = Triangle, then action = LookAtFront
• If FrontView = Triangle ∧ SideView = Unknown, then action = RecognizePrism
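For illustration, the environment just described can be simulated in a few lines (a sketch, not the setup actually used for the experiments; the view table is abbreviated — the cone and prism rows follow the text, the sphere and cube rows are the obvious geometric views).

import random

VIEWS = {                                 # object -> (front view, side view, top view)
    'Sphere': ('Circle', 'Circle', 'Circle'),
    'Cone':   ('Triangle', 'Triangle', 'Circle'),
    'Prism':  ('Triangle', 'Rectangle', 'Rectangle'),
    'Cube':   ('Square', 'Square', 'Square'),
}

class GeometricObjectEnv:
    def reset(self):
        self.obj = random.choice(list(VIEWS))
        self.state = ['Unknown', 'Unknown', 'Unknown']      # FrontView, SideView, TopView
        self.steps = 0
        return tuple(self.state)

    def step(self, action):
        """Return (next_state, reward, done)."""
        self.steps += 1
        looks = ('LookAtFront', 'LookAtSide', 'LookAtTop')
        if action in looks:
            idx = looks.index(action)
            self.state[idx] = VIEWS[self.obj][idx]          # reveal the requested view
            reward, done = -1, False
        else:                                               # any 'Recognize...' action ends the episode
            reward = 10 if action == 'Recognize' + self.obj else -10
            done = True
        if self.steps >= 10:                                # episodes are cut off after ten steps
            done = True
        return tuple(self.state), reward, done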
5.2 Recognition of Simulated Real Objects
To analyse Sphinx under more realistic conditions, we set up another environment. We defined shape attributes that are suitable for representing objects within a simple object recognition task and then
Figure 2. Recognition Rates for Geometric Objects
chose random objects and described them with these previously defined attributes. These attributes are the input to Sphinx. Again, there are three possible perspectives: the front view, the side view, and a view from a position between these two views. The decision for these perspectives, especially for the intermediate view, was made based on the results found by [5], who revealed that the intermediate view plays a special role in human object recognition. The front and the side view are described by three attributes each: approximate (idealized) shape, size (i.e., proportion) of the shape, and deviance from the idealized shape. The approximate shape can take on the values unknown, circle, square, triangle up, and triangle down. The size can be unknown, flat, regular, or tall. The deviance can be little, medium, or big. Besides these attributes, the object is described by the complexity of its texture. This attribute can take on the values simple, medium, and complex. We set the attributes for each object manually. In a real application they can be determined easily by a simple image processing module which merely has to quantize the shape and texture of an object. If the agent looks at the object from the front or the side, it perceives the matching idealized shape, its size, its deviance, and the complexity of the texture. From the intermediate view the agent can only perceive the idealized shapes of the front and the side view and the complexity of the texture, but not the sizes and deviances. Formally, the domains are:
• FrontViewShape = {Unknown, Circle, Square, TriangleUp, TriangleDown}
• FrontViewSize = {Unknown, Flat, Regular, Tall}
• FrontViewDeviation = {Unknown, Little, Medium, Much}
• SideViewShape = {Unknown, Circle, Square, TriangleUp, TriangleDown}
• SideViewSize = {Unknown, Flat, Regular, Tall}
• SideViewDeviation = {Unknown, Little, Medium, Much}
• Texture = {Simple, Medium, Complex}
• Action = {RotateLeft, RotateRight, RecognizeUnknown} ∪ R, where R is the set of 'Recognize' actions.
At the beginning of each episode, the agent looks at the current object from a random perspective and the variables are set according to this perspective. Now, the agent can rotate the object clockwise or counter-clockwise, or name it. If the agent's action is a 'Recognize' action, the episode ends. After ten steps the running episode is forced to end. The reward function is the same as in the previous test environment. We have chosen
15 different objects from nine different object classes such as bottle, tree, and house for which we provide the three attributes mentioned (shape, size, and deviation) (see figure 3).
• If FrontViewShape = Circle ∧ SideViewShape = Unknown ∧ Texture = Simple, then action = RotateLeft
• If Texture = Complex, then action = RecognizeBottle
What remains to be done at this point to apply our system to real images of objects is the extraction of shape attributes from the images. This can be done by existing segmentation methods.
Figure 3. Approximated geometrical forms of objects
Similar to the previous scenario, the experimental results obtained by testing 100 independent agents are depicted in Figure 4. Again, it can be seen clearly that SPHINX-Learning does better than Q(λ)-learning with respect to the speed of learning.
6 CONCLUSION
Both low-level, non-cognitive learning and high-level learning that uses epistemic background knowledge and acquires generic knowledge are present in human learning processes. In this paper, we presented the hybrid SPHINX approach that enables intelligent agents to adjust to their environment in a similar way by combining epistemic-based belief revision with experience-based reinforcement learning. We linked both methodologies for two purposes: First, the current epistemic state allows the agent to focus on the most plausible actions, which are evaluated by QTables to find the most promising actions in some current state. Second, the direct feedback by the environment is used not only to update QTables, but also to generate specific or generic knowledge with which the epistemic state is revised. In order to illustrate the usefulness of our approach, we described application scenarios from computer vision and performed experiments in which SPHINX agents are employed for object recognition tasks. The evaluation of these experiments shows clearly that the proposed interplay of belief revision and reinforcement learning benefits from the advantages of both methodologies. Therefore, the SPHINX approach allows complex yet flexible interactions between learning and reasoning that help agents perform considerably better.
Figure 4. Recognition Rates for Simulated Real Objects

REFERENCES
In a second step we added background knowledge that enabled the agent to recognize all objects correctly, if it has perceived all of the three views. Furthermore, we added rules to the background knowledge that told the agent to look at the object from all perspectives first. With these rules the agent has a complete, but not optimal, solution for the task. We wanted to find out how fast the agent learns that it does not need all views to classify the current object. To protect the background knowledge from being overwritten by the agent’s own rules too early, some parameters were changed, so that the agent had to be more sure about the correctness of a rule before adding it to its belief. This setup resulted in a constantly high recognition rate of over 99 %. The number of perceived views decreased over time from 3.28 to 1.99. The value of 3.28 perceived view vs. 3 possible views results from the fact, that the intermediate view has to be perceived twice if the environment starts in this view. Then, the agent perceives this view at the beginning, then rotates the object to the front and then back to the intermediate view so it can rotate the object to the side view in the next step (or vice versa). Here are some of the rules the agent learned and assimilated during its training: • If FrontViewShape = TriangleUp ∧ FrontViewSize = Tall, then action = RecognizeBottle
[1] Anderson, J. R., The Architecture of Cognition, Harvard University Press, Cambridge, MA, 1983.
[2] C. Ye, N. H. C. Yung and D. Wang, ‘A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance’, IEEE Transactions on Systems, Man, and Cybernetics, Part B, 33(1), 17–27, (2003).
[3] Gombert, J.-E., ‘Implicit and explicit learning to read: Implication as for subtypes of dyslexia’, Current Psychology Letters, 10(1), (2003).
[4] G. Kern-Isberner, Conditionals in Nonmonotonic Reasoning and Belief Revision, Springer, LNAI 2087, 2001.
[5] Pereira, A., James, K. H., Jones, S. S., and Smith, L. B., Preferred views in children’s active exploration of objects, 2006.
[6] Reber, A. S., ‘Implicit learning and tacit knowledge’, Journal of Experimental Psychology: General, 118(3), 219–235, (1989).
[7] W. Spohn, ‘Ordinal conditional functions: a dynamic theory of epistemic states’, in Causation in Decision, Belief Change, and Statistics, II, eds. W.L. Harper and B. Skyrms, 105–134, Kluwer Academic Publishers, (1988).
[8] Stanley, W. B., Mathews, R. C., Buss, R. R. and Kotler-Cope, S., ‘Insight without awareness: On the interaction of verbalization, instruction and practice in a simulated process control task’, The Quarterly Journal of Experimental Psychology Section A, 41(3), 553–577, (1989).
[9] Sun, R., Merrill, E. and Peterson, T., ‘From implicit skills to explicit knowledge: a bottom-up model of skill learning’, Cognitive Science, 25(2), 203–244, (2001).
[10] Sun, R., Slusarz, P. and Terry, C., ‘The interaction of the explicit and the implicit in skill learning: A dual-process approach’, Psychological Review, 112(1), 159–192, (2005).
[11] Sun, R., Zhang, X., Slusarz, P. and Mathews, R., ‘The interaction of implicit learning, explicit hypothesis testing learning and implicit-to-explicit knowledge extraction’, Neural Networks, 20(1), 34–47, (2007).
[12] Sutton, R. S. and Barto, A. G., Reinforcement Learning: An Introduction, Bradford Book, The MIT Press, 1998.
A Formal Approach for RDF/S Ontology Evolution George Konstantinidis and Giorgos Flouris and Grigoris Antoniou and Vassilis Christophides1 Abstract. In this paper, we consider the problem of ontology evolution in the face of a change operation. We devise a general-purpose algorithm for determining the effects and side-effects of a requested elementary or complex change operation. Our work is inspired by belief revision principles (i.e., validity, success and minimal change) and allows us to handle any change operation in a provably rational and consistent manner. To the best of our knowledge, this is the first approach overcoming the limitations of existing solutions, which deal with each change operation on a per-case basis. Additionally, we rely on our general change handling algorithm to implement specialized versions of it, one per desired change operation, in order to compute the equivalent set of effects and side-effects.2
1 INTRODUCTION
Stored knowledge, in any knowledge-based application, may need to change due to various reasons, including changes in the modeled world, new information on the domain, newly-gained access to information previously unknown, and other eventualities [11]. Here, we consider the case of ontologies expressed in RDF/S (as most of the Semantic Web schemas (85.45%) are expressed in RDF/S [14]) and introduce a formal framework to handle the evolution of an ontology given a change operation. We pay particular attention to the semantics of change operations, which can, in principle, be either elementary (involving a change in a single ontology construct) or composite (involving changes in multiple constructs) [5]. Even though RDF/S does not support negation, the problem is far from trivial, as inconsistencies may arise due to the validity rules associated with RDF/S ontologies. In fact, naive set-theoretical addition or removal of ontological constructs (i.e., direct application of a change) has been acknowledged as insufficient for ontology evolution [4, 6, 12]. Most of the implemented ontology management systems (e.g., [1, 2, 8, 13]) are designed using an ad-hoc approach that solves the problems related to each change operation on a per-case basis. More specifically, they explicitly define a finite, and thus incomplete, set of change operations that they support, and have determined, a priori, the semantics of each such operation. Hence, given the lack of a formal methodology, the designers of these systems have to determine, in advance, all the possible invalidities that could occur in reaction to a change and the various alternatives for handling any such possible invalidity, and to pre-select the preferable solutions for implementation per case [6]; this selection is hard-coded into the systems' implementations. This approach requires a highly tedious, case-based reasoning which is error-prone and gives no formal guarantee that
the cases and options considered are exhaustive. To overcome these limitations, we propose an ontology evolution framework and elaborate on its formal foundations. Our methodology is driven by ideas and principles of the belief revision literature [3]. In particular, we adopt the Principle of Success (every change operation is actually implemented) and the Principle of Validity (the resulting ontology is valid, i.e., it satisfies all the validity constraints of the underlying language). Satisfying both these requirements is not trivial, as the straightforward application of a change operation upon an ontology can often lead to invalidity, in which case certain additional actions (side-effects) should be executed to restore validity. Sometimes, there may be more than one way to do so, in which case a selection mechanism should be in place to determine the “best” option. In this paper, we employ a technique inspired by the Principle of Minimal Change [3] (stating that the appropriate result of changing an ontology should be as “close” as possible to the original). The general idea of our approach is to first determine all the invalidities that any given change (elementary or composite) could cause upon the updated ontology, using a formal, well-specified validity model, and then to determine the best way to overcome potential invalidity problems in an automatic way, by exploring the various alternatives and comparing them using a selection mechanism based on an ordering relation on potential side-effects. In particular, our formal approach is parameterizable with respect to this relation, thus providing a customizable way to guarantee the determination of the “best” result. Although our framework is general, in this paper we focus on a fragment of the RDF/S model which exhibits interesting properties for deciding query containment and minimization [10]. To the best of our knowledge, our implementation is the first one that allows processing any type of change operation, and in a fully automatic way.
2 PROBLEM FORMULATION

2.1 Modeling RDF/S, ontologies and updates

In order to abstract from the syntactic peculiarities of the underlying language and develop a uniform framework, we will map RDF to First-Order Logic (FOL). Table 1 (restricted for presentation purposes) shows the FOL representation of certain RDF statements. The language’s semantics is not carried over during the mapping, so we need to combine the FOL representation with a set of validity rules that capture such semantics. For technical reasons, we assume that all constraints can be encoded in the form of (or can be broken down into a conjunction of) DEDs (disjunctive embedded dependencies), which have the following general form:

∀u P(u) → ∨_{i=1,...,n} ∃v_i Q_i(u, v_i)    (DED)

where:
• u, v_i are tuples of variables
• P, Q_i are conjunctions of relational atoms of the form R(w_1, ..., w_n) and equality atoms of the form (w = w′), where w_1, ..., w_n, w, w′ are variables or constants
• P may be the empty conjunction

We employ DEDs, as they are expressive enough for capturing the semantics of different RDF fragments and other simple data models which are appropriate for our purposes in this paper. Moreover, DEDs will prove suitable for constructing a convenient mechanism for detecting and repairing invalidities. Table 2 shows some rules that are used to capture the semantics of the various RDF constructs (e.g., R11 captures IsA transitivity), as well as the restrictions imposed by our RDF model (e.g., R8 captures that the domain of a property should be unique).

Table 1. Representation of RDF facts using FOL predicates

RDF triple                | Intuitive meaning        | Predicate
C rdf:type rdfs:Class     | C is a class             | CS(C)
P rdf:type rdf:Property   | P is a property          | PS(P)
x rdf:type rdfs:Resource  | x is a class instance    | CI(x)
P rdfs:domain C           | domain of property       | Domain(P, C)
P rdfs:range C            | range of property        | Range(P, C)
C1 rdfs:subClassOf C2     | IsA between classes      | C_IsA(C1, C2)
P1 rdfs:subPropertyOf P2  | IsA between properties   | P_IsA(P1, P2)
x rdf:type C              | class instantiation      | C_Inst(x, C)
x P y                     | property instantiation   | PI(x, y, P)

Table 2. Validity Rules

Rule ID/Name                        | Integrity Constraint                                          | Intuitive Meaning
R2 Domain Applicability             | ∀x, y ∈ Σ: Domain(x, y) → PS(x) ∧ CS(y)                       | Domain applies to properties; the domain of a property is a class
R4 C_IsA Applicability              | ∀x, y ∈ Σ: C_IsA(x, y) → CS(x) ∧ CS(y)                        | Class IsA applies between classes
R6 C_Inst Applicability             | ∀x, y ∈ Σ: C_Inst(x, y) → CI(x) ∧ CS(y)                       | Class Instanceof applies between a class instance and a class
R8 Domain is unique                 | ∀x, y, z ∈ Σ: Domain(x, y) → ¬Domain(x, z) ∨ (y = z)          | The domain of a property is unique
R10 Domain and Range exists         | ∀x ∈ Σ, ∃y, z ∈ Σ: PS(x) → Domain(x, z) ∧ Range(x, y)         | Each property has a domain and a range
R11 C_IsA Transitivity              | ∀x, y, z ∈ Σ: C_IsA(x, y) ∧ C_IsA(y, z) → C_IsA(x, z)         | Class IsA is Transitive
R12 C_IsA Irreflexivity             | ∀x, y ∈ Σ: C_IsA(x, y) → ¬C_IsA(y, x)                         | Class IsA is Irreflexive
R15 Determining C_Inst              | ∀x, y, z ∈ Σ: C_Inst(x, y) ∧ C_IsA(y, z) → C_Inst(x, z)       | Class instance propagation
R17 Property Instance of and Domain | ∀x, y, z, w ∈ Σ: PI(x, y, z) ∧ Domain(z, w) → C_Inst(x, w)    | Instanceof between properties reflects in their sources/domains
R23 P_IsA Irreflexivity             | ∀x, y ∈ Σ: P_IsA(x, y) → ¬P_IsA(y, x)                         | Property IsA is Irreflexive

It should be stressed that the semantics of the language captured by Tables 1 and 2 essentially corresponds to a fragment of the standard RDF/S data model3 in which there is a clear role distinction between ontology primitives and no cycles in the subsumption relationships, while property subsumption respects corresponding domain/range subsumption relationships. Such a fragment has first been studied in [10] in an effort to provide a group of sound and complete algorithms for query containment and minimization, while it is compatible with W3C guidelines4 for devising restricted fragments of the RDF/S data model. Similarly, the general-purpose change handling algorithm presented in this paper can also be applied to other fragments of RDF/S (see also [7, 9]) or the standard RDF/S semantics.

In Table 2, Σ denotes the set of constants in our language. We equip our FOL with closed semantics, i.e., CWA (closed world assumption). This means that, for two formulas p, q, if p ⊭ q then p ⊨ ¬q. Abusing notation, for two sets of ground facts U, V, we will say that U implies V (U ⊨ V) to denote that U ⊨ p for all p ∈ V. Any expression of the form P(x_1, ..., x_k) is called a positive ground fact, where P is a predicate of arity k and x_1, ..., x_k are constant symbols. Any expression of the form ¬P(x_1, ..., x_k) is called a negative ground fact iff P(x_1, ..., x_k) is a positive ground fact. L denotes the set of all well-formed formulae that can be formed in our FOL. We denote by L+ the set of positive ground facts, L− the set of negative ground facts, and set L0 = L+ ∪ L−, called the set of ground facts of the language. We define:
• An ontology is a set K ⊆ L+
• An update is a set U ⊆ L0
In simple words, an ontology is any set of positive ground facts whereas an update is any set of positive or negative ground facts. Applying an update to an ontology should result in the incorporation of the update in the ontology. By definition, ontologies have two properties: (a) they are always consistent (in the purely logical sense) and (b) they imply only the positive ground facts that are already in the ontology. The above two properties, together with the CWA semantics, imply that:
• P(x) ∈ K ⇔ K ⊨ P(x) ⇔ K ⊭ ¬P(x)
• P(x) ∉ K ⇔ K ⊨ ¬P(x) ⇔ K ⊭ P(x)
An application of these properties is that updating K with ¬P(x) corresponds to contracting P(x) from K, because “incorporating” ¬P(x) in an ontology could be achieved only by removing P(x) from K. Therefore, updating an ontology with negative ground facts corresponds to contraction/erasure in the standard terminology, whereas updating an ontology with positive ground facts corresponds to revision/update in the standard terminology.

1 Institute of Computer Science, FO.R.T.H., Heraklion, Greece, email: gconstan, fgeo, antoniou, [email protected]
2 This work was partially supported by the EU projects CASPAR (FP6-2005-IST-033572) and KP-LAB (FP6-2004-IST-27490).
3 http://www.w3.org/TR/rdf-concepts/
4 http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#technote
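To make the set-based reading of ontologies and updates concrete, here is a minimal Python sketch (ours, not from the paper): ground facts are plain tuples, an ontology is a set of positive facts, CWA entailment reduces to membership tests, and incorporating a negative fact amounts to removing the corresponding positive one.

```python
# Minimal sketch (not from the paper): ontologies as sets of positive ground facts.
# A positive fact is ("Pred", arg1, ...); a negative fact is ("not", positive_fact).

K = {("CS", "A"), ("CS", "B"), ("C_IsA", "A", "B")}   # a tiny ontology
U = {("not", ("C_IsA", "A", "B"))}                    # an update (a contraction)

def entails(K, fact):
    # CWA semantics: K |= P(x) iff P(x) in K, and K |= not P(x) iff P(x) not in K.
    return fact[1] not in K if fact[0] == "not" else fact in K

print(entails(K, ("CS", "A")))           # True: CS(A) is in K
print(entails(K, ("not", ("CS", "C"))))  # True: K |= not CS(C), since CS(C) is not in K

# Incorporating the negative fact in U can only be achieved by removing the
# corresponding positive fact, i.e. updating with "not C_IsA(A, B)" contracts it:
K_after = {f for f in K if ("not", f) not in U}
print(K_after)                           # {("CS", "A"), ("CS", "B")}
```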
2.2 Updating under constraints

We say that an ontology K satisfies a validity rule c iff K ⊨ c. Obviously, for a set C of validity rules, K satisfies C (K ⊨ C) iff K ⊨ c for all c ∈ C. It is easy to see that for a simple constraint of the form c = ∀u P(u) → Q(u), where P, Q are simple positive predicates and u is a variable, it holds that: K ⊨ c iff for all constants x: K ⊨ {¬P(x)} or K ⊨ {Q(x)}. This can be easily extended to the general case. Suppose that c = ∀u P(u) → ∨_{i=1,...,n} ∃v_i Q_i(u, v_i), where P(u) = P_1(u) ∧ ... ∧ P_k(u) for some k ≥ 0 and Q_i(u, v_i) = Q_i1(u, v_i) ∧ ... ∧ Q_im(u, v_i) for some m > 0 depending on i. Then K ⊨ c iff for all tuples of constants x at least one of the following is true (note that in case of obvious reference to tuples of constants or variables we will be omitting the tuple symbol):
• There is some j: 0 < j ≤ k such that K ⊨ {¬P_j(x)}.
• There is some i: 1 ≤ i ≤ n and some tuple of constants z such that for all j = 1, 2, ..., m, K ⊨ {Q_ij(x, z)}.
We can conclude that K ⊨ c iff for all tuples of constants x at least one of the following sets is implied by K:
• {¬P_j(x)}, 0 < j ≤ k
• {Q_i1(x, z) ∧ Q_i2(x, z) ∧ ... ∧ Q_im(x, z)}, 1 ≤ i ≤ n, z: constant

Based on the above observation, we define the component set of c with respect to some tuple of constants x as follows:

Comp(c, x) = {{¬P_j(x)} | 0 < j ≤ k} ∪ {{Q_i1(x, z) ∧ Q_i2(x, z) ∧ ... ∧ Q_im(x, z)} | 1 ≤ i ≤ n, z: constant}

Prop. 1 will subsequently help us define a valid ontology.

Prop. 1 K ⊨ c iff for all constants x there is some V ∈ Comp(c, x) such that K ⊨ V.

Def. 1 Consider a FOL language L and a set of validity rules C. An ontology K will be called valid with respect to L and C iff K is consistent and it satisfies the validity rules C.

Note that a valid ontology, by our rules of Table 2, contains all its implicit knowledge as well (i.e., it is closed with respect to inference). Due to the special characteristics of our framework (e.g., CWA, the form of rules, etc.), one does not need to employ full FOL reasoning to determine whether an ontology K is valid (i.e., using Def. 1 and Prop. 1); instead, we can use the specialized procedure described below (Prop. 2).

Prop. 2 A ground fact P(x), added to an ontology K, would violate rule c iff there is some set V and tuple of constants u for which ¬P(x) ∈ V and V ∈ Comp(c, u), and for all V′ ∈ Comp(c, u), V′ ≠ V, it holds that K ⊭ V′.

As an example, consider the ontology of Fig. 1(a). The original ontology in our case, per Table 1, is: K = {CS(A), CS(B), CI(a), CI(b), PS(P), Domain(P, A), Range(P, B), PI(a, b, P), C_Inst(a, A), C_Inst(b, B)} and the update is: U = {Domain(P, D)}. To detect rule violations in an automated way, according to Prop. 2, we must find all the rules that contain ¬Domain(x, y), set x = P, y = D, and determine whether some other component for the particular instantiation is implied by the ontology. If the answer is no, then the addition of Domain(P, D) would violate the particular instantiation of this rule. In our case, this is true for rule R2.2 (domain applicability), for x = P, y = D, and rule R8 (unique domain) for x = P, y = D, z = A as well as for x = P, y = A, z = D (see also Table 3 for some rules in their component set format). Moreover, it violates rule R17 for x = a, y = b, z = P, w = D. One nice property of our detection mechanism is that it provides an immediate way to restore invalidities as well, i.e., to generate potential side-effects that would restore the violation. In particular, the violation that Prop. 2 detects can be restored by making any of the elements of Comp(c, u) true in the ontology. At this point note that when a Q_ij(x, z) in some set V ∈ Comp(c, x) is an equality of the form w = w′, then the truth value of this equality is revealed as soon as we instantiate this rule’s variables to constants. Therefore, by evaluating an equality as a tautology (⊤) or contradiction (⊥) and replacing it accordingly in the rule’s instances, we are able to eliminate all the equality atoms from the component sets. Without equalities, the elements of Comp(c, x) contain only positive and negative ground facts, so they are updates in our terminology. This is a very useful remark, as we will subsequently take advantage of the elements of Comp(c, x), applying them as updates. In our example, the validity of rule R2.2, for x = P, y = D, can be restored iff either {¬Domain(P, D)} or {CS(D)} are added as additional updates (side-effects) to the ontology. Note that side-effects could trigger side-effects of their own if they violate any rules.

Table 3. Some validity rules in component set form

Rule ID/Name | Components of the rule
R2 Domain Applicability | R2.1: ∀x, y ∈ Σ: Comp(R2.1, (x, y)) = {{¬Domain(x, y)}, {PS(x)}}; R2.2: ∀x, y ∈ Σ: Comp(R2.2, (x, y)) = {{¬Domain(x, y)}, {CS(y)}}
R8 Domain is unique | ∀x, y, z ∈ Σ: Comp(R8, (x, y, z)) = {{¬Domain(x, y)}, {¬Domain(x, z)}, {(y = z)}}
R10 Domain and Range exists | R10.1: ∀x ∈ Σ, ∃z ∈ Σ: Comp(R10.1, (x, z)) = {{¬PS(x)}, {Domain(x, z)}}; R10.2: ∀x ∈ Σ, ∃y ∈ Σ: Comp(R10.2, (x, y)) = {{¬PS(x)}, {Range(x, y)}}
R17 Property Instance of and Domain | ∀x, y, z, w ∈ Σ: Comp(R17, (x, y, z, w)) = {{¬PI(x, y, z)}, {¬Domain(z, w)}, {C_Inst(x, w)}}

Figure 1. Adding a new domain to a property: (a) the original ontology; (b) the result of the update “Make D domain of P”.
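To make the detection-and-repair mechanism concrete, here is a small, self-contained Python sketch (ours, not the paper's implementation) that replays the R2.2 check of the example above, using the same tuple encoding of ground facts as in the earlier sketch.

```python
# Illustrative sketch (not the paper's implementation) of the Prop. 2 check for one
# instance of rule R2.2 and the update Domain(P, D).
# Positive facts are ("Pred", args...); negative facts are ("not", positive_fact).

K = {("CS", "A"), ("CS", "B"), ("CI", "a"), ("CI", "b"), ("PS", "P"),
     ("Domain", "P", "A"), ("Range", "P", "B"), ("PI", "a", "b", "P"),
     ("C_Inst", "a", "A"), ("C_Inst", "b", "B")}

def entails(K, fact):
    # CWA: K |= f iff f in K; K |= not f iff f not in K
    return fact[1] not in K if fact[0] == "not" else fact in K

# Component set of R2.2 instantiated at (x, y) = (P, D):
# either Domain(P, D) is absent, or CS(D) holds.
components = [{("not", ("Domain", "P", "D"))}, {("CS", "D")}]

new_fact = ("Domain", "P", "D")          # the requested change
neg_new = ("not", new_fact)

hit = any(neg_new in V for V in components)
rescued = any(all(entails(K, f) for f in V) for V in components if neg_new not in V)

if hit and not rescued:
    # Making any component true restores this instance; the one compatible with the
    # requested change, {CS(D)}, is the admissible side-effect here.
    print("violation of R2.2; candidate repairs:", components)
```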
2.3 Selection of side-effects: ordering
If there were no validity rules, or we were not interested in the result being a valid ontology, the most rational way to perform an update would be to simply apply the changes in U upon K.

Def. 2 The raw application of an update U to an ontology K is denoted by K + U and is the following ontology: K + U = {P(x) ∈ L+ | P(x) ∈ K ∪ U and ¬P(x) ∉ U}

When a set of changes (i.e., an update U) is raw applied to a valid ontology K, some of the changes that appear in U may be void, i.e., they don’t need to be performed because they are already implemented (implied) by the original ontology. We define an operator which, given a resulting ontology K′ that an update would produce on a valid ontology K, calculates the actual effects of the update:

Def. 3 For K a valid ontology and K′ an ontology: Delta(K, K′) = {P(x) ∈ L0 | K′ ⊨ {P(x)} and K ⊭ {P(x)}}

The Delta function is some kind of “edit distance”5 between K and K′; if K′ = K + U, then Delta represents the actual changes that U enforces upon K. Thus, K + U = K + Delta(K, K + U) = K′, so Delta(K, K + U) produces the same result as U when applied upon an ontology; however, they may be different, as U could contain void changes.

5 Note that the term “edit-distance” is usually used for sequences and not sets (i.e., edit scripts).
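Under the CWA, Defs. 2 and 3 translate directly into set operations; the following sketch (ours, with an illustrative ontology and update under the tuple encoding used above) shows raw application and the Delta of the result.

```python
# Sketch (not from the paper) of raw application (Def. 2) and Delta (Def. 3),
# with positive facts as tuples and negative facts as ("not", positive_fact).

def raw_apply(K, U):
    """K + U: positive facts of K ∪ U whose negation does not appear in U."""
    positives = {f for f in K | U if f[0] != "not"}
    return {f for f in positives if ("not", f) not in U}

def delta(K, K2):
    """Under CWA, Delta(K, K2) reduces to the facts on which K and K2 disagree."""
    return ({f for f in K2 - K}                       # positive facts that were added
            | {("not", f) for f in K - K2})           # facts that were removed

K = {("CS", "A"), ("PS", "P"), ("Domain", "P", "A")}
U = {("CS", "D"), ("Domain", "P", "D"),
     ("not", ("Domain", "P", "A")),
     ("CS", "A")}                                     # CS(A) is a void change

K2 = raw_apply(K, U)
print(K2)             # {CS(A), PS(P), CS(D), Domain(P, D)}
print(delta(K, K2))   # {CS(D), Domain(P, D), not Domain(P, A)} -- the void CS(A) is dropped
```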
As already mentioned, the raw application of an update would not work for our case, because it may not respect the validity constraints of the language. Thus, applying an update involves the application of some side-effects. In some cases, it may not be possible to find adequate side-effects for the update at hand; such updates are called infeasible and cannot be executed. For example, any inconsistent update (such as U = {CS(A), ¬CS(A)}) is infeasible. In most cases though, an update has several possible alternative sets of side-effects, which implies that a selection should be made. Consider an update U with alternative side-effects U1 and U2. Then, the set of changes that should be raw applied on the initial ontology, in order to reach a valid result, is either U ∪ U1 or U ∪ U2. According to the Principle of Minimal Change we should choose the one which causes the “mildest” effects upon the ontology; to determine the “relative mildness” (or “relative cost”) of such effects, we define an ordering between updates. Note that this ordering should depend on K itself: for example, the “cost” of removing an IsA relation between A and B should depend on the importance of the concepts A, B in the RDF graph itself. The following conditions have proven necessary for an ordering to produce “rational” results.

Def. 4 An ordering ≤K is called update-generating iff the following conditions hold:
Delta Antisymmetry: For any U, U′: U ≤K U′ and U′ ≤K U implies Delta(K, K + U) = Delta(K, K + U′).
Transitivity: For any U, U′, U″: U ≤K U′ and U′ ≤K U″ implies U ≤K U″.
Totality: For any U, U′: U ≤K U′ or U′ ≤K U.
Conflict Sensitivity: For any U, U′: U ≤K U′ iff Delta(K, K + U) ≤K Delta(K, K + U′).
Monotonicity: For any U, U′: U ⊆ U′ implies U ≤K U′.

Similarly, an ordering scheme {≤K | K: a valid ontology} is called update-generating iff ≤K is update-generating for all valid ontologies K. For our RDF case we introduced a particular update-generating ordering, which is based on the ordering shown in Table 4 (among the positive and negative predicates presented in Table 1, for simplicity). The details of the expansion of this ordering to refer to ground facts and sets of ground facts (i.e., updates) are omitted due to space limitations. In short, the general idea is that an update U1 is “cheaper” (or preferable) than U2 (denoted by U1 ≤K U2) iff the “most expensive” predicate used in update U1 is “cheaper” than the “most expensive” predicate used in update U2, where the predicates’ relative preference is determined by the order shown in Table 4. Ties are resolved using cardinality considerations and/or the relative importance of the predicate’s arguments in the original ontology (details omitted). Our ordering was based on the results of experimentation on various alternative orderings and results in an efficient and intuitive implementation. Nonetheless, our algorithm works with any update-generating ordering; each different ordering would model and impose a different global evolution policy on our algorithm. Fig. 1(b) depicts the outcome of the requested update with respect to our ordering.

Table 4. Ordering of predicates
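Table 4 itself did not survive extraction, so the sketch below (ours) only illustrates the comparison rule stated in the text — an update is preferable if its most expensive predicate is cheaper — with a purely hypothetical cost assignment standing in for the paper's actual predicate ordering.

```python
# Toy sketch (not the paper's ordering): compare two candidate side-effect sets by
# their most expensive predicate. HYPOTHETICAL_COST is made up -- the real predicate
# ordering of Table 4 is not reproduced here.

HYPOTHETICAL_COST = {"C_Inst": 1, "CS": 2, "Domain": 3, "C_IsA": 4}

def pred_of(fact):
    # fact is ("Pred", args...) or ("not", ("Pred", args...))
    return fact[1][0] if fact[0] == "not" else fact[0]

def cost(update):
    """Cost of an update = cost of its most expensive predicate."""
    return max(HYPOTHETICAL_COST[pred_of(f)] for f in update)

def cheaper(U1, U2):
    """U1 is preferred over U2 iff its most expensive predicate is cheaper.
       (Ties would be broken by cardinality / argument importance, omitted here.)"""
    return cost(U1) <= cost(U2)

U1 = {("CS", "D")}                           # side-effect option: add CS(D)
U2 = {("not", ("Domain", "P", "D"))}         # side-effect option: drop the new domain
print(cheaper(U1, U2))                       # True under the hypothetical costs
```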
α_t = α_w if Σ_a π_θ(b_t, a) Q̂_θ(b_t, a) ≥ Σ_a π̄(b_t, a) Q̂_θ(b_t, a), and α_t = α_l otherwise, where α_l > α_w are, respectively, the losing and winning learning rates and π̄ is the “average” policy, obtained by using in (6) the average parameter vector over time.
3.4 The critic: Advantage estimation in belief space
In the gradient expressions (4) and in Theorem 1, one can add an arbitrary function F(x) to Q_θ and Q̂_θ. Such a function is known as a baseline function and, as shown in [3], if F is to be chosen so as to minimize the mean-squared error between Q̂_θ and Q_θ, the optimal choice of baseline function is F(x) = V_θ(x). Recalling that the advantage function associated with a policy π is defined as A_π(x, a) = Q_π(x, a) − V_π(x), the performance of the overall algorithm can be improved by estimating the advantage function instead of the Q-function [3]. As seen in the previous subsection, the actor component will update the parameter along the direction of the parameter vector w corresponding to the orthogonal projection of Q_θ (or, equivalently, A_θ) on the linear space spanned by the compatible basis functions,6 defined in (7). However, unlike Q_θ or V_θ, the advantage function does not verify a Bellman-like recursion and, therefore, it is necessary to independently estimate the value function V_θ, for which we also consider a linear approximation. In particular, we admit that A_θ(b, a) ≈ φ_θ⊤(b, a) w and V_θ(b) ≈ ξ⊤(b) v, where φ_θ are the compatible basis functions defined according to (7) and each component ξ_i belongs to a second set of linearly independent basis functions that we use to approximate the value function. Since we are considering multiagent problems, where multiple independent decision makers interact in a common environment, it is best that each agent k computes this estimate online, since the transition data sampled from the process reflects (although implicitly) the eventual learning process taking place in the other agents. Therefore, our critic uses a TD-based update to estimate both the value function V_θ and the advantage function A_θ by means of the following recursion (similar in spirit to that in [3])7

v_{t+1} = v_t + β_t ξ_t⊤ (r_t + γ ξ_{t+1} v_t − ξ_t v_t);
w_{t+1} = (I − β_t φ_t⊤ φ_t) w_t + β_t φ_t⊤ (r_t + γ ξ_{t+1} v_t − ξ_t v_t),

where I is the identity matrix, ξ_t is the row-vector ξ⊤(b_t), ξ_{t+1} = ξ⊤(b_{t+1}) and φ_t = φ_θ⊤(b_t, a_t).

6 We take Q̂_θ as the orthogonal projection of Q_θ on L(φ) with respect to the inner product ⟨f, g⟩ = ∫_B Σ_a f(b, a) · g(b, a) π_θ(b, a) p(b) db, where B denotes the belief-space and p is the distribution introduced in (5), with the beliefs b playing the role of x.
7 We remark, however, that we are using a discounted framework, unlike the average per-step reward framework featured in [3].

Figure 2. Two simple problems used to illustrate the application of our algorithm: a) Grid world; b) Dec-Tiger problem.
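Read as code, the recursion is a linear TD update for v together with a projected update for w driven by the same TD error; the following numpy sketch (ours, with made-up feature vectors, reward and step sizes) shows a single critic step.

```python
import numpy as np

# Minimal sketch (not the paper's code) of one step of the critic recursion:
# a linear TD update for the value parameters v and the projected update of the
# advantage parameters w. xi_t and xi_next are the feature vectors of the current
# and next belief, phi_t the compatible features of the current belief-action pair.

def critic_step(v, w, xi_t, xi_next, phi_t, r_t, beta, gamma):
    delta = r_t + gamma * xi_next @ v - xi_t @ v           # TD error at time t
    v_new = v + beta * xi_t * delta                        # v_{t+1}
    w_new = (np.eye(len(w)) - beta * np.outer(phi_t, phi_t)) @ w + beta * phi_t * delta
    return v_new, w_new

rng = np.random.default_rng(0)
v, w = np.zeros(4), np.zeros(3)
xi_t, xi_next = rng.random(4), rng.random(4)               # value features xi(b_t), xi(b_{t+1})
phi_t = rng.random(3)                                      # compatible features phi_theta(b_t, a_t)
v, w = critic_step(v, w, xi_t, xi_next, phi_t, r_t=1.0, beta=0.1, gamma=0.95)
print(v, w)
```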
4 EXPERIMENTAL RESULTS
To illustrate the working of our algorithm, we tested it in several very simple Dec-POMDP scenarios. The first set of results was obtained in a small grid-world problem, as represented in Figure 2.a. In this problem, each of two robots must reach the opposite corner in a 3×3 maze. When both agents reach the corresponding corners, they receive a common reward of 20. If they “collide” in some state, they receive a reward of −10. Otherwise, they receive a reward of −1. The robots can move in one of four directions, N, S, E and W. The transitions in each direction have some uncertainty associated: with probability 0.8 the movement succeeds and, with probability 0.2, it fails. The robots can observe “Null”, indicating that nothing is detected; “Goal”, indicating that the robot has reached its individual target position; and “Crash”, indicating that both robots are in the same position. After successfully reaching the goal, the position of the robots is reset. We ran the algorithm for 10^4 learning steps and then tested the learnt policy on the environment for 50 time-steps. In Figure 3.a, we present the total discounted reward obtained during a sample run. Notice that the robots are able to quickly reach the goal, which clearly indicates that they were able to learn the desired task. Notice also that the robots are able to avoid collisions, which indicates that they were able to coordinate without communicating and using only local information during learning.
Figure 3. Sample runs with the learnt policies in the two test problems: a) Grid world; b) Dec-Tiger. Each panel shows the (sampled) discounted performance, i.e., the total discounted reward over 50 time steps.
The second problem is the well-known Dec-Tiger problem [9]. In this problem, two agents must choose between two doors, behind one of which is hidden a tiger. The other door hides a treasure. The purpose of the two agents is to figure out behind which door the treasure is hidden, by listening to the noises behind the doors. They must act in a coordinated fashion at all times, since their performance greatly depends on this ability to coordinate. We remark that this problem, unlike the grid-world problem, is not particularly suited to be addressed by our algorithm. In fact, the Dec-Tiger problem is not transition independent: the state-space cannot be factored, and the actions of each agent have a large influence on the states, observations and rewards received by the other agent. Nevertheless, we applied our algorithm to this problem, to better understand the general applicability of the method. Once again, we ran the algorithm for 10^4 learning steps and then tested the learnt policy on the environment for 50 time-steps. In Figure 3.b, we present the total discounted reward obtained during a sample run. Notice that, although some miscoordinations sometimes occur (which are impossible to overcome since each agent only has available local information), the agents are nevertheless able to attain many coordinated action choices. Remarkably, once again, this was achieved without communication and using only local information during learning (and execution). Finally, to conclude this section, we summarize in Table 1 the average total discounted reward obtained during a 50-step run. The results presented correspond to the average over 2,000 independent Monte-Carlo trials.

Table 1. Total discounted reward obtained in the two problems. The results correspond to the average over 2,000 independent Monte Carlo runs.

Environment | Total disc. reward
Grid world  | 34.001
Dec-Tiger   | 11.049
5 CONCLUSIONS
We conclude the paper with several important remarks. First of all, the algorithm introduced here is closely related to the Gra-WoLF algorithm in [4]. The main differences lie in our use of natural gradients and in our ability to address problems with partial state observability and no joint-action observability. Partial observability is addressed by considering the problem to be described by a transition independent Dec-POMDP. We take advantage of this fact by proposing several strategies that allow the agents to maintain independent beliefs that can be used for decision-making. Another important observation is that the optimistic initialization considered will naturally bias the initial policy of the agents towards
the goal. This bias may potentially lead to more frequent initial visits to the rewarding states and thus allow the learning process to converge more rapidly. Finally, it is important to remark that the results presented herein allow for little comprehension of the actual potential of the algorithm. We are currently testing this algorithm in much larger problems, which will allow us to infer how well our algorithm can cope with the high dimensionality arising from the consideration of large problems. We remark, however, that since our algorithm does not take into account any global information, it is reasonable to expect its complexity to grow linearly with the number of agents (instead of the exponential growth in fully coupled approaches). It is also important to somehow compare the performance of our algorithm with that of the several planning methods in the literature, in the particular class of problems that can adequately be addressed by our algorithm. We remark, however, that these algorithms compute the policy off-line, which makes a direct comparison difficult.
ACKNOWLEDGEMENTS This research was partially sponsored by the Portuguese Fundação para a Ciência e a Tecnologia under the Carnegie Mellon-Portugal Program and the Information and Communications Technologies Institute (ICTI), www.icti.cmu.edu. The views and conclusions contained in this document are those of the author only.
References [1] R. Becker, S. Zilberstein, V. Lesser, and C. Goldman, ‘Transitionindependent decentralized Markov decision processes’, in Proc. AAMAS, pp. 41–48, (2003). [2] D. Bernstein, S. Zilberstein, and N. Immerman, ‘The complexity of decentralized control of Markov decision processes’, Mathematics of Operations Research, 27(4), 819–840, (2002). [3] S. Bhatnagar, R. Sutton, M. Ghavamzadeh, and M. Lee, ‘Incremental natural actor-critic algorithms’, in Proc. NIPS 20, pp. 105–112, (2007). [4] M. Bowling and M. Veloso, ‘Scalable learning in stochastic games’, in Workshop on Game & Decision Theor. Agents, pp. 11–18, (2000). [5] S. Kakade, ‘A natural policy gradient’, in Proc. NIPS 14, pp. 1531– 1538, (2001). [6] V. Konda and J. Tsitsiklis, ‘On actor-critic algorithms’, SICON, 42(4), 1143–1166, (2003). [7] H. Kuhn, ‘Extensive games and the problem of information’, Annals of Mathematics Studies, 28, 193–216, (1953). [8] M. Littman, ‘Value-function reinforcement learning in Markov games’, J. Cognitive Systems Research, 2(1), 55–66, (2001). [9] R. Nair, D. Pynadath, M. Yokoo, M. Tambe, and S. Marsella, ‘Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings’, in Proc. IJCAI, pp. 705–711, (2003). [10] F. Oliehoek, M. Spaan, S. Whiteson, and N. Vlassis, ‘Exploiting locality of interaction in factored Dec-POMDPs’, in Proc. AAMAS, pp. 517–524, (2008). [11] C. Papadimitriou and J. Tsitsiklis, ‘The complexity of Markov chain decision processes’, Mathematics of Operations Research, 12(3), 441– 450, (1987). [12] J. Peters, S. Vijayakumar, and S. Schaal, ‘Natural Actor-Critic’, in Proc. ECML, pp. 280–291, (2005). [13] M. Roth, R. Simmons, and M. Veloso, ‘Exploiting factored representations for decentralized execution in multi-agent teams’, in Proc. AAMAS, pp. 469–475, (2007). [14] S. Singh, M. Kearns, and Y. Mansour, ‘Nash convergence of gradient dynamics in general-sum games’, in Proc. UAI, pp. 541–548, (2000). [15] M. Spaan and F. Melo, ‘Interaction-driven Markov games for decentralized multiagent planning under uncertainty’, in Proc. AAMAS, pp. 525–532, (2008). [16] R. Sutton, D. McAllester, S. Singh, and Y. Mansour, ‘Policy gradient methods for reinforcement learning with function approximation’, in Proc. NIPS 13, pp. 1057–1063, (2000). [17] X. Wang and T. Sandholm, ‘Reinforcement learning to play an optimal Nash equilibrium in team Markov games’, in Proc. NIPS 15, pp. 1571– 1578, (2002).
ECAI 2008 M. Ghallab et al. (Eds.) IOS Press, 2008 © 2008 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-58603-891-5-162
A Fast Method for Property Prediction in Graph-Structured Data from Positive and Unlabelled Examples Susanne Hoche1 and Peter Flach2 and David Hardcastle3 Abstract. The analysis of large and complex networks, or graphs, is becoming increasingly important in many scientific areas including machine learning, social network analysis and bioinformatics. One natural type of question that can be asked in network analysis is “Given two sets R and T of individuals in a graph with complete and missing knowledge, respectively, about a property of interest, which individuals in T are closest to R with respect to this property?”. To answer this question, we can rank the individuals in T such that the individuals ranked highest are most likely to exhibit the property of interest. Several methods based on weighted paths in the graph and Markov chain models have been proposed to solve this task. In this paper, we show that we can improve previously published approaches by rephrasing this problem as the task of property prediction in graph-structured data from positive examples, the individuals in R, and unlabelled data, the individuals in T , and applying an inexpensive iterative neighbourhood’s majority vote based prediction algorithm (“iNMV”) to this task. We evaluate our iNMV prediction algorithm and two previously proposed methods using Markov chains on three real world graphs in terms of ROC AUC statistic. iNMV obtains rankings that are either significantly better or not significantly worse than the rankings obtained from the more complex Markov chain based algorithms, while achieving a reduction in run time of one order of magnitude on large graphs.
1 Introduction
The analysis of large and complex networks or graphs is becoming increasingly important in a variety of scientific disciplines. Graphs allow us to model various tasks for graph-structured data which consist of individuals that are connected to each other in terms of, e.g., a shared interest or common function. In a graph G = (V, E), the individuals are modelled as nodes v ∈ V, and the connections between the individuals as links e ∈ E ⊆ V × V between the nodes. One prominent task in the analysis of graph-structured data is to rank one fraction T ⊂ V of target nodes in a graph relative to another fraction R ⊂ V of root nodes exhibiting a certain property of interest φ, in order to answer the question of how close or similar they are to the ones in R with respect to φ. Here, we focus on co-authorship graphs where the nodes are papers which are linked to each other by an undirected weighted edge iff the papers have one or more authors in common; R ⊂ V is a set of papers having scientific topic φ, and T ⊂ V is a set of papers with unknown topics for which we want to know how similar they are to the papers in R with observed topic φ. To answer such a question, we can attempt to rank the nodes in T such that the nodes ranked highest are most likely to exhibit φ and can thus be assumed to be closest to R with respect to φ. A number of approaches have been proposed in different scientific areas to determine a node’s importance in a graph, such as, e.g., numerous node centrality measures in social network analysis [19], and ranking algorithms motivated by the necessity to sort Web pages in a specific Web search task (e.g., HITS [11] and PageRank [3]). However, while these algorithms operate on a global level, the task we are interested in is to rank nodes on a local level, i.e., with respect to a given set R of nodes exhibiting property φ which can be interpreted as existing background knowledge, or ranking bias. Several such local ranking methods which answer the question of relative importance for graph-structured data have been proposed in [20]. These methods are based on weighted paths and Markov chain models and are thus computationally expensive, which makes their application to large graphs inefficient. We can improve these approaches by rephrasing the ranking problem as the task of property prediction in graph-structured data from positive examples, the nodes in R, and unlabelled data, the nodes in T, and applying an inexpensive iterative neighbourhood’s majority vote based prediction algorithm (“iNMV”) that allows an effective and efficient ranking of the nodes in T with respect to the nodes in R. Given a set R ⊂ V of papers in a co-authorship graph G with an observed topic φ ∈ Φ, one can predict – on the basis of the known topics and the graph’s link structure – the probability that for a given set T of papers with unknown topics, t ∈ T has topic φ, and rank the nodes in T according to this predicted probability, i.e., according to their similarity to R with respect to φ. The remainder of the paper is organised as follows. We discuss two Markov chain based methods proposed in [20] for ranking individuals in graphs in Section 2. In Section 3, we present our iNMV prediction algorithm and detail how we obtain a ranking of T. In Section 4, we show that on three real world graphs the iNMV prediction algorithm achieves rankings that are either significantly better or not significantly worse than the rankings obtained from the two methods described in Section 2, and at the same time reduces the run time on large graphs by one order of magnitude. We review related work in Section 5 and conclude in Section 6.

1 University of Bristol, Department of Computer Science, UK, email: [email protected]
2 University of Bristol, Department of Computer Science, UK, email: [email protected]
3 University of Bristol, Department of Computer Science, UK, email: [email protected]
2 Local Ranking Methods based on Markov Chains
White and Smyth propose in [20] several local ranking methods – based on weighted paths and Markov chain models – which answer the question of the relative importance of a set T of nodes in a graph G with respect to another set R in G. Here, we discuss two of their proposed methods that are based on Markov chains. In a Markov chain based approach, G is viewed as representing a first-order Markov chain. The idea is to traverse the graph in a Markov random walk, i.e., to start at some node and then randomly follow an outgoing edge to the next node, from where the process then repeats itself. The first-order Markov chain, or the transitions between the nodes, is characterized by a transition probability matrix P. The descriptions in the next two sections are based on [20].
2.1 Inverse Average Mean First Passage Time
The mean first passage time m_rt from a node r to a node t in a first-order Markov chain is defined as the expected number of steps in an infinite-length Markov random walk starting at r until the first arrival at t, i.e., as

m_rt = Σ_{n=1}^{∞} n f_rt^(n),    (1)

where f_rt^(n) denotes the probability that the random walk starting at r reaches t after exactly n steps. [20] defines the importance I1(t|R) of a node t with respect to a set R in terms of the inverse average mean first passage time, i.e., as

I1(t|R) = 1 / ( (1/|R|) Σ_{r∈R} m_rt )    (2)
That is, important nodes are relatively close to all the nodes in R. A so-called mean first passage time matrix M with entries m_ij for all pairs of nodes (v_i, v_j) in the graph can be obtained as follows. The fundamental matrix is defined as Z = (I − P + eπ^T)^(−1), where P is the Markov transition probability matrix, e a column vector containing all ones, and π a column vector of the stationary distribution for the Markov chain. The mean first passage time matrix is then obtained as

M = (I − Z + E Z_dg) D,    (3)

where I is the identity matrix, E a matrix containing all ones, Z_dg the matrix that agrees with Z on the diagonal but is 0 elsewhere, and D the diagonal matrix with elements d_ii = 1/π(i) for node i’s stationary distribution π(i) for the Markov chain.
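Assuming an irreducible, row-stochastic transition matrix, equations (1)–(3) and the importance of eq. (2) can be computed directly; the numpy sketch below is ours, with an illustrative four-node graph and root set. The matrix inversion is the expensive step, which is consistent with this method being practical only for smaller graphs.

```python
import numpy as np

# Sketch (ours) of the inverse average mean first passage time ranking (eqs. 1-3).
# P must be the row-stochastic transition matrix of an irreducible Markov chain.

def mean_first_passage_times(P):
    n = P.shape[0]
    evals, evecs = np.linalg.eig(P.T)                  # stationary distribution pi:
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))   # fundamental matrix
    Zdg = np.diag(np.diag(Z))
    D = np.diag(1.0 / pi)
    return (np.eye(n) - Z + np.ones((n, n)) @ Zdg) @ D            # eq. (3)

def importance_iamfpt(P, roots, target):
    M = mean_first_passage_times(P)
    return 1.0 / np.mean([M[r, target] for r in roots])           # eq. (2)

# toy 4-node graph with illustrative weights
A = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)
print(importance_iamfpt(P, roots=[0, 1], target=3))
```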
2.2 K-Step Markov Approach
An alternative approach investigated in [20] defines the importance I2(t|R) of a node t with respect to a set R on the basis of a Markov random walk of fixed length K, i.e., as the probability that the Markov random walk starting at r and ending after exactly K steps reaches t. The value K determines the bias towards the set R: the smaller K, the larger is R’s influence; the larger K, the more we approach the Markov chain’s stationary distribution. I2(t|R) can be computed as

I2(t|R) = [P p_R + P² p_R + ··· + P^K p_R]_t,    (4)

where P is the Markov transition probability matrix, p_R is a column vector containing the initial probabilities for the set R, and [X]_t denotes the t-th entry of the column vector X.
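Equation (4) amounts to accumulating K matrix–vector products; a numpy sketch (ours, following eq. (4) as written and using an illustrative graph and root set):

```python
import numpy as np

# Sketch (ours) of the K-step Markov importance (eq. 4): accumulate the vectors
# P p_R, P^2 p_R, ..., P^K p_R and read off the entry for each target node.

def k_step_markov(P, roots, K):
    n = P.shape[0]
    p = np.zeros(n)
    p[roots] = 1.0 / len(roots)      # initial probabilities concentrated on R
    importance = np.zeros(n)
    x = p.copy()
    for _ in range(K):
        x = P @ x                    # P^k p_R, multiplying by P as in eq. (4)
        importance += x
    return importance                # I2(t|R) is the t-th entry

A = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)
print(k_step_markov(P, roots=[0, 1], K=5))
```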
3 Rephrasing the Task of Local Ranking in Terms of Property Prediction
Our main contribution in this paper is to show that we can solve the local ranking problem more efficiently by rephrasing it as the task of property prediction from positive and unlabelled examples. Specifically, let G = (V, E) be a given co-authorship graph with a set of nodes (papers) V and a set E ⊆ V × V of undirected (co-authorship) edges (v_i, v_j) with weight w_ij, and let Φ be a set of topics that each paper can have (we assume that a paper can have several topics). Furthermore, let V = R ∪ T, R ∩ T = ∅, where R is a set of root nodes, or positive examples, for which we have observed the topics, and T is a set of target nodes, or unlabelled examples, for which we do not know the topics. The task is to rank the nodes in T for each φk ∈ Φ separately on the basis of the set R of root nodes and the graph’s link structure given by E according to their probability of exhibiting topic φk.
3.1 Iterative Neighbourhood’s Majority Vote based Property Prediction
To this end, we apply our iterative neighbourhood’s majority vote prediction algorithm iNMV which is based on a simple majority vote of directly linked nodes, or neighbours, and which consists of an initialisation step and an update step which can be applied iteratively. In the initialisation step, we assign for each target node an initial estimate to its topic probability on the basis of the topics observed for the root set R. In an update step, a node’s existing estimate is modified based on the neighbouring nodes’ current estimates. This way, entities are classified in dependence of each other, and mutual influence of the predictions is accounted for. The more often the update step is iterated, the more the predictions are propagated through the graph. Since papers can have multiple topics, we consider for each topic φk ∈ Φ a binary learning problem where nodes having topic φk constitute the positive examples. For each topic φk ∈ Φ separately, iNMV derives for each target node v_i ∈ T an estimate of the probability of observing φk for v_i. We denote the set of topics of paper v_i as its topic set y_i ⊆ Φ. Our approach assumes that nodes in the same neighbourhood of the graph tend to have similar properties, and that the predicted topic for one node in the graph depends on the topic of the nodes directly linked to it. Therefore, we assume that the probability of observing topic φk for node v_i ∈ T given G is equal to the probability of observing φk for v_i given v_i’s neighbourhood N_i := {v_j ∈ V | (v_i, v_j) ∈ E} consisting of those nodes in V that are directly linked to v_i. We base the prediction of an unlabelled node’s topic probability both on labelled and unlabelled neighbours in the graph, and thus derive a topic probability estimate from the known topics and topic probability estimates of directly linked root and target nodes, respectively. To predict the probability of observing φk for a node v_i ∈ T with unknown topic set y_i, we assign to v_i an initial estimate p_ik^(1) := P(φk ∈ y_i | R), where P(φk ∈ y_i | R) denotes the probability that paper v_i has topic φk, conditioned on the topics observed in R. This estimate is based on the number n_k of times that φk is observed in R, using the maximum likelihood based m-estimate where the observations are augmented by m additional samples which are assumed to be distributed according to p:

p_ik^(1) := P(y_i = φk | R) = (n_k + p · m) / (|R| + m),    (5)
where |R| denotes the cardinality of set R. We choose m = 1 and p = 0.5 (each topic is equally likely to be present or absent). For a node v_i ∈ R with observed topic, let p_ik^(1) := 1 for every topic φk that is observed for v_i. For each topic φk, we update the initial probability estimates p_ik^(1) for each node v_i ∈ T based on its neighbourhood’s estimates: the modified estimate p_ik^(t+1) := P^(t+1)(y_i = φk | N_i) is derived on the basis of the estimates p_jk^(t) := P^(t)(y_j = φk | N_j) for observing φk for v_i’s neighbours v_j ∈ N_i in the t-th update step:

p_ik^(t+1) := P^(t+1)(y_i = φk | N_i) = (1 / Σ_{v_j ∈ N_i} w_ij) Σ_{v_j ∈ N_i} w_ij p_jk^(t),    (6)

where w_ij is the weight of the edge between the nodes v_i and v_j. As we are dealing with an undirected graph, equation (6) is recursive. To account for the mutual influence between linked nodes, the estimates can be propagated through the graph by iterating equation (6) several times. With more iterations, predictions are propagated further through the graph.
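A compact sketch (ours) of the iNMV initialisation (eq. 5) and update (eq. 6) for a single topic on a toy weighted graph; the weights, root set and number of iterations are illustrative.

```python
import numpy as np

# Sketch (ours) of iNMV for one topic: eq. (5) initialises target-node scores from
# the root set, eq. (6) repeatedly replaces each target's score by the weighted
# average of its neighbours' scores. Root nodes keep their observed value of 1.

def inmv(W, roots, n_iters, m=1.0, p=0.5):
    n = W.shape[0]
    targets = [i for i in range(n) if i not in roots]
    # eq. (5): in this single-topic binary setting every root exhibits the topic,
    # so n_k = |R| and the m-estimate gives the same initial value to every target.
    n_k = len(roots)
    init = (n_k + p * m) / (len(roots) + m)
    scores = np.full(n, init)
    scores[list(roots)] = 1.0                      # observed topic on the roots
    for _ in range(n_iters):
        new = scores.copy()
        for i in targets:                          # eq. (6): weighted neighbour average
            nbrs = np.nonzero(W[i])[0]
            new[i] = W[i, nbrs] @ scores[nbrs] / W[i, nbrs].sum()
        scores = new
    return scores                                  # ranking scores for the targets

W = np.array([[0, 2, 1, 0],                        # toy symmetric co-authorship weights
              [2, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
print(inmv(W, roots={0}, n_iters=5))
```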
3.2 Ranking the Target Set using ROC Analysis
iNMV obtains for every topic φk ∈ Φ and every node v_i ∈ T an estimate p_ik of the probability of observing φk for v_i. We interpret p_ik as a score which we use to order the target nodes T. iNMV learns from positive and unlabelled examples, i.e., from root and target nodes. However, for each topic φk ∈ Φ we have originally positive and negative examples, i.e., those examples which exhibit φk and those which do not. To generate unlabelled examples, we delete for each topic and each target node the label indicating to which topic the paper belongs, but use it, after we have obtained the ranking of the nodes, to compute the ranking’s AUC. The area under the ROC Curve statistic, or AUC, is a measure based on the pairwise comparisons between the results of a binary prediction problem, and is often used to evaluate the performance of a prediction or ranking algorithm. It can be interpreted as the probability that for a pair (+, −) of a positive and a negative example that are both drawn uniformly at random, a higher score will be assigned to the positive example than to the negative (which means that these two examples are ranked correctly relative to each other). An algorithm’s AUC is the fraction of (+, −)-pairs that it correctly ranks relative to each other, and is defined as

AUC = ( Σ_{i=1}^{m} Σ_{j=1}^{n} 1(+_i > −_j) ) / (m · n),    (7)

where +_1, ..., +_m are the scores assigned to the m positive examples, −_1, ..., −_n are the scores assigned to the n negative examples, and 1(+_i > −_j) is the indicator function which is equal to 1 if +_i > −_j, and 0 otherwise. An algorithm’s AUC is maximal, i.e., equal to 1, iff it ranks all positive examples higher than the negative examples. Any misranked (+, −)-tuple decreases the AUC.
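Equation (7) can be mirrored directly by enumerating all (positive, negative) score pairs; a quadratic-time sketch (ours) with toy scores:

```python
# Sketch (ours) of eq. (7): the fraction of (positive, negative) score pairs that
# are ranked correctly, using strict > as in the indicator function of eq. (7).

def auc(pos_scores, neg_scores):
    pairs = [(p > n) for p in pos_scores for n in neg_scores]
    return sum(pairs) / len(pairs)

# toy scores for 3 positive and 4 negative target nodes
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]))   # 11 of 12 pairs correct -> 0.9166...
```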
4 Empirical Evaluation
We evaluate the three methods described in Sections 2 and 3 on co-authorship graphs induced from the bibliographic data sets “ILPNet2” [1] and “Cora” [14]. The weighted links between the nodes are modelled in terms of an adjacency matrix A which holds for each pair (v_i, v_j) of connected nodes v_i, v_j ∈ V a non-zero entry
wi j according to the overlap of the papers’ author lists. We obtain the Markov transition probability matrix P from A by normalising the rows in A.
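The row normalisation of A into P described here is a one-liner in numpy; a sketch (ours) with an illustrative weighted adjacency matrix:

```python
import numpy as np

# Sketch (ours): turning a weighted co-authorship adjacency matrix A into the
# row-stochastic Markov transition probability matrix P by normalising each row.

A = np.array([[0, 2, 1],
              [2, 0, 3],
              [1, 3, 0]], dtype=float)    # illustrative author-overlap weights

P = A / A.sum(axis=1, keepdims=True)
print(P.sum(axis=1))                      # each row sums to 1
```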
4.1 Data and Experimental Setup
The ILPNet2 bibliographic database contains hand-selected ILP-related references from 1970 onwards. Our co-authorship graph consists of the largest connected component of 406 nodes with known topics and 6354 links (on average ≈ 15 links per node). We restrict our evaluation to the 10 topics that include at least 20 papers each. For each topic φ, we generate in 10 trials 4 distinct root and target set partitions. In each partition, the root set consists of 75% of the positive examples, i.e., the papers which have topic φ. The target set contains the remaining 25% of the positive examples and all negative examples, i.e., the papers which do not have topic φ. The target nodes are distinct in each of the 4 root and target set partitions, and their union results in the complete set of nodes. Thus, each node serves for each topic and trial exactly once as an unlabelled example, or target node. For each topic, we apply the three methods to the 40 distinct data partitions. From this we obtain for each topic φ and each node v ∈ T an estimated degree to which v belongs to φ. We interpret these values as scores and use them to rank the nodes as detailed in Section 3.2, where a higher score indicates a higher probability of exhibiting φ. Cora is a collection of ≈ 34,000 computer science research papers that have been automatically collected from the web [14]. Our co-authorship graph consists of the largest connected component of 10,513 nodes with known topics and 87,438 links (on average ≈ 8 links per node). The topics establish a hierarchy with general computer science topics at the top level which branch out into several sub-levels. We restrict our evaluation to the 6 top-level topics with the highest number of positive examples (“6 Top”), and to the 7 Machine Learning sub-topics on the lowest hierarchy level (“7 ML”). For each topic φ, we generate in 5 trials 2 distinct root and target set partitions, where a root set consists of 50% of the positive examples, and a target set of the remaining 50% of the positive examples and all negative examples. For each topic, we apply the three methods to 10 “6 Top” and “7 ML” root and target set partitions, respectively, and use the resulting scores to generate rankings of the target nodes which we evaluate in terms of the ROC AUC statistic.
4.2 Results
In Figure 1, we show for the three methods described in Sections 2 and 3 and the three domains described in Section 4.1 boxplots of the AUCs for all topics averaged over all partitions and trials. We show for the ILPNet2 data from left to right boxplots for the AUCs obtained from the inverse average mean first passage time (iaMFPT) method, iNMV with 1, 5, and 10 iterations, respectively, and the K-Step Markov method for K = 1, 2, 5, 10, 25. Each boxplot shows the median, lower and upper quartile, and the lower and upper limit of the AUCs for the single topics, for one method. Since the iaMFPT method has been found numerically too complex for the large Cora graph, results for this method are only shown for the small ILPNet2 graph. We think that this is justified since the ranking of this method is significantly worse than the rankings of all other methods (see below). We have also performed experiments for the K-Step Markov method for K > 25 but found that the AUCs are further decreasing and significantly lower than those for iNMV with 1, 5 or 20 iterations, and thus omit these results.
For the two Cora domains, we show in Figure 1 from left to right boxplots for the AUCs obtained from iNMV with 1, 5, and 10 iterations, respectively, and the K-Step Markov method for K = 1, 2, 5, 10, 25. For the two Cora domains and all methods, the single topics’ AUCs are in close range to each other. In contrast, the AUCs of the ILPNet2 topics exhibit large differences for all methods. In all the domains, nodes belonging to some topics form heterogeneous clusters in the graph, while nodes belonging to other topics are spread more widely over the graph. This seems to be more problematic when only a small number of positive examples exists. We perform a significance test to answer the question whether the results are significantly different. When comparing more than two classifiers, the non-parametric Friedman test [9] is widely recommended [6]. The Friedman test compares k algorithms over N data sets by ranking each algorithm on each data set separately, with the best result receiving rank 1, etc., and assigning average ranks in case of ties. The test then compares the average ranks of all algorithms on all data sets. If the null-hypothesis – that all algorithms are performing equivalently – is rejected under the Friedman test statistic, post-hoc tests such as the Nemenyi test [15] can be used to determine which algorithms perform statistically different. Note that for each topic φ, distinct root and target set partitions are generated, and that the Friedman test can thus be applied to these N = |φ| mutually independent data sets. According to the Friedman test, the AUCs averaged over all trials and partitions for the ILPNet2 data set obtained from the iaMFPT method are significantly worse than the rankings obtained from any other method. The AUC of the ranking obtained from the iaMFPT is most likely so much smaller because a target node t’s importance I1(t|R) is equally influenced by all root nodes in R. By contrast, a target node’s ranking obtained from iNMV or the K-Step Markov method for small K depends on a much smaller neighbourhood. This seems to indicate that the set of root nodes has to be rather coherent in order for the iaMFPT to produce a good ranking as, e.g., in the data sets evaluated in [20] (e.g., a set of collaborating authors, or interacting terrorists, where |R| = 2). In the ILPNet2 data, where the root set consists of a set of papers which have the topic of interest but which most likely belong to different “co-authorship cliques”, this assumption does not seem to hold; rather, the neighbourhood assumption that directly linked papers tend to be on the same topic does. For the Cora “6 Top” data, the Friedman test reports for the AUCs averaged over all trials and partitions that both iNMV with 5 and 20 iterations are significantly better than the K-Step Markov method for both K = 1 and K = 25. No significant differences have been found for the rankings on the Cora “7 ML” data.

Figure 1. Boxplots for the AUCs of the rankings resulting from the methods described in Sections 2 and 3 on the ILPNet2, Cora “6 Top” and “7 ML” data sets for all topics averaged over all partitions and trials. For each domain, we show – from left to right – a boxplot for iNMV with 1, 5, and 20 iterations, and for the K-Step Markov method for K = 1, 2, 5, 10, 25, respectively. For the ILPNet2 data, the leftmost boxplot is for the iaMFPT method. Each boxplot shows the median, lower and upper quartile, and the lower and upper limit of the data points (not considered to be outliers), i.e., the AUCs for the single topics, for one method. An outlier is depicted as “+”. The vertical axis shows the AUC averaged over all nCV runs.

4.3 Discussion

For iNMV, we obtain with 5 iterations on all three domains rankings with the highest AUCs. Equally, the K-Step Markov method yields for small K (2 or 5) the best AUCs. This indicates that on the domains we are investigating, the rankings benefit from a mixture of local patterns from small neighbourhoods in the graph rather than from a global method that considers information from large areas of the graph (as, e.g., the K-Step Markov with larger K, or iaMFPT). The K-Step Markov method considers for a target node t ∈ T all nodes r ∈ R that are K hops in G away from t. In contrast, iNMV with K iterations of the update step considers for the estimate of t’s topic probability all nodes r ∈ R that are K hops in G away from t, and additionally all nodes t′ ∈ T that are K hops in G away from t, where the topic probability estimate of t′ itself is modified in each iteration of the update step on the basis of its direct neighbourhood. This way, mutual influence of the unlabelled nodes is also taken into account, which seems to be advantageous for the ranking of T with respect to R and φ.
For the domains investigated in this paper, the obtained AUCs do not seem to depend on the percentage of positive examples for a topic. Rather, the main factors seem to be the number of intra- and inter-topic neighbours, respectively, that a node is linked to, and the way that the nodes with the same topic are positioned in the graph G. The more the nodes in G establish areas that are homogeneous with respect to their topics, the more successful a method can be that assumes similar nodes to lie in each other’s neighbourhood and thus bases its prediction for a node v on a small region around v in the graph.
Method          | ILPNet2    | Cora6Top   | Cora7ML
iNMV 1 It       | 2.3 ± 0.06 | 216 ± 12   | 218 ± 6
iNMV 5 Its      | 13.4 ± 0.7 | 252 ± 15   | 266 ± 7
iNMV 20 Its     | 34 ± 1.6   | 414 ± 29   | 465 ± 16
1-Step Markov   | 7.5 ± 0.6  | 1477 ± 2   | 1508 ± 27
2-Step Markov   | 7.5 ± 0.6  | 1479 ± 2   | 1555 ± 33
5-Step Markov   | 7.6 ± 0.6  | 1638 ± 27  | 1649 ± 29
10-Step Markov  | 7.9 ± 0.7  | 2309 ± 23  | 2312 ± 21
25-Step Markov  | 8.6 ± 0.6  | 4446 ± 6   | 4460 ± 19
inv. avg MFPT   | 17.5 ± 1.6 | n/a        | n/a

Figure 2. Run time complexity and standard deviations of the compared methods in seconds on an Intel(R) Xeon(TM) MP CPU 3.16GHz processor.
In Figure 2, we report the run time complexity for the iNMV and K-Step Markov methods and all domains, and that of the iaMFPT method for ILPNet2. On the small ILPNet2 co-authorship graph, iNMV with 5 and 20 iterations is 2 to 5 times slower than the K-Step Markov method. However, all methods’ run times lie in the range of a few seconds only. For the large graphs, the K-Step Markov method’s run time is 6 to 10 times larger than that of iNMV, i.e., in the range of hours rather than minutes.
5 Related Work
Closely related to our work with respect to prediction methods in graph-structured data are the publications in the fields of link-based object classification, collective inference, and iterative classification. [4] and [17] were among the first to study the effects of using related objects’ attributes to enhance classification in graph-structured domains. [4] proposes a relaxation-labelling based method for topic prediction in hyperlinked domains. [17] incrementally classifies a collection of encyclopedia articles and takes into account the classes of unlabelled documents only after they have been classified on the basis of neighbouring documents. [2] introduces conditional random fields for link-based object classification, e.g. for part-of-speech tagging, while [18] extends this approach to a setting of arbitrary graphs instead of chains. [16] proposes the use of relational dependency networks and Gibbs sampling to collectively infer labels for linked instances. [12] proposes an iterative link-based object classification method based on modelling link distributions which describe the neighbourhood of directed links around an object. [13] investigates the effectiveness of relaxation labelling based methods for classification of graph-structured data similar to the one proposed in [4]. However, none of these works consider the task of ranking a set of target nodes with respect to a set of root nodes exhibiting a specific property. Although we have for all domains that we investigate in this paper both positive and negative labelled examples, we only consider the positive examples as labelled. We argue that it is realistic to assume a paper that is not labelled as belonging to a specific topic to be unlabelled rather than to be a negative example. In the areas of social network analysis and Web mining, several approaches have been proposed to determine a node’s importance in a graph. Freeman developed several measures of node centrality which express how important a node is in a graph [7, 8]. A comprehensive overview of centrality measures in graphs is given in [19]. Several algorithms have been proposed to rank the nodes in a graph of Web pages. Well-known examples are HITS [11] and PageRank [3] – which operate on a global level – and personalised variants thereof, e.g., a topic-sensitive PageRank [10] where the ranking of Web pages is biased towards a set of specific topics, and a personalised version of HITS [5] which adjusts the measure of an authoritative source on the basis of incorporating user feedback. These personalised variants bias the standard ranking towards a set of a priori defined root nodes. However, they have been designed specifically for the context of Web queries.
6
Conclusion
We presented an effective and efficient algorithm to solve the task of ranking a set of target nodes in a graph with respect to a pre-defined set of root nodes which exhibit a specific property of interest. To this end, we rephrased the ranking problem as the task of property prediction in graph-structured data from positive and unlabelled examples, and proposed an inexpensive iterative neighbourhood’s majority vote based prediction algorithm, iNMV. On three real-world co-authorship networks, iNMV obtains rankings that are either significantly better or not significantly worse with respect to AUC than the rankings obtained from two previously published Markov chain based algorithms, and at the same time achieves a reduction in run time of one order of magnitude on large graphs. For a local ranking method, it seems to be advantageous to not only account for the root nodes’ influence on the prediction for a target node but to also consider, as iNMV with several iterations of the update step does, the
mutual influence of linked target nodes. In future work we plan to investigate whether there are benefits in learning a joint model for two or more topics. Topics are likely to be correlated (overlapping or disjoint), and we may be able to take advantage of that. We are furthermore investigating the time dependency of co-authorship networks and paper topics.
Acknowledgements The authors would like to acknowledge funding and support for this work from GCHQ in Cheltenham, UK, and would like to thank Jörg Kaduk for numerous interesting discussions.
REFERENCES
[1] ILPnet2 on-line library. http://www.cs.bris.ac.uk/~ILPnet2/Tools/Reports.
[2] J. Lafferty, A. McCallum, and F. Pereira, 'Conditional random fields: Probabilistic models for segmenting and labeling sequence data', in Proceedings of the 18th International Conference on Machine Learning, pp. 282–289 (2001).
[3] S. Brin and L. Page, 'The anatomy of a large-scale hypertextual web search engine', in Proceedings of the 7th International World Wide Web Conference, pp. 107–117 (1998).
[4] S. Chakrabarti, B.E. Dom, and P. Indyk, 'Enhanced hypertext categorization using hyperlinks', in Proceedings of the SIGMOD-98 ACM International Conference on Management of Data, pp. 307–318 (1998).
[5] H. Chang, D. Cohn, and A. McCallum, 'Creating customized authority lists', in Proceedings of the 17th International Conference on Machine Learning, pp. 167–174 (2000).
[6] J. Demšar, 'Statistical comparisons of classifiers over multiple data sets', Journal of Machine Learning Research, 7, 1–30 (2006).
[7] L.C. Freeman, 'A set of measures of centrality based on betweenness', Sociometry, 40, 35–41 (1977).
[8] L.C. Freeman, 'Centrality in social networks: I. Conceptual clarification', Social Networks, 1(3), 215–239 (1979).
[9] M. Friedman, 'The use of ranks to avoid the assumption of normality implicit in the analysis of variance', Journal of the American Statistical Association, 32, 675–701 (1937).
[10] T. Haveliwala, 'Topic-sensitive PageRank', in Proceedings of the 11th International World Wide Web Conference, pp. 517–526 (2002).
[11] J. Kleinberg, 'Authoritative sources in a hyperlinked environment', in Proceedings of the 9th ACM-SIAM Symposium on Discrete Algorithms (1998).
[12] Q. Lu and L. Getoor, 'Link based classification', in Proceedings of the 20th International Conference on Machine Learning, pp. 496–503 (2003).
[13] S.A. Macskassy and F. Provost, 'Classification in networked data: A toolkit and a univariate case study', Journal of Machine Learning Research, 8, 935–983 (2007).
[14] A. McCallum, K. Nigam, J. Rennie, and K. Seymore, 'Automating the construction of internet portals with machine learning', Information Retrieval, 3(2), 127–163 (2000).
[15] P.B. Nemenyi, Distribution-free Multiple Comparisons, Ph.D. dissertation, Princeton University, 1963.
[16] J. Neville and D. Jensen, 'Iterative classification in relational data', in Proceedings of the AAAI-2000 Workshop on Learning Statistical Models from Relational Data, pp. 13–20 (2000).
[17] H.-J. Oh, S.H. Myaeng, and M.-H. Lee, 'A practical hypertext categorization method using links and incrementally available class information', in Proceedings of the 23rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 264–271 (2000).
[18] B. Taskar, P. Abbeel, and D. Koller, 'Discriminative probabilistic models for relational data', in Proceedings of the 18th International Conference on Uncertainty in Artificial Intelligence, pp. 485–492 (2002).
[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.
[20] S. White and P. Smyth, 'Algorithms for estimating relative importance in networks', in Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 266–275 (2003).
VCD Bounds for Some GP Genotypes
José Luis Montaña 1
Abstract. We provide upper bounds for the Vapnik-Chervonenkis dimension (VCD) of classes of subsets of Rn that can be recognized by computer programs represented by expression trees built from arithmetic operations ({+, −, ∗, /}), infinitely differentiable algebraic operations (like l-root extraction), conditional instructions and sign tests. Our VCD bounds for this genotype are expressed as a polynomial function of the height of the expression trees used to represent the programs. This implies, in particular, that a GP learning machine dealing with a search space containing sequential exponential time computer programs of polynomial parallel complexity needs only a polynomial number of training examples.
1
Introduction
In recent years GP has been applied to a range of complex learning problems, including classification and symbolic regression, in a variety of fields like quantum computing, electronic design, sorting, searching, game playing, etc. A common feature of both tasks is to evolve a population composed of GP expressions built from a set of functionals F = {f1 , . . . , fk } and a set of terminals T = {x1 , . . . , c1 , . . .} (including the variables and the constants). Once we have chosen the functionals and the terminals, the classification (respectively, regression) task can be thought of as a supervised learning problem where the hypothesis class C is the tree-structured search space described by the set of leaves T and the set of nodes F. Analogously, the GP algorithm evolving computer programs P represented by the concepts of class C can be regarded as a learning algorithm. In the seventies the work by Vapnik and Chervonenkis ([9], [7], [8]) provided a remarkable family of bounds relating the performance of a learning machine to its capacity (see [5] for a modern presentation of the theory). The Vapnik-Chervonenkis dimension (VCD) is a measure of the capacity of a family of functions (or learning machines) {f(x, α)}α as classifiers, where α denotes the set of parameters of the learning machine. In general, the error ε(α) of a learning machine with parameters α is written as ε(α) = ∫ Q(x, α; y) dμ, where Q measures some notion of loss between f(x, α) and the target concept y, and μ is the distribution from which examples (x, y) are drawn for the learner. For example, for classification problems, the error of misclassification is obtained by taking Q(x, α; y) = |y − f(x, α)|. Similarly, for regression tasks one takes Q(x, α; y) = (y − f(x, α))². Many of the classic applications of learning machines can be explained inside this formalism. The starting point of Statistical Learning Theory is that we might not know μ. At this point one replaces the theoretical error ε(α) by the empirical error ε_m(α) = (1/m) Σ_{i=1}^{m} Q(x_i, y_i, α). Now, the results by Vapnik state that the error ε(α) of a learning machine
1 Department of Mathematics, Statistics and Computer Sciences, University of Cantabria, Spain, email: [email protected]. This work was partially supported by Spanish grant TIN2007-67466-C02-02.
with parameters α can be estimated, independently of the distribution μ(x, y), by means of the following formula:
ε(α) ≤ ε_m(α) + √( (h(log(2m/h) + 1) − log(η/4)) / m ),    (1)
where η is the probability that the bound is violated and h is the VCD of the family of classifiers f(x, α). While the existence of the bounds in Equation 1 is impressive, very often these bounds remain meaningless. The VC dimension h depends on the class of classifiers, equivalently, on a fully specified learning machine. Hence, it does not make sense to calculate the VCD for GP in general; however, it does make sense if we choose a particular class of computer programs as classifiers (i.e. a particular genotype). For the simplified genotype that only uses the binary standard arithmetic operators, some chosen computer program structure and a bound on the size of the program, the VC dimension remains polynomial in the size of the program and in the number of parameters of the learning machine. This last statement is an easy consequence of [3] (see Theorem 6 below); the bound also applies to the Decision Tree Model. Hence, a GP approach with arithmetic functionals and "short" programs (of size polynomial in the dimension of the space of events) has small VC dimension. Inspired by the above considerations, our aim is to go deeper into the study of formal properties of GP algorithms, taking the analysis of the classification complexity (VC dimension) of GP-trees as a starting point. This point of view is not new: a statistical learning approach to GP is proposed in [2]. We mention that, as the main difference with previous related work ([3]), where polynomial bounds in the size of the computer programs are given for the VC dimension, our bounds show that the capacity of classification of GP-trees depends essentially on parallel complexity rather than on sequential time complexity. Moreover, if the GP-tree internal nodes consist of infinitely differentiable algebraic functionals, sign tests and conditional statements, then the VC dimension depends polynomially on the height of the tree. This is quite strong since the known polynomial dependence on the size is improved (in the well-parallelizable case) by a logarithmic factor.
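The bound in Equation 1 is easy to evaluate numerically. The following is a minimal sketch (ours, not code from the paper); the function name and the example values of m, h and η are illustrative only, and natural logarithms are assumed, as in the formula above.

```python
import math

def vc_confidence_term(m, h, eta):
    """Confidence term of Equation 1 for m examples, VC dimension h and
    violation probability eta (a direct transcription of the formula)."""
    return math.sqrt((h * (math.log(2 * m / h) + 1) - math.log(eta / 4)) / m)

# Illustrative values only: the bound becomes informative once the number
# of training examples m is large relative to the VCD h.
for m in (100, 1_000, 10_000, 100_000):
    print(m, round(vc_confidence_term(m, h=50, eta=0.05), 3))
```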
1.1
Main results
Following the approach in [3] we deal with general concept classes whose concepts and instances are represented by tuples of real numbers. For such a concept class C, let Ck,n be C restricted to concepts represented by k real values and instances represented by n real values. The membership test of a concept class C over domain X takes as input a concept C ∈ C and an instance x ∈ X, and returns the boolean value "x ∈ C". Throughout this paper, the membership test for a concept class Ck,n is assumed to be expressed as a GP-tree Tk,n
taking k + n real inputs, representing a concept C ∈ Rk and an instance x ∈ X = Rn. The tree Tk,n uses exact real arithmetic, analytic algebraic operators as primitives (this includes the usual arithmetic operators and other more sophisticated operators like series having fractional exponents) and conditional statements, and when evaluated at input (x, y) it returns the truth value "x belongs to the concept represented by y". For classes defined by GP-trees as described above we state the following results.
• For a hierarchy of concept classes Ck,n, defined by GP-trees Tk,n using analytic algebraic functionals and height bounded by h = h(k, n), the VC dimension of Ck,n is polynomial in h, k, n and in the number of analytic algebraic operators that the programs contain.
• For a hierarchy of concept classes Ck,n, defined by GP-trees Tk,n using analytic algebraic functionals and height bounded by a polynomial in k and n, the VC dimension of Ck,n is also polynomial in k and n and in the number of analytic algebraic operators that the programs contain.
The precise statement of our main result is given in Section 5.2, Theorem 17.
2
Tree Structured Search Spaces
Historically the first GP search space was a subset of the LISP language. Today, GP has been extended to deal with any tree-structured search space. This space is usually described by a set of leaves or terminals T = {x1, x2, ...}, including constants, variables and auxiliary variables, and a set of internal nodes or functionals representing the operators with a given arity, N = {fk1, fk2, ...}. The search space includes all well-formed expressions, recursively defined as being either a terminal or the application of a k-ary operator fk to a list of k well-formed expressions.
Example 1 Rational functions. A simple example of a tree-structured search space is that of rational functions of any degree in the variables x1, ..., xn. The set of terminals includes all variables xi and a particular terminal R standing for any real-valued constant. The set of nodes includes the binary operations +, −, ∗, /.
Example 2 Straight Line Programs. Another tree-structured search space is that of computer programs without goto instructions. These programs are usually known as straight line programs. The main restriction is that only functions returning a value can be represented. As in the general tree case, a program or a function is recursively defined as a terminal, or as the result of a k-ary operator applied to k functions. The terminal set (leaves) includes the input variables of the program (real variables) and the constants in R. The set of functionals (internal nodes) includes the following nodes:
• Computation nodes, which are the binary nodes +, −, ∗, / and a finite set of nodes labeled with elements {f1, . . . , fq} that are infinitely differentiable algebraic operators of arities ki for every i, 1 ≤ i ≤ q.
• Sign nodes sign(f, ⊲), where ⊲ ∈ {>, =, <} is a sign condition. These nodes have a single son, which must be either a variable, a computation node or a branching node. Associated to each sign node there is a function sign(f, ⊲) that outputs true if the condition f ⊲ 0 is satisfied and false otherwise.
• Branching nodes if (·) then {·} else {·}, which are 3-ary operators having as sons: a node with boolean output representing the condition B, and two sons f and g with numerical output representing the conditional statements. Associated to a branching node there is a function branch(B, f, g) that outputs f if condition B evaluates to true and outputs g otherwise.
Remark 3 Examples of infinitely differentiable algebraic functions are the polynomials, the rational maps and also functions involving k-root extraction. Other more sophisticated examples are Puiseux series, i.e. series having fractional exponents, like Σ_{i=k}^{∞} a_i x^{i/q} with k ∈ Z, q ∈ N+ and a_i ∈ R. See [1] for a definition and properties of Puiseux series.
Remark 4 The sequential running time of a straight line program represented by a GP-tree T is given by the size of the tree T, s(T), while the parallel running time corresponds to the height of the tree T and will be denoted by h(T).
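To make the genotype of Example 2 concrete, here is a minimal Python sketch (our own illustration, not code from the paper) of a tree with computation, sign and branching nodes and a recursive evaluator. The node class names and the example tree are hypothetical; the sign-condition set {>, =, <} and the branch(B, f, g) semantics are assumed as described above.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Var:                     # terminal: input variable
    name: str

@dataclass
class Const:                   # terminal: real constant
    value: float

@dataclass
class Op:                      # computation node: +, -, *, /
    symbol: str
    left: "Node"
    right: "Node"

@dataclass
class Sign:                    # sign node: tests whether the child is <, = or > 0
    cond: str
    child: "Node"

@dataclass
class Branch:                  # branching node: if test then then_branch else else_branch
    test: "Node"
    then_branch: "Node"
    else_branch: "Node"

Node = Union[Var, Const, Op, Sign, Branch]

def evaluate(node: Node, env: dict):
    """Recursive evaluation of a GP expression tree (illustrative sketch)."""
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Op):
        a, b = evaluate(node.left, env), evaluate(node.right, env)
        return {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[node.symbol]
    if isinstance(node, Sign):
        v = evaluate(node.child, env)
        return {"<": v < 0, "=": v == 0, ">": v > 0}[node.cond]
    if isinstance(node, Branch):
        chosen = node.then_branch if evaluate(node.test, env) else node.else_branch
        return evaluate(chosen, env)
    raise TypeError(f"unknown node type: {node!r}")

# Example tree: if x1*x1 - x2 > 0 then x1 else x2
tree = Branch(Sign(">", Op("-", Op("*", Var("x1"), Var("x1")), Var("x2"))),
              Var("x1"), Var("x2"))
print(evaluate(tree, {"x1": 3.0, "x2": 5.0}))   # 9 - 5 > 0, so prints 3.0
```

In this sketch the size s(T) is the total number of nodes and the height h(T) is the longest root-to-leaf path, matching Remark 4.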
3
VC Dimension of Formulas
The following definition of VC dimension is standard; see for instance [7].
Definition 5 Let C be a class of subsets of a set X. We say that C shatters a set A ⊂ X if for every subset E ⊂ A there exists S ∈ C such that E = S ∩ A. The VC dimension of C is the cardinality of the largest set that is shattered by C.
Throughout this section we deal with concept classes Ck,n such that concepts are represented by k real numbers, w = (w1, . . . , wk), instances are represented by n real numbers, x = (x1, . . . , xn), and the membership test for the family Ck,n is expressed by a formula Φk,n(w, x) taking as input the pair concept/instance (w, x) and returning the value 1 if "x belongs to the concept represented by w" and 0 otherwise. We can think of Φk,n as a function from Rk+n to {0, 1}. So for each concept w, define

Cw := {x ∈ Rn : Φk,n(w, x) = 1}.    (2)
The objective is to obtain an upper bound on the VC dimension of the collection of sets

Ck,n = {Cw : w ∈ Rk}.    (3)
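Definition 5 and the parametric membership test Φk,n above can be checked mechanically on toy classes. The sketch below is ours: the interval class (k = 2, n = 1) and the grid of candidate concepts are illustrative choices, not from the paper. It verifies by brute force that intervals on the real line shatter a two-point set but not a three-point set, so their VC dimension is 2.

```python
from itertools import combinations

def phi(w, x):
    """Membership test Phi_{2,1}: the concept w = (w1, w2) is the interval
    [w1, w2] on the real line (a toy instance of the setting above)."""
    w1, w2 = w
    return 1 if w1 <= x <= w2 else 0

def shatters(points, concepts):
    """Brute-force check that every subset of `points` is cut out by some
    concept in `concepts`, as in Definition 5."""
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            if not any(all(phi(w, x) == (x in subset) for x in points)
                       for w in concepts):
                return False
    return True

# A crude grid of candidate interval concepts.
grid = [(a / 2, b / 2) for a in range(-10, 11) for b in range(-10, 11) if a <= b]

print(shatters([0.0, 1.0], grid))        # True: intervals shatter two points
print(shatters([0.0, 1.0, 2.0], grid))   # False: cannot pick {0, 2} without 1
```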
For boolean combinations of polynomial equalities and inequalities the following seminal result by Goldberg and Jerrum is known.
Theorem 6 ([3], Theorem 2.2) Suppose Ck,n is a class of concepts whose membership test can be expressed by a boolean formula Φk,n involving a total of s polynomial equalities and inequalities, where each polynomial has degree no larger than d. Then the VC dimension V of Ck,n satisfies

V ≤ 2k log2(4eds).    (4)

Now assume that the formula Φk,n is a boolean combination of s atomic formulas, each of them being of one of the following forms:

τi(w, x) > 0    (5)

or

τi(w, x) = 0,    (6)
where {τi(w, x)}1≤i≤s are infinitely differentiable functions from Rk+n to R. Next, make the following assumptions about the functions τi. Let α1, ..., αv ∈ Rn. Form the s·v functions τi(w, αj) from Rk to R. Choose Θ1, ..., Θr among these, and let

Θ : Rk → Rr    (7)

be defined by

Θ(w) := (Θ1(w), ..., Θr(w)).    (8)

Assume there is a bound B, independent of the αi, r and ε1, ..., εr, such that if Θ^{−1}(ε1, ..., εr) is a (k − r)-dimensional C∞-submanifold of Rk, then Θ^{−1}(ε1, ..., εr) has at most B connected components. With the above set-up, the following result is proved in [4].
Theorem 7 The VC dimension V of a family of concepts Ck,n whose membership test can be expressed by a formula Φk,n satisfying the above conditions satisfies

V ≤ 2 log2 B + 2k log2(2es).    (9)

4

VC Dimension of Formulas with Infinitely Differentiable Algebraic Operators
We study the VC dimension of formulas involving analytic algebraic functions. Such functions are called Nash functions in the mathematical literature (see [1]). A Nash function f : Rn → R is an analytic function satisfying a nontrivial polynomial equation P(x, f(x)) = 0.² The degree of a Nash function is the minimal degree of the nontrivial polynomials vanishing on its graph. A sign assignment to a Nash function f is one of the (in)equalities f > 0, f = 0 or f < 0. A sign assignment to a set of s Nash functions is consistent if all s (in)equalities can be satisfied simultaneously by some assignment of real numbers to the variables. The following lemma is an easy consequence of Bézout's Theorem for Nash functions, which is proved in [6].
Lemma 8 Let f1, . . . , fs be n-variate Nash functions, each fi of degree bounded by d. Then the subset of Rn defined by the equations

f1 = 0, . . . , fs = 0    (10)

has at most (2d)^{(s+1)(2n−1)} connected components.
We next state a result that bounds the number of consistent sign assignments of a finite family of Nash functions. The technical details of the proof are omitted; they are based on [10].
Lemma 9 Let F be a finite family of s n-variate Nash functions with degree bounded by d ≥ 1. If s ≥ (n + 1)(2n − 1), the number of consistent sign assignments to the functions of the family F is at most

( 8eds / ((n + 1)(2n − 1)) )^{(n+1)(2n−1)}.    (11)
2 Polynomial and regular rational functions are Nash functions; the function √(1 + x²) is Nash on R; many activation functions used in neural networks are Nash; the function which associates to a real symmetric matrix its i-th eigenvalue (in increasing order) is Nash on the open subset of symmetric matrices with no multiple eigenvalue. Actually, Nash functions are exactly those functions needed in order to have an implicit function theorem in real algebraic geometry.
Next we give a result concerning the VC dimension of families of concepts defined by Nash functions. The proof is a technical consequence of Theorem 7 and Lemma 8.
Proposition 10 Let x = (x1, ..., xn) and y = (y1, ..., yk) denote vectors of real variables. Suppose Ck,n is a class of concepts whose membership test can be expressed by a boolean formula Φk,n involving a total of s (in)equalities of polynomials belonging to the polynomial ring R[x, y, f1(x, y), ..., fq(x, y)], where each polynomial has degree no larger than d, and each function fi is Nash of degree bounded by d′. Then the VC dimension of Ck,n is bounded above by

2(1 + log2 max{d, d′})(k(q + 1) + 1)(2k(q + 1) − 1) + 2k log2(2es).    (12)
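As a quick sanity check of how these bounds scale, the following sketch (ours; the function names and the parameter values are arbitrary illustrations) evaluates the bound of Theorem 6 (Equation 4) and the bound of Proposition 10 (Equation 12) numerically.

```python
import math

def goldberg_jerrum_bound(k, s, d):
    """VCD bound of Theorem 6: 2k log2(4eds)."""
    return 2 * k * math.log2(4 * math.e * d * s)

def nash_bound(k, q, s, d, d_prime):
    """VCD bound of Proposition 10 (Equation 12)."""
    kq = k * (q + 1)
    return (2 * (1 + math.log2(max(d, d_prime))) * (kq + 1) * (2 * kq - 1)
            + 2 * k * math.log2(2 * math.e * s))

# Illustrative parameters: k concept parameters, s atomic (in)equalities,
# polynomial degree d, and q Nash operators of degree d'.
print(round(goldberg_jerrum_bound(k=10, s=100, d=8), 1))
print(round(nash_bound(k=10, q=2, s=100, d=8, d_prime=8), 1))
```

Both bounds grow only polynomially in the stated parameters, which is the behaviour the two results in Section 1.1 refer to.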
5
VC Dimension Bounds for GP-trees
There is an alternative definition of GP-trees representing straight line code to that given in Section 2, obtained by allowing sign gates to output a value in {0, 1} with the obvious meaning. Next we provide a precise definition of this alternative model, which is more convenient for the combinatorial analysis.
Definition 11 A Nash (q, β)-GP tree T of degree D over R is a GP-tree whose leaves are labeled with inputs or with elements of R. The internal nodes having outdegree 2 are labeled with a binary arithmetic operation of R, that is, one operation in {+, −, ∗, /}; the nodes with outdegree 1, which are sign gates, are labeled by a sign condition. Finally, there are q nodes labeled by a Nash operator of degree bounded by D with outdegree at most β.
The following statement, whose proof is straightforward from the definitions, states the relation between GP-trees with branching nodes and boolean sign gates and the alternative definition given above.
Proposition 12 Nash GP-trees with Nash operations and sign nodes as described in Definition 11 are able to simulate Nash GP-trees with boolean sign nodes and selection nodes, defining equivalent models of computation and complexity.
The output function of a GP-tree as in Definition 11 can be defined as follows. To each node v we inductively associate a function:
• If v is an input or constant node then fv is the label of v.
• If v has outdegree 2 and v1 and v2 are the sons of v then fv = fv1 opv fv2, where opv ∈ {+, −, ∗, /} is the label of v.
• If v is labeled by a Nash operator f and v1, . . . , vk are the sons of v then fv = f(fv1, . . . , fvk) with k ≤ β. There are at most q nodes of this form.
• If v is a sign node then fv = sign(fv′), where v′ is the son of v in the tree.
Remark 13 Observe that the combination of computation nodes with sign nodes (equivalently, the presence of branching nodes and boolean sign nodes) may increase the number of terms involved in the description of a GP-tree as a formula (the size of the formula) up to a number which is doubly exponential in the height of the tree. This implies that the best we can expect from Theorem 6 is an O(k²(2^h + 1)²) upper bound for the VC dimension of concept classes Ck,n whose membership test is represented by a GP-tree Tn,k having only arithmetic nodes and height h = h(n, k). A formal explanation of this situation is given in the following proposition.
Proposition 14 For every l there is a GP-tree T(l) having height O(l), expressing the membership test of a concept class C(l), and involving 2^{2^l} L-terms in its description as a formula in the first order language L with symbols +, −, ∗, /, 0, 1 and < for the order.
We explicitly construct the GP-tree T(l) as follows.
• The input nodes of T(l) are the variables x and y. The dimension of the space of variables x and y is not meaningful in this example.
• Consider any set of 3·2^l polynomials Qi(x, y) that can be computed in constant height.
• In constant height and size O(2^l), build 2^l nodes v_i^0, 1 ≤ i ≤ 2^l, as follows: the output f_{v_i^0} is the polynomial Q_{3i−2}, when Q_{3i} ≠ 0, or Q_{3i−1}, when Q_{3i} = 0.
• Within height l + 1 and size 2^{l+1} − 1, add product nodes v_1^i, ..., v_{2^{l−i+1}}^i where

f_{v_k^i} = f_{v_{2k−1}^{i−1}} ∗ f_{v_{2k}^{i−1}}.    (13)
In this latter definition, the superscript index i indicates the height level and ranges over 1, ..., l + 1, and the subscript index k indicates the node number at level i; moreover, k ranges over 1, ..., 2^{l−i+1}.
• Finally, add a root node v whose output is given by fv = sign(f_{v_1^{l+1}}).
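A rough accounting of the construction of T(l) may help; the sketch below is ours (hypothetical function and variable names) and only counts nodes and branch combinations rather than building the tree. The point is that the height grows linearly in l, while each of the 2^l level-0 selection nodes contributes a factor of 2 to the number of cases a quantifier-free formula for the root output has to distinguish, giving the 2^{2^l} L-terms of the proposition.

```python
# Rough accounting for the construction of T(l) above -- our own sketch,
# not code from the paper.
def t_l_statistics(l):
    selection_nodes = 2 ** l          # nodes v_i^0, each choosing Q_{3i-2} or Q_{3i-1}
    branch_combinations = 2 ** selection_nodes   # 2 cases per selection node
    return selection_nodes, branch_combinations

for l in range(1, 6):
    sel, cases = t_l_statistics(l)
    print(f"l={l}: height grows like O(l), selection nodes={sel}, "
          f"branch combinations={cases}")
```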
the set of new variables zv. We introduce at most qβ new variables. Let v(i, 1), . . . , v(i, li) be the collection of sign nodes of the GP-tree Tk,n whose height is i ≤ h = h(k, n). Now, for each pair (i, j), 1 ≤ j ≤ li, let fi,j be the function that the sign node v(i, j) receives as input. Since the outdegree of the arithmetic nodes is bounded by 2, it easily follows by induction that fi,j is a piecewise rational function of (x, y, z, (fl(x, y, z))_{1≤l≤q}) of formal degree bounded by 2^i (the variables z can be eliminated by substitution to get fi,j as a function of the input variables x, y). Note that at height i the number of non-spurious sign nodes li is bounded above by max{β, 2}^{h−i}. Now, for each sign assignment ε = (εi,j) ∈ {>, =, <}