Proceedings of the 2nd Asia-Pacific Conference on IAT
Intelligent Agent Technology
Research and Development
Editors
Ning Zhong Maebashi Institute of Technology, Japan
Jiming Liu Hong Kong Baptist University
Setsuo Ohsuga Waseda University, Japan
Jeffrey Bradshaw University of West Florida, USA
World Scientific
New Jersey • London • Singapore • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805 USA office: Suite IB, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
INTELLIGENT AGENT TECHNOLOGY Research and Development Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4706-0
Printed in Singapore by World Scientific Printers (S) Pte Ltd
PREFACE

Intelligent Agent Technology is concerned with the development of autonomous computational or physical entities capable of perceiving, reasoning, adapting, learning, cooperating, and delegating in a dynamic environment. It is one of the most promising areas of research and development in information technology, computer science, and engineering today. This book is an attempt to capture the essence of the current state of the art in intelligent agent technology and to identify the new challenges and opportunities that it is or will be facing. It contains the papers accepted for presentation at The Second Asia-Pacific Conference on Intelligent Agent Technology (IAT '01), held in Maebashi, Japan, October 23-26, 2001.

The second meeting in the IAT conference series follows the success of IAT '99 held in Hong Kong in 1999. IAT '01 brought together researchers and practitioners to share their original research results and practical development experiences in intelligent agent technology. The most important feature of this conference was that it emphasized a multifaceted, holistic view of this emerging technology, from its computational foundations, in terms of models, methodologies, and tools for developing a variety of embodiments of agent-based systems, to its practical impact on tackling real-world problems.

Much work has gone into the preparation of the IAT '01 technical program: original, high-quality papers were solicited for various aspects of theories, applications, and case studies related to agent technologies. 134 full papers were submitted from 32 countries and regions of all continents. Each submitted paper was reviewed by at least three experts on the basis of technical soundness, relevance, originality, significance, and clarity. Based on the review reports, 25 regular papers (19%) and 40 short papers were accepted for presentation and publication.

This book is structured into six chapters according to the main conference sessions:

Chapter 1. Formal Agent Theories
Chapter 2. Computational Architecture and Infrastructure
Chapter 3. Learning and Adaptation
Chapter 4. Knowledge Discovery and Data Mining Agents
Chapter 5. Distributed Intelligence
Chapter 6. Agent-Based Applications

In addition to the above chapters, this book also includes the abstracts or papers for the IAT '01 keynote/invited talks by Benjamin W. Wah, Toyoaki Nishida, Zbigniew W. Ras, Andrzej Skowron, and Katia Sycara, which provide different perspectives on Intelligent Agent Technology.
We wish to express our gratitude to all members of the Conference Committee and the International Advisory Board for their instrumental and unfailing support. IAT '01 has a very exciting program with a number of features, ranging from technical sessions and invited talks to agent demos and social programs. All of this work would not have been possible without the generous dedication of the Program Committee members and the external reviewers in reviewing the papers submitted to IAT '01, of our invited speakers, Benjamin W. Wah, Toyoaki Nishida, Zbigniew W. Ras, Andrzej Skowron, and Katia Sycara, in preparing and presenting their very stimulating talks, and of Jianchang Mao (Demos & Exhibits Chair) in soliciting demo proposals and setting up the program. We thank them for their strong support.

The conference Web support team at the Knowledge Information Systems Laboratory, Maebashi Institute of Technology did a terrific job of putting together and maintaining the home page for the conference, as well as building a software system, cyber-chair, which is an intelligent agent and interface among organizers, program committee members, and authors/attendees. We would like to thank Juzhen Dong, Muneaki Ohsima, and Norichika Hayazaki of the conference Web support team for their dedication and hard work.

IAT '01 could not have taken place without the great team effort of the Local Organizing Committee and the support of Maebashi Institute of Technology and Maebashi Convention Bureau. Our special thanks go to Nobuo Otani (Local Organizing Chair), Sean M. Reedy, Masaaki Sakurai, Kanehisa Sekine, and Yoshitsugu Kakemoto (the Local Organizing Committee members) for their enormous efforts in planning and arranging the logistics of the conference, from registration/payment handling, venue preparation, and accommodation booking to banquet/social program organization.
We are very grateful to the IAT '01 sponsors: ACM SIGART, Maebashi Institute of Technology, Maebashi Convention Bureau, Maebashi City Government, Gunma Prefecture Government, The Japan Research Institute, Limited, United States Air Force Office of Scientific Research, Asian Office of Aerospace Research and Development, United States Army Research Office in Far East, and Web Intelligence Laboratory, Inc. for their generous support. We thank ACM SIGWEB, SIGCHI, Japanese Society for Artificial Intelligence, JSAI SIGFAI, SIGKBS, and IEICE SIGKBSE for being in cooperation with IAT '01. Last but not least, we thank Ms. Lakshmi Narayanan of World Scientific for her help in coordinating the publication of this book.
October 2001
Ning Zhong and Jiming Liu, Program Committee Chairs
Setsuo Ohsuga and Jeffrey Bradshaw, General Conference Chairs
CONFERENCE ORGANIZATION
General Chairs: Setsuo Ohsuga (Waseda U., Japan), Jeffrey Bradshaw (Inst. H&M Cognition, USA)
Program Chairs: Ning Zhong (Maebashi Inst. Technology, Japan), Jiming Liu (Hong Kong Baptist U.)
Demos and Exhibits Chair: Jianchang Mao (Verity Inc., USA)
Local Organizing Chair: Nobuo Otani (Maebashi Inst. Technology, Japan)
International Advisory Board
Jeffrey M. Bradshaw (Inst. H&M Cognition, USA)
Michele L. D. Gaudreault (US AOARD)
Daniel T. Ling (Microsoft Corp., USA)
Jiming Liu (Hong Kong Baptist U.)
Jianchang Mao (Verity Inc., USA)
Hiroshi Motoda (Osaka U., Japan)
Setsuo Ohsuga (Waseda U., Japan)
Patrick S. P. Wang (Northeastern U., USA)
Yiyu Yao (U. Regina, Canada)
Jie Yang (U. Science & Technology of China)
Ning Zhong (Maebashi Inst. Technology, Japan)
Jan Zytkow (U. North Carolina, USA)
Local Organizing Committee
Masahiko Satori (Maebashi Inst. Tech., Japan)
Tadaomi Miyazaki (Maebashi Inst. Tech., Japan)
Nobuo Otani (Maebashi Inst. Technology, Japan)
Sean M. Reedy (Maebashi Inst. Technology, Japan)
Ning Zhong (Maebashi Inst. Technology, Japan)
Toshio Kawamura (Maebashi Convention Bureau)
Masaaki Sakurai (Maebashi Convention Bureau)
Kanehisa Sekine (Maebashi Convention Bureau)
Midori Asaka (IPA, Japan)
Yoshitsugu Kakemoto (JRI, Limited, Japan)
Program Committee K. Suzanne Barber (U. Texas-Austin, USA) Guy Boy (EURISCO, France) Cristiano Castelfranchi (CNR, Italy) Kerstin Dautenhahn (U. Hertfordshire, UK) Edmund H. Durfee (U. Michigan, USA) E. A. Edmonds (Loughborough U., UK) Tim Finin (UMBC, USA) Adam Maria Gadomski (ENEA, Italy) Scott Goodwin (U. Regina, Canada) Vladimir Gorodetsky (Russian Academy of Sci.) Mark Greaves (The Boeing Company, USA) Barbara Hayes-Roth (Stanford U., USA) Michael Huhns (U. South Carolina, USA) Keniti Ida (Maebashi Inst. Technology, Japan) Tom Ishida (Kyoto U., Japan) Lakhmi Jain (U. South Australia) Stefan J. Johansson (U. Karlskrona, Sweden) Qun Jin (U. Aizu, Japan) Juntae Kim (Dongguk U., Korea) David Kinny (U. Melbourne, Australia) Matthias Klusch (German Research Center for AI) Sarit Kraus (U. Maryland, USA) Danny B. Lange (General Magic, Inc., USA) Jimmy Ho Man Lee (Chinese U. Hong Kong) Jiming Liu (Hong Kong Baptist U.) Mike Luck (U. Southampton, UK) Helen Meng (Chinese U. Hong Kong) Joerg Mueller (Siemens, Germany)
Hideyuki Nakashima (ETL, Japan) Wee-Keong Ng (Nanyang Tech. U., Singapore) Katsumi Nitta (Tokyo Inst. Technology, Japan) Yoshikuni Onozato (Gunma U., Japan) Tuncer Oren (Marmara Research Center, Turkey) Ichiro Osawa (ETL, Japan) Sun Park (Rutgers U., USA) Van Parunak (ERIM, USA) Zbigniew W. Ras (U. North Carolina, USA) Eugene Santos (U. Connecticut, USA) Zhongzhi Shi (Chinese Academy of Sciences) Carles Sierra (Scientific Research Council, Spain) Kwang M. Sim (Chinese U. Hong Kong) Andrzej Skowron (Warsaw U., Poland) Ron Sun (U. Missouri-Columbia, USA) Niranjan Suri (U. West Florida, USA) Takao Terano (U. Tsukuba, Japan) Demetri Terzopoulos (U. Toronto, Canada) Huaglory Tianfield (Glasgow Caledonian U., UK) David Wolpert (NASA Ames Research Center) Jinglong Wu (Kagawa U., Japan) Takahira Yamaguchi (Shizuoka U., Japan) Kazumasa Yokota (Okayama Prefectural U., Japan) Eric Yu (U. Toronto, Canada) P. C. Yuen (Hong Kong Baptist U.) Chengqi Zhang (Deakin U., Australia) Ning Zhong (Maebashi Inst. Technology, Japan)
TABLE OF CONTENTS
Preface
Conference Organization
Invited Talks
Intelligent Agents for Market-Trend Prediction
    Benjamin W. Wah
Social Intelligence Design for Knowledge Creating Communities
    Toyoaki Nishida
Query Answering Based on Distributed Knowledge Mining
    Zbigniew W. Ras
Approximate Reasoning by Agents in Distributed Environments
    Andrzej Skowron
Multi-Agent Infrastructure for Agent Interoperation in Open Computational Environments
    Katia Sycara

Chapter 1. Formal Agent Theories
SPY: A Multi-Agent Model Yielding Semantic Properties
    F. Buccafurri, D. Rosaci, G. M. L. Sarnè, L. Palopoli
ABT with Asynchronous Reordering
    Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings
Social Rationality and Cooperation
    Guido Boella
Belief Revision in Type Theory
    Tijn Borghuis, Fairouz Kamareddine, Rob Nederpelt
Heterogeneous BDI Agents II: Circumspect Agents
    Maria Fasli
A Preference-Driven Approach to Designing Agent Systems
    Stefan J. Johansson, Johan Kummeneje
Agent Consumer Reports: of the Agents, by the Agents, and for the Agents
    Xiaocheng Luan, Yun Peng, Timothy Finin
Logical Formalizations Built on Game-Theoretic Argument about Commitments
    Lambèr Royakkers, Vincent Buskens
Asynchronous Consistency Maintenance
    Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings

Chapter 2. Computational Architecture and Infrastructure
Reasoning about Mutual-Belief among Multiple Cooperative Agents
    Wenpin Jiao
Portable Resource Control for Mobile Multi-Agent Systems in JAVA
    Walter Binder, Jarle G. Hulaas, Alex Villazon, Rory G. Vidal
An Agent-Based Mobile E-Commerce Service Platform for Forestry and Agriculture
    Matthias Klusch, Andreas Gerber
An Itinerary Scripting Language for Mobile Agents in Enterprise Applications
    Seng Wai Loke, Arkady Zaslavsky, Brian Yap, Joseph Fonseka
Intelligent Agents for Mobile Commerce Services
    Mihhail Matskin
A New Concept of Agent Architecture in Agentspace
    T. Nowak, S. Ambroszkiewicz
21st Century Systems, Inc.'s Agent Enabled Decision Guide Environment (AEDGE™)
    Plamen V. Petrov, Alexander D. Stoyen, Jeffrey D. Hicks, Gregory J. Myers
Proactiveness and Effective Observer Mechanisms in Intelligent Agents
    Jon Plumley, Kuo-Ming Chao, Rachid Anane, Nick Godwin

Chapter 3. Learning and Adaptation
Parrondo Strategies for Artificial Traders
    Magnus Boman, Stefan J. Johansson, David Lyback
BDI Multi-Agent Learning Based on First-Order Induction of Logical Decision Trees
    Alejandro Guerra Hernandez, Amal El-Fallah Seghrouchni, Henry Soldano
Evolutionary Behaviors of Competitive Agents in Dilemma Situation
    Tin Tin Naing, Lifeng He, Atsuko Mutoh, Tsuyoshi Nakamura, Hidenori Itoh
A Strategy for Creating Initial Data on Active Learning of Multi-Layer Perceptron
    Kazunori Iwata, Naohiro Ishii
Equilibrium Selection in a Sequential Multi-Issue Bargaining Model with Evolutionary Agents
    Norberto Eiji Nawa, Katsunori Shimohara, Osamu Katai
Affect and Agent Control: Experiments with Simple Affective States
    Matthias Scheutz, Aaron Sloman
Meta-Learning Processes in Multi-Agent Systems
    Ron Sun
Scalability and the Evolution of Normative Behavior
    Jorg Wellner, Sigmar Papendick, Werner Dilger
Thinking-Learning by Argument
    Aladdin Ayesh
Evolution of a Foraging Model with Many Individuals by Kin-selection
    Kazue Kinoshita, Atsuko Mutoh, Tsuyoshi Nakamura, Hidenori Itoh
The Use of Emergent Behaviour in a Multi-Agent System to Drive Self-Adaptation at the Interface
    Peter Marshall, Sue Greenwood
A Biologically Inspired Four Legged Robot That Exhibits Some Natural Walking Behaviours
    S. Peng, G. R. Cole, C. P. Lam

Chapter 4. Knowledge Discovery and Data Mining Agents
CM-RELVIEW: A Tool for Causal Reasoning in Multi-Agent Environments
    Brahim Chaib-Draa
User's Ontology-Based Autonomous Interface Agents
    Tarek Helmy, Satoshi Amamiya, Makoto Amamiya
Integration and Reuse of Heterogeneous XML DTDs for Information Agents
    Euna Jeong, Chun-Nan Hsu
Virtual Museum's Assistant
    Osvaldo Cairo, Ana Aldeco, M.E. Algorri
Index Based Document Classification with CC4 Neural Networks
    Enhong Chen, Zhengya Zhang, Xufa Wang, Jie Yang
Price Watcher Agent for E-Commerce
    Simon Fong, Aixin Sun, Kin Keong Wong
Automated Information Extraction from Web Pages Using an Interactive Learning Agent
    Jugal K. Kalita, Paritosh Rohilla
An Intelligent Agent with Structured Pattern Matching for a Virtual Representative
    Seung-ik Lee, Sung-Bae Cho
A Calendar Management Agent with Fuzzy Logic
    Wayne Wobcke
XML Based Multi-Agent Collaboration for Active Digital Libraries
    Yanyan Yang, Omer F. Rana, David W. Walker, Roy Williams, Giovanni Aloisio

Chapter 5. Distributed Intelligence
An Intelligent Channel Allocation Scheme for Mobile Networks: An Application of Agent Technology
    Eliane L. Bodanese, Laurie G. Cuthbert
An Atomic Approach to Agent-Based Imagery and Geospatial Problem Solving
    James J. Nolan, Robert Simon, Arun K. Sood
Model-Based Creation of Agents and Distribution of Problem Solving
    Katsuaki Tanaka, Setsuo Ohsuga
A Distributed Algorithm for Coalition Formation Among E-Commerce Agents
    Guillaume Vauvert, Amal El Fallah-Seghrouchni
Optimal Reward Functions in Distributed Reinforcement Learning
    David H. Wolpert, Kagan Tumer
Polygonal Approximation of Planar Digital Curves Using Ant System
    Peng-Yeng Yin
A Biological View on Information Ecosystems
    Bengt Carlsson, Paul Davidsson
The CoDAC Collaboration Framework
    K. W. Ng, T. O. Lee
A Multi-Agent Approach to Modelling Interaction in Human Mathematical Reasoning
    Alison Pease, Simon Colton, Alan Smaill, John Lee
Secure Asynchronous Search
    Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings
Foundations of Market-Driven Agents: An Adaptation of Zeuthen's Bargaining Model
    Kwang Mong Sim, Chung Yu Choi

Chapter 6. Agent-Based Applications
Kavanah: An Active User Interface Information Retrieval Application
    Eugene Santos Jr., Hien Nguyen, Scott M. Brown
iJADE WeatherMAN - A Multi-Agent Fuzzy-Neuro Network Based Weather Prediction System
    Raymond Lee, James Liu, Jane You
Acquaintance Models in Coalition Planning for Humanitarian Relief Operation
    Michal Pechoucek, Vladimir Marik, Jaroslav Barta
Agent Negotiation in a Virtual Marketplace
    Walid S. Saba, Pratap R. Sathi
Modeling User Preferences to Facilitate More Automated and Accurate Transaction Brokering within Heterogeneous Multi-Agent Electronic Markets
    G. Tewari, P. Maes, A. Berkovich, V. Gabovich
Attitude Based Agents in E-Commerce Applications
    S. Au, N. Parameswaran
Organizing Internet Agents According to a Hierarchy of Information Domains
    Sylvie Cazalens, Philippe Lamarre
Introducing User Preference Modeling for Meeting Scheduling
    Hon Wai Chun, Rebecca Y. M. Wong
Executive Attentional Control in Autonomous Robotic Agents
    Jason Garforth, Anthony Meehan, Sue McHale
Implementation and Analysis of Mobile Agents in a Simulation Environment for Fieldbus Systems
    R. Hunstock, U. Ruckert, T. Hanna
Evaluating Believability in an Interactive Narrative
    Jarmo Laaksolahti, Per Persson, Carolina Palo
iJADE Stock Predictor - An Intelligent Multi-Agent Based Time Series Stock Prediction System
    Raymond S. T. Lee, James N. K. Liu
Approximate Sensor Fusion in a Navigation Agent
    J. F. Peters, S. Ramanna, M. Borkowski, A. Skowron
Simulating Day-Ahead Trading in Electricity Markets with Agents
    Max Scheidt, Hans-Jurgen Sebastian
Using Mobile Agents to Update and Maintain Course Materials on Students' Computers in Internet-Based Distance Education
    Hongxue Wang, Pete Holt

Author Index
INVITED TALKS
INTELLIGENT AGENTS FOR MARKET-TREND PREDICTION
BENJAMIN W. WAH
Department of Electrical and Computer Engineering and the Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
http://manip.crhc.uiuc.edu
(2001 IEEE Computer Society President)

In this presentation we discuss the role of intelligent agents in market-trend predictions. Market-trend data, such as stock-market data, are characterized by non-stationary time series that may depend on non-numeric and non-quantifiable measures. The prediction of market trends, therefore, should consist of prediction of non-stationary time series and the abstraction and integration of non-numeric information in prediction. In this talk, we survey various prediction techniques for and mining of market-trend data. We propose to use intelligent agents in the abstraction of non-numeric information, the decomposition of non-stationary time series into multiple stationary time series, and the prediction of trends using artificial neural networks. Finally, we illustrate our techniques in predicting stock-market data.
SOCIAL INTELLIGENCE DESIGN FOR KNOWLEDGE CREATING COMMUNITIES
TOYOAKI NISHIDA
Department of Information and Communication Engineering
Graduate School of Information Science and Technology
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
nishida@kc.t.u-tokyo.ac.jp

Communities play an important role in knowledge creation by providing people with opportunities to continually learn from others, find partners to collaborate with, and demonstrate the significance of their disciplines. In education or business, it is relatively easy to find typical examples of knowledge creating communities for sharing and exchanging specialized knowledge among knowledge workers. In other domains such as NPO or local communities, people are naturally practicing mutual learning and invaluable knowledge is built as a result, even if knowledge creation is not deemed a primary goal of the community. In this paper, I present an interdisciplinary approach to augmenting the community knowledge creating process by integrating insights from social psychology, cognitive psychology, and advanced information technology. I emphasize the role of conversations and stories as a means of establishing a common background in a community. I describe several systems that primarily use the conversational modality to mediate community communication. Among others, EgoChat allows the user to make conversation with virtualized egos responding on behalf of other users. It allows the user to take the initiative by interrupting the conversation and changing its flow. VoiceCafe allows artifacts to make conversation with people or other artifacts. It stimulates creative thinking by bringing about utterances from the physical object's point of view, which might be strikingly different from humans' view. These engineering approaches should be tightly coupled with sociological and cognitive approaches, to predict and assess the effects of community communication mediation systems on the human society. I discuss issues on designing a constructive framework of interaction for achieving practical goals without being caught by known pathological pitfalls of group interactions.
1 Introduction
Agent technology plays a diverse role in the networked society. On the one hand, agents may be intelligent surrogates that work on behalf of the user. This type of agent includes intelligent brokers that seek the best match between service providers and consumers, intelligent traders that buy and sell goods on behalf of the user, intelligent decision makers that negotiate contracts for the user, and so on. Alternatively, agents may be embodied conversational interfaces that entertain the user. This type of agent is becoming popular as agent portals on the Internet, or as artificial pets in the entertainment and amusement domain.

In this paper, I discuss issues in applying agent technology to the development of a social information service for mediating communication among people. From this perspective, the central issue is designing and understanding a world where people and agents cohabit, rather than inventing a system of artifacts. We will not be able to create a totally new kingdom of artificial agents apart from the human society; instead, we have to carefully embed the agent system in the existing human society. This means that we need to understand more about humans and the human society to better design an embedded system. We need to pay close attention to the effects the technology has on the human society. We need to make every effort to have the proposal accepted by the human community. In contrast, we need not insist on the novelty of the technology or a pedagogical issue of whether the artifact can be called an agent.

Let us call this field social intelligence design in general. Research on social intelligence design involves such issues as how new technologies induce the emergence of a new language and lifestyle. For example, interactive multimedia websites are a new medium and maybe even a new language, with interesting new conventions, and increasing adaptation to the support of communities. Japanese teenagers have developed a new language for use originally with beepers and now with mobile phones. These are both new mainstream real world developments that should be studied further, and could probably give some valuable insights. The theme of "social intelligence" is really an angle on the support of groups in pursuit of their goals, whether that is medical knowledge, stock trading, or teenage gossip. I focus on community support systems to shed light on key aspects of social intelligence design.
The goal of a community support system is to facilitate formation and maintenance of human and knowledge networks to support activities in a community. Examples of community support systems include socially intelligent agents that mediate people in getting to know and communicate with each other, a collaborative virtual environment for large-scale discussions, personalized agents for helping cross-cultural communication, and interactive community media for augmenting community awareness and memory, to name just a few.

I emphasize the role of stories and conversations as a means of establishing a common background in a community. Stories allow us to put pieces of information into an intelligible structure. Conversations give us an opportunity to examine information from various angles and search for a good story structure. In some community support systems, story-telling agents play a central role. It should be noted that their significance depends more on the content of the stories than on the conversation mechanism.

I also emphasize the empirical aspects of social intelligence design. Engineering approaches should be tightly coupled with sociological and cognitive approaches, to predict and assess the effects of community communication mediation systems on the human society. I show how psychological approaches are applied to design and evaluation of community support systems.

2 Communities and Social Intelligence
Social intelligence design is distinguished from most other conventional engineering disciplines in that we have to be strongly aware of the human society as a target. For this reason, I first take a look at the nature of my target, i.e., communities, in this section. A community is a group of people loosely coupled by a shared interest or environment. More formal discussion can be found in the sociology literature. For example, Smith defines a community as follows:

    Generically, a community can be understood as a set of on-going social relations bound together by a common interest or shared circumstance. As a result, communities may be intentional or unintentional, a community's participants may purposely join together or be thrust into membership by circumstance. Intentional communities are of particular interest because they raise more questions about the reasons and causes for their emergence than do unintentional ones [21].

Traditional communities were local communities that are characterized by locality and a shared living environment. The advent of a global information network has not only considerably relaxed spatial constraints for communities to be built, but also provided new opportunities for existing communities. Typical networked communities include:

• communities of interest, in which people are tied with a shared interest;
• communities of practice, in which a group of people work together and share a common work practice; and
• enhanced local communities or smart communities, which result from enhancing communication and information sharing facilities in existing local communities.

Schlichter contrasts communities with groups and teams [23]. He characterizes communities as sets of people who share something but who do not necessarily know each other or interact on a personal basis.
In contrast, groups are sets of people who know each other but who do not necessarily cooperate, while teams are sets of people who are cooperating to achieve a common goal. In educational environments, a lecture class may be regarded as a community, a discussion group as a group, and a learning group as a team.

Recently, communities have received increasing attention in the context of knowledge management and distance learning. A community provides its members with opportunities to continually learn from others, find partners to collaborate with, and demonstrate the significance of their disciplines. In education or business, it is relatively easy to find examples of communities oriented towards knowledge creation by sharing and exchanging specialized knowledge among knowledge workers. In other domains such as NPO or local communities, people are naturally practicing mutual learning and invaluable knowledge is built as a result, even if knowledge creation is not deemed a primary goal of the community.

We consider that community knowledge creation is essentially a co-evolution of human and knowledge networks [16, 17]. By human network, I mean a collection of people connected by various social relations, such as acquaintance or partnership. A human network is considered to embody tacit knowledge that may be shared in a community but may not be explicitly spoken. In contrast, a knowledge network is a collection of documents or files connected explicitly by hyperlinks or implicitly by references. A knowledge network explicitly describes shared knowledge and interest in a community. A knowledge network enables people with a common interest to know each other, resulting in extension of the human network. A human network, in turn, helps new ideas grow through intimate discussions. It facilitates the extension of the knowledge network through publication of new knowledge. Thus, a synergetic cycle of human and knowledge networks will lead to a successful community.

A more elaborate characterization of human and knowledge networks is proposed by Contractor [3]. He pointed out that observed knowledge networks are different from cognitive networks that each individual possesses as a cognitive perception of the network. He proposes to distinguish between: (i) knowledge networks that represent the extent to which the same or disparate knowledge is distributed among various members of the group, and (ii) cognitive knowledge networks that represent individuals' cognitive perceptions of "who knows what" within the group.
In order to understand the dynamics of community knowledge, Contractor proposes to observe five types of network data: (i) a communication network of actors based on existing tasks and project links between them, (ii) a knowledge network based on actors providing an inventory of their skills and expertise, (iii) a knowledge network of actors based on links between their web sites, (iv) a knowledge network of actors based on common links from their web sites, and (v) a knowledge network based on similarity in content between different actors' web sites.

It should be noted that not all kinds of interaction in a community bring about fruitful results. In social psychology, various pathological pitfalls are known about group interactions. A notorious example is flaming, an endless slander battle on the net, which is rare in face-to-face communication. Flaming blocks discussions among community members, possibly resulting in serious damage to a community. False consensus is another undesirable phenomenon. It results from "a spiral of silence" or a "bandwagon effect", for instance, in which false cognition is socially amplified.
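As a rough illustration of network type (iv) in Contractor's list above, the following Python sketch links two actors whenever their web sites point to at least one common target, and weights the tie by the number of shared targets. The actor names, URLs, and the function name common_link_network are hypothetical and are not taken from Contractor's work; this is a sketch under those assumptions, not his measurement procedure.

```python
from itertools import combinations


def common_link_network(outlinks):
    """Build a weighted tie list from shared outbound links.

    outlinks maps an actor name to the set of URLs that the actor's
    web site links to. The result maps each pair of actors (in sorted
    order) to the number of link targets they have in common; pairs
    with no common target are omitted.
    """
    edges = {}
    for a, b in combinations(sorted(outlinks), 2):
        shared = outlinks[a] & outlinks[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical data: three actors and the pages their sites link to.
outlinks = {
    "alice": {"http://example.org/agents", "http://example.org/kdd"},
    "bob": {"http://example.org/agents", "http://example.org/robots"},
    "carol": {"http://example.org/music"},
}

network = common_link_network(outlinks)
print(network)
```

In this toy data only alice and bob share a link target, so they are the only pair connected. Counting shared targets is just one possible tie weight; a normalized overlap measure would be an equally plausible choice.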

3 Community Support Systems
The role of community support systems is to support community activities by providing a communication channel for community members. Community support systems are built on top of this communication channel and are expected to help community members (i) exchange awareness with other members, (ii) explore human and knowledge networks, (iii) build community knowledge, (iv) organize public events, (v) form a group/team for collaborative work, (vi) negotiate with others, and (vii) discuss public issues and make decisions about the community. Community support systems provide rather long-range, bottom-up communicative functions in the background of daily life. This feature contrasts with groupware, which emphasizes more task-driven, short-range collaboration, although awareness is equally emphasized. In the rest of this section, I will discuss the first three functions.

3.1 Helping to Exchange Awareness with Other Members
Most networked communities are based on intentional participation, based on a common interest for instance. Compared with mission-oriented groups where participants are incorporated in a certain work structure, the degree of necessity to exchange awareness is relatively low in networked communities. Participants tend to become silent unless a mechanism is provided for lowering the cost of exchanging awareness with other members.

In order to support awareness, Schlichter uses spatial metaphors such as rooms or hallways in "The Lecture 2000", a computational environment for supporting a learning community.

FaintPop supports a light-weight, acknowledge-only mode of communication [19]. The major design goal of FaintPop is to communicate the sense of connectedness, not to perform informative functions. FaintPop is a communication device similar to a photo frame. Small photos or icons of the user's colleagues are displayed in the frame, through which the user can communicate with other users using simple touch actions. Three types of touching are permitted: a tap to communicate a neutral feeling, a pet to communicate a positive feeling, and a hit to communicate a negative feeling. The user can communicate her/his feeling towards her/his colleagues by using these three types of touching, and other community members can observe it.

Sumi proposes to use an interest-based information distribution system, which pushes information to interested users rather than passively waiting for requests from users [24].

Voice Cafe [8] allows artifacts to make conversation with people or other artifacts (Figure 1). It stimulates creative thinking by bringing about utterances from the physical object's point of view, which might be strikingly different from humans' view. Each Voice Cafe artifact consists of a physical object and a conversational agent. It can communicate with community members by exchanging gossip, or small talk about members' conditions, schedules, thoughts and opinions, and so on.
By listening to the gossip, members can gain awareness of other people at the small-talk level.
(a) the conceptual framework of Voice Cafe
Figure 1. Virtualized egos as an interactive community medium.
3.2 Helping to Explore Human and Knowledge Networks
This facility helps the user find human and knowledge resources in a community. Social matchmaking is frequently used to locate people on the Internet who share similar interests and to enable the automatic formation of interest groups. Social matchmaking calculates the distance between users by referring to their user profiles. A major motivation behind social matchmaking is to address situations in which finding an expert is difficult and time consuming, people are working on similar projects without realizing it, or people feel socially isolated because nobody around them seems to share the same interest.
Yenta [4] is a multiagent matchmaking system that can automatically determine user interests and operate in a completely decentralized, peer-to-peer fashion. Yenta agents are persistent; they use referrals to find each other, build clusters of like-minded agents, and introduce users to each other. Special care is taken to protect user privacy. Silhouettell [20] combines awareness support and social matchmaking to bridge between informal and formal meetings. It projects the location of participants on the screen as shadows, and facilitates conversation by presenting Web pages that are inferred to be common to the participants. Referral Web [11] integrates recommendations and search through the concept of a social network. It helps the user discover her/his relationship to the best human experts for a given topic. It gathers all information from public sources, which removes the cost of information posting and registration. It can also explain to the user why each link in the referral chain appeared.
Figure 2. Virtualized egos as an interactive community medium.
In order to provide an integrated method of exploring and building human and knowledge networks, we use a talking-virtualized-egos metaphor in CoMeMo-Community [14] and EgoChat [12] to enable elaborate asynchronous communication among community members. A virtualized ego mainly performs two functions (Figure 2). First, it stores and maintains the user's personal memory. Second, it presents the content of the personal memory on behalf of the user in appropriate situations. By personal memory, we mean an aggregation of relevant information represented in the context specific to a particular person. Personal memory plays a crucial role not only in personal information management but also in mutual understanding in a community. A virtualized ego serves as a portal to the memory and knowledge of a person.
It accumulates information about a person and allows her/his colleagues to access the information through an ordinary spoken-language conversation mode, not by going up and down a complex directory in search of possibly existent information, or by deliberately issuing commands for information retrieval. In addition, a virtualized ego embodies tacit and non-verbal knowledge about the person so that more subtle messages such as attitude can be communicated. As is also the case with Voice Cafe, we take a conversation-centered approach in designing intelligent systems and capturing intelligence itself. Conversation plays a variety of roles in human societies. It not only allows people to exchange information, but it also helps them create new ideas or manage human relations. In our approach, more emphasis is placed on creating, exchanging, reorganizing, and utilizing conversational contents in knowledge creation, rather than on implementing intelligent agents or yet another human interface.
3.3 Helping to Build Community Knowledge
The third function of a community support system is to help community members build shared knowledge. Nonaka and Takeuchi pointed out that community knowledge is built by a spiral of interactions between explicit and tacit knowledge [18]. They suggest that the process of knowledge creation is more important than the body of knowledge, for people often find more value in communities that evolve as a result of learning. This implies that more emphasis should be placed on supporting interactions or the emergent aspect of community knowledge [13] in community support systems.
The Public Opinion Channel (POC) [15, 16, 7] is a community-wide interactive broadcasting system (Figure 3). A POC continuously collects messages from people in a community and feeds edited messages back to them. POC is not intended to be a system that broadcasts public opinions per se. Instead, it is intended to broadcast miscellaneous information that can serve as a basis of public opinion formation. A POC repeats a cycle consisting of a call-for-opinion followed by one or more rounds of responses by the community members and summarization by the POC system. In the initial call-for-opinion message, the POC system specifies a focus of discussion. Alternatively, people may also initiate discussion by submitting a topic. Then, interested community members may respond with messages. In principle, messages are not limited to pure opinions. Instead, they may include questions, stories, findings, jokes, proposals, and all other message types. The POC system may combine these messages, generate a story, and broadcast it to the community. The POC system may issue a progress report based on responses from community members. The process proceeds with changing subjects. A POC brings about an ecology of ever-evolving stories. People can access the story pool at any time through on-demand access.
Another thing I would like to emphasize here is that POC broadcasting can be embedded in the ambient environment, just like radio broadcasting, so that people need not pay close attention at all times.
Figure 3. The conceptual framework of Public Opinion Channel (POC). The POC is an interactive broadcasting system that continuously collects messages from community members and feeds edited message streams back to the community. [Diagram labels: Community Broadcasting Servers (POC Servers); POC Clients; Communities B and C.]
Compared with existing mass media, a POC has various advantages. Computational support and network connectivity enable a large number of responses to be analyzed on the fly, allowing real-time interactive stories to be generated. In particular, a combination of statistical computation and semantic processing permits minority opinions to be reflected in the structure of public opinion. We believe that POC also contributes to community knowledge building and public discussion.
4 Social Intelligence Design
Social Intelligence Design is a new discipline aimed at understanding and supporting social intelligence, i.e., intelligence collectively exhibited by (natural/artificial) agents to dynamically organize members' activities into a coherent one by exploiting or innovating the social structure. Social intelligence models intelligence as a phenomenon emerging from the way agents, either natural or artificial, interact with each other. Research into community support systems is concerned with the engineering aspects of Social Intelligence Design. Meanwhile, investigation into the sociological and cognitive aspects is equally or sometimes more important. The engineering approach should be tightly coupled with sociology, psychology, and other disciplines closely related to the study of humans and human society. Thus, Social Intelligence Design involves not only designing artifacts but also designing and understanding the social structure in which the artifacts are embedded.^1
Social Intelligence Design gives new life to Agent Technology and Artificial Intelligence research in general in that it makes humans an integral part of the big picture by shifting the focus from building artifacts with problem-solving or learning capabilities to designing a framework of interaction that leads to the creation of new knowledge and relationships among participants. An interdisciplinary study integrating insights from Artificial Intelligence, Human-Computer Interaction, Social and Cognitive Sciences, Media Studies, and other related disciplines is necessary to predict and assess the effects of social intelligence augmentation systems on human society from sociological and cognitive viewpoints. Promising application domains include collaborative environments, e-learning, knowledge management, community support systems, symbiosis of humans and artifacts, crisis management, and digital democracy.
The engineering side of Social Intelligence Design involves not only community support systems but also systems that range from group/team-oriented collaboration support systems [5] to large-scale online-discussion support systems such as Bubble, used in IBM's WorldJam trial [26]. The humanities side of Social Intelligence Design involves the design and assessment of social intelligence. In the rest of this section, I will give an overview of some research on this side.
4.1 Social Intelligence Design from Social Psychological View
Azechi points out that two factors hinder dynamic knowledge interaction in a networked community. One is the essential disposition of a group that prevents effective cooperation, known in particular as groupthink and hostility to out-groups. Groupthink [9] means a phenomenon in which collective creativity does not exceed individual creativity. Hostility to out-groups means that a group member readily develops hostility toward out-groups [23, 25]. This phenomenon is closely related to stereotyping, in which some stigmata produce wrong inferences about an outsider's behavior pattern and personality. The other factor is a new concept called escape from information, which means the tendency of people living in a mass-consumption society to make themselves the same as others and to avoid choosing and expressing information themselves.
Azechi classifies the content of a message into dry and wet information [1]. Dry information primarily contains logical linguistic information and constitutes the core of a message. It may be an assertion, a question, a view, an opinion, or any other statement that is logically constituted by fact. In contrast, wet information is meta-information incidental to the contents of the message. It is mainly nonlinguistic information and is not affected by the dry information. He argues that community-wide discussion for achieving some practical goal should be made only with dry information; otherwise rational discussion will be hindered by the pathology of a group and a fruitful outcome cannot be expected at all. He suggests that the following five functions should be incorporated into POC: (i) protecting personal information, (ii) consideration of minority opinions, (iii) equal opportunities for presenters, (iv) positive reciprocity, and (v) tutoring. Such discussion is taken as a valuable insight that can be used to design public discussion support systems in general. In addition, he argues, based on an investigation of a POC prototype, that the user's motivation is often discouraged by trash information, that setting up a clear purpose for using the tool will encourage the user, and that informing the user of the size of the user group will increase motivation [2].
Matsumura addresses consensus formation in networked communities and points out the importance of minority opinions in group decision making [10]. Based on social psychological experiments on minority opinions, he has found that (i) minority members tend to overestimate the number of other members who share the same attitude, (ii) minority members tend to underestimate the attitude of other members, and (iii) minority members who underestimate the proportion of the minority's opinion tend to lose the intention to act. Such inaccuracy in the cognition of opinion distribution is called the false consensus effect. These observations should be taken into account in designing discussion support systems so that useful discussions can be expected by reflecting minority opinions. He discusses the pros and cons of using anonymous messages, which will obscure the real distribution of opinions. The good news is that anonymity will not discourage minority members with the fact that they are in the minority. The bad news is that it may cause an incorrect cognition of the distribution of opinions.
^1 These arguments are inspired by discussion at the JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
4.2 Evaluations of Social Intelligence
Social Intelligence Design is certainly an empirical study. We have to repeat the design-implement-evaluate cycle until we reach better systems. Network Analysis is a powerful means of evaluating or comparing empirical data. It provides us with a means for calculating various aspects of a given network in terms of centrality, density, or cohesion. By comparing those features of one network against those of another, we can describe the similarities and differences in quantitative terms. Fujihara has applied Network Analysis to a log collected from experiments with a POC prototype over several months to see if POC actually facilitates community knowledge creation [6]. Fujihara points out the importance of setting up an appropriate control condition for formalizing the results of experiments. It will allow the effects of tools to be measured and compared against a baseline condition. He also suggests that multiple methods should be combined to gain a reliable result. Methods of evaluation may fall into the following three types:
• Analysis of users' subjective estimations and introspection that can be collected through questionnaires and interviews,
• Experimental methods where experimental conditions are operated systematically, and
• Analysis of data collected by observing users' behavior in ordinary and natural settings.
5 Concluding Remarks
In this paper, I have discussed issues related to Social Intelligence Design. By social intelligence I mean intelligence collectively exhibited by (natural/artificial) agents to dynamically organize members' activities into a coherent one by exploiting or innovating the social structure. The central issue here is designing and understanding a world where people and agents cohabit, rather than inventing a system of artifacts. I have given an overview of community support systems as an example of the engineering aspects of Social Intelligence Design. I have also shown some psychological approaches related to the design and evaluation stages of Social Intelligence Design. Agent technology has great potential for augmenting social intelligence, provided that special care is taken in embedding artifacts into human society.
References
1. Azechi, S., Social psychological approach to knowledge-creating community, in: Nishida, T. (ed.), Dynamic Knowledge Interaction, pp. 15-57, CRC Press LLC, 2000.
2. Azechi, S., Motivation for showing opinion on public opinion channel: a case study, to be presented at KES-2001, Osaka, Japan, September 6-8, 2001.
3. Contractor, N., Bishop, A., and Zink, D., PrairieKNOW: a tool to assist the study, creation, and growth of community networks, in: Bullinger, H.-J. and Ziegler, J. (eds.), Human-Computer Interaction, Volume 2, Erlbaum, Hillsdale, pp. 447-451, 1999.
4. Foner, L. N., Political Artifacts and Personal Privacy: The Yenta Multi-Agent Distributed Matchmaking System, Ph.D. Dissertation, MIT, 1999.
5. Fruchter, R., Bricks, bits & interaction, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
6. Fujihara, N., How to evaluate application of conversational intelligence, to be presented at KES-2001, Osaka, Japan, September 6-8, 2001.
7. Fukuhara, T., Nishida, T., and Uemura, S., Public Opinion Channel: a system for augmenting social intelligence of a community, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
8. Fukuhara, T., Nishida, T., and Uemura, S., Voice Cafe: conversational support system in a group, to be presented at KES-2001, Osaka, Japan, September 6-8, 2001.
9. Janis, I. L., Groupthink, 2nd Edition, Houghton Mifflin Company, Boston, 1982.
10. Matsumura, K., Consensus formation process in network community, to be presented at KES-2001, Osaka, Japan, September 6-8, 2001.
11. Kautz, H., Selman, B., and Shah, M., Referral Web: combining social networks and collaborative filtering, Communications of the ACM, 40 (3), pp. 63-65, 1997.
12. Kubota, H., Nishida, T., and Koda, T., Exchanging tacit community knowledge by talking-virtualized-egos, in: Proceedings of Agent 2000, pp. 285-292, 2000.
13. Nakata, K., Knowledge as Social Medium, New Generation Computing, Vol. 17, No. 4, pp. 395-405, 1999.
14. Nishida, T., Facilitating community knowledge evolution by talking virtualized egos, in: Hans-Joerg Bullinger and Juergen Ziegler (eds.), Human-Computer Interaction, Volume 2, Lawrence Erlbaum Associates, Pub., pp. 437-441, 1999.
15. Nishida, T., Fujihara, N., Azechi, S., Sumi, K., and Hirata, T., Public Opinion Channel for communities in the information age, New Generation Computing, Vol. 17, No. 4, pp. 417-427, 1999.
16. Nishida, T. (ed.), Dynamic Knowledge Interaction, CRC Press LLC, 2000.
17. Nishida, T., Towards dynamic knowledge interaction, Keynote Paper, in: Proc. KES-2000, pp. 1-12, 2000.
18. Nonaka, I. and Takeuchi, H., The knowledge-creating company: How Japanese companies create the dynamics of innovation, Oxford University Press, New York, 1995.
19. Ohguro, T., FaintPop: In touch with the social relationships, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
20. Okamoto, M., Isbister, K., Nakanishi, H., and Ishida, T., Supporting cross-cultural communication in real-world encounters, The 8th International Conference on Human-Computer Interaction (HCI-99), Volume 2, pp. 442-446, 1999.
21. Smith, M. A., Voices from the WELL: the logic of the virtual commons, http://www.usyd.edu.au/su/social/papers/virtcomm.htm.
22. Sherif, M., Harvey, O. J., White, B. J., Hood, W. R., and Sherif, C. W., Intergroup conflict and cooperation: the robbers' cave experiment, Institute of Group Relations, University of Oklahoma, Norman, 1966.
23. Schlichter, J., Koch, M., and Xu, C., Awareness - the common link between groupware and community support systems, in: Ishida, T. (ed.), Community Computing and Support Systems - Social Interaction in Networked Communities, LNCS 1519, Springer, Berlin, pp. 77-93, 1998.
24. Sumi, K. and Nishida, T., Context-aware and personalized communication support system, IEEE Intelligent Systems, in press, 2001.
25. Tajfel, H. and Turner, J. C., The social identity theory of intergroup behavior, in: Worchel, S. and Austin, W. G. (eds.), Psychology of Intergroup Relations, 2nd Edition, Nelson-Hall, pp. 7-24, 1986.
26. Thomas, J. C., Collaborative innovation tools, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
QUERY ANSWERING BASED ON DISTRIBUTED KNOWLEDGE MINING
ZBIGNIEW W. RAS
University of North Carolina, Department of Computer Science
Charlotte, N.C. 28223, USA
E-mail: [email protected]
Traditional query processing provides exact answers to queries. It usually requires that users fully understand the database structure and content to issue a query. Due to the complexity of database applications, so-called global queries can be posed which traditional query answering systems cannot handle. In this paper, a query answering system based on distributed data mining is presented to rectify these problems.
1 Introduction
In many fields, such as medicine, banking, and education, similar databases are kept at many sites. Each database stores information about local events and uses attributes suitable for a local task, but since the local situations are similar, the majority of attributes are compatible among databases. An attribute may be missing in one database, while it occurs in many others. Missing attributes lead to problems. A user may issue a query to a local database $S_1$ in search of objects that match a desired description, only to realize that one component $a_1$ of that description is missing in $S_1$, so that the query cannot be answered. The same query may work in other databases, but the user is interested in identifying suitable objects only in $S_1$. Clearly, the task of integrating established database systems is complicated not only by the differences between the sets of attributes but also by differences in the structure and semantics of data. We call such systems heterogeneous. The notion of an intermediate model, proposed by [Maluf and Wiederhold], is very useful in dealing with the heterogeneity problem, because it describes the database content at a relatively high abstract level, sufficient to guarantee a homogeneous representation of all databases. Discovery layers and action layers, introduced in this paper, can be used for a similar purpose. A discovery layer contains rules extracted from a database. An actions layer contains so-called action rules (see [Ras and Wieczorkowska]) showing what minimal changes in a database are needed to re-classify some of its objects.
2 Distributed Knowledge Systems
In this section, we recall the notion of an information system and a distributed information system (DIS). Next, we introduce local queries and give their standard semantics. Finally, we show the structure of discovery layers and action layers.
By an information system we mean $S = (X, A, V)$, where $X$ is a finite set of objects, $A$ is a finite set of attributes, and $V = \bigcup\{V_a : a \in A\}$ is a set of their values. We assume that:
• $V_a$, $V_b$ are disjoint for any $a, b \in A$ such that $a \neq b$,
• $a : X \to V_a$ is a function for every $a \in A$.
Instead of $a$, we may write $a_{[S]}$ to denote that $a$ is an attribute in $S$.
By a distributed information system we mean a pair $DS = (\{S_i\}_{i \in I}, L)$ where:
• $I$ is a set of sites,
• $S_i = (X_i, A_i, V_i)$ is an information system for any $i \in I$,
• $L$ is a symmetric, binary relation on the set $I$.
A distributed information system $DS = (\{S_i\}_{i \in I}, L)$ is consistent if the following condition holds:
$(\forall i)(\forall j)(\forall x \in X_i \cap X_j)(\forall a \in A_i \cap A_j)\ \big(a_{[S_i]}(x) = a_{[S_j]}(x)\big)$.
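The consistency condition can be illustrated with a minimal Python sketch. It assumes each information system is stored as a dictionary mapping objects to attribute-value rows; the site numbers, objects, and values are hypothetical and serve only as an illustration, and the relation $L$ is not modelled.

```python
from itertools import combinations

def attributes(system):
    """Return the attribute set A of a system given as {object: {attribute: value}}."""
    return set().union(*(row.keys() for row in system.values()))

def is_consistent(sites):
    """Check that every shared attribute agrees on every shared object, for all pairs of sites."""
    for (_, s_i), (_, s_j) in combinations(sites.items(), 2):
        shared_attrs = attributes(s_i) & attributes(s_j)
        for x in s_i.keys() & s_j.keys():
            for a in shared_attrs:
                if s_i[x][a] != s_j[x][a]:
                    return False
    return True

# Hypothetical two-site system: both sites know object x1 and attribute c.
sites = {
    1: {"x1": {"c": "c1", "d": "d1"}, "x2": {"c": "c2", "d": "d1"}},
    2: {"x1": {"c": "c1", "f": "f1"}, "x3": {"c": "c3", "f": "f4"}},
}
print(is_consistent(sites))
```

Only objects and attributes that two sites have in common are compared, which mirrors the quantification over $X_i \cap X_j$ and $A_i \cap A_j$ in the condition.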
In the remainder of this paper we assume that $DS = (\{S_j\}_{j \in I}, L)$ is consistent. Also, we assume that $S_j = (X_j, A_j, V_j)$ where $V_j = \bigcup\{V_{ja} : a \in A_j\}$, for any $j \in I$. We use $A$ to denote the set of all attributes in $DS$, $A = \bigcup\{A_j : j \in I\}$. Also, by $V$ we mean $\bigcup\{V_j : j \in I\}$.
Before we introduce the notion of a discovery layer, we begin with a definition of s(i)-terms and their standard interpretation $M_i$ in $DS$. By a set of s(i)-terms (also called a set of local queries for site $i$) we mean a least set $T_i$ such that:
• $0, 1 \in T_i$,
• $w \in T_i$ for any $w \in V_i$,
• if $t_1, t_2 \in T_i$, then $(t_1 + t_2), (t_1 * t_2), \sim t_1 \in T_i$.
By a set of s(i)-formulas we mean a least set $F_i$ such that:
• if $t_1, t_2 \in T_i$, then $(t_1 = t_2) \in F_i$.
Definition of DS-terms (also called a set of global queries) and DS-formulas is quite similar (we only replace $T_i$ by $\bigcup\{T_i : i \in I\}$ and $F_i$ by $F$ in the two definitions above). We say that:
• s(i)-term $t$ is primitive if it is of the form $\prod\{w : w \in U_i\}$ for any $U_i \subseteq V_i$,
• s(i)-term $t = \prod\{w : w \in U_i\}$, where $U_i \subseteq V_i$, is simple if $U_i \cap V_{ia}$ is a singleton set for any $a \in A_i$,
• s(i)-term $t$ is in disjunctive normal form (DNF) if $t = \sum\{t_j : j \in J\}$ where each $t_j$ is primitive.
Similar definitions hold for DS-terms. Clearly, it is easy to give an example of a local query. The expression:
select * from Flights where airline = "Delta" and departure_time = "morning" and departure_airport = "Charlotte"
is an example of a non-local query (DS-term) in a database
Flights(airline, departure_time, arrival_time, departure_airport, arrival_airport).
Semantics of s(i)-terms is defined by the standard interpretation $M_i$ in a distributed information system $DS = (\{S_j\}_{j \in I}, L)$ as follows:
• $M_i(0) = \emptyset$, $M_i(1) = X_i$,
• $M_i(w) = \{x \in X_i : \text{if } w \in V_{ia} \text{ then } w = h_i(x, a)\}$ for any $w \in V_i$,
• if $t_1, t_2$ are s(i)-terms, then
$M_i(t_1 + t_2) = M_i(t_1) \cup M_i(t_2)$,
$M_i(t_1 * t_2) = M_i(t_1) \cap M_i(t_2)$,
$M_i(\sim t_1) = X_i - M_i(t_1)$,
• $M_i(t_1 = t_2) = (\text{if } M_i(t_1) = M_i(t_2) \text{ then } T \text{ else } F)$, where $T$ stands for True and $F$ for False.
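The standard interpretation can be sketched as a small recursive evaluator in Python. The encoding is an assumption made for illustration: the constants 0 and 1 are integers, an attribute value $w$ is a string, and compound terms are tuples tagged with "+", "*", or "~". Because the value sets $V_a$ are assumed pairwise disjoint, a value identifies its attribute, so testing whether $w$ occurs in an object's row is enough. The sample table is hypothetical.

```python
def interpret(term, system):
    """Return M_i(term): the set of objects of one site that satisfy the term."""
    objects = set(system)
    if term == 0:
        return set()
    if term == 1:
        return objects
    if isinstance(term, str):
        # Attribute value w: objects whose row contains w (value sets are disjoint).
        return {x for x, row in system.items() if term in row.values()}
    op = term[0]
    if op == "+":
        return interpret(term[1], system) | interpret(term[2], system)
    if op == "*":
        return interpret(term[1], system) & interpret(term[2], system)
    if op == "~":
        return objects - interpret(term[1], system)
    raise ValueError(f"unknown term: {term!r}")

def holds(t1, t2, system):
    """Truth value of the formula (t1 = t2) under the same interpretation."""
    return interpret(t1, system) == interpret(t2, system)

# Hypothetical site with two attributes b and c.
system = {
    "x1": {"b": "b1", "c": "c1"},
    "x2": {"b": "b2", "c": "c2"},
    "x3": {"b": "b2", "c": "c1"},
}
print(interpret(("*", "b2", "c1"), system))  # objects with b = b2 and c = c1
```

Under this encoding, `("+", "b1", ("~", "c1"))` stands for the term $b_1 + \sim c_1$, and `holds` compares the object sets of two terms as the formula clause prescribes.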
The sound and complete axiomatization of the above semantics is quite standard and for instance is given in the paper by [Ras]^6.
Now, we are ready to introduce the notion of (k,i)-rules, for any $i \in I$. We use them to form a discovery layer at site $i \in I$. By a (k,i)-rule in $DS = (\{S_j\}_{j \in I}, L)$, $k, i \in I$, we mean a triple $(c, t, s)$ such that:
• $c \in V_k - V_i$,
• $t, s$ are s(k)-terms in DNF and they both belong to $T_k \cap T_i$,
• $M_k(t) \subseteq M_k(c) \subseteq M_k(t + s)$.
For any (k,i)-rule $(c, t, s)$ in $DS = (\{S_j\}_{j \in I}, L)$, we say that:
• $(t \to c)$ is a k-certain rule in $DS$,
• $(t + s \to c)$ is a k-possible rule in $DS$.
Let us assume that $r_1 = (c_1, t_1, s_1)$, $r_2 = (c_2, t_2, s_2)$ are (k,i)-rules. We say that $r_1, r_2$ are strongly consistent if either $c_1, c_2$ are values of two different attributes in $S_k$ or a DNF form equivalent to $t_1 * t_2$ does not contain simple conjuncts.
Now, we are ready to define a discovery layer $D_{ki}$. Its elements can be seen as approximate descriptions of values of attributes from $V_k - V_i$ in terms of values of attributes from $V_k \cap V_i$. To be more precise, we say that $D_{ki}$ is a set of (k,i)-rules such that: if $(c, t, s) \in D_{ki}$ and $t_1 = \sim(t + s)$, then $(\sim c, t_1, s) \in D_{ki}$. By a discovery layer for site $i$, denoted by $D_i$, we mean any subset of $\bigcup\{D_{ki} : (k, i) \in L\}$.
3 Actions Layer
In this section we introduce the notion of an actions layer, which is a basic part of a distributed knowledge system (DKS). Information systems can be seen as decision tables. In any decision table, together with the set of attributes, a partition of that set into conditions and decisions is given. Additionally, we assume that the set of conditions is partitioned into stable conditions and flexible conditions. Attribute $a \in A$ is called stable for the set $X$ if its values assigned to objects from $X$ cannot be changed in time. Otherwise, it is called flexible. Date of birth is an example of a stable attribute. Interest rate on any customer account is an example of a flexible attribute. For simplicity, we consider decision tables with only one decision. We adopt the following definition of a decision table:
A decision table is any information system of the form $S = (X, A_1 \cup A_2 \cup \{d\}, V)$, where $d \notin A_1 \cup A_2$ is a distinguished attribute called the decision. The elements of $A_1$ are called stable conditions, whereas the elements of $A_2 \cup \{d\}$ are called flexible conditions. The goal is to change values of attributes in $A_2$ for some objects in $X$ so that the values of the attribute $d$ for these objects may change as well. Rules in a discovery layer defining $d$ in terms of $A_1 \cup A_2$ are extracted from $S$ and used to discover new rules called action rules^7. These new rules provide suggestions for re-classification of objects from $S$ in terms of the attribute $d$. It can be done because $d$ is flexible.
Now, let us assume that $(a, v \to w)$ denotes the fact that the value of attribute $a$ has been changed from $v$ to $w$. Similarly, the term $(a, v \to w)(x)$ means that $a(x) = v$ has been changed to $a(x) = w$. In other words, the property $(a, v)$ of object $x$ has been changed to property $(a, w)$.
Assume now that $S = (X, A_1 \cup A_2 \cup \{d\}, V)$ is a decision table, where $A_1$ is a set of stable attributes and $A_2$ is a set of flexible attributes. Assume that rules $r_1, r_2$ have been extracted from $S$ and $r_1/A_1 = r_2/A_1$, $d(r_1) = k_1$, $d(r_2) = k_2$ and $k_1 < k_2$. Also, assume that $(b_1, b_2, \ldots, b_p)$ is a list of all attributes in $Dom(r_1) \cap Dom(r_2) \cap A_2$ on which $r_1, r_2$ differ and $r_1(b_1) = v_1, r_1(b_2) = v_2, \ldots, r_1(b_p) = v_p$, $r_2(b_1) = w_1, r_2(b_2) = w_2, \ldots, r_2(b_p) = w_p$. By an $(r_1, r_2)$-action rule on $x \in X$ we mean a statement:
$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \ldots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1) \to (d, k_2)](x)$.
If the value of the above rule is true on $x$ then the rule is valid for $x$. Otherwise it is false.
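The construction of an action rule from a pair of rules can be sketched in Python as follows. The sketch assumes that a rule is a dictionary with its conditions and a numeric decision value, and that the two rules are required to agree on the stable attributes; the helper `applicable` is a hypothetical convenience that only checks whether an object currently carries the values the rule proposes to change, and is not the validity notion of the text. All attribute names and values are illustrative.

```python
def action_rule(r1, r2, stable, flexible):
    """Return the (r1, r2)-action rule as a dict, or None if r1 and r2 do not qualify."""
    c1, c2 = r1["conditions"], r2["conditions"]

    def on_stable(cond):
        return {a: v for a, v in cond.items() if a in stable}

    # The rules must agree on stable attributes, and r2 must carry the higher decision value.
    if on_stable(c1) != on_stable(c2) or not r1["decision"] < r2["decision"]:
        return None
    # Flexible attributes used by both rules on which they differ: b -> (v, w).
    changes = {b: (c1[b], c2[b])
               for b in sorted(c1.keys() & c2.keys() & flexible)
               if c1[b] != c2[b]}
    return {"changes": changes, "decision": (r1["decision"], r2["decision"])}

def applicable(rule, row, decision_attr="d"):
    """True if the object `row` has decision k1 and the 'from' value v on every changed attribute."""
    k1, _ = rule["decision"]
    return row[decision_attr] == k1 and all(
        row[b] == v for b, (v, _) in rule["changes"].items())

# Hypothetical decision table fragment: a is stable, b and c are flexible.
stable, flexible = {"a"}, {"b", "c"}
r1 = {"conditions": {"a": "a1", "b": "b1", "c": "c2"}, "decision": 1}
r2 = {"conditions": {"a": "a1", "b": "b2", "c": "c2"}, "decision": 2}
rule = action_rule(r1, r2, stable, flexible)
print(rule)
```

Here the resulting rule suggests changing $b$ from b1 to b2 in order to move the decision from 1 to 2, while the stable attribute $a$ is left untouched.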
The action layer for a site $i$, denoted by $Act_i$, contains $(r_1, r_2)$-action rules constructed from rules $r_1, r_2$ in a discovery layer $D_i$.
4 Distributed Knowledge System
In this section, we introduce the notion of a distributed knowledge system. By a Distributed Knowledge System (DKS) we mean $DS = (\{(S_i, D_i, Act_i)\}_{i \in I}, L)$, where $(\{S_i\}_{i \in I}, L)$ is a distributed information system, $D_i = \bigcup\{D_{ki} : (k, i) \in L\}$ is a discovery layer and $Act_i$ is an action layer for $i \in I$. Figure 1 shows the basic architecture of DKS (a query answering system QAS that handles global queries is also added to each site of DKS). Operational semantics reflects the dynamic nature of definitions of attribute values in a query (see [Ras and Zytkow]).
Figure 1: Distributed Knowledge System (DKS). [Diagram labels: Database; mining; Discovery Layer; Actions Layer; Knowledge Exchange; Operational Semantics; QAS.]
Figure 2: Query Answering System (QAS). [Diagram labels: global query; transformation engine based on logical axioms and operational semantics $N_i$; local query at site $i$ (lower approximation); QAS for site $i$.]
Figure 2 shows the part of QAS which is responsible for query transformation. This part of QAS can be replaced by the rough transformation engine shown in Figure 3. If for each non-local attribute we collect rules from many sites of DKS and then resolve all inconsistencies among them (see [Ras]), then the local confidence in the resulting operational definitions is high since they represent the consensus of many sites. Assume now that $N_i$ is a standard interpretation of global queries as introduced for instance in [Ras]. It corresponds to a pessimistic approach to
evaluation of global queries because of the way the non-local attribute values are interpreted (their lower approximation is taken). We can replace $N_i$ by a new interpretation $J_i$ representing an optimistic approach to evaluation of global queries. Namely, we define:
• $J_i(w) = X_i - N_i(\sim w)$,
• $J_i(\sim w) = X_i - N_i(w)$,
• $J_i(t) = N_i(t)$ for any other $t$.
In the optimistic approach to evaluation of queries, the upper approximation of non-local terms $w$, $\sim w$ is taken. Following this line of thought, we can propose rough operational semantics $R_i$ defined as $R_i(t) = [N_i(t), J_i(t)]$ for any global query $t$. Rough operational semantics has a natural advantage over either $N_i$ or $J_i$. Clearly, if interpretations $N_i$ and $J_i$ of a term $t$ give us the same sets of objects, then both approximations (lower and upper) are semantically equal.
Figure 3: Query Rough Answering System (QRAS). [Diagram labels: global query; rough transformation engine based on logical axioms and operational semantics $N_i$ and $J_i$; local query at site $i$ (upper approximation); local query at site $i$ (lower approximation); QRAS for site $i$.]
5 Query Answering Based on Reducts
In this section we recall the notion of a reduct (see [Pawlakf) and show how it can be used to improve query answering process in DKS. Let us assume that 5 — (X, A, V), is an information system and V — \J{Va '• a, € A}. Let B C A. We say that x,y £ X are indiscernible by B, denoted [x « B y], if (Vo € B)[a{x) — a(y)]. Now, assume that both Bi,-02 are subsets of A. We say that B\ depends on B2 if « B 2 C « B I - Also, we say that B\ is a covering of B2 if B2 depends on
24 B\ and B\ is minimal. By a reduct of A in 5 (for simplicity reason we say A-reduct of 5) we mean any covering of A. Example. Assume the following scenario:
• $S_1 = (X_1, \{c, d, e, g\}, V_1)$, $S_2 = (X_2, \{a, b, c, d, f\}, V_2)$, $S_3 = (X_3, \{b, e, g, h\}, V_3)$ are information systems,
• User submits a query $q = q(c, e, f)$ to the query answering system QAS associated with system $S_1$,
• Systems $S_1$, $S_2$, $S_3$ are parts of DKS.
Attribute $f$ is non-local for the system $S_1$, so the query answering system associated with $S_1$ has to contact other sites of DKS requesting a definition of $f$ in terms of $\{d, c, e, g\}$. Such a request is denoted by $\langle f : d, c, e, g \rangle$. Assume that the system $S_2$ is contacted. The definition of $f$, extracted from $S_2$, involves only attributes $\{d, c, e, g\} \cap \{a, b, c, d, f\} = \{c, d\}$. There are three $f$-reducts (coverings of $f$) in $S_2$. They are: $\{a, b\}$, $\{a, c\}$, $\{b, c\}$. The optimal $f$-reduct is the one which has the minimal number of elements outside $\{c, d\}$. Let us assume that $\{b, c\}$ is chosen as an optimal $f$-reduct in $S_2$. Then the definition of $f$ in terms of attributes $\{b, c\}$ will be extracted from $S_2$, and the query answering system of $S_2$ will contact other sites of DKS requesting a definition of $b$ (which is non-local for $S_1$) in terms of attributes $\{d, c, e, g\}$. If a definition of $b$ is found, then it is sent to QAS of site 1. Figure 4 shows the process of resolving query $q$ in the example above. We will use the graph in Figure 5 to represent visually the fact: $R[i]$ is an $a$-reduct at site $i$ containing attribute $b$. Let us adopt the following definition. By an $\langle a_1, A \rangle$-linear set of reducts we mean a set $\{\langle a_i, R[i] \rangle : 1 \le i \le k\}$ such that:
• $a_i \notin A$, for any $1 \le i \le k$,
• $a_{i+1} \in R[i]$, for any $1 \le i \le k - 1$,
• $R[i]$ is an $a_i$-reduct at site $i$ and $\mathrm{card}(A - R[i]) = 1$, for any $1 \le i \le k - 1$,
• $R[k] \subseteq A$.
Site 2 (system S2):
X2   a    b    c    d    f
x1   a1   b1   c1   d1   f1
x2   a1   b2   c2   d1   f2
x3   a1   b1   c1   d1   f1
x4   a1   b2   c2   d1   f2
x5   a2   b2   c1   d1   f3
x6   a2   b2   c1   d1   f3
x7   a3   b1   c3   d2   f4
x8   a3   b1   c3   d2   f4

Rules extracted at site 2 (step 2): b1*c1 -> f1, b2*c2 -> f2, b2*c1 -> f3, c3 -> f4.
Coverings of f: {a,b}, {a,c}, {b,c}. Covering {b,c} is chosen as the optimal one.

Site 3 (system S3):
X3   b    e    g    h
y1   b1   e1   g1   h1
y2   b1   e1   g2   h2
y3   b1   e1   g1   h1
y4   b1   e1   g2   h2
y5   b2   e2   g2   h1
y6   b2   e2   g2   h1
y7   b2   e3   g3   h3
y8   b2   e3   g3   h3

Coverings of b: {e}, {g,h}. Covering {e} is chosen as the optimal one.

Figure 4: Process of resolving a query by QAS in DKS
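The coverings listed in Figure 4 can be checked mechanically. The following Python sketch is only an illustration and not part of QAS: it encodes the two tables of Figure 4 (reading c1, g1 and h1 where the printed values are unclear), enumerates the minimal attribute sets that determine a target attribute, and counts how many attributes of a covering fall outside the set requested by site 1. The helper names `table`, `determines`, `coverings` and `outside` are hypothetical.

```python
from itertools import combinations

def table(attrs, rows):
    """Build a list of attribute->value dicts from whitespace-separated rows."""
    return [dict(zip(attrs, row.split())) for row in rows]

# Tables of Figure 4 (values as reconstructed from the figure).
SITE2 = table("abcdf", [
    "a1 b1 c1 d1 f1", "a1 b2 c2 d1 f2", "a1 b1 c1 d1 f1", "a1 b2 c2 d1 f2",
    "a2 b2 c1 d1 f3", "a2 b2 c1 d1 f3", "a3 b1 c3 d2 f4", "a3 b1 c3 d2 f4",
])
SITE3 = table("begh", [
    "b1 e1 g1 h1", "b1 e1 g2 h2", "b1 e1 g1 h1", "b1 e1 g2 h2",
    "b2 e2 g2 h1", "b2 e2 g2 h1", "b2 e3 g3 h3", "b2 e3 g3 h3",
])

def determines(rows, attrs, target):
    """True if objects indiscernible by attrs always agree on target."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[target]) != row[target]:
            return False
    return True

def coverings(rows, target):
    """All minimal attribute sets (by inclusion) that determine target."""
    candidates = sorted(set(rows[0]) - {target})
    found = []
    for size in range(1, len(candidates) + 1):
        for attrs in combinations(candidates, size):
            # Skip supersets of an already found covering: they are not minimal.
            if any(set(c) <= set(attrs) for c in found):
                continue
            if determines(rows, attrs, target):
                found.append(attrs)
    return found

def outside(covering, local_attrs):
    """Number of attributes of a covering that are not in local_attrs."""
    return len(set(covering) - set(local_attrs))

REQUESTED = {"c", "d", "e", "g"}  # attributes known at site 1
```

With these tables, `coverings(SITE2, "f")` yields the three coverings named in the figure and `coverings(SITE3, "b")` yields {e} and {g,h}. Note that {a,c} and {b,c} both have exactly one attribute outside the requested set, so the choice of {b,c} in the example is one of two equally good options, whereas {e} is the unique optimum for b.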
Figure 5: R[i]: a-reduct at site i containing attribute b
Figure 6: $\langle a_1, A \rangle$-linear set of reducts. [Diagram: a chain a1, R[1], R[2], ..., R[k-1], R[k]; R[1], R[2], ..., R[k-1] should have a minimal number of attributes outside A; R[k] is a subset of A.]
Figure 6 visually represents an $\langle a_1, A \rangle$-linear set of reducts. Clearly, the existence of an $\langle a, A \rangle$-linear set of reducts is sufficient for attribute $a$ to be definable in DKS. The existence of an $\langle a, A \rangle$-directed set of reducts (defined below) is necessary for attribute $a$ to be definable in DKS. By an $\langle a_1, A \rangle$-directed set of reducts we mean a smallest, non-empty set $\{\langle a_i, R[i], S_i \rangle : 1 \le i \le k\}$ such that:
• $a_i \notin A$, for any $1 \le i \le k$,
• $S_i$ is a site of DKS, for any $1 \le i \le k$,
• $R[i]$ is an $a_i$-reduct at site $S_i$, for any $1 \le i \le k$,
• (∀a ∈ ∪{R[i] : i [...]
[...] directed set of reducts, to guarantee the smallest number of steps needed to learn the definition of attribute $a_1$ while keeping the confidence of what we learn still the highest.

6 Conclusion
The query answering system for DKS can handle two types of queries:
• Queries asking for all objects at a site i which satisfy a given description (any attributes are allowed to be used here). In such a case, the query answering system will search for operational definitions of all attributes not existing at site i before it can process the query locally.
• Queries asking for actions which have to be undertaken in order to change the classification of some objects at site i. Such queries can be processed entirely at site i or moved for remote processing to other sites of DKS. In the latter case, operational definitions of all attributes from site i in terms of attributes from another site are needed. This problem, however, will be the topic of a separate paper.

References
1. Maluf, D., Wiederhold, G., "Abstraction of representation for interoperation", in Proceedings of Tenth International Symposium on Methodologies for Intelligent Systems, LNCS/LNAI, Springer-Verlag, No. 1325, 1997, 441-455
2. Navathe, S., Donahoo, M., "Towards intelligent integration of heterogeneous information sources", in Proceedings of the Sixth International Workshop on Database Re-engineering and Interoperability, 1995
3. Pawlak, Z., "Rough classification", in International Journal of Man-Machine Studies, Vol. 20, 1984, 469-483
4. Prodromidis, A.L., Stolfo, S., "Mining databases with different schemas: Integrating incompatible classifiers", in Proceedings of The Fourth Intern. Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1998, 314-318
5. Ras, Z., "Dictionaries in a distributed knowledge-based system", in Concurrent Engineering: Research and Applications, Conference Proceedings, Pittsburgh, Penn., Concurrent Technologies Corporation, 1994, 383-390
6. Ras, Z., "Resolving queries through cooperation in multi-agent systems", in Rough Sets and Data Mining (Eds. T.Y. Lin, N. Cercone), Kluwer Academic Publishers, 1997, 239-258
7. Ras, Z., Wieczorkowska, A., "Action Rules: how to increase profit of a company", in Principles of Data Mining and Knowledge Discovery, D.A. Zighed, J. Komorowski, J. Zytkow (Eds), Proceedings of PKDD'00, Lyon, France, LNCS/LNAI, No. 1910, Springer-Verlag, 2000, 587-592
8. Ras, Z., Zytkow, J., "Mining for attribute definitions in a distributed two-layered DB system", Journal of Intelligent Information Systems, Kluwer, Vol. 14, No. 2/3, 2000, 115-130
9. Ras, Z., Zytkow, J., "Discovery of equations to augment the shared operational semantics in distributed autonomous BD System", in PAKDD'99 Proceedings, LNCS/LNAI, No. 1574, Springer-Verlag, 1999, 453-463
10. Zytkow, J.M., Zhu, J., Zembowicz, R., "Operational definition refinement: a discovery process", in Proceedings of the Tenth National Conference on Artificial Intelligence, The AAAI Press, 1992, 76-81
APPROXIMATE REASONING BY AGENTS IN DISTRIBUTED ENVIRONMENTS
ANDRZEJ SKOWRON
Institute of Mathematics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland
E-mail: [email protected]
Solving complex problems by multi-agent systems in distributed environments requires new approximate reasoning methods based on new computing paradigms. One such recently emerging computing paradigm is Granular Computing. Granular computations are performed on information granules representing vague and complex concepts delivered by agents engaged in tasks such as knowledge representation, communication with other agents, and reasoning. In this paper, we present an outline of foundations for information granule calculi and methods for inducing relevant information granule constructions from data and background knowledge. These constructions can be interpreted as approximate reasoning schemes. The proposed methodology of approximate reasoning has been developed for solving complex problems in areas such as identification of objects by autonomous systems, web mining or sensor fusion.
1 Introduction
Information processing in intelligent systems, in particular in multi-agent systems, needs new soft-computing paradigms. The solutions derived by such systems should satisfy a given specification not necessarily exactly but up to a satisfactory degree. One can expect that, due to such relaxed specification constraints, robust solutions for complex problems can be derived efficiently by methods based on these new paradigms. One such recently emerging paradigm is Granular Computing based on rough set and rough mereological approaches (see, e.g., Zadeh and Kacprzyk [33], Zhong et al. [35], Lin [9], Polkowski and Skowron [22, 25, 26], Skowron and Stepaniuk [29], Nguyen et al. [11], Skowron [27]) as a way to achieve Computing with Words (see, e.g., Zadeh [32, 34]; Zadeh and Kacprzyk [33]). Granular computations are performed on information granules representing vague and complex concepts delivered by agents engaged in, for example, knowledge representation, communication with other agents, and reasoning. Our approach is related to logical aspects of perception (see, e.g., Zadeh [34]). Specifications of complex tasks are often formulated in words, phrases or more complex texts of a natural language. Hence, the following main problem arises: if and how can an information granule, in a sense, sufficiently close to the target information granule Gt representing the task specification, be
constructed from input information granules (e.g., representing sensor measurements). One of the important problems is related to the construction of an interface allowing knowledge acquisition agents (KA-agents) to acquire knowledge from customer-agents (CA-agents), who specify a task. The aim is to induce a satisfactory approximation Gk of the target information granule Gt in the language of KA-agents, i.e., an information granule Gk sufficiently close to (or included in) the target information granule Gt. Hence, some tools for expressing inclusion and proximity (closeness) of information granules measured by the degree of proximity are needed. For this purpose we use rough sets (see, e.g., Pawlak [14], Komorowski et al. [7]) and rough mereology (see, e.g., Polkowski and Skowron [18, 20, 22]). The interface construction should be supported by background knowledge (in particular, by an ontology of concepts) and experimental data. An information granule G sufficiently close to the information granule Gk delivered by KA-agents should be constructed from input information granules (representing, e.g., sensor measurements). In the search for granule G, relevant operations and inclusion (closeness) measures on information granules should be discovered and used. The granule G is constructed from basic components defined by information granule calculi. Any such calculus consists of components such as (i) elementary input information granules, (ii) operations on information granules, (iii) relations of inclusion and proximity measured by the proximity degree between information granules, and (iv) schemes of information granule construction which can be treated as approximate reasoning schemes (AR-schemes, for short) on information granules. Elementary information granules together with inclusion and proximity relations between such granules are primitive constructs in granule construction.
Higher level constructs, like information granules and related inclusion (closeness) relations, can be defined from previously constructed lower level constructs using relevant operations. Fusion operations are important operations on information granules. They are based on negotiation schemes for resolving conflicts between agents delivering arguments of operations. More complex operations are defined by robust AR-schemes. Such schemes are obtained by approximate reasoning rules and methods for their composition, dependent on available data and background knowledge. The robustness of AR-schemes means that the closeness (inclusion) of constructed granules is preserved to a satisfactory degree under small deviations of input granules (or of operation parameters used for the granule construction). The robustness of the target construction can be deduced from the robustness of its sub-constructions, if some constraints
for composition are satisfied. The robust AR-schemes should be extracted from experimental (e.g., sensory) data and/or background knowledge rather than from classical deduction mechanisms. The AR-schemes are parameterized. Relevant information granules are constructed by tuning AR-scheme parameters. There are several kinds of parameters tuned in the process of searching for relevant information granules. Some of them come from approximation spaces of agents that make it possible to obtain a proper generalization degree of the granule constructed in the inductive reasoning. Other parameters are related to agent teams and are used to tune measures of inclusion (closeness) between information granules and to tune propagation mechanisms of the inclusion (closeness) degrees along the AR-schemes. The AR-schemes in multi-agent systems can be treated as higher order neural networks, called rough neural networks, performing operations on information granules instead of numbers. One of the main problems of the new Rough Neurocomputing paradigm is to develop methods for inducing rough neural networks. In this paper, we outline an approach to the above-mentioned problems. Our approach is based on the foundations of a calculus on information granules developed by means of rough set and rough mereological approaches. Its aim is to create a methodology and tools for solving a wide class of complex problems ranging from the identification of road traffic situations by an unmanned aerial vehicle (see, e.g., www page of WITAS project [31]) to problems of text data mining on the Internet (see, e.g., Skowron [27], Kargupta and Chan [6]).

2 Information Granule Systems
In this section, we present a basic notion for our approach, i.e., the information granule system. Any such system $S$ consists of a set of elementary granules $E$ together with an operation $\{\cdot\}$ making collections of granules from finite sets of granules. A finite subset of the set generated from elementary granules using this operation is fixed. This subset is extended by means of other operations on information granules producing new information granules. Moreover, a family of relations with the intended meaning "to be a part to a degree" between information granules is distinguished. Degrees of inclusion are also treated as information granules. The degree structure is described by a relation "to be an exact part". More formally, an information granule system is any tuple

$S = (E, \{E\}, H, O, \nu, \{\nu_p\}_{p \in H})$    (1)

where
1. $E$ is a finite set of elementary granules;
2. $\{E\}$ is a finite subset of $P^{\omega}(E) = E \cup P(E) \cup P(E \cup P(E)) \cup \ldots$, where $P(X)$ denotes the powerset of $X$;
3. $H$ is a finite set of granule inclusion degrees with a binary relation $\nu \subseteq H \times H$ "to be an (exact) part"; $\nu$ defines on $H$ a structure used to compare the degrees by assuming, e.g., $p < q$ if and only if $\nu(q, p)$;
4. $O$ is a set of (partial) operations used to construct new granules from $\{E\}$; by means of operations from $O$, the set $\{E\}$ is extended to the set $G(S) \subseteq P^{\omega}(E)$ of granules generated from $\{E\}$ using operations from $O$;
5. $\nu_p \subseteq G(S) \times G(S)$ is a binary relation "to be a part to a degree at least $p$" between information granules from $G(S)$.
One can consider the following examples of the set $E$ of elementary granules: (i) a set of descriptors of the form $(a, v)$ where $a \in A$ and $v \in V_a$ for some finite attribute set $A$ and value sets $V_a$, and (ii) a set of descriptor conjunctions. The set $\{E\}$ consists of granules constructed by means of an operation $\{\cdot\}$ making collections from already constructed granules. Examples of such granules are tolerance granules created by means of a similarity (tolerance) relation between elementary granules, decision rules, sets of decision rules, sets of decision rules with guards, information systems or decision tables (see, e.g., Polkowski and Skowron [22], Skowron and Stepaniuk [29], Skowron [27]). The most interesting class of information granules consists of information granules specified in natural language and their approximations by means of experimental data tables and background knowledge. One can consider as an example of the set $H$ of granule inclusion degrees the set of binary sequences of a fixed length with the relation $\nu$ "to be a part" defined by the lexicographical order.
This degree structure can be used to measure the inclusion degree between granule sequences or to measure the matching degree between granules representing classified objects and granules describing the left-hand sides of decision rules in simple classifiers (see, e.g., Polkowski and Skowron [25]). However, one can consider more complex degree granules by taking as the degree of inclusion of granule $g_1$ in granule $g_2$ the granule being a collection of common parts of these two granules $g_1$ and [...]

[...]
<!ELEMENT content (itinerary+, querydeff, results)>
<!ELEMENT itinerary (server)+>
<!ELEMENT server2 (Cardiff | Leece | Caltech)>
<!ELEMENT results (#PCDATA)>
Figure 1: A DTD for Agent Message Communication
Message type represents intentions such as request, response, failure, and refuse explicitly and allows the system to monitor and control the progress of the interaction. For example, we can define one message for a request to search for tracks, and another message for information passing to return tracks. Context is used to identify the sender, the intended recipient of the message, or the originator of forwarded messages, using some form of local, regional, or global naming scheme. Returnby sets a deadline for the user's waiting time.
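To make the message structure more concrete, the following Python sketch parses one possible message instance with the standard xml.etree.ElementTree module. It is an illustration under assumptions, not the SARA implementation (which uses JAXP): only the content, itinerary, server and results elements and the site names come from the DTD fragment in Figure 1, while the message and context layout, the attribute names, the returnby unit and the querydef element name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical message instance. Only <content>, <itinerary>, <server>,
# <results> and the site names follow the DTD fragment; the rest is assumed.
SAMPLE_MESSAGE = """
<message type="request">
  <context sender="upa-1" receiver="ura-1"/>
  <returnby>60</returnby>
  <content>
    <itinerary>
      <server><Cardiff/></server>
      <server><Caltech/></server>
    </itinerary>
    <querydef>tracks over region X</querydef>
    <results/>
  </content>
</message>
"""

def parse_message(text):
    """Extract the fields an agent needs before acting on a message."""
    root = ET.fromstring(text.strip())
    context = root.find("context")
    return {
        "type": root.get("type"),
        "sender": context.get("sender"),
        "receiver": context.get("receiver"),
        # Deadline for the user's waiting time (unit assumed: seconds).
        "returnby": int(root.findtext("returnby")),
        # Each <server> wraps one site element; keep the site names in order.
        "itinerary": [site.tag for server in root.iter("server") for site in server],
        "query": root.findtext("content/querydef"),
    }

message = parse_message(SAMPLE_MESSAGE)
```

Here `message["itinerary"]` gives the ordered list of data servers a mobile agent would visit, and `message["type"]` lets the receiver decide how to handle the message.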
Content defines the itinerary of the agent and the user's request wrapped in XML, as well as the form in which results are returned. We define a set of DTDs for agent communication in the SARA system that specifies all of the legal message types, constraints on the attributes, and message sequences. In XML-based messages, agents encode information with meaningful structure and commonly agreed semantics. On the receiving side, different parts of the information can be identified and used by different services. Agents may use XML to explain their beliefs, desires, and intentions (BDI). Moreover, a mobile agent can carry an XML front-end to a remote data archive for data exchange, where both queries and answers are XML-encoded. We have currently identified various types of messages for agent interaction, such as upa-ura messages, ura-lia messages, and lia-upa messages. Messaging is performed synchronously, so that the URA is launched as soon as a message is received from the UPA. A lia-upa message is sent from a LIA to a related UPA when the tasks are finished. In our system, we use the JAXP interface to XML developed by Sun, which supports SAX and the Document Object Model (DOM).

3.2 XML-based data specification
In our system we use XML to encode system structure as metadata. The metadata base consists of four tables. The Track table houses information about the track image such as its name, date of acquisition, unique id, width, height, and number of channels. The Coords table contains the latitude and longitude coordinates of the four vertices of the image. In the File table the filenames of the files constituting the image are recorded, and finally the Stored table contains the information about where the image is actually stored, that is, on which of the data servers that compose the distributed digital library.

4 Implementation and application of the system
We have developed a simple prototype of an agent-based SARA system [4]. A subset of the LIAs and UIAs has been implemented to provide the basic functionality for on-demand processing of SARA data. We also implemented the following functions: locate a mobile agent, send a URA to the specified data server or compute server, retrieve URA-related information from the management server, stop the execution of an agent and release the resource held by the agent, and report agent status. The current system can be employed in a wide range of application domains, such as the analysis of multi-temporal images corresponding to changes in the ecology of a particular region, and studies of environmental pollution. SARA images can also be compared based on phase and amplitude differences of the
backscatter radiation, to study seismic or volcanic processes, motions of ice-sheets or glaciers, or other similar geological events. Support for real time processing can facilitate frequent overpassing of satellites over a given region in case of natural disasters such as forest fires or flash floods. The agent-based approach provides a useful system for enabling such applications to be more effectively deployed due to the reasons mentioned above, and involves an integration of data resources, various types of physical storage media, and various compute servers and analysis algorithms. A detailed description of agent-based data analysis is given in [3].

5 Conclusions and future work
An agent-based framework for managing access to data, supporting parallel queries to data repositories, and providing an XML-based data model for integrating data from different repositories is outlined. Our approach utilises stationary agents which undertake specific roles, and mobile agents which can carry analysis algorithms to data repositories. We use XML to model metadata and support agent communication. We are currently working on the further enhancement of the user interface for formulating queries and visualizing data. We are integrating more sophisticated agents into the system. Another important task for future study is to investigate system scalability and response times when thousands of agents are hosted on a single data source.
References
1. Aloisio G., Milillo G., Williams R.D., An XML architecture for high-performance web-based analysis of remote-sensing archives, Future Generation Computer Systems 16 (1999) 91-100
2. Coddington P.D., Hawick K.A., James H.A., Web-based access to distributed high-performance geographic information systems for decision support, In Proc. HICSS-32, Maui, January 1999.
3. Rana Omer F., Yang Yanyan, Georgousopoulos Christos, Walker David W., Williams Roy, Agent Based Data Analysis for the SARA Digital Library, Workshop on Advanced Data Storage / Management Techniques for HPC, Warrington, UK, 23rd - 25th February 2000.
4. Yang Yanyan, Rana Omer F., Georgousopoulos Christos, Walker David W., Williams Roy, Mobile Agents and the SARA Digital Library, In Proceedings of the IEEE Advances in Digital Libraries 2000, Washington DC, Published by IEEE Computer Society, May 2000.
CHAPTER 5 DISTRIBUTED INTELLIGENCE
AN INTELLIGENT CHANNEL ALLOCATION SCHEME FOR MOBILE NETWORKS: AN APPLICATION OF AGENT TECHNOLOGY
ELIANE L. BODANESE
Centro Federal de Educacao Tecnologica do Parana, Av. Sete de Setembro, 3165, Curitiba, Parana, Brazil, 80230-901
E-mail: [email protected]
LAURIE G. CUTHBERT
Dept. of Electronic Engineering, Queen Mary and Westfield College - University of London, Mile End Road, London, E1 4NS, England
E-mail: [email protected]
As the demand for mobile services grows, techniques for increasing the efficiency of channel usage in mobile networks become more important. Radio resource flexibility is needed to cope with the limited frequency spectrum available for network operators. The frequency channel allocation in mobile networks is a complex control problem with specific goals, i.e., to minimize the call blocking probability of the network and minimize the delay in channel assignment. This paper proposes a multi-agent system implementation to control frequency channel allocation in mobile networks. The internal agent architecture allows base stations to be more flexible and intelligent, negotiating and co-operating with others to improve the efficiency of the channel allocation scheme. The simulation results prove that the use of agent technology in controlling the allocation of channels is feasible and the agent negotiation is an important feature of the system in order to improve perceived quality of service and to improve the load balancing of the traffic.
1 Introduction
Mobile networks were first implemented following the specifications of the so-called cellular concept [1]. This cellular network architecture is composed of hexagonal cells representing geographic areas. The users, called mobile stations (MS) or mobile users, are able to start and receive communication while moving inside the cellular network. Each cell has a base station (BS), which supplies frequency channels to the mobile stations inside its boundaries. The base stations are linked to a mobile switching centre (MSC) responsible for controlling the calls and acting as a gateway to other networks. When a mobile station using a frequency channel reaches the boundary of the cell, it needs to change its current frequency channel for another belonging to the neighboring cell; this procedure is known as handoff or handover. The assignment of frequency channels in the cellular concept is fixed, i.e., the total number of frequency channels allocated to a network operator is divided into sets and each set is assigned statically to a cell. The set of frequency channels used in one cell can be used in other cells distant enough to allow the reuse of the frequency channels without causing interference problems. These cells are called co-channel cells and the distance between them is the co-channel reuse distance. The group of cells using different sets of channels forms a cluster of cells that is called a compact pattern. The frequency reuse layout of a cellular system is assembled following a scheme [1] that finds the nearest co-channel cells of any cell of the network. Fig. 1 illustrates the frequency reuse layout of a network with a 7-cell compact pattern.
Figure 1. Frequency reuse layout of a 7-cell compact pattern cellular network.
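As a brief numerical illustration of the compact pattern, the sketch below uses the standard hexagonal-geometry relations, which are not stated in this paper and are quoted here as background: a cluster size N must have the form N = i^2 + ij + j^2 for non-negative integers i and j, and the ratio of the co-channel reuse distance D to the cell radius R is sqrt(3N). The function names are hypothetical.

```python
import math

def cluster_size(i, j):
    """Cluster size N = i^2 + i*j + j^2 for hexagonal shift parameters i, j."""
    return i * i + i * j + j * j

def reuse_ratio(n):
    """Co-channel reuse ratio D/R = sqrt(3N) for a cluster of n cells."""
    return math.sqrt(3 * n)

def valid_cluster_sizes(limit):
    """All cluster sizes up to limit that a hexagonal layout allows."""
    sizes = {
        cluster_size(i, j)
        for i in range(limit + 1)
        for j in range(limit + 1)
        if (i, j) != (0, 0)
    }
    return sorted(n for n in sizes if n <= limit)
```

For the 7-cell compact pattern of Fig. 1, `cluster_size(2, 1)` gives 7 and `reuse_ratio(7)` is sqrt(21), roughly 4.58 cell radii between co-channel cells.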
The frequency channel allocation scheme as introduced in the cellular concept is called Fixed Channel Assignment (FCA), which is simple, but strongly constrains channel utilization. If the traffic demand in a cell is greater than the number of nominal channels (i.e., the frequency channels assigned to that cell), all the excess demand is blocked, regardless of the traffic conditions in other cells. This constraint is too limiting for mobile networks, and several strategies have been proposed to maximize frequency channel allocation and minimize the traffic blocking rate. The strategies have been divided mainly into two groups: variants of FCA and Dynamic Channel Assignment (DCA). The strategies using FCA differ from the original concept by allowing a cell to use channels from its neighboring cells when there are no more available channels to serve incoming calls or handoff requests. One example is the channel borrowing scheme with channel locking [2], which borrows a channel from an adjacent cell but prevents the use of the borrowed channel in the co-channel cells of the lender that are near to the borrower, taking into account the co-channel interference constraints. DCA strategies [3] do not have nominal channels; all channels are kept in a central pool. When there is a channel request in one cell, the MSC chooses the appropriate channel that gives maximum channel efficiency, taking into account the co-channel interference constraints. The channels are assigned for the duration of a call and then they are returned to the central pool or reallocated. A detailed survey describing DCA schemes and FCA-based schemes, including those with channel borrowing, is given in [3].
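The difference between plain FCA and an FCA variant with borrowing can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the scheme of [2]: the `Cell` class and both functions are hypothetical, and a lent channel is locked only in the lender itself, whereas the scheme described above also locks it in the lender's co-channel cells that are near the borrower.

```python
class Cell:
    """A cell with a fixed set of nominal channels (simplified model)."""

    def __init__(self, name, channels):
        self.name = name
        self.free = set(channels)  # nominal channels currently idle
        self.busy = set()          # channels carrying calls in this cell
        self.locked = set()        # nominal channels lent out, unusable here

    def assign_fca(self):
        """Plain FCA: serve the call from nominal channels or block it."""
        if not self.free:
            return None            # call blocked
        channel = min(self.free)
        self.free.remove(channel)
        self.busy.add(channel)
        return channel

def assign_with_borrowing(cell, neighbours):
    """FCA variant: when nominal channels run out, borrow from a neighbour."""
    channel = cell.assign_fca()
    if channel is not None:
        return channel
    # Choose the neighbour with the most idle channels as the lender.
    lender = max(neighbours, key=lambda n: len(n.free), default=None)
    if lender is None or not lender.free:
        return None                # no neighbour can lend: call blocked
    channel = min(lender.free)
    lender.free.remove(channel)
    lender.locked.add(channel)     # the lender may not use it while it is lent
    cell.busy.add(channel)
    return channel

cell_a = Cell("A", [1, 2])
cell_b = Cell("B", [3, 4])
first = assign_with_borrowing(cell_a, [cell_b])
second = assign_with_borrowing(cell_a, [cell_b])
blocked_under_fca = cell_a.assign_fca()
borrowed = assign_with_borrowing(cell_a, [cell_b])
```

In this run cell A serves two calls from its nominal channels, a third call would be blocked under plain FCA, and with borrowing it is served on a channel taken from cell B, which in turn loses the use of that channel.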
Centralized schemes generally perform better, but the communications overhead is a real problem, causing several authors to propose modifications to make the schemes more distributed. One example is the Distributed Load Balancing with Selective Borrowing (D-LBSB) scheme [2], which performs better than its centralized version and also outperforms some other existing schemes. The schemes referenced here are purely reactive algorithms: although they give better performance than FCA, signaling is still wasted because of the high degree of failure under increasing traffic loads. Fully reactive algorithms are not able to detect the drop in their efficiency, as will be shown in Section 4. In order to avoid wasting signaling resources, base stations must detect the situations where the efficiency of the algorithm is low and decide on different ways to improve the efficiency of the algorithm or to stop its execution. For this reason, we propose a multi-agent system that allows base stations to be more flexible, negotiating and cooperating with others to improve the efficiency of the channel assignment scheme. There is one agent in each base station, which is responsible for the control of the assignment of channels in the cell where it is located.

2 A Channel Allocation Scheme Based on Agents
Several agent frameworks have been proposed for control management in telecommunication networks. More specifically, the adoption of a layered multi-agent control system seems to fit the requirements of distribution, intelligence, robustness and concurrency of telecommunication network control, as we can see in Somers [4]. A layered arrangement allows levels of co-ordination; each layer is defined to conduct the control of the network to a certain level of competence. In our approach, we selected a type of multi-agent system whose layered control is located inside the agents. The agents were based on a hybrid architecture called INTERRAP (INTEgration of Reactive behavior and RAtional Planning) proposed by Müller [5]. This architecture is composed of three hierarchical control layers, a knowledge base that supports the representation of different abstraction levels of knowledge, and a well-defined control architecture that ensures coherent interaction among the control layers. Each layer has a different functionality, and the layers, working concurrently, complete the global behavior of the agent. INTERRAP was designed to react to unexpected events, to plan long-term actions based on goals, and to cope with interactions with other agents. Its layered control approach and its mechanisms allowing negotiation among agents made this architecture suitable for optimizing channel allocation in mobile networks, and it was therefore employed and adapted for this work. For the cellular network scenario, the layered control of the agent is structured to include a reactive layer that is responsible for fast accommodation of traffic demand, a local planning layer that uses other strategies to optimize the local load distribution of channels, and a co-operative layer, responsible for load balancing across a larger area. The knowledge base is composed of a world model that contains the environment information and everything necessary for the operation of a channel allocation algorithm; a mental model that contains the complete information about the agent, about the use of frequency channels and possibly the history of traffic load in the cell; and a social model that has relevant information about other agents' data. The agent is prepared to receive input from the environment that includes requests for channel allocation from new calls, handoff requests, borrowing channel requests and orders for locking channels. The actions performed by the agents over the environment include all execution tasks that actually allocate, release, re-allocate, lock and lend channels, manage handoffs and terminate unsuccessful requests appropriately.

3 Functional Specification of the Agent
3.1 Reactive Layer
The Reactive Layer is primarily responsible for the channel assignment and is basically composed of a FCA algorithm with channel borrowing and channel locking. The algorithm called "Distributed Borrowing Algorithm " (D-B A) is based on the general behavior of the D-LBSB scheme [2]. Using an algorithm similar to one of the distributed schemes described in the literature allows a better evaluation of the performance of the agent and also allows the benefits of the Local Planning and Co-operative Planning layers to be evaluated against the same reactive scheme. In the D-BA scheme, when the channel availability in a cell decreases to a certain threshold, it becomes "hot"; cells above the threshold are "cold" cells. When a cell becomes hot, it triggers the execution of the borrowing channel algorithm. The neighboring cells with more channel availability are chosen to lend channels to the hot cell, they are the lenders. The hot cell expects to get a number X of channels (determined by the algorithm). The algorithm borrows channels until the number X is reached or all selected lenders are not able to lend channels anymore, what comes first. Full details of the D-BA scheme is found in [6]. 3.2
3.2 Local Planning Layer
This layer is responsible for the channel re-assignment scheme. Every time a channel is released, the Reactive Layer requests a re-assignment decision in order to improve the channel usage locally. The Local Planning Layer also determines a departing region r close to the borders of the cell. Mobile users inside this region are considered to be departing from the cell, and they play an important role in the agent negotiation. Finally, this layer monitors the efficiency of the algorithm in the Reactive Layer and is responsible for deciding when to trigger the Co-operative Planning Layer.
3.3 Co-operative Planning Layer
The Co-operative Planning Layer is responsible for the negotiation of resources. When the local channel allocation algorithm on its own is not sufficient to keep call blocking rates low, one way to alleviate the load of a hot spot is to move calls to less loaded regions. This can be done through management handoffs (or traffic handoffs, as they are known in the standards), but this is not an easy task. Only mobile users close to certain borders of the cells can attempt the management handoffs. The handoff attempts must be made in a co-ordinated manner to avoid a mobile user being shifted back and forth between two cells. The handoff of mobile users to different cells is a load-balancing problem, so the co-ordinated control needed to solve it depends on the collaboration of a group of cells. The co-ordination of a joint plan using management handoffs is the responsibility of the Co-operative Planning Layer of the agents. The agent negotiation has two phases: first, finding the best region to attempt the movement of calls; second, the actual execution of the handoff requests in a co-ordinated manner, i.e. the execution of the joint plan. In order to engage and collaborate in a joint plan, the agents communicate through the Contract-Net Protocol (CNP) [7]. In this work, the agent implementation uses a subset of the communicative acts specified in the FIPA Agent Communication Language (ACL) [8]. The execution of the protocol during the two-phase agent negotiation is best explained through an example. In Fig. 2, cell A is hot and the local channel algorithm is not responding efficiently. The Local Planning Layer triggers the Co-operative Planning Layer to start the negotiation with other cells. The following actions are taken in the negotiation:
1. In the hot cell, agent A is a manager candidate agent and it sends a call for proposals (cfp(0)) to all its co-channel cell agents $A'_i$, where i can vary from 1 to the current number of co-channel cells of A (maximum 6).
2. The $A'_i$ agents advertise the need for channel availability assessment to their neighboring cell agents $B'_{ij}$, $1 \le j \le 6$, by sending cfp(1) (cell load assessment and plan engagement availability).
3. Each $B'_{ij}$ agent sends to the respective $A'_i$ agent a propose(1) act if the cell is able to engage in a joint plan in the near future, or a refuse(1) act otherwise. Inside the propose(1) act, the $B'_{ij}$ agent sends its degree of coldness (dc) and the percentage of the mobile users in the cell (Depart) which are inside the departing region r of selected cell borders, called regions of movement. The regions of movement in each cell are pre-defined according to the position of the cell of agent A.
4. Each $A'_i$ agent receives the answers of its neighboring $B'_{ij}$ agents and computes the value of the utility function (Equation 3), where Reg is the minimum set containing $A'_i$ and its neighbors $B'_{ij}$ that have sent propose(1) acts ($1 \le |Reg| \le 7$). The factor $\alpha$ is introduced in order to decrease the influence of Depart over F' ($0 < \alpha < 0.25$):
$$F' = \frac{1}{|Reg|} \sum_{k \in Reg} (dc_k + \alpha \cdot Depart_k) \qquad (3)$$
5. The $A'_i$ agents that were able to perform the calculation of F' send the result to agent A in propose(0) acts. The $A'_i$ agents that did not have enough propose(1) acts from their neighbors to calculate F' send refuse(0) acts.
6. The received propose(0) act with the biggest F' value is chosen to be the region for moving the calls (if F' is greater than a minimum value). Agent A advertises the result of the auction to the winning co-channel cell agent with an accept-proposal(0) act. If there is no winning region, then agent A sends reject-proposal(0) to all $A'_i$ agents that have sent propose(0) acts and aborts the joint plan attempt for a specific duration of time.
7. If there is a winning region, then the co-channel cell agent of this region sends cfp(2) (engage joint plan) to its neighboring $B'_{ij}$ agents.
8. Each $B'_{ij}$ agent receiving the cfp(2) assesses its availability to engage in the joint plan, considering the number of plans it is already participating in and the regions of movement already being considered in such plans. It sends a propose(2) act if the number of currently engaged plans is less than two and the regions of movement (if engaged in another plan) match the requesting one. Otherwise, it sends a refuse(2) act.
9. If the winning co-channel cell agent receives back a minimum number of propose(2) acts from its neighboring $B'_{ij}$ agents, it sends back an inform(jp) (inform joint plan) act to agent A and sends accept-proposal(2) acts to all of its $B'_{ij}$ agents that have sent propose(2) acts. Otherwise it sends a failure(jp) (joint plan failure) act to agent A and reject-proposal(2) acts to its $B'_{ij}$ agents that have sent propose(2) acts.
10. The winning co-channel cell agent that has just sent an inform(jp) and its $B'_{ij}$ agents perform all preparatory tasks to engage in the joint plan and wait for an inform(activejp) (inform joint plan activation) act from agent A.
11. If agent A receives an inform(jp) act, it sends a reject-proposal(0) to all other co-channel cell agents that have sent propose(0) acts before, and a request(jp) (request joint plan engagement) act to its two neighboring cell agents in connection with the winning region. This request is mandatory. Finally, agent A sends an inform(activejp) act to all agents engaged in the joint plan (first joint plan execution act). If agent A receives a failure(jp) act, it selects the next best F' (if one exists) and actions 6 to 11 are repeated.
12. An agent receiving a request(jp) act performs all preparatory tasks to engage in the joint plan and waits for an inform(activejp) act from agent A.
13. End of the first phase of negotiation.
The second phase of the negotiation starts with the engagement of all agents belonging to the winning region, the manager agent A and its two neighboring cell
agents into the joint plan (shaded region in Fig. 2). Agent A is the manager of the joint plan and the other partner agents are the contractors of the plan [7]. The manager has the responsibility to monitor the actions of the contractors and to terminate the joint plan. Each iteration of the joint plan needs to be feasible. Therefore, the proposed heuristic follows a resource-oriented approach of market-based control. The aim is to load-balance the whole region so that the difference in degree of coldness between partner cells is smaller than a certain threshold.
Figure 2. Negotiation Strategy.
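The region selection in the first negotiation phase (the utility function of Equation 3 followed by the choice of the largest F' above a minimum value) can be sketched as follows. The function names, the dictionary of proposals and the example numbers are hypothetical; only the averaging of dc + α·Depart over Reg and the minimum-value test come from the text.

```python
def region_utility(dc, depart, alpha=0.1):
    """F' of one candidate region: mean of dc_k + alpha * Depart_k over Reg.

    dc and depart hold one value per member of Reg (the co-channel cell
    and those neighbors that answered with propose(1)).
    """
    if not dc or len(dc) != len(depart):
        raise ValueError("dc and depart must be non-empty and of equal length")
    if not 0 < alpha < 0.25:
        raise ValueError("alpha is expected to lie in (0, 0.25)")
    return sum(d + alpha * p for d, p in zip(dc, depart)) / len(dc)


def select_region(proposals, f_min):
    """Pick the region with the largest F'; None if no F' exceeds f_min.

    proposals maps a region identifier to the F' value received in a
    propose(0) act.
    """
    if not proposals:
        return None
    best = max(proposals, key=proposals.get)
    return best if proposals[best] > f_min else None
```

For example, `select_region({"A1": 0.63, "A2": 0.40}, f_min=0.5)` returns `"A1"`, while a call in which no region exceeds `f_min` returns `None`, which corresponds to agent A aborting the joint plan attempt.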
The following heuristic tries to balance the region by distributing users among cells:
1. The manager agent A sends its first act to all partner agents to inform them that the joint plan is in operation (inform(activejp) act).
2. All partner agents receiving the inform(activejp) act send an inform(ptrnjp) (partner cell in the joint plan) act to their manager agent, identifying themselves and their neighboring cells in the regions of movement.
3. Iteration:
a) The manager agent sends a query-ref(0) act to all partner agents.
b) Each partner agent sends its total number of channels and the number of channels in use to the manager agent through an inform-ref(0) act.
c) The manager agent computes the rate of change ($\Delta c_i$) for each partner agent and itself by calculating the difference between the channel occupancy of the cell ($c_i / C_i$) and the average channel occupancy of all members (N) of the joint plan ($L_{avg}$):
$$\Delta c_i = \frac{c_i}{C_i} - L_{avg} \qquad (4)$$
$$L_{avg} = \frac{1}{N} \sum_{i=1}^{N} \frac{c_i}{C_i}, \quad \forall i \in N \qquad (5)$$
$c_i$ is the total number of channels in use in the cell of agent i. $C_i$ is the total number of channels (nominal + borrowed) in the cell of agent i. $L_{avg}$ is the average channel occupancy of all cells of the joint plan. $\Delta c_i$ is the rate of change in channel occupancy of cell i inside the joint plan.
d) If the cell of agent i has $\Delta c_i > 0$, the manager agent sends to agent i: $\Delta c_i$, the $\Delta c$ of the neighboring cells having borders with the regions of movement of the cell of agent i, and the total number of channels of these cells (C). It also sends $L_{avg}$. This information is sent through a request(jpaction) (joint plan action) act.
e) Each agent i that receives the request(jpaction) act from the manager agent tries to transfer mobile users in the regions of movement (departing areas) following the algorithm:
I. Sort the received $\Delta c$ of the neighboring cells.
II. If $\Delta c_i$ is smaller than min $\Delta c$, then no transfers can be made; go to step f). Otherwise, go to step III.
III. Calculate how many mobile users need to be transferred: $users = \Delta c_i \cdot C_i$.
IV. If min $\Delta c$ is greater than $L_{avg}$, then transfer one mobile user to the neighboring cell with min $\Delta c$; go to step VIII. Otherwise, go to step V.
V. Sort only the $\Delta c$ that are smaller than or equal to $L_{avg}$. The aim is to transfer mobile users proportionally to the number of channels available in each target neighboring cell with $\Delta c$ smaller than or equal to $L_{avg}$.
VI. For all sorted $\Delta c$, find the number of mobile users that the cell can receive. For $\Delta c_j$ of cell j: $us_j = -\Delta c_j \cdot C_j$.
VII. To find the proportion of mobile users that will be attempted to transfer to each cell, sum all $us_j$: $US = \sum_{j=1}^{m} us_j$. The proportion of mobile users for each cell is: $\min\left(\frac{us_j}{US}\right) \cdot users$.
VIII. Do the handoff attempts.
f) End of the iteration.
4. Repeat this iteration at intervals of s seconds until the manager decides to terminate the joint plan. When the plan is to be terminated, the manager agent sends a cancel(jp) (cancel joint plan) act to announce the termination of the plan. The termination of the joint plan can be determined by the completion of a certain number of iterations or by an exception.
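One iteration of the heuristic can be sketched in code, under stated assumptions. `occupancy_deltas` follows Equations (4) and (5) directly. `plan_transfers` follows steps I to VII literally, with three choices that are not in the text and are therefore assumptions: neighbors whose computed capacity is not positive are skipped, the garbled "min" in step VII is read as capping each share at the capacity of the receiving cell, and shares are returned as fractional values that a caller would round before attempting handoffs. All function and parameter names are hypothetical.

```python
def occupancy_deltas(in_use, total):
    """Eqs. (4)-(5): occupancy of each cell minus the joint-plan average."""
    occupancy = {cell: in_use[cell] / total[cell] for cell in in_use}
    l_avg = sum(occupancy.values()) / len(occupancy)
    return {cell: occ - l_avg for cell, occ in occupancy.items()}, l_avg


def plan_transfers(delta_i, total_i, nbr_delta, nbr_total, l_avg):
    """Steps I-VII for cell i; returns {neighbor: users to attempt to hand off}."""
    min_cell = min(nbr_delta, key=nbr_delta.get)            # step I
    if delta_i < nbr_delta[min_cell]:                       # step II
        return {}
    users = delta_i * total_i                               # step III
    if nbr_delta[min_cell] > l_avg:                         # step IV
        return {min_cell: 1.0}
    capacity = {j: -nbr_delta[j] * nbr_total[j]             # steps V-VI
                for j in nbr_delta if nbr_delta[j] <= l_avg}
    # Assumption: a neighbor with non-positive capacity receives nothing.
    capacity = {j: us for j, us in capacity.items() if us > 0}
    total_capacity = sum(capacity.values())                 # step VII
    if total_capacity == 0:
        return {}
    # Assumption: each share is capped at what the target cell can receive.
    return {j: min(us, us / total_capacity * users)
            for j, us in capacity.items()}
```

With three cells of 10 channels each and 9, 4 and 2 channels in use, the average occupancy is 0.5, the busiest cell has a rate of change of about 0.4, and the sketch proposes moving roughly one user to the second cell and three to the third.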
4 Implementation and Analysis
The results of the multi-agent system are compared against the conventional mobile network using the FCA and a mobile network using only the D-BA scheme under common traffic load scenarios. The network performance measurements used for the comparison are the traffic blocking rate and handoff rejection rate. For simplicity, the cellular networks being compared are identified by their channel allocation
schemes: FCA, D-BA and MA for the multi-agent system. The expected behavior of the MA network is to improve on the performance of the D-BA scheme when the efficiency of the latter's borrowing algorithm starts to decrease. This improvement also starts to decrease as the traffic load increases, because fewer resources are available for load balancing. A common cellular network was modeled in OPNET™ and validated against a mathematical model. The cellular network is composed of 49 cells and each cell has 10 nominal channels. The compact pattern is a 7-cell cluster with a reuse distance of 3 cell units. Mobile users have their own trajectories inside the mobile network. Call establishments and handoff requests are simulated as they are requested in AMPS systems [1]. A Poisson distribution is used to generate calls, which have an exponentially distributed duration with a mean of 3 min. An idle mobile inside the cell performs the call attempt. The performance of the three systems was analyzed under four layouts with different traffic distributions. The results of one layout (Fig. 3) are presented here. In Fig. 3, the number at the bottom of each cell is the cell identification; the number in the middle gives the Poisson arrival rate in calls/hour (ranging from 20 to 200 calls/hour). Mobile users inside the shaded area drive at 40 km/h or walk at 2 km/h (in both directions). The remaining mobile users have random trajectories, moving at 5 km/h. In the results, the abscissa of each graph is the percentage of load increase in all cells compared to the traffic load shown in Fig. 3, called the base load (0%).
Figure 3. Cellular system layout.
The MA network outperforms the D-BA and the FCA networks (Fig. 4) and the expected general behavior of the MA network is demonstrated. The number of borrowing attempts is kept at the same level as the D-BA network, and at almost the same efficiency ((successful + partial successful outcomes) / total number of
borrowing algorithm executions) (Fig. 5). This shows that the reductions in blocking rate are due to the agent negotiation performing the load balancing.
$\zeta_{\eta,0}$ and $\zeta_{\hat{\eta},0}$, where $\hat{\eta}$ represents all nodes other than $\eta$. We call this quantification learnability [15]. A linear approximation to the learnability in the vicinity of $\zeta$ is the differential learnability $\lambda_{\eta,g_\eta}(\zeta)$:
$$\lambda_{\eta,g_\eta}(\zeta) \equiv \frac{\|\nabla_{\zeta_{\eta,0}} g_\eta(\zeta)\|}{\|\nabla_{\zeta_{\hat{\eta},0}} g_\eta(\zeta)\|} \qquad (1)$$
It can be proven that in many circumstances, especially in large problems, WLU has much higher differential learnability than does the team game choice of private utilities [15]. (Intuitively, this is due to the subtraction occurring in the WLU's removing a lot of the noise.) The result is that convergence to optimal G with WLU is much quicker (up to orders of magnitude so) than with a team game. However, the equivalence class of utilities that are factored for a particular G is not restricted to the associated team game utility and clamp-to-0 WLU. Indeed, one can consider solving for the utility in that equivalence class that maximizes differential learnability. An approximation to this calculation is to solve for the factored utility that minimizes the expected value of $[\lambda_{\eta,WLU_\eta}]^{-2}$, where the expectation is over the values $\zeta_{,0}$. A number of approximations have to be made to carry out this calculation [15]. The final result is that $\eta$ should clamp to its empirical expected average action, where that average is over the elements in its training set [23]. Here, for simplicity, we do not actually make sure to clamp each $\eta$ separately to its own average action, a process that involves $\eta$ modifying what it clamps to in an online manner. Rather we clamp all agents to the same average action. We then made the guess that the typical probability distribution over actions is uniform. (Intuitively, we would expect such a choice to be more accurate at early times than at later times in which agents have "specialized".)
3 The Bar Problem
We focus on the following six more general variants of the bar problem investigated in our earlier work [20]: There are N agents, each picking one out of seven actions every week. Each action corresponds to attending the bar on some particular set of l out of the seven nights of the current week, where $l \in \{1,2,3,4,5,6\}$.[a] At the end of the week the agents get their rewards and the process is repeated. For simplicity we chose the attendance profiles of each potential action so that when the actions are selected uniformly the resultant attendance profile across all seven nights is also uniform. World utility is $G(\zeta) = \sum_t R_G(\zeta,t)$, where $R_G(\zeta,t) = \sum_{k=1}^{7} \phi(x_k(\zeta,t))$, $x_k(\zeta,t)$ is the total attendance on night k at week t, $\phi(y) = y \exp(-y/c)$, and c is a real-valued parameter. (To keep the "congestion" level constant, for l going from 1 to 6, c = {3,6,8,10,12,15}, respectively.) Our choice of $\phi(\cdot)$ means that when either too few or too many agents attend some night in some week, world reward $R_G$ is low. Since we are concentrating on the utilities rather than on the RL algorithms that use them, we use (very) simple RL algorithms. Each agent $\eta$ has a 7-dimensional vector giving its estimates of the reward it would receive for taking each possible action. At the beginning of each week, each $\eta$ picks the night to attend randomly, using a Boltzmann distribution over the seven components of $\eta$'s estimated rewards vector. For simplicity, the temperature parameter of the Boltzmann distribution does not decay in time. However, to reflect the fact that each agent operates in a non-stationary environment, reward estimates are formed using exponentially aged data: in any week t, the estimate $\eta$ makes for the reward for attending night i is a weighted average of all the rewards it has previously received when it attended that night, with the weights given by an exponential function of how long ago each such reward was. To form the agents' initial training set, we had an initial period in which all actions by all agents were chosen uniformly randomly, before the learning algorithms were used to choose the actions.
[a] In order to keep the learning difficulties faced by the agents similar for various choices of l, the agents always have seven actions from which to choose. Each such action gets mapped to an "attendance" profile, e.g., for l = 2, so that each agent must choose two nights, action one maps to attending on days one and two, action two maps to attending on days two and three, etc.
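The ingredients of the bar problem described above (the per-night reward $\phi$, the weekly world reward, Boltzmann action selection and exponentially aged reward estimates) can be sketched as follows. The function names and the parameters `temperature` and `tau` are hypothetical, and the exact aging function is not given in the text, so the `exp(-(age)/tau)` weighting is an assumption.

```python
import math
import random


def phi(y, c):
    """Per-night reward: low when attendance y is far below or far above c."""
    return y * math.exp(-y / c)


def world_reward(attendance, c):
    """Weekly world reward: sum of phi over the seven nightly attendances."""
    return sum(phi(x, c) for x in attendance)


def boltzmann_choice(estimates, temperature, rng):
    """Sample an action index with probability proportional to exp(estimate / T)."""
    top = max(estimates)  # subtracting the maximum avoids overflow
    weights = [math.exp((e - top) / temperature) for e in estimates]
    return rng.choices(range(len(estimates)), weights=weights, k=1)[0]


def aged_estimate(history, now, tau):
    """Weighted average of past rewards for one action.

    history is a list of (week, reward) pairs; the weight of each reward
    decays exponentially with its age (assumed form: exp(-(now - week) / tau)).
    """
    weights = [math.exp(-(now - week) / tau) for week, _ in history]
    weighted = sum(w * r for w, (_, r) in zip(weights, history))
    return weighted / sum(weights)
```

A fixed `temperature` matches the statement that the Boltzmann temperature does not decay; passing an explicit `random.Random` instance keeps runs reproducible.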
4 Experimental Results
We investigate three choices of the clamping vector: $\vec{0}$, $\vec{1} = (1,1,1,1,1,1,1)$, and the "average" action, $\vec{a} = \vec{l}/7$, where $l \in \{1,2,3,4,5,6\}$ depending on the problem. The associated WLU's are distinguished with a superscript. In the experiments reported here all agents have the same reward function, so from now on we drop the agent subscript from the private utilities. Writing them out, the three WLU reward functions are:
$$R_{WL^{\vec{0}}}(\zeta,t) = R_G(\zeta,t) - R_G(CL_\eta^{\vec{0}}(\zeta,t)) = \phi_{d_\eta}(x_{d_\eta}(\zeta,t)) - \phi_{d_\eta}(x_{d_\eta}(\zeta,t) - 1)$$
$$R_{WL^{\vec{1}}}(\zeta,t) = R_G(\zeta,t) - R_G(CL_\eta^{\vec{1}}(\zeta,t)) = \sum_{d \neq d_\eta}^{7} \left[\phi_d(x_d(\zeta,t)) - \phi_d(x_d(\zeta,t) + 1)\right]$$
$$R_{WL^{\vec{a}}}(\zeta,t) = R_G(\zeta,t) - R_G(CL_\eta^{\vec{a}}(\zeta,t)) = \sum_{d \neq d_\eta}^{7} \left[\phi_d(x_d(\zeta,t)) - \phi_d(x_d(\zeta,t) + a_d)\right] + \phi_{d_\eta}(x_{d_\eta}(\zeta,t)) - \phi_{d_\eta}(x_{d_\eta}(\zeta,t) - 1 + a_d)$$
where $d_\eta$ is the night picked by $\eta$ and $a_d = l/7$. The team game reward function is simply $R_G$. Note that to evaluate $R_{WL^{\vec{0}}}$ each agent only needs to know the total attendance on the night it attended. In contrast, $R_G$ and $R_{WL^{\vec{a}}}$ require centralized communication concerning all 7 nights, and $R_{WL^{\vec{1}}}$ requires communication concerning 6 nights. Finally, note that when viewed in attendance space rather than action space, $CL^{\vec{a}}$ is clamping to the attendance vector $v_i = \sum_{d=1}^{7} \ldots$, where $u_d \in$
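The three WLU rewards written out above can be evaluated numerically with a short sketch. It assumes the single-night case (l = 1, so that the agent contributes exactly one unit to the night it picked and $a_d = 1/7$) and the same $\phi(y) = y \exp(-y/c)$ on every night; the function name `wlu_rewards` and the dictionary keys are hypothetical.

```python
import math


def phi(y, c):
    """Per-night reward phi(y) = y * exp(-y / c)."""
    return y * math.exp(-y / c)


def wlu_rewards(attendance, night, c):
    """Team-game reward and the three WLU rewards for one agent.

    attendance: total attendance x_d on each of the seven nights.
    night: index d_eta of the night this agent attended (assumes l = 1).
    """
    a_d = 1 / 7
    own = attendance[night]
    others = [x for d, x in enumerate(attendance) if d != night]
    return {
        # Team game: the full world reward for the week.
        "G": sum(phi(x, c) for x in attendance),
        # Clamp to 0: only the agent's own night matters.
        "WL0": phi(own, c) - phi(own - 1, c),
        # Clamp to 1: the agent is added to the six other nights.
        "WL1": sum(phi(x, c) - phi(x + 1, c) for x in others),
        # Clamp to the average action: a_d is added to every other night,
        # and the agent's own night changes by -1 + a_d.
        "WLa": sum(phi(x, c) - phi(x + a_d, c) for x in others)
        + phi(own, c) - phi(own - 1 + a_d, c),
    }
```

The sketch makes the communication requirements visible: `WL0` reads a single entry of `attendance`, `WL1` reads the six other entries, and `G` and `WLa` read all seven.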
$$\Delta\tau = \begin{cases} \ldots \\ \ldots, & \text{otherwise} \end{cases} \qquad (4)$$
Therefore, larger quantities of pheromone are laid on the edges on which most passing ants can construct shorter feasible tours. In the worst case, on the other hand, an edge receives no positive reward, either because no ants walked through it or because most passing ants constructed infeasible tours.
2.6. The Proposed Algorithm and the Best Parameter Setting
Now we present the details of the proposed algorithm (which we denote as AS/Poly) for solving the polygonal approximation problem as follows.
1. Initialize. Construct the directed graph as depicted in Subsection 2.2. Set $\tau_{ij}$ = 1/ for every edge (i, j). Set $T_i = 1$ for every entry of the selection table of the starting node. Set NC = 1. Set $tour_{global\_best} = x_1 x_2 \cdots x_n x_1$.
2. For every ant do
Select the starting node according to the selection probability.
Repeat
Select the next node according to the node transition rule using Eq. (2).
until a closed tour is completed.
// the selection of the next node cannot pass over the starting node //
3. Find the shortest feasible tour, say $tour_{current\_best}$, among the current m tours.
4. If $|tour_{current\_best}| < |tour_{global\_best}|$ then $tour_{global\_best} = tour_{current\_best}$.
5. For every edge do
Update the pheromone intensity according to the pheromone updating rule (see Eqs. (3) and (4)).
6. Update every entry of the selection table using Eq. (1).
7. If (NC = MAX_CYCLE) then output $tour_{global\_best}$ and stop; otherwise, NC = NC + 1 and go to step 2.
To obtain the best performance of the AS/Poly, we experimented with various parameter values. The test benchmark curves, which are used in the literature [2, 4, 5, 7, 8, 9], are shown in Figs. 1(a)-1(b). The best performance of the AS/Poly was found when $\alpha = 1$, $\beta = 5$, $r = 0.4$, $\rho = 0.1$, and $m = 20$ for both test curves.
Fig. 1 The test benchmark curves.
3. Experimental Results and Discussions
In this section, we discuss the more important properties of the AS/Poly through empirical studies. The performance of various strategies of the AS/Poly is compared to those of two other heuristics: genetic algorithms (GA) and tabu search (TS).
3.1. Properties of the Proposed Algorithm
3.1.1. AS/Poly vs. Random Walk
We have implemented a random walk search method (referred to as random walk for simplicity) in which the distributed ants have no communication and no previous experience, and simply choose the next node randomly. Fig. 2(a) shows the global shortest tour length obtained at each cycle for both the AS/Poly and the random walk. It is seen that in the beginning stage of running cycles, the AS/Poly, which has not yet accumulated enough feedback experience, has performance similar to that of the random walk. After the 4th cycle, the global shortest tour length found by the AS/Poly keeps decreasing, while the one found by the random walk is almost unchanged. Consequently, the mechanisms facilitating the inter-ant communication and the persistence rate of previous experiences play significant roles in the search paradigm.
Fig. 2 The important properties of the AS/Poly.
3.1.2. Node Branch Entropy
To examine the convergence behavior of the AS/Poly, we can measure the average node branch entropy. For the ith node on the graph, the node branch entropy is computed from the transition probabilities on all of the directed edges which originate from the ith node. That is,
$$E_i = -\sum_{j} p_{ij} \log p_{ij} \qquad (5)$$
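As a small illustration of Eq. (5), the entropy of one node and its average over all nodes can be computed as below. The representation of the transition probabilities as a dictionary from node to a list of outgoing-edge probabilities is hypothetical, and the natural logarithm is an assumption, since the text does not state the base.

```python
import math


def node_branch_entropy(probs):
    """E_i = -sum_j p_ij * log(p_ij) over the edges leaving node i."""
    # Zero-probability edges contribute nothing (the limit of p * log p is 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def average_branch_entropy(transition):
    """Mean node branch entropy over all nodes of the graph.

    transition maps each node to the list of transition probabilities
    on its outgoing edges (each list is expected to sum to 1).
    """
    return sum(node_branch_entropy(p) for p in transition.values()) / len(transition)
```

A node with a single certain successor has entropy 0, while a node with four equally likely successors has entropy log 4, so a falling average indicates that the transition rule is becoming more deterministic.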
The node transition rule becomes more deterministic as the node branch entropy approaches 0. The average node branch entropy is then computed over all node branch entropy values, i.e., $E = \sum_{i=1}^{n} E_i / n$. Fig. 2(b) shows the value of E at each cycle. Initially, the value of E decreases gradually since few experiences have been accumulated and the AS/Poly tries to explore new branches to accumulate more experience. In the middle cycles (between the 40th and the 100th cycles), the value of E drops drastically because some edges become preferable and have higher transition probabilities. In the later cycles, the value of E decreases gradually again since the dominant edges stand out and the transition probabilities become stable. Hence, the maximal number of running cycles, which decides the stopping criterion of the AS/Poly, could be set to a value falling in the stable part.
3.2. Searching Strategy
In this subsection, we present two advanced searching strategies of the AS/Poly.
3.2.1. Elitist Strategy
The elitist strategy, which comes from GA [12], is also suggested for use in the AS by Dorigo [11]. The basic idea is that the pheromone intensity on the best tour obtained so far (denoted as $tour_{global\_best}$) is further reinforced such that $tour_{global\_best}$ has a higher probability of exchanging edges with other tours. We have implemented the elitist strategy with the AS/Poly.
3.2.2. Hybrid Strategy
We suggest using a hybrid strategy which embeds a local search method in between the iterations of the AS/Poly. The local search method used here is an iterative one and, to save computational time, it is applied to $tour_{global\_best}$ only. First, randomly delete one node from $tour_{global\_best}$. Then each of the remaining nodes on $tour_{global\_best}$ is moved along every position between its two neighboring nodes and finally stops at the position which gives the minimal approximation error. An iteration is completed when all of the nodes on $tour_{global\_best}$ have been processed. The next iteration is activated if any node has been moved to a new position, i.e., there is an error reduction in the previous iteration. The iteration is repeated at most five times to save computations.
3.2.3. Comparative Performances
Experimentally we found that both the elitist strategy and the hybrid strategy outperform the AS/Poly, and the hybrid strategy has the best performance. As will be seen in the next subsection, the two advanced strategies cost negligible extra CPU time compared with the AS/Poly, and they yield a more significant approximation improvement when a smaller $\varepsilon$-bound is specified.
3.3. Comparison with GA and TS
The experiments are conducted on a PC with a Pentium 120MHz CPU. The numerical results are obtained by averaging over 10 independent runs for each of the test methods. Various values of $\varepsilon$ have been specified for each of the test curves. Table 1 lists the average number of line segments (d) of the final obtained approximating polygon and the average computational time (t) of every comparative
approach. It can be seen that, for both of the two assessing factors, the proposed AS/Poly and its variations have the best performance, the TS-based approach ranks in the middle, and the GA-based approach is the worst. It is also observed that the elitist strategy and the hybrid strategy achieve a more prominent improvement in reducing the number of approximating line segments than the AS/Poly as the value of $\varepsilon$ decreases. The average CPU time of the elitist strategy is similar to that of the AS/Poly because only a few computations are needed to update the pheromone trail of $tour_{global\_best}$. The extra CPU time of the hybrid strategy is also negligible if the user prefers to see a better approximation result with a higher compression ratio.
Table 1 The comparative performances of the GA-based approach, the TS-based approach, the AS/Poly approach, the elitist strategy and the hybrid strategy of the AS/Poly.
Curves        ε     GA-based      TS-based      AS/Poly       Elitist       Hybrid
                    d      t      d      t      d      t      d      t      d      t
Leaf          150   15.6   5.71   10.6   0.93   11.2   0.68   11.6   0.69   11.0   0.87
(n=120)       100   16.3   4.45   13.7   0.92   13.0   0.70   13.0   0.68   12.6   0.84
              90    17.3   5.28   14.6   0.89   13.2   0.70   13.0   0.71   12.8   0.89
              30    20.5   4.62   20.1   0.90   17.2   0.71   17.0   0.72   16.6   0.90
              20    23.1   5.65   21.9   0.90   19.8   0.72   19.0   0.72   18.8   0.90
Semicircle    60    13.2   4.56   11.0   0.87   10.0   0.59   10.0   0.59   10.0   0.78
(n=102)       30    13.9   4.80   13.6   0.79   12.6   0.59   12.4   0.57   12.0   0.75
              25    16.8   4.29   14.9   0.78   13.4   0.59   13.0   0.61   13.0   0.74
              20    19.2   4.67   16.2   0.78   16.4   0.60   16.0   0.62   15.8   0.73
              15    23.0   4.44   18.3   0.76   18.0   0.61   17.4   0.63   16.8   0.73
Figs. 3(a)-3(e) show the final approximating polygon and the corresponding number of approximating line segments for the leaf-shaped curve with the $\varepsilon$-bound set to 20 for each of the test methods; Figs. 3(f)-3(j) are those for the curve with four semicircles given an $\varepsilon$-bound equal to 15. It can be seen that the proposed algorithms produce the fewest approximating line segments for all test curves.
(a) GA-based (d = 23) (b) TS-based
i > j > 0 then for any valuation v, $feasible_j(v) \Rightarrow feasible_i(v)$ and $preference_j(v) \le preference_i(v)$. There exists a valuation, v, such that either $feasible_j(v) \ne feasible_i(v)$, or otherwise $feasible_j(v) = feasible_i(v) = T$ and $preference_j(v) + \varepsilon < preference_i(v)$.
A Dynamic DisCSP (DyDisCSP) is defined by a set of agents $A_0, \ldots, A_n$. $A_k$, $k \in [0, h)$, $n > h > 0$, are agents called initiators. Each agent $A_j$ owns an NVCSP, $NVCSP_j$. Given a valuation, v, for a set of external variables, S(v) is the set of agents owning a variable not instantiated in v to F. By convention, the initiators always belong to S(v). An agent is active if it belongs to the minimal subset, A(v), of S(v) such that $S(v|_{vars(A(v))}) \cap (S(v) \setminus A(v)) = \emptyset$.
Definition 1 (Acceptable valuation) A valuation v is acceptable if each agent in S(v) proposes for v a feasible associated tuple (feasible(v) = T).
Definition 2 (Solution) A solution of a DyDisCSP is an acceptable valuation v of all the external variables such that if each agent $A_i$ in S(v) is active and proposes for v an associated tuple $(T, preference_{k_i}(v))$, where $k_i \le n_i$, then
$$v \in \{b \mid b = \arg\min_a \sum_{A_i \in S(v),\, i \ge h} preference_{k_i}(a)\}$$
and no agent $A_i$, $i \ge 0$, wants to reveal a constraint $c_j$, $j > k_i$. The feasibility condition is $\sum_{A_i \in S(v)} preference$
, v) C(m, n) k. E(e) is a function of an agent's eagerness that determines the amount of concession (section 2). $T(t, t', \tau, \lambda)$ is a function of trading time that determines an agent's rate of concession (section 3). $O(n, \langle w_i \rangle, v)$ is a function that determines the amount of concession according to an agent's own expectation, number of trading partners and their (differences in) offers/bids (section 4). C(m, n) is a function of the number of competitors m and trading partners n that determines the probability that an agent is ranked as the most preferred trader by at least one of its trading partners (section 5).
2 Eagerness and Convergence
The eagerness e of an agent is the percentage of convergence of the spread k. e represents an agent's desire to make concession, that is, to narrow k in each trading cycle, independent of the current trading time, number of competitors and trading partners. If the market situation remains unchanged, an agent's spread k' in its next bid/offer is given by: k' = (1 - e)k, where $0 \le e \le 1$. e is a user-supplied value that is assumed to be constant throughout the entire trading process. The function E(e) = 1 - e, where $0 \le e \le 1$, determines the ratio between k' and k, hence the amount of concession. A high value of e corresponds to a strong desire of an agent to narrow k' and move closer to reaching a consensus. For instance, with other market factors being constant, if e = 0, then k' = (1 - 0)k = k; the agent does not make any concession. If e = 1, then k' = (1 - 1)k = 0; a consensus is reached.
3 Trading Time and Strategies
At different stages of trading, agents may make different rates of concession. Their strategies with respect to remaining trading time can be classified as follows:
1. An agent that is keen to complete a deal quickly may make large concessions in the first few trading cycles. However, if a consensus is not reached rapidly, there will be little room for negotiation in later stages.
2. An agent may choose to make minimal concessions in early trading cycles and large concessions when the trading time is expiring.
3. An agent may make a constant rate of concession throughout the trading process.
4. Remaining trading time does not affect an agent's next bid/offer. It sticks to its original bid/offer throughout the entire trading process.
The formulation of the next spread k' with respect to trading time is given as:
$$k' = \left[1 - \left(\frac{t}{\tau}\right)^{\lambda}\right] k_0, \quad (t \le \tau)$$
where $k_0$ is the initial spread; t is the current trading time, $\tau$ is an agent's closing time, and $\lambda$ is a non-negative temporal sensitivity factor that models the four classes of strategies (see figure 1). Since
$$\frac{dk'}{dt} = -\lambda \frac{t^{\lambda-1}}{\tau^{\lambda}} k_0 \quad \text{and} \quad \frac{d^2k'}{dt^2} = -\lambda(\lambda-1) \frac{t^{\lambda-2}}{\tau^{\lambda}} k_0$$
and $\lambda$, t, and $\tau$ are non-negative, the slope ($dk'/dt$) is never positive. Hence, k' and also the utility of the agent either decrease with time (since concession is being made) or remain constant throughout the trading process. The rate of change of the slope
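The effect of the temporal sensitivity factor on the spread can be explored with a minimal sketch. It assumes the time-dependent spread has the form k' = [1 - (t / closing time)^λ] k_0, which is the form consistent with the two derivatives quoted in the text, and it includes the eagerness rule k' = (1 - e)k from section 2 for comparison. Function and parameter names are hypothetical.

```python
def spread_after_eagerness(k, e):
    """Section 2 rule: the next spread is (1 - e) * k for eagerness 0 <= e <= 1."""
    if not 0 <= e <= 1:
        raise ValueError("eagerness e must lie in [0, 1]")
    return (1 - e) * k


def spread_at_time(k0, t, closing_time, lam):
    """Assumed time rule: k' = (1 - (t / closing_time) ** lam) * k0."""
    if not 0 <= t <= closing_time:
        raise ValueError("t must lie between 0 and the closing time")
    if lam < 0:
        raise ValueError("the temporal sensitivity factor must be non-negative")
    return (1 - (t / closing_time) ** lam) * k0
```

Halfway to the closing time, a factor below 1 has already narrowed the spread more than the linear case (factor equal to 1), while a factor above 1 has narrowed it less and postpones most of the concession to the end of trading.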
$$O(n, \langle w_i \rangle, v) = \ldots$$
where $v$ and $c$ are the utility of the last bid/offer of a trading agent and its conflict utility, respectively.

5 Trading Competition

The amount of competition faced by a market-driven agent is determined by the probability that it is not considered the most preferred trading partner. Suppose an agent $B_1$ has $m-1$ competitors, $B = \{B_2, \ldots, B_m\}$, and $n$ trading partners, $S = \{S_1, \ldots, S_n\}$. The probability that $B_1$ is not the most preferred trading partner of any given $S_j \in S$ is $\frac{m-1}{m}$. The probability of $B_1$ not being the most preferred trading partner of all $S_j \in S$ is $\left(\frac{m-1}{m}\right)^{n}$. Hence, the probability of $B_1$ being considered the most preferred trading partner by at least one $S_j \in S$ is

$$C(m,n) = 1 - \left(\frac{m-1}{m}\right)^{n}.$$

If $m = n = 1$, there is one buyer and one seller. Hence, $C(1,1) = 1 - \left(\frac{1-1}{1}\right)^{1} = 1$. If $n = 1$, there are $m$ buyers $B = \{B_1, \ldots, B_m\}$ and one seller $S_1$. Then, $C(m,1) = 1 - \frac{m-1}{m}$. Each $B_i \in B$ is exposed to a probability of $\frac{m-1}{m}$ of not being the most preferred trading partner of $S_1$. If there are $m$ buyers $B = \{B_1, \ldots, B_m\}$ and $n$ sellers $S = \{S_1, \ldots, S_n\}$, then each $B_i \in B$ has a probability of $\frac{m-1}{m}$ of not being the most preferred trading agent of any given $S_j \in S$. The probability that $B_i \in B$ is not the most preferred trading partner of all $S_j \in S$ is $\left(\frac{m-1}{m}\right)^{n}$. Therefore, the probability that a $B_i \in B$ is the most preferred trading partner of at least one $S_j \in S$ is $C(m,n) = 1 - \left(\frac{m-1}{m}\right)^{n}$.

The market-driven strategy considered in this research is designed for both buyer and seller agents, and the above arguments hold for both. Furthermore, note that the cardinalities of $B$ and $S$ vary with changing market situations (as buyers and sellers can enter and leave the market at any time).

6 Conclusion

This paper has presented the foundations of engineering market-driven agents. It extends the work in [5] by considering the additional notions of spread and probability of completing a deal. The mathematical functions in sections 2 through 5 were used to guide the design of market agents that can automatically control the amount of concession in a negotiation for a given market situation. Experimental results in [7] demonstrated that the reactions of market-driven agents coincide with intuitions in real-life trading (space limitations preclude their inclusion here). It is hoped that the foundations discussed in this paper can shed new light on engineering negotiation agents.

References
1. J. Rodriguez-Aguilar et al. Towards a Testbed for Trading Agents in Electronic Auction Markets. AI Communications, Vol. 11, No. 1, pp. 5-19, 1998.
2. P. Wurman et al. The Michigan Internet AuctionBot: A Configurable Auction Server for Human and Software Agents. Proceedings of the 2nd Int. Conf. on Autonomous Agents, May 1998.
3. D. Chavez et al. A Real-Life Experiment in Creating an Agent Marketplace. Proceedings of the 2nd Int. Conf. on the Practical Application of Intelligent Agents and Multi-Agent Technology, London, April 1997.
4. R. Guttman and P. Maes. Cooperative vs. Competitive Multi-Agent Negotiations in Retail Electronic Commerce. Proceedings of the 2nd Int. Workshop on Cooperative Information Agents (CIA'98), Paris, July 1998.
5. K.M. Sim and E. Wong. Engineering Market-driven Agents for E-auction. In Proceedings of the Pacific Asia Conf. in Info. System 2000, pp. 1128-1138.
6. John C. Harsanyi. Bargaining. In Eatwell, Milgate, & Newman (eds.), The New Palgrave: Game Theory, Macmillan Press Limited, 1989, pp. 60-62.
7. C. Choi. Towards Strategic Agents for Auction & Negotiation. MSc. Thesis, Dept. of Comp., HK Poly. Univ., HK, 2001. Supervised by Prof. K. M. Sim.
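As an illustration of the concession functions described in sections 2, 3 and 5 of the market-driven agent paper above, the following minimal Python sketch evaluates the eagerness factor $E(e) = 1 - e$, a time-dependent spread, and the competition function $C(m,n) = 1 - ((m-1)/m)^n$. The function names are hypothetical, and the time-dependent form $k' = [1 - (t/T)^{\lambda}]\,k_0$ is an assumption chosen to agree with the derivatives stated in section 3; it is not a verified copy of the authors' formula.

```python
def eagerness_factor(e: float) -> float:
    """E(e) = 1 - e: ratio between the next spread k' and the current spread k."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("eagerness e must lie in [0, 1]")
    return 1.0 - e


def time_spread(t: float, closing_time: float, lam: float, k0: float) -> float:
    """Assumed form k' = [1 - (t / T)^lam] * k0, with 0 <= t <= T and lam >= 0."""
    if closing_time <= 0 or not 0.0 <= t <= closing_time or lam < 0:
        raise ValueError("require T > 0, 0 <= t <= T and lam >= 0")
    return (1.0 - (t / closing_time) ** lam) * k0


def competition(m: int, n: int) -> float:
    """C(m, n) = 1 - ((m - 1) / m)^n for m traders on one side and n partners."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be at least 1")
    return 1.0 - ((m - 1) / m) ** n


if __name__ == "__main__":
    k = 100.0  # hypothetical current spread
    print(eagerness_factor(0.25) * k)      # next spread for eagerness e = 0.25
    print(time_spread(5.0, 10.0, 2.0, k))  # halfway to the closing time, lam = 2
    print(competition(1, 1))               # one buyer and one seller
    print(competition(4, 1))               # four buyers and one seller
```

Under the assumed time function, a large sensitivity factor keeps the spread close to its initial value until shortly before the closing time, while a small one concedes most of the spread early, which matches the first two strategy classes listed in section 3.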
CHAPTER 6 AGENT BASED APPLICATIONS
KAVANAH: AN ACTIVE USER INTERFACE INFORMATION RETRIEVAL APPLICATION

EUGENE SANTOS JR. AND HIEN NGUYEN
Intelligent Distributed Information System Lab, Department of Computer Science and Engineering, University of Connecticut
{eugene,hien}@cse.uconn.edu

SCOTT M. BROWN
Air Force Research Laboratory, Crew System Interface Division, Wright-Patterson AFB, OH
sbrown777@acm.org

This paper reports our implementation and evaluation of an active user interface in an information retrieval application called Kavanah. The goal of the active user interface is to improve the quality of information retrieval and to reduce the user's cognitive workload while searching for information. Our underlying concept is to dynamically construct the search queries based on a dynamic representation that captures user interests, preferences and searching context (as represented in a user ontology). Our approach to disaggregating the essential aspects of a user's intent for searching allows for focused multi-agent based construction and correction of the overall user model that captures the user's intent, thus promoting increased effectiveness and efficiency. We evaluate the effectiveness of the active user interface with commonly used metrics from the information retrieval community by measuring retrieval performance with and without the presence of an active user interface. Furthermore, we measure the ability to discover new knowledge by evaluating our dynamic online ontology construction. The evaluations use the Unified Medical Language System knowledge base as a test bed.
1 Introduction

During the last few years, as a result of the overwhelming number of choices of online and offline information resources, we have witnessed an increasing trend towards the construction of personal assistant agents in information filtering, recommender systems and agent communities [2, 9, 11]. The main focus of these approaches is to capture user interests by analyzing the user's interactions with the system and to use these interactions to guide the system's reactions so as to improve the quality of the user's work. In this paper, we hypothesize that constructing a unified model of the user's interests, preferences, and context in an information seeking task provides a fine-grained model that more effectively captures the user's information seeking intent than a model addressing a subset of these salient characteristics. While previous efforts have focused exclusively on learning any one aspect of information seeking, none of them has attempted to integrate all three aspects for determining a user's intent in seeking information.

We refer to our personal assistant agent as an active user interface (AUI) in this paper. Active user interfaces not only capture user interests, preferences, and contexts but also focus on the interactions among them in a dynamic fashion. In particular, our focus is on deriving and learning the context or user ontology. Most existing methods assume that all users share a single common ontology [13]. This implicitly assumes that all users have the same level of understanding and beliefs expressed in the common ontology. We believe that users understand information and how it interacts in their own individual way. This arises from many factors ranging from user experience and expertise to basic differences in user style and operation. We show that by using our model, we can do more than just elicit the user interests and preferences. We provide a learning capability for the system to discover new knowledge based on analyzing the documents relevant to the user and the context, i.e., why the user is focusing on the given information. This work is derived from our earlier research with a predecessor system, Clavin [4, 15, 16].

We evaluate our hypothesis by constructing an AUI in an information retrieval application called Kavanah. The implementation of our AUI is a multi-agent based system in which the main agent contains the user model consisting of user preference, interest, and context, and the supporting agents are used to dynamically construct and maintain the user model based on changes in the user's intent as well as incorrectness and incompleteness in the user model. Our evaluation goal is to show the effectiveness of this model by comparing the system performance in cases with and without an AUI using commonly used metrics in information retrieval.

The rest of the paper is organized as follows: the next section discusses the architecture of the system, followed by a detailed description of our implementation. Next, we discuss our preliminary empirical evaluation. Finally, related work and future research issues are considered.

2 System architecture
The main goal of Kavanah is to use its AUI to assist the users in getting the right information at the right time using the right tools [4]. The goal of the AUI is to accurately represent a user's intent. Intent inference involves deducing an individual's goals based on observations of that individual's actions [12]. In automated intent inference, this process is typically implemented through one or more behavioral models that have been constructed and optimized for the individual's behavior patterns. In an automated intent inference system, data representing observations of an individual, the individual's actions, or the individual's environment (collectively called observables) are collected and delivered to the model(s), which match the observables against patterns of behavior and derive inferred intent from those patterns. These inferences can then be passed to an application for generation of advice, definition of future information requirements, or proactive aiding.

We partition intent inference into three formative components. The first, interests, captures at a high level the focus and direction of the individual's attention. The second, preferences, describes the actions and activities that can be used to carry out the goals that currently hold the individual's attention, with a focus on how the individual tends to carry them out. The third, context, provides insight into the user's knowledge and deeper motivations behind the goals upon which the individual is focused and illuminates connections between goals. In other words, the first component captures what the individual is doing, the second captures how the individual might do it, and the third infers why the individual is doing it.

With regard to the research presented in this paper, the AUI needs to provide the right assistance to the information retrieval application on what the user is currently interested in; how a query needs to be constructed and how returned results need to be portrayed; and why the user dwells on a search topic. We assume that the interests are influenced by the ultimate goal that the user is trying to reach and the methods which she uses to accomplish that goal. For example, suppose that the user's goal is to study lung cancer and her approach is to scan materials from general definitions to specific methods used to treat this disease.
Her interests will thus vary from general treatments to specific chemotherapy processes. In particular, her interests may change from a certain drug to a more general approach for treatment. The user interests, in turn, influence user preferences and context. If user interests appear to be far off the goal that the user is trying to reach, she may change her search strategies and understanding of the subject accordingly.

In our AUI, we capture the interest, preference, and context aspects of user intent with an interest relevancy set, a preference network, and a user ontology network, respectively. The interest relevancy set determines what is currently relevant to the user. It is generated by reasoning over the user ontology network. Based on the utility values of each concept node in the user ontology network, we end up with a rank ordering of the concepts to build an interest relevancy set. Since user interests change over time, we incorporate a fading function to make the irrelevant interests fade away. We will describe this process in more detail in Section 3.

The user ontology network captures the user's knowledge of concepts and the relations among concepts in a specific domain. Before further discussing the user ontology network, we briefly introduce the concept of a domain ontology. The domain ontology captures the domain knowledge containing the concepts and the relations among them in a specific domain. The user ontology exploits the domain ontology by extracting the missing information that it needs in order to help identify the concepts which the user is interested in. Therefore, the user ontology will be similar to a subgraph of the domain ontology. However, in the user ontology network, additional relations which are not found in the domain ontology may exist as a result of user misconceptions or user-specific expert knowledge that is not fully captured in the domain ontology. The user ontology network in Kavanah is represented by a Bayesian network in which each node represents either a concept or a relation among concepts.

The preference network represents how the user wants to form the query, how this query should be answered, and how results should be portrayed. The user's preference in Kavanah is reflected by how the user prefers to use a class of tools. A tool is defined as an operator that performs specific actions to transform the information that the user needs based on preference. Each node in the preference network represents a tool, an action associated with that specific tool, or a pre-condition which represents the requirements of the tool connected with it. An example of a tool is a filter that removes those documents that do not match certain criteria. Another example of a tool is an expander that searches for documents that expand the searching topic. Figure 3(a) shows an example of a preference network. The AUI uses correction adaptation agents to maintain the preference network.
Each correction adaptation agent offers a bid to the AUI to change the preference network. They maintain a user model that is identical to that of the AUI until the AUI requests bids, at which time the AUI adapts its own user model based on its bidding behavior component. In the bidding process, the correction adaptation agent that most likely improves the AUI's effectiveness will win the bid, and this winning agent is permitted to correct the user model. We evaluate the user model by a set of metrics that measure its adaptivity, autonomy, collaboration and robustness requirements. We capture the user's utility for having the AUI perform an action on his behalf to achieve a goal by the utility function over that set of requirements. An example of such a utility function is as follows:

$$U_{req} = 0.14\,U_{reactive} + 0.14\,U_{predictive} + 0.14\,U_{perceptive} + 0.14\,U_{autonomous} + 0.14\,U_{collaborative} + 0.14\,U_{capability} + 0.14\,U_{misconception}$$

For more information about the correction adaptation agents, the bidding process and the metrics, please see our previous paper [5].

3 System implementation
We start this section by describing the overall process in Kavanah and then describe in detail how the AUI helps the system build the adapted query. Kavanah consists of five modules as shown in Figure 1(a). The input module accepts the user's natural language queries and transfers them to the query module, where they are parsed and converted into a query graph (QG). A QG is similar in construction to the user ontology network, except that it may contain one or more nodes representing a variable (usually denoted as X) that is necessary to represent unknown concepts in the user query. A query graph is a directed acyclic graph in which each node represents a concept or a relation among the concepts. A relation node should have concept nodes as parents and children. A concept node represents a noun phrase, while a relation node represents a verb phrase in a user query or a natural language sentence. An example of a QG of the query "What causes liver damage?" is shown in the left side of Figure 3(b). The AUI uses the query graph and generates a new adapted query for the search module based on the current user model. An example of an adapted query is shown in the right side of Figure 3(b). The search module matches the QG of the adapted query against each document graph representing a record in the database of documents, chooses those records whose match exceeds a user-defined threshold, and displays the output to the user. A document graph (DG) is a directed graph that contains concept and relation nodes and is also similar to the user ontology network (e.g., Figure 2(a)). Note that all of the common concepts in all documents are found in a global dictionary and domain ontology. A match between a QG and a DG is defined as the number of concept and relation nodes of the QG that are found in the DG divided by the number of nodes of the QG.
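The match definition above can be sketched in a few lines of Python. This is a minimal illustration, not the Kavanah implementation: it assumes each graph is reduced to the set of its node labels (edges are ignored because the stated definition counts nodes only), and the function names, node labels and the best-first ordering of results are hypothetical choices.

```python
from typing import Dict, List, Set


def match_score(query_nodes: Set[str], document_nodes: Set[str]) -> float:
    """Fraction of query-graph nodes (concepts and relations) found in the document graph."""
    if not query_nodes:
        return 0.0
    return len(query_nodes & document_nodes) / len(query_nodes)


def search(query_nodes: Set[str], documents: Dict[str, Set[str]], threshold: float) -> List[str]:
    """Return ids of documents whose match score exceeds the threshold."""
    scored = [(match_score(query_nodes, nodes), doc_id) for doc_id, nodes in documents.items()]
    hits = [(score, doc_id) for score, doc_id in scored if score > threshold]
    # Ordering by descending score is an illustrative choice, not taken from the paper.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in hits]


if __name__ == "__main__":
    # Hypothetical node labels; "X" stands for the variable node of the unknown concept.
    query = {"X", "cause", "liver damage"}
    docs = {
        "doc1": {"drug A", "cause", "liver damage"},
        "doc2": {"urate oxidase", "isa", "enzyme"},
    }
    print(search(query, docs, threshold=0.5))
```

In this toy run the variable node X never matches a document node, so a query with one unknown concept cannot score above 2/3; how the real system treats variable nodes during matching is not stated in the text.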
After the search module returns the search results, the feedback module allows the user to indicate whether the search result is relevant or not. The AUI constructs the adapted query in Kavanah by maintaining the updated user interests, preferences and context. The logical architecture of the AUI is shown in Figure 1(b). The AUI determines the current interests by reasoning over the user ontology network with the concepts found in the user query set as evidence. Each element of the interest relevancy set consists of an interest concept and an interest level. The interest concept represents the topic that the user is currently interested in. It is determined from the user query and the set of documents that the user has indicated as relevant in the recent search. The interest level is a real number indicating how much the user is
Figure 1. (a) The overall Kavanah architecture (Input, Query module, AUI, Search/Output). (b) AUI architecture (Interest, User ontology network, Preference network).
interested in the corresponding interest concept. Denote each interest concept as $a$ and its associated interest level as $L(a)$. We compute $L(a)$ after every query by $L(a) = 0.5\,\left(L(a) + \frac{n}{m}\right)$, with $n$ as the number of relevant documents containing the concept $a$ and $m$ as the number of relevant documents. If $L(a)$ falls below the user-defined threshold value, the corresponding interest concept $a$ is removed from the interest relevancy set. To compute the new set of interests, we set as evidence in the user ontology network those concepts found in the query and the interest relevancy set, and perform belief updating on the user ontology network.

We construct the user ontology network dynamically by finding a common set of subgraphs of all relevant documents. Each document is represented as a DG (e.g., Figure 2(a) and 2(c)). For each relevant document, we build a set of its subgraphs. A subgraph X of a DG Y is a DG such that each node that belongs to X also belongs to Y. The sets of subgraphs of the concepts "cosmids" and "urate oxidase" are shown in Figure 2(b) and 2(d). After generating all of the subgraphs, we compute the number of occurrences of each subgraph in the entire set. We select those subgraphs whose number of occurrences is greater than a user-defined threshold and consider them as the common subgraphs of the relevant documents. The common set of subgraphs of the two above concepts is shown in Figure 2(f). This set is used by an agent to update the user ontology network. The agent checks whether a subgraph is already in the user ontology network and adds it if it is not. The agent also ensures that the update does not create a loop in the existing ontology network; if it would, the addition is skipped. A new link between two existing concepts in the user ontology network will also be created if the two concepts are linked in the set of common subgraphs and the frequency of these links exceeds a certain user-defined threshold.
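The two bookkeeping steps described above, the interest level update $L(a) = 0.5\,(L(a) + n/m)$ with threshold pruning and the selection of common subgraphs by occurrence count, can be sketched as follows. This is a hedged illustration under several assumptions that are not in the paper: documents are given as sets of concept labels, a concept seen for the first time starts at level 0, subgraphs are already enumerated and encoded as frozen sets of directed edges, and all names are hypothetical. Belief updating and the loop check on the ontology network are not covered.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]
Subgraph = FrozenSet[Edge]


def update_interest_levels(
    levels: Dict[str, float],
    relevant_docs: List[Set[str]],
    threshold: float,
) -> Dict[str, float]:
    """Apply L(a) = 0.5 * (L(a) + n / m) and drop concepts that fall below the threshold."""
    m = len(relevant_docs)
    # Assumption: concepts first seen in relevant documents start at level 0.
    concepts = set(levels).union(*relevant_docs)
    updated: Dict[str, float] = {}
    for concept in concepts:
        n = sum(1 for doc in relevant_docs if concept in doc)
        ratio = n / m if m else 0.0  # assumption: with no relevant documents, n/m counts as 0
        level = 0.5 * (levels.get(concept, 0.0) + ratio)
        if level >= threshold:
            updated[concept] = level
    return updated


def common_subgraphs(per_document: List[Set[Subgraph]], min_occurrences: int) -> Set[Subgraph]:
    """Keep subgraphs that occur in more than min_occurrences document subgraph sets."""
    counts = Counter(sg for subgraphs in per_document for sg in subgraphs)
    return {sg for sg, count in counts.items() if count > min_occurrences}


if __name__ == "__main__":
    levels = {"urate": 0.8, "cosmids": 0.1}
    docs = [{"urate", "enzyme"}, {"urate"}]
    print(update_interest_levels(levels, docs, threshold=0.2))

    shared = frozenset({("isa", "enzyme")})
    doc_a = {shared, frozenset({("cosmids", "isa")})}
    doc_b = {shared, frozenset({("urate oxidase", "isa")})}
    print(common_subgraphs([doc_a, doc_b], min_occurrences=1))
```

Because each update halves the previous level, a concept that stops appearing in relevant documents decays geometrically, which is one simple way to read the fading behaviour mentioned in Section 2.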
The preference network is built when the user issues a new query and gives relevance feedback. Each user query is considered as a pre-condition
Figure 2. (a) The graph representing "cosmids". (b) Subgraphs of the concept "cosmids". (c) The graph representing "urate oxidase". (d) Subgraphs of the concept "urate oxidase". (f) The set of common subgraphs of the concepts "cosmids" and "urate oxidase".

Figure 3. (a) An example of a preference network. (b) Examples of query graphs associated with the user query and the adapted query generated by the AUI.
node in the preference network. If this query or a part of it has already been asked, the existing node in the preference network whose QG matches the QG of the new query or of its part is set as evidence. Each interest concept from the interest relevancy set is added to the preference network as a pre-condition node and set as evidence. If the user query is totally new, the tool being used by the user is set to the default value (a filter) and a goal node representing the filter tool is added to the preference network. Otherwise, it is set to the tool represented by the goal node with the highest utility value. Each action node represents a way to construct an adapted query based on the current tool, interests and user query. Figure 3(a) shows an example of a preference network in which the user is using an expander, is currently interested in the concept "urate" and wants to find out the causes of liver damage. Note that each user query and each adapted query generated by the AUI is associated with a QG. The current way of forming a new query is determined by reasoning over the preference network and taking the top $n$ goal nodes, with $n$ defined by the user. We compute the expected utility for a goal $g$ with associated actions $a_i$ as follows:

$$EU(g) = P(g = T)\,U(a \mid g = T) + P(g = F)\,U(a \mid g = F)$$

in which $U(a \mid g = T/F) = \sum_i P(a_i \mid g = T/F)\,U(a_i)$ and $U(a_i) = \sum U(\text{attributes})$. The set of attributes in the formulas above includes the workload, user expertise, temporal memory, and spatial memory. For more detailed information, please refer to our earlier work [6].
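The expected utility computation for goal nodes can be illustrated with the short Python sketch below. It is a simplified reading of the formulas, $EU(g) = P(g{=}T)\,U(a \mid g{=}T) + P(g{=}F)\,U(a \mid g{=}F)$ with $U(a \mid g) = \sum_i P(a_i \mid g)\,U(a_i)$, in which the probabilities are supplied directly rather than obtained by reasoning over a Bayesian preference network. All function names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def action_utility(attribute_utilities: Sequence[float]) -> float:
    """U(a_i): sum of the utilities of an action's attributes (e.g. workload, expertise)."""
    return sum(attribute_utilities)


def conditional_utility(actions: Sequence[Tuple[float, float]]) -> float:
    """U(a | g): sum over actions of P(a_i | g) * U(a_i), given (probability, utility) pairs."""
    return sum(p * u for p, u in actions)


def expected_utility(
    p_goal_true: float,
    actions_if_true: Sequence[Tuple[float, float]],
    actions_if_false: Sequence[Tuple[float, float]],
) -> float:
    """EU(g) = P(g=T) U(a | g=T) + P(g=F) U(a | g=F), with P(g=F) = 1 - P(g=T)."""
    if not 0.0 <= p_goal_true <= 1.0:
        raise ValueError("p_goal_true must be a probability")
    return (p_goal_true * conditional_utility(actions_if_true)
            + (1.0 - p_goal_true) * conditional_utility(actions_if_false))


def top_goals(expected_utilities: Dict[str, float], n: int) -> List[str]:
    """Names of the n goals with the highest expected utility."""
    ranked = sorted(expected_utilities.items(), key=lambda item: (-item[1], item[0]))
    return [goal for goal, _ in ranked[:n]]


if __name__ == "__main__":
    u_expand = action_utility([0.5, 0.5, 0.5, 0.5])    # four hypothetical attribute utilities
    u_narrow = action_utility([0.25, 0.25, 0.25, 0.25])
    eu_expander = expected_utility(
        0.8,
        [(0.9, u_expand), (0.1, u_narrow)],
        [(0.2, u_expand), (0.8, u_narrow)],
    )
    print(eu_expander)
    print(top_goals({"expander": eu_expander, "filter": 1.0}, n=1))
```

Keeping the attribute sum, the conditional sum and the final mixture as separate functions mirrors the three nested formulas, so each level can be checked against the text independently.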
4 Empirical Evaluation

We empirically evaluate the system using the definitions of 100 concepts extracted from the Unified Medical Language System (UMLS). In the first evaluation, we focus on the quality of the retrieval process. We constructed a set of queries and processed this set through the system with and without the AUI. In this query set, we mainly use "wh" questions to find out the definitions of concepts or to identify concepts that match certain requirements, for example, "what is urate oxidase?" or "which enzyme inhibits monoamine oxidase and causes liver damage?". We made the assumption that the user does not just explore concepts randomly, but focuses on what he is studying. We used the precision and recall metrics commonly used in information retrieval [14] as our evaluation criteria. Figure 4 shows the precision and recall for all the questions in the cases with and without the AUI. As we see, the precision and recall in cases that have an AUI are better than those without any help. If Kavanah is working without an AUI, it simply matches the QG of the user query with each DG representing each record in the database. Depending on how well the user manipulates the keywords in a query, the search may return more, fewer, or even no documents. This process requires the user either to know the contents of the database or to be very familiar with the search topics to achieve a decent result. The user's feedback is not used to adapt the search query. With the AUI, depending on the user's feedback, Kavanah helps the user construct an appropriate search query that satisfies the user's searching intent. For example, if the user does not indicate any documents from
Figure 4. Precision and recall for Kavanah with and without the active user interface (AUI). Panels: precision comparison and recall comparison over questions Q1 to Q10; series: Without AUI, With AUI.
the returned list as relevant, Kavanah then knows that perhaps a wrong tool has been used, the interests are not up to date, or the ontology is far off the mark. It will automatically correct those misses in order to improve the quality of the search.

We also evaluated the process of constructing and updating the user's ontology network by building simulated user ontologies from the domain ontology. We randomly choose some concept nodes from the domain ontology (referred to in this experiment as testing concepts) and randomly remove some links associated with them to see if our system can reconstruct those missing links in the user ontology network. For each testing concept, we construct a set of queries such that they reflect the relations between the testing concept and the removed links. We compute the link error as follows: $LinkError = \frac{n}{m}$, in which $n$ is the number of links in the user ontology network matched against the target user subgraph and $m$ is the total number of links of the user ontology network constructed by the AUI. First, we performed this experiment using the testing database mentioned above and found that there is a large mismatch between the domain ontology and the set of concepts being
Figure 5. Link matching percentage of ontology construction over test cases 1 to 5. Series: matching percentage with separated database and domain ontology; matching percentage with domain ontology as database.
used to construct our testing database. As shown in Figure 5, the matching percentage of the links created in the user ontology against the real user subgraph in the domain ontology is low in this case. Next, we used a portion of the domain ontology as the testing database and repeated the experiment to see whether the matching percentage of those links changes once the mismatch between the database and the domain ontology is removed. In this setting the results are significantly better.
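The link measure used in this experiment can be sketched as below, assuming it is the ratio $n/m$ suggested by the definitions of $n$ (constructed links that match the target user subgraph) and $m$ (all links constructed by the AUI); Figure 5 reports the same quantity as a matching percentage. Links are modelled as directed (source, target) pairs, and the function name and example labels are hypothetical.

```python
from typing import Set, Tuple

Link = Tuple[str, str]


def link_match_ratio(constructed: Set[Link], target: Set[Link]) -> float:
    """n / m: share of constructed user-ontology links that also appear in the target subgraph."""
    if not constructed:
        return 0.0  # assumption: an empty constructed network matches nothing
    matched = constructed & target
    return len(matched) / len(constructed)


if __name__ == "__main__":
    # Hypothetical links rebuilt by the AUI versus the links removed from the domain ontology.
    constructed = {("urate oxidase", "isa"), ("isa", "enzyme"),
                   ("enzyme", "cause"), ("cause", "liver damage")}
    target = {("urate oxidase", "isa"), ("isa", "enzyme"), ("cosmids", "isa")}
    print(f"{100 * link_match_ratio(constructed, target):.0f}% of constructed links match")
```

Note that this ratio only penalizes spurious links; target links that the AUI failed to reconstruct do not lower it, so a recall-style counterpart would be needed to capture missing links.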
5 Related work

In the information retrieval and information filtering community, dynamically reacting to changes in the user interests, preferences and context (i.e., ontology) to reduce the user's cognitive workload is a challenging problem. Most of the research in information retrieval has focused on capturing the user interests only [3, 7, 18]. Work by the InfoSleuth Group [10], which exploits the ontology to improve the retrieval process, is closely related to ours. In that work, the ontology is dynamically constructed from the text and constantly updated as the retrieval process goes on. However, this technique does not focus on preferences and interests. The paper does not give enough detail about this approach to compare it empirically with ours. Another work related to ours is IRIA [1], which unobtrusively builds a map of accessible relevant information and uses it to enable users to find information quickly. The difference between that work and ours is that the IRIA approach is based on a context-sensitive search which spreads the user interests based on the information relevant to users, while ours is based on a decision theoretic approach to maintain the model of the user preferences, interests and context.
6 Future work

This paper has described our ongoing work to construct an active user interface that provides intelligent assistance to the user in an information retrieval system. There are a number of issues that arise from our design and empirical evaluation. We want to extend our evaluation to a more complex scenario with different kinds of questions and search strategies. Unfortunately, the current database has the problem of low term frequency, which is usually referred to as the data sparseness problem in information retrieval [17]. We are also looking for a supplementary database or semantic network in UMLS that will help us overcome the problem of disjointness between the domain ontology and the database used as testbed. We want to measure not only link errors but also concept errors, which refer to the number of concepts in the user ontology network matched against the original real user subgraph. At present, we use a fading mechanism to fade away interests, preferences or context that are no longer used. This may result in more frequent updates than necessary if the user intent is not very dynamic. We want to employ a mechanism to differentiate between short-term and long-term interests, preferences and context in an intuitive way using findings from experimental psychology [8].

Acknowledgements

This work was supported in part by AFOSR Grant No. F49620-00-1-0244 and the AFRL Human Effectiveness Directorate through Sytronics Incorporated. Thanks to Greg Johnson and Thuong Doan for helping with this paper.

References
1. Anthony, Jr. G. F.; Devaney, M.; and Ram, A. 2000. IRIA: The information retrieval intelligent assistant. In Proceedings of the International Conference on Artificial Intelligence, 275-280.
2. Balabanovic, M.; and Shoham, Y. 1997. Content-based, collaborative recommendation. In Communications of the ACM 40(3):66-72.
3. Billsus, D.; and Pazzani, M. J. 2000. User modeling for adaptive news access. In Journal of User Modeling and User-Adapted Interaction 10(2/3):147-180.
4. Brown, S. M.; Santos, Jr. E.; and Banks, S. B. 1999. Active user interfaces for building decision-theoretic systems. In Proceedings of the 1st Asia-Pacific Conference on Intelligent Agent Technology, Hong Kong, 244-253.
5. Brown, S. M.; Santos, Jr. E.; Banks, S. B.; and Oxley, M. 1998. Using explicit requirements and metrics for interface agent user model construction. In Proceedings of the Second International Conference on Autonomous Agents, 1-7, Minneapolis, MN.
6. Brown, S. M. 1998. Decision theoretic approach for interface agent development. Ph.D. dissertation.
7. Chen, L.; and Sycara, K. 1998. Webmate: A personal agent for browsing and searching. In Proceedings of the 2nd International Conference on Autonomous Agents and Multi Agent Systems, Minneapolis, MN.
8. Ericsson, K. A.; and Kintsch, W. 1995. Long-term working memory. In Psychological Review 102:211-245.
9. Horvitz, E.; Breese, J.; Heckerman, D.; Hovel, D.; and Rommelse, K. 1998. The Lumiere project: Bayesian user modeling for inferring goals and needs of software users. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, 256-265.
10. Hwang, C. H. 1999. Incompletely and imprecisely speaking: Using dynamic ontologies for representing and retrieving information. InfoSleuth Group, Microelectronics and Computer Technology Corp, 3500 West Balcones Center Drive, Austin, TX 78759.
11. Maes, P. 1994. Agents that reduce work and information overload. In Communications of the ACM 37(7):31-40.
12. Geddes, N. 1986. The use of individual differences in inferring human operator intentions. In Proceedings of the Second Annual Aerospace Applications of Artificial Intelligence Conference.
13. Gruber, T. R. 1993. Toward principles for the design of ontologies used for knowledge sharing. In the International Workshop on Ontology.
14. Salton, G.; and McGill, M. 1983. Introduction to Modern Information Retrieval. McGraw-Hill Book Company.
15. Santos, Jr. E.; Brown, S. M.; Lejter, M.; and Banks, S. B. 1999. Dynamic user model construction with Bayesian networks for intelligent information queries. In Proceedings of the 12th FLAIRS Conference, 3-7.
16. Santos, Jr. E.; Nguyen, H.; and Brown, S. M. 2000. Medical document information retrieval through active user interfaces. In Proceedings of the 2000 International Conference in Artificial Intelligence, 323-329.
17. van Rijsbergen, C. 1975. Information Retrieval. The Whitefriars Press Ltd, London and Tonbridge.
18. Widyantoro, D. H.; Ioerger, T. R.; and Yen, J. 1999. Adaptive agent for learning changes in user interests. In Proceedings of the International Conference on Information and Knowledge Management (CIKM'99), Kansas City.
iJADE WeatherMAN - A MULTIAGENT FUZZY-NEURO NETWORK BASED WEATHER PREDICTION SYSTEM

RAYMOND LEE, JAMES LIU, JANE YOU
Department of Computing, Hong Kong Polytechnic University, Hong Kong
Email: {csstlee, csnkliu, [email protected]

Weather forecasting has been one of the most challenging problems around the world for more than half a century. This is not only because of its practical value in meteorology, but also because it is a typical "unbiased" time-series forecasting problem in scientific research. In this paper, we propose an innovative intelligent multi-agent based environment, namely iJADE (intelligent Java Agent Development Environment), to provide an integrated and intelligent agent-based platform in the e-commerce environment. In addition to the features of contemporary agent development platforms, which focus on the autonomy and mobility of the multi-agents, iJADE provides an intelligent layer (known as the 'conscious layer') to implement various AI functionalities in order to produce 'smart' agents. From the implementation point of view, we introduce the iJADE WeatherMAN, an intelligent multi-agent based system for automatic weather information gathering, filtering and time series weather prediction (which is done by a fuzzy-neuro network model), based on the weather information provided by various weather stations. Compared with previous studies on a single point source using a similar network and others such as the radial basis function network, learning vector quantization and the Naive Bayesian network, the results are very promising. This neural-based rainfall forecasting system is useful and parallel to the traditional forecast from the Hong Kong Observatory.
1 Introduction
Traditionally, weather forecasting is based mainly on numerical models [15]. This classical approach attempts to model the fluid and thermal dynamic systems for grid-point time series prediction based on boundary meteorological data. The simulation often requires intensive computations involving complex differential equations and computational algorithms. Besides, the accuracy is bound by certain "inherited" constraints such as the adoption of incomplete boundary conditions, model assumptions and numerical instabilities [12]. This kind of approach is more appropriate for long-term (over 24 hours) forecasting over a large area of several thousand kilometres [2]. For mesoscale and short-term weather forecasting in a relatively small region such as Hong Kong, an alternative is needed. In addition, meteorologists generally recognize that accurate rainfall forecasting is quite difficult, due to the geographical and topographic features of the region. Similar studies [11][14] confirmed this difficulty, which presents a challenge for further investigation of the problem.
Moreover, with the exponential growth in Internet technology and e-commerce applications, real-time weather information gathering (from different weather stations) and weather prediction (such as temperature and rainfall forecasts) are increasingly important and demanding. Agent technology, with its automatic delegation of tasks and its autonomous and highly mobile characteristics in the Web environment, is starting to play an important role in e-commerce in the new millennium. However, contemporary agent systems such as IBM Aglets [1] and ObjectSpace Voyager [16] focus on mobility and multi-agent communications. The 'core' functions of intelligent agents (IA) - the AI (Artificial Intelligence) counterpart with intelligent capabilities including machine learning, intelligent pattern recognition and classification functions - are difficult to implement.
In a typical e-shopping scenario, most of the time we are handling 'inexact' product selection criteria. For instance, in rainfall prediction, we might not be interested in the exact rainfall amount; instead we only care about the degree of "Rain Depth", ranging from a "nil" or "trace" amount to "heavy" rainfall. In these cases, we are dealing with different degrees of 'fuzziness', which can be efficiently handled by AI technologies such as fuzzy systems. With the integration of machine learning techniques such as neural networks, an 'intelligent' weather forecasting system can be constructed.
In this paper, we propose a comprehensive and truly intelligent agent-based framework, known as iJADE (pronounced as 'IJ') - intelligent Java Agent Development Environment. To compensate for the deficiency of contemporary agent software platforms such as IBM Aglets and ObjectSpace Voyager Agents, which mainly focus on multi-agent mobility and communication, iJADE provides an ingenious layer called the 'Conscious (Intelligent) Layer', which implements various AI functionalities in multi-agent applications. From the implementation point of view, we demonstrate how the iJADE model can be applied in a real-time weather information gathering and weather prediction process - iJADE WeatherMAN. In short, iJADE WeatherMAN integrates intelligent mobile agent technology for information gathering with a fuzzy neural network for weather prediction.
2 iJADE Architecture
2.1 iJADE Framework: ACTS and DNA Layered Model
In this paper, we propose an innovative and fully integrated intelligent agent model called iJADE for intelligent Web-mining and other intelligent agent-based e-commerce applications. The system framework is shown in Figure 1. Unlike contemporary agent systems such as IBM Aglets [1], which focus on multi-agent communication and autonomous operations, the aim of iJADE is to provide a comprehensive 'intelligent' agent-based framework and applications for future e-commerce and Web-mining applications. Figure 1 depicts the two levels of abstraction in the iJADE system: a) the iJADE system level - ACTS model, and b) the iJADE data level - DNA model. The ACTS model consists of 1) the Application Layer, 2) the Conscious (Intelligent) Layer, 3) the Technology Layer, and 4) the Supporting Layer. The DNA model is composed of 1) the Data Layer, 2) the Neural Network Layer, and 3) the Application Layer.
Figure 1. System architecture of the iJADE (v1.6) model
2.2 Application layer including iJADE eMiner
This is the uppermost layer, which consists of different intelligent agent-based applications. These iJADE applications are developed by integrating intelligent agent components from the 'Conscious Layer' with the data 'knowledge fields' from the DNA model. Current applications (iJADE v1.6) implemented in this layer include:
• iJADE eMiner [10] - an intelligent Web-mining agent system for e-shopping. It consists of the implementation of 1) FAgent, an automatic authentication system based on human face recognition [6], and 2) FShopper, a fuzzy agent-based Internet shopping agent [7].
• iJADE WeatherMAN - the intelligent weather forecasting agent proposed in this paper, which extends previous research on multi-station weather forecasting using fuzzy neural networks [13].
• iJADE WShopper [9] - an integrated intelligent fuzzy shopping agent with WAP technology for intelligent mobile shopping on the Internet.
• iJADE Stock Advisor - an intelligent agent-based stock prediction system using a time series neuro-oscillatory prediction technique [8].
2.3 Conscious (Intelligent) layer
This layer provides the intelligent basis of the iJADE system, using the agent components provided by the 'Technology Layer'. The 'Conscious Layer' consists of the following three main intelligent functional areas:
• 'Sensory Area' - for the recognition and interpretation of incoming stimuli. It includes a) visual sensory agents using EGDLM (Elastic Graph Dynamic Link Model) for invariant visual object recognition [4], and b) auditory sensory agents based on a wavelet-based feature extraction and interpretation scheme [3].
• 'Logic Reasoning Area' - a conscious area providing different AI tools for logical 'thinking', including rule-based reasoning such as fuzzy and GA (Genetic Algorithms) rule-based systems [5].
• 'Analytical Area' - consists of various AI tools for analytical calculation, such as recurrent neural network-based analysis for real-time prediction and data mining [8].
2.4 Technology and Supporting layers
The technology layer provides all the necessary mobile agent implementation APIs for the development of intelligent agent components in the 'Conscious Layer'. In the current version of the iJADE model, IBM Aglets [1] are used as the agent 'backbone'. The supporting layer provides all the necessary system support to the 'Technology Layer'. It includes 1) programming language support based on Java, 2) network protocol support such as HTTP, HTTPS, ATP, etc., and 3) markup language support such as HTML, XML, WML, etc.
3 Implementation - iJADE WeatherMAN
3.1 iJADE WeatherMAN - System Overview
In this paper, we use the iJADE model as the system framework to introduce an innovative intelligent agent-based weather information extraction, analysis and prediction system, based on weather information provided by multiple weather stations distributed in Hong Kong. To summarize, iJADE WeatherMAN consists of five major components:
1. User Requirement Definition Scheme (URDS)
2. Data Collection Scheme (DCS)
3. Variable Selection and Transformation Scheme (VSTS)
4. Fuzzy-neuro Training and Prediction Scheme (FNTPS)
5. Weather Reporting Scheme (WRS)
A schematic diagram of the iJADE WeatherMAN is depicted in Figure 2.
3.2 User Requirement Definition Scheme (URDS) & Weather Reporting Scheme (WRS)
The URDS involves the collection of the user requirements, including the selection of forecasting element(s) (such as temperature, rainfall, humidity, etc.), the forecasting range (e.g. next-day or next-N-day forecast) and other parameters such as regional forecast / global weather report. This information is collected via the iJADE Weather Reporter - a stationary agent situated in the client machine for the collection of user requirements, the negotiation and dispatch of mobile agents (iJADE Weather Messengers), and the final weather reporting in the WRS.
Figure 2 - Schematic diagram of the iJADE WeatherMAN
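As an illustration of the user requirement collected by the URDS, the following Python sketch groups the three kinds of choices named in Section 3.2 (forecasting elements, forecasting range, and regional or global scope) into one record. The class name ForecastRequest, its field names and its validation rules are hypothetical and are not taken from the iJADE implementation; the element list contains only the examples named in the text.

```python
from dataclasses import dataclass
from typing import Tuple

# Only the example elements named in the text; the real list may be longer.
FORECAST_ELEMENTS = ("temperature", "rainfall", "humidity")
SCOPES = ("regional", "global")


@dataclass(frozen=True)
class ForecastRequest:
    """Hypothetical record of one user requirement (not the iJADE API)."""
    elements: Tuple[str, ...]   # forecasting element(s) selected by the user
    days_ahead: int = 1         # 1 = next-day forecast, N = next-N-day forecast
    scope: str = "regional"     # regional forecast or global weather report

    def __post_init__(self):
        unknown = [e for e in self.elements if e not in FORECAST_ELEMENTS]
        if not self.elements or unknown:
            raise ValueError(f"invalid forecasting elements: {unknown}")
        if self.days_ahead < 1:
            raise ValueError("days_ahead must be at least 1")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope}")


# Example: a three-day regional rainfall forecast request.
request = ForecastRequest(elements=("rainfall",), days_ahead=3)
```

A stationary agent such as the Weather Reporter could build one such record per user and hand it to the mobile agents it dispatches; how iJADE actually encodes the requirement is not stated in the paper.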
3.3 Data Collection Scheme (DCS)
In this scheme, each iJADE Weather Messenger (mobile agent) will "visit" different weather stations to collect weather information, which will in turn be centralized within the main weather center (HKO site) for further processing. In our implementation, meteorological data for the period between 1993 and 1997 were collected from 11 regional weather stations in Hong Kong. These stations are located at Ta Kwu Ling (TKL), Ping Chau (EPC), Lau Fau Shan (LFS), Tai Po Kau (TPK), Sha Tin (SHA), Sha Lo Wan (SLW), H.K. Observatory (HKO), Junk Bay (JKB), Wong Chuk Hang (WCH), Cheung Chau (CCH) and Waglan Island (WGL). The data are 6-hour records of weather elements at 0600H, 1200H, 1800H and 2400H, and they include: dry-bulb temperature (TT); dew-point temperature (DP); relative humidity (RH); hourly rainfall (RF); mean wind speed (FF); 60-min prevailing wind direction (DD); mean sea-level pressure (MSLP).
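The collected data can be pictured as one record per station and observation time. The Python sketch below uses the station codes, observation times and element abbreviations listed above; the class name WeatherRecord, the date format and the use of None for a missing element are assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

# Station codes and observation times as listed in the text.
STATIONS = ("TKL", "EPC", "LFS", "TPK", "SHA", "SLW",
            "HKO", "JKB", "WCH", "CCH", "WGL")
OBSERVATION_HOURS = ("0600", "1200", "1800", "2400")
# Element abbreviations as listed in the text, in lower case.
ELEMENTS = ("tt", "dp", "rh", "rf", "ff", "dd", "mslp")


@dataclass
class WeatherRecord:
    """One 6-hour observation; None marks a missing element (assumed layout)."""
    station: str
    date: str                       # e.g. "1995-07-01" (format is an assumption)
    hour: str
    tt: Optional[float] = None      # dry-bulb temperature
    dp: Optional[float] = None      # dew-point temperature
    rh: Optional[float] = None      # relative humidity
    rf: Optional[float] = None      # hourly rainfall
    ff: Optional[float] = None      # mean wind speed
    dd: Optional[float] = None      # 60-min prevailing wind direction
    mslp: Optional[float] = None    # mean sea-level pressure

    def __post_init__(self):
        if self.station not in STATIONS:
            raise ValueError(f"unknown station: {self.station}")
        if self.hour not in OBSERVATION_HOURS:
            raise ValueError(f"unknown observation hour: {self.hour}")

    def missing_elements(self) -> List[str]:
        """Names of the elements that were not recorded."""
        return [name for name in ELEMENTS if getattr(self, name) is None]


# Invented sample record with only two elements present.
record = WeatherRecord("HKO", "1995-07-01", "0600", tt=28.5, rh=80.0)
```

Marking gaps explicitly makes it easy to count the missing elements per station before any grouping or approximation of the data.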
3.4 Variable Selection and Transformation Scheme (VSTS)
Once all the iJADE Weather Messengers are gathered in the central station, they exchange and integrate the weather information they hold and reorganize it (which involves weather element selection, information grouping and transformation) so that it can be used more efficiently for system training and testing. For instance, in our experimental testing using the weather information from the 11 stations, there was severe loss of data from certain stations. As a result, the 11 stations were grouped to cover five regions (R1, R2, R3, R4 & R5) according to the distribution of weather records that can be collected in each region. Certain incomplete data were approximated by linear interpolation using nearby values of the same element within the region.
[Bar chart: vertical axis from 0 to 100; horizontal axis: Weather Station; legend: TT, DD, DP, FF, MSLP, RF, RH]
Figure 3 - Data distribution among 11 weather stations
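The approximation of incomplete data described in Section 3.4 can be sketched as follows. The paper does not say whether "nearby values" means neighbouring observation times or neighbouring stations; this sketch assumes a time-ordered series of one element and fills each gap by a straight line between the nearest known values on either side. The function name fill_gaps_linear is hypothetical.

```python
from typing import List, Optional


def fill_gaps_linear(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end of the series have only one known neighbour
    and are left as None.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    # Walk over each pair of consecutive known positions and fill between them.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


# Invented example: the middle reading is missing.
print(fill_gaps_linear([None, 10.0, None, 14.0, None]))
# prints [None, 10.0, 12.0, 14.0, None]
```

Leaving the leading and trailing gaps unfilled avoids extrapolating beyond the recorded values; whether the original system did the same is not stated.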
3.5 Fuzzy-neuro Training and Prediction Scheme (FNTPS)
[Diagram labels: Neural Network Module; Fuzzy Module; Input Variables; Membership functions]
Figure 4 - Schematic diagram for Fuzzy-neuro network on rainfall (RF) forecast using meteorological data: Relative humidity (RH), dry-bulb temperature (TT), dew-point temperature (DT), wind direction (WD), wind speed (WS), mean sea level pressure (PR) and rainfall (RF)
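In Figure 4, each input variable passes through membership functions in the fuzzy module before reaching the neural network module. The Python sketch below illustrates that fuzzification step only. The triangular shape, the three fuzzy sets and their breakpoints for relative humidity are assumptions chosen for illustration; the paper does not give the actual membership functions.

```python
from typing import Dict, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership grade of x in a triangle with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x: float, fuzzy_sets: Dict[str, Tuple[float, float, float]]) -> Dict[str, float]:
    """Map a crisp value to its grade in every fuzzy set."""
    return {label: triangular(x, *abc) for label, abc in fuzzy_sets.items()}


# Hypothetical fuzzy sets for relative humidity (RH) in percent.
# The outer triangles extend beyond 0 and 100 so that the end points
# of the range receive full membership in "low" and "high".
RH_SETS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}

grades = fuzzify(75.0, RH_SETS)
print(grades)  # prints {'low': 0.0, 'medium': 0.5, 'high': 0.5}
```

Applying the same step to every input variable would yield the vector of membership grades that a network like the one in Figure 4 takes as input.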
Once all of the relevant weather information has been collected and pre-processed, the iJADE Weather Forecaster (a stationary computational agent situated in the central station) starts the appropriate network training and forecasting, based on a back-propagation fuzzy-neuro network (Figure 4). Table 1 shows the five categories defined for the fuzzification of the rainfall element.
Table 1. Rainfall categories
Category (mm)
Nil
Trace
Light
Moderate
Heavy
Range in depth
0