Lecture Notes in Artificial Intelligence Edited by J. G. Carbonell and J. Siekmann
Subseries of Lecture Notes in Computer Science
3904
Matteo Baldoni Ulle Endriss Andrea Omicini Paolo Torroni (Eds.)
Declarative Agent Languages and Technologies III Third International Workshop, DALT 2005 Utrecht, The Netherlands, July 25, 2005 Selected and Revised Papers
Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany

Volume Editors
Matteo Baldoni
Università di Torino, Dipartimento di Informatica
via Pessinetto 12, 10149 Torino, Italy
E-mail: [email protected]

Ulle Endriss
University of Amsterdam, Institute for Logic, Language and Computation
Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands
E-mail: [email protected]

Andrea Omicini
Università di Bologna, DEIS, Dipartimento di Elettronica, Informatica e Sistemistica
Alma Mater Studiorum, Via Venezia 52, 47023 Cesena, Italy
E-mail: [email protected]

Paolo Torroni
Università di Bologna, DEIS, Alma Mater Studiorum
Viale Risorgimento 2, 40136 Bologna, Italy
E-mail: [email protected]

Library of Congress Control Number: 2006922191
CR Subject Classification (1998): I.2.11, C.2.4, D.2.4, D.2, D.3, F.3.1
LNCS Sublibrary: SL 7 – Artificial Intelligence
ISSN 0302-9743
ISBN-10 3-540-33106-9 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-33106-3 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com

© Springer-Verlag Berlin Heidelberg 2006
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 11691792 06/3142 543210
Preface

The workshop on Declarative Agent Languages and Technologies is a well-established venue for researchers interested in sharing their experiences in the areas of declarative and formal aspects of agents and multi-agent systems, and in engineering and technology. Today it is still a challenge to develop technologies that can satisfy the requirements of complex agent systems. The design and development of multi-agent systems still calls for models and technologies that ensure predictability, enable feature discovery, allow for the verification of properties, and guarantee flexibility. Declarative approaches are potentially a valuable means for satisfying the needs of multi-agent system developers and for specifying multi-agent systems.

DALT 2005, the third edition of the workshop, was held in Utrecht, The Netherlands, in July 2005, in conjunction with AAMAS 2005, the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems. Over 30 people attended the workshop, confirming the success of the previous editions in Melbourne 2003 (LNAI 2990) and New York 2004 (LNAI 3476). The workshop series is a forum of discussion aimed both at supporting the transfer of declarative paradigms and techniques into the broader community of agent researchers and practitioners, and at bringing the issues of designing real-world and complex agent systems to the attention of researchers working on declarative programming and technologies.

A twofold process led to this volume. On the one hand, the best papers presented at the workshop were selected after a further, meticulous reviewing process. On the other hand, an open call was issued for contributions that had not been submitted to the original workshop call for papers, which resulted in a few additional papers, chosen through a very strict reviewing process. As a result, this volume contains 14 papers and is organized into four parts corresponding to the main topics of DALT: agent programming and beliefs, architectures and logic programming, knowledge representation and reasoning, and coordination and model checking. Each paper was reviewed by at least three members of the Programme Committee in order to supply the authors with rich feedback that could stimulate the research.

Part I - Agent Programming and Beliefs

The first part of this volume contains three papers. The first work, "Beliefs in Agent Implementation", by Winkelhagen, Dastani, and Broersen, extends the language 3APL with beliefs represented as explicit modal operators. A proof procedure is also presented and shown to be sound. The second work, "Modelling Uncertainty in Agent Programming", by Kwisthout and Dastani, tackles the uncertainty of agent beliefs, modelling it by means of Dempster-Shafer theory and reporting complexity results. The last work, "Complete Axiomatizations of Finite Syntactic Epistemic States", by Ågotnes and Walicki, discusses a formal model of knowledge as explicitly computed sets of formulae, extending the
epistemic language with an operator which expresses what an agent knows at most.
Part II - Architectures and Logic Programming

The second part contains four papers. The first, "An Architecture for Rational Agents", by Lloyd and Sears, proposes an agent architecture in which agents have belief bases that are theories in a multi-modal, higher-order logic. Machine learning techniques are used to update the belief base. The second paper, "LAIMA: A Multi-Agent Platform Using Ordered Choice Logic Programming", by De Vos, Crick, Padget, Brain, Cliffe, and Needham, introduces a deductive reasoning multi-agent platform based on an extension of answer set programming. Agents are represented as ordered choice logic programs. The third work, "A Distributed Architecture for Norm-Aware Agent Societies", by García-Camino, Rodríguez-Aguilar, Sierra, and Vasconcelos, describes a distributed architecture that accounts for a "social layer" in which rules are used for representing normative positions. Last but not least, "About Declarative Semantics of Logic-Based Agent Languages", by Costantini and Tocchio, provides a declarative semantics for logic-based agent-oriented languages, focusing on DALI as a case study and paying particular attention to communication among agents.
Part III - Knowledge Representation and Reasoning

This part consists of five papers. The first paper, "Goal Decomposition Tree: An Agent Model to Generate a Validated Agent Behaviour", by Simon, Mermet, and Fournier, presents the Goal Decomposition Tree agent model, which allows both the specification and validation of agent behavior. The second paper, "Resource-Bounded Belief Revision and Contraction", by Alechina, Jago, and Logan, is set in the context of the AGM postulates and presents a linear time belief contraction operation that satisfies all but one of these postulates for contraction. The third paper, "Agent-Oriented Programming with Underlying Ontological Reasoning", by Moreira, Vieira, Bordini, and Hübner, defines a version of the BDI agent-oriented programming language AgentSpeak which is based on description logic. The authors use as a running example the well-known smart meeting-room scenario. The fourth paper, "Dynagent: An Incremental Forward-Chaining HTN Planning Agent in Dynamic Domains", by Hayashi, Tokura, Hasegawa, and Ozaki, presents an agent algorithm that integrates forward-chaining HTN planning, execution, belief updates, and plan modifications. This approach enables agents to deal with dynamic worlds. The fifth paper, "A Combination of Explicit and Deductive Knowledge with Branching Time: Completeness and Decidability Results", by Lomuscio and Woźna, introduces a combination of Computation Tree Logic and an epistemic logic, which encompasses an epistemic operator to represent explicit knowledge. The properties of the obtained logic, such as decidability, are presented.
Part IV - Coordination and Model Checking The last part of the volume contains two papers. “An Intensional Programming Approach to Multi-agent Coordination in a Distributed Network of Agents”, by Wan and Alagar, presents an extension of Lucx and discusses the Intensional Programming Paradigm, with the aim of providing a programming model for coordinated problem solving in a multi-agent system. The last work in this collection, “A Tableau Method for Verifying Dialogue Game Protocols for Agent Communication”, by Bentahar, Moulin, and Meyer, proposes a tableau-based model checking technique for verifying dialogue game protocols, defined using a social commitment-based framework for agent communication called Commitment and Argument Network.
DALT is now looking forward to its fourth meeting, which will take place in May 2006 in Hakodate, Japan, again as an AAMAS workshop, and will be chaired by Matteo Baldoni and Ulle Endriss. Besides the traditional DALT topics, the next edition will pay particular attention to the impact of the development of declarative approaches to application areas such as the semantic web, web services, security, and electronic contracting.
January 2006
Matteo Baldoni Ulle Endriss Andrea Omicini Paolo Torroni
Organization

Workshop Organizers

Matteo Baldoni, University of Turin, Italy
Ulle Endriss, University of Amsterdam, The Netherlands
Andrea Omicini, University of Bologna a Cesena, Italy
Paolo Torroni, University of Bologna, Italy
Programme Committee

Natasha Alechina, University of Nottingham, UK
Rafael Bordini, University of Durham, UK
Brahim Chaib-draa, Laval University, Canada
Alessandro Cimatti, ITC-IRST, Trento, Italy
Keith Clark, Imperial College London, UK
Marco Colombetti, Politecnico di Milano, Italy
Stefania Costantini, University of L'Aquila, Italy
Mehdi Dastani, Utrecht University, The Netherlands
Jürgen Dix, University of Clausthal, Germany
Boi Faltings, EPFL, Switzerland
Michael Fisher, University of Liverpool, UK
Wiebe van der Hoek, University of Liverpool, UK
Michael N. Huhns, University of South Carolina, USA
Catholijn Jonker, Radboud University Nijmegen, The Netherlands
Peep Küngas, NUST, Trondheim, Norway
Yves Lespérance, York University, Toronto, Canada
Brian Logan, University of Nottingham, UK
Alessio Lomuscio, University College London, UK
Viviana Mascardi, University of Genova, Italy
John-Jules Ch. Meyer, Utrecht University, The Netherlands
Eric Monfroy, University of Nantes, France
Sascha Ossowski, Universidad Rey Juan Carlos, Madrid, Spain
Julian Padget, University of Bath, UK
Lin Padgham, RMIT University, Melbourne, Australia
Wojciech Penczek, Polish Academy of Sciences, Warsaw, Poland
Luís Moniz Pereira, Universidade Nova de Lisboa, Portugal
Enrico Pontelli, New Mexico State University, USA
Juan Rodriguez-Aguilar, Spanish Research Council, Barcelona, Spain
Luciano Serafini, ITC-IRST, Trento, Italy
Marek Sergot, Imperial College London, UK
Francesca Toni, Imperial College London, UK
Wamberto Vasconcelos, University of Aberdeen, UK
Michael Winikoff, RMIT University, Melbourne, Australia
Franco Zambonelli, University of Modena and Reggio Emilia, Italy
Additional Reviewers

Alessandro Artale
Cristina Baroglio
Valentina Gliozzi
Alberto Martelli
Eric Pacuit
Viviana Patti
Gian Luca Pozzato
Sebastian Sardina
Table of Contents
Agent Programming and Beliefs

Beliefs in Agent Implementation
Laurens Winkelhagen, Mehdi Dastani, Jan Broersen ..... 1

Modelling Uncertainty in Agent Programming
Johan Kwisthout, Mehdi Dastani ..... 17

Complete Axiomatizations of Finite Syntactic Epistemic States
Thomas Ågotnes, Michal Walicki ..... 33

Architectures and Logic Programming

An Architecture for Rational Agents
John W. Lloyd, Tom D. Sears ..... 51

LAIMA: A Multi-agent Platform Using Ordered Choice Logic Programming
Marina De Vos, Tom Crick, Julian Padget, Martin Brain, Owen Cliffe, Jonathan Needham ..... 72

A Distributed Architecture for Norm-Aware Agent Societies
Andrés García-Camino, Juan A. Rodríguez-Aguilar, Carles Sierra, Wamberto Vasconcelos ..... 89

About Declarative Semantics of Logic-Based Agent Languages
Stefania Costantini, Arianna Tocchio ..... 106

Knowledge Representation and Reasoning

Goal Decomposition Tree: An Agent Model to Generate a Validated Agent Behaviour
Gaële Simon, Bruno Mermet, Dominique Fournier ..... 124

Resource-Bounded Belief Revision and Contraction
Natasha Alechina, Mark Jago, Brian Logan ..... 141

Agent-Oriented Programming with Underlying Ontological Reasoning
Álvaro F. Moreira, Renata Vieira, Rafael H. Bordini, Jomi F. Hübner ..... 155
Dynagent: An Incremental Forward-Chaining HTN Planning Agent in Dynamic Domains
Hisashi Hayashi, Seiji Tokura, Tetsuo Hasegawa, Fumio Ozaki ..... 171

A Combination of Explicit and Deductive Knowledge with Branching Time: Completeness and Decidability Results
Alessio Lomuscio, Bożena Woźna ..... 188

Coordination and Model Checking

An Intensional Programming Approach to Multi-agent Coordination in a Distributed Network of Agents
Kaiyu Wan, Vasu S. Alagar ..... 205

A Tableau Method for Verifying Dialogue Game Protocols for Agent Communication
Jamal Bentahar, Bernard Moulin, John-Jules Ch. Meyer ..... 223

Author Index ..... 245
Beliefs in Agent Implementation

Laurens Winkelhagen, Mehdi Dastani, and Jan Broersen

Institute of Information and Computing Sciences, Utrecht University,
P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
{lwinkelh, mehdi, broersen}@cs.uu.nl
Abstract. This paper extends a programming language for implementing cognitive agents with the capability to explicitly represent beliefs and reason about them. In this programming language, the beliefs of agents are implemented by modal logic programs, where beliefs are represented by explicit modal operators. A distinction is made between a belief base language that can be used to represent an agent’s beliefs, and a belief query language that can be used to express queries to the agent’s belief base. We adopt and modify a proof procedure that decides if a belief query formula is derivable from the belief base of an agent. We show that the presented proof procedure is sound.
1 Introduction
This paper presents an extension of the agent programming language 3APL [1]. This programming language provides data structures such as beliefs, goals, plans and reasoning rules, as well as programming constructs to manipulate these data structures. Examples of such constructs are updating the beliefs, planning a goal and executing a plan. In multi-agent settings, agents are assumed to have the ability to communicate. Several specifications have been proposed to facilitate agent communication; amongst these, the FIPA¹ standards are important to 3APL. According to FIPA, agents can communicate by sending each other messages that contain, amongst other things, a communicative act and the message content. The communicative acts specified by FIPA [2] require that a message with a certain performative can only be sent if certain belief formulae, the preconditions of the act, hold. This necessitates capabilities in the agent programming language to implement agents that can reason with beliefs. These beliefs can be about the beliefs of the agent that wants to send the message, or about the beliefs of the receiver of the message. For example, the INFORM act specifies that the sender believes that the receiver has no beliefs about the message content. The programming language 3APL implements agent belief in terms of a set of Horn clause formulae and uses a Prolog engine to verify if the agent has a certain belief. However, 3APL lacks (1) the possibility to represent beliefs of agents about their own beliefs and about the beliefs of other agents, and (2) the possibility to reason with these beliefs.
¹ The Foundation for Intelligent Physical Agents: http://www.fipa.org/
M. Baldoni et al. (Eds.): DALT 2005, LNAI 3904, pp. 1–16, 2006. © Springer-Verlag Berlin Heidelberg 2006
In this paper we show how we can extend the language of 3APL and the underlying Prolog mechanism in order to implement these two features. We take existing work [3, 4] on adding modal reasoning capabilities to logic programming, and investigate how to use this in combination with 3APL. We extend the programming language with an explicit modal operator of belief and provide a proof method that allows 3APL agents to function correctly with this modal operator. We show that this proof method is sound. In sections 2 and 3 we give a quick introduction to 3APL and modal logic programming, respectively. The bulk of the paper is section 4, in which we combine one approach to modal logic programming with the 3APL programming language, first discussing the syntactical changes, then the semantical interpretations, and finally giving a soundness proof for these semantics. Section 5 is the conclusion, in which we will also point out some areas for further research.
2 3APL
Like other agent programming languages, 3APL provides data structures and programming constructs to manipulate them. Since 3APL is designed to implement cognitive agents, its data structures represent cognitive concepts such as beliefs, goals, plans, and reasoning rules. These data structures can be modified by programming constructs, also called deliberation operations, such as selecting a goal, applying a planning rule to it, or executing a plan. These operations constitute the deliberation process of individual agents, which can be viewed as the agent interpreter. The formal syntax and semantics of 3APL are given in [1]. In this section, we will explain the ingredients of this programming language and give the formal definition of only those ingredients that are relevant for the research problem of this paper, i.e. the beliefs of 3APL agents.

The beliefs and goals are logical formulae representing the current and desirable states of the world, respectively. The goals of the agents are represented as logical formulae which are conjunctions of atomic ground² formulae. The beliefs of 3APL agents can be specified by formulae in the following belief base language:

Definition 1. (base language and belief base language) Let Var, Func and Pred be the sets of domain variables, functions and predicates, respectively. Let Term be the set of terms constructed from variables and functions in the usual way. The base language L is defined as the set of atomic formulae built on terms Term and predicates Pred in the usual way. Let ψ ∈ L be a ground (atomic) formula of the base language and let φ, φ1, ..., φn ∈ L. The belief base language LBB, which represents the beliefs of agents, is defined as follows:

ψ, ∀x1,...,xn(φ1 ∧ ... ∧ φn → φ) ∈ LBB

where ∀x1,...,xn(ϕ) denotes the universal closure of the formula ϕ for every variable x1, ..., xn occurring in ϕ.
² Throughout this paper we will use such standard terminology. In case of doubt, we will use the same terminology as used in [1].
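To make the shape of such a belief base concrete, the following Prolog-style fragment is our own illustration of what Definition 1 permits; the predicates and constants are invented and do not come from the paper. Facts are ground atoms, and rules are Horn clauses whose variables are implicitly universally quantified.

    % Illustrative belief base: two facts and a recursive rule.
    on(block_a, table).
    on(block_b, block_a).
    above(X, Y) :- on(X, Y).              % X is above Y if it is directly on Y
    above(X, Z) :- on(X, Y), above(Y, Z). % ... or on something that is above Y

A Prolog engine answers a query such as above(block_b, table) by backward chaining over these clauses, which is exactly how 3APL tests whether an agent has a certain belief.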
In order to reach its goals, a 3APL agent adopts plans. A plan is built from basic elements that can be composed by sequence operators, if-then-else constructs, and while-do loops. The basic elements can be basic actions, test actions, or abstract plans. A test action checks whether a certain formula is derivable from the belief base. An abstract plan is an abstract representation of a plan which can be instantiated with a plan during execution. Thus, an abstract plan cannot be executed directly and should be rewritten into another plan, possibly (and even probably) containing executable basic actions, through application of reasoning rules (see below). There are three types of basic actions. The first type of basic action is the mental action. This action modifies the beliefs of the agent and is specified in terms of pre- and post-conditions in the form of belief formulae. A mental action can be performed if the pre-condition is derivable from the agent's beliefs, after which the post-condition must be derivable. The external actions can be performed in the environment of the agents. The effect of these actions is determined by the environment and can be perceived by the agent through sensing. The communication actions pass messages to another agent. A message contains the name of the receiver of the message, the speech act or performative (e.g. inform, request, etc.) of the message, and the content. The content of the message is a belief formula.

In order to reason with goals and plans, 3APL has two types of reasoning rules: goal planning rules and plan revision rules. A goal planning rule, which is a tuple consisting of a goal formula, a belief formula and a plan, indicates that the state represented by the goal formula can be reached by the plan if the belief formula holds. Such a rule is applicable when the agent has a goal unifiable with the goal formula of the rule and when the belief formula of the rule is derivable from the belief base. Application of such a rule will add the plan to the set of plans of the agent. A plan revision rule, which is a tuple consisting of a belief formula and two plans, indicates that the first plan can be replaced by the second plan if the belief formula holds. Such a rule is applicable when the agent has a plan unifiable with the first plan of the rule and when the belief formula of the rule is derivable from the belief base. The application of the rule will replace the unifiable plan of the agent with the second plan. The plan revision rules are a powerful way to handle failed or blocked plans, as well as to adapt and remove plans of the agent.

A 3APL agent starts its deliberation with a number of goals to achieve. Planning rules are then applied to generate plans for the goals, after which the plans are executed to achieve the goals. Plans may start with mental actions for which the pre-conditions are not true. In this case, the plan revision rules are applied to generate alternative plans. During the execution of 3APL agents, there are four cases where it is checked if a certain formula is derivable from the beliefs of the agent. These cases are related to the execution of test actions and mental actions, and to the application of the goal planning rules and plan revision rules. In fact, the content of test actions, the pre-condition of mental actions, and the guard of the rules are logical formulae which should be derivable before these
actions and rules can be executed or applied. We define the language of these formulae, the belief query language (LB).

Definition 2. (belief query language) Let L be the base language. Then, the belief query language LB with typical formula β is defined as follows:

– if φ ∈ L, then B(φ), ¬B(φ) ∈ Disjunction,
– ⊤ ∈ Disjunction,
– if δ, δ′ ∈ Disjunction, then δ ∨ δ′ ∈ Disjunction,
– if δ ∈ Disjunction, then δ ∈ LB,
– if β, β′ ∈ LB, then β ∧ β′ ∈ LB.
The bold 'B' is not a modal operator, but is used to represent a query expression. For example, Bϕ represents the query whether ϕ is derivable from the belief base. We use ¬Bϕ to represent the query whether ϕ is not derivable from the belief base. This interpretation of ¬ corresponds with the interpretation of negation as failure in logic programming. The B operator prevents the distribution of this negation over the belief query.

The 3APL semantics is based on an operational semantics which is defined in terms of a transition system. A transition system is a set of derivation rules for deriving transitions. A transition is a transformation of one configuration (or state) into another, and it corresponds to a single computation step. In the case of the operational semantics for 3APL agents, the configurations are the mental states of the 3APL agents, defined in terms of the belief base, goal base, plan base, and a substitution which assigns terms to variables.

Definition 3. (configuration) Let LGB be the goal language and LP be the plan language³. A configuration of an individual 3APL agent is a tuple ⟨ι, σ, γ, Π, θ⟩, where ι is an agent identifier, σ ⊆ LBB is the belief base of the agent, γ ⊆ LGB is the goal base of the agent, Π ⊆ LP is the plan base of the agent and θ is a ground substitution that binds domain variables to domain terms.

In order to check whether an agent in a certain state has a certain belief or not, one must check if the corresponding belief query formula is derivable from the agent configuration that represents the state of the agent. In the 3APL transition semantics, this is formally expressed through an entailment relation ⊨τ⁴. In particular, checking whether agent ι believes β ∈ LB (a belief query formula) in state ⟨ι, σ, γ, Π, θ⟩ is specified as ⟨ι, σ, γ, Π, θ⟩ ⊨τ β. The entailment relation ⊨τ is defined recursively for the compound formulae. At the level of atomic formulae the satisfaction relation is defined in terms of the propositional satisfaction relation. In particular, if β is of the form Bφ, the satisfaction relation is defined as follows:

⟨ι, σ, γ, Π, θ⟩ ⊨τ β ⇔ σ ⊨ φτ
³ Since the focus of this paper is the beliefs of the 3APL agents, the goal and plan language are not presented. For a detailed specification of these languages, see [1].
⁴ The subscript τ is a substitution under which a formula is derivable from the state.
The details of this semantics are explained in [1]. In the current implementation of 3APL, we use the Prolog engine for the propositional satisfaction relation ⊨ in order to check the derivability of a belief query formula from the agent's beliefs. In this paper, we will extend the beliefs of agents with modal (belief) formulae by adding the modal belief operator to the belief language and belief query language, and use a reasoning engine to check the derivability of belief query formulae from the agent's beliefs.
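As a small illustration (reusing the invented blocks-world predicates from the sketch following Definition 1, not an example from the paper), the belief query B(above(block_b, table)) ∧ ¬B(on(block_b, table)) succeeds against that belief base: the first conjunct is derivable by the Prolog engine, while the second holds by negation as failure, because on(block_b, table) is not derivable.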
3 PROLOG-Based Approaches to Modal Reasoning
If one wants to use modal logic in logic programming there are generally two approaches. One can translate the modal operators to a non-modal logic and apply the theory of classical logic programming (the translational approach, exemplified by Nonnengart [5]). Alternatively one can maintain the modal syntax and adapt the logic programming theory to the modal extension (the direct approach [4]). It makes sense to take a closer look at the latter approach, as it allows for an explicit introduction of modal operators in 3APL. In this section, we will take a closer look at MProlog (by Linh Anh Nguyen [6, 4]) and NemoLOG (by Matteo Baldoni [3, 7, 8]), two existing systems for modal logic programming that use the direct approach. These systems are generic in the sense that they can be used to model a variety of multi-modal logics. They allow for different modal operators as well as for different axiom systems. Both MProlog and NemoLOG ultimately operate in a similar manner, using goal-resolution, which is very much like Prolog and which can be illustrated by considering a simple example in KD45 modal logic.

Example 1. A simple modal logic programming example. Let us examine how MProlog and NemoLOG solve a modal logic program consisting of the following goal, rule and fact.

1. (goal) □ϕ
2. (rule) □(ψ → ϕ)
3. (fact) □ψ

Both systems will prove the goal (1) from the fact (3) using the rule (2), after they have determined that the modalities in the goal, the rule and the fact are in correspondence with each other. In the terminology of MProlog and NemoLOG, this correspondence is called 'direct consequence' and 'derivation' respectively.

The methods used by MProlog and NemoLOG differ slightly in various ways. For example, while MProlog allows both universal and existential modal operators, NemoLOG disallows existential modal operators [8]. They also differ in the way they define their semantics. In the case of MProlog, an SLD-resolution method is defined that allows the use of specific rules to modify the modalities in goal-clauses, in order to be able to apply a rule to the goal-clause. On the other hand, NemoLOG introduces a goal-directed proof procedure which utilizes an external derivation mechanism for modalities. We will discuss this mechanism shortly.
Finally, there is a difference in the implementation. MProlog is developed as an interpreter while, for reasons of efficiency, NemoLOG is specifically designed not to be an interpreter. As a result, programs in MProlog need to have every rule and fact designated as such beforehand, while programs in NemoLOG look just like normal Prolog programs, except for the fact that each rule makes a call to a derivation rule to check if the modal operators allow execution of the rule. NemoLOG's separation of the logic governing modalities and the logic concerning program clauses seems to make modal logic programming more clear, more intuitive and more easily adaptable to our purposes. In this paper we aim to give 3APL the instruments for accommodating a logic of belief. We will do so by blending in some of the ideas of NemoLOG. Specifically, we look at the goal-directed proof procedure. A key part of this proof procedure is the external derivation mechanism for modalities. Modalities are stripped from the goal-clause and remembered in the so-called modal context. A modal context is an environment in which goals can be proven. To prove a goal in a given modal context we may use formulae that are valid in specific other modal contexts. These modal contexts are determined by a so-called 'matching relation'. In this manner, the matching relation defines which modal logical framework we use. Syntactically, modal contexts form a data structure used by the recursive goal-resolution procedure to prove modal goals. Semantically, they resemble the possible worlds well known from Kripke semantics, and the matching relation defines the properties of the modal accessibility relation [9]. When applying a rule to a goal-clause, the goal-directed proof procedure checks whether the modalities in the rule can be derived from the modal context. Consecutively, the rule is applied to the goal and the modalities in the modal context are adjusted to represent the new environment. Our approach to modal logic programming, based on NemoLOG, will be explained in detail in section 4.3.
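As a small worked illustration of this mechanism (ours, not taken from [3]): to prove the goal B1 B1 p, the two modalities are stripped one by one, leaving the bare goal p in the modal context B1 B1. A fact guarded by B1, such as B1 p, may then be used, provided the matching relation relates the guarding sequence B1 to the context B1 B1, which is exactly what a transitive (KD45-style) matching relation licenses.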
4 Integrating Modal Belief Properties in 3APL
In the previous sections we have taken a look at 3APL and have investigated modal logic programming; we are now ready to integrate the two. This integration focuses on the belief base and belief query language of 3APL, as well as on the (Prolog) inference mechanism 3APL uses.

4.1 The Addition of Modal Operators in 3APL
In this paper we are primarily interested in installing a notion of belief into 3APL agents, so we will only add one type of modal operator to the base language: modal operators of belief. These modal operators follow the axioms of a widely accepted system for belief, namely KD45. Later, in section 4.2, we give an interpretation for this belief operator.

As in the original 3APL, we refrain from introducing direct negation to the base language. The belief base of an agent in 3APL is a Prolog program, and
therefore it is based on the closed world assumption. Negation in a belief base is modelled as negation-as-failure, meaning that formulae are negated by merely not being in the belief base. The addition of a modal operator for belief does not change this very much, though it does present us with some new possibilities regarding negation. For one, there is an intuitive difference between not believing that something is the case and believing that something is not the case. If we want to be able to distinguish these, we should add to 3APL the possibility of using classical negation within the scope of a belief modality. We leave this possibility for further research.

Because 3APL is a programming language for implementing multi-agent systems, it makes sense to introduce multiple modal operators of belief. Therefore we will introduce for each agent i ∈ Agents a modal operator Bi.

Definition 4. (extended base language) Let L be the base language as defined in Definition 1. Let Bi be a modal operator for an agent i ∈ Agents. The base language L is extended with the modal belief operator by adding the following clause: if φ ∈ L, then Bi φ ∈ L.

The revised belief base language is defined in terms of the expressions of the extended base language. The belief base of an agent contains only a finite number of expressions from the respective language, possibly none. Note that, in the extended base language, we can have Bi Bj Bi φ where i, j ∈ Agents.

Definition 5. (extended belief base language) Let L be the extended base language. Let ψ, ψ1, ..., ψm ∈ L be ground (atomic) formulae, let φ, φ1, ..., φm ∈ L and let i ∈ Agents. The belief base language LBB is a set of formulae defined on the extended base language as follows:

– Bi ψ ∈ LBB
– ∀x1,...,xn Bi (φ1 ∧ ... ∧ φm → φ) ∈ LBB

where ∀x1,...,xn(ϕ) denotes the universal closure of the formula ϕ for every variable x1, ..., xn occurring in ϕ.

According to the above definition, formulae in the belief base language can have two forms: Bi ψ and ∀x1,...,xn Bi (φ1 ∧ ... ∧ φm → φ). We will call the first form the fact-form and the second the rule-form. Note that every formula in the belief base language LBB is preceded by a modality Bi, i.e. every expression in the belief base language is explicitly believed.

As one can see, the above definition of LBB is very similar to the one used to define the corresponding 3APL language. However, there are some notable differences. We have imposed the restriction that everything in the belief base of an agent i should be preceded by a belief modality (Bi) to make explicit that the agent believes whatever is in its belief base⁵. This as opposed to 3APL, where no modal operator is allowed (i.e. in 3APL everything in the belief base
⁵ Note that the argument of Bi is a belief formula that can contain modal operators Bi or Bj for other agents j.
is implicitly believed). The explicit instead of implicit use of the belief modality will turn out to be especially useful when we come to the semantics of the belief base language.

Now we will redefine the belief query language. In the following it is important to note the difference between B and Bi. The latter is used as a modal operator to indicate a belief of an agent i and can be applied to modal belief formulae in which Bi can occur again. The former is used to indicate that a formula is a belief formula and cannot be applied to a formula that contains the B operator. Note that another use of the bold 'B' is to prevent the distribution of negation over its argument, as explained in our section on 3APL.

Definition 6. (extended belief query language) Let L be the extended base language and let ψ be of the form Bi ϕ where ϕ ∈ L and i ∈ Agents. We shall call ψ a formula in the inner belief query language (LI). The extended belief query language LB, with typical formula β, is defined as follows:

– B(ψ), ¬B(ψ) ∈ Disjunction,
– ⊤ ∈ Disjunction,
– if δ, δ′ ∈ Disjunction, then δ ∨ δ′ ∈ Disjunction,
– if δ ∈ Disjunction, then δ ∈ LB,
– if β, β′ ∈ LB, then β ∧ β′ ∈ LB.
Note this definition is almost exactly the same as definition 2. The only difference lies in the arguments of the basic queries, which are of the form B(ϕ).
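For example (with invented predicates, for illustration only), an agent i could have the belief base {Bi on(a, table), ∀x Bi (on(x, table) → Bj clear(x))}; the basic query B(Bi Bj clear(a)) then asks whether agent i believes that agent j believes that a is clear.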
4.2 The Semantics of Modal Operators in 3APL
In section 2, we explained that during the execution of 3APL agents there are four cases where it should be tested whether a belief query formula is derivable from the agent's beliefs. These tests occur at the execution of test actions and mental actions, and at the application of goal planning rules and plan revision rules. In the original 3APL, the agent typically uses a belief query for this purpose, and this has not changed with the addition of modal operators and the extra restrictions on both the belief base and belief query language. One thing that has changed is that the reasoning mechanism must now take modal operators into account. This expands our belief base language (LBB) to a Horn clause fragment of first-order modal logic and enriches our belief query language (LB) with modal operators.

In order to define the derivability of queries from a belief base we will introduce a superset language Ls. Relying on the belief base language or the belief query language alone does not suffice, because the languages differ from each other. We embed both the belief base language and the inner belief query language (LI) as subsets of the superset language, to study the derivation relation in a standard modal logic setting. This embedding is illustrated in figure 1. This situation allows us to extend both the belief base and belief query language to form a larger fragment of the superset language if we so desire. An example of such an extension could be the inclusion of negation in various forms.
Fig. 1. The relation between the languages
The alphabet of the superset language consists of the same sets Var, Func and Pred as the base language (definition 1), combined with the classical logical connectives and quantifiers ¬, ∨, ∧, →, ∀ and ∃. The syntax of the superset language is defined as usual in first-order modal logic. As usual in first-order modal logic, we will use Kripke interpretations to define the semantics of this superset language. These interpretations make use of possible worlds. Each possible world has a first-order domain of quantification. For the sake of simplicity we will assume constant domains (i.e. the domains do not vary between possible worlds) and rigid designators (i.e. constants refer to the same domain element, regardless of the possible world). The semantics of the superset language is defined in terms of Kripke interpretations as follows:

Definition 7. (Kripke interpretations) Let M be a Kripke interpretation M = ⟨W, R, D, π⟩ where:

– W is a nonempty set of worlds
– R = {Ra1, ..., Ran} for a1, ..., an ∈ Agents, where each Ri ∈ R is a binary relation on W which is serial, transitive and Euclidean (the accessibility relation associated with Bi)
– D is a (nonempty) set of domain elements
– π is an interpretation of constant symbols, function symbols and predicate symbols such that
  • for each n-ary function symbol f of Ls (including constants of Ls), π(f) is a function from Dⁿ to D
  • for each n-ary predicate symbol p and each world w ∈ W, π(p, w) is an n-ary relation on D

A variable assignment V w.r.t. a Kripke interpretation M is a function that maps each variable to an element of the domain D of M. The interpretation of terms in the domain is defined as usual from the interpretation of constants and function symbols.

Definition 8. (Kripke semantics) Let ⊩ be a relation between worlds w ∈ W and closed formulae of Ls satisfying, for all w ∈ W, the following conditions:
- M, V, w ⊩ ⊤
- M, V, w ⊩ p(t1, ..., tn) iff (V(t1), ..., V(tn)) ∈ π(p, w)
- M, V, w ⊩ ¬φ iff M, V, w ⊮ φ
- M, V, w ⊩ φ ∧ ψ iff M, V, w ⊩ φ and M, V, w ⊩ ψ
- M, V, w ⊩ φ → ψ iff M, V, w ⊮ φ or M, V, w ⊩ ψ
- M, V, w ⊩ ∀xφ iff for each c ∈ D, M, V, w ⊩ φ{x/c}
- M, V, w ⊩ ∃xφ iff for some c ∈ D, M, V, w ⊩ φ{x/c}
- M, V, w ⊩ Bi φ iff for all w′ ∈ W such that (w, w′) ∈ Ri, M, V, w′ ⊩ φ
A closed formula (i.e. a formula where all variables are bound by a quantifier) φ of the language Ls is satisfiable if there is a Kripke model M = ⟨W, R, D, π⟩ and a world w ∈ W for some variable assignment V, such that M, V, w ⊩ φ. φ is a valid formula (⊨ φ) if ¬φ is not satisfiable. We will lift the above definition of satisfiability to sets of formulae: M, V, w ⊩ Φ iff M, V, w ⊩ φ for all φ ∈ Φ. We will define modal logical entailment between sets of formulae, Φ ⊨Ls Ψ, as: M, V, w ⊩ Φ implies M, V, w ⊩ Ψ. We can use this definition of entailment to establish whether a set of formulae in Ls follows from a second set of formulae in Ls. This of course also holds for formulae in clausal fragments of Ls such as the belief base language.

We can now give a definition of the semantics of belief queries based on [1]. We will do so by defining what it means for belief queries to be satisfied by an agent configuration. It is important to stress that this entailment relation (denoted by ⊨τ) is an entirely different relation than the one defining modal logical entailment (denoted by ⊩) from definition 8. In particular, we will use the entailment relation defined for modal logic (⊩) to define the entailment relation for belief queries (⊨τ).

Definition 9. (semantics of belief queries) Let ⟨ι, σ, γ, Π, θ⟩ be the agent configuration of agent ι, δ, δ′ ∈ Disjunction and Bφ, β, β′ ∈ LB. Let τ, τ1, τ2 be ground substitutions. The semantics of belief queries is defined as follows:

⟨ι, σ, γ, Π, θ⟩ ⊨∅ ⊤
⟨ι, σ, γ, Π, θ⟩ ⊨τ Bφ ⇔ σ ⊨Ls φτ, where Varf(φ) ⊆ dom(τ)
⟨ι, σ, γ, Π, θ⟩ ⊨∅ ¬Bφ ⇔ ¬∃τ : ⟨ι, σ, γ, Π, θ⟩ ⊨τ Bφ
⟨ι, σ, γ, Π, θ⟩ ⊨τ δ ∨ δ′ ⇔ ⟨ι, σ, γ, Π, θ⟩ ⊨τ δ or (∀τ1 : ⟨ι, σ, γ, Π, θ⟩ ⊭τ1 δ and ⟨ι, σ, γ, Π, θ⟩ ⊨τ δ′)
⟨ι, σ, γ, Π, θ⟩ ⊨τ β ∧ β′ ⇔ ⟨ι, σ, γ, Π, θ⟩ ⊨τ β and ⟨ι, σ, γ, Π, θ⟩ ⊨τ β′
Here the entailment relation on the right-hand side of the second clause is defined in definition 8. Again, except for the use of an entailment relation for first-order modal logic, this is quite like the semantics specified in the original 3APL [1].
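As a brief illustration (continuing the invented two-agent example given after definition 6): if σ contains Bi on(a, table) and ∀x Bi (on(x, table) → Bj clear(x)), then ⟨ι, σ, γ, Π, θ⟩ ⊨∅ B(Bi Bj clear(a)) holds, since the inner formula is entailed by σ, while ⟨ι, σ, γ, Π, θ⟩ ⊨∅ ¬B(Bi on(b, table)) holds by the negation-as-failure clause, since no substitution makes the inner query derivable.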
4.3 A Goal-Directed Proof Procedure
In this section we will give a goal-directed proof procedure. This procedure will allow us to compute whether a belief query is entailed by a belief base. The
procedure is based on NemoLOG [3], proposed by Baldoni, adapted to the needs and specifications of 3APL. We will first introduce the concept of a modal context, which will form the basis of our matching⁶ relation (⇒*) and thus of the whole proof procedure.

Definition 10. (Modal Context) Let M1, ..., Mn ∈ {B1, ..., Bm} where 1, ..., m ∈ Agents. We will then define a modal context MC as a finite sequence of modal operators, denoted by M1...Mn. We will denote an empty modal context with ε.

A modal context records the modalities in front of goals during the execution of our goal-directed proof procedure. We will denote the set of all possible modal contexts by MC*. Intuitively, a modal context can be seen as a name for a set of possible worlds. We say that we evaluate a formula in a certain modal context MC when we evaluate it in each possible world named by that modal context instead of in the actual world. In our goal-directed proof procedure, we will use the modal context to recognize syntactically whether a Program Clause is applicable to the Goal or not. From this viewpoint, the modal context denotes in which worlds we are evaluating the Goal. The proof procedure itself will, when appropriate, update the modal context to reflect the correct worlds.

To avoid problems with variable renaming and substitutions we will denote by [P] the set of all ground instances of clauses from a belief base P. Also, each formula F ∈ L which is in [P] will be transformed into a formula of the form ⊤ → F. We do this to make all semantical information in [P] available in a uniform way, i.e. in the form of implications, the rule-form. This will be useful when we get to the inductive definition of the goal-directed proof procedure described in definition 13.

Definition 11. (Program Clauses) Let P be a set of clauses ∈ LBB and Γ ∈ MC*. Define [P] (the Program Clauses) to be the set of formulae obtained by transforming all clauses in P using the following rules:

a) if C ∈ P and C ∈ L, then ⊤ → C ∈ [P];
b) if ∀x1,...,xn ΓC ∈ P and ∀x1,...,xn ΓC ∉ L, then ΓC{t/x} ∈ [P] for each x ∈ {x1, ..., xn} and for all ground terms t that can occur in P.

Let G, which we shall call a Goal, be either ⊤ or of the form φ1 ∧ ... ∧ φn, where φ1, ..., φn ∈ L. We will call [P], where P is the contents of the belief base, the Program. By definition 8 it can easily be seen that in [P] every bit of semantical information in the belief base P is present in clauses of the general form Γb(G → Γh A), where Γb is ε or a single belief modality, Γh is an arbitrary, possibly empty, sequence of belief modalities and A ∈ L. This is important because the inference rules of our goal-directed proof procedure can only make use of Program Clauses written in this general form.

In the following, we will give a proof procedure that applies formulae in the Program [P] to prove a Goal G in a modal context MC. The applicability of a
⁶ As mentioned in the previous section, Baldoni often uses the word 'derivation'.
formula F ∈ [P] depends on the modalities in F and on the modal context: the sequences of modalities in F must match the sequence of modal operators that is MC. We will now give the definition of this matching relation.

Definition 12. (matching relation) For 1, ..., n ∈ Agents, let M be the set of modal operators {B1, ..., Bn}. We define two auxiliary relations. Let the relation ↪ be defined as ↪ = {(Mi, Mi Mi) | Mi ∈ M}. Let the relation ⇒ be defined as ⇒ = {(Γ1 Γ Γ2, Γ1 Γ′ Γ2) | Γ, Γ′, Γ1, Γ2 ∈ MC* & Γ ↪ Γ′}⁷. Now we can define our matching relation (⇒*) as the equivalence closure of ⇒ over the space MC*.

Now we will give a set of inference rules. These rules constitute our goal-directed proof procedure.

Definition 13. (goal-directed proof procedure) Let M be a single modal operator. Let A ∈ L be without modal operators. We define the procedure to find a proof for a closed Goal G from a modal context MC = Γ and a Program [P] by induction on the structure of G as follows:

1. [P], Γ ⊢ ⊤;
2. [P], Γ ⊢ A if there is a clause Γb(G → Γh A) ∈ [P] and a modal context Γb′ such that (1) [P], Γb′ ⊢ G, (2) Γb ⇒* Γb′, and (3) Γb′ Γh ⇒* Γ;
3. [P], Γ ⊢ G1 ∧ G2 if [P], Γ ⊢ G1 and [P], Γ ⊢ G2;
4. [P], Γ ⊢ ∃xG if for some ground term t, [P], Γ ⊢ {x/t}G;
5. [P], Γ ⊢ M G if [P], Γ M ⊢ G.

Given a Program [P] and a closed Goal G, we say that G is provable from [P] if [P], ε ⊢ G can be derived by applying the above rules 1–5. Using goal-resolution and backwards chaining, this proof procedure is able to prove modal goals in the same way Prolog is able to prove non-modal goals. The implementation of the proof procedure builds on the current 3APL implementation and is a matter of writing a simple Prolog module to handle the modalities.

The above definitions (11–13) can together be seen as a proof method for the KD45n quantified modal logic that we want our agents to use. The working of the proof procedure is very simple: the rules 1, 3, and 4 deal with query formulae that are always true, conjunctions, or existentially quantified, respectively. If a goal is prefixed by a modality, then rule 5 takes the modality away from the goal while updating the modal context. Using definition 11 we ensure that goals cannot be of a form other than the ones dealt with above, so this suffices. Finally, rule 2 allows for the application of a Program Clause (rule) and specifies the resulting Goal and modal context. Rule 2 makes use of the special matching relation ⇒* as defined in definition 12. The matching relation is intended as a means to
⁷ We denote by Γ1 Γ2 the concatenation of the modal contexts Γ1 and Γ2.
establish whether a certain sequence of modal operators can be used to access a formula involving (or, if you will, guarded by) another sequence of modal operators. Our matching relation is a specialized version of the derivation relation defined by Baldoni [3], which is based on inclusion axioms⁸. That derivation relation ⇒* is defined as the transitive and reflexive closure of Γ Γ1 Γ′ ⇒ Γ Γ2 Γ′ for each inclusion axiom Γ1 → Γ2. Here Γ, Γ1, Γ2 and Γ′ are all sequences of modalities. An example of an inclusion axiom is □i → □i □i, describing transitivity. A derivation relation based on this axiom makes □2 □1 □2 ⇒* □2 □1 □1 □2 valid. Other inclusion axioms will introduce other properties. Our own derivation relation (definition 12) gives our programs transitivity, seriality and euclidicity. However, this can easily be adapted to other logic systems following [3].
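To make the procedure concrete, the following Prolog sketch is our own illustration, not the actual 3APL or NemoLOG implementation; all predicate names are invented. It implements rules 1, 3 and 5 of definition 13 directly, and a simplified version of rule 2 in which the context Γb′ is instantiated as Γb itself, which suffices for the example below but is weaker than the full definition. The matching relation of definition 12 is implemented by collapsing adjacent repetitions of the same modality: two contexts are related by ⇒* exactly when their collapsed forms coincide. (Rule 4 is omitted, since goals are assumed ground here.)

    % A Program Clause Gb(G -> Gh A) of [P] is stored as clause(Gb, G, Gh, A);
    % modal contexts are lists of modalities b(I).
    prove(_P, _Ctx, true).                      % rule 1
    prove(P, Ctx, and(G1, G2)) :-               % rule 3
        prove(P, Ctx, G1),
        prove(P, Ctx, G2).
    prove(P, Ctx, b(I, G)) :-                   % rule 5: push the modality
        append(Ctx, [b(I)], Ctx1),              % onto the modal context
        prove(P, Ctx1, G).
    prove(P, Ctx, A) :-                         % rule 2 (simplified: Gb' = Gb)
        member(clause(Gb, G, Gh, A), P),
        append(Gb, Gh, Guard),
        matches(Guard, Ctx),                    % Gb Gh =*=> Ctx
        prove(P, Gb, G).

    % Two contexts match iff collapsing adjacent duplicates of a modality
    % yields the same sequence (the equivalence closure of Mi => Mi Mi).
    matches(C1, C2) :- collapse(C1, C), collapse(C2, C).
    collapse([], []).
    collapse([M, M | T], Out) :- collapse([M | T], Out).
    collapse([M | T], [M | Out]) :- \+ T = [M | _], collapse(T, Out).

For instance, storing the fact-form clause B1 likes(m, pizza) as clause([], true, [b(1)], likes(m, pizza)), the query prove(P, [], b(1, b(1, likes(m, pizza)))) succeeds: rule 5 builds the context [b(1), b(1)], which matches the guard [b(1)] of the clause, reflecting positive introspection.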
4.4 Soundness of the Proof-Procedure
The proof procedure given in the previous section allows us to derive a belief query from an agent configuration. We first transform the belief base of the agent configuration into Program Clauses and the belief query into a query usable by the goal-directed proof procedure defined in definition 13. Given this goal-directed proof procedure and given the Kripke semantics for the superset language Ls (given in definition 8), we show that the proof procedure is sound. In order to prove soundness, we must prove that every belief query proven by our procedure (definition 13) is valid according to the semantics of the superset language Ls. Formally: [P] ⊢ β ⇒ P ⊨Ls β. With our soundness proof we deviate from the approach of Baldoni, who gives a correctness proof based on fixpoint semantics relating to both the operational semantics and the Kripke semantics [3]. We instead give a soundness proof by directly relating the goal-directed proof procedure to the Kripke semantics of our superset language.

The matching relation ⇒* plays an important role in the proof procedure; it therefore makes sense to establish some important properties of this relation. First, because the matching relation determines the properties of the modal logic of the system, we will prove that these properties are the desired ones. As a consequence we prove that the validity of modal logical formulae is preserved within matching modal contexts. This allows us to prove the soundness of our proof procedure.

The matching relation imposes certain restrictions on the relations between modal contexts, and thus on Bi. We will prove that these restrictions correspond to those described in definition 7 (seriality, transitivity and euclidicity) by constructing a Kripke frame over MC* (the set of all modal contexts) and the relations Ri. We will then apply the restrictions imposed by the matching relation and prove that the frame is a KD45n-frame, which means that seriality, transitivity and euclidicity apply. A KD45n-frame is a non-empty set of worlds with accessibility relations between those worlds that are serial, transitive and euclidean, with respect to
⁸ Inclusion axioms can be used to describe modal logic properties. For example, □i → □i □i describes transitivity.
multiple agents [10]. Our Kripke interpretation of definition 7 is based on a KD45n Kripke frame (the worlds W and relations R of the interpretation). What we prove is that, if one imposes restrictions on the relations in accordance with the matching relation, then the desired transitivity, seriality and euclidicity properties of the relations Ri ∈ R hold.
Lemma 1. (Semantics of the matching relation) The matching relation ⇒* over MC* defines a KD45n-frame.

Proof. Let F be a frame ⟨W, R⟩ where W is an infinite set of worlds such that W = {wΓ | Γ ∈ MC*} and R is a set of relations {R1, ..., Rn} for 1, ..., n ∈ Agents, where the relation Ri consists of the pairs (wΓ, wΓ′) with Γ′ = Γ Mi. Here Γ Mi is the modal context obtained by the concatenation of Γ and the single modality Mi. Let F′ be F modulo ⇒*, i.e. if Γ ⇒* Γ′ then wΓ ∈ W and wΓ′ ∈ W are equivalent. We will prove that the accessibility relations Ri ∈ R are all serial, transitive and euclidean, and that therefore F′ is a KD45n-frame.

Seriality. Every Γ ∈ MC* corresponds to a world wΓ ∈ W. Furthermore, if Γ ∈ MC* then, by the infinite nature of MC*, Γ Mi ∈ MC*. Therefore there always is a world wΓ′ ∈ W that corresponds with Γ Mi. Since Ri consists of the pairs (wΓ, wΓ′), we have for every wΓ ∈ W that (wΓ, wΓ′) ∈ Ri.

Transitivity. Let wΓ, wΓ′ and wΓ′′ correspond to the modal contexts Γ, Γ Mi and Γ Mi Mi respectively. Since Γ Mi ⇒* Γ Mi Mi and because F′ is F modulo ⇒*, we have wΓ′ = wΓ′′. Moreover, since (wΓ, wΓ′) ∈ Ri we also have (wΓ, wΓ′′) ∈ Ri.

Euclidicity. If (wΓ, wΓ′) ∈ Ri and (wΓ, wΓ′′) ∈ Ri then, if wΓ corresponds with the modal context Γ, both wΓ′ and wΓ′′ correspond with the modal context Γ Mi. Because the relation Ri is also serial, there must be a pair (wΓ′, wΓ′′′) ∈ Ri such that wΓ′′′ corresponds with the modal context Γ Mi Mi. Because Γ Mi Mi ⇒* Γ Mi, we have wΓ′′′ = wΓ′′ and thus (wΓ′, wΓ′′) ∈ Ri.

Lemma 2. If Γϕ is valid (in the logic of the superset language, definition 8) and Γ ⇒* Γ′, then Γ′ϕ is also valid.

Proof. This is a direct consequence of the previous lemma (1), saying that the matching relation ⇒* defines a KD45n-frame over modal contexts.

Theorem 1. (soundness: [P] ⊢ β ⇒ P ⊨Ls β) Every belief query made true by the goal-directed proof procedure (⊢) is also valid in the Kripke semantics (⊨).

Proof. We have to prove that [P] ⊢ β ⇒ P ⊨Ls β. If the goal-directed proof procedure stops, the result is a proof tree, where every node is of the form [P], Γ ⊢ β, the root is of the form [P], ε ⊢ G, and all leaves have the form [P], Γ ⊢ ⊤. With every node [P], Γ ⊢ β of the tree, we associate an Ls formula of the form P → Γβ. Then we prove that the validity of the associated formulae P → Γβ is preserved when going up (or 'down', depending on whether you really want to picture the tree as a tree) from the leaves of the tree to the
root. The associated formulae for the leaves are all of the form P → Γ⊤, which are trivially valid in the superset logic (definition 8). Then by induction over the proof tree structure, we get that P → β is valid (ε is the empty modal context), which (for all standard notions of entailment) results in P ⊨Ls β. To complete the proof, we now prove the preservation property for each individual rule in the goal-directed proof procedure.

Rule 1: Trivial.

Rule 2: We have to prove that under the conditions (1) Γb(G → Γh A) ∈ [P], (2) Γb ⇒* Γb′, and (3) Γb′ Γh ⇒* Γ, validity of P → Γb′ G implies validity of P → Γ A. If Γb(G → Γh A) is not valid, the implication holds trivially, so we only have to consider the case where Γb(G → Γh A) is valid. Thus, we have to prove that under conditions 2 and 3, the validity of P → Γb′ G and Γb(G → Γh A) implies the validity of P → Γ A. For this we make use of the previous lemma (2). With this lemma together with condition 2, we conclude that from the validity of Γb(G → Γh A) we may conclude the validity of Γb′(G → Γh A). From the fact that we deal with normal modal reasoning, we may conclude the validity of Γb′ G → Γb′ Γh A (applying the K-property of normal modal logic). From this, together with the validity of P → Γb′ G, we may conclude the validity of P → Γb′ Γh A. Applying the previous lemma (2) one more time, now with condition 3, gives us the desired P → Γ A.

Rule 3: We have to prove that if P → Γ G1 and P → Γ G2 are valid, then P → Γ(G1 ∧ G2) is also valid. This follows directly from the fact that we deal with normal modal logic (i.e. Kripke structures), for which □ϕ ∧ □ψ → □(ϕ ∧ ψ).

Rule 4: This rule replaces a ground term by an existentially quantified variable. From the semantics of existential quantification, it immediately follows that this preserves validity.

Rule 5: Trivial.
5 Conclusion and Further Research
We have extended the agent programming language 3APL, giving agents a means to explicitly reason about their own beliefs and the beliefs of other agents. We have done so by adding modal operators of belief to the syntax of the belief base and belief query languages of 3APL. The corresponding semantics have been adjusted to correspond with a KD45n-type modal logic, often used to represent beliefs, governing the behavior of these modal operators. In the final section we have given a method for checking the derivability of a belief query from a belief base, providing the functionality we sought. This method is proven to be sound.

The next step will be to implement a working version of this method, and to test it with communication. This implementation can be built upon the existing 3APL implementation. This would only require programming a Prolog interpreter that can work with the modalities involved. Another interesting possibility is the addition of negation to the belief base and belief query languages. This may dramatically increase the expressive power of 3APL. With the right restrictions, we believe that negation can be introduced
introduced without problems; however, this is left for further research. Finally, we plan to show that our proof method is complete with respect to possible belief queries and belief base contents. This can be done by induction on the form of the belief queries.
Modelling Uncertainty in Agent Programming

Johan Kwisthout and Mehdi Dastani

ICS, Utrecht University
{johank, mehdi}@cs.uu.nl
Abstract. Existing cognitive agent programming languages that are based on the BDI model employ logical representation and reasoning for implementing the beliefs of agents. In these programming languages, beliefs are assumed to be certain, i.e. an implemented agent either believes a proposition or does not. These programming languages fail to capture the underlying uncertainty of the agent's beliefs, which is essential for many real-world agent applications. We introduce Dempster-Shafer theory as a convenient method to model uncertainty in an agent's beliefs. We show that the computational complexity of Dempster's Rule of Combination can be controlled. In particular, the certainty value of a proposition can be deduced in linear time from the beliefs of agents, without having to calculate the combination of Dempster-Shafer mass functions.
1 Introduction
In multi-agent systems, individual agents are assumed to be situated in some environment and are capable of autonomous actions in that environment in order to achieve their objectives [20]. An autonomous agent interacts with its environment based on its information and objectives, both of which are updated with the information acquired through interaction. In order to develop multi-agent systems, many programming languages have been proposed to implement individual agents, their environments, and interactions [5, 10, 4, 3]. These languages provide programming constructs that enable the implementation of agents that can reason about their information and objectives and update them according to their interactions. Unfortunately, although Rao and Georgeff [12] already use belief operators - rather than knowledge operators - due to the agent's lack of knowledge about the state of the world, many of the proposed programming languages assume that the information and objectives of agents are certain. This is obviously an unrealistic assumption for many real-world applications. In such applications, either the environment of the agents involves uncertainty or uncertainty is introduced to agents through imperfect sensory information. Past research dealing with the application of existing programming languages such as 3APL to robot control [17] showed that sensory input is not always accurate, and that external actions have unpredictable outcomes: the environment in which the agent operates is
The work of this author was partially supported by the Netherlands Organisation for Scientific Research NWO.
both inaccessible and indeterministic. This seriously devalues the practical use of such agent programming languages for real applications like mobile robot control. Therefore, we believe that individual agents need to be able to reason and update their states with uncertain information, and that agent-oriented programming languages should facilitate these functionalities. In this paper, we focus on cognitive agents, which can be described and implemented in terms of cognitive concepts such as beliefs and goals. We consider programming languages that provide programming constructs to implement agents' beliefs, to reason about beliefs, and to update beliefs. In order to allow the implementation of cognitive agents that can work with uncertain information, we investigate the possible use of Dempster-Shafer theory to incorporate uncertainty in BDI-type agent programming languages. We discuss how uncertain beliefs can be represented and reasoned with, and how they can be updated with uncertain information. The structure of this paper is as follows. First we introduce the most important concepts of Dempster-Shafer theory. In Section 3 we propose a mapping between this theory and agent beliefs. In Section 4 we deal with implementation issues, and show that the computational complexity can be controlled given certain restrictions on the belief representation. In Section 5 we show how the agent's belief base can be updated and queried, while the computational complexity of these operations is discussed in Section 6. Finally, in Section 7 we conclude the paper.
2 Dempster-Shafer Theory
The concept of uncertainty is closely related to probability theory. We differentiate between the notions of chance and probability: a chance represents an objective, statistical likelihood of an event (such as throwing a six with a die), while a probability represents the likelihood of an event given certain subjective knowledge (for example, the probability of a six, given that we know the number is even). Probabilistic reasoning deals with the question of how evidence influences our belief in a certain hypothesis H. We define the probability of H, denoted P(H), as a real number between 0 and 1, with P(H) = 0 meaning H is definitely false, and P(H) = 1 meaning H is definitely true. A value between 0 and 1 is a measure of the probability of H. The theory of Dempster and Shafer [14] can be seen as a generalisation of probability theory. In this theory, a frame of discernment Ω is defined as the set of all hypotheses in a certain domain. On the power set 2^Ω, a mass function m(X) is defined for every X ⊆ Ω, with m(X) ≥ 0 and Σ_{X⊆Ω} m(X) = 1. If there is no information available with respect to Ω, then m(Ω) = 1 and m(X) = 0 for every proper subset of Ω. For example, in a murder case Ω is a list of suspects, {Peter, John, Paul, Mary, Cindy}. If the investigator has no further information, the mass function associated with Ω will assign 1 to Ω and 0 to all proper subsets of Ω. If evidence is found regarding certain subsets of Ω, for example a slightly unreliable witness claims the killer was probably a male, we assign an
appropriate mass value (say 0.6) to this particular subset of Ω and - since we have no further information and Σ_{X⊆Ω} m(X) = 1 by definition - we assign a mass value of 0.4 to Ω. The mass function in this case would be:

    m1(X) = 0.6  if X = {Peter, John, Paul}
            0.4  if X = Ω
            0    otherwise

Note that no value whatsoever is assigned to subsets of {Peter, John, Paul}. If we receive further evidence, for example that the killer was most likely (say with a probability of 0.9) left-handed, and both John and Mary are left-handed, then we might have another mass function like:

    m2(X) = 0.9  if X = {John, Mary}
            0.1  if X = Ω
            0    otherwise

Dempster's Rule of Combination is a method to combine both pieces of evidence into one combined mass function m1 ⊕ m2, which is defined as follows:

Definition 1 (Dempster's Rule of Combination [14]). Let X, Y, Z ⊆ Ω. Then the following holds:

    m1 ⊕ m2(X) = ( Σ_{Y∩Z=X} m1(Y) · m2(Z) ) / ( Σ_{Y∩Z≠∅} m1(Y) · m2(Z) )   and   m1 ⊕ m2(∅) = 0

Dempster's Rule of Combination is commutative and associative, as shown in [13]. In our example, combining both pieces of evidence would lead to the following mass function:

    m1 ⊕ m2(X) = 0.06  if X = {Peter, John, Paul}
                 0.36  if X = {John, Mary}
                 0.54  if X = {John}
                 0.04  if X = Ω
                 0     otherwise

Given a certain mass function, the subsets of Ω that have a mass value greater than zero are called focal elements, and we will denote the set of focal elements of a given mass function ϕ as the core of that mass function. A simple support function is a special case of a mass function, where the evidence only supports a certain subset A of Ω, and zero mass is assigned to all subsets of Ω other than A and Ω, i.e., the core of a simple support function is {A, Ω}:

Definition 2 (simple support function [14]). Let X ⊆ Ω and let A be supported by evidence with probability s. Then the simple support function related to A is specified as follows:
    m(X) = s      if X = A
           1 − s  if X = Ω
           0      otherwise

On a mass function, two other functions are defined, namely a belief function Bel(X) and a plausibility function Pl(X).

Definition 3 (belief and plausibility function [14]). Let X, Y ⊆ Ω. Then the belief and plausibility functions can be defined in terms of a certain mass function m as follows:

    Bel(X) = Σ_{Y⊆X} m(Y)   and   Pl(X) = Σ_{X∩Y≠∅} m(Y)

Informally, the belief and plausibility functions can be seen as a lower and an upper limit, respectively, on the probability of the set of hypotheses X. Note that Pl(X) = 1 − Bel(Ω\X). The difference between Bel(X) and Pl(X) can be regarded as the ignorance with respect to X.
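To make these definitions concrete, the following is a minimal Python sketch of mass functions, Dempster's Rule of Combination, and the Bel and Pl functions. It is our own illustration of the definitions above, not part of the paper's implementation; hypotheses are represented as frozensets, and all names are ours.

    from itertools import product

    # A mass function is a dict mapping frozensets of hypotheses
    # (the focal elements) to their mass; masses sum to 1.

    def combine(m1, m2):
        """Dempster's Rule of Combination (Definition 1).
        Assumes the two pieces of evidence are not totally conflicting."""
        raw, conflict = {}, 0.0
        for (y, a), (z, b) in product(m1.items(), m2.items()):
            x = y & z
            if x:                                # non-empty intersection
                raw[x] = raw.get(x, 0.0) + a * b
            else:                                # mass lost to the empty set
                conflict += a * b
        k = 1.0 - conflict                       # denominator of Definition 1
        return {x: v / k for x, v in raw.items()}

    def bel(m, x):
        """Bel(X): total mass of focal elements contained in X."""
        return sum(v for y, v in m.items() if y <= x)

    def pl(m, x):
        """Pl(X): total mass of focal elements intersecting X."""
        return sum(v for y, v in m.items() if y & x)

    # The murder-case example from this section:
    omega = frozenset({"Peter", "John", "Paul", "Mary", "Cindy"})
    m1 = {frozenset({"Peter", "John", "Paul"}): 0.6, omega: 0.4}
    m2 = {frozenset({"John", "Mary"}): 0.9, omega: 0.1}
    m12 = combine(m1, m2)
    assert abs(m12[frozenset({"John"})] - 0.54) < 1e-9

Running the sketch reproduces the combined mass function above: 0.54 for {John}, 0.36 for {John, Mary}, 0.06 for {Peter, John, Paul}, and 0.04 for Ω.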
3 Mapping Agent Beliefs to Dempster-Shafer Sets
Can the theory of Dempster and Shafer be applied to the beliefs of an agent, if they are represented by propositional-logical formulae in an agent programming language? To investigate this question, suppose we have an agent-based program that operates in the context of a 2-by-2 grid-world where bombs can appear in certain positions in the grid and an agent can partially perceive the environment and move around. The agent tries to sense the bombs surrounding him, thus locating all bombs and safe squares in his environment. We assume that the agent's belief base is a set of logical formulae that the agent believes to hold. This belief base can therefore be understood as the conjunction of all formulae from the set, i.e. it can be represented as a conjunctive formula. Assume that, at a given moment during the execution of the program, the agent has the formula safe(1) in its belief base (say BB1). This indicates that the agent believes that square 1 is a safe location. How can we relate this belief to Dempster-Shafer theory? The frame of discernment Ω can be understood as the set of all models of the grid-world, as shown in Table 1. In a 2-by-2 grid-world there are 16 models, ranging from 'all squares are safe' to 'all squares contain bombs'. We can relate the agent's current beliefs to a subset of hypotheses from Ω, where each hypothesis is considered as a model of that belief. For example, if we define the hypotheses as in Table 1, then the belief formula safe(1) is a representation of the set {H1, H2, H3, H4, H5, H6, H7, H8} of hypotheses, which is exactly the set of all models of the belief base BB1. If we define a mass function m_safe(1) according to this belief base, we would assign 1 to this set, and 0 to Ω (and to all other subsets of Ω). In fact, each belief base
Table 1. Bomb location and associated hypothesis

    Hyp.  Sq.1  Sq.2  Sq.3  Sq.4        Hyp.  Sq.1  Sq.2  Sq.3  Sq.4
    1     Safe  Safe  Safe  Safe        9     Bomb  Safe  Safe  Safe
    2     Safe  Safe  Safe  Bomb        10    Bomb  Safe  Safe  Bomb
    3     Safe  Safe  Bomb  Safe        11    Bomb  Safe  Bomb  Safe
    4     Safe  Safe  Bomb  Bomb        12    Bomb  Safe  Bomb  Bomb
    5     Safe  Bomb  Safe  Safe        13    Bomb  Bomb  Safe  Safe
    6     Safe  Bomb  Safe  Bomb        14    Bomb  Bomb  Safe  Bomb
    7     Safe  Bomb  Bomb  Safe        15    Bomb  Bomb  Bomb  Safe
    8     Safe  Bomb  Bomb  Bomb        16    Bomb  Bomb  Bomb  Bomb
can be represented by a mass function. Such a mass function would assign 1 to the particular subset of Ω that contains all hypotheses that are true with respect to the belief base, or in other words: the maximal subset of hypotheses in Ω that are models of safe(1). Notice that the set {H1, H2, H3, H4, H5, H6} consists of models of safe(1) as well, but it is not the maximal subset with this property. If a belief base is a certain belief formula ϕ, then it can be represented by a simple support function mϕ(X) that supports only the maximal set of hypotheses in Ω that are models of ϕ. This can be formalised as follows:

    mϕ(X) = 1  if X ⊆ Ω & models(X, ϕ) & ∀Y ⊆ Ω (models(Y, ϕ) ⇒ Y ⊆ X)
            0  otherwise

In this definition, the relation models(X, ϕ) is defined as ∀M ∈ X: M |= ϕ, where M is a model and |= is the propositional satisfaction relation. The condition of the if-clause states that X is the maximal set of hypotheses in Ω that are models of ϕ. In the sequel we use the function maxΩ(ϕ) to denote this set, and define it as follows: maxΩ(ϕ) = X ⟺ X ⊆ Ω & models(X, ϕ) & ∀Y ⊆ Ω (models(Y, ϕ) ⇒ Y ⊆ X). Using this auxiliary function, the mass function that represents the belief base BB1 can be rewritten as:

    m_safe(1)(X) = 1  if X = maxΩ(safe(1))
                   0  otherwise

3.1 Adding Beliefs
If we add another belief formula to the belief base, the resulting belief base can be represented by the combination of the mass functions of both belief formulae. Suppose we add safe(2) to the belief base, with the following mass function:

    m_safe(2)(X) = 1  if X = maxΩ(safe(2))
                   0  otherwise

We can combine both pieces of evidence using Dempster's Rule of Combination. Since the only non-empty intersection of the sets defined by either m_safe(1)
or m_safe(2) is the set maxΩ(safe(1) ∧ safe(2)), the resulting mass function m1 = m_safe(1) ⊕ m_safe(2) is defined as follows¹:

    m1(X) = 1  if X = maxΩ(safe(1) ∧ safe(2))
            0  otherwise

Note that maxΩ(safe(1) ∧ safe(2)) corresponds to the subset {H1, H2, H3, H4} of hypotheses. Apart from these beliefs, which can be either true or false, we can imagine a situation where a belief is uncertain. We might conclude, on the basis of specific evidence, that a location probably contains a bomb; such a belief formula (say bomb(3) with a probability value of 0.7) should be added to the belief base. In order to incorporate such cases, we introduce the concept of a basic belief formula to represent uncertain belief formulae.

Definition 4 (Basic belief formula). Let ϕ be a belief formula and p ∈ [0..1]. Then the pair ϕ : p, which indicates that ϕ holds with probability p, will be called a basic belief formula².

With these basic belief formulae, the above-mentioned belief base can be represented as { safe(1): 1.0, safe(2): 1.0, bomb(3): 0.7 }. Of course, we could represent bomb(3): 0.7 as a mass function, as we did with the beliefs safe(1) and safe(2). This mass function would assign a probability value of 0.7 to the set of hypotheses that all have a bomb on location 3, and (because we have no further information) a probability value of 0.3 to Ω:

    m_bomb(3)(X) = 0.7  if X = maxΩ(bomb(3))
                   0.3  if X = Ω
                   0    otherwise

If we combine m1 and m_bomb(3) using Dempster's Rule of Combination, we get the following mass function:

    m2 = m1 ⊕ m_bomb(3)(X) = 0.7  if X = maxΩ(safe(1) ∧ safe(2) ∧ bomb(3))
                             0.3  if X = maxΩ(safe(1) ∧ safe(2))
                             0    otherwise

This combined mass function m2 represents our updated belief base. Note that the set maxΩ(safe(1) ∧ safe(2)) is exactly {H1, H2, H3, H4}, and the set maxΩ(safe(1) ∧ safe(2) ∧ bomb(3)) is equal to {H3, H4}.
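As an illustration, the following sketch (our own, reusing combine from the earlier sketch) enumerates the sixteen hypotheses of Table 1 and computes maxΩ(ϕ) by model checking; the representation and helper names are ours, not the paper's.

    from itertools import product

    # Hypotheses H1..H16 of Table 1: all assignments of safe/bomb to squares 1..4.
    OMEGA = [dict(zip((1, 2, 3, 4), combo))
             for combo in product(("safe", "bomb"), repeat=4)]
    EVERYTHING = frozenset(range(1, 17))        # hypothesis numbers

    def max_omega(phi):
        """maxΩ(φ): the maximal set of hypotheses that are models of φ.
        φ is given as a Python predicate over a hypothesis."""
        return frozenset(i + 1 for i, h in enumerate(OMEGA) if phi(h))

    def support(phi, p):
        """Simple support function for the basic belief formula φ : p."""
        m = {max_omega(phi): p}
        if p < 1.0:
            m[EVERYTHING] = m.get(EVERYTHING, 0.0) + (1.0 - p)
        return m

    safe = lambda sq: (lambda h: h[sq] == "safe")
    bomb = lambda sq: (lambda h: h[sq] == "bomb")

    m1 = combine(support(safe(1), 1.0), support(safe(2), 1.0))
    m2 = combine(m1, support(bomb(3), 0.7))
    # m2 assigns 0.7 to {H3, H4} and 0.3 to {H1, H2, H3, H4}, as in the text.
    assert m2[frozenset({3, 4})] == 0.7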
3.2 Deleting Beliefs
We can also delete belief formulae from our belief base. For example, we could conclude ¬bomb(3) during the execution of our program. We will model deletion of a formula as the addition of its negation. This corresponds to the maximal set of hypotheses according to which there is no bomb on location 3:

    m_¬bomb(3)(X) = 1  if X = maxΩ(¬bomb(3))
                    0  otherwise

¹ We will use simple indices for the combined mass functions to improve readability.
² The term basic belief formula should not be confused with an atomic belief formula. Note that the probability assigned to a basic belief formula cannot be (further) distributed to the atomic formulae that may constitute the basic belief formula.
Combining m_safe(1) and m_¬bomb(3) leads to the following mass function:

    m3 = m_safe(1) ⊕ m_¬bomb(3)(X) = 1  if X = maxΩ(safe(1) ∧ ¬bomb(3))
                                     0  otherwise

Of course, we could also conclude that a certain belief becomes less probable instead of impossible. In that case, the negation of the formula under consideration is added with a certainty value, for example ¬bomb(3): 0.3. We would represent this formula as:

    m_¬bomb(3)(X) = 0.3  if X = maxΩ(¬bomb(3))
                    0.7  if X = Ω
                    0    otherwise

Combining this alternative mass function m_¬bomb(3) with m_safe(1) leads to the following mass function:

    m4 = m_safe(1) ⊕ m_¬bomb(3)(X) = 0.59  if X = maxΩ(safe(1) ∧ ¬bomb(3))
                                     0.41  if X = maxΩ(safe(1) ∧ bomb(3))
                                     0     otherwise

3.3 Composite Beliefs
Until now we have only used atomic formulae in our examples. However, we can also model disjunctions, conjunctions and negations of beliefs as sets of hypotheses, by mapping disjunction, conjunction and negation of beliefs to, respectively, unions, intersections and complements of sets of hypotheses. We can illustrate such composite beliefs with an example. Consider the following two formulae in the grid-world introduced above: ϕ: safe(2) ∧ (safe(3) ∨ safe(4)), and ψ: safe(1) ∨ (¬safe(2) ∧ safe(3)). These formulae correspond to the sets {H1, H2, H3, H9, H10, H11} and {H1, H2, H3, H4, H5, H6, H7, H8, H13, H14}, respectively. If ϕ has a probability of p, and ψ has a probability of q, then these formulae can be represented by basic belief formulae as follows:

    mϕ(X) = p      if X = maxΩ(safe(2) ∧ (safe(3) ∨ safe(4)))
            1 − p  if X = Ω
            0      otherwise

    mψ(X) = q      if X = maxΩ(safe(1) ∨ (¬safe(2) ∧ safe(3)))
            1 − q  if X = Ω
            0      otherwise

Obviously, the conjunction ϕ ∧ ψ equals safe(1) ∧ safe(2) ∧ (safe(3) ∨ safe(4)), and from Table 1 it follows that this result corresponds to the set {H1, H2, H3}, which is the intersection of {H1, H2, H3, H9, H10, H11} and {H1, H2, H3, H4, H5, H6, H7, H8, H13, H14}.
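In the terms of the earlier grid-world sketch (reusing its max_omega and EVERYTHING helpers), composite beliefs reduce to plain set algebra over maxΩ-sets, as this small check shows:

    # Composite beliefs map to set operations on maxΩ-sets (Section 3.3):
    phi = max_omega(lambda h: h[2] == "safe" and (h[3] == "safe" or h[4] == "safe"))
    psi = max_omega(lambda h: h[1] == "safe" or (h[2] == "bomb" and h[3] == "safe"))
    assert phi == frozenset({1, 2, 3, 9, 10, 11})
    assert phi & psi == frozenset({1, 2, 3})        # models of φ ∧ ψ
    assert EVERYTHING - phi == max_omega(             # complement = negation
        lambda h: not (h[2] == "safe" and (h[3] == "safe" or h[4] == "safe")))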
3.4 Inconsistency Problem
The issue of inconsistency in Dempster's Rule of Combination deserves further attention. In the original rule, as given in Definition 1, combinations that lead to an empty set have a mass probability of zero, and the other combinations are scaled to make sure all mass probabilities add up to one. This leads to unexpected results when two mass functions with a high degree of conflict are combined, as can be demonstrated with an often-used example (e.g. in [9]). In a murder case there are three suspects: Peter, Paul and Mary. There are two witnesses, who give highly inconsistent testimonies, represented by the following mass functions:

    m1(X) = 0.99  if X = 'killer is Peter'        m2(X) = 0.99  if X = 'killer is Mary'
            0.01  if X = 'killer is Paul'                 0.01  if X = 'killer is Paul'
            0     otherwise                               0     otherwise

Combining these two mass functions leads to a certain belief that Paul is the killer, although there is hardly any support for this in either of the witnesses' testimonies:

    m1 ⊕ m2(X) = 1  if X = 'killer is Paul'
                 0  otherwise

Sentz [13] describes a number of alternatives to the Rule of Combination. The most prominent (according to Sentz) is Yager's modified Dempster's Rule [21]. Ultimately, this rule attributes the probability of combinations that lead to the empty set to Ω³. A similar approach is taken by Smets [15], who states that in the case of inconsistent mass functions the closed world assumption (the assumption that one of the three suspects is the murderer) is not valid. The probability of empty sets should then be attributed to ∅, as a sort of 'unknown third'. This leads to a mass function of:

    m1 ⊕ m2(X) = 0.0001  if X = 'killer is Paul'
                 0.9999  if X = Ω (Yager), resp. X = ∅ (Smets)
                 0       otherwise

Jøsang [9] poses an alternative, namely the consensus operator, which attributes the mean of the probabilities of two inconsistent beliefs to the combination, rather than their product:

    m1 ⊕ m2(X) = 0.495  if X = 'killer is Peter'
                 0.495  if X = 'killer is Mary'
                 0.01   if X = 'killer is Paul'
                 0      if X = ∅
³ To be more exact, Yager differentiates between ground probabilities q(X) and basic probabilities m(X). The empty set can have q(∅) ≥ 0. When combining, these ground probabilities are used, and the mass is attributed after the combination, where m(X) = q(X) for X ≠ ∅, and m(Ω) = q(Ω) + q(∅).
Table 2. Attribution of mass to inconsistent combinations

    Suspect   W1    W2    Dempster  Yager/Smets  Jøsang
    Peter     0.99  0     0         0            0.495
    Paul      0.01  0.01  1         0.0001       0.01
    Mary      0     0.99  0         0            0.495
    ∅ or Ω    0     0     0         0.9999       0
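The following hedged sketch reproduces the Dempster and Yager/Smets columns of Table 2 (the consensus operator needs Jøsang's opinion representation and is omitted). It reuses combine from the first sketch; combine_yager is our own simplified rendering of the rule described in footnote 3.

    def combine_yager(m1, m2, omega):
        """Yager-style combination: conflicting mass is moved to Ω instead of
        being normalised away (Smets would assign it to the empty set)."""
        out, conflict = {}, 0.0
        for y, a in m1.items():
            for z, b in m2.items():
                x = y & z
                if x:
                    out[x] = out.get(x, 0.0) + a * b
                else:
                    conflict += a * b
        out[omega] = out.get(omega, 0.0) + conflict
        return out

    omega = frozenset({"Peter", "Paul", "Mary"})
    w1 = {frozenset({"Peter"}): 0.99, frozenset({"Paul"}): 0.01}
    w2 = {frozenset({"Mary"}): 0.99, frozenset({"Paul"}): 0.01}

    print(combine(w1, w2))               # {Paul}: ≈1.0        (Dempster)
    print(combine_yager(w1, w2, omega))  # {Paul}: 0.0001, Ω: 0.9999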
We can summarise these approaches (Dempster, Yager/Smets and Jøsang) using the 'murder case' example, as shown in Table 2. Note that the issue of inconsistency directly relates to the choice of the frame of discernment Ω. In this example, we restricted our frame of discernment to the set of three mutually exclusive hypotheses {Paul, Peter, Mary}. If, on the other hand, our frame were Ω = {Paul, Peter, Mary, Peter or Mary}, and we mapped the phrase 'killer is Peter' to the subset {Peter, Peter or Mary} and the phrase 'killer is Mary' to the subset {Mary, Peter or Mary}, then there would be no inconsistency at all. We deal with the choice of the frame of discernment in the next section.

3.5 The Frame of Discernment
Until now, we have mapped agent beliefs to a given set of 16 hypotheses in a 2-by-2 grid-world. Unfortunately, the frame of discernment that corresponds to a given agent program is unknown, and, just as important, there is no unique frame of discernment in such a program. We might just as well add a totally irrelevant hypothesis H17, stating 'All squares contain apples'. We do not know if a certain hypothesis, say H16, can become true at all during the execution of the program. This implies that the relation between the frame of discernment and the agent's beliefs is a many-to-many mapping. This problem can, however, be solved. According to some agent programming languages such as 3APL [5], the number of beliefs an agent can hold during execution of the program is finite. For example, in the programming language 3APL only basic actions can update the belief base. The update corresponding to a basic action is specified as the post-condition of the basic action, which is determined by the programmer before running the program. Therefore, in a given 3APL program all possible beliefs are given either by the initial belief base or by the specification of the basic actions. For this type of agent programs we can construct a theoretical frame of discernment that includes a set of hypotheses such that each belief that the agent can hold during its execution can be mapped to a subset of the frame of discernment. Shafer states [14, p.281] that in general the frame of discernment cannot be determined beforehand (i.e. without knowing which evidence might be relevant), and that we tend to enlarge it as more evidence becomes available. But, on the other hand, if Ω is too large, holding too many irrelevant hypotheses, the probability of any hypothesis becomes unreasonably small. By stating that the frame of discernment should be large enough to hold
all relevant hypotheses with respect to the program under consideration, Ω will be neither too small nor too large. In this paper, we demand that Ω be such that for each belief that an agent can hold during its execution (i.e. each combination of the basic belief formulae) there is at least one non-empty subset of hypotheses in Ω. This corresponds to the demand that the belief base of an agent should be consistent and remain so during its execution, which is the case with, e.g., 3APL agents. In other words, each conjunction of basic belief formulae has a non-empty subset of hypotheses from Ω that are models of that conjunction.
4 Mass Calculation
As we have seen, any given belief base can be represented by a mass function. Generally, a belief formula b_i with probability p_i divides the set of all hypotheses Ω as follows:

    m_bi(X) = p_i      if X = maxΩ(b_i)
              1 − p_i  if X = Ω
              0        otherwise

The combination of belief formulae b_1, ..., b_n can thus be represented by a mass function m_k = m_1 ⊕ ... ⊕ m_n, related to the beliefs b_1, ..., b_n, where the number of subsets of Ω that are used to define m_k and have a mass value greater than zero is equal to 2^n. When a belief formula m_n is combined with an already existing combination m_1 ⊕ ... ⊕ m_{n−1}, the resulting mass function m_1 ⊕ ... ⊕ m_n is defined by the non-empty intersections of all subsets of Ω in m_n with all subsets of Ω in m_1 ⊕ ... ⊕ m_{n−1}. Since we use simple support functions to represent our beliefs - with focal elements Ω and maxΩ(ϕ) - the number of resulting subsets doubles with each added belief formula. Because n belief formulae lead to a mass function with 2^n combinations, keeping a mass function in memory and updating it when the belief base changes will lead to a combinatorial explosion in both processing time and memory requirements, as the following scenario shows. Suppose we start with an empty belief base, which has the following trivial mass function:

    m(X) = 1  if X = Ω
           0  otherwise

If we add the basic belief formula b1 : p1, we compute the following mass function:

    m_b1(X) = p1      if X = maxΩ(b1)
              1 − p1  if X = Ω
              0       otherwise
Adding b2 : p2 leads to⁴:

    m_b1 ⊕ m_b2(X) = p1 · (1 − p2)        if X = maxΩ(b1)
                     p2 · (1 − p1)        if X = maxΩ(b2)
                     p1 · p2              if X = maxΩ(b1 ∧ b2)
                     (1 − p1) · (1 − p2)  if X = Ω
                     0                    otherwise
Note that these consecutive mass combinations can be generalised to a situation with n basic belief formulae, and that the combined mass function will grow exponentially⁵. Fortunately, there is no need to calculate the entire mass function. If we need the mass value for a certain set X ⊆ Ω, we can calculate it using the probabilities in the belief base without calculating the entire mass function. Before formulating and proving this proposition, we first show that we can simplify Dempster's Rule of Combination if 1) we use simple support functions to represent basic belief formulae, and 2) we define Ω such that each conjunction of basic belief formulae that the agent can hold during its execution maps to a non-empty subset of Ω, as discussed in Section 3.5. The two demands are intuitive: the first states that the evidence represented by each basic belief formula supports only one subset of Ω, and the second guarantees that each conjunction of basic belief formulae has a model, i.e. updating the belief base always results in a consistent belief base. To facilitate further considerations, we define pϕ as the probability assigned to a certain belief formula ϕ, and mϕ as the simple support function associated with ϕ. The core of mϕ is denoted C(mϕ). We introduce the concept of M-completeness to denote the mentioned condition on Ω, and define it as follows:

Definition 5 (M-complete). Let Ω be a set of hypotheses and let M be a set of mass functions. Then Ω will be called M-complete if and only if

    ∀mφ, mψ ∈ M : (maxΩ(φ) ∈ C(mφ) & maxΩ(ψ) ∈ C(mψ)) ⇒ ∅ ⊂ maxΩ(φ ∧ ψ) ⊆ Ω

With this notion of M-completeness, we can formulate and prove how Dempster's Rule of Combination can be simplified.

Proposition 1. Let SΩ be the set of all basic belief formulae, and M be the set of all mass functions associated with basic belief formulae from SΩ. Let Ω be M-complete, and let φ and ψ be two non-equivalent basic belief formulae (i.e.
⁴ Note that this general scheme is based on the assumption that b1 ∧ b2 is not equivalent to b1 or to b2, i.e. it is assumed that the belief formulae are such that their pairwise intersections are not equivalent to one of the constituents. If this assumption did not hold, some of the clauses would be identical, and there would be fewer clauses in the remaining scheme.
⁵ In fact, Orponen [11] showed that Dempster's Rule of Combination is #P-complete. Wilson [19] has provided a number of approximation algorithms to overcome this problem, and Barnett [1, 2] has shown that the calculation of the combination is linear if only singleton subsets are used or if the subsets are atomic with respect to the evidence. The latter restriction is problematic since we are not aware of the actual hypotheses that constitute the frame of discernment.
¬(φ ≡ ψ)). Then Σ_{Y∩Z≠∅} mφ(Y) · mψ(Z) = 1, and for each X ⊆ Ω there are at most one Y ⊆ Ω and one Z ⊆ Ω relevant to Dempster's combination rule, so that the rule can be simplified to mφ ⊕ mψ(X) = mφ(Y) · mψ(Z) where Y ∩ Z = X.

Proof. Since φ and ψ are basic belief formulae, the only Y, Z ⊆ Ω for which mφ(Y) ≠ 0 and mψ(Z) ≠ 0 are focal elements of mφ and mψ, i.e. C(mφ) = {M_Y, Ω} where M_Y = maxΩ(φ), and C(mψ) = {M_Z, Ω} where M_Z = maxΩ(ψ). In other words, Y ranges over C(mφ) and Z ranges over C(mψ). For all other subsets Y and Z of Ω we have mφ(Y) = 0 and mψ(Z) = 0, so that mφ(Y) · mψ(Z) = 0, which does not influence the summation Σ_{Y∩Z=X} mφ(Y) · mψ(Z) in the numerator of the Combination Rule. Given that Y and Z range over C(mφ) and C(mψ) respectively, it is clear that for each X ⊆ Ω we can have at most one Y ∈ C(mφ) for which mφ(Y) ≠ 0 and at most one Z ∈ C(mψ) for which mψ(Z) ≠ 0 such that Y ∩ Z = X. More specifically, for any X ⊆ Ω, the subsets Y and Z can be determined as follows:

    If X = Ω, then Y = Ω and Z = Ω.
    If X = maxΩ(φ), then Y = X and Z = Ω.
    If X = maxΩ(ψ), then Y = Ω and Z = X.
    If X = maxΩ(φ ∧ ψ), then Y = maxΩ(φ) and Z = maxΩ(ψ).
Since mφ and mψ are simple support functions, mφ(maxΩ(φ)) = pφ, mφ(Ω) = 1 − pφ, mψ(maxΩ(ψ)) = pψ, and mψ(Ω) = 1 − pψ. Then we have Σ_{Y∩Z≠∅} mφ(Y) · mψ(Z) = pφ·pψ + pφ·(1−pψ) + pψ·(1−pφ) + (1−pφ)·(1−pψ) = 1. This proves that the denominator does not influence the result of the Combination Rule. Therefore, Dempster's Rule of Combination can be simplified to mφ ⊕ mψ(X) = mφ(Y) · mψ(Z) where Y ∩ Z = X.

This result can be generalised to consecutive combinations. From the associativity of Dempster's Rule of Combination [13] it follows that Y1, ..., Yn in the formula Y1 ∩ ... ∩ Yn = X can be determined in a similar way. For example, in a belief base consisting of three basic belief formulae φ, ψ and χ, the set X = maxΩ(φ ∧ ψ) can be determined as the intersection of Yφ = maxΩ(φ), Yψ = maxΩ(ψ) and Yχ = Ω.

In the second proposition we formulate a straightforward method to calculate the mass of any combination of basic belief formulae, and prove that this calculation leads to the same result as the simplified Rule of Combination.

Proposition 2. Let SΩ be the set of all basic belief formulae, and M be the set of all mass functions associated with basic belief formulae from SΩ. Let Ω be M-complete. For each subset X ⊆ Ω, there exists a unique bi-partition of the set of basic belief formulae SΩ, say S⁺_X and S⁻_X, such that the general case of Dempster's combination rule can be simplified as follows:

    (m1 ⊕ ... ⊕ mn)(X) = Π_{ϕ∈S⁺_X} pϕ · Π_{ϕ∈S⁻_X} (1 − pϕ)
Proof. Let Y and Z be subsets of Ω. Based on Proposition 1, Dempster's Rule of Combination can be reduced to m1 ⊕ m2(X) = m1(Y) · m2(Z), where Y ∩ Z = X. The mass function formed by n consecutive combinations is then equal to

    ∀X, Y1, ..., Yn ⊆ Ω : (m1 ⊕ ... ⊕ mn)(X) = Π_{i=1...n} mi(Yi), where ∩_{i=1...n} Yi = X
Given the mass functions mφ1, ..., mφn ∈ M, for any X ⊆ Ω and 1 ≤ i ≤ n there exists at most one Yi ⊆ Ω ranging over the core C(mφi) such that Y1 ∩ ... ∩ Yn = X and mφi(Yi) ≠ 0 (for the same reason as in Proposition 1). According to the definition of the simple support function, mφi(Yi) can be either pφi or 1 − pφi for 1 ≤ i ≤ n. Let then S⁺_X = {φ | Yi = maxΩ(φ) & φ ∈ SΩ}, which is the set of all basic belief formulae for which the corresponding mass function assigns pφi to the subset Yi (rather than 1 − pφi). Taking S⁻_X = SΩ\S⁺_X proves the proposition.
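A minimal sketch of this direct calculation, under our own representation in which each basic belief formula is stored as its maxΩ-set paired with its probability; the bipartition test follows Proposition 2 and the distinctness assumption of footnote 4.

    from math import prod

    def mass(basic_beliefs, x, omega):
        """m(X) via Proposition 2: S+_X = formulae whose maxΩ-set contains X;
        X gets mass Π_{S+} p · Π_{S-} (1-p) if X is exactly the intersection
        of the S+_X sets, and mass 0 otherwise. Linear in the number of
        beliefs, with no 2^n enumeration of focal elements."""
        s_plus = [(f, p) for f, p in basic_beliefs if x <= f]
        meet = omega
        for f, _ in s_plus:
            meet &= f
        if meet != x:
            return 0.0                    # X is not a focal element
        plus = prod(p for _, p in s_plus)
        minus = prod(1.0 - p for f, p in basic_beliefs if not x <= f)
        return plus * minus

For the running grid-world example, storing the base as [(maxΩ(safe(1) ∧ safe(2)), 1.0), (maxΩ(bomb(3)), 0.7)] and querying the set {H3, H4} returns 0.7 directly, without materialising the combined mass function.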
5 Updating and Querying the Belief Base
In order to incorporate new information (e.g. from observation or communication) in their beliefs, agents need to update their belief base. Since an existing belief base (consisting of basic belief formulae) and a new basic belief formula can both be represented by mass functions, we can add the new basic belief formula to the belief base, which in its turn can again be represented by a mass function. This yields the same result as combining each simple support function associated with the basic belief formulae, as we proved in Proposition 2. However, if the belief base already contains this belief formula, we can update it using the associative nature of Dempster's Rule of Combination. For example, suppose a belief base consisting of two basic belief formulae b1 : p1 and b2 : p2 is updated with the basic belief formula b1 : p3, whose simple support function we denote m_b1′. The new probability of b1 can be calculated since m_b1 ⊕ m_b2 ⊕ m_b1′ = m_b1 ⊕ m_b1′ ⊕ m_b2 = (m_b1 ⊕ m_b1′) ⊕ m_b2. Combining m_b1 and m_b1′ leads to a simple support function with a mass value of p1 + p3 − p1·p3 for the set X = maxΩ(b1); therefore we can update the probability of b1 in the belief base to p1 + p3 − p1·p3. Furthermore, we can test (query) whether a proposition ϕ can be derived from a belief base Γ. In Section 2, we discussed the belief and plausibility functions (defined in terms of a certain mass function) that return the total mass assigned to models of ϕ and the total mass that is not assigned to models of the negation of ϕ. Using these functions, we can test whether ϕ can be derived from Γ within a certain probability interval [L, U] (denoted Γ |=[L,U] ϕ). This can be done by checking whether Bel(maxΩ(ϕ)) ≥ L and Pl(maxΩ(ϕ)) ≤ U, since Bel and Pl indicate the lower and upper bound probability, respectively. Note that Bel and Pl are defined in terms of a mass function (Definition 3). When we consider the belief base Γ, the mass function used in Bel and Pl will represent the belief base and is denoted mΓ. As discussed in Section 3, the mass function mΓ assigns a mass value to each subset of the frame of discernment related to the belief
base Γ . Therefore, the test if ϕ can be deducted from Γ within [L, U ] can be formulated as follows: Γ |=[L,U ] ϕ ⇐⇒ Bel(maxΩ (ϕ)) & P l(maxΩ (ϕ)) ⇐⇒ m (Y ) ≥ L & Γ Y ⊆maxΩ (ϕ) Y ∩maxΩ (ϕ) =∅ mΓ (Y ) ≤ U
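The sketch below combines the earlier pieces into an update-and-query loop; it folds combine over simple support functions for clarity rather than speed, and the names are again our own assumptions, not the paper's API.

    from functools import reduce

    def update(belief_base, f, p_new):
        """Add a basic belief formula with maxΩ-set f and probability p_new;
        a duplicate entry is merged using p + p' - p·p' (associativity of ⊕)."""
        p_old = belief_base.get(f)
        belief_base[f] = p_new if p_old is None else p_old + p_new - p_old * p_new

    def base_mass(belief_base, omega):
        """Mass function of the whole base (assumes no maxΩ-set equals Ω)."""
        supports = [{f: p} if p == 1.0 else {f: p, omega: 1.0 - p}
                    for f, p in belief_base.items()]
        return reduce(combine, supports, {omega: 1.0})

    def query(belief_base, phi_set, lo, hi, omega):
        """Γ |=[L,U] φ  ⇔  Bel(maxΩ(φ)) ≥ L and Pl(maxΩ(φ)) ≤ U."""
        m = base_mass(belief_base, omega)
        return bel(m, phi_set) >= lo and pl(m, phi_set) <= hi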
6 Complexity of Belief Queries
We can calculate Bel(maxΩ(ϕ)) by adding all values m(X) for which X ⊆ Ω and models(X, ϕ). Moreover, Pl(maxΩ(ϕ)) can be calculated in a similar way by adding all values m(X) for which X ⊆ Ω and ¬models(X, ¬ϕ). Without restrictions on the basic belief formulae in the belief base, we need to iterate over all focal elements of m, which suggests an exponential computational complexity for determining whether Γ |=[L,U] ϕ. However, if deduction is based on the Closed World Assumption, then Bel(X) = Pl(X); and if, furthermore, we restrict the logical formulae that constitute the basic belief formulae in the belief base to Prolog facts (i.e. atoms), then the computational complexity of Bel(maxΩ(ϕ)) is linear in the length of ϕ.

Proposition 3. Let Bel(X) and Pl(X) be Dempster-Shafer belief and plausibility functions, respectively, defined on a certain mass function m. If the deduction X |= ϕ is based on the Closed World Assumption, then Bel(X) = Pl(X).

Proof. Under the Closed World Assumption, we can test the belief in a certain formula ϕ by calculating Bel(maxΩ(ϕ)), and the belief in the negation of this formula by calculating Bel(Ω\maxΩ(ϕ)). Since Bel(X) is defined as Σ_{Y⊆X} m(Y), we have Bel(Ω\X) = Σ_{Y⊈X} m(Y) under the CWA. Since Σ_{Y⊆X} m(Y) + Σ_{Y⊈X} m(Y) = Σ_{X⊆Ω} m(X) = 1, it follows that Bel(Ω\X) = 1 − Bel(X). By definition⁶, Pl(X) = 1 − Bel(Ω\X): the plausibility of a set of hypotheses is 1 minus the belief in the complement (with respect to Ω) of this set. But Bel(Ω\X) = 1 − Bel(X), and therefore Pl(X) = Bel(X) under the Closed World Assumption.

The belief in a certain formula ϕ can be calculated straightforwardly, using standard probability calculus, in linear time with respect to the number of atoms and connectives in ϕ.

Proposition 4. Let the belief base be a set of atomic formulae to which probability values are assigned. If φ : pφ and ψ : pψ are basic belief formulae in the belief base, then Bel(maxΩ(φ)), Bel(maxΩ(¬φ)), Bel(maxΩ(φ ∧ ψ)) and Bel(maxΩ(φ ∨ ψ)) can be calculated as follows:

    – Bel(maxΩ(φ)) = pφ
    – Bel(maxΩ(¬φ)) = Bel(Ω\maxΩ(φ)) = 1 − pφ
    – Bel(maxΩ(φ ∧ ψ)) = pφ · pψ
    – Bel(maxΩ(φ ∨ ψ)) = pφ + pψ − pφ · pψ

⁶ See Section 2.
Proof. The proof of the first clause is as follows: Bel(maxΩ(φ)) is defined as the sum of all mass values assigned to subsets of hypotheses in Ω that are models of φ. If the belief base consists of exactly the basic belief formula φ : pφ, then the first clause is proved by the definition of the mass function, i.e.

    mφ(X) = pφ      if X = maxΩ(φ)
            1 − pφ  if X = Ω
            0       otherwise

If the belief base is, however, a combination of two mass functions, say mφ and mχ, then maxΩ(φ) and maxΩ(φ ∧ χ) are the only focal elements of mφ ⊕ mχ that consist of models of φ. Their mass values are pφ · pχ and pφ · (1 − pχ) respectively, which sum to pφ. The second clause follows by definition from the first, using Bel(maxΩ(φ)) + Bel(maxΩ(¬φ)) = 1. The proof of the third clause follows the lines of the proof of the first clause: when combining the mass functions mφ and mψ, the models of φ ∧ ψ are constructed by intersecting the maximal sets of hypotheses in Ω that are models of φ and of ψ, and from Dempster's Rule of Combination it follows that the mass of this intersection equals pφ · pψ. The fourth clause follows from the second and third clauses.
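Proposition 4 suggests a simple recursive evaluator. The sketch below is our illustration and assumes, as the proposition does, a belief base of independent atoms (each atom occurring at most once in the query).

    def bel_atomic(probs, phi):
        """Linear-time Bel(maxΩ(φ)) under the CWA for a belief base of atomic
        formulae. `probs` maps atoms to probabilities; `phi` is a nested tuple,
        e.g. ("and", ("atom", "safe1"), ("not", ("atom", "bomb3")))."""
        op = phi[0]
        if op == "atom":
            return probs[phi[1]]
        if op == "not":
            return 1.0 - bel_atomic(probs, phi[1])
        if op == "and":
            return bel_atomic(probs, phi[1]) * bel_atomic(probs, phi[2])
        if op == "or":
            a = bel_atomic(probs, phi[1])
            b = bel_atomic(probs, phi[2])
            return a + b - a * b
        raise ValueError(f"unknown connective {op!r}")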
7 Conclusion and Further Work
A lot of research has been conducted on the topic of reasoning with uncertainty. Many approaches are based on extending epistemic logic with probability. For example, [7] proposed the system AX_MEAS, [8] introduced the PFD system, and [16] further refined this system to PFKD45. Some of these logics have been suggested as good candidates for an underlying system for agent programming (see for example [6]). Next to the epistemic logic approach, however, alternative notions of uncertainty have been suggested, like the Certainty Factor model used in MYCIN, Bayesian (or causal) networks, and the Dempster-Shafer theory of evidence. Particularly appealing in the latter are the ability to model ignorance as well as uncertainty, the presence of a combination rule to combine evidence, and the concept of hypotheses, which can easily be related to models of logical formulae. Nevertheless, the computational complexity, the issue of inconsistency, and the logical validity of the combination rule (see for example [18] for a discussion) are serious disadvantages for the practical application of this theory to agent programming. We have investigated a possible mapping of Dempster-Shafer sets to belief formulae, which are represented by logical formulae, in agent programming languages. We have shown that, with restrictions on the mass functions and on the frame of discernment, Dempster-Shafer theory is a convenient way to model uncertainty in agent beliefs, and these disadvantages can be overcome. Because we do not need to keep a combined mass function of n beliefs in memory and update it with each belief update (but instead compute the mass value of a particular subset of Ω based on the beliefs in the belief base), there is no combinatorial explosion. Currently, we are working on an implementation of uncertain beliefs in the agent programming language 3APL. Further research will be conducted on the consequences of uncertain beliefs for agent deliberation.
References

1. J.A. Barnett. Computational methods for a mathematical theory of evidence. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence, pages 868–875, 1981.
2. J.A. Barnett. Calculating Dempster-Shafer plausibility. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:599–603, 1991.
3. L. Braubach, A. Pokahr, and W. Lamersdorf. Jadex: A short overview. In Main Conference Net.ObjectDays 2004, pages 195–207, September 2004.
4. P. Busetta, R. Ronnquist, A. Hodgson, and A. Lucas. JACK intelligent agents - components for intelligent agents in Java. Technical report, 1999.
5. M. Dastani, B. van Riemsdijk, F. Dignum, and J.-J. Meyer. A programming language for cognitive agents: Goal directed 3APL. In Proceedings of the First Workshop on Programming Multiagent Systems (ProMAS03), 2003.
6. N. de C. Ferreira, M. Fisher, and W. van der Hoek. A simple logic for reasoning about uncertainty. In Proceedings of the ESSLLI'04 Student Session, pages 61–71, 2004.
7. R. Fagin and J.Y. Halpern. Reasoning about knowledge and probability. Journal of the ACM, 41:340–367, 1994.
8. M. Fattorosi-Barnaba and G. Amati. Modal operators with probabilistic interpretations, I. Studia Logica, 46:383–393, 1987.
9. A. Jøsang. The consensus operator for combining beliefs. Artificial Intelligence Journal, 142(1-2):157–170, 2002.
10. A.F. Moreira and R.H. Bordini. An operational semantics for a BDI agent-oriented programming language. In J.-J. C. Meyer and M. J. Wooldridge, editors, Proceedings of the Workshop on Logics for Agent-Based Systems, pages 45–59, 2002.
11. P. Orponen. Dempster's rule of combination is #P-complete. Artificial Intelligence, 44:245–253, 1990.
12. A.S. Rao and M.P. Georgeff. Modelling rational agents within a BDI-architecture. In J. Allen, R. Fikes, and E. Sandewall, editors, Proceedings of the Second International Conference on Principles of Knowledge Representation and Reasoning. Morgan Kaufmann Publishers, San Mateo, CA, 1991.
13. K. Sentz. Combination of Evidence in Dempster-Shafer Theory. PhD thesis, Binghamton University, 2002.
14. G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, Princeton, NJ, 1976.
15. P. Smets. The combination of evidence in the transferable belief model. IEEE Pattern Analysis and Machine Intelligence, 12:447–458, 1990.
16. W. van der Hoek. Some considerations on the logic PFD. Journal of Applied Non-Classical Logics, 7:287–307, 1997.
17. M. Verbeek. 3APL as programming language for cognitive robotics. Master's thesis, Utrecht University, 2003.
18. F. Voorbraak. As Far as I Know - Epistemic Logic and Uncertainty. PhD thesis, Utrecht University, 1993.
19. N. Wilson. Algorithms for Dempster-Shafer theory. In D.M. Gabbay and P. Smets, editors, Handbook of Defeasible Reasoning and Uncertainty Management Systems, volume 5: Algorithms, pages 421–475. Kluwer Academic Publishers, 2000.
20. M. Wooldridge. Intelligent agents. In G. Weiss, editor, Multiagent Systems. The MIT Press, 1999.
21. R. Yager. On the Dempster-Shafer framework and new combination rules. Information Sciences, 41:93–137, 1987.
Complete Axiomatizations of Finite Syntactic Epistemic States

Thomas Ågotnes and Michal Walicki

Department of Informatics, University of Bergen, PB. 7800, N-5020 Bergen, Norway
{agotnes, walicki}@ii.uib.no
Abstract. An agent who bases his actions upon explicit logical formulae has at any given point in time a finite set of formulae he has computed. Closure or consistency conditions on this set cannot in general be assumed – reasoning takes time and real agents frequently have contradictory beliefs. This paper discusses a formal model of knowledge as explicitly computed sets of formulae. It is assumed that agents represent their knowledge syntactically, and that they can only know finitely many formulae at a given time. In order to express interesting properties of such finite syntactic epistemic states, we extend the standard epistemic language with an operator expressing that an agent knows at most a particular finite set of formulae, and investigate axiomatization of the resulting logic. This syntactic operator has also been studied elsewhere without the assumption about finite epistemic states [5]. A strongly complete logic is impossible, and the main results are non-trivial characterizations of the theories for which we can get completeness. The paper presents a part of a general abstract theory of resource bounded agents. Interesting results, e.g., complex algebraic conditions for completeness, are obtained from very simple assumptions, i.e., epistemic states as arbitrary finite sets and operators for knowing at least and at most.
1 Introduction

Traditional epistemic logics [11, 16], based on modal logic, are logics about knowledge closed under logical consequence – they describe agents who know all the infinitely many consequences of their knowledge. Such logics are very useful for many purposes, including modelling the information implicitly held by the agents or modelling the special case of extremely powerful reasoners. They fail, however, to model the explicit knowledge of real reasoners. Models of explicit knowledge are needed, e.g., if we want to model agents who represent their knowledge syntactically and base their actions upon the logical formulae they know. An example is when an agent is required to answer questions about whether he knows a certain formula or not. The agent must then decide whether this exact formula is true from his perspective — when he, e.g., is asked whether he knows q ∧ p and he has already computed that p ∧ q is true but not (yet) that q ∧ p is true, then he cannot answer positively before he has performed a (trivial) act of reasoning. Real agents do not have unrestricted memory or unbounded time available for reasoning. In reality, an agent who bases his actions on explicit logical formulae has at any given time a finite set of formulae he has computed. In the general case, we
cannot assume any closure conditions on this set: we cannot assume that the agent has had time to deduce something yet, nor can we assume consistency or other connections to reality — real agents often hold contradictory or otherwise false beliefs. The topic of this paper is formal models of knowledge as explicitly computed sets of formulae. We represent an agent's state as a finite set of formulae, called a finite epistemic state. Modal epistemic logic can be seen not only as a description of knowledge but also as a very particular model of reasoning, which is not valid for resource bounded agents. With a syntactic approach, we can get a theory of knowledge without any unrealistic assumptions about the reasoning abilities of the agents. The logic we present here is a logic about knowledge in a system of resource bounded agents at a point in time. We are not concerned with how the agents obtain their knowledge, but with reasoning about their static states of knowledge. Properties of reasoning can be modelled in an abstract way by considering only the set of epistemic states which a reasoning mechanism could actually produce. For example, we can choose to consider only epistemic states which do not contain both a formula and its negation. The question is, of course, whether anything interesting can be said about static properties of such general states. That depends on the available language. Syntactic characterizations of states of knowledge are of course nothing new [7, 11, 12, 17]. The general idea is that the truth value of a formula such as Ki φ, representing the fact that agent i knows the formula φ, need not depend on the truth value of any other formula of the form Ki ψ. Of course, syntactic characterization is an extremely general approach which can be used for several different models of knowledge – including also closure under logical consequence. It is, however, with the classical epistemic meta language, too general to have any interesting logical properties. The formula Ki φ denotes the fact that i knows at least φ – he knows φ but he may know more. We can generalize this to finite sets X of formulae:

    △i X ≡ ⋀{Ki φ : φ ∈ X}

representing the fact that i knows at least X. In this paper we also use a dual operator, introduced in [3, 5], to denote the fact that i knows at most X: ▽i X denotes the fact that every formula the agent knows is included in X, though he may not know all the formulae in X. We call the language in which the agents represent their knowledge the object language (OL). In the case that OL is finite, the operator ▽i can, like △i, be defined in terms of Ki:

    ▽i X ≡ ⋀{¬Ki φ : φ ∈ OL \ X}

But in the general case when OL is infinite, e.g. if OL is closed under propositional connectives, ▽i is not definable by Ki. We also use a third, derived, epistemic operator:

    ♦i X ≡ △i X ∧ ▽i X

meaning that the agent knows exactly X. The second difference from the traditional syntactic treatments of knowledge, in addition to the new operator ▽i, is that we restrict the set of formulae an agent can know at a given time to be finite. The problem we consider in this paper is axiomatizing the
resulting logic. We present a sound axiomatization, and show that it is impossible to obtain strong completeness. The main results are proof-theoretical and semantical characterizations of the sets of premises for which the system is complete; these sets include the empty set so the system is weakly complete. In [5] we studied the axiomatization of a similar logic with the “knowing at most” operator, albeit without the finiteness assumption. Proving completeness (for the mentioned class of premises) turns out to be quite difficult when we assume that only finitely many formulae can be known, but this can be seen as a price paid for the treatment of the inherently difficult issue of finiteness. In the next section, the language and semantics for the logic are presented. In Section 3 it is shown that strong completeness is impossible, and a sound axiomatization presented. The rest of the paper is concerned with finding the sets of premises for which the system is complete. Section 5 gives a proof-theoretic account of these premise sets, while a semantic one consisting of complex algebraic conditions on possible epistemic states is given in Section 6. The results in Sections 5 and 6 build on results from the similar logic from [5] mentioned above, presented in Section 4. In Section 7 some actual completeness results, including weak completeness, are shown, and Section 8 concludes.
2 Language and Semantics

The logic is parameterized by an object language OL. The object language is the language in which the agents reason, e.g. propositional logic or first order logic. No assumptions about the structure of OL are made, and the results in this paper are valid for arbitrary object languages, but the interesting case is the usual one when OL is infinite. An example of a possible property of the object language, which is often assumed in this paper, is that it is closed under the usual propositional connectives. Another possible property of an object language is that it is a subset of the meta language, allowing e.g. the expression of the knowledge axiom in the meta language: △i {α} → α. ℘fin(OL) denotes the set of all finite epistemic states, and a state T ∈ ℘fin(OL) is used as a term in an expression such as △i T. In addition, we allow set-building operators ⊔, ⊓ on terms, in order to be able to express things like (△i T ∧ △i U) → △i (T ⊔ U) in the meta language. TL is the language of all terms:

Definition 1 (TL(OL)). TL(OL), or just TL, is the least set such that
– ℘fin(OL) ⊆ TL
– If T, U ∈ TL then (T ⊔ U), (T ⊓ U) ∈ TL

The interpretation [T] ∈ ℘fin(OL) of a term T ∈ TL is defined as expected: [X] = X when X ∈ ℘fin(OL), [T ⊔ U] = [T] ∪ [U], [T ⊓ U] = [T] ∩ [U].

An expression like △i T relates the current epistemic state of an agent to the state described by the term T. In addition, we allow reasoning about the relationship between the two states denoted by terms T and U in the meta language, by introducing formulae of the form T ≐ U, meaning that [T] = [U].
The meta language EL, and the semantic structures, are parameterized by the number of agents n and a set of primitive propositions Θ, in addition to the object language. The primitive propositions Θ play a very minor role in the rest of this paper; they are only used to model an arbitrary propositional language which is then extended with epistemic (and term) formulae. In particular, no relation between OL and Θ is assumed.

Definition 2 (EL(n, Θ, OL)). Given a number of agents n, a set of primitive formulae Θ, and an object language OL, the epistemic language EL(n, Θ, OL), or just EL, is the least set such that:
– Θ ⊆ EL
– If T ∈ TL(OL) and i ∈ [1, n] then △i T, ▽i T ∈ EL
– If T, U ∈ TL(OL) then (T ≐ U) ∈ EL
– If φ, ψ ∈ EL then ¬φ, (φ ∧ ψ) ∈ EL
The usual derived propositional connectives are used, in addition to T ⊑ U for T ⊔ U ≐ U and ♦i φ for (△i φ ∧ ▽i φ). The operators △i, ▽i and ♦i are called epistemic operators. A boolean combination of formulae of the form T ≐ U is called a term formula. Members of OL will be denoted α, β, . . ., members of EL will be denoted φ, ψ, . . ., and members of TL will be denoted T, U, . . .. The semantics of EL is defined as follows. Again, Θ and its interpretation do not play an important role here.

Definition 3 (Knowledge Set Structure). A Knowledge Set Structure (KSS) for n agents, primitive propositions Θ and object language OL is an (n + 1)-tuple M = (s1, . . . , sn, π), where si ∈ ℘fin(OL) and π : Θ → {true, false} is a truth assignment. si is the epistemic state of agent i, and the set of all epistemic states is S^f = ℘fin(OL). The set of all KSSs is denoted M^fin. The set of all truth assignments is denoted Π.
⇔ ⇔
π(p) = true M |=f φ
M |=f (φ ∧ ψ) M |=f i T
⇔ ⇔
M |=f φ and M |=f ψ [T ] ⊆ si
M |=f i T . M |=f T = U
⇔ ⇔
si ⊆ [T ] [T ] = [U ]
2
As usual, if Γ is a set of formulae then we write M |=f Γ iff M is a model of all formulae in Γ and Γ |=f φ (φ is a logical consequence of Γ ) iff every model of Γ is
Complete Axiomatizations of Finite Syntactic Epistemic States
37
also a model of φ. If ∅ |=f φ, written |=f φ, then φ is valid. The class of all models of Γ is denoted mod f (Γ ). The logic consisting of the language EL, the set of structures Mfin and the relation |=f can be used to describe the current epistemic states of agents and how epistemic states are related to each other — without any restrictions on the possible epistemic states. For example, the epistemic states are neither required to be consistent – an agent can know both a formula and its negation – nor closed under any form of logical consequence – an agent can know α∧β without knowing β ∧α. Both consequence conditions and closure conditions can be modelled by a set of structures M ⊂ Mfin where only epistemic states not violating the conditions are allowed. For example, we can construct a set of structures allowing only epistemic states not including both a formula α and ¬α at the same time, or including β ∧ α whenever α ∧ β is included. If we restrict the class of models considered under logical consequence to M , we get a new variant of the logic. We say that “Γ |=f φ with respect to M ” if every model of Γ in M is a model of φ. The question of how to completely axiomatize these logics, the general logic described by Mfin and the more special logics described by removing “illegal” epistemic states, is the main problem considered in this paper and is introduced in the next section.
3 Axiomatizations

The usual terminology and notation for Hilbert-style proof systems are used. A proof system is sound with respect to M ⊆ Mfin iff Γ ⊢ φ implies that Γ |=f φ wrt. M, weakly complete wrt. M iff |=f φ wrt. M implies that ⊢ φ, and strongly complete wrt. M iff Γ |=f φ wrt. M implies that Γ ⊢ φ. When it comes to completeness, it is easy to see that it is impossible to achieve full completeness with respect to Mfin with a sound axiomatization without rules with infinitely many antecedents, because the logic is not compact. Let Γ1 be the following theory, assuming that the object language is closed under a △1 operator:

Γ1 = {△1 {p}, △1 {△1 {p}}, △1 {△1 {△1 {p}}}, . . .}

Clearly, Γ1 is not satisfiable, intuitively because it describes an agent with an infinite epistemic state, but any finite subset of Γ1 is satisfiable. However, a proof of its inconsistency would necessarily include infinitely many formulae from the theory and be of infinite length (if the proof used only a finite subset Γ′ ⊂ Γ1, the logical system would not be sound, since Γ′ is satisfiable). Another illustrating example is the following theory, assuming that the object language has conjunction:

Γ2 = {△1 {α, β} → △1 {α ∧ β} : α, β ∈ OL}

Unlike Γ1, Γ2 is satisfiable, but only in a structure in which agent 1's epistemic state is the empty set. Thus, Γ2 |=f ▽1 ∅. But again, a proof of ▽1 ∅ from Γ2 would be infinitely long (because it would necessarily use infinitely many instances of the schema Γ2), and an axiomatization without an infinite deduction rule would thus be (strongly) incomplete, since then Γ2 ⊬ ▽1 ∅.
3.1 The Basic System

Since we cannot get strong completeness, the natural question is whether we can construct a weakly complete system for the logic described by Mfin. The answer is positive: the following system EC is sound and weakly complete with respect to Mfin. Although it is not too hard to prove weak completeness directly, we will prove a more general completeness result from which weak completeness follows as a special case, as discussed in Section 3.2 below.

Definition 5 (EC). The epistemic calculus EC is the logical system for the epistemic language EL consisting of the following axiom schemata:

Prop  All substitution instances of tautologies of propositional calculus
TC    A sound and complete axiomatization of term formulae
E1    △i ∅
E2    (△i T ∧ △i U) → △i (T ⊔ U)
E3    (▽i T ∧ ▽i U) → ▽i (T ⊓ U)
E4    (▽i (U ⊔ {α}) ∧ ¬△i {α}) → ▽i U
KS    (△i T ∧ U ⊑ T) → △i U
KG    (▽i T ∧ T ⊑ U) → ▽i U

and the following transformation rule:

MP    From φ and φ → ψ, infer ψ. □

A sound and complete term calculus is given in the appendix. Γ ⊢ φ means that there exists a sequence φ1 · · · φl with φl = φ such that each φi is either an axiom, an element of Γ, or the result of applying the rule MP to some φj and φk with j < i and k < i. The main axioms of EC are self-explanatory; KS and KG stand for "knowledge specialization" and "knowledge generalization", respectively. It is easy to see that the deduction theorem (DT) holds for EC.

Theorem 6 (Soundness). If Γ ⊢ φ then Γ |=f φ. □
3.2 Extensions

In Section 2 we mentioned that a logic with closure conditions or consistency conditions on the epistemic states can be modelled by a class M ⊂ Mfin, by restricting the set of possible epistemic states. Such subclasses can often be described by axioms. For example, the axiom

△i {α} → ¬△i {¬α}    (D)

describes agents who will never believe both a formula and its negation. The next question is whether, if we add an axiom to EC, the resulting system will be complete with respect to the class of models of the axiom; e.g., whether EC extended with
D will be complete with respect to the class of all models with epistemic states not containing both a formula and its negation. Weak completeness of EC does, of course, entail (weak) completeness of EC extended with a finite set of axioms (by DT). An axiom schema such as D, however, represents an infinite set of axioms, so completeness of EC extended with such an axiom schema (with respect to the models of the schema) does not necessarily follow. The completeness proof, which occupies most of the remainder of the paper, is actually more than a proof of weak completeness of EC: it is a characterization of those sets of premises for which EC is complete, called finitary theories, and it gives a method for deciding whether a given theory is finitary. Thus, if we extend EC with a finitary theory, the resulting logic is weakly complete with respect to the corresponding models.

Examples. If we assume that OL is closed under the usual propositional connectives and the △i operators, some common axioms can be written in EL as follows:

K (Distribution)             △i {(α → β)} → (△i {α} → △i {β})
D (Consistency)              △i {α} → ¬△i {¬α}
4 (Positive Introspection)   △i {α} → △i {△i {α}}
5 (Negative Introspection)   ¬△i {α} → △i {¬△i {α}}

The system EC extended with axiom Φ will be denoted EC Φ; e.g., the axioms above give the systems EC K, EC D, EC 4, EC 5.
4 More General Epistemic States

A semantic structure for the language EL, a KSS, has a finite epistemic state for each agent. In this section we introduce a generalised semantic structure and some corresponding results. The generalised semantics and corresponding results are taken almost directly from a similar logic presented in [5]. Their interest in this paper is as an intermediate step towards results for KSSs; the results in the two following sections build on the results for the generalised semantics given below.

Recall that the object language OL is a parameter of the logic. For the rest of the paper, let ∗ be an arbitrary but fixed formula such that ∗ ∉ OL. In other words, let OL′ be some arbitrary language properly extending OL and let ∗ be some arbitrary element in OL′ \ OL. It does not matter how ∗ is selected, as long as it is not a formula of the object language, but it is important that it is selected and fixed from now on. The generalised semantic structures are defined as follows.

Definition 7 (General Epistemic States and General Knowledge Set Structures). The set of general epistemic states is

S = ℘(OL) ∪ ℘fin(OL ∪ {∗})
A General Knowledge Set Structure (GKSS) for n agents, primitive propositions Θ and object language OL is an (n + 1)-tuple M = (s1, . . . , sn, π) where si ∈ S and π : Θ → {true, false} is a truth assignment. si is the general epistemic state of agent i. The set of all GKSSs is denoted M. □

In addition to the finite epistemic states Sf, general epistemic states include states s where:
1. s is an infinite subset of OL: the agent knows infinitely many formulae, or
2. s = s′ ∪ {∗}, where s′ ∈ ℘fin(OL): the agent knows finitely many formulae, but one of them is the special formula ∗.

Observe that Mfin ⊂ M. Interpretation of the language EL in the more general structures M is defined in exactly the same way as for the structures Mfin (Definition 4). To distinguish between the two logics we use the symbol |= to denote the satisfiability relation between GKSSs and EL formulae, and the corresponding validity and logical consequence relations, and use |=f for KSSs as before. The class of all general models (GKSSs) of Γ ⊆ EL is denoted mod(Γ). Note that since EL is defined over OL and ∗ ∉ OL, e.g. △i {∗} is not a well-formed formula. So, informally speaking, the generalised semantics allows an agent to know infinitely many formulae at the same time, or to know something (the special formula ∗) which we cannot reference directly in the meta language. It turns out that our logical system EC (Def. 5) is strongly complete with respect to this semantics. A variant of the following theorem was proved in [5] (for a slightly different logic; the proof is essentially the same).

Theorem 8 (Soundness and Completeness wrt. GKSSs [5]). For every Γ ⊆ EL, φ ∈ EL:

Γ |= φ ⇔ Γ ⊢ φ □
5 Finitary Theories and Completeness

Since EC is not strongly complete with respect to Mfin, it is of interest to characterize exactly the theories for which EC is complete, i.e., those Γ such that Γ |=f φ ⇒ Γ ⊢ φ for every φ ∈ EL. In this section we provide such a characterization. We define the concept of a finitary theory, and show that the set of finitary theories is exactly the set of theories for which EC is complete. The proof builds upon the completeness result for the more general logic described in the previous section.

Definition 9 (Finitary Theory). A theory Γ is finitary iff it is consistent and, for all φ,

Γ ⊢ (▽1 X1 ∧ · · · ∧ ▽n Xn) → φ for all sets X1, . . . , Xn ∈ ℘fin(OL)
⇓
Γ ⊢ φ

where n is the number of agents. □
Informally speaking, a theory is finitary if provability of a formula under arbitrary upper bounds on the epistemic states implies provability of the formula itself. We use the intermediate notion of a finitarily open theory, and its relation to that of a finitary theory, in order to prove completeness.

Definition 10 (Finitarily Open Theory). A theory Γ is finitarily open iff there exist terms T1, . . . , Tn such that

Γ ⊬ ¬(▽1 T1 ∧ · · · ∧ ▽n Tn) □

Informally speaking, a theory is finitarily open if it can be consistently extended with some upper bound on the epistemic state of each agent.

Lemma 11.
1. A finitary theory is finitarily open.
2. If Γ is a finitary theory and Γ ⊬ φ, then Γ ∪ {¬φ} is finitarily open. □
PROOF. 1. Let Γ be a finitary theory. If Γ is not finitarily open, then Γ ⊢ ¬(▽1 T1 ∧ · · · ∧ ▽n Tn) for all terms T1, . . . , Tn. Then, for an arbitrary φ, Γ ⊢ (▽1 T1 ∧ · · · ∧ ▽n Tn) → φ for all T1, . . . , Tn, and thus Γ ⊢ φ since Γ is finitary. By the same argument Γ ⊢ ¬φ, contradicting the fact that Γ is consistent.
2. Let Γ be a finitary theory, and let Γ ⊬ φ. Then there must exist terms T1φ, . . . , Tnφ such that Γ ⊬ (▽1 T1φ ∧ · · · ∧ ▽n Tnφ) → φ. By Prop we must have that Γ ⊬ ¬φ → ¬(▽1 T1φ ∧ · · · ∧ ▽n Tnφ), and thus that Γ ∪ {¬φ} ⊬ ¬(▽1 T1φ ∧ · · · ∧ ▽n Tnφ), which shows that Γ ∪ {¬φ} is finitarily open.

It is difficult in practice to show whether a given theory satisfies a proof-theoretic condition such as those for finitary or finitarily open theories, but we have a tool to convert the problem to a semantic one: the completeness result for GKSSs in the previous section (Theorem 8). For example, to show that Γ ⊬ φ, it suffices to show that Γ ⊭ φ (with respect to GKSSs). This result can be used to verify the claims of non-finitaryness in the following example.

Example 12. The following are examples of non-finitary theories (let n = 2, p ∈ Θ, p ∈ OL, and let OL be closed under the △i operators):
1. Γ1 = {△1 {p}, △1 {△1 {p}}, △1 {△1 {△1 {p}}}, . . .}. Γ1 is not finitarily open, and describes an agent with an infinite epistemic state.
2. Γ2 = {¬▽1 T : T ∈ TL}. Γ2 is not finitarily open, and describes an agent which cannot be at any finite point.
3. Γ3 = {▽1 T → ¬▽2 T′ : T, T′ ∈ TL}. Γ3 is not finitarily open, and describes a situation where agents 1 and 2 cannot simultaneously be at finite points.
4. Γ4 = {▽1 T → p : T ∈ TL}. Γ4 is finitarily open, but not finitary. To see the former, observe that if Γ4 ⊢ ¬(▽1 T1 ∧ ▽2 T2) for arbitrary T1, T2 then Γ4 |=f ¬(▽1 T1 ∧ ▽2 T2) by soundness (Theorem 6) – but it is easy to see that Γ4 has
models which are not models of ¬(▽1 T1 ∧ ▽2 T2) (take e.g. s1 = [T1], s2 = [T2] and π(p) = true). To see the latter, observe that Γ4 ⊬ p (if Γ4 ⊢ p, then Δ ⊢ p for some finite Δ ⊂ Γ4, which again contradicts soundness) but Γ4 ⊢ (▽1 T1 ∧ ▽2 T2) → p for all T1, T2. □

Theorem 13. A theory Γ is finitarily open if and only if it is satisfiable in Mfin. □
PROOF. Γ is finitarily open iff there exist Ti (1 ≤ i ≤ n) such that Γ ⊬ ¬(▽1 T1 ∧ · · · ∧ ▽n Tn); iff, by Theorem 8, there exist Ti such that Γ ⊭ ¬(▽1 T1 ∧ · · · ∧ ▽n Tn); iff there exist Ti and a GKSS M ∈ M such that M |= Γ and M |= ▽1 T1 ∧ · · · ∧ ▽n Tn; iff there exist Ti and M = (s1, . . . , sn, π) ∈ M such that si ⊆ [Ti] (1 ≤ i ≤ n) and M |= Γ; iff there exist si ∈ ℘fin(OL) (1 ≤ i ≤ n) such that (s1, . . . , sn, π) |= Γ; iff Γ is satisfiable in Mfin.

Theorem 14. Let Γ ⊆ EL. Γ |=f φ ⇒ Γ ⊢ φ for all φ iff Γ is finitary. □
PROOF. Let Γ be a finitary theory and let Γ |=f φ. By Lemma 11.1, Γ is finitarily open and thus satisfiable by Theorem 13. Γ ∪ {¬φ} is unsatisfiable in Mfin, and thus not finitarily open, and it follows from Lemma 11.2 that Γ ⊢ φ. For the other direction, let Γ |=f φ ⇒ Γ ⊢ φ for all φ, and assume that Γ ⊬ φ. Then Γ ⊭f φ, that is, there is an M = (s1, . . . , sn, π) ∈ mod f(Γ) such that M ⊭f φ. Let Ti (1 ≤ i ≤ n) be terms such that [Ti] = si. Then M |=f ▽1 T1 ∧ · · · ∧ ▽n Tn, and thus M ⊭f (▽1 T1 ∧ · · · ∧ ▽n Tn) → φ. By soundness (Theorem 6), Γ ⊬ (▽1 T1 ∧ · · · ∧ ▽n Tn) → φ, showing that Γ is finitary.

Lemma 15. Let Γ ⊆ EL. The following statements are equivalent:
1. Γ is finitary.
2. Γ |=f φ ⇒ Γ ⊢ φ, for any φ.
3. Γ |=f φ ⇒ Γ |= φ, for any φ.
4. (∃M ∈ mod(Γ) : M |= φ) ⇒ (∃M f ∈ mod f(Γ) : M f |=f φ), for any φ.
5. Γ ⊬ φ ⇒ Γ ∪ {¬φ} is finitarily open, for any φ. □
Lemma 15.4 is a finite model property, with respect to the models of Γ . We have now given a proof-theoretic definition of all theories for which EC is complete: the finitary theories. We have also shown some examples of non-finitary theories. We have not, however, given any examples of finitary theories. Although the problem of proving that EC is complete for a theory Γ has been reduced to proving that the theory is finitary according to Definition 9, the next problem is how to show that a given theory in fact is finitary. For example, is the empty theory finitary? If it is, then EC is weakly complete. We have not been able to find a trivial or easy way to prove finitaryness in general. In the next section, we present results which can be used to prove finitaryness. The results are semantic conditions for finitaryness, but can only be used for theories of a certain class and we are only able to show that they are sufficient and not that they also are necessary.
6 Semantic Finitaryness Conditions

Epistemic axioms are axioms which describe legal epistemic states, like "an agent cannot know both a formula and its negation". In Section 4 we presented the notion of a general epistemic state, and epistemic axioms can be seen as describing sets of legal general epistemic states as well as sets of legal finite epistemic states. Although we are ultimately interested in the latter, in this section we will be mainly interested in the former – we will present conditions on the algebraic structure of sets of general epistemic states in mod(Φ) which are sufficient for the axioms Φ to be finitary. First, epistemic axioms and their correspondence with sets of legal general epistemic states are defined. Then, conditions on these sets are defined, and it is shown that the GKSSs of a given set of epistemic axioms – being (essentially) the Cartesian product of the corresponding sets of legal general states – exhibit the finite model property if the sets of legal general states fulfil the conditions. The set of axioms is then finitary by Lemma 15.4.

6.1 Epistemic Axioms

Not all formulae in EL should be considered as candidates for describing epistemic properties. One example is p → △i {p}. This formula does not solely describe the agent – it describes a relationship between the agent and the world. Another example is ♦i {p} → ♦j {q}, which describes a constraint on one agent's belief set contingent on another agent's belief set. Neither of these two formulae describes purely epistemic properties of an agent. In the following definition, EF is the set of epistemic formulae and Ax is the set of candidate epistemic axioms.

Definition 16 (EF, EFi, Ax).
– EF ⊆ EL is the least set such that, for 1 ≤ i ≤ n:
  T ∈ TL ⇒ △i T, ▽i T ∈ EF
  φ, ψ ∈ EF ⇒ ¬φ, (φ ∧ ψ) ∈ EF
– EFi = {φ ∈ EF : every epistemic operator in φ is △i or ▽i} (1 ≤ i ≤ n)
– Ax = ∪1≤i≤n EFi □
An example of an epistemic axiom schema is, if we assume that OL has conjunction,

△i {α ∧ β} → △i {α} ∧ △i {β}    (1)
Recall the set S of all general epistemic states, defined in Section 4.

Definition 17 (Mφ, Siφ, MΦ, SiΦ). For each epistemic formula φ ∈ EFi, Mφ = S1φ × · · · × Snφ × Π, where Sjφ = S for j ≠ i and Siφ is constructed by structural induction over φ as follows:

Si^{△i T} = {X ∈ S : [T] ⊆ X}
Si^{▽i T} = {X ∈ S : X ⊆ [T]}
Si^{¬ψ} = S \ Si^ψ
Si^{ψ1 ∧ ψ2} = Si^{ψ1} ∩ Si^{ψ2}

When Φ ⊆ Ax then SiΦ = (∩φ∈Φ∩EFi Siφ) ∩ S and MΦ = S1Φ × · · · × SnΦ × Π. □
In the construction of Mφ we remove the impossible (general) epistemic states by restricting the set of epistemic states to Siφ. The epistemic states which are not removed are the possible states – an agent can be placed in any of these states and will satisfy the epistemic axiom φ. Given Φ ⊆ Ax, the corresponding KSS models are:

MΦfin = MΦ ∩ Mfin = (S1Φ ∩ Sf) × · · · × (SnΦ ∩ Sf) × Π

That MΦ and MΦfin are indeed the class of GKSS models and the class of KSS models of Φ, respectively, can easily be shown:

Lemma 18. If Φ ⊆ Ax, then MΦ = mod(Φ) and MΦfin = mod f(Φ). □
Thus, the model class for epistemic axioms is constructed by removing certain states from the set of legal epistemic states. For example, (1) corresponds to removing epistemic states where the agent knows a conjunction without knowing the conjuncts. Note that ∅ is trivially a set of epistemic axioms, and that Si∅ = S and M∅ = M.

6.2 Finitaryness of Epistemic Axioms

Lemmas 15.1 and 15.4 say that Γ is finitary iff mod(Γ) has the finite model property. We make the following intermediate definition, and the following lemma is an immediate consequence.

Definition 19 (Finitary Set of GKSSs). A class of GKSSs M ⊆ M is finitary iff, for all φ:

(∃M ∈ M : M |= φ) ⇒ (∃M f ∈ M f : M f |=f φ)

where M f = M ∩ Mfin. □
Lemma 20. Let Γ ⊆ EL. Γ is finitary iff mod(Γ) is finitary. □
In the definition of the conditions on sets of general epistemic states, the following two general algebraic conditions will be used.

Directed Set. A set A with a reflexive and transitive relation ≤ is directed iff for every finite subset B of A, there is an element a ∈ A such that b ≤ a for every b ∈ B. In the following, directedness of a set of sets is implicitly taken to be with respect to subset inclusion.

Cover. A family of subsets of a set A whose union includes A is a cover of A.

The main result is that the following conditions on sets of general epistemic states are sufficient for the corresponding GKSSs to be finitary (Def. 19) and, furthermore, if the sets are induced by epistemic axioms, that the axioms are finitary. The conditions are quite complicated, but simpler ones are given below.

Definition 21 (Finitary Set of Epistemic States). If S ⊆ S is a set of general epistemic states and s ∈ ℘(OL), then the set of finite subsets of s included in S is denoted S|fs = S ∩ ℘fin(s). S is finitary iff both:
1. For every infinite s ∈ S:
   (a) S|fs is directed
   (b) S|fs is a cover of s
2. For every s ∪ {∗} ∈ S and every s′ ∈ ℘fin(OL), there exists α ∉ s′ such that:
   (a) there is an sf ∈ S ∩ ℘(s ∪ {α}) with s ∩ s′ ⊆ sf
   (b) there is an sf ∈ S ∩ ℘(s ∪ {α}) with sf ⊈ s
   (c) S ∩ ℘(s ∪ {α}) is directed □
The definition specifies conditions for each infinite set in S (condition 1) and each finite set in S containing ∗ (condition 2). Condition 2 is similar to condition 1, but is complicated by the fact that, informally speaking, the existence of a proper formula α to "replace" ∗ is needed. In practice, the simplified (and stronger) conditions presented in Corollary 24 below can often be used. The following lemma is the main technical result in this section. The proof is somewhat involved and must be left out due to space restrictions; it can be found in [1].

Lemma 22. If S1, . . . , Sn are finitary sets of epistemic states (Def. 21), then S1 × · · · × Sn × Π is a finitary set of GKSSs (Def. 19). □
Recall that a set Φ of epistemic axioms induces sets of legal epistemic states SiΦ (Def. 17).

Theorem 23. If Φ is a set of epistemic axioms such that S1Φ, . . . , SnΦ are finitary sets of epistemic states, then Φ is finitary. □

PROOF. Since Φ is a set of epistemic axioms, MΦ = S1Φ × · · · × SnΦ × Π. Since all the SiΦ are finitary, by Lemma 22 MΦ is a finitary set of GKSSs. Since MΦ = mod(Φ) (Lemma 18), Φ is finitary by Lemma 20.

Theorem 23 shows that the conditions of Def. 21 on the sets of legal epistemic states induced by epistemic axioms are sufficient to conclude that the axioms are finitary. In the following corollary, we present several alternative sufficient conditions which are stronger; it can easily be shown that each of them implies Def. 21.

Corollary 24. A set of epistemic states S ⊆ S is finitary if any one of the following three conditions holds:
1. For every s ⊆ OL:
   (a) S|fs is directed
   (b) S|fs is a cover of s
2. (a) S|fs is directed for every s ⊆ OL
   (b) {α} ∈ S for every α ∈ OL
3. (a) S|fs is directed for every infinite s ∈ S
   (b) {α} ∈ S for every α ∈ OL
   (c) for every s ∪ {∗} ∈ S and every s′ ∈ ℘fin(OL), there exists α ∉ s′ such that s ∪ {α} ∈ S □
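Conditions like directedness and covering can at least be sanity-checked mechanically on finite fragments. The following toy sketch is ours, not the paper's; since the real conditions quantify over infinite sets of states, it is only a finite approximation:

    # A toy check (ours) of directedness and the cover condition for a
    # *finite* family of epistemic states over a finite universe.

    from itertools import chain

    def directed(family):
        """Directed wrt inclusion: every pair has an upper bound in the family
        (for a finite, nonempty family, pairwise upper bounds suffice)."""
        return all(any(a | b <= c for c in family)
                   for a in family for b in family)

    def covers(family, s):
        """The family is a cover of s: its union includes s."""
        return set(s) <= set(chain.from_iterable(family))

    # E.g. the finite subsets of s = {p, q} form a directed cover of s.
    fam = [frozenset(), frozenset({"p"}), frozenset({"q"}), frozenset({"p", "q"})]
    assert directed(fam) and covers(fam, {"p", "q"})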
7 Some Completeness Results

For a given axiom schema Φ, the results from Sections 4, 5 and 6 can be used to test whether the system EC Φ is weakly complete (henceforth in this section called simply "complete") with respect to mod f(Φ) ⊆ Mfin. First, check that Φ is an epistemic axiom schema (Def. 16). Second, construct the GKSS (see Sec. 4) models of Φ, MΦ = S1Φ × · · · × SnΦ × Π = mod(Φ) (Def. 17, Lemma 18). Third, check that each SiΦ is finitary (Def. 21); it suffices that each of them satisfies one of the simpler conditions in Corollary 24. If these tests are positive, EC Φ is complete with respect to MΦfin = mod f(Φ), the KSSs included in MΦ, by Theorems 23 and 14. The converse does not hold: EC Φ is not necessarily incomplete with respect to the corresponding models if the tests are negative. Many of the properties discussed in Section 5 can, however, be used to show incompleteness. These techniques are used in Theorem 25 below to prove the assertion from Section 3 about weak completeness of EC, in addition to results about completeness of the systems EC K, EC D, EC 4 and EC 5 from Section 3.2. For the latter results it is assumed that OL is closed under the usual propositional connectives and the △i operators.

Theorem 25 (Completeness Results).
1. EC is sound and complete with respect to Mfin.
2. EC K is sound and complete with respect to MKfin.
3. EC D is sound and complete with respect to MDfin.
4. EC 4 is not complete with respect to M4fin.
5. EC 5 is not complete with respect to M5fin. □
PROOF. Soundness, in the first three parts of the theorem, follows immediately from Theorem 6 and the fact that K and D are valid in MKfin and MDfin, respectively. The strategy for the completeness proofs, for the first three parts of the theorem, is as outlined above. (Weak) completeness of EC can be treated by "extending" EC by the empty set and attempting to show that the empty set is a finitary theory. The empty set is trivially a set of epistemic axioms, and the axiom schemas K and D also both represent sets of epistemic axioms, with GKSS models constructed from the following sets of general epistemic states, respectively:

Si∅ = S
SiK = S \ {X ∈ S : ∃α,β∈OL (α → β ∈ X, α ∈ X and β ∉ X)}
SiD = S \ {X ∈ S : ∃α∈OL (α ∈ X and ¬α ∈ X)}

We show that these sets are all finitary sets of epistemic states by using Corollary 24. It follows by Theorem 23 that the theories ∅, K and D are finitary, and thus that EC, EC K and EC D are (weakly) complete by Theorem 14. For the last two parts of the theorem, we show that 4 and 5 are not finitary theories; it follows by Theorem 14 that EC 4 and EC 5 are incomplete.

1. Corollary 24.1 holds for Si∅ = S: Let s ⊆ OL. S|fs = S ∩ ℘fin(s) = ℘fin(s). ℘fin(s) is directed, because for every finite subset B ⊂ ℘fin(s), ∪s′∈B s′ ∈ ℘fin(s). ℘fin(s) is a cover of s, because s ⊆ ∪℘fin(s).
2. Corollary 24.3 holds for SiK:
Corollary 24.3(a): It must be shown that SiK|fs is directed for infinite s ∈ SiK. Let s′, s′′ ∈ SiK ∩ ℘fin(s), and let, for 0 < j:

s0 = s′ ∪ s′′
sj = sj−1 ∪ {β : α → β ∈ sj−1 and α ∈ sj−1}
sf = ∪k sk

It is easy to show that sf ∈ SiK, that each sj is a finite subset of s, and that sf is finite.
Corollary 24.3(b): Clearly, {α} ∈ SiK for every α ∈ OL.
Corollary 24.3(c): Let s ∪ {∗} ∈ SiK and s′ ∈ ℘fin(OL). Let α ∈ OL be such that:
– α → β ∉ s for any β ∈ OL
– α ∉ s′
– the main connective in α is not implication
It is easy to see that there exist infinitely many α satisfying these three conditions: there are infinitely many α ∈ OL without implication as main connective, and both s and s′ are finite. It can easily be shown that s ∪ {α} ∈ SiK.

3. Corollary 24.3 holds for SiD:
Corollary 24.3(a): It must be shown that SiD|fs is directed for infinite s ∈ SiD. Let s′, s′′ ∈ SiD ∩ ℘fin(s), and let sf = s′ ∪ s′′. It can easily be shown that sf ∈ SiD, and sf ∈ ℘fin(s) trivially.
Corollary 24.3(b): Clearly, {α} ∈ SiD for every α ∈ OL.
Corollary 24.3(c): Let s ∪ {∗} ∈ SiD and s′ ∈ ℘fin(OL). Let α ∈ OL be such that:
– ¬α ∉ s
– α ∉ s′
– α does not start with negation
It is easy to see that there exist infinitely many α satisfying these three conditions: there are infinitely many α ∈ OL without negation as main connective, and both s and s′ are finite. It can easily be shown that s ∪ {α} ∈ SiD.

4. Let 1 ≤ i ≤ n, and let M = (s1, . . . , sn, π) ∈ Mfin be such that M |=f 4. Then si must be the empty set – otherwise it would not be finite. Thus, 4 |=f ▽i ∅. 4 does, however, have infinite models, so 4 ⊭ ▽i ∅. Lemma 15 gives that 4 is not finitary.

5. It is easy to see that 5 is not satisfiable in Mfin (i.e., a model of 5 must be infinite). By Theorem 13 and Lemma 11, 5 is not finitary.

Although the results in Theorem 25 are hardly surprising, they seem surprisingly hard to prove.
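The chain s0, s1, . . . used in part 2 of the proof is easy to animate. The sketch below is ours (the encoding of implications as tuples is an assumption); it computes the finite upper bound sf of two modus-ponens-closed finite sets:

    # A sketch (ours) of the closure construction in the proof of Theorem 25.2.
    # Formulas are atoms (strings) or implications ("->", antecedent, consequent).

    def mp_closure(s1, s2):
        """Least superset of s1 | s2 closed under modus ponens: whenever
        (a -> b) and a are both in the set, b is added. This is the chain
        s0 = s' U s'', s_j = s_{j-1} U {b : a -> b, a in s_{j-1}}, sf = U_k s_k."""
        s = set(s1) | set(s2)
        changed = True
        while changed:
            changed = False
            new = {f[2] for f in s
                   if isinstance(f, tuple) and f[0] == "->" and f[1] in s}
            if not new <= s:
                s |= new
                changed = True
        return s

    # Two finite subsets of a K-closed set and their finite upper bound:
    print(mp_closure({"p", ("->", "p", "q")}, {("->", "q", "r")}))
    # -> {'p', 'q', 'r', ('->', 'p', 'q'), ('->', 'q', 'r')}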
8 Discussion and Conclusions

This paper presents a general and very abstract theory of resource-bounded agents. We assumed that agents' epistemic states are arbitrary finite sets of formulae. The addition of the "knowing at most" operator ▽i gives a more expressive language for a theory of knowledge without any unrealistic assumptions about the reasoning abilities of the agents. Properties of reasoning can be modelled in an abstract way by considering only the set of epistemic states which a reasoning mechanism could actually produce. If
a more detailed model of reasoning is needed, the framework can be extended with a model describing transitions between finite epistemic states. This is exactly what is done in [4]. The key property of the models considered in this paper is the assumption of finite epistemic states; the results build on previous results for a similar logic without this assumption. The main results are an axiomatization of the logic, and two characterizations of the theories for which the logic is complete. The first, the notion of finitary theories, is a proof-theoretic account of all such theories. The second, algebraic conditions on certain sets of epistemic states, is a semantic one, but is only a sufficient condition for finitaryness. The latter was used to show finitaryness of the empty theory and thus weak completeness of the system. It follows from these results that the logic EC is decidable. The characterizations were also used to show (in)completeness of several extensions of EC. The results give a general completeness proof, of which weak completeness is a special case, and the complexity of the proof is due to this generality. Interesting results have been obtained from very weak assumptions: finite memory and a "knowing at most" operator in the meta language give complex algebraic conditions for axiomatizability. Related works include the many approaches to the logical omniscience problem (LOP) [13]; see e.g. [11, 18, 19] for surveys. In particular, the work in this paper is a development of the syntactic treatment of knowledge as mentioned in Section 1. [11] presents this approach in the form of standard syntactic assignments. It is easy to see that KSSs are equivalent to standard syntactic assignments restricted to assigning finite knowledge to each agent. The ▽i operator, and the derived ♦i operator, are new in the context of syntactic models. ♦i is, however, similar to Levesque's only knowing operator O [15]. Oα means that the agent does not know more than α, but knowledge in this context means knowledge closed under logical consequence, and "only knowing α" is thus quite different from "knowing exactly" a finite set of formulae syntactically. Another well-known approach to the LOP is the logic of general awareness [10], combining a syntactic and a semantic model of knowledge. This logic can be seen as syntactic assignments restricted to assigning truth only to formulae which actually follow, a special case of standard syntactic assignments. In this view, the logic we have discussed in this paper is not a "competing" framework to be compared to the logic of general awareness. Rather, it is an abstraction of the syntactic fragment of the latter logic, and gives a theory of two new concepts orthogonal to those modeled by the awareness logic: finite epistemic states and the "knowing at most" operator, respectively (that the ▽i operator is indeed not definable by the usual syntactic operator Ki, as mentioned in the introduction, is shown formally in [2]). Adding the finiteness assumption, i.e. restricting the set of formulae an agent can be aware of (and thus his explicit knowledge), and/or the "knowing at most" operator to the awareness logic should be straightforward, but explicit definitions must be left out here due to lack of space. The application of the results in this paper to the logic of general awareness is nevertheless interesting for future work.
Models of reasoning as transition between syntactic states, as mentioned above, include Konolige’s deduction model [14], active logics [8] and timed reasoning logics (TRL) [6]. Possibilities for future work include further development of the identification of finitary theories. For the case of epistemic axioms, the presented algebraic conditions
are sufficient but not necessary, and tighter conditions would be interesting. Deciding finitaryness of general, not necessarily epistemic, axioms should also be investigated.¹

¹ Natasha Alechina (private communication) has pointed out that the knowledge axiom (mentioned in Section 2) is also finitary.

Acknowledgements. The work in this paper has been partly supported by grants 166525/V30 and 146967/431 from the Norwegian Research Council.
References
1. Thomas Ågotnes. A Logic of Finite Syntactic Epistemic States. PhD thesis, Department of Informatics, University of Bergen, 2004.
2. Thomas Ågotnes and Natasha Alechina. The dynamics of syntactic knowledge. Technical Report 304, Dept. of Informatics, Univ. of Bergen, Norway, 2005.
3. Thomas Ågotnes and Michal Walicki. A logic for reasoning about agents with finite explicit knowledge. In Bjørnar Tessem, Pekka Ala-Siuru, Patrick Doherty, and Brian Mayoh, editors, Proc. of the 8th Scandinavian Conference on Artificial Intelligence, Frontiers in Artificial Intelligence and Applications, pages 163–174. IOS Press, Nov 2003.
4. Thomas Ågotnes and Michal Walicki. Syntactic knowledge: A logic of reasoning, communication and cooperation. In Chiara Ghidini, Paolo Giorgini, and Wiebe van der Hoek, editors, Proceedings of the Second European Workshop on Multi-Agent Systems (EUMAS), Barcelona, Spain, December 2004.
5. Thomas Ågotnes and Michal Walicki. Strongly complete axiomatizations of "knowing at most" in standard syntactic assignments. In Francesca Toni and Paolo Torroni, editors, Preproceedings of the 6th International Workshop on Computational Logic in Multi-agent Systems (CLIMA VI), London, UK, June 2005.
6. Natasha Alechina, Brian Logan, and Mark Whitsey. A complete and decidable logic for resource-bounded agents. In Proc. of the Third Intern. Joint Conf. on Autonomous Agents and Multi-Agent Syst. (AAMAS 2004), pages 606–613. ACM Press, Jul 2004.
7. R. A. Eberle. A logic of believing, knowing and inferring. Synthese, 26:356–382, 1974.
8. J. Elgot-Drapkin, S. Kraus, M. Miller, M. Nirkhe, and D. Perlis. Active logics: A unified formal approach to episodic reasoning. Techn. Rep. CS-TR-4072, 1999.
9. Ronald Fagin and Joseph Y. Halpern. Belief, awareness and limited reasoning. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence, pages 491–501, Los Angeles, CA, 1985.
10. Ronald Fagin and Joseph Y. Halpern. Belief, awareness and limited reasoning. Artificial Intelligence, 34:39–76, 1988. A preliminary version appeared in [9].
11. Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning About Knowledge. The MIT Press, Cambridge, Massachusetts, 1995.
12. Joseph Y. Halpern and Yoram Moses. Knowledge and common knowledge in a distributed environment. Journal of the ACM, 37(3):549–587, July 1990.
13. J. Hintikka. Impossible possible worlds vindicated. Journal of Philosophical Logic, 4:475–484, 1975.
14. Kurt Konolige. A Deduction Model of Belief and its Logics. PhD thesis, Stanford University, 1984.
15. H. J. Levesque. All I know: a study in autoepistemic logic. Artificial Intelligence, 42:263–309, 1990.
16. J.-J. Ch. Meyer and W. van der Hoek. Epistemic Logic for AI and Computer Science. Cambridge University Press, Cambridge, England, 1995.
17. R. C. Moore and G. Hendrix. Computational models of beliefs and the semantics of belief sentences. Technical Note 187, SRI International, Menlo Park, CA, 1979.
18. Antonio Moreno. Avoiding logical omniscience and perfect reasoning: a survey. AI Communications, 11:101–122, 1998. 19. Kwang Mong Sim. Epistemic logic and logical omniscience: A survey. International Journal of Intelligent Systems, 12:57–81, 1997.
A Term Calculus

The following axioms give a sound and complete calculus of term formulae (see Definition 5).

T1   T ≐ T                                          equivalence (reflexivity)
T2   T ≐ U → U ≐ T                                  equivalence (symmetry)
T3   T ≐ U ∧ U ≐ V → T ≐ V                          equivalence (transitivity)
T4   T ≐ U ∧ S ≐ V → S ⊔ T ≐ V ⊔ U                  join-congruence
T5   T ≐ U ∧ S ≐ V → S ⊓ T ≐ V ⊓ U                  meet-congruence
T6   T ⊔ U ≐ U ⊔ T                                  join-commutativity
T7   T ⊓ U ≐ U ⊓ T                                  meet-commutativity
T8   (T ⊔ U) ⊔ V ≐ T ⊔ (U ⊔ V)                      join-associativity
T9   (T ⊓ U) ⊓ V ≐ T ⊓ (U ⊓ V)                      meet-associativity
T10  T ⊓ (T ⊔ U) ≐ T                                meet-absorption
T11  T ⊔ (T ⊓ U) ≐ T                                join-absorption
T12  T ⊓ (U ⊔ V) ≐ (T ⊓ U) ⊔ (T ⊓ V)                distributivity
T13  {α1, . . . , αn} ≐ {α1} ⊔ · · · ⊔ {αn}          atomicity
T14  {α} ≐ {β} → {α} ⊓ {β} ≐ {α}
T15  ¬({α} ≐ {β}) → {α} ⊓ {β} ≐ ∅
T16  ¬(X ≐ Y)   where X, Y ∈ ℘fin(OL), X ≠ Y
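The intended semantics of the calculus, where terms denote finite sets, ⊔ is union and ⊓ is intersection, can be illustrated with a small interpreter. This sketch is ours, not part of the paper:

    # An interpreter (ours) for the term language: a term denotes a finite
    # set of object formulas; join is union and meet is intersection.

    def interp(term):
        """[T]: evaluate a term to the finite set it denotes. Terms are
        ("set", iterable), ("join", t, u) or ("meet", t, u)."""
        tag = term[0]
        if tag == "set":
            return frozenset(term[1])
        if tag == "join":
            return interp(term[1]) | interp(term[2])
        if tag == "meet":
            return interp(term[1]) & interp(term[2])
        raise ValueError(tag)

    # Axiom T12 (distributivity) holds under this interpretation:
    T, U, V = ("set", {"a"}), ("set", {"b"}), ("set", {"a", "c"})
    lhs = interp(("meet", T, ("join", U, V)))
    rhs = interp(("join", ("meet", T, U), ("meet", T, V)))
    assert lhs == rhs == frozenset({"a"})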
An Architecture for Rational Agents J.W. Lloyd and T.D. Sears Computer Sciences Laboratory, Research School of Information Sciences and Engineering, The Australian National University {jwl, timsears}@csl.anu.edu.au
Abstract. This paper is concerned with designing architectures for rational agents. In the proposed architecture, agents have belief bases that are theories in a multi-modal, higher-order logic. Belief bases can be modified by a belief acquisition algorithm that includes both symbolic, on-line learning and conventional knowledge base update as special cases. A method of partitioning the state space of the agent in two different ways leads to a Bayesian network and associated influence diagram for selecting actions. The resulting agent architecture exhibits a tight integration between logic, probability, and learning. Two illustrations of the agent architecture are provided, including a user agent that is able to personalise its behaviour according to the user’s interests and preferences.
1 Introduction
Our starting point is the conventional view that an agent is a system that takes percept sequences as input and outputs actions [9, p.33]. Naturally enough, on its own, this barely constrains the agent architectures that would be feasible. Thus we add a further constraint that agents should act rationally, where a rational agent is defined as follows [9, p.36]: "For each possible percept sequence, a rational agent should select an action that is expected to maximise its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has". Various specific rationality principles could be used; we adopt the well-known principle of maximum expected utility [9, p.585] (namely, a rational agent should choose an action that maximises the agent's expected utility). This implies that, for the intended applications, it is possible for the agent designer to specify utilities for all states (or, at least, the agent has some means of acquiring these utilities). The proposed architecture includes a decision-theoretic component involving utilities and Bayesian networks that implements the principle of maximum expected utility. Another major component of the architecture is a model of the environment (or, at least, a model of enough of the environment in order to be able to effectively select actions). This model has two parts: state and beliefs. The state
records some information that helps the agent mitigate the well-known problems that arise from partial observability. The beliefs are random variables defined on the state; evidence variables assist in the selection of actions and result variables assist in the evaluation of the utility of states. Beliefs are expressed in a multi-modal, higher-order logic. Implicit in the concept of rationality is the notion that agents should make every effort to acquire whatever information is needed for action selection. This implies that agents should have a learning component that allows them to improve their performance (which includes adapting to changing environments). The proposed architecture incorporates a learning component for exactly this purpose. The resulting agent architecture exhibits a tight integration between logic, probability, and learning. In the terminology of [9, p.51], agents employing this architecture are model-based, utility-based, learning agents. An outline of this paper is as follows. The next section provides a brief introduction to beliefs and their acquisition. Section 3 describes the approach to agent architecture. Section 4 gives two illustrations of the agent architecture. The paper concludes with some remarks about the wider context of this research.
2 Beliefs
This section contains a brief introduction to belief bases and the logic in which these are expressed. We employ the logic from [5] which is a multi-modal version of the higher-order logic in [4] (but leaving out polymorphism); much more detail about the logic is contained in these two works. Definition 1. An alphabet consists of three sets: 1. A set T of type constructors. 2. A set C of constants. 3. A set V of variables. Each type constructor in T has an arity. The set T always includes the type constructors 1 and Ω both of arity 0. 1 is the type of some distinguished singleton set and Ω is the type of the booleans. Each constant in C has a signature. The set V is denumerable. Variables are typically denoted by x, y, z, . . . . For any particular application, the alphabet is assumed fixed and all definitions are relative to the alphabet. Types are built up from the set of type constructors, using the symbols → and ×. Definition 2. A type is defined inductively as follows. 1. If T is a type constructor of arity k and α1 , . . . , αk are types, then T α1 . . . αk is a type. (For k = 0, this reduces to a type constructor of arity 0 being a type.) 2. If α and β are types, then α → β is a type.
3. If α1, . . . , αn are types, then α1 × · · · × αn is a type. (For n = 0, this reduces to 1 being a type.)

The set C always includes the following constants.
1. (), having signature 1.
2. =α, having signature α → α → Ω, for each type α.
3. ⊤ and ⊥, having signature Ω.
4. ¬, having signature Ω → Ω.
5. ∧, ∨, −→, ←−, and ←→, having signature Ω → Ω → Ω.
6. Σα and Πα, having signature (α → Ω) → Ω, for each type α.

The intended meaning of =α is identity (that is, =α x y is ⊤ iff x and y are identical), the intended meaning of ⊤ is true, the intended meaning of ⊥ is false, and the intended meanings of the connectives ¬, ∧, ∨, −→, ←−, and ←→ are as usual. The intended meanings of Σα and Πα are that Σα maps a predicate to ⊤ iff the predicate maps at least one element to ⊤, and Πα maps a predicate to ⊤ iff the predicate maps all elements to ⊤.

Definition 3. A term, together with its type, is defined inductively as follows.
1. A variable in V of type α is a term of type α.
2. A constant in C having signature α is a term of type α.
3. If t is a term of type β and x a variable of type α, then λx.t is a term of type α → β.
4. If s is a term of type α → β and t a term of type α, then (s t) is a term of type β.
5. If t1, . . . , tn are terms of type α1, . . . , αn, respectively, then (t1, . . . , tn) is a term of type α1 × · · · × αn (for n ≥ 0).
6. If t is a term of type Ω and i ∈ {1, . . . , m}, then □i t is a term of type Ω.

Terms of the form (Σα λx.t) are written as ∃α x.t and terms of the form (Πα λx.t) are written as ∀α x.t (in accord with the intended meaning of Σα and Πα). A formula of the form □i ϕ is interpreted as 'agent i believes ϕ'.

An important feature of higher-order logic is that it admits functions that can take other functions as arguments and thus has greater expressive power for knowledge representation than first-order logic. This fact is exploited throughout the architecture, in the use of predicates to represent sets and in the predicate rewrite systems used for learning, for example.

In applications there are typically many individuals that need to be represented. For example, with agent applications, the state is one such individual. With logic as the representation language, (closed) terms represent individuals. In [4], a class of terms, called basic terms, is identified for this purpose. The inductive definition of basic terms comes in three parts. The first part uses data constructors to represent numbers, lists, and so on. The second part uses abstractions to represent sets, multisets, and so on. The third part uses tuples to represent vectors. The class of basic terms provides a rich class of terms for
representing a variety of structured individuals. We use basic terms to represent individuals in the following.

As shown in [4], and also [5], a functional logic programming computational model can be used to evaluate functions that have equational definitions in the logic. Furthermore, [5] provides a tableau theorem-proving system for the multi-modal, higher-order logic.

A method of belief acquisition for belief bases that are theories in the logic is under development. The belief acquisition algorithm includes both symbolic, on-line learning and conventional knowledge base update as special cases. The key idea of the algorithm is to introduce two languages, the training language and the hypothesis language. By carefully controlling these languages it is possible to have belief acquisition that at one extreme directly incorporates new beliefs into the belief base (as for a conventional knowledge base update algorithm) and at the other extreme is a conventional learning algorithm that learns definitions of functions that generalise to new individuals. The algorithm itself is a (somewhat generalised) decision-list learning algorithm, as introduced in [8]. Thus definitions of functions in belief bases are (logical forms of) decision lists.

The basic idea for the use of the logic is that each agent in a multi-agent system has its own belief base that consists of formulas of the form □i ϕ (for agent i). Also, agents can have access to the belief bases of other agents by means of interaction axioms, which are schemas of the form □i ϕ −→ □j ϕ, whose intuitive meaning is that 'if agent i believes ϕ, then agent j believes ϕ'. In addition, while not illustrated in this paper, standard modal axioms such as T, D, B, 4, and 5 can be used for any modality □i, depending on the precise nature of the beliefs in the belief base of agent i. The tableau theorem-proving system in [5] can prove theorems that involve these standard modal axioms and interaction axioms.

Finally, note that while one could use the logic for specification of agents (as in [10], for example), this is not what is proposed here. Instead, we use the logic for expressing actual belief bases of agents, and theorem proving and computation involving these belief bases is an essential component of the implementation of agents.
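As a concrete, and entirely our own, illustration of Definition 3, the term language can be represented along the following lines; the class names are assumptions, and type checking and evaluation are omitted:

    # A minimal sketch (ours, not the paper's implementation) of the term
    # language of Definition 3, as Python dataclasses. Only term formation
    # is modelled; signatures and evaluation are omitted.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Var:          # a variable x of some type
        name: str

    @dataclass(frozen=True)
    class Const:        # a constant from C, e.g. a connective or data constructor
        name: str

    @dataclass(frozen=True)
    class Abs:          # lambda x. t
        var: Var
        body: "Term"

    @dataclass(frozen=True)
    class App:          # (s t)
        fun: "Term"
        arg: "Term"

    @dataclass(frozen=True)
    class Box:          # the modal term: "agent i believes t"
        agent: int
        body: "Term"

    Term = Var | Const | Abs | App | Box

    # E.g. the belief "agent 1 believes (p AND q)":
    p, q = Const("p"), Const("q")
    belief = Box(1, App(App(Const("AND"), p), q))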
3 Agent Architecture
In this section, we describe the agent architecture. We consider an agent situated in some environment that can receive percepts from the environment and can apply actions to the environment. Thus the agent can be considered to be a function from percepts to actions; an agent architecture provides the definition of this function. Let S denote the set of states of the agent. Included in a state may be information about the environment or something that is internal to the agent.
It is also likely to include the agent's current intention or goal, that is, what it is currently trying to achieve. The state may be updated as a result of receiving a percept. For example, a user agent may change its intention due to a request from the user. As well as some state, the agent's model includes its belief base, which can also be updated. In the proposed architecture, belief bases have a particular form, introduced below, motivated by the desire to make action selection as effective as possible. Let A denote the set of actions of the agent. Each action changes the current state to a new state (possibly non-deterministically). The agent selects an action that maximises the expected utility. Executing the action involves applying the action to the environment and/or moving to a new state. Summarising the description so far, here is a high-level view of the main loop of the agent algorithm.

    loop forever
        get percept
        update model
        select action
        put action

We now concentrate on the action selection part of this loop. In practical applications, the set of states may be very large; in this case, it may be impractical to try to select the action by explicitly dealing with all these states. An obvious idea is to partition the state space so that the behaviour of an action is somehow consistent over all the states in an equivalence class [1]. We exploit this idea in what follows. For that we will need some random variables, and it will be helpful to be quite precise about their definition. Random variables are functions; this fact, and the probability space that they are defined on, will be important in the description of the agent architecture. For simplicity, and because it is the case of most interest in applications, we confine the discussion to random variables that are discrete. (Note that (f ◦ g)(x) means g(f(x)).)

Definition 4. A random variable is a measurable function X : Ω → Ξ, where (Ω, A, p) is a probability space and (Ξ, B) is a measurable space. The probability measure X⁻¹ ◦ p on B is called the distribution (or law) of X.

The probability space of interest here is the product space S × A × S, which is assumed to have some suitable σ-algebra and probability measure p(·, ·, ·) defined on it. Intuitively, p(s, a, s′) is the probability that applying action a will result in a transition from state s to state s′. By conditioning on states and actions, for each state s ∈ S and action a ∈ A, a transition probability distribution p(· | s, a) is obtained.
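A skeletal rendering of this loop in code might look as follows; this is our sketch, and the Environment interface and method names are assumptions rather than the paper's API:

    # A skeletal sketch of the agent main loop. The Environment interface,
    # the model classes and their methods are our assumptions, not the paper's.

    class Agent:
        def __init__(self, state, belief_base, policy):
            self.state = state              # current state (model, part 1)
            self.beliefs = belief_base      # belief base (model, part 2)
            self.policy = policy            # maps a state to an action

        def run(self, environment):
            while True:                                     # loop forever
                percept = environment.get_percept()         # get percept
                self.state = self.state.update(percept)     # update model:
                self.beliefs.acquire(percept)               #   state and beliefs
                action = self.policy(self.state)            # select action
                environment.put_action(action)              # put action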
There are three projections defined on S × A × S. The first projection initial : S × A × S → S is defined by initial(s, a, s′) = s, the second projection action : S × A × S → A is defined by action(s, a, s′) = a, and the third projection final : S × A × S → S is defined by final(s, a, s′) = s′, for each (s, a, s′) ∈ S × A × S. Each projection induces a probability measure on its codomain that makes the codomain into a probability space. We will be interested in two different sets of random variables on S. These are the evidence variables Ei : S → Vi (i = 1, . . . , n) and the result variables Rj : S → Wj (j = 1, . . . , m). Two sets of random variables can now be defined on the probability space S × A × S, as follows. For each evidence variable Ei, consider the random variable initial ◦ Ei : S × A × S → Vi. Similarly, for each result variable Rj, consider the random variable final ◦ Rj : S × A × S → Wj. Let X denote (initial ◦ E1, . . . , initial ◦ En, action, final ◦ R1, . . . , final ◦ Rm). All this can now be put together to obtain the random variable

X : S × A × S → V1 × · · · × Vn × A × W1 × · · · × Wm,

which is illustrated in Figure 1. The distribution of X is given by X⁻¹ ◦ p, where p is the probability measure on S × A × S. In the following, we will refer to an element of V1 × · · · × Vn as an evidence tuple and an element of W1 × · · · × Wm as a result tuple. The distribution of X is illustrated in the influence diagram in Figure 2. Here the Bayesian network given by the evidence and result variables is extended into
[Figure: the projections initial, action and final on S × A × S, with the evidence variables E1, . . . , En applied to the initial state and the result variables R1, . . . , Rm applied to the final state.]

Fig. 1. Evidence and result variables
[Figure: a Bayesian network over initial ◦ E1, . . . , initial ◦ En, action and final ◦ R1, . . . , final ◦ Rm, extended with a utility node U fed by the result variables.]

Fig. 2. Influence diagram
an influence diagram by adding the random variable action for the action selected and a utility node to indicate that it is the result variables that contribute to the utility. Each node in the Bayesian network has an associated (conditional) probability table (that is not shown in Figure 2). Note that, in general, it is possible for there to be dependencies amongst the initial ◦ Ei and amongst the final ◦ Rj . The evidence and result variables play an important role in action selection. One can think of these as features of the states. However, each class of features serves a different purpose. The evidence variables are chosen so as to assist the selection of a good action, whereas the result variables are chosen so as to provide a good evaluation of the resulting state. It will be convenient to insist that included amongst the evidence variables are boolean random variables that describe the pre-condition (if any) of each action. These random variables will be functions with definitions in the belief base. We can now be more precise about the definition of belief bases. Definition 5. A belief base is a theory consisting of the definitions of the evidence variables and (some of ) the result variables. The implication of this definition is that the belief base should contain the definitions of the evidence variables, (some of) the result variables, and any subsidiary functions used to define these, and that is all it need contain – there is no other use for functions in the belief base. This fact provides strong guidance during the design of the agent. Observations of the result variables will be needed but it will not be essential for the agent to know their (full) definitions. An application for which the definition of a result variable is not known but for which the variable can be adequately observed is given below.
At this point in the development we have entered the realm of graphical probabilistic models. Over the last 20 years or so there has been a huge effort to find efficient algorithms for doing all kinds of tasks such as marginalising, conditioning, and learning in graphical probabilistic models with large numbers of random variables. (See, for example, [3] and [6].) All of this work is directly relevant to the agent context and we hope to exploit it in future. For the moment, we are primarily interested in decision making, which does not require the full graphical model, and the applications of current interest have small graphical models anyway, so we simply make a few remarks below about this context. On the other hand, a new problem is raised by the agent applications in that the definitions of the random variables are generally changing, and this leads to some complications that deserve a deeper investigation.

By conditioning on the evidence variables and action variable in Figure 2, the table in Figure 3 is obtained. There is a row in this table for every combination of an evidence tuple together with an action, except that rows for which the pre-condition (if any) of the action has value ⊥ are deleted. There is a column under final ◦ (R1, . . . , Rm) for every result tuple. For each combination of evidence tuple and action, there is an associated transition probability distribution that gives, for each possible result tuple, the probability of reaching that result tuple. The last row records the utilities for each result tuple. To compute the expected utility EU(e, a) of action a applied to evidence tuple e, we proceed as follows. Suppose that e and a appear together in the ith row of the table. Then the expected utility is

EU(e, a) = ∑ j=1, . . . , l1···lm of pi,j × uj.
                                        final ◦ (R1, . . . , Rm)
initial ◦ (E1, . . . , En)  action  (w1,1, . . . , wm,1)  (w1,2, . . . , wm,1)  . . .  (w1,l1, . . . , wm,lm)
(v1,1, . . . , v1,n)        a1      p1,1                  p1,2                  . . .  p1,l1···lm
(v2,1, . . . , v2,n)        a2      p2,1                  p2,2                  . . .  p2,l1···lm
. . .                       . . .   . . .                 . . .                 . . .  . . .
(vk,1, . . . , vk,n)        ak      pk,1                  pk,2                  . . .  pk,l1···lm
                                    u1                    u2                    . . .  ul1···lm

Fig. 3. Influence diagram conditioned on the evidence and action variables
A policy for an agent is a mapping policy : S → A. The policy is extracted from the conditioned influence diagram in the usual way. Given a state s, the action a selected by policy is the one for which EU ((E1 (s), . . . , En (s)), a) is a maximum. The case when the state s is not known exactly but is given by a distribution (the ‘belief state’ case [9]) is handled by the obvious generalisation of this. This policy thus implements the principle of maximum expected utility. The agent architecture exhibits a tight integration between logic and probability, since the random variables in the network have definitions given by the
logic. Furthermore, for agents in dynamic situations, the definitions of evidence variables will need to be modified from time to time. In the first illustration of the next section, we discuss a personalisation application for which there are three evidence variables, one of which is learned and the other two are subject to updating. In effect, the evidence variables are tailored to the interests and preferences of the user and this will be seen to be essential for selecting good actions. The approach taken here to agent architecture assumes that it is possible to specify the utilities. Generally, assigning utilities to a very large number of states in advance is a difficult task. However, with the right choice of result variables, the task can become much easier; this is illustrated with the applications in the next section. The modelling challenge for the agent designer is to find result variables so that all states that map to the same result tuple really do have the same (or very similar) utility. Thus, in essence, the task is to find a good piecewise constant approximation of the utility function. While not every application will succumb to this approach, it seems that there is a sufficiently large subset of agent applications which does and therefore the approach is useful. Having settled on the actions, and evidence and result variables, and having specified the utilities, the only other information needed to obtain a policy is the collection of transition probability distributions p(· | e, a) in Figure 3. How are these obtained? Essentially all that needs to be done is to observe triples of the form (e, a, r), where e is an evidence tuple, a is an action, and r is the corresponding result tuple. The observation of each such triple increments a count kept at the corresponding entry in the table of transition probabilities. More generally, r is not uniquely determined by e and a but is given by a probability distribution. In this case, the increment is proportioned over the result tuples according to this distribution. These counts can then be normalised over each row to obtain a probability distribution.
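Putting the pieces together, the count-based estimation of p(· | e, a) and the resulting maximum-expected-utility policy can be sketched as follows (our code; all names are assumptions, not the paper's):

    # A sketch (ours, not the paper's code) of Figure 3 as data: transition
    # counts observed from (e, a, r) triples, normalised to distributions,
    # and the maximum-expected-utility policy built on top of them.

    from collections import Counter, defaultdict

    counts = defaultdict(Counter)    # (evidence tuple, action) -> Counter of result tuples

    def observe(e, a, r):
        """Record one observed triple (evidence tuple, action, result tuple)."""
        counts[(e, a)][r] += 1

    def transition(e, a):
        """p(. | e, a): normalise the counts in row (e, a) of the table."""
        row = counts[(e, a)]
        total = sum(row.values())
        return {r: k / total for r, k in row.items()}

    def expected_utility(e, a, utility):
        """EU(e, a) = sum_j p_{i,j} * u_j, with utility a map on result tuples."""
        return sum(p * utility[r] for r, p in transition(e, a).items())

    def policy(state, evidence, actions, utility):
        """Select an action maximising EU over the state's evidence tuple."""
        e = tuple(E(state) for E in evidence)
        return max(actions, key=lambda a: expected_utility(e, a, utility))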
4 Illustrations

We now consider two illustrations of the agent architecture.

4.1 TV Recommender
The first illustration is concerned with applying machine learning techniques to building user agents that facilitate interaction between a user and the Internet. It concentrates on the topic of personalisation in which the agent adapts its behaviour according to the interests and preferences of the user. There are many practical applications of personalisation that could exploit this technology. This illustration is set in the context of an infotainment agent, which is a multi-agent system that contains a number of agents with functionalities for recommending movies, TV programs, music and the like, as well as information agents with functionalities for searching for information on the Internet. Here we concentrate on the TV recommender as a typical such agent. More detail (but not the decision-theoretic discussion that follows) is given in [2].
A detailed description of the most pertinent aspects of the design of the TV recommender is now given. The knowledge representation aspects of the TV recommender are presented first; these mainly concern the states, actions, evidence variables, and result variables. First we introduce the types that will be needed and the data constructors corresponding to these types. We will need several standard types: Ω (the type of the booleans), Nat (the type of natural numbers), Int (the type of integers), and String (the type of strings). Also, List denotes the (unary) list type constructor. Thus, if α is a type, then List α is the type of lists whose elements have type α. We introduce the following type synonyms.

State = Occurrence × Status
Occurrence = Date × Time × Channel
Date = Day × Month × Year
Time = Hour × Minute
Program = Title × Subtitle × Duration × (List Genre) × Classification × Synopsis
Text = List String.
In addition, Title, Subtitle, and Synopsis are all defined to be String, and Year, Month, Day, Hour, Minute and Duration are all defined to be Nat. The data constructors for the type Status are as follows.

Unknown, Yes, No : Status.

The meaning of Unknown is that a recommendation (about a program having a particular occurrence) hasn't yet been made, Yes means that it has a positive recommendation, and No means that it has a negative recommendation. Channel is the type of TV channels, Genre the type of genres, and Classification the type of classifications. There are 49 channels, 115 genres and 7 classifications. There is a type Action with data constructors given by

RecommendYes, RecommendNo : Action.

The action RecommendYes is a positive recommendation, while RecommendNo is a negative recommendation. The action RecommendYes takes a state (o, Unknown), where o is some occurrence, and produces the new state (o, Yes). Similarly, the action RecommendNo takes a state (o, Unknown) and produces the new state (o, No). In the following, the definitions of various functions will appear. These are (mainly) in the belief base of the TV agent. To indicate that they are beliefs of the TV agent, the necessity modality □t is used. Thus, if ϕ is a formula, then the meaning of □t ϕ is that 'ϕ is a belief of the TV agent'. Other agents in the multi-agent system have their own necessity modality; for example, the modality for the diary agent is □d. There is also a base of common beliefs accessible to all the agents, for which the necessity modality is □. Interaction axioms allow
one agent to access the beliefs of another agent [5]. So, for example, there are interaction axioms that allow the TV agent to access the beliefs of the diary agent and also the common beliefs. There are three projections on State × Action × State. The first projection is

initial : State × Action × State → State
□t ∀State s. ∀Action a. ∀State s′. ((initial (s, a, s′)) =State s).

The second projection is

action : State × Action × State → Action
□t ∀State s. ∀Action a. ∀State s′. ((action (s, a, s′)) =Action a).

The third projection is

final : State × Action × State → State
□t ∀State s. ∀Action a. ∀State s′. ((final (s, a, s′)) =State s′).

Now we turn to the evidence variables. To define these, a number of subsidiary functions are needed and so we start there. The agent has access via the Internet to a TV guide (for the next week or so) for all channels. This database is represented by a function tv guide having signature

tv guide : Occurrence → Program.

Here the date, time and channel information uniquely identifies the program and the value of the function is (information about) the program itself. The TV guide consists of (thousands of) facts like the following one.

□t ((tv guide ((20, 7, 2004), (20, 30), ABC)) =Program
    ("The Bill", "", 50, [Drama], M, "Sun Hill continues to work at breaking the people smuggling operation")).

This fact states that the program on 20 July 2004 at 8.30pm on channel ABC has title "The Bill", no subtitle, a duration of 50 minutes, genre drama, a classification for mature audiences, and synopsis "Sun Hill continues to work at breaking the people smuggling operation". There are a number of simple subsidiary functions that are defined as follows. Note that the definition of the function add is in the base of common beliefs and hence uses the modality □.

proj Occurrence : State → Occurrence
□t ∀Occurrence o. ∀Status s. ((proj Occurrence (o, s)) =Occurrence o).
period : Occurrence → Date × Time × Time
□t ∀Date d. ∀Time t. ∀Channel c. ((period (d, t, c)) =Date×Time×Time
    (d, t, (add (t, (proj Duration (tv guide (d, t, c))))))).

add : Time × Duration → Time
□ ∀Hour h. ∀Minute m. ∀Duration d. ((add ((h, m), d)) =Time
    ((60 × h + m + d) div 60, (60 × h + m + d) mod 60)).

The belief base of the TV recommender includes the function user tv time acceptable, which has a definition that is obtained by belief acquisition and would look something like the following.

user tv time acceptable : Date × Time × Time → Ω
□t ∀Date d. ∀Time t. ∀Time t′. ((user tv time acceptable (d, t, t′)) =Ω
    if (weekday d) ∧ ((proj Hour t) ≥ 20) ∧ ((proj Hour t′) ≤ 23) then ⊤
    else if ¬(weekday d) ∧ ((proj Hour t) ≥ 12) ∧ ((proj Hour t′) ≤ 25) then ⊤
    else ⊥).

This definition states that the user is willing to watch programs that start between 8pm and 11pm on weekdays and between midday and 1am on weekends. This information is obtained rather directly from the user by the belief acquisition algorithm. The belief base of the TV recommender also includes the function user likes tv program, which has a definition that is obtained by belief acquisition (in this case, machine learning, since the function learned generalises) and would look something like the following.

user likes tv program : Program → Ω
□t ∀Program x. ((user likes tv program x) =Ω
    if (proj Title ◦ (=Title "NFL Football") x) then ⊤
    else if (proj (List Genre) ◦ (listExists 1 genre ◦ (< 0)) x) then ⊥
    else if (proj Title ◦ StringToText ◦ (listExists 1 (=String "sport")) x) then ⊤
    else if (proj (List Genre) ◦ (listExists 1 (=Genre Current Affairs)) x) then ⊥
    else if (proj Synopsis ◦ StringToText ◦ (listExists 1 (=String "american")) x) then ⊤
    . . .
    else ⊥).
Much more detail on how this function is learned is contained in [2].
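As a rough illustration, the following Python sketch (ours; the weekday helper and the tuple encodings of dates and times are our assumptions, and all the type-level detail of the logic is lost) transcribes the add function and the user tv time acceptable belief.

import datetime

def add(time, duration):
    # add : Time x Duration -> Time; hours may exceed 23 (e.g. 25 = 1am)
    h, m = time
    total = 60 * h + m + duration
    return (total // 60, total % 60)

def weekday(date):
    # hypothetical helper: the paper leaves 'weekday' to the belief base
    d, mo, y = date
    return datetime.date(y, mo, d).weekday() < 5

def user_tv_time_acceptable(date, start, end):
    # willing to watch 8pm-11pm on weekdays, midday-1am on weekends
    if weekday(date):
        return start[0] >= 20 and end[0] <= 23
    return start[0] >= 12 and end[0] <= 25

print(add((20, 30), 50))    # (21, 20), i.e. "The Bill" ends at 9.20pm
print(user_tv_time_acceptable((20, 7, 2004), (20, 30), (21, 20)))   # True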
Another subsidiary function needed is user diary free, which has signature

user diary free : Date × Time × Time → Ω.

The definition of this function is in the belief base of the diary agent. The TV agent has access to this definition because of an interaction axiom that makes the belief base of the diary agent accessible to the TV agent. We can now define the evidence variables. The first evidence variable is

ultp : State → Ω
□t ∀State x. ((ultp x) =Ω (proj Occurrence ◦ tv guide ◦ user likes tv program x)).

This evidence variable is intended to indicate whether or not the user likes a program (irrespective of whether or not the user would be able to watch it). The second evidence variable is

udiary : State → Ω
□t ∀State x. ((udiary x) =Ω (proj Occurrence ◦ period ◦ user diary free x)).

This evidence variable is intended to indicate whether or not the user's diary is free during the time that the program is on. The third evidence variable is

uaccept : State → Ω
□t ∀State x. ((uaccept x) =Ω (proj Occurrence ◦ period ◦ user tv time acceptable x)).

This evidence variable is intended to indicate whether or not the user is willing to watch television at the time the program is on. The intuition is that the three evidence variables are features that together can reasonably be used by the TV agent to make recommendations for the user. Next we turn to the result variables. The first result variable is the function

user : State → Ω,

which models the user as an oracle about occurrences of programs. Given a state, user returns ⊤ if the user intends to watch the program whose occurrence is in the first component of the state; otherwise, user returns ⊥. Of course, the
[Figure content not reproducible: a diagram showing the projections initial, action, and final on State × Action × State, the evidence variables ultp, udiary, uaccept : State → Ω on the initial state, and the result variables user : State → Ω and recomm : State → Status on the final state.]

Fig. 4. Evidence and result variables for the TV recommender
[Figure content not reproducible: the influence diagram with chance nodes initial ◦ ultp, initial ◦ udiary, initial ◦ uaccept, a decision node action, chance nodes final ◦ user and final ◦ recomm, and a utility node U.]

Fig. 5. Influence diagram for the TV recommender
definition of user is not available to the agent. But it can observe values for this function by asking the user. The second result variable is

recomm : State → Status
□t ∀Occurrence o. ∀Status s. ((recomm (o, s)) =Status s).

The function recomm simply projects onto the status component of the state. Figure 4 illustrates the evidence and result variables for the TV recommender. The corresponding influence diagram is given in Figure 5.
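The evidence variables are thus just compositions of belief-base functions. The sketch below (ours; the stub definitions are placeholders for the learned and acquired beliefs above) shows this compositional structure, reading f ◦ g in the paper's left-to-right application order.

def compose(*fs):
    # apply the functions left to right, as in proj ◦ tv_guide ◦ likes
    def composed(x):
        for f in fs:
            x = f(x)
        return x
    return composed

proj_occurrence = lambda state: state[0]

# Stub beliefs, purely for illustration.
tv_guide   = lambda occ: ("The Bill", "", 50, ["Drama"], "M", "...")
likes      = lambda prog: "Drama" in prog[3]
period     = lambda occ: (occ[0], occ[1], (21, 20))
diary_free = lambda d, t, t2: True
acceptable = lambda d, t, t2: True

ultp    = compose(proj_occurrence, tv_guide, likes)
udiary  = compose(proj_occurrence, period, lambda p: diary_free(*p))
uaccept = compose(proj_occurrence, period, lambda p: acceptable(*p))

state = (((20, 7, 2004), (20, 30), "ABC"), "Unknown")
print(ultp(state), udiary(state), uaccept(state))   # True True True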
initial ◦ (ultp, udiary, uaccept)  action        │ final ◦ (user, recomm)                 │ EU
                                                 │ (⊤, Yes)  (⊤, No)  (⊥, Yes)  (⊥, No)  │
(⊤, ⊤, ⊤)                          RecommendYes  │ 0.8       0.0      0.2       0.0      │ 0.88
(⊤, ⊤, ⊤)                          RecommendNo   │ 0.0       0.8      0.0       0.2      │ 0.20
(⊤, ⊤, ⊥)                          RecommendYes  │ 0.4       0.0      0.6       0.0      │ 0.64
(⊤, ⊤, ⊥)                          RecommendNo   │ 0.0       0.4      0.0       0.6      │ 0.60
(⊤, ⊥, ⊤)                          RecommendYes  │ 0.0       0.0      1.0       0.0      │ 0.40
(⊤, ⊥, ⊤)                          RecommendNo   │ 0.0       0.0      0.0       1.0      │ 1.00
(⊤, ⊥, ⊥)                          RecommendYes  │ 0.0       0.0      1.0       0.0      │ 0.40
(⊤, ⊥, ⊥)                          RecommendNo   │ 0.0       0.0      0.0       1.0      │ 1.00
(⊥, ⊤, ⊤)                          RecommendYes  │ 0.1       0.0      0.9       0.0      │ 0.46
(⊥, ⊤, ⊤)                          RecommendNo   │ 0.0       0.1      0.0       0.9      │ 0.90
(⊥, ⊤, ⊥)                          RecommendYes  │ 0.0       0.0      1.0       0.0      │ 0.40
(⊥, ⊤, ⊥)                          RecommendNo   │ 0.0       0.0      0.0       1.0      │ 1.00
(⊥, ⊥, ⊤)                          RecommendYes  │ 0.0       0.0      1.0       0.0      │ 0.40
(⊥, ⊥, ⊤)                          RecommendNo   │ 0.0       0.0      0.0       1.0      │ 1.00
(⊥, ⊥, ⊥)                          RecommendYes  │ 0.0       0.0      1.0       0.0      │ 0.40
(⊥, ⊥, ⊥)                          RecommendNo   │ 0.0       0.0      0.0       1.0      │ 1.00
utilities                                        │ 1         0        0.4       1        │
Fig. 6. Influence diagram conditioned on the evidence and action variables for the TV recommender
initial ◦ (ultp, udiary, uaccept)   action
(⊤, ⊤, ⊤)                           RecommendYes
(⊤, ⊤, ⊥)                           RecommendYes
(⊤, ⊥, ⊤)                           RecommendNo
(⊤, ⊥, ⊥)                           RecommendNo
(⊥, ⊤, ⊤)                           RecommendNo
(⊥, ⊤, ⊥)                           RecommendNo
(⊥, ⊥, ⊤)                           RecommendNo
(⊥, ⊥, ⊥)                           RecommendNo
Fig. 7. Policy for the TV recommender corresponding to Figure 6
Given in Figure 6 is a set of transition probability distributions and utilities that could reasonably arise in practice. (The utilities are given by the user.) The last column contains the expected utility of the action given the evidence for each row; for example, the first row gives EU = 0.8 × 1 + 0.2 × 0.4 = 0.88. In Figure 6, the definition acquired by the agent for uaccept is actually not all that accurate: the user is willing to watch programs outside the periods given in the definition of this function. Secondly, it is assumed that the user is reasonably happy with states in which the agent recommends programs that they do not actually want to watch, and therefore such states are given the utility value of 0.4. The policy corresponding to this situation is given in Figure 7 and its definition in the logic is as follows.
policy : State → Action
□t ∀State s. ((policy s) =Action
    if (((ultp s) =Ω ⊤) ∧ ((udiary s) =Ω ⊤)) then RecommendYes
    else RecommendNo).

4.2 Blocks World
The second illustration is the traditional blocks world domain, which will highlight a couple of aspects missing from the TV recommender. Here are some declarations suitable for this domain.

B0, B1, B2, B3, B4, B5, B6, B7, B8, B9, Floor : Object
Stack = List Object
World = {Stack}
OnState = Object × Object
Intention = {OnState}
State = World × Intention.

We consider worlds in which there are up to 10 blocks. A blocks world is modelled as a set of stacks of blocks. The agent's intentions are modelled as a set of on-states, where an on-state is a pair of objects, the first of which is intended to be immediately on top of the second. The set of on-states may not fully specify the position of all blocks in the world; an intention is just a set of constraints that any goal state should satisfy. A state is a pair consisting of a blocks world and an intention.

A block b in the world of a state is misplaced if (i) the pair consisting of b and its supporting object directly contradicts one of the on-states in the intention, or (ii) b is supported by a misplaced block. A move of a block is constructive if, after the move, the block is not misplaced and the move achieves an on-state in the intention. Once a block has been moved constructively, it need not move again in the course of achieving the intention.

This discussion leads to the definition of two actions. The first, called CMove, makes a constructive move of a block. If there are several blocks that can be moved constructively, then the choice of block to be moved is non-deterministic. The second action, called FMove, involves moving a misplaced block (that is not on the floor) to the floor. Once again, the choice of such a block to be moved is non-deterministic. This gives the following declaration of the constants for the type Action.

CMove, FMove : Action.

It is assumed that after executing an action, the agent receives a percept from the environment that enables it to know the exact state of the world (that is, which particular block was actually moved).
Next we give the evidence variables, which have the following signatures.

okC : State → Ω
okF : State → Ω.

The intended meaning of the predicate okC is that it is true iff a constructive move is possible. The intended meaning of the predicate okF is that it is true iff there is a misplaced block that is not on the floor. Note that, if a world does not satisfy the intention and okC is false, then okF is true. The precondition for the action CMove is that okC is true. Also, the precondition for the action FMove is that okF is true. Thus, for blocks world, the evidence variables simply provide the preconditions for the actions.

There is a single result variable misplaced that returns the number of blocks in the world that are misplaced (with respect to the intention). The function misplaced has a definition as follows.

misplaced : State → Nat
□b ∀State s. ((misplaced s) =Nat card {b | (isMisplaced b s)}).

Here □b is the necessity modality for the blocks world agent, card computes the cardinality of a set, and isMisplaced is a predicate that checks whether a block is misplaced for a state. Intuitively, states for which the value of misplaced is smaller should have a higher utility, as they are 'closer' to a goal state. The evidence and result variables are shown in Figure 8.
[Figure content not reproducible: a diagram showing the projections initial, action, and final on State × Action × State, the evidence variables okC, okF : State → Ω on the initial state, and the result variable misplaced : State → Nat on the final state.]

Fig. 8. Evidence and result variables for the blocks world agent
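To make the bookkeeping concrete, here is a Python sketch (ours, under explicit assumptions: stacks are listed bottom-to-top and rest on the Floor, and "directly contradicts" is read as "the intention demands a different support for b"; okC is omitted, since it needs the full constructive-move test) of isMisplaced, misplaced and okF.

FLOOR = "Floor"

def support(world, b):
    # the object immediately underneath block b
    for stack in world:
        if b in stack:
            i = stack.index(b)
            return FLOOR if i == 0 else stack[i - 1]
    raise ValueError(b)

def is_misplaced(b, world, intention):
    s = support(world, b)
    # (i) b's actual support contradicts an on-state (b, x) in the intention
    if any(top == b and below != s for (top, below) in intention):
        return True
    # (ii) b is supported by a misplaced block
    return s != FLOOR and is_misplaced(s, world, intention)

def misplaced(world, intention):
    return sum(is_misplaced(b, world, intention)
               for stack in world for b in stack)

def okF(world, intention):
    # some misplaced block is not on the floor
    return any(is_misplaced(b, world, intention) and support(world, b) != FLOOR
               for stack in world for b in stack)

world = [["B1", "B2"], ["B3"]]   # B2 sits on B1; B1 and B3 on the floor
intention = {("B2", "B3")}       # we want B2 on B3
print(misplaced(world, intention), okF(world, intention))   # 1 True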
Figure 9 gives the influence diagram for the blocks world agent. Note that, in this illustration, the utility of a state is determined by a single result variable. Figure 10 shows the conditioned influence diagram (where actual counts are shown instead of probabilities; these counts need to be normalised). The observations are from 100 episodes in a world of 10 blocks for which each episode
[Figure content not reproducible: the influence diagram with chance nodes initial ◦ okC and initial ◦ okF, a decision node action, a chance node final ◦ misplaced, and a utility node U.]

Fig. 9. Influence diagram for the blocks world agent
involved achieving an intention with 10 on-states (that is, each goal state was fully specified). There are only four rows in this table instead of eight because in four cases the precondition of the corresponding action is not satisfied; in each of the four rows that remain, the precondition of the corresponding action is satisfied. The counts in this table were obtained by deploying the agent and observing the effects of actions on various states. The utilities are based on a value of 1 if there are no misplaced blocks (that is, a goal state has been reached) with a discount factor of 0.9 for every further misplaced block.
initial ◦ (okC, okF)  action │ final ◦ misplaced                                           │ EU
                             │ 0    1    2     3     4     5     6     7     8     9     10   │
(⊤, ⊤)                CMove  │ 0    1    1     4     14    31    42    53    63    38    0    │ 0.49
(⊤, ⊤)                FMove  │ 0    0    0     1     4     18    23    69    58    58    11   │ 0.46
(⊤, ⊥)                CMove  │ 100  99   99    96    85    66    52    24    8     0     0    │ 0.76
(⊥, ⊤)                FMove  │ 0    0    0     2     3     4     9     35    104   115   117  │ 0.40
utilities                    │ 1.0  0.9  0.81  0.73  0.66  0.59  0.53  0.48  0.43  0.39  0.35 │
Fig. 10. Influence diagram conditioned on the evidence and action variables for the blocks world agent
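As a quick check (our own arithmetic, not from the paper), normalising the first row of Figure 10 and combining it with the utilities 0.9^j for j misplaced blocks reproduces the expected utility 0.49:

counts = [0, 1, 1, 4, 14, 31, 42, 53, 63, 38, 0]   # the (⊤,⊤)/CMove row
utils = [0.9 ** j for j in range(11)]               # 1.0, 0.9, 0.81, ...

total = sum(counts)
eu = sum(c / total * u for c, u in zip(counts, utils))
print(round(eu, 2))   # 0.49, matching the last column of Figure 10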
In this very simple case, the only choice about actions comes in the case that the preconditions of both actions are satisfied. This is shown in the first two rows of Figure 10. Applying the principle of maximum expected utility to these two rows leads to the action CMove being selected. The policy is given in Figure 11 and its definition in the logic is as follows.
policy : State → Action
□b ∀State s. ((policy s) =Action if ((okC s) =Ω ⊤) then CMove else FMove).

initial ◦ (okC, okF)   action
(⊤, ⊤)                 CMove
(⊤, ⊥)                 CMove
(⊥, ⊤)                 FMove

Fig. 11. Policy for the blocks world agent
4.3 Comparison of the Illustrations
Between them, the two illustrations highlight most of the important features of the rational agent architecture. First, note that in each illustration we have brought expert knowledge to bear to make good choices of the state, actions, evidence and result variables, and utilities. For example, for blocks world, we used knowledge of the domain to define the action CMove. While it would be possible to pretend not to have expert knowledge about the domain and use more primitive actions instead, our view is that, faced with a real-world application, it is natural to leverage every piece of expert knowledge about the domain that is available. We have followed this approach in both illustrations.

The TV recommender provides a good illustration of evidence variables that are acquired, in this case, from the user. They were chosen because their values provide an evidence tuple that is a good guide to the recommendation that the agent should make. Their definitions are quite complicated and user-specific. (See [2] for details about ultp.) On the other hand, in blocks world, the evidence variables are as simple as they can possibly be – just the preconditions for the actions. This situation would be untypical for more complex applications, but seems reasonable for an agent whose main task is to move in a search space.

For the TV recommender we have the advantage of an oracle – the user – who can directly criticise its recommendations. Thus the form of the result variables in that illustration seems natural. For blocks world, we exploit the fact that it is possible to make a fairly accurate estimate of the length of the path from the current state to the goal. This naturally suggests that misplaced should be the result variable.

In blocks world, the agent may have to make many moves to reach the goal. In other words, a plan to reach the goal emerges from the policy that is just concerned with 'locally' choosing the best action. For real-world applications, it is typical for several actions to be needed to accomplish a task and it is also typical for actions to have non-deterministic outcomes, as in blocks world.
Notwithstanding the positive aspects of the illustrations, neither provides fully convincing evidence of the usefulness of the rational agent architecture. Studying more complex applications is perhaps the most important next step.
5 Discussion
We conclude with some remarks about the wider context of this research. The scientific question of interest here is that of designing effective architectures for agent systems that perform complex tasks. We take the view that a satisfactory solution to this problem will involve (at least) a successful integration of logic, probability and machine learning: logic is needed for knowledge representation and reasoning, probability is needed for handling uncertainty, and machine learning is needed for adaptivity. The integration of logic and probability is an old scientific problem going back to the 19th century that computer scientists and others are still struggling with today. Recently, the goal of integrating logic, probability, and learning was explicitly identified as a major research problem. (An excellent survey of this problem is given in [7]. See also the Dagstuhl Seminar on this topic at http://www.dagstuhl.de/05051/.) Of course, there are many possible approaches to solving this problem, quite a few of which were presented at the Dagstuhl Seminar and most of which are primarily set in a machine learning context. Here we take the view that the best general setting for this problem is that of agents.

The main elements of our approach to the architecture of agents are as follows. First, there are the agent beliefs. For these, we employ a modal higher-order logic. While neither of the illustrations in this paper makes truly crucial use of the modalities, we have applications in mind where these would be essential. In general, we expect to make use of both epistemic modalities, as in this paper, and temporal modalities, as these are the two kinds of modalities that seem most useful for agents. The beliefs are maintained as function definitions of evidence and result variables. There is a general method for acquiring beliefs that includes conventional database update and machine learning as special cases. The other main element of the architecture is a Bayesian network on the values of evidence and result variables. The parameters for this Bayesian network are learned through training. These parameters, together with the utilities on result tuples, determine the action with the highest expected utility for each evidence tuple and, therefore, the policy for the agent. Thus our approach has integrated logic and probability in the following way: logic is used to provide the definitions of the evidence and result variables, while probability is used to provide distributions on the values of these variables. Machine learning is needed in two places in the architecture: the definitions of some variables are learned by belief acquisition and the Bayesian network parameters are learned by maximum-likelihood estimation.

Our approach to agent construction assumes that expert knowledge is available for each application. Thus, given an application, it must be possible to know what should be the state, actions, definitions (or, at least, hypothesis languages)
for the evidence and result variables, and utilities of the agent. With these in place, the agent policy can be learned through training. Our experience with a small number of application domains suggests that a methodology for constructing agents by our approach can be successfully developed, although much work on this remains to be done.

Currently, work is proceeding on the theoretical foundations, architectural issues, and applications. Particularly important are the applications, as we need to deploy agents in application areas rather more complex than those studied in this paper to make the whole approach fully convincing. One particularly promising application area that we are investigating is a poker-playing program. This application is a challenging one and provides a good testbed for agent and AI technologies. In particular, it exercises every aspect of the rational agent architecture.
Acknowledgements

The authors are grateful to Joshua Cole and Kee Siong Ng for many helpful discussions that have greatly influenced the ideas in this paper and for their implementation of the agent applications reported here.
References

1. C. Boutilier, T. Dean, and S. Hanks. Decision theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1–94, 1999.
2. J.J. Cole, M.J. Gray, J.W. Lloyd, and K.S. Ng. Personalisation for user agents. In Fourth International Conference on Autonomous Agents and Multiagent Systems (AAMAS 05), 2005.
3. R.G. Cowell, A.P. Dawid, S.L. Lauritzen, and D.J. Spiegelhalter. Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Sciences. Springer, 1999.
4. J.W. Lloyd. Logic for Learning. Cognitive Technologies. Springer, 2003.
5. J.W. Lloyd. Modal higher-order logic for agents. http://users.rsise.anu.edu.au/~jwl/beliefs.pdf.
6. J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
7. L. De Raedt and K. Kersting. Probabilistic logic learning. SIGKDD Explorations, 5(1):31–48, 2003.
8. R. Rivest. Learning decision lists. Machine Learning, 2(3):229–246, 1987.
9. S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, second edition, 2002.
10. M. Wooldridge. Reasoning about Rational Agents. MIT Press, 2000.
LAIMA: A Multi-agent Platform Using Ordered Choice Logic Programming

Marina De Vos, Tom Crick, Julian Padget, Martin Brain, Owen Cliffe, and Jonathan Needham

Department of Computer Science, University of Bath, Bath BA2 7AY, UK
{mdv, tc, jap, mjb, occ, jmn20}@cs.bath.ac.uk
Abstract. Multi-agent systems (MAS) can take many forms depending on the characteristics of the agents populating them. Amongst the more demanding properties with respect to the design and implementation of a multi-agent system is how these agents may individually reason and communicate about their knowledge and beliefs, with a view to cooperation and collaboration. In this paper, we present a deductive reasoning multi-agent platform using an extension of answer set programming (ASP). We show that it is capable of dealing with the specification and implementation of the system's architecture, communication and the individual agents' reasoning capacities. Agents are represented as Ordered Choice Logic Programs (OCLP) as a way of modelling their knowledge and reasoning capacities, with communication between the agents regulated by uni-directional channels transporting information based on their answer sets. In the implementation of our system we combine the extensibility of the JADE framework with the flexibility of the OCT front-end to the Smodels answer set solver. The power of this approach is demonstrated by a multi-agent system reasoning about equilibria of extensive games with perfect information.
1 Introduction

The emergence of deductive reasoning agents over traditional symbolic AI has led to the development of logical theories within an agent framework. The principal idea is to use logic formalisms to encode a theory stating the best action to perform in any given situation. The action performed by the agent will be derived from this theory. Unfortunately, the problems of translating and representing the real world in an accurate and adequate symbolic description are still largely unsolved (although there has been some encouraging work in the area, for example, the OpenCyc project [1]). Answer set programming (hereafter referred to as ASP) [2] is a formal, logical language designed for declarative problem solving, knowledge representation and real world reasoning. It represents a modern approach to logic programming, based on analysis of which theoretical constructs are needed in reasoning applications rather than implementing subsets of classical logic or other abstract logical systems.
This work was partially supported by the European Fifth Framework Programme under the grant IST-2001-37004 (WASP) and the Nuffield Foundation under the grant NAL/00689/A.
In this paper, we present a formalism for multi-agent systems in which the agents use answer set programming to represent their reasoning capabilities. Such systems are useful for modelling decision-problems; not just the solutions of the problem at hand but also the evolution of beliefs and the interactions between the agents can be characterised. In this system, a set of agents is connected by uni-directional communication channels. To model a single agent's reasoning, we use Ordered Choice Logic Programs [3], an extension of the answer set semantics that provides for explicit representation of preferences between rules and dynamic, circumstance-dependent choice between alternatives in a decision. Agents use information received from other agents as guidance for their reasoning. As long as the information received does not conflict with the decision(s) the agent has to make towards its goal(s), this new knowledge will be accepted. If conflict arises, the agent will always prefer its own alternatives over the ones presented by other agents. In this sense, one could say that our agents act strategically and rationally in the game-theoretic sense ([4]). The knowledge an agent possesses is captured by its answer set semantics, with part of this knowledge being shared with agents listening on the outgoing channels. This is regulated by filters that decide, for each connecting agent, what sort of information this agent is allowed to receive. The semantics of the whole system corresponds to a stable situation where no agent, with stable input, needs to change its output.

Our implementation uses the JADE platform [5] to provide basic communications and agent services, an ontology developed in Protégé [6], and OCT [3, 7] to model, maintain and reason about the agents' beliefs and knowledge.

The rest of the paper is organised as follows: Section 2 explains the advantages of using logic programming for modelling agent behaviour and details why we believe ASP is better suited for this than Prolog, the traditional choice for logic programming. A brief overview of OCLP is provided in Section 3. Sections 4 and 5 present the theoretical multi-agent framework, LAIMA, and its implementation, JOF. Game theory, one of the possible application areas of LAIMA/JOF, is detailed in Section 6. The paper ends with a discussion of future developments in ASP that are beneficial to the agent community and of future work on the presented system.
2 Why Answer Set Programming?

One of the key components in making agents behave intelligently is the ability to reason: performing logical inference, working with combinatorial constraints, and handling complex logical queries over a large search domain. These tasks are simple to express and implement in declarative logic programming formalisms. By using these tools, the developer of an agent system can focus on the reasoning and 'thought' processes of the agent rather than being bogged down in the detail of how to implement them. Using existing logical formalisms rather than an ad-hoc system also brings a greater degree of robustness and certainty to the agent's reasoning, i.e. because it is possible, or easier, to prove and verify the behaviour of participating agents. Finally, the availability of a number of powerful and mature implementations contributes to reduced development time.
The question then becomes one of "Which logical formalism?". Due to its power, expressiveness, and the availability of fast inference mechanisms, we use ASP. A detailed discussion of the benefits of ASP over other formalisms can be found in Section 1.1 of [2].

An important aspect of ASP is its treatment of negation. The semantics of ASP naturally gives rise to two different methods of calculating negation: negation as failure and constraint-based negation. Negation as failure (i.e. we cannot prove p to be true) is characterised as epistemic negation¹ (i.e. we do not know p to be true). Constraint-based negation introduces constraints that prevent certain combinations of atoms from being simultaneously true in any answer set. This is characterised as classical negation, as it is possible to prevent a and ¬a both being simultaneously true, a requisite condition for capturing classical negation. This is a significant advantage in some reasoning tasks as it allows reasoning about incomplete information. More importantly, the "closed world assumption" (inherent in much work modelled in Prolog) is not present in ASP. In the extension of ASP that we propose to use for modelling agents, we only allow implicit negation coming from the modelling decisions (you have to decide upon exactly one alternative when forced to make a choice). However, as described in [8], it is perfectly possible to embed any form of negation (constraint negation or negation as failure) using the order mechanism. In other words, the ordering of rules replaces the use of negation, making OCLP a suitable tool to model the exact type of negation one wishes to use. In terms of complexity, having both concepts (negation and order) is no different to having just one, as one can be represented using the other.

One key difference between Prolog and ASP is that the semantics of ASP clearly gives rise to multiple possible world views² in which the program is consistent. The number and composition of these varies with the program. Attempting to model the same ideas in Prolog can lead to confusion, as the multiple possible views may manifest themselves differently, depending on the query asked. In ASP terms, Prolog would answer a query on a as true if there is at least one answer set in which a is true. However, there is no notion of 'in which answer set is this true'. Thus, a subsequent query on b might also return true, but without another query it would not be possible to infer whether a and b could be simultaneously true. Although this seems overly rigid when pure reactive agents are required, one can encode reactive behaviour very easily in ASP, enforcing a single answer set.
3 Ordered Choice Logic Programming

OCLP ([3]) was developed as an extension of ASP to reason about decisions and preferences. This formalism allows programmers to explicitly express decisions (in the form of exclusive choices between multiple alternatives) and situation-dependent preferences. We explain the basics³ of OCLP by means of a simplified grid-computing situation:
¹ For a full discussion of such issues, see [2].
² The exact, formal, declarative meaning of answer sets is still under debate [9].
³ Full details can be found in [3].
Example 1. Suppose a grid-computing agent is capable of performing two tasks which are mutually exclusive because of the available resources. The agent has permission to use three clusters (cluster 1, cluster 2, cluster 3). Because of prices, computing power, reachability and reliability, the agent prefers cluster 3 over cluster 2, cluster 2 over cluster 1 and cluster 3 over cluster 1. However, in order to perform task two, she needs access to both a database and a mainframe. The former can only be provided by cluster 1, while the latter is only available from cluster 2. The above information leads to two correct behaviours of our agent:

– performing task one, the agent uses cluster 3; and
– executing task two, the agent requires access to clusters 1 and 2 in order to use the required database and mainframe.

To model such behaviour, we represent the agent as an answer set program capable of representing preference-based choices between various alternatives. The preference is established between components, or groups of rules. Components are linked to each other by a strict order denoting the preference relation between them. Information flows from less specific components to the more preferred ones until a conflict among alternatives arises, in which case the most specific one will be favoured. Alternatives are modelled by choice rules (rules that imply exclusive disjunction). In other words, an OCLP P is a pair ⟨C, ≺⟩ where C is a collection of components, containing a finite number of (choice) rules, and ≺ is a strict order relation on C. We use ⪯ to denote the reflexive closure of ≺. In the examples, we represent an OCLP as a directed acyclic graph (DAG), in which the nodes are the components and the arcs represent the relation ≺. For two components C1 and C2, C1 ≺ C2 is represented by an arrow going from C1 to C2, indicating that information from C1 takes precedence over C2.

Example 2. The agent mentioned in Example 1 can be easily modelled using an OCLP, as shown in Figure 1. The rules in components P1, P2 and P3 express the preferences between the clusters when only computing power is required. The rules in P4 indicate the grid computing problem and the required choice between the possible alternatives. In component P5, the first rule states the goals of the agent in terms of the tasks. The third and the fourth rules specify the resources, apart from computing power, needed for certain tasks. The last two rules express the availability of the extra resources in a cluster.

The semantics of an OCLP is determined using interpretations (sets of atoms that are assumed to be true, with atoms not in the interpretation being unknown). Given an interpretation, we call a rule A ← B applicable when the precondition B, the body, is true (all the elements in the body are part of the interpretation). A rule is applied when it is applicable and the consequence A, called the head, contains exactly one true atom. The latter condition is reasonable as rules with more than one element in their head represent decisions where only one alternative may be chosen. Using interpretations, we can reason about the different alternatives. Two atoms are considered to be alternatives with respect to an interpretation if and only if a choice between them is forced by the existence of a more specific and applicable choice rule with a head containing (at least) the two atoms. So given an atom a in a component C,
P1: cluster 1 ←

P2: cluster 2 ←

P3: cluster 3 ←

P4: grid ←
    cluster 1 ⊕ cluster 2 ⊕ cluster 3 ← grid

P5: task 1 ⊕ task 2 ←
    database ← task 2
    mainframe ← task 2
    cluster 1 ← database
    cluster 2 ← mainframe

[The arcs of the original DAG, giving the preference order among the components, are not reproducible here.]

Fig. 1. The Grid Computing Agent from Example 2
we can define the alternatives of a in that component C with respect to an interpretation I, written as Ω_C^I(a), as those atoms that appear together with a in the head of a more specific applicable choice rule.

Example 3. Reconsider Example 2. Let I and J be the following interpretations: I = {cluster 2, task 1} and J = {grid, cluster 1, task 1}. The alternatives for cluster 2 in P2 w.r.t. J are Ω_{P2}^J(cluster 2) = {cluster 1, cluster 2}. W.r.t. I, we obtain Ω_{P2}^I(cluster 2) = ∅, since the choice rule in P4 is not applicable. When we take P5 instead of P2, we obtain Ω_{P5}^J(cluster 2) = ∅ w.r.t. J, since the applicable choice rule is in a less preferred component, which makes it irrelevant to the decision process in the current component.

Unfortunately, interpretations are too general to convey the intended meaning of a program, as they do not take the information in the rules into account. Therefore, models are introduced. The model semantics for choice logic programs, the language used in the components [10], and for ASP is fairly simple: an interpretation is a model if and only if every rule is either not applicable (the body is false) or applied (i.e. the body is true and the head contains exactly one true head atom). For OCLP, taking individual rules into account is not sufficient: the semantics must also consider the preference relation. In cases where two or more alternatives of a decision are triggered, the semantics requires a mechanism to deal with it. When the considered alternatives are all more specific than the decision itself, the decision should be ignored. When this is not the case, the most specific one should be chosen. If this is impossible, because they are unrelated or
equally specific, an arbitrary choice is justified. This selection mechanism is referred to as defeating.

Example 4. Reconsider Example 2. The rule cluster 1 ← is defeated w.r.t. interpretation I = {cluster 2, grid} because of the applicable rule cluster 2 ←. W.r.t. interpretation J = {cluster 1, cluster 2, grid, database, mainframe} we have that the rule cluster 1 ⊕ cluster 2 ⊕ cluster 3 ← grid is defeated by the combination of the rules cluster 1 ← database and cluster 2 ← mainframe.

So we can define a model for an OCLP as an interpretation that leaves every rule either not applicable, applied or defeated. Unfortunately, in order to obtain the true meaning of a program, the model semantics tends to be too coarse. For traditional ASP, the Gelfond-Lifschitz transformation or reduct [11] was introduced to remove models containing unsupported assumptions. Interpretations that are the minimal model of their Gelfond-Lifschitz transformation are called answer sets, as they represent the true meaning of the program or, alternatively, the answer to the problem encoded by the program. Therefore, algorithms and their implementations for obtaining the answer sets of a program are often referred to as answer set solvers. For OCLP, a reduct transformation to obtain the answer sets of our programs is also required. The transformed logic program can be obtained by adding together all components of the OCLP. Then, all defeated rules are taken away, together with all false head atoms of choice rules. The remaining rules with multiple head atoms are transformed into constraints, assuring that only one of them can become true whenever the body is satisfied⁴.

Example 5. Reconsider the OCLP from Example 2. This program has two answer sets:

– M1 = {grid, cluster 1, task 2, database, mainframe, cluster 2} and
– M2 = {cluster 3, grid, task 1},

which matches our agent's intended behaviour.

In [3] it was shown that a bi-directional polynomial mapping exists between ordered choice logic programs and extended logic programs with respect to their answer set semantics. The mapping from OCLPs to normal logic programs is possible by introducing two new atoms for each rule in order to indicate that a rule is applied and defeated. For each rule in the OCLP, we generate a set of rules (the number of rules is equal to the number of head atoms) in the logic program that become applicable when the original rule is applied. A rule is also created for each possible situation in which the original rule could become defeated. Finally, one rule is added that becomes applicable when the original rule is applied and not defeated. Constraints to make sure that the head elements cannot be true at the same time are not necessary. The reverse mapping is established by creating an OCLP with three components placed in a linear formation. The least preferred one establishes negation as failure. The middle component contains all the rules from the original program. The most preferred
⁴ The relationship between those two definitions of reduct clearly conveys that order and negation are interchangeable, and explains why negation can be easily embedded in OCLP [8].
makes sure that for each pair a, not a only one can be true at any time, without giving a a chance to be true without reason in the middle component. These polynomial mappings demonstrate that the complexity of both systems is identical (more information on the complexity aspects can be found in [2]). Having a polynomial mapping to a traditional logic program makes it possible to implement a front-end to one of the existing answer set solvers like Smodels [12] or DLV [13]. OCT [3] is such a front-end.
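To fix intuitions before moving on to the multi-agent setting, here is a deliberately simplified Python sketch (ours; the preference order and the defeat machinery are omitted) of the basic applicability notions on the grid-computing example:

# Rules are (head_atoms, body_atoms) pairs; a component is a list of rules.
components = {
    "P3": [({"cluster3"}, set())],
    "P4": [({"grid"}, set()),
           ({"cluster1", "cluster2", "cluster3"}, {"grid"})],
}

def applicable(rule, I):
    # a rule is applicable when its whole body is true in I
    _, body = rule
    return body <= I

def applied(rule, I):
    # applied: applicable, and exactly one head atom is true in I
    head, _ = rule
    return applicable(rule, I) and len(head & I) == 1

I = {"grid", "cluster3", "task1"}      # cf. answer set M2 of Example 5
for name, rules in components.items():
    for rule in rules:
        print(name, rule, applicable(rule, I), applied(rule, I))
# Every listed rule is applied w.r.t. I; in particular the choice rule in
# P4 has exactly one true head atom (cluster3), as required.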
4 The LAIMA Framework

In this section we consider systems of communicating agents, where each agent is represented by an OCLP that contains knowledge and beliefs about itself, its goals, the environment and other agents, and the mechanism to reason about them. We assume that agents are fully aware of the agents that they can communicate with, as the communication structure is fixed, and that they communicate by passing sets of atoms over uni-directional channels.

A LAIMA system is a triple ⟨A, C, L⟩ containing a set A of agents, a set L of atoms representing the communication language of the system, and a relation C, a subset of A × A × 2^L, representing the communication channels between the agents and the filters they use when passing information. The filter tells the listening agent which sort of information it can expect, if any. Furthermore, with each agent a ∈ A we associate an OCLP Pa. In examples we use the more intuitive representation of a graph. The set L is formed by all the atoms appearing in the OCLPs associated with the agents. The filter is shown next to the arrow leaving the transmitting agent. In order not to clutter the image, we assume that if no filter is present the agent could potentially transmit every atom of L.

Example 6. The system in Figure 2 displays a multi-agent system where seven agents "cooperate" to solve a murder case. Two witnesses, agents Witness 1 and Witness 2, provide information to the Inspector agent, which is called to the scene to establish that a murder took place. Both witnesses will only pass on information relevant to the crime scene. Information from the witnesses is passed to the Officer agent for further processing. Depending on the information provided, she can decide to question the three suspect agents. If a questioned suspect cannot provide an alibi, the Officer can accuse that suspect. Of course, the Officer will only pass on selective information about the case to the Suspects. The suspects, for their part, have no intention of telling more than whether or not they have an alibi.

For OCLP, we defined the notion of interpretation to give a meaning to all the atoms in our program. For our LAIMA systems we have a number of agents that can hold sets of beliefs which do not necessarily have to be the same. It is perfectly acceptable for two agents (e.g. humans) to agree to disagree. To allow this in our system, we need interpretations to be functions taking an agent as input and returning an interpretation of the agent's updated version as output.
Witness 1:
    blond ←
    tall ←
    hate police ←
    (output filter to Inspector: {tall, blond})

Witness 2:
    male ←
    tall ←
    (output filter to Inspector: {tall, male})

Inspector:
    murder ←

Officer:
    quest(a) ← tall, male
    quest(b) ← tall, male, blond
    quest(c) ← tall, female, blond
    accuse(X) ← noalibi(X)
    (output filters to the suspects: {quest(a), accuse(a)}, {quest(b), accuse(b)}, {quest(c), accuse(c)})

SuspectA:
    alibi(a) ← quest(a)
    (output filter to Officer: {alibi(a), noalibi(a)})

SuspectB:
    noalibi(b) ← quest(b)
    (output filter to Officer: {alibi(b), noalibi(b)})

SuspectC:
    noalibi(c) ← quest(c)
    (output filter to Officer: {alibi(c), noalibi(c)})

[The arrows of the original graph, giving the direction of the communication channels, are rendered here as annotations on each agent.]

Fig. 2. The Murder LAIMA of Example 6
Example 7. Consider the Murder LAIMA of Example 6. Then the function I with:

– I(Witness 1) = {hate police, blond, tall}
– I(Witness 2) = {male, tall}
– I(Inspector) = {murder, male, blond, tall}
– I(Officer) = {murder, blond, tall, male, quest(a), quest(b), accuse(b), alibi(a), noalibi(b)}
– I(SuspectA) = {quest(a), alibi(a)}
– I(SuspectB) = {quest(b), noalibi(b)}
– I(SuspectC) = {}
is an interpretation for this system.

In the current setting, an agent's output is not only dependent on the agent's current beliefs, defined by an interpretation, but also on the recipient of this information. The information sent is determined by the intersection of the agent's beliefs and the filter used for communication to the agent it is sending information to (Out_I^b(a) = I(a) ∩ F with (a, b, F) ∈ C). On the other hand, an agent receives as input the output of all agents connected to its incoming channels (In_I(b) = ∪_{(a,b,F)∈C} Out_I^b(a)). An agent reasons on the basis of positive information received from other agents (its input) and its own program that may be used to draw further conclusions, possibly
student:
    workHard ← pass
    pass ⊕ badLuck ← attend, workHard

teacher:
    workHard ← badLuck
    attend ← badLuck
    attend ← pass

Fig. 3. The Exam LAIMA of Example 9
contradicting incoming information. In this framework, we made the decision that an agent attaches a higher preference to its own rules than to suggestions coming from outside. This can be conveniently modelled by extending an agent's ordered program with an extra "top" component containing the information gathered from its colleagues. This new OCLP is referred to as the updated version of the agent. This way, the OCLP semantics will automatically allow for defeat of incoming information that does not fit an agent's own program.

When modelling rational agents, we assume that agents only forward information of which they can be absolutely sure. Therefore, it is only sensible to require that agents modelled as OCLPs communicate in terms of answer sets. When an interpretation produces an answer set for each agent's updated version, we call it a model for the LAIMA system. Given that one agent can have multiple answer sets of its updated version, LAIMA generates an extra source of non-determinism, providing connecting agents with the different possible views it holds.

Example 8. Consider the Murder LAIMA of Example 6 and the interpretation I from Example 7. Then I is a model for this LAIMA and corresponds with the intuition of the problem.

The model semantics provides an intuitive semantics for systems without cycles. Having cycles allows assumptions made by one agent to be enforced by another agent.

Example 9. Consider the LAIMA in Figure 3, where a student and a teacher discuss issues about the relationship between attending the lecture, working hard, passing the unit or having bad luck. This system has three models:

– M(teacher) = M(student) = ∅
– N(teacher) = N(student) = {attend, workHard, pass}
– S(teacher) = S(student) = {attend, workHard, badLuck}

Nothing in the OCLP program suggests that an exam has taken place or that the agents are not just hypothesising. Therefore, the only sensible situation is represented by interpretation M.

To avoid such self-sustaining propagation of assumptions, we require that the semantics is the result of a fixpoint procedure which mimics the evolution of the belief
set of the agents over time. Initially, all agents start off with empty input from the other agents, stating that they currently hold no beliefs about the other agents. At every stage, agents receive the input generated by the previous cycle to update their current belief set. This evolution stops when a fixpoint (no agent changes the beliefs it had from the previous cycle) is reached. This final interpretation is then called an evolutionary answer set.

More formally, a sequence I0, . . . , In of interpretations is an evolution of a LAIMA F if for all agents a and i > 0 we have that I_{i+1}(a) is an answer set of the updated program of a with the input of the current interpretation, In_{I_i}(a). An interpretation I is an evolutionary fixpoint of F w.r.t. an interpretation I0 iff there exists an evolution I0, . . . and an integer i ∈ N such that Ij = Ii = I for all j > i. An evolutionary answer set of F is an evolutionary fixpoint of F w.r.t. the empty interpretation I∅ (which associates the empty set with each agent).

Thus, in an evolution, the agents update their beliefs and intentions as more information becomes available: at each phase of the evolution, an agent updates its program to reflect input from the last phase and computes a new set of beliefs. An evolution thus corresponds to the way decision-makers try to get a feeling about the other participants. The process of reaching a fixpoint boils down to trying to get an answer to the question "if I do this, how would the other agents react?", while trying to establish a stable compromise. Note that the notion of evolution is non-deterministic since an agent may have several local models. For a fixpoint, it suffices that all agents can maintain the same set of beliefs as in the previous stage.

Example 10. The LAIMA of Example 9 has exactly one evolutionary answer set, namely M, just as we hoped. In iteration 1, both agents receive no input and both produce empty answer sets as output. Hence the input in iteration 2 is also empty for both agents, obviously resulting in the same output. Since both iterations are exactly the same, we have reached a fixpoint.

The LAIMA of Example 6 also has one evolutionary answer set: the model mentioned in Example 7. The evolution producing this answer set is slightly more interesting; we leave it to the reader to construct it in detail. In the first iteration, the witnesses produce their facts, which are received by the Inspector in the second iteration. In the second iteration, both witnesses and the Inspector complete their belief sets, which are reported to the Officer for use in the third iteration. In this iteration, the Officer decides which suspects to question. The questioning is done in the fourth iteration. In the fifth iteration, the Officer solves the crime. SuspectB will be notified of being accused during the sixth iteration.
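Schematically, the evolutionary semantics can be pictured as the following loop (our sketch; local_model abstracts the call to an answer set solver such as OCT/Smodels for one answer set of the agent's updated OCLP, and the channel table is a toy stand-in):

agents = ["teacher", "student"]
channels = {("teacher", "student"): {"attend", "workHard", "pass", "badLuck"},
            ("student", "teacher"): {"attend", "workHard", "pass", "badLuck"}}

def local_model(agent, incoming_atoms):
    # Placeholder for 'one answer set of the agent's updated OCLP given
    # this input'; here we hard-wire the empty-input case of Example 10.
    return set() if not incoming_atoms else set(incoming_atoms)

def out(I, a, b):
    # Out_I^b(a) = I(a) ∩ F for the channel (a, b, F)
    return I[a] & channels.get((a, b), set())

def incoming(I, b):
    # In_I(b): union of the filtered outputs on b's incoming channels
    return set().union(*(out(I, a, b) for a in agents if (a, b) in channels))

# Evolution from the empty interpretation until a fixpoint is reached.
I = {a: set() for a in agents}
while True:
    J = {a: local_model(a, incoming(I, a)) for a in agents}
    if J == I:
        break
    I = J
print(I)   # fixpoint: both agents keep the empty belief set

For the Exam LAIMA, the loop stops immediately at the empty interpretation, which is exactly the evolutionary answer set M of Example 10.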
5 JOF: The LAIMA Implementation

The theoretical multi-agent architecture LAIMA described in the previous section has been implemented as the JADE OCLP Framework (JOF). We have opted for JADE as the MAS architecture, Protégé [6] for the ontology and OCT [3] as the answer set solver. The choice of working within the JADE framework [5] and the use of the relevant FIPA [14] specifications allows for an architecture based on behaviours. The agents can be implemented at a high level by defining their behaviours and reactions to events.
An ontology is required for the grounding of the concepts and relationships within JOF, to smooth out the communication between the various agents in the system, and to allow easy expansion to different types of agents. JOF is based on a simple client-server model, using fat clients, where the clients are responsible for most of the computation. The implementation has two main agents or roles: the Coordinator and the JOF agents. Using the GAIA methodology [15] of defining roles by four attributes (responsibilities, permissions, activities and protocols), we have defined the collection of behaviours for our two main agents.

The Coordinator contains the graphical user interface for the application and is designed to be the interface between the human user and the MAS. When initialised, the Coordinator agent will simply start up the user interface, awaiting user input. The Coordinator is capable of ordering agents to perform a cycle, and of updating and retrieving their information. It also maintains a list of agents that can be edited by the user, along with other administrative actions. The JOF agents are designed to also run autonomously. This distinction is important, as it allows the option of running JOF without the need for a human supervisor. Agents are then able to subscribe to the Coordinator or to each other, and receive periodic information in the form of answer sets.

It is possible to run JOF in two different modes: retained and runtime. Retained mode allows the user to add, remove or modify any of the agents in the JOF environment. All agents running in retained mode are controlled by the Coordinator agent. Runtime mode provides the autonomous functionality and events to fire upon relevant stages in the OCLP agent life-cycle, but also gives a developer the means to alter all of the relevant information of the local agents.

The JOF ontology gives a basis for the representation of different objects and functionality in the JOF environment. For example, the ontology provides for fundamental notions like 'rule' or 'OCLP program'. There are three main sections that we can identify in the ontology: OCLP programs, message passing and the knowledge base. The ontology itself is declared hierarchically using the BeanGenerator within Protégé and also encapsulates a number of message classes, such as the answer set broadcast and the Coordinator's information request and retrieval messages.

In order to interact correctly with OCT to provide the answer sets, JOF is required to produce valid OCT input, parse OCT result output and deal with OCT error messages when they occur. Valid input would be an OCLP program as a string, satisfying OCT syntax; output would be an answer set broadcast. It is possible either to run OCT synchronously and wait for its output, or to run it as a thread which fires an event when finished.

JOF agents need a way of knowing when it is time to process their inputs, which is the motivation for the Call For Answer Sets (CFAS) protocol. This controls the life-cycle of the JOF agents and is sent with a given deadline, after which agents that have not received input will carry on using the empty set as input. This deadline is required in the event that the communication between agents fails due to technical problems or agents become inactive. Once all responses have been received, the solution to the updated OCLP is computed immediately.

JOF is implemented with LAIMA systems in mind; however, it is not limited to them.
JOF can be seen as an open society where agents come and go and communication is established as necessary. The architecture also allows for different types of agents to join, provided they communicate using the ontology developed.
6 Playing Games

In this section, we demonstrate that LAIMA and its implementation JOF are an elegant and intuitive mechanism for reasoning about games and their equilibria. Due to page limitations, we limit the scope to extensive games with perfect information [4] and their Nash equilibria. An extensive game is a detailed description of a sequential structure representing the decision problems encountered by agents (players) in strategic decision making (agents are capable of reasoning about their actions in a rational manner). The agents in the game are perfectly informed of all events that have previously occurred. Thus, they can decide upon their action(s) using information about the actions which have already taken place. This is done by passing histories of previous actions to the deciding agents. Terminal histories are obtained when all the agents/players have made their decision(s). Players have a preference for certain outcomes over others. Often, preferences are indirectly modelled using the concept of payoff, where players are assumed to prefer outcomes in which they receive a higher payoff. Therefore, an extensive game with perfect information is a 4-tuple, denoted ⟨N, H, P, (≤i)i∈N⟩, containing the players N of the game, the histories H, a player function P telling whose turn it is after a certain history, and a preference relation ≤i for each player i over the set of terminal histories. For convenience, in the following example we use a tree representation where the small circle at the top represents the initial history. Each path starting at the top represents a history. Terminal histories are the paths ending in a leaf node. The numbers next to the nodes represent the players, while the labels of the arcs represent actions. The numbers below the terminal histories are payoffs representing the players' preferences (the first number is the payoff of the first player, the second number is the payoff of the second player, etc.).

Example 11. An agent responsible for processing large chunks of data contemplates using an existing cluster (and paying rent for it) or buying a new cluster. The cluster supplier has the option to offer his client the full power of the cluster or not. Given full power, the agent can decide to use this service just once or many times. The game depicted in Figure 4 models the agent's predicament.
Fig. 4. The Cluster Use Game of Example 11
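For concreteness, the structure of this game tree can be transcribed directly; the following is an illustrative Prolog rendering (our own sketch, not part of JOF), where the player names and payoff terms are symbolic placeholders since the text fixes the action structure but not the numeric payoffs:

% Player function P: whose turn it is after each non-terminal history
player(client).   player(supplier).

turn([],           client).    % rent an existing cluster or buy a new one
turn([rent],       supplier).  % offer full or reduced power
turn([rent, full], client).    % use the service once or multiple times

% Terminal histories with their (symbolic) payoff profiles
terminal([buy],                  payoffs(p1_client, p1_supplier)).
terminal([rent, reduced],        payoffs(p2_client, p2_supplier)).
terminal([rent, full, once],     payoffs(p3_client, p3_supplier)).
terminal([rent, full, multiple], payoffs(p4_client, p4_supplier)).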
A strategy of a player in an extensive game is a plan that specifies the actions chosen by the player for every history after which it is her turn to move. A strategy profile contains a strategy for each player. The first solution concept for an extensive game with perfect information ignores the sequential structure of the game; it treats the strategies as choices that are made once and for all before the actual game starts. A strategy profile is a Nash equilibrium if no player can unilaterally improve upon his choice. Put another way, given the other players' strategies, the strategy stated for the player is the best this player can do.

Example 12. The game of Example 11 has two equilibria:
– {{rent, many}, {full}}, and
– {{rent, once}, {reduced}}

Given an extensive game with perfect information, a polynomial mapping to a LAIMA exists such that the Nash equilibria of the former can be retrieved as the answer sets of this system. The evolution leading to those evolutionary answer sets models exactly how players in a game reason to find the answer sets. So LAIMA are not only useful for calculating the equilibria, but they also provide a mechanism to monitor the change of the players' beliefs. An agent is created for each player in the game. The OCLP of such an agent contains as many components as the represented player has payoffs (step 1). The order among the components follows the expected payoff: higher payoffs correspond to more specific components (step 2).
Figure 5 shows the LAIMA as two agents connected to the cluster agent, with the following OCLP components:

First agent (the client):
  C01: buy ←
  C11: rent ← full    once ← full
  C21: rent ← reduced
  C31: rent ← full    multiple ← full    rent ⊕ buy ←    multiple ⊕ once ←
Second agent (the supplier):
  C12: full ← rent, once    reduced ← rent
  C32: full ← rent, multiple    full ⊕ reduced ←

Fig. 5. The LAIMA corresponding to the Cluster Game (Example 13)
The various actions a player can choose from at a certain stage of the game are turned into a choice rule which is placed in the most specific component of the agent modelling the player making the decision (step 3). Since Nash equilibria do not take into account the sequential structure of the game, players have to decide upon their strategy before starting the game, leaving them to reason about past and future at each decision point. This is reflected in the rules (step 4): each non-choice rule is made out of a terminal history (a path from top to bottom in the tree) where the head represents the action taken by the player/agent, when considering the past and future created by the other players according to this history. The component of the rule corresponds to the payoff the deciding player would receive in case the history was actually followed.

Example 13. If we apply this technique to the game in Example 11 we obtain the multi-agent system depicted in Figure 5. The two evolutionary answer sets of the system match the two Nash equilibria (Example 12) perfectly.
7 Conclusions and Directions for Future Research

The deductive multi-agent architecture described in this paper uses OCLP to represent the knowledge and reasoning capacities of the agents involved. However, the architecture and its behaviour are designed to be able to deal with any extension of traditional answer set programs; [2] gives an overview of some of the language extensions currently available. There is an emerging view [8, 16, 17, 18, 19] that ASP, with or without preferences or other language constructs, is a very promising technology for building MAS. It satisfies the perspective that agent programming should be about describing what the goals are, not how they should be achieved – i.e. "post-declarative" programming [20]. Furthermore, having the same language for specification and implementation makes verification much simpler: thanks to the formal grounding of all its language constructs, it becomes easy to verify the soundness and completeness of your language and implementation. A number of multi-agent platforms currently exist that use logic programming languages as their basic constructs. The DALI project [21] is a complete multi-agent platform entirely written in Prolog. The EU-funded research project SOCS ([22, 23]) has constructed a multi-agent architecture based solely on logic programming components, each responsible for a part of the reasoning of the agent. Using ASP, it would be possible to use the same language for all components, avoiding duplication of domain description and knowledge. In the Minerva architecture [19], the authors build their agents out of subagents that work on a common knowledge base written as an MDLP (Multi-Dimensional Logic Program), which is an extension of Dynamic Logic Programming. It can be shown that MDLP can easily be translated into OCLP such that their stable models match our answer sets. The advantage of using OCLP is that one has a more flexible defeating structure: one is not restricted to decisions comprising only two alternatives. Furthermore, decisions are dynamic in our system. On the other hand, the Minerva system does not restrict itself to modelling the beliefs of agents, but allows for full BDI-agents that can plan towards a certain goal. It would be interesting to see what results we would obtain if the OCLP approach were incorporated into Minerva, and what results we would obtain by incorporating Minerva into LAIMA.
The flexibility on the specification side of ASP is counter-balanced by the technology of current (grounding) answer set solvers, which means that a major part of the solution space is instantiated and analysed before any computation of a solution begins. The advantage of this is that the computation is staged [24] into a relatively expensive part that can be done off-line and a relatively cheap part that is executed on-line in response to queries. For example, [17] reports that the Space Shuttle engine control program is 21 pages, of which 18 are declarations and 3 are rules describing the actual reasoning. Grounding takes about 80% of the time, while the remaining 20% is used for the actual computation of the answer sets and answering queries. There is a parallel between the way answer set solvers compute the whole solution space and model-checking. In our framework, an agent is given an initial set of beliefs and a set of reasoning capacities. Each time the agent receives information from the outside world, it will update its knowledge/belief base and compute the answer sets that go with it. This means that for even the smallest change the whole program has to be recomputed (including the expensive grounding process). To provide some additional flexibility, a recent development [25] describes a technique for incremental answer set solving. With an incremental solver that permits the assertion of new rules at run-time we are in a position to move from purely reactive agents to deliberative agents that can support belief-desire-intention style models of mentality. We aim to shortly replace the fixed solver (OCT) we have built into JADE and described here with the incremental one (IDEAS). The current implementation of grounding in ASP limits its use to finite domains. Clearly, a fixed solution space is quite limiting in some domains, but reassuring in others. OCLP significantly simplifies the development of such systems by constructing a finite domain. To go beyond the grounding solver is currently one of the key challenges, together with the more practical aspects of ASP like updates, methodology, and debugging. The advantage of an unground solver is that the initialisation overheads are significantly reduced compared to a grounding solver, since the only initialisation task is the removal of redundant rules. However, the disadvantage is that the finite space for checking (discussed above) is lost: the price of flexibility is a loss in verifiability. We are currently working on a lazy grounding solver, as a result of which we believe we will be in a position to offer a very high-level programming language for the construction of practical reasoning agents. Thus, not only will it be possible to extend the knowledge base of the agent dynamically, but it will also enable the solution of problems over infinite domains involving, for example, time (discrete or continuous) or numbers. It seems likely that the next few years will see significant developments in answer set solver technology, deploying both incremental and unground solvers. We also foresee future improvements to the framework itself. Currently, all our agents take input only from other agents and they communicate positive information, but it would be very beneficial if they could communicate negative information too.
Unfortunately, this may lead to contradictions, but various strategies to deal with them come to mind, such as removing conflicting information, introducing trust between agents, or allowing another sort of non-determinism by attempting to initiate a cycle for both possibilities. The LAIMA system and its answer set semantics were developed using OCLP as a way to model the agents. However, the system could also be used for other agent characterisations. In the paper, we assumed that agents will always prefer their own beliefs over the information passed on by others. We are currently working on an extension in which trust levels are used to allow agents to distinguish between various sources and deal with them appropriately. In this system we assumed that the environment would be represented as an agent. Normally, information passed from the environment should be considered more accurate than an agent's current beliefs. Using these trust levels and OCLP, it would be easy to model this behaviour: each input is assigned its own component, with an ordering depending on the trust level of the supplier. Another problem for the moment is feeding back failure into the system. When an agent fails to produce an answer set for its updated version, communication will stop at this agent, without warning the others. From both a theoretical and an implementation point of view this raises interesting possibilities. In the near future, we aim to use LAIMA and its extensions for the development of larger systems. One of our goals is to try to incorporate the ALIAS [26] system, an agent architecture for legal reasoning based on abductive logic, into ours.
References

1. OpenCyc: http://www.cyc.com/opencyc
2. Baral, C.: Knowledge Representation, Reasoning and Declarative Problem Solving. Cambridge University Press (2003)
3. Brain, M., De Vos, M.: Implementing OCLP as a Front-End for Answer Set Solvers: From Theory to Practice. In: ASP03: Answer Set Programming: Advances in Theory and Implementation, CEUR-WS (2003) online CEUR-WS.org/Vol-78/asp03-final-brain.ps
4. Osborne, M.J., Rubinstein, A.: A Course in Game Theory. Third edn. The MIT Press, Cambridge, Massachusetts, London, England (1996)
5. JADE: http://jade.tilab.com/
6. Protégé: http://protege.stanford.edu/
7. De Vos, M.: Implementing Ordered Choice Logic Programming using Answer Set Solvers. In: Third International Symposium on Foundations of Information and Knowledge Systems (FoIKS'04). Volume 2942, Vienna, Austria, Springer Verlag (2004) 59–77
8. De Vos, M., Vermeir, D.: Extending Answer Sets for Logic Programming Agents. Annals of Mathematics and Artificial Intelligence 42 (2004) 103–139. Special Issue on Computational Logic in Multi-Agent Systems.
9. Denecker, M.: What's in a Model? Epistemological Analysis of Logic Programming. CEUR-WS (2003) online CEUR-WS.org/Vol-78/
10. De Vos, M., Vermeir, D.: On the Role of Negation in Choice Logic Programs. In Gelfond, M., Leone, N., Pfeifer, G., eds.: Logic Programming and Non-Monotonic Reasoning Conference (LPNMR'99). Volume 1730 of Lecture Notes in Artificial Intelligence, El Paso, Texas, USA, Springer Verlag (1999) 236–246
11. Gelfond, M., Lifschitz, V.: The Stable Model Semantics for Logic Programming. In: Proc. of the Fifth Logic Programming Symposium, MIT Press (1988) 1070–1080
12. Niemelä, I., Simons, P.: Smodels: An Implementation of the Stable Model and Well-Founded Semantics for Normal LP. In Dix, J., Furbach, U., Nerode, A., eds.: Proceedings of the 4th International Conference on Logic Programming and Nonmonotonic Reasoning. Volume 1265 of LNAI, Berlin, Springer (1997) 420–429
13. Eiter, T., Leone, N., Mateis, C., Pfeifer, G., Scarcello, F.: The KR System dlv: Progress Report, Comparisons and Benchmarks. In Cohn, A.G., Schubert, L., Shapiro, S.C., eds.: KR'98: Principles of Knowledge Representation and Reasoning. Morgan Kaufmann, San Francisco, California (1998) 406–417
14. FIPA: http://www.fipa.org/
15. Wooldridge, M., Jennings, N.R., Kinny, D.: The Gaia Methodology for Agent-Oriented Analysis and Design. Autonomous Agents and Multi-Agent Systems 3 (2000) 285–312
16. Dix, J., Eiter, T., Fink, M., Polleres, A., Zhang, Y.: Monitoring Agents using Declarative Planning. Fundamenta Informaticae 57 (2003) 345–370. A short version appeared in Günther/Kruse/Neumann (eds.), Proceedings of KI 03, LNAI 2821, 2003.
17. Nogueira, M., Balduccini, M., Gelfond, M., Watson, R., Barry, M.: An A-Prolog Decision Support System for the Space Shuttle. In: Answer Set Programming: Towards Efficient and Scalable Knowledge Representation and Reasoning. American Association for Artificial Intelligence Press, Stanford (Palo Alto), California, US (2001)
18. Gelfond, M.: Answer Set Programming and the Design of Deliberative Agents. In: Procs. of the 20th International Conference on Logic Programming. Number 3132 in Lecture Notes in Computer Science (2004) 19–26
19. Leite, J.A., Alferes, J.J., Pereira, L.M.: Minerva - A Dynamic Logic Programming Agent Architecture. In Meyer, J.J., Tambe, M., eds.: Intelligent Agents VIII. Number 2002 in LNAI, Springer-Verlag (2002) 141–157
20. Wooldridge, M.J., Jennings, N.R.: Agent Theories, Architectures and Languages: A Survey. In Wooldridge, M.J., Jennings, N.R., eds.: Intelligent Agents; ECAI-94 Workshop on Agent Theories, Architectures, and Languages (Amsterdam 1994). Volume 890 of LNCS (LNAI), Berlin: Springer (1994) 1–39
21. Costantini, S., Tocchio, A.: A Logic Programming Language for Multi-Agent Systems. In: Logics in Artificial Intelligence, Proceedings of the 8th European Conference, JELIA 2002. Volume 2424 of Lecture Notes in Artificial Intelligence, Cosenza, Italy, Springer-Verlag, Germany (2002)
22. Kakas, A., Mancarella, P., Sadri, F., Stathis, K., Toni, F.: Declarative Agent Control. In Leite, J., Torroni, P., eds.: 5th Workshop on Computational Logic in Multi-Agent Systems (CLIMA V) (2004)
23. SOCS: http://lia.deis.unibo.it/research/socs/
24. Jørring, U., Scherlis, W.: Compilers and Staging Transformations. In: Proceedings of the 13th ACM Symposium on Principles of Programming Languages, New York, ACM (1986) 86–96
25. Brain, M.J.: Undergraduate Dissertation: Incremental Answer Set Programming. Technical Report 2004–05, University of Bath, Bath, U.K. (2004)
26. Ciampolini, A., Torroni, P.: Using Abductive Logic Agents for Modeling the Judicial Evaluation of Criminal Evidence. Applied Artificial Intelligence 18 (2004) 251–275
A Distributed Architecture for Norm-Aware Agent Societies

A. García-Camino1, J.A. Rodríguez-Aguilar1, C. Sierra1, and W. Vasconcelos2

1 Institut d'Investigació en Intel·ligència Artificial, CSIC, Campus UAB, 08193 Bellaterra, Catalunya, Spain
{andres, jar, sierra}@iiia.csic.es
2 Dept. of Computing Science, Univ. of Aberdeen, Aberdeen AB24 3UE, UK
[email protected]

Abstract. We propose a distributed architecture to endow multi-agent systems with a social layer in which normative positions are explicitly represented and managed via rules. Our rules operate on a representation of the states of affairs of a multi-agent system. We define the syntax and semantics of our rules and an interpreter; we achieve greater precision and expressiveness by allowing constraints to be part of our rules. We show how the rules and states come together in a distributed architecture in which a team of administrative agents employ a tuple space to guide the execution of a multi-agent system.
1 Introduction
Norms (i.e., obligations, permissions and prohibitions) capture an important aspect of heterogeneous multi-agent systems (MASs) – they constrain and influence the behaviour of individual agents [1, 2, 3] as they interact in pursuit of their goals. In this paper we propose a distributed architecture built around an explicit model of the norms associated with a society of agents, consisting of:
– an information model storing the normative positions of MASs' individuals;
– a rule-based representation of how normative positions are updated during the execution of a MAS;
– a distributed architecture with a team of administrative agents to ensure normative positions are complied with and updated.
A normative position [4] is the "social burden" associated with an agent, that is, their obligations, permissions and prohibitions. We show in Fig. 1 our proposal and how its components fit together. Our architecture provides a social layer for multi-agent systems specified via electronic institutions (EI, for short) [5]. EIs specify the kinds and order of interactions among software agents with a view to achieving global and individual goals – although our study here concentrates on EIs, we believe our ideas can be adapted to alternative frameworks. In our diagram we show a tuple space in which information models ∆0, ∆1, . . . are stored – these models are called institutional states (explained in Section 3) and contain all norms and other information that hold at specific points of time during the EI enactment.
The normative positions of agents are updated via institutional rules (described in Section 4). These are constructs of the form LHS ⇝ RHS, where LHS describes a condition of the current information model and RHS depicts how it should be updated, giving rise to the next information model. Our architecture is built around a shared tuple space [6] – a kind of blackboard system that can be accessed asynchronously by different administrative agents.

Fig. 1. Proposed Architecture (the electronic institution as a tuple space storing ∆0, ∆1, . . ., with the institutional agent IAg above it, the governor agents GAg below it, and the external agents EAg attached to the governors)

In our diagram our administrative agents are shown in grey: the institutional agent updates the institutional state using the institutional rules; the governor agents work as "escorts" or "chaperons" to the external, heterogeneous software agents, writing onto the tuple space the messages to be exchanged. In the next Section we introduce electronic institutions. In Section 3 we introduce our information model: the institutional states. In Section 4 we present the syntax and semantics of our institutional rules and how these can be implemented as a logic program; in that section we also give practical examples of rules. We provide more details of our architecture in Section 5. In Section 6 we contrast our proposal with other work and in Section 7 we draw some conclusions, discuss our proposal, and comment on future work.
1.1 Preliminary Concepts
We need to initially define some basic concepts. Our building blocks are first-order terms (denoted as T) and implicitly universally quantified atomic formulae (denoted as A) without free variables. We make use of numbers and arithmetic functions to build terms; arithmetic functions may appear infixed, following their usual conventions. We adopt Prolog's convention [7] and use strings starting with capital letters to represent variables and strings starting with lowercase letters to represent constants. We also employ arithmetic relations (e.g., =, ≠, and so on) as predicate symbols, and these will appear in their usual infix notation with their usual meaning. Atomic formulae with arithmetic relations represent constraints on their variables:

Definition 1. A constraint C is of the form T ⊛ T′, where ⊛ ∈ {=, ≠, >, ≥, <, ≤}.

?- satisfy((… #> Z+5, Z #=< 100, Z #> 30), Γ).
Γ = [[X]-(X in inf..68), [Y]-(Y in 37..119), [Z]-(Z in 31..100)]

The representation LimInf..LimSup is a syntactic variation of our expanded constraints. We can thus translate Γ above as {−∞ < X < 68, 37 < Y < 119, 31 < Z < 100}. Our proposal hinges on the existence of the satisfy relationship, which can be implemented differently: it is only important that it should return a set of partially solved expanded constraints.
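As an indication of how such a satisfy relation could be realised, the following is a minimal sketch on top of the SICStus clpfd library (our rendering, not the authors' code): it posts the constraints and then reads back the pruned bounds of every variable via fd_min/2 and fd_max/2, which report inf and sup for unbounded domains.

:- use_module(library(clpfd)).

% satisfy(+Constraints, -Gamma): post the finite-domain constraints and
% collect each variable's narrowed domain in the shape shown above.
satisfy(Constraints, Gamma) :-
    call(Constraints),
    term_variables(Constraints, Vs),
    collect_domains(Vs, Gamma).

collect_domains([], []).
collect_domains([V|Vs], [[V]-(V in Min..Max)|Rest]) :-
    fd_min(V, Min),   % 'inf' when unbounded below
    fd_max(V, Max),   % 'sup' when unbounded above
    collect_domains(Vs, Rest).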
The importance of the expanded constraints is that they allow us to precisely define when the constraints on the LHS of the rule hold in the current institutional state, as captured by ⊑:

Definition 7. Γ1 ⊑ Γ2 holds iff satisfy(Γ1, Γ1′) and satisfy(Γ2, Γ2′) hold and for every constraint (⊥1 < X < ⊤1) in Γ1′ there is a constraint (⊥2 < X < ⊤2) in Γ2′, such that max(⊥1, ⊥2) ≥ ⊥1 and min(⊤1, ⊤2) ≤ ⊤1, where ⊥i, ⊤i, i = 1, 2, are arbitrary values.

That is, all variables in Γ1 must be in Γ2 (with possibly other variables not in Γ1) and i) the maximum value for these variables in Γ1′, Γ2′ must be greater than or equal to the maximum value of that variable in Γ1′; ii) the minimum value for these variables in Γ1′, Γ2′ must be less than or equal to the minimum value of that variable in Γ1′. We make use of this relationship to define that the constraints of a rule hold in an institutional state if they further limit the values of the existing constrained variables. We now proceed to define the semantics of an institutional rule. In the definitions below we rely on the concept of substitution, that is, the set of values for variables in a computation [7, 12]:

Definition 8. A substitution σ is a finite, possibly empty set of pairs Xi/Ti, 0 ≤ i ≤ n. The application of σ to a formula A follows the usual definition [12]:
1. c · σ = c for a constant c.
2. X · σ = T · σ if X/T ∈ σ; if X/T ∉ σ then X · σ = X.
3. pn(T0, . . . , Tn) · σ = pn(T0 · σ, . . . , Tn · σ).

We now define when the LHS matches an institutional state:

Definition 9. sl(∆, LHS, σ) holds depending on the format of LHS:
1. sl(∆, (A ∧ LHS), σ1 ∪ σ2) holds iff sl(∆, A, σ1) and sl(∆, LHS, σ2) hold.
2. sl(∆, ¬LHS, σ) holds iff sl(∆, LHS, σ) does not hold.
3. sl(∆, B, σ) holds iff B · σ ∈ ∆, constrs(∆, Γ) and satisfy(Γ · σ, Γ′) hold.
4. sl(∆, C, σ) holds iff constrs(∆, Γ) and {C · σ} ⊑ Γ hold.
Case 1 depicts how substitutions are combined to provide the semantics for conjunctions in the LHS. Case 2 addresses the negation operator. Case 3 states that an atomic formula (which is not a constraint) holds in ∆ if it is a member of ∆ and the constraints on the variables of ∆ hold under σ. Case 4 deals with a constraint: we apply σ to it (thus reflecting the values of matchings of other atomic formulae), then check whether the constraint can be included in the state. We want our institutional rules to be exhaustively applied to the institutional state. We thus need the relationship s∗l(∆, LHS, Σ), which uses sl above to obtain in Σ = {σ0, . . . , σn} all possible matches of the left-hand side of a rule:

Definition 10. s∗l(∆, LHS, Σ) holds iff Σ = {σ1, . . . , σn} is the largest nonempty set such that sl(∆, LHS, σi), 1 ≤ i ≤ n, holds.

We must define the application of a set of substitutions Σ = {σ1, . . . , σn} to a term T: this results in a set of substituted terms, T · {σ1, . . . , σn} = {T · σ1, . . . , T · σn}. We now define the semantics of the RHS of a rule as a mapping between the current institutional state ∆ and its successor state ∆′:

Definition 11. sr(∆, RHS, ∆′) holds depending on the format of RHS:
1. sr(∆, (U ∧ RHS), ∆1 ∪ ∆2) holds iff sr(∆, U, ∆1) and sr(∆, RHS, ∆2) hold.
2. sr(∆, ⊕B, ∆ ∪ {B}) holds.
3. sr(∆, ⊖B, ∆ \ {B}) holds.
4. sr(∆, ⊕C, ∆ ∪ {C}) holds iff constrs(∆, Γ) and satisfy(Γ ∪ {C}, Γ′) hold.
Case 1 decomposes a conjunction and builds the new state by merging the partial states of each update. Cases 2 and 3 cater for the insertion and removal of atomic formulae B which do not conform to the syntax of constraints. Case 4 defines how a constraint is added to an institutional state: the new constraint is checked for its satisfaction with the constraints Γ ⊆ ∆ and then added to ∆. We assume the new constraint is merged into ∆: if there is another constraint that subsumes it, then the new constraint is discarded. For instance, if X > 20 belongs to ∆, then attempting to add X > 15 will yield the same ∆. In the usual semantics of rules of production systems [13, 14], the values of variables obtained when matching the LHS to the institutional state must be passed on to the RHS. We capture this by associating the RHS with a substitution σ obtained when matching the LHS against ∆ via sl. Def. 11 above should actually be used as sr(∆, RHS · σ, ∆′), that is, we have a version of the RHS with ground variables whose values originate from the matching of the LHS to ∆. We now define how a rule maps two institutional states:

Definition 12. s∗(∆, LHS ⇝ RHS, ∆′) holds iff s∗l(∆, LHS, {σ1, . . . , σn}) and sr(∆, RHS · σi, ∆′), 1 ≤ i ≤ n, hold.

That is, two institutional states ∆, ∆′ are related by LHS ⇝ RHS iff we obtain all the different substitutions {σ1, . . . , σn} that make the LHS match ∆ and apply these substitutions to the RHS in order to build ∆′. Finally we extend s∗ to handle sets of rules: s∗(∆, {R1, . . . , Rn}, ∆′) holds iff s∗(∆, Ri, ∆′), 1 ≤ i ≤ n, hold.
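Definition 8 has an almost literal Prolog transliteration, which may help to see how the formal machinery maps onto code; the following is our own sketch (the predicate names are hypothetical), representing a substitution as a list of Var/Term pairs:

:- use_module(library(lists)).  % member/2, maplist/3

% apply_s(+Term, +Sigma, -Result): the clauses mirror cases 2, 1 and 3 of Def. 8
apply_s(X, Sigma, T2) :-
    var(X), !,
    (   member(Y/T, Sigma), Y == X
    ->  apply_s(T, Sigma, T2)   % case 2: X/T belongs to sigma
    ;   T2 = X                  % case 2: X is not bound by sigma
    ).
apply_s(C, _, C) :-
    atomic(C), !.               % case 1: constants are left unchanged
apply_s(P, Sigma, P2) :-        % case 3: apply sigma to every argument
    P =.. [F|Args],
    maplist(subst(Sigma), Args, Args2),
    P2 =.. [F|Args2].

subst(Sigma, A, A2) :- apply_s(A, Sigma, A2).

For instance, apply_s(p(X, a, f(Y)), [X/b, Y/c], T) yields T = p(b, a, f(c)).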
4.2 Implementing Institutional Rules
The semantics above provides a basis for an interpreter for institutional rules, shown in Fig. 3 as a logic program interspersed with built-in Prolog predicates; for easy referencing, we show each clause with a number on its left.
1. s∗(∆, Rs, ∆′) ← findall(⟨RHS, Σ⟩, (member((LHS ⇝ RHS), Rs), s∗l(∆, LHS, Σ)), RHSs), sr(∆, RHSs, ∆′)
2. s∗l(∆, LHS, Σ) ← findall(σ, sl(∆, LHS, σ), Σ)
3. sl(∆, (A ∧ LHS), σ1 ∪ σ2) ← sl(∆, A, σ1), sl(∆, LHS, σ2)
4. sl(∆, ¬LHS, σ) ← ¬sl(∆, LHS, σ)
5. sl(∆, B, σ) ← member(B · σ, ∆), constrs(∆, Γ), satisfy(Γ · σ, Γ′)
6. sl(∆, C, σ) ← constrs(∆, Γ), {C · σ} ⊑ Γ
7. sr(∆, RHSs, ∆′) ← findall(∆′′, (member(⟨RHS, Σ⟩, RHSs), member(σ, Σ), sr(∆, RHS · σ, ∆′′)), AllDeltas), merge(AllDeltas, ∆′)
8. sr(∆, (U ∧ RHS), ∆1 ∪ ∆2) ← sr(∆, U, ∆1), sr(∆, RHS, ∆2)
9. sr(∆, ⊕B, ∆ ∪ {B}) ←
10. sr(∆, ⊖B, ∆ \ {B}) ←
11. sr(∆, ⊕C, ∆ ∪ {C}) ← constrs(∆, Γ), satisfy([C|Γ], Γ′)
Fig. 3. An Interpreter for Institutional Rules
Clause 1 contains the topmost definition: given a ∆ and a set of rules Rs, it shows how we can obtain the next state ∆′ by finding (via the built-in findall predicate²) all those rules in Rs (picked by the member built-in) whose LHS holds in ∆ (checked via the auxiliary definition s∗l). This clause then uses the RHS of those rules with their respective sets of substitutions Σ as the arguments of sr to finally obtain ∆′. Clause 2 implements s∗l: it finds all the different ways (represented as individual substitutions σ) that the left-hand side LHS of a rule can be matched in an institutional state ∆ – the individual σ's are stored in sets Σ of substitutions, as a result of the findall/3 execution. Clauses 3-6 are adaptations of Def. 9. Clause 7 shows how sr computes the new state from a list RHSs of pairs ⟨RHS, Σ⟩ (obtained in the second body goal of clause 1): it picks out (via predicate member/2) each individual substitution σ ∈ Σ and uses it in RHS to compute via sr a partial new institutional state ∆′′ which is stored in AllDeltas. AllDeltas contains a set of partial new institutional states and these are combined together via the merge/2 predicate – it joins all the partial states, removing any replicated components. A garbage collection mechanism can also be added to the functionalities of merge/2 whereby constraints whose variables are not referred to in ∆ are discarded. Clauses 8-11 are adaptations of Def. 11. Our interpreter shows how we deal with constraints in our institutional rules: we could not simply refer to standard rule interpreters [13, 14] since these do not handle constraints. Our combination of Prolog built-ins and abstract definitions provides a precise account of the complexity of the computation, yet it is very close to the mathematical definitions.
² ISO Prolog built-in findall/3 obtains all answers to a query (2nd argument), recording the values of the 1st argument as a list stored in the 3rd argument.
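To convey the control flow of Fig. 3 in directly executable form, here is a self-contained ASCII approximation covering only the constraint-free fragment (conjunction, negation and plain atomic formulae); this is our sketch, not the authors' code: the rule arrow is written ==>, while add/1 and del/1 stand for the ⊕ and ⊖ operators.

:- use_module(library(lists)).  % member/2, select/3
:- op(1150, xfx, ==>).

sl(Delta, (A , Rest)) :- !, sl(Delta, A), sl(Delta, Rest).
sl(Delta, \+ LHS)     :- !, \+ sl(Delta, LHS).
sl(Delta, B)          :- member(B, Delta).

sr(Delta, (U , Rest), D) :- !, sr(Delta, U, D1), sr(D1, Rest, D).
sr(Delta, add(B), [B|Delta]) :- !.
sr(Delta, del(B), D) :- ( select(B, Delta, D) -> true ; D = Delta ).

% step/3 applies one rule exhaustively, as in clause 1 of Fig. 3;
% findall/3 copies RHS once for every substitution that makes LHS hold.
step(Delta, (LHS ==> RHS), Delta2) :-
    findall(RHS, sl(Delta, LHS), RHSs),
    apply_all(RHSs, Delta, Delta2).

apply_all([], D, D).
apply_all([RHS|Rs], D0, D) :- sr(D0, RHS, D1), apply_all(Rs, D1, D).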
4.3 Sample Institutional Rules
In this section we give examples of institutional rules and explain the computational behaviours they capture. These examples illustrate what can be achieved with our proposed formalism. To account for the passing of time, we shall simulate a clock using our institutional rules. Our clock is used by the governor and institutional agents when they are writing terms onto the tuple space (see discussion in Section 5 below). We shall represent our clock as the term now(T), where T is a natural number. This term can either be provided in the initial state or a "bootstrapping" rule can be supplied as ¬now(T) ⇝ ⊕now(1), that is, if now(T) is not present in the state, the rule will add now(1) to the next state. Similar rules can be used whenever new terms need to be added to the state, but once added they only need to be updated. We can simulate the passing of time in various ways. A reactive approach whereby an event triggers a rule to update the now term is captured as:

(now(T) ∧ att(S, W, P(A1, R1, A2, R2, M, T))) ⇝ (⊖now(T) ∧ ⊕now(T + 1))

That is, if an event (an attempt to utter something) related to the current time step has happened then the clock is updated. It is important to notice that if there is more than one utterance, the exhaustive application of the rule above will carry out the same update for each utterance. Although this might be unnecessary or inefficient, it will not cause multiple now/1 formulae to be inserted in the next institutional state, as the same unification for T is used in all updates, rendering the same T + 1. If external agents fail to provide their messages (via their governor agents, as explained below), this can then be neatly captured by a "dummy" message. Governor agents can be defined to wait a predefined amount of time and then write out in the institutional state that a timeout took place – this can be represented by utt(S, W, P(A, R, adm, admin, timeout, T)), stating that agent A failed to say what it should have said. We can also define relationships among permissions, prohibitions and obligations via our institutional rules. Such relationships should capture the pragmatics of normative aspects – what exactly these concepts mean in terms of agents' behaviour. We start by looking at those illocutions that external agents attempted to utter, i.e., att(S, W, I):

(att(S, W, I) ∧ per(S, W, I)) ⇝ (⊖att(S, W, I) ∧ ⊕utt(S, W, I))

That is, permitted attempts become utterances – any constraints associated with S, W and I should hold for the left-hand side to match the current state. Attempts and prohibitions can be related together by the schematic rule

(att(S, W, I) ∧ prh(S, W, I)) ⇝ Sanction
where Sanction stands for sanctions on the agents who tried to utter a prohibited illocution. If we represent the credit of agents as oav(Ag, credit, V), we can apply a 10% fine on those agents who attempt to utter a prohibited illocution:
(att(S, W, P(A1, R1, A2, R2, M, T)) ∧ prh(S, W, P(A1, R1, A2, R2, M, T)) ∧ oav(A1, credit, C)) ⇝ (⊖oav(A1, credit, C) ∧ ⊕oav(A1, credit, C − C/10) ∧ ⊖att(S, W, P(A1, R1, A2, R2, M, T)))
Another way of relating attempts, permissions and prohibitions is when a permission granted in general (e.g., to all agents or to all agents adopting a role) is revoked for a particular agent (e.g., due to a sanction). We can ensure that a permission has not been revoked via the rule

(att(S, W, I) ∧ per(S, W, I) ∧ ¬prh(S, W, I)) ⇝ (⊖att(S, W, I) ∧ ⊕utt(S, W, I))
That is, only permitted attempts which are not prohibited become utterances. We can also capture other relationships among deontic modalities. For instance, the rule below states that all obligations are also permissions:

obl(S, W, I) ⇝ ⊕per(S, W, I)
Such a rule would add to the institutional state a permission for every obligation. Another relationship we can forge concerns how to cope with the situation when an illocution is both an obligation and a prohibition – this may occur when an obligation assigned to agents in general (or to any agents playing a role) is revoked for individual agents (for instance, due to a sanction). In this case, we can choose to ignore/override either the obligation or the prohibition. For instance, the rule below overrides the obligation and ignores the attempt to fulfil the obligation:

(att(S, W, I) ∧ obl(S, W, I) ∧ prh(S, W, I)) ⇝ ⊖att(S, W, I)
The rule below ignores the prohibition and transforms an attempt to utter the illocution I into its utterance:

(att(S, W, I) ∧ obl(S, W, I) ∧ prh(S, W, I)) ⇝ (⊖att(S, W, I) ∧ ⊕utt(S, W, I))
A third possibility is to raise an exception via a term which can then be dealt with at the institutional level. The following rule could be used for this purpose:

(att(S, W, I) ∧ obl(S, W, I) ∧ prh(S, W, I)) ⇝ ⊕exception(S, W, I)
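Under the ASCII transliteration sketched at the end of Sect. 4.2, the rule above that turns permitted, non-prohibited attempts into utterances behaves as follows on a toy state (the agent, scene and message names are illustrative):

?- step([att(a, w, hi), per(a, w, hi)],
        ((att(S,W,I) , per(S,W,I) , \+ prh(S,W,I))
          ==> (del(att(S,W,I)) , add(utt(S,W,I)))),
        Delta1).
Delta1 = [utt(a,w,hi), per(a,w,hi)]

The attempt is consumed and the corresponding utterance recorded, while the permission persists into the next state.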
We do not want to be prescriptive in our discussion and we are aware that the sample rules we present can be given alternative formulations. Furthermore, we notice that when designing institutional rules, it is essential to consider the combined effect of the whole set of rules over the institutional states – these should be engineered in tandem. The rules above provide a sample of the expressiveness and precision of our institutional rules. As with all other formalisms to represent computations, it is difficult to account for the pragmatics of institutional rules. Ideally, we should provide a list of programming techniques engineers are likely to need in their day-to-day activities, but this lies outside the scope of this paper.
5 An Architecture for Norm-Aware Agent Societies
We now elaborate on the distributed architecture which fully defines our normative (or social) layer for EIs. We refer back to Fig. 1, the initial diagram describing our proposal. We show in the centre of the diagram a tuple space [6] – this is a blackboard system with accompanying operations to manage its entries. Our agents, depicted as a rectangle (labelled IAg), circles (labelled GAg) and hexagons (labelled EAg), interact (directly or indirectly) with the tuple space, reading and deleting entries from it as well as writing entries onto it. We explain the functionalities of each of our agents below. The institutional states ∆0, ∆1, . . . are recorded in the tuple space; we propose a means to represent institutional states with a view to maximising asynchronous aspects (i.e., agents should be allowed to access the tuple space asynchronously) and minimising housekeeping (i.e., not having to move information around). The topmost rectangle in Fig. 1 depicts our institutional agent IAg, responsible for updating the institutional state, applying s∗. The circles below the tuple space represent the governor agents GAgs, responsible for following the EI, "chaperoning" the external agents EAgs. The external agents are arbitrary heterogeneous software or human agents that actually enact an EI; to ensure that they conform to the required behaviour, each external agent is provided with a governor agent with which it communicates to take part in the EI. Governor agents ensure that external agents fulfil all their social duties during the enactment of an EI. In our diagram, we show the access to the tuple space as black block arrows; communications among agents are shown as the white block arrows. We want to make the remaining discussion as concrete as possible so as to enable others to assess, reuse and/or adapt our proposal. We shall make use of the SICStus Prolog [9] Linda Tuple Spaces [6] library in our discussion. A Linda tuple space is basically a shared knowledge base in which terms (also called tuples or entries) can be asserted and retracted asynchronously by a number of distributed processes. The Linda library offers basic operations to read a tuple from the space (predicate rd/1 and its non-blocking version rd_noblock/1), to remove a tuple from the space (predicate in/1 and its non-blocking version in_noblock/1), and to write a tuple onto the space (predicate out/1). Messages are exchanged among the governor agents by writing them onto and reading them from the tuple space; governor agents and their external agents, however, communicate via exclusive point-to-point communication channels. In our proposal some synchronisation is necessary: the utterances utt(s, w, I) will be written by the governor agents and the external agents must provide the actual values for the variables of the messages. However, governor agents must stop writing illocutions onto the space so that the institutional agent can update the institutional state. We have implemented this via the term current_state(N) (N being an integer) that works as a flag: if this term is present on the tuple space then governor agents may write their utterances onto the space; if it is not there, then they have to wait until the term appears. The institutional agent is responsible for removing the flag and writing it back at appropriate times.
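A minimal sketch of these operations using the SICStus library('linda/client') (it assumes a Linda server is already running and that the process has joined it via linda_client/1; the tuple values are illustrative):

:- use_module(library('linda/client')).

flag_demo :-
    out(current_state(0)),     % write the flag tuple onto the space
    rd(current_state(N)),      % blocking read: the tuple stays in the space
    in(current_state(N)),      % remove the flag; governors now have to wait
    NewN is N + 1,
    out(current_state(NewN)).  % write it back, enabling the next round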
We show in Fig. 4 a Prolog implementation for the institutional agent IAg.

 1 main:-
 2     out(current_state(0)),
 3     time_step(T),
 4     loop(T).
 5 loop(T):-
 6     sleep(T),
 7     no_one_updating,
 8     in(current_state(N)),
 9     get_state(N,Delta),
10     inst_rules(Rules),
11     s∗(Delta,Rules,NewDelta),
12     write_onto_space(NewDelta),
13     NewN is N + 1,
14     out(current_state(NewN)),
15     loop(T).

Fig. 4. Institutional Agent

It bootstraps the architecture by creating an initial value 0 for the current_state (lines 2-3); the initial institutional state is empty. In line 3 the institutional agent obtains via time_step/1 a value T, an attribute of the EI enactment setting up the frequency with which new institutional states should be computed. The IAg agent then enters a loop (lines 5-15) where it initially (line 6) sleeps for T milliseconds – this guarantees that the frequency of the updates will be respected. IAg then checks via no_one_updating/0 (line 7) that there are no governor agents currently updating the institutional state with their utterances – no_one_updating/0 succeeds if there are no updating/2 tuples in the space. Such tuples are written by the governor agents to inform the institutional agent that it has to wait until their utterances are written onto the space. When agent IAg is sure there are no more governor agents updating the tuple space, it removes the current_state/1 tuple (line 8), thus preventing any governor agent from trying to update the tuple space (the governor agent checks in line 7 of Fig. 5 if such an entry exists – if it does not, then the flow of execution is blocked on that line). Agent IAg then obtains via predicate get_state/2 all those tuples pertaining to the current institutional state N and stores them in Delta; the institutional rules are obtained in line 10 – they are also stored in the tuple space so that any of the agents can examine them. In line 11 Delta and Rules are used to obtain the next institutional state NewDelta via predicate s∗/3 (cf. Def. 12 and its implementation in Fig. 3). In line 12 the new institutional state NewDelta is written onto the tuple space; then the tuple recording the identification of the current state is written onto the space (line 14) for the next update. Finally, in line 15 the agent loops³.

 1 main:-
 2     connect_ext_ag(Ag),
 3     root_scene(Sc),
 4     initial_state(Sc,St),
 5     loop([Ag,Sc,St,Role]).
 6 loop(Ctl):-
 7     rd(current_state(N)),
 8     Ctl = [Ag|_],
 9     out(updating(Ag,N)),
10     get_state(N,Delta),
11     findall([A,NC],(p(Ctl):-A,p(NC)),ANCs),
12     social_analysis(ANCs,Delta,Act,NewCtl),
13     perform(Act),
14     in(updating(Id,N)),
15     loop(NewCtl).

Fig. 5. Governor Agent

Distinct threads will execute the code for the governor agents GAg shown in Fig. 5. Each of them will connect to an external agent via predicate connect_ext_ag/1 and obtain its identification Ag.
³ For simplicity we did not show the termination conditions for the loops of the institutional and governor agents. These conditions are prescribed by the EI specification and should appear as a clause preceding the loop clauses of Figs. 4 and 5.
They then find out (line 3) about the EI's root scene (where all agents must initially report to [5]) and that scene's initial state (line 4) – we adopt here the representation for EIs proposed in [8]. In line 5 the governor agent makes the initial call to loop/1: the Role variable is not yet instantiated at that point, as a role is assigned to the agent when it joins the EI. The governor agents will then loop through lines 6-15, initially checking in line 7 if they are allowed to update the current institutional state, adding their utterances. Only if the current_state/1 tuple is on the space does the flow of execution of the governor agent move to line 8, where it obtains the identifier Ag from the control list Ctl; in line 9 a tuple updating/2 is written out onto the space. This tuple informs the institutional agent that there are governors updating the space and hence it should wait to update the institutional state. In line 10 the governor agent reads all those tuples pertaining to the current institutional state. In line 11 the governor agent collects all those actions send/1 and receive/1 in the EI specification which are associated with its current control [Ag,Sc,St,Role]. In line 12, the governor agent interacts with the external agent and, taking into account all constraints associated with Ag, obtains an action Act that is performed in line 13 (i.e., a message is sent or received). In line 14 the agent removes the updating/2 tuple and in line 15 the agent starts another loop. Although we represented the institutional states as boxes in Fig. 1, they are not stored as single tuples containing ∆. If this were the case, then the governors would have to take turns to update the institutional state. We have used instead a representation for the institutional state that allows the governors to update the space asynchronously. Each element of ∆ is represented by a tuple of the form t(N,Type,Elem) where N is the identification of the institutional state, Type is the description of the component (i.e., either a rule, an atf, or a constr) and Elem the actual element. Using this representation, we can easily obtain all those tuples in the space that belong to the current institutional state. Predicate get_state/2 is thus:

get_state(N,Delta):- bagof_rd_noblock(t(N,T,E),t(N,T,E),Delta).
That is, the Linda built-in [9] bagof_rd_noblock/3 (it works like the findall/3 predicate) finds all those tuples belonging to institutional state N and stores them in Delta.
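As an illustration, with a single (hypothetical) obligation tuple written for state 0, get_state/2 behaves as follows:

?- out(t(0, atf, [obl, agora, w2, [inform, ag4, seller, ag3, buyer, offer(car,1200), 10]])),
   get_state(0, Delta).
Delta = [t(0,atf,[obl,agora,w2,[inform,ag4,seller,ag3,buyer,offer(car,1200),10]])]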
5.1 Norm-Aware Governor Agents
We can claim that our resulting society of agents is endowed with norm-awareness because their behaviour is regulated by the governor agents depicted above. The social awareness of the governor agent, in its turn, stems from two features: i) its access to the institutional state where obligations, prohibitions and permissions are recorded (as well as constraints on the values of their variables); ii) its access to the set of possible actions prescribed in the protocol. With this information, we can define various alternative ways in which governor agents, in collaboration with their respective external agents, can decide on which action to carry out. We can define predicate social_analysis(ANCs,Delta,Act,NewCtr) in line 12 of Fig. 5 in different ways – this predicate should ensure that an action Act
(sending or receiving a message) with its respective next control state NewCtr (i.e., the list [Ag,Sc,NewSt,Role]) is chosen from the list of options ANCs, taking into account the current institutional state Delta. This predicate must also capture the interactions between governor and external agents as, together, they choose and customise a message to be sent. We show in Fig. 6 a definition for predicate social_analysis/4:

social_analysis(ANCs,Delta,Act,NewCtr):-
    remove_prhs(ANCs,Delta,ANCsWOPrhs),
    select_obls(ANCsWOPrhs,Delta,ANCsObls),
    choose_customise(ANCsObls,Delta,Act,NewCtr).

Fig. 6. Definition of Social Analysis

Its first subgoal removes from the list ANCs all those utterances that are prohibited from being sent, obtaining the list ANCsWOPrhs. The second subgoal ensures that obligations are given adequate priority: the list ANCsWOPrhs is further refined to get the obligations among the actions and store them in list ANCsObls – if there are no obligations, then ANCsWOPrhs is the same as ANCsObls. Finally, in the third subgoal, an action is chosen from ANCsObls and customised in collaboration with the external agent. This definition is a rather "draconian" one in which external agents are never allowed even to attempt to utter a prohibited illocution; other definitions could be supplied instead. We use a "flat" structure to represent atomic formulae. For instance, utt(agora, w2, inform(ag4, seller, ag3, buyer, offer(car, 1200), 10)) is represented as
Governor agents are able to answer queries by their external agents such as “what are my obligations at this point?”, encoded as: findall([S,W,[I,Id|R]],member(t(N,atf,[obl,S,W,[I,Id|R]]),Delta),MyObls)
These interactions enrich the definition of predicate choose customise/4 above.
6 Related Work
Apart from classical studies on law, research on norms and agents has been addressed by two different disciplines: sociology and philosophy. On the one hand, socially oriented contributions highlight the importance of norms in agent behaviour (e.g., [15, 16, 17]) or analyse the emergence of norms in multi-agent systems (e.g., [18, 19]). On the other hand, logic-oriented contributions focus on the deontic logics required to model normative modalities along with their paradoxes (e.g., [20, 21, 22]). The last few years, however, have seen significant work on norms in multi-agent systems, and norm formalisation has emerged as an important research topic in the literature [1, 23, 24, 25]. Vázquez-Salceda et al. [24, 26] propose the use of a deontic logic with deadline operators. In their approach, they distinguish norm conditions from violation conditions. This is not necessary in our approach since both types of conditions can be represented in the LHS of our rules. Their model of norms also separates sanctions and repairs (i.e., actions to be done to restore the system to a valid state); these can be expressed in the RHS of our rules without having to
differentiate them from other normative aspects of our states. Our approach has two advantages over [24, 26]: i) we provide an implementation for our rules; and ii) we offer a more expressive language with constraints over norms. Fornara et al. [25] propose the use of norms partially written in the Object Constraint Language (OCL). Their commitments are used to represent all normative modalities; of special interest is how they deal with permissions: they stand for the absence of commitments. This feature may jeopardise the safety of the system, since it is less risky to permit only a set of safe actions, thus forbidding other actions by default. Although this feature can reduce the number of permitted actions, it allows unexpected actions to be carried out. Their within, on and if clauses can be encoded as the LHS of our rules as they can all be seen as conditions when dealing with norms. Similarly, "foreach in" and "do" clauses can be encoded as the RHS of our rules since they are the actions to be applied to a set of agents. López y López et al. [27] present a model of normative multi-agent systems specified in the Z language. Their proposal is quite general since the normative goals of a norm do not have a limiting syntax, as is the case with the rules of Fornara et al. [25]. However, their model assumes that all participating agents have a homogeneous, predetermined architecture. No agent architecture is imposed on the participating agents in our approach, thus allowing for heterogeneity. Artikis et al. [28] use event calculus for the specification of protocols. Obligations, permissions, empowerments, capabilities and sanctions are formalised by means of fluents (i.e., predicates that may change with time). Prohibitions are not formalised in [28] as fluents since the authors assume that every action not permitted is forbidden by default. Although event calculus models time, their deontic fluents are not enough to inform an agent about all types of duties. For instance, to inform an agent that it is obliged to perform an action before a deadline, it is necessary to show the agent the obligation fluent and the part of the theory that models the violation of the deadline. Michael et al. [29] propose a formal scripting language to model the essential semantics, namely rights and obligations, of market mechanisms. They also formalise a theory to create, destroy and modify objects that either belong to someone or can be shared by others. Their proposal is suitable for modelling and implementing market mechanisms. However, it is not as expressive as other proposals: for instance, it cannot model obligations with a deadline.
7 Conclusions, Discussion and Future Work
We have proposed a distributed architecture to provide MASs with an explicit social layer: the institutional states store information on the execution of the MAS as well as the normative positions of its agents – their obligations, prohibitions and permissions. The institutional states capture the dynamics of the execution and are managed via institutional rules: these are a kind of production system depicting how the states are updated when certain situations arise. An important contribution of this work concerns the rule-based language to explicitly manage normative positions of agents. We achieve greater flexibility,
expressiveness and precision by allowing constraints to be part of our rules – such constraints associate further restrictions with permissions, prohibitions and obligations. Our language is general-purpose, allowing various kinds of deontic notions to be captured. The institutional states and rules are put to use within a distributed architecture, supported by a team of administrative agents implemented as Prolog programs sharing a tuple space. We propose means to store the institutional state that allow maximum distributed access. The "norm-awareness" of our proposal stems from the fact that the governor agents, part of our team of administrative agents, can regulate the behaviour of external agents taking part in the MAS execution. The regulation takes into account the normative position of individual external agents stored in the institutional state. We provide a detailed implementation of governor agents that hinges on the notion of social analysis: this is a decision procedure which can be defined differently, for distinct scenarios and solutions. We would like to investigate the verification of norms (along the lines of our work in [30]) expressed in our rule language, with a view to detecting, for instance, obligations that cannot be fulfilled, prohibitions that prevent progress, inconsistencies (i.e., when an illocution is simultaneously permitted and prohibited) and so on. We also want to provide engineers with means to analyse their rules, so that they can, for instance, assess the "social burden" associated with individual agents and whether any particular agent has too important a role in the progress of an electronic institution. If the verification and analysis are done during the design, that is, as the rules are prepared, then this could prevent problems from being propagated to later parts of the MAS development. We are currently working on tools to help engineers prepare and analyse their rules; these are norm editors that will support the design of norm-oriented electronic institutions.
References

1. Dignum, F.: Autonomous Agents with Norms. A. I. & Law 7 (1999) 69–79
2. López y López, F., Luck, M., d'Inverno, M.: Constraining Autonomy Through Norms. In: Procs. AAMAS 2002, ACM Press (2002)
3. Verhagen, H.: Norm Autonomous Agents. PhD thesis, Stockholm University (2000)
4. Sergot, M.: A Computational Theory of Normative Positions. ACM Trans. Comput. Logic 2 (2001)
5. Esteva, M.: Electronic Institutions: from Specification to Development. PhD thesis, Universitat Politècnica de Catalunya (UPC) (2003) IIIA monography Vol. 19.
6. Carriero, N., Gelernter, D.: Linda in Context. Comm. of the ACM 32 (1989)
7. Apt, K.R.: From Logic Programming to Prolog. Prentice-Hall, U.K. (1997)
8. Vasconcelos, W.W., Robertson, D., Sierra, C., Esteva, M., Sabater, J., Wooldridge, M.: Rapid Prototyping of Large Multi-Agent Systems through Logic Programming. Annals of Mathematics and Artificial Intelligence 41 (2004) 135–169
9. Swedish Institute of Computer Science: SICStus Prolog. (2005) http://www.sics.se/isl/sicstuswww/site/index.html, viewed on 10 Feb 2005 at 18.16 GMT.
10. Jaffar, J., Maher, M.J., Marriott, K., Stuckey, P.J.: The Semantics of Constraint Logic Programs. Journal of Logic Programming 37 (1998) 1–46
11. Holzbaur, C.: ÖFAI clp(q,r) Manual, Edition 1.3.3. TR-95-09, Austrian Research Institute for Artificial Intelligence, Vienna, Austria (1995)
12. Fitting, M.: First-Order Logic and Automated Theorem Proving. Springer-Verlag, New York, U.S.A. (1990)
13. Kramer, B., Mylopoulos, J.: Knowledge Representation. In Shapiro, S.C., ed.: Encyclopedia of Artificial Intelligence. Volume 1. John Wiley & Sons (1992)
14. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. 2nd edn. Prentice Hall, Inc., U.S.A. (2003)
15. Conte, R., Castelfranchi, C.: Understanding the Functions of Norms in Social Groups through Simulation. In: Artificial Societies. The Computer Simulation of Social Life, UCL Press (1995)
16. Conte, R., Castelfranchi, C.: Norms as Mental Objects: From Normative Beliefs to Normative Goals. In: Procs. of MAAMAW'93, Neuchatel, Switzerland (1993)
17. Tuomela, R., Bonnevier-Tuomela, M.: Norms and Agreement. European Journal of Law, Philosophy and Computer Science 5 (1995) 41–46
18. Walker, A., Wooldridge, M.: Understanding the Emergence of Conventions in Multi-Agent Systems. In: Procs. ICMAS 1995, San Francisco, USA (1995)
19. Shoham, Y., Tennenholtz, M.: On Social Laws for Artificial Agent Societies: Off-line Design. Artificial Intelligence 73 (1995) 231–252
20. von Wright, G.: Norm and Action: A Logical Inquiry. Routledge and Kegan Paul, London (1963)
21. Alchourron, C., Bulygin, E.: The Expressive Conception of Norms. In Hilpinen, R., ed.: New Studies in Deontic Logics, London, D. Reidel (1981) 95–124
22. Lomuscio, A., Nute, D., eds.: Procs. of DEON 2004. Volume 3065 of LNAI. Springer Verlag (2004)
23. Boella, G., van der Torre, L.: Permission and Obligations in Hierarchical Normative Systems. In: Procs. ICAIL 2003, ACM Press (2003)
24. Vázquez-Salceda, J., Aldewereld, H., Dignum, F.: Implementing Norms in Multiagent Systems. Volume 3187 of LNAI. Springer-Verlag (2004)
25. Fornara, N., Viganò, F., Colombetti, M.: A Communicative Act Library in the Context of Artificial Institutions. In: Procs. EUMAS. (2004)
26. Vázquez-Salceda, J., Aldewereld, H., Dignum, F.: Norms in Multiagent Systems: Some Implementation Guidelines. In: Procs. EUMAS. (2004)
27. López y López, F., Luck, M.: A Model of Normative Multi-Agent Systems and Dynamic Relationships. Volume 2934 of LNAI. Springer-Verlag (2004)
28. Artikis, A., Kamara, L., Pitt, J., Sergot, M.: A Protocol for Resource Sharing in Norm-Governed Ad Hoc Networks. Volume 3476 of LNAI. Springer-Verlag (2004)
29. Michael, L., Parkes, D.C., Pfeffer, A.: Specifying and Monitoring Market Mechanisms Using Rights and Obligations. In: Proc. AMEC VI. (2004)
30. Vasconcelos, W.W.: Norm Verification and Analysis of Electronic Institutions. Volume 3476 of LNAI. Springer-Verlag (2004)
About Declarative Semantics of Logic-Based Agent Languages Stefania Costantini and Arianna Tocchio Università degli Studi di L'Aquila, Dipartimento di Informatica, Via Vetoio, Loc. Coppito, I-67010 L'Aquila, Italy {stefcost, tocchio}@di.univaq.it
Abstract. In this paper we propose an approach to the declarative semantics of logic-based agent-oriented languages, taking as a case study the language DALI, previously defined by the authors. This "evolutionary semantics" does not resort to a concept of state: rather, it models the reception of events as program-transformation steps, which produce a "program evolution" and a corresponding "semantic evolution". Communication among agents and multi-agent systems is also taken into account. The aim is to model an agent's evolution under both external (environmental) and internal changes in a logical way, thus allowing in principle the adoption of formal verification methods. We also intend to create a common ground for relating and comparing different approaches and languages.
1 Introduction
The original perspective on agents in Computational Logic focused on the agent's reasoning process, thus identifying "intelligence" with rationality, while neglecting the interactions with the environment. The identification of intelligence with rationality has been heavily criticized, even adopting the opposite point of view, i.e., that intelligent behavior should result solely from the ability of an agent to react appropriately to changes in its environment. A novel view of logical agents, able to be both rational and reactive, i.e., capable of timely response to external events, was introduced by Kowalski and Sadri in [15] [16]. A meta-logic program defines the "observe-think-act" cycle of an agent. Integrity constraints are used to generate actions in response to updates from the environment. Since then, both the notion of agency and its interpretation in computational logic have evolved, and many interesting approaches and languages have been proposed in the last two decades [23]. A foundational approach in artificial intelligence and cognitive science is BDI, introduced in [4], which stands for "Belief, Desire, Intention": an agent's beliefs correspond to the information the agent has about the world, which may be incomplete and incorrect; an agent's desires intuitively correspond to its objectives, or to the tasks allocated to it; as an agent will not, in general, be able to achieve all its desires, the desires to which the agent commits are the intentions that it will try to achieve. The theoretical foundations of the BDI model have been investigated (the reader may refer to [19]).
The original formal definition was in terms of modal logics. However, there have been reformulations of the BDI approach that have led to the logic programming language AgentSpeak(L) and to the programming language 3APL [11]. Among the agent-oriented languages based on logic programming are the following. Go! [5] is a multi-paradigm programming language with a strong logic programming aspect. Go! has strong typing and higher-order functional aspects. Its imperative subset includes action procedure definitions and rich program structuring mechanisms. Go! agents can have internal concurrency, can deductively query their state components, and can communicate with other Go! agents using application-specific symbolic messages. DALI [6] [7] is syntactically very similar to traditional logic programming languages such as Prolog. The reactive and proactive behavior of a DALI agent is triggered by several kinds of events: external, internal, present and past events. Events are handled by means of a new kind of rule, the reactive rule. DALI provides a filter on communication for both incoming and outgoing messages. In fact, a message is accepted (or otherwise discarded) only if it passes the check of the communication filter. This filter is expressed by means of meta-rules specifying two distinguished predicates. Interesting agent architectures are logic-based, like KGP [3], which builds on the original work of Kowalski and Sadri, and in which various forms of reasoning can be gracefully specified. IMPACT [2] provides interoperability by accommodating various other kinds of agents into a computational logic shell. The semantics of the above-mentioned languages and approaches has been defined in various ways. All of them have suitable operational models that account for the agent's behavior. Many of them enjoy, in the tradition of logic, a declarative logic semantics. It is however not so easy to find a common ground for relating and comparing the different approaches. The aim of this paper is to introduce an approach to the declarative semantics of logical agent-oriented languages that considers the evolution of agents, without explicitly introducing a concept of state. Rather, changes that are either external (i.e., the reception of exogenous events) or internal (i.e., courses of action undertaken based on internal conditions) are considered as making a change in the agent program, which is a logical theory, and in its semantics (however defined). For such a change to be represented, we understand it as the application of a program-transformation function. Thus, agent evolution is seen as program evolution, and semantic evolution. This novel approach will perhaps not encompass all the existing ones, but, in our opinion, it can constitute a starting point for establishing a common viewpoint. The approach presented in this paper is not in contrast with operational semantics. Rather, the declarative view of agent programs can be linked to an operational model, as outlined by means of a case study in [20]. In Section 2 we review the features that, in our opinion, intelligent logical agents possess. In Section 3 we introduce the approach. In Sections 5 and 6, as a case study, we show the approach in detail with respect to the DALI language, which for the sake of clarity we briefly review in Section 4. Finally, we conclude in Section 7.
2 Features of Evolving Logical Agents
A great deal can be said about the features that agents in general, and logical agents in particular, should possess (for a review the reader may refer for instance to [20], for a
discussion to [14]). It is widely recognized, however, that agents, whatever the language and the approach on which they are based, should exhibit the following features.
– Autonomy: agents should be able to operate without the direct intervention of humans.
– Reactivity: agents should perceive their environment (which may be, in a broad sense, the physical world, a user, a collection of agents, the internet, etc.) and respond in a timely fashion to changes that occur in it.
– Pro-activeness: agents should not simply act in response to their environment: they should be able to exhibit opportunistic, goal-directed behavior and take the initiative where appropriate.
– Social ability: agents should be able to interact, when they deem it appropriate, with other agents or with humans in order to complete their own problem solving and to help others with their activities.
In order to exhibit these properties, logical agents usually rely upon some form of rationality, which might be based on some or all of the following internal features:
– Perceptive abilities, i.e., the possibility of being aware (to some extent) of what is going on (what events happen) in the environment.
– Inferential abilities, which may include many different forms of either formal or commonsense ways of drawing consequences from what is known.
– Introspective abilities, which imply some kind of control over their actions and internal state, so as to be able to affect their own functioning according to priorities, desires (or goals), time or resource bounds, etc. This point may include a concept of time.
– Meta-level control for establishing goals and selecting intentions.
– Learning abilities, which at the lowest level include the memory of what has happened, so as to be able to influence the future course of action depending on the past, and which may then include more or less sophisticated forms of belief revision or classification.
– Communication control, in the sense that it should be possible to decide which social interactions are appropriate and which are not, both in terms of generating appropriate requests and of judging incoming requests.
These abilities must be operationally combined in the context of either a cycle, a multi-threaded execution, or an interleaving, so as to be able to respond in a timely fashion to environment changes based on reasoning, planning, learning, etc. Effective operational models have been proposed, based either on versions of the observe-think-act cycle introduced by [15], or on operational semantics of various kinds. For logical agents, it is in our opinion important to give a declarative account of the agent behavior. This would have at least two advantages: (i) understanding the agent behavior more easily than in an operational model; (ii) being able to verify properties of both agents and multi-agent systems. The difficulty is that agents evolve, according to both changes in the environment and their internal state. Traditionally, logic includes neither the notion of state nor that of
evolution. It includes the notion of a theory with some kind of model(s). In this paper we propose a semantic approach that accounts for evolution, without introducing state explicitly.
3 Declarative Semantics of Evolving Agents
The evolutionary semantics that we propose is aimed at declaratively modeling the changes inside an agent, as determined both by changes in the environment and by the agent's own self-modifications. The key idea is to understand these changes as the result of the application of program-transformation functions. In this approach, a program-transformation function is applied upon reception of either an external or an internal event, the latter having a possibly different meaning in different formalisms. As a typical situation, the perception of an external event will have an effect on the program which represents the agent: for instance, the event will be stored as a new fact in the program. This transforms the program into a new program, which will procedurally behave differently than before, e.g., by possibly reacting to the event. Or, the internal event corresponding to the decision of the agent to undertake an activity triggers a more complex program transformation, resulting in a version of the program where the corresponding intention is somewhat "loaded" so as to become executable. Then, in general one will have an initial program P0 which, according to these program-transformation steps (each one transforming Pi into Pi+1), gives rise to a Program Evolution Sequence PE = [P0, ..., Pn]. The program evolution sequence will have a corresponding Semantic Evolution Sequence ME = [M0, ..., Mn], where Mi is the semantic account of Pi. The different languages and formalisms will influence the following key points:
1. When a transition from Pi to Pi+1 takes place, i.e., which external and internal factors determine a change in the agent.
2. Which kinds of transformations are performed.
3. Which semantic approach is adopted, i.e., how Mi is obtained from Pi. Mi might be for instance a model, or an initial algebra, or a set of Answer Sets if the given language is based on Answer Set Programming (which stems from the stable model semantics of [13]). In general, given a semantics S we will have Mi = S(Pi).
A particular internal event that may determine a transition can be the decision of the agent to revise its knowledge, for instance by verifying constraints, removing "old" facts, or performing any kind of belief revision. Belief revision too can in fact be seen in our approach as a step of program transformation, which in this case results in the updated theory. We also believe it useful to perform an initialization step, where the program PAg written by the programmer is transformed into a corresponding program P0 by means of some sort of knowledge compilation. This initialization step can be understood as a rewriting of the program in an intermediate language and/or as the loading of a "virtual machine" that supports language features. At one extreme this stage can do nothing; at the other, it can perform complex transformations by producing "code" that
implements language features in the underlying logical formalism. P0 can simply be a program (logical theory) or can have additional information associated with it. In multi-agent systems (MAS), where new pieces of knowledge (beliefs, but possibly also rules or sets of rules) can be received from other agents, an agent might have to decide whether to accept or reject the new knowledge, possibly after having checked its correctness/usefulness. This can imply a further knowledge compilation step, to be performed: (i) upon reception of new knowledge; (ii) in consequence of the decision to accept/reject the new knowledge. To summarize, we will have in principle at least the following program-transformation functions: Γinit for initialization, Γevents for managing event reception, Γrevise for belief revision, and Γlearn for incorporating new knowledge. Then, given an agent program PAg we will have:
P0 = Γinit(PAg) and Pi+1 = Γop(Pi) with op ∈ {learn, revise, events}.
The evolutionary semantics of an agent represents the history of the agent without introducing a concept of "state".
Definition 1 (Evolutionary semantics). Let PAg be an agent program. The evolutionary semantics εPAg of PAg is the couple ⟨PE, ME⟩.
In order to illustrate the approach on a case study, in the rest of the paper we discuss the evolutionary semantics of the DALI language. With respect to previous discussions [7], which considered external events only, we here account for all kinds of events and also consider the DALI communication architecture.
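To make the construction concrete, the following minimal Prolog sketch (our own illustration, not part of the original formalism) folds a list of transformation steps over an initial program. The predicates init/2, transform/3 and semantics/2 are hypothetical stand-ins for Γinit, Γop and the semantics S.

    % init(PAg, P0) plays Γinit; transform(Op, Pi, Pi1) plays Γop for
    % op in {events, revise, learn}; semantics(P, M) computes M = S(P).
    evolutionary_semantics(PAg, Ops, PE, ME) :-
        init(PAg, P0),
        evolve(P0, Ops, PE, ME).

    % No more steps: record the last program and its semantic account.
    evolve(P, [], [P], [M]) :-
        semantics(P, M).
    % One step: compute S(Pi), apply Γop, recurse on Pi+1.
    evolve(P, [Op|Ops], [P|PE], [M|ME]) :-
        semantics(P, M),
        transform(Op, P, P1),
        evolve(P1, Ops, PE, ME).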
4 DALI in a Nutshell
DALI [7] [8] [20] is an active agent-oriented Logic Programming language designed in the line of [14] for the executable specification of logical agents. The Horn-clause language is a subset of DALI. The reactive and proactive behavior of a DALI agent is triggered by several kinds of events: external, internal, present and past events. All events and actions are time-stamped, so as to record when they occurred. A DALI program together with its interpreter gives rise to an agent since, when activated, it stays "awake" and monitors the arrival of events from the external world. These events are treated similarly to a user's queries in the sense that they trigger an inference activity which in this case can be considered a reaction. At a certain frequency and/or upon certain conditions, the interpreter tries on its own initiative to check whether certain distinguished propositions can be proved. If so, their success is interpreted as the reception of an event, thus triggering further inference. This mechanism of "internal events"
is an absolute novelty of DALI (other languages, such as 3APL, have internal events, but with a different meaning). Internal events make DALI agents proactive, and make them exhibit a behavior that depends on the logic program, but also on the history of the interaction of the agent with its external environment. In fact, all the external and internal events and the actions that the agent performs are recorded as "past events", and having past events in the conditions of distinguished goals will influence whether an internal event happens or not. We will now shortly present the language in a more formal way. Given the usual definition of atom and term, we will introduce distinguished kinds of atoms, which will be syntactically denoted by special postfixes. In the following definitions, let Body be any conjunction of atoms, possibly including distinguished ones. An external event is a stimulus perceived by the agent from the environment. We define the set of external events perceived by the agent from time t1 to time tn as a set E = {e1 : t1, ..., en : tn}, where E ⊆ S, and S is the set of the external stimuli that the agent can possibly perceive. A single external event ei is an atom indicated with a particular postfix in order to be distinguished from the other kinds of DALI events.
Definition 2 (External Event). An external event is indicated by the postfix E and is defined as: ExtEvent ::= AtomE | ExtEvent seq ExtEvent
When an event comes into the agent from its "external world", the agent can perceive it and decide to react. The reaction is defined by a reactive rule which has that external event in its head. The special token :>, used instead of :-, indicates that reactive rules perform forward reasoning.
Definition 3 (Reactive rule). A reactive rule has the form:
ExtEventE :> Body or ExtEvent1E, ..., ExtEventnE :> Body
Operationally, if an incoming external event is recognized, i.e., corresponds to the head of a reactive rule, it is added to a list called EV and consumed according to the arrival order, unless priorities are specified. Before the event is reacted to, the agent has the possibility of reasoning about it. To this end, each external event AtomE has a counterpart called "present event" that may occur in the body of rules with the postfix N. In particular, the present event AtomN is true as long as the external event AtomE is still in EV. Internal events make a DALI agent proactive independently of the environment, of the user and of the other agents, and also allow the agent to manipulate and revise its knowledge.
Definition 4 (Internal Event). An internal event is indicated by the postfix I: InternalEvent ::= AtomI
The internal event mechanism implies the definition of two rules. The first one contains the conditions (knowledge, past events, procedures, etc.) that must be true so that the reaction (in the second rule) may happen:
IntEvent :- Conditions
IntEventI :> Body
Internal events are automatically attempted with a default frequency, customizable by means of directives in the initialization file. A DALI agent is able to build a plan in order to reach an objective, by using internal events of a particular kind, called planning goals. After reacting to either an external or an internal event, the agent remembers having reacted by converting the event into a past event, with postfix P (time-stamped). Actions are the agent's way of affecting the environment, possibly in reaction to either an external or an internal event. An action in DALI can also be a message sent by an agent to another one.
Definition 5 (Action). An action is syntactically indicated by the postfix A: Action ::= AtomA | messageA
Actions occur in the body of rules. In DALI, actions may or may not have preconditions: in the former case, the actions are defined by action rules; in the latter case, they are just action atoms. An action rule is just a plain rule, but in order to emphasize that it is related to an action, we have introduced the new token :<. Each reactive rule for an external event, p(Args)E :> R1, . . . , Rq, is transformed into the standard rule:
p(Args)E :- p(Args)E, R1, . . . , Rq.
Similarly, we have to transform the reactive rule corresponding to each internal event. The first rule related to the internal event is left unchanged, as it is a plain rule which is employed in order to check whether the internal event has occurred.
Definition 8 (Transformation of internal event rules Γir). An internal event q(Args)I is allowed to be reacted to only if the subgoal q(Args) has been proved. Then, we transform each reactive rule for internal events:
q(Args)I :> R1, . . . , Rq.
into the standard rule:
q(Args)I :- q(Args), R1, . . . , Rq.
Now, we have to declaratively model actions, with or without an action rule. The point is, an action atom should become true (given its preconditions, if any) whenever the action is actually performed in some rule. Declaratively, this means that the action occurs in the body of an applicable rule. In practice, whenever that rule is processed by the interpreter, the action will actually be performed (by means of any kind of mechanism that connects the agent to its environment). To clarify the declarative part, consider a simple program written in the plain Horn-clause language:
p :- b, a.
b.
Its least Herbrand model is {b}. Assume that a is an action atom with no defining clause, where the action is meant to be attempted in the rule defining p. In principle, actions without preconditions can always be performed. In logic programming terms, this implies that whenever the action is attempted the corresponding atom should be true. If we modify the program as follows:
p :- b, a.
b.
a :- b.
its least model becomes {p, b, a}, thus modeling the desired behavior. In fact, the last rule ensures that the action atom a becomes true whenever the action can actually be performed in the rule defining p, as the previous condition b is true. Let us now assume that a has a defining clause, like in the program:
p :- b, a.
b.
a :- c.
c.
If we modify the program as follows:
p :- b, a.
b.
a :- c, b.
c.
we obtain the least model {p, b, a}. The last rule ensures that the action atom a becomes true only if: (i) its own precondition is true; (ii) the corresponding action can actually be performed in the rule defining p, as the previous condition b is true. Precisely, an action A is performed by a DALI agent whenever A is executed as a subgoal in a rule of the form
B :- D1, . . . , Dh, A1, . . . , Ak,   h ≥ 1, k ≥ 1
where the Ai's are actions and A ∈ {A1, . . . , Ak}. Declaratively, whenever the conditions D1, . . . , Dh of the above rule are true, the action atoms should become true as well (given their preconditions, if any), so that the rule can be applied by the immediate-consequence operator. To model this behavior we introduce the following:
Definition 9 (Transformation of action rules Γar). For every action atom A with action rule
A :- C1, . . . , Cs,   s ≥ 1
we modify this rule into:
A :- D1, . . . , Dh, C1, . . . , Cs.
If A has no defining rule, we instead add the clause:
A :- D1, . . . , Dh.
Definition 10. The program-transformation initialization function Γinit^DALI is defined as the composition of the above transformations for external event rules, internal event rules (Γir) and action rules (Γar). Given a DALI logic program PAg, let Ps = Γinit^DALI(PAg) be the Horn-clause logic program obtained by applying the above transformations.
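As a rough illustration, these rewritings can be phrased as term rewriting over an explicit rule representation. The representation below (reactive/2 for a reactive rule Head :> Body, clause/2 for a standard rule Head :- Body, and the helper strip_postfix/2 mapping q(Args)I to q(Args)) is our own assumption, not DALI's actual internal format.

    % External events: p(Args)E :> Body becomes p(Args)E :- p(Args)E, Body.
    transform_rule(external, reactive(EvE, Body), clause(EvE, (EvE, Body))).
    % Internal events: q(Args)I :> Body becomes q(Args)I :- q(Args), Body.
    transform_rule(internal, reactive(IntI, Body), clause(IntI, (Int, Body))) :-
        strip_postfix(IntI, Int).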
Example 1. The following DALI program PEx defines an agent waiting for an invitation (external event invitationE). In case she receives an invitation, as a reaction she accepts (action acceptA). If she has received an invitation she is happy (she remembers the invitation by means of the past event invitationP). In recognition of her own happiness, she reacts by joining the friends who have invited her (action join_friendsA, with precondition have_time). The program is oversimplified in many respects (there are no variables, and message exchange among agents is hidden), but it is useful in order to demonstrate how the proposed declarative semantics works. This example will be continued in the rest of the paper.
invitationE :> acceptA.
happy :- invitationP.
happyI :> join_friendsA.
join_friendsA :< have_time.
have_time.
The application of Γinit^DALI produces the modified program PsEx given below:
invitationE :- invitationE, acceptA.
acceptA :- invitationE.
happy :- invitationP.
happyI :- happyI, join_friendsA.
join_friendsA :- happyI, have_time.
have_time.
Ps is the basis for the evolutionary semantics, which describes how the agent is affected by the actual arrival of events. In fact, we now need to specify the agent's evolution according to the events that happen. In the proposed approach, the program Ps is actually affected by the events by means of subsequent syntactic transformations. Whenever the agent receives an external event, we ideally apply a suitable syntactic transformation and compute the least Herbrand model, thus generating a 'snapshot' of the agent's change process. The declarative semantics of the agent program PAg at a certain stage then coincides with the declarative semantics of the version of Ps at that stage. Initially, many of the rules of Ps will in general not be applicable, since no external and present events are available, and no past events are recorded. Later on, as soon as external events arrive and internal events happen, the reactive and proactive behavior of the agent will be put to work.
Example 2. The least Herbrand model of PsEx is {have_time}.
In order to obtain the evolutionary declarative semantics of P, we explicitly associate to Ps the list of the external events that are assumed to have arrived up to a certain point (in the order in which they are supposed to have been received) and the list of internal events that have become true up to that point. In this context, we make the simplifying assumption that all internal events are attempted at the same default frequency, i.e., at each step. We also assume that past events are never removed. Let EXTH be the set of all ground instances of atoms corresponding to external events which occur in the DALI logic program. Similarly, let INTH be the set of ground instances of internal events. Finally, let ACTH be the set of ground instances of action atoms. Given the above sets, at each step, say n, we can consider the subsets: EXTHn, containing the external events perceived up to that step; INTHn, containing the internal events that have succeeded up to that step; and ACTHn, containing the actions performed up to that step. We consider these sets as lists, which reflect the order in which external events/internal events/actions respectively happened/were proved/were performed. We let P0 = ⟨Ps, [ ], [ ], [ ]⟩, to indicate that initially no event has happened and no action has been performed. Later on, the program Ps will be modified by the perceived external events, the triggered internal events and the performed actions. Namely, let Pn = ⟨Progn, EXTHn, INTHn, ACTHn⟩ be the result of n steps, where Progn is the current program, obtained from Ps step by step by means of the program-transformation transition function Γevents^DALI. In particular, Γevents^DALI specifies that, at the n-th step, the current external event En is added to the program as a fact. En is also added as a present event. The immediate-consequence operator will consequently be able to apply both the reactive rule related to En and the rules in whose bodies the corresponding present event occurs. The performed action (if any) is added as a fact in the form of a past action. The internal event (if any) that has been proved at this step is added as a fact, thus enabling the corresponding reactive rule. The previous step's external/internal event En−1 is removed and is added as a past event. In the definitions below, given a program P and a fact F, the notation P ∪ F stands for the program obtained by adding F to P.
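Before turning to the transition function, note that the 'snapshot' least Herbrand models can be computed, for ground programs, by the standard naive immediate-consequence (TP) fixpoint. The sketch below (runnable, e.g., in SWI-Prolog) uses our own encoding of programs as clause(Head, BodyAtoms) terms, with bodies given as lists of atoms.

    % least_model(+Program, -Model): iterate TP from the empty
    % interpretation until a fixpoint is reached.
    least_model(Program, Model) :-
        tp_fixpoint(Program, [], Model).

    tp_fixpoint(Program, I, Model) :-
        % TP(I): heads of all clauses whose body atoms are all in I.
        findall(H,
                ( member(clause(H, B), Program),
                  forall(member(A, B), member(A, I)) ),
                Hs),
        sort(Hs, I1),
        ( I1 == I -> Model = I
        ; tp_fixpoint(Program, I1, Model)
        ).

For instance, the query least_model([clause(p, [b, a]), clause(b, []), clause(a, [b])], M) yields M = [a, b, p], matching the least model {p, b, a} computed in the text above.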
The program-transformation transition function Γevents^DALI relates to an external event, an internal event and an action all together. In case any of them is missing at a given step, the corresponding part of the definition simply does not apply; i.e., we assume that at each step any of them may be empty. If instead two or more events/actions occur simultaneously, for the sake of simplicity we assume that they are considered one by one in subsequent steps.
Definition 11 (Transition function). Let EjE be an external event, IjI an internal event and AjA an action, ∀ j ≥ 1. The program-transformation transition function Γevents^DALI is defined as follows:
Pn = Γevents^DALI(Pn−1, EnE, InI, AnA) = ⟨Γevents^DALI(⟨Progn−1, EXTHn−1, INTHn−1, ACTHn−1⟩, EnE, InI, AnA), [EnE | EXTHn−1], [InI | INTHn−1], [AnA | ACTHn−1]⟩
where the program component is obtained as follows:
Γevents^DALI(⟨Ps, [ ], [ ], [ ]⟩, E1E, I1I, A1A) = Ps ∪ E1E ∪ E1N ∪ I1I ∪ A1P
and
Γevents^DALI(⟨Progn−1, [En−1E | Te], [In−1I | Ti], [An−1A | Ta]⟩, EnE, InI, AnA) = (Progn−1 ∪ EnE ∪ EnN ∪ En−1P ∪ InI ∪ In−1P ∪ An−1P) \ En−1E \ En−1N \ In−1I.
Definition 12 (Program evolution). Let Ps be a DALI program, EXTHn = [EnE, . . . , E1E] the list of external events, INTHn = [InI, . . . , I1I] the list of internal events and ACTHn = [AnA, . . . , A1A] the list of actions. Let, for all i, Pi = Γevents^DALI(Pi−1, EiE, IiI, AiA). The list [P0, . . . , Pn], which we denote by PE(Ps, EXTHn, INTHn, ACTHn) (or PE for short, if we assume the arguments as given), is the program evolution of Ps with respect to EXTHn, INTHn and ACTHn.
We can generalize the semantics by allowing an agent to be restarted from a previously reached stage rather than from scratch. Namely, while in the definitions above we let P0 = ⟨Ps, [ ], [ ], [ ]⟩, we can later let P0 = Pk, where Pk is the result of a previous evolution.
Definition 13 (Model evolution). Let Ps be a DALI program, EXTH, INTH and ACTH the lists of events and actions, and PE = [P0, . . . , Pn] the program evolution of Ps with respect to EXTH, INTH and ACTH. Let Mi be the semantics S(Pi) of the program at step i. The list [M0, . . . , Mn], which we denote by ME(Ps, EXTHn, INTHn, ACTHn) (or ME for short, if we assume the arguments as given), is the model evolution of Ps with respect to PE, and Mi is the instant model at step i.
As mentioned before, we assume the semantics S(Pi) to be either the least Herbrand model or the well-founded model of Progi. Notice that the program evolution and the model evolution are not independent. In fact, while each external event EiE reaches the agent from the outside, both the internal event IiI and the action AiA occur in Mi−1, i.e., they have been proved/performed at the previous step.
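Continuing the encoding used above, one (non-initial) step of Γevents^DALI on the program component can be sketched as follows. For brevity, the program is treated here as a list of event facts (rules would be carried along unchanged); past/1 and present/1 are hypothetical wrappers standing for the P and N postfixes, and subtract/3 is SWI-Prolog's list-difference predicate.

    % step(+Prog0, +EPrev, +IPrev, +APrev, +E, +I, -Prog): remove the
    % previous external event, its present version and the previous
    % internal event; then add the new external and present events, the
    % new internal event, and the past versions of the previous events
    % and of the previously performed action.
    step(Prog0, EPrev, IPrev, APrev, E, I, Prog) :-
        subtract(Prog0, [EPrev, present(EPrev), IPrev], Prog1),
        append([E, present(E), past(EPrev), I, past(IPrev), past(APrev)],
               Prog1, Prog).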
Definition 14 (Evolutionary semantics). Let Ps be a DALI program and EXTH, INTH and ACTH the lists of events and actions. The evolutionary semantics εPs of Ps with respect to EXTH, INTH and ACTH is the couple ⟨PE, ME⟩.
Example 3. Assume that we let, as discussed above, P0 = ⟨PsEx, [ ], [ ], [ ]⟩. Then, M0 = {have_time}. Assume that the external event invitationE arrives. We have to compute Γevents^DALI(P0, invitationE, _, _), where the '_' indicates that neither the internal event nor the action is present. The function adds to the program the new facts invitationE and invitationN, thus obtaining PsEx1. Consequently, we get P1 = ⟨PsEx1, [invitationE], [ ], [ ]⟩. Now, the first two rules of the program can fire, and thus we have M1 = {have_time, invitationE, invitationN, acceptA}. Notice that the first rule corresponds to the translation of the reactive rule related to the external event, while the second rule has been created to enable the action. Assume that no other external event arrives. However, in M1 we have an action, and thus we have to compute Γevents^DALI(P1, _, _, acceptA). The effect is twofold: the external and present events invitationE and invitationN are removed; the past event invitationP and the past action acceptP are added. We thus obtain PsEx2. Consequently, P2 = ⟨PsEx2, [invitationE], [ ], [acceptA]⟩. The internal event can now be proved, and M2 = {have_time, invitationP, acceptP, happy}. That is, the internal event happy has become true. With no other external event, we compute Γevents^DALI(P2, _, happyI, _). It adds happyI as a fact, obtaining PsEx3 and P3 = ⟨PsEx3, [invitationE], [happyI], [acceptA]⟩. The (translation of the) reactive rule corresponding to the internal event happyI can fire, and the action join_friendsA is enabled. Thus, M3 = {have_time, invitationP, acceptP, happy, happyI, join_friendsA}. The evolutionary semantics of PEx at this stage is ⟨PE, ME⟩ with PE = [P0, P1, P2, P3] and ME = [M0, M1, M2, M3]. At the next step, the past events happyP and join_friendsP would be added, and happyI would be removed.
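With the least_model/2 sketch given earlier, the snapshot M1 of Example 3 can be reproduced directly (again using our clause/2 list encoding, with the two event facts added by the transition):

    % PsEx1 = PsEx plus the facts invitationE (external event) and
    % invitationN (present event); its least Herbrand model is M1.
    ?- least_model([clause(invitationE, [invitationE, acceptA]),
                    clause(acceptA, [invitationE]),
                    clause(happy, [invitationP]),
                    clause(happyI, [happyI, join_friendsA]),
                    clause(join_friendsA, [happyI, have_time]),
                    clause(have_time, []),
                    clause(invitationE, []),
                    clause(invitationN, [])], M).
    % M = [acceptA, have_time, invitationE, invitationN]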
6 Semantics of Communication
In this section we extend the declarative semantics of DALI so as to encompass the communication part. To do so, we have to modify the initialization stage, i.e., we have to extend the program-transformation initialization function Γinit^DALI. To this purpose, we build upon some of the authors' previous work on meta-logic. In [1], a logical framework called RCL (Reflective Computational Logic) is introduced, based on the concept of "reflection principle". Reflection principles are understood in RCL as logical schemata intended to capture the basic properties of a domain. The purpose of introducing reflection principles is to make it easier to build a complex theory, by allowing a basic theory to be enhanced by a compact meta-theory. These schemata need however to be given a role in the theory, both semantically (thus obtaining a declarative semantics for the resulting theory) and procedurally
(making them usable in deduction). To this aim, they are interpreted as procedures, more precisely as functions that transform rules into (sets of) rules. These new Horn clauses are called "reflection axioms". Then, the model-theoretic and fixpoint semantics of the given program plus a reflection principle coincides with the corresponding semantics of the program obtained from the given one by adding the reflection axioms.
Definition 15. Let C be a definite clause. A reflection principle R is a mapping from rules to (finite) sets of rules. The rules in R(C) are called reflection axioms.
Definition 16. Let R be a reflection principle. Let R(P) be the set of reflection axioms obtained by applying R to all the clauses of P. Let P′ = P ∪ R(P) be the resulting program. Let ΓR be a function that performs the transformation from P to P′.
Reflection principles thus allow extensions to be made to a logic language, like for instance the Horn-clause language, leaving the underlying logic unchanged. Several reflection principles can be associated to a program. A potential drawback is that the resulting program P ∪ R(P) may have, in general, a large number of rules, which is allowed in principle but difficult to manage in practice. To avoid this problem, one can apply reflection principles in the inference process only as necessary, which means computing the reflection axioms on the fly as needed. In [1] an extended resolution procedure with this behavior is defined. Reflection principles can be expressed as axiom schemata in the form
new rules ⇐ given rule
The left-hand side of ⇐ denotes a (set of) rule(s) (possibly facts) which is produced by applying the given reflection principle to the program at hand. The right-hand side is the starting rule (the one that actually occurs in the given program), plus possibly some conditions for the application of the correspondence. For coping with the DALI communication architecture, it is then sufficient to augment Γinit^DALI so as to apply suitable reflection principles. In particular, we add the following three, which for the sake of brevity we denote together by RDcomm. The first reflection principle takes a told rule occurring in the DALI logic program and, assuming a generic incoming message message_received(Ag, primitive(Content, Sender)), generates the actual filter rule where the constraints are instantiated with the message elements primitive, Content, Sender. The second reflection principle generates an external event from every successful application of told; notice that it acts on the actual told rules generated by the first one. The last reflection principle transforms a successful application of a tell rule into a message to be sent.
told(Ag, primitive(Content)) :- constraint1, . . . , constraintn, message_received(Ag, primitive(Content, Sender)).
⇐ told(Ag, primitive(Content)) :- constraint1, . . . , constraintn.
primitive(Content)E :- told(Ag, primitive(Content)).
⇐ told(Ag, primitive(Content)) :- constraint1, . . . , constraintn, message_received(Ag, primitive(Content, Sender)).
message_to_be_sent(To, Comm_primitive(Content)) :- tell(To, Comm_primitive(Content)), constraint1, . . . , constraintn
⇐ tell(To, Comm_primitive(Content)) :- constraint1, . . . , constraintn.
Let ΓRDcomm be the function that augments a program P by applying RDcomm. The new initialization stage is then performed by a program-transformation function Γinit^DALIcomm that, beyond coping with event and action rules, also applies the reflection principles related to communication.
Definition 17. We define Γinit^DALIcomm as the composition of Γinit^DALI and ΓRDcomm.
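For illustration, the first reflection principle can be read as a clause-rewriting function. The sketch below reuses the clause/2 list representation from before; wrapping the message content in a msg/2 term is our own simplification of the primitive(Content, Sender) structure.

    % reflect_told(+ToldRule, -ReflectionAxiom): append the
    % message_received/2 condition to the body of a told rule,
    % yielding the actual communication-filter rule.
    reflect_told(clause(told(Ag, Content), Constraints),
                 clause(told(Ag, Content), Body)) :-
        append(Constraints,
               [message_received(Ag, msg(Content, _Sender))],
               Body).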
Example 4. Let us assume to add to PEx the following rules, which state that incoming communications are accepted only if they arrive from "friendly" agents, and that (for example) anne is a friend:
told(Ag, C) :- friend(Ag).
friend(anne).
The function ΓRDcomm, applied in the context of the initialization step, would transform the rules and, taking into account that the only external event occurring in the program is invitationE, the result would be:
told(Ag, invitationE) :- friend(Ag), message_received(Ag, invitationE).
invitationE :- told(Ag, invitationE).
friend(anne).
What changes is that now the incoming external event at stage i − 1 is taken from Mi, while what comes from the outside is message_received(Ag, primitive(Content)). Only the external events that pass the communication filter may occur in Mi. In the example, an invitation sent by (for instance) agent fred, who is not among the "friends", would be ignored.
6.1 Generalization
Reflection principles allow many kinds of language extensions to be modeled in a uniform way. For instance, another DALI feature that can be modeled is the attempt to understand message contents. I.e., it can be the case that a Content passes the told filter, but cannot be added as an external event because it is not understandable by the agent, as it does not occur in the head of any reactive rule. In this case, Content is automatically submitted to a procedure meta, which is predefined but user-extensible, and which tries (possibly, for instance, by using ontologies) to translate Content into an equivalent though understandable form. Actually, the program transformations performed by Γinit^DALI for reactive and action rules can also be represented by means of reflection principles. One may find the notion of the semantics of a given program being defined as the semantics of the program after the transformations awkward. Again by resorting
to RCL, it is possible to clean up this notion by introducing the concept of a reflective model.
Definition 18. Let I be an interpretation of a program P. Then, I reflectively satisfies P (with respect to a (set of) reflection principle(s) R) if and only if I satisfies P′ = ΓR(P) = P ∪ R(P).
Definition 19. Let I be an interpretation of a logic program P. Then, I is a reflective model of P if and only if I reflectively satisfies P.
The model intersection property still holds, so there exists a least reflective Herbrand model of P. Then, in the evolutionary semantics we may let S be the least reflective Herbrand model of Progn.
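A direct computational reading of these definitions, under the same list encoding and with the least_model/2 sketch from before (and thus restricted to ground programs), might look as follows; R is passed as the name of a binary rewriting predicate such as reflect_told.

    % reflective_least_model(+P, +R, -M): M is the least model of
    % P' = P ∪ R(P), where R(P) collects the reflection axioms
    % produced by applying R to each clause of P.
    reflective_least_model(P, R, M) :-
        findall(Ax, (member(C, P), call(R, C, Ax)), Axioms),
        append(P, Axioms, PPrime),
        least_model(PPrime, M).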
7 Concluding Remarks
In this paper we have presented an approach to giving a declarative semantics to logical agents that evolve according to both their perceptions and their internal way of "reasoning". This semantics is evolutionary, as it models each step of this evolution as the generation of an updated agent program with a correspondingly updated semantics. The proposed approach may constitute a ground for comparing different languages/approaches, according to: (i) which factors trigger the transition from one version of the agent program to the next one; (ii) which kind of transformation is performed, and which changes this implies in the agent behavior. Moreover, such a semantic view of logical agents can make verification techniques such as model checking easier to apply in this field. In logic-programming-based languages such as DALI, a procedural semantics can be defined that corresponds to the declarative one [20] and can then be linked to an operational model [9]. Another advantage of the approach for logic-programming-based languages is that all the analysis, debugging and optimization techniques related to this kind of language (such as methods for program analysis and optimization, abstract interpretation, partial evaluation, debugging, etc.) remain applicable. Several aspects of the agent behavior are at the moment not described by the proposed semantics, e.g., how often to check for incoming messages, how often to perform belief revision, etc. These aspects do not affect the logical semantics of the agent, but do affect its run-time behavior in a relevant way, according to Kowalski's famous principle Program = Logic + Control. In practice, this kind of "tuning" can be done via directives associated with the agent program. Directives can even be specified in a separate module, to be added to the agent program when the agent is initialized. Then, on the one hand, directives can be modified without even knowing the agent program. On the other hand, the same agent program with different directives results in a different agent (e.g., apparently quicker or lazier, eager to remember or ready to forget things, etc.). Directives can be coped with in the operational semantics of the language [9]. It appears more difficult to account for them in the declarative semantics. It also appears difficult to model time in a proper way; this will however be an important subject of future research.
References
1. J. Barklund, S. Costantini, P. Dell'Acqua and G. A. Lanzarone, Reflection Principles in Computational Logic, J. of Logic and Computation, Vol. 10, N. 6, December 2000, Oxford University Press, UK.
2. P. Bonatti, J. Dix, T. Eiter, S. Kraus, F. Ozcan, R. Ross and V. S. Subrahmanian, Heterogeneous Agent Systems, The MIT Press, 2000.
3. A. Bracciali, N. Demetriou, U. Endriss, A. Kakas, W. Lu, P. Mancarella, F. Sadri, K. Stathis, G. Terreni and F. Toni, The KGP Model of Agency for Global Computing: Computational Model and Prototype Implementation. In Global Computing, LNCS 3267, 2004.
4. M. E. Bratman, D. J. Israel and M. E. Pollack, Plans and resource-bounded practical reasoning, Computational Intelligence, Vol. 4, pp. 349–355, 1988.
5. K. L. Clark and G. McCabe, Go! A multi-paradigm programming language for implementing multi-threaded agents, Annals of Mathematics and Artificial Intelligence 41, ISSN 1012-2443, pp. 171–206, 2004.
6. S. Costantini, Towards active logic programming. In A. Brogi and P. Hill, editors, Proc. of the 2nd International Workshop on Component-Based Software Development in Computational Logic (COCL'99). Available on-line, URL http://www.di.unipi.it/~brogi/ResearchActivity/COCL99/proceedings/index.html.
7. S. Costantini and A. Tocchio, A Logic Programming Language for Multi-agent Systems. In S. Flesca, S. Greco, N. Leone, G. Ianni (eds.), Logics in Artificial Intelligence, Proc. of the 8th Europ. Conf., JELIA 2002, LNAI 2424, Springer-Verlag, 2002.
8. S. Costantini and A. Tocchio, The DALI Logic Programming Agent-Oriented Language. In J. J. Alferes and J. Leite (eds.), Logics in Artificial Intelligence, Proc. of the 9th Europ. Conf., JELIA 2004, Lisbon, September 2004, LNAI 3229, Springer-Verlag, Germany, 2004.
9. S. Costantini, A. Tocchio and A. Verticchio, A Game-Theoretic Operational Semantics for the DALI Communication Architecture. In Proc. of WOA04, Turin, Italy, ISBN 88-371-15334, December 2004.
10. S. Costantini, A. Tocchio and A. Verticchio, Communication and Trust in the DALI Logic Programming Agent-Oriented Language. In M. Cadoli, M. Milano and A. Omicini (eds.), Italian J. of Artificial Intelligence, March 2005.
11. M. d'Inverno, K. Hindriks and M. Luck, A formal architecture for the 3APL agent programming language. In Proc. of the First Intern. Conf. on B and Z Users, LNCS 1878, Springer-Verlag, 2000, pp. 168–187.
12. FIPA, Communicative Act Library Specification, Technical Report XC00037H, Foundation for Intelligent Physical Agents, 10 August 2001.
13. M. Gelfond and V. Lifschitz, The stable model semantics for logic programming. In Proc. of the Fifth International Conference and Symposium on Logic Programming, The MIT Press, Cambridge, MA, 1988, pp. 1070–1080.
14. R. A. Kowalski, How to be Artificially Intelligent - the Logical Way. Draft, revised February 2004. Available on-line, URL http://www-lp.doc.ic.ac.uk/UserPages/staff/rak/rak.html.
15. R. A. Kowalski and F. Sadri, Towards a unified agent architecture that combines rationality with reactivity. In Proc. of the International Workshop on Logic in Databases, San Miniato (PI), Italy, LNCS 1154, Springer-Verlag, Berlin, 1996.
16. R. A. Kowalski and F. Sadri, An Agent Architecture that Unifies Rationality with Reactivity. Department of Computing, Imperial College, 1997.
17. J. W. Lloyd, Foundations of Logic Programming, Springer-Verlag, 1987.
18. A. S. Rao, AgentSpeak(L): BDI agents speak out in a logical computable language. In W. Van de Velde and J. W. Perram (eds.), Agents Breaking Away: Proc. of the Seventh Europ. Work. on Modelling Autonomous Agents in a Multi-Agent World, LNAI 1038, Springer-Verlag, Heidelberg, Germany, pp. 42–55, 1996.
19. A. S. Rao and M. Georgeff, BDI Agents: from theory to practice. In Proc. of the First Int. Conf. on Multi-Agent Systems (ICMAS-95), San Francisco, CA, pp. 312–319, 1995.
20. A. Tocchio, Multi-Agent Systems in Computational Logic, Ph.D. Thesis, Dipartimento di Informatica, Università degli Studi di L'Aquila, 2005.
21. E. C. Van der Hoeve, M. Dastani, F. Dignum and J.-J. Meyer, 3APL Platform. In Proc. of the 15th Belgian-Dutch Conference on Artificial Intelligence (BNAIC 2003), Nijmegen, The Netherlands, 2003.
22. A. Van Gelder, K. A. Ross and J. Schlipf, The well-founded semantics for general logic programs, J. of the ACM 38(3), pp. 620–650, 1990.
23. M. Wooldridge and N. R. Jennings, Intelligent agents: Theory and practice, Knowl. Eng. Rev., Vol. 10, N. 2, pp. 115–152, 1995.
Goal Decomposition Tree: An Agent Model to Generate a Validated Agent Behaviour Gaële Simon, Bruno Mermet, and Dominique Fournier University of Le Havre {Gaele.Simon, Bruno.Mermet, Dominique.Fournier}@univ-lehavre.fr
Abstract. This paper deals with a goal-oriented agent model called the Goal Decomposition Tree (GDT), which allows both specifying and validating the behaviour of an agent. This work takes place in a global framework whose goal is to define a process that starts from a problem specification and leads to a validated implementation of a corresponding MAS. The GDT model has been used to specify a prey-predator system, which has been verified this way.
1 Introduction
Our research deals with methods and models intended to help multiagent system designers to manage the complexity of MAS. We also aim at helping to develop multiagent systems whose behaviour can be verified. Our proposal takes place in the context of "formal transformation systems" as defined in [16]: "Formal transformation systems provide automated support to system development, giving the designer much more confidence that the resulting system will operate correctly, despite its complexity". The main principle of a formal transformation system is that each transformation step must preserve correctness from one model to the next. This constraint has been used as a guide to design our approach, which consists of four steps:
1. an agentification method that helps to determine the set of agents which must be used to implement a given system;
2. an agent design model that helps to design an agent behaviour that can be verified;
3. a proof system to prove that the agent model satisfies the main goal of the agent;
4. an implementation model that can be automatically generated from the agent design model.
Our aim is to provide a complete MAS design process starting from the problem specification and ending with a MAS implementation. Moreover, as is said in [16] about formal transformation systems, "since each transformation preserves correctness from one model to the next, the developer has much more confidence that no inconsistencies or errors occurred during the design process". Our agentification method and our implementation model have already been presented in other articles [10, 14]. This paper is focused on the agent model called GDT.
In order to be able to implement and validate an agent, its behaviour must be clearly and formally specified. In our proposal, this specification is based on a Goal Decomposition Tree (GDT), which helps to describe how an agent can manage its goals. It is used
– as a support for agent behaviour validation and proof;
– to guide the implementation of the agent behaviour induced by the tree, using an automaton as in our implementation model called SPACE [10].
The main contribution of this model is that it can be proven that the specified agent behaviour is correct with respect to the main goal of the agent. The proof mechanisms are detailed in [6]. That is why the aim of this model is to provide a declarative description of goals. Several works have already pointed out the advantage of having a declarative description of goals [19], [4], [17], [11], [3]. Many multiagent models or systems are essentially focused on the procedural aspects of goals, which is important in order to obtain an executable agent. But the declarative aspect of goals is also very important. Indeed, as is said in [19], "by omitting the declarative aspect of goals, the ability to reason about goals is lost. Without knowing what a goal is trying to achieve, one can not check whether the goal has been achieved, check whether the goal is impossible". In [17], the authors say that declarative goals "provide for the possibility to decouple plan execution and goal achievement". A GDT is a partial answer to these requirements: as will be shown in the next sections, both the procedural and the declarative aspects of goal management can be described by a GDT. Another important aspect of a GDT is that it is intended to be used to directly generate the behaviour of the agent to which it is associated. Indeed, as explained in [17], "the biggest role of goals in agents is thus not the ability to reason about them but their motivational power of generating behaviour". Moreover, "A certain plan or behaviour is generated because of a goal". This is exactly what a GDT allows one to express. The nodes of a GDT are goals the agent has to achieve. As in [19], goals are considered as states of the world the agent has to reach. Inside the tree, a goal is decomposed into subgoals using decomposition operators. The notion of subgoal used here is similar to the one described in [17]: "a goal can be viewed as a subgoal if its achievement brings the agent closer to its top goal". The notion of "closeness" used by van Riemsdijk et al. is specified differently by each decomposition operator. In the same paper, the authors distinguish "subgoals as being the parts of which a top goal is composed" and "subgoals as landmarks or states that should be achieved on the road to achieving a top goal". In a GDT, both kinds of subgoals exist. The fact that a subgoal is of a particular kind is a direct consequence of the decomposition operator which has been used to introduce it. Not all works on agent goal management use the same notion of subgoal. In [17] or [4], a subgoal is considered as a logical consequence of a goal. So, in these works, subgoals can be considered as necessary conditions for the satisfaction of the parent goal. In our vision, subgoals are on the contrary sufficient conditions for the satisfaction of the parent
goal. The TAEMS model [18] does not use goals but tasks, which, however, can be compared to the goals used in our work. A decomposition operator encapsulates a set of mechanisms corresponding to a typical goal-management behaviour ([4], [17], [19]). Each operator is specified by different kinds of semantics:
– a goal decomposition semantics, describing how a goal can be decomposed into subgoals with this operator;
– a semantics describing how to deduce the "type" of the parent goal knowing the types of its subgoals;
– a semantics associating an automata composition pattern to each operator; these patterns are used incrementally to build the complete automaton describing the agent behaviour;
– a semantics associating a local proof schema; this schema is used to verify the agent behaviour (i.e., to prove that its goal-management behaviour satisfies its main goal). This semantics is described in [6].
Section 2 defines the notion of goal as it is used in this work and describes the typology of goals which has been defined. Section 3 describes the set of operators which can be used to decompose a goal into subgoals inside a GDT. For each operator, the first two semantics described above are given. Section 4 defines a GDT more precisely and shows how it can be built using the tools described in the two previous sections. Last but not least, Section 5 presents a comparison of the proposed model with other works.
2 Goals and Typology of Goals
In the context of a GDT, a name and a satisfaction condition are associated to each goal. According to the type of each goal, additional information is also used to completely specify the goal. Satisfaction conditions are used to specify goals formally, with respect to the declarative requirement for goals described in the previous section. A goal is considered to be achieved if its satisfaction condition is logically true. Satisfaction conditions are expressed using a temporal logic formalism which is a subset of TLA [9]. More precisely, primed variables are used. For example, if x is a variable, the notation x in a satisfaction condition corresponds to the value of x before the execution of the goal to which the condition is associated. On the contrary, the notation x′ corresponds to the value of x after the goal execution. Note that more complex temporal logic formulae are also used to specify the semantics of the decomposition operators described in Section 3. These formulae, which are not described in this paper for lack of space, are then used during the proof process. However, an example of such a formula is given in Section 3.1 for the SeqAnd operator. Variables are supposed to be attributes managed by the agent. Thus, specifying the goals of an agent also helps to define the set of variables defining the view of the agent on its environment. For example, in [1], the authors describe a case study where two robots are collecting garbage on Mars. One of the two
robots, named R1, must move in order to discover pieces of garbage which must be picked up. As a consequence, for R1, being in a location where there is a piece of garbage corresponds to a goal. The satisfaction condition of this goal can be defined by: garbage = true, where garbage is supposed to be an internal variable of the agent which describes its perception of its current location. This variable is supposed to be updated each time the agent moves, i.e., each time its coordinates (x, y) (which are also internal variables of the agent) are modified. A typology of goals has been defined in order to distinguish more precisely different ways to manage goal decomposition. The type of a goal has consequences on the execution process of this goal, on the design of its corresponding behaviour automaton and on the proof process of the behaviour implied by its execution by the agent. The first criterion to distinguish goals corresponds to the situation of the goal in the tree. This leads naturally to distinguish two first kinds of goals: elementary and intermediate goals.
Elementary goals: they correspond to the leaves of the tree; that is why they are not decomposed into subgoals. Furthermore, they are defined not only by a name and a satisfaction condition but also by a set of actions. The execution of these actions is supposed to achieve the goal, i.e., to make its satisfaction condition true. Notice that these actions are related to the capabilities of the agent as described in [4]. They correspond to the procedural aspect of goals described in the previous section. Like satisfaction conditions, they are based on variables of the agent. The aim of an action is to modify these variables. For example, for the robot R1 described above, moving one step right can be an elementary goal. Its satisfaction condition is: x′ = x + 1 ∧ y′ = y. The corresponding actions are: x := x + 1; garbage := containsGarbage(x, y). It is supposed that containsGarbage is a function associated to the environment allowing the agent to know if there is a piece of garbage at its current location.
Intermediate goals: they correspond to the nodes of the tree which are not leaf ones. They are specified by a name, a satisfaction condition and a Local Decomposition Tree (LDT). An LDT contains a root node corresponding to the intermediate goal and a decomposition operator which creates as many branches (and subgoals) as needed by the operator. It describes how the intermediate goal can be decomposed into subgoals, and sometimes in which context, in order to be achieved. The number and the order of the subgoals to be achieved in order to accomplish the parent goal depend on the chosen operator (see the next section for more details).
The second criterion used to characterize goals is related to goal satisfiability. Using this criterion, two kinds of goals are again distinguished: Necessarily Satisfiable goals (NS) and Not Necessarily Satisfiable goals (NNS).
Necessarily Satisfiable goals (NS): this kind of goal ensures that, once the goal has been executed, the satisfaction condition of the goal is necessarily true (the goal is achieved).
Not Necessarily Satisfiable goals (NNS): this set of goals is complementary to the previous one and is the most prevalent case. This kind of goal is not necessarily achieved after its actions (or its decomposition and the subgoals associated with it) have been executed. It takes into account the fact that some actions or decompositions can only be used in certain execution contexts; if these contexts do not hold when the goal is to be executed, these actions or decompositions become inoperative. This criterion is used for the automaton generation step and the proof process; it is not involved in the execution process of a goal. Moreover, during the GDT design step, using this criterion may help to verify that the tree is syntactically correct. Indeed, as will be shown in the next section, decomposition operators do not accept all kinds of goals with respect to this criterion. The third and last criterion used to distinguish goals is related to the evaluation of the satisfaction condition. Using this criterion, two other kinds of goals can be defined: Lazy goals (L) and Not Lazy goals (NL). Lazy goals (L): when a goal is a lazy one, its satisfaction condition is evaluated before considering its actions (for an elementary goal) or its decomposition (for an intermediate one). If the satisfaction condition is true, the goal is considered to be achieved, which implies that the set of actions or the decomposition associated with the goal is not executed; if the satisfaction condition is false, the set of actions or the decomposition is executed. As a consequence, primed variables cannot be used in the satisfaction condition associated with a lazy goal. Indeed, in this case, implicit stuttering of variables is assumed: if a variable is not used in the satisfaction condition, its value is considered unchanged. Not Lazy goals (NL): this set of goals is complementary to the previous one. For a not lazy goal, the associated set of actions or decomposition tree is always executed, even if the satisfaction condition is already true before the execution. This property is useful for goals which must absolutely always be executed, whatever the system execution context. This criterion can be compared to the requirement for goal management described in [19]: "...given a goal to achieve condition s using a set of procedures (or recipes or plans) P, if s holds, then P should not be executed". This requirement corresponds to the solving process of a lazy goal whose satisfaction condition is already satisfied at the beginning of the process. In conclusion, in the context of a GDT, each goal can be characterised by three criteria which can be combined independently. Figure 1 summarizes the graphical notations introduced for the last two criteria. Each criterion has two possible values, which implies that eight effective kinds of goals can be used in the tree. Formally, a goal is described by a 6-tuple ⟨name, sc, el, ns, lazy, LDT or actions⟩ with: name a string, sc (satisfaction condition) a temporal logic formula, el (elementary) a boolean, ns (necessarily satisfiable) a boolean, lazy a boolean, and a Local Decomposition Tree (LDT) or a set of actions (actions).
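As a concrete illustration of the elementary goal described earlier (R1's "move one step right"), here is a minimal Python sketch; the state dictionary, the stubbed containsGarbage function and the assumed garbage locations are illustrative choices, not part of the paper's formalism.

def contains_garbage(x, y):
    # Stub for the environment function containsGarbage(x, y).
    return (x, y) in {(3, 0), (5, 2)}  # assumed garbage locations

def move_right_actions(state):
    # Actions of the goal: x := x + 1; garbage := containsGarbage(x, y)
    state["x"] += 1
    state["garbage"] = contains_garbage(state["x"], state["y"])

def move_right_sc(before, after):
    # Satisfaction condition x' = x + 1 ∧ y' = y: unprimed values are read
    # from the state before execution, primed values from the state after.
    return after["x"] == before["x"] + 1 and after["y"] == before["y"]

state = {"x": 2, "y": 0, "garbage": False}
before = dict(state)
move_right_actions(state)
assert move_right_sc(before, state)   # the goal is achieved
print(state)                          # {'x': 3, 'y': 0, 'garbage': True}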
Fig. 1. NS, NNS, L and NL goals

boolean solve(G):
  if (G.lazy) then
    if (G.sc) then
      return(true);
    endif
  endif
  if (G.el) then
    execute(G.actions);
  else
    satisfy(G.LDT);
  endif
  return(G.sc);

Fig. 2. Algorithm for executing a goal in a GDT
From this definition, the process of goal execution in the GDT can be described by the algorithm given in Figure 2. The solve function used in this algorithm describes how the tree must be walked during the execution process. The execute function consists in executing the set of actions associated with the goal sequentially. The satisfy function is recursive and itself uses the solve function just defined. The satisfy function is detailed in the next section, which also describes the available decomposition operators.
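The following Python sketch mirrors the solve algorithm of Figure 2. The Goal class, the state encoding, and the restriction of satisfy to a single SeqAnd dispatch (the other operators are described in the next section) are assumptions made for illustration only.

from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Goal:
    name: str
    sc: Callable[[dict], bool]                        # satisfaction condition
    el: bool                                          # elementary goal?
    ns: bool                                          # necessarily satisfiable?
    lazy: bool
    actions: Optional[List[Callable[[dict], None]]] = None
    ldt: Optional[Tuple[str, "Goal", "Goal"]] = None  # (operator, B, C)

def solve(g: Goal, state: dict) -> bool:
    if g.lazy and g.sc(state):       # lazy goal already achieved: do nothing
        return True
    if g.el:
        for action in g.actions:     # execute the goal's actions in sequence
            action(state)
    else:
        satisfy(g.ldt, state)        # walk the local decomposition tree
    return g.sc(state)

def satisfy(ldt, state) -> bool:
    op, b, c = ldt
    if op == "SeqAnd":               # one case shown; see the next section
        return solve(b, state) and solve(c, state)
    raise NotImplementedError(op)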
3 Decomposition Operators
In this section, the available decomposition operators are described. For each operator, the first two semantics are given, that is to say the decomposition semantics and the goal-type composition semantics. The goal-type composition semantics is based only on the satisfiability mode of goals (NS or NNS); indeed, the other criteria have no direct influence on decomposition operators. And and Or operators are available but are not described because they correspond to the standard logical operators; in a tree, they are managed as nondeterministic operators. Before describing each operator, let us precisely define goal decomposition. Let A be the goal to be decomposed and Op the decomposition operator to be used. As almost all available operators are binary, Op is also supposed to be binary (but this does not modify the semantics of the decomposition). Let B and C be the subgoals introduced by the use of Op to decompose A. As described in the previous section, Op(B, C) corresponds to the Local Decomposition Tree associated with A. The semantics of this decomposition is that the satisfaction of
Op(B, C) (i.e., the LDT) implies the satisfaction of A. But what does the satisfaction of Op(B, C) mean? It corresponds exactly to the decomposition semantics of each operator: this semantics describes how many subgoals must be achieved, and in which order, for the parent goal to be satisfied. In other words, the satisfy function used inside the solve function given previously is different for each operator. So, in the sequel, the satisfy function is instantiated for each operator. Let us notice that, for all operators, this function uses the solve function in order to evaluate the satisfaction of subgoals.

3.1 SeqAnd Operator
This operator corresponds to a "sequential And". The main difference with the standard logical And operator is that the two subgoals must be executed in the order specified by the operator. Figure 3(a) gives the satisfy function corresponding to this behaviour. The composition semantics is the same as the And operator's, i.e., the parent goal is NNS unless the two subgoals are NS.

boolean satisfy(SeqAnd(B,C)):
  if (solve(B)) then
    return(solve(C));
  else
    return(false);
  endif
(a) SeqAnd

boolean satisfy(SeqOr(B,C)):
  if (solve(B)) then
    return(true);
  else
    return(solve(C));
  endif
(b) SeqOr

Fig. 3. SeqAnd and SeqOr satisfaction algorithms
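In Python, the two satisfy functions of Figure 3 reduce to short-circuit evaluation; solve is passed in as a callable so that the sketch stays self-contained, which is an assumption of this illustration.

def satisfy_seq_and(solve, b, c):
    # Fig. 3(a): C is attempted only if B has been solved first.
    return solve(b) and solve(c)

def satisfy_seq_or(solve, b, c):
    # Fig. 3(b): C is attempted only if B has failed.
    return solve(b) or solve(c)

Python's `and`/`or` short-circuit exactly as the pseudocode does, which is what enforces the left-to-right execution order.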
Last but not least, here is the temporal logic formula associated with this operator. This formula is supposed to be true when decomposing a lazy parent goal A into two subgoals B and C using the SeqAnd operator. It is a logical expression of the solve function, with G being A, G.ldt being (B SeqAnd C) and G being a lazy goal. (A second formula is used when the parent goal is not lazy.) This formula is used during the proof process:

(¬A ∧ (¬A ⇒ ◇B) ∧ ((¬A ⇒ ◇B) ⇒ ◇C) ⇒ ◇A)

□ is a temporal logical operator meaning "always"; ◇ is another temporal logical operator meaning "eventually". Informally, this formula means that if A is not satisfied (i.e., its satisfaction condition is false; this condition must be verified because A is a lazy goal), and if B can eventually be satisfied when A is not satisfied, and if C can eventually be satisfied in the previous context (i.e., when B can be satisfied while A is not satisfied), then A will eventually be satisfied (i.e., its satisfaction condition will eventually be true).

3.2 SeqOr Operator
The difference between SeqOr and the standard logical Or operator is the same as the one between SeqAnd and the standard logical And operator described
in the previous section. Figure 3(b) gives the satisfy function associated with SeqOr. Its composition semantics is the same as the Or operator's, i.e., the parent goal is NNS unless the two subgoals are NS.

3.3 SyncSeqAnd Operator
This operator is a synchronized version of the SeqAnd operator. Unlike SeqAnd, this operator ensures that the two subgoals (if they are both executed) are solved without any interruption by another agent. This operator should be used sparingly; by reducing the number of shared variables, the agentification method we have proposed [5] is designed to limit the cases where this kind of operator must be used. Its decomposition semantics, goal-type composition semantics and corresponding satisfaction algorithm are the same as the SeqAnd operator's.

3.4 SyncSeqOr Operator
The difference between SyncSeqOr and SeqOr is the same as the one between SyncSeqAnd and SeqAnd described in the previous section: this operator is a synchronized version of the SeqOr operator.

3.5 Case Operator
This operator decomposes a goal into subgoals according to conditions defined by logical formulae. These logical formulae use the same variables as satisfaction conditions. The decomposition semantics of this operator states that the disjunction of the conditions must always be true when the parent goal is decomposed. The principle is that if a condition is true, the corresponding subgoal must be executed; the satisfaction of the parent goal then depends on the satisfaction of the chosen subgoal. This semantics is summarised by the associated satisfy function given in Figure 4. As far as the composition semantics of the operator is concerned, there are four possible trees, as shown in Figure 5. If the subgoals are both necessarily satisfiable, the parent goal is necessarily satisfiable; if at least one of the subgoals is not necessarily satisfiable, the parent goal is not necessarily satisfiable. It is important to notice that the property of being "necessarily satisfiable" is slightly different in the context of the Case operator: here, a subgoal is necessarily satisfiable only if its associated condition is true, whereas for the other operators, when a goal is declared to be necessarily satisfiable, this is true in any context. This characteristic is particularly useful for the proof process.

boolean satisfy(Case(B1,B2,condB1,condB2)):
  if (condB1) then
    return(solve(B1))
  else if (condB2) then
    return(solve(B2))
  endif

Fig. 4. Case satisfaction algorithm
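A Python sketch of the Case satisfaction algorithm of Figure 4; passing the conditions as booleans already evaluated over the agent's variables is a simplifying assumption of this sketch.

def satisfy_case(solve, b1, b2, cond_b1, cond_b2):
    # The decomposition semantics guarantees cond_b1 ∨ cond_b2 whenever
    # the parent goal is decomposed, so one branch is always taken.
    if cond_b1:
        return solve(b1)
    elif cond_b2:
        return solve(b2)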
Fig. 5. Case operator composition semantics
3.6 Iter Operator
This operator is a unary one. Its main feature is that its behaviour depends on the satisfaction condition of the parent goal. The decomposition semantics of this operator states that the parent goal will be achieved if the subgoal is achieved several times. In other words, the satisfaction condition of the subgoal must become true several times in order for the satisfaction condition of the parent goal to become true. Nevertheless, it is possible that, for example, modifications in the environment of the agent imply that the satisfaction condition of the parent goal becomes true even though the subgoal has not really been satisfied. In that case, the parent goal is considered to be satisfied and the solving process of the subgoal must stop. This operator is very important because it takes into account a notion of progress inside a goal execution process. For example, let us suppose that the satisfaction condition of the parent goal A is "to be at location (x, y)" and that the agent can only move one step at a time. As a consequence, the execution of A must be decomposed into n executions of the subgoal "move one step", n being the number of steps between the current location of the agent and the desired final location. This operator can only be used when the satisfaction of the subgoal implies progress towards the satisfaction of the parent goal: each time the subgoal is satisfied, the satisfaction of the parent must be closer. However, it is sometimes possible that the subgoal cannot be satisfied (because the context of the agent has changed, for example). In this case, the satisfaction degree of the parent goal stays at the same level and the subgoal must be executed again. The important characteristic of this operator is that the satisfaction level of the parent goal cannot regress after a satisfaction step of the subgoal, even if this step has failed; if it does regress, the Iter operator should not have been used. The proof schema associated with the Iter operator helps to verify this property. The overall behaviour of the operator described in the previous paragraph is summarised by the associated satisfy function given in Figure 6(a). The goal-type composition semantics of this operator is summarised in Figure 6(b): the subgoal can be either necessarily satisfiable or not, but the parent goal is always necessarily satisfiable. Indeed, the behaviour of the operator implies that the solving process of the subgoal stops when the satisfaction condition of the parent goal is true, which implies that the parent goal is necessarily satisfiable.
boolean satisfy(Iter(sc,B)):
  repeat
    solve(B);
  until (sc)
  return true;
(a) Iter satisfaction algorithm

(b) Iter composition semantics

Fig. 6. Iter Operator
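A Python sketch of the Iter satisfaction algorithm of Figure 6(a); modelling sc as a zero-argument predicate re-evaluated after each iteration is an assumption of this sketch.

def satisfy_iter(solve, sc, b):
    # Re-solve the subgoal until the parent's satisfaction condition holds,
    # even if an individual solving step fails; this is why the parent goal
    # is always necessarily satisfiable.
    while True:
        solve(b)
        if sc():
            return True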
3.7 Comparison with Other Works on Agent Goals Decomposition
Other works on the goal management of an agent propose mechanisms to express relations between goals or subgoals. In this paragraph, two of them are detailed in order to be precisely compared with GDT. In GOAL [4], the authors propose a global logical framework to formalise the goal management behaviour of an agent. In this framework, the state of an agent is defined by a mental state ⟨B, G⟩ which consists of the beliefs and goals of the agent. Beliefs and goals are modelled by logical formulae: B(F) is a belief and G(F) is a goal, F being a logical formula. In this framework, a goal must not be deducible from the set of beliefs. When performing an action, if a goal becomes a consequence of the modified set of beliefs, it is removed from the goal base: the goal is considered to be achieved. The behaviour of the agent is specified by a set of conditional actions. A conditional action is a pair φ → do(a) where φ is a condition (a logical formula) and a is an action. There are three kinds of actions:
– belief management actions, which manage the set of beliefs:
  • ins(φ) adds B(φ) to the set of beliefs,
  • del(φ) deletes B(φ) from the set of beliefs;
– goal management actions, which explicitly manage goals:
  • adopt(φ) adds G(φ) to the set of goals,
  • drop(φ) deletes G(φ) from the set of goals;
– basic actions, which are described by a semantic function τ, a partial function that modifies the set of beliefs of the agent.
For instance, here is how our SeqAnd operator could be translated into GOAL. Let us suppose that A is a goal that is decomposed into SeqAnd(X, Y), that is to say SeqAnd(X, Y) ⇒ A. The behaviour of the agent corresponding to the execution of the goal A can then be described by the following conditional actions:
– G(A) ∧ ¬B(X) → do(adopt(X)) (if A must be satisfied and X is not yet believed, then X becomes a goal of the agent);
– G(A) ∧ B(X) ∧ ¬B(Y) → do(adopt(Y)) (if A must be satisfied and X has already been achieved (and is thus a belief), then Y becomes a goal of the agent);
– G(A) ∧ B(X) ∧ B(Y) → do(ins(A)) (if A must be satisfied and X and Y have been achieved, then A is achieved and can be added to the set of beliefs of the agent; it will also be removed from the set of goals of the agent, because GOAL agents implement the blind commitment strategy).
It is assumed that there are also rules to satisfy the goals X and Y, which are supposed to add B(X) and B(Y) to the set of beliefs. However, with our model, X and Y are removed from the set of goals remaining to be solved by the agent after the resolution of A. This cannot be expressed in GOAL because conditional actions cannot be combined, for instance to be sequentialised. More generally, the hierarchical structure of our model allows a progressive specification of the agent behaviour, which is more difficult with GOAL. Moreover, more elements can be proven with our model than with GOAL; for example, relations between goals like Iter cannot be proven with GOAL. Last but not least, our model allows proofs to be performed using first-order logic, which is not the case with GOAL. Our decomposition operators can also be compared to the Quality Accumulation Functions (QAFs) proposed in TAEMS [18]. TAEMS is a modelling language for describing the activities of agents operating in environments where responses by specific deadlines may be required. That is why TAEMS represents agent activities in terms of a task structure, i.e., a graph where tasks can be decomposed into subtasks using QAFs. A QAF specifies how the quality of a task can be computed from the qualities of its subtasks; the quality of a task evaluates the level of completion of the task. QAFs can be seen as a quantitative version of our logical decomposition operators.
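Returning to the GOAL translation of SeqAnd given above, the following Python sketch encodes the three conditional actions as (condition, action) pairs. The stub rules achieving X and Y, and the extra "not already a goal" guards (added so the loop terminates), are assumptions of this sketch rather than part of GOAL.

beliefs, goals = set(), {"A"}

rules = [
    # G(A) ∧ ¬B(X) → do(adopt(X))
    (lambda: "A" in goals and "X" not in beliefs and "X" not in goals,
     lambda: goals.add("X")),
    # G(A) ∧ B(X) ∧ ¬B(Y) → do(adopt(Y))
    (lambda: "A" in goals and "X" in beliefs and "Y" not in beliefs
             and "Y" not in goals,
     lambda: goals.add("Y")),
    # G(A) ∧ B(X) ∧ B(Y) → do(ins(A)); blind commitment then drops the goal
    (lambda: "A" in goals and "X" in beliefs and "Y" in beliefs,
     lambda: (beliefs.add("A"), goals.discard("A"))),
    # Stub rules assumed to achieve X and Y:
    (lambda: "X" in goals and "X" not in beliefs, lambda: beliefs.add("X")),
    (lambda: "Y" in goals and "Y" not in beliefs, lambda: beliefs.add("Y")),
]

def step():
    # One 'think' step: fire the first applicable conditional action.
    for condition, action in rules:
        if condition():
            action()
            return True
    return False

while step():
    pass
assert "A" in beliefs and "A" not in goals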
4 The GDT Design Process
A Goal Decomposition Tree (GDT) specifies how each goal can be achieved by an agent. More precisely, the root node of the tree is associated with the main goal of the agent, i.e., the one which is assigned to the agent by the agentification method used ([14], [20]). If this goal is achieved by the agent, the agent is considered to be satisfied from the multiagent system point of view. The tree describes how this goal can be decomposed in order to be achieved using a solution as well adapted to the agent's context as possible. Notice that the overall tree can be seen as a collection of local plans, each allowing a goal to be achieved: a local plan corresponds to a Local Decomposition Tree associated with a subgoal. The main difference with the plans used in [1] is that, in a GDT, they are organised hierarchically. A GDT is very close to the task graph used in TAEMS [18]. This graph describes relationships between the tasks agents may have to achieve. A graph, instead of a tree, is needed because relationships between goals ("disables", "facilitates", ...), different from decomposition ones, can be expressed. These relationships often involve tasks which are executed by different
agents. That is why they cannot be expressed in a GDT but are taken into account at the system level (which is not presented here). The building process of a GDT consists of four steps. In the first step, a tree must be built by the designer, starting from the main goal of the agent, using a top-down process. This first step introduces subgoals with their satisfaction conditions, elementary goals with their associated actions, and decomposition operators. The designer must also decide for each goal whether it is lazy or not. During this step, the designer must also define the invariants associated with the tree. A GDT must be designed so as to ensure that the execution of goals is always consistent with all the invariants; this characteristic is verified during the proof process. Invariants specify properties of the system inside which the agents are executed. For example, for a prey/predator system, invariants must specify that only one agent can be on a grid cell at a time. In order to make the building process of the tree easier, we are currently defining what can be seen as design patterns, i.e., rules which can be used to choose the right operator in particular contexts. For example, one rule is focused on the problem of interdependency between goals. When this property exists between two goals A and B, it means that the satisfaction of A has an influence on the satisfaction of B. When detected, this property can help to guide the choice of the decomposition operator. For example, let us suppose that a goal G can be satisfied if two subgoals B and C are satisfied. The And operator may be used to model this hypothesis. But if another hypothesis indicates that the satisfaction of B can prevent the satisfaction of C, the And operator can no longer be used and must be replaced by the SeqAnd operator. In the second step of the GDT design process, the designer must decide for each elementary goal whether it is necessarily satisfiable or not. In the third step, the type of each intermediate goal, as far as satisfiability is concerned, is computed using the goal-type composition semantics of each decomposition operator used. Unlike the first step, this step is a bottom-up process. During this process, inconsistencies or potential simplifications can be detected; in that case, the first step must be executed again in order to modify the tree. Figure 7 shows such a simplification for the SeqOr operator which can be detected during this step.
Fig. 7. Tree simplification with SeqOr
The first tree can be replaced by the second one because if the first subgoal of a SeqOr is necessarily satisfiable, the second subgoal will never be executed. Once the first three steps have been achieved, a proof of the tree can be built in a fourth step. The process used to achieve this proof is described in [6]. Again, this step can lead to the detection of inconsistencies in the tree, based on proof failures. In a last step, the validated tree is used to build the behaviour automaton of the agent, which can then be implemented; this process is described in [15]. As explained before, the building of the tree also leads to the definition of the variables and actions of the agent, which are essential parts of an agent model. As a consequence, the GDT and the associated design process can be seen as a design tool for a validated agent model in the context of MAS design.
5 Comparison with Other Models and Methods
5.1 Comparison with Other Goal-Oriented Agent Models
Table 1 compares our agent model with several others from a goal-oriented point of view: Winikoff et al.'s model [19], AgentSpeak [13] and GOAL [4]. In the Goal expression column, it is specified whether a formal satisfaction condition and a formal failure condition are expressed for each goal in the model. For models having only a procedural point of view, like AgentSpeak, there is no formal expression of goals. Only Winikoff et al.'s model explicitly gives a formal failure condition, making a distinction between a plan failure and a goal failure. Among the goal management hypotheses, we distinguish the five characteristics described in [19, 12]. The Drop Successful attitude (DS) consists in dropping a goal that is satisfied. The Drop Impossible attitude (DI) consists in dropping a goal that becomes impossible to satisfy. A goal is persistent (PG) if it is dropped only if it has been achieved or if it becomes impossible to achieve.

Table 1. Comparison with goal-oriented agent models

Model           | Goal expression       | Goal management hypotheses | Action kinds        | Plan language (operators)
Winikoff et al. | satisfaction, failure | DS, DI, CG, PG, KG         | GA, BI, BD, BAN     | sequencing, parallelism, conditional selection
AgentSpeak      | none                  | none                       | GA, BI, BD, BAN     | and, conditional selection (context)
Goal            | satisfaction          | DS                         | GA, GD, BI, BD, BAS | only atomic conditional actions
GDT             | satisfaction          | DS                         | GA, GD, BI, BD, BAS | many, derived from the GDT
The other characteristics correspond to constraints on the goal management process. The Consistent Goals property (CG) is satisfied if the set of goals the agent has to achieve must be consistent (if a is a goal, not(a) cannot be a goal). Finally, the Known Goals (KG) property specifies that the agent must know the set of all its goals. The DS property implicitly holds in our model: it is a direct consequence of the execution schema of a GDT specified by the solve and satisfy functions described previously. DI does not hold because, as there is no failure condition associated with goals, there is no way to decide whether a goal has become impossible to achieve. PG does not hold because DI does not hold. Last but not least, the CG and KG properties, which are constraints on model usage, do not hold in our model because the model does not need them to be verified in order to be used. In the Action kinds column, the kinds of actions the language provides are specified. These actions can be divided into three types: actions concerning goal management (goal dropping GD, goal adoption GA), actions concerning belief management (belief insertion BI, belief deletion BD), and all other actions, which we call basic actions. These actions may be specified in the language (BAS) or only named (BAN). BAS are essential to allow a proof process. Finally, in the last column, we enumerate the goal decomposition operators provided by the language (for the model described in this paper, see Section 3). The plan language of GOAL is rather a rule language, but for each rule, only one action may be executed: there is, for instance, no sequence operator. In a GDT, plans rely on goal decompositions, and as a consequence, the expressivity of our plan language is also the expressivity of our goal decomposition language.

5.2 Comparison with Goal-Oriented MAS Development Methods
Existing MAS development methodologies like Tropos [2], Prometheus [11], Kaos [3] or MaSE [16] include goal management. Like these methodologies, our approach is intended to be a start-to-end goal-oriented support for MAS design, with a specific property: being able to prove that the generated behaviour is correct with respect to system requirements. As previously explained, our approach is very close to MaSE from this point of view: they are both designed as "Formal Transformation Systems". All these methods lead to the definition of system goals which are then used as a basis for further design steps. But, as explained in [8], "beyond the initial phases of requirements, the goal-oriented focus of these methodologies dissipates and does not progress into the detailed design phase where agent internals are fleshed out". The work presented in this paper tries exactly to address this problem, because it is focused on agent internals from a goal-oriented point of view. However, Tropos and a new version of Prometheus presented in [8] try to keep goals throughout the design process. In the following, the previously cited methodologies are compared from a goal management point of view (for the Prometheus methodology, the comparison is based on [8]). The criteria used are the following:
– Goals level: goals can be defined at the system level (SL), at the agents level (AL) or at the agents' internal level (AIL). It is supposed that AIL implies AL and that AL implies SL.
– Goals description: goals can be specified formally (SF) or are just named (N).
– Goals role: goals are directly exploited to produce models during the design phase (PM) or are used to ensure that system requirements are taken into account (SR).
– Typology use: when a typology exists, it can be used to help to understand goals (UG), to help to determine goals (DG), or to strongly guide further design steps (GD).
– Goals decomposition: does the method include means to express relations between goals and subgoals?

Table 2. Comparison with Goal-Oriented MAS development methods
As far as goals decomposition is concerned, the term “typology” used for MaSE means that available decompositions are directly defined by the different types of goals.
6 Conclusion
In this article, we presented a goal-oriented behaviour model of an agent relying on a Goal Decomposition Tree. The goal notion is central in the development of an agent: it appears, for instance, in the desires concept of the BDI model, and is the basis of other methods such as GOAL or Moise ([4], [7]). Using a decomposition tree, the user can progressively specify the behaviour of an agent. Thus, goals and plans are closely linked: the decomposition tree of a goal is the plan associated with this goal. Some of the goal decomposition operators involve nondeterminism, which is necessary for autonomous agents. Of course, using our model, an agent designer must specify each goal by a satisfaction condition. This may seem difficult, but experience shows that inexperienced designers rapidly learn to write the right satisfaction condition. Moreover, this model can be verified using our proof method. The model can then be automatically translated into a behaviour automaton which is, as a consequence, also validated. This automaton can be implemented inside agents which can be developed using any MAS development platform in order to manage their life cycle, their communications, etc. However, the design and the proof processes are strongly disconnected. So, the
designer can develop the GDT without having to deal with the proof process. This model has been used to specify the behaviour of prey agents inside a prey/predator system [15]; the resulting GDT contains sixteen nodes. This system has also been verified using the produced GDT. As shown before, this model can be seen as a tool for agent design. That is why we are going to develop an interpreter which can directly simulate the behaviour of agents from their GDT. The idea is to obtain, as in TAEMS, a method for fast prototyping with validation in parallel.
References

1. R.H. Bordini, M. Fisher, W. Visser, and M. Wooldridge. Verifiable multi-agent programs. In M. Dastani, J. Dix, and A. El Fallah Seghrouchni, editors, ProMAS, 2003.
2. P. Bresciani, A. Perini, P. Giorgini, F. Giunchiglia, and J. Mylopoulos. Tropos: an agent-oriented software development methodology. Autonomous Agents and Multi-Agent Systems, 8:203–236, 2004.
3. A. Dardenne, A. van Lamsweerde, and S. Fickas. Goal-directed requirements acquisition. Science of Computer Programming, 20:3–50, 1993.
4. F.S. de Boer, K.V. Hindriks, W. van der Hoek, and J.-J.Ch. Meyer. Agent programming with declarative goals. In Proceedings of the 7th International Workshop on Intelligent Agents VII, Agent Theories, Architectures and Languages, pages 228–243, 2000.
5. M. Flouret, B. Mermet, and G. Simon. Vers une méthodologie de développement de SMA adaptés aux problèmes d'optimisation. In Systèmes multi-agents et systèmes complexes : ingénierie, résolution de problèmes et simulation, JFIADSMA'02, pages 245–248. Hermes, 2002.
6. D. Fournier, B. Mermet, and G. Simon. A compositional proof system for agent behaviour. In M. Barley, F. Massacci, H. Mouratidis, and A. Unruh, editors, Proceedings of SASEMAS'05, pages 22–26, 2005.
7. J.F. Hubner, J.S. Sichman, and O. Boissier. Spécification structurelle, fonctionnelle et déontique d'organisations dans les SMA. In Journées Francophones Intelligence Artificielle et Systèmes Multi-Agents (JFIADSMA'02). Hermes, 2002.
8. J. Khallouf and M. Winikoff. Towards goal-oriented design of agent systems. In Proceedings of ISEAT'05, 2005.
9. L. Lamport. The temporal logic of actions. ACM Transactions on Programming Languages and Systems, 1994.
10. B. Mermet, G. Simon, D. Fournier, and M. Flouret. SPACE: a method to increase traceability in MAS development. In Programming Multi-Agent Systems, volume 3067 of LNAI, 2004.
11. L. Padgham and M. Winikoff. Developing Intelligent Agent Systems: A Practical Guide. John Wiley and Sons, 2004.
12. A.S. Rao and M.P. Georgeff. An abstract architecture for rational agents. In Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning, pages 439–449. Morgan Kaufmann, San Mateo, CA, 1992.
13. A.S. Rao. AgentSpeak(L): BDI agents speak out in a logical computable language. In W. Van de Velde and J. Perram, editors, MAAMAW'96, volume 1038 of LNAI, Eindhoven, The Netherlands, 1996.
14. G. Simon, M. Flouret, and B. Mermet. A methodology to solve optimisation problems with MAS, application to the graph coloring problem. In Donia R. Scott, editor, Artificial Intelligence: Methodology, Systems, Applications, volume 2443 of LNAI, 2002.
15. G. Simon, B. Mermet, D. Fournier, and M. Flouret. The provable goal decomposition tree: a behaviour model of an agent. Technical report, Laboratoire Informatique du Havre, 2005.
16. Clint H. Sparkman, Scott A. DeLoach, and Athie L. Self. Automated derivation of complex agent architectures from analysis specifications. In Proceedings of AOSE'01, 2001.
17. M.B. van Riemsdijk, M. Dastani, F. Dignum, and J.-J.Ch. Meyer. Dynamics of declarative goals in agent programming. In Proceedings of Declarative Agent Languages and Technologies (DALT'04), 2004.
18. R. Vincent, B. Horling, and V. Lesser. An agent infrastructure to build and evaluate multi-agent systems: the Java Agent Framework and Multi-Agent System Simulator. In Infrastructure for Agents, Multi-Agent Systems, and Scalable Multi-Agent Systems, 2001.
19. M. Winikoff, L. Padgham, J. Harland, and J. Thangarajah. Declarative & procedural goals in intelligent agent systems. In Proceedings of the Eighth International Conference on Principles of Knowledge Representation and Reasoning (KR2002), 2002.
20. M. Wooldridge, N.R. Jennings, and D. Kinny. The Gaia methodology for agent-oriented analysis and design. Journal of Autonomous Agents and Multi-Agent Systems, 3(3):285–312, 2000.
Resource-Bounded Belief Revision and Contraction

Natasha Alechina, Mark Jago, and Brian Logan

School of Computer Science, University of Nottingham, Nottingham, UK
{nza, mtw, bsl}@cs.nott.ac.uk
Abstract. Agents need to be able to change their beliefs; in particular, they should be able to contract or remove a certain belief in order to restore consistency to their set of beliefs, and revise their beliefs by incorporating a new belief which may be inconsistent with their previous beliefs. An influential theory of belief change proposed by Alchourrón, Gärdenfors and Makinson (AGM) [1] describes postulates which rational belief revision and contraction operations should satisfy. The AGM postulates are usually taken as characterising idealised rational reasoners, and the corresponding belief change operations are considered unsuitable for implementable agents due to their high computational cost [2]. The main result of this paper is to show that an efficient (linear time) belief contraction operation nevertheless satisfies all but one of the AGM postulates for contraction. This contraction operation is defined for an implementable rule-based agent which can be seen as a reasoner in a very weak logic; although the agent's beliefs are deductively closed with respect to this logic, checking consistency and tracing dependencies between beliefs is not computationally expensive. Finally, we give a non-standard definition of belief revision in terms of contraction for our agent.
1 Introduction

Two main approaches to belief revision have been proposed in the literature: AGM (Alchourrón, Gärdenfors and Makinson) style belief revision as characterised by the AGM postulates [1], and reason-maintenance style belief revision [3]. AGM style belief revision is based on the ideas of coherence and informational economy. It requires that the changes to the agent's belief state caused by a revision be as small as possible. In particular, if the agent has to give up a belief in A, it does not have to give up believing in things for which A was the sole justification, so long as they are consistent with the remaining beliefs. Classical AGM style belief revision describes an idealised reasoner, with a potentially infinite set of beliefs closed under logical consequence. Reason-maintenance style belief revision, on the other hand, is concerned with tracking dependencies between beliefs. Each belief has a set of justifications, and the reasons for holding a belief can be traced back through these justifications to a set of foundational beliefs. When a belief must be given up, sufficient foundational beliefs have to be withdrawn to render the belief underivable. Moreover, if all the justifications for a belief are withdrawn, then that belief itself should no longer be held. Most implementations of reason-maintenance style belief revision are incomplete in the logical sense,
but tractable. A more detailed comparison of the two approaches can be found in, for example, [2]. In this paper, we present an approach to belief revision and contraction for resource-bounded agents which is a synthesis of AGM and reason-maintenance style belief revision. We consider a simple agent consisting of a finite state and a finite agent program which executes in at most polynomial time. The agent's state contains literals representing the beliefs of the agent, and the agent's program consists of rules which are used to derive new beliefs from its existing beliefs. When the agent discovers an inconsistency in its beliefs, it removes sufficient beliefs (literals) to restore consistency. Our algorithm for belief contraction is similar to the algorithms used for propagating dependencies in reason-maintenance systems (e.g., [4]), but we show that our approach satisfies all but one of the basic AGM postulates for contraction (the recovery postulate is not satisfied). The belief revision and contraction operations which we define are comparable in space and time complexity to the usual overhead of computing the conflict set and firing rules in a rule-based agent. The basic contraction algorithm runs in time O(kr + n), where n is the number of literals in the working memory, r is the number of rules and k is the maximal number of premises in a rule. We show how our algorithm can be adapted to remove the agent's least entrenched beliefs when restoring consistency. Recomputing the entrenchment order of beliefs also has sub-quadratic complexity. Finally, we investigate defining belief revision in terms of our contraction operator, and show that using the Levi identity does not lead to the best result. We propose an alternative definition, and show that the resulting operation satisfies all but one of the basic AGM postulates for revision. The paper is organised as follows. In Section 2, we introduce AGM belief revision. In Section 3, we describe rule-based resource-bounded reasoners. In Section 4, a contraction algorithm for these reasoners is defined and shown to run in linear time. The main result of the paper is in Section 5, where we define the logic under which the beliefs of our reasoners are closed, and show that the basic postulates for contraction, apart from recovery, hold for the contraction operation we defined. In Section 6, we show how to extend the algorithm to contract by a least preferred set of beliefs, using a preference order on the set of beliefs. In Section 7 we present a definition of revision in terms of our contraction operation, and in Section 8 we discuss related work.
2 AGM Belief Revision

The theory of belief revision as developed by Alchourrón, Gärdenfors and Makinson in [5, 1, 6] models belief change of an idealised rational reasoner. The reasoner's beliefs are represented by a potentially infinite set of formulas K closed under logical consequence, i.e., K = Cn(K), where Cn denotes closure under logical consequence. When new information becomes available, the reasoner must modify its belief set K to incorporate it. The AGM theory defines three operators on belief sets: expansion, contraction and revision. Expansion, denoted K + A, simply adds a new belief A to K and closes the resulting set under logical consequence: K + A = Cn(K ∪ {A}). Contraction, denoted by K ∸ A, removes a belief A from the belief set and modifies K so that it no longer entails A. Revision, denoted K ∔ A, is the same as expansion if
A is consistent with the current belief set; otherwise it minimally modifies K to make it consistent with A, before adding A. Contraction and revision cannot be defined uniquely, since in general there is no unique maximal set K′ ⊂ K which does not imply A. Instead, the set of ‘rational’ contraction and revision operators is characterised by the AGM postulates [1]. The basic AGM postulates for contraction are:

(K∸1) K ∸ A = Cn(K ∸ A) (closure)
(K∸2) K ∸ A ⊆ K (inclusion)
(K∸3) If A ∉ K, then K ∸ A = K (vacuity)
(K∸4) If ⊬ A, then A ∉ K ∸ A (success)
(K∸5) If A ∈ K, then K ⊆ (K ∸ A) + A (recovery)
(K∸6) If Cn(A) = Cn(B), then K ∸ A = K ∸ B (equivalence)
The basic postulates for revision are:

(K∔1) K ∔ A = Cn(K ∔ A)
(K∔2) A ∈ K ∔ A
(K∔3) K ∔ A ⊆ K + A
(K∔4) If {A} ∪ K is consistent, then K + A = K ∔ A¹
(K∔5) K ∔ A is inconsistent if, and only if, A is inconsistent
(K∔6) If Cn(A) = Cn(B), then K ∔ A = K ∔ B
The AGM theory elegantly characterises rational belief revision for an ideal reasoner. However, it has been argued that the definition of the expansion, contraction and revision operators on belief sets, and the resulting assumption of logical omniscience, mean that the theory cannot be applied to resource-bounded reasoners. For example, Doyle [2] states: ‘. . . to obtain a practical approach to belief revision, we must give up both logical closure and the consistency and dependency requirements of the AGM approach’ (p. 42). In the next section, we present a reasoner which has bounded memory and implements a polynomial (sub-quadratic) algorithm for belief contraction. In subsequent sections we show that it nevertheless satisfies the AGM postulates for rational belief contraction apart from (K∸5). We achieve this by weakening the language and the logic of the reasoner so that it corresponds to that of a typical rule-based agent. We assume that the agent has only atomic and negative atomic beliefs which are subject to revision (in agent programming languages such as AgentSpeak [7], beliefs are normally assumed to be literals) and, in addition, a set of beliefs in the form of implications (similar to the rules of a rule-based agent or AgentSpeak plans) which constitute the agent's program and are not subject to revision. The only inference rule the agent can apply is modus ponens. The set of all consequences which are derivable from the agent's rules and literal beliefs using modus ponens is exactly the set of new facts which a rule-based agent will assert after firing its rules to quiescence.
¹ We replaced ‘¬A ∉ K’ with ‘{A} ∪ K is consistent’ here, since the two formulations are classically equivalent.
3 Resource-Bounded Agents

We consider a simple resource-bounded agent consisting of a finite state and a finite agent program. The agent's state or working memory (WM) contains literals (propositional variables or their negations) representing the beliefs of the agent. The agent's program consists of a set of propositional rules of the form A1, . . . , An → B, where A1, . . . , An, B are literals. The agent repeatedly executes a sense–think–act cycle. At each cycle, the agent adds any observations (including communications from other agents) to its existing beliefs in WM and then fires its rules on the contents of WM. We distinguish two models of rule application by the agent. In the simplest case, which we call the quiescent setting for belief revision, the agent fires all rules that match until no new rule instances can be generated. In the quiescent setting, WM is closed under the agent's rules: all literals which can be obtained by the repeated application of rules to literals in WM are in WM. Note that firing the agent's rules to quiescence takes at most polynomial time. An example of a rule-based system which fires rules to quiescence is SOAR [8]. The other, perhaps more interesting, model of rule application is what we call the non-quiescent setting for belief revision. In the non-quiescent setting, the agent fires a subset of the rules that match at any given ‘think’ cycle, and we look at revising the agent's beliefs after the application of one or more rules but before all the rule instances have been fired. This setting is natural when considering many rule-based systems, such as CLIPS [9], which fire one rule instance at a time. Periodically, e.g., at the end of each cycle or after each rule firing, the agent checks to see if its beliefs are consistent. If A is a literal, we denote by A⁻ the literal of the opposite sign, that is, if A is an atom p, then A⁻ is ¬p, and if A is a negated atom ¬p, then A⁻ is p. We say that WM is inconsistent iff for some literal A, both A and A⁻ are in WM. For each pair {A, A⁻} ⊆ WM, the agent restores consistency by contracting by one element of each pair. Note that we only consider contraction by literals: rules are part of the agent's program and are not revised.
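A minimal Python sketch (not from the paper) of the quiescent setting: rules are fired until no new literal can be derived, and the working memory is then checked for contradictory pairs. The string encoding of literals with a '~' prefix is an assumption of this sketch.

def negate(lit: str) -> str:
    # A and A-: '~p' is the negation of the atom 'p'.
    return lit[1:] if lit.startswith("~") else "~" + lit

def run_to_quiescence(wm, rules):
    # Fire all rules that match until no new rule instance fires;
    # this takes at most polynomial time.
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in wm and all(p in wm for p in premises):
                wm.add(conclusion)
                changed = True
    return wm

def inconsistent(wm):
    # WM is inconsistent iff both A and A- are in WM for some literal A.
    return any(negate(lit) in wm for lit in wm)

wm = run_to_quiescence({"p", "~q"}, [(["p", "~q"], "r"), (["r"], "q")])
print(sorted(wm), inconsistent(wm))   # ['p', 'q', 'r', '~q'] True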
4 Contraction

We define resource-bounded contraction by a literal A as the removal of A and sufficient literals from WM so that A is no longer derivable using the rules which constitute the agent's program. In this section, we present a simple algorithm for resource-bounded contraction and show that it runs in time linear in kr + n, where r is the number of rules in the agent's program, k is the maximal number of premises in a rule, and n is the number of literals in the working memory. We assume that WM consists of a list of cells. Each cell holds a literal and its associated dependency information in the form of two lists, a dependencies list and a justifications list.² Both lists contain pointers to justifications, which correspond to fired
² In the interests of brevity, we will refer to the cell containing the literal A as simply A when there is no possibility of confusion.
rule instances; each justification records the derived literal, the premises of the rule, and (for efficiency's sake) back-pointers to the dependencies and justifications lists which reference it. We will denote a justification as (A, [B C]) or (A, s), where A is the derived literal and [B C] or s is the list of premises of the rule (or support list). Each support list s has a distinguished position w(s) which contains the ‘weakest’ member of s. Later we will show how to give a concrete interpretation to the notion of ‘weakness’ in terms of preferences; for now, we assume that w(s) is the first position in s, or is chosen randomly. The dependencies list of A contains the justifications for A. For example, the dependencies list [(A, [B C]) (A, [D])] means that A can be derived from B and C (together) and from D (on its own). In the quiescent setting, the dependencies list of A corresponds to all rules which have A in the consequent and whose premises are actually in working memory. In the non-quiescent setting, the dependencies list corresponds only to the rules which have been fired so far. If A is an observation, or was present in WM when the agent started, its dependencies list contains a justification, (A, [ ]), with an empty support. The justifications list of A contains all the justifications where A is a member of a support. For example, if the dependencies list of C contains a justification (C, [A B]), then A's justifications list contains the justification (C, [A B]). We need both lists to guarantee constant-time access from a given belief both to the beliefs it follows from, and to the beliefs which it was used to derive. The dependencies and justifications lists are updated whenever a rule is fired. For example, when firing the rule E, F → B, we check to see if B is in working memory and, if not, add a new cell to WM containing the literal B. We also add the justification (B, [E F]) to the dependencies list for B and to the justifications lists of E and F.

Algorithm 1. Contraction by A
for all j = (C, s) in A's justifications list do
  remove j from C's dependencies list
  remove j from the justifications list of each literal in s
end for
for all j = (A, s) in A's dependencies list do
  if s == [] then
    remove j
  else
    contract by the literal w(s)
  end if
end for
Finally, delete the cell containing A
The algorithm for contraction (see Algorithm 1) is very simple, and consists of two main loops. The first loop removes all references to the justifications in A’s justifications list. We assume that removing a reference to a justification is a constant time operation, due to the use of back-pointers. If a justification corresponds to a rule with k premises, there are k + 1 pointers to it: one from the dependencies list of the derived literal, and k from the justifications lists of the premises. The second loop traverses A’s dependencies
list, and for each justification there, either removes it (if it has an empty support), or recurses to contract by the weakest member of the justification's support list, w(s). The total number of steps the algorithm performs in the two loops is proportional to the total length of all the dependencies and justifications lists involved. The maximal number of justifications with non-empty supports is r, where r is the number of rules. The number of references to each justification with a non-empty support is k + 1, where k is the maximal number of premises in a rule. So the maximal number of steps is r × (k + 1) for justifications with non-empty supports (assuming that each support can be updated in constant time), plus at most n for the justifications with empty supports, where n is the number of literals in WM. The last step in the contraction algorithm (removing a cell) is executed at most n times, and we assume that access to a cell given its contents takes constant time. The total running time is therefore O(kr + n). A sketch of the basic algorithm is given below.

4.1 Reason-Maintenance Type Contraction

The algorithm above can be modified to perform reason-maintenance type contraction. Reason-maintenance contraction by A involves removing not just those justifications whose supports contain A, but also all beliefs which have no justifications whose supports do not contain A. In this case, in addition to removing the justifications in A's justifications list from other literals' dependencies lists, we check whether this leaves the dependencies list of the justified literal empty. If so, we remove the justified literal and recurse forwards, following links in its justifications list. This adds another traversal of the dependencies graph, but the overall complexity remains O(kr + n).
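A Python sketch of Algorithm 1, referenced above. The dictionary-based layout of the dependencies and justifications lists is an illustrative assumption; the paper's cell-and-pointer structure is what gives the constant-time removals.

deps = {}    # literal -> list of justifications (derived, support) for it
justs = {}   # literal -> list of justifications whose support contains it

def add_justification(derived, support):
    j = (derived, tuple(support))
    deps.setdefault(derived, []).append(j)
    for lit in support:
        justs.setdefault(lit, []).append(j)

def contract(a):
    if a not in deps:
        return
    # First loop: remove every justification that a supports.
    for j in justs.pop(a, []):
        derived, support = j
        if j in deps.get(derived, []):
            deps[derived].remove(j)
        for lit in support:
            if lit != a and j in justs.get(lit, []):
                justs[lit].remove(j)
    # Second loop: for each justification for a with a non-empty support,
    # recurse on the weakest member w(s) (here simply the first member).
    for derived, support in deps.pop(a):
        if support:
            contract(support[0])
    # Finally, the cell containing a has been deleted (popped from deps).

add_justification("a", [])       # a is an observation (empty support)
add_justification("b", ["a"])    # the rule a -> b has been fired
contract("b")                    # removes b and, via w(s), a as well
assert "a" not in deps and "b" not in deps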
5 The Agent's Logic and AGM Postulates

In this section, we present a weak logic, W, and show that our rule-based agent can be seen as a fully omniscient reasoner in W: that is, its belief set is closed with respect to derivability in W. Consider a propositional language LW where well-formed formulas are either (1) literals, or (2) formulas of the form A1 ∧ . . . ∧ An → B, where A1, . . . , An, B are literals. Note that there is a clear correspondence between an agent's rules and the second kind of formula. We will refer to the implication corresponding to a rule R as R, where this cannot cause confusion. The logic W in the language LW contains a single inference rule, generalised modus ponens:

  A1, . . . , An,   A1 ∧ . . . ∧ An → B
  ─────────────────────────────────────
                   B

The notion of derivability in the logic is standard, and is denoted by ⊢W. The corresponding consequence operator is denoted by CW. W is obviously much weaker than classical logic. In particular, the principle of excluded middle does not hold, so A → B and A⁻ → B do not imply B. For any finite set Γ of implications and literals, CW(Γ) is finite. It contains exactly the same implications and literals as Γ, plus possibly some additional literals derived by generalised modus ponens. All such additional literals occur as consequents (right-hand sides) of the implications in Γ.
Let WM be the set of literals in working memory, and R the set of the agent's rules.

Proposition 1. For any literal A, WM ∪ R ⊢W A iff A ∈ WM after running R to quiescence.

The proposition above means that the set comprising the agent's beliefs is closed under consequence if the agent runs all its rules to quiescence: WM ∪ R = CW(WM ∪ R) after running R to quiescence. Somewhat surprisingly, an agent which does not run its rules to quiescence can also be seen as a totally rational and omniscient reasoner in W, provided we only include the rules which actually have been fired in its beliefs. Assume that a subset R′ of the agent's rules R is fired.

Proposition 2. Let R′ ⊆ R; then for any literal A, WM ∪ R′ ⊢W A iff A ∈ WM after firing the rules R′.

In other words, in the non-quiescent setting, WM ∪ R′ = CW(WM ∪ R′) where R′ is the set of rules fired. By the belief state K of the agent we will mean the set of literals in its working memory and the set of rules which have been fired:
K =df CW(WM ∪ R′)    (1)
By K ∸ A we will denote the result of applying our contraction by A algorithm to K. Now we can show that the AGM belief postulates are satisfied for our agent.

Proposition 3. ∸ satisfies (K∸1)–(K∸4) and (K∸6).

Proof. Given that K is closed under consequence, and contraction removes literals and recursively destroys rule instances used to derive them, no new rule instances can be generated as a consequence of contraction. So K ∸ A is still closed under consequence and (K∸1) holds. (K∸2) holds because ∸ deletes literals from the working memory without adding any; (K∸3) is satisfied for the same reason. (K∸4) states that after a contraction by A, A is no longer in the working memory. Since the contraction algorithm removes A as its last step, A is indeed not in the working memory after the algorithm is executed. (K∸6) is trivially valid, since for any literal A, CW(A) = {A}.

Proposition 4. ∸ does not satisfy (K∸5).

Proof. Suppose we have a single rule R = A → B and WM = {A, B}. After contraction by B, WM is empty. When we expand by B, WM contains only B.

The recovery postulate is the most controversial of the AGM postulates [10], and many contraction operations defined in the literature do not satisfy it. We can satisfy (K∸5) in our setting if we are prepared to re-define the expansion operator. We simply save the current state of working memory before a contraction, and restore the previous state of WM if we have a contraction followed by an expansion by the same literal. More precisely, to expand by a literal A, we first check if the previous operation was contraction by A, and if so we restore the previous state of working memory. Otherwise we add A to the contents of WM and run (a subset of) the agent's rules. This requires O(n) additional space, where n is the size of the working memory.
6 Preferred Contractions

So far we have assumed that the choice of literals to be removed during contraction is arbitrary. However, in general, an agent will prefer some contractions to others. For example, an agent may prefer to remove the smallest set of beliefs necessary to restore consistency, or to remove those beliefs which are least strongly believed. The problem of computing a minimal set of beliefs which, if deleted, would restore consistency is exponential in the size of working memory, and approaches based on this type of ‘minimal’ contraction and revision do not sit comfortably within our resource-bounded framework. In this section we focus instead on contractions based on preference orders over individual beliefs, e.g., degree of belief or commitment to beliefs. We show that computing the most preferred contraction can be performed in time linear in kr + n. We distinguish independent beliefs: beliefs which have at least one non-inferential justification (i.e., a justification with an empty support list), such as observations and the literals in working memory when the agent starts. We assume that an agent associates an a priori quality with each non-inferential justification for its independent beliefs. For example, the quality assigned to communicated information may depend on the degree of reliability the recipient attaches to the speaker; percepts may be assumed to have higher quality than communicated information, and so on. For simplicity, we assume that the quality of a justification is represented by non-negative integers in the range 0, . . . , m, where m is the maximum size of working memory. A value of 0 means lowest quality and m means highest quality. We take the preference of a literal A, p(A), to be that of its highest quality justification: p(A) = max{qual(j0), . . . , qual(jn)}, where j0, . . . , jn are all the justifications for A, and define the quality of an inferential justification to be that of the least preferred belief in its support:³ qual(j) = min{p(A) : A ∈ support of j}. This is similar to ideas in argumentation theory: an argument is only as good as its weakest link, yet a conclusion is at least as good as the best argument for it. This approach is also related to Williams' ‘partial entrenchment ranking’ [11], which assumes that the entrenchment of any sentence is the maximal quality of a set of sentences implying it, where the quality of a set is equal to the minimal entrenchment of its members. To perform a preferred contraction, we preface the contraction algorithm given above with a step which computes the preference of each literal in WM and, for each justification, finds the position of a least preferred member of its support (see Algorithm 2). We compute the preference of each literal in WM in stages, starting with the most preferred independent beliefs. Note that unless WM is empty, it always contains at least one literal with a justification whose support is empty (otherwise nothing could be used to derive other literals), and at least one of those independent literals is going to have
For simplicity, in what follows we assume reason-maintenance style contraction. To compute preferences for coherence-style contraction we can assume that literals with no supports (as opposed to an empty support) are viewed as having an empty support of lowest quality.
Resource-Bounded Belief Revision and Contraction
149
the maximal preference value of literals in WM even when all other preferences are computed (since a derived literal cannot have a higher preference than all of the literals in justifications for it). Assume we have a list ind of justifications with an empty support list, (A,[],q), where q is the a priori quality of the justification. We associate a counter c(j) with every justification j = (A, s). Initially c(j) is set to be the length of s. When c(j) is 0, the preferences of all literals in s have been set. Algorithm 2. Preference computation order ind in descending order by q while there exists j = (A,[],q) in ind with A unmarked do take first unmarked j = (A,[],q) in ind mark A p(A) = q propagate(A, q) end while procedure PROPAGATE (A, q) for all j = (B, s) in A’s justifications list do decrement c(j) end for if c(j) == 0 then qual(j) = q w(s) = A’s position in s if B is unmarked then mark B p(B) = q propagate(B, q) end if end if end procedure
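As a concrete reading of Algorithm 2, the following Python sketch mirrors its bookkeeping. The representation of justifications as (literal, support, quality) triples and the use of dictionaries for p, qual, c and w are our own illustrative choices; the control flow follows the algorithm above.

    # Sketch of Algorithm 2 (preference computation); data layout is illustrative.
    def compute_preferences(justifications):
        # justifications: (literal, support, quality) triples; quality is the
        # a priori quality for empty supports, None for inferential ones.
        p, qual, weakest = {}, {}, {}               # p(A), qual(j), w(s)
        count = {id(j): len(j[1]) for j in justifications}   # c(j)
        uses = {}                                   # literal -> justifications using it
        for j in justifications:
            for a in j[1]:
                uses.setdefault(a, []).append(j)

        def propagate(a, q):
            for j in uses.get(a, []):
                count[id(j)] -= 1
                if count[id(j)] == 0:               # all support preferences known
                    qual[id(j)] = q                 # a is the least preferred member
                    weakest[id(j)] = j[1].index(a)  # w(s): position of weakest member
                    b = j[0]
                    if b not in p:                  # b is still unmarked
                        p[b] = q
                        propagate(b, q)

        ind = [j for j in justifications if j[1] == []]
        ind.sort(key=lambda j: j[2], reverse=True)  # descending a priori quality
        for j in ind:
            if j[0] not in p:                       # take first unmarked entry
                p[j[0]] = j[2]
                propagate(j[0], j[2])
        return p, qual, weakest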
We then simply run the contraction algorithm, recursively deleting the weakest member of each support in the dependency graph of A. We define the worth of a set of literals Γ as worth(Γ) = max{p(A) : A ∈ Γ}. We can prove that our contraction algorithm removes a set of literals with the least worth.

Proposition 5. If WM was contracted by A and this resulted in removal of the set of literals Γ, then for any other set of literals Γ′ such that WM − Γ′ does not imply A, worth(Γ) ≤ worth(Γ′).

Proof. If A ∉ WM, the statement is immediate, since Γ = ∅. Assume that A ∈ WM. In this case, A ∈ Γ and A ∈ Γ′ (otherwise WM − Γ and WM − Γ′ would still derive A). It is also easy to see that A is a maximal element of Γ: a literal B is in Γ only if either (1) B is a least preferred member of the support of some justification ji for A, in which case p(B) = qual(ji) and, since p(A) = max(qual(j0), . . . , qual(jn)), p(B) ≤ p(A); or (2) B is a least preferred element of a support set of a justification for some other literal in Γ, in which case its preference is less than or equal to the preference of the literal it helps to justify, which in turn (by the same argument) is less than or equal to p(A). So, since A is an element of both Γ and Γ′, and A has the maximal preference in Γ, worth(Γ) ≤ worth(Γ′).
Computing preferred contractions involves only modest computational overhead. The ordering of ind takes O(n log n) steps; ind is traversed once, which is O(n); PROPAGATE traverses each justifications list once, which is O(kr) (setting the w(s) index in each support can be done in constant time, assuming that the justifications list of each literal A actually contains pairs consisting of a justification and the index of A's position in the support list of the justification). The total cost of computing the preference of all literals in WM is therefore O(n log n + kr). As the contraction algorithm is unchanged, this is also the additional cost of computing a preferred contraction.
7 Revision

In the previous sections we described contraction. Now let us consider revision, which is adding a new belief in a manner which does not result in an inconsistent set of beliefs. Recall that a set of beliefs K is inconsistent if for some literal A, both A and A− are in K. For simplicity, we will consider revision in a quiescent setting only.

If the agent is a reasoner in classical logic, revision is definable in terms of contraction and vice versa. Given a contraction operator −̇ which satisfies postulates (K−̇1)–(K−̇4) and (K−̇6), a revision operator +̇ defined as K +̇ A =df (K −̇ ¬A) + A (Levi identity) satisfies postulates (K+̇1)–(K+̇6). Conversely, if a revision operator satisfies (K+̇1)–(K+̇6), then contraction defined as K −̇ A =df (K +̇ ¬A) ∩ K (Harper identity) satisfies postulates (K−̇1)–(K−̇6) (see [6]).

However, revision and contraction are not inter-definable in this way for an agent which is not a classical reasoner, in particular a reasoner in a logic for which it does not hold that K + A is consistent if, and only if, K ⊬ A−. If we apply the Levi identity to the contraction operation defined earlier, we will get a revision operation which does not satisfy the revision postulates. One of the reasons for this is that contracting the agent's belief set by A− does not make this set consistent with A, so (K −̇ A−) + A may be inconsistent.

Let us instead define revision by A as (K + A) −̇ ⊥ (expansion by A followed by elimination of all contradictions). However, even for this definition of revision, not all basic AGM postulates are satisfied.

Algorithm 3. Revision by A
  add A to WM
  run rules to quiescence
  while WM contains a pair (B, B−) do
    contract by the least preferred member of the pair
  end while
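In code, Algorithm 3 is an expand-then-repair loop. The sketch below reuses the illustrative Agent class from the earlier fragment (where contract is assumed to actually remove the literal and its dependants, as in the paper's algorithm) and assumes a preference function p on literals; the encoding of a literal's complement B− by a 'neg:' prefix is our own convention.

    # Sketch of Algorithm 3: revision by a literal.
    def complement(lit):                  # 'b' <-> 'neg:b' encodes B and B-
        return lit[4:] if lit.startswith('neg:') else 'neg:' + lit

    def revise(agent, lit, p):            # p: literal -> preference value
        agent.wm.add(lit)
        agent.run_rules()                 # run rules to quiescence
        while True:
            b = next((x for x in agent.wm if complement(x) in agent.wm), None)
            if b is None:                 # no contradictory pair remains
                break
            agent.contract(min(b, complement(b), key=p))  # drop weaker member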
Proposition 6. The revision operation defined above satisfies (K+̇1) and (K+̇3)–(K+̇6).

Proof. (K+̇1) is satisfied because when we do +̇, we run the rules to quiescence. (K+̇3) is satisfied because the construction of K +̇ A starts with A being added to WM, which is then closed under consequence (which is K + A), and after that literals can only be
removed from WM. (K+̇4) holds because, if adding A does not cause an inconsistency, then K +̇ A = K + A by the definition of +̇. (K+̇5) holds trivially, because K +̇ A is never inconsistent. Finally, recall that in the agent's logic, Cn(A) = Cn(B) only if A = B, so (K+̇6) holds trivially.

The reason why (K+̇2), that is, the property that A ∈ K +̇ A, does not hold is simple. Suppose we add A to K and derive some literal B, but B− is already in WM and has a higher preference value than B. Then we contract by B, which may well result in contraction by A. Another example of a situation where (K+̇2) is violated would be revision by A in the presence of a rule A → A−.

One could question whether (K+̇2) is a desirable property. For example, it has been argued in [12] that an agent which performs autonomous belief revision would not satisfy this postulate in any case. However, if we do want to define a belief revision operation which satisfies this postulate, we need to make sure that A has a higher preference value than anything else in working memory, and that A on its own cannot be responsible for an inconsistency. One way to satisfy the first requirement is to use a preference order based on timestamps: more recent information is more preferred. To satisfy the second requirement, we may postulate that the agent's rules are not perverse. We call a set of rules R perverse if there is a literal A such that running R to quiescence on WM = {A} results in deriving a contradiction {B, B−} (including the possibility of deriving A−). Requiring that the rules are not perverse is equivalent to requiring that no singleton set of literals is exceptional in the sense of [13].
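The non-perverseness condition is itself checkable by a simple quiescence computation, one candidate literal at a time. A sketch, using the same illustrative rule encoding and complement convention as the fragments above:

    # Sketch: a rule set is perverse iff rules run to quiescence on some
    # singleton WM = {A} derive a contradictory pair {B, B-}.
    def is_perverse(rules, literals):
        for a in literals:
            wm = {a}
            changed = True
            while changed:                # run rules to quiescence on {A}
                changed = False
                for body, head in rules:
                    if body <= wm and head not in wm:
                        wm.add(head)
                        changed = True
            if any(complement(x) in wm for x in wm):
                return True
        return False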
8 Related Work

AGM belief revision is generally considered to apply only to idealised agents, because of the assumption that the set of beliefs is closed under logical consequence. To model Artificial Intelligence agents, an approach called belief base revision has been proposed (see, for example, [14, 15, 16, 17]). A belief base is a finite representation of a belief set. Revision and contraction operations can be defined on belief bases instead of on logically closed belief sets. However, the complexity of these operations ranges from NP-complete (full meet revision) to low in the polynomial hierarchy (computable using a polynomial number of calls to an NP oracle which checks satisfiability of a set of formulas) [18]. This complexity would not generally be considered appropriate for operations implemented by a resource-bounded agent. The reason for the high complexity is the need to check for classical consistency while performing the operations. One way around this is to weaken the language and the logic of the agent so that the consistency check is no longer an expensive operation (as suggested in [19]). This is the approach taken in this paper.

Our contraction algorithm is similar to the algorithm proposed by McAllester in [4] for boolean constraint propagation. McAllester also uses a notion of 'certainty' of a node, which is similar to our definition of preference.

Our approach to defining the preference order on beliefs is similar to the approach developed in [20, 21, 11] by Williams, Dixon and Wobcke. However, since they work with full classical logic, and calculating the entrenchment of a sentence involves
considering all possible derivations of this sentence, the complexity of their contraction and revision operations is at least exponential.

The motivation for our work is very similar to Wassermann's in [22], but Wassermann's solution to the computational cost of classical belief revision is to consider only small (relevant) subsets of the belief base and do classical belief revision on them. Chopra et al. [23] defined a contraction operation which approximates a classical AGM contraction operation; its complexity is O(|K| · |A| · 2^|S|), where K is the knowledge base, A the formula to be contracted, and S a set of 'relevant' atoms. As S gets larger, the contraction operation becomes a closer approximation of the classical contraction.

Perhaps the work most similar to ours is that of Bezzazi et al. [13], where belief revision and update operators for forward chaining reasoners were defined and analysed from the point of view of satisfying rationality postulates. The operators are applied to programs, which are finite sets of rules and literals, and are presented as 'syntactic' operators, which do not satisfy the closure under consequence and equivalence postulates. Rather, the authors were interested in preserving the 'minimal change' spirit of revision operators, which resulted in algorithms with high (exponential) complexity. Only one of the operators they propose, ranked revision, has polynomial complexity. To perform a ranked revision of a program P by a program P′, a base P0, . . . , Pn = ∅ of P is computed, where P0 = P and each Pi+1 is a subset of Pi containing the 'exceptional' rules of Pi. The base of a program can be computed in polynomial time in the size of the program; this involves adding the premises of each rule to the program, running the rules to quiescence and checking for consistency: if the resulting set is inconsistent, the rule is exceptional. Ranked revision of P by P′ is defined as Pi ∪ P′, where Pi is the largest member of the base of P which is consistent with P′. Consistency can also be checked in polynomial time by running the program to quiescence. For programs without exceptional rules, the result of ranked revision is either the union of P and P′, if they are consistent, or P′ alone, which is essentially full meet revision. Even for full classical logic, this is computable in NP.
9 Conclusions and Further Work

In this paper, we have presented a realisable resource-bounded agent which performs AGM-style belief revision. The agent is rule-based, and can be seen as a fully rational and omniscient reasoner in a very weak logic. The rules of the agent's program are fixed, and only literal beliefs are revised. We define an efficient (linear time) algorithm for contraction, similar to McAllester's algorithm for boolean constraint propagation, and show that the corresponding contraction operation satisfies all the basic AGM postulates apart from the recovery postulate. We show how to use a preference order on beliefs, similar to the entrenchment ranking introduced in [11], to contract by the minimally preferred set of beliefs. The additional cost of computing the preference order is small: the resulting algorithm is still sub-quadratic in the size of the agent's program. We then define a belief revision operator in terms of contraction, and show that it also satisfies all but one of the basic AGM postulates. The complexity of belief revision is polynomial in the size of the agent's program and the number of literals in the working memory. To the best of our knowledge, no one has previously pointed out that reason-maintenance
style belief revision satisfies the AGM rationality postulates, provided that we assume that the logic which the agent uses is weaker than full classical logic. In future work, we plan to look at efficient revision operations on the agent’s programs, and extend the syntax of the agent’s programs. Acknowledgements. We thank the anonymous referees for their helpful comments and suggestions.
References

1. Alchourrón, C.E., Gärdenfors, P., Makinson, D.: On the logic of theory change: Partial meet functions for contraction and revision. Journal of Symbolic Logic 50 (1985) 510–530
2. Doyle, J.: Reason maintenance and belief revision: Foundations vs. coherence theories. In Gärdenfors, P., ed.: Belief Revision. Volume 29 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, UK (1992) 29–51
3. Doyle, J.: Truth maintenance systems for problem solving. In: Proceedings of the Fifth International Joint Conference on Artificial Intelligence, IJCAI 77. (1977) 247
4. McAllester, D.A.: Truth maintenance. In: Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI'90), AAAI Press (1990) 1109–1116
5. Gärdenfors, P.: Conditionals and changes of belief. In Niiniluoto, I., Tuomela, R., eds.: The Logic and Epistemology of Scientific Change. North Holland (1978) 381–404
6. Gärdenfors, P.: Knowledge in Flux: Modelling the Dynamics of Epistemic States. The MIT Press, Cambridge, Mass. (1988)
7. Rao, A.S.: AgentSpeak(L): BDI agents speak out in a logical computable language. In Van de Velde, W., Perram, J., eds.: Proceedings of the Seventh Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW'96), 22–25 January, Eindhoven, The Netherlands. Number 1038 in Lecture Notes in Artificial Intelligence, London, Springer-Verlag (1996) 42–55
8. Laird, J.E., Newell, A., Rosenbloom, P.S.: SOAR: An architecture for general intelligence. Artificial Intelligence 33 (1987) 1–64
9. Software Technology Branch, Lyndon B. Johnson Space Center, Houston: CLIPS Reference Manual: Version 6.21 (2003)
10. Makinson, D.: On the status of the postulate of recovery in the logic of theory change. Journal of Philosophical Logic 16 (1987) 383–394
11. Williams, M.A.: Iterated theory base change: A computational model. In: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), San Mateo, Morgan Kaufmann (1995) 1541–1549
12. Galliers, J.R.: Autonomous belief revision and communication. In Gärdenfors, P., ed.: Belief Revision. Volume 29 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press (1992) 220–246
13. Bezzazi, H., Janot, S., Konieczny, S., Pérez, R.P.: Analysing rational properties of change operators based on forward chaining. In Freitag, B., Decker, H., Kifer, M., Voronkov, A., eds.: Transactions and Change in Logic Databases. Volume 1472 of Lecture Notes in Computer Science. Springer (1998) 317–339
14. Makinson, D.: How to give it up: A survey of some formal aspects of the logic of theory change. Synthese 62 (1985) 347–363
15. Nebel, B.: A knowledge level analysis of belief revision. In Brachman, R., Levesque, H.J., Reiter, R., eds.: Principles of Knowledge Representation and Reasoning: Proceedings of the First International Conference, San Mateo, Morgan Kaufmann (1989) 301–311
16. Williams, M.A.: Two operators for theory base change. In: Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence, World Scientific (1992) 259–265
17. Rott, H.: "Just Because": Taking belief bases seriously. In Buss, S.R., Hájek, P., Pudlák, P., eds.: Logic Colloquium '98: Proceedings of the 1998 ASL European Summer Meeting. Volume 13 of Lecture Notes in Logic. Association for Symbolic Logic (1998) 387–408
18. Nebel, B.: Base revision operations and schemes: Representation, semantics and complexity. In Cohn, A.G., ed.: Proceedings of the Eleventh European Conference on Artificial Intelligence (ECAI'94), Amsterdam, The Netherlands, John Wiley and Sons (1994) 341–345
19. Nebel, B.: Syntax-based approaches to belief revision. In Gärdenfors, P., ed.: Belief Revision. Volume 29 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, UK (1992) 52–88
20. Dixon, S.: A finite base belief revision system. In: Proceedings of the Sixteenth Australian Computer Science Conference (ACSC-16): Australian Computer Science Communications. Volume 15, Brisbane, Australia, Queensland University of Technology, Australia (1993) 445–451
21. Dixon, S., Wobcke, W.: The implementation of a first-order logic AGM belief revision system. In: Proceedings of the 5th IEEE International Conference on Tools with AI, Boston, MA, IEEE Computer Society Press (1993) 40–47
22. Wassermann, R.: Resource-Bounded Belief Revision. PhD thesis, ILLC, University of Amsterdam (2001)
23. Chopra, S., Parikh, R., Wassermann, R.: Approximate belief revision. Logic Journal of the IGPL 9 (2001) 755–768
Agent-Oriented Programming with Underlying Ontological Reasoning

Álvaro F. Moreira¹, Renata Vieira², Rafael H. Bordini³, and Jomi F. Hübner⁴

¹ Universidade Federal do Rio Grande do Sul, Brazil ([email protected])
² Universidade do Vale do Rio dos Sinos, Brazil ([email protected])
³ University of Durham, UK ([email protected])
⁴ Universidade Regional de Blumenau, Brazil ([email protected])
Abstract. Developing applications that make effective use of machine-readable knowledge sources as promised by the Semantic Web vision is attracting much of current research interest; this vision is also affecting important trends in computer science such as grid-based and ubiquitous computing. In this paper, we formally define a version of the BDI agent-oriented programming language AgentSpeak based on description logic rather than predicate logic. In this approach, the belief base of an agent contains the definition of complex concepts, besides specific factual knowledge. We illustrate the approach using examples based on the well-known smart meeting-room scenario. The advantages of combining AgentSpeak with description logics are: (i) queries to the belief base are more expressive, as their results do not rely only on explicit knowledge but can be inferred from the ontology; (ii) the notion of belief update is refined, given that the (ontological) consistency of a belief addition can be checked; (iii) retrieving a plan for handling an event is more flexible, as it is not based solely on unification but on the subsumption relation between concepts; and (iv) agents may share knowledge by using ontology languages such as OWL. Extending agent programming languages with description logics can have a significant impact on the development of multi-agent systems for the semantic web.
1 Introduction

Developing applications that make effective use of machine-readable knowledge sources as promised by the Semantic Web vision is attracting much of current research interest. More than that, semantic web technologies are also being used as the basis for other important trends in computer science such as grid computing [13] and ubiquitous computing [9]. Among the key components of semantic web technologies are domain ontologies [24], responsible for the specification of application-specific knowledge. As they can be expressed logically, they can be the basis for sound reasoning in a specific domain. Several ontologies are being developed for specific applications [10, 12, 18, 25]. Intelligent agents are another key component of semantic web technologies; they
should make use of the available knowledge, and interact with other agents autonomously, so as to act on the user's best interests. In this work, we bring together these two key semantic web components by proposing an extension to the BDI agent programming language AgentSpeak [22]; there has been much work on extending this language so that it becomes a fully-fledged programming language for multi-agent systems [19, 2, 6]. The AgentSpeak variant proposed here is based on Description Logic (DL) [3] rather than classical (predicate) logic; we shall call them AgentSpeak-DL and predicate-logic AgentSpeak (for emphasis), respectively. With DL, the belief base of an AgentSpeak agent consists of the definition of complex concepts and relationships among them, as well as specific factual information (in DL terminology, these are called the TBox and the ABox, respectively). To the best of our knowledge, this is the first work to address ontological reasoning as an underlying mechanism within an agent-oriented programming language.

Description logics are at the core of widely known ontology languages, such as the Web Ontology Language (OWL) [17]. An extension of AgentSpeak and its interpreter with underlying automated reasoning over ontologies expressed in such languages can have a major impact on the development of agents and multi-agent systems that can operate in the context of the semantic web. Although applications for the semantic web are already being developed, often based on the agents paradigm, most such development is being done in a completely ad hoc fashion as far as agent-oriented programming is concerned. This work contributes to the development of multi-agent systems for semantic web applications in a principled way. Further, ontological reasoning combined with an agent programming language itself facilitates certain tasks involved in programming in such languages.

The remainder of this paper is organised as follows. In Section 2, we describe the syntax of the AgentSpeak language based on DL, and we also explain briefly the main characteristics of the particular DL used for defining ontologies in the context of this paper. Section 3 presents the modifications that are necessary in the formal semantics of predicate-logic AgentSpeak as a consequence of introducing ontological description and reasoning. Each modification in the semantics is followed by a small example giving its intuition and illustrating the practical benefits of the proposed modification. The examples used here are related to the well-known scenario of smart meeting-room applications. In the final section we draw some conclusions and discuss future work.
2 The AgentSpeak-DL Language

The syntax of AgentSpeak-DL is essentially the same as the syntax of predicate-logic AgentSpeak [8], the only difference being that in predicate-logic AgentSpeak the belief base of an agent consists solely of ground atoms, whereas in AgentSpeak-DL the belief base contains the definition of complex concepts, besides factual knowledge. In order to keep the formal treatment and examples simple, in this paper we assume ALC as the underlying description logic [4] for AgentSpeak-DL. An agent program ag is thus given by an ontology Ont and a set ps of plans, as defined by the grammar in Figure 1.
Agent-Oriented Programming with Underlying Ontological Reasoning ag
::= Ont
Ont TBox ABox TBoxAx D at
::= ::= ::= ::= ::= ::=
TBox ABox TBoxAx1 . . . TBoxAcn (n ≥ 0) at 1 . . . atn (n ≥ 0) D1 ≡ D 2 | D1 D1 ⊥ | A | ¬D | D1 D2 | D1 D2 | ∀R.D | ∃R.D A(t) | R(t1 , t2 )
ps p te ct h g u
::= ::= ::= ::= ::= ::= ::=
p1 . . . pn (n ≥ 1) te : ct ← h +at | −at | +g | −g at | ¬at | ct ∧ ct | T a | g | u | h; h | T !at |?at +at | −at
157
ps
Fig. 1. AgentSpeak-DL Syntax
As seen in the grammar above, an ontology consists of a TBox and an ABox. A TBox is a set of class and property descriptions, and axioms establishing equivalence and subsumption relationships between classes. An ABox describes the state of an application domain by asserting that certain individuals are instances of certain classes or that certain individuals are related by a property. In the grammar, the metavariable D represents classes; D1 ≡ D2 asserts that both classes are equivalent, and D1 ⊑ D2 asserts that class D1 is subsumed by class D2. The definition of classes assumes the existence of identifiers for primitive classes and properties. The metavariable A stands for names of primitive classes (i.e., predefined classes) as well as atomic class names chosen to identify constructed classes defined in the TBox; the metavariable R stands for primitive properties. New classes can be defined by using certain constructors, such as ⊓ and ⊔, which represent the intersection and the union of two concepts, respectively. The metavariable t is used for (first-order) terms.

An AgentSpeak plan is formed by a triggering event (denoting the events for which that plan should be considered relevant) followed by a conjunction of belief literals representing a context. The context must be a logical consequence of that agent's current beliefs for the plan to be applicable for handling an event. The remainder of the plan, the plan body, is a sequence of basic actions to be executed, (sub)goals that the agent has to achieve (or test), or (internal) belief updates. The basic actions (represented by the metavariable a in the grammar above) that can appear in a plan body denote the pre-defined ways in which an agent is able to change its environment.

AgentSpeak distinguishes two types of goals: achievement goals and test goals. Achievement and test goals are predicates (as for beliefs) prefixed with the operators '!' and '?', respectively. An achievement goal states that the agent wants to achieve a state of the world where the associated predicate is true. (In practice, these initiate the execution of subplans.) A test goal corresponds to a query to the agent's belief base.

Events, which initiate the execution of plans, can be internal, when a subgoal needs to be achieved, or external, when generated from belief updates as a result of perceiving
the environment. There are two types of triggering events that can be used in plans: those related to the addition ('+') and deletion ('−') of mental attitudes (i.e., beliefs or goals). In every reasoning cycle, one of the existing events that have not yet been dealt with is selected to be handled next. If there is an applicable plan for it (i.e., a relevant plan whose context is satisfied), an instance of that plan becomes an "intended means". The agent is then committed to execute that plan as part of one of its intentions. Also within a reasoning cycle, one of the (possibly various) intentions of the agent is selected for being further executed.

Throughout the paper, we illustrate the practical impact of ontological reasoning with simple examples related to the well-known scenario of smart meeting-room applications [9]. An example of TBox components for such a scenario is as follows:

  presenter ≡ invitedSpeaker ⊔ paperPresenter
  attendee ≡ person ⊓ registered ⊓ ¬presenter
  . . .

This TBox asserts that the concept presenter is equivalent to invited speaker or paper presenter, and the concept attendee is equivalent to the concept of a registered person who is not a presenter. Examples of elements of an ABox defined with respect to the TBox above are:

  invitedSpeaker(john)
  paperPresenter(mary)

We assume that, as usual, the schedule of an academic event has slots for paper presenters and for invited speakers. At the end of a presentation slot, an event is generated indicating that the next presenter, according to the schedule, is now required for the continuation of the session. For example, at the time Mary is required to start her presentation, a meeting-room agent would acquire (say, by communication with the schedule agent, which has sensors for detecting when a speaker leaves the
+paperPresenter(P) : late(P) ← !reschedule(P). +invitedSpeaker(P) : late(P) ← !apologise; !announce(P). +presenter(P) : ¬late(P) ← !announce(P).
Fig. 2. Examples of AgentSpeak plans
stage) the belief paperPresenter(mary), which generates an event for handling +paperPresenter(mary), a belief addition event. In Figure 2, we give examples of AgentSpeak-DL plans to handle such “next presenter” events. The first plan in Figure 2 says that if a presenter of a paper is late s/he is rescheduled to the end of the session (and the session continues with the next scheduled speaker). If an invited speaker is late, apologies are given to the audience and the speaker is announced (this event only happens when the invited speaker actually arrives, assuming the paper sessions must not begin before the invited talk). The third is a general plan that announces any presenter (paperPresenter or invitedSpeaker) if s/he is not late. We now proceed to discuss the implications, to the formal semantics, of incorporating ontologies in AgentSpeak.
3 Semantics of AgentSpeak-DL

The reasoning cycle of an AgentSpeak agent follows a sequence of steps. The graph in Figure 3 shows all possible transitions between the various steps in an agent's reasoning cycle (the labels in the nodes name each step in the cycle). The set of labels used is {ProcMsg, SelEv, RelPl, ApplPl, SelAppl, AddIM, SelInt, ExecInt, ClrInt}; they stand for, respectively: processing communication messages, selecting an event from the set of events, retrieving all relevant plans, checking which of those are applicable, selecting one particular applicable plan (the intended means), adding the new intended means to the set of intentions, selecting an intention, executing the selected intention, and clearing an intention or intended means that may have finished in the previous step.

In this section we present an operational semantics for AgentSpeak-DL that formalises some of the possible transitions depicted in the reasoning cycle of Figure 3. However, we concentrate here on the formalisation of the steps that required changes to accommodate the DL extensions. Operational semantics [21] is a widely used method for giving semantics to programming languages and studying their properties. The semantic rules for the steps in the reasoning cycle are essentially the same in AgentSpeak-DL as for predicate-logic AgentSpeak, with the exception of the following aspects that are affected by the introduction of ontological reasoning:

– plan retrieval and selection: performed in the steps responsible for collecting relevant and applicable plans, and selecting one plan among the set of applicable plans (steps RelPl, ApplPl, and SelAppl of Figure 3, respectively);
[Figure 3 shows a transition graph over the reasoning-cycle steps ProcMsg, SelEv, RelPl, ApplPl, SelAppl, AddIM, SelInt, ExecInt, and ClrInt.]

Fig. 3. Transitions between Reasoning Cycle Steps
– querying the belief base: performed in step ExecInt of Figure 3; and
– belief updating: also performed in step ExecInt of Figure 3.

A complete version of an operational semantics for AgentSpeak is given in [8]. For this reason, in this work we give only the semantic rules of AgentSpeak-DL that are different from their counterparts in the operational semantics of AgentSpeak.

3.1 Configuration of the Transition System

The operational semantics is given by a set of rules that define a transition relation between configurations ⟨ag, C, T, s⟩ where:

– An agent program ag is, as defined above, a set of beliefs and a set of plans. Note that in predicate-logic AgentSpeak, the set of beliefs is simply a collection of ground atoms. In AgentSpeak-DL the belief base is an ontology.
– An agent's circumstance C is a tuple ⟨I, E, A⟩ where:
  • I is a set of intentions {i, i′, . . .}. Each intention i is a stack of partially instantiated plans.
  • E is a set of events {⟨te, i⟩, ⟨te′, i′⟩, . . .}. Each event is a pair ⟨te, i⟩, where te is a triggering event and i is an intention (the particular intention that generated an internal event, or the empty intention in the case of an external event).
  • A is a set of actions that the agent has chosen for execution; the agent's effector will change the environment accordingly.
– It helps in giving the semantics to use a structure T which keeps track of temporary information that is required in subsequent steps of the reasoning cycle; such information is only used within a single reasoning cycle. T is used to denote a tuple ⟨R, Ap, ι, ε, ρ⟩ with the required temporary information; it has as components: (i) R for the set of relevant plans (for the event being handled); (ii) Ap for the set of applicable plans (the subset of relevant plans whose contexts are true); and (iii) ι, ε, and ρ, which keep record of a particular intention, event, and applicable plan (respectively) being considered along the execution of a reasoning cycle.
– The current step s within an agent's reasoning cycle is annotated by labels s ∈ {SelEv, RelPl, ApplPl, SelAppl, AddIM, SelInt, ExecInt, ClrInt} (as seen in Figure 3).

In the general case, an agent's initial configuration is ⟨ag, C, T, SelEv⟩, where ag is as given by the agent program, and all components of C and T are empty. In order to keep the semantic rules elegant, we adopt the following notation:

– If C is an AgentSpeak agent circumstance, we write C_E to make reference to the component E of C, and similarly for all the other components of a configuration of the transition system.
– We write i[p] to denote the intention formed by pushing plan p on top of intention i.
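For readers who prefer code, the configuration can be pictured as a small record. This Python rendering is purely illustrative and plays no role in the formal semantics; all names are our own.

    # Illustrative rendering of a configuration <ag, C, T, s>.
    from dataclasses import dataclass, field

    @dataclass
    class Circumstance:                         # C = <I, E, A>
        I: list = field(default_factory=list)   # intentions (stacks of plans)
        E: list = field(default_factory=list)   # events: (trigger, intention) pairs
        A: list = field(default_factory=list)   # actions chosen for execution

    @dataclass
    class Temp:                                 # T = <R, Ap, iota, eps, rho>
        R: list = field(default_factory=list)   # relevant plans
        Ap: list = field(default_factory=list)  # applicable plans
        iota: object = None                     # intention under consideration
        eps: object = None                      # event under consideration
        rho: object = None                      # selected applicable plan

    @dataclass
    class Configuration:
        ag: object                              # agent program (ontology + plans)
        C: Circumstance = field(default_factory=Circumstance)
        T: Temp = field(default_factory=Temp)
        s: str = 'SelEv'                        # current reasoning-cycle step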
3.2 Retrieving and Selecting Plans in AgentSpeak-DL

The reasoning cycle of an agent can be better understood by assuming that it starts with the selection of an event from the set of events (this step assumes the existence of a selection function S_E). The next step in the reasoning cycle is the search for relevant plans for dealing with the selected event. In the semantics of predicate-logic AgentSpeak, this is formalised by the following rules. Rule Rel1 below initialises the R component of T with the set of relevant plans determined by the auxiliary function RelPlans (which is formally defined below), and sets the reasoning cycle to the step (ApplPl) that determines the applicable plans among those in the R component.

  T_ε = ⟨te, i⟩    RelPlans(ag_ps, te) ≠ {}
  ------------------------------------------------ (Rel1)
  ⟨ag, C, T, RelPl⟩ −→ ⟨ag, C, T′, ApplPl⟩

  where: T′_R = RelPlans(ag_ps, te)

  T_ε = ⟨te, i⟩    RelPlans(ag_ps, te) = {}
  ------------------------------------------------ (Rel2)
  ⟨ag, C, T, RelPl⟩ −→ ⟨ag, C, T, SelEv⟩
If there are no relevant plans for an event, it is simply discarded and, with it, the associated intention. In this case the cycle starts again with the selection of another event from the set of events (rule Rel2). If there are no pending events to handle, the cycle skips to the intention execution step.

In predicate-logic AgentSpeak, a plan is considered relevant in relation to an event if it has been written to deal specifically with that event (as stated by the plan's triggering event). In practice, this is checked in predicate-logic AgentSpeak by trying to unify the triggering event part of the plan with the event that has been selected, from the set of events E, for being handled during this reasoning cycle. The auxiliary function RelPlans for predicate-logic AgentSpeak is then defined as follows (below, if p is a plan of the form te : ct ← h, we define TrEv(p) = te).

Definition 1. Given the plans ps of an agent and a selected event ⟨te, i⟩, the set RelPlans(ps, te) of relevant plans for that event is defined as follows:
  RelPlans(ps, te) = {(p, θ) | p ∈ ps and θ is an mgu s.t. teθ = TrEv(p)θ}.

It is important to remark that, in predicate-logic AgentSpeak, the key mechanism used for searching relevant plans in the agent's plan library is unification. This means that the programmer has to write specific plans for each possible type of event. The only degree of generality is obtained by the use of variables in the triggering events of plans. When ontological reasoning is added (instead of using unification only), a plan is considered relevant in relation to an event not only if it has been written specifically to deal with that event, but also if the plan's triggering event has a more general relevance, in the sense that it subsumes the actual event. In practice, this is checked by: (i) finding a plan whose triggering event predicate is related (in the ontology) by subsumption or equivalence relations to the predicate in the event that has been selected for handling; and (ii) unifying the terms that are arguments of the event and of the plan's triggering event.
In the formal semantics of AgentSpeak-DL, rules Rel1 and Rel2 still apply, but the auxiliary function RelPlans for AgentSpeak-DL has to be redefined as follows (recall that D is a metavariable for classes of an ontology):

Definition 2. Given plans ps and ontology Ont of an agent, and an event ⟨∗A1(t), i⟩ where ∗ ∈ {+, −, +!, +?, −!, −?}, the set RelPlans(Ont, ps, ∗A1(t)) is the set of pairs (p, θ) such that p ∈ ps, with TrEv(p) = ∗′A2(t′), and
– ∗ = ∗′
– θ = mgu(t, t′)
– Ont |= A1 ⊑ A2 or Ont |= A1 ≡ A2.

As an example, let us consider the case of checking for plans that are relevant for a particular event in the smart meeting-room scenario. Suppose that the application detects that the next slot is allocated to the invited speaker john. This causes the addition of the external event ⟨+invitedSpeaker(john), ⊤⟩ to the set of events. Recall that invitedSpeaker ⊑ presenter can be inferred from the ontology. With this, and given Definition 2, the plan with triggering event +presenter(X) is also considered relevant for dealing with that event (see Figure 2). Observe that using subsumption instead of unification alone as the mechanism for selecting relevant plans results in a potentially larger set of plans than in predicate-logic AgentSpeak.
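To make the difference from unification-based retrieval concrete, the following Python sketch implements the spirit of Definition 2. The subsumption test is abstracted as a precomputed set of concept-name pairs derived from the TBox (a real interpreter would instead query a DL reasoner); the single-argument plan and event format, the toy unify helper, and all names are our own illustrative simplifications.

    # Sketch of Definition 2: ontology-aware relevant-plan retrieval.
    # subsumed: pairs (A1, A2) with Ont entailing A1 subsumed by (or
    # equivalent to) A2, closed under reflexivity, so (A, A) is included.
    def rel_plans(subsumed, plans, event):
        op1, a1, t1 = event                    # e.g. ('+', 'invitedSpeaker', 'john')
        relevant = []
        for plan in plans:                     # plan = (trigger, context, body)
            op2, a2, t2 = plan[0]
            theta = unify(t1, t2)
            if op1 == op2 and theta is not None and (a1, a2) in subsumed:
                relevant.append((plan, theta))
        return relevant

    def unify(t1, t2):
        # toy unification over constants and variables (upper-case initial)
        if t2[:1].isupper():
            return {t2: t1}
        if t1[:1].isupper():
            return {t1: t2}
        return {} if t1 == t2 else None

With subsumed containing both (invitedSpeaker, invitedSpeaker) and (invitedSpeaker, presenter), the event ('+', 'invitedSpeaker', 'john') retrieves both the +invitedSpeaker(P) and the +presenter(P) plans of Figure 2.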
A plan is applicable if it is relevant and its context is a logical consequence of the agent's beliefs. Rules Appl1 and Appl2 formalise the step of the reasoning cycle that determines the applicable plans from the set of relevant plans.

  AppPlans(ag_bs, T_R) ≠ {}
  ------------------------------------------------ (Appl1)
  ⟨ag, C, T, ApplPl⟩ −→ ⟨ag, C, T′, SelAppl⟩

  where: T′_Ap = AppPlans(ag_bs, T_R)

  AppPlans(ag_bs, T_R) = {}
  ------------------------------------------------ (Appl2)
  ⟨ag, C, T, ApplPl⟩ −→ ⟨ag, C, T, SelInt⟩
Rule Appl1 initialises the T_Ap component with the set of applicable plans; Appl2 is the rule for the case where there are no applicable plans (to avoid discussing details of a possible plan failure handling mechanism, we assume the event is simply discarded). Both rules depend on the auxiliary function AppPlans, which for predicate-logic AgentSpeak has been defined as follows (note that bs was originally the agent's set of beliefs, which has now been replaced by an ontology).

Definition 3. Given a set of relevant plans R and the beliefs bs of an agent, the set of applicable plans AppPlans(bs, R) is defined as follows:
  AppPlans(bs, R) = {(p, θ′ ∘ θ) | (p, θ) ∈ R and θ′ is s.t. bs |= Ctxt(p)θθ′}.
Observe that the context of a plan is a conjunction of literals and, as the belief base is formed by a set of ground atomic formulæ only, the problem of checking if the plan's context is a logical consequence of the belief base reduces to the problem of checking membership (or non-membership, for default negation) of context literals in the set of beliefs while finding an appropriate unifier.

In AgentSpeak-DL, a plan is applicable if it is relevant and its context can be inferred from the whole ontology forming the belief base. A plan's context is a conjunction of literals; a literal l is either A(t) or ¬A(t), and Ont |= l1 ∧ . . . ∧ ln if, and only if, Ont |= li for i = 1, . . . , n. The auxiliary function for checking, from a set of relevant plans, which ones are applicable is then formalised below. Again, because the belief base is structured and reasoning is based on ontological knowledge rather than just straightforward variable instantiation, the resulting set of applicable plans might be larger than in predicate-logic AgentSpeak.

Definition 4. Given a set of relevant plans R and ontology Ont of an agent, the set of applicable plans AppPlans(Ont, R) is defined as follows:
  AppPlans(Ont, R) = {(p, θ′ ∘ θ) | (p, θ) ∈ R and θ′ is s.t. Ont |= Ctxt(p)θθ′}.

More than one plan can be considered applicable for handling an event at a given moment in time. Rule SelAppl in the formal semantics of predicate-logic AgentSpeak assumes the existence of a (given, application-specific) selection function S_Ap that selects a plan from a set of applicable plans T_Ap. The selected plan is then assigned to the T_ρ component of the configuration, indicating, for the next steps in the reasoning cycle, that an instance of this plan has to be added to the agent's intentions.

  S_Ap(T_Ap) = (p, θ)
  ------------------------------------------------ (SelAppl)
  ⟨ag, C, T, SelAppl⟩ −→ ⟨ag, C, T′, AddIM⟩

  where: T′_ρ = (p, θ)

In predicate-logic AgentSpeak, users define the applicable plan selection function (S_Ap) in a way that suits that particular application. For example, if in a certain domain there are known probabilities of the chance of success, or resulting quality of achieved tasks, associated with various plans, this can easily be used to specify such a function. Note, however, that in predicate-logic AgentSpeak the predicate in the triggering event of all the plans in the set of applicable plans is exactly the same. On the contrary, in AgentSpeak-DL, because of the way the relevant and applicable plans are determined, it is possible that plans with triggering events +presenter(P) and +invitedSpeaker(P) are both considered relevant and applicable for handling an event ⟨+invitedSpeaker(john), ⊤⟩. The function S_Ap in rule SelAppl could be used to select, for example, the least general plan among those in the set of applicable plans. To allow this to happen, the semantic rule has to be slightly modified so as to include, as an argument to S_Ap, the event that has triggered the search for a plan (see rule SelApplOnt below). In the example we are using, the selected plan should be the one
with triggering event +invitedSpeaker, as this plan has probably been written to deal more specifically with the case of invited speakers, rather than the more general plan which can be used for other types of presenters as well. On the other hand, if the particular plan for invited speakers that we have in the example is not applicable (because the speaker is not late), instead of the agent not acting at all for lack of applicable plans, the more general plan for presenters can then be used.

  T_ε = ⟨te, i⟩    S_Ap(T_Ap, te) = (p, θ)
  ------------------------------------------------ (SelApplOnt)
  ⟨ag, C, T, SelAppl⟩ −→ ⟨ag, C, T′, AddIM⟩

  where: T′_ρ = (p, θ)

Events can be classified as external or internal (depending on whether they were generated from the agent's perception of the environment, or whether they were generated by goal additions during the execution of other plans, respectively). If the event being handled in a particular reasoning cycle is external, a new intention is created and its single plan is the plan p assigned to the ρ component in the previous steps of the reasoning cycle. If the event is internal, rule IntEv (omitted here) states that the plan in ρ should be pushed on top of the intention associated with the given event. Rule ExtEv creates a new intention (i.e., a new focus of attention) for an external event.

3.3 Intention Execution: Querying the Belief Base

The next step, after updating the set of intentions as explained above, uses an agent-specific function (S_I) that selects the intention to be executed next (recall that an intention is a stack of plans). When the set of intentions is empty, the reasoning cycle is simply restarted. The plan to be executed is always the one at the top of the intention that has been selected. Agents can execute actions, achievement goals, test goals, or belief base updates (by adding or removing "internal beliefs", as opposed to those resulting from perception of the environment). Both the execution of actions and the execution of achievement goals are unaffected by the introduction of ontological reasoning, so their semantics are exactly the same as their counterparts in predicate-logic AgentSpeak. The execution of actions, from the point of view of the AgentSpeak interpreter, reduces to instructing other architectural components to perform the respective action so as to change the environment. The execution of achievement goals adds a new internal event to the set of events. That event will be selected at a later reasoning cycle, and handled as explained above.

The evaluation of a test goal ?at, however, is more expressive in AgentSpeak-DL than in predicate-logic AgentSpeak. In predicate-logic AgentSpeak, the execution of a test goal consists in testing whether at is a logical consequence of the agent's beliefs. The Test auxiliary function defined below returns a set of most general unifiers, all of which make the formula at a logical consequence of a set of formulæ bs.

Definition 5. Given a set of formulæ bs and a formula at, the set of substitutions Test(bs, at) produced by testing at against bs is defined as follows:
Test(bs, at) = {θ | bs |= atθ}.
This auxiliary function is then used in the formal semantics by rules Test1 and Test2 below. If the test goal succeeds (rule Test1), the substitution is applied to the whole intended means, and the reasoning cycle can carry on. If that is not the case, in recent extensions of AgentSpeak it may be the case that the test goal is used as a triggering event of a plan, which is used by programmers to formulate more sophisticated queries.¹ Rule Test2 is used in such a case: it generates an internal event, which may eventually trigger the execution of a plan (explicitly created to carry out a complex query).

  T_ι = i[head ← ?at; h]    Test(ag_bs, at) ≠ {}
  ------------------------------------------------ (Test1)
  ⟨ag, C, T, ExecInt⟩ −→ ⟨ag, C′, T, ClrInt⟩

  where: C′_I = (C_I \ {T_ι}) ∪ {(i[head ← h])θ}   with θ ∈ Test(ag_bs, at)

  T_ι = i[head ← ?at; h]    Test(ag_bs, at) = {}
  ------------------------------------------------ (Test2)
  ⟨ag, C, T, ExecInt⟩ −→ ⟨ag, C′, T, ClrInt⟩

  where: C′_E = C_E ∪ {⟨+?at, i[head ← h]⟩}
         C′_I = C_I \ {T_ι}

In AgentSpeak-DL, the semantic rules for the evaluation of a test goal ?A(t) are exactly as the rules Test1 and Test2 above. However, the function Test, used to check whether the formula A(t) is a logical consequence of the agent's belief base, which is now based on an ontology, needs to be changed. The auxiliary function Test is redefined as follows:

Definition 6. Given a set of formulæ Ont and a formula at, the set of substitutions Test(Ont, at) is given by
  Test(Ont, at) = {θ | Ont |= atθ}.

Observe that this definition is similar to the definition of the auxiliary function Test given for predicate-logic AgentSpeak. The crucial difference is that the reasoning capabilities of description logic now allow agents to infer knowledge that is implicit in the ontology. As an example, suppose that the agent's belief base does not explicitly refer to instances of presenter, but has instead the facts invitedSpeaker(john) and paperPresenter(mary). A test goal such as ?presenter(A) succeeds in this case, producing substitutions that map A to john and to mary.
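Restricted to atomic concept queries, this test can again be sketched with the precomputed subsumption relation used earlier; full ALC instance checking (including negation and role restrictions) would be delegated to a DL reasoner, and the query-variable name and data layout below are illustrative only.

    # Sketch of Definition 6 for queries ?A(X) over atomic concepts: an
    # individual answers the query if it is asserted to belong to some
    # concept that the TBox places below the queried concept.
    def test(subsumed, abox, concept):
        answers = []
        for c, t in abox:                      # ABox class assertions c(t)
            if (c, concept) in subsumed:
                answers.append({'A': t})       # substitution mapping A to t
        return answers

    # Example: with abox = [('invitedSpeaker', 'john'),
    #                       ('paperPresenter', 'mary')]
    # and (invitedSpeaker, presenter), (paperPresenter, presenter) in
    # subsumed, test(subsumed, abox, 'presenter') yields A=john and A=mary.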
¹ Note that this was not clear in the original definition of AgentSpeak(L). In our work on extensions of AgentSpeak we have given its semantics in this way, as it allows a complex plan to be used for determining the values to be part of the substitution resulting from a test goal (rather than just retrieving specific values previously stored in the belief base).
3.4 Belief Updating

In predicate-logic AgentSpeak, the addition or deletion of internal beliefs has no further implications apart from a possible new event to be included in the set of events E (as for belief update from perception of the environment, events are generated whenever beliefs are added to or removed from the belief base). Rule AddBel below formalises belief addition in predicate-logic AgentSpeak: the formula +b is removed from the body of the plan and the set of intentions is updated accordingly. In practice, this mechanism for adding and removing internal beliefs allows agents to have "mental notes", which can be useful at times (for practical programming tasks). These beliefs should not normally be confused with beliefs acquired from perception of the environment. In the rules below, the notation bs′ = bs + b means that bs′ is as bs except that bs′ |= b.

  T_ι = i[head ← +b; h]
  ------------------------------------------------ (AddBel)
  ⟨ag, C, T, ExecInt⟩ −→ ⟨ag′, C′, T, ClrInt⟩

  where: ag′_bs = ag_bs + b
         C′_E = C_E ∪ {⟨+b, ⊤⟩}
         C′_I = (C_I \ {T_ι}) ∪ {i[head ← h]}

In predicate-logic AgentSpeak, the belief base consists solely of ground atomic formulæ, so ensuring consistency is not a major task. In AgentSpeak-DL, however, belief update is more complicated than in predicate-logic AgentSpeak. The agent's ABox contains class assertions A(t) and property assertions R(t1, t2). The representation of such information should, of course, be consistent with its TBox. In AgentSpeak-DL, the addition of assertions to the agent's ABox is only allowed if the ABox resulting from such an addition is consistent with the TBox (i.e., if adding the given belief to the ontology's ABox² maintains consistency). Approaches for checking consistency of an ABox with respect to a TBox are discussed in detail in [4]. The semantic rule AddBel has to be modified accordingly:

  T_ι = i[head ← +b; h]    ag_Ont ∪ {b} is consistent
  ------------------------------------------------------- (AddBelOnt1)
  ⟨ag, C, T, ExecInt⟩ −→ ⟨ag′, C′, T, ClrInt⟩

  where: ag′_Ont = ag_Ont ∪ {b}
         C′_E = C_E ∪ {⟨+b, ⊤⟩}
         C′_I = (C_I \ {T_ι}) ∪ {i[head ← h]}

There is a similar rule for belief deletions (i.e., a formula such as −at in a plan body), but it is trivially defined based on the one above, so we do not include it explicitly here.

Expanding on the smart meeting-room example, assume that the TBox is such that the concepts chair and bestPaperWinner are disjoint. Clearly, if the ABox asserts chair(mary), the assertion bestPaperWinner(mary) cannot simply be added to it, as the resulting belief base would become inconsistent. The rule below formalises the semantics of AgentSpeak-DL for such cases.
² In the AddBelOnt rules below, we assume that the union operation applied to an ontology ⟨TBox1, ABox1⟩ and an ABox ABox2 results in an ontology ⟨TBox1, ABox1 ∪ ABox2⟩, as expected.
  T_ι = i[head ← +b; h]    ag_Ont ∪ {b} is not consistent    (adds, dels) = BRF(ag_Ont, b)
  --------------------------------------------------------------------------- (AddBelOnt2)
  ⟨ag, C, T, ExecInt⟩ −→ ⟨ag′, C′, T, ClrInt⟩
  where: ag′_Ont = (ag_Ont ∪ adds) \ dels
         C′_E = C_E ∪ {⟨+b′, ⊤⟩ | b′ ∈ adds} ∪ {⟨−b′, ⊤⟩ | b′ ∈ dels}
         C′_I = (C_I \ {T_ι}) ∪ {i[head ← h]}

According to this rule, a belief revision function (BRF) is responsible for determining the necessary modifications to the belief base as a consequence of attempting to add a belief that would cause ontological inconsistency. Given an ontology and a belief atom, this function returns a pair of sets of atomic formulæ that were, respectively, added (adds) and deleted (dels) to/from the belief base. Belief revision is a complex subject and clearly outside the scope of this paper; see [1] for an interesting (tractable) approach to belief revision.

The rules above are specific to the addition of beliefs that arise from the execution of plans. Recall that these are used as mental notes that the agent uses for its own processing. However, the same BRF function is used when beliefs need to be added as a consequence of perception of the environment. In the general case, whenever a belief needs to be added to the belief base, a belief revision function should be able to determine whether the requested belief addition will take place and whether any belief deletions are necessary so that the addition can be carried out and consistency maintained. Note that precise information on the resulting additions and deletions of beliefs is needed by the AgentSpeak interpreter so that the necessary (external) events are generated; hence the signature of the BRF function as defined above, and the relevant events added to C_E in rule AddBelOnt2.

The reasoning cycle finishes by removing from the set of intentions an intended means or a whole intention that has been executed to completion. There are no changes in those rules of the original semantics as a consequence of the extensions presented in this paper.
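As a summary of the belief-update step, the following sketch composes the two rules. The ABox is a set of assertions, and the consistency check and the BRF are passed in as callables (in a real system, an ABox-consistency service of a DL reasoner such as RACER and a belief revision function along the lines of [1]); all names and signatures here are our own assumptions.

    # Sketch of the guarded ABox update of rules AddBelOnt1 and AddBelOnt2.
    def add_belief(tbox, abox, b, consistent, brf, events):
        if consistent(tbox, abox | {b}):            # AddBelOnt1: safe to add
            abox.add(b)
            events.append(('+', b))
        else:                                       # AddBelOnt2: revise instead
            adds, dels = brf(tbox, abox, b)         # belief revision function
            abox |= adds
            abox -= dels
            events.extend(('+', x) for x in adds)   # external events for additions
            events.extend(('-', x) for x in dels)   # ... and for deletions
        return abox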
4 Conclusions and Future Work

This paper has formalised the changes in the semantics of AgentSpeak that were required for combining agent-oriented programming with ontological reasoning. The main improvements to AgentSpeak resulting from the variant based on a description logic are: (i) queries to the belief base are more expressive, as their results do not depend only on explicit knowledge but can be inferred from the ontology; (ii) the notion of belief update is refined so that a property about an individual can only be added if the resulting ontology-based belief base preserves consistency (i.e., if the ABox assertion is consistent with the concept descriptions in the TBox); (iii) retrieving a plan (from the agent's plan library) that is relevant for dealing with a particular event is more flexible, as this is not based solely on unification, but also on the subsumption
relation between concepts; and (iv) agents may share knowledge by using web ontology languages such as OWL.

With this paper, we hope to have contributed towards showing that extending an agent programming language with the descriptive and reasoning power of description logics can have a significant impact on the way agent-oriented programming works in general, and in particular on the development of semantic web applications using the agent-oriented paradigm. In fact, this extension makes agent-oriented programming more directly suitable for other application areas that are currently of major interest, such as grid and ubiquitous computing. It also creates perspectives for elaborate forms of agent migration, where plans carefully written to use (local) ontological descriptions can ease the process of agent adaptation to different societies [11]. Clearly, the advantages of the more sophisticated reasoning that is possible using ontologies also represent an increase in the computational cost of running agent programs. The trade-off between increased expressive power and possible decrease in computational efficiency in the context of this work has not been considered as yet, and remains future work.

Also as future work, we aim at incorporating other ongoing activities related to agent-oriented programming and semantic web technologies into AgentSpeak-DL. To start with, we plan to improve the semantics of AgentSpeak-DL, to move from the simple ALC used here to more expressive DLs such as those underlying OWL Lite and OWL DL [16]. The idea is to allow AgentSpeak-DL agents to use ontologies written in OWL, so that applications written in AgentSpeak-DL can be deployed on the Web and interoperate with other semantic web applications based on the OWL W3C standard (http://www.w3.org/2004/OWL/).

An interpreter for predicate-logic AgentSpeak called Jason [7] is available open source (http://jason.sourceforge.net). An AgentSpeak-DL interpreter is currently being implemented on top of Jason, based on the formalisation presented here. Jason is in fact an implementation of a much extended version of AgentSpeak. It has various available features which are relevant for developing an AgentSpeak-DL interpreter. In particular, it implements the operational semantics of AgentSpeak as defined in [8], thus the semantic changes formalised in Section 3 can be directly transferred to the Jason code. However, Jason's inference engine needs to be extended to incorporate ontological reasoning, which can be done with existing software such as the systems described in [14, 20, 15]. We are currently considering the use of RACER [14] in particular for extending Jason so that belief bases can refer to OWL ontologies.

Another item of planned future work is the integration of AgentSpeak-DL with the AgentSpeak extension presented in [19], which gives semantics to speech-act-based communication between AgentSpeak agents. In such an integrated approach, agents in a society can refer to specific TBox components when exchanging messages. Another relevant feature provided by Jason is that predicates have a (generic) list of "annotations"; in the context of this work, we can use that mechanism to specify the particular ontology to which each belief refers. For example, the fact that an agent ag1 has informed another agent ag2 about a belief p(t) as defined in a given ontology ont can be expressed in ag2's belief base as p(t)[source(ag1),ontology("http://.../ont")].
An interesting issue associated with ontologies is that of how different ontologies for the same domain can be integrated. Recent work reported in [5] has proposed the use of type theory both to detect discrepancies among ontologies and to align them. We plan to investigate the use of type-theoretic approaches to provide our ontology-based agent-oriented programming framework with techniques for coping with ontological mismatch. Although the usefulness of combining ontological reasoning with an agent-oriented programming language seems clear (e.g., from the examples and discussions in this paper), the implementation of practical applications is essential to fully support such claims. This is also planned as part of our future work.
Acknowledgements
Rafael Bordini gratefully acknowledges the support of The Nuffield Foundation (grant number NAL/01065/G).
References
1. N. Alechina, M. Jago, and B. Logan. Resource-bounded Belief Revision and Contraction. In this volume.
2. D. Ancona, V. Mascardi, J. F. Hübner, and R. H. Bordini. Coo-AgentSpeak: Cooperation in AgentSpeak through plan exchange. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-2004), New York, NY, 19–23 July, pages 698–705, New York, NY, 2004. ACM Press.
3. F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. Patel-Schneider, editors. Handbook of Description Logics. Cambridge University Press, Cambridge, 2003.
4. F. Baader and W. Nutt. Basic description logics. In F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. Patel-Schneider, editors, Handbook of Description Logics, pages 43–95. Cambridge University Press, Cambridge, 2003.
5. R.-J. Beun, R. M. van Eijk, and H. Prüst. Ontological feedback in multiagent systems. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2004), New York, NY, 19–23 July, 2004.
6. R. H. Bordini, A. L. C. Bazzan, R. O. Jannone, D. M. Basso, R. M. Vicari, and V. R. Lesser. AgentSpeak(XL): Efficient intention selection in BDI agents via decision-theoretic task scheduling. In Proceedings of the First International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-2002), 15–19 July, Bologna, Italy, pages 1294–1302, New York, NY, 2002. ACM Press.
7. R. H. Bordini, J. F. Hübner, et al. Jason: A Java-based AgentSpeak interpreter used with Saci for multi-agent distribution over the net. Manual, version 0.6, Feb. 2005. http://jason.sourceforge.net/.
8. R. H. Bordini and Á. F. Moreira. Proving BDI properties of agent-oriented programming languages: The asymmetry thesis principles in AgentSpeak(L). Annals of Mathematics and Artificial Intelligence, 42(1–3):197–226, Sept. 2004. Special Issue on Computational Logic in Multi-Agent Systems.
9. H. Chen, T. Finin, A. Joshi, F. Perich, D. Chakraborty, and L. Kagal. Intelligent agents meet the semantic web in smart spaces. IEEE Internet Computing, 19(5):69–79, November/December 2004.
10. H. Chen, F. Perich, T. Finin, and A. Joshi. SOUPA: Standard Ontology for Ubiquitous and Pervasive Applications. In International Conference on Mobile and Ubiquitous Systems: Networking and Services, Boston, MA, August 2004.
11. A. C. da Rocha Costa, J. F. Hübner, and R. H. Bordini. On entering an open society. In XI Brazilian Symposium on Artificial Intelligence, pages 535–546, Fortaleza, Oct. 1994. Brazilian Computing Society.
12. Y. Ding, D. Fensel, M. C. A. Klein, B. Omelayenko, and E. Schulten. The role of ontologies in ecommerce. In Staab and Studer [24], pages 593–616.
13. I. Foster and C. Kesselman, editors. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, second edition, 2003.
14. V. Haarslev and R. Möller. Description of the RACER system and its applications. In C. A. Goble, D. L. McGuinness, R. Möller, and P. F. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics (DL'01), 2001.
15. I. Horrocks. FaCT and iFaCT. In P. Lambrix, A. Borgida, M. Lenzerini, R. Möller, and P. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics (DL'99), pages 133–135, 1999.
16. I. Horrocks and P. F. Patel-Schneider. Reducing OWL entailment to description logic satisfiability. In D. Fensel, K. Sycara, and J. Mylopoulos, editors, Proc. of the 2003 International Semantic Web Conference (ISWC 2003), number 2870 in LNCS, pages 17–29. Springer, 2003.
17. D. L. McGuinness and F. van Harmelen, editors. OWL Web Ontology Language Overview. W3C Recommendation. Available at http://www.w3.org/TR/owl-features/, February 2004.
18. S. E. Middleton, D. D. Roure, and N. R. Shadbolt. Ontology-based recommender systems. In Staab and Studer [24], pages 577–598.
19. Á. F. Moreira, R. Vieira, and R. H. Bordini. Extending the operational semantics of a BDI agent-oriented programming language for introducing speech-act based communication. In Declarative Agent Languages and Technologies, Proceedings of the First International Workshop (DALT-03), held with AAMAS-03, 15 July, 2003, Melbourne, Australia, number 2990 in LNAI, pages 135–154, Berlin, 2004. Springer-Verlag.
20. P. F. Patel-Schneider. DLP system description. In E. Franconi, G. D. Giacomo, R. M. MacGregor, W. Nutt, C. A. Welty, and F. Sebastiani, editors, Proceedings of the International Workshop on Description Logics (DL'98), pages 133–135, 1998.
21. G. Plotkin. A structural approach to operational semantics. Technical Report, Department of Computer Science, Aarhus University, 1981.
22. A. S. Rao. AgentSpeak(L): BDI agents speak out in a logical computable language. In W. Van de Velde and J. Perram, editors, Proceedings of the Seventh Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW'96), 22–25 January, Eindhoven, The Netherlands, number 1038 in LNAI, pages 42–55, London, 1996. Springer-Verlag.
23. Y. Shoham. Agent-oriented programming. Artificial Intelligence, 60:51–92, 1993.
24. S. Staab and R. Studer, editors. Handbook on Ontologies. International Handbooks on Information Systems. Springer, 2004.
25. R. Stevens, C. Wroe, P. W. Lord, and C. A. Goble. Ontologies in bioinformatics. In Staab and Studer [24], pages 635–658.
Dynagent: An Incremental Forward-Chaining HTN Planning Agent in Dynamic Domains

Hisashi Hayashi¹, Seiji Tokura², Tetsuo Hasegawa¹, and Fumio Ozaki²
¹ Knowledge Media Laboratory, R&D Center, Toshiba Corporation, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki, 212-8582 Japan {hisashi3.hayashi, tetsuo3.hasegawa}@toshiba.co.jp
² Humancentric Laboratory, R&D Center, Toshiba Corporation, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki, 212-8582 Japan {seiji.tokura, fumio.ozaki}@toshiba.co.jp
Abstract. HTN planning, especially forward-chaining HTN planning, is becoming important in the areas of agents and robotics, which have to deal with a dynamically changing world. Replanning in "forward-chaining" HTN planning has therefore become an important subject for future study. This paper presents a new agent algorithm that integrates forward-chaining HTN planning, execution, belief updates, and plan modifications. Also, through combination with an A*-like heuristic search strategy, we show that our agent algorithm is effective for the replanning problem of museum tour guide robots, which is similar to the replanning problem of a traveling salesman.
1 Introduction
Recently, we often hear the keyword "HTN planning" [6, 16, 19, 20, 21], or hierarchical task network planning. HTN planning is different from standard planners, which just connect the "preconditions" and "effects" of actions. It makes plans, instead, by decomposing abstract tasks into more concrete subtasks or subplans, in a way similar to how Prolog decomposes goals into subgoals. HTN planning, which is not a new technique [19, 20], is becoming popular again in such areas as multi-agents [7, 8], mobile agents [14], web service composition [18, 23], interactive storytelling [4, 5], RoboCup simulation [17], and robotics [2, 3]. This is because, in addition to its efficiency and expressiveness of knowledge, as discussed in [22], HTN planning is suitable for a dynamically changing world. For example, some task decompositions can be suspended when planning initially and resumed, using the most recent knowledge, just before the abstract tasks are executed. Also, some task decompositions can be carried out by other agents or users, which is useful for multi-agent systems and user-interface agent systems. Forward-chaining HTN planners such as SHOP [16] are simple and useful. They are simple because, when decomposing an abstract task in a plan, the planner knows what fluents (time-varying predicates) hold just before the execution of the abstract task. They are useful because we can easily evaluate some
constraints or conditions which must be satisfied just before the task execution. We can even use external constraint solvers or built-in predicates for this evaluation. On the other hand, as mentioned in [8], replanning in a dynamically changing world has become an important subject for future study for multi-agent systems that use a forward-chaining HTN planner. This is a very important theme if we use the planner in the area of agents and robotics. Another challenge in HTN planning, as pointed out in [24], is to use a heuristic search strategy for selecting the best task decomposition among others. To solve these problems, we propose a new agent algorithm based on forward-chaining HTN planning. During plan execution, the agent keeps and incrementally modifies its belief, the plan being executed, and the other alternative plans, including abstract plans. Therefore, the agent can still use an alternative plan even when it fails to execute an action. The agent chooses the best plan in terms of cost: it can suspend the currently executed plan and switch to a better plan with respect to cost. When switching to an alternative plan, the agent takes the already executed actions into account. Heuristic search and replanning in HTN planning are also important in the area of interactive storytelling [4, 5]; however, in [4, 5] the algorithm is not presented in detail. In this paper, we define our HTN planning agent algorithm precisely. This paper is organized as follows. Section 2 introduces the museum tour guide robot scenario as an example. Section 3 defines terminology, including belief and planning knowledge. Section 4 shows how to describe the museum tour guide robot domain using belief and planning knowledge. Section 5 defines the planning algorithm. Section 6 defines the agent algorithm that integrates forward-chaining HTN planning, execution, belief updates, and plan modifications. Section 7 shows the experimental results for the museum tour guide robot scenario.
2 Museum Tour Guide Robot Scenario
In this section, we introduce a museum tour guide robot scenario as an example. Figure 1 shows a map of the museum. There are two rooms: roomA and roomB, which are connected by door1 and door2.

Fig. 1. The Map

Arcs, which connect nodes, express the
paths the robot takes. There are some points of interest (POIs). The role of the robot is to move around some POIs and explain the exhibitions there. Suppose that the robot is initially at node1. After being instructed to show the exhibitions of poi2, poi5, and poi6, the robot starts the tour. After explaining the exhibition at poi2 at node2, the robot moves towards poi6 at node6. However, at node4, the robot finds that door2 is not open. The robot has to change its plan. The robot should change the next destination from poi6 to poi5, and should not explain the exhibition of poi2 twice.
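Anticipating the implementation described in Section 7, the whole tour is handed to the agent as one goal task. The query below is only a sketch: agent/1 is a hypothetical name for a top-level entry point to the agent algorithm of Section 6.

% Ask the robot to show poi2, poi5, and poi6, starting from node1.
?- agent(showAround([poi2, poi5, poi6])).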
3 Terminology
As with Prolog, fluents (corresponding to positive literals in Prolog) and clauses (corresponding to Horn clauses in Prolog) are defined as follows, using constants, variables, functions, and predicates.

Definition 1. A complex term is of the form: F(T1, · · · , Tn) where n ≥ 0, F is an n-ary function, and each Ti (1 ≤ i ≤ n) is a term. A term is one of the following: a constant, a variable, or a complex term. A fluent is of the form: P(T1, · · · , Tn) where n ≥ 0, P is an n-ary predicate, and each Ti (1 ≤ i ≤ n) is a term. When P is a 0-ary predicate, the fluent P() can be abbreviated to P. A fluent is either derived or primitive.

Note that the truth value of primitive fluents is directly updated by the effects of actions, which will be defined soon, or by observation. On the other hand, the truth value of derived fluents is not directly updated, and is subject to the truth value of other fluents, as defined by clauses.

Definition 2. A clause is of the form: F ⇐ F1, · · · , Fn where n ≥ 0, F is a derived fluent called the head, each Fi (1 ≤ i ≤ n) is a fluent, and the set of fluents F1, · · · , Fn is called the body. When n = 0, the clause F ⇐ is called a fact and can be expressed as the derived fluent F. The clause F ⇐ F1, · · · , Fn defines the fluent G if F is unifiable with G.

The belief (corresponding to the program in Prolog) is defined using primitive fluents and clauses. Primitive fluents might be updated in the middle of plan execution. On the other hand, clauses defining derived fluents are never updated.

Definition 3. A belief is of the form: ⟨D, S⟩ where D is a set of primitive fluents, and S is a set of clauses.

Intuitively speaking, the primitive fluents in D express the current state, from which we can also derive the truth value of derived fluents using the clauses in S. The separation of D and S is similar to the distinction between intensional and extensional programs in traditional logic programming.
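The interplay of D and S in Definition 3 can be made concrete with a small meta-interpreter. The sketch below is our own illustration, not part of the paper's system: holds/3, holds_all/3, and the cl/2 clause wrapper are invented names. It evaluates a fluent against a belief ⟨D, S⟩:

% F holds in the belief <D,S> if it is a primitive fluent recorded in D,
% or if some clause in S derives it from fluents that themselves hold.
holds(F, D, _S) :- member(F, D).
holds(F, D, S)  :- member(cl(F, Body), S), holds_all(Body, D, S).

holds_all([], _D, _S).
holds_all([F | Fs], D, S) :- holds(F, D, S), holds_all(Fs, D, S).

For example, with D = [arcN(arc1,node1,node2)] and S containing cl(connectsNodes(A,N1,N2),[arcN(A,N2,N1)]), the query holds(connectsNodes(arc1,node2,node1), D, S) succeeds even though the derived fluent is not stored explicitly.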
A plan is defined as a list of tasks as follows. Abstract tasks are not directly executable: in order to execute an abstract task, it has to be decomposed into actions (= primitive tasks).

Definition 4. A task is of the form: T(X1, · · · , Xn) where n ≥ 0, T is an n-ary task symbol, and each Xi (1 ≤ i ≤ n) is a term. When T is a 0-ary task symbol, the task T() can be abbreviated to T. A task is either abstract or primitive. An action is a primitive task. Cost, which is a non-negative number, is recorded in association with a task.

Definition 5. A plan is a list of tasks of the form: [T1, · · · , Tn] where n ≥ 0 and each Ti (1 ≤ i ≤ n) is a task, which is called the i-th element of the plan. The cost of the plan [T1, · · · , Tn] is the sum of the costs of the Ti (1 ≤ i ≤ n).

For the purpose of planning and replanning, we extend the notions of tasks, actions, and plans to task pluses, action pluses, and plan pluses. Each task/action plus records two kinds of precondition: the protected condition and the remaining condition. These represent the fluents that have to be satisfied just before the execution. The satisfiability of the protected condition has been confirmed in the process of planning, and the protected condition will be used for plan checking when the belief is updated. On the other hand, the satisfiability of the remaining condition has not been confirmed yet. In addition, each action plus records the initiation set and the termination set. The initiation (termination) set records the primitive fluents which start (respectively, cease) to hold after the action execution.

Definition 6. A task plus representing the task T is of the form: (T, PC, RC) where T is a task, PC is a set of primitive fluents called the protected condition, and RC is a set of fluents called the remaining condition. An action plus representing the action A is of the form: (A, PC, RC, IS, TS) where A is an action, PC is a set of primitive fluents called the protected condition, RC is a set of fluents called the remaining condition, IS is a set of primitive fluents called the initiation set, and TS is a set of primitive fluents called the termination set. The precondition of a task plus or an action plus refers to its protected condition and remaining condition. The effect of an action plus refers to its initiation set and termination set. A solved action plus is an action plus whose remaining condition is empty. The cost of the task plus (the cost of the action plus) representing the task T (respectively, the action A) is the cost of T (respectively, A).

Definition 7. A plan plus representing the plan [A1, · · · , An−1, An, Tn+1, · · · , Tm] (n ≥ 0, m ≥ n) is of the form: [A1+, · · · , An−1+, An+, Tn+1+, · · · , Tm+] where each Ai+ (1 ≤ i ≤ n−1), which is called the i-th element of the plan plus, is a solved action plus representing the action Ai; An+, which is called the n-th element of the plan plus, is an action plus representing the action An; and each Tj+ (n+1 ≤ j ≤ m), which is called the j-th element of the plan plus, is a task plus representing the task Tj. A solved plan plus is a plan plus such that each element is a solved action plus. A supplementary plan plus is a plan plus of the form: [A1+, · · · , An+, TAn+1+, · · · , Tm+] where n ≥ 0, m ≥ n+1, each Ai+ (1 ≤ i ≤ n) is a solved action plus, and TAn+1+ is a task plus or an action plus
such that there exists only one marked¹ fluent, which is a primitive fluent, in the remaining condition of TAn+1+. The cost of the plan plus representing the plan P is the cost of P.

¹ When trying to check the satisfiability of a primitive fluent in a plan plus, the primitive fluent is marked, and the plan plus is recorded as a supplementary plan plus. Supplementary plan pluses will be used to make new valid plans when the marked fluent becomes valid.

In the planning phase, abstract tasks (task pluses) are decomposed into more concrete plans using the following HTN rules. We can specify the precondition under which the HTN rule is applicable. This precondition must be satisfied just before the execution of the abstract task.

Definition 8. An HTN rule is of the form: htn(H, C, B) where H is an abstract task called the head, C is a set of fluents called the precondition, and B is a plan called the body. The HTN rule htn(H, C, B) defines the task T if T is unifiable with H.

In order to express the effect of an action, we use the following action rules. Like HTN rules, we can specify the precondition under which the action rule is applicable. This precondition must be satisfied just before the execution of the action.

Definition 9. An action rule is of the form: action(A, C, IS, TS), where A is an action, C is a set of fluents called the precondition, IS is a set of primitive fluents called the initiation set, and TS is a set of primitive fluents called the termination set, such that no primitive fluent is unifiable with both a primitive fluent in the initiation set and a primitive fluent in the termination set. The effect of an action rule refers to its initiation set and termination set. The action rule action(A, C, IS, TS) defines the action B if B is unifiable with A.

Definition 10. Planning knowledge is of the form: ⟨AS, HS⟩ where AS is a set of action rules, and HS is a set of HTN rules.
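To fix intuitions before the example domain, the pluses of Definitions 6 and 7 can be pictured as Prolog terms. The tp/3 and ap/5 encoding below is our own illustration; the paper does not prescribe a concrete representation:

% tp(Task, ProtectedCond, RemainingCond)            -- a task plus
% ap(Action, ProtectedCond, RemainingCond, IS, TS)  -- an action plus
% A plan plus: a solved action plus followed by an unchecked task plus.
example_plan_plus([
    ap(gotoNextNode(arc1, node2),
       [atNode(node1)], [],          % remaining condition empty: solved
       [atNode(node2)],              % initiation set
       [atNode(node1)]),             % termination set
    tp(explain(poi2),
       [],                           % nothing confirmed yet
       [poiNode(poi2, N), atNode(N)])
]).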
4 Expressing the Museum Tour Guide Robot Domain
In this section, we express the domain of the scenario introduced in Section 2. The map in Figure 1 is expressed as follows, where belief(F) means² that the fact F is in the belief:

belief(inRoom(node1,roomA)). belief(inRoom(node2,roomA)).
belief(inRoom(node3,roomA)). belief(inRoom(node4,roomA)).
belief(inRoom(node5,roomB)). belief(inRoom(node6,roomB)).
belief(inRoom(node7,roomB)). belief(inRoom(node8,roomB)).
belief(arcN(arc1,node1,node2)). belief(arcN(arc2,node1,node3)).
belief(arcN(arc3,node2,node4)). belief(arcN(arc4,node3,node4)).

² Because our planner is implemented in Prolog, we express the domain using the Prolog expression of belief and planning knowledge.
belief(arcN(arc5,node3,node5)). belief(arcN(arc6,node4,node6)).
belief(arcN(arc7,node5,node6)). belief(arcN(arc8,node5,node7)).
belief(arcN(arc9,node6,node8)). belief(arcN(arc10,node7,node8)).
belief(arcR(arc5,roomA,roomB,door1)). belief(arcR(arc6,roomA,roomB,door2)).
belief(poiNode(poi2,node2)). belief(poiNode(poi5,node5)).
belief(poiNode(poi6,node6)). belief(poiNode(poi8,node8)).

inRoom(N,R) means that the node N is in the room R. arcN(A,N1,N2) means that the arc A connects the nodes N1 and N2. arcR(A,R1,R2,D) means that the arc A connects the rooms R1 and R2 with the door D. Because we do not care about the direction of arcs, connectsNodes and connectsRooms are defined as follows, where belief(H,[B1, · · · , Bn]) expresses the clause H ⇐ B1, · · · , Bn:

belief(connectsNodes(Arc,Node1,Node2),[arcN(Arc,Node1,Node2)]).
belief(connectsNodes(Arc,Node1,Node2),[arcN(Arc,Node2,Node1)]).
belief(connectsRooms(Arc,R1,R2,D),[arcR(Arc,R1,R2,D)]).
belief(connectsRooms(Arc,R1,R2,D),[arcR(Arc,R2,R1,D)]).

By checking connectsNodes, we can see that if Arc connects Node1 and Node2, then it also connects Node2 and Node1. connectsRooms is defined similarly. open and atNode are primitive fluents, which might be updated during plan execution. dy(F)³ expresses that F is a primitive fluent. At first, all doors are open, and the robot is at node1:

dy(open(_)). dy(atNode(_)).
belief(open(door1)). belief(open(door2)). belief(atNode(node1)).

³ "dy" stands for "dynamic."

We use the actions explain and gotoNextNode, which are defined as follows:

action(explain(POI),[poiNode(POI,Node),atNode(Node)],[]).
action(gotoNextNode(Arc,Y),[connectsNodes(Arc,X,Y),atNode(X)],
    [terminates(atNode(X)),initiates(atNode(Y))]).
action(gotoNextNode(Arc,Door,Y),
    [connectsNodes(Arc,X,Y),atNode(X),open(Door)],
    [terminates(atNode(X)),initiates(atNode(Y))]).

The above form of the action rule is slightly different from the one defined previously: action(A,C,E) says that the precondition of the action A is C, and that E is the effect (the initiation set and the termination set); initiates(F) (terminates(F)) says that F belongs to the initiation set (the termination set). The first action rule says that to explain the exhibition at a POI, the robot has to be at the node where the POI is located. The second and third action rules say that the robot moves to Y, which is directly connected with X via Arc. The second rule considers the case where Arc does not have a door; the third rule considers the case where Arc has a door (Door), which has to be open. We use the abstract tasks showAround, gotoNodeWide, and gotoNodeInRoom, defined by the following HTN rules:
htn(showAround([POI]),
    [atNode(StartNode),poiNode(POI,POINode)],
    [gotoNodeWide(StartNode,POINode),explain(POI)]).
htn(showAround([POI1,POI2|POIs]),
    [atNode(StartNode),pick(NextPOI,Rest,[POI1,POI2|POIs]),
     poiNode(NextPOI,POINode)],
    [gotoNodeWide(StartNode,POINode),explain(NextPOI),
     changeGoal(showAround(Rest))]).

These HTN rules say that in order to show some POIs, the robot has to choose one of them (using pick), go to the place, explain the POI, and then show the other POIs. changeGoal(showAround(Rest)) is the built-in action that abandons all the plans, changes the goal to the specified goal (showAround(Rest)), and replans. pick, which is similar to the "member" predicate of Prolog, is defined as follows:

belief(pick(X,Rest,[X|Rest])).
belief(pick(X,[H|Rest],[H|T]),[pick(X,Rest,T)]).

Among the "movement" tasks, gotoNodeWide is at the top of the hierarchy.

htn(gotoNodeWide(StartNode,GoalNode),
    [inRoom(StartNode,Room),inRoom(GoalNode,Room)],
    [gotoNodeInRoom(StartNode,GoalNode)]).
htn(gotoNodeWide(StartNode,GoalNode),
    [inRoom(StartNode,StartRoom),
     connectsRooms(Arc,StartRoom,NextRoom,Door),
     connectsNodes(Arc,NodeA,NodeB),inRoom(NodeA,StartRoom)],
    [gotoNodeInRoom(StartNode,NodeA),gotoNextNode(Arc,Door,NodeB),
     gotoNodeWide(NodeB,GoalNode)]).

In order to move from one place to another, the robot needs definite routes. The first HTN rule says that if the destination and the starting node are in the same room, then the task is decomposed to gotoNodeInRoom. The second rule says that the robot moves inside the room (gotoNodeInRoom) to the door, goes through the door (gotoNextNode) to the next room, and then moves to the destination. It might seem that the robot could indefinitely go back and forth between two nodes; however, using cost information, this kind of repetition is avoided in planning.

htn(gotoNodeInRoom(GoalNode,GoalNode),[],[]).
htn(gotoNodeInRoom(StartNode,GoalNode),
    [inRoom(StartNode,Room),inRoom(GoalNode,Room),
     connectsNodes(Arc,StartNode,NextNode),inRoom(NextNode,Room)],
    [gotoNextNode(Arc,NextNode),gotoNodeInRoom(NextNode,GoalNode)]).

The above HTN rules define the movement of the robot inside a room. The task is decomposed to the action gotoNextNode. The basic strategy is to go to the next node, and then move to the destination.
In addition, we use information about the costs of tasks and the coordinates of nodes, which we omit here because of limited space. The costs of the movement tasks between two nodes (gotoNodeWide, gotoNodeInRoom, and gotoNextNode) are estimated by the straight-line distance; therefore, the actual cost is never less than the estimated cost. The costs of the other tasks are regarded as 0; however, we can also take the costs of non-movement tasks such as explain into account if we specify them.
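Since the cost and coordinate clauses are omitted in the paper, the following sketch only illustrates the idea; the coord/3 facts, the predicate names, and the numbers are invented for the example:

coord(node1, 0.0, 0.0).     % hypothetical coordinates
coord(node2, 4.0, 0.0).

% Euclidean distance between two nodes.
straight_line(N1, N2, C) :-
    coord(N1, X1, Y1), coord(N2, X2, Y2),
    C is sqrt((X1 - X2)**2 + (Y1 - Y2)**2).

% Movement between two named nodes is estimated by straight-line distance,
% which never exceeds the actual travel cost along the arcs.
cost(gotoNodeWide(N1, N2), C)   :- straight_line(N1, N2, C).
cost(gotoNodeInRoom(N1, N2), C) :- straight_line(N1, N2, C).
cost(explain(_), 0.0).           % non-movement tasks are costed at 0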
5 Planning Algorithm
Using the belief and planning knowledge, we now introduce the planning algorithm. Our planner makes plans by decomposing abstract tasks and checking the satisfiability of the preconditions of tasks and actions. Using HTN rules, abstract tasks are decomposed as follows.

Definition 11. (Task Decomposition) Let PLAN+ be the plan plus [A1+, · · · , An+, Tn+1+, Tn+2+, · · · , Tm+] (n ≥ 0, m ≥ n+1) such that each Ai+ (1 ≤ i ≤ n) is a solved action plus, Tn+1+ is a task plus of the form (Tn+1, PCn+1, RCn+1), RCn+1 is an empty set of fluents, and each Ti+ (n+2 ≤ i ≤ m) is a task plus. When Tn+1 is an action, let AR be the action rule action(A, C, IS, TS) such that Tn+1 is unifiable with A using the most general unifier (mgu) θ. The resolvent of PLAN+ on Tn+1 by AR is the following plan plus: ([A1+, · · · , An+, A+, Tn+2+, · · · , Tm+])θ where A+ is the action plus (A, PCn+1, C, IS, TS). When Tn+1 is an abstract task, let HR be the HTN rule htn(H, C, [B1, · · · , Bk]) (k ≥ 1 or PCn+1 = C = ∅)⁴ such that Tn+1 is unifiable with H using the mgu θ. The resolvent of PLAN+ on Tn+1 by HR using θ is the following plan plus: ([A1+, · · · , An+, B1+, · · · , Bk+, Tn+2+, · · · , Tm+])θ where B1+ is the task plus (B1, PCn+1, C), and each Bi+ (2 ≤ i ≤ k) is the task plus (Bi, ∅, ∅).
Figure 2 shows the way taskA is decomposed into taskA1, taskA2, and taskA3. After the task decomposition, in addition to the precondition (precond1) of taskA, precond2 and precond3 are added to the precondition of taskA1. This means that this task decomposition is valid if precond2 and precond3 are satisfied just before the execution of taskA. When decomposing an action, in addition to the precondition, the effect of the action is recorded. The satisfiability of a derived fluent depends on the satisfiability of the primitive fluents that imply the derived fluent. For this purpose, we decompose derived fluents as follows.
⁴ When k is 0, we can still relax this condition by merging PCn+1 (and C), which would otherwise disappear, into the protected condition (respectively, the remaining condition) of Tn+2.
Fig. 2. Task Decomposition
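Using the tp/3 encoding sketched in Section 3, the abstract-task case of Definition 11 can be written in a few lines; Prolog head unification plays the role of the mgu θ, and the solved action-plus prefix is omitted for brevity. Again, this is an illustration of the definition, not the paper's implementation:

% Decompose the leading task plus (whose remaining condition is empty):
% the first body task inherits the protected condition and takes the HTN
% rule's precondition as its new remaining condition.
decompose([tp(Task, PC, []) | Tail], [tp(B1, PC, Cond) | NewTail]) :-
    htn(Task, Cond, [B1 | Bs]),
    wrap(Bs, Wrapped),
    append(Wrapped, Tail, NewTail).

wrap([], []).
wrap([T | Ts], [tp(T, [], []) | Ws]) :- wrap(Ts, Ws).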
Definition 12. (Derived Fluent Decomposition) Let PLAN+ be the plan plus [A1+, · · · , An+, TAn+1+, Tn+2+, · · · , Tm+] (n ≥ 0, m ≥ n+1) such that each Ai+ (1 ≤ i ≤ n) is a solved action plus, TAn+1+ is either the task plus (Tn+1, PCn+1, RCn+1) or the action plus (An+1, PCn+1, RCn+1, ISn+1, TSn+1), the derived fluent F belongs to RCn+1, and each Ti+ (n+2 ≤ i ≤ m) is a task plus. Let CL be a clause of the form H ⇐ B1, · · · , Bk (k ≥ 0) such that the derived fluent F is unifiable with H using the mgu θ. The resolvent of PLAN+ on F by CL using θ is the following plan plus: ([A1+, · · · , An+, TA+, Tn+2+, · · · , Tm+])θ, where if TAn+1+ is a task plus, then TA+ is the task plus (Tn+1, PCn+1, RCn+1 \ {F} ∪ {B1, · · · , Bk}); otherwise TA+ is the action plus (An+1, PCn+1, RCn+1 \ {F} ∪ {B1, · · · , Bk}, ISn+1, TSn+1).

The satisfiability of primitive fluents in the remaining condition of a task plus in a plan plus is checked based on the belief and on the initiation sets and termination sets mentioned in the action pluses. In our planning algorithm, when decomposing a task in a plan, we know what actions to execute before the task. Therefore, it is possible to check the condition of the task plus when decomposing it. This is one of the advantages of forward-chaining HTN planning.

Definition 13. (Primitive Fluent Checking) Let PLAN+ be the plan plus [A1+, · · · , An+, TAn+1+, Tn+2+, · · · , Tm+] (n ≥ 0, m ≥ n+1) such that each Ai+ (1 ≤ i ≤ n) is a solved action plus, TAn+1+ is either the task plus (Tn+1, PCn+1, RCn+1) or the action plus (An+1, PCn+1, RCn+1, ISn+1, TSn+1), the primitive fluent F belongs to RCn+1, and each Ti+ (n+2 ≤ i ≤ m) is a task plus. Given the belief ⟨D, S⟩, if
– F is unifiable with a primitive fluent mentioned in D using the mgu θ, and there does not exist an action plus Ai+ (1 ≤ i ≤ n) such that (F)θ is unifiable with a primitive fluent in its termination set, or
– there exists an action plus Ai+ (1 ≤ i ≤ n) such that F is unifiable with a primitive fluent in its initiation set using the mgu θ, and there does not exist an action plus Aj+ (i+1 ≤ j ≤ n) such that (F)θ is unifiable with a primitive fluent in its termination set,

then F is satisfiable, and after checking the satisfiability of F, PLAN+ is updated to ([A1+, · · · , An+, TA+, Tn+2+, · · · , Tm+])θ, where if TAn+1+ is a task plus, then TA+ is the task plus (Tn+1, PCn+1 ∪ {F}, RCn+1 \ {F}); otherwise TA+ is the action plus (An+1, PCn+1 ∪ {F}, RCn+1 \ {F}, ISn+1, TSn+1).

Our planning algorithm is similar to the planning algorithm of SHOP [16]. The main difference is that we record extra information for the purpose of replanning: the protected condition recorded in each action plus and each task plus is used to detect invalid plans when deleting a primitive fluent from the belief, and the supplementary plan pluses are used to make new valid plans when adding a new primitive fluent to the belief. The following algorithm will be used not only for initial planning but also for replanning. In the case of initial planning, given the task G as the goal, we set the current set of plan pluses to {[(G, ∅, ∅)]}, and the current set of supplementary plan pluses to ∅. The purpose of planning is to decompose the goal G into a sequence of actions. As we shall see later when introducing the agent algorithm, even after the initial planning, we keep and continuously modify the current belief, the current set of plan pluses, and the current set of supplementary plan pluses.

Algorithm 1. (Planning)
1. Given the following:
– a set of plan pluses (the current set of plan pluses),
– a set of supplementary plan pluses (the current set of supplementary plan pluses),
– a belief (the current belief), and
– the planning knowledge,
repeat the following procedure until there exists at least one solved plan plus PLAN+ in the current set of plan pluses:
(a) If the remaining condition in the first element of each plan plus is empty, then do the following procedure:
i. (Plan/Task Selection) Select a plan plus PLAN+ from the current set of plan pluses such that PLAN+ is of the form [A1+, · · · , An+, Tn+1+, Tn+2+, · · · , Tm+] (n ≥ 0, m ≥ n+1), each Ai+ (1 ≤ i ≤ n) is a solved action plus, and each Ti+ (n+1 ≤ i ≤ m) is a task plus representing the task Ti.
ii. (Task Decomposition) Replace the occurrence of PLAN+ in the current set of plan pluses with its resolvents R1, · · · , Rk (k ≥ 0) on Tn+1 by the HTN rules that define Tn+1 in the planning knowledge. (See Definition 11.)
(b) Repeat the following procedure until each remaining condition of the action pluses and task pluses mentioned in the current set of plan pluses⁵ becomes empty.
i. (Plan/Fluent Selection) Select a plan plus PLAN+ from the current set of plan pluses such that PLAN+ is of the form [A1+, · · · , An+, TAn+1+, Tn+2+, · · · , Tm+] (n ≥ 0, m ≥ n+1), each Ai+ (1 ≤ i ≤ n) is a solved action plus, TAn+1+ is an action plus or a task plus, and the fluent F is mentioned in the remaining condition of TAn+1+.
ii. (Derived Fluent Decomposition) If the fluent F is a derived fluent, replace the occurrence of PLAN+ in the current set of plan pluses with its resolvents R1, · · · , Rk (k ≥ 0) on F by the clauses that define F in the current belief. (See Definition 12.)
iii. If the fluent F is a primitive fluent, do the following procedure:
A. (Supplementary Plan Recording) Add PLAN+ to the current set of supplementary plan pluses, marking the occurrence of F in PLAN+.
B. (Primitive Fluent Checking) Check the satisfiability of the primitive fluent F in PLAN+ based on the current belief. (See Definition 13.) If F is not satisfiable, delete PLAN+ from the current set of plan pluses.
2. Return the solved plan plus PLAN+, the current set of plan pluses, and the current set of supplementary plan pluses.

A* [10, 11] is a well-known heuristic graph-search algorithm for finding the shortest path. Suppose that A* has found a route from the starting point to another point p. The distance g(p) of the route to p can be calculated easily. A* estimates the distance h(p) between the point p and the destination. In this way, A* estimates the distance f(p) = g(p) + h(p) from the starting point to the destination via the point p. Afterwards, A* picks the best already-computed path to p in terms of f(p), expands the path to the next points, and continues the search in the same way. In Algorithm 1, when decomposing a task in a plan, all the tasks before that task have been decomposed into actions. The cost of an action is its exact cost, and the cost of an abstract task is an estimate.
⁵ We just need to check the remaining condition of the resolvents produced at Step 1(a)ii and Step 1(b)ii.
(There are many ways to execute the abstract task, and we cannot know the actual cost of the task until the task is decomposed into actions, which are primitive tasks.) Like A*, if the estimated cost is never greater than the actual cost, we can find the plan whose cost is the lowest in the following way:

Algorithm 2. (Heuristic Planning Using Cost Information) The algorithm is a special case of Algorithm 1 where the cost of the plan plus selected at Step 1(a)i is the lowest among the non-solved plan pluses in the current set of plan pluses.
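In the tp/3 and ap/5 encoding, the A*-like selection of Algorithm 2 amounts to picking the unsolved plan plus with the smallest summed cost. A sketch, reusing the hypothetical cost/2 estimate from Section 4 (a full encoding would extend cost/2 to every task symbol):

plan_cost([], 0.0).
plan_cost([tp(T, _, _) | Es], C) :-
    cost(T, C1), plan_cost(Es, C2), C is C1 + C2.
plan_cost([ap(A, _, _, _, _) | Es], C) :-
    cost(A, C1), plan_cost(Es, C2), C is C1 + C2.

solved([]).                               % every element is a solved action plus
solved([ap(_, _, [], _, _) | Es]) :- solved(Es).

% Step 1(a)i of Algorithm 2: the cheapest non-solved plan plus.
select_cheapest(PlanPluses, Best) :-
    findall(C-P,
            ( member(P, PlanPluses), \+ solved(P), plan_cost(P, C) ),
            Pairs),
    keysort(Pairs, [_-Best | _]).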
6 Combining Planning, Execution, and Belief Updates
In dynamic domains, while executing the plan, the environment might change, or the agent might find new information. In the previous section, we showed a forward-chaining (and A*-like heuristic) HTN planning algorithm. In this section, we show how to combine planning, execution, and belief updates.

6.1 Finding Invalid Plans and New Valid Plans
When updating the belief, some plans might become invalid, and some new plans might become valid. In particular, when a plan depends on a deleted fluent, the plan is no longer valid. On the other hand, if a new fluent is added, it might become possible to make new valid plans.

Definition 14. (Invalid Plan Pluses) Let PLAN+ be a plan plus of the form [T1+, · · · , Tn+] (n ≥ 0). PLAN+ becomes invalid after deleting the primitive fluent F from the belief iff:
– there exists Ti+ (1 ≤ i ≤ n) such that F is unifiable with a primitive fluent G which belongs to the protected condition of Ti+,
– and there does not exist Tk+ (1 ≤ k ≤ i−1) such that G belongs to the initiation set of Tk+.

Note that fluents in the "protected conditions" are used to check the validity of plans at the time of belief updates. Figure 3 shows that open(door1) is a precondition of task3. Suppose that the satisfiability of open(door1) has been confirmed in the process of planning. While executing the plan, if open(door1) is deleted from the belief just before task1, the plan becomes invalid unless task1 or task2 initiates open(door1).
Fig. 3. Precondition Protection
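Definition 14 translates almost literally into the tp/3 and ap/5 encoding; unifiable/3 (a standard built-in in, e.g., SWI-Prolog) tests unifiability without binding. As before, this is an illustrative sketch rather than the paper's code:

% PLAN+ becomes invalid after deleting F iff some element protects a fluent
% G unifiable with F and no earlier action plus initiates G.
invalid_after_delete(PlanPlus, F) :-
    append(Before, [E | _], PlanPlus),
    protected(E, PC),
    member(G, PC),
    unifiable(F, G, _),
    \+ ( member(ap(_, _, _, IS, _), Before), member(G, IS) ).

protected(tp(_, PC, _), PC).
protected(ap(_, PC, _, _, _), PC).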
On the other hand, even if the precondition (open(door1)) of task3 is invalid when planning, the plan is recorded as a supplementary plan, and open(door1) is marked. When open(door1) is later added to the belief, the precondition becomes valid. In this way, we can find new valid plans after belief updates.

Definition 15. (New Valid Plan Pluses) Let PLAN+ be a supplementary plan plus such that the marked primitive fluent F is unifiable with the primitive fluent F2 using the mgu θ. The new valid plan plus made from PLAN+ after adding F2 to the belief is (PLAN+)θ such that the satisfiability of F in PLAN+ is checked. (See Definition 13.)

Note that supplementary plan pluses are used to create new valid plan pluses. Intuitively, a supplementary plan plus corresponds to an intermediate node of the search tree, and a new valid plan plus is made by adding a new branch to it.

6.2 Agentizing the Whole Procedure
Now, we introduce the whole procedure that integrates planning, execution, and belief updates. In other words, using the following algorithm, we can make "intelligent agents" working in dynamic environments.

Algorithm 3. (Agent Algorithm) Given the task G (the goal), the belief (the current belief), and the planning knowledge, the agent starts the following procedure:
1. Set the current set of plan pluses to {[(G, ∅, ∅)]}.
2. Set the current set of supplementary plan pluses to ∅.
3. Repeat the following procedure until the current set of plan pluses contains the empty plan plus⁶ []:
(a) (Observation) Repeat the following procedure if necessary:
i. Add (or delete) the primitive fluent F to (from) the set of primitive fluents in the current belief, and update the current set of plan pluses and the current set of supplementary plan pluses as follows:
A. If F is added to the set of primitive fluents in the current belief, then make the new valid plan pluses from the supplementary plan pluses in the current set of supplementary plan pluses, and add them to the current set of plan pluses. (See Definition 15.)
B. If F is deleted from the set of primitive fluents in the current belief, then delete each invalid plan plus from the current set of plan pluses and the current set of supplementary plan pluses. (See Definition 14.)
(b) (Planning) Following Algorithm 1 (or its special case, Algorithm 2), make a solved plan plus PLAN+, and update the current set of plan pluses and the current set of supplementary plan pluses.
(c) (Action Execution) If PLAN+ is of the form [A1+, · · · , An+] (n ≥ 1) and A1+ is the solved action plus representing the action A1, then try to execute A1, and update the current set of plan pluses and the current set of supplementary plan pluses as follows:
i. If the execution⁷ of (A1)θ is successful, do the following procedure:
A. From the current set of supplementary plan pluses, delete⁸ each plan plus such that the first element is not a solved action plus.
B. For each plan plus in the current set of plan pluses and in the current set of supplementary plan pluses, if the plan plus is of the form [B1+, B2+, · · · , Bm+] (m ≥ 1), and B1+ represents an action that is unifiable with (A1)θ using the mgu ρ, then replace the plan plus with (([B2+, · · · , Bm+])θ)ρ.
C. For each primitive fluent F that belongs to the initiation set of (A1+)θ, add F to the set of primitive fluents in the current belief.
D. For each primitive fluent F that belongs to the termination set of (A1+)θ, delete F from the set of primitive fluents in the current belief, and delete all the invalid plan pluses⁹ from the current set of plan pluses and the current set of supplementary plan pluses. (See Definition 14.)
ii. If the execution of A1 is unsuccessful¹⁰, do the following procedure:
A. For each plan plus in the current set of plan pluses (and for each plan plus in the current set of supplementary plan pluses), if the plan plus is of the form [C1+, · · · , Cm+] (m ≥ 1) and C1+ is an action plus which represents an action that is unifiable with A1, then delete the plan plus from the current set of plan pluses (respectively, from the current set of supplementary plan pluses).

⁶ The empty plan plus in the current set of plan pluses means that the agent has already executed the given goal.
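Algorithm 3 compresses into a small driver loop. The sketch below keeps only the control skeleton: observe_and_update/4, plan/4, and execute_first_action/4 are hypothetical stand-ins for Steps 3(a)-(c), and must be filled in with the operations above before the loop can make progress.

agent(Goal) :-                           % Steps 1 and 2
    run([[tp(Goal, [], [])]], []).

run(Plans, _Supp) :-                     % empty plan plus: goal achieved
    member([], Plans), !.
run(Plans, Supp) :-
    observe_and_update(Plans, Supp, Plans1, Supp1),      % Step 3(a)
    plan(Plans1, Supp1, Plans2, Supp2),                  % Step 3(b)
    execute_first_action(Plans2, Supp2, Plans3, Supp3),  % Step 3(c)
    run(Plans3, Supp3).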
In Algorithm 3, at Step 3(a)iA, Step 3(a)iB, and Step 3(c)iD, plans are modified after belief updates, as explained in Section 6.1. Plans are also modified, at Step 3(c)iB and Step 3(c)iiA, after action execution. Figure 4 shows how plans are modified after the successful execution of a: the already executed a is deleted from the first element of each plan. Figure 5 shows how plans are modified after the unsuccessful execution of a: each plan whose first task is a is deleted, because these plans cannot be executed. As in [14, 15], we can also use "undoing actions". For example, in Figure 4, if we use the undoing action undo(a) of a, the plans [b,c], [b,b,c], [d,c], and [d,a,c] would become [b,c], [undo(a),b,b,c], [d,c], and [undo(a),d,a,c]. In this case, unlike [14, 15], we have to check that the effect of undo(a) does not invalidate the plans. For simplicity, we did not include the concept of undoing actions in this paper. However, it is an important technique for avoiding side-effects of already executed actions. For example, before changing a travel plan, we should cancel all the hotel reservations; undoing actions can be used in this case. The importance of undoing is recognized not only in the area of planning [9] but also in the area of web service composition [1].

⁷ After the action execution, some variables might be bound by the substitution θ.
⁸ Note that this operation is similar to the cut operation of Prolog.
⁹ The effect of an action in a plan might invalidate the other plans.
¹⁰ We assume that action failure does not have any effect.

Fig. 4. Plans after Successful Action Execution
Fig. 5. Plans after Action Execution Failure
7 Testing the Museum Tour Guide Robot Scenario
Following Algorithm 3, we implemented the agent system in Java, using the A*-like search strategy described in Algorithm 2. The planner called from the agent is implemented in Prolog, which is itself implemented in Java. We used the belief and planning knowledge in Section 4, and tested the scenario in Section 2 on a simulator. Given showAround([poi2,poi5,poi6]) as the goal, the robot moved from node1 to node2, explained poi2, moved to node4, (found door2 closed,) replanned after removing open(door2) from the belief, moved to node3, moved to node5, explained poi5, moved to node6, and explained poi6. Note that the robot correctly replanned at node4, changed the next destination from poi6 to poi5, and did not revisit poi2. We tested other similar scenarios, and the agent behaved as we expected. If the agent naively replanned from scratch using the same planning knowledge and belief, not only would this take time,
but the agent would also revisit node2 and explain poi2 again. If we use this naive replanning method, then in order to avoid the latter problem it is necessary to record the already explained POIs in the belief through the effects of actions. Also, additional preconditions (of tasks) are necessary to avoid explaining the same POIs more than once. This means that the agent programmer has to write additional rules, preconditions, and/or effects just for the purpose of replanning. In the area of robotics, many researchers are involved in such areas as mobility, arm control, image processing, conversation, network communication, and so on. Therefore, it would be difficult for each agent programmer to write those additional rules, preconditions, and/or effects just for the purpose of replanning, imagining all possible future applications.
8 Conclusion
We have defined and implemented a new agent system combining forward-chaining HTN planning, A*-like heuristic search, execution, belief updates, and plan modifications. Compared with our previous work [14, 15] on HTN planning agents, there are two major differences. The first difference is that the algorithm in this paper uses preconditions (of tasks) and effects (of actions) in the same way as SHOP [16]. This means that, given the same belief (rules), Dynagent and SHOP compute the same plans if the belief is not subject to change. On the other hand, the HTN planning algorithms in [14, 15] are based on a logic programming procedure called DSLDNF [12, 13] and do not use preconditions and effects; literals are used in [14, 15] instead of fluents and tasks. The second difference is that the algorithm in this paper uses A*-like heuristics for selecting the task to be decomposed, which is necessary to implement the museum tour guide robot scenario. We have also shown that our planning agent algorithm is useful for implementing the museum tour guide robot scenario. This scenario is related to replanning in the traveling salesman problem; however, our planner is a general-purpose planner, and it can be used for other purposes. Our next step is to use our real robots [25] instead of the simulator.
References
1. BPEL4WS v1.1 Specification, 2003.
2. E. Beaudry, F. Kabanza, and F. Michaud. Planning for a Mobile Robot to Attend a Conference. Canadian Conference on AI, pp. 48–52, 2005.
3. T. Belker, M. Hammel, and J. Hertzberg. Learning to Optimize Mobile Robot Navigation Based on HTN Plans. ICRA03, pp. 4136–4141, 2003.
4. M. Cavazza, F. Charles, and S. Mead. Characters in Search of an Author: AI-based Virtual Storytelling. ICVS, pp. 145–154, 2001.
5. M. Cavazza, F. Charles, and S. Mead. Planning Characters' Behaviour in Interactive Storytelling. The Journal of Visualization and Computer Animation, 13(2):121–131, 2002.
6. K. Currie and A. Tate. O-Plan: the Open Planning Architecture. Artificial Intelligence, 52(1):49–86, 1991.
7. M. desJardins, E. Durfee, C. Ortiz Jr., and M. Wolverton. A Survey of Research in Distributed, Continual Planning. AI Magazine, 20(4):13–22, Winter 1999.
8. J. Dix, H. Muñoz-Avila, and D. Nau. IMPACTing SHOP: Putting an AI Planner into a Multi-Agent Environment. Annals of Math and AI, 4(37):381–407, 2003.
9. T. Eiter, E. Erdem, and W. Faber. Plan Reversals for Recovery in Execution Monitoring. NMR04, pp. 147–154, 2004.
10. P. Hart, N. Nilsson, and B. Raphael. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, SSC-4(2):100–107, 1968.
11. P. Hart, N. Nilsson, and B. Raphael. Correction to "A Formal Basis for the Heuristic Determination of Minimum Cost Paths". SIGART Newsletter, 37, pp. 28–29, 1972.
12. H. Hayashi. Replanning in Robotics by Dynamic SLDNF. IJCAI 99 Workshop on "Scheduling and Planning Meet Real-Time Monitoring in a Dynamic and Uncertain World", 1999.
13. H. Hayashi. Computing with Changing Logic Programs. Ph.D. Thesis, Imperial College of Science, Technology and Medicine, University of London, 2001.
14. H. Hayashi, K. Cho, and A. Ohsuga. Mobile Agents and Logic Programming. MA02, pp. 32–46, 2002.
15. H. Hayashi, K. Cho, and A. Ohsuga. A New HTN Planning Framework for Agents in Dynamic Environments. Post-Proceedings of CLIMA IV, LNAI 3259, Springer-Verlag, pp. 108–133, 2004.
16. D. Nau, Y. Cao, A. Lotem, and H. Muñoz-Avila. SHOP: Simple Hierarchical Ordered Planner. IJCAI99, pp. 968–975, 1999.
17. O. Obst, A. Maas, and J. Boedecker. HTN Planning for Flexible Coordination of Multiagent Team Behavior. Technical report, Universität Koblenz-Landau, 2005.
18. J. Peer. Web Service Composition as AI Planning - a Survey. Technical Report, Univ. of St. Gallen, 2005.
19. E. Sacerdoti. A Structure for Plans and Behavior. American Elsevier, 1977.
20. A. Tate. Generating Project Networks. IJCAI77, pp. 888–893, 1977.
21. D. Wilkins. Practical Planning. Morgan Kaufmann, 1988.
22. D. Wilkins and M. desJardins. A Call for Knowledge-based Planning. AI Magazine, 22(1):99–115, Spring 2001.
23. D. Wu, B. Parsia, E. Sirin, J. Hendler, and D. Nau. Automating DAML-S web services composition using SHOP2. ISWC03, 2003.
24. Q. Yang. Intelligent Planning: A Decomposition and Abstraction Based Approach. Springer-Verlag, 1997.
25. T. Yoshimi, N. Matsuhira, K. Suzuki, D. Yamamoto, F. Ozaki, J. Hirokawa, and H. Ogawa. Development of a Concept Model of a Robotic Information Home Appliance, ApriAlpha. IROS04, pp. 205–211, 2004.
A Combination of Explicit and Deductive Knowledge with Branching Time: Completeness and Decidability Results

Alessio Lomuscio and Bożena Woźna

Department of Computer Science, University College London, Gower Street, London WC1E 6BT, United Kingdom {A.Lomuscio, B.Wozna}@cs.ucl.ac.uk

* The authors acknowledge support from the EPSRC (grant GR/S49353) and the Nuffield Foundation (grant NAL/690/G).
Abstract. Logics for knowledge and time comprise combinations of the epistemic logic S5n for n agents with temporal logic. In this paper we examine a combination of Computation Tree Logic and an epistemic logic augmented to include an additional epistemic operator representing explicit knowledge. We show that the resulting system enjoys the finite model property and decidability, and is finitely axiomatisable. It is further shown that the expressivity of the resulting system enables us to represent a non-standard notion of deductive knowledge which seems promising for applications.
1 Introduction
The use of modal logic has a long tradition in the area of epistemic logic. In its simplest case (dating back to Hintikka [16]) one considers a system of n agents and associates an S5 modality Ki with every agent i in the system, thereby obtaining the system S5n. In this system, all agents can be said to be logically omniscient and to enjoy positive and negative introspection with respect to their knowledge (which will always be true in the real world). While the system S5n can already be seen as a (trivial) combination, or fusion [20], of S5 with itself n times, more interesting combinations have been considered. For example, one of the systems presented in [17] is a fusion between the system S5n for knowledge and the system KD45n for belief, plus interaction axioms regulating the relationship between knowledge and belief. The system S5WDn [22] is an extension of S5n obtained by adding the "interaction axiom" (♦1□2α ∧ · · · ∧ ♦n−1□nα) ⇒ □n♦1α. Other examples are discussed in the literature, including [23, 2]. The completeness proofs in these works are typically based on some reasoning on the canonical model [1]. Because of its importance in applications, and in particular in verification, there has been recent growing interest in the combination of temporal with epistemic logic. This allows for the representation of concepts such as the knowledge of one agent about a changing world, the temporal evolution of the knowledge of agents about the knowledge of others, and various other epistemic properties. A first
approach for a logic of knowledge and time was given by Sato in [31]. Subsequently, other logics have been proposed [11, 21, 32, 33]; as reported in [14], there are 96 logics of knowledge and time: 48 are based on linear time logic (LTL) and 48 involve branching time. In particular, a variety of semantical classes (interpreted systems with perfect recall, synchronicity, asynchronicity, no learning, etc.) have been defined, and their axiomatisations shown with respect to a temporal and epistemic language [25, 26, 14]. This is of particular relevance for the verification of multi-agent systems via model checking, an area that has recently received some attention [13, 18, 27, 30, 34, 35, 36]. While these results as a whole seem to constitute a rather mature area of investigation, the underlying assumption there is that an S5 modality is an adequate operator for knowledge. This is indeed the case in a variety of scenarios (typically in communication protocols) where the properties of interest are best captured by means of an information-theoretic concept. Of interest in these cases is not what an agent explicitly knows but what the specifier of the system can ascribe to the agents given the information they have at their disposal. In other instances S5 is not a useful modality to consider, at least on its own, and weaker forms of knowledge are called for. A variety of weaker variants of the epistemic logic S5n (most of them inspired by attempts to solve what is normally referred to as the "problem of logical omniscience") have been developed over the years [19, 10, 8, 15]. The most relevant for this paper are the awareness and explicit knowledge operators presented in [8, 9]. In that work, two new operators, Ai and Xi, are introduced. The former represents the information an agent has at its disposal; its semantics is not given, as in standard Kripke semantics, by considering the accessible points on the basis of some accessibility relation, but simply by checking whether the formula the agent is aware of is present in its local database, i.e., whether the formula φ is i-local in the state in question. The latter represents the information an agent explicitly knows, this being interpreted as standard knowledge together with awareness of that fact. The aim of the present paper is two-fold. First, we aim to axiomatise the concept of explicit knowledge when combined with the branching-time logic CTL on a standard multi-agent systems semantics, and to show the decidability of the resulting system. Second, we argue that the combination of explicit knowledge with branching time not only gives rise to interesting axiomatisation problems, but also allows us to express some subtle epistemic concepts needed in applications. In particular, it allows us to characterise the notion of "deductive knowledge"¹ formalised below. The rest of the paper is organised as follows. In Section 2 we briefly present the basics of the underlying syntactical and semantical assumptions used in the paper. Section 3 is devoted to the construction of the underlying machinery to prove completeness and decidability, viz. Hintikka structures and related concepts. Sections 4 and 5 present the main results of this paper: a decidability result and a completeness proof for the logic. We conclude in Section 6 with some observations on alternative definitions.
¹ Our use of the term "deductive knowledge" is inspired by [29], although the focus in this paper is different.
2 Temporal Deductive Logic
Logics for time and knowledge provide a fundamental framework for the formal modelling of properties of multi-agent systems (MASs) [9]. In particular, epistemic logics are designed to reason about the knowledge of agents in distributed and multi-agent systems. They are not taken as describing what an agent actually knows, but only what is implicitly represented in his information state, i.e., what logically follows from his actual knowledge. Temporal logics are used to model how the state of agents' knowledge changes over time. Most of them enable the expression of such temporal concepts as "eventually", "always", "next time", "until", "release", etc. In this section, we present Temporal Deductive Logic (TDL), a combination of the temporal logic CTL [3, 6, 5] with different epistemic notions. In fact, TDL extends standard combinations of branching-time epistemic languages by introducing three further epistemic modalities: awareness, explicit knowledge, and deductive knowledge. These new modalities can be combined with Boolean connectives and temporal and standard epistemic operators, or nested arbitrarily.

2.1 Syntax
Assume a set of propositional variables PV, also containing the symbol ⊤ standing for true, and a set of agents AG = {1, . . . , n}, where n ∈ {1, 2, 3, . . .}. The set WF of well-formed TDL formulae is defined by the following grammar:

ϕ := p | ¬ϕ | ϕ ∨ ϕ | E◯ϕ | E(ϕUϕ) | A(ϕUϕ) | Kiϕ | Aiϕ | Xiϕ,

where p ∈ PV and i ∈ AG. In the above syntax, a path quantifier, either A (“for all the computation paths”) or E (“for some computation paths”), is immediately followed by exactly one of the usual linear-time operators ◯ (“next time”) and U (“until”). This implies that the above grammar defines only one type of temporal formulae, namely state formulae; this is the traditional name for temporal formulae that are interpreted over states only, rather than over paths. In fact, the above syntax extends CTL with the standard epistemic modality Ki as well as operators for explicit knowledge (Xi) and awareness (Ai) as in [9]. The formula Xiϕ is read as “agent i knows explicitly that ϕ”, the formula Aiϕ is read as “agent i is aware of ϕ”, and Kiϕ (the standard epistemic modality) is read as “agent i knows (implicitly) that ϕ”. We shall further use the shortcut Diϕ to represent E(KiϕUXiϕ). The formula Diϕ is read as “agent i may deduce ϕ (by some computational process)”. The remaining operators can be introduced as abbreviations in the usual way, i.e. α ∧ β ≝ ¬(¬α ∨ ¬β), α ⇒ β ≝ ¬α ∨ β, α ⇔ β ≝ (α ⇒ β) ∧ (β ⇒ α), A◯α ≝ ¬E◯¬α, E♦α ≝ E(⊤Uα), A♦α ≝ A(⊤Uα), E□α ≝ ¬A♦¬α, A□α ≝ ¬E♦¬α, A(αWβ) ≝ ¬E(¬αU¬β), E(αWβ) ≝ ¬A(¬αU¬β), and K̄iα ≝ ¬Ki(¬α).
We conclude this section with some essential definitions, used later on in all the proofs.
Let ϕ and ψ be TDL formulae. ψ is a sub-formula of ϕ if either (a) ψ = ϕ; or (b) ϕ is of the form ¬α, E◯α, Kiα, Xiα, Aiα, or Diα, and ψ is a sub-formula of α; or (c) ϕ is of the form α ∨ β, E(αUβ), or A(αUβ), and ψ is a sub-formula of either α or β. The length of ϕ (denoted by |ϕ|) is defined inductively as follows:
• If ϕ ∈ PV, then |ϕ| = 1;
• If ϕ is of the form ¬α, Kiα, Xiα, or Aiα, then |ϕ| = |α| + 1;
• If ϕ is of the form E◯α, then |ϕ| = |α| + 2;
• If ϕ is of the form α ∨ β, then |ϕ| = |α| + |β| + 1;
• If ϕ is of the form A(αUβ) or E(αUβ), then |ϕ| = |α| + |β| + 2.

2.2 Semantics
Traditionally, the semantics of temporal logics with epistemic operators is defined on interpreted systems, defined in the following way [9].

Definition 1 (Interpreted Systems). Assume that each agent i ∈ AG is associated with a set of local states Li, and the environment is associated with a set of local states Le. An interpreted system is a tuple IS = (S, T, ∼1, . . . , ∼n, V), where S ⊆ L1 × · · · × Ln × Le is a set of global states; T ⊆ S × S is a (serial) temporal relation on S; ∼i ⊆ S × S is an (equivalence) epistemic relation for each agent i ∈ AG, defined by: s ∼i s′ iff li(s′) = li(s), where li : S → Li is a function returning the local state of agent i from a global state; and V : S → 2^PV is a valuation function. V assigns to each state a set of propositional variables that are assumed to be true at that state. For more details and further explanations of the notation we refer to [9].

In order to give a semantics to TDL we extend the above definition by means of local awareness functions, used to indicate the facts that agents are aware of. As in [9], we do not attach any fixed interpretation to the notion of awareness, i.e. to be aware may mean “to be able to figure out the truth”, “to be able to compute the truth within time T”, etc.

Definition 2 (Model). Given a finite set of agents AG = {1, . . . , n}, a model is a tuple M = (S, T, ∼1, . . . , ∼n, V, A1, . . . , An), where S, T, ∼i, and V are defined as in the interpreted system above, and Ai : Li → 2^WF is an awareness function assigning a set of formulae to each local state, for each i ∈ AG.

Intuitively, Ai(li(s)) is the set of formulae that agent i is aware of at state s, i.e. the set of formulae to which the agent can assign a truth value (unconnected with the global valuation), but which he does not necessarily know. Note that the set of formulae that the agent is aware of can be arbitrary and need not be closed under sub-formulae. Note also that the definition of a model is an extension of the awareness structures introduced in [9] by a temporal relation. Moreover, it restricts the standard awareness function to be defined over local states only. A review of other restrictions which can be placed on the set of formulae that an agent may be aware of, and their consequences, is given in Section 6.
A path in M is an infinite sequence π = (s0, s1, . . .) of states such that (si, si+1) ∈ T for each i ∈ IN. For a path π = (s0, s1, . . .), we take π(k) = sk. By Π(s) we denote the set of all the paths starting at s ∈ S.

Definition 3 (Satisfaction). Let M be a model, s a state, and α, β TDL formulae. The satisfaction relation |=, indicating truth of a formula in model M at state s, is defined inductively as follows:

(M, s) |= p iff p ∈ V(s),
(M, s) |= α ∧ β iff (M, s) |= α and (M, s) |= β,
(M, s) |= ¬α iff (M, s) ⊭ α,
(M, s) |= E◯α iff (∃π ∈ Π(s))(M, π(1)) |= α,
(M, s) |= E(αUβ) iff (∃π ∈ Π(s))(∃m ≥ 0)[(M, π(m)) |= β and (∀j < m)(M, π(j)) |= α],
(M, s) |= A(αUβ) iff (∀π ∈ Π(s))(∃m ≥ 0)[(M, π(m)) |= β and (∀j < m)(M, π(j)) |= α],
(M, s) |= Kiα iff (∀s′ ∈ S)(s ∼i s′ implies (M, s′) |= α),
(M, s) |= Aiα iff α ∈ Ai(li(s)),
(M, s) |= Xiα iff (M, s) |= Kiα and (M, s) |= Aiα.
Note that since Diα is a shortcut for E(KiαUXiα), as defined in Section 2.1, we have that (M, s) |= Diα iff (M, s) |= E(KiαUXiα). Note also that satisfaction for Xi could be defined simply in terms of Ki and Ai, but we will find it convenient in the axiomatisation to have a dedicated operator for Ai. This is in line with [9]. Satisfaction for the Boolean and temporal operators, as well as for the epistemic modalities Ki, Xi, Ai, is standard (see Figure 1 for some examples of TDL formulae holding in a state of a given model). The formula Diα holds at state s in a model M if Kiα holds at s and there exists a path starting at s such that Xiα holds in some state on that path and Kiα holds everywhere earlier on the path. The meaning captured here is that of potential deduction by the agent: the agent is able to participate in a run (path) of the system under consideration which leads him to a state where he knows the fact in question explicitly. Moreover, from an external observer's point of view, the agent had enough information from the beginning of such a run to deduce the fact. The computation along the path represents, in abstract terms, the deduction performed by the agent to turn implicit into explicit knowledge.
[Figure 1 (diagram) shows four small models, each a chain of states labelled with α and β, witnessing respectively M, s |= E◯α, M, s |= E(αUβ), and M, s |= A(αUβ), together with a state s with α ∈ Aa(la(s)) at which (1) M, s |= Aaα and (2) M, s |= Xaα, as well as M, s |= Kaα.]
Fig. 1. Examples of TDL formulae which hold in the state s of the model M
Note that the operator Di is introduced to account for the process of deduction; other processes resulting in explicit knowledge (discovery, communication, . . .) are possible but are not modelled by it. Alternative definitions of deductive knowledge are possible, and we discuss a few of them in Section 6. We conclude this section with some other essential definitions.

Definition 4 (Validity and Satisfiability in a Model). Let M be a model. A TDL formula ϕ is valid in M (written M |= ϕ) if M, s |= ϕ for all states s ∈ S. A TDL formula ϕ is satisfiable in M if M, s |= ϕ for some state s ∈ S.

Definition 5 (Validity and Satisfiability). A TDL formula ϕ is valid (written |= ϕ) if ϕ is valid in all models M. A TDL formula ϕ is satisfiable if it is satisfiable in some model M. In the latter case M is said to be a model for ϕ.
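The inductive clauses of Definition 3 translate directly into a recursive checking procedure on finite models. The sketch below is ours, not part of the formalism: it uses Python purely as illustration, encodes formulae as nested tuples, and omits the two until clauses, which require a fixpoint computation over paths rather than a one-step recursion.

def sat(model, s, phi):
    # model components: temporal relation T, epistemic classes '~',
    # awareness functions 'A', valuation 'V', local-state map 'l'
    T, eq, aw, val, loc = (model['T'], model['~'], model['A'],
                           model['V'], model['l'])
    op = phi[0]
    if op == 'p':                       # propositional variable
        return phi[1] in val[s]
    if op == 'not':
        return not sat(model, s, phi[1])
    if op == 'or':
        return sat(model, s, phi[1]) or sat(model, s, phi[2])
    if op == 'EX':                      # E-next: some temporal successor
        return any(sat(model, t, phi[1]) for (u, t) in T if u == s)
    if op == 'K':                       # implicit knowledge of agent i
        i, a = phi[1], phi[2]
        return all(sat(model, t, a) for t in eq[i][s])
    if op == 'A':                       # awareness: a is in i's local database
        i, a = phi[1], phi[2]
        return a in aw[i][loc[i][s]]
    if op == 'X':                       # explicit = implicit + awareness
        i, a = phi[1], phi[2]
        return sat(model, s, ('K', i, a)) and sat(model, s, ('A', i, a))
    raise ValueError(op)

# A two-state model in which agent 1 explicitly knows q everywhere:
model = {'T': {(0, 1), (1, 1)},
         '~': {1: {0: {0, 1}, 1: {0, 1}}},
         'A': {1: {'l0': {('p', 'q')}}},
         'V': {0: {'q'}, 1: {'q'}},
         'l': {1: {0: 'l0', 1: 'l0'}}}
print(sat(model, 0, ('X', 1, ('p', 'q'))))   # True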
3 Finite Model Property for TDL
In this section we prove that the TDL language has the finite model property (FMP); a logic has the FMP if any satisfiable formula is also satisfiable in a finite model. The standard way of proving such a result for modal logics is to collapse, according to an equivalence relation of finite index, a possibly infinite model into a finite one (the so-called quotient structure, or filtration), and then to show that the resulting finite structure is still a model for the formula in question. This technique, for example, has been used in [12] to prove that PDL has the FMP. For some logics (in particular for TDL) the quotient construction yields a quotient structure that is not a model; however, it still contains enough information to be unwound into a genuine model. Therefore, to prove the FMP for TDL, we follow [7], where a combination of the filtration and unwinding techniques has been applied to prove the FMP for CTL. (We would like to emphasise that although in [7] the logic CTL∗ is introduced as the underlying formalism, the decidability and completeness results there are given for CTL only.) We begin by providing definitions of two auxiliary structures: a Hintikka structure for a given TDL formula, and the quotient structure for a given model, also called, in classical modal logic, a filtration.

Definition 6 (Hintikka structure). Let ϕ be a TDL formula, and AG = {1, . . . , n} a set of agents. A Hintikka structure for ϕ is a tuple HS = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) such that the elements S, T, ∼i, and Ai, for i ∈ AG, are defined as in Definition 2, and L : S → 2^WF is a labelling function assigning a set of formulae to each state such that ϕ ∈ L(s) for some s ∈ S. Moreover, L satisfies the following conditions:

H.1. if ¬α ∈ L(s), then α ∉ L(s)
H.2. if ¬¬α ∈ L(s), then α ∈ L(s)
H.3. if (α ∨ β) ∈ L(s), then α ∈ L(s) or β ∈ L(s)
H.4. if ¬(α ∨ β) ∈ L(s), then ¬α ∈ L(s) and ¬β ∈ L(s)
H.5. if E(αUβ) ∈ L(s), then β ∈ L(s) or α ∧ E◯E(αUβ) ∈ L(s)
H.6. if ¬E(αUβ) ∈ L(s), then ¬β ∧ ¬α ∈ L(s) or ¬β ∧ ¬E◯E(αUβ) ∈ L(s)
H.7. if A(αUβ) ∈ L(s), then β ∈ L(s) or α ∧ ¬E◯(¬A(αUβ)) ∈ L(s)
H.8. if ¬A(αUβ) ∈ L(s), then ¬β ∧ ¬α ∈ L(s) or ¬β ∧ E◯(¬A(αUβ)) ∈ L(s)
H.9. if E◯α ∈ L(s), then (∃t ∈ S)((s, t) ∈ T and α ∈ L(t))
H.10. if ¬E◯α ∈ L(s), then (∀t ∈ S)((s, t) ∈ T implies ¬α ∈ L(t))
H.11. if E(αUβ) ∈ L(s), then (∃π ∈ Π(s))(∃n ≥ 0)(β ∈ L(π(n)) and (∀j < n) α ∈ L(π(j)))
H.12. if A(αUβ) ∈ L(s), then (∀π ∈ Π(s))(∃n ≥ 0)(β ∈ L(π(n)) and (∀j < n) α ∈ L(π(j)))
H.13. if Kiα ∈ L(s), then α ∈ L(s)
H.14. if Kiα ∈ L(s), then (∀t ∈ S)(s ∼i t implies α ∈ L(t))
H.15. if ¬Kiα ∈ L(s), then (∃t ∈ S)(s ∼i t and ¬α ∈ L(t))
H.16. if Xiα ∈ L(s), then Kiα ∈ L(s) and Aiα ∈ L(s)
H.17. if ¬Xiα ∈ L(s), then ¬Kiα ∈ L(s) or ¬Aiα ∈ L(s)
H.18. if s ∼i t and s ∼i u and Kiα ∈ L(t), then α ∈ L(u)
H.19. if Aiα ∈ L(s), then α ∈ Ai(li(s))
H.20. if ¬Aiα ∈ L(s), then α ∉ Ai(li(s))
Note that a Hintikka structure differs from a model in that the assignment L is not restricted to propositional variables, nor is it required to always contain p or ¬p for every p ∈ PV. Further, the labelling rules are of the form “if” and not “if and only if”. They provide requirements that must be satisfied by a valid labelling, but they do not require that the formulae belonging to L(s) form a maximal set of formulae, for any s ∈ S. This means that there may be formulae that are satisfied in a given state but are not included in the label of that state. As usual, we call the rules H1–H8, H13, and H16 propositional consistency rules, the rules H9, H10, H14, H15, H17, and H18–H20 local consistency rules, and the rules H11 and H12 the eventuality rules. A consequence of this definition of Hintikka structures is the following:

Lemma 1 (Hintikka's Lemma for TDL). A TDL formula ϕ is satisfiable (i.e., ϕ has a model) if and only if there is a Hintikka structure for ϕ.

Proof. It is easy to check that any model M = (S, T, ∼1, . . . , ∼n, V, A1, . . . , An) for ϕ is a Hintikka structure for ϕ, once we extend V to cover all formulae which are true in a state, i.e., in M we replace V by the labelling L defined by: α ∈ L(s) iff (M, s) |= α, for all s ∈ S. Conversely, any Hintikka structure HS = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) for ϕ can be restricted to form a model for ϕ. Namely, it is enough to restrict L to propositional variables only, and to require that for every propositional variable p appearing in ϕ and for all s ∈ S, either p ∈ L(s) or ¬p ∈ L(s).

Now we proceed to define the quotient structure for a given model M. The quotient construction depends on an equivalence relation of finite index on the states of M; therefore we first have to provide such a relation. We will define it with respect to the Fischer-Ladner closure of a TDL formula ϕ (denoted by FL(ϕ)), which is defined by: FL(ϕ) = CL(ϕ) ∪ {¬α | α ∈ CL(ϕ)}, where CL(ϕ) is the smallest set of formulae that contains ϕ and satisfies the following conditions:

(a) if ¬α ∈ CL(ϕ), then α ∈ CL(ϕ);
(b) if α ∨ β ∈ CL(ϕ), then α, β ∈ CL(ϕ);
(c) if E(αUβ) ∈ CL(ϕ), then α, β, E◯E(αUβ) ∈ CL(ϕ);
(d) if A(αUβ) ∈ CL(ϕ), then α, β, A◯A(αUβ) ∈ CL(ϕ);
(e) if E◯α ∈ CL(ϕ), then α ∈ CL(ϕ);
(f) if Kiα ∈ CL(ϕ), then α ∈ CL(ϕ);
(g) if Aiα ∈ CL(ϕ), then α ∈ CL(ϕ);
(h) if Xiα ∈ CL(ϕ), then Kiα ∈ CL(ϕ) and Aiα ∈ CL(ϕ).

Note that for a given TDL formula ϕ, FL(ϕ) is a finite set of formulae, as the following lemma shows; hereafter, the size of a set A (denoted by Card(A)) is the cardinality of A.

Lemma 2. Let ϕ be a TDL formula. Then Card(FL(ϕ)) ≤ 2(|ϕ| + 3).

Proof. Straightforward by induction on the length of ϕ.
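Conditions (a)–(h) also describe a terminating recursive computation of CL(ϕ), and hence of FL(ϕ). A possible rendering of it — ours rather than the paper's, using the tuple encoding of formulae from the sketch above:

def cl(phi, out=None):
    # smallest set containing phi and closed under conditions (a)-(h)
    if out is None:
        out = set()
    if phi in out:
        return out
    out.add(phi)
    op = phi[0]
    if op == 'not':                                    # (a)
        cl(phi[1], out)
    elif op == 'or':                                   # (b)
        cl(phi[1], out)
        cl(phi[2], out)
    elif op == 'EU':                                   # (c): E(aUb)
        cl(phi[1], out)
        cl(phi[2], out)
        cl(('EX', phi), out)
    elif op == 'AU':                                   # (d): A(aUb)
        cl(phi[1], out)
        cl(phi[2], out)
        cl(('AX', phi), out)
    elif op in ('EX', 'AX'):                           # (e)
        cl(phi[1], out)
    elif op in ('K', 'A'):                             # (f), (g)
        cl(phi[2], out)
    elif op == 'X':                                    # (h)
        cl(('K', phi[1], phi[2]), out)
        cl(('A', phi[1], phi[2]), out)
    return out

def fl(phi):
    c = cl(phi)
    return c | {('not', a) for a in c}                 # FL = CL plus negations

The recursion terminates because the only formulae ever added that are larger than sub-formulae of ϕ are the one-step unfoldings E◯E(αUβ) and A◯A(αUβ), which immediately bottom out; this is why FL(ϕ) stays finite (Lemma 2).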
Definition 7 (Fischer-Ladner equivalence relation). Let ϕ be a TDL formula and M = (S, T, ∼1, . . . , ∼n, V, A1, . . . , An) a model for ϕ. The relation ↔FL(ϕ) on the set of states S is defined as follows:

s ↔FL(ϕ) s′ iff (∀α ∈ FL(ϕ))((M, s) |= α iff (M, s′) |= α).

By [s] we denote the set {w ∈ S | w ↔FL(ϕ) s}. Observe that ↔FL(ϕ) is indeed an equivalence relation, so using it we can define the quotient structure for a given TDL model.

Definition 8 (Quotient structure). Let ϕ be a TDL formula, M = (S, T, ∼1, . . . , ∼n, V, A1, . . . , An) a model for ϕ, and ↔FL(ϕ) the Fischer-Ladner equivalence relation. The quotient structure of M by ↔FL(ϕ) is the structure M↔FL(ϕ) = (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n), where

– S′ = {[s] | s ∈ S};
– T′ = {([s], [s′]) ∈ S′ × S′ | (∃w ∈ [s])(∃w′ ∈ [s′]) s.t. (w, w′) ∈ T};
– for each agent i, ∼′i = {([s], [s′]) ∈ S′ × S′ | (∃w ∈ [s])(∃w′ ∈ [s′]) s.t. (w, w′) ∈ ∼i};
– L′ : S′ → 2^FL(ϕ) is the function defined by: L′([s]) = {α ∈ FL(ϕ) | (M, s) |= α};
– A′i : l′i(S′) → 2^FL(ϕ) is the function defined by: A′i(l′i([s])) = ⋃t∈[s] Ai(li(t)), for each agent i, where l′i : S′ → 2^Li is the function defined by l′i([s]) = ⋃t∈[s] li(t); it returns the set of local states of agent i for a given set of global states.

Note that the set S′ is finite, as it is the result of collapsing states satisfying formulae that belong to the finite set FL(ϕ). In fact we have Card(S′) ≤ 2^Card(FL(ϕ)). Note also that A′i is well defined.

The quotient structure cannot be directly used to show that TDL has the FMP. This is because the resulting quotient structure may not be a model, as the following lemma shows.

Lemma 3. The quotient structure does not preserve satisfiability of formulae of the form A(αUβ), where α, β ∈ WF. In particular, there is a model M for A(⊤Up), with p ∈ PV, such that M↔FL(ϕ) is not a model for A(⊤Up).
Proof. [Idea] Consider the following model M = (S, T, ∼, V, A) for A(⊤Up), where S = {s0, s1, . . .}, T = {(s0, s0)} ∪ {(si, si−1) | i > 0}, ∼ = {(si, si) | i ≥ 0}, p ∈ V(s0) and p ∉ V(si) for all i > 0, and A(si) = ∅ for all i ≥ 0. Observe that in the quotient structure of M two distinct states si and sj, for i, j > 0, will be identified, resulting in a cycle in the quotient structure along which p is always false. Hence A(⊤Up) does not hold along the cycle.

Although the quotient structure of a given model M by ↔FL(ϕ) may not be a model, it satisfies another important property, which allows us to view it as a pseudo-model (defined below); it can be unwound into a proper model, which can then be used to show that the TDL language has the FMP. To make this idea precise, we introduce the following auxiliary definitions. An interior (respectively frontier) node of a directed acyclic graph (DAG) is one which has (respectively does not have) a successor. (Recall that a directed acyclic graph is a directed graph such that for any node v, there is no nonempty directed path starting and ending on v.) The root of a DAG is the node (if it exists) from which all other nodes are reachable. A fragment M′ = (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n) of a Hintikka structure HS = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) is a structure such that (S′, T′) generates a finite DAG, in which the interior nodes satisfy H1–H10 and H13–H20, and the frontier nodes satisfy H1–H8, H13, and H16–H20. Given M = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) and M′ = (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n), we say that M′ is contained in M, and write M′ ⊆ M, if S′ ⊆ S, T′ = T ∩ (S′ × S′), ∼′i = ∼i ∩ (S′ × S′), L′ = L|S′, and A′i = Ai|L′i.

Definition 9 (Pseudo-model). Let ϕ be a TDL formula. A pseudo-model M = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) for ϕ is defined in the same manner as a Hintikka structure for ϕ in Definition 6, except that condition H12 is replaced by the following condition H′12: (∀s ∈ S) if A(αUβ) ∈ L(s), then there is a fragment (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n) ⊆ M such that: (a) (S′, T′) generates a DAG with root s; (b) for all frontier nodes t ∈ S′, β ∈ L′(t); (c) for all interior nodes u ∈ S′, α ∈ L′(u).

Now we can prove the main claim of the section, i.e., the fact that TDL has the finite model property.

Theorem 1 (FMP for TDL). Let ϕ be a TDL formula. Then the following are equivalent:
1. ϕ is satisfiable;
2. there is a finite pseudo-model for ϕ;
3. there is a Hintikka structure for ϕ.

Proof. (1) ⇒ (2) follows from Lemma 4, presented below. To prove (2) ⇒ (3) it is enough to construct a Hintikka structure for ϕ by “unwinding” the pseudo-model for ϕ. This can be done in the same way as described in [7] for the proof of Theorem 4.1. (3) ⇒ (1) follows from Lemma 1.
Lemma 4. Let ϕ be a TDL formula, FL(ϕ) the Fischer-Ladner closure of ϕ, M = (S, T, ∼1, . . . , ∼n, V, A1, . . . , An) a model for ϕ, and M↔FL(ϕ) = (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n) the quotient structure of M by ↔FL(ϕ). Then, M↔FL(ϕ) is a pseudo-model for ϕ.

Proof. The proof for the temporal part of TDL follows immediately from Lemma 3.8 in [7]. Consider now ϕ to be of one of the following forms: ¬Kiα, Xiα, and Aiα. The other cases can be proven in a similar way.

1. ϕ = ¬Kiα. Let (M, s) |= ¬Kiα, and ¬Kiα ∈ L′([s]). By the definition of |=, there exists t ∈ S such that s ∼i t and (M, t) |= ¬α. Thus, by the definitions of ↔FL(ϕ) and L′, we have that ¬α ∈ L′([t]). Therefore, by the definition of ∼′i, we conclude that there exists [t] ∈ S′ such that [s] ∼′i [t] and ¬α ∈ L′([t]). So, condition H15 is fulfilled.
2. ϕ = Xiα. Let (M, s) |= Xiα, and Xiα ∈ L′([s]). By the definition of |=, we have that (M, s) |= Kiα and (M, s) |= Aiα. By the definitions of ↔FL(ϕ) and L′, we have that Kiα ∈ L′([s]) and Aiα ∈ L′([s]). So, condition H16 is fulfilled.
3. ϕ = Aiα. Let (M, s) |= Aiα, and Aiα ∈ L′([s]). By the definition of |=, we have that α ∈ Ai(li(s)). Since Ai(li(s)) ⊆ A′i(l′i([s])), we have that α ∈ A′i(l′i([s])). So, condition H19 is fulfilled.

In the subsequent sections we will present an algorithm for deciding satisfiability and an axiomatic system for proving all valid formulae of TDL.
4 Decidability for TDL
Let ϕ be a TDL formula, and FL(ϕ) the Fischer-Ladner closure of ϕ. We define ∆ ⊆ FL(ϕ) to be maximal if for every formula α ∈ FL(ϕ), either α ∈ ∆ or ¬α ∈ ∆.

Theorem 2. There is an algorithm for deciding whether any TDL formula is satisfiable.

Proof. Given a TDL formula ϕ, we construct a finite pseudo-model for ϕ. We proceed as follows.

1. Build a structure M0 = (S0, T0, ∼0_1, . . . , ∼0_n, L0, A0_1, . . . , A0_n) for ϕ in the following way:
– S0 = {∆ ⊆ FL(ϕ) | ∆ maximal and satisfying all the propositional consistency rules};
– T0 ⊆ S0 × S0 is the relation such that (∆1, ∆2) ∈ T0 iff ¬E◯α ∈ ∆1 implies ¬α ∈ ∆2;
– for each agent i ∈ AG, ∼0_i ⊆ S0 × S0 is the relation such that (∆1, ∆2) ∈ ∼0_i iff {α | Kiα ∈ ∆1} ⊆ ∆2;
– L0(∆) = ∆;
– assume that for each agent i ∈ AG the set of local states Li is equal to S0; then A0_i(∆) = {α | Aiα ∈ ∆} for each agent i ∈ AG.

It is easy to observe that M0, as constructed above, satisfies all the propositional consistency properties; property H10 (because of the definition of T0); property H14 (because of the definition of ∼0_i); and properties H19 and H20 (because of the definition of A0_i).

2. Test the structure M0 for fulfilment of the properties H9, H11, H′12, H15, H17, and H18 by repeatedly applying the following deletion rules until no more states can be deleted:
(a) Delete any state which has no T0-successors.
(b) Delete any state ∆1 ∈ S0 such that E(αUβ) ∈ ∆1 (respectively A(αUβ) ∈ ∆1) and there does not exist a fragment M′ ⊆ M0 such that: (i) (S′, T′) is a DAG with root ∆1; (ii) for all frontier nodes ∆2 ∈ S′, β ∈ ∆2; (iii) for all interior nodes ∆3 ∈ S′, α ∈ ∆3.
(c) Delete any state ∆1 ∈ S0 such that ¬Kiα ∈ ∆1, and ∆1 does not have any ∼0_i-successor ∆2 ∈ S0 with ¬α ∈ ∆2.
(d) Delete any state ∆ ∈ S0 such that ¬Xiα ∈ ∆ and Kiα ∈ ∆ and α ∈ A0_i(∆).
(e) Delete any state ∆1 ∈ S0 such that ∆1 ∼0_i ∆2 and ∆1 ∼0_i ∆3 and α ∈ ∆2 and Ki¬α ∈ ∆3.

We call the algorithm above the decidability algorithm for TDL.

Claim (1). The decidability algorithm for TDL terminates.

Proof. Termination is obvious given that the initial set S0 is finite.
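Step 2 is a standard elimination fixpoint: the five rules are reapplied until no state can be deleted, since removing one state may cause another to violate a rule. Schematically (our sketch only; the rule tests, in particular the fragment search of rule (b), are abstracted into a single predicate):

def eliminate(states, violates):
    # states: the finite set S0 of maximal, propositionally consistent
    # subsets of FL(phi); violates(s, states): True iff s breaks one of
    # the deletion rules (a)-(e) relative to the surviving states
    changed = True
    while changed:
        changed = False
        for s in list(states):
            if violates(s, states):
                states.discard(s)
                changed = True        # a deletion can trigger further ones
    return states

Each pass either removes at least one state or stops, so with S0 finite the loop terminates, which is the content of Claim 1; by Claim 2 below, ϕ is satisfiable iff some surviving state contains ϕ.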
Claim (2). Let M = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) be the resulting structure of the algorithm. The formula ϕ ∈ WF is satisfiable iff ϕ ∈ s for some s ∈ S.

Proof. To show the right-to-left direction, note that either the resulting structure is a pseudo-model for ϕ, or S = ∅ (this can be shown inductively on the structure of the algorithm). Any pseudo-model for ϕ can be extended to a model for ϕ (see the proof of Theorem 1). Conversely, if ϕ is satisfiable, then there exists a model M∗ such that M∗ |= ϕ. Let M↔FL(ϕ) = M′ = (S′, T′, ∼′1, . . . , ∼′n, L′, A′1, . . . , A′n) be the quotient structure of M∗ by ↔FL(ϕ). By Theorem 1, M′ is a pseudo-model for ϕ. So, L′ satisfies all the propositional consistency rules, the local consistency rules, and properties H11 and H′12. Moreover, by the definition of L′ in the quotient structure, L′(s) is maximal with respect to FL(ϕ) for all s ∈ S′. Now, let M″ = (S″, T″, ∼″1, . . . , ∼″n, L″, A″1, . . . , A″n) be the structure defined by step 1 of the decidability algorithm, and f : S′ → S″ the function defined by f(s) = L′(s). The following conditions hold:

1. if (s, t) ∈ T′, then (f(s), f(t)) ∈ T″.
Proof (via contradiction): Let (s, t) ∈ T′ and (f(s), f(t)) ∉ T″. Then, by the definition of T″, we have that ¬E◯α ∈ f(s) and α ∈ f(t), for some α. By the definition of f, we have that ¬E◯α ∈ L′(s) and α ∈ L′(t). So, by the definition of L′ in the quotient structure, we have that M∗, s |= ¬E◯α and M∗, t |= α, which contradicts the fact that (s, t) ∈ T′.
2. if (s, t) ∈ ∼′i, then (f(s), f(t)) ∈ ∼″i.
Proof (via contradiction): Let (s, t) ∈ ∼′i and (f(s), f(t)) ∉ ∼″i. Then, by the definition of ∼″i, we have that Kiα ∈ f(s) and α ∉ f(t), for some α. By the definition of f, we have that Kiα ∈ L′(s) and α ∉ L′(t). So, by the definition of L′ in the quotient structure, we have that M∗, s |= Kiα and M∗, t |= ¬α, which contradicts the fact that (s, t) ∈ ∼′i.

Thus, the image of M′ under f is contained in M″, i.e., M′ ⊆ M″. It remains to show that if s ∈ S′, then f(s) ∈ S″ will not be eliminated in step 2 of the decidability algorithm. This can be checked by induction on the order in which states of S″ are eliminated. For instance, assume that s ∈ S′ and A(αUβ) ∈ f(s). By the definition of f, we have that A(αUβ) ∈ L′(s). Now, since M′ is a pseudo-model, by Definition 9 there exists a fragment rooted at s that is contained in M′ and satisfies property H′12. Thus, since f preserves condition (1) above, there exists a fragment rooted at f(s) that is contained in M″ and satisfies property H′12. This implies that f(s) ∈ S″ will not be eliminated in step 2(b) of the decidability algorithm. The other cases can be proven similarly. Therefore, it follows that ϕ ∈ L(s) for some s ∈ S.
5 A Complete Axiomatic System for TDL
Recall that an axiomatic system consists of a collection of axiom schemes and inference rules. An axiom scheme is a rule for generating an infinite number of axioms, i.e. formulae that are universally valid. An inference rule has the form “from formulae ϕ1, . . . , ϕm infer formula ϕ”. We say that ϕ is provable (written ⊢ ϕ) if there is a sequence of formulae ending with ϕ, such that each formula is either an instance of an axiom, or follows from other provable formulae by applying an inference rule. We say that a formula ϕ is consistent if ¬ϕ is not provable. A finite set {ϕ1, . . . , ϕm} of formulae is consistent if and only if the conjunction ϕ1 ∧ . . . ∧ ϕm of its members is consistent. A set F of formulae is a maximally consistent set if it is consistent and, for all ϕ ∉ F, the set F ∪ {ϕ} is inconsistent. An axiom system is sound (respectively, complete) with respect to a class of models if ⊢ ϕ implies |= ϕ (respectively, if |= ϕ implies ⊢ ϕ). Let i ∈ {1, . . . , n}. Consider the system TDL as defined below:

PC. All substitution instances of classical tautologies.
T1. E◯(α ∨ β) ⇔ E◯α ∨ E◯β
T2. E◯⊤
T3. E(αUβ) ⇔ β ∨ (α ∧ E◯E(αUβ))
T4. A(αUβ) ⇔ β ∨ (α ∧ A◯A(αUβ))
K1. (Kiα ∧ Ki(α ⇒ β)) ⇒ Kiβ
K2. Kiα ⇒ α
K3. ¬Kiα ⇒ Ki¬Kiα
X1. Xiα ⇔ Kiα ∧ Aiα
A1. Aiα ⇒ KiAiα
A2. ¬Aiα ⇒ Ki¬Aiα
R1. From α and α ⇒ β infer β (Modus Ponens)
R2. From α infer Kiα, i = 1, . . . , n (Knowledge Generalisation)
R3. From α ⇒ β infer E◯α ⇒ E◯β
R4. From γ ⇒ (¬β ∧ E◯γ) infer γ ⇒ ¬A(αUβ)
R5. From γ ⇒ (¬β ∧ A◯(γ ∨ ¬E(αUβ))) infer γ ⇒ ¬E(αUβ)
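As a small illustration (ours) of the system at work, the soundness direction of explicit knowledge discussed in Section 6, namely Xiα ⇒ Kiα, and with it Xiα ⇒ α, is derivable in a few steps:

1. Xiα ⇔ Kiα ∧ Aiα (X1)
2. Xiα ⇒ Kiα ∧ Aiα (PC, 1)
3. Kiα ∧ Aiα ⇒ Kiα (PC)
4. Xiα ⇒ Kiα (PC, 2, 3)
5. Kiα ⇒ α (K2)
6. Xiα ⇒ α (PC, 4, 5)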
Theorem 3. The system TDL is sound and complete with respect to the class of models of Definition 2, i.e. |= ϕ iff ⊢ ϕ, for any formula ϕ ∈ WF.
Proof. Soundness can be checked inductively, as standard. For completeness, it is sufficient to show that any consistent formula is satisfiable. To do this, we first construct the structure M = (S, T, ∼1, . . . , ∼n, L, A1, . . . , An) for ϕ just as in step 1 of the decidability algorithm for TDL. We then execute step 2 of the algorithm, obtaining a pseudo-model for ϕ. Crucially, we show below that if a state s ∈ S is eliminated at step 2 of the algorithm, then the formula ψs = ⋀α∈s α is inconsistent. Observe now that for any α ∈ FL(ϕ) we have ⊢ α ⇔ ⋁{ψs | α ∈ s and ψs is consistent}. Thus, in particular, we have ⊢ ϕ ⇔ ⋁{ψs | ϕ ∈ s and ψs is consistent}. Thus, if ϕ is consistent, then some ψs is consistent as well, for some s ∈ S. It follows by Claim 2 of Theorem 2 that this particular s is present in the pseudo-model resulting from the execution of the algorithm. So, by Theorem 1, ϕ is satisfiable. Note that pseudo-models share the structural properties of models, i.e., their underlying frames have the same properties. It remains to show that if a state s ∈ S is eliminated at step 2 of the algorithm then the formula ψs is inconsistent. Before we do this, we need some auxiliary claims.

Claim (3). Let s ∈ S and α ∈ FL(ϕ). Then, α ∈ s iff ⊢ ψs ⇒ α.

Proof. (‘if’). Let α ∈ s. By the definition of S, any s in S is maximal. Thus, ¬α ∉ s. So, ⊢ ψs ⇒ α. (‘only if’). Let ⊢ ψs ⇒ α. Then, since s is maximal, we have that α ∈ s.

Claim (4). Let s, t ∈ S both be maximal and propositionally consistent, and let s ∼i t. If α ∈ t, then ¬Ki¬α ∈ s.

Proof. [By contraposition] Let α ∈ t and ¬Ki¬α ∉ s. Then, since s is maximal, we have that Ki¬α ∈ s. Thus, since s ∼i t, we have that ¬α ∈ t. This contradicts the fact that α ∈ t, since t is propositionally consistent.

Claim (5). Let s ∈ S be a maximal and consistent set of formulae and let α be such that ⊢ α. Then α ∈ s.

Proof. Suppose α ∉ s and ⊢ α. Since s is maximal, ¬α ∈ s. So ¬α ∧ ψs is consistent, where ψs = ⋀β∈s β. So, by the definition of consistency, we have that ⊬ ¬(¬α ∧ ψs), i.e., ⊬ α ∨ ¬ψs. But from ⊢ α we have ⊢ α ∨ ¬ψs, so this is a contradiction.

We now show, by induction on the structure of the decidability algorithm for TDL, that if a state s ∈ S is eliminated, then ⊢ ¬ψs.

Claim (6). If ψs is consistent, then s is not eliminated at step 2 of the decidability algorithm for TDL.

Proof. (a) Let E◯α ∈ s and ψs be consistent. By the same reasoning as in the proof of Claim 4(a) in [7], we conclude that s satisfies H9. So s is not eliminated.
(b) Let E(αUβ) ∈ s (respectively, A(αUβ) ∈ s) and suppose s is eliminated at step 2 because H11 (respectively H′12) is not satisfied. Then ψs is
inconsistent. The proof of this fact is the same as the proof of Claim 4(c) (respectively Claim 4(d)) in [7].
(c) Let ¬Kiα ∈ s and ψs be consistent. Consider the set S¬α = {¬α} ∪ {β | Kiβ ∈ s}. We will show that S¬α is consistent. Suppose that S¬α is inconsistent. Then ⊢ β1 ∧ . . . ∧ βm ⇒ α, where βj ∈ {β | Kiβ ∈ s} for j ∈ {1, . . . , m}. By rule R2 we have ⊢ Ki((β1 ∧ . . . ∧ βm) ⇒ α). By axiom K1 and PC we have ⊢ (Kiβ1 ∧ . . . ∧ Kiβm) ⇒ Kiα. Since each Kiβj ∈ s for j ∈ {1, . . . , m}, and s is maximal and propositionally consistent, we have Kiα ∈ s. This contradicts the fact that ψs is consistent. So, S¬α is consistent. Now, since each consistent set of formulae can be extended to a maximal one, S¬α is contained in some maximal set t. Thus ¬α ∈ t and, moreover, by the definition of ∼i in M and the definition of S¬α, we have that s ∼i t. Thus s cannot be eliminated at step 2(c) of the decidability algorithm.
(d) (By contradiction.) Let ¬Xiα ∈ s and let s be eliminated at step 2(d) (because H17 is not satisfied). We will show that ψs is inconsistent. Since ¬Xiα ∈ s, by Claim 3 we have that ⊢ ψs ⇒ ¬Xiα. Since H17 fails, by Claim 3 we have that ⊢ ψs ⇒ Kiα ∧ Aiα. So, by axiom X1, we have ⊢ ψs ⇒ Xiα. Therefore, we have that ⊢ ψs ⇒ ¬Xiα and ⊢ ψs ⇒ Xiα. This implies that ⊢ ψs ⇒ ⊥. Thus, ψs is inconsistent.
(e) Suppose that s is consistent and is eliminated at step 2(e) of the decidability algorithm (because H18 is not satisfied). Thus, we have that s ∼i t, s ∼i u, α ∈ t, and Ki¬α ∈ u. Since s ∼i t, α ∈ t, and s and t are maximal and propositionally consistent, by Claim 4 we have that ¬Ki¬α ∈ s. Since s is maximal and consistent, by axiom K3 and Claim 5 we have that ¬Ki¬α ⇒ Ki¬Ki¬α ∈ s. Therefore, we have that Ki¬Ki¬α ∈ s. Thus, since s ∼i u, we have that ¬Ki¬α ∈ u. But this is a contradiction, given that Ki¬α ∈ u and u is propositionally consistent. So s is inconsistent; therefore a consistent s cannot be eliminated at step 2(e) of the decidability algorithm.

We have now shown that only states s with ψs inconsistent are eliminated. This ends the completeness proof.
6 Discussion
In this paper we have shown that the logic TDL is decidable and can be axiomatised. TDL makes it possible to express different concepts of knowledge as well as time. In the following we briefly discuss alternative definitions of the notions defined in TDL. Let us first note that the semantics of explicit knowledge in TDL is defined as in [9], with the difference that we assume the awareness function to be defined on local states (as opposed to global states as in [9]). In other words we have that: if s ∼i t, then Ai(li(s)) = Ai(li(t)). Although this is a special case of the definition used in [9], we find it natural for the tasks we have in mind (communication,
fault-tolerance, security, . . .), given that all the information the agents have in these cases can be represented in their local states. (Indeed, using the TDL framework it is easy to capture the capabilities of the Dolev-Yao adversary [4]: we specify how the adversary can extract an intercepted message by defining an adequate awareness function. Moreover, using TDL we can perform a semantical analysis of the timed efficient stream loss-tolerant authentication (TESLA) protocol [28]; this analysis allows us to reason about properties that so far could not be expressed in any other formalism. For more details we refer to [24].)

Next, observe that the considered notion of explicit knowledge is sound, i.e., the following axiom is valid on TDL models: Xiα ⇒ Kiα; but it is not complete, i.e., |= Kiα ⇒ Xiα does not hold. Note also that defining awareness on local states forces the following two axiom schemas to be valid on TDL models: Aiα ⇒ KiAiα and ¬Aiα ⇒ Ki¬Aiα. These do not seem counterintuitive. Further restrictions can be imposed on the awareness function. One consists in insisting that the function Ai returns consistent sets. If this is the case, the formula Aiα ⇒ ¬Ai(¬α) becomes valid on TDL models. While this is a perfectly sound assumption in some applications (for instance when Ai models a consistent database), for the aims of our work it seems more natural not to insist on this condition. An even more crucial point is whether the local awareness functions should be consistent with one another, whether a “hierarchy of awareness” should be modelled, and whether they should at least agree with the global valuation function. In this paper we have made no assumption about the power of different agents; insisting that this is the case is again reasonable in some scenarios, but is not considered here. It should be noted that forcing consistency between any Ai and V would make awareness and explicit knowledge collapse to the same modality. Further, knowledge about negative facts would be impaired, given that Ai would only return propositions. The interested reader should refer to [9] for more details. We have found that the decidability and completeness proofs presented here can be adapted to account for different choices of the awareness function, provided that appropriate conditions are included in the construction of Hintikka structures.

The notion of deductive knowledge presented here is directly inspired by the notion of algorithmic knowledge of [15, 29]. Typically, formalisms for algorithmic knowledge capture which derivation algorithm is used to obtain a formula, and whether these derivations are correct and complete. The work presented here, on the other hand, focuses on the meta-logical properties of these notions, something not normally discussed, to our knowledge, in the literature. It should be pointed out that alternative definitions of deductive knowledge can be considered. For example, one can consider: (M, s) |= Diα iff (M, s) |= A(KiαUXiα), or (M, s) |= Diα iff (M, s) |= Kiα ∧ E(⊤UXiα). Both of them enjoy the same logical properties as the one proposed here. The first one describes a notion of “inevitability” in the deductions carried out by the agent. This does not seem as appropriate as the one we used here, as typically one intends to model the capability, not the certainty, of deducing some information. The second definition does not insist on implicit knowledge remaining true over the run while the deduction is taking place.
In this case any explicit knowledge deduced could well be unsound (in the sense of [29]), something that cannot happen in the formalism of this paper. We stress that all logics discussed above remain decidable. This allows us to explore model checking methods for them. We leave this for further work.
References

1. P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic, volume 53 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2001.
2. L. Catach. Normal multimodal logics. In Proceedings of the 7th National Conference on Artificial Intelligence (AAAI'88), pp. 491–495. Morgan Kaufmann, 1988.
3. E. Clarke and E. Emerson. Design and synthesis of synchronization skeletons for branching-time temporal logic. In Proceedings of the Workshop on Logic of Programs, volume 131 of LNCS, pp. 52–71. Springer-Verlag, 1981.
4. D. Dolev and A. Yao. On the security of public key protocols. IEEE Transactions on Information Theory, 29(2):198–208, 1983.
5. E. A. Emerson. Temporal and modal logic. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, chapter 16, pp. 996–1071. Elsevier Science Publishers, 1990.
6. E. A. Emerson and E. M. Clarke. Using branching-time temporal logic to synthesize synchronization skeletons. Science of Computer Programming, 2(3):241–266, 1982.
7. E. A. Emerson and J. Y. Halpern. Decision procedures and expressiveness in the temporal logic of branching time. Journal of Computer and System Sciences, 30(1):1–24, 1985.
8. R. Fagin and J. Y. Halpern. Belief, awareness, and limited reasoning. Artificial Intelligence, 34(1):39–76, 1988.
9. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT Press, Cambridge, 1995.
10. R. Fagin, J. Y. Halpern, and M. Vardi. A nonstandard approach to the logical omniscience problem. Artificial Intelligence, 79, 1995.
11. R. Fagin, J. Y. Halpern, and M. Y. Vardi. A model-theoretic analysis of knowledge. Journal of the ACM, 38(2):382–428, 1991.
12. M. J. Fischer and R. E. Ladner. Propositional dynamic logic of regular programs. Journal of Computer and System Sciences, 18(2):194–211, 1979.
13. P. Gammie and R. van der Meyden. MCK: Model checking the logic of knowledge. In Proceedings of the 16th International Conference on Computer Aided Verification (CAV'04), volume 3114 of LNCS, pp. 479–483. Springer-Verlag, 2004.
14. J. Halpern, R. van der Meyden, and M. Y. Vardi. Complete axiomatisations for reasoning about knowledge and time. SIAM Journal on Computing, 33(3):674–703, 2003.
15. J. Y. Halpern, Y. Moses, and M. Y. Vardi. Algorithmic knowledge. In Theoretical Aspects of Reasoning About Knowledge: Proceedings of the 5th Conference (TARK 1994), pp. 255–266. Morgan Kaufmann Publishers, 1994.
16. J. Hintikka. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Cornell University Press, Ithaca (NY) and London, 1962.
17. W. van der Hoek. Systems for knowledge and belief. Journal of Logic and Computation, 3(2):173–195, 1993.
18. W. van der Hoek and M. Wooldridge. Model checking knowledge and time. In Proceedings of the 9th International SPIN Workshop on Model Checking of Software, 2002.
19. K. Konolige. A Deduction Model of Belief. Brown University Press, 1986.
20. M. Kracht and F. Wolter. Properties of independently axiomatizable bimodal logics. Journal of Symbolic Logic, 56(4):1469–1485, 1991.
21. D. Lehman. Knowledge, common knowledge, and related puzzles. In Proceedings of the 3rd ACM Symposium on Principles of Distributed Computing, pp. 62–67, 1984.
22. A. Lomuscio, R. van der Meyden, and M. Ryan. Knowledge in multi-agent systems: Initial configurations and broadcast. ACM Transactions on Computational Logic, 1(2), 2000.
23. A. Lomuscio and M. Sergot. Deontic interpreted systems. Studia Logica, 75(1):63–92, 2003.
24. A. Lomuscio and B. Woźna. A complete and decidable security-specialised logic and its application to the TESLA protocol. In Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS'06). ACM Press, 2006. To appear.
25. R. van der Meyden. Axioms for knowledge and time in distributed systems with perfect recall. In Proceedings of the 9th Annual IEEE Symposium on Logic in Computer Science, pp. 448–457. IEEE Computer Society Press, 1994.
26. R. van der Meyden and K. Wong. Complete axiomatizations for reasoning about knowledge and branching time. Studia Logica, 75(1):93–123, 2003.
27. W. Penczek and A. Lomuscio. Verifying epistemic properties of multi-agent systems via bounded model checking. Fundamenta Informaticae, 55(2):167–185, 2003.
28. A. Perrig, R. Canetti, J. D. Tygar, and D. X. Song. Efficient authentication and signing of multicast streams over lossy channels. In IEEE Symposium on Security and Privacy, pp. 56–73, May 2000.
29. R. Pucella. Deductive algorithmic knowledge. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics (SAIM'04), Online Proceedings: AI&M 22-2004, 2004.
30. F. Raimondi and A. Lomuscio. Verification of multiagent systems via ordered binary decision diagrams: an algorithm and its implementation. In Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS'04), volume II. ACM, July 2004.
31. M. Sato. A study of Kripke style methods for some modal logics by Gentzen's sequential method. Technical report, Publications of the Research Institute for Mathematical Sciences, 1977.
32. E. Spaan. Nexttime is not necessary. In Proceedings of the 3rd Conference on Theoretical Aspects of Reasoning about Knowledge, pp. 241–256, 1990.
33. W. van der Hoek and M. Wooldridge. Cooperation, knowledge, and time: Alternating-time temporal epistemic logic and its applications. Studia Logica, 75(1):125–157, 2003.
34. R. van der Meyden and H. Shilov. Model checking knowledge and time in systems with perfect recall. In Proceedings of the 19th Conference on Foundations of Software Technology and Theoretical Computer Science (FST&TCS'99), volume 1738 of LNCS, pp. 432–445. Springer-Verlag, 1999.
35. R. van der Meyden and K. Su. Symbolic model checking the knowledge of the dining cryptographers. In Proceedings of the 17th IEEE Computer Security Foundations Workshop (CSFW'04), pp. 280–291. IEEE Computer Society, 2004.
36. B. Woźna, A. Lomuscio, and W. Penczek. Bounded model checking for knowledge over real time. In Proceedings of the 4th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'05), volume I, pp. 165–172. ACM Press, 2005.
An Intensional Programming Approach to Multi-agent Coordination in a Distributed Network of Agents

Kaiyu Wan and Vasu S. Alagar

Department of Computer Science, Concordia University, Montreal, Quebec H3G 1M8, Canada
{ky_wan, alagar}@cse.concordia.ca

(This work is supported by grants from the Natural Sciences and Engineering Research Council, Canada.)
Abstract. We explore the suitability of the Intensional Programming paradigm for providing a programming model for coordinated problem solving in a multi-agent system. We extend our previous work on Lucx, an Intensional Programming Language extended with contexts as first class objects, to support coordination activities in a distributed network of agents. We study coordination constructs which can be applied to sequential programs and distributed transactions. We give formal syntax and semantics for the coordination constructs. The semantics for transaction expressions is given on top of the existing operational semantics of Lucx. The extended Lucx can be used for Internet-based agent applications.

Keywords: Multi-agent systems, coordinated transactions, Intensional Programming Language, coordination constructs.
1 Introduction

Our goal is to provide a programming model for a network of distributed agents solving problems in a coordinated fashion. We suggest an Intensional Programming Language, with contexts as first class objects and a minimal set of coordination constructs, to express the coordinated communication and computing patterns in agent-based systems. We give a formal syntax and semantics for the language and illustrate its power with a realistic example.

Intensional Programming Paradigm. Intensional logic is a branch of mathematical logic used to precisely describe context-dependent entities. According to Carnap, the real meaning of a natural language expression whose truth-value depends on the context in which it is uttered is its intension. The extension of that expression is its actual truth-value in the different possible contexts of utterance. For instance, the statement “The capital of China is Beijing” is intensional because its valuation depends on the context (here, the time) at which it is uttered. If this statement is uttered before 1949, its extensions are False (at that time, the capital was Nanjing). However, if it is uttered after 1949, its extensions are True. In the Intensional Programming (IP) paradigm, which has its foundations in intensional logic,
the real meaning of an expression is a function from contexts to values, and the value of the intension at any particular context is obtained by applying context operators to the intension. Basically, intensional programming provides intensions at the representation level, and extensions at the evaluation level. Hence, intensional programming allows for a more declarative way of programming without loss of accuracy.

Lucid was a data-flow language which evolved into a multidimensional Intensional Programming Language [1]. Lucid is a stream (i.e. infinite entity) manipulation language. The only data type in Lucid is the stream. The basic intensional operators are first, next, and fby. The four operators derived from the basic ones are wvr, asa, upon, and prev, where wvr stands for whenever, asa stands for as soon as, upon stands for advances upon, and prev stands for previous. All these operators are applied to streams to produce new streams. Example 1 illustrates the definitions of these operators (nil indicates an undefined value).

Example 1.
A = 1 2 3 4 5 . . .
B = 0 0 1 0 1 . . .
first A = 1 1 1 1 1 . . .
next A = 2 3 4 5 . . .
prev A = nil 1 2 3 4 5 . . .
A fby B = 1 0 0 1 0 1 . . .
A wvr B = 3 5 . . .
A asa B = 3 3 3 . . .
A upon B = 1 1 1 3 3 5 . . .

The following program computes the stream 1, 1, 2, 3, 5, . . . of all Fibonacci numbers:

result = fib
fib = 1 fby (fib + g)
g = 0 fby fib

Lucid allows the notion of context only implicitly. This restricts the ability of Lucid to express many requirements and constraints that arise in programming a complex software system. We have therefore extended Lucid by adding the capability to explicitly manipulate contexts. This is achieved by extending Lucid conservatively with context as a first class object. We call the resulting language Lucx (Lucid extended with contexts). Lucx provides more power for representing problems in different application domains and gives more flexibility in programming. We discuss Lucx, the context calculus which is its semantic foundation, and the multi-agent coordination constructs introduced in Lucx in Section 3.
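For readers more familiar with conventional languages, the stream operators of Example 1 can be mimicked with lazy streams; the Python generators below are our illustration only (Lucid's demand-driven, context-tagged semantics is richer than this one-dimensional simulation):

from itertools import islice

def first(a):                     # constant stream of a's first element
    x = next(iter(a))
    while True:
        yield x

def nxt(a):                       # Lucid's `next`: drop the first element
    it = iter(a)
    next(it)
    yield from it

def fby(a, b):                    # first element of a, followed by all of b
    it = iter(a)
    yield next(it)
    yield from b

def wvr(a, b):                    # values of a whenever b is non-zero
    for x, y in zip(a, b):
        if y:
            yield x

def fib():                        # result = fib; fib = 1 fby (fib + g); g = 0 fby fib
    f, g = 1, 0
    while True:
        yield f
        f, g = f + g, f

print(list(wvr([1, 2, 3, 4, 5], [0, 0, 1, 0, 1])))   # [3, 5]
print(list(islice(fib(), 6)))                        # [1, 1, 2, 3, 5, 8]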
Multiple-Agent Paradigm. By agents we mean software agents which can be personalized, continuously running, and semi-autonomous, driven by a set of beliefs, desires, and intentions (BDI). Instead of describing each agent in isolation, we consider agent types. An agent type is characterized by a set of services. A generic classification of agent types is: interface agent (IA), middle agent (MA), task agent (TA), and security agent (SA). The MA type can be specialized into arbitrator agent, match-maker agent, and broker agent. They play different roles, yet share the basic role of MA. All agents of an agent type have the same set of services. Agents are instances of agent types. An agent with this characterization is a black-box with interfaces to service its clients. It behaves according to the context and feedback from its environment. Two agents interact either directly or through intermediaries, who are themselves agents. An atomic interaction between two agents is a query initiated by one agent and received by the other agent at the interface which can service the query. An interaction among several agents is a collection of sequences of atomic interactions. Several methods are known in the field of distributed systems to characterize such behavior. Our goal is to explore intensional programming for expressing different interaction types, not to characterize the whole collection of interactions.

In order to understand interaction patterns, let us consider a typical business transaction launched by an agent. The agent acquires data from one or more remote agents, analyzes the data and computes some result, and based on the computed result invokes services from other agents. The agents may be invoked either concurrently or in sequence. In the latter case, the result of an agent's computation may be given as input to the next agent in the sequence. In the former case, it is possible that an agent splits a problem into subproblems and assigns each subproblem to an agent which has the expertise to solve it. We discuss the syntax and semantics for such coordination constructs in Lucx, under the assumption that an infrastructure exists to carry out the basic tasks expressed in the language.

A configuration is a collection of interacting agents, where each agent is an instance of an agent type. A configuration is simple if it has only one agent. An agent interacts with itself when it is engaged in some internal activity. More generally, the agents in a configuration communicate by exchanging messages through bidirectional communication channels. A channel is an abstract binding which, when implemented, will satisfy the specifications of the interfaces at the two ends of the channel. Thus, a configuration is a finite collection of agents and channels. The interfaces of a configuration are exactly those interfaces of the agents that are not bound by channels included in the configuration. In a distributed network of agents, each network node is either an agent or a configuration. Within a configuration, the pattern of computation is deterministic but need not be sequential. There are three major language components in the design of distributed multi-agent systems:

– (ACL) agent communication language
– (CCL) constraint choice language
– (COL) coordination language

These three languages have different design criteria. An ACL must support interoperability in the agent community while providing the freedom for an agent to hide or reveal its internal details to other agents. A CCL must be designed to support agent problem solving by providing explicit representations of choices and choice problems. A COL must support transaction specification and task coordination among the agents. The two existing ACLs are the Knowledge Query and Manipulation Language (KQML) and the FIPA agent communication language [5]. The FIPA language includes the basic concepts of KQML, yet they have slightly different semantics. Agents require a
content language to express information on constraints, which is encapsulated as a field within the performatives of an ACL. The FIPA Constraint Choice Language (CCL) is one such content language [6], designed to support agent problem solving by providing explicit representations of choices and choice problems. In our work [2, 15], we have shown the suitability of Lucx, an Intensional Programming Language (IPL), for agent communication as well as for choice representation. In this paper we extend Lucx with a small number of constructs to express task coordination. We are motivated by the following merits of Lucx:

1. Lucx allows the evaluation of expressions at contexts which are definable as first class objects in the language. The context calculus in Lucx provides the basis for expressing dynamically changing situations.
2. Performatives, expressible as context expressions, can be dynamically introduced, computed, and modified. A dialog between agents is expressible as a stream of performatives, and consequently, using stream functions such as first, next, and fby, a new dialog can be composed, decomposed, and analyzed based on existing dialogs.
3. Lucx allows a cleaner and more declarative way of expressing the computational logic and task propagation of a program, without loss of accuracy in interpreting the meaning of the program.
4. Lucx deals with infinite entities, which can be any simple or composite data values.
2 Basics of Agent Coordination

A coordination expression, Corde, in its simplest form is a message/function call from one agent to another agent. A general Corde is a message/function call from one configuration to another configuration. We introduce an abstract agent coordination expression S.a(C), where S is a configuration expression, a is the message (function call) at S, and C is a context expression (discussed in Section 3). The context C is combined with the local context. If the context parameter is omitted, it is interpreted as the current context. If the message is omitted, it is interpreted as a call to the default method invocation at S. The evaluation of a Corde returns a result (x, C′), where x is the result value and C′ is the context in which x is valid. In principle, x may be a single item or a stream. As an example, let B be a broker agent. The expression B.sell(dd) is a call to the broker to sell stocks as specified in the context expression dd, which may include the stock symbol, the number of shares to be sold, and the constraints on the transaction such as date and time or minimum share price. The evaluation of the expression involves contacting agent B with sell and dd as parameters. The agent B computes a result (x, dd′) and returns it to the agent A who invoked its services, for which the expression is A.receive(C′), where context C′ includes x and dd′. In this example, dd′ may include constraints on when the amount x can be deposited in A's bank account.

Composition Constructs. We introduce composition constructs for abstract coordination expressions and illustrate them with examples. A coordination in a multi-agent system requires the composition of configuration expressions. As an example, consider an agent-based system for flight ticket booking. Such a system should function with minimal human intervention. An interface agent, representing a client, asks a broker agent
to book a flight ticket whose quoted price is no more than $300. The broker agent may simultaneously contact two task agents, each representing an airline company, for price quotes. The broker agent will choose the cheaper ticket if both of the quoted prices are less than $300, make a commitment to the corresponding task agent, and then inform the interface agent about the flight information. If both prices are above $300, the broker agent will convey this information to the interface agent. The integrated activities of these agents to obtain the solution for the user are regarded as a transaction. Typically, the result from the interaction between two agents is used in some subsequent interaction, and results from simultaneously initiated interactions are compared to decide the next action. To meet these requirements, we provide several composition constructs, including sequential composition, parallel composition, and aggregation constructs. We informally explain the sequential and parallel composition constructs below.

The expression E = S1.a1(C1) > (x, C′) > S2.a2(C2) is a sequential composition of the two configuration expressions S1.a1(C1) and S2.a2(C2). The expression E is evaluated by first evaluating S1.a1(C1), and then calling S2 with each value (x, C′) returned by S1.a1(C1). The context C′ is substituted for C2 in the evaluation of S2.a2(C2). The expression S1.a1(C1) ‖ S2.a2(C2) is a parallel composition of the two configuration expressions S1.a1(C1) and S2.a2(C2). The evaluation of the two expressions is invoked simultaneously, and the result is the stream of values (x, C′) returned by the configurations, ordered by their time of delivery (available in C′).

Example 2. Let A (Alice) and B (Bob) be two interface agents, and M be a mediator agent. The mediator's service is to mediate a dispute between agents in the system. It receives the information from users and delivers a solution to them. In the expression

((B.notifies > m1) ‖ (A.notifies > m2)) > M.receives(C′) > (m3, C″) > (B.receives ‖ A.receives)

the mediator computes a compromise (default function) m3 for each pair of values (m1, m2) and delivers it to Bob and Alice. Context C′ includes m1, m2 and the local context in M. Context C″ is a constraint on the validity of the mediated solution m3.

The other constructs that we introduce are the And, Or, and Xor constructs, which enforce a certain order on expression evaluations. The And (◦) construct enforces the evaluation of more than one expression, although the order is not important. The Or construct chooses one of the evaluations nondeterministically. The Xor construct defines one of the expressions to be evaluated with priority. In addition, we introduce the Commit construct (com) to enable permanent state changes in the system after viewing the effect of a transaction. The syntax and semantics of these constructs in Lucx are given in Section 4. We can combine the where construct of Lucx with the above constructs to define parameterized expressions. Once defined, such expressions may be called from another expression.
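Operationally, the two composition constructs can be pictured as stream transformers. The following sketch is ours and deliberately naive — configurations are modelled as plain functions from a context to an iterator of (value, context) pairs, and the parallel merge is a simple interleaving rather than a delivery-time ordering:

def seq(e1, e2):
    # E1 > (x, C') > E2: feed the context of each E1-result into E2
    def run(ctx):
        for x, c in e1(ctx):
            yield from e2(c)              # C' is substituted for E2's context
    return run

def par(e1, e2):
    # E1 || E2: merge the two result streams (round-robin here)
    def run(ctx):
        for pair in zip(e1(ctx), e2(ctx)):
            yield from pair
    return run

# Two hypothetical airline task agents answering a price query:
quote_a = lambda ctx: iter([(280, {**ctx, 'airline': 'A'})])
quote_b = lambda ctx: iter([(320, {**ctx, 'airline': 'B'})])
both = par(quote_a, quote_b)
print([x for x, _ in both({'max_price': 300})])       # [280, 320]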
3 An Intensional Programming Model for Distributed Networks of Agents

Lucx [2, 15] is a conservative extension of Lucid [1], an intensional programming language. We have been exploring Lucx for a wide variety of programming applications.
In [14], we have studied real-time reactive programming models in Lucx. Recently we have given constraint programming models in Lucx [15]. In this section we review these works for agent communication and content description.

3.1 An Overview of the Intensional Programming Language Lucx

Syntax and Semantic Rules of Lucx. The syntax of Lucx [2, 15], shown in Figure 1, is sufficient for programming agent communication and content representation. The symbols @ and # are the context navigation and query operators. The non-terminals E and Q respectively refer to expressions and definitions. The abstract semantics of evaluation in Lucx is D, P ⊢ E : v, which means that in the definition environment D and in the evaluation context P, expression E evaluates to v. The definition environment D retains the definitions of all of the identifiers that appear in a Lucid program. Formally, D is a partial function D : Id → IdEntry, where Id is the set of all possible identifiers and IdEntry has five possible kinds of value: dimensions, constants, data operators, variables, and functions. The evaluation context P′ is the result of P † c, where P is the initial evaluation context, c is the defined context expression, and the symbol † denotes the overriding function. A complete operational semantics for Lucx is defined in [2, 15].

Lucx programs are evaluated in an interpreted mode called eduction [1]. Eduction can be described as tagged-token demand-driven dataflow, in which data elements (tokens) are computed on demand following a dataflow network defined in Lucid. Data elements flow in the normal flow direction (from producer to consumer) and demands flow in the reverse direction, both being tagged with their current context of evaluation.

Context Calculus. Informally, a context is a reference to a multidimensional stream, making an explicit reference to the dimensions and the tags (indexes) along each dimension. The formal definition is given in [15]. The syntax for a context is [d1 : x1, ..., dn : xn], where d1, ..., dn are dimension names and xi is the tag for dimension di. An atomic context with only one dimension and one tag is called a micro context. A context with distinct dimensions is called a simple context. Given an expression E and a context c, the Lucid expression E @ c directs the eduction engine to evaluate E in the context c: according to the semantics, E @ c gives the stream value at the coordinates referenced by c. In our previous papers [2, 14], we introduced a set of context operators, discussed below.

E ::= id | E(E1, ..., En) | if E then E else E | # | E @ C | E1, ..., En | select(E, E′) | E where Q
C ::= {E1, ..., En} | Box[E1, ..., En | E] | [E1 : E1′, ..., En : En′]
Q ::= dimension id | id = E | id(id1, ..., idn) = E | Q Q

Fig. 1. Abstract syntax for Lucx
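As a rough illustration of how eduction evaluates such definitions (a sketch under our own simplifying assumptions, not the Lucx implementation), the following Python fragment propagates demands for (variable, context) tokens over a single time dimension t and caches every computed token, as in a warehouse:

from functools import lru_cache

# Illustrative streams over one time dimension t:
#   nat    = 0 fby (nat + 1)       -- nat @ [t:n] = n
#   sumnat = running sum of nat    -- sumnat @ [t:n] = 0 + 1 + ... + n

@lru_cache(maxsize=None)
def demand(var: str, t: int) -> int:
    """Propagate a demand for `var` in context [t:t]; cache the token."""
    if var == "nat":
        return 0 if t == 0 else demand("nat", t - 1) + 1
    if var == "sumnat":
        return demand("nat", 0) if t == 0 else demand("sumnat", t - 1) + demand("nat", t)
    raise KeyError(var)

print([demand("sumnat", t) for t in range(5)])   # [0, 1, 3, 6, 10]

Demands flow backwards through the definitions (from sumnat to nat) while computed values flow forward, both tagged with their current context t.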
Table 1. Syntax and precedence rules for context operators

Syntax: a context expression C is either a context c or is built from context expressions with the comparison operators =, ⊆, ⊇, the choice operator |, substitution /, override ⊕, difference, conjunction, disjunction, the undirected and directed range operators, and the selection operators ↓ and ↑ (each taking a dimension set D).

Precedence, from highest to lowest:
1. ↓, ↑, /
2. |
3. undirected range, directed range
4. ⊕, difference
5. conjunction, disjunction
6. =, ⊆, ⊇
The override (⊕) is similar to function override; difference, comparison (=), conjunction, and disjunction are similar to set operators; projection (↓) and hiding (↑) are selection operators; the constructor [ : ] is used to construct an atomic context; substitution (/) is used to substitute values for selected tags in a context; choice (|) accepts a finite number of contexts and nondeterministically returns one of them; the undirected range and directed range operators produce a set of contexts. The formal definitions of these operators can be found in [15]. Table 1 shows the formal syntax of context expressions together with the precedence rules for the context operators, listed from highest to lowest. Parentheses are used to override this precedence when needed, and operators having equal precedence are applied from left to right. Rules for evaluating context expressions are given in [15].

Example 3. The precedence rules shown in Table 1 are applied in the evaluation of the well-formed context expression c3 ↑ D ⊕ c1 | c2, where c1 = [x : 3, y : 4, z : 5], c2 = [y : 5], c3 = [x : 5, y : 6, w : 5], and D = {w}. The evaluation steps are as follows:
Step 1. c3 ↑ D = [x : 5, y : 6] (definition of ↑)
Step 2. c1 | c2 = c1 or c2 (definition of |)
Step 3. If c1 is chosen in Step 2, then c3 ↑ D ⊕ c1 = [x : 3, y : 4, z : 5]; if c2 is chosen, then c3 ↑ D ⊕ c2 = [x : 5, y : 5] (definition of ⊕).

A context which is neither a micro context nor a simple context is called a non-simple context. In general, a non-simple context is equivalent to a set of simple contexts [2]. In several applications we deal with sets of contexts that have the same dimension set ∆ ⊆ DIM and whose tags satisfy a constraint p. The shorthand notation for such a set is the syntax Box[∆ | p].

Definition 1. Let ∆ = {d1, ..., dk}, where di ∈ DIM for i = 1, ..., k, and let p be a k-ary predicate defined on the tuples of the relation Π_{d∈∆} f_dimtotag(d). The syntax Box[∆ | p] = {s | s = [d1 : x1, ..., dk : xk]}, where the tuple (x1, ..., xk), with xi ∈ f_dimtotag(di) for i = 1, ..., k, satisfies the predicate p, introduces a set S of contexts of degree k. For each context s ∈ S, the values in tag(s) satisfy the predicate p.
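As an illustration of the context calculus, a context can be modeled as a dictionary from dimension names to tags. The sketch below implements override (⊕), projection (↓), hiding (↑), and choice (|) under that assumption, with hypothetical function names of our own, and replays Example 3:

import random

def override(c1, c2):          # c1 ⊕ c2: c2's tags win on shared dimensions
    return {**c1, **c2}

def projection(c, dims):       # c ↓ D: keep only the dimensions in D
    return {d: t for d, t in c.items() if d in dims}

def hiding(c, dims):           # c ↑ D: drop the dimensions in D
    return {d: t for d, t in c.items() if d not in dims}

def choice(*contexts):         # c1 | c2 | ...: pick one nondeterministically
    return random.choice(contexts)

# Example 3: evaluate c3 ↑ D ⊕ (c1 | c2)
c1, c2 = {"x": 3, "y": 4, "z": 5}, {"y": 5}
c3, D = {"x": 5, "y": 6, "w": 5}, {"w"}
step1 = hiding(c3, D)                  # [x:5, y:6]
step2 = choice(c1, c2)                 # c1 or c2
print(override(step1, step2))          # [x:3, y:4, z:5] or [x:5, y:5]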
Table 2. Precedence rules for Box operators

Syntax: a Box expression B is either a box b or is built from Box expressions with the choice operator |, the selection operators ↓ and ↑ (each taking a dimension set D), and the three Box-specific binary operators introduced below.

Precedence, from highest to lowest:
1. ↓, ↑
2. |
3. the three Box-specific operators (equal precedence)
Many of the context operators introduced above can be naturally lifted to sets of contexts, in particular to Boxes. In addition, we have defined three operators exclusively for Boxes; they have equal precedence and have semantics analogous to relational algebra operators. Table 2 shows the formal definition of Box expressions B and the precedence rules for Box operators. We use the symbol D to denote a dimension set.

An Example of a Lucx Program. Consider the problem of finding the solutions in positive integers that satisfy the constraint x³ + y³ + z³ + u³ = 100.
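Definition 1 suggests a direct operational reading of Box[∆ | p] as a constrained set of contexts. The following brute-force sketch, with our own illustrative names and context representation, enumerates the Box for the cubic constraint above:

from itertools import product

def box(dims, tags_for, p):
    """Box[Delta | p]: the set of contexts over the dimension set `dims`
    whose tags (drawn from tags_for[d]) satisfy the predicate p."""
    out = []
    for tags in product(*(tags_for[d] for d in dims)):
        ctx = dict(zip(dims, tags))
        if p(ctx):
            out.append(ctx)
    return out

# Contexts over {x, y, z, u} with positive integer tags whose cubes sum
# to 100; tags above 4 are impossible since 5**3 = 125 > 100.
dims = ("x", "y", "z", "u")
tags = {d: range(1, 5) for d in dims}
sols = box(dims, tags, lambda c: sum(c[d] ** 3 for d in dims) == 100)
print(sols[0])   # {'x': 1, 'y': 2, 'z': 3, 'u': 4}: 1 + 8 + 27 + 64 = 100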
(Each tableau rule below is written as goal / sub-goals.)

R6 <Cr> : E(Φ, Cr(Ag1, SC(Ag1, Ag2, φ))) / E(Φ, SC(Ag1, Ag2, φ))
R7 <Wit> : E(Φ, Wit(Ag1, SC(Ag1, Ag2, φ))) / E(Φ, ¬SC(Ag1, Ag2, φ))
R8 <SatAg1> : E(Φ, Sat(Ag1, SC(Ag1, Ag2, φ))) / E(Φ, φ)
R9 <VioAg1> : E(Φ, Vio(Ag1, SC(Ag1, Ag2, φ))) / E(Φ, ¬φ)
R10 <Ch> : E(Φ, Ch(Ag2, SC(Ag1, Ag2, φ))) / E(Φ, SC(Ag2, Ag1, ?φ))
R11 <Ac> : E(Φ, Ac(Ag2, SC(Ag1, Ag2, φ))) / E(Φ, SC(Ag2, Ag1, φ))
R12 <Ref> : E(Φ, Ref(Ag2, SC(Ag1, Ag2, φ))) / E(Φ, SC(Ag2, Ag1, ¬φ))
R13 <Jus> : E(Φ, Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)) / E(Φ, SC(Ag1, Ag2, φ2 ∴ φ1))
R14 <At> : E(Φ, At(Ag2, SC(Ag1, Ag2, φ1), φ2)) / E(Φ, SC(Ag2, Ag1, φ2 ∴ ¬φ1))
R15 <Def> : E(Φ, Def(Ag1, SC(Ag1, Ag2, φ1), φ2)) / E(Φ, SC(Ag1, Ag2, φ2 ∴ φ1))
R16 [SCAg1] : E(Φ, SC(Ag1, Ag2, φ)) / E(Φ, φ)
R17 : E(Φ, Ψ) / E(Φ) E(Ψ)
R18 ∧ : E(Φ, φ1 ∧ φ2) / E(Φ, φ1, φ2)
R19 ∨ : E(Φ, φ1 ∨ φ2) / E(Φ, φ1) E(Φ, φ2)
R20 X : E(Φ, Xφ1, ..., Xφn) / E(Φ, φ1, ..., φn)
R21 ∧ : E(Φ, φ1 ∴ φ2) / E(Φ, φ1, X(¬φ1 ∨ φ2))
R22 ∨ : E(Φ, φ1 U φ2) / E(Φ, φ2) E(Φ, φ1, X(φ1 U φ2))
Rule R1, labeled by "∧", indicates that ψ1 and ψ2 are the two sub-formulae of ψ1 ∧ ψ2. This means that, in order to prove that a state labeled by "∧" satisfies the formula ψ1 ∧ ψ2, we have to prove that the two children of this state satisfy ψ1 and ψ2 respectively. According to rule R2, in order to prove that a state
labeled by "∨" satisfies the formula ψ1 ∨ ψ2, we have to prove that one of the two children of this state satisfies ψ1 or ψ2. Rule R3, labeled by "∨", indicates that ψ is the sub-formula to be proved in order to prove that a state satisfies E(ψ); E is the existential path quantifier. According to R4, the formula ¬ψ is satisfied in a state labeled by "¬" if this state has a successor representing ψ. R5 is defined in the usual way. The label "<Cr>" (R6) is associated with the creation action of a social commitment. According to this rule, in order to prove that a state labeled by "<Cr>" satisfies Cr(Ag1, SC(Ag1, Ag2, φ)), we have to prove that the child state satisfies the sub-formula SC(Ag1, Ag2, φ): by creating a social commitment, this commitment becomes true in the child state. In the model representing the dialogue game protocol, the corresponding idea is that by creating a social commitment, this commitment becomes true in the state accessible via the transition labeled by the creation action. The label "<Wit>" (R7) is associated with the withdrawal action of a social commitment. According to this rule, in order to prove that a state labeled by "<Wit>" satisfies Wit(Ag1, SC(Ag1, Ag2, φ)), we have to prove that the child state satisfies the sub-formula ¬SC(Ag1, Ag2, φ). Rules R8 to R15 are defined in the same way. For example, the idea of rule R11 is that when an agent Ag2 accepts a social commitment whose content is φ, this agent commits to this content in the child state; in this state, the commitment of Ag2 becomes true. In rule R10, we introduce a syntactical construct "?" to indicate that the debtor Ag2 does not have an argument supporting φ or ¬φ. The idea of this rule is that by challenging a social commitment, Ag2 commits in the child state to the fact that it does not have an argument for or against the content φ. Rule R16 indicates that E(φ) is the sub-formula of E(SC(Ag1, Ag2, φ)). Thus, in order to prove that a state labeled by "[SCAg1]" satisfies the formula E(SC(Ag1, Ag2, φ)), we have to prove that the child state satisfies the sub-formula E(φ). According to the semantics of social commitments (Section 3), the idea of this rule is that if an agent commits to a content along a path, this content is true along this path (recall that the commitment content is a path formula). Rules R17, R18, and R19 are straightforward. According to rule R20, and in accordance with the semantics of "X", in order to prove that a state labeled with "X" satisfies E(Xφ), we have to prove that the child state satisfies the sub-formula E(φ). According to R21, and in accordance with the semantics of "∴" (Section 3), in order to prove that a state labeled with "∧" satisfies E(φ1 ∴ φ2), we have to prove that the child state satisfies the sub-formula E(φ1 ∧ X(¬φ1 ∨ φ2)): the support is true now, and next, if the support is true then the conclusion is true. Finally, rule R22 is defined in accordance with the usual semantics of the until operator.

5.2 Alternating Büchi Tableau Automata (ABTA) for ACTL*
As a kind of Büchi automata, ABTAs [7] are used to prove properties of infinite behavior. These automata can be used as an intermediate representation
for system properties. Let Γp be the set of atomic propositions and let ℒ be a set of tableau rule labels defined as follows (the partition of ℒ is used only for readability and organization): ℒ = {∧, ∨, ¬} ∪ Act ∪ ¬Act ∪ SC ∪ Set, where Act = {<Cr>, <Wit>, <SatAg>, <VioAg>, <Ch>, <Ac>, <Ref>, <Jus>, <At>, <Def>}, SC = {[SCAg]}, and Set = {, X}. We define ABTAs for ACTL* logic as follows.

Definition 3 (ABTA). An ABTA for ACTL* is a 5-tuple ⟨Q, l, →, q0, F⟩, where Q is a finite set of states; l : Q → Γp ∪ ℒ is the state labeling; → ⊆ Q × Q is the transition relation; q0 is the start state; and F ⊆ 2^Q is the acceptance condition (this notion is related to the notion of accepting run that we define in Section 5.4).

ABTAs allow us to encode "top-down proofs" for temporal formulae. Indeed, an ABTA encodes a proof schema in order to prove, in a goal-directed manner, that a TS satisfies a temporal formula. Consider the following example: we would like to prove that a state s in a TS satisfies a temporal formula of the form F1 ∧ F2, where F1 and F2 are two formulae. Regardless of the structure of the system, there would be two sub-goals: the first would be to prove that s satisfies F1, and the second would be to prove that s satisfies F2. Intuitively, an ABTA for F1 ∧ F2 would encode this "proof structure" using states for the formulae F1 ∧ F2, F1, and F2: a transition is added from the state for F1 ∧ F2 to each of the states for F1 and F2, and the state for F1 ∧ F2 is labeled by "∧", the label of the corresponding rule. Indeed, in an ABTA, we can consider that: 1) states correspond to "formulae", 2) the labeling of a state is the "logical operator" used to construct the formula, and 3) the transition relation represents a "sub-goal" relationship.
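To fix intuitions, Definition 3 can be transcribed directly into a data structure. The following Python rendering, whose representation choices are ours and not the authors', encodes the F1 ∧ F2 proof structure just described:

from dataclasses import dataclass, field

@dataclass
class ABTA:
    """The 5-tuple <Q, l, ->, q0, F> of Definition 3."""
    states: set                    # Q
    label: dict                    # l : Q -> atomic proposition or rule label
    transitions: set               # -> as a set of (q, q') pairs
    start: str                     # q0
    acceptance: list = field(default_factory=list)   # F: subsets of Q

# Proof structure for F1 ∧ F2: the state for the conjunction is labeled
# "∧" and has a sub-goal transition to a state for each conjunct.
b = ABTA(
    states={"F1∧F2", "F1", "F2"},
    label={"F1∧F2": "∧", "F1": "F1", "F2": "F2"},
    transitions={("F1∧F2", "F1"), ("F1∧F2", "F2")},
    start="F1∧F2",
)
print(sorted(q for p, q in b.transitions if p == b.start))   # ['F1', 'F2']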
5.3 Translating ACTL* into ABTA (Step 1)
The procedure for translating an ACTL* formula p = E(φ) into an ABTA B uses goal-directed rules to build a tableau from this formula. Indeed, these proof rules are applied in a top-down fashion in order to determine whether states satisfy properties. The tableau is constructed by exhaustively applying the tableau rules presented in Table 1 to p. Then, B can be extracted from this tableau as follows. First, we generate the states and the transitions. Intuitively, states will correspond to state formulae, with the start state being p. To generate new states from an existing state for a formula p′, we determine which rule is applicable to p′, starting with R1, by comparing the form of p′ to the formula appearing in the "goal position" of each rule. Let rule(q) denote the rule applied at node q. The labeling function l of states is defined as follows: if q does not have any successor, then l(q) ∈ Γp; otherwise, the successors of q are given by rule(q), the label of the rule becomes the label of the state q, and the sub-goals of the rule are added as states related to q by transitions.
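The extraction of B from the tableau can be pictured as a worklist loop. The sketch below abbreviates the rule table to two propositional stand-ins for the ∧ and ∨ rules; all encodings and names are ours:

def rule_for(formula):
    """Return (rule label, sub-goals) for the first rule matching
    `formula`, or None when the formula is atomic (a leaf)."""
    if isinstance(formula, str):          # atomic proposition
        return None
    op, *subs = formula                   # e.g. ("and", f1, f2)
    return ("∧" if op == "and" else "∨"), subs

def translate(goal):
    """Generate states, labels, and transitions in a goal-directed way."""
    states, labels, transitions = {goal}, {}, set()
    worklist = [goal]
    while worklist:
        p = worklist.pop()
        match = rule_for(p)
        labels[p] = p if match is None else match[0]  # leaf keeps its proposition
        if match:
            for sub in match[1]:          # sub-goals become successor states
                transitions.add((p, sub))
                if sub not in states:
                    states.add(sub)
                    worklist.append(sub)
    return states, labels, transitions

# (p ∨ q) ∧ r, encoded with tuples:
goal = ("and", ("or", "p", "q"), "r")
_, labels, trans = translate(goal)
print(labels[goal], sorted(trans, key=str))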
Table 2. The tableau of Formula 1

(1) ¬ : AG(¬Ch(Ag2, SC(Ag1, Ag2, φ1)) ∨ F(Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(2) ∨ : EF(Ch(Ag2, SC(Ag1, Ag2, φ1)) ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(3) : E(Ch(Ag2, SC(Ag1, Ag2, φ1)) ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(4) <X> : EX(F(Ch(Ag2, SC(Ag1, Ag2, φ1)) ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))), with child (2)
(5) [SCAg2] : E(SC(Ag2, Ag1, ?φ1) ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(6) : E(?φ1 ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(7) ?φ1
(8) ∨ : E(G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(9) : E(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2), XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(10) [SCAg1] : E(SC(Ag1, Ag2, φ1 ∴ φ2), XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(11) ∧ : E(φ2 ∴ φ1, XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(12) : E(φ2, X(¬φ2 ∨ φ1), XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(13) φ2
(14) X : E(X(¬φ2 ∨ φ1), XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(15) : E((¬φ2 ∨ φ1), XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
(16) ¬φ2 ∨ φ1
(17) X : E(XG(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2))), with child (8)
A tableau for an ACTL* formula p is a maximal proof tree having p as its root and constructed using our tableau rules (see Section 5.1). If p′ results from the application of a rule to p, then we say that p′ is a child of p in the tableau. The height of a tableau is defined as the length of the longest sequence ⟨p0, p1, . . .⟩, where pi+1 is a child of pi [11].

Example 1. In order to illustrate the translation procedure and the construction of an ABTA from an ACTL* formula, let us consider Formula 1 given in Section 4. Table 2 shows the tableau built for translating Formula 1 into an ABTA. The form of Formula 1 is AG(p ⇒ q) (≡ AG(¬p ∨ q)); this is the root of Table 2. The first rule we can apply is R5, labeled by ¬, in order to transform the universal path quantifier into an existential one; we also use the equivalence F(p) ≡ ¬G(¬p). We then obtain the child (2). The next rule we can apply is R22, labeled by ∨, because F is an abbreviation of U (F(p) ≡ True U p). Consequently, we obtain two children, (3) and (4). From the child (3) we obtain the child (5) by applying rule R10, from the child (4) we obtain the child (2) by applying rule R20, and so on. The ABTA obtained from this tableau is illustrated by Fig. 3. States are labeled by the node's number in the tableau and the label of the applied rule according to Table 2. The termination proof of the translation procedure rests on the finiteness of the tableau, using the length of formulae and an ordering relation between these formulae; the proof is detailed in [4].
Fig. 3. The ABTA of Formula 1
5.4 Run of an ABTA on a Transition System (Step 2)
As in the automata-based model checking of PLTL, we use the notion of accepting runs to decide the satisfaction of formulae. In our technique, we need to define the accepting runs of an ABTA on a TS. First, we define the notion of a run of an ABTA. For this purpose, we introduce two types of nodes: positive and negative. Intuitively, positive nodes correspond to a formula without negation, and negative nodes correspond to a formula with negation. Definition 4 gives the definition of this notion of run; in this definition, elements of the set S of states are denoted si or ti.
Definition 4 (Run of an ABTA). A run of an ABTA B = ⟨Q, l, →, q0, F⟩ on a TS T = ⟨S, Lab, ℘, L, Act, −→, s0⟩ is a graph in which the nodes are classified as positive or negative and are labeled by elements of Q × S as follows:

1. The root of the graph is a positive node and is labeled by ⟨q0, s0⟩.
2. For a positive node ϕ with label ⟨q, si⟩:
(a) If l(q) = ¬ and q → q′, then ϕ has one negative successor labeled ⟨q′, si⟩, and vice versa.
(b) If l(q) ∈ Γp, then ϕ is a leaf.
(c) If l(q) ∈ {∧, } and {q′ | q → q′} = {q1, . . . , qm}, then ϕ has positive successors ϕ1, . . . , ϕm with ϕj labeled by ⟨qj, si⟩ (1 ≤ j ≤ m).
(d) If l(q) = ∨, then ϕ has one positive successor ϕ′ labeled by ⟨q′, si⟩ for some q′ ∈ {q′ | q → q′}.
(e) If l(q) = X and q → q′ and {s′ | si −•→ s′, • ∈ Act} = {t1, . . . , tm}, then ϕ has positive successors ϕ1, . . . , ϕm with ϕj labeled by ⟨q′, tj⟩ (1 ≤ j ≤ m).
(f) If l(q) = <•> where • ∈ Act and q → q′, and si −•→ si+1, then ϕ has one positive successor ϕ′ labeled by ⟨q′, si+1,0⟩, where si+1,0 is the initial state of the decomposition TS of si+1.
(g) If l(q) = <•> where • ∈ ¬Act and q → q′, and si −•′→ si+1 where •′ ≠ • and •′ ∈ Act, then ϕ has one positive successor ϕ′ labeled by ⟨q′, si+1⟩.
3. For a negative node ϕ labeled by ⟨q, si⟩:
(a) If l(q) ∈ Γp, then ϕ is a leaf.
(b) If l(q) ∈ {∨, } and {q′ | q → q′} = {q1, . . . , qm}, then ϕ has negative successors ϕ1, . . . , ϕm with ϕj labeled by ⟨qj, si⟩ (1 ≤ j ≤ m).
(c) If l(q) = ∧, then ϕ has one negative successor ϕ′ labeled by ⟨q′, si⟩ for some q′ ∈ {q′ | q → q′}.
(d) If l(q) = X and q → q′ and {s′ | si −•→ s′, • ∈ Act} = {t1, . . . , tm}, then ϕ has negative successors ϕ1, . . . , ϕm with ϕj labeled by ⟨q′, tj⟩ (1 ≤ j ≤ m).
(e) If l(q) = <•> where • ∈ Act and q → q′, and si −•→ si+1, then ϕ has one negative successor ϕ′ labeled by ⟨q′, si+1,0⟩, where si+1,0 is the initial state of the decomposition TS of si+1.
(f) If l(q) = <•> where • ∈ ¬Act and q → q′, and si −•′→ si+1 where •′ ≠ • and •′ ∈ Act, then ϕ has one negative successor ϕ′ labeled by ⟨q′, si+1⟩.
4. Otherwise, for a positive (negative) node ϕ labeled by ⟨q, si,j⟩:
(a) If l(q) = and {q′ | q → q′} = {q1, q2} such that q1 is a leaf, and si,j has a successor si,j+1, then ϕ has one positive leaf successor ϕ′ labeled by ⟨q1, si,j⟩ and one positive (negative) successor ϕ′′ labeled by ⟨q2, si,j+1⟩.
(b) If l(q) = and {q′ | q → q′} = {q1, q2} such that q1 is a leaf, and si,j has no successor, then ϕ has one positive leaf successor ϕ′ labeled by ⟨q1, si,j⟩ and one positive (negative) successor ϕ′′ labeled by ⟨q2, si⟩.
(c) If l(q) ∈ {∧, ∨, X, [SCAg]} and {q′ | q → q′} = {q1}, and si,j −r→ si,j+1 such that r = l(q), then ϕ has one positive (negative) successor ϕ′ labeled by ⟨q1, si,j+1⟩.
The notion of a run of an ABTA on a TS is a non-synchronized product graph of the ABTA and the TS (see Fig. 1). This run uses the labels of nodes in the ABTA (l(q)), the transitions in the ABTA (q → q′), and the transitions in the TS (si −•→ sj). The product is not synchronized in the sense that it is possible to use transitions in the ABTA while staying in the same state of the TS (this is the case, for example, in clauses 2.a, 2.c, and 2.d). Clause 2.a of the definition says that if we have a positive node ϕ in the product graph such that the corresponding state in the ABTA is labeled with ¬, and we have a transition q → q′ in this ABTA, then ϕ has one negative successor labeled with ⟨q′, si⟩: in this case we use a transition from the ABTA and stay in the same state of the TS. In the case of a positive node, if the current state of the ABTA is labeled with ∧, all the transitions of this current state of the ABTA are used (clause 2.c); however, if the current state of the ABTA is labeled with ∨, only one arbitrarily chosen transition from the ABTA is used (clause 2.d). The intuitive idea is that in the case of ∧, all the sub-formulae must be true in order to decide about the formula of the current node of the ABTA, whereas in the case of ∨ only one sub-formula must be true. The cases in which a transition of the TS is used are:

1. The current node of the ABTA is labeled with X (which refers to a next state in the TS). This is the case in clauses 2.e and 3.d. Here we use all the transitions from the current state si to next states of the TS.
2. The current state of the ABTA and a transition from the current state of the TS are labeled with the same action. This is the case in clauses 2.f and 3.e. Here, the current transition of the ABTA and the transition from the current state si of the TS to a state si+1,0 of the associated decomposition TS are used. The idea is to start parsing the formula coded in the decomposition TS.
3. The current state of the ABTA and a transition from the current state of the TS are labeled with different actions, where the state of the ABTA is labeled with a negative formula. This is the case in clauses 2.g and 3.f. In this case the formula is satisfied; consequently, the current transition of the ABTA and the transition from the current state si of the TS to a next state si+1 are used.

Finally, clauses 4.a, 4.b, and 4.c deal with verifying the structure of the commitment formulae in the sub-TS. In these clauses, transitions si,j −r→ si,j+1 are used. We note that when si,j has no successor, the formula contained in this state is an atomic formula or a boolean formula all of whose sub-formulae are atomic (for example p ∧ q, where p and q are atomic).

Example 2. Fig. 4 illustrates an example of the run of an ABTA. The figure shows a part of the automaton B⊗ resulting from the product of the TS of Fig. 2 and the ABTA of Fig. 3. According to clause 1 (Definition 4), the root is a positive node and is labeled by ⟨¬, s0⟩, because the label of the ABTA's root is ¬ (Fig. 3). Consequently, according to clause 2.a, its successor is a negative node labeled by ⟨∨, s0⟩. According to clause 3.b, this second node has two negative successors labeled by ⟨ , s0⟩ and ⟨X, s0⟩, etc.
Fig. 4. An example of an ABTA’s run
In an ABTA, every infinite path has a suffix that contains either positive or negative nodes, but not both. Such a path is referred to as positive in the former case and negative in the latter. Now we can define the notion of accepting (or successful) runs. Let p ∈ Γp and let si be a state in a TS T. Then si |=T p iff p ∈ Lab(si), and si |=T ¬p iff p ∉ Lab(si). Let si,j be a state in a decomposition TS of a TS T. Then si,j |=T p iff p ∈ Lab′(si,j), and si,j |=T ¬p iff p ∉ Lab′(si,j).

Definition 5 (Successful Run). Let r be a run of an ABTA B = ⟨Q, l, →, q0, F⟩ on a TS T = ⟨S, Lab, ℘, L, Act, −→, s0⟩. The run r is successful iff every leaf and every infinite path in r is successful. A successful leaf is defined as follows:
1. A positive leaf labeled by ⟨q, si⟩ is successful iff si |=T l(q), or l(q) = <•> where • ∈ Act and there is no sj such that si −•→ sj.
2. A positive leaf labeled by ⟨q, si,j⟩ is successful iff si,j |=T l(q).
3. A negative leaf labeled by ⟨q, si⟩ is successful iff si |=T ¬l(q), or l(q) = <•> where • ∈ Act and there is no sj such that si −•→ sj.
4. A negative leaf labeled by ⟨q, si,j⟩ is successful iff si,j |=T ¬l(q).
A successful infinite path is defined as follows:
1. A positive path is successful iff ∀f ∈ F, ∃q ∈ f such that q occurs infinitely often in the path. This condition is called the Büchi condition.
2. A negative path is successful iff ∃f ∈ F, ∀q ∈ f, q does not occur infinitely often in the path. This condition is called the co-Büchi condition.

We note here that a positive or negative leaf labeled by ⟨q, si⟩ such that l(q) = <•>, where • ∈ Act and there is no s′ such that si −•→ s′, is considered a successful leaf because we cannot consider it unsuccessful: it is possible to find a transition labeled by • and starting from another state s′ in the TS. If we considered such a leaf unsuccessful, then even when a successful infinite path exists, the run would be considered unsuccessful, which is wrong. An ABTA B accepts a TS T iff there exists a successful run of B on T. In order to compute the successful runs of the generated ABTA, we must compute the acceptance states F. For this purpose we use the following definition.

Definition 6 (Acceptance States). Let q be a state in an ABTA B and Q the set of all states. Suppose φ = φ1 U φ2 ∈ q (we consider until formulae because the until operator is what allows paths to be infinite). We define the set Fφ as follows: Fφ = {q′ ∈ Q | (φ ∉ q′ and Xφ ∉ q′) or φ2 ∈ q′}. The acceptance set F is defined as follows: F = {Fφ | φ = φ1 U φ2 and ∃q ∈ B, φ ∈ q}.

According to this definition, a state that contains the formula φ or the formula Xφ is not an acceptance state. The reason is that, according to Definition 4, there is a transition from a state containing φ to a state containing Xφ and vice versa; therefore, according to Definition 5, there would be a successful run in the ABTA B. However, we cannot decide about the satisfaction of a formula using this run: in an infinite cycle including a state containing φ and a state containing Xφ, we cannot be sure that a state containing φ2 is reachable. Yet, according to the semantics of U, the satisfaction of φ requires that a state containing φ2 be reachable while passing through states containing φ1.

Example 3. In order to compute the acceptance states of the ABTA of Fig. 3, we use the formula associated with the child number (2) in Table 2: F(Ch(Ag2, SC(Ag1, Ag2, φ1)) ∧ G(¬Jus(Ag1, SC(Ag1, Ag2, φ1), φ2)))
infinitely often. Therefore, this path satisfies the B¨ uchi condition. The path visiting the state (3) and infinitely often the state (9) does not satisfy Formula 1 because there is a challenge action (state (3)), and globally no justification action of the content of the challenged commitment (state (9)). 5.5
5.5 Model Checking Algorithm (Step 3)
Our model checking algorithm for verifying that a dialogue game protocol satisfies a given property, and that it respects the decomposition semantics of the underlying communicative acts, is inspired by the procedure proposed in [7]. Like the algorithm proposed in [14], our algorithm explores the product graph of an ABTA representing an ACTL* formula and a TS for a dialogue game protocol. It is an on-the-fly (local) algorithm that checks whether a TS is accepted by an ABTA. This ABTA-based model checking reduces to the emptiness problem of Büchi automata [31]: given an automaton A, decide whether its language L(A), the set of words accepted by A, is empty.

Let T = ⟨S, Lab, ℘, L, Act, −→, s0⟩ be a TS for a dialogue game and let B = ⟨Q, l, →, q0, F⟩ be an ABTA for ACTL*. The procedure consists of building the product automaton B⊗ of T and B while checking whether there is a successful run in B⊗; the existence of such a run means that the language of B⊗ is non-empty. The automaton B⊗ is defined as B⊗ = ⟨Q × S, →B⊗, q0B⊗, FB⊗⟩. There is a transition between two nodes ⟨q, s⟩ and ⟨q′, s′⟩ iff there is a transition between these two nodes in some run of B on T; intuitively, B⊗ simulates all the runs of the ABTA. The set of accepting states FB⊗ is defined as follows: ⟨q, s⟩ ∈ FB⊗ iff q ∈ F.

Unlike the algorithms proposed in [7, 14], our algorithm uses only one depth-first search (DFS) instead of two, because it explores the product graph directly, using the sign of the nodes (positive or negative). In addition, our algorithm does not distinguish between recursive and non-recursive nodes; therefore, we do not take into account the strongly connected components of the ABTA, but use a marking algorithm that works directly on the product graph. The idea of this algorithm is to construct the product graph while exploring it. The construction procedure is obtained directly from Definition 4; the algorithm uses the labels of nodes in the ABTA and the transitions in the product graph obtained from the TS and the ABTA, as explained in Definition 4. In order to decide whether the ABTA contains an infinite successful run, all explored nodes are marked "visited". Thus, when the algorithm reaches a visited node, it returns false if the infinite path so detected is not successful. If the node has not been visited yet, the algorithm tests whether it is a leaf, and if so returns false exactly when the leaf is unsuccessful. If the explored node is not a leaf, the algorithm recursively explores its successors. If the node is labeled by "∧" and signed positively, it returns false if one of the successors is false; if it is signed negatively, it returns false only if all the successors are false. A dual treatment is applied when the node is labeled by "∨".
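One possible shape for this single-DFS marking procedure is sketched below. It is a deliberate simplification of the algorithm described above: the successor relation, labels, leaf tests, and the cycle (path) condition are delegated to caller-supplied functions, and all names are ours.

def check(node, sign, succ, label, leaf_ok, path_ok, visited=None):
    """On-the-fly DFS over the product graph. `sign` is +1 for positive
    nodes and -1 for negative ones; `succ`, `label`, `leaf_ok`, and
    `path_ok` abstract Definitions 4 and 5."""
    if visited is None:
        visited = set()
    key = (node, sign)
    if key in visited:                   # reached a visited node: a cycle
        return path_ok(key)              # false iff the infinite path fails
    visited.add(key)
    kids = succ(node, sign)
    if not kids:                         # leaf: Definition 5's leaf rules
        return leaf_ok(node, sign)
    if label(node) == "¬":               # negation flips the child's sign
        return check(kids[0], -sign, succ, label, leaf_ok, path_ok, visited)
    results = [check(k, sign, succ, label, leaf_ok, path_ok, visited)
               for k in kids]
    # "∧" on a positive node and "∨" on a negative one need all children;
    # the dual cases succeed as soon as one child succeeds.
    conjunctive = (label(node) == "∧") == (sign > 0)
    return all(results) if conjunctive else any(results)

# Tiny demo: a root labeled "∧" with two successful leaf children.
succ = lambda n, s: ["leaf1", "leaf2"] if n == "root" else []
print(check("root", +1, succ, lambda n: "∧",
            lambda n, s: True, lambda k: False))   # True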
Example 4. In order to check whether the language of the automaton illustrated by Fig. 4 is empty, we check whether there is a successful run. The idea is to verify whether B⊗ contains an infinite path visiting state (3) and infinitely often state (9) of the ABTA of Fig. 3. If such a path exists, then we conclude that Formula 1 is not satisfied by the TS of Fig. 2. Indeed, the only infinite path of B⊗ is successful because it does not touch any acceptance state, and all leaves are also successful. For instance, the leaf labeled by (<Ch>, s0) is successful since there is no state si such that s0 −Ch→ si. Therefore, the TS of Fig. 2 is accepted by the ABTA of Formula 1. Consequently, this TS satisfies Formula 1 and respects its decomposition semantics.

The soundness and completeness of our model checking method are stated by the following theorem.

Theorem 1 (Soundness and Completeness). Let ψ be an ACTL* formula, let Bψ be the ABTA obtained by the translation procedure described above, and let T = ⟨S, Lab, ℘, L, Act, −→, s0⟩ be a TS that represents a dialogue game protocol. Then s0 |=T ψ iff T is accepted by Bψ.

Proof. (Direction ⇒) To prove that T is accepted by Bψ, we have to prove that there exists a run r of Bψ on T such that all leaves and all infinite paths in the run are successful. Assume that s0 |=T ψ. First, suppose that there exists a leaf ⟨q, s⟩ in r such that s |= ¬l(q). Since the application of tableau rules does not change the satisfaction of formulae, it follows from Definition 4 that s0 |=T ¬ψ, which contradicts our assumption. Next, we prove that all infinite paths are successful; the proof proceeds by contradiction. ψ is a state formula that we can write in the form EΦ, where Φ is a set of path formulae. Assume that there exists an unsuccessful infinite path xr in r; we prove that xT |=T ¬Φ, where xT is the path in T that corresponds to xr (xr being a path in the product of Bψ and T). The fact that xr is infinite implies that R22 occurs at infinitely many positions in xr. Because xr is unsuccessful, there exist φ1, φ2, and qi such that φ1 U φ2 ∈ qi and, for all j ≥ i, φ2 ∉ qj. When this formula appears in the ABTA at position qi, we have l(qi) = ∨. Thus, according to Definition 4 and the form of R22, the current node ϕ1 of r labeled by ⟨qi, s⟩ has one successor ϕ2 labeled by ⟨qi+1, s⟩ with φ1 U φ2 ∈ qi and {φ1, X(φ1 U φ2)} ⊆ qi+1. Therefore l(qi+1) = ∧, and ϕ2 has a successor ϕ3 labeled by ⟨qi+2, s⟩ with X(φ1 U φ2) ∈ qi+2. Using R20 and the fact that l(qi+2) = X, the successor ϕ4 of ϕ3 is labeled by ⟨qi+3, s′⟩ with φ1 U φ2 ∈ qi+3, where s′ is a successor of s. This process repeats infinitely, since the path is unsuccessful. It follows that there is no s′ in T such that s′ |=T φ2. Thus, according to the semantics of U, there is no s′ in T such that s′ |=T φ1 U φ2. Therefore xT |=T ¬Φ.

(Direction ⇐) The proof proceeds by an inductive construction of xr and an analysis of the different tableau rules. A detailed proof of this theorem is presented in [4].
6 Related Work and Conclusion
The verification problem has recently begun to find a significant audience in the MAS community. Rao and Georgeff [26] have proposed an adaptation of CTL and CTL* model checking to verify BDI (beliefs, desires, and intentions) logics. van der Hoek and Wooldridge [30] have proposed techniques to reduce the model checking of temporal epistemic properties of MAS to the model checking of LTL. Benerecetti and Cimatti [3] have proposed a general approach for model checking MAS based on CTL together with modalities for BDI attitudes. Wooldridge et al. [33] have presented the translation of the MABLE language for the specification and verification of MAS into Promela, the language of the SPIN model checker for LTL. Bordini et al. [9, 10] have addressed the problem of verifying MAS specified using the AgentSpeak language; they have shown how programs written in AgentSpeak can be automatically transformed into Promela and into Java, the language of the JPF2 model checker. Penczek and Lomuscio [24] have developed a bounded model checking algorithm for a branching-time logic for knowledge (CTLK). In a similar way, Raimondi and Lomuscio [25] have implemented an algorithm to verify epistemic CTL properties of MAS using ordered binary decision diagrams. Kacprzak et al. [20] have also investigated the problem of verifying epistemic properties in CTLK by means of an unbounded model checking algorithm, and Kacprzak and Penczek [19] have addressed the problem of verifying game-like structures by means of unbounded model checking.

There are many differences between these proposals and the work presented in this paper, which we can summarize as follows. First, these proposals are based on BDI and epistemic logics that stress the agents' private mental states. In contrast, our work uses a logic highlighting the public states reflecting the agents' interactions, expressed in terms of SC and argumentation relations. Second, our model checking algorithm allows us to verify not only the system's temporal properties but also the action properties. Finally, the technique that we use is based on the tableau method and is different from the techniques used for LTL, CTL, and CTL*.

Complementarily, the verification of agent communication protocols has been addressed by other research work. Endriss et al. [15] have proposed abductive logic-based agents and some means of determining whether or not these agents behave in conformance with agent communication protocols. Baldoni et al. [2] have addressed the problem of verifying that a given protocol implementation using a logical language conforms to its AUML specification. Alberti et al. [1] have considered the problem of verifying on the fly the compliance of the agents' behavior with protocols specified using a logic-based framework. These approaches differ from our proposal in that they are not based on model checking techniques and they do not address the problem of verifying whether a protocol satisfies given properties. Huget and Wooldridge [18] have used a variation of the MABLE language to define a semantics of agent communication and have shown that compliance with this semantics can be reduced to a model checking problem. Walton [32] has applied model checking techniques in order to verify the correctness of protocol
communication using the SPIN model checker. The model checking techniques used by these two proposals are based on LTL, whereas our technique is based on ACTL*. In addition, our approach is based on a new algorithm, not on the translation of the specification language into the languages of existing model checkers. Unlike these two proposals, our technique allows us to simultaneously verify the correctness of protocols and the agents' conformance with the semantics. Recently, Giordano et al. [17] have addressed the problem of specifying and verifying agent interaction protocols using a Dynamic Linear Time Temporal Logic (DLTL). The authors have addressed three kinds of verification problems: 1) the compliance of a protocol execution with its specification; 2) the satisfaction of a property in the protocol; 3) the compliance of agents with the protocol. They have shown that these problems can be solved by model checking DLTL. This model checking technique uses a tableau-based algorithm for obtaining a Büchi automaton from a formula in DLTL. Although this work is close to our proposal, there are four main differences between the two approaches. (1) The protocols we deal with are dialogue game protocols specified using actions that agents apply to SC, whereas the protocols used in [17] are abstract protocols specified in terms of the effects of communicative actions and some precondition laws. (2) The model checking technique proposed in [17] uses a classical Büchi automaton constructed using a tableau-like procedure and propositional rules; our technique is different because it is based on ABTA rather than on a traditional Büchi automaton, and the construction of this automaton uses proof rules rather than propositional rules. (3) Our approach is based not only on SC, as in [17], but also on an argumentation theory; consequently, our protocols are more suitable for MAS because agents can make decisions using their argumentation systems. (4) The dynamic part of our logic is reflected by an action theory, whereas in DLTL the dynamic part is represented by regular programs.

The contribution of this paper is a new verification technique for dialogue game protocols. Our model checking technique allows us to verify both the correctness of the protocols and the agents' compliance with the decomposition semantics of the communicative acts. This technique uses a combination of an automata-based and a tableau-based algorithm to verify temporal and action specifications. The formal properties to be verified are expressed in our ACTL* logic and translated into ABTA using tableau rules. Our model checking algorithm, which works on a product graph, is an efficient on-the-fly procedure. As an extension of this work, we intend to use this tableau-based technique to verify MAS specifications and the conformance of agents with these specifications. Another interesting direction for future work is to extend the technique and the logic in order to consider epistemic properties. Finally, we plan to use this technique to specify and verify agents' trust policies.

Acknowledgements. We would like to thank the three anonymous reviewers for their interesting comments, questions, and suggestions. We are also grateful to Mohamed Aoun-Allah and Mohamed Mbarki for their suggestions.
References

1. Alberti, M., Chesani, F., Gavanelli, M., Lamma, E., Mello, P., and Torroni, P.: Compliance verification of agent interaction: a logic-based tool. Proc. of the European Meeting on Cybernetics and Systems Research, Vol. II (2004) 570–575
2. Baldoni, M., Baroglio, C., Martelli, A., Patti, V., and Schifanella, C.: Verifying protocol conformance for logic-based communicating agents. Computational Logic in Multi-Agent Systems, LNAI 3487 (2004) 196–212
3. Benerecetti, M. and Cimatti, A.: Symbolic model checking for multi-agent systems. Proc. of the International Workshop on Model Checking and AI (2002) 1–8
4. Bentahar, J.: A pragmatic and semantic unified framework for agent communication. PhD Thesis, Laval University, Canada, May 2005
5. Bentahar, J., Moulin, B., and Chaib-draa, B.: A persuasion dialogue game based on commitments and arguments. Proc. of the International Workshop on Argumentation in Multi-Agent Systems (2004) 148–164
6. Bentahar, J., Moulin, B., Meyer, J.-J. Ch., and Chaib-draa, B.: A logical model for commitment and argument network for agent communication. Proc. of the International Joint Conference on AAMAS (2004) 792–799
7. Bhat, G., Cleaveland, R., and Groce, A.: Efficient model checking via Büchi tableau automata. Computer-Aided Verification, LNCS 2102 (2001) 38–52
8. Bhat, G., Cleaveland, R., and Grumberg, O.: Efficient on-the-fly model checking for CTL*. The IEEE Symposium on Logics in Computer Science (1995) 388–397
9. Bordini, R.H., Fisher, M., Pardavila, C., and Wooldridge, M.: Model checking AgentSpeak. Proc. of the International Joint Conference on AAMAS (2003) 409–419
10. Bordini, R.H., Visser, W., Fisher, M., Pardavila, C., and Wooldridge, M.: Model checking multi-agent programs with CASP. Computer-Aided Verification, LNCS 2725 (2003) 110–113
11. Cleaveland, R.: Tableau-based model checking in the propositional mu-calculus. Acta Informatica, 27(8) (1990) 725–747
12. Cohen, P.R. and Levesque, H.J.: Persistence, intentions and commitment. Intentions in Communication, MIT Press (1990) 33–69
13. Colombetti, M.: A commitment-based approach to agent speech acts and conversations. Proc. of the International Autonomous Agent Workshop on Conversational Policies (2000) 21–29
14. Courcoubetis, C., Vardi, M.Y., Wolper, P., and Yannakakis, M.: Memory efficient algorithms for verification of temporal properties. Formal Methods in System Design, Vol. 1 (1992) 275–288
15. Endriss, U., Maudet, N., Sadri, F., and Toni, F.: Protocol conformance for logic-based agents. Proc. of the International Joint Conference on AI (2003) 679–684
16. Fornara, N. and Colombetti, M.: Operational specification of a commitment-based agent communication language. Proc. of the International Joint Conference on AAMAS (2002) 535–542
17. Giordano, L., Martelli, A., and Schwind, C.: Verifying communicating agents by model checking in a temporal action logic. Logics in Artificial Intelligence, LNAI 3229 (2004) 57–69
18. Huget, M.P. and Wooldridge, M.: Model checking for ACL compliance verification. Advances in Agent Communication, LNAI 2922 (2004) 75–90
19. Kacprzak, M. and Penczek, W.: Unbounded model checking for alternating-time temporal logic. Proc. of the International Joint Conference on AAMAS (2004) 646–653
20. Kacprzak, M., Lomuscio, A., and Penczek, W.: Verification of multiagent systems via unbounded model checking. Proc. of the International Joint Conference on AAMAS (2004) 638–645
21. Maudet, N. and Chaib-draa, B.: Commitment-based and dialogue-game based protocols: new trends in agent communication languages. Knowledge Engineering Review, 17(2) (2002) 157–179
22. McBurney, P. and Parsons, S.: Games that agents play: A formal framework for dialogues between autonomous agents. Journal of Logic, Language, and Information, 11(3) (2002) 315–334
23. Moulin, B.: The social dimension of interactions in multi-agent systems. Agent and Multi-Agent Systems, Formalisms, Methodologies and Applications, LNAI 1441 (1998) 109–122
24. Penczek, W. and Lomuscio, A.: Verifying epistemic properties of multi-agent systems via model checking. Fundamenta Informaticae, 55(2) (2003) 167–185
25. Raimondi, F. and Lomuscio, A.: Verification of multiagent systems via ordered binary decision diagrams: an algorithm and its implementation. Proc. of the International Joint Conference on AAMAS (2004) 630–637
26. Rao, A.S. and Georgeff, M.P.: A model-theoretic approach to the verification of situated reasoning systems. Proc. of IJCAI (1993) 318–324
27. Sadri, F., Toni, F., and Torroni, P.: Dialogues for negotiation: agent varieties and dialogue sequences. Proc. of the International Workshop on Agents, Theories, Architectures and Languages, LNAI 2333 (2001) 405–421
28. Singh, M.P.: Agent communication languages: rethinking the principles. IEEE Computer, 31(12) (1998) 40–47
29. Stirling, C. and Walker, D.: Local model checking in the modal mu-calculus. LNCS 354 (1989) 369–383
30. van der Hoek, W. and Wooldridge, M.: Model checking knowledge and time. Model Checking Software, LNCS 2318 (2002) 95–111
31. Vardi, M. and Wolper, P.: An automata-theoretic approach to automatic program verification. Symposium on Logic in Computer Science (1986) 332–344
32. Walton, D.C.: Model checking agent dialogues. Declarative Agent Languages and Technologies, LNAI 3476 (2005) 132–147
33. Wooldridge, M., Fisher, M., Huget, M.P., and Parsons, S.: Model checking multi-agent systems with MABLE. Proc. of the International Joint Conference on AAMAS (2002) 952–959
Author Index

Ågotnes, Thomas 33
Alagar, Vasu S. 205
Alechina, Natasha 141
Bentahar, Jamal 223
Bordini, Rafael H. 155
Brain, Martin 72
Broersen, Jan 1
Cliffe, Owen 72
Costantini, Stefania 106
Crick, Tom 72
Dastani, Mehdi 1, 17
De Vos, Marina 72
Fournier, Dominique 124
García-Camino, Andrés 89
Hasegawa, Tetsuo 171
Hayashi, Hisashi 171
Hübner, Jomi F. 155
Jago, Mark 141
Kwisthout, Johan 17
Lloyd, John W. 51
Logan, Brian 141
Lomuscio, Alessio 188
Mermet, Bruno 124
Meyer, John-Jules Ch. 223
Moreira, Álvaro F. 155
Moulin, Bernard 223
Needham, Jonathan 72
Ozaki, Fumio 171
Padget, Julian 72
Rodríguez-Aguilar, Juan A. 89
Sears, Tom D. 51
Sierra, Carles 89
Simon, Gaële 124
Tocchio, Arianna 106
Tokura, Seiji 171
Vasconcelos, Wamberto 89
Vieira, Renata 155
Walicki, Michal 33
Wan, Kaiyu 205
Winkelhagen, Laurens 1
Woźna, Bożena 188