Novel Approaches in Cognitive Informatics and Natural Intelligence
Yingxu Wang, University of Calgary, Canada
Information Science Reference • Hershey • New York
Director of Editorial Content: Kristin Klinger
Assistant Development Editor: Deborah Yahnke
Director of Production: Jennifer Neidig
Managing Editor: Jamie Snavely
Assistant Managing Editor: Carole Coulson
Typesetter: Michael Brehm
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.
Published in the United States of America by
Information Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue, Suite 200, Hershey PA 17033
Tel: 717-533-8845 • Fax: 717-533-8661
E-mail: [email protected] • Web site: http://www.igi-global.com

and in the United Kingdom by
Information Science Reference (an imprint of IGI Global)
3 Henrietta Street, Covent Garden, London WC2E 8LU
Tel: 44 20 7240 0856 • Fax: 44 20 7379 0609
Web site: http://www.eurospanbookstore.com

Copyright © 2009 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
Novel approaches in cognitive informatics and natural intelligence / Yingxu Wang, editor. p. cm. Includes bibliographical references and index. Summary: "This book covers issue of cognitive informatics with a transdisciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of the brain and natural intelligence, and their engineering applications via an interdisciplinary approach"--Provided by publisher. ISBN 978-1-60566-170-4 (hardcover) -- ISBN 978-1-60566-171-1 (ebook) 1. Neural computers. 2. Cognitive science. 3. Artificial intelligence. I. Wang, Yingxu. QA76.87.N68 2009 006.3--dc22 2008018331
British Cataloguing in Publication Data A Cataloguing in Publication record for this book is available from the British Library. All work contributed to this book set is original material. The views expressed in this book are those of the authors, but not necessarily of the publisher.
Novel Approaches in Cognitive Informatics and Natural Intelligence is part of the IGI Global series named Advances in Cognitive Informatics and Natural Intelligence (ACINI) Series, ISBN: Pending
If a library purchased a print copy of this publication, please go to http://www.igi-global.com/agreement for information on activating the library's complimentary electronic access to this publication.
Advances in Cognitive Informatics and Natural Intelligence (ACINI) Series ISBN: pending
Editor-in-Chief: Yingxu Wang, University of Calgary, Canada Novel Approaches in Cognitive Informatics and Natural Intelligence Yingxu Wang, University of Calgary, Canada Information Science Reference • copyright 2009 • 395 pp • H/C (ISBN: 978-1-60566-170-4) • US $195.00
Creating a link between a number of natural science and life science disciplines, the emerging field of cognitive informatics presents a transdisciplinary approach to the internal information processing mechanisms and processes of the brain and natural intelligence. Novel Approaches in Cognitive Informatics and Natural Intelligence penetrates the academic field to offer the latest advancements in cognitive informatics and natural intelligence. This book covers the five areas of cognitive informatics, natural intelligence, autonomic computing, knowledge science, and relevant development, to provide researchers, academicians, students, and practitioners with a ready reference to the latest findings.
The Advances in Cognitive Informatics and Natural Intelligence (ACINI) Book Series seeks to fill the gap in literature that transcends disciplinary boundaries, and is devoted to the rapid publication of high quality books. In providing a scholarly channel for new research principles, theories, and concepts, the book series will enhance the fields of Natural Intelligence, Autonomic Computing, and Neuroinformatics. The development and cross-fertilization between the aforementioned science and engineering disciplines have led to a whole range of extremely interesting new research areas known as Cognitive Informatics and Natural Intelligence. The Advances in Cognitive Informatics and Natural Intelligence (ACINI) Book Series seeks to propel the availability of literature for international researchers, practitioners, and graduate students to investigate cognitive mechanisms and processes of human information processing, and to stimulate the transdisciplinary effort on cognitive informatics and natural intelligence research and engineering applications.
Hershey • New York Order online at www.igi-global.com or call 717-533-8845 x 100 – Mon-Fri 8:30 am - 5:00 pm (est) or fax 24 hours a day 717-533-7115
Editorial Advisory Board
Editor-in-Chief
Yingxu Wang, University of Calgary, Canada
Associate Editors
Lotfi A. Zadeh, University of California, Berkeley, USA
Witold Kinsner, University of Manitoba, Canada
John Bickle, University of Cincinnati, USA
Christine Chan, University of Regina, Canada
International Editorial Advisory Board
James Anderson, Brown University, USA
George Baciu, Hong Kong Polytechnic University, Hong Kong
Franck Barbier, University of Pau, France
Brian H. Bland, University of Calgary, Canada
Keith Chan, Hong Kong Polytechnic University, Hong Kong
Michael R.W. Dawson, University of Alberta, Canada
Geoff Dromey, Griffith University, Australia
Frank L. Greitzer, Pacific Northwest National Lab, USA
Ling Guang, Ryerson University, Canada
Bo Huang, The Chinese University of Hong Kong, Hong Kong
Brian Henderson-Sellers, University of Technology Sydney, Australia
Zeng-Guang Hou, Chinese Academy of Sciences, China
Yaochu Jin, Honda Research Institute Europe, Germany
Jiming Liu, University of Windsor, Canada
Pelayo F. Lopez, Universidad de Castilla-La Mancha, Spain
Roger K. Moore, Department of Computer Science, University of Sheffield, UK
Bernard Moulin, University of Laval, Canada
Dilip Patel, South Bank University, UK
Shushma Patel, South Bank University, UK
Witold Pedrycz, University of Alberta, Canada
Lech Polkowski, University of Warmia and Mazury, Poland
Vaclav Rajlich, Wayne State University, USA
Fernando Rubio, Universidad Complutense de Madrid, Spain
Gunther Ruhe, University of Calgary, Canada
Philip Sheu, University of California, Irvine, USA
Kenji Sugawara, Chiba Institute of Technology, Japan
Jeffrey Tsai, University of Illinois at Chicago, USA
Guoyin Wang, Chongqing University of Posts and Telecommunications, China
Yiyu Yao, University of Regina, Canada
Du Zhang, Department of Computer Science, California State University, USA
Ning Zhong, Maebashi Institute of Technology, Japan
Mengchu Zhou, New Jersey Institute of Technology, USA
Xiaolin Zhou, Peking University, China
Table of Contents
Preface ................................................................................................................................................................. xix Acknowledgment................................................................................................................................................ xxii Section I Cognitive Informatics Chapter I The Theoretical Framework of Cognitive Informatics.............................................................................................1 Yingxu Wang, University of Calgary, Canada Chapter II Is Entropy Suitable to Characterize Data and Signals for Cognitive Informatics?.................................................28 Witold Kinsner, University of Manitoba, Canada Chapter III Cognitive Processes by using Finite State Machines..............................................................................................52 Ismael Rodríguez, Universidad Complutense de Madrid, Spain Manuel Núñez, Universidad Complutense de Madrid, Spain Fernando Rubio, Universidad Complutense de Madrid, Spain Chapter IV On the Cognitive Processes of Human Perception with Emotions, Motivations, and Attitudes............................65 Yingxu Wang, University of Calgary, Canada Chapter V A Selective Sparse Coding Model with Embedded Attention Mechanism.............................................................78 Qingyong Li, Beijing Jiaotong University, China Zhiping Shi, Chinese Academy of Sciences, China Zhongzhi Shi, Chinese Academy of Sciences, China
Section II Natural Intelligence Chapter VI The Cognitive Processes of Formal Inferences......................................................................................................92 Yingxu Wang, University of Calgary, Canada Chapter VII Neo-Symbiosis: The Next Stage in the Evolution of Human Information Interaction.........................................106 Douglas Griffith, General Dynamics Advanced Information Systems, USA Frank L. Greitzer, Pacific Northwest National Laboratory, USA Chapter VIII Language, Logic, and the Brain............................................................................................................................118 Ray E. Jennings, Simon Fraser University, Canada Chapter IX The Cognitive Process of Decision Making.........................................................................................................130 Yingxu Wang, University of Calgary, Canada Guenther Ruhe, University of Calgary, Canada Chapter X A Commonsense Approach to Representing Spatial Knowledge Between Extended Objects.............................142 Tiansi Dong, Cognitive Ergonomic Systems, Germany Chapter XI A Formal Specification of the Memorization Process..........................................................................................157 Natalia López, Universidad Complutense de Madrid, Spain Manuel Núñez, Universidad Complutense de Madrid, Spain Fernando L. Pelayo, Universidad de Castilla-La Mancha, Spain Section III Autonomic Computing Chapter XII Theoretical Foundations of Autonomic Computing.............................................................................................172 Yingxu Wang, University of Calgary, Canada Chapter XIII Towards Cognitive Machines: Multiscale Measures and Analysis.......................................................................188 Witold Kinsner, University of Manitoba, Canada Chapter XIV Towards Autonomic Computing: Adaptive Neural Network for Trajectory Planning.........................................200 Amar Ramdane-Cherif, Université de Versailles St-Quentin, France Chapter XV Cognitive Modelling Applied to Aspects of Schizophrenia and Autonomic Computing.....................................220 Lee Flax, Macquarie University, Australia
Chapter XVI Interactive Classification Using a Granule Network............................................................................................235 Yan Zhao, University of Regina, Canada Yiyu Yao, University of Regina, Canada Section IV Knowledge Science Chapter XVII A Cognitive Computational Knowledge Representation Theory.........................................................................247 Mehdi Najjar, University of Sherbrooke, Canada André Mayers, University of Sherbrooke, Canada Chapter XVIII A Fixpoint Semantics for Rule-Base Anomalies..................................................................................................265 Du Zhang, California State University, USA Chapter XIX Development of an Ontology for an Industrial Domain.......................................................................................277 Christine W. Chan, University of Regina, Canada Chapter XX Constructivist Learning During Software Development......................................................................................292 Václav Rajlich, Wayne State University, USA Shaochun Xu, Laurentian University, Canada Chapter XXI A Unified Approach to Fractal Dimensions..........................................................................................................304 Witold Kinsner, University of Manitoba, Canada Section V Relevant Development Chapter XXII Cognitive Informatics: Four Years in Practice: A Report on IEEE ICCI’05........................................................327 Du Zhang, California State University, USA Witold Kinsner, University of Manitoba, Canada Jeffrey Tsai, University of Illinois in Chicago, USA Yingxu Wang, University of Calgary, Canada Philip Sheu, University of California, USA Taehyung Wang, California State University, USA
Chapter XXIII Toward Cognitive Informatics and Cognitive Computers: A Report on IEEE ICCI’06.......................................330 Yiyu Yao, University of Regina, Canada Zhongzhi Shi, Chinese Academy of Sciences, China Yingxu Wang, University of Calgary, Canada Witold Kinsner, University of Manitoba, Canada Yixin Zhong, Beijing University of Posts and Telecommunications, China Guoyin Wang, Chongqing University of Posts and Telecommunications, China Zeng-Guang Hou, Chinese Academy of Sciences, China Compilation of References.................................................................................................................................335 About the Contributors......................................................................................................................................363 Index.....................................................................................................................................................................369
Detailed Table of Contents
Preface ................................................................................................................................................................. xix Acknowledgment................................................................................................................................................ xxii Section I Cognitive Informatics Chapter I The Theoretical Framework of Cognitive Informatics.............................................................................................1 Yingxu Wang, University of Calgary, Canada Cognitive Informatics (CI) is a transdisciplinary enquiry of the internal information processing mechanisms and processes of the brain and natural intelligence shared by almost all science and engineering disciplines. This chapter presents an intensive review of the new field of CI. The structure of the theoretical framework of CI is described, encompassing the Layered Reference Model of the Brain (LRMB), the OAR model of information representation, Natural Intelligence (NI) vs. Artificial Intelligence (AI), Autonomic Computing (AC) vs. imperative computing, CI laws of software, the mechanism of human perception processes, the cognitive processes of formal inferences, and the formal knowledge system. Three types of new structures of mathematics, Concept Algebra (CA), Real-Time Process Algebra (RTPA), and System Algebra (SA), are created to enable rigorous treatment of cognitive processes of the brain as well as knowledge representation and manipulation in a formal and coherent framework. A wide range of applications of CI in cognitive psychology, computing, knowledge engineering, and software engineering has been identified and discussed. Chapter II Is Entropy Suitable to Characterize Data and Signals for Cognitive Informatics?.................................................28 Witold Kinsner, University of Manitoba, Canada This chapter provides a review of Shannon and other entropy measures in evaluating the quality of materials used in perception, cognition, and learning processes. Energy-based metrics are not suitable for cognition, as energy itself does not carry information. Instead, morphological (structural and contextual) metrics as well as entropy-based multiscale metrics should be considered in cognitive informatics. Appropriate data and signal transformation processes are defined and discussed in the perceptual framework, followed by various classes of information and entropies suitable for characterization of data, signals, and distortion. Other entropies are also described, including the Rényi generalized entropy spectrum, Kolmogorov complexity measure, Kolmogorov-Sinai entropy, and Prigogine entropy for evolutionary dynamical systems. Although such entropy-based measures are suitable for many signals, they are not sufficient for scale-invariant (fractal and multifractal) signals without corresponding complementary multiscale measures.
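For orientation, the Shannon and Rényi measures that Chapter II surveys have compact closed forms. The sketch below is ours, not the chapter's: a minimal Python illustration computing both for a toy distribution, with the Rényi spectrum reducing to the Shannon value as the order alpha approaches 1.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def renyi_entropy(p, alpha):
    """Rényi entropy H_a = log2(sum(p_i ** a)) / (1 - a); tends to Shannon as a -> 1."""
    if alpha == 1:
        return shannon_entropy(p)
    return math.log2(sum(pi ** alpha for pi in p if pi > 0)) / (1 - alpha)

p = [0.5, 0.25, 0.125, 0.125]      # toy source distribution
print(shannon_entropy(p))          # 1.75 bits
print(renyi_entropy(p, 0))         # 2.0 bits (Hartley entropy, alpha = 0)
print(renyi_entropy(p, 2))         # ~1.54 bits (collision entropy, alpha = 2)
```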
Chapter III Cognitive Processes by using Finite State Machines..............................................................................................52 Ismael Rodríguez, Universidad Complutense de Madrid, Spain Manuel Núñez, Universidad Complutense de Madrid, Spain Fernando Rubio, Universidad Complutense de Madrid, Spain Finite State Machines (FSMs) are formalisms that have been used for decades to describe the behavior of systems. They can also provide an intelligent agent with a suitable formalism for describing its own beliefs about the behavior of the world surrounding it. In fact, FSMs are the suitable acceptors for right linear languages, which are the simplest languages considered in Chomsky's classification of languages. Since Chomsky proposes that the generation of language (and, indirectly, any mental process) can be expressed through a kind of formal language, it can be assumed that cognitive processes can be formulated by means of the formalisms that can express those languages. Hence, we will use FSMs as a suitable formalism for representing (simple) cognitive models. We present an algorithm that, given an observation of the environment, produces an FSM describing an environment behavior that is capable of producing that observation. Since an infinite number of different FSMs could have produced that observation, we have to choose the most feasible one. When a phenomenon can be explained with several theories, Occam's razor principle, which is basic in science, encourages choosing the simplest explanation. Applying this criterion to our problem, we choose the simplest (smallest) FSM that could have produced that observation. An algorithm is presented to solve this problem. In conclusion, our framework provides a cognitive model that is the most preferable theory for the observer, according to the Occam's razor criterion (a brute-force toy illustration of this minimality criterion is sketched after Chapter IV's abstract below). Chapter IV On the Cognitive Processes of Human Perception with Emotions, Motivations, and Attitudes............................65 Yingxu Wang, University of Calgary, Canada An interactive motivation-attitude theory is developed based on the Layered Reference Model of the Brain (LRMB) and the Object-Attribute-Relation (OAR) model. This chapter presents a rigorous model of human perceptual processes such as emotions, motivations, and attitudes. A set of mathematical models and formally described cognitive processes are developed. The interactions and relationships between motivation and attitude are formally described in Real-Time Process Algebra (RTPA). Applications of the mathematical models of motivations and attitudes in software engineering are demonstrated. This work is the detailed description of a part of the Layered Reference Model of the Brain (LRMB) that provides a comprehensive model for explaining the fundamental cognitive processes of the brain and their interactions. This work demonstrates that the complicated human emotional and perceptual phenomena can be rigorously modeled in mathematics and be formally treated and described.
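The minimality criterion from Chapter III above can be made concrete with a brute-force toy. This sketch is ours and deliberately naive; Chapter III's actual algorithm is far more refined. Assuming a Mealy-style encoding of candidate machines, the hypothetical helper below enumerates machines by increasing state count and returns the first one consistent with an observed input/output trace.

```python
from itertools import product

def consistent(fsm, trace):
    """Check whether a Mealy-style FSM reproduces an observed (input, output) trace."""
    n_states, delta, lam = fsm      # delta[(q, i)] -> next state, lam[(q, i)] -> output
    q = 0                           # fixed initial state
    for i, o in trace:
        if lam[(q, i)] != o:
            return False
        q = delta[(q, i)]
    return True

def smallest_consistent_fsm(trace, inputs, outputs, max_states=4):
    """Occam's-razor search: a consistent FSM with the fewest states.
    Exponential brute force; usable only for toy alphabets and short traces."""
    for n in range(1, max_states + 1):
        keys = list(product(range(n), inputs))
        for trans in product(range(n), repeat=len(keys)):
            for outs in product(outputs, repeat=len(keys)):
                delta = dict(zip(keys, trans))
                lam = dict(zip(keys, outs))
                if consistent((n, delta, lam), trace):
                    return n, delta, lam
    return None

trace = [('a', 0), ('a', 1), ('a', 0), ('a', 1)]    # alternating observations
n, delta, lam = smallest_consistent_fsm(trace, ['a'], [0, 1])
print(n)   # 2: one state cannot alternate outputs, so the simplest theory has two
```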
Chapter V A Selective Sparse Coding Model with Embedded Attention Mechanism.............................................................78 Qingyong Li, Beijing Jiaotong University, China Zhiping Shi, Chinese Academy of Sciences, China Zhongzhi Shi, Chinese Academy of Sciences, China Sparse coding theory demonstrates that the neurons in the primary visual cortex form a sparse representation of natural scenes from the viewpoint of statistics, but a typical scene contains many different patterns (corresponding to neurons in the cortex) competing for neural representation because of the limited processing capacity of the visual system. We propose an attention-guided sparse coding model. This model includes two modules: a non-uniform sampling module simulating the process of the retina, and a data-driven attention module based on response saliency. Our experimental results show that the model notably decreases the number of coefficients that may be activated while retaining the main visual information. It provides a way to improve the coding efficiency of the sparse coding model and to achieve good performance in both population sparseness and lifetime sparseness.
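The effect Chapter V reports, fewer active coefficients with the main information retained, can be mimicked crudely by keeping only the most salient responses of a linear code. The sketch below is ours and is not the chapter's retinal-sampling or saliency model; the random basis and the top-k rule are illustrative assumptions.

```python
import numpy as np

def sparsify(signal, basis, k):
    """Project a signal onto basis vectors and keep only the k largest responses."""
    responses = basis @ signal                     # one response per basis vector
    keep = np.argsort(np.abs(responses))[-k:]      # crude "saliency": response magnitude
    coded = np.zeros_like(responses)
    coded[keep] = responses[keep]
    return coded

rng = np.random.default_rng(0)
basis = rng.standard_normal((64, 16))              # 64 candidate patterns, 16-d input
x = rng.standard_normal(16)
c = sparsify(x, basis, k=8)
print(np.count_nonzero(c))                         # 8 active coefficients out of 64
```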
Section II Natural Intelligence Chapter VI The Cognitive Processes of Formal Inferences......................................................................................................92 Yingxu Wang, University of Calgary, Canada Theoretical research is predominantly an inductive process, while applied research is mainly a deductive process. Both inference processes are based on the cognitive process and means of abstraction. This chapter describes the cognitive processes of formal inferences such as deduction, induction, abduction, and analogy. Conventional propositional arguments adopt static causal inference. This chapter introduces more rigorous and dynamic inference methodologies, which are modeled and described as a set of cognitive processes encompassing a series of basic inference steps. A set of mathematical models of formal inference methodologies is developed. Formal descriptions of the 4 forms of cognitive processes of inferences are presented using Real-Time Process Algebra (RTPA). The cognitive processes and mental mechanisms of inferences are systematically explored and rigorously modeled. Applications of abstraction and formal inferences in both the unveiling of the fundamental mechanisms of the brain and the investigation of next-generation cognitive computers are explored. Chapter VII Neo-Symbiosis: The Next Stage in the Evolution of Human Information Interaction.........................................106 Douglas Griffith, General Dynamics Advanced Information Systems, USA Frank L. Greitzer, Pacific Northwest National Laboratory, USA The purpose of this chapter is to re-address the vision of human-computer symbiosis as originally expressed by J.C.R. Licklider nearly a half-century ago and to argue for the relevance of this vision to the field of cognitive informatics. We describe this vision, place it in some historical context relating to the evolution of human factors research, and observe that the field is now in the process of re-invigorating Licklider's vision. A central concept of this vision is that humans need to be incorporated into computer architectures. We briefly assess the state of the technology within the context of contemporary theory and practice, and we describe what we regard as this emerging field of neo-symbiosis. Examples of neo-symbiosis are provided, but these are nascent examples and the potential of neo-symbiosis is yet to be realized. We offer some initial thoughts on requirements to define functionality of neo-symbiotic systems and discuss research challenges associated with their development and evaluation. Methodologies and metrics for assessing neo-symbiosis are discussed. Chapter VIII Language, Logic, and the Brain............................................................................................................................118 Ray E. Jennings, Simon Fraser University, Canada Language is primarily a physical, and more particularly a biological phenomenon. To say that it is primarily so is to say that that is how, in the first instance, it presents itself to observation. It is curious then that theoreticians of language treat it as though it were primarily semantic or syntactic or some fusion of the two, and as though our implicit understanding of semantics and the syntax regulates both our language production and our language comprehension.
On this view the brain is both a repository of semantic and syntactic constraints, and is the instrument by which we draw upon these accounts for the hard currency of linguistic exchange. With this view comes a division of the vocables of language into those that carry semantic content (lexical vocabulary) and those that mark syntactic form (functional and logical vocabulary). Logical theory of the past 150 years has been understood by many as a purified abstraction of linguistic forms. So it is not surprising that the “logical” vocabulary of natural language has been understood in the reflected light of that formal science. Those internal transactions in which “logical” vocables essentially figure, the transactions that we think of as reasonings, are seen by many as constrained by those laws of thought that logic was supposed to codify. Of course no vocabulary can be entirely independent of semantic understanding, but whereas the meaning of lexical vocabulary varies from context to context (run on the treadmill,
run on the market, run-on sentence, run in her stocking, run down, run the tap etc.) logical vocabulary is thought to have fixed minimal semantic content independently of context. A biological view of language presents a sharply contrasting picture. On an evolutionary time-scale the human brain and human language have co-evolved. So we have pre-linguistic ancestors, some of whose cunning we have inherited, as we have quasi-linguistic ancestors and early linguistic ancestors whose inherited skills were enhanced and made more effective by the slow acquisition of linguistic instruments of control and coordination. Where in this long development does logic enter? On the shorter time-scale of linguistic evolution, we know that all connective vocabulary descends from lexical vocabulary, much of it from the language of spatial and other physical relationships. We can now say, more or less, how that happens. We can even find many cases of mutations in logicalized vocabulary, semantic changes that come about in much the way that biological mutations occur in molecular biological processes. These changes proliferate to yield a wide diversity in the evolved uses of natural language connectives. Just as surprisingly, we discover, we don’t in general understand connective vocabulary, nor do we need to for the purpose of using it correctly in speech. And by no means do our automatic uses of it coincide with those that would be predicted by the syntax/semantics view. Far from having fixed minimal semantic content, logical vocabulary is semantically rich, context-dependent, and, partly because we do not in general understand it, semantically extremely fragile. Chapter IX The Cognitive Process of Decision Making.........................................................................................................130 Yingxu Wang, University of Calgary, Canada Guenther Ruhe, University of Calgary, Canada Decision making is one of the basic cognitive processes of human behaviors by which a preferred option or a course of actions is chosen from among a set of alternatives based on certain criteria. Decision theories are widely applied in many disciplines encompassing cognitive informatics, computer science, management science, economics, sociology, psychology, political science, and statistics. A number of decision strategies have been proposed from different angles and application domains, such as the maximum expected utility and Bayesian method. However, there is still a lack of a fundamental and mathematical decision model and a rigorous cognitive process for decision making. This chapter presents a fundamental cognitive decision making process and its mathematical model, which is described as a sequence of Cartesian-product-based selections. A rigorous description of the decision process in Real-Time Process Algebra (RTPA) is provided. Real-world decisions are perceived as a repetitive application of the fundamental cognitive process. The result shows that all categories of decision strategies fit in the formally described decision process. The cognitive process of decision making may be applied in a wide range of decision-based systems, such as cognitive informatics, software agent systems, expert systems, and decision support systems. 
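The decision strategies Chapter IX classifies include maximum expected utility, which is easy to state concretely. The following sketch is ours, not the chapter's RTPA model: it selects the alternative maximizing expected utility over the Cartesian product of alternatives and outcome states, and the umbrella scenario with its probabilities and utilities is entirely hypothetical.

```python
def choose(alternatives, states, prob, utility):
    """Return the alternative a maximizing E[u](a) = sum over states s of prob(a, s) * utility(a, s)."""
    return max(alternatives,
               key=lambda a: sum(prob(a, s) * utility(a, s) for s in states))

states = ['rain', 'dry']
p = {('umbrella', 'rain'): 0.3, ('umbrella', 'dry'): 0.7,
     ('none', 'rain'): 0.3, ('none', 'dry'): 0.7}
u = {('umbrella', 'rain'): 5, ('umbrella', 'dry'): -1,
     ('none', 'rain'): -10, ('none', 'dry'): 2}

best = choose(['umbrella', 'none'], states,
              lambda a, s: p[(a, s)], lambda a, s: u[(a, s)])
print(best)   # 'umbrella': 0.3*5 + 0.7*(-1) = 0.8 beats 0.3*(-10) + 0.7*2 = -1.6
```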
Chapter X A Commonsense Approach to Representing Spatial Knowledge Between Extended Objects.............................142 Tiansi Dong, Cognitive Ergonomic Systems, Germany This chapter proposes a commonsense understanding of distance and orientation knowledge between extended objects, and presents a formal representation of spatial knowledge. The connection relation is taken as primitive. A new axiom is introduced to govern the connection relation. Notions of ‘near extension’ regions and the ‘nearer’ predicate are coined. Distance relations between extended objects are understood as degrees of the near extension from one object to the other. Orientation relations are understood as distance comparison from one object to the sides of the other object. Therefore, distance and orientation relations are internally related through the connection relation. The ‘fiat projection’ mechanism is proposed to model the mental formation of the deictic orientation reference framework. This chapter shows diagrammatically the integration of topological relations, distance relations, and orientation relations in the RCC frameworks.
Chapter XI A Formal Specification of the Memorization Process..........................................................................................157 Natalia López, Universidad Complutense de Madrid, Spain Manuel Núñez, Universidad Complutense de Madrid, Spain Fernando L. Pelayo, Universidad de Castilla-La Mancha, Spain In this chapter we present the formal language STOPA (STOchastic Process Algebra) to specify cognitive systems. In addition to the usual characteristics of these formalisms, this language features the possibility of including stochastic time. This kind of time is useful to represent systems where the delays are not controlled by fixed amounts of time, but are given by probability distribution functions. In order to illustrate the usefulness of our formalism, we will formally represent a cognitive model of the memory. Following contemporary theories of memory classification (see [Squire et al., 1993; Solso, 1999]) we consider sensory buffer, short-term, and long-term memories. Moreover, borrowing from Y. Wang and Y. Wang (2006), we also consider the so-called action buffer memory. Section III Autonomic Computing Chapter XII Theoretical Foundations of Autonomic Computing.............................................................................................172 Yingxu Wang, University of Calgary, Canada Autonomic computing (AC) is an intelligent computing approach that autonomously carries out robotic and interactive applications based on goal- and inference-driven mechanisms. This chapter attempts to explore the theoretical foundations and technical paradigms of AC. It reviews the historical development that leads to the transition from imperative computing to AC. It surveys transdisciplinary theoretical foundations for AC such as those of behaviorism, cognitive informatics, denotational mathematics, and intelligent science. On the basis of this work, a coherent framework towards AC may be established for both interdisciplinary theories and application paradigms, which will result in the development of new generation computing architectures and novel information processing systems. Chapter XIII Towards Cognitive Machines: Multiscale Measures and Analysis.......................................................................188 Witold Kinsner, University of Manitoba, Canada Numerous attempts are being made to develop machines that could act not only autonomously, but also in an increasingly intelligent and cognitive manner. Such cognitive machines ought to be aware of their environments which include not only other machines, but also human beings. Such machines ought to understand the meaning of information in more human-like ways by grounding knowledge in the physical world and in the machines’ own goals. The motivation for developing such machines ranges from self-evidenced practical reasons, such as the expense of computer maintenance, to wearable computing in health care, and gaining a better understanding of the cognitive capabilities of the human brain. To achieve such an ambitious goal requires solutions to many problems, ranging from human perception, attention, concept creation, cognition, consciousness, executive processes guided by emotions and value, and symbiotic conversational human-machine interactions. An important component of this cognitive machine research includes multiscale measures and analysis. This chapter presents definitions of cognitive machines, representations of processes, as well as their measurements, measures and analysis. 
It provides examples from current research, including cognitive radio, cognitive radar, and cognitive monitors.
Chapter XIV Towards Autonomic Computing: Adaptive Neural Network for Trajectory Planning.........................................200 Amar Ramdane-Cherif, Université de Versailles St-Quentin, France The cognitive approach through the neural network (NN) paradigm is a critical discipline that will help bring about autonomic computing (AC). NN-related research, some involving new ways to apply control theory and control laws, can provide insight into how to run complex systems that optimize to their environments. NNs are one kind of AC system that can embody human cognitive powers and can adapt, learn, and take over certain functions previously performed by humans. In recent years, artificial neural networks have received a great deal of attention for their ability to perform nonlinear mappings. In trajectory control of robotic devices, neural networks provide a fast method of autonomously learning the relation between a set of output states and a set of input states. In this chapter, we apply the cognitive approach to solve position controller problems using an inverse geometrical model. In order to control a robot manipulator in the accomplishment of a task, trajectory planning is required in advance or in real time. The desired trajectory is usually described in Cartesian coordinates and needs to be converted to joint space for the purpose of analyzing and controlling the system behavior. In this chapter, we use a memory neural network (MNN) to solve the optimization problem concerning the inverse of the direct geometrical model of the redundant manipulator when subject to constraints. Our approach offers substantially better accuracy, avoids the computation of the inverse or pseudoinverse Jacobian matrix, and does not suffer from problems such as singularity, redundancy, or considerably increased computational complexity. Chapter XV Cognitive Modelling Applied to Aspects of Schizophrenia and Autonomic Computing.....................................220 Lee Flax, Macquarie University, Australia We give an approach to cognitive modelling which allows for richer expression than the one based simply on the firing of sets of neurons. The object language of the approach is first-order logic augmented by operations of an algebra, PSEN. Some operations useful for this kind of modelling are postulated: combination, comparison, and inhibition of sets of sentences. Inhibition is realised using an algebraic version of AGM belief contraction (Peter Gärdenfors, Knowledge in Flux, 1988). It is shown how these operations can be realised using PSEN. Algebraic modelling using PSEN is used to give an account of an explanation of some signs and symptoms of schizophrenia due to Frith (The Cognitive Neuropsychology of Schizophrenia, 1992) as well as a proposal for the cognitive basis of autonomic computing. A brief discussion of the computability of the operations of PSEN is also given. Chapter XVI Interactive Classification Using a Granule Network............................................................................................235 Yan Zhao, University of Regina, Canada Yiyu Yao, University of Regina, Canada Classification is one of the main tasks in machine learning, data mining, and pattern recognition. Compared with the extensively studied automation approaches, the interactive approaches, centered on human users, are less explored. This chapter studies interactive classification at 3 levels. At the philosophical level, the motivations and a process-based framework of interactive classification are proposed.
At the technical level, a granular computing model is suggested for re-examining not only existing classification problems, but also interactive classification problems. At the application level, an interactive classification system, ICS, using a granule network as the search space, is introduced. ICS allows multiple strategies for granule tree construction, and enhances the understanding and interpretation of the classification process. Interactive classification is complementary to the existing classification methods.
Section IV Knowledge Science Chapter XVII A Cognitive Computational Knowledge Representation Theory.........................................................................247 Mehdi Najjar, University of Sherbrooke, Canada André Mayers, University of Sherbrooke, Canada Encouraging results of previous years in the field of knowledge representation within virtual learning environments confirm that artificial intelligence research on this topic benefits greatly from integrating both the knowledge that psychological research has accumulated on the cognitive mechanisms of human learning and the positive results obtained in computational modelling theories. This chapter introduces a novel cognitive and computational knowledge representation approach inspired by cognitive theories which explain human cognitive activity in terms of memory subsystems and their processes, and whose aim is to suggest formal computational models of knowledge that offer efficient and expressive representation structures for virtual learning. Practical studies both contribute to validate the novel approach and permit drawing general conclusions. Chapter XVIII A Fixpoint Semantics for Rule-Base Anomalies..................................................................................................265 Du Zhang, California State University, USA A crucial component of an intelligent system is its knowledge base, which contains knowledge about a problem domain. Knowledge base development involves domain analysis, context space definition, ontological specification, and knowledge acquisition, codification, and verification. Knowledge base anomalies can affect the correctness and performance of an intelligent system. In this chapter, we describe a fixpoint semantics for a knowledge base that is based on a multi-valued logic (a simplified two-valued illustration is sketched after Chapter XIX's abstract below). We then use the fixpoint semantics to provide formal definitions for 4 types of knowledge base anomalies: inconsistency, redundancy, incompleteness, and circularity. We believe such formal definitions of knowledge base anomalies will help pave the way for a more effective knowledge base verification process. Chapter XIX Development of an Ontology for an Industrial Domain.......................................................................................277 Christine W. Chan, University of Regina, Canada This chapter presents a method for ontology construction and its application in developing an ontology for the domain of natural gas pipeline operations. Both the method and the application ontology developed contribute to the infrastructure of the Semantic Web, which provides a semantic foundation for supporting information processing by autonomous software agents. This chapter presents the processes of knowledge acquisition and ontology construction for developing a knowledge-based decision support system for monitoring and control of natural gas pipeline operations. Knowledge of the problem domain was acquired and analyzed using the Inferential Modeling Technique; the analyzed knowledge was then organized into an application ontology and represented in the Knowledge Modeling System. Since an ontology is an explicit specification of a conceptualization that provides a comprehensive foundational specification of knowledge in a domain, it provides semantic clarifications for autonomous software agents that process information on the Internet.
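Chapter XVIII's fixpoint semantics is defined over a multi-valued logic; as the much simpler two-valued stand-in flagged above, the sketch below (ours, with hypothetical names) evaluates propositional Horn rules bottom-up to their least fixpoint, the style of semantics on which such anomaly definitions are built.

```python
def least_fixpoint(facts, rules):
    """Naive bottom-up evaluation: rules is a list of (body, head) pairs.
    Repeatedly fire rules whose bodies hold until nothing new is derivable."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

rules = [({'bird'}, 'flies'), ({'flies', 'hungry'}, 'hunts')]
print(least_fixpoint({'bird', 'hungry'}, rules))
# {'bird', 'hungry', 'flies', 'hunts'} -- the least model of the facts plus rules
```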
Chapter XX Constructivist Learning During Software Development......................................................................................292 Václav Rajlich, Wayne State University, USA Shaochun Xu, Laurentian University, Canada
This chapter explores the non-monotonic nature of the programmer learning that takes place during incremental program development. It uses a constructivist learning model that consists of four fundamental cognitive activities: absorption that adds new facts to the knowledge, denial that rejects facts that do not fit in, reorganization that reorganizes the knowledge, and expulsion that rejects obsolete knowledge. A case study of an incremental program development illustrates the application of the model and demonstrates that it can explain the learning process with episodes of both increase and decrease in the knowledge. Implications for the documentation systems are discussed in the conclusions. Chapter XXI A Unified Approach to Fractal Dimensions..........................................................................................................304 Witold Kinsner, University of Manitoba, Canada Many scientific chapters treat the diversity of fractal dimensions as mere variations on either the same theme or a single definition. There is a need for a unified approach to fractal dimensions for there are fundamental differences between their definitions. This chapter presents a new description of three essential classes of fractal dimensions based on: (i) morphology, (ii) entropy, and (iii) transforms, all unified through the generalized-entropy-based Rényi fractal dimension spectrum. It discusses practical algorithms for computing 15 different fractal dimensions representing the classes. Although the individual dimensions have already been described in the literature, the unified approach presented in this chapter is unique in terms of (i) its progressive development of the fractal dimension concept, (ii) similarity in the definitions and expressions, (iii) analysis of the relation between the dimensions, and (iv) their taxonomy. As a result, a number of new observations have been made, and new applications discovered. Of particular interest are behavioral processes (such as dishabituation), irreversible and birth-death growth phenomena (e.g., diffusion-limited aggregates, DLAs, dielectric discharges, and cellular automata), as well as dynamical nonstationary transient processes (such as speech and transients in radio transmitters), multifractal optimization of image compression using learned vector quantization with Kohonen’s self-organizing feature maps (SOFMs), and multifractal-based signal denoising. Section V Relevant Development Chapter XXII Cognitive Informatics: Four Years in Practice: A Report on IEEE ICCI’05........................................................327 Du Zhang, California State University, USA Witold Kinsner, University of Manitoba, Canada Jeffrey Tsai, University of Illinois in Chicago, USA Yingxu Wang, University of Calgary, Canada Philip Sheu, University of California, USA Taehyung Wang, California State University, USA The 2005 IEEE International Conference on Cognitive Informatics (ICCI’05) was held during August 8th to 10th 2005 on the campus of University of California, Irvine. This was the fourth conference of ICCI. The previous conferences were held at Calgary, Canada (ICCI’02), London, UK (ICCI’03), and Victoria, Canada (ICCI’04), respectively. 
ICCI’05 was organized by General Co-Chairs of Jeffrey Tsai (University of Illinois) and Yingxu Wang (University of Calgary), Program Co-Chairs of Du Zhang (California State University) and Witold Kinsner (University of Manitoba), and Organization Co-Chairs of Philip Sheu (University of California), Taehyung Wang (California State University, Northridge), and Shangping Ren (Illinois Institute of Technology).
Chapter XXIII Toward Cognitive Informatics and Cognitive Computers: A Report on IEEE ICCI’06.......................................330 Yiyu Yao, University of Regina, Canada Zhongzhi Shi, Chinese Academy of Sciences, China Yingxu Wang, University of Calgary, Canada Witold Kinsner, University of Manitoba, Canada Yixin Zhong, Beijing University of Posts and Telecommunications, China Guoyin Wang, Chongqing University of Posts and Telecommunications, China Zeng-Guang Hou, Chinese Academy of Sciences, China Cognitive informatics (CI) is a cutting-edge and multidisciplinary research area that tackles the fundamental problems shared by modern informatics, computation, software engineering, AI, cybernetics, cognitive science, neuropsychology, medical science, systems science, philosophy, linguistics, economics, management science, and life sciences. CI can be viewed as a trans-disciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of natural intelligence—human brains and minds—and their engineering applications. Compilation of References.................................................................................................................................335 About the Contributors......................................................................................................................................363 Index.....................................................................................................................................................369
Preface
Cognitive informatics (CI) is a new discipline that studies the natural intelligence and internal information processing mechanisms of the brain, as well as the processes involved in perception and cognition. CI provides a coherent set of fundamental theories and contemporary mathematics, which form the foundation for most information and knowledge-based science and engineering disciplines, such as computer science, cognitive science, neuropsychology, systems science, cybernetics, computer/software engineering, knowledge engineering, and computational intelligence. The basic characteristic of the human brain is information processing. Information is recognized as the third essence supplementing matter and energy to model the natural world. Information is any property or attribute of the natural world that can be distinctly elicited, generally abstracted, quantitatively represented, and mentally processed. Informatics is the science of information that studies the nature of information, its processing, and ways of transformation between information, matter, and energy. Cognitive Informatics is the transdisciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of the brain and natural intelligence, and their engineering applications via an interdisciplinary approach. In many disciplines of human knowledge, almost all of the hard problems yet to be solved share a common root in the understanding of the mechanisms of natural intelligence and the cognitive processes of the brain. Therefore, CI is a discipline that links a number of natural science and life science disciplines with informatics and computing science. This book, “Novel Approaches in Cognitive Informatics and Natural Intelligence,” is the first volume in the IGI Global Series of Advances in Cognitive Informatics and Natural Intelligence. It covers five sections on (i) Cognitive Informatics; (ii) Natural Intelligence; (iii) Autonomic Computing; (iv) Knowledge Science; and (v) Relevant Development.
Section I. Cognitive Informatics

A wide range of interesting and ground-breaking progress has been made in CI, especially the theoretical frameworks of CI and denotational mathematics for CI. This section presents the recent advances in CI on theories, models, methodologies, mathematical means, and techniques toward the exploration of the natural intelligence and the brain, which form the foundations for natural intelligence, neural informatics, autonomic computing, and agent systems. This section on cognitive informatics encompasses the following five chapters:

• Chapter I. The Theoretical Framework of Cognitive Informatics
• Chapter II. Is Entropy Suitable to Characterize Data and Signals for Cognitive Informatics?
• Chapter III. Cognitive Processes by using Finite State Machines
• Chapter IV. On the Cognitive Processes of Human Perception with Emotions, Motivations, and Attitudes
• Chapter V. A Selective Sparse Coding Model with Embedded Attention Mechanism
Section II. Natural Intelligence

Natural intelligence, in the narrow sense, is a human or system ability that transforms information into behaviors. In the broad sense, it is any human or system ability that autonomously transfers the forms of abstract information between data, information, knowledge, and behaviors in the brain. The history of the human quest to understand the brain and natural intelligence is certainly as long as human history itself. It is recognized that artificial intelligence is a subset of natural intelligence; therefore, the understanding of natural intelligence is a foundation for investigating artificial, machinable, and computational intelligence. This section on natural intelligence encompasses the following six chapters:

• Chapter VI. The Cognitive Processes of Formal Inferences
• Chapter VII. Neo-Symbiosis: The Next Stage in the Evolution of Human Information Interaction
• Chapter VIII. Language, Logic, and the Brain
• Chapter IX. The Cognitive Process of Decision Making
• Chapter X. A Commonsense Approach to Representing Spatial Knowledge Between Extended Objects
• Chapter XI. A Formal Specification of the Memorization Process
Section III. Autonomic Computing

Approaches to computing can be classified into two categories, known as imperative and autonomic computing. Correspondingly, computing systems may be implemented as imperative or autonomic computing systems. An imperative computing system is a passive system that implements deterministic, context-free, and stored-program-controlled behaviors. An autonomic computing system, by contrast, is an intelligent system that autonomously carries out robotic and interactive actions based on goal- and event-driven mechanisms; it implements nondeterministic, context-dependent, and adaptive behaviors. Autonomic computing does not rely on instructive and procedural information, but depends on internal status and on willingness formed by long-term historical events and current rational or emotional goals (a toy sketch of this contrast appears after the chapter list below). This section on autonomic computing encompasses the following five chapters:

• Chapter XII. Theoretical Foundations of Autonomic Computing
• Chapter XIII. Towards Cognitive Machines: Multiscale Measures and Analysis
• Chapter XIV. Towards Autonomic Computing: Adaptive Neural Network for Trajectory Planning
• Chapter XV. Cognitive Modelling Applied to Aspects of Schizophrenia and Autonomic Computing
• Chapter XVI. Interactive Classification Using a Granule Network
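As flagged above, here is a toy contrast of the two paradigms. It is ours and purely illustrative, with hypothetical perceive, goal_met, policy, and act hooks standing in for a real autonomic architecture.

```python
def imperative_top10(data):
    # Deterministic, context-free, stored-program behavior: the same fixed
    # instruction sequence runs regardless of circumstances.
    return sorted(data, reverse=True)[:10]

def autonomic_run(perceive, goal_met, policy, act):
    # Goal- and event-driven behavior: what happens next depends on the
    # system's current internal status and on perceived events.
    while not goal_met():
        event = perceive()
        act(policy(event))
```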
Section IV. Knowledge Science

Knowledge science is an emerging field that studies the nature of human knowledge, its mathematical models, and its manipulation. Because almost all disciplines of science and engineering deal with information and knowledge, investigation into the generic theories of knowledge science and its cognitive foundations is one of the profound areas of cognitive informatics. Francis Bacon (1561-1626) asserted that “knowledge is power.” In CI, knowledge is recognized as one of the important forms of cognitive information, supplementary to behaviors, experience, and skills. This section on knowledge science encompasses the following five chapters:

• Chapter XVII. A Cognitive Computational Knowledge Representation Theory
• Chapter XVIII. A Fixpoint Semantics for Rule-Base Anomalies
• Chapter XIX. Development of an Ontology for an Industrial Domain
• Chapter XX. Constructivist Learning During Software Development
• Chapter XXI. A Unified Approach to Fractal Dimensions
Section V. Relevant Development

A series of IEEE International Conferences on Cognitive Informatics (ICCI) has been organized annually. The inaugural conference was held at Calgary, Canada (ICCI’02), followed by events in London, UK (ICCI’03); Victoria, Canada (ICCI’04); Irvine, USA (ICCI’05); Beijing, China (ICCI’06); Lake Tahoe, USA (ICCI’07); and Stanford University, USA (ICCI’08). This section on relevant development encompasses the following two chapters:

• Chapter XXII. Cognitive Informatics: Four Years in Practice: A Report on IEEE ICCI’05
• Chapter XXIII. Toward Cognitive Informatics and Cognitive Computers: A Report on IEEE ICCI’06
A wide range of applications of CI has been identified. The key application areas of CI can be divided into two categories. The first category of applications implements informatics and computing techniques to investigate cognitive science problems, such as memory, learning, and reasoning. The second category adopts cognitive theories to investigate problems in informatics, computing, and software/knowledge engineering. CI focuses on the nature of information processing in the brain, such as information acquisition, representation, memory, retrieval, generation, and communication. Through the interdisciplinary approach and with the support of modern information and neuroscience technologies, mechanisms of the brain and the mind may be systematically explored within the framework of CI.
Acknowledgment
Many persons have contributed their dedicated work to this book and related research and events. The Editor-in-Chief would like to thank all authors, the associate editors of IJCINI, the editorial board members, and invited reviewers for their great contributions to this book. I would also like to thank the IEEE Steering Committee and organizers of the series of IEEE International Conferences on Cognitive Informatics (ICCI) in the last eight years, particularly Witold Kinsner, James Anderson, Witold Pedrycz, John Bickle, Du Zhang, Yiyu Yao, Jeffrey Tsai, Philip Sheu, Jean-Claude Latombe, Dilip Patel, Christine Chan, Shushma Patel, Guoyin Wang, Ron Johnston, and Michael R.W. Dawson. I would like to acknowledge the publisher of this book, IGI Global, USA. I would like to thank Dr. Mehdi Khosrow-Pour, Jan Travers, Kristin M. Klinger, and Deborah Yahnke for their professional editorship. I would also like to thank Maggie Ma and Siyuan Wang for their valuable help and assistance.
Yingxu Wang
Section I
Cognitive Informatics
Chapter I
The Theoretical Framework of Cognitive Informatics Yingxu Wang University of Calgary, Canada
Abstract Cognitive Informatics (CI) is a transdisciplinary enquiry of the internal information processing mechanisms and processes of the brain and natural intelligence shared by almost all science and engineering disciplines. This chapter presents an intensive review of the new field of CI. The structure of the theoretical framework of CI is described, encompassing the Layered Reference Model of the Brain (LRMB), the OAR model of information representation, Natural Intelligence (NI) vs. Artificial Intelligence (AI), Autonomic Computing (AC) vs. imperative computing, CI laws of software, the mechanism of human perception processes, the cognitive processes of formal inferences, and the formal knowledge system. Three types of new structures of mathematics, Concept Algebra (CA), Real-Time Process Algebra (RTPA), and System Algebra (SA), are created to enable rigorous treatment of cognitive processes of the brain as well as knowledge representation and manipulation in a formal and coherent framework. A wide range of applications of CI in cognitive psychology, computing, knowledge engineering, and software engineering has been identified and discussed.
Introduction

The development of classical and contemporary informatics, and the cross-fertilization between computer science, systems science, cybernetics, computer/software engineering, cognitive science, knowledge engineering, and neuropsychology, have led to an entirely new and interesting research field known as Cognitive Informatics (Wang, 2002a; Wang et al., 2002; Wang, 2003a/b; Wang, 2006b; Wang and Kinsner, 2006). Informatics is the science of information that studies the nature of information, its processing, and the ways of transformation between information, matter, and energy.

Definition 1. Cognitive Informatics (CI) is a transdisciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of the brain and natural intelligence, and their engineering applications, via an interdisciplinary approach.
In many disciplines of human knowledge, almost all of the hard problems yet to be solved share a common root in the understanding of the mechanisms of natural intelligence and the cognitive processes of the brain. Therefore, CI is a discipline that forges links between a number of natural science and life science disciplines with informatics and computing science. The structure of the theoretical framework of CI is described in Figure 1, which covers the Information-Matter-Energy (IME) model (Wang, 2003b), the Layered Reference Model of the Brain (LRMB) (Wang et al., 2006), the Object-Attribute-Relation (OAR) model of information representation in the brain (Wang, 2006h; Wang and Wang, 2006), the cognitive informatics model of the brain (Wang et al., 2003; Wang and Wang, 2006), Natural Intelligence (NI) (Wang, 2003b), Autonomic Computing (AC) (Wang, 2004), Neural Informatics (NeI) (Wang, 2002a; Wang, 2003b; Wang, 2006b), CI laws of software (Wang, 2006f), the mechanisms of human perception processes (Wang, 2005a), the cognitive processes of formal inferences (Wang, 2005c), and the formal knowledge system (Wang, 2006g). In this chapter, the theoretical framework of CI is explained in Section 2. Three structures of new descriptive mathematics such as Concept Algebra (CA), Real-Time Process Algebra (RTPA), and System Algebra (SA) are introduced in Section 3 in order to rigorously deal with knowledge and cognitive information representation and manipulation in a formal and coherent framework. Applications of CI are discussed in Section 4, which covers cognitive computing, knowledge engineering, and software engineering. Section 5 draws conclusions on the theories of CI, the contemporary mathematics for CI, and their applications.
Figure 1. The theoretical framework of cognitive informatics (CI)

CI Theories (T): T1 The IME model; T2 The LRMB model; T3 The OAR model; T4 The CI model of the brain; T5 Natural intelligence; T6 Neural informatics; T7 CI laws of software; T8 Perception processes; T9 Inference processes; T10 The knowledge system.

Descriptive Mathematics for CI (M): M1 Concept algebra (CA); M2 RTPA; M3 System algebra (SA).

CI Applications (A): A1 Future generation computers; A2 Capacity of human memory; A3 Autonomic computing; A4 Cognitive properties of knowledge; A5 Simulation of cognitive behaviors; A6 Agent systems; A7 CI foundations of software engineering; A8 Deductive semantics of software; A9 Cognitive complexity of software.
The Fundamental Theories of CI

The fundamental theories of CI encompass ten transdisciplinary areas and fundamental models, T1 through T10, as identified in Figure 1. This section presents an intensive review of the theories developed in CI, which form a foundation for exploring natural intelligence and its applications in brain science, neural informatics, computing, knowledge engineering, and software engineering.
The Information-Matter-Energy (IME) Model

Information is recognized as the third essence of the natural world, supplementing matter and energy (Wang, 2003b), because the primary function of the human brain is information processing.

Theorem 1. A generic world view, the Information-Matter-Energy (IME) model, states that the natural world (NW) that forms the context of human beings is a dual world: one aspect of it is the physical or concrete world (PW), and the other is the abstract or perceptive world (AW), where matter (M) and energy (E) are used to model the former, and information (I) the latter, i.e.:

NW ≜ PW || AW = p(M, E) || a(I) = n(I, M, E)  (1)

where || denotes a parallel relation, and p, a, and n are functions that determine a certain PW, AW, or NW, respectively, as illustrated in Figure 2. According to the IME model, information plays a vital role in connecting the physical world with the abstract world. Models of the natural world have been well studied in physics and other natural sciences. However, the modeling of the abstract world is still a fundamental issue yet to be explored in cognitive informatics, computing, software science, cognitive science, brain sciences, and knowledge engineering. In particular, the relationships between I-M-E and their transformations are deemed one of the fundamental questions in CI.

Corollary 1. The natural world NW(I, M, E), particularly the part of the abstract world AW(I), is cognized and perceived differently by individuals because of the uniqueness of perceptions and mental contexts among people.

Corollary 1 indicates that although the physical world PW(M, E) is the same to everybody, the natural world NW(I, M, E) is unique to each individual, because the abstract world AW(I), as a part of it, is subjective, depending on the information the individual obtains and perceives.
Figure 2. The IME model of the world view: information (I) models the abstract world (AW), while matter (M) and energy (E) model the physical world (PW); together they constitute the natural world (NW).
Corollary 2. The principle of transformability between I-M-E states that, according to the IME model, the three essences of the world are predicated to be transformable between each other, as described by the following generic functions f1 to f6:

I = f1(M)  (2.1)
M = f2(I) ≟ f1⁻¹(I)  (2.2)
I = f3(E)  (2.3)
E = f4(I) ≟ f3⁻¹(I)  (2.4)
E = f5(M)  (2.5)
M = f6(E) = f5⁻¹(E)  (2.6)
where a question mark on the equal sign denotes uncertainty about whether such a reverse function exists (Wang, 2003b). Albert Einstein revealed functions f5 and f6, the relationship between matter (m) and energy (E), in the form E = mc², where c is the speed of light. It remains a great curiosity to explore what the remaining relationships and forms of transformation between I-M-E will be. To a certain extent, cognitive informatics is the science that seeks possible solutions for f1 to f4. A clue to exploring the relations and transformability is believed to lie in the understanding of natural intelligence and its information processing mechanisms in CI.

Definition 2. Information in CI is defined as a generic abstract model of properties or attributes of the natural world that can be distinctly elicited, generally abstracted, quantitatively represented, and mentally processed.

Definition 3. The measurement of information, Ik, is defined by the cost of code to abstractly represent a given size of internal message X in the brain in a digital system based on k, i.e.:

Ik = f : X → Sk = ⌈logk X⌉  (3)
where Ik is the content of information in a k-based digital system, and Sk is the measurement scale based on k. The unit of Ik is the number of k-based digits (Wang, 2003b). Eq. 3 is a generic measure of information sizes. When a binary digital representation system is adopted, i.e., k = b = 2, it becomes the most practical one, as follows.

Definition 4. The meta-level representation of information, Ib, is obtained when k = b = 2, i.e.:

Ib = f : X → Sb = ⌈logb X⌉  (4)
where the unit of information, Ib, is a bit. Note that the bit here is a concrete and deterministic unit; it is no longer probability-based as in conventional information theories (Shannon, 1948; Bell, 1953). To a certain extent, computer science and engineering is a branch of modern informatics that studies machine representation and processing of external information, while CI is a branch of contemporary informatics that studies internal information representation and processing in the brain.
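As a concrete illustration of Eqs. 3 and 4, the following minimal Python sketch computes the information content of a message with a given number of distinguishable states; the function name and sample values are illustrative assumptions, not part of the CI formalism.

```python
import math

def information_content(states: int, k: int = 2) -> int:
    """I_k = ceil(log_k states): the number of k-based digits needed
    to represent a message with `states` distinguishable values (Def. 3)."""
    if states < 1 or k < 2:
        raise ValueError("states must be >= 1 and k >= 2")
    return math.ceil(math.log(states, k))

# A 1000-state message costs 10 bits (k = b = 2, Def. 4),
# but only 3 decimal digits (k = 10).
print(information_content(1000, k=2))   # 10
print(information_content(1000, k=10))  # 3
```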
Theorem 2. The most fundamental form of information that can be represented and processed is the binary digit, where k = b = 2.

Theorem 2 indicates that any form of information in the physical (natural) and abstract (mental) worlds can be unified on the basis of binary data. This is the CI foundation of modern digital computers and NI.
The Layered Reference Model of the Brain The Layered Reference Model of the Brain (LRMB) (Wang et al., 2006) is developed to explain the fundamental cognitive mechanisms and processes of natural intelligence. Because a variety of life functions and cognitive processes have been identified in CI, psychology, cognitive science, brain science, and neurophilosophy, there is a need to organize all the recurrent cognitive processes in an integrated and coherent framework. The LRMB model explains the functional mechanisms and cognitive processes of natural intelligence that encompasses 37 cognitive processes at six layers known as the sensation, memory, perception, action, meta cognitive, and higher cognitive layers from the bottom-up as shown in Figure 3. LRMB elicits the core and highly repetitive recurrent cognitive processes from a huge variety of life functions, which may shed light on the study of the fundamental mechanisms and interactions of complicated mental processes, particularly the relationships and interactions between the inherited and the acquired life functions as well as those of the subconscious and conscious cognitive processes.
Figure 3. The layered reference model of the brain (LRMB): subconscious cognitive processes at Layer 1 (sensation), Layer 2 (memory), Layer 3 (perception), and Layer 4 (action); conscious cognitive processes at Layer 5 (meta-cognitive functions) and Layer 6 (higher cognitive functions).

The OAR Model of Information Representation in the Brain

Investigation into the cognitive models of information and knowledge representation in the brain is perceived to be one of the fundamental research areas that help unveil the mechanisms of the brain. The Object-Attribute-Relation (OAR) model (Wang et al., 2003; Wang, 2006h) describes human memory, particularly long-term memory, by using the relational metaphor, rather than the traditional container metaphor adopted in psychology, computing, and information science. The OAR model shows that human memory and knowledge are represented by relations, i.e., connections of synapses between neurons, rather than by the neurons themselves, as the traditional container metaphor described. The OAR model can be used to explain a wide range of human information processing mechanisms and cognitive processes.
The Cognitive Informatics Model of the Brain

The human brain and its information processing mechanisms are central to CI. A cognitive informatics model of the brain is proposed in (Wang and Wang, 2006), which explains natural intelligence via interactions between the inherent (subconscious) and acquired (conscious) life functions. The model demonstrates that memory is the foundation of any natural intelligence. Formalism, in the forms of mathematics, logic, and rigorous treatment, is introduced into the study of cognitive and neural psychology and natural informatics. Fundamental cognitive mechanisms of the brain, such as the architecture of the thinking engine, internal knowledge representation, long-term memory establishment, and the roles of sleep in long-term memory development, have been investigated (Wang and Wang, 2006).
Natural Intelligence (NI)

Natural Intelligence (NI) is the domain of CI. Software and computer systems are recognized as a subset of intelligent behaviors of human beings described by programmed instructive information (Wang, 2003b; Wang and Kinsner, 2006). The relationship between Artificial Intelligence (AI) and NI can be described by the following theorem.

Theorem 3. The law of compatible intelligent capability states that artificial intelligence (AI) is always a subset of natural intelligence (NI), i.e.:

AI ⊆ NI  (5)

Theorem 3 indicates that AI is dominated by NI. Therefore, one should not expect a computer or a software system to solve a problem that humans cannot. In other words, no AI or computing system may be designed and/or implemented for a given problem where no solution is known to human beings.
Neural Informatics (NeI)

Definition 5. Neural Informatics (NeI) is a new interdisciplinary enquiry of the biological and physiological representation of information and knowledge in the brain at the neuron level, and their abstract mathematical models (Wang, 2004; Wang and Wang, 2006).

NeI is a branch of CI, where memory is recognized as the foundation and platform of any natural or artificial intelligence (Wang and Wang, 2006).

Definition 6. The Cognitive Model of Memory (CMM) states that the architecture of human memory is parallel-configured by the Sensory Buffer Memory (SBM), Short-Term Memory (STM), Long-Term Memory (LTM), and Action-Buffer Memory (ABM), i.e.:

CMM ≜ SBM || STM || LTM || ABM  (6)

where the ABM is newly identified in (Wang and Wang, 2006).
The major organ that accommodates memories in the brain is the cerebrum, or the cerebral cortex. In particular, these include the association and premotor cortices in the frontal lobe, the temporal lobe, the sensory cortex in the frontal lobe, the visual cortex in the occipital lobe, the primary motor cortex in the frontal lobe, the supplementary motor area in the frontal lobe, and procedural memory in the cerebellum (Wang and Wang, 2006). The CMM model and the mapping of the four types of human memory onto the physiological organs of the brain reveal a set of fundamental mechanisms of NeI. The OAR model of information/knowledge representation described in Section 2.3 provides a generic description of information/knowledge representation in the brain (Wang et al., 2003; Wang, 2006h). The theories of CI and NeI explain a number of important questions in the study of NI. Enlightening conclusions derived in CI and NeI include: (a) LTM establishment is a subconscious process; (b) long-term memory is established during sleep; (c) the major mechanism for LTM establishment is sleep; (d) the general acquisition cycle of LTM is equal to or longer than 24 hours; (e) the mechanism of LTM establishment is to update the entire memory of information represented as an OAR model in the brain; and (f) eye movement and dreams play an important role in LTM creation. The latest development in CI and NeI has led to the determination of the magnitude of the expected capacity of human memory, as described in Section 4.2.
Cognitive Informatics Laws of Software

It is commonly conceived that software, as an artifact of human creativity, is not constrained by the laws and principles discovered in the physical world. However, it is unknown what does constrain software. The new informatics metaphor proposed by the author in CI perceives software as a type of instructive and behavioral information. Based on this, it is asserted that software obeys the laws of informatics. A comprehensive set of 19 CI laws for software has been established in (Wang, 2006f), as follows:

1. Abstraction
2. Generality
3. Cumulativeness
4. Dependency on cognition
5. Three-dimensional behavior space known as the object (O), space (S), and time (T)
6. Sharability
7. Dimensionless
8. Weightless
9. Transformability between I-M-E
10. Multiple representation forms
11. Multiple carrying media
12. Multiple transmission forms
13. Dependency on media
14. Dependency on energy
15. Wearless and time dependency
16. Conservation of entropy
17. Quality attributes of informatics
18. Susceptible to distortion
19. Scarcity
The informatics laws of software extend the knowledge of the fundamental laws and properties of software beyond what the conventional product metaphor could explain. Therefore, CI forms one of the foundations of software engineering and computing science.
Mechanisms of Human Perception Processes Definition 7. Perception is a set of interpretive cognitive processes of the brain at the subconscious cognitive function layers that detects, relates, interprets, and searches internal cognitive information in the mind.
Perception may be considered as the sixth sense of human beings which almost all cognitive life functions rely on. Perception is also an important cognitive function at the subconscious layers that determines personality. In other words, personality is a faculty of all subconscious life functions and experience cumulated via conscious life functions. According to LRMB, the main cognitive processes at the perception layer are emotion, motivation, and attitude (Wang, 2005a). The relationship between the internal emotion, motivation, attitude, and the embodied external behaviors can be formally and quantitatively described by the motivation/attitude-driven behavioral (MADB) model (Wang and Wang, 2006), which demonstrates that complicated psychological and cognitive mental processes may be formally modeled and rigorously described by mathematical means (Wang, 2002b; Wang, 2003d; Wang, 2005c).
The Cognitive Processes of Formal Inferences

Theoretical research is predominately an inductive process, while applied research is mainly a deductive one. Both inference processes are based on the cognitive process and means of abstraction. Abstraction is a powerful means of philosophy and mathematics. It is also a preeminent trait of the human brain identified in CI studies (Wang, 2005c). All formal logical inferences and reasonings can only be carried out on the basis of abstract properties shared by a given set of objects under study.

Definition 8. Abstraction is a process to elicit a subset of objects that share a common property from a given set of objects, and to use the property to identify and distinguish the subset from the whole, in order to facilitate reasoning.

Abstraction is a gifted capability of human beings, and a basic cognitive process of the brain at the meta-cognitive layer according to LRMB (Wang et al., 2006). Only by abstraction can important theorems and laws about the objects under study be elicited and discovered from a great variety of phenomena and empirical observations in an area of inquiry.

Definition 9. Inference is a formal cognitive process that reasons a possible causality from given premises, based on known causal relations between a pair of cause and effect proven true by empirical arguments, theoretical inferences, or statistical regularities.

Formal inferences may be classified into the deductive, inductive, abductive, and analogical categories (Wang, 2005c). Deduction is a cognitive process by which a specific conclusion necessarily follows from a set of general premises. Induction is a cognitive process by which a general conclusion is drawn from a set of specific premises, based on three designated samples in reasoning or experimental evidence. Abduction is a cognitive process by which an inference is made to the best explanation or most likely reason for an observation or event. Analogy is a cognitive process by which an inference is made that the same relations hold between different domains or systems, and/or that if two things agree in certain respects then they probably agree in others. A summary of the formal definitions of the five inference techniques is given in Table 1. In seeking generality and universal truth, either the objects or the relations can only be abstractly described and rigorously inferred by abstract models rather than real-world details.
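To make the distinction between these inference modes concrete, the following toy Python sketch performs deduction (forward chaining over rules p ⇒ q) and abduction (collecting candidate explanations for an observed q); the rule base and names are invented for illustration only.

```python
# Toy rule base: each rule is a (premise, conclusion) pair, p => q.
RULES = [("rain", "wet_ground"), ("sprinkler", "wet_ground"),
         ("wet_ground", "slippery")]

def deduce(facts: set[str]) -> set[str]:
    """Deduction: forward-chain p => q over known facts until fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for p, q in RULES:
            if p in derived and q not in derived:
                derived.add(q)
                changed = True
    return derived

def abduce(observation: str) -> set[str]:
    """Abduction: collect premises p with p => observation,
    i.e., candidate best explanations of the observation."""
    return {p for p, q in RULES if q == observation}

print(deduce({"rain"}))      # {'rain', 'wet_ground', 'slippery'}
print(abduce("wet_ground"))  # {'rain', 'sprinkler'}
```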
The Formal Knowledge System

Mathematical thought (Jordan and Smith, 1997) provides a successful paradigm to organize and validate human knowledge, whereby once a truth or a theorem is established, it remains true until the axioms or conditions on which it stands are changed or extended. A proven truth or theorem in mathematics does not need to be argued each time it is used. This is the advantage and efficiency of formal knowledge in science and engineering. In other words, if any theory or conclusion may be argued from time to time based on a wiser idea or a trade-off, it is an empirical result rather than a formal one.
Table 1. Definitions of formal inferences

1. Abstraction
   Primitive form: ∀S, p ⇒ ∃e ∈ E ⊆ S, p(e)
   Composite form: (none)
   Usage: To elicit a subset of elements with a given generic property.

2. Deduction
   Primitive form: ∀x ∈ X, p(x) ⇒ ∃a ∈ X, p(a)
   Composite form: (∀x ∈ X, p(x) ⇒ q(x)) ⇒ (∃a ∈ X, p(a) ⇒ q(a))
   Usage: To derive a conclusion based on known and generic premises.

3. Induction
   Primitive form: ((∃a ∈ X, P(a)) ∧ (∃k, k+1 ∈ X, (P(k) ⇒ P(k+1)))) ⇒ ∀x ∈ X, P(x)
   Composite form: ((∃a ∈ X, p(a) ⇒ q(a)) ∧ (∃k, k+1 ∈ X, ((p(k) ⇒ q(k)) ⇒ (p(k+1) ⇒ q(k+1))))) ⇒ (∀x ∈ X, p(x) ⇒ q(x))
   Usage: To determine the generic behavior of a given list or sequence of recurring patterns by three samples.

4. Abduction
   Primitive form: (∀x ∈ X, p(x) ⇒ q(x)) ⇒ (∃a ∈ X, q(a) ⇒ p(a))
   Composite form: (∀x ∈ X, (p(x) ⇒ q(x)) ∧ (r(x) ⇒ q(x))) ⇒ (∃a ∈ X, q(a) ⇒ (p(a) ∨ r(a)))
   Usage: To seek the most likely cause(s) and reason(s) of an observed phenomenon.

5. Analogy
   Primitive form: ∃a ∈ X, p(a) ⇒ ∃b ∈ X, p(b)
   Composite form: (∃a ∈ X, p(a) ⇒ q(a)) ⇒ (∃b ∈ X, p(b) ⇒ q(b))
   Usage: To predict a similar phenomenon or consequence based on a known observation.
The Framework of Formal Knowledge (FFK) of mankind (Wang, 2006g) can be described as shown in Figure 4. An FFK is centered by a set of theories. A theory is a statement of how and why certain objects, facts, or truths are related. All objects in nature and their relations are constrained by invariable laws, whether one observes them or not at any given time. An empirical truth is a truth based on, or verifiable by, observation, experiment, or experience. A theoretical proposition is an assertion based on formal theories or logical reasoning. Theoretical knowledge is a formalization of generic truth and proven, abstracted empirical knowledge. Theoretical knowledge may be easier to acquire when it exists. However, empirical knowledge is very difficult to gain without hands-on practice.

According to the FFK model, an immature discipline of science and engineering is characterized by a body of knowledge that has not yet been formalized. Instead of a set of proven theories, the immature disciplines document a large set of observed facts, phenomena, and their possible or partially working explanations and hypotheses. In such disciplines, researchers and practitioners might argue every informal conclusion documented in natural languages from time to time, possibly for hundreds of years, until it is formally described in mathematical forms and proved rigorously. The disciplines of mathematics and physics are successful paradigms that adopt the FFK formal knowledge system. The key advantages of the formal knowledge system are its stability and efficiency. The former is the property that, once established and formally proved, a piece of formal knowledge need not be re-examined or re-proved by users who refer to it. The latter is the property that formal knowledge is exclusively true or false, which saves everybody's time from arguing a proven theory.
Figure 4. The framework of formal knowledge (FFK): the formal knowledge system relating disciplines, doctrines, definitions, concepts, propositions, hypotheses, theories (theorems, lemmas, corollaries), empirical verifications, formal proofs, arguments, factors, instances, truths, phenomena, laws, principles, rules, models, case studies, statistical norms, methodologies, and algorithms.

Denotational Mathematics for CI

The history of science and engineering shows that new problems require new forms of mathematics. CI is a new discipline, and its problems require new mathematical means that are descriptive and precise in expressing and denoting human and system actions and behaviors. Conventional analytic mathematics is unable to solve the fundamental problems inherent in CI and related disciplines such as neuroscience, psychology, philosophy, computing, software engineering, and knowledge engineering. Therefore, denotational mathematical structures and means (Wang, 2006c) beyond mathematical logic are yet to be sought. Although there are various ways to express facts, objects, notions, relations, actions, and behaviors in natural languages, it is found in CI that human and system behaviors may be classified into three basic categories known as to be, to have, and to do. All mathematical means and forms, in general, are abstract and formal descriptions of these three categories of expressibility and their rules. Taking this view, mathematical logic may be perceived as the abstract means for describing 'to be,' set theory as describing 'to have,' and algebras, particularly process algebra, as describing 'to do.'

Theorem 4. The utility of mathematics is the means and rules to express thought rigorously and generically at a higher level of abstraction.

Three types of new mathematics, Concept Algebra (CA), Real-Time Process Algebra (RTPA), and System Algebra (SA), are created in CI to enable rigorous treatment of knowledge representation and manipulation in a formal and coherent framework. The three new structures of contemporary mathematics have extended the abstract objects under study in mathematics from basic entities such as numbers and sets to a higher level, i.e., concepts, behavioral processes, and systems. A wide range of applications of the denotational mathematics in the context of CI has been identified (Wang, 2002b; Wang, 2006d; Wang, 2006e).
Concept Algebra (CA)

A concept is a cognitive unit (Ganter and Wille, 1999; Quillian, 1968; Wang, 2006e) by which the meanings and semantics of a real-world or an abstract entity may be represented and embodied based on the OAR model.

Definition 10. An abstract concept c is a 5-tuple, i.e.:

c ≜ (O, A, Rc, Ri, Ro)  (7)

where
• O is a nonempty set of objects of the concept, O = {o1, o2, …, om} ⊆ ÞU, where ÞU denotes a power set of U;
• A is a nonempty set of attributes, A = {a1, a2, …, an} ⊆ ÞM;
• Rc ⊆ O × A is a set of internal relations;
• Ri ⊆ C′ × C is a set of input relations, where C′ is a set of external concepts;
• Ro ⊆ C × C′ is a set of output relations.

A structural concept model of c = (O, A, Rc, Ri, Ro) is illustrated in Figure 5, where c, A, O, and R, with R = {Rc, Ri, Ro}, denote the concept, its attributes, objects, and internal/external relations, respectively.
Definition 11. Concept algebra (CA) is a new mathematical structure for the formal treatment of abstract concepts and their algebraic relations, operations, and associative rules for composing complex concepts and knowledge (Wang, 2006e).
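A minimal data-structure sketch of Definition 10 in Python follows; the class and field names are ours, and the example concept is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """An abstract concept c = (O, A, Rc, Ri, Ro) per Definition 10."""
    name: str
    objects: frozenset        # O: objects denoted by the concept
    attributes: frozenset     # A: attributes shared by the objects
    internal: frozenset = frozenset()   # Rc, a subset of O x A
    inputs: frozenset = frozenset()     # Ri: relations from external concepts
    outputs: frozenset = frozenset()    # Ro: relations to external concepts

pen = Concept("pen",
              objects=frozenset({"ballpoint", "fountain_pen"}),
              attributes=frozenset({"writing_tool", "holds_ink"}))
```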
Figure 5. The structural model of an abstract concept: a concept c with attributes A, objects O, internal relations Rc, and input/output relations Ri and Ro connecting it to other concepts.

Figure 6. The nine concept association operations as knowledge composing rules between concepts c1 and c2: inheritance, extension, tailoring, substitute, composition, decomposition, aggregation, specification, and instantiation.
Concept algebra deals with the algebraic relations and associational rules of abstract concepts. The associations of concepts form a foundation for denoting complicated relations between concepts in knowledge representation. The associations among concepts can be classified into nine categories: inheritance, extension, tailoring, substitute, composition, decomposition, aggregation, specification, and instantiation, as shown in Figure 6 and Table 2 (Wang, 2006e). In Figure 6, R = {Rc, Ri, Ro}, and all nine associations describe composing rules among concepts, except instantiation, which is a relation between a concept and a specific object.

Definition 12. A generic knowledge K is an n-ary relation Rk among a set of n multiple concepts in C, i.e.:

K = Rk : (⨯_{i=1}^{n} Ci) → C  (8)

where ⋃_{i=1}^{n} Ci = C, and Rk ∈ ℜ, the set of the nine concept association operations defined in CA.

In Definition 12, the relation Rk is one of the concept operations in CA as defined in Table 2 (Wang, 2006e) that serve as the knowledge composing rules.

Definition 13. A concept network CN is a hierarchical network of concepts interlinked by the set of nine associations ℜ defined in CA, i.e.:

CN = Rk : ⨯_{i=1}^{n} Ci → ⨯_{j=1}^{n} Cj  (9)

where Rk ∈ ℜ. Because the relations between concepts are transitive, the generic topology of knowledge is a hierarchical concept network. The advantages of the hierarchical knowledge architecture K in the form of concept networks are as follows: (a) dynamic: the knowledge network may be updated dynamically along with information acquisition and learning, without destructing the existing concept nodes and relational links; and (b) evolvable: the knowledge network may grow adaptively, without changing the overall and existing structure of the hierarchical network. A summary of the algebraic relations and operations of concepts defined in CA is provided in Table 2.
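The following small sketch illustrates a concept network in the sense of Definitions 12 and 13, using a simplified inheritance operation in which the child concept keeps the parent's attributes and adds its own; the dictionary-based representation and the example concepts are assumptions, not CA's formal operators.

```python
# Concepts as named attribute sets; the network as labeled association edges.
concepts: dict[str, set[str]] = {
    "pen": {"writing_tool", "holds_ink"},
}
network: list[tuple[str, str, str]] = []   # (source, association, target)

def inherit(parent: str, child: str, extra: set[str]) -> None:
    """Simplified inheritance: child = parent's attributes plus its own."""
    concepts[child] = concepts[parent] | extra
    network.append((parent, "inheritance", child))

inherit("pen", "fountain_pen", {"refillable"})
print(concepts["fountain_pen"])  # {'writing_tool', 'holds_ink', 'refillable'}
print(network)                   # [('pen', 'inheritance', 'fountain_pen')]
```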
Real-Time Process Algebra (RTPA)

A key metaphor in system modeling, specification, and description is that a software system can be perceived and described as the composition of a set of interacting processes. Hoare (1985), Milner (1989), and others developed various algebraic approaches to represent communicating and concurrent systems, known as process algebras. A process algebra is a set of formal notations and rules for describing algebraic relations of software processes. Real-Time Process Algebra (RTPA) (Wang, 2002b; Wang, 2005b) extends process algebra to time/event, architecture, and system dispatching manipulations, in order to formally describe and specify architectures and behaviors of software systems. A process in RTPA is a computational operation that transforms a system from one state to another by changing its inputs, outputs, and/or internal variables. A process can be a single meta-process or a complex process formed by using the process combination rules of RTPA, known as process relations.

Definition 14. Real-Time Process Algebra (RTPA) is a set of formal notations and rules for describing algebraic and real-time relations of software processes.

RTPA models 17 meta-processes and 17 process relations. A meta-process is an elementary and primary process that serves as a common and basic building block for a software system. Complex processes can be derived from meta-processes by a set of process relations that serve as process combinatory rules. Detailed semantics of RTPA may be referred to in (Wang, 2002b).
Program modeling concerns the coordination of computational behaviors with given data objects. Behavioral or instructive knowledge can be modeled by RTPA. A generic program model can be described by a formal treatment of statements, processes, and complex processes from the bottom up in the program hierarchy.

Definition 15. A process P is a composed listing and a logical combination of n meta-statements pi and pj, 1 ≤ i < n, 1 < j ≤ m = n + 1, according to certain composing relations rij, i.e.:

P = R_{i=1}^{n-1} (p_i r_{ij} p_j), j = i + 1
  = (…(((p_1) r_{12} p_2) r_{23} p_3) … r_{n-1,n} p_n)  (10)
where the big-R notation (Wang, 2002b; Wang, 2007) is adopted to describe the nature of processes as the building blocks of programs.

Definition 16. A program P is a composition of a finite set of m processes according to the time-, event-, and interrupt-based process dispatching rules, i.e.:

P = R_{k=1}^{m} (@e_k ↳ P_k)  (11)
Equations 10 and 11 indicate that a program is an embedded relational algebraic entity. A statement p in a program is an instantiation of a meta-instruction of a programming language that executes a basic unit of coherent function and leads to a predictable behavior.

Theorem 5. The embedded relational model (ERM) states that a software system or a program P is a set of complex embedded relational processes, in which all previous processes of a given process form the context of the current process, i.e.:

P = R_{k=1}^{m} (@e_k ↳ P_k)
  = R_{k=1}^{m} [@e_k ↳ R_{i=1}^{n-1} (p_i(k) r_{ij}(k) p_j(k))], j = i + 1  (12)
ERM presented in Theorem 5 provides a unified mathematical model of programs (Wang, 2006a) for the first time, revealing that a program is a finite and nonempty set of embedded binary relations between a current statement and all previous ones that form the semantic context or environment of computing.

Definition 17. A meta-process is the most basic and elementary process in computing that cannot be broken down further. The set of meta-processes P encompasses 17 fundamental primitive operations in computing:

P = {:=, …, ⇒, ⇐, …, |, |, @, …, ↑, ↓, !, …, §}  (13)

denoting assignment, evaluation, addressing, memory allocation, memory release, read, write, input, output, timing, duration, increase, decrease, exception detection, skip, stop, and system, respectively (see Table 2).

Definition 18. A process relation is a composing rule for constructing complex processes by using the meta-processes. The process relations R of RTPA are a set of 17 composing operations and rules to build larger architectural components and complex system behaviors using the meta-processes, i.e.:
R = {→, …, |, |…|, R*, R+, Ri, …, ||, …, |||, », …}  (14)

denoting sequence, jump, branch, switch, while-loop, repeat-loop, for-loop, recursion, procedure call, parallel, concurrence, interleave, pipeline, interrupt, and the time-, event-, and interrupt-driven dispatches, respectively (see Table 2).
The definitions, syntaxes, and formal semantics of each of the meta-processes and process relations may be referred to in (Wang, 2002b; Wang, 2006f). A complex process and a program can be derived from the meta-processes by the set of algebraic process relations. Therefore, a program is a set of embedded relational processes, as described in Theorem 5. A summary of the meta-processes and their algebraic operations in RTPA is provided in Table 2.
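The embedded composition of Eq. 10 and the dispatch of Eq. 11 can be imitated in a few lines of Python, with statements as state-transforming functions and the sequential relation (→) as the only composing rule; this is a sketch of the idea, not an RTPA implementation.

```python
from functools import reduce
from typing import Callable

State = dict[str, int]
Stmt = Callable[[State], State]

def seq(p: Stmt, q: Stmt) -> Stmt:
    """The sequential process relation (->): run p, then q."""
    return lambda s: q(p(s))

def big_r(stmts: list[Stmt]) -> Stmt:
    """Big-R composition (Eq. 10): (...((p1 -> p2) -> p3) ... -> pn)."""
    return reduce(seq, stmts)

# A tiny process: x := 1; x := x + 1; y := 2 * x
prog = big_r([
    lambda s: {**s, "x": 1},
    lambda s: {**s, "x": s["x"] + 1},
    lambda s: {**s, "y": 2 * s["x"]},
])

# Event-driven dispatch (Eq. 11): run the process bound to an event @ek.
dispatch: dict[str, Stmt] = {"start": prog}
print(dispatch["start"]({}))   # {'x': 2, 'y': 4}
```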
System Algebra (SA)

Systems are the most complicated entities and phenomena in the physical, information, and social worlds across all science and engineering disciplines (Klir, 1992; Bertalanffy, 1952; Wang, 2006d). Systems are needed because the physical and/or cognitive power of an individual component or person is not enough to carry out a task or solve a problem. An abstract system is a collection of coherent and interactive entities that has stable functions and a clear boundary with the external environment. An abstract system forms the generic model of various real-world systems and represents their most common characteristics and properties.

Definition 19. System algebra (SA) is a new abstract mathematical structure that provides an algebraic treatment of abstract systems, as well as their relations and operational rules for forming complex systems (Wang, 2006d).

Abstract systems can be classified into two categories, known as closed and open systems. Most practical and useful systems in nature are open systems, in which there are interactions between the system and its environment. However, for ease of understanding, the closed system is introduced first.

Definition 20. A closed system S is a 4-tuple, i.e.:

S = (C, R, B, Ω)  (15)

where
• C is a nonempty set of components of the system, C = {c1, c2, …, cn};
• R is a nonempty set of relations between pairs of the components in the system, R = {r1, r2, …, rm}, R ⊆ C × C;
• B is a set of behaviors (or functions), B = {b1, b2, …, bp};
• Ω is a set of constraints on the memberships of components, the conditions of relations, and the scopes of behaviors, Ω = {ω1, ω2, …, ωq}.
Most practical systems in the real world are not closed. That is, they need to interact with the external world, known as the environment Θ, in order to exchange energy, matter, and/or information. Such systems are called open systems. Typical interactions between an open system and the environment are inputs and outputs.

Definition 21. An open system S is a 7-tuple, i.e.:

S = (C, R, B, Ω, Θ) = (C, Rc, Ri, Ro, B, Ω, Θ)  (16)

where the extensions of entities beyond the closed system are as follows:
• Θ is the environment of S, with a nonempty set of components CΘ outside C;
• Rc ⊆ C × C is a set of internal relations;
• Ri ⊆ CΘ × C is a set of external input relations;
• Ro ⊆ C × CΘ is a set of external output relations.

An open system S = (C, Rc, Ri, Ro, B, Ω, Θ) is illustrated in Figure 7 (Wang, 2006d).

Theorem 6. The equivalence between open and closed systems states that an open system S is equivalent to a closed system Ŝ, or vice versa, when its environment ΘS or ΘŜ is conjoined, respectively, i.e.:

Ŝ = S ⊎ ΘS
S = Ŝ ⊎ ΘŜ  (17)
According to Theorem 6, any subsystem Ŝk of a closed system Ŝ is an open system S. Conversely, any supersystem Ŝ of a given set of n open systems Sk, plus their environments Θk, 1 ≤ k ≤ n, is a closed system. The algebraic relations and operations of systems in SA are summarized in Table 2.
Theorem 7. Wang's first law of system science, system fusion, states that system conjunction or composition between two systems S1 and S2 creates new relations ∆R12 and/or new behaviors (functions) ∆B12 that are solely a property of the new supersystem S, determined by the sizes of the two intersected component sets #(C1) and #(C2), i.e.:

∆R12 = #(R) − (#(R1) + #(R2))
     = (#(C1) + #(C2))² − ((#(C1))² + (#(C2))²)
     = 2 (#(C1) • #(C2))  (18)
The discovery in Theorem 7 reveals that the mathematical explanation of system utilities is the newly gained relations ∆R12 and/or behaviors (functions) ∆B12 during the conjunction of two systems or subsystems. The empirical awareness of this key system property has been intuitively or qualitatively observed for centuries. However, Theorem 7 is the first rigorous explanation of the mechanism of system gains during system conjunctions and compositions. According to Theorem 7, the maximum incremental or system gain equals the number of bi-directional interconnections between all components in both S1 and S2, i.e., 2(#(C1) • #(C2)).

Figure 7. The abstract model of an open system: components C1 and C2 with behaviors B1, B2 and constraints Ω1, Ω2, interlinked by internal relations Rc and connected to the environment Θ through input relations Ri1, Ri2 and output relations Ro1, Ro2.
Table 2. Taxonomy of contemporary mathematics for knowledge representation and manipulation

Relational operations defined in concept algebra and/or system algebra: super/sub relation, related/independent, equivalent (=), consistent (≅), overlapped (Π), conjunction, elicitation, comparison (~), definition, difference, inheritance (⇒), extension, tailoring, substitute, composition, decomposition, aggregation/generalization, specification, and instantiation.

Meta-processes of RTPA: assignment (:=), evaluation, addressing (⇒), memory allocation (⇐), memory release, read, write, input (|), output (|), timing (@), duration, increase (↑), decrease (↓), exception detection (!), skip, stop, and system (§).

Relational operations of RTPA: sequence (→), jump, branch (|), switch (|…|), while-loop (R*), repeat-loop (R+), for-loop (Ri), recursion, procedure call, parallel (||), concurrence, interleave (|||), pipeline (»), interrupt, time-driven dispatch, event-driven dispatch, and interrupt-driven dispatch.
Theorem 8. Wang's second law of system science, maximum system gain, states that the work done by a system is always larger than that of any of its components, but less than or equal to the sum of the work of its components, i.e.:

W(S) ≤ Σ_{i=1}^{n} W(ei), η ≤ 1
W(S) > max(W(ei)), ei ∈ ES  (19)

There was a myth about an ideal system in conventional systems theory, which supposes that the work done by an ideal system W(S) may be greater than the sum of the work of all its components, i.e., W(S) ≥ Σ_{i=1}^{n} W(ei). According to Theorems 7 and 8, such ideal system utility is impossible to achieve. A summary of the algebraic operations and their notations in CA, RTPA, and SA is provided in Table 2. Details may be referred to in (Wang, 2006d; Wang, 2006g).
Applications of CI

Sections 2 and 3 have reviewed the latest development of fundamental research in CI, particularly its theoretical framework and descriptive mathematics. A wide range of applications of CI has been identified in multidisciplinary and transdisciplinary areas, such as: (1) the architecture of future generation computers; (2) estimation of the capacity of human memory; (3) autonomic computing; (4) cognitive properties of information, data, knowledge, and skills in knowledge engineering; (5) simulation of human cognitive behaviors using descriptive mathematics; (6) agent systems; (7) CI foundations of software engineering; (8) deductive semantics of software; and (9) cognitive complexity of software.
The Architecture of Future Generation Computers

Conventional machines were invented to extend human physical capability, while modern information-processing machines, such as computers, communication networks, and robots, are developed to extend human intelligence, memory, and capacity for information processing (Wang, 2004). Recent advances in CI provide formal descriptions of an entire set of cognitive processes of the brain (Wang et al., 2006). The fundamental research in CI has also created an enriched set of contemporary denotational mathematics (Wang, 2006c) for dealing with the extremely complicated objects and problems in natural intelligence, neural informatics, and knowledge manipulation. The theory and philosophy behind the next generation computers and computing methodologies are CI (Wang, 2003b; Wang, 2004). It is commonly believed that the future-generation computers, known as cognitive computers, will adopt non-von Neumann (Neumann, 1946) architectures. The key requirements for implementing a conventional stored-program controlled computer are the generalization of common computing architectures and the ability of the computer to interpret the data loaded in memory as computing instructions. These are the essences of stored-program controlled computers known as the von Neumann architecture (Neumann, 1946). Von Neumann elicited five fundamental and essential components to implement general-purpose programmable digital computers in order to embody the concept of stored-program-controlled computers.

Definition 22. A von Neumann Architecture (VNA) of computers is a 5-tuple that consists of the components: (a) the arithmetic-logic unit (ALU), (b) the control unit (CU) with a program counter (PC), (c) a memory (M), (d) a set of input/output (I/O) devices, and (e) a bus (B) that provides the data path between these components, i.e.:

VNA ≜ (ALU, CU, M, I/O, B)  (20)
Definition 23. Conventional computers with VNA are aimed at stored-program-controlled data processing based on mathematical logic and Boolean algebra.

A VNA computer is centered on the bus and characterized by its all-purpose memory for both data and instructions. A VNA machine is an extended Turing machine (TM), where the power and functionality of all components of the TM, including the control unit (with wired instructions), the tape (memory), and the head of I/O, are greatly enhanced and extended with more powerful instructions and I/O capacity.

Definition 24. A Wang Architecture (WA) of computers, known as the Cognitive Machine as shown in Figure 8, is a parallel structure encompassing an Inference Engine (IE) and a Perception Engine (PE) (Wang, 2006b; Wang, 2006g), i.e.:

WA ≜ (IE || PE)
   = ( KMU    // The knowledge manipulation unit
    || BMU    // The behavior manipulation unit
    || EMU    // The experience manipulation unit
    || SMU    // The skill manipulation unit
     )
  || ( BPU    // The behavior perception unit
    || EPU    // The experience perception unit
     )  (21)
As shown in Figure 8 and Eq. 21, WA computers are not centered by a CPU for data manipulation, as VNA computers are. Instead, WA computers are centered by the concurrent IE and PE for cognitive learning and autonomic perception, based on abstract concept inferences and empirical stimuli perception. The IE is designed for concept/knowledge manipulation according to concept algebra (Wang, 2006e), particularly the nine concept operations for knowledge acquisition, creation, and manipulation. The PE is designed for feeling and perception processing according to RTPA (Wang, 2002b) and the formally described cognitive process models of the perception layers as defined in the LRMB model (Wang et al., 2006).

Definition 25. Cognitive computers with WA are aimed at cognitive and perceptive concept/knowledge processing based on contemporary denotational mathematics, i.e., Concept Algebra (CA), Real-Time Process Algebra (RTPA), and System Algebra (SA).

Just as mathematical logic and Boolean algebra are the mathematical foundations of VNA computers, the mathematical foundations of WA computers are denotational mathematics (Wang, 2006b; Wang, 2006c). As described in the LRMB reference model (Wang et al., 2006), all 37 fundamental cognitive processes of the human brain can be formally described in CA and RTPA (Wang, 2002b; Wang, 2006e). In other words, they are simulatable and executable by WA-based cognitive computers.
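As a structural sketch only, Eq. 21 can be transcribed into a nested Python mapping; the unit names come from the text, while the representation itself is an assumption for illustration.

```python
# CM = IE || PE (Eq. 21): the two parallel engines and their units.
COGNITIVE_MACHINE = {
    "IE": {  # Inference Engine: concept/knowledge manipulation
        "KMU": "knowledge manipulation unit",
        "BMU": "behavior manipulation unit",
        "EMU": "experience manipulation unit",
        "SMU": "skill manipulation unit",
    },
    "PE": {  # Perception Engine: feeling and perception processing
        "BPU": "behavior perception unit",
        "EPU": "experience perception unit",
    },
}
```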
Estimation of the Capacity of Human Memory

Despite the fact that the number of neurons in the brain has been identified in cognitive and neural sciences, the magnitude of human memory capacity is still unknown. According to the OAR model, a recent discovery in CI is that the upper bound of the memory capacity of the human brain is in the order of 10^8,432 bits (Wang et al., 2003). The determination of the magnitude of human memory capacity is not only theoretically significant in CI, but also practically useful for unveiling human potential, as well as the gaps between natural and machine intelligence. This result indicates that next generation computer memory systems may be built according to the OAR model rather than the traditional container metaphor, because the former is more powerful, flexible, and efficient, and can generate a tremendous memory capacity using a limited number of neurons in the brain or hardware cells in next generation computers.
Figure 8. The architecture of a cognitive machine, CM = IE || PE: the Inference Engine (IE) comprises the knowledge manipulation unit (KMU), behavior manipulation unit (BMU), experience manipulation unit (EMU), and skill manipulation unit (SMU), operating over LTM and ABM for knowledge, behaviors, experience, and skills; the Perception Engine (PE) comprises the behavior perception unit (BPU) and experience perception unit (EPU), operating over SBM, ABM, and LTM in response to stimuli, enquiries, and interactions.
Autonomic Computing

The approaches to implementing intelligent systems can be classified into biological organisms, silicon automata, and computing systems. Based on CI studies, autonomic computing (Wang, 2004) is proposed as a new and advanced computing technique built upon routine, algorithmic, and adaptive systems, as shown in Table 3. The approaches to computing can be classified into two categories, known as imperative and autonomic computing. Correspondingly, computing systems may be implemented as imperative or autonomic computing systems.

Definition 26. An imperative computing system is a passive system that implements deterministic, context-free, and stored-program controlled behaviors, where a behavior is defined as a set of observable actions of a given computing system.

Definition 27. An autonomic computing system is an intelligent system that autonomously carries out robotic and interactive actions based on goal- and event-driven mechanisms.

The first three categories of computing techniques shown in Table 3 are imperative. In contrast, an autonomic computing system is an active system that implements nondeterministic, context-dependent, and adaptive behaviors. Autonomic computing does not rely on instructive and procedural information; rather, its behaviors depend on perceptions and inferences based on internal goals, statuses, and willingness formed by long-term historical events and current rational or emotional goals, as revealed in CI.
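The contrast between imperative and autonomic behavior generation can be suggested by a minimal goal- and event-driven loop in Python; the goal, sensor stub, and action are hypothetical.

```python
# A goal- and event-driven sketch of Definition 27 (illustrative only).
goals = {"temperature": 20}          # internal goal state

def perceive() -> dict[str, int]:
    return {"temperature": 17}       # stub sensor reading

def act(state: dict[str, int]) -> None:
    # Context-dependent action selection driven by goals, not by a
    # fixed stored program as in an imperative system (Definition 26).
    for key, target in goals.items():
        if state.get(key, target) < target:
            print(f"raise {key} toward {target}")

for event in ["tick"]:               # external, event-driven stimuli
    act(perceive())                  # prints: raise temperature toward 20
```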
Cognitive Properties of Knowledge

Almost all modern disciplines of science and engineering deal with information and knowledge. According to CI theories, cognitive information may be classified into four categories, known as knowledge, behaviors, experience, and skills, as shown in Table 4.

Definition 28. The taxonomy of cognitive information is determined by its types of inputs and outputs to and from the brain during learning and information processing, where both inputs and outputs can be either abstract information (concepts) or empirical information (actions).

It is noteworthy that the approaches to acquiring knowledge/behaviors and experience/skills are fundamentally different. The former may be obtained either directly through hands-on activities or indirectly by reading, while the latter can never be acquired indirectly.
Table 3. Classification of computing systems

                     Behavior (O)
  Event (I)          Constant          Variable
  Constant           Routine           Adaptive
  Variable           Algorithmic       Autonomic
  Type of behavior   Deterministic     Nondeterministic
According to Table 4, the following important conclusions on information manipulation and learning for both human and machine systems can be derived. Theorem 9. The principle of information acquisition states that there are four sufficient categories of learning known as those of knowledge, behaviors, experience, and skills. Theorem 9 indicates that learning theories and their implementation in autonomic and intelligent systems should study all four categories of cognitive information acquisitions, particularly behaviors, experience, and skills rather than only focusing on knowledge. Corollary 3. All the four categories of information can be acquired directly by an individual. Corollary 4. Knowledge and behaviors can be learnt indirectly by inputting abstract information; while experience and skills must be learnt directly by hands-on or empirical actions. The above theory of CI lays an important foundation for learning theories and pedagogy (Wang, 2004; Wang, 2006e). Based on the fundamental work, the IE and PE of cognitive computers working as a virtual brain can be implemented on WA-based cognitive computers and be simulated on VNA-based conventional computers.
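The 2 × 2 taxonomy of Definition 28 and the acquisition constraint of Corollary 4 reduce to a small lookup, sketched below with assumed labels.

```python
# (input type, output type) -> category of cognitive information (Table 4).
TAXONOMY = {
    ("abstract", "abstract"):   "knowledge",
    ("abstract", "empirical"):  "behavior",
    ("empirical", "abstract"):  "experience",
    ("empirical", "empirical"): "skill",
}

def acquisition(category: str) -> str:
    """Corollary 4: knowledge/behaviors may also be acquired indirectly;
    experience/skills can only be acquired directly (hands-on)."""
    return ("direct or indirect"
            if category in {"knowledge", "behavior"} else "direct only")

cat = TAXONOMY[("empirical", "empirical")]
print(cat, "-", acquisition(cat))   # skill - direct only
```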
Simulation of Human Cognitive Behaviors Using Contemporary Mathematics

The contemporary denotational mathematics described in Section 3, particularly CA and RTPA, may be used to simulate the cognitive processes of the brain as modeled in LRMB (Wang et al., 2006). Most of the 37 cognitive processes identified in LRMB, such as the learning (Wang, 2006e), reasoning (Wang, 2006b), decision making (Wang et al., 2004), and comprehension (Wang and Gafurov, 2003) processes, have been rigorously modeled and described in RTPA and CA. Based on this fundamental work, the inference engine and perception engine of a virtual brain can be implemented on cognitive computers or simulated on conventional computers. In the former case, a working prototype of a fully autonomic computer will be realized on the basis of CI theories.
Agent Systems

Definition 29. A software agent is an intelligent software system that autonomously carries out robotic and interactive applications based on goal-driven mechanisms (Wang, 2003c).

Because a software agent may be perceived as an application-specific virtual brain (see Theorem 3), behaviors of an agent mirror human behaviors. The fundamental characteristics of agent-based systems are autonomic computing, goal-driven action generation, and knowledge-based machine learning. In recent CI research, perceptivity is recognized as the sixth sense that serves the brain as the thinking engine and the kernel of natural intelligence. Perceptivity implements self-consciousness inside the abstract memories of the brain. Almost all cognitive life functions rely on perceptivity, such as consciousness, memory searching, motivation, willingness, goal setting, emotion, sense of spatiality, and sense of motion. The brain may be stimulated by external and internal information, which can be classified as willingness-driven (internal events such as goals, motivation, and emotions), event-driven (external events), and time-driven (mainly external events triggered by an external clock). Unlike a computer, the brain works in two approaches: the internal willingness-driven processes, and the external event- and time-driven processes. The external information and events are the major sources that drive the brain, particularly for conscious life functions.

Recent research in CI reveals that the foundations of agent technologies and autonomic computing are CI, particularly goal-driven action-generation techniques (Wang, 2003c). The LRMB model (Wang et al., 2006) described in Section 2.2 may be used as a reference model for agent-based technologies. This is a fundamental view toward the formal description and modeling of architectures and behaviors of agent systems, which are created to do something repeatable in context, to extend human capability, reachability, and/or memory capacity. It is found that both human and software behaviors can be described by a three-dimensional representative model comprising action, time, and space. For agent system behaviors, the three dimensions are known as mathematical operations, event/process timing, and memory manipulation (Wang, 2006g). The 3-D behavioral space of agents can be formally described by RTPA, which serves as an expressive mathematical means for describing thoughts and notions of dynamic system behaviors as a series of actions and cognitive processes.

Table 4. Types of cognitive information

  Type of Input      Output: Abstract Concept   Output: Empirical Action   Ways of Acquisition
  Abstract Concept   Knowledge                  Behavior                   Direct or indirect
  Empirical Action   Experience                 Skill                      Direct only
CI Foundations of Software Engineering

Software is an intellectual artifact and a kind of instructive information that provides a solution for a repeatable computer application, enabling existing tasks to be done more easily, faster, and smarter, or providing innovative applications for industry and daily life. Large-scale software systems are highly complicated systems without precedent in what mankind has handled or experienced before. The fundamental cognitive characteristics of software engineering have been identified as follows (Wang, 2006g):

• The inherent complexity and diversity
• The difficulty of establishing and stabilizing requirements
• The changeability or malleability of system behavior
• The abstraction and intangibility of software products
• The requirement of varying problem domain knowledge
• The non-determinism and poly-solvability in design
• The polyglotics and polymorphism in implementation
• The dependability of interactions among software, hardware, and human beings
The above list forms a set of fundamental constraints for software engineering, identified as the cognitive constraints of intangibility, complexity, indeterminacy, diversity, polymorphism, inexpressiveness, inexplicit embodiment, and unquantifiable quality measures (Wang, 2006g). A set of psychological requirements for software engineers has also been identified: (a) abstract-level thinking; (b) imagination of dynamic behaviors from static descriptions; (c) organizational capability; (d) a cooperative attitude in teamwork; (e) long-period focus of attention; (f) preciseness; (g) reliability; and (h) expressive capability in communication.
Deductive Semantics of Software

Deduction is a reasoning process that discovers new knowledge or derives a specific conclusion based on generic premises such as abstract rules or principles. In order to provide an algebraic treatment of the semantics of programs and human cognitive processes, a new type of formal semantics known as deductive semantics has been developed (Wang, 2006f/g).

Definition 30. Deductive semantics is a formal semantics that deduces the semantics of a program from a generic abstract semantic function to concrete semantics, which are embodied in the changes of status of a finite set of variables constituting the semantic environment of computing (Wang, 2006g).
Theorem 10. The semantics of a statement p, θ(p), on a given semantic environment Θ in deductive semantics is a double partial differential of the semantic function, $f_\theta(p) = f_p : T \times S \to V = v_p(t, s), \; t \in T \wedge s \in S \wedge v_p \in V$, on the sets of variables S and executing steps T, i.e.:

$$
\theta(p) = \frac{\partial^2 f_{\theta}(p)}{\partial t\, \partial s}
= \frac{\partial^2 v_p(t,s)}{\partial t\, \partial s}
= \mathop{R}_{i=0}^{\#T(p)} \; \mathop{R}_{j=1}^{\#S(p)} v_p(t_i, s_j)
= \mathop{R}_{i=0}^{1} \; \mathop{R}_{j=1}^{\#\{s_1, s_2, \ldots, s_m\}} v_p(t_i, s_j)
= \begin{pmatrix}
 & s_1 & s_2 & \cdots & s_m \\
t_0 & v_{01} & v_{02} & \cdots & v_{0m} \\
(t_0, t_1] & v_{11} & v_{12} & \cdots & v_{1m}
\end{pmatrix}
\tag{22}
$$
where t denotes the discrete time immediately before and after the execution of p during (t0, t1), and # is the cardinal calculus that counts the number of elements in a given set, i.e., n = #T(p) and m = #S(p). The first partial differential in Eq. 22 selects all the related variables S(p) of the statement p from Θ. The second partial differential selects a set of discrete steps of p's execution T(p) from Θ. According to Theorem 10, the semantics of a statement can be reduced onto a semantic function that results in a 2-D matrix with the changes of values for all variables over time along program execution. Deductive semantics perceives that the carriers of software semantics are a finite set of variables declared in a given program. Therefore, software semantics can be reduced onto the changes of values of these variables. The deductive mathematical models of semantics and the semantic environment at various composing levels of systems are formally described. Properties of software semantics and relationships between the software behavioral space and the semantic environment are discussed. Deductive semantics is applied in the formal definitions and explanations of the semantic rules of a comprehensive set of software static and dynamic behaviors as modeled in RTPA. Deductive semantics can be used to define abstract and concrete semantics of software and cognitive systems, and to facilitate software comprehension and recognition by semantic analyses.
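To make the deduction tangible, the following minimal Python sketch tabulates the semantic matrix of Eq. 22 for a hypothetical two-variable statement; the statement, variable names, and values are illustrative only and are not part of RTPA.

```python
# A minimal sketch of the deductive-semantics matrix of Eq. 22 for a
# hypothetical statement p that swaps x and y. Rows are the discrete
# execution steps T(p) = {t0, (t0, t1]}; columns are the related
# variables S(p); each cell holds v_p(t_i, s_j).

def semantic_matrix(env_before: dict, env_after: dict):
    """Return the 2-D matrix of variable values before/after executing p."""
    variables = sorted(env_before)          # S(p), the related variables
    return {
        "S(p)":    variables,
        "t0":      [env_before[s] for s in variables],  # values before p
        "(t0,t1]": [env_after[s] for s in variables],   # values after p
    }

# Example: p swaps x and y.
before = {"x": 1, "y": 2}
after = {"x": 2, "y": 1}
m = semantic_matrix(before, after)
print(m["S(p)"])      # ['x', 'y']
print(m["t0"])        # [1, 2]
print(m["(t0,t1]"])   # [2, 1]
```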
Cognitive Complexity of Software

The estimation and measurement of the functional complexity of software is a long-standing problem in software engineering. The cognitive complexity of software (Wang, 2006j) is a new measurement for cross-platform analysis of the complexities, sizes, and comprehension effort of software specifications and implementations in the design, implementation, and maintenance phases of software engineering. This work reveals that the cognitive complexity of software is a product of its architectural and operational complexities, on the basis of deductive semantics and the abstract system theory. Ten fundamental basic control structures (BCS's) are elicited from software architectural/behavioral specifications and descriptions. The cognitive weights of these BCS's are derived and calibrated via a series of psychological experiments. Based on this work, the cognitive complexity of software systems can be rigorously and accurately measured and analyzed. Comparative case studies demonstrate that the cognitive complexity is highly distinguishable in software functional complexity and size measurement in software engineering. On the basis of the ERM model described in Theorem 5 and the deductive semantics of software presented in Section 4.8, the finding on the cognitive complexity of software is obtained as follows.
Theorem 11. The sum of the cognitive weights of all rij, w(rij), in the ERM model determines the operational complexity of a software system Cop, i.e.:
$$
C_{op} = \sum_{i=1}^{n-1} w(r_{ij}), \quad j = i + 1
\tag{23}
$$
A set of psychological experiments has been carried out in undergraduate and graduate classes in software engineering. Based on 126 experiment results, the equivalent cognitive weights of the ten fundamental BCS's have been statistically calibrated as summarized in Table 5 (Wang, 2006j), where the relative cognitive weight of the sequential structure is assumed to be one, i.e., w1 = 1. According to deductive semantics, the complexity of a software system, or its semantic space, is determined not only by the number of operations, but also by the number of data objects.

Theorem 12. The cognitive complexity Cc(S) of a software system S is a product of the operational complexity Cop(S) and the architectural complexity Ca(S), i.e.:

$$
\begin{aligned}
C_c(S) &= C_{op}(S) \cdot C_a(S) \\
&= \left\{ \sum_{k=1}^{n_C} \sum_{i=1}^{\#(C_s(C_k))} w(k, i) \right\} \cdot \left\{ \sum_{k=1}^{n_{CLM}} \mathrm{OBJ}(CLM_k) + \sum_{k=1}^{n_C} \mathrm{OBJ}(C_k) \right\} \quad \text{[FO]}
\end{aligned}
\tag{24}
$$
Based on Theorem 12, the following corollary can be derived.

Corollary 5. The cognitive complexity of a software system is proportional to both its operational and structural complexities. That is, the more architectural data objects and the higher the operational complexity applied to these objects, the larger the cognitive complexity of the system.
Table 5. Calibrated cognitive weights of BCS's

  BCS   RTPA notation   Description     Calibrated cognitive weight
  1     →               Sequence        1
  2     |               Branch          3
  3     |…|…            Switch          4
  4     R^i             For-loop        7
  5     R^*             Repeat-loop     7
  6     R^*             While-loop      8
  7     —               Function call   7
  8     —               Recursion       11
  9     || or ∫∫        Parallel        15
  10    —               Interrupt       22
Table 6. Measurement of software system complexities

  System      Time complexity   Cyclomatic         Symbolic           Operational        Architectural     Cognitive
              (Ct (OP))         complexity         complexity         complexity         complexity        complexity
                                (Cm (-))           (Cs (LOC))         (Cop (F))          (Ca (O))          (Cc (FO))
  IBS (a)     ε                 1                  7                  13                 5                 65
  IBS (b)     O(n)              2                  8                  34                 5                 170
  MaxFinder   O(n)              2                  5                  115                7                 805
  SIS_Sort    O(m+n)            5                  8                  163                11                1,793
Based on Theorem 11, the cognitive complexities of four typical software components (Wang, 2006j) have been comparatively analyzed, as summarized in Table 6. To enable comparative analyses, data based on existing complexity measures, such as time, cyclomatic, and symbolic (LOC) complexities, are also contrasted in Table 6. Observing Table 6, it can be seen that the first three traditional measurements cannot actually reflect the real complexity of software systems in software design, representation, cognition, comprehension, and maintenance. It is found that: (a) Although the four example systems have similar symbolic complexities, their operational and functional complexities differ greatly. This indicates that the symbolic complexity cannot be used to represent the operational or functional complexity of software systems. (b) The symbolic complexity (LOC) does not represent the throughput or the input size of problems. (c) The time complexity does not work well for a system in which there are no loops or dominant operations, because, in this measure, all statements in linear structures are in theory treated as zero no matter how long they are. In addition, time complexity cannot distinguish the real complexities of systems with the same asymptotic function, such as Case 2 (IBS (b)) and Case 3 (MaxFinder). (d) The cognitive complexity is an ideal measure of software functional complexities and sizes, because it represents the real semantic complexity by integrating both the operational and architectural complexities in a coherent measure. For example, the difference between IBS (a) and IBS (b) can be successfully captured by the cognitive complexity, whereas the symbolic and cyclomatic complexities cannot identify the functional differences very well.
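The following sketch illustrates the product form of Eq. 24 using the calibrated weights of Table 5, with Cop simplified to the sum of BCS weights in a single component and Ca to a count of data objects; the example component and its object count are hypothetical.

```python
# A sketch of the cognitive-complexity measure, using the calibrated BCS
# weights of Table 5. C_op is simplified to the sum of BCS weights in one
# component, and C_a to a count of data objects; a real measurement would
# elicit both from the system's specification.

BCS_WEIGHTS = {
    "sequence": 1, "branch": 3, "switch": 4, "for_loop": 7, "repeat_loop": 7,
    "while_loop": 8, "call": 7, "recursion": 11, "parallel": 15, "interrupt": 22,
}

def operational_complexity(bcs_list):
    """C_op [F]: sum of the cognitive weights of the component's BCS's."""
    return sum(BCS_WEIGHTS[b] for b in bcs_list)

def cognitive_complexity(bcs_list, n_data_objects):
    """C_c = C_op * C_a [FO] (Eq. 24), with C_a taken as the object count."""
    return operational_complexity(bcs_list) * n_data_objects

# A toy MaxFinder-like component: a sequence, a for-loop, and a branch.
bcs = ["sequence", "for_loop", "branch"]
print(operational_complexity(bcs))    # 1 + 7 + 3 = 11 [F]
print(cognitive_complexity(bcs, 7))   # 11 * 7 = 77 [FO]
```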
Conclusion

This chapter has presented an intensive survey of the recent advances and groundbreaking studies in cognitive informatics (CI), particularly its theoretical framework, denotational mathematics, and main application areas. CI has been described as a new discipline that studies the natural intelligence and internal information processing mechanisms of the brain, as well as the processes involved in perception and cognition. CI is a new frontier that has emerged in recent years across the disciplines of computing, software engineering, cognitive sciences, neuropsychology, brain sciences, and philosophy. It has been recognized that many fundamental issues in knowledge and software engineering are based on a deeper understanding of the mechanisms of human information processing and cognitive processes. A coherent set of theories for CI has been described in this chapter, such as the Information-Matter-Energy (IME) model, the Layered Reference Model of the Brain (LRMB), the OAR model of information representation, Natural Intelligence (NI) vs. Artificial Intelligence (AI), Autonomic Computing (AC) vs. imperative computing, CI laws of software, mechanisms of human perception processes, the cognitive processes of formal inferences, and the formal knowledge system. Three contemporary mathematical means, collectively known as denotational mathematics, have been created in CI. Within these new forms of denotational mathematical means for CI, Concept Algebra (CA) has been designed to deal with the new abstract mathematical structure of concepts and their representation and manipulation in learning and knowledge engineering. Real-Time Process Algebra (RTPA) has been
developed as an expressive, easy-to-comprehend, and language-independent notation system, and a specification and refinement method, for describing and specifying software system behaviors. System Algebra (SA) has been created for the rigorous treatment of abstract systems and their algebraic relations and operations. A wide range of applications of CI has been identified in multidisciplinary and transdisciplinary areas, such as the architecture of future-generation computers, estimation of the capacity of human memory, autonomic computing, cognitive properties of information, data, knowledge, and skills in knowledge engineering, simulation of human cognitive behaviors using descriptive mathematics, agent systems, CI foundations of software engineering, deductive semantics of software, and the cognitive complexity of software systems.
Acknowledgment

The author would like to acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC) for its support of this work. The author would also like to thank the anonymous reviewers for their valuable comments and suggestions.
References

Bell, D. A. (1953). Information theory. London: Pitman.

Ganter, B., & Wille, R. (1999). Formal concept analysis (pp. 1-5). Springer.

Hoare, C. A. R. (1985). Communicating sequential processes. Prentice-Hall Inc.

Jordan, D. W., & Smith, P. (1997). Mathematical techniques: An introduction for the engineering, physical, and mathematical sciences (2nd ed.). Oxford, UK: Oxford University Press.

Klir, G. J. (1992). Facets of systems science. New York: Plenum.

Milner, R. (1989). Communication and concurrency. Englewood Cliffs, NJ: Prentice-Hall.

Quillian, M. R. (1968). Semantic memory. In M. Minsky (Ed.), Semantic information processing. Cambridge, MA: Cambridge Press.

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423, 623-656.

von Bertalanffy, L. (1952). Problems of life: An evolution of modern biological and scientific thought. London: C. A. Watts.

von Neumann, J. (1946). The principles of large-scale computing machines. Reprinted in Annals of the History of Computers, 3(3), 263-273.

Wang, Y. (2002, August). Cognitive informatics. Keynote speech, Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI'02) (pp. 34-42). Calgary, Canada: IEEE CS Press.

Wang, Y. (2002). The real-time process algebra (RTPA). The International Journal of Annals of Software Engineering, 14, 235-274.

Wang, Y., Johnston, R., & Smith, M. (Eds.) (2002, August). Cognitive informatics: Proceedings of the 1st IEEE International Conference (ICCI'02). Calgary, AB, Canada: IEEE CS Press.

Wang, Y. (2003). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.
Wang, Y. (2003). On cognitive informatics. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 151-167.

Wang, Y. (2003, August). Cognitive informatics models of software agent systems and autonomic computing. Keynote speech, Proceedings of the International Conference on Agent-Based Technologies and Systems (ATS'03) (p. 25). Calgary, Canada: Univ. of Calgary Press.

Wang, Y. (2003). Using process algebra to describe human and software system behaviors. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 199-213.

Wang, Y., Liu, D., & Wang, Y. (2003). Discovering the capacity of human memory. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 189-198.

Wang, Y., & Gafurov, D. (2003, August). The cognitive process of comprehension. Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (ICCI'03) (pp. 93-97). London, UK: IEEE CS Press.

Wang, Y. (2004, August). Autonomic computing and cognitive processes. Keynote speech, Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI'04) (pp. 3-4). Victoria, Canada: IEEE CS Press.

Wang, Y., Dong, L., & Ruhe, G. (2004, July). Formal description of the cognitive process of decision making. Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI'04) (pp. 124-130). Victoria, Canada: IEEE CS Press.

Wang, Y. (2005, August). On the cognitive processes of human perceptions. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI'05) (pp. 203-211). Irvine, California: IEEE CS Press.

Wang, Y. (2005, May). On the mathematical laws of software. Proceedings of the 18th Canadian Conference on Electrical and Computer Engineering (CCECE'05) (pp. 1086-1089). Saskatoon, SK, Canada.

Wang, Y. (2005, August). The cognitive processes of abstraction and formal inferences. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI'05) (pp. 18-26). Irvine, California: IEEE CS Press.

Wang, Y. (2006, May). A unified mathematical model of programs. Proceedings of the 19th Canadian Conference on Electrical and Computer Engineering (CCECE'06) (pp. 2346-2349). Ottawa, ON, Canada.

Wang, Y. (2006, July). Cognitive informatics - Towards the future generation computers that think and feel. Keynote speech, Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 3-7). Beijing, China: IEEE CS Press.

Wang, Y. (2006, July). Cognitive informatics and contemporary mathematics for knowledge representation and manipulation. Invited plenary talk, Proceedings of the 1st International Conference on Rough Set and Knowledge Technology (RSKT'06) (pp. 69-78). Lecture Notes in Artificial Intelligence, LNAI 4062. Chongqing, China: Springer.

Wang, Y. (2006, July). On abstract systems and system algebra. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 332-343). Beijing, China: IEEE CS Press.

Wang, Y. (2006, July). On concept algebra and knowledge representation. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 320-331). Beijing, China: IEEE CS Press.

Wang, Y. (2006). On the informatics laws and deductive semantics of software. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 161-171.

Wang, Y. (2006, May). The OAR model for knowledge representation. Proceedings of the 19th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE'06) (pp. 1696-1699). Ottawa, Canada.
Wang, Y., & Kinsner, W. (2006, March). Recent advances in cognitive informatics. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 121-123.

Wang, Y., & Wang, Y. (2006, March). On cognitive informatics models of the brain. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 203-207.

Wang, Y. (2006, July). On the Big-R notation for describing iterative and recursive behaviors. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 132-140). Beijing, China: IEEE CS Press.

Wang, Y., Wang, Y., Patel, S., & Patel, D. (2006, March). A layered reference model of the brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 124-133.

Wang, Y. (2006, July). Cognitive complexity of software and its measurement. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 226-235). Beijing, China: IEEE CS Press.

Wang, Y. (2007). Software engineering foundations: A software science perspective. CRC Book Series in Software Engineering, Vol. II. Auerbach Publications, USA.
Chapter II
Is Entropy Suitable to Characterize Data and Signals for Cognitive Informatics?

Witold Kinsner
University of Manitoba, Canada
Abstract

This chapter provides a review of Shannon and other entropy measures in evaluating the quality of materials used in perception, cognition, and learning processes. Energy-based metrics are not suitable for cognition, as energy itself does not carry information. Instead, morphological (structural and contextual) metrics as well as entropy-based multiscale metrics should be considered in cognitive informatics. Appropriate data and signal transformation processes are defined and discussed in the perceptual framework, followed by various classes of information and entropies suitable for characterization of data, signals, and distortion. Other entropies are also described, including the Rényi generalized entropy spectrum, Kolmogorov complexity measure, Kolmogorov-Sinai entropy, and Prigogine entropy for evolutionary dynamical systems. Although such entropy-based measures are suitable for many signals, they are not sufficient for scale-invariant (fractal and multifractal) signals without corresponding complementary multiscale measures.
Introduction

This chapter is concerned with measuring the quality of various materials used in perception, cognition and evolutionary learning processes. The multimedia materials may include temporal signals such as sound, speech, music, biomedical and telemetry signals, as well as spatial signals such as still images, and spatio-temporal signals such as animation and video. A comprehensive review of the scope of multimedia storage and transmission is presented by Kinsner (2002). Most of such original materials are altered (compressed or enhanced) either to fit the available storage or bandwidth during their transmission, or to enhance perception of the materials. Since the signals may also be contaminated by noise during different stages of their processing and transmission, various denoising techniques must be used to minimize the noise, without affecting the signal itself (Kinsner, 2002). Different classes of coloured and fractal noise are described by Kinsner (1996). The multimedia compression
is often lossy in that the signals are altered with respect not only to their redundancy, but also to their cognitive relevancy. Since the signals are presented to humans, cognitive processes must be considered in the development of suitable quality metrics. This chapter describes a very fundamental class of metrics based on entropy, and identifies its usefulness and limitations in the area of cognitive informatics (CI) (Wang, 2002).
Issues in Compression and Coding

A simple source compression consists of taking an input stream of symbols S and mapping the stream into an output stream of codes G, so that G should be smaller than S. The effectiveness of the mapping depends on the selection of an appropriate model of the source. This two-step process is illustrated in Figure 1. Modelling of the source is intended to extract information from the source in order to guide the coder in the selection of proper codes. The models may be either given a priori (static) or constructed on-the-fly (dynamic, in adaptive compression) throughout the compression process. In data compression, the modeller may either consider the discrete probability mass function (pmf) of the source, or look for a structure (e.g., the pattern of edges and textures) in the source itself. In perceptual signal compression, the modeller may consider the perceptual framework (e.g., edges and textures in images and the corresponding masking in either the human visual system, HVS (Pennebaker & Mitchell, 1993), or the human psycho-acoustic system, PAS (Jayant, 1992)). It is in this modelling that CI ought to be used extensively. A simple data source coder minimizes the bit rate of the data by redundancy minimization based on Shannon first-order or higher-order entropies. Redundancy is a probabilistic measure (entropy) of the spread of probabilities of the occurrence of individual symbols in the source with respect to equal (uniform) symbol probabilities. If the probabilities of the source symbols are all equal, the source entropy becomes maximum, and there is no redundancy in the source alphabet, implying that a random (patternless) source cannot be compressed without a loss of information. The objective of lossless compression techniques is to remove as much redundancy from the source as possible. This approach cannot produce large source compression. The quality of an actual code is determined by the difference between the code entropy and the source entropy; if both are equal, then the code is called perfect in the information-theoretic sense. For example, Huffman and Shannon-Fano codes (e.g., Held, 1987, and Kinsner, 1991) are close to perfect in that sense. Clearly, no statistical code will be able to have entropy smaller than the source entropy. On the other hand, a perceptual source coder minimizes the bit rate of the input signal, while preserving its perceptual quality, as guided by two main factors: (i) information attributes derived from the structure in the given source (e.g., probabilities related to frequency of occurrence or densities, as well as edges and textures related to the singularities in the signal), and (ii) features derived from the perceptual framework (e.g., masking in the HVS and PAS). This corresponds to the removal of both redundancy and irrelevancy, as shown by the Schouten diagram in Figure 2. This orthogonal principle of both redundancy reduction and irrelevancy removal is usually difficult, as it does not correspond to the maximization of the signal-to-noise ratio, SNR (i.e., the minimization of the mean-squared error, MSE), and is central to the second generation of codecs.

Figure 1. Compression is modeling and coding.

For example, an edge of an object
in an image may not carry much energy, but may be critical in its shape recognition. Another example is a stop consonant in speech, which may be insignificant energetically and broadband spectrally, but may be critical in speech recognition. The major questions in data compression include: (i) how to model the source data (e.g., through statistical or dictionary models, transforms, prediction), (ii) how to measure the redundancy (e.g., through low- or high-order entropies, which deal with precise knowledge), and (iii) how to encode the source data (through fixed or variable-length codes). On the other hand, the major questions in signal compression include: (i) how to model a linear time-invariant (LTI) signal or a scale-invariant (SI) signal, as described in Sec. 2.1 (i.e., how to find transforms, patterns, prediction, scalar and vector quantization, and analysis/synthesis), (ii) how to measure irrelevancy, and (iii) how to encode the source signal (e.g., through fixed or variable-length codes) (Sayood, 2000). Measuring irrelevancy can be done through feature maps, perceptual entropy (Jayant, Johnson, & Safranek, 1993), and relative multifractal dimension measures (Dansereau & Kinsner, 2001; and Dansereau, Kinsner, & Cevher, 2002), as well as through other models of uncertainty. These include: (i) possibilistic models to deal with vague and imprecise, but coherent knowledge (Dubois & Prade, 1988), (ii) Dempster-Shafer belief theory to deal with inaccurate and uncertain information, (iii) rough sets to establish the granularity of the information available, (iv) fuzzy sets to deal with membership functions, and (v) fuzzy perceptual measures. Another major question relates to how the source and channel are treated. Figure 3 shows a combined encoding and decoding scheme. A source coder is often followed by a channel coder, which adds redundancy for error protection, and a modem, which maximizes the bit rate that can be supported in a given channel or storage medium without causing an unacceptable level of bit error probability. This is of particular importance in wireless communications, in which the channel may change appreciably not only during a single transaction but over a session. Ideally, the entire process of source coding, channel coding and modulation should be considered jointly to achieve the most resilient bit stream for transmission, as is often the case in modern joint source-channel coding. There may also be a considerable advantage to joint coding that includes joint text, image, video and sound coding. This chapter addresses source coding only.
Figure 2. Reduction of redundancy and irrelevancy
Figure 3. Joint source-channel-multimedia coding
Another problem is due to the characteristics of packet switched networks. Specifying the characteristics of traffic in multimedia environments is more difficult than in circuit switched systems in which a fixed bandwidth channel is held for the duration of a call, and only the incidence of calls and their durations are required. Packet switched systems carrying multimedia have variable-bit rates with bandwidth on demand. This calls for knowledge not only of the statistics of the sources, but also of the rules for assembling the packets in order to control the traffic. Such metrics must be based on multi-scale singularity measures because the signals have long-term dependence.
Taxonomy of Compression Methods

Multimedia compression can be classified into lossless and lossy approaches, based on the distinctive features of the materials, as described in the next section. The lossless approach includes five methods: (i) run-length encoding, (ii) statistical encoding, (iii) dictionary encoding, (iv) adaptive encoding, and (v) transform-based encoding. The lossy approach includes transform-based encoding and quantization encoding. A comprehensive taxonomy of the techniques, together with extensive reference material, is provided by Kinsner (2002); Sayood (2000); Kinsner (1998); and Kinsner (1991).
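As a minimal illustration of method (i), the following sketch implements a run-length encoder and decoder; the code format (symbol, run-length pairs) is a simplification for exposition, and a real codec would pack the codes more compactly.

```python
# A minimal run-length encoder/decoder, illustrating lossless method (i).
from itertools import groupby

def rle_encode(stream):
    """Map a stream of symbols to (symbol, run_length) codes."""
    return [(sym, len(list(run))) for sym, run in groupby(stream)]

def rle_decode(codes):
    """Expand (symbol, run_length) codes back into the original stream."""
    return "".join(sym * n for sym, n in codes)

s = "aaaabbbcca"
codes = rle_encode(s)
print(codes)                    # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
assert rle_decode(codes) == s   # lossless: decoding restores the source
```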
Models of Data, Signals and Complexity

Models of Data and Signals

The objective of source coding (compression) is a compact digital representation of the source information. Often, the receiver of data is a computer, while the receiver of signals is a human. The above definition of compression requires a distinction between data and signals. Digital data are defined as a collection (a bag) of arbitrary finite-state representations of source information, with no concept of temporal or spatial separation between the elements of the bag, and no concept of the origin or destination of the bag. (Notice that in the bag theory, elements of a bag may be equal, while elements of a set must be different.) Examples of data could include either an intercepted encrypted stream of bits (without a known beginning or end), or a financial file, or a computer program. As a consequence, if nothing is known about the nature of the source or destination, compression can only be done losslessly; i.e., without any loss of information, as measured through redundancy (entropy difference),
with the data modelled either statistically, or through a dictionary, or a transform such as prediction. The coder could then use either fixed or variable-length codes. A signal, on the other hand, is a function of independent variables such as time, distance, temperature, and pressure. The value of the function is called its amplitude, and the variation of the amplitude forms its waveform. The waveform can be either (i) unchanging (DC), (ii) periodic such as alternating (AC) or oscillating, (iii) aperiodic, (iv) chaotic, or (v) random (stochastic). The signals can be either (i) analog (continuous with infinite resolution), or (ii) discrete (sampled in time or space, but still with an infinite resolution), or (iii) digital (discrete and quantized to a specific resolution), or (iv) boxcar (continuous, piecewise constant with step displacements, as formed after a digital-to-analog converter). We are mostly concerned with digital signals in this chapter. The signals can be classified as linear time-invariant, LTI (additive invariance), or scale-invariant, SI (multiplicative invariance). The LTI system theory is based on the idea that periodic waveforms shifted by multiples of the period are the same (e.g., Oppenheim & Schafer, 1975; Oppenheim & Willsky, 1983; Oppenheim & Schafer, 1989; and Mitra, 1998). This also applies to stationary and cyclostationary signals in the sense that their statistics do not change (i.e., either the wide-sense stationarity, WSS, in which the first two moments must not change, or the strict-sense stationarity, SSS, in which none of the moments may change). Fourier (spectral) and wavelet (spectral and scale) transforms may be applied to such signals in order to extract appropriate features. On the other hand, scale-invariant (fractal) signals are fundamentally different from LTI signals (Wornell, 1996). Their short-scale and long-scale behaviours are similar (i.e., they have no characteristic scale). Such self-similar signals (i.e., signals with one scale for time and amplitude) or self-affine signals (with different scales for time and amplitude) must be processed differently, because well-separated samples in the signal may be strongly correlated. Unlike the LTI signals (whose Gaussian distributions have very short tails), the SI signals have power-law distributions with long tails. Their higher-order moments do not vanish. Consequently, detection, estimation, identification, feature extraction, and classification of fractal signals are all different from those of LTI signals. Most physical signals are not LTI. Examples of such signals include speech, audio, image, animation and video, telecommunications traffic signals, biomedical signals such as the electrocardiogram (ecg) and electromyogram (emg), sonar, radar, seismic waves, turbulent flow, resistance fluctuations, noise in electronic devices, frequency variations of atomic clocks, and time series such as stock market and employment data. They are often highly non-Gaussian and nonstationary, and in general have a complex and intractable (broadband) power spectrum. To emphasize this important point, Figure 4 shows the two classes, LTI and SI, of systems and signals. Many dynamical systems produce signals that are chaotic (deterministic, yet unpredictable in the long term; e.g., Kinsner, 1996; Peitgen, Jürgens, & Saupe, 1992; Sprott, 2003; Kantz & Schreiber, 1997; and Schroeder, 1991).
Since such a signal has more attributes than a self-affine signal, more information can be extracted from it if one can show that the measured signal is indeed chaotic. We must also remember that the common assumption that both LTI and SI signals originate from (and are processed by) systems that do not change in time and space can rarely be assured, because both artifacts (such as electronic and mechanical systems) and living organisms age and change with the environment.
Figure 4. LTI and SI systems and signals
An added complication in processing such signals is that the human receiver does not employ a mean-squared-error criterion to judge the quality of the reconstructed signal (Jayant, 1992). Instead, humans use a perceptual distortion criterion to measure source entropy. This leads to two approaches to source compression: lossless and lossy, with the latter involving characteristic (relevant) features related to the HVS and PAS. The relevancy is measured through feature maps and perceptual entropy (Jayant, Johnson, & Safranek, 1993). The signal is modelled through either transforms, patterns, or analysis/synthesis processes. As with data, the coder may use either fixed or variable-length codes.
The EMO and Other World Views

We have seen that simple redundant patterns can be removed from messages quite easily through many contextual (non-probabilistic) techniques such as run-length encoding (Sayood, 2000). More complicated patterns based on the spread of probabilities in the pmf of the source can lead to lossless techniques such as the Huffman and Shannon-Fano codes (Held, 1987). A transform-based technique such as JPEG produces higher compression ratios based on the concentration of energy in a few coefficients in the transform (discrete cosine) domain (Pennebaker & Mitchell, 1993). The consideration of the psycho-acoustic model in audio has resulted in MP3 (MPEG-1 Layer 3) compression (ISO/IEC 11172-3, 1993). On the other hand, perceptual and cognitive signal processing requires techniques based on features related to perception and cognition that go beyond the simple morphological or probabilistic patterns. To enhance perception and cognition, information and knowledge must be considered. Wang (2002) postulated an E-M-I model of the CI world view, where E, M, and I denote energy, matter, and information, respectively. The E and M components are located in the physical world, while the I component is placed in an abstract world, as shown in Figure 5. A similar IME world view was discussed by Stonier (1990, Ch. 3), with the major difference that the information (I) was considered by Stonier to be an integral part of the physical world. Still another approach to a CI world view is to develop an ontology for the structure in the knowledge base of an expert system (e.g., as described by Chan, 2002). We propose another CI world view in which organization (complexity, or pattern, or order, O) is an integral part of the physical world that also includes the E and M components, as shown in Figure 6. The argument for treating order as an integral part of the physical world is as follows. Order can be found in both M and E when the system is far from its thermodynamic equilibrium. In Newtonian physics, space and time were given once and for all, with perfect reversibility, and time was common to all observers. In relativity, space and time were no longer fixed, but "the distinction between the past, present and future was an illusion," according to Einstein. On the other hand, irreversibility, or Eddington's thermodynamic arrow of time (e.g., Mackey, 1992; and Hawkins, 1996), is fundamental in Boltzmann's thermodynamic evolution of an isolated (Hamiltonian) system, from order to disorder, towards its equilibrium at which entropy is maximum. Nonequilibrium is the source of order; it brings "order out of chaos" (Prigogine & Stengers, 1984, p. 287). Irreversibility is the source of order; it brings "order out of chaos" (Prigogine & Stengers, 1984, p. 292).
Figure 5. Wang’s I-M-E world view with matter (M), energy (E) and information (I)
Figure 6. The EMO world view that includes complexity with energy (E), matter (M) and order (O)
Far-from-equilibrium self-organization in open systems leads to their increased complexity. This also leads to the existential time arrow (duration) as introduced by Henri Bergson (1859-1941) (Bergson, 1960), which could also play an important role in CI. This complexity can be described in a number of distinct ways: by information, entropy, dimensionality spectra (Rényi), and singularity spectra (Hölder and Mandelbrot). Cognitive processes are also being linked to dynamical systems (e.g., Thelen & Smith, 2002; and Mainzer, 2004). In this view, information and the other measures are just descriptors of the fundamental natural entity, complexity. Figure 6 also illustrates the incompleteness of any view on reality. There are two objective worlds: the physical world and the abstract world. The third is the perceptual world, as formed by the intersection of the physical and abstract worlds. Within this world, order has always been seen by human observers, though time and matter were comprehended just centuries ago, while energy was comprehended even later, and only then was the relationship between E and M established. Today, much is known about the relation between all three elements (e.g., Prigogine & Stengers, 1984; Turcotte, 1997; Vicsek, 1992; Kadanoff, 1993; Alligood, Sauer, & Yorke, 1996; and Mainzer, 2004). The diagram also illustrates that a part of the physical world is not known yet (e.g., the dark matter and dark energy in the Universe), and that a part of the abstract world transcends the physical world.
Objective and Subjective Metrics

There are three basic classes of performance evaluation of compression algorithms and their implementations: (i) efficiency metrics (e.g., compression ratio, percentage, bit rates), (ii) complexity metrics (processing cost, memory size, and chip size), and (iii) delay metrics (to evaluate delays due to the processor used and networking). There are also three classes of metrics that relate to the quality of reconstruction: (i) difference distortion metrics (signal-to-noise ratio, SNR, and its variations), (ii) perceptual quality metrics (mean opinion score, MOS, segmented SNR), and (iii) recognizability metrics (relative and absolute). The first three classes are always required to evaluate the process of compressing the source and its transmission over a network. The other three classes relate to the evaluation of the fidelity of the reconstructed signal with respect to the human observer. Of course, lossless compression assures the quality of the reconstruction to less than one bit per pixel. On the other hand, lossy compression requires perceptual quality metrics to establish how accurate the reconstructed sound, image or video is to a human user. The recognizability metrics are concerned with the preservation of the intended message in the reconstructed signal, without any reference to the source, thus being an absolute subjective measure. In speech, this metric is called intelligibility. The confusion matrix is another recognizability metric. However, the test is non-binary in that, in addition to the correct utterance, other confusing utterances are also scored. These metrics are summarized by Kinsner (2002). Since many of these objective metrics are based on energy (e.g., MSE and peak SNR), and energy itself does not carry information, they do not agree with the subjective quality metrics. For example, whispering or shouting
of a speech utterance differs much in its energy, although the message itself is essentially unaltered. Formants of the utterance and their transitions in time carry much more information than their energy. Fricatives also convey more information than would be implied by their energy. Much effort is being directed towards perceptual coding of digital audio (Painter & Spanias, 1998) and digital image and video (e.g., Farrell & Van Den Branden Lambrecht, 2002; and Tekalp, 1998), with corresponding developments in multidimensional quality metrics. Our focus has been on multifractal complexity measures to determine both the local and global complexities of the signal, using the Rényi fractal dimension spectrum, the Mandelbrot singularity spectrum (Kinsner, 1994), and the generalized Kullback-Leibler distance (e.g., Kinsner & Dansereau, 2006; Dansereau, Kinsner, & Cevher, 2002; and Cover & Thomas, 1991).
Symbols, Alphabets, Messages, Probability and Information

Since the non-energy-based metrics are related to the concepts of information and entropy, the next three sections describe them critically in order to delineate their advantages and limitations from the perspective of CI. Information, regardless of its definition, will be considered in this chapter as a measure of complexity.
Symbols and Alphabets

A symbol σj is defined as a unique entity in a set. There is no limitation on the form that the symbol can take. For example, in a specific natural language, it could be a letter or a punctuation mark (e.g., a, A, α, ℵ, a Braille symbol, or a sign in the American Sign Language). In a specific number system, it could be a digit (e.g., unary {1}, binary {0, 1}, octal {0, 1, ..., 7}, hexadecimal {0, 1, ..., F}, Mayan nearly-vigesimal {•, —} corresponding to {1, 5}, or Babylonian base-60 with two symbols corresponding to {1, 10}). Other universal symbols (morphs) have been designed to form either an arbitrary font, or iconic languages (e.g., Chinese), or music notation, or chemical expressions. A symbol may also be a pixel (either binary, or gray scale, or colour). Another example of a symbol is the phoneme, defined as the elementary indecomposable sound in speech. A set of such unique symbols forms an alphabet. We shall consider several distinct alphabets relevant to compression. A source alphabet, Σ, is a set of symbols that the source uses to generate a message. It is denoted by

$$ \Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_N\} \tag{1} $$
where N is the cardinality (size) of Σ, which is denoted by

$$ N = |\Sigma| \tag{2} $$
It should be clear from the context of this chapter that this notation does not represent an absolute value. It should also be noticed that each symbol is independent of any other symbol in Σ. This independence of symbols could lead to a message whose symbols are arranged in either a random or correlated pattern, depending on the probability mass function discussed in the next section. For transmission and storage, each symbol σj must be encoded with other symbols from a coding alphabet, Γc, denoted by

$$ \Gamma_c = \{\gamma_{c1}, \gamma_{c2}, \ldots, \gamma_{cb}\} \tag{3} $$
where the cardinality b = | Γc | gives the base of the number system from which the digits γcj are drawn. This is also the base of the logarithm used in all the subsequent calculations. For example, the binary coding alphabet is Γc = {0, 1} with b = 2. The encoded symbols γj corresponding to the source symbol σj form the code alphabet, Γ, denoted by
$$ \Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_N\} \tag{4} $$
Its cardinality usually matches the cardinality of the source alphabet. There are also other alphabets and dictionaries used in the formation of compact messages, but they are outside the scope of this chapter.
Strings and Messages

A string sj is a collection of symbols σj (a bag, in the bag theory) that is larger than any individual symbol, but smaller than a message M. For example, the string "the" in English could be coded as a unit, and not as three separate symbols, thus resulting in a more compact representation of the string. A bag of all the symbols and strings forms a message M denoted by

$$ M \equiv M[\sigma_1, \sigma_2, \ldots, \sigma_M] \tag{5} $$
where M = | M | is the size of the message, and the symbol ≡ denotes equivalence. Notice that this vectorial notation [•] allows σi = σj for i ≠ j, while the set notation {•} would preclude equality of its elements.
Probability

A Priori Definition

The definition of probability used in this chapter is in the context of the formation of a message as defined by Ralph Hartley (1888-1970) (Hartley, 1928) and Claude Shannon (1916-2001) (Shannon, 1948). Let us consider a process of successive selection of symbols σj (according to some probability p(σj) ≡ pj for that symbol) from a given source alphabet Σ of size N to form a message M containing M symbols. In this scheme of generating the message, the probabilities pj for all the symbols must be given in advance. This collection of known symbol probabilities forms the a priori probability mass function (pmf), P, denoted by

$$ P \equiv P[p(\sigma_1), p(\sigma_2), \ldots, p(\sigma_N)] \tag{6} $$
Since the pmf is a bag, the vectorial notation [•] is used again. Notice that the name pmf implies a discrete distribution, and distinguishes it from a continuous probability density function (pdf). Also notice that the selection of a symbol can be called an event. Finally, notice that the symbols can be substituted with strings of symbols, sj. We must distinguish between two fundamentally different probability distributions in the pmf: uniform and nonuniform. The uniform distribution is selected if nothing is known about the symbols in the message to be formed. As we shall see, this will lead to the longest possible (worst-case) message. If the symbols in a message form associations and patterns, the distribution is nonuniform, thus leading to shorter messages. If the symbols are independent, then the two distributions are also called the independent-identically-distributed (iid) pmf and independent-nonuniformly-distributed (ind) pmf. We shall see that the iid pmf produces messages whose elements are uncorrelated (memoryless) and have the maximum entropy, while the ind pmf produces messages whose elements are still uncorrelated but shorter and with a lower entropy.
A Posteriori Definition

If the message M has been formed, transmitted and received, the pmf can be estimated directly from M. If the symbol σj occurs nj times in the message of size M = |M|, then the relative frequency of occurrence of this symbol is defined as

$$ f(\sigma_j) \triangleq \frac{n_j}{M} \quad \text{[dimensionless]} \tag{7} $$
where the symbol Δ above the equality sign denotes the relation by definition. With this definition, the following conditions are satisfied

$$ 0 \le f(\sigma_j) \le 1, \quad \forall \sigma_j \tag{8} $$
and

$$ \sum_{j=1}^{N} f(\sigma_j) = 1 \tag{9} $$
where N is the size of the alphabet. If the message is ergodic, then the frequency of occurrence f(σj) becomes the a posteriori probability p(σj) for a symbol σj

$$ p(\sigma_j) \equiv f(\sigma_j) \tag{10} $$
and their complete collection forms the a posteriori pmf.
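The a posteriori pmf can be estimated with a few lines of code; the following sketch counts relative frequencies (Eq. 7) in an illustrative message and checks the normalization of Eq. 9.

```python
# Estimating the a posteriori pmf of Eq. 10 by counting relative
# frequencies (Eq. 7) in a received message; the message is illustrative.
from collections import Counter

def estimate_pmf(message):
    """Return {symbol: n_j / M}, the a posteriori pmf of the message."""
    counts = Counter(message)   # n_j for each symbol
    M = len(message)            # message size |M|
    return {sym: n / M for sym, n in counts.items()}

pmf = estimate_pmf("abracadabra")
print(pmf)                                     # e.g., p('a') = 5/11
assert abs(sum(pmf.values()) - 1.0) < 1e-12    # Eq. 9
```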
Conditional and Joint Probabilities

The above symbol selection process assumes no dependence of one symbol on any other symbol in the message. This is true when there is no pattern in the message (random message). However, patterns may imply dependence between either individual symbols or even groups of symbols. This can be measured by a conditional probability that symbol σj occurs, given that symbol σi has occurred. This can be expressed as

$$ p(\sigma_j \mid \sigma_i) \triangleq \frac{p(\sigma_i \sigma_j)}{p(\sigma_i)} \tag{11} $$

where p(σiσj) is called the joint probability of a digram σiσj (i.e., the probability that both σi and σj occur). The scaling by p(σi) assures that the conditional probability of the sample space equals 1 again. This concept of digrams can be expanded to k-grams if the dependence (memory) exists between k symbols. When the symbols are independent, then the joint probability is the product of the probabilities of the individual symbols

$$ p(\sigma_i, \sigma_j) = p(\sigma_i)\, p(\sigma_j) \tag{12} $$
In this case, the message is called memoryless, or the 0th-order Markovian.
Shannon’s Self-Information For such a memoryless source, the Shannon self-information Ij of the jth event is defined as
I (σ j ) ≡ I j log b
1 = − log b p j pj
[information unit, or u]
(13)
where pj ≡ p(σj) for brevity, and b is the size of the coding alphabet Γc required to code each symbol. Since each symbol probability is confined to the unit interval, pj ∈ [0,1], the self-information is always non-negative, Ij ∈ [0,∞). For a binary coding alphabet Γc = {0,1}, b = 2 and u ≡ bit (binary digit), while for the natural base b = e, u ≡ nat (natural digit), and for b = 10, u ≡ Hartley. For simplicity, we shall assume the binary coding alphabet. This gives a clear basis for the interpretation of Shannon self-information: it is the number of bits required to
represent a symbol. If the probability of a symbol is 1, it requires no bits, as it is a tautology. When the probability of a symbol drops, the number of bits required increases. This statement could also be rephrased as "information that is surprising (improbable, news) is more informative". For example, the probabilities of the frequent letters E and T in English are p(E) = 0.13 and p(T) = 0.09, respectively, while the less frequent letter Q has a probability of p(Q) = 0.0025. Consequently, the letters require I(E) = –log2(0.13) = 2.94 bits, I(T) = 3.47 bits, and I(Q) = 8.64 bits. Of course, the number of bits used in any simple practical code would have to be the integers 3, 4, and 9, respectively. In general, the number of information units λj required to encode a symbol σj, whose probability is pj, can be computed from

$$ \lambda_j \triangleq \lceil I_j \rceil = \lceil -\log_b p_j \rceil \tag{14} $$

where ⌈x⌉ is the ceiling function that produces the closest integer greater than or equal to x. This encoded symbol with λj information units is called a codeword. This strategy has been employed in many codes. For example, the Shannon-Fano codes for E, T, and Q are 000, 001, and 111111110, while the slightly better Huffman codes for the letters are 000, 0010, 1111110, respectively (Kinsner, 1991). Another example is the Morse code used in telegraphy, in which the letter E requires a single short sound DIT, and the letter T has a single long sound DAH, while the less frequent Q requires four sounds DAH DAH DIT DAH. Such variable-length codes always reduce the number of bits in a message with respect to a code that uses the same number of bits per symbol, regardless of their frequency of use in a specific class of messages. What is the computational application of this definition of Shannon's information? As we have seen, it leads to more compact messages through efficient coding of symbols, and it allows us to calculate the total number of bits in any message to be generated. It should be clear, however, that this definition of information is divorced from all subjective factors, such as meaning (context), common-sense understanding, and perception or cognition. It just means more bits for a lower-probability symbol. This is the main source of difficulties in connecting this definition with subjective performance metrics.
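The letter example above can be verified directly; the following sketch computes the self-information of Eq. 13 and the integer codeword lengths of Eq. 14 for the quoted probabilities of E, T, and Q.

```python
# Self-information (Eq. 13) and integer codeword lengths (Eq. 14) for the
# English letter probabilities quoted above (binary coding alphabet, b = 2).
from math import ceil, log2

for letter, p in [("E", 0.13), ("T", 0.09), ("Q", 0.0025)]:
    info = -log2(p)   # Eq. 13 with b = 2
    print(f"I({letter}) = {info:.2f} bits -> codeword length {ceil(info)}")
# I(E) = 2.94 bits -> 3; I(T) = 3.47 bits -> 4; I(Q) = 8.64 bits -> 9
```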
Conditional Self-Information

Following the reasons behind the definitions of conditional and joint probabilities for messages with inter-symbol dependence (memory), we define the conditional self-information as

$$ I(\sigma_j \mid \sigma_i) \equiv I_{j|i} \triangleq \log_b \frac{1}{p(\sigma_j \mid \sigma_i)} = -\log_b p_{j|i} \quad \text{[information unit, or u]} \tag{15} $$

and the joint self-information as

$$ I(\sigma_i \sigma_j) \equiv I_{ij} \triangleq \log_b \frac{1}{p(\sigma_i \sigma_j)} = -\log_b p_{ij} \tag{16} $$
As before, for M independent events, the joint self-information is

$$ I_{1 \ldots M} = \sum_{j=1}^{M} I_j \tag{17} $$
This definition of conditional self-information shortens the number of bits per symbol for digrams and, when expanded further, for k-grams.
Entropies of Alphabets and Messages

There are many definitions of entropy, as summarized at the end of the next section. We shall first define it based on Shannon's self-information, followed by a review of other definitions of entropy and distortion entropies in Sec. 6.
Shannon’s Source Etropy and Redundancy While self-information describes the length of a single symbol in terms of information units, thus providing the length of the entire message containing M symbols, entropy gives the average information, regardless of the message size. It is then defined as the average (expected) value of self-information N
H ∑ p (σ j ) I (σ j ) j =1
(18)
N
= −∑ p (σ j ) log b p (σ j ) j =1 N
≡ −∑ p ( j ) log b p ( j ) j =1 N
≡ −∑ p j log b p j [u/symbol] j =1
where N is the size of the source alphabet Σ = {σ1, σ2, ..., σN} and p(σj) ≡ p(j) ≡ pj is the probability of the jth symbol taken from the corresponding pmf P = [p1, p2, ..., pN]. The expression is related to the Boltzmann entropy (but with the opposite sign) and the Boltzmann-Gibbs entropy (with the same sign), as described in Secs. 6.6.3 and 6.6.4, respectively. This entropy function H(P) is non-negative and concave in P (Cover & Thomas, 1991). This is also called the 1st-order entropy, denoted by H(1), because the expression uses a single value of the probability in both the self-information and the weight. The parentheses are used in the subscript to differentiate this notation from the Hq notation in the Rényi entropy, as discussed later. We often use another subscript to emphasize the order of the Markov chain model of the message itself. For example, the 1st-order entropy for a memoryless message with a nonuniform pmf is denoted by H(1,0), while the 1st-order entropy for a memoryless message with a uniform pmf is denoted by H(1,-1). This special case can be expressed as

$$ H_{(1,-1)} = H_{max} = -\sum_{j=1}^{N} \frac{1}{N} \log_b \frac{1}{N} = \log_b N \tag{19} $$
It is very important because it defines the redundancy HR

$$ H_R(A) = H_{max}(A) - H(A) \tag{20} $$
where A represents any alphabet (either source or code), Hmax(A) represents the maximum possible entropy for an iid distribution, and H(A) is the actual entropy for the given alphabet A. If HR(A) is removed from the message, no loss of information occurs. This defines lossless compression.
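The following sketch computes the source entropy (Eq. 18), the maximum entropy (Eq. 19), and the redundancy (Eq. 20) for an illustrative pmf.

```python
# Shannon entropy (Eq. 18), maximum entropy (Eq. 19), and redundancy
# (Eq. 20) of an alphabet; the pmf is illustrative.
from math import log2

def entropy(pmf):
    """H = -sum p_j * log2 p_j [bits/symbol], skipping zero-probability terms."""
    return -sum(p * log2(p) for p in pmf if p > 0)

def redundancy(pmf):
    h_max = log2(len(pmf))         # Eq. 19: H_max = log_b N
    return h_max - entropy(pmf)    # Eq. 20: H_R = H_max - H

pmf = [0.5, 0.25, 0.125, 0.125]
print(entropy(pmf))      # 1.75 bits/symbol
print(redundancy(pmf))   # 2.0 - 1.75 = 0.25 bits/symbol
```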
Shannon’s Code Entropy If each individual symbol has a codeword that has an integer number of bits, λj, then the source entropy H(Σ) may be different from the code entropy H(Γ). The code entropy is defined as the weighted sum of the self-information of the individual codewords
39
Is Entropy Suitable to Characterize Data and Signals for Cognitive Informatics?
N
H (Γ) + ∑ p j λ j [u/symbol]
(21)
j =1
Notice that since Ij ≤ λj, then

$$ H(\Sigma) \le H(\Gamma) \tag{22} $$
When the equality in (22) is reached, the code is called perfect in the information-theoretic sense. For example, the arithmetic code (which does not require an integer number of bits per symbol) is closer to the perfect code than the Huffman code (Sayood, 2000).
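The relation of Eq. 22 can be checked numerically; the following sketch uses the integer Shannon codeword lengths of Eq. 14 to form the code entropy of Eq. 21 for an illustrative pmf.

```python
# Code entropy (Eq. 21) with the integer Shannon codeword lengths of
# Eq. 14, verifying H(Sigma) <= H(Gamma) (Eq. 22); the pmf is illustrative.
from math import ceil, log2

pmf = [0.4, 0.3, 0.2, 0.1]
h_source = -sum(p * log2(p) for p in pmf)          # H(Sigma), Eq. 18
lengths = [ceil(-log2(p)) for p in pmf]            # lambda_j, Eq. 14
h_code = sum(p * l for p, l in zip(pmf, lengths))  # H(Gamma), Eq. 21

print(f"H(Sigma) = {h_source:.3f} bits/symbol")    # ~1.846
print(f"H(Gamma) = {h_code:.3f} bits/symbol")      # 2.400 (integer lengths)
assert h_source <= h_code                          # Eq. 22
```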
Higher-Order Message Entropy

For independent symbols, the message M is of the 0th order, and its entropy equals the source entropy, H(M) = H(Σ). If encoded, then the following relation must hold: H(M) ≤ H(Γ). However, if the message is of the 1st order (i.e., it has memory of one symbol), then the message entropy must be of the 2nd order, as denoted by

$$ H_{(2,1)}(M) \triangleq -\sum_{i=1}^{N} \sum_{j=1}^{N} p(i, j) \log_b p(j \mid i) \quad \text{[u/symbol]} \tag{23} $$

where p(i,j) and p(j | i) are the joint and conditional probabilities, respectively. This can be generalized to any higher-order entropy H(k+1,k) for messages of higher order k (Sayood, 2000).
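The following sketch estimates the 2nd-order message entropy of Eq. 23 from digram counts; the messages are illustrative, and a fully deterministic digram structure yields zero conditional entropy.

```python
# Estimating the 2nd-order message entropy H_(2,1) of Eq. 23 from digram
# counts of a sample message; the messages are illustrative.
from collections import Counter
from math import log2

def digram_entropy(msg):
    """H_(2,1) = -sum p(i,j) * log2 p(j|i), estimated from the message."""
    pairs = list(zip(msg, msg[1:]))
    joint = Counter(pairs)                  # counts of digrams (i, j)
    first = Counter(i for i, _ in pairs)    # counts of the leading symbol i
    n = len(pairs)
    h = 0.0
    for (i, j), c in joint.items():
        p_ij = c / n              # p(i, j)
        p_j_given_i = c / first[i]  # p(j | i)
        h -= p_ij * log2(p_j_given_i)
    return h

print(digram_entropy("abababab"))    # 0.0: each symbol fully determines the next
print(digram_entropy("abracadabra"))  # > 0: partially predictable digrams
```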
Entropies of Distortion

Figure 7. Venn diagram illustration of joint entropy, H(X,Y), conditional entropy, H(X|Y) and H(Y|X), and mutual entropy, H(X;Y)

In lossless compression, the original message M and the reconstructed message M* are the same, and the measures discussed so far are sufficient for their comparison. In lossy compression, the reconstructed message may be different from M, thus leading to distortion and a different reconstruction alphabet Σ*. The distortion can be measured through distortion entropies such as conditional, mutual, and relative (Cover & Thomas, 1991, and
Kinsner, 1998). In order to avoid cumbersome notation, we shall denote the original message as X ≡ M, and the reconstructed message as Y ≡ M*, with the corresponding source and reconstruction alphabets denoted by X = {x1, x2, ..., xN} and Y = {y1, y2, ..., yL}, and their cardinalities of N and L, respectively. Notice that N and L do not have to be equal. We also assume that the entropy of each message equals the entropy of its alphabet.
Joint Entropy, H(X,Y)

The joint entropy H(X,Y) of two discrete random variables X and Y is fundamental to the definition of the conditional and other entropies. It is defined as

$$ H(X, Y) \triangleq -\sum_{i=1}^{N} \sum_{j=1}^{L} p(x_i, y_j) \log_b p(x_i, y_j) \tag{24} $$
where N and L are the cardinalities of X and Y, respectively, and p(x,y) is the joint pmf. This joint entropy can be illustrated by the Venn diagram shown in Figure 7. It can be seen from the diagram in Figure 7 that (for proof, see Cover & Thomas, 1991, p. 28)

$$ H(X, Y) \le H(X) + H(Y) \tag{25a} $$

and

$$ H(X, Y) = H(Y, X) \tag{25b} $$
Conditional Entropy, H(Y|X) and H(X|Y) The conditional entropy H(Y|X) that the reconstruction message Y has occurred, given that the source message X has occurred, is defined as the average conditional self-information I(y | x)
$H(Y \mid X) = \sum_{x \in X} p(x) I(Y \mid X = x) = -\sum_{i=1}^{N}\sum_{j=1}^{L} p(x_i, y_j) \log_b p(y_j \mid x_i)$ (26)
Similarly,
$H(X \mid Y) = -\sum_{j=1}^{L}\sum_{i=1}^{N} p(x_i, y_j) \log_b p(x_i \mid y_j)$ (27)
This conditional entropy is illustrated in Figure 7. It can be seen that (Cover & Thomas, 1991, p. 27)
H(Y|X) ≤ H(Y) (28a)
H(X|Y) ≤ H(X) (28b)
and, in general,
H(X|Y) ≠ H(Y|X) (29)
It can also be shown that (Sayood, 1996, Example 7.4.2)
H(Y|X) = H(X,Y) – H(X) (30a)
H(X|Y) = H(X,Y) – H(Y) (30b)
Mutual Entropy, H(X;Y) The mutual entropy H(X;Y) of the source message X and the reconstruction message Y is defined as the average mutual self-information I(x;y)
$H(X;Y) = \sum_{x \in X}\sum_{y \in Y} p(x,y)\, I(x;y)$ (31)
where
$I(x;y) = \log_b \frac{p(x_i \mid y_j)}{p(x_i)} = \log_b \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)}$ (32)
It can be seen from Figure 7 that, since the mutual entropy is common to both the source and the reconstruction, it could be used to make the reconstruction look like the source when H(X;Y) reaches its maximum value. When H(X;Y) = 0, the source and reconstruction are totally different. This feature has made mutual entropy a prominent player in many areas of signal processing. It can also be shown that
H(X;Y) = H(X) – H(X|Y) (33a)
H(Y;X) = H(Y) – H(Y|X) (33b)
and
H(X;Y) = H(X) + H(Y) – H(X,Y) (33c)
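The identities (31)-(33c) can be verified numerically; the sketch below (ours, with an assumed joint pmf) computes the mutual entropy directly from equation (32):

    import math

    def mutual_entropy(p_xy):
        # H(X;Y) from Eqs. (31)-(32)
        p_x = [sum(row) for row in p_xy]
        p_y = [sum(col) for col in zip(*p_xy)]
        return sum(p * math.log2(p / (p_x[i] * p_y[j]))
                   for i, row in enumerate(p_xy)
                   for j, p in enumerate(row) if p > 0)

    p_xy = [[0.3, 0.2],                       # hypothetical joint pmf
            [0.1, 0.4]]
    # Identity (33c) holds numerically: H(X;Y) = H(X) + H(Y) - H(X,Y)
    print(mutual_entropy(p_xy))               # about 0.125 bit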
Relative Entropy, H(X || Y) In this chapter, the most important distortion-related entropy is the relative entropy, denoted by H(X || Y). If we assume that both the source alphabet X and the reconstruction alphabet Y have the same cardinality N, then the relative entropy can be written as
$H(X \| Y) = \sum_{j=1}^{N} p(x_j) \log_b \frac{p(x_j)}{p(y_j)}$ (34)
This value is positive if the pmfs of the two alphabets are not equal, and zero if and only if P(X) = P(Y). The relative entropy is also called the Kullback-Leibler divergence (distance), as it measures the dissimilarity between two alphabets of the same cardinality. This property makes it suitable for perceptual quality metrics (Dansereau & Kinsner, 2001, 2006).
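A short illustration of equation (34); the two pmfs are hypothetical (our own sketch):

    import math

    def relative_entropy(p, q, b=2):
        # Kullback-Leibler divergence H(X||Y), Eq. (34)
        return sum(pj * math.log(pj / qj, b) for pj, qj in zip(p, q) if pj > 0)

    p = [0.5, 0.3, 0.2]              # hypothetical source pmf
    q = [0.4, 0.4, 0.2]              # reconstruction pmf, same cardinality
    print(relative_entropy(p, q))    # > 0; zero if and only if p == q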
Rényi Entropy Spectrum, H q Shannon’s 1st-order and higher-order entropies provide a measure of the average information for either the source or the reconstruction or both, and are of great importance in data and signal transmission, storage, and signal processing. In 1955, Alfréd Rényi (1921-1970; Erdös Number 1) introduced a generalized entropy, Hq, that could discern the spread of probabilities in the pmf. For a source message M with its source alphabet Σ of cardinality N and its corresponding pmf, P, the Rényi entropy spectrum is given by
$H_q(P) = \frac{1}{1-q} \log_b \sum_{j=1}^{N} p_j^q, \quad -\infty \le q \le \infty$ (35)
where q is the moment order. For q = 0, the Rényi entropy becomes the maximum (capacity) entropy H(1,–1), also known as the morphological entropy (Kinsner, 1996, 2005)
$H_0(P) = H_{\max} = \log_b N$ [u/symbol] (36)
For q = 1, it can be shown that it reduces to the Shannon entropy H(1,0), also known as the information entropy (Kinsner, 1996)
$H_1(P) = -\sum_{j=1}^{N} p_j \log_b p_j$ (37)
For q = 2, it becomes the correlation entropy (Kinsner, 1994)
$H_2(P) = -\log_b \sum_{j=1}^{N} p_j^2$ (38)
For q = ±∞, it becomes the Chebyshev entropy (Kinsner, 1996), with the extreme values of the probability defining the following two extreme values
$H_{\infty}(P) = -\log_b p_{\max}$ (39a)
$H_{-\infty}(P) = -\log_b p_{\min}$ (39b)
Since pmin ≤ pmax, then |logb(pmax)| ≤ |logb(pmin)|, and the entropy spectrum has upper and lower bounds. It can be shown that Hq is a monotonically nonincreasing function of q, and it becomes constant only for an iid pmf. Since the spread of this "inverted S" curve in Figure 8 depends on the spread of probabilities in the pmf, the curve can be used as a measure of the differences (distortion) between the source pmf, X, and the corresponding reconstructed pmf, Y, as shown in Figure 8. Based on these measures, a suitable cost function can then be established for rate-distortion minimization.
Figure 8. Rényi entropy spectrum for a source X and its reconstruction Y messages
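The spectrum of equation (35) can be computed directly; the following sketch (our own, with q = 1 treated as the Shannon limit and a hypothetical pmf) evaluates Hq at a few moment orders:

    import math

    def renyi_entropy(pmf, q, b=2):
        # H_q(P) of Eq. (35); q = 1 is handled as the Shannon limit
        if abs(q - 1.0) < 1e-9:
            return -sum(p * math.log(p, b) for p in pmf if p > 0)
        return math.log(sum(p ** q for p in pmf if p > 0), b) / (1.0 - q)

    pmf = [0.6, 0.2, 0.1, 0.1]                # hypothetical pmf
    for q in (-2, 0, 1, 2, 5):
        print(q, renyi_entropy(pmf, q))       # nonincreasing in q; H_0 = log2(4)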
This entropy spectrum can also be used as a detector of stationarity of a signal; i.e., while a stationary signal produces a constant curve over time or space, a nonstationary signal produces a varying spectrum trajectory. The major advantages of this approach over the direct study of the pmfs include: (i) the pmfs can be of different cardinalities; (ii) the entropy spectrum Hq can be used in multiscale analysis to establish the fractal dimension spectrum Dq (Kinsner, 1996, 2005); and (iii) Dq can then be used to extract the Mandelbrot singularity spectrum (Kinsner, 1996, 2005). We have applied both Hq and Dq in the study of multifractals in dielectric discharges, transient signal analysis, fingerprint compression, speech segmentation into phonemes, image and video compression, biomedical (ECG and EMG) segmentation and classification, DNA sequencing, and cryptography.
Other Entropies The Shannon and Rényi entropies relate to the Boltzmann-Gibbs entropy concept in which a probability function, W, determines the direction towards disorder: since a closed system tends to a thermodynamical disorder, the entropy increases with increasing W. Since self-information was defined in the same direction, a random message carries more self-information than a legible message. Clearly, self-information defined in this way runs contrary to the conventional perceptual and cognitive notions of information. Several alternative approaches to defining entropy and information will be summarized. We shall start from the Kolmogorov and Kolmogorov-Sinai entropies, which provide a fundamental alternative to the Shannon entropy as they do not involve probabilities, with the latter describing dynamic rather than static systems. They are followed by Prigogine's entropy for open self-organizing systems. For completeness, the Boltzmann, Gibbs, Schrödinger, and Stonier entropies will also be highlighted. There are still other entropies (e.g., fuzzy entropy) that are not treated in this chapter.
Kolmogorov Entropy (Complexity) In 1965, Andrei N. Kolmogorov (1903-87) introduced an alternative algorithmic (descriptive) complexity measure KU(X) of a message X as the shortest length of a binary program P that can be interpreted and halted on a universal computer U (such as a Turing machine), and that describes the message completely, without any reference to the pmf. The entropy is given by
$K_U(X) = \min_{P : U(P) = X} \ell(P)$ [bits] (40)
where ℓ(P) denotes the length (in bits) of the program P.
Since the expected value of this Kolmogorov complexity measure of a random message is close to Shannon’s entropy, this concept can be considered more fundamental than the entropy concept itself (Cover & Thomas, 1991, Ch. 7).
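Although KU(X) itself is uncomputable, any lossless compressor yields an upper bound on it. The sketch below (ours; it uses the standard zlib module and is only a crude illustration) contrasts a regular message with a random one:

    import os
    import zlib

    def complexity_upper_bound(message: bytes) -> int:
        # Length in bits of a zlib-compressed form: a crude upper bound on
        # K(X); the true Kolmogorov complexity is uncomputable, and any
        # practical compressor can only bound it from above.
        return 8 * len(zlib.compress(message, 9))

    regular = b"ab" * 500                  # highly regular 1000-byte message
    random_like = os.urandom(1000)         # incompressible with high probability
    print(complexity_upper_bound(regular) < complexity_upper_bound(random_like))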
Kolmogorov-Sinai Entropy In dynamical systems, the Kolmogorov-Sinai (KS) entropy HKS is a measure of information loss per iteration in maps, for which the iteration count n is an integer, n ∈ Z (or per unit of time in flows, for which time t is continuous, t ∈ R), in m-dimensional (mD) phase space (Kinsner, 2003a). Thus, the KS entropy can be used to characterize chaos in an mD phase space (Atmanspacher & Scheingraber, 1987). For example, while nonchaotic systems have HKS = 0, chaotic systems have HKS > 0, and uncorrelated noise has HKS = ∞ (Kinsner, 2003c). There are several schemes to compute the KS entropy (Kinsner, 2003b). If a dynamical system has several positive Lyapunov exponents λj, the following Ruelle inequality holds for most dynamical systems (Ruelle, 1978; Grassberger & Procaccia, 1983)
$H_{KS} \le \sum_{j=1}^{J} \lambda_j$ (41)
where J is the index of the smallest positive Lyapunov exponent. Pesin (1977) has shown that the inequality also holds for flows. Thus, Lyapunov exponents provide a good estimate of the KS entropy, without any reference to the source statistics, because the Lyapunov exponents can be calculated directly from the trajectories of the corresponding strange attractor. This is important because accurate estimates of the entropy from the process statistics would require a very large number of data points in a time series (Williams, 1997, Ch. 26). The significance of the KS entropy is that it extends the static probabilistic Shannon entropy measure to dynamical systems which are deterministic and dynamic in that they provide a continuous supply of new information during their evolution in chaos. We propose that this single KS entropy could also be generalized to HqKS with moment orders q ∈ R, similarly to the generalization of the single Shannon entropy, as discussed in Sec. 6.5.
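As an illustration of estimating HKS from Lyapunov exponents, the following sketch (ours, not from the chapter) computes the single Lyapunov exponent of the chaotic logistic map, for which the Ruelle/Pesin relation reduces to HKS ≈ λ:

    import math

    def lyapunov_logistic(r=4.0, x0=0.3, n=100000, burn=1000):
        # Average of log|f'(x)| along a trajectory of x -> r x (1 - x);
        # for this 1-D chaotic map, H_KS is approximated by this exponent
        x, acc = x0, 0.0
        for k in range(n):
            x = r * x * (1.0 - x)
            if k >= burn:
                acc += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        return acc / (n - burn)

    print(lyapunov_logistic())    # close to ln 2 = 0.693 for r = 4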
Prigogine Entropy For years, Ilya Prigogine (1917-2003) had been developing ideas related to dynamical systems and complexity, with emphasis on far-from-equilibrium self-organization. He described three forms of thermodynamics: (i) thermostatics (i.e., systems in equilibrium, at which nothing special can happen because any perturbation is ignored by the system due to the Gibbs' minimum free energy principle), (ii) linear thermodynamics (near-equilibrium, also governed by the minimum principle), and (iii) far-from-equilibrium thermodynamics (Prigogine & Stengers, 1984; Prigogine, 1996). The latter form is the most interesting, as it includes both inflows and outflows of energy, matter, and entropy (organization) between the open system and its environment. This exchange can be written as
dSP = dSC + dSE (42)
where SP denotes Prigogine’s entropy which consists of the internal (Clausius) entropy SC and the exchange entropy SE. Since for irreversible systems, dSC > 0, the Prigogine entropy dSP depends on the new component which can now be either (i) dSE > 0 (nothing special), or (ii) dSE = 0 (an isolated system at equilibrium), or (iii) dSE < 0 (negentropy, or provision of order). If |dSC| < |dSE| then dSP < 0. This negentropy indicates self-organization which can occur in the far-from-equilibrium state because the system does not have to conform to any minimum principle. This entropy appears to be critical in future studies of measures for CI.
Clausius Entropy In 1820, Sadi Carnot (1796-1832) formulated the first law of thermodynamics (that energy cannot be created or destroyed) in the context of the maximum efficiency that a steam engine could achieve. In 1865, Rudolf Clausius (1822-88) proposed the following definition of the entropy function SC
$dS_C = \left( \frac{\delta Q}{T} \right)_R$ (43)
where dSC denotes the exact differential (i.e., one whose integral is independent of the configuration path selected), while δQ is an inexact differential of thermal energy Q (as its integral depends on the path selected), T is the absolute temperature in K, and the subscript R denotes that the expression is valid for reversible processes only, close to thermal equilibrium at a macroscopic scale. He also expanded this expression to irreversible systems, for which the entropy increases, dSC > 0, and by introducing this physical evolution, he defined the second law of thermodynamics (that heat cannot of itself pass from a colder to a hotter body), and coined the word entropy from the Greek word (τροπη) for "transformation" or "evolution." Clausius also made the following famous categorical statements: (1) "The energy of the universe is constant", and (2) "The entropy of the universe tends to a maximum." These statements apply to an abstract closed universe only.
Boltzmann Entropy In 1898, following Carnot and Clausius, Ludwig Boltzmann (1844-1906) expanded this fundamental concept of thermodynamic entropy ST as given by
ST = k logb W (44)
where k is the Boltzmann constant (1.3807×10–23 J/K or 3.2983×10–24 cal/K), b = e, and W is the thermodynamic function such that when the disorder of the system increases, W increases with it, thus increasing ST. He defined entropy in terms of a macrostate determined by a large number of microstates. For example, let us consider that the macrostate is determined by a set of 16 non-overlapping coins distributed in a 2D space, and that the microstate is formed by each coin lying either face up or face down. The number of microstates for the most unlikely, organized macrostates (in which all the coins are either face up or all face down) is W(pmin) = 1. The most likely, disorganized macrostate is that in which half of the coins are up and the other half are down, which gives W(pmax) = C(16,8) = 12,870. Thus, ST(pmin) < ST(pmax). Since W is represented by the natural numbers, starting from 1, ST is non-negative. Since any ordered closed system tends to a disordered state at its equilibrium ST*, the disordered state is more probable than an ordered state, thus leading to the second law of thermodynamics. Observe that if W is reformulated in terms of a probability function, and the sequence of macrostates is substituted by time t, then –∞ < ST(t0) ≤ ST(t) ≤ 0 for all times t0 < t, regardless of the initial system preparation, where t0 is the initial time. In either case, the entropy difference between t and t0 is positive. Work is required to organize a system. The present research interest is in open systems that are far from this equilibrium. Notice that although Boltzmann did not deal with information explicitly, the concept of "degree of disorder" is related to it.
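The coin example can be reproduced in a few lines (our sketch, using equation (44) with b = e):

    from math import comb, log

    k_B = 1.3807e-23                          # Boltzmann constant, J/K
    W_ordered = 1                             # all 16 coins face up (or all down)
    W_disordered = comb(16, 8)                # 12,870 half-up/half-down microstates
    S_ordered = k_B * log(W_ordered)          # = 0 for the ordered macrostate
    S_disordered = k_B * log(W_disordered)    # > 0 for the disordered macrostate
    print(W_disordered, S_disordered > S_ordered)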
Boltzmann-Gibbs Entropy In 1902, J. Willard Gibbs (1839-1903) formalized Boltzmann's entropy within a measure space (consisting of a phase space X, a σ-algebra, and a measure µ (Mackey, 1992)), and formulated the thermodynamic entropy in terms of densities f on an ensemble to deal with the very large numbers of particles in a volume. An ensemble is a set of small subsystems that are configured identically, each with a finite number of particles. The entropy can be written as
$H_T(f) = -\int_X f(x) \log f(x)\, dx$ (45)
which is the expected value of the density (for a continuous case). Notice that the sign is the opposite of the original Boltzmann’s ST. Again, Gibbs did not deal with information explicitly. He also formulated the concept of free energy which is the difference between the total energy and the unavailable energy (lost in the processes). This leads to the concept of quality of energy sources, and may also be useful in CI.
Schrödinger Negentropy In 1944, Erwin Schrödinger (1887-1961) introduced the concept of negative entropy (negentropy) to stress the organization of living systems (Schrödinger, 1944). He started from Boltzmann's formulation
SS = k logb DS (46)
where DS is similar to W in (44). Since living organisms have the tendency to maintain a low level of entropy by “feeding upon negative entropy” (i.e., taking orderliness from their environment), he expressed it as:
$-S = k \log_b \frac{1}{D_S}$ (47)
Again, Schrödinger did not deal with information directly. Later, the expression was also pursued by Brillouin (1964) who considered W to be a measure of uncertainty.
Stonier Entropy The Schrödinger entropy was further developed by many others, including Tom Stonier (1927-99) (Stonier, 1990). He considered
$O_S = \frac{1}{D_S}$ (48)
in (46) as a measure of an ordered system, and defined information as
I = f(OS) (49)
or
$I = c\, e^{-S/k}$ (50)
where S is the Schrödinger entropy, k is Boltzmann's constant, and c is an information constant of a system at zero entropy. This formulation of information is totally different from that of Shannon and Rényi, in that an ordered (legible) message M1 now carries more information than a more random string M2.
Summary and Discussion The main objective of this chapter was to provide a review of self-information and entropy as they might be used in measuring the quality of reconstruction in data and signal compression for multimedia. Another objective was to introduce alternative definitions of entropy that do not require the source or reconstruction statistics. Still another objective was to describe an entropy capable of measuring dynamic information content, as can be found in chaotic dynamical systems. This chapter is an extension of the data and signal compression techniques and metrics described by Kinsner (2002). We have defined data as bags of symbols (or strings) whose origin and destination are not known. Any transformation of the data must be lossless in the sense that no information is lost. On the other hand, signals are bags of symbols (or strings) with known origin and destination. Such data or signals can form finite messages. In cognitive informatics, we are concerned with the transformation of signals to enhance their characteristic features for perception, cognition and learning. The transformations can be lossy, as long as the distortion between the reconstruction and the source does not impede the key objective of the maximal transfer of information through the signals used. We have also distinguished between two fundamentally different classes of signals: linear time-invariant (LTI) and scale-invariant (SI). Many new metrics can be found for the SI signals that are not available for the LTI signals. This chapter has reviewed a number of different forms of Shannon self-information and entropy. The self-information of a symbol is defined as a function of its probability, and is measured in information units such as bits. Entropy is defined as the average (expected) self-information, which can be interpreted as the average number of information units per symbol, regardless of the size of the message. Since the Shannon self-information and entropy both have the same root, their interpretation relates to the Boltzmann entropy. Consequently, Shannon self-information had to be divorced from any cognitive meaning.
The single kth-order Shannon entropy of messages with different memories (according to Markov-chain models) is useful in developing perfect codes in the information-theoretic sense, but does not deal with the spread of probabilities in the source or destination alphabet. To solve the problem, we discussed the Rényi generalized entropy spectrum, Hq, which provides a bounded representation of the signal. This functional (or vectorial) representation could be used to determine the distortion between a source, Hq(X), and its reconstruction, Hq(Y), no longer in terms of scalars, but in terms of vectors. The difference between Hq(X) and Hq(Y) could then be used to establish a cost function in order to achieve an optimal perceptual quality of the reconstruction. This single-scale Rényi entropy spectrum, however, has a serious limitation when dealing with self-similar or self-affine signals, which are scale-invariant. For such signals, the analysis must be done at different scales to discover any power-law relationship that might be present in the signal, and if present, a spectrum of fractal dimensions could be computed (Kinsner, 1996). The significance of this Rényi fractal dimension spectrum is that it can characterize strange attractors that are often multifractal. Furthermore, since images or temporal signals can be considered as strange attractors of iterated function systems (Barnsley, 1988), the Rényi fractal dimension spectrum can be used to characterize such signals. We have demonstrated elsewhere that this approach can lead to even better perceptual metrics (Dansereau & Kinsner, 2001; Kinsner & Dansereau, 2006). Other definitions of entropy have also been presented in this chapter. For example, the Kolmogorov entropy generalizes Shannon's entropy, as it does not refer to the pmf at all. The Kolmogorov-Sinai entropy also extends Shannon's entropy, as it can deal with systems that create new information during their evolution. Such metrics could be applicable to learning processes in CI. Although there are many definitions of entropy, the core idea that makes entropy so important in the probabilistic and algorithmic information theories is that it describes the disorder and order of a message. This order is critical to CI. Many contemporary quality metrics still have a major difficulty with measuring perceptual quality because they are based on the error energy between the source and the reconstruction, while the human visual system and the psychoacoustic system involve not only energy, but many other factors such as singularities. On the other hand, entropy-based measures are more suitable for quality metrics, as they describe the disorder of the source and reconstruction. A suitable cost could then be designed to maximize the perceptual quality of the reconstruction, at the lowest possible bit rate. Since it is most unlikely that a single cost function could apply to all multimedia materials, it should use adaptation and learning to match both the properties of the material and the specific needs of a user. Thus, the question posed in this chapter has an affirmative answer: although the entropy-based measures are useful in characterizing data and signals, and in establishing the perceptual quality of their reconstructions objectively, they should be used only in conjunction with other complementary concepts such as various multiscale singularity measures that could be developed from the entropy-based measures described in this chapter.
In fact, such measures are described by Kinsner (2005) and Kinsner & Dansereau (2006). The fundamental reason why multiscale entropy-based measures are more suitable for quality metrics than various energy-based measures is that the former describe the complexity of the source and reconstruction. The complexity is related not only to the structure and context of the message, but also to the singularity distribution in the message over multiple scales. This property is essential in perceptual, cognitive and conscious processes. Thus, such entropy-based multiscale metrics differ fundamentally from any other measures in the classical information theory. This is described in more detail by the unified approach to fractal dimensions (Kinsner, 2005), and is illustrated by the explicit examples of perceptual quality metrics through relative multiscale entropy-based measures, as described by Kinsner & Dansereau (2006). However, since measuring the content (meaning) and value (utility) of a message to a single user and to multiple users requires not only the static multiscale entropy-based measures, as described here, but also measures of their relative dynamics, this problem will be covered in our future work.
Acknowledgment This work was supported partially by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
References
Alligood, K.T., Sauer, T.D., & Yorke, J.A. (1996). Chaos: An introduction to dynamical systems (p. 603). New York, NY: Springer Verlag.
Atmanspacher, H., & Scheingraber, H. (1987). A fundamental link between system theory and statistical mechanics. Foundations of Physics, 17, 939-963.
Barnsley, M. (1988). Fractals everywhere (p. 396). Boston, MA: Academic.
Bergson, H. (1960). Time and free will: An essay on the immediate data of consciousness. New York, NY: Harper Torchbooks (Original edition 1889, translated by F.L. Pogson).
Brillouin, L. (1964). Scientific uncertainty and information. New York, NY: Academic.
Chan, C.W. (2002, August). Cognitive informatics: A knowledge engineering perspective. In Proceedings of the 1st IEEE International Conference on Cognitive Informatics (pp. 19-20, 49-56). Calgary, AB. {ISBN 0-7695-1724-2}
Cover, T.M., & Thomas, J.A. (1991). Elements of information theory (p. 542). New York, NY: Wiley.
Dansereau, R.M., Kinsner, W., & Cevher, V. (2002, May 12-15). Wavelet packet best basis search using Rényi generalized entropy. In Proceedings of the IEEE 2002 Canadian Conference on Electrical & Computer Engineering, CCECE02, 2, 1005-1008. Winnipeg, MB. ISBN 0-7803-7514-9.
Dansereau, R., & Kinsner, W. (2001, May 7-11). New relative multifractal dimension measures. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP2001, 1741-1744. Salt Lake City, UT.
Dubois, D., & Prade, H. (1988). Possibility theory: An approach to computerized processing of uncertainty (p. 263). New York, NY: Plenum.
Farrell, J.E., & Van Den Branden Lambrecht, C.J. (Eds.) (2002, January). Translating human vision research into engineering technology [Special Issue]. Proceedings of the IEEE, 90(1).
Grassberger, P., & Procaccia, I. (1983, January 31). Characterization of strange attractors. Physics Review Letters, 50A(5), 346-349.
Hawking, S. (1996). The illustrated A brief history of time (2nd ed.) (p. 248). New York, NY: Bantam.
Hartley, R.V.L. (1928). Transmission of information. Bell System Technical Journal, I, 535-563.
Held, G. (1987). Data compression: Techniques and applications, hardware and software considerations (2nd ed.) (p. 206). New York, NY: Wiley.
ISO/IEC 11172-3 (1993). Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s - Part 3: Audio.
Jayant, N. (1992, June). Signal compression: Technology targets and research directions. IEEE Journal on Selected Areas in Communications, 10, 796-818.
Jayant, N. (Ed.) (1997). Signal compression: Coding of speech, audio, text, image and video (p. 231). Singapore: World Scientific.
Jayant, N.S., Johnson, J.D., & Safranek, R.S. (1993, October). Signal compression based on models of human perception. Proceedings of the IEEE, 81(10), 1385-1422.
Kadanoff, L.P. (1993). From order to chaos: Essays (p. 555). Singapore: World Scientific.
Kantz, H., & Schreiber, T. (1997). Nonlinear time series analysis (p. 304). Cambridge, UK: Cambridge University Press.
Kinsner, W. (1991). Review of data compression methods, including Shannon-Fano, Huffman, arithmetic, Storer, Lempel-Ziv-Welch, fractal, neural network, and wavelet algorithms. Technical Report DEL91-1 (p. 157). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1994). Fractal dimensions: Morphological, entropy, spectrum, and variance classes. Technical Report DEL94-4 (p. 146). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1996). Fractal and chaos engineering: Postgraduate lecture notes (p. 760). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1998). Signal and data compression: Postgraduate lecture notes (p. 642). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2002, August 19-20). Compression and its metrics for multimedia. In Proceedings of the 1st IEEE International Conference on Cognitive Informatics (pp. 107-121). Calgary, AB. {ISBN 0-7695-1724-2}
Kinsner, W. (2003a). Characterizing chaos with Lyapunov exponents and Kolmogorov-Sinai entropy. Technical Report DEL03-1 (p. 76). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2003b, August 18-20). Characterizing chaos through Lyapunov metrics. In Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (pp. 189-201). London, UK. ISBN 0-7695-1986-5.
Kinsner, W. (2003c). Is it noise or chaos? Technical Report DEL03-2 (p. 98). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2005, August 8-10). A unified approach to fractal dimensions. In Proceedings of the 4th IEEE International Conference on Cognitive Informatics (pp. 58-72). Irvine, CA. ISBN 0-7803-9136-5.
Kinsner, W., & Dansereau, R. (2006, July 17-19). A relative fractal dimension spectrum as a complexity measure. In Proceedings of the 5th IEEE International Conference on Cognitive Informatics. Beijing, China. ISBN 1-4244-0475-4.
Mackey, M.C. (1992). Time's arrow: The origin of thermodynamic behavior (p. 175). New York, NY: Springer Verlag.
Mainzer, K. (2004). Thinking in complexity (4th ed.) (p. 456). New York, NY: Springer Verlag.
Mitra, S.K. (1998). Digital signal processing: A computer-based approach (p. 864). New York: McGraw-Hill (MatLab Series).
Oppenheim, A.V., & Schafer, R.W. (1975). Digital signal processing (p. 585). Englewood Cliffs, NJ: Prentice-Hall.
Oppenheim, A.V., & Willsky, A.S. (1983). Signals and systems (p. 796). Englewood Cliffs, NJ: Prentice-Hall.
Oppenheim, A.V., & Schafer, R.W. (1989). Discrete-time signal processing (p. 879). Englewood Cliffs, NJ: Prentice-Hall.
Painter, T., & Spanias, A. (1998, April). Perceptual coding of digital audio. Proceedings of the IEEE, 88(4), 451-513.
Peitgen, H.-O., Jürgens, H., & Saupe, D. (1992). Chaos and fractals: New frontiers of science (p. 984). New York: Springer Verlag.
Pennebaker, W.B., & Mitchell, J.L. (1993). JPEG still image data compression standard (p. 638). New York, NY: Van Nostrand Reinhold.
Pesin, Y.B. (1977). Characteristic Lyapunov exponents and smooth ergodic theory. Russian Mathematical Surveys, 32, 55-114.
Prigogine, I., & Stengers, I. (1984). Order out of chaos: Man's new dialogue with nature (p. 349). New York, NY: Bantam.
Prigogine, I. (1996). The end of certainty: Time, chaos, and the new laws of nature (p. 228). New York, NY: The Free Press.
Ruelle, D. (1978). Thermodynamic formalism (p. 183). Reading, MA: Addison-Wesley-Longman and Cambridge, UK: Cambridge University Press.
Sayood, K. (2000). Introduction to data compression (2nd ed.) (p. 636). San Francisco, CA: Morgan Kaufman.
Schroeder, M.R. (1991). Fractals, chaos, power laws (p. 429). New York, NY: W.H. Freeman.
Schrödinger, E. (1944). What is life? with Mind and matter and Autobiographical sketches (p. 184). Cambridge, UK: Cambridge University Press. {ISBN 0-521-42708-8 pbk; Reprinted 2002}
Shannon, C.E. (1948, July). A mathematical theory of communication. Bell System Technical Journal, 27(7), 398-403. Reprinted in Shannon, C.E., & Weaver, W. (1949). The mathematical theory of communication. Urbana, IL: University of Illinois Press.
Sprott, J.C. (2003). Chaos and time-series analysis (p. 507). Oxford, UK: Oxford University Press.
Stonier, T. (1990). Information and the internal structure of the universe: An exploration into information physics (p. 155). New York, NY: Springer Verlag.
Tekalp, A.M. (Ed.) (1998, May). Multimedia signal processing [Special Issue]. Proceedings of the IEEE, 86(5).
Thelen, E., & Smith, L.B. (2002). A dynamic systems approach to the development of cognition and action (5th pr.) (p. 376). Cambridge, MA: MIT Press.
Turcotte, D.L. (1997). Fractals and chaos in geology and geophysics (2nd ed.) (p. 398). Cambridge, UK: Cambridge University Press.
Vicsek, T. (1992). Fractal growth phenomena (2nd ed.) (p. 488). Singapore: World Scientific.
Wang, Y. (2002, August 19-20). On cognitive informatics. In Proceedings of the 1st IEEE International Conference on Cognitive Informatics (pp. 34-42). Calgary, AB. {ISBN 0-7695-1724-2}
Wornell, G.W. (1996). Signal processing with fractals: A wavelet-based approach (p. 177). Upper Saddle River, NJ: Prentice-Hall.
Williams, G.P. (1997). Chaos theory tamed (p. 499). Washington, DC: Joseph Henry Press.
Chapter III
Cognitive Processes by using Finite State Machines Ismael Rodríguez Universidad Complutense de Madrid, Spain Manuel Núñez Universidad Complutense de Madrid, Spain Fernando Rubio Universidad Complutense de Madrid, Spain
Abstract Finite State Machines (FSMs) are formalisms that have been used for decades to describe the behavior of systems. They can also provide an intelligent agent with a suitable formalism for describing its own beliefs about the behavior of the world surrounding it. In fact, FSMs are the acceptors of right-linear languages, which are the simplest languages considered in Chomsky's classification of languages. Since Chomsky proposes that the generation of language (and, indirectly, any mental process) can be expressed through a kind of formal language, it can be assumed that cognitive processes can be formulated by means of the formalisms that can express those languages. Hence, we will use FSMs as a suitable formalism for representing (simple) cognitive models. We present an algorithm that, given an observation of the environment, produces an FSM describing an environment behavior that is capable of producing that observation. Since an infinite number of different FSMs could have produced that observation, we have to choose the most feasible one. When a phenomenon can be explained by several theories, Occam's razor principle, which is basic in science, encourages choosing the simplest explanation. Applying this criterion to our problem, we choose the simplest (smallest) FSM that could have produced that observation. An algorithm is presented to solve this problem. In conclusion, our framework provides a cognitive model that is the most preferable theory for the observer, according to the Occam's razor criterion.
INTRODUCTION Cognitive Informatics (Kinsner 2005, Wang 2002, 2003) provides Computer Science with a remarkable source of inspiration for solving computational problems whose objectives are similar to those pursued by the human mind. In spite of the fact that computational environments have some specific requirements and constraints that must not be ignored, understanding our mind is usually the key to providing successful (particularized) intelligent systems. This cross-fertilization has yielded the development of some successful intelligence mechanisms such as neural networks (Lau 1991) and case-based reasoning algorithms (Schank and Abelson 1977). It is particularly relevant to note that the relationship between Computer Science and other mind-related sciences is two-faced. In particular, the development of formal language theories (oriented to computational languages) has led to a better understanding of our mind. Due to the close relationship between language generation and mental processes, some mathematical formalisms proposed for dealing with formal computational languages turned out to be good approximations for modelling human reasoning. In this line, the language theory developed by Noam Chomsky is especially relevant. He proposed that natural languages can be represented as formal languages (Chomsky 1957, 1965). Chomsky considered four categories of languages (from simpler to more complex: right-linear, context-free, context-sensitive, and recursively enumerable) and he argued that natural languages are context-sensitive. All of these categories can be produced by a kind of suitable formal machine or acceptor (finite state automata, push-down automata, linear bounded automata, and Turing machines, respectively). Thus, the generation of natural languages can be represented in terms of some kind of formal automata, specifically linear bounded automata. This statement is especially relevant: since language is a projection of our cognitive processes, we can say that our own reasoning can be represented by using context-sensitive languages. Similarly, other less expressive languages (like right-linear or context-free languages) may provide approximate and simpler models to represent human mental processes. The difficulty of using a formal language to represent reasoning in a computational environment has discouraged most researchers from exploring this direction. Paradoxically, the great expressivity of formal languages is their main handicap. For example, the beliefs/knowledge of an intelligent system cannot be internally represented by a recursively enumerable language (or its acceptor, a Turing machine), because there is no method to automatically construct the Turing machine that produces some given behavior. So, such an internal representation would be impossible to create and maintain. Nevertheless, in some domains, using the simplest languages according to Chomsky's classification could provide us with formalisms endowed with a suitable structure and expressivity while being efficient to handle. In particular, let us note that right-linear languages are a suitable formalism for representing the behavior of a wide range of entities and systems. Their acceptors, that is, finite state machines, have been used for decades to model the behavior of sequential digital circuits and communication protocols. Similarly, an intelligent entity can use an FSM to represent its belief about the behavior of the world that surrounds it.
Like any other knowledge representation, this model should be updated and maintained so that it provides, at any time, a feasible explanation of the events the agent has observed so far. If the model is accurate then the agent can use it to predict future situations. Hence, FSMs may serve as the basic formalism for knowledge representation in a learning system. In order to use an FSM to represent the knowledge of an intelligent agent, the agent must create an FSM that is consistent with all the observations and interactions it has performed so far with the environment. Once we have fixed the set of inputs and outputs that the agent will use to interact with its environment (that is, the set of operations an agent can produce to affect the environment and the actions the environment could produce in response, respectively), an environment observation is a sequence of pairs (input, output). Given such a historical trace, the agent will create an FSM describing a behavior that, in particular, produces that observation. There exists an infinite number of FSMs that may produce a given (finite) sequence of interactions between an agent and its environment, so we have to choose one of them. Any of these FSMs extrapolates infinite behaviors from a single finite behavior. Thus, our aim is to choose an FSM with the best predictive power. If several FSMs fit an observation then no observational information provides us with a criterion to choose among them. However, Occam's razor principle gives us a scientific criterion to choose one of them. This criterion says that, on equal plausibility, the simplest theory must be chosen. The application of this criterion to our problem will provide us with a scientific argument to choose the machine that has the minimal number of states (arguments for applying this criterion in this case, and in Cognitive Informatics in general, will be extensively discussed in the next section).
Since we assume that our capability to express and develop hypotheses matches the one provided by a specific model (in our case, FSMs), the simplest model (machine) that could have produced the observed events is actually the simplest hypothesis that explains those events. In this chapter, our objective is to create a learning algorithm based on this idea. The rest of this chapter is structured as follows. In Section 2 we discuss our criterion to choose the best FSM that fits an observation; this criterion is based on Occam's razor principle. In Section 3 we present finite state machines, which are the basic formalism we will use throughout the chapter. In Section 4 we define folding operations (also called unifications), which are the basic operations we will apply to construct our minimal machines. Next, in Section 5 we present our algorithm to build the minimal finite state machine that could have produced a given observation. We apply that algorithm to the construction of a learning mechanism in Section 6. Finally, in Section 7 we present our conclusions.
APPLYING OCCAM'S RAZOR PRINCIPLE A key aspect to understanding human learning processes is the preference of people to explain their observations through the simplest available theory. Let us consider the example depicted in Figure 1. Let us imagine a person who observes the inside of a room through the keyhole of the door, and let us suppose that he observes that, just at this moment, seven flies appear. As a consequence, he will think that the room is full of flies. Let us imagine another person who looks through a different keyhole into a different room, and he sees nothing. Then, he will think that the room is completely empty.
Figure 1. Room with/without flies
Let us remark that both observers could be mistaken. In the first case, it could happen that there are only seven flies in the room, but that these flies love keyholes. In the second case, it could happen that the room is full of flies, but all of them are so shy that they keep away from the keyhole. However, the criteria of our observers are basically valid because, before more data are collected, they choose the simplest and most probable option. This intuitive preference for simple things is usually known as the Occam's razor criterion. William of Occam criticized the high complexity of the scholastic philosophical theories because their complexity did not improve their predictive power. His criticism can be stated as "Entities should not be multiplied beyond necessity" (Tornay 1938). We can interpret it by considering that, on equal plausibility, we should choose the simplest solution. This distinction criterion, which is one of the main scientific criteria of all time (typical examples of its applicability are Newton's laws and Maxwell's equations of electromagnetism), has been applied to develop computational mechanisms of Knowledge Discovery (Blumer et al. 1987). In fact, the application of Occam's razor to these systems is controversial (Domingos 1999). Actually, there exist theoretical arguments supporting it (the Bayesian information criterion (Schwarz 1978) and the minimum description length principle (Rissanen 1978)) and against it (the conservation law of generalization performance (Schwarz 1978) and the theory of structural
risk minimization (Vapnik 1995)). Similarly, there are empirical results that support it (the improvement in accuracy obtained by using pruning mechanisms (Mingers 1989)) while some others reject its validity (in only some cases does concept simplification improve ID3 (Fisher and Schlimmer 1988)). It is worth pointing out that those who reject the validity of the Occam's razor principle usually accept its practical applicability in real-world domains (Rao et al. 1995). If we consider the applicability of Occam's razor in the context of Cognitive Informatics, we should ask ourselves whether this criterion is actually applied by the human mind. We conjecture that it is. We can illustrate this easily by considering the erroneous reasoning of human beings. For instance, a child learning to speak will make linguistic errors such as saying "I eated a peach" instead of "I ate a peach" (even if he has never heard the word "eated" before). This error comes from his intuitive use of the English grammar rules, which say that past verbal forms are created by adding the suffix -ed. That is, children's minds try to apply the simplest theory that explains the observations they perceive, and the exceptions to the rules are what require the greatest learning effort. In fact, the child would never learn to speak if he did not seek the simplest rules that explain his environment (in this case, the linguistic rules). Therefore, the Occam's razor criterion seems to be part of our own learning mechanism. Natural language is not an accidental example of the applicability of Occam's razor within human learning. Let us remark that language has been created as the result of the simultaneous interaction of a huge number of human minds over generations. So, the rules underlying it are actually a projection of our own mental processes. In that projection, the preference for simplification is clear: regular rules and patterns dramatically outnumber exceptions and irregularities. This property is especially relevant in our context, since our application of formal languages for representing reasoning is based on the assumption that the generation of language and reasoning can be produced by a formal language. Our aim is to formally apply a criterion based on Occam's razor to obtain, in each case, the most plausible theory that explains a sequence of observations, where our abstraction model of reasoning generation will be based on Chomsky's theory. This means finding the simplest model that could have generated the perceived observation. Specifically, since according to that theory cognitive models can always be modelled by using linear bounded automata, our ideal objective should be to find the simplest linear bounded automaton that could have generated the detected observations. In this regard, we could define the simplicity criterion in terms of the number of states or the number of transitions of the automaton (that is, we assume that the simplest model is the smallest model). However, as we argued before, using a very expressive language is not feasible in practice because of the difficulty or impossibility of creating and/or updating it automatically. Hence, as a first approach to this difficult problem, we will tackle the previous task in the context of right-linear languages, which can be modelled by finite state machines and are the simplest languages according to the classification provided by Chomsky.
Therefore, our application of Occam's razor to Chomsky's language theory will consist of developing an algorithm capable of finding the simplest FSM that could have generated the detected observation. Specifically, in this approach we will use the number of states of the machine as the simplicity criterion. Let us remark that our objective is not to minimize the number of states of a given finite state machine (the classical minimization algorithm for finite automata can be found in (Huffman 1954)) but to create from scratch the FSM with the minimal number of states that could have generated the observation. In general, two machines that can generate a given observation will not be equivalent, because any behavior that is not explicitly included in the observation is not specified and can be anything. In fact, the problem of finding the minimal deterministic Mealy machine that can produce a given sequence of inputs/outputs was first identified in (Gold 1978), where it was found to be NP-hard. To the best of our knowledge, this is the first time this problem has been used as the core of a cognitive learning method in the Cognitive Informatics field. The suitability of this application is based on the arguments discussed above. Besides, let us note that the solution to this problem given in this chapter is strongly different from the one given in (Gold 1978). While the method in (Gold 1978) basically consists of filling holes (that is, giving values to undefined transitions) in such a way that the minimal machine is pursued, we iteratively manipulate an initial FSM by introducing new loops (we call these operations folding operations, or just unifications) until the FSM is minimal. This enables an intuitive and efficient use of pruning in our branch-and-bound algorithm. Moreover, if the algorithm is terminated before completion, the partial output is actually a (suboptimal) FSM that can be taken as it is. On the contrary, the only output of the algorithm in (Gold 1978) is given upon termination.
FINITE STATE MACHINES In this section we define the simple abstraction we will use as a cognitive model. Basically, we will assume that theories explaining observations must be constructed in terms of a finite state machine. These machines can be represented in two main forms: Moore and Mealy machines. The difference between them concerns the treatment of output actions. Due to the clear separation between outputs and states, we will use Mealy machines in our framework. Definition 1. A finite state machine (FSM) M is a tuple (S, I, O, T, sin) where S is the set of states of M, I is the set of input actions of M, O is the set of output actions of M, sin ∈ S is the initial state of M, and T ⊆ S × I × O × S is the set of transitions of M. Intuitively, a transition t = (s,i,o,s') ∈ T represents that if M is in state s and receives an input i then M produces an output o and moves to state s'. Transitions (s,i,o,s') ∈ T will be simply denoted by s −i/o→ s'. In Figure 2 we show two FSMs. Let us consider M1. We have M1=(S,I,O,T,1) where S={1,2,3,4,5,6}, I={a,c,x,z}, and O={b,d,y,w}. The transition set T includes all transitions linking states in M1; reading them from the figure,
T = { 1 −a/b→ 2, 2 −a/d→ 2, 2 −c/d→ 3, 3 −c/w→ 2, 3 −x/y→ 4, 4 −a/b→ 5, 5 −c/d→ 6, 6 −z/w→ 1 }
Figure 2. Examples of FSMs
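As an illustration of Definitions 1 and 2, the following Python sketch (our own illustration, not the authors' code; the transition set is the one reconstructed above for M1) implements a deterministic Mealy machine as a dictionary keyed by (state, input) pairs, so determinism is enforced by construction:

    class FSM:
        def __init__(self, transitions, s_init):
            # transitions: {(state, input): (output, next_state)}; a dict key
            # admits a single value, which matches Definition 2 (determinism)
            self.t, self.s0 = transitions, s_init

        def run(self, inputs):
            # Feed a sequence of inputs and return the produced outputs
            s, outs = self.s0, []
            for i in inputs:
                o, s = self.t[(s, i)]
                outs.append(o)
            return outs

    M1 = FSM({(1, 'a'): ('b', 2), (2, 'a'): ('d', 2), (2, 'c'): ('d', 3),
              (3, 'c'): ('w', 2), (3, 'x'): ('y', 4), (4, 'a'): ('b', 5),
              (5, 'c'): ('d', 6), (6, 'z'): ('w', 1)}, s_init=1)
    print(M1.run(['a', 'c', 'x', 'a', 'c', 'z']))   # ['b','d','y','b','d','w']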
For the sake of simplicity, we will assume that our cognitive model concerns only deterministic finite state machines. Definition 2. Let M=(S, I, O, T, sin) be a finite state machine. We say that M is deterministic if for all states s ∈ S and inputs i ∈ I there do not exist transitions (s, i, o1, s1), (s, i, o2, s2) ∈ T with either o1 ≠ o2 or s1 ≠ s2, or both. The finite state machines shown in Figure 2 are deterministic. Let us note that if we did not constrain our cognitive models to be deterministic then the problem of finding the minimal finite state machine that could have produced a sequence of inputs and outputs would be trivial. This is so because it would be enough to create a machine with a single state where each pair of input and output in the observation sequence is represented by a
transition outgoing from that state and incoming to the same state, labelled by that pair. Since nondeterministic FSMs may produce several outputs in response to an input, they are less suitable as cognitive models than deterministic FSMs. Nondeterministic machines do not provide any additional criterion to choose one of the available outputs after an input is produced. In forthcoming definitions, we will have to deal with sequences of transitions. In the next definition we introduce the notion of trace.
Definition 3. Let M=(S, I, O, T, sin) be an FSM such that s1 −i1/o1→ s2, s2 −i2/o2→ s3, ..., sn−1 −in−1/on−1→ sn, and sn −in/on→ sn+1. In this case we say that σ = s1 −i1/o1→ s2 −i2/o2→ ... −in/on→ sn+1 is a trace of M.
In our framework, the interaction with the environment will be defined by means of traces. For instance, if the inputs a and b denote "drop a glass" and "take the glass with your hands", respectively, and the outputs c and d denote "a glass falls and breaks" and "a broken glass pricks", respectively, then for some states s1, s2, s3 the trace s1 −a/c→ s2 −b/d→ s3 could probably be generated by a real environment. However, if we are not interested in the states involved in the trace then we will use the simpler notion of observation sequence, which is basically a sequence of pairs of inputs and outputs. For instance, for the previous trace, (a/c, b/d) is an observation sequence.
PERFORMING FOLDING OPERATIONS In this section we define the basic operations we will use in our minimization algorithm. The learning algorithm we will present in this chapter, which finds the simplest finite state machine that could have produced a given observation, is based on the folding of traces. This technique consists of the iterative modification of a given finite state machine by creating cycles of states. By introducing new cycles, some states become unnecessary and can be removed. So, the total number of states is reduced. In this process, newly created states become representatives of two former states of the machine. To keep the needed information about the former states represented by a new single state, we need to extend our notion of finite state machine to attach that information. In the next definition we assume that P(X) represents the powerset of the set X. Definition 4. A folding machine is a tuple U=(M,S,f) where M=(S, I, O, T, sin) is a finite state machine, S is a set of states called the set of original states of U, and the total function f: S → P(S) is the set function of U. Intuitively, given a folding machine U=(M,S,f), the set S represents the set of original states in the former finite state machine from which M has been constructed. The mechanism of construction of M will be described later. Besides, the function f associates each state in M with the set of states of S that it represents. Each time two states of the former machine are unified into a single new state, the function f will be modified to include such information. In the next definition we provide the mechanism to perform that operation. Definition 5. Let f: X → P(Y) be a total function. We define the addition of the set y ⊆ Y to the element x ∈ X, denoted by f ⊕ (x,y), as the total function g: X → P(Y) where
g(z) = f(z) if z ≠ x, and g(z) = f(z) ∪ y otherwise.
We extend that operation to sets of elements in (X, P(Y)) by overloading the symbol ⊕ in such a way that f ⊕ {(x1, y1), ..., (xn, yn)} = (((f ⊕ (x1, y1)) ⊕ ...) ⊕ (xn, yn)). Let us remark that there is no ambiguity in the definition of the operation ⊕ for sets, since the order of application of the elements is irrelevant. Now we are ready to present the formal definition of the folding of traces, by which we introduce a new cycle in a finite state machine. Given two traces σ = q −iq/oq→ s1 −i1/o1→ s2 −i2/o2→ ... −in/on→ sn+1 −ir/or→ r
and σ' = q' −iq'/oq'→ s'1 −i1/o1→ s'2 −i2/o2→ ... −in/on→ s'n+1 −ir'/or'→ r' of a machine M that produce the same sequence of inputs and outputs from state s1 to sn+1 and from s'1 to s'n+1, respectively, the goal of the folding is to remove the states s'1 to s'n+1 from M. In order to do so, the transition in σ' connecting q' to s'1 has to be redirected to s1. Besides, the transition of σ' that goes from s'n+1 to r' has to be replaced by one departing from sn+1. More generally, any transition departing from or arriving at a state in {s'1, ..., s'n+1} has to be redirected to/from the corresponding state in {s1, ..., sn+1}.
(1) (2)
S ' = S \ {s '1 ,..., s 'n }, i/o T ' = (T \ {u → v {u , v} {s '1,..., s ' n + 1} ≠ ∅}) i/o i/o {u → s j u → s ' j ∈ T ∧ u ∉{s '1 ,..., s 'n +1}} i/o i/o {s j → u s ' j → u ∈ T ∧ u ∉{s '1 ,..., s 'n +1}}
(3) (4)
i/o i/o {s j → sk s ' j → s 'k ∈ T }, f ' = f ⊕ {( s1 , f ( s '1 )),...,( sn + 1, f ( s ' n + 1))}
sin if sin ∉{s '1 ,..., s 'n +1} s 'in = si if sin = s 'i
From now on, we will say that the location of a folding operation is the state where both unified traces diverge (that is, sn+1 in the previous definition). As an example, let us consider the finite state machine M1 depicted in Figure 2, and let us suppose that we want a/b c/d a/b c/d 2 → 3 and σ ' = 4 → to unify the traces σ = 1 → 5 → 6. That is, we want the resulting machine to perform both instances of the input/output sequence (a/b,c/d) through the same sequence of states, in this case 1, 2, and 3. Hence, the states 4, 5, 6 will be unified to 1, 2, 3, respectively. The resulting FSM M2 is also depicted in Figure 2. In this folding, the location is the state 3. Let us remark that the machine resulting after the folding is not equivalent, in general, to that we had before. In particular, some sequences of inputs and outputs that are available from the initial state in the new machine are not available in the old one. Let us consider the traces we commented just before Definition 4.0.6. Supposing that there is a path from state r to state q’ (or from r’ to q) a new cycle will be introduced (we assume the example we introduced before Definition 4.0.6.). So, an infinite set of new available sequences of inputs and outputs will be introduced in the new machine. For instance, the sequence of inputs and outputs (a/b,c/d,x/y,a/b,c/d,x/y) can be executed from state 1 in the machine depicted in Figure 2, but this trace is not available in the machine M1. On the other hand, let us note that no trace that was available before the folding will become unavailable afterwards. We formally present this idea in the next result. Lemma 4.0.1 Let U,U’ be folding machines such that U’ represents the folding of traces σ,σ' in U. Let U=(M,S,f ) with in / on i1 / o1 M=(S, I, O, T, sin) and let U’=(M’,S,f’) with M’=(S’, I, O, T’, s’ in). Then, for all trace s1 → s2 ,..., sn → sn +1 in / on i1 / o1 in M there exists states s’2, ..., s’n+1 such that s '1 → s '2 ,..., s 'n → s 'n +1 is a trace in M’. Besides, for 1≤i≤n+1 we have that if si ∉ S' then si ∈ f'(s'i). Due to lack of space, we do not include the proofs of the lemmas and theorems presented in the chapter. The interested reader can find them in (Núñez et al. 04). More detailed proofs are available from authors. The main feature of the folding of traces is that the folding operation reduces the number of states in the machine. The key to construct the minimal machine that could have produced an observed trace is that some folding
operations will be iteratively applied so that, after each of them, the resulting machine is still able to produce the observed trace. However, let us remark that not all folding operations are suitable. In particular, care must be taken not to lose the determinism of the machine. For instance, let us suppose that $i^r = i^{r'}$ and $o^r \neq o^{r'}$, where we consider again the traces commented before Definition 6. In this case, the unified machine would have two transitions, $s_{n+1} \xrightarrow{i^r/o^r} r$ and $s_{n+1} \xrightarrow{i^{r'}/o^{r'}} r'$, outgoing from the same state $s_{n+1}$. So, the new machine would be nondeterministic. Moreover, if $i^r = i^{r'}$ and $o^r = o^{r'}$ then there would exist two equally labelled transitions outgoing from $s_{n+1}$ and arriving at different states. So, a condition to unify two traces is that $i^r \neq i^{r'}$. Similarly, that restriction applies to any pair of inputs that label transitions leaving the unified path at any intermediate point. In particular, if there exists a transition leaving the path labelled by some input in one of the traces and, in addition, there does not exist any transition labelled by that input leaving the path at that point in the other trace, then there is no incompatibility with that input. We will refer to the availability to introduce a new transition labelled by an input at a point of the folding as the input slot for that input at that point. If it is possible to introduce such a new transition, then we will say that the input slot for that input at that point is free. For instance, in the folding operation we performed on machine M1 to create machine M2 (see Figure 2), the input slot to introduce a transition $\xrightarrow{z/w}$ in state 3 is free, because there is no outgoing transition in 3 labelled with the input z.

Definition 7. Let U=(M,S,f) be a folding machine, where M=(S, I, O, T, s_in). Let $\sigma = s_1 \xrightarrow{i_1/o_1} s_2 \xrightarrow{i_2/o_2} \cdots \xrightarrow{i_n/o_n} s_{n+1}$ and $\sigma' = s'_1 \xrightarrow{i_1/o_1} s'_2 \xrightarrow{i_2/o_2} \cdots \xrightarrow{i_n/o_n} s'_{n+1}$ be two traces of M. The folding of traces σ and σ' in the folding machine U is acceptable if two conditions hold:
• For any trace $\sigma^1 = s_1 \xrightarrow{i_1/o_1} s_2 \xrightarrow{i_2/o_2} \cdots s_j \xrightarrow{i^1/o^1} s^1$ of M such that $i^1 \neq i_j$ and 1 ≤ j ≤ n+1, we have that either there does not exist another trace $\sigma^2 = s'_1 \xrightarrow{i_1/o_1} s'_2 \xrightarrow{i_2/o_2} \cdots s'_j \xrightarrow{i^2/o^2} s^2$ of M such that $i^2 \neq i_j$ and $i^1 = i^2$, or such a trace does exist but $s^1 = s^2$ and $o^1 = o^2$.

• For any trace $\sigma^2 = s'_1 \xrightarrow{i_1/o_1} s'_2 \xrightarrow{i_2/o_2} \cdots s'_j \xrightarrow{i^2/o^2} s^2$ of M such that $i^2 \neq i_j$ and 1 ≤ j ≤ n+1, we have that either there does not exist another trace $\sigma^1 = s_1 \xrightarrow{i_1/o_1} s_2 \xrightarrow{i_2/o_2} \cdots s_j \xrightarrow{i^1/o^1} s^1$ of M such that $i^1 \neq i_j$ and $i^1 = i^2$, or such a trace does exist but $s^1 = s^2$ and $o^1 = o^2$.
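Under the same dictionary representation as in the earlier sketch, the acceptability test can be read as a determinism check: merging states must never place two transitions with the same input but different outputs (or non-unified targets) on one state. This is an interpretation of Definition 7, not the authors' own code.

    def acceptable(T, mapping):
        rep = lambda s: mapping.get(s, s)
        seen = {}
        for u, edges in T.items():
            src = rep(u)
            for i, (o, v) in edges.items():
                prev = seen.setdefault(src, {}).get(i)
                if prev is not None and prev != (o, rep(v)):
                    return False   # the input slot for i at src is not free
                seen[src][i] = (o, rep(v))
        return True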
For example, the folding where we created M2 from M1 is acceptable.

Folding operations are the basic operations used to minimize a machine so that we obtain the minimal machine that could have produced a given observed sequence. These operations will be iteratively applied to improve an initial machine that we explicitly construct. This is a very simple machine that has the capability of performing the observed sequence that is provided. It consists of a set of states containing one state for each step in the observed sequence, and a set of transitions where each transition links each state with the next state through the corresponding input and output in the sequence. The machine contains no cycles, so every transition leads to a new state. The resulting machine is a simple linear machine whose structure is directly inherited from that of the sequence. Let us formally present this idea.

Definition 8. Let I and O be sets of inputs and outputs, respectively, and L = [i_1/o_1, ..., i_n/o_n] be a sequence, where for all 1 ≤ j ≤ n we have $i_j \in I$ and $o_j \in O$. The initial machine to perform L is a finite state machine M=(S, I, O, T, s_in) where

— $S = \{s_1, \ldots, s_{n+1}\}$
— $T = \{s_1 \xrightarrow{i_1/o_1} s_2, \ldots, s_n \xrightarrow{i_n/o_n} s_{n+1}\}$
— $s_{in} = s_1$

For instance, let L=(a/b, a/c, a/d, a/b) be a sequence. Then, the initial machine to perform L is the machine M3 depicted in Figure 3. Trivially, an initial machine becomes an (initial) folding machine by introducing the suitable additional information. As we want the new folding machine to represent the machine where no folding operation has been performed yet, the set of original states coincides with the one corresponding to the associated finite state machine. Besides, the set function returns for each state a trivial unitary set containing that state.
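A minimal Python sketch of Definition 8, building the linear initial machine for an observed sequence (the representation matches the earlier sketches; the function name is illustrative):

    def initial_machine(seq):
        # seq = [(i1, o1), ..., (in, on)]; states are 1..n+1, with s_in = 1.
        T = {k: {} for k in range(1, len(seq) + 2)}
        for k, (i, o) in enumerate(seq, start=1):
            T[k][i] = (o, k + 1)   # s_k --i/o--> s_{k+1}
        return T, 1

For L = (a/b, a/c, a/d, a/b), calling initial_machine([('a','b'), ('a','c'), ('a','d'), ('a','b')]) yields the five-state linear machine M3 of Figure 3.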
Definition 9. Let M=(S, I, O, T, s_in) represent an initial machine to perform L. The initial folding machine to perform L is the folding machine U=(M,S,f), where for all s ∈ S we have that f(s) = {s}.

Before presenting the algorithm to construct the minimal machine producing a given observation, we formally define the properties such a machine must have.

Definition 10. Let M=(S, I, O, T, s_in) be a finite state machine. Let L = [i_1/o_1, ..., i_n/o_n] be a sequence such that there exists a trace $\sigma = s_1 \xrightarrow{i_1/o_1} s_2 \xrightarrow{i_2/o_2} \cdots \xrightarrow{i_n/o_n} s_{n+1}$ in M with $s_1 = s_{in}$. We say that M is a minimal machine producing L if there does not exist another machine M'=(S', I, O, T', s'_in) and a trace $\sigma' = s'_1 \xrightarrow{i_1/o_1} s'_2 \xrightarrow{i_2/o_2} \cdots \xrightarrow{i_n/o_n} s'_{n+1}$ in M' such that $s'_{in} = s'_1$ and |S'| < |S|.

For instance, the minimal machine producing L=(a/b, a/c, a/d, a/b) is the machine M4, shown in Figure 3. In our algorithm, the new minimized machine will be obtained by the iterative application of some folding operations to an initial machine. The original machine is the initial folding machine associated with the given observation sequence. We need a suitable notation to represent the iterative application of a sequence of folding operations to a machine, in such a way that the result of each folding is the input of the next one. This notion is formally introduced in the next definition.

Definition 11. Let $U_1, \ldots, U_n$ be folding machines and $\sigma_1, \sigma'_1, \ldots, \sigma_{n-1}, \sigma'_{n-1}$ be traces such that, for all 1 ≤ i ≤ n-1, we have that $U_{i+1}$ is the folding of $\sigma_i$ and $\sigma'_i$ in $U_i$. Let us suppose that these n-1 folding operations are acceptable. We say that $\alpha = [(\sigma_1, \sigma'_1), \ldots, (\sigma_{n-1}, \sigma'_{n-1})]$ is a folding sequence from $U_1$ leading to $U_n$, and we denote it by $U_1 \stackrel{\alpha}{\Rightarrow} U_n$.
MINIMIZATION

Before presenting our minimization algorithm, we present a property that we will use to prove the optimality of the machines obtained by the algorithm. It says that a minimal folding machine can be obtained by applying a suitable folding sequence.

Lemma 5.0.2. Let U be the initial folding machine to perform L. Then, there exists a folding sequence α with $U \stackrel{\alpha}{\Rightarrow} U'$ such that U' = (M', S', f') and M' is a minimal machine producing L.

Next, we show that the order of application of the folding operations is irrelevant.

Lemma 5.0.3. Let $U_1 \stackrel{\alpha}{\Rightarrow} U_n$ with $\alpha = [(\sigma_1, \sigma'_1), (\sigma_2, \sigma'_2)]$. Then, $U_1 \stackrel{\alpha'}{\Rightarrow} U_n$ with $\alpha' = [(\sigma_2, \sigma'_2), (\sigma_1, \sigma'_1)]$.
Figure 3. Examples of initial and minimal machines

Now we are ready to present our minimization algorithm. The minimal machine will not be available until the end of the execution. We will find the minimal finite state machine that could have produced a given observation
by executing the backtracking algorithm presented in Figure 5. The inputs of that algorithm are the observation sequence L = [i_1/o_1, ..., i_n/o_n] and the initial folding machine U=(M,S,f) associated with L, where we suppose that M = (S, I, O, T, s_1). Besides, we assume that $s_1 \xrightarrow{i_1/o_1} s_2 \cdots s_n \xrightarrow{i_n/o_n} s_{n+1}$ is the unique trace of M that ranges over all the states in S. We have used a functional programming notation to define lists: ( ) denotes an empty list; head(l) and tail(l) denote the first element of l and the remainder of l after removing the head, respectively; and x:l denotes the inclusion of x as the first element of the list l.

Let us comment on the algorithm. Initially, we identify all folding operations that could be performed in the initial machine M, and they are introduced in a list (unificationList). Besides, we calculate the number of states that would be eliminated if all the folding operations appearing in the list, from a given folding up to the end, were performed. We store this information in another list (heuristicList). Then, we search for the best solution in the solution space. At each point of the search tree we decide whether a folding of the list is performed or not. Hence, each branch of the tree specifies a subset of the folding operations in the list. Branches are pruned by comparing the best solution found so far with the sum of the states eliminated from the root to the current node plus a heuristic estimation of the number of states that could be eliminated up to the leaves. This heuristic consists of adding the states that would be eliminated if all the folding operations remaining in the list were performed. Trivially, this method gives an upper bound on the profit that could be gained up to the leaves of the tree. So, the heuristic is valid because it will never cause a potentially useful branch to be pruned. If the upper bound on the profit is less than the one provided by the best solution found so far, the corresponding branch is not constructed.

Next we prove that our minimization algorithm is optimal, that is, the returned folding machine is a minimal folding machine.

Theorem 5.0.1. Let L = [i_1/o_1, ..., i_n/o_n] be an observation sequence and U be the initial folding machine associated with L. Then, the folding machine U'' returned after the application of the algorithm depicted in Figure 5 to the machine U is a minimal folding machine associated with L.

For example, an application of the algorithm shown in Figure 5 to the initial machine M3 that performs L=(a/b, a/c, a/d, a/b), depicted in Figure 3, gives us the minimal machine M4 depicted in the same figure. In this case, the only folding performed is the one relating traces $1 \xrightarrow{a/b} 2$ and $4 \xrightarrow{a/b} 5$, which is acceptable.
LEARNING ALGORITHM

In this section we consider how our algorithm to find the minimal FSM fitting an observation can be used as the core of a learning algorithm. An algorithm that allows an intelligent agent to develop the simplest theory consistent with the observations it has collected so far is the following (see the sketch after this list):

1. First, the sets of inputs and outputs in its environment, that is, the ways in which the agent and its environment can affect each other, are fixed.
2. The agent interacts with its environment and collects a historical record of the results of each interaction.
3. When the length of the record exceeds a given threshold, the minimal FSM capable of producing that behavior is constructed according to the algorithm depicted in Figure 5. This FSM represents the cognitive theory of the agent.
4. From now on, the agent takes that theory into account to make its decisions, that is, to decide at any moment the input it will use to interact with the environment. It will use the theory to try to guess in advance the possible effects of its hypothetical future actions, so that it can pursue success and avoid failure.
5. The agent keeps recording its interaction with the environment. Periodically, the minimal FSM is reconstructed according to the (longer) records, which allows the agent to refine its cognitive theory over time.
Let us note that using the simplest theory (in this case, the smallest FSM) as the cognitive model is not only a suitable procedure to extrapolate infinite behaviors from a finite observation, but also a mechanism to reduce the size of the cognitive model. Since the basic mechanism of the learning algorithm encourages the creation of small knowledge models, this algorithm may help to reduce the amount of memory required in an intelligent system.
Figure 5. Minimization algorithm

    unificationList := [];
    maximalSaving := 0;
    heuristicList := [0];
    for j := 1 to n do
        for all substrings Y of L of length j do
            let k be the position where substring Y starts in L;
            for all other substrings of L of length j coinciding with Y do
                let l be the position where such a substring starts in L;
                let σ  = s_k -(i_k/o_k)-> s_{k+1} ... s_{k+j} -(i_{k+j}/o_{k+j})-> s_{k+j+1};
                let σ' = s_l -(i_l/o_l)-> s_{l+1} ... s_{l+j} -(i_{l+j}/o_{l+j})-> s_{l+j+1};
                if unification of σ, σ' is acceptable in U then
                    unificationList := (σ, σ') : unificationList;
                    maximalSaving := maximalSaving + j;
                    heuristicList := maximalSaving : heuristicList;
                fi
            od
        od
    od
    (u, bestSaving) := SearchBest(U, unificationList, heuristicList, 0, 0);
    return u;

    function SearchBest (u, unificationList, heuristicList, currentSaving, bestSaving)
        if unificationList = [] then
            if currentSaving >= bestSaving then bestSaving := currentSaving; fi
            return (u, bestSaving);
        else
            (σ, σ') := head(unificationList);
            maximalSaving := head(heuristicList);
            bestIndex := 0;
            if currentSaving + maximalSaving >= bestSaving
               and unification of σ, σ' is acceptable in u then
                u' := unification (u, σ, σ');
                (u'', bestSaving') := SearchBest(u', tail(unificationList), tail(heuristicList),
                                                 currentSaving + length(σ), bestSaving);
                if bestSaving' >= bestSaving then
                    bestSaving := bestSaving'; bestIndex := 1;
                fi
            fi
            maximalSaving := head(tail(heuristicList));
            if currentSaving + maximalSaving >= bestSaving then
                -- the folding is skipped, so the unchanged machine u is passed on
                (u''', bestSaving') := SearchBest(u, tail(unificationList), tail(heuristicList),
                                                  currentSaving, bestSaving);
                if bestSaving' >= bestSaving then
                    bestSaving := bestSaving'; bestIndex := 2;
                fi
            fi
            if bestIndex = 2 then return (u''', bestSaving);
            else if bestIndex = 1 then return (u'', bestSaving);
            else return (u, bestSaving);
            fi
        fi
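For readability, a Python transliteration of SearchBest is sketched below. It assumes heur carries one more entry than unifs (the trailing entry being 0), and that acceptable_in, unify, and length stand for the operations used in Figure 5; it returns the best machine instead of mutating shared state.

    def search_best(u, unifs, heur, saving, best):
        if not unifs:
            return u, max(saving, best)
        (s1, s2), rest = unifs[0], unifs[1:]
        best_u = u
        # Branch 1: perform this folding, if the bound and acceptability allow it.
        if saving + heur[0] >= best and acceptable_in(u, s1, s2):
            cand, g = search_best(unify(u, s1, s2), rest, heur[1:],
                                  saving + length(s1), best)
            if g >= best:
                best_u, best = cand, g
        # Branch 2: skip this folding; prune if even the remaining bound cannot win.
        if saving + heur[1] >= best:
            cand, g = search_best(u, rest, heur[1:], saving, best)
            if g >= best:
                best_u, best = cand, g
        return best_u, best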
CONCLUSION

We have presented an algorithm that provides a mechanism to find the simplest finite state machine that could have produced a given observation. This algorithm obtains the simplest theory to explain an observation. So, it represents the theory we would obtain in the chosen cognitive model by systematically applying Occam's razor criterion. Finite state machines are formalisms that produce the simplest kind of languages according to Chomsky's classification (right linear languages). Hence, since languages and reasoning processes are linked, our approach provides a learning algorithm that fits the simplest form of reasoning.

Let us note that our methodology assumes two postulates. First, following Chomsky, since natural language (and, indirectly, any human cognitive process) is produced by one of the languages in Chomsky's classification (specifically, the context-sensitive languages), we postulate that the lowest languages in that classification (that is, right linear languages) provide a suitable (simplified) model to represent human reasoning. Second, we assume that the simplest model is the smallest model. Thus, a suitable way to apply Occam's razor criterion to build the simplest theory that explains an observation of the environment consists of finding the smallest finite state machine that could have produced that observation.
REFERENCES

Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1987). Occam's razor. Information Processing Letters, 24, 377-380.

Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.

Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.

Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery, 3(4), 409-425.

Fisher, D., & Schlimmer, J. (1988). Concept simplification and prediction accuracy. In Proceedings of the Fifth International Conference on Machine Learning (pp. 22-28). Morgan Kaufmann.

Gold, E. M. (1978). Complexity of automaton identification from given data. Information and Control, 37, 302-320.

Huffman, D. (1954). The synthesis of sequential switching circuits. Journal of the Franklin Institute, 257(3-4), 161-190, 275-303.

Kinsner, W. (2005). Some advances in cognitive informatics. In International Conference on Cognitive Informatics (ICCI'05) (pp. 6-7). IEEE Press.

Lau, C. (1991). Neural networks: Theoretical foundations and analysis. IEEE Press.

Mingers, J. (1989). An empirical comparison of pruning measures for decision tree induction. Machine Learning, 4, 227-243.

Núñez, M., Rodríguez, I., & Rubio, F. (2004). Applying Occam's razor to FSMs. In International Conference on Cognitive Informatics (pp. 138-147). IEEE Press.

Rao, R., Gordon, D., & Spears, W. (1995). For every generalization action, is there really an equal or opposite reaction? Analysis of conservation law. In Proceedings of the Twelfth International Conference on Machine Learning (pp. 471-479). Morgan Kaufmann.

Rissanen, J. (1978). Modelling by shortest data description. Automatica, 14, 465-471.

Schaffer, J. (1994). A conservation law for generalization performance. In Proceedings of the 11th International Conference on Machine Learning (pp. 259-265). Morgan Kaufmann.
Schank, R., & Abelson, R. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Erlbaum.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461-464.

Tornay, S. (1938). Ockham: Studies and selections. La Salle, IL: Open Court Publishers.

Vapnik, V. (1995). The nature of statistical learning theory. Springer.

Wang, Y. (2002). On cognitive informatics. In International Conference on Cognitive Informatics (ICCI'02) (pp. 34-42). IEEE Press.

Wang, Y. (2003). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.
Chapter IV
On the Cognitive Processes of Human Perception with Emotions, Motivations, and Attitudes Yingxu Wang University of Calgary, Canada
Abstract

An interactive motivation-attitude theory is developed based on the Layered Reference Model of the Brain (LRMB) and the Object-Attribute-Relation (OAR) model. This chapter presents a rigorous model of human perceptual processes such as emotions, motivations, and attitudes. A set of mathematical models and formally described cognitive processes is developed. The interactions and relationships between motivation and attitude are formally described in Real-Time Process Algebra (RTPA). Applications of the mathematical models of motivations and attitudes in software engineering are demonstrated. This work is a detailed description of part of the LRMB, which provides a comprehensive model for explaining the fundamental cognitive processes of the brain and their interactions. It demonstrates that complicated human emotional and perceptual phenomena can be rigorously modeled in mathematics and formally treated and described.
INTRODUCTION

A variety of life functions and cognitive processes have been identified in cognitive informatics (Wang, 2002a, 2003a, 2007b), cognitive science, neuropsychology, and neurophilosophy. In order to formally and rigorously describe a comprehensive and coherent set of mental processes and their relationships, a layered reference model of the brain (LRMB) was developed (Wang and Wang, 2006; Wang et al., 2006) that explains the functional mechanisms and cognitive processes of natural intelligence. LRMB encompasses 37 cognitive processes at six layers known as the sensation, memory, perception, action, meta-cognitive, and higher cognitive layers, from the bottom up.
Definition 1. Perception is a set of internal sensational cognitive processes of the brain at the subconscious cognitive function layers that detects, relates, interprets, and searches internal cognitive information in the mind.

Perception may be considered the sixth sense of human beings, since almost all cognitive life functions rely on it. Perception is also an important cognitive function at the subconscious layers that determines personality. In other words, personality is a faculty of all subconscious life functions and of experience accumulated via conscious life functions. It is recognized that a crucial component of the future generation of computers, known as cognitive computers, is the perceptual engine that mimics natural intelligence (Wang, 2006, 2007c).

The main cognitive processes at the perception layer of LRMB are emotion, motivation, and attitude (Wang et al., 2006). This chapter presents a formal treatment of the three perceptual processes, their interrelationships, and their interactions. It demonstrates that complicated psychological and cognitive mental processes may be formally modeled and rigorously described. Mathematical models of the psychological and cognitive processes of emotions, motivations, and attitudes are developed in the following three sections. Then, the interactions and relationships between emotions, motivations, and attitudes are analyzed. Based on the integrated models of the three perception processes, the formal description of the cognitive processes of motivations and attitudes is presented using Real-Time Process Algebra (RTPA) (Wang, 2002b, 2003c). Applications of the formal models of emotions, motivations, and attitudes are demonstrated in a case study on maximizing the strengths of individual motivations in software engineering.
THE HIERARCHICAL MODEL OF EMOTIONS

Emotions are a set of states or results of perception that interpret the feelings of human beings about external stimuli or events in the binary categories of pleasant or unpleasant.

Definition 2. An emotion is a personal feeling derived from one's current internal status, mood, circumstances, historical context, and external stimuli.

Emotions are closely related to desires and willingness. A desire is a personal feeling or willingness to possess an object, to conduct an interaction with the external world, or to prepare for an event to happen. A willingness is the faculty of conscious, deliberate, and voluntary choice of actions. According to the study of Fischer and his colleagues (Fischer et al., 1990; Wilson and Keil, 1999), the taxonomy of emotions can be described at three levels known as the super, basic, and sub-category levels, as shown in Table 1.

It is interesting that human emotions at the perceptual layer may be classified into only two opposite categories: pleasant and unpleasant. Various emotions in the two categories can be classified at five levels according to the strengths of their subjective feelings, as shown in Table 2, where each level encompasses a pair of positive/negative or pleasant/unpleasant emotions.
Table 1. Taxonomy of emotions

Level              | Description
Super level        | Positive (pleasant); Negative (unpleasant)
Basic level        | Joy, Love (positive); Anger, Sadness, Fear (negative)
Sub-category level | Bliss, pride, contentment (joy); fondness, infatuation (love); annoyance, hostility, contempt, jealousy (anger); agony, grief, guilt, loneliness (sadness); horror, worry (fear)
Table 2. The hierarchy of emotions

Level | Description       | Emotion (positive/negative) | Typical feelings
0     | No emotion        | -        | -
1     | Weak emotion      | Comfort  | Safeness, contentment, fulfillment, trust
      |                   | Fear     | Worry, horror, jealousy, frightening, threatening
2     | Moderate emotion  | Joy      | Delight, fun, interest, pride
      |                   | Sadness  | Anxiety, loneliness, regret, guilt, grief, sorrow, agony
3     | Strong emotion    | Pleasure | Happiness, bliss, excitement, ecstasy
      |                   | Anger    | Annoyance, hostility, contempt, infuriated, enraged
4     | Strongest emotion | Love     | Intimacy, passion, amorousness, fondness, infatuation
      |                   | Hate     | Disgust, detestation, abhorrence, bitterness
Definition 3. The strength of emotion |Em| is a normalized measure of how strong a person's emotion is on a five-level scale from 0 through 4, i.e.:

0 ≤ |Em| ≤ 4    (1)
where |Em| represents the absolute strength of an emotion, regardless of whether it is positive (pleasant) or negative (unpleasant), and the scope of |Em| corresponds to the definitions of Table 2.

It is observed that the hypothalamus, an organ in the brain, is supposed to interpret the properties or types of emotions in terms of pleasant or unpleasant (Payne and Wenger, 1998; Pinel, 1997; Smith, 1993; Westen, 1999; Wang et al., 2006).

Definition 4. Let Te be a type of emotion, ES the external stimulus, IS the internal perceptual status, and BL the Boolean values true or false. The perceptual mechanism of the hypothalamus can be described as a function, i.e.:

Te : ES × IS → BL    (2)
It is interesting that the same event or stimulus ES may be interpreted as different types, in terms of pleasant or unpleasant, depending on the real-time context of the perceptual status IS of the brain. For instance, walking from home to the office may be interpreted as a pleasant activity by one who likes physical exercise, but the same walk due to a car breakdown will be interpreted as unpleasant. This observation and the taxonomy provided in Tables 1 and 2 lead to the following theorem.

Theorem 1. The human emotional system is a binary system that interprets or perceives an external stimulus and/or internal status as pleasant or unpleasant.

Although there are various emotional categories at different levels, the binary emotional system of the brain provides a set of pairwise universal solutions to express human feelings. For example, anger may serve as a default solution or generic reaction for an emotional event when there is no better solution available; otherwise, delight will be the default emotional reaction.
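As a toy illustration of Eq. (2), the mapping Te can be sketched as a Boolean-valued function; the lookup table below is purely hypothetical and only mirrors the walking example above.

    def emotion_type(es, is_):
        # (external stimulus, internal status) -> pleasant (True) / unpleasant (False)
        table = {('walk', 'likes exercise'): True,
                 ('walk', 'car broke down'): False}
        # Default to unpleasant, echoing the chapter's remark that anger is the
        # generic reaction when no better interpretation is available.
        return table.get((es, is_), False)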
THE MATHEMATICAL MODEL OF MOTIVATION

Motivation is an innate potential power of human beings that energizes behavior. It is motivation that triggers the transformation from thought (information) into action (energy). In other words, human behaviors are the embodiment of motivations. Therefore, any cognitive behavior is driven by an individual motivation.
Definition 5. A motivation is a willingness or desire, triggered by an emotion or external stimulus, to pursue a goal or a reason for triggering an action.

As described in the Layered Reference Model of the Brain (LRMB) (Wang et al., 2006), motivation is a cognitive process of the brain at the perception layer that explains the initiation, persistence, and intensity of personal emotions and desires, which are the faculty of conscious, deliberate, and voluntary choices of actions. Motivation is a psychological and social modulating and coordinating influence on the direction, vigor, and composition of behavior. This influence arises from a wide variety of internal, environmental, and social sources, and is manifested at many levels of behavioral and neural organization.

The taxonomy of motives can be classified into two categories known as learned and unlearned (Wittig, 2001). The latter are the primary motives, such as the survival motives (hunger, thirst, breath, shelter, sleep, and elimination) and pain. The former are the secondary motives, such as the needs for achievement, friendship, affiliation, dominance of power, and relief of anxiety, which are acquired and extended based on the primary motives.

Definition 6. The strength of motivation M is a normalized measure of how strong a person's motivation is on a scale of 0 through 100, i.e.:

0 ≤ M ≤ 100    (3)
where M = 100 is the strongest motivation and M = 0 is the weakest motivation. It is observed that the strength of a motivation is determined by multiple factors (Westen, 1999; Wilson and Keil, 1999), such as:

a. The absolute motivation |Em|: the strength of the emotion.
b. The relative motivation E - S: a relative difference or inequity between the expectancy of a person E for an object or an action towards a certain goal and the current status S of the person.
c. The cost to fulfill the motivation C: a subjective assessment of the effort needed to accomplish the expected goal.
Therefore, the strength of a motivation can be quantitatively analyzed and estimated by the subjective and objective motivations and their cost, as described in the following theorem.

Theorem 2. The strength of a motivation M is proportional to both the strength of the emotion |Em| and the difference between the expectancy of desire E and the current status S of a person, and is inversely proportional to the cost to accomplish the expected motivation C, i.e.:

M = 2.5 · |Em| · (E - S) / C    (4)
where 0 ≤ |Em| ≤ 4, 0 ≤ E, S ≤ 10, and 1 ≤ C ≤ 10; the coefficient 2.5 normalizes the value of M into the range (0 .. 100).

In Theorem 2, the strength of a motivation is measured in the scope 0 ≤ M ≤ 100. When M > 1, the motivation is considered a desired motivation, because it indicates both an existing emotion and a positive expectancy. The higher the value of M, the stronger the motivation. According to Theorem 2, in a software engineering context, the rational action of the manager of a group is to encourage the individual emotional desire and the expectancy of each software engineer, and to decrease the required effort for the employees by providing additional resources or adopting certain tools.

Corollary 1. A super-strong motivation toward a resolute goal is generated by a determined expectancy of a person at any cost.
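A one-line computational reading of Eq. (4), with the ranges stated above checked explicitly (the sample values are illustrative only):

    def motivation_strength(em, e, s, c):
        assert 0 <= em <= 4 and 0 <= e <= 10 and 0 <= s <= 10 and 1 <= c <= 10
        # M lies in (0 .. 100); M > 1 marks a desired motivation
        return 2.5 * em * (e - s) / c

    motivation_strength(4, 8, 5, 3)    # 10.0
    motivation_strength(3.6, 8, 6, 8)  # 2.25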
It is noteworthy that a motivation is only a potential mental power of human beings, and a strong motivation will not necessarily result in a behavior or action. The condition for transforming a motivation into a real behavior or action is dependent on multiple factors, such as values, social norms, expected difficulties, availability of resources, and the existence of alternative goals. The motivation of a person is constrained by the attitude and decision making strategies of the person. The former is the internal (subjective) judgment of the feasibility of the motivation, and the latter is the external (social) judgment of the feasibility of the motivation. Attitude and decision making mechanisms will be analyzed in the following subsections.
THE MATHEMATICAL MODEL OF ATTITUDE

As described in the previous section, motivation is the potential power that may trigger an observable behavior or action. Before the behavior is performed, it is judged by an internal regulation system known as the attitude. Psychologists perceive attitude in various ways. R. Fazio describes an attitude as an association between an act or object and an evaluation (Fazio, 1986). A. Eagly and S. Chaiken define an attitude as a tendency of a human to evaluate a person, concept, or group positively or negatively in a given context (Eagly and Chaiken, 1992). More recently, Arno Wittig describes attitude as a learned evaluative reaction to people, objects, events, and other stimuli (Wittig, 2001). Attitudes may be formally defined as follows.

Definition 7. An attitude is a subjective tendency towards a motivation, an object, a goal, or an action, based on an intuitive evaluation of its feasibility. The modes of attitudes can be positive or negative, and can be quantitatively analyzed using the following model.

Definition 8. The mode of an attitude A is determined by both an objective judgment of its conformance to the social norm N and a subjective judgment of its empirical feasibility F, i.e.:
$A = \begin{cases} 1, & N = T \wedge F = T \\ 0, & N = F \vee F = F \end{cases}$    (5)
where A = 1 indicates a positive attitude; otherwise, it indicates a negative attitude.
INTERACTION BETWEEN MOTIVATION AND ATTITUDE

This section discusses the relationships between a set of interlinked perceptual psychological processes: emotions, motivations, attitudes, decisions, and behaviors. A motivation/attitude-driven behavioral model is developed for formally describing the cognitive processes of motivations and attitudes.

It is observed that motivation and attitude have considerable impact on behavior and influence the ways a person thinks and feels (Westen, 1999). A reasoned action model, proposed by Martin Fishbein and Icek Ajzen in 1975, suggests that human behavior is directly generated by behavioral intentions, which are controlled by attitude and social norms (Fishbein and Ajzen, 1975). An initial motivation, before the judgment by an attitude, is only a temporal idea; with the judgment of the attitude, it becomes a rational motivation (Wang et al., 2006), also known as the behavioral intention.

The relationships between an emotion, motivation, attitude, and behavior can be formally and quantitatively described by the motivation/attitude-driven behavioral (MADB) model as illustrated in Figure 1. In the MADB model, motivation and attitude have been defined in Eqs. 3 and 5. The rational motivation, decision, and behavior
can be quantitatively analyzed according to the following definitions. It is noteworthy that, as shown in Figure 1, a motivation is triggered by an emotion or desire.

Definition 9. A rational motivation Mr is a motivation regulated by an attitude A with a positive or negative judgment, i.e.:

$M_r = M \cdot A = \frac{2.5 \cdot |Em| \cdot (E - S)}{C} \cdot A$    (6)
Definition 10. A decision D for confirming an attitude for executing a motivated behavior is a binary choice on the basis of the availability of time T, resources R, and energy P, i.e.:

$D = \begin{cases} 1, & T \wedge R \wedge P = T \\ 0, & T \vee R \vee P = F \end{cases}$    (7)
Definition 11. A behavior B driven by a motivation Mr and an attitude is a realized action initiated by a motivation M and supported by a positive attitude A and a positive decision D toward the action, i.e.:
$B = \begin{cases} T, & M_r \cdot D = \frac{2.5 \cdot |Em| \cdot (E - S)}{C} \cdot A \cdot D > 1 \\ F, & \text{otherwise} \end{cases}$    (8)

Figure 1. The model of motivation/attitude-driven behavior (stimuli and emotions trigger a motivation M; the attitude A, judged against values and social norms, turns it into a rational motivation Mr; the decision D, based on the availability of time, resources, and energy, determines whether Mr is realized as the behavior B, whose outcome satisfies or dissatisfies, and thereby strengthens or weakens, the motivation)
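Putting Eqs. (4) through (8) together, the whole MADB chain reduces to a short function; the Boolean arguments stand for the judgments N, F, T, R, and P (a sketch, not the RTPA processes given later):

    def madb_behavior(em, e, s, c, n, f, t, r, p):
        m = 2.5 * em * (e - s) / c           # Eq. (4): strength of motivation
        a = 1 if (n and f) else 0            # Eq. (5): mode of attitude
        mr = m * a                           # Eq. (6): rational motivation
        d = 1 if (t and r and p) else 0      # Eq. (7): decision
        return mr * d > 1                    # Eq. (8): behavior is realized iff Mr*D > 1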
FORMAL DESCRIPTION OF THE PROCESSES OF MOTIVATION AND ATTITUDE

The formal models of emotion, motivation, and attitude have been developed in the previous sections. This section extends the models and their relationships into detailed cognitive processes based on the object-attribute-relation (OAR) model (Wang, 2007d) and using RTPA (Wang, 2002b, 2003c), which enables more rigorous treatment and computer simulation.
The Cognitive Process of Motivations

The mathematical model of motivation is described in Equation 6. Based on Equation 6, the cognitive process of motivation (MTVT) is presented in Figure 2. The motivation process is divided into four major sub-processes known as (i) Form motivation goal, (ii) Estimate strength of motivation, (iv) Form rational motivation, and (vi) Stimulate behavior for the motivation. The MADB model provides a formal explanation of the mechanism of, and the relationships between, motivation, attitude, and behavior. The model can be used to describe how the motivation process drives human behaviors and actions, and how the attitude, as well as the decision-making process, helps to regulate the motivation and determine whether the motivation should be implemented.
The Cognitive Process of Attitudes The mathematical model of attitude has been described in Equation 5. Based on Equation 5, the cognitive process of attitude (ATTD) is presented in Figure 3. The attitude process is divided into three major sub-processes known as (iii) Check the mode of attitude, (v) Determine physical availability, and (vi) Stimulate behavior for the motivation.
The Integrated Process of Motivation and Attitudes According to the model of motivation/attitude-driven behavior (MADB) and the formal description of the motivation and attitude processes as shown in Figures 1 through 3, the cognitive processes of motivation and attitude are interleaved. An integrated process that combines both motivation and attitude is given in Figure 4, via the following sub-processes: (i) Form motivation goals, (ii) Estimate strength of motivation, (iii) Check the mode of attitude, (iv) Form rational motivation, (v) Determine physical availability, and (vi) Stimulate behavior for the rational motivation.
MAXIMIZING STRENGTHS OF MOTIVATIONS

Studies in sociology provide a rich theoretical basis for gaining new insights into the organization of software engineering. It is noteworthy that, in a software organization, according to Theorem 2, the strength of a motivation of individuals M is proportional to both the strength of the emotion and the difference between the expectancy and the current status of a person. At the same time, it is inversely proportional to the cost to accomplish the expected motivation C. The job of management at different levels of an organization tree is to encourage and improve Em and E, and to help employees reduce C.

Example 1. In a software engineering project organization, the manager and programmers may be motivated to improve software quality to different extents. Assume the factors shown in Table 3 are collected from a project on the strengths of motivations to improve the quality of a software system; analyze how the factors influence the strengths of motivations of the manager and the programmer.
Figure 2. The cognitive process of motivations
The Motivation Process

    Motivation (I:: oS; O:: OAR(O’, A’, R’)ST)
    {
      I. Form motivation goal(s)
           Identify (o, A’, R’)
      II. Estimate strength of motivation M(o)N
           Quantify (Em(o)N)   // The strength of emotion
           Quantify (S(o)N)    // The current status
           Quantify (E(o)N)    // The expectancy of desire
           Quantify (C(o)N)    // The cost to accomplish
           M(o)N := 2.5 • Em(o)N • (E(o)N - S(o)N) / C(o)N
           ( M(o)N > 1
               M(o)BL = T      // Positive motivation
             | ~
               M(o)BL = F      // Negative motivation
           )
      III. Check the mode of attitude A(o)N   // Refer to the Attitude process
      IV. Form rational motivation Mr(o)
           Mr(o)N := M(o)N • A(o)N
           ( Mr(o)N > 1
               Mr(o)BL = T     // Rational motivation
             | ~
               Mr(o)BL = F     // Irrational motivation
           )
      V. Determine physical availability D(o)N   // Refer to the Attitude process
      VI. Stimulate behavior for Mr(o)
           ( D(o)N = 1         // Implement motivation o
               GenerateAction (Mr(o))
               ExecuteAction (Mr(o))
               R’ := R’ ∪ …
             | ~               // Give up motivation o
               D(o)N := 0
               o := Ø
               R’ := Ø
           )
      OAR’ST = …               // Form new OAR model
      Memorization (OAR’ST)
    }
Figure 3. The cognitive process of attitude

The Attitude Process

    Attitude (I:: oS; O:: OAR(O’, A’, R’)ST)
    {
      I. Form motivation goal(s)
           Identify (o, A’, R’)
      II. Estimate strength of motivation M(o)N   // See the MTVT process
      III. Check the mode of attitude A(o)N       // Perceptual feasibility
           Qualify (N(o)BL)    // The social norm
           Qualify (F(o)BL)    // The subjective feasibility
           ( N(o)BL ∧ F(o)BL = T
               A(o)N := 1
             | ~
               A(o)N := 0
           )
      IV. Form rational motivation Mr(o)          // Refer to the Motivation process
      V. Determine physical availability D(o)N
           Qualify (T(o)BL)    // The time availability
           Qualify (R(o)BL)    // The resource availability
           Qualify (P(o)BL)    // The energy availability
           ( T(o)BL ∧ R(o)BL ∧ P(o)BL = T
               D(o)N := 1      // Confirmed motivation
             | ~
               D(o)N := 0      // Infeasible motivation
           )
      VI. Stimulate behavior for Mr(o)
           ( D(o)N = 1         // Implement motivation o
               GenerateAction (Mr(o))
               ExecuteAction (Mr(o))
               R’ := R’ ∪ …
             | ~               // Give up motivation o
               D(o)N := 0
               o := Ø
               R’ := Ø
           )
      OAR’ST = …               // Form new OAR model
      Memorization (OAR’ST)
    }
Table 3. Motivation factors of a project

Role        | Em  | C | E | S
The manager | 4   | 3 | 8 | 5
Programmers | 3.6 | 8 | 8 | 6
According to Theorem 2, the strengths of motivations of the manager M1 and the programmer M2 can be estimated using Equation 4, respectively:

M1(manager) = 2.5 · 4 · (8 - 5) / 3 = 10.0

and

M2(programmer) = 2.5 · 3.6 · (8 - 6) / 8 ≈ 2.3
The results show that the manager has a much stronger motivation to improve the quality of the software than the programmer does in the given project. Therefore, the rational action for the manager is to encourage the expectancy of the programmer, or to decrease the required effort for the programmer by providing additional resources or adopting certain tools.

According to social psychology (Wiggins et al., 1994), the social environment, such as culture, ethical norms, and attitude, greatly influences people's motivation, behavior, productivity, and quality towards collaborative work. The chain of individual motivation in a software organization can be illustrated as shown in Figure 5. The cultures and values of a software development organization help to establish a set of ethical principles or standards, shared by individuals of the organization, for judging and normalizing social behaviors. The identification of a larger set of values and of organizational policy towards social relations may help to normalize individual and collective behaviors in a software development organization that produces information products for a global market.

Another condition for supporting the creative work of individuals in a software development organization is to encourage diversity in both ways of thinking and work allocation. It is observed in social ecology that a great diversity of species, and a complex and intricate pattern of interactions among the populations of a community, may confer greater stability on an ecosystem.

Definition 12. Diversity refers to the social and technical differences of people in working organizations. Diversity includes a wide range of differences between people, such as those of race, ethnicity, age, gender, disability, skills, education, experience, values, native language, and culture.

System theory indicates that if the number of components of a system reaches a certain level, the critical mass, then the functionality of the system may be dramatically increased (Wang, 2007a). That is, the increase of diversity in a system is the condition for realizing the system fusion effect, which results in a totally new system.

Theorem 3. The diversity principle states that the more diverse the workforce in an organization (particularly in the creative software industry), the higher the opportunity to form new relations and connections that lead to the gain of the system fusion effect.
Figure 4. The integrated process of motivation and attitude

The Motivation and Attitude Process

    Motivation-Attitude (I:: oS; O:: OAR(O’, A’, R’)ST)
    {
      I. Form motivation goal(s)
           Identify (o, A’, R’)
      II. Estimate strength of motivation M(o)N
           Quantify (Em(o)N)   // The strength of emotion
           Quantify (S(o)N)    // The current status
           Quantify (E(o)N)    // The expectancy of desire
           Quantify (C(o)N)    // The cost to accomplish
           M(o)N := 2.5 • Em(o)N • (E(o)N - S(o)N) / C(o)N
           ( M(o)N > 1
               M(o)BL = T      // Positive motivation
             | ~
               M(o)BL = F      // Negative motivation
           )
      III. Check the mode of attitude A(o)N       // Perceptual feasibility
           Qualify (N(o)BL)    // The social norm
           Qualify (F(o)BL)    // The subjective feasibility
           ( N(o)BL ∧ F(o)BL = T
               A(o)N := 1
             | ~
               A(o)N := 0
           )
      IV. Form rational motivation Mr(o)
           Mr(o)N := M(o)N • A(o)N
           ( Mr(o)N > 1
               Mr(o)BL = T     // Rational motivation
             | ~
               Mr(o)BL = F     // Irrational motivation
           )
      V. Determine physical availability D(o)N
           Qualify (T(o)BL)    // The time availability
           Qualify (R(o)BL)    // The resource availability
           Qualify (P(o)BL)    // The energy availability
           ( T(o)BL ∧ R(o)BL ∧ P(o)BL = T
               D(o)N := 1      // Confirmed motivation
             | ~
               D(o)N := 0      // Infeasible motivation
           )
      VI. Stimulate behavior for Mr(o)
           ( D(o)N = 1         // Implement motivation o
               GenerateAction (Mr(o))
               ExecuteAction (Mr(o))
               R’ := R’ ∪ …
             | ~               // Give up motivation o
               D(o)N := 0
               o := Ø
               R’ := Ø
           )
      OAR’ST = …               // Form new OAR model
      Memorization (OAR’ST)
    }
Figure 5. The chain of motivation in a software organization (basic human needs of individuals → motivation → behavior → productivity → quality, moderated by attitude and organizational objectives within the social environment of software engineering)
CONCLUSION

This chapter has described the cognitive processes of emotions, motivations, and attitudes, and has demonstrated that complicated psychological and cognitive mental processes may be formally modeled and rigorously described. The perceptual cognitive processes, such as emotions, motivations, and attitudes, have been explored in order to explain the natural drives and constraints of human behaviors. The relationships and interactions between motivation and attitude have been discussed and formally described in Real-Time Process Algebra (RTPA). It has been recognized that the human emotional system is a binary system that interprets or perceives an external stimulus and/or internal status as pleasant or unpleasant. It has been revealed that the strength of a motivation is proportional to both the strength of the emotion and the difference between the expectancy of desire and the current status of a person, and is inversely proportional to the cost to accomplish the expected motivation. Case studies on applications of the interactive motivation-attitude theory and the cognitive processes of motivations and attitudes in software engineering have been presented.

This work has demonstrated that complicated human emotional and perceptual phenomena can be rigorously modeled in mathematics and formally treated and described. It has been based on two fundamental cognitive informatics models: the Layered Reference Model of the Brain (LRMB) and the Object-Attribute-Relation (OAR) model. The former has provided a blueprint for exploring natural intelligence and its mechanisms. The latter has established a contextual foundation to reveal the logical representation of information, knowledge, and skills in the abstract space of the brain.
ACKNOWLEDGMENT

The author would like to acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC) for its support of this work. We would like to thank the anonymous reviewers for their valuable comments and suggestions.
REFERENCES

Eagly, A. H., & Chaiken, S. (1992). The psychology of attitudes. San Diego: Harcourt Brace.

Fazio, R. H. (1986). How do attitudes guide behavior? In R. M. Sorrentino & E. T. Higgins (Eds.), The handbook of motivation and cognition: Foundations of social behavior. New York: Guilford Press.
Fischer, K. W., Shaver, P. R., & Carnochan, P. (1990). How emotions develop and how they organize development. Cognition and Emotion, 4, 81-127.

Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley.

Payne, D. G., & Wenger, M. J. (1998). Cognitive psychology. New York: Houghton Mifflin.

Pinel, J. P. J. (1997). Biopsychology (3rd ed.). Needham Heights, MA: Allyn and Bacon.

Smith, R. E. (1993). Psychology. St. Paul, MN: West Publishing.

Wang, Y. (2002a, August). On cognitive informatics. Keynote lecture, Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI'02) (pp. 34-42). Calgary, Canada: IEEE CS Press.

Wang, Y. (2002b). The real-time process algebra (RTPA). Annals of Software Engineering, 14, 235-274. Oxford: Baltzer Science Publishers.

Wang, Y. (2003a). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.

Wang, Y. (2003b). On cognitive informatics. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 151-167.

Wang, Y. (2003c). Using process algebra to describe human and software behaviors. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 199-213.

Wang, Y. (2006, July). Cognitive informatics: Towards the future generation computers that think and feel. Keynote speech, Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 3-7). Beijing, China: IEEE CS Press.

Wang, Y. (2007a). Software engineering foundations: A software science perspective. CRC Software Engineering Series, 2. USA: CRC Press.

Wang, Y. (2007b, January). The theoretical framework of cognitive informatics. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(1), 1-57. Hershey, PA: IGI Publishing.

Wang, Y. (2007c, July). Towards theoretical foundations of autonomic computing. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(3), 1-15. Hershey, PA: IGI Publishing.

Wang, Y. (2007d, July). The OAR model of neural informatics for internal knowledge representation in the brain. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(3), 64-75. Hershey, PA: IGI Publishing.

Wang, Y., & Wang, Y. (2006, March). Cognitive informatics models of the brain. IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 203-207.

Wang, Y., Wang, Y., Patel, S., & Patel, D. (2006, March). A layered reference model of the brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 124-133.

Westen, D. (1999). Psychology: Mind, brain, and culture (2nd ed.). New York: John Wiley & Sons.

Wiggins, J. A., Wiggins, B. B., & Zanden, J. V. (1994). Social psychology (5th ed.). New York: McGraw-Hill.

Wilson, R. A., & Keil, F. C. (Eds.) (1999). The MIT encyclopedia of the cognitive sciences. Cambridge, MA: The MIT Press.

Wittig, A. F. (2001). Schaum's outline of theory and problems of introduction to psychology (2nd ed.). New York: McGraw-Hill.
Chapter V
A Selective Sparse Coding Model with Embedded Attention Mechanism Qingyong Li Beijing Jiaotong University, China Zhiping Shi Chinese Academy of Sciences, China Zhongzhi Shi Chinese Academy of Sciences, China
Abstract

Sparse coding theory demonstrates that the neurons in the primary visual cortex form a sparse representation of natural scenes from the viewpoint of statistics, but a typical scene contains many different patterns (corresponding to neurons in the cortex) competing for neural representation because of the limited processing capacity of the visual system. We propose an attention-guided sparse coding model. This model includes two modules: a non-uniform sampling module simulating the processing of the retina, and a data-driven attention module based on response saliency. Our experimental results show that the model notably decreases the number of coefficients that may be activated, while retaining the main visual information. It provides a way to improve the coding efficiency of the sparse coding model and to achieve good performance in both population sparseness and lifetime sparseness.
Introduction

Understanding and modeling the functions of neurons and neural systems is one of the primary goals of cognitive informatics (CI) (Wang 2002, 2007; Wang and Kinsner 2006). The computational capabilities and limitations of neurons, and the environment in which the organism lives, are two fundamental components driving the evolution and development of such systems. Both have been broadly investigated.
The utilization of environmental constraints is most clearly evident in sensory systems, where it has long been assumed that neurons are adapted to the signals to which they are exposed (Simoncelli 2001). Because not all signals are equally likely, it is natural to assume that perceptual systems should be able to best process those signals that occur most frequently. Thus, it is the statistical properties of the environment that are relevant for the sensory processes of visual perception (Field 1987; Simoncelli 2003).

The efficient coding hypothesis (Barlow 1961) provides a quantitative relationship between environmental statistics and neural processing. Barlow first hypothesized that the role of early sensory neurons was to remove statistical redundancy in the sensory input. Then, Olshausen and Field put forward a model, called sparse coding, which makes the variables (the equivalents, in neurobiology, of neurons stimulated by the same stimulus) be activated (i.e., significantly non-zero) only rarely (Olshausen 1996). This model is named SC here. Vinje's results validated the sparse properties of neural responses under natural stimuli conditions (Vinje 2000). Afterwards, Bell brought forward another sparse coding model based on statistical independence (called SCI) and obtained the same results as Olshausen and Field's model (Bell 1997). More recent studies can be found in the survey (Simoncelli 2003).

However, Willmore and Tolhurst (Willmore 2001) argued that there are two different notions of 'sparseness': population sparseness and lifetime sparseness. Population sparseness describes codes in which few neurons are active at any time, and it is utilized in Olshausen and Field's sparse coding model (Olshausen 1996); lifetime sparseness describes codes in which each neuron's lifetime response distribution has high kurtosis, which is the main contribution of Bell's sparse coding model (Bell 1997). In addition, it has been shown that lifetime sparseness is uncorrelated with population sparseness. As Figure 3a shows, the number of variables that have large values produced by the sparse coding model, and that are thus possible to be activated, is relatively large compared with the computational capacity of neurons, even though the kurtosis of every response coefficient is high. So, how to improve both population sparseness and lifetime sparseness at the same time, while retaining as much of the important information as possible, is a valuable problem in practice.

The visual attention mechanism is an active strategy in the information processing procedure of the brain, which has many interesting characteristics such as selectivity and competition. Attention is everywhere in the visual pathway (Britten 1996). Furthermore, a typical scene within a neuron's classic receptive field (CRF) contains many different patterns that compete for neural representation because of the limited processing capacity of neurons in the visual system. So, integrating an attention mechanism into the sparse coding framework to improve the population sparseness and the coding efficiency is both reasonable and essential.

In this chapter, we extend the sparse coding principle by combining it with visual attention. We first model the sampling mechanism of the retina by a non-uniform sampling module; then, we implement a bottom-up attention mechanism based on the response saliency of the sparse coefficients. The diagram is illustrated in Figure 1. This model has two main contributions:
1. Modeling visual attention within the framework of sparse coding.
2. Improving the population sparseness of the response coefficients while at the same time retaining the most important information.
Figure 1. The diagram of the model (natural image → retina with attention module (1) → simple cells → attention module (2) → complex cells)
The rest of the chapter is organized as follows. Section 2 presents related work. In Section 3, a detailed description of the model is given. Experimental results are presented in Section 4. Conclusions are given in Section 5.
Related Work

In the sparse coding model (Olshausen 1996; Bell 1997), a perceptual system is exposed to a series of small image patches, drawn from one or more large images, just like the CRFs of neurons. Imagine that each image patch, represented by the vector x, has been formed by a linear combination of N basis functions. The basis functions form the columns of a fixed matrix A. The weights of this linear combination are given by a vector s. Each component of this vector has its own associated basis function and represents the response value of a neuron in the visual system. The linear synthesis model is therefore given by:

x = As    (1)
The goal of a perceptual system in this simplified framework is to linearly transform the images x with a matrix of filters W so that the resulting vector

u = Wx    (2)
recovers the response values s. In a cortical interpretation, s models the responses of (signed) simple cells, and the columns of the matrix A are closely related to their CRFs (Olshausen 1996). Figure 2a shows some basis functions, which are selective for location, orientation, and frequency, just as simple cells are. Note that we are considering contrast only.

In the framework of the efficient coding hypothesis, a fundamental assumption is that s is non-Gaussian in a particular way, called sparseness (Field 1994). Sparseness means that the random variable takes very small (absolute) values or very large values more often than a Gaussian random variable, and takes values in between relatively more rarely. Thus, the random variable is activated, that is, takes a significantly non-zero value, only rarely. There are many models that implement efficient coding. The most noted models include the SC model in (Olshausen 1996) and the SCI model in (Bell 1997). Though the SCI model achieves good lifetime sparseness, it does not show good population sparseness (Willmore 2001). It also does not consider the limited computational capacity of neurons in the primary visual cortex.

Convergent evidence from single-cell recording studies in monkeys, functional brain imaging, and event-related potential studies in humans indicates that selective attention can modulate neural processing in the visual cortex. Visual attention affects neural processing in several ways, including the following: enhancement of neural responses to a pattern, filtering of unwanted patterns, counteracting suppression, and so on. There are also many computational models of visual attention: given that the purpose of visual attention is to focus computational resources on a specific, "conspicuous" or "salient" region within a scene, it has been proposed that
Figure 2. Basis functions randomly selected from the set. (a) the original basis functions produced by the sparse coding model; (b) the corresponding binary basis functions with the distinct excitatory subregion labeled in white
There are two well-known saliency-based visual attention models (Itti 1998; Rybak 1998; Itti 2001). They provide data-driven models that simulate the attention mechanism in visual perception. Obviously, a typical image patch, i.e., the input to a neuron's CRF, contains many different patterns. Because of the limited processing capacity, these patterns compete for neural representation. That is to say, some variables of u for certain basis functions (here also called patterns), corresponding to simple cells' responses in cortex, will be selected for further processing; on the contrary, some variables will be omitted. The next section shows how to model this competition, or attention mechanism, within the sparse coding framework.
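To make the linear synthesis model of Equations 1 and 2 concrete, the following Python sketch (using NumPy; the random dictionary and toy patch size are illustrative stand-ins, not the basis functions actually learned by the SC or SCI models) synthesizes a patch from a sparse response vector and recovers the responses with a filter matrix:

import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_basis = 64, 64            # an 8x8 patch flattened to 64 pixels; N = 64 basis functions

A = rng.normal(size=(n_pixels, n_basis))
A /= np.linalg.norm(A, axis=0)        # columns of A play the role of CRF-like basis functions

s = np.zeros(n_basis)                 # sparse responses: only a few "simple cells" are active
active = rng.choice(n_basis, size=4, replace=False)
s[active] = rng.normal(size=4)

x = A @ s                             # Equation 1: x = As
W = np.linalg.pinv(A)                 # filter matrix; for an invertible A, W recovers the responses
u = W @ x                             # Equation 2: u = Wx
assert np.allclose(u, s)              # the response values s are recovered

# Kurtosis of the response vector, a lifetime-sparseness indicator: mostly-zero
# responses give a heavy-tailed, high-kurtosis distribution.
v = s - s.mean()
print((v**4).mean() / (v**2).mean()**2 - 3.0)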
Attention-Guided Sparse Coding Model (AGSC)
General Description of the Model
A functional diagram of the AGSC model is shown in Figure 1. The AGSC model includes two sequential attention modules in the sparse coding framework. At the beginning, the first attention module performs a transformation of the image into a 'retinal image', simulating the processing of the retina. The transformation provides a decrease of resolution for the retinal image from the center to the periphery of the CRF. The retinal image is used as the input to the sparse coding module of the simple cells. Then, the second attention module performs selective attention based on response saliency. It is a data-driven module, related to the so-called 'feature integration theory' and 'saliency-based attention model' (Itti 1998). The simple cell's response value and a discrepancy distance based on selective properties such as location, orientation and spatial frequency together form the response saliency of a simple cell. The simple cells' responses compete, based on their response saliency values, for further processing in the complex cell.
Non-Uniform Sampling Module
It is well known that the density of photoreceptors in the retina is greatest in the central area (fovea) and decreases towards the retinal periphery (Kronaver 1985). As a result, the resolution of the image representation in the visual cortex is highest for the part of the image projected onto the fovea and decreases rapidly with distance from the fovea center. In other words, the retina non-uniformly samples the input visual information. The retinal image (labeled RI = {V'ij}) is derived from the initial image I = {Vij} by way of a special transformation which produces a decrease in resolution from the center of the CRF to its periphery. To represent a certain area D in the image I at resolution level n (n ∈ {1, 2, 3}), we utilize the recursive computation of a Gaussian-like convolution at each position in D:

R^1_{ij} = V_{ij}
R^2_{ij} = \sum_{p=-2}^{2} \sum_{q=-2}^{2} G_{pq} \, R^1_{i-p,\, j-q}
R^3_{ij} = \sum_{p=-2}^{2} \sum_{q=-2}^{2} G_{pq} \, R^2_{i-2p,\, j-2q}

(3)
where the convolution coefficient matrix is as follows (Burt 1985):

[G_{pq}] = \frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}

(4)
The input image patch is taken as the whole CRF, and the center of the image patch is the center of the CRF. Here, we simply divide the image patch into three concentric circles from center to periphery. The radii of the concentric circles are R0, R1, R2 (empirically specified as 6R0 = 2R1 = R2) (Rybak 1998), and the Euclidean distance between point (i, j) and the center is D(i, j). The retinal image RI after non-uniform sampling can then be represented as follows:

V'_{ij} = \begin{cases} R^1_{ij} & \text{if } D(i,j) \le R_0 \\ R^2_{ij} & \text{if } R_0 < D(i,j) \le R_1 \\ R^3_{ij} & \text{if } R_1 < D(i,j) \le R_2 \end{cases}
(5)
Thus, the input image patch is represented as follows: the pixels are fully sampled within the central circle, just as in the original image; sampled at lower resolution within the first ring surrounding the central circle; and sampled at the lowest resolution within the outermost ring.
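A minimal Python sketch of this non-uniform sampling is given below (it assumes SciPy for the convolution; the stride-2 recursion of Equation 3 is approximated here by smoothing the already-smoothed level, so the third level is only an approximation of R^3):

import numpy as np
from scipy.ndimage import convolve

# The Burt-Adelson coefficient matrix of Equation 4.
G = np.array([[1, 4, 6, 4, 1],
              [4, 16, 24, 16, 4],
              [6, 24, 36, 24, 6],
              [4, 16, 24, 16, 4],
              [1, 4, 6, 4, 1]], dtype=float) / 256.0

def retinal_image(I, R0, R1, R2):
    level1 = I                                    # R^1 = V (full resolution)
    level2 = convolve(level1, G, mode='nearest')  # R^2: one Gaussian-like smoothing pass
    level3 = convolve(level2, G, mode='nearest')  # approximation of R^3 (see note above)
    h, w = I.shape
    ii, jj = np.ogrid[:h, :w]
    D = np.hypot(ii - h // 2, jj - w // 2)        # Euclidean distance to the CRF center
    # Equation 5: pick the resolution level by the ring the pixel falls in.
    return np.where(D <= R0, level1, np.where(D <= R1, level2, level3))

# Empirical radii satisfying 6*R0 = 2*R1 = R2, for a CRF of radius 24 pixels:
RI = retinal_image(np.random.rand(48, 48), R0=4, R1=12, R2=24)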
Response Saliency and Discrepancy Distance
The second attention module in AGSC, the selective attention module based on response saliency, operates after the input stimulus has been processed by the non-uniform sampling module. It is the key part of the attention mechanism in AGSC, since it determines which input patterns are selected and further processed in higher cortex. This section introduces the details of the selective attention module based on response saliency.

Definition 1: Response saliency is the response extent of a neuron compared with a group of neurons which respond to the same stimulus.

The purpose of response saliency is to represent the conspicuity of every neuron at the same perception level for a stimulus and to guide the selection of attended neurons based on the value of response saliency. A neuron whose response has a large response saliency value will be chosen for further processing; on the contrary, a neuron with a small value will be omitted. In the framework of sparse coding, the simple cells in the primary visual cortex (V1) produce sparse codes for the input stimuli. That is to say, the response of a simple cell takes very small (absolute) values or very large values often; to compensate, it takes values in between relatively rarely. Lifetime sparseness focuses on the probability distribution of the response (Olshausen 1996). Intuitively, the response value itself provides very useful information: the bigger the response value, the more important the information represented by the neuron; otherwise, the information is less important. Obviously, the response value gives a foundation for the attention mechanism. Suppose that Ai represents simple cell i, and Ri represents the simple cell's response. Then the greater Ri is, the greater the response saliency value of Ai. Every simple cell (corresponding to a column of A in Equation 1) carries a specific pattern. Furthermore, every such pattern is selective for location, orientation and frequency. Based on the Gestalt similarity perception principle and the Hebb rule, neurons which have similar visual selectivity characteristics such as location, orientation and spatial frequency will enhance each other's response saliency. On the contrary, neurons with different selectivity characteristics will suppress each other's response saliency values (Simon 1998). We therefore suppose that the response saliency value of a neuron which has a large discrepancy in visual selectivity characteristics among a group of neurons responding to the same stimulus will decrease, and the value for a neuron which has a small discrepancy will relatively increase (Boothe 2002). The neuron set responding to the same stimulus is denoted S, S = {A1, A2, ..., Am}, corresponding to the basis functions in the sparse coding model. We first define two important measures.

Definition 2: Pattern distance measures the similarity between the patterns of two simple cells, and is denoted D(Ai, Aj) for two simple cells Ai and Aj. D(Ai, Aj) is a function of the simple cell's selectivity characteristics: location (L), orientation (O) and frequency (F), since every simple cell here can be regarded as a pattern characterized by the parameters L, O, F.
Definition 3: Discrepancy distance measures the discrimination of a simple cell within the simple cell set S when they respond to the same stimulus; it is denoted Diff(Ai, S) for simple cell Ai.

The basis functions obtained by the sparse coding model are selective to location, orientation and spatial frequency just like the simple cell receptive field, so we analyze the visual selectivity of such basis functions instead of simple cells. We first treat each basis function as a gray image and transform the gray image into a binary image using Otsu's method (Otsu 1979). Figure 2.b shows the binary basis functions with the distinct excitatory subregion labeled in white. Then we extract the location, orientation and frequency features from the binary basis functions. Location selectivity is the first important characteristic of the simple cell receptive field. We treat the center, L = (x, y), of the excitatory subregion as the location selectivity; orientation O is a scalar representing the angle (in degrees) between the x-axis and the major axis of the ellipse that has the same second moments as the excitatory subregion; and spatial frequency F is here replaced by size, the area of the excitatory subregion. So D(Ai, Aj) can be calculated as below:

D(A_i, A_j) = W_1 \, N\!\left(\sqrt{(L_{ix} - L_{jx})^2 + (L_{iy} - L_{jy})^2}\right) + W_2 \, N(|O_i - O_j|) + W_3 \, N(|F_i - F_j|)
(6)
Here, the operation N(.) represents the normalization operator, which maps values into [0, 1]; 0 ≤ W1, W2, W3 ≤ 1 are the weights, with W1 + W2 + W3 = 1; and Lx and Ly refer to the x-axis and y-axis coordinates, respectively. We call the simple cell subset of S excluding Ai the neighbor cells of Ai, and refer to it as NSi. According to Definition 3, Diff(Ai, S) reflects the extent of response discrimination between Ai and its neighbor cells. It is influenced not only by the pattern distance, but also by the response values. So we define Diff(Ai, S) as the weighted sum of the response values of the neighbor cells, where the weights are designated by the pattern distance. The equation is given by:

Diff(A_i, S) = \sum_{A_j \in NS_i} N(D(A_i, A_j)) \cdot \frac{R_j}{\sum_{A_k \in NS_i} R_k}

(7)
Here, the operation N(.) again represents the normalization operator mapping values into [0, 1]. Note that normalization is also applied to the response values of the neighbor cells in order to limit the value of Diff(Ai, S) to the range (0, 1). From Equation 7, we can see that if the pattern distance and the response value are both larger, then the discrepancy distance will be larger too, so the response of Ai will be suppressed, just like the lateral suppression mechanism in the neural system (Simon 1998). Having obtained the response value and the discrepancy distance, we can finally define the response saliency (RS). Two factors influence the RS value. The first is the internal factor, the response value. The response value provides the foundation for the data-driven attention mechanism, as discussed above, and it is also the most important difference among the simple cells responding to the same stimulus. The second is the external factor, the discrepancy distance. It measures the relationship between the individual simple cell and its neighbor cells and simulates the interaction among the cells. Because the details of the neural mechanism of attention are not yet known (Britten 1996), we define the RS value, for simplicity, as the weighted sum of the normalized response value and the complement of the neuron discrepancy distance. The equation is given by:
RS(A_i) = N(R_i) + \lambda \, (1 - Diff(A_i, S))
(8)
Here λ is the weight which determines the importance of each component. Note that the second component is defined as the complement of Diff(Ai, S), since Diff(Ai, S) is a counteractive factor, like the function of suppression: the greater its value, the smaller the RS value.
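The following sketch puts Equations 6 through 9 together for a small population of cells (min-max normalization is one concrete choice for N(.), which the chapter leaves open, and the feature values are invented for illustration):

import numpy as np

def normalize(v):
    # N(.): min-max normalization into [0, 1] (one possible choice).
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def response_saliency(L, O, F, R, w=(1/3, 1/3, 1/3), lam=0.5):
    """L: (m, 2) locations; O, F, R: (m,) orientations, sizes, responses."""
    loc = np.hypot(L[:, None, 0] - L[None, :, 0], L[:, None, 1] - L[None, :, 1])
    ori = np.abs(O[:, None] - O[None, :])
    frq = np.abs(F[:, None] - F[None, :])
    D = w[0]*normalize(loc) + w[1]*normalize(ori) + w[2]*normalize(frq)  # Equation 6
    Rn = normalize(np.abs(R))
    m = len(R)
    RS = np.empty(m)
    for i in range(m):
        nbr = [j for j in range(m) if j != i]            # neighbor cells NS_i
        share = np.abs(R)[nbr] / np.abs(R)[nbr].sum()    # normalized neighbor responses
        diff = float(normalize(D[i, nbr]) @ share)       # Equation 7
        RS[i] = Rn[i] + lam * (1.0 - diff)               # Equation 8
    return RS

L = np.array([[3, 3], [3, 4], [9, 9], [2, 3]]); O = np.array([10., 12., 80., 11.])
F = np.array([5., 6., 14., 5.]); R = np.array([0.9, 0.7, 0.2, 0.05])
RS = response_saliency(L, O, F, R)
print(np.where(RS > 1.0, R, 0.0))   # Equation 9: threshold selection with T = 1.0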
Selective Attention Module Based on Response Saliency
After obtaining the simple cells' response saliency values, we can select certain simple cells as the complex cell's inputs according to those values. Selection is an important characteristic of the attention mechanism (Kahneman 1973). Psychologists regard it as an internal mechanism which controls how input stimuli are processed and adjusts behavior accordingly. Selection makes the information processing procedure more efficient (Kahneman 1973). We design two selection strategies: threshold selection (TS) and proportion selection (PS).
Threshold Selection Strategy
Treisman first put forward the concept of thresholds in the famed attenuation attention model (Treisman 1964). He argued that every response pattern has its own threshold; an input stimulus is activated if its response is greater than the threshold, otherwise it is attenuated or ignored. Intuitively, it sounds reasonable to set up a threshold for the simple cell's response based on the RS value, resembling the attenuation attention model. So we put forward a threshold selection (TS) strategy. TS is a threshold filtering algorithm. Assume we have a threshold T. If the response saliency value of a simple cell is greater than T, the simple cell is chosen as an input for the complex cell; on the contrary, if the value is smaller than T, the simple cell is omitted. We can formalize it as follows:

Output(A_i) = \begin{cases} 0 & \text{if } RS(A_i) \le T_i \\ R_i & \text{if } RS(A_i) > T_i \end{cases}

(9)
where RS(Ai) refers to the response saliency value of simple cell Ai, and Ri is the response value of Ai. Output(Ai) represents the output of the attention module for Ai. Obviously, if its value equals 0, the output of simple cell Ai is omitted; otherwise, the output of Ai is further processed in the complex cell. The key problem is how to determine the threshold. In principle, different simple cells have different thresholds; however, it is very difficult to determine these thresholds, even through biological experiments (Treisman 1964). For simplicity, we assume that all simple cells share the same threshold T, which we can then learn from a data set. Note that the purpose of the attention mechanism is to omit the minor information in the input stimuli and to retain the primary information. From the viewpoint of data transformation, this means that the original stimulus should be well reconstructable from the information passed by the attention module. So we can learn the threshold T by controlling the reconstruction error. The threshold learning algorithm is described below:

Algorithm 1: Threshold Learning Algorithm
Input: The upper limit of the reconstruction error (UE), the basis function set (A), the training data set (I), the sparse coding coefficient set (R), and the response saliency value set (RS).
Output: Threshold (T)
Method:
1. Initialize T;
2. Filter the sparse coding coefficients by T: if RSi is greater than T, then R'i = Ri, else R'i = 0;
3. Compute the reconstruction error for the data set I:

Error(R', A) = \sum_{I_i} \sum_{x,y} \Big( I_i(x, y) - \sum_i R'_i A_i(x, y) \Big)^2

4. If Error ≥ UE, set T = ηT, where 0 < η < 1, and return to step 2; otherwise, output T.

…an error risk is highly probable in a lapse-of-memory situation and if there is no information about what has been done before. A second kind of error consists in being confused about the number of sets already placed and/or those remaining to be placed, without necessarily being mistaken about the order of the mixtures; for example, duplicating more than twice, or adding only once, a given set.
The ACT-R Based Model
To model the Tiramisu recipe preparation, we defined various types of chunks. A first set of chunks contains information indicating the number of ingredients already placed in the baking dish. A second set of chunks memorises the last mixture placed before the ingredient. To cause a fault due to an interruption, the simulation is stopped (at a randomly chosen time) and the model is forced to compute some arithmetical calculations (for a few seconds) before returning to the suspended Tiramisu realisation. Experimental tests show that the model is unable to remember the last chunk used before the interruption; the chunk that is always recalled is the one having the greatest activation value. According to the ACT-R theory, the activation of a chunk reflects (i) its general usefulness in the past and (ii) its relevance to the current context (Anderson et al., 2004). This activation is given by Equation (1):

A_i = B_i + \sum_j W_j S_{ji}
(1)
where Bi is the base-level activation of chunk i, the Wj reflect the attentional weighting of the elements that are part of the current goal, and the Sji are the strengths of association from the elements j to chunk i. The activation of a chunk controls both its probability of being retrieved and its speed of retrieval. In this sense, it is impossible to recover a given chunk according to its occurrence in time; only the activation law determines which chunk will be recovered. Since the impact of time on the calculation of the Bi factor decreases rapidly, after an interruption
the number of times a chunk has been used becomes the most decisive feature affecting its recall. This induces abnormal behaviours in our model, especially in the situation where the last placed, only observable, ingredient after the interruption is <cocoa>. In theory, to decide what to do next, the model must recall a chunk that memorises the last added ingredient. In practice, the recovered chunk is not the one desired: the chunk recalled is the one that was used for deciding what to lay in the previous step (prior to the interruption), and since it has already been utilised, its activation value has increased. Each time this inappropriate chunk is recalled, its activation augments further; gradually, it becomes the only chunk remembered. For example, if it wrongly specifies the last ingredient, this results in placing an infinite alternating sequence of the two corresponding ingredients. To resolve this problem, it was indispensable to enrich the defined chunks with additional slots; for example, adding the slot "is-last-created" to chunks of the type "last-ingredient" permits the model to retain the last ingredient placed before the mixture. The "is-last-created" slot values are exclusively Boolean. However, this solution, even if it demonstrates efficiency in memorising the ultimate steps done before interruptions, does not translate into a faithful use of human cognitive structures: by handling Boolean parameters, the model behaves and reasons more like a machine than like a human being.
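The lapse of memory can be reproduced with a few lines of Python using the standard ACT-R base-level learning equation, B_i = ln(Σ_j (t_now − t_j)^(−d)) with the usual decay d = 0.5 (the equation and its default decay come from the ACT-R literature, not from this chapter, and the use times below are invented for illustration):

import math

def base_level(use_times, now, d=0.5):
    # B_i = ln( sum over past uses j of (now - t_j)^(-d) )
    return math.log(sum((now - t) ** (-d) for t in use_times))

often_used = [1, 3, 5, 7, 9, 11]   # a chunk reinforced at every previous step
needed_one = [12]                  # the chunk actually needed: used once, just before the interruption

for now in (13, 20, 40):           # growing delay introduced by the interrupting task
    print(now, base_level(often_used, now) > base_level(needed_one, now))
# prints True for every delay: frequency of use dominates recency, so the
# inappropriate, frequently used chunk keeps winning the retrieval competition.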
The AURELLIO Based Model
The simulation experiment of the Tiramisu realisation shows that our ACT-R based model cannot memorise particular events in a temporal context. In particular, it cannot remember its last actions; for example, what it had just placed before adding <cocoa>, the last operation before the interruption. Even if it is possible to simulate a memory of events by distinguishing between various occurrences of the same chunk (by defining several instances of this declarative knowledge entity), establishing the relation between each occurrence and the context in which it was created and handled remains beyond the model. By encoding each event in a suitable structure, it would be possible (i) to find it later by scanning the structure and (ii) to recreate all the characteristics of the context to which it was attached, i.e., the intention or the need leading to the event's creation, the means used to satisfy (or try to satisfy) this need, the generated consequences, etc.

Modelling the Knowledge
We used the authoring tool to model the knowledge related to the Tiramisu realisation. For example, Figure 6 shows a structural diagram which defines the concepts "sugar", "cocoa", "none" (which stands for the emptiness of the baking dish), "biscuit" and "base" as five primitive concepts that inherit from the abstract concept "Ingredient". The structural diagram also establishes that "OneBaseAdded", "TwoBasesAdded" and "ThreeBasesAdded" are primitive sub-concepts of the abstract concept "BaseAddedRememberance", which refers to the remembrance of the number of <base> layers already added to the baking dish. Figure 7 illustrates a part of the procedural diagram, which defines that the subgoal "G_AddRightIngredient" (specified by the complex procedure "P_AddIngredient" called to achieve the goal "G_AddIngredient") can be attained by means of one complex procedure ("P_AddOnCocoa") or three others which are primitive ("P_AddBaseOnNone", "P_AddCocoaOnBiscuit" and "P_AddCocoaOnBase"). Whereas the complex procedure "P_AddOnCocoa" decides what to do after an interruption if the last placed, only observable, ingredient is <cocoa>, "P_AddBaseOnNone" covers the bottom of an empty baking dish with <base>, "P_AddCocoaOnBiscuit" adds <cocoa> over <biscuit>, and "P_AddCocoaOnBase" spreads <cocoa> over <base>.

Implementing the Model
An AURELLIO-based simulator was designed. Its purpose is to emulate the realisation of the Tiramisu recipe and its temporary interruption by a secondary task. As shown in Figure 8, the graphical interface of the simulator is divided into five panes and a legend that defines and explains the symbols used. The "Process Log" pane shows the handled procedural knowledge as well as the goals achieved by procedures. The "Human Actions" pane simulates a scenario of the recipe realisation and describes both the execution of the recipe's steps and the internal operations of the model (data handling). The "Last Episode Remembered" pane shows an episodic knowledge entity representing a past event that the model succeeded in remembering. The "Tiramisu's Realisation" pane shows (i) the cook's view of the baking dish (in which only the last added ingredient is observable) and (ii) the order, in relation to the given scenario, of the layers of ingredients.
Figure 6. The Tiramisu structural diagram
Figure 7. Part of the Tiramisu procedural diagram
Figure 8. The graphical interface of the simulator
The interruption is represented in the diagram by a continuous red line. The interrupting task is represented in the "Secondary Task" pane; the reduction of Boolean expressions was chosen as the secondary task. All events which occur during a simulation are stored in an XML file. Memorising, rather than being a mechanical act of storage, assumes the perception of coherent and logical structures: the human brain does not record information as a computer does, but assimilates, sorts and organises knowledge entities in various systems that link them and give them a sense. Accordingly, the retrieval of data in our model requires, above all, the initial context in which episodes were encoded. To remember the second-last ingredient after an interruption, the model's reasoning is based on the interrupted goal. Once this goal has been identified by scanning the episodic contents of the XML file, the model (1) looks for the episode whose goal is of type "G_AddIngredient" and which is chronologically nearest to the interrupted goal; (2) identifies the cognition created when achieving that goal (through the realisation of its sub-goals, if the need arises), this cognition representing the last added ingredient; and (3) expresses the intention to add a new ingredient over the last one, and therefore formulates a new goal of type "G_AddIngredient". To remember the second-last ingredient in the critical situation where the last added layer is <cocoa>, the model creates a goal of type "G_Remember_SecondLastAdded", which can be achieved by the procedure "P_Get_SecondLastAdded". The latter scours the episodic memory for the goal that is second chronologically nearest to the interrupted one and of type "G_AddIngredient". This goal represents the intention to add the second-last ingredient before the interruption. The model again identifies the cognition created after the satisfaction of that goal. The search process in the episodic memory is facilitated by (1) the hierarchical chaining between episodes, which links any episode to its predecessor, and (2) the chronological order of episodes, which is determined by the "Time" slot of each episode. In addition to this slot, which indicates the time at which the episode occurred, episodes are characterised by the following slots: (1) "Identifier" is a unique number randomly generated by the system; (2) "Goal-Episode" points to the goal identifier and to those of the handled cognitions; (3) "Procedure-Episode" contains a reference to the selected procedure; (4) the cognition obtained by the application of the chosen procedure is stored in "Result"; (5) "Super-Episode" and (6) "Sub-Episodes" contain links to the outer and the inner hierarchical episodes, respectively; (7) "Status" takes a qualitative value (success, failure, on standby or aborted) according to the status of the current episode; and finally, (8) "Cost" comprises an estimated cost of the procedure usage.
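A hypothetical episode entry of the XML file might look as follows (built here with Python's standard library; every value is invented, only the slot names follow the description above):

import xml.etree.ElementTree as ET

ep = ET.Element("Episode", {
    "Identifier": "174530",    # unique, randomly generated number
    "Time": "00:04:12",        # when the episode occurred
    "Status": "success",       # success | failure | on standby | aborted
    "Cost": "0.3",             # estimated cost of the procedure usage
})
ET.SubElement(ep, "Goal-Episode").text = "G_AddIngredient"
ET.SubElement(ep, "Procedure-Episode").text = "P_AddCocoaOnBase"
ET.SubElement(ep, "Result").text = "cocoa-layer-added"       # the created cognition
ET.SubElement(ep, "Super-Episode").text = "174501"           # enclosing episode
ET.SubElement(ep, "Sub-Episodes").text = "174531 174532"     # descendant episodes
print(ET.tostring(ep, encoding="unicode"))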
Figure 9. A part of the episodic history
For example, Figure 9 shows the episodic history of layering <sugar> after verifying that the second-last added ingredient was the third layer of <base>. In that particular case (where the second-last added ingredient was <base>), and in order to recall the number of its already added layers (to decide which ingredient to lay over <base>), the model starts the process of remembering the number of <base> layers placed. It combines this second-last added ingredient with the cognition "OneBaseAdded". Then, the procedure "P_Get_BaseAddedRememberance" is called to achieve the goal "G_Remember_BaseAdded". This procedure seeks in the XML file of the episodic memory every episode with a goal of type "G_AddIngredient" whose result is a cognition of type <base>. If need be, the model associates, in order, the cognitions "TwoBasesAdded" and "ThreeBasesAdded" with the found episodes; these represent the fact of having already placed two layers, respectively three layers, of <base>.
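The chronological scan itself reduces to filtering and ordering episodes, as the following sketch shows (the episode records and ingredient names are invented; a real model would read them from the XML episodic memory):

episodes = [
    {"time": 1, "goal": "G_AddIngredient", "result": "base"},
    {"time": 2, "goal": "G_AddIngredient", "result": "biscuit"},
    {"time": 3, "goal": "G_AddIngredient", "result": "cocoa"},
    {"time": 4, "goal": "G_SecondaryTask", "result": None},   # the interruption
]

def nth_last_added(episodes, interruption_time, n):
    adds = [e for e in episodes
            if e["goal"] == "G_AddIngredient" and e["time"] < interruption_time]
    adds.sort(key=lambda e: e["time"], reverse=True)   # chronologically nearest first
    return adds[n - 1]["result"]

print(nth_last_added(episodes, 4, 1))   # 'cocoa': the last, observable layer
print(nth_last_added(episodes, 4, 2))   # 'biscuit': the hidden second-last layer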
Concluding Remarks
Through the Tiramisu experiment, we have shown that for momentarily interrupted realisations of a cooking recipe, the widely acknowledged ACT-R knowledge representation approach cannot offer a model that faithfully reproduces the usual human behaviour. More precisely, we highlighted the incapacity of the ACT-R theory to properly reproduce the recall of information in a temporal context (Najjar et al., 2005). We emphasized its failure in memorising remembrances, and we have shown that AURELLIO, which uses additional knowledge structures inspired by human memory, allows recollections to be encoded, retrieved and reproduced properly, and thus attempts to offer models that come close to natural human behaviour. Although our model has proved its success in remembering events in a much more natural way than the ACT-R model, its behaviour during the recall process, though it tries to resemble the human one, remains somewhat computational.
In fact, the localisation process of remembrances does not consist of exhaustively exploring the memory to pick out the interrelated events between which the sought recollection is located. In reality, the use of points of reference facilitates the expression of a given memory. These markers are states of consciousness imperatively necessary to the correct functioning of the recollection mechanism. If, to reach a distant recollection, one had to follow the whole series of events which separate one from it, memory would be impossible as a structure, because of the complexity of the operation. In this sense, our model should memorise knowledge so as to make it operational through a complex act of recall which unfolds in successive stages and which, ideally, considers phases of forgetfulness that belong to the process of memorising and recollection. Thus, it would be necessary to take account of an equation of recall using a law of activation which controls the access to episodes. In this way, a distant event, or a recollection whose encoding was not made properly, could not be completely and/or always recalled by the model (which is quite natural and usual for a human being).
Discussion
Cognitive informatics has shown that it is very beneficial to integrate into the new generation of software and information technologies the knowledge accumulated by studies of the internal information processing mechanisms and processes of the brain (Shao & Wang 2003; Wang, 2005). In this sense, we think that it would be advantageous and practical to draw on a psychological cognitive approach, which offers a fine-grained modelling of the human process of knowledge handling, for representing both the learner and the domain knowledge within virtual learning environments. Our hypothesis is that the proposed knowledge structures, because they are quite similar to those used by human beings, offer a more effective knowledge representation (for example, for tutoring purposes). In addition, we chose a parsimonious use of the cognitive structures suggested by psychology to encode knowledge. Indeed, we divide these structures into two categories: on one hand, semantic and procedural knowledge, which is common, potentially accessible and can be shared, with various degrees of mastery, by all learners; and, on the other hand, episodic knowledge, which is specific to each learner and whose contents depend on the way in which the common knowledge (semantic and procedural) is perceived and handled. More precisely, primitive units of semantic and procedural knowledge, chosen with a small level of granularity, are used to build complex knowledge entities which are dynamically combined in order to represent the learner's knowledge. The dynamic aspect is seen in the non-predefined combinations between occurrences of concepts and the applied procedures handling them, which translate the learner's goals. Generally, the complex procedure "P" selected to achieve a given goal "G" determines the number and order of the subgoals of "G" (each of which can be achieved, in turn, by a procedure called, in this case, a sub-procedure of "P"). The choice of "P" depends on the learner's practices and preferences when s/he achieves the task. This means that goal realisation can proceed in various ways, through various scenarios of procedure execution sequences. Therefore, the number and chronological order of the subgoals of "G" are not predefined, and the learner's cognitive activity is not determined systematically, in a static way, from her/his main goal. Traces of this cognitive activity during problem solving are stored as specific episodic knowledge. This allows a tutor to scan the episodic knowledge model that the system formulates in its representation of the learner to determine, via reasoning strategies, the degree of mastery of procedural knowledge and/or the acquisition level of semantic knowledge. Our distinction between semantic and procedural knowledge is mainly based on the criteria of the ACT-R theory (Anderson, 1993). However, AURELLIO takes into account an additional component of declarative memory: the episodic memory, a structure which is characterised by the capacity to encode information about lived facts (Richards & Goldfarb, 1986). Humphreys et al. (1989), for instance, affirm that cognitive models which do not make a distinction between declarative and episodic memory cannot distinguish various occurrences of the same element of knowledge2. An episodic structure appears advantageous for the analysis of the trace of learning tasks. This trace can be examined for a better understanding of the learner's reasoning.
In addition, the episodic knowledge structuring suggested by AURELLIO places any episode in a novel hierarchical context. The proposed structures connect concrete episodes in a hierarchy which is not of the generalisation/specialisation type (Weber & Brusilovsky, 2001), but rather of the event/sub-events type, where the event is represented by an ancestor episode and the sub-events are represented by a line of descendant episodes. Thus, for
an adequate strategic reasoning, it is possible, for example, for an intelligent agent to scan and scrutinise the episodic history in order to extract relevant indices directly from concrete episodes. Another original aspect of our approach is the explicit introduction of goals into the knowledge representation. Although they are handled by means of procedures, we consider goals to be a special case of knowledge that represents the intentions behind the actions of the cognitive system. That is, a goal is seen as a semantic knowledge entity which describes a state to be reached. The fact that there exists a particular form of energy employed to acquire goals distinguishes them from any standard form of knowledge. This distinction involves a different treatment for goals in the human cognitive architecture (Altmann & Trafton, 2002). We propose to treat goals explicitly, reifying them as particular semantic knowledge which is totally distinct from that which represents objects.
Conclusion
We have presented a knowledge representation approach which combines cognitive theories on knowledge handling and acquisition with computational modelling techniques. We have described an authoring tool which allows any domain knowledge to be represented according to the proposed approach. We have depicted practical validations of the authoring tool and the related knowledge representation models, and we have emphasized some original aspects of our theory. We are currently investigating a new idea for integrating pedagogic and didactic knowledge into our knowledge representation approach. We are also performing advanced user tests with the authoring tool; the test results will undoubtedly lead to improvements of the software. In parallel, we are elaborating the equation of recall which uses an activation law to control the degree of remembrance of episodes in memory.
Acknowledgment
The authors want to thank (1) Froduald Kabanza for his instructive comments on an early version of this chapter, (2) Jean-François Lebeau and Francis Bouchard for their help with the realisation of the Tiramisu experiment and (3) Philippe Fournier-Viger for his contribution to the realisation of the Boolean reduction VLE and the Tiramisu experiment.
References
Aleven, V. & Koedinger, K. 2002. An Effective Metacognitive Strategy: Learning by doing and explaining with computer-based Cognitive Tutors. Cognitive Science, 26(2), 147-179.
Altmann, E. & Trafton, J. 2002. Memory for goals: An Activation-Based Model. Cognitive Science, 26, 39-83.
Anderson, J. R. 1993. Rules of the mind. Lawrence Erlbaum.
Anderson, J. R., Corbett, A. T., Koedinger, K. R. & Pelletier, R. 1995. Cognitive Tutors: Lessons learned. The Journal of Learning Sciences, 4(2), 167-207.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. 2004. An integrated theory of the mind. Psychological Review, 111(4), 1036-1060.
Anderson, J. R., & Ross, B. H. 1980. Evidence against a semantic-episodic distinction. Journal of Experimental Psychology: Human Learning and Memory, 6, 441-466.
Baddeley, A. 1990. Human Memory: theory and practice. Hove (UK): Lawrence Erlbaum.
Brusilovsky, P. & Peylo, C. 2003. Adaptive and intelligent Web-based educational systems. International Journal of AI in Education, 13(2), 159-172.
Collins, M. & Loftus, F. 1975. A spreading activation theory of semantic processing. Psychological Review, 82, 407-428.
Corbett, A., Mclaughlin, M. & Scarpinatto, K. C. 2000. Modeling Student Knowledge: Cognitive Tutors in High School and College. Journal of User Modeling and User-Adapted Interaction, 10, 81-108.
de Rosis, F. 2001. Towards adaptation of interaction to affective factors. Journal of User Modeling and User-Adapted Interaction, 11(4).
Gagné, R., Briggs, L. & Wager, W. 1992. Principles of Instructional Design (4th edition). New York: Holt, Rinehart & Winston.
Garagnani, M., Shastri, L., & Wendelken, C. 2002. A connectionist model of planning as back-chaining search. Proceedings of the 24th Conference of the Cognitive Science Society, Fairfax, Virginia, USA, pp. 345-350.
Halford, G. S. 1993. Children's understanding: The development of mental models. Hillsdale, NJ: Lawrence Erlbaum Associates.
Heermann, D. & Fuhrmann, T. 2000. Teaching physics in the virtual university: the Mechanics toolkit. Computer Physics Communications, 127, 11-15.
Hermann, D. & Harwood, J. 1980. More evidence for the existence of separate semantic and episodic stores in long-term memory. Journal of Experimental Psychology, 6(5), 467-478.
Humphreys, M. S., Bain, J. D. & Pike, R. 1989. Different ways to cue a coherent memory system: A theory for episodic, semantic and procedural tasks. Psychological Review, 96, 208-233.
Kokinov, B. & Petrov, A. 2000. Dynamic extension of episode representation in analogy-making in AMBR. Proceedings of the 22nd Conference of the Cognitive Science Society, NJ, pp. 274-279.
Lintermann, B. & Deussen, O. 1999. Interactive Structural and Geometrical Modeling of Plants. IEEE Computer Graphics and Applications, 19(1).
Najjar, M., Fournier-Viger, P., Lebeau, J. F. & Mayers, A. 2006. Recalling Recollections According to Temporal Contexts: Applying a Novel Cognitive Knowledge Representation Approach. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06), July 17-19, Beijing, China.
Najjar, M., Fournier-Viger, P., Mayers, A. & Bouchard, F. 2005. Memorising Remembrances in Computational Modelling of Interrupted Activities. Proceedings of the 7th International Conference on Computational Intelligence and Natural Computing, July 21-26, Salt Lake City, Utah, USA, pp. 483-486.
Neely, J. H. 1989. Experimental dissociations and the episodic/semantic memory distinction. Experimental Psychology: Human Learning and Memory, 6, 441-466.
Richards, D. D. & Goldfarb, J. 1986. The episodic memory model of conceptual development: an integrative viewpoint. Cognitive Development, 1, 183-219.
Rzepa, H. & Tonge, A. 1998. VChemlab: A virtual chemistry laboratory. Journal of Chemical Information and Computer Sciences, 38(6), 1048-1053.
Shao, J. & Wang, Y. 2003. A New Measure of Software Complexity based on Cognitive Weights. IEEE Canadian Journal of Electrical and Computer Engineering, 28(2), pp. 69-74.
Shastri, L. 2002. Episodic memory and cortico-hippocampal interactions. Trends in Cognitive Sciences, 6, 162-168.
Sweller, J. 1988. Cognitive load during problem solving: effects on learning. Cognitive Science, 12, 257-285.
Tulving, E. 1983. Elements of Episodic Memory. New York: Oxford University Press.
Weber, G. & Brusilovsky, P. 2001. ELM-ART: An adaptive versatile system for Web-based instruction. International Journal of AI in Education, 12(4), 351-384.
Wang, Y. 2003. Cognitive Informatics: A New Transdisciplinary Research Field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), pp. 115-127.
Wang, Y. & Kinsner, W. 2006. Recent Advances in Cognitive Informatics. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), March, pp. 121-123.
Wang, Y. & Wang, L. 2006. Cognitive Informatics Models of the Brain. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), March, pp. 203-207.
Wang, Y., Liu, D., & Wang, Y. 2003. Discovering the Capacity of Human Memory. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), pp. 189-198.
Wang, Y. 2005. The Development of the IEEE/ACM Software Engineering Curricula. IEEE Canadian Review, 51(2), May, pp. 16-20.
Wells, L. K. & Travis, J. 1996. LabVIEW for Everyone: Graphical Programming Made Even Easier. Prentice Hall, NJ.
Endnotes
1. Semantic, procedural, episodic and goal-based knowledge representation theory.
2. To fill this gap, ACT-R permits the creation of several instances of the same chunk type in order to indirectly simulate the existence of an episodic memory. However, this form of memory does not have an explicit structure.
Chapter XVIII
A Fixpoint Semantics for Rule-Base Anomalies
Du Zhang
California State University, USA
Abstract
A crucial component of an intelligent system is its knowledge base, which contains knowledge about a problem domain. Knowledge base development involves domain analysis, context space definition, ontological specification, and knowledge acquisition, codification and verification. Knowledge base anomalies can affect the correctness and performance of an intelligent system. In this chapter, we describe a fixpoint semantics for a knowledge base that is based on a multi-valued logic. We then use the fixpoint semantics to provide formal definitions for four types of knowledge base anomalies: inconsistency, redundancy, incompleteness, and circularity. We believe such formal definitions of knowledge base anomalies will help pave the way for a more effective knowledge base verification process.
Introduction
Computing plays a pivotal role in our understanding of human cognition (Pylyshyn, 1989). The classical cognitive architecture for intelligent behavior assumes that both computers and minds have at least the following three distinct levels of organization (Pylyshyn, 1989). (a) The semantic or knowledge level, where the behavior of human beings or appropriately programmed computers can be explained through the things they know and the goals they have. It attempts to establish, in some meaningful or even rational way, connections between the actions (by human or computer) and what the actors know about their world. (b) The symbol level, where the semantic content of knowledge and goals is assumed to be encoded through structured symbolic expressions. It deals with the representation, structure and manipulation of symbolic expressions. (c) The physical or biological level, where the physical embodiment of an entire system (human or computer) is considered. It encompasses the structure and the principles by which a physical object functions. Pylyshyn's cognitive penetrability criterion states that "the pattern of behavior can be altered in a rational way by changing subjects' beliefs about the task" (Pylyshyn, 1989). It is the subjects' tacit knowledge about the world, not the properties of the architecture, that enables such behavior adjustment.
The hallmark of a knowledge-based system is that, by design, it possesses the ability to be told facts about its world and to alter its behavior accordingly (Brachman & Levesque, 2004); it exhibits the property of cognitive penetrability. Today, knowledge-based systems not only play an important role in furthering the study of cognitive informatics (Wang et al., 2002; Patel et al., 2003; Chan et al., 2004; Kinsner et al., 2005; Wang, 2002, 2007; Wang and Kinsner, 2006), but have also found their way into many problem domains (Cycorp, 2006) and have been utilized to generate numerous successful applications (IBM, 2006; Ross, 2003). A crucial component of an intelligent system or a knowledge-based system is its knowledge base (KB), which contains knowledge about a problem domain (Brachman & Levesque, 2004; Fagin et al, 1995; Levesque & Lakemeyer, 2000). Knowledge base development involves domain analysis, context space definition, ontological specification, and knowledge acquisition, codification and verification (Zhang, 2005). When developing a KB for an application, it is important to recognize the context under which we formulate and reason about domain-specific knowledge. A context is a region in some n-dimensional space (Lenat, 1998). In a KB development process, domain analysis should result in the identification of the region of interest in the context space. Specifying a context entails specifying or locating a point or region along each of those n dimensions. Once the context (or contexts) for a problem domain is identified, ontological development is in order. An ontology is a formal, explicit specification of a shared conceptualization (Chandrasekaran et al, 1999; Gomez-Perez et al, 2004; O'Leary, 1998). After the conceptualization is in place, knowledge acquisition, codification and verification can be carried out to build the KB for the application. Inevitably, there will be anomalies in a KB as a result of existing practices in its development process. Knowledge base anomalies can affect the correctness and performance of an intelligent system, though some systems are robust enough to perform rationally in the presence of anomalies. It is necessary to define KB anomalies formally before identifying where they are in a KB and deciding what to do with them. In this chapter, our focus is on formal definitions of KB anomalies and on the issue of how to identify them. Our attention is on rule-based KBs. A rule-based KB has a set of facts that is stored in a working memory (WM) and a set of rules stored in a rule base (RB). Rules represent general knowledge about an application domain. They are entered into an RB during initial knowledge acquisition or subsequent KB updates. Facts in a WM provide specific information about the problems at hand and may be elicited dynamically from the user during each problem-solving session, elicited statically from the domain expert during the knowledge acquisition process, or derived through rule deduction. We assume that rules in a KB have the following format: P1 ∧ ... ∧ Pn → R, where the Pi are the conditions (collectively, the left-hand side, LHS, of a rule), R is the conclusion (or right-hand side, RHS, of a rule), and the symbol "→" is understood as logical implication. The Pi and R are literals. If the conditions of a rule instance are satisfied by facts in WM, then the rule is enabled, and its firing deposits its conclusion into WM. A fact is represented as a ground atom.
It specifies an instance of a relationship among particular objects in the problem domain. WM contains a collection of positive ground atoms which are deposited through either assertion (initial or dynamic) or rule deduction. A negated condition ¬p(x) in the LHS of a rule is satisfied if p(x) is not in WM for any x. A negated ground atom ¬p(a) in the LHS of a rule is satisfied if p(a) is not in WM. A negated conclusion ¬R in the RHS of a rule results in the removal of R from WM when the LHS of the rule is satisfied1. Rule instances and negated literals can be utilized by the inference system, but are never deposited into WM (Ginsberg & Williamson, 1993). Let WM0 denote the initial state of WM. We use WMi (i = 1, 2, 3, ...) to represent the subsequent states of WM obtained by firing all enabled rules under the state WMi-1. For the basic concepts and terminology of first-order predicate logic, readers are referred to (Ben-Ari, 1993; Chang & Lee, 1973). The rest of the chapter is organized as follows. Section 2 offers a brief review of related work. Section 3 discusses the four types of KB anomalies. Section 4 describes the fixpoint semantics we adopt for a KB. Formal definitions of the KB anomalies are given in Section 5 in terms of the fixpoint semantics. Finally, Section 6 concludes the chapter with remarks on future work.
Related Work
Early work in rule-base verification and validation treated the anomalies as specific deficiencies and focused on devising algorithms to detect them. The approaches were based on formalizing a KB either in terms of some graphical model (Zhang, 1994), as a quasi logic theory (Ginsberg & Williamson, 1993), or in some knowledge representation formalism (Nguyen et al, 1987). For a summary of the previous results and additional references, please refer to (Menzies & Pecheur, 2005). There are three ways to assign meanings to a logical theory: operational, model-theoretic, and fixpoint (van Emden & Kowalski, 1976). Operational semantics is based on the deduction of a set of ground atoms through some inference method. Model-theoretic semantics defines a set of ground atoms that are a logical consequence of the given logical theory. Fixpoint semantics associates a transformation with the given logical theory and uses the transformation's fixpoint to denote the meanings of components in the theory. All three semantics are proved to be equivalent for certain logical theories (e.g., definite clause programs) (van Emden & Kowalski, 1976). In goal-oriented inference systems, various computation rules compute the relations determined by the fixpoint semantics (van Emden & Kowalski, 1976). However, because of the discrepancy between a first-order logical theory and a KB (Rushby & Whitehurst, 1989), the existing results thus far provide only some conservative semantics for KBs. Model-theoretic flavored approaches include (Zhang & Luqi, 1999; Levy & Rousset, 1998; Rushby & Whitehurst, 1989). Both consistency and completeness are defined in (Rushby & Whitehurst, 1989), whereas the work in (Levy & Rousset, 1998) addresses consistency, completeness and dependency. The results in (Zhang & Luqi, 1999) discuss not only consistency and completeness, but also redundancy and circularity. The difficulties in defining an operational semantics for a KB are discussed in (Rushby & Whitehurst, 1989), where an approximate imperative semantics for a KB is outlined that relies on establishing an invariant for the rule base. Though the fixpoint approach has been used to delineate semantics in logic programming (see Section 4) and to underpin the concept of common knowledge in (Fagin et al, 1995), it has not been utilized to characterize KB anomalies. Finally, the issues of knowledge consistency and completeness are also dealt with in the context of default reasoning (Brachman & Levesque, 2004).
Rule-Base Anomalies
In this chapter, we are interested in the formal definitions of the following KB anomalies:

1. Inconsistency.
2. Redundancy.
3. Incompleteness.
4. Circularity.
Before we proceed to the definitions of KB anomalies in terms of the fixpoint semantics, we need to say a few words about the language used for the KB. We assume that the language is expressive enough to allow the following terms to be properly defined: same, synonymous, complementary, mutually exclusive, and incompatible literals (L1 and L2 in Table 1 are literals).
Inconsistency
Under the semantics of classical logic, if a KB derives conclusions that are contradictory, the KB is said to contain inconsistent knowledge. The root cause of KB inconsistency lies in the rules in the RB, but its manifestation is through the WM. For instance, the inconsistency of an RB containing the pair of rules {p(x) → q(x), p(x) → ¬q(x)} is not apparent until a fact p(a) is asserted into WM and both rules are enabled and fired. In general, although the rules in an RB may be consistent on their own (because there exists a model for them), they can form an inconsistent
theory when combined with certain facts in WM. In order for a KB to be consistent, there needs to be a model for both RB and WM. On the other hand, facts in WM change over time due to dynamic assertions and retractions. An RB may be consistent with WMi, but inconsistent with WMj, where i ≠ j. Thus, relying on a particular WM state in verifying the consistency of an RB may not produce an accurate result. KB consistency has both spatial and temporal properties. Spatially, KB consistency can be local within a context or global among several contexts. Temporally, it can be transient with regard to some WMi or persistent for all WMi. There are logical consistency and output consistency (whether the same set of inputs produces the same set of outputs or several sets of outputs) (Rushby & Whitehurst, 1989). In this chapter, we are interested in characterizing logical, intra-context and persistent consistency in a KB. Our goal is to find a way to identify the types of inconsistency that result in derivations of complementary, mutually exclusive, or incompatible conclusions. KB inconsistency can be attributed to a number of factors:
• Merging or fusing of knowledge bases results in disagreeing rules being introduced.
• In a distributed knowledge base environment, where a federation of geographically dispersed KBs is deployed, contradictory knowledge can be present at different sites.
• The development language for a KB allows certain literals, such as complementary, mutually exclusive, or incompatible ones, to be used expressively in the KB, and the ontology explicitly sanctions those concepts.
• There is a lack of explicit constraints in the ontology specification (e.g., the ontology does not specify that animals and vegetables are mutually exclusive and jointly exhaustive among living things).
• Honest programming errors in complex applications may be possible causes.
• The need for assertion lifting (i.e., lifting, or importing, assertions from one context to another (Lenat, 1998)) may also be a cause.
• Redundancy-induced circumstances, where some redundant rule is modified but others are not (see below).
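The way inconsistency stays latent in the RB and only surfaces through the WM can be seen in a toy forward-chaining run of the example rules above (unlike the chapter's WM, which stores only positive ground atoms, this Python sketch records signed conclusions so the complementary pair becomes visible):

# Ground instances of RB = {p(x) -> q(x), p(x) -> ¬q(x)}; a literal is (sign, atom).
rules = [
    (("+", "p(a)"), ("+", "q(a)")),
    (("+", "p(a)"), ("-", "q(a)")),
]
wm = {("+", "p(a)")}                   # WM0: the fact p(a) is asserted

changed = True
while changed:                         # fire all enabled rules until nothing new appears
    changed = False
    for cond, concl in rules:
        if cond in wm and concl not in wm:
            wm.add(concl)
            changed = True

# Inconsistency manifests as a complementary pair q(a) and ¬q(a) in the same state.
atoms = {atom for _, atom in wm}
print({a for a in atoms if ("+", a) in wm and ("-", a) in wm})   # {'q(a)'}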
Redundancy
Declaratively, KB redundancy neither diminishes nor increases the set of KB entailments. Operationally, KB redundancy may lead to the following anomalous situations. (a) During KB maintenance or evolution, if one of the redundant rules is modified and the others remain unchanged, then the updated KB will not correspond to
the intended change, and inconsistencies can be introduced as well. (b) For a KB where no certainty factors are utilized, redundant rules may be enabled under a given state, resulting in a performance slowdown because all the enabled redundant rules may be fired, even though their firings yield the same set of literals (conclusions). (c) For a KB containing certainty factors, redundancy becomes a serious problem: each redundant rule's firing results in a duplicate counting of the same information, which, in turn, erroneously increases the level of confidence assigned to the derived literals (conclusions). This may ultimately affect the set of deducible literals. Spatially, KB redundancy can occur within a context or among several contexts. There are several types of redundancy (Zhang & Luqi, 1999). Causes of redundancy include knowledge base merging, programming errors, and assertion lifting.

Table 1. Same, synonymous, complementary, mutually exclusive, and incompatible literals (L1 and L2 are literals)

Identical syntax, equivalent semantics. Same, denoted L1 = L2: L1 and L2 are syntactically identical (same predicate symbol, same arity, and same terms at corresponding positions).*
Different syntax, equivalent semantics. Synonymous, denoted L1 ≅ L2: L1 and L2 are syntactically different, but logically equivalent.†
Identical syntax, conflicting semantics. Complementary, denoted L1 # L2: L1 and L2 are an atom and its negation.
Different syntax, conflicting semantics. Mutually exclusive: L1 and L2 are syntactically different and semantically have opposite truth values. Incompatible, denoted L1 ≭ L2: L1 and L2 are a complementary pair of synonymous literals.

* Given two rules ri and rk, if LHS(ri) = {P1,...,Pn} and LHS(rk) = {P1′,...,Pn′}, then LHS(ri) = LHS(rk) iff ∀i ∈ [1, n] Pi = Pi′.
† Given two rules ri and rk, if LHS(ri) = {P1,...,Pn} and LHS(rk) = {P1′,...,Pn′}, then LHS(ri) ≅ LHS(rk) iff ∀i ∈ [1, n] Pi ≅ Pi′.
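Situation (c) can be made concrete with the parallel-combination rule of MYCIN-style certainty factors (an assumption here, since the chapter does not fix a particular certainty calculus):

def combine(cf1, cf2):
    # MYCIN parallel combination for two positive certainty factors.
    return cf1 + cf2 * (1.0 - cf1)

cf_rule = 0.7                        # confidence contributed by one rule firing
once  = cf_rule                      # only the intended rule fires
twice = combine(cf_rule, cf_rule)    # a redundant duplicate fires as well
print(once, twice)                   # 0.7 vs 0.91: the same evidence counted twice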
Incompleteness
A KB is incomplete when it does not have all the necessary information to answer a question of interest in an intended application (Brachman & Levesque, 2004; Levesque, 1984; Rushby & Whitehurst, 1989). Thus, completeness represents a query-centric measure of the quality of a KB. It is a challenging issue for the following reasons: (a) In many applications, the KB is built in an incremental and piecemeal fashion and undergoes continual evolution; the information acquired at each stage of the evolution may be vague or indefinite in nature. (b) The deployment of a KB system cannot simply wait for the KB to be stabilized in some final and complete form, since this may never happen. Despite the fact that a practical KB may never exhaustively capture knowledge in all aspects of a real problem domain, it is still possible for a KB to be complete for a specific area in the domain. The boundaries of this specific area can be defined in terms of all relevant queries that can be asked during problem-solving sessions. If a KB has all the information needed to answer those relevant queries definitely, then the KB is complete with regard to those queries. The concepts of relevant queries and the ability of a KB to answer those queries are what underpin our discussion of KB completeness. Given a KB, we define ℙKB and ℙA as the sets of all predicate symbols and of askable predicate symbols in the KB, respectively. An askable predicate symbol is one that can appear in a query. Usually it is the case that ℙKB ⊇ ℙA. A query q containing predicate symbols pi, ..., pj ∈ ℙA is denoted q ≃ ε(pi, ..., pj). A set ℚ of relevant queries is now defined as follows:

ℚ = {q | q appears in some query session ∧ q ≃ ε(pi, ..., pj) ∧ pi, ..., pj ∈ ℙA}.

Given a query q ∈ ℚ, the answer to q, denoted α(q), can be either definite or unknown: α(q) is definite if either KB ⊢ q or KB ⊢ ¬q; α(q) is unknown if neither KB ⊢ q nor KB ⊢ ¬q.
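A completeness check against a set of relevant queries then amounts to testing whether every answer is definite, as in this sketch (the derivable set stands in for an actual prover, and the example atoms are invented):

def answer(derivable, q):
    # α(q): definite if KB ⊢ q or KB ⊢ ¬q, unknown otherwise.
    return "definite" if ("+", q) in derivable or ("-", q) in derivable else "unknown"

derivable = {("+", "bird(tweety)"), ("-", "fish(tweety)")}
relevant  = ["bird(tweety)", "fish(tweety)", "mammal(tweety)"]
print({q: answer(derivable, q) for q in relevant})
print(all(answer(derivable, q) == "definite" for q in relevant))  # False: KB incomplete w.r.t. ℚ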
Circularity

There are many circular phenomena in computer science and AI. The theory of non-well-founded sets has been utilized to study such phenomena in (Barwise & Moss, 1996). Circularity in a KB has been informally defined as a set of rules forming a cycle (Chang et al., 1990; Nguyen et al., 1987; Rushby, 1988). What exactly KB circularity entails semantically is not that clear in the literature. In (Zhang & Luqi, 1999), KB circularity is defined in terms of the derivation of tautologous rules. The phenomenon reflects an anomalous situation in a KB and has both operational and semantic ramifications. Operationally speaking, circular rules may result in infinite loops during inference (if an exit condition is not properly defined), thus hampering the problem solving process. Semantically speaking, the fact that a tautologous formula is derivable indicates that the circular rule set encompasses knowledge that is always true regardless of any problem-specific information. In general, tautologous formulas are those that are true by virtue
of their logical form and thus provide no useful information about the domain being described (Genesereth & Nilsson, 1987). Therefore, circular rules prove to be less useful in the problem solving process. What is needed, as evidenced in many real KB systems, is a set of consistent rules that are triggered by problem-specific information (facts), rather than tautologous rules that are true regardless of the problem to be solved.
Fixpoint Semantics for Knowledge Base

There are a number of fixpoint semantics for a logical theory (Fitting, 1991; Fitting, 2002): classical two-valued, two-valued with stratification, three-valued for handling negation, four-valued for dealing with inconsistency and incompleteness, and the truth value space of [0, 1]. In this chapter, we adopt the four-valued logic FOUR as defined in (Belnap, 1977; Ginsberg, 1988). FOUR has the truth value set {true, false, ⊥, ⊤}, where true and false have their canonical meanings in the classical two-valued logic, ⊥ indicates undefined or don't know, and ⊤ indicates overdefined or contradiction (Figure 1). The four-valued logic FOUR is the smallest nontrivial bilattice, a member of a family of similar structures called bilattices (Ginsberg, 1988). Bilattices offer a general framework for reasoning with multi-valued logics and have many theoretical and practical benefits (Ginsberg, 1988). As can be seen later in Section 5, the fixpoint characterizations of rule-base anomalies make an automated anomaly detection process possible. According to (Belnap, 1977), there are two natural partial orders in FOUR: the knowledge ordering ≤k (vertical) and the truth ordering ≤t (horizontal), such that ⊥ ≤k false ≤k ⊤, ⊥ ≤k true ≤k ⊤, and false ≤t ⊤ ≤t true, false ≤t ⊥ ≤t true. Both partial orders form a complete lattice. The meet and join for ≤k, denoted ⊗ and ⊕, respectively, yield false ⊗ true = ⊥ and false ⊕ true = ⊤. The meet and join for ≤t, denoted ∧ and ∨, respectively, yield ⊤ ∧ ⊥ = false and ⊤ ∨ ⊥ = true. The knowledge negation reverses the ≤k ordering while preserving the ≤t ordering. The truth negation reverses the ≤t ordering while preserving the ≤k ordering. For a knowledge base Ω, we define a transformation TΩ, which is a "revision operator" (Fitting, 2002) that revises our beliefs based on the rules in RB and the established facts in WM. The interpretation of TΩ can be understood in the following sense: a single step of TΩ applied to Ω amounts to generating a set of ground literals, denoted ⊢WMi RB, obtained by firing all enabled rules in RB under WMi. It can be shown that TΩ is monotonic and has a least fixpoint lfp(TΩ) with regard to ≤k (Fitting, 1991; Fitting, 2002). Since a monotonic operator also has a greatest fixpoint, denoted gfp(), gfp(TΩ) exists and, by the Knaster-Tarski theorem, can be expressed as gfp(TΩ) = ⋃{B | B ⊆ TΩ(B)}.

Figure 1. The four-valued logic FOUR
Because of the definition of TΩ, lfp(TΩ) is identical to gfp(TΩ) for a given knowledge base Ω. Operationally, the fixpoint of TΩ for a KB can be obtained as follows. Given a set G of initial facts, WM0 is initialized based on G:

i := 0; Φ0 := G; Φ1 := Φ0 ∪ ⊢WM0 RB;
while (Φi+1 ≠ Φi) do { i := i + 1; Φi+1 := Φi ∪ ⊢WMi RB };
lfp(TΩ) = gfp(TΩ) = Φi.

lfp(TΩ) (equivalently, gfp(TΩ)) contains all the conclusions derivable from the KB through some inference method, thus constituting the semantics for the KB. In the following fixpoint descriptions of rule-base anomalies, we simply use lfp() in the definitions. Let ν be a mapping from ground atomic formulas to FOUR. Given a ground atomic formula A:

ν(A) = true, if A ∈ lfp(TΩ) ∧ ¬A ∉ lfp(TΩ);
ν(A) = false, if ¬A ∈ lfp(TΩ) ∧ A ∉ lfp(TΩ);
ν(A) = ⊤, if A ∈ lfp(TΩ) ∧ ¬A ∈ lfp(TΩ);
ν(A) = ⊥, if A ∉ lfp(TΩ) ∧ ¬A ∉ lfp(TΩ).
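As an illustration only, the iteration above can be phrased in a few lines of Python. The rule representation (premise set, conclusion) and the helper names below are assumptions, not the chapter's notation; negation is encoded as a ('not', atom) tuple.

```python
# Minimal sketch of the fixpoint iteration: propositional rules are pairs
# (premises, conclusion), where premises is a frozenset of literals.
def lfp(rules, facts):
    phi = set(facts)                                    # Phi_0 = G
    while True:
        fired = {c for (prem, c) in rules if prem <= phi}  # |-WMi RB
        nxt = phi | fired                               # Phi_{i+1}
        if nxt == phi:                                  # fixpoint reached
            return phi
        phi = nxt

def nu(atom, fix):
    """Valuation nu into FOUR for a ground atom, given lfp(T_Omega)."""
    pos, neg = atom in fix, ("not", atom) in fix
    return {(True, False): "true", (False, True): "false",
            (True, True): "top", (False, False): "bottom"}[(pos, neg)]
```

For example, lfp([(frozenset({"a"}), "b")], {"a"}) returns {"a", "b"}, and nu("b", ...) evaluates to "true".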
Fixpoint Semantics for Rule-Base Anomalies

With the least fixpoint semantics for a KB in place, we can study the formal definitions of knowledge base anomalies, whose detection and resolution are among the important issues in developing reliable and correct knowledge bases.
Consistency

Given a knowledge base Ω, we obtain its least fixpoint lfp(TΩ). The KB is said to contain inconsistent knowledge if the following holds: ∃hi, hk ∈ lfp(TΩ), i ≠ k, such that ν(hi) = ⊤, or ν(hk) = ⊤, or hi and hk are mutually exclusive, or hi ≭ hk. When lfp(TΩ) contains a pair of either complementary, mutually exclusive, or incompatible literals, the KB contains inconsistency.
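A hedged sketch of the core of this consistency test, on the same illustrative representation as above (negation as a ('not', atom) tuple); the mutually-exclusive and incompatible checks would reuse the literal relations sketched earlier.

```python
# Illustrative only: the KB is inconsistent when some atom and its negation
# both appear in lfp(T_Omega), i.e., nu(A) = top for some ground atom A.
def inconsistent(fix):
    return any(("not", a) in fix for a in fix if not isinstance(a, tuple))
```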
Redundancy

Let FB stand for a list of facts and ri a rule in RB. For a knowledge base Ω = RB ∪ FB, let us consider another knowledge base Ω′ = RB′ ∪ FB, where RB′ = RB − ri. Ω contains redundancy if lfp(TΩ) = lfp(TΩ′): Ω′ yields the same set of derivable assertions as Ω with one rule fewer (|Ω′| < |Ω|), so the rule ri is redundant.

where r is the reduction ratio between two successive coverings (r = rk / rk−1 for k > 0; e.g., r = (1/4) / (1/2) = 1/2, as shown in Fig. 4), and N is the increase rate of the number of vels between two successive coverings (N = Nk / Nk−1; e.g., N = 9/3 = 3, as shown in Fig. 4). For the Sierpinski gasket shown in Fig. 4, having complexity between a line and a plane, the self-similarity dimension can be computed from

$$D_S = \lim_{k\to\infty} \frac{\log N_k}{\log (1/r_k)} = \lim_{k\to\infty} \frac{\log 3^k}{\log\bigl(1/(1/2)^k\bigr)} = \lim_{k\to\infty} \frac{k \log 3}{k \log 2} = \frac{\log 3}{\log 2} = 1.5850\ldots \qquad (8)$$
It is seen that the self-similarity fractal dimension of a pure mathematical fractal can be obtained from a single measurement at any scale for k > 0. For the Cantor set (see Kinsner, 1994a), having complexity between a point and a line, the self-similarity dimension is DS ≈ 0.6309.
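Since a strict monofractal needs only one scale, Eq. (8) reduces to a one-line computation; a small Python check (names illustrative):

```python
# D_S = log N / log(1/r) for a strictly self-similar fractal at any scale.
from math import log

def d_s(n_increase, r_ratio):
    return log(n_increase) / log(1.0 / r_ratio)

print(d_s(3, 0.5))      # Sierpinski gasket: log 3 / log 2 = 1.5850...
print(d_s(2, 1.0 / 3))  # Cantor set: log 2 / log 3 = 0.6309...
```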
Figure 4. Construction of the Sierpinski triangular gasket
2.3 Hausdorff Dimension, DH

In 1919, Felix Hausdorff proposed an iterative multiple-scale subdivision procedure to define a measure of any irregular object. Since this multiscale measure is fundamental to a group of morphological fractal dimensions, these fractal dimensions bear his name. The concept of successive coverings of a given fractal object is very simple:

(i) COVER the object by using the concept of neighbourhood, which could be a small region of any shape (often called the Borel ball, or a volume element, vel for short), centered on a point either on or in the vicinity of the fractal. If the fractal is embedded in a specific Euclidean dimension (e.g., the Koch curve is embedded in DE = 2), use a neighbourhood of the same embedding dimension and size r to cover the object. Notice that r must be smaller than the fractal object itself; otherwise, Nk reaches a saturation level at 1.
(ii) For a given size rk, COUNT the number Nk of vels required to cover the object.
(iii) REDUCE the size of the vel, and REPEAT Step (ii) UNTIL no further detail is seen (i.e., another saturation level is reached).

If we have any two successive measurements for k−1 and k, then the Hausdorff fractal dimension can be computed from

$$D_{H_k} = \frac{\log (N_k / N_{k-1})}{\log (r_{k-1} / r_k)} \qquad (9)$$

In the limit, the expression becomes

$$D_H = \lim_{k\to\infty} \frac{\log N_k}{\log (1/r_k)} \qquad (10)$$
As it should, the expression resembles both the length dimension and the self-similarity dimension, but it now relates to more general irregular fractals. It is important to notice that while the fractal dimension of a strict monofractal can be computed from a single scale, an irregular or stochastic fractal must be computed from at least three scales rk. Each scale produces just a single point in the log-log plot of Nk vs rk, and that point must never be taken as DH because it is a function of the scale rk (a common mistake made by novices in this area). Instead, a linear regression must be done on the points (except for any saturation points at the extreme values of the scale) to obtain the slope m. The dimension is then DH = m. This procedure applies to all the successive multiscale calculations. Another important problem is related to the concept of covering. Since there is no single definition of the covering, several distinct implementation techniques have evolved, three of which are shown in Fig. 5 (Kinsner, 1994a). Figure 5a shows the minimum number of regular vels of radius r required to cover a fractal F completely, while Fig. 5b shows the opposite extremum: the maximum non-overlapping vels that can be used to cover F. Figure 5c shows adjacent vels forming a mesh, and the resulting dimension is often called the box-counting dimension. Notice that the first two techniques use r as the radius of the vels, while the latter technique uses r as their diameter. Although the numbers of intersecting vels are different for each covering type, the three Hausdorff dimensions are closely related, as long as the same technique is used throughout the experiment. This technique applies to objects in any embedding dimension.

Figure 5. Examples of covering schemes for the Hausdorff dimension: (a) Minimum overlapping vels. (b) Maximum non-overlapping vels. (c) Adjacent mesh for the mesh-counting dimension
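A sketch of the mesh-counting (box-counting) variant in Python, including the regression step emphasized above; the point representation (2-D points in the unit square) and the scale list are assumptions.

```python
# Box-counting sketch: DH is the regression slope of log Nk vs log(1/rk),
# never a single-scale ratio.
import numpy as np

def box_counting_dimension(points, scales=(2, 4, 8, 16, 32, 64)):
    pts = np.asarray(points, dtype=float)
    log_inv_r, log_n = [], []
    for boxes_per_side in scales:
        # count occupied boxes of side r = 1/boxes_per_side
        occupied = {tuple(ij) for ij in np.floor(pts * boxes_per_side).astype(int)}
        log_inv_r.append(np.log(boxes_per_side))
        log_n.append(np.log(len(occupied)))
    slope, _ = np.polyfit(log_inv_r, log_n, 1)  # DH = m
    return slope
```

In practice, saturation points at the extreme scales should be excluded from the fit, exactly as the text warns.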
2.4 Minkowski-Bouligand Dimension, DMB

This dimension is touted as the fundamental approach by some authors (e.g., Tricot, 1995, Sec. 2.5). It is based on orders of growth and scale of functions at 0 (i.e., when r → 0) to measure the degree of space filling of a curve. One embodiment of this approach is the Minkowski sausage procedure in which the centre of a small disk with radius rk is allowed to follow the curve to measure the Minkowski content (i.e., the area Ak of the resulting Minkowski sausage, as shown in Fig. 6). A measure of the space filling of the curve can be calculated by dividing the area by the diameter of the disk

$$A_{m_k} = \frac{A_k}{2 r_k} \qquad (11)$$

and a rate of this change can be established by reducing the size of the disk. If this follows the power-law relationship, the Minkowski-Bouligand dimension (also known as the Cantor-Minkowski-Bouligand dimension) is given by

$$D_{MB} = \lim_{r\to 0} \frac{\log A_{m_k}}{\log (1/r_k)} + 2 \qquad (12)$$

Figure 6. "Minkowski sausage" covering
Notice that for a smooth curve, Amk ~ rk and DMB = log rk / (−log rk) + 2 = −1 + 2 = 1. It can be shown that for strictly self-similar fractals, DH = DMB, but for natural fractals, DH < DMB. Based on the concept of dilation, we have also tried a variation on this scheme to determine the rate of space-filling of electrical discharges in dielectrics, as they are related to diffusion-limited aggregates (DLAs), and obtained very-high-accuracy estimates of the fractal dilation dimension (Stacey, 1994).
2.5 Mass Dimension, DM

The mass fractal dimension is another embodiment of the Minkowski sausage approach. It is often used in measuring the complexity of natural fractals such as the Lichtenberg figure (Fig. 7), dielectric discharges, and fractal growth phenomena in general. An estimate of the "mass" Ak contained in the fractal is obtained by measuring the area of the fractal branches contained within a corresponding circle of radius rk and centered at the seed of the fractal (i.e., the point where the branches merge). This measurement is repeated for different radii, and the mass fractal dimension is computed from

$$D_{M_k} = \frac{\log (A_k / A_{k-1})}{\log (r_k / r_{k-1})} \qquad (13)$$

or

$$D_M = \lim_{r\to 0} \frac{\log A_r}{\log r} \qquad (14)$$
in which case the mass dimension is calculated from the slope of the corresponding log-log plot. Clearly, the plot will saturate when the radius exceeds the size of the fractal, and when the radius becomes smaller than the lattice on which the fractal is observed or simulated. The mass dimension is DM ≈ 1.7 to 1.9 for the Lichtenberg figure, DM ≈ 1.6 for both the diffusion-limited aggregates (DLAs) in two-dimensional (2D) embedding space and for the natural down feather, DM ≈ 2.4 for DLAs in 3D, and DM ≈ 2.73 for the Sierpinski sponge in 3D.

Figure 7. Successive coverings of a growth fractal for mass dimension

Figure 8. Comparison of coverings in mass dimension (rk) and gyration dimension (RG)
2.6 Gyration Dimension, DG

Similarly to the mass fractal dimension, the gyration fractal dimension is ideally suited for fractal growth phenomena with asymmetry. Furthermore, we also consider the gyration dimension a major extension of the mass dimension. Although the two seem to be related, the differences are significant in that the gyration dimension uses a statistical measure of the spread of the fractal during its growth, rather than a priori selected circles centered at its seed. The two techniques are compared in Fig. 8. The radius of gyration RG is equal to the standard deviation of the spread of the fractal, and has its origin at the center of mass (alias centre of gravity, or centroid) of the fractal. It is seen from Fig. 8 that a circle with radius RG covers any asymmetrical fractal much better than the concentric circles coincident with the seed, as used in the mass dimension. For a given number Nk of discharged sites at stage k in a dielectric discharge simulation, the radius of gyration is defined as (Vicsek, 1992, p. 84)

$$R_{G_k}(N_k) = \left[ \frac{1}{N_k} \sum_{j=1}^{N_k} r_j^2 \right]^{1/2} \qquad (15)$$
where rj is the distance of the jth bond from the centroid of the fractal. The location of the centroid µ is defined as the arithmetic mean of all the discharged sites along the x and y directions

$$\mu_{c_k}(N_k) \equiv (\bar{x}, \bar{y}) \qquad (16)$$

where

$$\bar{x}_{c_k}(N_k) = \frac{1}{N_k} \sum_{j=1}^{N_k} x_j \quad \text{and} \quad \bar{y}_{c_k}(N_k) = \frac{1}{N_k} \sum_{j=1}^{N_k} y_j \qquad (17)$$

If the power-law relationship $N_k = R_{G_k}^{D_G}$ holds, then the gyration dimension is

$$D_G = \lim_{k\to\infty} \frac{\log N_k}{\log R_{G_k}} \qquad (18)$$
As before, DG can be obtained from the slope of a log-log plot. Notice that Eq. (15) does not reveal its relation to variance. However, it can be rewritten into the following more convenient and practical form (Kinsner, 1994a)

$$R_{G_k}(N_k) = \left[ \frac{1}{N_k} \sum_{j=1}^{N_k} x_j^2 - \left( \frac{1}{N_k} \sum_{j=1}^{N_k} x_j \right)^{2} + \frac{1}{N_k} \sum_{j=1}^{N_k} y_j^2 - \left( \frac{1}{N_k} \sum_{j=1}^{N_k} y_j \right)^{2} \right]^{1/2} = \left( \sigma_x^2 + \sigma_y^2 \right)^{1/2} \qquad (19)$$
Since the four sums in Eq. (19) can now be computed during the simulation of the random fractal, the radius of gyration can be computed at any desired value of k. We term this type of computation real time, as opposed to batch processing in which the averages must be computed first. Furthermore, this mode of computation is applicable not only to birth processes in which the number of elements always expands (e.g., DLAs), but also to birth-death processes in which the number of occupied sites can both grow and diminish (e.g., cellular automata). An incremental real-time procedure for computing the gyration dimension is described in (Kinsner, 1994a). Another important observation is that the gyration dimension is the only morphological dimension that lends itself to a modification to include a nonuniform probability distribution within the fractal. The modified expression for the radius of gyration is the square root of the weighted average

$$R_{G_k}(N_k) = \sqrt{\sigma_r^2} = \left[ \frac{1}{N_k} \sum_{j=1}^{N_k} p(r_j)\,(r_j - \mu)^2 \right]^{1/2} \qquad (20)$$
Observe that Eq. (20) no longer relates to a morphological dimension (uniform distribution); it now relates to an entropy-based dimension as discussed next.
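The "real-time" reading of Eq. (19) amounts to maintaining four running sums during growth; a minimal Python sketch (class and method names are illustrative), which extends to birth-death processes by subtracting a removed site from the same sums.

```python
# Incremental radius of gyration, Eq. (19): RG is available at any stage k
# without a batch pass over all occupied sites.
from math import sqrt

class GyrationRadius:
    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.syy = 0.0

    def add_site(self, x, y):          # birth of an occupied site
        self.n += 1
        self.sx += x; self.sy += y
        self.sxx += x * x; self.syy += y * y

    def remove_site(self, x, y):       # death (e.g., cellular automata)
        self.n -= 1
        self.sx -= x; self.sy -= y
        self.sxx -= x * x; self.syy -= y * y

    def value(self):
        var_x = self.sxx / self.n - (self.sx / self.n) ** 2
        var_y = self.syy / self.n - (self.sy / self.n) ** 2
        return sqrt(var_x + var_y)     # (sigma_x^2 + sigma_y^2)^(1/2)
```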
3. Entropy-Based Fractal Dimensions

Entropy-based fractal dimensions differ significantly from the morphological dimensions discussed in the previous section in that they can deal with nonuniform distributions in the fractals, while the morphological dimensions show the shape of a projection of the fractal only. This is understandable because the morphological dimensions
are purely metric (and not probabilistic or possibilistic) concepts. Since this distinction has not been appreciated uniformly in the literature, one should be aware of possible fundamental errors in the results and conclusions there.
3.1 Information Dimension, DI

The simplest entropy-based fractal dimension is related to the first-order Shannon entropy. Let us consider an arbitrary fractal that is covered by Nk vels, each with a diameter rk, at the kth covering (a setting similar to that used to determine the Hausdorff dimension, DH). Recall that DH was estimated from the number of vels intersected by the fractal, regardless of the density of the fractal in each vel. In contrast, the estimation of the information dimension, DI, considers the density of the fractal, as determined from the relative frequency of occurrence of the fractal in each intersecting vel. If njk is the frequency with which the fractal enters (intersects) the jth vel of size rk in the kth covering, then its ratio to the total number NTk of intersects of the fractal with all the vels is an estimate of the probability pjk of the fractal within the jth vel, and is given by

$$p_{jk} \;\stackrel{\mathrm{def}}{=}\; \lim_{k\to\infty} \frac{n_{jk}}{N_{T_k}} \qquad (21)$$

where

$$N_{T_k} \;\stackrel{\mathrm{def}}{=}\; \sum_{j=1}^{N_k} n_{jk} \qquad (22)$$

Notice that this total number NTk must be recalculated for each kth covering because, in general, it can change substantially on dense fractals. With this probability distribution at the kth covering, the average (expected) self-information (i.e., Ijk = log(1/pjk)) of the fractal contained in the Nk vels can be expressed by the Shannon entropy H1k as given by

$$H_{1k} \;\stackrel{\mathrm{def}}{=}\; \sum_{j=1}^{N_k} p_{jk} I_{jk} = -\sum_{j=1}^{N_k} p_{jk} \log p_{jk} \qquad (23)$$

Notice that the subscript 1 in H denotes that the Shannon entropy is of the first order, which assumes independence between all the vels. If the following power-law relationship holds

$$H_{1k} \sim c \left( \frac{1}{r_k} \right)^{D_I} \qquad (24)$$

where c is a constant, then the information fractal dimension is

$$D_I = \lim_{k\to\infty} \frac{H_{1k}}{\log (1/r_k)} \qquad (25a)$$

or

$$D_I = \lim_{r\to 0} \frac{H_{1r}}{\log (1/r)} \qquad (25b)$$
As before, DI can be obtained from the slope m of a log-log plot of Shannon’s entropy H1k vs precision (1/rk) as DI = m.
The difference between the self-similarity dimension and the information dimension can be illustrated by studying fractals and nonfractals such as an ensemble of a unit interval and an isolated point with equal probability distribution between the interval and the point. It can be shown (Kinsner, 1994a) that the self-similarity dimension masks out the point completely (DS = 1), while the information dimension preserves the presence of the point (DI = 1/2). This also applies to the Hausdorff dimension. In general, DI ≤ DS and DI ≤ DH, with the equality occurring only for fractals with uniform probability distributions. Such fractals are called monofractals.
3.2 Correlation Dimension, DC

The information dimension reveals the expected spread in the nonuniform probability distribution of the fractal, but not its correlation. The correlation fractal dimension was introduced to address this problem. Let us consider a setting identical to that required to define the information dimension, DI. If we assume the following power-law relationship

$$\left[ \sum_{j=1}^{N_k} p_{jk}^2 \right]^{-1} \sim \left( \frac{1}{r} \right)^{D_C} \qquad (26)$$

then the correlation dimension is

$$D_C = \lim_{k\to\infty} \frac{-\log \sum_{j=1}^{N_k} p_{jk}^2}{\log (1/r_k)} \qquad (27a)$$

or

$$D_C = \lim_{k\to\infty} \frac{\log \sum_{j=1}^{N_k} p_{jk}^2}{\log r_k} \qquad (27b)$$
As before, DC can be obtained from the slope m of a log-log plot of the second-order entropy H2 vs precision (1/rk) as DC = m. It is clear that the numerator is different from the Shannon first-order entropy in the information dimension. It can be shown that it has the meaning of a correlation between pairs of neighboring points on the fractal F (Kinsner, 1994a). This correlation can be expressed in terms of a density-density correlation (or pair correlation) function. It is also known as the correlation sum, or correlation integral. This interpretation can lead to a very fast algorithm for computing the correlation dimension (Grassberger & Procaccia, 1983; Kinsner, 1994a; Kinsner, 1994b). There are numerous examples in the literature of computing the correlation dimension for natural fractals, including DLAs, dielectric discharges, retinal vessels, the damped pendulum, and the Hénon strange attractor (for a review, see Kinsner, 1994a). In general, DC ≤ DI ≤ DH, with the equality occurring for monofractals only.
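In the spirit of the Grassberger-Procaccia route mentioned above, the following sketch estimates DC from the correlation sum of a one-dimensional series; the radii list is an assumption and should be chosen inside the scaling region so that the counts stay nonzero.

```python
# Correlation-sum sketch: C(r) is the fraction of point pairs closer than r,
# and DC is the slope of log C(r) vs log r.
import numpy as np

def correlation_dimension(x, radii):
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])       # pairwise distances
    n = len(x)
    c = [(np.sum(d < r) - n) / (n * (n - 1))  # exclude the zero diagonal
         for r in radii]
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope
```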
3.3 Rényi Dimension Spectrum, Dq

Since the correlation dimension is an extension of the Shannon-entropy-based dimension, could we gain anything by generalizing the concept further? Yes, we could see the entire spectrum of power-law relationships if we use the generalized higher-order entropy concept, as introduced by Alfred Rényi in 1955, and prior to him by M.P. Schutzenberger, and given by

$$H_q = \frac{1}{1-q} \log \sum_{j=1}^{N_k} p_{jk}^q, \qquad 0 \le q \le \infty \qquad (28)$$

where q is called the moment order. Notice that although the Rényi entropy becomes singular for q = 1, it can be shown that it is then the Shannon entropy (as in Eq. 23) (Kinsner, 1994a). Let us consider a setting identical to the previous two fractal dimensions. If the following power-law relationship holds for the expanded range of q (to cover all its negative values)

$$\left[ \sum_{j=1}^{N_k} p_{jk}^q \right]^{\frac{1}{1-q}} \sim \left( \frac{1}{r} \right)^{D_q}, \qquad -\infty \le q \le \infty \qquad (29)$$

then the Rényi fractal dimension spectrum is

$$D_q = \lim_{k\to\infty} \frac{1}{1-q} \, \frac{\log \sum_{j=1}^{N_k} p_{jk}^q}{\log (1/r_k)} \qquad (30a)$$

or

$$D_q = \lim_{k\to\infty} \frac{1}{q-1} \, \frac{\log \sum_{j=1}^{N_k} p_{jk}^q}{\log r_k} \qquad (30b)$$
Once again, for a given order q, Dq can be obtained from the slope m of a log-log plot of the q-order entropy Hq vs reduction (rk) as Dq = m. It should be clear that the process should be repeated for a desired range of q, often −10 ≤ q ≤ 10 to contain numerical errors for high powers on some computers. Also notice that special attention must be given to q = 1, at which Eq. (30) becomes singular. It can be shown (Kinsner, 1994a) that for a fractal with a nonuniform probability distribution function, Dq decreases monotonically from D−∞ to D∞, resembling an inverted S curve, as shown in Fig. 9. For a pure self-similar monofractal, all the dimensions become equal to D0 = DH, as shown by the horizontal line in Fig. 9.

Figure 9. A typical Rényi dimension spectrum for a multifractal

One by one, we can show (Kinsner, 1994a) that for q = 0, the Rényi dimension is equivalent to the morphological Hausdorff dimension, D0 ≡ DH; for q = 1, the Rényi dimension is equivalent to the information dimension, D1 ≡ DI; for q = 2, it is the correlation dimension, D2 ≡ DC; and for q = ±∞, Dq becomes what we call the Chebyshev dimension, computed from the maximum and minimum probabilities, respectively. The Chebyshev extreme dimensions provide the Rényi spectrum bounds, which are very useful in classification by neural networks. The other fractal dimensions discussed in this chapter are also equivalent to the Rényi dimensions with some values of q, including noninteger values of q (e.g., the gyration and variance dimensions). Thus, since the Rényi fractal dimension spectrum covers all the known dimensions, it can be seen as the unifying framework. The significance of the Rényi dimension spectrum is that it is no longer a single-valued dimension, but a single-valued monotonically decreasing function. Without any assumptions, it reveals the nature of the object either as a monofractal (a straight line), or as a mixture of fractals (a multifractal) whenever the function looks like an inverted S curve. This S-curve can be interpreted as a bounded signature of the fractal. This is in contrast to the Rényi entropy, which is unbounded. This boundedness of the signature is of particular importance to the classification of fractal objects. The inverted S-curve may be either antisymmetric about D0 for q = 0, or not. Any asymmetry is an indicator of an asymmetrical probability distribution (skewness). Whenever a fractal has the complexity of a multifractal strange attractor in chaos, with varying densities, a single-valued fractal dimension can no longer describe the fractal adequately, and the Rényi dimension spectrum should be used for its characterization. For example, the multifractal spectrum can be used in characterizing the multifractal structure and dynamics of the majority of non-equilibrium heterogeneous stochastic phenomena in physics and chemistry (including DLAs and dielectric discharges; viscous fingering; solidification; and surface growth), as well as in biology, medicine and CI (including the Lévy walks of ion pumps in a cell; electrical signals from muscular, cardiac and brain activities; perception processes; and cognition dynamics). It can also be used in the study of percolation, cellular automata, textures of images and surfaces of materials, non-stationary processes such as speech, and electrical signals in biological organisms. We have studied several such phenomena using the Rényi dimension spectrum, and found it to be extremely useful both as a detector of multifractality and as a source of input vectors for neural network classifiers. As described in Sec. 3.2, the correlation dimension can be interpreted in terms of the pair correlation function. One can also show that the Rényi dimension spectrum can be interpreted as a q-tuple correlation function for q > 0. Once again, this interpretation may lead to a fast algorithm for computing the Rényi dimension spectrum. The Rényi dimension may also be instrumental in revealing the spectrum of fractals contained in an object, as discussed next.
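A compact sketch of Eqs. (28)-(30): estimate Dq as the slope of the q-order entropy against log(1/rk) across coverings, with the Shannon limit substituted at the singular point q = 1. The input format (a list of per-scale vel probability vectors) is an assumption.

```python
# Renyi dimension spectrum sketch: q = 0 recovers DH, q = 1 DI, q = 2 DC.
import numpy as np

def renyi_entropy(p, q):
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))        # Shannon limit of Eq. (28)
    return np.log(np.sum(p ** q)) / (1.0 - q)

def renyi_dimension(q, probs_per_scale, r_per_scale):
    h = [renyi_entropy(np.asarray(p, float), q) for p in probs_per_scale]
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(r_per_scale)), h, 1)
    return slope                              # Dq from the regression slope
```

As the text notes, large |q| amplifies numerical error in the small probabilities, which is why the range is often limited to about −10 ≤ q ≤ 10.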
3.4 Mandelbrot Singularity Spectrum, Sq

We shall now show that the Rényi fractal dimension spectrum is closely related to the Mandelbrot singularity spectrum. If a fractal object has different local measures (such as a probabilistic weight) at its different regions, the distribution of the measures can be described by a multifractal distribution function f(α) (which we also call the multifractal singularity spectrum). As will be seen, we can interpret α as the strength of the local singularity of the measure, and thus call it the Hölder exponent or the crowding index. Since the idea of multifractality was first proposed by Mandelbrot (e.g., Mandelbrot, 1974), and later described by many others (e.g., Hentschel & Procaccia, 1983; Grassberger & Procaccia, 1983; and Vicsek, 1992), we would like to call the multifractal dimension f(α) the Mandelbrot singularity spectrum, Sq. Consider a recursive (multiplicative) process generating a non-uniform fractal (i.e., with rescaled regions of different sizes rj) with inhomogeneous measures (i.e., regions with different probabilities pj) at each of the rescaled regions. An example of such a process over a square of size L at its commencement (the second iteration, k = 2) is shown in Fig. 10. We have seen that for a uniform fractal with homogeneous measures, the distribution of probabilities p for a given vel of size r satisfies the single-valued power-law relation

$$p(r) \equiv \frac{1}{N} \sim r^{D_S} \qquad (31)$$
where DS is the self-similarity fractal dimension. However, for a non-uniform fractal with inhomogeneous distribution, the local relationship is
$$p_j(r_j) \sim r_j^{\alpha_j} \quad \text{for} \quad \alpha_j \in [-\infty, \infty] \qquad (32)$$

Figure 10. A recursive process producing a multifractal (after Vicsek, 1992, p. 50)

Figure 11. The Mandelbrot singularity spectrum
In addition, we can consider how many vels have the same αj. For example, for k = 2 in Fig. 10, there are two vels with p1, three vels with p2, and one vel with p3. In general, the number of vels with a specific α has the following power-law relation

$$N_\alpha(r) \sim \frac{1}{r^{f(\alpha)}} \qquad (33)$$
where f(α) is the Mandelbrot singularity spectrum, Sq. If we perform a set of measurements, a plot of f(α) could be constructed, as shown in Fig. 11. Notice that the maximum Sqmax is reached for the Hausdorff dimension DH ≡ D0, while the minimum Sqmin = 0 is reached for the extreme values of probability (pmax on the left of the maximum, and pmin on the right). So, Sq is not only bounded, but has a finite support! From an application point of view in fractal object classification, this finite support makes this object characterization more desirable than the Rényi spectrum Dq which is also bounded, but has an infinite support. Also notice that the exponent α is analogous to the energy, while f(α) is analogous to the entropy as a function of energy, and is reminiscent of plots in thermodynamical systems (Stanley & Meakin,1988). Since the Rényi dimension spectrum, Dq, contains all the information about the multiscale dimensional analysis of the fractal, it also contains the information about its singularity spectrum. Consequently, the Mandelbrot singularity spectrum, Sq, can be computed directly from the Rényi dimension Dq. The Hölder exponent can be
obtained by taking the derivative with respect to the Rényi exponent q

$$\alpha_q \;\stackrel{\mathrm{def}}{=}\; \frac{d}{dq} \left[ (q-1)\, D_q \right] \qquad (34)$$

and f(α) ≡ Sq is obtained from

$$S_q \;\stackrel{\mathrm{def}}{=}\; q\, \alpha_q - (q-1)\, D_q \qquad (35)$$
The Rényi and Mandelbrot spectra represent equivalent descriptions of multifractals, as they are Legendre transforms of each other (Halsey, Jensen, Kadanoff, Procaccia & Shraiman, 1986; Stanley & Meakin, 1988). However, since Eq. (34) involves numerical differentiation, it is better to compute the Mandelbrot singularity spectrum directly, using wavelets and their modulus maxima (Faghfouri & Kinsner, 2005; Mallat, 1998).
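If one nevertheless computes Sq from Dq, Eqs. (34)-(35) reduce to a numerical derivative on a q-grid; a sketch follows, with the chapter's caveat that the wavelet route is preferable to this finite-difference derivative.

```python
# Legendre-transform sketch: alpha(q) = d/dq[(q-1)Dq], Sq = q*alpha - (q-1)Dq.
import numpy as np

def mandelbrot_spectrum(q, dq):
    q, dq = np.asarray(q, float), np.asarray(dq, float)
    tau = (q - 1.0) * dq          # mass exponent tau(q)
    alpha = np.gradient(tau, q)   # Eq. (34), finite-difference derivative
    s = q * alpha - tau           # Eq. (35)
    return alpha, s
```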
4. Transform-Based Fractal Dimensions

We shall concentrate on three very practical dimensions based on: (i) power spectrum density, (ii) multiscale variance, and (iii) Lyapunov exponents.
4.1 Spectral Dimension, Dβ

A time series representing a chaotic or non-chaotic process can be transformed into its power spectrum density, using spectral analysis techniques such as Fourier (including the short-term windowed fast Fourier transform, FFT, and the discrete cosine transform, DCT), or the time-scale transforms such as wavelets (Wornell, 1996). If the power spectrum has equally spaced harmonics, the underlying process is periodic or quasiperiodic (nonchaotic). On the other hand, if the power spectrum is broadband, with substantial power at low frequencies, it may originate from chaos, although a broadband power spectrum does not guarantee sensitivity to initial conditions, and therefore chaos. A broadband signal v(t) can be characterized either by its energy spectrum E(f) = |V(f)|², or its power spectrum P*(f) = |V(f)|²/T, or its power spectrum density given by

$$P(f) = \lim_{T\to\infty} \frac{1}{T} |V(f)|^2 \qquad (36)$$

where |V(f)| is the Fourier transform amplitude. The spectral density gives an estimate of the mean-square fluctuations of the signal at a frequency f. If we assume that the power spectrum density has the following power-law form

$$P(f) \sim \frac{1}{f^{\beta}} \qquad (37)$$

then we can use the exponent β to define the spectral fractal dimension as

$$D_\beta = D_E + \frac{3 - \beta}{2} \qquad (38)$$
where DE = 1 is the embedding Euclidean dimension for the time series. It is now customary to characterize such broadband signals as colored noise, according to the value of β; for β = 0, 1, 2, 3, the noise is white, pink, brown, and black, respectively, as shown in Fig. 12. White noise is completely random, while colored noise has more persistence (defined as the trend of a process to continue in the direction upon which it has embarked). Black noise is often representative of natural and unnatural catastrophes such as floods and droughts. For a fractal time series, the exponent β may also be fractional. A single value of β indicates self-similarity or self-affinity of the noise everywhere. A complicated phenomenon may exhibit more than one β in its power spectrum density, thus facilitating the search for critical points in the process.

Figure 12. (a) White, pink, brown, and black noise defined by their power spectrum. (b) Example of a black noise
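A sketch of the spectral estimate: fit β as the negative log-log slope of the power spectrum density and apply Eq. (38). Windowing and detrending choices are deliberately omitted, which is exactly the weakness Section 4.2 addresses.

```python
# Spectral dimension sketch: beta from the PSD slope, then Eq. (38).
import numpy as np

def spectral_dimension(v, d_e=1):
    v = np.asarray(v, dtype=float)
    V = np.fft.rfft(v - np.mean(v))
    psd = np.abs(V) ** 2 / len(v)
    f = np.fft.rfftfreq(len(v))
    mask = f > 0                                   # skip the DC bin
    beta = -np.polyfit(np.log(f[mask]), np.log(psd[mask]), 1)[0]
    return d_e + (3.0 - beta) / 2.0                # D_beta = DE + (3 - beta)/2
```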
4.2 Variance Dimension, Dσ

As we have seen, a time series representing a chaotic or non-chaotic process can be characterized through the power spectrum exponent β. It can also be characterized directly in real time by analyzing the spread of the increments in the signal amplitude (variance, σ²). Let us assume that the signal v(t) is discrete. If we assume that the variance of its amplitude increments is related to the time interval according to

$$\mathrm{Var}\left[ v(t_2) - v(t_1) \right] \sim |t_2 - t_1|^{2H} \qquad (39)$$

or, for short,

$$\mathrm{Var}\left[ \Delta v \right]_{\Delta t} \sim (\Delta t)^{2H} \qquad (40)$$

then the Hurst exponent H can be calculated from a log-log plot using

$$H = \lim_{\Delta t \to 0} \frac{1}{2}\, \frac{\log \mathrm{Var}\left[ \Delta v \right]_{\Delta t}}{\log \Delta t} \qquad (41)$$
Finally, for the embedding Euclidean dimension DE, the variance dimension Dσ can be computed from

$$D_\sigma = D_E + 1 - H \qquad (42)$$
The technique of computing the variance dimension is so simple that it lends itself to real-time fractal analysis of a time series (Kinsner, 1994c). Although the spectral dimension may reveal a multifractal nature of the underlying process by estimating β for different short-term (windowed) Fourier analyses, the choice of the window is difficult and may introduce artifacts. On the other hand, the variance dimension does not require a window in the Fourier sense, and thus avoids the windowing problem. This technique can also be used to calculate a variance fractal dimension trajectory (VFDT) for a process that is piecewise stationary (Kinsner, 1994a). Such a trajectory can be used to analyze the temporal or spatial multifractality in the study of dishabituation in behavior modification (e.g., Kinsner, Cheung, Cannons, Pear, & Martin, 2003).
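A sketch of Eqs. (39)-(42) on a discrete series, using dyadic lags for the log-log fit; the lag list is an assumption and should stay well inside the stationary segment length.

```python
# Variance dimension sketch: H from the slope of log Var[dv] vs log dt,
# then D_sigma = DE + 1 - H (Eq. 42).
import numpy as np

def variance_dimension(v, lags=(1, 2, 4, 8, 16, 32), d_e=1):
    v = np.asarray(v, dtype=float)
    log_dt, log_var = [], []
    for dt in lags:
        inc = v[dt:] - v[:-dt]                    # amplitude increments
        log_dt.append(np.log(dt))
        log_var.append(np.log(np.var(inc)))
    h = 0.5 * np.polyfit(log_dt, log_var, 1)[0]   # Eq. (41)
    return d_e + 1.0 - h                          # Eq. (42)
```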
4.3 Lyapunov Dimension, DΛ

The Lyapunov dimension is another useful fractal dimension because it can be derived from Lyapunov exponents which, in turn, can be calculated directly from the strange attractor or the orbit (phase trajectory) of a dynamical system, without explicit knowledge of the underlying nonlinear system of coupled differential equations (for flows) or recursive difference equations (for maps) (Kinsner, 2003). This can be done by measuring quantitatively the stretching and contracting of the evolution of neighbouring orbits of the dynamical system. The spectrum of such Lyapunov exponents can then be used to distinguish between nonchaotic and chaotic, or even hyperchaotic, processes. If all the exponents are negative, the system converges to a point or a cyclic attractor. However, if one of the exponents is positive, the system is divergent; if, in addition, the orbit forms a strange attractor, the system is chaotic. If two or more exponents are positive, the system is hyperchaotic. The Lyapunov dimension is defined as (Kaplan & Yorke, 1979; Ott, 1993, p. 134)

$$D_\Lambda = K + \frac{1}{|\lambda_{K+1}|} \sum_{j=1}^{K} \lambda_j \qquad (43)$$

where λj denotes the jth Lyapunov exponent, arranged as a spectrum from the largest λ1 to the smallest λm for an m-dimensional system, and K is the largest integer index which makes the sum of the exponents non-negative according to

$$\sum_{j=1}^{K} \lambda_j \ge 0 \qquad (44)$$
and λK+1 is the first negative exponent. The advantage of the Lyapunov dimension is that it characterizes the complexity of a strange attractor without much computational effort. However, since this scalar cannot possibly reveal the multifractality of a strange attractor, the Rényi or Mandelbrot spectrum should be used.
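A sketch of Eqs. (43)-(44) given an already-computed Lyapunov spectrum; the edge cases (an all-negative spectrum, or no negative exponent to divide by) are handled conservatively.

```python
# Kaplan-Yorke (Lyapunov) dimension from a Lyapunov spectrum.
import numpy as np

def lyapunov_dimension(exponents):
    lam = np.sort(np.asarray(exponents, dtype=float))[::-1]  # largest first
    csum = np.cumsum(lam)
    nonneg = np.where(csum >= 0)[0]
    if len(nonneg) == 0:
        return 0.0                        # all sums negative: point attractor
    k = nonneg[-1] + 1                    # largest K with sum of first K >= 0
    if k >= len(lam):
        return float(len(lam))            # no negative exponent available
    return k + csum[k - 1] / abs(lam[k])  # Eq. (43)

print(lyapunov_dimension([0.9, -1.2]))   # 1 + 0.9/1.2 = 1.75
```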
5. Concluding Remarks

The main objective of this chapter was to present a unified framework for fractal dimensions in order to identify conditions under which the different types of fractal dimensions can be either equal or unequal. We have seen that only pure self-similar monofractals have all the distinct fractal dimensions equal (within the computational accuracy used). On the other hand, the different types of fractal dimensions of multifractals cannot be equal. Since the Rényi fractal dimension spectrum includes all the dimensions, it is used as the foundation for the unified approach in this chapter. Like a singularity filter, the Rényi dimension spectrum sifts out each monofractal from the multifractal object. This spectrum includes not only the morphological fractal dimensions, but also the entropy-based and transform-based fractal dimensions. Another objective of this chapter was to develop a taxonomy of fractal dimensions. There are at least three approaches to the classification of fractal dimensions. Firstly, they can be classified according to the information content of a fractal F under consideration. Another classification may be according to the method of computing the dimension. Still another approach may be based on the applicability of the dimension to specific processes and objects. We have taken the first approach for the taxonomy, and identified three classes of dimensions based on: (i) the morphology of an object, (ii) its entropy, and (iii) its transform. The morphological dimensions are based on purely geometric concepts, and they emphasize the shape (morphology) of the object. This applies to objects whose distribution of a measure (such as probability) is uniform (i.e., the fractal is homogeneous) or for which the information about the distribution is not available. It should be stressed that not all morphological dimensions produce the
same values for the same object. For example, a Hausdorff dimension of a dielectric discharge is different from its mass dimension and gyration dimension. We also established that the gyration dimension is a good candidate for a bridge between the morphological and entropy dimensions in that it can be modified from the purely geometrical form to its information-based form. The entropy-based dimensions take into account a probability measure or a correlation measure of F. All the entropy dimensions considered in this chapter are defined in terms of the relative frequency of visitation of a typical trajectory (in temporal fractals) or the distribution measure (in spatial fractals), so they use either information about the time behavior of a dynamical system or the measure describing the inhomogeneity of a spatial fractal. The use of q-tuple correlation functions is covered elsewhere (Kinsner, 1994b). Notice that the entropy dimensions include the morphological dimensions as special cases of the Rényi dimension spectrum. The Mandelbrot singularity spectrum is closely related to the Rényi dimension spectrum. Finally, the transform fractal dimensions rely on changing the original fractal into a different domain in which its properties can be observed. One example is the spectral fractal dimension derived from the frequency domain. The variance fractal dimension is another example, though less conventional, of moving from the time domain to the variance domain. It should be stressed that the variance dimension is not the same as the variance of a process, or even the variance index, as used in the literature. There are many other fractal dimensions in this class that are not discussed in this chapter. Notice that all three classes of dimensions use the Hausdorff covering procedure to extract the dimensions. Each class of fractal dimensions has its own range of applications. For example, the very popular box-counting dimension (called the Hausdorff mesh dimension in this chapter) has been applied to nearly everything that looks like a fractal. It produces reliable results when analyzing contours and n-dimensional projections of fractal objects, as long as they are single fractals. However, this morphological dimension could not possibly reveal the intricate structures of multifractal objects. Consequently, if such morphological dimensions are used to characterize the shape and the texture of malignant cancerous cells, the results are unacceptable and lead to skepticism among pathologists. Multifractal objects such as malignant cells must be characterized using the entropy dimensions. But we must caution that the single-valued information dimension or correlation dimension cannot reveal the multifractal complexity either. Instead, the Rényi dimension or the Mandelbrot singularity spectrum can show the spectrum of fractals contained in a multifractal object. The transform dimensions have their applicability too. For example, the spectral dimension or the variance dimension of a temporal or spatial signal reveals the persistence (i.e., the likelihood that the present trend continues) or antipersistence (the likelihood that the present trend will reverse) of the corresponding fractal object. The variance dimension has certain advantages over the spectral dimension. The fractal dimensions presented in this chapter constitute a sample from an even larger family of dimensions. This sample was intended to show the relative merits of the dimensions, and how they relate to one another.
There are many other issues not discussed here. One of the major issues is the accuracy of computing the dimensions. The problem of saturation at the boundaries of natural fractals is discussed further in (Kinsner, 1994c). Another representation problem (i.e., the number of points required to represent the fractal, and the vel sizes used to compute the dimensions) is still being investigated. Our study of fractal dimension spectra indicates that they could be very good candidates for the characterization of perceptual and cognitive processes (Kinsner & Dansereau, 2006) because they reveal long-term relations in those processes through the fundamental multi-scale measurements. They are superior not only to any energy-based metrics, but also to entropy-based metrics (Kinsner, 2004).
Acknowledgment

This work was supported in part through a research grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
References

Barnsley, M. (1988). Fractals everywhere (p. 396). Boston, MA: Academic.
Edgar, G.A. (1990). Measure, topology, and fractal geometry (p. 230). New York, NY: Springer-Verlag.
Falconer, K. (1990). Fractal geometry: Mathematical foundations and applications (p. 288). New York, NY: Wiley.
Faghfouri, A., & Kinsner, W. (2005, May 2-5). Local and global analysis of multifractal singularity spectrum through wavelets. In Proc. IEEE 2005 Can. Conf. Electrical & Computer Eng. (pp. 2157-2163). Saskatoon, SK.
Feder, J. (1988). Fractals (p. 238). New York, NY: Plenum.
Grassberger, P., & Procaccia, I. (1983, January 31). Characterization of strange attractors. Phys. Rev. Lett., 50(5), 346-349.
Halsey, T.C., Jensen, M.H., Kadanoff, L.P., Procaccia, I., & Shraiman, B. (1986, February). Fractal measures and their singularities: The characterization of strange sets. Phys. Rev., A33(2), 1141-1151.
Hentschel, H.G.E., & Procaccia, I. (1983). The infinite number of generalized dimensions of fractals and strange attractors. Physica, 8D, 435-444.
Hoggar, S.G. (1992). Mathematics for computer graphics (p. 472). Cambridge, UK: Cambridge University Press.
Kantz, H., & Schreiber, T. (1997). Nonlinear time series analysis (p. 304). Cambridge, UK: Cambridge University Press.
Kaplan, J.L., & Yorke, J.A. (1979). Chaotic behavior of multidimensional difference equations. In Peitgen, H.-O., & Walther, H.O. (Eds.), Functional differential equations and approximations of fixed points (pp. 204-227, 503). New York, NY: Springer-Verlag.
Kinsner, W., Cheung, V., Cannons, K., Pear, J., & Martin, T. (2003, August 18-20). Signal classification through multifractal analysis and complex domain neural networks. In Proc. IEEE 2003 Intern. Conf. Cognitive Informatics, ICCI03 (pp. 41-46). London, UK. ISBN: 0-7803-1986-5.
Kinsner, W. (1994a, May). A unified approach to fractal and multifractal dimensions. Technical Report DEL94-4 (p. 147). Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada (abbreviated to UofM in the references below).
Kinsner, W. (1994b, June 7). Entropy-based fractal dimensions: Probability and pair-correlation algorithms for E-dimensional images and strange attractors. Technical Report DEL94-5 (p. 44). UofM.
Kinsner, W. (1994c, June 15). Batch and real-time computation of a fractal dimension based on variance of a time series. Technical Report DEL94-6 (p. 22). UofM.
Kinsner, W. (1994d, June 20). The Hausdorff-Besicovitch dimension formulation for fractals and multifractals. Technical Report DEL94-7 (p. 12). UofM.
Kinsner, W. (1995, January). Self-similarity: The foundation for fractals and chaos. Technical Report DEL95-2 (p. 113). UofM.
Kinsner, W. (2002, August 19-20). Compression and its metrics for multimedia. In Proc. IEEE 2002 Intern. Conf. Cognitive Informatics, ICCI02 (pp. 107-121). Calgary, AB. ISBN: 0-7695-1724-2.
Kinsner, W. (2003, August 18-20). Characterizing chaos through Lyapunov metrics. In Proc. IEEE 2003 Intern. Conf. Cognitive Informatics, ICCI03 (pp. 189-201). London, UK. ISBN: 0-7803-1986-5.
Kinsner, W. (2004, August 16-18). Is entropy suitable to characterize data and signals for cognitive informatics? In Proc. IEEE 2004 Intern. Conf. Cognitive Informatics, ICCI04 (pp. 6-21). Victoria, BC. ISBN: 0-7695-2190-8.
Kinsner, W., & Dansereau, R. (2006, July 17-19). A relative fractal dimension spectrum as a complexity measure. In Proceedings of the 5th IEEE International Conference on Cognitive Informatics. Beijing, China. ISBN: 1-4244-0475-4.
Kinsner, W., Potter, M., & Faghfouri, A. (2005, June 16-18). Signal processing for autonomic computing. In Rec. Can. Applied & Industrial Mathematical Sciences, CAIMS05. Winnipeg, MB.
Mallat, S. (1998). A wavelet tour of signal processing (p. 577). San Diego, CA: Academic.
Mandelbrot, B.B. (1974). Intermittent turbulence in self-similar cascades: Divergence of higher moments and dimension of the carrier. J. Fluid Mech., 62(2), 331-358.
Mandelbrot, B.B. (1982). The fractal geometry of nature (p. 468). New York, NY: W.H. Freeman.
Ott, E. (1993). Chaos in dynamical systems (p. 385). Cambridge, UK: Cambridge University Press.
Peitgen, H.-O., Jürgens, H., & Saupe, D. (1992). Chaos and fractals: New frontiers of science (p. 984). New York, NY: Springer-Verlag.
Sprott, J.C. (2003). Chaos and time-series analysis (p. 507). Oxford, UK: Oxford University Press.
Stacey, G. (1994, November). Stochastic fractal modelling of dielectric discharges (p. 308). Master's Thesis. Winnipeg, MB: University of Manitoba.
Stanley, H.E., & Meakin, P. (1988, September 29). Multifractal phenomena in physics and chemistry. Nature, 335, 405-409.
Tricot, C. (1995). Curves and fractal dimension (p. 323). New York, NY: Springer-Verlag.
Vicsek, T. (1992). Fractal growth phenomena (2nd ed., p. 488). Singapore: World Scientific.
Wang, Y. (2002, August 19-20). On cognitive informatics. In Proc. 1st IEEE Intern. Conf. Cognitive Informatics (pp. 34-42). Calgary, AB.
Wang, Y. (2007). Toward theoretical foundations of autonomic computing. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(3), 1-16. USA: IPI Publishing.
Wang, Y., & Kinsner, W. (2006, March). Recent advances in cognitive informatics. IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 121-123.
Wornell, G.W. (1996). Signal processing with fractals: A wavelet-based approach (p. 177). Upper Saddle River, NJ: Prentice-Hall.
Section V
Relevant Development
Chapter XXII
Cognitive Informatics: Four Years in Practice: A Report on IEEE ICCI'05

Du Zhang, California State University, USA
Witold Kinsner, University of Manitoba, Canada
Jeffrey Tsai, University of Illinois at Chicago, USA
Yingxu Wang, University of Calgary, Canada
Philip Sheu, University of California, USA
Taehyung Wang, California State University, USA
The 2005 IEEE International Conference on Cognitive Informatics (ICCI'05) was held during August 8-10, 2005 on the campus of the University of California, Irvine. This was the fourth conference in the ICCI series [Kinsner et al. 05]. The previous conferences were held at Calgary, Canada (ICCI'02) [Wang et al. 02], London, UK (ICCI'03) [Patel et al. 03], and Victoria, Canada (ICCI'04) [Chan et al. 04], respectively. ICCI'05 was organized by General Co-Chairs Jeffrey Tsai (University of Illinois) and Yingxu Wang (University of Calgary), Program Co-Chairs Du Zhang (California State University) and Witold Kinsner (University of Manitoba), and Organization Co-Chairs Philip Sheu (University of California), Taehyung Wang (California State University, Northridge), and Shangping Ren (Illinois Institute of Technology). Cognitive informatics (CI) is a cutting-edge and multidisciplinary research area that tackles the fundamental problems shared by modern informatics, computation, software engineering, AI, cybernetics, cognitive science,
neuropsychology, medical science, systems science, philosophy, linguistics, economics, management science, and life sciences [Wang02]. CI is defined as a transdisciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of the brain and natural intelligence, and their engineering applications via an interdisciplinary approach [Wang03]. CI is the transdisciplinary study of the internal information processing mechanisms and processes of the natural intelligence - human brains and minds - and their engineering applications. Since its inception in 2002 [Wang et al. 02], ICCI has been growing steadily in both size and scope. It attracts researchers from academia and government agencies, as well as industry practitioners, from many countries. The conference provides a main forum for the exchange and cross-fertilization of ideas in the new research areas of CI. To facilitate a more concerted effort toward a particular focus, each conference has had its own theme in the paper solicitation and final program organization process. For ICCI'05 [Kinsner et al. 05], the theme was natural intelligence and autonomous computing. ICCI'05 had three insightful keynote speeches. The first one, entitled "Cognitive Computation: The Ersatz Brain Project" and presented by James A. Anderson of Brown University, offered some exciting glimpses of an ambitious project to build a brain-like computing system. The talk focused on the progress made in three areas: preliminary hardware design, programming techniques, and software applications. The proposed hardware architecture, based on ideas from the mammalian neo-cortex, is a massively parallel, two-dimensional, locally connected array of CPUs and their associated memory. What makes this design feasible is an approximation to cortical function called the Network of Networks, which indicates that the basic computing unit in the cortex is not a single neuron but small groups of neurons working together to form attractor networks. Thus, each network-of-networks module corresponds to a single CPU in the hardware design. A system with approximately the power of a human cerebral cortex would require about a million CPUs and a terabyte of data specifying the connection strengths using the network-of-networks approach. To develop "cognitive" software for such a brain-like computing system, there need to be some new programming techniques such as topographic data representation, lateral data movement, and the use of interconnected modules for computation. The software applications involve language, cognitive data analysis, visual information processing, decision making, and knowledge management. The second keynote speech at ICCI'05 was given by Yingxu Wang of the University of Calgary on "Psychological Experiments on the Cognitive Complexities of Fundamental Control Structures of Software Systems." To tackle the fundamental issue of the cognitive complexity of software systems, a set of concepts was introduced. There are ten basic control structures (BCS), each of which has a cognitive weight that describes the extent of difficulty, or the time and effort required in comprehending the functionality and semantics of a given structure in programs. Through cognitive psychological experiments, the weights were calibrated. From the cognitive complexities of BCSs, the cognitive functional size (CFS) of a software system can be established, which is modeled as a product of its architectural and operational complexities.
Results of case studies indicated that CFS is the most sensitive measure for representing the real complexity of a software system. The third keynote speech at ICCI'05, by Witold Kinsner of the University of Manitoba, summarized the recent developments in CI. In his talk on "Some Advances in Cognitive Informatics", Kinsner took a closer look at some recent advances in signal processing for autonomic computing and its metrics. Autonomic computing is geared toward mitigating the escalating complexity of a software system in both its features and interfaces by making the system self-configuring, self-optimizing, self-organizing, self-healing, self-protecting and self-communicating. Because signal processing is used in nearly all fields of human endeavor, ranging from signal detection, fault diagnosis, advanced control, audio and image processing, communications engineering, and intelligent sensor systems to business, it will play a pivotal role in developing autonomic computing systems. Classical statistical signal processing is nowadays augmented by intelligent signal processing, which utilizes supervised and unsupervised learning through adaptive neural networks, wavelets, fuzzy rule-based computation and rough sets, genetic algorithms, and blind signal estimation. Quality metrics are needed to measure the quality of various multimedia materials in perception, cognition and evolutionary learning processes, to gauge the self-awareness in autonomic systems, and to assess symbiotic cooperation in evolutionary systems. Instead of energy-based metrics, the multiscale metrics based on fractal dimensions are found to be best suited for perception. The technical program of ICCI'05 [Kinsner et al. 05] included 42 papers from researchers and industrial practitioners in this growing field. The accepted papers covered a wide spectrum of topics in CI: from topics on
processes of the natural intelligence (brain organization, cognitive mechanisms and processes, memory and learning, thinking and reasoning, cognitive linguistics, and neuropsychology), to topics on the internal information processing mechanisms (information models of the brain, knowledge representation and engineering, machine learning, neural networks and neural computation, pattern recognition, and fuzzy logic), and to topics on the engineering applications (autonomic computing, informatics foundations of software engineering, software agent systems, quantum information processing, bioinformatics, web-based information systems, and agent technologies). During the conference, presentations were arranged into the following nine sessions: (1) Cognitive informatics; (2) Information and signal theories; (3) Intelligent systems; (4) Applications of cognitive informatics; (5) Human factors in engineering; (6) Cognitive learning; (7) Knowledge and concept modeling; (8) Intelligent decision making; and (9) Cognitive software engineering. The past four years have witnessed some exciting results in the trenches of CI. The research interest in this niche area from all over the world is growing, and the body of work produced thus far is taking shape in both quality and quantity. ICCI'06 will be held during July 17-19, 2006 in Beijing, China with its theme on natural intelligence, autonomic computing, and neuroinformatics [Yao et al. 06], while ICCI'07 and ICCI'08 are slated for Sydney, Australia in 2007, and Madrid, Spain in 2008, respectively. The ICCI'05 program and proceedings are the result of the great effort and contributions of many people. We would like to thank all authors who submitted interesting papers to ICCI'05. We acknowledge the professional work of the Program Committee and external reviewers in effectively reviewing and improving the quality of submitted papers. Our acknowledgement also goes to the invaluable sponsorships of the IEEE Computer Society, UCI, UIC, Univ. of Calgary, Univ. of Manitoba, IEEE Canada, and IEEE CS Press. We acknowledge the organizing committee, the ICCI'05 secretariat, and the student volunteers who helped to make the event a success.
References

Chan, C., Kinsner, W., Wang, Y., & Miller, D.M. (Eds.) (2004, August). Cognitive informatics. Proceedings of the 3rd IEEE International Conference (ICCI'04). Victoria, Canada: IEEE CS Press.

Kinsner, W., Zhang, D., Wang, Y., & Tsai, J. (Eds.) (2005). Cognitive informatics. Proceedings of the 4th IEEE International Conference (ICCI'05). Irvine, CA: IEEE CS Press.

Patel, D., Patel, S., & Wang, Y. (Eds.) (2003, August). Cognitive informatics. Proceedings of the 2nd IEEE International Conference (ICCI'03). London, UK: IEEE CS Press.

Wang, Y. (2002, August). On cognitive informatics. Keynote speech at the Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI'02) (pp. 34-42). Calgary, Canada: IEEE CS Press.

Wang, Y. (2003). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.

Wang, Y., Johnston, R., & Smith, M. (Eds.) (2002, August). Cognitive informatics. Proceedings of the 1st IEEE International Conference (ICCI'02). Calgary, AB: IEEE CS Press.

Yao, Y., Shi, Z., Wang, Y., & Kinsner, W. (Eds.) (2006, July). Cognitive informatics. Proceedings of the 5th IEEE International Conference (ICCI'06). Beijing, China: IEEE CS Press.
Chapter XXIII
Toward Cognitive Informatics and Cognitive Computers: A Report on IEEE ICCI'06

Yiyu Yao, University of Regina, Canada
Zhongzhi Shi, Chinese Academy of Sciences, China
Yingxu Wang, University of Calgary, Canada
Witold Kinsner, University of Manitoba, Canada
Yixin Zhong, Beijing University of Posts and Telecommunications, China
Guoyin Wang, Chongqing University of Posts and Telecommunications, China
Zeng-Guang Hou, Chinese Academy of Sciences, China
Cognitive informatics (CI) is a cutting-edge, multidisciplinary research area that tackles the fundamental problems shared by modern informatics, computation, software engineering, AI, cybernetics, cognitive science, neuropsychology, medical science, systems science, philosophy, linguistics, economics, management science, and the life sciences [Wang, 2002]. CI can be viewed as a transdisciplinary enquiry of cognitive and information sciences that investigates the internal information processing mechanisms and processes of the brain and natural intelligence – human brains and minds – and their engineering applications [Wang, 2003, 2007a; Wang and Kinsner, 2006].
The IEEE International Conference on Cognitive Informatics (ICCI) series has been held since 2002 [Wang et al., 2002; Patel et al., 2003; Chan et al., 2004; Kinsner et al., 2005; Yao et al., 2006]. The conference provides the main forum for the exchange and cross-fertilization of ideas in CI. ICCI'06, the fifth conference in the series, was held at the Institute of Automation, Chinese Academy of Sciences, Beijing, China, during July 17-19, 2006. ICCI'06 was organized by conference Co-Chairs Yingxu Wang (University of Calgary), Yixin Zhong (Beijing University of Posts and Telecommunications), and Witold Kinsner (University of Manitoba), and Program Co-Chairs Zhongzhi Shi (Chinese Academy of Sciences) and Yiyu Yao (University of Regina), with the valuable support of Organization Co-Chairs Yuyu Yuan (Beijing University of Posts and Telecommunications), Guoyin Wang (Chongqing University of Posts and Telecommunications), and Zeng-Guang Hou (Chinese Academy of Sciences). The program committee of ICCI'06 consisted of over 50 experts in various areas of CI from around the world.

The theme of ICCI'06 was natural intelligence, autonomic computing, and neural informatics. The objectives of ICCI'06 were to draw the attention of researchers, practitioners, and graduate students to the investigation of cognitive mechanisms and processes of human information processing, and to stimulate the international effort on cognitive informatics research and engineering applications. The ICCI'06 program encompassed 40 regular papers and 55 short papers, selected from 276 submissions from 18 countries on the basis of rigorous reviews by program committee members and external reviewers. The two-volume proceedings have been published by IEEE CS Press [Yao et al., 2006]. During the conference, presentations were arranged into the following 18 sessions:

1. Cognitive Models
2. Pattern and Emotion Recognition
3. Computational Intelligence
4. CI Foundations of Software Engineering
5. Autonomic Agents
6. Biosignal Processing
7. Cognitive Complexity of Software
8. Knowledge Manipulation
9. Rough Sets and Problem Solving
10. Descriptive Mathematics for CI
11. Visual Information Processing
12. Knowledge Representation
13. Cognitive Data Mining
14. Neural Networks
15. Pattern Classification
16. Machine Learning
17. Intelligent Algorithms
18. Intelligent Decision-Making
The ICCI'06 program covered a wide spectrum of topics that contribute to cognitive informatics and cognitive computers. Researchers exchanged ideas on processes of the natural intelligence (i.e., brain organization, cognitive mechanisms and processes, memory and learning, thinking and reasoning, cognitive linguistics, and neuropsychology), internal information processing mechanisms (i.e., the cognitive informatics model of the brain, the OAR model, knowledge representation and engineering, machine learning, neural networks and neural computation, pattern recognition, and fuzzy logic), and engineering applications of CI (i.e., autonomic computing, informatics foundations of software engineering, software agent systems, quantum information processing, bioinformatics, Web-based information systems, and agent technologies). ICCI'06 brought together over 100 researchers and graduate students to exchange the latest research results and to explore new ideas in CI. Through stimulating discussions and a panel session on the future of cognitive informatics, the participants were excited about the current advances, future trends, and expected developments in CI.
The ICCI'06 program was enriched by three keynotes and three special lectures. Jean-Claude Latombe, Professor at Stanford University, presented the keynote speech entitled "Probabilistic Roadmaps: A Motion Planning Approach Based on Active Learning" [Latombe, 2006]. This talk focused on motion planning for autonomous robots. A new motion-planning approach – Probabilistic RoadMap (PRM) planning – was presented. PRM planning trades the prohibitive cost of computing the exact shape of a feasible space against the cost of learning its connectivity from dynamically chosen examples (the sampled configurations). PRM planning, a popular approach in robotics, has been extremely successful in solving apparently very complex motion planning problems with various feasibility constraints. The talk touched on the foundations of PRM planning and explained why it is so successful. The success of PRM reveals key properties satisfied by the feasible spaces encountered in practice, and a better understanding of these properties is already making it possible to design faster PRM planners capable of solving increasingly complex problems.

In his keynote speech on "Cognitive Informatics: Towards Future Generation Computers that Think and Feel," Yingxu Wang, Professor at the University of Calgary, presented a set of the latest advances in CI that may lead to the design and implementation of cognitive computers capable of thinking and feeling [Wang, 2006a]. He pointed out that CI provides the theory and philosophy for the next generation of computers and computing paradigms. In particular, recent advances in CI were discussed in two groups, namely, an entire set of cognitive functions and processes of the brain, and an enriched set of denotational mathematics. He described an approach to designing cognitive computers for cognitive and perceptible concept/knowledge processing, based on denotational mathematics such as Concept Algebra [Wang, 2006b, 2006d], Real-Time Process Algebra (RTPA) [Wang, 2002b, 2006d], and System Algebra [Wang, 2006c]. Cognitive computers implement the fundamental cognitive processes of the natural intelligence, such as the learning, thinking, formal inference, and perception processes; they are novel information processing systems that think and feel. In contrast to the von Neumann architecture of the traditional stored-program-controlled imperative computer, the speaker elaborated on the Wang architecture for cognitive computers, consisting of the Knowledge Manipulation Unit (KMU), the Behavior Manipulation Unit (BMU), the Experience Manipulation Unit (EMU), the Skill Manipulation Unit (SMU), the Behavior Perception Unit (BPU), and the Experience Perception Unit (EPU) [Wang, 2006a]. This is a commendable step toward the study of cognitive computers.

Witold Kinsner, Professor at the University of Manitoba, presented a keynote speech entitled "Towards Cognitive Machines: Multiscale Measures and Analysis" [Kinsner, 2006]. He showed that computer science and computer engineering have contributed to many shifts in technological and computing paradigms, and that the next paradigm shift is toward cognitive machines. Such cognitive machines must be able to be aware of their environments, much as human beings are, and ought to understand the meaning of information in more human-like ways.
The motivation for developing such machines ranges from self-evident practical reasons, such as the expense of computer maintenance and wearable computing in healthcare, to gaining a better understanding of the cognitive capabilities of the human brain. In designing such machines, we face many problems, ranging from human perception, attention, concept creation, cognition, and consciousness to executive processes guided by emotions and values, and symbiotic conversational human-machine interaction. Kinsner suggested that cognitive machine research should include multiscale measures and analysis. More specifically, he discussed definitions of cognitive machines and representations of processes, as well as their measurement and analysis. Application paradigms of cognitive machines were given, including cognitive radio, cognitive radar, and cognitive monitors.

Three special lectures further stimulated discussion on CI. Professor Yixin Zhong's talk was on "A Cognitive Approach to NI and AI Research." He reviewed the major AI approaches, namely structuralism, functionalism, behaviorism, and cognitivism, which had seemed contradictory to each other from the traditional point of view. He then proposed the mechanism approach to AI, expressed in the form of information-knowledge-intelligence transformation, and showed that the four major approaches may be integrated on the basis of the new approach, which may be of great significance to both NI and AI research. Professor Zhongzhi Shi presented a talk, "On Intelligence Science and Recent Progresses." He proposed that intelligence science is a cross-fertilized discipline dedicated to joint research on the basic theory and technology of intelligence by brain science, cognitive science, cognitive informatics, AI, and related fields. Within this framework, he reported recent advances in visual perception, introspective learning, linguistic cognition, consciousness models, and a platform for agent-grid intelligence. Professor Yiyu Yao talked on "Granular Computing and Cognitive Informatics."
A framework of granular computing was discussed from three perspectives. From the philosophical perspective, granular computing concerns structured thinking; from the methodological perspective, it concerns structured problem solving; and from the computational perspective, it focuses on structured information processing. In summary, the plenary lectures attempted to connect CI research to artificial intelligence, natural intelligence, and intelligence science in general, and to granular computing in particular. They demonstrated that CI may provide an overarching theoretical framework for studies in these research areas.

The participants of ICCI'06 witnessed exciting results from the exploration of many perspectives of CI. Research interest all over the world is growing rapidly, and the body of work produced thus far is taking shape in both quality and quantity. Several proposals were discussed during the conference, and it was decided that ICCI'07 will be held at Lake Tahoe, California, USA. A special issue of selected papers from ICCI'06 is planned for The International Journal of Cognitive Informatics and Natural Intelligence (IJCiNi, http://www.enel.ucalgary.ca/IJCINI/). More information about CI may be found at http://www.enel.ucalgary.ca/ICCI2006/.

The ICCI'06 program as presented in the proceedings is the result of the great effort and contributions of many people. We would like to thank all authors who submitted interesting papers to ICCI'06. We acknowledge the professional work of the Program Committee and the external reviewers for their effective review and improvement of the quality of the submitted papers. Our acknowledgement also goes to the invaluable sponsorships of the IEEE Computer Society, the IEEE ICCI Steering Committee, the Chinese Academy of Sciences, IEEE Canada, IEEE CS Press, and The International Journal of Cognitive Informatics and Natural Intelligence (IJCiNi). We thank the keynote speakers and invited lecturers for presenting their visions and insights on fostering this emerging interdisciplinary area. We acknowledge the organizing committee members, and particularly the ICCI'06 secretariats and student volunteers, who helped to make the event a success.

The ICCI Steering Committee welcomes contributions and suggestions from researchers around the world in planning future events. Multidisciplinary researchers and practitioners are invited to join the CI community and participate in future conferences in the IEEE ICCI series.
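To make the PRM idea from Latombe's keynote concrete, here is a minimal sketch of roadmap construction in plain Python. The scene, the sampling budget, the connection radius, and all function names are our assumptions for illustration; real PRM planners use far more sophisticated sampling strategies and collision checkers.

    # A minimal probabilistic roadmap (PRM) sketch under simplifying
    # assumptions: a unit-square workspace with circular obstacles,
    # uniform random sampling, and straight-line local paths checked
    # by dense interpolation. Illustrative only.
    import math
    import random

    OBSTACLES = [((0.5, 0.5), 0.2)]  # (center, radius) pairs; assumed scene

    def feasible(p):
        return all(math.dist(p, c) > r for c, r in OBSTACLES)

    def local_path_clear(p, q, steps=20):
        # Check evenly spaced points along the straight segment p-q.
        return all(feasible((p[0] + (q[0] - p[0]) * i / steps,
                             p[1] + (q[1] - p[1]) * i / steps))
                   for i in range(steps + 1))

    def build_prm(n_samples=200, radius=0.25):
        # Sample feasible configurations, then connect nearby pairs whose
        # connecting segment is also feasible; the roadmap captures the
        # connectivity of free space without computing its exact shape.
        nodes = []
        while len(nodes) < n_samples:
            p = (random.random(), random.random())
            if feasible(p):
                nodes.append(p)
        edges = [(i, j)
                 for i in range(len(nodes))
                 for j in range(i + 1, len(nodes))
                 if math.dist(nodes[i], nodes[j]) < radius
                 and local_path_clear(nodes[i], nodes[j])]
        return nodes, edges

    nodes, edges = build_prm()
    print(len(nodes), "nodes,", len(edges), "edges")

The key design choice, as the keynote emphasized, is that the free space is never computed explicitly; its connectivity is learned from the sampled configurations and the feasibility tests along candidate edges.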
References

Chan, C., Kinsner, W., Wang, Y., & Miller, D.M. (Eds.) (2004, August). Cognitive informatics. Proceedings of the 3rd IEEE International Conference (ICCI'04). Victoria, Canada: IEEE CS Press.

Kinsner, W. (2006, July). Towards cognitive machines: Multiscale measures and analysis. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 8-14). Beijing, China: IEEE CS Press.

Kinsner, W., Zhang, D., Wang, Y., & Tsai, J. (Eds.) (2005, August). Cognitive informatics. Proceedings of the 4th IEEE International Conference (ICCI'05). Irvine, CA: IEEE CS Press.

Latombe, J.-C. (2006, July). Probabilistic roadmaps: A motion planning approach based on active learning. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 1-2). Beijing, China: IEEE CS Press.

Patel, D., Patel, S., & Wang, Y. (Eds.) (2003, August). Cognitive informatics. Proceedings of the 2nd IEEE International Conference (ICCI'03). London, UK: IEEE CS Press.

Wang, Y. (2002a). On cognitive informatics. Keynote speech at the Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI'02) (pp. 34-42). IEEE CS Press.

Wang, Y. (2002b, October). The real-time process algebra (RTPA). Annals of Software Engineering: An International Journal, 14, 235-274. Oxford: Baltzer Science Publishers.

Wang, Y. (2003). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.
Wang, Y. (2006a, July). Cognitive informatics: Towards the future generation computers that think and feel. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 3-7). Beijing, China: IEEE CS Press.

Wang, Y. (2006b, July). On concept algebra and knowledge representation. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 320-331). Beijing, China: IEEE CS Press.

Wang, Y. (2006c, July). On abstract systems and system algebra. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI'06) (pp. 332-343). Beijing, China: IEEE CS Press.

Wang, Y. (2006d, March). On the informatics laws and deductive semantics of software. IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 161-171.

Wang, Y. (2007a, January). The theoretical framework of cognitive informatics. The International Journal of Cognitive Informatics and Natural Intelligence (IJCiNi), 1(1), 1-27. USA: IGI Publishing.

Wang, Y. (2007b, July). The OAR model of neural informatics for internal knowledge representation in the brain. The International Journal of Cognitive Informatics and Natural Intelligence (IJCiNi), 1(3), 64-75. USA: IGI Publishing.

Wang, Y., & Kinsner, W. (2006, March). Recent advances in cognitive informatics. IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 121-123.

Wang, Y., Johnston, R., & Smith, M. (Eds.) (2002, August). Cognitive informatics. Proceedings of the 1st IEEE International Conference (ICCI'02). Calgary, AB, Canada: IEEE CS Press.

Wang, Y., Wang, Y., Patel, S., & Patel, D. (2006, March). A layered reference model of the brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics (C), 36(2), 124-133.

Yao, Y.Y., Shi, Z., Wang, Y., & Kinsner, W. (Eds.) (2006). Cognitive informatics. Proceedings of the 5th IEEE International Conference (ICCI'06). Beijing, China: IEEE CS Press.
Compilation of References
Aleksander, I. (1989). Neural computing architectures: The design of brain-like machines (p. 401). Cambridge, MA: MIT Press.
Aleksander, I. (1998, August). From WISARD to MAGNUS: A family of weightless neural machines (pp. 18-30).
Aleksander, I. (2003). How to build a mind: Towards machines with imagination (Maps of the Mind, 2nd ed.) (p. 192). New York, NY: Columbia University Press.
Aleksander, I. (2006). Artificial consciousness: An update. Available as of May 2006 from http://www.ee.ic.ac.uk/research/neural/publications/iwann.html (This is an update on his paper "Towards a neural model of consciousness," in Proc. ICANN94, New York, NY: Springer, 1994.)
Anderson, J. (2002, August 19-20). Hybrid computation with an attractor neural network. In Proceedings of the 1st IEEE Intern. Conf. Cognitive Informatics (pp. 3-12). Calgary, AB (ISBN 0-7695-1724-2).
Anderson, J. A. (2005). Cognitive computation: The Ersatz brain project. In Proceedings of the IEEE 2005 International Conference on Cognitive Informatics (pp. 2-3). IEEE Computer Society.
Anderson, J. A. (2005). A brain-like computer for cognitive software applications: The Ersatz brain project. In Proceedings of the IEEE 2005 International Conference on Cognitive Informatics (pp. 27-36). IEEE Computer Society.
Anderson, J. R. (1993). Rules of the mind. Lawrence Erlbaum.
Aleven, V., & Koedinger, K. (2002). An effective metacognitive strategy: Learning by doing and explaining with computer-based cognitive tutors. Cognitive Science, 26(2), 147-179.
Anderson, J. R., & Ross, B. H. (1980). Evidence against a semantic-episodic distinction. Journal of Experimental Psychology: Human Learning and Memory, 6, 441-466.
Alligood, K.T., Sauer, T.D., & Yorke, J.A. (1996). Chaos: An introduction to dynamical systems (p. 603). New York, NY: Springer Verlag.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111(4), 1036-1060.
Altman, R.B., Bada, M., Chai, X.J., Carillo, M.W., Chen, R.O., & Abernethy, N.F. (1999). RiboWeb: An ontology-based system for collaborative molecular biology. IEEE Intelligent Systems, 14(5), 68-76.
Anderson, J. R., Corbett, A.T., Koedinger, K.R., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4(2), 167-207.
Altmann, E., & Trafton, J. (2002). Memory for goals: An activation-based model. Cognitive Science, 26, 39-83.
Ankerst, M., Elsen, C., Ester, M., & Kriegel, H.P. (1999). Visual classification: An interactive approach to decision tree construction. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 392-396).
Atmanspacher, H., & Scheingraber, H. (1987). A fundamental link between system theory and statistical mechanics. Foundations of Physics, 17, 939-963.
Austin, J. (Ed.) (1998). RAM-based neural networks (p. 240). Singapore: World Scientific.
Baars, B. (1988). A cognitive theory of consciousness. Cambridge, UK: Cambridge University Press. Available as of May 2006 from http://nsi.edu/users/baars/BaarsConsciousnessBook1988/index.html
Badalamente, R. V., & Greitzer, F. L. (2005). Top ten needs for intelligence analysis tool development. In Proceedings of the 2005 International Conference on Intelligence Analysis, McLean, Virginia.
Baddeley, A. (1990). Human memory: Theory and practice. Hove, UK: Lawrence Erlbaum.
Baeten, J., & Weijland, W. (1990). Process algebra. Cambridge Tracts in Computer Science, 18. Cambridge University Press.
Baeten, J., & Middelburg, C. (2002). Process algebra with timing. EATCS Monograph. Springer.
Baillieul, J. (1985). Kinematic programming alternatives for redundant manipulators. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 722-728). St. Louis, MO.
Bargiela, A., & Pedrycz, W. (2002). Granular computing: An introduction (p. 480). New York, NY: Springer.
Barlow, H.B. (1961). Possible principles underlying the transformation of sensory messages. In W.A. Rosenblith (Ed.), Sensory communication (pp. 217-234). Cambridge, MA: The MIT Press.
Barnsley, M. (1988). Fractals everywhere (p. 396). Boston, MA: Academic.
Barwise, J., & Moss, L. (1996). Vicious circles. Stanford, CA: CLSI Publications.
Bateman, J. (1990). Upper modeling: A general organization of knowledge for natural language processing. In Paper prepared for the Workshop on Standards for Knowledge Representation Systems, Santa Barbara.
Beck, K. (2000). Extreme programming explained. MA: Addison-Wesley.
Bell, A.J., & Sejnowski, T.J. (1997). The 'independent components' of natural scenes are edge filters. Vision Research, 37(23), 3327-3338.
Bell, D.A. (1953). Information theory. London: Pitman.
Belnap, N. D. (1977). A useful four-valued logic. In G. Epstein & J. Dunn (Eds.), Modern uses of multiple-valued logic (pp. 8-37). D. Reidel, Dordrecht.
Ben-Ari, M. (1993). Mathematical logic for computer science. UK: Prentice Hall International.
Bender, E.A. (1996). Mathematical methods in artificial intelligence. Los Alamitos, CA: IEEE CS Press.
Berger, J. (1990). Statistical decision theory – Foundations, concepts, and methods. Springer-Verlag.
Bergson, H. (1960). Time and free will: An essay on the immediate data of consciousness. New York, NY: Harper Torchbooks (Original edition 1889, translated by F.L. Pogson).
Bergstra, J., Ponse, A., & Smolka, S. (Eds.) (2001). Handbook of process algebra. North Holland.
Bernardo, M., & Gorrieri, R. (1998). A tutorial on EMPA: A theory of concurrent processes with nondeterminism, priorities, probabilities and time. Theoretical Computer Science, 202, 1-54.
Bestaoui, Y. (1991). An unconstrained optimization approach to the resolution of the inverse kinematic problem of redundant and non-redundant robot manipulators. Int. J. Robotics, Autonomous Systems, 7, 37-45.
Biederman, G. B., Stepaniuk, S., Davey, V. A., Raven, K., & Ahn, D. (1999). Observational learning in children with Down syndrome and developmental delays: The effect of presentation speed in videotaped modeling. Down Syndrome Research and Practice, 6(1), 12-18.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2), 115-147.
Biggerstaff, T. J., Mitbander, B. G., & Webster, D. E. (1994). Program understanding and the concept assignment problem. Communication of the ACM, 37(5), 72-82.
Bravetti, M., Bernardo, M., & Gorrieri, R. (1998). Towards performance evaluation with general distributions in process algebras. In CONCUR’98, LNCS 1466, (pp. 405-422). Springer.
Bishop, C.M. (1995). Neural networks for pattern recognition (p. 482). Oxford, UK: Oxford University.
Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. (1984). Classification and regression trees. Belmont, CA: Wadsworth Int. Group.
Bloom, B. S. (Ed.) (1956). Taxonomy of educational objectives: The classification of educational goals: Handbook I, Cognitive domain. New York, Toronto: Longmans, Green.
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1987). Occam's razor. Information Processing Letters, 24, 377-380.
Boole, G. (1854). An investigation of the laws of thought, on which are founded the mathematical theories of logic and probabilities. New York: Dover Publications, Inc.
Boothe, R.G. (2002). Perception of the visual environment. New York: Springer-Verlag, Berlin Heidelberg.
Borgo, S., Guarino, N., & Masolo, C. (1997). An ontological theory of physical objects. In L. Ironi (Ed.), Proceedings of the Eleventh International Workshop on Qualitative Reasoning (QR'97), Cortona, Italy, 3-6, 223-231.
Brachman, R. J., & Levesque, H. J. (2004). Knowledge representation and reasoning. San Francisco: Morgan Kaufmann Publishers.
Brachmann, R., & Anand, T. (1996). The process of knowledge discovery in databases: A human-centered approach. Advances in knowledge discovery and data mining (pp. 37-57). Menlo Park, CA: AAAI Press & MIT Press.
Braine, M. D. S., & Rumain, B. (1989). Development of comprehension of "or": Evidence for a sequence of competencies. Journal of Experimental Child Psychology, 31, 46-70.
Bravetti, M., & Gorrieri, R. (2002). The theory of interactive generalized semi-Markov processes. Theoretical Computer Science, 282(1), 5-32.
Brillouin, L. (1964). Scientific uncertainty and information. New York, NY: Academic.
Britten, K. H. (1996). Attention is everywhere. Nature, 382, 497-498.
Brooks, R. (1983). Towards a theory of the comprehension of computer programs. International Journal of Man-Machine Studies, 18(6), 543-554.
Brooks, R.A. (1970). New approaches to robotics. American Elsevier, 5, 3-23. New York.
Brusilovsky, P., & Peylo, C. (2003). Adaptive and intelligent Web-based educational systems. International Journal of AI in Education, 13(2), 159-172.
Buelthoff, H. H., & Edelman, S. (1992). Psychophysical support for a two-dimensional view interpolation theory of object recognition. In Proceedings of National Academy of Science (pp. 60-64). USA.
Burt, P.J. (1985). Smart sensing within a pyramid vision machine. Proceedings of the IEEE, 76(8), 1006-1015.
Calvin, W. H. (1996). How brains think: Evolving intelligence, then and now. New York: Basic Books.
Calvin, W. H. (1996). The cerebral code: Thinking a thought in the mosaics of the mind. Cambridge, MA: MIT Press.
Calvin, W. H., & Bickerton, D. (2000). Lingua ex machina: Reconciling Darwin and Chomsky with the human brain. Cambridge, MA: MIT Press.
Cameron, P. J. (1999). Sets, logic, and categories. Springer.
Carbonell, J.G., & Mitchell, T.M. (Eds.). (pp. 463-482). Palo Alto, CA: Morgan Kaufmann.
Carlsson, C., & Turban, E. (2002). DSS: Directions for the next decade. Decision Support Systems, 33, 105-110. Cazorla, D., Cuartero, F., Valero, V., Pelayo, F., & Pardo, J. (2003). Algebraic theory of probabilistic and non-deterministic processes. Journal of Logic and Algebraic Programming, 55(1-2), 57-103. Cendrowska, J. (1987). PRISM: An algorithm for inducing modular rules. International Journal of Man-Machine Studies, 27, 349-370. Cestnik, B., Kononenko, I., & Bratko, I. (1987). ASSISTANT 86: A knowledge-elicitation tool for sophisticated users. Proceedings of the 2nd European Working Session on Learning (pp. 31-45). Yugoslavia.
Chandrasekaran, B. (1986). Generic tasks in knowledge-based reasoning: High-level building blocks for expert systems design. IEEE Expert, 1(3), 23-30.
Chandrasekaran, B., Josephson, J. R., & Benjamins, V. R. (1999). What are ontologies, and why do we need them? IEEE Intelligent Systems, 14(1), 20-26.
Chandrasekaran, B., Josephson, J.R., & Benjamins, V.R. (1998). Ontology of tasks and methods. 11th Knowledge Acquisition for Knowledge-Based Systems Workshop '98 (KAW 98) (pp. 6.1-6.21). Banff, Canada.
Chang, C. L., Combs, J. B., & Stachowitz, R. A. (1990). A report on the expert systems validation associate (EVA). Expert Systems with Applications, 1, 217-230.
Chalmers, D. (1997). The conscious mind: In search of a fundamental theory. (p. 432). Oxford, UK: Oxford University Press.
Chang, C.L. & Lee, R.C.T. (1973). Symbolic logic and mechanical theorem proving. New York: Academic Press.
Chan, C., Kinsner, W., Wang, Y., & Miller, D.M. (Eds.) (2004, August). Cognitive informatics. Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI'04), Victoria, Canada. Los Alamitos, CA: IEEE Computer Society Press. 320 pp.
Chatry, N., Perdereau, V., Drouin, M., Milgram, M., & Riat, J. C. (1996, May). A new design method for dynamical feedback networks. In Proc. Int. Sym. Soft Computing for Industry, Montpellier, France.
Chan, C.W. (1992). Knowledge acquisition by conceptual modeling. Applied Mathematics Letters Journal, 3, 7-12.
Chan, C.W. (1995). Development and application of a knowledge modeling technique. Journal of Experimental and Theoretical Artificial Intelligence, 7, 217-236.
Chan, C.W. (2000). A knowledge modelling technique and industrial applications. In C. Leondes (Ed.), Knowledge-Based Systems Techniques and Applications, 34(4), 1109-1141. USA: Academic Press.
Chan, C.W. (2002, August 19-20). Cognitive informatics: A knowledge engineering perspective. Proceedings of the First IEEE International Conference on Cognitive Informatics (ICCI'02) (pp. 49-56). Calgary, Alberta.
Chan, C.W. (2004, May 2-4). A knowledge modeling system. Proceedings of the IEEE Canadian Conference on Electrical and Computer Engineering (CCECE '04) (pp. 1353-1356). Niagara Falls, Ontario.
Chen, Y.-C., & Walker, I. D. (1993). A consistent null-space approach to inverse kinematics of redundant robots. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 374-381). Atlanta, USA.
Chevallereau, C., & Khalil, W. (1988). A new method for the solution of the inverse kinematics of redundant robots. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 37-42). Philadelphia, USA.
Chiew, V., & Wang, Y. (2003, August). A multi-disciplinary perspective on cognitive informatics. The 2nd IEEE International Conference on Cognitive Informatics (ICCI'03) (pp. 114-120). London, UK: IEEE CS Press.
Chiew, V., & Wang, Y. (2004). Formal description of the cognitive process of problem solving. Proceedings of ICCI'04 (pp. 74-83).
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Chomsky, N. (1988). Language and mind (p. 208). Cambridge, UK: Cambridge University Press (3rd ed., 2006).
Clancey, W.J. (1985). Heuristic classification. Artificial Intelligence, 27, 289-350.
Clark, P., & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3(4), 261-283.
Clarke, B. L. (1981). A calculus of individuals based on connection. Notre Dame Journal of Formal Logic, 23(3), 204-218.
Clarke, B. L. (1985). Individuals and points. Notre Dame Journal of Formal Logic, 26(1), 61-75.
Cleaveland, R., Dayar, Z., Smolka, S., & Yuen, S. (1999). Testing pre-orders for probabilistic processes. Information and Computation, 154(2), 93-148.
Collins, M., & Loftus, F. (1975). A spreading activation theory of semantic processing. Psychological Review, 82, 407-428.
Corballis, M. C. (2002). From hand to mouth: The origins of language. Princeton/Oxford: Princeton University Press.
Corbett, A., Mclaughlin, M., & Scarpinatto, K. C. (2000). Modeling student knowledge: Cognitive tutors in high school and college. User Modeling and User-Adapted Interaction, 10, 81-108.
Cotterill, R. (2003). CyberChild: A simulation test-bed for consciousness studies. Journal of Consciousness Studies, 10(4-5), 31-45.
Cotterill, R. (Ed.) (1988). Computer simulations in brain science (p. 566). Cambridge, UK: Cambridge University Press.
Cover, T.M., & Thomas, J.A. (1991). Elements of information theory (p. 542). New York, NY: Wiley.
Cox, J. R., & Griggs, R. A. (1989). The effects of experience on performance in Wason's selection tasks. Memory and Cognition, 10, 496-503.
Croft, W. B. (1984). The role of context and adaptation in user interfaces. International Journal of Man-Machine Studies, 21, 283-292.
D'Argenio, P., Katoen, J.-P., & Brinksma, E. (1998). An algebraic approach to the specification of stochastic systems. In Programming Concepts and Methods (pp. 126-147). Chapman & Hall.
Dansereau, R., & Kinsner, W. (2001, May 7-11). New relative multifractal dimension measures. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP2001, 1741-1744. Salt Lake City, UT.
Dansereau, R.M., Kinsner, W., & Cevher, V. (2002, May 12-15). Wavelet packet best basis search using Rényi generalized entropy. In Proceedings of the IEEE 2002 Canadian Conference on Electrical & Computer Engineering, CCECE02, 2, 1005-1008. Winnipeg, MB. ISBN: 0-7803-7514-9.
Davies, J., & Schneider, S. (1995). A brief history of timed CSP. Theoretical Computer Science, 138, 243-271.
Dawkins, R. (1990). The selfish gene (2nd ed.) (p. 368). Oxford, UK: Oxford University Press.
de Farias, D.P. (2002). The linear programming approach to approximate dynamic programming: Theory and application. Doctoral dissertation (p. 146). Stanford, CA: Stanford University. Available as of May 2006 from http://web.mit.edu/~pucci/www/daniela_thesis.pdf
de Farias, D.P., & Van Roy, B. (2003). The linear programming approach to approximate dynamic programming. Oper. Res., 51(6), 850-865.
de Farias, D.P., & Van Roy, B. (2004, August). On constraint sampling in the linear programming approach to approximate dynamic programming. Math. Oper. Res., 29(3), 462-478.
de Rosis, F. (2001). Towards adaptation of interaction to affective factors. User Modeling and User-Adapted Interaction, 11(4).
Dennett, D.C. (1991). Consciousness explained (p. 528). London, UK: Allan Lane/Penguin.
Dillinger, M., Madani, K., & Alonistioti, N. (Eds.) (2003). Software defined radio: Architectures, systems and functions (p. 454). New York, NY: Wiley.
Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery, 3(4), 409-425.
Dong, T. (2005). Recognizing variable spatial environments—The theory of cognitive prism. Unpublished doctoral dissertation. University of Bremen, Germany.
Dong, T. (2005). SNAPVis and SPANVis: Ontologies for recognizing variable vista spatial environments. In C. Freksa, M. Knauff, B. Krieg-Brückner, B. Nebel, & T. Barkowsky (Eds.), International Conference Spatial Cognition, 4, 344-365. Berlin: Springer.
Dong, T. (2006). The theory of cognitive prism—Recognizing variable spatial environments. In Proceedings of the Nineteenth International Florida Artificial Intelligence Research Society Conference (pp. 719-724). Menlo Park, CA: AAAI Press.
Dong, T. (2007). Knowledge representation of distances and orientation of regions. International Journal of Cognitive Informatics and Natural Intelligence, 1(2), 86-99.
Doxygen. (2004). Doxygen Web site. Retrieved January 7, 2005, from http://www.stack.nl/~dimitri/doxygen/
Dreyfus, H. L., & Dreyfus, S. E. (1986). Mind over machine: The power of human intuition and expertise in the era of the computer. New York: The Free Press.
Eagly, A.H., & Chaiken, S. (1992). The psychology of attitudes. San Diego: Harcourt, Brace.
Ebbinghaus, H. D., Flum, J., & Thomas, W. (1984). Mathematical logic. Springer.
Edgar, G.A. (1990). Measure, topology, and fractal geometry (p. 230). New York, NY: Springer Verlag.
Edwards, W., & Fasolo, B. (2001). Decision technology. Annual Review of Psychology, 52, 581-606.
Elm, W.C., Cook, M.J., Greitzer, F.L., Hoffman, R.R., Moon, B., & Hutchins, S.G. (2004). Designing support for intelligence analysis. Proceedings of the Human Factors and Ergonomics Society (pp. 20-24).
Faghfouri, A., & Kinsner, W. (2005, May 2-5). Local and global analysis of multifractal singularity spectrum through wavelets. In Proc. IEEE 2005 Can. Conf. Electrical & Computer Eng. (pp. 2157-2163). Saskatoon, SK.
Fagin, R., Halpern, J. Y., Moses, Y., & Vardi, M. Y. (1995). Reasoning about knowledge. Cambridge, MA: MIT Press.
Falconer, K. (1990). Fractal geometry: Mathematical foundations and applications (p. 288). New York, NY: Wiley.
Fang, G., & Dissanayake, M.W.M.G. (1993, July). A neural network-based algorithm for robot trajectory planning. Proceedings of the International Conference of Robots for Competitive Industries (pp. 521-530). Brisbane, Qld, Australia.
Dreyfus, H.L. (1992). What computers still can't do. Cambridge, MA: MIT Press.
Fang, G., & Dissanayake, M.W.M.G. (1998). Experiments on a neural network-based method for time-optimal trajectory planning. Robotica, 16, 143-158.
Dubey, R. V., Euler, J. A., & Babcock, S. M. (1991). Real-time implementation of an optimization scheme for seven-degree-of-freedom redundant manipulators. In IEEE Tran. Sys. Robotics and Automation, 7(5), 579-588.
Farrell, J.E., & Van Den Branden Lambrecht, C.J. (eds.) (2002, January). Translating human vision research into engineering technology [Special Issue]. Proceedings of the IEEE, 90(1).
Dubois, D., & Prade, H. (1988). Possibility theory: An approach to computerized processing of uncertainty (p. 263). New York, NY: Plenum.
Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., & Uthurusamy, R. (eds.) (1996). Advances in knowledge discovery and data mining. AAAI/MIT Press.
Fazio, R.H. (1986). How do attitudes guide behavior? In R.M. Sorrentino & E.T. Higgins (Eds.), The Handbook of Motivation and Cognition: Foundations of Social Behavior. New York: Guilford Press.
Featherstone, R. (1994). Accurate trajectory transformations for redundant and non-redundant robots. In Proc. IEEE Int. Conf. on Robotics and Automation (pp. 1867-1872). San Diego, USA.
Feder, J. (1988). Fractals (p. 238). New York, NY: Plenum.
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A, 4(12), 2379-2394.
Field, D.J. (1994). What is the goal of sensory coding? Neural Computation, 6, 559-601.
Finin, T., & Silverman, D. (1986). Interactive classification of conceptual knowledge. Proceedings of the First International Workshop on Expert Database Systems (pp. 79-90).
Fischer, G., Mccall, R., Ostwald, J., Reeves, B., & Shipman, F. (1994). Seeding, evolutionary growth and reseeding: Supporting the incremental development of design environments. Paper presented at the Conference on Computer-Human Interaction (CHI'94), Boston, MA.
Fischer, K.W., Shaver, P.R., & Carnochan, P. (1990). How emotions develop and how they organize development. Cognition and Emotion, 4, 81-127.
Fischler, M.A., & Firschein, O. (1981). Intelligence: The eye, the brain and the computer (p. 331). Reading, MA: Addison-Wesley.
Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley.
Fisher, D., & Schlimmer, J. (1988). Concept simplification and prediction accuracy. In Proceedings of the Fifth International Conference on Machine Learning (pp. 22-28). Morgan Kaufmann.
Fitter, M. J., & Sime, M. E. (1980). Creating responsive computers: Responsibility and shared decision-making. In H. T. Smith & T. R. G. Green (Eds.), Human interaction with computers. London: Academic Press.
Fitting, M. (1991). Bilattices and the semantics of logic programming. Journal of Logic Programming, 11, 91-116.
Fitting, M. (2002). Fixpoint semantics for logic programming: A survey. Theoretical Computer Science, 278(1-2), 25-51.
Fitts, P. M. (1951). Engineering psychology and equipment design. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 1287-1340). New York: Wiley.
Fitts, P. M. (Ed.) (1951). Human engineering for an effective air navigation and traffic control system. Washington, DC: National Academy Press, National Academy of Sciences.
Flax, L. (2004, September). Algebraic belief revision and nonmonotonic entailment results and proofs. Technical Report C/TR04-01, Macquarie University. Retrieved from http://www.comp.mq.edu.au/~flax/techReports/brNm.pdf
Flores-Mendez, R.A., Van Leeuwee, P., & Lukose, D. (1998). Modeling expertise using KADS and MODEL-ECS. In B.R. Gaines & M. Musen (Eds.), Proceedings of the 11th Knowledge Acquisition for Knowledge-based Systems Workshop (KAW'98), 1 3-14, 19-23. Banff, Canada.
Forward, A., & Lethbridge, T. (2002). The relevance of software documentation, tools and techniques: A survey. Paper presented at the ACM Symposium on Document Engineering, McLean, VA.
Fowler, M. (1999). Refactoring: Improving the design of existing code. MA: Addison-Wesley.
Franklin, N., & Tversky, B. (1990). Searching imagined environments. Journal of Experimental Psychology: General, 119, 63-76.
Franklin, S. (1995). Artificial minds (p. 464). Cambridge, MA: MIT Press.
Franklin, S. (2003). IDA: A conscious artefact.
Freeman, W. (2001). How brains make up their minds (2nd ed.) (p. 146). New York, NY: Columbia University Press.
Frith, C. D. (1992). The cognitive neuropsychology of schizophrenia. Lawrence Erlbaum Associates.
Gabrieli, J.D.E. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 49, 87-115.
Gadhok, N., & Kinsner, W. (2006, May 10-12). An implementation of beta-divergence for blind source separation. In Proceedings of the IEEE Can. Conf. Electrical & Computer Eng., CCECE06 (pp. 642-646). Ottawa, ON.
Gershon, N. (1995). Human information interaction. In Proceedings of the WWW4 Conference, Boston, MA.
Giarratano, J., & Riley, G. (1989). Expert systems: Principles and programming. Boston: PWS-KENT Pub. Co.
Ginsberg, A., & Williamson, K. (1993). Inconsistency and redundancy checking for quasi-first-order-logic knowledge bases. International Journal of Expert Systems, 6(3), 321-340.
Ginsberg, M. L. (1988). Multivalued logics: A uniform approach to inference in artificial intelligence. Computational Intelligence, 4(3), 265-316.
Gagné, R., Briggs, L., & Wager, W. (1992). Principles of instructional design (4th ed.). New York: Holt, Rinehart & Winston.
Glabbeek, R.V., Smolka, S., & Steffen, B. (1995). Reactive, generative and stratified models of probabilistic processes. Information and Computation, 121(1), 59-80.
Ganek, A.G., & Corbi, T.A. (2003). The dawning of the autonomic computing era. IBM Systems Journal, 42(1), 34-42. Available as of May 2006 from http://www.research.ibm.com/journal/sj/421/ganek.pdf
Glasser, W. (1998). The quality school. Perennial.
Ganter, B., & Wille, R. (1999). Formal concept analysis (pp. 1-5). Springer.
Garagnani, M., Shastri, L., & Wendelken, C. (2002). A connectionist model of planning as back-chaining search. Proceedings of the 24th Conference of the Cognitive Science Society (pp. 345-350). Fairfax, Virginia, USA.
Gärdenfors, P. (1988). Knowledge in flux. MIT Press.
Gärdenfors, P., & Rott, H. (1995). Belief revision. In M. Dov, C. Gabbay, J. Hogger, & J. A. Robinson (Eds.), Handbook of logic in artificial intelligence and logic programming (Vol. 4, pp. 35-132). Oxford University Press.
Gasson, M., Hutt, B., Goodhew, I., Kyberd, P., & Warwick, K. (2002, September). Bi-directional human machine interface via direct neural connection. In Proceedings of the IEEE Workshop on Robot and Human Interactive Communication (pp. 265-270), Berlin, Germany.
Genesereth, M. R., & Nilsson, N. J. (1987). Logical foundations of artificial intelligence. Los Altos, CA: Morgan Kaufmann Publishers.
Gold, E. M. (1978). Complexity of automaton identification from given data. Information and Control, 37, 302-320.
Goldberg, D.E. (2002). The design of innovation: Genetic algorithms and evolutionary computation (p. 272). New York, NY: Springer.
Gomez-Perez, A., Fernandez-Lopez, M., & Corcho, O. (2004). Ontological engineering. London: Springer-Verlag.
Gosling, J., Joy, B., & Steele, G. (1996). The Java language specification. MA: Addison-Wesley.
Grassberger, P., & Procaccia, I. (1983, January 31). Characterization of strange attractors. Physics Review Letters, 50A(5), 346-349.
Greitzer, F. L. (2005). Toward the development of cognitive task difficulty metrics to support intelligence analysis research. In Proceedings of the IEEE 2005 International Conference on Cognitive Informatics (pp. 315-320). IEEE Computer Society.
Greitzer, F. L. (2005). Extending the reach of augmented cognition to real-world decision making tasks. In Proceedings of the HCI International 2005/Augmented Cognition Conference, Las Vegas, NV.
Greitzer, F. L., Hershman, R. L., & Kaiwi, J. (1985). Intelligent interfaces for C2 operability. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics.
Griffith, D. (1990). Computer access for persons who are blind or visually impaired: Human factors issues. Human Factors, 32, 467-475.
Griffith, D. (2005). Beyond usability: The new symbiosis. Ergonomics in Design, 13, 3.
Griffith, D. (2005). Neo-symbiosis: A tool for diversity and enrichment. Retrieved August 6, 2006, from http://2005.cyberg.wits.ac.za.
Griffith, D., Gardner-Bonneau, D. J., Edwards, A. D. N., Elkind, J. I., & Williges, R. C. (1989). Human factors research with special populations will further advance the theory and practice of the human factors discipline. In Proceedings of the Human Factors 33rd Annual Meeting (pp. 565-566), Santa Monica, CA: Human Factors Society.
Grossberg, S. (1982). Studies of mind and brain: Neural principles of learning, perception, development, cognition and motor control (p. 662). Boston, MA: D. Reidel Publishing.
Grossberg, S. (Ed.) (1988). Neural networks and natural intelligence (p. 637). Cambridge, MA: MIT Press.
Gruber, T. (1992, October). A translation approach to portable ontology specifications. Proceedings of the 7th Banff Knowledge Acquisition Knowledge-based Systems Workshop '92 (pp. 11-16). Paper No. 12. Banff, Canada.
Guarino, N., & Giaretta, P. (1995). Ontologies and knowledge bases: Towards a terminological clarification. In N. Mars (Ed.), Towards very large knowledge bases: Knowledge building and knowledge sharing (pp. 25-32). Amsterdam, The Netherlands: IOS Press.
Guez, A., & Ahmad, Z. (1989, June). Accelerated convergence in the inverse kinematics via multilayer feedforward networks. In Proc. IEEE Int. Conf. Neural Networks (pp. 341-344). Washington, USA.
Götz, N., Herzog, U., & Rettelbach, M. (1993). Multiprocessor and distributed system design: The integration of functional specification and performance analysis using stochastic process algebras. In 16th Int. Symp. on Computer Performance Modelling, Measurement and Evaluation (PERFORMANCE'93), LNCS 729 (pp. 121-146). Springer.
Haikonen, P.O.A. (2003). The cognitive approach to conscious machines (p. 294). New York, NY: Academic. (See also http://personal.inet.fi/cool/pentti.haikonen/)
Haikonen, P.O.A. (2004, June). Conscious machines and machine emotions. Workshop on Models for Machine Consciousness, Antwerp, BE.
Halford, G. S. (1993). Children's understanding: The development of mental models. Hillsdale, NJ: Lawrence Erlbaum Associates.
Halsey, T.C., Jensen, M.H., Kadanoff, L.P., Procaccia, I., & Shraiman, B. (1986, February). Fractal measures and their singularities: The characterization of strange sets. Phys. Rev., A33(2), 1141-1151.
Han, C.-h., Lidz, J., & Musolino, J. (2003). Verb-raising and grammar competition in Korean: Evidence from negation and quantifier scope. Unpublished manuscript. Simon Fraser University, Northwestern University, Indiana University.
Han, C.-h., Ryan, D., Storoshenko, S., & Yasuko, S. (in press). Scope of negation, and clause structure in Japanese. In Proceedings of the 30th Berkeley Linguistics Society.
Han, J., Hu, X., & Cercone, N. (2003). A visualization model of interactive knowledge discovery systems and its implementations. Information Visualization, 2(2), 105-125.
Hancock, P. A., Pepe, A. A., & Murphy, L. (2005). Hedonomics: The power of positive and pleasurable ergonomics. Ergonomics in Design, 13, 8-14.
Hahn, U., Schulz, S., & Romacker, M. (1999). Part-whole reasoning: A case study in medical ontology engineering. IEEE Intelligent Systems, 14(5), 59-67.
Harrison, P., & Strulo, B. (2000). SPADES – A process algebra for discrete event simulation. Journal of Logic Computation, 10(1), 3-42.
Held, G. (1987). Data compression: Techniques and applications, hardware and software considerations (2nd ed.), (p. 206). New York, NY: Wiley.
Hartley, R.V.L. (1928). Transmission of information. Bell System Technical Journal, 7, 535-563.
Henninger, S. (1997). Tools supporting the creation and evolution of software development knowledge. Paper presented at the International Conference on Automated Software Engineering (ASE’97), Incline Village, NV.
Hastie, R. (2001). Problems for judgment and decision making. Annual Review of Psychology, 52, 653-683.
Hatano, K., Sano, R., Duan, Y., & Tanaka, K. (1999). An interactive classification of Web documents by self-organizing maps and search engines. Proceedings of the 6th International Conference on Database Systems for Advanced Applications (pp. 19-22).
Hawking, S. (1996). The illustrated A brief history of time (2nd ed.) (p. 248). New York, NY: Bantam.
Haykin, S. (2005, February). Cognitive radio: Brain-empowered wireless communications. IEEE J. Selected Areas in Communications, 23(2), 201-220.
Haykin, S. (2005, September 28-30). Cognitive machines. In IEEE Intern. Workshop on Machine Intelligence & Sign. Proceedings, IWMISP05. Mystic, CT. Available as of May 2006 from http://soma.crl.mcmaster.ca/ASLWeb/Resources/data/Cognitive_Machines.pdf
Haykin, S. (2006, January). Cognitive radar. IEEE Signal Processing Magazine (pp. 30-40).
Haykin, S., & Chen, Z. (2005). The cocktail party problem. Neural Computation, 17, 1875-1902.
Haykin, S., & Kosko, B. (2001). Intelligent signal processing (p. 553). New York, NY: Wiley.
Haykin, S., Principe, C.J., Sejnowski, T.J., & McWhirter, J. (2006). New directions in statistical signal processing (p. 544). Cambridge, MA: MIT Press.
Hentschel, H.G.E., & Procaccia, I. (1983). The infinite number of generalized dimensions of fractals and strange attractors. Physica, 8D, 435-444. Hermann, D. & Harwood, J. 1980. More evidence for the existence of separate semantic and episodic stores in long-term memory. Journal of Experimental Psychology, 6 (5), 467-478. Hillston, J. (1996). A compositional approach to performance modelling. Cambridge University Press. Hinton, G.E., & Anderson, J.A. (1981). Parallel models of associative memory (p. 295). Hillsdale, NJ: Lawrence Erlbaum Associates. Hoare, C.A.R. (1985). Communicating sequential processes. Prentice-Hall Inc. Hoffman, R. R., Feltovich, P. J., Ford, K. M., Woods, D. D., Klein, G., & Feltovich, A. (2002). A rose by any other name… would probably be given an acronym. Retrieved August 6, 2006, from http://www.ihmc.us/research/projects/EssaysOnHCC/TheRose.pdf Hoffman, R.R., Klein, G., & Laughry, K.R. (2002, January/February). The state of cognitive systems engineering. IEEE Intelligent Systems Magazine (pp. 73-75).. Hoggar. S.G. (1992). Mathematics for computer graphics (p. 472). Cambridge, UK: Cambridge University Press.
Heath, P. (Eds). (1966). On the syllogism and other logical writings by Augustus De Morgan. New Haven: Yale University Press.
Holland, O. (ed.), (2003). Machine consciousness (p. 192). Exeter, UK: Imprint Academic.
Heermann, D. & Fuhrmann, T. 2000. Teaching physics in the virtual university: the Mechanics toolkit, in Computer Physics Communications, 127, 11-15
Hollnagel, E., & Woods, D. D. (1983). Cognitive systems engineering: New wine in new bottles. International Journal of Man-Machine Studies, 18, 583-600. Reprinted (1999) in 30th Anniversary Issue of International Journal
of Human-Computer Studies, 51, 339-356. Retrieved August 6, 2006, from http://www.idealibrary.com
IBM Autonomic Computing Manifesto. (Available as of May 2006, http://www.research.ibm.com/autonomic/)
Hubel, D.H. (1995). Eye, brain and vision (Reprint edition). W.H. Freeman & Company.
IBM Corp (2005). Autonomic computing. Retrieved April 2005, from http://www.research.ibm.com/autonomic/glossary.html
Huffman, D. (1954). The synthesis of sequential switching circuits. J. Franklin Inst., 257(3-4), 161-190, 275-303.
Hughes, F. J., & Schum, D. A. (2003). Preparing for the future of intelligence analysis: Discovery – Proof – Choice. Unpublished manuscript. Joint Military Intelligence College.
Humphreys, G. K., & Khan, S. C. (1992). Recognizing novel views of three-dimensional objects. Canadian Journal of Psychology, 46, 170-190.
Humphreys, M. S., Bain, J. D., & Pike, R. (1989). Different ways to cue a coherent memory system: A theory for episodic, semantic and procedural tasks. Psychological Review, 96, 208-233.
Hunt, K. H. (1987, March). Robot kinematics - A compact analytic inverse solution for velocities. ASME J. Mechanisms, Transmissions and Automat. Design, 109, 42-49.
Hunt, K. J., et al. (1992, November). Neural networks for control systems. Automatica, 28(2), 1083-1112.
Hurley, P.J. (1997). A concise introduction to logic (6th ed.). London: Wadsworth Publishing Co., ITP.
Hyvarinen, A., & Hoyer, P.O. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Research, 41(18), 2413-2423.
Hyvarinen, A., Karhunen, J., & Oja, E. (2001). Independent component analysis (p. 481). New York, NY: Wiley.
IBM (2001). IBM autonomic computing manifesto. http://www.research.ibm.com/autonomic/
IBM (2006, June). Autonomic computing white paper: An architectural blueprint for autonomic computing (4th ed.) (pp. 1-37).
IBM Corp (2005). The eight elements. Retrieved April 2005, from http://www.research.ibm.com/autonomic/manifesto/autonomic_computing.pdf
ISO/IEC 11172-3 (1993). Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s - Part 3: Audio.
Itti, L. (2001). Visual attention and target detection in cluttered natural scenes. Optical Engineering, 40(9), 1784-1793.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on PAMI, 20(11), 1254-1259.
Jasper, R., & Uschold, M. (1999, August). A framework for understanding and classifying ontology applications. In Proceedings of the IJCAI99 Workshop on Ontologies and Problem-Solving Methods (KRR5) (pp. 11.1-11.12). Stockholm, Sweden.
Jayant, N. (1992, June). Signal compression: Technology targets and research directions. IEEE Journal on Selected Areas in Communications, 10, 796-818.
Jayant, N. (Ed.) (1997). Signal compression: Coding of speech, audio, text, image and video (p. 231). Singapore: World Scientific.
Jayant, N.S., Johnson, J.D., & Safranek, R.S. (1993, October). Signal compression based on models of human perception. Proceedings of the IEEE, 81(10), 1385-1422.
Jennings, N.R. (2000). On agent-based software engineering. Artificial Intelligence, 17(2), 277-296.
Jennings, R. E. (1994). The genealogy of disjunction. New York: Oxford University Press.
Jennings, R. E. (2004). The meaning of connectives. In S. Davis & B. Gillon (Eds.), Semantics: A reader. New York: Oxford University Press.
Jennings, R. E. (2005, in press). The semantic illusion. In A. Irvine & K. Peacock (Eds.), Errors of reason. Toronto: University of Toronto Press.
Jennings, R. E., & Friedrich, N. A. (2006). Proof and consequence: An introduction to classical logic. Peterborough: Broadview Press.
Jennings, R. E., & Schapansky, N. (2000). Without: From separation to negation, a case study in logicalization. In Proceedings of the CLA 2000 (pp. 147-158). Ottawa: Cahiers Linguistiques d'Ottawa.
Jensen, E. (2000). Brain-based learning: The new science of teaching and training (revised ed.). Brain Store Inc.
Jordan, D. W., & Smith, P. (1997). Mathematical techniques: An introduction for the engineering, physical, and mathematical sciences (2nd ed.). UK: Oxford University Press.
Joy, B. (2000, April). Why the future doesn't need us. Wired, 8.04.
Jung, S., & Hsia, T.C. (2000). Neural network inverse control techniques for PD controlled robot manipulator. Robotica, 18, 305-314.
Kadanoff, L.P. (1993). From order to chaos: Essays (p. 555). Singapore: World Scientific.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall, Inc.
Kahneman, D. (2002, December 8). Maps of bounded rationality: A perspective on intuitive judgment and choice. Nobel Prize lecture.
Kahneman, D. (2003). A perspective on judgment and choice: Mapping bounded rationality. American Psychologist, 58, 697-720.
Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases (pp. 49-81). New York: Cambridge University Press.
Kantor, P. B. (1980). Availability analysis. Journal of the American Society for Information Science, 27(6), 311-319. Reprinted (1980) in Key papers in information science (pp. 368-376). White Plains, NY: Knowledge Industry Publications, Inc.
Kantz, H., & Schreiber, T. (1997). Nonlinear time series analysis (p. 304). Cambridge, UK: Cambridge University Press.
Kaplan, J.L., & Yorke, J.A. (1979). Chaotic behavior of multidimensional difference equations. In Peitgen, H.-O. & Walther, H.O. (eds.), Functional differential equations and approximations of fixed points (pp. 204-227, 503). New York, NY: Springer Verlag.
Kawato, M., Maeda, Y., Uno, Y., & Suzuki, R. (1990). Trajectory formation of arm movement by cascade neural network model based on minimum torque-change criterion. Biological Cybernetics, 62, 275-288.
Kephart, J., & Chess, D. (2003, January). The vision of autonomic computing. IEEE Computer, 36(1), 41-50.
Kieffer, S., Morellas, V., & Donath, M. (1991, April). Neural network learning of the inverse kinematic relationships for a robot arm. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 2418-2425). Sacramento, CA.
Kim, S. W., Park, K. B., & Lee, J. J. (1994). Redundancy resolution of robot manipulators using optimal kinematic control. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 683-688). San Diego, USA.
Kinsner, W. (1991). Review of data compression methods, including Shannon-Fano, Huffman, arithmetic, Storer, Lempel-Ziv-Welch, fractal, neural network, and wavelet algorithms. Technical Report DEL91-1 (p. 157). Winnipeg, MB, Canada: Dept. Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1994). Fractal dimensions: Morphological, entropy, spectrum, and variance classes. Technical Report DEL94-4 (p. 146). Winnipeg, MB, Canada: Dept. of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1994, May). A unified approach to fractal and multifractal dimensions. Technical Report DEL94-4 (p. 147). Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada (abbreviated to UofM in the references below).
Kinsner, W. (1994, June 7). Entropy-based fractal dimensions: Probability and pair-correlation algorithms for E-dimensional images and strange attractors. Technical Report DEL94-5; UofM (p. 44).
Kinsner, W. (1994, June 15). Batch and real-time computation of a fractal dimension based on variance of a time series. Technical Report DEL94-6; UofM (p. 22).
Kinsner, W. (1994, June 20). The Hausdorff-Besicovitch dimension formulation for fractals and multifractals. Technical Report DEL94-7; UofM (p. 12).
Kinsner, W. (1995, January). Self-similarity: The foundation for fractals and chaos. Technical Report DEL95-2; UofM (p. 113).
Kinsner, W. (1996). Fractal and chaos engineering: Postgraduate lecture notes (p. 760). Winnipeg, MB, Canada: Department of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (1998). Signal and data compression: Postgraduate lecture notes (p. 642). Winnipeg, MB, Canada: Department of Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2002, August 19-20). Compression and its metrics for multimedia. In Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI’02) (pp. 107-121). Calgary, AB. ISBN 0-7695-1724-2.
Kinsner, W. (2003, August 18-20). Characterizing chaos through Lyapunov metrics. In Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (ICCI’03) (pp. 189-201). London, UK. ISBN 0-7695-1986-5.
Kinsner, W. (2003). Characterizing chaos with Lyapunov exponents and Kolmogorov-Sinai entropy. Technical Report DEL03-1 (p. 76). Winnipeg, MB, Canada: Dept. Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2003). Is it noise or chaos? Technical Report DEL03-2 (p. 98). Winnipeg, MB, Canada: Dept. Electrical & Computer Engineering, University of Manitoba.
Kinsner, W. (2004, August 16-18). Is entropy suitable to characterize data and signals for cognitive informatics? In Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI’04) (pp. 6-21). Victoria, BC. ISBN 0-7695-2190-8.
Kinsner, W. (2005). Some advances in cognitive informatics. In Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 6-7). IEEE Press.
Kinsner, W. (2005, August 8-10). A unified approach to fractal dimensions. In Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 58-72). Irvine, CA. ISBN 0-7803-9136-5.
Kinsner, W. (2005, June 16-18). Signal processing for autonomic computing. In Proceedings of the 2005 Meeting of the Canadian Applied & Industrial Mathematics Society (CAIMS 2005), Winnipeg, MB. Available as of May 2006 from http://www.umanitoba.ca/institutes/iims/caims2005_theme_signal.shtml
Kinsner, W. (2006, July). Towards cognitive machines: Multiscale measures and analysis. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 8-14). Beijing, China: IEEE CS Press.
Kinsner, W. (2007). Towards cognitive machines: Multiscale measures and analysis. The International Journal on Cognitive Informatics and Natural Intelligence (IJCINI), 1(1), 28-38.
Kinsner, W., & Dansereau, R. (2006, July 17-19). A relative fractal dimension spectrum as a complexity measure. In Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06). Beijing, China. ISBN 1-4244-0475-4.
Kinsner, W., Cheung, V., Cannons, K., Pear, J., & Martin, T. (2003, August 18-20). Signal classification through multifractal analysis and complex domain neural networks. In Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (ICCI’03) (pp. 41-46). London, UK.
Kinsner, W., Potter, M., & Faghfouri, A. (2005, June 16-18). Signal processing for autonomic computing. In Rec. Can. Applied & Industrial Mathematical Sciences, CAIMS05. Winnipeg, MB.
Kinsner, W., Zhang, D., Wang, Y., & Tsai, J. (Eds.) (2005, August). Cognitive informatics. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (356 pp.). Irvine, CA: IEEE CS Press.
Kircanski, M., & Petrovic, T. (1993). Combined analytical-pseudoinverse inverse kinematic solution for simple redundant manipulators and singularity avoidance. Int. J. Robotics Research, 12(1), 188-196.
Kleene, S.C. (1956). Representation of events by nerve nets. In C.E. Shannon and J. McCarthy (eds.), Automata studies (pp. 3-42). Princeton Univ. Press.
Klein, C. A., & Huang, C. H. (1983, April). Review of pseudoinverse control for use with kinematically redundant manipulators. IEEE Trans. Sys. Man and Cyb., SMC-13(3), 245-250.
Klein, C., Chu-Jenq, C., & Ahmed, S. (1993). Use of an extended Jacobian method to map algorithmic singularities. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 632-637). Atlanta, USA.
Klir, G.J. (1992). Facets of systems science. New York: Plenum.
Klivington, K. (1989). The science of mind (p. 239). Cambridge, MA: MIT Press.
Knuth, D. (1984). Literate programming. The Computer Journal, 27(2), 97-111.
Koehler, W. (1929). Gestalt psychology. London: Liveright.
Koffka, K. (1935). Principles of Gestalt psychology. New York: Brace & World.
Kohonen, T. (2002). Self-organization and associative memory (2nd ed.) (p. 312). New York, NY: Springer Verlag.
Kokinov, B., & Petrov, A. (2000). Dynamic extension of episode representation in analogy-making in AMBR. In Proceedings of the 22nd Conference of the Cognitive Science Society (pp. 274-279). NJ.
Kort, B., & Reilly, R. (2002). Theories for deep change in affect-sensitive cognitive machines: A constructivist model. Educational Technology & Society, 5(4), 3
Kozaczynski, W., & Wilde, N. (1992). On the re-engineering of transaction systems. Journal of Software Maintenance, 4, 143-162.
Kronauer, R.E., & Yehoshua, Y.Z. (1985). Reorganization and diversification of signals in vision. IEEE Trans. on Sys., Man, and Cyber., SMC-15(1), 91-101.
Kurzweil, R. (1990). The age of intelligent machines (p. 565). Cambridge, MA: MIT Press.
Kurzweil, R. (1999). The age of spiritual machines: When computers exceed human intelligence. New York: Penguin Group.
Laresgoiti, I., Anjewierden, A., Bernaras, A., Corera, J., Schreiber, A.Th., & Wielinga, B.J. (1996). Ontologies as vehicles for reuse: A mini-experiment. In B.R. Gaines & M.A. Musen (eds.), Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW-96) (pp. 30.1-30.21). Banff, Canada.
Latombe, J.-C. (2006, July). Probabilistic roadmaps: A motion planning approach based on active learning. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 1-2). Beijing, China: IEEE CS Press.
Lau, C. (1991). Neural networks: Theoretical foundations and analysis. IEEE Press.
Lenat, D. (1998, October). The dimensions of context-space. CYCorp Report.
Levesque, H. J., & Lakemeyer, G. (2000). The logic of knowledge bases. Cambridge, MA: MIT Press.
Levesque, H. J. (1984). The logic of incomplete knowledge bases. In M. L. Brodie, J. Mylopoulos, & J.W. Schmidt (Eds.), On conceptual modeling. New York: Springer-Verlag.
Levy, A. Y., & Rousset, M.-C. (1998). Verification of knowledge bases based on containment checking. Artificial Intelligence, 101(1-2), 227-250.
Licklider, J. C. R. (1960). Man-computer symbiosis. IRE Transactions on Human Factors in Electronics, HFE-1, 4-11.
Licklider, J. C. R., & Taylor, R. G. (1968, April). The computer as a communication device. Science & Technology, 76, 21-31.
Lieberman, P. (1984). The biology and evolution of language. Cambridge, MA: Harvard University Press.
Lieberman, P. (1991). Uniquely human: The evolution of speech, thought, and selfless behavior. Cambridge, MA: Harvard University Press.
Lieberman, P. (2000). Human language and our reptilian brain: The subcortical bases of speech, syntax and thought. Cambridge, MA: Harvard University Press.
Liegeois, A. (1986, December). Automatic supervisory control of the configuration and behavior of multi-body mechanisms. IEEE Trans. Sys. Man and Cyb., SMC-7(3), 868-871.
Lin, T.Y. (1997). Granular computing. Announcement of the BISC special interest group on granular computing.
Lintermann, B., & Deussen, O. (1999). Interactive structural and geometrical modeling of plants. IEEE Computer Graphics and Applications, 19(1).
Lipschutz, S. (1964). Schaum’s outline of theories and problems of set theory and related topics. New York, NY: McGraw-Hill Inc.
Lipschutz, S. (1967). Schaum’s outline of set theory and related topics. McGraw-Hill Inc.
Liu, L., & Lin, M. (1991). Forecasting residential consumption of natural gas using monthly and quarterly time series. International Journal of Forecasting, 7, 3-16.
López, N., & Núñez, M. (2001). A testing theory for generally distributed stochastic processes. In CONCUR 2001, LNCS 2154 (pp. 321-335). Springer.
López, N., Núñez, M., & Rubio, F. (2004). An integrated framework for the analysis of asynchronous communicating stochastic processes. Formal Aspects of Computing, 16(3), 238-262.
Mackey, M.C. (1992). Time’s arrow: The origin of thermodynamic behavior (p. 175). New York, NY: Springer Verlag.
Mainzer, K. (2004). Thinking in complexity (4th ed.) (p. 456). New York, NY: Springer Verlag.
Mallat, S. (1998). A wavelet tour of signal processing (p. 577). San Diego, CA: Academic.
Mandelbrot, B.B. (1974). Intermittent turbulence in self-similar cascades: Divergence of higher moments and dimension of the carrier. J. Fluid Mech., 62(2), 331-358.
Mandelbrot, B.B. (1982). The fractal geometry of nature (p. 468). New York, NY: W.H. Freeman.
Mann, S. (2002). Intelligent image processing (p. 339). New York, NY: Wiley/IEEE.
Mannila, H. (1997). Methods and problems in data mining. In Proceedings of the International Conference on Database Theory ’97 (pp. 41-55).
Martin, R. C. (2002). Agile software development: Principles, patterns, and practices. MA: Addison Wesley.
Maslow, A. H. (1970). Motivation and personality (2nd ed.). New York: Viking.
Matlin, M.V. (1998). Cognition (4th ed.). Harcourt Brace and Company.
Mayer, R.E. (1992). Thinking, problem solving, cognition (2nd ed.). W.H. Freeman and Company.
McCulloch, W.S. (1965). Embodiments of mind. Cambridge, MA: MIT Press.
McCulloch, W.S. (1993). The complete works of Warren S. McCulloch. Salinas, CA: Intersystems Pub.
McCulloch, W.S., & Pitts, W.H. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
McDermott, J. (1998). Preliminary steps toward a taxonomy of problem-solving methods. In S. Marcus (ed.), Automating knowledge acquisition for expert systems (pp. 225-255). Boston: Kluwer.
Menzies, T., & Pecheur, C. (2005). In M. Zelkowitz (Ed.), Advances in computers, Vol. 65. Amsterdam, The Netherlands: Elsevier.
Meystel, A.M., & Albus, J.S. (2002). Intelligent systems: Architecture, design, and control. John Wiley & Sons, Inc.
Michalski, R.S., Carbonell, J.G., & Mitchell, T.M. (Eds.) (1983). Machine learning: An artificial intelligence approach (pp. 463-482). Palo Alto, CA: Morgan Kaufmann.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity to process information. Psychological Review, 63, 81-97.
Milner, R. (1989). Communication and concurrency. Englewood Cliffs, NJ: Prentice-Hall.
Mingers, J. (1989). An empirical comparison of pruning measures for decision tree induction. Machine Learning, 4, 227-243.
Minsky, M. (1986). The society of mind (p. 339). New York, NY: Touchstone.
Mitra, S.K. (1998). Digital signal processing: A computer-based approach (p. 864). New York: McGraw-Hill (MatLab Series).
Montello, D. (1993). Scale and multiple psychologies of space. In A. Frank & I. Campari (Eds.), Spatial information theory: A theoretical basis for GIS (pp. 312-321). Berlin: Springer.
Murata, T., Subrahmanian, V. S., & Wakayama, T. (1991). A Petri net model for reasoning in the presence of inconsistency. IEEE Transactions on Knowledge and Data Engineering, 3(3), 281-292.
Murch, R. (2004). Autonomic computing. London: Pearson Education.
Najjar, M., Fournier-Viger, P., Lebeau, J. F., & Mayers, A. (2006, July 17-19). Recalling recollections according to temporal contexts - Applying a novel cognitive knowledge representation approach. In Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06). Beijing, China.
Najjar, M., Fournier-Viger, P., Mayers, A., & Bouchard, F. (2005, July 21-26). Memorising remembrances in computational modelling of interrupted activities. In Proceedings of the 7th International Conference on Computational Intelligence and Natural Computing (pp. 483-486). Salt Lake City, Utah, USA.
Nakamura, Y., & Hanafusa, H. (1987). Optimal redundancy control of robot manipulators. Int. J. Robotics Research, 6(1), 32-42.
Neches, R., Fikes, R., Finin, T., Gruber, T., Patil, R., Senator, T., & Swartout, W.R. (1998). Enabling technology for knowledge sharing. AI Magazine, 12(3), 37-56.
Neely, J. H. (1989). Experimental dissociations and the episodic/semantic memory distinction. Experimental Psychology: Human Learning and Memory, (6), 441-466.
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18(1), 87-127.
Ngo-The, A., & Ruhe, G. (2006). A systematic approach for solving the wicked problem of software release planning. Submitted to Journal of Soft Computing.
Nguyen, S.H., Skowron, A., & Stepaniuk, J. (2001). Granular computing: A rough set approach. Computational Intelligence, 17, 514-544.
Nguyen, T. A., Perkins, W. A., Laffey, T. J., & Pecora, D. (1987). Knowledge base verification. AI Magazine, 8(2), 69-75.
Nicollin, X., & Sifakis, J. (1991). An overview and synthesis on timed process algebras. In Computer Aided Verification ’91, LNCS 575 (pp. 376-398).
Nielsen, J. (1993). Usability engineering. Cambridge, MA: Academic Press/AP Professional.
Nielsen, M.A., & Chuang, I.L. (2000). Quantum computation and quantum information (p. 676). Cambridge, UK: Cambridge University Press.
Norman, D. A. (2004). Emotional design: Why we love (or hate) everyday things. New York: Basic Books.
Norman, D. A. (2005). Human-centered design considered harmful. Interactions. Retrieved August 6, 2006, from http://delivery.acm.org/10.1145/1080000/1070976/p14-norman.html?key1=1070976&key2=3820555211&coll=portal&dl=ACM&CFID=554857554&CFTOKEN=554857554
Norman, D. A., & Draper, S. W. (1986). User-centered system design: New perspectives on human-computer interaction. Mahwah, NJ: Lawrence Erlbaum.
Novak, J. D. (1998). Learning, creating, and using knowledge. Mahwah, NJ: Lawrence Erlbaum Associates.
Núñez, M. (2003). Algebraic theory of probabilistic processes. Journal of Logic and Algebraic Programming, 56(1-2), 117-177.
Núñez, M., & de Frutos, D. (1995). Testing semantics for probabilistic LOTOS. In Formal Description Techniques 8 (pp. 365-380). Chapman & Hall.
Núñez, M., de Frutos, D., & Llana, L. (1995). Acceptance trees for probabilistic processes. In CONCUR’95, LNCS 962 (pp. 249-263). Springer.
Núñez, M., Rodríguez, I., & Rubio, F. (2003). Towards the identification of living agents in complex computational environments. In 2nd IEEE Int. Conf. on Cognitive Informatics (pp. 151-160). IEEE Computer Society Press.
Núñez, M., Rodríguez, I., & Rubio, F. (2004). Applying Occam’s razor to FSMs. In International Conference on Cognitive Informatics (pp. 138-147). IEEE Press.
O’Leary, D. E. (1998). Using AI in knowledge management: Knowledge bases and ontologies. IEEE Intelligent Systems, 13(3), 34-39.
Olshausen, B.A., & Field, D.J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.
Oppenheim, A.V., & Schafer, R.W. (1975). Digital signal processing (p. 585). Englewood Cliffs, NJ: Prentice-Hall.
Oppenheim, A.V., & Schafer, R.W. (1989). Discrete-time signal processing (p. 879). Englewood Cliffs, NJ: Prentice-Hall.
Oppenheim, A.V., & Willsky, A.S. (1983). Signals and systems (p. 796). Englewood Cliffs, NJ: Prentice-Hall.
Oppenheim, A.V., Schafer, R.W., & Buck, J.R. (1999). Discrete-time signal processing (2nd ed.) (p. 870). Prentice Hall.
Osborne, M., & Rubinstein, A. (1994). A course in game theory. MIT Press.
Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 62-66.
Ott, E. (1993). Chaos in dynamical systems (p. 385). Cambridge, UK: Cambridge University Press.
Painter, T., & Spanias, A. (1998, April). Perceptual coding of digital audio. Proceedings of the IEEE, 88(4), 451-513.
Parasuraman, R. (2003). Neuroergonomics: Research and practice. Theoretical Issues in Ergonomics Science, 4(1-2), 5-20.
Parsell, M. (2005, March). Review of P.O. Haikonen, The cognitive approach to conscious machines. Psyche, 11(2), 1-6. Available as of May 2006 from http://psyche.cs.monash.edu.au/book_reviews/haikonen/haikone.pdf
Patel, D., Patel, S., & Wang, Y. (eds.) (2003, August). Cognitive informatics. Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (ICCI’03) (227 pp.). IEEE Computer Society Press.
Pawlak, Z. (1991). Rough sets: Theoretical aspects of reasoning about data (p. 252). New York, NY: Springer.
Payne, D.G., & Wenger, M.J. (1998). Cognitive psychology. New York: Houghton Mifflin Co.
Pedrycz, W. (Ed.) (2001). Granular computing: An emerging paradigm. Heidelberg: Physica-Verlag.
Pedrycz, W., & Gomide, F. (1998). An introduction to fuzzy sets: Analysis and design (p. 465). Cambridge, MA: MIT Press.
Peitgen, H.-O., Jürgens, H., & Saupe, D. (1992). Chaos and fractals: New frontiers of science (p. 984). New York, NY: Springer-Verlag.
Pelayo, F.L., Cuartero, F., Valero, V., & Cazorla, D. (2000). An example of performance evaluation by using the stochastic process algebra ROSA. In 7th Int. Conf. on Real-Time Systems and Applications (pp. 271-278). IEEE Computer Society Press.
Pennebaker, W.B., & Mitchell, J.L. (1993). JPEG still image data compression standard (p. 638). New York, NY: Van Nostrand Reinhold.
Penrose, R. (1989). The emperor’s new mind (p. 480). Oxford, UK: Oxford University Press.
Penrose, R. (1994). The shadows of the mind: A search for the missing science of consciousness (p. 457). Oxford, UK: Oxford University Press.
Perdereau, V., Passi, C., & Drouin, M. (2002). Real-time control of redundant robotic manipulators for mobile obstacle avoidance. Robotics and Autonomous Systems, 41, 41-59.
Pescovitz, D. (2002). Autonomic computing: Helping computers help themselves. IEEE Spectrum, 39(9), 49-53.
Pesin, Y.B. (1977). Characteristic Lyapunov exponents and smooth ergodic theory. Russian Mathematical Surveys, 32, 55-114.
Pfeifer, R., & Scheier, C. (1999). Understanding intelligence (p. 720). Cambridge, MA: MIT Press.
Piaget, J. (1954). The construction of reality in the child. New York: Basic Books.
Piaget, J., & Inhelder, B. (1948). La représentation de l’espace chez l’enfant. Paris: PUF: Bibliothèque de Philosophie Contemporaine.
Pinel, J.P.J. (1997). Biopsychology (3rd ed.). Needham Heights, MA: Allyn and Bacon.
Pirolli, P., & Card, S. K. (1999). Information foraging. Psychological Review, 106(4), 643-675.
Plotkin, G.D. (1981). A structural approach to operational semantics. Technical Report DAIMI FN-19, Computer Science Department, Aarhus University.
Popper, K. (2003). The logic of scientific discovery. Taylor & Francis Books Ltd.
Posner, M. (ed.) (1989). Foundations of cognitive science (p. 888). Cambridge, MA: MIT Press.
Prigogine, I. (1996). The end of certainty: Time, chaos, and the new laws of nature (p. 228). New York, NY: The Free Press.
Prigogine, I., & Stengers, I. (1984). Order out of chaos: Man’s new dialogue with nature (p. 349). New York, NY: Bantam.
Principe, J.C., Euliano, N.R., & Lefebvre, W.C. (2000). Neural and adaptive systems: Foundations through simulations (p. 656). New York, NY: Wiley.
Proakis, J.G., & Manolakis, D.G. (1995). Digital signal processing: Principles, algorithms and applications (3rd ed.) (p. 1016). Upper Saddle River, NJ: Prentice-Hall.
Pylyshyn, Z. W. (1989). Computing in cognitive science. In M. I. Posner (Ed.), Foundations of cognitive science (pp. 49-92). Cambridge, MA: MIT Press.
Quillian, M.R. (1968). Semantic memory. In M. Minsky (ed.), Semantic information processing. Cambridge, MA: Cambridge Press.
Quinlan, J.R. (1983). Learning efficient classification procedures and their application to chess end-games. In R.S. Michalski, J.G. Carbonell, & T.M. Mitchell (Eds.), Machine learning: An artificial intelligence approach (Vol. 1).
Quinlan, J.R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann.
Rabin, M.O., & Scott, D. (1959). Finite automata and their decision problems. IBM Journal of Research and Development, 3, 114-125.
Rajlich, V. (2002). Program comprehension as a learning process. Paper presented at the First IEEE International Conference on Cognitive Informatics, Calgary, Alberta.
Rajlich, V., & Bennett, K. H. (2000). A staged model for the software lifecycle. Computer, 33(7), 66-71.
Rajlich, V., & Xu, S. (2003). Analogy of incremental program development and constructivist learning. Paper presented at the Second IEEE International Conference on Cognitive Informatics, London, UK.
Ralston, A., Reilly, E.D., & Hemmendinger, D. (eds.) (2003). Encyclopedia of computer science (4th ed.) (p. 2064). New York, NY: Wiley.
Ramdane-Cherif, A., Perdereau, V., & Drouin, M. (1995, November-December). Optimization schemes for learning the forward and inverse kinematic equations with neural network. In Proc. IEEE Int. Conf. Neural Networks. Perth, Australia.
Ramdane-Cherif, A., Perdereau, V., & Drouin, M. (1996, April). Penalty approach for a constrained optimization to solve on-line the inverse kinematic problem of redundant manipulators. In Proc. IEEE Int. Conf. Robotics and Automation (pp. 133-138). Minneapolis, USA.
Ran, A., & Kuusela, J. (1996). Design decision trees. Paper presented at the Eighth International Workshop on Software Specification and Design, Paderborn, Germany.
Randell, D., Cui, Z., & Cohn, A. (1992). A spatial logic based on regions and connection. In B. Nebel, W. Swartout, & C. Rich (Eds.), Proceedings of the 3rd International Conference on Knowledge Representation and Reasoning (pp. 165-176). San Mateo: Morgan Kaufmann.
Rao, R., Gordon, D., & Spears, W. (1995). For every generalization action, is there really an equal or opposite reaction? Analysis of conservation law. In Proceedings of the Twelfth International Conference on Machine Learning (pp. 471-479). Morgan Kaufmann.
Reed, G., & Roscoe, A. (1988). A timed model for communicating sequential processes. Theoretical Computer Science, 58, 249-261.
Richards, D. D., & Goldfarb, J. (1986). The episodic memory model of conceptual development: An integrative viewpoint. Cognitive Development, 1, 183-219.
Rissanen, J. (1978). Modelling by shortest data description. Automatica, 14, 465-471.
Rittel, H., & Webber, M. (1984). Planning problems are wicked problems. In N. Cross (ed.), Developments in design methodology (pp. 135-144). Chichester, UK: Wiley.
Robillard, P. N. (1999). The role of knowledge in software development. Communications of the ACM, 42(1), 87-92.
Rosch, E., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382-439.
Ross, R. G. (2003). Principles of the business rules approach. Boston, MA: Addison-Wesley.
Rostkowycz, A. J., Rajlich, V., & Marcus, A. (2004). Case study on the long-term effects of software redocumentation. Paper presented at the 20th IEEE International Conference on Software Maintenance, Chicago, IL.
Roy, B. (1991). The outranking approach and the foundations of ELECTRE methods. Theory and Decision, 31, 49-73.
Roy, D.K. (2005, August). Grounding words in perception and action: Insight for computational models. Trends in Cognitive Sciences, 9(8), 389-396.
Roy, D.K., & Pentland, A.P. (2002). Learning words from sights and sounds: A computational model. Cognitive Science, 26, 113-146.
Ruaro, M.E., Bonifazi, P., & Torre, V. (2005, March). Toward the neurocomputer: Image processing and pattern recognition with neuronal cultures. IEEE Trans. Biomedical Eng., 52(3), 371-383.
Ruelle, D. (1978). Thermodynamic formalism (p. 183). Reading, MA: Addison-Wesley-Longman and Cambridge, UK: Cambridge University Press.
Rugaber, S., Ornburn, S. B., & LeBlanc, R. J. (1990). Recognizing design decisions in programs. IEEE Software, 7(1), 46-54.
Ruhe, G. (2003). Software engineering decision support - Methodologies and applications. In Tonfoni and Jain (eds.), Innovations in Decision Support Systems, 3, 143-174.
Ruhe, G., & An, N.-T. (2004). Hybrid intelligence in software release planning. International Journal of Hybrid Intelligent Systems, 1(2), 99-110.
Rumelhart, D.E., & McClelland, J.L. (1986). Parallel distributed processing (Vols. 1-2). Cambridge, MA: MIT Press.
Rushby, J., & Whitehurst, R.A. (1989, February). Formal verification of AI software. NASA Contractor Report 181827.
Rushby, J. (1988, October). Quality measures and assurance for AI software. NASA Contractor Report 4187.
Rybak, G., & Golovan, P. (1998). A model of attention-guided visual perception and recognition. Vision Research, 38, 2387-2400.
Rzepa, H., & Tonge, A. (1998). VChemlab: A virtual chemistry laboratory. Journal of Chemical Information and Computer Sciences, 38(6), 1048-1053.
Sailor, D.J., & Munoz, J.R. (1997). Sensitivity of electricity and natural gas consumption to climate in the U.S.A.: Methodology and results for eight states. Energy, 22(10), 987-998.
Sandia National Laboratories (2006). Projects. Available as of May 2006 from http://www.sandia.gov/cog.systems/Projects.htm
Sanquist, T. F., Greitzer, F. L., Slavich, A., Littlefield, R., Littlefield, J., & Cowley, P. (2004). Cognitive tasks in information analysis: Use of event dwell time to characterize component activities. In Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting, New Orleans, Louisiana.
Sanz, R., Chrisley, R., & Sloman, A. (2003). Models of consciousness: Scientific report (p. 37). European Science Foundation. Available as of May 2006 from http://www.esf.org/generic/1650/EW0296Report.pdf
Sastry, P. S., Santharam, G., & Unnikrishnan, K. P. (1994, March). Memory neuron networks for identification and control of dynamical systems. IEEE Trans. Neural Networks, 5(2), 306-319.
Sayood, K. (2000). Introduction to data compression (2nd ed.) (p. 636). San Francisco, CA: Morgan Kaufman.
Schaffer, J. (1994). A conservation law for generalization performance. In Proceedings of the 11th International Conference on Machine Learning (pp. 259-265). Morgan Kaufmann.
Schank, R., & Abelson, R. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Erlbaum.
Schmorrow, D. D., & Kruse, A. A. (2004). Augmented cognition. In W. S. Bainbridge (Ed.), Berkshire encyclopedia of human computer interaction (pp. 54-59). Great Barrington, MA: Berkshire Publishing Group.
Schmorrow, D., & McBride, D. (2005). Introduction to special issue on augmented cognition. International Journal of Human-Computer Interaction, 17(2).
Scholtz, J., Morse, E., & Hewett, T. (2004, March). In depth observational studies of professional intelligence analysts. Paper presented at Human Performance, Situation Awareness, and Automation (HPSAA), Daytona Beach, FL. Retrieved August 6, 2006, from http://www.itl.nist.gov/iad/IADpapers/2004/scholtz-morse-hewett.pdf
Schoning, U. (1989). Logic for computer scientists. Boston: Birkhauser.
Schreiber, G., Breuker, J., Bredeweg, B., & Wielinga, B. (1988, June 19-23). Modeling in knowledge based systems development. In J. Boose, B. Gaines, & M. Linster (eds.), Proceedings of the European Knowledge Acquisition Workshop (EKAW ’88) (pp. 7.1-7.15). Gesellschaft für Mathematik und Datenverarbeitung, MBH.
Schrödinger, E. (1944). What is life? with Mind and matter and Autobiographical sketches (p. 184). Cambridge, UK: Cambridge University Press. ISBN 0-521-42708-8 pbk; reprinted 2002.
Schroeder, M.R. (1991). Fractals, chaos, power laws (p. 429). New York, NY: W.H. Freeman.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461-464.
Searle, J.R. (1980). Minds, brains and programs. Behavioral & Brain Sciences, 3, 417-424.
Searle, J.R. (1992). The rediscovery of the mind (p. 288). Cambridge, MA: MIT Press.
Shannon, C.E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423, 623-656.
Shannon, C.E. (ed.) (1956). Automata studies. Princeton: Princeton University Press.
Shao, J., & Wang, Y. (2003). A new measure of software complexity based on cognitive weights. IEEE Canadian Journal of Electrical and Computer Engineering, 28(2), 69-74.
Shastri, L. (2002). Episodic memory and cortico-hippocampal interactions. Trends in Cognitive Sciences, 6, 162-168.
Sienko, T., Adamatzky, A., Rambidi, N.G., & Conrad, M. (2003). Molecular computing (p. 257). Cambridge, MA: MIT Press.
Simon, H. (1998). Neural networks: A comprehensive foundation. Upper Saddle River, NJ: Prentice Hall PTR.
Simoncelli, E. P. (2003). Vision and statistics of the visual environment. Current Opinion in Neurobiology, 13, 144-149.
Simoncelli, E. P., & Olshausen, B. A. (2001). Natural image statistics and neural representation. Annual Review of Neuroscience, 24, 1193-1216.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3-22.
Sloman, S. A. (2002). Two systems of reasoning. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases (pp. 379-396). New York: Cambridge University Press.
Smith, K.J. (2001). The nature of mathematics (9th ed.). CA: Brooks/Cole, Thomson Learning Inc.
Smith, R.E. (1993). Psychology. St. Paul, MN: West Publishing Co.
Soloman, S. (1999). Sensor handbook (p. 1486). New York, NY: McGraw-Hill.
Solso, R. (ed.) (1999). Mind and brain science in the 21st century. MIT Press.
Spelke, E. S. (1990). Principles of object perception. Cognitive Science, 14, 29-56.
Sperschneider, V., & Antoniou, G. (1991). Logic: A foundation for computer science. Reading, MA: Addison-Wesley.
Sprott, J.C. (2003). Chaos and time-series analysis (p. 507). Oxford, UK: Oxford University Press.
Squire, L., Knowlton, B., & Musen, G. (1993). The structure and organization of memory. Annual Review of Psychology, 44, 453-459.
Stacey, G. (1994, November). Stochastic fractal modelling of dielectric discharges (p. 308). Master’s thesis. Winnipeg, MB: University of Manitoba.
Stanley, H.E., & Meakin, P. (1988, September 29). Multifractal phenomena in physics and chemistry. Nature, 335, 405-409.
Stanovich, K. E. (1999). Who is rational: Studies of individual differences in reasoning. Mahwah, NJ: Erlbaum.
Stanovich, K. E., & West, R. F. (2002). Individual differences in reasoning: Implications for the rationality debate. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases. New York: Cambridge University Press.
Steels, L. (1990). Components of expertise. AI Magazine, 11(2), 29-49.
Sternberg, R.J. (1998). In search of the human mind (2nd ed.). Orlando, FL: Harcourt Brace & Co.
Stevens, S. S. (1975). Psychophysics: Introduction to perceptual, neural, and social prospects. New York: Wiley.
Stonier, T. (1990). Information and the internal structure of the universe: An exploration into information physics (p. 155). New York, NY: Springer Verlag.
Subramanian, D., Pekny, J.F., & Reklaitis, G.V. (2000). A simulation-optimization framework for addressing combinatorial and stochastic aspects of a research & development pipeline management. Computers and Chemical Engineering, 24(7), 1005-1011.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257-285.
Tamma, V., Phelps, S., Dickinson, I., & Woolridge, M. (2005). Ontologies for supporting negotiation in e-commerce. Special Issue, Engineering Applications of Artificial Intelligence, 18(2), 223-236.
Tao, W.O., & Ti, H.C. (1998). Transients analysis of gas pipeline network. Chemical Engineering Journal, 69, 47-52.
Tarr, M. J. (1995). Rotating objects to recognize them: A case study of the role of mental transformations in the recognition of three-dimensional objects. Psychonomic Bulletin and Review, 2, 55-82.
Tarr, M. J., & Buelthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21(6), 1494-1505.
Taylor, J.G. (2001). The race to consciousness (p. 392). Cambridge, MA: MIT Press.
Taylor, J.G. (2002). Paying attention to consciousness. Trends in Cognitive Sciences, 6, 206-210.
Taylor, J.G. (2003, June 20-24). The CODAM model of attention and consciousness. In Proceedings of the International Joint Conference on Neural Networks (IJCNN03), 1, 292-297. Portland, OR.
Tekalp, A.M. (ed.) (1998, May). Multimedia signal processing [Special issue]. Proceedings of the IEEE, 86(5).
Thelen, E., & Smith, L.B. (2002). A dynamic systems approach to the development of cognition and action (p. 376). Cambridge, MA: MIT Press.
Tomassi, P. (1999). Logic. London and New York: Routledge.
Tornay, S. (1938). Ockham: Studies and selections. La Salle, IL: Open Court Publishers.
Treisman, A.M. (1964). Verbal cues, language, and meaning in selective attention. American Journal of Psychology, 77, 206-219.
Tricot, C. (1995). Curves and fractal dimension (p. 323). New York, NY: Springer-Verlag.
Tulving, E. (1983). Elements of episodic memory. New York: Oxford University Press.
Turcotte, D.L. (1997). Fractals and chaos in geology and geophysics (2nd ed.) (p. 398). Cambridge, UK: Cambridge University Press.
Turing, A.M. (1950). Computing machinery and intelligence. Mind, 59, 433-460.
Tversky, B. (2005). Functional significance of visuospatial representation. In P. Shah & A. Miyake (Eds.), Handbook of higher-level visuospatial thinking. Cambridge: Cambridge University Press.
Tversky, B., & Lee, P. (1999). How space structures language. In C. Freksa, C. Habel, & K. F. Wender (Eds.), Spatial cognition: An interdisciplinary approach to representation and processing of spatial knowledge (pp. 157-176). Springer-Verlag.
Tversky, B., Morrison, J. B., Franklin, N., & Bryant, D. (1999). Three spaces of spatial cognition. Professional Geographer, 51, 516-524.
UCLA Cognitive Systems Laboratory (2006). Available as of May 2006 from http://singapore.cs.ucla.edu/cogsys.html
Uraikul, V., Chan, C.W., & Tontiwachwuthikul, P. (2000). Development of an expert system for optimizing natural gas operations. Expert Systems with Applications, 18(4), 271-282.
Uschold, M. (2003). Where are the semantics in the Semantic Web? AI Magazine, 24(3), 25-36.
Uschold, M., King, M., Moralee, S., & Zorgios, Y. (1998). The enterprise ontology. The Knowledge Engineering Review, 13(1), 31-89.
Valente, A., & Breuker, J. (1996). Towards principled core ontologies. In B.R. Gaines & M. Musen (eds.), Proceedings of the KAW-96, Banff, Canada.
Van de Velde, W., & Schreiber, G. (1997). The future of knowledge acquisition: A European perspective. IEEE Expert, 1, 1-3.
van Emden, M. H., & Kowalski, R. (1976). The semantics of predicate logic as a programming language. Journal of the ACM, 23, 733-742.
van Heijenoort, J. (1997). From Frege to Gödel: A source book in mathematical logic, 1879-1931. Cambridge, MA: Harvard University Press.
Vapnik, V. (1995). The nature of statistical learning theory. Springer.
Velmans, M. (2000). Understanding consciousness (p. 296). New York, NY: Routledge.
Vicsek, T. (1992). Fractal growth phenomena (2nd ed.) (p. 488). Singapore: World Scientific.
Vinje, W.E., & Gallant, J.L. (2000). Sparse coding and decorrelation in primary visual cortex during natural vision. Science, 287, 1273-1276.
von Bertalanffy, L. (1952). Problems of life: An evaluation of modern biological and scientific thought. London: C.A. Watts.
von Glasersfeld, E. (1995). Radical constructivism. London: The Falmer Press.
von Neumann, J. (1946). The principles of large-scale computing machines. Reprinted in Annals of the History of Computing, 3(3), 263-273.
von Neumann, J. (1958). The computer and the brain. New Haven: Yale University Press.
von Neumann, J. (1963). General and logical theory of automata. In A.H. Taub (Ed.), Collected works (Vol. 5, pp. 288-328). Pergamon.
von Neumann, J., & Morgenstern, O. (1980). Theory of games and economic behavior. Princeton Univ. Press.
von Neumann, J., & Burks, A.W. (1966). Theory of self-reproducing automata. Urbana, IL: University of Illinois Press.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
Wald, A. (1950). Statistical decision functions. John Wiley & Sons.
Wang, Y. (2003). Cognitive informatics models of software agents and autonomic computing. Keynote speech at The First International Conference on Agent-Based Technologies and Systems (ATS’03) (pp. 25-26). Canada: University of Calgary Press.
Wang, Y. (2003). Cognitive informatics: A new transdisciplinary research field. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 115-127.
Wang, Y. (2003). Using process algebra to describe human and software system behaviors. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 199-213.
Wang, Y. (2004, August). Autonomic computing and cognitive processes. Keynote speech at the Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI’04) (pp. 3-4). Victoria, Canada: IEEE CS Press.
Wang, Y. (2005, August). On cognitive properties of human factors in engineering. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 174-182). Irvine, CA: IEEE CS Press.
Wang, Y. (2005, August). The cognitive processes of abstraction and formal inferences. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 18-26). Irvine, CA: IEEE CS Press.
Wang, Y. (2005, May). On the mathematical laws of software. Proceedings of the 18th Canadian Conference on Electrical and Computer Engineering (CCECE’05) (pp. 1086-1089). Saskatoon, SA, Canada.
Wang, Y. (2005, August). Mathematical models and properties of games. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 294-300). Irvine, CA: IEEE CS Press.
Wang, Y. (2005, August). A novel decision grid theory for dynamic decision making. Proceedings of the 4th IEEE International Conference on Cognitive Informatics (ICCI’05) (pp. 308-314). Irvine, CA: IEEE CS Press.
Wang, Y. (2006, March). On the informatics laws and deductive semantics of software. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 161-171.
Wang, Y. (2006, July). Cognitive complexity of software and its measurement. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 226-235). Beijing, China: IEEE CS Press.
Wang, Y. (2006, July). Cognitive informatics - Towards the future generation computers that think and feel. Keynote speech at the Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 3-7). Beijing, China: IEEE CS Press.
Wang, Y. (2006, July). Cognitive informatics and contemporary mathematics for knowledge representation and manipulation. Invited plenary talk at the Proceedings of the 1st International Conference on Rough Set and Knowledge Technology (RSKT’06) (pp. 69-78). Lecture Notes in Artificial Intelligence, LNAI 4062. Chongqing, China: Springer.
Wang, Y. (2006, July). On abstract systems and system algebra. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 332-343). Beijing, China: IEEE CS Press.
Wang, Y. (2006, July). On concept algebra and knowledge representation. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 320-331). Beijing, China: IEEE CS Press.
Wang, Y. (2006, July). On the Big-R notation for describing iterative and recursive behaviors. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (pp. 132-140). Beijing, China: IEEE CS Press.
Wang, Y. (2006, May). The OAR model for knowledge representation. Proceedings of the 19th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE’06) (pp. 1696-1699). Ottawa, Canada.
Wang, Y. (2006, May). A unified mathematical model of programs. Proceedings of the 19th Canadian Conference on Electrical and Computer Engineering (CCECE’06) (pp. 2346-2349). Ottawa, ON, Canada.
Wang, Y. (2007). Exploring machine cognition mechanisms for autonomic computing. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(2), i-v.
Wang, Y. (2007). Toward theoretical foundations of autonomic computing. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(3), 1-16. USA: IPI Publishing.
Wang, Y. (2007). Software engineering foundations: A software science perspective. CRC Book Series in Software Engineering, Vol. II. Auerbach Publications, USA.
Wang, Y. (2007, January). The theoretical framework of cognitive informatics. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(1), 1-27. Hershey, PA: IGI Publishing.
Wang, Y. (2007, July). The OAR model of neural informatics for internal knowledge representation in the brain. The International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 1(3), 64-75. Hershey, PA: IGI Publishing.
Wang, Y. (2007, August). Formal description of the cognitive process of memorization. Proceedings of the 6th IEEE International Conference on Cognitive Informatics (ICCI’07). Lake Tahoe: IEEE Computer Society Press, Los Alamitos, CA.
Wang, Y. (ed.) (2007). Special issues on autonomic computing. The International Journal on Cognitive Informatics and Natural Intelligence (IJCINI), 1(3).
Wang, Y., & Liu, D. (2003, August). On information and knowledge representation in the brain. The 2nd IEEE International Conference on Cognitive Informatics (ICCI’03) (pp. 26-31). London, UK: IEEE CS Press.
Wang, Y., & Gafurov, D. (2003, August). The cognitive process of comprehension. Proceedings of the 2nd IEEE International Conference on Cognitive Informatics (ICCI’03) (pp. 93-97). London, UK: IEEE CS Press.
Wang, Y., & Kinsner, W. (2006, March). Recent advances in cognitive informatics. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 121-123.
Wang, Y., & Wang, Y. (2002, August). Cognitive models of the brain. Proceedings of the First IEEE International Conference on Cognitive Informatics (ICCI’02) (pp. 259-269). Calgary, AB, Canada: IEEE CS Press.
Wang, Y., & Wang, Y. (2006, March). On cognitive informatics models of the brain. IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 203-207.
Wang, Y., Liu, D., & Wang, Y. (2003). Discovering the capacity of human memory. Brain and Mind: A Transdisciplinary Journal of Neuroscience and Neurophilosophy, 4(2), 189-198.
Wang, Y., Dong, L., & Ruhe, G. (2004, July). Formal description of the cognitive process of decision making. Proceedings of the 3rd IEEE International Conference on Cognitive Informatics (ICCI’04) (pp. 124-130). Victoria, Canada: IEEE CS Press.
Wang, Y., Wang, Y., Patel, S., & Patel, D. (2006, March). A layered reference model of the brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics (Part C), 36(2), 124-133.
Wang, Y., Johnston, R., & Smith, M. (Eds.) (2002, August). Cognitive informatics: Proceedings of the 1st IEEE International Conference on Cognitive Informatics (ICCI’02). Calgary, AB, Canada: IEEE CS Press.
Wang, Y., Zhang, D., Kinsner, W., & Latombe, J.-C. (Eds.) (2008, August). Proceedings of the 7th IEEE International Conference on Cognitive Informatics (ICCI'08). Stanford University: IEEE Computer Society Press, Los Alamitos, CA.
Warwick, K., & Gasson, M. (2005). Human-machine symbiosis overview. In Proceedings of the HCI International 2005/Augmented Cognition Conference, Las Vegas, NV.
Wason, P. (1966). Reasoning. In B. M. Foss (Ed.), New horizons in psychology (pp. 135-151). London: Penguin.
Weber, G., & Brusilovsky, P. (2001). ELM-ART: An adaptive versatile system for Web-based instruction. International Journal of AI in Education, 12(4), 351-384.
Wells, L.K., & Travis, J. (1996). LabVIEW for everyone: Graphical programming made even easier. NJ: Prentice Hall.
Wertheimer, M. (1958). Principles of perceptual organization. In Readings in perception. New York: Van Nostrand.
Westen, D. (1999). Psychology: Mind, brain, and culture (2nd ed.). New York: John Wiley & Sons, Inc.
Widrow, B., & Lehr, M.A. (1990, September). 30 years of adaptive neural networks: Perceptron, Madaline, and backpropagation. Proceedings of the IEEE, 78(9), 1415-1442.
Wiener, N. (1948). Cybernetics or control and communication in the animal and the machine. Cambridge, MA: MIT Press.
Wiggins, J.A., Wiggins, B.B., & Zanden, J.V. (1994). Social psychology (5th ed.). New York: McGraw-Hill, Inc.
Wilkins, D. J. (2002, November). The bathtub curve and product failure behavior. Reliability HotWire, 21. Retrieved August 6, 2006, from http://www.weibull.com/hotwire/issue21/hottopics21.htm
Williams, G.P. (1997). Chaos theory tamed (p. 499). Washington, DC: Joseph Henry Press.
Williams, L., Kessler, R., Cuningham, W., & Jeffries, R. (2000). Strengthening the case for pair-programming. IEEE Software, 17(4), 19-25.
Willmore, B., & Tolhurst, D.J. (2001). Characterizing the sparseness of neural codes. Network, 12(3), 255-270.
Wilson, R.A., & Keil, F.C. (2001). The MIT encyclopedia of the cognitive sciences. MIT Press.
Wittig, A.F. (2001). Schaum’s outlines of theory and problems of introduction to psychology (2nd ed.). New York: McGraw-Hill.
Woolf, H. B. (1980). Webster’s new collegiate dictionary. Springfield, Massachusetts, USA: G. & C. Merriam Company.
Wornell, G.W. (1996). Signal processing with fractals: A wavelet-based approach (p. 177). Upper Saddle River, NJ: Prentice-Hall.
Xu, S., & Rajlich, V. (2004). Cognitive process during program debugging. Paper presented at the Third IEEE International Conference on Cognitive Informatics, Victoria, BC.
Xu, S., & Rajlich, V. (2005). Dialog-based protocol: An empirical research method for cognitive activity in software engineering. Paper presented at the Fourth ACM/IEEE International Symposium on Empirical Software Engineering, Noosa Heads, Queensland.
Yao, Y.Y., Shi, Z., Wang, Y., & Kinsner, W. (Eds.) (2006, July). Cognitive informatics. Proceedings of the 5th IEEE International Conference on Cognitive Informatics (ICCI’06) (1,018 pp.). Beijing, China: IEEE Computer Society Press, Los Alamitos, CA.
Yao, Y.Y. (2006). Granular computing for data mining. Proceedings of the SPIE Conference on Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security, paper 624105.
Yao, Y.Y., & Yao, J.T. (2002). Induction of classification rules by granular computing. Proceedings of the 3rd International Conference on Rough Sets and Current Trends in Computing (pp. 331-338).
Yao, Y.Y., & Zhong, N. (1999). Potential applications of granular computing in knowledge discovery and data mining. Proceedings of the World Multiconference on Systemics, Cybernetics, and Informatics, 5, Computer Science and Engineering (pp. 573-580).
Yao, Y.Y., Zhao, Y., & Maguire, R.B. (2003). Explanation-oriented association mining using rough set theory. Proceedings of Rough Sets, Fuzzy Sets and Granular Computing (pp. 165-172).
Yao, Y.Y., Zhao, Y., & Yao, J.T. (2004). Level construction of decision trees in a partition-based framework for classification. Proceedings of SEKE’04 (pp. 199-205).
Yao, Y.Y., Zhong, N., & Zhao, Y. (2004). A three-layered conceptual framework of data mining. Proceedings of the ICDM Workshop on the Foundation of Data Mining (pp. 215-221).
Ye, Y. (2006). Supporting software development as knowledge-intensive and collaborative activity. Paper presented at the 2006 International Workshop on Interdisciplinary Software Engineering Research, Shanghai, China.
Yi, W. (1991). CCS + time = an interleaving model for real time systems. In 18th ICALP, LNCS 510 (pp. 217-228). Springer.
Zachary, W., Wherry, R., Glenn, F., & Hopson, J. (1982). Decision situations, decision processes, and decision functions: Towards a theory-based framework for decision-aid design. Proceedings of the 1982 Conference on Human Factors in Computing Systems.
Zadeh, L.A. (1997). Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems, 90, 111-127.
Zhang, D., & Luqi (1999). Approximate declarative semantics for rule base anomalies. Knowledge-Based Systems, 12(7), 341-353.
Zhang, D., & Nguyen, D. (1994). PREPARE: A tool for knowledge base verification. IEEE Transactions on Knowledge and Data Engineering, 6(6), 983-989.
Zhang, D. (2005). Fixpoint semantics for rule base anomalies. In Proceedings of the Fourth IEEE International Conference on Cognitive Informatics (pp. 10-17). Irvine, CA.
Zhang, J., & Norman, D. (1994). Representations in distributed cognitive tasks. Cognitive Science, 18(1), 87-122.
Zhao, Y., & Yao, Y.Y. (2005). Interactive user-driven classification using a granule network. Proceedings of ICCI’05 (pp. 250-259).
Zhou, J., & Adewumi, M. A. (1998). Transients in gas-condensate natural gas pipelines. Journal of Energy Resources Technology, 120, 32-40.
Zhu, G., Henson, M.A., & Megan, L. (2001). Dynamic modeling and linear model predictive control of gas pipeline networks. Journal of Process Control, 11, 129-148.
Zsombok, C. E. (1997). Naturalistic decision making: Where are we now? In C. Zsombok & G. Klein (Eds.), Naturalistic decision making. Mahwah, NJ: Erlbaum.
About the Contributors
Yingxu Wang is professor of cognitive informatics and software engineering, director of the International Center for Cognitive Informatics (ICfCI), and director of the Theoretical and Empirical Software Engineering Research Center (TESERC) at the University of Calgary. He received a PhD in software engineering from Nottingham Trent University, UK, in 1997, and a BSc in electrical engineering from Shanghai Tiedao University in 1983. He was a visiting professor in the Computing Laboratory at Oxford University during 1995 and a visiting professor in the Dept. of Computer Science at Stanford University during 2008, and has been a full professor since 1994. Wang is a Fellow of WIF, a P.Eng of Canada, a Senior Member of IEEE, and a member of ACM, ISO/IEC JTC1, the Canadian advisory committee (CAC) for ISO, the advisory committee of IEEE Canadian Conferences on Electrical and Computer Engineering (CCECE), and the National Committee of Canadian Conferences on Computer and Software Engineering Education (C3SEE). He is the founder and steering committee chair of the annual IEEE International Conference on Cognitive Informatics (ICCI). He is the founding editor-in-chief of the International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), founding editor-in-chief of the International Journal of Software Science and Computational Intelligence (IJSSCI), associate editor of IEEE TSMC-A, and editor-in-chief of the CRC book series in Software Engineering. He has accomplished a number of European Union, Canadian, and industry-funded research projects as principal investigator and/or coordinator, and has published over 300 journal and conference papers and 12 books in software engineering and cognitive informatics. He has served on numerous editorial boards and program committees, and as guest editor for a number of academic journals. He has won dozens of research achievement, best paper, and teaching awards in the last 36 years, particularly the 1994 National Zhan Tianyou Young Scientist Prize, China, and is the author of the groundbreaking book Software Engineering Foundations: A Software Science Perspective. *** Patricia Boechler received her PhD in psychology from the University of Alberta in 2002, and is currently a member of that university’s educational psychology department. Her general research interests include cognition, memory, learning and developmental psychology. For the last few years her research has centered on the study of cognition and learning in educational hypermedia. She has been investigating the effects of different types of interfaces on learning, taking into account individual differences in spatial and literacy skills. Boechler is also interested in developing novel research and statistical methods for uncovering regularities in user navigation behaviours. She is applying her background in the study of spatial cognition using neural networks to the use of neural networks for understanding students’ path patterns in educational hypermedia. Christine W. Chan is professor of engineering at the University of Regina, Regina, Saskatchewan, Canada. She is an adjunct scientist of the Telecommunications Research Laboratory, adjunct professor of the electrical and computer engineering department of the University of Calgary, and associate member of the Laboratory for Logic and Experimental Philosophy at Simon Fraser University. She obtained her PhD degree in applied sciences from Simon Fraser University in 1992.
She was an assistant professor of computer science at the University of Regina from 1993, and a professor of computer science from 2000 to 2003. Chan founded the Energy Informatics Laboratory at the University of Regina in 1995 and has since served as its principal investigator. Chan has been involved in research on applications of artificial intelligence and knowledge-based technologies to energy and the environment, industrial applications of artificial intelligence, ontological engineering, knowledge and software engineering, intelligent data analysis using artificial intelligence techniques, object-oriented analysis and design, socio-economic impacts of information technology, and the development and impacts of educational instructional software. She has published or presented over 190 technical papers, of which over 50 are international journal articles and over 90 are refereed conference papers. She presently serves as editor of Engineering Applications of Artificial Intelligence and associate editor of the Journal of Environmental Informatics. In 2003, she was co-guest editor of a special issue of Engineering Applications of Artificial Intelligence. In 2004-2006, she was awarded the President’s Scholar award at the University of Regina. Dr. Chan is a member of the Institute of Electrical and Electronics Engineers (IEEE) Computer Society and the American Association of Artificial Intelligence.

Tiansi Dong is the key software developer at Cognitive Ergonomic Systems, Germany, and a professional member of ACM and IEEE. He received a BS from the Department of Computer Science and Technology, Nanjing University, China, in 1997, an ME from the Department of Computer Science and Engineering, Shanghai Jiaotong University, China, in 2000, and a Dr. rer. nat. from the Department of Mathematics and Informatics, University of Bremen, Germany, in 2005, for the grounding of the theory of the cognitive prism.

Lee Flax is a senior lecturer in the computing department at Macquarie University, Sydney, Australia. He started work in 1970 for the computer company ICL. Over the years he has been a programmer, systems analyst, project leader, and office manager, all in the information systems area. In the early 1980s he became an academic. His current research interests lie in logic and artificial intelligence; some specific areas are cognitive modelling using symbolic methods, computable agent reasoning, algebraic belief revision, and non-monotonic entailment.

Frank L. Greitzer, PhD, is a chief scientist at the Pacific Northwest National Laboratory (PNNL), where he conducts R&D in human-information interaction for diverse problem domains. He holds a PhD degree in mathematical psychology with specialization in memory and cognition, and a BS degree in mathematics. Dr. Greitzer leads an R&D focus area of cognitive informatics that addresses human factors and social/behavioral science challenges through modeling and advanced engineering/computing approaches. His research interests include human-information interaction, human behavior modeling to support intelligence analysis, and evaluation methods and metrics for assessing the effectiveness of decision and information analysis tools. In the area of cyber security, Greitzer serves as predictive defense focus area lead for the PNNL Information and Infrastructure Integrity Initiative. Greitzer also conducts research to improve training effectiveness by applying cognitive principles in innovative, interactive, scenario-based training and serious gaming approaches. Representative project descriptions and publications may be found at the cognitive informatics Web site, http://www.pnl.gov/cogInformatics.
In addition to his work at PNNL, Greitzer serves as an adjunct faculty member at Washington State University, Tri-Cities campus, where he teaches courses for the computer science department (interaction design) and for the psychology department (cognition, human factors). Greitzer also serves on the editorial board of the Journal of Cognitive Informatics & Natural Intelligence.

Douglas Griffith is an applied cognitive psychologist in the Cognitive Solutions Laboratory of General Dynamics Advanced Information Systems. He holds a PhD from the University of Utah and has 32 years of applied experience in government and industry. A former president of Division 21 (applied experimental and engineering psychology) of the American Psychological Association, he is particularly interested in systems that produce a synergism between the human and the machine. One project was the Computer Aids for Vision and Employment (CAVE) program, whose goal was to design better computer systems and training packages for the visually impaired. He also managed a subcontract on a project to study cognitive aids for intelligence analysts to counter denial and deception. The work consisted of a review of human information processing shortcomings, with an emphasis on those that make analysts vulnerable to denial and deception techniques. Remedies, the cognitive aids, were then identified to compensate for these shortcomings and to increase analysts’ awareness of the likelihood of denial and deception activities. In addition to neo-symbiotic systems, he is currently working on collaborative technologies, metrics for collaboration, and the analysis of nonconventional imagery.
Zeng-Guang Hou received the BE and ME degrees in electrical engineering from Yanshan University, Qinhuangdao, China, in 1991 and 1993, respectively, and the PhD degree in electrical engineering from the Beijing Institute of Technology, Beijing, China, in 1997. From May 1997 to June 1999, he was a postdoctoral research fellow at the Laboratory of Systems and Control, Institute of Systems Science, Chinese Academy of Sciences, Beijing. He was a research assistant at the Hong Kong Polytechnic University, Hong Kong, China, from May 2000 to January 2001. From July 1999 to May 2004, he was an associate professor at the Institute of Automation, Chinese Academy of Sciences, and he has been a full professor there since June 2004. From September 2003 to October 2004, he was a visiting professor at the Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, SK, Canada. His current research interests include neural networks, optimization algorithms, robotics, and intelligent control systems.

Ray Jennings is professor of philosophy and director of the Laboratory for Logic and Experimental Philosophy at Simon Fraser University, where he supervises research in logic and the biology of language. He was co-founder (with P. K. Schotch) of the preservationist approach to paraconsistency. He has published in, among others, the Journal of Philosophical Logic, Notre Dame Journal of Formal Logic, Logique et Analyse, Journal of the IGPL, Studia Logica, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, Analysis, Fundamenta Informaticae, and Synthese. His papers on language appear in numerous journals and collections, including, most recently, Mistakes of Reason (UTP) and A Semantics Reader (OUP). He is the author of The Genealogy of Disjunction (OUP) and of the Stanford Encyclopedia of Philosophy article on disjunction, and co-author (with N. A. Friedrich) of Proof and Consequence (Broadview). He gave a set of lectures on the biology of language (Logicalization) at NASSLLI’02, Stanford University.

Witold Kinsner is professor and associate head of the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Canada. He is also affiliate professor at the Institute of Industrial Mathematical Sciences, and adjunct scientist at the Telecommunications Research Laboratories, Winnipeg. He obtained a PhD in electrical engineering from McMaster University in 1974. He has authored and co-authored over 500 publications in his research areas. Dr. Kinsner is a senior member of the Institute of Electrical & Electronics Engineers (IEEE), a member of the Association for Computing Machinery (ACM), a member of the Mathematical and Computer Modelling Society, a member of Sigma Xi, and a life member of the Radio Amateurs of Canada.

Qingyong Li is a lecturer in the School of Computer and Information Technology at Beijing Jiaotong University. He holds a PhD from the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include cognitive informatics, machine learning, and image processing.

Natalia López was born in Alcalá de Henares, Spain. She obtained her MS in mathematics in 1997 and her PhD in computer science in 2003 from the Universidad Complutense de Madrid. Since 1998, she has been with the Computer Systems and Computation Department, Universidad Complutense de Madrid (Spain), where she is an associate professor. Her topics of interest include process algebra, stochastic temporal systems, and formal testing methodologies.
André Mayers is a professor of computer science at the University of Sherbrooke and a founder of a research group on intelligent tutoring systems, mainly focused on knowledge representation structures that simultaneously facilitate the acquisition of knowledge by students, the identification of their plans during problem-solving activities, and the diagnosis of knowledge acquisition.

Mehdi Najjar is currently a postdoctoral researcher in cognitive and computational modelling. He received his PhD in artificial intelligence from the University of Sherbrooke (Canada). He is also interested in knowledge representation, management, and engineering within virtual learning environments, and he collaborates with other researchers on the refinement of knowledge representation structures within intelligent systems.
Manuel Núñez is an associate professor in the computer systems and computation department, Universidad Complutense de Madrid (Spain). He obtained his MS degree in mathematics in 1992 and his PhD in computer science in 1996. Afterwards, he also studied economics, obtaining his MS in economics in 2002. Dr. Núñez has published more than 70 papers in international refereed conferences and journals. In recent years, he has been co-chair of the FORTE 2004 conference and of FATES/RV 2006. His research interests cover both theoretical and applied issues, including testing techniques, formal methods, e-learning environments, and e-commerce.

Fernando Lopez Pelayo obtained an MSc in mathematics from the Complutense University of Madrid (UCM) and a European PhD in computer science from the University of Castilla–La Mancha (UCLM), Spain. He currently teaches at UCLM and at the Spanish distance-learning university, UNED. His main research interests are focused on formal aspects of concurrency and performance, cognitive informatics, grid computing, and symbolic computation. He has published about fifty scientific papers, a third of them in international journals and the rest in refereed international workshops/conferences. He is a member of the scientific committees of a couple of journals and five workshops/conferences.

Vaclav Rajlich received the PhD degree in mathematics from Case Western Reserve University. He is a professor and former chair of the computer science department of Wayne State University. His research interests are software evolution and program comprehension. He has published approximately 70 peer-reviewed articles and one book. He is a member of the ACM and the IEEE Computer Society.

Amar Ramdane-Cherif received his PhD from Pierre and Marie Curie University, Paris, in 1998, in neural networks and AI optimization for robotic applications. Since 2000, he has been an associate professor in the PRISM laboratory, University of Versailles, Saint-Quentin en Yvelines, France. His main current research interests include software architecture and formal specification, dynamic architecture, architectural quality attributes, architectural styles, neural networks, and agent paradigms.

Ismael Rodríguez is an associate professor in the computer systems and computation department, Universidad Complutense de Madrid (Spain). He obtained his MS degree in computer science in 2001 and his PhD in the same subject in 2004. Dr. Rodríguez received the Best Thesis Award of his faculty in 2004. He also received the Best Paper Award at the IFIP WG 6.1 FORTE 2001 conference. Rodríguez has published more than 40 papers in international refereed conferences and journals. His research interests cover formal methods, testing techniques, e-learning environments, and e-commerce.

Fernando Rubio is an associate professor in the computer systems and computation department, Universidad Complutense de Madrid (Spain). He obtained his MS degree in computer science in 1997 and his PhD in the same subject in 2001. Dr. Rubio received the National Degree Award in computer science from the Spanish Ministry of Education in 1997, as well as the Best Thesis Award of his faculty in 2001. Dr. Rubio has published more than 40 papers in international refereed conferences and journals. His research interests cover functional programming, testing techniques, e-learning environments, and e-commerce.
Guenther Ruhe received a doctorate in mathematics with emphasis on operations research from Freiberg University, Germany, and a doctorate degree from both the Technical University of Leipzig and the University of Kaiserslautern, Germany. From 1996 until 2001, he was deputy director of the Fraunhofer Institute for Experimental Software Engineering (Fh IESE). Ruhe holds an Industrial Research Chair in Software Engineering at the University of Calgary, a joint position between the Department of Computer Science and the Department of Electrical and Computer Engineering. His Laboratory for Software Engineering Decision Support (see www.seng-decisionsupport.ucalgary.ca) focuses on research in the areas of intelligent support for the early phases of software system development, analysis of software requirements, empirical evaluation of software technologies, and selection of commercial off-the-shelf (COTS) software products. He is the main inventor of ReleasePlanner® (www.releaseplanner.com), a new generation of intelligent decision support tools for software release planning and prioritization. Ruhe has published more than 155 reviewed research papers in journals and at workshops and conferences. He is a member of the ACM, the IEEE Computer Society, and the German Computer Society (GI).
Phillip C-Y. Sheu is currently a professor of computer engineering, information and computer science, and biomedical engineering at the University of California, Irvine. He received his PhD and MS from the University of California at Berkeley in electrical engineering and computer science in 1986 and 1982, respectively, and his BS from National Taiwan University in electrical engineering in 1978. He has published two books, Intelligent Robotic Planning Systems and Software Engineering and Environment: An Object-Oriented Perspective, and more than 100 papers in object-relational data and knowledge engineering and their applications, and in biomedical computations. He is currently active in research related to complex biological systems, knowledge-based medicine, semantic software engineering, proactive web technologies, and large real-time knowledge systems for defense and homeland security. Dr. Sheu is a Fellow of IEEE.

Zhiping Shi is an assistant professor at the Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences. He received his PhD in computer software and theory from the Institute of Computing Technology, Chinese Academy of Sciences, in 2005. His research interests include content-based visual information retrieval, image understanding, machine learning, and cognitive informatics.

Zhongzhi Shi is a professor at the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China. His research interests include intelligence science, multi-agent systems, and the Semantic Web. He has published 10 books, edited 11 books, and has written more than 300 technical papers. His most recent books are Intelligent Agent and Applications and Knowledge Discovery (in Chinese). Shi is a member of the AAAI. He is the chair of WG 12.3 of IFIP, and also serves as vice president of the Chinese Association for Artificial Intelligence. He received the 2nd Grade National Award of Science and Technology Progress in 2002, and in 1998 and 2001 he received the 2nd Grade Award of Science and Technology Progress from the Chinese Academy of Sciences.

Jeffrey J. P. Tsai received his PhD in computer science from Northwestern University, Evanston, Illinois. He is a professor in the Department of Computer Science at the University of Illinois at Chicago, where he is also the director of the Distributed Real-Time Intelligent Systems Laboratory. He co-authored Knowledge-Based Software Development for Real-Time Distributed Systems (World Scientific, 1993), Distributed Real-Time Systems: Monitoring, Visualization, Debugging, and Analysis (John Wiley and Sons, Inc., 1996), Compositional Verification of Concurrent and Real-Time Systems (Kluwer, 2002), and Security Modeling and Analysis of Mobile Agent Systems (Imperial College Press, 2006), and co-edited Monitoring and Debugging Distributed Real-Time Systems (IEEE/CS Press, 1995) and Machine Learning Applications in Software Engineering (World Scientific, 2005).

Taehyung Wang is an assistant professor in the Department of Computer Science at California State University, Northridge (CSUN). His research interests include cognitive informatics, biomedical information systems, software engineering, data mining, data warehousing, object-oriented design and analysis methodology, location-based services, data visualization, and Web technologies.
Before joining CSUN, he worked as a researcher for the Visual Interactive Data Engineering Lab and the Center of Bioengineering at the University of California, Irvine. Dr. Wang received a PhD from the University of California at Irvine in 1998, an MS in computer science from Western Illinois University, and a BS in control and instrumentation from Seoul National University in 1985.

Shaochun Xu received the PhD degree in computer science from Wayne State University, Detroit, USA, a PhD in geology from the University of Liege, Liege, Belgium, and the MSc degree in computer science from the University of Windsor, Windsor, Canada. From 1997 to 1999, he was a post-doctoral fellow in the Department of Geological Sciences at the University of Manitoba, Winnipeg, Canada. He is currently an assistant professor in the computer science department at Algoma University College, Laurentian University, Sault Ste. Marie, Canada. His research focuses on cognitive aspects of software engineering.

Yiyu Yao received his BEng from Xi’an Jiaotong University, People’s Republic of China, in 1983, and his MSc and PhD from the University of Regina, Canada (1988 and 1991, respectively). He was an assistant professor
and an associate professor at Lakehead University, Canada (1992-1998). He joined the Department of Computer Science at the University of Regina in 1998, where he is currently a professor of computer science. His research interests include data mining, rough sets, Web intelligence, granular computing, machine learning, and information retrieval.

Du Zhang received a PhD in computer science from the University of Illinois. He is a professor and chair of the Department of Computer Science, California State University, Sacramento. He has authored or co-authored over 100 publications in journals, conference proceedings, and book chapters, and has edited or co-edited two books, five special issues for five journals, and five IEEE conference proceedings. Du Zhang is a senior member of IEEE and a member of ACM. He is an associate editor of the International Journal on Artificial Intelligence Tools and a member of the editorial board of the International Journal of Cognitive Informatics and Natural Intelligence.

Yan Zhao received her BEng from the Shanghai University of Engineering and Science, People’s Republic of China, in 1994, and her MSc and PhD from the University of Regina, Canada, in 2003 and 2007, respectively. Her research interests include data mining and machine learning.
Index

A
abstraction 92, 93
activity-centered design 108
algebra, concept (CA) 1, 2, 10, 11, 12, 16, 18, 20, 24, 50, 51, 104, 115, 155, 185, 195, 197, 218, 219, 244, 274, 275, 276, 289, 325, 329, 333, 336, 337, 339, 340, 342, 343, 346, 347, 348, 350, 355, 356, 358, 362
algebra, real-time process (RTPA) 1, 2, 10, 12, 13, 14, 16, 18, 20, 21, 22, 23, 24, 25, 65, 66, 71, 76, 77, 92, 94, 97, 98, 99, 101, 102, 103, 104, 130, 131, 137, 139, 140, 157, 158, 159, 162, 170, 175, 182, 187, 332, 333, 358
algebra, system (SA) 1, 2, 10, 14, 15, 16, 18, 25, 26, 358
algebra PSEN 228
attention-guided sparse coding (AGSC) model 81, 82, 85, 86, 87, 88, 89
attitude 8, 21, 65, 66, 69, 70, 71, 73, 74, 75, 76, 77, 341
AURELLIO 248, 253, 257, 260, 261
autonomic computing (AC) 1, 2, 24, 32, 172, 173, 174, 175, 177, 178, 179, 180, 181, 182, 183, 184, 185, 192, 193, 194, 200
axiom of choice 131
B
Bayesian method 130
behaviors, cognitive 175
behaviors, instructive 175
behaviors, perceptive 175
behaviors, reflective 175
Boole, George 118
C
Carnot, Sadi 45, 46
cognition 7, 24, 28, 33, 38, 47, 51, 108, 109, 112, 113, 115, 116, 156, 182, 186, 188, 189, 191, 193, 194, 196, 198, 244, 248, 259, 260, 265, 278, 293, 299, 304, 305, 306, 318, 328, 332, 343, 350, 355, 357, 359
cognitive informatics (CI) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16, 17, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 33, 34, 35, 45, 46, 48, 49, 50, 51, 53, 54, 55, 63, 64, 77, 78, 90, 104, 105, 114, 115, 117, 129, 141, 143, 155, 170, 173, 179, 185, 186, 187, 193, 194, 195, 197, 198, 199, 200, 218, 219, 234, 245, 261, 263, 264, 275, 276, 289, 290, 293, 297, 302, 303, 305, 318, 324, 325, 327, 328, 329, 330, 331, 332, 333, 334, 335, 338, 340, 342, 347, 348, 349, 350, 351, 353, 358, 359, 360, 361, 362
cognitive machine model, Haikonen 190
cognitive machines 185, 188, 189, 190, 192, 193, 194, 197, 332, 333, 347, 348
cognitive modelling 220–234
cognitive scientist 118
constructivist learning, during software development 292–303
control theory 299
D
deduction 95, 97
digital sentience 189
discrepancy distance 81, 83, 86
E
emotion 8, 20, 66, 67, 68, 69, 70, 71, 76, 108, 174, 179
emotion, strength of 67, 68, 71
emotional system, human 67, 76
entropy, Boltzmann 46
entropy, Boltzmann-Gibbs 46
entropy, Clausius 45
entropy, conditional 40, 41
entropy, higher-order message 40
entropy, joint 40, 41
entropy, Kolmogorov 44
entropy, Kolmogorov-Sinai (KS) 44, 45, 305
entropy, mutual 40, 42
entropy, negative (negentropy) 45, 46
entropy, Prigogine 45
entropy, relative 42
entropy, Stonier 47
entropy spectrum, Rényi 42, 43, 48, 194
eXtreme Programming process 294
F
fiat projection 142, 145, 153
finite state machines (FSM) 52, 53, 54, 55, 56, 57, 58, 61
first-order logic 226
folding machine 57, 58, 59, 60, 61
formal logical inferences 92
G
gas pipeline operation advisor (GPOA) 278, 281, 287, 288
granular computing (GrC) 238
H
Hartley, Ralph 36, 37, 49, 344
human/machine symbiosis 109
I
imperative computing (IC) 173, 174, 175, 176, 177, 178, 182
inferences 92, 93, 94, 95, 97, 103, 104, 358
inferential modelling technique (IMT) 278, 279, 280, 282, 283, 284, 285, 286, 287, 288
information-matter-energy (IME) model 2, 3, 4, 24, 33
intelligence, artificial (AI) 1, 6, 24, 132, 172, 174, 184, 248, 262, 264, 269, 276, 279, 289, 290, 327, 330, 332, 337, 351, 354, 356, 357, 360
intelligence, natural (NI) 1, 2, 5, 6, 7, 24, 174, 180, 181, 182, 184, 332
interactive classification system (ICS) 235, 241, 242, 243
interactive motivation-attitude theory 65, 76
K
knowledge acquisition (KA) 278, 282
knowledge base (KB) 266, 267, 268, 269, 270, 271, 272, 273, 274
Kolmogorov, Andrei N. 28, 44, 48, 50, 305, 347
L
language, and semantics 119
layered reference model of the brain (LRMB) 1, 2, 5, 8, 18, 20, 21, 24, 27, 65, 66, 68, 76, 77, 93, 97, 103, 105, 130, 136, 137, 141, 156, 174, 179, 180, 181, 182, 183, 187, 245, 290, 334, 360
learning, brain-based 299
learning, observational 299
learning process 294
Licklider, J.C.R. 106
M
man/machine symbiosis 109
memory, long term (LTM) 6, 7, 136, 137, 167, 168, 181
memory, short term (STM) 6, 167, 181
memory neural network (MNN) 200, 202, 204, 205, 210, 217
modified memory neural network (MMNN) 201, 204, 217
motivation 8, 20, 65, 66, 67, 68, 69, 70, 71, 74, 75, 76, 174, 179, 188, 190, 236, 237, 332
motivation/attitude-driven behavioral (MADB) model 8, 69, 71
N
natural language 120
neo-symbiosis 106–117
neo-symbiosis, implementing of 110
neural network (NN) paradigm 200, 201, 205, 210, 213, 215
neural systems 78
neurons 6, 18, 78, 79, 80, 82, 85, 136, 201, 204, 205, 210, 220, 221, 222, 233, 236, 328
O
object-attribute-relation (OAR) model 1, 2, 5, 6, 7, 10, 18, 24, 26, 65, 71, 76, 77, 97, 99, 100, 103, 105, 136, 137, 167, 168, 245, 331, 334, 359
Occam’s razor principle 52, 53, 54, 55
Occam, William of 52, 53, 54, 55, 63, 337, 340, 351
P
pathology, in cognitive function 223–234
perception 1, 2, 5, 8, 17, 18, 20, 24, 28, 33, 38, 47, 49, 65, 66, 68, 79, 81, 82, 86, 89, 90, 103, 107, 108, 109, 117, 137, 143, 167, 173, 174, 179, 188, 190, 191, 194, 196, 198, 223, 224, 226, 230, 249, 259, 299, 304, 305, 306, 318, 328, 332, 343, 345, 354
preliminary knowledge, and learning 294
Prigogine, Ilya 28, 33, 34, 44, 45, 51, 353
programmer learning 292
psychology, engineering of 108

R
region connection calculus (RCC-8) theory 143, 146, 147, 151, 152, 153
response saliency 82, 86
rule base (RB) 266, 267, 268, 270, 271, 272, 273, 274

S
schizophrenia 220–234
Shannon’s code entropy 39
Shannon’s self-information 37, 38, 39, 41, 42, 44, 47, 315
Shannon’s source entropy and redundancy 39
Shannon, Claude 4, 25, 28, 29, 33, 36, 37, 38, 39, 42, 43, 44, 45, 47, 48, 50, 51, 172, 185, 186, 315, 316, 317, 346, 348, 355
sparse coding theory 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 90, 345
spatial environments 142, 143, 153, 155, 340
stochastic process algebra (STOPA) 157, 158, 168, 169

T
time delay neural network (TDNN) 204

V
virtual learning environments (VLE) 247, 248, 253, 254, 262

W
working memory (WM) 266, 267, 268, 270, 276