Advances in Human Factors/Ergonomics, 20B
Symbiosis of Human and Artifact: Human and Social Aspects of Human-Computer Interaction
Advances in Human Factors/Ergonomics
Series Editor: Gavriel Salvendy, Purdue University, West Lafayette, IN 47907, U.S.A.
Vol. 4 Engineering Physiology: Physiologic Bases of Human Factors/Ergonomics (K.H.E. Kroemer, H.J. Kroemer and K.E. Kroemer-Elbert)
Vol. 5 Human Factors Testing and Evaluation (D. Meister)
Vol. 6 Applications of Fuzzy Set Theory in Human Factors (W. Karwowski and A. Mital, Editors)
Vol. 7 Human Reliability: Analysis, Prediction, and Prevention of Human Errors (K.S. Park)
Vol. 8 Human Aspects of Occupational Vibration (D.E. Wasserman)
Vol. 9 Human Factors Research: Methods and Applications for Architects and Interior Designers (J.E. Harrigan)
Vol. 10A Social, Ergonomic and Stress Aspects of Work with Computers (G. Salvendy, S.L. Sauter and J.J. Hurrell, Jr., Editors)
Vol. 10B Cognitive Engineering in the Design of Human-Computer Interaction and Expert Systems (G. Salvendy, Editor)
Vol. 11 Occupational Safety and Accident Prevention: Behavioral Strategies and Methods (C.G. Hoyos and B. Zimolong)
Vol. 12A Work with Computers: Organizational, Management, Stress and Health Aspects (M.J. Smith and G. Salvendy, Editors)
Vol. 12B Designing and Using Human-Computer Interface and Knowledge Based Systems (G. Salvendy and M.J. Smith, Editors)
Vol. 13 Designing User Interfaces for International Use (J. Nielsen, Editor)
Vol. 14 Human Factors in Product Design (W.H. Cushman and D.J. Rosenberg)
Vol. 15 Workspace, Equipment and Tool Design (A. Mital and W. Karwowski, Editors)
Vol. 16 Connective Networks in Ergonomics: General Methodological Considerations (E.A. Franus)
Vol. 17 Psychology of Systems Design (D. Meister)
Vol. 18A Human Aspects in Computing: Design and Use of Interactive Systems and Work with Terminals (H.-J. Bullinger, Editor)
Vol. 18B Human Aspects in Computing: Design and Use of Interactive Systems and Information Management (H.-J. Bullinger, Editor)
Vol. 19A Human-Computer Interaction: Applications and Case Studies (M.J. Smith and G. Salvendy, Editors)
Vol. 19B Human-Computer Interaction: Software and Hardware Interfaces (G. Salvendy and M.J. Smith, Editors)
Vol. 20A Symbiosis of Human and Artifact: Future Computing and Design of Human-Computer Interaction (Y. Anzai, K. Ogawa and H. Mori, Editors)
Vol. 20B Symbiosis of Human and Artifact: Human and Social Aspects of Human-Computer Interaction (Y. Anzai, K. Ogawa and H. Mori, Editors)
Advances in Human Factors/Ergonomics, 20B
Symbiosis of Human and Artifact: Human and Social Aspects of Human-Computer Interaction
Proceedings of the Sixth International Conference on Human-Computer Interaction (HCI International '95), Tokyo, Japan, 9-14 July 1995, Volume 2
Edited by
Yuichiro Anzai, Department of Computer Science, Keio University, Yokohama, Japan
Katsuhiko Ogawa, Nippon Telegraph and Telephone, Tokyo, Japan
Hirohiko Mori, Musashi Institute of Technology, Tokyo, Japan
ELSEVIER
Amsterdam - Lausanne - New York - Oxford - Shannon - Tokyo
1995
ELSEVIER SCIENCE B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISSN 0921-2647
ISBN 0-444-81795-6
© 1995 Elsevier Science B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.
Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the copyright owner, Elsevier Science B.V., unless otherwise specified.
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
This book is printed on acid-free paper.
Printed in The Netherlands
PREFACE
This book presents the latest advances in research on the ergonomics, human factors and social aspects of work with computer systems. The 171 papers presented in this volume were selected from those submitted to the Sixth International Conference on Human-Computer Interaction (HCI International '95), held in Tokyo, 9-14 July 1995, with the support of a grant from the Commemorative Association for the Japan World Exposition (1970). A total of 1,298 individuals from 36 countries submitted their work for presentation at this first major international meeting on human-computer interaction held in Asia. Among the submissions, only those judged to be of high quality were accepted for presentation. The papers accepted for oral presentation, totaling 354, present recent advances in human interaction with computers and related machines in a variety of environments. The selected papers in the areas of future computing and system design in human-computer interaction are included in the accompanying Volume 1, entitled Symbiosis of Human and Artifact: Future Computing and Design for Human-Computer Interaction.
We are grateful for the help of the many organizations which made the congress successful, and would like to thank the following sponsors of the conference:
Information Processing Society of Japan
Institute for Electronics, Information and Communication Engineers
Japan Ergonomics Research Society
Public Health Research Center
The Society for Instrument and Control Engineers
and the following cooperating organizations:
Architectural Institute of Japan
Atomic Energy Society of Japan
Chinese Academy of Sciences
Chinese Ergonomics Society
EEC - European Strategic Programme for Research and Development in Information Technology - ESPRIT
Ergonomics Society of Taiwan
Finnish Institute of Occupational Health
IEEE Systems, Man & Cybernetics Society
IEEE Tokyo Section
Indian Society of Ergonomics
Institute of Management Services (UK)
International Ergonomics Association
Japan Association of Industrial Health
Japan Industrial Management Association
Japan Institute of Office Automation
Japan Management Association
Japan Society for Software Science and Technology
Japan Society of Health Science
Japanese Cognitive Science Society
Japanese Society for Artificial Intelligence
Japanese Society for Science of Design
Korea Research Institute of Standards and Science
National Institute for Occupational Safety & Health (USA)
National Institute for the Improvement of Working Conditions and Environment (Thailand)
National Institute of Industrial Health (Japan)
Society of Biomechanisms (Japan)
Software Psychology Society
The Ergonomics Society of Korea
The Illuminating Engineering Institute of Japan
The Institute of Electrical Engineers of Japan
The Japan Society of Mechanical Engineers
The Japanese Association of Rehabilitation Medicine
The Society of Heating, Air Conditioning and Sanitary Engineers of Japan.
We are most grateful to the following Board members for their fine contributions to the organization of the conference: General Chair Yoshio Hayashi, Japan Vice Chair Hiroshi Tamura, Japan
Advisory Committee Chair Kageyu Noro, Japan Organizing Committee Chair Takao Ohkubo, Japan
Advisory Board Hideo Aiso, Japan Shun'ichi Amari, Japan Takaya Endo, Japan Hal Hendrick, U.S.A. Atsunobu Ichikawa, Japan Kazumoto Iinuma, Japan Hiroshi Kashiwagi, Japan Akinobu Kasami, Japan Kakutaro Kitashiro, Japan
Kazutaka Kogi, Japan Takao Shirasuna, Japan Sadao Sugiyama, Japan Yotaro Suzuki, Japan Kei Takeuchi, Japan Thomas J. Triggs, Australia Keiichi Tsukada, Japan Masao Ueda, Japan Jürgen E. Ziegler, Germany.
We thank, in particular, the Program Committee members who made their best contributions to organizing the program: Arne Aarås, Norway Munehira Akita, Japan Yuichiro Anzai, Japan (Chair) Kazuo Aoki, Japan Albert G. Arnold, The Netherlands Eiichi Bamba, Japan Nigel Bevan, U.K. John M. Carroll, U.S.A. Yam San Chee, Singapore Marvin J. Dainoff, U.S.A. Miwako Doi, Japan Wolfgang Dzida, Germany Ray Eberts, U.S.A. Klaus-Peter Fähnrich, Germany Emiliano A. Francisco, The Philippines
Hiroshi Hamada, Japan Hiroshi Harashima, Japan Susan Harker, U.K. Martin Helander, Sweden Herbert Heuer, Germany Michitaka Hirose, Japan Erik Hollnagel, U.K. Ken Horii, Japan Tohru Ifukube, Japan Koichi Inoue, Japan Kitti Intaranont, Thailand Hiroo Iwata, Japan Hiroyasu Kakuda, Japan Katsuari Kamei, Japan John Karat, U.S.A.
Osamu Katai, Japan Takashi Kato, Japan Yosuke Kinoe, Japan Bengt Knave, Sweden Richard J. Koubek, U.S.A. Masaharu Kumashiro, Japan Masaaki Kurosu, Japan Nahm Sik Lee, Korea Soon Yo Lee, Korea Xu Liancang, China Holger Luczak, Germany Thomas Läubli, Switzerland Marilyn Mantei, Canada Marvin Minsky, U.S.A. Naomi Miyake, Japan Hirohiko Mori, Japan Masaki Nakagawa, Japan Jakob Nielsen, U.S.A. Kazuhisa Niki, Japan Shogo Nishida, Japan Takeshi Nishimura, Japan Donald Norman, U.S.A. Katsuhiko Ogawa, Japan Takao Okubo, Japan Choon-Nam Ong, Singapore
Olov Ostberg, Sweden Peter G. Polson, U.S.A. Jens Rasmussen, Denmark Kazuo Saito, Japan Susumu Saito, Japan Steven L. Sauter, U.S.A. Dominique L. Scapin, France Pentti Seppälä, Finland Thomas B. Sheridan, U.S.A. Ben Shneiderman, U.S.A. Michael J. Smith, U.S.A. T.F.M. Stewart, U.K. Yasuo Sudoh, Japan Yuzuru Tanaka, Japan Yoh'ichi Tohkura, Japan Kim J. Vicente, Canada Tomio Watanabe, Japan Runbai Wei, China Sakae Yamamoto, Japan Eiichiro Yamamoto, Japan Michiaki Yasumura, Japan Atsuya Yoshida, Japan Hidekazu Yoshikawa, Japan Richard Young, U.K.
This book, as well as the conference program, could not have been completed without the outstanding effort of Ms. Yoko Osaku, the secretariat for HCI International '95, and Mr. Akira Takeuchi of the Musashi Institute of Technology. Yuichiro Anzai, Keio University Miwako Doi, Toshiba Corporation Hiroshi Hamada, NTT Hirohiko Mori, Musashi Institute of Technology Katsuhiko Ogawa, NTT Susumu Saito, National Institute of Industrial Health
CONTENTS
Analysis, Design and Evaluation in Human-Computer Interaction
III.1 Interaction in Context - Context in Interaction
Interaction in Context - Context in Interaction
William Edmondson, Jim Alty, Patrick Brezillon, Erik Hollnagel, John Meech, Katsuhiko Ogawa, Dan Suthers
III.2 Winning the Market of HMS
Winning the Market of Human-Machine Systems (HMS)
Elena A. Averbukh
21
Different Approaches in Visual Interactive Software Construction Olivier Esteban, St6phane Chatty, Philippe Palanque
27
Toward a Comprehensive Manipulation Solution on 3D Workspace Nobuo Asahi, Kazuhisa Okada, Akira Maenaka, Eun-Seok Lee
33
Tailoring Non-Visual Interaction in a Graphical Environment C. Stephanidis, R. Gogoulou
39
Command-Line Prediction and Error Correction Using Generalized Commandline
Yasuhiro Ueda, Toshikazu Nishimura, Michihiko Minoh, Katsuo Ikeda
45
III.3 Interaction Design 1
III.4 Interaction Design 2
FRADS: A System for Facilitating Rapid Prototyping by End Users
Irvin R. Katz
53
User Interface Development Environment for End Users: CUIDE
Yoshiichi Tokuda, Eun-Seok Lee, Norio Shiratori
59
Supporting Computer Users through Dialogue Akira Ito, Tadahiko Kumamoto, Tsuyoshi Ebina
65
A Denotational Approach for Formal Specification of Human-Computer Dialogue Kohji Matsubayashi, Yoshio Tsujino, Nobuki Tokura
71
A Case-Based Method to Support Creative Design Incorporating Intention Recognition
Takayuki Yamaoka, Shogo Nishida
77
Designing Interfaces for Computer-based Assessments Randy M. Kaplan, Irvin R. Katz
83
III.5 Interaction Design 3
WMH Methodology for HCI Design
Christian Coëffé
91
Guided Understanding for Problem Solving Process Using the Refining Self Explanation
Kazuhide Kanenishi, Yoneo Yano
97
A Strategy and Technology for Front End System Development Linda Candy, Ernest Edmonds, Susan Heggie, Bryan Murray, Nick Rousseau
103
A Methodology for Developing New Interaction Techniques Deborah Hix, James N. Templeman, Ankush Gosain, Kapil Danderkar
109
Basic Models for User Interface Design: Tasks, Users, Data, and Interaction Devices Chris Stary
115
III.6 Screen Design 1
The Effects of Realistic Versus Unrealistic Desktop Interface Designs on Novice and Expert Users
Laura A. Miller, Kay M. Stanney
123
Rendering Calligraphy Words with 'Kasure' Variations
Qinglian Guo
129
Dynamic Font: Its Concept and Generation Method
Kuniharu Takayama, Hiroyuki Kano, Yoshiharu Maeda, Kazuo Misue, Shinya Hosogi, Kozo Sugiyama
135
A Spatial Data Structure for the 3D Graphical Facility Management System
Takashi Tamada, Teruhiko Teraoka, Minoru Maruyama, Shogo Nishida
141
Cryptographic Human Identification
Tsutomu Matsumoto
147
xi
III.7 Screen Design 2
Adjustment Mechanism for a Drawing System with Stationery Metaphors
Naoki Kato, Natsuko Fukuda, Masaki Nakagawa
155
Analysis Tool for Skill Acquisition with Graphical User Interfaces Based on Operation Logging Nobuko Kishi
161
The Role of Screen Parameters in Visual Communication Masaaki Kurosu, Hitoshi Yamadera, Itaru Mimura
167
Re-sampling of 3-D Object Range Data by Cube-Based Segmentation Sheng Jin Wang, Yi Cai, Makoto Sato, H. Kawarada
173
Harmonic Curve Design
John R. Rankin
179
GUIs and SUIs: More of the Same or Something Different? Alison Black, Jacob Buur
187
III.8 Screen Design 3
Vision-Based Human Interface System with World-Fixed and Human-Centered Frames
Kang-Hyun Jo, Yoshinori Kuno, Yoshiaki Shirai
193
Fuzzy Reasoning Approach to Iconic Interface Design
Rungtai Lin
199
Inferring Graphical Constraints from Users' Modification
Takeharu Tanimura, Tsukasa Noma, Naoyuki Okada
205
Enhancing Fold Manipulation Techniques
Ying K. Leung, Richard J. King
211
Providing Diagram User Interfaces for Interactive Theorem Proving
Jun Han, Tao Lin
217
III.9 Active Interface
Active Interfaces for Useful Software Tools
Amedeo Cesta, Daniela D'Aloisi, Vittorio Giannini
225
Interacting with Real Objects: Real Object Interface and Transferred Object Interface
Soichiro Iga, Michiaki Yasumura
231
User Identification in Human Robot Interaction Using Identification Pendant Kaoru Hiramatsu, Yuichiro Anzai
237
xii
Applying Personal Robots and Active Interface to Video Conference Systems Nobuyuki Yamasaki, Yuichiro Anzai
243
An Object-Oriented GUI for the Specification of Robotic Systems
Raymond K. Wong
249
Augmented Interaction: Interacting with the Real World through a Computer
Jun Rekimoto
255
InfoBinder: A Pointing Device for a Virtual Desktop System
Itiro Siio
261
An Object Oriented Methodology for Man-Machine Systems Analysis and Design
A. Mahfoudhi, M. Abed, J-C. Angué
267
An Analysis of Relationship between Human and Information System by Quantification Theory III
Tsuneki Mukahi, Ken Murasugi, Tetsuo Ui
273
III.10 Evaluation and Analysis 1
Towards an Effective Subjective Measurement Method Based on Fuzzy Set Theory
Hiromi Terashita, Mieko Ohsuga, Futomi Shimono, Mamiko Toda
279
The Design and Experiment of an Evaluation Function for User Interaction Cost in the Interactive Semantic Disambiguation Masaya Yamaguchi, Nobuo Inui, Yoshiyuki Kotani, Hirohiko Nisimura
285
An Analysis of the Human-Computer Interfaces to High-Energy Physics Control Systems at CERN J.F. Meech, P. Huuskonen, E. Wagner, M. Meri, J.M. Le Goff
291
III.11 Evaluation and Analysis 2
PDS Analysis for Evaluating Procedural Usability on Conversational Systems
Akinori Komatsubara, Masayuki Kobayashi
299
Quantitative Evaluation of Media Quality by Method of Competitive Priority Hiroshi Tamura, Jun Wu
305
Evaluation of Control Strategies in a Complex Space-Vehicle Control Task: Effects of Training Type
Ravindra S. Goonetilleke, Colin G. Drury, Joseph Sharit
311
Development of the Analysis Support System for Incidents and Troubles; "ASSIST"
Yuriko Yoshizawa, Keiko Mutoh
317
xiii
Discount Video Analysis for Usability Engineering Mark H. Chignell, Tetsuro Motoyama, Venicio Melo
323
User Interface Evaluation: Is It Ever Usable?
Christelle Farenc, Philippe Palanque, Jean Vanderdonckt
329
III.12 HCI Evaluation Methodologies
Software Tools for Evaluating the Usability of User Interfaces
Sandrine Balbo
337
How Usable are Usability Principles, Criteria and Standards? J.M.C. Bastien, D.L. Scapin
343
Usability is Quality of Use
Nigel Bevan
349
Usability Evaluation: How Does It Relate to Software Engineering?
Deborah Hix
355
Standards and Software-Ergonomics Evaluation
Harald Reiterer, Reinhard Oppermann
361
Using Ergonomic Rules for Evaluation by Linguistic Ergonomic Criteria
François Bodart, Jean Vanderdonckt
367
III.13 Usability Engineering
A Teaching Method as an Alternative to the Concurrent Think-Aloud Method for Usability Testing
Pawan R. Vora, Martin G. Helander
375
Tools for Iterative User Interface Design: UI-Tester and OST
Toshiyuki Asahi, Hidehiko Okada, Osamu Iseki, Ryoichi Matsuda
381
A Composite Measure of Usability for Human-Computer Interface Designs Kay Stanney, Mansooreh Mollaghasemi
387
Why Choose? A Process Approach to Usability Testing Troy Kelley, Laurel Allender
393
Usability and Quality Control of Human-Machine Interaction Elena A. Averbukh
399
Color Coordinate Supporting System with Navigating State of User's Mind Yasushi Yagi, Tomohiko Yagyu, Yoshihiko Hisamori, Masahiko Yachida
405
III.14 Cognitive Engineering
Comparison between Three Human-Interfaces in Hospital Information System Kotaro Minato, Akira Endoh
411
Explaining Plant Design Knowledge through Means-End Modelling
Pertti Huuskonen, Kari Kaarela
417
Method of Ecological Interface Design Applied to Interactive Diagnosis Support System
Yoko Asano, Shun-ichi Yonemura, Hiroshi Hamada, Katsuhiko Ogawa
423
III.15 Computer Modeling of Mental Processes
Computer Analysis of Characteristics of Creative Thinking and Self-esteem Level
A.E. Kiv, V.A. Molyako, Stephen T. McHale, V.G. Orishchenko, L.A. Polozovskaya
431
Computer-based Testing of Reflective Thinking: Executive Control of Erroneous Performance in 9 to 12 Year Old Children Uri Shafrir
437
The Creative Thinking Testing by Using of Testing Problems Based on Different Logical Schemes
A.E. Kiv, V.A. Molyako, V.L. Maloryan, I.A. Polozovskaya, Zelina I. Iskanderova
443
From Novice to Expert Decision Behaviour: A Qualitative Modelling Approach with Petri Nets
Matthias Rauterberg
449
Modeling and Simulation of Human Operator in Mental Task Handling Qualities Celestine A. Ntuen
455
The Interface Improvement for the Creative Thinking Computer Testing
V.V. Chislov, V.L. Maloyran, I.A. Polozovskaya, G.V. Shtakser, A.I. Uyemov, I.G. Zakharchenko, Maria Athoussaki
459
Evaluating Human Operator Models in Tool-based User Interface Design
Maria Athousaki
463
III.16 Modeling 1
Associative User Modeling: A Neural Network Approach
Qiyang Chen, A.F. Norcio
471
Personality Engineering: Applying Human Personality Theory to the Design of Artificial Personalities
Linda S. Endres
477
Using the Template Model to Analyse Interface Specifications
Christopher R. Roast, J.I. Siddiqi
483
Task Model - System Model: Towards an Unifying Formalism
Philippe A. Palanque, Rémi Bastide, Valérie Senges
489
III.17 Modeling 2
Scenario Based Specification of Interaction Metaphors
C. Stephanidis, C. Karagiannidis, A. Koumpis
497
Cocktail-Party Effect with Computational Auditory Scene Analysis - Preliminary Report -
Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata
503
The Effects of Rehearsal on Visual Memory
Mamoru Umemura, Hiroshi Ichikawa, Kenichi Teguchi
509
Mechanisms of Slips in Display-Based Human-Computer Interaction: A Model-Based Analysis
Muneo Kitajima, Peter G. Polson
515
Computation Model for Human Communication Masahiro Hiji, Hiroshi Nunokawa, Masatoshi Miyazaki
521
III.18 Voices and Faces
Delivering the Promise of Speech Interfaces
Charanjit K. Sidhu, Gerry Coyle
529
VOICEDIC: A Practical Application of Speech Recognition Technology Kenji Kita, Kazuhiko Ashibe, Yoneo Yano, Hiroaki Ogata
535
An Operation Analysis of an Address Input System with Speech Recognition
Kazuhiro Arai, Osamu Yoshioka, Shigeki Sagayama, Noboru Sugamura
541
A Menu-Guided Spoken Dialog System and Its Evaluation Mikio Yamamoto, Takashi Koike, Seiichi Nakagawa
547
Face Observation Using an Active Camera
Qian Chen, Takeshi Fukumoto, Haiyuan Wu, Masahiko Yachida
553
Facial Features and Configurations Affecting Impressions of Faces
Takashi Kato, Masaomi Oda, Masami K. Yamaguchi, Shigeru Akamatsu
559
Anthropomorphic Media Approach to Human-Computer Interactive Communication Using Face Robot
Hiroshi Kobayashi, Fumio Hara
565
Ergonomics and Health Aspects of Work with Computers
IV.1 Health Aspects
Symptom Clusters among VDU Workers
Knut Inge Fostervold, Ivar Lie, Stig Larsen, Gunnar Horgen, Arne Aarås, Arild Vågland
575
Construct Validity of Computer Anxiety as Measured by the Computer Attitudes Scale
Deane, F.P., Henderson, R.D., Barrelle, K., Saliba, A., Mahar, D.
581
Sick Building Syndrome: Are UK Libraries Affected? Anne Morris, Peter Dennison
587
Head-Coupled Display System - Research Issues on Health Aspects
Wolfgang Felger
593
Establishment of an Expert System for Visual Display Terminals (VDT) Workers' Periodic Eye Checkups Hitoshi Nakaishi, Masaru Miyao
599
IV.2 Workstation and Work Environments
Ocular Motility of 72,000 VDU Operators
Bruno Bagolini, Fernando Molle, Marco Turbati, Domenico Lepore, Luigi Scullica
607
The Vertical Horopter and Viewing Distance at Computer Workstations
Dennis R. Ankrum, Earl E. Hansen, Kristie J. Nemeth
611
Recommendation for VDT Workstation Design Based on Analysis of Ocular Surface Area
Midori Sotoyama, Shin Saito, Sasitorn Taptagaporn, Susumu Saito
617
Lighting and Visual Ergonomics for the Display Screen Environment M.J. Perry, P.J. Littlefair
623
Computerised Analysis of Prolonged Seated Posture
Berman Kayis, Khoi Hoang
629
Indoor Air Quality Evaluation by Continuous Measurement in Computerized Offices
Akiyoshi Ito, Makoto Takahashi, Kazuhiro Sakai, Kazutaka Kogi
635
IV.3 Human Factors in Display Technology
Effects of Ambient Lighting Conditions on Luminance Contrast and Color Gamut of Displays with Different Technologies
Satoru Kubota
643
Display User Response by Task Lighting/Office Configuration: Implications for Flat Panel Display Users G. Sweitzer
649
Computer Workstations and Ergonomic Standards: Issues in Science and Engineering
R.E. Granda, J. Greeson Jr.
655
Measurement of TFT/LCD Flicker for ISO Compliance Ryohji Yoshitake, Rieko Kataoka
661
A Psychometric Scale of TFT/LCDs with a Few Defecting Sub-Pixels Tohru Tamura, Yuhji Gohda
667
IV.4 Psychosocial Stress among VDU Workers
Research Frameworks of Stress among VDU Workers - Impacts of Computerization and Task Characteristics of Computer Workers
Yuko Fujigaki
675
The Impact of Computerization on Job Content and Stress: A Seven Year Follow-Up in the Insurance Sector
Tuula Leino, Kirsi Anola, Pekka Huuhtanen, Irja Kandolin
681
The Impact of Office Computerization on Job Characteristics, Physical and Mental Health of Japanese Office Workers: Gender Difference Takashi Asakura
687
Effect of Computer System Performance and Other Work Stressors on Strain of Office Workers Pascale Carayon
693
Job Stressors and Depressive Symptoms in Japanese Computer Software Engineers and Managers Takashi Haratani, Yuko Fujigaki, Takashi Asakura
699
Job Stress Characteristics of Computer Work in Japan Norito Kawakami, C.R. Roberts, T. Haratani
705
IV.5 Input Devices
An Integrated Haptographical User Interface Using a Force-Feedback Mouse
Allan J. Kelley, T. Higuchi, S.E. Salcudean
713
Discussion on Method for Predicting Targets in Pointing by Mouse Atsuo Murata
719
The Difference of Information Input Method on Psycho-physiological Reaction of VDT Work
Takao Ohkubo, Michiyoshi Aoki, Mitsugu Sawa, Moritoshi Ikeda, Keun Sang Park
725
Rotating Objects Using Dials Devices
Atsumi Imamiya, Tadaaki Sakamoto
731
A New Integrated System to Assess the Amount of Information of Pointing Devices for Motor-Disabled Person Toshiyasu Yamamoto, Tetsuya Yamashina, Jyunichi Ohshima, Masafumi Ide
737
IV.6 Musculoskeletal, Postural, Visual, and Psychosocial Outcomes Resulting from Ergonomics and Optometrical Intervention
Musculoskeletal, Postural, Visual, and Psychosocial Outcomes Resulting from Ergonomic and Optometric Intervention
A. Aarås, G. Horgen, M. Thoresen, A. Bugajska, A. Wolska, R. Danuta, M. Widerszal-Bazyl, M. Konarska, M.J. Dainoff, B.G.F. Cohen, M.H. Dainoff
745
A Method to Consider Ergonomic Conditions at VDT Workplaces Annika Johansson, Houshang Shahnavaz
749
IV.7 Physiological Measurements 1
Task-Related Musculoskeletal Disorders in Computerized Office Work
Pentti Seppälä
759
Analysis of Mental Workload during the Work with Computer Using R-R Intervals Time Series Kiyoko Yokoyama, Masanori Moyoshi, Yosaku Watanabe, Takayoshi Yoshioka, Isao Tawamura, Kazuyuki Takata
765
Assessment of Mental Workload Based on a Model of Autonomic Regulations on the Cardiovascular System Mieko Ohsuga, Hiromi Terashita, Futomi Shimono, Mamiko Toda
771
Experimental Study on R-R Intervals of Heart Rate by Wavelet Analysis Satoshi Kishino, Mitsuru Katoh, Yoshio Hayashi
777
IV.8 Physiological Measurements 2
CFF Values for Stress Caused by VDT Work and Relationship among Analysis of Uric Properties
Masaharu Takeda, Yoshio Hayashi, Kaoru Suzuki
785
Development of a New Hand-Grasp Measurement System Yoshikazu Seki, Sigeru Sato, Makoto Shimojo, Akihiko Takahashi
791
On a Simple Method to Measure the Intensity of Keystrokes
Kaoru Suzuki
797
A Support System for Handwriting for the Blind Using a Virtual Auditory Screen Kazunori Itoh, Yoshihiro Inagaki, Yoshimichi Yonezawa, Masami Hashimoto
803
A System for 3D Motion and Position Estimation of Hand from Monocular Image Sequence
Yoshio Iwai, Yasushi Yagi, Masahiko Yachida
809
IV.9 Physiological Measurements 3
A Case Study on Evaluation Method for VDT Workload Using Face Skin Temperatures
Yoshinori Horie
817
Measurement of Work Load Using Brain Potentials During VDT Tasks Akihiro Yagi, Mika Ogata
823
The Relationship between Human Mental Variation and Its Application to Communication Aids
Sakae Yamamoto, Shigeaki Matsuoka, Sumio Yano
827
64-Channel EEG Measurement System - Applying to Stress Measurement -
Shin'ichi Fukuzumi
833
Analysis of Brain Activity for HCI
Mariko Fujikake Funada, Satoki P. Ninomija
839
Detection of the Event Related Brain Potential and Its Application to Communication Aids Takashi Kawakami, Michio Inoue, Yasuhiro Kobayashi, Kenji Nakashima
845
IV.10 Organizational and Psychological Aspects
A Basic Experimental Study on Mental Workload for Human Cognitive Work at Man-Machine Interface
Hidekazu Yoshikawa, H. Shimoda, Osamu Wakamori, Yoshinori Nagai
853
Workflow Technology Based Project Management Carlos K.H. Leung, Heloisa Martins Shih, Mitchell M. Tseng
859
Involving Workers in the Transformation of Work Organizations: Problems and Tools Irene Odgaard
865
Emotional Workload: Its Operationalization, Measurement, and Consideration in the Design of Human-Computer Interfaces Irwin Matin
871
The Psychological Impact of Computerised Production Feedback Systems: A Comparative Study of the U.K. Subsidiaries of U.S. and Japanese Multinational Companies
Cliff Oswick, David Grant
877
IV.11 HCI Standard
Human-Computer Interaction Standards
Nigel Bevan
885
The Applicability of the ISO User Interface Standards
Frederik Dehlholm
891
Application of Ergonomic Standards to the EC Directive on Requirements for Display Screen Equipment
Henrik Hopff
895
Structured Human Interface Validation Technique - SHIVA
Jürgen Ziegler, Michael Burmester
899
Interface for Physically Challenged
V.1 Interface for Physically Challenged
Composition of Messages on Winking by ALS Patients
Naoyuki Kanou, Michio Inoue, Yasuhiro Kobayashi
911
Development of Language Training System for Developmentally Handicapped Children
Kumiko Itoh, Kyoko Iitaka
917
INTERACT: An Interface Builder Facilitating Access to Users with Disabilities
C. Stephanidis, Y. Mitsopoulos
923
Supporting Blind and Sighted User Collaboration through Dual User Interfaces Using the HOMER System Anthony Savidis, Constantine Stephanidis
929
Development of Human-oriented Information Systems - Learning with Mentally Handicapped People -
Yasuko Kaminuma
935
Social Aspects, Management and Work
VI.1 Information Technology
Personal Information Appliances
Peter J. Thomas, John F. Meech, Robert D. Macredie
945
Efficient Development of Organisations and Information Technology - A Design Approach
Jan Gulliksen, Mats Lind, Magnus Lif, Bengt Sandblad
951
Integration of People, Technology and Organization: The European Approach
Christina Kirsch, Peter Troxler, Eberhard Ulich
957
Dynamic Changes of Human Systems under a Simple Task of HCI Mariko Fujikake Funada, Satoshi Suzuki, Takao Tanaka, Yusuke Yazu, Kyoko Idogawa, Chieko Hukuda, Satoki P. Ninomija
963
Temporal Organisation of Human Centred Systems
V.A. Chernomorets, S.V. Kirpich
969
VI.2 Job Design
Job Satisfaction in the Computer-Assisted Work Environment
Andrew A. Mogaji
975
A Study on Shifting Time to Low Awakening Conditions on Monotonous VDT Works
Chieko Fukuda, Satoshi Suzuki, Takao Tanaka, Keiko Kasamatsu, Yusuke Yazu, Mariko Fujikake Funada, Kyoko Idogawa, Satoki P. Ninomija
983
Complementary Allocation of Functions in Automated Work Systems
Gudela Grote, S. Weik, T. Wäfler, M. Zölch
989
From Taylorism to Tailorability: Supporting Organizations with Tailorable Software and Object Orientation Helge Kahler
995
VI.3 The Esprit Project 8162 QUALIT, Quality Assessment of Living with Information Technology
Human Oriented Management of Change. A Conceptual Model
Federico Butera
1003
The Quality of Working Life Concept
S. Downing, G. Ryan, A. McNeive, M. Mariani, O. Parlangeli
1011
User Requirements for Tools to Support Human Oriented Management of Change
Irene Odgaard
1017
New Forms of Empowerment Using Simulation Games and Learning from Cases
K. Mertins, B. Schallock, P. Heisig
1021
VI.6 The I CHING and Modern Science
The I Ching Onto-/Axio-Genesis and the Analytic Hierarchy Process: Decisions, Negotiations and Conflict Resolutions
Chung-ying Cheng
1029
Philosophy of Unity in Diversity - The Dance of Quantum and the I-Ching's Symbol -
Thomas In-sing Leung
1033
The I Ching and Non-Linear Mapping: A Meta-Binary Approach to Reflective Choice, Decision-Making, and Hierarchical Information Systems
M. Secter
1037
Exploring Self-Developing Models in Computerized, Interactive Learning Environments D.A. Smith
1041
Business Rules, Revolutionary Discourse, and Multilogical Information Systems
G. Tropea
1043
The I Ching as a Paradigm for Understanding Corresponding States in Fundamentally Different Systems
J.W. Walls
1047
Nonlinear Computation in the I Ching
K. Walter
1053
Biomathematics Derived from the I Ching
J.F. Yan
1059
Author Index
1061
Keyword Index
1064
III. Analysis, Design, and Evaluation in Human-Computer Interaction
III.1
Interaction in Context" Context in Interaction
INTERACTION IN CONTEXT - CONTEXT IN INTERACTION

Participants: Jim Alty, Patrick Brezillon, William Edmondson, Erik Hollnagel, John Meech, Katsuhiko Ogawa, Dan Suthers
Panel chair: William Edmondson

EXTENDED ABSTRACTS
CONTEXT IN MULTIMEDIA INTERACTIONS
Prof. Jim Alty, Head of Department
LUTCHI Research Centre, Dept. Computer Studies, Loughborough University of Technology, Leics, LE11 3TU, UK. Tel: +44-1509-222648 Fax: +44-1509-211586
[email protected]

Over the past five years we have been investigating the relevance of Multimedia interfaces in the process control area (Alty 1991, Alty & McCartney 1992, Alty & Bergan 1995). At the outset we assumed that the need was obvious. We saw the combination of knowledge-based support, coupled with a high bandwidth on the interface, as providing two powerful solutions to operator problems. However, the process operators were not so sure. Process operators are, all the time, balancing the volume of information available against its usefulness in a particular situation - in fact, the problem of context. As process operators see it, the only possible value in providing information in alternative media comes from the ability to:
• do the task faster
• do the task with fewer errors
• make it easier (and more pleasant) to do the task
• make learning easier
and they had a suspicion that Multimedia interfaces would simply increase the amount of information being presented.
Once we started to examine the literature it became clear that real examples of benefits (as opposed to features) of Multimedia interfaces were thin on the ground. The literature is full of wild claims and lacks real evaluations. We therefore decided to carry out a set of experiments, both in laboratory conditions and in real plants, to try to determine whether Multimedia interfaces really did provide the benefits claimed by largely unsubstantiated reports (Alty, Bergan, Craufurd & Dolphin 1994). We used a variant of Crossman's waterbath experiment (Crossman & Cooke 1975), used heavily in process control, and developed a set of different media interfaces to the experiment combining sound, speech, graphics, animation, colour, text and still video. Subjects had to carry out a large number of trials to achieve steady state conditions in the bath. In some conditions warnings were given (either audio or visually), in others no warnings were given. This abstract does not allow a full discussion of the results, but here are some highlights:
• We found that the nature of the task and the subjects' knowledge were crucial in determining whether different media were important or not. For example, in a situation where the task is obvious, the choice of medium has little effect. In cases where the task is difficult and the subjects have little understanding or knowledge of the problem, the choice of medium also has little effect. However, between these extremes the choice of medium could be critical for success. This area of applicability is an important one since it covers the situation where operators are trying to solve problems at the limits of their competence, i.e. many emergency situations.
• Whilst warnings improved task completion time, they seriously affected subjects' ability to learn about the interface, particularly the more difficult concepts.
• Use of realistic sound was not well received. The important result here (which seems rather obvious in retrospect) was that the medium must be able to convey the information needed to carry out the control action. When the results were analysed according to task difficulty, operator performance in difficult tasks using sound was better than without sound. In the latter situations, much larger differences were occurring, and it may be that these were such as to allow subjects to carry out realistic actions using the sound.
• Subjects performed equally well with spoken and visual warnings, and there was considerable positive feedback about the use of speech. However, there was a bi-modal distribution in the ratings of speech and textual warnings. Those who rated textual warnings important perceived the interface task as difficult, whilst those who found the interface task easy gave the textual warnings a low rating.
• The simple textual interface worked remarkably well for straightforward tasks.
• The graphically oriented interfaces yielded better performance than the textual ones. This was due, in part, to much finer tuning being used in the textual interfaces, presumably because of a lack of overview. The graphical advantage increased with task difficulty.
The experiments in the real plant - a large chemical plant in Holland - were carried out over a three month period (Alty, Bergan & Schepens 1995). In a real plant it is virtually impossible to carry out controlled experiments, so our results are much more speculative. Although the operators did not have to use the interfaces at all, our multimedia interfaces were extensively used. This, in itself, must mean that the interfaces were not only usable, but at least as effective as the traditional interface. Another indirect measure of usability came from a request by the operators for our workstation to be positioned more centrally in the control room. The operator feedback was very positive for most aspects of the interfaces provided, but it was difficult to separate the positive effects resulting from interconnected applications from the positive effects of the media themselves. Overall, we can conclude that our use of Multimedia interfaces, in specific situations, did improve performance, learning, or satisfaction, or sometimes all three. However, the difficulty of the task, the knowledge of the operator, the sophistication of the surrounding software, and whether the experiments are laboratory-based or in a real plant, are all relevant issues and affect the interpretation of the results.
References
Alty, J. L. (1991) Multimedia: What is it and how do we exploit it? Presented at HCI '91. Published in the proceedings, edited by D. Diaper and R. Winder, pp31-41. Cambridge: Cambridge University Press.
Alty, J. L., and McCartney, C. D. (1992) Design of a Multimedia Presentation System for a Process Control Environment. In L. Kjelldahl (ed.), Proc. of the EUROGRAPHICS Workshop on Multimedia, pp293-306, Springer-Verlag.
Alty, J. L., Bergan, M., Craufurd, P., and Dolphin, C. (1994) Computers and Graphics, 17(3):205-218.
Alty, J. L., and Bergan, M. (1995) Multimedia Interfaces for Process Control: Matching Media to Tasks. J. Control Eng. Practice, 3(2):241-248.
Crossman, E. R., and Cooke, F. W. (1975) Manual Control of Slow Response Systems. In The Human Operator in Process Control, Edwards, E., and Lees, F. (Eds.), pp51-64. London: Taylor and Francis.
Alty, J. L., Bergan, J., and Schepens. (1995) The Design of the PROMISE Multimedia System and its use in a Chemical Plant. In Multimedia Applications, Earnshaw, R. (Ed.). London: Academic Press.
TAILORING EXPLANATION TO USER'S NEEDS THROUGH CONTEXTUALIZATION
Patrick Brezillon
LAFORIA, Box 169, University Paris 6, 4 place Jussieu, 75252 Paris Cedex 05, France. Tel: +33-1-44-27-70-08 Fax: +33-1-44-27-70-00
[email protected]

An explanation is a process that allows the user to assimilate a new piece of knowledge. The keystone of assimilation is explaining the plausibility of new knowledge with respect to the knowledge that the user already possesses. This must be a cooperative process where the answer given by the system is progressively adapted by both participants. We focus in this panel on the relationships between explanation and the context of the cooperation. We think that a system must produce context-sensitive explanations to enhance its intervention in its cooperation with a user. Making explicit the context of the cooperation enables one to: tailor explanations to the users' needs; correctly simplify complex information for the user; paraphrase; structure the explanation; manage counter-examples; guide the research and the focus of attention; resolve ambiguities; correct misunderstanding; help learning; fill possible gaps; reduce cognitive load; develop a model of the "user-in-situation"; make sure that the participants' mental models match; and improve the qualitative and quantitative performance of the explanation.
Our purpose will be developed around the following example, where CK stands for chunk of knowledge, SK for the set of CKs that is considered at a given time, and CKo for a particular CK with which the user has a problem. I may say to a person (CKo): "I heard a lion from the window of my office this morning". This utterance has two parts: "I heard a lion roar" and "I was in my office". It is the link between them that may be unknown to the other person (the utterance will be self-explanatory for a person who knows me). If the person is surprised (the person has some trouble with CKo), I must develop my utterance, saying: "My office is in a university near a zoo that I can see from the window of my office. There are lions in the part of the zoo near the university. I often hear lions roar". Here, CKo is introduced after various CKs are first presented (the knowledgeable person already knows these CKs: that I work in a university near a zoo, that I can see the zoo from my office and that there are lions in the zoo). However, I must make clear the links between the two parts of my utterance and share them with the person when explaining my first statement.
Note that CKs are introduced into the SK progressively, beginning with the CK that may be accepted most easily by the other. In the example, I begin with "I work in a university", and then "there is a zoo near my university". Each piece of knowledge which is presented to the other person is explained within the context of the pieces of knowledge already shared and integrated by them, with explicit explanations, i.e., with links to other CKs in the SK. The context is the shared knowledge and the mechanisms used to select relevant knowledge to improve the interaction. It changes dynamically each time a CK is added to or removed from the SK during this evolution. Any participant (i.e., the user or the system) may introduce a CK into the context, first to establish a link with CKo, and second to share that CK with the other participant. The dynamic aspect of context implies that it is not possible to plan the whole explanatory dialogue in advance. The context helps the user and the system to focus on the minimal number of relevant CKs at a time by making explicit the focus of attention of the explanation.
Supplying a context to a system permits the structuring of an important problem in explanation, namely the management of questions and answers. We note that context may enable the system to: support the user in formulating and asking questions; anticipate users' need for information; organize the user's questions, making relationships among questions explicit; determine the right meaning of a question in the current context; and facilitate the production of explanations of different types, at different levels of detail and abstraction. The user may navigate easily in the explanatory hyperspace and thus avoid the "lost-in-space" problem because the context will restrict the search space. Conversely, the explanation process may be a mechanism for managing the context of the interaction. The explanation process aims at the building of the SK in which an equivocal CK, CKo, may be assimilated by the user.
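The progressive construction of SK can be made concrete in a few lines of code. The following Python fragment is only one possible reading of the CK/SK model - the class, method and chunk names are our own illustrative assumptions, not part of the panel text:

# Minimal sketch of the chunk-of-knowledge (CK) context model described
# above. All names here are illustrative, not from the abstract.

class SharedContext:
    """The set SK of CKs currently shared by system and user.
    It changes dynamically as CKs are added or removed."""

    def __init__(self):
        self.chunks = {}  # chunk -> set of chunks it is explicitly linked to

    def add_chunk(self, name, links=()):
        """Introduce a CK, recording its explicit links to CKs already in SK.
        Missing prerequisites are returned so they can be presented first."""
        missing = [link for link in links if link not in self.chunks]
        self.chunks[name] = set(links) - set(missing)
        return missing

    def can_assimilate(self, links):
        """CKo becomes assimilable once every CK it must be linked to
        is already shared."""
        return all(link in self.chunks for link in links)

# Progressive explanation of CKo = "I heard a lion from my office":
sk = SharedContext()
sk.add_chunk("I work in a university")
sk.add_chunk("there is a zoo near the university",
             links=["I work in a university"])
sk.add_chunk("lions live in the part of the zoo near the university",
             links=["there is a zoo near the university"])
assert sk.can_assimilate(
    ["lions live in the part of the zoo near the university"])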
CONTEXT IN TASK ANALYSIS
William Edmondson
Cognitive Science Research Centre, School of Computer Science, The University of Birmingham, Edgbaston, B15 2TT, UK. Tel: +44-121-414-4763 Fax: +44-121-414-4281
[email protected]

Task analysis formalisms for HCI either misrepresent context - it is read off a higher level description (as in HTA diagrams) - or ignore it altogether (e.g. TAG). Recent approaches to menu design (e.g. Lean Cuisine (Apperley & Spence 1989), Decision Track (Edmondson 1990, Edmondson & Billups 1992), SMD (Edmondson & Spence 1992)) implicitly locate an action in the context of possible actions. However, what needs to be addressed in task analysis for HCI is the development of formalisms which explicitly address issues of context (Edmondson & Meech 1994). There are two such issues. The first concerns the modelling of the process of contextualization within the user; the second concerns descriptors for context, so that context can be incorporated in formal systems or notations. These two issues are addressed here.
The process of contextualization is best understood, and thus modelled, in the following terms. Information is the process of contextualizing data. In the general case, and thus in human-computer interaction, the human being is constantly exposed to many data from many sources. These data are not simply received or 'taken in' as information; they are perceived in the context of other data and processes and thereby become informative, and thus used. Some examples illustrate this perspective. An aircraft is fitted with many devices which produce data concerning height, speed, direction, etc. These data can be displayed on instruments, recorded in the flight recorder, and fed as input to the autopilot. The pilot contextualizes these data - there is an information process here. The autopilot also contextualizes the data - there is an information process here too (but not necessarily the same as in the human). The flight recorder simply records the data (which may be subsequently retrieved and contextualized, by human or machine - say in an accident investigation).
This model is valuable in HCI because it forces attention onto the processing. Simply displaying many data - in a cockpit, on a control panel, on a CRT - does not contextualize any of them; mere 'co-presence' is not enough. The user, in a task dependent fashion, will be contextualizing the data - the information process - and may also be producing more data as 'output'. For the user to be supported in their interaction (and thus their task), the system designer needs to understand how to present the data in such a way as to promote, change, or prompt contextualization. This could be as simple as arranging dials in a row, so as to prompt perception of difference (visual 'pop-out', as it is called). On the other hand, contextualization support could be complex and reliant on the successful incorporation into the system of a model of the user's contextualization processes. Where this is correctly done the data are drawn appropriately and coherently to the user's attention because the system 'knows' that they are conjointly required for successful contextualization.
The model of contextualization advocated is insight promoting, but incomplete, which brings us to the second issue raised earlier. Context can be viewed as a property of data (ultimately, objects, events, etc., may 'present' the data) in that they are contextual for some other data. Thus cruising altitude, cabin lighting level, and galley oven temperature may be aircraft data with minuscule mutual relevance (they don't jointly participate in any conceivable contextualization - except this one!). However, other data will be mutually relevant in one (or more) process(es) of contextualization - and somehow this must be formalizable if a set of context descriptors is to be made available in system design; a minimal sketch of this idea is given below.
The challenge appears to be that of taking the general insight concerning contextualization and allying it with notions of task analysis to yield a formal approach which ensures that the system designer is incorporating contextualization support. In plain English this must mean moving away from task analysis and HCI as concerned with the 'what' of activity, toward a notion of the 'how' of it (cf. Edmondson 1993). But even here the challenge is toughened by the ever more urgent need to ensure that HCI is given formal methods for turning perspicacity into code.
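To make the notion of context descriptors slightly more concrete, the following Python fragment tags each datum with the contextualization processes it can participate in; mutual relevance then falls out as overlap of descriptors. All data names and descriptor tags are invented for illustration - the abstract itself proposes no implementation:

# Sketch: context descriptors attached to data, so that mutually relevant
# data can be drawn jointly to the user's attention.

DATA = {
    "altitude":         {"flight_path_control", "terrain_warning"},
    "airspeed":         {"flight_path_control", "stall_warning"},
    "heading":          {"flight_path_control"},
    "cabin_lighting":   {"cabin_comfort"},
    "galley_oven_temp": {"galley_service"},
}

def mutually_relevant(a, b):
    """Two data are mutually relevant if they jointly participate in at
    least one process of contextualization (shared descriptors)."""
    return bool(DATA[a] & DATA[b])

def display_group(process):
    """Data the system should present together to support this
    contextualization process (e.g. dials arranged in a row)."""
    return sorted(d for d, tags in DATA.items() if process in tags)

print(display_group("flight_path_control"))
# -> ['airspeed', 'altitude', 'heading']
print(mutually_relevant("altitude", "galley_oven_temp"))
# -> False (no conceivable joint contextualization)

A real set of context descriptors would obviously need richer structure than flat tags; the point is only that mutual relevance becomes computable once context is attached to data.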
References
Apperley, M. D., and Spence, R. (1989) Lean Cuisine: a low-fat notation for menus. Interacting with Computers, 1(1):45-68.
Edmondson, W. H. (1990) Decision Track: A formalism for menu structure and user's selection behaviour. Presented at Human-Computer Interaction - INTERACT '90, Cambridge, U.K., 27-31 August, 1990. Published in the proceedings, edited by D. Diaper, G. Cockton, D. Gilmore, and B. Shackel, pp441-446. Amsterdam: North-Holland.
Edmondson, W. H. (1993) A Taxonomy for Human Behaviour and Human-Computer Interaction. Presented at HCI International '93 - The Fifth International Conference on Human-Computer Interaction, Orlando, Florida, 8-13 August, 1993. Published in the proceedings: Advances in Human Factors/Ergonomics 19B: Human-Computer Interaction: Software and Hardware Interfaces, eds G. Salvendy & M.J. Smith, pp885-890. Amsterdam: Elsevier.
Edmondson, W. H. and Billups, I. R. (1992) The Decision Track Formalism. Unpublished report CSRP-92-6.
Edmondson, W. H., and Spence, R. (1992) Systematic Menu Design. Presented at HCI '92. Published in the proceedings, edited by A. Monk, D. Diaper, & M.D. Harrison, pp209-226. Cambridge: Cambridge University Press.
Edmondson, W. H., and Meech, J. F. (1994) Putting Task Analysis into Context. SIGCHI Bulletin, 26(4):59 - 63, October 1994.
STRUCTURAL AND FUNCTIONAL APPROACHES TO MODELLING OF INTERACTION IN HCI
Erik Hollnagel
Honorary Professor of Psychology, University of Manchester & Technical Director, Human Reliability Associates
1 School House, Higher Lane, Dalton, Lancs., WN8 7RP, UK. Tel: +44-1257-463-810 Fax: +44-1257-463-121
[email protected]

Structural and functional approaches
The interaction between a user (human) and a machine (computer) is usually described in terms of states and state transitions. This description is very convenient for the application of formal languages and formal methods to support design. From a human perspective it is, however, more appropriate to describe the interaction as a set of events which in prospect are loosely coupled to the user's intentions and in retrospect are strongly coupled to time. The analysis of human-computer interaction can accordingly be seen as the mapping of the set of events onto a classification scheme. A specific objective of this mapping is to determine the principles that govern the sequence of the events - or, in other words, the rules for the state transitions. The classification scheme must necessarily refer to a set of concepts about the user, in particular about the mental or cognitive functions, commonly expressed in terms of a psychological or cognitive model of the user. The necessity arises because the objective of the mapping is to find the principles that explain the temporal organisation of the events.
Psychological theories and models are typically either structural or functional - although mostly the former. A structural approach refers to a set of basic assumptions about how the mind works, in terms of the components or 'mechanisms' and their individual functional characteristics, expressed in accordance with the accepted theories or models of the day. This way of explanation is in good agreement with the classical principle of decomposition, which dominates the Western approach to science. In psychology, and not least in cognitive science, the dominating theory or model is that the mind can be explained as an information processing mechanism. (In the strong version of this view, the mind IS an information processing mechanism or physical symbol system (Newell 1980).) Several variations of the information processing view exist, which put the emphasis on particular types of structures, such as linguistic structures, knowledge structures, etc. The basic principle, however, remains the same, and the analytical principle is a classical decomposition.
In contrast to the structural approach, a functional approach refers to the dominant manifestations that can be provided through systematic observations. The emphasis is thus on WHAT happens rather than on HOW it happens. (I am aware that this somewhat begs the question, since one cannot make a systematic observation without looking for some things rather than others. However, the issue can be resolved in practice by referring to the notion of requisite variety, as it is known from control theory.) The functional approach starts from the phenomena rather than from the explanations, i.e., it ends with a model but does not begin with it (Broadbent 1980). It thus conforms well with the maxims of minimal modelling (Neisser 1976, Hollnagel 1993), which means that it is less dependent on the model. This specifically means that the functional approach is less tempted to try to control the environment according to the presumptions of the underlying model or theory, hence less likely to construct an impoverished environment where only the responses expected by the model are wont to be noticed.
Cognition and context
Structural approaches have always been attractive because they provide a seemingly objective frame of reference which allows us to describe HCI as reciprocal information processing - information processing in the man and information processing in the machine. The disadvantage is that structural theories refer to an information processing mechanism in splendid isolation, i.e., to the hypothetically pure information processes of the brain, which are set in motion by events in the external world. Cognition is therefore seen as a kind of higher level information processing which occurs entirely within the human brain, and every effort is made to unravel the mechanisms of pure cognition. In the technological domain information processing can exist as a pure process, for instance the function of an adder or a pattern matching algorithm. But it does not make sense to speak of basic human information processes in the same manner. The fact that the information processing metaphor is useful for understanding some fundamental features of human thinking does not mean that the mind is an information processor. In the late 1980s many people began to realise that cognition actually occurs in context. This means that functional theories are intrinsically linked to the context, since the manifestations or the regularities by definition only exist in a given context. Functional theories therefore avoid the problems that stem from the notion of pure mental processes, and in particular do not make the mistake of assuming that cognition is an epiphenomenon of information processing. Thus, whereas a structural approach forces us to account for the context separately from the processes of the mind, the functional approach makes this problem disappear. The advantages of that should be obvious.
Implications for HCI
The issue in HCI research is, however, not the modelling of cognition but the design of HCI. A specific HCI design is expected to facilitate the occurrence of a set of events, corresponding to meeting the functional goals of the joint system. Conversely, the design is also expected to hinder the occurrence of other events, which could put the functioning of the system into jeopardy. The value of making a distinction between a structural and a functional approach should therefore show itself in the ways we approach HCI design, specifically in how we describe and analyse events. It is a fundamental assumption for HCI design that the interaction has a purpose or an objective, i.e., that the user is trying to achieve a specific goal and that the interaction with the computers or machines is a necessary part of this. The goal could be writing a document, landing a plane, washing dirty clothes, controlling an industrial process, buying a train ticket from a machine, etc. Given the proliferation of computers into practically every aspect of life, one apparent solution is to constrain the interaction by limiting the number of objects (buttons, menu lines) through which the interaction can take place. This solution may, however, not be practical for non-trivial tasks. Firstly, it requires that a complete analysis of all possible events has been made. Secondly, it only works if there are no significant constraints on the time available to the user. There are, in fact, only very few applications where both of these assumptions are fulfilled, even if they are relaxed somewhat. Undesired and unexpected actions cannot generally be avoided by constraining the interaction. Instead, efforts should be made to ensure that information and controls are unambiguous, both within and between different contexts. Thus, rather than assuming that the user will interpret the information in a specific way, we should consider what the possible range of interpretations is. The same goes for controls; we should carefully consider how the functioning of a control object can be understood. In particular, users may be unaware of certain options, or interpret a function differently from what the designer intended. Within contexts, the experience from a large number of event analyses has shown that it is important to avoid multiple mode indicators or multiple mode operations. Between contexts it is important that an interface attribute or object does not mean A in one context and non-A in another. A simple example is the meaning
12 of colours red and green; a more complex example is the meaning of symbols and icons. The structural approach assumes that users deliberately try to contextualise the information and controls that are part of the interface. This is a consequence of seeing cognition as a separate, internal mental process. According to the functional approach this step is unnecessary; there is no need for contextualisation because there can be no cognition without including the context. There is therefore no "pure" interface to be considered, only the interface as it is perceived. The interface, together with the tasks, the demands, etc., IS the context. The essence of interaction design is therefore to anticipate the possible contexts that can occur, and to remove any ambiguities from the information and controls that the user is exposed to. This can most effectively be achieved by means of a functional approach, because it looks at what actually happens rather than at what should happen. It furthermore has the advantage of avoiding academic discussions about whether one structural model is more correct than another. References Broadbent, D. E. (1980). The minimization of models. In A. J. Chapman & D. M. Jones (Eds.), Models of man. Leicester: The British Psychological Society. Hollnagel, E. (1993). Requirements for dynamic modelling of man-machine interaction.
INTERFACES THAT HELP THE USER CONTEXTUALIZE INFORMATION John F Meech
Centre for Personal Information Management, Faculty of Computer Studies and Mathematics, University of the West of England, Bristol, UK Tel: +44-1179-656261 x3331 Fax: +44-1179-763973
[email protected] In complex real-time systems such as avionics, process control and similar environments, the user or operator of the system is provided with a huge quantity of data. These data relate to current and possibly future system states, and from them the user must decide what control action to take in order to optimise system performance. Because of the complexity of the systems in question, it is generally impossible to display all relevant operating parameters to the user. 'Information automation' is used as a technique by which a subset of the total possible data is presented to the user based on a concept of 'task' (e.g. task-based paging displays in nuclear power plants and 'glass-cockpit' aircraft). In current systems the user is driven by the behaviour of the system, and when things go wrong (and alarms are generated, etc.) the user must be able to correctly identify the current state of the system (i.e. what the problem is) by placing the supplied data into context. The dynamic nature of system behaviour, particularly when alarms occur, imposes varying workloads on the user, who must search for the relevant data that may be spread over several different task-based displays. This sudden transition in user workload (low when monitoring the system, high when intervening to diagnose alarms) is frequently a contributing factor to errors made in diagnosis (Meech 1992a). In order to aid the user in correctly identifying system behaviour and reduce 'operator error', I have argued elsewhere (Meech 1992b) that adaptive or intelligent user interfaces may be used to
constrain the data set supplied to the user and hence reduce the apparent complexity of the system. This goes some way towards constraining the set of all possible user-interface data to a relevant subset. In essence, this process of intelligently restricting the data is one of correctly identifying the context that the system is currently operating within. The important distinction between this technique and task-based interface systems is that it composes data sets dynamically, according to the system state. This reduces the display navigation that the user must perform and may be used to supply context-sensitive user assistance. Further research (Edmondson & Meech 1994) into the nature of context as seen in human-computer interaction has suggested that this use of dynamic context identification mirrors the process undergone in the user: context is not a static thing but a dynamic process of contextualisation. An intelligent/adaptive interface may therefore be viewed as a means of pre-contextualising the data for the user. As part of a preliminary investigation into the effectiveness of intelligent/adaptive user interfaces as contextualisation aids, a series of experiments is currently being conducted on a representative domain. The objectives of these experiments are:
• to implement an intelligent/adaptive interface, based on a dynamic model of user tasks, which can be used to provide user aiding dynamically;
• to evaluate the effectiveness of such an interface in aiding contextualisation when compared with a standard warning/alarm system and with no aiding at all.
The preliminary results of these experiments will be presented at the panel, including the accuracy of the models used by the computer to contextualise. The ability to track context in this manner should enable enhanced user performance in a variety of real-time systems.
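The abstract describes dynamic context identification only in prose; the following minimal sketch (all names hypothetical, not taken from Meech's system) illustrates the general shape of the idea: each candidate context pairs a predicate over the current system state with the data items that are relevant when that predicate holds.

    # Hypothetical sketch of dynamic context identification: each context
    # pairs a rule over the current system state with the data items that
    # are relevant when that rule holds. All names are illustrative.
    CONTEXTS = {
        "engine_alarm":  (lambda s: "engine" in s["alarms"],
                          ["engine_temp", "oil_pressure", "rpm"]),
        "normal_cruise": (lambda s: not s["alarms"],
                          ["altitude", "heading", "fuel"]),
    }

    def contextualise(state):
        """Return the matched context name and the data sub-set to display."""
        for name, (rule, items) in CONTEXTS.items():
            if rule(state):
                return name, {k: state["data"].get(k) for k in items}
        return "unknown", state["data"]        # fall back to showing everything

    state = {"alarms": {"engine"},
             "data": {"engine_temp": 910, "oil_pressure": 2.1,
                      "rpm": 14200, "altitude": 9500}}
    print(contextualise(state))                # -> ('engine_alarm', {...})

The point of the sketch is only that the displayed sub-set is recomputed from the system state rather than bound statically to a task page.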
References
Meech, J. F. (1992a). Addressing Operator Errors in Supervisory Systems. Proceedings of Information-Decision-Action Systems in Complex Organizations (IDASCO 92), IEE Digest 353, April 1992, Oxford, UK.
Meech, J. F. (1992b). Intelligent Aiding in Complex, Supervisory Systems. Paper presented at the 5th IFIP/IFORS/IFAC Conference on Man-Machine Systems, The Hague, The Netherlands, June 1992.
Edmondson, W. H. and Meech, J. F. (1994). Putting Task Analysis into Context. SIGCHI Bulletin, 26(4):59-63, October 1994.
INFORMATION SEARCH TO SUPPORT USER'S TASK Katsuhiko Ogawa
Nippon Telegraph and Telephone, 1-1-6 Uchisaiwai-cho, Chiyoda-ku, Tokyo 100-19, Japan Tel: +81-3-3509-5944 Fax: +81-3-3503-4523
[email protected] Many database systems are widely used to support users' tasks. They are usually very useful, but they can be inconvenient if users cannot easily operate them. The book metaphor approach was created to simplify database access. The user interacts with the
databases through a 'book-like' screen and 'book-like' functions such as the table of contents, the index, or browsing. Computer functions, such as string searches, are also supported. A database system has been developed using the book metaphor interface to a set of about 400 human-computer interface design guidelines. The aim of the system is to support the software design review task more effectively and efficiently than the original paper version. The interface is more comfortable, but users do not quickly find guidelines appropriate for their designs. The problem is that inexperienced designers (users) do not know the details of the guidelines. Users often take a lot of time to browse the guidelines and to find the most appropriate words for string searching. Although the browsing technique is inefficient, it is used more frequently than string searching. It is necessary for users to locate the appropriate guidelines effectively in order to produce high-quality interfaces. This paper discusses improving access to the guideline databases based on the context of the designer's task. The most efficient search method would reproduce the experience of many users, and neural networks are a logical method of achieving this. We first propose a method to find appropriate guidelines from words employed by users, using neural networks. The user can select the preferable words from a list of words drawn up by HCI experts, and then obtain the guidelines through the neural network. The network is trained with the knowledge of the experts in advance. Each word from the experts corresponds to a unit in the input layer, and each guideline corresponds to a unit in the output layer. This method, however, does not always find guidelines appropriate for supporting the users' work, because the experts and the inexperienced designers use and understand the words differently. The method often suggests too few or too many guidelines. A new method is therefore proposed: to use a list of the words used by the inexperienced designers themselves. The list is composed of three categories: 'where is the problem', 'what is the problem' and 'how to improve it'. The method displays the three category lists with three prompt messages: 'where', 'what' and 'how'. Users select a word from each category, and the selected words are input to the neural network. This idea comes from observing many designers' behaviour in design review work with guidelines, checklists, or design specifications. Several designers with no human factors experience participated in an experimental design review task. They were provided with a representation of a bad interface design. One group was instructed to improve the design using the new neural searching, and the other group was instructed to improve it using ordinary string searching. The results indicate that both groups made a similar number of improvements in similar task completion times, but the neural searching group produced higher-quality improvements and reported higher user satisfaction. We often find the same problem in other database applications. We believe, as a first step towards context-based information search, that an approach combining neural networks with task analysis would be useful for supporting the user's task.
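The abstract gives no implementation detail, so the following is only a toy rendering of the word-to-guideline network it describes: one input unit per word, one output unit per guideline, with a hand-filled weight matrix standing in for the network trained on expert knowledge. All words, guidelines and weights are invented for illustration.

    # Toy sketch of word-to-guideline search: one input unit per word,
    # one output unit per guideline, weights standing in for training.
    WORDS = ["menu", "colour", "message", "layout"]   # from 'where/what/how' lists
    GUIDELINES = ["G12: limit menu depth",
                  "G47: use colour sparingly",
                  "G93: phrase error messages constructively"]
    # WEIGHTS[i][j]: strength linking word i to guideline j (illustrative values)
    WEIGHTS = [[0.9, 0.0, 0.1],
               [0.0, 0.8, 0.0],
               [0.1, 0.0, 0.9],
               [0.3, 0.4, 0.2]]

    def suggest(selected_words, threshold=0.5):
        x = [1.0 if w in selected_words else 0.0 for w in WORDS]
        scores = [sum(x[i] * WEIGHTS[i][j] for i in range(len(WORDS)))
                  for j in range(len(GUIDELINES))]
        return [g for g, s in zip(GUIDELINES, scores) if s >= threshold]

    print(suggest({"menu", "message"}))   # -> suggests G12 and G93

Whether too few or too many guidelines come back is then a matter of the threshold and of how well the input vocabulary matches the designers' own words, which is exactly the problem the three-category word lists address.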
DESIGNING FOR INTERNAL vs. EXTERNAL DISCOURSE IN GROUPWARE FOR DEVELOPING CRITICAL DISCUSSION SKILLS
Dan Suthers
Learning Research & Development Center, University of Pittsburgh, 3939 O'Hara Street, Pittsburgh, PA 15260, USA. Tel: +1-412-624-7036 Fax: +1-412-624-9149
[email protected] As future "consumers" of science, today's students need to understand the process by which the claims of science are generated and revised. Towards this end, we are experimenting with a
software environment, called "Belvedere" (Paolucci, Suthers & Weiner 1995), for supporting students' learning to engage in critical discussion of competing scientific theories. Herein we consider the locus of relevant student discourse with respect to the resources provided by supporting software, and the implications of this locus for the design of that software. Our experience suggests that in applications where a nontrivial portion of this discourse is external to the software, we may find exceptions to standards of "good" interface design. A number of researchers have experimented with hypertext systems and graphical interfaces for supporting argumentation (Conklin & Begeman 1987, Fischer, McCall & Morch 1989, Smolensky, Fox, King & Lewis 1987, Streitz, Hannemann & Thuring 1989). For the most part, these systems are designed to provide either a medium for a generic competent reasoner, or support for an expert user in a specific professional practice. The "Belvedere" effort seeks to support the development of scientific argumentation skills in young students. These students can't be presumed to have either general skills of constructing arguments or specific knowledge of a domain. Therefore, the design of Belvedere has had to address the cognitive and motivational limitations and requirements of these unpracticed beginners, as presented in the psychological literature and as we encountered them in formative testing with students. Briefly, these limitations include (1) difficulty recognizing abstract relationships implicit in scientific theories and arguments, (2) difficulty focusing attention on particular problems encountered in the construction and evaluation of complex arguments, (3) lack of domain knowledge, and (4) lack of motivation. Belvedere addresses these limitations by (1 and 2) giving arguments a concrete diagrammatic form, (2) providing an automated, on-demand argumentation advisor, (3) providing access to on-line information resources, and (3 and 4) supporting students working in small groups to construct documents to be shared with others (Scardamalia & Bereiter 1991, Slavin 1990). Superficially, Belvedere is networked groupware for constructing representations (Paolucci et al. 1995). The interface looks like a drawing program, but using it feels more like assembling components into desired configurations. However, the utility of Belvedere's representations is primarily in their stimulus value rather than their modeling value. When Belvedere stimulates productive discourse, in some student groups much of this discourse occurs external to the representations that result. Because of this, our emphasis is primarily on designing representations whose production and inspection stimulate critical discussion, and only secondarily on representations that are adequate in themselves as a medium of communication or as the basis for final-product representations of a debate.
Examples of design issues
The emphasis on stimulating critical discussion complicates the criteria for interface design. Although we design to make it easy to construct diagrammatic representations of the dialectical aspects of science, we also design to stimulate external discourse that need not be recorded in the diagram. We have found that the latter goal can overrule the utility features we would otherwise provide in support of the former. This point is illustrated with a few examples. Statements in Belvedere are embedded in shapes that represent their epistemological status (e.g., as "theory," "hypothesis," or "data"). Users often discuss the epistemological status of a statement before representing it in the diagram. An object can only have one shape at a time; therefore this discussion of epistemological status cannot be part of the diagram. Is this a design flaw of the graphical language? Should we use an epistemologically noncommittal representation for statements, and provide annotations with which users can record any disagreement concerning the epistemological status of a statement? If the goal is to "push" all discussion into the interface, perhaps these questions are answered in the affirmative. However, it may be useful to force a decision prior to entry in the diagram precisely because it stimulates discussion towards making the decision. Otherwise, for example, students might
never care to discuss the difference between "data," "hypotheses," and "theories." We have not resolved this issue, but the present point is that it is a nontrivial issue. It illustrates the danger of assuming that optimizing a representation with respect to criteria of epistemological adequacy will constitute an optimization of the representation with respect to the larger task of interest. Enforcement of semantic constraints provides another example. In some versions of Belvedere, semantic constraints on the links are not enforced. For example, an "explains" link can be drawn from data to theory as well as from theory to data. (Instead of enforcement, we provide an "Advisor" that, at the user's request, looks for these and other semantic anomalies that can be detected on a purely formal basis, and makes suggestions for improvements.) If Belvedere were a tool for use by expert members of some community of practice, there would be no point in allowing users to make such errors. However, Belvedere is not such a tool, because its users do not yet share standard terminology and practice in argumentation. Furthermore, in a learning environment we must consider the role of "errors" in the learning process. Some errors may be so superficial that they are not likely to result in a useful learning experience. Perhaps the interface should be designed to make these errors impossible, or they should be ignored as unimportant. On the other hand, delayed or absent feedback is clearly more appropriate for incomplete or problematic patterns of argumentation: immediate "correction" could prevent users from engaging in processes of theory criticism and revision that are encountered in the real world. Investigation of whether immediate enforcement of semantics, feedback on request, or no feedback at all has a better qualitative effect on the users' discussions is ongoing. The point, again, is that this is a nontrivial issue, illustrating that interface features that are considered "good" for certain subtasks may be suboptimal for the larger task of interest.
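The advisor's checks are described above as "purely formal"; the sketch below (hypothetical node and link names, not Belvedere's actual code) shows what an on-demand check of that kind might look like: links are never blocked at creation time, and anomalies such as an "explains" link running from data to theory are only reported when the user asks for advice.

    # Hedged sketch of on-request advice instead of enforced semantics:
    # every link is accepted when drawn; anomalies are reported on demand.
    def advise(nodes, links):
        """nodes: {name: kind}; links: [(src, relation, dst)]. Returns suggestions."""
        suggestions = []
        for src, rel, dst in links:
            if rel == "explains" and nodes[src] == "data" and nodes[dst] == "theory":
                suggestions.append(
                    f"'{src}' is data: did you mean that '{dst}' explains '{src}'?")
        return suggestions

    nodes = {"fossil record": "data", "continental drift": "theory"}
    links = [("fossil record", "explains", "continental drift")]   # formally anomalous
    print(advise(nodes, links))

The design choice at issue is visible in the structure: the check lives in a separate function invoked at the user's request, not in the code path that creates links.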
Conclusions
The specific requirements of our application may be somewhat unusual, but the lesson can be generalized. Design to support discourse processes must transcend the representational environment of the software itself, even in software that specifically relies on the utility of on-line representations for discourse. Users' discourse processes take place in the social environment as well as within the representational and computational resources provided by support software. Thus, the utility of software features should be evaluated in terms of how well they stimulate the right kind of activity in the total human-computer system. We do not assume that local optimization of software support for isolated subtasks (e.g., making "correct" argument diagrams) always optimizes overall task performance. Rather, our main question is: what kind of discourse is facilitated or stimulated by each feature of the interface and of the task posed to the students, and what kind of discourse is inhibited?
Acknowledgements
This research was conducted while supported by grant MDR-9155715 from the NSF Applications of Advanced Technology program. The author also thanks Violetta Cavalli-Sforza, John Connelly, Alan Lesgold, Massimo Paolucci, and Arlene Weiner for valuable input and support.
References
J. Conklin and M. L. Begeman (1987). gIBIS: A hypertext tool for team design deliberation. In Hypertext '87, pages 247-252, Chapel Hill, NC, November 1987.
G. Fischer, R. McCall, and A. Morch (1989). Janus: Integrating hypertext with a knowledge-based design environment. In Hypertext '89, pages 105-117, Pittsburgh, PA, November 1989.
M. Paolucci, D. Suthers, and A. Weiner (1995). Belvedere: Stimulating students' critical discussion. In CHI '95 (to appear), Denver, CO, May 1995.
M. Scardamalia and C. Bereiter (1991). Higher levels of agency for children in knowledge building: A challenge for the design of new knowledge media. The Journal of the Learning Sciences, 1(1):37-68, 1991.
R. E. Slavin (1990). Cooperative Learning: Theory, Research, and Practice. Englewood Cliffs: Prentice-Hall.
P. Smolensky, B. Fox, R. King, and C. Lewis (1987). Computer-aided reasoned discourse, or, how to argue with a computer. In R. Guindon, editor, Cognitive Science and its Implications for Human-Computer Interaction. Lawrence Erlbaum.
N. A. Streitz, J. Hannemann, and M. Thuring (1989). From ideas and arguments to hyperdocuments: Traveling through activity spaces. In Hypertext '89, pages 343-364, Pittsburgh, PA, November 1989.
III.2 Winning the Market of HMS
Winning the Market of Human-Machine Systems (HMS)
Elena A. Averbukh
Laboratory for Man-Machine Systems (IMAT-MMS), University of Kassel (GhK), D-34109 Kassel, Germany Fax: +49 561 804 35 42 Tel: +49 561 804 27 54 e-mail: [email protected]
This panel discussion is motivated by the increasing competition among both research and development institutions and companies in today's Human-Machine Systems marketplace. Only those who manage to
- respond rapidly to market conditions with a high degree of HMS differentiation and value-based customer services adapted to the broader geography of the marketplace,
- efficiently save resources and the environment, and
- improve social health aspects
can really win the HMS market.
This competitiveness problem challenges the ingenuity of all members of enterprises and demands
- a clear understanding of the burning problems and goals of the analysis, design and evaluation of human-machine systems,
- the role of the human factor within the whole life cycle of the systems,
- innovative techniques and strategies which support the entire HMS life cycle, including its further reuse and technology transfer,
- usage of new emerging technologies, e.g., multimedia human interface technology, and
- appropriate criteria, technologies and standards.
The structure of this market is schematically shown in Fig. 1. Normally both parties which negotiate about systems' requirements/services comprise at least two main hierarchical levels, i.e., managers at the top level, and designers (for HMS producers) and end-users (for HMS customers in industry and government) at the subordinate level. The following five main phases of the systems' life cycle are executed by these parties:
• establishment of the systems' (safety) design goals, e.g., of plant operation and control,
• rough HMS requirements formulation,
• HMS requirements specification,
• design and development with end-user participation, and
• marketing and maintenance.
These phases strongly intersect with each other, particularly through the related internal and external quality control loops, as shown in Fig. 1.
Unfortunately, the significant distance between hierarchical levels in both HMS producers' and customers' organisations, as well as cultural differences, are still critical issues for winning the HMS market and for effective technology transfer. Hence, a sound discussion is needed of the
- burning problems, and
- criteria, technologies and standards,
as well as of future perspective research trends related to all the above-mentioned phases of HMS development and customisation, in at least two "dimensions": job organisation vs. cultural aspects. Table 1 reflects our preferences in selecting the panelists who, to our mind, can consistently and soundly highlight the main points of the whole scope of the proposed discussion within the selected "coordinates" from different points of view. Advanced design concepts such as integrated user-centered design (G. Johannsen), vital technologies for the knowledge industry (E. Averbukh), and the "evolutionary ergonomics" concept (K. Kawai) will be presented and discussed.
Table 1
                     Producers                             Customers
Culture/Geography    Designers           Managers          Managers              End-Users
Japan                Toshio Fukuda       Kensuke Kawai     -                     -
Europe               Gunnar Johannsen    -                 Rajko Milovanovich    Alberto Stefanini
Chair: Dr. Habil. Elena Averbukh (IMAT-MMS, University of Kassel, Germany)
Panelists:
Prof. Dr. Toshio Fukuda (Nagoya University, Japan)
Kensuke Kawai (Toshiba Corporation, Fuchu Works, Japan)
Prof. Dr. Gunnar Johannsen (IMAT-MMS, University of Kassel, Germany)
Dr. Rajko Milovanovich (British Telecom, Great Britain)
Dr. Alberto Stefanini (CISE, Italy)
The presented work is partly supported by the Commission of the European Union under the BRITE/EURAM II programme in Project 6126 (AMICA: Advanced Man-Machine Interfaces for Process Control Applications). The consortium partners are: CISE (Italy), ENEL (Italy), FLS Automation A/S (Denmark), University of Kassel (Germany), and Marconi Simulation and Training (Scotland).
Figure 1. Market of Human-Machine Systems: Rough Project Requirements Formulation, Project Requirements Specifications, and Design & Development with End-User Participation, connected by Internal and External Quality Control loops.
III.3 Interaction Design 1
Different Approaches in Visual Interactive Software Construction
Olivier Esteban a, Stéphane Chatty a and Philippe Palanque a,b
a Centre d'Études de la Navigation Aérienne, 7 avenue Edouard Belin, 31055 Toulouse Cedex, FRANCE
b LIS, University of Toulouse 1, Place Anatole France, 31042 Toulouse Cedex, FRANCE
In this paper we focus on tools and environments for visual interface development that have been proposed recently. The aim is to introduce Whizz'Ed, an experimental editor for the construction of highly interactive or animated applications, and to highlight the original contribution of the Whizz'Ed approach. Whizz'Ed provides an easy-to-use environment using elementary components for the visual design and development of a highly interactive interface, allowing rapid construction by non-programmer users.
1. INTRODUCTION
Since the beginning of the 80's, many tools have been developed to support user interface construction. Most of these tools can be divided into two main categories: user-interface toolkits and user-interface development systems. Though there are many of them, tools from those two categories only offer a partial solution to the problem of building graphical interfaces, and more precisely to the description of their behaviour. Indeed, toolkits are difficult to use even for experienced programmers, and user-interface development systems commonly known as interface builders (e.g. Visual Basic™) suffer from a lack of programming possibilities (programming can usually only be performed through a restricted programming language) and genericity (they are often dedicated to a certain kind of interactive system). Hence, the design of visual programming tools (such as the ones described in [1]) for interface construction is still a stimulating challenge [14], especially when it comes to allowing non-specialists to develop highly interactive, direct manipulation interfaces [13]. This paper is dedicated to the visual construction of highly interactive software. The first section describes previous work on user interface construction and on visual programming. The second part presents an overview of Whizz'Ed, an experimental visual tool devoted to the construction of highly interactive or animated interfaces. In this section a snapshot of the tool is presented, and a simple example is shown. The paper ends with a prospective view on Whizz'Ed aimed at coping with non-conventional interfaces, such as those including multimodal or sound interaction.
2. RELATED WORK
Most difficulties in describing the behaviour of objects in highly interactive interfaces have long been identified. In addition to the interface builders mentioned earlier, several approaches have been proposed to solve these difficulties. These approaches can be put in three main classes: constraint-based systems, state-transition diagrams, and data-flow systems. The use of constraints is well demonstrated by ThingLab, which implements that notion by providing a general simulation environment where physics laws are described by means of constraints [9]. Constraints in ThingLab are bi-directional, to allow objects to be attached and updated simultaneously. Garnet [11] is another constraint-based system, where constraints are Common Lisp expressions stored in the slots of objects. Garnet offers a set of tools to assist the design and rapid prototyping of user interfaces: an interface builder called Lapidary, an automatic dialog box and menu creation system called Jade, and a spreadsheet for constraint specification called C32. Lapidary permits the drawing of the layout of the interface and the design of the dialogue by describing the user interface in terms of the user's actions. Jade is used to automatically create dialog boxes from a textual description. C32 uses the spreadsheet paradigm and allows designers to create relations between the objects of the interface. NoPumpG II [16] is a spreadsheet-like visual language that attempts to combine the power of interactive graphics with the ease-of-use of a spreadsheet, as it is possible to define constraints using a spreadsheet-like interface. It does this by allowing users to create applications that consist of spreadsheet-like cells used to control and report various attributes of interactive graphical primitives (behaviour and appearance) created in a simple MacDraw™-like environment. However, NoPumpG II does not provide a general approach to events. More generally, constraint-based systems do not allow designers to describe behaviours in a very natural way. Indeed, it is more difficult for them to express behaviours in an abstract way (as needed with constraints) than to describe causality links between objects (as needed with data-flow based systems). Moreover, constraints are usually expressed in textual languages, which may be harder to understand (e.g. first-order logic or temporal logic) than graphical ones [10] (e.g. Petri nets or state diagrams). RAPID/USE [15] uses state-transition diagrams in order to represent the behaviour of an interactive system. The nodes of the diagram represent the messages to be displayed. The arcs represent the transitions and thus fully state the appearance of messages. Transitions may occur either due to the user's actions or to external events. Iconit [4] is an environment based on state-transition diagrams for the development of iconic interfaces. It is composed of two subsystems: Iconit-D and Iconit-X. Using Iconit-D it is possible to design and specify the behaviour of the interface. The overall scheme of the user-application dialogue is represented by a state-transition diagram drawn interactively. Each node of the diagram is associated with a window. An arc between two nodes (n1 and n2) represents the possibility of navigating from the window associated with n1 to the window associated with n2. The layout of windows (e.g. menus, icons) is created independently. The windows are stored in a window base linked to the state-transition diagram.
Iconit-X is composed of a verifier and an interpreter used to verify and test the interface specified using Iconit-D. Using this environment it is possible to distinguish between actions corresponding to the semantics of the application (usually known as callbacks) and navigation actions. Such tools based on state-transition diagrams are very useful for creating interfaces, as they make explicit both the representation of states and state changes. Iconit
describes in an efficient way the overall organization of the windows for highly interactive interfaces, while RAPID/USE describes menus, buttons, dialog boxes, etc. for conventional interfaces. However, in most highly interactive interfaces the variety of choices at the user's disposal is much larger than in traditional interfaces, as the user can freely manipulate interface objects. The description of the sequencing of actions using state-transition diagrams therefore becomes overcomplicated (a problem usually referred to as combinatorial explosion). In order to describe such complex behaviours in a declarative way (using geometrical or temporal constraints), the data-flow paradigm has been introduced in user interface construction environments. Systems based on data-flow make programs easier to construct, owing to the natural understandability of data-flow diagrams. Such systems can be used by a broad range of people with different programming levels. NL [6] is a visual programming language based on a data-flow programming model. An NL data-flow program is a directed graph. The arcs represent the paths over which tokens move between nodes, where they may be transformed into other tokens. NL uses a data-driven firing rule: a node is fired when each of its input ports holds a token. NL provides composite nodes which enable programmers to recode groups of nodes into comprehensible chunks. InterCONS [14] is another visual data-flow language, in which certain primitives are associated with interactors like buttons or sliders. Show and Tell™ [8] is a visual programming language for school children. The aim of Show and Tell is to develop the programming knowledge of school children. The model used by Show and Tell is based on concepts of data-flow and completion. The basic idea is to enhance the learning process by providing mechanisms for direct manipulation of values. Besides, Show and Tell uses the data-flow model's intrinsic concurrency, which is very useful for modelling communication systems. Completion is used to unify the concepts of communication, computation and data query. The Show and Tell system is presented to the children as a tool used to build and solve puzzles corresponding to their goals. Graphically, a puzzle consists of boxes and arrows which are connected together. The completion shows them how far they are from the goal. Other approaches based on data-flow can be found in Prograph [5], an object-oriented visual programming language and software development environment for the Macintosh™. Fabrik [7] enhances the traditional data-flow model with bidirectional data-flow. This extension permits the use of nodes that combine several functions (typically a function and its inverse).
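The data-driven firing rule mentioned for NL is easy to state operationally: a node holds one queue per input port and fires as soon as every queue is non-empty, consuming one token per port. The following generic sketch (not NL's implementation; names invented for illustration) makes that rule concrete.

    # Generic data-driven firing rule: a node fires when every input port
    # holds a token; the function consumes one token per port and may emit
    # its result to downstream ports.
    class Node:
        def __init__(self, n_inputs, fn, downstream=None):
            self.ports = [[] for _ in range(n_inputs)]   # one FIFO per input port
            self.fn, self.downstream = fn, downstream or []
        def put(self, port, token):
            self.ports[port].append(token)
            if all(self.ports):                          # the firing rule
                args = [q.pop(0) for q in self.ports]
                result = self.fn(*args)
                for node, port in self.downstream:
                    node.put(port, result)

    out = Node(1, lambda x: print("token:", x))
    add = Node(2, lambda a, b: a + b, downstream=[(out, 0)])
    add.put(0, 3)
    add.put(1, 4)    # only now do both ports hold tokens, so 'add' fires -> 7

Composite nodes of the NL kind then amount to wrapping such a sub-graph behind a single Node-like interface.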
3. WHIZZ'ED
The purpose of such a tool is to allow the creation, by direct manipulation, of interactive objects used to build highly interactive or animated interfaces. The conceptual model of Whizz'Ed is based on the data-flow model. In order to enhance this model, Whizz'Ed uses the building-games metaphor promoted by the Lego-Logo [12] construction kit, which is a very interesting example of the use of this concept. Lego-Logo is a rich construction environment which allows designers to construct creatures with electronic bricks like sensors, motors, lights, logical gates, flip-flops, timers, etc. A parallel can be made between the creatures of Lego-Logo and the highly interactive applications to be built with Whizz'Ed. From that point of view, the behaviour of a Lego-Logo creature corresponds to the behaviour of the components of the application. Whizz'Ed provides a set of predefined components (called elementary bricks) that can be wired together to build new components. Graphically connecting bricks together results either in creating the data-flow or in dynamically
reconfiguring it. Whizz'Ed proposes a visual language representing objects by icons and data-flow by lines connecting these icons. Plugs are represented by small rectangles coupled with the icon representing the object. The shape of the plug varies according to its type. An example of a Whizz'Ed construction can be seen in Figure 1. Whizz'Ed consists of three main parts: the palette, the edition area and the simulation area.
• The palette contains the graphical representation of the bricks (cf. left part of Figure 1) that are supplied to the designer. The chaining of these elementary bricks in order to build complex behaviours avoids the limitations that would be introduced if only complex predefined bricks were available. Furthermore, compound bricks can be exported as elementary bricks, thus increasing the set of bricks primarily proposed. This allows designers to cope with reusability as well as hierarchical refinement in order to handle complex real-world systems.
• The edition area allows the designer to build the highly interactive or animated interfaces with elementary bricks. The interface designer uses the palette (as in classical drawing tools) by direct manipulation, selecting an icon in the palette, dragging it into the edition area and dropping it at the desired place. The data-flow between bricks is also built using direct manipulation, by selecting a plug of a brick, dragging it and dropping it above another one. Syntactic constraints are automatically checked by the editor in order to ensure that the types of the plugs are compatible (thus eliminating many errors at design time). An example of the edition area is shown in the middle part of Figure 1.
• The simulation area (cf. right part of Figure 1) aims at graphically representing the execution of the visual program built in the edition area. This area can be used for debugging, rapid prototyping or simulation of the construction.
The example presented in Figure 1 describes a rectangle moving on a screen according to a predefined elliptic trajectory, each of its movements occurring at a given time interval. The icons from left to right correspond to: a tempo, which is designed to synchronize animated objects; a rotor, which is a point instrument that sends positions on an ellipse; and the rectangle itself. The data-flow is graphically represented by an arc between the tempo and the rotor and by another one between the rotor and the TopLeft input plug of the rectangle. This small visual program behaves as follows: according to the initial values in its input plugs, the tempo produces a pulse in its output plug that is put into the input plug of the rotor. The rotor reacts to this pulse by calculating the next position for the rectangle. This position is put into its output plug, which is related to the input plug of the rectangle. As soon as the rectangle receives the position (the new position of its top-left corner) it changes its position on the screen. In this window, the elliptic trajectory of the rotor is shown, but it could be hidden if necessary. The trajectory corresponds to the internal parameters of an object, and these can be accessed interactively by double-clicking on its icon. This basic example is dedicated to animation. However, it can easily be made interactive by replacing the tempo brick with a mouse brick devoted to the management of the mouse position. The resulting construction will then be able to handle the user's actions.
Figure 1. Whizz'Ed visual construction of the animated rectangle.
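Translated out of the visual notation, the construction of Figure 1 amounts to three bricks wired in a chain. The sketch below approximates its semantics in ordinary code for illustration only (Whizz itself is a C++ library; the class and method names here are invented):

    import math

    # Sketch of the tempo -> rotor -> rectangle chain of Figure 1.
    class Rotor:
        """Point instrument emitting successive positions on an ellipse."""
        def __init__(self, cx, cy, rx, ry, step=0.2):
            self.cx, self.cy, self.rx, self.ry = cx, cy, rx, ry
            self.t, self.step = 0.0, step
        def pulse(self):
            self.t += self.step
            return (self.cx + self.rx * math.cos(self.t),
                    self.cy + self.ry * math.sin(self.t))

    class Rectangle:
        def __init__(self):
            self.top_left = (0.0, 0.0)
        def move(self, pos):
            self.top_left = pos        # a real brick would redraw on screen

    rotor, rect = Rotor(100, 100, 40, 20), Rectangle()
    for _ in range(3):                 # the tempo brick: one pulse per interval
        rect.move(rotor.pulse())
        print(rect.top_left)

Replacing the driving loop with mouse events is the textual analogue of swapping the tempo brick for a mouse brick, as described above.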
4. CONCLUSION AND FUTURE WORK
In this paper, we have presented different directions investigated in the area of user interface construction. We have then briefly presented our environment, called Whizz'Ed, which is a visual programming environment for highly interactive interfaces. Whizz'Ed is based on Whizz [2], a C++ library ported to most X Window systems (e.g. Sun, HP, DEC). The underlying model of Whizz is based on the data-flow paradigm, to which Whizz'Ed provides a direct manipulation interface. The use of a graphical notation such as data-flow diagrams makes Whizz'Ed usable by a broad range of people with different programming levels. The work currently done at CENA aims to widen the range of applications that can be designed using Whizz'Ed. This is threefold:
• extending the underlying library Whizz, as in [3] where two-handed manipulation has been introduced,
• extending the set of predefined bricks in order to improve the usability and the efficiency of the editor,
• integrating other media such as sound and video in new bricks.
REFERENCES
1. S. K. Chang. Visual languages: A tutorial and survey. IEEE Software, Jan. 1987.
2. S. Chatty. Defining the behaviour of animated interfaces. In Proceedings of the IFIP WG 2.7 working conference, pages 95-109. North-Holland, Aug. 1992.
3. S. Chatty. Extending a graphical toolkit for two-handed interaction. In Proceedings of the ACM UIST, 1994.
4. M. Costabile and M. Missikoff. Iconit: an environment for design and prototyping of iconic interfaces. In Journal of Visual Languages and Computing, pages 151-174, June 1994.
5. P. Cox, F. Giles, and T. Pietrzykowski. Prograph: A step towards liberating programming from textual conditioning. In IEEE Workshop on Visual Languages, pages 150-156, Oct. 1989.
6. N. Harvey and J. Morris. NL: A generic purpose visual dataflow programming language. Technical report, University of Tasmania, Australia, Oct. 1993.
7. D. Ingalls, S. Wallace, Y. Chow, F. Ludolph, and K. Doyle. The Fabrik programming environment. In IEEE Workshop on Visual Languages, pages 222-230, Sept. 1988.
8. T. D. Kimura, J. W. Choi, and J. M. Mack. Show and tell: A visual programming language. In E. P. Glinert, editor, Visual Programming Environments: Paradigms and Systems, Jan. 1990.
9. J. H. Maloney, A. Borning, and B. N. Freeman-Benson. Constraint technology for user-interface construction in ThingLab II. In OOPSLA'89 Proceedings, pages 381-388, Oct. 1989.
10. T. Moher, B. Blumenthal, and L. Leventhal. Comparing the comprehensibility of textual and graphical programs: the case of Petri nets. In Fifth workshop on empirical studies of programmers. Ablex Publishing Company, 1993.
11. B. A. Myers et al. Garnet: comprehensive support for graphical, highly interactive user interfaces. IEEE Computer, pages 71-85, Nov. 1990.
12. M. Resnick. Behavior construction kits. Communications of the ACM, pages 66-71, July 1993.
13. B. Shneiderman. Direct manipulation: a step beyond programming languages. IEEE Computer, pages 57-69, Aug. 1983.
14. D. N. Smith. The interface construction set. In Visual Languages and Applications. Plenum Pub, 1990.
15. A. I. Wasserman. Extending state transition diagrams for the specification of human-computer interaction. IEEE Transactions on Software Engineering, SE-11:699-713, Aug. 1985.
16. N. Wilde and C. Lewis. Spreadsheet-based interactive graphics: from prototype to tool. In CHI'90 Proceedings, pages 153-159, 1990.
Toward a Comprehensive Manipulation Solution on 3D Workspace
Nobuo Asahi a, Kazuhisa Okada a, Akira Maenaka a, Eun-Seok Lee b and Keiji Kobayashi a
a Personal-use Electronics Laboratory, Mitsubishi Electric Corporation, 5-1-1 Ofuna, Kamakura, Kanagawa 247, Japan
b Department of Information Engineering, Sung-Kyun-Kwan University, Seoul, Korea
1. INTRODUCTION
We have already developed a system for the construction and execution of 3D animated metaphor environments, named MECOT (Metaphor Environment Construction Tool), which provides (i) an easy and efficient construction model for designers, by explicitly separating environments from graphical objects, and (ii) application-transparent environments for users, by arranging application functions in a consistent environment [1]. As a result, it was confirmed that MECOT enables rapid construction of a variety of three-dimensional graphical user interfaces (hereinafter, 3D GUIs). Since MECOT can adopt an unlimited variety of 3D objects as interface metaphors and show animated presentations, including viewpoint changes triggered by users' operations and applications' events, the 3D GUIs constructed with MECOT are more intuitive to users than ordinary 2D GUIs; however, manipulation on the 3D GUIs has become more difficult than on 2D GUIs. Generally, manipulation on 3D graphics is much harder than on 2D graphics, because there are six degrees of freedom in 3D while there are only three in 2D. Moreover, since MECOT assumes that the environment spreads over a wide 3D space, users are required to move their viewpoint to accomplish their task. In order to give a good look and feel to the users, providing an easy 3D manipulation method is essential, because the feel of the manipulation strongly affects the usability of user interfaces. There is a lot of work relating to this problem [2-6]; however, most of it tries to solve specific issues such as object rotation and viewpoint control. In order to provide users with a natural way of manipulation on the 3D GUIs, a comprehensive solution to the 3D manipulation issues should be investigated. We have categorized the issues of 3D manipulation into the viewpoint control issue, the short distance object placement issue, and the long distance object placement issue. In this paper, we report the results of the experiments we conducted for the viewpoint control issue and the short distance object placement issue. The experiments were executed by
implementing several manipulation methods and evaluating them in terms of time spent to complete the given task and error rate during the execution. For these experiments, we did not use special input devices for 3D manipulation.
2. ISSUES ON 3D MANIPULATION
2.1. Categorizing Issues
The 3D GUI assumed in this paper is a 3D graphical environment containing various 3D graphical objects. Some of the graphical objects are translatable and/or rotatable in the environment. When users would like to accomplish their tasks in this environment, they must manage a combination of object translation/rotation and viewpoint movement. For example, consider a 3D GUI consisting of a desk object with drawers and some document objects on the desk, where the given task is to store one of the document objects on the desk in a drawer which is currently out of view. The user's manipulation strategy may be to first move one of the documents close to the drawer, change the viewpoint to see both the document and the drawer in the same view, then open the drawer and drag the document into the drawer. Independent of the input devices to be used, from our experience, there are three categories of issues in 3D manipulation: viewpoint control, short distance object placement, and long distance object placement. Viewpoint control is a combination of viewpoint translation and view vector rotation. Generally, the view angles used to generate scenes on screens are smaller than those of our actual eyes, so that users often lose their point of interest (POI hereafter) while they are changing their viewpoint and view vectors, and it is usually hard to recover from the lost situation. Preventing users from losing the POI is an important issue. Short distance object placement is a combination of view vector rotation, object translation and object rotation. The task in this category is to move objects to a different place and align them. In order to move and/or align objects, users need to rotate their view vectors, because the source and the destination are not arranged to be seen in one fixed view. When users change their view vectors, the x, y and z directions of object movement generally seem different, and this makes users confused. As the direction of movement depends on the local coordinate system on which the moving object is based, we assume that the choice of the local coordinate system is a key issue in this category. Long distance object placement is a combination of the above two. Depending on what input devices are used, all of the manipulations of the viewpoint, the view vector and objects need to be defined as actual operations on the input devices. An optimal combination of the view control and the object placement manipulation, and an optimal arrangement of the actual operations, could provide a fully comprehensive 3D manipulation solution for users.
2.2. Proposed Manipulation Methods for Each Category
We have developed a test environment for 3D manipulation on which several different manipulation methods can be evaluated. With this test environment, we have investigated the viewpoint control issue and the short distance object placement issue to find the best manipulation method for each issue. As for the viewpoint control issue, one of the important techniques is to give users a good manipulation metaphor to prevent them from losing the POI. According to Ware et al., for the control of viewpoint movement in a wide-spread graphical environment like a maze, flying vehicle control, which gives users a control like driving a flying vehicle, is better than the other two metaphors, eyeball in hand and environment in hand [2]. In this paper, we examine flying vehicle control in more detail by adding some options. The proposed methods for viewpoint control are:
(1) Flying vehicle control only.
(2) Flying vehicle control with R&L translation: a way to move in the right and left directions is added to the basic flying vehicle control.
(3) Flying vehicle control with fire-lock control: provides a way to lock the viewpoint to a specific object when the user hits the space key.
(4) Flying vehicle control with wide-view: provides a wide view showing more of the right and left for users.
Regarding the short distance object placement issue, since the choice of local coordinate system is one of the important issues, we first examined some local coordinate systems. The candidates are as follows (a schematic illustration follows the lists below):
(1) World coordinate.
(2) Local coordinate in terms of the moving object.
(3) Local coordinate in terms of the destination object.
(4) Local coordinate in terms of the view vector.
Furthermore, as we noticed that positional guidance would be helpful for users while they are dragging objects, several kinds of guidance were examined for each local coordinate system:
(a) Grid guidance: shows the horizontal position of the moving object on a grid on the floor.
(b) Beam guidance: shows the x, y and z axes extending from the center of the moving object.
(c) Both grid and beam.
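The difference between the candidate coordinate systems shows up in how a raw input displacement is converted into object motion: the same mouse gesture moves the object along different world-space directions depending on which frame's axes are used. A schematic illustration (not the authors' IRIS code; frames and values are invented):

    # Schematic: one 2D mouse displacement mapped through different local
    # coordinate frames before being applied to the dragged object.
    def apply_drag(dx, dy, frame):
        """frame: 3x3 row-major basis (world, moving object, destination, or view);
        the mouse motion is interpreted along the frame's x and y axes."""
        return tuple(dx * frame[0][i] + dy * frame[1][i] for i in range(3))

    WORLD = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    VIEW  = [[0, 0, -1], [0, 1, 0], [1, 0, 0]]   # e.g. a camera looking along -x

    print(apply_drag(5, 2, WORLD))   # -> (5, 2, 0)
    print(apply_drag(5, 2, VIEW))    # same gesture, different world-space motion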
3. EXPERIMENT
3.1. Experiment System
Figure 1 shows the block diagram of the experiment system used for the experiments. The experiment environment definition defines graphical environments, so we only have to change the definition to change the experiment environment. The event manager reads an experiment environment definition and shows the environment according to the current viewpoint. The manipulation control manager is an add-on module of the event manager. In order to test various manipulation methods, one manipulation control manager is replaced by another. The experiment system runs on an IRIS 4D/340VGX and currently adopts only the three-button mouse as an input device.
Figure 1. Block Diagram of Experiment System (Manipulation Control Manager, Event Manager, Experiment Environment Definition)
3.2. Experiment Method
Five subjects were given a task for viewpoint control evaluation and a task for short distance object placement evaluation. For each task, subjects were required to accomplish it with all the proposed manipulation methods described in section 2.2. The evaluation was done by measuring the time spent to accomplish the tasks. All subjects performed these experiments every day for a week. The viewpoint control task is to look at six ornaments. Each of them is put in a box, only one of whose facets is open. The directions of the open facets are varied so that the subjects need to turn around the boxes to see the ornaments. The short distance object placement task is to move three plates from one place to another, and to pile and align the plates. The source place and the destination place are not far apart, but they cannot be seen in one view. The subjects can see the destination place just by rotating their view vectors, so no viewpoint translation is required.
3.3. Result
Figures 2-4 show the results of the experiments. Each graph shows the average, minimum and maximum time spent for each manipulation method. We can say that a better manipulation method has a smaller average time and a smaller difference between maximum and minimum times. The difference between maximum and minimum times can be regarded as one of the indicators of error rate, because manipulation errors make the accomplishment time longer. As a result of the manipulation experiments:
(1) Flying vehicle control with fire-lock control is the best for viewpoint movement (Figure 2).
(2) A local coordinate system based on the moving object is the best for object movement (Figure 3).
(3) As for positional guidance, beam or grid depends on individual preference (Figure 4).
Figure 2. Viewpoint Movement Task with Various Control Methods (time in seconds for FV, +R&L, +FL and +WV)
Figure 3. Object Movement Task under Various Coordinate Systems (world, view, object, destination)
Figure 4. Object Movement Task with Various Positional Guidances (beam, grid, beam+grid)
4. DISCUSSIONS
According to Ware et al., who investigated three types of manipulation metaphors for viewpoint control, flying vehicle control is the best for walk-through and environment in hand is the best for object investigation [2]. The task given to the subjects for the viewpoint control investigation can be regarded as a combination of the two: walk-through type and object-investigation type. Flying vehicle control with fire-lock control, which is the best way according to our experiment, is a good manipulation metaphor for this type of task, because this method provides an easy switch between flying vehicle control for
walk-through and a control similar to environment in hand for object investigation. Furthermore, with this manipulation, a speed control of viewpoint movement can be easily implemented. Mackinlay et al. suggested a way to control the viewpoint movement speed according to the distance to the POI [3]. The subjects mentioned after the experiments that the operation for POI setting feels so natural that they can use fire-lock control easily. This means that, with this method, the system can obtain the user's POI without disturbing the task. With regard to the short distance object placement issue, we could not find a significant difference among the four local coordinate systems, but according to the subjects' opinions after the experiments, most of them said that the local coordinate system in terms of the moving object provides the best feeling. As for the positional guidances, we can say that giving some positional guidance significantly reduces the operation time compared with a no-guidance environment. It might be a good idea to show guidance only when the user is grabbing an object. Long distance object placement needs to be studied next, based on the results of this research. We are now implementing a combined method incorporating the best of viewpoint control and the best of short distance object placement, and applying it to MECOT.
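The speed control alluded to, in the spirit of Mackinlay et al.'s point-of-interest movement [3], scales each step by the remaining distance to the POI: fast approach from afar, ever finer control close up. Schematically (constants and names illustrative only):

    # Schematic POI-relative speed control: each frame the viewpoint covers a
    # fixed fraction of the remaining distance to the locked point of interest.
    def step_toward(viewpoint, poi, fraction=0.1):
        return tuple(v + fraction * (p - v) for v, p in zip(viewpoint, poi))

    eye, poi = (0.0, 0.0, 0.0), (100.0, 0.0, 50.0)
    for _ in range(3):
        eye = step_toward(eye, poi)
        print(eye)    # large steps far away, progressively finer steps near the target

Because fire-lock control already gives the system the user's POI, such a speed law can be driven directly from the locked object's position.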
REFERENCES
1. Asahi, N.: An Environment for Developing Metaphor Worlds - Toward a User-friendly Virtual Work Space based on Metaware. Proceedings of the 1994 FRIEND21 Symposium, 1994.
2. Ware, C. and Osborne, S.: Exploration and virtual camera control in virtual three dimensional environments. Proceedings of the 1990 Symposium on Interactive 3D Graphics, in Computer Graphics 24, 2, pp. 175-183, ACM, 1990.
3. Mackinlay, J. D., Card, S. K. and Robertson, G. G.: Rapid controlled movement through a virtual 3D workspace. SIGGRAPH'90 Conference Proceedings, in Computer Graphics 24, pp. 171-176, 1990.
4. Chen, M., Mountford, S. J. and Sellen, A.: A study in interactive 3-D rotation using 2-D control devices. Proceedings of SIGGRAPH'88, in Computer Graphics 22, 4, pp. 121-129, 1988.
5. Houde, S.: Iterative Design of an Interface for Easy 3-D Direct Manipulation. Proceedings of CHI'92, pp. 135-142, 1992.
6. Bier, E. A.: Snap-dragging in three dimensions. Proceedings of the 1990 Symposium on Interactive 3D Graphics, in Computer Graphics 24, 2, pp. 193-204, 1990.
Tailoring non-visual interaction in a graphical environment
C. Stephanidis and R. Gogoulou
The provision of alternative, non-visual interaction techniques is necessary in order to enhance non-visual interaction possibilities in a graphical environment and fulfil the needs and preferences of blind users. This paper addresses issues related to the non-visual reproduction of a graphical environment and describes a dialogue configuration system (CONFIG) for tailoring non-visual interaction on the basis of an individual blind user's needs and preferences, through easy-to-use interactive facilities.
1. INTRODUCTION
The proliferation of Graphical User Interfaces (GUIs) and multimedia technologies in human-computer interaction has introduced additional problems for blind users in accessing computer-based applications. This is mainly due to the fact that GUIs aim towards the exploitation of the capacity of the human visual channel; however, no alternative provision is made to ensure continued access by blind users to the same computer-based systems and applications as for sighted users (until recently, blind users had the possibility to access text-based interfaces through screen reader systems*). Currently available solutions, which address this problem through adaptations, are considered of restricted scope and applicability [1], [2], [3], because:
• They take a rather narrow view of the problem domain; they adopt special-purpose techniques for the reproduction of the user interface components in non-visual form, based either on auditory cues or tactile output; and they do not provide alternative input methods which would facilitate blind users' interaction (i.e. mouse substitutes, speech recognition, etc.).
• They address specific graphical environments.
• They provide limited access to the graphical interface (e.g. access only to the text within windows).
• They do not support customisation of the non-visual graphical environments to the individual blind user's needs and preferences.
This paper focuses mainly on the last issue by addressing: (i) issues related to the non-visual reproduction of a graphical environment utilising different special-purpose devices.
* A screen reader is a software system which, during user-computer interaction: (i) extracts information regarding the display content, and (ii) selectively represents the dialogue in non-visual form.
(ii) Interactive facilities for enabling configuration of blind users' interaction in the graphical environment. In this context, we have developed a dialogue configuration system (the CONFIG system) for tailoring the graphical environment to blind users' needs and abilities.
2. NON-VISUAL INTERACTION IN A GRAPHICAL ENVIRONMENT
Interaction in a graphical environment necessitates appropriate handling and manipulation of the various interaction objects of the application (e.g. menus, buttons, text) as well as navigation in the graphical environment (e.g. exploration of the screen contents). The various toolkits that may be used for the implementation of graphical applications (e.g. Motif, Athena) introduce different layout styles and interaction techniques (e.g. keyboard interaction, mouse interaction). In order to overcome the various difficulties that may arise during blind users' interaction, it is necessary to provide appropriate non-visual interfaces by combining alternative output media and different navigational facilities, and to support additional input methods. Non-visual reproduction can be based on the following approaches [4], [5]: (a) spatial representation of the screen contents, (b) hierarchical representation of the graphical interface objects, and (c) both of the above. The GUIB project (see Acknowledgements) has addressed, amongst other issues, the accessibility of the X Windows environment for blind users. In this context, we have carried out an in-depth examination of the various objects supported by the various graphical environments, and in particular those employed in the X Windows environment, in order to identify critical issues related to the non-visual reproduction of interaction objects. The various presentational and behavioural attributes of the low-level physical entities (e.g. windows, text cursor) as well as of various widget classes (e.g. pull-down menus, command buttons) have been investigated. This work has led to an identification of objects and attributes meaningful to blind users, and has pointed out a number of critical issues that need to be considered for the reproduction of the interaction dialogue and of the lexical structure of the user interface in non-visual form. The amount of graphical information to be reproduced varies depending on the approach followed (i.e. spatial representation, hierarchical representation, or both) and on the utilised media. In certain cases, specific objects and attributes concerning layout policies can be ignored as irrelevant. For example, the separator widget, which is used for grouping options together (e.g. menu options referring to operations on files - open, save, etc. - can be separated from the "exit" option using a separator) or simply for decoration purposes, can be ignored, since it does not provide any meaningful information for the non-visual reproduction. Additionally, various widget attributes (e.g. margins, the shadow of a push button widget) can also be ignored if either speech or tactile output is used; moreover, geometry attributes can be ignored in the case of speech output, although this should not happen in the case of tactile output. Another interesting case is the representation of the window structure: if the non-visual reproduction is based on the spatial structure of the graphical interface, then overlapping windows need to be taken into account and appropriately represented, while in a hierarchical structure such issues can be ignored.
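The hierarchical approach, in particular, reduces at its core to traversing a tree of interaction objects and speaking each node on entry; the generic sketch below illustrates that idea (all names are invented; this is not the GUIB implementation, which is not described at code level).

    # Generic sketch of hierarchical non-visual reproduction: walk the widget
    # tree and speak each visited node, ignoring spatial overlap entirely.
    TREE = ("application", [
        ("menu bar", [("File menu", []), ("Edit menu", [])]),
        ("dialogue box", [("OK button", []), ("Cancel button", [])]),
    ])

    def speak(text):                 # stand-in for a speech synthesiser call
        print("SPEECH:", text)

    def navigate(node, path=()):
        name, children = node
        speak(" > ".join(path + (name,)))
        for child in children:       # 'first child' / 'next sibling' traversal
            navigate(child, path + (name,))

    navigate(TREE)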
Additionally, different navigation facilities and user-support mechanisms need to be provided in each approach. For instance, special key accelerators can be used to facilitate the exploration of the hierarchical presentation of the graphical interface objects (e.g. "help" keys
for switching between the various active applications, for traversing menu bars), while "help" messages can be used to inform the user of the current status of the dialogue. Furthermore, the combination of alternative output media can further enhance the quality of presentation of interaction objects; for instance, the utilisation of both tactile and speech/non-speech output can facilitate the exploration of the interaction objects that constitute a dialogue box.

3. THE CONFIG SYSTEM

From the above, it becomes apparent that blind users' interaction in a graphical environment can be significantly enhanced if alternative non-visual reproductions of the graphical environments are provided and interactive facilities for combining alternative interaction techniques are also supported. In this context, the CONFIG system has been designed to support:
• the selection or combination of different non-visual interaction techniques (e.g. exploration of the screen contents utilising speech, non-speech audio and tactile output) through the specification of alternative media for the various graphical objects, and
• the configuration of the utilised media (e.g. speech output parameters), through the modification of the device parameters.
The system is implemented in the X Windows environment using the Motif toolkit, and provides easy-to-use interactive facilities for the customisation of the non-visual interface.
3.1. Specification of interaction techniques
On the basis of the above examination, a generalisation of the most commonly used interaction objects has been proposed, resulting in seven basic classes of objects: Windows, Text, Menus, Buttons, Dialogue Boxes, Icons, and Text Fields. This generalisation was considered necessary because of the availability of a number of toolkits that can be used for the development of graphical applications in the X Windows environment; these toolkits support nearly the same sets of interaction objects with similar behavioural attributes, but with different presentational attributes. Additionally, the different toolkits may provide different kinds of objects with minor variations with respect to their presentational and/or behavioural characteristics, although all of these objects belong to the same general category (e.g. there are different kinds of menus - popup, pulldown, menu bars - but all of them follow some common behavioural and presentational characteristics, so they are all treated as menus). As an example, Figure 1 depicts the "Motif" and "Athena" widgets that are mapped to the "Buttons" generalised class.
Figure 1. Mapping of Motif and Athena widgets to the Buttons class (Motif widgets: XmPushButton, XmToggleButton, XmArrowButton, XmDrawnButton; Athena widget: Command).
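The mapping of Figure 1 amounts to a simple many-to-one lookup table. The following Python sketch is written purely as an illustration for this text; the table contents follow Figure 1, while the function name and the fallback for unknown widgets are assumptions, not part of the CONFIG implementation.

    # Many-to-one mapping from toolkit-specific widget classes to the
    # generalised object classes (after Figure 1).
    GENERALISED_CLASS = {
        # Motif widgets
        "XmPushButton": "Buttons",
        "XmToggleButton": "Buttons",
        "XmArrowButton": "Buttons",
        "XmDrawnButton": "Buttons",
        # Athena widgets
        "Command": "Buttons",
    }

    def generalised_class(widget_class: str) -> str:
        """Return the generalised class for a toolkit widget class.
        Widgets with no entry fall back to 'Windows' here; this
        fallback is an assumption made for the sketch only."""
        return GENERALISED_CLASS.get(widget_class, "Windows")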
The dialogue configuration system supports the specification of alternative media for these general classes of objects; the output media that are supported for non-visual interaction include speech, non-speech auditory cues, and Braille output (Figure 2). The selection of one of these media results in the utilisation of the corresponding device, while the non-visual reproduction of the current object is based on the appropriate associated attributes. For example, the selection of all the output media for windows has the following effect in the non-visual interface when entering an application window: (a) the window title is spoken; (b) a special sound is played to denote the "enter event"; and (c) the window title as well as the window frames are "shown" on the transient Braille display. In the case that only speech output is used, there are three possible non-visual representation scenarios: (i) both the window title and geometry attributes (x and y position on the screen as well as width and height) are spoken; (ii) only the window title is spoken and the blind user is given the possibility to retrieve more information (about the window geometry) by pressing a special key; and (iii) only the window title is spoken - there is no facility supported for getting any further information for the specific window. The above example supports the view that the
Figure 2. A snapshot of the CONFIG system for specifying the desired output media for the seven general classes of objects; on-line help for the Windows class.
combination of various output media can further enhance the quality of the non-visual representation of interaction objects and strengthens the user's understanding of the presented screen contents. Input operations (e.g. button activation, menu exploration) are performed by means of the standard devices (keyboard or mouse) and special devices (mouse substitutes, the routing keys of a Braille device, and a touchpad).
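The behaviour described above can be pictured as a per-class table of enabled output media that is consulted on every interaction event. The following Python sketch is illustrative only; the class and media names follow the paper, but the data structure and the event handler are assumptions of the sketch, not CONFIG's actual code.

    # Output media enabled for each generalised object class,
    # as chosen in the CONFIG dialogue (cf. Figure 2).
    OUTPUT_MEDIA = {
        "Windows": {"speech", "sound", "braille"},
        "Menus":   {"speech", "braille"},
        # ... remaining classes: Text, Buttons, Dialogue Boxes, Icons, Text Fields
    }

    def on_window_enter(window_title, geometry, speak, play_sound, braille_show):
        """Render the 'enter window' event on every enabled medium.
        The three callbacks stand in for the speech synthesiser, the
        non-speech audio device and the transient Braille display."""
        media = OUTPUT_MEDIA["Windows"]
        if "speech" in media:
            speak(window_title)                          # (a) window title is spoken
        if "sound" in media:
            play_sound("enter-event")                    # (b) special "enter event" sound
        if "braille" in media:
            braille_show(window_title, frames=geometry)  # (c) title and window frames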
3.2. Specification of device parameters

The configuration of the utilised media is performed by appropriately setting the associated parameters. For example, the system supports the specification of the speech parameters (i.e. language, voice, pitch and speed) to be used according to the different modes of non-visual interaction (Figure 3). The German language may be selected if the applications provide messages/text (e.g. button labels, menu options) written in German, while messages coming from the screen reader (various help messages informing the user about the window title, entering the root window, etc.) can be spoken in English. The utilisation of different speech parameters enables the blind user to be always aware of the current state of the dialogue.
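A configuration record like the following, assumed purely for illustration, captures the two speech modes described above (application text versus screen-reader messages); the field names mirror the parameters visible in Figure 3, while the record layout and the synthesiser interface are hypothetical.

    # Speech parameters per mode of non-visual interaction (cf. Figure 3).
    # Two distinct parameter sets let the user tell application text and
    # screen-reader messages apart by ear alone.
    SPEECH_PARAMETERS = {
        "applications":  {"language": "German",  "voice": "male", "pitch": 5, "speed": 5},
        "screen_reader": {"language": "English", "voice": "male", "pitch": 5, "speed": 5},
    }

    def speak(text, mode, synthesiser):
        """Speak text with the parameter set of the given mode."""
        synthesiser.configure(**SPEECH_PARAMETERS[mode])
        synthesiser.say(text)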
Figure 3. A snapshot of the CONFIG system for specifying the desired speech parameters
3.3. Help facilities

The CONFIG system provides a general "HELP" facility regarding the use of the system. It also provides on-line help, which is interactively activated by pressing the right mouse button, for the seven classes of objects; brief descriptions are provided of the main characteristics of these classes and of the basic features (e.g. object attributes, state of the object such as active, iconified) which are supported for non-visual reproduction.
4. CONCLUSIONS

The provision of flexible and effective solutions which enable customisation of the non-visual environment on the basis of individual blind users' needs and preferences is considered very important for enhancing non-visual interaction. The CONFIG system has been designed and developed to support alternative non-visual representations of the graphical interface, as well as facilities for modifying the various non-visual interaction characteristics. Preliminary tests, which have been carried out with a small number of sighted and blind users, confirmed the practical value of such a system in enhancing blind users' interaction in a graphical environment. More systematic evaluation is currently under way.

ACKNOWLEDGEMENTS

This work has been carried out in the context of the GUIB-II (TP 215) project, partially funded by the TIDE Programme of the Commission of the European Union (DG XIII). Partners of this consortium are: IROE-CNR, Italy; Institute of Computer Science-FORTH, Greece; Vrije Universiteit Brussel, Belgium; Department of Computer Science-FUB, Germany; Institute of Telecommunications-TUB, Germany; IFI, University of Stuttgart, Germany; VTT, Finland; RNIB, England; F.H. Papenmeier GmbH & Co. KG, Germany.

REFERENCES
1. TIDE-GUIB (TP 103) Deliverable No 5: Analysis of Textual and Graphical User Interfaces, European Commission, TIDE Office DG XIII, Brussels, December 1992.
2. E.D. Mynatt and W.K. Edwards, Mapping GUIs to Auditory Interfaces, Proceedings of the UIST '92 Conference, ACM Press, Monterey, California, 61-70, November 15-18, 1992.
3. A. Savidis and C. Stephanidis, Developing Dual User Interfaces for Integrating Blind and Sighted Users: the HOMER UIMS, to appear in the Proceedings of the CHI '95 Conference on Human Factors in Computing Systems, Denver, Colorado, May 7-11, 1995.
4. G. Weber, D. Kochanek, C. Stephanidis and G. Homatas, Access by Blind People to Interaction Objects in MS Windows, Proceedings of ECART-2, Stockholm, Sweden, May 26-28, 1993.
5. E.D. Mynatt and G. Weber, Nonvisual Presentation of Graphical User Interfaces: Contrasting Two Approaches, Proceedings of the CHI '94 Conference on Human Factors in Computing Systems, ACM Press, Boston, Massachusetts, 166-172, April 24-28, 1994.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Command-line Prediction and Error Correction Using Generalized Command-lines

Yasuhiro UEDA*, Toshikazu NISHIMURA*, Michihiko MINOH* and Katsuo IKEDA*
*Department of Information Science, Faculty of Engineering, Kyoto University, 606-01, Japan

A UNIX shell that maintains the history of not only command-lines but also "generalized command-lines" is proposed. A generalized command-line consists of a command name and argument symbols, and represents the syntax of a command. The system can detect an erroneous command-line and correct it, even if the system designer does not give such syntactic data to the system in advance. Moreover, by analyzing the history of generalized command-lines, the system can get information useful for command-line prediction and thus achieve better prediction.

1. INTRODUCTION

Many systems are command-driven systems that interpret a command-line typed by a user and execute the command. To use such a system, a user must learn the function and usage of commands, and type command-lines correctly. Therefore, particularly for a novice, command-driven systems are hard to use. A shell, a command interpreter, is a typical command-driven system. Conventional shells such as the UNIX csh maintain the history of the command-lines typed by a user, and the user can retrieve a command-line typed in the past. This history mechanism is very useful. To use the history mechanism more effectively, we propose that the system maintain a history of "generalized command-lines" and use it for command-line prediction, error detection and correction. A generalized command-line consists of a command name and argument symbols such as FILE or SWITCH. The generalized command-line represents the syntax of a command. The system can detect erroneous command-lines and correct them using the generalized command-lines submitted in the past, even if the system designer does not give such syntactic data to the system in advance. Moreover, by analyzing the history of generalized command-lines, the system can get information useful for command-line prediction and thus achieve better prediction.

2. GENERALIZED COMMAND-LINE
A command-line which a user of the UNIX shell enters is a string that consists of a command name and zero or more argument words. We generalize a command-line by replacing the argument words with argument symbols. Considering ease of classification, frequency of use and effectiveness of prediction, error detection and correction, we define
the following argument symbols:
SWITCH: command option
FILE: file name
DIR: directory name
WILD: wild-card
RESTFILE: file name (it may not exist before command-line execution)
RESTDIR: directory name (it may not exist before command-line execution)
REST: the rest
The procedure for replacement is carried out as follows:
• If the argument word exists as a directory, it is replaced with DIR; but if it does not exist before command-line execution, it is replaced with RESTDIR.
• If it exists as a file, it is replaced with FILE; but if it does not exist before command-line execution, it is replaced with RESTFILE.
• If it includes wild-card characters such as '*' and '?', it is replaced with WILD.
• If its first character is '-' or '+', it is replaced with SWITCH.
• Otherwise it is replaced with REST.
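The replacement procedure maps directly onto a small function. The Python sketch below follows the rules above in order; the use of filesystem checks at prediction time and the handling of the RESTDIR/RESTFILE cases (which require a second check after the command has run) are simplifying assumptions of the sketch, not the shell's actual implementation.

    import os

    def generalize_word(word):
        """Replace one argument word with its argument symbol,
        following the rules of Section 2 in order. RESTDIR/RESTFILE
        (names that do not exist before execution) would need a check
        repeated after the command has run; that second pass is
        omitted from this sketch."""
        if os.path.isdir(word):
            return "DIR"
        if os.path.isfile(word):
            return "FILE"
        if any(ch in word for ch in "*?"):
            return "WILD"
        if word[:1] in ("-", "+"):
            return "SWITCH"
        return "REST"

    def generalize(command_line):
        """e.g. 'mkdir tmp' -> 'mkdir REST' before execution, which a
        post-execution pass would refine to 'mkdir RESTDIR'."""
        name, *args = command_line.split()
        return " ".join([name] + [generalize_word(a) for a in args])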
3. COMMAND-LINE PREDICTION
Command-line prediction is to predict the command-line that a user wants to submit next and to show the predicted candidates before he types. The prediction is performed in two steps. In the first step, the system predicts generalized command-lines. In the second step, the system predicts the argument words assigned to the argument symbols in the predicted generalized command-lines.

3.1. Generalized Command-line Prediction

Many commands of the UNIX system have rather limited power, and a user must submit several commands to achieve a task. Therefore, many similar command-line sequences can exist in the history. As in the following example, similar command-line sequences whose only difference is the argument words have the same generalized command-line sequence.

command-line      generalized command-line
mkdir tmp         mkdir RESTDIR
cd tmp            cd DIR
:
mkdir work        mkdir RESTDIR
cd work           cd DIR
The preceding generalized command-lines can be an important key to predicting generalized command-lines. Let C-i be the i-th preceding generalized command-line, and let n be the length of the referencing sequence. The same sequence as C-n...C-1 is searched for in the history. If C-(j+n)...C-(j+1) is one such sequence, C-j is a candidate. But there are many such candidates, so they should be ordered. Two simple methods of ordering are frequency based and recency based. We call the former nD-FRQ and the latter nD-LRU. A user performs one task at a time and his submission of command-lines has locality, so LRU is superior to FRQ in general [1]. But in the case where interrupting command-lines, such as reading an online manual, are inserted, FRQ is superior to LRU, because LRU is apt to be affected by these interrupting command-lines. So it will be effective to integrate LRU- and FRQ-candidates. This can be realized as follows. Let the initial evaluation
value e for any generalized command-line be 0. Every time a user enters a command-line, the evaluation values for all generalized command-lines are multiplied by α (≤ 1). This operation adds the feature of LRU. Then a fixed increment is added to the evaluation value for the generalized command-line obtained from the command-line entered by the user. This operation adds the feature of FRQ. The ordering is made with this evaluation value. We call this ordering method WFRQ. We call the rate at which each candidate is hit, i.e. corresponds to what the user wants to enter, the single hitting rate, and the rate at which any one of the candidates is hit the total hitting rate. The single hitting rate increases as n increases, but the number of candidates and the total hitting rate decrease as n increases (Figure 1). Therefore, we try to integrate the candidates of the various nD-WFRQ methods. The evaluation value of a candidate is not absolute among the various nD-WFRQ methods and cannot be compared directly. Instead, we use the expectant hitting rate (EH-rate) for comparison. Let n1 be the number of candidates with evaluation value e in the nD-WFRQ method, and let n2 be the number of times that they are hit. The EH-rate of the candidate with evaluation value e in the nD-WFRQ method is then defined as n2/n1. We conducted experiments to examine the effect of the integration. We compared the hitting rate of each of the single prediction methods with the integrated method. We used the command-line histories of eight users who had been using the UNIX system for more than one year. These histories were 5,000-10,000 lines in length. We found out by the pre-experiments that integrating mD-WFRQ (m ≥ 3) was ineffective, so we integrated 0,1,2D-WFRQ. We set α to 0.9 based on the pre-experiments. The result can be seen in Figure 1. As can be seen, the hitting rate is improved by the integration.

3.2. Argument Word Prediction

To predict the argument words for an argument symbol, we use the 0D-WFRQ method based on the history of argument words that were assigned to the argument symbol. However, words that were not assigned to the argument symbol are not predicted in this method. Therefore, even if a generalized command-line candidate is predicted successfully from the similar command-line sequences, the prediction for the argument symbols may fail. In the previous example, if 'cd DIR' is predicted, 'work' may not be predicted for 'DIR'. To predict 'work', the system must know that the 'DIR' of the cd command and the 'RESTDIR' of the preceding mkdir are often the same word. To realize this, we use RP(A, B), the relevance of argument symbol A for B. Let n3 be the number of times that A is found in the preceding M lines including B, and let n4 be the number of times that A and B are assigned the same word in this situation. RP(A, B) is then defined as n4/n3. The candidates using WFRQ and RP are also integrated by the EH-rate. Let n5 be the number of candidates with evaluation value e, and let n6 be the number of times they are hit. The EH-rate of the candidates with the WFRQ method is then defined as n6/n5. On the other hand, RP(A, B) is the hitting rate of the candidate for A that was the same argument word assigned to B, so the EH-rate of this candidate is just RP(A, B).
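To make the WFRQ bookkeeping concrete, here is a minimal Python sketch written for this text rather than taken from the paper's implementation. The decay factor follows the paper's α = 0.9; the increment of 1 and the table layout are assumptions (the exact increment is not legible in the source).

    ALPHA = 0.9  # decay factor fitted in the pre-experiments

    class WFRQ:
        """Evaluation values combining recency (decay) and frequency."""
        def __init__(self, alpha=ALPHA):
            self.alpha = alpha
            self.value = {}  # generalized command-line -> evaluation value

        def observe(self, gcl):
            # LRU feature: every entry decays when any command-line arrives.
            for key in self.value:
                self.value[key] *= self.alpha
            # FRQ feature: the observed generalized command-line is boosted.
            self.value[gcl] = self.value.get(gcl, 0.0) + 1.0

        def candidates(self, k):
            """Top-k generalized command-lines by evaluation value."""
            return sorted(self.value, key=self.value.get, reverse=True)[:k]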
Figure 1. Hitting rate of predicted generalized command-lines (single hitting rate against the number of candidates).
Figure 2. Single/total hitting rate of predicted command-lines (single hitting rate against total hitting rate, %).

3.3. Command-line Prediction

A command-line candidate is a combination of the generalized command-line candidates and the argument word candidates. The command-line candidates are also ordered
according to the EH-rate. The hitting rate of a command-line candidate is the product of the EH-rates of the generalized command-line and all the argument words in it. It is important to improve the total hitting rate, and this is easily realized by increasing the number of candidates to be shown. But too many candidates tend to bother users. Therefore, it is important not to show the candidates whose single hitting rate is low. This is easily accomplished by not showing candidates whose EH-rate is lower than some threshold γ, because the EH-rate is the estimate of the single hitting rate. We conducted experiments to examine the effect of the generalization, and checked the single and total hitting rates of the command-line prediction with and without generalization. The prediction method without generalization is the same as the method of the generalized command-line prediction. The result for various values of γ is summarized in Figure 2. As can be seen, when γ is large and the total hitting rate is low, the single hitting rate of the generalized method is higher than that of the non-generalized method by more than 10%. In conclusion, our method of generalization is effective for command-line prediction.
AND CORRECTION
Many of traditional UNIX shells only return error messages, when a user enters an erroneous command-line. This can be a source of great frustration for users. t c s h corrects trivial spelling errors using Levenshtein distance (cf. next section). But to correct spelling errors in argument words, the users must describe what type of argument words a command takes, so in many cases the correction is performed only for command names. Bradford[2] introduced a method for detecting and correcting user errors not only syntactically but also semantically. In his method, the command-line sequence entered by the user is converted into a command paragraph, that consists of command-names and predefined symbols. The detection and correction are made by comparing the command paragraph with the correct command paragraphs. The correct command paragraphs are
49 given to the system in advance. But the user has to update them, and it is difficult to deal with the user's private commands and new commands. A generalized command-line represents the syntax of a command, and the system can detect and correct an erroneous command-line using them. Moreover, the generalized command-lines are obtained only from the command-lines entered by the user, so the system designer need not provide the generalized command-lines, and it is easy to deal with the user's private commands and new commands. In the following, we describe the detection and correction methods of the command-lines with typological and syntactic errors using the generalized command-lines. 4.1. L e v e n s h t e i n D i s t a n c e Damerau[3] reported that more than 80o£ of all typological errors are insertion, deletion, substitution or transposition of letters. The distance between two strings can be defined as the Levenshtein distance[4] by assuming each of these four errors as a unit error. The Levenshtein distance is the minimum number of unit errors that transform one string to the other. The distance between two argument symbol sequences can be defined similarly as the Levenshtein distance by assuming insertion, deletion, substitution and transposition of an argument symbol as unit errors. 4.2. D e t e c t i o n of a n E r r o n e o u s C o m m a n d - L i n e There is no way for a shell to detect an erroneous command-line completely, because the detection is performed based on each submitted command. We propose a method for detection of an erroneous command-line using the generalized command-lines. Every generalized command-line has a state. The state is one of 'erroneous', 'correct', 'unknown'. 'correct' is the state that the system infers the generalized command-line to be correct. 'erroneous' is the state that the system inferred the generalized commandline to be erroneous. 'unknown' is the initial state. If the generalized command-line obtained from the command-line entered by the user is 'erroneous', the system detects this command-line as erroneous. Because the command-line corrected by the system is not always what the user wants, the system shows the corrected command-line to the user and he confirms it by entering one of 'y', 'n', 'e'. His confirmation also determines the state of the generalized commandlines. 'y' means 'yes' and the system executes the corrected command-line. The original command-line will be erroneous and the corrected command-line will be correct, so the system determine the state of the original generalized command-line to 'erroneous' and the state of the corrected generalized command-line to 'correct'. 'n' means 'no' and the system executes the original command-line. The original command-line will be correct, so the system determines the state of the original to 'correct'. 'e' means 'edit' and the user edits the original command-line and the system executes the edited command-line. The original command-line will be erroneous and the edited command-line will be correct, so the system determines the state of the original to 'erroneous' and the state of the command-line after the editing to 'correct'. 4.3. C o r r e c t i o n of an E r r o n e o u s C o n n n a n d - l i n e The system corrects the detected an erroneous command-line by applying the following unit-correcting operations each of which corresponds to each of the four unit errors.
• For deletion of an argument word, complete the argument word using the argument word prediction method described in Section 3.2. The cost is 2.
• For insertion of an argument word, delete the word. The cost is 4.
• For transposition of argument words, transpose these words. The cost is 4.
• For substitution of an argument word, substitute the original word with the nearest word. The cost is the distance between those two words.
The system selects the command-line with the minimum total cost as the corrected command-line candidate. There is often more than one corrected command-line whose cost equals the minimum. In such a case, the system selects the one with the highest EH-rate (described in Section 3.1), because command-line prediction can be regarded as command-line correction of the null command-line.
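The distance used above and in Section 4.1 counts the four unit errors. The following Python sketch computes the restricted (adjacent-transposition) variant over strings or argument-symbol sequences; it is an illustration written for this text, not the shell's own code.

    def damerau_levenshtein(a, b):
        """Minimum number of insertions, deletions, substitutions and
        transpositions of adjacent elements turning sequence a into b."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i
        for j in range(len(b) + 1):
            d[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
                if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                        and a[i - 2] == b[j - 1]):
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[len(a)][len(b)]

For example, damerau_levenshtein("sl", "ls") is 1 (one transposition), and the same function applied to argument-symbol sequences gives the syntactic distance used for correction.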
4.4. Experiment

As an experiment, four users used an extended shell that performed the detection and correction according to the method described above, and entered 11,657 command-lines altogether. 509 of them were detected as erroneous, but 234 of them were actually correct. This is too many to ignore. If the system so frequently detects correct command-lines as erroneous, the user comes to doubt the system and ignores the detection. Thus, it is important to reduce incorrect detection. This may be done by relaxing the condition for detection. The remaining 275 of the 509 were properly detected. 115 of the 275 were corrected properly, and 106 were corrected properly at the generalized command-line level, but the corrected argument words were improper. It is necessary to improve the argument word correction.

5. CONCLUSION

We have proposed a system that maintains the history of generalized command-lines. Through experiments we showed that by analyzing this history such a system can get information useful for command-line prediction and thus achieve better prediction. A generalized command-line represents the syntax of a command. The system can detect and correct an erroneous command-line using the generalized command-lines, which are obtained only from the command-lines entered by the user. However, our detection and correction methods need improvement.

REFERENCES

1. S. Greenberg and I. H. Witten: Supporting Command Reuse: Mechanisms for Reuse, Int. J. Man-Machine Studies, 39-3 (1993), 391-425.
2. J. H. Bradford, W. D. Murray, and T. T. Carey: What Kind of Errors Do Unix Users Make?, Proc. IFIP INTERACT '90: Human-Computer Interaction (1990), 43-46.
3. F. J. Damerau: A Technique for Computer Detection and Correction of Spelling Errors, Commun. ACM, 7-3 (1964), 171-176.
4. V. I. Levenshtein: A Method of Constructing Quasilinear Codes Providing Synchronization in the Presence of Errors, Problems of Information Transmission, 7 (1971), 215-222.
5. J. H. Bradford: Semantic Strings: A New Technique for Detecting and Correcting User Errors, Int. J. Man-Machine Studies, 33-1 (1990), 399-407.
III.4 Interaction Design 2
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
FRADS: A System for Facilitating Rapid Prototyping by End Users

Irvin R. Katz
Division of Cognitive and Instructional Science, Educational Testing Service, Princeton, New Jersey 08541 USA

While user interface toolkits and managers facilitate prototyping by programmers, few systems allow nonprogrammers to create their own applications. In this paper, we report some techniques that bring prototyping to nonprogramming domain experts, namely professional test developers at Educational Testing Service. The Free-Response Authoring and Delivery System (FRADS) allows professional test developers to create dynamic, working prototypes of computer-based test questions. FRADS was designed to leverage nonprogrammers' experience with commercial graphics packages. Test developers create questions by importing graphics and other user-interface objects, choosing the tools to provide to students in responding to the question, and delineating (via dialog boxes and specially designed graphical objects) how the tools and provided interface objects interact. With FRADS, we explore how much "programming power" can be obtained by using direct, graphical specification of applications.

1. INTRODUCTION*
Traditional user interface management systems (UIMSs) rely on a "widgets plus programming" framework to enable prototyping. As a result, few systems allow nonprogrammers to create their own applications. In this paper, we report a system designed to bring prototyping to nonprogramming domain experts--in this case, professional test developers at Educational Testing Service. The Free-Response Authoring and Delivery System (FRADS) allows professional test developers to create dynamic, working prototypes of computer-based test items (such items resemble specially tailored graphical editors).+ The authoring component of FRADS makes it possible for an item author with no programming experience to create graphical items on the computer, interact with the items as a test taker (i.e., student) would, and immediately make necessary revisions. A focus of this work is on the set of techniques that allow prototyping without the use of formal programming languages. Free-response items require students to create their own responses rather than selecting a response from a set of alternatives (i.e., multiple choice). In a typical test item, a student is presented with background information on the computer screen and uses the mouse or keyboard to respond. Responding to an item might involve selecting a point on a plotted function, moving a curve onto a coordinate system, entering the result of a calculation, etc. In other words, each item may be thought of as a highly-restricted graphics editor, in which information is presented and only certain types of responses (e.g., draw a line, alter a line) are permitted. A sample item is shown in Figure 1. For this item, a student must select a curve and place it onto the axes in a position that corresponds to the equation.
* I thank Daniel Zuckerman for his assistance with the implementation of FRADS. I thank Ruby Chan, JungMin Lee, and Kevin Singley for comments on earlier drafts of this report. This research was funded in part by the Graduate Record Examination Board and Educational Testing Service.
+ A "test item" refers both to the test question (i.e., what is being asked) and the means for responding to that question (e.g., multiple-choice alternatives).
1.1. Prototyping frameworks

In the "widgets plus programming" framework, UIMSs provide programmers with typical user-interface elements, or "widgets" (e.g., sliders, icons, text fields), and a programming protocol for interaction among the widgets. Although some systems allow nonprogrammers to arrange widgets on the screen, the behavior of widgets--how they react to user actions and communicate with each other--must be specified via a formal programming language [2]. Even systems geared for the common user, such as Hypercard and Visual Basic, follow the widgets plus programming framework, albeit using simplified programming languages. This approach provides a great deal of flexibility in the range of prototypes that can be developed. Unfortunately, it also excludes nonprogrammers from the benefits of prototyping. One method of accelerating the production of domain-specific applications is to enable domain experts themselves, who lack programming skills, to create their own prototype applications. The challenge is for a UIMS to retain flexibility without relying on a programming language.

Figure 1: Move object item (task: "Produce the graph of the equation (x-1)^2/4 + (y-3)^2/4 = 1 by selecting one of the given curves and positioning it in the coordinate system"; help text: "Click on an object to select it. Hold down the mouse button to move the object")
An alternative approach is to provide prototype builders not only with a collection of interface objects, but also with operation "objects" that can act on these widgets. This approach achieves flexibility because prototype authors can combine different widgets with the different operations in a variety of ways. Furthermore, in this framework, authors specify via direct manipulation how the widgets and operations interact. In other UIMSs, operation "objects" and their behavior would be specified through a formal programming language, and thus be inaccessible to nonprogrammers (but see [4]). There are three main elements in the FRADS prototyping framework (Figure 2): widgets, tools, and allowed actions. Like the widgets of other UIMSs, those of FRADS are user interface elements that may have associated attributes, such as the font size in a text field. The second element of this end-user prototyping framework is tools. Tools are conceptually the same as the operations provided in commercial graphics packages; tools are the "verbs" that act on the "nouns" of the interface (i.e., widgets, screen). Finally, mediating between widgets and tools are allowed actions, which authors use to specify which tools may act on which widgets. For example, certain graphics may be background information, and so should be immutable by the student, while other graphics may be manipulated (e.g., moving a curve onto a set of axes).
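The widgets-tools-allowed-actions triple can be read as an access-control table between the "verbs" and the "nouns". The following Python sketch is purely illustrative; the names and the table layout are assumptions for this text, not FRADS's API.

    # Allowed actions mediate between tools ("verbs") and widgets ("nouns").
    ALLOWED_ACTIONS = {
        # widget id -> set of tools that may act on it
        "curve-1": {"move"},
        "curve-2": {"move"},
        "axes":    set(),  # background graphic: immutable by the student
    }

    def apply_tool(tool, widget_id, widgets):
        """Apply a tool to a widget only if the author allowed it."""
        if tool not in ALLOWED_ACTIONS.get(widget_id, set()):
            return False  # tool has no effect, as for background graphics
        widgets[widget_id].perform(tool)
        return True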
Figure 2: Sample elements of the prototyping framework. Widgets: lines, arrows; pictures; text boxes; on-screen keyboards; static text. Tools: move object; shade region; rotate object; draw line; enter text; erase object. Allowed actions: which objects may be moved, copied, rotated, shaded, etc.; text entry via on-screen keyboard, hardware keyboard, or both.
2. FREE-RESPONSE AUTHORING AND DELIVERY SYSTEM
A key requirement of any "end-user programming" environment is that it allow nonprogramming domain experts to make use of their own expertise [3]. FRADS capitalizes on item authors' familiarity with commercial graphics applications. Typical graphics applications use a "palette-canvas" interface metaphor. The application provides a set of tools that determine what gets "painted" on the canvas, what changes occur to existing objects on the canvas (e.g., rotation), and what gets removed from the canvas. FRADS generalizes this metaphor into a prototyping system. The "palette" of each test item is editable by the item author, who selects the tools available to students in answering the item. The "canvas" is stocked with background material (e.g., the item's question) and the objects that the student can manipulate using the provided tools. Students manipulate objects on the canvas to create their response. Item authors can further customize students' use of tools by specifying which objects can be affected by which tools.

2.1. Example of authoring
This section presents an example of using FRADS to develop a simple free-response item. The screen layout of FRADS's authoring component is essentially a blank delivery screen (the same screen that a student sees when taking a test, e.g., Figure 1), with an additional set of palettes on the right (Figure 3). One palette contains the tools that an item author can provide to test takers. A second palette contains several user-interface widgets, such as text fields and an on-screen keyboard, which may be added to the "canvas" of test items. Note that the system does not provide sophisticated graphic-drawing facilities; instead, an author may use the graphics application he/she is most familiar with, and then copy-and-paste the graphic onto the item's canvas. Overall, authoring an item consists of (a) choosing the appropriate tools for students, (b) selecting the widgets to include, (c) importing graphics from a graphics application, and (d) specifying restrictions on students' use of provided tools (e.g., only certain imported objects are moveable). In the example below, we assume that any needed graphics already exist and that an appropriate graphics package is running concurrently with FRADS. The item to be created requires students to choose the appropriate curve and move it onto the coordinate system such that the curve satisfies the stated equation (Figure 1). To get the background information for the item, the author switches to the graphics application and copies the coordinate system and problem statement. The author then switches back to FRADS to paste and position the graphic on the canvas. Next, the author specifies the required response tools by clicking on the appropriate button and placing that button into the tool palette. The only tool needed in this case is the "move object" button, although other items may use multiple tools. Now the author must copy each of the closed curves into FRADS separately. Each curve must be copied separately because each must be able to be manipulated by the student. Finally, the author must specify that the closed curves are moveable by the student. By default, FRADS assumes that any imported graphic is part of the background and so is immutable by the student. Double-clicking on a closed curve brings up a dialog box through which the author can specify what actions students may perform on the object (e.g., one or more of: move, erase, rotate). The author specifies that the curve may be moved, then closes the dialog box and repeats this process for the other curves. The author then names the item, saving it onto disk, which completes the authoring of this item. Once an item is saved, the author may try out the item, bringing up a sample delivery window with the actual, working item. This facility allows authors to review the item and, if necessary, edit the item if it does not perform as expected.
Figure 3: Item authoring environment. Callouts: Tool palette (authors add buttons, i.e., response tools, here); Response tools (the author chooses the tools to be available to students for creating their answer); Help window (during item delivery, this area contains a short message describing for the student how to use the currently selected tool, e.g., Figure 1; the text changes when a new tool is selected; this area is merely a placeholder in the authoring system); Canvas (this area contains the test question and is where students create their response); Objects (besides imported graphics, the author can add these widgets to an item's canvas).

In summary, authors build interfaces (i.e., test items) that limit students' interactions. That is, authoring consists of effectively programming the actions available to a student when responding to a particular item. The item author specifies what objects the tools can affect, where on the screen the tool can be applied, and in what way the tool is applied (e.g., move an object versus copy an object). For each item, students are given only the tools they need, which have been "tuned" to the particular task.

2.2. Declarative specification of tool use
In the example above, an author specified via a simple dialog box whether objects were moveable by students. Specifying which tools can act on which objects is the simplest way that authors "program" how their items behave. More complex control of student response tools can be achieved through use of special-purpose graphical objects. The way that shading is specified provides an example of using special-purpose objects to "program" a tool's behavior. By default, the shade tool works similarly to such tools in graphics applications: from the point where a user clicks, the system "flood fills" the region with the shade pattern until a boundary (solid line) is encountered. Figure 4 shows a simple shade-region item. When the student clicks on the x-axis, a rectangular area between two adjacent tick-marks is shaded (the five leftmost areas are shaded in the figure). Clicking on a shaded area causes that shading to disappear. If the student clicks elsewhere on the screen (e.g., on the y-axis), no shading occurs. Also, note that even though there are no visible boundaries on the shaded rectangles, shading does not "flood" the entire screen. In a graphics application, shading is bounded only by what's visible on the screen. This item demonstrates that authors can control shading in two ways, specifying (1) the shadeable portions of the visible graphic and (2) the subareas on the screen to be shaded, irrespective of whether those areas are visibly bounded.
Figure 4: Sample response to "Shade" item (task: "Shade the largest interval of x-values for which the function y = f(x) has an inverse"; help text: "Click on a region to shade it; click to unshade it again")

FRADS provides two special-purpose graphical widgets: masks and boundaries. Both of these objects are created and placed by the author, but are not visible to students. Masks specify the allowable areas of a graphic widget that may be shaded. Shading of the visible graphic will occur only if the student clicks in a location that corresponds to an "on" bit in the mask. Boundaries allow the item author to specify invisible (to students) shading boundaries associated with a graphic. Shading becomes bounded by the lines in the boundary object rather than by the lines in the visible graphic. Figure 5 shows the corresponding mask and boundary of the item in Figure 4. The mask shows that shading can occur only in the area around the x-axis; the boundary object indicates that the subareas to be shaded are rectangles between adjacent tick marks (the y-axis and labels are included in each object to aid the author in aligning these graphic objects to the visible graphic). Note that a to-be-shaded graphic might have only an associated mask, only a boundary, or neither, depending on the needs of the item.
Figure 5: Corresponding shade mask (left) and shade boundary (right) for Figure 4

These special-purpose objects represent a powerful end-user programming approach. Rather than controlling a tool's action through a programming language, or even through the data-flow representation of many visual programming languages (e.g., [1]), the special-purpose objects allow control of tools via simple graphical mechanisms, easily understood by nonprogrammers.
3. SYSTEM EVALUATION
Preliminary evidence suggests that FRADS successfully meets its goal of providing a usable rapid prototyping environment to nonprogrammers. FRADS has been used by eight test development staff at ETS, who all volunteered to use the system in their daily work. These authors were trained to use the basics of FRADS within a few hours (2-3 hours). After that time, authors could create items largely on their own. Using FRADS, these item authors have implemented approximately 230 distinct items, which can be classified into approximately 70 different item "types" (e.g., graph a function, arrange words in a sentence, indicate an area on a graph). Anecdotally, authors report that they explore a wider variety of item designs using FRADS compared with creating items on paper (which would be put onto computer by production staff). Similar to programmer-oriented UIMSs, FRADS appears to accelerate the item creation and revision cycle, giving authors more time to refine their items. These items, in turn, have been shown to be usable by their target user population--test-taking students. Several items have been delivered in preliminary tests to student volunteers (i.e., usability studies); more than 200 such volunteers have taken a test on FRADS to date. In questionnaires administered after testing, 75% of the students reported that they found the computer test-taking environment easy to use; 62% thought that they would perform equally well on a computer-delivered test or a paper-and-pencil test.
CONCLUSIONS
To summarize, FRADS helps nonprogrammers to prototype by providing: • a user interface, similar to that of commercial graphics applications, that capitalizes on users' existing expertise. • a framework for creating restricted palette-canvas graphics editors (i.e., test items). Authors specify the tools available on the palette, the objects that appear in the canvas, and which tools may affect which objects. • a combination of "allowed actions" and special-purpose objects (e.g., masks, boundaries) through which authors can control how particular tools will function for students. Thus, much of the "end-user programming" performed by authors is done via creating and importing graphics as part of an item stimulus. How far can direct manipulation and declarative, graphical specification of applications take us? At what point is it be necessary to rely on some type of programming formalism? Future work will address these questions through (a) extending the functionality of FRADS through, for example, introducing greater flexibility in authors' control of tools and (b) applying the widgets-tools-actions framework to the construction of end-user prototyping systems for other application domains. REFERENCES
1. Haeberli, P. E. (1988). ConMan: A visual programming language for interactive graphics. Computer Graphics, 22(4), 103-111.
2. Myers, B. A. (1989). User-interface tools: Introduction and survey. IEEE Software, 6(1), 15-23.
3. Nardi, B. A. (1993). A Small Matter of Programming. Cambridge: MIT Press.
4. Took, R. (1990). Surface interaction: A paradigm and model for separating application and interface. Human Factors in Computing Systems: CHI '90 Proceedings (pp. 35-42). New York: Addison-Wesley.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
CUIDE: User Interface Development Environment for End Users
Y. Tokuda (a), E. S. Lee (b) and N. Shiratori (a)
(a) Research Institute of Electrical Communication, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-77, Japan
(b) Department of Information Engineering, Faculty of Engineering, Sung-Kyun-Kwan University, 300 Chunchun-dong, Jangan-ku, Suwon, Kyunggi-do 440-746, Korea

To cope with the individual requirements for user interfaces (UIs) from various classes of users, it is desirable to have a UI development support environment which allows end users to develop UIs without designers' help. The support of conventional methods and tools, however, focuses on experts of UI development. In this paper, we propose a UI development environment for end users, named CUIDE (Case-based UI Development Environment), in order to support UI development by end users who have never developed any UI. CUIDE utilizes the case base of design and the case base of parts to make up for their lack of knowledge and experience of UI development.

1. INTRODUCTION

With the spread of computer systems, a great part of the entire system development tends to go to user interface (UI) development, because of strong requirements for user-friendly UIs. The conventional methods to develop UIs efficiently [1-3] focus on how they could support mainly the expert users. Under the present state of UI development, which requires advanced technical skills and experience, however, it is difficult to cope with the individual requirements for UIs from various classes of users, and as a result, the methods have limits in developing UIs efficiently. Therefore we need a UI development environment so that even general users can design, modify and expand UIs. In this paper, we propose a UI development environment for end users, named CUIDE (Case-based UI Development Environment), in order to support the whole process of UI development by ordinary users, who do not have complete knowledge and experience of UI development. CUIDE includes the knowledge of the techniques of UI design as well as the parts of UIs as case bases, and provides the cases for them visually.

2. DESIGN CONCEPT
To allow end users to develop a desirable UI satisfying their requirements, it is not enough that they are provided only with parts of UIs. They generally have functional requirements about their images of a UI based on their experience of using UIs, even if they have never developed UIs. To actually design a desirable UI satisfying their functional requirements, end users should have the following design knowledge: i) functional knowledge, ii) layout knowledge and iii) attribute knowledge. i) is for determining the parts of the UI satisfying the above functional requirements; ii) is for organizing and laying out the above selected UI objects; iii) is for redesigning the attributes of each UI part in detail. Thus, because there is a wide gap between end users' own knowledge and the design knowledge, it is difficult to embody their requirements and to construct a UI without the help of UI designers who have the above design knowledge. We propose a UI development environment for end users, the Case-based UI Development Environment (CUIDE). To bridge the above gap, CUIDE makes use of two kinds of design knowledge, stored in i) the case base of design and ii) the case base of parts. i) deals with the cases for the knowledge of design methods with UI parts; ii) deals with the cases for the knowledge of the UI parts constructing a UI. Since with these case bases we can retrieve existing design examples similar to desirable ones based on incomplete and fragmentary requirements, they are suitable for the utilization of the design knowledge of UI. By reusing these case bases effectively, CUIDE aims to acquire and formalize users' requirements in a user-friendly way, and to present UI images according to the user requirements.

3. CUIDE

Based on the above design concept, CUIDE has the following structure (Fig. 1): three modules, (1) the Requirement Management Module (RMM), (2) the Case Management Module (CMM) and (3) the Image Generation Module (IGM), and two case bases, (4) the Case Base of Design (CBD) and (5) the Case Base of Parts (CBP). We describe each functional structure of CUIDE in 3.1, and the case bases of design and parts in 3.2.
(..taskmodels~
( otherenvironments ~
Figure 1. Interaction between user and system. Then, the interaction between the user and the system is defined as a triple of the user, the system and the dialogue between them. Note that in real circumstances some other factors e.g. problem domains or task structures could be involved in the interaction, but here we don't deal with them in the current specification.
2.2. Specifying meanings of the dialogue In our denotational approach, dialogue is treated as a sequence of symbols. S t r u c t u r e of the dialogue sequence are defined using abstract syntax, and the meaning of the dialogue is given for each symbol or sequence. Let a b e a user's input symbol to the system, and fl be a system's correspondent output symbol to the user. The meaning of user's input symbol a is given as a mapping from Es to Es and b. Similarly, let b' be a system's output symbol to the
73 user, and a' be a user's correspondent i n p u t symbol to the user. The meaning of be given as a m a p p i n g from Ewto E v and a'. So the specification of the whole phases of the i n t e r a c t i o n can be given in t e r m s of E U, Es, 2:, F, dialogue structures and the mappings which define m e a n i n g of the dialogue.
system's output symbol fl" can
3. E X A M P L E : A P P L I C A T I O N O F T H E S P E C I F I C A T I O N This section p r e s e n t s a practical example of the denotational specification, applying the t e c h n i q u e to the existing p r o g r a m s . The purpose is to show the expressive ability of this approach to specify the dialogue. Macintosh Finder, t a k e n up here, is a GUI-based file m a n i p u l a t i n g program as well as a kind of login shell, s y s t e m configuration utilities. We specified the s t r u c t u r e of the dialogue between F i n d e r and the user, and the m e a n i n g of the user's input symbols to Finder. Note t h a t we are currently dealing only with the functions relating file manipulating, which involves typical type of operations e.g. handling of m e n u s and windows, dragging icons and even inputs from keyboard. Also note t h a t user's internal entities and the m e a n i n g of system's output symbols to the user are omitted here in the specification this time. 3.1.
Domains
The domains needed for the specification of the meaning of user's input symbols are 2:,/~ and Es. Here we illustrate Es as an example. Es denotes system's internal entities which are needed for the specification of the m e a n i n g of user's input symbols. Here Es is represented as a product domain. Each e l e m e n t d o m a i n is equally defined by more basic domains. Following specification is an extract from the definition of Es. Es = D e s k t o p x M o u s e x C l i p b o a r d x F i n d S t r Desktop Size Disk
• Size x Disk* x WinList x W i n D B x Selection x M e n u b a r • ( Nx N ) /* display size */ • N a m e x Content x Content x C o n t e n t
Content
/* disk name, disk contents, desktop contents and trach contents */ • ( Folder + File + Alias )* /* list of the content of disk (or folder) */
. . . .
--->Window )
WinList Window
• ( N
Rect
• (NxN)x(NxN)
/* window list (which returns the 'Nth' window)*/
• Rect x W C o n t e n t x Path
/* window size, windowcontent and associated path */
The domains Σ and Γ are defined similarly. Note that we prepared two abstraction levels of Σ: the command level and the mouse-key level. The command level deals with the user's inputs as commands like "copy selected item" or "create new folder under current folder". The mouse-key level deals with inputs at a lower level, like mouse movements or key inputs. Γ is, in our current specification, the domain which is the set of possible output representations, that is, an entire screen of the display.
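As an informal illustration of the product domain E_S, the extract above might be transcribed into Python roughly as follows. The class and field names mirror the specification, but the rendering itself is a hypothetical sketch, not part of the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass
class File:
    name: str

@dataclass
class Alias:
    name: str

@dataclass
class Folder:
    name: str
    # Content = (Folder + File + Alias)*
    contents: List[Union["Folder", File, Alias]] = field(default_factory=list)

@dataclass
class Window:
    rect: Tuple[Tuple[int, int], Tuple[int, int]]  # Rect = (N x N) x (N x N)
    content: list                                  # WContent
    path: str                                      # associated path

@dataclass
class Desktop:
    size: Tuple[int, int]     # display size (N x N)
    disks: list               # Disk*
    windows: List[Window]     # WinList: the Nth window is windows[n]
    selection: list           # Selection (WinDB and Menubar elided in this sketch)
```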
3.2. Syntax Specification
The next step is to specify the structure of the dialogue in the form of an abstract syntax. The syntax of the dialogue is given for both the command level and the mouse-key level. The following specification is the syntax for the command-level dialogue. Nonterminal symbols are bold-faced, and the system's output symbols are underlined. The symbol ε denotes 'no input' from the user.

  Dialogue_com ::= Init D_com
  D_com ::= ε | ( NewFolder Res_D | OpenSelection Res_D | CloseWindow Res_D |
              DuplicateSelection Res_D | FindFile <FindDialog> Res_D | FindAgain Res_D |
              SelectAll Res_D | EmptyTrash Res_D | SelectFile Res_D |
              RenameFile Res_D | MoveSelection <MoveDialog> Res_D |
              CopySelection Res_D | MoveWindow Res_D | ResizeWindow Res_D ) D_com

Symbols with angle brackets denote an involved dialogue, which works as a part of the input symbol from the user. For example, the syntax of <FindDialog> can be given as:

  <FindDialog> ::= FindDialog FindStr ( OK | Cancel ) DismissDialog
  FindStr ::= ε | ( Char Res_D | CutStr Res_D | CopyStr Res_D | PasteStr Res_D |
              SelectStr Res_D | MoveWindow Res_D ) FindStr

The syntax for the mouse-key level dialogue is given as follows:

  Dialogue_mk ::= Init D_mk
  D_mk ::= ε | ( MDrag | Click Res_D | DblClick Res_D | Char Res_D | Bs Res_D | Esc Res_D |
              Up Res_D | Down Res_D | Left Res_D | Right Res_D ) D_mk
  MDrag ::= MC Drag1
  Drag1 ::= PressMB Res_D Drag2
  Drag2 ::= MC ReleaseMB Res_D
  MC ::= ε | ( MouseUp Res_D | MouseDown Res_D | MouseLeft Res_D | MouseRight Res_D ) MC
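To see how such a grammar constrains dialogue sequences, the sketch below checks whether a list of mouse-key symbols forms a single MDrag. For brevity it omits the interleaved system outputs Res_D, so it is an illustrative simplification rather than the paper's full syntax.

```python
# Recognizer for the MDrag production, with the system outputs Res_D omitted:
# MDrag ::= MC PressMB MC ReleaseMB, where MC is a run of mouse movements.
MOVES = {"MouseUp", "MouseDown", "MouseLeft", "MouseRight"}

def is_mdrag(symbols):
    """True iff `symbols` is exactly one mouse drag."""
    i = 0
    while i < len(symbols) and symbols[i] in MOVES:
        i += 1                                   # MC before the press
    if i == len(symbols) or symbols[i] != "PressMB":
        return False
    i += 1
    while i < len(symbols) and symbols[i] in MOVES:
        i += 1                                   # MC while the button is held
    return i == len(symbols) - 1 and symbols[i] == "ReleaseMB"

print(is_mdrag(["MouseLeft", "PressMB", "MouseDown", "ReleaseMB"]))  # True
```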
3.3. Semantic Specification
The final step is to define the semantic functions, which map input symbols to their meanings, i.e. to specify the meaning of the dialogue with these functions. The following functions are used for specifying the meaning of command-level input symbols from the user to Finder. execute gives the meaning of the user's input in terms of the change of the system's internal entities; display gives it in terms of the change of the system's output.

  execute : Com → ( ( Desktop → Mouse → Clip → FindStr ) → ( Desktop × Mouse × Clip × FindStr ) )
  display : Com → ( ( Desktop → Mouse → Clip → FindStr → Disp ) → Disp )

The next specification is a part of the description of the function execute for the user's command NewFolder. The arguments dt, m, c and f are the system's current internal entities.

  execute [ NewFolder ] dt m c f = let dt' = create-new-folder dt
                                   in execute [ RenameSelection ] dt' m c f
Similarly, the functions action and echo are defined for specifying the meaning of mouse-key level input symbols from the user.

  action : MK → ( ( Desktop → Mouse → Clip → FindStr ) → ( Desktop × Mouse × Clip × FindStr ) )
  echo   : MK → ( ( Desktop → Mouse → Clip → FindStr → Disp ) → Disp )
The following is an extract from the specification concerning mouse dragging. It shows that the result of a menu selection is handed over to the command level, and that the result of dragging inside a certain window becomes a window or file selection.

  action [ MC MDrag ] dt m c f = let (dt', m', c', f') = action [ MC ] dt m c f
                                 in action [ MDrag ] dt' m' c' f'

  action [ PressMB Res_D MDrag1 ] dt m c f =
    case whereMC dt m of
      onMenuBar : let (dt', m', com) = select-menu [ MDrag1 ] dt m
                  in execute com dt' m' c f
      onWindow  : let n = whichWindow dt m in
                  let i = whichItem dt m in
                  if i = ε then let dt' = front-window (dt, n)
                                in select-window [ MDrag1 ] dt m
                           else let dt' = select-file (dt', i)
                                in move-selection [ MDrag1 ] dt m
      onDesktop : ...
    end case
Similarly, detailed descriptions for the rest of the dialogue are also given in a formal manner. The whole specification which we have applied to Macintosh Finder is shown in [6].
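A rough Python transcription of this semantic-clause style is given below. The helper create_new_folder and the command set are stand-ins for the paper's auxiliary functions, so this is a sketch of the interpreter idea rather than the actual specification.

```python
def create_new_folder(dt):
    """Stand-in for the paper's create-new-folder auxiliary function."""
    return dt + [("folder", "untitled")]

def execute(command, dt, m, c, f):
    """Meaning of a command-level symbol as a change of (Desktop, Mouse, Clip, FindStr)."""
    if command == "NewFolder":
        dt2 = create_new_folder(dt)
        # mirrors: execute [NewFolder] dt m c f = execute [RenameSelection] dt' m c f
        return execute("RenameSelection", dt2, m, c, f)
    if command == "RenameSelection":
        return dt, m, c, f          # renaming elided in this sketch
    raise ValueError(f"no semantic clause for {command!r}")

print(execute("NewFolder", [], None, None, ""))
```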
4. CONCLUSION
We have introduced an alternative approach to specifying the human-computer dialogue in a formal manner. With the denotational approach, we can give detailed and precise specifications under the given domains. Listed below are the characteristic features and possibilities for future steps.
• Our specification technique has been shown applicable to a character-based text editor [4] and to GUI-based programs [5][6], which have comparatively detailed interaction. This shows the expressive ability of this approach to specify the dialogue.
• This model can deal with several abstraction levels of input symbols, output symbols and internal entities in a consistent manner, so modellers can choose whatever abstraction levels are appropriate to describe for their own purposes. Extending the model so that it can handle the task level in the same way is the next step in our research.
• The "meaning" of dialogue sequences can be specified, so a practical application of the model to interface specifications would be a straightforward extension of this research.
• If the user's internal entities are given, the meaning of the system's output symbols to the user could be defined symmetrically as a mapping from the user's internal entities to the user's input symbols to the system, and this would open some possibility of simulating the dialogue. However, there remains the problem of whether an appropriate user model can be given to specify the user's knowledge and activities.
• Besides a user and a system, other domains such as tasks or environments could be introduced to specify the meaning of higher-level dialogue in which such other domains are taken into consideration.
Finder and Macintosh are trademarks of Apple Computer Inc.
REFERENCES
[1] S.K. Card, T.P. Moran and A. Newell: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, NJ (1983).
[2] D. Kieras and P.G. Polson: An Approach to the Formal Analysis of User Complexity. International Journal of Man-Machine Studies, 22, pp.365-394 (1986).
[3] T.P. Moran: The Command Language Grammar: A Representation for the User Interface of Interactive Computer Systems. International Journal of Man-Machine Studies, 15, pp.3-50 (1981).
[4] K. Matsubayashi, Y. Tsujino and N. Tokura: A Formal Approach to Hierarchical Specification of Human-Computer Dialogue Using Denotational Semantics (in Japanese). Trans. IPS Japan, 94-HI-53-13 (1994), pp.93-100.
[5] K. Matsubayashi, Y. Tsujino and N. Tokura: An Application of the Denotational Specification of Human-Computer Dialogue to GUI-based Systems (in Japanese). Trans. IPS Japan, 95-HI-58-6 (1995), pp.37-44.
[6] K. Matsubayashi, Y. Tsujino and N. Tokura: A Denotational Specification of Human-Computer Interaction and Its Application to Macintosh Finder (95-ICS-3). Department of Information and Computer Sciences, Osaka University, Toyonaka, Osaka, Japan (1995).
[7] S.J. Payne and T.R.G. Green: Task-Action Grammars: A Model of the Mental Representation of Task Languages. Human-Computer Interaction, Vol.2, pp.93-133 (1986).
[8] D. Scott and C. Strachey: Towards a Mathematical Semantics for Computer Languages. Proceedings of Symposium on Computers and Automata, Polytechnic Institute of Brooklyn Press, New York, U.S.A., pp.19-46 (1971).
A Case-Based Method To Support Creative Design Incorporating Intention Recognition
Takayuki Yamaoka and Shogo Nishida
Central Research Lab., Mitsubishi Electric Corporation, 8-1-1, Tsukaguchi-honmachi, Amagasaki, Hyogo, 661, Japan.
In this paper, we describe a method and a system to support a creative design process. The method is characterized by its ability to infer the designer's intention based on case-based reasoning (CBR) methodology, so that the system can provide more useful and cooperative information on the current design.
1. INTRODUCTION
In order for computer systems to cooperatively support design processes, the systems are required to understand the intentions of the designer and use them to improve the design process [1]. Conventional CAD systems at present, such as graphical drawing tools with simulators or with expert systems, can only support the designer's lower-level operations or tend to aim at fully automated systems; they cannot sufficiently support the creative aspects of design processes. If a system can grasp an intention and share it with the designer, it becomes possible to provide more useful and cooperative information. In spite of the importance of this role, there has been little research on design intention and its use for design support. In this paper, we provide a method and a system to support a creative design process, incorporating the designer's intentions and using CBR methodology.
2. INTENTIONAL DESIGN AND SUPPORT METHOD
2.1. Design intention and means
In many creative design processes, the designer may intentionally generate a result which is satisfactory for his/her preference. In a design task such as furniture layout in a room, for instance, if the room is wide enough to put all the given furniture in, the designer may decide the layout along ideas of how or what s/he wants it to be, such as being beautiful and/or compact for future extension, and so on. We call these ideas, such as to be beautiful or to be compact in this example, "design intention". We also call the resulting state that realizes a design intention a "means". In the furniture example, a means would be an actual layout in the room. Normally, an intention causes a designer to choose particular means. We call the structure of this causal relationship the "intention structure". Notice that the design intention is not always the design goal or the functional end that the final product will perform, but is a kind of mental state of the designer in the design
process. If alternatives to achieve the goal exist, designers make decisions to choose the preferred means that satisfy the intention. Grasping the design intention and sharing it between the designer and the system makes it possible for the system to provide more useful and cooperative information, especially in the conceptual design phase.
2.2. Case-based support for intentional design
Figure 1 depicts an image of human-computer cooperative design support. In this image, a designer and a system look at a working field. The designer can perform three actions to reflect his/her design intention in the design:
1. operating on objects to change the current design status as s/he likes,
2. asking the system to see information associated with the current design, and
3. editing the shared intention structure of the ongoing design.
On the other hand, to support the designer, the system performs the following actions:
1. recognizing the designer's intention,
2. showing information which is useful to satisfy the intention, and
3. storing the intention structure and the result in the database for future support.
By iterating interactions using these actions, the designer and the system can share the design intention step by step, so that they can collaborate and make the entire design process more effective.
1. H: Input a problem.
2(a). H: Ask to retrieve means to satisfy the design intention.
2(b). H: Operate on objects to change the status; S: Infer the design intention from the operations or the states.
3. S: Display the intention structure (the result of step 2).
4. H: Take preferred means into the current status, if any.
5. H: Edit the intention structure to reflect the design intention.
6. H: If the result is not sufficient, go to step 2 and repeat.
7. S: Store the intention structure in the case-base.
Note: H denotes the human designer, and S the system.
Figure 1. An Image of Cooperative Design Support (illustration omitted)
Figure 2. Typical Process of Cooperative Design Support
To incorporate the capabilities mentioned above into the system, the following functions are a minimum requirement: 1. to presume and confirm the design intention from the ongoing means and/or the current status produced by the designer, and 2. to retrieve and display concrete means which satisfy the design intention. However, since design intentions may vary among designers and situations, the most problematic matter is the difficulty of describing fixed rules or knowledge sources about the relationships between intentions and means. In order to realize a system with these functions, we employ a framework and an architecture based on CBR methodology. The CBR approach is good for such tasks, mainly because the CBR method makes it possible: 1. to avoid preparing fixed rules and knowledge sources in advance, 2. to output flexible and varied information through the modification and adaptation processes, and 3. to extend knowledge sources step by step. Furthermore, since computer systems cannot make creative decisions at present, it is reasonable for computer systems to work as augmented memories of human designers, and to provide information useful for the users to make good decisions [2]. In this framework, a human designer and a system can interactively proceed with a design process as in Figure 2. The input problem includes the goal, objects, constraints, and so on. Step 2(a) may be performed if the designer explicitly has a particular design intention; otherwise 2(b) is performed. In Step 5, the designer can modify any symbol of a mental statement in the intention structure to reflect his/her design intention. Each step corresponds to a particular phase of a typical CBR process: 2 to case retrieval, 4 and 5 to adaptation, and 7 to case storage, respectively. A remarkable characteristic of this method is the twofold application of the CBR method, for the design itself and for intention recognition. Details of the intention recognition will be described in the next section; a schematic sketch of this interaction loop is given below.
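Read as pseudocode, the interaction of Figure 2 amounts to a retrieval-adapt-store loop around the case-base. The sketch below is a hypothetical rendering with invented names, not the authors' implementation; the designer-facing steps are passed in as callables because they are interactive in the real system.

```python
class CaseBase:
    """Minimal stand-in for the system's case-base of intention structures."""
    def __init__(self):
        self.cases = []
    def presume_intention(self, state):       # step 2(b): case retrieval
        return "beautiful"                    # placeholder presumption
    def retrieve_means(self, intention):      # steps 2(a)/3: candidate means
        return [case for case in self.cases if case[0] == intention]
    def store(self, intention, state):        # step 7: case storage
        self.cases.append((intention, state))

def design_session(initial_state, case_base, review, satisfied):
    state = initial_state
    intention = None
    while True:
        intention = case_base.presume_intention(state)
        candidates = case_base.retrieve_means(intention)
        state = review(state, intention, candidates)   # steps 4-5: adapt / edit
        if satisfied(state):                           # step 6: stop when sufficient
            break
    case_base.store(intention, state)
    return state

# e.g. a trivial non-interactive run:
print(design_session({"layout": []}, CaseBase(),
                     review=lambda s, i, c: s, satisfied=lambda s: True))
```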
3. CASE-BASED INTENTION RECOGNITION
We represent an intention structure as a labeled graph, as shown in Figure 3. It consists of four types of nodes, vocabulary, object, physical and mental, and labeled links between them. An object node consists of a set of attributes, where each attribute's name is a link label and its value is a vocabulary. A physical node consists of a set of objects (the predicate name and arguments), and generally stands for a physical state, such as "left" or "lower". A mental node consists of a set of any nodes, and generally stands for a mental state, such as "beautiful" or "compact". An intention structure describes a (partial) case realized with an intention which is at the top (root) of the graph. The example in Figure 3 is an intention structure of an electric facilities layout whose design intention is to be "beautiful". The intention recognition process in a design proceeds incrementally, since the entire design process described in the previous section may recurse.
Figure 3. An Example of Data Structure
Figure 4. Synthesis of Intention Structures
Each step of the intention recognition is based on a CBR process, in which the input is a list of partial structures including object, physical and mental nodes (these represent the current state of the design). The step can be divided into two main phases: presumption and extension, which correspond to the retrieval and adaptation phases of CBR, respectively. The presumption phase is a straightforward case retrieval process from the current input. The similarity measurement between two structures (target and source), which is the core of the retrieval, is realized by a recursive graph matching method based on a graph unification algorithm. The value of the similarity between two structures is the summation of the similarities of their lower structures. Similarities between vocabularies, which are at the lowest level of the intention structure, are predefined in the vocabulary database. The extension phase is done by synthesizing partial intention structures from the case-base, in order to presume the upper intention, which might reflect wider contents of the design. An example of the synthesis is shown in Figure 4. In this example, the system could retrieve a case (rooted by "beautiful") from which it is possible to synthesize the two input structures ("Sparse" and "Symmetry"), so that the two inputs are extended to a new structure, "beautiful". The retrieval in this phase is also based on the similarity; therefore it is possible to extend structures which do not appear in the cases of the current case-base. By this method, it will thus be possible to generate new structures across several cases in the case-base.
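The presumption phase's similarity might be sketched as follows. The greedy child matching here merely stands in for the paper's graph-unification-based matching, and the vocabulary table is an invented miniature of the vocabulary database.

```python
VOCAB_SIM = {("sparse", "symmetry"): 0.4}    # invented miniature vocabulary database

def vocab_similarity(a, b):
    if a == b:
        return 1.0
    return VOCAB_SIM.get((a, b), VOCAB_SIM.get((b, a), 0.0))

def similarity(target, source):
    """Recursive matching of intention structures; leaves are vocabulary strings,
    inner nodes are (label, children) pairs."""
    if isinstance(target, str) and isinstance(source, str):
        return vocab_similarity(target, source)
    if isinstance(target, str) or isinstance(source, str):
        return 0.0
    (label_t, kids_t), (label_s, kids_s) = target, source
    score = vocab_similarity(label_t, label_s)
    for kid in kids_t:           # greedy best match over substructures
        score += max((similarity(kid, other) for other in kids_s), default=0.0)
    return score

beautiful = ("beautiful", ["sparse", "symmetry"])
print(similarity(("beautiful", ["sparse"]), beautiful))   # 1.0 + 1.0 = 2.0
```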
System
architecture
A system architecture based on the framework described above is shown in Figure 5. The user interface is a graphical one in which a user can directly manipulate design materials and edit graph structures (see Figure 6). The input interpretation part interprets operations and states in the user interface and transfers them to the internal data struc-
81
Problem ' User Interface
Dis play
Operation
Operation
________.t_ ___k
I onu
II
Work
Space
] INetworkEditor]
Input interpretation Scoping
Buffer
Edit
Synthesis Case-Base Retrieval =-
Distance
Similarity
Inference Kernel
Memory
BB~
-----"
Input/Output
Process *;,-',=*;,iiJP,~ Data/Knowledge
Figure 5. System Architecture tures (graph), using the objects and statements knowledge bases. The inference kernel is the main part of CBR, including the retrieval and synthesis programs described in the previous section. The buffer in the inference kernel maintains several intention structures which are presumed by the system to be similar to the input, so that the system can show various information, in case the best matched case would not satisfy the user's intention. The memory part maintains cases as intention structures and provides an editing facility to change structures for users to modify and adapt them to reflect the design intention. 4.2. Y A A D : a p r o t o t y p e We implemented a prototype YAAD(Yet Another CAD system) for electric facilities layout design using C and Motif on a UNIX workstation. A screen copy of Y A A D is shown in Figure 6. This figure is a snapshot of GIS(Gas Insulated Sub-station) layout design. Designers can do some layout operations on objects and edit the intention structure in the left hand two windows, while a retrieved layout example and its intention structure are shown in the right hand windows. Although the system is now under evaluation, several features mentioned above have been verified: • Y A A D provides various information associated with design intention robustly, and most information is based on real cases done by other designers, so that it is useful for the conceptual design phase. • Users can interactively refer to information whenever s/he desires, making it possible for the design process, especially in the conceptual phase, to be effective and smooth. • It is an important feature for users to be able to directly see and modify the means and the intention structure through the graphical interface. On the other hand, several shortcomings have been appeared:
• Since YAAD is an interactive system, response time is crucial for real use. YAAD can respond quickly when there are not too many scoped objects, but it is slow to respond if the number of objects is large. So developing an efficient method to retrieve examples from the case-base, including parallel and/or intelligent search algorithms, is a future goal.
• The quality of the information shown by the system depends heavily upon the construction of the vocabularies that the system uses. Development of an effective mechanism to obtain and maintain good relationships between vocabularies is also required.
5. CONCLUSION
We described a case-based method and system to support a creative design process, incorporating the ability to infer the designer's intention. By grasping the design intention and sharing it with the designer, the system can provide more useful information and can collaborate with the designer. We verified the features of the method using the prototype YAAD.
REFERENCES
[1] Tomiyama, T., Kiriyama, T. and Yoshikawa, H.: Intelligent CAD Systems: Today and Tomorrow, Journal of JSAI 7(2), 1992 (in Japanese).
[2] Kolodner, J. L.: Improving Human Decision Making through Case-Based Decision Aiding, AI Magazine, 1991.
Designing Interfaces for Computer-based Assessments
R. M. Kaplan and I. R. Katz
Division of Cognitive and Instructional Science, Educational Testing Service, Princeton, New Jersey 08541 USA
1. INTRODUCTION
Educational institutions and testing organizations around the world have been moving away from traditional paper-and-pencil tests toward computer-based tests. Computer-administered tests now exist for college entrance exams and professional licensing exams, among other areas. Advances in graphical user interfaces and artificial intelligence make the computer administration of tests on a large scale feasible. For example, administering a test on computer opens the possibility of automatically scoring candidate responses, resulting in potentially large savings over manual scoring. However, whether we simply reproduce a paper-and-pencil, multiple-choice test onto computer or create new kinds of test questions that require more complex responses, significant issues in interface design inevitably arise. The interface for a test question should require minimal computer skill. The end-users of an assessment system--the test takers--are infrequent users of the system, perhaps taking a test a few times a year or once in their lifetime. Such interfaces must be "walk up and use" [1]. That is, the interface for a computer-based test question has constraints similar to applications that are publicly available (e.g., cash machines)--the interface must represent a design understandable by the majority of people who will use the system. Furthermore, greater experience with a test question's interface should not translate to better test-taking performance. Thus, facility with the interface's functions and speed of performance are not as important as ease-of-use for an entire test-taking population. A person should reach "expertise" in a relatively short time, and further practice should result in only modest gains in performance speed. For multiple-choice questions, these design constraints do not typically pose difficult challenges. However, testing organizations are beginning to use questions in which students create their own responses, known as "constructed response" questions. In contrast to multiple-choice questions in which a test taker selects an answer from a set of alternatives, constructed-response questions require that each test taker construct his/her own answer, which could be a number, word, essay, or diagram. Constructed-response questions pose significantly more complex problems for user interface design. When designing interfaces for constructed-response questions, it is necessary to meet all of the goals described above, while at the same time making the question realistic. That is,
constructed-response questions typically require test takers to perform real-world tasks. Several challenges arise when attempting to accomplish these goals. On the one hand, making an interface easy to learn and simple to use diminishes the question's correspondence to the real-world task. Simplifying a test question's interface might involve constraining the possible range of user-computer interactions. On the other hand, the closer a testing interface comes to the real-world task, the greater the difficulty of automatically scoring responses. In other words, the data collected in response to that question become more difficult to score via computer algorithm. This trade-off is particularly apparent with questions requiring natural language responses. This paper describes two examples of computer-based constructed-response questions that (a) represent real-world tasks and (b) must be automatically scored. We describe the "design specifications" for each question, along with their corresponding interfaces. Particular attention is given to the iterative evolution of each interface as well as the design rationale behind the initial and subsequent designs. The design rationale focuses on compromises made, such as trade-offs between real-world "look and feel" and the constraints necessary to allow automatic scoring. We first describe a question created to assess certain aspects of architectural skill, called the "block diagram" task. In constructing a block diagram, an architect arranges a set of rooms (e.g., the lobby and customer service area of a bank) onto a building site, specifying the spatial arrangement of rooms as well as the building's location on the site. This type of question represents one of the first tasks that architects perform when designing a new building. As in the real-world situation, the test taker (i.e., an aspiring architect) is provided with a general description of the building and each of its rooms. The architect is free to design the building in any way that meets the constraints specified in the problem. The second question is from the writing domain. In this constructed-response question, a test taker is presented with a one- to three-paragraph passage. The passage contains syntactic and/or semantic errors that are to be located and corrected by the test taker. The interface must allow test takers to correct the errors in the passage, yet must also keep track of the revisions. Single sentences may contain more than one error, and errors can cross sentence boundaries. Thus, in order to make automatic scoring of responses possible, each revision must be individually tracked and attached to its related error in the passage.
2. BLOCK DIAGRAM
When an architect begins to design a building, they are given a set of specifications. These specifications include, among other things, a list of the spaces, their relative dimensions, and the requirements for connectivity between the spaces of the building. Using this information, the architect concretizes it in the form of a special schematic representation called a block diagram. If we wanted to incorporate a task like this on an examination for architects, we might do so by providing the architect with a generalized drawing tool. Unfortunately, doing so would make the process of scoring the architect's solution next to impossible. For this reason, the item interface gives the architect all of the tools they need to
construct a block solution while, at the same time, allowing the characteristics of the solution to be recorded in detail. The computer-based interface for the block diagram task is shown in Figure 1. This figure depicts an architect's nearly completed design. Initially, blocks representing the building's rooms are provided at the top of the screen. Down the left side of the screen are buttons that allow subjects to perform various design actions (e.g., move block, rotate entire current design). In the figure, the architect has placed all of the blocks onto the building site and has indicated how people would move from one room to another (i.e., which rooms are connected and which rooms are not immediately accessible from one another).
Figure 1 - Block item interface (screenshot omitted)
Figure 2 - Block item scoring analysis (screenshot omitted)
The characteristics recorded as an architect creates a block diagram include the positions of the specific blocks and the connections between the blocks. These two pieces of information are critical for the scoring process. For example, one of the measurements carried out by the scoring program is the measurement of the paths between the spaces. For a good solution these path lengths need to be minimal. To calculate the lengths precisely it is important to know exactly which elements of the block diagram are connections between spaces. The interface therefore includes a tool to precisely place connections between rooms. Figure 2 depicts the visualization interface created by the scoring process. Although this display is not part of the actual scoring process, it is useful for watching the scoring procedure during development to gain an understanding of its operation. In Figure 2, the small black squares are called "occlusions" and represent "doors" that are placed in the diagram to indicate how the spaces are separated. These are an example of how the scoring process has a direct and critical effect on the task interface. In this case, it was determined that occlusions could not be readily inferred from the connections between the spaces. When the design team for this interface realized this, a tool for occlusions was added to the interface so that an architect could place the occlusions where necessary.
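For instance, the path measurement could be realized with an ordinary shortest-path computation over the room-connection graph. The sketch below, with invented room names and weights, illustrates the idea only; it is not ETS's scoring code.

```python
from heapq import heappush, heappop

def shortest_path_length(edges, start, goal):
    """Dijkstra over the room-connection graph; edge weights are distances."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    heap, seen = [(0.0, start)], set()
    while heap:
        dist, room = heappop(heap)
        if room == goal:
            return dist
        if room in seen:
            continue
        seen.add(room)
        for nxt, w in graph.get(room, ()):
            if nxt not in seen:
                heappush(heap, (dist + w, nxt))
    return float("inf")

# e.g. penalize designs whose lobby-to-service path is long:
edges = [("lobby", "corridor", 3.0), ("corridor", "service", 2.5)]
print(shortest_path_length(edges, "lobby", "service"))  # 5.5
```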
3. INTERLINEAR
Much of our research involves creating computer-based versions of complex examination items that lend themselves to automatic scoring. The latter is particularly important since such items would not be cost-effective to include on an examination if they were not partially or completely scorable by automatic means. This feature of our "problem space" is even more critical with those items whose response will be in the form of a natural language passage. Because no completely effective means of analyzing natural language presently exists, we must somehow constrain the task so as to (1) mimic to some degree the entirely open-ended task and (2) collect the same or similar information that we would have in the open-ended task. One type of natural language question that demonstrates this trade-off is used to test writing and revising skills. The original, paper-and-pencil version of this question presents to candidates a passage containing syntactic and semantic errors (Figure 3). The candidate makes corrections in the spaces between the lines of the passage--thus the name "Interlinear." The kinds of corrections that a candidate can make are completely open-ended--syntactic, semantic or some combination of these (even rewriting or combining sentences).
Directions: Reprinted below is a poorly written passage. You are to treat it as though it were the
first draft of a composition of your own, and revise it so that it conforms with standard formal English. Wide spaces have been left between the lines so that you may write in the necessary improvements. Do not omit ideas and do not add any ideas not now present. You may, however, change any word which you think is expressing an idea inexactly; and you may omit words, phrases, or sentences that are unnecessary. You are not expected to rewrite the whole passage. Trying to do so will not only waste your time but will also cause you to miss many of the specific errors you are expected to correct. Much of the passage is satisfactory as it stands. Leave such parts alone and concentrate on finding weak places that need changing.
In general, corrections should be made by crossing out the word, phrase, or mark of punctuation you wish to change and writing your own version above it. Any clear method of indicating changes is satisfactory. Simply make sure that what you intended is clear.
Edgar Allan Poe is one of the great writers of the pre-twentieth century time period. He was one of the first to develop theories about the characteristics of short stories. In fact, Poe lay down the three most important rules of a short story in his critique of Hawthorne's "Twice Told Tales." His work was called "The Brief Prose Tale." In this critique, he set down three major laws of a short story; one of the rules was that it must be able to be read in one sitting. Another rule is that there must be a dramatic characteristic that must be carried throughout the story. And the third is that a central theme should be contained throughout the story.
Figure 3 - Sample of the original Interlinear item
In the sample above, a candidate can do almost anything to make changes to the passage. It turns out that the rubric for such a question (i.e., the specification of the procedure for scoring a response) is quite extensive and amounts to developing all of the possible alternatives for specific corrections to the passage. The complexity of the scoring process led to the eventual elimination of the paper-and-pencil Interlinear question type as a potential examination item. Recent work in natural language processing techniques now opens the possibility of more easily scoring these responses automatically. Of course, instead of administering the question using paper and pencil, we would use a computer to administer the item. At first glance it would seem that a computer-based interface for this item would be very simple to construct. For example, we could simply provide an editing box in which the passage is displayed and allow the candidate to edit the passage as they would with any word processing program. Although this interface may be easy for a candidate to use, because of the freedom allowed in changing the passage, the interface is not a good one for automatically scoring responses. It would be the responsibility of the scoring program to determine what changes were made and whether those changes were correct. Scoring could become particularly complex if the candidate makes changes to portions of the passage other than those that are supposed to be corrected. Our task was to design an interface that structured candidates' activities without unduly constraining the types of changes candidates could make. At the same time, for automatic scoring, the interface needs to collect information about changes that are made to the passage. The instructions specify that there are specific errors that the candidates should correct, and that candidates should not spend time revising the whole passage. To accomplish the collection of corrections to the passage, we could present the candidate with a list of words and phrases to correct and allow only these words and phrases to be changed. This type of interface constrains what a candidate can do to the passage and also the data that must be analyzed. In this case, the data that must be analyzed are only the changes that a candidate made corresponding to the word or phrase that was to be changed. The resulting interface is shown in Figure 4. The interface consists of a window containing the passage, one containing a list of phrases to correct, and the related windows for editing and maintaining a list of the edits made to the passage. One of the features of this interface is that the passage is updated as the candidate makes changes to it. Just as in the case of the traditional editing window, this interface allows candidates to read the passage with their changes in place. They can, at any time, also have the passage display return to the original version (i.e., before they introduced changes). Although the interface is very different from the standard word processing interface, preliminary reviews suggest that the interface allows the testing of writing and revision skills while collecting the proper information to aid automatic scoring. More extensive user testing is currently underway.
Figure 4 - Interlinear interface
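The revision tracking described above might be modelled roughly as follows. The record layout and names are hypothetical, but they show how each edit can stay attached to the listed phrase it corrects while the displayed passage is updated.

```python
from dataclasses import dataclass, field

@dataclass
class Revision:
    phrase_id: int        # which listed phrase this edit targets
    original: str
    replacement: str

@dataclass
class InterlinearItem:
    passage: str
    phrases: dict                     # phrase_id -> original phrase text
    revisions: list = field(default_factory=list)

    def apply_edit(self, phrase_id, replacement):
        rev = Revision(phrase_id, self.phrases[phrase_id], replacement)
        self.revisions.append(rev)    # kept for automatic scoring
        # naive display update (replaces every occurrence; fine for a sketch)
        return self.passage.replace(rev.original, replacement)

item = InterlinearItem(
    passage="In fact, Poe lay down the three most important rules ...",
    phrases={1: "lay down"},
)
print(item.apply_edit(1, "laid down"))
```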
4. CONCLUSIONS
Over the last several years we have developed many new constructed-response item types. Part of our effort in this process has been to develop, in conjunction with the item types, automatic or semi-automatic means for scoring them. The development of scoring processes feeds directly back into the interface design process. Because creating complex intelligent applications to analyze a particular item can be time-consuming and costly, and can result in a process that cannot be generally applied, that approach to scoring these items is not viable. An alternative is to constrain the item interface in such a way as to assist the scoring process while at the same time leaving the task realistic. We have shown two item types for which this type of development process took place. The first is a graphical task for architects. Rather than allowing an architect to develop a completely open-ended solution, the item interface collects the same information while constraining the activities of the architect. Similarly, the second item, in the domain of writing, constrains the activities of the writer while collecting information like that which would be produced in a completely open-ended writing task.
REFERENCES
1. Lewis, C., & Polson, P. G. (1991). Cognitive walkthroughs: A method for theory-based evaluation of user interfaces (Tutorial Notes). ACM Computer-Human Interaction Conference (CHI'91).
III.5 Interaction Design 3
WMH methodology for HCI design
Christian Coëffé
Sofréavia / Centre d'Etudes de la Navigation Aérienne, 1, rue Champagne, 91200 Athis-Mons, France
Abstract
This paper presents a practical methodology for Human Computer Interaction (HCI) design and a formalism for HCI description which have been successfully used in the Air Traffic Management (ATM) field, namely for:
• the French PHIDIAS programme developing the next generation of Air Traffic Controller Working Position to be installed in 1997 (Ref 1; Ref 2; Ref 3);
• the European SWIFT project launched by the European Commission (DGVII/EURET 1.5) (Ref 4; Ref 5).
This so-called WMH (What, Means & How) methodology is not tremendously innovative, but it works, and its usability by both operational and technical people has been proved. Due to a pressing need for better methods and tools in this part of system design, we think that what we see as a successful experience deserves to be made known to a wider circle of HCI design experts.
1. INTRODUCTION
1.1. The place of HCI aspects in the system design
Nowadays, nobody would make claims about the efficiency of a human-operated system without having taken into consideration HCI aspects in the global design. Nevertheless, HCI design should play a different role and take a different place in the whole life cycle of the system depending on the type of application to be developed. When the system is entirely new, most of the time its development is technology-driven. In this case, even though HCI aspects are quite important, they are defined late in the design phase, sometimes even without interviewing potential operators. On the contrary, whatever its degree of innovation, when the system is to replace an existing one, the operational requirements and constraints, the transition phase from one system to the other, and operational acceptability by existing operators are major topics to be addressed. From this standpoint, the benefit of involving end users early on is very high: HCI operational aspects can be addressed early in the design phase so that the result can be considered a "user requirements document" which goes much beyond stating general operational objectives. For that purpose, the HCI specification is not limited to issues such as information presentation or the way to perform some pre-defined functions. On the contrary, the identification and description of HCI functions have to be covered during this HCI design phase.
Involving end users from the beginning in this kind of exercise sounds very attractive, but following this approach can raise several difficulties:
• to distinguish the real user requirements from wide-ranging wishful thinking;
• to gather a representative enough sample of future end users to participate in this HCI design phase;
• to satisfy the user requirements by first describing a feasibility framework;
• to ensure that the output of this phase will be directly usable by developers involved in the global system design.
When facing this situation, it seems that HCI design lacks both a methodology and a description formalism to tackle these difficulties.
1.2. The main features of an efficient HCI methodology + formalism
According to the needs presented above, a methodology for HCI design will not be efficient unless it is associated with a description formalism. Therefore, an efficient methodology + formalism should ensure that:
• questions be addressed and answered one at a time;
• all aspects (functional, interaction and presentation) be taken into account;
• the final description of the HCI tend to completeness;
• both users and developers be put in a position to discuss the document content and to use it;
• the methodology + formalism in itself be easy to learn and to use.
The WMH methodology was built with these guidelines in mind.
2. WMH METHODOLOGY
2.1. Generalities
Three phases illustrate the proposed "top-down" approach to elaborate and describe HCI specifications. These phases are named "What to do", "Means to do it" and "How it runs", and have to be used in sequence. One of the advantages of this WMH methodology is to ensure good quality of the functional description (what to do) before any definition of basic HCI components (means to do it), and obviously before their use to perform the functions (how it runs). Splitting the elaboration and description of HCI specifications into these three phases makes it possible to clearly separate the functional level (operational needs) from the technical and implementation levels of HCI design (technical solutions). Phase 1, "What to do", is the functional specification of the HCI requirements; it means that all HCI functions to be developed are described during this phase. This functional description is quite independent of the way these HCI functions may be implemented in the end. Phase 2, "Means to do it", is the description of the HCI components available to perform these HCI functions; it means that all the HCI components of the interface are described in detail. At the end of phase 2, the background information of the interface is described, as well as all the components which will be used for performing the HCI functions. But the detailed way to perform these HCI functions is not yet defined. Phase 3, "How it runs", is the description of the operational use of the HCI components to perform the functions. It means that the way to perform each HCI
function is presented in detail so that, finally, each elementary dialogue action and its related effect on the interface is described. Applying this methodology implied the use of a specific formalism facilitating the description of HCI specifications. The proposed formalism mainly relies on the use of standard forms or tables with specific headings depending on the phase to be described. Some basic rules of syntax and wording were also defined to ensure consistency throughout the whole documentation. This methodology was proposed to a multidisciplinary team composed of the project manager, end users, HCI experts and developers. These actors commonly have different objectives to satisfy and use different languages (Ref 6). The driving force of the WMH methodology is to facilitate the partnership necessary for a constructive HCI design phase. Table 1 gives the expected role of the different actors depending on the methodology phase.

Table 1: Expected roles of the actors

          End Users             HCI Experts                  Developers
  What    Operational aspects   Functional Analysis          -
  Means   -                     Representation Consistency   -
  How     -                     Fluency of Dialogue          -

2.2. Presentation of the methodology
Phase 1 "What to do" This question about what we want the system to do is not answered only by stating the system objectives. It deals with the definition of the operational way to satisfy these general objectives, e.g. the definition of the operational functions to be performed. Therefore, this phase aims at a d o p t i n g a f u n c t i o n a l approach to operational requirements.
Answering the question "what to do" from an operational standpoint means that the operational functions have to be abstracted as much as possible from implementation issues. An operational (or HCI) function is defined as a function to be handled by the operator through an HCI procedure, or triggered by the system but in that case having some HCI consequence. Both kinds of functions have to be defined and described. Therefore, each HCI function is described using a standard form including the following headings (a sketch of such a form as a simple record type is given below):
• objective(s) of the function from an operational standpoint;
• constraints or operational conditions on function availability;
• procedure / triggering event, presenting the main steps for performing the function; the number of steps will indicate whether or not the function must be performed very quickly due to its urgency and/or frequency;
• final effect / consequences, presenting what has been changed in the system by performing this function; final effects on the system lie beyond the HCI;
• remarks, allowing any complementary information to be supplied.
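The following Python sketch transcribes the Phase 1 form into a record type; the field names mirror the headings above, while the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HciFunctionForm:
    name: str
    objectives: List[str]      # operational objective(s)
    constraints: List[str]     # operational conditions on availability
    procedure: List[str]       # main steps / triggering event
    final_effect: List[str]    # consequences on the system, beyond the HCI
    remarks: str = ""

form = HciFunctionForm(
    name="activate a view",
    objectives=["let the controller visualise a given sector"],
    constraints=["the sector is assigned to this working position"],
    procedure=["select the sector", "confirm the view"],
    final_effect=["the selected view becomes the active display"],
)
```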
During this phase, implementation solutions such as input device choices, presentation of information or "mouse clicks" descriptions are not to be addressed. Due to this absence of technical issues, this phase was conducted directly by HCI experts with end users.
Phase 2 "Means to do it"
Having defined the HCI functions the system must support, this second question focuses on the definition of the HCI components to be used to perform these functions. The first step for that purpose consists in defining the different hardware devices available to perform the input/output aspects of the functions, as well as the assignment of the functions to these hardware devices. It would not be possible to define HCI components such as, for instance, the content of a Touch Input Device (TID) page or the content of a dialogue box without having first defined the functions we will perform by using the TID or the pointing device. The second step consists in defining the HCI components themselves. It requires the description of:
• the generic software display items (shape of a window, a menu, a button, a dialogue box, ...);
• the way to use them (close/open a window, choose a menu option, ...) independently of their final content;
• the detailed information elements specific to the application (for instance in the ATM field: tracks, labels, strips, graphical routes, ...);
• the detailed dialogue elements specific to the application (for instance the content of dialogue boxes, the content of menus, ...).
This second phase aims at describing in detail the different categories of HCI components presented above, using standard forms including the following headings:
• for the generic software display items: definition; content; figure; remarks.
• for the way to use them: objective(s); procedure; remarks.
• for the detailed information elements: functions initiated via this element; functions implying any consequence on this element; content; figure; remarks.
• for the detailed dialogue elements: running function; content; figure; remarks.
Within these forms, the interest of enumerating the different functions related to the described HCI component is:
• to propose a cross-reference framework for consistency checks (e.g. to make sure each function can be triggered via at least one element);
• to provide the developers with an HCI component-based description.
All HCI components are described during this phase, so that the global interface of the system is completely defined from a static point of view. While the first phase deals with operational functions, this second phase implies involving both operational and technical people, since the different proposed HCI choices have to be consistent with existing technical constraints.
Phase 3 "How it runs"
This third phase aims at describing in detail the way to perform the functions defined in the first phase by using the HCI components defined in the second phase. The description of the dialogues must be complete enough to allow the
implementation of the function in the system. Therefore, the different steps, elementary actions and intermediate consequences are described for each function. This last phase provides a "dynamic" description of the functions and allows a focus on dialogue efficiency and consistency.
2.3. Presentation of the formalism
The methodology presented above provides the project manager with a structured approach for conducting the HCI design phase. Nevertheless, the different specifications produced during the different phases have to be easily usable, comprehensive, and as clear and accurate as possible. Therefore, some wording rules and syntax rules are defined in order to ensure that, independently of the affected phase, a global consistency is made traceable across the different documents.
Wording rules
A clear understanding of specifications is made much easier when the proposed wording is precisely defined and used in a systematic way. For instance, carefully writing down the operational objective of each HCI function is very important in order to better understand the real justification of this function. Therefore, questions arise when choosing words for the description: "to what extent does this function differ from that one? Are the objectives different?" For instance, "to display a view ..." can be used for system functions triggered by a system event and related to an objective such as "to show ...". On the other hand, "to activate a view ..." can be used for user functions triggered by a user action and related to an objective such as "to visualise ...". This is a way to take advantage of the different meanings of the words and to clarify what is really behind the description of the function. The definition of wording rules is certainly arbitrary to some extent but, having defined them, the most important point is to use them consistently.
Syntax rules
In parallel with the wording rules, syntax rules were also defined to provide a detailed description of the procedures (e.g. the sequence of actions the user may or must perform, the required conditions to perform these actions and the consequences of these actions). While wording rules are project dependent, syntax rules are quite robust and thus more reusable in other projects. Let us illustrate these basic syntax rules by presenting a theoretical example:

  ♦ • condition A
    • ☐ condition B1
      ☐ condition B2
  ⇒ • action 1
    • [consequence 1]
    • ⊠ action 21
      ⊠ action 22
      [consequence 2]
    • [consequence 3]
    [consequence 4]

The way to read this procedure is as follows: ♦ introduces the required conditions; ⇒ introduces the sequence of actions; • stands for an "AND" operator, ☐ for an "OR" operator and ⊠ for a "XOR" operator; [ ... ] indicates a consequence of the action(s).
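Such procedure descriptions could also be encoded and evaluated mechanically. The sketch below is a hypothetical Python rendering of the example above, not part of the WMH formalism itself; the operator tree and field names are invented.

```python
# Conditions and actions combine with AND / OR / XOR operators, and each
# action step may carry consequences.
AND, OR, XOR = "AND", "OR", "XOR"

procedure = {
    "conditions": (AND, ["condition A", (OR, ["condition B1", "condition B2"])]),
    "actions": [
        {"do": "action 1", "consequences": ["consequence 1"]},
        {"do": (XOR, ["action 21", "action 22"]),
         "consequences": ["consequence 2"]},
        {"consequences": ["consequence 3", "consequence 4"]},
    ],
}

def conditions_hold(expr, facts):
    """Evaluate an (operator, operands) tree against a set of true facts."""
    if isinstance(expr, str):
        return expr in facts
    op, operands = expr
    results = [conditions_hold(e, facts) for e in operands]
    if op == AND:
        return all(results)
    if op == OR:
        return any(results)
    if op == XOR:
        return sum(results) == 1
    raise ValueError(op)

print(conditions_hold(procedure["conditions"], {"condition A", "condition B1"}))  # True
```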
These basic syntax rules are completed with the use of IF, THEN, ELSE, allowing the description of different options. These syntax rules can therefore be combined to allow the description of complex procedures. The balance achieved between readability and accuracy has been found satisfactory in our own experience.
3. CONCLUSION
The methodology presented in this paper was built for the purpose of involving a user group first in the definition of operational requirements and then in HCI design. It was our first attempt to meet what we felt were pragmatic requirements for an efficient tool for conducting an HCI design phase. Our initial experience with this methodology encourages us to continue using it. The WMH methodology appears to possess the main properties required of an efficient tool (as listed at the end of section 1). Nevertheless, it is more suitable for designing voluminous HCIs, since offering a very detailed description and a number of consistency checks requires much effort. This methodology has already been successfully used in the ATM field and could be used in any other industrial application where operational requirements and constraints are of the utmost importance and safety is at a premium.
4. REFERENCES
1. CENA/R93008: EBO PHIDIAS - les fonctions d'interface contrôleur-système
2. CENA/R93009: EBO PHIDIAS - les éléments et les règles pour l'interface contrôleur-système
3. CENA/R93015: EBO PHIDIAS - les dialogues d'interface contrôleur-système
4. SWIFT Report WPR10.1: Specifications of Human Machine Interface, Preliminary Steps, SWT/SOF/WPR/10.1/01, March 1994
5. SWIFT Report WPR10.2: Specifications of Human Machine Interface, Detailed HMI Specifications, SWT/SOF/WPR/10.2/01, September 1994
6. Beck, A. (1993): User Participation in Systems Design - Results of a Field Study. In: Smith M.J., Salvendy G. (Eds.): Proc. of the 5th Intern. Conf. on Human-Computer Interaction, Orlando: Elsevier, 534-539
1 Position Harmonisant et Intégrant les Dialogues Interactifs, Assistances et Secours (French programme developing the next generation of Air Traffic Control Working Position)
2 Specifications for controller Working positions In Future air Traffic control (European Commission project conducted by Thomson CSF (F) with the collaboration of Siemens Plessey (UK), ESG (D), NLR (NL), Sofréavia (F), Syseca (F), Roke Manor Research (UK), Aegean University (GR), Captec (IRL) and INESC (PTL))
Guided Understanding for Problem Solving Process Using the Refining Self Explanation
K. Kanenishi† and Y. Yano‡
† Department of Children Education, Shikoku University Junior College, 123-1 Furukawa, Ohjin-Cho, Tokushima 771-11, Japan
‡ Department of Information Science and Intelligent Systems, Faculty of Engineering, The University of Tokushima, 2-1 Minamijousanjima-Cho, Tokushima 770, Japan
1. INTRODUCTION
We use explanations to communicate knowledge. However, after giving an explanation, we find ourselves able to understand the explained subject more deeply. Explanations are thus considered to have a side effect of helping the person who explains to understand the subject better, as well as the function of communicating knowledge to others. Here, we decided to use this side effect of explanations for the purpose of education. At present, we are building an intelligent learning environment for geography [1-2] by using self-explanations. In this paper, this learning environment employing self-explanation is described. The explanations we deal with are explanations in the problem-solving process. First, we let a learner solve an exercise in geography. The learner then explains her/his solution process to the system. The learner deepens her/his understanding of the geographical problem-solving process by explaining a solution. We believe that learning will be deepened not only by solving the exercise but also by explaining how to solve it. Moreover, it is important for the learner to know how to give a self-explanation. It is also important to give appropriate advice to the learner rather than to let him or her explain without any advice. By giving advice, the learning effect will be increased. In the environment of self-explanation, the guidance of the learner by the system is critical. To guide the learner, a learning model is necessary. With self-explanations, it is necessary to clarify how learning advances in the mind of the learner. We believe that she/he is carrying out a reflection process during self-explanation. Reflection is one of the important mechanisms in learning. The learner reflects on the problem-solving process by explaining it. Reflection can further deepen the understanding of the problem-solving process. To make the learner reflect willingly means to make him proceed with learning. It is difficult for the system to control reflection directly.
98
ever, the system can control the function of the reflection indirectly by controlling the environment. Recently, studies of intelligent learning environment focusing on reflection have been made explanation based learning[3]. However, it should be pointed that in these studies the control of reflection is insufficient. In this paper, u n d e r s t a n d i n g by self-explanation is described first. Then, the ways to support self-explanation by the system are described. 2. M O D E L O F S E L F - E X P L A N A T I O N
The three cognitive processes performed in the learner's self-explanation are described here. Self-explanation is made up of three large cognitive effects. Figure 1 shows the self-explanation model, which is composed of several subprocesses. By making the learner explain by himself, learning proceeds within the learner [4-5]. The three cognitive processes in self-explanation are reflection, explanation generation and explanation planning. In self-explanation, a purpose is set first. In line with the set purpose, reflection and explanation generation work in parallel. In reflection, a solution is replayed. Because the solution to the problem that has already been solved is replayed, the learner can observe the problem-solving process subjectively. The problem-solving process obtained as a result of reflection is sent to explanation generation. In explanation generation, the explanation is made through division, classification and integration. The explanation planner controls the three stages of explanation generation and manages the exchange between the explanation generation and reflection.
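To make the interplay of these subprocesses concrete, the following Python sketch paraphrases the cycle of Figure 1. It is our own illustration, not the authors' system, and every name in it (reflect, generate, self_explain, understood) is invented for the example.

```python
# A paraphrase of the self-explanation cycle in Figure 1 (all names invented).
def reflect(solution_steps, feedback):
    """Replay the already-solved process; failure feedback is folded back in."""
    return list(solution_steps) + list(feedback)

def generate(process):
    divided = [str(step) for step in process]               # division phase
    classified = [("component", part) for part in divided]  # classification phase
    return " -> ".join(f"{kind}: {part}" for kind, part in classified)  # integration

def self_explain(solution_steps, understood, max_tries=3):
    """The planner couples reflection and generation, repeating on failure."""
    feedback, explanation = [], ""
    for _ in range(max_tries):
        explanation = generate(reflect(solution_steps, feedback))
        ok, reason = understood(explanation)
        if ok:
            break
        feedback.append(reason)     # why the previous explanation failed
    return explanation
```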
[Figure 1. Process of the self-explanation.]

[Figure 2. Construction of the self-explanation environment (legend: Learner, Feedback information, Control flow).]
In the division phase, the problem-solving process obtained in reflection is divided into the explanation's components. In the classification phase, each divided component is classified. In the integration phase, the classified elements are assembled into an explanation. When assembling an explanation, the planner aims to make an optimal explanation. Self-explanation does not always succeed; sometimes other people cannot understand an explanation. In this case, explanation generation is carried out repeatedly. When generating an explanation again, the learner wonders why the previous explanation failed, and an explanation corresponding to the reason for the failure is formed. For a repeated explanation, additional information is communicated to the explanation planner. This additional information is also handed to the reflection process. We have already mentioned that the understanding of a problem-solving process is enhanced by the learner's reflection. Moreover, in the process of explanation generation, reflection allows progress in understanding. This occurs because, in explanation generation, the problem-solving process that appeared in the reflection process is observed once again. The explanation is a symbolizing process; therefore the classification and the reorganization of the knowledge are done strictly. Also, through failures in explanation, the work in each stage of the explanation generation process is completed in detail. As a result, it is considered that the understanding of the problem-solving process is enhanced.

3. USE OF EXPLANATION IN EDUCATION

3.1. The refinement of the self-explanation

The refinement of the explanation in our learning environment, which uses self-explanation, is described here. The refinement of the self-explanation involves five modules: 1) the explanation transformation module, 2) the evaluation module, 3) the planning explanation support module, 4) the advice control module and 5) the interface module. The support of the self-explanation process is shown in Figure 2. The learner repeats an explanation through the environment, following the advice from the system. By repeating the explanation, the explanation is gradually polished, and if the explanation becomes good enough, the learner has deepened his understanding.

3.2. The way to refine the explanation

The explanation units basically stand in a sequential line. The contents of the explanation and the structure of the explanation deeply affect its evaluation. When the structure of the explanation is appropriate, it is possible to call it a good explanation. The appropriateness of the structure depends on the object of the explanation. We believe that there is a minimum set of necessary elements which characterizes an appropriate structure [6]. In the explanation of the problem-solving process, the explanation of the initial state and the explanation of the goal state are surely necessary.
In the explanation there are partial explanations of each state, and they are connected according to the sequence of states. A partial explanation which has no connection before and after it is considered an abnormal structure. As a way of evaluating the structure of the explanation, the structure of the optimal explanation has been prepared. The way to refine an explanation is shown in Figure 3; it consists of three methods: 1) the insertion of a partial explanation, 2) the deletion of a partial explanation and 3) the division of a partial explanation. For the insertion of a partial explanation, it is necessary to make the learner repeat a partial explanation when the structure is not complete; the system advises that the partial explanation is incomplete. For deletion, the system advises the learner to remove a redundant partial explanation: when the explanation is redundant, the system indicates to the learner that the explanation is too long. In the division of a partial explanation, the learner inserts some partial explanations in order to give the explanation in more detail. The division of a partial explanation is different from the insertion of a partial explanation.
[Figure 3. Way of refining an explanation (panels: deletion of an explanation unit; insertion of an explanation unit; division of an explanation unit, turning a rough explanation into a detailed one).]
Generally, the learner tends to give only the explanation of the declarative knowledge. The system requires the learner to explain the meta-knowledge that is attached to the declarative knowledge. We believe that the division of the partial explanation changes the explanation towards more detail. The system generates advice such as "Why does Poland produce coal?" for a partial explanation such as "Poland produces coal". By using the three methods presented above, it is possible to refine the explanation of the learner. Based on the evaluation of the structure of the explanation, the kind of refinement is chosen. For the division of the explanation, the evaluation of both the structure and the contents of the explanation becomes necessary.
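The structural evaluation and the three refinement methods described above can be pictured in code. The sketch below is our own reading of the mechanism, with hypothetical names and data layout; the paper does not give an implementation.

```python
# Minimal sketch of structure-based refinement advice (hypothetical names).
REQUIRED_STATES = ("initial state", "goal state")   # minimum necessary elements

def refine(units):
    """Evaluate the sequential structure of an explanation and return advice
    of the three kinds: insertion, deletion and division."""
    advice = []
    kinds = [u["kind"] for u in units]
    for missing in (k for k in REQUIRED_STATES if k not in kinds):
        advice.append(("insertion", f"The explanation of the {missing} is missing."))
    seen = set()
    for u in units:
        if u["text"] in seen:            # redundant partial explanation
            advice.append(("deletion", "The explanation is too long; remove the repetition."))
        seen.add(u["text"])
    for u in units:
        if u["kind"] == "declarative":   # ask for the attached meta-knowledge
            advice.append(("division", f"Why? Explain the statement '{u['text']}'."))
    return advice

units = [{"kind": "initial state", "text": "Poland has large coal fields"},
         {"kind": "declarative", "text": "Poland produces coal"}]
for kind, message in refine(units):
    print(kind, "-", message)
# e.g. insertion - The explanation of the goal state is missing.
#      division  - Why? Explain the statement 'Poland produces coal'.
```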
4. SELF-EXPLANATION LEARNING ENVIRONMENT

In our learning environment we use a graphical interface. Figure 4 shows the system's appearance. Learners can input their explanations by selecting icons displayed on the screen and writing simple keywords. We believe that inputting explanations in natural language would be a burden for the learners. Learners' explanations are symbolized in order to express their structure visually. Each explanation is expressed on the explanation screen as a picture. We think that learners notice problems in the structure of their explanations from the graphical display.
[Figure 4. Overview of the self-explanation environment (windows: Main window, Advice window, Response window, Explanation expression window).]
Also, the system displays its advice on the advice screen, corresponding to mistakes in either the declarative knowledge or the meta-knowledge of the learner.

5. CONCLUSION

In this paper, we presented the cognitive model of self-explanation. Self-explanation consists of reflection and explanation production. By giving a self-explanation, the learner's understanding of the problem-solving process progresses. We showed the necessity of an environment that keeps self-explanation active. Further, with this way of supporting self-explanation, a learner who produces a good explanation appears to understand the subject well. We described the construction of the self-explanation environment: 1) the explanation transformation module, 2) the evaluation module, 3) the planning explanation support module, 4) the advice control module and 5) the interface module. We also described the ways to refine an explanation: 1) the insertion of a partial explanation, 2) the deletion of a partial explanation and 3) the division of a partial explanation. We proposed the graphic explanation environment, which helps the understanding of the learner. Also, the environment of our system integrates conventional ITS and environmental CAI [7-9].

REFERENCES
1. K. Kanenishi, T. Fujisaki and Y. Yano, Knowledge Structure for Geographical Intelligent Tutoring Systems (ITS), EAST-WEST Conference on Emerging Computer Technologies in Education, 38, 1992.
2. K. Kanenishi, T. Fujisaki and Y. Yano, Problem Solving Process in Geographical ITS, Proceedings of the IFIP OPEN Conference, 9-14, 1992.
3. A. Kashihara, K. Matsumura, T. Hirashima and T. Toyota, Load-oriented tutoring to enhance student's explanation understanding - an explanation planner and self-explanation environment -, IEICE Trans., E77-D, 1, 27-38, 1994.
4. C. Bereiter and M. Scardamalia, The psychology of written composition, Lawrence Erlbaum Associates, 1987.
5. J. R. Hayes and L. Flower, Writing research and the writer, American Psychologist, 41, 1106-1113, 1987.
6. K. Kanenishi and Y. Yano, Construction of an ITS for Geography, Proceedings of the Ninth Conference on Education Technology, 344, 1993.
7. D. Sleeman and J. Brown, Intelligent Tutoring Systems, Academic Press, London, 1982.
8. T. Hayashi and Y. Yano, Kanji Laboratory: An Environmental ICAI System for Kanji Learning, Trans. of IEICE, E77-D, 80-88, January 1994.
9. R. Okamoto and Y. Yano, Development of an Environmental ICAI System for English Conversation Learning, Trans. of IEICE, E77-D, 118-128, January 1994.
A STRATEGY AND TECHNOLOGY FOR FRONT END SYSTEM DEVELOPMENT

Linda Candy, Ernest Edmonds, Susan Heggie, Bryan Murray and Nick Rousseau
LUTCHI Research Centre, Department of Computer Studies, Loughborough University of Technology, Loughborough, LE11 3TU, U.K.

Abstract
This paper describes an approach to the enhancement of existing software and the development of new applications, based upon the premise that advanced software technology is not in itself sufficient to realise high quality, usable systems. Development strategies designed to ensure quality must be accompanied by appropriate system architectures and effective implementation tools. We describe a technology and strategy that together enable the efficient development of user and task support systems in a wide variety of contexts.

1. INTRODUCTION
There exist a wide variety of situations where the complexity of the users' activities and the range of support applications are such that explicit task support must be integrated within the total system. For example, in the field of scientific computation, it has been recognised for some time that the needs of users are not being met by most current software applications [1]. For reasons of economy and the poor availability of such expertise, it is necessary to bridge the gulf between user needs and system solutions and yet to make use of the vast amount of reliable and comprehensive software packages already on the market. Advanced technology such as graphical user interfaces and multi-media may enable developers to construct better user interfaces to existing systems. However, we argue that this must go further than providing mere "facades" [2]. If the system is to be tailored to the needs and tasks of professional expert users, there is a need for task-specific support, which is achieved by mapping the users' expertise onto the functionality of existing applications. The work described is concerned with the development of complex systems that employ reusability, task support, and the integration of application functionality. An important characteristic of the target users is that they are skilled professionals employed in critical problem-solving roles and, as such, they have considerable discretion as to how they complete their tasks. Therefore, they are at liberty to refuse to use a software system that is not suitably tailored to their needs. In consequence they need to be closely involved in a user-centred strategy for system development. The technology discussed in this paper is the Front End System (FES) technology [3]. The FES architecture and tools originated in interactive systems development, including, especially, User Interface Management (UIM) systems [4]. The development strategy and associated methods are drawn from a wide spectrum of studies in human-computer interaction and practical applications in the general software engineering field. The strategy and technology were developed in tandem and tested in industrial contexts as part of a large ESPRIT2 project.

2. STRATEGY AND METHODS

This section summarises the overall life cycle strategy for FES development. The key concept that underpins the strategy is iterative design, within which prototyping and evaluation during the process are key elements, as shown in Figure 1 below.
The approach is one of user-centred design, drawing on existing expertise [5], and a concern to offer cost-effective and practical methods for industrial contexts [6].
[Figure 1. The Iterative Development Strategy, Prototyping and Formative Evaluation (elements: concept, requirements, specification, system; field test; formative evaluation and feedback).]
2.1 Iterative Development Strategy
A significant amount of effort must be invested in the analysis of user needs and requirements early, prior to any design or implementation work [7]. While the information captured in analysis will be invaluable in supporting the system design process, it will not necessarily be sufficient to enable the design team to "get it right first time". It is often the case that information about the users' requirements will not have been captured fully during the analysis activities. That information will be obtainable only after a prototype is available to evaluate with users. With conventional approaches to development, the effort that is invested in designing and implementing a system prototype is such that it is difficult to make substantial changes and remain within the budget for the project. It is necessary, therefore, to have an early phase of rapid learning that combines analysis and design with evaluations. A second phase, in which a number of key decisions are made, takes place before further development and evaluation activity. During this phase it is still possible to make changes, but they become increasingly superficial as the knowledge is refined.
2.2 Prototyping

One of the critical factors in making iterative development a practical possibility is whether or not one can generate prototypes or simulations that embody minimal commitment whilst enabling valid evaluations to be performed. An iterative development process requires the use of evolutionary prototyping as a tool for exploration as well as development. There are three basic types of prototype: laboratory, field and delivery prototypes. Laboratory prototypes include screen representations and dialogue simulations; they may have a set of minimal functions but are not sufficiently robust to be delivered to user sites unsupported. They are used by developers to evaluate technical issues and also to carry out usability tests in experimental task scenarios. Field prototypes can be delivered to the user work environment, where they are used to evaluate the support the system provides for real tasks and the impact this has on the general work design. They need to be more robust than the laboratory prototypes but may not necessarily provide full functionality. They are used to validate the requirements specification in a more realistic situation, running in parallel with existing methods of performance. Delivery prototypes can be delivered to users to support real tasks, but may not provide full functionality. To be effective, all the prototypes must be quick to build, realistic to users and evolutionary. The technology support for prototype construction is discussed in section 4 below.
2.3 Evaluation

Evaluation may be classified as formative, measurement or diagnostic. Formative evaluation is particularly important early on and may be used with user interface mock-ups without real functionality. However, as calls to the functional units are developed, these can be included in the prototypes and incorporated in the evaluations. Later in the project, evaluations will be more diagnostic or measurement-oriented, as attention turns to minimising usability problems and achieving desired levels of performance. At the end of the day, the goal is to support the users' task performance and to provide a system that they experience as usable. It is important, therefore, to focus on the users' experience of their task and of usability [8]. The results of the evaluations may take the form of new or changed requirements which can be immediately fed back into the design process. The presentation of prototypes and design proposals represents a vital mechanism for achieving a full and accurate understanding of the objectives of the system [9]. Methods that support the different types of evaluation are essential for the evolution of the different prototypes.

2.4 Methods

Many methods have been developed to support different development strategies. There is a need for methods that can be applied at different levels of complexity, according to the needs of the problem domain and development context. No single method will be appropriate for all development contexts, and there is a need for methods that already embody considerable tailoring so that they can be picked up and used by developers, once selected from a "toolbox". The FOCUS methods address the need for 'discount' analysis methods. For that reason, they are quick to learn and easy to use for system developers. They may form the basis for evaluation reports or simply result in changes to a prototype leading to the next version. One example is the User-Software Observation Method (USOM), which involves users performing tasks and developers observing and discussing task performance with the users. It is an evaluation method which uses observational and verbal protocol techniques.

3. FRONT END SYSTEM ARCHITECTURE AND TOOLS
A Front End System is a separable user interface system that integrates new and existing applications, services and knowledge-based task support in order to provide specific users with tailored solutions. FES technology is based on an extended Seeheim model, with a distributed modular architecture that employs a client-server level of separation between the modules. It has a specialised Application Interface Module and a Support Module that addresses task support and highly interactive applications. The fundamental concepts were described by Edmonds & McDaid [10] and refined in Edmonds et al. [3]. The major components and relationships within the architecture are shown in Figure 2 below.
[Figure 2. The FOCUS Front End System Architecture (the major components, including the Application Interaction module, and their message-based relationships).]
The kind of modular architecture shown in the figure above can only work if communication between its components is simple, fast and flexible. FES communication is handled by 'messages', which are routed by the Communications Manager. The following sections briefly describe the concepts involved and the function of each component shown in the architecture.
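The paper does not publish the message format, so the following Python sketch is only an assumption-laden illustration of the kind of routing the Communications Manager performs; all field and module names are ours.

```python
# Illustrative sketch of FES-style message routing (all names are assumed).
from dataclasses import dataclass

@dataclass
class Message:
    sender: str      # originating module, e.g. "support_module"
    receiver: str    # target module, e.g. "uim"
    contents: dict   # e.g. an abstract dialogue specification

class CommunicationsManager:
    """Routes standard messages between the loosely coupled FES modules."""
    def __init__(self):
        self.modules = {}

    def register(self, name, handler):
        self.modules[name] = handler

    def send(self, message):
        handler = self.modules.get(message.receiver)
        if handler is None:
            raise KeyError(f"unknown module: {message.receiver}")
        handler(message)

cm = CommunicationsManager()
cm.register("uim", lambda m: print("UIM received:", m.contents))
cm.send(Message("support_module", "uim",
                {"action": "create_aio", "prototype": "menu",
                 "items": ["Open", "Analyse", "Report"]}))
```

The point of such a scheme is that modules depend only on the message format, not on each other, which is what allows them to be developed and replaced independently (see section 4.4).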
3.1 User Interaction Manager (UIM)

The User Interaction Manager provides user interface presentation and dialogue services to the other modules in the architecture. In addition, it controls access to the user and mediates between modules in conflict for access to the user. Its main function is to create and realise the details of the object-oriented dialogue specification it receives via the central message system. The UIM is composed of three major components. The Dynamic Dialogue Manager (DDM) enables the monitoring of all dialogue traffic and, where necessary, the limiting of access to the user because, for example, the screen is already very full and there is a clear risk of overloading the user. The Dynamic Presentation Manager (DPM) is responsible for the maintenance of the logical state of the interface and the interaction objects. It creates, updates and destroys Abstract Interaction Objects (AIOs), using a library of AIO prototypes. When an event occurs at the end-user interface, the DPM sends an appropriate return message, using the standard message format. The Physical Presentation Layer (PPL) handles those parts of the presentation function that are specific to a particular windowing system and toolkit. It is responsible for mapping instantiated interaction objects received from the DPM to the presentation system in use.
3.2 Application Interaction Manager (AIM)

If existing software is to be integrated to support a user's task, it cannot use the standard messages to communicate. Therefore, these applications must be integrated into the Front End System technology through the Application Interaction Manager (AIM) [11]. The AIM consists of two main components. The Task Manager deals with abstract, application-independent tasks. It passes a request to the Application Action Manager, which has within it all the information necessary to realise that task in an application-specific manner.
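As a hedged illustration of this two-level split (the names, templates and calls below are ours, not the actual AIM interface), the Task Manager can be pictured as dispatching an abstract task to a per-application Action Manager that knows the concrete commands:

```python
# Sketch of the AIM's two components (names and commands are illustrative only).
class ApplicationActionManager:
    """Holds everything needed to realise abstract tasks in one application."""
    def __init__(self, app_name, command_templates):
        self.app_name = app_name
        self.templates = command_templates   # abstract task name -> concrete command

    def perform(self, task, **params):
        command = self.templates[task].format(**params)
        print(f"[{self.app_name}] {command}")   # stand-in for driving the real package

class TaskManager:
    """Deals with abstract, application-independent tasks only."""
    def __init__(self):
        self.routes = {}

    def register(self, task, action_manager):
        self.routes[task] = action_manager

    def request(self, task, **params):
        self.routes[task].perform(task, **params)

stats = ApplicationActionManager("statistics_package",
                                 {"fit_model": "FIT {model} ON {data}"})
tm = TaskManager()
tm.register("fit_model", stats)
tm.request("fit_model", model="quadratic", data="experiment1")
```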
3.3 Support Modules

A Support Module communicates directly with the rest of the architecture using messages. Support Modules can have a variety of roles, and they may be re-usable in different systems. They range from having general-purpose dialogue models to having a task-specific dialogue that involves an integration of the dialogue model and an executable task model. This integration can either involve a logical separation [12] or a physical separation [14] of these models. In the latter case, individual models 'communicate' using the standard message system. Support Modules provide a range of functionality, such as analysis, complex knowledge base manipulation or visualisation tools, which are not available as Managed Applications. They typically support complex tasks and contain considerable knowledge-based task and domain support. They involve task-level integration of application functionality to support the user's task(s), and they may contain an executable model of the task. There is an integration of the task and dialogue models, so that each sub-task, or 'goal', relates to dialogue with the user to obtain data or decision information, to internal domain-specific processing, or to a task specified in the AIM's Task Manager. To obtain the data to satisfy its goals, a Support Module may, for example, obtain values for all the necessary parameters by interacting with the end user (via messages and AIOs) and then send a message to the AIM containing an application-independent task description. Alternatively, it may access the AIM to obtain data to support communication with the user.
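The goal-driven behaviour described in the last paragraph might be pictured as follows; this is a sketch under our own assumptions about the data involved, not the code of any actual Support Module:

```python
# Sketch of a Support Module satisfying its goals (assumed names throughout).
def run_support_module(goals, ask_user, send_to_aim):
    """Satisfy each goal by user dialogue (via AIOs) or by delegating to the AIM."""
    results = {}
    for goal in goals:
        if goal["kind"] == "dialogue":
            # obtain parameter values from the end user via messages and AIOs
            results[goal["name"]] = ask_user(goal["prompt"])
        else:
            # send an application-independent task description to the AIM
            results[goal["name"]] = send_to_aim(goal["task"], results)
    return results

goals = [{"kind": "dialogue", "name": "factors", "prompt": "Which factors vary?"},
         {"kind": "aim", "name": "design", "task": "generate_design"}]
out = run_support_module(goals,
                         ask_user=lambda prompt: "temperature, pressure",
                         send_to_aim=lambda task, ctx: f"<{task} with {ctx}>")
print(out)
```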
4. TECHNOLOGY SUPPORT TO STRATEGY

The strategy and technology have been described in brief. It is, however, their combination that is significant, and this combination must be a genuine integration: the integration needs to be of a reasonable depth to produce an effective solution. The sections below discuss the integration between the technology and the strategy and methods.
4.1 Support for Iterative Development

The Front End technology directly supports the strategy in two main areas: prototyping and evaluation. In addition, it provides considerable support for project management, which is important in making the strategy commercially feasible.

4.2 Support for Prototype Construction

The FES technology provides a range of support for the incremental prototyping approach, from laboratory to delivery prototypes. The FES toolkit enables the developer to specify user-system dialogues much more rapidly than using conventional programming languages. This is because one can define a dialogue at an abstract level, leaving details of presentation to the User Interaction Manager to determine. Prototypes that simulate calls to underlying functionality that may not yet have been connected can be developed. This enables the FES technology to support rapid user interface prototyping. Also, when building a standard front end (i.e. to existing functionality), one does not have to implement the underlying functionality, but just to call it. In particular, it is much quicker and easier to alter such function calls than to change the actual functional code, so prototypes offering extensive functionality can still be changed with relatively little effort. For certain FESs there will be components that represent new functionality (complex Support Modules). These will have to be developed along more conventional lines, but can be integrated with the rest of the system when they are stable. This provides a prototyping toolkit of power and flexibility. The FES AIO provides a rapid specification method with a limited learning curve and the use of defaults to further accelerate the specification process. Using the development tools for prototyping means that software redesign is only needed if fundamental changes are called for. The fidelity, scope and availability of the prototypes developed will also affect the validity of evaluation. Because the tools used are the delivery tools, the prototype will be faithful in 'look and feel' to the proposed system. The functionality and function calls that are implemented will call the Managed Applications, and the prototype will increase in breadth as these function calls are developed. However, the functional components will only be integrated when they are stable, and, therefore, the user interface prototypes will be almost continuously executable.
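To illustrate why abstract specification accelerates prototyping, here is a sketch of what an AIO-style specification with defaults could look like in Python; the real FES notation is not given in the paper, so every detail below is an assumption:

```python
# Hypothetical AIO-style abstract dialogue specification with defaults.
AIO_DEFAULTS = {"menu":  {"style": "pull-down", "font": "system"},
                "field": {"width": 20, "validator": None}}

def specify_aio(kind, **overrides):
    """Abstract specification: state only what differs from the defaults and
    leave concrete presentation to the User Interaction Manager."""
    spec = dict(AIO_DEFAULTS[kind])
    spec.update(overrides)
    spec["kind"] = kind
    return spec

# A prototype change in response to evaluation is a one-argument edit:
menu = specify_aio("menu", items=["Open", "Analyse", "Report"])
print(menu)
```

Because only the differences from the defaults are stated, responding to an evaluation finding usually means editing one argument rather than redesigning presentation code.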
4.3 Support for Evaluation

The FES technology provides a range of support for evaluation. The two main areas are the provision of facilities such as user logging, and the ability to generate changes to prototypes rapidly in a structured and well-managed manner. Logging facilities provide records of user interactions; the facility is accessed via a simple high-level switch, and there are facilities to preprocess logs to produce manageable quantities of data. The ability to change prototypes quickly, at a range of granularities, in response to evaluation is of considerable importance. This can range from changing the detail of a menu item, which involves simple text editing, to rapidly creating new interface objects using the AIO specification method. The FES technology generates the interface 'on the fly', and therefore there are no detailed representations such as State Transition Networks to edit when changing dialogue details. Finally, the modularity of the system means that a whole new module can be developed in response to evaluation and easily integrated, as the module interfaces are defined by the messaging system.

4.4 Support for Project Management

The modular construction of the architecture has important implications for the control and management of the software development process. Individual modules can be developed in parallel, and various teams can be created to construct the different modules. The development task is, therefore, broken down into manageable development components supported by the underlying architecture. The minimal interfaces between the architectural components ease the management of the development process by limiting the inter-dependencies that have to be managed.
The technology, strategy and methods have been used to develop various Front End Systems, one of which is SEPSOL [14], a complex multi-application system that supports chemists using statistical design to identify suitable models for their experiments. It employs considerable domain knowledge in a variety of roles. The support provided by the technology and methods enabled the completion of SEPSOL within the target time of twelve person-months.

5. CONCLUSION
This paper has described an approach to Front End System development that involves the combination of a development strategy and a supporting technology. It has described the main points of the strategy and the key components of the technology.

REFERENCES
1. Hague, S. and Reid, I. The changing face of scientific computing. In: Bullinger, H.-J. (Ed.), Human Aspects in Computing: Design and Use of Interactive Systems and Information Management. Amsterdam: Elsevier, 1991, pp 791-795.
2. Windsor, P. An object-oriented framework for prototyping user interfaces. In: Diaper, D., Gilmore, D., Cockton, G. and Shackel, B. (Eds.), Human-Computer Interaction: INTERACT'90. Proceedings of IFIP TC 13 Third International Conference on Human-Computer Interaction, Cambridge, UK. Amsterdam: North-Holland, 1990, pp 309-314.
3. Edmonds, E.A., Murray, B.S., Ghazikhanian, J. and Heggie, S.P. The re-use and integration of existing software: a central role for the intelligent user interface. In: Monk, A., Diaper, D. and Harrison, M. (Eds.), People and Computers VIII. Cambridge: Cambridge University Press, 1992, pp 414-427.
4. Pfaff, G.E. User Interface Management Systems. Berlin: Springer-Verlag, 1985.
5. Gould, J.D., Boies, S.J. and Lewis, C. Making usable, useful, productivity-enhancing computer applications. Communications of the ACM, 34, 1, 1991, pp 74-85.
6. Wright, P.C. and Monk, A.F. A cost-effective evaluation method for use by designers. International Journal of Man-Machine Studies, 35, 6, 1992, pp 891-912.
7. DTI/NCC, Starts Guide, 2nd Edition, 1987.
8. Whiteside, J., Bennett, J. and Holtzblatt, K. Usability engineering: our experience and evolution. In: Helander, M. (Ed.), Handbook of Human-Computer Interaction. Amsterdam: Elsevier Science, 1988, pp 791-817.
9. Edmonds, E.A., Candy, L., Lunn, S. and Slatter, P. Issues in the design of expert systems for business. Expert Systems for Information Management, 2, 1, 1989, pp 1-22.
10. Edmonds, E.A. and McDaid, E. An Architecture for Knowledge-Based Front Ends. Knowledge-Based Systems, 3, 4, 1990, pp 221-224.
11. Prat, A., Lores, J., Fletcher, P. and Catot, J.M. Back-end Manager: an Interface between a Knowledge-Based Front End and its Application Sub-systems. Knowledge-Based Systems, 3, 4, 1990, pp 225-229.
12. Murray, B.S. and Edmonds, E.A. Flexibility in Interface Design. IEE Proceedings-E, Computers and Digital Techniques (Special issue), 1993.
13. Copas, C.V. and Edmonds, E.A. Executable Task Analysis: Integration Issues. In: Cockton, G., Draper, S.W. and Weir, G.R.S. (Eds.), People and Computers IX. Cambridge: Cambridge University Press, August 1994, pp 339-352.
14. Murray, B.S., Edmonds, E.A. and Govaert, B. SEPSOL: an experimental knowledge-based front end developed using the FOCUS architecture. Proceedings of PRICAI'92, September 1992, Seoul, South Korea, pp 447-455.
A Methodology for Developing New Interaction Techniques

Deborah Hix(1,2), James N. Templeman(2), Ankush Gosain(2,3) and Kapil Dandekar(2,4)
(1) Department of Computer Science, Virginia Tech, Blacksburg VA 24061 USA ([email protected])
(2) Navy Center for AI, Naval Research Laboratory, Washington DC 20375 USA ([email protected])
(3) Department of Chemistry, Johns Hopkins University, Baltimore MD 21218 USA ([email protected])
(4) Department of Electrical Engineering, University of Virginia, Charlottesville VA 22904 USA ([email protected])

We present a methodology for inventing, implementing, and evaluating new interaction techniques. We illustrate use of this methodology with examples of some of the more interesting issues we encountered in developing a new interaction technique for head-coupled panning and zooming, called pre-screen projection.

1. INTRODUCTION

An interaction technique is a way in which a human uses a physical input/output device to perform a task in a human-computer dialogue. It abstracts a class of generic interactive tasks, for example, selecting an object on a screen by pointing and clicking with a mouse. A pop-up menu is an interaction technique. Interaction techniques are a useful research topic because they are specific enough to be studied, yet general enough to have practical applicability to a broad variety of interactive systems. But research specifically in interaction techniques often emphasizes technological creativity, while user-based evaluation of techniques is either cursory or non-existent. Thus, over the years, a plethora of interaction techniques have been developed, but to date we know little about their impact on the usability of interactive systems; we have only scant empirical evidence of whether they improve human performance and satisfaction. In this paper, we motivate the need for a methodology for developing interaction techniques, and present details of a methodology we have evolved for inventing, implementing, and evaluating new interaction techniques. We include examples of some of the more interesting issues we encountered in developing a new interaction technique, called pre-screen projection, for head-coupled panning and zooming.
2. A DEVELOPMENT METHODOLOGY FOR INTERACTION TECHNIQUES On the surface, developing a new interaction technique sounds easy. However, even after the initial idea for a new technique is conceived, designing and implementing the technique can still be difficult. Setting the new technique in the context of meaningful user tasks in a realistic setting, perhaps along with other new or existing interaction techniques, involves a myriad of unanticipated design decisions, often with unobvious consequences. Our recognition of the need for a methodology for developing interaction techniques has evolved through much experience, as we have produced several new techniques [e.g., 1, 2].
We found that such a development process should include evaluation of the effects of a new interaction technique on human performance, not simply implementation to see if a new technique operates correctly. The major components of this methodology and their relationships are shown in Figure 1. Unfortunately, as mentioned earlier, interaction techniques research often stops after the technique has been coded and tested to make sure that its software works as intended; the remaining three components are ignored. To understand the effectiveness and efficiency of new interaction techniques, it is necessary to co-evolve tasks for a user to perform with a new technique, to set those tasks in an application that can be used in an empirical study, and to perform empirical evaluations of the techniques using tasks within the chosen application.
Empirical Evaluation
J Figure 1. A methodology for interaction technique development. 2.1. Creating a New Interaction Technique Creating a new interaction technique involves many steps, beginning with its conceptualization m for example, the idea, the real world metaphor, or whatever is its underlying motivation. Conceptualization is followed by design of the technique, going from general abstractions to specific details, and prototyping to determine what use of the technique "feels like". Eventually the technique is implemented and continually refined, even before it is set in the context of user tasks or an application.
Pre-screen projection is an interaction technique that allows a user to pan and zoom integrally through a scene simply by moving the head relative to the screen. Its conceptualization is based on real world visual perception [3], namely, the fact that a person's view changes as the head moves. Pre-screen projection tracks a user's head in three dimensions and alters the display on the screen relative to head position, giving a natural perspective effect in response to a user's head movements. Further, projection of a virtual scene is calculated as if that scene were in front of the screen. As a result, the visible scene displayed on the physical screen expands (zooms) dramatically as a user moves closer. This is analogous to the real world, where the nearer an object is, the more rapidly it visually expands as a person moves toward it. Using pre-screen projection, a user wears a lightweight helmet or headband with a 3-D Polhemus tracker mounted on the front, as shown in Figure 2. As the user moves from side to side, the display smoothly pans over the world view. As the user moves closer to or further from the screen, the display smoothly zooms in and out, respectively. The virtual scene is calculated to appear as if it were 20 inches in front of the physical screen. Thus, the scene is displayed on the physical screen, but its dynamic perspective from the user's viewpoint reacts to head movements as if the scene were in front of the screen. This causes the scene to enlarge more rapidly than the screen as a user moves toward it and therefore produces a dramatic zoom. Having developed this concept to underlie pre-screen projection, we began the challenge of instantiating the interaction technique, which includes design, implementation, and
lll iterative refinement. Closely related to these activities is the evaluation plan for an interaction technique. As we proceeded with the design of pre-screen projection, we began to see its large design space of possible attributes. Each attribute could have several (and sometimes many) values, so combinatorics of the numerous design decisions became very large very fast. This meant that early design decisions must be made based on very rapid prototyping and a quick, informal evaluation cycle, rather than on fuller experimental evaluation. Especially interesting design challenges were provided by two attributes: scaling and fade in/out. For both attributes, three types of objects in the scene for the command and control application we chose (described later) could potentially be affected: a map, military icons, and descriptive text. Virtual Scene
[Figure 2. Configuration of pre-screen projection (the virtual scene is calculated in front of the physical screen).]

Scaling is an attribute that specifies which objects in the scene appear larger or smaller as a user moves the head toward or away from the screen. In all design iterations, the map scaled, because it is the predominant spatial context for icons and text. In early designs, icons did not scale (to remove any implication that their size has meaning, which it does not in our application). However, users found it difficult to zoom in on non-scaling icons, so we changed the design so that icons do scale. We also chose not to scale text associated with the icons, to maximize the amount of readable text presented at any time. Users were comfortable with a final design in which the map and icons scale, but text does not. Further, we found that pre-screen projection is good for zooming only by a factor of about 10 when using real-world linear perspective (1/z, where z is the distance from a user's head to the virtual scene in front of the screen). To overcome this, we added amplified zooming that takes effect when the user is approximately 54 inches from the screen. Fade in/out is an attribute that specifies the gradual appearance and disappearance (progressive disclosure) of objects in a scene as a user moves the head. In all design iterations, the map and icons scaled but did not fade in/out, because of their role in maintaining spatial context. We had text fading in/out at several different levels of disclosure, based on the distance of the user's head from the physical screen, in order to display as much detailed information (in the text) as possible. In early designs, text at one level faded out completely as different text from the next level replaced it. However, users found it hard to determine which level they were in as they looked for specific information. So we changed the design to fade in only additional text, beneath (rather than on top of the location of) text displayed at prior levels. Fade-in/out of text provided some other surprises. At first text was too densely packed, even after a user had zoomed quite far into the scene; making what we thought was a reasonable adjustment to that problem resulted in the text being too spread out. Text at first was displayed 75% opaque, but this proved to be too low in contrast to read easily, and also led to incomprehensible clutter when icons were close
enough together that their associated text overlapped. Again, the combinatorics of alternative designs at low levels of detail (e.g., color, font, spacing, icon shape) were huge. Frankly, many of our final decisions (e.g., the distance for amplified zooming, the distances for levels of disclosure) were based on trial and error, again supporting the need for a fast, highly iterative development methodology.
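The scaling behaviour just described can be summarized numerically. In the sketch below, only the 20-inch virtual-scene offset, the 1/z perspective law and the roughly 54-inch amplification threshold come from the paper; the reference distance and the amplification function itself are our assumptions for illustration.

```python
# Sketch of pre-screen projection scaling (amplification law is our assumption).
VIRTUAL_OFFSET = 20.0   # virtual scene calculated 20 inches in front of the screen
AMPLIFY_WITHIN = 54.0   # amplified zooming takes effect inside roughly 54 inches
REFERENCE_HEAD = 70.0   # head distance giving unit scale (our assumption)

def scene_scale(head_to_screen):
    """Zoom factor for the scene at a given head-to-screen distance in inches.
    Valid while the head stays behind the virtual scene (distance > 20)."""
    z = head_to_screen - VIRTUAL_OFFSET            # head to *virtual scene* distance
    scale = (REFERENCE_HEAD - VIRTUAL_OFFSET) / z  # real-world perspective, 1/z
    if head_to_screen < AMPLIFY_WITHIN:            # beyond ~10x, boost the zoom
        scale *= AMPLIFY_WITHIN / head_to_screen   # assumed amplification law
    return scale

for d in (70, 54, 30, 22):                         # head moving toward the screen
    print(f"head at {d} in -> scale {scene_scale(d):.1f}x")
```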
2.2. Setting User Tasks in an Appropriate Application

Our intent was to incorporate pre-screen projection into an application that would provide a realistic situation for its evaluation. This application incorporates the interaction technique(s) as used to perform specific tasks, and is the context within which experimentation is performed. The application can be a real-world one, or it can be a testbed or simulation developed specifically for purposes of interaction technique development. Panning and zooming are inherently spatial, navigational information-seeking activities; that is, a user navigates through an information space. To design and evaluate pre-screen projection, we wanted to create user tasks that would place heavy demands on a user's ability to pan and zoom. Thus, a critical aspect of our research was the co-evolution of appropriate user tasks along with the interaction technique, as indicated in Figure 1, to produce tasks for evaluating pre-screen projection, rather than for producing the best possible application. This approach re-emphasizes the definition of an interaction technique as a means of performing tasks. Prior research at the Naval Research Laboratory's Human-Computer Interaction Laboratory [1, 4] used rather simplistic, non-military, low-fidelity domains and tasks. Because we wanted our work to be relevant to specific Naval applications, we chose Naval command and control systems (hereafter called C2 systems) as a rich, realistic application domain for our interaction technique development and evaluation. These systems, simply explained, support the planning, coordination, and execution of a military mission. We incorporated pre-screen projection into a C2-like testbed and created task scenarios for evaluation. We are using this testbed and tasks to evaluate and compare many different interaction techniques for panning and zooming, especially in time-critical situations, not just pre-screen projection. We chose to develop our own simplified application, rather than using or modifying an existing C2 system, because of the complexity of such systems, and because much of the software behind them has nothing to do with the user interface. C2 is a highly diverse, extremely rich and demanding application area with a breadth and depth of tasks that its users perform. We performed a user task analysis of such systems, to know the design space from which to select possible tasks. Participants to whom we have access for evaluation are, in general, civilians with little or no military background. Thus, selecting and designing specific tasks to evaluate pre-screen projection that were rich enough, yet were unbiased and simple enough for users to learn quickly, was a crucial issue. The interaction technique and the goals for its evaluation should drive task development, but in reality they are very closely coupled. We often found that simple user tasks worked best for evaluation of our techniques, because they tend not to confound learnability and other issues of the application itself with the issues of evaluation of the interaction technique(s). We chose a set of defensive engagement tasks, in which friendly, enemy, unknown, and neutral military objects are displayed on a map. We developed several scenarios in which groups of friendly ships and planes remain constant from the start of the scenario, but additional enemies, unknowns, and neutrals appear during the course of the scenario.
Using the testbed application, the user's goals (tasks) are to acquire and maintain an awareness of an evolving situational picture, and to appropriately allocate weapons from friendly ships to encroaching enemy planes. Specifically, a user monitors the scene, largely by panning, looking for threats. When one is observed, the user then determines the number of planes in
that threat (by zooming in on textual details that are progressively disclosed as the user's head moves closer to the screen) and the number of missiles the friendly has available to fire at the enemy. The user then uses a slider (not pre-screen projection) to enter the number of missiles to shoot at the enemy and fire those missiles.
2.3. Performing Empirical Evaluations

As emphasized, much of developing a new interaction technique is evaluation. The first version (prototype) of pre-screen projection simply moved a couple of objects around on a solid background. This was fine for debugging the implementation, but gave us little information about how pre-screen projection affects user performance. Two kinds of evaluation are most appropriate for interaction techniques: formative and summative [5]. Formative evaluation is observational evaluation with users, early and continually throughout user interface development (in this case, of a specific interaction technique), with both qualitative and quantitative results. Summative evaluation is experimental evaluation with users comparing two or more interaction techniques for performing the same user tasks, with primarily quantitative results. We have performed numerous cycles of formative evaluation, some as short as five minutes, others lasting nearly an hour. Evolution of the design of scaling and fade in/out of text and graphics, described previously, as well as virtually all other decisions about design details, came from many rounds of formative evaluation. The next step in our research is to perform summative evaluation (see summary section). Both formative and summative evaluation need to assess two key aspects of any interaction technique: cognitive/perceptual (mental) issues and articulatory (motor) issues [6]. A user's ability to physically interact with the technique should be evaluated first. If the motor demands of a technique are such that a user cannot manipulate it, then whether that technique is useful at a higher, mental level of task performance is meaningless. To determine motor issues surrounding pre-screen projection, we had users perform a series of simple information-seeking tasks, in order to learn to use the technique, before we asked them to perform the more cognitively difficult defensive engagement task. These motor tasks also help a user learn to manipulate a new interaction technique, before moving into the more complicated cognitive tasks.
3. RELATED WORK

Most work on interaction techniques typically reports only on the technique itself, with little or no mention of its development process and especially little or no evaluation. The notable exceptions that do discuss evaluation of interaction techniques generally report predominantly summative evaluations; they include fish tank virtual reality [7], the alphaslider [8], the Toolglass [9], marking menus [6], and eye tracking [10]. Few papers discuss formative evaluation of interaction techniques, and we are aware of no papers that present a methodology for developing interaction techniques.
4. SUMMARY & FUTURE

In summative evaluations, we are comparing user performance and satisfaction using pre-screen projection to other interaction techniques for panning and zooming, using the defensive engagement task. We intend to provide, by these evaluations, baseline metrics for performing various kinds of tasks for panning and zooming. These baseline metrics can then be used to comparatively evaluate other new interaction techniques for panning and zooming. Without such baselines, the cost of interaction technique evaluations is too great,
because every new technique would have to be compared to every existing technique, a prohibitive exercise. We have discussed a methodology for developing new interaction techniques that focuses not merely on their creation and coding, but also on their effectiveness and efficiency for users. We have found that it is almost impossible to set the development (creation and evaluation) of new, novel interaction techniques such as pre-screen projection into a structured software engineering environment (at least at a micro level); developing a new interaction technique is a form of invention and evolution that does not readily conform to current software engineering practices. It is a process of scientific discovery, rather than pure engineering. Our best guesses about the design of pre-screen projection, and of other interaction techniques we have developed over the years, were substantiated or refuted by many tight, short cycles of formative evaluation with users performing realistic tasks in a testbed application. The many difficult issues and resulting unexpected design decisions could only have been encountered and resolved using a development methodology, such as we have described, that supports this kind of fast, effective iteration.

ACKNOWLEDGMENTS

Dr. Rob Jacob and Linda Sibert contributed many excellent ideas to our interaction technique research and methodology. This work is sponsored by the Decision Support Technology block (RL2C) within the ONR Exploratory Development Program, and is managed by Dr. Elizabeth Wald.

REFERENCES

1. Jacob, R.J.K. (1991). The Use of Eye Movements in Human-Computer Interaction Techniques: What You Look At Is What You Get. ACM Trans. Info. Sys., 9(3), 152-169.
2. Hix, D., J.N. Templeman, and R.J.K. Jacob. (1995). Pre-Screen Projection: From Concept to Testing of a New Interaction Technique. Proc. CHI'95 Conf.
3. Gibson, J.J. (1986). The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, Hillsdale, NJ.
4. Jacob, R.J.K., and L.E. Sibert. (1992). The Perceptual Structure of Multidimensional Input Device Selection. Proc. CHI'92 Conf., 211-218.
5. Hix, D., and H.R. Hartson. (1993). Developing User Interfaces: Ensuring Usability through Product & Process. John Wiley & Sons, Inc., New York, NY.
6. Kurtenbach, G., A. Sellen, and W. Buxton. (1993). An Empirical Evaluation of Some Articulatory & Cognitive Aspects of Marking Menus. Human-Computer Interaction, 8, 1-23.
7. Ware, C., K. Arthur, and K.S. Booth. (1993). Fish Tank Virtual Reality. Proc. InterCHI'93 Conf., 37-41.
8. Ahlberg, C., and B. Shneiderman. (1994). The Alphaslider: A Compact and Rapid Selector. Proc. CHI'94 Conf., 365-371.
9. Kabbash, P., W. Buxton, and A. Sellen. (1994). Two-Handed Input in a Compound Task. Proc. CHI'94 Conf., 417-423.
10. Ware, C., and H.T. Mikaelian. (1987). An Evaluation of an Eye Tracker as a Device for Computer Input. Proc. CHI+GI'87 Conf., 183-188.
Basic Models for User Interface Design: Tasks, Users, Data, and Interaction Devices
Chris Stary
TAU-Group at the Department for Information Systems, University of Technology Vienna, Paniglgasse 16, A-1040 Vienna

In this paper a representation scheme for the specification of the users of an interactive system and of its functionality, in a declarative as well as a procedural way, is introduced. Based on this scheme, a design methodology is given that does not end with the isolated treatment of interface components but rather provides a sound integration strategy.

1. THE PROBLEM
For interface design, several factors have to be considered, as given, e.g., in [1]:
• the users of an interactive system, i.e. all those individuals accomplishing tasks by manipulating interaction devices and problem domain data at the user interface of a computer system;
• the task-specific functionality of the operations on problem domain data as well as on interaction devices, including a procedural specification.
In order to bridge the gap between the control flow on application data and on interaction devices, several proposals have been developed, such as starting system design with a separated data model of the problem domain, e.g. [2]. Unfortunately, none of these approaches has led to a representation of the control knowledge required for the integration of knowledge about the users, their perception of tasks, problem domain data and interaction devices. The situation will not change as long as specification techniques do not provide an integrated representation of the knowledge mentioned above at the semantic design level. For instance, this has been shown for object-oriented specification [3]. In order to provide support for task- and user-oriented interface design, designers have to be provided not only with a representation scheme (to know how to represent the results of task- and user-analysis), but also with a methodology (to know what kind of activities to perform in the design process itself). In the following we provide an object-oriented concept for the integrated representation of the results of task- and user-analysis, as well as generic steps to follow in order to complete a specification by handling and relating the addressed interface components. In contrast to existing approaches, this approach captures all knowledge relevant for interface development, since it explicitly takes into account the actual context of user interfaces - tasks and users. Another difference from most of the existing development paradigms is the top-down strategy, which allows for stepwise refinement and decomposition in a natural way.
2. THE TADEUS APPROACH
In the TADEUS (Task Analysis / Design / End User Systems) approach we put the following components into mutual context: task models, data models, user models and interaction models. Task models comprise the decomposition of end-user tasks according to the economic and social organization of work. Data models provide the static and dynamic information about the functionality of the system, which has to be derived from a task model. Interaction models capture all devices and styles that users require in the course of interaction; all presentation and manipulation issues concerning tasks, functions, and data structures have to be specified and refined in an interaction model. User models detail the individual completion of tasks as well as individual features for the manipulation of data structures and interaction devices; they take into account personal experience and preferences, personal access modalities to tasks and data, individual task organization, and social conventions at the workplace. Since we propose a unifying object-oriented scheme and notation, stemming from Object-Oriented Systems Analysis as proposed in [4], the knowledge about what is going on in a system, and how tasks are accomplished using an interactive system, can be kept encapsulated throughout analysis, design and implementation. Moreover, since the design is specified in an object-oriented notation, code can be generated automatically according to the models and the integration steps performed in the design phase. We will use the following sample problem domain: an airline agent has to be supported when handling customer requests for flights, composed of dates, destinations, etc. The airline agent tries to find an appropriate flight for the acquired options: the request data are matched against a flight database containing the available flights. If one or more flights match the request data, the airline agent offers them to the customer. If a flight is accepted by the customer, it is booked and the airline agent issues a ticket for the selected flight.
2.1 Task Modeling
The starting point for design is a proper representation of the knowledge acquired through task and user analysis. It leads to the task model depicted in Figure 1. In the ORM (Object-Relationship Model), rectangles are symbols for classes, and triangles denote superclass-subclass relationships. Figure 1 shows the static part of a task model for the example. It contains the object classes 'FlightRequest' and 'TicketRequest' as subclasses of 'Request'. Figure 2 shows a part of the dynamic specification as a state net diagram, i.e. the Object-Behaviour Model (OBM) of the task model for the example. States are represented by rounded rectangles with a state name. Each state-net diagram is related to a class by putting the diagram into a tagged rectangle. Transitions are also represented as rectangles, but are always located between states. The top section of a transition contains a trigger description: a trigger provides the condition that, if met, causes the transition to fire. The bottom section of the transition rectangle contains the action that is performed when the transition fires; an action may be composed of several operations.
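To make the object-oriented flavor of the scheme concrete, the following minimal Python sketch models the static ORM part (classes with superclass-subclass relations) and one transition of the OBM state net for the flight-request example. The class and trigger names are taken from the example, but the data layout itself is an illustrative assumption, not the TADEUS notation.

```python
# Illustrative sketch of an ORM hierarchy and an OBM state-net fragment
# for the flight-request example (the representation is assumed).

class Request:                      # ORM class
    pass

class FlightRequest(Request):       # superclass-subclass relationship
    pass

class TicketRequest(Request):
    pass

# A state-net fragment for the class AGENT: each transition carries a
# trigger (condition, top section) and an action (bottom section).
agent_state_net = {
    "states": ["idle", "flight dbase loaded"],
    "transitions": [
        {"from": "idle",
         "trigger": "customer calls or enters office",
         "action": "load flight dbase",
         "to": "flight dbase loaded"},
    ],
}

def fire(net, state, event):
    """Return the successor state if some transition's trigger matches."""
    for t in net["transitions"]:
        if t["from"] == state and t["trigger"] == event:
            print("action:", t["action"])
            return t["to"]
    return state

print(fire(agent_state_net, "idle", "customer calls or enters office"))
```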
[Figure 1 shows the ORM of the task model: the class AGENT 'handles' requests, requests are 'uttered' by customers, and FLIGHT REQUEST is a subclass of REQUEST, alongside TICKET REQUEST.]
Figure 1: ORM of the Task Model
[Figure 2 shows a state net for AGENT: from state 'idle', the trigger '@customer calls or enters office' loads the flight database; further transitions cover the cases where a flight database with options exists, where a paper version must be used, and matching the options.]
Figure 2: A Part of the OBM of the Task Model
2.2 Data Modeling
The data model is a structured set of data (e.g. flight request) and elementary operations (e.g. match) in the work domain (in our case flight reservation) on which user tasks, such as searching for a flight, operate. Due to space limits, the ORMs ('Flight', 'Request', 'Ticket') and the OBMs have to be omitted.
2.3 Interaction Modeling
The interaction model is composed of static and dynamic specifications concerning dialogue devices. It also comprises particular interaction styles integrating several interaction devices. For example, direct manipulation can be achieved by combining windows, menus, icons, a mouse, and a keyboard; the control flow is defined through window and icon manipulation by selecting menu items using mouse clicks and/or key strokes. The abstraction from interaction devices to styles makes it possible to specify user- and task-specific interaction.
[Figure 3 shows an ORM fragment: the class FLIGHT REQUEST 'is assigned to' the class FORM.]
Figure 3: A Part of the ORM for the Assignment of Data of the Problem Domain to the Interaction Domain
Besides generic devices and styles, in this step tasks and problem domain data have to be assigned to those devices and styles. Figure 3 shows the part of the ORM where the designer specifies that flight requests are presented in forms to airline agents. Flight options are assigned to fields of the corresponding form. The result of this step is a customized interaction model that covers all task- and data-specific devices, such as the flight-request form and the ticket form.
2.4 User Modeling
For the specification of user groups, the organization of tasks has to be considered again. For the tasks at hand, the agent is the only user group that has to be considered for refinement. Customers, the second type of persons involved, do not get in touch with the interactive system directly; rather, they deliver the data for acquiring flight options and issuing tickets. The other candidates for user groups, namely managers and clerks, are not involved in handling customer requests.
[Figure 4 shows an ORM fragment for the user group AGENT, including the relationship 'is able to change password'.]
Figure 4: A Part of the ORM of the User Model
Figure 4 shows the ORM for the user group 'Agent'. The specification shown comprises the modalities for accessing problem domain data (e.g., the airline agent is not allowed to change flight data in the database) as well as features for designing an individual work space, such as setting the layout of flight information.
2.5 Application Behaviour Modeling
When discussing the interaction model and the user model we did not deal with dynamic integration. We do so in this section, where we discuss the integration of control flows, in order to give the designer an impression of the overall behaviour of the specified system. In general, two issues have to be taken into account for dynamic integration:
• the synchronization of operations on the interaction devices with the operations on the problem domain data;
• the work space of each user group, i.e. all tasks and operations, including the degree of freedom to rearrange tasks and the presentation of problem domain data.
To demonstrate the second issue of behaviour modeling, Figure 5 shows a part of the activity space for agents in an OBM. The specification integrates all task- and user-related features provided for airline agents handling flight and ticket requests; it subsumes elements from the task, data, interaction, and user models. After the overall behaviour of the interactive system and the work spaces for all user groups have been specified, the functional specification can be started.
[Figure 5 shows a state net for AGENT integrating the activity space: from state 'idle', the trigger '@customer calls or enters office' loads the flight database, and further transitions (e.g. '@manager sick': manage flights) cover additional duties.]
Figure 5: A Part of the OBM of the Agent's Activity Space
3. CONCLUSIONS
Since user interface designers are still struggling for a structured representation that takes into account all knowledge relevant for task- and user-oriented user interface development, an integration scheme for the specification of users and of system functionality, in a declarative as well as a procedural way, has been introduced. The related design methodology does not end with the isolated treatment of interface components but rather provides a sound integration strategy for global behaviour modeling. These achievements result in the TADEUS environment, which is currently being implemented to allow designers to specify the acquired knowledge, comprising tasks, users and their refinements. In addition, it will support the automatic generation of application prototypes.
References
[1] Johnson, P.: Human-Computer Interaction. McGraw-Hill, London, 1992.
[2] Janssen, Ch., Weisbecker, A., Ziegler, J.: Generating User Interfaces from Data Models and Dialogue Net Specifications. In: Proc. INTERCHI'93, ACM/IFIP, 1993, pp. 418-423.
[3] McDaniel, S., Olson, G.M., Olson, J.S.: Methods in Search of Methodology - Combining HCI and Object-Orientation: A Case. In: Proc. CHI'94, ACM, 1994.
[4] Embley, D.W., Kurtz, B.D., Woodfield, S.N.: Object-Oriented Systems Analysis. A Model-Driven Approach. Yourdon Press, Englewood Cliffs, 1992.
III.6
Screen Design 1
The effects of realistic versus unrealistic desktop interface designs on novice and expert users
L. A. Miller(a) and K. M. Stanney(b)
(a) Department of Psychology, University of Central Florida, PO Box 25000, Orlando, FL 32816
(b) Department of Industrial Engineering and Management Systems, University of Central Florida, PO Box 25000, Orlando, FL 32816
Metaphors are one tool which designers have used to bridge the gap between technology and the novice user. While metaphors work in theory, the metaphor often falls apart or breaks down during seemingly logical procedures. It stands to reason that a metaphor which approximates human-environment interaction, and remains consistent in the analogy, would be a more valid test of the benefits of a metaphor. In this study, novice and expert subjects were given a set of four computer-based editing tasks, each of which was composed of six subtasks. The dependent variable was time to complete the task. It was found that novice subjects benefited significantly from the use of the realistic desktop design, while there were no significant differences in the performance of expert users.
1. INTRODUCTION
Technological advances in computer hardware and software have brought faster systems, larger memories, bigger hard drives, and a plethora of different applications in which to use these innovations. Yet despite these great advances in technology, the designer's knowledge and understanding of the computer system has not really changed (Booth, 1989). It is evident that more natural and convenient means for users and computers to exchange information are needed (Hartson & Boehm-Davis, 1993; Jacob, Leggett, Myers, & Pausch, 1993). Recently, many designers have been trying to make computers and their applications "user friendly." "User friendly" is a term which means that the computer can interact with a human much the way humans interact with each other (Eberts, 1994). While this design paradigm provides insight into how the
usability of a system could be enhanced, it is limited in its scope. This paradigm should be extended by stating that the computer should interact with a human much the way humans interact with each other and with their environment. If today's software programs were similar to human-human communication and human-environment interaction, it is possible that we would not have so many individuals who feel uncomfortable, or even frightened, using the computer. A mental model often helps novice computer users understand the relationships and functions of target systems. A mental model works because it is set up to parallel the model of the system it emulates. According to Johnson-Laird (1980, as cited in Rasmussen, 1990, p. 44), "A natural mental model of discourse has a structure that corresponds directly to the structure of the state of affairs that the discourse describes." A common means of aiding the formation of this mental model is to incorporate a metaphor into the computer design. Metaphors aid the user in understanding the capabilities of the computer system and the relationships between its components by showing that the system "acts like" something the user is more familiar with. Although this practice works in theory, the metaphor often "breaks down" during seemingly logical procedures. The purpose of this study was to compare a realistic desktop design to an unrealistic design, such as the Windows version. A realistic design is defined here as one that preserves the critical relationships that exist for human-human communication and human-environment interaction. It was hypothesized that there would be no significant difference between the desktop interfaces for expert users; however, novice users were expected to perform significantly better using the realistic desktop. The objectives of this paper are (1) to determine if there is a substantial difference between the realistic and unrealistic desktop interfaces as mapped through the GOMS method, and (2) to investigate the effects of the two interface designs on performance times of experts and novices.
3. METHOD
3.1 Subjects
Twenty subjects (10 computer experts and 10 novices) were recruited from engineering and psychology courses. They were given course credit for their participation. Each subject's computer experience was first determined, and then they were tested on both the realistic and unrealistic desktop designs.
3.2 Materials
The materials consisted of a nonrealistic desktop interface (a Windows-like environment) and a realistic desktop interface. The realistic design looked much like an office would; it included representations of a desk, phone, file cabinet,
Rolodex, printer, trash can, and calendar. The interfaces were displayed on an IBM-compatible 486-DX66. A mouse and keyboard were provided to manipulate the icons on the screen and enter data. The computer recorded the amount of time it took to complete each task.
3.3 GOMS Analysis
Before building the prototype, a predictive analysis was conducted on both the realistic and unrealistic desktop interfaces to determine where in the analogy the metaphor broke from the physical world. The four tasks studied were: finding a file, deleting a file, printing a file, and telephoning a client. Applications in both the realistic and unrealistic design interfaces were mapped to real-world tasks by the GOMS method to determine how closely each of the interfaces approximates the actual procedures an individual would use in real life (Card, Moran, & Newell, 1983). For example, the physical task of dialing a phone might involve:
• looking up the phone number in the Rolodex by thumbing through the cards
• finding the correct number
• picking up the receiver
• dialing the number.
In comparison, the realistic desktop involves:
• looking up the phone number in the Rolodex by clicking the cardfile and flipping through the cards
• finding the correct number
• pressing a "dial" button on the card.
The unrealistic desktop involves:
• opening the accessories window
• double-clicking the cardfile icon
• clicking on the <...> menu
• clicking on <...> in the File menu
• double-clicking the correct cardfile entry (you can conceivably have more than one cardfile)
• flipping through the Rolodex cards
• finding the correct phone number
• opening up the terminal application (also in Accessories)
• arranging the windows so that both the cardfile and terminal applications can be read (or relying on memorization of the phone number)
• clicking the <Settings> option
• clicking <...>
• reading the phone number in the cardfile application
• typing the number into the dial dialogue box in the terminal application
• pressing enter.
As seen in this example, the realistic desktop interface is better able to preserve the critical relationships of the real-life task. The realistic metaphor maintained the first two procedures of the real-life procedure and then broke from the metaphor on the third. While the original procedures (i.e., picking up the receiver and dialing the number) could have been maintained, a cost-benefit analysis clearly indicated that breaking from the metaphor was beneficial. The break required users to press one button (i.e., the "Dial" button) rather than keying in the entire phone number. The cost (i.e., the break in the metaphor) was outweighed by the benefit (i.e., fewer keystrokes and an intuitive alternative). The realistic metaphor is nevertheless more consistent than the unrealistic design, which used embedded windows and dialog boxes behind the front-end screen. Using the GOMS method it was determined that for every task performed (finding a file, deleting a file, printing a file, and telephoning a client), the unrealistic desktop interface broke from the metaphor at an earlier stage than the realistic design.
3.4 Procedure
Each subject participated in one two-hour session. Subjects were asked to fill out a demographic page and the Windows Computer Experience Questionnaire (WCEQ). The WCEQ was based on input from a subject matter expert and was developed particularly for this study. The results from the WCEQ were used to assign subjects to either the novice or the expert subject group. Novice and expert subjects were then given a set of four tasks to accomplish on both the realistic and unrealistic desktops. These tasks included finding a file, deleting a file, printing a file, and dialing a number. Subjects were instructed to engage in each of these tasks six times (i.e., finding six different files), for a total of 24 trials. To eliminate effects of interface presentation order, half the subjects were first tested on the realistic design, while the other half were first tested on the unrealistic desktop design; the order in which these occurred was random. The dependent variable was time to complete the task. The computer recorded the data.
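To make the comparison in Section 3.3 concrete, the sketch below counts the operator steps for the telephoning task and finds where each interface first breaks from the physical procedure. The step lists paraphrase the descriptions above, and the True/False flags marking which steps still follow the metaphor are our reading of the text; this is an illustration, not the study's actual GOMS model.

```python
# Counting operator steps for the "telephoning a client" task on each
# interface. Each step carries a flag: does it still follow the metaphor?

physical = [
    "look up the number by thumbing through the rolodex",
    "find the correct number",
    "pick up the receiver",
    "dial the number",
]

realistic = [
    ("flip through the on-screen cardfile", True),
    ("find the correct number", True),
    ("press the Dial button", False),        # deliberate, beneficial break
]

unrealistic = [
    ("open the accessories window", False),  # breaks immediately
    ("double-click the cardfile icon", False),
    ("open the cardfile entry via the File menu", False),
    ("flip through the rolodex cards", True),
    ("find the correct number", True),
    ("open the terminal application", False),
    ("arrange the windows", False),
    ("open the dial settings", False),
    ("read and type the number, press enter", False),
]

def break_step(steps):
    """1-based index of the first step that departs from the metaphor."""
    for i, (_, follows) in enumerate(steps, start=1):
        if not follows:
            return i
    return None

for name, steps in [("physical", [(s, True) for s in physical]),
                    ("realistic", realistic), ("unrealistic", unrealistic)]:
    print(f"{name}: {len(steps)} steps, breaks at step {break_step(steps)}")
```

Run as written, this reports the realistic desktop breaking at step 3 and the unrealistic one at step 1, mirroring the conclusion above.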
4. RESULTS
An ANOVA revealed a significant interaction between the times to perform on each of the interfaces and expertise level (F(1,18) = 10.7, p < .05).
(20)
together with (18) and (1). Herein ζ is a scalar. Notice that °f(t) is the 3-dimensional vector representing the shape of the desirable dynamic font. Consequently, the dynamic font is designed by the following three steps: (i) specify a control polygon corresponding to the desirable font, together with the parameters related to the shape of the writing implement, i.e., ζ and η appearing in (19) and (20) (or originally in (15)); (ii) derive the dynamic font by computing (1), (18), (20), and (19); and (iii) if you would like to modify the resulting font, change the control polygon by using suitable operations on control polygons, and/or the parameters, then go back to step (ii).
5.2. Simulation examples
We now show some examples of dynamic fonts. In each figure below, (a) shows the generated font, (b) the font and the corresponding control polygon shown in the °O-°X°Y°Z
space, and (c) those shown on the °O-°X°Y plane. Figure 1 is an example of a cursive letter "a". Figure 2 shows the concatenated fonts "ac", whose motion M_ac is generated by using the operation M_ac = concatenate(M_a, M_c). Thus, in the present framework, various complex fonts, including concatenated fonts like continuously handwritten letters or characters, can be generated in an easy way. Figure 3 is an example of a cursive Chinese character. Figure 4 shows a rounded Gothic font. Note that the control polygon shown in the °O-°X°Y plane in Figure 4(c) is the same as that in Figure 3(c); the control polygon of Figure 3 was designed referring to that of Figure 4. Thus, the framework offers the possibility to design fonts of different typefaces in an easier manner.
6. CONCLUSIONS
A new concept of "dynamic font" and its generation method [1] were proposed. The dynamic font was generated by intersecting a virtual writing implement with a virtual plane and moving the implement continuously in both space and time, subject to the designed writing motion. The writing motion was defined by using the concept of "unit motions" [2], and this made possible a local and dynamic generation of motions and fonts, as if a human wrote such fonts in real time [3]. In order to build various motions and fonts, a notion of operations on motions [2] was also included. These comprise spatial operations such as scaling, translating, rotating, and tilting a motion; structural operations of joining two motions and separating a motion into two make it possible to generate any sequence of continuously connected cursive fonts. The writing motion was represented formally as a sequence of weighting coefficients for unit motions [2]. The sequence forms a "control polygon" geometrically and is used effectively to design the motion as well as the dynamic font. Several simulation examples were demonstrated, using an elliptic cone and a simple planar plane as examples of the virtual writing implement and the plane to write on. The figures showed that fonts of quite different typefaces were generated from similar control polygons, differing only in the rates and timings of putting the implement up and/or down and in the shapes of the implements.
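The operations on motions can be pictured with a small sketch. Below, a motion is represented simply as a list of control-polygon vertices with a pen-height coordinate, and concatenate and scale are implemented over that representation; the data layout and the letter shapes are illustrative assumptions, not the paper's actual unit-motion formulation.

```python
# Toy model of "operations on motions": a motion is a list of
# control-polygon vertices (x, y, z), where z encodes pen height
# (z > 0 means the implement is lifted off the writing plane).

def scale(motion, factor):
    """Spatial operation: scale a motion about the origin."""
    return [(factor * x, factor * y, z) for (x, y, z) in motion]

def translate(motion, dx, dy):
    return [(x + dx, y + dy, z) for (x, y, z) in motion]

def concatenate(m_a, m_b, gap=1.0):
    """Structural operation: join two motions, shifting the second to
    the right of the first so letters connect like handwriting."""
    rightmost = max(x for (x, _, _) in m_a)
    return m_a + translate(m_b, rightmost + gap, 0.0)

# Coarse control polygons for two hypothetical letters.
motion_a = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (2, 0, 1)]  # ends with pen lift
motion_c = [(0, 1, 0), (-0.5, 0.5, 0), (0, 0, 0)]

motion_ac = concatenate(motion_a, motion_c)  # analogous to M_ac above
print(len(motion_ac), "control vertices")
```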
REFERENCES
1. K. Takayama, H. Kano, Y. Maeda, K. Misue, S. Hosogi, and K. Sugiyama, "Dynamic font: its concept and generation method" (in Japanese), in Proc. Graphics and CAD Symp., IPSJ (Tokyo, Japan, Sep. 1994), pp. 181-190; also to be published in ISIS Research Report, Fujitsu Laboratories Ltd., 1995.
2. K. Takayama and H. Kano, "A new approach to synthesizing free motions of robotic manipulators based on a concept of unit motions," to appear in IEEE Trans. Syst., Man, Cybern., vol. 25, no. 3, 1995.
3. K. Takayama and H. Kano, "A mathematical interpretation on the organization of free motions of arm" (in Japanese), in Proc. 4th SICE Symp. on Biological and Physiological Engineering (Tokyo, Japan, Nov. 1989), pp. 291-294.
Figure 1. A cursive letter "a".
Figure 2. Concatenated fonts "ac".
Figure 3. A cursive Chinese character.
Figure 4. A rounded Gothic font.
A Spatial Data Structure for the 3D Graphical Facility Management System
Takashi Tamada, Teruhiko Teraoka, Minoru Maruyama and Shogo Nishida
Information & System Science Department, Central Research Laboratory, Mitsubishi Electric Corporation, 1-1, Tsukaguchi-honmachi 8-chome, Amagasaki, Hyogo 661 JAPAN
ABSTRACT
Geographical information systems (GISs) are now used extensively in the analysis of environmental data, due to their capability to manage, manipulate, and display spatial data. However, most GIS applications (such as facility management systems) possess data management structures that deal with only two-dimensional space. A 3D system that enables users to explore 3D space interactively and examine 3D spatial views of the environment would be very useful. In this paper, we propose a spatial data management method for a 3D graphical facility management system. Using this method, we have developed a prototype 3D graphical management system that offers interactive operations within 3D virtual city environments.
1. INTRODUCTION
Geographical information systems have been used for land and facility management [1,2,3]. In conventional GIS applications, there are several significant problems: for example, GISs handle only two-dimensional data, and the capability to support user interaction with the data is insufficient [2]. Recently, much work has been done using 3D graphics to visualize geometric objects in areas such as virtual reality and scientific visualization. The emerging technologies of 3D graphics and interactive animation can be effectively exploited to improve management of, and access to, large information spaces [4], and it seems plausible that 3D space can be used to maximize effective use of screen space. From these observations, we have attempted to develop a graphical facility management system utilizing a 3D virtual environment instead of 2D representations. Although the increasing power of graphics hardware and software technology has made it possible to create, explore and manipulate 3D virtual environments, if the virtual environment contains too many polygons it is still hard to render them interactively. For example, a complex scene such as a very large city may contain millions of polygons; this is far more than currently available workstations can render interactively. It implies that some techniques are required to reduce the computational cost of object rendering [5].
This paper proposes a method of 3D virtual city management for interactive operations based on a 2D hierarchical data structure. Using this method, we have developed a 3D graphical facility management system. Even when the system manages a virtual city consisting of ten thousand objects, the user can perform various interactions with the virtual city environment, such as rapid walkthrough, ray-intersection picking for object selection, changes of spatial views, and so on.
2. THE MANAGEMENT OF 3D VIRTUAL CITY ENVIRONMENTS
In a geographical information system, the user must interact with the data to solve many problems. Therefore, interactive performance is very important; it depends particularly on the frame update rate, and to realize interactive performance a frame update rate of around 10 frames/sec is required. In this section, we propose a spatial data management method for a 3D graphical facility management system that offers the user interactive operations with the 3D virtual city environments. Our method targets the management of a virtual city which contains objects not only on the ground but also under the ground. In our method, each 3D object in the virtual city is managed by using its 2D projection onto the ground. The projected 2D shape is managed by a 2D hierarchical data structure (such as a quad-tree [6] or MD-tree [7]) specialized for managing geometric objects efficiently. Given a view volume, the objects to render are found rapidly using this data structure. The search area is the 2D projection of the view volume; potentially visible objects whose 2D projections intersect with the area are retrieved by a range search. When the user is walking on the ground, the objects under the ground are not visible. For efficient display control, the objects on the ground and the objects under the ground are managed separately by distinct 2D trees (layers), as shown in Fig. 1. When the user is exploring the space under the ground, all the layers (2D trees) are used; on the other hand, when the user is walking on the ground, the layer that manages the space under the ground is usually not used. Moreover, the space on the ground is further sliced into multiple layers depending on height (Fig. 1). As Fig. 1 shows, all the objects on the ground are managed by the tree (layer) T0; if an object is higher than a given threshold, it is also managed by another tree T1. In a city area, an object far from the viewpoint is likely to be occluded by other objects closer to the viewer, which implies that a distant object is visible only if it is tall enough. Based on this heuristic, the search area (i.e. the 2D projection of the view volume) is divided into two regions. During a walkthrough on the ground, potentially visible objects in the region close to the viewpoint are searched using layer T0; in the region far from the viewpoint, visible objects are searched using the higher layer T1. With this technique, we can expect to obtain a natural scene interactively without increasing the number of objects to render.
[Figure 1 sketches the layered management: one 2D tree (layer) manages the objects under the ground, layer T0 manages all objects on the ground, and layer T1 manages only the objects taller than the threshold.]
Fig. 1: Object management by multiple 2D-trees based on height
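A minimal sketch of the two-layer query follows. A uniform grid stands in for the quad-tree or MD-tree (any 2D index with a range search would serve), each object's footprint is reduced to a point, and the cell size and height threshold are assumed values.

```python
# Sketch of the two-layer visibility query described above.
from collections import defaultdict

class Layer:
    """A 2D spatial index over object footprints (here: a simple grid)."""
    def __init__(self, cell=50.0):
        self.cell = cell
        self.grid = defaultdict(list)

    def insert(self, obj_id, x, y):
        self.grid[(int(x // self.cell), int(y // self.cell))].append(obj_id)

    def range_search(self, xmin, ymin, xmax, ymax):
        found = []
        for cx in range(int(xmin // self.cell), int(xmax // self.cell) + 1):
            for cy in range(int(ymin // self.cell), int(ymax // self.cell) + 1):
                found.extend(self.grid[(cx, cy)])
        return found

t0 = Layer()          # all objects on the ground
t1 = Layer()          # only objects taller than the threshold
TALL = 30.0           # assumed height threshold

def insert_object(obj_id, x, y, height):
    t0.insert(obj_id, x, y)
    if height > TALL:
        t1.insert(obj_id, x, y)

def visible_candidates(near_rect, far_rect):
    """Near region: search T0 (everything). Far region: search T1 only,
    since a distant object is visible only if it is tall enough."""
    return set(t0.range_search(*near_rect)) | set(t1.range_search(*far_rect))

insert_object("house", 10, 10, 8.0)
insert_object("tower", 400, 400, 120.0)
print(visible_candidates((0, 0, 200, 200), (200, 200, 600, 600)))
```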
3. ADAPTIVE CONTROL OF DETAILS
By the object management method described above, we obtain an efficient visibility-culling function. However, sometimes too many polygons are visible from certain viewpoints to render interactively, and the number and complexity of the potentially visible objects may change suddenly when the user turns around a corner. In order to guarantee an interactive frame rate and frame-to-frame coherence even in those cases, an adaptive algorithm for level of detail is employed. In our method, every potentially visible object is rendered according to the value of its reality parameter [8]. The value of the reality parameter for each object is calculated at each frame from the total amount of detail in the scene, the distance from the viewpoint to the object, and the size of its polygons in the screen image. For example, when an object is far away from the viewpoint, a simpler representation of the object is displayed to reduce the number of polygons rendered without noticeably reducing image quality. Shapes of each object with various numbers of polygons are prepared to approximate the original shape. Using this algorithm, the desired frame rate (10 frames/second) can be realized without significant loss of image quality, even if the total number of objects in the virtual city runs into the millions.
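The sketch below shows one way such a reality parameter could drive the choice among precomputed approximations; the particular formula combining distance and scene load is our assumption, and the paper's actual parameter also weighs the on-screen size of polygons.

```python
# Sketch of choosing a level of detail from a "reality parameter".

def reality(distance, scene_load, near=50.0):
    """Higher for close objects; damped when the whole scene is heavy.
    The formula is an illustrative assumption."""
    return (near / max(distance, near)) / max(scene_load, 1.0)

def pick_lod(models, distance, scene_load):
    """models: list of (polygon_count, mesh) from coarsest to finest."""
    r = reality(distance, scene_load)
    index = min(int(r * len(models)), len(models) - 1)
    return models[index]

models = [(12, "box"), (120, "coarse mesh"), (1200, "full mesh")]
print(pick_lod(models, distance=40.0, scene_load=1.0))   # close -> full mesh
print(pick_lod(models, distance=800.0, scene_load=2.5))  # far, heavy -> box
```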
3D GRAPHICAL
FACILITY
MANAGEMENT
SYSTEM
We have developed prototype 3D Graphical Facility Management System using the methods described above. To test whether our system can provide an interactive performance independent of the scale of the virtual city environments and the scene complexity, we ran our system on the virtual cities whose number of objects varied from thousand to hundred thousand. The virtual cities include many kinds of 3D objects on the ground
144
....i:!:!:~:~:~:~:~:~:~:~:~:!:~:~:~N ~$.~ ~ . - : . . , : ~ ~ i ....~;:~:~i~:~:s:~i~isi~.'..,x.'~N~g'."$ ,~. "~'.;~. •~;_~"~" .... •..'.
i
~
"
!! •
.:::.,::~.,'~::s:-.::..-.-.-::...-...'::.e..-...~.! ~ i ~ : - . . : ,
.....
~:,......,..
/u/ .
. . . . . . . . . . . .
.. . . . . . . . . .
• .....~..,.~:~...... ~i~.
Fig.2: (a)Entire region to be managed (b)A scene fi'om the viewpoint under the ground
such as buildings, houses, roads, and the electric facilities installed under the ground such as pipes, cables, man-holes. Each object is constructed from about 10 ,-~ 100 polygons. Fig.2 shows example views of the created city and electric facilities. In each test, the average fi'ame rate was around 10frames/second and the frame to frame coherence was guaranteed. Therefore, the user can explore 3D virtual city without losing the sense of interaction.
4.1 Interaction with the 3D Objects
In 3D graphics systems, human-computer interaction (HCI) is a very important issue [4][9]. As for HCI, in addition to the interactive walkthrough mentioned above, our system supplies a rapid interaction method with the 3D virtual city environments using only a three-button mouse, keyboard, and pop-up windows, without special 3D devices. Many interactive operations in virtual environments are based on the task of moving a viewpoint. In our system, the viewpoint movement is controlled simply by the three-button mouse: translations and rotations are selected with the mouse buttons, and the mouse motion is transformed into an appropriate viewpoint movement. Based on these viewpoint control techniques, interactive operation is performed as follows. Access to detailed information about an object of interest is an essential operation for GIS. In our system, when the user picks the object of interest directly, the detailed information about the object is displayed immediately in a pop-up window (Fig. 3(a)). Moreover, the user can move 3D objects freely; this is performed by placing the mouse cursor over the desired object and entering the transformation matrix via the keyboard. The main advantage of a user interface based on 2D input devices is that these devices are ubiquitous and the user does not have to wear special equipment (such as gloves and helmets).
4.2 Visual Interaction with 3D Environments
Our system also provides the user with capabilities to interact visually with the data using a variety of visualization techniques. It is important that the user can easily change the visualization process in order to try different ways of viewing the data. For example, by making the objects on the ground semitransparent using transparency blending, the user can observe the underground facilities beneath them; interactive change of the transparency level is very useful for understanding the spatial relationships between them (Fig. 3(b)). In our system, texture mapping is applied to some objects. Using texture to reduce polygonal complexity, we can often get both an improved picture and improved performance. The user is allowed to change the texture pattern on an object surface, so that different ways of observing the virtual city can be tried without spending a lot of time. The view of the city may be either perspective or orthographic; the former corresponds to the 3D spatial view and the latter to the 2D view. The user can switch between the views at any time. This helps the user to grasp the current state immediately during operations such as object positioning and navigating through the large space.
Fig. 3: (a) The result of 3D picking. (b) Transparency blending.
5. CONCLUSIONS
We have presented a new 3D object management and rendering method for the 3D graphical facility management system. In this method, the virtual city is managed by multiple 2D trees, and each object is rendered at an appropriate level of detail to guarantee an interactive frame rate without reducing image quality. We have developed a prototype 3D graphical management system using this method to test whether it produces interactive performance. With this method, our system provides the user with interactive operations such as smooth walkthrough and rapid information retrieval for an object of interest. Moreover, the user can visually interact with the data using a variety of visualization techniques. Future work is to find what kinds of interaction are most useful for a 3D graphical facility management system, and to develop systems which can support them. Also of interest is integrating a method which allows linking the visualization environment to the GIS database.
REFERENCES
1. Brodlie, K.W. et al., Scientific Visualization: Techniques and Application, Springer-Verlag (1992)
2. Rhyne, T.M., Ivey, W., Knapp, L., Kochevar, P. and Mace, T., Visualization and Geographic Information System Integration: What are the needs and the requirements, if any?, Proc. Visualization'94, pp. 400-403 (1994)
3. Nakamura, Y., Abe, S., and Ikeda, K., Interactive Graphics and Spatial Data Management for GIS using the Hierarchical Data Structure, FEGIS '93, pp. 106-120 (1993)
4. Robertson, G.G., Card, S.K., and Mackinlay, J.D., Information Visualization Using 3D Interactive Animation, Comm. ACM, Vol. 36, No. 4, pp. 57-71 (1993)
5. Funkhouser, T.A. and Séquin, C.H., Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments, Proc. SIGGRAPH'93, pp. 247-253 (1993)
6. Samet, H., Webber, R.E., Hierarchical data structures and algorithms for computer graphics, Part I: Fundamentals, IEEE CG & A, May 1988, pp. 48-68
7. Nakamura, Y., Abe, S., Ohsawa, Y. and Sakauchi, M., A Balanced Hierarchical Data Structure for Multidimensional Data with Efficient Dynamic Characteristics, IEEE Trans. KDE, Vol. 5, No. 4, pp. 682-694 (1993)
8. Tamada, T., Nakamura, Y. and Takeda, S., An efficient 3D object management and interactive walkthrough for the 3D facility management system, Proc. IECON'94, pp. 1937-1941 (1994)
9. Houde, S., Iterative design of an interface for easy 3D direct manipulation, Proc. CHI'92, pp. 135-142 (1992)
Cryptographic Human Identification
Extended Abstract
Tsutomu Matsumoto
Division of Electrical and Computer Engineering, Yokohama National University, 156 Tokiwadai, Hodogaya, Yokohama 240, Japan. email: tsutomu@mlab.dnj.ynu.ac.jp
Conventional password checking schemes have the disadvantage that an attacker who has correctly observed an input password can perfectly impersonate the corresponding user. To overcome this, schemes have been proposed in which a human prover knowing a secret key is asked a question by a machine verifier, who then checks whether the answer from the prover matches the question with respect to the key. This paper presents practical interactive human identification schemes with novel ideas for displaying questions. Keywords: authentication, human-computer interaction, passwords, information security.
1. INTRODUCTION
Human identification is a necessary item for access control mechanisms. Figure 1 illustrates the difference between widely used conventional password schemes and the interactive human identification schemes first examined in [1]. The same reference contains a brief description of the feature and significance of the latter class, by contrast with schemes requiring auxiliary devices for human provers. The resistance of such schemes can be evaluated by the probability $P(n)$ with which an attacker can correctly answer a given question after obtaining $n$ ($\geq 0$) pairs of questions and correct answers. A simple interactive identification scheme [2] is easy to understand and requires no randomness to make an answer, but its resistance has been established only by computer experiments [3]. This paper presents practical interactive human identification schemes with novel ideas for displaying questions. Linear algebra supports the theoretical basis of these schemes and clarifies their rigorous profiles of $P(n)$.
2. IDENTIFICATION SCHEME BASED ON LINEAR ALGEBRA
Let $s$ be a prime power, and $u$ and $v$ positive integers. Let $F_s$ and $F_s^v$ respectively denote the finite field of order $s$ and the vector space $\{[x_1, \dots, x_v] \mid x_1, \dots, x_v \in F_s\}$ consisting of all $v$-dimensional row vectors over $F_s$. We now define a basic protocol conducted by two players: a prover $\hat{P}$ who claims that $\hat{P}$ is a person $P$, and a verifier $\hat{V}$ who acts as a machine $V$ and communicates with $\hat{P}$.
Protocol 0.
Preparation Phase
0. Key Sharing: $P$ and $V$ agree on a key $K_P = [k_1, \dots, k_u]$, where $k_1^T, \dots, k_u^T \in F_s^v$, and keep the key secret.
[Figure 1 contrasts the two classes of schemes: with a conventional password scheme the prover's secret leaks out easily, while an interactive scheme is designed to protect the prover's secret.]
Figure 1: Conventional Password Schemes & Interactive Human Identification Schemes
Interaction Phase
1. Request:
$\hat{P}$ requests $V$ to decide whether "$\hat{P} = P$".
Serially or concurrently for $i = 1, \dots, u$, entities $\hat{P}$ and $V$ execute steps 2 and 3.
2. Challenge: $V$ generates a question $q_i \in F_s^v \setminus \{0\}$ and sends it to $\hat{P}$. When $\hat{V} = V$, we assume that $q_i$ is selected randomly and uniformly from $F_s^v \setminus \{0\}$.
3. Response: $\hat{P}$ sends an answer $a_i \in F_s$ to $V$. When $\hat{P} = P$, we assume that
$$a_i = q_i k_i \in F_s \qquad (1)$$
4. Acknowledgment: If $\hat{V} = V$, then using $K_P$, verifier $V$ checks whether equation (1) holds for each $i = 1, \dots, u$. $V$ judges "$\hat{P} = P$" if and only if all of them hold. Then $V$ informs $\hat{P}$ whether it has judged "$\hat{P} = P$" or not.
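A minimal simulation of Protocol 0 is sketched below, assuming for simplicity that $s$ is prime (so that $F_s$ is just arithmetic mod $s$); the parameter values are examples only, not taken from the paper.

```python
# Minimal simulation of Protocol 0 over F_s for a prime s.
import random

s, v, u = 3, 4, 9   # example parameters

def keygen():
    # key K_P = [k_1, ..., k_u], each k_i a v-dimensional vector over F_s
    return [tuple(random.randrange(s) for _ in range(v)) for _ in range(u)]

def question():
    # a nonzero row vector q_i in F_s^v \ {0}
    while True:
        q = tuple(random.randrange(s) for _ in range(v))
        if any(q):
            return q

def answer(q, k):
    # equation (1): a_i = q_i k_i (inner product over F_s)
    return sum(qj * kj for qj, kj in zip(q, k)) % s

key = keygen()
qs = [question() for _ in range(u)]
ans = [answer(q, k) for q, k in zip(qs, key)]
# Verifier's check: recompute every a_i from K_P and accept iff all match.
assert all(answer(q, k) == a for q, k, a in zip(qs, key, ans))
print("prover accepted")
```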
We can observe the following features.
Proposition 1. For Protocol 0, the key, answers, and questions are respectively described by $u \cdot v \cdot \log_2 s$ [bit], $u \cdot \log_2 s$ [bit], and
$$D_0(s, v, u) = u \cdot v \cdot \log_2 s \ \text{[bit]} \qquad (2)$$
3. SECURITY OF BASIC PROTOCOL
3.1. Taxonomy of Attacks
An attack is an action by an entity $A$ (an attacker, different from $P$ and $V$) that aims at letting $V$ decide "$A = P$". An attack succeeds if and only if $V$ judges "$A = P$". Depending on the knowledge the attacker can utilize, we distinguish three types.
Blind Attack: An attack by $A$ who has been given no pair of a question to $P$ and the corresponding answer from $P$. Let $P(s, v, u; 0)$ denote the least upper bound (LUB) of the success probability of any blind attack on Protocol 0.
[Figures 2 and 3 plot example profiles of the success probabilities against $n$ for several parameter choices.]
Figure 2: Examples of $P(s, v, u; n)$. Figure 3: Examples of $\hat{P}(s, v, u; n)$.
3.2. S u c c e s s P r o b a b i l i t i e s We can prove the following profiles. See Figures 2 and 3. Proposition
2.
We have 1)~,
P ( ~ , ~, u; 0) -
Proposition
3.
(3)
(7
For n >_ 1, we have
1 P(s , v.u;n) - (1- + ( 1 - - ) . ' s s
s~'
1
£ ( s t - 1). R(s, v; n, l)) ~ l=l
-1
(4)
where for 0 _< l _< v the quantity R(s,v;n,l) is the probability that the vector space spanned by n vectors selected uniformly and randomly from F".; - {0} has dimension of l. Proposition
4.
-. P(s v~ u; n)
For 0 _ n _< v we have 1
- (-s- +(1-
1
s ' " - 1 ~,
-)'s"s
1)
(5)
150 4. I D E N T I F I C A T I O N
PROTOCOLS
USING HUMAN
CHARACTERISTIC
Computing the scalar multiplication in equation (t) seems hard for ordinary persons. To avoid this difficulty we direct our attention to a characteristic of human vision. Assume a screen where a lot of points lie at determined location and that each point is labeled by a symbol selected from a finite set. Assume a keyboard where each symbol in the set can be input. We use the fact that ordinary persons can quickly focus on a predetermined point and input the corresponding label into the keyboard. 4.1. P r o t o c o l 1 Let f~ be a set of s ~ elements. In Protocol 1, each question qi in Protocol 0 is assigned a question-expression, Q,-
{(w, q~(w)) I~a e f2},
q , ( w ) = q , . ¢,(w) w e F~
(6)
where ¢i is a bijection from f2 onto F,". Prover remembers u-tuple [kx, k2,... ,k~] over as a key. For each i = 1 , . . . , u, Prover answers qi(ki) to given question-expression Qi. Note that ¢i(ki) T = ki and qi(ki) = ai. Protocol 1 has the following properties. P r o p o s i t i o n 5. The LUB of success probability of blind, known, and chosen Q & A attack to Protocol 1 is upper bounded by that of the corresponding attack to Protocol 0. P r o p o s i t i o n 6. In Protocol 1, the question-expressions are described by D l ( s , v , u ) = u . s". log 2s
[bit]
(7)
E x a m p l e 1: M a p S c h e m e . Figure 4 depicts an example where ( s , v , u ) = (3,4,9), Fs = {1,2,3}, and fl is the set of 81 sampled names of railway stations. The questionexpression is the set of pairs of station names and figures appearing in small circles put on the location of the stations. This example displays u - 9 question-expressions serially. The route map helps Prover to remember a key and to quickly look for the location of a remembered station and recognize the figure. From this example we can see that such a relation among the elements of fl can greatly reduce load of human provers. 4.2. P r o t o c o l 2 Propositions 2, 3, 4, and 5 imply that to increase resistance against attacks we should make s, v, and u grow. However, human memory capacity limits u. And capacity of a screen to display question-expressions or human visual ability of discriminating points limit s and v according to Proposition 6. Now we point out a middle approach between Protocol 0 and Protocol 1 to reduce the amount of d a t a for displaying a question-expression. Let m (___ v) be a positive integer. Recall Protocol 0. We divide v into v - ~ "f=l vf, vf > 0. Correspondingly for i 1 , . . . , u, we divide qi and k w as qi - [ q i l , ' " , qi,n.] and k w - [kW, . . . , kiT] so that by letting air - qifkij E Fs for f - 1 , . . . , m, we have ai = ~ = 1 air. Thus we can derive a scheme, called Protocol 2, that 1) for each f = 1 , - . . , m Verifier challenges a sub-question-expression like in Protocol 1; 2) Prover makes in mind an subanswer for each sub-question-expression; 3) Prover mentally sums up all of the sub-answers
151 .,
.." ............ •"
•."
........ .:
~
~"
¢3
".. J ...... Nlshi-Urllwa ...". . . . .... A r e a ' E "
........
1".....
...•" / " '-~- ,....... ~z~ .-:p~ H,+m," ...... ~ - 4 " ~ C ~ ..... i r'.: ~. ............." ~ . 1 . . . ~ ~ . . : [ : :
m
~a..............
K,*,oo,~. " n" •
~ #r ~
.:':"-...
~r,,,y,
Migasni-urawa
"li ~
r~ ~
]_Klsuklbe ....... Klshiwl .............. ( ~ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ~r~mmmmz ! Shin-Matsudo ~ r ~
~ t.............. ,y, O ........ .... °o
.................. ......~\._ . / _ . ~
. . . . . . . . . . ".
/" ............~.,.-.
~..........i''. . : .
. . . .
:
............."+" ... ...................~..... K ami-Shakujii
~#~
...................~
~ I
K
:"
: ,
e
O
' ~Kita"~"
.
.
X
,
, , ..,
°"I~" S * ehJblma'* ""n " " ~
. ~
Abiko k
"%" "\"" ~
-+
'":-::',,
.
.
"....".; Shln-Yahashira (
"
"-...
"
mC
K
........... " " /
Nobor to
+ .:
i . + ! + +
hin-Kamagaya
c
........................................................ 1
i
.-.."" " .....J I!
.....( •
,,
.:...~.......................... ..-;
-,'..
"T ~
7 . . . . .j ...... ~
i
!
i
".:÷............................. ~ ) .................................
.." ........................... "~.
..." .." O',
,,
,
.........
t , . . ~ . \
!
1
|
z
.........
:
``...
J Kawasaki""
""
Figure 4: An Implementation of Protocol 1: Map Scheme ((s, v, u ) = (3, 4, 9)) to have an answer; 4) then Prover inputs the answer to Verifier. following properties.
Protocol 2 has the
P r o p o s i t i o n 7. The LUB of success probability of blind, known, and chosen Q & A attack to Protocol 2 is upper bounded by that of the corresponding attack to Protocol 0. Proposition
8.
In Protocol 2, the question-expressions are described by 27/.
D2(s, v, u) - u . (~-~ s~'S) • log 2 s I=l
[bit]
(8)
I , Protocol 2 P~ov+vshouid make addition(s)over F , but the value D2(s, v, u) can be far smaller than D1 (s, v, u). E x a m p l e 2: G a m e S c h e m e . Reducing load of Prover we translate the F~ addition into some binary operation that can be readily done by ordinary persons. Here we utilize the J a n k e n game of 'paper', 'stone' and 'scissors'. Let s = 3 and m = 2. See Figure 5. This example displays u = 9 question-expressions serially. In the figure, the left and right screen respectively displays a question-expression of (s,v) = (3, 3). In the left screen, element 0, 1, and 2 is respectively m a p p e d to hand 'stone', 'scissors', and 'paper'. In the right, element 0, 1, and 2 is respectively mapped to 'stone', 'paper', and 'scissors' Prover selects a hand from the left screen with respect to a key and one from the right screen in a similar way. Prover mentally matches both the hands and answers the winner. Namely, if the left wins the answer is to touch symbol L, if the right wins it is to touch symbol R, and if the game is drawn it is to touch symbol D. Note that the rule of the
152
j+~'.....A"'"':B......C'""D'.....E .....':F'"'"G'"'"~ Left
i+++++++++i i
. R ight I'~"'~'B'"C"'D' .....E .....~4.....'~'"",'.~
,,+~4+~ i+++++~++~i
i _+2qo,,y,..+/+ !
iI J K L M N 0 P Q K ~ / ~ , ~ / ~ L / ~ ~ I
i+ + + + + + + + + i ~ i +.a + ~ . v ,
x + ~.+i
i
a X L M N 0 P Qi
+ + + + + + + + +i
ita + + , ,, ,+ X + ~.,i
" 2 by a similar way: We translate symbol L, D, and R respectively into hand 'paper', 'stone', 'scissors' and conduct an elimination tournament by matching one of L, D, and R with a hand selected from another questionexpression newly displayed in the right screen. 5. C O N C L U S I O N We have examined interactive human identification schemes that can resist Q & A attacks in some extent. Applied Protocols require human provers only fairly simple manipulations. An important subject is to further investigate whether we can develop practically appealing human identification protocols along with the suggested line. REFERENCES 1. T. Matsumoto and H. Imai, "Human identification through insecure channel," Advances in C r y p t o l o g y - EUROCRYPT'91, Lecture Notes in Computer Science No.547, pp.409-421, Springer-Verlag, 1991. 2. H. Ijima and T. Matsumoto, "A simple scheme for challenge-response type human identification," Proc. of 1994 Symp. on Cryptography and Information Security, SCIS9413C, Jan. 1994. 3. R. Mizutani and T. Matsumoto, "Evaluating security of a simple interactive human identification scheme," IEICE Trans. Fundamentals, Vol. E78-A, No.5, May 1995.
III.7 Screen Design 2
This Page Intentionally Left Blank
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
155
Adjustment Mechanism for a Drawing System with Stationery Metaphors Naoki KATe, Natsuko FUKUDA and Masaki NAKAGAWA Nakagawa Lab., Department of Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-Cho, Koganei-Shi, Tokyo, 184 JAPAN, e-marl:
[email protected] I. INTRODUCTION We have been carrying out research aiming for a system which will support our creative activities. As a part of this research we have realized a penbased drawing system for the creation of diagrams on a computer [1,2]. Up until now many drawing systems have been researched and developed, but the majority of current systems are dedicated to the stage of making a neat copy. As opposed to this approach, the drawing system we are developing is aimmg to support work from the stage of creating a rough copy, and makes use of a handwriting (pen) interface using a display-integrated tablet as a humancomputer interface. As handwriting interfaces do not obstruct the user's thought process with their operations, they make it possible to realize a system which can be used from the rough copy stage where the user's thinking is of utmost importance. With this drawing system, we have provided stationery metaphors as a new drawing interface. The stationery metaphors are virtual stationery displayed on a display-integrated tablet, and are metaphors which replicate the abilities and operation methods of real stationery. As they are manipulated in the same manner as actual stationery they can be used easily, even by people unaccustomed to computers. As with real stationery, however the positional adjustment of drawings that are about to be drawn is still a troublesome task. Although the difficulty of adjustment has been a problem with many mouse-based drawing systems, it is more serious since operations requiring precision are difficult with a pen on a display-integrated tablet due to parallax and unstability of hand movement [3]. In this paper we present an automatic adjustment mechanism designed for a drawing system with stationery metaphors to solve this problem. 2. NECESSITY OF ADJUSTMENT FUNCTIONS With this drawing system it is possible to draw rough copies in freehand with no restrictions. For freehand drawing, a pen is much mightier than a
156 mouse. When geometrical shapes such as lines, circles, and so on are required stationery metaphors are used. The stationery metaphors that have been developed so far consist of a ruler metaphor for drawing straight lines, a compass metaphor for drawing circles and arc lines, and template metaphors [4] for drawing objects such as squares, triangles, and so on. When drawing with stationery metaphors it is sometimes difficult to align the position or size of the drawing. For example, positioning a triangle template metaphor so that the point of the triangle matches to the end-point of an already drawn line (figure 1) is more difficult than it at first seem. In an experimental trial of this drawing system, five of the ten students from our laboratory who participated expressed the opinion that minute operations were difficult with the template metaphors. This phenomenon also occurs when drawing with pen and paper, or using a mouse-based drawing system. If the computer can support such precise adjustments it then becomes possible for the user to draw neat drawings in a short period of time. Here we propose an automatic adjustment mechanism for a drawhag system with stationery metaphors where the computer supports these adjustments. This automatic adjustment mechanism allows adjustments such as of position or size exactly as the user intends, without placing an unnecessary burden on the user. The employment of computing power for this type of adjustment is a natural automation of our physical alignment of real stationery. Therefore, its effect could be different from automatic adjustment in the mouse-based drawing systems. A template metaphor of a triangle. Moving
An already drawn line. ,
User wants to match these. Figure 1. An example of positioning a template metaphor.
3. CLASS OF ADJUSTMENT FUNCTIONS Adjustment operations for drawing with a computer can be divided into two classes depending on whether the adjustment is carried out when the object is drawn, or after it has been completed. The adjustment mechanism we present is the former. When manipulating a stationery metaphor the computer carries out
157
adjustments as it sees necessary. The types of adjustment required for drawing with stationery can be further divided into two main types. One type supplements movement, size changes and rotation of stationery metaphors to fit them precisely to already drawn objects. For example, ff you want to use the compass metaphor to draw a circle touching a straight line it is necessary to precisely line up the drawing point of the compass with the line (figure 2). When you move the pencil ~ m to a point near the line, the automatic adjustment adjusts the arm precisely tO the line. The other type is adjusting the start or end point of the object to be drawn when using a stationery metaphor. For example, when using the ruler metaphor it is difficult to align the start point with the position intended (figure 3). An adjustment can be carried out to adjust this alignment. We shall refer to the former case as an adjustment to the stationery metaphor, and to the latter case as an adjustment to the object
being drawn.
~
djusted. Start point is adjusted so as to fit a drawn line. A compass m e t a p h o r A ruler metaphor
An a l r e a d y d r a w n lme
An already drawn line
Figure 2. An example of positioning a compass metaphor.
Figure.3. An example of adjusting the start or end point of the drawing object.
4. DESIGN OF ADJUSTMENT FUNCTIONS 4.1. Design 1" Automatic adjustment and cancellation This adjustment mechanism supports adjustments made when the user is drawing with a stationery metaphor. Here the computer anticipates the user's intentions and automatically carries out adjustments. However, there are some cases where the user would not want adjustments carried out automatically. To cope with these situations the user can choose whether or not adjustments will be carried out automatically. 4.2. Design 2: The Operation the user last carried out is chosen as the method of adjusting The adjustment method carries on the operation that the user was performing. For example, an adjustment carried out after the user had performed a
158 movement operation would adjust the object by moving it. Revolving the object to adjust it, even though the user had been moving it, would produce an unexpected result. 4.3. Design 3: Objects that have already been drawn are not altered When adjusting a stationery metaphor, objects that have already been completed are not moved or distorted. Adjustments of position or size only apply to the stationery metaphor or object being drawn, and thus confusing the user is avoided.
5. IMPLEMENTATION This adjustment mechanism has been implemented in the drawing system with stationery metaphors. This system supports freehand strokes, straight lines and arc lines as drawing objects, and provides a compass metaphor, ruler metaphor and template metaphors. There are several types of adjustment mechanism that should be provided, but we currently have only the following. (i) The ruler metaphor is adjusted so as to fit a start or end point of a straight or arc line (figure 4(a)) (ii) The ruler metaphor is adjusted so as to touch an arc line (figure 4(b)) ('all)The pencil arm or needle arm of the compass metaphor is adjusted so as to fit a start or end point of a straight or arc line (figure 4(c))
Figure 4. Types of adjustment functions: (a) a ruler metaphor fitted to a start or end point of a straight or arc line; (b) a ruler metaphor adjusted to touch an arc line; (c) a compass metaphor fitted to a start or end point of a straight or arc line.
6. EVALUATION
We have carried out a simple experiment to evaluate the effectiveness of the adjustment functions. Twenty students from our laboratory were asked to reproduce a drawing, like that in figure 5, in two different environments, one with adjustment functions and one without. In order to remove any learning effect, the
subjects were divided into two groups of ten each, with group A using the environment without adjustment functions first, while group B used the environment with adjustment functions first. The results of the experiment were as follows. (A t-test validated that there was no difference between the results for the two groups, and so they have been combined.) The average number of operations performed with stationery metaphors is given in table 1, while the average time taken to complete the drawing task is given in figure 6. A t-test verifies that there is a difference, in both the number of operations performed and the time taken for the task, between the cases where the adjustment functions are and are not used. From the above results we believe that adjustment functions are effective in reducing the labor required in drawing tasks.

Table 1. The average number of (move) operations.
                                Ruler    Compass (pivotal arm)    Compass (pencil arm)
With adjustment functions        5.45            3.10                     3.50
Without adjustment functions    13.35           10.20                     8.85
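The comparison reported above is a standard two-sample t-test. As a hedged illustration only (the paper does not reproduce the per-subject data, so the numbers below are invented), such a test could be run as follows:

```python
# Illustrative sketch of the comparison above using scipy; the data values
# are made up for the example and are not the experimental results.
from scipy import stats

ops_with    = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]        # operations per subject
ops_without = [13, 14, 12, 15, 13, 12, 14, 13, 14, 13]

t, p = stats.ttest_ind(ops_with, ops_without)
print(f"t = {t:.2f}, p = {p:.4f}")  # a small p indicates a reliable difference
```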
7. SUMMARY
This paper described an automatic adjustment mechanism for a drawing system employing a handwriting (pen) interface with stationery metaphors. With this mechanism the computer automatically carries out adjustments of the position or size of stationery metaphors at the drawing stage. By including this adjustment mechanism in the drawing system it is possible to support the time-consuming task of making adjustments, and thus improve the efficiency of the users.

REFERENCES
1. Nakagawa, M., Kazama, S., Satou, T. and Fukuda, N.: Pen-based Interfaces for Drawing Figures with 'Stationery Metaphors', Human-Computer Interaction: Software and Hardware Interfaces (Salvendy, G. and Smith, M.J., eds.), Elsevier Science Publishers B.V., Amsterdam, (1993) 1046-1051.
2. Kazama, S., Kato, N. and Nakagawa, M.: A Hand-Drawing System with 'Stationery Metaphors' (in Japanese), Trans. IPS Japan, 35, 7 (July 1994) 1457-1468.
3. Kato, N., Fukuda, N. and Nakagawa, M.: An Experimental Study of Interfaces Exploiting a Pen's Merits, Proc. HCI '95, to appear.
4. Fukuda, N., Masaki, N.: Prototyping of Pen-based Drawing Interfaces with Template Metaphors (in Japanese), SIGHI IPSJ 48-5 (May 1993) 33-40.
Figure 5. Figure for the evaluation experiment (the center of the circle is the left point of a drawn line; the radius of the circle is the length of a drawn line; the circle and line meet at a point of contact).
Figure 6. Number of subjects by time taken for the drawing tasks (task time in seconds, shown separately for the environments with and without adjustment functions).
Analysis Tool for Skill Acquisition with Graphical User Interfaces Based on Operation Logging
Nobuko Kishi (a)
(a) Department of Mathematics, Tsuda College, Kodaira-shi, Tokyo 187, Japan

Observing users working with a system is essential for better user interface design. One of the most popular observation methods is logging users' operations, i.e., having the system collect the users' operations automatically. However, it is known that operation logs, particularly those collected in systems with graphical user interfaces (GUIs), can be very large, and it is difficult to extract meaningful information from them. We developed a set of tools, collectively named SimUI, to log and analyze users' operations on applications with GUIs. SimUI's analysis tool uses a technique, named multi-step matching, to detect differences between two sets of operation logs. This paper describes the use of multi-step matching in an experiment in which the subjects were asked to learn to use a bitmap editor. In the experiment, we observed that the 'matching rate,' measured by multi-step matching, can indicate various types of skill acquisition, compared to statistics such as the task completion time and the frequency of command use.

1. INTRODUCTION
Observing users working with a computer is known to be an effective approach for better user interface design and better user support. Several observation methods can be used, such as video-camera recording, audio-tape recording, and interviewing. However, the effectiveness of these methods depends on the skills and knowledge of the human observers. To overcome these drawbacks, logging, i.e. having the system collect users' operations, is often used to record the users' behavior automatically. Several statistical techniques have been used to analyze operation logs [2]. However, the log data collected in systems with graphical user interfaces can be huge and require more powerful analysis techniques [1]. We developed a set of tools, collectively named SimUI (SIMulated User Interaction evaluation tools), to collect operation logs and to extract meaningful information for user interface designers. The central idea of SimUI is to record all user input and system output, and to generate the difference between the operation logs of novices and skilled users automatically. Although the amount of recorded data is very large, SimUI uses two techniques to automate the analysis: data gathering in playback mode and multi-step matching of recorded data. The usefulness of SimUI's analysis technique was studied by conducting an experiment. Four subjects were asked to learn to use a bitmap editor on the X Window System. We found that the differences detected by SimUI can suggest various aspects of skill acquisition with applications with graphical user interfaces.
Figure 1. Data collection during playback (the modified X display server, X protocol analyzer, application with modified X toolkit, playback tool and data collection tool, together with the operation log and recorded data).
2. SimUI - TOOLS FOR GRAPHICAL USER INTERFACE EVALUATIONS
SimUI is implemented for X Window System Release 5 applications. The user operation logging capability of SimUI is provided by:
• a modified X display server,
• a recording tool,
• a playback tool,
• an X protocol analyzer,
• a data collection tool,
• a modified X toolkit library.
These tools are used in the following steps.
1. The recording tool sends a 'record' request to the modified X display server. The display server starts sending all the input events to the recording tool.
2. After recording, an application, which is to be the main focus of observation, is relinked with the modified version of the X toolkit library.
3. The X protocol analyzer is run as a pseudo display server: it behaves as a display server for the application program, while it behaves as a client of the modified X display server (this relaying idea is sketched in the code example after this list).
4. The playback tool sends a 'playback' request to the modified display server. The server switches the source of input from the user to the playback tool, and the playback tool then starts sending the input events recorded by the recording tool.
5. The application and the display server run as if they were in playback mode. In addition, the application program, the protocol analyzer and the playback tool send information on their status to the data collection tool.
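As a minimal sketch of the pseudo display server in step 3 (this is not the SimUI implementation; the host and port values, file name and chunk-level logging are assumptions for illustration), a proxy that relays traffic in both directions while recording its direction could look like this:

```python
# A logging proxy that sits between an X client and the real display server,
# forwarding bytes both ways and noting the direction and size of each chunk.
import socket
import threading

LISTEN_ADDR = ("localhost", 6001)   # clients connect here (like DISPLAY=:1)
SERVER_ADDR = ("localhost", 6000)   # the real display server (like DISPLAY=:0)

def relay(src, dst, tag, log):
    """Copy bytes from src to dst, recording each chunk in the log."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        log.write(f"{tag} {len(data)} bytes\n")
        dst.sendall(data)

def handle(client):
    server = socket.create_connection(SERVER_ADDR)
    with open("protocol.log", "a") as log:
        # cs> marks client-to-server traffic, sc> server-to-client traffic
        t1 = threading.Thread(target=relay, args=(client, server, "cs>", log))
        t2 = threading.Thread(target=relay, args=(server, client, "sc>", log))
        t1.start(); t2.start()
        t1.join(); t2.join()

listener = socket.socket()
listener.bind(LISTEN_ADDR)
listener.listen(1)
while True:
    conn, _ = listener.accept()
    threading.Thread(target=handle, args=(conn,)).start()
```

A real protocol analyzer would additionally parse the X protocol requests and replies rather than logging raw chunk sizes.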
s>LOC_Y_DELTA: 12388714251 mili value -27
s>LOC_X_DELTA: 12388714661 mili value 3
s>LOC_Y_DELTA: 12388714661 mili value -2
sc>EVENT: FocusOut detail: Nonlinear event: WIN2000038 mode: Normal
sc>EVENT: EnterNotify detail: NonlinearVirtual time: TIM49d7adaa root: WIN2b
sc>EVENT: FocusIn detail: Pointer event: WIN2000038 mode: Normal
sc>EVENT: FocusOut detail: Pointer event: WIN2000038 mode: Normal
ap>handler WID27728 index 0 proc ADRf76f0eac eventtype 7
ap>handler WID27728 index 0 proc ADRf76f0eac eventtype 9
ap>handler WID27728 index 0 proc ADRf76f0eac eventtype 10
ap>handler WID27728 index 0 proc ADRf76f0eac eventtype 9
s>LOC_X_DELTA: 12388715081 mili value -1
s>LOC_Y_DELTA: 12388715081 mili value 5
sc>EVENT: FocusIn detail: Nonlinear event: WIN2000038 mode: Normal
s>LOC_X_DELTA: 12388715501 mili value 1
sc>EVENT: LeaveNotify detail: Virtual time: TIM49d7add4 root: WIN2b event:

Figure 2. Example of data collected during playback.
The relationship of the tools at the last step is shown in Figure 1. An example of the data collected by the data collection tool is shown in Figure 2. Each line contains one record sent to the data collection tool by the other tools. The prefixes indicate which tool sent the record, as follows.
s>  Sent from the playback tool when it sends input events to the display server.
sc>  Sent from the protocol analysis tool when the server sends an event or reply to the client.
cs>  Sent from the protocol analysis tool when the client sends an event or reply to the server.
xt>, ap>  Sent from the application program (an X client program) when its X widgets change their status.
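Because every record carries one of these prefixes, a log can be split mechanically into the three record classes used later for the matching rates. A small sketch (the file name is an assumption):

```python
# Split a SimUI-style log into application-, protocol- and device-level
# records according to the prefixes described above.
from collections import defaultdict

LEVELS = {
    "xt>": "application", "ap>": "application",
    "sc>": "protocol",    "cs>": "protocol",
    "s>":  "device",
}

def classify(line):
    # Longer prefixes must be tested first so "sc>" is not matched as "s>".
    for prefix in sorted(LEVELS, key=len, reverse=True):
        if line.startswith(prefix):
            return LEVELS[prefix]
    return None

records = defaultdict(list)
with open("session.log") as f:
    for line in f:
        level = classify(line)
        if level:
            records[level].append(line.rstrip())

for level, recs in records.items():
    print(level, len(recs), "records")
```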
3. LOG ANALYSIS BY MULTI-STEP MATCHING
The analysis tool in SimUI analyzes the data recorded by the data collection tool. Although the analysis tool can generate various statistics, its main technique is multi-step matching. Multi-step matching compares two sets of records and generates the difference between them. When one of the record sets is obtained from a skilled user and the other from a novice user, the differences generated are closely related to the differences noticed by human observers. Multi-step matching is performed by repeating:
1. filtering out less important records,
2. generating a difference between the two sets of records, using a tool such as the 'diff' command in UNIX,
3. numbering the matched records so that they match again at a later repetition.
By doing so, we are more likely to match meaningful records (e.g. a command is executed) rather than relatively meaningless records (e.g. a mouse pointer is moved by one pixel). Figure 3 shows an example of the result obtained by multi-step matching.

Figure 3. Example of multi-step matching.

Multi-step matching generates a matching rate: the proportion of matched records among all of the records. Because filtering and matching are repeated three times, based on the prefixes of the records, there are three kinds of matching rate.
1. Application-level matching rate: the matching rate of records prefixed with xt> and ap>. This rate is assumed to become large when the same callback functions and event handlers are called, which often occurs when the same commands are entered in the same context.
2. Protocol-level matching rate: the matching rate of records prefixed with sc> and cs>. This rate is assumed to become large when the same requests and replies are exchanged between the application and the X display server, which often occurs when the same windows or widgets are selected.
3. Device-level matching rate: the matching rate of records prefixed with s>. This rate is assumed to become large when the same keys are hit and the mouse is moved the same distance in the same direction, which often occurs when the mouse is moved in a similar way.
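The following sketch illustrates the filter-then-diff core of multi-step matching, using Python's difflib in place of the UNIX diff command; the renumbering step (3) and the repetition across levels are omitted for brevity, and the sample records are invented:

```python
# Compute a matching rate between two operation logs for one record class.
import difflib

def matching_rate(log_a, log_b, prefixes):
    """Fraction of records (with the given prefixes) matched between logs."""
    a = [r for r in log_a if r.startswith(prefixes)]   # step 1: filter
    b = [r for r in log_b if r.startswith(prefixes)]
    if not a or not b:
        return 0.0
    matcher = difflib.SequenceMatcher(a=a, b=b)        # step 2: diff
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b))

skilled = ["ap>flood_fill", "s>move 3", "ap>invert"]
novice  = ["s>move 1", "ap>flood_fill", "s>move 2", "ap>invert"]
print("application-level rate:",
      matching_rate(skilled, novice, ("xt>", "ap>")))
```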
4. EXPERIMENT - LEARNING TO USE A 'BITMAP' EDITOR
The assumptions about multi-step matching were studied by analyzing operations on 'bitmap'. 'Bitmap' is a bitmap editor under the X Window System (X11R5), included in the standard source distribution and implemented with the X toolkit. The 'bitmap' used in the experiment was linked with SimUI's X toolkit library, so that it can produce application-level records. Four participants were asked to draw the shape of a star with 'bitmap', as shown in Figure 4. The reason for choosing the task of drawing a star is that it involves various types of skills:
• Motion skill. For example, a user can move the mouse faster and more precisely.
• Knowledge about an application program. For example, a user can learn the meaning of commands such as 'flood fill' and 'Invert', instead of painting each rectangle.
• Knowledge about an application area, geometry. For example, a user can draw a star with only 5 lines, although the shape of a star has 10 edges.

Figure 4. 'bitmap': a bitmap editor under the X Window System.

The four participants were asked to perform the task, drawing a star, twice. First they were asked to draw a star without any advance notice, although they were familiar with X window applications and had used the 'bitmap' editor several times. Then they were asked to draw a star again after practicing as many times as they wanted. Figures 5 and 6 show the results for two of the participants, User C and User D, respectively. Both participants improved the task completion time, i.e. they drew faster at the second performance. However, one participant, User D, did not increase the application-level matching rate at the second performance. This indicates that User D used a different set of commands to perform the task: User C improved in motion skill, while User D found a different strategy for drawing a star between the two performances.

5. DISCUSSIONS AND CONCLUSIONS
Although we have omitted other findings of the experiment, we found that the matching rate can detect differences in the type of skills the participants improved with the bitmap editor. These differences would have been overlooked by using statistics such as the task completion time and the frequency of command use. By implementing SimUI, we found:
• The operation logging with X window applications is feasible. In particular, data gathering during playback makes the collection of both user input and system output feasible.
• Because most SimUI tools run as servers, it is possible to unobtrusively collect log data from a large number of users working in their own environment.
REFERENCES
1. A.C. Siochi and R.W. Ehrich, Computer Analysis of User Interfaces Based on Repetition in Transcripts of User Sessions. ACM Transactions on Information Systems Vol. 9, No. 4 (1991) 309-335.
2. S.J. Hanson, R.E. Kraut and J.M. Farber, Interface Design and Multivariate Analysis of UNIX Command Use. ACM Transactions on Office Information Systems Vol. 2, No. 1 (1984) 42-57.

Figure 5. User C's matching rate (application-, protocol- and device-level matching rates in percent, and task time in seconds, for the first and second performances).
Figure 6. User D's matching rate (application-, protocol- and device-level matching rates in percent, and task time in seconds, for the first and second performances).
The role of screen parameters in visual communication
Masaaki Kurosu (a), Hitoshi Yamadera (a) and Itaru Mimura (b)
(a) Design Center, Hitachi Ltd., 5-2-1 Minami-aoyama, Minato-ku, Tokyo 107, JAPAN. Phone: +81-3-5485-1451, Email: [email protected]
(b) Central Research Laboratory, Hitachi Ltd., 1-280 Higashi-koigakubo, Kokubunji-shi, Tokyo 185, JAPAN. Phone: +81-423-23-1111

1. Introduction
One reason why the video conference system has not yet been widely used is the lack of evidence that would convince us of its cost-performance compared to the face-to-face meeting. Reality is a key concept in describing the performance of the system. Previous research on the video conference system has tended to focus on the hardware and software aspects of the system, but little of it has studied the human side of the system, i.e. how the reality can be measured or how physical parameters may affect the degree of the reality. In this study, we tried to fix the independent and dependent variables relating to the reality and performed a psychological experiment on some of these variables.

2. Variables relating to the reality of the communication
Dependent variables that represent the degree of the reality may vary according to the type of the meeting. For one-way communication, the understandability of the information and the resulting change of the attitude of the attendants are important, while the degree of sharing the topics, the depth of the discussion and the satisfaction of each participant are important factors for two-way communication. Independent variables include the conference setting, i.e. the type of the meeting, the size of the meeting, the layout of the meeting place, or the distance between participants. Another type of independent variable is the device parameters, i.e. the resolution of the screen, the size of the screen, or the configuration of the sound field. As a first step, we decided to investigate one-way communication, taking the attitude change as the dependent variable and the screen size and the resolution as independent variables.

3. Communication experiment
3.1 Experimental method
The basic paradigm of the experiment was the before-after design, which measures the degree of the attitude change of the subjects before and after the presentation of persuasive information. As a topic, we picked discriminatory expressions, including sexual harassment and discrimination against minorities. Based on the assumption that most of the subjects may initially think that avoiding the use of such discriminatory expressions is enough, the persuasive information given in the presentation period paradoxically emphasized that it does not necessarily matter whether such expressions are used or not, and that what really matters is the way of thinking behind such expressions. The presentation lasted for about 7 minutes.

The presentation was given both via the screen and by direct speech. There were two screen conditions: the screen size and the resolution. The screen size condition included 110 inches and 55 inches; the image of the speaker appears on the screen at actual size in the 110-inch condition and at a relatively reduced size in the 55-inch condition. The resolution condition included HDTV level and NTSC level. Ten subjects participated in each condition, seated in a row at a distance of about 3 meters from the screen. In the screen conditions, the presentation was pre-recorded on HDTV video and the resolution level was controlled at the time of the presentation. In the direct condition, the presentation was given by the same speaker directly in the same room. Because there were four screen conditions and one direct condition, a total of 50 subjects participated in the experiment. All the subjects were employees of the research laboratory. The experimental situation is shown in Figure 1.

Figure 1. Experimental situation.

Three types of data were collected. One is the attitude scale, which included 9 subscales on: 1. whether the government should control the expression; 2.1 whether the mass communication corporations should define the list of prohibited sexual expressions; 2.2 whether they should define the list of prohibited discriminatory expressions against the minorities; 2.3 whether they should rely on the individual judgment of their employees; 2.4 whether they should ask specialists about the adequacy of the expression; 3.1 whether the company for which all the experimental subjects work should define the list of prohibited sexual expressions; 3.2 whether the company should define the list of prohibited discriminatory expressions against the minorities; 3.3 whether the company should rely on the individual judgment; 3.4 whether the company should ask specialists about the adequacy of the expression. These were rated on 7-point scales. Because the presentation given between the attitude measurements focused on how the company to which the subjects belong should behave regarding the problem of discriminatory expressions, especially against minorities, it was hypothesized that there would be a big attitude change at least for the scale 3.2.

The other data were additional. One is the subjective evaluation data on the personality of the speaker: fifteen 7-point scales were used after the presentation to describe the personality of the presenter, such as socially intelligent, nervous, etc. Another is the evaluation data on the quality of the presentation: three 7-point scales were used, covering the clearness or visibility of the image on the screen, the naturalness of the atmosphere, and the reality of the total impression.
3.2 Result on the personality evaluation
Figure 2 shows the mean of each of the conditions for the 15 personality traits. There was not a significant difference in the rating data; i.e., even if there is a difference in the attitude scales, it can not be attributed to the personality of the presenter.

Figure 2. Subjective evaluation on the personality of the presenter.
As a supplement, principal component analysis was applied to the rating data. The results are shown in Figure 3. Dimensions 1 and 2 accounted for about 40% of the total variance, and the configuration of the scales is reasonable, which means that the personality of the speaker was adequately recognized by the subjects without any special biases.

Figure 3. Result of the principal component analysis of the evaluation scales. The figure shows the configuration for dimensions 1 and 2.

3.3 Result on the attitude change
First of all, an analysis was done to see if there were any prior differences among the attitudes of the groups of subjects. A one-factor analysis of variance was applied to the data of each of the 9 attitude scales obtained before the presentation. The results are shown in Table 1. The probability ranges from 0.2805 to 0.9310, which means that there were no differences among the 5 subject groups for any of the 9 scales before the experiment.

Table 1. Results of the one-factor analysis of variance on the 9 attitude scales before the presentation (Prob>F values).

For the purpose of analyzing the attitude change after the presentation, a two-factor analysis of variance was applied to the difference of the data before and after the presentation. The results are shown in Table 2. Most scales revealed no significant differences, but the scale 3.1 showed p=0.0716.

Table 2. Results of the two-factor analysis of variance on the attitude change (Prob>F values; the three values per scale correspond to the three effects tested, in the order of the original table).
Scale 1:    0.1056  0.4654  0.5834
Scale 2.1:  0.7266  0.4856  0.7266
Scale 2.2:  0.4420  0.2033  0.2033
Scale 2.3:  0.9273  0.9273  0.6488
Scale 2.4:  0.4222  0.3170  0.6872
Scale 3.1:  0.0716  0.2238  0.2238
Scale 3.2:  0.4843  0.0098  0.7637
Scale 3.3:  0.3600  0.8954  0.3600
Scale 3.4:  0.7087  0.5341  0.9008
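For illustration, an analysis of this kind can be reproduced with standard statistical software. The sketch below uses statsmodels on invented attitude-change scores for the four screen conditions (2 sizes x 2 resolutions); it is not the original analysis or data:

```python
# A hedged sketch of a two-factor ANOVA on attitude-change scores.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "size":       ["110"]*4 + ["55"]*4 + ["110"]*4 + ["55"]*4,
    "resolution": ["HDTV"]*8 + ["NTSC"]*8,
    "change":     [1.2, 0.8, 1.0, 1.4, 0.9, 0.7, 1.1, 0.8,
                   0.3, 0.5, 0.2, 0.4, 0.1, 0.4, 0.2, 0.3],
})

model = ols("change ~ C(size) * C(resolution)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # Prob>F per factor and interaction
```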
The scale 3.2 had no problem in this regard, and this result coincides with the hypothesis that the information in the presentation affected the same aspect of the attitude. The result for the scale 3.2 is shown in Figure 4. This result suggests that the higher the resolution, the more people are affected in their attitude.

Figure 4. Amount of the attitude change for item 3.2, for the 110/HDTV, 55/HDTV, 110/NTSC, 55/NTSC and direct conditions.

Figure 5. Rating values on the presentation quality (clear, natural and realistic) for the 110/HDTV, 55/HDTV, 110/NTSC, 55/NTSC and direct conditions.
3.4 Result on the quality of presentation
Two rating scale values, clear and realistic, showed significant differences (p=0.0579 and p=0.0748) for the screen size. This means that the screen size affects the impressional aspects of the presentation. The result is shown in Figure 5.

4. Conclusion
The effect of the video conference system was studied experimentally using the paradigm of the attitude change. Based on the results, it was found that the resolution has a strong effect on the attitude change, whereas the screen size is influential on the impressional aspects for the audience. This type of psychological study proved to be effective in determining the relationship between physical parameters and their mental effects on the user.
Re-sampling of 3-D object range data by cube-based segmentation
S.J. WANG, Y. CAI, M. SATO and H. KAWARADA
Precision and Intelligence Laboratory, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, 227 Japan

This paper proposes an efficient method for scattered range data re-sampling using uniform segmentation in 3-D space. The method combines the techniques of uniform cube-based segmentation, patch formation, and lattice point generation. After analyzing the scattered range data formed by an optical measuring machine, a preliminary processing is carried out by making cross sections in the 3-D object space to form a group of cubes. Based on the cubes from the preliminary processing, we generate lattice points in each cube. Thus, using fewer re-sampling points, it is possible to reconstruct a smooth and complete 3-D object with parametric surface equations.

1. Introduction
3-D computer graphics (CG) applications and user interfaces offer a new qualitative step in man-machine interfaces by relating to humans' natural familiarity with living in a 3-D world. They provide us with information that makes it possible to locate and recognize objects and their interrelationships without direct physical contact. One of the main problems in virtual reality research is how to generate 3-D objects that approximate a real object. However, modeling of 3-D objects in CG is too complex to be done by hand. In other words, the scattered range data obtained by an optical measuring machine are too enormous and discrete to be used directly to reconstruct 3-D objects and display them in real time. Although many methods of 3-D reconstruction from scattered range data have been proposed, almost all of them are methods of plane interpolation. The methods of triangle patches [1]-[2] and bilinear plane patches [3] proposed up to now have the problem that it is difficult to obtain natural, high-quality 3-D images if the number of extracted points is not large enough. Nagashima et al. [4] employed a method of inputting control points to Bézier surface equations to reconstruct 3-D models, but it is inconvenient since the control points have to be selected by hand. In this paper, an efficient method for scattered range data re-sampling using uniform segmentation in 3-D space is proposed. A preliminary processing is carried out first by making cross sections in the 3-D object space to form a group of cubes, and the scattered range data are segmented into each cube. By re-sampling the data in each cube, lattice points are generated. Employing this technique, it is possible to reconstruct a smooth and complete 3-D object with parametric surface equations using fewer re-sampling points. The experimental results proved that the proposed method is efficient.
2. Uniform cube segmentation
2.1. Cube generation
From the measured scattered range data, expressed as {D_i(x_i, y_i, z_i) | i = 1, ..., N}, the position and volume of the 3-D object outline can be obtained. Then, by making cross sections along the X, Y and Z axes in the 3-D object space, the object is segmented and a group of cubes is generated. As a result, the scattered range data are separated and located into each cube, as shown in Figure 1. The segmented data in each cube are called patches. The sizes of the cubes are determined by

  Δ_x = W_x / n_x,  Δ_y = W_y / n_y,  Δ_z = W_z / n_z,      (1)

where W_x, W_y and W_z are the sizes of the outline volume and n_x, n_y, n_z are segmentation thresholds in the X, Y and Z directions, respectively.
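As a minimal sketch of this segmentation step (assuming the range data are held in an (N, 3) NumPy array; the function name and parameter values are illustrative, not from the paper):

```python
# Assign each scattered range point to a cube according to equation (1).
import numpy as np

def segment_into_cubes(points, n_x, n_y, n_z):
    """Return a dict mapping cube indices (ix, iy, iz) to point arrays."""
    lo = points.min(axis=0)
    extent = points.max(axis=0) - lo             # (W_x, W_y, W_z)
    delta = extent / np.array([n_x, n_y, n_z])   # cube sizes, eq. (1)
    idx = np.floor((points - lo) / delta).astype(int)
    idx = np.minimum(idx, [n_x - 1, n_y - 1, n_z - 1])  # clamp boundary points
    cubes = {}
    for key, p in zip(map(tuple, idx), points):
        cubes.setdefault(key, []).append(p)
    return {k: np.array(v) for k, v in cubes.items()}

pts = np.random.rand(1000, 3) * [49.6, 22.4, 8.0]  # outline volume of Sec. 4
cubes = segment_into_cubes(pts, 8, 4, 1)           # 32 cubes, as in Sec. 4
print(len(cubes), "non-empty cubes")
```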
Figure 1. The cube-based segmentation of a 3-D object.
Figure 2. The forms of patches in cubes: (a) triangle; (b), (c) quadrangle; (d) pentagon; (e), (f) hexagon.
2.2. Form of patches in a cube
The patches segmented in cubes are polygonal surface patches. Scattered range data re-sampling means dealing with these surface patches. Assume that a 3-D object is segmented finely enough; then the patches are restricted to the six kinds shown in Figure 2. They are triangles ((a)), quadrangles ((b), (c)), pentagons ((d)), and hexagons ((e), (f)).

2.3. Re-segmentation of patches
As the re-sampling results, quadrangle or triangle patches will be formed, since this is required by Spline or Bézier patches. That means the re-sampling can be carried out directly for triangle and quadrangle patches, while re-segmentation has to be done if patches are pentagons or hexagons. The pentagon shown in Figure 3(a) is re-segmented into a triangle and a quadrangle patch. There are five possible re-segmentation forms. As a rule, the distance between two non-neighboring points is calculated first, then the new boundary is decided by selecting the two points with the shortest distance. The solid line BD re-segments the pentagon into two patches. The same rule can be used for the case of the hexagon, as shown in Figure 3(b). The solid line BE re-segments the hexagon into two quadrangle patches. After re-segmentation, the obtained new patches can be re-sampled just as in the case of triangle or quadrangle patches.

3. Generation of re-sampling points
3.1. Basic algorithm of re-sampling
The distribution of scattered range data in a patch is shown in Figure 4. n represents the unit normal vector of the surface patch, and O is a point placed outside of the surface patch. A line L can be determined by n and O:

  P(t) = O + n·t,      (2)

where t is a parameter. Selecting the three points that are nearest to L among the scattered range data, a plane can be made as

  (P(t) − D_1) · [(D_2 − D_1) × (D_3 − D_1)] = 0,      (3)

where D_1, D_2, D_3 are the selected range data. Substituting eq. (2) into eq. (3),

  (O + n·t − D_1) · [(D_2 − D_1) × (D_3 − D_1)] = 0      (4)

is deduced. Solving equation (4), the parameter t can be found. Substituting t into equation (2), an intersection point P, which is called a re-sampling point, is obtained. Furthermore, by changing the point O in the direction orthogonal to n, new intersection points can be generated. This shows that once n and O are determined, the re-sampling points can be obtained. The following discussion concerns how to determine n and O for the six kinds of patches shown in Figure 2.
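A compact sketch of equations (2)-(4): find the three data points nearest the line, build the plane, and solve for t. The sample coordinates are invented for illustration:

```python
# Compute one re-sampling point from an origin O, unit normal n and data.
import numpy as np

def resample_point(O, n, data):
    """Return the intersection point P for origin O and unit normal n."""
    # distance from each data point to the line L (perpendicular component)
    diff = data - O
    dist = np.linalg.norm(diff - np.outer(diff @ n, n), axis=1)
    D1, D2, D3 = data[np.argsort(dist)[:3]]      # three nearest points
    plane_n = np.cross(D2 - D1, D3 - D1)         # plane normal, eq. (3)
    t = ((D1 - O) @ plane_n) / (n @ plane_n)     # solve eq. (4)
    return O + n * t                             # eq. (2)

data = np.array([[0., 0., 1.], [1., 0., 1.1], [0., 1., 0.9], [1., 1., 1.0]])
P = resample_point(np.array([0.4, 0.4, 0.0]), np.array([0., 0., 1.]), data)
print(P)
```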
Figure 3. Re-segmentation of pentagon and hexagon patches.
Figure 4. The computation of a re-sampling point.
3.2. Generation of re-sampling points on the boundary of a patch
In order to join adjacent surface patches continuously, common re-sampling points on the boundaries between adjacent patches are necessary. The points P_1, ..., P_6 in Figure 5 are the apexes of two adjacent patches; they are re-sampling points already obtained on the edges of cubes. Assuming that M × M re-sampling points are required in the patch, M re-sampling points are needed on the boundary. At first, using P_1 and P_4, M points can be determined on the line segment by

  O_i = P_1 + (i − 1) × (P_4 − P_1)/(M − 1),      (5)

where O_i are the points on P_1 P_4, i = 1, 2, ..., M. The normal vectors of the patches a and b are defined as

  n_a = (P_3 − P_1) × (P_2 − P_4) / ||(P_3 − P_1) × (P_2 − P_4)||,      (6)
  n_b = (P_4 − P_6) × (P_1 − P_5) / ||(P_4 − P_6) × (P_1 − P_5)||.      (7)

Using the obtained normal vectors, the unit normal vector for the re-sampling points on the boundary is

  n = (n_a + n_b) / ||n_a + n_b||.      (8)

With the obtained O_i and n, the M re-sampling points can be generated according to equations (2)-(4). In the case of a triangle patch, the patch is considered as a quadrangle patch in which one boundary degenerates to an apex, called the degenerative point. The degenerative point is chosen as the apex whose opposite side is the shortest one in the triangle. Here the degenerative point is considered as M common re-sampling points.
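A sketch of equations (5)-(8), computing the M boundary origins and the blended unit normal from the six apex points (inputs are assumed NumPy arrays; the sample coordinates are invented):

```python
# Boundary origins O_i (eq. 5) and blended normal n (eqs. 6-8); these feed
# equations (2)-(4) to produce the common boundary re-sampling points.
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

def boundary_origins_and_normal(P1, P2, P3, P4, P5, P6, M):
    O = [P1 + (i - 1) * (P4 - P1) / (M - 1) for i in range(1, M + 1)]
    n_a = unit(np.cross(P3 - P1, P2 - P4))   # normal of patch a, eq. (6)
    n_b = unit(np.cross(P4 - P6, P1 - P5))   # normal of patch b, eq. (7)
    n = unit(n_a + n_b)                      # blended normal, eq. (8)
    return O, n

P = [np.array(p, float) for p in
     [(0, 0, 0), (1, 0, 0.1), (1, 1, 0), (0, 1, 0.1), (-1, 1, 0), (-1, 0, 0)]]
O, n = boundary_origins_and_normal(*P, M=4)
print(len(O), "origins, normal", n)
```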
Figure 5. The generation of re-sampling points on a boundary.
Figure 6. The generation of re-sampling points inside a patch.

3.3. Generation of re-sampling points inside a patch
Based on the re-sampling points on the boundary, the generation of re-sampling points inside the patch is carried out. For a re-sampling point P_ij in Figure 6, M points are obtained on the line segments P_i1 P_iM and P_1j P_Mj, respectively, given by

  O'_ij = P_i1 + (j − 1) × (P_iM − P_i1)/(M − 1),      (9)
  O''_ij = P_1j + (i − 1) × (P_Mj − P_1j)/(M − 1),      (10)

where P_i1, P_iM, P_1j and P_Mj are the re-sampling points on the boundaries of the patch, i, j = 1, 2, ..., M. Using O'_ij and O''_ij, the point O_ij can be determined as

  O_ij = (O'_ij + O''_ij) / 2.      (11)

The unit normal vector n_ij for P_ij is decided by

  n_ij = (P_iM − P_i1) × (P_1j − P_Mj) / ||(P_iM − P_i1) × (P_1j − P_Mj)||.      (12)

Using O_ij and n_ij, the M × M re-sampling points P_ij can be generated according to equations (2)-(4) by changing i and j from 1 to M. In the case of a triangle patch, the method mentioned above can also be used by considering the triangle as a quadrangle patch with a degenerative side, just as discussed in 3.2.

4. Experimental results
For quantitative evaluation, an experiment has been done using the scattered range data of a car model. The number of original scattered range data points of the car is 70421. Figure 7 shows the 3-D range data of the car, whose size is 43.4 x 16.8 x 6.8 (inch^3). The outline volume enclosing the car is 49.6 x 22.4 x 8.0 (inch^3), the number of cubes is 32, and the size of one cube is 6.2 x 5.6 x 8.0 (inch^3). The wireframe car model, which is the re-sampling result, is shown in Figure 8. The intersections of the mesh act as the re-sampling points. The number of patches is 32, and there are 4 x 4 re-sampling points in one patch; the total number of re-sampling points is about 400. The shading representation of the car model, reconstructed with bicubic Bézier surface equations from the re-sampling points [5], is shown in Figure 9. The evaluation of the root mean square error (RMSE) of the reconstructed car is shown in Figure 10.

5. Conclusion
This paper presents a novel approach in which scattered range data of 3-D objects are re-sampled by uniform cube-based segmentation. The proposed technique for generating lattice points can be considered as an interface between an optical measuring machine and 3-D object reconstruction or modeling. Starting from scattered range data, a group of cubes is generated, and re-sampling of the data located in each cube is carried out. As a result, patches with lattice re-sampling points are formed, and the 3-D object can be represented by these points. The enormous scattered range data are also compressed greatly, which is very useful for high-speed display in virtual reality applications.
Figure 7. The scattered range data of a car model.
Figure 8. The re-sampling result of the car model.
Figure 9. The shading of the reconstructed car.
Figure 10. The RMSE evaluation of the reconstructed car.
REFERENCES
1. W.T. Zheng and H. Harajima, Surface Representation Based on Invariant Characteristics, Technical Report of IEICE, IE93-123, pp. 31-38 (1994).
2. H. Nisino, K. Akiyama and Y. Kobayasi, Acquisition of 3-D Object Shape Using Slit-Ray Projection and Reconstruction of Surface Model, Trans. of IEICE, vol. J72-D-II, no. 11, pp. 1778-1787 (1989).
3. A. Amano, Y. Sakagukchi, M. Minoh and K. Ikeda, 3D Model Generation from a Range Image by Regularization, Technical Report of IEICE, PRU92-58, pp. 1-8 (1992).
4. S. Nagashima, K. Suzuki and S. Nagano, A System for Human Body Measurements with Free-Formed Surfaces, IPSJ SIG Notes, Graphics & CAD 38-3 (May 1989).
5. S.J. Wang, Y. Cai and M. Sato, Modeling of 3-D Objects by Re-sampling Range Data, IPSJ SIG Notes, Graphics & CAD 73-2 (Feb. 1995).
Harmonic Curve Design
Dr John R Rankin
LaTrobe University, Australia

ABSTRACT
Many aesthetically appealing curves can be generated using epicycloid and spirographic parametric curve equations. The difficulty in obtaining an interesting and appealing curve comes from having to choose good values for the coefficients in these equations. If a poor choice of coefficients is made, the curve generated will have little regularity and a high amount of confusion. Since these curve types involve up to ten real coefficients, the set of aesthetic curves is a set of measure zero in the infinite coefficient space. Rather than making random searches through coefficient space, another approach, which proved to be far more productive in discovering curves of aesthetic appeal, was investigated. In this approach we looked at the Fourier transform of the curves and performed editing on the coefficients in Fourier space. This approach is especially suited to the epicycloid and spirographic curve types, which turn out to be simply low-order harmonic functions after suitable filtering and a rotation. As a result, curves can now be defined by free-form input and then cleaned up by this method to yield the nearest epicycle or spirograph.

1. INTRODUCTION
Aesthetic curves involving cycles and spirals have long been used in motifs, logos and elementary graphics artwork. Celtic artwork, in particular weaved knotwork, has considerable appeal and is used widely (see Figure 1, taken from [4]). An analysis of this sort of artwork is given in [4]; however, no mathematical prescription is provided for generating and generalizing these patterns. The families of mathematical curves we concentrated on for this research were the spirographs [3] and the epicycles [1], which are 3- and 10-parameter families respectively. Many of the Celtic knotwork patterns of interest have strong similarities to spirographs and epicycles, which are mathematically based. A number of books have described how to draw spirographs and epicycles [1,3] and catalogued some of their variety [1,2]. When one begins to experiment with generating new designs of this sort, one quickly discovers that the desirable or good shapes require very careful selection of the coefficients in the equations. A randomly selected set of coefficients yields drawings that contain very little repetition and far too much detail for the eye to be content with, or else don't match other target criteria.

One way of gaining an overview of the patterns possible within these families is to build graphics animations that show sequences of curves in a family. In order for this to be achievable, the curve parameters have to be taken as integers and only a finite range of values can be considered. Nested loops allow the rounded parameters to assume each value over their allotted ranges, and a graph is displayed for each parameter combination. Parameter increments of ±1 may suggest non-smooth transitions between animation frames; however, on a fast machine the human eye manages to interpolate well and a good understanding of smooth real parameter variations is obtained. Animation programs were written for the spirograph and epicycle families with facilities to freeze the animation, view current parameter values and reverse the animation direction. However, the possible paths through parameter space for each of these curve families are infinite, and these animation programs follow particular parameter-space paths (which are cycles). Animations using different paths produced considerably different sequences and effects. Consequently, even with the fastest animation, these programs, as interesting to watch as they are, can show only a negligible or measure-zero fraction of the possibilities. Additionally, the chances of finding a particular desired pattern are still very low.

Another approach to the problem of locating desirable patterns is to start by precisely defining an aestheticity measure [6]. The criterion can include non-fuzzy and fuzzy features such as the desire for symmetry, no sharp corners, a uniform space-filling appearance and no triple or higher self-intersections. Once one has such a measure, a hunt program can be implemented to optimize the objective function. This is a program that searches through all of parameter space (again only finite subranges of integer values for the parameters are chosen), selecting parameter combinations that meet the criterion and saving them in a file. Since the search is a long process, the hunt program should be interruptible such that it can be resumed from the same place at a later time. Such a program requires special design, and a hunt program was implemented by the author in a similar context (to solve a multi-parameter Diophantine equation for the design of a graphic character). A companion program reads through the file of solutions found and displays them one after the other without stopping, while allowing the user to pause and query any solution. However, generally such a measure is difficult to express mathematically. Again the chances of finding a particular desired pattern are very low, it could take a long time to find, and in any case the integer subset of parameter values might not even include the desired solution.

The third approach investigated is to allow the user to describe to the computer roughly what the required pattern should look like, by a free-form user-input drawing, and then have the program analyse this drawing for the closest match to the desired curve type. This is the approach taken for the research presented here. A program is provided that allows the user to sketch the target curve using the mouse input device. Mouse-drawn curves are typically inaccurate in expressing the user's desired curve. The input data is then Fourier analysed by a fast Fourier transform. After this, a filter is applied to the results in Fourier space and the data are then transformed back into the original space. After a coordinate transformation the required spirograph or epicycle is obtained. This method does not restrict the curve parameters to integer values over a finite range, and does not have uncertainty over finding a solution or cause long delays in finding the solution, unlike the previous two approaches. It is therefore far more powerful than the other two approaches and worthy of a full investigation. Once the parameters for spirograph or epicycle curves are obtained, further operations can be performed on these curves, such as banding and interlacing as described in section 2.

2. TARGET CURVE FAMILIES
The curves to be rendered are the spirographs and epicycles. The parametric formula for spirographs is [3]:
  x(θ) = (R_R + R_W) cos(θ) + R_P cos((R_R/R_W) θ),
  y(θ) = (R_R + R_W) sin(θ) + R_P sin((R_R/R_W) θ).      (1a&b)

This is a three-parameter family of curves, each one forming a loop over the range θ = 0 to 2πN_W, where R_W = αN_W, R_R = αN_R, and N_R and N_W are relatively prime integers. All curve parameters are real but the ratio R_R/R_W should be rational. The parametric formula for epicycles is [1]:

  x(θ) = R_1 cos(F_1 θ) + R_2 cos(F_2 θ) + R_3 cos(F_3 θ) + R_4 cos(F_4 θ) + R_5 cos(F_5 θ),
  y(θ) = R_1 sin(F_1 θ) + R_2 sin(F_2 θ) + R_3 sin(F_3 θ) + R_4 sin(F_4 θ) + R_5 sin(F_5 θ).      (2a&b)

This is a ten-parameter family of curves, each one forming a loop over the range θ = 0 to 2π. The curve parameters R_i are real and the F_i are integers (i = 1 to 5). Spirographs are a special case of epicycles. A spirograph or epicycle can be converted to look like a piece of Celtic knotwork by banding and interlacing. (The banding and interlacing algorithms will be presented elsewhere.) These operations are shown in Figure 2. However, not all spirographs or epicycles are suitable for this - some cannot match Celtic art, as for example the epicycle in Figure 3. Likewise, not all Celtic weaving knotwork can be matched to spirographs and epicycles. The Fourier analysis described in the next section enables us to find the closest available spirograph or epicycle.
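As a brief illustration of equation (1), the following sketch plots one spirograph for an arbitrary choice of parameters (N_R and N_W relatively prime); the parameter values are invented:

```python
# Generate and plot a spirograph according to equation (1a&b).
import numpy as np
import matplotlib.pyplot as plt

alpha, N_R, N_W, R_P = 1.0, 5, 3, 2.5
R_R, R_W = alpha * N_R, alpha * N_W

theta = np.linspace(0.0, 2 * np.pi * N_W, 2000)   # closes after 2*pi*N_W
x = (R_R + R_W) * np.cos(theta) + R_P * np.cos((R_R / R_W) * theta)
y = (R_R + R_W) * np.sin(theta) + R_P * np.sin((R_R / R_W) * theta)

plt.plot(x, y)
plt.axis("equal")
plt.show()
```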
3. FOURIER ANALYSIS OF CURVES
The Fast Fourier algorithm is described in [5]. It enables one to break down any periodic input curve into its frequency components. It was suitable here to use the complex discrete Fourier transform with z(θ) = x(θ) + i y(θ), where i = √(−1). Taking N samples z_k, for k = 1 to N, these are transformed into frequency amplitudes z'_j by the Fast Fourier Transform. The original curve can then be recovered by the Fourier expansion:

  z_k = (1/N) Σ_{j=−N/2}^{−1} z'_{j+1+N} e^{2πijk/N} + (1/N) Σ_{j=0}^{N/2} z'_{j+1} e^{2πijk/N},      (3)

where k = 1 to N. Tests were performed initially whereby spirographs were Fourier transformed and then regenerated from the Fourier expansion, equation (3) above. These tests all showed the regenerated points to be visually very close to the original input set.
182 frequencies were removed, i.e z'j set to zero for j = N/2+l to N giving the filtered amplitudes z"j" When the curve was regenerated from z"j the non-smooth free-form nature of the curve was still evident and the regenerated curve was not a spirograph or epicycle. The next operation in Fourier space was to remove the DC zero frequency component and to select out only the D dominant frequencies with D = 2 for spirographs and D = 5 for epicycles. Next the amplitudes were replaced by the root-mean-squared amplitude sign adjusted to retain the original signs of the real and imaginery parts to yield the new amplitudes z'"j. Interpreting the real and imaginery components of z"'j as the cartesian coordinates still does not result in the equations for spirographs or epicycles (equations 1 and 2 above) and the following scaled rotation transformation of coordinates is required:
x=(x'"+y'")/2 y = (y'"-x'" )/ 2
(4)
where x'" and y"' are the real and imaginery parts of z'".
5 RESULTS Figure 4 shows a sample free-form input consisting of N = 100 points. When these were transformed using the Fast Fourier transform and then reconstructed using equation 3 above a virtually identical set of points was obtained. When negative and zero frequencies were removed and components replaced by signed rms values the regenerated curve appeared as in Figure 5. This is clearly close to the desired spirograph. After the rotation of coordinates (equation 4 above) the desired sprographic curve was obtained (Figure 6). This was then banded and interlaced as shown in Figures 2b and 2c. Further work is progressing with the hand input of more complex curves. 6. CONCLUSIONS Fourier analysis of free-form mouse input can provide an intelligent interface for user input. Using suitable frequency filtering and transformations in Fourier space described above the nearest available smooth curves to the spirograph or epicycle curve families can be obtained. This is a much more efficient way of obtaining particular curve shapes than searching through parameter space. This research suggests that Fourier analysis of free-form user input can be used to smooth user-designed graphics in a pleasing way which can assist in curve design and user-interface friendliness overcoming the drawing limitations of input devices such as the mouse and even couter-balancing for the usual lack of drawing skills in the user.
REFERENCES 1. Edwards, R "Microcomputer Art", Prentice-Hall of Australia, 1985. 2. Guest, Julius "New Shapes, A Collection of Computer-Generated Designs", published by Robin A Vowels, RMIT, Melbourne, 1979. 3. Rankin, J "Computer Graphics Software Construction", Prentice Hall Australia, 1989, pp 79-82. 4. Bain, George, "Celtic Art, The Methods of Construction", Constable London, Harvey Menzies Johnston, 1977, pp 28, 36.
5. Conte, S. and de Boor, C., "Elementary Numerical Analysis: An Algorithmic Approach", McGraw-Hill International, 1981, pp 268-283.
6. Stiny, George and Gips, James, "Algorithmic Aesthetics, Computer Models for Criticism and Design in the Arts", University of California Press, 1978.
Figure 1. Celtic weaving patterns with similarities to epicycles.
Figure 2. (a) A simple spirograph. (b) The result of applying the banding algorithm. (c) The result of applying the interlace algorithm.
Figure 3. An epicycle unsuitable for representing Celtic art or banding.
Figure 4. Free-form input (diamonds) with the FFT-regenerated points (+ markers).
Figure 5. The curve generated from the positive rms FFT frequencies on the data.
Figure 6. Rotated and rescaled spirograph derived from the free-form input of Fig. 4.
III.8 Screen Design 3
GUIs and SUIs: more of the same or something different?
1 INTRODUCTION Solid user interface (SUI) is a term coined in Japan to distinguish the user interface of products with embedded microprocessors like video cassette recorders, photocopiers and cellular phones from the Graphical User Interfaces (GUIs) of computer applications. SUIs are a neglected area in HCI research despite the substantial revenue earned from sales in both industrial and consumer applications. The great potential of SUIs that are easy to use is evident from the success of products such as the VideoPlus (VCRPIus) programmer, which has reduced the mystery of video programming in households world-wide. Fortunately interest in the usability of SUIs is growing [3,14]. Many factors may have contributed to this shift: • HCI has evolved from large-scale systems, through personal cornputing, to more recent (and yet to be consolidated) personalised SUI devices, such as Personal Digital Assistants. • Industrial products with SUIs, which were once stand-alone devices used in remote workplaces (and often by blue-collar rather than white-collar workers), are now being integrated into organisational data handling systems. For example, stand-alone refrigeration case controllers in supermarkets are now linked, allowing computer control of a complete store, and even chain-wide monitoring from a central headquarters. So problems of SUI use are now set in the wider context of essential computing systems. • Design for usability has not kept up with the dramatic increase in functionality made possible by increasing sophistication and miniaturisation of electronic components, to the point where there is a crisis in usability that simply cannot be ignored [12]. • While feature promotion and price wars still dominate industrial and consurner markets for products with SUIs, usability is becoming a rneans of securing competitive advantage [9]. For example, in developing a new TV control interface Nokia Consumer Electronics adapted their product development process, bringing in expert usability consultants, in order to a c h i e v e a highly usable interface that would distinguish them from their competitors [1]. Despite the trend towards integrating products with SUIs and computer systems, we believe the distinction between GUIs and SUIs is worth preserving because user interaction with SUIs differs significantly from interaction with GUIs. And for many industrial and commercial applications SUIs offer distinct advantages over GUIs. The objectives of this paper are to encourage increased attention to SUIs in the research community. We will identify characteristics that differentiate SUIs from GUIs and highlight the different methods for analysis, design and testing that SUIs require.
188 2 SUI CHARACTERISTICS: CONSTRAINTS AND OPPORTUNITIES While user interaction with GUIs is supported by a mouse supplemented by a keyboard for input, and a high resolution display for feedback, SUIs are typically limited to a restricted set of buttons and small, low-resolution display. To understand the opportunities these offer, and constraints they impose we need to investigate SUI characteristics systematically. 2.1 W h a t u s e r s s e e
One button one function When buttons are dedicated to single functions the SUI is likely to be relatively transparent to the user as long as there are not too many buttons, their functions are clearly marked or can be learned and remembered, and the buttons are differentiated adequately in position, appearance and feel. However with the high number of functions offered by many microprocessor products, the one b u t t o n / o n e function solution is not viable. It may take up more surface space than is available on the product (both for the buttons and their identifying labels); it is likely to be expensive to manufacture; it will at best be off-putting to users; at worst it will lead to errors in use if the physical design does not allow for logical arrangement and differentiation of buttons with different functions. Hidden buttons One of the solutions most frequently adopted to solve the problem of too many buttons is to hide infrequently-used buttons under flaps in order to present an apparently simple interface (corresponding to unopened windows on a GUI display). Some users, however, may never find these hidden buttons; or even if they know they are there, be unable to find or open the flap to access them because the industrial design of the SUI does not cue interaction effectively. Unlike windows there is no equivalent of using a search routine to find out where the buttons are, nor any convention of clicking to open them.
Hidden functions An alternative way of reducing button count is to attach several functions to each button: either by double (or triple) sets of labels with a means to shift among them; or, alternatively, by soft keys - buttons tagged by variable labels that either change automatically, according to the stage in an interaction (the buttons are modal), or which users can scroll through to find the function they need. Modal interaction underlies usability problems in both GUIs and SUIs, particularly for products or functions that are used relatively infrequently. If it cannot be avoided then users need to be prompted that the status of function keys has changed. But available cues are likely to be limited given that in many SUIs variable information is presented on graphically-limited liquid crystal displays.
Limited alphabetic input SUIs are severely restricted in their capacity for alphabetic input. Neither of the standard options - cursor input or modal 'international' keypads - suit entry of more than a few characters; and solutions such as soft keyboards on SUI displays, or miniaturised attachment keyboards have some enthusiasts, but tend to the 'swiss armyknife' approach of providing partially effective solutions to interaction problems [16]. There is potential for voice input, especially as this increases users' freedom to use products while on the move, or in environments where space is constrained. But voice activation and control is fraught with difficulties for users [13] and speech analysis of any level of sophistication is likely to require processing and memory capacity well beyond the typical SUI product.
Limited display feedback
SUI displays tend to be small, with either segmented digits or only a few lines of dot matrix characters. If full graphics are used they are likely to be constrained in size and resolution. So displaying overviews (a common feature of large, high resolution
GUI displays which helps users find functions, or understand how presently displayed information relates to the recent sequence of activity with the product) is simply not possible. Use of additional cues to change of status, such as colour, sound or illumination, is often limited by the need to conserve memory capacity or power and, of course, cost.
2.2 What users don't see
Although to users the most obvious differences between GUIs and SUIs lie in their external interface components, limited processing capacity and power supply have an impact on how those components can be deployed in user interactions.
Limited processing power
Products with embedded processors tend to be more limited in power and program size than those driven by central processing units, so considerable ingenuity is required to make the most of opportunities for feedback and responsiveness to user interactions. For example, when there is insufficient processing capacity to drive a fully flexible display, specially designed icon characters may be used to give graphic visual feedback, with considerably less demand on memory than full graphics.
Limited power supply
Although not all products with SUIs need to be battery powered, those products which have to be portable or pocketable need small batteries. Minimising battery weight can be critical for ease of use if a product must be held single-handedly, and must be balanced and stable in the user's hand. Compactness also often helps to sell these products, and, up to a point, makes them more usable. At the same time small batteries restrict user feedback through illuminated displays and keys, and sound cues. Battery technology has not kept pace with the advances in electronics that have significantly reduced product size and cost at the same time as increasing functionality.
2.3 Design opportunities
Despite both internal and external constraints, the combined soft/hard SUI interface presents opportunities for promoting interactions that are easy to learn and carry out.
More than just push buttons
If input components are custom-designed or dedicated to particular functions, SUIs can extend beyond standard push buttons to exploit fully the hands' and fingers' capabilities. Turning knobs, rollers, sliders, pull rods etc. allow precise and intuitive adjustment, often signalling a position or state more efficiently than display graphics can. Research is underway to incorporate tactile and force feedback into input components, and the consumer and automotive industries have already seen a shift back from 'digital' button input to more responsive knobs and dials.
Supporting ergonomics of operation
Since SUIs can be custom-designed, there is freedom to shape the physical form of the interface to fit users' hand and body postures. This opportunity is being exploited for hand-held measuring equipment, ticketing devices etc.
Supporting functional understanding
Custom design also means the layout of the interface can be planned to communicate a sequence of operation through fixed functional grouping. Combined with labelling and graphics, this can promote understanding of use in a way that is sometimes not as effective with more flexible GUI displays.
Flexible buttons
A novel approach, yet to be developed fully, but which could simplify modal interaction, is to change the buttons, rather than the labels, according to interaction mode.
With a flexible casing material, buttons required at any point of interaction can be shaped and raised from the surface when needed by means of active, computer-controlled elements underneath; then removed from the perceptible interface when they are not required [7].
Product identity
Corporations have used (and defended) graphic identity to reinforce the 'look and feel' of their software products. Their motive is to boost sales by differentiating their products from competitors', but also to facilitate transfer of learning to new products, thereby increasing customer experience of ease of use. SUIs provide additional opportunities to build product identity by exploiting both three-dimensional and graphic qualities in user interactions. Carried across a range of products, SUIs can literally have an identifiable 'feel' that communicates product features and functionality very directly to new users.
3 SUI DEVELOPMENT: DESIGNING FOR MAXIMUM USABILITY
There are many products with SUIs that frustrate users through poor design (video recorders are often cited). Poorly designed SUIs may lose revenue for their developers (if self-service machines are difficult to use, potential customers may reject the service). And in safety-critical tasks and environments (transport, medical care, industrial systems maintenance), poorly designed SUIs can be positively dangerous. So organisations developing SUI products need to focus their development processes on achieving maximum usability.
3.1 User observation and testing
In many cases software engineers are not good designers of interfaces for other users who do not share their specialist experience of the product [4]. So organisations developing products with either GUIs or SUIs need to research their users' needs and to test prototype products with users as early and as often as they can [5]. Organisations developing both industrial and commercial products with SUIs have tended to lag behind GUI developers in fostering contact with end-users. They may have a tradition of close communication with their customers via their sales force, who have built relationships with clients over many years. But this dialogue often focuses on product price and features (compared to those offered by competitors), unless end-users have had difficulties or made complaints. So these organisations still have to make specific efforts to research products beyond the purchasing decision, in their environment of use [6]. Specialists in human-computer interaction have developed innovative methods for observing and evaluating users' interactions with GUIs, and these days many software development houses have 'usability labs'. But lab methods are not always easily transferred to products with SUIs. Typical GUI applications (such as spread-sheets) tend to be the focus of work in relatively static office environments, which can be partially simulated in lab conditions. SUIs, however, are often incidental tools for mobile work (such as industrial control devices) or leisure activities (electronic games) where user interactions cannot be replicated well outside the real use environment. Designers need to make sure they engage with end-users in their own environments (albeit sometimes inhospitable or private) in order to understand user requirements from the earliest stages of product development. Similarly, as designers develop prototypes they must test them in the environment in which the products will be used. If there are criticisms that models of user behaviour fail to capture the local contingencies that trigger users' actions [15], this is likely to be particularly so for
products and users whose natural domain is far removed from the usability lab. New methods are required to ensure a good fit between innovative products and their proposed environment; for example, 'Informance', where designers act out future user scenarios in realistic settings, to understand and communicate the impact of new designs [2].
3.2 Product prototyping
There are many tools for rapidly prototyping GUIs to yield a close resemblance to the final products, and there are well-established techniques for recording from these tools in usability labs in order to develop detailed profiles of product performance [11]. In contrast, the combined hard and soft interaction with SUIs is more difficult to prototype. SUIs can be partially represented on touch screens, but at some point prototypes representing the dimensions and weight of the solid interface must be made, because the real arrangement of the components influences the smoothness of user interactions. Three-dimensional modelling of products is more expensive and less flexible than screen-based modelling, so economising by producing a series of approximations to the finished product is generally accepted. It can also be reassuring to practitioners working with SUIs to find that much of the data captured by direct recording from GUI prototypes in usability labs is never analysed [11]. So the difficulty of preparing three-dimensional prototypes need not always be compounded by the requirement to record directly from them.
3.3 Organisational processes for product development
Usability practitioners often complain that they lack opportunity to contribute to product development, or that the opportunity comes too late in the development process [10]. If this is the case in GUI development it is even more so for SUIs, where commitment to tooling schedules means the hard interface is fixed early on in the development process. For hard and soft interfaces to work together well, design of the two must be coordinated, so typical product development cycles may need adjustment to bring the two into alignment. There are many examples of failures to coordinate the hard and the soft. But successes, such as the Mercury one-2-one phone, which links a hard function key to soft, modal cues on the display by means of a moulded 'lens' detail, testify to the power of coordination [8]. Interactive rendering tools, such as MacroMedia Director and Alias, can help link hard and soft before commitment to solid prototyping. But these tools will only be influential if used within the context of an organisational approach to design that brings hardware and software together with the common goal of serving user needs. In typical engineering company structures hardware and software products are, to some extent, independent, but in SUI users' minds, naturally enough, they are not.
4 FOCUS FOR FUTURE DEVELOPMENT IN SUI DESIGN
As touch-screen technology develops, and GUI design aims for increased realism in portraying three-dimensional objects on two-dimensional screens, screen-based solutions may be regarded as a universal answer to information handling problems. They have the advantages of flexibility and relatively low manufacturing costs. But there are many environments in which SUIs are the most appropriate technology, and a growing
understanding of both products and environments of use is necessary to ensure that solutions are applied appropriately. Research is necessary to establish the best organisational structures, development tools and techniques to bring about increased SUI usability. Manufacturers must become more oriented to end-users than at present, and their efforts must be supported by prototyping tools and testing that allow close replication of real use conditions. The tools must help bridge hardware and software, combining the skills of human scientists, industrial designers, graphic designers and hardware and software engineers to deliver user-centred solutions.
REFERENCES
1 Black, A., Bayley, O., Burns, C., Kuuluvainen, I., and Stoddard, J. Keeping viewers in the picture: real-world usability procedures in the development of a television control interface. CHI '94 Conference Companion (1994) 243-244. New York: ACM.
2 Burns, C., Dishman, E., Verplank, W. and Lassiter, B. Actors, hairdos and videotape - Informance Design. CHI '94 Conference Companion (1994) 119-120. New York: ACM.
3 Buur, J. and Windum, J. MMI Design - Man Machine Interface. (1994) Copenhagen: Danish Design Centre.
4 Gentner, D.R. and Grudin, J. Why good engineers (sometimes) create bad interfaces. CHI '90 Conference Proceedings (1990) 277-282. New York: ACM.
5 Gould, J.D. and Lewis, C. Designing for usability: key principles and what designers think. Communications of the ACM, 28 (1985) 300-311.
6 Grudin, J. Systematic sources of suboptimal interface design in large product development organisations. Human-Computer Interaction, 6 (1991) 147-196.
7 Harada, A. and Tamon, H. Simulating mental images through user operation. Industrial Design, 157 (1992) Tokyo: JIDA.
8 London Business School. Case study of handset design & development: Mercury one2one (1994).
9 March, A. Usability: the new dimension. Harvard Business Review, Sept/Oct (1994) 144-149.
10 Mulligan, R.M., Altom, M.W. and Simkin, D.K. User interface design in the trenches: some tips on shooting from the hip. CHI '91 Proceedings (1991) 232-236. New York: ACM.
11 Nayak, N.P., Mrazek, D. and Smith, D.R. Analyzing and communicating usability data. SIGCHI Bulletin 27/1 (1995) 22-30.
12 Nussbaum, B. and Neff, R. I can't work this thing! Business Week, April 29 (1991) 58-66.
13 Oviatt, S. Interface techniques for minimizing disfluent input to spoken language systems. CHI '94 Proceedings (1994) 205-210.
14 Sato, K. User interface design theory. Special feature: cutting edge on interface design. Industrial Design, 157 (1992) Tokyo: JIDA.
15 Suchman, L. Plans and Situated Actions (1989) Cambridge: Cambridge University Press.
16 Verplank, W. Sketching metaphors: graphic invention and user-interface design. Friend 21: International Symposium on Next Generation Human Interface (1991) Japan.
Vision-Based Human Interface System with World-Fixed and Human-Centered Frames Kang-Hyun JO, Yoshinori Kuno, and Yoshiaki Shirai Dept. Mech. Eng., Faculty of Eng., Osaka Univ. 2-1, Yamada-oka, Suita, Osaka 565, Japan E-mail:
[email protected]
This paper presents a vision-based human interface system that enables a user to give 3D positioning commands by moving his hand. The user can choose either the world-fixed frame or the human-centered frame as the reference frame with which the system interprets his motion. If he uses the latter, he can move the object forward by moving his hand forward even if he changes his body position. The key to the realization of a system with this capability is vision algorithms based on the multiple view affine invariance theory. We demonstrate an experimental system as well as the vision algorithms. Appropriate uses for both frame cases are discussed.
1. INTRODUCTION
Since hand gesture is one of the important means of communication for humans, research interest has been increasing in recognizing hand gestures and using them for computer-human interfaces. Hand gestures are used in various ways. An important one is sign language, expressing some meaning by a particular hand shape and/or motion. Darrell and Pentland [1] have demonstrated a vision system to recognize this kind of hand gesture. Indicating the position and orientation of an object in 3D space is also one of the major usages of hand gestures. We concentrate on this type of gesture in this paper. Fukumoto et al. [2] have proposed a stereo vision system where a user can point at a place on the computer display screen with his hand and give commands by hand gestures. We have also presented a system where a user can move an object in the 3D CG world by moving his hand [3]. Such conventional vision systems measure hand positions and motions with respect to the camera-centered coordinate system; that is, they interpret hand gestures in the world-fixed frame. This is appropriate in cases where the user gives commands by hand gestures in the world where the results of the commands take effect. For example, suppose that the user asks a robot beside him to bring something by pointing at it with his finger tip. The recognition system should interpret the pointed direction in the world-fixed frame without considering the position of the user.
There are, however, other cases where gesture recognition should take the user's position into account, such as moving 3D CG objects by hand gestures while watching a headmount display, or controlling a tele-operated mobile robot while watching the images sent from the cameras on the robot. In these cases, it is natural and easy to use if the user can move the object in the 3D CG world, or the robot, forward by moving his hand forward. The user can do this with conventional systems if he does not move his body from the initial position. However, he cannot expect the system to work as well when he moves. Recognition systems should interpret hand gestures considering the user's position in such cases. In other words, they should obtain 3D information in the human-centered frame. In this paper, we present a vision-based human interface system that can interpret hand gestures both in the world-fixed frame and in the human-centered frame. It adopts vision algorithms based on the multiple view affine invariance theory [4,6]. We demonstrate an experimental human interface system and propose appropriate usages for the two frame cases.
2. AFFINE INVARIANTS FROM MULTIPLE VIEWS
This section briefly describes the multiple view affine invariance on which our system is based [6,7]. We adopt weak perspective projection as our camera model. Suppose we have a set of five 3-D points Xi, i ∈ {0, ..., 4}. We use four of these points to establish basis vectors Ei with origin X0 (see Figure 1):

Ei = Xi − X0,  i ∈ {1, 2, 3}.  (1)

In this basis, the fifth point, and any other point, is given by

X4 = X0 + αE1 + βE2 + γE3  (2)

for some α, β, and γ. The coefficients α, β, and γ do not change under any 3D affine transformation and are thus called affine invariants. They can be viewed as invariant coordinates of point X4 in the basis and are related to the 3D structure of the object.
Figure 1. Affine basis.
Since weak perspective projection is a linear transformation, it is possible to find the coefficients α, β, and γ in terms of the projected points. For a particular view, we obtain
x4 − x0 = αe1 + βe2 + γe3  (3)
where x0 ... x4 are the projected points and e1 ... e3 are the projections of the basis vectors. This means that we can derive two equations with three unknowns from the five point locations in a single 2D image. The problem is underdetermined. However, if a second view with known correspondence to the first view is available, we can obtain the coefficients α, β, and γ. Hand position and orientation can be calculated based on this method. Details are found in [5].
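To make the two-view computation concrete, here is a minimal sketch (not from the paper) that stacks the four linear equations from the two views and solves them by least squares; NumPy and the variable names are assumptions.

    import numpy as np

    def affine_coords(view1, view2):
        """Recover the affine invariants (alpha, beta, gamma) of the fifth
        point from two weak-perspective views of five projected points.

        Each view is a 5x2 array of image points x0..x4. Per view,
        x4 - x0 = alpha*e1 + beta*e2 + gamma*e3 gives two scalar
        equations; stacking both views yields four equations in three
        unknowns.
        """
        rows, rhs = [], []
        for view in (view1, view2):
            x = np.asarray(view, dtype=float)
            e = x[1:4] - x[0]           # projected basis vectors e1, e2, e3
            rows.append(e.T)            # 2x3 block: one row per image axis
            rhs.append(x[4] - x[0])     # projected x4 - x0
        A = np.vstack(rows)             # 4x3 system, equation (3) per view
        b = np.concatenate(rhs)
        coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
        return coeffs                   # (alpha, beta, gamma)

With exact weak-perspective data the stacked system is consistent and least squares recovers the invariants exactly; with image noise it returns the best-fit coordinates.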
3. EXPERIMENTAL HUMAN INTERFACE SYSTEM
We have developed an experimental human interface system using the algorithms described in Section 2. Figure 2 shows the system configuration. The system consists of a personal computer (IBM PS/V) with an image processing board (Sharp Flex Vision), and two cameras on a computer-controlled pan-tilter. The image processing board captures images and the transputer on the board carries out feature extraction and invariant calculation. The personal computer moves 3D CG objects according to the 3D information sent from the image processing board. A CRT display or a headmount display can be chosen as a display device.
Figure 2. System configuration.
The user of this system can move an object in the 3D computer graphics world by his hand motion. He can control the object position by the tip of his index finger and change the object attitude by the direction of the finger. He can rotate the object around the index finger orientation by rotating his hand with his thumb up. Although our final goal is to develop a human interface system without any user attachments, we use special markers for reference points and a glove with three marks to make image processing fast and reliable in the current implementation.
(a) Reference object for the world-fixed frame case.
(b) System operation in the human-centered frame case. Four marks are attached to the user for reference.
Figure 3. Reference point settings and overview of the system operation (b).
The system can interpret hand gestures both in the world-fixed frame and in the human-centered frame, although the user must select one of them in advance in the current implementation. When we use the world-fixed frame, we put an object as shown in Figure 3(a) in the camera field of view. When we use the human-centered frame, we attach three point marks on the upper body and one on a knee, as shown in Figure 3(b). We consider the situation in which the user is sitting on a swivel chair. He gives commands to the computer while watching the display. The user can use the system even if he turns the chair. Since we use special marks as features, the system can make do with only simple image processing techniques. First, two synchronized images are taken. They are thresholded to extract the marks. The centers of gravity of these marks, including the line mark on the index finger, are used as the feature positions in the images in the 3D information calculation. The areas around the current mark positions are examined in the next frame computation to track features.
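A hedged sketch of this per-frame mark tracking (threshold, centroid, local window search); SciPy is assumed, and the window size and threshold values are illustrative, not from the paper.

    from scipy import ndimage

    def mark_centroids(image, thresh=200):
        """Threshold a grayscale frame and return the centroids of the
        bright marks via connected-component labelling."""
        binary = image > thresh
        labels, n = ndimage.label(binary)
        return ndimage.center_of_mass(binary, labels, range(1, n + 1))

    def track(prev_positions, image, window=15, thresh=200):
        """Re-find each mark near its previous position: only the window
        around the last known centroid is examined in the next frame."""
        h, w = image.shape
        updated = []
        for (y, x) in prev_positions:
            y0, x0 = max(0, int(y) - window), max(0, int(x) - window)
            y1, x1 = min(h, int(y) + window), min(w, int(x) + window)
            found = mark_centroids(image[y0:y1, x0:x1], thresh)
            if found:                       # take the first mark found
                updated.append((y0 + found[0][0], x0 + found[0][1]))
            else:
                updated.append((y, x))      # keep old position if lost
        return updated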
4. EXPERIMENTS
We have examined whether we can control an object in 3D CG by hand gestures with the proposed system. The experimental results show that the system can track features and calculate 3D information at 10 Hz, and that the user can move a 3D CG object in both frame cases. Figures 3(b) and 4 show an overview of the system and an example of the system operation, respectively. In principle, we need the human-centered frame when we use a headmount display as a display device. We made an experiment on this point. A subject sat on a swivel chair. He operated the system in both frame cases while turning the chair. The human-centered case had no problem when the rotation angle was within ±45 degrees.
(a) Directing leftward
(b) Heading left
(c) Downward
(d) Heading down
Figure 4. System operation examples. The hand gestures shown in (a) and (c) move the object in CG as in (b) and (d), respectively.
When the angle was larger than that, the reference marks were occluded by the body in the current implementation. This problem can be solved if we use multiple cameras. The operation in the world-fixed frame was uncomfortable when the angle became larger than ±10 degrees.
5. DISCUSSION
We have proposed a vision-based human interface system with which we can operate in both the world-fixed and human-centered frames. We need to select either of them depending on our applications and/or situations. Preliminary experiments with the developed system have led us to consider that there are three cases in terms of the frame selection.
Case 1. The case where the image on the display shown to a user does not move when he moves. This is the case that we have mainly experimented with. The user picks up a 3D CG object and moves it by his hand motion. The experimental results in Section 4 show that the human-centered frame should be used in this case. However, both frame cases can be used when we use a conventional CRT display. This is because we have a real existence, the CRT display, on which we can easily establish the world-fixed frame. Although we can use both frames, we have to operate considering which frame we are using, because the same hand motion might be interpreted differently depending on the body position.
Case 2. The case where the image on the display shown to a user moves as he moves. The user's position should be measured by some means in this case. The proposed method can be used for this purpose. This is the case for conventional virtual reality systems with a headmount display. When the user moves, the image on the headmount display changes according to his motion. The world-fixed frame should be used in this case. For example, when the user points at something with his hand, the hand direction should be interpreted in the world-fixed frame. The case where we see the world directly instead of through a display can be included in this case. However, the human-centered frame operation is useful in some cases. An example is the case of
controlling a remote operation robot. The user watches the image sent from the camera on the robot and controls the robot's arm or its body motion. To control the actual robot, the human-centered information is more convenient. Since the user's position is known in this case, the hand's position in the world-fixed frame can be translated into that in the human-centered frame. However, this translation process might increase measurement errors.
Case 3. Combination case. An example is the following. We have many monitors of surveillance cameras whose pans and tilts can be controlled. The user chooses one of them by pointing at it with his hand. In this case, the world-fixed frame should be used. Then, the user controls the pan and tilt of the selected camera by his hand. The human-centered frame is more appropriate in this situation. Since there are many monitors and the distance between the user and the cameras is long, it is not comfortable to establish the fixed frame on the monitor screen each time a new camera is chosen.
Our current theory for the selection is as follows. If we can easily consider a target object as a part of our body, the human-centered frame is preferred. Otherwise, the world-fixed frame should be chosen. It can also be used if we can easily establish the reference frame on some place, even in the first case. We are planning to make experiments on this theory using the system presented in this paper.
6. CONCLUSION
We have proposed that a vision-based human interface system needs to interpret hand motions both in the world-fixed frame and in the human-centered frame. We have developed a system with this capability using multiple view affine invariants. Experimental results confirm the usefulness of both frame cases.
REFERENCES
1. T. Darrell and A. Pentland, "Space-Time Gestures", IEEE Computer Vision and Pattern Recognition, pp. 335-340, 1993.
2. M. Fukumoto, K. Mase, and Y. Suenaga, "Real-Time Detection of Pointing Actions for a Glove-Free Interface", IAPR Workshop on Machine Vision Applications, pp. 473-476, 1992.
3. R. Cipolla, Y. Okamoto, and Y. Kuno, "Robust Structure from Motion using Motion Parallax", IEEE Fourth International Conference on Computer Vision, pp. 374-382, 1993.
4. J.J. Koenderink and A.J. van Doorn, "Affine Structure from Motion", J. Opt. Soc. Am. A, Vol. 8(2), pp. 377-385, 1991.
5. Y. Kuno, M. Sakamoto, K. Sakata, and Y. Shirai, "Vision-Based Human Interface with User-Centered Frame", IROS '94, pp. 2023-2029, 1994.
6. J.L. Mundy and A. Zisserman, editors, Geometric Invariance in Computer Vision, Chapters 1 and 23, pp. 1-39, pp. 463-519, MIT Press, 1992.
7. S. Vinther and R. Cipolla, "Towards 3D Object Model Acquisition and Recognition using 3D Affine Invariants", Technical Report CUED/F-INFENG/TR136, University of Cambridge, 1993.
FUZZY REASONING APPROACH TO ICONIC INTERFACE DESIGN Rungtai Lin
Industrial Design Department, Chang Gung College of Medicine and Technology
259 Wen-Hua 1st Road, Kwei-Shan, Tao-yuan 33333, Taiwan
Abstract
Understanding how users recognize an icon is as complex as understanding visual perception itself. Because the cognitive factors that affect the evaluation of icons have not been properly analyzed, the comprehension of pictorial symbols is typically ill-defined. The fuzzy approach seems to offer a possible solution to this methodological problem. To remedy the disadvantages of the conventional approach, and to use the full potential of the computer as an aid to icon design, the purpose of this study is to propose fuzzy graphic rating as a technique for evaluating icon design. Although a fuzzy rating can be conducted using a pencil-and-paper technique, analysis of ratings obtained by pencil and paper is difficult. In addition, iconic interface design is highly graphics-oriented, so graphics play an important role in the design process. Therefore, building on CAD technology and fuzzy graphic rating as an approach to iconic interface design, this paper proposes a more systematic treatment of iconic interface design than has hitherto been made. In this paper, a FUZRID (FUZzy Reasoning in Icon Design) system was implemented to provide designers with predictive information using the fuzzy reasoning approach.
Key Words: fuzzy set theory, icon design, human factors.
1. INTRODUCTION
Along with the increasing availability of high-quality graphic displays, the use of multi-windows and iconic interfaces in complex system environments is becoming much more common and accessible to computer users. While icons are playing an increasingly important role in iconic interfaces, many cognitive factors of icon design are not well understood [5,14,18]. For icons to be effective in an iconic interface, they need to be properly designed to be meaningful, associated, learnable, memorable and consistent. There is no specific set of rules or criteria that can be followed by designers during the design stage. Although the International Organization for Standardization (ISO TC 145/SC 1) provides selection criteria and original reference principles, whether an icon has met the criteria can be known only after testing is completed. During the design stage, the difficulty designers might confront is that there is not enough information available for modifying the proposed design. All designers can do is develop many potential designs, then select the best of the proposed icons by a preference test. Whether these selections have met the criteria for a standard symbol is determined by a comprehension test; if not, a redesign has to be deployed. Even after following the ISO procedure and finding that one version of a symbol is more recognizable than another, the question of why that symbol is better remains [1,5,26]. This is because the ranking and comprehension tests are subjective evaluations of the possible candidate symbols, and the ability of the viewer to understand what a symbol means will depend on many factors. The cognitive factors that affect the evaluation of pictorial symbols have not been properly explored. Because the comprehension of pictorial symbols is typically ill-defined, the fuzzy approach seems to be a better way to address this methodological problem.
Fuzzy logic defines concepts and techniques that provide a mathematical method able to deal with thought processes that are too imprecise to be dealt with by classical mathematical techniques [22,23,24]. The application of the fuzzy approach is widespread in various fields, from the behavioral and social sciences to product design and human factors [10,11,12,16,25]. Fuzzy set theory has been used in psychology; for example, Hesketh [6,7] proposed a computerized fuzzy graphic rating scale for psychological measurement. The scale, which is an extension of a semantic differential, allows respondents to provide an imprecise rating and lends itself to analysis using fuzzy set theory. It takes into account the reality of the imprecision of human thought by allowing ranges of scores to be measured and translated into a single score. The availability of relatively low-cost personal computers and PC-CAD systems has begun to revolutionize the fields of design. In addition, icon design is highly graphics-oriented, so graphics play an important role in the design process. Instead of using paper, pencil, and drafting board, the designer can execute a design on the computer using interactive graphical devices. No similar approach has been introduced for evaluating icons. In order to remedy the disadvantages of the conventional approach, and to use the full potential of the computer as an aid to icon design, we must provide designers with a computer-based tool for designing icons. Therefore, based on the earlier studies [2,6,7,8,9,10,19,20,21], a FUZzy Reasoning in Icon Design system (FUZRID) was implemented to provide designers with predictive information using the fuzzy reasoning approach.
2. METHOD
2.1 Derivation of Cognitive Factors
To help the designer properly evaluate pictorial symbols, Lin et al. [13,14,15] derived six important factors for evaluating icons: "Associable," "Identifiable," "Meaningful," "Concise," "Eyecatching," and "Symbolic." Through a factor analysis, these items were reduced to three cognitive factors, namely "Communicativeness," "Design Quality," and "Image Function." These three cognitive factors can explain 87% of the total variance in subjective ratings of icon designs. It is suggested that these three cognitive factors can be used as the basis for evaluating icon design during the design stage. Moreover, based on the percentage of variance each factor can explain, the ratings can be weighted to provide a simple overall rating as follows [13]:

Overall rating = 0.41 * rating on "Communicativeness" + 0.33 * rating on "Design Quality" + 0.26 * rating on "Image Function"  (1)

2.2 Derivation of Membership Functions
The first step in applying the fuzzy approach to evaluating icons is to derive the membership functions. On the basis of the studies [8,9,10,19,20,21], Lin [16,17] used the three cognitive factors to derive the membership functions. Using direct rating to determine membership functions has been reported [19,20,21]. A semantic differential rating was designed to obtain membership functions of the three cognitive factors and the total performance. The subjects rate each cognitive factor on a scale of 0-6, 0 being no membership and 6 being full membership. Two techniques, probability distribution and normalized rating scores, were used to derive membership functions.
The frequency distribution of rating scores can be treated as a probability distribution; we can then derive membership functions based on the cumulative distribution function. According to the membership functions from the probability distribution, we can define the grade of membership of the averaged rating scores of total performance as:

Averaged Rating of Total Performance = {0.0/0, 0.0/1, 0.0/2, 0.06/3, 0.59/4, 0.96/5, 1.0/6}  (2)
Another method to derive the membership function is to normalize the rating scores [20]. The overall
ratings were normalized, taking the lowest rating score as zero (no membership) and the highest rating score as one (full membership). Similarly, according to the membership functions from normalized means, we can define the grade of membership of the normalized means of total performance as:

Normalized Mean of Total Performance = {0.0/0, 0.0/1, 0.0/2, 0.19/3, 0.60/4, 1.0/5, 1.0/6}  (3)
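As a concrete illustration of equations (1)-(3), the sketch below combines the three factor ratings and converts the result to a grade of membership; the linear interpolation between the tabulated integer scores is an assumption (the paper tabulates the functions only at scores 0-6).

    import numpy as np

    # Membership functions tabulated at rating scores 0..6, equations (2), (3).
    AVERAGED_RATING = [0.0, 0.0, 0.0, 0.06, 0.59, 0.96, 1.0]
    NORMALIZED_MEAN = [0.0, 0.0, 0.0, 0.19, 0.60, 1.0, 1.0]

    def overall_rating(communicativeness, design_quality, image_function):
        """Weighted overall rating, equation (1)."""
        return (0.41 * communicativeness +
                0.33 * design_quality +
                0.26 * image_function)

    def membership_grade(score, table=AVERAGED_RATING):
        """Grade of membership for a (possibly fractional) rating score,
        linearly interpolated between the tabulated integer points."""
        return float(np.interp(score, range(7), table))

    # Example: factor ratings on the 0-6 semantic differential scale.
    score = overall_rating(5.2, 4.8, 4.5)   # about 4.89
    grade = membership_grade(score)         # interpolated membership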
With these membership functions, we can transfer the averaged rating score to a grade of membership of the averaged rating of total performance. Then, the relationship between the membership and the averaged rating score can be computed.
2.3 Calculation of Expected Value of a Fuzzy Variable
On the basis of the membership functions and the link between fuzzy set theory and probability theory [8,9], the "average weighting" procedure introduced by Baas and Kwakernaak [2] was used to calculate an expected value. The expected value of a fuzzy variable can then be calculated as follows [6,7]:

X̄ = Σ_i X_i f_x(X_i) / Σ_i f_x(X_i)  (4)

where f_x(X_i) is the ordinate (membership value) at each point X_i, and the summation of all X_i multiplied by their respective probabilities gives the expected value of the distribution. The procedure used to calculate the expected value was illustrated graphically with an example in [7].
3. IMPLEMENTATION
To improve the conventional design procedure, we propose the FUZRID system, which will benefit icon designers. The system has been developed and implemented on a PC/486. AutoCAD was selected as the base system for supporting the implementation because of its popularity, programmability, and flexibility. AutoLISP has been used as the programming language to develop many utility functions and subroutines. The brief concept of the fuzzy reasoning approach to icon evaluation is shown in Figure 1. The fuzzy graphic rating subsystem first asks respondents to rate an icon against its intended meaning using the fuzzy rating. Three values are collected automatically for further analysis. Then, the fuzzy computing subsystem handles these imprecise rating data to compute rating scores and comprehension rates, and generates a comparison graph. Finally, the fuzzy reasoning evaluation subsystem provides information on the current icon design to designers for modifying the proposed icon during the design stage. The FUZRID environment consists of three major subsystems: the fuzzy graphic rating subsystem, the fuzzy computing subsystem, and the fuzzy reasoning evaluation subsystem. These subsystems are logically connected to a main function, as shown in Figure 2. The three subsystems are described as follows.
Fuzzy graphic rating subsystem. Based on the concept of fuzzy graphic rating proposed by Hesketh [6,7], the fuzzy graphic rating subsystem provides fuzzy graphic rating scales with three pointers. The subsystem is composed of three functions: (iconlist), (frating), and (getdata). The (iconlist) function creates a list of icons for evaluation. The (frating) function allows respondents to provide an imprecise rating by moving the pointers freely along the x-axis with a mouse. The (getdata) function is used to collect the three values of the most preferred point, left extension, and right extension. All the functions were implemented using the AutoLISP programming language.
Fuzzy computing subsystem. On the basis of these values, the "average weighting" procedure introduced by Baas and Kwakernaak [2] was used to calculate an expected value.
Figure 1. Block diagram of the FUZRID system.
Figure 2. System structure of FUZRID.
The fuzzy computing subsystem, including the (readdata), (mgrade), (evalue), and (cgraph) functions, was designed to compute the expected value based on the three values obtained from the fuzzy graphic rating subsystem. First, the (readdata) function reads the values from the fuzzy graphic rating subsystem. Then, the (mgrade) function converts the rating score to a grade of membership using equations (1), (2) and (3). The (evalue) function computes an expected value from the grade of membership according to equation (4). Finally, the (cgraph) function draws a comparison graph to show the results of the icon evaluation.
Fuzzy reasoning evaluation subsystem. The unique aspect of the FUZRID system is the fuzzy reasoning evaluation subsystem, which was designed to provide predictive information to designers for modifying the proposed icon during the design stage. The subsystem contains the (crates), (frules), and (dguides) functions. The (crates) function calculates total performance and estimates the comprehension rate of the proposed icon. The (frules) function provides the inference rules. The (dguides) function provides designers with information for modifying the proposed icon.
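The following hedged Python sketch mirrors what the (mgrade)/(evalue) pair computes, i.e. equation (4); the original is AutoLISP, and the triangular membership built from the three collected pointer values (most preferred point, left and right extensions) is an assumption modelled on Hesketh's fuzzy graphic rating.

    import numpy as np

    def expected_value(points, memberships):
        """Equation (4): expected value of a fuzzy variable, i.e. each
        point Xi weighted by its membership ordinate fx(Xi), normalized
        by the total membership."""
        x = np.asarray(points, dtype=float)
        f = np.asarray(memberships, dtype=float)
        return float(np.sum(x * f) / np.sum(f))

    def fuzzy_rating_membership(preferred, left, right, n=101):
        """Triangular membership over [left, right] peaking at the most
        preferred point -- one plausible reading of the three values the
        (getdata) function collects."""
        x = np.linspace(left, right, n)
        f = np.where(x <= preferred,
                     (x - left) / max(preferred - left, 1e-9),
                     (right - x) / max(right - preferred, 1e-9))
        return x, f

    x, f = fuzzy_rating_membership(preferred=4.5, left=3.0, right=6.0)
    ev = expected_value(x, f)   # defuzzified rating score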
4. SAMPLE RUN
The FUZRID system is organized into three subsystems connected to a main function. This main function provides a facility for learning how to use the system and for error handling. The explanation of how to use the FUZRID system was designed in a CAI format as part of the main function of the FUZRID system. A sample run is shown as follows:
4.1. Startup. After booting the computer with the FUZRID system disk in drive A:, a batch file is automatically executed and the system presents on the screen a briefing of the FUZRID system (Figure 3).
4.2. The procedures. The fuzzy rating procedure is presented on the screen as shown in Figure 4. The procedures are: 1) move the central pointer to a preferred point which best represents the rating; 2) indicate how far the rating could possibly go by moving the right-pointer and left-pointer; and 3) adjust the rating if necessary.
4.3. How to move the pointers. The pointers are moved with a mouse, and the explanation of moving pointers and error handling is shown in Figure 5. The steps are: 1) move the mouse to a pointer and press the right button; 2) drag the pointer to the desired position and press the button again; 3) repeat the same steps until you are satisfied. Three graphic scales are provided for practice. Once respondents are familiar with moving the pointers, pushing the button brings them to the formal rating test.
4.4. Rating the icons. After preliminary instructions and tuition in the use of the system, the user is asked to enter the icon's name for evaluation. Then the fuzzy rating scales with the proposed icon and its intended meaning are presented on the screen to the respondents (Figure 6). Respondents are asked to rate a proposed icon against its intended meaning using the semantic differential scales of the three
cognitive factors. After rating, respondents are asked to press the button at the left corner to rate the next icon.
4.5. The fuzzy computing. After the subsystem is run, an instruction (Figure 7) is presented on the screen to teach users how to use the subsystem. The subsystem requires the user to give the icon to be reported; then the grade of membership of each rating score is computed, and an expected value is derived.
4.6. The comparison graph. After expected values are calculated by the fuzzy computing subsystem, a comparison graph is displayed (Figure 8). The most preferred point, left extension, right extension, and the expected value are drawn on a three-axis graph representing the three factors communicativeness, design quality, and image function, respectively.
4.7. The fuzzy reasoning. After the fuzzy reasoning evaluation subsystem runs, the rating scores for communicativeness, design quality, image function, and total performance are displayed on the screen. These rating scores give designers an idea of the current icon design. Then, individual suggestions for each factor are given for modifying the current icon (Figure 9).
5. CONCLUSION
Although fuzzy graphic rating can be conducted using a pencil-and-paper technique, analysis of ratings obtained by pencil and paper can be difficult. In addition, icon design is so highly graphics-oriented that graphics play an important role in the design process. Designers need a computer-based tool as an aid for icon design; therefore, this study explored the feasibility of the fuzzy reasoning approach for evaluating icon design. The FUZRID system is not meant to replace the designer or the respondent, but to augment both designers and respondents by providing a computerized tool for icon design and icon evaluation. The technique of fuzzy graphic rating may be advantageous in requiring fewer subjects to be tested and in using the computer for analysis [17]. It is suggested that the FUZRID system be validated in further testing and evaluation of icon design in future study.
6. REFERENCES
1. Barthelemy, K.K., Mazur, and Reising, J.M. (1990). Color Coding and Size Enhancements of Switch Critical Features. Proceedings of the Human Factors Society 34th Annual Meeting, 99-103.
2. Baas, S.M., and Kwakernaak, H. (1977). Rating and ranking of multiple-aspect alternatives using fuzzy sets. Automatica, 13, 47-58.
3. Brugger, C. (1990). Advances in the International Standardization of Public Information Symbols. Information Design Journal, Vol. 6/1, 79-88.
4. Collins, B.L. (1982). The Development and Evaluation of Effective Symbol Signs. Washington DC: US Department of Commerce, National Bureau of Standards, NBS Building Science Series 141.
5. Gittins, D. (1986). Icon-based human-computer interaction. International Journal of Man-Machine Studies, 24, 519-543.
6. Hesketh, B., Pryor, R., Gleitzman, M., and Hesketh, T. (1987). Practical applications and psychometric evaluation of a computerized fuzzy graphic rating scale. In Zetenyi (editor), Fuzzy Sets in Psychology, 425-455.
7. Hesketh, T., Pryor, R., and Hesketh, B. (1988). An application of a computerized fuzzy graphic rating scale to the psychological measurement of individual differences. International Journal of Man-Machine Studies, 29, 21-35.
8. Hisdal, E. (1986a). Infinite-valued logic based on two-valued logic and probability. Part 1.1. Difficulties with present-day fuzzy-set theory and their resolution in the TEE model. International Journal of Man-Machine Studies, 25, 89-111.
9. Hisdal, E. (1986b). Infinite-valued logic based on two-valued logic and probability. Part 1.2. Different sources of fuzziness. International Journal of Man-Machine Studies, 25, 113-138.
10. Karwowski, W., Evans, G.W., and Ragade, R.R. (1984). Fuzzy Modeling Techniques in Human Factors Research. Proceedings of the Human Factors Society 28th Annual Meeting, 403-407.
11. Kreifeldt, J.G., and Rao, K.V.N. (1986a). Fuzzy Sets in Consumer Product Design: Applications to Instructions and Warnings. Proceedings of the 19th Annual Conference of the Human Factors Association of Canada, Vancouver, BC, 99-102.
Figure 3. A briefing of the fuzzy graphic rating system.
Figure 4. The procedure of fuzzy graphic rating.
Figure 5. Practice of moving pointers.
Figure 6. Fuzzy graphic rating screen.
Figure 7. Instructions of the fuzzy computing subsystem.
Figure 8. An example of a comparison graph.
Figure 9. Fuzzy reasoning evaluation subsystem.
12. Kreifeldt, J.G., and Rao, K.V.N. (1986b). Fuzzy Sets: An Application to Warnings and Instructions. Proceedings of the Human Factors Society 30th Annual Meeting, 1192-1196.
13. Lin, R. (1992). An Application of the Semantic Differential to Icon Design. Proceedings of the Human Factors Society 36th Annual Meeting, 336-340.
14. Lin, R., and Kreifeldt, J.G. (1992). Understanding the Image Functions for Icon Design. Proceedings of the Human Factors Society 36th Annual Meeting, 341-345.
15. Lin, R., Kreifeldt, J.G., and Chi, C.F. (1992). A Study of Evaluation Design Sufficiency for Iconic Interface Design: The Design Perspective. Proceedings of the Ergonomics Society Annual Conference 1992, 376-384.
16. Lin, R. (1994). Fuzzy Approach to Standardizing Public Information Symbols. Journal of the Chinese Institute of I.E., Vol. 11, No. 1, 33-39.
17. Lin, R. (1994). An Application of Fuzzy Graphic Rating in Icon Design. Mingchi Institute of Technology Journal, Vol. 26, 201-207.
18. Lodding, K.N. (1982). Iconic Interfacing. IEEE Computer Graphics and Applications, 3, 11-20.
19. Oden, G.C. (1977). Integration of Fuzzy Logical Information. Journal of Experimental Psychology: Human Perception and Performance, 3, 565-575.
20. Turksen, I.B., and Norwich, A.M. (1981). Measurement of Fuzziness. Proceedings of the International Conference on Policy Analysis and Information Systems, Taipei, Taiwan, 745-754.
21. Turksen, I.B. (1986). Measurement of Membership Functions. In W. Karwowski and A. Mital (editors), Applications of Fuzzy Set Theory in Human Factors, 55-67.
22. Zadeh, L.A. (1965). Fuzzy Sets. Information and Control, 8, 338-353.
23. Zadeh, L.A. (1968). Fuzzy Algorithms. Information and Control, 12, 94-102.
24. Zadeh, L.A. (1973). Outline of a New Approach to the Analysis of Complex Systems and Decision Processes. IEEE Transactions on Systems, Man, and Cybernetics, 3, 28-44.
25. Wang, M.J., Sharit, J., and Drury, C.G. (1991). Fuzzy Set Evaluation of Inspection Performance. International Journal of Man-Machine Studies, 35, 587-596.
26. Zwaga, H.J.G. (1989). Comprehensibility Estimates of Public Information Symbols: Their Validity and Use. Proceedings of the Human Factors Society 33rd Annual Meeting, 979-983.
Inferring Graphical Constraints from Users' Modification
Takeharu Tanimura, Tsukasa Noma, and Naoyuki Okada
Department of Artificial Intelligence, Kyushu Institute of Technology, 680-4, Kawazu, Iizuka, Fukuoka 820, JAPAN
This paper presents a new approach to inferring graphical constraints in graphical editors. In our approach, users modify graphical objects interactively so that the objects have approximate geometric relations, and relevant graphical constraints are automatically inferred by comparing the states before and after the modification. This approach can grasp users' intentions more accurately than existing approaches that infer constraints from only a single state of a drawing.
1. INTRODUCTION
Graphical constraints have been used for maintaining graphical/geometric relations between objects while users edit drawings in constraint-based graphical editors [1-3]. But for many years, most users have preferred non-constraint-based editors such as MacDraw(TM), since it is often troublesome for users to manually specify the large number of constraints required for drawings. To lessen the load of constraint specification, several approaches have been proposed for inferring constraints from unfinished drawings [4-7]. Recent and typical approaches are found in Rockit [6] and Grace [7]. In both systems, constraints are inferred if their relevant objects are positioned in gravity fields associated with each constraint. Then the drawings are beautified so that the inferred (and chosen) constraints are satisfied. But the existing gravity field approach has difficulty in inferring proper constraints. The larger the gravity fields are, the more candidate constraints are inferred. On the other hand, the smaller the fields are, the more accuracy is required in users' operations. In fact, Grace forces users to position objects with only ±3 pixels accuracy. In spite of the difficulty of inferring constraints mechanically, we humans can easily recognize users' intentions, that is, intended graphical constraints, from users' editing process. This gap arises from the fact that, in the existing gravity field approach, graphical constraints are inferred from only a single state of the drawing after modification. This paper proposes a new approach to inferring graphical constraints in graphical editors. In our approach, users modify objects interactively so that the objects have approximate geometric relations, and relevant constraints are automatically inferred by comparing the states before and after the manipulation. Our relative criteria for deciding intended constraints work well both in rough positioning after big motion and in precise positioning after tiny motion, and thus solve the problem of the current "fixed" gravity fields.
2. I N F E R R I N G CATION
GRAPHICAL CONSTRAINTS FROM USERS' MODIFI-
2.1. P r o b l e m s in inferring graphical constraints with gravity fields In this subsection, we discuss problems in inferring graphical constraints with gravity fields. Figure 1 shows a sample sequence of interactions for illustrating the problems in the existing gravity field approach (and later for comparing our approach with the existing one): (1) A user is to modify a drawing with a polyline A B C D and a verticality(VRT)constraint that forces points A and D to be vertically aligned; (2) the user deletes the VRT-constraint between A and D; (3) the user moves D so that C and D are horizontally aligned. Supposing that point D is in A's gravity field of VRT-constraint as well as C's gravity field of horizontality(HRT)-constraint, (4) then the system infers both VRT-
207 constraint between A and D and HRT-constraint between C and D. From the step (2) to (3), the verticality-relation between A and D is weakened. In the existing approach, however, the VRT-constraint between A and D is inferred if point D remains in the VRT gravity field of A after the modification. This problem is solved to some extent by much narrower gravity fields. But they would force users to position objects with hairbreadth accuracy.
2.2. Inferring graphical constraints from users' modification The above-mentioned problem in the existing gravity field approach is caused by its inference mechanism where constraints are selected if and only if the objects are in the corresponding gravity fields after the modification. To recognize users' intention more properly, we need to pay attention to how users edit drawings. We thus propose a heuristics given below: "If a user changes a drawing where a graphical relation is not fulfilled into another where the relation is almost satisfied, then the constraint corresponding to the relation is intended by the user." The above heuristics is realized as follows. To solve the problem of the current "fixed" gravity fields, we adopt relative criteria for deciding whether each constraint is intended or not so that our approach works well in both rough positioning after big motion and precise positioning after tiny motion. We numerically express the degree of satisfaction with graphical constraints. If the satisfaction degree of a particular constraint increases more than a predefined threshold after a modification operation, then the constraint is treated as "intended." Formally, let x be a vector of positions and directions of objects, and c be a constraint. A function e maps from a pair (x, c) into a real number which represents the degree of satisfaction of x with c. e is maximized when x satisfies c completely. Let xl and x2 be the vector x before and after a modification, respectively. The constraint c is inferred if and only if e(x2, c) - e(xl, c) > threshold.
(1)
Let us suppose that our new approach is applied to the case in Figure 1. Then the system infers only an HRT-constraint between C and D (see (4)'). The VRT-constraint between A and D is not inferred, since its satisfaction degree does not increase considerably; in fact, it decreases. As shown in the above example, our approach can reliably pick up graphical relations intended by users even if the users' positioning is rough. On the other hand, it filters out most of the relations established unexpectedly.
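A minimal sketch of the inference rule in equation (1); the exponential satisfaction function and the numeric threshold are illustrative assumptions, since the paper defines its own e functions per constraint type in Section 3.3.

    import math

    def horizontality_satisfaction(p, q):
        """Illustrative satisfaction degree: 1 when the two points share a
        y-coordinate, decaying toward 0 as they separate vertically."""
        return math.exp(-abs(p[1] - q[1]) / 10.0)   # 10-pixel scale, assumed

    def inferred(constraint_sat, x_before, x_after, threshold=0.5):
        """Equation (1): infer the constraint iff its satisfaction degree
        increased by more than the threshold during the modification."""
        return constraint_sat(*x_after) - constraint_sat(*x_before) > threshold

    # Point D moves from (120, 40) to (200, 81) while C sits at (60, 80):
    before = ((60, 80), (120, 40))
    after  = ((60, 80), (200, 81))
    print(inferred(horizontality_satisfaction, before, after))  # True

Note that the criterion is relative: a big motion ending in rough alignment and a tiny motion ending in precise alignment both produce a large increase in satisfaction, which is exactly what the "fixed" gravity fields fail to capture.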
3. IMPLEMENTATION
3.1. System overview
We implemented a prototype graphical editor in C on an IBM PC/AT compatible running Linux with X11R5 and Motif. Our system consists of four modules: a user interface module, a drawing module, a constraint inference module, and a relaxation-based constraint solver (Figure 2). The user interface module lets users create, modify, and
[Figure 2 content: 2. Send object modification data; 3, 6. Redraw the window; 4. Send constraint/object modification data; 5. Send object modification data.]
Figure 2. System overview.
The user interface module lets users create, modify, and delete graphical objects in a direct manipulation fashion (see Table 1). It sends object modification data to the constraint inference module as well as to the drawing module. The constraint inference module infers graphical constraints from the received data and sends newly established constraints to the constraint solver. The constraint solver solves all the current graphical constraints and sends the solution to the drawing module. The drawing module receives the object modification data from the constraint solver and redraws the window.
3.2. Graphical objects and constraints
Graphical objects have associated points and line segments depending on their object type, and constraints in our system are imposed on these points and line segments. Table 1 shows the object types and their associated points/line segments, and the graphical constraints in our system are listed in Table 2.
Table 1
Graphical objects

object type          associated points and line segments
marker               position
polyline (n-folded)  (n + 1) vertices (points), n component line segments and their midpoints
box                  4 corners (points), center point, 4 sides (line segments) and their midpoints
polygon              n vertices (points), center of gravity (point), n sides (line segments) and their midpoints
ellipse              center/top/bottom/leftmost/rightmost points
string               baseline (line segment) and its endpoints, 4 corners of bounding box
Table 2
Graphical constraints

constraint type         meaning
equivalence-of-points   Two points have the same position.
equivalence-of-length   Two line segments have the same length.
horizontality           Two points have the same y-coordinate value.
verticality             Two points have the same x-coordinate value.
parallelism             Two line segments are parallel with each other.
perpendicularity        Two intersected line segments are perpendicular to each other.
3.3. The degree of satisfaction with graphical constraints
As discussed in Section 2.2, a function e maps a pair of a vector x and a constraint c to the satisfaction degree of x with c. For example, in our current implementation, the function e for the equivalence-of-points constraint is given below:
e(x, equivalence-of-points) =
    1                                            (0 <= d <= a)
    (log_k b - log_k d) / (log_k b - log_k a)    (a < d < b)
    0                                            (b <= d)

where d is the distance between the two points and a, b and k are predefined constants.
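Assuming the piecewise reconstruction above (d is the distance between the two points; a and b are tuning constants), a minimal C sketch of the degree function could read as follows; the function name is ours:

#include <math.h>
#include <stdio.h>

/* Sketch of the satisfaction degree for equivalence-of-points under the
 * reconstruction above.  The base k cancels in the ratio of logarithms,
 * so natural logs suffice. */
static double e_eq_points(double d, double a, double b) {
    if (d <= a) return 1.0;                       /* fully satisfied */
    if (d >= b) return 0.0;                       /* not satisfied   */
    return (log(b) - log(d)) / (log(b) - log(a)); /* smooth falloff  */
}

int main(void) {
    printf("%f\n", e_eq_points(2.0, 1.0, 16.0));  /* a mid-range value */
    return 0;
}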
[Figure 3 content: the marking vector M0 = <M-1> + <M-3> over places P1-P7 (a) before the performance; the marking M'1 (b) during the performance; the marking M'2 (c) after the performance.]
Figure 3. Performance and simulation of the task-object "T111: to take knowledge of a new flight".
3. USER INTERFACE SPECIFICATION
The aim of this stage is the automatic passage from the description of the users' tasks to the MMI specification. It defines all the action plans and manipulated objects necessary for task performance. The resources of each terminal task-object thus become its component-objects, which include the MMI objects to be implemented in the future interface and the Operator objects. All the component-objects cooperate in a precise manner in order to fulfill the aim of the terminal task-object. A component-object is defined by its class (Interface or Operator) and provided with a set of states and a set of operations (or actions) that change these states. Graphically, a component-object is presented with the same structure as a task-object. However, its internal control structure, called the Object Control Structure (ObCS) and inspired by the cooperative interactive objects formalism (Palanque, 1992), is modelled by an Object Petri Net (OPN) (Sibertin, 1985). OPNs are characterized by the fact that the tokens constituting the place markings are neither atomic nor interchangeable entities: they can be distinguished from each other and take values that describe the characteristics of the MMI and of the operator. The terminal task-object "T111: take knowledge of a new flight" uses two component-objects: the MMI object "a New Strips Table (NST)" and the operator object "Organic Controller (OC)" (Figure 4). The behaviour of the MMI object "a New Strips Table" is defined by four states P1, P2, P3 and P4. From each state the Organic Controller can carry out a group of actions (transitions). From the P3 state (strip selected), for example, two actions are possible: t3 (open a road-zoom) or t5 (temporize the new strip).
Figure 4. A graphical specification of the component-objects "New Strips Table" and "Organic Controller".
For the component-object "Organic Controller", the set of its states and operations represents the different possible procedures for performing the terminal task "T111: take knowledge of a new flight". The display of a new strip NS in the MMI object "new strips table" invokes, through the event E2,1, the operation service "Consult the NS" of the operator object "Organic Controller OC". According to his selection "Ch=", the organic controller makes a first reading of the NS information ("Consult the road" or "Consult the level"). After this reading, he changes his state to cognition in order to evaluate his information level. He then decides either to "read again the basic information" or "to ask for additional information". A request for additional information is expressed by a change of his state to "Action" in order to "select the NS" and to "open the Road-Zoom". Both actions transmit the reactions R2,2 and R2,3 to the MMI object "new strips table". Note that the organic controller carries out the action "open a road-zoom" only after receiving the event E2,2, which confirms that the action "Select the NS" has been carried out. Once the Road-Zoom has been opened, the Organic Controller changes his state to "information reading" in order to read the additional information, and then to the "situation evaluation" state to decide either to read the information again, to "temporize the NS", or to invoke, through the reaction R2,1, the terminal task-object "T112: Take decision about flight".
4. CONCLUSION
The TOOD methodology draws on methods and concepts from the cognitive sciences and ergonomics together with those of software engineering. It provides a framework for efficient collaboration between various users and between ergonomists and computer specialists. Its formalism makes it possible to define task knowledge in a formal, coherent and structured model and to specify an interface adapted to the users' characteristics.
5. REFERENCES
Abed, M., 1990, Contribution à la modélisation de la tâche par des outils de spécification exploitant les mouvements oculaires : application à la conception et l'évaluation des Interfaces Homme-Machine, Doctorate thesis, University of Valenciennes, France.
Barthet, M-F., 1988, Logiciels interactifs et ergonomie : modèles et méthodes de conception, DUNOD, Paris.
Coad, P. and Yourdon, E., 1991, Object Oriented Analysis, 2nd Ed., Yourdon Press Computing Series.
Feller, A. and Rucker, R., 1990, Extending Structured Analysis Modelling with A.I.: An Application to MRPII Profiles and SFC Data Communications Requirements Specifications, IFIPS Conference.
Jensen, K., 1987, Colored Petri Nets, in Petri Nets: Central Models and Their Properties, LNCS No. 254, Springer-Verlag.
Mahfoudhi, A. and Abed, M., 1994, Description orientée objet des tâches du contrôleur pour la spécification et la conception des interfaces, Contract research report CENA 94, University of Valenciennes, France.
Michel, L., 1991, Conception orientée objet : Pratique de la méthode HOOD, Dunod, Paris.
Palanque, P., 1992, Modélisation par objets Coopératifs Interactifs d'interfaces homme-machine dirigées par l'utilisateur, Doctorate thesis, University of Toulouse I, France.
Reiterer, H. and Oppermann, R., 1993, Evaluation of user interfaces, Behaviour & Information Technology 12 (1993), 3, 137-148.
Scapin, D.L. and Pierret-Golbreich, C., 1989, MAD : Une Méthode Analytique de Description de Tâches, Actes du Colloque sur l'ingénierie des Interfaces Homme-Machine, France.
Sibertin, B.C., 1985, High-level Petri Nets with Data Structure, 6th European Workshop on Petri Nets and Applications, Espoo, Finland.
An Analysis of Relationship Between Human and Information System by Quantification Theory III
Tsuneki Mukahi, Ken Murasugi and Tetsuo Ui
Department of Industrial Management, Osaka Institute of Technology, 5-16-1 Omiya, Asahi-ku, Osaka, 535 Japan
The purpose of this study is to clarify a framework for understanding the relationship between humans and advanced, large-scale information systems by analyzing empirical data gathered with a questionnaire. As a result, three axes were found: 'goodness of the relationship between human and information system', 'attitude of the organization toward the information system', and 'stress felt by individuals in the information system'. A new image of 'an advanced information-based organization' was obtained from these axes. It was also observed that such an advanced information-based organization has already been realized in some organizations.
1. INTRODUCTION
Advanced, large-scale information systems have been introduced in many organizations. To make full use of the abilities of workers within such organizations, we should clarify the relationship between humans and information systems. Though the effects of computers on humans have been well investigated in the fields of psychology and ergonomics (e.g. Brod 1984), the number of studies on the impact of networked information systems on individuals is much smaller. However, we can find some studies concerning the relationship between the networked information system and the organization or job (e.g. Whisler 1970). Among these studies, it has been pointed out, for instance, that information systems brought flat organizations or job autonomy. Now that we can see the close relationship between the networked information system and the organization or job, we have to take into consideration the effects of the system on the individual through the organization and the job in order to grasp the relationship between the system and the individual. The purpose of this study is to clarify a framework for understanding the relationship between humans and advanced, large-scale information systems.
2. METHODOLOGY
We gathered 584 samples from five organizations using advanced information systems. The questionnaire consisted of items concerning the introduction of computer systems, how workers relate to computer systems, how they think of them, and some demographics. Quantification theory III was employed to clarify the latent axes that explain the relationship between humans and information systems. Ten items were analyzed: 'instruction given when the system was introduced', 'time to spare caused by using the system', 'opportunity for self judgment', 'effect on ability development', 'stress', 'active use of information', 'changes of the way to use the system', 'information shared', 'prevention of mistakes on input' and 'satisfaction on using the system'. Each item included three choices, giving 30 choices altogether across the 10 items.

3. RESULTS
Three axes were clarified by quantification theory III. The category scores for each axis are presented in Table 1. Figure 1 combines the first and the second axes, and Figure 2 combines the first and the third axes.
In the first axis, the scores of categories such as 'J1. very satisfied', 'A1. sufficient instruction', 'B1. time to spare obtained', 'F1. active use of information', 'H3. information well shared' and 'I1. mistakes well prevented' became large. On the contrary, the scores of categories such as 'J3. unsatisfied', 'A3. insufficient instruction', 'B3. time to spare lost', 'F3. inactive use of information', 'H1. information shared limited' and 'I3. mistakes not prevented' became small. Thus, the plus direction of the first axis expresses a good relationship between human and information system, while the minus direction expresses a bad relationship. So, the first axis is considered as 'goodness of the relationship between human and information system'.
In the second axis, the scores of categories such as 'C3. opportunity for self judgment not increased', 'E3. stress not felt', 'F3. inactive use of information' and 'G3. the way to use it not changed' became large. On the contrary, the scores of categories such as 'C1. opportunity for self judgment increased', 'E1. stress felt', 'F1. active use of information' and 'G1. the way to use it changed' became small. Thus, the plus direction of the second axis expresses inactive use of the information system by the organization, while the minus direction expresses active use. So, the second axis is considered as 'attitude of the organization toward the information system'.
Table 1
Category scores by quantification theory III

Items                          Categories                     1st axis   2nd axis   3rd axis
A. Instruction                 A1. Sufficient instruction      0.0410    -0.0321    -0.0204
                               A2. Fair instruction            0.0078     0.0096     0.0120
                               A3. Insufficient instruction   -0.0281    -0.0079    -0.0163
B. Time to spare               B1. Time to spare obtained      0.0203    -0.0021     0.0152
                               B2. Unchanged                   0.0112     0.0096    -0.0180
                               B3. Time to spare lost         -0.0428    -0.0469     0.0265
C. Self judgment               C1. Increased                   0.0308    -0.0336     0.0126
                               C2. Unchanged                  -0.0154     0.0044     0.0137
                               C3. Not increased               0.0020     0.0159     0.0179
D. Ability development         D1. No negative effect          0.0143     0.0118    -0.0215
                               D2. Not conscious              -0.0250     0.0182    -0.0118
                               D3. Negative effect             0.0066    -0.0049    -0.0267
E. Stress                      E1. Stress felt                -0.0220    -0.0536     0.0381
                               E2. Stress a little felt       -0.0069    -0.0007     0.0073
                               E3. Stress not felt             0.0112     0.0106    -0.0145
F. Active use of information   F1. Active use                  0.0197    -0.0205     0.0008
                               F2. Sometimes use              -0.0101     0.0085    -0.0163
                               F3. Inactive use               -0.0165     0.0236     0.0487
G. Changes of the way to use   G1. Changed                    -0.0144    -0.0440     0.0168
                               G2. Sometimes changed           0.0042    -0.0039    -0.0092
                               G3. Not changed                -0.0069     0.0360     0.0228
H. Information shared          H1. Shared limited             -0.0262    -0.0233     0.0092
                               H2. A little shared            -0.0078     0.0181    -0.0025
                               H3. Well shared                 0.0243    -0.0082    -0.0023
I. Prevention of mistakes      I1. Well prevented              0.0259     0.0032    -0.0041
                               I2. A little prevented         -0.0037    -0.0006    -0.0062
                               I3. Not prevented              -0.0232    -0.0027     0.0207
J. Satisfaction                J1. Very satisfied              0.0556    -0.0345    -0.0283
                               J2. Satisfied                   0.0065     0.0082     0.0081
                               J3. Unsatisfied                -0.0261    -0.0182    -0.0188
In the third axis, the scores of categories such as 'E1. stress felt', 'F3. inactive use of information', 'I3. mistakes not prevented', 'B1. time to spare obtained', 'B3. time to spare lost', 'C1. opportunity for self judgment increased' and 'C3. opportunity for self judgment not increased' became large. On the contrary, the scores of categories such as 'E3. stress not felt', 'D1. no negative effect on ability development', 'A1. sufficient instruction', 'A3. insufficient instruction', 'J1. very satisfied' and 'J3. unsatisfied' became small.
Figure 1. A distribution chart of the 1st/2nd axes combination.
Figure 2. A distribution chart of the 1st/3rd axes combination.
In this axis, the item whose category scores took both plus and minus values was 'stress'. Here, the plus direction expresses the existence of stress felt in the information system, while the minus direction expresses its absence. So, the third axis is considered as 'stress felt by individuals in the information system'.
4. DISCUSSION
4.1. An advanced information-based organization
The quadrant made by the combination of the first axis-plus and the second axis-minus expresses a good relationship between human and information system together with active use of the information system. In other words, this quadrant shows that the information system is employed as successfully as the organization expected. We therefore call this type of organization 'an advanced information-based organization'. Concretely, the quadrant includes the categories 'J1. very satisfied', 'A1. sufficient instruction', 'C1. opportunity for self judgment increased', 'F1. active use of information', and 'H3. information well shared'.
Furthermore, we calculated the average of the sample scores for each organization and applied a t-test comparing the average of each organization with the total average (Table 2). As a result, a tendency is observed that organization A belongs to 'the advanced information-based organization'.

Table 2
Calculation of the average of sample scores of some items

Items         Categories            1st axis     2nd axis     3rd axis
Organization  A                     0.0057***   -0.0033***    0.0000
              B                     0.0009       0.0040***    0.0019**
              C                    -0.0059***   -0.0007      -0.0001
              D                    -0.0006       0.0002      -0.0018*
              E                    -0.0008      -0.0008      -0.0019*
Sex           Men                  -0.0008      -0.0010*     -0.0008*
              Women                 0.0019**     0.0027**     0.0017**
Age           -24                   0.0007       0.0007       0.0011
              25-34                -0.0008      -0.0005      -0.0004
              35-44                -0.0003       0.0003      -0.0003
              45-                   0.0004       0.0006       0.0024**
Section       System development    0.0006      -0.0008      -0.0002
              System use           -0.0023**     0.0007       0.0002
Class         Manager               0.0011      -0.0012*     -0.0009
              Common               -0.0009       0.0004       0.0002
*: p < 0.05; **: p < 0.01; ***: p < 0.001.
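The comparisons in Table 2 can be reproduced in outline with a one-sample t statistic; the paper does not spell out its exact formula, so the following C sketch is only our own illustration with invented scores:

#include <math.h>
#include <stdio.h>

/* t statistic testing whether a group's mean axis score differs from
 * the overall mean of all samples (0 for centred scores). */
static double t_stat(const double *x, int n, double total_mean) {
    double mean = 0.0, var = 0.0;
    for (int i = 0; i < n; i++) mean += x[i];
    mean /= n;
    for (int i = 0; i < n; i++) var += (x[i] - mean) * (x[i] - mean);
    var /= (n - 1);                        /* unbiased sample variance */
    return (mean - total_mean) / sqrt(var / n);
}

int main(void) {
    double org_a[] = { 0.011, 0.004, 0.007, 0.002, 0.009 };  /* toy data */
    printf("t = %f\n", t_stat(org_a, 5, 0.0));
    return 0;
}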
Fig.2 An Example of Initial Candidates of Color Coordination
[Figure 3 content: Area 1 (Jacket), Area 2 (Shirt), Area 3 (Trousers). Figure 4 content: three selected keywords with preference scores accumulated against the database, e.g. Active +10, Calm -20, Bold +5, Mild 0.]
Fig.3 An Example of Men's Clothes Design
Fig.4 Representation of Color Preference
2.2 Representation of Color Coordinate Space
As shown in Fig. 3, each pattern consists of three color areas: a jacket, a shirt and a pair of trousers. Each area is represented by three parameters: hue, saturation and lightness. The color coordinate space is represented by the following 18 features, and each color-coordinated pattern in the database corresponds to a point in this 18-dimensional color coordinate space.
Area 1 (Jacket): Hue, Saturation, Lightness
Area 2 (Shirt): Hue, Saturation, Lightness
Area 3 (Slacks): Hue, Saturation, Lightness
Difference of Hue between Area 1 and Area 2
Difference of Hue between Area 2 and Area 3
Difference of Hue between Area 3 and Area 1
Difference of Saturation between Area 1 and Area 2
Difference of Saturation between Area 2 and Area 3
Difference of Saturation between Area 3 and Area 1
Difference of Lightness between Area 1 and Area 2
Difference of Lightness between Area 2 and Area 3
Difference of Lightness between Area 3 and Area 1
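As an illustration (field and function names are ours, not the system's), the 18 features might be held in a structure like this C sketch:

#include <stdio.h>

typedef struct { double hue, sat, light; } AreaColor;

typedef struct {
    AreaColor area[3];                     /* jacket, shirt, trousers      */
    double d_hue[3], d_sat[3], d_light[3]; /* differences 1-2, 2-3, 3-1    */
} Features;                                /* 9 + 9 = 18 features          */

static void fill_differences(Features *f) {
    for (int i = 0; i < 3; i++) {
        int j = (i + 1) % 3;               /* area pairs (1,2), (2,3), (3,1) */
        f->d_hue[i]   = f->area[i].hue   - f->area[j].hue;
        f->d_sat[i]   = f->area[i].sat   - f->area[j].sat;
        f->d_light[i] = f->area[i].light - f->area[j].light;
    }
}

int main(void) {
    Features f = { { { 210, 0.4, 0.3 }, { 60, 0.2, 0.9 }, { 220, 0.5, 0.2 } } };
    fill_differences(&f);
    printf("hue difference jacket-shirt: %f\n", f.d_hue[0]);
    return 0;
}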
2.3 Representing and Learning the User's Color Preference
This initial database, retrieved through the three keywords, does not always match the current color preference of each user. Therefore, the system provides a method for learning the subjective color preference of the user. Generally, the subjective color preference of a user and the features of the color areas, such as hue and saturation, form a one-dimensional result space and a high-dimensional parameter space, respectively. The mapping between them is discontinuous and nonlinear, so it is difficult to model the color preference of human beings and to estimate the parameters of such a model. Instead, we use a memory-based model, namely table lookup with interpolation, which can map from the result space to the parameter space directly. The user marks both desirable and undesirable designs among the candidates, and the scores of the designs marked as favorites are increased. This method can thus learn the user's color preference effectively from the results of a session in which the user narrows the candidates of color coordination and finally decides on a suitable one.
On the other hand, we consider that a user has two types of color preference: a fundamental one and a temporal one. The temporal color preference varies according to environmental factors such as weather, season and time. Using only the above method, desirable designs are not always mapped to the selected keywords. Therefore, as shown in Fig.4, we build 18 feature histograms H*(x) to represent both kinds of color preference of the user. For each feature, we accumulate the plus and minus preference scores H(x) of every design. Each feature histogram H*(x) is calculated by the following equations.
Hbar = (1/n) Σx H(x)
σ(H) = sqrt( (1/n) Σx (H(x) - Hbar)² )
H*(x) = 10 (H(x) - Hbar) / σ(H)        (1)
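A small C sketch of this normalisation, assuming the sums in equation (1) run over the histogram bins (the bin count is our own choice):

#include <math.h>
#include <stdio.h>

#define NBINS 16            /* hypothetical number of bins per feature */

/* H*(x) = 10 * (H(x) - mean) / sigma, per equation (1) */
static void normalise(const double H[NBINS], double Hstar[NBINS]) {
    double mean = 0.0, var = 0.0;
    for (int x = 0; x < NBINS; x++) mean += H[x];
    mean /= NBINS;
    for (int x = 0; x < NBINS; x++) var += (H[x] - mean) * (H[x] - mean);
    double sigma = sqrt(var / NBINS);
    for (int x = 0; x < NBINS; x++)
        Hstar[x] = (sigma > 0.0) ? 10.0 * (H[x] - mean) / sigma : 0.0;
}

int main(void) {
    double H[NBINS] = { 2, 5, 9, 4, 1 }, Hstar[NBINS];  /* rest zero */
    normalise(H, Hstar);
    printf("H*(2) = %f\n", Hstar[2]);
    return 0;
}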
Using these 18 feature histograms H*(x), the system can generate initial candidates of color coordination that include not only directly memory-mapped designs but also similar ones.
Fig.5 An Example of Ten Candidates of Color Coordination at the Final Step
Fig.6 Final Step
2.4 Navigation to the Final Candidate of Color Coordination
In the second step, the candidates are narrowed down to those adaptive to the user through interaction with the computer. The user marks both desirable and undesirable designs among the candidates. Furthermore, if the user cannot find a desirable design and wants to change the color coordination, the system shows several new designs in which the color pattern of the area pointed at by the user is changed, and the user can choose a favorite among them. The score of a design marked as favorite is increased, and a newly chosen design is added to the personal database. The candidates are then sorted by these scores and narrowed down to ten, as shown in Fig.5.
In the final step, the system navigates the user to a color coordination that matches the current state of the user's mind. To navigate the user, two of the remaining ten candidates are shown to the user repeatedly (see Fig.6). However, it is often the case that most of the remaining candidates suit the user's preference, so it is not easy for the user to narrow the candidates down to a single one. Therefore, by changing the balance of the color coordination, we make it possible for the user to find a design of color coordination with great satisfaction. Every remaining candidate suits the user's preference and thus denotes a stable condition for the user; when two candidate designs both satisfy the stable condition, it is difficult for the user to differentiate between the two. We therefore change the color balance of one candidate so as to decrease its likeness to the feature histogram of the color preference.
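A sketch of this scoring-and-narrowing step in C (our own illustration; the score increments are invented):

#include <stdio.h>
#include <stdlib.h>

/* candidates gain score when marked desirable and lose score when
 * marked undesirable; the ten best survive to the final step */
typedef struct { int id; int score; } Candidate;

static int by_score_desc(const void *a, const void *b) {
    return ((const Candidate *)b)->score - ((const Candidate *)a)->score;
}

int main(void) {
    Candidate c[30];
    for (int i = 0; i < 30; i++) { c[i].id = i; c[i].score = 0; }
    c[4].score += 2;  c[17].score += 1;        /* marked desirable   */
    c[9].score -= 1;                           /* marked undesirable */
    qsort(c, 30, sizeof c[0], by_score_desc);
    for (int i = 0; i < 10; i++)               /* keep the ten best  */
        printf("kept candidate %d (score %d)\n", c[i].id, c[i].score);
    return 0;
}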
3. Experimental Results
To evaluate the system, 15 subjects each gave the system more than 13 trials. The subjects were between 22 and 35 years of age. Figs. 7 and 8 show the average time of retrieval and the average μ of the feature histogram H(x), respectively. The average μ is calculated by the following equations.
μ = Σx { x · p(x) },   where p(x) = H(x) / Σx H(x)        (2)
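A corresponding C sketch of equation (2), treating the bin index as the feature value x:

#include <stdio.h>

/* mu = sum over x of x * p(x), where p(x) = H(x) / sum of H */
static double histogram_mean(const double *H, int nbins) {
    double total = 0.0, mu = 0.0;
    for (int x = 0; x < nbins; x++) total += H[x];
    if (total == 0.0) return 0.0;
    for (int x = 0; x < nbins; x++) mu += x * (H[x] / total);
    return mu;
}

int main(void) {
    double H[5] = { 1, 2, 4, 2, 1 };
    printf("mu = %f\n", histogram_mean(H, 5));  /* symmetric, so mu = 2 */
    return 0;
}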
As the retrieval time decreases and the average of the feature histogram becomes constant, the system appears to learn the color preference of the user. For 11 of the subjects, the selectivity of designs was increased by changing the balance of the color coordination; more than 200 data points were evaluated. We therefore consider that changing the balance of color coordination is useful for this kind of design system.
Moon & Spencer proposed the principle of color harmony, in which two areas in color harmony denote a stable condition. When two candidate designs both satisfy the stable condition, it is difficult for the user to differentiate between the two; by creating an unstable condition, we make it possible for the user to find a design of color coordination with great satisfaction. We analyzed the experimental data from the standpoint of Moon & Spencer's principle of color harmony. Under the unstable condition, a design of color coordination can be easily selected with great satisfaction. This means that Moon & Spencer's principle of color harmony is an effective rule for changing the color balance of one candidate.
[Figure 7 content: per-trial curves of the total time for retrieval and of the retrieval time at the first and final steps.]
Fig.7 Time for Retrieval
[Figure 8 content: per-trial average of the feature histogram; not recoverable from the source.]
Figure 2. Examples of the tree representation for the prescription order entry interface: (a) Yamaguchi university hospital, (b) Shiga university hospital, (c) Kyoto university hospital (mouse).
Table 2
The number of branches for five tree representations

                            Level:               UT     F      A      KS
Yamaguchi                   number of branches   4      7      10     47
                            average per node     4.00   1.75   1.43   4.70
Shiga                       number of branches   4      15     25     54
                            average per node     4.00   3.75   1.67   2.16
Kyoto (mouse)               number of branches   4      15     40     94
                            average per node     4.00   3.75   2.67   2.35
Kyoto (short cut)           number of branches   4      15     40     56
                            average per node     4.00   3.75   2.67   1.40
Kyoto (eccentric user)      number of branches   5      17     43     100
                            average per node     5.00   3.40   2.53   2.33

[The per-node branch breakdown of each tree structure is not recoverable from the source.]
UT: Unit Task Level, F: Function Level, A: Argument Level, KS: Keystroke Level.
Table 3
Similarity distance (HCID) between five interfaces

                                      a      b      c      d
a: Yamaguchi system                   -
b: Shiga system                       3.24   -
c: Kyoto system (mouse version)       3.33   1.02   -
d: Kyoto system (short cut key)       4.05   1.22   0.95   -
e: Kyoto system (eccentric user)      3.25   1.38   1.07   1.42
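The HCID values in Table 3 follow from a Euclidean distance over the four per-level averages of Table 2, as the conclusion below states; a short C sketch (our own) with values taken from Table 2:

#include <math.h>
#include <stdio.h>

/* Euclidean distance in the 4-dimensional space of average branch
 * counts per node (UT, F, A, KS) */
static double hcid(const double a[4], const double b[4]) {
    double s = 0.0;
    for (int i = 0; i < 4; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
    return sqrt(s);
}

int main(void) {
    double kyoto_mouse[4] = { 4.00, 3.75, 2.67, 2.35 };
    double kyoto_key[4]   = { 4.00, 3.75, 2.67, 1.40 };
    printf("HCID = %.2f\n", hcid(kyoto_mouse, kyoto_key));  /* 0.95 */
    return 0;
}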
Using the short cut key version of the KUH system, the total number of operations is 56. This is about half of that of the mouse version and almost the same as that of the SMSH system. In terms of the HCI distance, however, the distance between the two versions of the KUH system is HCID = 0.95, which is shorter than the distance between the mouse version and the SMSH system (HCID = 1.02). That is, two brother interfaces based on the same mother system are similar even when the difference in the number of operations is very large. This result shows that the way a unit task is divided and the way a sequence of functions is composed have a rather larger effect on the similarity distance than the number of operations alone. The distance between the exemplary and the eccentric user of the KUH system is HCID = 1.07. This suggests that, in some cases, the difference between users can have a larger influence than the difference between systems.
6. CONCLUSION
In this report, we proposed the concept of reproductive human-computer interaction as the composition of three elements, namely the system, the task and the skilled user, for the direct prescription entry system. A reproductive HCI has a tree representation with five layers. We defined the similarity distance between reproductive HCIs by introducing a Euclidean metric on the four-dimensional feature space whose coordinates are the average numbers of branches per node for each layer of the tree. We made experimental studies of five reproductive interfaces for direct prescription order entry in three different university hospitals and measured the mutual differences among them. The results showed that the similarity distance agreed with the users' impressions. The proposed method cannot directly assign a degree of "usability" or "good and bad" to an HCI. However, it can give a reasonable similarity distance between any two HCIs from the point of view of the compromise among the system, the skilled user and the task. This method could offer a common platform for designing and standardizing the human interface of hospital information systems and for discussing the possibility of replacing old systems.
ACKNOWLEDGMENT
The authors wish to thank Dr. M.Komori, Dr. Y.Inoue, Dr. S.Nagata and Prof. T.Takahashi for their help and advice with this work.
Explaining Plant Design Knowledge Through Means-End Modelling
Pertti Huuskonen, Kari Kaarela
VTT Electronics, P.O.Box 1100, FIN-90571 Oulu, Finland
Tel. +358-81-5512111, Fax. +358-81-5512320, Email [email protected]
ABSTRACT
We apply the multilevel means-end model of Rasmussen to explain the intentions behind design choices in industrial systems. We have extended a design tool, the Design++ environment, to capture higher-level knowledge about the artefact under design and to explain this knowledge to the users. Means-end knowledge is encoded through relations added on top of the existing part-of and subclass hierarchies in the tool. Explanations are encoded as structured text in the objects and design rules. We have tested the explanations with a power plant application. Although still very primitive, this research prototype has confirmed that means-end information can be used effectively both to capture and to explain design knowledge. We adopt a simplified view of design processes and decisions, placing our main emphasis on artefact modelling and explanation techniques.
1. INTRODUCTION
Industrial plant design is a complex activity. Designers have to understand a great deal of design information provided by other design disciplines to be able to extend the designs successfully. A major part of a designer's work consists of reusing designs. Unfortunately, design documents represent only the end result, not the underlying intentions, which implies reverse engineering the purposes behind designs. Ways to clarify this teleological knowledge are needed.
1.1 Capturing design knowledge
The purpose of an item is well documented only in the case of exceptional design choices, where the designers want to emphasise the deviation from standard practice. Otherwise, the purpose of the item is expected to be understood through experience. This may not be the case: memorised knowledge deteriorates with time, and seemingly similar solutions may be fundamentally different. Even between experienced designers there may be misunderstandings due to their different backgrounds. Moreover, the end users may interpret the purpose incorrectly, which can lead to non-optimal performance at the plant. We believe that some of this design knowledge could be captured with the design tools. The tools should include clear models of the design knowledge and be able to communicate this knowledge to the end users. Even if the tools do not fully support modelling, they should support knowledge acquisition from the designers. In addition to enhancing the communication between designers, operators would be helped as well.
1.2 Related work
In this paper, we concentrate on capturing and explaining knowledge in design tools, somewhat overlooking the user's end of the problem. We adopt a simplified view of design processes and decisions, placing our main emphasis on artefact modelling and explanation techniques. Other researchers have studied the topic of modelling design extensively [Chandrasekaran 93, Gruber 92, Keuneke 91, Klein 93, Lee 91, Stephanopoulos 90]. We share their views, adding ideas from means-end modelling [Kaarela 93a, 93b, 94, Larsson 92, Lind 90, Rasmussen 86, Sassen 93] and explanation research [Franke 91, Huuskonen 92, Swartout 91]. Our main contribution has been in formulating an algorithm for deriving explanations and in embedding the required means-ends representations in a design tool. Here we target only two aspects of design knowledge: purposes and justifications.

[Figure 1 content: goals are realized/achieved by subgoals; subgoals are supported by functions and subfunctions; these in turn are implemented by devices and their parts/subparts, spanning the ends-means and whole-part dimensions.]
Figure 1: Dependencies between levels capture means-ends information

2. EXPLANATION THROUGH MEANS-END MODELS
2.1 Means-end modelling
Means-end modelling, pioneered by Rasmussen, is a sound way to structure knowledge about a plant [Rasmussen 86]. It makes the function and goal levels of the systems explicit (Figure 1). The means-ends dimension is divided into levels of abstraction, with the more abstract concepts (goals) at the top and the very concrete concepts (devices) at the bottom. Each device is given a purpose through the functions it performs. In turn, the purpose of these functions is to achieve a set of goals. Through these many-to-many relations between the model items one can explain the purpose behind a design choice [Rasmussen 86]. To see why an item was included in the representation, the mapping to the next higher level has to be investigated; this gives the purpose of the item. Correspondingly, to find out how an item is implemented, the relation to the next lower level has to be examined [Rasmussen 86].
2.2 Explanation as means-end navigation
We view explanation as navigation in the means-end model. Explanations can be found either explicitly, as structured text in the item under consideration, or indirectly, through relations to other items. We define two ways of explaining designs: explicit explanations, referring to general design knowledge, and derived explanations, referring to related design choices. This is a straightforward approach to modelling design knowledge that does not allow for any deeper automatic understanding of the design, but it suffices for the purpose of explaining the knowledge.
Explicit explanations are structured text that can be attached to objects, their attributes, or design rules. They contain a number of predefined keywords giving the rationale for individual items, and free-form text for less formal definitions. Possible criteria that a designer might offer if asked to justify a design choice include: authority, textbooks, a similar case, standards, law, safety regulations, cost, tradition, and personal preferences. These criteria are represented as keywords. This semiformal approach allows both formal and informal knowledge to be captured.
A justification can also be derived by traversing the network formed by the means-end relations: indirect explanations are derived from other items through relations (Figure 2). An item may possess explicit justifications of design knowledge, in which case the justifications are simply shown to the user. If an item does not have an explicit purpose, its neighbouring objects are searched recursively until an explicit purpose is found. If means-end links (the 'realises' relation in the figure) are not defined for an item, the purposes can be derived through structural decomposition (part/subpart relations). The search continues until an explicit justification is found. Means-end relations are given priority and, if they are not explicitly defined, an implicit relation is derived through the part-of hierarchies. The number of potential relations is decreased, since several items can inherit a common purpose from their superiors.

[Figure 2 content: the device Pump.1, part of Pump_system, realizes the function Fw-pressure-elevation, which realizes the goal Raise-pressure; justifications are either explicit or derived via means-ends and structural relations.]
Figure 2: Explanations derived through relations
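Our reading of this search can be summarised in a short C sketch (names and structure are ours, not the prototype's Lisp):

#include <stdio.h>

/* An item's justification: its explicit purpose if present; otherwise
 * search the means-end link first, then fall back to the part-of link. */
typedef struct Item {
    const char *name;
    const char *purpose;       /* explicit purpose text, or NULL */
    struct Item *realizes;     /* means-end relation (priority)  */
    struct Item *part_of;      /* structural fallback            */
} Item;

static const char *derive_purpose(const Item *it) {
    if (!it) return NULL;
    if (it->purpose) return it->purpose;           /* explicit wins   */
    const char *p = derive_purpose(it->realizes);  /* means-end first */
    return p ? p : derive_purpose(it->part_of);    /* then part-of    */
}

int main(void) {
    Item goal = { "Raise-pressure", "Raise pressure from 7 to 245 bars", NULL, NULL };
    Item func = { "Fw-pressure-elevation", NULL, &goal, NULL };
    Item sys  = { "Feedwater_pump_system", NULL, &func, NULL };
    Item pump = { "Feedwater_pump", NULL, NULL, &sys };
    printf("Purpose of %s: %s\n", pump.name, derive_purpose(&pump));
    return 0;
}

The example mirrors the trace in Figure 3 below: the pump has no explicit purpose, so the search climbs the part-of and means-end links until it reaches a goal that has one.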
3. RESULTS
The proposed ideas have been embedded in a research prototype, built as an extension of the Design++ tool that increases its representational capabilities with features supporting means-end modelling and justifications.
3.1 Research prototype
The prototype can answer some specific types of questions about purposes and justifications, and it can export the knowledge as C++ and Prolog structures for future embedding in automation systems. The questions currently allowed include "What is the purpose of this item?", "What is the purpose of this attribute?", "How is this item implemented?", and "How are the values for this attribute obtained?". We have tested the explanations with a power plant application (Figure 3). The research prototype, although very primitive, has shown that means-end knowledge can effectively explain design information. Currently, the user interface of the prototype is quite limited. It uses the facilities available in KEE and Design++: windows, mouse and menus. Queries for justification can be made either via object menus or Lisp commands, and the justifications are displayed as narrative text. The means-end relations are drafted in AutoCad drawings or directly in the object hierarchy [Kaarela 94].
3.2 Design++ tool
Design++ is a general engineering tool marketed by Design Power Inc. It is suited to the maintenance of product and design knowledge [Katajamäki 91]. The tool encodes knowledge about the plant under design in KEE and Oracle objects, organised in class hierarchies and part-of hierarchies.
[Figure 3 content includes a component window (Component FEEDWATER_PUMP.S627 in Model HFW31; Class: FEEDWATER_PUMP in Library HFW3; Assembly: FEEDWATER_PUMP_SYSTEM; Parts: ADJUSTING_DEVICE, ENGINE_DEVICE, INSTRUMENT; attribute PURPOSE with the comment "Why is this thing necessary in the process?"), the part-of hierarchy of the feedwater system, and the following explanation trace:
No explicit purpose for Part/Feedwater_pump.s627, but it is part of: Assembly/Feedwater_pump_system.s491.
No explicit purpose for Assembly/Feedwater_pump_system.s491, but it has links to upper levels: Function/Fw-pressure-elevation.
No explicit purpose for Function/Fw-pressure-elevation, but it has links to upper levels: Goal/Produce-flow, Goal/Raise-pressure.
The purpose of Goal/Produce-flow is: (Produce a water flow of 125 kg/s).
The purpose of Goal/Raise-pressure is: (Raise pressure from 7 to 245 bars).]
Figure 3: The research prototype built on top of Design++. The windows at top right and centre show the means-end and part-of hierarchies of a feedwater system. The bottom window shows the trace of a generated explanation.
Each attribute of an object may have an associated design rule written in Lisp that can be used to infer the value of the attribute. Figure 4 shows a rule that calculates the hydraulic power of the feedwater pump. The rules may retrieve knowledge from other objects' attributes. As a test bed, we used the feedwater system of a peat-fired power plant, modelled by Tampella Power, a Finnish boiler manufacturer. We extended these models to capture means-end information about the plant and to explicitly record justifications of design decisions.
3.3 Justifiable Objects
We have extended the Design++ tool with the notion of justifiable objects. They are objects, classes, or instances that can reside at any level of abstraction or decomposition and can offer justifications of themselves. This extends the authors' previous work on explainable objects [Huuskonen 92]. Explicit justifications are encoded in the Lisp keyword lists PURPOSE and RATIONALE. The lists may also contain free-form English explanations. They can reside in an object's attributes, in an attribute's facets, or in a Lisp rule's code. The lists are generic structures in the sense that their representation is the same regardless of the type of item they refer to. However, their meaning changes depending on the place, referring to either the object, the attribute, or the rule.
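A rough C rendering of a justifiable object (hypothetical; the prototype encodes these as Lisp keyword lists, not C structures):

#include <stdio.h>

/* the predefined justification criteria named in Section 2.2,
 * plus free-form text for less formal definitions */
typedef enum { AUTHORITY, TEXTBOOK, SIMILAR_CASE, STANDARD, LAW,
               SAFETY, COST, TRADITION, PREFERENCE, FREE_TEXT } Criterion;

typedef struct { Criterion kind; const char *text; } JustEntry;

typedef struct {
    const char *name;
    JustEntry purpose[4];      /* PURPOSE keyword list   */
    JustEntry rationale[4];    /* RATIONALE keyword list */
} JustifiableObject;

int main(void) {
    JustifiableObject pump = {
        "FEEDWATER_PUMP.S627",
        { { FREE_TEXT, "To calculate the hydraulic power of the pump." } },
        { { TEXTBOOK,  "Fundamentals of hydraulic engineering, p.999" } }
    };
    printf("%s rationale: %s\n", pump.name, pump.rationale[0].text);
    return 0;
}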
4. DISCUSSION
The prototype, built on top of a design tool, helps to capture knowledge from the design and to structure the knowledge in a meaningful way. The suggested representations capture semantic knowledge of the domain and the designed artefact.
4.1 Improvements to design
The suggested ideas raise the abstraction level of the design documentation, formalise work habits, and force the designer to think about the reasons for design decisions. These developments should bring increased quality, reliability, and safety in the use of the plant. Also, communication among designers would be aided. However, design knowledge may sometimes be very fuzzy, hard to model with the framework described here, and therefore hard to justify. The designers or the companies may be unwilling to justify their proprietary designs. The extra work imposed by modelling will not be easily accepted by the designers, even if the benefits are clear. Our approach would save some of the work by propagating purposes through relations, though some manual knowledge entry is still necessary. Moreover, different power plants are rather similar: most designs are reused at succeeding plants. This suggests that the justifying work could be reused as well. We chose the semiformal approach to modelling design knowledge because of its simplicity, especially since our tool did not support modelling the design process itself. Our main aim was to test the explanatory capabilities of the means-end model; more elaborate design representations were not seen as essential for this purpose. Note that the prototype has only been embedded into a design tool, where it demonstrates supporting the designer's work through explanations. In the future, similar explanatory facilities should be included in operator support systems as well.

[Figure 4 content: Component FEEDWATER_PUMP.S627 in Model HFW31; Class: FEEDWATER_PUMP in Library HFW3; Assembly: FEEDWATER_PUMP_SYSTEM; Parts: ADJUSTING_DEVICE, ENGINE_DEVICE, INSTRUMENT. The attribute HYDRAULIC_POWER ("Hydraulic power efficiency [kW]") carries the design rule:
(! SELF HYDRAULIC_POWER
   (DOCS (RATIONALE (TEXT "The hydraulic power of the pump is the pressure difference over the pump divided by the operation flow rate volume.")
                    (TEXTBOOK "Fundamentals of hydraulic engineering, p.999"))
         (PURPOSE (TEXT "To calculate the hydraulic power of the pump.")))
   (* (/ (:? MY PRESSURE_DIFFERENCE) (:? MY OPERATION_FLOW_RATE_VOLUME_NORMAL)) 1000))
Valueclass: (NUMBER)]
Figure 4: Explicit explanations in a design rule

4.2 Efficiency issues
Initial analysis suggests that even though means-end models form networks with a large number of links, they do not pose any major computational problems. The physical level of the feedwater system model consists of some 1500 components, with numerous potential many-to-many relations. Fortunately, the complexity is reduced by the relations shared through the part/subpart and class/subclass relations: the purpose of several subparts is defined by a single superpart. The number of items decreases on the upper levels; there are some 100 items on the functional level and about 25 items on the goal level. For justification, the search space narrows upwards, avoiding complexity problems. Furthermore, the system can be made to act in a consultative mode, where the user interactively guides the search to interesting parts of the system, which also helps with the complexity problem.
4.3 Future work
Ways to manage several alternative paths through the models need more study. Further work is also needed on reducing explanatory detail and on customisation to individual users. At present the prototype is built inside the design tool, but in the future the explanatory mechanisms will be embedded into automation systems. The prototype needs better interfaces for the design of the means-ends hierarchy: improved graphical browsers and relational analysers would be needed before the tool could be introduced to the designers. More elaborate design representations would also have to be included in further development.
ACKNOWLEDGEMENTS
This research has been funded mainly by the Technology Development Centre (TEKES), Finland. Financial support has also been provided by the Technical Research Centre of Finland (VTT), the University of Oulu, Imatran Voima Oy, Valmet Automation Inc., and Tampella Power Inc. We would like to thank our industrial partners for their support and feedback. Jaakko Oksanen of VTT has contributed substantially to the implementation of the features supporting means-end modelling in the design tools.
REFERENCES
[Chandrasekaran 93] Chandrasekaran, B. et al.: "Functional Representation as Design Rationale", IEEE Computer, Vol. 26, No. 1, 1993, pp. 48-56.
[Franke 91] Franke, D.W.: "Deriving and Using Descriptions of Purpose", IEEE Expert, April 1991, pp. 41-47.
[Gruber 92] Gruber, T., Russell, D.M.: "Beyond the Record and Replay Paradigm for Design Rationale Support", in Working Notes of the AAAI'92 Workshop on Design Rationale Capture and Reuse, San Jose, July 15, 1992, pp. 111-118.
[Huuskonen 92] Huuskonen, P., Korteniemi, A.: "Explanation Based on Contexts", Proc. 8th IEEE Conference on Artificial Intelligence for Applications, Monterey, California, March 2-6, 1992, pp. 179-185.
[Kaarela 93a] Kaarela, K., Huuskonen, P., Leiviskä, K.: "The Role of Design Knowledge in Industrial Plant Projects", Proceedings of the International Conference on Cognitive and Computer Sciences for Organizations, Montreal, Canada, May 4-7, 1993, pp. 173-183.
[Kaarela 93b] Kaarela, K., Huuskonen, P., Jaako, J.: "Providing Plant Design Knowledge to the Operators", Proc. Fifth Intl. Conf. on Human-Computer Interaction, August 8-13, 1993, Orlando, FL, USA, Vol. 1, pp. 546-551.
[Kaarela 94] Kaarela, K., Oksanen, J.: "Structuring and Recording Plant Design Knowledge", Proceedings of the IFIP International Conference on Feature Modelling and Recognition in Advanced CAD/CAM Systems, Valenciennes, France, May 24-26, 1994, Vol. 2, pp. 853-866.
[Katajamäki 91] Katajamäki, M.: "Knowledge-Based CAD", Expert Systems with Applications, Vol. 3, 1991, pp. 277-287.
[Keuneke 91] Keuneke, A.: "Device Representation: The Significance of Functional Knowledge", IEEE Expert, April 1991, pp. 22-25.
[Klein 93] Klein, M.: "Capturing Design Rationale in Concurrent Engineering Teams", IEEE Computer, January 1993, pp. 39-47.
[Larsson 92] Larsson, J.-E.: "Knowledge-Based Methods for Control Systems", Doctoral thesis, Lund Institute of Technology, Lund, Sweden, 1992, 236 p.
[Lee 91] Lee, J., Lai, K.-Y.: "What's in Design Rationale?", Human-Computer Interaction, Vol. 6, No. 3&4, 1991, pp. 251-280.
[Lind 90] Lind, M.: "Representing Goals and Functions of Complex Systems: An Introduction to Multilevel Flow Modelling", Institute of Automatic Control Systems report No. 90-D-381, Technical University of Denmark, November 1990.
[Rasmussen 86] Rasmussen, J.: "Information Processing and Human-Machine Interaction", North-Holland, Amsterdam, 1986, 215 p.
[Sassen 93] Sassen, J.A.M.: "Design Issues of Human Operator Support Systems", Doctoral thesis, Delft University of Technology, Delft, The Netherlands, 1993, 226 p.
[Stephanopoulos 90] Stephanopoulos, G.: "Artificial Intelligence in Process Engineering -- Current State and Future Trends", Computers in Chemical Engineering, Vol. 14, No. 11, 1990, pp. 1259-1270.
[Swartout 91] Swartout, W.R., Paris, C.L., Moore, J.: "Design for Explainable Expert Systems", IEEE Expert, Vol. 6, No. 3, 1991, pp. 58-64.
Method of Ecological Interface Design Applied to Interactive Diagnosis Support System
Yoko Asano, Shun-ichi Yonemura, Hiroshi Hamada, and Katsuhiko Ogawa
Human Interface Laboratories, NTT, 1-2356 Take, Yokosuka-shi, Kanagawa, 238-03 JAPAN
This paper proposes a procedure for designing ecological interfaces suitable for interactive diagnosis support systems, based on the perspectives of work analysis and interface design proposed by Rasmussen [1, 2]. Several of Rasmussen's perspectives are chosen and concrete transformation paths are created for them. The effectiveness of the proposed procedure is confirmed by designing and testing a support system for interactive diagnosis.
1. INTRODUCTION
Human recognition of work situations is necessary for ensuring flexible work. Consequently, interfaces that support human cognitive processing in a given work situation are needed. Instead of giving advice on what should be done next in every situation, it is more effective to represent the whole work situation by providing alternatives for the following task together with the information necessary for cognitive processing, that is, to design an ecological interface. An ecological interface is an interface through which people can directly perceive the meaning and value of the available information. This paper investigates a method of ecological interface design adapted to interactive diagnosis support systems, based on the perspectives proposed by Rasmussen [1, 2]. First, the perspectives of work analysis and the design process proposed by Rasmussen for designing ecological interfaces are introduced. Next, a procedure for designing interactive diagnosis support systems is proposed. Dialogues of interactive diagnoses are analyzed, and an interactive diagnosis support system is designed following the procedure. An experiment using the designed system is conducted to evaluate the effectiveness of the design procedure.
2. PERSPECTIVES OF ECOLOGICAL INTERFACE DESIGN
Rasmussen proposed several different perspectives of work analysis and design concepts for designing ecological information systems. Five perspectives were proposed for work analysis: work domain, task situation, decision making situation, mental strategies, and information needs. The work domain is represented using a two-dimensional matrix of a means-ends abstraction level and a whole-part decomposition level. The means-ends dimension is divided into five levels: purposes, abstract functions, general functions, physical functions, and physical form. The whole-part dimension is divided into four levels: total system, subsystem, unit function, and component. Task situation analysis represents the information flow and causal relations of the tasks. The work organization and user characteristics are understood from the information needs analysis.
Five frameworks were proposed to characterize the design territory: work domain characteristics, knowledge base organization, navigation representation, knowledge representation, and display composition. The work domain characteristics are characterized in terms of the necessity of human information processing to carry out the work. Navigation is represented by a flow chart of the task transitions. The display composition is represented based on the means-ends and whole-part structure. The knowledge representation gives us a framework for the transformation from the perspectives of work analysis and user characteristics to the interface design.
3. APPLYING THE PERSPECTIVES FOR INTERACTIVE DIAGNOSIS
When customers experience trouble with a piece of equipment, they often query an expert operator over the phone. The operator must recognize the customer's trouble situation and diagnose the trouble. If the operator can visit the equipment site, he/she can inspect the trouble situation directly and diagnosis is easier. When the operator makes the diagnosis over the phone, however, he/she can obtain information about the trouble only through the customer's reports. The operator always has to be sensitive to variations in the dialogue situation and change his/her diagnosis strategies flexibly. Therefore, a system that supports the operator's cognitive processing during an interactive diagnosis session is needed. This section investigates a procedure for interactive diagnosis support system design based on the perspectives proposed by Rasmussen.
3.1. Adaptation of the Perspectives for Interactive Diagnosis
Since the perspectives proposed by Rasmussen provide only a framework for the design process, we must choose and adapt the perspectives to the interactive diagnosis being performed. We must also investigate how to reflect the results of the work analyses in the design. Functions, information flow, and display compositions must be decided to design an interactive diagnosis support system; the frameworks of navigation and display composition are needed for this. According to the knowledge representation framework, the representation of the display information depends on the necessity of human judgment to carry out the work. Consequently, the framework of the work domain characteristics is necessary to categorize the relationships between activity and human information processing. The perspectives of work domain, task situation, and mental strategies for work analysis are needed to use these design frameworks.
3.2. Procedure of Interactive Diagnosis Support System Design
Some of the perspectives proposed by Rasmussen were selected, and their transformation paths were created to design an ecological interface for the interactive diagnosis support system, as shown with arrows in Figure 1.
[Figure 1. Procedure of interactive diagnosis support system design: work analyses (work domain, task situation, mental strategy) feed a classification of the work characteristics (via decisions such as "Are there any rules or constraints?"), followed by prototyping, experiment, and analysis.]

First, work analyses are conducted from three perspectives: work domain analysis, task situation analysis, and mental strategy analysis. Next, the work characteristic is categorized based on the task transition pattern; a concrete decision rule is newly determined for this categorization using the framework of work domain characteristics. The interfaces of the support system are then designed using the frameworks of navigation and display composition, based on the results of the work analyses.

4. INTERACTIVE DIAGNOSIS ANALYSES
Twenty-eight cases of interactive diagnosis of telephone trouble over the phone were analyzed according to the proposed procedure. Each entire dialogue was analyzed in terms of the three perspectives of work analysis.

4.1. Work Domain Analysis

Keywords of the dialogues were analyzed based on the means-ends and whole-part abstraction levels. Five means-ends abstraction levels and four whole-part decomposition levels were determined for the dialogue of interactive diagnosis. All pairs of subjective and predicative keywords in the dialogues were categorized into the two-dimensional levels. The result indicated that dialogue flow often stagnated when one party's keyword abstraction level was extremely different from the other's.

4.2. Task Situation Analysis

Operators' information processes during interactive diagnosis were analyzed and decomposed into several task units in terms of decision-making function. These include interpreting expressions, supposing causes, asking questions, testing,
verification, diagnosis, and so on. These tasks were performed in many different orders. Possible patterns of transition between the task units were extracted as a flow chart, although the transition rules could not be clearly determined. The total time for each kind of task was calculated: the expression-interpreting task accounted for 45 percent of the total dialogue time in interactive diagnosis, and the question-asking task accounted for 28 percent.

4.3. Mental Strategy Analysis

The operators' mental strategies were analyzed based on the relationships of dialogue flow between the customer and the operator. Two diagnosis strategies were found: an operator-leading strategy and a user-leading strategy. In the operator-leading strategy, the operator's questions mainly take the initiative; in the user-leading strategy, the customer's reporting predominates. Operators frequently shifted between these two strategies. When an operator led the dialogue without changing his strategy, his dialogues contained nearly twice as many words as when he shifted his strategy.

4.4. Classification of the Work Characteristics

Task flow is not regular in interactive diagnosis. The next task is selected from many alternatives by the operator's judgment. Consequently, interactive diagnosis cannot be categorized as highly structured work; it must be categorized as either autonomous work or constrained work. On the other hand, each task is considered while being aware of the constraints of the physical and functional structures of telephones, because a flow chart describing the alternatives for the following tasks could be constructed from the task situation analysis. Therefore, interactive diagnosis is finally categorized as constrained work.

5. INTERACTIVE DIAGNOSIS SUPPORT SYSTEM

The task situation analysis indicated that the expression-interpreting task and the question-asking task, in particular, take longer than the other tasks. Consequently, functions that support these two tasks are required. Moreover, the mental strategy analysis indicates that the operator's strategy involves frequent shifts; therefore, all the functions should always be active in the support system. Since interactive diagnosis was categorized as constrained work, the system can indicate the alternatives for the following task, based on the list of confirmed conditions, according to the flow chart of the tasks (a sketch of this lookup follows below). Based on the work domain analysis, displays organized from elements of similar abstraction levels would prevent dialogue flow stagnation. However, many representations can be imagined, and the best one cannot be determined from the work domain analysis alone. In this system, the most primitive representation, keyword menus, is adopted.
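A minimal sketch of this lookup, assuming the task flow chart has been flattened into a transition table; the task names and conditions below are hypothetical placeholders, not the actual chart from the study.

```python
# Transition table derived (hypothetically) from the task flow chart:
# the set of confirmed conditions maps to the allowed next tasks.
FLOW_CHART = {
    frozenset():                     ["interpret expression"],
    frozenset({"trouble reported"}): ["suppose causes", "ask question"],
    frozenset({"trouble reported",
               "cause supposed"}):   ["ask question", "test"],
}

def next_task_alternatives(confirmed_conditions: set) -> list:
    """Return the alternative next tasks allowed by the flow chart."""
    return FLOW_CHART.get(frozenset(confirmed_conditions),
                          ["interpret expression"])  # assumed fallback

print(next_task_alternatives({"trouble reported"}))
# -> ['suppose causes', 'ask question']
```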
5.1. The Interactive Diagnosis Support System Design

An interactive diagnosis support system is proposed based on the results of the work analyses. Figure 2 shows one of the displays of the proposed system. The system supports two operator tasks: interpreting expressions and asking questions. The task of interpreting the customers' troubles through their reports is supported by a function with which operators can enter the reported conditions by selecting, from alphabetical keyword menus, the keywords matching the customers' reports. The keyword menus are divided into a menu of objects and a menu of functions according to the two dimensions of the work domain analysis. The task of asking questions to diagnose the trouble is supported by suggesting the question that the operator should ask next. When the operator requests the result of a diagnosis, the alternative causes are presented. The following questions and causes are decided based on the flow chart. Conventional diagnosis support systems mainly support the question-asking task by showing the next question in a fixed order. In the proposed support system, conditions reported by a customer can be entered at any time and are reflected in the following questions.

[Figure 2. An example of the interactive diagnosis support system (original written in Japanese): an alphabetical menu of objects (bell, bottom, button, busy tone, call, cord, ...), an alphabetical menu of functions (receive, recondition, reconfirm, record, recover, reduce, ...), and a question area ("Can you receive the phone?" with the answer buttons yes / sometimes / no).]
5.2. The Interactive Diagnosis Experiment

An interactive diagnosis experiment was conducted to evaluate the designed system.

Participants: Eight subjects, none of them experts in telephone servicing, participated in the experiment. Four were asked to act as operators diagnosing customers' telephone troubles using two interactive diagnosis support systems. The other four were asked to act as customers reporting their trouble to the operator over the phone. Each operator was paired with a customer.

Systems: The proposed system and a conventional support system, which did not respond to entry of the customers' reports, were used. The operators used both systems to diagnose two kinds of troubles. The customers used a telephone that had a cordless extension.

Procedure: Two telephone trouble situations, deliberately caused by the experimenter, were assumed. The operator diagnosed them with one of the two systems through the customer's reports. The problems, the systems, and their order were randomly assigned.

5.3. Results and Discussion

All operators could solve the troubles created. The total time of each dialogue was measured; there was no significant difference between the systems in terms of time. However, time is not an essential measure of the systems' efficiency, because the time taken depends mainly on the participants' pace and dialogue expressions. The contents of the conversations and the operators' entry operations were analyzed to evaluate their diagnosis processes. The total number of operator questions and customer reports was counted as the number of interactions. The average number of interactions was 14.25 with the conventional system and 12.0 with the proposed system. When using the proposed system, the operators never asked questions that overlapped previous customer reports, because the customers' reports affected the diagnosis offered by the system. These results indicate that interpreting the customers' reports leads to efficient interaction. Therefore, the task situation analysis and the mental strategy analysis are effective in determining which tasks should be supported and in designing the relationships among the functions. In the proposed system, the customers' reports are entered by selecting appropriate keywords from the alphabetical keyword menus. However, three of the eleven keywords were selected with the wrong meaning when entering the customers' reports on the proposed system. This indicates that many keyword expressions can correspond to one condition, and that it is difficult for operators to find the appropriate keywords in the full keyword list. A different entry operation is needed: graphical choice or hierarchical menus, instead of keyword choice from alphabetical menus, would reduce wrong selections.

6. CONCLUSION

A new procedure for interactive diagnosis support system design was proposed based on the concept of the ecological interface proposed by Rasmussen. Interactive telephone trouble diagnoses were analyzed according to the proposed procedure to design an interactive diagnosis support system, and an interactive diagnosis experiment using the designed system was conducted to evaluate the method. The results indicate that the task situation analysis and the mental strategy analysis are effective for designing the functions and information flow of interactive diagnosis support systems. However, the representation of the display cannot be determined logically; since there can be many alternatives, it is necessary to repeat the design-and-test process at the final stage of interface design.

ACKNOWLEDGMENTS

The authors would like to thank Dr. J. Rasmussen for his helpful discussions and comments in the earlier stages of this research. They also wish to express their appreciation to T. Kishimoto of NTT Human Interface Laboratories.

REFERENCES

1. J. Rasmussen and A. M. Pejtersen, MOHAWC Taxonomy: Implications for Design and Evaluation, RISO-R-673(EN), RISO National Laboratory, 1993.
2. J. Rasmussen, A. M. Pejtersen, and L. P. Goodstein, Cognitive Systems Engineering, Wiley-Interscience, 1994.
3. Y. Asano and K. Ogawa, Interactive Diagnosis Support Systems Based on Operators' Mental Strategies, The Japanese Journal of Ergonomics, No. 30 (1994) 136.
4. Y. Asano, S. Yonemura, and H. Hamada, The Effect of Keyword Choice on an Interactive Diagnosis Support System, Proceedings of the 49th Annual Meeting of the Information Processing Society of Japan, (1994) 4-347.
III.15 Computer Modeling of Mental Processes
Computer analysis of characteristics of creative thinking and self-esteem level

A.E. Kiv(a), V.A. Molyako(b), Stephen T. McHale(c), V.G. Orishchenko(a) and I.A. Polozovskaya(a)

(a) South-Ukrainian Pedagogical University, 26 Staroportofrankovskaya, 270020 Odessa, Ukraine
(b) Institute of Psychology, 2 Pankovskaya, 252033 Kiev, Ukraine
(c) University of Maryland, Art-Sociology, College Park, 20742, USA

Abstract

This paper describes the correlation between the level of a person's creative abilities and his or her self-esteem. A relationship is demonstrated between creative thinking level parameters and the self-esteem level. These dependencies may be used for comparing different mental groups.

1. DETERMINATION OF CREATIVE THINKING LEVEL

It is believed that creative abilities influence many other psychological features of a person. The level of creative ability may be considered a core characteristic of a person, on which behavior and self-esteem are based. A high level of creative thinking promotes adequate perception of external conditions. This may be explained by certain correlations between the creative thinking level and the intellectual level of a person. That is why it is important to have effective methods for measuring these characteristics. There are well-known tests for measuring the intellectual level (IQ) [1], but new testing methods are needed for determining the creative thinking level [2, 3].
1.1. Computer testing of creative thinking

According to the proposed model of creative thinking processes [3], solving a creative problem can be considered a consistent series of thinking steps. To reach the solution of a creative problem, some critical number of effective thinking steps must be completed. Each thinking step toward the solution is considered an element of the thinking space. There are a number of local regions in the thinking space that contain different kinds of thinking elements. We describe the creative thinking process on the basis of a kinetic model of the accumulation of thinking elements in regions of the thinking space; the interactions between thinking elements are also considered. Typical equations describing the creative thinking process when choice problems are solved are:
dN1/dt = I1 + a1 N1^n - b1 N2,    (1)

dN2/dt = I2 + a2 N2^k - b2 N1,    (2)

where N1 and N2 are the numbers of solution elements accumulated in the two alternative directions corresponding to the right and the wrong choices, I1 and I2 are coefficients of the intuitive component, a1 and a2 are coefficients of the mastering of accumulated information, n and k are the orders of the "psychological reaction", and b1 and b2 are coefficients of the mutual influence of the alternative solution ways. The structure of these equations always corresponds to the particular type of test problems, which have a computer realization. The computer testing problems do not require any prior knowledge and look like games. The developed software produces experimental longitudinal dependencies N1(t) and N2(t) that are stored in a data file. At the next stage, the theoretical functions N1(t) and N2(t), obtained as solutions of the above equations, are compared with the experimental data. As a result of this comparison, the coefficients of the thinking process I1,2, a1,2, b1,2, n, k are calculated by the software. It should be noted that the proposed testing programs do not check knowledge but bring out objective information about the structure of a person's creative thinking. We use simple logical computer games, such as the Towers of Hanoi and river crossing, to collect data. When the children are done playing, we work out the psychological parameters of the thinking process using a special program, which contains an approximation procedure and a procedure to determine the psychological parameters.

1.2. Comparison with other testing methods

In this study the general indexes of the children's abilities were also measured by the Torrance method [6]. Results are represented in Table 1.
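As an illustration of how equations (1) and (2) could be integrated numerically before being fitted to the recorded curves, the following Python sketch uses invented coefficient values; the actual parameter values would come from the approximation procedure described above.

```python
# Illustrative numerical integration of equations (1)-(2); all coefficient
# values are invented for demonstration, not fitted results from the paper.
from scipy.integrate import solve_ivp

I1, I2 = 0.5, 0.3      # intuitive components (assumed)
a1, a2 = 0.10, 0.08    # information-mastering coefficients (assumed)
b1, b2 = 0.02, 0.03    # mutual-influence coefficients (assumed)
n, k = 1, 1            # orders of the "psychological reaction" (assumed)

def rhs(t, N):
    N1, N2 = N
    return [I1 + a1 * N1**n - b1 * N2,
            I2 + a2 * N2**k - b2 * N1]

sol = solve_ivp(rhs, (0.0, 60.0), [0.0, 0.0], dense_output=True)
# The recorded curves N1(t), N2(t) from the data file would then be compared
# with sol.sol(t), e.g. via scipy.optimize.least_squares, to recover the
# psychological parameters I, a, b, n, k.
```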
Table 1
Comparison of our computer method and the Torrance method

| Age group          | Parameter | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  |
|--------------------|-----------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| A = 6.9, SD = 0.5  | a         | 0.6 | 0.4 | 0.7 | 0.7 | 0.5 | 0.6 | 0.2 | 0.5 | 0.2 | 0.2 |
|                    | FT        | 0.4 | 0.4 | 0.5 | 0.6 | 0.6 | 0.4 | 0.4 | 0.4 | 0.1 | 0.2 |
| A = 10.1, SD = 0.6 | a         | 0.7 | 0.7 | 0.8 | 0.3 | 0.8 | 0.4 | 0.6 | 0.7 | 0.6 | 0.6 |
|                    | FT        | 0.5 | 0.4 | 0.6 | 0.3 | 0.5 | 0.4 | 0.4 | 0.5 | 0.3 | 0.5 |
| A = 12.7, SD = 0.5 | a         | 0.4 | 0.8 | 0.7 | 0.7 | 0.7 | 0.5 | 0.4 | 0.7 | 0.6 | 0.6 |
|                    | FT        | 0.5 | 0.6 | 0.6 | 0.7 | 0.5 | 0.6 | 0.3 | 0.5 | 0.6 | 0.4 |

Note: A is the mean age, SD is the standard deviation; columns 1-10 are the ten subjects.
We constructed auxiliary parameters that have the same sense as the corresponding Torrance characteristics. For instance, the parameter of searching activity in general is determined as

a = (I1 + I2 + a1 + a2) / (b1 + b2).    (3)

We compared parameter a with the Torrance characteristic that may be called the fluency of thinking (FT). Data for ten subjects (5 boys and 5 girls) from three age intervals are represented in Table 1; the choice of subjects was arbitrary. Parameters a and FT in Table 1 are given in relative units. A similar comparison was carried out for the other parameters. The correlation coefficients between the Torrance parameters and those obtained by our computer testing method are 0.7-0.9, significant at the 95% level. Thus our results correspond to those obtained by the Torrance method, but our method yields more precise information about creative thinking mechanisms.

2. CORRELATIONS BETWEEN CREATIVE THINKING AND SELF-ESTEEM LEVELS

The aim of these investigations was to clarify how the creative abilities of children are connected with their self-esteem. It is believed that the self-esteem of each person influences the general characteristics of a group. Therefore, correlations between the creative thinking and self-esteem levels of people may be used for the optimal organization of different groups in educational, research, and other institutions. For this study we chose two groups of students in a special school for gifted children in Odessa (Ukraine). Each group included twenty persons, with average ages of 11 and 13 years. The information about the creative thinking level was obtained with the help of the new computer testing methods described above. The students had become accustomed to using computer games during their computing classes, and our computer tests were hidden among the usual computer games, so the children were tested without any negative emotions. We used two ways of obtaining information about the self-esteem of the children [7]. The first was a set of testing questions proposed to the children during a conversation. Ten questions were formulated, for example: Can you reach a very important decision independently? Does your mood depend on outward conditions? Do you believe in the realization of your dream? Ten adequate answers corresponded to the highest level of self-esteem (1.0); in other cases the level of self-esteem was proportional to the number of adequate answers. The second way of estimating the students' self-esteem was an analysis of character descriptions of the students prepared by their teachers, which gave us additional information concerning the children's self-esteem. The results are given in Table 2 and correspond to children of 13 years average age. The general results are shown in Fig. 1. The results are not simple. Two dashed curves show the envelope of the points corresponding to different children. There is a uniform density of the population on the right of the figure between the dashed curves. However, on the left side we see only large or small values of the self-esteem level. Here we have an effect of polarization of self-esteem for children with a low level of creative abilities.
Table 2
Psychological parameters of the students

| Student | I    | L    | T    | Z    | S    |
|---------|------|------|------|------|------|
| 1       | 0.77 | 0.92 | 0.73 | 0.81 | 0.85 |
| 2       | 0.33 | 0.64 | 0.62 | 0.53 | 0.84 |
| 3       | 0.86 | 0.79 | 0.71 | 0.79 | 0.62 |
| 4       | 0.61 | 0.59 | 0.62 | 0.61 | 0.32 |
| 5       | 0.78 | 0.88 | 0.58 | 0.75 | 0.92 |
| 6       | 0.86 | 0.90 | 0.81 | 0.86 | 0.41 |
| 7       | 0.68 | 0.90 | 0.72 | 0.77 | 0.52 |
| 8       | 0.61 | 0.87 | 0.74 | 0.74 | 0.82 |
| 9       | 0.39 | 0.66 | 0.62 | 0.56 | 0.32 |
| 10      | 0.85 | 0.82 | 0.80 | 0.82 | 0.88 |

Note: I (parameter of intuition) and L (parameter of logic) were obtained by our computer testing methods, and T by the Torrance method. S is the self-esteem parameter. Z is the parameter of creative abilities, obtained as the average of I, L and T.
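As a quick check of Table 2, the following short Python script (values transcribed from the table) recomputes Z as the average of I, L and T and shows the polarization effect for the low-Z students.

```python
# (I, L, T, S) for the ten students of Table 2.
rows = [
    (0.77, 0.92, 0.73, 0.85), (0.33, 0.64, 0.62, 0.84),
    (0.86, 0.79, 0.71, 0.62), (0.61, 0.59, 0.62, 0.32),
    (0.78, 0.88, 0.58, 0.92), (0.86, 0.90, 0.81, 0.41),
    (0.68, 0.90, 0.72, 0.52), (0.61, 0.87, 0.74, 0.82),
    (0.39, 0.66, 0.62, 0.32), (0.85, 0.82, 0.80, 0.88),
]
for num, (i, l, t, s) in enumerate(rows, start=1):
    z = (i + l + t) / 3          # Z = average of I, L and T
    print(f"student {num}: Z = {z:.2f}  S = {s:.2f}")
# Students 2 and 9 (Z of about 0.53 and 0.56) show S = 0.84 and S = 0.32,
# i.e. the extreme self-esteem values described for low-Z children.
```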
[Figure 1. Dependence between the self-esteem level (S, vertical axis) and the average level of creative abilities (Z, horizontal axis); both axes run from 0.2 to 1.0, and two dashed curves bound the scatter of points.]
3. CONCLUSION

The dependence of children's self-esteem on their level of creative abilities was investigated, and new computer testing methods were used for measuring the levels of creative abilities. It was established that children with a low level of creative abilities are characterized by a nonuniform distribution of the self-esteem level: an effect of polarization of self-esteem takes place.

REFERENCES

1. H. Eysenck, The Structure of Human Personality. Harmondsworth: Penguin Books, 1971.
2. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, Computer Modelling of the Learning Organization. In: P.T. Kidd and W. Karwowski (Eds.), Advances in Agile Manufacturing. Amsterdam: IOS Press, 1994, 553-556.
3. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, V.V. Chislov, V.L. Maloryan, Creative Thinking Process Simulation and Computer Testing. Proceedings of the Symposium on Human Interaction with Complex Systems, Greensboro, North Carolina A&T University, 1994.
4. A.E. Kiv, V.A. Molyako, V.L. Maloryan, I.A. Polozovskaya, Z.I. Iskanderova, The Creative Thinking Testing by Using of Testing Problems Based on Different Logical Schemes. In: Y. Anzai and K. Ogawa (Eds.), Proceedings of the 6th International Conference on Human-Computer Interaction (HCI International '95). Amsterdam: Elsevier Science Publishers.
5. V.V. Chislov, V.L. Maloryan, I.A. Polozovskaya, G.V. Shtakser, A.I. Uyemov, I.G. Zakharchenko, M. Athoussaki, The Interface Improvement for the Creative Thinking Computer Testing. In: Y. Anzai and K. Ogawa (Eds.), Proceedings of the 6th International Conference on Human-Computer Interaction (HCI International '95). Amsterdam: Elsevier Science Publishers.
6. E. Torrance (Research ed.), Thinking Creatively in Action and Movement. Bensenville, IL: Scholastic Testing Service, 1980.
7. U. Zabrodin (Ed.), Psychological Diagnosis of Students. Moscow, 1990 (in Russian).
Computer-based testing of reflective thinking: executive control of erroneous performance in 9 to 12 year old children

Uri Shafrir

Department of Applied Psychology, Ontario Institute for Studies in Education, 252 Bloor St., Toronto, Ont. M5S 1V6, Canada

Attention to errors was operationalized recently by the measure of post-failure reflectivity (Shafrir & Pascual-Leone, 1990). Post-failure reflective children spend long periods of time following the production of an incorrect response, compared to the time they spend following the production of a correct response. Shafrir and Pascual-Leone (1990) conceptualized post-failure reflectivity as an exploratory mental executive, a spontaneously activated "debugging" procedure that helps children reexamine and correct faulty internal plans. Researchers reported that children who show high executive control following the production of erroneous performance (post-failure reflective children) scored higher than post-failure impulsive children on an inference task, as well as on other measures of intellectual functioning and academic achievement (Shafrir & Pascual-Leone, 1990); that post-failure reflectivity generalized across tasks and across domains, and that post-failure reflective children were intentional learners (Shafrir, Ogilvie & Bryson, 1990, Experiment 2); and that those children used deep planning in constructing their action plans. In other studies, children with a reading disability were found to be significantly less post-failure reflective than normal controls (Shafrir, Ogilvie & Bryson, 1990, Experiment 1; Shafrir, Siegel & Chee, 1990). Shafrir, Ogilvie and Bryson (1990) claimed that attention to errors plays an important role in learning. This study investigated additional aspects of erroneous performance; in particular, we hypothesized that executive control of performance prior to the production of an erroneous response may also play an important role in learning. A spontaneously activated "rechecking" procedure, cued by a feeling of uncertainty about the planned, impending response, may help the child reexamine internal plans prior to the production of an incorrect response. Attention to expected error production was operationalized by pre-failure reflectivity, defined as the ratio between mean pre-failure response latency and mean response latency; pre-failure reflective children spend long periods of time prior to the production of an incorrect response, compared to the time they spend prior to the production of a correct response. Finally, we operationalized overall executive control of erroneous performance as behavior that is both pre-failure and post-failure reflective. The development of executive control structures for the regulation of erroneous performance may be related to the transition from concrete to formal operations (Case, 1985; Pascual-Leone, 1987; Piaget, 1950). Paying close attention to potential as well as to actual disconfirmation of a hypothesis, i.e., pre- and post-failure reflective behavior, may be a necessary part of interpropositional, hypothetico-deductive thought, of vectorial operations, and of the manipulation of abstract sets. The development of such executive control structures may also signal the emergent increase in representational competence that enables children to expand
the scope of task representation to include a representation of their interaction with the task in real time. The two specific hypotheses tested in this study were: (1) twelve year olds have higher overall executive control of erroneous performance, and score higher on an inference task as well as on other intellectual and academic tasks, than younger children; (2) within each of the four age groups (9, 10, 11, and 12 years old), children who have high overall executive control of their erroneous performance score higher than children with low overall executive control of their erroneous performance on an inference task as well as on other intellectual and academic tasks.
The PAR computer-based inference task

The PAR (Pattern Recognition) task was administered individually. PAR is a computer-based induction task with 80 stimuli of repeated designs shown as bars of different colors, heights, colors + heights, and colors + heights + sounds of varying pitch, where inter-trial intervals are subject controlled. Subjects were asked to decide whether the stimulus was a repeating design; if the subject's answer was "no", he/she was asked to point to the location of the "mistake" in the design with a blinking light on the computer screen. The response was immediately followed by yes/no feedback, and the subject had to strike a key in order to see the next stimulus. The unobtrusive nature of the computer-based PAR task made it possible to record both response and post-response latencies. The response latency was decomposed into pre-success and pre-failure; similarly, the post-response latency was decomposed into post-success and post-failure. Pre-failure and post-failure reflectivities were calculated from the formulae:

Pre-failure reflectivity = Mean pre-failure latency / Mean response latency    (1)

Post-failure reflectivity = Mean post-failure latency / Mean post-response latency    (2)
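A minimal sketch of formulae (1) and (2) in Python, assuming each trial is logged as a (correct, response latency, post-response latency) record and that at least one error occurred.

```python
def reflectivities(trials):
    """trials: list of (correct: bool, response_latency, post_response_latency).
    Returns (pre-failure reflectivity, post-failure reflectivity) per
    formulae (1) and (2); assumes at least one incorrect trial."""
    resp = [r for _, r, _ in trials]
    post = [p for _, _, p in trials]
    pre_fail = [r for ok, r, _ in trials if not ok]
    post_fail = [p for ok, _, p in trials if not ok]
    pre_refl = (sum(pre_fail) / len(pre_fail)) / (sum(resp) / len(resp))
    post_refl = (sum(post_fail) / len(post_fail)) / (sum(post) / len(post))
    return pre_refl, post_refl
```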
Subjects and procedures

Subjects were students in grades four through seven in five public schools in Arad, a town in the south of Israel. We tested an unselected sample of 377 subjects, aged 9 (n = 109), 10 (n = 114), 11 (n = 85), and 12 (n = 69). Children in each age group were divided, by a double median split on pre-failure reflectivity and on post-failure reflectivity, into 4 quadrants: children who were both pre- and post-failure reflective were defined as having high overall executive control of erroneous performance; children who were both pre- and post-failure impulsive were defined as having low overall executive control of erroneous performance; finally, there were two groups of mixed conditions. Scores for the Israeli version of an IQ test (M = 106.1, SD = 11.5) and for Raven's SPM (M = 32.5, SD = 8.6) were obtained when children entered 3rd grade. Scores for computer-based drill and practice in arithmetic, and for teachers' evaluations of the student's level of intellectual functioning (not level of academic achievement) in percentiles, were available at the time of this study. The Figural Intersection Test (FIT) for attentional capacity (Pascual-Leone & Ijaz, 1989) was group-administered.
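The double median split could be implemented along the following lines; the grouping labels follow the definitions above, while the data layout (a list of (pre, post) reflectivity pairs for one age group) is an assumption.

```python
import statistics

def quadrant_split(children):
    """children: list of (pre_reflectivity, post_reflectivity) pairs."""
    pre_med = statistics.median(pre for pre, _ in children)
    post_med = statistics.median(post for _, post in children)
    groups = {"high": [], "low": [], "mixed-pre": [], "mixed-post": []}
    for pre, post in children:
        if pre > pre_med and post > post_med:
            groups["high"].append((pre, post))       # reflective on both
        elif pre <= pre_med and post <= post_med:
            groups["low"].append((pre, post))        # impulsive on both
        elif pre > pre_med:
            groups["mixed-pre"].append((pre, post))  # pre-reflective only
        else:
            groups["mixed-post"].append((pre, post)) # post-reflective only
    return groups
```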
Results

Descriptive statistics are shown in Tables 1 and 2. Results of 2-way ANOVAs, age (4 levels: 9, 10, 11, and 12 years old) by overall executive control of erroneous performance (4 levels: high, low, and two mixed conditions), are shown in Table 3.

Table 1
Mean (SD) score on PAR, pre-failure reflectivity and post-failure reflectivity by age group

| Measure                   | 9 years (n = 109) | 10 years (n = 114) | 11 years (n = 85) | 12 years (n = 69) |
|---------------------------|-------------------|--------------------|-------------------|-------------------|
| PAR                       | .59 (.18)         | .60 (.20)          | .63 (.20)         | .76 (.12)         |
| Pre-failure reflectivity  | 1.16 (.22)        | 1.21 (.92)         | 1.15 (.21)        | 1.34 (.42)        |
| Post-failure reflectivity | 1.60 (.55)        | 1.73 (.80)         | 1.67 (.61)        | 2.14 (.92)        |
Note. PAR = proportion of correct responses.

Twelve year olds scored significantly higher on the FIT task for attentional capacity and showed higher overall control of erroneous performance (were more pre-failure and more post-failure reflective) than children in the 9 to 11 years age range; the 12 year olds also scored higher on PAR and on a variety of intellectual and achievement measures. Within each age group, children with high overall executive control of erroneous performance on the PAR task scored significantly higher on a variety of tasks of intellectual functioning, on teachers' evaluations of intellectual functioning, and on arithmetic drill and practice than children with low overall executive control of erroneous performance; the two groups of mixed conditions scored in the intermediate range. The younger children in the 9 and 10 years old age groups with high overall executive control of erroneous performance scored as high as the 12 year olds on a measure of attentional capacity (a score of 5 on the FIT task); these younger children also scored significantly higher on the various tasks than the children in the 12 years old age group who had low overall executive control of erroneous performance. Table 4 shows a commonality analysis of the variance of the score on PAR as the dependent variable, with age, FIT, IQ, SPM, Math, and pre- and post-failure reflectivities as the independent variables. Each R², the squared zero-order correlation between the dependent variable and each independent variable, was partialled into two components: the unique variance of each independent variable, and the common variance shared between that independent variable and one or more of the other independent variables (Kerlinger & Pedhazur, 1973). The unique contribution of the overall executive control of erroneous performance (pre- and post-failure reflectivities) to the variance of the score on the PAR inference task (about 15%) is higher than the unique contributions of each of the other independent variables: age (2.3%), M-capacity (0.8%), IQ (2.7%), SPM (0.6%), and arithmetic drill and practice (0.0%).
Overall executive control of erroneous performance Low
High
Age group
9
10
11
12
9
10
11
12
Number of subjects
30
39
28
20
31
39
29
21
PAR
.41 (.11)
.42 .46 (.15).21)
.63 (.09)
.72 (.13)
.73 (.12)
73 (.14)
.83 (.08)
Pre-failure reflectivity
1.0 (.07)
1.1 (.09)
1.0 (.10)
1.0 (.11)
1.3 (.19)
1.4 (.37)
1.3 (.17)
1.7 (.48)
Post-failure reflectivity
1.2 (.12)
1.2 (.14)
1.3 (.17)
1.6 (.25)
2.1 (.70)
2.2 (.96)
2.0 (.68)
2.7 (1.10)
IQ
100.4 98.6 98.0 102.6 (10.4) (10.3) (14.6) (8.1)
110.7 108.3 104.3 109.3 (11.9) (11.4) (10.6) (11.6)
SPM
26.4 (8.3)
26.5 (7.8)
28.8 (7.2)
28.4 (5.1)
35.2 (7.4)
34.5 (6.7)
31.3 (8.3)
35.8 (6.4)
FIT
3.8 (1.3)
4.1 (1.3)
4.2 (1.5)
5.1 (1.3)
4.7 (1.3)
5.1 (1.3)
5.1 (1.5)
5.2 (1.6)
Math
-7.3 (7.1)
-10.5 -12.0 -7.4 (7.6) (14.7) (13.6)
-0.9
-1.6
-8.5
-0.2
Teacher
41.6 37.1 32.2 na (25.5) (30.8) (32.0) na
(10.9) (12.4) (11.3) (14.2) 66.8 70.2 40.8 na (23.9) (21.1) (24.4) na
Note. Low = below median for age group on both pre- and post-failure reflectivity; High = above median for age group on both pre- and post-failure reflectivity; PAR = proportion of correct response; IQ = full scale score; SPM = Raven's Standard Progressive Matrices; FIT = Figural Intersection Task; Math = arithmetic drill and practice (months ahead or behind expected grade level); Teacher teacher's evaluation of intellectual functioning (percentile).
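A sketch of the commonality analysis reported below in Table 4 (Kerlinger & Pedhazur, 1973), under the usual regression-based definition: the unique variance of a predictor is the drop in R² when it is removed from the full model, and its common variance is the remainder of its zero-order R². The data layout (numpy arrays) is an assumption.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an OLS regression of y on the columns of X (2-D array)."""
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - resid.var() / y.var()

def commonality(X, y, j):
    """(unique, common) variance components of predictor column j."""
    full = r_squared(X, y)
    without_j = r_squared(np.delete(X, j, axis=1), y)
    zero_order = r_squared(X[:, [j]], y)            # squared zero-order r
    unique = full - without_j
    return unique, zero_order - unique
```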
Table 3
F-ratios for two-way (age X overall executive control of erroneous performance) ANOVAs for the score on PAR and other test measures, for the whole population

| Effect     | PAR      | IQ      | SPM     | FIT    | Math   | Teacher |
|------------|----------|---------|---------|--------|--------|---------|
| Age        | 25.7***  | ns      | ns      | 4.2**  | 2.8*   | 4.9**   |
| Exec       | 107.4*** | 12.6*** | 18.5*** | 7.8*** | 7.6*** | 16.2*** |
| Age X Exec | ns       | ns      | ns      | ns     | ns     | ns      |
| MSe        | 0.018    | 121.8   | 51.5    | 1.9    | 127.5  | 684.9   |
Note. Age = age group; Exec = overall executive control of erroneous performance; MSe = error mean square.

Table 4
Commonality analysis of the variance of the score on PAR as a dependent measure

| Component       | Age     | FIT   | IQ      | SPM   | Math | Pre-failure | Post-failure |
|-----------------|---------|-------|---------|-------|------|-------------|--------------|
| Unique variance | .023*** | .008* | .027*** | .006* | .000 | .013**      | .139***      |
| Common variance | .054    | .107  | .155    | .150  | .094 | .108        | .219         |
| R²              | .077    | .115  | .182    | .156  | .094 | .121        | .358         |
Note. ***p < .0001; **p < .001; *p < .05.

Conclusions

These results may be interpreted in terms of the neoPiagetian constructs of the growth in attentional capacity (Pascual-Leone, 1987), the maturation of executive control structures (Case, 1985), and optimal levels of development and skill acquisition. The results lend support to the two hypotheses. The significant increase in executive control of erroneous performance at age 12 corresponds to an increase of attentional capacity from 4 to 5 units, as predicted by Pascual-Leone (1987), and to the onset of the stage of formal operations (Case, 1985; Piaget, 1950). The newly acquired ability of 12 year old children to operate on operations, and not only on concrete entities, facilitates the development of internal procedures that "debug" and improve currently operating action plans. The longer periods of time that 12 year olds spent both prior to and following the production of an incorrect response, compared to the time they spent prior to and following the production of a correct response, appear to signal the emergence of a new type of executive control structure aimed at optimizing performance. However, we note that the 12 year olds with low overall executive control also scored 5 on M-capacity. This unexpected result may mean that an increase in M-capacity to 5 units at age 12 is only a necessary, not a sufficient, condition for the development of effective executive control. This, as well as other results of this study, cannot easily be interpreted in terms of current neoPiagetian theory. About 1/3 of the younger children in the 9 and 10 years old age groups showed a high level of executive control of erroneous performance and scored 5 on the FIT measure of attentional capacity, a value predicted by neoPiagetian theory for 12 year old children (Case, 1985; Pascual-Leone, 1987). These younger children also scored significantly higher on various intellectual and achievement tasks than the children in the 12 years old age group who showed low overall executive control of erroneous performance. Similar results were reported in a study of precocious cognitive development at the level of formal operations: 'psychometrically bright' 5th graders were "at a more advanced cognitive developmental level" than the 'psychometrically average' 7th graders (Keating, 1975, p. 279). Our results, showing large individual differences within age groups, suggest that a significant number of children within each age group incorporated executive control of erroneous performance in their mental representation of the task. These results suggest that this emergent representational competence is an important measure of intellectual development. However, in its present form this measure lacks the necessary metric for testing the consistency of these findings with the current framework of neoPiagetian theory.

References
Case, R. (1985). Intellectual Development: Birth to Adulthood. New York: Academic Press.
Keating, D. P. (1975). Precocious cognitive development at the level of formal operations. Child Development, 46, 276-280.
Kerlinger, F. N., & Pedhazur, E. J. (1973). Multiple Regression in Behavioral Research. New York: Holt, Rinehart and Winston.
Pascual-Leone, J. (1987). Organismic processes for neoPiagetian theories: A dialectical, causal account of cognitive development. In A. Demetriou (Ed.), The neoPiagetian Theories of Cognitive Development: Toward an Integration (pp. 25-64). Amsterdam: North-Holland.
Pascual-Leone, J., & Ijaz, H. (1989). Mental capacity testing as a form of intellectual developmental assessment. In R. Samuda, S. Kong, J. Cummings, J. Pascual-Leone & J. Lewis (Eds.), Assessment and Placement of Minority Students: A Review for Educators (pp. 141-171). Toronto: C. J. Hogrefe.
Piaget, J. (1950). The Psychology of Intelligence. London: Routledge and Kegan Paul.
Shafrir, U., Ogilvie, M., & Bryson, M. (1990). Attention to errors and learning: A cross-task and across-domain analysis of the post-failure reflectivity measure. Cognitive Development, 5, 405-425.
Shafrir, U., & Pascual-Leone, J. (1990). Postfailure reflectivity/impulsivity and spontaneous attention to errors. Journal of Educational Psychology, 82, 2, 378-387.
Shafrir, U., Siegel, L. S., & Chee, M. (1990). Learning disability, inferential skills and postfailure reflectivity. Journal of Learning Disabilities, 23, 506-517.
The creative thinking testing by using of testing problems based on different logical schemes

A.E. Kiv(a), V.A. Molyako(b), V.L. Maloryan(a), I.A. Polozovskaya(a) and Zelina I. Iskanderova(c)

(a) South-Ukrainian Pedagogical University, 26 Staroportofrankovskaya, 270020 Odessa, Ukraine
(b) Institute of Psychology, 2 Pankovskaya, 252033 Kiev, Ukraine
(c) University of Toronto, 4925 Dufferin Street, Downsview, Ontario, M3H 5T6 Canada

Abstract

New methods of testing creative thinking are considered. They are based on a mathematical model of thinking processes described in [1, 2]. This study provides a more general consideration of the structure of computer testing programs. We show that it is possible to measure creative thinking parameters using computer testing problems based on different logical schemes.

1. CONCEPTION OF CREATIVE THINKING TESTING

It is very difficult, and may be impossible, to determine the general peculiarities of the creative thinking of a person. Complex creative problems are solved using the enormous capabilities of the human brain. We model creative thinking processes taking into account that a significant part of the necessary information is absent. Our basic hypothesis consists of the following items:

1. Creative thinking processes proceed according to certain logical schemes. These schemes correspond to different variants of modern logic, and may not correspond to any logic known today. The intuitive component is no less essential, and in many cases may be responsible for the success of a solution of creative problems.

2. The quality of creative thinking depends on the set of logical schemes that a person uses in order to solve a creative problem. The degree of complexity of the logical schemes used by a person is significant.

We assume that the intuitive component is important in the solution of creative problems in any case, so we need a criterion for the evaluation of the intuitive component. The general procedure for the development of testing programs is as follows. After choosing testing problems corresponding to a certain level of difficulty (a certain logical scheme), a set of adequate differential equations is constructed. The equations are investigated on the basis of the qualitative theory of differential equations.
Software must be developed that can obtain the temporal function of the number of thinking elements, i.e., of the corresponding steps that a person fulfills while moving toward the solution. The testing problems have a computer realization and look like computer games. At the next stage the software must calculate the coefficients of the differential equations that correspond to the best approximation of the experimental testing curves by the theoretical functions (solutions of the differential equations). These coefficients have the sense of the corresponding psychological parameters. In this manner we obtain computer tests that probe different types of creative thinking. The final conclusion about the creative abilities of a person may be drawn using a set of testing problems in each case. Thus we suggest that the model of creative thinking processes may be given by the following scheme.

[Scheme 1. Region I (volume of memory) supplies resources to region II (logical operations: Logic 1, Logic 2, Logic 3, ..., Logic N) and region III (intuitive operations).]
According to this scheme the general level of creative thinking (Z) may be expressed as

Z = Σ k Lk + I + V,    (1)

where Lk is the degree of complexity of a given logic, k is the degree of a person's mastering of that logic, and I and V are the parameters of intuition and memory.

2. COMPUTER SIMULATION OF CREATIVE THINKING PROCESSES

2.1. Description of the model

The model of creative thinking processes that was proposed in [2-4] is not intended as a general description of the natural laws of creative thinking. The model corresponds to concrete processes in a person's mind connected with his aspiration to solve a given logical problem. The modelling of creative thinking has many aspects. However, it has generally been assumed that solving complex intellectual problems entails the initiation of certain model structures in the human mind. An examination of different authors' approaches to the concept of creative thinking modelling suggests that the usages of this notion are numerous and varied, and that the relationships between them are not obvious [5-7]. The advantage of our model lies in its practical direction: it can be applied directly to the development and exploration of computer programs for creative thinking measurements.
The efficiency of the testing methods developed in this manner may testify that the model is correct. In order to explain the mathematical structure of the equations, let us briefly consider the main peculiarities of the model [2]. The existence of a thinking space is postulated that contains discrete elements corresponding to the steps a person takes while moving toward the problem solution. The thinking steps (or thinking elements) arise in local regions of the thinking space and are divided into three groups: effective steps, wrong steps and intermediate steps. Effective steps form the trajectory of a person moving straight towards the solution of the problem. If a person digresses from the direct route, his trajectory includes wrong steps. The intermediate steps produce effective steps if they form complexes. As soon as a certain critical number of effective steps is accumulated, the problem is solved. All steps may arise in regions II and III of Scheme 1. Region I provides the other regions with resources (necessary information); it is in fact a realization of logic with resources [8].

2.2. Construction of equations

Our creative thinking model [1-4] is based on the assumption of an equivalent contribution of irrational-intuitive and formal-logical steps in the process of creative problem solving. According to the model, an example of the differential equations for the creative thinking process may be written in the following form:

dN1/dt = I1 + a1 N1 + b1 N3^k - c1 N2
dN2/dt = I2 + a2 N2 - b2 N3 - c2 N1    (2)
dN3/dt = I3 + a3 N3 + b3 N1 - c3 N2

where N1, N2 and N3 are the numbers of thinking elements of the three types (effective steps, wrong steps and intermediate steps), I1, I2 and I3 are coefficients of the intuitive component, a1, a2 and a3 are coefficients of the mastering of accumulated information, b1, b2, b3 and c1, c2, c3 are coefficients of the mutual influence of the different thinking elements, and k is the order of the "psychological reaction". In every case the structure of these equations corresponds to the particular type of test problems, which have a computer realization. The coefficients in the above equations characterize the creative thinking process. It is notable that we introduce nontraditional characteristics of a person's creative abilities, such as the coefficients of intuition and formal logic, the volume of the thinking space, etc. The quantitative determination of these coefficients by the new computer testing methods provides precise information about the creative thinking structure of a person. Traditional tests give only a general description of a person's abilities, without detailed information regarding the creative thinking structure.
3. GENERALIZATION OF EQUATIONS FOR DIFFERENT LOGICAL SCHEMES

The peculiarities of children's creative thinking were considered in [9]. Each person has concrete abilities to operate with certain logical schemes. Our aim is to construct equations that describe thinking processes according to different logics. Equations (2) reflect the principles of classical logic, whose main principle is the two-valued character of any statement: each step in the thinking process is true or false. The first equation describes the accumulation of correct steps, the second equation is for wrong steps, and the third equation is in fact equivalent to the first, because correct steps are formed by intermediate steps. Different sets of equations must be written for thinking processes based on nonclassical logics. Here we discuss only one example. Let us consider a set of equations that corresponds to a many-valued logic with resources. The typical equation is:

dNi/dt = I(R) + a Ni + a' Nm^k Nn - b Nj Np    (3)

Here Ni is the number of steps that are characterized by a certain value of the probability of reaching the solution. The functions Nm, Nn, Nj and Np have the same sense and are characterized by their own probabilities; the parameter of intuition I(R) depends on the resource R, which may change. The functions Nm and Nn describe intermediate steps that form Ni steps in processes of complex formation. The steps Nj and Np form complexes that prevent the accumulation of Ni steps. The other parameters and coefficients are the same as in equations (2). Thus the main differences of equation (3) are the following:

1. There are no wrong steps.
2. During the accumulation of thinking steps, and the corresponding movement of a person toward a solution, the resources change.
3. The number of equations depends in each case on the creative thinking processes of the person.

It is very important to choose testing problems whose logical structure is in accordance with the structure of the equations.

4. CONCLUSION

It has been shown that there are new ways of testing creative thinking by mathematically modelling the different logical schemes that are realized in a person's mind. Differential equations must describe these logical schemes. The next task is to choose testing problems that embody these logical schemes and to develop corresponding computer testing programs for different levels of creative thinking. We have developed such programs for children of 7-14 years [1, 9, 10].

REFERENCES

1. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, Computer Modelling of the Learning Organization. In: P.T. Kidd and W. Karwowski (Eds.), Advances in Agile Manufacturing. Amsterdam: IOS Press, 1994, 553-556.
2. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, V.V. Chislov, V.L. Maloryan, Creative Thinking Process Simulation and Computer Testing. Proceedings of the Symposium on Human Interaction with Complex Systems, Greensboro, North Carolina A&T University, 1994.
3. A.E. Kiv, V.A. Molyako, V.G. Orishchenko, I.G. Zakharchenko, I.A. Polozovskaya, A.M. Solodovnikov, The Computer Testing System of a Group and Psychological Correction of Teaching Methods. Proceedings of the 5th International Conference on Human-Computer Interaction (HCI '93), Orlando, Florida, USA, August 8-13, 1993, 153.
4. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, Criterions of Intellectual Level. Proceedings of the International Conference "Gifted Children: Family, School, State", Kiev, 1994.
5. Mind Design. J. Haugeland (Ed.). Cambridge, Mass.: MIT Press, Bradford Books, 1981.
6. R.N. Shepard, L.A. Cooper, Mental Images and Their Transformations. Cambridge, Mass.: MIT Press, Bradford Books, 1982.
7. J.A. Fodor, Modularity of Mind. Cambridge, Mass.: MIT Press, Bradford Books, 1983.
8. N.J. Nilsson, Probabilistic logic. Artificial Intelligence, 28 (1986) 71-87.
9. V.V. Chislov, V.L. Maloryan, I.A. Polozovskaya, G.V. Shtakser, A.I. Uyemov, I.G. Zakharchenko, M. Athoussaki, The Interface Improvement for the Creative Thinking Computer Testing. In: Y. Anzai and K. Ogawa (Eds.), Proceedings of the 6th International Conference on Human-Computer Interaction (HCI International '95). Amsterdam: Elsevier Science Publishers.
10. A.E. Kiv, V.A. Molyako, Stephen T. McHale, V.G. Orishchenko, I.A. Polozovskaya, Computer Analysis of Characteristics of Creative Thinking and Self-Esteem Level. In: Y. Anzai and K. Ogawa (Eds.), Proceedings of the 6th International Conference on Human-Computer Interaction (HCI International '95). Amsterdam: Elsevier Science Publishers.
From Novice to Expert Decision Behaviour: a Qualitative Modelling Approach with Petri Nets

Matthias Rauterberg

Work and Organizational Psychology Unit, Swiss Federal Institute of Technology (ETH), Nelkenstrasse 11, CH-8092 Zurich, Switzerland

Abstract

To support the human factors engineer in designing a good interactive system, a method has been developed to analyze the empirical data of interactive decision behaviour described in a finite discrete state space. The sequences of decisions and actions produced by users contain much information about the user's mental model, the individual problem solution strategies for a given task, and the underlying decision structure. We distinguish between (1) the logical structure, (2) the sequential goal structure, and (3) the temporal structure. The analysing tool AMME can handle the recorded decision and action sequences and automatically come up with an extracted net description of the task-dependent decision model (the logical structure). This basic model was filled up with additional elements to reconstruct one empirical action sequence of an expert user. Four different models are presented and their predictive power discussed.

1. INTRODUCTION

Learning is a permanent process that changes our long-term knowledge base in an irreversible way. The structure of our long-term memory changes to more complexity and higher abstraction; learning constantly increases the complexity of the mental model. What mental models are and how they work is quite unclear. Carroll and Reitman-Olson [2] summarise their research recommendations as follows: "(1.) Detail what a mental model would consist of and how a person would use it to predict a system's behaviour ... (2.) Investigate whether people have and use mental models of various kinds ... (3.) Determine the behaviours that would demonstrate the model's form and the operations used on it ... (4.) Explore alternative views of Sequence/Method representations and the behaviour predicted from them ... (5.) Explore the types of mental representations that may exist that are not mechanistic ... (6.) Determine how people intermix different representations in producing behaviour ... (7.) Explore how knowledge about systems is acquired ... (8.) Determine how individual differences have an impact on learning of and performance on systems ... (9.) Explore the design of training sequences for systems ... (10.) Provide system designers with tools to help them develop interfaces that invoke good representations in users ... (11.) Expand the task domain to more complex software" ([2] pp. 59-61). In this paper we present a modelling approach that contributes to points (1), (3), (4), (5), (7), (10) and (11). We are primarily interested in a bottom-up, behaviour-driven approach, not in a top-down, theory-driven one. One of the most elaborated modelling approaches is SOAR [5]. Newell [7] describes SOAR as follows: "Soar is ... a symbolic computational system .... Soar is organised around problem spaces, that is, tasks are formulated as search in a space of states by means of operators that produce new states, where operators may be applied repeatedly, to find a desired state that signifies the accomplishment of the task .... Soar is organised entirely as a production system, that is, its long-term memory for both program and data consists of parallel-acting condition-action rules ... Soar incorporates a goal hierarchy .... Soar learns continuously from its experience by chunking, which constructs new productions (chunks) to capture the new knowledge that
Soar developed (in working memory) to resolve its difficulties" ([7] pp. 30-32). Soar is based on impasse-driven learning. "While Soar is performing a task by using the behaviour model in working memory, it is also learning. It is building chunks every time it impasses from one problem space to another ... These chunks constitute the acquisition of knowledge for doing the task" ([7] p. 62). The knowledge generated by chunking and stored in long-term memory represents only successful trials; knowledge of unsuccessful attempts is not in memory. Learning in Soar means that long-term memory contains evidence only of the sequence of effective actions. But what would it mean if the majority of our long-term memory consisted only of unsuccessful trials? Soar seems to be a typical representative of a top-down, theory-driven approach for error-free skilled behaviour. Why do we believe that a bottom-up approach is better than a top-down approach? The answer rests on the following assumption. Most of the known modelling approaches are based on the assumption that the "mental model maps completely to the relevant part of the conceptual model, e.g. the user virtual machine. Unexpected effects and errors point to inconsistency between the mental model and the conceptual model" ([12] p. 258). This one-to-one mapping between the mental model and the conceptual model of the interactive system implies a positive correlation between the complexity of the observable behaviour and the complexity of the assumed mental model. But this assumption seems to be wrong. Based on the empirical result in [11] that the complexity of the observable behaviour of novices is larger than that of experts, we must conclude that behavioural complexity is negatively correlated with the complexity of the mental model. If the cognitive structure is too simple, then the concrete task-solving process must be filled up with a lot of heuristics or trial-and-error behaviour. Learning how to solve a specific task with a given system means that the behavioural complexity decreases and the cognitive complexity increases. Now one of the central questions is: What kind of knowledge is stored in the cognitive structure? Before we can give a preliminary answer to this question, we have to introduce our complexity measure.
2. THE MEASUREMENT OF COMPLEXITY

The symbolic representation of the machine system consists of the following elements: 1. objects (things to operate on), 2. operations (symbols and their syntax), and 3. states (the 'system states'). The mental model of the user can be structured as representing: objects, operations, states, system structure, decision and task structure. A net can be described as a mathematical structure consisting of two non-empty disjoint sets of nodes (S-elements and T-elements) and a binary flow relation (F). The flow relation links only different node types and leaves no node isolated [8]. Petri nets can be interpreted in our context by using a suitable pair of concepts for the sets S (signified by a circle '( )') and T (signified by a square '[ ]') and a suitable interpretation for the flow relation F (signified by an arrow '->'). Bauman and Turano [1] showed that Petri nets are equivalent to formalisms based on production rules (like the CCT of Kieras and Polson [4]); in this sense, our approach can be subsumed under 'logic modelling', too. The main operations (relations) between two Petri nets are abstraction, embedding and folding [3]. The folding operation of Petri net theory is the basic idea of the approach presented in this paper. Folding a process means mapping S-elements onto S-elements and T-elements onto T-elements while keeping the F-structure. The result is the structure of the performance net. Each state corresponds to a system context, and each transition corresponds to a system operation. This sequence is called a 'process' (see Figure 1). An elementary process is the shortest meaningful part of a sequence: (s') -> [t'] -> (s''). If the observable behaviour can be recorded in a complete ...-> (state) -> [transition] -> (state) ->... process description (see Figure 1), then the analysis and construction of the net structure of this process are simple: one has only to count the number of all different states and transitions used, or to mark on a list the frequencies of each state and transition used in the process. But if the observable behaviour can only be recorded in an incomplete (e.g., ...->
...-> (state) -> [transition] -> [transition] -> ... or ...-> (state) -> (state) -> [transition] -> ...), then the analysis and construction of the net structure of this process are difficult: one has to find the correct state (or transition, respectively) between the two transitions (or states, respectively). Unfortunately, this is the most frequent case in practice, and for these cases we need automatic tool support. In recent years we developed a tool that makes it possible to analyze processes with an incomplete process description, provided they are generated by finite state transition nets (cf. [11]). The aim of the 'folding' operation is to reduce the elements of an observed empirical decision process to the minimum number of states and transitions, the reduced set of elements being the 'logical decision structure'. Folding a decision process extracts the embedded net structure and neglects the information about the number of repetitions, the sequential order, and the temporal structure. A simple pattern matching algorithm looks for all 'elementary processes' in the sequence. A composition algorithm (the folding operation) is then able to build up the Petri net by combining all elementary processes. The result of the folding operation applied to our example sequence (Figure 1) is the Petri net given in Figure 2. Measurable features of the behavioural process are: the total number of states and transitions used, the number of different states and different transitions used, the dwell time per state and transition, etc. These measurements can easily be made from a protocol of the user's behaviour automatically recorded by an interactive software program (the dialog system) in a 'log file'. To measure complexity we use the Ccycle metric of McCabe [6]. With Ccycle we have a useful quantitative metric for behavioural complexity; we discuss the advantages and disadvantages of four different quantitative metrics in the context of an empirical investigation elsewhere (see [10]). The complexity measured with Ccycle is defined by the difference between the total number of connections (F: arrows) and the total number of net elements (T-transitions plus S-states). The parameter P is a constant to correct the result of formula (1) in the case of a sequence (F - (T + S) = -1); the value of P in our context is 1.

    Ccycle = F - (T + S) + P    (1)
The measure Ccycle of model-1 in Figure 2 is 18 - 13 + 1 = 6; the complexity of the net shown in Figure 2 is six. But what could this number mean? McCabe [6] interprets Ccycle as the number of linearly independent paths through the net. Other interpretations of Ccycle are the number of holes in a net or the number of alternative decisions carried out by the users. Observing the behaviour of people solving a specific problem or task is our basis for estimating 'model complexity' (MC). The cognitive structures of users are not directly observable, so we need a method and a theory to estimate MC from the observable behaviour. We call the complexity of the observable behaviour the 'behavioural complexity'. This behavioural complexity can be estimated by analysing the recorded concrete task solving process. The necessary task solving knowledge for a given task is constant. This knowledge, embedded in the cognitive structure of the mental model, can be reconstructed.
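To make the computation concrete, the following small Python sketch (illustrative only, not the authors' AMME tool; the function and variable names are ours) folds a completely recorded ...-> (state) -> [transition] -> (state) ->... log into a net and evaluates formula (1):

    def ccycle(log, p=1):
        # log: alternating list [state, transition, state, ..., state]
        states = set(log[0::2])    # distinct S-elements
        transitions = set()        # distinct T-elements (a key in its state context)
        arcs = set()               # flow relation F
        for i in range(1, len(log), 2):
            s_in, t, s_out = log[i - 1], log[i], log[i + 1]
            t_node = (s_in, t)     # fold repeated elementary processes onto one T-element
            transitions.add(t_node)
            arcs.add((s_in, t_node))
            arcs.add((t_node, s_out))
        return len(arcs) - (len(transitions) + len(states)) + p

Applied to the sequence of Figure 1, the folded net has F = 18 arcs and T + S = 9 + 4 = 13 elements, so the function returns 18 - 13 + 1 = 6, in agreement with the value reported for model-1.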
3. RECONSTRUCTION OF THE MENTAL MODEL
We carried out an empirical investigation to compare different types of interfaces (see [9]). For the reconstruction we chose one part of a log file of an expert user (see Figure 1). The whole process of the example shown is based on 12 transitions and 12+1 = 13 dialog states. The expert user started from the main menu, activated the module 'data' with the ASCII key 'd' and the routine 'browse' with the ASCII key 'a'. By pressing the function key 'F3' he tried to reach a dialog state where he could change the actual data set, but this operation was only possible in the main menu. The system changed to the dialog state 's3' (the 'wrong input' state) and responded with the output message "Press space to continue." This message was incorrectly implemented, so the user tried unsuccessfully to leave the 'wrong input' state. Only when he 'found' the function key 'F9' could he escape from 's3'. The wrong output message was the reason he pressed the space key '_' three times.
Figure 1. The original behavioural sequence of an expert with a relational database system (cf. [9]):

8.1 s: main menu, ascii key 'd'
8.2 s: module 'data', ascii key 'a'
8.3 s: routine 'browse', function key 'F3'
1.5 s: 'wrong input' state, ascii key 'SPACE'
8.4 s: 'wrong input' state, ascii key 'SPACE'
3.3 s: 'wrong input' state, tabulator key
3.8 s: 'wrong input' state, function key 'F2'
3.4 s: 'wrong input' state, tabulator key
8.7 s: 'wrong input' state, carriage return
3.6 s: 'wrong input' state, ascii key 'SPACE'
4.8 s: 'wrong input' state, function key 'F9'
8.4 s: module 'data', ascii key 'h'
8.1 s: main menu

Figure 2. Model-1: the pure 'logical structure' of our example sequence in Figure 1.

Figure 3. Model-2: model-1 supplemented with S-elements as goals (g1), ..., (g3), as a memory place (m1), and as extinction places (e1), (e2), ..., (e4).
Mental models consist of three different types of knowledge: (1) the 'pure' logical structure of the task, (2) the sequential structure of all goals, and (3) the temporal structure of all operations. The pure logical structure is automatically extracted with our tool AMME (cf. [11]); this net is called model-1. Model-1 does not contain any knowledge about goals and time. For a first attempt to simulate the sequence in Figure 1 we built model-2 with goals and memory elements. In model-2 (see Figure 3) the S-element (m1) is free for 'recall' of the solution. The three S-elements (g1), ..., (g3) are included in model-2 to simulate the sequential goal structure of the valid transitions outside state 's3'. The four marked S-elements (e1), ..., (e4) are included in the model to simulate the 'extinction' rate as one aspect of memory. The number of marks is positively correlated with the extinction rate of the corresponding transition: once a transition has been activated as often as there are marks on its S-element, it cannot be fired anymore. This is a model of the learning caused by unsuccessful trials. Model-3 has a complete goal structure for all transitions (see Figure 4). We combined the
goal element of the transition '_' with the S-element for the extinction rate. The activation of transition 'F3' fills this S-element with an additional mark, so that we then have the same extinction rate as in model-2. Model-4 has exactly the same structure as model-3 (cf. Figure 5). The only difference between model-3 and model-4 is the time-delayed transition 'F2'. This delay increases the probability that the transitions '_' and 'TAB' completely exceed their extinction rates before transition 'F2' is fired.
Figure 4. Model-3: this Petri net is equivalent to model-2 with four additional goals to simulate the sequential structure.
Figure 5. Model-4: this Petri net is equivalent to model-3 with an additional time delay on transition 'F2'.

4. VALIDATION OF THE FOUR MENTAL MODELS
To validate the four different mental models, a simulation study was carried out. Each of our four models was implemented, marked and executed with a Petri-net simulator. We generated different task solving sequences with each model. To estimate the difference between the original sequence (cf. Figure 1) and each simulated sequence, we used the following procedure:
1. We numbered consecutively all operations ('transitions') in Figure 1 ['d' = '1', 'a' = '2', 'F3' = '3', ..., 'h' = '12']. The number R is the rank position of each transition t in the original sequence.
2. We attached these numbers to all generated transitions (t) of each simulated sequence. For example, the shortest sequence we found was generated with model-1: ['d', 'h']. The rank positions R of these two transitions are ['1', '12'] (compared with the original sequence).
3. We calculated a 'similarity ratio' (SR) as follows:
    SR = [1 - (Σt |Rorg,t - Rsim,t|) / (Nsim × max Rorg)] × 100%

SR is a sufficient measure of the difference between the simulated sequence and the original sequence. N is the number of all transitions in a sequence, and the maximum of Rorg is equal to Norg (Norg in Figure 1 is 12). SR is only valid for simulated sequences that fulfil the condition Nsim < Norg. For example, the SR of the shortest sequence ['1', '12'] is 10%.
4. We averaged the similarity ratios of all simulated sequences per model (see Table 1).
The results in Table 1 show that with increasing complexity of the mental model (MC), the similarity ratio (SR) tends to 100%. It is interesting to note that the structure of model-3 does not differ from the structure of model-4; only the delayed transition 'F2' increases SR by 9%. In addition to the positive correlation between MC and SR, we can see that the variance (measured by the standard deviation) decreases continually. This result indicates that the predictive power increases from model-1 to model-4.
"
Table 1. The model complexity (MC) and similarity ratios (SR) of model-1, -2, -3, and -4.

                               Model-1      Model-2      Model-3      Model-4
MC: absolute value             6            8            13           13
SR: mean                       43%          57%          86%          95%
SR: standard deviation         +/- 33%      +/- 23%      +/- 12%      +/- 1%
SR: minimum ... maximum        10% ... 83%  32% ... 93%  68% ... 94%  94% ... 96%
Number of simulated sequences  11           12           8            8

5. DISCUSSION AND CONCLUSION
Three different results are important. (1) Our assumption that 'learning how to solve a specific task with a given system means that behavioural complexity decreases and cognitive complexity increases' seems to be correct. (2) We can conclude from the validation results that we must discriminate between the logical decision structure of a task, the sequential goal structure, and the temporal structure. The logical structure of a task can be extracted automatically with our analysing tool AMME, and the complexity of this logical structure can be measured with the McCabe measure. The temporal structure can be measured with the planning time per operation and transition, respectively; learning the temporal structure means accelerating the task solving process. (3) The results of the goal structures of model-3 and model-4 show that we must take the knowledge of unsuccessful attempts into consideration. Our hypothesis is that the majority of our long-term knowledge consists of inhibitions. To model this aspect means changing the type of the arc (e1) -> [_] in model-4 from an activator to an inhibitor. In our modelling approach we cannot neglect knowledge of unsuccessful trials. One psychological dimension of the goal and time structure seems to be self-confidence in doing the right things at the right time, avoiding all unsuccessful ways tried before.

REFERENCES

[1] R. Bauman and T.A. Turano, Production based language simulation of Petri nets. Simulation 47 (1986) 191-198.
[2] J. Carroll and J. Reitman-Olson, Mental models in human-computer interaction. In: M. Helander, Ed., Handbook of Human-Computer Interaction (North-Holland, 1991) pp. 45-65.
[3] H.J. Genrich, K. Lautenbach and P.S. Thiagarajan, Elements of general net theory. In: W. Brauer, Ed., Lecture Notes in Computer Science 84, 'Net Theory and Applications' (Springer, 1980) pp. 21-163.
[4] D.E. Kieras and P.G. Polson, An approach to the formal analysis of user complexity. International Journal of Man-Machine Studies 22 (1985) 365-394.
[5] J. Laird, A. Newell and P. Rosenbloom, SOAR: an architecture for general intelligence. Artificial Intelligence 33 (1987) 1-64.
[6] T. McCabe, A complexity measure. IEEE Transactions on Software Engineering SE-2 (1976) 308-320.
[7] A. Newell, Unified theories of cognition and the role of SOAR. In: J. Michon and A. Akyürek, Eds., SOAR: A Cognitive Architecture in Perspective (Kluwer, 1992) pp. 25-79.
[8] C.A. Petri, Introduction to general net theory. In: W. Brauer, Ed., Lecture Notes in Computer Science 84, 'Net Theory and Applications' (Springer, 1980) pp. 1-19.
[9] M. Rauterberg, An empirical comparison of menu-selection (CUI) and desktop (GUI) computer programs carried out by beginners and experts. Behaviour and Information Technology 11 (1992) 227-236.
[10] M. Rauterberg, A method of a quantitative measurement of cognitive complexity. In: G.C. van der Veer, M.J. Tauber, S. Bagnara and A. Antalovits, Eds., Human-Computer Interaction: Tasks and Organisation (CUD, Roma, 1992) pp. 295-307.
[11] M. Rauterberg, AMME: an automatic mental model evaluation to analyze user behaviour traced in a finite, discrete state space. Ergonomics 36 (1993) 1369-1380.
[12] G. van der Veer, S. Guest, P. Haselager, P. Innocent, E. McDaid, L. Oesterreicher, M. Tauber, U. Vos and Y. Waern, Designing for the mental model: an interdisciplinary approach to the definition of a user interface for electronic mail systems. In: D. Ackermann and M. Tauber, Eds., Mental Models and Human-Computer Interaction 1 (North-Holland, 1990) pp. 253-288.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Modeling and Simulation of Human Operator in Mental Task Handling Qualities

Celestine A. Ntuen, Department of Industrial Engineering, North Carolina A&T State University, Greensboro, NC 27411
Effective human control of a complex system depends in part on the design of decision aids that have (high-level) models of human mental processes. Since the human operator utilizes the computer as a medium of interaction with the task environment, it is important that tightly coupled and interoperable human mental models be incorporated into HCI. This paper reports on a pilot study of simulating mental task performance on a desktop HCI platform.
1. INTRODUCTION

Researchers who advocate the use of mental models in HCI design are often trapped in two epistemological myths. First, the notion of representing tasks as concepts defies the physical laws by which systems operate. Second, the interchangeable use of 'mental models' and 'mental tasks' reflects a lack of understanding of the difference between derived representations of knowledge about how humans perceive the objective world and the actions used to control tasks in the real world. Unequivocally, Gentner and Stevens (1983) have noted that "mental models research is fundamentally human knowledge about the world". Mental tasks, on the other hand, are more akin to memory events such as perception and cognition. Humans utilize memory events when executing mental tasks. The memory structure has been shown by psychologists and cognitive scientists (Anderson, 1974, 1982) to contain information symbologies used to activate various forms of stimuli generated by the sensory organs: auditory, tactile, visual, etc. This clearly demands that HCI should be designed to complement such dependency.
2. MENTAL TASK HANDLING QUALITIES (MTHQ)

A mental task handling quality is the study of perception and control of memory-based tasks. These tasks may include event prediction, pattern analysis (recognition, association, discrimination) and information processing (data fusion, data estimation, etc.). The mental tasks of the human operator can be conceptualized at four levels of memory events. These are:
1. Visual memory. This consists of chunks of tasks for pattern recognition, pattern matching, pattern discrimination and pattern association.
2. Spatial memory. This consists of spatio-temporal tasks such as understanding geometric position and orientation and object dimensionality (size, shape, etc.).
3. Abstract memory. This consists of abstract tasks which hold facets of information in various forms such as symbologies, graphics and texts.
4. Decision memory. This consists of tasks for decision, judgment, and inference.
3. MODELING AND SIMULATION OF MTHQ

Mental handling quality models can be coupled with the human operator model through an optimal control model, as shown in Figure 1. Ntuen and Fang (1994) have shown that Fitts' law of stimulus-response compatibility can be used to obtain quasi-linear models of the human operator (Sutton, 1990). Some of the human describing functions are available.
Figure 1: A representative system for application of the optimal control model for mental task process simulation. (The original diagram links blocks for {COGNITION} and {ACTION}.)

In HCI, we need to conduct experiments to derive task-describing functions. Thus, we can consider the computer as the plant to be controlled.
4. EXPERIMENT WITH MTHQ USING HCI

In our pilot experiment, we study MTHQ through perceptual/cognitive control of actions. The perceptual tasks consist of the use of graphical symbologies and a TIM (text-in-menu) box; the control tasks require the use of a Microsoft mouse system and keyboards (function keys); the actions to be performed were cognate tasks. Seven graduate students (considered "experts") and five undergraduate freshmen ("novices"), with ages ranging from 17.5 to 26 years, took part in the study; participation was voluntary. In order to induce mental tasks, visual stimuli were associated with the actions as follows:

{Blue} = {Save file & continue}
{Blue, Yellow} = {Save file, exit to DOS}
{Yellow, Red} = {Cut and paste document}
{Red} = {Cut document}
{Green} = {Print current document}
{Yellow, Green} = {Print review document}

Document typing tasks were assigned to the students in WordPerfect (WP6.0), Windows version. Each subject took part in a two-hour review of the visual stimuli. Subjects
performed the desired tasks upon stimulus emittance. Each stimulus was randomly generated from an exponential distribution function with a mean emittance rate (MER) of 40 seconds. The subjects' reaction to the light stimuli closely matches Fitts' law of S-R compatibility and allows for optimal control simulation of the human operator in MTHQ.
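For illustration, a stimulus schedule for such a session could be generated as in the following Python sketch (our own illustrative code, not the software used in the experiment; names and the two-hour session length are assumptions):

    import random

    STIMULUS_ACTIONS = {
        ("Blue",): "Save file & continue",
        ("Blue", "Yellow"): "Save file, exit to DOS",
        ("Yellow", "Red"): "Cut and paste document",
        ("Red",): "Cut document",
        ("Green",): "Print current document",
        ("Yellow", "Green"): "Print review document",
    }

    def stimulus_schedule(session_seconds=7200, mean_interval=40.0, seed=1):
        # exponentially distributed inter-stimulus times, mean 40 s (the MER)
        rng = random.Random(seed)
        t, schedule = 0.0, []
        while True:
            t += rng.expovariate(1.0 / mean_interval)
            if t > session_seconds:
                return schedule
            stimulus = rng.choice(list(STIMULUS_ACTIONS))
            schedule.append((round(t, 1), stimulus, STIMULUS_ACTIONS[stimulus]))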
5. SAMPLE EXPERIMENTAL RESULTS AND CONCLUSION

The preliminary experiment was evaluated in terms of human errors (Booth, 1990). The aggregate results are shown in Table 1.
Table 1" Human Errors in Perceptual/Cognitive Control of Actions in HCI Mental Handling GM Selected Error Models Automacity Familiar ShortCut Signal Sustitution Task-Similarity Signal Interference
TIMS - M
GK
TIMS - K
]
Expert
Novice
Expert
Novice
Expert
Novice
Expert
Novice
0.045 0.003
0.090 0.106
0.008 0.006
0.114 0.121
0.08 0.100
0.098 0.137
0.052 0.0910.
0.115 0.106
0.028
0.088
0.I01
0.096
0.052
0.116
0.058
0.008
0.092 0.03
0.115 0.0431
0.0429
0.078 0.112
0.008 0.041
0.082 0.111
0.07 0.09
0.045 0.011
0.011
GM = Graphic - Mouse Control. GK = Graphics-- Keyboard Control, TIMS - M = Text-in-menu and Mouse Control, TIMS - K = Text-in-menu and keyboard control. The above results show human errors associated with mental tasks. A separate paper that reports on the response times and the use of optimal control models to predict human performance contains all the details of the experiment. The preliminary results show that as advanced HCIs are being developed, the human mental tasks should be studied and validated before developing related mental models.
REFERENCES
1. Gentner, D. & Stevens, A. L. (1983). Mental Models. Hillsdale, NJ: Lawrence Erlbaum.
2. Anderson, J. R. (1974). Retrieval of propositional information from long-term memory. Cognitive Psychology, 6, 471-474.
3. Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.
4. Ntuen, C. A. & Fang, J. (1994). A simulation model of the adaptive human operator. In: First Industry/Academy Symposium on Research for Future Supersonic & Hypersonic Vehicles, Vol. 1 (Homaifar, A. & Kelly, J. C., Eds.). Albuquerque, New Mexico: TSI Press, 466-471.
5. Sutton, R. (1990). Modeling Human Operators in Control System Design. New York: John Wiley.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
The interface improvement for the creative thinking computer testing
Chislov a, V.L. Maloryan a, I.A. Polozovskaya a, G.V. Shtakser a, A.I. Uyemov a, I.G. Zakharchenko a and M. Athoussaki b

aSouth-Ukrainian Pedagogical University, 26 Staroportofrankovskaya, 270020 Odessa, Ukraine
bSIEM Ltd., Markou Mousouri 25 - Mets, Athens, 11636, Greece

Abstract
Interface design is one of the significant components on which the successful creation of computer testing programs depends. This is especially important for testing children's creative thinking. A new approach to this problem is proposed that takes into account modern tendencies in interface design and the Language of Ternary Description [1].

1. COMPUTER TESTING OF CREATIVE THINKING

Modern tendencies in human-computer interface design are connected with a maximal simplification of the human operator's problems in the HCI process. These tendencies are realized by a gradual transition from a conversational style of interaction to immediate operator activity in a virtual world, and by wide use of the principle of direct manipulation of concrete objects. This is especially important for creative thinking computer testing, and particularly for testing children. A researcher must stimulate a strong interest in the testing problems or puzzles among children; in other words, he must create a good motivation for the child's thinking. This is easier if a test puzzle is formulated in game form and if a child can directly manipulate objects of his or her interest. This situation will be realized if children perceive the testing programs as usual computer games. It is also necessary in order to have a pure psychological experiment.

1.1. Some features of a child's logic

The problem of a child's logic is connected with a paradox. On the one hand, the formal-logical thinking of a child is formed after 11-12 years. On the other hand, human creative abilities reach their maximum just before 11 years. From our point of view, a relevant solution of this paradox is based on the construction of various logical systems. Ideally, it is necessary to teach children to create a freely reconstructible logical frame which can be transformed into formal systems with various logic schemes corresponding to the children's own logic. We suppose that a child thinks logically, but the logical relations he or she uses are weak. Therefore usual psychological tests do not reveal the real creative abilities of a child if only the logical components of the child's thinking are considered.
Some psychologists, for example J. Piaget [2], believed that children below 11 years cannot use or understand, for instance, the concept of a set. But Piaget did not know a relevant logic connected with a child's thinking, which appeared later. It is a logic theory called the Language of the Ternary Description (LTD) [1,3]. The traditional logic of predicates is based on mathematical deduction, but children think within the frame of weaker connections between things, closer to the grammar rules of natural language. It is known that a child not only easily learns different languages but also creates his or her own language. K. Chukovskij collected many examples of the creative activities of children aged from 2 to 5 in [4]. One of the most interesting results is the following: many children think up "new" words that were actually used in Russia in ancient times or correspond to some dialects today. So children reveal their abilities by using a logic which is close to natural language rules.

1.2. The Language of Ternary Description (LTD)

We shall show that a nontraditional logic system such as LTD [1,3] can be more relevant for our aim. The structures of this logic approximate the structures of natural language, and this logic can describe children's thinking. The basis of LTD consists of two groups of categories (three in each): "things, properties, relations" and "definite, indefinite, arbitrary". With the help of these concepts, LTD's structures allow one to emphasize the most essential objects or their elements and, vice versa, to hide inessential or dangerous ones. Moreover, objects can be connected in different hierarchical chains by means of properties and relations. LTD is an alternative to the logic of predicates, which is oriented towards mathematical relations, because it permits the description of relations that can often be found in everyday situations. This logic system has its own axioms, rules and theorems. With the help of the categories "things, properties, relations" we construct well-formed formulae (WFF):
1) (X)Y means that a thing X has a property Y;
2) Y(X) means that a thing X has a relation Y;
3) [(X)Y] means a thing X with a property Y (contrary to WFF (1), which describes propositions (relations), this describes a thing);
4) [Y(X)] means a thing X with a relation Y.
The opposite direction of the predication relation is expressed by the next four WFF:
5) (X*)Y means that a property Y is attributed to a thing X;
6) Y(*X) means that a relation Y belongs to a thing X;
7) [(X*)Y] means a property Y attributed to a thing X;
8) [Y(*X)] means a relation Y belonging to a thing X.
The last WFF is a free list:
9) X,Y.
The second group of categories is used to introduce the elementary objects substituted into a WFF in place of X and Y. Objects are denoted as follows: "A" is an arbitrary object, "a" is an indefinite object, "t" is a definite object. These elementary objects can be combined in various manners. As a result we obtain different formal objects: identities, implications, etc. The combined properties of these objects permit children to create their own new objects, as they do within natural language with the "definite", "indefinite" and "arbitrary" concepts.
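One possible machine representation of these WFF, useful when implementing LTD-based testing games, is sketched below in Python (this encoding is our illustration, not part of the original LTD formalism; all names are ours):

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class Obj:
        kind: str   # 'definite' (t), 'indefinite' (a), or 'arbitrary' (A)
        name: str = ""

    @dataclass(frozen=True)
    class HasProperty:      # (X)Y : thing X has property Y
        thing: "Term"
        prop: "Term"

    @dataclass(frozen=True)
    class HasRelation:      # Y(X) : thing X has relation Y
        thing: "Term"
        rel: "Term"

    @dataclass(frozen=True)
    class Reified:          # [(X)Y] or [Y(X)] : the proposition taken as a thing
        inner: Union[HasProperty, HasRelation]

    Term = Union[Obj, Reified]

    # Example: a definite thing t with an arbitrary property A, i.e. [(t)A]
    example = Reified(HasProperty(Obj("definite", "t"), Obj("arbitrary", "A")))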
Besides the properties mentioned, LTD permits solving problems within the HCI frame. Modern tests check only how far a child has mastered adult logic.

2. NEW TESTING METHODS

2.1. General description

From the LTD point of view we can propose a new version of the creative abilities testing system. First, it is necessary to offer a set of testing games, each represented by a metaphoric pictogram, for the child to choose from freely. The menu must give children a wide choice of computer games. Moreover, the ideal variant is to design virtual worlds which contain orientation items for children; this will give additional information about the child's abilities and interests. Second, we would like to open to children the possibility of independently investigating the game rules and their peculiarities. A part of the possible operations and/or game rules can be unknown to the child, and another part can be arbitrary, in accordance with the child's wishes. Third, the test game can have a solution based upon the LTD logic or another logic similar to the "child's" logic. Undoubtedly, the new system preserves all the main properties of the previous system [5] and allows the creative abilities of the tested child to be demonstrated more completely.

2.2. An example of a creative thinking test

As an example, consider one of the creative thinking tests from [5]. This is one of the River Crossing tests. The game rules are very simple. Three knights with servants have to cross a river in a boat that can hold no more than two people. The boat cannot float by itself. The aim of the game is to transport all the characters to the other bank. The main rule says: "No servant has the right to stay in the presence of a strange knight without his master." This puzzle has a natural computer realization in accordance with the above-mentioned principles of interface design. The motion of the graphical images is controlled by the mouse. The researcher introduces the child to all the game rules before the test starts. Then the child plays; his or her decisions are stored on disk, and thinking parameters are calculated by a special program. But the thinking model of [5] uses a traditional, "adult", logic.

2.3. More general examples

A human operator influences computer processes, but he or she also feels a strong influence from the computer. The logical component of the operator's thinking comes under especially strong stress, and the operator's thinking style takes on new features. Therefore, in designing HCI, we need a logic of the human operating in visual and natural situations. The traditional formal logic systems are also unsuitable for this aim; a nontraditional logic system such as LTD can be more relevant to it.
For a more general example, we consider a typical file system such as MS DOS. To a first approximation, one can pick out the main categories in the following manner: "things" are files, "properties" are the file descriptors, and "relations" are defined by directories. From this point of view, the typical file system has the following drawbacks:
- it is impossible to include a file in two different independent relations without copying it;
- the categories "indefinite" and "arbitrary" are inexpressible;
- only properties from a fixed set, plus one unfixed property, are effectively expressible.
Our analysis exposes the directions of file system improvement. In particular, we would like to be able to express any properties of and relations between files. Also, if we introduce the concepts of an "indefinite file" and an "arbitrary file", then the number of operator decisions during a session decreases. We now propose an idea for a file system without the above-mentioned defects. Every file must have two lists, for its properties and its relations. We allow cross references in these lists; therefore an operator can express properties of relations, relations between properties, and any properties and relations of files. Moreover, we allow "indefinite" and "arbitrary" objects. So the LTD approach can give new ideas in the file system design area too.
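A minimal sketch of this idea in Python follows (illustrative only; the field and relation names are our assumptions, not a specification):

    from dataclasses import dataclass, field

    @dataclass
    class FSObject:
        # every file carries open-ended property and relation lists,
        # and may be 'definite', 'indefinite' or 'arbitrary' in LTD's sense
        name: str
        kind: str = "definite"
        properties: list = field(default_factory=list)
        relations: list = field(default_factory=list)

    report = FSObject("report.txt", properties=[("read-only", True)])
    project_a = FSObject("project-A")
    project_b = FSObject("project-B")

    # unlike a directory tree, one file can enter two independent
    # relations without being copied:
    report.relations.append(("belongs-to-project", project_a))
    report.relations.append(("belongs-to-project", project_b))

    # an 'indefinite file' placeholder that a later operator decision can bind
    some_input = FSObject("?", kind="indefinite")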
3. CONCLUSION

New ideas were demonstrated for interface design in the development of creative thinking computer tests. A similar approach may be applied to other problems of human-computer interaction. It is possible that the Language of Ternary Description and the corresponding logic can be useful in many areas of HCI.

REFERENCES
1. A.I. Uyemov, Fundamental features of the Language of the Ternary Description as a logical formalism in the systems analysis. In: 8th Congress LMPS, vol. 5, part 1, sect. 1-6, Moscow, 1987, pp. 337-340.
2. J. Piaget, The Psychology of Intelligence. London: Routledge and Kegan Paul, 1950.
3. A.I. Uyemov, A formalism for parametric general systems theory. In: System Research. Methodological Problems. Yearbook 1984. Moscow: Nauka, 1984, pp. 152-180.
4. K.I. Chukovskij, From Two to Five. Moscow: Det. Lit., 1986.
5. A.E. Kiv, V.G. Orishchenko, I.A. Polozovskaya, I.G. Zakharchenko, Computer modelling of the learning organization. In: Abridged Proceedings of the 4th International Conference on Human Aspects of Advanced Manufacturing and Hybrid Automation, Manchester, UK, 1994.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Evaluating Human Operator Models in Tool-based User Interface Design

Maria Athousaki, SIEM Ltd, M. Mousourou 25, Mets, 116 36 Athens, Greece
This paper describes the context of use of a software tool aiming to provide assistance for the ergonomic design of man-machine interfaces as well as to facilitate usability engineering. The tool is currently being developed by SIEM Ltd and constitutes one of the company's lines of activities towards innovative solutions in the area of user interface design and evaluation. The novelty of the tool being described is that it relies upon encoded knowledge of the human operator (user profile, mental model) and the tasks to be performed to construct prototypical implementations of alternative potential designs. These subsequently become subject to evaluation and usability testing.
1. INTRODUCTION

The role of ergonomic design for man-machine interaction is widely acknowledged as critical for the overall performance of interactive systems. There is, consequently, an ever increasing need for promoting and supporting ergonomic design to arrive at usable systems that take into account the human operator's characteristics, requirements and preferences. In recent years, there has been an increasing interest in the development of methodologies and frameworks for evaluating the usability of new products. The primary objective of these efforts has been to develop methods and tools for the specification and measurement of usability [1]. As a result, several methodologies, guidelines and checklists have been proposed by industries, research practitioners and international standards committees. Some of the studies frequently cited in the relevant literature include the work by Smith et al. [2], which has been used as the basis to arrive at a more comprehensive set of guidelines published by ISO as international recommendations. Relevant checklists include EVADIS [3], [4] as well as the work of Ravden et al. [5]. More recently, the MUSIC project [1], in the context of the ESPRIT programme of the European Commission, has worked in conjunction with industry to produce a collection of methods and tools for the specification and measurement of usability. In this context, measuring usability during design aims to: (i) ensure that the final product reaches at least a minimum required level of usability; (ii) provide feedback during design on the extent to which the objectives are being met; and (iii) identify potential usability shortcomings of the product. Evaluation of the usability of interactive systems can be used to assist the design of ergonomic products, taking into consideration the human operator and his mental model (a user-centered approach). Even today, design decisions largely depend on the designer's interpretations, beliefs and assumptions regarding the human operator of a system. According
to Christie et al. [8], the development of appropriate human operator models regarding man-machine interaction can provide an appropriate basis for evaluating and implementing interactive systems. Such models should be sufficiently "rich" and contain information at different levels (lexical, syntactic, semantic and conceptual), considering the human operator as a psycho-physiological and cognitive system.

2. SUPPORTING ERGONOMIC DESIGN THROUGH USABILITY ENGINEERING SERVICES
Usability evaluation at SIEM Ltd is currently at the core of the company's activities. In the past, there have been a number of efforts towards the experimental evaluation of user performance in various tasks, such as CAD tasks [6]. As the role of man-machine interaction becomes increasingly more critical for the overall performance of an interactive system, there is a growing need for the development of methods and tools to support the measurement of the usability of such systems. In this direction, SIEM Ltd seeks to develop innovative solutions to address the numerous problems associated with determining various aspects of usability (e.g. measurement, analysis of usability results), by building upon recent experience and the state of the art. One of the current activities of the company is targeted towards the development of software tools for the design, rapid prototyping and evaluation of man-machine interfaces. Our objective is to incrementally integrate into one software architecture a number of critical elements related to the human operator (e.g. user profile, mental model) and the tasks under consideration, in an attempt to determine and assess the usability of potential designs. At SIEM Ltd, our current objective of measuring the usability of the man-machine interfaces of interactive systems is facilitated through the process depicted in the diagram of Figure 1. More specifically, we are developing a software tool, called EDS (Ergonomic Design Support), which can quickly produce a mock-up of different potential designs, given knowledge about the human operator (e.g. characteristics, preferences, knowledge of the domain of discourse) and the task(s) that this operator is to accomplish with the user interface. These mock-ups are subsequently subject to laboratory evaluation by establishing appropriate measurements as they are needed by the task(s) being considered. The usability results are interpreted in cooperation with the customer and are used in two different ways. First, they are translated into a number of recommendations for the customer, depicting specific guidelines for optimal, ergonomic design of the user interface. Secondly, they are fed into the system as past usability data, to extend EDS's functionality and decision making process. In the following, the diagram of Figure 1 is elaborated, exemplifying the way in which usability measurement during early design is serviced.

3. OVERVIEW OF EDS

EDS aims to support the ergonomic design, simulation and prototypical implementation of man-machine interfaces to interactive systems, facilitating usability evaluation at design time. The tool, which is currently under development, provides support for encoding operator
models, alternative dialogue syntax suitable for different operators, as well as usability-specific data; all these constitute the aggregation policies towards prototypical designs.
Figure 1: Supporting ergonomic design: usability engineering services using EDS. (The diagram connects operator models, task structures, requirements, encoded rules such as "A & B -> C", prototype generation, and usability testing in a cycle.)

The novelty of the tool, which makes it attractive in the context of usability measurement, is that it is geared towards experimentation with alternative operator models, task structures and measurement criteria. In other words, the usability engineer is allowed to experiment with the encoding of alternative models of human operators and usability measures to determine the value of different designs in various contexts and scenarios of use. A prototypical design is produced by fitting the task to the user. This is achieved by tailoring the dialogue structure and lexical interface design to the operator's characteristics and knowledge of the application domain, by applying rules and usability-related aggregation policies. Aggregation policies are specified by the usability engineer during design, as meta-rules, and draw upon past experience and cases that may influence or determine choices (of dialogue structure, interaction style, choice of interaction object) based on usability criteria (e.g. time taken to complete a task). For example, let us assume that the objective is to measure the usability of a potential design with respect to the time taken by the operator to complete the task. Furthermore, we assume the hypothetical operator model which depicts the information shown in Figure 2 (in a declarative notation).
    operator( numeric_input,
              [ knows( value_range, [1,25] ),
                knows( concept, [potentiometer, gauge] ),
                familiar( device, [keyboard, mouse] ) ] ).
Figure 2" Instance of the operator's model The operator model depicted in Figure 2 declares that for a given task (e.g. entry of a numeric value represented by the constant n u m e r i c _ i n p u t ) the following hold: (i) the current operator knows the value range from which input is to be accomplished; (ii) the current operator knows the concepts of a potentiometer and gauge; (iii) the current operator is familiar with the keyboard and mouse input devices. In order for EDS to propose a design suitable for a numeric input task and the current human operator, it initiates a search to retrieve past usability data (if any) in which an interface design has been developed for a numeric input task. Usability data is encoded into the system as declarative statements depicting average test results such as those indicated in the example of Figure 3. case ( [criterion ( t i m e _ c o m p l e t i o n ) ] , numeric_entry, [known ( 2 5 s e c , [ p o t e n t i o m e t e r , keyboard ], user ( [ knows ( v a l u e _ r a n g e , _ ) , knows ( c o n c e p t , p o t e n t i o m e t e r ) , familiar( d e v i c e , [keyboard , mouse]) type (manager) . . . . ] ]) . case ( [criterion (time_completion) ] , n u m e r i c _ e n t r y , [known ( 4 2 s e c , [ g a u g e , m o u s e ] , user( [knows ( v a l u e _ r a n g e ,_) , knows ( c o n c e p t , p o t e n t i o m e t e r ) , familiar( d e v i c e , [keyboard , mouse]) type (manager) . . . . ] ] )
,
,
Figure 3" Relevant cases retrieved from the usability knowledge base For the purposes of our example, the information depicted in Figure 3 identifies two past cases in which a task classified as n u m e r i c i n p u t was evaluated with the 'timecompletion' usability criterion (represented by the term cri teri on (t i m e _ c o m p I e t i on) of the listconstituting the firstargument of the predicate case). In the first case, the average time taken to complete the task was 25 sec, while this was achieved with the use of a potentiometer and a keyboard device (all these are explicitly declared by the predicate k n o w n ) . In addition, the operators who took part in the experiment have exhibited a minimum set of characteristics, declared as members of the list appearing as the argument of the predicate user (i.e. they all knew the value range from which input was to be accomplished, etc). In the second case, the time taken to complete a numeric input task was substantially larger, but this was achieved through the use of a gauge and the mouse, while the operators who participated in the experiment exhibited another set of characteristics, which is also declared in the list.
Past cases are retrieved by matching the current task under consideration (i.e. numeric input) and the characteristics of the current operator against the task and operator characteristics addressed in the past cases. Consequently, all cases thus retrieved are relevant and can be used to reason towards a conclusion. The reasoning process selects the result which indicates the shortest time for task completion. In the case of our example, EDS would select the first case and configure the interface accordingly, since there is evidence to suggest that this configuration is likely to be more usable than any other. In the case that such evidence does not exist, EDS simply informs the usability engineer. From the example it can be seen that, as new usability data are interpreted and encoded as new knowledge in EDS, the decisions that the tool makes take full account of this incrementality in the semantics depicted in the knowledge base. Consequently, the tool exhibits a certain degree of "dynamic behaviour", which is required when conducting usability measurements. In order for EDS to be able to retrieve cases which are relevant to the current usability test, it uses additional knowledge related to the task being addressed. More specifically, tasks are characterised in terms of requirements, and a number of criteria have been devised which serve this objective. Indicative examples of task characterisation criteria include: whether or not the task is safety critical, frequency of the task, accuracy, type (e.g. input or output), feedback, etc. Each one of these characterisations is associated with rules in the rule base of EDS and may potentially trigger different user interface versions for the same task. For instance, if the task is safety critical, a general flag is set which activates particular types of (initiation, interim and completion) feedback on different interaction objects.
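The retrieval-and-selection step just described can be summarised in the following minimal Python sketch (illustrative only, not the actual EDS implementation; the data structures and names are our assumptions):

    def select_configuration(task, operator_traits, cases):
        # keep only cases whose task matches and whose operator
        # requirements are a subset of the current operator's traits
        relevant = [c for c in cases
                    if c['task'] == task
                    and c['criterion'] == 'time_completion'
                    and c['required_traits'] <= operator_traits]
        if not relevant:
            return None   # no evidence: EDS would inform the usability engineer
        # pick the configuration with the shortest completion time
        return min(relevant, key=lambda c: c['seconds'])['config']

    traits = {'knows_value_range', 'knows_potentiometer', 'familiar_keyboard_mouse'}
    cases = [
        {'criterion': 'time_completion', 'task': 'numeric_entry', 'seconds': 25,
         'config': ['potentiometer', 'keyboard'],
         'required_traits': {'knows_value_range', 'knows_potentiometer'}},
        {'criterion': 'time_completion', 'task': 'numeric_entry', 'seconds': 42,
         'config': ['gauge', 'mouse'],
         'required_traits': {'knows_value_range', 'knows_potentiometer'}},
    ]
    print(select_configuration('numeric_entry', traits, cases))
    # -> ['potentiometer', 'keyboard'], mirroring the Figure 3 example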
3.1. Interface design using operator and task knowledge

The tool which interprets human operator models and task descriptions is equipped with a number of rules which operate on prior knowledge so as to determine some of the design aspects of the user interface. Typical rules encode information such as the following: if the task that the operator is to perform with the man-machine interface is safety critical and the operator has the ability of visual comprehension, then the feedback to be offered by an interaction object may be of a particular type (e.g. blinking), as opposed to the feedback required in the case that the operator's ability of visual comprehension is limited or reduced. Rules such as the above relate the context of interaction (e.g. what the user is doing, or what the user interface is trying to convey, at a given time) to the operator's abilities. Another set of rules is used to relate tasks to the operator's cognitive load and mental model. For instance, if it can be inferred from the human operator model that the operator is not familiar with the use of a bar chart for numeric entry, then the selection of a bar chart for numeric input should be avoided.
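As a sketch of how such a rule might be encoded (hypothetical names; the paper does not show EDS's actual rule syntax, and the non-visual alternative is our assumption):

    def feedback_type(safety_critical, visual_comprehension_ok):
        # safety-critical tasks get salient feedback; the modality depends
        # on the operator's visual comprehension ability
        if safety_critical:
            return "blinking" if visual_comprehension_ok else "non-visual (e.g. auditory)"
        return "default"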
3.2. Usability analysis

There are a number of general criteria which are frequently used to evaluate user interfaces. Some of the most common ones, usually employed when conducting user trials, include [7]: time, speed of response, activity rate, etc.; accuracy and errors; convenience and ease of use; comfort; satisfaction; physical health and safety; amount achieved or accomplished; job satisfaction; etc. Usability testing of a user interface usually involves user trials in which several of the above criteria may be used to evaluate and compare alternative
designs. The results are incorporated into the usability results knowledge base to facilitate the ergonomic design of new interfaces, as described in Section 3.

4. DISCUSSION AND CONCLUSIONS
The preceding discussion identified the main building blocks of a tool aiming to provide assistance for the ergonomic design of man-machine interfaces. The rationale for building such a tool has been to support usability engineering as well as the incremental embodiment of the state of the art in HCI research and development (R&D), past experiences and practices, so as to enable SIEM Ltd to successfully address industrial needs. The commercial targets that SIEM Ltd aims to address include domains in which the ergonomic design of the man-machine interface is crucial to the overall system performance or to the wide diffusion and user acceptance of the system. Such domains include interfaces to public or private terminals and equipment (e.g. ATMs, photocopiers), but also market niches such as the assistive technology industry, where terminal/application/service access problems confronting the socio-economic integration of people with disabilities are currently predominant. The added value of the usability cycle is that the recommendations produced at the end of this cycle can be incorporated and subsequently used to extend the capabilities of the EDS tool. In addition, a conscious effort is expended on collecting past data on usability problems and corresponding solutions, in an attempt to establish a pool of knowledge revealing past data and experience. SIEM Ltd seeks to exploit the above usability cycle in a number of domains and application areas. One such application domain is user interfaces in smart home environments, accessible to disabled and elderly users. Currently, the company participates in European collaborative R&D activities related to identifying user requirements in smart home environments and conducting usability tests (SIEM Ltd is a partner in the CASA TP 1068 project of the TIDE Programme of the European Commission).

REFERENCES
1. N. Bevan and M. Macleod, Usability measurement in context. Behaviour & Information Technology, 13(1&2), 132-145, 1994.
2. S. L. Smith and J. N. Mosier, Guidelines for Designing User Interface Software. MITRE Corporation, Bedford, MA, USA, 1986.
3. R. Oppermann, B. Murchner, M. Paetau, M. Pieper, H. Simm, I. Stellmacher, Evaluation of Dialogue Systems. GMD, St. Augustin, Germany, 1989.
4. H. Reiterer, EVADIS II: A new method to evaluate user interfaces. In: D. Diaper (ed.), People and Computers, Vol. VII, Cambridge University Press, 1992.
5. S. Ravden and G. Johnson, Evaluating the Usability of Human-Computer Interfaces. Chichester: Ellis Horwood, 1989.
6. L. Laios and M. Athousaki, An experimental evaluation of user performance in CAD tasks. Fourth International Conference on Human Aspects of Advanced Manufacturing & Hybrid Automation, July 6-8, 1994, Manchester, England.
7. I. McClelland, Product assessment and user trials. In: J. R. Wilson and E. N. Corlett (Eds.), Evaluation of Human Work, Taylor and Francis, 1990, pp. 231-232.
8. B. Christie (ed.), Human Factors of the User-System Interface. North-Holland, 1985.
III. 16 Modeling 1
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Associative User Modeling: A Neural Network Approach

Qiyang Chen a and A. F. Norcio b

aDepartment of Computer and Information Sciences, Towson State University, Towson, MD 21204, U.S.A.
bDepartment of Information Systems, University of Maryland, Baltimore County, MD 21228, U.S.A.

This paper presents and discusses an approach to user modeling. A set of neural networks is utilized to store, maintain and infer users' task-related characteristics. Such networks function as associative memories that can capture the causal relationships among users' characteristics for system adaptation. It is suggested that this approach can be expected to overcome some inherent problems of the conventional stereotyping approaches in terms of pattern recognition and classification. It can also avoid the complexity of truth maintenance in default reasoning that is required in previously known stereotyping approaches.

1. INTRODUCTION

User models have become important components of adaptive human-computer interfaces. It is recognized that a system will be able to exhibit flexible user-oriented behavior only if it has access to a model of the user, which consists of assumptions about the user's characteristics regarding the task being performed. These characteristics may vary depending on the types of tasks. Usually, they are related to a user's plans, goals and domain knowledge as well as cognitive preferences [1]. There are different taxonomies of these characteristics, depending on the time period over which they hold, the way in which they are elicited and represented, and the degree to which they are specified [2]. It has been a common practice to use a set of predefined assumptions to initialize the system's beliefs about its users and the tasks they are performing; this is usually referred to as the stereotype approach. The pre-defined stereotypical knowledge is organized into a generalization hierarchy in which the stereotypes inherit knowledge from their ancestors. The modeling process proceeds with stereotype assignment in terms of default reasoning, which allows the model to retain the stereotypical knowledge about a user in the absence of evidence to the contrary. Although the stereotype approach provides a simple way to initialize the modeling process and has been successful in some applications, we believe that this approach limits the representational power of a user model in the following respects. Since the reasoning is conducted with extensive default assumptions that may conflict with the new evidence obtained as the interaction progresses, revision of the stereotypical knowledge is necessary to handle the inconsistencies. A common suggestion is to use a dependency-directed backtracking process to accomplish the truth maintenance, which examines one piece of evidence at a time in a nonmonotonic way [3]. This approach is often inefficient
and lacks the ability to detect noisy or inconsistent information that should be ignored [4]. Therefore, it is very possible that the current effort of maintaining consistency will bring further conflicts in the subsequent interaction. Thus, model construction may fall into a dilemma where a non-monotonic process of reconciling conflicts is frequently involved and eventually no decision can be made after a period of interaction [5]. In addition, the pre-defined hierarchy confines the system's beliefs within each stereotype, and they can only be inherited by the descendant stereotypes. Therefore, it is hard to effectively update those system beliefs that are no longer significant in the context of task performance. Also, a user may fail to fit any set of stereotypes, so that the modeling process fails to associate any system decision with that user. In such a situation, however, some of the assumptions distributed among the stereotypes might still be useful for characterizing that user. In this sense, the hierarchical structure of stereotyping approaches is limited in the degree to which it can individualize a user. We suggest that the users' task-related information should be examined in terms of pattern recognition and classification, so that the interface system can establish complete and consistent profiles of users in order to exhibit cooperative behavior.

2. ASSOCIATIVE USER MODEL
2.1. Basic concepts

We propose that associative networks can be used as an efficient mechanism for user modeling. In an associative-network-based user modeling process, the stereotypical knowledge is organized as a set of patterns. Since the system's beliefs about a user should be determined through the context of task performance, fragmented pieces of observation may not carry any meaningful implication. In addition, the observed information about the user's characteristics may be mixed with noise or inconsistencies. Therefore, all aspects of the user's performance patterns have to be examined before any system decision can be made. In other words, the user modeling process should be conducted in terms of pattern recognition, which requires that the modeling system have the capabilities of fault tolerance, graceful degradation, and signal enhancement. Neural network techniques therefore become natural tools for the implementation, whereas, as suggested in the section above, the conventional stereotyping approaches, in which the inference proceeds a step at a time through sequential logic, may become seriously inadequate for processing pattern-formatted knowledge, especially when incomplete, noisy or inconsistent observations are involved [6]. There are several paradigms of associative memories that can capture the associations between input patterns and output patterns despite incomplete or inconsistent inputs. In our approach, an associative memory is implemented by a single set of interconnected units. Each unit represents an assumption or an attribute of the system's stereotypical knowledge about users. All assumptions together constitute a universal stereotype. The modeling process extracts some of the assumptions to dynamically form a unique stereotype that fits a particular user. Unlike the hierarchical stereotyping approaches, which only associate the user with a single stereotype or a set of stereotypes, an associative user model includes the various assumptions associated with a user. In other words, associative user modeling proceeds at the level of the assumption rather than the level of the stereotype. In this approach, all assumptions of the universal stereotype are considered to be
relevant to each other in a spectrum valued from negative to positive (i.e., from contrary, via irrelevant, to consistent). Thus, it overcomes the limitation of the hierarchical stereotypes, which are unable to extract assumptions from different stereotype structures to form a new profile of the current user and task. Therefore, this approach has a better ability to personalize a user. To test the proposed approach, several network models are used for modeling users' generic domain knowledge in the field of database programming.

2.2. Pattern associators
A bidirectional linear associator (BLA) is used as a knowledge processing paradigm to capture the causal relationships between an arbitrary number of assumptions. The associations among the assumptions are weighted under certain conditions. Figure 1 shows the structure of such a paradigm. Once a user's input from the dialogue channel is observed, it forms an input to the BLA. The modeling process is conducted by propagating the activation level throughout the network to associate this input with an output pattern. This output pattern is considered to be the current system beliefs about that user. This process simulates the behavior of default reasoning.
Figure 1. The structure of the BLA (nodes represent assumptions; wij: weight from node i to node j).

The kernel part of the BLA is the weight matrix. The data for constructing this matrix were collected by the card sorting method [7]. Twenty database programming concepts (refer to Table 1) are used in the test. 49 undergraduate students majoring in either information systems or computer science participated in the data collection procedure. Each subject is asked to create a weight matrix expressing the relationships among the relevant concepts. Given an assumption that a user knows (or does not know) a concept, subjects are asked to choose other possible concepts the user might also know (or not know) and assign belief values to the corresponding cells. For example, if it is believed that a user who knows concept x may also know concept y, then "1" is filled into the corresponding cell (x,y). Subjects may use any number between -1 and 1 to characterize such beliefs. A simple average function is used to integrate the matrices from the subjects.

Table 1. Concept index

No.  concept        No.  concept          No.  concept          No.  concept
1    integer        6    local variable   11   recursive        16   FD
2    real           7    tree             12   record           17   3NF
3    loop           8    stack            13   index            18   concurrence
4    array          9    inheritance      14   weak entity      19   locking
5    subroutine     10   interrupt        15   data integrity   20   NP-complete
A linear propagation activation algorithm is used:

1. Initialize the weight matrix M = (w_ij)_{n×n}, where n is the total number of nodes.
2. Apply the input vector V = (v_1, v_2, ..., v_n) to M to produce a stimulus vector V', where

   v_j' = f( Σ_{i=1}^{n} v_i × w_ij ),   with   f(x) = -1 if x < -1/2;  1 if x > 1/2;  0 otherwise.

3. a) If the network converges (i.e., V' = V), stop.
   b) If V' has been produced previously (a cycle), set V' = V' ∪ V (logical union) and stop.
   c) Otherwise set V = V' and go to step 2.

110 different input patterns are tested, including patterns that contain inconsistent concepts (e.g., for the same concept, both the known and unknown nodes are fired). 100% of the output patterns satisfied the conditions: (a) advanced concepts in the input yield less advanced concepts, and (b) inconsistent input does not produce inconsistent output.
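The algorithm is simple enough to sketch directly. The following Python fragment is a minimal illustration (ours, not from the paper), assuming activation values in {-1, 0, 1}; in particular, treating the "logical union" on cyclic trajectories as the sign of the element-wise sum is our interpretation:

```python
import numpy as np

def propagate(M, v, max_iters=100):
    """Linear propagation over the BLA weight matrix M (n x n)."""
    def f(x):  # threshold function from step 2
        return np.where(x < -0.5, -1, np.where(x > 0.5, 1, 0)).astype(int)

    seen = {tuple(v)}
    for _ in range(max_iters):
        v_new = f(M.T @ v)               # v'_j = f(sum_i v_i * w_ij)
        if np.array_equal(v_new, v):     # (a) converged
            return v_new
        if tuple(v_new) in seen:         # (b) cycle: take the logical union
            return np.sign(v_new + v).astype(int)
        seen.add(tuple(v_new))
        v = v_new                        # (c) iterate
    return v
```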
2.3. Feedforward Model

A feedforward network trained by the back-propagation method is also used in the study. This network (BP) is used to generalize the associative knowledge stored in the BLA. The training data comes from the BLA's testing results. A three-layer network is implemented, as shown in Figure 2; Table 2 shows the training and testing results.

Figure 2. Feedforward network

Table 2. Training and testing results
Learning rate                0.50
Momentum                     0.50
Output threshold             0.85
Number of training cycles    25,000
Number of training data      110
Testing/recall data          70
Recall accuracy              100%
For the unseen data, 100% of the testing results satisfy conditions (a) and (b) stated in Section 2.2. This network has also been tested for generalization: introducing a new concept node into the network (i.e., a new system belief entering the modeling process) causes a structural change to the network. In order to retain the effects of previous training, a partial training is conducted as follows:
• Fix all weights except those on the newly added connections (i.e., the fixed connections do not participate in the training process).
• During the training period, present the new training samples at both the input layer and the output layer. The input vector contains the stimulus on the new concept; the output vector contains all concepts implied by the new concept.
The representation pattern for a new concept (i.e., the training data) reflects the closeness between the new concept and the existing concepts. The test result shows that functionally correlated concepts yield similar concepts in the output. In other words, a new concept makes the network turn on the nodes that its correlated concepts would turn on. For example, once the concept "queue" is added, it yields similar concepts in the output to those the concept "stack" yields. This result implies that the network can generalize its reasoning ability to adapt to new system assumptions without being totally retrained. This feature is particularly important for the dynamic modeling process, in which it is often necessary to update the structure of the system belief space.
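The freeze-and-train step described above can be sketched as follows. This is our illustration, not the paper's implementation; the function names and the gradient computation (elided) are assumptions. It shows only how previously trained weights are kept fixed while the new concept's connections are trained:

```python
import numpy as np

def add_concept(W):
    """Grow the weight matrix by one node and return a boolean mask that
    marks the trainable (newly added) connections; all others are frozen."""
    n = W.shape[0]
    W_new = np.zeros((n + 1, n + 1))
    W_new[:n, :n] = W                        # keep previously trained weights
    mask = np.zeros_like(W_new, dtype=bool)
    mask[n, :] = True                        # new node's outgoing connections
    mask[:, n] = True                        # new node's incoming connections
    return W_new, mask

def partial_step(W, grad, mask, lr=0.5):
    """One training step that updates only the new connections."""
    W -= lr * grad * mask                    # frozen weights get a zero update
    return W
```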
2.4. Regularity detector

In a user modeling system it is also often necessary to classify users' characteristics for system adaptation. An ART model is therefore used to further classify the outputs from the above two networks. The classification is based on the closeness of the input patterns. Sixteen of the twenty concepts are represented by input nodes. Four output nodes are used to indicate the user categories expert, expert-intermediate, intermediate-novice, and novice, respectively. The unsupervised training process stores the typical pattern for each category. The test result shows that the network successfully associates the test patterns with the closest stored patterns and activates the corresponding categorical node in the output layer. Figure 3 shows an example of this network behavior.
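The paper does not spell out the ART dynamics; the following sketch (ours) only illustrates the classification principle described, namely associating an input pattern with the closest stored category prototype:

```python
import numpy as np

CATEGORIES = ["expert", "expert-intermediate", "intermediate-novice", "novice"]

def classify(prototypes, x):
    """Return the category whose stored typical pattern is closest to x.

    prototypes : (4, 16) array, one stored pattern per category
    x          : length-16 input pattern over the concept nodes
    """
    distances = np.linalg.norm(prototypes - x, axis=1)
    return CATEGORIES[int(np.argmin(distances))]
```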
2.5. Simulating a blackboard framework

The above network models can be integrated as a blackboard system in which each model can function either independently or cooperatively (see Figure 4). This framework has been simulated by passing the input or output from one network to another. For example, the input and output of the BLA model can be used as input (or training data) for the BP model; the output of the BP or BLA can be presented to the ART model for classification. The output from any network is viewed as the current user profile in the dialogue context. Thus, this framework provides an effective way to perform dynamic user modeling.

3. CONCLUSION

This study tested and integrated several neural networks as associative memories in the user modeling process. The approach has shown several advantages, such as fast default reasoning and generalization, insensitivity to inconsistent input, personalization, and learning ability.
In addition, compared with the rule-based systems used in conventional user modeling, the proposed system is easier to implement and maintain. Also, the knowledge elicitation process is simpler than rule-base construction, because only the causal relationships are needed to initiate the modeling process. Further research is aimed at incorporating task information into user models so that the system can have a more comprehensive picture not only of who the user is but also of what he or she is going to do.
Figure 3. An example of the behavior of the ART network for user classification

Figure 4. A blackboard framework for user modeling (x: stimulus/training data; v: output)
REFERENCES
1. Norcio, A. F. and Stanley, J., "Adaptive Human-Computer Interfaces: A Literature Survey and Perspective," IEEE Transactions on Systems, Man and Cybernetics, Vol. 19, No. 2, pp. 399-408 (1989).
2. Rich, E., "User Modeling via Stereotypes," Cognitive Science, Vol. 3, pp. 329-354 (1979).
3. de Kleer, J., "An Assumption-Based TMS," Artificial Intelligence, Vol. 28, No. 2, pp. 127-162 (1986).
4. Chen, Q. and Norcio, A. F., "An Associative Approach in Dynamic User Modeling," Proceedings of the 5th International Conference on Human-Computer Interaction, pp. 909-914 (1993).
5. Huang, X., McCalla, G. I., Greer, J. E. and Neufeld, E., "Revising Deductive Knowledge and Stereotype Knowledge in a Student Model," User Modeling and User-Adapted Interaction, Vol. 1, No. 1 (1991).
6. Pao, Y., Adaptive Pattern Recognition and Neural Networks, Addison-Wesley Publishing Co. (1989).
7. Wilson, M., "Task Models for Knowledge Elicitation," in Knowledge Elicitation: Principles, Techniques and Applications, Dan Diaper (ed.), Ellis Horwood Pub. (1989).
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Personality engineering: Applying human personality theory to the design of artificial personalities Linda S. Endres Johnson Controls, Inc., 507 East Michigan Street, Milwaukee, Wisconsin 53201
[email protected] We should take care not to make the intellect our god; it has, of course, powerful muscles, but no personality. - Albert Einstein (1950, p. 260)
Abstract
To understand and explain our world, we often resort to "anthropomorphism," attributing human characteristics to things that are not human. For example, intelligence and friendliness are often perceived--in varying degrees and by various definitions--in computers and other machines. This perception of human traits in nonhumans can be interpreted as "artificial personality." Because anthropomorphic systems have proven controversial, more research is required to give us the insights needed to develop more compatible and effective artificial personalities. The purpose of this paper is to introduce the concept of "personality engineering" to enable us to better understand artificial personalities and ultimately to improve the interaction between human and artificial personalities. To accomplish this, personality psychology is recommended as a framework within which to design, develop, and evaluate any system or product. A brief example is given to illustrate how one human personality theory could be adapted for the study of artificial personalities. 1. INTRODUCTION
Attributing human characteristics to things that are not human is natural in that we tend to "see the world not as it is but as we are" (Frude, 1983, p. 103). Particularly with the development of human-inspired software--agents, assistants, experts, and guides, as well as the eventual evolution of sophisticated virtual beings--it becomes easy to perceive characteristic human-like behavior, or "artificial personality." The tendency to attribute human characteristics to nonhumans is known as "anthropomorphism." Whether computers and other nonhumans actually have personalities is left for debate elsewhere. This paper begins with the observation that people do perceive personality characteristics in nonhumans and it is this perception that is referred to here as "artificial personality." As background, human personality will briefly be defined and examples will be given of personality characteristics perceived in nonhumans. Because anthropomorphic systems have proven controversial, several concerns will be noted.
The concept of "personality engineering" will then be introduced to enable us to better understand artificial personalities and ultimately to improve the interaction between human and artificial personalities. To accomplish this, personality psychology is recommended as a framework within which to design, develop, and evaluate any system or product. An example illustrates how we might adapt one personality trait theory to study artificial personalities. 2. EXPLORING HUMAN AND ARTIFICIAL PERSONALITIES
Personality has several definitions, including: "the complex of characteristics that distinguishes an individual," "the totality of an individual's behavioral and emotional tendencies," and "the organization of the individual's distinguishing character traits, attitudes, or habits" (Webster's, 1979, p. 848). According to the trait theory developed by Cattell (Cattell, Eber, & Tatsuoka, 1970), varying degrees of each of 16 core factors make up the normal human personality. Of these factors, warmth and intelligence are considered the most important because they account for more variation in personality than the other 14 factors (Karson & O'Dell, 1976). Not surprisingly, these two human traits are already represented--albeit in varying degrees and using various definitions--by the machine traits of user friendliness and artificial intelligence. Countless products have been advertised as "smart" or "intelligent," including batteries, buildings, highways, sensors, mops, tires, and toilets. Equally impressive is the number of products that have been advertised as "user friendly." Compared to the richness and complexity of the human personality, today's artificial personalities are typically shallow. Our quest for user friendliness and artificial intelligence has focused attention on only two dimensions of personality. We may wonder whether other personality traits can similarly be perceived in machines. A study by Asch (1946) found that when humans were described in terms of personality traits, those not described could still be inferred. Therefore, it is hypothesized here that traits that are not purposely designed in machines can still be inferred. Artificial personalities can take many forms. For example, Laurel noted that "even the most technologically savvy user will feel quite comfortable comparing the Macintosh and the IBM PC in terms of their 'personalities'" (1990, p. 356). Regardless of whether a system is overtly anthropomorphic (for example, one that uses a human image or otherwise presents itself as if it were human), personality traits can be perceived in the system. Personality may be inferred from even the choice of words used in a system. The degree of fidelity of an anthropomorphic system in simulating a given human behavior may vary. The goal of a product or system will dictate what level of fidelity in personality simulation is appropriate. For example, in the role of agents, Laurel (1990) advocates the use of stereotyped personalities--as opposed to full personality simulations. Anthropomorphic systems may also vary in the degree to which they attempt to model a complete person. The artificial nose (Coghlan, 1994), for example, is anthropomorphic in name and function, but it models only a narrow range of human capability.
3. ANTHROPOMORPHISM Anthropomorphism is both natural and useful and may be the most common metaphor used in formal computer discourse (Johnson, 1994). By making use of our past experience, metaphors help us understand our world. They enable us to think differently, in terms in which we might not otherwise have thought. Despite the naturalness of anthropomorphism, anthropomorphic systems are controversial and have a formidable opponent in Shneiderman (1993), a pioneer in human-computer interaction, who challenges us to rethink our anthropomorphic systems. Where children are involved, Shneiderman raised three concerns over an anthropomorphic view of machines in that it may: (1) give an erroneous impression of the workings and capabilities of computers, (2) confuse a child's developing sense of humanity, and (3) produce anxiety or confusion. To address these concerns, Shneiderman advocates a "nonanthropomorphic style" of language that attributes control to the user and avoids anthropomorphic references. Shneiderman's concerns are valid, but they are not limited to anthropomorphic machines or to children. When the concerns are recast to apply to both nonanthropomorphic systems and adults, other alternatives that address these concerns then become apparent. For example, we can acknowledge in the first concern that both anthropomorphic and nonanthropomorphic machines can give an erroneous impression to both adults and children as to the workings and capabilities of computers. However, if this information is needed, users can be trained. The second concern, that children may be confused over their own sense of humanity, can likewise be extended to include adults. Instead of attributing this confusion to anthropomorphic systems, perhaps we should work to strengthen our sense of humanity through some form of education. The third concern, recast to include nonanthropomorphic machines in producing anxiety or confusion in both adults and children, may also be addressed by prescribing training for the user, the system's designer, or both. In his nonanthropomorphic style guide, Shneiderman (1993) addressed the user's need for a sense of control by, for example, crediting a child with printing a list of animals instead of crediting the computer with printing a list. However, in crediting only the child, we lose a prime opportunity to acknowledge and appreciate teamwork between humans and their automated assistants. The human-computer relationship is an honorable one and it need not be perceived as threatening to our sense of control. The issue of control is not absolute. While there are times when a user wants complete control, there may be other times when such control would defeat the purpose of a given system. For example, consider future scenarios where an intern practices his or her bedside manner on a simulated patient or a person practices assertiveness techniques while role-playing with a virtual being. In these scenarios, users may desire strict control over such program considerations as on/off or in the setup of the role-play situation. Once these programming controls are defined (perhaps in part with sliders; see Ahlberg, Williamson, & Shneiderman, 1993), a user may then benefit from roleplaying with an anthropomorphic interface with predetermined or user-selectable personalities. In this case, both anthropomorphic and nonanthropomorphic interfaces are used as appropriate to the goal at hand. 
The needs of the user and the purpose of the system help determine whether it is best to employ an anthropomorphic interface, a nonanthropomorphic interface, or both in the same system, as either alternate or
complementary modes of interaction. In considering the needs of users, designers should be aware, for example, that an overtly anthropomorphic system might offend certain cultures (Marcus, 1993) or people who want only to view their machines as tools. A final criticism is that some anthropomorphic systems at first seem cute, but are later viewed as silly or annoying (Shneiderman, 1995). However, that such systems have been created does not mean that all artificial personalities are annoying; it only means that we have succeeded in creating personalities that can be annoying. Addressing the question, "Who wants obnoxious machines?" Kurzweil (1990) predicted that "shaping the personality of intelligent machines will be as important as shaping their intelligence" (p. 413). In the next section, a framework is introduced that will enable us to create more compatible and effective artificial personalities. 4. PERSONALITY ENGINEERING
The term "personality engineering" is introduced here to refer to the use of human personality psychology as a framework within which to design, develop, and evaluate any system or product. In effect, it proposes that we adapt traditional methods of understanding human personality for the study of artificial personalities. By drawing on psychology, personality engineering addresses two important needs: for a more scientific evaluation of interfaces (Shneiderman, 1995) and for multi-disciplinary collaboration (Laurel, 1990). It provides an excellent framework for researching products, users, and user-product interactions. Diversity and user-centered design can be taken to new levels, as designers are given one more tool to meet the needs and expectations of diverse users. By adapting the theories and tools of personality psychology, designers can anticipate how users might perceive systems and products. Of note, personality engineering is anticipated to be useful whether or not a system is overtly anthropomorphic. Furthermore, it is proposed as a supplement, not as a replacement, for traditional user interface techniques. Applications for personality engineering are wide-ranging, from planning the personality traits that users will perceive in simple text or voice output to designing fullscale artificial personalities for sophisticated virtual beings. At a basic level, traits are impossible to control if they cannot be measured. By considering personality theories, designers are at least made aware of potential interpretations that they may then decide to measure. Depending on the application and the intended users, certain personality traits may be more crucial to engineer than others. At a more sophisticated level, designers can match the personalities of users and machines to facilitate interaction and enhance productivity. Users could be allowed to implement their preferences in machine personality. Personalities of sophisticated virtual beings could be created from a palette of traits, perhaps implemented as a control panel. Various personalities could be codified in expert systems that provide the advice of multiple experts. Different artificial personalities could be supplied with computer-based training packages and students could select a teacher personality compatible with their learning style. One way to engineer the artificial personality is via the systematic application of personality trait theory to a product or system being designed or evaluated. Personality engineers might select, for example, the 16 personality factor theory of Cattell (see Table 1). By adapting Cattell's trait theory, they ensure that all essential personality
dimensions have been considered and thereby increase the probability that artificial personalities will meet the needs of the human personalities that will interact with them. To adapt the theory, each trait must first be defined for the application and a determination made as to how that trait might be manifested in behavior. Certain traits, such as suspiciousness and imagination, for example, have obvious counterparts in computer security and creativity. Potential users can be surveyed to determine whether they have preferences for certain traits, an inability to perceive certain traits, or an unanticipated ability to perceive or impute others. Users can also be surveyed to determine how they might perceive these traits being manifested in the behavior and/or physical characteristics of the particular product or system. The artificial personality can then be revised based on conclusions drawn from the survey. The adaptation of theories of both normal and abnormal personality will likely give us insights into both human and artificial personalities.

Table 1
The 16 personality factors of Cattell's personality trait theory
Description of Factor                                  Name of Factor
Cool vs. Warm                                          Warmth
Concrete Thinking vs. Abstract Thinking                Intelligence
Affected by Feelings vs. Emotionally Stable            Emotional Stability
Submissive vs. Dominant                                Dominance
Sober vs. Enthusiastic                                 Impulsivity
Expedient vs. Conscientious                            Conformity
Shy vs. Bold                                           Boldness
Tough-minded vs. Tender-minded                         Sensitivity
Trusting vs. Suspicious                                Suspiciousness
Practical vs. Imaginative                              Imagination
Forthright vs. Shrewd                                  Shrewdness
Self-assured vs. Apprehensive                          Insecurity
Conservative vs. Experimenting                         Radicalism
Group-oriented vs. Self-sufficient                     Self-sufficiency
Undisciplined Self Conflict vs. Following Self-image   Self-discipline
Relaxed vs. Tense                                      Tension
Copyright © 1986, IPAT, Inc. Adapted from the 16PF Profile with permission.

5. CONCLUSIONS

Regardless of whether a system is overtly anthropomorphic, personality traits, such as friendliness and intelligence, can be perceived in the system. This perception of personality traits in nonhumans can be interpreted as "artificial personality." Although anthropomorphism is natural, anthropomorphic systems have proven controversial. Research into the artificial personality should allay some of our concerns and enable us to create more compatible and effective artificial personalities.
The concept of "personality engineering" was introduced here as a way to research the artificial personality. Personality engineering recommends the use of personality psychology as a framework within which to design, develop, and evaluate any system or product. Given our ability to perceive personality traits in nonhumans, it is likely that the theories and tools of personality psychology can be adapted to enable us to better understand the artificial personality and to improve the interaction between human and artificial personalities. Cattell's 16 Personality Factor theory, for example, seems a viable way to explore the personality of a system or product. This and other personality theories can be adapted to give us insights into both human and artificial personalities. REFERENCES
Ahlberg, C., Williamson, C., & Shneiderman, B. (1993). Dynamic queries for information exploration: An implementation and evaluation. In B. Shneiderman (Ed.), Sparks of innovation in human-computer interaction (pp. 281-294). Norwood, NJ: Ablex Publishing Company.
Asch, S. E. (1946). Forming impressions of personality. Journal of Abnormal and Social Psychology, 41, pp. 258-290.
Cattell, R. B., Eber, H. W., & Tatsuoka, M. M. (1970). Handbook for the Sixteen Personality Factor Questionnaire. Champaign, IL: IPAT, Inc.
Coghlan, A. (1994). Electronic nose detects the nasty niffs. New Scientist, 141(1911), p. 20.
Einstein, A. (1950). Out of my later years. NY: Philosophical Library, Inc.
Frude, N. (1983). The intimate machine: Close encounters with computers and robots. New York, NY: New American Library.
IPAT, Inc. (Institute for Personality and Ability Testing, Inc.), P.O. Box 188, Champaign, IL 61820.
Johnson, G. J. (1994). Of metaphor and the difficulty of computer discourse. Communications of the ACM, 37(12), pp. 97-102.
Karson, S., & O'Dell, J. W. (1976). A guide to the clinical use of the 16PF. Champaign, IL: IPAT, Inc.
Kurzweil, R. (1990). The age of intelligent machines. Cambridge, MA: MIT Press.
Laurel, B. (1990). Interface agents: Metaphors with character. In B. Laurel (Ed.), The art of human-computer interface design (pp. 355-365). Reading, MA: Addison-Wesley Publishing Co., Inc.
Marcus, A. (1993). Metaphor design and cultural diversity in advanced user interfaces. In M. J. Smith & G. Salvendy (Eds.), Human-computer interaction: Applications and case studies, Proceedings of the fifth international conference on human-computer interaction (HCI International 1993), Orlando, FL, August 1993, Vol. 1, pp. 469-474.
Shneiderman, B. (1993). A nonanthropomorphic style guide: Overcoming the Humpty Dumpty syndrome. In B. Shneiderman (Ed.), Sparks of innovation in human-computer interaction (pp. 331-335). Norwood, NJ: Ablex Publishing Company.
Shneiderman, B. (1995). Perspectives: Looking for the bright side of user interface agents. Interactions, 2(1), pp. 13-15.
Webster's new collegiate dictionary. (1979). Springfield, MA: G. & C. Merriam Co.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Using the template model to analyse interface specifications

C. R. Roast and J. I. Siddiqi
Computing Research Centre, School of Computing and Management Science, Sheffield Hallam University, Sheffield, S11 8HD, UK.

This paper describes the application of a formal modelling technique within human-computer interface design. The model described is termed the template model; it is a system-based model which explicitly identifies system abstractions (known as template abstractions) that have a defined relation to intended tasks and users' capabilities [4]. Using template abstractions it is possible to express and analyse general interface usability properties within a formal framework. We describe the application of the template model in the analysis of a wysiwyg-style word processor. In this way we are able to demonstrate the potential role of the model and the impact of formal analysis with a familiar yet non-trivial application. In the case of the word processor analysed, we are able to characterise system constraints that determine what user tasks are supported. In general, we argue that employing the template model helps explicate assumptions about interface requirements.

1. INTRODUCTION

Within HCI research, models of interaction are based on two distinct perspectives, of
either the users or the system, which offer quite different benefits ([1]). User models are conventionally used to assess the ease of use of a system, that is, the demand it makes upon a user's capabilities. System models, by contrast, focus upon the expression and comprehension of system behaviour and the role of these in the design process. One consequence of this is that usability requirements are often expressed solely in terms of users, and are not explicitly associated with system-based requirements. This complicates the design process and overburdens the role of iteration in interface design. In this paper we introduce and employ a novel model of interaction termed the template model, which introduces user modelling concepts into system modelling, thereby enabling usability requirements to be expressed as system properties [4]. The model is applied to analyse a wysiwyg-style word processor; this demonstrates its potential role in explicating assumptions about interface requirements. The analysis shows that the consequences of a formal analysis can help determine system constraints and the user tasks supported.

2. THE TEMPLATE MODEL
The template model is an extension of a basic state based model of interaction proposed by Harrison and Dix [2] in that it provides a deterministic and synchronous account of
interaction that is adequate for modelling stand-alone interactive systems. The basic state-based model is defined by the tuple (S, s0, D, v, K), in which: S is a set of system states; s0 ∈ S identifies an initial state; D is a set of system outputs; v : S → D determines the output for any state; and K is an input alphabet. Each input of the system k ∈ K is assumed to identify a function over states; thus we have K ⊆ (S → S).
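As a minimal illustration of this tuple (ours, not from the paper), inputs can be represented directly as state-transforming functions and an input sequence replayed from the initial state:

```python
from typing import Callable, Generic, Iterable, TypeVar

S = TypeVar("S")   # system states
D = TypeVar("D")   # system outputs

class StateModel(Generic[S, D]):
    """Basic state-based model (S, s0, D, v, K); each input k is in S -> S."""
    def __init__(self, s0: S, v: Callable[[S], D]):
        self.state = s0
        self.v = v                                 # output function v : S -> D

    def run(self, inputs: Iterable[Callable[[S], S]]) -> D:
        for k in inputs:                           # apply each input in turn
            self.state = k(self.state)
        return self.v(self.state)                  # output for the final state

# Hypothetical usage: a page counter whose display is its value as text.
m = StateModel(1, v=lambda s: f"Pages: {s}")
print(m.run([lambda s: s + 1, lambda s: s + 1]))   # Pages: 3
```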
The template model explicitly accommodates the notion that certain system abstractions are germane to successful system use; these abstractions are termed templates. The use of template abstractions focuses attention upon users' intended tasks and perceptual capabilities, and the extent to which systems can support them. Being based upon system abstractions, the template model has the potential to be employed constructively within system development and avoids the complexity of having to speculate about users' psychological and cognitive activities. Two types of template abstraction are used, termed results and display templates. Results are system abstractions required by users in order to determine the successful completion, or otherwise, of their tasks. For example, 'the document' within a word processor is the focus of many user tasks, and thus may be identified as a result. Since users perform a variety of tasks and sub-tasks, the system will be required to provide different and appropriate abstractions of the system state, each of which may serve as a result. By contrast, display templates are abstractions of the system output which denote perceived display features such as menus, icons, cursors, etc. independently of details not affecting their perceived role ([3]). We model these components as a set of results (R), each of which is a function from states to a range of values particular to that element. Thus, for a result r ∈ R we shall denote its range by U_r. A result r ∈ R is a function upon states which extracts information necessary for users to determine if a task (or sub-task) is complete: r : S → U_r. To model users' perceptions as system abstractions we shall identify a set of display templates Dt. As with results, we shall treat each display template dt as a function with a range of particular type U_dt. One distinct feature of the formal characterisation of display templates is that they are taken to be partial functions. A display template dt ∈ Dt is a partial function upon displays which extracts details that can be perceived by users as potential sources of information: dt : D ⇸ U_dt. Modelling display templates as partial functions accommodates the possible absence of whatever feature a display template extracts. For instance, if a display template extracts a certain form of icon, the icon not being displayed is interpreted as the display template being undefined. The template model can be used to express general interactive properties that support ease of use. Roast and Harrison [5] give the derivation of two properties, namely output correctness and structural consistency. In this paper we focus upon system behaviour that enables the user to determine information held within the system state: we state, illustrate and apply the property of output correctness. Output correctness requires that each value a result may take is associated with a unique display template value, such that whenever the display template has that value the result has the corresponding value. This characterises many informal notions familiar within human computer interaction, such as
wysiwyg. However, in order to accommodate situations where the display templates may be undefined, we make use of the notions of weak and strong equality to arrive at the definition below. Strong equality (written =_s) represents equality only between defined display template values, whereas weak equality (written =_w) holds when the display template values are the same or either value is undefined. For a template model, the display template dt and the result r are related in an output correct manner, written OC(dt, r), iff:
∀ s1, s2 ∈ S :

    ( r(s1) = r(s2)  ⇒  dt(v(s1)) =_w dt(v(s2)) )  ∧  ( dt(v(s1)) =_s dt(v(s2))  ⇒  r(s1) = r(s2) )        (1)
The conditions for output correctness require that: if the result value is the same in any two states, then the display template in those two states has the same value (or either is undefined); and, if the display templates are defined and equal in any two displays, then the result values in the underlying states are the same. For example, with a conventional word processor a variety of tasks are dependent upon the number of pages a document has. Thus, the number of pages represented within the system state may be treated as a result, which we shall term pageCount. In addition, a visual indicator of the number of pages in a document can be modelled as a display template, termed pageDisplay. For instance, a particular instance of pageDisplay may be considered to 'extract' the numerals framed and labelled in a particular manner, say, "Pages: 14". To ensure that the display template serves as a reliable indicator of the result we simply assert OC(pageDisplay, pageCount); this would ensure that whenever the frame and label were displayed, the numerals shown would be in one-to-one correspondence with the actual page count (pageCount). In this way users engaged in tasks dependent upon a document's number of pages would be supported by the appropriate template abstractions and system behaviour.
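For a finite state space, output correctness can be checked exhaustively. The following sketch (ours; the paper works entirely within the formal model) encodes definition (1), with an undefined display template value represented as None:

```python
from itertools import product

def output_correct(states, v, dt, r):
    """Check OC(dt, r) over all state pairs; dt returns None when undefined."""
    for s1, s2 in product(states, repeat=2):
        d1, d2 = dt(v(s1)), dt(v(s2))
        weak = d1 is None or d2 is None or d1 == d2              # =_w
        strong = d1 is not None and d2 is not None and d1 == d2  # =_s
        if r(s1) == r(s2) and not weak:
            return False
        if strong and r(s1) != r(s2):
            return False
    return True

# Hypothetical usage with the page-count example: states are page counts and
# the display shows them verbatim, so OC(pageDisplay, pageCount) holds.
states = range(1, 100)
print(output_correct(states, v=lambda s: s,
                     dt=lambda d: f"Pages: {d}", r=lambda s: s))   # True
```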
3. APPLYING THE MODEL

To illustrate the impact of template properties such as output correctness, we have developed an abstract specification of a wysiwyg-style word processor, expressed in the Z specification notation [7]. Working from the abstract specification, the impact of usability requirements expressed as instances of output correctness can be assessed. Our abstract model of a word processor introduces a framework where a document is viewed as including 'printable' and 'control' characters. The image of a document is defined in terms of positioning the bitmaps of printable characters on an output medium, using styles and fonts defined by 'control' characters within the document. The imaging of a document is defined for two resolutions corresponding to the VDU output and the printed output. Given this framework we are able to consider two informal notions of wysiwyg that represent alternative instances of output correctness, each of which supports particular tasks. First, print correctness ensures that there is a perceivable correspondence between the VDU output of the word processor and what appears on the printed page. This correspondence supports a variety of tasks, in particular those where the physical layout of the printed document is important. Second, document correctness concerns the correspondence between the VDU output and the underlying document content. In general, editing tasks demand that this form of correspondence is supported. Figure 1 illustrates the alternative views of the document suggested by print correctness and document correctness.
Figure 1. Alternative document views: the document itself, the displayed document and the printed document.
In order to express print correctness and document correctness formally, and analyse their impact upon the word processor, it is necessary to make certain assumptions about the nature of user perception. In particular, we model two forms of user perception with two relations, ⊩ and ⊩_s. Briefly: we write b ⊩ f to represent that the bitmap b is seen by users as including the visual feature f, and b ⊩_s s represents that the bitmap b is seen as displaying the character sequence s. From this minimal view of perceptual properties we are able to express the notions of print correctness and document correctness formally and analyse the system's adherence to them. Print correctness requires that for any visual feature f, VDU output b1 and any printer output b2, we have:
OC(b1 ⊩ f, b2 ⊩ f)        (2)

(where b1 and b2 are bitmaps). Document correctness requires that for any string of characters s, VDU output b and document doc:
OC(b ⊩_s s, s in doc)        (3)

(where b is a bitmap, and "s in doc" represents that s is a subsequence within doc).
Since the word processor is modelled in generic terms, the requirements of print and document correctness can be interpreted as constraints upon the word processor model. The consequences of these constraints represent conditions which can focus further analysis and development. A full account of the specification and its analysis can be found in [6].

4. OUTCOMES

The outcome of the analysis identifies general system constraints that satisfy the usability requirements expressed as instances of output correctness. The potential for a system to adhere to these constraints focuses attention upon the exact nature of the requirements expressed. Difficulties with satisfying such constraints can be interpreted as identifying the limitations of any eventual system's success, or as focusing the re-examination of the original assumptions. In the case of the word processor, the constraints found have familiar counterexamples within commercial word processing packages; we propose that these counterexamples illustrate potential usability problems. The analysis of print correctness results in two constraints upon the word processor model developed: (i) the perceived features of any character bitmap at display resolution should also be present for the same character at print resolution; (ii) the relative angle and distance between any two characters in an image at display resolution should be preserved at print resolution. These constraints do not prevent two characters (in a certain font and style) from appearing identical; they require that whatever perceived features characters have are preserved at alternative resolutions. Counterexamples to the print correctness requirement can often be found in commercial word processing packages. For example: it is common for the numeral '1' and the letter 'l', rendered identically on the display, to appear distinct on the printed page; and the spacing of fonts is often not preserved at alternative resolutions (see Figure 2). The significance of potential contradictions of this sort will depend upon the exact model of perception represented by ⊩, and the expected nature of users' tasks.
Figure 2. The same document at different resolutions may not preserve relative angles between characters.
For document correctness the analysis yields the following constraints: (i) if a printable bitmap is perceived as a character, then that bitmap should be generated by the same character; (ii) any displayed bitmap which is perceived as two adjacent characters should be generated by two adjacent printable characters in the document. These conditions prevent any character bitmap from being perceived as another character or sequence of characters. Again, contradictions are common: the double quote character '"' is often rendered on the display as though it were two single quote characters. A more dramatic
contradiction arises if we consider control character sequences that are ineffectual. For instance, a document containing "...a, boldOn, boldOff, b..." may appear the same as "...a, b...". For each of the counterexamples of output correctness discussed above we can ask a number of questions that would motivate a further analysis of the requirements for the interactive system being developed. First, we can question whether or not the contradictory situation is likely to arise. For example, is it likely that the control characters boldOn and boldOff would be juxtaposed in a document? Second, we can examine the significance of the contradictory situation upon successful use. For example, we could analyse whether expected users were aware of the possible confusion of '1' and 'l'.

5. CONCLUSIONS

Our use of the template model and its associated properties illustrates that usability requirements can be formally analysed in a system-based framework. Given the familiarity of the domain analysed, one might argue that notions of "correctness" are well understood, but our analysis has revealed that such notions make complex demands upon successful interaction. In particular, usability requirements may be refined and explicated early in development through the use of the template model and properties such as output correctness, expressed in terms of system properties, thereby enabling usability requirements to be an integral part of the system and its subsequent development.

REFERENCES
1. P. J. Barnard and M. D. Harrison. Towards a framework for modelling human computer interactions. In J. Gornostaev, editor, Proceedings International Conference on HCI, EWHCI'92, pages 189-196. Moscow: ICSTI, 1992.
2. M. D. Harrison and A. J. Dix. A state model of direct manipulation. In M. D. Harrison and H. W. Thimbleby, editors, Formal Methods in Human Computer Interaction, pages 129-151. Cambridge University Press, 1990.
3. M. D. Harrison, C. R. Roast, and P. C. Wright. Complementary methods for the iterative design of interactive systems. In G. Salvendy and M. J. Smith, editors, Designing and Using Human-Computer Interfaces and Knowledge Based Systems, pages 651-658. Elsevier Scientific, 1989.
4. C. R. Roast. Executing Models in Human Computer Interaction. PhD thesis, Department of Computer Science, University of York, 1993.
5. C. R. Roast and M. D. Harrison. User centred system design using the template model. In F. Paternò, editor, Proceedings, Eurographics Workshop on the Design, Specification, Verification of Interactive Systems, Bocca di Magra, Italy, Eurographics Seminar Series, pages 381-392. Springer-Verlag, 1995.
6. C. R. Roast and J. I. Siddiqi. A formal analysis of an interface specification using the template model. Technical report, in preparation, 1995.
7. J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall International, 1988.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Task model - system model: towards a unifying formalism

Ph. A. Palanque (a,b), R. Bastide (a,b), V. Senges (a) 1

(a) L.I.S., Université Toulouse I, Place Anatole France, 31042 Toulouse Cedex, France. {palanque, bastide, senges}@cict.fr. Tel. +33 61 63 35 88 - Fax +33 61 63 37 98
(b) CENA, Centre d'Études de la Navigation Aérienne, 7 avenue Edouard Belin, 31055 Toulouse Cedex, France.
1. INTRODUCTION

The ultimate goal of many current approaches in HCI design is to enhance the quality of interactive systems, meaning to produce systems that are both usable and reliable. To reach that double-ended goal, design practices must include in their scope the modeling of the two actors of HCI: the user and the system. The goal of usability is tackled by the integration of the user's point of view at several stages of the design. Several notations have been devised to describe user tasks, such as UAN [1], GOMS [2], MAD [3], and TAG [4]. Those formalisms are generally easy to use and to understand, in the hope that they can be directly validated or modified by the users themselves. The reverse side of that ease of use is that those formalisms often lack a solid formal foundation, and thus may present incompleteness or ambiguities that hinder their seamless integration in the design of the interactive system itself. In the field of software engineering, a promising approach to reach the goal of reliability is the use of formal methods. The use of a formal notation which is both mathematically analyzable and directly executable is of great help in the design of an interactive system, allowing for an "a priori" validation of correctness. This is especially true since modern interactive systems are in general very difficult to test, due to their graphical and event-driven nature, so their reliability cannot generally be asserted by "a posteriori" testing. The authors of this paper have previously devised a formal approach for the design of interactive systems [5]. This approach, called ICO (for Interactive Cooperative Objects), uses high-level Petri nets as a formal model for the dynamics of the system, together with the structuring constructs of the object-oriented approach. The goal of this paper is to provide a gateway between task models defined with the UAN formalism and system models defined by ICO. This is accomplished by giving formal Petri net equivalences for every UAN construct. We begin by briefly summarizing the main features of the two formalisms (UAN and Petri nets). Then we describe how each of the nine constructs of UAN can be described in terms of Petri nets. Finally, we show on a simple example how those transformations are performed. In the conclusion we present the benefits of making a bridge between users' task models and a model of the system.

1 The authors would like to thank the HCI group of the Department of Computer Science of the University of York (U.K.), where Ph. Palanque was a visiting researcher during the development of this paper.
2. PRINCIPLES OF THE TWO FORMALISMS
2.1. UAN Principles

The UAN formalism is a task-oriented and user-centered notation [6]. Tasks are described at two levels:

Action level: describes the primary user actions and their corresponding interface feedback. Those actions correspond to the physical actions performed by the user when he/she interacts with the devices (e.g., depress mouse button). This level is related to implementation and allows designers to have a complete view of all the possible actions and their corresponding impact on the user interface. Its representation in tabular form improves readability and gives a very clear distinction between inputs and outputs of the interactive system. The information described at this level is of little interest with regard to the high-level structuring of the tasks, which is the aim of this paper. To actually take this kind of information into account, a more powerful formalism than basic Petri nets is needed. We have already proposed such a formalism [5] and are currently working on its merging with the action level, but this description is beyond the scope of this paper.

Task level: manipulates task names and temporal relations between them. It corresponds to a high-level description of the tasks. It usually takes into account the descriptions of the Action level, and their inter-relations are described in [6]. There are nine constructs for the description of the temporal relations between tasks: sequence, iteration, choice, repeating choice, order independence, interruptibility, interleavability, concurrency, and intervals-waiting. These constructs are described in Section 3.

2.2. Petri nets: principles

Petri nets [7] are a mathematically founded formalism designed for the modeling of concurrent systems. When modeling with Petri nets, a system is described in terms of state variables (called places, depicted as ellipses) and state-changing operators (called transitions, depicted as rectangles), connected by arcs. The state of the system is given by the marking of the net, which is a distribution of tokens in the net's places. State changes result from the firing of transitions, yielding a new distribution of tokens. Transition firing involves two steps: (1) tokens are removed from every input place, and (2) new tokens are deposited in every output place. A transition is enabled to fire when all of its input places contain tokens. Figure 1 depicts a small Petri net which illustrates the simple concept of sequences of actions. When a variable n labels an arc between a place and a transition, it means that the place has to hold at least n tokens in order for the transition to be fireable; when the transition fires, n tokens are removed from the place (cf. Figure 3). A variable n may also label an arc from a transition to a place; in that case, when the transition is fired, n tokens are deposited in the place. Using Petri nets it is possible to include temporal aspects in models. Sifakis [8] proposed that each place may have a duration, meaning that a token must remain in a place a certain amount of time before it can be used by the occurrence of a transition. This amount of time is described between brackets next to the corresponding place (cf. Figure 9). Petri net theory offers a set of structuring mechanisms for describing complex systems [9]. Among them, hierarchical refinement is made possible using substitution nodes, which are places or transitions related to a sub-Petri net.
For instance, in order to model complex behaviors, a transition can describe high-level actions, and a sub-Petri net can be associated with this transition in order to describe its behavior precisely without enlarging the Petri net modeling the system at a high level. However, in order for the Petri net to be rebuildable (i.e., so that a single model can be built from the main Petri net and its sub-Petri nets) and analyzable (i.e., so that the analysis techniques provided by Petri net theory can be applied to the model of the system), each sub-Petri net is required to have at least one transition with no input place (called Begin) and one transition with no output place (called End). Of course, this refinement can be used recursively in order to describe hierarchically complex behaviors.
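As a minimal illustration of these definitions (ours, not from the paper), a place/transition net can be represented as a marking plus input/output place lists, with the two-step firing rule described above:

```python
class PetriNet:
    """Minimal place/transition net: the marking maps places to token counts."""
    def __init__(self, marking):
        self.marking = dict(marking)
        self.transitions = {}              # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= 1 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p in inputs:                   # step (1): remove tokens from inputs
            self.marking[p] -= 1
        for p in outputs:                  # step (2): deposit tokens in outputs
            self.marking[p] = self.marking.get(p, 0) + 1

# A sequence A ; B: firing A enables B (cf. the Sequence construct in Section 3).
net = PetriNet({"p0": 1, "p1": 0, "p2": 0})
net.add_transition("A", ["p0"], ["p1"])
net.add_transition("B", ["p1"], ["p2"])
net.fire("A"); net.fire("B")
print(net.marking)   # {'p0': 0, 'p1': 0, 'p2': 1}
```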
3. FROM A UAN MODEL TO A PETRI NET

In this section we show how every UAN construct of the task level is represented by a sub-Petri net. The global task is thus represented by a Petri net featuring substitution nodes (precisely, transitions called macro-transitions).
3.1. Description of the constructs

• Sequence: There is a sequence relation between tasks when a task has to be completely performed immediately after another task (see Figure 1).
• Choice: There is a choice relation between tasks when the user has to choose (equally) among these tasks and perform one of them (see Figure 2).
Figure 1. Sequence

Figure 2. Choice
Figure 3. Iteration
• Iteration: There is an iteration relation if the task can be performed zero or more times (*), has to be performed exactly n times (n), or is performed one or more times (+) (see Figure 3).
• Repeating Choice: This is a combination of the two previous temporal relations (iteration and choice) (see Figure 4). The left part of the figure describes the repetition of exactly n choices, while the right part describes zero or more choices. For space reasons, the sub-Petri net corresponding to one or more choices is not presented here.
Figure 4. Repeating choice

• Order independence: There is an order independence relation between tasks if the user performs all the tasks without any constraint on ordering (see Figure 5).
Figure 5. Order independence

• Interruptibility: There is an interruptibility relation between tasks if one task (A) can be interrupted by another task (B), and this latter task (B) has to be entirely performed before returning to the interrupted task (A) for completion (see Figure 6).
Figure 6. Interruptibility
Figure 7. Interleavability
• Interleavability: There is an interleavability relation between tasks if each task can interrupt the other (see Figure 7).
• Concurrency: There is a concurrency relation between tasks if some (two or more) of these tasks can be performed at the same time (see Figure 8).
Figure 8. Concurrency

Figure 9. Intervals - Waiting (UAN: A (t > n seconds) B, modelled with a timed place [0, n])
• Intervals - Waiting: There is a waiting (or interval) relation between tasks if one task has to be performed before another task, with a waiting period of time between them (see Figure 9).
3.2. Combination of constructs

As stated in Section 2.2, our translation scheme is of a hierarchical nature, allowing several macro-transitions to be combined to form a more complex net. Figure 10, for example, presents the order-independence combination of two sequences of two subtasks. The merging of a sub-Petri net with a transition T in the upper net is done by:
- connecting arcs from the input places of transition T to the transitions without incoming arcs in the sub-Petri net (Begin transitions),
- connecting arcs from the output places of transition T to the transitions without outgoing arcs in the sub-Petri net (End transitions).
Figure 10. Combination of the constructs: the order-independence combination (A1 A2) & (B1 B2) in UAN and its Petri net translation.
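Reusing the PetriNet sketch given with the Section 2.2 illustration, the choice construct A | B, for instance, translates to two transitions competing for a single token, so that performing either task excludes the other. This is our illustration, not the paper's macro-transition machinery:

```python
# Choice A | B (cf. Figure 2): both transitions share the input place,
# so firing either one consumes the token and disables the other.
choice = PetriNet({"start": 1, "done": 0})
choice.add_transition("A", ["start"], ["done"])
choice.add_transition("B", ["start"], ["done"])

choice.fire("A")
print(choice.enabled("B"))   # False: choosing A excludes B
```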
4. CONCLUSION

In this paper we have proposed a way of translating a UAN description into a Petri net one. For every UAN construct, we have proposed a Petri net whose interpretation is the same as that of the UAN description. Lastly, we have shown how those sub-Petri nets can be merged in order to model complex task models. This approach provides significant advantages:
• the ambiguities of the task models are resolved during the translation process. This process, while automated to a certain extent, is interactive, as it asks the designers of the task models to make their specification more precise if needed. Afterwards, the task models can be mathematically validated by analysing the Petri net model;
• as we use Petri nets for modeling the interactive application, cooperation between the model of the tasks and the model of the system can be mathematically checked, in order to ensure before implementation that the system will be able to perform all the user requests included in the task models;
• as the model of the system is embedded at run time, it can be used to provide contextual help about the behavior of the system, as described in [10]. When the task model is also embedded within the same formalism, the help may be given not only with respect to the system behavior but also with respect to the user's task and goal.
REFERENCES
1. D. Hix, H. R. Hartson, Developing User Interfaces: Ensuring Usability Through Product & Process, John Wiley & Sons, Inc., 1993.
2. S. Card, T. P. Moran, A. Newell, The Psychology of Human-Computer Interaction, Hillsdale, NJ: Erlbaum, 1983.
3. D. L. Scapin, C. Pierret-Golbreich, Towards a method for task description: MAD, Work With Display Units '89, Amsterdam, Elsevier, 1989.
4. S. J. Payne, T. R. G. Green, Task Action Grammar: A Model of the Mental Representation of Task Languages, Human-Computer Interaction, 2, pp. 93-133, 1986.
5. P. A. Palanque, R. Bastide, Petri net based Design of User-driven Interfaces Using the Interactive Cooperative Object Formalism, in Proceedings of the 1st Eurographics Workshop on Design, Specification and Verification of Interactive Systems, Paternò, editor, Carrara, Italy, 8-10 June 1994.
6. H. R. Hartson, P. D. Gray, Temporal Aspects of Tasks in the User Action Notation, Human-Computer Interaction, Vol. 7, pp. 1-45, 1992.
7. J. L. Peterson, Petri Net Theory and the Modelling of Systems, Prentice Hall, 1981.
8. J. Sifakis, Use of Petri nets for performance evaluation, in Measuring, Modelling and Evaluating Computer Systems, H. Beilner and E. Gelenbe (Eds.), North Holland, 1977.
9. P. Huber, K. Jensen, R. Shapiro, Hierarchies in coloured Petri nets, 10th International Conference on Application and Theory of Petri Nets, Bonn, 1989.
10. P. Palanque, R. Bastide, L. Dourte, Contextual Help for Free with Formal Dialogue Design, HCI International '93, 5th International Conference on Human-Computer Interaction, North Holland, Orlando, Florida (USA), 8-15 August 1993.
III.17 Modeling 2
Scenario based specification of interaction metaphors

C. Stephanidis, C. Karagiannidis and A. Koumpis

Institute of Computer Science, Foundation for Research and Technology - Hellas, Science and Technology Park of Crete, P.O. Box 1385, GR-71110 Heraklion, Crete, Greece
A scenario-based formal specification of interaction metaphors is proposed, which assists designers in developing interaction dialogues at a high level of abstraction. A formalism has been developed which provides the means for describing both application and metaphor semantics, as well as a descriptive model for the specification of interaction scenarios. The proposed approach has been applied to the construction of application scenarios in the domain of special education (i.e. supporting students with learning difficulties), so that different user interaction requirements and cognitive abilities are addressed.*

1. INTRODUCTION

Human-Computer Interaction research has paid considerable attention to interaction metaphors, recognising their impact on achieving various usability goals for interactive computer-based applications [1], [2]. In this paper, a method is described which enables the 'high-level', scenario-based specification of interaction metaphors, independently of the specifics of the target development environment. Instantiations of the model can be elaborated as prototypes through the use of storyboard, video, and rapid prototyping tools [3]. The work reported is based on a proposed formalism that captures application and metaphor semantics. This formalism serves as the basis for the development of a descriptive model for the specification and implementation of interaction metaphors, and complies with the scenario-based design approach, taking typical user activities into account in the development process [3].

2. FORMALISM

The formalism captures application and metaphor semantics. In the context of this formalism, implemented interactive applications are abstracted as the 'combination' of the behavioural aspects of the functional core of an application with the presentational characteristics attributed to a specific metaphor. Applications are described by means of entities and actions. Each entity has a set of possible states, and a set of possible transitions between its states, which determine its behaviour. Each action, on the other hand, is characterised by the entity on which it is performed, and the state transition it causes.

* Part of this work has been carried out in the context of the HORIZON Project of the Region of Crete, Greece, which was partially funded by the Commission of the European Union (DG V).
Following this abstraction, we define relationships between applications (e.g. equivalence, sub- and super-application), as well as operations on applications (e.g. composition). These definitions are based on the relationships between the application-specific entity and action sets. Metaphors are also described by means of entities and actions. Entities are divided into objects, on which actions are performed, and actors, which perform the actions. Each entity has a set of possible appearances, and a set of possible transitions between its appearances, which determine its presentation. Each action, on the other hand, is characterised by the actor that performs it and the object on which it is performed, as well as the appearance transition that is caused by its activation. In this context, each metaphor is defined by a triplet [A, O, P], containing its actors, objects and presentation, respectively. For example, the well-known desktop metaphor may be defined by the sets:
• Actors, relating to the user (or users in the case of a CSCW application); for a single user, there might be no presentation, while for multiple users, there might be appearance states such as visible, non-visible, etc., and transitions between their appearances such as user_i.visible → user_i.non-visible, etc.;
• Objects, relating to icons, windows, window managers, trashbins, etc.; the window object has possible appearance states defined by its size, position, etc., and possible appearance transitions such as window.size1 → window.size2;
• Presentations, relating to the transitions between the appearance states of the objects and actors, caused by the actions; for instance, such a transition could be window.position1 → window.position2.

As in the case of applications, we define relationships and operations for metaphors. In particular, two metaphors M1 and M2, defined by [A1, O1, P1] and [A2, O2, P2] respectively, can be related in one of the following ways:

M1 ≡ M2 (M1 equivalent to M2) ⟺ ∃ fA : A1 → A2, fO : O1 → O2, where fA, fO are bijective functions, such that ∀ a.pi, a.pj ∈ A1, if (a.pi → a.pj) ∈ P1 then (fA(a.pi) → fA(a.pj)) ∈ P2, and ∀ o.pk, o.pl ∈ O1, if (o.pk → o.pl) ∈ P1 then (fO(o.pk) → fO(o.pl)) ∈ P2

M1 ⊇ M2 (M1 super-metaphor of M2) ⟺ ∃ fA : A1 → A2, fO : O1 → O2, where fA, fO are surjective functions, such that ∀ a.pi, a.pj ∈ A1, if (a.pi → a.pj) ∈ P1 then (fA(a.pi) → fA(a.pj)) ∈ P2, and ∀ o.pk, o.pl ∈ O1, if (o.pk → o.pl) ∈ P1 then (fO(o.pk) → fO(o.pl)) ∈ P2

M1 ⊂ M2 (M1 sub-metaphor of M2) ⟺ M2 ⊇ M1

where a.p and o.p describe possible presentations of actor a and object o respectively. The equivalence relation segments the metaphor space into classes of equivalent metaphors. Operations on metaphors are also defined (where M3 is defined by [A3, O3, P3]):

M3 = M1 ∪ M2 (composition) ⟺ A3 = A1 ∪ A2, O3 = O1 ∪ O2, P3 = P1 ∪ P2
M3 = M1 ∩ M2 (common part) ⟺ A3 = A1 ∩ A2, O3 = O1 ∩ O2, P3 = P1 ∩ P2
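To make the triplet operations concrete, here is a small Python sketch under our own assumptions: a metaphor is modelled as frozen sets of actor appearances, object appearances, and presentation transitions, and all names are illustrative rather than taken from the paper:

```python
# Hypothetical encoding of a metaphor as a triplet [A, O, P]:
# A, O are sets of appearance states; P is a set of (state, state) transitions.

from typing import NamedTuple, FrozenSet, Tuple

Transition = Tuple[str, str]

class Metaphor(NamedTuple):
    A: FrozenSet[str]         # actor appearance states
    O: FrozenSet[str]         # object appearance states
    P: FrozenSet[Transition]  # presentation: appearance transitions

def compose(m1: Metaphor, m2: Metaphor) -> Metaphor:
    """M3 = M1 ∪ M2: componentwise union of the triplets."""
    return Metaphor(m1.A | m2.A, m1.O | m2.O, m1.P | m2.P)

def common_part(m1: Metaphor, m2: Metaphor) -> Metaphor:
    """M3 = M1 ∩ M2: componentwise intersection of the triplets."""
    return Metaphor(m1.A & m2.A, m1.O & m2.O, m1.P & m2.P)

# Example: a fragment of the desktop metaphor.
desktop = Metaphor(
    A=frozenset({"user.visible", "user.non-visible"}),
    O=frozenset({"window.size1", "window.size2"}),
    P=frozenset({("window.size1", "window.size2"),
                 ("user.visible", "user.non-visible")}),
)
```

Because the operations are plain set unions and intersections, the properties mentioned in the text (e.g. commutativity) follow immediately from the corresponding set laws.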
These operations have the properties of the corresponding operations on sets (e.g. commutativity, associativity). Furthermore, we define relationships between applications and metaphors. A metaphor, in this context, can serve an application if there is a surjective function which maps its entity and action sets onto the entity and action sets of the application; in any other case, the metaphor serves the application only partially.
This surjective function maps the application semantics (behaviour) to the appearance attributes of a metaphor (presentation). In addition, as the equivalence relationships segment the application and metaphor spaces into subspaces of equivalent applications and metaphors, we can conclude that equivalent metaphors can serve (or serve only partially) equivalent applications. The formalism follows a structural approach [1] in describing metaphors, in the sense that metaphors are defined in terms of primitives and relations between primitives, in both the source (i.e. application) and target (i.e. metaphor) domains, and mapping functions are defined between the two domains, relating portions of the descriptions. Apart from enabling the purely theoretical construction of composite artificial structures, the formalism also serves as the basis for the development of a descriptive model for the specification of metaphors.

3. MODEL

Based on the above formalism, a descriptive model has been developed consisting of two submodels, namely the scenario specification and scenario implementation submodels (Figure 1). The proposed model is intended mainly for the provision of alternative metaphors of an interactive application, such as an information system that is to be used by a variety of users with different requirements, needs and abilities.
[Figure 1. The scenario specification and scenario implementation submodels and their relations. The diagram relates classes such as Entity, Actor, Activity, Interaction Technique, Interaction Object, User-driven and System-driven Actors, and Logical Devices, via isA (specialisation) relations and multivalued attribute links.]
3.1. Scenario specification submodel

The scenario specification submodel enables the specification of metaphors and can be treated independently of a computer-based environment. It provides a basis for the construction of metaphoric 'worlds' [4]. More specifically, theatrical plays, movies or even ordinary stories could, in principle, be described by means of the scenario specification submodel, without necessarily aiming at a computer-based implementation. The metaphoric worlds created by employing the scenario specification submodel follow a linguistic approach [5], since the model informally adopts a scheme whereby the primitive entity within a metaphor is the analogue of a sentence in a language, containing a subject, a verb and (optionally) an object. Of course, there might be more than one subject, verb or object in a sentence. A series of sentences makes up a paragraph, and a series of paragraphs makes up a story [6]. A typical sentence in the context of a specific metaphor includes:
• an entity as the equivalent of a subject in the linguistic paradigm,
• an activity as the equivalent of a verb in the linguistic paradigm,
• an entity as the equivalent of an object in the linguistic paradigm.

Logical condition classes are introduced in order to assist the process of specifying dialogue control and sequencing in the metaphor, and are specialised into four subclasses, namely:
• precondition, a set of predicates that should be true, or a set of conditions/constraints that should be satisfied, for the activity to be enabled;
• activation condition, a set of conditions that trigger an activity;
• termination condition, a set of conditions that stop the activity;
• postcondition, a set of predicates or conditions/constraints that should be satisfied after the termination of the activity.

A minimal code sketch of this sentence scheme is given after this section. The metaphor designer is able to customise the scenario specification submodel by specialising it for a specific domain. More specifically, the metaphor designer is encouraged to create subclasses of the existing classes in the model, with characteristics considered important. For instance, when applied in the domain of computer-based education, the actor class in the scenario specification submodel might be further specialised into two new subclasses, e.g. tutor_actor and student_actor, which, apart from the 'standard' (i.e. inherited) attributes of their class, might be annotated with new attributes reflecting the real-world properties of the domain-specific classes. This possibility of customisation (i.e. modularly modifying, adding or deleting classes of the model and their attributes) contributes towards the development of more complex or composite metaphors capable of conveying all those semantics that are considered by the metaphor designer to be important for the design procedure.
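The following Python sketch illustrates the sentence scheme and its logical condition classes. It is our own minimal reading of the submodel; the class and attribute names (Sentence, preconditions, etc.) are assumptions for illustration, not the paper's actual class definitions:

```python
# Hypothetical sketch of a metaphor 'sentence': subject entity, verb (activity),
# optional object entity, guarded by the four logical condition classes.

from dataclasses import dataclass, field
from typing import Callable, List, Optional

Condition = Callable[[], bool]  # a predicate over the current metaphor state

@dataclass
class Sentence:
    subject: str                  # entity playing the linguistic 'subject'
    activity: str                 # activity playing the linguistic 'verb'
    obj: Optional[str] = None     # entity playing the linguistic 'object'
    preconditions: List[Condition] = field(default_factory=list)
    activation_conditions: List[Condition] = field(default_factory=list)
    termination_conditions: List[Condition] = field(default_factory=list)
    postconditions: List[Condition] = field(default_factory=list)

    def enabled(self) -> bool:
        # The activity is enabled only when every precondition holds.
        return all(c() for c in self.preconditions)

    def should_start(self) -> bool:
        return self.enabled() and all(c() for c in self.activation_conditions)

    def should_stop(self) -> bool:
        return all(c() for c in self.termination_conditions)

# Example from a hypothetical tutoring metaphor: "tutor_actor praises student_actor".
praise = Sentence(subject="tutor_actor", activity="praise", obj="student_actor",
                  preconditions=[lambda: True])
assert praise.enabled()
```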
3.2. Scenario implementation submodel

Any metaphor already specified by means of the scenario specification submodel can be described in terms of the scenario implementation submodel. While the scenario specification submodel can be characterised as generic, the scenario implementation submodel is strictly limited to computer-based implementations of metaphors. The transition from the specification of a scenario to its implementation could potentially take place by means of an algorithm which implements the mappings between real-world entities and their attributes and the corresponding interface entities and attributes. This mapping could be accomplished according to an automated procedure, whereby specific entities of the specification submodel are instantiated in the implementation submodel as specific run-time entities, specific activities are realised by means of specific interaction techniques, and so on. For the last step of the scenario implementation, namely the selection and realisation of specific interface entities, the metaphor designer could be assisted by a user interface developer or an application engineer, so that they can decide, in cooperation, on the implementational details of the metaphor. The correspondence of classes between the two submodels is shown in Table 1; a sketch of this mapping follows the table. It should be noted that the scenario implementation classes are independent of the specificities of the various technology platforms. For instance, the Interaction Object class depicts abstract behaviours that can be assigned to interaction objects; in the present context, it might be thought of as referring to Interactors [7] or Abstract Interaction Objects [8].

Table 1: The correspondence between the two submodels

Scenario Specification           Scenario Implementation
Entity                           Run-time Entity
Object                           Interaction Object
Actor                            User or System Agent
User-driven Actor                User
System-driven Actor              System Agent
System-driven Actor or Object    System Entity
Activity                         Interaction Technique
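As a rough illustration of the automated transition, the correspondence of Table 1 can be read as a class-to-class mapping. The sketch below is a simplified assumption of how such a table-driven mapping might be encoded; the paper does not prescribe this representation:

```python
# Hypothetical table-driven mapping from specification-level classes to
# implementation-level classes (after Table 1).

SPEC_TO_IMPL = {
    "Entity": "RunTimeEntity",
    "Object": "InteractionObject",
    "Actor": "UserOrSystemAgent",
    "UserDrivenActor": "User",
    "SystemDrivenActor": "SystemAgent",
    "SystemDrivenActorOrObject": "SystemEntity",
    "Activity": "InteractionTechnique",
}

def instantiate(scenario_classes):
    """Map each specification-level class to its implementation-level class."""
    return [SPEC_TO_IMPL[c] for c in scenario_classes]

# e.g. a sentence's (subject, verb, object) triple becomes run-time classes:
print(instantiate(["UserDrivenActor", "Activity", "Object"]))
# ['User', 'InteractionTechnique', 'InteractionObject']
```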
4. APPLICATION OF THE MODEL

The proposed approach has been employed in the construction of applications in the domain of special education. More specifically, a multi-disciplinary group, consisting of educators, psychologists, computer scientists, and rehabilitation specialists, has utilised the model as a 'formal reference description' in order to develop metaphoric worlds that assist students with learning difficulties in acquiring elementary mathematical concepts. The following process was adopted:
• Initially, the multi-disciplinary group posed the educational objectives and goals to be satisfied by the educational application.
• By means of a stepwise refinement process, the overall objectives converged into a high-level interaction scenario.
• The high-level scenario was iteratively reformulated into a specific instantiation of the scenario specification submodel.
• The design group cooperatively arrived at the transition from the scenario specification to the scenario implementation.
• The resulting instantiation of the scenario implementation submodel has been implemented in two application development environments for evaluation purposes.
5. CONCLUSIONS AND FUTURE WORK

The utilisation of the proposed model in the above application has resulted in a significant reduction of the time spent by the multi-disciplinary group on the specification and implementation of scenarios. The main benefits of our approach can be summarised as follows:
• The proposed formalism facilitates the development of metaphors as a multi-disciplinary task, accomplished through the synergy and collaboration of experts from different fields (such as human factors experts, psychologists, interface designers, and artists) who need not be expert programmers.
• The transition from the specification to the implementation of a specific scenario is not unique; multiple implementations of the same specification may exist, i.e. the same scenario can be implemented using different presentation and interaction means (interaction techniques, media/modalities, devices), as well as different technology platforms, depending on the specific abilities, needs and preferences of the target user group under consideration.
• The proposed model facilitates the development of 'multi-user' metaphors, through the creation of multiple user-driven actors; hence it may be particularly suitable for Computer-Supported Cooperative Work (CSCW) applications.
• The proposed model may easily be expanded and specialised through the introduction of new subclasses and attributes, in order to address the requirements of specific application domains and/or user groups.

Specific aspects of the model, concerning the automated transition from the specification to the implementation of interaction metaphors, are currently under implementation.

REFERENCES
1. J.M. Carroll, R.L. Mack and W.A. Kellogg, Interface Metaphors and User Interface Design, in M. Helander (ed.), Handbook of Human-Computer Interaction, North-Holland, 67-85, 1990.
2. Proceedings of the 1994 FRIEND 21 International Symposium on Next Generation Human Interface, Meguro Gajoen, Tokyo, February 2-4, 1994.
3. J.M. Carroll, Making Use a Design Representation, Communications of the ACM, 37(12), 29-35, 1994.
4. B. Laurel, Computers as Theatre, Addison-Wesley, 1991.
5. J.R. Hobbs, Metaphor and Abduction, in A. Ortony, J. Slack and O. Stock (eds.), Communication from an Artificial Intelligence Perspective, NATO ASI Series, Springer-Verlag, 1992.
6. R.L. Campbell, Will the Real Scenario Please Stand Up?, SIGCHI Bulletin, 24(2), 1992.
7. B. Myers, Encapsulating Interactive Behaviours, CHI '89 Conference Proceedings on Human Factors in Computing Systems, 319-324, 1989.
8. J.M. Vanderdonckt and F. Bodart, Encapsulating Knowledge for Intelligent Automatic Interaction Objects Selection, INTERCHI '93 Conference Proceedings on Human Factors in Computing Systems, 424-429, 1993.
9. M.D. Harrison and A.J. Dix, Modelling the Relationship between State and Display in Interactive Systems, in P. Gorny and M.J. Tauber (eds.), Visualisation in Human-Computer Interaction, Springer-Verlag, 241-249, 1990.
Cocktail-Party Effect with Computational Auditory Scene Analysis: Preliminary Report

Hiroshi G. Okuno, Tomohiro Nakatani, and Takeshi Kawabata

NTT Basic Research Laboratories, Nippon Telegraph and Telephone, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-01, Japan

One of the important and interesting phenomena in sophisticated human communication is the cocktail party effect: even at a crowded party, one can attend to one conversation and then switch to another. To model it in a computer implementation, we need a mechanism for understanding general sounds, and Computational Auditory Scene Analysis (CASA) is a novel framework for manipulating sounds. We use it to model the cocktail party effect as follows: sound streams are first extracted from a mixture of sounds, and then some sound stream is selected by focusing attention on it. Because sound stream segregation is an essential primary process for the cocktail party effect, in this paper we present a multi-agent approach to sound stream segregation. The resulting system can segregate a man's voice stream, a woman's voice stream, and a noise stream from a mixture of these sounds.

1. INTRODUCTION

Looking and listening are more active than seeing and hearing [1]. Among the important and interesting phenomena in sophisticated human communication are (1) the cocktail-party effect, selectively attending to some conversation or sound source and then changing the focus of attention to another [2], and (2) the Prince Shotoku effect, listening to several things at the same time [3]. The latter effect is named for Prince Shotoku (574-622) of Japan, who is said to have been able to listen to seven people petitioning him at the same time. Although research on speech understanding has a rich history, it is still difficult to understand sounds under real-world conditions. In psychoacoustics, some efforts have been made to build a general framework for understanding sounds. This research area, called auditory scene analysis [4], has recently been explored by AI and computer science researchers trying to create a general representation of sounds in order to deal with more realistic acoustic environments and to integrate computational auditory frameworks into multimodal perception systems. This emerging research area is called Computational Auditory Scene Analysis (CASA) [5]. The main research problem in listening to multiple sound sources is sound stream segregation, which may yield clues to a model of the cocktail-party effect. In this paper, sound stream segregation is modeled by the multi-agent paradigm.
[Figure 1. Cocktail party effect modeled by Computational Auditory Scene Analysis: a multi-agent system segregates sound streams from a mixture of sounds, and a focus-of-attention mechanism, possibly guided by multimodal information, selects among them.]
2. COMPUTATIONAL AUDITORY SCENE ANALYSIS

Auditory scene analysis aims at an understanding of the acoustic events (or sources) that produce sounds. An acoustic event consists of sound streams (hereafter simply streams), each of which is a group of acoustic components that have consistent attributes. The process that segregates sound streams from a mixture of sounds is called sound stream segregation, and the cocktail party effect can be modeled by it (Figure 1). The segregation system extracts individual sound streams from a mixture of sounds, and the focus-of-attention mechanism then selects one stream. Such a selection may be affected by the sound streams themselves as well as by other kinds of information, such as visual cues. Consider the sound stream selected by the focus-of-attention mechanism and given to a spoken language processing (SLP) system. If, for example, a segregated sound stream is that of a closing door or of breaking glass, the SLP system may change the thread of discourse to a sudden happening by switching attention to the stream. This simple application of sound stream segregation is expected to make SLP robust and adaptive in a noisy environment, since it extracts a voiced speech stream from real-world nonvoice sounds. Many techniques for this have been proposed so far. Brown, for example, uses auditory maps in sound stream segregation [6,7], but his is an off-line (or batch) algorithm, in the sense that any part of the input is available to the algorithm at any time, and for many applications off-line algorithms are not suitable. Additionally, it is not easy to incorporate schema-based segregation and grouping of streams into such a system, because it does not support a mechanism for extending its capabilities. We adopted the multi-agent paradigm to model sound stream segregation, partly because the number of sound streams is neither fixed nor constant and partly because it allows us to design a more flexible and expandable system.

3. SOUND STREAM SEGREGATION BY THE MULTI-AGENT PARADIGM

The sound stream segregation system has four functions: determining when a stream appears, tracing the stream, detecting the end of the stream, and resolving interference between simultaneous streams.
[Figure 2. Residue-driven architecture for sound stream segregation: an event detector passes the residual input to a tracer generator and to the active tracers; each tracer (including a noise tracer) extracts information, outputs a stream fragment, and predicts its next input, which is fed back to the event detector.]
We have designed and implemented a multi-agent stream segregation system, MASS [8,9]. Since a number of sound attributes (such as onsets, pitch, and common modulation) are available for use in segregation, there are many possible kinds of MASS systems, differing in the attributes used. The modeling of the MASS systems is unique, however, and is called the residue-driven architecture. Each MASS system that extracts sound streams by using some sound attributes is called an agency. Agencies interact with each other to refine segregated streams or to replace other segregated streams with their own streams [2]. The residue-driven architecture uses three kinds of agents: an event detector, a tracer generator, and tracers (Figure 2). An agency based on this architecture extracts streams as follows:
1. The event detector subtracts the set of predicted inputs from the input and sends the residue to the tracer generator and the tracers.
2. If the residue exceeds some threshold value, the tracer generator searches for the values of the focused sound attributes. If it finds appropriate values of the attributes, it generates a tracer to trace those attributes. If it fails to find such values, it generates a noise tracer.
3. Each tracer extracts a stream fragment by tracing the attributes of the stream. It also composes a predicted next input by adjusting the segregated stream fragment to the next input, and sends this prediction to the event detector.

In other words, tracers are dynamically generated and terminated in response to the input. And because segregation is performed incrementally, on sufficiently fast hardware the system will be able to operate in real time. The first implementation of MASS uses only the harmonic structure of sound. The reasons are twofold: (1) a general representation of sound has not yet been proposed, and (2) we need a sound attribute that is as simple as possible, so that we can use it as a building block for manipulating more sophisticated attributes. The tracer generator is designed as a set of pitch-watcher agents; it is a kind of filter bank. Each pitch watcher has its own frequency region (about 25 Hz wide) and watches the residual input to see whether a new harmonic stream with its fundamental frequency in the watcher's region appears. The harmonic intensity E_t(ω) of the sound wave x_t(τ) at frame t is defined as
E_t(ω) = Σ_k ‖H_{t,k}(ω)‖²,

where

H_{t,k}(ω) = Σ_τ x_t(τ) · exp(−jkωτ),

τ is time, k is the index of harmonics, x_t(τ) is the residual input, and H_{t,k}(ω) is the sound component of the kth overtone.
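For concreteness, the harmonic intensity can be computed directly from these definitions. The NumPy sketch below is our own illustration; the frame length, the number of overtones, and treating ω as a normalized fundamental frequency in radians per sample are assumptions not fixed by the paper:

```python
# Sketch: harmonic intensity E_t(w) of one residual frame x (length N),
# following E_t(w) = sum_k |H_{t,k}(w)|^2, H_{t,k}(w) = sum_tau x[tau] e^{-j k w tau}.

import numpy as np

def harmonic_component(x: np.ndarray, k: int, w: float) -> complex:
    """H_{t,k}(w): correlation of the frame with the k-th overtone of w."""
    tau = np.arange(len(x))
    return np.sum(x * np.exp(-1j * k * w * tau))

def harmonic_intensity(x: np.ndarray, w: float, n_harmonics: int = 10) -> float:
    """E_t(w): summed squared magnitudes over the first n_harmonics overtones."""
    return sum(abs(harmonic_component(x, k, w)) ** 2
               for k in range(1, n_harmonics + 1))

# Example: a synthetic harmonic frame with fundamental 150 Hz at fs = 8 kHz.
fs, w0 = 8000, 2 * np.pi * 150 / 8000
t = np.arange(240)  # a 30-ms frame at 8 kHz
x = sum(np.cos(k * w0 * t) / k for k in (1, 2, 3))
print(harmonic_intensity(x, w0) > harmonic_intensity(x, 1.5 * w0))  # True
```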
For the sound consistency check, we use valid overtones for a harmonic stream and dormant periods for a noise stream. An overtone is defined as valid if its intensity is larger than a threshold value and the local time transition of the intensity can be approximated linearly. A period is defined as dormant if there are only non-harmonic sounds; the average spectrum intensity of the noise is calculated during the dormant period. We also use the valid harmonic intensity, E′_t(ω), defined as the sum of the ‖H_{t,k}(ω)‖ of the valid overtones. A pitch watcher is activated when the following conditions are satisfied: (1) E′_t(ω)/E_t(ω) > r (r = 0.1), and (2) there is a power peak near frequency ω in the residual input, where ω is the frequency that maximizes E′_t(ω) within the region. Since there may be more than one activated pitch watcher, the one that gives the maximum E_t(ω) is selected by the tracer generator and generates a tracer. If there is no active pitch watcher during a dormant period, the noise tracer is activated. When a harmonics tracer is generated, it gets the initial fundamental frequency from a pitch watcher, and at each residual input each tracer extracts the fundamental frequency that maximizes the valid harmonic intensity E′_t(ω). It then calculates the intensity and the phase of each overtone by evaluating the absolute value and the phase of H_{t,k}(ω). It creates a predicted next input in waveform by adjusting the phase of its overtones to the phase of the next input frame. It also recovers the actual input by adding its predicted input to the residual input before calculating the fundamental frequency. If there are no longer valid overtones, or if the intensity of the fundamental overtone drops below a threshold value, the tracer terminates itself. The noise tracer segregates the static noise stream by means of the average spectrum intensity [10]. It calculates the time average of the spectrum intensity of the residual input during the dormant period. The noise tracer creates a predicted next input in the form of the spectrum intensity and sends it to other agents.
[Figure 3. Fundamental frequency patterns of segregated sound streams: (a) without noise, (b) with a white noise. Axes: fundamental frequency (0-300 Hz) versus time in units of 7.5 ms.]
When a tracer receives a spectrum intensity, it estimates the intensities of its sound components at each frequency by subtracting the predicted values. The predicted next input of the noise tracer inhibits the generator from generating unnecessary tracers and makes the harmonics tracers robust against non-harmonic noise. The noise tracer calculates the average spectrum intensity over a long time range as well as over a short time range, and it terminates itself when the short-time-range average intensity falls below a threshold.
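To summarize section 3, the residue-driven control flow can be sketched as a simple per-frame loop. This is a schematic reading of the architecture under our own assumptions; the tracer interface (predict, trace, terminated, stream) and the residue threshold are placeholders, not the actual MASS implementation:

```python
# Schematic per-frame loop of the residue-driven architecture.
# Assumed shapes: frames are NumPy arrays; tracers expose predict()/trace().

import numpy as np

def segregate(frames, make_tracer, residue_threshold=1e-3):
    tracers, streams = [], []
    for frame in frames:
        # 1. Event detector: subtract the predictions of all live tracers.
        residue = frame - sum((t.predict() for t in tracers),
                              np.zeros_like(frame))
        # 2. Tracer generator: spawn a tracer if the residue is significant
        #    (a harmonic tracer if a pitch is found, otherwise a noise tracer).
        if np.mean(residue ** 2) > residue_threshold:
            new_tracer = make_tracer(residue)
            if new_tracer is not None:
                tracers.append(new_tracer)
                streams.append(new_tracer.stream)
        # 3. Each tracer extends its stream fragment and predicts its next input.
        for t in list(tracers):
            t.trace(residue)
            if t.terminated:  # e.g. no valid overtones remain
                tracers.remove(t)
    return streams
```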
4. EXPERIMENTS AND EVALUATIONS

The performance of the MASS system in segregating harmonic sounds has been evaluated with several benchmark sounds; here we present the results for only two benchmarks: a mixture of the voiced speech of a man and of a woman, both saying "ah-ee-oo-eh-oh" (the vowel sounds of Japanese), with and without a white noise. The two utterances share no common fundamental frequency, but fundamental frequencies of the man's speech coincided with overtones of the woman's speech; such coincidences may cause incorrect segregation. The signal-to-noise ratio of the white noise to the man's speech is 1. Sounds are put into the system at each time frame (7.5-ms frame period, with a 30-ms Hamming window). Figure 3 shows the fundamental frequency patterns of the segregated sound streams. The upper curves are those of the man's voiced speech and the lower ones are those of the woman's voiced speech. Figure 3(a) is the result of segregation from a mixture of the two voiced speeches and shows that one tracer segregates the woman's speech and another segregates the man's speech. Several harmonic tracers were generated because of incomplete subtraction but were killed immediately. We also synthesized a segregated sound from each segregated sound stream by a simple method and had its quality evaluated by human listeners. Figure 3(b) is the result of segregation of the same two voiced speeches in the presence of a white noise. Each speech is segregated by one harmonic tracer, and it is easy to synthesize a segregated sound because no regrouping is needed.
The total number of generated harmonic tracers was 13; 11 false tracers were generated but killed immediately.

5. CONCLUSIONS

The MASS system described here segregates sound streams from a mixture of sounds. Its design and implementation are based on the multi-agent paradigm, so that it can extract a variable number of sound streams in accordance with the input. Even though the current implementation uses only the harmonic structure of sounds, it can segregate a man's voiced speech and a woman's voiced speech from a mixture of these two speeches and a white noise. We think that the MASS system provides a primitive function that can be used to implement the cocktail party effect. We are now designing and implementing a localization agency that uses the MASS system as a building block. This agency will determine the direction of sound sources from binaural input sounds. A future project will be to integrate these agencies with multimodal information, implementing the cocktail party effect by computer in order to widen the communication channel for human-computer interaction.

REFERENCES

1. S. Handel, Listening, MIT Press, 1989.
2. H.G. Okuno, Cognition Model with Multi-Agent System (in Japanese), in T. Ishida (ed.), Multi-Agent and Cooperative Computation II (MACC '92), Kindai-Kagaku-sha (1993) 213-225.
3. M. Cooke, G.J. Brown, M. Crawford, and P. Green, Computational Auditory Scene Analysis: listening to several things at once, Endeavour, Vol. 17, No. 4 (1993) 186-190.
4. A. Bregman, Auditory Scene Analysis, MIT Press, 1990.
5. D. Rosenthal and H.G. Okuno (eds.), Proceedings of the 1995 IJCAI Workshop on Computational Auditory Scene Analysis, AAAI Press, to appear, Aug. 1995.
6. G.J. Brown, Computational auditory scene analysis: A representational approach, PhD thesis, Dept. of Computer Science, University of Sheffield, 1992.
7. G.J. Brown and M.P. Cooke, A computational model of auditory scene analysis, in Proceedings of the International Conference on Spoken Language Processing (ICSLP-92), IEEE (1992) 523-526.
8. T. Nakatani, H.G. Okuno, and T. Kawabata, Auditory Stream Segregation in Auditory Scene Analysis with a Multi-Agent System, in Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), AAAI (1994) 100-107.
9. T. Nakatani, T. Kawabata, and H.G. Okuno, Unified Architecture for Auditory Scene Analysis and Spoken Language Processing, Proceedings of the International Conference on Spoken Language Processing (ICSLP-94), IEEE (1994) 1403-1406.
10. S.F. Boll, A Spectral Subtraction Algorithm for Suppression of Acoustic Noise in Speech, in Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-79), IEEE (1979) 200-203.
The Effects of Rehearsal on Visual Memory

Mamoru UMEMURA a, Hiroshi ICHIKAWA b and Kenichi TEGURI c

a Science University of Tokyo, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162, Japan
b Sanno College, Jiyugaoka 6-39-15 Todoroki, Setagaya-ku, Tokyo 158, Japan
c Asahi Orikomi Co., Ltd., 3-9-14 Kudanminami, Chiyoda-ku, Tokyo 102, Japan
1. INTRODUCTION

Recently, the development of information equipment, including computers, has advanced rapidly. Accordingly, opportunities to use man-machine interfaces based on CRT displays have increased. In work carried out by reading information displayed on a CRT, the ease of reading the displayed information strongly influences the accuracy and efficiency of the work, fatigue, and so on. In work involving danger, mistaken cognition of the display may be linked to serious accidents; the ease of cognition is therefore an important subject from the standpoint of safety. The ease of recognizing information displayed on CRTs has been studied with regard to the color, shape, size, and so on of the display [1]. In addition, the information processing performed by humans after reading a display, especially the maintenance of short-term memory, has been investigated [2-3]. In order to take proper judgment and action after reading displayed information, the information must be easy to maintain in short-term memory. Rehearsal is one such means, and it has been said to be effective for maintaining information. In real settings, in work involving danger, information is frequently confirmed by voicing it. This study therefore aims at quantitatively investigating the following items by experiment:
(1) the effect of rehearsal at different presentation time intervals;
(2) the effect of rehearsal by voicing when information is presented continuously on a CRT.
2. METHODS

The subjects were seven males in their twenties with normal eyesight. Seven characters (capital letters of the alphabet and numerals) were displayed in random order at the same position on a CRT display, and the subjects were asked to report the displayed characters. Recall was free: the order of report did not have to match the order of presentation. The size of the characters was 32 x 32 dots. Seven presentation times, from 100 to 700 ms at 100-ms intervals, were used, and presentation at each time was carried out 25 times in random order. Before and after the presentation of a series of letter stimuli, a 32 x 32 dot square was displayed for 0.5 s as a masking stimulus at the same position where the letter stimuli were presented. The experiment was carried out in two conditions, namely with rehearsal and without rehearsal.
3. RESULTS

3.1. Presentation time

Figure 1 shows the relationship between presentation time and the probability of recall. For all subjects, the probability of recall was higher when rehearsal was carried out, confirming the effect of rehearsal on the maintenance of short-term memory. Figure 2 compares the probability of recall averaged over all subjects. As shown in this figure, the value with rehearsal is about 7-15% higher than that without rehearsal. The difference was largest at a presentation time of 700 ms (15.4%) and smallest at 100 ms (6.7%); that is, the longer the presentation time, the larger the difference tended to become. Moreover, as to the number of recalled letters, the dispersion among subjects was large without rehearsal but became small when rehearsal was carried out. For presentation times of 100-300 ms, the probability of recall increased relatively steeply; beyond 300 ms, the increase became slow. This tendency was similar with and without rehearsal, but with rehearsal the increase in the probability of recall was larger. Looking at the probability of recall for each subject, at the 100-ms presentation time one subject showed a lower probability of recall with rehearsal than without.

3.2. Serial position

Figure 3 shows a typical example (Subj. A) of the relation between the order of presentation (serial position) and the probability of recall. In both conditions, with and without rehearsal, the probability of recall was high in the early and late serial positions. Moreover, compared with the condition without rehearsal, rehearsal increased the probability of recall in the early serial positions (1-4).
[Fig. 1. The relation between presentation time (100-700 ms) and probability of recall, with and without rehearsal, for subjects A-G.]
[Fig. 2. Probability of recall (with and without rehearsal) and the increase rate, as a function of presentation time.]
[Fig. 3. Probability of recall by serial position, with and without rehearsal (Subj. A).]
[Fig. 4. The effect of rehearsal by serial position: summed increase rate of recalled letters over all subjects.]
[Fig. 5. The effect of rehearsal by serial position at a presentation time of 300 ms.]
In the later serial positions (from position 5 onward), no increase in the probability of recall was observed even when rehearsal was carried out; on the contrary, cases in which the probability of recall was lowered were also observed. The effect of rehearsal by serial position is shown in Figure 4, which sums the increase rate of recalled letters over all subjects when rehearsal was carried out. The rate of increase was largest at serial position 2, followed by positions 1, 3 and 4. This figure shows that the effect of rehearsal was large in the early serial positions 1-4. Figure 4 shows the mean over all presentation times; at short presentation times, however, there may not be enough time to execute rehearsal. The case of a 300-ms presentation time is therefore shown in Figure 5, which shows that the rate of increase was large at serial positions 1, 2 and 3, and next largest at serial position 4.
4. DISCUSSION

4.1. Optimal presentation time

For presentation times longer than 300 ms, carrying out rehearsal raised the probability of recall by more than 20%. By carrying out rehearsal, the attenuation of information in short-term memory can be stopped and the information maintained. From this, one would expect that with rehearsal the probability of recall would be high even at short presentation times. However, the present results showed that, both with and without rehearsal, the probability of recall increased rapidly for presentation times of 100-300 ms and increased little beyond 300 ms. A likely explanation is that rehearsal by voicing requires letters to be recognized and then pronounced; at short presentation times, the cognition of the following letters cannot keep up. Accordingly, in the continuous presentation of letters on a CRT display, at presentation times shorter than 200 ms the cognition of letters cannot keep up with the presentation, and the maintenance of information in short-term memory becomes difficult. Moreover, at 400 ms or longer, increasing the presentation time does not raise the probability of recall much. The efficient presentation time is therefore considered to be about 300-400 ms. It is generally known that individual differences in the maintenance of short-term memory are large. Consequently, in visual monitoring work in which control and monitoring are carried out using VDT equipment, some workers are prone to misreading and others are not. This is an important safety problem, for example in nuclear power plants. Figure 1 shows that the execution of rehearsal reduces individual differences in the probability of recall. From this, it is considered that the execution of rehearsal in visual monitoring work reduces differences among individual workers and is effective for preventing the misreading of information.
4.2. Serial position and the effect of rehearsal

The probability of recall was high in the early and late serial positions both with and without rehearsal. These are the typical serial position curves [4]: the early part reflects the primacy effect, and the late part the recency effect. For the recency effect, a similar probability of recall was observed with and without rehearsal. However, the primacy effect did not appear conspicuously without rehearsal. Unless rehearsal is carried out, information in short-term memory vanishes in a relatively short time [5]. Generally, this vanishing has been attributed to the following causes:
(1) proactive inhibition, interference from information input earlier [6];
(2) retroactive inhibition, interference from information input later [7].
The increase in the probability of recall when rehearsal was carried out was large in the early serial positions 1-4. The early serial positions are little affected by proactive inhibition, and it is considered that information there vanishes progressively under the strong effect of retroactive inhibition. Accordingly, it can be said that the execution of rehearsal lowered the effect of retroactive inhibition and increased the probability of recall as a whole. In visual monitoring work using a CRT display that presents information continuously, the information in the early serial positions (the information displayed earlier) is hard to maintain in short-term memory. However, it is considered that by carrying out rehearsal by voicing, the information in the early serial positions can be maintained and mistaken cognition can be prevented.
REFERENCES

1. T. Hatada, Characteristics of human vision for VDT, The Japanese Journal of Ergonomics, 22, 2, 45-52 (1986).
2. H. Yokoi and H. Yuma, Optimum presentation speed of character sequence based on time continuous model of short-term memory, The Transactions of the IEICE, J70-D, 11, 2327-2337 (1987).
3. G. Sperling, A model for visual memory tasks, Human Factors, 5, 19-39 (1963).
4. B.B. Murdock Jr., The serial position effect of free recall, Journal of Experimental Psychology, 64, 5, 482-488 (1962).
5. L.R. Peterson and M.J. Peterson, Short-term retention of individual verbal items, Journal of Experimental Psychology, 58, 3, 193-198 (1959).
6. G. Keppel and B.J. Underwood, Proactive inhibition in short-term retention of single items, Journal of Verbal Learning and Verbal Behavior, 1, 153-161 (1962).
7. N.C. Waugh and D.A. Norman, Primary memory, Psychological Review, 72, 89-104 (1965).
Mechanisms of Slips in Display-Based Human-Computer Interaction: A Model-Based Analysis

Muneo Kitajima a and Peter G. Polson b

a National Institute of Bioscience and Human-Technology, 1-1 Higashi, Tsukuba, Ibaraki 305, Japan
b Institute of Cognitive Science, University of Colorado, Boulder, CO 80309-0344, U.S.A.

1. ERRORS OF SKILLED USERS IN HUMAN-COMPUTER INTERACTION
A puzzling and frequently ignored fact in the human-computer interaction literature is that skilled users have surprisingly high error rates (10-15%). Card, Moran and Newell (1983) studied individual skilled users performing two tasks, manuscript editing and electronic circuit design editing. The manuscript editing experiment involved a detailed evaluation of a single expert user doing 70 edits presented in a marked-up manuscript. Errors were made on 37% of the command sequences describing edits. Over half of the errors were detected and corrected during generation of the editing commands. 21% (15 out of 70) of the commands issued by this very skilled user generated the wrong result and required additional edits to correct these errors. In a second study of a single expert carrying out an electronic circuit design editing task, the user had an error rate of 14% on 106 edits. Hanson, Kraut, and Farber (1984) studied 16 researchers and managers who were intermediate and expert level users of UNIX performing document preparation and e-mail tasks. They logged over 10,000 commands. The overall error rate was 10%, with error rates ranging from 3% to 50% on different commands. The experiments briefly reviewed here are representative of results from a wide range of studies in the human-computer interaction literature. Error rates for expert users range from 5 to 20%. In all studies of experts, users eventually produced the correct results. Approximately 50% of the errors are detected during the generation of a command and corrected. Detection and correction of errors is an integral part of expert skill. The literature on errors has concluded that there are two qualitatively different types of errors (Norman, 1981; Reason, 1990). The first is errors of commission, or mistakes. Such errors are committed by users who are carrying out a novel task and fail to immediately discover the correct action sequence. The other is slips, where users have the correct intention but fail to successfully execute the correct action sequence. Most of the errors described above are slips.
Sellen (1990) reviews classes of models that provide principled, qualitative accounts of slips. She argues that all of these models have a hierarchical representation of action sequences that includes representations of top-level task goals and lower-level goals that actually control execution of elementary actions. Reason (1990) argues that control of attention is a critical determinant of generating correct performance from such a hierarchical representation. Failure to adequately attend to the ongoing process and to coordinate the interaction between the various schemata causes a wrong low-level schema to become activated, generating related but incorrect actions for the current task. In HCI tasks, users may be focusing on the task of composing new text, drawing a figure, and so on; this leads to insufficient attention being allocated to the subtasks involved in operating the interface. Card, et al. (1983) proposed that experts accept high error rates in order to increase their productivity, because for them error recovery can be done easily and rapidly. Experts trade speed for accuracy, causing slips. In this paper, two mechanisms of slips, attention failures and speed-accuracy tradeoffs, are simulated by a comprehension-based cognitive model of display-based human-computer interaction proposed by Kitajima and Polson (1992, 1994a, to appear), showing that they can account for the rate of slips made by skilled users interacting with graphical user interfaces (Kitajima and Polson, 1994a, 1994b).
2. A MODEL OF DISPLAY-BASED HCI

The model we have developed (Kitajima and Polson, 1992, 1994a, to appear) is shown in Figure 1. The model elaborates Hutchins, Hollan, and Norman's (1986) action theory framework, which consists of the following four basic components:
(1) goals representing what the user wants to accomplish, which are a schematic outline of the action sequence that will accomplish the task;
(2) a task environment, which is the world that reacts to the user's actions and generates new responses by modifying the display;
(3) the stage of evaluation, comprising the processes that evaluate and interpret the display; and
(4) the stage of execution, comprising the processes that select and execute actions that affect the world.

Our model of Hutchins, Hollan, and Norman's (1986) action theory incorporates goals, two processes for the stage of evaluation, and two for the stage of execution.
[Figure 1. The comprehension-based cognitive model of skilled use of graphical user interfaces, mapped onto Hutchins, Hollan, & Norman's (1986) action cycle: task goals and device goals feed the stage of evaluation (generating display representations, elaborating the display) and the stage of execution (selecting candidate objects for the next action, selecting an action), which acts on the world.]

2.1. Task Goal and Device Goal
The model assumes that skilled users have a schematic representation of the task in the form of a hierarchical structure involving two kinds of goals: task goals and device goals. Our goal representation is taken directly from the Yoked State Space Hypothesis proposed by Payne, Squibb, and Howes (1990). Payne, et al. assume that discovering how to carry out a task involves searching two problem spaces. The first is a space of possible task states. The second is a space of possible device states that are required to achieve a given task state. We assume that each task goal is associated with one or more device goals. The device goals specify device states that must be achieved in order to satisfy the associated task goal. Given a task goal and its associated device goals, the model simulates a sequence of action selections as follows.
2.2. Generating Display Representations

First, the model generates a representation of the display. The display representation includes only information about the identity of each object on the display and its appearance, e.g. highlighted, pointed-at, dragged, etc. No information about what actions can be taken on an object, or about its meaning and relationships to other objects in the display, is included in this initial display representation.

2.3. Elaborating the Display

All such information is generated by the elaboration process, which retrieves information from long-term memory by a random memory sampling process. The retrieval cues are the representations of the current display, the task goal, and the device goals. The probability that each cue retrieves particular information in a single memory retrieval is proportional to the strength of the link between them. The model carries out multiple memory retrievals in a single elaboration process. A parameter, the elaboration parameter, controls the number of times each argument in the display and goal representations is used as a retrieval cue.¹ The retrieved information elaborates the display representation, providing information about interrelationships between display objects, relationships between the task and display objects, and other attributes of display objects. The elaborated display representation is the model's evaluation of the current display in the context defined by the task goal and the device goals.

2.4. Selecting Candidate Objects for Next Action

In the stage of execution, the model first limits its attention to a small number of screen objects out of the roughly 100 objects displayed on the screen. These screen objects are candidates for the next action. The candidate object selection is performed on the basis of the evaluation defined by the elaborated display representation. The model uses a spreading activation mechanism to select candidate objects. The process is dominated by two factors: the strengths of links from the representation of the goals, which is controlled by a parameter, the attention parameter, and the number of propositions that are necessary to bridge the goals and the candidate objects.²

¹ The model represents goals and the display as propositions, like (is-on-screen OBJECT12). In the memory sampling process, an argument, such as OBJECT12, is used to retrieve information from long-term memory that has OBJECT12 as its argument.

² The model assumes an argument-overlap mechanism to link propositions. For example, the two propositions (is-on-screen OBJECT12) and (has OBJECT12 CalculatorMenuItem) are linked by the shared argument, OBJECT12.
2.5.
Selecting Action
The model considers all possible actions on each candidate object. The model incorporates 18 possible actions 3, such as "moving the mouse cursor to a menu item in order to display a pull-down menu." The process is dominated by the same two factors described above. Furthermore, the action representations include conditions to be satisfied for their execution. The conditions are matched against the elaborated display representations. Some conditions are satisfied by the current screen, others by information that was retrieved from long-term memory in the elaboration process. For example, the model cannot select an action to double click a document icon for editing unless the icon is currently pointed at by the mouse cursor and the information is available that the icon can be double clicked. Observe that if information about a necessary condition is missing from an elaborated display representation, the model cannot perform that action on the
3.
incorrectly described object.
HOW THE MODEL ACCOUNTS FOR ERRORS
In a set of experiments we conducted so far, where a graph drawing task was simulated, we found that the model could cause errors due to the following three reasons. The first is that the process of selecting candidate objects for the next action fails to include the correct object on the list of candidate objects. The second possible cause of errors is that the correct action fails to become the highest activated action among executable actions. In the model's terms, these kinds of errors are ascribed to both or either of small values of the attention parameter (A), and/or missing bridging knowledge that had to be retrieved from longterm memory (B). The third is that the elaboration process fails to incorporate all of the conditions for the correct action in the elaborated display representation. Low values of the elaboration parameter cause this error (C). Parameter values in the range of 12 to 20 caused the model to simulate error rates in the range of 10% to 20% (Kitajima and Poison, 1994, to appear). We argue that the elaboration parameter describes a speed-accuracy tradeoff process where low values of the parameter reduce the amount of time taken by the elaboration process.
4. COMPARISON WITH OTHER MODELS

The strength of the model is that it generates correct actions as well as occasional errors without assuming a special set of mechanisms to produce erroneous actions. In this
³Representations of actions define different functions of single physical actions in many different contexts. For simulating a graph drawing task, the model defines eighteen cognitive actions on six physical actions: Move-Mouse-Cursor, Single-Click, Double-Click, Hold-Mouse-Button-Down, Release-Mouse-Button, and Type.
respect, the model is strikingly different from typical models of expert performance and error (Anderson, 1993; Reason, 1990; Card et al., 1983). Typical models assume that skilled performance is mediated by detailed, large grain size action plans stored in long-term memory. Card et al. (1983) refer to them as methods; Reason (1990) assumes that skilled performance is mediated by action schemata (Norman, 1981). Thus these models have to be equipped with erroneous plans in order to generate errors. The grain size of action is much smaller in our model, at the level of individual pointing actions. When the model makes an error, it has attempted to select a correct action based on incomplete knowledge and/or insufficient attention. The incorrect action will be highly constrained by the user's current goals, the current state of the display, and the partial knowledge that was successfully retrieved from long-term memory. The candidate objects and the next action selected by a simulation are the model's best selections given the context represented by the elaborated display representation.
REFERENCES
Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates.
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Hanson, S. J., Kraut, R. E., & Farber, J. M. (1984). Interface design and multivariate analysis of UNIX command use. ACM Transactions on Office Information Systems, 2, 42-57.
Hutchins, E. L., Hollan, J. D., & Norman, D. A. (1986). Direct manipulation interfaces. In D. A. Norman & S. W. Draper (Eds.), User centered system design. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kitajima, M., & Polson, P. G. (1992). A computational model of skilled use of a graphical user interface. Proceedings of CHI'92 Conference on Human Factors in Computing Systems. New York: ACM, pp. 241-249.
Kitajima, M., & Polson, P. G. (1994a). A comprehension-based model of correct performance and errors in skilled, display-based human-computer interaction. ICS Technical Report 94-02. Boulder, CO: Institute of Cognitive Science, University of Colorado.
Kitajima, M., & Polson, P. G. (1994b). A model-based analysis of errors in display-based HCI. Conference Companion, CHI'94, pp. 301-302.
Kitajima, M., & Polson, P. G. (to appear). A comprehension-based model of correct performance and errors in skilled, display-based human-computer interaction. International Journal of Human-Computer Studies.
Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88, 1-15.
Payne, S. J., Squibb, H. R., & Howes, A. (1990). The nature of device models: The yoked state space hypothesis and some experiments with text editors. Human-Computer Interaction, 5(4), 415-444.
Reason, J. (1990). Human error. New York: Cambridge University Press.
Sellen, A. (1990). Mechanisms of human error and human error detection. Unpublished doctoral dissertation.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Computation Model for Human Communication

Masahiro HIJI*1, Hiroshi NUNOKAWA*2 and Masatoshi MIYAZAKI*1

*1 Graduate School of Information Sciences, Tohoku University, Kawauchi, Aoba, Sendai 980-77, Japan
*2 Institute for Science Education, Miyagi University of Education, Aoba aza Aramaki, Aoba, Sendai 980, Japan

Abstract

We propose a computation model for human communication that captures both its dynamism and its variety. We also propose representing human communications in terms of figures of communication, which enables the computation model to handle the dynamism of human communication smoothly. We have designed a programming language based on this computation model.
1. Introduction
Due to the increase in the size, popularity and individual use of computer networks, their use as a means of inter-personal communication is increasing. Furthermore, as communication systems, they can also serve as a common platform for cooperative activities, which occur while people exchange information via communication and coordinate activities and information with each other. This is because, in cooperative activities, communication is one of the indispensable primary factors; indeed, the cooperative activity itself can be considered a particular activity caused by communications. Consequently, communication systems on computer networks are widely applicable as a recent information technology, and the development of such systems becomes more and more important. A system for computer-mediated communication on computer networks must be able to accommodate the variety of human communications, e.g. real-time person-to-person conversation, sending letters to a specific person, and creating a group concerned with a particular topic along with conversations in that group. In addition, it must be able to accommodate the dynamism of communication as switching among these various forms of communication occurs. Research on human communication so far has emphasized analyzing the influence of communication on human beings and has been carried out in the field of sociology [1][2]. Therefore it is impossible to directly use the results of this research as a basis for creating a system that supports computer-mediated communication as described above. Against this background, we focus our research on a way to model human communication, its implementation, and the development of computer-mediated communication systems on computer networks. Our aim is to provide a model for analyzing human communication and also a basic theory for constructing computer-mediated communication systems. In this paper we propose a method of modeling human communication from the viewpoint of the figure of communication, together with its implementation.
Figure 1. Computer-Mediated Communication System (people interacting with one another through the computer-mediated communication system; graphical detail omitted)
2. Analysis of Human Communications

2.1. Human Communications as the Target of Analysis

In general, human communications can be analyzed from several points of view. In this paper we delimit the meaning of human communications as follows, in order to clearly define the computation model that will be the basis for the development of a computer-mediated communication system. Generally speaking, human communication can be considered as the selection of communicative partners for exchanging information, and the process of exchanging information among these selected partners. First, we take human communications to consist of primary objects which exchange information and the figures of these information exchanges [3]. In other words, communication is a process of information exchange consisting of these two elements. We call a primary object which exchanges and interprets information a communicator. There are two kinds of communicators: the sender, which gives information, and the receiver, which takes information from the sender in an exchange. A communicator can be either sender or receiver; the relation between sender and receiver is not fixed, and a communicator may play either role in the process of information exchange. Communicators include not only individual persons but also groups of people. When a group of people is the object of communication, it is considered one independent communicating object and is represented as one communicator. We call the figures of information exchange among communicators figures of communication. A figure of communication is a mode of information exchange defined as a triple of the number of communicators, the direction of information flow, and the timing of information exchange. The number of communicators means the combination of the numbers of senders and receivers. It is represented in the form 'number of senders TO number of receivers', and is one of the four types '1 TO 1', '1 TO n', 'n TO 1' and 'n TO n', where n means any number greater than 1. The direction of information flow is the direction in which information flows among communicators. This is either 'one-sided' or 'mutual': 'one-sided' is the case where the sender gives information to the receiver in one way, and 'mutual' is the case where sender and receiver exchange information mutually. The timing of information exchange concerns the time necessary to exchange information between sender and receiver. This is either 'realtime' or 'non-realtime'. 'Realtime' is the case where the receiver reads out the information as soon as receiving it.
'Non-realtime' is the case where some time passes before the receiver reads out the information being sent. This is because either the means of communication itself needs some time for transmission (non-realtime means), or the receiver needs some time before reading it out (non-realtime from the activity of the receiver). For realtime information exchange, the means of communication must be realtime and the receiving communicators must read out the received information at once. From the viewpoint of information exchange in the process of human communication, human communication can be modeled as a process in which the sender communicator selects the receiver communicators and exchanges information based on the figure of communication, which is decided by the number of communicators, the ability of the receiver communicator to do a real-time information exchange, and the direction of information flow. A sequence of communication proceeding over time is modeled by combining figures of communication along the flow of time.
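The triple can be captured directly as a small data structure. The following Python sketch is only an illustration of the definition above, with the possible field values spelled out as strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FigureOfCommunication:
    """The triple that defines a mode of information exchange."""
    communicators: str   # '1 TO 1', '1 TO n', 'n TO 1' or 'n TO n'
    direction: str       # 'one-sided' or 'mutual'
    timing: str          # 'realtime' or 'non-realtime'

# A real-time group discussion and a letter to one person, as figures:
discussion = FigureOfCommunication("n TO n", "mutual", "realtime")
letter = FigureOfCommunication("1 TO 1", "one-sided", "non-realtime")
print(discussion, letter)
```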
2.2. Dynamism of Human Communications

Human communications are generally dynamic: the number and the state of the communicators, as well as the figure of communication, vary along the course of a communication. The relations among the corresponding communicators likewise vary as the figure of communication changes. We call this the dynamism of communication. Accordingly, when a computer-mediated communication system is constructed, it must be able to accommodate this dynamic side of human communication. To cope with the dynamism of human communication, we modeled it on a cooperative computation model [4] with the concept of autonomous decentralization. This dynamism of communication has not been considered in previous communication models.
3. Computation Model for Human Communication

3.1. Modeling of Human Communication

We propose a computation model for human communication based on the analysis in chapter 2. Our computation model uses communicators and the exchange of information among them. The model does not take into consideration the kind of information exchanged in a communication or how the behavior of a communicator varies with a communication; it treats human communication as the selection of communicators as communication partners and the exchange of information among them. This computation model gives a basis for expressing human communication as described in section 2.1. In this computation model, an individual communicator is represented by an autonomous object, and a group communicator is represented by a field. An autonomous object is the same as an object in the object-oriented computation model, except that it has an interpreter inside itself and can change its function by executing scripts received from other objects. The interpreter in an autonomous object is a higher-order function which can interpret a script as data. Since we represent communicators as autonomous objects or fields, a communicator can be either a human or a piece of software. Software communicators differ from human communicators in the sense that their ability to interpret the exchanged information is limited, so the information must be structured in some form. A field is a boundary which distinguishes a group of autonomous objects from other autonomous
objects. A field as communicator sends information received from other communicators to all or certain autonomous objects belonging to it. Likewise, information from an autonomous object belonging to a field is sent to other receiver communicators through the field. Fields do not directly exchange information on their own; rather, the autonomous objects belonging to a field exchange information indirectly through the field. The computation model can represent any communication based on the figure of communication described in section 2.1 as follows. The number of communicators is decided by whether a communicator is described as an autonomous object or a field. The direction of information flow is represented by two kinds of correspondence form, named Send and ContractSend. Send means that the sender only sends information to the receiver in one way. ContractSend means that communicators contract and exchange information among themselves. The timing of information exchange is represented by two kinds of correspondence form, named InformSend and CooperativeSend. CooperativeSend uses the CooperativeScope, which identifies all communicators able to do real-time communication; a real-time communication is then carried out among communicators sharing the same CooperativeScope. CooperativeSend is a figure which assures not only that information will be sent to the receiver in real time but also that it will be read out by the receiver immediately. InformSend differs from CooperativeSend in that it assures neither real-time sending nor immediate reading out of information; it leaves the reading out as the responsibility of the receiver. Figures of communication are represented as combinations of the individual correspondence forms above: InformSend, InformContractSend,
CooperativeSend, CooperativeContractSend and MessengerSend. A MessengerSend specifies conditions on the destination as well as a communicator list (called known-information) and then sends a message [5]. The message being sent searches for communicators which satisfy the specified conditions according to the known-information, acquiring further known-information from other communicators as it moves among them. In other words, MessengerSend is a communicative figure that searches for communicators satisfying the specified conditions while gathering known-information from other communicators. That is to say, the figure of communication of section 2.1 is represented by the above five communicative figures, which specify the receiver communicators as the destinations in these figures.

3.2. Describing Human Communication Based on Our Model

With the programming language we designed based on the computation model of section 3.1, it is possible to describe communicators and figures of communication using the functions in Figure 2 and Figure 3. The communicator which is a partner in a communication is designated by the parameter <toField>
( CreateAutonomousObject )
( CreateField <ScopeName> )
( DeleteField )
( InField )
( OutField )
( OpenScope <ScopeName> )
( CloseScope <ScopeName> )

Figure 2. Functions for describing communicators
( InformSend <Script> )
( Contract )
( InformContractSend <Script> )
( ContractReply )
( ContractEnd )
( WhoInScope <Scope> )
( CooperativeSend <Script> )
( CooperativeContractSend <Script> )
( CreateMessenger <MessengerName> )
( MessengerSend <MessengerName> )
( MessengerReceive )

Figure 3. Functions for describing figures of communication
and in each communicative function. The information exchanged through a communication is described as a script, <Script>, in these communicative functions. A script described in <Script> is exchanged among communicators as data and executed by the interpreter that each autonomous object has: a receiver communicator interprets and executes the received data as a script. In our model, the method of interaction between communicators is included, along with the information, inside the exchanged data. A sequence of communication is described as a combination of these communicative functions. Moreover, dynamism in human communications is expressed as follows. Dynamism in the number of communicators is described as the creation and elimination of autonomous objects and fields, and the participation and departure of autonomous objects, using the functions in Figure 2. Dynamism in the figure of communication is described as the selection of the communicative function corresponding to the figure of communication, together with the information exchanges carried out using it. An autonomous object can change its internal status as well as its script, and can vary the figure of communication, by interpreting the script exchanged as data among objects. As a result, our computation model can be used to express human communication as described in chapter 2.
3.3. Example of Modeling Human Communication

Figure 4 shows an example describing a communication in which a person A sends a question to a group G, and two other persons B and C belonging to group G then discuss the question in real time. Here, each person (A, B, C) is described as an autonomous object, and group G is described as a field. Autonomous objects B and C belong to field G by executing the InField function. Autonomous objects B and C open a CooperativeScope, which indicates the possibility of exchanging information in real time, by executing the OpenScope function. When autonomous object A asks a question of field G, it sends the question using the InformSend function. The question sent from A is transferred to each autonomous object B, C belonging to field G. An autonomous object (for example B) receiving the question learns of a partner (partner C with the same CooperativeScope) with whom it can exchange information in real time by using the WhoInScope function. Autonomous object B then exchanges information in real time with autonomous object C using the CooperativeContractSend function. If a new person D wants to participate in this discussion, autonomous object D must open the same CooperativeScope as that of autonomous objects B and C. From that point on, information exchanged among autonomous objects B and C using the CooperativeContractSend function will be sent to D as well.
Figure 4. A sample of modeling human communications
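The following Python sketch illustrates, under strong simplifications, how the example above can be expressed: autonomous objects, a field that relays information, and a scope used to find real-time partners. The class names and the scope bookkeeping are our illustrative assumptions, not the actual language implementation.

```python
class AutonomousObject:
    def __init__(self, name):
        self.name, self.scopes, self.inbox = name, set(), []

    def receive(self, sender, script):
        self.inbox.append((sender, script))

class Field:
    """A boundary grouping autonomous objects; it relays information
    rather than exchanging it on its own."""
    def __init__(self, name):
        self.name, self.members = name, []

    def in_field(self, obj):                 # InField
        self.members.append(obj)

    def inform_send(self, sender, script):   # InformSend to the field
        for member in self.members:
            if member is not sender:
                member.receive(sender.name, script)

# Person A questions group G; members B and C, sharing a scope named
# "Discussion", can then converse in real time (only sketched here).
a, b, c = AutonomousObject("A"), AutonomousObject("B"), AutonomousObject("C")
g = Field("G")
g.in_field(b); g.in_field(c)
b.scopes.add("Discussion"); c.scopes.add("Discussion")   # OpenScope
g.inform_send(a, "question?")
partners = [m.name for m in g.members if "Discussion" in m.scopes]  # WhoInScope
print(b.inbox, partners)
```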
4. Conclusion

We have proposed a computation model for human communication which captures both its dynamism and its variety. We tried to make the range of human communication covered by the model as wide as possible, and we also model the dynamism of human communication. Various kinds of human communication can thus be represented as a process in which communicators select other communicators and exchange information according to one of the figures of communication. We have also developed a programming language based on our computation model, and implemented both the language and a system supporting communication in the fields of education and production planning. The programming language is defined by seven functions for describing communicators and eleven functions for describing figures of communication, and is implemented on a computer network of UNIX workstations. Human interaction is an essential element of information exchange in computer-mediated communication. In computer-mediated communication so far, the information being sent is simply kept until it is read by the receiver, and the sender cannot direct the receiver to read it. In the new computer-mediated communication based on our computation model, the sender is able to direct the receiver to read an exchanged piece of information or to interact with the sender during the communication.
REFERENCES

[1] Barnlund, D. C.: "A Transactional Model of Communication", in J. Akin, A. Goldberg, G. Myers, and J. Stewart (eds.), Language Behavior: A Book of Readings in Communication, Mouton & Co. N.V. Publishers, The Hague, 1970.
[2] Giffin, K. and Patton, B. R.: "Fundamentals of Interpersonal Communication", 2nd ed., Harper & Row, Publishers, New York, 1976.
[3] Hiji, M., Nunokawa, H., and Shiratori, N.: A Design Language Describing a Cooperative Work Model, IPSJ SIG Notes, 93-GW-3, pp. 33-40, 1993 (in Japanese).
[4] Takemiya, H., Yano, H., Nunokawa, H. and Noguchi, S.: Coordinated Computation Model with Coordination Scope, IPSJ SIG Notes, 91-PRG-3, 1991 (in Japanese).
[5] Igarashi, T., Nunokawa, H. and Noguchi, S.: Description of a communication tool using a computation model based on an autonomous decentralized cooperative concept, IPSJ SIG Notes, 92-DPS-58, pp. 165-172, 1992 (in Japanese).
III. 18 Voices and Faces
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
Delivering the promise of speech interfaces

Charanjit K. Sidhu and Gerry Coyle
BT Laboratories, Martlesham Heath, Ipswich, IP5 7RE, UK

Key words: Speech interfaces, product life cycle, usability engineering.
Abstract

We compiled a style guide that includes checklists and methods based on our experience of speech interface design. These have been applied in user trials and field evaluations throughout the development of Call Minder, a network-based telephone call answering service. Numerous improvements were made to the usability of the service as a result.
1 INTRODUCTION

Speech technology offers intuitive information services accessible from any telephone worldwide. However, the design of a service has a major impact on ease of use, and so we involved customers to steer the design process towards optimal usability [1]. We used this approach throughout the design of Call Minder, a network-based telephone call answering service. We believe that multi-disciplinary teams of software, systems, marketing and Human Factors people are needed to design successful interactive services [2]. The Human Factors contribution includes requirements capture, design, prototyping and evaluation supported by research into user psychology. Checklists, guidelines and methods have been compiled to support the design of usable services. These provide designers and engineers with guidance in the following areas:

• Requirements capture: checklists to help specify user requirements;
• Dialogue structure and message composition: design guidelines for the format and content of speech messages;
• Keypad: guidelines for mapping functions to the telephone keypad;
• Evaluation: guidelines for evaluating services at all stages of development.
Figure 1 demonstrates how information from different sources is utilised in the design of speech services.
Figure 1: Optimising usability in developing speech based services. (Inputs: tools and techniques such as requirements capture, expert walkthroughs, user trials, questionnaires, focus groups and field trials; case studies evaluating speech based services such as Call Minder, network services, radio paging, chargecard and voice messaging; speech research on speech technology, evaluation methods and user psychology; internal standards, namely the Speech Style Guide and Usability Design Guide; and external standards such as ISO and ETSI.)

2 DESIGNING USABILITY INTO CALL MINDER
The Call Minder answering service generates voice prompts and interprets spoken or keyed responses. There are two main dialogues:

Caller dialogue: When someone calls and no-one answers, or the phone is engaged, Call Minder answers the call and prompts the caller to leave a message. Figure 2 shows a typical interaction with Call Minder.

Good afternoon. There is no reply from that number at present, but I can take a message for you. Can I have your name please? "Charanjit Sidhu" Who are you calling? "Gerry Coyle" Please leave your message and telephone number after the tone. "Hello Gerry, just demonstrating the system." I recorded your message as: 'Hello Gerry, just demonstrating the system'. You now have a choice. If you are happy with your message, please hang up now. If you would like to replace it with a new message, please hold on. Your new message will overwrite the old one. Please leave your new message after the tone. "Hi Gerry, I'm now demonstrating the overwrite facility."

Figure 2: Call Minder caller dialogue
Customer dialogue: Call Minder customers can listen to their messages from their own phone or, after keying a Personal Identity Number (PIN), from any other telephone. The service informs customers how many messages have been recorded and allows them to be retrieved. Customers can also change various options, including the greeting message played to callers, the number of rings before answering and the customer's PIN. Call Minder's success depends on dialogues that customers can use effortlessly. Human Factors involvement in the development of the service can be seen in Figure 3.

Figure 3: Stages in the usability engineering of Call Minder. (Service concept/design: concept testing with focus groups and paper walkthroughs. Service development: iterative implementation of prototypes with prototype walkthroughs, user trials and paper walkthroughs, then implementation of the final system. Delivery of the completed service: Field Trial 1 with one-to-one interviews and questionnaires, Field Trial 2 with focus groups and questionnaires, then launch. Future modifications: user trials.)
As soon as Market Research confirmed the basic service concept, preliminary dialogue designs were constructed following principles such as those below:

• Provision of adequate feedback, so users feel in control of the interaction, know where they are, what they can do next, and how to correct errors.
• Complexity should not strain the users' memory. They should not have to remember information from one part of the interaction to the next.
• Speak the users' language and provide easy to understand prompts.
We then improved the specification of the service by conducting a series of paper walk-throughs. A dialogue prototyping tool had been developed to build rapid prototypes and simulate the service. This allowed us to gather objective and subjective usability data in laboratory trials.
The results from these trials allowed many improvements to be made. Early trials showed that voice prompts did not always elicit the expected responses. For example the prompt:

'Would you like to leave a message?'

elicited responses such as 'Um. Yes I would', 'Thank you, yes' or 'Hello Gerry ...',

rather than the expected 'yes' or 'no'. As a result the Caller dialogue was radically changed by eliminating the "YES/NO" questions and introducing open-ended responses. Another benefit of carrying out these trials was that the timing data we collected allowed us to specify appropriate time-out durations. Early prototypes had time-outs that were so short that hesitations triggered the next stage in the dialogue. Not surprisingly, users found this very frustrating! When all the improvements identified in the laboratory trials had been incorporated, we undertook an extensive field evaluation to test Call Minder in a real-life environment. The aim of the field evaluation was to refine and establish:

• the robustness and effectiveness of the technology;
• the processes needed to deliver the service to market;
• data on usage, customer attitude and usability.
The field evaluation involved over 200 customers for one year. Usability was investigated by means of questionnaires and interviews, which indicated that customers were satisfied with the service. However, there was still scope for improvement. For example, the high level of background noise in some environments reduced recognition performance, leading to time-consuming error correction dialogues. As a result, some found the system too slow. Fast-track dialogues and keying options were thus introduced.

3 OBJECTIVE AND SUBJECTIVE BENCHMARKING

Both objective and subjective measures were taken throughout the course of the trials and the field study, which enabled us to benchmark different evaluation techniques. Objective measures are taken either from the service platform or by observing users. These provide data on performance and usage, such as the number of times and at what point in the dialogue users hang up, recognition performance, and responses to voice prompts. They allow us to identify problem areas within the dialogue; for example, we can pinpoint confusing voice prompts which cause users to hang up. Subjective assessment through interviews and questionnaires allows customer satisfaction monitoring. Questionnaires measure key dimensions relevant to speech services, including speed of use, level of concentration needed, reaction to the voice and clarity of messages. The results from our questionnaire are represented on attitude profiles (see Figure 4), which allow easy comparison between different versions of a service. The questionnaire was validated and refined during the development of Call Minder and is now part of the Human Factors tool-kit for designing future speech services.
Figure 4: Attitude profile comparing two versions of Call Minder. (Mean response scores from -2, very negative, to +2, very positive, for Service A and Service B on the dimensions: clarity of system messages, level of concentration required, perceived recognition, speed of operation, voice used for the service, overall clarity of service, would use service again, level of improvement, option sequence used, reliability of operation, and overall ease of use.)
Responses to the questionnaire statements and the objective data are considered together to identify potential improvements. An overall assessment of the service is based on all the available data.
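As a sketch of how such profiles can be computed, the fragment below averages hypothetical questionnaire scores (on the -2 to +2 scale of Figure 4) per dimension for two service versions; the dimension names are from the figure, but the scores themselves are invented.

```python
def attitude_profile(responses):
    """Mean response score per questionnaire dimension (scale -2..+2)."""
    return {dim: sum(scores) / len(scores) for dim, scores in responses.items()}

# Illustrative ratings for two service versions on two of the dimensions.
service_a = {"speed of operation": [1, 0, 1, 2], "overall ease of use": [1, 1, 2, 1]}
service_b = {"speed of operation": [0, -1, 0, 1], "overall ease of use": [1, 0, 1, 1]}
for dim in service_a:
    print(dim, attitude_profile(service_a)[dim], attitude_profile(service_b)[dim])
```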
4 CONCLUSIONS
The Call Minder project demonstrates the importance of designing usability into speech services. We involved users throughout development and identified numerous improvements to the service. Implementing such improvements in future speech services will yield enhanced customer satisfaction, increased return on investment and competitive advantage.

REFERENCES

1. M. B. Cooper, Human Factor Aspects of Voice Input/Output, Speech Technology, March/April 1987, 82-86.
2. M. Atyeo, R. Green, User friendly weapons for the competitive fight, British Telecommunications Engineering Journal, Vol. 13.3, 1994, 201-205.
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
VOICEDIC: A Practical Application of Speech Recognition Technology

Kenji Kita, Kazuhiko Ashibe, Yoneo Yano and Hiroaki Ogata
Faculty of Engineering, Tokushima University, Minami-josanjima, Tokushima 770, JAPAN
This paper presents a practical application of speech recognition technology, a voice-accessible or voice-activated dictionary, developed by the VOICEDIC project at our laboratory. The paper gives an outline of VOICEDIC as well as its advantages. We also describe the implementational issues that must be addressed to attain high speech recognition performance.
1. INTRODUCTION

For many years, researchers in the speech field have believed that speech provides the ultimate medium for human-computer interaction. Some of the reasons include the following [1,2]:

• Speech is the most natural and easiest communication medium for humans.
• Speech is more efficient than other modes of interaction.
• Speech input can be used in hands/eyes busy situations.
• Speech is very helpful to handicapped persons.

Speech researchers have tackled many difficult problems in pursuit of this ultimate goal. As a result, several remarkable speech recognizers have emerged in recent years [3,4]. At the research level, we are now capable of attaining a word accuracy of over 90% in connected word recognition over vocabularies of several thousand words. But speech recognition technology still remains little used in real applications. Many possible reasons come to mind: for example, (1) an accuracy of 90% is poor for real use, and (2) we do not have effective error correction strategies at hand. But the greatest reason is that we have not found appropriate applications. What matters is to find a particular application for which speech is appropriate. In this paper, we present a practical application of speech recognition technology, a voice-accessible or voice-activated dictionary, developed by the VOICEDIC project at our laboratory. The paper gives an outline of VOICEDIC as well as its advantages. We also describe the implementational issues that must be addressed to attain high speech recognition performance.
2. VOICEDIC: OVERVIEW AND ITS ADVANTAGES

VOICEDIC allows us to consult a dictionary by voice. Consulting a dictionary is a rather bothersome task: it takes a few seconds at the lowest estimate, and sometimes ten seconds or more. Of course, on-line electronic dictionaries have recently become available and provide faster access, but even then keyboard typing is much slower than speech input. In addition, VOICEDIC is effective particularly in conjunction with tasks where the hands are already occupied, such as during keyboard and/or mouse operations. Figure 1 shows a block diagram of the VOICEDIC system. First, the system transforms an utterance, which is in analog form, into digital form, and then extracts speech features that facilitate the subsequent recognition task. The speech recognition module produces a symbolic representation by comparing the feature pattern of the input to members of a set of ready-made reference patterns. On completion of recognition, the result is converted into a network of alternative matches, called a lattice. The next step is to obtain multiple word candidates from the lattice using the dictionary entries (lexicon). From the output candidates, the right result is selected by the user, and finally the chosen word is retrieved from the dictionary. VOICEDIC has the following advantages:

• VOICEDIC requires only isolated-word recognition technology. By current speech recognition standards, a high recognition accuracy is easy to achieve for isolated-word recognition.
• VOICEDIC displays several candidate words as recognition results. If the candidate set does not include the correct result, we can simply speak the word again. Thus, misrecognitions do not result in critical situations.
• We do not have to utter a complete word in order to retrieve words from the dictionary. Uttering the first few syllables makes it possible to list candidate words that begin with the uttered syllables. Then we can select the desired word from the candidate set.
• VOICEDIC has an option of restricting the search to an active window (screen). With this option, the words to be recognized are restricted to those displayed in the current editing window. Therefore, very fast and accurate recognition is possible.
We are now considering two forms of realization, according to the dictionary component: (A) using a text-based electronic dictionary, and (B) using a speech dictionary. In case (A), retrieved results (the meanings of words) are displayed on the screen, while in case (B) the results are given as sound. A speech dictionary holds real speech uttered by humans, and thus gives more natural sound than a synthetic voice. Also, using real speech data would be of great help to blind or visually impaired users.
Figure 1. Block diagram of the VOICEDIC system. (Microphone input is digitized by AD conversion; feature extraction produces acoustic parameters; the speech recognition module, using syllable HMMs, produces a syllable lattice; the N-best candidates generator, using a trie representation of the dictionary entry words, produces the N best candidates; the user selects the word to be retrieved, and the dictionary interface module returns the retrieved result from the electronic dictionary.)
3. IMPLEMENTATIONAL ISSUES

This section describes the parts of the system that are most important for the implementation.

3.1. Acoustic Models

As acoustic models we adopt hidden Markov models (HMMs for short) [5], which have been used successfully in recent state-of-the-art speech recognition systems. HMMs are stochastic models suitable for handling the uncertainty that arises in speech, such as contextual effects and speaker variability. In VOICEDIC, Japanese syllables are used as the basic HMM unit, because a whole-word approach cannot meet the real-time requirements for a large vocabulary. There are about 100 phonetically distinct spoken syllables in all. Each syllable is represented by a continuous mixture HMM, in which the output probability density function is characterized by a 39-component diagonal covariance Gaussian mixture. See Table 1 for the speech analysis conditions.

Table 1. Speech analysis conditions

Sampling frequency and precision: 16 kHz, 16 bit
Pre-emphasis: 1 - 0.97z^-1
Hamming window: 25 ms
Frame period: 10 ms
Acoustic parameters: 12 MFCC (mel-frequency cepstral coefficients) + 12 ΔMFCC + 12 ΔΔMFCC + power + Δpower + ΔΔpower (39 dimensions in all)

3.2. Speech Recognition Module

As stated above, VOICEDIC uses hidden Markov models of syllables as the basis for speech modeling. Word models are built by concatenating syllable models. The speech recognition module performs a time-synchronous Viterbi beam search, matching syllable models against the speech input. That is, it maintains a beam of the best-scoring candidates and extends these one frame at a time. Recognition candidates with a low likelihood score are pruned. All candidates cover the utterance from the beginning to the most recently processed frame. When a recognition path reaches the end of a syllable model, the search transits to the beginning of the syllable models that can follow the syllable ending that path. Currently, the speech recognizer imposes no restrictions on syllable connections (i.e. any syllable can follow any other syllable), which sometimes overgenerates beyond the lexical vocabulary. To compensate, the recognizer produces a lattice of alternative syllable matches in order to obtain a high inclusion rate for the correct result.
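The following toy sketch illustrates the time-synchronous beam idea described in section 3.2: hypotheses are extended one frame at a time and pruned against the best score. The frame scores stand in for syllable-HMM likelihoods and are invented; this is not the VOICEDIC recognizer itself.

```python
import math

def beam_search_frames(frames, extend, beam_width):
    """Time-synchronous beam search: keep only hypotheses whose log score
    is within `beam_width` of the best, extending one frame at a time.
    `extend(hyp, frame)` yields (symbol, log_prob_increment) pairs."""
    beam = [((), 0.0)]                       # (partial path, log score)
    for frame in frames:
        expanded = [(hyp + (sym,), score + inc)
                    for hyp, score in beam
                    for sym, inc in extend(hyp, frame)]
        best = max(score for _, score in expanded)
        beam = [(h, s) for h, s in expanded if s >= best - beam_width]
    return max(beam, key=lambda pair: pair[1])

# Toy per-frame syllable probabilities standing in for acoustic scores.
def toy_extend(hyp, frame):
    return [(sym, math.log(p)) for sym, p in frame.items()]

frames = [{"ka": 0.7, "ta": 0.3}, {"na": 0.6, "ma": 0.4}]
print(beam_search_frames(frames, toy_extend, beam_width=1.0))
```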
Figure 2 shows an example of a lattice, where the symbols A, B and C indicate distinct syllables. As is clear from the figure, a lattice is characterized as a set of hypothesized syllables with different starting and ending positions in the input speech.
Figure 2. An example of a lattice.
3.3. N-best Candidates Generator and Vocabulary Representation

The job of the N-best candidates generator is to generate the N most likely word candidates from the syllable lattice. The entire vocabulary (the set of dictionary entry words) is represented as a trie structure, from which this module selects word candidates that match well with paths in the lattice. The total vocabulary amounts to somewhere between tens of thousands and a hundred thousand words, so it is necessary to adopt a representation of the vocabulary that is both compact and efficient. For this purpose, VOICEDIC uses a double-array-based trie structure [6], which combines the fast access of a matrix form with the compactness of a list form. In short, the double-array-based trie structure uses two one-dimensional arrays, called BASE and CHECK respectively. If δ(n, a) = m denotes that there is an arc labeled a from node n to node m, then the arc is encoded in BASE and CHECK so that the following relation holds (where 'a' denotes the numeric code of label a):

δ(n, a) = m  ⇔  BASE[n] + 'a' = m and CHECK[m] = n    (1)

In this way, each arc of the trie can be retrieved from the double-array in O(1) time, providing very fast access to the dictionary entry words.
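A minimal sketch of relation (1) in Python is shown below; the two arrays are hand-built for a three-node toy trie and are purely illustrative.

```python
def traverse(base, check, node, code):
    """Follow the arc labeled `code` from `node`; returns the target node
    or None. Relation (1): BASE[n] + code = m and CHECK[m] = n."""
    m = base[node] + code
    if 0 <= m < len(check) and check[m] == node:
        return m
    return None

# A hand-built toy double-array with arcs 1-(code 1)->2 and 1-(code 2)->3.
base = [0, 1, 0, 0]
check = [-1, -1, 1, 1]
print(traverse(base, check, 1, 1))   # -> 2
print(traverse(base, check, 1, 2))   # -> 3
print(traverse(base, check, 2, 1))   # -> None (no such arc)
```

Each lookup touches a constant number of array cells, which is the source of the O(1) arc retrieval noted above.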
4. CONCLUSIONS

This paper presented the VOICEDIC project undertaken at our laboratory, which aims at a practical application of speech recognition technology. The main characteristics are:

• Because it requires only isolated-word recognition technology, VOICEDIC can be expected to attain high recognition performance by current technological standards.
• It offers simple error handling for misrecognitions: the system produces the N most likely recognition results, from which the user selects the correct one.

We are currently developing each module of VOICEDIC separately on workstations. These modules will be integrated into one system before long. Finally, let us turn our eyes to the future. Progress in device technology will surely make it possible to increase gate counts in chips and clock frequencies by a factor of 10 to 100 or more. Also, the development of large-scale application-specific ICs is going to enable us to realize a one-chip speech recognition device. In the future, it would not be a mere dream to make a portable pocket VOICEDIC that anyone can carry anywhere, anytime.

REFERENCES
1. C. Baber and J. M. Noyes (eds.), Interactive Speech Technology: Human factors issues in the application of speech input/output to computers, Taylor & Francis (1993).
2. D. B. Roe and J. G. Wilpon (eds.), Voice Communication between Humans and Machines, National Academy Press (1994).
3. K. F. Lee, Automatic Speech Recognition: The Development of the SPHINX System, Kluwer Academic Publishers (1989).
4. K. Kita, Y. Yano and T. Morimoto, Continuous Speech Recognition Based on LR-Parser-Driven One-Pass Search Algorithm and Hidden Markov Modeling of Speech: Towards Real-time Intelligent Human-Computer Interaction by Voice, Proc. of Second Singapore International Conference on Intelligent Systems, pp. B347-B352 (1994).
5. X. D. Huang, Y. Ariki and M. A. Jack, Hidden Markov Models for Speech Recognition, Edinburgh University Press (1990).
6. J. Aoe, K. Morimoto and T. Sato, An Efficient Implementation of Trie Structures, Software-Practice and Experience, Vol. 22(9), pp. 695-721 (1992).
Symbiosis of Human and Artifact Y. Anzai, K. Ogawa and H. Mori (Editors) © 1995 Elsevier Science B.V. All rights reserved.
An Operation Analysis of an Address Input System with Speech Recognition

Kazuhiro ARAI, Osamu YOSHIOKA, Shigeki SAGAYAMA and Noboru SUGAMURA
NTT Human Interface Laboratories, 1-2356 Take, Yokosuka-shi, Kanagawa 238-03, JAPAN

Abstract

Address input is one of the most important processes in business. This process is needed for managing customer information and the delivery of commodities. In order to reduce the time necessary for address input, we have developed an address input system that employs speech recognition. Users of this system can input addresses by using not only a keyboard and a mouse but also speech. The input time was measured for 25 adult novice users in order to test whether or not the speech recognition function is useful. The results revealed that speech recognition reduces the time required for address input. This paper describes the system outline and the experiment measuring input time.
1 INTRODUCTION
Recently, the performance of speech recognition has reached new highs. Due to this remarkable progress, many new avenues have opened up for application systems employing speech recognition [1]. The architecture and user interface of speech recognition are important factors in making the speech modality of the human-computer interface more readily accessible. We have developed our speech recognition architecture according to the following concepts. The core of speech recognition must be developed independently of any application system. From this viewpoint, we introduce a server-client architecture for the speech recognition process. The server processes speech, accepting as input the words that are to be recognized and speech data from the client. The results of speech recognition are sent back to the client. This server is implemented entirely in software and does not need special hardware such as a digital signal processor. For a useful user interface, speech recognition should be introduced as a complement to conventional input devices such as a keyboard and a mouse [2]. By using speech recognition, the time required for data input may be reduced in comparison with that of the conventional devices [3]. However, the results of speech recognition are sometimes incorrect. Therefore, we developed a support environment that corrects the results of speech recognition using the conventional devices [4]. We have developed an address input system employing speech recognition based on the architecture described above. Address input is one of the most important processes in business; it is needed for managing customer information and the delivery of commodities. The user of this system can input an address using not only a keyboard and a mouse, but also speech.
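A minimal sketch of this server-client split is given below; the class name and the toy character-overlap scoring are our illustrative assumptions, standing in for the actual acoustic matching performed by the real server.

```python
class RecognitionServer:
    """Software-only recognizer in the server-client split described above:
    the client supplies the active vocabulary and speech data, and the
    server returns candidate words with plausibility scores."""

    def recognize(self, vocabulary, speech_data):
        scored = [(word, self.score(word, speech_data)) for word in vocabulary]
        return sorted(scored, key=lambda ws: ws[1], reverse=True)

    def score(self, word, speech_data):
        # Placeholder for acoustic matching against word models: here we
        # simply count matching characters (purely illustrative).
        return sum(1 for a, b in zip(word, speech_data) if a == b) / len(word)

server = RecognitionServer()
print(server.recognize(["Chiyoda-ku", "Minato-ku"], "Chiyoda-ku"))
```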
Figure 1: The flow of address retrieval. (A telephone number's area and local code index the database; the retrieved addresses are divided into words according to the address classes, and the words are stored into each class.)
Figure 1" The flow of address retrieval and a mouse, but also speech. Area codes and local exchanges of telephone numbers are generally assigned depending on the locality. The system has a database that describes relationship between addresses and telephone numbers. The user first inputs the telephone number. By retrieving addresses using the area code and exchange, the system creates address candidates.
2 An Address Input System

2.1 System Outline
Speech recognition accuracy deteriorates as the system is forced to deal with more and more words. In order to reduce the number of words that must be handled, this system uses the relationship between telephone numbers and addresses, as described below. The area code and the exchange of a telephone number generally depend on the locality, and the system has a database that describes the relationship between the area code and exchange of telephone numbers and the corresponding regions. Figure 1 shows the flow of address retrieval using the area code and exchange. The user first inputs a telephone number using the keyboard. Using the area code and the exchange, the system retrieves the several addresses corresponding to the telephone number input. The addresses are divided into words according to the address classes, such as prefecture, city, ward and town. The relationships between the words are then recorded, and the words are stored into their classes. These address words form the vocabulary for word speech recognition. A sample window for this system is shown in Figure 2. The system's window consists of one entry field for telephone numbers, five fields for the address, such as prefecture, city and town, two fields for block numbers and building names, and three buttons. The fields and buttons are developed as parts of a graphical user interface employing speech recognition that works in the X window environment. By arranging these parts in a window, the developer of a system with speech recognition can easily create a specific window.
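The retrieval-and-classing step can be sketched as follows; the database contents and class names are illustrative assumptions based on the examples in this paper.

```python
# Toy database mapping (area code, exchange) to full addresses.
PHONE_TO_ADDRESSES = {
    ("03", "3509"): [
        ("Tokyo-to", "Chiyoda-ku", "Uchisaiwai-cho"),
        ("Tokyo-to", "Chiyoda-ku", "Kasumigaseki"),
        ("Tokyo-to", "Minato-ku", "Shinbashi"),
    ],
}
CLASSES = ("prefecture", "ward", "town")

def words_per_class(area_code, exchange):
    """Retrieve the candidate addresses for a number and store each word
    in its address class, forming the recognition vocabulary per field."""
    addresses = PHONE_TO_ADDRESSES.get((area_code, exchange), [])
    fields = {cls: [] for cls in CLASSES}
    for address in addresses:
        for cls, word in zip(CLASSES, address):
            if word not in fields[cls]:
                fields[cls].append(word)
    return fields

print(words_per_class("03", "3509"))
```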
Figure 2: A sample of the system's window. (The window contains an entry field for the phone number, fields for the address, and control buttons.)
The user interface and recognition procedure are implemented as functions of the fields. In the address input system, the fields for the address accept speech as input. After retrieving addresses using the area code and the exchange, the address words are assigned to each field according to their classes. If no address word is assigned to a field, the field turns dark to indicate that there is no entry. The user can input an address word using a mouse operation on a pulldown menu which shows the words stored in each field. The user can also input an address word using speech recognition. The procedure for speech recognition is briefly as follows. When a field for address input is focused by the user, the client sends the words stored in the field to the recognition server to set the vocabulary. While the user is pressing a key for speech input, the server accepts the speech data from the client and processes it. After the server has finished processing, it returns the words to the appropriate fields with plausibility scores. The user interface for speech recognition is described in detail in section 2.2.

2.2 User Interface

The following user interface supports speech recognition.
Indicator for Speech Input: Extracting the speech section correctly is an important issue for speech recognition. The speech recognition client should accept speech, but not noise, as input in order to obtain highly accurate speech recognition. In our server-client architecture, the client sends speech data to the server while the user is pressing a key for speech input. Some users, however, start speaking before they press the key or release the key before completing the speech input. In order to avoid such errors, the client darkens the field to notify the user that speech data are being sent. The user can start speaking after the field has returned to its original state.
The Following Candidate: When speech recognition is finished, the candidates are arranged in the field's menu list according to their plausibility scores. The most plausible candidate is shown in the field. If the word shown in the field is incorrect, the user can look for the correct word by pressing the space key and selecting the correct word with the return key. By operating the mouse, the user is also able to look for the correct word in the menu list, which shows ten candidates at a time. It is easy for users to
understand the key and mouse operations required, because similar actions are used by many other systems for transaction entry.

Figure 3: A sample of fixing and reducing of words
2.3 Use of Address Class
Using the user's selection of an address word in one class, the system fixes the words in the larger classifications and determines whether each word in the smaller classifications is related to the selected word. Figure 3 shows an example of fixing and determining words. In this figure, when the user selects the address word "Uchisaiwai-cho," the words in the larger classifications, such as "Chiyoda-ku" and "Tokyo-to," are automatically fixed using the relationships between the words. On the other hand, if the user selects the word "Minato-ku," the word "Tokyo-to" is fixed and the words in the smaller classification are reduced to "Shinbashi," "Toranomon," and "Nishishinbashi." By reducing the number of selectable words, it becomes easier for the user to decide on the words in the next classification and for the system to recognize the words spoken by the user.
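A sketch of this fixing-and-reducing rule, under the assumption that addresses are simple tuples ordered from larger to smaller classes, is given below.

```python
def fix_and_reduce(addresses, cls_index, selected):
    """After the user selects `selected` in class `cls_index`, fix every
    word in the larger classes and keep only consistent smaller-class words."""
    consistent = [a for a in addresses if a[cls_index] == selected]
    fixed = {i: consistent[0][i] for i in range(cls_index + 1)}
    remaining = sorted({a[cls_index + 1] for a in consistent
                        if len(a) > cls_index + 1})
    return fixed, remaining

addresses = [
    ("Tokyo-to", "Chiyoda-ku", "Uchisaiwai-cho"),
    ("Tokyo-to", "Minato-ku", "Shinbashi"),
    ("Tokyo-to", "Minato-ku", "Toranomon"),
]
print(fix_and_reduce(addresses, 1, "Minato-ku"))
# -> ({0: 'Tokyo-to', 1: 'Minato-ku'}, ['Shinbashi', 'Toranomon'])
```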
3 Experiment
An experiment was carried out to test whether speech recognition is useful for address input. The experiment focused on the input time needed for address input, because the goal of this system is to shorten the time required. The conditions and results are described below in detail.
3.1 Conditions
25 female adult subjects served as operators for the experiment. None of the operators had previous experience using speech input. Telephone numbers were arranged so that 10, 20, 30, or 40 addresses each would be retrieved from the database. 24 sets of telephone numbers and addresses were given to each operator; half of them were used for speech input, and the others for conventional input. The use of speech recognition for address input was left to the operator's discretion, so some users could opt not to use speech recognition even though it was available. The database contained 7,925 combinations of area codes and exchanges and 89,597 addresses in the Tokyo area.
"~' "6
10
z
I,
II
10
I
Input Time [sec]
Figure 4: A sample histogram of the operation time

The experiment consisted of, first, an explanation of the system and its operation; second, practice operations; and finally, 24 operations for time measurement. When all the words had been retrieved from the database and stored into the fields, the system started the time measurement. The system ended the measurement when the operator had fixed all the fields. In the experiment, the input times of the system were compared with those of another system that used the same user interface but excluded speech recognition. All key strokes, mouse operations, and speech data were recorded into files together with the time at which each action occurred.
3.2 Results
All records of the operations were categorized based on how many addresses the operation dealt with and whether or not the operation used speech recognition; that is, all records were categorized into one of eight cases. For each case, histograms of the operation time were made with a 2-sec class interval. Figure 4 shows a sample histogram made from the operations that dealt with 20 addresses and did not use speech recognition. In the figure, the largest number of trials, that is, the mode of the histogram, is 18 trials at 12 sec. In this experiment, the operation times for the cases were compared via the modes of their histograms. Figure 5 shows the modes of the operation time for each case. Where the user used the system without speech recognition, the tendency was that as the number of addresses increased, the amount of time required to input an address increased. In this system, the time may increase in proportion to the number of addresses, because it takes the operator more time and more operations to find the word to be input. However, for the system with speech recognition, the increase in time was comparatively smaller than in the conventional system. These experimental results therefore support the conclusion that the time required for address input by speech recognition does not significantly increase as the number of addresses increases.
Figure 5: Comparison of the time for address input. (Modes of the operation time in seconds plotted against the number of addresses, 10 to 40, with and without speech recognition.)
4 Conclusion
This paper has described an address input system with speech recognition. The system was implemented based on a server-client architecture for speech recognition, a graphical user interface developed using parts supporting speech recognition, and address input using address classes. The experimental results revealed that when the input volume is large, our speech recognition system significantly decreases the time required for data input in comparison with conventional means. Future work needed to further improve speech recognition applications includes improvement of the user interface and consideration of a support environment for user instruction.
Acknowledgment

The authors would like to thank Dr. Nobuhiko Kitawaki for his administrative support of this work. We would also like to acknowledge Tomokazu Yamada, Takashi Imoto and Yoshiaki Noda for their helpful technical support.
References
[1] Kai-Fu Lee, "The Conversational Computer: An APPLE Perspective," Proceedings of the Third European Conference on Speech Communication and Technology (EuroSpeech'93), Vol. 2, pp. 1377-1384, 1993.
[2] Alexander I. Rudnicky, "Factors Affecting Choice of Speech Over Keyboard and Mouse in a Simple Data-Retrieval Task," Proceedings of the Third European Conference on Speech Communication and Technology (EuroSpeech'93), Vol. 3, pp. 2161-2164, 1993.
[3] Lewis R. Karl, Michael Pettey and Ben Shneiderman, "Speech versus mouse commands for word processing: an empirical evaluation," International Journal of Man-Machine Studies, 39, pp. 667-687, 1993.
[4] Osamu Yoshioka, Yasuhiro Minami and Kiyohiro Shikano, "A Multi-Modal Dialogue System for Telephone Directory Assistance," Proceedings of the 1994 International Conference on Spoken Language Processing (ICSLP'94), Vol. 2, pp. 887-890, 1994.
A Menu-Guided Spoken Dialog System and Its Evaluation

Mikio Yamamoto, Takeshi Koike and Seiichi Nakagawa
Department of Information and Computer Sciences, Toyohashi University of Technology, 1-1 Tempaku, Toyohashi, 441, Japan
Abstract
This paper describes a man-machine spoken dialog system that integrates speech recognition with a menu-based natural language understanding technique. One of the difficulties in speech recognition is that the recognition rate falls as the set of sentences acceptable to the recognizer grows; that is, there is a serious tradeoff between recognition precision and coverage. To resolve this problem, we employ a menu-based natural language understanding technique as the guide for spoken input. We developed a preliminary system and evaluated it. The experiment shows that the menu-based spoken dialog system is suitable for novice users.

1. INTRODUCTION
A spoken natural language dialog system is one of the best man-machine interfaces for novice users, provided the system can understand and answer almost all of the user's queries. However, current technology in speech recognition and natural language understanding cannot perfectly recognize and answer all queries. The recognition rate depends on the perplexity of the grammar of the recognition system, where perplexity corresponds to the size of the search space in speech recognition. If we decrease the perplexity of the grammar, the recognition rate may improve; however, decreasing perplexity also decreases the number of recognizable input sentences. Although the recognition rate for acceptable sentences is high under such limited conditions, it becomes difficult for the user to utter an acceptable sentence. A grammar-driven speech recognition system never recognizes unacceptable sentences, and it is difficult for the user to know the system's capabilities. This tradeoff between precision and coverage is a serious problem for speech recognizers. We think a speech input system has to show its capabilities explicitly to the novice user through some guidance facility. To do so, we employ the menu-based natural language understanding technique. Sentences input to the original menu-based natural language understanding system [1] are made by selecting words or phrases from a set of dynamically changing on-screen menus. When a user selects a word or phrase from the menu, the system predicts the words or phrases that can grammatically follow the sequence of words selected so far and displays them as menus. This technique can explicitly show the system's capabilities through the menu. However, Thompson reported limitations of the original menu-based natural language understanding technique [2], pointing out the "big menu" problem.
This is the problem that, in relatively large tasks, too many items are active for selection, making it difficult for the user to pick a word or phrase from such a big menu. Reducing the size of the selection units solves this problem; for example, a phrase consisting of several words can be subdivided into words, and a word menu may be smaller than a phrase menu. However, reducing the selection units raises two new problems. One is low input efficiency: the selection time increases. The other is that it is unnatural to speak a small item from a menu; it is natural for the user to select an item that carries a lump of meaning. The Japanese language has phrases called 'bunsetsu', which are sub-sentential units. A 'bunsetsu' is a phrase consisting of a content word followed by a sequence of functional words, and most 'bunsetsu' take one or more functional words. Since a 'bunsetsu' is a lump of meaning and is familiar to Japanese people, it is easy for Japanese speakers to utter a sentence with pauses between 'bunsetsu' phrases. In addition, functional words are short and their sequences may be long, so the user would go to the trouble of selecting each functional word one by one. However, if we choose a 'bunsetsu' as the selection unit, the menu becomes big, because a 'bunsetsu' is a combination of content words and functional words. Also, in Japanese we must display several words for one root verb, since the inflection of a verb changes the mood of the sentence. In the inflection of a Japanese verb only the end of the word changes, so dividing a verb into an unchangeable part and a changeable part decreases the size of the menu; but in that case the unit of selection is too small and meaningless. Thus the choice of selection unit also involves a tradeoff: although we want as big a selection unit as possible, it leads to a big menu; and although a small menu is preferable, it leads to an unnatural speaking manner. In Section 2, we propose a new technique that resolves both tradeoffs, of speech recognition and of the menu-based technique, by integrating the advantages of both. In Section 3, we compare our method with other methods and evaluate it.

2. MENU-GUIDED SPOKEN DIALOG SYSTEM
To solve the problems discussed above, we propose a new interface which uses spoken language input in combination with the menu-based technique. The speech recognition system achieves a sufficient recognition rate when the input sentences are within the range of a limited grammar for speech recognition. However, the limitations reduce the set of acceptable sentences, so users' utterances are often not acceptable to the system. We need a guiding facility to help novice users speak acceptable sentences, and we think the menu-based technique can serve as such a facility. If we choose a 'bunsetsu' as the selection unit and the system displays all 'bunsetsu' as a menu, the menu may be too big. To avoid this problem, we employ a mechanism that displays only content words in the menu; the user utters a 'bunsetsu' made from a selected content word (or its inflected form) followed by any sequence of functional words.
This new method integrates the advantages of the menu-based technique, which explicitly shows the capability of the system to the user and guides the user toward an acceptable sentence, with those of speech recognition, by which the user can input a phrase that includes functional words not shown in the menu.
To evaluate our basic idea, we developed a preliminary spoken dialog system that accepts Japanese for the task of "sight-seeing guidance for Mt. Fuji." Figure 1 shows the system configuration. The next-word predictor predicts the content words and 'bunsetsu' categories that can legally follow the partial sentence input so far. Since this part uses the prediction mechanism of a continuous speech recognition system controlled by a context-free grammar (CFG), the grammar of the predictor is written as a CFG. The basic algorithm for predicting the content words and 'bunsetsu' categories uses Earley's top-down parser [4]. The predicted content words are sent to the menu manager and displayed. The predicted 'bunsetsu' categories are sent to the 'bunsetsu' recognition component, and a 'bunsetsu' spoken by the user is recognized using the grammar and dictionary related to the predicted 'bunsetsu' categories. The system's recognition rate and speed are very good, because candidates not related to the predicted 'bunsetsu' categories are not considered. Since the system displays the recognized 'bunsetsu' phrase, the user knows whether the recognition result is correct immediately after finishing the utterance. If the recognition result is not correct, the user can cancel the misrecognized 'bunsetsu' phrase by uttering 'cut.' When the user completes a sentence, he speaks the word that means the end; the system then sends the sentence to the dialog system, which analyzes its syntax and meaning and generates a response [5]. The dialog system has a database of about 500 entries for answering queries about sightseeing around Mt. Fuji.
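The prediction step can be illustrated with a compact Earley-style recognizer. The following is a minimal sketch over a toy grammar; the actual CFG, dictionary and 'bunsetsu' categories of the system are not reproduced in this paper, so every rule and token name below is invented for illustration:

```python
# Toy CFG: terminals stand in for content words / 'bunsetsu' categories.
GRAMMAR = {
    "S":     [["PLACE", "ROUTE", "Q"]],
    "PLACE": [["from_np"], ["from_np", "to_np"]],
    "ROUTE": [["how"]],
    "Q":     [["can_go"]],
}
TERMINALS = {"from_np", "to_np", "how", "can_go"}

def predict_next(tokens, start="S"):
    """Return the terminals that may grammatically follow `tokens`,
    using Earley's predict/scan/complete operations."""
    chart = [set() for _ in range(len(tokens) + 1)]
    for body in GRAMMAR[start]:
        chart[0].add((start, tuple(body), 0, 0))
    for i in range(len(tokens) + 1):
        changed = True
        while changed:                 # run predict + complete to a fixpoint
            changed = False
            for (head, body, dot, origin) in list(chart[i]):
                if dot < len(body) and body[dot] not in TERMINALS:
                    for rule in GRAMMAR[body[dot]]:            # predict
                        item = (body[dot], tuple(rule), 0, i)
                        if item not in chart[i]:
                            chart[i].add(item); changed = True
                elif dot == len(body):                          # complete
                    for (h2, b2, d2, o2) in list(chart[origin]):
                        if d2 < len(b2) and b2[d2] == head:
                            item = (h2, b2, d2 + 1, o2)
                            if item not in chart[i]:
                                chart[i].add(item); changed = True
        if i < len(tokens):                                     # scan
            for (head, body, dot, origin) in chart[i]:
                if dot < len(body) and body[dot] == tokens[i]:
                    chart[i + 1].add((head, body, dot + 1, origin))
    return {body[dot] for (_, body, dot, _) in chart[-1]
            if dot < len(body) and body[dot] in TERMINALS}

# After the first selection: predict_next(["from_np"]) -> {"to_np", "how"}
```

In the real system, the predicted terminals would be split into content words (sent to the menu manager) and 'bunsetsu' categories (sent to the recognizer to restrict its grammar and dictionary).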
Figure 2 shows a simplified view of the system's display. The list of Kanji items at the top is the menu; the Kanji string at the bottom is the sequence of words selected previously, 'Toyohashi kara', meaning 'from Toyohashi city' in English. Suppose the user wants to input the following sentence:

Toyohashi kara (from Toyohashi)
Fujisan made (to Mt. Fuji)
douyatte (how)
ikemasuka? (can I go)

which means 'How can I go to Mt. Fuji from Toyohashi city?'. He may then input 'Fujisan made' (to Mt. Fuji) as the next 'bunsetsu'. The bunsetsu 'Fujisan made' consists of the content word 'Fujisan' (Mt. Fuji), which is in the menu, and the Japanese post-position 'made' (to), which is not displayed as an item in the menu.
Figure 1: System configuration (menu-guided sentence input system, utterance recognition, and dialog system)

Figure 2: Display of the menu-guided spoken language input system. Menu items: Lake Kawaguchi, Lake Yamanaka, Mt. Fuji, White String Fall, Fuji Interchange, Gotenba Interchange. Word sequence selected previously: 'from Toyohashi'.
Since the menu contains only words that denote destinations, the user can easily see that he must specify a destination and can decide on the next utterance. The system can easily recognize the user's input 'bunsetsu', because the allowable 'bunsetsu' phrases are limited by the previous speech and the menu. Thus, while the system restricts the user's utterances for the sake of speech recognition, it also guides the user to decide easily on the next utterance by displaying the list of content words that it allows.

3. EVALUATION
3.1 Compared systems
For the evaluation of our technique, we compare four systems: (i) the menu-based spoken word input system ('word menu') [3], which accepts only utterances of the words in the menu; (ii) the full spoken sentence recognition system ('sentence recognition') [4], which accepts a whole sentence and displays no menu; (iii) the menu-based mouse input system ('mouse menu'), the same as the word-menu system except that selection is done with a pointing device instead of speech recognition; and (iv) our menu-based spoken 'bunsetsu' input system ('bunsetsu menu'). The speech recognition program is the same for all systems; it recognizes sentences, 'bunsetsu' phrases or words according to the grammar of each system. The program uses hidden Markov models (HMMs) of syllables and connects them dynamically in accordance with the word dictionary and grammar of the system. Although the systems cover the same task domain, they do not cover the same set of user input sentences, because the user of a menu-based system has different guidance information than the user of the sentence recognition system, which offers no guidance: the user of the menu-based spoken word input system has complete guidance, and the user of the menu-based spoken 'bunsetsu' input system has guidance intermediate between the other two. The less guidance the user has, the more freedom of input he has, so the sentence recognition system has to have larger coverage than the other systems. We tuned each system for the possible inputs; as a result, the sentence recognition system has the largest set of acceptable sentences and the menu-based spoken word input system the smallest. The numbers of words in the dictionaries of the sentence recognition, bunsetsu-menu, and word- or mouse-menu systems are about 180, 120 and 100, respectively. In the menu-based spoken 'bunsetsu' input system, about eighty words are used for displaying the menu. The word and 'bunsetsu' recognition rates of the menu-based systems are about 90 percent. The sentence recognition rate of the sentence recognition system is about 60 percent, but the sentence understanding rate --- which also counts a misrecognized sentence as correct when it has the same meaning as the input sentence --- is about 80 percent. Each recognition rate is for the case where the user speaks a word, 'bunsetsu' or sentence acceptable to the respective system; of course, no system can recognize a word, 'bunsetsu' or sentence outside its grammar and dictionary. In the next section, we describe the experiment, in which we give the subjects a scenario of a task that they have to perform.
Table 1: Results of evaluation

Novices:
  System            Avg. time per task  Task completion rate  Recog. rate  Avg. time per sentence
  Word Menu         25 min.             88%                   91.5%        120 (58) sec.
  Bunsetsu Menu     21 min.             98%                   87.1%        99 (54) sec.
  Sentence Recog.   31 min.             85%                   73.5%        133 (56) sec.
  Mouse Menu        24 min.             100%                  --           62 sec.

Experts:
  Sentence Recog.   12 min.             100%                  93.3%        69 (29) sec.
  Mouse Menu        7 min.              100%                  --           40 sec.

Figure 3: Proportions of acceptable and unacceptable utterances input to the spoken sentence recognition system (novices: acceptable 38%, unacceptable 52%)
3.2 Evaluation by task performance
Using the systems described in Section 3.1, we carried out an experiment to evaluate each system's performance in task execution. In the task scenario, subjects play the manager of an overnight car trip to Mt. Fuji and make a plan for the trip: they have to decide where to go, what to do, and which hotel to stay at, and they must retrieve information about routes and driving times. The subjects were volunteer students: twenty-two novices and six experts of the systems. Each speech-based system was used by five novice users, and the mouse-menu system by seven novice users; three experts used the sentence recognition system and the other three experts used the mouse-menu system. The results of this experiment are given in Table 1. "Novices" in the table shows the results for the novice users, and "Experts" the results for the expert users. "Average time per task" is the average time in which the subjects completed a task. "Task completion rate" is the rate of completion of the scenario we gave to the subjects. "Recog. rate" is the recognition rate for the utterance units of each system (words, 'bunsetsu' phrases or sentences). "Average time per sentence" is the average time to input one sentence with each system. The numbers in parentheses are the times assuming real-time speech recognition, i.e., recognition that finishes at the same moment the speech ends. Note that although the average time per sentence of the mouse-menu system is shorter than that of the bunsetsu-menu system, its average time per task is longer; the reason is that subjects using the mouse-menu system input more queries than those using the speech-based systems. The mouse-menu system has the best overall performance. But if we assume real-time speech
recognition, the average time per sentence of all the speech-based systems is shorter than that of the mouse-menu system. Among the speech-based systems, the experts using the sentence recognition system have the best results; however, for novice users the menu-based spoken 'bunsetsu' input system has the best results on every measure except the recognition rate. With the menu-based spoken word input system and the spoken sentence recognition system, some subjects gave up inputting certain queries, because the system could not understand their utterances even though they repeated the same content many times. In contrast, misrecognitions in the menu-based spoken 'bunsetsu' input system were not so consecutive as to make the user give up inputting queries. In the menu-based spoken word input system, this trouble is caused by the fact that Japanese postpositions are too short for the speech recognition system; in the spoken sentence recognition system, it is caused by the user's utterances being unacceptable to the speech recognition system. Figure 3 gives the proportions of acceptable and unacceptable input sentences to the spoken sentence recognition system: the novice users spoke acceptable sentences at a rate of only 38 percent, whereas the expert users did so at 88 percent. This explains the difference in input efficiency between novices and experts, and it shows the advantage of the menu-based system for novice users, for whom it is difficult to utter sentences acceptable to the spoken sentence recognition system. We can therefore say that the menu-based spoken 'bunsetsu' input system is the best for novice users who are not familiar with the system.

4. CONCLUSION
We proposed and evaluated a new spoken natural language dialog system that integrates the advantages of the menu-based natural language understanding and speech recognition techniques. Through comparison with other systems, we showed that our method is suitable for novice users and identified the reason why the sentence recognition system is not: more than half of the utterances of novice users were not sentences acceptable to the system, whereas about 90 percent of the experts' utterances were acceptable.

REFERENCES
[1] H.R. Tennant et al.: "Menu-based natural language understanding", Proceedings of the Conference of the ACL, pp. 151-158, 1983.
[2] C.W. Thompson: "Constraints on the design of menu-based natural language interface", CSL Technical Note #84-03, Texas Instruments Inc., March 1984.
[3] M. Yamamoto and S. Nakagawa: "Menu-based dialog system on Japanese", Proceedings of the 7th Annual Conference of JSAI, 19-7, pp. 537-540, June 1993 (in Japanese).
[4] A. Kai and S. Nakagawa: "A frame-synchronous continuous speech recognition algorithm using a top-down parsing of context-free grammar", Proceedings of ICSLP'92, pp. 257-260, 1992.
[5] M. Yamamoto, S. Kobayashi, Y. Moriya and S. Nakagawa: "A spoken dialog system with verification and clarification queries", IEICE Trans., Vol. E76-D, No. 1, pp. 84-94, 1993.
Face Observation Using an Active Camera

Qian CHEN, Takeshi FUKUMOTO, Haiyuan WU and Masahiko YACHIDA
Department of Systems Engineering, Osaka University, Machikaneyama-cho 1-3, Toyonaka, 560, Japan
{chen, fukumoto, wuhy, yachida}@yachi-lab.sys.es.osaka-u.ac.jp
Abstract
This paper describes a new approach to taking good images for face recognition tasks such as facial expression extraction. Face-like parts in an input image are first extracted by comparing the skin color regions and the hair regions detected in the image with several pre-defined 2-dimensional face pattern models, using a fuzzy pattern matching method. The 3-dimensional pose of the extracted face relative to the camera is then estimated from the areas and centers of gravity of the skin color region and the hair region of the face; this estimate is used to guide an active camera to change its view point so as to take an image in which the face appears in the desired pose.

Keywords: face recognition, active vision, face detection, pattern matching, pose estimation, skin color
1. Introduction
Automatic face detection and obtaining an image containing the face of interest in the desired pose are key problems in face information processing research such as face recognition and expression estimation. Most studies of facial expression extraction, face recognition and facial feature extraction assume that the input is an image in which the face is frontal and its size and position are known. However, such images are not available in general with a single fixed camera, because the face seen by the camera may not be in the desired pose, and with a fixed camera one has no way to change the view point to get the desired image. Obtaining an image containing the face of interest in the desired pose is thus a very important subject in man-machine communication and human interfaces, although very little attention has been paid to it. There are two reasons why we want such an image. First, a face in some desired pose (such as the frontal pose) is required by many face recognition and expression estimation approaches. Second, a face in the frontal pose can be verified easily. The active camera has been introduced in computer vision research to solve problems such as structure from motion and eye-hand cooperation, and using an active camera has proved an efficient way to make many difficult vision problems easier to solve. We introduce the active camera into our research in order to obtain the image containing the face of interest in the desired pose: when a face-like part is extracted from an input image, the pose of the "face" relative to the camera is estimated and then used to guide the camera to capture an image containing the "face" in the frontal pose.
2. Detecting Face Candidates Using Color Information
The first step in obtaining a frontal face image is face detection. Since automatic face detection has been studied by many researchers and is not the main subject of our research, we will not discuss it in detail. The face-like parts are detected using the skin color information of faces in the input image:
1. Extract the skin color regions and hair-like regions in the image.
2. Detect face-like regions by comparing the face models with the extracted skin color regions and hair-like regions using a fuzzy pattern matching approach.
In our research, the perceptually uniform color space described in [2] is used to represent color information in order to obtain reliable results. A model called the skin color distribution function (SCDF) is developed to represent the concept of skin color. The SCDF represents the frequency with which each color appears in human skin regions in images, and it can therefore be used to estimate the degree to which a color looks like skin color.
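Such an SCDF can be approximated by a normalized color histogram over labeled skin pixels; the following is a small sketch under that assumption (the 2-D chromaticity representation, bin count, and all names are illustrative, not the paper's actual color space or training data):

```python
import numpy as np

class SCDF:
    """Skin color distribution function as a quantized 2-D color histogram."""

    def __init__(self, bins=64):
        self.bins = bins
        self.hist = np.zeros((bins, bins))

    def train(self, skin_pixels):
        """skin_pixels: (N, 2) chromaticity values in [0, 1) sampled from
        hand-labeled skin regions."""
        idx = np.clip((skin_pixels * self.bins).astype(int), 0, self.bins - 1)
        np.add.at(self.hist, (idx[:, 0], idx[:, 1]), 1)
        self.hist /= self.hist.max()      # the peak skin color scores 1.0

    def similarity(self, pixels):
        """Per-pixel degree of skin-color likeness in [0, 1] (the SCS
        measure defined below)."""
        idx = np.clip((pixels * self.bins).astype(int), 0, self.bins - 1)
        return self.hist[idx[:, 0], idx[:, 1]]
```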
The skin color regions are extracted by estimating, for each pixel in the image, a continuous measure of how well it looks like skin color using the SCDF; we call this measure the Skin Color Similarity (SCS). The hair-like regions are extracted by finding the pixels with relatively low luminance and low saturation. We modeled the shapes of faces appearing in the image as three 2-dimensional patterns, one for the frontal face and two for side-view faces. Each cell in a model indicates how the corresponding part of a face should look. A model can be changed to any size by enlarging or reducing the size of its cells, so we can use the models to detect faces of arbitrary size. The face candidates in an image are found by comparing various rectangular regions of the image with the face models to see whether they look like a human face. To detect faces of any possible size in an image, we prepare a series of face models of various sizes; to detect faces of a particular size, we set the face model to the specified size by changing the size of each cell in it. We introduce fuzzy theory to perform the classification. We define five fuzzy sets, each corresponding to a particular kind of cell in the face models, as follows:

\[
\begin{aligned}
\mu_f &: F \mapsto \text{Face cell}\\
\mu_h &: H \mapsto \text{Hair cell}\\
\mu_b &: B \mapsto \text{Background cell}\\
\mu_{h/f} &: H/F \mapsto \text{Hair-or-face cell}\\
\mu_{h/b} &: H/B \mapsto \text{Hair-or-background cell}
\end{aligned}
\tag{1}
\]
To perform the classification, we first compute 3 kinds of measures for each square area in the selected image region: the hair measure (H), the face measure (F), and the background measure (B). Then the matching degree of each cell in a face model with the corresponding square area in the image is estimated using the fuzzy membership functions defined above:

\[
M(\text{cell}) =
\begin{cases}
\mu_f(F), & \text{if the cell in the model is an } F \text{ cell;}\\
\mu_h(H), & \text{if the cell in the model is an } H \text{ cell;}\\
\mu_{h/f}(H,F), & \text{if the cell in the model is an } H/F \text{ cell;}\\
\mu_{h/b}(H,B), & \text{if the cell in the model is an } H/B \text{ cell.}
\end{cases}
\tag{2}
\]
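A sketch of Eq. (2) in code might look as follows, assuming simple triangular membership functions; the actual fuzzy sets of the system are tuned to face data and are not specified in the paper:

```python
def triangular(x, peak, width):
    """Membership rising to 1 at `peak`, falling to 0 at peak +/- width."""
    return max(0.0, 1.0 - abs(x - peak) / width)

def match_cell(cell_type, F, H, B):
    """Matching degree M(cell) of Eq. (2) for one square image area,
    given its face (F), hair (H) and background (B) measures."""
    if cell_type == "F":
        return triangular(F, 1.0, 1.0)            # mu_f(F)
    if cell_type == "H":
        return triangular(H, 1.0, 1.0)            # mu_h(H)
    if cell_type == "H/F":
        return triangular(H + F, 1.0, 0.5)        # mu_h/f(H, F)
    if cell_type == "H/B":
        return triangular(H + B, 1.0, 0.5)        # mu_h/b(H, B)
    raise ValueError(f"unknown cell type: {cell_type}")

def match_model(cell_types, measures):
    """Overall matching degree of a face model over its cells, here taken
    as the mean of the per-cell degrees (one plausible aggregation)."""
    scores = [match_cell(t, *m) for t, m in zip(cell_types, measures)]
    return sum(scores) / len(scores)
```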
To detect regions containing faces, we use the face models to scan the entire image, estimating the matching degree between the face model and each rectangular region of the image.

3. Obtaining Good Face Images
Most existing methods for facial expression extraction and face recognition require that the face in the input image be in some special pose, such as frontal, and that its size and position be known (in most cases, the face must be in the center of the image plane and must be big enough). Images of this kind cannot always be obtained with a fixed camera, which makes most face recognition methods difficult to apply in practice. To solve this problem, we use an active camera that can change its view point freely, such as a camera mounted on a robot manipulator, to take good input images for face recognition. A good image for face recognition is defined as follows: 1. the face in the image is in the desired pose; 2. the face is at the image center; 3. the face in the image is big enough. We use an active camera to obtain such images, through the following steps:
1. Find the face in the input image.
2. Estimate the pose of the face.
3. Check whether the face is frontal. If it is, the process terminates.
4. Make a plan for the active camera to take a picture of the frontal face.
5. Guide the camera.
6. Go to step 1.

3.1 Estimating the pose and the position of a face
Much information can be used to estimate the pose of a face in an image, such as facial features; however, approaches based on facial features often fail because of the low reliability of the extracted features. In our research, the pose and the position of the face are estimated using the active camera and image processing techniques. The manipulator calibration is done beforehand, so the relation between the robot coordinate system and the camera coordinate system is known. First, the face is detected in the input image and its position, size and pose are estimated. This is done by extracting the hair part and the skin color part of the image and then finding a combination of them that composes a head-like region. The position of the face in the image is computed from the centers of gravity of the hair region and the skin color region. To estimate the 3D position of the face relative to the camera, we first bring the face in the image to the center of the image plane by rotating the camera around the x-axis and y-axis of the camera coordinate system. Since the camera is calibrated beforehand, the rotation angle required to bring the face to the image center can be computed from the displacement of the face position from the center of the image plane. The 3D position of the face then lies on the optical axis of the camera. Since the size of the human head does not vary much from person to person, we can use this fact to estimate the approximate distance between the camera and the face:
\[
d = f\,\frac{W_f}{w_i}
\tag{3}
\]
where d is the distance, W_f is the average width of the human head, and w_i is the width of the face as it appears in the image. The pose of the face is determined from the centers of gravity of the skin color region and the hair region. Since the face is approximately symmetric, when a face is viewed in frontal pose the face part lies at the center of the head and the hair part surrounds it; when a head is viewed from the left side, the face part appears on the right and the hair part on the left. Thus the pose of the face can be estimated by comparing the center of gravity of the hair part with that of the skin color (face) part. When the face is in frontal pose, the horizontal positions of the two centers of gravity coincide, so we can decide whether a face is frontal by checking the x-coordinates of the centers of gravity of the hair region and the skin color region. If the two differ, we can estimate the approximate direction of the face relative to the camera from these data. Although the shape of the face and the hair style differ from person to person, the pose estimated from this information reliably tells whether the face is frontal, and if it is not, it gives the correct direction in which the active camera should move to obtain a frontal-pose image.

3.2 Active Camera Control
A detailed discussion of the strategy for controlling the active camera can be found in [8] and [9]. When a face candidate is detected in an input image, the camera changes its direction to bring the face to the image center. Then the pose of the face is estimated using the method described above. If the face is not in the desired (frontal) pose, the camera rotates along a circular path whose radius is the distance between the face and the camera and whose center is the 3D position of the face, with the optical axis of the camera kept pointing at the center of the face (see Figure 1). The rotation angle is computed from the estimated face direction. Although the estimated direction of the face is not accurate, we use it only as the error measurement in visual feedback control, so at the final state the face in the image is very near frontal.
To take the frontal-pose image of the face, the camera rotates along a circular track whose center is on the face in 3D space and whose radius is the distance between the camera and the face.

Figure 1: Using an active camera to take the frontal-pose image of a face (R: distance between the camera and the head; Θ: estimated pose of the head relative to the camera)
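The feedback loop of Sections 3.1-3.2 can be summarized in Python-like pseudocode; `detect_face` and `rotate_on_circle` are hypothetical stand-ins for the paper's image processing and manipulator control, and the numeric constants are assumptions, not the robot's calibrated values:

```python
AVG_HEAD_WIDTH_M = 0.16    # assumed average human head width W_f (meters)

def face_distance(focal_len_px, face_width_px):
    """Eq. (3): d = f * W_f / w_i, distance from the apparent face width."""
    return focal_len_px * AVG_HEAD_WIDTH_M / face_width_px

def estimated_direction_deg(skin_cx, hair_cx, head_width_px, gain=90.0):
    """Rough face direction from the horizontal offset between the centers
    of gravity of the skin region and the hair region; zero when the two
    coincide, i.e., when the face is frontal. `gain` is an assumed scale."""
    return gain * (skin_cx - hair_cx) / head_width_px

def servo_to_frontal(camera, tol_deg=3.0):
    """Visual feedback: rotate on a circle around the head until frontal."""
    while True:
        face = camera.detect_face()                      # hypothetical helper
        theta = estimated_direction_deg(face.skin_cx, face.hair_cx,
                                        face.width_px)
        if abs(theta) < tol_deg:
            return                                       # near-frontal: done
        d = face_distance(camera.focal_len_px, face.width_px)
        # Rotate along a circular path centered on the head, by half the
        # estimated direction, since the direction tends to be
        # overestimated (see Figure 5).
        camera.rotate_on_circle(radius=d, angle_deg=theta / 2.0)
```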
4. Experimental Results
We carried out experiments to examine whether the pose of a face relative to the camera can be obtained from its face region and hair region, and to determine what kinds of information are useful for pose estimation. The experimental setup is shown in Figure 1. The testee sits on a chair in front of the camera. For each testee, we take an image of the face in frontal pose; the testee then turns 15 degrees to the left, we take the next image, and so on. For each image, we extract the skin color region and the hair region and compute the area and center of gravity of each. Figure 2 shows some examples of the extracted skin color region and hair region of a face; the results are shown in Figure 3. We found that the difference between the centers of gravity of the skin color region and the hair region is the most useful cue for face pose estimation. In all cases, this distance becomes zero when the face is frontal; when the face rotates to the left (relative to the testee), the distance increases monotonically, and when it rotates to the right, the distance decreases and becomes negative.
Figure 2: The extracted skin color regions and hair regions from a test image sequence (rotation angles relative to the frontal face: left 90, left 60, left 30, 0, right 30, right 60, right 90 degrees)
We also made experiments using the active camera to obtain face images in the frontal pose. The experimental setup is the same as in Figure 1. First we take an image and estimate the position and pose of the face relative to the camera; then we use the method described in the section above to build a motion plan for the camera, so that after the camera changes its view point, the image taken will contain the face in frontal pose. The image sequence obtained while guiding the active camera towards the frontal face is shown in Figure 4, and the experimental results are shown in Figure 5.
Figure 3: The results of the pose estimation of faces for several subjects (Man1-Man5, Woman1), plotted against the rotation angle of the face relative to the frontal pose. Fx: the x-coordinate of the center of gravity of the face part relative to that of the whole head. Hx: the x-coordinate of the center of gravity of the hair part relative to that of the whole head. Dx: the difference between the x-coordinate of the center of gravity of the face region and that of the hair region. Face Area: area of the face part measured as a percentage of the frontal head. Hair Area: area of the hair part measured as a percentage of the frontal head.
Figure 4: An image sequence obtained while guiding the active camera (face directions relative to the camera: -30, -5, -4, -3, -6 degrees). The angles are the real directions of the face relative to the camera.
5. Discussions and Conclusions
This paper described a new approach to obtaining the frontal-pose image of a face using an active camera and visual feedback, providing the desired views of faces that are required by many face recognition methods. The approach requires neither a pre-built 3D model of the human head nor 3D shape reconstruction during operation, and is therefore very efficient and flexible. The experiments on obtaining frontal faces show that the desired views of faces can be obtained successfully using an active camera. By merging this method with an expression recognition or person identification system, facial expression systems can make substantial progress toward real applications. The experiments on face pose estimation showed that the method described in this paper is robust and that its result is easy to use for controlling the active camera.
Figure 5: The experimental results of obtaining the frontal face using an active camera. All values are in degrees; each column corresponds to one step of the visual feedback loop.

Experiment 1:
  Face direction:       -30  -15  -9  -4  -2
  Estimated direction:  -30  -12  -9  -4  -4
  Camera rotation:      -15   -6  -5  -2  -2

Experiment 2:
  Face direction:        60   28  16   9   6
  Estimated direction:   64   23  13   5   8
  Camera rotation:       32   12   7   3   4

Experiment 3:
  Face direction:       -30   -5  -4  -3  -6
  Estimated direction:  -50   -2  -2   5  -4
  Camera rotation:      -25   -1  -1   3  -2

Experiment 4:
  Face direction:        60   12   0  -2  -1
  Estimated direction:   95   23   3  -2  -2
  Camera rotation:       48   12   2  -1  -1

"Face direction" is the real direction of the face relative to the camera. "Estimated direction" is the direction of the face relative to the camera estimated using the method described in Section 3. "Camera rotation" is the rotation angle of the camera along the circular path; we set it to 1/2 of the estimated face direction, since the face direction is often overestimated.
References
[1] Qian Chen and Saburo Tsuji, "Real Time Motion Estimation of Fingers in Color Image", Proc. 24th Conference on Image Engineering, pp. 103-106, 1993, Tokyo, Japan.
[2] Günter Wyszecki and W. S. Stiles, Color Science, John Wiley & Sons, Inc.
[3] Guangzheng Yang and Thomas S. Huang, "Human Face Detection in a Scene", Proc. CVPR'93, pp. 453-458.
[4] Xinguang Song, Chil-Woo Lee, Gang Xu, and Saburo Tsuji, "Extracting Facial Features with Partial Feature Template", Proc. Asian Conference on Computer Vision, pp. 751-754, 1994.
[5] Akitoshi Tsukamoto, Chil-Woo Lee and Saburo Tsuji, "Detection and Tracking of Human Face with Synthesized Templates", Proc. Asian Conference on Computer Vision, pp. 183-186, 1994.
[6] H. Zabrodsky, S. Peleg and D. Avnir, "Hierarchical Symmetry", Proc. CVPR'92, pp. 9-12, 1992.
[7] Roberto Brunelli and Tomaso Poggio, "Face Recognition: Features versus Templates", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 15, No. 10, pp. 1042-1052, Oct. 1993.
[8] Jiang Yu Zheng, Qian Chen, Fumio Kishino and Saburo Tsuji, "Active camera guided manipulation", IEEE Int. Conf. Robotics and Automation, pp. 632-638, 1991.
[9] Jiang Yu Zheng, Qian Chen, Fumio Kishino and Saburo Tsuji, "Active camera controlling for manipulation", CVPR'91, pp. 413-418, 1991.
Facial Features and Configurations Affecting Impressions of Faces

Takashi Kato*, Masaomi Oda, Masami K. Yamaguchi, and Shigeru Akamatsu†
†ATR Human Information Processing Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02 Japan
{tkato, oda, kanazawa, akamatsu}@hip.atr.co.jp

The present study investigated the relationships between physical characteristics and perceived impressions of human faces, namely the fierceness and gentleness of the face. The study demonstrated that the spatial properties and configurations of facial parts that distinguish between fierce and gentle faces can be identified. Such information on the physical characteristics of the face may be useful to interface designers who wish to incorporate faces into human interfaces in order to improve the intelligibility and/or perceived friendliness of human-computer communication.

1. INTRODUCTION
Human-computer interaction (HCI) to date has been mostly text- or icon-based. Previous studies show that a variety of factors affect the intelligibility and/or friendliness of such interaction. For example, the wording of system messages not only can confuse the user (thus causing unnecessary errors) but can make the system appear unfriendly or even hostile to the user [1]. It is not surprising that one would hope to utilize a (human) face as a friendly element of a human interface. The intent is to bring some flavor of face-to-face communication into HCI, much in line with the general argument for anthropomorphizing interface agents [2]. If, however, a human face (whether real or synthetic) is to be incorporated into future human interfaces in order to improve their friendliness as well as intelligibility to the user, interface designers must first understand how the user's perception of the face might be affected by the visual cues of the face which they would intentionally or unintentionally make available to the user. This design issue concerning computer output should be distinguished from that concerning computer input, such as automatic understanding of the user's facial expressions or gestures [3]. In this paper we report an experimental study of the relationships between the spatial properties and configurations of facial parts and subjective impressions of the face. The main objective of this study was to clarify whether particular facial features (e.g., eyebrow tilt) and/or configurations (e.g., distance between eyebrow and eye) determine whether a face of neutral expression is perceived as gentle or fierce.

*This research was conducted while the first author was a visiting researcher from Kansai University.
†We thank Ian Craw for producing the average faces presented in the Discussion, and Hideo Fukamachi and Shigeru Mukaida for technical assistance.
2. METHOD
2.1. Subjects
A total of 40 undergraduate students from Doshisha University were paid to participate in the experiment. They were randomly divided into two groups of 20 subjects each.

2.2. Stimuli
The facial stimuli used were black and white photographs of 101 males and 102 females who were in their 20's and 30's. The original pictures of 512x512 pixels were cut down to 236x236 pixels in order to reduce the visible portion of the external features of the face such as hair, neck, and clothes. The visible portion of the face was normalized across the faces by keeping constant the proportion of the retained area to the face's baseline unit, which was defined to be the length between the mouth center and the line connecting the centers of the eyes.

2.3. Facial features and configurations
For all the original faces mentioned above, the coordinates of a pre-defined set of 26 facial points were manually measured with the aid of an in-house tool developed on a Silicon Graphics system. We then defined three types of facial characteristics using another in-house tool which allows us to define any feature or configuration that can be specified by the set of pre-defined points of the face. The first type, which we termed individual features, relates to the properties of individual facial parts such as the eyebrow, eye, nose, mouth, chin, cheek, and face outline. There were 23 individual features, characterizing the width, height/thickness, area, tilt, and/or (curving) shape of these facial parts. The second type, termed positional configurations, concerns the positional relationships between two points of different facial parts. There were 16 positional configurations, characterizing the straight-line, vertical or horizontal distance between, or the tilt of the line connecting, two points of different facial parts. The third type, called area configurations, has to do with areas defined by three or more points of at least two different facial parts; there were 8 such area configurations. It should be noted that an area defined by the points of a single facial part was classified as an individual feature. Finally, we calculated the parameter values of all the defined features and configurations for each of the original faces. Since individual faces may have been photographed from (slightly) different distances, the measurements were normalized across faces such that each measurement of a given face was taken relative to its own baseline, defined as the vertical distance from the mouth center to the straight line connecting the centers of the eyes. All the measurement values in pixels were then converted to standard scores with μ = 50 and σ = 14 within the set of male or female faces, so that the variability of user-selected faces can be compared across different features and configurations.
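The two-stage normalization just described might be reproduced roughly as follows; this is a sketch with illustrative variable names, assuming one matrix of raw pixel measurements per face set:

```python
import numpy as np

def normalize_features(raw, baselines, mu=50.0, sigma=14.0):
    """raw: (n_faces, n_features) pixel measurements; baselines: (n_faces,)
    per-face baseline lengths (mouth center to eye-center line).

    Each measurement is first expressed relative to its face's own
    baseline, then converted to a standard score with mu = 50 and
    sigma = 14 within the face set (male or female), feature by feature."""
    rel = raw / baselines[:, None]
    z = (rel - rel.mean(axis=0)) / rel.std(axis=0)
    return mu + sigma * z
```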
2.4. Procedure
The experiment was controlled by a Silicon Graphics Indy system which randomly drew 10 faces at a time from a database, presented them on a selection screen, and kept user-selected faces on a background buffer screen. Subjects were free to select any number of faces from the currently presented 10 faces and to discard any of the previously selected faces from the background screen. One group of 20 subjects (11 females and 9 males) was asked to select the 8 most gentle/fierce faces from a database of 102 female faces, and another group of 20 subjects (10 females and 10 males) from that of 101 male faces. Subjects were instructed to proceed at their own pace using their own criteria for gentleness or fierceness of the face, and to terminate the retrieval task when they were satisfied with the 8 faces collected on the background screen.

3. RESULTS
For each set of 8 faces collected by a given subject as being the most fierce or gentle, the mean and the standard deviation (SD) of each defined feature and configuration were obtained. These means and SDs were subjected to two-way (fierce vs. gentle x features or configurations) analyses of variance (ANOVAs). For both female and male faces, the ANOVAs of means and SDs showed significant main effects for fierce vs. gentle faces and for different features and configurations, and a significant interaction between the two factors.

3.1. Female faces
Table 1 shows partial results of the analysis of simple main effects between fierce and gentle faces conducted on the mean data for female faces. The first column lists the significant (at the .01 level) features and configurations that distinguish between fierce and gentle faces. The second and third columns indicate whether fierce and gentle faces tend to have a smaller or larger value than the population average for that feature or configuration. The simple main effect analysis of the mean data also showed that the line connecting the left ends of the nose and mouth is tilted toward the right more for gentle than for fierce faces. The areas surrounded by different facial parts are smaller for fierce faces and larger for gentle faces than the population average, except that the area between the eyebrow and the eye for fierce faces, and that between the nose and the mouth for both fierce and gentle faces, are not significantly different from their population averages. The analysis of simple main effects for features and configurations conducted on the SD data for female faces indicated that fierce faces tend to have smaller variance for the shape and size of the chin region and for the distances between the eyes and between the eyebrows, whereas gentle faces have smaller variance for the face size and for the positional relations between the horizontal end of the nose and the eye, eyebrow or mouth end.

3.2. Male faces
Partial results of the analysis of simple main effects between fierce and gentle faces conducted on the mean data for male faces are shown in the bottom part of Table 1. The analysis also showed that the line connecting the left end of the nose and the right end of the left eyebrow is tilted toward the right more for fierce than for gentle faces, which have a mean value roughly identical to the population average. Fierce and gentle faces are not significantly different in the size of the areas surrounded by the eyes and the mouth, and by the nose and the mouth.
Table 1: Distinguishing Characteristics of Fierce and Gentle Faces (Partial List)

Female Faces:
  Features/Configurations           Fierce    Gentle
  Face size                         smaller   larger
  Face width (along the mouth)      shorter   longer
  Eyebrow tilt                      upward    downward
  Eye height                        shorter   average
  Nose length                       shorter   longer
  Upper-lip shape                   average   flatter
  Mouth width                       shorter   longer
  Chin size                         smaller   larger
  Eyebrow-eye vertical distance     longer    average
  Eyebrow-nose vertical distance    shorter   longer
  Eyebrow-mouth vertical distance   shorter   longer
  Eyebrow-chin vertical distance    shorter   longer
  Eye-nose vertical distance        shorter   longer
  Eye-chin vertical distance        shorter   longer
  Mouth-chin vertical distance      shorter   longer

Male Faces:
  Features/Configurations           Fierce    Gentle
  Eyebrow shape                     curved    flatter
  Eyebrow thickness                 thinner   average-thicker
  Eyebrow tilt                      upward    downward
  Eye shape                         flatter   rounder
  Eye height                        shorter   average
  Eye size                          smaller   average-larger
  Upper-lip shape                   average   curved
  Mouth width                       average   longer
  Distance between eyebrows         shorter   average
  Eyebrow-mouth vertical distance   shorter   longer
  Eyebrow-nose vertical distance    shorter   longer
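The two-way ANOVAs reported at the beginning of this section could be approximated with standard tools; below is a sketch assuming a long-format table of per-subject values with illustrative column names (the original analysis software is not described in the paper):

```python
import statsmodels.api as sm
from statsmodels.formula.api import ols

def two_way_anova(df):
    """df columns: 'value' (per-subject mean or SD of one measurement),
    'impression' ('fierce' or 'gentle'), and 'feature' (feature or
    configuration name). Returns the ANOVA table with both main effects
    and the impression x feature interaction."""
    model = ols("value ~ C(impression) * C(feature)", data=df).fit()
    return sm.stats.anova_lm(model, typ=2)
```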
The analysis of simple main effects for features and configurations conducted on the SD data for male faces indicated that fierce faces tend to have smaller variance for the shape of the nose, the horizontal width of the eye, the distance between the eyes, and the area defined by the horizontal ends of the nose and mouth, whereas gentle faces have smaller variance for the curving shape of the eye, the tilt of the line connecting the left end of the nose and the chin, the distance between the nose and the mouth, and the area defined by the horizontal ends of the nose and mouth.

4. DISCUSSION
One important characteristic of the present method is that subjects are asked to collect a set of faces which all meet a particular retrieval requirement. It can be expected, therefore, that subjects would collect a set of faces with consistent values for important features and/or configurations while making necessary compromises on facial characteristics that are less important to them. It follows that the relative size of the mean SDs can be taken as a viable index of the relative importance of features and configurations [4]. In characterizing fierce and gentle faces, therefore, we took into account the relative size of the mean SDs when identifying the features and configurations that seem to distinguish between them. Female fierce faces appear to have tilted eyebrows, thin and tilted eyes, and a small mouth. The face outline tends to be thin, and the internal features are compacted in the vertical direction of the face. Although the spatial separation between the eyebrow and the eye tends to be larger, the area surrounded by the inner corners of the eyebrows and eyes is relatively smaller than average. In contrast, female gentle faces appear to be characterized by horizontally longer eyebrows, bigger eyes, a larger mouth with a thinner upper lip, and a larger and rounder face; the internal features tend to be spread out along the vertical direction. Male fierce faces seem to have curved eyebrows, thinner eyes, and a thicker lower lip. The distance between the eyebrows tends to be shorter, and the area surrounded by the inner corners of the eyebrows and eyes smaller than average. Also, the eyebrows, nose and mouth are relatively compacted in the vertical direction. Male gentle faces, on the other hand, appear to have rounder eyes, a thinner upper lip, and a thicker lower lip. While the distance between the eyes tends to be shorter than average, the eyebrows, nose and mouth tend to be spread out along the vertical direction. The average fierce and gentle faces for females and males shown in Figures 1-4 clearly illustrate the general patterns of the features and configurations discussed above; these faces were produced by averaging the 5 most frequently selected faces in each category. The present study demonstrates that the spatial properties and configurations of facial parts that distinguish between different impressions of faces can be identified. Such information on physical characteristics might be useful to an interface designer who wants to select or design an appropriate face for a particular type of interaction between a system and the user. For example, the designer might adopt a (genuinely) gentle face for normal communication with the user but introduce a fierce face where a serious warning is called for. As another example, in designing a synthetic face an interface designer might wish to manipulate the distinguishing characteristics so that a standard face appears gentler or fiercer depending on the intent to be communicated to the user. It should be pointed out that the experimental method used in the present research can easily be applied to the investigation of other types of impressions; in fact we have successfully used it to investigate the physical characteristics of cheerful and gloomy faces [5]. A cautionary note, however, is warranted here. The present study measured only the spatial properties of facial parts and their configurations. We have no intention of claiming that such spatial information is the most effective or sufficient means of distinguishing between different impressions of the face.
The perceived impression of a neutral face is likely to be affected by other visual cues, such as the fairness and fineness of the skin and the three-dimensional shape information conveyed by shading, both of which are noticeable even in black and white photographs. We are currently pursuing other ways of representing facial images so that a more effective set of distinguishing characteristics, spatial or otherwise, of different types of faces can be identified.
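As noted above, the average faces shown in Figures 1-4 were produced by averaging the five most frequently selected faces in each category; for registered grayscale images, a toy version of such pixelwise averaging is straightforward (the study's actual averaging tool, credited to Ian Craw, is not described here):

```python
import numpy as np

def average_face(faces):
    """faces: list of equally sized, geometrically registered 2-D uint8
    arrays; returns their pixelwise mean as a uint8 image."""
    stack = np.stack([f.astype(np.float64) for f in faces])
    return stack.mean(axis=0).astype(np.uint8)
```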
Figure 1: Average Fierce Female. Figure 2: Average Gentle Female. Figure 3: Average Fierce Male. Figure 4: Average Gentle Male.
REFERENCES
1. B. Shneiderman, Designing the User Interface, Addison-Wesley, Massachusetts, 1992.
2. B. Laurel, Interface Agents: Metaphors with Character. In B. Laurel (Ed.), The Art of Human-Computer Interface Design, Addison-Wesley, Massachusetts, 1990.
3. R.E. Eberts, User Interface Design, Prentice-Hall, New Jersey, 1994.
4. T. Kato and M. Oda, Indirect Measurement of Feature Saliency in Face Processing. ATR Technical Report TR-H-022, 1993.
5. T. Kato, M. K. Yamaguchi and R. Takahashi, Categorizing Faces Based on Subjective Impressions, 1995 (submitted for publication).
Anthropomorphic Media Approach to Human-Computer Interactive Communication Using Face Robot

H. Kobayashi* and F. Hara*
ABSTRACT
This paper deals with a realtime-response face robot that demonstrates the integration of the three functions of an active human interface (AHI), which realizes interactive human-computer communication through the anthropomorphic modality of a face robot. As the 1st function of AHI, the face robot recognizes the existence of a human being in her view by using his/her face image data and measures the distance between the human being and herself; as the 2nd function of AHI, according to the distance obtained above, the face robot determines the facial expression to be displayed; and as the 3rd function of AHI, the face robot realizes the facial expression on her face. The experiment accomplishes the realtime response of the face robot, in terms of her facial expression, to the recognized distance, at 80 ms per recognition.

1. INTRODUCTION
Since our daily human-to-human communication is fully based on face-to-face communicative interaction, the ideal communication between human and computer is considered to take the form of face-to-face interactive information exchange, and this kind of ideal communication seems to be our goal in designing the interface for human-computer communication. The most critical feature of face-to-face communication between humans is the multiplicity of communication channels, such as facial expression, voice, gesture and so forth. A channel is, of course, a communication modality associated with a particular way of encoding information: for instance, an auditory channel carrying voice as speech utterance, intonation and so on, and the visual channel for face actions including the nodding motion of a head, facial expression and so forth. Thus face-to-face communication between humans is a multi-modal communicative interaction and has long been a difficult subject in socio-psychology as well as in cognitive psychology [1]. Regarding face-to-face communication, Mehrabian [2] indicated that only 7% of a message is carried by linguistic expressions, 38% is due to voice, and 55% is transferred by facial expressions. Although such statistics seem to depend strongly on the situation involved, they imply that facial expression is surely a major modality in face-to-face communication between humans. We thus imagine that, when designing an interface between human and computer, facial expression would be a good modality for improving the communicability of messages even in human-computer communication. The present paper assumes that facial expression is a communicative signal that transfers mostly psychological messages in human-to-human communication. In this view of facial expression, there have been quite a few works [3-6] on the use of facial expression as a communication modality in human-computer interaction, all of which have been based on a computer graphics (CG) approach. We think that the CG approach can surely offer some advantages in the human interface between computer and human, but there is no necessity to confine the facial expression to 2-dimensional CRT display devices when we want to expand the use of computers to artistic media, entertainment, and other new fields. We thought that a 3-dimensional, human-like face might be more realistic in human-computer communication if realized, and we therefore developed a 3-dimensional, realistic face robot as a communication modality in human-computer communication. This paper first proposes a new concept of "Active Human Interface (AHI)" to realize interactive communication between human and computer. Thereafter, in order to demonstrate the integration of AHI, we present the realtime measurement of the psychological distance of the face robot from its human partner, and also the realtime display of the robot's facial expression in response to the psychological distance measured. Finally, the work to be done in the future is briefly discussed.

2. ACTIVE HUMAN INTERFACE
When we think about human-computer communication as an advanced form of human-computer interaction, the computer needs to possess a special interface to communicate psychological as well as other information interactively with the human user, as in human-to-human communication. The computer interface, then, must be equipped with communication functions similar to those evolved in human beings. This capability is thought to be composed of at least the following three functions, as pointed out by Arbib [7]:
* Department of Mechanical Engineering, Science University of Tokyo, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162, Japan. E-mail: ~0103.me.kagu.mt.ac.jp
(1) Recognition of the human user's intention, feeling or state of mind, (2) Decision of the responding action, and (3) Display of it in an appropriate form. As, on the human user's side, he/she is already equipped with these three functions, the human user can undertake mutually interactive communication with the computer. This interactive communication may make it possible for the user's intelligence to be activated effectively. The first function of the three is interpreted as a "sensor" agent to detect the message of feeling, emotion or state of the human user; the 2nd function may be a "controller" agent to determine what kind of action should be taken toward the human user; and the 3rd one an "actuator" agent to output the proper action to transfer the computer's message. The human user may then be interpreted as a "control object" in the feedback loop when we consider the interactive communication between human and computer. From the discussion stated above, we have reached the concept of "Active Human Interface (AHI)" as a new paradigm for developing the technology to realize "face-to-face" communication between human user and computer. The AHI is, of course, a multi-modal communication medium. When we think about the demonstration of AHI integration by using visual information as the communication modality, the following information processing is necessary for AHI: 1. the face robot recognizes the appearance of a human being and measures the psychological distance between the face robot and the human; 2. when the human is getting close to the robot, the face robot starts its interactive response, then recognizes the facial expression of the human and responds to it. From the viewpoint of "human space" in communication psychology, the distance from the other party is very influential, which leads us to think that the measurement of the distance between the face robot and its human partner is also an important factor in developing AHI technology. To demonstrate the integration of the three AHI functions in the situations stated in 1. and 2. above, we have developed a realtime distance measurement between the face robot and its human partner, and also a realtime display of the facial expression of the face robot in response to the psychological distance. In other words, as the 1st function of AHI, the face robot recognizes the appearance of a human being in the robot's view by using the human image data and measures the distance between the human being and the robot itself; as the 2nd function of AHI, according to the distance obtained above, the face robot determines the facial expression to be displayed; and as the 3rd function of AHI, the face robot generates the facial expression on her face.
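Read as software, the three agents above form a closed sensing-decision-actuation loop around the human "control object". The following Python sketch illustrates that loop only; it is not the authors' implementation, and the detection stub, the expression rules and the distance zones are illustrative assumptions.

# Minimal sketch of the AHI sensor-controller-actuator loop described above.
# Not the authors' implementation: the detection routine, the expression
# rules and the distance zones are illustrative assumptions.
import random
import time

def recognize_human(left_frame, right_frame):
    """Sensor agent: detect a person and estimate the robot-human
    distance in metres (None when nobody is in view)."""
    # Placeholder for the stereo vision pipeline of Section 3.
    return random.uniform(1.5, 5.0) if random.random() > 0.2 else None

def choose_expression(distance_m):
    """Controller agent: map the measured distance to a facial
    expression with a crude three-zone rule (an assumption)."""
    if distance_m < 2.0:
        return "surprise"
    if distance_m < 3.5:
        return "happiness"
    return "neutral"

def display_expression(expression):
    """Actuator agent: drive the face robot's actuators (stubbed out)."""
    print("face robot displays:", expression)

for _ in range(5):                        # a few iterations of the AHI loop
    distance = recognize_human(None, None)     # 1st AHI function
    if distance is not None:
        display_expression(choose_expression(distance))  # 2nd + 3rd functions
    time.sleep(0.08)                      # the paper reports ~80 ms per cycle

In the real system the sensor stage is the stereo transputer pipeline of Section 3, and the actuator stage drives the face robot's mechanical expression hardware.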
3. REALTIME MEASUREMENT OF ROBOT-HUMAN DISTANCE

In order to recognize a human image by computer, parallel processing was thought to be essential in our research work, since it makes it possible to calculate all the image data fast by dividing the tasks among several CPUs. We used a transputer (T805: 15 Mips, 2.3 Mflops) system that performed parallel processing for the realtime distance measurement. Each transputer is connected with 4 other transputers by 4 serial links and performs plural tasks in one transputer by using the thread function for parallel processing. Inside the eyeballs of the face robot, we installed two small CCD cameras (12*40 mm, F=1.8) for visual data acquisition, and we used the human images obtained from them for our realtime distance measurement. The characteristics of the parallel processing system, the performance constraints of the system, and the distance measurement procedure are briefly described in the following sections.

3.1 Characteristics of Parallel Processing System
The distance between the face robot and its human partner is measured under the following conditions:
(a) We calculate the distance by using the difference of the gravity centers between the two images obtained from the CCD cameras installed in the right and left eyes of the face robot.
(b) One color CCD camera is used to detect the hair and skin color (the other is black & white).
(c) It is not necessary to pay special attention to the background scene and lighting.
Since the memory capacity needed for color image data is 3 times that for black and white, color requires a longer time for image processing, and thus we employed black and white image data for obtaining the image differential. For acquiring the hair and skin color data mentioned in (b), the color image was used, but for the others we utilized black and white monochrome image data, transforming the color (RGB) data to black and white. The transformation from RGB color data to monochrome data (Y) follows the YIQ color system:

Y = 0.299R + 0.587G + 0.114B    (1)
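Applied per pixel, eq. (1) is a simple weighted sum over the three color planes. The sketch below is a direct transcription of the formula; the use of NumPy and the 256*240 frame size (taken from Section 3.2) are the only assumptions.

# Sketch of the RGB-to-monochrome conversion of eq. (1) (YIQ luminance).
import numpy as np

def rgb_to_monochrome(rgb):
    """rgb: array of shape (H, W, 3) holding the R, G, B planes; returns Y."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

frame = np.random.randint(0, 256, size=(240, 256, 3))  # one 256*240 frame
y = rgb_to_monochrome(frame.astype(float))
print(y.shape, y.min(), y.max())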
It is noted here that the distance measurement has the following conditions:
(1) This visual distance measurement system was used inside a room.
(2) The target person for the visual distance measurement was assumed to be moving rather than staying still, and the system did not identify the human as an individual (the anthropomorphic face robot was assumed to respond actively to a moving person).
(3) The range of the distance measurement was from 1.5 m to 5.0 m (the range depends, of course, on the focus distance of the CCD cameras used).

3.2 Distance Measurement Procedure
Fig. 1 shows the construction of the transputer system and the data flow for the realtime distance measurement. In this figure, each square enclosed by a bold line indicates a transputer, and each datum processed is transferred along the arrows by using data channel communication,
Fig. 1 The construction of the transputer system: right and left CCD cameras (one color (RGB) and one black & white, 256*240 pixels) feed transputers with DSP; the image data are reduced, RGB is transformed to black & white, the centers of gravity of the head parts are calculated on transputers with i860, the human-robot distance is computed, and the result is sent via RS232C to the controller of the face robot.
which implies that one processing step does not start its task until the information comes from the other processor along the arrow indicated in Fig. 1. The timing of obtaining images from the right and left CCD cameras was controlled to be synchronous by the channel communication. The spatial resolution of one frame image is 256*240 pixels with 8-bit depth in the color spectrum. The image sampling was taken at a rate of 1/30 s. In Fig. 1, the shadowed portions indicate the processing of the color image data used to acquire the hair and skin color mentioned before (Fig. 2: the conceptual procedure for the processing, in color and in black & white). The details of each processing step are described below; the i.d. number at the head of each explanation corresponds to that in Fig. 1. It is noted here that the spatial resolution of the image data was changed when a high-speed calculation was needed.
(1)-1 Reduce the obtained image data to 52*48 pixel image data (one for every 5 pixels in the horizontal and vertical directions) and forward them to (5)-1.
(1)-2 The same as (1)-1, for the RGB color image data.
(2) In order to establish a fast calculation on the image data, we divide the image data into 3 portions from the top to the bottom of the image, and each of those is transferred to one of 3 transputers. The processing steps from (3) onward mentioned below are performed in each transputer independently.
(3)-1 Reduce the obtained image data to 86*61 pixel image data (one for every 4 pixels in the vertical direction and one for every 3 pixels in the horizontal direction) and forward it to (6)-1.
(3)-2 By using eq. (1), transform the RGB color image data to monochrome image data and perform the same as (3)-1.
(4) The gravity center is calculated using the image data mentioned in (3)-1 and (3)-2. Although, to obtain a more accurate gravity center, it would be better to use the 256*240 pixel original image data, this processing needs more time than the others; we therefore used the reduced image data.
(5)-1 Receive the forwarded data from (1)-1.
(5)-2 Receive the forwarded data from (1)-2.
For the remaining processing steps, Fig. 2 shows the conceptual procedure of their data processing.
(6)-1 Calculate the image differential; i.e., if the difference between the present image data and the image data one step before is larger than 21, a value determined empirically for each pixel, we set the pixel value to 1, and the others are set to 0.
(6)-2 1) By using the RGB image data, the hair and skin color areas are determined; the pixel values are set to 1 for those areas and to 0 for the other areas. It is noted here that, if there is no hair or skin color area, the transputer system judges that no human being is present. 2) By using eq. (1), transfer the RGB image data to monochrome data and perform the same as (6)-1. To determine the skin color, we employ eq. (2) [10].
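The differencing, centroid and distance steps can be summarized compactly in code. The sketch below follows the description above, with the empirical threshold of 21; the stereo conversion from the left/right centroid disparity to a metric distance is an assumed pinhole-camera relation, since the paper's calibration constants are not given.

# Sketch of the image differential, center of gravity, and a distance
# estimate from the left/right disparity of the gravity centers.
# The threshold 21 follows the text; the focal length f (pixels) and
# camera baseline b (metres) are illustrative assumptions.
import numpy as np

THRESHOLD = 21  # per-pixel threshold, empirically determined in the paper

def image_differential(current, previous):
    """Binary motion mask: 1 where |difference| exceeds the threshold."""
    return (np.abs(current - previous) > THRESHOLD).astype(np.uint8)

def center_of_gravity(mask):
    """(x, y) centroid of the nonzero pixels, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return xs.mean(), ys.mean()

def distance_from_disparity(cog_left, cog_right, f=500.0, b=0.1):
    """Assumed stereo relation z = f*b/disparity for the two eye cameras."""
    disparity = abs(cog_left[0] - cog_right[0])
    return f * b / disparity if disparity > 0 else float("inf")

# Toy usage with synthetic 86*61 frames:
prev = np.zeros((61, 86))
curr = np.zeros((61, 86))
curr[20:40, 30:50] = 255                  # a "moving head" region
mask = image_differential(curr, prev)
print(mask.sum(), center_of_gravity(mask))
print(distance_from_disparity((40.0, 30.0), (36.0, 30.0)))  # -> 12.5 m (toy)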
Fig. 5 Result of realtime response of the face robot (target displacement vs. time [s]).
Fig. 6 Attention points for expressing facial expression.
As pointed out before, it is surely better to additionally employ other kinds of communication modalities, such as facial expressions and voices, and we are especially interested in prosody information processing in speech recognition and synthesis. As a new direction of our research, the integration of realtime facial expression recognition with prosodic speech recognition is the most urgent and exciting way to improve the communicability. Finally, the following should be pointed out. First, multiple channels in human-computer communication can surely reduce the current necessity of communication regulation imposed upon human users and can invoke more spontaneous, human-like communication. This will flourish when we develop the essential technology, or "artificial mind", for dynamically coordinating the messages transferred through various communication modalities. Secondly, anthropomorphism in the human-computer interface will surely be realized by using a 3-dimensional, human-like face robot. The face robot can recognize the human user's facial expressions as well as prosodic speech, and can then display messages through the robot's facial expression and prosodic speech. This might imply a humanoid computer with which we would communicate as in human-to-human communication.
ACKNOWLEDGMENTS
This research work was partially supported by the Scientific Grants 04236107 and 05452172 of the Japanese Ministry of Education, Science and Culture, through 1992-1994.
REFERENCES
(1) Bruce, V., "Recognising Faces (translated into Japanese)", Science Pub. Co., pp. 1-205 (1990)
(2) Mehrabian, A., "Communication without Words", Psychology Today, Vol. 2, No. 4 (1968)
(3) Takeuchi, A. and Nagao, K., "Communicative Facial Displays as a New Conversational Modality", ACM/IFIP INTERCHI'93, pp. 187-193 (1993.4)
(4) Seto, S. et al., "Multi-Modal Response of a Real-Time Speech Dialogue System (in Japanese)", 8th Symp. Human Interface, SICE, pp. 693-698 (1992)
(5) Hara, F., "A New Paradigm for Robot and Human Communication (in Japanese)", JSME Proc. Robotics and Mechatronics, No. 940-21, pp. 1-9 (1994)
(6) Hasegawa, O., Yokosawa, K. and Ishizuka, M., "Realtime Parallel and Cooperative Recognition of Human Face for a Naturalistic Visual Human Interface (in Japanese)", The Trans. of IEICE D-II, Vol. J-77-D-II, No. 1, pp. 108-118 (1994)
(7) Arbib, M. A., "Neural Networks and Brain (translated into Japanese)", Science Pub. Co., pp. 1-507 (1992)
(8) N. L. Ashton, M. K. Shaw and A. P. Worsham, "Affective Reaction to Interpersonal Distances by Friends and Strangers", Bulletin of the Psychonomic Society, Vol. 15(5), pp. 306-308 (1980)
(9) Fukuda, F., Shigenori, I. and Arai, F., "Recognition of Human Face Using Fuzzy Inference and Neural Network (in Japanese)", JSME Trans. C, Vol. 59, No. 558, pp. 200-206 (1993)
IV. Ergonomics and Health Aspects of Work with Computers
IV.1 Health Aspects
Symptom clusters among VDU-workers. Part of the Norwegian field study: The VDU workplace - A new ergonomic concept*.
K. I. Fostervold a, I. Lie b, S. Larsen c, G. Horgen d, A. Aarås e & A. Vågland f
a,b Vision Laboratory, Institute of Psychology, University of Oslo, P.O. Box 1094, Blindern, N-0317 Oslo, Norway. E-mail: K.I.Fostervold@psykologi.uio.no
c Medstat Research AS, P.O. Box 210, N-2001 Lillestrøm
d College of Engineering, Dep. of Optometry, P.O. Box 235, N-3601 Kongsberg
e Alcatel STK AS, P.O. Box 60 Økern, N-0508 Oslo
f Ergonomidesign AS, Huitfeldsgt. 25, N-0253 Oslo
1. Introduction
Over the last decade implementation of advanced information technology has revolutionised ordinary office work, and video display units (VDU) are today a dominating device in most western workplaces. Due to this rapid implementation, very little was known about possible health risks that could originate from prolonged interactive VDU work. When reports about adverse health reactions among VDU-workers started to appear, this was taken seriously by many scientists and health professionals. The initial concerns were mainly directed toward eye problems. Visual and oculomotor consequences of sustained excessive near-work have been research topics in visual perception and optometry for almost 50 years (1,2,3,4,5). Since typical VDU-work tasks are dominated by intensive and sustained visual work at short distances, a renewed interest in these issues was therefore a logical consequence. Subsequently, other health concerns also appeared, and typical problems discussed in the literature are: musculoskeletal problems, psychosocial factors, concentration problems, skin symptoms, headache, dizziness and general tiredness (4,6,7,8,9,10). Among these, musculoskeletal problems and psychosocial factors have gained particular attention. Musculoskeletal problems are reported to be very common in the industrialised world (11,12). Repetitive movements and long lasting static muscle load are assumed important factors in the aetiology of
* "The VDU workplace- A new ergonomic concept", are financially supported by: The Norwegian Research Council and The National Insurance Administration.
these conditions (13). A causal relationship between musculoskeletal problems and intensive VDU-work is therefore expected, since most VDU operations are sedentary and necessitate repetitive and restricted body movements. Recent research reports have shown an increase in musculoskeletal problems among VDU-workers, even though they are not always consistent about whether the increase is higher among VDU-workers than among other employees (14,15,16). Other studies emphasise psychosocial factors as an explanation. In this perspective, focus is directed toward environmental factors such as job design, time pressure and obsolescence, or more person-centred concepts like mental workload, coping style and cognitive failure (9,10,17,18). In the public opinion, new technologies always represent a source of potential hazards. In such situations people tend to blame existing symptoms, already associated with certain causes, on new causes. The symptoms associated with VDU work could therefore be interpreted as nothing else than the normally occurring levels of non-specific symptoms that are present in any population. Research projects on VDU-related complaints use questionnaires providing frequency distributions of visual, musculoskeletal and vegetative symptoms. Little is known, however, as to whether or not these exist as symptom clusters among VDU-workers. If such clusters exist, advocates of the psychosocial explanation must explain why people develop different symptoms. If clusters do not exist, advocates of a more direct causal link between VDU-usage and symptoms must explain why this specific influence gives rise to different symptoms. The aim of the present study is to search for symptom clusters among VDU-workers participating in an on-going field study in Norway.

2. Method
The on-going field study "The VDU workplace - A new ergonomic concept" emanates from laboratory research which has shown significant health effects of gaze angle and optical corrections during VDU-work (19,20,21). The objective is to implement these variables in an ordinary office environment and to investigate whether they also reduce complaints in natural settings. The field study design includes the following health measures: a subjective symptom questionnaire, EMG-measurements, a visual examination and a health examination. The present paper is based on the symptom questionnaire administered as a part of the baseline measurements in this field study. Subjects were recruited among employees in The National Insurance Administration, The National Insurance Office for Social Insurance Abroad and Local Social Insurance Offices in Oslo. A sample of 150 subjects out of a total pool of 500 employees was selected according to a given set of criteria. Subjective symptom reports are often subject to criticism. Scientists claim that they may produce biased results as a consequence of self-selection. This represents a danger of over-reporting of symptoms, since respondents who complete questionnaires are in most cases highly motivated. To prevent this, the symptom
questionnaire was distributed personally to each subject, together with short oral instructions. Each subject was told to take the questionnaire home, to read the instructions carefully and to complete it under quiet conditions. The questionnaire hand-in procedure included a verification system that made it possible to distribute personal reminders. Very few reminders were necessary, and all subjects except one (99.33%) returned completed questionnaires. To enhance the validity further, the symptom frequencies found in this study were compared with symptom frequencies reported in other research papers (4,8,10,14). This comparison did not indicate that our results were biased toward an over-reporting of subjective symptoms. The sample consisted of 111 women and 38 men. The mean age was 40.9 years, with a standard deviation of 9.7 years. All subjects were experienced VDU-users and had VDU-work, mostly routine and non-routine data dialogue, as a major part of their working day. No particular change regarding the type of VDU-equipment was made as a part of the study. The equipment most subjects used on a daily basis was a 14-inch low radiation colour monitor; 70.5% used a TDV 5330 (Tandberg Data A/S, Oslo, Norway), a monitor with a refresh rate of 72 Hz, non-interlaced. The symptom questionnaire was developed by the authors, and precursors of this questionnaire have been used in other studies (20,21,22). Each symptom category in the questionnaire was recorded through several items, in order to cover different facets related to the complaint. This diversity in items made it necessary to reduce and standardize the items applied in the present analysis. 17 key items, one from each symptom category, were chosen a priori and factor analysed using principal components analysis with Varimax rotation. A list of the symptom categories included in the questionnaire is given in Table 1. A 4-factor solution explaining 52.6% of the variance was selected on the basis of its ease of interpretation and relatively high communalities. The 4-factor solution was also in accordance with factor solutions found in other studies (23,24). Items were summed to form the following 4 symptom variables: 1. Visual symptoms, 2. Musculoskeletal symptoms, 3. Vegetative symptoms and 4. Specific symptoms (skin and forearm/hand symptoms). A hierarchical cluster analysis was used to produce empirical clusters of subjects classified on the basis of their scores on the four symptom variables. Different cluster solutions were then compared and validated according to the criteria of replication and interpretation. Replication in particular is regarded as one of the better ways to validate cluster solutions (25).

3. Results
The analysis was performed by use of the hierarchical agglomerative cluster procedure in SPSS, with within-groups average linkage as the method and Pearson correlation as the similarity measure. The dendrogram was inspected and solutions of 2, 3, 4, 5, 6, 7 and 8 clusters were selected to undergo further examination.
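For readers wishing to reproduce this kind of analysis with open tools, the sketch below approximates the procedure in Python/SciPy. Note that SPSS's within-groups average linkage is not identical to SciPy's (between-groups) average linkage, and the subject scores here are random placeholders, so this illustrates the procedure only.

# Approximate re-creation of the clustering step: hierarchical agglomerative
# clustering on 1 - Pearson correlation, with average linkage.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
scores = rng.normal(size=(149, 4))   # 149 subjects x 4 symptom variables

dist = pdist(scores, metric="correlation")   # 1 - Pearson r between subjects
tree = linkage(dist, method="average")       # between-groups average linkage

for k in (4, 5, 6):                  # candidate solutions, as in the paper
    labels = fcluster(tree, t=k, criterion="maxclust")
    print(k, "clusters, sizes:", np.bincount(labels)[1:])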
A split-half procedure was used to examine the replicability of different cluster solutions. The sample was randomly split into two subsamples and the cluster solutions from the subsamples compared. The analysis shows that the 4, 5, 6 and 7 cluster solutions were quite similar in the two subsamples. The criteria of interpretation show that the 4, 5 and 6 cluster solutions were easily interpreted. The 7 and 8 cluster solutions gave some clusters with very few subjects. Based on the information about the validity criteria, the 5-cluster solution was chosen. It was stable, easily interpreted, parsimonious and represented the clustering pattern of the sample in the most appropriate manner. The five clusters may be described as follows. Cluster 1 is characterised by a relatively high frequency of musculoskeletal symptoms and very few other symptoms. Cluster 2 is characterised by a relatively high frequency of visual symptoms and very few other symptoms. Cluster 3 is characterised by a high frequency of specific symptoms and some musculoskeletal symptoms. Cluster 4 is characterised by a high frequency of vegetative symptoms and very few other symptoms. Cluster 5 is characterised by relatively high frequencies of visual and musculoskeletal symptoms, some vegetative symptoms and very few specific symptoms. To enhance the understanding of the 5-cluster solution, each symptom category was recoded into a binomial variable. The percentage of subjects in each cluster group experiencing a symptom was then calculated. The results are shown in Table 1.

Table 1. Symptoms in the questionnaire and yes-answer distribution in the clusters (% Yes).
Symptoms in the questionnaire           Clust.1  Clust.2  Clust.3  Clust.4  Clust.5
1. Focusing problems                       16       19       47       48       30
2. Headache                                89       52       73       80       13
3. Pain/tension - neck/shoulder            95       93       80       80       33
4. Pain/tension in the back                84       62       53       40       13
5. Pain/tension - forearm/hand             11       55       20       16       13
6. Pain/tension - leg/foot                 05       38       13       20       10
7. Skin problems                           16       33       40       00       33
8. Dizziness                               16       07       33       68       07
9. Nausea                                  05       00       60       08       03
10. Concentration problems                 37       26       60       64       30
11. A general feeling of tiredness        100       71      100       00       57
12. Pain in the eyes                       37       45       60       72       63
13. Tired eyes                             53       76       47       88       63
14. Problems with line tracking            00       14       13       44       40
15. "Foggy" letters or words               21       33       27       44       50
16. "Doubling" of letters or words         00       17       13       28       40
17. "Shivering" text                       00       17       07       28       30
Except for the main tendency found in the cluster analysis, the results revealed some interesting symptom distributions among the groups. It is particularly noteworthy that some symptom categories (Nos. 2, 3, 11, 12 and 13) seem to be quite common in most groups.

4. Discussion
The results from the cluster analysis imply the existence of differences in symptomatology between subgroups of VDU-workers. The existence of such subgroups agrees with the notion of different risk factors associated with VDU-work. Subgroups experiencing mostly visual symptoms, musculoskeletal symptoms and their combination are, in this perspective, accounted for: visual symptoms because of sustained near work and probably minor oculomotor and refractive errors; musculoskeletal symptoms because of the increase in sedentary work; and the combination of symptoms because of both risk factors or, as recent research proposes, through the influence of oculomotor factors in the aetiology of occupational cervicobrachial diseases (12,26). The subgroup experiencing mainly vegetative symptoms is also accounted for, since several studies have demonstrated a strong relationship between vegetative symptoms and psychosocial factors (12,24,27). The aetiology involved in the subgroup associated with skin and forearm/hand symptoms is less obvious. The forearm/hand symptoms could be associated with the so-called "mouse syndrome", while a possible causal link between skin symptoms and VDU-work is still under discussion (8,12,14). Perhaps these symptoms are not functionally interrelated at all. Alternatively, the existence of symptom clusters may be explained on the basis of psychosomatic theory. Already in 1949 Malmo and Shagass described the principle of "symptom specificity" (28). This principle is derived from evidence showing a specific association between symptoms and physiological mechanisms being susceptible to activation by stressful experiences. The observed clusters could, therefore, be explained solely by symptom-specific reactions to workplace stress. However, the symptom clusters are not that clear-cut. A more subtle picture emerges if one considers the frequency distribution in Table 1. This table shows that most symptoms are represented among most VDU-users, even though there is an overrepresentation of certain symptom categories in the different clusters. This fact presupposes that more general factors have to be taken into account when discussing the causal relationships of symptom clusters. The baseline health measurements of the present study are to be replicated during the autumn of 1995. The repeated measurements design, which makes it possible to investigate changes in the magnitude of symptoms, also opens the possibility to investigate the stability of the clusters reported in this paper.
References
1. C. Behrens and S.B. Sells, Arch. Ophth., 31, (1944) 148.
2. K. Brozek, E. Simons and A. Keys, Am. J. Psych., 63, (1950) 51.
3. W.J. Smith, In: Proceedings of the Human Factors Society, 23rd Annual Meeting, (1979) 362.
4. M.J. Dainoff and A. Happ, Hum. Fact., 23(4), (1981) 421.
5. R.A. Tyrrell and H.W. Leibowitz, Hum. Fact., 32(3), (1991) 341.
6. E. Grandjean, Ergonomics and Health in Modern Offices, Taylor & Francis: London, UK, 1984.
7. M. Hagberg and G. Sundelin, Ergonomics, 29, (1986) 1637.
8. J. Evans, Work and Stress, 1(3), (1987) 271.
9. A. Smith, D. Peck and T. Clatworthy, Work and Stress, 3(2), (1989) 195.
10. P.T. Yeow and S.P. Taylor, Appl. Ergonomics, 21(4), (1990) 285.
11. A.C. Mandal, In: D.J. Oborne (ed.), Contemporary Ergonomics 1985, Taylor & Francis: London, UK, 1985.
12. U. Bergquist, Scand. J. Work Environ. Health, 10, suppl. 2, (1984).
13. A. Aarås, Scand. J. Rehab. Med., Supplement No. 18, (1987).
14. U. Bergqvist, B. Knave, M. Voss and R. Wibom, Int. J. Hum. Com. Int., 4(2), (1992) 197.
15. M.J. Smith, P. Carayon, K.J. Sanders, S.Y. Lim and D. Le Grande, Appl. Ergonomics, 23(1), (1992) 17.
16. K.M. Zyla, Appl. Ergonomics, 24(6), (1993) 432.
17. H. Kahn and C.L. Cooper, Curr. Psych. Res. & Rev., Summer, (1986) 148.
18. S.Y. Lim and P. Carayon, In: A. Grieco, G. Molteni, E. Occhipinti and B. Piccoli (eds.), Book of Short Papers, Work With Display Units '94, Vol. 1, University of Milan: Milan, Italy, 1994.
19. T. Paulsen, Proof-Reading from VDU: The Effect of Vertical Gaze Direction on Speed, Acuity, Subjective Discomfort and Preference, Institute of Psychology, University of Oslo: Oslo, Norway, 1990. (In Nor.)
20. I. Lie and R. Watten, Ergonomics, 37(8), (1994) 1419.
21. I. Lie and K.I. Fostervold, In: A. Grieco, G. Molteni, E. Occhipinti and B. Piccoli (eds.), Work With Display Units '94, Elsevier Science: Amsterdam, 1995. (In press)
22. R. Watten, I. Lie and S. Magnussen, Behav. & Inf. Techn., 11(5), (1992) 262.
23. R. Watten and I. Lie, Nord. Ergonomi, 5, (1987) 8. (In Nor., Eng. summary)
24. H. Ursin, I.M. Endresen and G. Ursin, Eur. J. App. Physiol., 57, (1988) 282.
25. P. Carayon, Ergonomics, 37(2), (1994) 311.
26. I. Lie and R. Watten, Eur. J. App. Physiol., 56, (1987) 151.
27. O. Vassend, R. Watten, T. Myhrer and J.L. Syvertsen, Soc. Sci. Med., 39(4), (1994) 583.
28. R.B. Malmo, C. Shagass and F.H. Davis, Psychosom. Med., 12, (1950) 362.
Construct validity of computer anxiety as measured by the Computer Attitudes Scale*
Deane, F. P.1, Henderson, R. D.2,3, Barrelle, K.2, Saliba, M.2,4, & Mahar, D.2
1 Department of Psychology, Massey University, New Zealand.
2 Trust Project, Computer Sciences Lab., RSISE, ANU, ACT 0200, Australia.
3 Now at Department of Applied Psychology, UC, P.O. Box 1, Belconnen, ACT 2616, Australia.
4 Now at DSTO Aeronautical and Maritime Research Laboratory, Air Operations Division, Human Factors, PO Box 4331, Melbourne, Victoria 3001, Australia.
1. Introduction
It is becoming increasingly difficult for individuals to avoid contact with computer technology. Along with these technological advances come a number of associated stressors. There may also be significant job task changes as a function of computerised technology and the need to master a variety of computer software packages. Individuals vary considerably in their response to these challenges. A significant proportion will experience negative reactions. One reaction which has generated much interest and research is computer anxiety. This is attested to by the fact that at least 12 questionnaires have been developed to measure computer anxiety [1]. Most people initially experience hesitation when confronted with new computer technology, or have heard anecdotal reports of such experiences [2]. Anecdotal reports are typically restricted to examples of people in the work environment avoiding word processors and handwriting reports [3]. Estimates of the prevalence of computer anxiety range from 5% [4] to 30% [5], with specific sub-groups having higher prevalence rates [6]. Despite the high prevalence rates and apparent face validity, the construct validity of computer anxiety requires considerable clarification. Inconsistent use of terminology has resulted in some conceptual uncertainty regarding the construct of computer anxiety. Several terms all relate to the construct of computer anxiety: computerphobia, computer fear and computer anxiety [7,6,5]. Different criteria for each definition have substantial implications for both estimates of prevalence and subsequent corrective interventions [5]. The construct of anxiety, per se, has historically been steeped in definitional ambiguities. Resolution of the definitional and measurement problems associated with anxiety has been facilitated by the state-trait anxiety distinction. State anxiety is considered a relatively unstable, transitory, situational anxiety response [8,9]. A person experiencing state anxiety is typically experiencing an unpleasant emotional state characterised by apprehension and worry. Trait anxiety, on the other hand, is a more stable and enduring characteristic, more consistent with a personality trait, but does not imply permanent anxiety. Trait anxiety refers to "relatively stable individual differences in anxiety-proneness" [9, p. 1]. Theoretically, people with high levels of trait anxiety are considered more likely to respond to stress with elevations in state anxiety. Whether people high in trait anxiety react to stressors with high levels of state anxiety depends on the extent to which they perceive the specific situation as threatening, and this is influenced by each individual's past experience [9]. The state-trait distinction has rarely been made with regard to the construct of computer anxiety. In their efforts to clarify the definition of computer anxiety, Cambre and Cook [10] cited two unpublished studies which explored the relationship between computer anxiety and state-trait anxiety. Both studies concluded that computer anxiety comprised state anxiety resulting from exposure to computer use. Cambre and Cook [10] concluded that "the relationships between computer anxiety and
* This research has been carried out on behalf of the Harry Triguboff AM Research Syndicate.
state and trait anxiety need to be explored further and clarified if computer anxiety is to be defined accurately" (p.53). Several studies have done this, but the results have offered little theoretical clarification of the relationships and produced contradictory results [11,12,7]. In assessing cognitive factors in computer anxiety, Glass and Knight [13] found that high computer-anxious subjects scored significantly higher on the STAI trait scale than low computer-anxious subjects. They found highly computer-anxious subjects experienced significantly higher levels of anxiety than low anxious individuals immediately after beginning the task and after being informed of an error in their performance. This implies that computer anxiety is related to both trait and state anxiety. The authors conceptualised computer anxiety as a "situation-specific anxiety", but stated that, unlike other situation-specific anxieties, computer anxiety research has not been based on clear theories or models. This criticism was also indirectly highlighted by Harrington et al. [5], who found that computer anxiety has less effect under certain conditions, "suggesting that computer anxiety may be a temporary, normal form of state anxiety" (p.354). This conclusion was viewed as highly speculative because state anxiety had not been tested or controlled for. The present study focuses on the use of the Computer Attitudes Scale (CAS) [14], in particular the Computer Anxiety subscale (CAS-Anxiety), as the measure of computer anxiety. The CAS has been used extensively in studies of computer anxiety and is fairly typical of the range of anxiety scales [1]. The CAS-Anxiety appears to be consistent with other forms of situational anxiety where anxiety occurs in a specific class of situations [15]. Respondents are asked to respond to 10 items describing reactions when working with computers. While the CAS-Anxiety appears to be an example of "trait" situational anxiety, there is some ambiguity in that respondents are not asked to respond as to how they "generally" feel. The present study aimed to clarify these constructs by looking at the theoretical relationships between state anxiety, state anxiety when imagining recent computer interactions, trait anxiety and computer anxiety as measured by the CAS-Anxiety. Theory and supporting empirical evidence [12] suggest that those individuals with high scores in trait anxiety should react to relevant stressors with higher scores in state anxiety than those low in trait anxiety. Given that the stressor must be perceived as threatening, we predicted that those high on a trait-oriented measure of computer anxiety (CAS-Anxiety) would respond with higher levels of state anxiety in situations involving computer use than those low in trait-oriented computer anxiety. Construct clarification is also attempted by examining the theoretical relationship between computer anxiety and avoidance. An anxiety-response pattern involves "an array of avoidant-defensive behaviours" [16] (p.183). Despite a number of theories predicting avoidance of anxiety-arousing situations, empirical findings related to computer anxiety have produced theoretically incongruent results. There have been few studies in which a relationship between computer anxiety and avoidance has been established [17,7,11]. The implications of the relationship between computer anxiety and computer avoidance are far reaching.
There are serious implications for productivity if up to 30% of the workforce are computer anxious and avoiding computer use. Preliminary studies suggest that information system failure (especially non-use) may be associated with a number of psychological variables [18, 19, 20]. One such variable may be computer-related anxiety and associated avoidance. Given the theoretical causal chain leading to avoidance involving computer anxiety, we also aimed to clarify the predictors of computer anxiety. A partial replication of a prior study [21] looking at key psychological predictors of computer anxiety in health care workers was attempted. In that study it was found that self-efficacy expectations were the best single predictor, accounting for 72% of the variance [21].
2. Method
Subjects and Procedure. 197 undergraduate university psychology students agreed to complete the questionnaire. 58% were in their first year of university study. Age ranged from 17 to 50 years (M=21.0, SD=5.2), with a mode of 18 years. Sixty percent were female.
Measures. The Computer Attitudes Scale (CAS) [14] is made up of three subscales which measure computer-related anxiety (CAS-Anxiety), computer attitudes (CAS-Liking) and computer
self-efficacy (CAS-Confidence). Reliability coefficients and factor-analytic studies suggest each of the three subscales is sufficiently discrete to be used separately [22, 14]. All subscales have high internal reliability, with coefficient alphas ranging from .86 to .91 [14]. In the present study similar levels of internal reliability were found, with Cronbach alphas of .89 for CAS-Anxiety and .88 for both the CAS-Liking and Confidence subscales. The CAS-Liking was used to assess computer attitudes and has been found to correlate with a variety of other computer attitude measures in the .80 to .89 range [23]. The CAS-Confidence subscale was used to assess self-efficacy. This was considered appropriate given the match between item content and the definition of the self-efficacy construct by Bandura [24]. The State-Trait Anxiety Inventory (STAI) [9] is one of the most extensively used and researched self-report measures of anxiety. The STAI-Trait has high internal reliability. A coefficient alpha of .91 in the present study is commensurate with Spielberger's [9] normative sample. In the present study a shortened 6-item version of the state anxiety scale [25] was used, given the need to collect repeated measures of state anxiety. Spielberger [9] indicated that the state anxiety scale of the STAI may "be used to evaluate how they felt at a particular time in the recent past...in a specific situation...or in a variety of hypothetical situations" (p.2). Consequently, this measure was completed twice. The first version, with the standard instructions to respond to the items as they feel "RIGHT NOW, that is AT THIS MOMENT", was considered the normal level of state anxiety and will be referred to as "STAI-norm". The second version instructed respondents to answer as they felt the "LAST TIME YOU USED A COMPUTER", and will be referred to as "STAI-compute". Since measures of computer avoidance have been limited [26], this construct was measured using a custom-developed scale. This scale comprised seven items rated on a 4-point Likert-type scale. Subjects were instructed to rate the extent to which each item described their "behaviour during the present academic year". No prior psychometric evaluation of the scale had been conducted, but the Cronbach alpha coefficient of .85 in the present study suggested good internal reliability. Similarly, theoretically consistent correlations with other measures suggested some construct validity (see Table 1). The items were as follows: 1. I avoided taking courses that required using computers; 2. I avoided using computers in my daily activities; 3. I avoided learning about computers; 4. I put off working with computers; 5. I completed some things by hand rather than use a computer; 6. I avoided talking about computers; and 7. I avoided using computers to help with my course work. Subjects were asked two questions to determine computer experience. The first involved an estimation of the length of time they had been using computers as part of their daily activities, and the second an estimation of the number of hours of computer use in the previous period. An index of computer experience comprised the product of the total number of months of computer use and the average number of hours of computer use. This represented the total average hours per month for each subject.
3. Results
In order to determine the differential changes in state anxiety from the "normal" state to the computer situation, the difference between STAI-compute and STAI-norm was calculated. This variable is referred to as STAI-change and reflects changes in state anxiety from a "normal" resting state to the state when last using a computer. Subjects were divided into high and low trait-anxious groups using a median split on the STAI-Trait. Given that theory suggests that people who differ in trait anxiety only show corresponding differences in state anxiety in situations they perceive as threatening, high and low trait anxiety groups were also determined using a median split on CAS-Anxiety. A between-groups t-test was calculated between mean STAI-change scores for the high and low trait-anxious groups. No significant difference in STAI-change was found for the high (M=-.15, SD=4.82) and low (M=.98, SD=3.98) trait-anxious groups using the STAI-Trait. However, when high and low situation-specific trait anxiety (CAS-Anxiety) was used, those high on the CAS-Anxiety (M=-.85, SD=3.95) differed significantly in STAI-change from those low on the CAS-Anxiety (t(196)=-4.2).
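The difference-score construction and median-split comparison described above translate directly into code. The following sketch uses randomly generated placeholder data rather than the study's data, so only the procedure, not the result, is meaningful.

# Sketch of the STAI-change analysis: difference score, median split on
# CAS-Anxiety, and a between-groups t-test. Illustrative data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 197
stai_norm = rng.normal(10, 3, n)       # state anxiety "right now"
stai_compute = rng.normal(10.5, 3, n)  # state anxiety "last time you used a computer"
cas_anxiety = rng.normal(20, 5, n)     # trait-like computer anxiety score

stai_change = stai_compute - stai_norm         # the STAI-change score
high = cas_anxiety > np.median(cas_anxiety)    # median split into high/low

t, p = stats.ttest_ind(stai_change[high], stai_change[~high])
print("t =", round(t, 2), "p =", round(p, 3))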
Figure 2. The 10-key pad and the accelerometer.
Figure 3. An example of the acceleration and EMG waveforms (upper: acceleration, lower: EMG; the waveforms between the dashed lines are recorded).
The output of a bioelectric amplifier was connected to the A/D converter, and its input was connected to the EMG electrodes. A piezoelectric accelerometer was attached to the 10-key pad as shown in Figure 2, and its output was also connected to the A/D converter via a charge amplifier. When the subject depressed a key on the keypad with the right ring finger, the EMG signal from the right forearm and the signal from the sensor on the keypad were simultaneously recorded by software running on the PC. Both the EMG and acceleration signals began registering 300 milliseconds before key-contact closure was registered by the software, and were recorded for 600 milliseconds (approximately 300 milliseconds after key-contact release). See Figure 3. The IEMG for each keystroke was calculated as an average of the absolute value of the EMG over that 600 milliseconds. The maximum acceleration swing for each keystroke was calculated as the difference between the positive and negative peak values over the same 600-millisecond interval.
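The two per-keystroke measures defined above can be stated compactly in code. In the sketch below the 1 kHz sampling rate and the synthetic keystroke signals are assumptions; the 600 ms window and the two definitions follow the text.

# Sketch of the two per-keystroke measures: IEMG as the mean absolute EMG
# over the 600 ms window, and the maximum acceleration swing as the
# positive-peak minus negative-peak difference. FS is an assumption.
import numpy as np

FS = 1000                  # samples per second (assumed)
WINDOW = int(0.6 * FS)     # 600 ms: 300 ms before closure + 300 ms after release

def iemg(emg_window):
    """Average of the absolute (rectified) EMG over the recorded window."""
    return np.mean(np.abs(emg_window))

def max_acceleration_swing(acc_window):
    """Difference between the positive and negative peak accelerations."""
    return np.max(acc_window) - np.min(acc_window)

# Toy keystroke: a burst of EMG activity and a damped acceleration transient.
t = np.arange(WINDOW) / FS
emg = np.random.default_rng(2).normal(0, 0.1, WINDOW) + 0.5 * (t > 0.3)
acc = np.exp(-10 * t) * np.sin(2 * np.pi * 40 * t)

print("IEMG =", round(iemg(emg), 3), "swing =", round(max_acceleration_swing(acc), 3))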
4. EXPERIMENT

For the study, four subjects depressed keys on the 10-key pad using the ring finger of the right hand. Instructions were presented on a CRT display. They indicated, in random order, the key to be depressed (the numbers 0 through 9, carriage return, back space, period, +, -, /, and =), the subjective intensity (hard, medium, or soft), and the subjective touch (crisp or sticky). The subject depressed each key five times for all combinations of intensity and touch.
Figure: Scattergram, split by INTEN.
Borrowing concepts from physics, let us say a system will perform "work" on some object in its environment if it can be considered to create an impetus (force) which, when applied to the object, moves it through some "distance" in the space considered (sensory or symbolic, with sufficient structure). Such constructs fit the usual definitions of work in applications if we connect "goal attainment" in a particular metric space, sensory or symbolic, with displacements in the goal space. This is nothing new to the behavioral sciences, for such approaches date back to the group dynamics school pioneered by K. Lewin. We can associate measures of motion (momenta), the derivatives (force and thrust), and the integral (energy) with motion in sensory or symbolic space in the usual sense in which the terms are used in mechanics. In addition, with all the measures of motion we can associate and examine measures of order, or better, disorder, measured using the notions of information and entropy coming from the more recent understanding in thermodynamics and information theory. Applying these concepts, we get "e-motion" as entropized and possibly erratic motion [1].
Definition: Emotional behavior is energetically perturbated motion.
2.2. Characteristic Work Patterns
If we follow the concepts found in mechanical and electrical engineering, the "load" on a system can be defined in terms of force. Furthermore, a system can be assessed as having capacity measures to "bear a load" and not deviate in performance (the creation of a specific effect). Put in other terms, "load" is a measure of the capability of a system to produce (to effect a change in its environment by the expenditure of energy, modification of entropy, or regulation of its exchanges (economics)) specific behavior trajectory [effect/expenditure] edges which it can exhibit with sufficient regularity given a set of contingencies. A characteristic work pattern or envelope for the system is defined as the associated capacity to effect, change, or modify objects and move them through a distance in a space, and thereby to perform work on them.
This work characteristic interacts with the load characteristic patterns, so that the interaction (linkage with load) may interfere with the operational characteristic of the system. The property we can associate with the envelope or limit within which these systems operate is also studied by mechanical engineers. It is the concept of the elastic limit, in which a force applied to the system causes a restorative effect to occur (as is done with a spring) up to the elastic limit, even under a load, with the effects of decrements in velocity, acceleration and displacement. The system can also perform work of a temporary nature (a system transient). If the load characteristics go beyond this system elastic limit, there will be an "overload" condition and permanent deformation of material, as is seen in plasticity for engineers, growth for biological systems, and learning in behavior systems. If the load goes beyond plasticity conditions there is loss of integrity, separation and possible explosion. We can talk about system strain in this light. In the other direction, where we have compression, we can get overload and loss of elasticity by stress (compression); reduction of the time and space available is a case in point. For example, engineers use the notion of elasticity by comparing stress to strain loading, using an index called Young's modulus of elasticity as a measure of cycle properties, especially where there are dynamic oscillations (stress-strain pressure waves). These concepts are general, a la virtual work, and can thus be utilized to consider both the sensory and symbolic aspects of a task (an imperative or purposeful act to seek a goal) and to have it interfered with by overload, underload, or oscillation between both. Hence stresses and strains caused by explosive and implosive forces, which result in overloads to physical (sensory-motor) as well as mental (perceptual, cognitive) aspects, result in interactions (interfaces) which produce perturbations in the motions (e-motion effects) which coincide with the way emotional effects arise in behavioral systems. Thus overloads in emotion coincide with e-motion overloads, which have as their roots overload in physical (sensory-motor) and mental perceptual-cognitive (symbolic) disturbances.
2.3 Emotional Workload
From biophysical behavior science and technology we can coin the term "emotional workload" of a system with given work capability, under conditions of capacity, in communication with a constraint combination (loading). "Overloading" then is to exceed capacity and to generate e-motions which are realized as entropized experiences (feelings) and entropized expression (emotions), which are sometimes erratic and sometimes creative and lead to growth: transient decrements in performance (work performed) followed by long-term improvement, or vice versa. But how do we know and predict the conditions, and ultimately manage them to produce work improvement? This requires that we formalize these intuitive notions.
Definition: The forms of communication of emotional behavior are experiences and expressions. The experiences of emotional behavior in human beings will be called feelings. The expressions of emotional behavior will be referred to as emotions. The emotional communication interface contains both feelings and emotions.
2.4 Affect Linkages
The Theory of Affect Linkages is a model to describe some of the structures and processes of how an organism or organization orients to its environment to satisfy its needs and facilitate its goals. In its initial form it defines the attitudinal construct, the affect link, in terms of a transformation of the marginal degree of goal facilitation attributed as being provided by the object of orientation to the assessor, which is computed by comparing what is perceived as
being actually provided with what was the expectation learned from past experience. The result is then updated as the event in the environment unfolds. In addition to the value aspect of the construct there is a belief or cognitive aspect (which associates certainty with the attribution). Hence the concept of orientation = (value, belief) is a fuzzy concept which is modelled mathematically via complex functions. The construct is operationalized by expressing it in normalized form, which allows it to process measurements of monitored physical and symbolic events. The structure of affect linkages is modelled by a scaling geometry, and the dynamics are determined in both deterministic and stochastic form. The array approach developed is extended to consider individual, group, and organizational processes. A simple mathematical expression developed to describe stable affect states, using a fixed reinforcement schedule, is

Aij(t) = Log[(1 - ka(t)ji) / (1 - ke(t)ji)],    (1)

where Aij(t) = affect link from party i to party j; ka(t)ji = actual degree of goal facilitation from j to i; and ke(t)ji = expected degree of goal facilitation. In addition,

Bij(t) = belief in the affect attribution Aij(t),    (2)

and

Oij(t) = orientation of i to j = (Aij(t), Bij(t)).    (3)
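As a concrete illustration, eqs. (1)-(3) translate directly into a few lines of code. This is a plain transcription of the printed formulas, with the sign convention following eq. (1) as printed; the numeric inputs are invented for the example.

# Direct transcription of eqs. (1)-(3): the affect link as the log-ratio of
# actual to expected marginal goal facilitation, paired with a belief value.
import math

def affect_link(k_actual, k_expected):
    """Eq. (1): Aij(t) = Log[(1 - ka(t)) / (1 - ke(t))].
    Both k values are degrees of goal facilitation in [0, 1)."""
    return math.log((1.0 - k_actual) / (1.0 - k_expected))

def orientation(k_actual, k_expected, belief):
    """Eqs. (2)-(3): orientation = (affect link, belief in the attribution)."""
    return (affect_link(k_actual, k_expected), belief)

# Invented example: facilitation exceeds expectation (0.8 actual vs 0.5 expected).
print(orientation(k_actual=0.8, k_expected=0.5, belief=0.9))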
These can be generalized to allow the consideration of both spatial and temporal effects. The affect link can be made to correspond directly to e-motion and thus be a valid measure of entropy-energy interactions and emotions, which makes it an operational construct. How does the affect link, or other e-motional measures, relate to actual emotional behavior? It is conjectured that emotional behavior is engaged in as an exchange which is continually taking place, in which emotion is composed or synthesized from the interaction of its entropic (thoughts) and energetic (actions) antecedents. This is the process of emotional experience (perceived as feelings (mood)). Simultaneously there is a decomposition going on, in which emotions decompose into entropy (dissipation as noise + signal (neg-entropy = information)) and energy (action), which together are manifested as emotional expression (emotions). These states coexist and compete for dominance in an introversion-extroversion process, with the stabilization being personality orientation and the transients being "letting off steam" (energy) and/or noise (entropy-information).

3. QUESTIONS AND ANSWERS ABOUT EMOTIONAL WORKLOAD

Q. Is there a functional significance to emotional behavior?
A. If emotional behavior is shown to be chaotic behavior, it serves as a way to allow rapid adaptation to chaotic environmental conditions.
Q. Are there limits on the adaptive capability of emotional behavior?
A. Yes. If the acceleration of change is too rapid there is an overload and apparent fixation and disorientation of the system. This suggests an upper limit on the adaptive capability of emotional behavior. Similarly, the studies of conflict, curiosity and arousal show that there is also a lower
limit in the form of boredom due to sensory underload. Note that in cases of either boredom or overload there are many studies on mental and physical overload to indicate that emotions may function as a safety valve to prevent system destruction from inner commotion, which if unchecked would lead to system implosion (excess stress) or explosion (excess strain). Hence emotional behavior may be considered to have functional significance as an operational safety valve. Physiological study of endorphins (associated with crying, etc.), adrenalin secretion (associated with fear) and nor-adrenalin (associated with fight) adds further strength to these contentions.
Q. How could these perturbations in emotional behavior be analyzed in our framework?
A. Emotional loading can be decomposed into energetic and entropic (or emblematic) loads (over, critical, under) and their interfaces. Note that, according to this qualitative analysis, there are four 1st order cases of emotionally imbalanced loadings which could be analyzed by defining their dynamics.
Loading    Under-Loaded                Critically-Loaded      Over-Loaded
Energy     Depressed                   Normal Activation      (Manic) Hyperactive
Entropy    Information Overload        Normal Attention       Information Underload
Emotion    Depressed and Overwhelmed   Emotionally Balanced   Depressed and Bored
Table 1. First Order Component Loading: Comparative Capacity vs. Type

Q. How can the dynamics of emotional imbalances be computed?
A. If in each category the behavior exhibited is under-, critically- or over-damped, convolutional equations can be written to simulate the dynamic changes in time, where emotional behavior can be considered as the convolution of energetic and entropic components.

4. CONCLUSION

In this paper, it is suggested that the concept of emotional overload be modelled by the use of physical concepts incorporating energy-entropy interactions. These concepts can then be formalized using the author's affect linkage theory and applied to the design of a more advanced "friendlier, satisfying and tunable" human-computer interface, which can be utilized to develop the large, distributed and differentiated friendly HCI that can facilitate the performance of remote teams, as found in remote telemedical applications.
REFERENCES
[1] Marin, I. C. and J. Sappington, "A Machine with "Feelings": System Models of Emotional Behavior", Proc. World Neural Nets Conference (1994)
The psychological impact of computerised production feedback systems: A comparative study of the U.K. subsidiaries of U.S. and Japanese multinational companies
C. Oswick and D. Grant
The Management Centre, King's College, University of London, Kensington Campus, Campden Hill Road, London W8 7AH, United Kingdom
1. SYNOPSIS

This study examines the extent to which the use of computerised systems of feedback in manufacturing environments affects employee attitudes and behaviour. Using samples taken from four diverse organisations, data were gathered via documentary sources, interviews, observational techniques, and group sessions. The most significant finding of the study is that the immediacy and comprehensiveness of employee feedback provided by sophisticated computer-based systems was found to have a detrimental impact upon attitudes and work performance. In contrast, feedback systems using less sophisticated technology and placing a greater emphasis on interpersonal communication had a more positive psychological impact.
2. INTRODUCTION

The primary objective of this paper is to compare and contrast differing approaches to the computerised monitoring and feedback of production performance data. In particular, the interface between the technology (the computerised feedback system) and the end user (the production employee) is explored in terms of its attitudinal and behavioural implications. It is posited that independent feedback variables (comprehensiveness, immediacy, frequency, unit of measurement, and media) have a direct, and significant, impact upon the employee's psychological orientation towards work (i.e., job satisfaction,
motivation, and commitment). Much of the mainstream management literature extols the virtues of comprehensive feedback systems [1]. Indeed, Hackman and Oldham [2] suggest that feedback is one of the most important variables in determining what they refer to as the motivating potential score (MPS) of a job. Although several authors have highlighted the need for feedback to be meaningful and immediate [3,4], the contingent nature of the complex relationship between feedback and work performance is still not fully understood. This study seeks to establish the extent to which a firm linkage exists between feedback mechanisms and employee attitudes and behaviour. It is hoped that identifying the nature of the interplay between these factors will aid the design of optimal feedback technology which maximises the potential for socio-technical congruence, and therefore improved productivity.
3. METHOD
3.1. The Sample
The research is based upon data gathered from the U.K. subsidiaries of four large multinational companies. All of the companies are drawn from the manufacturing sector. Two of the corporations represented are U.S. owned (company A and company B), while the other two are Japanese owned (company C and company D). Company A is a large-scale producer of male grooming products. Company B manufactures photographic goods and materials. Company C produces a particular electrical component essential to the manufacture of most consumer electrical goods. Company D manufactures consumer electrical products. A brief outline of the feedback system employed in each firm is provided in Table 1.
3.2. Data Collection
Data on the nature and operation of the computerised feedback systems were gathered using several methods, namely secondary data sources, non-participant observation, and informal interviews with operators, managers and system experts. Insights into the psychological impact of the systems were provided via documentary sources, group feedback sessions, and participant observation. An extensive programme of data collection was undertaken: 202 employees completed attitudinal questionnaires, 22 employees drawn from various organisational strata were interviewed, 18 days were spent observing and shadowing operations, and a series of 47 group feedback sessions (consisting of
373 production employees in total) were held. The use of documentary sources involved the analysis of a variety of records (including production figures, down-time, labour turnover, and absenteeism). Direct exposure to the effects of the feedback systems was provided during brief periods of participant observation in which production tasks were undertaken and operatives were shadowed. The group feedback sessions contained between 5 and 15 employees and offered a forum for teams and work units to engage in frank and open discussion of their feedback system in a non-threatening environment.
Table 1 - A Summary of the Characteristics of the Feedback Systems Employed in Companies A, B, C & D.

Company A: Sophistication of technology: High. Unit of measurement: Group, Shift, Department. Frequency & immediacy of feedback: Intermittent and self-determined. Method of feedback: Remote VDU.

Company B: Sophistication of technology: Very High. Unit of measurement: Group, Shift, Department. Frequency & immediacy of feedback: Continuous and instantaneous. Method of feedback: Shopfloor light screen display.

Company C: Sophistication of technology: Low. Unit of measurement: Individual and Group. Frequency & immediacy of feedback: Daily (individual), Weekly (group). Method of feedback: Daily = verbally (via supervisor); Weekly = display charts (public).

Company D: Sophistication of technology: Moderate. Unit of measurement: Group, Shift, Department. Frequency & immediacy of feedback: Daily, weekly and delayed. Method of feedback: Display charts (via computer printout).
4. RESULTS
The nature of the computer system used to monitor and feed back production performance was found to be the major determinant of other facets of the feedback process. For example, the sophisticated hardware used in Company B enabled ongoing, immediate feedback to be provided directly to workers via terminals located on the shopfloor. By contrast, the 'low-tech' system operated in Company C resulted in a delay in production feedback. A strong negative
correlation was identified between sophisticated feedback technology and high levels of employee satisfaction and motivation. Indirect corroboration for this finding was provided by the secondary data gathered: higher productivity, lower absenteeism, less down-time, fewer breakdowns and more favourable labour turnover rates were identified for the two companies (C & D) with less sophisticated feedback mechanisms than for their 'hi-tech' counterparts (A & B).

Direct shopfloor observation and the group feedback sessions provided insights into several underlying reasons for this finding. In particular, the immediate access to performance data provided by hi-tech feedback systems caused several problems. Some workers admitted that having concurrent, rather than retrospective, feedback enabled them to regulate their performance to ensure that they met, but did not exceed, the minimum production targets. Paradoxically, other workers indicated that continuous feedback spurred them on to work harder; this led to intra-team problems due to an inevitable conflict between those wanting to maximise production and those seeking to minimise it.

In addition, high levels of inter-team competition were reported at companies A and B. Unfortunately, the form of competition which resulted tended to be dysfunctional. Almost two thirds of the production teams at these companies admitted attempting to outperform neighbouring teams by unsanctioned means. This involved manipulating performance data by running unloaded machines at excessive speeds for short periods, which increased the efficiency rating displayed on the feedback system but did not enhance production and, in the longer term, also damaged the machinery. In more extreme forms, inter-team competition involved restricting the performance of adjoining teams by withholding information, hoarding spares, and in some instances even sabotage. The intra- and inter-team problems associated with concurrent feedback were a source of demotivation, dissatisfaction and disharmony.

The unhealthy forms of competition outlined for Companies A and B were not reported by respondents in Companies C and D. Teams at these companies focused on self-improvement in order to outperform the others, what management at company C termed "healthy competition". This could be attributed to the retrospective nature of the feedback, a consequence of using unsophisticated technology. These factors reduced the emotional impact of feedback on employees and limited the scope for dysfunctional action, because the feedback related to past, rather than present, performance.

The method of feedback used in the two companies with sophisticated computer systems also caused motivational problems. The use of hi-tech media for presenting feedback was perceived by employees as 'cold' and impersonal.
However, at the companies with less sophisticated systems, the daily feedback provided by supervisors and/or team leaders was described as being more 'human' and interactive. From a managerial viewpoint, the use of direct interpersonal feedback also allowed greater control to be exercised because of the inherent flexibility of the system, i.e. factors such as the timing of feedback, the opportunity to interpret rather than merely report results, and the scope for positive or negative delivery.
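The reported inverse relationship can be illustrated with a toy calculation. The sketch below is not the authors' analysis and uses invented company-level codings (the paper reports no such scores): feedback-technology sophistication is ranked from Table 1, and the satisfaction figures are hypothetical questionnaire means; a Spearman rank correlation then quantifies the negative association.

```python
from scipy.stats import spearmanr

# Hypothetical codings, for illustration only.
sophistication = {"A": 3, "B": 4, "C": 1, "D": 2}        # rank from Table 1
satisfaction = {"A": 2.4, "B": 2.1, "C": 4.0, "D": 3.5}  # invented 1-5 means

companies = sorted(sophistication)
rho, p = spearmanr([sophistication[c] for c in companies],
                   [satisfaction[c] for c in companies])
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")  # rho = -1.00 for these toy data
```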
5. DISCUSSION & CONCLUSIONS
The results presented above indicate that the two Japanese companies (C and D) have the more effective feedback systems. Nevertheless, it is not possible to draw any generalisable inferences about U.S. and Japanese subsidiaries per se. Instead, the following discussion of the findings centres upon the technological differences between the companies, rather than the international dimension.

It is possible to think of computerised feedback systems in terms of the form of control they offer. Donnelly et al. [5] have identified three forms of production control, namely preliminary, concurrent and feedback. Preliminary control takes place prior to operations and includes activities such as the inspection of raw materials and the training of operators; concurrent control occurs during the production process; and feedback control is concerned with outcomes (i.e. overall results and deviations from set standards). The sophisticated computer systems used in companies A and B are designed to assist management in all three of these areas of control, whereas the systems in operation in companies C and D place the emphasis on feedback control. The immediacy and complexity of the concurrent feedback provided by the more advanced systems offers valuable data which aids the design, planning and monitoring of production operations. However, as was demonstrated earlier, it is the immediate access to this information which allows production workers to manipulate it in ways that are detrimental to output.

The nature of the interface between employees and the computer systems within companies A and B also appears to be problematic. The feedback is passed directly from the computer to the worker, rather than via a supervisor or manager. Consequently, raw, rather than screened or interpreted, data is fed back, and this limits the scope for interpersonal interaction with superiors. As a result, psychologically reassuring aspects of feedback such as praise and recognition are circumvented, while management also lose an important opportunity to assert
control over operatives who are performing poorly.

The primary conclusion of this study is not that companies A and B should opt for less sophisticated systems similar to those in use in companies C and D. Instead, the way in which the technology is utilised needs to be reconsidered. The difficulties experienced in companies A and B appear to stem from the original design and implementation of their systems. In terms of human-computer interaction (HCI), it would seem that the interface with managers, as system users, was fully considered at the design stage of introducing the new technology. However, the psychological impact upon workers, who are effectively the end users of the system, has been somewhat neglected. This lack of foresight is not a new phenomenon: there is a considerable volume of HCI-related research identifying the general problems which arise from a tendency to concentrate upon the technical requirements of a system while failing to consider the psychological aspects of systems design [6-8].
REFERENCES
1. F. Luthans, Organizational Behaviour, 5th Edition, McGraw-Hill, New York, 1988.
2. J.R. Hackman and G.R. Oldham, Work Redesign, Addison-Wesley, California, 1980.
3. D.M. Prue and J.A. Fairbank, "Performance Feedback in Organizational Behaviour Management: A Review", J. Org. Behaviour Mgt., Spring (1981).
4. D.A. Nadler, "The Effects of Feedback on Task Group Behaviour: A Review of the Experimental Research", Org. Behaviour and Human Perf., June (1979).
5. J. Donnelly, J. Gibson and J. Ivancevich, Fundamentals of Management, Business Publications, Texas, 1981.
6. J.A. Hughes, I. Somerville, R. Bentley and D. Randall, "Designing with Ethnography: Making Work Visible", Interacting With Computers, 5 (1993).
7. K. Davids and R. Martin, "Shopfloor Attitudes towards Advanced Manufacturing Technology: The Changing Focus of Industrial Conflict?", Interacting With Computers, 4 (1992).
8. F.D. Davis, "User Acceptance of Information Technology: System Characteristics, User Perceptions and Behavioral Impact", Int. J. of Man-Machine Studies, 38 (1993).
IV.11 HCI Standards
Human-Computer Interaction Standards
Nigel Bevan
NPL Usability Services, National Physical Laboratory, Teddington, Middx, TW11 0LW, UK
[email protected]

1. DIFFERENT APPROACHES TO STANDARDS FOR HCI
It is often assumed that a standard means a precise specification. Such standards have brought benefits in many fields, e.g. bolts which screw into nuts, ATMs which can read credit cards, and compilers which can read programming languages. Some HCI standards are also of this type: many design guides provide a detailed specification of the nature of the user interface. Although standard user interfaces provide the benefit of consistency, they become out of date as technology changes, and are usually only appropriate for limited types of users and tasks (Bevan and Holdaway, 1993).

Thus most work on international standards for HCI has not been about precise specification, but has instead concentrated on the principles which need to be applied in order to produce an interface which meets user and task needs. These standards broadly fall into two categories. One is a "top-down" approach which is concerned with usability as a broad quality objective: the ability to use a product for its intended purpose. The other is a product-oriented "bottom-up" view which is concerned with aspects of the interface which make a system easier to use. The broad quality view originates from human factors, and standards of this type are applicable in the broad context of design and quality objectives. The product-oriented view concentrates on the design of specific attributes, and relates more closely to the needs of the interface designer and the role of usability in software engineering (see Bevan, 1995). Section 4 explains how standards can be used to provide a means of meeting the requirements for the operator-computer interface in the European Directive on Display Screen Equipment.

1.1 Usability as a quality objective
These standards relate to usability as a high-level quality objective, and usability is defined in this way in ISO 9241-11:
Usability: the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.
Standards of this type can be used to support the following activities:
• specification of overall quality and usability requirements and evaluation against these requirements (ISO 9241-11 and ISO/IEC 14598-1)
• incorporation of usability into a quality system (ISO 9241-11)
• incorporation of usability into the design process (ISO/IEC 13407)
Section 2 describes these standards.

1.2 Product-oriented standards
In the product-oriented view, usability is seen as one relatively independent contribution to software quality, and is defined in this way in ISO/IEC 9126:
Usability: a set of attributes of software which bear on the effort needed for use and on the individual assessment of such use by a stated or implied set of users.
Section 3 describes standards which deal with usability in terms of attributes which must be designed into a software product to make it easy to use:
• ISO 9241: Ergonomic requirements for office work with visual display terminals: Parts 10 and 12-17: dialogue design
• ISO/IEC 10741: Dialogue interaction - Cursor control for text editing
• ISO/IEC 11581: Icon symbols and functions
• ISO/IEC 9126: Software product evaluation - Quality characteristics and guidelines for their use

These standards can be used in the following ways:
• To specify details of the appearance and behaviour of the user interface
• To provide detailed guidance on the design of user interfaces
• To provide criteria for the evaluation of user interfaces

However, the attributes which a product requires for usability depend on the nature of the user, task and environment. A product has no intrinsic usability, only a capability to be used in a particular context. ISO 9241-11 can be used to help understand the context in which particular attributes may be required.

2. USABILITY AS A HIGH LEVEL QUALITY OBJECTIVE
2.1 ISO 9241-11 Guidance on Usability
The objective of designing and evaluating for usability is to enable users to achieve goals and meet needs in a particular context of use. ISO 9241-11 explains how usability can be specified and evaluated in terms of user performance and satisfaction. User performance is measured by the extent to which the intended goals of use are achieved (effectiveness) and the resources such as time, money or mental effort that have to be expended to achieve the intended goals (efficiency). Satisfaction is measured by the extent to which the user finds the use of the product acceptable. ISO 9241-11 also emphasises that usability is dependent on the context of use and that the level of usability achieved will depend on the specific circumstances in which a product is used. The context of use consists of the users, tasks, equipment (hardware, software and materials), and the physical and organisational environments which may all influence the usability of a product (see Figure 1).
[Figure 1. Usability framework: the user, task, equipment and environment make up the context of use; the user's goals lead, through interaction, to an outcome; usability is the extent to which the intended goals are achieved with effectiveness, efficiency and satisfaction, which serve as the usability measures.]
ISO 9241-11 was developed in close conjunction with the MUSIC project. The user-based MUSIC methods and tools provide a practical implementation of the principles of the standard. The Usability Context Analysis Guide (Macleod, 1994) provides a procedure for documenting the context of use and the context of evaluation. The Performance Measurement Method (Bevan and Macleod, 1994) provides a reliable and repeatable method for measuring effectiveness and efficiency and diagnosing usability problems. SUMI (Kirakowski, 1995) enables different aspects of user-perceived usability to be measured and areas of difficulty to be pinpointed. Cognitive workload can be measured (Wiethoff et al., 1993) as a means of predicting over- or under-loading of the user.
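In code, performance measures of this kind might be computed as follows. This is a minimal sketch under common interpretations of the standard (effectiveness as the proportion of the goal achieved weighted by output quality, user efficiency as effectiveness per unit time, relative efficiency against an expert baseline); the function names and figures are illustrative, not taken from ISO 9241-11 or the MUSIC tools.

```python
def effectiveness(quantity_pct: float, quality_pct: float) -> float:
    """Proportion of the task goal achieved, weighted by output quality (%)."""
    return quantity_pct * quality_pct / 100.0

def user_efficiency(effectiveness_pct: float, task_time_min: float) -> float:
    """Effectiveness achieved per minute of task time."""
    return effectiveness_pct / task_time_min

# Illustrative trial: a user completes 90% of the goal at 80% quality in 12 min;
# an expert reaches 100%/100% in 5 min.
e_user = effectiveness(90, 80)  # 72%
print(f"effectiveness      : {e_user:.0f}%")
print(f"user efficiency    : {user_efficiency(e_user, 12):.1f} %/min")
relative = user_efficiency(e_user, 12) / user_efficiency(effectiveness(100, 100), 5)
print(f"relative efficiency: {relative:.0%} of expert performance")
```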
2.2 Quality Systems and ISO 9001
Dealing with usability as part of a quality system for the design and development of products, as specified in ISO 9001, involves the systematic identification of requirements for usability, including usability measures and verifiable descriptions of the context of use. These provide design targets which can be the basis for verification of the resulting design.

ISO 9001 specifies what is required for a quality system. A quality system is a documented set of procedures intended to ensure that a product will meet initially stated requirements. A quality system is a desirable (though not sufficient) condition for achieving quality of the end product. ISO 9241-11 describes how the usability of a product can be defined, documented and verified as part of a quality system which conforms to ISO 9001 (Figure 2). The overall context of use should be identified, usability requirements should be specified, usability issues should be monitored during development, and the usability achieved should be evaluated.

[Figure 2. Quality Plan: activities (identify context of use; select usability measures, criteria and context; evaluate usability; redesign product) paired with their documents/outputs (specification of context of use; usability specification; statement of compliance with criteria; improved product).]

Overall context of use: Information about the characteristics of users, their goals and tasks and the environments in which the tasks are carried out provides important input to the specification of overall product requirements, prior to the development of specific usability requirements.

Usability requirements: Prior to the development of a custom system, the purchasing organisation should specify the usability requirements which the system must meet and against which acceptance testing may be carried out. Specific contexts in which usability is to be measured should be identified, measures of effectiveness, efficiency and satisfaction selected, and acceptance criteria based on these measures established.
Monitor usability: At various stages during the development process the developer should measure the usability achieved against these targets. This information enables objective decisions to be taken about the need for design changes to enhance usability, and about trade-offs which may be appropriate between usability and other requirements.

Usability evaluation: The characteristics of the context in which a product is likely to be used need to be identified. To ensure the validity of test results, the users, tasks and environments used for the evaluation should match the real context of use as closely as possible.

2.3 Quality of use
ISO 9241-11 introduces the concept of a work system, consisting of users, equipment, tasks and a physical and social environment, for the purpose of achieving particular goals. Measures of user performance and satisfaction assess the quality of the work system in use, and, when a product is the focus of concern, these measures provide information about the usability of that product in the particular context of use provided by the rest of the work system. ISO 9241-11 defines the quality of a work system in use as:
Quality of a work system in use: the extent to which specified goals can be achieved with effectiveness, efficiency and satisfaction in a specified work system.
The difference between usability and the quality of a work system in use is a matter of focus. When usability is evaluated, the focus is on improving a product while the other components of the work system (user, task, equipment, and environment) are treated as given. If the aim is to improve the quality of the overall work system in use, any part of the work system may be the subject of design or evaluation. For example, it may be appropriate to consider the amount of user training to be provided, changes in lighting, or re-organisation of the task. In this case the element which is the object of design or evaluation is considered to be subject to potential variation, while the other elements of the work system are treated as fixed.

2.4 Software quality evaluation
ISO 8402 (Quality Vocabulary) defines quality as:
Quality: the totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs.
This defines quality in terms of the characteristics of a product. To the extent that user needs are well-defined and common to the intended users, it implies that quality is an inherent attribute of the product. However, if different groups of users have different needs, then they may require different characteristics for a product to have quality. ISO/IEC 14598-1 (Information Technology - Evaluation of Software Products - General Guide) distinguishes between the concept of quality as an inherent characteristic of the product, and quality of use:
Quality of use: the extent to which an entity satisfies stated and implied needs when used under stated conditions.
The ultimate objective of software quality evaluation is to ensure that the product provides quality of use, i.e. that it meets the needs of the users. (Users may include operators, recipients of the results of the software, or maintainers of the software.) This definition of quality of use is very similar to the definitions of usability and of the quality of a work system in use in ISO 9241-11. The only difference is that ISO 9241-11 specifically defines the needs in terms of user performance and satisfaction, and the stated conditions in terms of users, goals and environments. Internal software quality attributes (such as the functionality, usability and efficiency attributes defined in ISO/IEC 9126) can be used as indicators to estimate final software quality. The specific internal attributes which are relevant to the final quality of use will depend on the intended conditions of use; for an interactive product this will depend on the needs of the eventual end users and their tasks.
2.5 Human-centred design
To achieve the overall objective of usability and quality of use requires a human-centred approach to design. This is the subject of a standard under development: ISO 13407 (Human-centred design process for interactive systems). This standard is expected to cover topics including planning the usability process, incorporating human-centred design activities in interactive system development processes, and assessing the benefits of human-centred design.

3. DESIGNING FOR USABILITY: PRODUCT-ORIENTED STANDARDS
Usable products may be designed by incorporating product features and attributes known to benefit users in particular contexts of use. ISO 9241 provides requirements and recommendations relating to the attributes of the hardware, software and environment which contribute to usability, and the ergonomic principles underlying them. The following parts of ISO 9241 and other standards deal with attributes of the software:
• ISO 9241-10: Dialogue principles. This part of ISO 9241 deals with general ergonomic principles which apply to the design of dialogues between humans and information systems: suitability for the task, suitability for learning, suitability for individualisation, conformity with user expectations, self-descriptiveness, controllability, and error tolerance.
• ISO 9241-12: Presentation of information. This part of ISO 9241 contains specific recommendations for presenting and representing information on visual displays. It includes guidance on ways of representing complex information using alphanumeric and graphical/symbolic codes, screen layout and design, as well as the use of windows.
• ISO 9241-13: User guidance. This part provides recommendations for the design and evaluation of user guidance attributes of software user interfaces, including prompts, feedback, status, on-line help and error management.
• ISO 9241-14: Menu dialogues. This part provides recommendations for the ergonomic design of menus used in user-computer dialogues. The recommendations cover menu structure, navigation, option selection and execution, and menu presentation (by various techniques including windowing, panels, buttons, fields, etc.). Part 14 is intended to be used by both designers and evaluators of menus, although its focus is primarily towards the designer.
• ISO 9241-15: Command language dialogues. This part provides recommendations for the ergonomic design of command languages used in user-computer dialogues. The recommendations cover command language structure and syntax, command representations, input and output considerations, and feedback and help. Part 15 is intended to be used by both designers and evaluators of command dialogues, but the focus is primarily towards the designer.
• ISO 9241-16: Direct manipulation dialogues. This part provides recommendations for the ergonomic design of direct manipulation dialogues, and includes the manipulation of objects and the design of metaphors, objects and attributes. It covers those aspects of graphical user interfaces which are directly manipulated and not covered by other parts of ISO 9241. Part 16 is intended to be used by both designers and evaluators of direct manipulation dialogues, but the focus is primarily towards the designer.
• ISO 9241-17: Form-filling dialogues. This part provides recommendations for the ergonomic design of form-filling dialogues. The recommendations cover form structure and output considerations, input considerations, and form navigation. Part 17 is intended to be used by both designers and evaluators of form-filling dialogues, but the focus is primarily towards the designer.
• ISO/IEC 10741-1: Dialogue interaction - Cursor control for text editing. The standard specifies how the cursor should move on the screen in response to the use of cursor control keys.
• ISO/IEC 11581: Icon symbols and functions - Part 1: General. This part contains a framework for the development and design of icons, including general requirements and recommendations applicable to all icons.
• ISO/IEC 11581: Icon symbols and functions - Part 2: Object icons. This part contains requirements and recommendations for icons that represent functions by association with an object, and that can be moved and opened. It also contains specifications for the function and appearance of 20 icons.

Before designing appropriate usability attributes into the software following the guidance and requirements of the standards listed above, a software designer needs to identify the anticipated users, tasks and environments using ISO 9241-11. However, using attributes which conform to these standards cannot guarantee that a product reaches a required level of usability, as these standards do not provide an exhaustive specification of how to apply the general principles that make a product usable.
4. EUROPEAN DISPLAY SCREENS DIRECTIVE
The European Directive on Display Screen Equipment (CEC, 1990; Bevan, 1991) is primarily concerned with the physical working environment and working conditions, but also includes requirements that:
• Software must be suitable for the task.
• Software must be easy to use and, where appropriate, adaptable to the user's level of knowledge or experience.
• Systems must display information in a format and at a pace which are adapted to users.
• The principles of software ergonomics must be applied.
This applies immediately to software used as part of new workstations, and to all workstations from 1997. Conformance with usability standards provides one means of ensuring compliance with the Directive. The minimum requirements of the Directive are similar, but not identical, to the requirements of the relevant parts of ISO 9241, which are in much greater detail. In particular, ISO 9241-10 contains the main principles of software ergonomics. In general, the standards contain broader requirements than the Directive, as the Directive is concerned only with health and safety, while the standards are also concerned with the effectiveness and efficiency of users.

It would have been simpler if the Directive had made direct reference to standards rather than containing its own requirements. However, not all the standards are complete, and the contents of standards are agreed by experts in national standards bodies, while the contents of the Directive are approved at a political level in the European Commission.

REFERENCES
Bevan, N. (1991). Standards relevant to European Directives for display terminals. In: Bullinger, H.-J. (ed.), Proceedings of the 4th International Conference on Human Computer Interaction, Stuttgart, September 1991. Elsevier.
Bevan, N. (1995). Usability is quality of use. In: Anzai & Ogawa (eds.), Proceedings of the 6th International Conference on Human Computer Interaction, Yokohama, July 1995. Elsevier.
Bevan, N. and Holdaway, K. (1993). User needs for user system interaction standards. In: Evans, Meek and Walker (eds.), User needs for information technology standards. Butterworth-Heinemann.
Bevan, N. and Macleod, M. (1994). Usability measurement in context. Behaviour and Information Technology, 13, 132-145.
CEC (1990). Minimum safety and health requirements for work with display screen equipment. Directive 90/270/EEC, Official Journal of the European Communities No L 156, 21/6/90.
Houwing, E.M., Wiethoff, M., and Arnold, A.G. (1994). Introduction to cognitive workload measurement. Delft University of Technology (WIT Lab).
Kirakowski, J. (1995). The software usability measurement inventory: background and usage. In: Jordan, P. et al. (eds.), Usability Evaluation in Industry. Taylor & Francis, UK (in press).
Macleod, M. (1994). Usability in Context: Improving Quality of Use. In: Bradley, G. et al. (eds.), Human Factors in Organizational Design and Management - IV. Elsevier/North Holland.
The applicability of the ISO user interface standards
Frederik Dehlholm
Datacentralen, Oldenburg Alle 1, DK-2630 Taastrup, Denmark
E-mail:
[email protected]
Parts 10 to 17 of ISO 9241 are user interface standards. Part 10 was agreed upon in 1994 and will be issued in the first half of 1995. Three of the other parts are out for final voting at the beginning of 1995, and the last four parts will follow in 1996 and 1997. Part 10 contains seven general dialogue principles and examples of their application. Parts 12 to 17 contain specific rules for the dialogue and for the design and layout of the screen. As a member of the ISO group that makes these standards, Datacentralen has already been able to incorporate these standards into its usability work. (Datacentralen is a large Danish software house that develops business information systems for many different platforms.)
The seven dialogue principles.
In our company some discussion has taken place about the necessity of incorporating Part 10 into our usability work. Some regarded the seven dialogue principles as a good introduction and as good general guidelines for our usability work. On the other hand, the quality assurance team found the principles too general: they thought it would be impossible to check whether a specific user interface complied with them. The conclusion of the discussion was that the seven dialogue principles were incorporated in the company styleguide. There they serve as an introduction, but are at the same time presented as principles that ought to be followed. The seven principles are quoted together with some guidelines on how to apply them; these guidelines are those of the ISO standard that are relevant to the kind of administrative applications made by Datacentralen.
The company styleguide.
The purpose of the company styleguide is to help the systems developers make usable and beautiful systems, and at the same time to make sure that the systems have the corporate look and feel and follow the de facto standards for the platforms that the systems are made for. The general content of the seven dialogue principles has not been disputed, but one of the guidelines for the third principle, that the user should be in control, created a lot of debate. The guideline says that the waste-paper basket should not be emptied until the user wants it.
The arguments against this guideline were:
• it is too difficult to implement
• it is a part of the CASE tools that we use
• it is a feature in either Windows or OS/2
• the waste-paper basket will become too full because the user forgets to empty it
The conclusion of the discussion was to include the guideline not only as a specification of the general principle, but also as a part of the corporate look of the user interface.
Propaganda for the ISO 9241 standard
We have made a lot of propaganda for ISO 9241:
• We have had several articles on ISO 9241 in the Danish ComputerWorld
• We have made a short introduction to the standard in the company styleguide
• We have presented the standard at some GUI seminars for all the systems developers in the company
• We have used one hour in the company GUI training course to go through Parts 10 to 17
Incorporation of the Parts 11 to 17.
The drafts of Parts 11 to 17 contain a lot of specific rules for the user interface. We did not want to include all these rules in the company styleguide, because that would make the styleguide document much too comprehensive; we wanted to make it as small as possible. So we went through all the rules in the drafts of Parts 11 to 17 and included only those that (a filter of this kind is sketched in code below):
• are relevant for the kind of administrative systems that are made at Datacentralen, and at the same time
• are not included in CUA or Windows, and at the same time
• are not self-evident for the systems developers at Datacentralen, considering that almost all of them have been through a three-day course in usability
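Expressed as code, the selection amounts to a conjunctive filter over the rule set. The following minimal sketch uses invented rule records and flags; the actual selection at Datacentralen was of course done by hand.

```python
# Hypothetical rule records; the boolean flags encode the three criteria.
rules = [
    {"id": "9241-14.x", "text": "Order menu options by frequency of use",
     "relevant_admin": True, "in_cua_or_windows": False, "self_evident": False},
    {"id": "9241-12.y", "text": "Group related fields visually",
     "relevant_admin": True, "in_cua_or_windows": False, "self_evident": True},
    {"id": "9241-16.z", "text": "Provide drag-and-drop between icons",
     "relevant_admin": False, "in_cua_or_windows": True, "self_evident": False},
]

styleguide = [r for r in rules
              if r["relevant_admin"]            # fits administrative systems
              and not r["in_cua_or_windows"]    # not already in CUA or Windows
              and not r["self_evident"]]        # not obvious after usability training

for r in styleguide:
    print(r["id"], "-", r["text"])              # only the first rule survives
```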
The rules of Parts 11 to 17 have not caused much debate. The issue in the company styleguide that has given rise to most debate has been which elements to use in designing the corporate image in the user interface. Many parts of the company participated in the development of the company styleguide. This was crucial for both the quality and the acceptance of the standard. The standard was also reviewed by all the existing GUI projects in the company, but that turned out not to be enough. The persons writing the company GUI guides for the different GUI tools used in our company should also have participated in the group developing the company GUI guide; being reviewers of the guide was not enough to make them feel committed to it. As a result there are a few areas in which the GUI guides for the GUI tools deviated from the company GUI guide. This is now being corrected.
Application of ergonomic standards to the EC Directive on requirements for display screen equipment
Henrik Hopff
Tele Danmark Ltd., Ergonomic Office, 21 Noerregade, DK-0900 Copenhagen C

Abstract
The European Union is eager to promote the use of standards in the legal requirements for work with display screen equipment. This paper describes the difficulties in applying standardisation to the law-making process. It is shown to what extent standards are applicable and in which cases other methods are necessary. The paper presents a case study of how a large Danish company has managed to apply successfully Parts 3 and 10 of the ISO 9241 standard in its implementation of the requirements of the EC Directive 90/270/EEC.
1. INTRODUCTION
In the EC, the efforts towards obtaining a single market are regulated by using directives. The directives act as a way of harmonising the legislation of the individual member countries. One such directive sets out the minimum safety and health requirements for work with display screen equipment (90/270/EEC). The Directive was published in 1990. Unlike other directives, the directives concerning workers' safety and health are so-called minimum directives. This implies that the individual member countries can increase the requirements during the national implementation of the Directive. The Directive is aimed at the employers, who are responsible for the workers' safety and health. In the Directive on the minimum safety and health requirements for work with display screen equipment (90/270/EEC) no references to international standards are found.

1.1. CEN Standards
However, in 1988 SOGITS (Senior Officials Group Information Technologies Standardisation), which is formed by the European Commission, moved towards making use of what was to become the ISO 9241 standard. SOGITS mandated CEN, the European Committee for Standardisation, to convert the first four parts of the ISO 9241 standard into CEN standards. In this way, the European Commission wanted to provide a link between the Directive and the ongoing standardisation work. In future a revised directive might include a reference to the 9241 standards.
2. DIFFICULTIES IN USING ERGONOMIC STANDARDS
Some European member countries, including Denmark, recognise a difficulty in regulating the protection of workers' safety and health by applying standards to the legislation. In the Danish community, the legislation on workers' safety and health is provided by the authorities in close cooperation with employers and workers. The use of standards will affect the influence of the employers and workers: with standards, the participants will have less opportunity to influence the national legislation.
3. PRACTICAL USE OF THE ISO 9241 STANDARD
The Danish legislation on workers' safety and health demands that all major companies form an in-house security organisation. This organisation consists of representatives of the company management and the workers. Its purpose is to ensure that equipment and work procedures meet the legal requirements. Tele Danmark Ltd. is the national telephone operator, with at present three million subscribers. The company has participated in the national and international preparation of the ISO 9241 standard. Owing to this work, it has been possible to perform an early implementation of the standard in the company. Three different approaches have been taken to make use of the standard.
3.1. Checklists
In order to provide the security organisation with checklists, three different checklists have been developed for hardware, software and office equipment. The checklists contain the recommendations of the parts concerning the display unit and the general dialogue principles. In addition, the checklists include the recommendations of the Swedish MPR 1990 and TCO '92 ergonomic requirements. All suppliers of office equipment are asked to fill in the relevant checklists and send them to the members of the security organisation. The company can then use the checklists to evaluate whether the products meet the requirements of that particular part of the company.
3.2. Project Management Handbook
In order to make sure that new software products conform to usability principles, the Part 10 dialogue principles are built into the handbook. The handbook follows the traditional model of development. At the beginning of a project, the prospective user organisation is notified in order to let the security organisation provide the ergonomic requirements. In the design of the user interface, the ergonomic recommendations are applied. Quality control must go through the initial design and make sure that all deviations from the standard are reported to the security organisation. In the near future, usability testing will be introduced in order to make sure that those usability aspects not covered by the recommendations in the standard are investigated and documented.
3.3. Company Styleguide
The requirements in the Part 10 dialogue principles are used to design the layout of user interfaces and the dialogue principles, as are the principles of the IBM SAA/CUA styleguide. The Company Styleguide provides the developers with a tool for making consistent interfaces built on ergonomic principles.
4. EXPERIENCE GAINED FROM THE IMPLEMENTATION
The main difficulty in implementing the standard is the fact that the use of standards may obscure the tradition of heuristic testing of user interfaces. The ergonomic recommendations sometimes override individual user requirements derived either from usability testing or from the initial user requirements. Many users and developers are reluctant to rely on ergonomic standards and prefer to rely on results from usability testing and on experience gained from similar user interfaces. In this case the standard is used as a design guide where applicable: the recommendations are followed only when they fit into the developer's mental model of the user dialogue. I see this conflict as a natural consequence of the role of ergonomic standards in the Danish community, where protection of workers' safety and health relies on individual inspection and action and not on the use of standards.
Structured Human Interface Validation Technique - SHIVA
Jürgen Ziegler, Fraunhofer Institute IAO
Michael Burmester, Fraunhofer Institute IAO
tel: +49 711 970 2334, fax: +49 711 970 2300
[email protected], [email protected]

The paper describes a new evaluation approach based on a walkthrough method combined with the task and navigation model of the user interface software to be evaluated. The user interface is evaluated according to the requirements of ISO 9241.
1. NEED FOR USABILITY EVALUATION
More and more workplaces are equipped with visual display terminals. It is estimated that by the year 2000, 90% of all employees in industrialised countries will use VDTs in order to perform their tasks (Fähnrich, 1987). This leads to the situation that, for VDT workplaces, health and safety on the one hand and effectiveness and efficiency on the other are required. Since January 1, 1993, the Directive 90/270/EEC "on the minimum safety and health requirements for work with display screen equipment", issued by the Council of the European Union, has been in force. The governments of the member states of the European Union must convert this directive into national law. With this, the European directive is valid and must be applied to workplaces with display screen equipment.

The EU Directive demands very high safety and health standards for work being carried out on computer systems. Workplaces must be analysed, and it must be ensured that the VDT workplaces exclude heavy physical and mental strain. Furthermore, it is required to involve end users in the design and development phases of workplaces employing computer systems. In five short paragraphs the EU Directive lays down special requirements concerning man-machine interfaces: the software must be adapted to improve the execution of jobs. It must also be easy to use, flexible, user-oriented and able to present information about currently running processes in the system. Besides this, the most basic ergonomic standards must be employed. The most widely agreed approach is to interpret these requirements on the basis of ISO 9241. Part 10 "Dialogue Principles" of ISO 9241 (1994) is already agreed as an international standard concerning the software-ergonomic quality of user interfaces. According to it, software must fulfil the following dialogue principles: suitability for the task, self-descriptiveness, controllability, conformity with user expectations, error tolerance, suitability for individualisation, and suitability for learning. Taking these facts into account, it is obvious for software developing organisations that efficient and effective means for user interface design and evaluation are required.

2. EVALUATION METHODS
A lot of user interface evaluation methods are available, and two main classes can be differentiated. One group of evaluation methods is focused on the users of a particular system. The other category of evaluation methods is designed to support ergonomic experts in evaluating user interfaces. Heuristic evaluation and cognitive walkthrough are well-known expert evaluation approaches.
2.1. Expert Evaluation
Heuristic evaluation is based on a systematic and theory-driven assessment of user interfaces carried out by software ergonomics experts (Nielsen, 1992). A large variety of procedures for expert evaluation can be found. Some expert methods are highly structured, providing a lot of instructions for the evaluator. An example of a highly structured and voluminous expert evaluation method is EVADIS II (Reiterer & Oppermann, 1993). Other expert evaluation procedures provide only some guidelines. For example, heuristic evaluation can be carried out by assessing each screen according to a four-level user interface model (VDI, 1990; Görner, 1994). Following this approach, the user interface is analysed on the following levels:
• Task level (What are the tasks? Which goals do users have?)
• Functional level (Does the software provide all functions needed to perform the tasks?)
• Syntactic level (Is the meaning of labels and icons clear? Does the system provide clear feedback?)
• In- and output level (Are the dialogue elements correctly chosen? Are colours, grouping of elements etc. according to human factors requirements?)
Cognitive walkthrough is another expert evaluation approach. It is based on human action theory (Norman, 1986). The user interface evaluation is structured by the tasks of the users. A task is defined by a goal which has to be achieved and actions to be performed in order to reach the goal. The expert evaluates the user interface according to the human action cycle (Polson, Lewis, Rieman & Wharton, 1992):
• Goals
• Action plan
• Execution of actions
• Evaluation of feedback
• Revision of the goals
• Continuation of the cycle
2.2. User Testing
User testing or empirical evaluation is based on the analysis of user behaviour during the use of the software to be evaluated. A pool of several data collection and data analysis methods is available for running user trials: logfile analysis, video protocolling, verbal protocolling, task completion time, number of errors, calculation of unproductive time, etc.
2.3. Comparison of Expert Evaluation and User Testing
The following table compares expert evaluation and user testing methods according to the criteria of effectiveness and expenditure. For software development, especially in small software companies, expenditure is a central cost factor.
Table 1: Comparison of Expert Evaluation and User Testing
User Testing
Expert Evaluation
Effectiveness
70% of errors in user interface design can be detected by running user trials with 5 users (Nielsen & Landauer, 1993)
1 Expert is able to detect between 30% and 50% of user interface problems, 5 experts are able to detect up to 95% of design errors (Molich & Nielsen, 1990)
Expenditure
very high: : Acquisition of users, planning of trials, data collection methods, data analysis, interpretation and generation of design alternatives;
low: Little preparation for heuristic evaluation is needed Cognitive walktrough requires the definition of the user tasks
At least two evaluators are needed to perform user trials The effectiveness of both approaches is nearly similar, but a big difference can be found in the costs of the two approaches. Karat, Campbell & Fliegel (1992) calculated 160 hours total time for user testing and 94 hours total for walkthrough performed by one expert. This is a significant difference. The proportion of significant problem areas detected in the software per hour evaluation work is better for empirical evaluation (4.0 hours to detect a significant problem) than for individual walkthrough (5.2 hours to detect a significant problem). In contradiction to this, Nielsen & Landauer (1993) showed that expert evaluation is more effective than empirical evaluation, especially if the experts are experienced with the tasks environment of the system to be evaluated. The reason for the different results is due to the different level of expertise between the two experiments. Nielsen & Landauer (1993) run the expert evaluation with human factors experts and Karat et al. (1992) running their trials with experienced computer users as experts. Taking into account the good effectiveness and cost-efficiency of expert evaluation, a well structured evaluation approach which is closely related to ISO 9241 seems to be promissing. 3.
3. TASK AND OBJECT ORIENTED DEVELOPMENT METHOD
In order to be efficient and effective in user interface evaluation, it is necessary to construct evaluation methods in accordance with user interface development methods and the development process. In the following, a methodology for the development and design of object-oriented graphical user interfaces (GUIs) is described (Beck, Janssen, Weisbecker and Ziegler, 1994). The method follows an iterative development approach which includes object-oriented task analysis and modelling, as well as navigation and dialogue modelling, combined with prototyping and modelling techniques from software engineering (Ziegler & Janssen, 1995). The user interface development method is strictly user oriented. The iterative development process consists of four phases:
• Analysis of user tasks
• Definition of the essential task model
• Conceptual user interface model
• Design of the user interface
During the analysis of user tasks, the main focus is on getting information concerning the user behaviour, the time structure of task sequences and the objects needed to perform the tasks. Data collection can be done by using interview techniques, user observation (e.g. event sampling), the scenario technique, or any other suitable technique (Kirwan & Ainsworth, 1992).
From these data an object model and a task model will be generated. The object model consists of the objects needed and their attributes. The task model covers the dynamic relations between the objects. The combination of the object and task aspects is called the essential task model. The structure of the task model is represented by data flow diagrams linked to the object model, analogous to entity-relationship diagrams in Structured Analysis (Yourdon, 1989).

The conceptual user interface model is constructed on the basis of the object and task model. For the representation of objects, user views are defined. A view is a part of the overall object model and has two aspects. The logical aspect defines the elements of the object model belonging to the view. The physical aspect defines the screen appearance, the association of GUI interaction objects, and the layout. The distinction between conceptual objects and views is similar to the MVC model (Krasner & Pope, 1989). On the navigation and dialogue control side, two layers are distinguished. The coarse-grain dialogue defines the sequencing of views and the calling of object methods in response to user input. The fine-grain dialogue defines the dynamics of view attributes, e.g. the change of the sensitivity of menu choices according to the state of the dialogue.

After the view and navigation model have been defined, the design of the user interface can be started. During this phase the dialogue elements will be selected. The selection can be supported by rapid prototyping tools providing the dialogue elements. Normally, these dialogue elements are designed according to an industrial standard (e.g. MS Windows, 1991, or CUA, 1991). Furthermore, the visual information presentation will be performed in this phase. With the first prototype, the evaluation of the user interface can be started. After evaluating the user interface, the refinement of the user interface will start with the next iteration.
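A minimal sketch of how the essential task model might be represented in code; the class and object names are invented for illustration (the paper defines the modelling concepts, not an implementation). Objects carry attributes, and tasks record the dynamic relations between the objects they read or change.

```python
from dataclasses import dataclass, field

@dataclass
class BusinessObject:
    """Element of the object model: a thing the user works on."""
    name: str
    attributes: list = field(default_factory=list)

@dataclass
class Task:
    """Element of the task model: a dynamic relation between objects."""
    name: str
    reads: list = field(default_factory=list)     # objects only inspected
    changes: list = field(default_factory=list)   # objects modified

supplier = BusinessObject("Supplier", ["name", "address", "rating"])
order = BusinessObject("Order", ["number", "date", "items"])

tasks = [
    Task("check supplier status", reads=[supplier, order]),
    Task("place order", reads=[supplier], changes=[order]),
]

# The essential task model is the combination of both aspects:
for t in tasks:
    print(t.name, "| reads:", [o.name for o in t.reads],
          "| changes:", [o.name for o in t.changes])
```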
[Figure 1 (diagram): iterative development approach running from task analysis (e.g. user observation, scenario technique) with user participation, through the object model and task model, to view definition and navigation structure, then selection of dialogue objects and visual information design, with prototyping and evaluation feeding back into the cycle.]
Figure 1. Methodology for task and object oriented development of graphical user interfaces.

4. NAVIGATION MODEL AS BASIS FOR EVALUATION
The Structured Human Interface Validation Technique SHIVA uses a model of the navigational structure of the system for systematically assessing its usability. As described in the software development method, the navigation model describes how the user can move between the different parts of the user interface. The elements of the navigation model are so-called views, which provide a specific perspective on the data object to be manipulated. Views typically correspond to collections of interaction and display objects shown in a single window or dialogue box, or represented by iconic objects on a desktop. In order to perform a certain task, the user has to navigate to a particular view which supports the task at hand, or has to go through a sequence of views to accomplish the different subtasks associated with the main goal. The navigation model is thus an important abstraction of the actual interaction with the system, describing the conceptual architecture of the user interface. It can be represented in graphical form, e.g. as specific Petri nets (Janssen, 1993) or as a network of predefined view types which are depicted as different pictograms (Ziegler, 1995). An example of the latter modelling approach, as part of a larger system, is shown in Fig. 2.
[Figure 2 (diagram): network of view pictograms for a purchasing system, with Suppliers and Orders icons leading via search dialogue boxes (Search Supplier, Search Order) or list views (Suppliers, Orders) to the Supplier Data and Order Data views.]
Figure 2. Example of a navigational structure showing a part of a larger system for a purchasing department.

The view Supplier Data can be reached by starting from a supplier icon, leading either over a search dialogue box (if there are many suppliers) or over a supplier list. The Supplier Data view can also be accessed via the orders which have been issued to a particular supplier. The navigation model abstracts from the specific techniques and conditions for making a transition from one view to another, as well as from the specific contents of the different views. These aspects will be evaluated later on the basis of the actual system, a prototype or additional specifications.
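A navigation model of the kind shown in Figure 2 maps naturally onto a directed graph over views. The sketch below is illustrative, not part of SHIVA: the view names follow the figure, while the adjacency representation and the reachability check are our own additions. Tracing which views can be reached from a given entry point mirrors what the evaluator does when following a task.

```python
from collections import deque

# Directed view graph following Figure 2 (transitions one-way for brevity).
navigation = {
    "Suppliers icon": ["Search Supplier", "Suppliers list"],
    "Orders icon": ["Search Order", "Orders list"],
    "Search Supplier": ["Supplier Data"],
    "Suppliers list": ["Supplier Data"],
    "Search Order": ["Order Data"],
    "Orders list": ["Order Data"],
    "Order Data": ["Supplier Data"],   # from an order to its supplier
    "Supplier Data": [],
}

def reachable(start: str) -> set:
    """Breadth-first search over the navigation model."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in navigation[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# A view that supports a task but does not appear here is a navigation defect.
print(sorted(reachable("Orders icon")))
```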
5. TASK SCENARIOS
Independently from elaborating the navigation model, a set of realistic task scenarios is developed, preferably by or in close co-operation with the prospective users of the system. These task scenarios can either be provided in a free textual form or as a more formalised description using a suitable task modelling technique. The scenarios should cover the relevant and typical tasks of the user and should address all major functionalities of the system to be evaluated. Scenarios represent composite tasks consisting of a set of subtasks. Scenarios can be organised around a single object to be manipulated ("change customer address data") or around semantically related objects ("check the status of a certain customer and see whether there are any orders which have not yet been delivered"). The object relations are typically documented in an object or entity-relationship model. It is not necessary that the scenarios cover all possible dialogue paths through the system, as a check concerning changing task goals will be performed during the evaluation walkthrough.

6. PERFORMING THE EVALUATION
The evaluation of the system is done on the basis of a walkthrough mechanism which is applied in two cycles. In the first cycle, the evaluator walks through all the different views of the system, asking a set of standard questions which are derived from overall usability principles such as ISO 9241-10. A first set of questions has been developed and is currently being further analysed. The questions relate either to the specific view under scrutiny or to the navigational transitions which are possible from this particular view. Questions concerning the contents of the view currently analysed are (related ISO criteria are shown in brackets):
• Which task/tasks can be performed within this view (task adequacy)?
• Are the interaction objects selected suitable for the user's task (task adequacy)?
• Is the information presented understandable to the user (self-descriptiveness, learnability)?
• Is help and support information available and adequate (self-descriptiveness)?
• Is the layout and the display of information clear and easy to perceive (ISO 9241-12, Information Presentation)?
In addition, the reviewer can note any other usability problems related to this view (cf. heuristic usability evaluation, see Nielsen & Mack, 1994). The second set of questions relates to the transitions leading from the current view:

• Are the transitions to other views of the same object completely supported (task adequacy, suitability for individualisation)?
• Are the transitions to semantically related objects completely supported (task adequacy, controllability)?
• Is it transparent to the user which other views of the same or related objects can be reached from the current one (controllability, learnability)?
• Are the interaction techniques for triggering the transitions suitable (conformance with user expectations, learnability)?

A checklist of this kind can be encoded directly, as sketched below.
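Because the question sets are fixed per view, they can be represented as data and iterated over each view of the navigation model. This is a hedged sketch of one possible encoding, not tooling described in the paper; the ISO-criterion tags are abbreviated.

```python
# Standard per-view questions, each tagged with the related usability criteria.
VIEW_QUESTIONS = [
    ("Which tasks can be performed within this view?", ["task adequacy"]),
    ("Are the interaction objects suitable for the user's task?", ["task adequacy"]),
    ("Is the presented information understandable?", ["self-descriptiveness", "learnability"]),
    ("Is help and support information available and adequate?", ["self-descriptiveness"]),
    ("Is the layout clear and easy to perceive?", ["ISO 9241-12"]),
]

TRANSITION_QUESTIONS = [
    ("Transitions to other views of the same object supported?", ["task adequacy", "individualisation"]),
    ("Transitions to semantically related objects supported?", ["task adequacy", "controllability"]),
    ("Is it transparent which views can be reached from here?", ["controllability", "learnability"]),
    ("Are the triggering interaction techniques suitable?", ["user expectations", "learnability"]),
]

def first_cycle(views):
    """Yield (view, question, criteria) triples for the evaluator to answer."""
    for view in views:
        for question, criteria in VIEW_QUESTIONS + TRANSITION_QUESTIONS:
            yield view, question, criteria
```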
In the second cycle, the evaluation focuses on the task scenarios provided. The reviewer walks through the scenarios, analysing the sequence of views needed for performing the particular scenarios. This yields results concerning the number of different views needed for performing the single tasks. In each step, the reviewer should ask a set of what-if questions concerning possibly changing task goals. These questions address issues like:

• What if the user wants to perform a different operation on the same object?
• What if the user wants to perform the operation on a set of objects instead of a single object?
• What if the user wants to see or manipulate a related object?
The assumption underlying these questions is that the tasks a user may want to perform cannot be completely enumerated in an efficient manner for information systems of realistic size. It is therefore necessary to explore the task space through what-if questions while going through the different steps of a defined scenario.
7. INTEGRATION OF THE WALKTHROUGH RESULTS

The two walkthrough cycles need not necessarily be performed in the sequence described above. However, the results must be integrated in order to get a coherent picture of the system's usability. The first walkthrough cycle yields information related to each single view in a local perspective, whereas the second cycle makes it possible to check the appropriateness of the system for a defined task or task sequence in a global perspective. The transitions evaluated in the first cycle may be used for answering the what-if questions addressed in the second cycle. Cycle 1 helps to determine which tasks the system supports at all, while cycle 2 checks whether this support is appropriate for a given set of scenarios. Currently, various forms of integrating, analysing and documenting the results of the walkthroughs are being analysed, such as tables showing which views are needed for which tasks. While such specific techniques are still under development, the method has already been used in a couple of test assessments. There is evidence that it can provide valuable support for an expert evaluator of a system by structuring the evaluation process. It remains an open issue, however, whether such techniques might also be used by non-experts or end users.
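One of the integration tables mentioned above, showing which views each scenario needs, can be derived mechanically once scenarios have been mapped to view sequences. A minimal sketch, reusing the hypothetical view names introduced earlier:

```python
# Hypothetical record from the second walkthrough cycle:
# scenario name -> ordered list of views visited.
walkthrough_log = {
    "update supplier address": ["supplier_icon", "supplier_list", "supplier_data"],
    "check order status":      ["order_list", "order_data"],
}

def view_task_table(log):
    """Cross-tabulate views against the scenarios that need them."""
    table = {}
    for scenario, views in log.items():
        for view in views:
            table.setdefault(view, set()).add(scenario)
    return table

for view, scenarios in sorted(view_task_table(walkthrough_log).items()):
    print(f"{view:15s} needed by: {', '.join(sorted(scenarios))}")
```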
8. REFERENCES

1. Beck, A., Janssen, C., Weisbecker, A. & Ziegler, J., Integrating Object-Oriented and Graphical User Interface Design. Proceedings of the SE/HCI Workshop, Sorrento, Italy, May 16-17, 1994.
2. Fähnrich, K.-P. (ed.), Software-Ergonomie. München: Oldenbourg Verlag, 1987.
3. Görner, C., Vorgehenssystematik zum Prototyping graphisch-interaktiver Audio/Video-Schnittstellen. Berlin: Springer, 1994.
4. IBM: SAA/CUA (System Application Architecture - Common User Access), Guide to User Interface Design (IBM No. SC34-4289-0), 1991.
5. ISO 9241, Ergonomic requirements for office work with visual display terminals (VDTs) - Part 10: Dialogue principles, 1994.
6. ISO 9241, Ergonomic requirements for office work with visual display terminals (VDTs) - Part 12: Presentation of information, 1994.
7. Janssen, C., Dialognetze zur Beschreibung von Dialogabläufen in graphisch-interaktiven Systemen. In: K.-H. Rödiger (ed.), Software-Ergonomie '93 - Von der Benutzeroberfläche zur Arbeitsgestaltung. Stuttgart: Teubner, 1993.
8. Karat, C.-M., Campbell, R. & Fliegel, T., Comparison of empirical testing and walkthrough methods in user interface evaluation. Proceedings of CHI '92, 397-404, 1992.
9. Kirwan, B. & Ainsworth, L.K. (eds.), A Guide to Task Analysis. London: Taylor & Francis, 1992.
10. Krasner, G.E. & Pope, S.T., A Cookbook for Using the Model-View-Controller User Interface Paradigm in Smalltalk-80. Journal of Object-Oriented Programming, 1 (3), 26-49, 1989.
11. Microsoft Corporation, The Windows Interface: An Application Design Guide. Redmond, Washington: Microsoft Press, 1991.
12. Molich, R. & Nielsen, J., Improving a Human-Computer Dialogue. Communications of the ACM, 33 (3), 1990.
13. Nielsen, J., Finding usability problems through heuristic evaluation. Proceedings of CHI '92, 373-380, 1992.
14. Nielsen, J. & Landauer, T.K., A mathematical model of the finding of usability problems. Proceedings of INTERCHI '93, 206-213, 1993.
15. Nielsen, J. & Mack, R. (eds.), Usability Inspection Methods. New York: John Wiley, 1994.
16. Norman, D.A., Cognitive Engineering. In: D.A. Norman & S.W. Draper (eds.), User-Centered System Design: New Perspectives on Human-Computer Interaction. Hillsdale: Lawrence Erlbaum, 1986.
17. Polson, P.G., Lewis, C., Rieman, J. & Wharton, C., Cognitive walkthroughs: a method for theory-based evaluation of user interfaces. International Journal of Man-Machine Studies, 36, 741-773, 1992.
18. Reiterer, H. & Oppermann, R., Evaluation of user interfaces: EVADIS II - a comprehensive evaluation approach. Behaviour and Information Technology, 12 (3), 137-148, 1993.
19. Software-Ergonomie in der Bürokommunikation (Software Ergonomics in Office Communication). VDI-Richtlinie 5005. Berlin: Beuth, 1990.
20. Yourdon, E., Modern Structured Analysis. Englewood Cliffs: Prentice Hall, 1989.
21. Ziegler, J. & Janssen, C., Aufgabenbezogene Dialogstrukturen für Informationssysteme. In: H.-D. Böcker (ed.), Software-Ergonomie '95 - Mensch - Computer - Interaktion, Anwendungsgebiete lernen voneinander. Stuttgart: Teubner, 1995.
22. Ziegler, J., Objektorientierter Entwurf graphisch-interaktiver Informationssysteme. Tech. Rep. Fraunhofer-Institut IAO, Stuttgart (in German), 1995.
V. Interface for Physically Challenged
V.1 Interface for Physically Challenged
Composition of Messages on Winking by ALS Patients

Naoyuki Kanou a, Michio Inoue b and Yasuhiro Kobayashi b

a Electronic Control Engineering, Yonago National College of Technology, Hikona, Yonago, 683 JAPAN
b Department of Information and Knowledge Engineering, Faculty of Engineering, Tottori University, Koyama, Tottori, 680 JAPAN

This paper is concerned with the development of a Communication Aid (CA) for Amyotrophic Lateral Sclerosis (ALS) patients who have lost the physical ability to communicate. A CA is equipment that provides facilities for conversation with them. In our CA, inquiries are arranged in matrix form and put to the patients by means of crosswise scanning. If a patient finds the desired item among those exhibited, he is requested to send a YES sign by winking before a CCD camera, because winking is one of the simplest motions and can be performed with little physical power. As a matter of fact, winks also occur due to physiological demands, so it becomes necessary to discriminate intentional winks from unintentional ones clearly. This paper proposes a method of remote sensing of intentional winking, as the sign of affirmation, by a Time-Delayed Neural Network (TDNN), and describes how messages are composed from the patients' winking.

1. INTRODUCTION

Patients with ALS are so seriously impaired in their physical capacity that they cannot speak even a word, though they retain their mental capacity. A Communication Aid (CA) is equipment that provides facilities for conversation with them. More specifically, the CA is required to furnish functions for interpreting their ideas and for expressing them in writing. Their ideas can barely be conveyed to us by a simple motion within their meagre physical power [1], [7]. As possible motions, ALS patients can move their eyelids or can turn their eyes intentionally. Suppose that the ALS patient, our conversation partner, is requested to send a sign when he finds an agreeable point among our questions to him. We, on the other hand, put our questions and watch for his affirmative response. If we can perceive his weak response, we can converse with him, though at a meagre conversational tempo [3]. To detect his physical motions, a strain-gauge sensor and an eye-camera are usable.
In practice, the former is attached to the eyelids to detect winks, while the latter is mounted on the spectacle frame to trace the visual focus. Unfortunately, these sensors are not welcomed in spite of their excellent performance: any attached-type sensor is obstructive, because attaching materials and wiring close to the eyes gives a feeling of psychological oppression. Moreover, a performance test is needed whenever these sensors are reset. In this paper, remote sensing technology using a TV camera and a time-delayed neural computation is applied to detect the patient's winks and to discriminate whether they are conscious or not. A detailed description follows.

2. PERCEPTION OF HIS SIGN

Conversation with the ALS patient is carried out in the form of answers to questions. His answer (affirmation only) is dispatched by shutting his eyelids intentionally. If we miss his signs, we cannot understand his ideas. In the present version of our CA, a TV camera is focused on the ALS patient's face, taking pictures at a rate of 60 frames per second. Each frame consists of 256H × 256W pixels with 64 gradations of luminous intensity each. This graphic information is re-sampled as 64H × 64W pixels with 256 luminous gradations each. To emphasize the edge lines of the picture, its vertical differences are calculated, because the pixels with the steepest descents of luminance trace the edge lines of the monochromatic image. To speed up the data processing without deteriorating its accuracy, the luminous information can be localized to a region enclosing the patient's eye, and frames can be thinned out. Finally, the luminous information is sent to a Time-Delayed Neural Network (TDNN) [5] at a rate of 15 frames per second, each frame consisting of 24W × 16H pixels.

3. TIME-DELAYED NEURAL NETWORK

The TDNN consists of 3 layers. The first layer has 24 × 16 neurons receiving the luminous information of a frame of 24W × 16H pixels and delivering it to the second layer. The second layer comprises 16 neurons, recognizing the state of the eyelids. The luminous information at the inputs of the 1st layer is refreshed at every frame, and the outputs of the 2nd layer are held until a sequence of 16 frames has been processed. The third layer consists of a single neuron which recognizes the time dependency of the two eyelid states and discriminates intentional winks from unintentional ones in real time, by processing sixteen consecutive frames without employing high-performance hardware. Fig. 1 shows the configuration of the TDNN.

4. DISCRIMINATION OF THE INTENTIONAL WINK

The ALS patient, our conversation partner, is requested to shut his eyelids as an affirmative response to our inquiry. Sometimes, however, he shuts his eyelids unintentionally, as an instinctive motion due to biological demands. Such an unintentional motion makes us misunderstand his ideas. The motivations of his physical motion must therefore be investigated.
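A minimal sketch of this pipeline follows; it is not the authors' code, and the weights and helper names are hypothetical. A per-frame eye-state score plays the role of the second layer, and the temporal neuron is approximated by a run-length test over the buffered frames, mirroring the 10-frame criterion discussed in the next section.

```python
import numpy as np

FRAME_SHAPE = (16, 24)   # 16H x 24W pixels per frame
WINDOW = 16              # frames processed per decision
CLOSED_RUN = 10          # closed-eye frames treated as intentional (see Sec. 4)

# Hypothetical trained weights: difference between the standard open-eye and
# closed-eye luminance patterns, as suggested in Section 6.1 of the paper.
w2 = np.random.randn(*FRAME_SHAPE)   # placeholder for the (open - closed) pattern

def eye_state(frame: np.ndarray) -> int:
    """Second-layer role: classify one frame as open (0) or closed (1)."""
    score = float(np.sum(w2 * frame))
    return 1 if score < 0.0 else 0    # the sign convention here is an assumption

def intentional_wink(frames: list[np.ndarray]) -> bool:
    """Third-layer role: detect a sufficiently long run of closed-eye frames."""
    states = [eye_state(f) for f in frames[-WINDOW:]]
    run = best = 0
    for s in states:
        run = run + 1 if s == 1 else 0
        best = max(best, run)
    return best >= CLOSED_RUN
```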
Fig. 1. Time-Delayed Neural Network.
Fig. 2. The series of frames for training (ON/OFF luminance patterns of open and closed eyes).
Careful observation brought out the fact that different motivations produce different durations of winking. In detail, an unintentional wink takes approximately 0.2 second or less; an intentional wink, by contrast, requires much more time. In this CA system, the TDNN is taught to assume that if closed eyes are observed over 10 continuous frames, the motion was performed intentionally. This criterion was determined by considering the frequency of our inquiries and the promptness of the patient's response. The standard luminance patterns of open eyes and closed eyes are provided as teaching material; these are shown in Fig. 2. Classification of the received luminous information is implemented by means of neural computation. The weights of the neurons of the second layer are determined as the difference in luminous information between the two patterns, open eye and closed eye. The weights of the neuron of the third layer converge quickly to certain values through training. These weights are then used unchanged for processing the luminous information of the following frames. The name TDNN derives from the architecture in which the input information is refreshed at every frame while the weights are kept common [6].

5. APPLICATION TO CA

In talking with patients with serious ALS, we cannot expect much: a possible gesture is to use the wink as the sign of an affirmative response to our inquiry. As mentioned above, intentional winks can now be discriminated from unintentional ones, which provides a facility for conversation. In this CA, inquiries about the patient's wishes or requests are put to him by imagining some scenes in hospital; examples are tabulated in Table 1. In practice, these are exhibited as captions (telops) on the CRT display unit, and the patient has only to wink when he agrees with the meaning of the exhibited caption. This CA can also assist the written expression of his ideas. In this application, the patient specifies the desired characters one by one from the syllabary shown in Fig. 3 and spells them out according to the grammar. The inquiry proceeds by crosswise scanning at a constant speed. For composition, this CA also permits a shortcut in which idioms are suggested from the first two characters that the patient specifies. If the patient finds a desired idiom, he can economize on the physical effort of specifying all the necessary characters; this is shown in Fig. 4 [2], [4]. An example of an ALS patient's letter in Japanese is shown in Fig. 5. He spent about 40 minutes on this composition.
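The selection mechanism can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the tiny syllabary grid and idiom dictionary are hypothetical. A wink confirms first a column, then a row, and after two characters the CA offers candidate words sharing that prefix.

```python
# Hypothetical syllabary grid and idiom dictionary, for illustration only.
GRID = [list("AKSTN"),
        list("IKUEO")]                          # stand-in for the kana table
IDIOMS = {"AK": ["AKARI", "AKI"]}               # prefix -> candidate idioms

def scan(options, winked_on) -> int:
    """Step through `options` at constant speed; return the index at which
    the (TDNN-detected) intentional wink arrives."""
    for i, opt in enumerate(options):
        if winked_on(opt):                      # wink while `opt` is highlighted
            return i
    return len(options) - 1                     # fall back to the last option

def select_character(winked_on) -> str:
    col = scan(range(len(GRID[0])), winked_on)  # column scanning first
    row = scan(range(len(GRID)), winked_on)     # then row scanning
    return GRID[row][col]

def compose_word(winked_on) -> str:
    prefix = select_character(winked_on) + select_character(winked_on)
    for idiom in IDIOMS.get(prefix, []):        # offer prefix-matched idioms
        if winked_on(idiom):                    # one wink accepts a candidate
            return idiom
    return prefix
```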
Fig. 3. A crosswise scanning for selecting a character: (a) a column is first selected by the column scanning; (b) a character in it is then selected by the row scanning.

Fig. 4. Candidate words conducted by the first two characters.

Table 1. An example of questions (English glosses): SOMETHING TO EAT? SOMETHING TO DRINK? FEVERISH? CHILLY? HARD BREATH? NAUSEOUS? ITCHY? SWEATY? HOT WATER? COLD WATER? TEA? TOWEL? VENTILATION?
Translation: I had despaired of my future and felt deeply vexed about my life, cursing my doom every day. Thank you very much for your kindness in providing this facility. It has dispelled my gloom of the past half year. My irritation has quieted. I can talk to my family and friends, and moreover, I can write to them. They are deeply moved by this written correspondence. Wonderful!

Fig. 5. An example of the ALS patient's letters.
6. DISCUSSION

6.1. Features of our time-delayed neural computation

The TDNN has the valuable attribute of being able to classify the eye states by tracing their time-dependent variations on common hardware. In practice, the neuron of the third layer outputs the result of training, that is, the time dependency in the outputs of the second layer. The neurons of the second layer output a value of nearly one or zero corresponding to the state of the eyelid (open or closed). The neuron of the third layer is trained to remember the series of ones and zeros, which represent the transitions of the eyelid state. These are shown in Fig. 6, and Fig. 7 shows the weights of the 3rd layer. By setting the weights of the second layer as the difference between the open-eye and closed-eye patterns, the neurons of the second layer can discriminate the states of the eyelid.
Fig. 6. The independence of the 3rd layer from the 2nd layer: (a) 2nd layer, (b) 3rd layer.

Fig. 7. The weights of the 3rd layer.
Table 2. Discrimination in composition (per testee; focal distance 4.7 mm):

                                          A     B     C     D
Intentional winking, successful         205   205   207   205
Intentional winking, failed               3     3    14     5
Unintentional motion, misjudged           2     4     7     3

Table 3. Discrimination in composition (per distance; focal distance 4.7 mm):

                                        15cm  25cm  35cm
Intentional winking, successful         207   201   203
Intentional winking, failed              10     0     0
Unintentional motion, misjudged           2     8     0
6.2. Experiments in composition

The authors tested the CA through composition experiments. First, five testees each composed short messages; Table 2 shows the results. Intentional winks could be recognized with high probability. There were two kinds of recognition error: one was caused by missing an intentional wink, the other by misjudging a motion as an intentional wink although no sign was given. Secondly, one testee composed messages at several distances from the TV camera; Table 3 shows these results, which were almost in agreement with those of the first experiment. It is therefore not necessary to set the position of the TV camera exactly.

7. CONCLUSION

Intentional winks have been discriminated from unintentional ones. The mathematical basis is the difference in the duration of the eye's motion. From a practical point of view, this achievement results from remote-sensing technology in combination with a time-delayed neural computation. With this, we can understand the ALS patient's ideas by perceiving his wink as an affirmative response to our inquiry. This type of communication aid can assist the written expression of his ideas.

REFERENCES

[1] Perry, A.R., Gawel, M. and Rose, F.C., "Communication aids in patients with motor neurone disease", British Medical Journal, vol. 282, pp. 1690-1692, 1981.
[2] Heckathorne, C.W. and Childress, D.S., "Applying Anticipatory Text Selection in a Writing Aid for People with Severe Motor Impairment", IEEE MICRO, vol. 3, no. 3, pp. 17-23, 1983.
[3] Tokunaga, T., Inoue, M., Kobayashi, Y., Kanou, N. and Inoue, K., "Design of a Communication Aid for a Patient with Amyotrophic Lateral Sclerosis", IEICE report CAS 87-26, pp. 1-8, 1987 (in Japanese).
[4] Inoue, M., Kobayashi, Y., Kanou, N. and Inoue, K., "A Method of Word Processing for a Patient with Amyotrophic Lateral Sclerosis", Trans. of IPSJ, vol. 33, no. 5, pp. 645-651, 1992 (in Japanese).
[5] Miyata, Y., "Neural Networks for Temporal Processing", Trans. of IEEJ, vol. 113, no. 6, pp. 372-377, 1993 (in Japanese).
[6] Kanou, N., Inoue, M., Kobayashi, Y., Inoue, S. and Inoue, K., "Detection of Winking for Communication Aids", IPSJ SIG Notes, vol. 94, no. 74, pp. 9-14, 1994 (in Japanese).
[7] Zagler, W.L., Busby, G. and Wagner, R.R. (eds.), "Computers for Handicapped Persons", Springer-Verlag.
Development of a Language Training System for Developmentally Handicapped Children

K. Itoh a and K. Iitaka b

a Dept. of Computer Science, Tokyo Polytechnic College, 2-32 Ogawanishi-machi, Kodaira-shi, Tokyo 187, Japan
b Research Institute for the Education of Exceptional Children, Tokyo Gakugei University, 4-101 Nukuikita-machi, Koganei-shi, Tokyo 187, Japan

A language training system based on a personal computer has been developed for developmentally handicapped children. With this system, they see action images presented on the display of the personal computer and learn action words. Learning words poses a difficult problem for these children. To address this problem, the system has an improved method of presenting actions: for each action word, the system presents action images from an actor's viewpoint and an observer's viewpoint, in the form of a line-drawn animation and a movie, using multiple examples. The new language training system has been applied to developmentally handicapped children, and its effectiveness in helping them learn the action words has been evaluated. The results are that, with each of the presentation methods, the correct answer rate in the post-learning test is higher than in the pre-learning test, and that there are significant differences between the pre-learning and post-learning test results.

1. Introduction

This study is concerned with the development of a language training system for developmentally handicapped children [1, 2]. The system presents to developmentally handicapped children a movie and an animation depicting an action word on the display of a personal computer and helps them to learn the action word. When a previous language training system was used experimentally with developmentally handicapped children, they were seen to enjoy the learning experience.
Here we report a new language training system developed with an improved method of presenting actions to developmentally handicapped children, who find it difficult to generalize action words when learning them. Each action word is presented in the form of a movie and a line-drawn animation from two viewpoints, the actor's viewpoint and an observer's viewpoint, using multiple examples. The new language training system has been applied to developmentally handicapped children, and their test performance compared before and after learning, in order to evaluate the effectiveness of the system in helping them to learn action words. The action words "Cut" and "Tear" were selected from among those action words that are considered difficult for developmentally handicapped children to distinguish (Morioka et al., 1990 [3]).

2. Language training system

The language training system is divided into learning and testing phases. The examples presented to subjects are given in Table 1. In the learning phase, three examples are presented for each of the action words "Cut" and "Tear". In the testing phase, an additional example is also presented. Since the system presents images from the actor's viewpoint and the observer's viewpoint in a movie and an animation, four problems can be presented in different ways for one example. In other words, 24 (2 x 3 x 4) problems are presented in the learning phase, and 32 (2 x 4 x 4) problems are presented in the testing phase.

Table 1. Examples presented on display

Learning software - Cut: Cut a carrot with a kitchen knife. Cut fingernails with a pair of clippers. Cut a wooden plank with a handsaw. Tear: (Tear a bag.) Tear paper. Tear packaging paper. Tear a sheet of the calendar.

Testing software - Cut: Cut a carrot with a kitchen knife. Cut fingernails with a pair of clippers. Cut a wooden plank with a handsaw. Cut a cake with a knife. Tear: (Tear a bag.) Tear packaging paper. Tear a sheet of the calendar. Tear an envelope.
2.1. Hardware configuration

A Macintosh IIfx personal computer is used in the language training system. The monitor is equipped with a touch panel so that developmentally handicapped children can learn the presented action words by touching the screen. Sound and images are introduced through a digitizer and used to create software on HyperCard. A movie is a digital movie captured from a videotape. An animation is a kind of "flipping" cartoon created from three line-drawn pictures.

2.2. Software configuration

The configuration of the learning software is shown in Figure 1.
Since a subject learns within 20 minutes at a time, the 24 problems are divided into two groups, and 12 problems are given in one experiment. When the subject touches the problem display screen, a movie or an animation is started. After the movie or animation is finished, a question like "What is he doing?" is asked of the subject. If the subject gives a correct answer, a picture and sound are presented to encourage him or her to learn more. If the subject does not give a correct answer, the spelling and the pronunciation of the correct action word are presented. The problem is given repeatedly until the subject can give the correct answer.

The configuration of the test software is shown in Figure 2. Sixteen problems are given in one experiment. The reaction time and answer type (correct answer, wrong answer 1, wrong answer 2, no reaction) are noted.
Fig 1. The configuration of the learning software.

Fig 2. The configuration of the test software.
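The flows in Figures 1 and 2 amount to a repeat-until-correct loop and a timed recording loop. The following is a hedged sketch of that control flow only; the I/O helpers (`play_clip`, `ask`, and the fields of `problem`) stand in for the HyperCard behaviour and are hypothetical.

```python
def learning_trial(problem, play_clip, ask, reward, teach):
    """One learning-phase problem: repeat until the child answers correctly."""
    while True:
        play_clip(problem.clip)              # movie or line-drawn animation
        answer = ask("What is he doing?")
        if answer == problem.word:
            reward()                         # encouraging picture and sound
            return
        teach(problem.word)                  # show spelling, play pronunciation

def test_trial(problem, play_clip, ask, clock):
    """One testing-phase problem: record answer type and reaction time."""
    play_clip(problem.clip)
    start = clock()
    answer = ask("What is he doing?")
    reaction_time = clock() - start
    if answer is None:
        kind = "no reaction"
    elif answer == problem.word:
        kind = "correct"
    elif answer == problem.paired_word:      # confusing "Cut" and "Tear"
        kind = "wrong answer 1"
    else:
        kind = "wrong answer 2"
    return kind, reaction_time
```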
3. Results and discussion

Using the newly developed language training system, pre-learning tests were conducted on 12 developmentally handicapped children. The subjects range from 7 to 12 in chronological age and from 5 to 9 in vocabulary age. Subjects N and Y, who had to learn the action words, were selected. They learned the presented action words and were tested after learning. The chronological age is 7 for subject N and 9 for subject Y; the vocabulary age is 6 for subject N and 5 for subject Y. The way the subjects participated in the experiments was videotaped.
3.1. Pre-learning test

The pre-learning test results, classified for the action words "Cut" and "Tear", are shown in Figure 3. Wrong answer 1 means that the subjects mistook "Cut" for "Tear" or "Tear" for "Cut"; wrong answer 2 means that they gave any other answer. The rate of correct answers is nearly 70% for "Cut" but less than 40% for "Tear", and there are significant differences between the results for "Cut" and "Tear". The rate of wrong answer 1 for "Tear" is higher than for "Cut": the subjects mistook "Tear" for "Cut" more often than "Cut" for "Tear".

Fig 3. The pre-learning test results arranged as classified for "Cut" and "Tear".

The subjects exhibited results similar to those of the preliminary study, in which still pictures were presented. The videotape shows, however, that they uttered many more words than in the preliminary study. In the preliminary study, most of the subjects gave only answers; in this study, when motion pictures were presented, the subjects uttered cries and talked about the motion pictures while touching them on the display screen. When sound was presented, the subjects said "Big sound!" as well as "Gohgoh" or "Bribri" to imitate the sound. The pre-learning test results classified for the examples are shown in Figure 4.
The rate of correct answers for the example "Cut a carrot with a kitchen knife" is higher than for the other "Cut ..." examples. This example is often presented to the subjects in picture-word matching vocabulary tests, so it may be more familiar to them. The rate of wrong answer 2 for the example "Tear a bag" is higher than for the other "Tear ..." examples; this is because many subjects answered "Open a bag". There are significant differences between the results for this example and for the other "Tear ..." examples. In the language training system, the subjects had to decide on action words solely from the presented action images, and they found it difficult to decide the action word for this example without knowing the situation in which the action was actually performed. The results for this example are thus excluded from the pre-learning and post-learning test results.
Fig 4. The pre-learning test results arranged as classified for the examples: 1. Cut a cake with a knife. 2. Cut a carrot with a kitchen knife. 3. Cut fingernails with a pair of clippers. 4. Cut a wooden plank with a handsaw. 5. Tear a bag. 6. Tear an envelope. 7. Tear packaging paper. 8. Tear a sheet of the calendar.

3.2. Learning

When the two subjects did not give correct answers, they were not only presented with action images but also made to imitate the actions in order to learn the action words. The videotape shows that subject N was pleased with the pictures and sound presented to encourage him to learn, and wanted to have them.

3.3. Post-learning test

The results of the pre-learning and post-learning tests of the two subjects are shown in Figure 5. In the pre-learning test, the rate of wrong answer 2 is 70%, and the subjects made many wrong answers. In the post-learning test, the rate of correct answers is about 90%. There are significant differences between the pre-learning and post-learning test results, suggesting the learning effect.

Fig 5. The pre-learning and post-learning test results for the two subjects.
Some examples were presented in the learning phase, and some were not (Table 1). The test results were separated for the former and the latter. The rate of correct answers for each of the two groups is higher in the post-learning test than in the pre-learning test, and there are significant differences between the pre-learning and post-learning test results for the two groups. The increased percentage of correct answers for the examples not presented in the learning phase indicates the possibility of generalization occurring.

The language training system presents examples by four different methods. The different presentation methods produced no significant differences between the pre-learning and post-learning test results, but one subject produced different test results depending on the presentation method employed. In a future study, more subjects will be employed to study individual differences between them.

4. Summary

A language training system based on a personal computer has been developed for developmentally handicapped children. The system improves the method of presenting actions to developmentally handicapped children, who find it difficult to generalize action words when they learn them. A new language training system for learning the action words "Cut" and "Tear" was built and applied to two subjects, and their test performance compared before and after learning. The correct answer rate in the post-learning test is much higher than in the pre-learning test, suggesting a learning effect. The correct answer rate for the examples not presented in the learning phase is also increased, indicating the possibility of generalization. Individual differences between the subjects will be studied in the future.

Acknowledgements

We would like to express our gratitude to the pupils and teachers of a special class for developmentally handicapped children at the K elementary school.

References

1. K. Itoh, K. Iitaka, The 8th Congress of the Japan Educational Technology Society, (1992), 262-263.
2. K. Itoh, K. Iitaka, The 9th Rehabilitation Engineering Conference, (1994), 341-344.
3. N. Morioka, K. Iitaka, Master's thesis submitted to Tokyo Gakugei University in fiscal 1990.
INTERACT: An interface builder facilitating access to users with disabilities
C. Stephanidis and Y. Mitsopoulos Institute of Computer Science, Foundation for Research and Technology - Hellas, Science and Technology Park of Crete, P.O. Box 1385, Heraklion GR-71110, Crete, Greece
This paper describes INTERACT 1, a tool for the specification of interaction dialogues and the construction of user interfaces appropriate for various categories of users, including people with disabilities. INTERACT builds on the notion of separating an application into two functional components, namely the application functional core and the user interface component, thus allowing the provision of multiple user interfaces and supporting the "high-level" design of the interaction dialogue, i.e. independently of the presentation details and operational constraints of a particular technological platform.
1. INTRODUCTION

Recent efforts addressing the problems of accessibility to graphics based applications by people with disabilities have mainly addressed and explored two alternatives. The first involves the enrichment of interaction objects of existing toolkits. The second concentrates on the development and embodiment of new toolkits into new User Interface architectures and tools [3]. Following the first approach, INTERACT is a user interface construction tool that aims towards the enrichment of interaction objects of existing User Interface toolkits. While INTERACT exhibits the majority of the characteristics of other state-of-the-art User Interface Builders [2], [4], [5], it also facilitates the development of graphics based applications for disabled users through the provision of enhanced user interface customisation possibilities. More specifically, INTERACT supports different interaction styles through the utilisation of alternative human and computer interface channels and media; the audio, visual and haptic modalities can be selected by the user interface designer, taking into consideration the characteristics of the target user group and the scope of the particular application.
1 Part of this R&D work has been carried out in the context of the IPSNI-II (R2009) project, which is partially funded by the RACE II Programme of the Commission of the European Union (DG XIII). The partners of this consortium are: Consiglio Nazionale delle Ricerche (IROE-CNR), Italy; Centro Studi e Laboratori Telecommunicazioni (CSELT), Italy; Institute for Rehabilitation Research (IRV), The Netherlands; Dundee University Microcomputer Centre, UK; Katholieke Universiteit Leuven, Belgium; Institute of Computer Science - FORTH, Greece; Technical Research Centre of Finland (VTT), Finland.
2. DESIGN AND DEVELOPMENT WITH INTERACT

The underlying concept of INTERACT is that the enhancement of existing toolkits with additional "look and feel" styles can facilitate the development of graphics based applications accessible by a wide user population, including people with disabilities. In addition to the standard I/O communication channels supported by existing toolkits for developing graphics based applications, INTERACT introduces additional features regarding:
- the media and modalities used during interaction,
- the employed interaction techniques,
- the utilised I/O devices,
- navigation in the user interface,
- feedback provided to the user, etc.
Figure 1. The stages in the user interface construction process using INTERACT.
For instance, in order to support the development of graphics-based applications accessible by blind users, INTERACT enhances the graphical objects with additional attributes (e.g. presentation in auditory or tactile form) and provides facilities for switching between the applications, as well as facilities for the exploration of the graphical objects of the various applications. User interface construction with INTERACT takes place in a two stage process (see Figure 1). During the first stage, the designer interactively constructs the dialogue by means of a hierarchical (tree) structure of abstract interaction objects. The second stage involves the "binding" of the abstract interaction dialogue to produce a specific User Interface. Both the abstract interaction dialogue and the target user interface have interaction objects as building blocks. The abstract dialogue consists of interaction objects that possess only semantic / functional properties (i.e. abstract interaction objects). The end user interface consists of physical instantiations of the abstract interaction objects, which inherit the semantics of their abstract correspondents while conveying the presentation, manipulation, control and feedback characteristics of the target technology platform (e.g. OSF Motif). For example, at the first stage of the design process (i.e. abstract level), an interaction object could be a Menu (i.e. an abstract selector), which can be mapped, at the physical level, to an Option List, Pop Up Menu, etc.; a sketch of such a mapping is given below, after the list of supported styles.

Figure 2. A snapshot of the working environment in INTERACT.

The design of the user interface, both at the abstract and physical levels, is performed through "easy-to-use" interactive facilities; thus, non-computer experts, human factors specialists, rehabilitation experts, etc., can easily and efficiently design graphics-based applications accessible by users with disabilities. At present, INTERACT addresses the needs and requirements of blind, low vision and motor-impaired users through the provision of appropriate interaction techniques and the utilisation of suitable input and output devices for each user category. In this context, the following interface styles are supported:
- tactile based user interfaces,
- speech based user interfaces,
- combined speech and tactile based user interfaces,
- large size widgets,
- large size widgets with auditory cues,
- large size widgets and speech.
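The abstract-to-physical binding can be pictured as a lookup from (abstract object, interface style) to a concrete widget class. This is only an illustrative sketch; INTERACT's real binding machinery and the class names used here are assumptions, not documented in the paper.

```python
# Hypothetical binding table: (abstract object, interface style) -> physical widget.
BINDINGS = {
    ("selector", "graphical"): "MotifOptionList",
    ("selector", "speech"):    "SpokenMenu",       # items read out, chosen by key
    ("selector", "tactile"):   "BrailleMenu",
    ("command",  "graphical"): "MotifPushButton",
    ("command",  "speech"):    "SpokenPrompt",
}

def bind(abstract_object: str, style: str) -> str:
    """Second-stage 'binding': realise an abstract object for a target style."""
    try:
        return BINDINGS[(abstract_object, style)]
    except KeyError:
        raise ValueError(f"no physical realisation of {abstract_object!r} "
                         f"for style {style!r}") from None

# The same abstract dialogue yields different concrete interfaces:
assert bind("selector", "graphical") == "MotifOptionList"
assert bind("selector", "speech") == "SpokenMenu"
```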
3. SYSTEM ARCHITECTURE OF INTERACT

The main parts of the system (shown in Figure 3) are the following:

a) The user interface component. INTERACT provides "easy-to-use" interactive facilities for the design of graphical user interfaces. Currently, the implementation is based on the XView toolkit. One of the key subcomponents is the graphical browser, which provides the dialogue designer with "standard" browsing facilities as well as management facilities for the dialogue trees (e.g. hierarchical structure of interaction objects). Type checking facilities are also supported (e.g. a menu interaction object should not contain valuator or command interaction objects as options). The depicted dialogue tree is "translated" into an internal representation and "saved" in the internal dialogue representation module.

b) The internal dialogue representation module (IDRM). This module is responsible for the mapping of a particular dialogue tree to a corresponding high level dialogue structure with embedded dialogue sequencing and control schemes; it "transforms" the dialogue designer's conceptual model, which is represented graphically in the dialogue tree, into the corresponding abstract dialogue specification, which is still presentation independent. The internal dialogue representation module cooperates with the code generation and attribute modification components, which are responsible for matching and translating the already designed interaction dialogue to a specific technology platform with a specific I/O configuration and a specific set of interaction techniques suitable for the target user group.

c) The attribute modification component (AMC). The attribute modification component provides online (i.e. interactive) restructuring of the interaction dialogue by altering its internal representation, as well as facilities for the modification of the attributes of particular interaction objects. The attribute modification component communicates with the library modules either for the modification of presentational attributes of particular interaction objects or for the modification of manipulation/control and feedback attributes.

d) The code generation component (CGC). Taking into account the target technological platform that the dialogue designer wishes to use for the realisation of the end user interface (e.g. OSF Motif), the code generation component "maps" the internal dialogue representation to a target software module, using relevant I/O device related functions and interaction techniques.
e) The library module. The library module consists of three "sublibraries", namely the technology platforms library, the I/O device related functions library and the interaction techniques library. More specifically:
- The technology platforms library includes the required software modules for the attribute modification / code generation components. Archetypes of the various technology platform specific interaction objects, such as OSF MOTIF or MS Windows [1] (i.e. widgets, controls, etc.), are included in this module to be used during the realisation of the abstract dialogue for a specific technological platform and the generation of the target software module.
Figure 3. Runtime architecture of INTERACT.
- The I/O device related functions library provides the "interfacing" of the particular interface code with the utilised I/O devices and facilitates handling of I/O events at run time.
- The interaction techniques library mainly includes the supported interaction techniques for the particular target user groups; it is used both during the realisation / online attribute modification and code generation processes.
The library modules have been implemented in such a way that new functions can be incorporated; no global recompilation of already existing modules is required.
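The "no recompilation" property suggests a simple plugin-registry design: new platform or device functions are registered at load time rather than linked in. A hedged sketch of that design idea only; the registry and names below are ours, not INTERACT internals.

```python
# Minimal plugin registry: libraries register functions under a key at load
# time, so adding a new platform or device needs no change to existing code.
REGISTRY: dict[str, dict[str, callable]] = {
    "platform": {}, "io_device": {}, "interaction": {},
}

def register(kind: str, name: str):
    def deco(fn):
        REGISTRY[kind][name] = fn
        return fn
    return deco

@register("io_device", "braille_display")
def braille_output(text: str) -> None:
    print(f"[braille] {text}")        # stand-in for a real device driver

# A new device library only has to call `register`; callers look it up:
REGISTRY["io_device"]["braille_display"]("hello")
```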
4. DISCUSSION AND FUTURE WORK
Preliminary tests with different user groups have confirmed the practical value of INTERACT. More systematic user trials are foreseen. Work currently under way seeks to augment the supported user interaction styles through the further exploitation of the haptic and audio channels. In addition, design assistance is foreseen by means of a module that provides the user of INTERACT with suggestions for lexical aspects of the User Interface such as the presentation of interface objects with respect to the user group under consideration. Future work envisages the automated production of "log files" supplementary to the
generated software modules. These "log files" can be used by the designer in the early design phases in order to improve the usability of the interfaces under development.
REFERENCES
1. R. Chimera, Evaluation of Platform Independent User Interface Builders, Proceedings of the 10th Annual Symposium and Open House of the Human-Computer Interaction Laboratory, Center for Automation Research, University of Maryland, June 1993. 2. B. Myers and D. Olsen Jr, User Interface Tools, Tutorial 36 Notes, CHI'94 Conference on Human Factors in Computing Systems, 1994. 3. C. Stephanidis, A. Savidis and D. Akoumianakis, Tools for User Interfaces for all. Paper to appear in the Proceedings of the 2nd TIDE Congress, La Villette, Paris, April 26-28, 1995. 4. U. Thakkar, G. Perlman and D. Miller, Evaluation of the NeXT Interface Builder for Prototyping a Smart Telephone, SIGCHI Bulletin, January 1990. 5. OpenWindows Developer's Guide 3.0.1, User's Guide, Sun Microsystems, 1993.
Supporting Blind and Sighted User Collaboration through Dual User Interfaces using the HOMER System A. Savidis and C. Stephanidis Institute of Computer Science, Foundation for Research and Technology - Hellas, Science and Technology Park of Crete, P.O. Box 1385, GR-71110 Heraklion, Crete, Greece The emergence of Graphical User Interfaces has introduced additional problems regarding the accessibility of computer systems by blind people. The implications include restricted opportunities for computer-based collaboration between blind and sighted users in a working environment. Currently, accessibility to graphical User Interfaces by blind users is enabled through systems which reproduce the lexical structure of User Interfaces (i.e. interaction objects and their relationships) in a non-visual form; such systems introduce visually oriented concepts in the context of non-visual interaction. The concept of Dual User Interfaces has been defined as a more efficient and effective approach to address the accessibility and collaboration problems. A User Interface Management System, called HOMER, has been developed for the construction of Dual User Interfaces 1. 1. INTRODUCTION The emergence of Graphical User Interfaces has initially excluded accessibility to computer systems by blind users. However, special purpose software has been developed for reproducing the lexical level of interaction (i.e. presentation and dialogue aspects of interaction objects, hierarchical structure, etc) in a non-visual form. Such methods are characterized as adaptation oriented methods and are usually tightly coupled to the target graphical environment (e.g. MS-WINDOWS TM, Macintosh TM, PRESENTATION MANAGER TM, X WINDOWING SYSTEM). It is argued that, even though these methods have offered partial solutions to the problem, there are a number of outstanding issues to be resolved: (i) All adaptation oriented approaches achieve dialogue reproduction based on lexical level information. No knowledge concerning the application semantics and the particular application domain can be extracted and, consequently, the semantics of the visual application are not taken into consideration during non-visual reproduction. (ii) Reproduction of the dialogue in a non-visual form is based on the visual dialogue structure. However, such a dialogue structure is constructed in accordance to the specific needs of sighted users. Moreover, with adaptation oriented approaches fixed dialogue decisions are implicitly taken
1 This work has been partially funded by the TIDE Programme of the Commission of the European Union (DG XIII), under the project GUIB-II (TP 215). The partners of the GUIB consortium are: IROE-CNR, Italy; Institute of Computer Science - FORTH, Greece; Vrije Universiteit Brussel, Belgium; Department of Computer Science - FUB, Germany; Institute of Telecommunications - TUB, Germany; IFI, University of Stuttgart, Germany; VTT, Finland; RNIB, England; F. H. Papenmeier GmbH & Co. KG, Germany.
for all non-visual User Interfaces. Consequently, there is no support for non-visual interface design; this contrasts with one of the main principles of good User Interface design. (iii) Considering that the trend of User Interface software technology is towards interaction methods aiming to enable maximum exploitation of the human visual information processing capability, such as virtual reality and 3D representations (i.e. visual reality), it is expected that the employment of adaptation oriented techniques will become unrealistic or meaningless in the future. (iv) Finally, there are no methods or tools available at present which enable the development of non-visual User Interfaces. If the implementation of a dedicated non-visual application is required, interface developers are faced with the problem of lack of tools. In order to effectively support the socio-economic integration of blind users and to prevent possible future segregation in their working environment, it is argued that there is a need for more powerful User Interfaces than the adapted versions of existing visual User Interfaces. In this context, the concept of Dual User Interfaces has been defined and is characterized by the following properties: (i) it is concurrently accessible by blind and sighted users in order to enable collaboration; (ii) the (possibly different) visual and non-visual metaphors of interaction meet the specific (different) needs of sighted and blind users respectively; (iii) the (possibly different) visual and non-visual syntactic and lexical structures meet the specific (different) needs of sighted and blind users respectively; thus, the non-visual dialogue design may be different from the visual interface design; (iv) at any point in time, the same internal (semantic) functionality should be made accessible to both user groups through the visual and non-visual "faces" of the Dual User Interface (i.e. the What You Can Do Is What I Can Do principle); (v) at any point in time, the same semantic information should be made accessible through the visual and non-visual "faces" of the Dual User Interface (i.e. a type of What You "See" Is What I "See" for semantic information). An interface development system has been designed and implemented, called HOMER [3], which falls in the domain of User Interface Management Systems and supports the construction of: (a) dedicated visual User Interfaces, (b) dedicated non-visual User Interfaces and (c) Dual User Interfaces. The HOMER system provides a high-level language for Dual User Interface specification, the HOMER language, and translates such a specification into a C++ implementation. Also, the HOMER system provides facilities for integrating different visual and non-visual lexical technologies (i.e. interface toolkits).

2. THE PRINCIPLE OF DUALITY IN INTERACTION

One of the basic requirements of Dual User Interfaces is to facilitate concurrent access by blind and sighted users (i.e. collaboration). There are a number of significant dimensions which concern concurrency of interaction with respect to the visual and non-visual faces of a Dual User Interface, as is shown in Table 1. The A dimension is related to the concepts employed for communication between the two user groups (i.e. discussion during collaboration) and can be lexical (e.g. "this menu, this button"), syntactic (e.g. "I am doing editing") or semantic (e.g. "the recording volume has changed").
The B dimension concerns the metaphors of interaction for the visual and non-visual environments, which can be either identical (e.g. Desk-top and Desk-top) or different (e.g. Desk-top for visual and Rooms [4] for non-visual). The C dimension characterizes the freedom that blind and sighted users have to perform actions independently of each other. More specifically, the dialogue control can be synchronous (i.e. the visual and the non-visual dialogues always pass through the same "states" by progressing in strict parallelism), semisynchronous (i.e. users have freedom to focus on different tasks; however, at certain points synchronicity is imposed) or asynchronous (i.e. users may focus on different tasks, interacting with different interaction objects and performing different actions). The D dimension characterizes the flexibility of the physical structure of the visual and non-visual faces of a Dual User Interface. It is possible to have identical structure (e.g. same object classes and instances, same layout, same object hierarchies), similar structure (e.g. some of the interface components and object classes are different, but the overall physical similarity is evident) or totally different structure (e.g. the hierarchical organization of objects is completely different, involving different classes of objects; no physical similarity can be identified). The E dimension concerns the flexibility of applying a different dialogue design for each environment. The dialogue structure can be either identical (e.g. same decomposition of user tasks) or different (e.g. different organization of tasks for blind and sighted users). Finally, the F dimension concerns the type of collaboration, which can be either local (i.e. users work on the same machine and are physically close to each other) or remote (i.e. users work on different machines; distant collaboration).
A - Concepts for communication between users: {Lexical, Syntactic, Semantic}
B - Metaphor relationship between visual and non-visual faces: {Identical, Different}
C - Dialogue control between visual and non-visual faces: {Synchronous, Semisynchronous, Asynchronous}
D - Lexical structure between visual and non-visual faces: {Identical, Similar, Different}
E - Dialogue structure between visual and non-visual faces: {Identical, Different}
F - Collaboration type: {Local, Remote}

Table 1: Dimensions of concurrent visual and non-visual interaction.

The properties related to the concurrency of interaction for Dual User Interfaces can be defined by any 6-tuple from the set A x B x C x D x E x F. In comparison, the concurrency supported by adaptation oriented approaches has properties which belong to the following set: A x {Identical} x {Synchronous} x {Identical, Similar} x {Identical} x {Local}. It is evident that, apart from the theoretical and practical drawbacks from which adaptation oriented approaches suffer (as previously mentioned), the type of collaboration supported is considerably restricted. In Figure 1, the two modes of collaboration between blind and sighted users are illustrated for Dual User Interfaces generated by the HOMER system. The non-visual technology that has been integrated in the HOMER system is the COMONKIT toolkit [4]. This toolkit complies with the purpose-designed Rooms metaphor [4] and supports non-visual interaction based on speech, braille and non-speech audio output, while the keyboard is used for input. The visual lexical technology that has been integrated is the Athena widget set of the X WINDOWING SYSTEM.
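The dimension space of Table 1 is small enough to enumerate directly. The following sketch (ours, for illustration only) generates all 6-tuples and shows how the adaptation-oriented subset occupies only a thin slice of the full Dual-interface space:

```python
from itertools import product

DIMENSIONS = {
    "A_concepts": ["lexical", "syntactic", "semantic"],
    "B_metaphor": ["identical", "different"],
    "C_control":  ["synchronous", "semisynchronous", "asynchronous"],
    "D_lexical":  ["identical", "similar", "different"],
    "E_dialogue": ["identical", "different"],
    "F_collab":   ["local", "remote"],
}

all_tuples = list(product(*DIMENSIONS.values()))   # full Dual space

# Adaptation-oriented approaches: A x {Identical} x {Synchronous}
#                                   x {Identical, Similar} x {Identical} x {Local}
adaptation = [t for t in all_tuples
              if t[1] == "identical" and t[2] == "synchronous"
              and t[3] in ("identical", "similar")
              and t[4] == "identical" and t[5] == "local"]

print(len(all_tuples), len(adaptation))   # 216 vs 6: a much narrower slice
```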
Figure 1: Collaboration modes (local and remote) for Dual User Interfaces developed using the HOMER system.
3. AN OVERVIEW OF THE HOMER SYSTEM The HOMER system enables the development of Dual User Interfaces by reducing implementation to specification. It provides a powerful specification language [3] which encompasses many innovative features with respect to existing UIMS languages. It supports metaphor independent interaction objects (called virtual interaction objects in HOMER terminology) which stand one level of abstraction above platform independent interaction objects. The HOMER language by itself provides no built-in interaction objects, but supplies mechanisms for introducing objects both at the physical (i.e. metaphor-specific or platformspecific level) and at the virtual level (i.e. pure abstract level). The dialogue control model, which is based on agents and has some relationships with the PAC model [2], is reflected in the language by explicitly introducing the notion of agent constructs (called dialogue agents in HOMER). Event handlers are supported and can be attached to interaction objects. The technology integration interface model, which is based on interaction objects and Input/ Output events, constitutes an important extension of existing technology integration methods [1]. The HOMER language enables distinction between visual and non-visual physical constructs through the qualifiers visual and nonvisual respectively. The lexical technologies are integrated in the HOMER system by means of implementing technology servers. It should be noted that, at run-time, the visual and the non-visual technology servers can be both handled in parallel. In Figure 2, the utilisation of the HOMER system is outlined. A proposed design methodology is discussed later on. The HOMER system will transform a Dual interface specification to a C++ implementation. The HOMER language provides powerful notational methods for the specification of the application interface by combining transactions on a
shared space with message channels. The run-time architecture of Dual User Interfaces developed with the HOMER system is based on the Dual run-time model [3], which constitutes an appropriate enhancement and extension of the Arch model [5] in order to fulfil the requirements of Dual User Interfaces.
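The following Python sketch is our own illustrative rendering of the concepts just described: virtual interaction objects standing above two physical instantiations, parallel visual and non-visual technology servers, and a dialogue agent attached as an event handler. It is a toy model under these assumptions, not the actual HOMER specification language (which, as noted, produces C++).

```python
class PhysicalObject:
    """Metaphor- or platform-specific realisation of an interaction object,
    e.g. an Athena widget (visual) or a COMONKIT Rooms object (non-visual)."""
    def __init__(self, technology):
        self.technology = technology

    def present(self, state):
        print(f"[{self.technology}] presenting: {state}")

class VirtualObject:
    """Metaphor-independent interaction object, one abstraction level above
    platform-independent objects; instantiated on both faces at once."""
    def __init__(self, name, visual, nonvisual):
        self.name = name
        self.visual = visual          # qualifier: visual
        self.nonvisual = nonvisual    # qualifier: nonvisual
        self.handlers = []

    def attach(self, handler):
        self.handlers.append(handler)

    def deliver(self, event):
        for handler in self.handlers:
            handler(self, event)

class DialogueAgent:
    """Agent construct coordinating a virtual object's two faces."""
    def __call__(self, obj, event):
        # Both technology servers are driven in parallel at run-time.
        obj.visual.present(event)
        obj.nonvisual.present(event)

confirm = VirtualObject("confirm",
                        visual=PhysicalObject("Athena"),
                        nonvisual=PhysicalObject("COMONKIT"))
confirm.attach(DialogueAgent())
confirm.deliver("button pressed")
```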
[Figure 2 shows the designer capturing the needs of sighted and blind users in a Dual interface specification, which the HOMER system transforms into a running Dual User Interface operated by both user groups.]
Figure 2: Using the HOMER system to develop Dual User Interfaces.

4. DUAL USER INTERFACE DESIGN METHODOLOGY

The goal of the Dual User Interface design process is to preserve two fundamental principles regarding the relationship of the visual and non-visual interfaces: (i) independent but integrated, meaning that there should be no interference or conflict of user requirements between the two environments, and (ii) similar but different, which signifies the similarity of the two faces at an abstract level. It is this similarity on which the design process is based. A decomposition of the design process into a number of distinct phases has been defined. Firstly, the semantic services provided by the functional core are identified. Since it is often more important to focus first on the type of data "exported" by the functional core and only secondly on the operations which can be applied to them, it was decided that in the first phase the identification of the functional services (i.e. supported semantic functions) and of the data/information structures is performed in parallel and at the same level of priority. Formally, this is called the semantic services and semantic data identification phase.
The purpose of the User Interface is to realize methods of accessing internal functionality and applying it interactively to the internal information structures. Hence, the second task of the design process is to identify such "manipulation methods" for the internal structures. These methods are to be made interactively available to blind and sighted users; in one sense, they formulate a high-level description of what the user has to accomplish (i.e. user tasks). Thus, this part of the design process has been named the abstract task identification phase.
Next, for each abstract task it is necessary to identify the way in which each user will carry out specific actions. The key issue during this phase, where the transition from abstract to concrete is realized, is to ensure that the semantic information which is made accessible to one user group is also made accessible to the other in the same qualitative manner (i.e. the What You Can Do Is What I Can Do principle). Such rules must be clearly stated in this design phase, so that it is always well defined when consistency of representation should be applied and on which information structures. This phase concerns the description of user-specific actions (user task identification phase). Finally, the design of the physical aspects of the visual and non-visual faces is carried out by addressing issues related to physical appearance. This is called the physical design phase.
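As a hedged illustration of the user task identification phase described above, the Python sketch below checks the What You Can Do Is What I Can Do principle for a toy payroll task: the concrete actions may differ per user group, but the semantic information made accessible must not. All task names and data items are invented for the example.

```python
# Phase 3 output (sketch): per user group, the concrete actions and the
# semantic information made accessible for each abstract task.
user_tasks = {
    "sighted": {
        "review salary": {"actions": ["point at row", "read table"],
                          "info": {"Employee", "PayScale"}},
    },
    "blind": {
        "review salary": {"actions": ["select row via keyboard",
                                      "hear speech/braille output"],
                          "info": {"Employee", "PayScale"}},
    },
}

def wycdiwicd_ok(tasks_by_group):
    """The same semantic information must be accessible to every user
    group for every abstract task."""
    groups = list(tasks_by_group.values())
    for task in groups[0]:
        reference = groups[0][task]["info"]
        if any(g[task]["info"] != reference for g in groups[1:]):
            return False
    return True

# Only once this consistency rule holds should the physical design
# phase fix the appearance of the visual and non-visual faces.
assert wycdiwicd_ok(user_tasks)
```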
5. DISCUSSION AND CONCLUSIONS

Existing approaches addressing the problem of accessibility of Graphical User Interfaces by blind people are based on adaptations of visual interactive applications at the lexical level. However, these methods are associated with significant theoretical and practical problems. To avoid the segregation of blind users in their working environment, it is critical to facilitate efficient and effective computer-based collaboration with sighted users. To address these problems, the concept of Dual User Interfaces has been defined and a User Interface Management System for developing Dual User Interfaces, called HOMER, has been constructed. Using the HOMER system, a number of experimental interactive applications with Dual User Interfaces have been built, such as a payroll management system, an application providing descriptions of graphical pictures annotated with text, and a personal organizer; preliminary trials have demonstrated the practical value of the adopted approach. The Dual User Interfaces generated by the HOMER system may run either in local collaboration mode or in remote collaboration mode. The HOMER system realizes an efficient approach for eliminating accessibility and collaboration problems from the development phase (i.e. a proactive approach), in contrast to the more restrictive adaptation-oriented approaches, which necessarily attempt to "react" to problems arising from new technological developments (i.e. a reactive approach).

REFERENCES
1. CMU/SEI-91-UG-8, Guide to Adding Toolkits, Serpent User's Guide, 1991.
2. J. Coutaz, Architecture Models for Interactive Software: failures and trends, in G. Cockton (ed.), Engineering for Human-Computer Interaction, North-Holland, 137-151, 1990.
3. A. Savidis and C. Stephanidis, Developing Dual User Interfaces for Integrating Blind and Sighted Users: the HOMER UIMS, to appear in the Proceedings of the CHI '95 Conference on Human Factors in Computing Systems, Denver, Colorado, May 7-11, 1995.
4. A. Savidis and C. Stephanidis, Building non-visual interaction through the development of the Rooms metaphor, to appear in the Conference Companion of the CHI '95 Conference on Human Factors in Computing Systems, Denver, Colorado, May 7-11, 1995.
5. The UIMS Tool Developers Workshop, A Metamodel for the run-time architecture of interactive systems, SIGCHI Bulletin, 24(1), 32-37, 1992.
Development of Human-oriented Information Systems - Learning with Mentally Handicapped People

Yasuko Kaminuma

Department of Information Systems, Faculty of Informatics, Teikyo University of Technology, 2289 Uruido, Ichihara, Chiba 290-01, Japan
ABSTRACT

It has recently become increasingly necessary to educate information systems professionals who can take into consideration the harmony between technologies and human beings and societies. How can such professionals be obtained? Can they be educated in universities? To study these questions we applied action research. In this paper we report on information systems education carried out in rehabilitative training schools for mentally handicapped people, as one example of such practice in university education.
1. INTRODUCTION

In recent years the environment of facilities and resources relating to information systems has changed, and with it the ways in which people work with computers. System engineers must know how to carry out requirement analysis for information systems that reflect these changing situations, and must correctly establish what the optimum requirement specifications are. Those who use information systems, in turn, must acquire the ability to analyze information and actually make use of the results of such analysis. For this purpose, university undergraduates who are to work with information systems need to cultivate the conceptual power to express themselves. In this paper we report on information system development exercises conducted in rehabilitative training schools for mentally handicapped people, as an example of practice in university education.
2. DEVELOPMENT OF HUMAN-ORIENTED INFORMATION SYSTEMS
Downsizing and networking are in progress, and the development environment of information systems has changed; as a result, information resources are increasingly dispersed. Confronted with this situation, improved user interfaces are required, and the development technology of information systems must be improved as well. The themes of development projects have expanded from settled, routine business to unsettled problems, and attention is now also paid to supporting a variety of social problems. At the same time, the autonomous dispersion of systems has compelled end users to extend their knowledge and technical skill. Furthermore, the operability of information systems has become an object of evaluation, and importance is attached to object orientation and multimedia technology.

In this paper we deal with information systems composed of human and mechanical subsystems concerned with the collection, processing, transfer and utilization of information required for human activities. An information system is a system which assembles, stores, processes and delivers information relevant to an organization (or to society), in such a way that the information is accessible and useful to those who wish to use it. An information system is thus a human activity system which may or may not involve computer systems. It is important for system developers to identify problems concerning resources such as budget, manpower and schedules, and problems regarding the planning and administration of development projects. At the same time, however, developers and users must agree on what the contents of the information system to be developed are. This is the most difficult problem in system development. To comprehend this problem, it is important to investigate what kind of education universities can provide. We have investigated which existing techniques and methodologies can be applied effectively in this new environment. This led us to adopt action research [1-2] as a method for performing requirement analysis of an existing information system, building a prototype, and judging whether the requirement analysis was made correctly. With this method, problems and requirements are clarified by entering the users' problem domain. The method directly involves analysts in the action processes, and is therefore well suited to problems where importance is attached to human activities. Because analysts are involved in a situation where theoretically grounded tests can be made, it becomes easy to address newly arising problems, to reconstruct a theory, or to combine several situations.
3. FUNDAMENTAL EDUCATION IN INFORMATION SYSTEM DEVELOPMENT AT UNIVERSITIES

Students who are to be engaged in the development of information systems will in future be required to possess problem-finding, problem-analyzing and problem-solving abilities, together with the capacity for decision making. One available method in information system analysis education uses computer simulation to teach the process of reaching agreement between people. This method, however, has little effect in a beginners' course, because human-machine interaction and human-human interaction differ intrinsically in many respects. We therefore planned an action research project in a non-profit setting, so that students could acquire problem-finding and requirement-analysis abilities for an information system. The plan aims to allow students specializing in information system technology to cultivate the fundamentals for becoming system engineers. The purpose of this education is for students to obtain the knowledge necessary for planning and developing information systems by comprehending the systems to be developed and the roles and assignments of the people who are engaged with them. Students are guided towards articulating the problems of the information systems they deal with while learning the knowledge required in the individual processes of system development. They work towards this articulation through a series of processes: targeting problems discovered in the real world, performing requirement analysis, and building a prototype. As the problem domain to be experienced, an environment where the practical utilization of information systems lags behind is probably a desirable choice, because there the students are obliged to solve problems they themselves have discovered rather than mimicking other people's actions. In this exercise, the students' activities are videotaped as far as possible and used as material for investigating the students' thinking processes. This makes it possible to check what the students perceived, whether their judgments were correct, whether there was any ambiguity or incomprehension, whether other interpretations could be made, and so on.
4. PRACTICAL LEARNING OF ANALYSIS AND DESIGN OF INFORMATION SYSTEMS

Here we describe exercises in problem finding and requirements analysis conducted by entering the field of self-establishment education for severely mentally handicapped people. The learning environment is explained in Figure 1. Field A is a facility for mentally handicapped people, where they receive instruction in daily-life activities from instructors who are experts in welfare. Field B is a university, where students acquire a variety of knowledge about information technology through activities including classroom learning. By placing themselves in Field A, the students observe the conduct and actions of the mentally handicapped people. The students and the teachers hold the world view of analyzers, whereas the mentally handicapped people and their instructors share the world view of users. C is the interface between the users and the analyzers, i.e. a human-human interface. The analysis of the information systems is carried out in cooperation among the people holding these different world views. The students improve the level of the analysis by repeating system analysis, knowledge acquisition and information collection, as shown by arrow (a). With the repetition of this cycle, the world views of the individuals gradually change. When agreement between a user and an analyzer is reached, the process moves to D: the students carry out design, attaching importance to the human-computer interface. The products obtained from prototyping are then exposed to the users' evaluation, and analysis and design are repeated until the users' satisfaction is attained, as shown by arrow (b).
[Figure 1 depicts the learning environment: the users' view and the analyzers' view (students and teachers) meet at the human-human interface C during analysis; design D concerns the human-computer interface; arrows (a) and (b) mark the analysis and design iteration loops.]
Figure 1. The Learning Environment
The practice referred to above was carried out as action research. Grounded theory [3] was applied to the collection and analysis of information, whereas SSADM [4] was applied to the analysis and design of the system. The cooperators in these practices were about 60 mentally handicapped people in their 20s or 30s, almost all of whom live together in a housing complex attached to the facility. We made careful preparations in obtaining this cooperation; the matter requiring the most earnest attention was never to put stress on the mentally handicapped people. In particular, the influence of introducing a computer into the facility was deliberately investigated for about one year: for example, whether the mentally handicapped people could become familiar with the equipment was examined by inviting those who showed interest in machines to laboratories in the university and by allowing them to touch the equipment. The students, for their part, were willing to visit the facility from time to time to deepen their understanding of the mentally handicapped people.

For a social aid system, importance was placed on two principles: (1) the system should not be coercive; (2) the system should be acceptable to the people receiving the aid, without any feeling of resistance. Also, to avoid coercion caused by the developers' erroneous understanding, the work started with observation: the students stayed for a settled period with the mentally handicapped people and their tutors, experienced the daily life of such people in special training, and watched the actual state of their lives. Does a system exist that would be useful as an aid to their lives? If so, what is it like? If such a system were developed, what would its priority be? With such questions in mind, the students conceive several system plans in parallel with observation and problem detection. An idea selected from these plans is analysed, designed and prototyped, and then evaluated by the mentally handicapped people. The evaluation is done straightforwardly and quantitatively in an easily understandable manner; for example, checks can be made with the following items. Are the mentally handicapped people interested in the matter? Are they willing to touch it? Are they eager to have a try? Do they get tired in the course of use? Are they utterly reluctant to see it? Are they eager to use it repeatedly? Having obtained this evaluation, the students improve the points found to be problematic and repeat the cycle until the system is accepted by the mentally handicapped people.

The computer-aided systems developed by the students so far through this practice are as follows.
(1) Computer-aided system for purchase training. This system allows the mentally handicapped to train for making purchases, in particular discerning types of money, calculating prices and selecting change. Comprehension concerning money is among the most difficult matters for the mentally handicapped.
(2) Computer-aided system for homecoming training. Those who live in the residences can return to their native places several times every year. The system is a learning aid enabling them to return home by themselves.
(3) Computer-aided system for promoting mutual understanding. This aid system enables people with a double handicap - deafness in addition to intellectual disability - to carry on conversations using a computer.
(4) Computer-aided system for fabric making. Fabric making is the work of fabricating picture patterns onto mats and similar materials by referring to fundamental designs. Mentally handicapped people often have a hard time recognizing an objective and then moving on to the next action, so training in the fundamental actions is necessary; the system helps this learning.
5. CONCLUSIONS

The systems were built in the laboratories, and the completed prototypes were brought into the facility to be evaluated. The mentally handicapped people looked forward to the system tests. This encouraged the students to work willingly, and as a result they were led towards solutions of the problems. As autonomous, distributed environments progress, it becomes ever more difficult to reduce an information system developed under a complicated facility and resource environment to a specific type. The users participating in project conferences are diverse, and the differences in their levels of knowledge and technical skill keep widening. Although the formation of agreement between users and developers is very difficult, it is effective to visualize indistinct user requirements by using end-user thinking and multimedia information. We have succeeded in raising the efficiency of this visualization work by re-using the constructed prototype.
REFERENCES
1. G. Mansell, Action Research in information systems development, Journal of Information Systems, No. 1 (1991) 29.
2. Y. Kaminuma, Training for the development of public information systems, User Oriented Information Systems Symposium, IPSJ (1993) 101.
3. W.C. Chenitz and J.M. Swanson, From Practice to Grounded Theory, Addison-Wesley, 1986.
4. G. Cutts, Structured Systems Analysis and Design Methodology, Blackwell Scientific Publications, 1991.
VI. Social Aspects, Management and Work
VI.1 Information Technology
Personal Information Appliances

Peter J. Thomas, John F. Meech and Robert D. Macredie

Centre for Personal Information Management, Faculty of Computer Studies and Mathematics, University of the West of England, Bristol, Coldharbour Lane, Bristol, BS16 1QY, UK

1. INTRODUCTION

The range of information management activities which personal computing devices are intended to support includes: (a) storage of local information created or manipulated by users; (b) retrieval of local and non-local information; (c) integration of multiple sources of local and non-local information; (d) decision-making by integrating stored, retrieved and integrated information; and (e) communication with other users and devices. The activities which comprise what may be termed 'personal information management' [1, 2, 3, 4] require not only the use of various technologies, media and modalities but also their integration. In this paper we look at the characteristics of personal information appliances and the range of activities they support, and review families of appliances and their application domains with examples from a current project.

2. PERSONAL INFORMATION APPLIANCES

Before looking at specific research issues and topics, we describe the motivation for research and development in personal information appliances by looking at three areas: key development issues, user experience, and enabling
technology developments.
2.1 Key Development Issues

The development of personal information appliances centres on the following key issues: (a) Convergence. Technologies such as multimedia, broadband networks, personal computing and telephony are converging towards a situation where networks provide more than the transport of information, and applications and data are embedded in high-speed information networks. (b) Integration. These technologies require integration in a way which creates a seamless 'information space' of families of information appliances. This suggests that radically new technologies may not be required to support personal information management. (c) Management. Traditionally, the concept of 'information management' has been emphasised in corporate settings. 'Information management' can also be extended to all users' activities, and suggests a focus on issues such as information properties (depth and breadth, for example), and the possibilities for users to sort, categorise, personalise and share information across activities and tasks.
2.2 User Experience

In terms of users' experience of integrated personal information appliances, the following characteristics are central: (a) Transparency. Access to information is transparent, and appliances are interchangeable. (b) Appropriate information provision. The focus is not only on the bandwidth of information or simply on multimedia delivery, but on the provision of information which is appropriate to the task at hand. (c) Tailorable end-user interfaces. Interfaces to personal information appliances are personalisable and can be driven by the contextual features of use and the content of the information itself. (d) Multimodal interfaces. User interfaces use several forms of presentation, not simply multimedia in the commonly accepted sense.
2.3 Enabling Technology Developments

Finally, we can identify enabling technology developments which support the development of personal information appliances: (a) Communications and network technologies. Networks, infrastructures and equipment; the integration of networks and services; mobile broadband services; personal communications networks. (b) Human interfaces. The design of information displays; multimodal interfaces; the use of 'digital documents'; video/voice integration; small and innovative display devices. (c) Information engineering. Support for mobile professionals and knowledge/information workers; integrated office information systems; and techniques for managing 'interpersonal information management' [6].

3. APPLICATION DOMAINS AND APPLIANCE FAMILIES

We suggest that for personal information appliances there is a strong relationship not between an application domain and a particular appliance but
between domains and families of information appliances. Families of integrated personal information appliances include familiar technologies and appliances but also radically new ones. Figure 1 provides a mapping of the space of families of personal information appliances into application domains, and shows the overlapping relationships between application domains and families of appliances.
[Figure 1 maps families of personal information appliances onto four overlapping application domains: home systems, site/building systems, office systems and mobile systems.]
Figure 1: Personal Information Appliances (adapted from [4])
We are currently using this mapping to drive the design and development of families of appliances and of specific members of those families. The following sections look at an application domain (site and building systems), an appliance family (telephony appliances) and a user interface technology (agent-based appliances).
3.1 Application Domain: Site and Building Systems

The development of integrated personal information appliances suggests an integration of 'bits and buildings': the integration of information appliances into sites and buildings themselves. Here, information management facilities can be embedded into buildings via distributed appliances which interface with other appliances.
3.2 Appliance Family: Telephony Appliances

Telephony systems are still relatively simple and reactive information appliances. Voicemail - ubiquitous within many organisations - call management and fax-switching are familiar services which can be enhanced by a perspective which emphasises integrated personal information appliances. The requirement for transparency suggests that telephone handsets can be access points to information management functions.
3.3 Interface Technology: Agent-Based Interfaces

The concept of agency in terms of managing information may best be thought of as an enabling functionality. In these terms an agent may be thought of as providing a special facility, such as scheduling meetings according to a particular user's preferences, or performing other functions in much the same way as a human might. These agent roles may be viewed as functioning at the information level: they manage information within the context of human understanding. Such agents are controllable, minimal 'information management support agents'.

4. PERSONAL OFFICE SUPPORT SYSTEMS

To illustrate work on personal information appliances, we describe an ongoing project on the development of Personal Office Support (POS) Systems, which unites the observations we have made so far with the complexities of supporting information management in a work setting.
4.1 Supporting Office-Based Collaboration

Communication, the management of information and the scheduling of time are central to the working relationship between individuals. In particular, the relationship between office-based collaborating professionals [6], such as managers and their personal assistants, is based on the smooth management of communication, information exchange and time-management. Although there are numerous examples of 'productivity' software, such systems are essentially ad hoc, based on little detailed investigation of the complexity of the tasks to be supported. We suggest that a great deal of leverage can be obtained by thoroughly understanding the nature of the relationships between office-based collaborating professionals, the nature of the information which is necessary to those relationships, the key information management tasks to be supported, and the ways in which those relationships and tasks are changed by the nature of different user interface modes. In these working relationships information can come from diverse sources: from face-to-face interaction with co-workers, from interactive devices such as telephones, or from non-interactive media such as fax, email, letter and memo. POS systems [7, 8] recognise that adequate support for the work of collaborating office professionals requires an understanding of both the 'personal' and the 'collaborative' aspects of information management. The key feature of this approach, in line with our approach to the development of personal information appliances, is the integration of the diverse media necessary to support an individual's work. Our approach to providing support for these personal information management activities is to develop information appliances which provide (a) high levels of connectivity, (b) a narrow range of core information management functions, and (c) support for specific users through information content and provision without personal ownership of the appliance, and which (d) serve as ways of locating individuals' activities (as opposed to individuals) within the larger social and organisational settings in which they work.
4.2 Research Studies

We have used various approaches to investigate the complexities of the relationship between office-based collaborating professionals such as managers and personal assistants, and in the design of prototype POS systems. Initially, small-scale multi-modal studies of working offices were used to arrive at taxonomies of salient events, using analysis of video-recorded materials. The studies included videotaping of the interactions between office-based collaborating professionals, the use of technologies (both paper-based and electronic), and the ways in which information was managed in the office setting. Results from the initial observational studies suggested that the salient issues which personal office support systems would need to address were (a) communication of short, content-intensive messages, (b) support for shared understanding of a medium of information exchange, such as a shared view of a diary, and (c) support for storage and retrieval of longer, less content-intensive messages in the form of electronic documents. These issues were taken up in the design of a series of multimodal prototypes which incorporated pen-based interaction, voice messaging, computer-telephony integration and shared visual workspaces. Further studies of these prototypes suggested that the work of office-based collaborative professionals such as managers and assistants could be enhanced by the provision of low-functionality, robust communications technologies designed around a central 'working medium' such as a shared diary representation.

4.3 Prototype Development
These studies were carried out in parallel with the incremental prototyping of a POS system, using paper and software prototypes and a set of scenarios intended to reflect accurately the possible use of the prototype. Finally, a commercially available collaborative work system [9] was re-developed to provide a prototype POS system. This system was evaluated using both traditional usability studies and observational and scenario-based studies. The POS system consists of a small gesture-, voice- and pen-enabled LCD colour display and associated technology which manages telephone, fax and email, and allows a user to indicate their availability status in a time-management system. Pairs of appliances allow users to indicate their availability, interruptability and unavailability to interruption, and therefore represent an enhancement of traditional network-based diaries or schedulers. The results of this project suggest that the problems of designing POS systems are more complex than those of simple 'personal systems', but share many of the same concerns. In particular, collaborating office professionals see such systems as 'high-consequence': crucial for the support of working relationships where technology is a mediating factor. The further value of the project, which is being taken up in current work, is to provide a base to explore a number of research themes, such as (a) the human interface design issues for visual displays of time and the use of colour, resolution, icons and symbols; (b) multimodal interface issues - the integration of pen, voice, speech and gesture; (c) integration of information appliances - the ways in which networks of ubiquitous computing devices can be managed; (d) technology issues in developing a software architecture which can integrate and manage telephony, fax, email and voicemail, for example; and (e) the social and organisational issues in supporting office work.
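The Python sketch below caricatures the prototype's two central ideas - a shared diary as the 'working medium' and paired appliances that publish an availability status to each other. The class and method names are our own illustrative assumptions, not the actual system's interface.

```python
from dataclasses import dataclass, field
from enum import Enum

class Availability(Enum):
    AVAILABLE = "available"
    INTERRUPTIBLE = "interruptible"
    UNAVAILABLE = "unavailable"

@dataclass
class SharedDiary:
    """The central 'working medium': one diary view for both partners."""
    entries: list = field(default_factory=list)

    def add(self, when, what):
        self.entries.append((when, what))

@dataclass
class POSAppliance:
    owner: str
    diary: SharedDiary                 # shared with the paired appliance
    status: Availability = Availability.AVAILABLE
    peer: object = None

    def pair(self, other):
        self.peer, other.peer = other, self

    def set_status(self, status):
        self.status = status
        if self.peer:                  # the partner's display is updated
            print(f"{self.peer.owner} sees: {self.owner} is {status.value}")

diary = SharedDiary()
manager = POSAppliance("manager", diary)
assistant = POSAppliance("assistant", diary)
manager.pair(assistant)
manager.set_status(Availability.UNAVAILABLE)    # e.g. in a meeting
assistant.diary.add("14:00", "project review")  # both see the same entry
```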
REFERENCES
1. Thomas, P. J. (1995) (ed.). Mobile Communication and Collaborative Technology. London: Alfred Waller/Unicorn Seminars series.
2. Thomas, P. J. and Meech, J. F. (1994). Personal Information Management: Applying HCI Techniques to Develop Usable Personal Technology. Proceedings of HCI'94, Glasgow, August 1994.
3. Thomas, P. J. and Meech, J. F. (1994). Personal Information Management: developing usable personal systems. Proceedings of OZCHI'94, Australian annual conference on Computer-Human Interaction, Melbourne, November 1994.
4. Thomas, P. J., Meech, J. F. and Macredie, R. D. (1995). Personal information management using integrated information appliances. To appear in Vince, J. and Jones, H. (eds.) Digital Media and Electronic Publishing, book edition of BCS Graphics and Displays Group Conference (to appear, 1995).
5. Thomas, P. J., Meech, J. F. and Macredie, R. D. (1995). Integrated Information Appliances and Added-Value Information Services. Proceedings of HFT'95, 15th International Symposium on Human Factors in Telecommunications, Melbourne, Australia, March 6-10, 1995.
6. Frohlich, D. (1995). Interpersonal information management. In Thomas, P. J. (ed.). Mobile Communication and Collaborative Technology. London: Alfred Waller/Unicorn Seminars.
7. Fleuriot, C., Lees, D. Y., Macredie, R. D., Thomas, P. J. and Meech, J. F. (1995). Interface Engineering in an Office Information Appliance. To appear in the Proceedings of CHI'95 Human Factors in Computing Conference, Denver, May 1995.
8. Thomas, P. J., Meech, J. F. and Macredie, R. D. (1995). Managing Office-Based Services using an Integrated Information Appliance. Proceedings of HFT'95, 15th International Symposium on Human Factors in Telecommunications, Melbourne, Australia, March 6-10, 1995.
9. O'Conaill, B., Geelhoed, E. and Toft, P. (1994). Deskslate: a shared workspace for telephone partners. Hewlett-Packard Technical Document, HP Laboratories, Bristol, UK.
Efficient Development of Organisations and Information Technology - A Design Approach

Jan Gulliksen a,b ([email protected]), Mats Lind a ([email protected]), Magnus Lif a,b ([email protected]) and Bengt Sandblad a,b ([email protected])

a Uppsala University, Center for Human-Computer Studies, Lägerhyddv. 18, S-752 37 Uppsala, Sweden.
b Uppsala University, Dept. of Technology, Systems and Control Group, P.O. Box 27, S-751 03 Uppsala, Sweden.
In this paper a framework for the entire process of organisation and information system development is discussed, focusing especially on design issues. We define design in human-computer interaction as the process of creating a formal description of the appearance and functionality of an information system. This design is based on both formal and informal descriptions of interaction requirements resulting from a work analysis process; the analysis phase is separate from the design phase. Since design, by this definition, is a specification into a formal language, it can never completely describe all requirements. We distinguish four different, consecutive gaps of communication in the design process. In each of these gaps some information about the actual work situation can be lost. These gaps constitute severe obstacles in the process of developing efficient and usable information systems for specific work situations. Development models covering the entire process of design might bridge, or at least narrow, these gaps. Some main characteristics of such models are presented.

1. BACKGROUND

Developing information systems for complex tasks in working life is not trivial. Skills in software development and cognitive psychology, in combination with insight into the end-user's work situation, are required. Most methods for the development of information technology support are a mixed process of organisational development, task analysis, data modelling, application development and interface design. They emphasise certain aspects of the development process but neglect others. Disadvantages of such methods are:
• Difficulty in planning and controlling development - increased development times and costs
• Unfriendly, fragmentary, inconsistent, unchangeable user interfaces are often the result
• Inferior user participation
• Problems with personal competence and knowledge, due to complexity
• Increased work load and personal conflicts due to unclear roles of responsibility and insufficient competence
Several competencies are involved in the organisational and information technology development process. Human-factors experts analyse the user and his task, software engineers perform system development, and so on. An analysis of expectations identifies goals for the work situation of tomorrow; organisational developers analyse the implications of general future changes in work methods, trying to predict the effects of modern computer support, re-engineering measures and knowledge development. Task analysts establish what information is used and, together with designers and programmers, derive a datamodel. The amount of extra work involved can be large, due to unclear roles and competencies. A methodology for user interface design and development could significantly reduce the amount of work required. A designer is often introduced as a communication link between work analysts and software developers. He is often a software engineer with a special interest in formatting and layout, but often without formal training in graphic design. This designer often performs design as an artistic leap, sometimes referred to as innovative design. The assembled information and knowledge of the work result in a prototype with more or less functionality. Problems in the design process can be traced to problems in communicating and translating one semi-formal notation (e.g. a report from the task analysis) into another formal notation (e.g. a MacroMind prototype, or a mock-up design on paper). The prototype can seldom be re-used in software development, and such prototypes seldom illuminate all relevant aspects encountered in the task analysis, since prototyping tools are not developed to communicate design decisions.

2. ORGANISATIONAL ASPECTS AND MODELS OF WORK

There is a need for a new paradigm of system development, focusing on the development of organisations and the work activities as such. This means that methods for analysis, requirement specification, system design and construction must include all relevant aspects simultaneously, and not focus only on technical support systems. It is obvious that the couplings between e.g. work organisation and information handling in organisations are very strong. A certain work organisation defines certain information support requirements. An information system is normally developed with a specific work organisation in mind, reflected in e.g. the datamodel and functional specifications. Changing the work organisation without changing the information system will often create severe informational problems for workers in the new organisation. In administrative and service work situations, the main focus of our studies, it is important to base information technology development on models of work, considering all aspects: competence for performing a specific work task, organisational impacts due to the automation of tasks, information needs and possibilities of sharing large information sets, work flow and supervision, evaluation and efficiency. We are currently developing work models for the case handling domain [1]. Development based on such models can contribute to the improvement of work efficiency, user satisfaction, competence, work environment etc. When designing computer support it is necessary to question the present work organisation, in order to fully utilise the potential of technological support. We need to develop information systems for tomorrow's organisations, not for today's. Often, in a development project, a task analysis is performed focusing only on the information aspects of the work. Information-related problems are identified and a suitable computer application solution is specified, resulting in false conclusions and inappropriate requirement specifications. Information problems, identified by the task analysis and experienced by professionals in a work setting, can be symptoms of more fundamental work environment problems: organisational problems, lack of personnel or other resources, lacking competence or inefficient support systems will all result in information handling problems. Developing computer systems can eliminate the symptoms, but will not solve - rather it will conserve - the original problems.
Figure 1. The OTM-model describes the process of simultaneously considering organisational aspects (O), human, social and competence aspects (M) and technological aspects (T).

A new approach to task analysis, considering all aspects of a work situation, will result in more correct requirement specifications. The close relations between work organisation, information handling and competence must be considered simultaneously [c.f. fig. 1]. An important part of task analysis is the expectation analysis, where different staff categories can specify expectations
regarding work organisation, work procedures and support systems. Such expectations are an alternative, and more efficient, way of specifying goals. In a 'learning organisation', organisational development is a continuous process integrated in the work. Goals are formulated, compared to the actual work situation, and development plans are specified, effectuated and evaluated, in a continuous feedback process [c.f. fig. 2]. This 'learning organisation' can support the development of organisations, information technology and competencies.
[Figure 2 depicts a feedback loop in which general goals, e.g. efficiency, usability and a good work environment, are continuously compared with the actual state of the organisation over time, driving the development activities.]
Figure 2. Continuous development of organisation, competence and information systems in a 'learning' organisation.

The development activities can be seen in more detail in figure 3. Describing and analysing work requires detailed domain knowledge, which is why efficient user participation is essential. Organisation, work contents, activities, communication and information aspects must be described. Problems and expectations of different staff categories are identified and possible solutions analysed. Requirement specifications for development should be formulated in activity terms, covering information and organisation aspects simultaneously. Requirement specifications of this kind describe how potential end-users want to organise and perform their work activities and environment. Expectation analysis can reveal conflicting goals, which can be handled only if explicitly documented. An experimental model for system development assumes that professionals participate actively. A preliminary requirement specification in activity terms is interpreted by computer experts into a requirement specification in 'computer' terms. A rapid prototype is developed and tested. Again, computer experts interpret and evaluate the data etc., and the prototype gradually transforms into the final system. Parallel requirement specifications allow users to participate and even obtain a leading role in development. Prerequisites for this development model to be feasible are e.g.: an organisational decision to work according to an experimental model and to assign reasonable resources; competence for this kind of work among both domain and computer experts; and, finally, efficient rapid prototyping tools.

3. ORGANISATION AND INFORMATION TECHNOLOGY DEVELOPMENT

The above-mentioned disadvantages might be overcome by defining a structure for the entire process of organisational and information technology development, consisting of separate but sequentially related parts [c.f. fig. 3]. Organisational development includes e.g. the specification of goals, relations, roles, competencies and work processes. The organisational model is the description of this, and the basis for the specification of user requirements, concerning both functionality and human-computer interaction. Information analysis establishes, using this organisational model, what data are needed, by whom, when and where, to perform the work. The resulting application model is both a datamodel and a collection of rules or methods, depending on the nature of the database. Analysis of information utilisation focuses upon certain aspects of how information entities are used in specific work situations and, especially, upon factors affecting cognitive load [2]. It assumes existing application and organisational models, and the result is
called a work model. User interface design is the interpretation of the work model into an interface model. Construction is the production of applications - database and presentation (user interface and functionality) - based on the application and interface models.
[Figure 3 shows the chain of activities - organisational development, information analysis, analysis of information utilisation, user interface design and construction - and the formalised descriptions each produces: the organisational model, the application model, the work model, the interface model and the application. Boxes denote activities (work processes); the remaining symbols denote descriptions (resulting formalised models).]
Figure 3. Iterative experimental system development, including the consecutive activities and the descriptions they produce. Feedback and possibilities for participatory design are also illustrated.

4. METHODS FOR DESIGN - BRIDGING THE DESIGN GAPS
We need to enhance the communication and interpretation aids for the transfer of knowledge between the different areas which have been subject to analysis. This communication is to some extent a question of design. The concept of design in human-computer interaction can be defined as the creation of a formal description (e.g. program code, a formal language) of appearance and functionality, based on the (to a large extent) informal results of an analysis. Four different types of design, corresponding to the formalisation of the description models, are defined. The design of the organisational model is a question of being innovative and imaginative in predicting future changes of work organisation and impacts from information technology, and relating this to the overall goals of the organisation. The design of the application model is a characterisation of the objects and methods of work. The design of the work model is a process of formalising the outcome of the analysis of information utilisation. Finally, the design of the interface model is the formalisation of the decisions made during the user interface design process, e.g. as a prototype. As information about the actual work is lost when designing formalised descriptions, communication gaps in design occur. Today's methods are 'leaps' over these gaps, referred to as 'innovative design'. Methodologies for design can bridge or narrow these gaps, improve knowledge communication, make development more efficient and economic, prevent unnecessary work and produce usable interfaces.
• Gap 1 - refers to problems in formalising the design of the user interface model - the appearance, functionality and behaviour of the intended human-computer interface - in a way that is interpretable for software development. Prototyping with existing design tools never illuminates all relevant aspects of work; it shows a minimum of functionality and dynamics, and seldom produces formal specifications that are reusable in construction. Our group is currently developing an interface emulating tool that can interpret a formalised design description file, together with an existing database, into a functioning application. The description file describes important aspects of layout and relations between interface elements at different levels. Prototyping is performed in real terminology with a dedicated tool.
• Gap 2 - refers to problems in formalising results from the analysis of information utilisation. In the final specification of formalised, viable methods for the analysis of information utilisation, a language for the specification of the work model is defined. Through the domain specific design methodology [3] the interpretation of the work model is enhanced. It is
recommended that the same person performs the analysis of information utilisation and the user interface design, due to the loss of information in every transfer of complex knowledge.
• Gap 3 - refers to problems in formalising the result of the information analysis, the application model, as a formal design of the datamodel (objects and methods) along with behaviour and functionality. Object-oriented methodologies for application modelling are preferred, where the relations to interface elements and to concepts of the end-user's work situation are straightforward [4]. The domain specific design methodology emphasises the importance of having datamodels that mirror the actual work situation, which also improves the possibilities for efficient user participation in the iterative development process.
• Gap 4 - refers to the problems of specifying the organisational model through a formal description that can be adapted, understood and efficiently introduced in the entire application development process. The concept of 'learning organisations' is central.

5. ANALYSIS OF INFORMATION UTILISATION

Analysis of information utilisation focuses upon how the information entities encountered in the information analysis are used in a specific work situation. An existing application model (an implemented, functioning, object-oriented datamodel containing data and methods specifying the behaviour) and an organisational model can be assumed. The datamodel puts limits on the design space; if the analysis of information utilisation demands changes, the design of the user interface can be enhanced, provided the datamodel is specified in a modifiable way. We are working on methods for the analysis of information utilisation and on a taxonomy for the specification of the work model, the outcome of this analysis. Central to the creation of efficient user interfaces is minimising the cognitive load imposed upon the user when performing work-related tasks. To do this, we have argued elsewhere [2], the user's tasks must be analysed in terms of which decisions they involve. Examples of the kinds of decisions we refer to are: rejecting or accepting an application, sending or not sending a form to someone else for consultation, etc. From when the task is started until a decision is reached, the cognitive resources of the user are heavily strained, and the manipulation of the interface should be kept to a minimum. In order to make such a design, the work model must include: (a) a list of tasks performed, as defined by the decisions that are made, for each type of worker; (b) a list of data, in terms of the datamodel, that could possibly be used in performing each task; (c) a list of actions needed to manifest each decision; and (d) a list of naturally occurring "work situations", defining sets of tasks which usually, or possibly, are performed concurrently. This work model can then be used by the interface designer in the creation of suitable interface elements, e.g. "screen documents" or forms, or larger entities such as "workspaces" or "rooms" [5], and of the dialogue needed to use these to perform the tasks.
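A minimal sketch of such a work model as a data structure, following items (a)-(d) above; the case-handling task names and data entities are invented for illustration and are not part of the authors' taxonomy.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    """(a) A task, named by the decision it ends in."""
    decision: str
    data_used: List[str]      # (b) datamodel entities possibly used
    actions: List[str]        # (c) actions that manifest the decision

@dataclass
class WorkSituation:
    """(d) A set of tasks usually or possibly performed concurrently."""
    name: str
    tasks: List[Task] = field(default_factory=list)

review = Task(decision="accept or reject application",
              data_used=["Application", "Applicant", "PaymentHistory"],
              actions=["record decision", "notify applicant"])
consult = Task(decision="send or not send form for consultation",
               data_used=["Application", "ConsultantRegister"],
               actions=["route form"])
desk = WorkSituation("case handling desk", [review, consult])

# The designer can gather everything a situation might need into one
# "workspace", so the interface need not be manipulated mid-decision.
workspace_data = sorted({d for t in desk.tasks for d in t.data_used})
print(workspace_data)
```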
6. DESIGN DECISIONS AND PROTOTYPING

Documentation of design decisions, and of the reasons behind them, is seldom performed; but when redesigning the interface during experimental development, the reasons for design decisions must be known, clearly specified and easily understood if they are to be evaluated. The Design Rationale approach [6, 7] is a framework for documenting design decisions: a semi-formal notation of different design options and explicit representations of the reasons for choosing among those options. The main concepts are QOCs: Questions - highlighting key issues in the design; Options - possible answers to the questions; and Criteria - reasons for or against possible options. QOCs are mainly identified by recording design sessions. The inventors of the design rationale approach argue that the methodology provides a theoretical framework for design. Our opinion is that design rationales solve some problems concerning the documentation of design decisions, describing decisions that have been made, but give no guidance on which decision to make in a certain context. The rationale is therefore not a methodology for interface design.
The usability engineering approach [8, 9] to interface design focuses on the evaluation of an interface. The design must fulfil utility and usability criteria; the focus is on the definition of these criteria and on methods for evaluation and testing. Methods of this nature are also important in a more complete design methodology, but they lack support for the design decision process. Basing design work on a domain specific style guide [3] can minimise the distance from the style guide to the interface design for a specific application. Normally, a style guide is of a general nature, with very limited design support for applications in a specific work domain. A style guide on a higher level, including domain knowledge, is much more detailed and can efficiently support the design process. Important parts are composite interface elements corresponding to the more complex information structures of the domain. However, an even more structured approach to the design process is needed. We are currently formulating a methodology in which the representation of design decisions is immediate and the loss of information minimal.

7. DISCUSSION

A framework for organisational and information technology development has been presented in this paper, defining design in HCI as the creation of a formal description of appearance and functionality, based on the partly informal results of analyses. Owing to the design gaps that occur in the communication of design decisions, information about the actual work situations is lost. These gaps can be bridged, or at least narrowed, through the definition of methodologies for design. Extensive research work has been performed on the definition of methods for the analysis of information utilisation, domain specific design and interface modelling tools. Future research aims at increasing the granularity of methods for the entire process of information system development. Through further analysis of the design process, the possibilities for including knowledge about users in general, and about work, design and software development, are increased. Domain specific design increases the possibilities for efficient user participation. Well-defined roles and channels for easy cooperation eliminate unnecessary work and facilitate the preservation of knowledge gained from the different analyses. The result is decreased development times and lower costs, as well as improved human-computer interaction.

REFERENCES
1. Gulliksen, J. & Sandblad, B. (in press) Domain specific design of user interfaces - case handling and data entry problems. In D. Benyon & P. Palanque (eds.), Critical Issues in User Interface Systems Engineering. Springer-Verlag London Limited.
2. Nygren, E., Johnson, M., Lind, M. & Sandblad, B. (1992) The art of the obvious. Proceedings of CHI'92, Monterey, California, May 1992.
3. Gulliksen, J. & Sandblad, B. (in press) Domain specific design of user interfaces. Int. Journal of Human-Computer Interaction. Ablex Publ. Corp., Norwood, New Jersey.
4. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F. & Lorensen, W. (1991) Object-oriented modeling and design. Englewood Cliffs: Prentice-Hall.
5. Card, S. K. & Henderson, A. (1987) A multiple virtual-workspace interface to support user task switching. Proceedings of CHI+GI 1987, Toronto, Canada.
6. MacLean, A., Young, R. M., Bellotti, V. M. E. & Moran, T. P. (1991) Questions, options and criteria: elements of design space analysis. Human-Computer Interaction, 6, 201-250.
7. McKerlie, D. & MacLean, A. (1993) QOC in action: using design rationale to support design. INTERCHI'93 video program. Amsterdam: ACM.
8. Nielsen, J. (1993) Usability Engineering. Academic Press, San Diego.
9. Nielsen, J. & Mack, R. L. (1994) Usability Inspection Methods. John Wiley & Sons.
Integration of People, Technology and Organization: The European Approach
Christina Kirsch, Peter Troxler, Eberhard Ulich
Work & Organizational Psychology Unit (IfAP), Swiss Federal Institute of Technology ETH, Zürich, Switzerland
This paper presents the general outline of a new method, HITOP-D, for the integration and joint optimization of people, technology, and organization. The method is based on the existing American methods HITOP and ACTION and takes into account the specific European industrial context. In an iterative process, the preliminary design of a project is assessed according to a list of criteria covering the four aspects people, technology, organization, and task design. Incongruencies are resolved through a fit analysis and a redesign of the original project. The performance of HITOP-D will be evaluated empirically.
1. Integration of People, Technology, and Organization

Designers of processes and structures in industrial contexts are confronted with the growing need to address multiple aspects of a design project, such as task design, organizational design, people-related questions, and technical design. The work-psychological point of view is that only the joint optimization of technology, qualification and organizational structures can lead to the successful implementation of processes and structures (Ulich, 1994). In the government-funded CIM Action Program in Switzerland, the holistic conception of computer integrated manufacturing, encompassing the three aspects people, technology, and organization, is referred to as the MTO-Model ("Man - Technology - Organization"; Ulich, 1993). The focal point of the MTO approach is the functional division between man and machine. Work design, i.e. the task allocation in man-machine and man-man systems, is the basic question in the analysis and design of production systems. One instrument that furthers the proposed joint optimization of industrial projects is HITOP (Majchrzak et al., 1991). It primarily addresses the design of sociotechnical systems implementing new technologies. The use of the HITOP method in the analysis and design process reduces the probability that important aspects are neglected. The fit or congruency of the three aspects people, technology and organization is evaluated according to various sociotechnical criteria. Misfit or incongruency is resolved in mutual discussion by the HITOP team.
2. From HITOP™-91 to HITOP-D

In the original HITOP manual (Majchrzak et al., 1991), which was developed in the late 1980s, the implementation of advanced manufacturing technology was the predetermined starting point for performing the analysis.
Many of the problems in implementing new technologies indeed emerge from organizational or human resources aspects (Strohm et al., 1993). Furthered by the recession of the early 1990s, productivity enhancement programs in organizations started to change from technology-, i.e., automation-oriented strategies with high investment risks to strategies dealing with organizational or personnel features, e.g., implementing team work in production islands, or establishing a Total Quality Management (TQM) system. Small and medium-sized enterprises especially are more often confronted with organizational changes or changes in human resources. These are the reasons why in European industry there is a strong need to go beyond the limitations of the original HITOP manual. The revised HITOP-D version consequently treats each aspect with the same emphasis. HITOP-D can be used to realize the joint optimization of personnel, organizational or technology projects. With this approach, HITOP-D is more open than ACTION, an analysis program based on HITOP that was developed by Ann Majchrzak (Majchrzak & Finley, 1994). Implementing new technologies such as modern computer systems, artificial intelligence, and multimedia systems is highly sensitive to organizational and personnel issues. The application of HITOP-D in these cases remains a key success factor for achieving positive effects in the integration of people, technology, and organization. According to the basic concept of the sociotechnical system approach, the central focus is neither on personnel, on technology nor on organization, but on the fit or congruency of the various factors that constitute the production system. HITOP-D offers the option of starting at any chosen entry point - personnel, technology or organization.
2.1. Preliminary Steps for a HITOP-D Analysis

The first steps of a HITOP-D analysis consist of defining the goals and restrictions of the innovation project and of analyzing and describing the actual situation before changes are implemented. This analysis is based on the MTO-Analysis developed in the project GRIPS (Strohm & Ulich, 1995). As in HITOP-91, it is proposed to work with HITOP-D in a team. To provide the necessary integration of the different views on an innovation project, the future users, the various departments, and the different interest groups within the organization should be represented in the HITOP team. The team is led by one or two HITOP facilitators.

2.2. Entering the HITOP-D Analysis

HITOP-D offers, for each aspect to be considered - personnel, technology, organization, task design - a set of criteria to describe the relevant characteristics of the project. Each criterion is presented with a brief definition and guidelines for assessment. The assessment expresses how problematic each criterion is considered to be, rating it from 1 (highly problematic) to 5 (no problems expected). A set of questions allows the users to collect the relevant information. Definitions and assessment guidelines are clarified by additional practical examples. Additionally, the HITOP users have to determine for each criterion whether it describes an unchangeable or a changeable characteristic of the existing situation, in order to develop the conclusions for the design of the sociotechnical system. The next step of the HITOP-D analysis is to determine the desired entry point and reference point of the respective project and to describe and assess the
Critical Features (CF) of the project in terms of sociotechnical criteria for each aspect, i.e., task, personnel, organisation, and technology. This description uses the same criteria mentioned before.
$$
\begin{pmatrix} \mathrm{People}_0 \\ \mathrm{Technology}_0 \\ \mathrm{Organization}_0 \\ \mathrm{Task\ Design}_0 \end{pmatrix}
+ \Delta \begin{pmatrix} \mathrm{People} \\ \mathrm{Technology} \\ \mathrm{Organization} \\ \mathrm{Task\ Design} \end{pmatrix}
= \begin{pmatrix} \mathrm{People}_1 \\ \mathrm{Technology}_1 \\ \mathrm{Organization}_1 \\ \mathrm{Task\ Design}_1 \end{pmatrix}
\qquad (1)
$$
With these first steps accomplished, the HITOP analysis has arrived at a preliminary description of the innovation project, as expressed in (1), where the first vector, indexed 0, represents the existing situation; the last vector, indexed 1, the intended future situation; and the delta vector the planned change to the existing situation.

2.3. Assessing the Interference Between Different Aspects

The goal of the HITOP analysis is to achieve a fit of the four aspects - people, technology, organization, and task design - for the future situation. For this purpose, the interference between the four aspects of the suggested preliminary future situation - the fit - has to be analyzed.
Figure 1: Interrelationship matrices between the four aspects of the HITOP-D analysis - people, technology, organization and task design.

The fit is analyzed by evaluating the impact every aspect has on each of the remaining three aspects. HITOP uses the six impact matrices shown in Figure 1 to operationalize this step. For each matrix, HITOP provides guidelines - questionnaires and examples - indicating which mutual impacts are to be expected when comparing two aspects. The impact analysis ends by identifying critical aspect criteria for the suggested future solution.
$$
\mathit{fit}\begin{pmatrix} \mathrm{People}_1 \\ \mathrm{Technology}_1 \\ \mathrm{Organization}_1 \\ \mathrm{Task\ Design}_1 \end{pmatrix} \neq 1
\qquad (2)
$$
Usually the preliminary future design will not satisfy the proposed fit criteria, cf. (2). The next steps of the HITOP analysis will have to deal with the redesign of the intended future situation in order to achieve the fit between people, technology, organization, and task design.

3. Redefining the Future Situation: Iterating the HITOP-D Process
The discussions during the preliminary description of the future situation, and especially the first fit analysis, usually produce plenty of ideas and questions as to how the project could be changed to obtain better results according to the applied criteria. In the following iterative steps these ideas are systematically integrated by redefining the intended future situation and reassessing the fit.

$$
\begin{pmatrix} \mathrm{People}_0 \\ \mathrm{Technology}_0 \\ \mathrm{Organization}_0 \\ \mathrm{Task\ Design}_0 \end{pmatrix}
+ \Delta_n \begin{pmatrix} \mathrm{People} \\ \mathrm{Technology} \\ \mathrm{Organization} \\ \mathrm{Task\ Design} \end{pmatrix}
\rightarrow \begin{pmatrix} \mathrm{People}_n \\ \mathrm{Technology}_n \\ \mathrm{Organization}_n \\ \mathrm{Task\ Design}_n \end{pmatrix}
\qquad (3)
$$

$$
\mathit{fit}\begin{pmatrix} \mathrm{People}_n \\ \mathrm{Technology}_n \\ \mathrm{Organization}_n \\ \mathrm{Task\ Design}_n \end{pmatrix} = 1
\qquad (4)
$$
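Read procedurally, (3) and (4) describe an assess-redesign loop. The following minimal sketch is our illustration only: the 1-5 ratings stand in for the HITOP-D criteria assessments, and the fit test and redesign step are deliberate simplifications of the team's impact-matrix work.

```python
# Sketch of the HITOP-D iteration (3)-(4); the fit test and redesign
# step are illustrative simplifications, not the authors' instrument.

ASPECTS = ["people", "technology", "organization", "task_design"]

def fit(design: dict[str, int]) -> bool:
    """Toy fit test: every aspect rated unproblematic (>= 4 on the 1-5 scale)."""
    return all(design[a] >= 4 for a in ASPECTS)

def redesign(design: dict[str, int]) -> dict[str, int]:
    """Stand-in for the HITOP team's redesign of the most critical aspect."""
    worst = min(ASPECTS, key=lambda a: design[a])
    return {**design, worst: design[worst] + 1}

existing = {"people": 2, "technology": 4, "organization": 3, "task_design": 3}
future = dict(existing)                      # preliminary future situation
while not fit(future):                       # iterate (3) until (4) holds
    future = redesign(future)

delta = {a: future[a] - existing[a] for a in ASPECTS}   # the necessary changes
```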
After several iterations, a future situation will be found with a high degree of fit or congruency among the four aspects people, technology, organization, and task design. The new delta vector contains the necessary changes to the existing situation. From this basis an action program can be derived. The whole HITOP process takes - depending on the scope of the project - three to five days, including three or four iterations.

4. Evaluating the HITOP-D Method
The hypothesis is that the use of the HITOP-D method in design projects ensures that the relevant aspects are taken into account, whereas projects not using HITOP-D tend to neglect relevant PTO (people, technology, organization) aspects. Numerous instruments, methodologies, and tools are available that claim to improve the performance and efficiency of production design and of the implementation of new technologies. Most of these instruments merely claim to have an effect, without giving any empirical evidence of objectively attained improvements. Therefore an empirical evaluation of the pilot studies is conducted to determine: (a) the impact of HITOP-D on the design process: process evaluation; (b) the impact of HITOP-D on the outcomes of the design process: cost-benefit analysis, product evaluation and goal attainment scaling.

4.1. HITOP-D Outcomes Evaluation
With a first instrument - the HITOP-D pre-evaluation survey - the project structure, input (costs, duration) and goals of the project are analyzed at the beginning of the project. At the end of the project, goal attainment, problems, positive effects and general satisfaction with the HITOP-D method are analyzed with a similar instrument - the HITOP-D post-evaluation survey. By comparing the results from these two datasets with datasets from other projects not supported by HITOP-D, it can be determined how powerful an instrument HITOP-D is, and clear evidence can be given of what performance improvements are possible with the use of HITOP-D.

4.2. HITOP-D Process Evaluation
Additionally, during the design process a diary survey is conducted as a brief longitudinal study. After each project meeting the participants complete a questionnaire concerning the current task of the project meeting, the method, problems and satisfaction within the project. We investigate how the HITOP-D projects proceed and what kinds of turning points and problems occur. From the results of these pilot studies we will be able to develop project management and facilitator guidelines, making HITOP-D even more user friendly and easy to use.

References
Majchrzak, Ann et al. (1991). Reference Manual for Performing the HITOP™ Analysis. Ann Arbor: Industrial Technology Institute.
Majchrzak, Ann & Finley, L. (1994). Extending the Concept of Fit to a Theory of Sociotechnical Tradeoffs. Paper presented at the Fourth International Conference on Management of Technology, Miami, FL, March 1994.
Strohm, O. et al. (1993). Integrierte Produktion: Arbeitspsychologische Konzepte und empirische Befunde. In: G. Cyranek & E. Ulich (Eds.), CIM - Herausforderung an Mensch, Technik, Organisation (pp. 192-140). Schriftenreihe Mensch-Technik-Organisation (Ed. E. Ulich), Band 1. Zürich: Verlag der Fachvereine; Stuttgart: Teubner.
Ulich, E. (1993). CIM - eine integrative Gestaltungsaufgabe im Spannungsfeld von Mensch, Technik und Organisation. In: G. Cyranek & E. Ulich (Eds.), CIM - Herausforderung an Mensch, Technik, Organisation (pp. 29-43). Schriftenreihe Mensch-Technik-Organisation (Ed. E. Ulich), Band 1. Zürich: Verlag der Fachvereine; Stuttgart: Teubner.
Ulich, E. (1994). Arbeitspsychologie, 3rd edition. Zürich: Verlag der Fachvereine; Stuttgart: Schäffer-Poeschel.
Strohm, O. & Ulich, E. (Eds.) (1995). Ganzheitliche Betriebsanalyse unter Berücksichtigung von Mensch, Technik und Organisation. Vorgehen und Methoden einer Mehr-Ebenen-Analyse. Schriftenreihe Mensch-Technik-Organisation (Ed. E. Ulich), Band 10. Zürich: Verlag der Fachvereine; Stuttgart: Teubner.
Dynamic changes of human systems under a simple task of HCI
Mariko Fujikake Funada a, Satoshi Suzuki b, Takao Tanaka b, Yusuke Yazu b, Kyoko Idogawa c, Chieko Hukuda a, Satoki P. Ninomija b
a Department of Management, Hakuoh Univ., 1117, Daigyoji, Oyama-shi, Tochigi 326, Japan
b Faculty of Science and Engineering, Aoyama Gakuin Univ., 6-16-1, Chitosedai, Setagaya-ku, Tokyo 157, Japan
c Seitoku Univ., Matsudo, Chiba, Japan
1. INTRODUCTION

Since HCI (Human-Computer Interaction) is a field which treats methods of communication between computer systems and human systems controlled by brains, the true characteristics of human systems should be made clear in order to devise suitable HCI methods. Three approaches can be considered to get near this truth. The first is to study the active brain under an HCI task, because the brain is the CPU of the human system. The second is to study the relations between the states of human systems and the input-output (I/O) responses of those systems. In this case, the states of human systems are the activity of the brain, cardiac rhythm, etc.; inputs are, for example, an indication displayed on a CRT or a computer voice message; and outputs are the behavior of human beings under the influence of the computer system. The third approach is to study only the I/O responses of human systems, treating the human system as a black box. This order of approaches is reasonable, because the brain controls and determines all behavior of a human system, and the total states of human systems determine the I/O responses of those systems. We have analyzed the brain under an HCI task and report the results in another presentation at this conference (1). In this paper, our position is at the second approach, and our purpose is to analyze both the states of human systems and the I/O responses of human systems under a simple HCI task. We selected grouped α waves of EEGs, cardiac rhythms, and numbers of winks to monitor the states of human systems, because grouped α waves appear when human beings fall, or are falling, into low-awareness conditions (2), and cardiac rhythms and numbers of winks are reported to be deeply connected with levels of mental stress or wakefulness. We then devised a simple HCI task, which is to input the same character as that displayed on a CRT display, and defined several kinds of variables to show the efficiency of the I/O
responses. From the results of the analysis of these variables, we extract the characteristics of human systems concerning the relations between the states of the systems and their I/O responses, and then consider what properties are needed for suitable future HCI methods.
2. METHODS

2.1. Object data
Object data were measured from five normal male students under the following task:
(1) Task: Push the same key as the numeric character displayed on a CRT display, as fast as possible. The displayed numeric character is one of 1-9, and the display sequence is random. The next character is displayed 0.5 msec after a correct answer is input. The task is continued for one hour.
We defined the following variables to analyze the relation between brain activity and the I/O responses of the human system:
(2) Defined variables:
N: number of numeric characters displayed in one minute.
E: ratio of the number of error inputs to the total number of inputs in one minute.
CR: mean of the periods from the time when a character is displayed to the time when a correct answer is input, over one minute.
R: mean of the response times from the time when a character is displayed to the time when the first character is input, over one minute.
A: time length during which grouped α waves appear, per minute.
C: cardiac rhythm per minute.
W: number of winks per minute.
LW: number of winks whose duration is greater than or equal to 0.5 seconds.
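As an illustration only (the paper gives no implementation), the per-minute variables N, E, CR and R could be computed from a log of display and keypress events along these lines; the event-log format and all names here are hypothetical:

```python
# Illustrative computation of the per-minute variables N, E, CR, R
# from a hypothetical event log; not the authors' actual procedure.

def minute_metrics(events):
    """events: list of dicts with 'shown_at', 'first_key_at',
    'correct_at' (seconds) and 'errors' (count) for one minute."""
    n = len(events)                                            # N
    total_inputs = sum(1 + e["errors"] for e in events)
    e_ratio = sum(e["errors"] for e in events) / total_inputs  # E
    cr = sum(e["correct_at"] - e["shown_at"] for e in events) / n    # CR
    r = sum(e["first_key_at"] - e["shown_at"] for e in events) / n   # R
    return {"N": n, "E": e_ratio, "CR": cr, "R": r}

minute = [{"shown_at": 0.0, "first_key_at": 0.6, "correct_at": 0.6, "errors": 0},
          {"shown_at": 1.1, "first_key_at": 1.8, "correct_at": 2.4, "errors": 1}]
print(minute_metrics(minute))
```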
2.2. Analytical methods
We analyze the variables defined above by the following methods:
(1) representation of the variables as time series data;
(2) calculation of the mutual correlations among the variables;
(3) calculation of differences of the variables, repeating the same calculation ten times;
(4) smoothing by the moving average method;
(5) drawing graphs in a three-dimensional space constructed from three of the variables.
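For concreteness, methods (3) and (4) might look as follows; the use of first differences and the centered window are our assumptions (the paper states only "difference ... repeated ten times" and "moving average", and mentions 11-point smoothing for Figs. 5 and 6):

```python
# Sketches of methods (3) and (4); parameters are our assumptions.

def difference(x, times=10):
    """First differences, applied repeatedly (method 3)."""
    for _ in range(times):
        x = [b - a for a, b in zip(x, x[1:])]
    return x

def moving_average(x, window=11):
    """Centered moving average with an 11-point window (method 4)."""
    half = window // 2
    return [sum(x[i - half:i + half + 1]) / window
            for i in range(half, len(x) - half)]
```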
3. RESULTS

Fig. 1 is an example of the time series representation of the measured variables. The horizontal axis is time and the vertical axis shows the values of the defined variables. The changes of the variables include short and comparatively large frequencies. Fig. 2 is another example of a time series of measured variables. The amount of α waves in Fig. 2 is smaller than that included in Fig. 1; the amounts of grouped α waves differ among individuals. All our data therefore divide into these two types: one includes a large amount of α waves and the other only a few.

Fig. 3 is an example of the mutual correlations among the variables for the data of Fig. 1. Both the horizontal and the vertical axes correspond to the defined variables. The variables "N", "CR", and "C" have large correlations among them.

Fig. 4 is an example of the differences when the calculation is repeated ten times. The horizontal axis is time, and the vertical axis shows the values of the differences of the respective variables. The data used are the same as those of Fig. 1. Resembling patterns appear in the "R", "A", and "W" variables; the relations among "R", "A", and "W" are deeper than the other relations among the variables.

Fig. 5 and Fig. 6 are examples of smoothing the data of Fig. 1 and Fig. 2. The tendencies included in these data become clearer.

Fig. 7 and Fig. 8 are graphs in a space constructed from the variables "A", "H", and "R". The horizontal axis is the variable "A", the depth of the figure corresponds to "H", and the vertical axis is "R". The dotted line is the locus of the variables "A" and "H" over time. The loci are not the same and the tendencies are also different, but the changes of the loci are not random; there seem to be some rules. Fig. 9 is a graph in the ("A", "W", "R") space.

Fig. 1: An example of original data. N-LW: variables defined in section 2.1. (Horizontal axis: time, 0-50 min.)
Fig. 2: An example of original data. N-LW: variables defined in section 2.1.
Fig. 3: An example of mutual correlation. N-LW: variables defined in section 2.1. The data are the same as those used in Fig. 1.
Fig. 4: An example of the differences calculated by 10 repetitions. The data are the same as those used in Fig. 1.
Fig. 5: The smoothed data of Fig. 1 (11-point smoothing).
Fig. 6: The smoothed data of Fig. 2 (11-point smoothing).
Fig. 7: Changes of grouped α waves "A", cardiac rhythm "H", and response time "R". The height of the line represents the length of the response time. The data in Fig. 5 are used.
Fig. 8: Changes of grouped α waves "A", cardiac rhythm "H", and response time "R". The height of the line is the length of the response time. The data in Fig. 6 are used.
Fig. 9: Changes of grouped α waves "A", winking "W", and response time "R". The height of the line is proportional to the value of response time "R".

4. CONSIDERATIONS

The purpose of this paper is to make clear the relation between the states of human systems and the I/O responses of those systems. The defined variables divide into two classes: one indicates the characteristics of the I/O responses, and the other shows the states of the human system. The variables "N", "E", "CR", and "R" are the former, and "A", "H", "W", and "LW" are the latter. Since the variables "N", "CR", and "R" have great mutual correlation, we select "R" to indicate the character of the I/O responses and consider the relations between "R" and the other variables describing the states of human systems. One of the major characteristics of human systems is
that the states are unstable and dynamically changing. Another characteristic is the large difference among individuals. But the changes do not occur at random; there are some tendencies or rules, as observed in Fig. 7 - Fig. 9. Since the variable "R" has greater mutual correlation with the variables "A", "H", and "W" in smoothed data such as Fig. 5 and Fig. 6, we select "A", "H", and "W" to represent the states of human systems. Though the selected variables move in the space defined by those variables, the movements are not random and show some tendencies. These tendencies are related to the variable "R". If we cut the space defined by the "A" and "W" (or "H") variables, as in Fig. 7 - Fig. 9, following the values of "R", we obtain a function from the ("A", "H") or ("A", "W") space to the "R" space. This function combines the states of human systems with the I/O responses of the system, and so represents the relations between the states of human systems and the I/O responses. The changes of the states of human systems, and the existence of such functions, indicate that there are two possible approaches to designing a method suitable for HCI: one is to keep the state in the faster-response area; the other is to let the response of the computer correspond to the fluctuating state of the human system. The former HCI method is suitable for important and significant HCI work; the latter realizes a comfortable HCI. In each case, a flexible or fluctuating mechanism, which follows and checks the changes of the states of human systems, is required for future HCI. In order to realize such flexible and fluctuating HCI, studies of the characteristics of human systems are important.
5. CONCLUSIONS

From the analyses and considerations, the conclusions of this paper are the following:
(1) We measured the I/O responses of human systems, and states of human systems such as amounts of grouped α waves, cardiac rhythms and winking, under a simple HCI task.
(2) All the measured data change dynamically.
(3) It is possible to represent the response of human systems as a function of factors indicating the states of human systems.
(4) Amounts of grouped α waves, cardiac rhythms, and numbers of winks are good factors for determining the states of human systems.
(5) The results of (4) show the importance of studies of human states in designing a method suitable for future HCI.
(6) Flexibility or fluctuation is one of the important characteristics of future HCI.
REFERENCES
(1) Mariko F. Funada, Satoki P. Ninomija, et al.: Analysis of brain activity for HCI. HCI in Yokohama, 1995 (in printing).
(2) Chieko Hukuda, et al.: A study about shifting time to low awakening condition on monotonous VDT works. HCI in Yokohama, 1995 (in printing).
Temporal organisation of human centred systems
V.A. Chernomorets a and S.V. Kirpich b
a Institute of Cybernetics, Ukrainian Academy of Sciences, 252028 Kiev, UKRAINE
b Institute of Engineering Cybernetics, Academy of Sciences of Belarus, 220012 Minsk, BELARUS*

Abstract: In solving problems of analysis and synthesis of complex systems, in particular decision making in these systems, it often becomes necessary to consider the integrity (systematization) criterion, thereby yielding the emergent effect. Here the integrity criterion is a defined ratio of components or processes in a system, the relationships between which are characterised by organisation, harmony, and subordination to some proportion [1-5]. For example, it is known that in the visual perception of an object the perception strategies are oriented to regularities of the psychophysiology of vision that allow for the Fibonacci numbers. The present paper considers the strategy of the integral behaviour of a man performing discrete activity in time. Optimising the behaviour consists in subdividing the time interval of a discrete action (i.e., a quantum) into subintervals (operating period and activity slack). In this case, the subdivision of the time interval objectively obeys the Fibonacci series. Such a conclusion, grounded theoretically and proved experimentally, makes it possible to model and optimise strategies of a man's activity in accordance with his psychophysiological characteristics by using the integrity criterion.
1. INTRODUCTION

Decision making under conditions of uncertainty is a rather common kind of information management activity of a man in complex systems, viz., in man-machine systems, human-centred systems, etc. The present work is devoted to the investigation of this kind of activity, concerned with the processing of discrete signals by a man-operator and the analysis of

* Mailing address: Nikiforov Str. 7-160, 220141 Minsk, BELARUS. Fax: 7 (0172) 34 15 54; E-mail:
[email protected] or kirs%avtlab.itmo.minsk.b~sr.eu.net
the quality of the management activity in the course of control and maintenance of a complex system. The input data for a man-operator under such conditions of information management activity contain different uncertainties pertaining, for instance, to problems, methods or data. Such lack of information is either objective or due to economic restrictions. In making decisions under conditions of information uncertainty (or lack), it is, as a rule, presumed that a man fills the gap in the data with his personal experience, qualification, the current situation, etc. in mind, and with use of the integrity principle. The strategies of decision making by a man and the temporal parameters of his activity are the goal of the present research.

2. STRATEGIES OF TEMPORAL ORGANISATION OF SYSTEMS

We will consider the strategies of a man-operator's activity with regard to (a) external limitations on the time of signal processing; (b) the duration of signal processing; (c) the individual psychophysiological data and motivations of a man, i.e. the signal repetition period $T$ and the processing time $\tau$ of such signals. We now pass to the dependence $\tau = f(T)$. A decision is made by a man-operator within the limits $t_{\min} < T < t_{\max}$, where $t_{\min}$ ($t_{\max}$) is the minimum (maximum) time interval for the man to perform any kind of activity. The time required for a man to take a decision during the processing of the $i$-th discrete signal is $T_i = \tau_i + R_i$, where $T_i$ is the $i$-th signal repetition period, $\tau_i$ is the time of the $i$-th signal processing, and $R_i$ is the activity slack after termination of the processing of the $i$-th signal and before starting to process the next, $(i+1)$-th, signal. Analysis of the relations $a_i = T_i/R_i$ and $b_i = R_i/\tau_i$ reveals that $a = f(T, R)$ is a monotonically increasing function while $b = f(R, \tau)$ is a monotonically decreasing one. Then for an arbitrary fixed $T_i$ there exists a unique value $\tau_i^*$ at which $a_i = b_i$. In this case the equality

$$\frac{T_i}{R_i} = \frac{R_i}{\tau_i} \qquad (1)$$

is valid, whence it follows that at $\tau_i = \tau_i^*$ the activity slack $R_i^*$ is

$$R_i^* = \frac{1+\sqrt{5}}{2}\,\tau_i^* = (1.618\ldots)\,\tau_i^* = \alpha\,\tau_i^*$$
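The golden-section value follows from (1) together with $T_i = \tau_i + R_i$; the intermediate steps, which the text omits, are as follows (our reconstruction):

```latex
% Derivation of R_i^* = alpha * tau_i^* from (1) and T_i = tau_i + R_i
\[
\frac{T_i}{R_i} = \frac{R_i}{\tau_i}, \qquad T_i = \tau_i + R_i
\;\Longrightarrow\;
\frac{\tau_i + R_i}{R_i} = \frac{R_i}{\tau_i}.
\]
Let $x = R_i/\tau_i$. Then $1 + \tfrac{1}{x} = x$, i.e. $x^2 - x - 1 = 0$, whose
positive root is
\[
x = \frac{1+\sqrt{5}}{2} = 1.618\ldots = \alpha,
\qquad\text{so}\qquad
R_i^* = \alpha\,\tau_i^*, \quad
T_i = (1+\alpha)\,\tau_i^* = \alpha^2\,\tau_i^*, \quad
\tau_i^* = \alpha^{-2}\,T_i .
\]
```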
In view of the fact that for numbers of the type $\alpha$ the relation $1 + \alpha = \alpha^2$ is fulfilled, $\tau_i^* = \alpha^{-2} T_i$ at $\alpha^2 t_{\min} \le \ldots$

1> decisions which are based on our understanding of the world as given in knowledge and historical information, which lead to the formation of plans and designs; 2> decisions which are based on our understanding of the world as given by our on-hand experience and encounter with the actual flow of the world, which lead to adjustments and adaptations in practice (this is a form of the Greek "phronesis"). The former can be called "rational-analytical decisions", the latter "intuitive-creative decisions" - decisions which are directed to the open world or actual world and must take account of emergent factors. What needs to be stressed is that even in "rational-analytical decisions" one needs to have ontogenetic considerations in order to arrive at a relevant and effective conceptual scheme as a representation of the world as already given with a historical background. Thomas L. Saaty's "Analytic Hierarchy Process (AHP)" (in his book Multicriteria Decision Making: The Analytic Hierarchy Process, Pittsburgh, 1988) represents an expansion of rational consciousness, basically using logical and scientific principles as guiding principles for the categorization and subcategorization of the world. It does not take account of the ontogenetic process and structure of the I Ching as its model and foundation, as it should. For otherwise we do not have any reason to hold to this process as a decision-making process, or even to come to an
understanding of the process. This process must be ontogenetically understood so that we can see why it naturally and necessarily comes to be so. Besides, how this process can function as a decision-making process requires the understanding of the axiogenetic process as a counter-image of the ontogenetic process. Finally, we need to understand the nature of decisions and decision-making as a process which has an objective reference in the world. It is not a relativistic nor a subjectivistic projection of whimsical desires and perceptions. Hence we need to go to the foundational or transcendental problem of categorization and subcategorization, based not on a prior rationality but on a comprehensive observation of the world across epochs of time. The I Ching ontogenesis of yin-yang differentiation and integration provides such a foundation and justification, rooted in comprehensive observation (guan) of the world across epochs of time. The weighing of values also requires the axiogenetic model for making sense of evaluation and for justification. Finally, one must not forget that decisions involve applying and satisfying multiple criteria. How multiple criteria are to be identified, formulated and then integrated or ordered to achieve the optimum completion of maximum value is the essence of decision-making. The basis for such multi-criterial fulfillment must be found in an ontogenetic theory of the integration of differences and of the integration of reasons for such integration. It is in light of the I Ching ontogenesis that we can see the beginning of an answer to this question. One may think that one can identify anything's position or place in the scientific scheme or map of the world. But a scientific scheme or map of the world is given only in science. Our commonsense worldview, on the other hand, is a manifest image which presents the world as we sense it, but which nevertheless also allows scientific elaboration and transformation, just as the phenomena of sunrise and sunset allow us to understand that "the earth rotates from West to East". This means that our commonsense worldview (or, for that matter, our commonsense language or language games) is a metaworldview or metalanguage from which science derives its meaning under pre-designated scientific conditions. In this sense science need not contradict our commonsense experience but on the contrary must thrive on the openness of the commonsense world and commonsense language. It must also depend on the specificity of purposes and conditions under which science would be adequate to serve and observe. Now, just as we may conceive science as an elaboration of commonsense, one may also conceive commonsense as an elaboration of our perception of a world of change, to be spelt out in the language of yin and yang. Why the language of yin and yang? Because the commonsense worldview and language must be based on the ontogenetic process and structure described above. Once decisions are properly understood in the context of the ontogenetic and axiogenetic processes and structures, we can see how negotiations, as decisions based
on two or more persons' efforts to make decisions, are based on, or capable of being based on (or referred to), the I Ching processes and structures of ontogenesis and axiogenesis. What is available and useful as resources and insights in the I Ching ontocosmology of ontogenesis and axiogenesis can be applied in negotiations. By the same token, conflict resolutions can be regarded as decisions to be made for and on behalf of two or more disagreeing and conflicting parties, the process and structure of which can again be represented in the ontogenetic and axiogenetic processes and structures, like individual decision making or personal or group negotiations. Both negotiation and conflict resolution can thus be seen as decision-making processes which involve ontogenetic or axiogenetic resettings or reinitializations of the world, which are ontogenetically or axiogenetically parts of the world as the taiji and the dao.
Philosophy of Unity in Diversity - The Dance of Quantum and the I-Ching's Symbol
Thomas In-sing Leung
C.R.R.S., 200-8140 Leslie Road, Richmond, B.C., Canada V6X 3W1
In 1952, David Bohm published a paper in Physical Review [1] in which he suggested the use of the concept of a hidden variable for interpreting the puzzles of quantum mechanics. He maintains that the particle and its movements are basically produced by an underlying wave motion. The universe is somewhat like a body of water: the wave packets become the phenomena of particles. They constantly unfold out of the body of the wave and enfold into it again. He introduces the term quantum potential to explain the essential features of quantum mechanics through a sub-quantum-mechanical level. There are enfolding and unfolding processes through which information comes in and out of the particles. This is the early causal interpretation of David Bohm; he then developed new ideas later. In his 1975 paper [2] and his 1980 book Wholeness and the Implicate Order [3], David Bohm contends that an implicate order lies behind all phenomena and forms the wholeness of the universe. He says that "inseparable quantum interconnectedness of the whole universe is the fundamental reality, and that relatively independently behaving parts are merely particular and contingent forms within this whole." [4] He also introduces the term "implicate order" to explain the enfolding and unfolding processes of the quantum potential. He summarizes his thought in his 1985 paper Hidden Variables and the Implicate Order: "the causal interpretation of quantum mechanics and the implicate order are two aspects of one comprehensive notion. This can be described as an overall implicate order, which may extend to an infinite number of levels, and which objectively and self-actively differentiates and organizes itself into independent subwholes, while determining how these are interrelated to make up the whole." [5] In 1993, he proposed an ontological interpretation of quantum mechanics in his book The Undivided Universe [6], in which an ontology of quantum mechanics is developed. It becomes a philosophy that can be compared to other philosophical systems.
Some basic concepts in the ontology of David Bohm are quite close to the philosophy of the I Chuan, the appendix that provides the orthodox interpretation of the I Ching. The I Chuan proposes a process ontology with the concepts of I and Tai-chi. I in Chinese originally means change. The universe is perceived as always changing and transforming. The changing process is determined by a fundamental creativity (sheng). It says, "The greatest virtue of heaven and earth is creativity"; "creative creativity" (sheng-sheng) is called I. Then the concept Tai-chi (the Great Ultimate) is introduced: "In the I, there is the Tai-chi. It creatively manifests the two modes, the two modes creatively manifest the four hsiangs (symbols or images), and the four hsiangs manifest the eight trigrams." The Tai-chi is the unique principle of I; it is the ultimate creativity that makes possible the manifestation of the different symbols - images and trigrams - that is, the forms and beings of the world. In the I Chuan's ontology, the world is an inseparable whole. The many are manifested from this holistic Great Ultimate. On the other hand, the Great Ultimate also manifests the symbols. They are the trigrams and hexagrams of the I Ching that form the way for understanding the many in the one. The symbols are the information manifested from the one, and they provide a ground for understanding. The I Chuan also introduces the terms enfolding and unfolding and uses them to describe the category of change. It says, "The alternation between enfolding and unfolding is called change; the going forward and backward without ceasing is called penetration." All phenomena are manifested from the Great Ultimate through the processes of change and penetration, that is, enfolding/unfolding and going forward/backward. One can understand these processes through the symbols, that is, the information which comes in and out of the Great Ultimate. The I Chuan ontology is a philosophy of wholeness. The ontologies of David Bohm and the I Chuan are so similar that they can open a dialogue between contemporary scientific philosophy and the ancient Asian philosophy of wholeness. This paper is a new attempt to bridge old Chinese philosophy and the new interpretation of scientific phenomena.

REFERENCES
1. David Bohm, "A Suggested Interpretation of the Quantum Theory in Terms of Hidden Variables", Physical Review, 85, 166, 180 (1952).
2. David Bohm and B.J. Hiley, "On the Intuitive Understanding of Nonlocality as Implied by Quantum Theory", Foundations of Physics, Vol. 5, No. 1, 1975.
3. David Bohm, Wholeness and the Implicate Order (London: Routledge & Kegan Paul, 1980).
4. Same as note (2), p. 102.
5. This paper was first published in Zygon, Vol. 20, 111 (1985); later published in Quantum Implications, ed. B.J. Hiley and D. Peat (London: Routledge, 1987), p. 44.
6. David Bohm and B.J. Hiley, The Undivided Universe (London: Routledge, 1993).
The I Ching and Non-Linear Mapping: A Meta-Binary Approach to Reflective Choice, Decision-Making, and Hierarchical Information Systems
M. Secter
Department of Communications, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
Institute of Asian Research, Center for Chinese Research, University of British Columbia, Vancouver, B.C., Canada V6T 1Z2
A Special Session" The I Ching (Eki Kyo ~ ~;~ v~L ) and Modern Science
1. DISCLOSING A UNIFIED BINARY COMMUNICATIONS SYSTEM

1.1 Introduction
This paper proposes a meta-binary, non-linear architecture for mapping and managing multi-logical systems and interactive virtual learning environments. It is an original redaction of a binary-generated, six-dimension hypercube, inspired by two identical but apparently unrelated models: one from modern biochemistry and the other from ancient Chinese philosophy. The first is DNA, whose sixty-four codons embody the binary coding for life itself. The second is the I Ching, a 3000-year-old Chinese text comprised of sixty-four gua or hexagrams: aphorisms accompanied by binary symbols which objectify the coding for the dynamics of change and decision theory. Both DNA and the I Ching have sixty-four identical sets, each set identified by a binary codon whose numerical value ranges from 0 to 63. And both models apparently lack the context of a formal coherent structure to legitimize or establish the binary numbers as sets in an authentic system. This hypercube construction not only provides a matrix for multi-dimensional operations but serves as a device for quantifying qualitative information and conditions. As such it supports cognitive ergonomics and human-computer interaction.
1.2 A Radical Alternative for Rethinking Information Systems

Such a meta-binary machine as this proposed six-dimension hypercube matrix can function as a server for: multi-logical information processing and object-oriented software architecture [1]; interactive learning environments and computerized landscape metaphors [2]; and multi-criteria, ontogenetic-axiogenetic, analytic decision hierarchies [3] in human-computer interaction. Re-examining the similarity between DNA and the I Ching, my self-appointed task was 'to identify the underlying system that would represent the 64 binary sets in a cohesive model while complying with or satisfying the conditions and constraints inherent and explicit in the traditional I Ching literature.' The establishment of such a structure suggested a method for the rational, non-linear integration of binary numbers. This in turn implied a new approach to thinking about communication theory, information systems, and amino acid formulation. It also suggested a meta-binary information processing schema; that is, a system that informs rules which allow binary data to proceed simultaneously along more than one path direction from a particular binary position, set, mode, or field. The inherent implication is a meta-binary machine, based on conventional binary numbers, rules, and logic, that transcends the constraints and limitations established by decision trees and conventional hierarchy models.

In the same way that the four nucleotides (00, 01, 10, 11) recombine to make up the sixty-four (64) codons from which all life is formed, eight subsets or gua (000, 001, 010, 011, 100, 101, 110, 111) recombine to make up the sixty-four six-digit binary sets (Gua) from which all else can theoretically be derived. In both the I Ching and DNA the sixty-four sets are considered to be the primary states, sets or fields. Higher levels of definition, refinement, or 'possible state' options are achievable by means of the hierarchical nesting of additional systems within both the initial and the ending primary states.
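A minimal sketch of this recombination; the convention that the first subset supplies the upper three digits is our assumption, not the author's:

```python
# Enumerate the 64 six-digit sets (Gua) from the eight 3-bit subsets (gua).
# The upper/lower pairing order is an illustrative assumption.

gua = [f"{i:03b}" for i in range(8)]         # '000' .. '111'

Gua = {}                                     # codon value (0-63) -> six-digit set
for upper in gua:
    for lower in gua:
        bits = upper + lower                 # recombine two subsets
        Gua[int(bits, 2)] = bits

assert len(Gua) == 64
print(Gua[0], Gua[63])                       # 000000 111111
```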
1.3 The Basic Issues or Questions to be Addressed

a) how to establish the rules for constructing a system.
b) how to establish the rules for applying or utilizing the system.
The first part of the paper outlines the development of the mathematical model. Specifically, the algorithms that disclose it are described. The model emerges as one in which binary numbers become established as three-digit subsets which recombine to stabilize as six-digit sets. The sixty-four sets cohere into six subsystems that consolidate into a single system. This could be characterized as a neural network that embodies certain properties that might be expected of a unified theory for a six-dimension system or universe. The neural network configuration raises a further implication: employing a relatively low-level or simple non-linear hierarchical matrix, it is possible to generate multi-dimensional, non-parallel, linear interconnections. With a relatively small number of total sets, a vast number of decision paths or information processes is possible. From 64 codons all life emerges. The second part of the paper applies the methodology both to six-dimension information architecture and to the establishment of qualitative contextualizing for problem-solving and decision-making. The latter consists primarily of reframing operations in which cognitive processes and qualitative conditions are converted into subsets. Because each subset is accorded an array of qualitative characteristics, it becomes feasible to define conditions accordingly. Subsets are recombined (into pairs) and converted into six-digit binary values. These become the 64 categorized sets in the system. With the help of object-oriented menus, the process of converting qualitative conditions into two primary subsets becomes relatively manageable. This indirect approach establishes a more precise correspondence between a condition and one of the sixty-four sets. At the same time, the process reduces partiality, expectation, bias, and misinterpretation.
1.4 An Original Six-Dimension Hypercube Information Matrix
The system is based on the rational integration of sixty-four interconnected binary sets into a non-linear matrix that supports simultaneous information processing at all six binary positions or levels. Each level denotes one metadimension. In addition there are three default transformation states, each specifying a class of uncertainty: progressive inversion; revision; and binary progression. What emerges is a binary continuum between the system, the user, and the interface that more closely approximates real thought processing. This paper does not suggest that the original authors of the I Ching created, understood, or were even aware of the theory or system that I am proposing. However, it is possible that its inherent cohesiveness, its internal character, was intuitively grasped or acquired after years of study and work with it. In the same way that the gua sets or hexagrams suggest or imply an underlying system, so do the sixty-four sets of amino acids (defined as binary codons) suggest or imply an underlying system to which they belong, and whose rules inform and govern their existence and interrelationships. It appears evident that the sixty-four sets of the hypercube binary system organize themselves into six subsystems or energy continuums. It may be worth asking whether: a) all six-dimension systems such as DNA share this subsystem organizational feature; and b) we are perhaps living in a six-dimension universe, in which event there might be a correspondence between the subsystem structures and the six primary energy forms in physics. At a minimum, this model demonstrates that communication theory and genetics share identical binary systems. The importance of this cannot be overstated. We believe this model will have applications in interactive architecture and object-oriented, interactive software.

[1] Greg Tropea (HCI '95, this session)
[2] David Smith (HCI '95, this session)
[3] Chung-ying Cheng (HCI '95, this session)
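As a closing sketch of the matrix mechanics described in section 1.4: in a six-dimension hypercube each six-bit set adjoins six neighbours (one per binary level), and the three default transformation states can be modelled as operations on a set. The concrete operations below - bitwise inversion, line-order reversal, and +1 modulo 64 - are our guesses at 'progressive inversion', 'revision', and 'binary progression', not the author's specification:

```python
# Illustrative six-dimension hypercube operations on six-bit sets.
# The mapping of the three transformation states to these operations
# is an assumption, not the paper's specification.

def neighbors(s: int) -> list[int]:
    """The six adjacent sets: flip exactly one of the six binary levels."""
    return [s ^ (1 << k) for k in range(6)]

def inversion(s: int) -> int:
    """Flip all six lines (bitwise complement within 6 bits)."""
    return s ^ 0b111111

def revision(s: int) -> int:
    """Reverse the order of the six lines."""
    return int(f"{s:06b}"[::-1], 2)

def progression(s: int) -> int:
    """Advance to the next set in binary order, wrapping 63 -> 0."""
    return (s + 1) % 64

s = 0b101100                                        # set number 44
print(neighbors(s))                                 # six one-bit moves
print(inversion(s), revision(s), progression(s))    # 19 13 45
```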
Exploring Self-Developing Models in Computerized, Interactive Learning Environments
D. A. Smith
Hebrew University, 28 Bezalel Street, Jerusalem 94501, Israel
1. THE STATE OF VIRTUAL REALITY SOFTWARE AS IT APPLIES TO LEARNING

VR technology permits users to move objects and simulate physical actions in a cyberspace, and promises radical changes in applications relying on object visualization and manipulation - for example, medical technology, Computer-Aided Design (CAD), real-time simulation, shoot-'em-up arcade games, and so on. Admittedly, this kind of enhanced interaction may exercise previously unfamiliar "cognitive-motor" capabilities and associations, but what can guarantee that users are going to function at higher levels of motivation and proficiency primarily because they happen to find themselves in cartoon-like virtual environments? Once the novelty wears off, why should being in a simulated world motivate anyone more than being in the real one? Ultimately, users are still left to their own habitual devices when it comes to refining behavioral patterns of creative learning, planning, organizing and elaboration in conventional software environments, let alone virtual ones. This situation has not evolved in a vacuum, but comes about, in part, because the trend in contemporary HCI design continues to be a mechanistic one in which the task is defined according to the perception of the tool, instead of the other way around. Accordingly, user participation is task-dedicated. Resulting software operations are, for the most part, devoted to forming or changing inanimate or symbolic subject matter. Effects upon the user, while interesting, are usually relegated to chance and limited to a small set of mechanical skills and formatting operations.

2. THE USER AS QUALITATIVE "PRODUCT" IN THE TASK ENVIRONMENT

In contrast, we consider the user - and not necessarily the product (or, in this case, the document) - to be the unique entity in the task environment. As such, a user's interactions might have direct, qualitative significance if only he or she were invited and guided to integrate them into responsive structures for awareness and learning. Efficacy of production is not threatened where heightened motivation and efficiency contribute to the accomplishment of the job at hand. The interface suggested here, then, is user-dedicated: one that treats the person as the qualitative, strategic variable in the performance of any task. The model acknowledges and
quantifies individual differences in motivational and intellectual predilection, and offers tools for providing feedback. In taking the initial empirical steps to further evolve this paradigm for collaborative "being" and "learning" in a virtual reality (VR) environment, we propose to create and test a minimum-sized "proof-of-concept" model, including its graphical user interface (GUI). We elaborate upon a form of virtual reality investigated in "An Interactive Media Environment for Creative Self-education - A Theoretical Definition" (Smith, 1988, 93). Named EDUGATES, this work defined a spatial landscape metaphor for the graphical visualization of data. It discussed an expanded, alternate role for hypermedia and applied metaphor in learning and understanding. Theoretical guidelines and psychosocial methodologies are adapted from M.D. Caspi's "Transformational Approaches to Creative Self-Remaking," a comprehensive theory of self-remaking (Caspi, 1985, 92, 94). The work offers an innovative theoretical approach and implementation that "unifies" the neural network stimulus patterns responsible for: a) the semantic and spatial location of properties within a "fractal" database (the simulated spatial environment); together with b) the criteria responsible for administering user- and system-generated events within that domain. Intersecting vectors describe a geomorphic lattice of "hard" data and "soft" user/system events that roughly describe 1:1 correlations on the surface of a hypersphere. These correspondences are mapped and continuously reinvested into the spatial and event horizons of the virtual environment. This mapping strategy essentially means that any given location in the virtual environment will contain arguments that at once promulgate and constrain the events that may occur therein. Visualization cues manifest relationships between the significance of data and the "shape" it assumes in a database. Data is a step closer to looking like the information it represents as the user invokes a virtual micro-environment and dwells at the juncture - or "place of moment" - located between the real and virtual extents of the manifest "interface continuum."
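One way to read the mapping strategy above is that every location in the landscape couples "hard" data with the "soft" rules governing events there. The following sketch is our illustration under that reading, not the EDUGATES implementation; all names are hypothetical:

```python
# Illustrative coupling of "hard" data and "soft" event rules at one
# location in a landscape metaphor; not the EDUGATES design itself.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Location:
    coords: tuple[float, float, float]          # spatial position in the landscape
    data: dict[str, str] = field(default_factory=dict)                # "hard" data
    allows: list[Callable[[str], bool]] = field(default_factory=list) # event constraints

    def admit(self, event: str) -> bool:
        """An event may occur here only if every local rule permits it."""
        return all(rule(event) for rule in self.allows)

loc = Location((0.0, 1.0, 2.0), data={"topic": "optics"})
loc.allows.append(lambda e: e != "delete")          # a constraining rule
print(loc.admit("annotate"), loc.admit("delete"))   # True False
```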
3. CONCLUSION

The aim is to provide the scaffolding for dialectic processes by which "learners" transform themselves into "doers" and "knowers" as they create the boundaries and rules that govern the changes in their virtual environment. The resulting virtual representation thus approaches a unique synthesis of significance, syntax and intention. In contradistinction to prevalent conventions that treat information as finite, discrete "products" for consumption, the way information is presented in this model encourages users to discover and benefit from how they, themselves, are changing in light of their interactions.
Business Rules, Revolutionary Discourse, and Multilogical Information Systems
G. Tropea
Dept. of Philosophy, California State University, Chico, California, USA 95929-0730
1. THE HUMAN REALITY OF INFORMATION SYSTEMS

The principles of systems theory apply both to the devices we use to process information and to the contexts in which they are used. Part One of this paper begins with an application of these principles to the task of understanding the human reality of information systems. It then builds on these ideas to present the basic theory of a new concept, the multilogical information system, including an account of its current context. To achieve an adequate account of the information system in context, the paper challenges the view of the information system as a set of devices external to human beings. In place of the traditional notion, the information system is sketched as a composite entity whose operation spans the division between the natural and the artificial. The goal of the opening pages of the paper is to radicalize the sense of object in the information system to include human beings. The conceptualization of human beings as system objects is not pursued as a stereotypical technological reduction, but rather as a way of developing a consistent object orientation at the theoretical level. It also shows how thorough the integration of information systems has actually become.
1.1 MODES OF DISCOURSE

Part One continues with an analysis of modes of discourse. The theses of this discussion belong not only to the field of information science, but to philosophy and cultural anthropology as well. The main purpose of this part of the exposition is to establish a working sense of the concept of the multilogical information system as a kind of discourse that has hitherto been effectively prohibited by the rules of civilization, but now is required. Basically, the shift is from a preference for linear, sequential conversation, which we know as the dialog, to a new form of conversation in which multiple relevant signals are exchanged simultaneously, which we call a multilog. It is both culturally and technologically significant that this complex type of communication has become both possible and necessary at this point in history. The fact of this new requirement becomes the point of departure for Part Two of the discussion, which takes a more practical turn.
2. ORDERLY PROCESSING IN A DISTRIBUTED COMPUTING ENVIRONMENT
Part Two seeks to develop a basic architecture of a software solution to the perpetual information system problems of orderly processing (namely, integrity and contention) in a distributed computing environment. The proposed solution is grounded in the I Ching in two ways. First, it takes a cue from the metaphysics of the text in that it assumes change, not rest, as the default state of the system. Second, at a more practical level, the paper proposes inclusion of a run-time repository of information about objects and processes whose organization is based on the hexagram. This design specification is not an attempt to show that the originators of the I Ching were computer engineers or even anticipated the modern computer, but rather that their attention to the dynamics of change in a complex world may be of use to us, who must attend to the dynamics of change in complex information systems. Each part of the paper does its own work, but the implications of the theoretical section become clearer in the application section, and motivation for specific features of the application will be found in the theory. Information system design for the foreseeable future will be driven by the fundamental ideal of modern quality assurance: everyone must know everything that is relevant all at once. Business rules guide the encoding of knowledge in the information system, but do not directly determine the data structures or internal logic of a system. While there is an unhappy history of naive attempts to construct systems whose internals precisely match knowers' intuitions, the attempt to stay close to uncritical intuitions is not itself naive if independent evidence exists to motivate the design. Part Two's specification of the run-time repository as rooted in basic categories of perception and the presumption of dynamic equilibrium in the information system suggests that the foundational points explained in Part One can indeed be usefully mirrored in the system's internal logic.
2.1 PRIMARY DIMENSIONS AND METADIMENSIONS
The main work of Part Two is the specification and sequencing of the key elements of program logic in a multilogical information system. These elements are patterned closely on the essential characteristics of I Ching hexagrams. Some correspondences are more obvious than others, but the full proposal does not come into view until one grasps that a multilogical information system operates in ten dimensions, including four primary dimensions and six metadimensions. The importance of the six metadimensions is that they are available for constant reference from any point and provide a way for one object to connect abstractly with any other object. In other words, through the metadimensional repository, objects monitor other objects themselves and do not necessarily interact with the primary information or the internal control information of those objects. In Leibnizian terms, objects in the system function as monads, but not windowless monads. Each
metadimension may have one of four values indicating something about the relationship of the object to other objects in the system. The four primary dimensions in an information system refer to the three space dimensions and one time dimension of ordinary experience. The six metadimensions correspond to psychological realities: past, X, present, Y, future, and Z. Each of these is assigned specific meaning in the context of the information system. After defining its terms, the paper goes on to show how this addition to the computing environment would work to maintain system integrity and help resolve contentions. Since the design introduces additional theoretical and procedural complexity to the system, even if its implementation may reduce actual processing, there is a legitimate concern that the cure might be worse than the disease. To address this issue, the paper includes a discussion of how its proposed additions can be maintained under program control.
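The repository idea can be given a concrete, minimal rendering. The sketch below (Python) shows objects that expose six metadimensions, each holding one of four values, and that monitor one another only through those metadimensions, never through primary data. Every name in it (MetaValue, MonadObject, the value labels) is invented for illustration; the paper's actual specification may differ.

from enum import Enum

class MetaValue(Enum):
    # Four possible values per metadimension. The labels borrow the
    # old/young yin-yang cycle purely as an illustrative value set.
    OLD_YIN = 0
    YOUNG_YANG = 1
    YOUNG_YIN = 2
    OLD_YANG = 3

# Six metadimensions: past, X, present, Y, future, Z.
METADIMENSIONS = ("past", "X", "present", "Y", "future", "Z")

class MonadObject:
    """A system object that exposes metadimensions for monitoring by
    peers while keeping its primary (4-D) data private: a monad,
    but not a windowless one."""
    def __init__(self, name):
        self.name = name
        self._primary = {}  # primary-dimension data, hidden from peers
        self.meta = {d: MetaValue.OLD_YIN for d in METADIMENSIONS}

    def observe(self, other, dimension):
        # Objects consult one another only through the metadimensional
        # repository, not through primary or internal control data.
        return other.meta[dimension]

a, b = MonadObject("a"), MonadObject("b")
b.meta["future"] = MetaValue.YOUNG_YANG
print(a.observe(b, "future"))  # MetaValue.YOUNG_YANG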
The I Ching as a Paradigm for Understanding Corresponding States in Fundamentally Different Systems
J. W. Walls
Simon Fraser University at Harbour Centre, 515 West Hastings, Vancouver, Canada V6B 5K3
1. YIN-YANG AS BINARY SCHEMA
The word for 'science' in the Chinese (kexue 科學), Japanese (kagaku) and Korean (kwahak) languages means literally 'the study of classification.' Our awareness of the world around us begins with the most primitive form of classification: This and That, perhaps first in the form of "I" and "Not-I". Early on, based on this primal distinction, we become aware of the difference between "awake" and "asleep," which soon become associated with "activity" and "rest" and "day" and "night". In ancient China such polarities came to be schematized as yang (陽) and yin (陰), and the movement back and forth between any two such polarities came to be symbolized in the famous taijitu (太極圖), in which activity, heat, sun and daytime, for instance, are represented by the lighter element, while rest, cold, moon and nighttime are symbolized by the darker, inverted element. The endless cycle of movement back and forth between such polarities may be seen as symbolizing the essence of a system dynamic (1). Systems -- such as mechanical, biological, meteorological, astrophysical, economic and social systems -- may be observed and described as moving back and forth between cyclical periods of activity and rest, expansion and contraction, growth and decay, etc. These polarized descriptors may all be symbolized by yang and yin, and their dynamic interaction by the taijitu, which is the circular chart in the center of Figure 1 below. The yang principle may be represented by a single, unbroken line; the yin principle by a line with an empty space in the middle. Incidentally, yin is seen as representing the essence of zero and all even numbers, while yang represents one and all odd numbers -- a relationship that is essentially binary and digital (2):
Figure 1. Yin-Yang and Taiji Chart (yang: active, growing, positive, hot, bright, sun; yin: passive, declining, negative, cold, dark, moon)
But the yin-yang taijitu schematic system is not simply digital, as it clearly emphasizes "degrees of yang-ness" and "degrees of yin-ness", a mode of awareness that is clearly analogical.
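The even-odd correspondence just noted is easy to state in code. A minimal sketch (Python; the function name is invented for illustration):

def polarity(n: int) -> str:
    # Yin stands for zero and all even numbers, yang for one and the odds.
    return "yang" if n % 2 else "yin"

print([(n, polarity(n)) for n in range(6)])
# [(0, 'yin'), (1, 'yang'), (2, 'yin'), (3, 'yang'), (4, 'yin'), (5, 'yang')]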
2. TAIJITU AS DIGITAL-ANALOG SYSTEM SCHEMA
Gradually, as people mature, they become aware that reality may be perceived not only as a multiplicity of "either-or's" (yin-yang's), but also as sequences of "more-or-less's". For example: morning and noon are sequences of magnitude, culminating at the meridian and followed by afternoon and night, which would be represented in the taijitu as the narrow-but-growing tail of the lighter element (morning), the head of the lighter element (noon), the narrow-but-growing tail of the darker element (afternoon), and the head of the darker element (night). Through the process of analogy, we can easily see the same graph as illustrating the four seasons of the year: spring is the morning of the new year; summer is noon; autumn is the afternoon; and winter is the night. From here, it is quite obvious that the morning-noon-afternoon-night and spring-summer-autumn-winter sequences are analogous to the childhood-adulthood-middle age-old age stages of a person's life. On the level of social-system analysis, the evolution-devolution sequence is a perfect representation of what Chinese historians have called "the dynastic cycle": following the dissolution and overthrow of a corrupt regime, a new dynastic order is seen as emerging in its energetic, idealistic youth -- its "young yang" stage, during which time the wrongs of the preceding dynasty are righted. The next stage is that of dynastic adulthood, the "flourishing" period when civilization reaches a high-performance peak. This is followed eventually by a transition period that enjoys some of the momentum inherited from the flourishing stage but also starts showing symptoms of decay. The final period is characterized by imperial indulgence, unreasonable taxation of the populace, corruption among local officials -- even natural disasters, usually regarded as signs of Heaven's displeasure with an immoral regime, seen in China as the father's displeasure with the son, since the Emperor was regarded as the "Son of Heaven" (3).
Figure 2. Young and Old Yang and Yin (compass chart: young yang / childhood at E, old yang / adulthood at S, young yin / middle age at W, old yin / old age at N)
In the northern hemisphere, the four quarters of the compass, too, seem almost naturally analogous to the four seasons and four stages of life: the east and its rising sun are analogous to youth and springtime; the south with its scorching heat is analogous to summer, mid-day and
adulthood; the west and its setting sun are analogous to evening and middle age; and the cold north is analogous to winter and old age. The quaternary cycle is symbolized by four pairs of lines (digrams) in permutations of "young yang" (solid line below broken line), "old yang" (two solid lines), "young yin" (broken line below solid line) and "old yin" (two broken lines). These four stages in the life cycle of systems are also the core symbols on the national flag of the Republic of Korea, where we find the I Ching trigrams for qian (representing the yang principle of activity) and kun (representing the yin principle of rest) positioned on opposite sides of the taegukki (taiji flag), with qian at the pinnacle of yang development, and kun at the pinnacle of yin development. The trigram for fire is positioned halfway along the route from kun to qian, and water is located halfway down the path from qian to kun. If we begin with a state of rest, then fire may be seen as leading to the pinnacle of yang activity, followed by water which douses the fire, leading back to a yin state of rest.
Figure 3. Taegukki (Korean Flag)
The yang element in the Korean flag logo is coloured red, which represents fire and heat, associated with spring and summer; the yin element is blue, representing the cool of autumn and winter. Even though the taijitu in the Korean flag has been "tilted", we still find the kun trigram placed by the pinnacle of yin, and the qian trigram by the pinnacle of yang.
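The quaternary cycle of digrams lends itself to direct enumeration. In the sketch below (Python), lines are written bottom-first with yang as 1; that bit order is a convention chosen for the illustration, not something the text prescribes:

# The four digrams as two yin/yang lines, bottom line first.
DIGRAMS = {
    (0, 0): "old yin",
    (1, 0): "young yang",  # solid line below a broken line
    (1, 1): "old yang",    # two solid lines
    (0, 1): "young yin",   # broken line below a solid line
}
STAGES = {"young yang": "childhood", "old yang": "adulthood",
          "young yin": "middle age", "old yin": "old age"}
SEASONS = {"young yang": "spring", "old yang": "summer",
           "young yin": "autumn", "old yin": "winter"}

for lines, name in DIGRAMS.items():
    value = lines[0] + 2 * lines[1]  # two-bit counting, bottom bit first
    print(f"{name:10s} lines={lines} value={value} "
          f"stage={STAGES[name]:10s} season={SEASONS[name]}")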
3. THE 'EIGHT TRIGRAMS' ELABORATION
The next larger cycle is symbolized by eight trigrams, the so-called "Eight gua". The eight trigrams are often placed in a sequence that seems to be an elaboration of the quaternary cycle: movement, brightness, pleasure and fulfillment on the yang side; penetration, danger, standstill and emptiness on the yin side. "Movement" (zhen), also equated with "thunder", is the early harbinger of activity and developments to come. "Brightness" (li), also equated with "fire" and "clinging", is typical of the undisciplined vigour found in young systems. "Pleasure" (dui), also associated with large expanses of still waters, is the stage of life wherein one can enjoy the benefits of youth without worrying about decline, because the pinnacle has not yet been reached. "Fulfillment" (qian), also seen as representing "Heaven", is the peak of development, the stage of maturity, after which systems tend to begin to devolve.
Figure 3. The Eight Trigrams (yang side: zhen/movement, li/brightness, dui/pleasure, qian/fulfillment; yin side: sun/penetration, kan/danger, gen/standstill, kun/emptiness)
The first trigram on the yin side of the taijitu is "Penetration" (sun), associated with the wind and the trees, another harbinger of the fall that is coming. "Danger" (kan) is equated with flowing water that sweeps away the fallen leaves in autumn, when cyclic decline is no longer deniable. "Standstill" (gen) is symbolized by the mountain, where sages meditate on the ultimate futility of all organized effort that is not in tune with the natural flow of Nature. "Emptiness" (kun) is associated with earth, the ground to which all spent systems return after their fall and before their reincarnation for another cyclic tour of "rise-and-fall". This is the way most "non-linear systems" have been seen by Asian analysts to evolve and devolve over time. The schema is also seen as a paradigm for understanding the history of institutions, a traditional metaphor for which is "rise-fall". The gua sequencing in the I Ching is usually seen as an elaboration of this basic cycle.
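The eight-stage elaboration can likewise be enumerated. In this sketch (Python), the line patterns are the standard trigram forms; treating yang as 1 and the bottom line as the least significant bit is again a convention fixed for illustration:

# Eight trigrams, lines listed bottom-to-top, yang = 1 / yin = 0.
TRIGRAMS = {
    "zhen (movement / thunder)":    (1, 0, 0),
    "li (brightness / fire)":       (1, 0, 1),
    "dui (pleasure / still water)": (1, 1, 0),
    "qian (fulfillment / heaven)":  (1, 1, 1),
    "sun (penetration / wind)":     (0, 1, 1),
    "kan (danger / flowing water)": (0, 1, 0),
    "gen (standstill / mountain)":  (0, 0, 1),
    "kun (emptiness / earth)":      (0, 0, 0),
}

for name, lines in TRIGRAMS.items():
    value = sum(bit << i for i, bit in enumerate(lines))
    print(f"{value}  {lines}  {name}")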
4. THE TWELVE HEXAGRAMS
The next degree of elaboration is called "The Twelve Inspiration-Expiration Hexagrams", with its implication that "Inspiration" embraces the yang evolutionary phases, while "Expiration" stands for the yin devolutionary phases. These twelve gua represent a linear progression of changes from yin lines to yang lines, but the linear progression describes a circle/cycle that is parallel and analogous to the yin-yang evolution-devolution cycle illustrated by the taijitu. It is interesting to note that the twelve "Inspiration-Expiration" hexagrams, when seen in the full Fu Xi circular arrangement of all 64 hexagrams, occupy positions removed from qian and kun by the numbers 1, 3, 7, 15, and 30 -- each roughly double the distance of its predecessor. Space limitations do not allow for detailed discussion here, but the evolution-devolution syndrome is immediately apparent upon contemplation of the "Inspiration-Expiration" chart.
Figure 4. The Twelve Inspiration-Expiration Hexagrams (waxing side: fu/returning, lin/advancing, tai/progress, da zhuang/strength, guai/breakthrough, qian/fulfillment; waning side: gou/encounter, dun/retreat, pi/retrogression, guan/contemplation, bo/collapse, kun/emptiness)
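The waxing-waning progression of the chart is simple to generate: yang lines enter from the bottom and accumulate until qian, then yin lines enter from the bottom and accumulate until kun. A sketch (Python; tuples run bottom-to-top, yang = 1):

WAXING = ["fu", "lin", "tai", "da zhuang", "guai", "qian"]
WANING = ["gou", "dun", "pi", "guan", "bo", "kun"]

def hexagram(n_new, new_line):
    """n_new lines of the entering polarity at the bottom,
    the remaining lines of the opposite polarity above."""
    return tuple([new_line] * n_new + [1 - new_line] * (6 - n_new))

cycle = ([(name, hexagram(i + 1, 1)) for i, name in enumerate(WAXING)] +
         [(name, hexagram(i + 1, 0)) for i, name in enumerate(WANING)])

for name, lines in cycle:
    print(f"{name:9s} {lines}")
# fu = (1, 0, 0, 0, 0, 0) ... qian = (1, 1, 1, 1, 1, 1),
# gou = (0, 1, 1, 1, 1, 1) ... kun = (0, 0, 0, 0, 0, 0)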
5. FURTHER ELABORATIONS OF THE 'RISE-FALL' SCHEMA
Another level of elaboration of the same allegorical life cycle is that of the "Twenty-four Seasonal Subdivisions", which demarcate the year's rise-and-fall with changes that occur every two or three weeks in a typical year. Notice how the seasonal sequence fits the growth phases of yang and yin when placed so that they circumambulate the taijitu, indicating an analogical relationship with general system cycles. In the chart that follows, the duodecimal cycle of the zodiacal animals is placed between the taijitu and the 24 seasonal subdivisions, to remind us that the sequence of twelve 2-hour watch periods that mark the progress of each 24-hour day is also analogous to the seasonal subdivisions: starting with zi (the Rat) at the "Winter Solstice", the first watch is 11:00pm to 1:00am, the midnight watch; the second watch (the Ox) is 1:00am to 3:00am, and so on through the morning, midday, afternoon and nighttime, in parallel sequence with the progress of the seasons through the year. Thus the yin-yang binary cycle and the four digrams may be seen as a rough "macro-perspective" on systemic evolution; the eight trigrams, 12 zodiacal figures and 24 seasonal subdivisions are a "meso-perspective" offering a mid-level of detail in isolating the significant stages of systemic evolution; and the 64 hexagrams of the full-blown I Ching offer more of a micro-perspective aid to understanding the many possible configurations of any complex system. Perhaps an inordinate amount of attention has been paid, over the years, to the "divination manual" aspect of the I Ching, and not enough to its "mnemonic" and "check-list" functions. It is a mnemonic device for remembering the characteristic elements, relationships and evolution-devolution patterns of any system, from the simplest to the most elaborate. It is also a check-list for itemizing and analyzing the system states at various points in a life cycle.
Figure 5. The 24 Seasonal Subdivisions (the seasonal terms, from the winter solstice through the equinoxes and the summer solstice, arranged around the taijitu together with the twelve zodiacal animals)
The I Ching, therefore, may perhaps best be seen, understood, and used as a schematic paradigm, an elaboration of the yin-yang taijitu abstraction of basic systems dynamics, which requires the observer to take into account all the other typical stages of a system when considering any given stage in a non-linear cycle of "evolution-devolution". This requirement to consider any given state together with a variety of other potential states is, in fact, consistent with the spirit of contemporary "context theory". The 64 hexagrams of the I Ching are a goldmine of opportunity to hypertextualize a system of interconnected symbols, linking the related states of each hexagram in such a way that complexity may be recognized, its elements systematized, and connected into a network of relationships that deal with dynamic complexity in a "user-friendly" fashion.
REFERENCES
1. Cheng Yi, I Ching: The Tao of Organization (Thomas Cleary, trans.), Kuala Lumpur, Malaysia, 1991.
2. K. Walter, Tao of Chaos: Merging East and West, Austin, 1994, 115.
3. D. Bodde, Essays on Chinese Civilization (C. Le Blanc and D. Borei, eds.), Princeton, New Jersey, 1981, 246.
Nonlinear Computation in the I Ching
K. Walter
Kairos Center, 4608 Finley Drive, Austin, Texas, USA 78731
1. INTRODUCTION AND HYPOTHESIS
This work presents an innovative model for computation. Current computers are mostly binary and do not take nonlinear aspects into account. They certainly cannot do both binary and analog computation in the same process. But such things are possible. Our models will be modern DNA and the ancient I Ching. Each system combines within itself both linear binary processing and analog ratio processing into a peculiar hybrid mathematics that has made DNA and the I Ching into the sturdy system-survival packages that they are. Each model manifests the principles of the new science of patterned chaos. The core of chaos theory is found in the Period 3 window of the periodic tree of bifurcating data. Yorke and Li showed mathematically that when the Period 3 window appears, abruptly you have not random chaos but patterned chaos, i.e., orderly but nonlinear structure, able to sustain and replicate itself with variation. This hypothesis says that mathematically the mRNA codon and the I Ching trigram each presents a Period 3 window of chaos patterning. More important, each system is nonlinear, combining analog and linear functions to synthesize a transcendent third operation whereby the system escapes to a higher order of organization.
2. DNA SWATCH, I CHING HEXAGRAM, AND ENTRAINMENT
A DNA swatch or an I Ching hexagram can be demonstrated to be two Period 3 windows counterposed against each other to form a fail-safe package of complementary chaos, or co-chaos for short. In each system, its pairs of triplets may be arranged into 64 possible configurations, which naturally give us the 64 codons of the genetic code or the 64 hexagrams of the I Ching. Probably DNA took the form it did along the double helix because this paradigm of counterposing two Period 3 windows against each other across the double spiral gives the sturdiest possible mathematical structure. This counterbalance gives co-chaos, whereby one chaos system balances itself against another in a fail-safe supersystem that provides both stability and also the possibility for evolutionary change within that stability. Most contemporary gene sequencing programs are set up to handle only binary aspects, avoiding nonlinear aspects altogether. But to quote "Hacking the Genome," an article in the April 1992 Scientific American: "The clarity of the answers will depend on asking the right questions." It may be that asking the right questions for computation now is to look at DNA and the I Ching and ask how and why each is essentially a hybrid analinear system that utilizes the principles of co-chaos. Probing for these answers may reveal much about the basic patterns in life's physical and mental systems. It may offer a new way to build computers so that they can imitate the basic number framework hidden in
life itself, and perhaps even the universal coding that no doubt forms the root of nature. Number itself forms the root. To find this deeply-embedded root, we do not discard traditional scientific linearity, but instead we add something new -- analogs. Analog plus linear gives analinear number. It is not just linear. It combines the chunky lumps of binary sums with the flowing proportions of analog ratios to birth a transcendent third form. Some might call it nonlinear, but Stanislaw Ulam said this is a rather silly term, since most of life's problems are nonlinear. He said saying "non-linear" is akin to calling most of the animals in the zoo "non-elephants." Therefore, I prefer to use the term analinear, showing it combines both modes to create a synergistic third state. Binary number seeks a goal, a solution, the answer to a problem as a quantity of units. It is a discrete, end-stopped sum giving the goal -- the quantity. But analog number does not emphasize a solution, a goal, a final lump sum. Instead, it discusses the quality of relationships along the way. This brings up all kinds of resonant associations that open the doors to ongoing process rather than closing them down into the sum of a final answer. That's the trouble with analogs, from a traditional computational point of view. Analogs are networking rather than end-stopped. They engender resonances that linearity doesn't want to encourage, because linearity prefers to stay tidy and neat and hurry to a quick solution, not trigger a network of related resonances. Analog numbers resonate in networks that reinforce entrainment. Entrainment is the main signature of analog number. It does not care about the summary quantity but, rather, about the relative qualities along the way. Its comparisons shift in changing ratios, not striving for an end but rather for the consummate trip, so that finally it never gets there, because there becomes irrelevant. The end becomes no goal at all as, instead, it just keeps on traveling. When analog and linear are combined into analinear number, it can do both -- find straight-line solutions and keep traveling in cycles. The result is the spiral of change.
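The Period 3 window invoked above can be watched numerically in the simplest chaotic system, the logistic map; this sketch (Python) illustrates the Yorke-Li phenomenon in general and is not a construction taken from the paper. Inside the window the orbit locks onto three repeating values; outside it, no short cycle appears:

def orbit(r, x=0.5, transient=2000, keep=9):
    """Iterate the logistic map x -> r*x*(1-x), discard the
    transient, and return the next few orbit values."""
    for _ in range(transient):
        x = r * x * (1 - x)
    out = []
    for _ in range(keep):
        x = r * x * (1 - x)
        out.append(round(x, 6))
    return out

# r = 3.835 lies inside the period-3 window (it opens near r = 1 + sqrt(8)):
print(orbit(3.835))  # the same three values repeat: patterned chaos
# r = 3.7 lies outside the window:
print(orbit(3.7))    # no short cycle appears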
3. I CHING, BINARY SEQUENCING AND ANALOG FLOW
The ancient Chinese I Ching provides an astoundingly complete computer model using binary sequencing plus analog flow. Its structure first of all shows binary number, a fact long evident to the West since the days of the German scientist Gottfried Wilhelm Leibniz. At the turn of the 1700s, Leibniz saw that the I Ching hexagrams may be read as binary numbers, counting from 0 through 63. Other scholars have since concurred in this observation. More recently, though, Western scientists have begun to recognize that the I Ching's yang and yin can even be cross-coded in a binary way with the genetic code. Gunther Stent discusses this procedure in The Coming of the Golden Age, published in 1969, and Martin Schoenberger in The I Ching and the Genetic Code in 1973. Scientific American's January 1974 article by Martin Gardner explores the binary math of the I Ching. Then came Eleanor B. Morris's Functions and Models of Modern Biochemistry in the I Ching in 1978. In 1991 came Johnson Yan's book called DNA and the I Ching. Each author considers various binary aspects of this genetic code/I Ching interface.
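The binary reading Leibniz noticed is direct to state. In this sketch (Python), yang counts as 1 and yin as 0; which end of the hexagram supplies the high bit is a convention fixed here for illustration:

def hexagram_number(lines):
    """Read six yin/yang lines (yin = 0, yang = 1) as a binary
    number from 0 through 63."""
    assert len(lines) == 6 and all(bit in (0, 1) for bit in lines)
    n = 0
    for bit in lines:
        n = (n << 1) | bit
    return n

print(hexagram_number((0, 0, 0, 0, 0, 0)))  # all yin  -> 0
print(hexagram_number((1, 1, 1, 1, 1, 1)))  # all yang -> 63
print(hexagram_number((0, 0, 0, 1, 1, 1)))  # mixed    -> 7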
We may ask ourselves: how could the ancient East and the modern West, so far apart in space and time, come upon the same mathematical model, with the East seeing the I Ching as an oracle that codes for the flow of psyche, while the West sees the same structure as DNA that codes for the building of flesh? Obviously, its underlying root is where mind and body come together. To explain this, let us consider the I Ching's fractal aspects. Tao of Chaos by Katya Walter presents an innovative view of the I Ching as a computer combining fractal analog and linear binary functions. This scientific explanation is based on the fractals of modern chaos theory, which can predict a trend without specifying its exact details. Chaos patterning is determined because it can predict an overall pattern, but it is also chaotic because it cannot specify any exact point of its next manifestation. The mathematician can determine its general form but not its exact contents. This same dynamic can also be seen in the ancient I Ching hexagrams, which describe 64 basic nonlinear patternings. Patterned chaos has its own special signature:
• Order in the midst of apparent disorder.
• Cycling that repeats with continual slight variation.
• Scaling that fits one level into another like nesting boxes.
• Universal applicability.
Chaos theory has enabled us to find pattern within apparently random events. With it, we rise to a new level of vision and discover that there is simplicity within complex flux. Long ago in China, it was called the Tao. This strange nonlinear realm first began to be explored mathematically in the West during the 1960s, often on makeshift analog computers that charted a peculiar cyclic patterning. Its odd vocabulary of fractals, Julia and Mandelbrot sets, butterfly effects and strange attractors suddenly opened up a new nonlinear reality. This transcendent use of number is seen in the I Ching, developed perhaps 5,000 years ago. It is also seen in DNA, whose structure was discovered in the 1950s. Briefly, here is a synopsis of the parallel structures in the genetic code and I Ching:
Figure 1. Model of Binary Tree and Analog Tree (a bifurcation tree whose branches may be read horizontally as binary digits or vertically as analog branching)
Each of the eight I Ching trigrams can be seen as a Period 3 window. This window can be read horizontally across the branches of a bifurcation tree in the typical linear way, or the branching can be read vertically in a fractal analog way. In other words, the system can read by both methods simultaneously, giving an analinear reading that balances one function against the other. It is quite remarkable! Furthermore, one trigram is then balanced against another in the I Ching to create the 64 hexagrams that may nowadays be seen as 64 sets of counterposed Period 3 windows utilizing both binary and fractal components. Each hexagram describes a unique dynamic process. Likewise, DNA may be seen in this same way. Its pyrimidines and purines use the same organization plan as the I Ching's bigrams. Within a hexagram, these bigrams may be read across its two trigrams -- as a cross-bigram -- to encode the message of an amino acid, providing, in all, the 64 codons of RNA. Furthermore, it can be shown that the I Ching and the genetic code not only use the same analinear mathematical structure, but they also cross-correlate into the same dynamic meaning for each of the 64 units -- with the result that, for example, the Opal codon of the gene's full-stop signal actually equates to Hexagram 12 of Standstill.
Figure 2. I Ching Tree, Genetic Code Tree, and "I Code Tree" (the genetic code tree branches into the pyrimidines T/U and C and the purines A and G)
form the hexagrams. Since binary merely indicates the 0-1 shunt of a discrete chain of logic, it discounts the integrating fractal properties that are inherent in analog number, and thereby it misses the complex sophistication of cycling proportions in ratio. If we do not see this, we completely overlook this amazing combination of binary structure plus analog relationship which reveals the master code. The I Ching and the genetic code offer microcosmic renditions of this mesh of analog and linear number. To balance and harmonize the analog and linear is the special province of analinear computation. It is seen in the ancient I Ching and the modern discovery of DNA. By combining binary counting with fractal proportion, this paradigm creates analinear equations that may one day provide the means for a new kind of computation.
Biomathematics Derived from the I Ching
J. F. Yan
Yan Research, P.O. Box 4115, Federal Way, WA 98063, U.S.A.
The I Ching has inspired many scientists and philosophers in this century. Perhaps the most spectacular association is with molecular biology and the genetic code. The two systems share a basic principle of using a quaternary system of numbers: four bases in DNA and RNA, and four digrams of the I Ching. Both systems have a total of 64 combinations: triple-base genetic codons and hexagrams for the I Ching. The mathematical association is not limited to these two "coincidental" number sets. The working of the I Ching divination process involves repeated applications of the Chinese Remainder Theorem, well known in modern number theory. These applications lead to a set of nucleotide numbers (quaternary units 0, 1, 2, 3) and a set of amino acid numbers (0 for "stop" codons, 1 for Trp, 2 for Ile, 3 for Met, and mostly prime numbers for other amino acids). What is so marvelous is that the I Ching can be applied not only to the broadest meaning of universal laws, but also to the finest details of life science. In the broad sense, hydrophobic protein segments are "yang", the hydrophilic ones are "yin"; both are supplemented and complemented with the time or sequential order of "old" and "young". In detail, using this unique set of amino acid numbers, protein sequences become a computable language from which sequence patterns and collective properties can be inferred. The grammatical rules of this language are expressed in the form of two number theorems. The language equivalents range from simple repeating phrases (as in baby talk) to elegant and sophisticated seven-word poems (like those of Li Bai). Sequence examples can be found in frequently occurring repeats in silk fibroin, collagen repeats and mutational hotspots, ancient ferredoxin, coiled-coil and leucine zipper patterns, runs of amino acids in the protein "huntingtin", implicit heptad repeats in amyloid and prion proteins, etc. Sequence patterns can be displayed with computer graphics, which can then be correlated with mutational data and compared with results from molecular geometry. The I Ching is "the book of change", and chemistry is the science of change. In this sense, the I Ching may well be translated as "The Book of Biomolecular Chemistry."
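The shared quaternary arithmetic is easy to make concrete. In this sketch (Python), each base is assigned a quaternary digit so that every triple-base codon becomes a number from 0 through 63, the same range the hexagrams occupy; the particular digit assignment is an assumption for illustration, since the abstract does not reproduce Yan's table:

# Assign each base a quaternary digit (an illustrative assumption only).
DIGIT = {"U": 0, "C": 1, "A": 2, "G": 3}

def codon_number(codon):
    """Read a triple-base codon as a base-4 number, 0 through 63."""
    n = 0
    for base in codon:
        n = 4 * n + DIGIT[base]
    return n

for c in ("UUU", "AUG", "UGA", "GGG"):
    print(c, codon_number(c))
# 4**3 = 64 codons, matching the 64 hexagrams of the I Ching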
Author Index
Aarås, A. Abed, M. Akamatsu, S. Allender, L. Alty, J. Angué, J. Ankrum, D.R. Anola, K. Anzai, Y. Aoki, M. Arai, K. Asahi, N. Asahi, T. Asakura, T. Asano, Y. Ashibe, K. Athousaki, M. Averbukh, E.A. Bagolini, B. Balbo, S. Barrelle, K. Bastide, R. Bastien, J.M.C. Bevan, N. Black, A. Bodart, F. Brézillon, P. Bugajska, A. Burmester, M. Butera, F. Buur, J. Cai, Y. Candy, L. Carayon, P. Cesta, A. Chatty, S. Chen, Q. Chen, Q. Cheng, C. Chemomorets, V.A. Chignell, M.H. Chislov, V.V. Cohen, B.G.F. Coyle, G. Coëffé, C. Dainoff, M.J. Danderkar, K. Danuta, R. Daonoff, M.H. Deane, F.P. Dehlholm, F.
575, 745 267 559 393 5 267 611 681 237, 243 725 541 33 381 687,699 423 535 459, 463 21,399 607 337 581 489 343 349, 885 187 367 5 745 899 1003 187 173 103 693 225 27 553 471 1029 969 323 459 745 529 91 745 109 745 745 581 891
Dennison, P. Downing, S. Drury, C.G. D'Aloisi, D. Ebina, T. Edmonds, E. Edmondson, W. Endoh, A. Endres, L.S. Esteban, O. Farenc, C. Felger, W. Fostervold, K.I. Fujigaki, Y. Fukuda, C. Fukuda, N. Fukumoto, T. Fukuzumi, S. Funada, M.F. Giannini, V. Goff, J.M.L. Gogoulou, R. Gohda, Y. Goonetilleke, R.S. Gosain, A. Granda, R.E. Grant, D. Greeson Jr., J. Grote, G. Gulliksen, J. Guo, Q. Haito, S. Hamada, H. Han, J. Hansen, E.E. Hara, F. Haratani, T. Haratani, T. Hashimoto, M. Hayashi, Y. Heggie, S. Heisig, P. Helander, M.G. Henderson, D.R. Higuchi, T. Hiji, M. Hiramatsu, K. Hisamori, Y. Hix, D. Hoang, K. Hollnagel, E.
587 1011 311 225 65 103 5 411 477 27 329 593 575 675,699 963,983 155 553 833 839,963,983 225 291 39 667 311 109 655 877 655 989 951 129 617 423 217 611 565 705 699 803 777,785 103 1021 375 581 713 521 237 405 109,355 629 5
Hopff, H. Horgen, G. Horgen, G. Horie, Y. Hosogi, S. Huuntanen, P. Huuskonen, P. Huuskonen, P. Ichikawa, H. Ide, M. Idogawa, K. Iga, S. Iitaka, K. Ikeda, K. Ikeda, M. Imamiya, A. Inagaki, Y. Inoue, M. Inui, N. Iseki, O. Iskanderova, Z.I. Ito, A. Ito, A. Itoh, K. Itoh, K. Iwai, Y. Jo, K.-H. Johansson, A. Kaarela, K. Kahler, H. Kaminuma, Y. Kandolin, I. Kanenishi, K. Kano, H. Kanou, N. Kaplan, R.M. Karagiannidis, C. Kasamatsu, K. Kataoka, R. Kato, N. Kato, T. Katoh, M. Katz, I.R. Kawabata, T. Kawakami, N. Kawakami, T. Kawarada, H. Kayis, B. Kelley, A.J. Kelley, T. King, R.J. Kirpich, S.V. Kirsch, C. Kishi, N.
895 745 575 817 135 681 291 417 509 737 963,983 231 917 45 725 731 803 845, 911 285 381 443 65 635 803 917 809 193 749 417 995 935 681 97 135 911 83 497 983 661 155 559 777 53,83 503 705 845 173 629 713 393 211 969 957 161
Kishino, S. Kita, K. Kitajima, M. Kiv, A.E. Kobayashi, H. Kobayashi, M. Kobayashi, Y. Kogi, K. Koike, T. Komatsubara, A. Konarska, M. Kotani, Y. Koumpis, A. Kubota, S. Kumamoto, T. Kuno, Y. Kurosu, M. Larsen, S. Lee, E.S. Leino, T. Lepore, D. Leung, C.K.H. Leung, T.I. Leung, Y.K. Lie, I. Lif, M. Lin, R. Lin, T. Lind, M. Littlefair, P.J. Macredie, R.D. Maeda, Y. Maenaka, A. Mahar, D. Mahfoudhi, A. Maloryan, V.L. Mariani, M. Marin, I. Maruyama, M. Matsubayashi, K. Matsuda, R. Matsumoto, T. Matsuoka, S. McHale, S.T. McNeive, A. Meech, J.F. Meech, J.F. Melo, V. Meri, M. Mertins, K. Miller, L.A. Mimura, I. Minato, K. Minoh, M.
777 535 515 431,443 565 299 845,911 635 547 299 745 285 497 643 65 193 167 575 33,59 681 607 859 1033 211 575 951 199 217 951 623 945 135 33 581 267 443,459 1011 871 141 71 381 147 827 431 1011 291 5,945 323 291 1021 123 167 411 45
Misue, K. Mitsuopoulos, Y. Miyao, M. Miyazaki, M. Mogaji, A.A. Mollaghasemi, M. Molle, F. Molyako, V.A. Morris, A. Motoyama, T. Moyoshi, M. Mukahi, T. Murasugi, K. Murata, A. Murray, B. Mutoh, K. Nagai, Y. Nakagawa, M. Nakagawa, S. Nakaishi, H. Nakashima, K. Nakatani, T. Nemeth, K.J. Nimomija, S.P. Ninomija, P. Nishida, S. Nishimura, T. Nisimura, H. Noma, T. Norcio, A.F. Ntuen, C.A. Nunokawa, H. Oda, M. Odgaard, I. Ogata, H. Ogata, M. Ogawa, K. Ohkubo, T. Ohshima, J. Ohsuga, M. Okada, H. Okada, K. Okada, N. Okuno, H.G. Oppermann, R. Orishchenko, V.G. Oswick, C. Palanque, P.A. Park, K.S. Parlangeli, O. Perry, M.J. Polozovskaya, I.A. Polson, P.G. Rankin, J.R.
135 923 599 521 975 387 607 431,443 587 323 765 273 273 719 103 317 853 155 547 599 845 503 611 839,963,983 983 77,141 45 285 205 471 455 521 559 865,1017 535 823 5,423 725 737 279,771 381 33 205 503 361 431 877 27,329,489 725 1011 623 431,443,459 515 179
Rauterberg, M. Reiterer, H. Rekimoto, J. Roast, C.R. Roberts, C.R. Rousseau, N. Ryan, G. Sagayama, S. Saito, S. Sakai, K. Sakamoto, T. Salcudean, S.E. Saliba, A. Sandblad, B. Sato, M. Sato, S. Savidis, A. Sawa, M. Scapin, D.L. Schallock, B. Scullica, L. Secter, M. Seki, Y. Senges, V. Seppälä, P. Shafrir, U. Shahnavaz, H. Sharit, J. Shih, H.M. Shimoda, H. Shimojo, M. Shimono, F. Shirai, Y. Shiratori, N. Shtakser, G.V. Siddiqi, J.I. Sidhu, C.K. Siio, I. Smith, D.A. Sotoyama, M. Stanney, K.M. Stary, C. Stephanidis, C. Stephanidis, C. Sugamura, N. Sugiyama, K. Suthers, D. Suzuki, K. Suzuki, S. Sweitzer, G. Takahashi, A. Takahashi, M. Takata, K. Takayama, K.
449 361 255 483 705 103 1011 541 617 635 731 713 581 951 173 791 929 725 343 1021 607 1037 791 489 759 437 749 311 859 853 791 279,771 193 59 459 483 529 261 1041 617 123,387 115 39,497,923 929 541 135 5 785,797 963,983 649 791 635 765 135
Takeda, M. Tamada, T. Tamura, H. Tamura, T. Tanaka, T. Tanimura, T. Taptagaporn, S. Tawamura, I. Teguchi, K. Templeman, J.N. Teraoka, T. Terashita, H. Thomas, P.J. Thoresen, M. Toda, M. Tokuda, Y. Tokura, N. Tropea, G. Troxler, P. Tseng, M.M. Tsujino, Y. Turbati, M. Ueda, Y. Ui, T. Ulich, E. Umemura, M. Uyemov, A.I. Vanderdonckt, J. Vora, P.R. Vågland, A. Wagner, E. Wakamori, O. Walls, J.W. Walter, K. Wang, S.J. Watanabe, Y. Weik, S. Widerszal-Bazyl, M. Wolska, A. Wong, R.K. Wu, H. Wu, J. Wäfler, T. Yachida, M. Yagi, A. Yagi, Y. Yagyu, T. Yamadera, H. Yamaguchi, M. Yamaguchi, M.K. Yamamoto, M. Yamamoto, S. Yamamoto, T. Yamaoka, T.
785 141 305 667 963, 983 205 617 765 509 109 141 279, 771 945 745 279, 771 59 71 1043 957 859 71 607 45 273 957 509 459 329, 367 375 575 291 853 1047 1053 173 765 989 745 745 249 553 305 989 405, 553, 809 823 405, 809 405 167 285 559 547 827 737 77
Yamasaki, N. Yamashina, T. Yan, J.F. Yano, S. Yano, Y. Yasumura, M. Yazu, Y. Yokoyama, K. Yonemura, S. Yonezawa, Y. Yoshikawa, H. Yoshioka, O. Yoshioka, T. Yoshitake, R. Yoshizawa, Y. Zakharchenko, I.G. Ziegler, J. Zölch, M.
243 737 1059 827 97,535 231 963,983 765 423 803 853 541 765 661 317 459 899 989
Keyword Index
3D 33 3D graphics 141 3D object 173 active camera 553 active interface 225, 243 address input 541 air quality 635 allocation 989 ALS patients 911 analysis 273, 291, 431 analysis and design 267 analysis support 317 anthropomorphic media approach 565 anxiety 581 assessment 83 attitudes 581 auditory scene analysis 503 automated 989 biomathematics 1059 blind 929 brain activity 839 brain potential 823, 845 calligraphy 129 cardiovascular 771 case-based method 77 CFF 785 cocktail-party effect 503 collaboration 929 color 405 command 45 communication aids 845 computational model 521 context 5 control system 291 creative thinking 431, 443, 459 criteria and standards 343 cryptography 147 cube-based 173 curve design 179 data 115 data structure 141 decision making 449, 1037 decisions 1029 desktop interface 123, 261 diagnosis 423 diagram 217 dialog system 547 dialogue 65, 71 dial devices 731 disabilities 923 discount analysis 323
display 623, 895 draw 155 dynamic changes 963 ecological interface design 423 EEG 827, 833 emotional workload 871 empowerment 1021 error 45 erroneous performance 437 evaluation 83, 285, 299, 329 expert system 599 face 553, 559 face robot 565, 877 font 135 front end 103 fuzzy reasoning 199 fuzzy set theory 279 graphical constraint 205 graphics 39, 161 GUI 187 hand-grasp 791 handicapped 917, 935 handwriting 803 head-coupled display system 593 heart rate 777 hospital 411 human communication 521 human identification 147 human operator 455 human operator model 463 human-centered 193, 969 human-machine systems 21 human-robot interface 237 I Ching 1029, 1033, 1037, 1041, 1043, 1047, 1053, 1059 icon 199 impression 559 information technology 951 input methods 725 intensity 797 intention recognition 77 interaction 5, 27, 65, 109, 255, 923 interaction cost 285 interface design 59, 91, 187, 387, 411 interface improvement 459 interface specification 483 ISO 661, 891 job 687 job content 681 job satisfaction 975 keystrokes 797
language training 917 learning 935, 1041 life 1011 lighting 623, 643 linguistics 367 management 859, 1003, 1017 manipulation 211 market 21 means-ends 417 media quality 305 mental task 455 mental variation 827 mental workload 765, 771, 853 menu 547 metaphor 497 model-based analysis 515 modification 205 motion 809 motor-disabled person 737 mouse 713, 719 multinational 877 musculoskeletal 745, 759 neural net 471 non-visual 39 object interface 231 object-oriented 267 object-oriented GUI 249 ocular surface area 617 office 635, 687, 693, 759 operators 607 optometric intervention 745 organization 865, 951, 957, 969 personal 945 personality 477 personality engineering 477 petri net 449 plant design 417 pointing device 261, 737 problem solving 97 process approach 393 prototyping 53 quality control 399 quality of use 885 quantification theory 273 quantitative evaluation 305 R-R intervals 765 real object 231 real world 255 reflective thinking 437 rehearsal 509 rendering 129 requirements 1017 robot 249 scenario 497
seated posture 629 shifting time 983 SHIVA 899 sick building syndrome 587 skill acquisition 161 skin temperatures 817 slips 515 software 27, 225, 337, 355 software-ergonomics 361 space-vehicle control 311 specification 71 speech interface 529 speech recognition 535, 541 standard 361 standards 655, 885, 891, 895 stationary metaphor 155 stress 675, 681, 705, 833 stressors 693, 699 subjective measurement 279 supporting system 405 symptom 575 system development 103 tailorability 995 task model-system model 489 Taylorism 995 template model 483 TFT/LCD 661, 667 theorem proving 217 think-aloud 375 tool 381 tool-based interface design 463 training 311 transformation 865 trouble 317 typeface 135 understanding 97 uric 785 usability 299, 323, 337, 343, 349, 355, 399 usability measure 387 usability testing 375, 381, 393 user 115, 123 user identification 237 user modeling 471 validation 899 VDT 575, 599, 983 VDT workplaces 749 vertical horopter 611 video conference 243 viewing distance 611 virtual auditory screen 803 vision-based 193 visual communication 167
visual memory 509 wavelet 777 winking 911 work load 817, 823 workflow 859 workspace 33