Lecture Notes in Computer Science 6765
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany
Constantine Stephanidis (Ed.)

Universal Access in Human-Computer Interaction: Design for All and eInclusion
6th International Conference, UAHCI 2011
Held as Part of HCI International 2011
Orlando, FL, USA, July 9-14, 2011
Proceedings, Part I
Volume Editor
Constantine Stephanidis
Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science
N. Plastira 100, Vassilika Vouton, 70013 Heraklion, Crete, Greece
and
University of Crete, Department of Computer Science, Crete, Greece
E-mail: [email protected]

ISSN 0302-9743
e-ISSN 1611-3349
ISBN 978-3-642-21671-8
e-ISBN 978-3-642-21672-5
DOI 10.1007/978-3-642-21672-5
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2011928824
CR Subject Classification (1998): H.5, K.6, H.3-4, C.2, D.2, J.1, J.3
LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI
© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Foreword
The 14th International Conference on Human–Computer Interaction, HCI International 2011, was held in Orlando, Florida, USA, July 9–14, 2011, jointly with the Symposium on Human Interface (Japan) 2011, the 9th International Conference on Engineering Psychology and Cognitive Ergonomics, the 6th International Conference on Universal Access in Human–Computer Interaction, the 4th International Conference on Virtual and Mixed Reality, the 4th International Conference on Internationalization, Design and Global Development, the 4th International Conference on Online Communities and Social Computing, the 6th International Conference on Augmented Cognition, the 3rd International Conference on Digital Human Modeling, the 2nd International Conference on Human-Centered Design, and the 1st International Conference on Design, User Experience, and Usability.

A total of 4,039 individuals from academia, research institutes, industry and governmental agencies from 67 countries submitted contributions, and 1,318 papers that were judged to be of high scientific quality were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of human–computer interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas.

This volume, edited by Constantine Stephanidis, contains papers in the thematic area of universal access in human–computer interaction (UAHCI), addressing the following major topics:

• Design for all methods and tools
• Web accessibility: approaches, methods and tools
• Multimodality, adaptation and personalisation
• eInclusion policy, good practice, legislation and security issues
The remaining volumes of the HCI International 2011 Proceedings are:

• Volume 1, LNCS 6761, Human–Computer Interaction—Design and Development Approaches (Part I), edited by Julie A. Jacko
• Volume 2, LNCS 6762, Human–Computer Interaction—Interaction Techniques and Environments (Part II), edited by Julie A. Jacko
• Volume 3, LNCS 6763, Human–Computer Interaction—Towards Mobile and Intelligent Interaction Environments (Part III), edited by Julie A. Jacko
• Volume 4, LNCS 6764, Human–Computer Interaction—Users and Applications (Part IV), edited by Julie A. Jacko
• Volume 6, LNCS 6766, Universal Access in Human–Computer Interaction—Users Diversity (Part II), edited by Constantine Stephanidis
• Volume 7, LNCS 6767, Universal Access in Human–Computer Interaction—Context Diversity (Part III), edited by Constantine Stephanidis
• Volume 8, LNCS 6768, Universal Access in Human–Computer Interaction—Applications and Services (Part IV), edited by Constantine Stephanidis
• Volume 9, LNCS 6769, Design, User Experience, and Usability—Theory, Methods, Tools and Practice (Part I), edited by Aaron Marcus
• Volume 10, LNCS 6770, Design, User Experience, and Usability—Understanding the User Experience (Part II), edited by Aaron Marcus
• Volume 11, LNCS 6771, Human Interface and the Management of Information—Design and Interaction (Part I), edited by Michael J. Smith and Gavriel Salvendy
• Volume 12, LNCS 6772, Human Interface and the Management of Information—Interacting with Information (Part II), edited by Gavriel Salvendy and Michael J. Smith
• Volume 13, LNCS 6773, Virtual and Mixed Reality—New Trends (Part I), edited by Randall Shumaker
• Volume 14, LNCS 6774, Virtual and Mixed Reality—Systems and Applications (Part II), edited by Randall Shumaker
• Volume 15, LNCS 6775, Internationalization, Design and Global Development, edited by P.L. Patrick Rau
• Volume 16, LNCS 6776, Human-Centered Design, edited by Masaaki Kurosu
• Volume 17, LNCS 6777, Digital Human Modeling, edited by Vincent G. Duffy
• Volume 18, LNCS 6778, Online Communities and Social Computing, edited by A. Ant Ozok and Panayiotis Zaphiris
• Volume 19, LNCS 6779, Ergonomics and Health Aspects of Work with Computers, edited by Michelle M. Robertson
• Volume 20, LNAI 6780, Foundations of Augmented Cognition: Directing the Future of Adaptive Systems, edited by Dylan D. Schmorrow and Cali M. Fidopiastis
• Volume 21, LNAI 6781, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
• Volume 22, CCIS 173, HCI International 2011 Posters Proceedings (Part I), edited by Constantine Stephanidis
• Volume 23, CCIS 174, HCI International 2011 Posters Proceedings (Part II), edited by Constantine Stephanidis

I would like to thank the Program Chairs and the members of the Program Boards of all Thematic Areas, listed herein, for their contribution to the highest scientific quality and the overall success of the HCI International 2011 Conference. In addition to the members of the Program Boards, I also wish to thank the following volunteer external reviewers: Roman Vilimek from Germany, Ramalingam Ponnusamy from India, Si Jung “Jun” Kim from the USA, and Ilia Adami, Iosif Klironomos, Vassilis Kouroumalis, George Margetis, and Stavroula Ntoa from Greece.
This conference would not have been possible without the continuous support and advice of the Conference Scientific Advisor, Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications and Exhibition Chair and Editor of HCI International News, Abbas Moallem. I would also like to thank the members of the Human–Computer Interaction Laboratory of ICS-FORTH, in particular Margherita Antona, George Paparoulis, Maria Pitsoulaki, Stavroula Ntoa, Maria Bouhli and George Kapnas, for their contribution toward the organization of the HCI International 2011 Conference.

July 2011
Constantine Stephanidis
Organization
Ergonomics and Health Aspects of Work with Computers
Program Chair: Michelle M. Robertson

Arne Aarås, Norway; Pascale Carayon, USA; Jason Devereux, UK; Wolfgang Friesdorf, Germany; Martin Helander, Singapore; Ed Israelski, USA; Ben-Tzion Karsh, USA; Waldemar Karwowski, USA; Peter Kern, Germany; Danuta Koradecka, Poland; Nancy Larson, USA; Kari Lindström, Finland; Brenda Lobb, New Zealand; Holger Luczak, Germany; William S. Marras, USA; Aura C. Matias, Philippines; Matthias Rötting, Germany; Michelle L. Rogers, USA; Dominique L. Scapin, France; Lawrence M. Schleifer, USA; Michael J. Smith, USA; Naomi Swanson, USA; Peter Vink, The Netherlands; John Wilson, UK
Human Interface and the Management of Information
Program Chair: Michael J. Smith

Hans-Jörg Bullinger, Germany; Alan Chan, Hong Kong; Shin’ichi Fukuzumi, Japan; Jon R. Gunderson, USA; Michitaka Hirose, Japan; Jhilmil Jain, USA; Yasufumi Kume, Japan; Mark Lehto, USA; Hirohiko Mori, Japan; Fiona Fui-Hoon Nah, USA; Shogo Nishida, Japan; Robert Proctor, USA; Youngho Rhee, Korea; Anxo Cereijo Roibás, UK; Katsunori Shimohara, Japan; Dieter Spath, Germany; Tsutomu Tabe, Japan; Alvaro D. Taveira, USA; Kim-Phuong L. Vu, USA; Tomio Watanabe, Japan; Sakae Yamamoto, Japan; Hidekazu Yoshikawa, Japan; Li Zheng, P.R. China
Human–Computer Interaction
Program Chair: Julie A. Jacko

Sebastiano Bagnara, Italy; Sherry Y. Chen, UK; Marvin J. Dainoff, USA; Jianming Dong, USA; John Eklund, Australia; Xiaowen Fang, USA; Ayse Gurses, USA; Vicki L. Hanson, UK; Sheue-Ling Hwang, Taiwan; Wonil Hwang, Korea; Yong Gu Ji, Korea; Steven A. Landry, USA; Gitte Lindgaard, Canada; Chen Ling, USA; Yan Liu, USA; Chang S. Nam, USA; Celestine A. Ntuen, USA; Philippe Palanque, France; P.L. Patrick Rau, P.R. China; Ling Rothrock, USA; Guangfeng Song, USA; Steffen Staab, Germany; Wan Chul Yoon, Korea; Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics
Program Chair: Don Harris

Guy A. Boy, USA; Pietro Carlo Cacciabue, Italy; John Huddlestone, UK; Kenji Itoh, Japan; Hung-Sying Jing, Taiwan; Wen-Chin Li, Taiwan; James T. Luxhøj, USA; Nicolas Marmaras, Greece; Sundaram Narayanan, USA; Mark A. Neerincx, The Netherlands; Jan M. Noyes, UK; Kjell Ohlsson, Sweden; Axel Schulte, Germany; Sarah C. Sharples, UK; Neville A. Stanton, UK; Xianghong Sun, P.R. China; Andrew Thatcher, South Africa; Matthew J.W. Thomas, Australia; Mark Young, UK; Rolf Zon, The Netherlands
Universal Access in Human–Computer Interaction
Program Chair: Constantine Stephanidis

Julio Abascal, Spain; Ray Adams, UK; Elisabeth André, Germany; Margherita Antona, Greece; Chieko Asakawa, Japan; Christian Bühler, Germany; Jerzy Charytonowicz, Poland; Pier Luigi Emiliani, Italy; Michael Fairhurst, UK; Dimitris Grammenos, Greece; Andreas Holzinger, Austria; Simeon Keates, Denmark; Georgios Kouroupetroglou, Greece; Sri Kurniawan, USA; Patrick M. Langdon, UK; Seongil Lee, Korea; Zhengjie Liu, P.R. China; Klaus Miesenberger, Austria; Helen Petrie, UK; Michael Pieper, Germany; Anthony Savidis, Greece; Andrew Sears, USA; Christian Stary, Austria; Hirotada Ueda, Japan; Jean Vanderdonckt, Belgium; Gregg C. Vanderheiden, USA; Gerhard Weber, Germany; Harald Weber, Germany; Panayiotis Zaphiris, Cyprus
Virtual and Mixed Reality
Program Chair: Randall Shumaker

Pat Banerjee, USA; Mark Billinghurst, New Zealand; Charles E. Hughes, USA; Simon Julier, UK; David Kaber, USA; Hirokazu Kato, Japan; Robert S. Kennedy, USA; Young J. Kim, Korea; Ben Lawson, USA; Gordon McK Mair, UK; David Pratt, UK; Albert “Skip” Rizzo, USA; Lawrence Rosenblum, USA; Jose San Martin, Spain; Dieter Schmalstieg, Austria; Dylan Schmorrow, USA; Kay Stanney, USA; Janet Weisenford, USA; Mark Wiederhold, USA
Internationalization, Design and Global Development
Program Chair: P.L. Patrick Rau

Michael L. Best, USA; Alan Chan, Hong Kong; Lin-Lin Chen, Taiwan; Andy M. Dearden, UK; Susan M. Dray, USA; Henry Been-Lirn Duh, Singapore; Vanessa Evers, The Netherlands; Paul Fu, USA; Emilie Gould, USA; Sung H. Han, Korea; Veikko Ikonen, Finland; Toshikazu Kato, Japan; Esin Kiris, USA; Apala Lahiri Chavan, India; James R. Lewis, USA; James J.W. Lin, USA; Rungtai Lin, Taiwan; Zhengjie Liu, P.R. China; Aaron Marcus, USA; Allen E. Milewski, USA; Katsuhiko Ogawa, Japan; Oguzhan Ozcan, Turkey; Girish Prabhu, India; Kerstin Röse, Germany; Supriya Singh, Australia; Alvin W. Yeo, Malaysia; Hsiu-Ping Yueh, Taiwan
Online Communities and Social Computing
Program Chairs: A. Ant Ozok, Panayiotis Zaphiris

Chadia N. Abras, USA; Chee Siang Ang, UK; Peter Day, UK; Fiorella De Cindio, Italy; Heidi Feng, USA; Anita Komlodi, USA; Piet A.M. Kommers, The Netherlands; Andrew Laghos, Cyprus; Stefanie Lindstaedt, Austria; Gabriele Meiselwitz, USA; Hideyuki Nakanishi, Japan; Anthony F. Norcio, USA; Ulrike Pfeil, UK; Elaine M. Raybourn, USA; Douglas Schuler, USA; Gilson Schwartz, Brazil; Laura Slaughter, Norway; Sergei Stafeev, Russia; Asimina Vasalou, UK; June Wei, USA; Haibin Zhu, Canada
Augmented Cognition
Program Chairs: Dylan D. Schmorrow, Cali M. Fidopiastis

Monique Beaudoin, USA; Chris Berka, USA; Joseph Cohn, USA; Martha E. Crosby, USA; Julie Drexler, USA; Ivy Estabrooke, USA; Chris Forsythe, USA; Wai Tat Fu, USA; Marc Grootjen, The Netherlands; Jefferson Grubb, USA; Santosh Mathan, USA; Rob Matthews, Australia; Dennis McBride, USA; Eric Muth, USA; Mark A. Neerincx, The Netherlands; Denise Nicholson, USA; Banu Onaral, USA; Kay Stanney, USA; Roy Stripling, USA; Rob Taylor, UK; Karl van Orden, USA
Digital Human Modeling
Program Chair: Vincent G. Duffy

Karim Abdel-Malek, USA; Giuseppe Andreoni, Italy; Thomas J. Armstrong, USA; Norman I. Badler, USA; Fethi Calisir, Turkey; Daniel Carruth, USA; Keith Case, UK; Julie Charland, Canada; Yaobin Chen, USA; Kathryn Cormican, Ireland; Daniel A. DeLaurentis, USA; Yingzi Du, USA; Okan Ersoy, USA; Enda Fallon, Ireland; Yan Fu, P.R. China; Afzal Godil, USA; Ravindra Goonetilleke, Hong Kong; Anand Gramopadhye, USA; Lars Hanson, Sweden; Pheng Ann Heng, Hong Kong; Bo Hoege, Germany; Hongwei Hsiao, USA; Tianzi Jiang, P.R. China; Nan Kong, USA; Steven A. Landry, USA; Kang Li, USA; Zhizhong Li, P.R. China; Tim Marler, USA; Ahmet F. Ozok, Turkey; Srinivas Peeta, USA; Sudhakar Rajulu, USA; Matthias Rötting, Germany; Matthew Reed, USA; Johan Stahre, Sweden; Mao-Jiun Wang, Taiwan; Xuguang Wang, France; Jingzhou (James) Yang, USA; Gulcin Yucel, Turkey; Tingshao Zhu, P.R. China
Human-Centered Design
Program Chair: Masaaki Kurosu

Julio Abascal, Spain; Simone Barbosa, Brazil; Tomas Berns, Sweden; Nigel Bevan, UK; Torkil Clemmensen, Denmark; Susan M. Dray, USA; Vanessa Evers, The Netherlands; Xiaolan Fu, P.R. China; Yasuhiro Horibe, Japan; Jason Huang, P.R. China; Minna Isomursu, Finland; Timo Jokela, Finland; Mitsuhiko Karashima, Japan; Tadashi Kobayashi, Japan; Seongil Lee, Korea; Kee Yong Lim, Singapore; Zhengjie Liu, P.R. China; Loïc Martínez-Normand, Spain; Monique Noirhomme-Fraiture, Belgium; Philippe Palanque, France; Annelise Mark Pejtersen, Denmark; Kerstin Röse, Germany; Dominique L. Scapin, France; Haruhiko Urokohara, Japan; Gerrit C. van der Veer, The Netherlands; Janet Wesson, South Africa; Toshiki Yamaoka, Japan; Kazuhiko Yamazaki, Japan; Silvia Zimmermann, Switzerland
Design, User Experience, and Usability
Program Chair: Aaron Marcus

Ronald Baecker, Canada; Barbara Ballard, USA; Konrad Baumann, Austria; Arne Berger, Germany; Randolph Bias, USA; Jamie Blustein, Canada; Ana Boa-Ventura, USA; Lorenzo Cantoni, Switzerland; Sameer Chavan, Korea; Wei Ding, USA; Maximilian Eibl, Germany; Zelda Harrison, USA; Rüdiger Heimgärtner, Germany; Brigitte Herrmann, Germany; Sabine Kabel-Eckes, USA; Kaleem Khan, Canada; Jonathan Kies, USA; Jon Kolko, USA; Helga Letowt-Vorbek, South Africa; James Lin, USA; Frazer McKimm, Ireland; Michael Renner, Switzerland; Christine Ronnewinkel, Germany; Elizabeth Rosenzweig, USA; Paul Sherman, USA; Ben Shneiderman, USA; Christian Sturm, Germany; Brian Sullivan, USA; Jaakko Villa, Finland; Michele Visciola, Italy; Susan Weinschenk, USA
HCI International 2013
The 15th International Conference on Human–Computer Interaction, HCI International 2013, will be held jointly with the affiliated conferences in the summer of 2013. It will cover a broad spectrum of themes related to human–computer interaction (HCI), including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. More information about the topics, as well as the venue and dates of the conference, will be announced through the HCI International Conference series website: http://www.hci-international.org/

General Chair
Professor Constantine Stephanidis
University of Crete and ICS-FORTH
Heraklion, Crete, Greece
Email: [email protected]

Table of Contents – Part I
Part I: Design for All Methods and Tools

Visual Mediation Mechanisms for Collaborative Design and Development .... 3
Carmelo Ardito, Barbara Rita Barricelli, Paolo Buono, Maria Francesca Costabile, Antonio Piccinno, Stefano Valtolina, and Li Zhu

Design for the Information Society .... 12
Agata Bonenberg

Classifying Interaction Methods to Support Intuitive Interaction Devices for Creating User-Centered-Systems .... 20
Dirk Burkhardt, Matthias Breyer, Christian Glaser, Kawa Nazemi, and Arjan Kuijper

Evaluation of Video Game Interfaces .... 30
Joyram Chakraborty and Phillip L. Bligh

Emergent Design: Bringing the Learner Close to the Experience .... 36
Joseph Defazio and Kevin Rand

Eliciting Interaction Requirements for Adaptive Multimodal TV Based Applications .... 42
Carlos Duarte, José Coelho, Pedro Feiteira, David Costa, and Daniel Costa

Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications .... 51
Peter Forbrig, Anke Dittmar, Jens Brüning, and Maik Wurdel

A Method to Solve the Communication Gap between Designers and Users .... 61
Jeichen Hsieh, Chia-Ching Lin, and Pao-Tai Hsieh

Teaching the Next Generation of Universal Access Designers: A Case Study .... 70
Simeon Keates

Use-State Analysis to Find Domains to Be Re-designed .... 80
Masami Maekawa and Toshiki Yamaoka

An Approach Towards Considering Users’ Understanding in Product Design .... 90
Anna Mieczakowski, Patrick Langdon, and P. John Clarkson

Evaluation of Expert Systems: The Application of a Reference Model to the Usability Parameter .... 100
Paula Miranda, Pedro Isaias, and Manuel Crisóstomo

Investigating the Relationships between User Capabilities and Product Demands for Older and Disabled Users .... 110
Umesh Persad, Patrick Langdon, and P. John Clarkson

Practical Aspects of Running Experiments with Human Participants .... 119
Frank E. Ritter, Jong W. Kim, Jonathan H. Morgan, and Richard A. Carlson

A Genesis of Thinking in the Evolution of Ancient Philosophy and Modern Software Development .... 129
Stephan H. Sneed

Understanding the Role of Communication and Hands-On Experience in Work Process Design for All .... 139
Christian Stary

Extending Predictive Models of Exploratory Behavior to Broader Populations .... 149
Shari Trewin, John Richards, Rachel Bellamy, Bonnie E. John, Cal Swart, and David Sloan

Digitizing Interaction: The Application of Parameter-Oriented Design Methodology to the Teaching/Learning of Interaction Design .... 159
Shu-Wen Tzeng

A Study on an Usability Measurement Based on the Mental Model .... 168
Yuki Yamada, Keisuke Ishihara, and Toshiki Yamaoka
Part II: Web Accessibility: Approaches, Methods and Tools

Enabling Accessibility Characteristics in the Web Services Domain .... 177
Dimitris Giakoumis, Dimitrios Tzovaras, and George Hassapis

Results from Multi-dimensional Accessibility Assessment .... 187
Rogério Bandeira, Rui Lopes, and Luís Carriço

A Harmonised Methodology for the Components of Software Applications Accessibility and Its Evaluation .... 197
Eleni Chalkia and Evangelos Bekiaris

An Architecture for Multiple Web Accessibility Evaluation Environments .... 206
Nádia Fernandes, Rui Lopes, and Luís Carriço

Overview of 1st AEGIS Pilot Phase Evaluation Results .... 215
Maria Gkemou and Evangelos Bekiaris

An End-User Evaluation Point of View towards OSS Assistive Technology .... 225
Maria Gkemou, Evangelos Bekiaris, and Karel Van Isacker

A Method to Automate the Ranking of Web Pages According to User Defined Accessibility Ratings .... 235
Alice Good

Issues in Web Presentation for Cognitive Accessibility .... 244
Clayton Lewis

A Study of Accessibility Requirements for Media Players on the Web .... 249
Lourdes Moreno, María Gonzalez, Paloma Martínez, and Ana Iglesias

An Accessibility Assessment Framework for Improving Designers Experience in Web Applications .... 258
Theofanis Oikonomou, Nikolaos Kaklanis, Konstantinos Votis, and Dimitrios Tzovaras

A Unified Environment for Accessing a Suite of Accessibility Evaluation Facilities .... 267
Nikolaos Partarakis, Constantina Doulgeraki, Margherita Antona, Theofanis Oikonomou, Nikolaos Kaklanis, Konstantinos Votis, Grammati-Eirini Kastori, and Dimitrios Tzovaras

Introducing TactoWeb: A Tool to Spatially Explore Web Pages for Users with Visual Impairment .... 276
Grégory Petit, Aude Dufresne, and Jean-Marc Robert

Remote Evaluation of WCAG 2.0 Techniques by Web Users with Visual Disabilities .... 285
Christopher Power, Helen Petrie, André P. Freire, and David Swallow

Embedded Cultural Features in the Design of an Accessibility Agent for the Web .... 295
Ingrid Teixeira Monteiro and Clarisse Sieckenius de Souza
Part III: Multimodality, Adaptation and Personalisation

Some Issues Regarding the Design of Adaptive Interface Generation Systems .... 307
Julio Abascal, Amaia Aizpurua, Idoia Cearreta, Borja Gamecho, Nestor Garay, and Raúl Miñón

Search Intention Analysis for User-Centered Adaptive Visualizations .... 317
Dirk Burkhardt, Matthias Breyer, Kawa Nazemi, and Arjan Kuijper

Adaptations Based on Ontology Evolution as a Mean to Exploit Collective Intelligence .... 327
Laura Burzagli, Francesco Gabbanini, and Pier Luigi Emiliani

The Contribution of Multimodal Adaptation Techniques to the GUIDE Interface .... 337
José Coelho and Carlos Duarte

Adapting Multimodal Fission to User’s Abilities .... 347
David Costa and Carlos Duarte

Self-adapting TV Based Applications .... 357
Daniel Costa and Carlos Duarte

A Survey on Guiding Logic for Automatic User Interface Generation .... 365
Gaurav Dubey

Adaptive Multimodal Fusion .... 373
Pedro Feiteira and Carlos Duarte

Intelligent Working Environments for the Ambient Classroom .... 381
Maria Korozi, Stavroula Ntoa, Margherita Antona, and Constantine Stephanidis

Adaptive Interfaces: A Little Learning Is a Dangerous Thing .... 391
Kyle Montague, Vicki L. Hanson, and Andy Cobley

A Novel Design Approach for Multi-device Adaptable User Interfaces: Concepts, Methods and Examples .... 400
Alexandros Mourouzis, Asterios Leonidis, Michalis Foukarakis, Margherita Antona, and Nicos Maglaveras

Cultural Difference in Nonverbal Behaviors in Negotiation Conversations: Towards a Model for Culture-Adapted Conversational Agents .... 410
Fumie Nori, Afia Akhter Lipi, and Yukiko Nakano

An Investigation into a Personalised and Web2.0-Based Search Engine Interface .... 420
Tsai-Hsuan Tsai, Hsien-Tsung Chang, and Shih-Ting Huang
Part IV: eInclusion Policy, Good Practice, Legislation and Security Issues

Cyber Risks to Secure and Private Universal Access .... 433
G. Susanne Bahr, Liam M. Mayron, and Hannah J. Gacey

Towards Information Technology Security for Universal Access .... 443
Shiran Cohen, Noam Ben-Asher, and Joachim Meyer

The Barriers to and Benefits of Use of ICT for People with Visual Impairment .... 452
Kristin Skeide Fuglerud

Supporting a Shared Dialog on Healthcare Policy between Researchers, Practitioners, and the Lay Public: Creating the SpeakHealth Online Community .... 463
David Gurzick, Lee Boot, Stacy Arnold, and Martha Chandler Gurzick

Social Inclusion through Digital Engagement .... 473
Vicki L. Hanson

Legal and Policy Implications of Cloud Computing .... 478
Eve Hill

Technology Support via Telephone – to Elderly and People with Disability .... 484
Lars Johansson and Claes Tjäder

Virtual Structured Dialogic Design as Tool for Analysis of Threats Before Implementing European Dialogues Aiming to Identify R&D Gaps in Assistive ICT .... 492
Yiannis Laouris, Georgina Siita, Patrick Roe, Pier-Luigi Emiliani, and Aleco Christakis

Investigation of Best Practices for Maintaining Section 508 Compliance in U.S. Federal Web Sites .... 498
Jonathan Lazar and Abiodun Olalere

eInclusion: Policies and Concepts Regarding Persons with Disabilities – Considerations about Brazil and Portugal .... 507
Ana Isabel B.B. Paraguay

Creating a Global Public Inclusive Infrastructure .... 517
Gregg Vanderheiden and Jutta Treviranus

Author Index .... 527
Table of Contents – Part II
Part I: User Models, Personas and Virtual Humans

Standardizing User Models .... 3
Pradipta Biswas and Patrick Langdon

Integral Model of the Area of Reaches and Forces of a Disabled Person with Dysfunction of Lower Limbs as a Tool in Virtual Assessment of Manipulation Possibilities in Selected Work Environments .... 12
Bogdan Branowski, Piotr Pohl, Michal Rychlik, and Marek Zablocki

Modeling the Role of Empathic Design Engaged Personas: An Emotional Design Approach .... 22
Robert C.C. Chen, Wen Cing-Yan Nivala, and Chien-Bang Chen

Accessible UI Design and Multimodal Interaction through Hybrid TV Platforms: Towards a Virtual-User Centered Design Framework .... 32
Pascal Hamisu, Gregor Heinrich, Christoph Jung, Volker Hahn, Carlos Duarte, Pat Langdon, and Pradipta Biswas
Modelling Cognitive Impairment to Improve Universal Access . . . . . . . . . Elina Jokisuu, Patrick Langdon, and P. John Clarkson
42
Integrating Human Modeling and Simulation with the Persona Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Taro Kanno, Tomohiko Ooyabu, and Kazuo Furuta
51
User Modeling through Unconscious Interaction with Smart Shop . . . . . . Toshikazu Kato
61
Supporting Inclusive Design of User Interfaces with a Virtual User Model .... 69
Pierre T. Kirisci, Patrick Klein, Markus Modzelewski, Michael Lawo, Yehya Mohamad, Thomas Fiddian, Chris Bowden, Antoinette Fennell, and Joshue O Connor

Virtual User Concept for Inclusive Design of Consumer Products and User Interfaces .... 79
Yehya Mohamad, Carlos A. Velasco, Jaroslav Pullmann, Michael Lawo, and Pierre Kirisci

Modeling Users for Adaptive Semantics Visualizations .... 88
Kawa Nazemi, Dirk Burkhardt, Matthias Breyer, and Arjan Kuijper
An Investigation of a Personas-Based Model Assessment for Experiencing User-Centred Design .... 98
Wen Cing-Yan Nivala, De-Lai Men, Tin-Kai Chen, and Robert C.C. Chen

Numerical Analysis of Geometrical Features of 3D Biological Objects, for Three-Dimensional Biometric and Anthropometric Database .... 108
Michal Rychlik, Witold Stankiewicz, and Marek Morzynski
Part II: Older People in the Information Society

Designing Interactive Pill Reminders for Older Adults: A Formative Study .... 121
Sepideh Ansari
Older User Errors in Handheld Touchscreen Devices: To What Extent Is Prediction Possible? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Bradley, Patrick Langdon, and P. John Clarkson
131
Affective Technology for Older Adults: Does Fun Technology Affect Older Adults and Change Their Lives? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ryoko Fukuda
140
Muntermacher – “Think and Move” Interface and Interaction Design of a Motion-Based Serious Game for the Generation Plus . . . . . . . . . . . . . Holger Graf, Christian Tamanini, and Lukas Geissler
149
Preliminary Framework for Studying Self-reported Data in Electronic Medical Records within a Continuing Care Retirement Community . . . . . Kelley Gurley and Anthony F. Norcio
159
Using Motion-Sensing Remote Controls with Older Adults .... Thomas von Bruhn Hinné and Simeon Keates
166
Design Lessons for Older Adult Personal Health Records Software from Older Adults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Juan Pablo Hourcade, Elizabeth A. Chrischilles, Brian M. Gryzlak, Blake M. Hanson, Donald E. Dunbar, David A. Eichmann, and Ryan R. Lorentzen Design and Development a Social Networks Platform for Older People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chien-Lung Hsu, Kevin C. Tseng, Chin-Lung Tseng, and Boo-Chen Liu In Search of Information on Websites: A Question of Age? . . . . . . . . . . . . Eug`ene Loos
176
186
196
Preliminary Findings of an Ethnographical Research on Designing Accessible Geolocated Services with Older People .... Valeria Righi, Guiller Malón, Susan Ferreira, Sergio Sayago, and Josep Blat
205
214
224
234
243
250
Part III: Designing for Users Diversity

Disable Workstation Development: A Multicompetence Approach to Human Behaviour Analysis .... Giuseppe Andreoni, Fiammetta Costa, Carlo Frigo, Sabrina Muschiato, Esteban Pavan, Laura Scapini, and Maximiliano Romero
263
271
281
291
An Information Theoretic Mouse Trajectory Measure . . . . . . . . . . . . . . . . . Samuel Epstein, Eric S. Missimer, and Margrit Betke Comparative Study between AZERTY-Type and K-Hermes Virtual Keyboards Dedicated to Users with Cerebral Palsy . . . . . . . . . . . . . . . . . . . Yohan Guerrier, Maxime Baas, Christophe Kolski, and Franck Poirier
301
310
New Trends in Non-visual Interaction - Sonification of Maps . . . . . . . . . . . Vidas Lauruska
320
Opportunities in Cloud Computing for People with Cognitive Disabilities: Designer and User Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . Clayton Lewis and Nancy Ward
326
Adaptive Mouse-Replacement Interface Control Functions for Users with Disabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . John J. Magee, Samuel Epstein, Eric S. Missimer, Christopher Kwan, and Margrit Betke A-Cross: An Accessible Crossword Puzzle for Visually Impaired Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stavroula Ntoa, Ilia Adami, Giannis Prokopiou, Margherita Antona, and Constantine Stephanidis
332
342
Access-a-WoW: Building an Enhanced World of Warcraft™ UI for Persons with Low Visual Acuity .... G. Michael Poor, Thomas J. Donahue, Martez E. Mott, Guy W. Zimmerman, and Laura Marie Leventhal
352
Audiopolis, Navigation through a Virtual City Using Audio and Haptic Interfaces for People Who Are Blind .... Jaime Sánchez and Javiera Mascaró
362
Implications of Cloud Computing for People with Cognitive Disabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James Sullivan, Clayton Lewis, and Jeffery Hoehl
372
Website Design and Usability Assessment Implications from a Usability Study with Visually Impaired Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sarah J. Swierenga, Jieun Sung, Graham L. Pierce, and Dennis B. Propst Disabled Youth in Sport Rivalry: What Are the Trends – Virtual or Real Competition? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Katarzyna Ujma-Wasowicz Advances in Game Accessibility from 2005 to 2010 . . . . . . . . . . . . . . . . . . . Thomas Westin, Kevin Bierre, Dimitris Gramenos, and Michelle Hinn
382
390 400
Tactics Choice Behaviors Represented in a Programming Language in the Map Tracing Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nobuhito Yamamoto, Syoko Shiroma, and Tomoyuki Nishioka
410
Part IV: Cultural and Emotional Aspects

Age-Related Accessibility Biases in Pass-Face Recognition .... Ray Adams, Gisela Susanne Bahr, and Ejder Sevgen Raif
423
Affective Climate of Workplace and Its Contributing Factors . . . . . . . . . . Waratta Authayarat and Hiroyuki Umemuro
432
Learning Culture-Specific Dialogue Models from Non Culture-Specific Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kallirroi Georgila and David Traum
440
Dialog Behaviors across Culture and Group Size . . . . . . . . . . . . . . . . . . . . . David Herrera, David Novick, Dusan Jan, and David Traum
450
Life in Affective Reality: Identification and Classification of Smiling in Early Childhood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fumito Kawakami and Akifumi Tokosumi
460
Investigation of Users’ Reactions toward Various Kinds of Artificial Agents: Comparison of an Robotic Agent with an On-screen Agent . . . . . Takanori Komatsu, Yuuki Seki, Ryohei Sasama, Tomoharu Yamaguchi, and Keiji Yamada Sense of Presence in a Robotic Telepresence Domain . . . . . . . . . . . . . . . . . Annica Kristoffersson, Silvia Coradeschi, Kerstin Severinson Eklundh, and Amy Loutfi Exploration of the Cultural Image of Chinese Form Using Culture Identity Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ying-Jye Lee and Cheih-Ying Chen Museum Exhibit Content Recommendation and Guidance System Focusing on Experience Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ding-Bang Luh, Chih-Lin Chiang, Ssu-Ling Huang, and Tsai-Lin Yang
470
479
488
498
General Factors That Elicit Human Affect across Countries . . . . . . . . . . . Qin Tang and Hiroyuki Umemuro
508
Affective Technology through Affective Management . . . . . . . . . . . . . . . . . . Hiroyuki Umemuro
513
Do Hedonic and Eudaimonic Well-Being of Online Shopping Come from Daily Life Experience? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jia Zhang and Hiroyuki Umemuro
519
Part V: Eye Tracking, Gestures and Brain Interfaces Eye Tracking and Universal Access: Three Applications and Practical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Bartels and Sandra P. Marshall Interpreting 3D Faces for Augmented Human-Computer Interaction . . . . Marinella Cadoni, Enrico Grosso, Andrea Lagorio, and Massimo Tistarelli
525 535
Social Environments, Mixed Communication and Goal-Oriented Control Application Using a Brain-Computer Interface .... Günter Edlinger and Christoph Guger
545
Tactile Hand Gesture Recognition through Haptic Feedback for Affective Online Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hae Youn Joung and Ellen Yi-Luen Do
555
Gesture-Based User Interfaces for Public Spaces . . . . . . . . . . . . . . . . . . . . . Andreas Kratky
564
Towards Standardized User and Application Interfaces for the Brain Computer Interface .... Paul McCullagh, Melanie Ware, Alex McRoberts, Gaye Lightbody, Maurice Mulvenna, Gerry McAllister, José Luis González, and Vicente Cruz Medina
573
Head Movements, Facial Expressions and Feedback in Danish First Encounters Interactions: A Culture-specific Analysis . . . . . . . . . . . . . . . . . . Patrizia Paggio and Costanza Navarretta
583
EEG-Based Personalized Digital Experience . . . . . . . . . . . . . . . . . . . . . . . . . Olga Sourina, Yisi Liu, Qiang Wang, and Minh Khoa Nguyen
591
Perspectives on User Experience Evaluation of Brain-Computer Interfaces .... Bram van de Laar, Hayrettin Gürkök, Danny Plass-Oude Bos, Femke Nijboer, and Anton Nijholt
600
610
620
Adaboost with SVM-Based Classifier for the Classification of Brain Motor Imagery Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jue Wang, Lin Gao, Haoshi Zhang, and Jin Xu AVIN (Assisted Visual Interactive Notepad): A Novel Interface Design to Expedite the Eye Writing Experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xianjun Sam Zheng, Stuart Goose, Joeri Kiekebosch, and James Jeng-Weei Lin Online BCI Implementation of High-Frequency Phase Modulated Visual Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Danhua Zhu, Gary Garcia-Molina, Vojkan Mihajlovi´c, and Ronald M. Aarts Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
629
635
645
655
Table of Contents – Part III
Part I: Universal Access in the Mobile Context

Results of the Technical Validation of an Accessible Contact Manager for Mobile Devices .... Jon Azpiroz, Juan Bautista Montalvá Colomer, María Fernanda Cabrera-Umpiérrez, María Teresa Arredondo, and Julio Gutiérrez
3
12
BrailleTouch: Mobile Texting for the Visually Impaired . . . . . . . . . . . . . . . Brian Frey, Caleb Southern, and Mario Romero
19
Supporting Universal Usability of Mobile Software: Touchscreen Usability Meta-Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vlado Glavinic, Sandi Ljubic, and Mihael Kukec
26
Mobile Technologies for Promoting Health and Wellness among African American Youth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Donovan Hill, Jasmine Blunt, Terrence Pugh, Monika Monk, Ji-Sun Kim, Woodrow W. Winchester III, D. Scott McCrickard, Paul Estabrooks, and Felicia Doswell
36
Privacy, Security and Interoperability of Mobile Health Applications . . . . Josette F. Jones, Sara A. Hook, Seong C. Park, and LaSha M. Scott
46
GeoDrinking: How to Extract Value from an Extended Social Wine Drinking Experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alessandro Marcengo and Amon Rapp
56
Enhancing Mobile Interaction using WLAN Proximity . . . . . . . . . . . . . . . . W. Narzt and H. Schmitzberger
66
Tracking Observations of Everyday Living with Smart Phones . . . . . . . . . Michelle Rogers
76
Effect of Protective Coating on the Performance of Wearable Antennas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Minyoung Suh, Kate Carroll, and William Oxenham
84
The Effects of Font Size and Page Presentation Method of E-Book Reading on Small Screens for Older Adults . . . . . . . . . . . . . . . . . . . . . . . . . . Wang-Chin Tsai, Yi-Lin Ro, Ya-Tzu Chang, and Chang-Franw Lee Product Form Feature Selection for Mobile Phone Design Using LS-SVR and ARD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chih-Chieh Yang, Meng-Dar Shieh, Kuang-Hsiung Chen, and Pei-Ju Lin Mobile Wikipedia: A Case Study of Information Service Design for Chinese Teenagers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jia Zhou, P.L. Patrick Rau, Christoph Rohmer, Jie Zhou, Christophe Ghalayini, and Felix Roerig
94
102
112
Part II: Ambient Assisted Living and Smart Environments

A Method with Triaxial Acceleration Sensor for Fall Detection of the Elderly in Daily Activities .... Nan Bao, Cong Feng, Yan Kang, Lisheng Xu, Yuhang Du, Lei Zhang, Feifei Yang, and Qingchao Li
121
131
141
Detection of Wheelchair User Activities Using Wearable Sensors . . . . . . . Dan Ding, Shivayogi Hiremath, Younghyun Chung, and Rory Cooper
145
Universal Access in Ambient Intelligent Environments: A Research Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pier Luigi Emiliani, Laura Burzagli, and Francesco Gabbanini
153
Mobile Interfaces for Better Living: Supporting Awareness in a Smart Home Environment .... Denis Gračanin, D. Scott McCrickard, Arthur Billingsley, Roosevelt Cooper, Tavon Gatling, Erik J. Irvin-Williams, Felicia Osborne, and Felicia Doswell
163
Design and Development of Four Prototype Interactive Edutainment Exhibits for Museums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dimitris Grammenos, Xenophon Zabulis, Damien Michel, Thomas Sarmis, Giannis Georgalis, Konstantinos Tzevanidis, Antonis Argyros, and Constantine Stephanidis Informatics as Semiotics Engineering: Lessons learned from Design, Development and Evaluation of Ambient Assisted Living Applications for Elderly People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andreas Holzinger, Gig Searle, Andreas Auinger, and Martina Ziefle
173
183
iAWN: Designing Smart Artifacts for Sustainable Awareness . . . . . . . . . . Taysheng Jeng, Yu-Pin Ma, and Yang-Ting Shen
193
A System for Enhanced Situation Awareness with Outdoor Augmented Reality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jan A. Neuh¨ ofer and Thomas Alexander
203
Implementation of the ISO/IEC 24756 for the Interaction Modeling of an AAL Space .... Pilar Sala, Carlos Fernandez, Juan Bautista Mocholí, Pablo Presencia, and Juan Carlos Naranjo
210
220
230
240
Part III: Driving and Interaction

Towards an Integrated Adaptive Automotive HMI for the Future .... Angelos Amditis, Katia Pagle, Gustav Markkula, and Luisa Andreone
253
Lessons Learned Regarding Simulator Sickness in Older Adult Drivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nicholas D. Cassavaugh, Joshua E. Domeyer, and Richard W. Backs
263
Design of Human Computer Interfaces for Highly Automated Vehicles in the EU-Project HAVEit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Frank Flemisch, Anna Schieben, Nadja Schoemig, Matthias Strauss, Stefan Lueke, and Anna Heyden
270
Towards User-Centred Development of Integrated Information, Warning, and Intervention Strategies for Multiple ADAS in the EU Project interactIVe .... Tobias Hesse, Johan Engström, Emma Johansson, Giuseppe Varalda, Martin Brockmann, Amon Rambaldini, Nicola Fricke, Frank Flemisch, Frank Köster, and Lena Kanstrup
280
The Comparison of Different Sensory Outputs on the Driving Overtake Alarm System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yu-Chun Huang, Chia-Jung Tsai, Jo-Yu Kuo, and Fong-Gong Wu
290
I Can’t Hear You? Drivers Interacting with Male or Female Voices in Native or Non-native Language .... Ing-Marie Jonsson and Nils Dahlbäck
298
Monitoring User Distraction in a Car by Segmentation of Experimental Data .... Tomáš Macek, Martin Labský, Jan Kleindienst, and Hana Trusková
306
On-Road Pilot Study on the Need for Integrated Interfaces of In-Vehicle Driver Support Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evangelia Portouli, Vassilis Papakostopoulos, and Nicolas Marmaras
316
Part IV: Interactive Technologies in the Physical and Built Environment

New Design – Integration of Art and Technology .... Wojciech Bonenberg
329
338
Evolution of Domestic Kitchen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jerzy Charytonowicz and Dzoana Latala
348
Perception and Illusion in Interior Design . . . . . . . . . . . . . . . . . . . . . . . . . . . Anna Jaglarz
358
Spaces of Mutable Shape and the Human Ability to Adapt . . . . . . . . . . . . Katarzyna Kubsik
365
Using a Visual Assistant to Travel Alone within the City .... Yves Lachapelle, Dany Lussier-Desrochers, Martin Caouette, and Martin Therrien-Bélec
372
The Computer – A Tool That Makes Human Environment: Technosphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Beata Majerska-Palubicka
378
Machinery Design for Construction Safety in Practice . . . . . . . . . . . . . . . . . Beata Mrugalska and Aleksandra Kawecka-Endler
388
The Design and Manufacture of Functional Micro-stationary PCR Chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jinquan Nie, Yulong Zhao, Yimin Liu, Keyin Liu, and Niancai Peng
398
Thermal Preparation of Food and Its Influence on Shaping the Old and Modern Kitchen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Przemyslaw Nowakowski
405
mGuides, Design and Usability of a Mobile System to Assist Learning in Critical Situations .... Jaime Sánchez and Matías Espinoza
415
Smart Cities, Ambient Intelligence and Universal Access . . . . . . . . . . . . . . Norbert A. Streitz
425
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
433
Table of Contents – Part IV
Part I: Speech, Communication and Dialogue

Greek Verbs and User Friendliness in the Speech Recognition and the Speech Production Module of Dialog Systems for the Broad Public .... Christina Alexandris and Ioanna Malagardi
3
Intercultural Dynamics of Fist Acquaintance: Comparative Study of Swedish, Chinese and Swedish-Chinese First Time Encounters .... Jens Allwood, Nataliya Berbyuk Lindström, and Jia Lu
12
Greek WordNet and Its Extension with Terms of the Computer Science Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Panagiotis Blitsas and Maria Grigoriadou
22
An Experimental Study of the Use of Multiple Humanoid Robots as a Social Communication Medium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kotaro Hayashi, Takayuki Kanda, Hiroshi Ishiguro, Tsukasa Ogasawara, and Norihiro Hagita
32
A Multitasking Approach to Adaptive Spoken Dialogue Management . . . Tobias Heinroth, Dan Denich, and Wolfgang Minker
42
From Clouds to Rain: Consolidating and Simplifying Online Communication Services with Easy One Communicator . . . . . . . . . . . . . . . Jeffery Hoehl and Gregg Vanderheiden
52
Use of Speech Technology in Real Life Environment . . . . . . . . . . . . . . . . . . Ruimin Hu, Shaojian Zhu, Jinjuan Feng, and Andrew Sears
62
Metagnostic Deductive Question Answering with Explanation from Texts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . John Kontos, Joseph Armaos, and Ioanna Malagardi
72
Collecting an American Sign Language Corpus through the Participation of Native Signers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pengfei Lu and Matt Huenerfauth
81
Fuzzy-Logic Controller for Speaker-Independent Speech Recognition System in Computer Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Moyen Mohammad Mustaquim
91
Building of Turn-Taking Avatars That Express Utterance Attitudes: A Social Scientific Approach to Behavioral Design of Conversational Agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masahide Yuasa and Naoki Mukawa
101
Part II: Interacting with Documents and Images

Visual Design Elements of Photos on Taiwanese Female Clothing On-line Sales . . . Ro-Han Chang, Tsai-Rong Yu, and Yao-Cheng Tsai . . . 111
Web Application for Analysis, Manipulation and Generation of Accessible PDF Documents . . . Alireza Darvishy, Hans-Peter Hutter, and Oliver Mannhart . . . 121
Survey: Improving Document Accessibility from the Blind and Visually Impaired User's Point of View . . . Martin Dorigo, Bettina Harriehausen-Mühlbauer, Ingo Stengel, and Paul S. Dowland . . . 129
Readable Image for the Visually Impaired . . . Sunil Kumar Kopparapu . . . 136
Camera Canvas: Image Editing Software for People with Disabilities . . . Christopher Kwan and Margrit Betke . . . 146
Exact Compensation of Color-Weakness with Discrimination Threshold Matching . . . Rika Mochizuki, Satoshi Oshima, Reiner Lenz, and Jinhui Chao . . . 155
Collaborative Editing for All: The Google Docs Example . . . Giulio Mori, Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, and Victor M.R. Penichet . . . 165
Acoustic Modeling of Dialogue Elements for Document Accessibility . . . Pepi Stavropoulou, Dimitris Spiliotopoulos, and Georgios Kouroupetroglou . . . 175

Part III: Universal Access in Complex Working Environments

Seeing the Wood for the Trees Again! SMART – A Holistic Way of Corporate Governance Offering a Solution Ready to Use . . . Fritz Bastarz and Patrick Halek . . . 187
Using Human Service Center Interfaces and Their Information to Foster Innovation Management . . . Klaus-Peter Faehnrich, Kyrill Meyer, and Benjamin Udo Strehl . . . 195
Key Features of Subject-Oriented Modeling and Organizational Deployment Tools . . . Albert Fleischmann and Christian Stary . . . 205
The Effect of a GPS on Learning with Regards to Performance and Communication in Municipal Crisis Response . . . Helena Granlund, Rego Granlund, and Nils Dahlbäck . . . 215
Crisis Management Training: Techniques for Eliciting and Describing Requirements and Early Designs across Different Incident Types . . . Ebba Thora Hvannberg and Jan Rudinsky . . . 225
Development of Mobile Evacuation Guides for Travellers and Rescue Personnel . . . Viveca Jimenez-Mixco, Jan Paul Leuteritz, Eugenio Gaeta, Maria Fernanda Cabrera-Umpierrez, María Teresa Arredondo, Harald Widlroither, and Mary Panou . . . 235
Stakeholder-Driven Business Process Management: "An Evaluation of the Suitability, Adequacy and Effectiveness of Quality and Process Management" . . . Michael Jungen . . . 244
Evaluation of a Mobile AR Tele-Maintenance System . . . Michael Kleiber and Thomas Alexander . . . 253
Knowledge in Digital Decision Support System . . . Erika Matsak and Peeter Lorents . . . 263
Examining the Current State of Group Support Accessibility: A Focus Group Study . . . John G. Schoeberlein and Yuanqiong (Kathy) Wang . . . 272
BioSEME: A Semantically-Integrated Content Management System for Cancer Research . . . Ionut Mihai Stancu . . . 282
Age Dependent Differences in the Usage of a Desktop VR System for Air Force Mission Planning and Preparation . . . Carsten Winkelholz and Michael Kleiber . . . 292
A Concept for User-Centered Development of Accessible User Interfaces for Industrial Automation Systems and Web Applications . . . Farzan Yazdi, Helmut Vieritz, Nasser Jazdi, Daniel Schilberg, Peter Göhner, and Sabina Jeschke . . . 301

Part IV: Well-Being, Health and Rehabilitation Applications

Forms of Interaction in Virtual Space: Applications to Psychotherapy and Counselling . . . Shagun Chawla and Nigel Foreman . . . 313
Control of Powered Prosthetic Hand Using Multidimensional Ultrasound Signals: A Pilot Study . . . Xin Chen, Siping Chen, and Guo Dan . . . 322
A Top-k Analysis Using Multi-level Association Rule Mining for Autism Treatments . . . Kelley M. Engle and Roy Rada . . . 328
Pregnancy Test for the Vision-Impaired Users . . . Tereza Hyková, Adam J. Sporka, Jan Vystrčil, Martin Klíma, and Pavel Slavík . . . 335
Effect of Spinal Cord Injury on Nonlinear Complexity of Skin Blood Flow Oscillations . . . Yih-Kuen Jan, Fuyuan Liao, and Stephanie Burns . . . 345
Developing Protégé to Structure Medical Report . . . Josette Jones, Kanitha Phalakornkule, Tia Fitzpatrick, Sudha Iyer, and C. Zorina Ombac . . . 356
Increasing Physical Activity by Implementing a Behavioral Change Intervention Using Pervasive Personal Health Record System: An Exploratory Study . . . Hadi Kharrazi and Lynn Vincz . . . 366
Exploring Health Website Users by Web Mining . . . Wei Kong and Josette Jones . . . 376
Double Visual Feedback in the Rehabilitation of Upper Limb . . . Liu Enchen, Sui Jianfeng, and Ji Linhong . . . 384
Can User Tagging Help Health Information Seekers? . . . Malika Mahoui, Josette Jones, Andrew Meyerhoff, and Syed Ahmed Toufeeq . . . 389
Interactive Medical Volume Visualizations for Surgical Online Applications . . . Konrad Mühler, Mathias Neugebauer, and Bernhard Preim . . . 398
Age-Adapted Psychoacoustics: Target Group Oriented Sound Schemes for the Interaction with Telemedical Systems . . . Alexander Mertens, Philipp Przybysz, Alexander Groß, David Koch-Koerfges, Claudia Nick, Martin Kaethner, and Christopher M. Schlick . . . 406
Bringing the Home into the Hospital: Assisting the Pre-Discharge Home Visit Process Using 3D Home Visualization Software . . . Arthur G. Money, Anne McIntyre, Anita Atwal, Georgia Spiliotopoulou, Tony Elliman, and Tim French . . . 416
Design of a Paired Patient-Caregiver Support Application for People Coping with Cancer . . . Christine M. Newlon, Robert Skipworth Comer, Kim Wagler-Ziner, and Anna M. McDaniel . . . 427
Importance of Physical Interaction between Human and Robot for Therapy . . . Takanori Shibata . . . 437
The Relationships between Morphology and Work for the Nursing Performance of Hand Controls in Emergency Surgery . . . Chao-Yuan Tseng and Fong-Gong Wu . . . 448
Upper Limb Contralateral Physiological Characteristic Evaluation for Robot-Assisted Post Stroke Hemiplegic Rehabilitation . . . Lap-Nam Wong, Qun Xie, and Linhong Ji . . . 458
A Case Study of the Design and Evaluation of a Persuasive Healthy Lifestyle Assistance Technology: Challenges and Design Guidelines . . . Jie Xu, Ping-yu Chen, Scott Uglow, Alison Scott, and Enid Montague . . . 464
Novel Human-Centered Rehabilitation Robot with Biofeedback for Training and Assessment . . . Runze Yang, Linhong Ji, and Hongwei Chen . . . 472
Intensity Analysis of Surface Myoelectric Signals from Lower Limbs during Key Gait Phases by Wavelets in Time-Frequency . . . Jiangang Yang, Xuan Gao, Baikun Wan, Dong Ming, Xiaoman Cheng, Hongzhi Qi, Xingwei An, Long Chen, Shuang Qiu, and Weijie Wang . . . 479
Handle Reaction Vector Analysis with Fuzzy Clustering and Support Vector Machine during FES-Assisted Walking Rehabilitation . . . Weixi Zhu, Dong Ming, Baikun Wan, Xiaoman Cheng, Hongzhi Qi, Yuanyuan Chen, Rui Xu, and Weijie Wang . . . 489

Part V: Universal Access to Education and Learning

From "Reading" Math to "Doing" Math: A New Direction in Non-visual Math Accessibility . . . Nancy Alajarmeh, Enrico Pontelli, and Tran Son . . . 501
Accessible Education for Autistic Children: ABA-Based Didactic Software . . . Silvia Artoni, Maria Claudia Buzzi, Marina Buzzi, Claudia Fenili, and Simona Mencarini . . . 511
Educational Impact of Structured Podcasts on Blind Users . . . Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, and Giulio Mori . . . 521
A New Structure of Online Learning Environment to Support the Professional Learning . . . Wenzhi Chen and Yu-How Lin . . . 530
Behaviour Computer Animation, Communicability and Education for All . . . Francisco V. Cipolla Ficarra, Jacqueline Alma, and Miguel Cipolla-Ficarra . . . 538
An Intelligent Task Assignment and Personalization System for Students' Online Collaboration . . . Asterios Leonidis, George Margetis, Margherita Antona, and Constantine Stephanidis . . . 548
Using Interface Design with Low-Cost Interactive Whiteboard Technology to Enhance Learning for Children . . . Chien-Yu Lin, Fong-Gong Wu, Te-Hsiung Chen, Yan-Jin Wu, Kenendy Huang, Chia-Pei Liu, and Shu-Ying Chou . . . 558
Integration of a Spanish-to-LSE Machine Translation System into an e-Learning Platform . . . Fernando López-Colino, Javier Tejedor, Jordi Porta, and José Colás . . . 567
Towards Ambient Intelligence in the Classroom . . . George Margetis, Asterios Leonidis, Margherita Antona, and Constantine Stephanidis . . . 577
Learning Styles and Navigation Patterns in Web-Based Education . . . Jelena Nakić, Nikola Marangunić, and Andrina Granić . . . 587
Phynocation: A Prototyping of a Teaching Assistant Robot for C Language Class . . . Akihiro Ogino, Haruaki Tamada, and Hirotada Ueda . . . 597
A Generic OSGi-Based Model Framework for Delivery Context Properties and Events . . . Jaroslav Pullmann, Yehya Mohamad, Carlos A. Velasco, and Stefan P. Carmien . . . 605
Inclusive Scenarios to Evaluate an Open and Standards-Based Framework That Supports Accessibility and Personalisation at Higher Education . . . Alejandro Rodriguez-Ascaso, Jesus G. Boticario, Cecile Finat, Elena del Campo, Mar Saneiro, Eva Alcocer, Emmanuelle Gutiérrez y Restrepo, and Emanuela Mazzone . . . 612
Relationship between BPM Education and Business Process Solutions: Results of a Student Contest . . . Werner Schmidt . . . 622
Design of a Multi-interface Creativity Support Tool for the Enhancement of the Creativity Process . . . George A. Sielis, George A. Papadopoulos, and Andrina Granić . . . 632
Shadow Expert Technique (SET) for Interaction Analysis in Educational Systems . . . Christian Stickel, Martin Ebner, and Andreas Holzinger . . . 642
On Interactive Interfaces for Learning Chinese Character Writing Online . . . Manlai You, Yu-Jie Xu, and Cherng Chin . . . 652
Personalization in Mobile Learning for People with Special Needs . . . Saeed Zare . . . 662
Intelligent Provision of Semantic Web Services in Educational Systems . . . Marija Zelić, Marko Rosić, and Vlado Glavinić . . . 670
Author Index . . . 681
Visual Mediation Mechanisms for Collaborative Design and Development

Carmelo Ardito¹, Barbara Rita Barricelli², Paolo Buono¹, Maria Francesca Costabile¹, Antonio Piccinno¹, Stefano Valtolina², and Li Zhu²

¹ Dipartimento di Informatica, Università degli Studi di Bari, Via Orabona 4, 70125 Bari, Italy
{ardito,buono,costabile,piccinno}@di.uniba.it
² Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano, Via Comelico 39/41, 20135 Milano, Italy
{barricelli,valtolin,zhu}@dico.unimi.it
Abstract. Collaborative design involving end users has emerged as a response to the need felt by various organizations to adapt software to specific environments and users. Over time, users and environments evolve; this is another reason why software has to be modified. Different stakeholders, including consultants, designers internal to the organization and, more recently, end users, have to collaborate among themselves, and possibly with the software providers, to shape software. These stakeholders face fundamental challenges in learning how to communicate and in building a shared understanding. Researchers are now addressing such challenges. This paper contributes to this line of research by formally defining visual mediation mechanisms for collaborative design. A case study illustrating their application is discussed.

Keywords: collaborative design, mediation mechanisms, end-user development, meta-design, communities of practice.
1 Introduction

The evolution of interactive software has created the need for collaborative design involving stakeholders with different backgrounds and expertise. Software products must be modified in order to be adapted to specific environments and users. The software should also be modified at use time to comply with changes in users and environments. Product developers already cooperate with end users and implementation consultants when developing or adapting software. But we are now witnessing a paradigm shift in the way software is designed, developed, used and evolved. The report from a study commissioned by the US Department of Defense to define a research agenda on 'Ultra-Large-Scale Systems' (ULSS report) confirms this change and states that software will evolve during usage at all levels: system software, middleware, and the applications themselves [1]. Development activities will be distributed over the whole lifecycle and initiated by various stakeholders, including end users. The report also uses the term socio-technical ecosystem, which indicates the complex mutual dependencies among different parts of the software development
infrastructure. Researchers are now exploring different ways of effectively supporting the collaborative activities of the diverse stakeholders in such ecosystems. Particular attention is devoted to the development of software environments and tools that enable end users to participate effectively in this collaboration and in adapting software at use time. Recent developments based on Web 2.0 and the Semantic Web, such as weblogs, podcasts, RSS feeds, and social software like wikis and social networks, are already examples of collaborative design environments that permit user-generated content. Fischer refers to a new world based on cultures of participation, in which end users evolve from passive software consumers to active producers [2] and are involved in various activities of end-user development [3]. In several development practices, communication and collaboration with end users take place through channels that are separated from the actual software, e.g., phone or e-mail, thus limiting end users' participation in collaborative design. Another major problem is that stakeholders are very diverse, characterized by different cultures and skills; they use different languages and notations and adopt different documentation styles, i.e., they belong to different Communities of Practice (CoPs). Following Wenger, we refer to a CoP as a group of people who share a common practice and address a common set of problems [4]. CoPs develop their own languages and notations to express and communicate their knowledge, problems and solutions. Examples of CoPs are software developers, software consultants, and end users. The CoPs involved in the design, development and evolution of a certain software system form a Community of Interest (CoI), defined in [5] as a community of communities brought together to solve a problem of common concern. CoIs stress the importance of combining voices from different CoPs; however, they face fundamental challenges in communicating across CoPs and in building a shared understanding, which is the basis for their collaboration. Members of the CoI keep collaborating at use time, whenever the software needs to be modified or evolved; this calls for technical means to relate and integrate users' and developers' views in order to provide a seamless way of moving between use and design of software, facilitating its adaptation to users' needs and environments. Such technical means include new modeling languages, architectures that support multilevel design and development, and also mediation mechanisms, which permit communication between professional developers' environments and end users' environments across the ecosystem. The novel contribution of this paper is the formal definition of visual mediation mechanisms supporting collaborative design. Their application to a case study concerning a Web portal that advertises the products of shops of various kinds is also discussed. The paper is organized as follows. Section 2 discusses the mediation process. Section 3 presents the formal specification of visual mediation mechanisms, and Section 4 illustrates how they are applied to a case study. Finally, Section 5 reports conclusions.
2 Mediation Process

A mediation process allows two human actors (henceforth simply actors) to reach a common understanding, related to a specific domain, with the support of an agent, the mediator [6].
Wiederhold defined a mediator as "a software module that exploits encoded knowledge about certain sets or subsets of data to create information for a higher layer of applications" [7]. The concept of mediator has been used in the field of Web services to manage interoperability between software agents and Web services, or among Web services themselves [8], [9]. In the collaborative design context analyzed in this paper, the mediation process consists of exchanging messages between two actors playing a certain role in the collaboration [10]. These actors are generally members of different CoPs; they use dedicated interactive environments to reason about and perform their activities. Similarly to what is described in [11], each environment exploits an interaction language that complies with the CoP's notations, culture and role in the collaboration, in order to be usable by the CoP's members. Each environment is equipped with an engine that acts as a mediator by translating incoming messages into the CoP's interaction language. The first two steps of a generic mediation process are illustrated in Fig. 1. The human actor H1 sends a message (Message1) to another human actor H2. Before reaching H2, Message1 is captured and managed by "Mediator2" (the engine of H2's interactive environment, which acts as mediator for it), which, by exploiting the Knowledge Base (KB) for the current domain, translates it into the interaction language used by H2, so that H2 can understand it. The translated message, Message1', is then delivered to H2. In the second step represented in Fig. 1, H2 replies to H1's message by sending a new message (Message2) to H1. As in the first step, the message is captured and managed by "Mediator1" (the mediator of H1's interactive environment), which, by exploiting the knowledge base, translates it into the interaction language used by H1. The translated message (Message2') is then delivered to H1. The CoPs' environments support the actors involved in the communication process by allowing them to create and exchange boundary objects and annotations on them. Boundary objects are artifacts of the interactive system; they are part of the message sent by a human actor (a member of a CoP), but are received and interpreted differently by another human actor according to the background and expertise of the CoP s/he belongs to [12]. Boundary objects are used all the time to sustain communication in face-to-face collaboration; for example, blueprints, sketches and drawings are used during discussions in design engineering; similarly, digital images
Fig. 1. Mediation process between two human actors (H1 and H2)
are used during discussions in medicine and the natural sciences. Discussants may express their annotations on a boundary object by using sticky notes. Turning to interactive software systems, boundary objects are used to support computer-mediated communication among members of CoIs who are geographically distributed and work asynchronously on the same project. The diversity of the members of the CoI is addressed by making available boundary objects that are adaptable to the various cultures and contexts of use. The members of the CoI that constitute the design team interact and negotiate a concept by exchanging messages based on boundary objects as a concrete representation of what they mean. Boundary objects serve as externalizations that capture distinct domains of human knowledge and hold the potential to increase socially shared cognition and practice [13], [14]. They carry information and context and can be used to translate, transfer and transform knowledge between the CoI's members [12]. These objects are dynamic; they can be changed and manipulated to carry more information. In a collaborative design context, the effectiveness of a boundary object is directly related to how well it is translated from tacit knowledge to explicit knowledge, and back from explicit to tacit knowledge, between different actors [15]. Since the information carried by boundary objects can be implicit, annotations allow each actor to explicitly explain the modifications s/he introduces in the boundary objects.
3 Mediation Mechanisms

The elements involved in a mediation process constitute a Mediation Mechanism (MM), defined as:

MM = (Mediator, KB, MVL)

where:
- Mediator is the agent that supports the two human actors in reaching an agreement through a mediation process;
- KB is the knowledge base accumulated in the specific domain in which the actors collaborate;
- MVL (Mediation Visual Language) is the visual language constituted by the set of messages exchanged by the two human actors by means of the Mediator.

MVL is defined as follows:

MVL = {MVLmsg1, …, MVLmsgn}

An MVLmsg is a message defined as

MVLmsg = ⟨Data, Metadata⟩

Data describe the object of the mediation; metadata specify some characteristics of the sending and receiving actors and of the digital platform used in the communication. When an actor sends a message, a mediation process starts. Such a message is constituted as follows.
• Data:
─ EP, an executable program: the software artifact (or part of it) that the two actors are collaboratively designing;
─ A, the annotation that the sender attached to EP in order to communicate with the receiver.
• Metadata:
─ S, the profile of the human actor acting as sender;
─ R, the profile of the human actor acting as receiver;
─ Pl, the specification of the hw/sw platform being used to access the program.

All messages following the first one convey the contributions of the involved actors. The metadata related to the profiles of the communicants are sent only in the first MVL message and are not repeated in subsequent ones. In the messages, boundary objects are augmented with multimedia annotations to support negotiation among the different actors. Boundary objects and annotations are explicitly represented in MVL. A mediation mechanism enables actors of different CoPs, working with different software environments, to properly communicate and collaborate within the ecosystem, since it makes concepts expressed in the language of one CoP understandable by the members of another CoP. In [11], examples are provided of message exchanges between actors of different CoPs, each working with the software environment specific to the CoP the actor belongs to. That paper also describes the Software Shaping Workshop (SSW) methodology, whose main idea is that all stakeholders, including end users, are "owners" of a part of the problem (e.g., software engineers are experts in technology, end users know the application domain, human-computer interaction experts deal with human factors). They all bring their own expertise into the collaboration and exchange ideas in order to converge toward a common design. Different software environments (called Software Shaping Workshops) are provided to each community of stakeholders (CoPs) in order to allow them to actively participate in system design, development and evolution. The case study illustrated in the next section has been designed according to the SSW methodology.
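To make the structure just defined concrete, the following sketch renders an MVL message and a single mediator step as plain data types. It is an illustration only, not the authors' implementation: the names (Profile, MVLMessage, mediate) and the dictionary-based KB lookup are our own simplifications.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Profile:
    """Profile of a human actor (metadata S or R)."""
    name: str
    cop: str                   # Community of Practice, e.g. "content provider"
    interaction_language: str  # notation the actor's environment renders

@dataclass
class MVLMessage:
    """One MVL message: the pair (Data, Metadata) of Section 3."""
    ep: str                               # Data: executable program (or part) under co-design
    annotation: str                       # Data: annotation A attached to EP by the sender
    sender: Optional[Profile] = None      # Metadata S (sent with the first message only)
    receiver: Optional[Profile] = None    # Metadata R (sent with the first message only)
    platform: Optional[str] = None        # Metadata Pl: hw/sw platform specification

def mediate(msg: MVLMessage, kb: dict) -> MVLMessage:
    """Mediator step: translate the annotation into the receiver's
    interaction language using the domain knowledge base (KB)."""
    lang = msg.receiver.interaction_language if msg.receiver else ""
    translated = kb.get((msg.annotation, lang), msg.annotation)  # fall back to original
    return MVLMessage(msg.ep, translated, msg.sender, msg.receiver, msg.platform)

# Hypothetical first message of a mediation process (cf. the case study in Section 4):
provider = Profile("shop owner", "content provider", "annotated shop window")
editor = Profile("editor", "editorial staff", "request table")
kb = {("show more pictures", "request table"): "CHANGE photo-gallery capacity 4 -> 8"}
first = MVLMessage("virtual shop window", "show more pictures", provider, editor, "Web")
print(mediate(first, kb).annotation)  # rendered in the editorial staff's notation
```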
4 A Case Study: The Virtual Showroom

The case study described in this paper concerns the Web portal of a company that provides virtual windows for shops of various kinds. Through its portal, the company sells advertisement spaces (the virtual shop windows) to shop owners, allowing them to advertise their goods. One of the novelties of the system is that the owners are allowed to create and manage the content of the virtual windows, so that they can update them as they like at any time. The virtual windows may be created using different templates, which are made available by the company and sold at different prices: the more the complexity (in terms of combined multimedia elements) grows, the higher the price of the virtual window becomes. Fig. 2 depicts a virtual window for a women's accessories shop, whose template is composed of a left side with a textual description and a right side with a photo gallery showing four pictures.
The stakeholders in this case study are grouped into four distinct CoPs:
─ Customers, i.e., Web surfers, who interact with and/or browse the virtual shop windows;
─ Content providers, i.e., shop owners, who provide the content to be shown in their virtual shop window(s);
─ Editorial staff, who create the virtual windows' templates to be used by the content providers;
─ Administrators, who shape the software environment in which the editorial staff designs the virtual windows' templates.

Administrators, editorial staff, and content providers collaborate in designing the portal by interacting with software environments specifically developed for them. Editorial staff and content providers are not required to be computer experts. Administrators, on the other hand, have to be familiar with the application domain and should also have some experience in Web development. The software environment used by the administrators is designed by professional software developers, but this is outside the scope of this paper, so this CoP is not considered here. The reader interested in the whole meta-design process can refer to [11]. To provide an example of a mediation process in this case study, consider the situation in which a content provider wants a feature of her/his virtual shop window modified. As illustrated in Fig. 3, s/he uses the annotation tool available in the system to annotate that specific feature, explaining the changes s/he wants. The feature is the photo gallery, which is highlighted with a dashed border. Through this annotation, which in Fig. 3 overlaps the main picture, the content provider requests from the editorial staff a modification of the virtual shop window template that would allow a larger number of pictures to be shown.
Fig. 2. An example of a virtual window for a women's accessories shop. The template is composed of a textual description on the left and a photo gallery on the right.
Fig. 3. An example of an annotation containing a request from the content provider. The photo gallery is highlighted with a dashed border, and the annotation explaining the requested modification overlaps the main picture.
A mediation process is thus activated. The mediation mechanism involved in this process consists of the three components defined in Section 3. Specifically, the Mediator is an engine that is part of the environment used by the content provider to make her/his requests for changing the template of the virtual shop window; the KB contains all the information necessary to translate the content of the message (i.e., EP and A) into a specific request for the editorial staff; MVL is the visual language composed of the messages exchanged during the whole mediation process between the content provider and the editorial staff. The first message is constituted as follows.
• Data:
─ EP, the virtual shop window environment that the content provider is asking to modify;
─ A, the annotation that the content provider attached to EP in order to communicate her/his request to the editorial staff.
• Metadata:
─ S, the profile of the content provider (the sender in this mediation process);
─ R, the profile of the editorial staff (the receiver in this mediation process);
─ Pl, the specification of the hw/sw platform used to access the Web portal.
The receiver, a member of the editorial staff, gets the message translated into the language of her/his software environment, which is customized to the needs of that CoP. As shown in Fig. 4, the content of the message is rendered in a table at the bottom of the screen; it communicates the same meaning intended by the content provider who sent the message, but uses a different visual representation, with codes that the editorial staff understands well. If the editorial staff member can directly handle the request, s/he performs the necessary software modification and communicates it to the content provider; otherwise, s/he activates a second mediation process with a member of the administrator CoP, to whom s/he reports the request for modifications. In the first mediation process, the reply message to the content provider consists of the virtual shop window application EP, modified according to the content provider's request, and the annotation A created by the editorial staff to explain the performed modifications. If the content provider is satisfied with the solution s/he gets, the mediation process is concluded; otherwise it continues iteratively until the content provider and the editorial staff reach an agreement.
Fig. 4. Editorial staff environment: the request sent by the content provider is shown
5 Conclusions

This paper has discussed and formally defined visual mediation mechanisms for the collaborative design, development and evolution of software. Mediation mechanisms provide a means to improve communication and cooperation among all stakeholders involved in the design of software artifacts, including end users. This communication is fundamental for cooperatively creating software adapted to users' needs and context of work. A case study concerning a Web portal that provides advertisements for shops of various kinds has been presented; it offers an example of how visual mediation mechanisms are used to enable collaboration among the different stakeholders.

Acknowledgments. This work is supported by the Italian MIUR and by grant PS_092 DIPIS. Li Zhu acknowledges the support of the Initial Training Network "Marie Curie
Actions", funded by the FP7 People Programme under grant PITN-GA-2008-215446 "DESIRE: Creative Design for Innovation in Science and Technology". The authors thank Nicola C. Cellamare for his collaboration in the case study.
References
1. Software Engineering Institute: Ultra-Large-Scale Systems: The Software Challenge of the Future, http://www.sei.cmu.edu (last accessed February 22, 2010)
2. Fischer, G.: Cultures of Participation and Social Computing: Rethinking and Reinventing Learning and Education. In: Ninth IEEE International Conference on Advanced Learning Technologies (ICALT), pp. 1–5. IEEE Computer Society, Riga (2009)
3. Lieberman, H., Paternò, F., Wulf, V. (eds.): End User Development, vol. 9. Springer, Dordrecht (2006)
4. Wenger, E.: Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press, London (1998)
5. Fischer, G.: Extending Boundaries with Meta-Design and Cultures of Participation. In: 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries (NordiCHI 2010), pp. 168–177. ACM, Reykjavik (2010)
6. Boulle, L.: Mediation: Principles, Process, Practice. LexisNexis Butterworths, Chatswood (2005)
7. Wiederhold, G.: Mediators in the Architecture of Future Information Systems. Computer 25(3), 38–49 (1992)
8. Burstein, M.H., McDermott, D.V.: Ontology Translation for Interoperability among Semantic Web Services. AI Magazine, Special Issue on Semantic Integration 26(1), 71–82 (2005)
9. Vaculin, R., Neruda, R., Sycara, K.: The Process Mediation Framework for Semantic Web Services. International Journal of Agent-Oriented Software Engineering 3(1), 27–58 (2009)
10. Zhu, L., Mussio, P., Barricelli, B.R.: Hive-Mind Space Model for Creative, Collaborative Design. In: 1st DESIRE Network Conference on Creativity and Innovation in Design, Desire Network, Aarhus, Denmark, pp. 121–130 (2010)
11. Costabile, M.F., Fogli, D., Mussio, P., Piccinno, A.: Visual Interactive Systems for End-User Development: A Model-Based Design Methodology. IEEE T. Syst. Man Cy. A 37(6), 1029–1046 (2007)
12. Carlile, P.R.: A Pragmatic View of Knowledge and Boundaries: Boundary Objects in New Product Development. Organization Science 13(4), 442–455 (2002)
13. Jennings, P.: Tangible Social Interfaces: Critical Theory, Boundary Objects and Interdisciplinary Design Methods. In: 5th Conference on Creativity & Cognition (C&C), pp. 176–186. ACM, London (2005)
14. Resnick, L., Levine, J., Teasley, S.: Perspectives on Socially Shared Cognition. American Psychological Association (1991)
15. Fong, A., Valerdi, R., Srinivasan, J.: Boundary Objects as a Framework to Understand the Role of Systems Integrators. Systems Research Forum 2(1), 11–18 (2007)
Design for the Information Society

Agata Bonenberg

Faculty of Architecture, Poznan University of Technology, Nieszawska 13C, 61-021 Poznan, Poland
[email protected]

Abstract. The aim of this paper is to discuss the accomplishments of contemporary design, focusing on its flexible and adaptive features, which meet the demands of migrating, mobile societies. With the expansion and popularization of information and communication technologies in recent decades, traditional space-use patterns have evolved. Divisions and borders between work and leisure, public and private, are slowly losing their importance. Users often seek "multi-use", "multi-task" and "open space" solutions. The research is based on projects developed at the Faculty of Architecture, Poznan University of Technology, under the supervision of the author.

Keywords: Evolution-based adaptation, adaptive change in modern design, modification, adjustment, reactivity, devices, utilitarian/common-use/everyday objects.
1 Introduction

The theses discussed in "The Rise of the Network Society" by the Spanish sociologist Manuel Castells suggest a gradual increase in the dynamics of social and economic change following the formation of information societies [1]. The shape of space, in modern architecture and design, is adjusted to the requirements resulting from these circumstances. The first decades have now passed in which architecture and everyday objects have had to function under constantly changing patterns of space use, changes that happened within a very short span of time. Implementing flexible and changeable solutions thus becomes a priority for the designer [2]. The modern discourse in design theory concentrates on the relation between usefulness and aesthetics. A methodological approach to the wide range of conditions affecting this relation reveals the actual complexity of the design process, which is influenced above all by external conditions and requirements, and by one of its most important factors: ergonomics. Everyday objects and devices, being smaller, cheaper and closer to people than architectural forms, are subject to more frequent criticism and change. The life of a building lasts from several dozen to several hundred years, whereas the life of utilitarian objects, especially those related to information and communication technologies, may last just a few years. Subsequent generations of equipment form groups and types depending on the features which make them useful, necessary, or objects of particular desire. At the same time, forms and functions are subject to an evolution
which requires more technological and stylistic innovation than in the case of architecture. This is related to the fact that removing an unwanted architectural structure is a difficult undertaking. In order to secure a specific group of clients for a product, or a family of products, the designer must take into consideration the continuous change of their needs. This evolution is focused particularly on strategies for bonding users to a product or a family of products. Analogies to the adjustment of common-use objects to external requirements may be found in similar processes in nature. This paper therefore uses terms borrowed from the biological sciences, "evolution-based adaptation" and "adaptive changes", which well illustrate the rules governing modern requirements on products. Just as living creatures have their specific adaptation strategies, designers create products by applying the utility criteria that are particularly important for the product. The aim is to improve subsequent generations of objects with regard to ergonomics, and also to meet the functional and aesthetic needs of users in the constantly changing world of new technologies.
2 Evolution-Based Adaptation of Design Objects

Over the course of history, objects have evolved under the influence of technical possibilities and the development of science, ergonomics, fashion and various aesthetic trends [3]. This is particularly noticeable when we analyze them from a certain time perspective. The need to adapt was usually triggered by market conditions and was the driving force of the evolution of everyday objects. Some of the products and equipment that have existed for years have changed completely, some only to a certain extent; many have vanished. An example is the evolution-based adaptation of the tram:
Fig. 1. Evolution-based adaptation illustrated with the example of an electrically driven tram (from left): 1A – a tram in Warsaw (1920), part of a postcard from Warsaw; 1B – a contemporary tram used in Poznan (2010), photo: author; 1C – following the evolutionary path, an image of a "future tram"; project developed at the Faculty of Architecture, Poznan University of Technology, by Wacław Kłos under the supervision of A. Bonenberg
Some basic factors that modify the forms of everyday objects can be distinguished:
− Technology: new materials, engineering solutions
− Culture: fashion, painting, trends in aesthetics
− Economy: manufacturing, production and use costs
3 Adaptive Change in Modern Design

Designers take into account the necessity of making adaptive changes under the influence of the environment and external stimuli. This, however, must be preceded by a thorough analysis of the design problem: defining the task and collecting information on the current state of available solutions. Many once-acquired features and properties of products are "inherited": the same features and properties recur in subsequent editions of the products. An important task is to distinguish the general (unchanging) features of a given product from the features subject to adaptive changes. Examples include equipment operated under extreme conditions (in hot or cold environments, or at height), using different types of power supply and subject to strict requirements for operational safety. All stimuli that give rise to adaptive changes may be classified as development factors, determinants or stimulators. There are several types of adaptive change: modification, adjustment and reactivity.

3.1 Modification

Modification can be defined as a type of recomposition performed to meet the requirements of specific groups of consumers. In this case, the designers have to work with the object's features and its environment in mind, and establish:
• practical and communication functions
• aesthetic and semantic functions
• functions of indication and symbolic meaning¹
Fig. 2. Contemporary variants of the design of a coffee machine: a product intended for the same use, applying similar technology, adapted to the needs of different target groups. Each of the examples has a different aesthetic expression, a different semantic function and a different symbolism.
¹ After: Steffen, D.: Design als Produktsprache: der "Offenbacher Ansatz" in Theorie und Praxis. Verlag form GmbH, Frankfurt/Main (2000)
Fig. 3. A system of lightweight division walls that can divide living or working space within a few minutes, designed at the Faculty of Architecture, Poznań University of Technology. The object allows users to enjoy open spaces while keeping the basic functional divisions of the interior (author: M. Marchewicz, under the supervision of A. Bonenberg).
Fig. 4. An example illustrating adjustment: a wall-shelf system that may be personalized to a large extent. The solution assumes different loads on the system in the course of use and ensures easy access to articles of different shapes and sizes. The design was developed by G. D'Angelo under the supervision of A. Bonenberg during the Design course at the Faculty of Architecture, Poznan University of Technology.
Fig. 5. A modular bus stop of changeable length, depending on the traffic intensity at a given transport hub. Each bus stop is equipped with at least one module with a digital timetable and information on the arrival time of the next vehicle. The design was developed by E. Erdmann under the supervision of A. Bonenberg, Faculty of Architecture, Poznan University of Technology.
While describing the phenomena related to adaptation, one should not forget behavioral adaptation, i.e., the modification or development of new products due to changes in space-use patterns. The interior design of living and working areas increasingly encourages flexibility and multi-purpose use, in response to lifestyle changes.

3.2 Adjustment

Adjustment is a relatively short-lasting recomposition involving reversible changes in an object. Its aim is to meet the dynamically changing needs of the user.
Fig. 6. A folding desk intended for multi-functional living spaces. The desk is equipped with an integrated lighting system, drawers and two additional spaces for storing books. The design was developed by A. Szymczak under the supervision of A. Bonenberg, Faculty of Architecture, Poznan University of Technology.
3.3 Reactivity

Reactivity can be defined as sensitivity to environmental stimuli combined with the functional and structural adaptation possibilities of a given device. In modern design, reactivity is usually characteristic of objects that use information and communication technologies. Such devices initiate appropriate actions depending on phenomena in their environment.
4 Creative Process: Selecting the Strategy for Adaptive Changes

The adaptive changes specified above and the strategies for their implementation constitute the creative process. The stages of the design process for utilitarian objects, in the traditional understanding, are illustrated in Table 1. At each stage it is possible to apply elements of adaptation, adjustment or reactivity. The sooner these concepts are taken into account in the design, the greater their impact on the final result of the design process. Design is most often perceived as the introduction of adaptive changes related to human factors (aesthetics, the changing understanding of beauty [5] related to cultural phenomena), omitting the economic, ecological and technological factors on which decisions are made solely by the manufacturer. Yet these areas have special potential for introducing new, innovative solutions. "Design is not just
Fig. 7. An example of reactive design: a conceptual project for a telephone with the function of detecting people within a predefined radius. The telephone "senses" similar devices in its proximity, making visible the user profiles that have been shared. The idea of this invention is to facilitate face-to-face social contacts, based on the technologies created for Internet community networks.
what it looks like and feels like. Design is how it works" [6]: this quote from one of the most revolutionary entrepreneurs, Steve Jobs, seems to confirm this truth. The interest and role of a designer should go far beyond the question of form. The influence of the schedule of actions on the implementation of adaptive changes is illustrated in Table 2. However, regardless of how revolutionary the applied concepts are, utilitarian objects must always comply with the basic requirements of ergonomics, safety of use, transportation, storage, packaging and waste management.
Table 1. A scheme of design actions: from a concept through readiness for production [4]. Source: teaching materials, Hans-Georg Piorek, Dipl.-Designer.

Idea, followed by four phases:
Phase 1 – Searching / Analysis (aim: recognition of the problem): defining the task; gathering information; analysis of the present state; definition of the target group; briefing preparation.
Phase 2 – Concept Development (aim: solution variants): division of functions; looking for solutions; creation of concept variants; assessment of the variants; establishing grounds for realization.
Phase 3 – Designing (aim: problem solving): ergonomics check; computer development of models; development of prototypes; assessment of the design; design approval.
Phase 4 – Improvement / Elaboration (aim: problem solving): development of details; improvement of the general form; voting for the realization; expenditure check; approval for realization.
These issues are connected with the product-environment relation and the adjustment of living space to the abilities of the human body and mind. A special role is played by ergonomics, as it changes its scope of meaning and extends semantically: it evolves in the direction indicated by technological and civilizational development, becoming more sensitive to the needs and abilities of individuals.
5 Summary

Three pillars of the changeability of utilitarian forms can be distinguished:
1. Evolution-based adaptation of utilitarian objects
2. Adaptive changes in the modern design of utilitarian objects
3. Selection of the strategy for adaptive changes
Table 2. Design actions and the adaptive changes made by the designer

Analysis: The analytical part of the design process is the basis for selecting the direction of adaptive changes. Gathering information and analyzing the current state of the market helps in selecting the direction of changes. The analytical part concludes with a list of design requirements.

Elaboration of the concept: At the concept stage it is possible to explore different solutions. While creating and assessing the variants of a concept, it is worth extending the spectrum of possibilities and offers.

Designing: At the stage of computer development of models and prototypes, changes may be introduced provided that they do not disturb the coherence of the concept. At this stage the design, and the adaptation, adjustment or reactive changes applied to it, are subject to assessment.

Improvement of the elaboration: At the stage of improvement and elaboration, change is only possible within the range of the corrected elements: worked-out details, an improved general form. At this stage the possibilities for adaptive changes are limited.
These represent the rules crucial to contemporary design processes. Strategies of adaptation to market conditions and the improvement of subsequent generations of utilitarian objects decide the success of a product in the difficult, constantly changing world of consumer demands. The evolution and diversity of design objects cannot be understood without their social, cultural and economic context, which nowadays is dominated by the mobility of the information society.
References
1. Castells, M.: The Rise of the Network Society. PWN, Warsaw (2007)
2. Bonenberg, A.: Ergonomic Aspects of Urban and Social Dynamics. In: Vink, P., Kantola, J. (eds.) Advances in Occupational, Social, and Organizational Ergonomics, pp. 837–846. Taylor & Francis Group, Boca Raton (2010)
3. Fiell, C.P.: Design of the 20th Century. Taschen, Köln (2005)
4. Steffen, D.: Design als Produktsprache: der Offenbacher Ansatz in Theorie und Praxis. Verlag form GmbH, Frankfurt/Main (2000)
5. Bonenberg, W.: Beauty and Ergonomics of Living Environment. In: Vink, P., Kantola, J. (eds.) Advances in Occupational, Social, and Organizational Ergonomics, pp. 575–581. Taylor & Francis Group, Boca Raton (2010)
6. Jobs, S.: The Guts of a New Machine. The New York Times (November 30, 2003), http://www.nytimes.com/2003/11/30/magazine/30IPOD.html
Classifying Interaction Methods to Support Intuitive Interaction Devices for Creating User-Centered-Systems

Dirk Burkhardt, Matthias Breyer, Christian Glaser, Kawa Nazemi, and Arjan Kuijper

Fraunhofer Institute for Computer Graphics Research, Fraunhoferstraße 5, 64283 Darmstadt, Germany
{dirk.burkhardt,matthias.breyer,christian.glaser,kawa.nazemi,arjan.kuijper}@igd.fraunhofer.de
Abstract. Nowadays a wide range of input devices is available to users of technical systems. Modern alternative interaction devices in particular, known from game consoles and similar products, provide a more natural way of interacting. But supporting them in computer programs is currently a big challenge, because considerable effort must be invested to develop an application that supports such alternative input devices. We therefore developed a concept for an interaction system that supports the use of alternative interaction devices. The central element of the interaction system is a server that provides applications with a simple access interface to such devices. It is also possible to address an abstract device by its properties, and the interaction system takes over the mapping to a concrete device. To realize this idea, we also defined a taxonomy that classifies interaction devices by their interaction methods and by the required interaction results, such as recognized gestures. By integrating this interaction system, it becomes generally possible to develop user-centered systems, because an adequate integration of alternative interaction devices provides a more natural and easier to understand form of interaction.

Keywords: Multimodal Interaction, Human-Centered Interfaces, Human-Computer Interfaces, Gesture-based Interaction.
1 Introduction

Offering modern interaction devices on technical systems is a current trend that addresses customers' need for easy and intuitive interaction. Games especially are predominantly designed to be played with controllers other than traditional ones, such as the joystick, the gamepad or the combination of mouse and keyboard. In 2006 the Nintendo Wii was released and had amazing success because of its intuitively usable control paradigm based on performing natural gestures. Going one step further, Microsoft's Kinect introduced full-body gesture interaction, using only the player's own body as the controller device. But not only gaming systems try to utilize natural interaction; multi-touch mobile phones such as Apple's iPhone and other modern smartphones are successful as well. All of these systems have one feature in common: they support natural interaction by supporting gestures.
On desktop computers, by contrast, gesture-based interaction has so far been less successful. On these systems the traditional interaction devices, mouse and keyboard, are still the devices most often used to control applications. Only multi-touch monitors are used in some usage scenarios for easier interaction, e.g., in public spaces. The main reason is the missing support in programs and applications. But even when alternative interaction devices are supported, their usage may not be adequate in all use-case scenarios of a program. Different use cases need different interaction metaphors to provide a useful interaction. For example, when presenting a picture, abstract gestures are useful to instruct the viewer program to zoom or rotate the picture; but if the user navigates through the menu of such a program, only simple gestures are appropriate, like pointing at an entry or panning the display in a direction. Furthermore, interaction devices often provide additional technical features: the Nintendo WiiMote controller, for example, contains accelerometers, which are useful for supporting gesture-based interaction, but the WiiMote also utilizes an infrared camera, which allows it to be used as a pointing device. To support modern interaction devices adequately, all their possible interaction methods and paradigms should be supported; yet addressing different kinds of interaction methods is a challenge when multiple modern interaction devices are used. In this paper we introduce a taxonomy of possible interaction methods for existing interaction devices. This taxonomy presents the interaction methods currently applied, based on the interaction systems and devices available today. We also describe a generic interaction analysis system we have developed, which handles different interaction devices and the forms of gestural interaction they support, based on the defined taxonomy of interaction methods. The system handles technical aspects such as detecting gestures or adapting, e.g., the pointing mechanism of the WiiMote to a physical display. By using this interaction analysis system, a developer no longer needs to spend effort on developing a gesture recognition system. Via an API the developer can also declare which interaction methods should be supported or disallowed in the program's use-case scenarios. The interaction analysis system automatically determines the supported devices and their technical capabilities in order to use the relevant sensors. Multiple interaction methods are supported concurrently, so different kinds of interaction methods are possible within a single use-case scenario.
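As an illustration of how such an API declaration might look, the sketch below shows an application enabling and disallowing interaction methods per use-case scenario, mirroring the picture-viewer example above. The class and method names are hypothetical; the paper does not prescribe a concrete signature.

```python
# Hypothetical client-side sketch: an application declares which interaction
# methods each of its use-case scenarios allows; the interaction analysis
# system would map them onto whatever concrete devices are connected.

class InteractionClient:
    def __init__(self) -> None:
        self.scenarios: dict[str, set[str]] = {}

    def allow(self, scenario: str, *methods: str) -> None:
        """Declare interaction methods as supported in a scenario."""
        self.scenarios.setdefault(scenario, set()).update(methods)

    def disallow(self, scenario: str, *methods: str) -> None:
        """Declare interaction methods as disallowed in a scenario."""
        self.scenarios.setdefault(scenario, set()).difference_update(methods)

client = InteractionClient()
# Menu navigation: only simple pointing and panning.
client.allow("menu", "pointing", "panning")
# Picture viewing: abstract gestures for zooming and rotating, no pointing.
client.allow("picture-viewer", "gesture:zoom", "gesture:rotate")
client.disallow("picture-viewer", "pointing")
```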
2 Related Work
Different approaches exist for classifying interaction devices, i.e., for specifying existing input and output devices. For input devices in particular, classifications exist that group devices according to similar properties. In the following section we give a brief overview of existing classifications for input devices. Since one of our goals is to use such a classification for conceptualizing and developing an interaction system, we also present some existing interaction systems that provide intuitive usage.
2.1 Classifications for Interaction Devices
In the past, some effort went into defining classifications of input devices, mainly with the goal of understanding the devices better, so that optimal devices could be found for specific tasks, or so that similarities between devices could be exploited to replace one device with another adequate one [1]. One of the first classifications was the taxonomy defined by Foley et al. [2]. It focused on graphical user interface applications and the typical tasks that can be performed in them, such as selecting a graphical object or orienting the mouse. Foley's taxonomy differentiates between the devices that can perform these tasks and the way they do so, in particular whether their active principle is direct or indirect. Buxton [3] observed that there is a major difference in the way devices can produce the same output. He called these differences pragmatics, further divided devices by the way they produce output (position, motion, pressure, etc.), and ended up with a taxonomy that could represent simple devices. To finally be able to classify all input devices, even virtual ones, Mackinlay [4] took Buxton's taxonomy and built a formal approach that can be used to classify most input devices. Mackinlay describes an input device as a 6-tuple:
T = ⟨M, In, S, R, Out, W⟩
─ M is an operation manipulator that describes which value is changed by the device; e.g., Rz would mean rotation around the z-axis
─ In is the input range in which the device can be manipulated; e.g., a touchscreen would have an input range of [0, touchscreen size]
─ S is the actual state of the device
─ R defines a function that maps between In and Out
─ Out is the output range that results from In and the function R
─ W describes the inner function of the device
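Read as a data structure, the tuple maps naturally onto a small class. The following Java sketch is our own illustration rather than part of Mackinlay's work; the field names and the linear mapping chosen for R are assumptions made for readability.

// Illustrative model of Mackinlay's 6-tuple <M, In, S, R, Out, W>.
// The linear scaling used for R is an assumption for this example.
import java.util.function.DoubleUnaryOperator;

class InputDeviceModel {
    final String manipulatedOperation;  // M, e.g. "Rz" for rotation around the z-axis
    final double inMin, inMax;          // In, the range in which the device can be manipulated
    double state;                       // S, the actual state of the device
    final DoubleUnaryOperator mapping;  // R, a function mapping between In and Out
    final double outMin, outMax;        // Out, the output range resulting from In and R

    InputDeviceModel(String m, double inMin, double inMax, double outMin, double outMax) {
        this.manipulatedOperation = m;
        this.inMin = inMin; this.inMax = inMax;
        this.outMin = outMin; this.outMax = outMax;
        // R: here simply a linear scaling of In onto Out
        this.mapping = v -> outMin + (v - inMin) / (inMax - inMin) * (outMax - outMin);
    }

    // W, the inner function of the device: clamp the raw value to In,
    // update the state S, and emit the mapped output.
    double sense(double rawValue) {
        state = Math.max(inMin, Math.min(inMax, rawValue));
        return mapping.applyAsDouble(state);
    }
}

For example, a touchscreen x-axis with an input range of [0, 1920] mapped onto a normalized output range of [0, 1] would be new InputDeviceModel("Tx", 0, 1920, 0, 1).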
Fig. 1. Mackinlay's taxonomy of input devices [4]: (a) merged operations; (b) combined devices in Buxton's taxonomy
These tuples can be combined by three different merge operations to form more complex input devices:
─ Connect: connects the Out of one device with a fitting In of another device
─ Layout composition: indicates the position of two simple devices on a superordinate device
─ Merged composition: similar to layout composition, but the merged values define a new complex data type
Devices for video- or audio-based recognition are hard to place in these taxonomies. In an alternative approach, Krauss [5] divides devices at the top level into coordinate-based and non-coordinate-based interaction devices, so that his classification (see Fig. 2) is driven by the output values of the devices.
Fig. 2. Classification of input devices by Krauss [5]
2.2 Gesture-Based Interaction Systems
Some effort has been invested in bringing gesture-based interaction to ordinary computers and applications in order to allow easier control. In particular, interaction devices from game consoles are becoming interesting for use on normal computers: they are well known and, next to their easy usage, they provide different methods for interacting with a system. Early interaction devices such as old data gloves were designed for only one interaction scenario. In contrast, a modern controller like the WiiMote, an input device developed by Nintendo, can recognize acceleration in three directions and can therefore be used for gesture recognition, in addition to using its infrared camera as a pointing device or its buttons to control an application directly. Different implementations are available for using the WiiMote on a computer. One of the most flexible programs for using the WiiMote as a gesture-based device is WiiGee (http://www.wiigee.org). Its recognition works by statistically comparing the acceleration inputs with previously trained acceleration data; on this basis WiiGee can classify the observed sequence as a gesture [6][7].
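To illustrate the principle of comparing live acceleration input with previously trained data, consider the following simplified sketch. WiiGee itself relies on trained statistical models [6][7]; the nearest-template matcher below is a deliberately reduced stand-in, and its class names and distance measure are our assumptions.

// Simplified accelerometer gesture matcher. It assumes gesture recordings
// have been resampled to a common length beforehand and classifies an
// observed sequence by the smallest summed Euclidean distance to a template.
import java.util.HashMap;
import java.util.Map;

class GestureMatcher {
    private final Map<String, double[][]> templates = new HashMap<>();

    // Store a trained gesture as a sequence of (x, y, z) acceleration samples.
    void train(String name, double[][] samples) {
        templates.put(name, samples);
    }

    // Classify an observed sequence as the nearest trained gesture.
    String classify(double[][] observed) {
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[][]> entry : templates.entrySet()) {
            double[][] template = entry.getValue();
            double distance = 0;
            int n = Math.min(observed.length, template.length);
            for (int i = 0; i < n; i++) {
                double dx = observed[i][0] - template[i][0];
                double dy = observed[i][1] - template[i][1];
                double dz = observed[i][2] - template[i][2];
                distance += Math.sqrt(dx * dx + dy * dy + dz * dz);
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }
}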
There are also video-based applications that recognize faces [8], and some early approaches recognize human gestures in video-based systems [9]. A more advanced approach is Microsoft's Kinect (http://www.xbox.com/de-de/kinect), which uses stereo images to recognize gestures. Speech recognition is a field that is getting more and more attention and works well for a limited vocabulary; many systems such as cell phones, car navigation systems, operating systems, and computer games support speech recognition.
3 Classification of Interaction Methods
In a first attempt to create a gesture interaction system (see Fig. 3) supporting different kinds of alternative interaction devices, we focused on supporting different kinds of devices [10]. During this work we recognized that many devices can be used for several interaction methods: the WiiMote, for instance, is an accelerometer-based device that can be used for gestures, but it also has an infrared camera that allows its use as a pointing device, and it can serve as a key-based controller thanks to the buttons on top of the controller. Furthermore, for some interaction methods various other devices exist that allow the same form of interaction: representatives of three-dimensional accelerometer-based devices are the WiiMote, the Playstation Move controller, and most smartphones currently on the market.
Fig. 3. Architecture of our Gesture-Interaction-System [10]
By default, a separate implementation would be necessary for every supported controller, and likewise for modules that, e.g., recognize gestures. This is not an adequate procedure, so we created another approach, which we describe in the following sections. To build an effective tool that also supports newly emerging alternative interaction devices, we need a concept for organizing them, and one that allows modules, such as a gesture recognition module, to be reused multiple times. We therefore designed a multi-dimensional classification of interaction devices, grouped by interaction method and interaction result.
3.1 Classification of Interaction Methods
In the past, interaction devices were designed for a single usage scenario; a 3D mouse, e.g., was designed for interacting directly within a virtual three-dimensional
Fig. 4. Classification taxonomy for the possible interaction methods
world. In contrast to such devices, modern interaction devices for current game consoles etc. utilize multiple technologies: the Playstation Move controller, for instance, integrates accelerometers, a gyroscope, a compass, and buttons, which together allow a manifold interaction style. To group interaction devices, we conceptualized a taxonomy that organizes controllers by their basic underlying technology (see Fig. 4). Naturally, an interaction device can be placed in multiple groups, depending on the technical feature required. This classification is necessary so that a developer can specify which kind of interaction method he wants to support, or which kinds of interaction devices he wants to allow for interacting with his application. The important difference between this characterization and existing approaches to classifying interaction devices is that we separate devices by their technical method of use and by which sensors etc. are responsible for this method.
3.2 Classification of the Interaction Results
In practical use, a user, and moreover a developer, has specific requirements on the results produced by an interaction device. If interaction with a graphical user interface is planned, the results delivered by the controllers used must be coordinates etc. This can only be ensured if the taxonomy provides the ability to filter devices by the result data they generate. When interacting with a graphical interface, the user can use different kinds of interaction devices, depending on the results needed, e.g., coordinates for a three-dimensional environment. Today only one-, two-, and three-dimensional environments are known. But in view of further research approaches, or the use of, e.g., multi-dimensional
Fig. 5. Classification taxonomy for addressing the interaction result by previously defined interaction method
combinations of multiple 3D environments, we also consider n-dimensional interaction results, although today we have found no real n-dimensional (n > 3) interaction device, nor an environment that would require one.
3.3 Usage of Interaction Devices by Classifying Interaction Devices
Our taxonomy for classifying interaction methods to support intuitive interaction devices is a combination of the two sub-classifications described above. Coupling both classifications yields an array in which every interaction device (restricted to input devices) can be classified (see the array in Fig. 5). With this array it is possible to determine interaction devices with similar features, and in addition to address interaction devices that support a required interaction method and generate a preferred interaction result, such as complete coordinates within a graphical user interface.
Fig. 6. Entire Classification Taxonomy for classifying interaction methods to support intuitive interaction devices, in which every input device can be arranged
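In software, the combined taxonomy can be exploited directly as a lookup keyed by interaction method and interaction result. The enums and the registry below are hypothetical names sketched for illustration, not the actual implementation of our system.

// Hypothetical registry built on the combined taxonomy: devices are filed
// under (interaction method, interaction result) and can be queried by both.
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

enum Method { ACCELEROMETER, CAMERA_POINTING, TOUCH, KEY_BASED, SPEECH }
enum Result { COORDINATES_1D, COORDINATES_2D, COORDINATES_3D, GESTURE, KEY_EVENT }

class DeviceRegistry {
    private final Map<Method, Map<Result, List<String>>> table = new EnumMap<>(Method.class);

    void register(String device, Method method, Result result) {
        table.computeIfAbsent(method, k -> new EnumMap<>(Result.class))
             .computeIfAbsent(result, k -> new ArrayList<>())
             .add(device);
    }

    // All devices that support a required method and deliver a preferred result,
    // e.g. every accelerometer-based device usable for gesture input.
    List<String> find(Method method, Result result) {
        return table.getOrDefault(method, Map.of()).getOrDefault(result, List.of());
    }
}

A WiiMote, for instance, would be registered under (ACCELEROMETER, GESTURE), (CAMERA_POINTING, COORDINATES_2D), and (KEY_BASED, KEY_EVENT), mirroring the observation that a single device occupies several cells of the array.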
4 Concept for an Interaction System
To create an extensible interaction system, we conceptualized a system based on the presented taxonomy. One of its most important features will be the possibility to address devices with a specific interaction metaphor and a specific return value. This enables further applications to be developed on top of the interaction system that select their preferred way of interacting within the application. In this way the developer of an application can, for instance, define that every kind of accelerometer-based device will be supported for gesture-based interaction. The user then has the choice to use such a device, and it does not matter whether it is a WiiMote, a Playstation Move controller, or a smartphone with accelerometers. To achieve such a system, and also to support gesture recognition with a device, every field of the array corresponds to an abstract device. This will be
achieved by modeling every device as a virtual device. The processing of the data, e.g., computing the final coordinates or recognizing a performed gesture, is taken over entirely by the interaction system. Via an API, the application can commit information to the server, such as the enabled kinds of interaction devices or additionally supported gestures; in return, the application receives the coordinates or the finally performed gestures. The interaction system thus saves a lot of effort for application developers who want to enable gesture-based interaction within their applications or programs. Fig. 7 shows the overall concept of our system. An application communicates with our system by sending XML messages over the network layer. Upon connection, the system sends an XML message listing all available devices, and then expects an XML message describing the devices that should be operated and how they should be connected. Every device is implemented as a plugin that the server can load automatically. This makes our system generic enough to support further and newly emerging devices such as Microsoft Kinect.
Fig. 7. The Gesture Interaction System: (a) architecture of the interaction system; (b) mapping of the device data to the screen
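The paper's XML schema is not published, so the element names and the port in the following client-side sketch are invented; it only illustrates the handshake described above, in which the server first announces its devices and the application replies with the devices and methods it wants operated.

// Hypothetical client-side handshake with the interaction-system server.
// Element names, message framing (one XML document per line), and the port
// are assumptions; the actual protocol details are not published.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

class InteractionClient {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("localhost", 9000);
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {

            // The server first lists all available devices, e.g.:
            // <devices><device id="wiimote" methods="accelerometer,pointing"/></devices>
            String available = in.readLine();
            System.out.println("Available: " + available);

            // The client answers with the devices and methods it wants operated.
            out.println("<use><device id=\"wiimote\" method=\"accelerometer\" result=\"gesture\"/></use>");

            // Afterwards the server streams interaction results, e.g.:
            // <result device="wiimote" gesture="circle"/>
            System.out.println("Event: " + in.readLine());
        }
    }
}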
The system consists in general of three components. The central and most important component is the server of the interaction system, which manages all available interaction devices as well as the modules for the supported interaction methods. Next to this central system, an API library is available so that program developers are able to use the interaction system; the API is planned for C# (.NET) and for Adobe Flex, which allows web applications to support modern devices as well. The third component is a tool for configuring new gestures. This Learning-Tool is designed only for modes in which gesture interaction is to be used: developers etc. can train new gestures for controlling their applications. The Learning-Tool generates an XML configuration file, which can later be committed through the API to the server of the interaction system; in this way the server knows the new gestures and is able to recognize them.
5 Discussion
This interaction system is an auxiliary system for supporting alternative interaction devices. Our main research scope lies in an adaptive visualization framework for visualizing semantic data. To enable an intelligent user interface, we have to support intuitive information visualization on the one hand, and on the other hand we have to provide an easy form of interacting with visualizations on the input side. For this we need an interaction system that lets us delimit a specific method of interaction, with a specific type of result value (such as a performed gesture or coordinates) and a specific technical means, e.g., accelerometers. This is especially necessary if a kind of navigation other than a coordinate-based one is to be supported. For example, how can a gesture-based interaction with graphs be provided? The challenge is that navigation by direction alone is problematic: if, e.g., five nodes are positioned at the top right, it is hard to determine which node should be selected when a gesture toward the top right is performed. So an alternative approach is required to provide adequate gesture-based interaction with graphs. Similarly, it is also hard to find a gesture-based approach for timeline and geographical visualizations. Another point for discussion is the completeness of the taxonomy. We have so far applied it to currently existing input devices, especially those of game consoles, and we have also tried to consider most kinds of smartphones. For all these devices we determined the features they utilize in order to differentiate their interaction methods and forms, and we then tried to determine the possible results required by external applications. In the presented form, we cannot ensure that all kinds of interaction are covered. We have covered all common kinds of interaction, but new approaches may have to be considered, requiring an extension of the taxonomy.
6 Conclusion
In this paper we introduced our taxonomy for classifying interaction devices according to their interaction metaphors and their interaction results. In contrast to other existing classifications, our approach is primarily driven by its use in adaptive semantic visualizations, and therefore by the need to select specific kinds of interaction devices or devices with specific forms of interaction. In addition, the taxonomy groups devices with similar forms of interaction, which helps to abstract these devices and their significant properties, and thus makes it possible to develop, e.g., gesture recognition modules for such an abstract device and, finally, for all related devices grouped under it. This idea for classifying interaction devices is used in our interaction system, which allows the use of alternative interaction devices such as the WiiMote or the Playstation Move controller. With this system, every currently available input device can be classified by its utilized interaction method and by its interaction result, e.g., coordinates or an identified gesture. The interaction system consists of two parts: the central server, which manages the devices and provides an interface for external applications, and the API for use within applications, which communicates with the server and receives the results of interactions from the connected devices.
The conceptualized system can be extended with further and newly emerging input devices, and further modules can be developed for the determined input results, e.g., to support different kinds of gesture recognition. This can be useful when accurate recognition is required for normal interaction in applications, as well as when less accurate real-time recognition is needed, as in games. In general, integrating this interaction system makes it possible to develop user-centered systems, because an adequate integration of alternative interaction devices provides a more natural and easily understandable form of interaction.
Acknowledgements. This work was supported in part by the German Federal Ministry of Economics and Technology as part of the THESEUS Research Program. For more information please see http://www.semavis.com.
References
1. Jacob, R.J.K., Sibert, L.E., McFarlane, D.C., Mullen Jr., M.P.: Integrality and separability of input devices. ACM, New York (1994)
2. Foley, J.D., Wallace, V.L., Chan, P.: The human factors of computer graphics interaction techniques. Prentice Hall Press, Englewood Cliffs (1990)
3. Buxton, W.: Lexical and pragmatic considerations of input structures. ACM, New York (1983)
4. Mackinlay, J., Card, S.K., Robertson, G.G.: A semantic analysis of the design space of input devices. L. Erlbaum Associates Inc., Mahwah (1990)
5. Krauß, L.: Entwicklung und Evaluation einer Methodik zur Untersuchung von Interaktionsgeräten für Maschinen- und Prozessbediensysteme mit graphischen Benutzungsoberflächen. Universität Kaiserslautern (2003)
6. Poppinga, B.: Beschleunigungsbasierte 3D-Gestenerkennung mit dem Wii-Controller. University of Oldenburg (2007)
7. Schlömer, T., Poppinga, B., Henze, N., Boll, S.: Gesture Recognition with a Wii Controller. In: 2nd International Conference on Tangible and Embedded Interaction (2008)
8. Zhao, W., Chellappa, R., Phillips, P.J., Rosenfeld, A.: Face recognition: A literature survey. ACM Comput. Surv. 35, 399–458 (2003)
9. Mitra, S., Acharya, T.: Gesture Recognition: A Survey. IEEE Systems, Man and Cybernetics Society (2007)
10. Burkhardt, D., Nazemi, K., Bhatti, N., Hornung, C.: Technology Support for Analyzing User Interactions to Create User-Centered Interactions. In: Stephanidis, C. (ed.) UAHCI 2009. LNCS, vol. 5614, pp. 3–12. Springer, Heidelberg (2009)
11. Nazemi, K., Burkhardt, D., Breyer, M., Stab, C., Fellner, D.W.: Semantic Visualization Cockpit: Adaptable Composition of Semantics-Visualization Techniques for Knowledge Exploration. In: International Conference on Interactive Computer Aided Learning 2010 (ICL 2010), pp. 163–173. University Press, Kassel (2010)
Evaluation of Video Game Interfaces Joyram Chakraborty and Phillip L. Bligh Department of Computer Science The State University of New York, The College at Brockport, Brockport, NY 14420 USA
[email protected],
[email protected] Abstract. The interface is an essential part of every video game. However, research into understanding the modern game player's preferences is lacking. This paper reports preliminary findings from the evaluation of a computer game user interface to determine specific user preferences. Keywords: User Interface Evaluation, Video Game Design, End User Preferences.
1 Introduction
The game interface is an essential part of every video game. Regardless of how artistic, usable, or functional the interface is, it remains the primary conduit of interaction for the game player. It is essential for game developers to understand the common problems associated with game interfaces as well as the analytical techniques used to solve them. However, little research has been carried out to understand end user preferences. Game developers use personal preferences and creative programming techniques and tools to develop games in the hope of successful market penetration. The purpose of this study is to evaluate the interface of a video game to gain an understanding of the end user.
2 Background
Research has indicated that the following four evaluation techniques are the most commonly applied [1, 9, 10]:
1. The Cognitive Walk-through technique measures the usability of an interface by using a cognitive learning model to evaluate the ease with which the interface can be learned.
2. The Heuristic Evaluation technique involves employing usability experts to inspect interfaces. They use predefined criteria to evaluate the problems with the interface. This was found to be the most effective evaluation technique, but it relies on the availability of an expert.
3. The Pluralistic Evaluation technique is a process in which developers, users, and experts do a walkthrough of the interface together. Its advantage is the diverse perspective involved in the evaluation.
4. Formal Usability Inspections are a process in which human factors experts use a cognitive model of the task to evaluate interfaces. The advantage of this process is that the experts can do a walk-through more efficiently.
The literature indicates that the heuristic evaluation method is the most effective alternative to empirical testing. This technique is better at finding a larger percentage of design flaws in an interface, although its effectiveness relies heavily on the quality of the experts available [1, 7, 8, 9, 10]. The advantage of heuristics, and the reliance on experts, was also found in a study that compared cognitive walk-through and heuristic evaluation techniques using system designers and experts [7]. When the two methods were performed by experts, heuristics had a clear advantage; however, when only designers performed the evaluations, both methods performed equally. Over the last two decades, other researchers have recognized the need to expand and modify heuristic evaluation approaches for video game interfaces. Modifying and creating new heuristic approaches had already been done in other cases. For instance, in one non-game-related study, the author examines enhanced heuristic evaluation techniques [10]: heuristic criteria were combined and then analyzed for effectiveness, and after comparing the results the most effective heuristics were noted. In another study, researchers [6] introduced Heuristic Evaluation for Playability (HEP), a heuristic evaluation created specifically for the evaluation of game-play. According to the findings, HEP was "reviewed by several playability experts and game designers." A comparative study of the HEP method and end user observation revealed specific problems that could only be found through observation. In a study about customizing evaluations for video games, Pinelle et al. divided video games into six different genres: role-playing, sports, shooter, puzzle, strategy, and adventure [4]. They then took twelve common usability problems found in 108 different reviews (eighteen for each video game genre) and mapped their occurrences to each genre; the common occurrences of usability problems for each genre were shown in radial charts. After finding problems common to specific genres, they discuss the implications those problems could have on evaluation techniques. In another study, researchers developed ten criteria for evaluating video game interface problems [5]. Initially, five video game players with experience analyzing interfaces were recruited to evaluate a specific game using heuristic evaluation criteria developed from the researchers' previous work. After evaluating the game by playing it and applying the given criteria, the problems the evaluators reported were recorded; the evaluators used a severity scale to classify the problems found. There was significant overlap in the reported problems, but some evaluators found unique problems. This study did not take into account the engagement or fun factor of the game. Research on video game interface evaluation suggests that a different type of evaluation criteria must be developed for evaluating interfaces in games. Our study
attempts to prove that a classification of game players based on their interface preferences is possible. This new classification of users could be used as criteria for evaluating interfaces.
3 Methodology
A Windows-based gaming application called "Mayor City" was developed using Java and JOGL under the Eclipse platform. In "Mayor City" the goal of the player is to build a road from one side of a three-dimensional map to the other. The player starts with a small amount of seed money (used to build roads and buildings) and a population of zero. Each "game" day the player receives income based on a percentage of the total income of all the money-making buildings they have built, plus a small base income. In order to build money-making buildings, the player must first build population buildings to meet the population requirements of the specific building.
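Stated in code, this income rule reads as follows; INCOME_RATE and BASE_INCOME are placeholder values assumed for the sketch, since the paper does not give the actual numbers.

// Daily income rule of "Mayor City" as described in the text: a small base
// income plus a percentage of the total income of all money-making buildings.
import java.util.List;

class Economy {
    static final double INCOME_RATE = 0.10; // assumed percentage, not from the paper
    static final int BASE_INCOME = 50;      // assumed base income, not from the paper

    static int dailyIncome(List<Integer> buildingIncomes) {
        int total = buildingIncomes.stream().mapToInt(Integer::intValue).sum();
        return BASE_INCOME + (int) Math.round(INCOME_RATE * total);
    }
}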
Fig. 1. Mayor City
The following elements were added to the game to measure the player's preferences (a sketch of the underlying randomization follows the list):
1. The game input consists of mouse menu interaction and corresponding keyboard controls. Any action in the game that can be done with the mouse can also be done with the keyboard.
2. Two parts of the interface swap location during play. The first is the side menu, which swaps from the left side to the right side at random; this menu allows the user to select a picture of the building they are planning to build. The second is the game text: the top bar displays the population and the funds, and the bottom bar displays the game timer. These elements swap location at random.
3. The entire interface changes color between a cyan background with black text and a blue background with white text.
4. In "Mayor City" the sky darkens and brightens according to the time of day. The color of the sunsets and sunrises varies randomly between red and green.
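These manipulations amount to drawing, independently for each measured element, one of two variants at random. A minimal sketch, with all names assumed:

// Minimal sketch of the randomized interface variants used in "Mayor City":
// each measured element is drawn independently from its two alternatives.
import java.util.Random;

class InterfaceVariant {
    final boolean menuOnRight;   // side menu: left vs. right
    final boolean textSwapped;   // population/funds bar vs. timer bar positions
    final boolean cyanScheme;    // cyan background/black text vs. blue background/white text
    final boolean greenSky;      // sunset/sunrise color: green vs. red

    InterfaceVariant(Random rng) {
        menuOnRight = rng.nextBoolean();
        textSwapped = rng.nextBoolean();
        cyanScheme = rng.nextBoolean();
        greenSky = rng.nextBoolean();
    }
}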
A demographic and end user preference survey was developed. This survey instrument was pilot tested. A Microsoft Access database was developed to implement this survey.
4 Procedure
Permission was sought from the Institutional Review Board at SUNY Brockport prior to the start of this study. Subjects were recruited from undergraduate programs at SUNY Brockport and were informed of their rights and their ability to cease participation at any time. Once they accepted the agreement, they were given the Ishihara color blindness test to check for normal color vision. If they passed the test, they proceeded to the study itself. Subjects were then allowed to play the game for as long as they wanted. Each player was provided with a basic game play instruction document as well as verbal instructions and help as needed. After the conclusion of the game, the subjects were asked to fill out our demographic and game response survey in the Access database application. (For the complete set of questions, please email the author.)
5 Results
Data was collected from 24 undergraduate subjects from SUNY Brockport: 11 female and 13 male students of various disciplines who spoke English as their primary language. On average, these participants indicated that they had been browsing the Internet for approximately 5 years. They further indicated that they spent an average of 5 hours a day working on their computers, of which an average of 2 hours was spent on the Internet. The participants were asked to complete a binary-scaled questionnaire to gauge their interface preferences; for example, participants were asked to respond to the question "Which contrast scheme did you prefer?" using a binary scale. The table below shows the responses from the 24 participants. The results indicated that there were no clear preferences for specific user interface features, with the exception of mouse versus keyboard preferences. Surprisingly, the majority of the users preferred the keyboard, which was not in tune with the literature findings. The results did not indicate any obvious gender differences, as male and female participants showed almost identical interface preferences.
Age ranges: 18 to 20: 8; 21 to 24: 12; 25 to 30: 1; 31 to 35: 2; 36+: 1
Gender: Male: 13; Female: 11
Preferred text contrast: Dark text on bright background: 13; Bright text on dark background: 11
Preferred control interface: Mouse: 6; Keyboard: 18
Preferred sunset sky color: Green: 12; Red: 12
Preferred sunrise sky color: Green: 13; Red: 11
Preferred side menu location: Right-side: 17; Left-side: 7
Preferred menu background color: Cyan: 13; Blue: 11
Preferred date text location at bottom: Yes: 17; No: 7
Preferred population and funds text location at the top: Yes: 10; No: 14
6 Conclusions
This is a preliminary investigation of evaluation techniques for video game interfaces. The findings of this study indicate that there is clearly a need for further research on evaluation techniques. The end users in this study showed no obvious preferences, with the exception of keyboard versus mouse. The results indicated that changing interface features such as color and menu positioning did not affect the end users.
The results from the study are not surprising given the generally high level of familiarity with computer-based gaming that the end users indicated. The enormous diversity of computer games available to end users has ensured that their gaming experience levels are high. The literature confirms these findings.
7 Future Work
This study is far from complete. This report is the first in a series of ongoing studies examining evaluation techniques in the development of gaming user interfaces. The next step is to reevaluate the questionnaires to correct any deficiencies. It would be interesting to observe whether the results can be replicated with a larger sample size. The Mayor City game could be further enhanced with more tasks and more interface options. In addition, this study should be replicated with different demographics, such as children or older adults, to gauge their interface preferences. The literature on best evaluation practices is inconclusive. An interesting methodology would be the use of eye tracking to help researchers analyze end user cognition patterns; this pattern recognition could be extended to other demographics, such as children and older adults, and the study could be replicated with users from other cultures. These findings would be of great interest to game developers as they seek to extend the reach of video games by understanding the preferences of end users.
References
1. Hollingsed, T., Novick, D.G.: Usability inspection methods after 15 years of research and practice. In: Proc. 25th Annual ACM International Conference on Design of Communication, pp. 249–255. ACM Press, El Paso (2007)
2. Juul, J., Norton, M.: Easy to use and incredibly difficult: on the mythical border between interface and gameplay. In: Proc. 4th International Conference on Foundations of Digital Games (2009)
3. Barr, P., Noble, J., Biddle, R.: Video game values: Human-computer interaction and games. Interacting with Computers 19(2), 180–195 (2007)
4. Pinelle, D., Wong, N., Stach, T.: Using genres to customize usability evaluations of video games. In: Proc. 2008 Conference on Future Play, Toronto, Ontario, Canada, November 3-5, pp. 129–136 (2008)
5. Pinelle, D., Wong, N., Stach, T.: Heuristic evaluation for games: usability principles for video game design. In: Proc. CHI 2008, pp. 1453–1462 (2008)
6. Desurvire, H., Caplan, M., Toth, J.A.: Using heuristics to evaluate the playability of games. In: Extended Abstracts of the 2004 Conference on Human Factors in Computing Systems, pp. 1509–1512. ACM Press, New York (2004)
7. Desurvire, H.W., Kondziela, J.M., Atwood, M.E.: What is gained and lost when using evaluation methods other than empirical testing. In: Proc. Conference on People and Computers VII, York, United Kingdom, pp. 89–102 (1992)
8. Jeffries, R., Miller, J.R., Wharton, C., Uyeda, K.M.: User Interface Evaluation in the Real World: A Comparison of Four Techniques. In: Proc. CHI 1991. ACM, New Orleans (1991)
9. Nielsen, J.: Usability inspection methods. In: Proc. ACM CHI 1995 Conference, May 7-11, pp. 377–378. Denver, CO (1995)
10. Nielsen, J.: Enhancing the explanatory power of usability heuristics. In: Proc. SIGCHI Conference on Human Factors in Computing Systems, Boston, Massachusetts, April 24-28, pp. 152–158 (1994)
Emergent Design: Bringing the Learner Close to the Experience Joseph Defazio and Kevin Rand School of Informatics and Department of Psychology IUPUI – Indiana University Purdue University at Indianapolis 535 W. Michigan St., Indianapolis, IN 46202 {jdefazio,klrand}@iupui.edu
Abstract. The creative process of design is at the foundation of serious game and simulation development. Using a systematic approach, the designer of serious simulations or games analyzes the best approach for delivering an interactive learning experience: one that harnesses growing forms of behavior, requiring both the learner and the technology to engage in an open-ended cycle of productive feedback and exchange. According to Collins [1], "Beyond simply providing an on/off switch or a menu of options leading to 'canned' content, users should be able to interact intuitively with a system in ways that produce new information. Interacting with a system that produces emergent phenomena is what I am calling interactive emergence" (4th Annual Digital Arts Symposium: Neural Net{work}). Keywords: creative process, emergent design, serious game design, health education simulation.
1 Introduction
The healthcare profession is heavily task- and performance-based, where clinical reasoning is paramount alongside integrity, empathy, and compassion. Many of these attributes are difficult to teach and assess in a traditional classroom. Patient safety has to be the ultimate outcome of any healthcare curriculum. As Khan, Pattison, and Sherwood state, "Recently, a range of simulation techniques have surfaced that allow the learner to 'enhance learning of healthcare professionals in safe environments, without compromising the patient safety, while maintaining a high degree of realism'" [2]. Serious simulation design and development is just one example of this technique. Serious games and simulations have started to permeate the healthcare arena. Zyda defines serious games as "a mental contest, played with a computer in accordance with specific rules, that use entertainment to further government or corporate training, education, health, public policy, and strategic communication objectives" [3]. This mental contest is an interactive dialog that allows the learner (gamer) to navigate through a series of vignettes (digital stories) and challenges in order to achieve the learning outcomes designed into the simulation experience.
The Suicide: Prevention, Intervention simulation is a work-in-progress that uses emergent design: an intuitive interactive experience coupled with a focus on usability. The authors' intention is to provide learners with an educational experience while they engage in interactive dialogue with virtual characters, in situations that increase awareness of potential suicidal behavior, promote and encourage effective communication with members of specific at-risk populations, and enhance participation and activation skills in working with members of a population, thereby improving mental health outcomes.
2 The Creative Process of Emergent Design The creative process of emergent design is essentially a problem-solving process. In the creative process the designer is actively engaged in systematic problem analysis much as an architectural designer would design a blueprint for a building or a computer scientist would develop programming code structures for an application. Emergent design is an evolutionary process. The simulation begins to reveal its structure and performance attributes as the design evolves. According to EmergentDesign.org, “Emergent design holds both emergence—the bubbling up of new phenomena from the random interaction of individuals—and design—the planned progression of events towards a goal—in a dynamic embrace that maintains a higher level view of the process, and thus transcends the apparent duality by facilitating the interaction between the two approaches. It seeks to reduce the friction points that inhibit the free flow of information and to allow for experimentation towards a goal, so that product development is more like the leaves of trees reaching for the light than central planning.” [4]. In emergent design, this serious health education simulation serves as a way of testing specific hypotheses within the Interpersonal-Psychological Theory of Suicide, which posits that the motivation for suicide is a function of two psychological factors: belongingness and burdensomeness. For example, the interactive simulation could be tweaked to only intervene on belongingness or only on burdensomeness and then these versions of the game could be compared in terms of their impact in reducing suicide risk (e.g., two versions of the simulation could be compared at two different college campuses). This would answer empirical questions about which psychological need has a greater impact on suicidality. Moreover, hypotheses could be tested as to whether there is an additive benefit to intervening on both burdensomeness and belongingness, or if intervening on just one is sufficient to reduce risk for suicide. The goals for the Suicide: Intervention, Prevention simulation are presented in Table 1. Obviously, studying suicide is a difficult endeavor given the ethical and moral issues surrounding it. Hence, it is difficult to conduct much research that involves experimental manipulation in order to determine causality. With this interactive serious health education simulation however, researchers could experimentally manipulate important variables while still behaving in an ethical manner. In summary, this simulation would provide a modifiable tool that could be used to answer specific research questions related to suicide and expand our understanding of the mechanisms involved in suicidality.
Table 1. Goals for the Suicide: Intervention, Prevention simulation
Goal: Raise awareness of risk factors. Example: marked change in behavior or appearance.
Goal: Model appropriate intervention and help-seeking behaviors. Example: caring, empathy, compassion.
Goal: Inform of resources for help-seeking. Example: present sources for help and guidance.
Goal: Model appropriate follow-up behaviors. Example: follow-up, provide continual support.
Cavallo claims that emergent design manages the overall process [5]. Emergent design uses the visual messages from artifacts in the process of systematic analysis. Kitamura, Kakuda, and Tamaki provide a supporting claim that emergent design uses "the concepts of evolution, adaptation, learning, and coordination" [6], which are addressed iteratively.
3 Using Artifacts to Inform the Design
In the design and development of simulations in health education, the authors use available turnkey systems for animators, content creators, visualization, and special effects. Applications can easily be developed that provide 3D motion data, volumetric 3D meshes, and video surface textures, allowing the game design team to reduce production time. For game developers, the entire production pipeline is shortened, production-ready animation data is available instantly, and animators can implement characters, graphics, and animated sequences without extensive design and development
Fig. 1. 3D Modeled Character
Fig. 2. Street Scene
Emergent Design: Bringing the Learner Close to the Experience
39
production. For game designers, instant 3D characters are created in seconds. Game and simulation designers and developers have access to faster, cheaper, and better motion capture for 3D applications. The process of design incorporates both the representation of the artifact being designed and the process of development by which the design is completed [7]. Design can be defined as the making of original plans, patterns, and/or schemes toward an aim or goal. For example, Figure 1 shows a 3D character designed using Mixamo, Figure 2 shows a screen capture from The Sims 3, and Figure 3 shows the character embedded into the street scene [14]. The character developed in Mixamo is animated in a walk cycle; this animated character can be tweened to move along the sidewalk path depicted in the street scene.
Fig. 3. Embedded Character/Scene
Although Seels and Richey define design in the context of instructional design, as "the planning phase in which specifications are constructed" [8], the same premise can be used for serious game and simulation design. Development can be defined as a stage or phase in the gradual growth, evolution, or advancement toward an aim or goal: "the process of translating the design specifications into physical form" [8]. Design and development are related processes, and both have goals and outcomes for the product to be developed. One of the biggest time constraints in game and simulation design and development is the design and animation of a 3D character: "Animating an articulated 3D character currently requires manual rigging to specify its internal skeletal structure and to define how the input motion deforms its surface" [9]. Within the emergent design process, the authors engaged in rapid game development as the main context for design and development. The authors explored several game engines and software applications that would allow quick prototyping and development. Mixamo [10] was used for character development and animation. The Sims 3 simulation game was used for graphics, backgrounds, and assets (e.g., furniture, home content, etc.).
The development of serious educational games and simulations has proven to be complex, time-consuming, and costly [11], [12]. Serious game and simulation designers and developers continually face time constraints and demands on the delivery of the end product. One of the issues at the forefront is how to reduce development and production time. According to Pagulayan et al., "There is a great deal of pressure on designers to utilize new technologies which may break old interaction models" [13]. The authors recognize these limitations and have adopted emergent design as the driving factor in the design of the educational simulation titled Suicide: Prevention, Intervention.
4 Conclusions
The goals for the Suicide: Prevention, Intervention simulation are to provide an engaging interactive experience that educates the learner about suicide prevention. The objectives are to: 1) raise awareness of risk factors (e.g., marked change in behavior or appearance, agitated or depressed mood, drug or alcohol abuse, etc.); 2) model appropriate intervention and help-seeking behaviors; 3) inform of resources for help-seeking; and 4) model appropriate follow-up behaviors. The authors of this article are actively designing and developing a serious simulation titled Suicide: Intervention, Prevention. This work-in-progress is being developed as an educational simulation (serious game) that includes skills, knowledge, and values and allows the learner to interact with virtual characters who exhibit mental health issues (e.g., suicidal or harmful behavior). Learners are able to think, interact, and solve behavioral problems in social situations that demand effective communication, analysis, and response in various health education scenarios. Using four programmed scenarios, the learner engages with distinct issues regarding suicidal thoughts as portrayed by characters in the simulation. This article has focused on the creative process of emergent design and the usability issues affecting the overall production flow of this serious simulation.
References
[1] Collins, D.: Breeding the Evolutionary: Interactive Emergence in Art and Education. Paper given at the 4th Annual Digital Arts Symposium: Neural Net{work}, April 11-12. University of Arizona, Tucson (2002)
[2] Kahn, K., Pattison, T., Sherwood, M.: Simulation in Medical Education. Medical Teacher 33(1), 1–3 (2011)
[3] Zyda, M.: From Visual Simulations to Virtual Reality to Games. Computer 38(9), 25–32. IEEE Computer Society, Los Alamitos (2005)
[4] EmergentDesign.org: What is Emergent Design (2009), http://emergentdesign.org/chrysalis/topics/emergence/
[5] Cavallo, D.: Emergent Design and Learning Environments: Building on Indigenous Knowledge. IBM Systems Journal 39(3), 768–781 (2000)
[6] Kitamura, S., Kakuda, Y., Tamaki, H.: An Approach to the Emergent Design Theory and Applications. Artificial Life and Robotics 3, 86–89 (1998)
[7] Dym, C.L.: Engineering Design: A Synthesis of Views. Cambridge University Press, Cambridge, MA (1994)
[8] Seels, B.B., Richey, R.C.: Instructional Technology: The Definition and Domains of the Field. Association for Educational Communications and Technology, Washington, DC (1994)
[9] Baran, I., Popović, J.: Automatic Rigging and Animation of 3D Characters. ACM Transactions on Graphics (TOG) 26(3), 72–78 (2007)
[10] Mixamo: Animation in Seconds (2010), http://www.mixamo.com
[11] Nadolski, R., Hummel, H., van den Brink, H., Hoefakker, R., Slootmaker, A., Kurvers, H., Storm, J.: EMERGO: Methodology and Toolkit for Efficient Development of Serious Games in Higher Education. Electronic Games, Simulations and Personalized eLearning (2007)
[12] Westera, W., Hommes, M.A., Houtmans, M., Kurvers, H.J.: Computer-Supported Training of Psychodiagnostic Skills. Interactive Learning Environments 11(3), 215–231 (2003)
[13] Pagulayan, R., Keeker, K., Fuller, T., Wixon, D., Romero, R.: User-Centered Design in Games (revision). In: Jacko, J., Sears, A. (eds.) Handbook for Human-Computer Interaction in Interactive Systems. Lawrence Erlbaum Associates, Mahwah (2006) (in press)
[14] Defazio, J., Rand, K., William, A.: Serious Game Design: Reducing Production Time Constraints. In: 1st Annual Midwest Health Games Conference, Indianapolis, USA (2010)
[15] Kharrazi, H., Faiola, A., Defazio, J.: Healthcare Game Design: Behavioral Modeling of Serious Gaming Design for Children with Chronic Diseases. In: Salvendy, G., Jacko, J. (eds.) Proceedings of the 13th International Conference on Human-Computer Interaction, San Diego, CA. Lawrence Erlbaum, Mahwah (2009)
Eliciting Interaction Requirements for Adaptive Multimodal TV Based Applications Carlos Duarte, José Coelho, Pedro Feiteira, David Costa, and Daniel Costa LASIGE, University of Lisbon, Edifício C6, Campo Grande 1749-016 Lisboa, Portugal {cad,pfeiteira}@di.fc.ul.pt, {jcoelho,dcosta,thewisher}@lasige.di.fc.ul.pt
Abstract. The design of multimodal adaptive applications should be strongly supported by a user centred methodology. This paper presents an analysis of the results of user trials conducted with a prototype of a multimodal system in order to elicit requirements for multimodal interaction and adaptation mechanisms that are being developed in order to design a framework to support the development of accessible ICT applications. Factors related to visual and audio perception, and motor skills are considered, as well as multimodal integration patterns. Keywords: Multimodal interaction, Adaptation, User trials.
1 Introduction TV is one the most ubiquitous devices in our homes. As a result, it has been explored over the years as a means to convey all types of contents, ranging from information to entertainment, to viewers in their houses. Viewing experience has been, as the name implies, passive, with interaction mostly limited to choosing what channel to watch. This trend is now changing, with offers, like Google TV, promising a future where the viewer takes on an active role, and becomes not only a consumer, but also an active content producer. Interaction possibilities will encompass browsing the web, contributing with blogs, publication of different media, like pictures and videos, exploring different applications, including social networks’ access, and, keep the more traditional use cases of watching live TV, and recording and watching recorded TV shows. All this new interaction possibilities will still happen in the same environment where we are used to watch TV. This is mainly the living room, but also includes every other location where people watch TV, like kitchens or their bedrooms. Apart from the location, the interaction means used in these settings will have to move forward from the remote controls that are currently used to interact with a TV and the devices connected to it. Although well suited to switch channels and control the audio volume, they will be inadequate for the majority of the interaction tasks that will become available. While using a remote to select a canned response might still be a possibility, using it for entering a tweet, for instance, will surely discourage users C. Stephanidis (Ed.): Universal Access in HCI, Part I, HCII 2011, LNCS 6765, pp. 42–50, 2011. © Springer-Verlag Berlin Heidelberg 2011
from doing so. This means that a whole set of new interaction techniques will have to make their way into the living room. Gesture recognition will become part of the interaction experience, which is already happening, first with the Nintendo Wii and more recently with Microsoft Kinect. But we can expect voice recognition to also become part of this new interactive TV scenario, as well as alternative "remote controls" such as smartphones or tablet devices. Besides this variability in environments and interaction devices, the target audience makes this a truly inclusive problem, since everyone can be a user of such platforms, regardless of age, knowledge, and physical, sensory, or cognitive abilities. Moreover, each user will have his or her preferences: users will be more comfortable with one interaction device than with others, and even that may vary depending on the task they are accomplishing. One approach to tackling the problems raised by all these variables, environment, devices, and users, is to employ adaptation techniques. This allows an intrinsically multimodal system to exploit to the maximum the advantages of multimodal interaction, most notably the possibility of interacting naturally, with the added benefits in terms of learnability and ease of use, and of letting users choose the modality that is most adequate based on their appraisal of the current situation [1]. Adaptation allows a system to exploit the knowledge it has about its users in order to provide the most appropriate means of interaction for a given task in a given environment, while being aware of the context-defining parameters [2]. To support efficient and effective adaptation in a multimodal setting, it is of the utmost importance to correctly identify the adaptation variables and the interaction patterns users reveal. This paper reports on the efforts made to tackle the aforementioned issues, in the context of the European Union funded GUIDE project (www.guide-project.eu). By employing a user-centred approach, interaction requirements are being elicited to understand how users' abilities impact their perception and their use of different skills. This allowed the identification of several interaction requirements, initially targeted at specific modalities, such as visual presentation issues. Simultaneously, we have been observing how users integrate multiple modes of interaction when offered the possibility to explore them in a combined or independent fashion.
2 Context
The studies reported in this paper were made in the context of the European Union funded project GUIDE (Gentle User Interfaces for Elderly Citizens). GUIDE aims to provide a framework and toolbox for adaptive, multimodal user interfaces that target the accessibility requirements of elderly users in their home environments, making use of TV set-top boxes as a processing and connectivity platform. GUIDE envisions bringing users the benefits of multimodal interaction, empowering interaction through natural and seamless interaction modes that do not require learning and that will be able to convince users to adopt them [3]. This includes the modalities used in natural human-to-human communication, such as speech and pointing. Additionally, being based on TV as the central processing
hub, GUIDE also encompasses remote controls as another interaction modality. Moreover, in an attempt to explore novel interaction scenarios, and even to promote mobility in the interaction settings explored, GUIDE includes tablets as an additional device that can be employed for both input and output. The majority of the system's output will be transmitted, as expected, through the TV set, complemented by the aforementioned tablet and by the possibility of haptic feedback through the remote control. As can easily be perceived from the above description, GUIDE aims to implement a fully multimodal system, offering its users a range of modalities that they can explore in order to address both their impairments and the context variations that might arise. In order to avoid the excessive configuration complexity that such richness might impart, GUIDE includes a set of adaptation capabilities to harness the complexity of managing its full array of modalities. Adaptation will thus impact multimodal fusion, by taking advantage of known multimodal interaction patterns and by adapting individual recognizers to increase their efficiency, and multimodal fission, by generating output presentations customized to the system's current user, considering a diversity of factors, the most important being the user's perceptual, motor, and cognitive skills and abilities. Adaptation will also impact the dialog management component, thus adapting the interaction flow with applications to the user's abilities. To support such system characteristics, it is fundamental to employ a user-centered design methodology. The user is the pivotal subject of the system's adaptation mechanisms: user information drives the adaptation, and knowledge about the user is fundamental for the fusion and fission processes and for dialogue management. As a result, a profound characterization of the target user group is one of the project's milestones, and several iterations of user studies are planned in the scope of this project. The next section details the first of these iterations.
3 User Studies
As mentioned above, several user studies have been planned in order to collect the information required for characterizing the target population. These will proceed in an iterative fashion, intermingled with development, allowing updated feedback on the evolution of the envisioned framework and its deployment. In this section we describe the set-up of the first user study conducted in the scope of the project. Given that it was the first study, the possibility of using low-fidelity prototypes was considered. This was, however, readily discarded, given that we wished to capture reactions to what the technology could offer its users as close to reality as possible. Having already available the majority of the individual technologies envisioned for the project, we opted to build a multimodal system to explore individual reactions to the interaction means made available and to study patterns of usage [4]. The recognizers already supported by the system include pointing, either freehand through a motion sensing system or with a remote control (in our case, the Nintendo Wii remote control). Speech recognition was not included at this time, but tasks accepting speech input were nevertheless included in the studies, in order to assess how representatives of the target population employ it. Output was
Eliciting Interaction Requirements for Adaptive Multimodal TV Based Applications
45
achieved through a TV, including text, audio and video. An avatar was also used; with it we attempted to understand whether it could contribute to the user's empathy towards the system. The script included tasks to exercise perceptual, motor and cognitive skills. With them we aimed at exploring how users perceive and prefer visual elements (font size, font and background colors, object placement …) and audio stimuli (volume). We also included a set of tasks to exercise the users' cognitive abilities. Furthermore, even though in some tasks users were free to interact in any way they wished, including combining modalities, we included tasks requiring users to explicitly use more than one modality, in an attempt to observe which integration patterns were used. The trials were conducted over a period of one week. Nine elderly people participated: five men and four women. Their average age was 74.6 years, with the youngest being 66 years old and the oldest 86 years old. Sessions lasted about 40 minutes, in which participants familiarized themselves with the available means of interaction and executed the required tasks. In the next section we discuss our findings, based mostly on qualitative observations of the trials and on the participants' remarks made during and after the trials. It should be stressed that these were the first trials, hence the small number of participants. Currently, a new set of trials is underway, involving several scores of participants, in which the findings from the first trials reported here have already been used to improve the system's performance.
4 Discussion

As mentioned previously, the trial's script took into consideration visual, audio and motor requirements. Moreover, in some tasks participants were free to choose how they wished to interact with the trial's application, while in other tasks they were specifically requested to interact with a specific modality or combination of modalities. In most of the script's questions participants had to perform the same task with different modalities or with a different rendering parameter, and after performing the task they had to report their favorite configuration for that task. In some other questions participants were simply asked to select one of several differently rendered presentations, without having to explicitly perform any task. The trials began with a simple explanation task, where participants were asked to select one of four options presented visually on screen as four buttons. They had to do so through finger pointing, using the Wii remote to point, through speech, and with a combination of speech and one of the pointing options. Afterwards, different presentations were explored: target placement (near the center or near the edges of the screen), target size, target spacing, target color, text size and text color. Only for the color presentations did participants simply have to select their preferred option; in all others, they had to perform a task, be it selecting one of the targets or reading the text aloud. The trial then proceeded with an audio rendering task where participants were asked to repeat text that was rendered to them through a TTS system at different volume levels. This was followed by motor tests, where users had to perform gestures (not simply pointing), both without surface support and on a surface emulating a tablet.
Afterwards, users had to perform a selection task in all combinations of presentation (options presented visually, aurally or both combined) and input (selection made through pointing, speech or both combined) modalities. Finally, they had to assess what would be their preferred modality for being alerted to an event while watching TV and while browsing photos on the TV. Options included on-screen text, TTS, the avatar, and combinations of the three. The test ended with a comparison between the avatar and a video of a person presenting the same content, to assess the avatar's ability to generate empathy with future system users. In the following paragraphs we report our findings, based on the participants' expressed opinions but also on our observations, in situ and from the trials' recordings.

4.1 Visual Perception

Regarding visual presentation there were two main foci: targets (e.g., buttons) and text. Targets were analyzed according to their placement, size, separation and content. Concerning target placement, most participants (6) preferred targets near the center of the screen instead of near the edges. Most participants (7) preferred the larger targets. The majority of participants (6) preferred the version with greater separation over the version with the targets closer together. Reasons given for these preferences include the targets being easier to see (and understand), in addition to movement-related issues (being easier to select). Regarding the content of targets, participants showed a clear preference for solutions that provide high visual contrast. The most popular choices were white text on a black or blue background and black text on a white background, with a single participant electing blue text on a yellow background. There seemed to be some consensus among participants that strong colors are tiring. Concerning text presentation, both size and color were evaluated. Six text sizes were presented to participants. The largest was a 100-pixel font (meaning that approximately five words would fill half the TV screen) and the smallest a 12-pixel font; intermediate sizes were 80, 64, 40 and 24 pixels. Only one participant preferred the largest font. Five participants preferred the second largest size and three participants the third largest. No participant preferred any of the three smallest sizes.

Table 1. Participants' preferences regarding text color (in rows) for different background colors (columns)
Text color   White   Black   Blue   Green
White          -       7       8      2
Black          7       -              5
Blue
Red
Green
Orange
Yellow
Gray
Text color was evaluated against different background colors. The text colors considered were white, black, blue, red, green, orange, yellow and gray; the background colors used were white, black, blue and green. Participants mostly opted for high-contrast combinations. Table 1 shows the participants' expressed preferences, with text color in the rows and background color in the columns; the values in the table's cells represent the number of participants who selected that particular combination of text and background color.

4.2 Audio Perception

The audio tasks evaluated participants' ability to perceive messages at different volumes. Five volumes were employed. The test started at the loudest setting and the volume was then decreased by half each time; after reaching the lowest volume the procedure was repeated with the volume increasing by the complementary amounts. Three participants preferred the loudest volume, and all but one participant preferred one of the three loudest volumes. However, some participants noted that the highest volume was, in their opinion, too loud. One interesting finding, reported by some participants, was that their comfortable audio level differed depending on whether the volume was decreasing or increasing. For instance, one participant reported that she could understand the spoken message only at the first two volumes when the volume was decreasing, but that she could understand the loudest three volumes when the volume was increasing. Other examples of such behavior were observed and reported. In addition to reporting their preferred volume setting, several participants also reported being comfortable with the three loudest volumes.

4.3 Motor Skills

Participants found both free-hand pointing and pointing with the Wii remote to be natural interaction modalities, meeting our goal of providing interaction modalities that are natural and do not require learning. When comparing free-hand pointing and Wii pointing, the majority (6 participants) preferred free-hand pointing. Nevertheless, there were some interesting remarks. One participant stated that she preferred the Wii remote because it reminded her of the remote control she is used to handling when watching TV at home. Other participants changed their preferences during the trial, after becoming more accustomed to both options; these participants typically moved from an initial preference for the remote to a preference for free-hand pointing. For the design of pointing interaction devices it was possible to gather some relevant indications. For instance, one participant pointed almost exclusively with her finger, barely moving her arm and hand, which was especially challenging for the motion tracking system. Other participants changed the pointing hand depending on where the target was on screen, using the left hand to point at targets on the left side of the screen and the right hand for targets on the right side. This knowledge can be used, for instance, to adapt the presentation based on the abilities of the user's arms. Participants were also asked to perform a set of representative gestures (e.g., circling, swiping), both in free air and on a surface representative of a tablet. Eight of
the participants preferred performing the gestures in free air. Some participants justified this preference by feeling they could be more expressive in that manner. Others reported that doing gestures in free air is similar to gesturing when engaged in conversation with another person, and thus feels more natural. Another raised the issue that when performing gestures on the tablet she would not be able to see any content that might be displayed on it. Participants were also asked whether they preferred to perform those gestures with one or with two hands. Four participants expressed no preference in this regard, with only two preferring two-handed gestures and three preferring one-handed gestures. Regarding their preferences and abilities when asked to perform pointing tasks with targets in the four corners of the TV screen, it was not possible to identify a clear tendency in the collected results: one participant found it easier to point at the top right, two preferred the bottom right, one preferred the top edge, one preferred the bottom edge, and four did not express any preference.

4.4 Specific Modality and Multimodal Patterns

One important topic to address when using both gestures and speech is how users combine them, specifically whether and how they make use of deictic references. One initial observation was that the purpose of combining modalities was not clear to all participants. This could, however, be attributed to their being required to combine modalities to do the same task they had just performed with a single modality. Most participants employed multiple modalities in a redundant fashion, speaking the option's text while pointing to it. From the interaction analysis it was possible to identify some integration patterns: four participants pointed before speaking, while another four spoke before pointing; for the remaining participant no clear tendency was found. A few participants combined pointing with spoken deictic expressions. In one of the trials the participant used a deictic reference while pointing and then followed it by speaking the option's text, introducing a different interaction pattern. Multimodal interaction was explicitly exercised in the context of a selection task done with visual, audio and combined presentations, and with selection possible through pointing, speech and combined usage of these two modalities. Regarding the presentation, participants unanimously expressed a preference for the system to employ redundant visual and audio presentation (all 9 participants selected this option). Observation of some participants' behavior showed that when the system presented options both visually and aurally, they did not wait for all options to be presented, answering as soon as they perceived the right answer. Regarding the input, seven participants were satisfied with having simply speech recognition, commenting that it is much easier to perform selections in this fashion. Two participants expressed their preference for a system where they can combine pointing and speaking. Different combinations of modalities to alert the user were also considered in the context of two different scenarios: watching TV and browsing pictures on the TV screen. For the TV watching scenario preferences were variable: two participants preferred the alert to be rendered using speech synthesis only, two preferred to be alerted by avatar and text message, four preferred text and audio, and one the avatar
only. In the photo browsing scenario a similar variability was found: two participants preferred alerts through speech synthesis only, two preferred the avatar and text, one preferred text and audio, and two preferred the avatar only. Other interesting observations regarding the use of multiple modalities were made. Some participants, even in tasks where they were required to select only by pointing, ended up also speaking the option's text, without even noticing it. In one case, even after being asked to point at the correct option, a participant just spoke the option's text. Concerning the use of speech alone for selection operations, it is important to understand whether users simply read aloud one of the choices presented to them or whether they use alternative speech. Although most participants simply read aloud one of the presented options, we also witnessed some natural utterances, like saying "the fourth" instead of "option 4", which was the text presented on screen. A simple sketch of how such speech and pointing events can be fused is given at the end of this section.

4.5 Other Topics

Although they were not available to be used during the trial, participants were shown a keyboard and a mouse and asked whether they would like to have them available to interact with GUIDE. The majority (6 participants) did not wish to have them available. Some even expressed that they could not really understand why the system should employ those devices if users could interact through the natural means they had already tried (speech and gestures), while others simply stated that those interaction means are harder to use than speech or pointing. The use of an avatar in the system was also assessed. The participants' reaction to the avatar was not as positive as expected. There are some justifications for this: the avatar employed was too small and was not properly configured with respect to emotion expression. Nevertheless, it was possible to gain some knowledge about the use of an avatar in the context of the proposed system. Some participants expressed a request for the avatar to look better, less cartoonish, in order to make them feel better about it. An important observation was that four out of the nine participants responded to the avatar's greeting message as if they had been greeted by a person. This is an indicative sign that avatars can be used to promote bonding between users and the system.
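As referenced above, the following is a minimal sketch of temporal-window fusion of speech and pointing; the event structure and window size are illustrative assumptions, not GUIDE's actual fusion engine. It pairs a speech event with the nearest pointing event inside the window, regardless of which modality came first (both orders were observed in the trials).

```python
# Minimal sketch of time-window fusion of speech and pointing (assumed
# mechanism for illustration): a speech event is paired with the nearest
# pointing event inside a window, regardless of which modality came first.
from typing import NamedTuple, Optional

class Event(NamedTuple):
    modality: str  # "speech" or "pointing"
    payload: str   # recognized text or target identifier
    t: float       # timestamp in seconds

def fuse(speech: Event, pointings: list, window: float = 2.0) -> Optional[tuple]:
    """Return (utterance, target) if a pointing event falls within the window."""
    candidates = [p for p in pointings if abs(p.t - speech.t) <= window]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda p: abs(p.t - speech.t))
    return (speech.payload, nearest.payload)

# A deictic utterance resolved against a pointing act made 0.4 s earlier:
print(fuse(Event("speech", "that one", 10.4),
           [Event("pointing", "option_4", 10.0)]))  # ('that one', 'option_4')
```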
5 Conclusions

Due to their intrinsic nature, the design of multimodal interaction systems should be strongly founded on user-centred design techniques. That is exactly what is being done in the context of the GUIDE project, and some of its initial findings have been reported in this paper. User trials with a multimodal system are being conducted in order to characterize user abilities and to assist in future evolutions of the system. These results are expected to have impact both at the individual modality level and through the multimodal integration patterns found. This paper presented a summary of the findings, and of the impact that users' visual perception, audio perception and motor skills can have on the interaction design of
such systems. Additionally, some multimodal interaction patterns were observed and reported. What is clearly supported by the data acquired so far is the need for adaptation mechanisms in order to provide adequate interaction mechanisms to a user population with such a diversity of abilities. One example of how adaptation could be explored became evident as a result of the observations conducted so far: participants selected their pointing hand based on where the target was on screen. This knowledge can be exploited when deciding on presentation details: for instance, if we know the user has impairments affecting the left hand, we can present the selectable targets on the right side of the screen, offering what should be a more comfortable interaction experience (a simple version of this rule is sketched below). These user trials are ongoing, so further observations and insights are expected in the near future and will be reported in due time.
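As a minimal illustration of the adaptation rule just described, the sketch below assumes hypothetical per-hand ability scores; it is not part of the GUIDE system.

```python
# Illustrative rule with hypothetical per-hand ability scores (0.0 .. 1.0):
# place selectable targets on the side served by the user's more able hand.
def target_side(left_hand: float, right_hand: float) -> str:
    if abs(left_hand - right_hand) < 0.1:
        return "center"  # no clear difference: keep targets centered
    return "right" if right_hand > left_hand else "left"

print(target_side(left_hand=0.3, right_hand=0.9))  # -> right
```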
References

1. Oviatt, S., Darrell, T., Flickner, M.: Special issue: Multimodal interfaces that flex, adapt, and persist. Commun. ACM 47(1), 1 (2004)
2. Duarte, C., Carriço, L.: A conceptual framework for developing adaptive multimodal applications. In: Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 132–139. ACM Press, New York (2006)
3. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. In: Lalanne, D., Kohlas, J. (eds.) Human Machine Interaction. LNCS, vol. 5440, pp. 3–26. Springer, Heidelberg (2009)
4. Duarte, C., Feiteira, P., Costa, D., Costa, D.: Support for inferring user abilities for multimodal applications. In: Proceedings of the 4th Conferência Nacional em Interacção Pessoa-Máquina, Aveiro, Portugal (2010)
Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications Peter Forbrig, Anke Dittmar, Jens Brüning, and Maik Wurdel University of Rostock, Department of Computer Science, Albert Einstein Str. 21, 18055 Rostock, Germany {peter.forbrig,anke.dittmar,jens.bruening, maik.wurdel}@uni-rostock.de
Abstract. This paper discusses approaches for specifying workflows based on task models. These task models represent the activities of stakeholders in different ways. It is shown how the development process of workflow specifications can be supported to obtain hierarchical, structured and sound specifications. Furthermore, a language, CTML, is introduced that was developed to specify activities in smart environments. The language has the potential to be used for general workflow specifications as well. It is demonstrated how cooperative work can be specified using this language. Keywords: Stakeholder-driven Specifications, Business Process Modeling, Workflow Specifications, Task Modeling.
1 Introduction

Cooperative work of different stakeholders is often specified as workflows that are modeled in a flat way, as in EPCs, UML activity diagrams or BPMN. Such specifications often do not reflect the perspectives of different stakeholders (see Fleischmann et al. [8]). The authors state: "… participation- and communication-oriented modeling is still an open issue." (p. 81). Although sub-processes can be defined in workflow specifications, hierarchical modeling is not an integral part of these languages. In contrast, hierarchy is an integral part of the specifications used in software engineering [1] to capture functional requirements, and of the ARIS method [21] for business process modeling. In this domain hierarchical supply and value chains are also used, e.g., in the SCOR model [20]. Task trees [7] capture the hierarchical character of tasks and activities in an integrated way as well. Although these last three modeling languages share a similar hierarchical character, they are used for different purposes. Task models are used to specify activities in a user-centered way; furthermore, in task trees the control flow is specified within the tree. Several languages have been developed for task models in the past. They were mainly used in connection with user interface development. One of the first approaches was HTA (Hierarchical Task Analysis) [1]. Subsequently, TKS [10], GOMS [5] and CTT [14] have been
developed. Although many other languages have been developed, the CTT notation is currently the one most often used in publications. In addition to their application during requirements engineering, task models can also be used for modeling business processes or workflows. This idea is not new: Traetteberg already noted in [22] that "Workflow concepts are compatible with task modeling concepts, although they have different foci and strengths". Furthermore, task trees are generally structured workflow models [11], which provides further benefits. In [4] the correlation between CTT and UML activity diagrams is shown and a structured editor for activity diagrams is presented. Structured workflow models are less error-prone and more readable and understandable [14]; unreadable and unstructured workflow models resembling "spaghetti code" [9] are avoided. Task tree models forbid ill-formed models like the vicious cycle and are deadlock- and livelock-free and sound by design. Although expressive power is lost compared to flow-chart-oriented languages like BPMN, there are many benefits to using task trees as workflow models. Consequently, it seems promising to do so. The paper is structured as follows. In Section 2 the foundation for modeling business processes with task trees is given. Section 3 presents the modeling language CTML, originally developed for smart environments. Finally, there is a discussion and a summary.
2 Modeling Cooperative Activities with Task Trees

In this section the modeling of workflows with task trees is discussed in more detail. The hierarchy is an integral part of the workflow specifications, and binary temporal relations are used to specify the control flow. The temporal operators are related to the operators of the process algebra LOTOS [3].

2.1 Control Flow Specification Using Temporal Operators

CTT is currently the most widely referenced language for task models. Tasks are arranged hierarchically, with more complex tasks decomposed into simpler sub-tasks. CTT distinguishes between several task types, which are indicated by the icon of the task node. There are abstract tasks, which are further decomposable into combinations of the other task types, including interaction, application and user tasks (see Fig. 1 for an overview of the available task types). The task type denotes the responsibility of execution (human, machine, interaction, cooperation with human).
Fig. 1. Task types for task models
They are connected pairwise by binary temporal operators. In addition to the binary temporal operators there are unary ones that relate to a single task. Some CTT operators are listed in Table 1; a complete listing and explanation of the CTT operators can be found in [18]. Operators have priority orders, which are important for interpreting different operators on the same level: for instance, the priority of the order-independence operator ( |=| ) is higher than that of the enabling operator ( >> ). The only unary operator listed in Table 1 is iteration. Fig. 2 gives an impression of the specification of a task model.

Table 1. Selected temporal operators of CTT

Operator name        CTT notation
choice               T1 [] T2
order independence   T1 |=| T2
interleaving         T1 ||| T2
disabling            T1 [> T2
enabling             T1 >> T2
iteration            T1*
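To make the operator table concrete, the following is a minimal sketch of how such task expressions could be represented as data; the encoding is an assumption for illustration, not the API of any CTT tool.

```python
# Minimal sketch (assumed encoding, not the API of a CTT tool) of task
# expressions: tasks are leaves, temporal operators combine sub-expressions.
from dataclasses import dataclass

@dataclass
class Task:
    name: str

@dataclass
class Op:
    operator: str   # one of "[]", "|=|", "|||", "[>", ">>", "*"
    children: list  # sub-expressions, in order

def enabling(*children):
    return Op(">>", list(children))

def order_independence(*children):
    return Op("|=|", list(children))

def disabling(task, interrupter):
    return Op("[>", [task, interrupter])

# One plausible reading of the "Give Presentation" model of Fig. 2:
give_presentation = disabling(
    enabling(
        order_independence(Task("Start Projector"), Task("Start Laptop")),
        Task("Connect Laptop & Projector"),
        Task("Present")),
    Task("Leave Room"))
print(give_presentation)
```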
An example of a CTT model is given in Fig. 2, which shows how a presenter may give a talk. The abstract root task "Give Presentation" is decomposed into four children tasks. The tasks on the second level of abstraction are connected with the enabling operator (>>) in order to specify that one task has to be performed before the other can start (e.g., "Present" can only be performed after "Configure Equipment" has been executed). An exception to this is "Leave Room", which can be performed at any time due to the disabling operator ([>), resulting in premature abortion of the currently running task. "Configure Equipment" furthermore consists of the tasks "Start Projector", "Start Laptop" and "Connect Laptop & Projector". Those basic tasks are connected with the order-independence (|=|) and enabling operators. The order-independence operator defines the sequential execution of the tasks in arbitrary order, meaning that once one of the tasks is started the other has to wait for the first one to terminate.

Fig. 2. Task model for giving a presentation

Tasks which are not further decomposed are actions and are considered atomic. They represent the smallest entity of execution (e.g., Start Projector).

2.2 Specification of Cooperative Work

In order to support the specification of collaborative (multi-user) interactive systems, CTT has been extended to CCTT (Cooperative ConcurTaskTrees) [16]. A CCTT specification consists of multiple task trees: one task tree for each involved user role, and one task tree that acts as a "coordinator" and specifies the collaboration and global interaction between the involved user roles. An example of the formalism is given in Fig. 3. Task models for the roles presenter and listener are given at the top and on the lower right-hand side of the figure, respectively. The model specifying the coordination of the individual tasks is depicted on the lower left-hand side. For each action in the coordinator task model a counterpart in the role-specific task model has to be defined, which is denoted by the dotted lines in the figure. In essence, the coordinator task specification adds additional execution constraints to the individual task models. In the given example it is specified that "Wait for Questions" of the role "Presenter" needs to be performed before the "Listener" is allowed to perform "Ask Question". After that, "Answer Question" of the role "Presenter" can eventually be executed.
Fig. 3. Cooperative Concurrent Task Tree Model for Presentation
The main shortcoming of CCTT is that the language provides no means to model several actors simultaneously fulfilling the same role, and that an actor is assumed to fulfill only one role within a CCTT specification (a strict one-to-one mapping of actors and roles).
3 A Collaborative Task Modeling Language

The Collaborative Task Modeling Language (CTML) was developed in conjunction with modeling efforts in smart environments. It supports the idea of stakeholder-driven process management and has the potential to be used outside the context of smart environments. We shortly discuss the fundamentals and main features of the language. The design of CTML is based on four fundamental assumptions:

1. Role-based Modeling. In limited and well-defined domains the behavior of an actor can be approximated through her role.
2. Hierarchical Decomposition and Temporal Ordering. The behavior of each role can be adequately expressed by an associated collaborative task expression.
3. Causal Modeling. The execution of tasks may depend on the current state of the environment (defined as the accumulation of the states of all available objects) and may in turn lead to a state modification.
4. Individual and Team Modeling. The execution of tasks of individual users may contribute to a higher-level team task.

Based on these assumptions a collaborative task model is specified in a two-fold manner:

1. Cooperation Model. Specifies the structural and behavioral properties of the model.
2. Configuration(s). Hold runtime information (like initial state and assignment) and simulation/animation configurations.

For each cooperation model several configurations may exist, in order to describe different situations in which the model is used. Fig. 4 shows a schematic sketch of a cooperation model. Elements in the inner circle show modeling entities of the cooperation model (postfixed with "-1"), whereas diagrams outside of the inner circle show specifications realizing the corresponding entities (postfixed with "-2").

Fig. 4. Schematic Cooperation Model for Meeting Scenario

On a higher level of abstraction the cooperation model specifies the entities relevant to task modeling. Therefore roles (e.g., A-1), devices (e.g., B-1), a location model (C-1), a domain model (D-1) and a team model (E-1) can be specified. The potential actions a user is able to perform are determined by his or her role(s). More precisely, a role is associated with a collaborative task model (A-2 in Fig. 4), which is visually represented by a task tree in a CTT-like notation [18]. Tasks are arranged hierarchically, defining a tree structure. Atomic, non-refined tasks are referred to as actions. In addition, tasks on the same level of abstraction can be connected via temporal operators defining the temporal order of task execution.

Roles categorize users of the same kind in terms of capability, responsibility, experience and limitations according to the domain. Thus roles are abstractions of actors sharing the same characteristics. Role modeling is a common concept in software engineering [6; 10] to reduce complexity and build systems for diverse users. What constitutes a certain role and distinguishes it from another depends on the system and the development approach. In [10] it is stated that a user is not limited to one role at a time and that role switching often takes place. In CTML the role concept is employed to define the pool of actions of a user by means of task expressions. In task analysis and modeling this approach is quite common, but it is usually restricted to a one-to-many relation of role and user [15; 16]. This, however, is a rather rigorous constraint. In the domain of smart environments it is frequently the case that an actor changes his role at runtime and that one role is performed by several actors simultaneously; this might be the case in our modern business world as well. The role concept implemented in CTML incorporates this case. In the example of Fig. 4 the roles are Presenter, Listener and Chairman. They represent the different types of stereotypical behavior in the meeting scenario.

Besides the cooperation model, a CTML specification also contains one or more configurations providing essential runtime information for the cooperation model. A configuration represents the necessary information for a concrete situation. This allows testing different settings for the same cooperation model without much effort, by defining different configurations. As the cooperation model relies on a role-based specification, actors operating in the environment need to be defined together with a corresponding actor-role mapping. More precisely, an actor may fulfill more than one role concurrently, and a role may be assigned to different actors simultaneously. Moreover, not only
concurrent role fulfillment allowed, but all other temporal operators defined in CTML are possible as well. No currently existing task modeling language supports this, even though it is a common case in cooperative work. Taking the example of the "Conference Session", one can imagine an actor presenting a paper in front of the audience but also listening to other presentations afterwards. The simultaneous (or, more precisely, ordered) performance of more than one role is therefore an important feature of the language, as it also allows separating roles from one another, since they are assembled at runtime; thus modularization and separation of concerns are achieved. Additionally, some properties for actors are defined (e.g., the initial position in the environment). On the left-hand side of Fig. 5 an example configuration for the schematic cooperation model of Fig. 4 is depicted. Not all of the aforementioned information has visual counterparts, but the actor-role mapping is represented by arrows. More precisely, it is specified that Leonard only acts as Presenter, whereas Penny fulfills the roles Presenter and Listener; Sheldon acts as Chairman. The precise assignment of temporal operators for an actor fulfilling more than one role is performed in a dialog, which is shown on the right-hand side: currently it is specified that Penny first acts as Presenter and afterwards as Listener.
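A minimal sketch of how such a configuration could be represented as data is given below; the class and field names are assumptions for illustration, not CTML's actual syntax.

```python
# Minimal sketch (assumed data model, not CTML syntax) of a configuration:
# actors may fulfill several roles, composed with a temporal operator.
from dataclasses import dataclass

@dataclass
class RoleAssignment:
    roles: list
    operator: str = "|||"  # how the roles are composed, e.g. ">>" or "|||"

scenario1 = {
    "Leonard": RoleAssignment(["Presenter"]),
    "Penny":   RoleAssignment(["Presenter", "Listener"], operator=">>"),
    "Sheldon": RoleAssignment(["Chairman"]),
}
# Penny first acts as Presenter and afterwards as Listener (enabling):
print(scenario1["Penny"])
```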
Fig. 5. Configuration "Scenario 1" for Cooperation Model "Conference Session"

Table 2. Examples of preconditions and effects

Role        Task                        Precondition
Presenter   Start presentation          Chairman.oneInstance.AnnounceTalk
Presenter   Respond to question         Chairman.oneInstance.OpenDiscussion
Chairman    Announce open discussion    Presenter.allInstances.EndPresentation

Role        Task                        Effect
Presenter   End presentation            self.presented = true
Presenter   Leave room                  self.isin(Outside)
Chairman    Announce open discussion    Notebook.allInstances.switchOff
A configuration can be considered a scenario under which the cooperation model is tested or used; sometimes, however, one might test only certain features of the model. In the context of a smart environment the scenario is concrete; in a workflow context it could be considered an abstract specification, describing the general behavior of all possible participants.
In addition to temporal relations between tasks, CTML allows more specific relations to be expressed, in a way similar to OCL. These constraints can be specified between tasks, but also between other model elements; in this way CTML allows the specification of preconditions and effects of tasks. With respect to the task expressions of the roles chairman and presenter, the preconditions shown in Table 2 can be defined. The first precondition states that a presenter is only allowed to start his presentation if the talk has been announced by a chairman. The second precondition states that responding to questions can only be performed after the chairman has opened the discussion. The precondition of the chairman states that an open discussion can only be announced once all presenters have finished their presentations. Preconditions defined on this level of abstraction integrate well with the CTML approach of role-based descriptions: quantifiers specify how many of the actors fulfilling the role are addressed (one or all). The task "End presentation" results in setting the presenter's attribute presented to true. If a presenter leaves the room, he is outside. The opening of the general discussion by the chairman has the effect that all notebooks are switched off.
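The following is a minimal sketch of how such role-quantified preconditions could be evaluated at runtime; the data model and function are illustrative assumptions, not CTML's implementation.

```python
# Minimal sketch (assumed runtime model) of evaluating role-quantified
# preconditions such as "Presenter.allInstances.EndPresentation".
def precondition_holds(actors, role, task, quantifier):
    """actors maps an actor name to (set of roles, set of completed tasks)."""
    instances = [done for roles, done in actors.values() if role in roles]
    if quantifier == "oneInstance":
        return any(task in done for done in instances)
    if quantifier == "allInstances":
        return bool(instances) and all(task in done for done in instances)
    raise ValueError(quantifier)

actors = {
    "Leonard": ({"Presenter"}, {"End presentation"}),
    "Penny":   ({"Presenter", "Listener"}, set()),
    "Sheldon": ({"Chairman"}, {"Announce talk"}),
}
# "Announce open discussion" requires ALL presenters to have finished:
print(precondition_holds(actors, "Presenter", "End presentation", "allInstances"))  # False
```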
4 Discussion

In this paper, an approach to model workflows in a hierarchical, structured way by using task trees is discussed. Task trees are frequently used in the HCI community to specify tasks and user actions in a goal- or problem-oriented way within a consequently user-centered approach. The models are structured and for that reason readable and understandable [14]. Originally, CTML was developed to specify activities in smart environments, but it can be used to model general workflows as well. Activities of stakeholders are specified by specific models. Relations between tasks, roles, devices and locations (the general context) can be specified by constraints in the form of preconditions and effects. Additionally, an activity model specifying the cooperation can be used, and temporal relations can be specified that describe the dependencies between the different roles a user acts in. The idea of specifying a separate task model for every role (stakeholder) could be called subject-oriented, because the main idea is very similar to that of [8]; the term stakeholder-driven would fit as well. Stakeholders that play the role of bystanders can also be modeled. In this case the task model might be reduced to one task only. However, certain restrictions can still be specified regarding certain states of other elements of the model; this can, for example, mean that no device in the neighborhood of a bystander is allowed to be louder than a certain level. During software development, models are adapted and incrementally refined. In order to define an appropriate notion of refinement, different refinement relations were introduced for CTML [27]. This information specifies to which degree a certain task specification can be adapted, which could be very helpful for workflow management systems as well.
5 Summary and Outlook

In this paper, task modeling is compared to business process and workflow modeling. In this context it is noted that task modeling normally goes beyond business process modeling, down to a fine-granular user-action level. Nevertheless, notations of task models can be used for workflow specifications. It might be useful to separate models with different levels of detail. In this way, workflow management systems can be combined with user interface generation systems that present workflow data to the user in a usable way. Detailed task models have proven to be useful for interactive systems in this respect.
References

1. Annett, J., Duncan, K.D.: Task analysis and training design. Occupational Psychology 41, 211–221 (1967)
2. Balzert, H.: Lehrbuch der Software-Technik: Basiskonzepte und Requirements Engineering. Spektrum, Heidelberg (2009)
3. Bolognesi, T., Brinksma, E.: Introduction to the ISO Specification Language LOTOS. Computer Networks and ISDN Systems 14(1), 25–59 (1987)
4. Brüning, J., Dittmar, A., Forbrig, P., Reichart, D.: Getting SW Engineers on Board: Task Modelling with Activity Diagrams. In: Gulliksen, J., Harning, M.B., van der Veer, G.C., Wesson, J. (eds.) EIS 2007. LNCS, vol. 4940, Springer, Heidelberg (2008)
5. Card, S., Moran, T.P., Newell, A.: The Psychology of Human Computer Interaction. Erlbaum, Hillsdale (1983)
6. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. Addison-Wesley, Boston (1999)
7. Diaper, D., Stanton, N.: The Handbook of Task Analysis for Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (2003)
8. Fleischmann, A., Lippe, S., Meyer, N., Stary, C.: Coherent Task Modeling and Execution Based on Subject-Oriented Representations. In: Proc. TAMODIA, pp. 78–79 (2009)
9. Veiga, G.M., Ferreira, D.R.: Understanding Spaghetti Models with Sequence Clustering for ProM. In: Business Process Management Workshops (BPM 2009 International Workshops), Ulm. LNBIP, vol. 43 (2009)
10. Johnson, H., Johnson, P.: Task Knowledge Structures: Psychological Basis and Integration into System Design. Acta Psychologica 78, 3–26 (1991)
11. Kiepuszewski, B., ter Hofstede, A.H.M., Bussler, C.J.: On structured workflow modelling. In: Wangler, B., Bergman, L.D. (eds.) CAiSE 2000. LNCS, vol. 1789, pp. 431–445. Springer, Heidelberg (2000)
12. Kristiansen, R., Trætteberg, H.: Model-based user interface design in the context of workflow models. In: Winckler, M., Johnson, H. (eds.) TAMODIA 2007. LNCS, vol. 4849, pp. 227–239. Springer, Heidelberg (2007)
13. Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd edn. Prentice Hall, Englewood Cliffs (2004)
14. Laue, R., Mendling, J.: The Impact of Structuredness on Error Probability of Process Models. In: Information Systems and e-Business Technologies, 2nd International United Information Systems Conference, UNISCON 2008, Klagenfurt. LNBIP, vol. 5 (2008)
15. Molina, A.I., Redondo, M.A., Ortega, M., Hoppe, U.: CIAM: A Methodology for the Development of Groupware User Interfaces. Journal of Universal Computer Science 14, 1435–1446 (2008)
16. Mori, G., Paternò, F., Santoro, C.: CTTE: Support for Developing and Analyzing Task Models for Interactive System Design. IEEE Trans. Software Eng. 28(8), 797–813 (2002)
17. OCL: http://www.omg.org/technology/documents/modeling_spec_catalog.htm#OCL
18. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
19. Penichet, V.M.R., Lozano, M.D., Gallud, J.A., Tesoriero, R.: User interface analysis for groupware applications in the TOUCHE process model. Adv. Eng. Softw. 40(12), 1212–1222 (2009)
20. Poluha, R.G.: Application of the SCOR Model in Supply Chain Management. Youngstown, New York (2007)
21. Scheer, A.-W.: ARIS: Business Process Modeling. Springer, Heidelberg (1999)
22. Traetteberg, H.: Modelling Work: Workflow and Task Modelling. In: Vanderdonckt, J., Puerta, A.R. (eds.) Computer-Aided Design of User Interfaces II, Proceedings of the Third International Conference on Computer-Aided Design of User Interfaces, Louvain-la-Neuve, Belgium, October 21–23, pp. 275–280. Kluwer, Dordrecht (1999)
23. van der Aalst, W., Desel, J., Kindler, E.: On the semantics of EPCs: A vicious cycle. In: Geschäftsprozessmanagement mit Ereignisgesteuerten Prozessketten (EPK 2002), Trier (2002), http://www.wiso.uni-hamburg.de/fileadmin/wiso_fs_wi/EPKCommunity/epk2002-proceedings.pdf (accessed November 30, 2010)
24. van der Aalst, W., ter Hofstede, A.: YAWL – Yet Another Workflow Language (Revised version). QUT Technical Report FIT-TR-2003-04, Queensland University of Technology, Brisbane (2003)
25. Vanhatalo, J., Völzer, H., Koehler, J.: The refined process structure tree. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 100–115. Springer, Heidelberg (2008)
26. Weske, M.: Workflow Management Systems: Formal Foundation, Conceptual Design, Implementation Aspects. Habilitation Thesis, University of Münster (2000)
27. Wurdel, M.: Making Task Modeling Suitable for Stakeholder-Driven Workflow Specification. PhD Thesis (submitted), University of Rostock (December 2010)
A Method to Solve the Communication Gap between Designers and Users

Jeichen Hsieh 1, Chia-Ching Lin 1, and Pao-Tai Hsieh 2

1 Department of Industrial Design, Tunghai University, Taiwan, R.O.C.
2 Department of Commercial Design, National Taichung Institute of Technology, Taiwan, R.O.C.

[email protected], [email protected], [email protected]
Abstract. There are always discrepancies when abstract design concepts are transferred to solid products. How can one make sure that design concepts are conveyed exactly via products? Developing early-stage prototypes for tests and surveys is one of the solutions. This research repeatedly applies POE (Post-Occupancy Evaluation) to prototypes of students' design cases. The results reveal that product prototype POE can anticipate the performance of products in a final evaluation, much as a pre-production evaluation can predict post-production consumer reception. They also suggest that the performance of product prototype POE would become clearer if extraneous variables were under strict control in advance. Two cases show chaotic phenomena; probing the field of students' design activities with grounded theory might help to unearth further discoveries. Keywords: Post-Occupancy Evaluation, Prototype, Cognitive Differences.
1 Introduction

When abstract design concepts are transferred into concrete products, there will always be some degree of gap. Crilly et al. [1] describe product design as a communication process: the designer encodes a message in the product, and the customer then decodes and interprets it. However, some cognitive problems will always exist; as Norman [2] states, designers and users do not communicate directly, but only through the product itself. Therefore, if a designer wants to confirm whether the proper message was sent to the user, one way is to develop early prototypes for testing and review. Prototyping and testing serve several functions in product development practice and can be used as a company's communication media: they can help designers develop ideas and can also be used to test whether a design is appropriate. When pre-evaluating new product ideas, the designer wishes to know whether a product concept is a good idea in the eyes of others, and whether sketches, precision drawings or mock-up models can achieve the purpose of simply telling people what the product can do, how it differs from other products, and how usable it is. If the product is manufactured as a prototype for testing, the reactions obtained may concern the design and its details [3]. In short, once a design program has produced a
concrete answer, one needs to examine whether it meets the original objectives. Coughlan and Mashman [4] note that when more than one prototype has been produced, customers are usually asked to choose the most attractive proposal for further development or improvement of what is missing. In such a single-session assessment the customer may not make the right judgments and can easily provide false information, causing errors in design decisions and leading to the risk of developing a design that should have been rejected, or of rejecting a design that should not have been. Moreover, is the proposal the customers find most attractive consistent with the original target market, and does it meet the original design brief? This needs to be verified. By using the POE (Post-Occupancy Evaluation) method from the construction sector, a designer can understand both the design intent and the user's intent and thereby shorten the gap between them; the method has found multiple applications in architecture, urban planning and other fields. Follow-up researchers have applied it to industrial design to view the information in a product's design and planning, to learn whether users can really perceive the intentions behind the design, and to use the assessment as knowledge for improving next-generation products. Since ex-post POE can serve as a product improvement policy, can POE evaluation and improvement also be implemented at the prototype stage of development, and are the effects significant? That is the focus of this research. In education, there are often many steps between the first ideas and the completion of a model, with the teacher's guidance for improvement. This is similar to industrial practice, where a design goes through internal assessment decisions made by superiors. However, in an interview, a former design manager of Toyota Motor stated that early decisions made by superiors often result in errors, with no further assessment due to a systematic problem. We can see that assessment has to be effective in practice. This research therefore applies the POE method to prototypes in the field of industrial design, importing a process of repeated evaluation and refinement into the assessment of students' design prototypes, measuring the gap between the design concept and the actual performance, giving recommendations to encourage students to better approximate their design intent, and then trying to understand whether the implementation of POE has a significant impact on the students' cases.
2 Literature Review

2.1 POE Method

This review covers POE and previous research on prototype assessment, in order to grasp the principles and characteristics of POE, to explore its content and evolution, and to establish what a prototype is and the scope of prototype investigation. The POE method examines the effectiveness of architectural design and planning in the built environment, accumulating and sorting data as a basis for future improvements. By analogy, POE applied to industrial design should be able to examine the information in a product's design and planning and to establish whether users can really perceive the intentions behind them. Liguan Yao, in a
case study of POE applied to household refrigerators, used POE in product design for the first time to understand the differences between users' and designers' ideas, the changes in products after use, users' possible misuse, and new patterns of behavior. Architecture and industrial design were compared with respect to POE, and a POE operating model suitable for industrial products was established. The nature and functions of POE have been defined in somewhat different ways. Sommer defines POE as "a systematic examination of the built environment in use, in order to obtain useful information"; Rabinowitz says POE is "a method for understanding and improving the effectiveness of the built environment in use"; domestic scholars have proposed "use assessment", referring specifically to research on buildings or built environments in use. Meanwhile, Shyh-Meng Huang thinks that "POE's point of view first emphasizes whether there is correspondence or conflict between space and activity (or use), where these contradictions occur, and how to resolve them". In addition, according to Zimring and Reizenstein, POE "uses interviews, observation, questionnaires and other methods to understand user views on all aspects of a building at work", which is a more operational definition, defining the object as the relation between people and buildings. Zimring's other definition of POE is "diagnostic work on the environment, after this environment has been designed for human use". All in all, POE can be generalized as treating past efforts with reflection for decision-making, seeking greater understanding and a road to further improvement. Narrower views consider architectural design from the users' perspective: exploring responses to the built environment in order to improve design decisions in similar future cases; assessing the functional performance of buildings against the original planning objectives and content; or comparing the architects' assumptions about use with actual use, to understand the differences and explore their causes. POE is a rationalized and systematic process that emphasizes an objective evaluation position and method and the importance of assessment criteria; these characteristics distinguish it from architectural critique. In other words, POE uses social science research methods to study the relationship between the built environment and people (especially users) with integrity and in depth. Preiser divides the effects of POE into three time ranges:

1. Short-range effects:
(1) determine the success or failure of the building;
(2) make recommendations on the problems that need to be fixed;
(3) provide information for the development project budget.
2. Medium-range effects:
(1) help determine whether the building should continue in use, be modified, or be rebuilt;
(2) resolve the supply dimension of existing buildings.
3. Long-range effects:
(1) POE findings can inform the design of future buildings;
(2) provide basic design information and enhance and improve the formulation of assessment criteria and guidelines;
(3) improve the qualitative methods for measuring the past performance of buildings.

Furthermore, Preiser [5] presents three levels of POE that can be used for buildings or facilities of any form or size:

1. Indicative POE: mainly finds the advantages and disadvantages, successes and failures; a short-term assessment. It generally uses four methods (file collection, performance evaluation, field reconnaissance and interviews) and requires from two or three hours up to one or two days.
2. Investigative POE: where an indicative POE identifies important issues, an investigative POE gets to the bottom of the topic with a more detailed investigation. The assessment criteria are mainly formulated from the related literature and with reference to recent similar assessment cases, so its credibility is higher and it is more objective; it needs about 160 to 240 hours.
3. Diagnostic POE: a comprehensive and in-depth assessment involving many variables, using questionnaires, surveys, observation, actual measurement and other scientific data collection methods. It aims to understand the actual environment and facilities, the relation between behavior and the variables, and the relationships among the variables. The assessment results can be compared against scientific standards, so it is quite accurate, and it can enhance knowledge of buildings and design guidelines, and even lead to amendments of laws and regulations; it takes from months up to a year or more.

2.2 Industrial Design Prototypes

Searching Science Direct and UMI ProQuest Digital Dissertations with the keywords Evaluation, Product and Industrial Design Prototype returns little literature similar to this research, i.e., ex-post evaluation methods that check prototypes or products against the original design planning information. As noted in Section 2.1, POE has mostly been used in the construction field; Liguan Yao used POE for the first time in product evaluation, on a refrigerator, and Cai Shijie combined product innovation and POE for a dehumidifier. The context of these POEs is not understood in depth; for example, there is no account of which building projects a POE assesses or of how these projects are generated, and it is not described how the three levels of POE can be used in the field of industrial design. Coughlan and Mashman [4] observe that in a single-session assessment of prototypes the customer may not make the right judgments and can easily provide false information, causing errors in design decisions; they therefore advocate multiple assessments, letting customers look at several prototypes, to reduce the potential crisis of making marginal evaluations, and conclude that a pre-production market assessment should
be able to predict customer acceptance after launch. The cultural context often plays an important role in the aesthetic aspects of a product and in its after-market success; the manufacturer's image and reputation, and people's past experience with related products and functions, will also affect judgments of the product, so when considering a potential product these other assessment factors cannot be ignored. Further, Crilly et al. [1] sum up products as having three main functions, aesthetic, symbolic and practical, and suggest that further research should develop appropriate ways to examine all aspects of product performance, and even to predict whether the product can stimulate the expected customer reactions; Heufler [6] also indicates the same three basic functions of products, so the three-basic-functions argument is by now fairly consistent. Based on the above, Coughlan and Mashman state that there is still room to develop prototype evaluation in current product design, and that pre-production evaluation is related to after-market customer response. Liguan Yao points out that applying POE to assess industrial products is feasible, and that POE surveys allow products to be assessed and revised from generation to generation. If POE is used in prototype evaluation, it echoes both the feedback and the feed-forward functions of POE: the prototype POE assessment provides feedback for improvements, while its effect on the product acts as feed-forward (e.g., the predictive effect Coughlan and Mashman describe above); this may form a better approach to prototype assessment. Baxter [3] notes that the models and prototypes used in the design process can be divided into models characterizing structure, prototypes characterizing function, and prototypes characterizing both structure and function.

2.3 Research Questions

Whether POE is applied in the construction sector or in the field of industrial design, and whether or not it keeps the original spirit of POE, the common research approach is to report the cognitive differences between the original planning and the users: the assessed product is an established fact, and the conclusions provide directions for improvement, without addressing how POE reduces the gap between design intent and user cognition. In the product design process, one way for designers to make sure users understand what they are thinking is to develop early prototypes for testing and review, to verify how well design concepts are conveyed in the conversion; besides choosing an appropriate research approach, one also needs to know how it works. The focus of this research is to expose the performance of "POE for shortening the cognitive differences between design intent and users" when the POE method is first applied to prototypes: using prototyping and testing, with a repeated Assess & Refine process, to implement product prototype POE, and to answer whether, when prototype POE is used, the subjects show a significant difference in the final assessment, and whether the subjects' final responses can be predicted, just as a pre-production evaluation can predict customer acceptance after launch.
3 Methodology
The prototype assessments were carried out by students of the Department of Industrial Design, Tunghai University, and comprised: (1) a POE of chair prototypes by sophomore design students; (2) a POE of acrylic projects by junior design students, with a second POE after re-design; and (3) a senior graduation project spanning two semesters, with an experimental group and a control group. Figure 3.1 shows the order of the processes. First, to establish the feasibility of applying POE to prototypes, the sophomores' chair prototypes were evaluated directly; the literature search came afterwards, so that field data and theory could be integrated – the documentation did not come from the field alone, and the literature studied was provisionally verified in the field. The POE literature review was therefore conducted after the chair-prototype study had ended, together with discussions with practising designers to understand how industry implements prototype assessments. Once the chair prototypes had confirmed the feasibility of prototype POE, and drawing on the literature review, the POE approaches of the construction sector were followed in developing POE approaches for industrial design prototypes, and an inductive prototype-POE model was constructed from the documentation. The study then gradually narrowed its focus to the effect of POE on design intent and user-perceived performance, using student works in an attempt to feed the POE review information back to the students and encourage them to approximate the intended design more closely. To understand how prototype POE works across multiple products (the chair-prototype POE faced only a single product type), it was implemented in the junior year on the acrylic projects: the submitted student works were given a first POE according to procedure, common problems were reported, and a second POE followed the re-design. At the same time, the senior graduation project was set up as the first controlled experimental design; to cover the two semesters of the project, the experiment was divided into five milestones, operating variables were established, and experimental- and control-group arrangements were made. After the first two milestones the semester of study came to an end; the assessment work done so far was reviewed and fed back into the prototype-POE approach, and the remaining three assessments were completed in the following semester.
4 Results
Starting from establishing the feasibility of applying POE to prototypes, POE was implemented repeatedly on students' design prototypes, with the following findings. Prototype POE was applied to student works, with repeated evaluation and improvement, and the prototypes were judged by subjects in a final evaluation. From Scheffé post hoc comparisons, the difference between the works of the experimental group and the control group did not reach the 0.05 level of significance: there were differences in performance, but they were not significant. Ranking the works by apparent performance,
first place went to the experimental group's work B, a single rinsing mug, and second place to the control group's work D, a cold-tea set. This means that works produced with POE did not necessarily perform better, and works produced without POE did not perform relatively worse: for these four students, POE had no significant effect on performance. The experimental group's works at the four milestones and in the final assessment were compared by independent-sample t tests and Scheffé post hoc comparisons; no difference reached the 0.05 level of significance, indicating that the assessments did not differ significantly. In other words, the fourth-milestone in-school evaluation already gives a glimpse of how a new generation of works will perform in the final in-school evaluation, so it can be expected that a prototype POE of student works predicts performance in the final assessment – just as pre-production evaluation is expected to predict customer acceptance after the product reaches the market. That prototype POE showed no significant effect on the students' works reflects the fact that, as the POE literature review notes, the research was carried out under natural conditions rather than controlled experimental conditions. The process gradually revealed the difficulty of isolating external variables, such as designers overturning evaluation verdicts and design items of differing character; this means that prototype POE needs better-controlled experimental conditions, and control measures such as fixing the product items and preventing designers from overturning results should help clarify the effectiveness of prototype POE. When the items are not the same, it is harder to diagnose what the prototype POE's design feedback achieved: in the case of the teapot and the robot lamp, the robot lamp itself may simply fail to win the subjects' recognition, whereas if the two designers had both designed robot lamps, the basis of comparison would have been more consistent. In senior milestone two, the assessment of Kathy Chang's single rinsing mug found a functional problem: the cup diameter was too small, so the toothbrush bristles touched the inside of the cup and obstructed the flow of water. After this diagnostic-POE feedback, the milestone-three functional assessment found that the bristles no longer obstructed drinking from the cup. As Norman [2] describes, recursive testing and re-design serves behavioural-level (functional) design well, but is not necessarily suitable for visceral-level (aesthetic) or reflective-level (symbolic) design. Some recommendations follow. From the works of the experimental and control groups, there is no significant difference in performance: the students' cases did not show a significant benefit, but the diagnostic feedback of prototype POE is more significant in prompting a senior designer to reflect. We recommend that prototype POE use only the indicative and investigative levels. The indicative POE is a short-term assessment whose main aim is to find strengths and weaknesses, successes and failures, requiring from two or three hours to a day or two; the milestone assessments in this study were all investigative POEs, examining in more detail the problems found at the indicative level, each taking about two weeks.
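To make the group comparisons reported above concrete, the following is a minimal sketch of a Scheffé post hoc comparison in Python. The subject ratings are invented placeholders, not the study's data; the Scheffé statistic for a pairwise contrast is computed directly from the one-way ANOVA quantities and compared against (k − 1) times the critical F value at the 0.05 level.

```python
# Hedged sketch of a Scheffe post hoc comparison between two works after a
# one-way ANOVA over four works. All ratings are invented placeholder data.
import numpy as np
from scipy import stats

groups = {
    "exp_A": np.array([3.5, 3.7, 3.9, 3.6, 3.8]),
    "exp_B": np.array([4.2, 3.8, 4.5, 4.0, 4.1]),   # e.g. the single rinsing mug
    "ctl_C": np.array([3.4, 3.2, 3.6, 3.5, 3.3]),
    "ctl_D": np.array([3.9, 3.6, 4.2, 3.8, 4.0]),   # e.g. the cold-tea set
}
data = list(groups.values())
k = len(data)
n_total = sum(len(g) for g in data)

# Within-group mean square (MSW) from the one-way ANOVA decomposition.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in data)
df_within = n_total - k
msw = ss_within / df_within

# Scheffe criterion for the pairwise contrast exp_B vs ctl_D:
# F_pair must exceed (k - 1) * F_crit(k - 1, df_within) to be significant.
g1, g2 = groups["exp_B"], groups["ctl_D"]
f_pair = (g1.mean() - g2.mean()) ** 2 / (msw * (1 / len(g1) + 1 / len(g2)))
threshold = (k - 1) * stats.f.ppf(0.95, k - 1, df_within)
print(f"F_pair = {f_pair:.3f}, Scheffe threshold = {threshold:.3f}")
print("significant at 0.05" if f_pair > threshold else "not significant at 0.05")
```

On this placeholder data the comparison is not significant, matching the pattern reported above.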
Diagnostic POE, which addresses the variables and the relationships among them, requires from months to a year or more [5]; this study did not implement this highest level of POE. Its timescale seems too difficult to match to
prototype work, because the results would come back too late to feed into product development, unless a considerable length of time were available.
1. The skills needed for a complete prototype POE are still being developed (architectural POE took some 30 years to mature). This study proposes the main outline of current prototype-POE approaches, but more specific items – such as detection methods for the environmental aspects of the practical function, or for the cultural and social context of the symbolic function – and better ways to detect them remain to be itemised; follow-up study should integrate such specialised knowledge into the prototype-POE research approach.
2. The "sub-block survey method" used in this research did not work well. A building can be partitioned into blocks fairly easily, but partitioning a product into blocks would probably destroy its meaning (on Gestalt-psychology grounds), and the scope of a block is difficult to define. For example, block A of a robotic desk lamp was defined as the region from the palm to the shoulder – but why not from the palm to the elbow? There seems to be no sufficient reason to support such a division. The consequence is that one may know only that a prototype performs poorly on, say, beauty, but not "which part is poor in beauty". This raises the question of how to detect exactly where an aesthetic (or other) evaluation is negative. At present one can only record, as qualitative data, the subjects' general impressions of what they evaluated negatively; if the designer cannot tell exactly which part or range the subjects did not understand, the issue is difficult to clarify. Future development of the prototype-POE approach should improve this part.
3. In Taiwan, industry tends not to dare to use more innovative designs. Combining POE with decision science might help the evaluation of prototypes and make the designers more convincing.
4. The chair-prototype POE showed that general assessors and the subjects gave higher scores than the teacher assessors did; this glimpse suggests that different forms and lengths of engagement with a product influence assessment, much as business involvement in the field affects purchase decisions. Prototype POE could therefore work more closely with marketing and business studies on such differing forms of engagement – for example, a packaged product, an unpackaged product that any consumer may touch, or a packaged but partly exposed product without instructions – and their impact on purchasing decisions.
5. In the assessments, qualitative records of associations showed that items similar in appearance to a work can lead to deeper associations or emotional attachment; the single rinsing mug, for example, reminded a subject of time with a partner. Since emotional attachment can sometimes be created or strengthened through advertising and similar methods, letting consumers feel that a product is "theirs" [6], prototype POE could also be used to explore how using or not using such items in advertising affects advertising effectiveness or consumer purchasing decisions.
6. Current assessment relies heavily on human interpretation of the data, so diagnoses and recommendations vary between assessors and become more cumbersome as evaluated prototypes accumulate.
It would be worthwhile to organise the analysis and diagnosis into rules – for example, using correlation analysis to determine why an item performs as it does – or to sort out rules for constructing an expert system.
7. POE surveys assess successive generations of products and buildings, but no study has followed a product or building onward through diagnosis, so it is not shown whether the next generation is actually better. In terms of student design activities, if the follow-on topics of the next cohort were tracked and diagnosed term by term with POE, whether a new generation of work can perform better could be explored.
8. The cases offered a glimpse of chaotic phenomena, which lie beyond the scope of this study. Scholars in different fields have found chaotic phenomena; follow-up study in this direction may open a "nonlinear" perspective for industrial design. For example, a grounded-theory approach could be taken directly into students' design activities to observe chaotic phenomena and create new concepts and theory.
References
1. Crilly, N., Moultrie, J., Clarkson, P.J.: Seeing Things: Consumer Response to the Visual Domain in Product Design. Design Studies 25(6), 547–577 (2004)
2. Norman, D.A.: Emotional Design: Why We Love (or Hate) Everyday Things (Chinese translation). Garden City Culture (2005)
3. Baxter, M.: Product Design: A Practical Guide to Systematic Methods of New Product Development (Chinese translation). Kuni Press (1998)
4. Coughlan, P., Mashman, R.: Once is Not Enough: Repeated Exposure to and Aesthetic Evaluation of an Automobile Design Prototype. Design Studies 20(6), 553–563 (1999)
5. Preiser, W.F.E., Rabinowitz, H.Z., White, E.T.: Post-Occupancy Evaluation. Van Nostrand Reinhold, New York (1988)
6. Heufler, G.: Design Basics: From Ideas to Products (Chinese translation). Longxi Books (2005)
Teaching the Next Generation of Universal Access Designers: A Case Study Simeon Keates IT University of Copenhagen, Rued Langgaards Vej 7, 2300 Copenhagen, Denmark
[email protected]
Abstract. This paper describes the development of the “Usability and Accessibility” course for M.Sc. students at the IT University of Copenhagen. The aim is to examine whether this course provides an effective and useful method for raising the issues around Universal Access with the designers of the future. This paper examines the results and conclusions from the students over 5 semesters of this course and provides an overview of the success of the different design and evaluation methods. The paper concludes with a discussion of the effectiveness of each of the specific methods, techniques and tools used in the course, both from design and education perspectives.
Keywords: usability, accessibility, universal access, education.
1 Introduction It is widely accepted that there is a need to adopt user-centred [1] or user-sensitive [2] design processes when designing user interfaces. It is also widely accepted that there is a need to design for the widest possible range of users [3]. Design approaches such as Universal Design [4], Inclusive Design [5], and Countering Design Exclusion [6] have been developed as means of ensuring that user interfaces support the concept of Universal Access [7]. However, it is unusual to find any of these concepts taught explicitly within university Computer Science degree programs. Often they are taught within subjects such as Interaction Design, if at all. This paper describes a combined Usability and Accessibility course for graduate students. It will explain how students with little to no background in either topic can gain pragmatic skills and experience in a comparatively short space of time.
2 The “Usability and Accessibility” Course
The purpose of the “Usability and Accessibility” course is to empower the students to be able to identify and design for the needs of the widest possible range of users once they enter the workplace. This is the future of Universal Access by doing, not solely by researching. Originally the course was called “Usability with Project,” but was renamed to reflect the actual course content more accurately in Spring 2010. The course is taught in the third semester of the 2-year M.Sc. degree within the Design and Digital Communication (DDK) line at the IT University of Copenhagen.
Students on the DDK line come from a wide variety of backgrounds. Approximately half of the students attending the course have received a traditional computer science education. The other students have had a more humanities-based education. The students are all typically mature and are either returning to education after a few years of work experience or are completing the degree as part of their on-the-job training. Almost all of the students describe their interest and motivation for taking the course as being to learn how to make websites more usable, even though websites are not explicitly mentioned in the course description. The DDK line consists of a mix of mandatory courses and voluntary ones. The full course structure is shown in Table 1.

Table 1. The DDK study line for the M.Sc. degree at the IT University of Copenhagen

1st Semester: “Interaction design” (15 ECTS); “Media and communication” (7.5 ECTS); “Web design and web communication” (7.5 ECTS)
2nd Semester: “Innovation and concept development” (7.5 ECTS); “Introduction to coding, databases and system architecture” (7.5 ECTS); Elective 1 (7.5 ECTS); Elective 2 (7.5 ECTS)
3rd Semester: “Digital rhetoric” (7.5 ECTS); Specialism (15 ECTS); Elective 3 (7.5 ECTS)
4th Semester: Masters dissertation (30 ECTS)
The “Usability and Accessibility” course is a specialism option in the 3rd semester. Other choices include:
• “Digital culture and community”
• “Globalisation, organisation and communication”
• “Digital aesthetics: theory and practice”
• “Mobile communication: design-related, business-related and social context”
These are all 15 ECTS courses, constituting one-eighth of the 120 ECTS M.Sc. course and run twice every year in the Spring and Autumn (Fall) semesters.
3 Course Structure The “Usability and Accessibility” course is 15 weeks long and is structured around the development of a web shop. In the first teaching session, the students are asked to interview each other and to complete a skills and interests questionnaire. Students are then placed into groups of 4 or 5 students with at least 2 experienced coders in each
group, although all students will have taken the mandatory course on Databases, which teaches the basics of PHP programming, and the course on Web Design, which teaches HTML, XML and the basics of Javascript. Students are tasked with building a simple web-shop from scratch in the first 2 weeks of the semester. The tight deadline is specifically to emulate the time pressure in most commercial environments. No explicit usability or accessibility goals are presented. This version of the web-shop is then frozen and the students are not allowed to modify it. A copy of the web-shop is made and over the next 10 weeks of the course, usability and accessibility theory are introduced. The students develop another version of the web-shop, with explicit consideration of the usability and accessibility requirements. Students are introduced to usability and accessibility theories in an order that supports the continuing development and refinement of their web-shops. The course is expected to take 20 hours per week of student time, with 2-hour lectures and 2-hour exercise sessions twice a week (Wednesday and Friday), giving 8 hours of direct tuition per week, with the remainder of the time being self-guided tuition by the students, typically work on their projects. Usually, the first morning of lectures introduces new theory. The first exercise session is focused on applying that theory in an exercise that is unrelated to the project. The second morning of lectures then examines the application of the theory and introduces further theory. The second afternoon of exercises is then focused on applying the theory to the web-shop project. At the end of the semester, the students are asked to prepare a 10-page project report in the ACM CHI publication format [8] along with a 5-page supplementary report, which can be formatted to their own choice. They are examined on a combination of the 10-page report, a 20-minute group presentation and 20-minute individual oral examinations. The students were told to focus on being able to justify quantitatively whether their revised sites were more usable and accessible than their original (frozen) sites.

3.1 Course Participants
In total 116 students have enrolled in the course over the 5 semesters discussed in this paper (Autumn 2008 to Autumn 2010). Between them, they have developed 48 different web-shops – 24 original (frozen) versions and 24 revised versions.

3.2 The Design Brief
Once students have been placed in their groups, they are given a design brief, which states that:
• The students have been hired by a fictional company to produce a web-shop within 2 weeks that offers a list of specified British products to their employees as a reward for a record-breaking year of sales.
• The web-shop is to consist of a welcome/splash page explaining the offer, a product selection page, a delivery page and an order confirmation page.
• Each employee is to either choose a single product (Autumn 2008 and Spring 2009) or is to receive between 5 and 10 stars to spend (all other semesters). All stars must be “spent” to reduce delivery costs before the order can be completed (a sketch of this rule follows below).
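As referenced in the brief above, the star-spending rule is the part that most complicates the checkout logic. The following is a hedged sketch of that rule; the product names and star costs are invented placeholders, and the students implemented the equivalent logic in PHP rather than Python.

```python
# Illustrative sketch of the "all stars must be spent" rule from the design
# brief: checkout is allowed only when the cart spends the star budget exactly.
# Product names and star costs are invented placeholders.
STAR_COSTS = {"mince pies": 2, "christmas crackers": 3, "tea": 1}

def order_can_complete(cart: list[str], star_budget: int) -> bool:
    """Return True only when the cart spends the whole star budget."""
    return sum(STAR_COSTS[item] for item in cart) == star_budget

print(order_can_complete(["mince pies", "christmas crackers"], 5))  # True
print(order_can_complete(["tea", "tea"], 5))                        # False
```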
The students are then given a list of between 60 and 75 British products to offer on their web-shop. A number of those products are deliberately chosen to be unfamiliar to non-British people, such as mince pies and Christmas crackers. The aim is to encourage the students to learn to research products for themselves and also to ensure that their web-shops communicate the nature of the products effectively, rather than simply relying on brand and product name familiarity to the users. Between 30% and 50% of the products on the list were changed each semester both to reflect the change between Christmas and Summer holiday rewards and also to minimize the effects of designs being passed down from semester to semester. The change from selecting a single product to spending 10 stars was made because although the newer project is more complex to code, it offers a richer interaction and thus more data to analyse in the final reports. Having developed a working web-shop, the students then have to improve the design through the application of usability and accessibility methods.

3.3 Usability Methods
The students are introduced to usability methods in increasing complexity, and in an order that makes sense for the re-design of their web-shop.

Card sorting. Card sorting was used by the students to decide on the best potential clusters for their products (e.g. Sweets, Healthcare products) and also to ensure that the products were in the correct cluster.

Personas. Personas are usually developed from a known user group and are typically used to describe particular sectors of the target users that are of specific interest to the designers. In this case, though, since the target user group was fictional, the personas were developed to represent broad user types and were used to prompt the students to consider different user patterns of behaviour in conjunction with heuristic evaluation.

Heuristic evaluation. The students developed specific use cases based on the personas that they had developed and then performed a heuristic evaluation to identify potential usability issues with their “frozen” sites.

User trials. At the end of the semester the students performed user trial evaluations of their original (frozen) and revised sites. They had to recruit a minimum of 4 (later 6) users, including a user who was blind. No assistance was given in finding the blind user, to encourage the students to learn where to find such users. Before conducting the final set of user trials, they also had to perform a pilot study with at least one user. The students typically used screen-recording software, such as Silverback (http://silverbackapp.com/) and Camtasia (http://www.techsmith.com/camtasia/), to record the trials. They were encouraged to collect as much quantitative data as possible.

3.4 Accessibility Methods
The stipulation that at least one of the users in the final user trials had to be blind meant that each group had to explicitly consider the accessibility of their web-shop. To this end, the students were introduced to common accessibility evaluation tools.
Cynthia Says (http://www.cynthiasays.com/). The students were first asked to use HiSoftware’s Cynthia Says Portal to identify how many Web Content Accessibility Guidelines (WCAG) Priority 1, 2 and 3 errors [9] their sites had. Although WCAG is commonly accepted as the default standard for web accessibility in the Universal Access community, this was the first time almost all of the students had encountered it.

Wave (http://wave.webaim.org/). As many students found the Cynthia Says Portal output very difficult to visualise, they were asked to repeat the WCAG evaluation using WebAIM’s Wave Web Accessibility Evaluation Tool, which produces a marked-up version of the web page being analysed, with red, yellow and green markers indicating the location of potential problems (the yellow and red markers) or successes (the green markers).

Vischeck (http://www.vischeck.com/). About 8% of the male population is colour blind, so to check whether this presents a problem to users of their sites, the students are instructed to evaluate their sites using Vischeck. The aim is to establish whether users with deuteranopia (red/green colour deficit), protanopia (red/green colour deficit) or tritanopia (blue/yellow colour deficit) would experience difficulties using their sites.

Screen reader. While WCAG compliance tools such as Cynthia Says and Wave are useful in identifying basic coding issues, simply conforming to those standards does not guarantee an accessible or usable website. To check this, the students are asked to use a screen reader such as WebAnywhere (http://webanywhere.cs.washington.edu/) or JAWS to browse their web-shops aurally.

Exclusion calculator. To evaluate the potential number of users that may be excluded from using their sites, the students are asked to perform a comparative exclusion analysis using either of the exclusion calculators from the Engineering Department at the University of Cambridge (http://www.eng.cam.ac.uk/inclusivedesign/index.php?section=data&page=exclusion_calc or http://www.inclusivedesigntoolkit.com/betterdesign/downloads/exclusioncalc.html). The calculators require the students to estimate the level of functional capability required to use a product and then report the total number of people within the British population who do not possess those levels of functional capability. The aim of introducing the exclusion calculators is to indicate the prevalence of impairment in the general population.
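To make concrete the kind of markup rule that tools such as Cynthia Says and Wave automate, the following is a minimal sketch of one WCAG-style check – flagging images with missing alt text, one of the most common failures – using Python and BeautifulSoup. It is an illustration of the principle only, not a substitute for the full tools, and the HTML is an invented example.

```python
# Minimal sketch of an automated WCAG-style check: flag <img> elements with a
# missing alt attribute, and link images whose alt text is empty. Real tools
# such as Wave check many more rules; the HTML below is an invented example.
from bs4 import BeautifulSoup

html = """
<html><body>
  <img src="mince-pies.jpg" alt="Box of six mince pies">
  <img src="crackers.jpg">
  <a href="basket.php"><img src="cart.png" alt=""></a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
for img in soup.find_all("img"):
    alt = img.get("alt")
    if alt is None:
        print(f"ERROR: {img.get('src')} has no alt attribute")
    elif not alt.strip() and img.find_parent("a") is not None:
        # An empty alt is fine for decoration, but not for a functional link.
        print(f"ERROR: {img.get('src')} is a link image with empty alt text")
```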
4 Review of the Usability and Accessibility Methods
As discussed above, the “Usability and Accessibility” course introduced the students to a number of common design and evaluation methods and tools.

4.1 Consideration of Usability and Accessibility
None of the groups considered accessibility in their initial designs. Where the first versions of their web-shops were accessible, this was solely due to using valid HTML
coding. This is both encouraging, in that it demonstrates that accessibility can be achieved by following standards, and concerning, in that no students considered accessibility until formally instructed to do so. Comparatively few groups considered explicit usability goals either. When they were considered, the goals were vaguely formulated, often making reference to “user experience,” but with no set targets or objectives. By the end of the course, all groups had clearly defined usability and accessibility objectives. By far the most common usability definition adopted was that from ISO 9241-11, specifically the “extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.” [10] The most commonly adopted definition of accessibility was broadly that the site must be “usable and accessible to a blind user using a screen reader.” While this definition does not meet the usual definitions for Universal Access – to be as usable and accessible by as many people as possible in as many contexts of use as possible – it is a major step in the right direction.

4.2 Usability Methods and Tools
The card sorting exercise is useful in helping the students to consider their product groups and especially to identify products that were in the wrong product group. The personas are not very useful in their traditional role in a user-centred design process. However, this is not surprising as they are entirely fictional and not developed in the correct way. They are useful, though, as a design tool in reminding the students that the users may exhibit a variety of browsing patterns and IT skills. The most successful strategy observed to date is a trio of personas that exhibited the following browsing patterns:
• The quick user – someone who wants to complete the process as quickly as possible
• The careful user – someone who wants to consider all of the possibilities to get the best possible value
• The uncertain user – someone who changes their mind frequently, might possibly visit the site multiple times before deciding and also possibly chooses products based on someone else’s recommendations

The heuristic evaluation often proves very useful in identifying many usability issues with the original (frozen) versions of the web-shops. However, this technique has also proven to be the most problematic for the students to use. In the final user trial evaluations of the original and revised web-shops, the times when the users expressed a preference for the original site can in almost all circumstances be traced back to the heuristic evaluation stage. Heuristic evaluation is known to identify many potential usability issues on a website. However, the method provides comparatively little information about the priority of each issue. Consequently, the students often assign each issue the same priority and attempt to fix them all. In doing so, they sometimes end up with a revised site that is visually more complex than the original site through the addition of FAQs, contact addresses, more robust error-checking, etc. While the users often respond positively to the new additions to the site in terms of trustworthiness, for example, they also sometimes feel
that the flow of the interaction has become more cumbersome and less streamlined. Many of the students manage to walk the fine line of providing a richer and more secure user experience without compromising the effectiveness of the site. Some groups, however, make their sites so complex that user satisfaction is adversely affected. Finally, the user trials at the end of the semester are generally regarded by the students as the most useful usability evaluation method, and the user trials with the blind users are often the most interesting and personally rewarding. However, it is also accepted that user trials take much longer to perform and are more resource intensive.

4.3 Accessibility Methods and Tools
The students typically find the visual presentation of WCAG violations from Wave to be extremely useful in identifying where the accessibility problems are on each page. However, the detailed analytical feedback from Cynthia Says is typically more useful in identifying where the problems lie in the coding. All groups to date have used a combination of both applications in developing the revised versions of their web-shops. Vischeck is often harder for the students to interpret. A number of students have tried to adjust the colour schemes of their sites to still look visually appealing to themselves, without appreciating that their colour preferences (with unimpaired colour vision) may not be shared by someone with a colour vision impairment. Most students, though, use Vischeck to look for insufficient colour contrast for each of the three colour vision impairment types, which is usually more successful. The exclusion calculators usually do not offer enough data resolution to respond to the changes made between the original and revised versions of each site, with often only minor differences in exclusion reported between the two versions. This is because the limiting factors in the ability to use the web-shops are imposed by the hardware used in the interaction (the keyboard, mouse and screen) rather than the design of the web-shops themselves. The accessibility tools that are most universally praised and used by the students are the screen readers. Trying to complete purchases using only a screen reader quickly makes it clear why the pages have to be well-structured and also why options such as “Skip to Content” are so important.
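The colour checks described above can also be made numerical. The sketch below computes the WCAG contrast ratio between a foreground and background colour from their sRGB values; the formula is the standard WCAG definition, while the example colours are invented.

```python
# WCAG contrast ratio: relative luminance of each sRGB colour, then
# (L_lighter + 0.05) / (L_darker + 0.05). WCAG 2.0 level AA requires at least
# 4.5:1 for normal-size text. The example colours are invented placeholders.
def relative_luminance(rgb: tuple[int, int, int]) -> float:
    def linearise(c8: int) -> float:
        c = c8 / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearise(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

ratio = contrast_ratio((119, 119, 119), (255, 255, 255))  # mid-grey on white
print(f"{ratio:.2f}:1 -> {'passes' if ratio >= 4.5 else 'fails'} WCAG AA for normal text")
```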
5 Review of the Course The “Usability and Accessibility” course is under constant review to keep it fresh and relevant to the students. The review process involves student feedback as well as setting new pedagogical and learning goals. 5.1 Student Response to the Course Midway through each semester, the students on the course are invited to provide anonymised feedback through an online questionnaire. Responses are rated on a Likert scale of 1 (I completely disagree) to 6 (I completely agree).
The students are very satisfied overall with the course (response mean = 4.8 out of 6.0 to the statement “Overall conclusion: I am happy about this course”). They also feel that the quantity of work on the course is about right (response mean = 3.4 out of 6.0 to the statement “My time consumption for this course is above the norm of about 20 hours/week”) and that the course is highly relevant to their future employment (response mean = 5.1 out of 6.0 to the statement “I think this course is highly relevant for my future job profile”). These results indicate that the students respond positively to this course. Qualitative feedback demonstrates very clearly that the students respond most positively to the very practical and applied nature of the course, with its focus on learning pragmatic skills rather than simply classroom theory.

5.2 Course Name and Student Enrolment
In Spring 2010, the name of the course was changed from “Usability with Project” to “Usability and Accessibility”. Figure 1 shows the student enrolment on the course before and after the name change. It can clearly be seen that the student enrolment decreased from a mean of 24.7 students to 14 students per semester with the change in name of the course. This suggests that the concept of “accessibility” is still problematic in persuading students that this is a topic worthy of their attention. It is worth noting, though, that Denmark does not have a formal anti-discrimination law along the lines of the 1990 Americans with Disabilities Act [11] or the 1995 UK Disability Discrimination Act [12]. Thus it is not clear whether the student response to the course name change would be the same in countries where there is a clear legal imperative to consider accessibility in the design of websites.
Fig. 1. The number of students enrolled for each semester of the Usability with Project and the Usability and Accessibility courses
It is worth noting, though, that in their final exams the students are usually most excited by the work that they have done with the blind users. So it is not that they are prejudiced against the work, simply that they do not see the relevance of “accessibility” to their own careers; this shows that there is still work to be done in making accessibility and Universal Access more mainstream concepts.
5.3 Experimental Design and Analysis Pre-course
It became clear during the first semester of the “Usability with Project” course that the students from a non-scientific background were struggling with the quantitative analysis elements of the project. It was clear, for example, that many of them had never been introduced to fundamental concepts such as probabilities and could not make the connection between a probability of 0.5 and a 50% chance. As such, a new pre-course was introduced – “Experimental Design and Analysis.” This course runs in the second semester of the degree programme and teaches basic statistics assuming no prior knowledge, covering everything from basic probabilities up to multivariate analysis of variance. Since the introduction of this course, the overall quality of reports submitted for the “Usability with Project” and “Usability and Accessibility” courses has improved substantially, with the students better able to understand the role of the statistical tests and spontaneously performing Kolmogorov–Smirnov or Q–Q plot analyses to ensure that the data is normally distributed before applying a paired Student t-test or ANOVA. If the data is not normally distributed, they usually perform a Wilcoxon signed-rank test. This is remarkable in students who have often had no statistical training prior to the “Experimental Design and Analysis” course. Since the introduction of that course, no students have lost marks because of incorrect statistical analyses in their projects.
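As a minimal sketch of the workflow the students learned, assume paired task-time measurements on the frozen and revised versions of a site (the numbers below are invented): the paired differences are first checked for normality, and the paired t-test or the Wilcoxon signed-rank test is chosen accordingly.

```python
# Sketch of the analysis workflow described above, on invented placeholder
# data: normality check on the paired differences, then a paired t-test if
# plausible, otherwise the Wilcoxon signed-rank test.
import numpy as np
from scipy import stats

frozen  = np.array([182, 205, 160, 243, 198, 221, 175, 190])  # task time (s), frozen site
revised = np.array([150, 172, 148, 201, 160, 186, 142, 158])  # task time (s), revised site
diff = frozen - revised

# Shapiro-Wilk is a common normality check at small n; the course's
# Kolmogorov-Smirnov test or a Q-Q plot serve the same purpose.
_, p_norm = stats.shapiro(diff)
if p_norm > 0.05:
    stat, p = stats.ttest_rel(frozen, revised)
    test = "paired t-test"
else:
    stat, p = stats.wilcoxon(frozen, revised)
    test = "Wilcoxon signed-rank test"
print(f"{test}: statistic = {stat:.3f}, p = {p:.4f}")
```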
6 Conclusions
Overall, the “Usability and Accessibility” course provides a model for teaching both usability and accessibility theory and practice within the later stages of a Bachelor programme or in a Masters programme. It shows that students from a wide variety of backgrounds can respond positively to the challenges presented by this course. The student response to the course is overwhelmingly positive. However, there is still room for concern over the drop in the number of students enrolling in the course since the name change, which suggests that “accessibility,” and thus Universal Access, is still widely perceived as a niche interest rather than a mainstream activity within the student community. Since the students on the DDK course are mature students with industry experience before enrolling in the programme, this suggests that the attitude is also widespread in Danish industry. This is clearly a challenge that needs to be met.
References
1. Vredenburg, K., Mao, J.-Y., Smith, P.W., Carey, T.: A Survey of User-Centered Design Practice. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2002). ACM, New York (2002)
2. Newell, A.F., Gregor, P.: User Sensitive Inclusive Design – in Search of a New Paradigm. In: Proceedings of the 2000 Conference on Universal Usability (CUU 2000), pp. 39–44. ACM, New York (2000)
3. Marcus, A.: Universal, Ubiquitous, User-Interface Design for the Disabled and Elderly. Interactions 10(2), 23–27 (2003)
4. Choi, Y.S., Yi, J.S., Law, C.M., Jacko, J.A.: Are Universal Design Resources Designed for Designers? In: Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2006), pp. 87–94. ACM, New York (2006)
5. Clarkson, P.J., Coleman, R., Lebbon, C., Keates, S.: Inclusive Design. Springer, Heidelberg (2003)
6. Keates, S., Clarkson, P.J.: Countering Design Exclusion. Springer, Heidelberg (2003)
7. Stephanidis, C.: The Universal Access Handbook, 1st edn. CRC Press, Boca Raton (2009)
8. ACM Special Interest Group on Computer Human Interaction: SIGCHI Conference Publications Form, http://www.sigchi.org/chipubform (accessed February 3, 2011)
9. Thatcher, J., Burks, M.R., Heilman, C., Henry, S.L., Kirkpatrick, A., Lauke, P.H., Lawson, B., Regan, B., Rutter, R., Urban, M., Waddell, C.D.: Web Accessibility: Web Standards and Regulatory Compliance. Springer, New York (2006)
10. ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs) – Part 11: Guidance on usability (1998)
11. US Department of Justice: Americans with Disabilities Act of 1990, http://www.ada.gov/pubs/ada.htm (accessed February 3, 2011)
12. Her Majesty’s Stationery Office (HMSO): UK Disability Discrimination Act 1995, http://www.legislation.gov.uk/ukpga/1995/50/contents (accessed February 3, 2011)
Use-State Analysis to Find Domains to Be Re-designed
Masami Maekawa 1 and Toshiki Yamaoka 2
1 Sosa Design inc., 12-12 Higashiyama, Keihokuhosono-cho, Ukyo-ku, Kyoto-shi, Kyoto 601-0262, Japan
[email protected]
2 Wakayama University, 930 Sakaedani, Wakayama-shi, Wakayama 640-8510, Japan
Abstract. Even when individual HCI problems are solved, the result does not easily amount to a high-value solution; it tends to be a set of small improvements of narrow scope. This proposal therefore pays attention to the situations in which problems occur. We attempt to clarify the domains to be re-designed by applying mathematical analysis methods to use-state keywords extracted as data from descriptions of those situations. A trial experiment indicates that the method is feasible, while also revealing some of its limits. In addition, a comparative experiment confirmed how its results differ from those of a common classification method. Keywords: Use-state Analysis, Design, Human Computer Interaction, Context.
1 Introduction
Task analysis, observation, interviews, protocol analysis, and inspection methods are among the problem-finding techniques of the HCI field [1]. However, a large effect cannot be hoped for even if the problems obtained by them are improved as local design problems. The problems found are not limited to the HCI field; they may lead to more valuable product creation [2]. It is important to search out what should not be treated individually and to solve the essential problems extracted from the relations among those problems. Problems generally do not just happen; they happen under certain specific situations [3]. For instance, a senior person often cannot use what a young person uses without trouble, and people often make mistakes when hurrying that they do not make when acting slowly. There is thus a deep relation between a problem and its situation. This study therefore pays attention to the relativity of the problems found by the methods above and the situations in which they occur, and proposes to find the domains to be re-designed by analysis. First, some fields concerning problem finding are described. Next, a technique is proposed to find domains that offer a drastic solution rather than a pile of individual problem-solving. Finally, the result of a trial experiment with this technique is considered.
2 Literature Review
In the field of software engineering, requirements engineering provides techniques to clarify what should be designed and built. The demand-extraction type, the goal-aim type, and the domain-model type are the basic ideas for acquiring and analysing requirements [3]. This chapter describes the demand-extraction type, the scenario method used with it, and the goal-aim type. The demand-extraction type extracts demands basically by interview. The technique rests on the assumption that stakeholders, including users, hold demands latently. However, the quality of the interview depends on the interviewer's skill, and it is difficult to understand the global shape of the demands. These characteristics tend to produce an individual solution for each extracted demand, and a lack of coherence often appears in the demand-extraction type: contradictions or confrontations between solution plans are caused, as shown in Fig. 1. In the scenario method, states such as the user's behaviour and the system's reactions after the system is constructed are described as a scenario, and the user's needs are extracted by communicating with the user through it. It has the advantage that demands of high effectiveness are easily extracted from the user [4]. To use the scenario method, the developer or designer must describe beforehand a scenario showing the state after
Fig. 1. Idea of solution with individual investigation
Fig. 2. Idea of solution with use-state analysis
the system is constructed. However, the content of the discussion depends on the range of contents included in the scenario, and the result clearly changes with it: when only a narrow part of a procedure is imagined, only demands within that narrow range tend to be found and discussed. What range of subjects the developer or designer sets in the scenario therefore remains an open theme. The goal-aim type is the idea that demands on a system become clear by clarifying the stakeholders' goals; each goal is then repeatedly resolved into sub-goals to be achieved, and functions and operations are allocated to the subordinate sub-goals [3]. It is thus a technique not for solving each individual problem but for searching for the primary cause through the causal relations of problems and deriving the goal from them. However, problems concerning HCI have complex and unclear causal relations [5], so the analysis method used by the goal-aim type is understood not to be easy to use for HCI problems. Contextual design exists as a method of analysis, a user survey through investigation and modelling, for understanding the context and situation of product use inclusively. It first questions the user with a non-structured interview method; the use context is then modelled from the five viewpoints of flow, sequence, artefact, culture, and physics, and a solution is found inductively from these models. Its feature is that the entire use state can be described [6]. What is hoped for now is a method that invites the idea of finding the fundamental problem by grasping the relations of the small problems found, as shown in Fig. 2. The purpose of this study resembles the goal-aim type and contextual design in that it intends inclusive re-design rather than the improvement of individual problems. In the proposed technique, the problems themselves receive little consideration, because attention is paid not to causal relations but to the situations in which problems occur. In contextual design, the situation and state are caught as elements of an inclusive understanding of the current state, not as objects of mathematical analysis. The proposed technique instead recognises the situation and state as a group of information – background, reasons, remote causes, and so on – related to the found problem. Use-state keywords that describe the context of the situation in which a problem occurred are recognised as data to be analysed mathematically, and the domains to be re-designed are worked out.
3 Process
3.1 Problem Finding
Task analysis, observation, protocol analysis, and similar techniques are used to find problems. Except for protocol analysis, these techniques can be carried out without recruiting test participants, so they are often used in product development. When executing them, the scenes in which the product will be used are specified. Recording the situations in which problems occur under the situation items listed below should make those situations easier to understand; a small data-structure sketch follows the list.
1. Situation items concerning the user
• Attribute or sense of value
• Role
• Physical posture
• Psychological condition
2. Situation items concerning the background
• Stage of work or time zone
• Environment or place
• Limiting condition
• Things used alongside
3. Situation item concerning the interface
• Thing concerned
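As a small sketch of how a found problem might be recorded against these situation items (the field names and example values are our own illustration, not part of the method's specification):

```python
# Illustrative record for one found problem and the situation it occurred in,
# following the situation items listed above. Field names are our own sketch.
from dataclasses import dataclass, field

@dataclass
class ProblemRecord:
    problem: str
    # Situation items concerning the user
    attribute_or_values: str = ""
    role: str = ""
    physical_posture: str = ""
    psychological_condition: str = ""
    # Situation items concerning the background
    stage_or_time_zone: str = ""
    environment_or_place: str = ""
    limiting_condition: str = ""
    things_used_alongside: str = ""
    # Situation item concerning the interface
    thing_concerned: str = ""
    use_state_keywords: list[str] = field(default_factory=list)

rec = ProblemRecord(
    problem="Search result list is overlooked",
    role="first-time library user",
    psychological_condition="hurrying",
    stage_or_time_zone="start of search",
    thing_concerned="result list",
    use_state_keywords=["hurrying", "first visit"],
)
```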
Fig. 3. Structure of situation items of use
Fig. 3 shows the structure of the use aspects to be observed or supposed [7]. Each element can be described by the situation items. The arrows connecting the elements show their relations, which may also include limiting conditions between elements; such limiting conditions can become design requirements. The situation items were extracted by the following methods. Descriptions and items that define the situation are included in the user roles, personas, and scenarios used to assume the situations in which a product will be used; the situation items concerning the user can therefore be drawn chiefly from the elements of user roles and personas, while the other situation items can be drawn from example scenarios. Specifically, the set of needs, concerns, expectations, behaviours, and responsibilities included in the user role described by Constantine and Lockwood [8] was enumerated as "Role" and "Attribute or sense of value", and the characteristics of the persona advocated by Cooper [9] were enumerated as "Attribute or sense of value". Moreover, Carroll describes scenarios as follows: references to and assumptions about the situation setting are included in a scenario, and through them the status becomes clear; a scenario is thus thought to include the situation items needed to describe the status correctly. The contents of scenarios
described in several documents regarded as good examples were surveyed, and the following elements were extracted from the results: "Thing concerned" as an item concerning the interface; "Stage of work or time zone", "Place", "Limiting condition", and "Things used alongside" as items concerning the background; and "Attribute or sense of value", "Role", "Posture", and "Psychological condition" as items concerning the user. The situations concerning the background include limiting conditions – technical, social, physical, economic – thought to be remote causes. "Things used alongside" are things the user needs to accomplish the work, including supplementary elements of the product used to achieve the work. "Things concerned" are the objects of interaction, such as buttons and displays. A weakness when designers perform task analysis as an inspection method is that the assumed situations are limited to the range in which the designers already expect problems to happen. This weakness cannot be overturned entirely, but keeping the situation items above in mind should make it easier to avoid losing situations that ought to be assumed. Furthermore, if the assumed situations of each found problem are recorded, the situations that were not assumed become clear precisely because they are absent from the record; when an insufficient assumption is recognised at the problem-finding step, problems under that missing situation can be added. Thus the traceability of the assumed situations at the problem-finding step is secured, and recovery is expected to be easy and swift.

3.2 Description of Situations and Extraction of Use-State Keywords
Problems are found using task analysis, observation, protocol analysis, and so on. They are recorded, and descriptions of the situations in which the problems occur are recorded in rows divided by the situation items. Keywords that plainly show the situation are extracted from each situation description; two or more keywords may be extracted from one description.

3.3 Analysis
Once the use-state keywords are extracted, it is possible to pay attention to a keyword common to two or more problems and to pull out the problems falling under a specific keyword. However, since the association between problems and use-state keywords is many-to-many, it is not realistic to list the corresponding problems for every keyword, and a keyword-by-keyword view also makes it hard to understand the range of a problem. Therefore, the pattern is classified by analysing the keywords with Hayashi's Quantification Method 3 (a quantification technique closely related to correspondence analysis). First, a categorical data table is made, with the use-state keywords on the horizontal axis and the problems on the vertical axis; a cell is set to 1 where a problem falls under a keyword and to 0 otherwise. Next, cluster analysis is performed using the category score of each keyword in the extracted dimensions. The dendrogram obtained as a result is expected to make the relations between the use-state keywords easy to understand (a code sketch of this pipeline appears after Section 3.5). The problem descriptions
themselves are not analysed, but are used as reference information when the contents of the clusters are interpreted. Moreover, needless or inappropriate use-state keywords can be excluded from the data if the product concept has been prepared before the analysis; this should make it easier to find problems under situations that fit the product concept.

3.4 Interpretation
One product presents various kinds of problems under various situations, so naturally various domains to be re-designed can be supposed to exist. The use-state keywords are classified into clusters according to the shape of the dendrogram, and the domains are specified by interpreting the contents of each cluster. A description of a specific problem may make a domain easier to picture; in that case it can be used as reference information. The domains can thus be derived by treating the use-state keywords extracted from the situation descriptions of the problems as data and applying cluster analysis.

3.5 Using the Interpreted Contents
This proposal does not pay attention only to the problems that have already surfaced. Because the interpreted contents show not problems but domains and subjects, the approach suits development that aims at creation rather than mere problem solving. An idea may therefore be conceived that does not fit within the shape or UI of a physical product; in such a case it is expected that a service or function, rather than the product itself, will be designed.
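To make the pipeline of Sections 3.3 and 3.4 concrete, the following is a minimal sketch on a toy problems-by-keywords table. Hayashi's Quantification Method 3 is computed here via the mathematically equivalent correspondence analysis (an SVD of the standardised residual matrix); the keywords and the incidence data are invented placeholders, not the study's data.

```python
# Toy sketch of Section 3.3: a binary problems-by-keywords table, category
# scores via correspondence analysis (equivalent to Hayashi's Quantification
# Method 3), then hierarchical clustering of the keywords. Data is invented.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

keywords = ["hurrying", "first visit", "keyword search", "map", "help page", "re-search"]
X = np.array([           # rows = problems; 1 = problem fell under the keyword
    [1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1],
], dtype=float)

P = X / X.sum()                                      # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)                  # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardised residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

n_dims = 3                                           # extracted dimensions kept
# Keyword (category) scores: principal column coordinates on the first axes.
scores = (Vt.T[:, :n_dims] * sv[:n_dims]) / np.sqrt(c)[:, None]

# Ward clustering of keywords on their scores; scipy's dendrogram() on this
# linkage gives the kind of tree shown in Fig. 4.
Z = linkage(scores, method="ward")
for kw, lab in sorted(zip(keywords, fcluster(Z, t=3, criterion="maxclust")),
                      key=lambda p: p[1]):
    print(f"cluster {lab}: {kw}")
```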
4 Trial Experiment
To confirm the effect of the proposed technique, a trial experiment was carried out. The theme was the web book-search pages of three libraries.

4.1 Methodology
The experiment collaborators were six people, students learning design or practising designers, all of whom had experienced task analysis before and used the internet daily; each collaborator was marked with a letter A–F to be distinguished. All the problems found were brought together in one sheet, and the situation of each problem occurrence was described according to the situation items. Next, use-state keywords were extracted from the contents of the situation descriptions; in this experiment, to keep the expression and the extraction method of the keywords consistent, one experimenter extracted all the use-state keywords from all the situation descriptions. The differences in the assumed situations for the items concerning the user and for "Place" were judged to be minor, so these were excluded from the data, and the situation items "Stage of work or time zone", "Limiting condition", "Things used alongside", and "Things concerned" were used and analysed. The use-state keywords were analysed by Hayashi's Quantification Method 3 to examine their pattern. Next, cluster
analysis was performed on the category score of each keyword in each dimension. The dendrogram shows the relations between the use-state keywords in a tree form; it therefore makes it easy to picture the situations, helps in understanding their relativity, and derives discussion about the situations assumed for re-design. After this, a comparative experiment against the proposed technique was carried out. The method was to classify the problems by discussion, in the manner of the KJ method – a common practice often used on design sites. The experiment collaborators were four designers apart from those in the trial experiment, and the data were the 190 problems previously extracted in the trial experiment; neither the situation descriptions nor the use-state keywords were used at all. After confirming the outline of the book-search web system, the collaborators began classifying the problem descriptions.

4.2 Results
190 problems were found, including repetitions, and 68 kinds of use-state keywords were extracted. The keywords were analysed as categorical data by Hayashi's Quantification Method 3, and the keywords with the scores obtained on five axes were analysed by cluster analysis, deriving the dendrogram shown in Fig. 4. The number of clusters was assumed to be six from the shape of the dendrogram, and the domains that are candidates to be re-designed are summarised in Fig. 5 with some of the use-state keywords included in each cluster.
Fig. 4. Dendrogram of state keywords
Fig. 5. Clustered use-state keywords and domains extracted
Next, the comparative experiment is described. The problems were ultimately classified into seven kinds: "Retrieval", "Visual representation and display", "Semantic content transferability of marks", "Layout", "Procedure design", "Map", and "Help". Because each problem has many aspects to be considered, many kinds of classification could be made, one per aspect; in practice, plural aspects mixed during the classification, producing the categories above, apparently influenced by the collaborators' experience and by what was easy and plain to classify. Contents paying attention to a specific button or display were often seen, as were ideas for adding functions. No idea aimed at reviewing whether a function should exist; the ideas shown aimed at making existing functions easier to use. Overall, the derived ideas were narrow, classified from the problems included in each category, and most were not problem themes but specific plans to solve individual problems.
5 Consideration
5.1 Comparison with a Common Technique
Contents that belong to two or more of the groups produced by the comparative experiment coexist in almost every cluster derived by the proposed technique. For instance, use-state keywords grouped by the common technique under "Search", "Procedure design", and "Help" are included in one cluster, while keywords belonging to "Search", "Visual representation and display", and "Layout" are each included in three clusters. This suggests that the proposed technique has the effect of revealing combined domains to be re-designed.
5.2 Problem Finding and Situation Description
Using the situation items as a template for describing the situations in which problems occur appears to support a detailed grasp of each problem, including its situation. In the experiment, the situation items were narrowed to the four kinds mentioned above. Naturally, only the keywords that appear in the dendrogram serve as data describing the situation, and this is not sufficient: to perceive the situation correctly, the situations and states excluded from the analysis must be added back as a common background. In an actual design process, the situations considered are generally limited by the product concept, which suggests that the proposed method can be used after some situation items are selected in accordance with the product concept. However, the technique is thought to be of little benefit when the situation items to be analyzed are extremely few; this point requires further work.
5.3 Use-State Keyword
The number of use-state keywords was examined per participant and per product. Few keywords are extracted from a single product; to obtain varied keywords, it was confirmed that problems should be extracted from several products. The likely reason is that each individual product is designed under its own assumed situations and restrictions, so investigating several products brings out a wider variety of problems and keywords. No significant difference in the number of keywords was found between the case in which three people each examined a different product and the case in which one person examined every product. However, the number of keywords obtained by a single person remains relatively small even when several products are investigated. Bringing together the results of different people, each investigating a different product, yields data containing varied keywords, and thus leads to use-state keyword extraction with few omissions.
5.4 Analysis and Interpretation
By interpreting the groups of use-state keywords with reference to the depth of relations between keywords, which can be read from the form of the dendrogram, the problem situation can largely be pictured. The proposed method aims to contribute to the development of more creative and more attractive products. Its advantage is that latent domains to be re-designed can be derived from recognized problems, and the analysis results can potentially be used as objective information. The proposed method is therefore expected to function in the upstream process of product development. When the use-state keywords classified into each cluster were examined, it was found that two or more keywords belonging to "Stages of work and time zone" are rarely included in the same cluster. This is a limitation of the sorting algorithm of the analysis method; another approach is therefore necessary to clarify wider-ranging domains that span several stages of work or time zones.
6 Conclusion and Future Plan
The present study proposed a technique for deriving domains to be re-designed in order to develop attractive products. The following points were confirmed. Domains that should be re-designed can be found by mathematically analyzing the use-state keywords extracted from descriptions of problem situations. Until now, such domains have often been decided subjectively, based on the experience and intuition of designers, so the proposed technique is expected to become an effective tool for the upstream stage of product development. However, the method has the limitation that another approach is needed to find wider-ranging domains spanning several stages of work, and research on this theme is necessary. Additional research is also needed to clarify the conditions and restrictions under which the method is applicable; for instance, it may be difficult to use when the situation items are few, or when serious problems occur only in unusual situations. Research on these points should be advanced further.
References
1. Preece, J., et al.: Human-Computer Interaction. Addison-Wesley, UK (1994)
2. Kurosu, M., Ito, M., Tokitsu, T.: Guide Book of User Engineering, p. 124. Kyoritsu Shuppan, Tokyo (1999) (in Japanese)
3. Ohnishi, J., Tsumaki, T., Shirogane, J.: Introduction to Requirements Engineering. Kindaikagakusha, Tokyo (2009) (in Japanese)
4. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interactions. The MIT Press, Boston (2000)
5. Tahira, H.: Usability Problem Solving with Graph Theory. In: Conf. of Human Interface Symposium, Sapporo (2002) (in Japanese)
6. Beyer, H., Holtzblatt, K.: Contextual Design. Morgan Kaufmann Publishers Inc., San Francisco (1998)
7. Maekawa, M., Yamaoka, T.: One Proposal Concerning Framework to Clarify Design Requirement. In: Proceedings of the 57th Annual Conference of JSSD, Ueda (2010) (in Japanese)
8. Constantine, L., Lockwood, L.: Software for Use. ACM Press, Boston (1999)
9. Cooper, A.: About Face 3. Wiley Publishing Inc., Indianapolis (2007)
An Approach towards Considering Users' Understanding in Product Design Anna Mieczakowski, Patrick Langdon, and P. John Clarkson Engineering Design Centre, Department of Engineering, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, United Kingdom {akm51,pml24,pjc10}@eng.cam.ac.uk
Abstract. Although different techniques for supporting the process of designing exist, there is, at present, no easy-to-use and pragmatic way of helping designers to infer and analyse product representations that users form in their heads and to compare them with designers’ own understanding of products. This paper is part of ongoing research that attempts to develop an approach for supporting designers in identifying, during the early stages of the design process, whether specific product features evoke similar understanding and responses among the users as among the designers of those features. Keywords: Inclusive Design, Product-User Interaction, Mental Models, Cognitive Representations, Prior Experience.
1 Introduction
Features of many modern products are largely unusable for the majority of users, as they frequently embody the intentions of designers and unverified assumptions about users' needs and wants. To design more inclusively, designers require guidance and tools that help them better understand users' understanding of product features, the goals users want to achieve in relation to products, and the actions users exert on product features to achieve those goals. This paper describes the goal and structure of a new easy-to-use and pragmatic technique for designers, called the Goal-Action-Belief-Object (GABO) approach, developed to help designers model users' understanding and use of products and compare it with their own conceptual models.
2 Users' Internalised Models of Products
When users interact with a given product, they need knowledge of the functionality of that product, its relation to their task, and the dialogue between their actions and product responses [15]. Intrinsically, human understanding is often perceived as a model, and the theoretical construct of a mental model has been studied at length, for over sixty years, by researchers from disciplines including cognitive psychology, artificial intelligence and human-computer interaction (HCI). In 1943, Kenneth Craik [7] laid the foundation for the concept of a mental model by
postulating that it is a small-scale model of external reality that people carry in their heads and use to take various actions, conclude which is best, and apply this knowledge to future problem-solving. Nearly forty years later, Johnson-Laird [8] suggested that individuals construct internal models of the external world that enable them to make inferences and predictions, understand and explain phenomena, decide what actions to perform, and control their execution. Norman [10] argues that the system image observable by the user should be consistent with the designer's conceptual model, and that the mental model a user brings to bear on a given task should be consistent with both. Payne [15] claims that users' models of products are fragmentary, and that as a result people have difficulty interacting correctly with products. Finally, diSessa [12] notes that mental models are heavily influenced by knowledge of and experience with previously encountered products.
3 Prior Experience
Experience is a critical factor in how easy a product is to learn and use [1]. Rasmussen [3] claims that people process information on three levels: (1) the skill-based level, which is automatic and non-conscious; (2) the rule-based level, which is guided by 'if (precondition) then (action)' rules; and (3) the knowledge-based level, which accounts for unfamiliar situations for which no rules from previous experience are available. Reason [11] suggests that rule-based attempts at tasks are always tried first, as people in general are "furious pattern matchers". If the problem can be pattern-matched and only minor corrective rules need to be applied, then the processing will take place at the skill-based level. If, however, a 'pre-packaged solution' cannot be found at the rule-based level, then the information processing is carried out at the slower and more laborious knowledge-based level.
4 Modelling Design Activity
It has been suggested that extracting, measuring and comparing mental models can best be achieved through modelling [18]. Payne [14] believes that models of user interaction with products should be developed with a view to furthering psychological theory and providing conceptual and practical models in HCI. However, many tools for designers were developed well before the theoretical foundations of the concept of mental models had matured sufficiently, and as a direct consequence those models are "very confusing and lack the predictive power to be of any practical or explanatory value" [16]. A further consequence is that product designers receive very little guidance in adequately representing and comparing functional, conceptual and user information during product design. Following a number of years spent researching design activity at a large UK engineering company, Aurisicchio and Bracewell [17] propose the use of diagrams for documenting the structure of design information. They argue that diagrams can be supported by a wide range of computer-based diagramming tools (e.g., Visio, SmartDraw, Mindjet, Compendium, Cambridge Advanced Modeller), and that, since they are more visual than linear text documents, they are better for spotting patterns and
gaining insights. Also, using diagrams is beneficial as they group together information that needs to be used together and minimally label that information, place similar information at adjacent locations, minimise shifts of attention and automatically support a number of perceptual inferences.
5 Requirements for an Easy-to-Use Modelling Approach
Previously, modelling techniques such as ACT-R [5], GOMS [6], SOAR [2] and TAFEI [4] were developed to help designers focus on users' goals and actions. Most of them incorporated task analysis of what a person is required to do to achieve their goal and what operational difficulties they face [9]. However, for various reasons, including the complexity of their architectures and the specific skills necessary to use them, none of these techniques has so far been effectively transferred to product design in a way that benefits end users. Consequently, to offset the lack of an easy-to-use and pragmatic technique for modelling users' understanding and use of products and comparing it with designers' conceptual models, we developed the Goal-Action-Belief-Object (GABO) modelling approach for designers. The GABO approach was developed in conjunction with information elicited from interviews with twenty product designers and observations of fifty users of everyday products. Our goal was that the approach should be visual; easy and quick to understand, implement and use; lead toward improvements in design practice that increase the chances of producing an accessible and usable product; and offer designers significant productivity and differentiation gains.
6 Rules of the GABO Modelling Approach
The GABO approach works on the premise that the contents of people's minds (i.e., their knowledge, theories and beliefs) should be explored by designers in order to better understand users' behaviour in relation to products [15]. Consequently, it is aimed at representing the minimum information about users' preferences needed to identify areas of design where people's mental models are compatible with the designer model and where they differ. Note, though, that the GABO approach does not focus on the structure of the human mind and is not in any way aimed at representing information about the actively changing parts of mental models. Instead, it intends to capture and represent the current and static contents of users' mental models during interaction with a given product.
6.1 Structure of the GABO Modelling Approach
The GABO approach is aimed at encouraging designers to focus on users' understanding and use of products by representing the following elements:
1. the goals that users want to achieve during interaction with products,
2. the correct and incorrect actions that they exert on product interfaces,
3. the beliefs about actions that they bring from previous interactions with other products to interactions with new products, and
4. their understanding of the impact of their actions on functional objects.
When using the GABO approach, designers are required to create three types of models:
1. an engineering model,
2. a designer model, and
3. a number of individual user models.
The engineering model is essentially a functional analysis diagram encompassing the objects of a given product; the engineer's beliefs about the function of each object; the external interface-based and internal object-based actions that need to be taken to change objects' states and transform available inputs into the desired outputs; and the goals that pertain to actions driven by human input and applied to the objects. The structure of the engineering model differs from that of the designer model and the individual user models, as its purpose is to act as a reference for designers when they construct the designer model. The designer model includes the overall goal(s) that the designer thinks potential users should set for their product usage; the actions that need to be taken to get users to achieve their goals with that product; the beliefs that correspond to users' actions and contain the designers' description of the appearance, functionality and behaviour of that product; and the objects that are envisaged by designers to sit on top of the product's interface and be exploited by users. The structure of an individual user model is very similar to the designer model, as it captures the overall goal(s) that a given user wants to accomplish with a particular product; the actions that the user wants and thinks they need to perform to achieve their goal(s) with that product; the beliefs that correspond to the user's actions and provide the user's internal understanding of the appearance, functionality and behaviour of that product; and the objects that the user uses to carry out their actions. The DRed platform [17] was used during development of the GABO approach as a testing ground for representation and comparison purposes; note, however, that the GABO approach is intended to be platform independent. During the use of the DRed software, the GABO approach's goal, action, belief and object elements were assigned corresponding DRed elements: DRed's Task (Pending) element was used as the goal's symbol, the Answer (Open) element was chosen to represent an action, the Issue (Open) element was given the belief's symbol, and the Block (Internal) element was selected as the counterpart for an object. The GABO approach's corresponding DRed elements are shown in Table 1.
Table 1. Four elements of the GABO approach and their corresponding DRed symbols
GABO Approach's Element Types    Corresponding DRed Element Types
Goal                             Task (Pending)
Action                           Answer (Open)
Belief                           Issue (Open)
Object                           Block (Internal)
The modelling of products using the GABO approach involves four stages. In the first stage, designers need to refer to a drawing of the actual engineering model of a given product to better understand which product elements and operations are functionally and structurally possible before they design the interface features that will sit on top of the engineering parts. Figure 1 shows an example of an engineering model of a complex-to-use toaster represented using the GABO approach in the DRed software.
Fig. 1. Engineering model of a complex-to-use toaster drawn using the GABO approach
In this example, the engineering model has been represented using the object elements; the belief elements, which explain the objective of each object; and the main relationships between different objects. The action element was used to connote human input, and DRed's External Block element to signify external entities such as human input, bread, mains power and environmental air. In the second stage of using the GABO approach, designers are required to draw one collective product model using goal, action, belief and object elements and to accompany each element with simple descriptive text, based on a rigorous semantic coding language of their own choice, for the purpose of pattern-matching elements in the designer and user models during model comparison. For instance, the semantic coding language used to draw the designer and user models in this paper uses different verb and noun combinations depending on which type of element it is describing: goal (verb + noun), action (verb+ing + noun), belief (noun + to + verb), object (noun). An example of a designer model of a complex-to-use toaster represented using the GABO approach in DRed is shown in Figure 2. Due to the sheer size of this designer model, only a part of it is included in this paper.
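As an illustration only (this helper is our own, not part of the paper), the coding grammar above can be made mechanical, which is what later enables elements to be pattern-matched across models:

# Applies the verb/noun grammar described above to produce element labels.
# Naive string concatenation is used for "verb+ing", so only regular verbs
# such as "press" or "toast" are safe in this sketch.
def code_label(element_type: str, verb: str, noun: str) -> str:
    grammar = {
        "goal":   f"{verb} {noun}",     # verb + noun
        "action": f"{verb}ing {noun}",  # verb+ing + noun
        "belief": f"{noun} to {verb}",  # noun + to + verb
        "object": noun,                 # noun only
    }
    return grammar[element_type]

print(code_label("goal", "toast", "bread"))    # -> "toast bread"
print(code_label("action", "press", "lever"))  # -> "pressing lever"
print(code_label("belief", "press", "lever"))  # -> "lever to press"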
Fig. 2. Part of designer model of a complex-to-use toaster drawn using the GABO approach
In this example, the designer model of a complex-to-use toaster has been represented using the goal element to signify the master user goal pertaining to a toaster’s usage; the action elements to indicate the sequence of actions that designers think that users have to perform to accomplish their goal with a toaster interface; the belief elements to provide designer description of the appearance, functionality and behaviour of toaster features which users are required to exert their actions on and to elucidate the position of different features and how user actions need to be taken to correctly operate them; and object elements to specify the functional features that are envisaged by designers to sit on the top of the toaster interface and be exploited by users.
Fig. 3. User model of a complex-to-use toaster drawn using the GABO approach
In the third stage of using the GABO approach, designers need to run observations with users of products using a verbal (think-aloud) protocol and keep a log of users' goals, actions, beliefs and objects. Subsequently, designers should use the information collected from study participants to draw several individual user models using goal, action, belief and object elements and, as in the designer model, accompany each element with simple descriptive text based on a rigorous semantic coding language of their own choice. Similarly to the elements used in the designer model example, the elements in the individual user models have been annotated with the four verb and noun combinations. An example of an individual user model of a complex-to-use toaster represented using the GABO approach in DRed is shown in Figure 3. In this example, the individual user model has been represented using the goal element to signify the master user goal pertaining to the toaster's usage; the action elements to indicate the sequence of actions that a user wants and thinks they need to perform to accomplish their goal with the toaster interface; the belief elements to provide the user's internal understanding of the appearance, functionality and behaviour of the toaster features on which they are required to exert their actions, and to convey the position of different features and how the user thinks their actions need to be taken to operate these features correctly; and, lastly, the object elements to stipulate the functional features that the user is familiar with, accustomed to operating and immediately associates with a toaster form. It should be noted that the block element has a blue background only in the designer model; the background colour of all the transcluded block elements in the user models is white. In the fourth stage of using the GABO approach, designers need to compare the similarities and differences between each user model and the designer model using an appropriate algorithm (either manually or computationally), check the degree of compatibility between the designer model and the individual user models, and make appropriate design decisions relating to the inclusivity of future product features.
6.2 The GABO Comparison Procedure
The GABO approach stipulates that any two models (a designer model and an individual user model) are compared based on: (1) the presence of the same vertices in the two models and (2) the connectivity between two given vertices in the two models. The comparison procedure is carried out (either manually or computationally) using a function from set theory that measures similarity between graphs with common vertex and edge sets [13]. This function is used for measuring both the presence of vertices and the connectivity between vertices in the designer model and the individual user models, with the designer model acting as the main model against which each user model is checked for compatibility. The assumption is that, by using this function, designers will be able to make close estimates of the degree of compatibility of their intended goals, actions, beliefs and objects regarding product usage with the goals, actions, beliefs and objects of heterogeneous users. The function for checking the presence of vertices in the designer model and user models is as follows:
P(D, U) = |V_D ∩ V_U| / |V_D|,    (1)

where
• P = a value between 0 and 1, where 0 means that an individual user model is not compatible with the designer model, 1 means that an individual user model is 100% compatible with the designer model, and any value in between indicates the degree of compatibility of a user model with the designer model (e.g., if P(D, U) = 28/88, the compatibility level equals 0.31, meaning 31% compatible)
• D = the designer model
• U = the user model
• V_D = the set of vertices in the designer model
• V_U = the set of vertices in the user model
• V_D ∩ V_U = the set of all vertices that are members of both V_D and V_U.

This function assumes that two vertices are the same (one from the designer model and one from the user model) if they belong to the same element type, for instance the belief element type, and contain the same semantic grammar. The function for checking the connectivity of vertices in the designer model and each user model is as follows:
C(D, U) = |E_D ∩ E_U| / |E_D|,    (2)

where
• C = a value between 0 and 1, where 0 means that an individual user model is not compatible with (dissimilar to) the designer model, 1 means that an individual user model is 100% compatible with (identical to) the designer model, and any value in between indicates the degree of compatibility of a user model with the designer model (e.g., if C(D, U) = 30/114, the compatibility level equals 0.26, meaning 26% compatible)
• D = the designer model
• U = the user model
• E_D = the set of edges in the designer model
• E_U = the set of edges in the user model
• E_D ∩ E_U = the set of all edges that are members of both E_D and E_U.

The connectivity function assumes that two edges are equal if they join two vertices that belong to the same element type and have identical semantics.
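The two functions are straightforward to implement computationally. The sketch below is one possible realisation (ours, not the authors'): vertices are encoded as (element type, semantic label) pairs and edges as unordered pairs of vertices, so that set intersection directly applies the matching rules just stated; the toaster fragment is illustrative.

from typing import FrozenSet, Set, Tuple

Vertex = Tuple[str, str]   # (element type, semantic label)
Edge = FrozenSet[Vertex]   # unordered pair of vertices

def vertex_compatibility(v_d: Set[Vertex], v_u: Set[Vertex]) -> float:
    """P(D, U) = |V_D ∩ V_U| / |V_D|."""
    return len(v_d & v_u) / len(v_d)

def edge_compatibility(e_d: Set[Edge], e_u: Set[Edge]) -> float:
    """C(D, U) = |E_D ∩ E_U| / |E_D|."""
    return len(e_d & e_u) / len(e_d)

# Illustrative fragment of a toaster model, coded with the grammar of
# Sect. 6.1: goal = verb + noun, action = verb+ing + noun, object = noun.
g  = ("goal", "toast bread")
a1 = ("action", "pressing lever")
a2 = ("action", "turning dial")
o1 = ("object", "lever")

designer_v: Set[Vertex] = {g, a1, a2, o1}
user_v: Set[Vertex] = {g, a1, o1}  # this user never discovers the dial
designer_e: Set[Edge] = {frozenset((g, a1)), frozenset((a1, o1)), frozenset((g, a2))}
user_e: Set[Edge] = {frozenset((g, a1)), frozenset((a1, o1))}

print(vertex_compatibility(designer_v, user_v))  # 0.75 -> 75% compatible
print(edge_compatibility(designer_e, user_e))    # ~0.67 -> 67% compatible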
7 Evaluation of the GABO Modelling Approach
The usefulness and effectiveness of the GABO approach were evaluated with eight designers from a range of small and large organisations based in the UK, during two five-hour workshop sessions. The designers, aged between 29 and 52, were asked to work on two redesign tasks: one required them to redesign the interface of a household product with a complex-to-use interface (either a toaster or a coffee maker) using a method of their choice, and the other required them to redesign the interface of one of the aforementioned two products using the GABO approach. When the tasks were completed, each designer was asked to individually fill out an evaluation questionnaire composed of a number of quantitative and qualitative questions. Overall, the designers gave an average rating of 5 on a 7-point scale for how useful the GABO approach was in identifying and capturing users' understanding and the problems users encounter during product use. The same procedure applied to how well the approach related designers' understanding of product functionality to the understanding of users yielded an average of 5.5, and for ease of use the designers on average gave the GABO approach a score of 4.3. In addition, five designers believed that the GABO approach helped them produce a better design than an alternative approach, while three designers said that they would need more time with it to determine whether it was better or worse than the alternative method.
8 Discussion and Conclusion
This paper discussed the role of mental models and prior experience in product-user interaction and reviewed existing modelling techniques for representing users' goals and actions. Since there is, at present, no easy-to-use and pragmatic technique for representing and comparing designers' and users' understanding and usage of everyday products, this paper proposes the GABO approach for designers, which bridges that gap. The GABO approach consists of four stages in which designers need to: (1) refer to the engineering model of a product to better understand how different product parts interact with one another; (2) create a designer model of that product using appropriately annotated goal, action, belief and object elements and compare it with the engineering model to see which features should be mounted on top of the underlying functional parts; (3) investigate how different users understand and use product features, creating several individual user models using goal, action, belief and object elements annotated in the same semantic style as their counterpart elements in the designer model; and (4) compare the designer model with the individual user models using a function from set theory (either manually or computationally), check the degree of compatibility between the designer model and the user models, and make appropriate design decisions relating to the inclusivity of future product features. Results from the evaluation study with eight designers indicate that designers find the GABO approach fairly useful and effective in identifying key similarities and differences in designers' and users' understanding and usage of products.
References
1. Langdon, P.M., Lewis, T., Clarkson, P.J.: Prior experience in the use of domestic product interfaces. Universal Access in the Information Society 9, 209–225 (2009)
2. Laird, J.E., Newell, A., Rosenbloom, P.S.: SOAR: An architecture for general intelligence. Artificial Intelligence 33, 1–64 (1987)
3. Rasmussen, J.: Skills, rules, and knowledge: Signals, signs, and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics 13, 257–266 (1983)
4. Stanton, N.A., Baber, C.: Validating task analysis for error identification: Reliability and validity of a human error prediction technique. Ergonomics 48, 1097–1113 (2005)
5. Anderson, J.R.: Rules of the mind. Lawrence Erlbaum Associates, Hillsdale (1993)
6. Card, S., Moran, T.P., Newell, A.: The psychology of human-computer interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
7. Craik, K.J.W.: The nature of explanation. Cambridge University Press, Cambridge (1943)
8. Johnson-Laird, P.N.: Mental models. Harvard University Press, Cambridge (1983)
9. Kirwan, B., Ainsworth, L.K.: A guide to task analysis: The Task Analysis Working Group. Taylor and Francis, London (1992)
10. Norman, D.A.: The design of everyday things. Basic Books, London (2002)
11. Reason, J.T.: Human error. Cambridge University Press, Cambridge (1990)
12. diSessa, A.: Models of computation. In: Norman, D.A., Draper, S.W. (eds.) User-Centered System Design: New Perspectives in Human-Computer Interaction, pp. 201–218. Lawrence Erlbaum Associates, Hillsdale (1986)
13. Goldsmith, T.E., Davenport, D.M.: Assessing structural similarity in graphs. In: Schvaneveldt, R.W. (ed.) Pathfinder Associative Networks, pp. 75–87. Ablex Publishing Corporation, Norwood (1990)
14. Payne, S.J.: On mental models and cognitive artefacts. In: Rogers, Y., Rutherford, A., Bibby, P.A. (eds.) Models in the Mind: Theory, Perspective and Application, pp. 103–118. Academic Press, London (1992)
15. Payne, S.J.: Mental models in human-computer interaction. In: Jacko, J.A., Sears, A. (eds.) The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, pp. 63–76. Taylor & Francis, New York (2008)
16. Rogers, Y.: Mental models and complex tasks. In: Rogers, Y., Rutherford, A., Bibby, P.A. (eds.) Models in the Mind: Theory, Perspective and Application, pp. 145–149. Academic Press, London (1992)
17. Aurisicchio, M., Bracewell, R.H.: Engineering design by integrated diagrams. In: Proceedings of the 17th International Conference on Engineering Design (ICED 2009), Stanford, CA, USA, pp. 301–312 (2009)
18. Van Engers, T.: Knowledge management: The role of mental models in business system design. PhD Thesis, Department of Computer Science, Vrije Universiteit (2001)
Evaluation of Expert Systems: The Application of a Reference Model to the Usability Parameter
Paula Miranda1, Pedro Isaias2, and Manuel Crisóstomo3
1 Escola Superior de Tecnologia de Setúbal, IPS, Campus do IPS, Estefanilha, 2910-761 Setúbal, Portugal
[email protected]
2 Universidade Aberta, Rua Fernão Lopes, 9, 1º Esq., 100-132 Lisboa, Portugal
[email protected]
3 Universidade de Coimbra
[email protected]
Abstract. This study presents a performance evaluation model for expert systems, which is then used to evaluate an existing expert system as a way of testing its applicability. The proposed model's evaluation criteria are: usability, utility, quality, interface, structure, productivity and return. Information systems, and especially expert systems, are today a real necessity for any organisation intending to be competitive. Given this scenario, organisations investing in these systems aim to ensure, progressively, that the investment they have made is contributing to the organisation's success. Hence, it is fundamental to evaluate the expert system's performance. The evaluation assesses an expert system's conformity to its original requisites and objectives and determines whether its performance satisfies its users and meets the organisation's strategic goals.
Keywords: Expert systems, performance evaluation, evaluation model, usability.
1 Introduction
The assessment of a system's performance is a management technique that aims to improve the performance of the system and justify its existence. Performance evaluation is a tool that allows assessing and measuring how a system is accomplishing its mission inside a company and whether it is meeting expectations. Through the evaluation it is also possible to measure and differentiate systems' performance; to inform the company's management of that performance; to help decide whether to continue or cease certain investments; and to identify deficiencies in the systems' performance in order to correct them.
The literature presents several methods and models for the evaluation of information systems in general and knowledge-based systems in particular. Nonetheless, they are not very systematic, are difficult to apply, and are based on informal concepts with modest foundations and little practical experience. According to [2], the evaluation of a knowledge-based system is a multifaceted problem with numerous approaches and techniques: the results the system produces should be evaluated along with its characteristics, such as usability, ease of improvement, and the impact on users of not using the system. A review of the literature shows numerous viewpoints on what should be considered when evaluating an expert system. [3] believe expert systems can be evaluated and analysed through the most important performance measures: efficiency and effectiveness. Effectiveness is the level of accomplishment of objectives and relates to the outputs of the system, whereas efficiency correlates the inputs (resources) with the outputs and implies using the least amount of resources. [4] have a different view and advocate that the best way to evaluate and test the efficiency of an expert system is to ensure it satisfies its users and responds to their feedback. User satisfaction is also presented by [1] as a success measure of an expert system's use and effectiveness: if users are pleased with the system, they will feel motivated to use it, which will improve their individual impact. [5], on the other hand, see evaluation as the process that ensures the usability, quality and utility of an expert system.
2 Evaluation Model
This study proposes a model for evaluating expert systems' performance, based on a review of various studies on this subject.
2.1 Initial Considerations
When evaluating an expert system, its evaluation has to be considered as a global objective. An expert system cannot be seen as an isolated object, but as an integrated entity, used by people in a specific context to achieve certain objectives [7]. The review of the literature identified several parameters for expert systems evaluation. These parameters demonstrate that expert systems' performance should be evaluated from the perspective of the mutual influences between user, system and organisation. According to [8], in this type of system, users and technologies are often jointly responsible for the accomplishment of tasks; hence, both the technical and the organisational evaluation are important. To conduct an accurate evaluation of such systems, it is essential to consider the interactions involved in the process in which the expert system is implemented: user, system and organisation. [9] underlines that the interdependency between these entities is decisive for the accomplishment of a task. Expert systems bring many benefits to users. They are an excellent support in the sense that they can reduce the time users spend performing a task and decrease users' workload. They can make users more successful in the
accomplishment of their tasks and enable them to make decisions with higher precision. A user with good working tools is, undoubtedly, a satisfied user, since the workload and the occurrence of mistakes are reduced and the user can make better decisions. Nonetheless, misuse of the system may lead to failed expectations. Expert systems also bring benefits to organisations. An expert system can lead to more precise and homogeneous decision making and can reduce task completion time. Because its fundamental characteristic is the dissemination of knowledge, it frees specialists from tasks that were previously their responsibility, giving them time to perform other tasks. Nevertheless, if the organisation does not create the conditions for the expert system to operate at its best (for example, by subscribing to a maintenance and update programme), the system will not deliver the expected performance. A user who operates the system correctly unquestionably contributes to the global performance of the organisation.
2.2 Model
As mentioned by [10], the majority of the models described in the literature are not clear and organised in identifying evaluation parameters. In designing a theoretical model, it was possible to identify parameters defined by the majority of authors. The parameters that can be used in expert systems performance evaluation, identified across the variety of models proposed in the literature, were grouped with the intention of contributing to their identification and systematisation, and organised according to the role of the entities involved in the evaluation process: the user, the system and the organisation.

Table 1. Parameters and respective expert systems evaluation criteria, according to the involved entities

Entities        Parameters     Criteria
User            Usability      Learning, effectiveness, error tolerance, satisfaction, efficiency
                Utility        Scope, suitability
                Quality        Reliability, decision quality, consistency, update time, coherence, response time
System          Interface      Ease of navigation, suggestibility, screen design, search easiness, input and output quality
                Structure      System stability, error rate, transaction processing, historic, transaction time
Organisation    Productivity   Motivation, task optimisation, benefit, reports, efficiency, cost, effectiveness
                Return         Competitiveness, return over investment, cost reduction
The table above categorises the entities involved in the process, the evaluation parameters and the respective criteria that compose the model this research proposes (Table 1).
2.2.1 User Perspective
From the user perspective, three parameters were identified for the evaluation of expert systems' performance: usability, utility and quality.
Usability Parameter
The usability parameter was identified, with more or less emphasis, by several authors [10], [2] and [5], and can be defined as the ease with which the system is used. Table 2 lists the evaluation criteria proposed for the usability parameter: learning, error tolerance, efficiency, effectiveness and satisfaction.

Table 2. Usability parameter – evaluation criteria

Criteria         Definition
Learning         Ease with which the users perform the tasks.
Error tolerance  Identification of the errors made by the user.
Efficiency       The required effort to attain a certain objective.
Effectiveness    Quantity of successfully accomplished tasks.
Satisfaction     User's subjective satisfaction.
Utility Parameter
Another parameter considered from the user's perspective for performance evaluation is utility [11], [1]. Utility refers to how much benefit the user perceives to receive. According to [2], the performance of the system should also be evaluated by its utility: it is important to assess whether the expert system is useful in resolving the problems it was designed to solve, that is, whether it is doing what is expected of it. The Technology Acceptance Model (TAM), developed by [12] to explain the behaviour of information system users, has demonstrated that if a system is really useful, the user tends to use it despite any difficulties in doing so [13].

Table 3. Utility parameter – evaluation criteria

Criteria     Definition
Scope        Level of correspondence between the system and reality.
Suitability  Relation between the decisions made by the system and the needs of users.
Quality Parameter
The quality parameter is referred to by [14], [15], [10], [16] and [5]. The technical definition established by the International Organisation for Standardisation (ISO) states that quality is fitness for use, that is, conformity to requirements.
Table 4 shows the criteria associated with the quality parameter: reliability, consistency, coherence, decision quality, update time and response time.

Table 4. Quality parameter – evaluation criteria

Criteria          Definition
Reliability       Level of trust the users have in the system's decisions.
Consistency       Integrity relation between different pieces of information in the system.
Coherence         System's capacity to reflect reality.
Decision quality  Level of understanding of the decisions and their suitability to the needs of the user.
Update time       Capacity of the system to make decisions in the shortest period of time while remaining available for updates, so that no conflicts occur in the decision-making process.
Response time     Time the user has to wait from the request until a decision is obtained.
2.2.2 System's Perspective
From the system's perspective, two parameters were identified for the evaluation of expert systems' performance: interface and structure. These parameters allow criteria to be identified from a technological perspective, i.e., according to software and hardware aspects.
Interface Parameter
The interface, cited by [14] and [11], relates to the design of the system and access to information, while the structure concerns the technological base of the system. The interface of an expert system is the set of characteristics users employ to interact with the system: everything the user has available to control the system, and everything the system has to produce the effects (responses) of the user's actions. For the interface parameter, five criteria were identified: ease of navigation, screen design, input and output quality, suggestibility and search easiness.
Structure Parameter
The next parameter, related to the system, is structure. Its evaluation criteria are: system stability, transaction processing, transaction time, error rate and historic. The criteria concerning structure were referenced by [14].
2.2.3 Organisation's Perspective
From the organisation's perspective, two parameters were identified for the evaluation of expert systems' performance: productivity and return. These parameters enable measurement of the benefits the organisation gains from acquiring the expert system.
Productivity Parameter
The productivity parameter's criteria are: motivation, benefit, efficiency, effectiveness, task optimisation, reports and cost. They are intended to evaluate aspects related to financial impact and production. These criteria were identified by diverse authors, namely [15], [10] and [3].
Return Parameter
The return parameter underlines the relation between the expectations of the investment and the perceived return. Its criteria are competitiveness, cost reduction and return over investment.
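For readers who wish to operationalise the model (for instance, to group questionnaire items per criterion, as in the case study of Sect. 3), its hierarchy can be encoded directly. The sketch below is our own rendering of Table 1, not an artefact of the cited work:

# Entity -> parameter -> criteria hierarchy of the proposed evaluation model.
EVALUATION_MODEL = {
    "user": {
        "usability": ["learning", "error tolerance", "efficiency",
                      "effectiveness", "satisfaction"],
        "utility": ["scope", "suitability"],
        "quality": ["reliability", "consistency", "coherence",
                    "decision quality", "update time", "response time"],
    },
    "system": {
        "interface": ["ease of navigation", "screen design",
                      "input and output quality", "suggestibility",
                      "search easiness"],
        "structure": ["system stability", "transaction processing",
                      "transaction time", "error rate", "historic"],
    },
    "organisation": {
        "productivity": ["motivation", "benefit", "efficiency",
                         "effectiveness", "task optimisation",
                         "reports", "cost"],
        "return": ["competitiveness", "cost reduction",
                   "return over investment"],
    },
}

# e.g., list every criterion evaluated from the user's perspective:
for parameter, criteria in EVALUATION_MODEL["user"].items():
    print(parameter, "->", ", ".join(criteria))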
3 Case Study
The case study conducted in this research aims to demonstrate the applicability of the previously proposed expert system evaluation model, using Exsys Corvid's "Cessna Citation X Diagnosis System".
3.1 Describing the Application of the Exsys Corvid "Cessna Citation X Diagnosis System"
Exsys Corvid is a world-leading, user-friendly tool for the development of expert systems; its systems report an impressive return on investment and a unique competitive advantage (Exsys 2008). This study chose the "Cessna Citation X Diagnosis System", an expert system developed by Cessna for its Citation X aircraft using the Exsys Corvid software. According to Cessna, the use of this expert system has significantly reduced the downtime of its Citation X aircraft, allowing a substantial reduction of costs while maximising the safety and satisfaction of its users. Using this case study, a scenario was proposed that reports a fault on the start of one of the Cessna Citation X's engines: one of the engines fails when the starting process is initiated. When the fault occurs, a set of lights in the cabin switch on and, depending on which lights are lit, recommendations are progressively given, followed by a final recommendation on how to solve the problem (a toy sketch of this rule-driven behaviour is given at the end of this section).
3.2 Methods
A questionnaire was designed (see appendix) to demonstrate the feasibility of adopting the proposed model and to provide a method for assessing the level of satisfaction that users of this type of system experience. It uses a Likert scale that indicates the user's level of agreement or disagreement with the given statements; users assessed the statements on levels varying from 0 – N/A to 5 – Totally Agree. The questionnaire embodies some of the factors responsible for the satisfaction or dissatisfaction of expert systems' users. It was given to a total of 37 students from two classes of the "Management Science" Master's degree at the Higher Institute of Economics and Management, Lisbon Technical University, Portugal: one class of students from the "Decision Systems" module and the other from "e-Business Models and Technologies". The users started by studying the test scenario in order to familiarise themselves with the test environment, in this case Exsys Corvid (http://www.exsys.com) and the "Expert System - Cessna Citation X" scenario. Only then did they proceed to the questionnaire itself.
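As promised in Sect. 3.1, the following toy sketch illustrates the kind of forward-chained, light-driven recommendation logic the scenario describes. It is written in plain Python because the Exsys Corvid rule language is not reproduced in the paper; all light names and recommendations here are invented.

# Cabin warning lights act as facts; ordered rules map light patterns to
# recommendations, the most specific patterns first.
RULES = [
    ({"ENG FAIL", "FUEL PRESS LOW"}, "Check the fuel boost pump and fuel lines."),
    ({"ENG FAIL", "IGNITION"}, "Inspect the igniter and ignition exciter."),
    ({"ENG FAIL"}, "Abort the start and run the starter diagnostic checklist."),
]

def recommend(lights_on: set) -> str:
    # The first rule whose conditions are all satisfied fires.
    for conditions, recommendation in RULES:
        if conditions <= lights_on:
            return recommendation
    return "No fault pattern recognised; consult the maintenance manual."

print(recommend({"ENG FAIL", "IGNITION"}))  # -> inspect the igniter ...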
4 Results
This section presents the results obtained from the evaluation of the learning criterion of the usability parameter, as the product of a statistical treatment of the questionnaire data. The first part characterises the users and the second evaluates the learning criterion.
4.1 User's Profile
The respondents' mean age was 24.5 years (n = 37; SD = 2.96). Respondents were asked whether it was the first time they had used expert systems: 14 answered 'No' (37.8%) and 23 answered 'Yes' (62.2%) (N = 37). Respondents who answered 'No' were asked how frequently they used expert systems: 10 answered 'Rarely' (71.4%) whilst only 4 (28.6%) answered 'at least once a week or more' (N = 14). Those who had used expert systems were also asked how much time per week they used them; the majority (13, corresponding to 92.9% of these respondents) used them for 'less than an hour' (N = 14). This shows very little experience with expert systems.
4.2 Usability Evaluation
The results presented in this section concern the learning criterion of the usability evaluation.
4.2.1 Learning Criterion
The questions used to evaluate this criterion are:
A.1 It is easy to learn how to use the system.
A.2 It took too much time to learn how to use the software.
A.3 It is easy to remember how to use the system after a period of usage interruption.
A.4 The information the system supplies is easy to apprehend.
A.5 The help/explanation messages are clear.
A.6 The language the system uses is adequate.
A.7 The commands' names are suggestive and easy to use.
A.8 To learn how to use the system it is necessary to have extensive experience in the use of expert systems.
A.9 Before using the system it is necessary to read much documentation.
A.10 It is easy to successfully complete a task for the first time in the system.
A.11 It was possible to use the software from beginning to end without resorting to external help.
A.12 The system encourages the exploration of new features.
The learning criterion registered, in general, a high median. Table 5 presents the detailed results.
Table 5. Frequencies and percentages regarding the learning criterion
[The table reports, for each question A1–A12, the frequency (F) and percentage (%) of responses at each scale point from 0 – N/A to 5 – Totally Agree, together with row totals; N = 37 for every question except A3, where N = 36.]
Table 5 indicates satisfaction at level 4 – Agree for questions A1, A3, A4, A5, A6, A7, A10, A11 and A12, with the highest percentage of respondents indicating this choice. This means that, with regard to these questions, users in general agreed that the system is easy to understand and to remember how to use; that the information given by the system is effortlessly assimilated; that the messages the system supplies, the language it employs and the names of the commands are accessible; that a task can be accomplished for the first time without resorting to external help; and that the system encourages the use of new features. For questions A1, A3, A6 and A11, the second highest score was 5 – Totally Agree, indicating that these questions, globally, represented the highest user satisfaction among all the questions with a high median. Questions A4, A5, A7, A10 and A12, besides their 4 – Agree majority, had a significant number of individuals choosing 3 – Neither Agree nor Disagree, reflecting a certain indifference in responding to these questions. It should be underlined that answers A2 and A8 present a majority score of 2 – Disagree, followed by a significant second choice of 1 – Totally Disagree, meaning that users in general disagreed that the process of learning how to use the software is time-consuming or requires extensive experience. Finally, answer A9 had a majority of 1 – Totally Disagree, followed by a considerable 2 – Disagree score, meaning that most users disagreed that it is necessary to read much documentation before using the system. It therefore becomes possible to conclude that it is easy to learn how to use the system.
4.2.2 Others
The other criteria (error tolerance, efficiency, effectiveness and satisfaction) were also subjected to the same kind of evaluation.
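The statistical treatment behind Table 5 is simple to reproduce. The sketch below shows one way to do it (our own, with invented responses rather than the study's data): per-question frequency and percentage tables over the 0-5 scale, plus the median used to summarise each question.

import pandas as pd

# One row per respondent, one column per question; values 0 (N/A) to 5.
responses = pd.DataFrame({
    "A1": [4, 5, 4, 4, 3],
    "A2": [2, 1, 2, 2, 3],
    "A9": [1, 1, 2, 1, 2],
})

for question in responses.columns:
    counts = responses[question].value_counts().reindex(range(6), fill_value=0)
    table = pd.DataFrame({
        "F": counts,
        "%": (100 * counts / counts.sum()).round(1),
    })
    print(f"{question} (median = {responses[question].median():.0f})")
    print(table)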
4.3 Discussion
The analysis of the results in Table 5 allowed a usability evaluation of the expert system for the learning criterion. Most of the criteria evaluated in this study yielded high usability. The efficiency, effectiveness and satisfaction criteria obtained more convergent indices in terms of users' opinions and, consequently, greater confidence in the results, which again indicated high usability; satisfaction was undoubtedly the criterion with the best performance. For certain criteria, the number of respondents who lowered the usability evaluation, or who introduced a large dispersion in the answers, was very small, which strengthens the final result. The presented results confirm the applicability of the proposed model to the usability parameter, in particular for the learning criterion.
5 Conclusions
Expert systems can be used as decision aids and are part of the toolset of organisational strategy, working as differentiation and competition devices. The importance of evaluating expert systems is unquestionable. The impact of information systems in general, and expert systems in particular, on organisations is of increasing importance, interconnecting the most varied activities. Investments in these systems are high and the benefits are difficult to measure owing to the high subjectivity of the variables under evaluation; nevertheless, evaluation is a mandatory task. There are several barriers to such evaluation: systems' complexity, the evolving nature of information technologies, the absence of methodological evaluation standards, demands in terms of human resources, finance and time, the lack of immediate perception of the evaluation's benefits, and the perception of limitations or failures within organisations. As described in Section 1, this study set out to propose a model for evaluating the performance of expert systems. The proposal was assessed through a questionnaire that can be applied to an organisation's expert system, and its implementation demonstrates the applicability of the proposed model and its results.
References
[1] Cascante, P., Plaisent, M., Maguiraga, L., Bernard, P.: The Impact of Expert Decisions Support Systems on the Performance of New Employees. Information Resources Management Journal 15(4), 64–78 (2002)
[2] Barr, V.: Applications of Rule-base Coverage Measures to Expert System Evaluation. Knowledge-Based Systems 12, 27–35 (1999)
[3] Turban, E., Aronson, J.: Decision Support Systems and Intelligent Systems, 6th edn. Prentice-Hall, New Jersey (2000)
[4] Anumba, C., Scott, D.: Performance Evaluation of Knowledge-Based System for Subsidence Management. Structural Survey Journal 19(5), 222–232 (2001)
[5] Cojocariu, A., Munteanu, A., Sofran, O.: Verification, Validation and Evaluation of Expert Systems in Order to Develop a Safe Support in the Process of Decision Making. Computational Economics 0510002, EconWPA (2005)
[6] Yang, C., Kose, K., Phan, S., Kuo, P.: A Simulation-Based Procedure for Expert System Evaluation. In: Proceedings of the IEA/AIE 13th International Conference, New Orleans, June 19-22 (2000)
[7] Miranda, P., Isaias, P., Crisostomo, M.: Expert Systems Evaluation Proposal. In: Smith, M.J., Salvendy, G. (eds.) HCII 2007. LNCS, vol. 4557, pp. 98–106. Springer, Heidelberg (2007)
[8] Grabowski, M., Sanborn, S.: Evaluation of Embedded Intelligent Real-time Systems. Decision Sciences Journal 32(1), 95–123 (2001)
[9] Mauldin, E.: An Experimental Examination of Information Technology and Compensation Structure Complementarities in an Expert System Context. Journal of Information Systems 1, 19–41 (2003)
[10] Guida, G., Mauri, G.: Evaluating Performance and Quality of Knowledge-Based Systems: Foundation and Methodology. IEEE Transactions on Knowledge and Data Engineering 5(2), 204–224 (1993)
[11] Waterman, D.: A Guide to Expert Systems. Addison-Wesley, Reading (1986)
[12] Davis, F.: Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly 13(3), 319–340 (1989)
[13] Burgarelle, R., Carvalho, R.: Avaliação do Uso de Sistemas de Informação Académica por Alunos de Graduação em Ciência e a Informação. In: Actas do VIII ENANCIB – Encontro Nacional de Pesquisa em Ciência da Informação, Salvador, Bahia, Brasil (October 2007)
[14] Lawrence, L., Miller, W., Okogbaa, G.: Evaluation of Manufacturing Expert Systems: Framework and Model. The Engineering Economist 37, 293–314 (1992)
[15] Kirani, S., Zualkernan, I., Tsai, W.: Comparative Evaluation of Expert System Testing Methods. In: Proceedings of the 1992 IEEE International Conference on Tools with AI, Arlington, VA (November 1992)
[16] Lynn, M., Murray, M.: Expert System Domain Identification, Evaluation and Project Management: a TQM Approach. International Journal of Quality and Reliability Management 13(3), 73–83 (1996)
Investigating the Relationships between User Capabilities and Product Demands for Older and Disabled Users

Umesh Persad1, Patrick Langdon2, and P. John Clarkson2

1 Product Design and Interaction Lab, Centre for Production Systems, The University of Trinidad and Tobago, Trinidad and Tobago, West Indies
2 Cambridge Engineering Design Centre, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
[email protected], {pml24,pjc10}@eng.cam.ac.uk
Abstract. This paper presents the results of a study that specifically looks at the relationships between measured user capabilities and product demands in a sample of older and disabled users. An empirical study was conducted with 19 users performing tasks with four consumer products (a clock-radio, a mobile phone, a blender and a vacuum cleaner). The sensory, cognitive and motor capabilities of each user were measured using objective capability tests. The study yielded a rich dataset comprising capability measures, product demands, outcome measures (task times and errors), and subjective ratings of difficulty. Scatter plots were produced showing quantified product demands on user capabilities, together with subjective ratings of difficulty. The results are analysed in terms of the strength of the correlations observed, taking into account the limitations of the study sample. Directions for future research are also outlined. Keywords: Inclusive Design, Product Evaluation, User Capability Data, Disability.
1 Introduction

Inclusive design aims to consider the needs and capabilities of older and disabled users in the design process. This ensures that products and services are accessible and usable by the maximum number of users without being stigmatising or resorting to special aids and adaptations [1]. In previous work [2], a framework for analytical evaluation was presented, taking into account the principles of interaction design, human factors/ergonomics and psychology. This framework proposed the use of a database of capability measures to support analytical evaluation by matching the multivariate capability profiles of users with the demands placed on those capabilities by the designed product. Thus, an assessment of compatibility between user and product could be determined for representative groups and sub-groups in the population. Research has shown that predictions of real-world interaction problems in disabled populations produce variable results [3,4]. For example, in studying control operation by physically disabled users, Kanis found that he was able to accurately predict users'
difficulty in operating controls for a little more than 50% of the cases (after measuring maximum force exertions and the force required by the control) [3]. Steenbekkers et al. also concluded that laboratory measures have limited predictive value for difficulties experienced in daily life, and that it is not clear how individual measures of capability combine to enable task performance [4]. Most human factors data tend to be for relatively homogeneous populations, and using capability data to make real-world predictions of difficulty and exclusion for disabled people is not well understood [4,5,6]. As a precursor to further developing analytical evaluation methods and collecting human capability data to support these methods, this fundamental problem needs to be addressed. The study presented in this paper aims to shed some light on the predictive ability of user capability measures in the context of a capability-demand product interaction framework. It is a necessary first step to determine the relevant capability measures and their relationships with task outcome measures before developing valid and robust analytical evaluation methods for inclusive design.
2 Study Design

An empirical study was conducted in which participants used four consumer products chosen to represent activities of daily living: (1) a clock-radio, (2) a mobile phone, (3) a food blender and (4) a vacuum cleaner (Fig. 1). The study was conducted at the usability lab of the Cambridge Computer Laboratory. Prior to commencing the study, the four products were chosen and various characteristics were measured, including the sizes and colours of text, the sizes and colours of interface features (chassis, buttons, handles, etc.) and the push/pull/rotational forces required for activation. After ethical approval was obtained from the Cambridge Psychology ethics committee, older and disabled users were recruited from organisations in and around Cambridge such as the University of the Third Age (U3A), CAMSIGHT and the Hester Adrian Centre (Papworth Trust). In total, 19 participants were recruited and took part in the study. Participants first signed a consent form and were given a 10 GBP voucher for participating in the study. They were then asked questions to gather demographic, medical and product experience information. Participants were also asked to rate their experience with the four consumer products and to describe how they would go about using these products to perform tasks (one task per product). Data was recorded on a questionnaire sheet and via an audio recorder. Secondly, a series of capability tests was administered using a range of measurement devices. These included sensory tests of visual acuity, contrast sensitivity and hearing level; cognitive tests of short-term working memory, visuo-spatial working memory, long-term memory and speed of processing (reaction time); and motor tests such as push/pull forces exerted by each hand in different positions, walking speed and balance time. Participants had a short break after the sensory and cognitive capability assessment was performed. Some participants chose to take breaks during the capability testing session when they became tired. All capability testing data was recorded on a pre-designed testing sheet or, for the computer-based cognitive testing (CANTABeclipse from Cambridge Cognition), in a computer database.
Fig. 1. Products selected for the study (from the left): Matsui Clock Radio, Siemens Mobile Phone, Breville Blender, and Panasonic Vacuum Cleaner
Thirdly, participants performed one task with each of the products while being videotaped, with task order randomly assigned to avoid order effects. The tasks performed were: (a) clock radio: setting the time to 4.30 PM; (b) mobile phone: taking the ringer off via the menu; (c) blender: blending a banana and water as if making a smoothie; and (d) vacuum cleaner: vacuuming a piece of carpet until clean. These tasks were analysed for essential constituent actions before the commencement of the study. On completion of each task, subjective difficulty and frustration ratings were collected from each participant for these selected actions using a visual analogue scale ranging from 0 to 100. After completing the four tasks, participants were debriefed and thanked for participating in the study.
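The paper notes that task order was randomized to avoid order effects but does not give the exact scheme; the following minimal Python sketch (illustrative, not from the paper) shows one reproducible way such per-participant orders could be generated.

```python
import random

TASKS = ["clock radio", "mobile phone", "blender", "vacuum cleaner"]

def random_task_order(participant_id, tasks=TASKS):
    """Return an independently shuffled task order for one participant.

    Seeding with the participant ID keeps the assignment reproducible
    for the study records.
    """
    rng = random.Random(participant_id)  # per-participant reproducible RNG
    order = tasks[:]                     # copy so the master list is untouched
    rng.shuffle(order)
    return order

# Example: task orders for 19 participants
for pid in range(1, 20):
    print(pid, random_task_order(pid))
```

A fully counterbalanced scheme such as a Latin square would be an alternative when the number of participants is a multiple of the number of tasks.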
3 Results and Analysis

The collected data was entered into SPSS for statistical analysis, and the video data was analysed in Adobe Premiere Pro. The videos of each participant were analysed, and task times and errors were extracted. The SPSS data consisted of 19 participants (Mean Age = 62.68, SD = 9.20) with demographic data, capability data, and task outcome measures (times, errors, difficulty and frustration ratings) for each product task. The data was analysed by generating scatter plots of rated difficulty versus measured capability for constituent actions, and by analysing the strength of linear correlations. Fig. 2 shows the proportion of participants who attempted, succeeded or failed the task with each of the consumer products. Of the 16 participants who performed the clock radio task, 56% successfully completed the task, and of the 16 participants who performed the mobile phone task, 19% completed it successfully. Of the 19 participants who performed the blender task, 100% successfully completed the task, and of the 18 participants who performed the vacuum cleaner task, 100% completed it successfully. Thus the mobile phone task and the clock radio task had the highest and second-highest failure rates, respectively.
Fig. 2. Graph of proportion of participants who attempted, failed and succeeded each of the four tasks
Mean difficulty and frustration ratings were plotted for each product and compared. The mobile phone had the highest mean ratings across all the products for difficulty in starting the task (M=78.24, SD=30.00), difficulty in working out subsequent actions (M=84.59, SD=28.96), and overall mental demand (M=76.47, SD=30.25). The mobile phone also had the highest mean rating for frustration experienced during the task (M=48.89, SD=41.82). In terms of visual demands, the small text on the clock radio display was rated the most difficult to see (M=52.37, SD=34.78), followed by seeing the numbers on buttons (M=46.05, SD=36.84) and seeing the actual buttons (M=39.47, SD=41.16) on the mobile phone. The physical actions of opening (M=47.37, SD=30.25) and closing (M=38.68, SD=26.03) the blender cover and pushing the vacuum cleaner forward (M=28.06, SD=30.69) were also rated as being the most difficult actions on average. In terms of overall mental demands, the mobile phone ranked the highest (M=76.47, SD=30.25), followed by the clock radio (M=42.94, SD=37.54), the blender (M=28.11, SD=32.58) and the vacuum cleaner (M=27.94, SD=27.60). For mean frustration ratings, the mobile phone once again ranked the highest (M=48.89, SD=41.82), followed by the vacuum cleaner (M=28.33, SD=36.22), the clock radio (M=26.39, SD=39.91) and the blender (M=22.11, SD=36.03). In the following sections, the relationships between measured user capabilities and difficulty ratings are examined further. Due to space limitations, an overview of the results is given with illustrative examples.

3.1 Sensory and Motor Capabilities

Scatter plots between visual capabilities and rated difficulty in visual actions were generated as shown in Fig. 3. Similar graphs were also used for motor actions. The plots show an increasing user capability measure on the horizontal axis, while the vertical axis shows the rated difficulty score ranging from 0 to 100. A red vertical dashed demand line is plotted on each graph to indicate the specific demand of the product feature being considered.
Fig. 3. Graphs of visual capability and demand (top row) and motor capability and demand (bottom row)
Weak correlations (r values) were considered to be 0.4 and below, moderate between 0.4 and 0.7, and strong above 0.7, for both positive and negative correlations. The top row of Fig. 3 shows a fairly strong negative relationship between rated difficulty in reading numbers on the digital display of the clock radio and the contrast sensitivity measured for each participant: r(16) = -0.782, p < 0.01. It also shows a strong correlation for actions involving seeing product features, for example seeing the cord retractor button on the vacuum cleaner: r(14) = -0.771, p < 0.01. Five of eight actions involving seeing textual features produced strong correlations, while the remaining three produced weak to moderate correlations. Three of six actions involving seeing product features produced strong correlations, while the remaining three produced very weak correlations. The bottom row of Fig. 3 shows example correlations for motor actions. In considering motor actions and manipulations (pushing buttons, sliding and twisting, pushing and pulling, etc.), only weak to moderate linear relationships were found. For example, the left graph in the bottom row of Fig. 3 shows a weak linear relationship between finger push force and difficulty in pushing the clock radio buttons: r(17) = 0.105. Actions such as lifting the blender and moving the vacuum cleaner around showed better correlations. For example, in pushing the vacuum cleaner forward (right graph in the bottom row of Fig. 3), there was a moderate linear relationship with rated difficulty: r(13) = -0.564, p < 0.05.
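As an illustration of this analysis, the sketch below (in Python, with invented data; SciPy and matplotlib assumed available) computes a Pearson correlation, classifies its strength using the thresholds above, and draws a difficulty-versus-capability scatter plot with a vertical dashed demand line.

```python
import numpy as np
from scipy.stats import pearsonr
import matplotlib.pyplot as plt

# Hypothetical capability scores and 0-100 difficulty ratings for 8 users
capability = np.array([0.4, 0.7, 0.9, 1.1, 1.3, 1.6, 1.8, 2.0])
difficulty = np.array([95, 80, 70, 55, 40, 20, 15, 5])

r, p = pearsonr(capability, difficulty)
strength = "strong" if abs(r) > 0.7 else "moderate" if abs(r) > 0.4 else "weak"
print(f"r({len(capability) - 2}) = {r:.3f}, p = {p:.3f} ({strength})")

fig, ax = plt.subplots()
ax.scatter(capability, difficulty)
ax.axvline(x=1.0, color="red", linestyle="--", label="product demand")
ax.set_xlabel("Measured user capability")
ax.set_ylabel("Rated difficulty (0-100)")
ax.legend()
plt.show()
```

The degrees of freedom in the paper's r(df) notation correspond to n - 2 for a Pearson correlation, which is how the printed value is computed here.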
3.2 Cognitive Capabilities

In order to investigate the relationships between measured cognitive capabilities and task outcome measures, graphs were plotted of task time, errors, difficulty starting the task, difficulty in selecting subsequent actions and overall mental demand against four main cognitive variables: (1) short-term working memory (digit span), (2) visuo-spatial working memory (span length), (3) speed of processing (reaction time) and (4) long-term memory (GNTpercentcorrect). In the case of errors, short-term working memory and visuo-spatial working memory were found to correlate moderately with errors for the blender, r(17) = -0.493, p < 0.05, and the vacuum cleaner, r(16) = -0.516, p < 0.05. Visuo-spatial working memory showed some fairly strong correlations with blender errors, r(15) = -0.819, p < 0.01, and vacuum errors, r(14) = -0.700, p < 0.01. Long-term memory showed a significant relationship with blender errors, r(15) = -0.638, p < 0.01, vacuum errors, r(15) = -0.763, p < 0.01, and clock radio errors, r(14) = -0.502, p < 0.05. However, mobile phone errors had a weak linear correlation with long-term memory: r(14) = -0.059. Some of these relationships are shown graphically in Fig. 4.
Fig. 4. Example graphs of relationships between errors and short-term working memory, visuo-spatial working memory and long-term memory
Fig. 5. Example graphs of relationships between errors and a Euclidean model of cognitive capability
Fig. 6. Graph of relationships between overall mental demand ratings and self-rated experience
In order to further investigate the relationship between outcome measures and cognitive capabilities, the four cognitive variables were used to calculate a derived variable (scaled to an interval between 0 and 1) based on one of the following models: MAX, MIN, CITY-BLOCK and EUCLIDEAN. The MAX and MIN models used the maximum and minimum values of the four cognitive variables, respectively. The
CITY-BLOCK metric used the sum of the four cognitive variables, in this case equivalent to the arithmetic mean of the cognitive variables. The EUCLIDEAN metric is the square root of the sum of the squares of the four cognitive variables (a sketch of these composite models is given at the end of this section). Correlations between these cognitive models and task time, errors, difficulty starting the task, difficulty with the next action and overall mental demand were investigated. In general, all cognitive models produced very weak linear relationships with the outcome measures, except for the EUCLIDEAN cognitive capability model in relation to the blender errors and the vacuum cleaner errors shown in Fig. 5 (blender errors r(15) = -0.711, p < 0.01 and vacuum cleaner errors r(14) = -0.804, p < 0.01). Fig. 6 shows the relationships between overall mental demand and self-rated experience. The clock radio, r(14) = -0.631, p < 0.01, and vacuum cleaner, r(15) = -0.622, p < 0.01, correlations were moderate in strength. The mobile phone showed participants with relatively high ratings of mental demand even with moderate to high experience ratings, while the blender showed participants with low ratings of mental demand even though they had relatively low experience ratings.
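For concreteness, here is a minimal sketch of the four composite models described above, assuming each capability score has already been scaled to [0, 1]; the exact rescaling used in the study is not reported, so the normalisations here are illustrative assumptions.

```python
import numpy as np

def composite(scores, model):
    """Combine cognitive capability scores (each pre-scaled to [0, 1])
    into a single derived variable using one of the four models."""
    z = np.asarray(scores, dtype=float)
    if model == "MAX":
        return z.max()
    if model == "MIN":
        return z.min()
    if model == "CITY-BLOCK":
        # sum of the variables; dividing by their count keeps the result
        # in [0, 1] and makes it the arithmetic mean, as noted in the text
        return z.mean()
    if model == "EUCLIDEAN":
        # root-sum-of-squares, rescaled so the result stays in [0, 1]
        return np.sqrt((z ** 2).sum()) / np.sqrt(len(z))
    raise ValueError(f"unknown model: {model}")

# Hypothetical participant: digit span, spatial span, processing speed
# and long-term memory scores, each pre-scaled to [0, 1]
scores = [0.6, 0.8, 0.5, 0.7]
for m in ("MAX", "MIN", "CITY-BLOCK", "EUCLIDEAN"):
    print(m, round(composite(scores, m), 3))
```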
4 Discussion

The overview of results presented in the previous sections suggests that, given the limitations of the current study, measures of low-level visual, cognitive and motor capabilities in general correlate weakly to moderately with outcome measures such as time, errors and rated difficulty. In the case of vision, correlations were moderate to strong, indicating that the essential low-level capabilities utilised in real-world task performance were being captured and adequately modelled with a linear relationship. The blender and vacuum cleaner showed stronger correlations with cognitive variables such as visuo-spatial memory and long-term memory. This difference between product types is notable in that it indicates that different resources may be drawn upon depending on the complexity of the task at hand. It also indicates the possibility that multiple models may be needed to capture the workings of the human cognitive system in task performance. The EUCLIDEAN model used to combine the four cognitive variables correlated strongly with errors for the blender and vacuum cleaner, but not for the clock radio and mobile phone. Another type of model may therefore be needed to account for such cognitively demanding products. The results also indicate that there is scope for exploring alternatives to a linear reductionist model for describing human capability in disabled populations. It is possible that disabled people utilise multiple low-level capabilities in a non-linear way, relying on a system of accommodation and coping strategies that would be difficult to model accurately with simple linear models. The graphs also show that ratings of difficulty could have a large spread at a given level of capability. It is possible that multidimensional judgements are being made in rating difficulty with single actions, taking into account the level of difficulty with other actions in the task and the cognitive state of the user. The sample size (19 users) in the study was relatively small, and the results presented are indicative. However, the methodology for investigating capability-demand relationships with a large sample should be similar, as it simultaneously
captures all the essential user, product and task parameters for analysis of their interrelationships. The trade-off would be the increased cost and resources required for conducting a larger scale investigation.
5 Conclusions and Further Work The data collected is being analysed further to look at models consisting of multiple user capabilities and how they relate to errors, times and rated difficulty. This would involve the use of both linear and non-linear models to determine whether task outcome measures could be predicted using lower level sensory, cognitive and motor capability measures. Further studies are planned using a similar methodology and a larger sample size to investigate capability-demand interaction in older and disabled populations.
References 1. Keates, S., Clarkson, J.: Countering design exclusion - An introduction to inclusive Design. Springer, Heidelberg (2003) 2. Persad, U., Langdon, P., Clarkson, J.: Characterising user capabilities to support inclusive design evaluation. Universal Access in the Information Society 6(2), 119–135 (2007) 3. Kanis, H.: Operation of Controls on Consumer Products by Physically Impaired Users. Human Factors 35(2), 325–328 (1993) 4. Steenbekkers, L.P.A., VanBeijsterveldt, C.E.M., Dirken, J.M., Houtkamp, J.J., Molenbroek, J.F.M., Voorbij, A.I.M.: Design-relevant ergonomic data on Dutch elderly. International Journal for Consumer & Product Safety 6(3), 99–115 (1999) 5. Kondraske, G.V.: Measurement tools and processes in rehabilitation engineering. In: Bronzino, J.D. (ed.) The Biomedical Engineering Handbook, vol. 2, pp. 145-141 – 145-116. CRC Press, Boca Raton (2000) 6. Kondraske, G.V.: A working model for human system-task interfaces. In: Bronzino, J.D. (ed.) The Biomedical Engineering Handbook, vol. 2, pp. 147-141 – 147-118. CRC Press, Boca Raton (2000)
Practical Aspects of Running Experiments with Human Participants

Frank E. Ritter1, Jong W. Kim2, Jonathan H. Morgan1, and Richard A. Carlson3

1 College of Information Sciences and Technology, The Pennsylvania State University
2 Department of Psychology, University of Central Florida
3 Department of Psychology, The Pennsylvania State University
{frank.ritter,jhm5001,racarlson}@psu.edu,
[email protected]

Abstract. There can often be a gap between theory and its implications for practice in human-behavioral studies. This gap can be particularly significant outside of psychology departments. Most students at the undergraduate or early graduate levels are taught how to design experiments and analyze data in courses related to statistics. Unfortunately, there is a dearth of materials providing practical guidance for running experiments. In this paper, we provide a summary of a practical guide for running experiments involving human participants. The full report should improve the practical methodology for running studies on diverse topics in the thematic area of universal access in human-computer interaction. Keywords: Experiments, Human Participants, Universal Access.
1 Introduction

Joining the lab as a new research assistant to help run studies with human participants, you have come to help out and to learn. What do you do? Where do you start? How can you avoid common and easily fixable problems that even your colleagues and lab director might be unaware of because you are outside of a psychology department? All of these questions are related to practical guidelines for running an experiment. However, there are few guides available on the practical aspects of how to prepare and run experiments with human participants. In our experience, we have found that research assistants (RAs) and principal investigators (PIs) are taught how to design experiments and how to analyze data in courses such as Design of Experiments and Statistics. On the other hand, the lack of materials on running experiments can lead to a gap between theory and practice in this area, which is particularly acute outside of psychology departments. Consequently, labs frequently must not only impart these practical skills to students informally but must also address misunderstandings arising from this divorce of theory and practice in their formal education. Researchers in psychology often end up
appalled by the lack of this common but undocumented sense when it is reported by researchers applying psychology methods outside of psychology.

1.1 Why Do We Need a Practical Guide?

In general, scientific inquiries in the areas of human-computer interaction (HCI), human factors, cognitive psychology, and cognitive science involve human participants. One distinguishing factor of these disciplines, and thus experiments in these areas, has been the centrality of the human participant. Consequently, working in these areas requires understanding not only the theoretical and ethical issues involved in running studies with human participants but also the practical aspects of the process itself. To frame this discussion, we are working to provide an overview of this process and related issues.

1.2 Purpose of This Paper

In this paper, we will present a summary of a practical guide (Ritter, Kim, & Morgan, 2009) that can help RAs to run experiments effectively and more comfortably. Our purpose is to provide hands-on knowledge and actual experimental procedures. We are generally speaking here from a background rooted in cognitive psychology, cognitive ergonomics, and HCI studies. Because this is practical advice, we do not cover experimental design or data analyses, and it may be less applicable in more distant areas.

1.3 Who Is This Report Useful For?

We believe that this synopsis and the longer report are useful to anyone who is starting to run research studies, training people to run studies, or studying the experimental process. In particular, it is useful for students, teachers, lab managers, and researchers in industry. It is especially useful to computer scientists and other technologists who might run an empirical user study to test new ways to support universal access.
2 Contents

We focus on topics that are important for running HCI-related user studies concerning diverse populations and universal access. We also discuss the importance of repeatable and valid experiments, and the ethical issues associated with studies involving human participants.

2.1 Overview of the Components

Table 1 outlines several of the major components of the larger report. Here, we examine these components with respect to studies examining universal access for diverse populations.
Table 1. Important components for working with diverse populations

Scripting: Writing a script to ensure standard procedures are observed.
Missing subjects: How do you deal with participants who do not show up?
Decorum: How do you dress and how do you address the participants?
Recruiting: How do you recruit a diverse yet representative set of participants without unwanted bias?
Literature: What literature should you read to prepare for running a study?
Debriefing: How to debrief after a study session.
Payments: How to arrange payment for the participants, and the importance of performing this correctly.
Piloting: The need to run pilot subjects to practice the method, and also to find where the method (e.g., the script) needs to be modified.
Simulator studies: The role for simulated studies and how to treat model results as data.
Chances for insights: Reiterating the importance of being observant, and thus ready for further insights while running studies.
2.2 Repeatability and Validity

When running an experiment, ensuring its repeatability and validity is of greatest importance, assuming the experiment is conducted ethically. Running an experiment in exactly the same way for each participant is essential. In addition, reducing unwanted variance in the participants' behavior is important as well. Ensuring this repeatability is partly the job of the RAs, who often are not informed about these concepts and their practical application. Thus, senior colleagues should strive to make these concepts clear to RAs, while RAs should strive to provide each participant a consistent and comfortable but neutral testing experience. Understanding how participants will complete the task, and working towards uniformity across all iterations of the procedure for each subject, are important. The repeatability of the experiment is a necessary condition for scientific validity. There are, however, several well-known effects that can affect the experimental process. Chief among these is the experimenter effect, the influence of the experimenter's presence on the participants. It is important to be aware not only of this effect but also of how it can vary across experimenters. Depending upon the experimental context, the experimenter effect can lead to either increased or decreased performance. The magnitude and type of the effect generally depend upon the type and extent of personal interaction between the participant and experimenter. Thus, you should strive to provide each participant a comfortable but neutral testing experience. Besides the experimenter effect, there are other risks to the experimental process. We highlight some here and illustrate how to avoid them, either directly or through proper randomization. Randomization is particularly important because you will most likely be responsible for implementing treatments, while understanding the other risks will help you take steps to minimize them. Finally, there are other experimental effects that are outside of your control; we do not cover these here. Even though you
cannot eliminate all contingent events, you can note idiosyncrasies and, with help from the principal investigator, either correct them or report them as a potential problem. Another common source of variation across trials is the effect of the experimental equipment. For instance, if you are having participants interact with a computer or other fixed display, you should take modest steps to make sure that the participant's distance to the display is the same for each subject—this does not mean, necessarily, putting up a tape measure, but in some cases, it does. It is necessary to be aware that the viewing distance can influence performance and in extreme cases can lead to blurred vision, irritated eyes, headache, and movement of torso and head (e.g., Rempel, Willms, Anshel, Jaschinski, & Sheedy, 2007). These factors, in that they represent deviations in the testing protocol, can be risks to validity. Furthermore, if subjects are picking up blocks or cards or other objects, the objects should either always be in the same positions or always be randomly placed, because some layouts of puzzles can make the puzzles much easier to solve. The experimental setup should be consistent across all trials. There will be other instances where variations in the apparatus can lead to unintended differences, and you should take advice locally to learn how to reduce these risks.

2.3 Ethics

There are several topics that you need to keep in mind when working with participants. Chief among these are the ethics pertaining to experimental procedures, and the gathering and reporting of data, including published and unpublished documents. If you have any questions, you should contact the lead researcher (or principal investigator), or other resources at your university. Because we would like to generalize the results to a wide population, indeed the whole population if possible, it is critical to recruit a representative sample of the population in question. It has been noted by some observers that experimenters do not always recruit from the whole population. In some studies, there are good reasons for recruiting more heavily from one sub-group, either because of the increase in risk to vulnerable groups (e.g., non-caffeine users in a caffeine study), or because of limited access (e.g., treatment programs for drug addicts). In these cases, there are principled procedures for ensuring reliability. Where possible, however, experimenters should recruit a representative population. This may mean putting up posters outside your department, and it may entail paying attention to the proportion of participants of any one sex or age among the recruits. Where the proportions are detrimental, correcting these imbalances will most likely require more recruiting. As a research assistant, you can be the first to notice this and to help address these issues by bringing them to the attention of the investigator. Coercion is a violation of the human rights of participants. It is necessary to avoid any procedures that restrict participants' freedom of consent regarding their participation in a study. Some participants, including minors, patients, prisoners, and individuals who are cognitively impaired, are more vulnerable to coercion. For example, enticed by the possibility of payments, minors might ask to participate in a study. If, however, they do so without parental consent, this is unethical because they are not old enough to give their consent—agreements by a minor are not legally binding.
Students are also vulnerable to exploitation. The grade economy presents difficulties, particularly for courses where a lab component is integrated into the curriculum. In these cases, professors must not only offer an experiment relevant to the students' coursework but also offer alternatives to participating in the experiment. To address these problems, it is necessary to identify potential conditions that would compromise the participants' freedom of choice. For instance, in the example just given, recall that it was necessary for the professor to provide an alternative way to obtain credit. In addition, this means ensuring that no other form of social coercion has influenced the participants' choice to engage in the study. Teasing, taunts, jokes, inappropriate comments, or implicit quid pro quo arrangements are all inappropriate. These interactions can lead to hard feelings (that's why they are ethical problems!), and the loss of good will towards you, your lab, and potentially science in general. When preparing to run a study, you must have procedures for handling sensitive data. There are at least two categories of which to be aware: (1) sensitive data that you have identified and prepared for before the experiment; and (2) unexpected sensitive data that arises during the course of the experiment. Personal data, or data that is intrinsically sensitive, should be handled carefully. Information on an individual pertaining to his or her race, creed, gender, gender preference, religion, friendships, etc., must be protected. This data should not be lost or mislaid. It should not be shared with people not working on the project, either formally if you have an IRB that requires notice, or informally, if the IRB does not have this provision (formal tracking of who has access to experimental data occurs more rigorously in the US than in some other countries). You should seek advice from your colleagues about what practices are appropriate in your specific context. In some situations, you are not allowed to take data from the building, and in most cases, you are encouraged to back it up and keep the backed-up copy in another safe and secure location. In nearly all cases, anonymizing data, that is, removing names and other ways data can be associated with a particular individual, removes most or all of the potential problems. Data that arises during the experiment (e.g., the subject's responses) can have implications beyond the scope of the study. This can include subjects implicating themselves in illegal activity, or unintentionally disclosing an otherwise hidden medical condition. For example, if you are administering caffeine, and you ask subjects what drugs they take in order to avoid known caffeine agonists or antagonists, you may learn about illegal drug use. If you take the participant's heart rate or blood pressure, you may discover symptoms of an underlying disease. It is important to prepare experimenters for these situations. Generally, preparation for a study should involve discussions about how to handle sensitive data, and about whether there is a chance that the study may reveal sensitive data about the participants. You should fully understand your institution's policies regarding sensitive data, and how to work with participants when sensitive information becomes an issue. If you have questions, you should ask the principal investigator.
3 Major Aspects of Working with Diverse Populations

Nearly every aspect of preparing and running a study is influenced by the precautions necessary for studies of universal access and diverse populations.
We examine just a few, noting that some researchers may find other issues more important for their work.

3.1 Recruiting

Recruiting participants for your experiment can be a time-consuming and potentially difficult task, but recruiting is important for producing meaningful data. An experimenter should therefore plan recruitment carefully with the lead researcher (or principal investigator). Ask yourself, "What are the important characteristics that my participants need to have?" Your choices will be under scrutiny, so having a coherent reason for which participants are allowed or disallowed into your study is important. First, it is necessary to choose a population of interest from which you will recruit participants. For example, if an experimenter wants to measure the learning effect of foreign language vocabulary, it is necessary to exclude participants who have prior knowledge of that language. On the other hand, if you are studying bilingualism, you will need to recruit people who speak two languages. In addition, it may be necessary to consider age, educational background, gender, etc., to correctly choose the target population. Second, it is necessary to decide how many participants you will recruit. The number of participants can affect your final results. The more participants you can recruit, the more reliable your results will be. However, limited resources (e.g., time, money, etc.) often force an experimenter to find the minimum number of participants. You may need to refer to previous studies to get some idea of the number of participants, or you may need to calculate the power of the sample size for the research study, if possible (most modern statistics books discuss this and teach you how to do it, e.g., Howell, 2007); a minimal example is sketched below. Finally, you may have to consider whether the sample is, in fact, too large (e.g., online surveys), in that the sample's size is either wasteful of resources or likely to exaggerate any correlations found in the data. With large sample sizes, trivial or meaningless effects can be found to be statistically significant (reliable). This is not a normal problem, but if you arrange to test a large class or use online data you may encounter it. There are several ways that participants can be recruited. The simplest way is to rely on the experimenters to find participants. In simple vision studies, this is often done because the performance differences between people in these types of tasks are negligible, and knowledge of the hypothesis being tested does not influence performance. Thus, the results remain generalizable even with a small number of participants. An alternative to personal recruiting specifically for the task is to use a sample of convenience. Samples of convenience consist of people who are accessible to the researcher (e.g., as in classroom-based research). Many studies use this approach, so much so that it is not often mentioned. Generally, for these studies, only the sample size and some salient characteristics that might possibly influence the participants' performance are noted. These factors might include age, major, sex, education level, and factors related to the study, such as nicotine use in a smoking study, or the number of math courses in a tutoring study. There are often restrictions on how to recruit appropriately, so stay in touch with your advisor and/or IRB.
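For the power question raised above, here is a minimal sketch using the statsmodels library (assumed available); the effect size, alpha, and power values are illustrative choices, not recommendations from the guide.

```python
# A-priori sample size for a two-group comparison (independent t-test)
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,        # medium effect (Cohen's d), an assumed value
    alpha=0.05,             # significance level
    power=0.8,              # desired statistical power
    alternative="two-sided",
)
print(f"participants needed per group: {n_per_group:.1f}")  # about 64
```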
In studies using samples of convenience, try distributing an invitation e-mail to a group mailing list (e.g., students in the psychology department or an engineering department), done with the approval of the list manager and your advisor. Also, you can post recruitment flyers on a student board, or an advertisement in a student newspaper. Use all the resources and channels available to you efficiently. There are disadvantages to using a sample of convenience. Perhaps the greatest is that the resulting sample is less likely to lead to generalizable results. The subjects you recruit are less likely to represent a sample from a larger population. Students who are participants often differ from other students. For instance, selection bias is a potential issue because they are very likely already interested in experimental methods and the hypotheses behind them, making them more vulnerable to experimenter effects. Furthermore, the sample itself may have hidden variability in it. The subjects you recruit from one method (e.g., e-mailing them) or from another method (using a poster) may be different. We also know that they differ over time: those who come early to fulfill course requirements are generally more conscientious than those who come later. So, ensure that these participant types are randomly assigned to the various conditions of your study (a sketch of one way to do this follows below). The largest and most carefully organized sampling group is a random sample. In this case, researchers randomly sample a given population by carefully applying sampling methodologies meant to ensure statistical validity by making participant selection equally likely across the given population. Asking students questions at a football game as they go in does not constitute a random sample—some students do not go (selection bias). Other methods, such as selecting every 10th student based on a telephone number or ID, introduce their own biases. For example, some students do not have a publicly available phone number, and some subpopulations register early to get their ID numbers. Truly choosing a random sample is difficult, and you should discuss how best to do this with your lead researcher. One approach for recruiting participants is a subject pool. Subject pools are generally groups of undergraduates who are interested in learning about psychology through participation. Most psychology departments organize and sponsor subject pools. Subject pools offer a potential source of participants. You should discuss this as an option with your lead researcher and, where appropriate, learn how to fill out the requisite forms. If the students in the study are participating for credit, you need to be particularly careful with recording who participated, because the students' participation and the proof of that participation represent part of their grade. A whole book could be written about subject pools. Subject pools are arrangements that psychology or other departments provide to assist researchers and students. The department sets up a way for experimenters to recruit subjects for studies. Students taking particular classes are either provided credit towards the class requirement or extra credit. When students do not wish to participate in a study, alternative approaches for obtaining course credit are provided. The theory is that participating in a study provides additional knowledge about how studies are run, and provides the participant with additional knowledge about a particular study. Researchers, in turn, receive access to a pool of potential subjects.
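As promised above, here is a hedged sketch of blocked random assignment, one standard way to spread participant types (e.g., early versus late sign-ups) evenly across conditions; the function and condition names are illustrative, not from the guide.

```python
import random

def assign_conditions(participant_ids, conditions, seed=42):
    """Assign participants to conditions in shuffled blocks so that
    every block of len(conditions) sign-ups covers all conditions once."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    assignment = {}
    block = []
    for pid in participant_ids:
        if not block:              # refill and reshuffle at each block boundary
            block = list(conditions)
            rng.shuffle(block)
        assignment[pid] = block.pop()
    return assignment

# Example: 12 sign-ups spread over three conditions, 4 per condition
print(assign_conditions(range(1, 13), ["control", "treatment A", "treatment B"]))
```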
3.2 Experimental Methods: Background Readings for Your Study

Many introductory courses in statistics focus primarily on introducing the basics of analysis of variance (ANOVA) and regression. These tools are unsuitable for many studies analyzing human subject data where the data is qualitative or sequential. Care, therefore, must be taken to design an experiment that collects the proper kinds of data. If ANOVA and regression are the only tools at your disposal, we recommend that you find a course focusing on the design of experiments featuring human participants, as well as the analysis of human data. We also recommend that you gather data that can be used in a regression, because regression supports stronger predictions: not just that a factor influences a measure, but in what direction (!) and by how much. Returning to the topic of readings, it is generally useful to have read in the area in which you are running experiments. This reading will provide you further context for your work, including discussions about methods, types of subjects, and pitfalls you may encounter. For example, the authors of one of our favorite studies, an analysis of animal movements, note that data collection had to be suspended after the researchers were chased by elephants! If there are elephants in your domain, it is useful to know about them. There are, of course, less dramatic problems such as common mistakes subjects make, correlations in stimuli, self-selection biases in a subject population, power outages, printing problems, or fewer participants than expected. While there are reasons to be blind to the hypothesis being tested by the experiment (that is, not knowing what treatment or group the subject you are interacting with is in, so that you do not implicitly or inadvertently coach subjects to perform in the expected way), if there are elephants, good experimenters know about them, and prepared research assistants particularly want to know about them! As a result, the reading list for any particular experiment is both important and varies. You should talk to other experimenters, as well as the lead researcher, about what you should read as preparation for running or helping to run a study.

3.3 Piloting

Conducting a pilot study based on the script developed for the research study is important. Piloting can help you determine whether your experimental design will successfully produce answers to your inquiries. If any revision to the study is necessary, it is far better to find it and correct it before running multiple subjects, particularly when access to subjects is limited. It is, therefore, helpful to think of designing experiments as an iterative process characterized by a cycle of design, testing, and redesign. In addition, you are likely to find that this process works in parallel with other experiments, and may be informed by them (e.g., lessons learned from ongoing related lab work). Thus, we highly recommend that you use pilot studies to test your written protocols (e.g., instructions for experimenters). The pilot phase provides experimenters the opportunity to test the written protocols with practice participants, and is important for ironing out misunderstandings, discovering problematic features of the testing equipment, and identifying other conditions that might influence the participants. Revisions are a normal part of the process; please do not hesitate to revise your
protocols. This will save time later. There is also an art to knowing when not to change the protocol. Your principal investigator can help judge this! It is also useful at this stage to write the method section of your paper. Not only is your memory much fresher, but you can also show other researchers your method section and receive suggestions from them before you run the study—definitely a good time to get suggestions. These suggestions can save you a lot of time, in that these reviews essentially constitute another way of piloting the study.

3.4 Chances for Insights

Gathering data directly can be tedious, but it can also be very useful and inspiring. Gathering data gives you a chance to obtain insights about aspects of behavior that are not usually recorded, such as the user's questions, their posture, and their emotional responses to the task. Obtaining these kinds of insights and the intuition that follows from these experiences is an important aspect of the experimental process, and gathering data is particularly important for young scientists. It gives them a chance to see how previous data has been collected and how studies work. Reading will not provide you this background or the insights associated with it; rather, this knowledge only comes from observing the similarities and differences that arise across multiple participants in an experiment. So, be engaged as you run your study and then perform the analysis. These experiences can be a source for later ideas, even if you are doing what appears to be a mundane task. In addition, being vigilant can reduce the number and severity of problems that you and the lead investigator will encounter. Often, these problems may be due to changes in the instrument or to external events. For example, current events may change word frequencies for a study on reading. Currently, words such as bank, stocks, and mortgages are very common, whereas these words were less prevalent three or four years ago.
4 Conclusions

Once a science is mature, practitioners know its methods; while a science is still growing, however, methods and procedures have to be taught more explicitly. Furthermore, when methods move between areas (e.g., techniques associated with behavioral studies moving from psychology to computer science and engineering), there must be an effort not only to document and disseminate these methods but also to formally transfer them, with greater attention given by senior investigators. In our presentation we provide practical advice regarding conducting experiments with human participants. We are working on extending and polishing a written guide that will be useful to anyone who is starting to run research studies, training people to run studies, or studying the experimental process. We expect this guide will be particularly helpful to students who are not in large departments, or who are running participants in departments that do not have a long history of conducting human-based research.
Currently, the report is in use at five universities in the US, Canada, and England for graduate and advanced undergraduate courses in Cognitive Science, Human Factors engineering, and in Human-Computer Interaction courses. As a colleague noted, this contains just common sense. In this case, we have found that the common sense is not so common, and that new researchers, both students and those taking up a new methodology, need a good dose of common sense.
Acknowledgements. This work was sponsored by ONR (W911QY-07-01-0004 and #N00014-10-1-0401).
References 1. Howell, D.C.: Statistical methods for psychology, 6th edn. Thomson, Belmont (2007) 2. Rempel, D., Willms, K., Anshel, J., Jaschinski, W., Sheedy, J.: The effects of visual display distance on eye accommodation, head posture, and vision and neck symptoms. Human Factors 49(5), 830–838 (2007) 3. Ritter, F.E., Kim, J.W., Morgan, J.H.: Running behavioral experiments with human participants: A practical guide (Tech. Report No. 2009-1): Applied Cognitive Science Lab, College of Information Sciences and Technology, The Pennsylvania State University (2009)
A Genesis of Thinking in the Evolution of Ancient Philosophy and Modern Software Development Stephan H. Sneed jCOM1, Münchner Straße 29 - Hettenshausen 85276 Pfaffenhofen, Germany
[email protected]

Abstract. This paper is a brief discussion of the issue of modeling evolution. The question posed is where modeling theory comes from and where it is going. A parallel is drawn between the evolution of modeling theory and ancient philosophy. In the end, both must come to the conclusion that a theory must be applicable to a human problem and must lead to a solution of that problem. Otherwise, it is useless. It is pointed out that some methodological approaches are even detrimental to reaching a good solution. They only absorb costs and effort and lead nowhere. Just as Aristotle rounded out the ancient philosophical discussion, the S-BPM method appears as the next logical step in the evolution of modeling methods. In the end, some research issues are discussed, for the S-BPM method as well as for modeling comparison in general. Keywords: SA, OOA, OOP, S-BPM, paradigm change, modeling methods, programming languages, software development, evolution.
1 Introduction

Before introducing a next step in this evolution, one needs to answer some questions: How do we even determine the value of a modeling or programming method? Is there a significant difference between programming and modeling languages? If modeling languages are only meant to describe a system, they can differ from programming languages, but if they are intended to be executed they take on all the characteristics of a programming language. In the end, both are formal, artificial languages. If so, then all the experience gained with programming languages must apply as well to the evolution of modeling languages. By looking at the evolution of human thinking in ancient philosophy from Parmenides to Plato via Socrates and on to Aristotle, we will see that the evolution of human thought has much in common with the evolution of software development methodologies. We will also see that Aristotle is even one step ahead of IT, for which object technology is the last major evolutionary step. Philosophy in general is about constructing descriptive models of the real world that are easier to comprehend than the real world itself. This is what philosophy has in common with design theories aimed at achieving higher-level abstractions of machine code in a graphical notation. They are easier to comprehend than the machine code which is finally executed, which was the main motivation to develop them.
2 Formal Theories in IT

Let us define executable machine instructions as the base of any abstraction, with index 0, and therefore as atomic elements of higher-level abstractions. Any higher-level abstraction of machine code is a formal theory above this base with an index n. Any model consisting of more than one atomic algorithm of such a formal theory is called a system. The problem with these definitions is that they are recursive and not bounded. Two instructions can already form a system. A function is already a system, as is anything that is not atomic like a single instruction or a single data item. Hence, referring to something as a system consisting of functions that consist of instructions is absolutely arbitrary. The fact is that we have systems encapsulated by other systems, and so on. There can be an infinite nesting of systems. So each programming language is an abstraction with an index n of the atomic instructions at level 0, and hence a formal IT theory. Each executable program written in this language is a model of this theory and can be interpreted by the computer. In the case of modeling methods, one has to distinguish between those that are executable and those that are not. Some modeling languages like SADT were never intended to be executable. Originally UML was not intended to be executed. Now it is. In any case, executable modeling languages (or their executable subsets) are formal IT theories about the code they govern. A model of such a theory can be executed, hence is an abstraction of machine code and hence an IT system. This brings us to an interesting point: What is actually the difference between a programming language and an executable modeling method? They have the following aspects in common:

• Programming languages and modeling methods are both level n abstractions of machine code.
• Models of programming languages and models of modeling methods are called IT systems.

The aspects where programming languages and modeling methods differ are the following:

• The algorithms in programming languages have a textual representation, whereas the representation of algorithms in an executable modeling method is graphical. Actually, most modeling languages reach a point where graphics no longer suffice. Then they revert to text in tags, like OCL as an extension of UML.
• Usually the modeling method is an abstraction of a certain programming language and has the abstraction index n+1 when the programming language is of abstraction index n.

It becomes clear that the type of algorithm representation, either textual or graphical, is not crucial for the theory itself. We could well think of a graphical Java editor where an "if statement" could be placed in a diamond-shaped form as with structured analysis. Thus, the representation of algorithms is actually not relevant, since it can be replaced by other representations. The other aspect is the degree of the abstraction level. A modeling method will usually have a higher
abstraction index than a programming language, but this aspect does not imply a causal relationship. We could also think of a modeling method that is directly based on a particular machine code. In the end, there is no real difference between programming languages and executable modeling methods. They are both nothing but abstractions at a certain level n of machine code. Both are developed in order to reduce complexity and to make the systems described with them easier to comprehend. Thus both of them have to be treated equally when talking about their value to the user.
3 Formal IT Theories and Their Value

Value can be expressed in terms of cost effectiveness, reduced time and enhanced comprehension. What does cost effectiveness mean in the sense of formal IT theories? In the first place, any business IT system has the purpose of enhancing user revenues. So to be of value, a language or model must be cost effective, i.e. inexpensive to use. A language which requires a longer time to formulate is less valuable than one which fits the problem better and can be expressed more quickly. General-purpose languages are generally not suited to particular problems and require more time to formulate. This is the advantage of domain-specific languages. The third potential value is more problematic. What does it mean to be comprehensible? Are there certain properties that make a formal language easier to comprehend than others? Are they objective? Can they even be? Nietzsche says that every product of the human mind is only relevant for humans. So, according to him, there are no general criteria for good and bad languages, and he gives good reasons for his claim [Nietzsche1999]. Languages can only be judged in terms of their value to the user. If so, the preferable language would be the one that requires the fewest mappings between the human user and the IT system.
Fig. 1. The figure shows a simple and a complex mapping. The simple mapping occurs where real-world (or natural language) and modeling-world entities are the same, while the complex mapping occurs where they differ.
4 A Genesis of Thinking

It is, of course, not possible to discuss the complete evolution of Greek philosophy and IT programming and modeling methods within this short paper. But there is also no need for this. It suffices to concentrate on those key aspects that determined a new evolutionary step. The evolution of model construction seems to be common to both ancient philosophy and modern system design. The requirements for proper models, the arguments for and against certain approaches, as well as the criticism and condemnation, follow exactly the same track. This evolution, "based on the problem and dictum of Parmenides, describes [in both disciplines] a curious curve of thinking" [Buchheim2003] that will be described in this chapter.

Parmenides and Bottom-up System Design. The beginning of software design methodology coincides with the point in time at which the ancient Greeks started to reflect on the nature of the world and the reason for their being. Heraclitus, a contemporary of Parmenides, pointed out the difficulties one has when discussing an ever changing and evolving world, while Parmenides reflected on the construction of models as well: "The one most famous claim made by Parmenides that can serve for constructing any ontology" [Buchheim2003] is the following: "What is thought must correspond to what exists" [Parmenides, Fragments, B3]. This claim of Parmenides presupposes a tight relationship between the intelligible and the existing entities. The fulfillment of this axiom is what we will be pursuing when comparing the evolution of ancient philosophy with that of system modeling. Regarding ancient philosophy we have to rely on the conclusions reached at that time, but as for system design, i.e., business engineering, we can determine whether a model fulfills the claim of Parmenides or not: the claim of Parmenides is fulfilled by a modeling approach in IT when the respective model can be automatically transferred into coding instructions performing specific functions with interfaces to the outer world, that is, into an executable system. This is true only if the model is an abstraction of the underlying machine code.

In software design, the first approaches to dealing with program complexity were the hierarchical, functionally structured approach of IBM in the USA (HIPO) and the hierarchical, data-structured approaches of Jean Dominique Warnier in France (LCP) [Warnier1974] and Michael Jackson in England (JSP) [Jackson1983]. The situation in software development is, according to Warnier, something he calls the space and time problem, where one imagines how a program is executed in time and writes down the instructions to be executed in space [Warnier1974, p. 15]. He claimed that "this approach is not satisfactory, for it leads us to attach equal importance both to details and to essential points; it must be replaced by a hierarchical approach which allows us to go from abstractions to details, i.e. from the general to the particular" [Warnier1974, p. 15]. This led to a structured approach, in which data is grouped according to some criteria in a hierarchy of data types, where each node of the hierarchy can be a set of data or an elementary data attribute. This data hierarchy was then mapped into a corresponding hierarchy of functions, so that for every node of the data tree there was a corresponding node in the function tree for processing it. There are in the end as many functional hierarchies as there are input/output data structures, as the sketch below illustrates.
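As a minimal illustration of this data-to-function correspondence, consider the following Java sketch. The report structure and all names are hypothetical; the point is only that each node of the input data hierarchy (a report consisting of a heading and an iterated set of detail lines) receives exactly one corresponding processing function, in the spirit of Warnier's LCP and Jackson's JSP.

import java.util.List;

// Data hierarchy: Report = Heading + DetailLine*
record Heading(String title) {}
record DetailLine(String item, double amount) {}
record Report(Heading heading, List<DetailLine> lines) {}

class ReportProcessor {
    // Function hierarchy mirroring the data tree, node for node.
    void processReport(Report r) {          // root of both trees
        processHeading(r.heading());
        for (DetailLine line : r.lines())   // the iteration node
            processDetailLine(line);
    }

    void processHeading(Heading h) {
        System.out.println(h.title());
    }

    void processDetailLine(DetailLine d) {
        System.out.printf("%s %.2f%n", d.item(), d.amount());
    }
}

Because the function tree is isomorphic to the data tree, a change in the data structure dictates exactly where the program has to change; this property is what made the bottom-up, data-driven style attractive.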
Each functional hierarchy corresponds to a subprogram. A program consists of all subprograms derived from the data structures used by that program. This hierarchical, data-oriented approach introduced by Warnier was a means of modeling IT systems from the bottom up, based on the data they use. He also proposed a set of diagrams for depicting the data structures and their corresponding functional structures. The major reason why his method was not widely accepted was the difficulty involved in drawing those diagrams. Michael Jackson in England came up with a similar approach, but his tree diagrams were much more intuitive and easier to draw. Therefore the Jackson method of modeling computer programs became much more widespread. Furthermore, the Jackson structured diagrams could be mapped 1:1 into a program design language from which COBOL programs were generated. Thus, there was a direct, uninterrupted path from the model to an executable system. The prerequisite was of course that the model had to be at the same semantic level as the program itself, so that in the end the model is the program. Every single data item and elementary operation is contained within the model, thus fulfilling the premise of Parmenides. However, what was the advantage of this method other than changing the textual representation into a graphic one?

Socrates and Plato, Structured Analysis and Object Orientation. With the increasing power of the computer, the size of IT systems increased as well. By the end of the 1970s the average commercial program size had surpassed 2,000 statements, rendering it impossible to apply a bottom-up approach with the then existing tools. This led to the so-called software crisis of the 1970s, a crisis which continues even after 30 years. "The major cause of this crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem" [Dijkstra1972]. Thus the situation of any IT manager could well be described as that of a human striving for the wisdom of god when trying to understand all the algorithms (whether graphical or textual) under his responsibility. But Socrates claimed: "Human wisdom is worth little or nothing at all" [Socrates, Apology 23a]. So, according to Socrates, it is not possible for humans to attain the complete and definite truth. The only thing that can be done is to try to get close to it by human assumptions and the testing of hypotheses, not knowing where this may lead in the end. This is exactly the thesis put forward by the advocates of agile development. They claim it is impossible to model and to plan a complex IT system in advance. The only way to construct one is through trial and error. "While the pre-Socratic philosophy had been about achieving the truth by reducing concepts to existing reality, Socrates was looking for the most accurate alphabetization of the claimable, which was directed towards searching for the truth, but without ever really obtaining it" [Buchheim2003, p. 208]. This is equally true for agile development using object-oriented design. The analyst starts to study a given problem ("Analysis is the study of a problem..." [DeMarco1978, p. 4]) whose solution may, after many months of discussion with the user and the development team ("The analyst is the middleman between the user, who decides what has to be done, and the development team, which does it." [DeMarco1978, p. 7]), turn out to be completely different in the end than it might have appeared at the beginning. Socrates knew quite well that
human beings cannot grasp complex reality, as the following quote shows: "That which we believe to understand clearly, we should look upon skeptically, for it is surely in error, even though we might all agree that we understand it" [Socrates, Sophistes 242a]. This was the main motivation for the structured and, later, the object-oriented design methods. "The principal goal of Structured Analysis is to minimize the probability of critical analysis phase errors" [DeMarco1978, p. 9]. If errors are to be avoided within the analysis phase, they are avoided by constructing the proper models, gained by proposal and discussion. To quote the literature, "OO analysis and design [...] are powerful ways to think about complex systems" [Martin1992, p. xi]. Hence all analysis and its models, whether structured or object-oriented, serve as a basis for discussion to avoid errors within the analysis phase, thus corresponding to the philosophical approach of Socrates: "The highest good for man is to produce [trains of thought] each day that you hear me talking about in discussions and thereby put myself and others in question" [Buchheim2003, p. 210].

It was pointed out before that Socrates had his doubts about the Parmenidean approach of depicting reality in accordance with a single detailed view of the world. He was quoted as saying "Human wisdom is insignificant compared to the true wisdom of god that is not at human disposal" [Buchheim2003, p. 206]. The wisdom of god in terms of information technology would mean nothing less than knowing the content of each and every single instruction in a system, or its representation within a model. Creating models by abstracting elementary statements in the way that Warnier and Jackson did might work for small programs; with the increasing power of the computer, however, the size of IT systems grew beyond this. The entities of DeMarco's modeling approach are not physical entities like modules, interfaces and code blocks. They are abstract features defined in a specification made independently of the machine code. This means it was never intended to derive executable programs from the model. DeMarco proposed to stop the top-down decomposition at the point where one believes that the "lowest-level bubbles can be completely described in a mini-spec of about one page" [DeMarco1992, p. 84].

Object-oriented analysis and design is more of a bottom-up approach in the sense of Warnier and Jackson. It starts with the data objects of a system and determines what happens with them. Once the building blocks are determined, the designer then decides what to do with them. The object model is really derived from the data model, as depicted by Shlaer and Mellor in their book subtitled "Modeling the World in Data" [ShlaerMellor1988]. Unlike structured design, which was never intended to be used as a basis for code generation, object-oriented design can correspond to the physical system, i.e., to the code, provided it is broken down far enough. The UML-2 language makes this possible, as claimed by Ivar Jacobson in his keynote speech at the International Conference on Software Maintenance in Florence in 2001 with the title "UML – all the way down" [Jacobson2001]. By that he meant that with the new UML-2 it would be possible to go all the way from the requirements to the code. There would be only one language required, and that is UML. The problem with this one-language approach is that the one language inevitably becomes just another programming language. As Harry M. Sneed noted in his article "The Myth of Top-Down Software Development", quoting DeMillo, Perlis and
Lipton, "'the program itself is the only complete and accurate description of what the program will do.' The only way to make the specification a complete and accurate description of the program is to reduce the specification to the semantic level of the program. However, in doing this, the specification becomes semantically equivalent to the program" [Sneed1989, p. 26]. So the OOA and OOD methods are nothing more than graphical representations of OO programming languages. The relevant principles of OO programming languages are inheritance, encapsulation (information hiding), locality of reference and polymorphism [Rumbaugh1991]. Each and every class extends the all-integrating superclass "Object". This corresponds exactly to Plato's teaching of the ideas. Individual ideas are all derived from a single overriding idea, the idea of the good, and each lower idea holds its part in the higher one. With his notion of the good, Plato introduced something that he assumed to be "the organizing principle of the world" [Buchheim2003, p. 211]. The benefit of this concept is that information can be outsourced to the superclasses (in a complex inheritance structure) in order to minimize the size of the lower-level classes. But, of course, this requires a very intelligent design of the superclasses, a degree of intelligence most software developers do not possess. "Our constitution will be well governed if the enforcement agencies are always aware of and adhere to the higher order governing principles" [Plato, Politeia 506b]. Probably the perfect inheritance structure is just as much a utopia as Plato's perfect state described in the Politeia. If we define a superclass for creatures that walk on two legs and call them humans, as well as one class for animals that fly through the sky, called birds, what do we do with the ostrich? Often things have less in common than we think they have.

The Next Step of the Evolution: Aristotle and S-BPM. The advantages of object-oriented design and object-oriented programming are clear: the top-down approach of structured analysis and UML-1 makes things simple and quick to discuss, and the most important parts receive the deepest discussion, as Socrates proposed. Object-oriented design languages, e.g., UML, attempt to be both a descriptive language and an implementation language in the sense of Plato's universal, all-encompassing model. Aristotle's criticism of his predecessors in ancient philosophy applies equally well to the early design methods. SD and UML-1, as well as non-formal BPM methods (like ARIS), were never intended to be abstractions of the physical solution, i.e., of the machine code. They exist only on paper or as a graphical model within some drawing tool. In fact, they do not have any guaranteed reference to the executable machine code. They may be describing something totally different from what has been implemented. And since the code changes while the design documentation remains static, the model soon loses its value, if it ever had one. In our fast-changing world they represent a lost investment. The least change to the code renders the model obsolete; the two descriptions no longer match. The problem is that model and code are depictions of separate worlds. This is the same as with Plato's teaching of the ideals: they form another reality with a completely different ontological state. How can we use an explanation of world A to explain something that lies in world B when the two worlds differ so much? This was the reason for Aristotle to claim the following:
"Are these now separated from each other, there would be no science of the ones, while the others would not exist" [Aristotle, Metaphysics 1031b]. Object-oriented design methods and their graphical representation UML-2 have reunited the model and reality, but are lacking in another respect. The formal OO theory defines objects, with methods and properties, as its main entities. But in the real world, objects are passive. An invoice in the real world does not have any functionality like "calculate sum". In the real world, this functionality belongs to the actor responsible for preparing the invoices: he calculates the sum and adds it to the invoice. If this actor is automated, it is still an actor and hence a subject, not an object. An object which controls the behavior of other objects, such as the control object proposed by Jacobson, is an artificial entity that exists neither in the real world nor in natural language. The use case and the control object are no more than patches that were added to make the original method usable. So, instead of defining new artificial entities like Plato's ideas or control objects with functionality, Aristotle demands that one should define subjects: "Definitions exist either only or at least preferably of substances (subjects)" [Aristotle, Metaphysics 1031a]. The only objects that should be allowed are data beans; everything with more functionality than get and set methods is no longer an object but a subject.

With its introduction of the subject, the S-BPM method is in tune with this major criticism of Aristotle. The S-BPM method takes this claim seriously by introducing the notion of subjects denoting any kind of actor within a business process: "The acting elements in a process are called SUBJECTS. SUBJECTS are abstract resources which represent acting parties in a process. SUBJECTS are process specific roles. They have a name, which gives an indication of their role in the process. SUBJECTS send or receive messages to or from other process participants (SUBJECTS) and execute actions on business objects" [Fleischmann2010, p. 12]. SUBJECTS are distinguished from each other by their behavior, which consists of chains of elementary operations such as SEND, RECEIVE and ACTION. Thus the SUBJECTS of an S-BPM model are already implemented subsystems in the form of prefabricated building blocks. Their specific behavior can be modeled by drag and drop in order to define the unique behavior of their relationships. They might represent an existing or planned system, a human actor, or an organizational unit. In the case of a system, they need to be linked to the system, but the system's functionality can be simulated as long as the connection is not yet completed. In the case of a human actor, the human interface is already provided by the SUBJECT. The relations of the SUBJECTS are defined within the S-BPM communication view, where "it is defined which SUBJECTS participate in a process and which messages they exchange" [Fleischmann2010, p. 12]. Altogether they form an S-BPM business process. It is important to note that a modeling method defines only the syntax of a language; the semantics come with the modeling itself and are provided by the entities and relationships the model artifacts represent in the real world. Since existing legacy systems or actors have to interact with the new actors, they should be modeled first. Existing actors and constraints are modeled according to the real world; actors that do not yet exist are modeled according to the role
they should play. Otherwise the process might be executable according to a weaker definition, but not according to the strong definition required for process operation. It seems that the bottom-up approach has to be partly applied to the semantics as well (not discussed within this paper) in order to obtain a fully operational process.
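A minimal Java sketch may make the subject/object separation concrete. The names are hypothetical and the code is not part of any S-BPM tool; it only illustrates the rule stated above: the invoice is reduced to a passive data bean with nothing beyond get and set methods, while the behavior "calculate sum" belongs to the accountant, i.e., to a subject.

import java.util.ArrayList;
import java.util.List;

// Object in the Aristotelian sense: a passive data bean.
// Nothing beyond state and get/set access.
class Invoice {
    private final List<Double> positions = new ArrayList<>();
    private double sum;

    List<Double> getPositions() { return positions; }
    double getSum()             { return sum; }
    void setSum(double sum)     { this.sum = sum; }
}

// Subject: the acting party. It owns the behavior that a real-world
// invoice never performs on itself.
class Accountant {
    void prepareInvoice(Invoice invoice) {
        double sum = invoice.getPositions().stream()
                            .mapToDouble(Double::doubleValue)
                            .sum();
        invoice.setSum(sum);  // an ACTION executed on a business object
    }
}

Moving the calculation into a method Invoice.calculateSum() would be the classical OO style that the text criticizes: it would turn a passive thing of the real world into an acting one.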
5 Conclusion

The issues to be considered in developing a "better" implementation language as a successor of the OO paradigm are, according to Demelt and Maier, the following [DemeltMaier2008]: better structuring capabilities, higher language abstraction, development of formal methods, and tool support. A more valuable model in terms of cost effectiveness, ease of use and comprehensibility is achieved by the S-BPM method by taking the subject out of the objects and putting it where it belongs, just as in the real world, where subjects are independent of the objects they deal with. Not only that, but S-BPM is a pure abstraction of the underlying Java code and remains in tune with it throughout the system's evolution. The S-BPM method is based on a formal theory and is supported by the Metasonic tool suite, which ensures that model and implementation are always synchronized. Just as Aristotle adopted from Plato the principle of a complete and all-encompassing description of reality, a description which combines generality with specifics, so has the S-BPM method taken over the principles of abstraction from object-oriented design. However, instead of having a single, central hierarchy as with Plato and OO, S-BPM follows the Aristotelian philosophy by having several parallel hierarchies. It distributes the complexity by subject. In that way, the problems associated with deep inheritance trees are avoided. Also, the objects are separated from the subjects. This corresponds to the principle of Aristotle of distinguishing between the category and the members of categories, which can be either subjects or objects. The S-BPM method is devoted to being a mirror image of the real world, reflecting the real-life subjects, objects and actions. The model can be mapped 1:1 not only to the code but also to the business process the code is embedded in. In so doing, it becomes cost effective, easy to use and easy to understand, thus achieving maximum value for the user. Finally, S-BPM fulfills all the requirements of Demelt and Maier in regard to creating a new paradigm for software development.
References
1. Buchheim, T.: Ist wirklich dasselbe, was zu denken ist und was existiert? – Klassische griechische Philosophie. In: Fischer, E., Vossenkuhl, W. (eds.) Die Fragen der Philosophie. C.H. Beck, Munich (2003)
2. DeMarco, T.: Structured Analysis and System Specification. Yourdon, New York (1978)
3. Demelt, A., Maier, M.: Paradigmenwechsel: Was kommt nach der Objektorientierung? In: Objektspektrum 6/2008. SIGS DATACOM, Troisdorf (2008)
4. Dijkstra, E.: The Humble Programmer (EWD340). Communications of the ACM (1972)
5. Fleischmann, A.: What is S-BPM? In: Buchwald, H., Fleischmann, A., Seese, D., Stary, C. (eds.) S-BPM ONE. CCIS, vol. 85. Springer, Heidelberg (2010)
6. Jackson, M.: System Development. Prentice/Hall International, Englewood Cliffs (1983)
7. Jacobson, I.: Four Macro Trends in Software Development. Keynote in: Proceedings of the Conference on Software Maintenance, Florence. IEEE Computer Society Press, Washington (2001)
8. Martin, J., Odell, J.: Object-Oriented Analysis & Design. Prentice/Hall International, Englewood Cliffs (1992)
9. Nietzsche, F.: Über Wahrheit und Lüge im außermoralischen Sinne. Kritische Studienausgabe Band 1. DTV de Gruyter, München (1999)
10. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modeling and Design. Prentice-Hall, Englewood Cliffs (1991)
11. Shlaer, S., Mellor, S.: Object Oriented Systems Analysis: Modeling the World in Data. Prentice Hall, New Jersey (1988)
12. Sneed, H.: The Myth of 'Top-Down' Software Development and its Consequences for Software Maintenance. In: Proceedings of the Conference on Software Maintenance, Miami 1989. IEEE Computer Society Press, Washington (1989)
13. Sneed, H.: Software Entwicklungsmethodik. Verlag Rudolf Müller, Köln (1996)
14. Warnier, J.D.: Logical Construction of Programs. Van Nostrand Reinhold Company, Paris (1974)
Understanding the Role of Communication and Hands-On Experience in Work Process Design for All Christian Stary Department of Business Information Systems - Communications Engineering, University of Linz, Austria,
[email protected]

Abstract. The paper motivates the explicit recognition of communication and hands-on experience when stakeholders design work processes, both on the individual and on the organizational level. As a straightforward implementation, Subject-oriented Business Process Management is reviewed. Its constructs for modelling, and the resulting capabilities for seamless execution when using a corresponding suite, are discussed. In particular, it is shown how stakeholders can articulate their way of task accomplishment in terms of communication relationships while producing an executable model. As the behaviour of all participating stakeholders in a specific business process can be expressed in terms of communication acts, adjusting individual and task-relevant flows of communication leads to a complete picture of an organization in operation. Moreover, subject-oriented representations allow executing the resulting workflow without further transformations. They enable interactive experience of business processes, which in turn facilitates (collective) reflection and redesign. In this way, stakeholders can trigger and control seamless round-trips in organizational development. This minimizes development costs and social risks, since alternative ways of task accomplishment can be negotiated before becoming operational in daily business.

Keywords: Work process modeling, Subject-oriented Business Process Management, Participatory Design, seamless roundtrip engineering, articulation and negotiation.
1 Introduction

Today's business success mainly depends on the ability to innovate and change. Organizations have to rearrange their structures, driven by means of information and communication technology (cf. Levine et al., 2000). Increasingly, the driving forces behind the 'right' set of connections are common values of persons, regardless of their functional roles (cf. Tsai et al., 1998). This shift emphasizes the perspective on organizations as social systems. According to Luhmann's understanding, such systems are driven by communication (of decisions) (cf. Luhmann, 2006). Persons form the relevant environment of organizations as significant actors, rather than constituting an organization. According to Luhmann, an organization is composed of communication
acts. However, the activities in the operation and development of an organization decide upon its success and failure. The latter, functional perspective on organizations is addressed by and handled through traditional techniques in Business Process Management (BPM). They structure organizational behavior and arrange the organization of work according to sequences of functions for task accomplishment (cf. Laudon et al., 2005). In the course of modeling, a chain of activities (functions) is defined according to temporal and/or causal relationships when handling work tasks. Communication can be represented in this kind of model by overlaying communication relationships, e.g., defining communication flows (cf. Scheer, 2001). Following this approach, organizations are primarily described through functional decomposition rather than through the adjustment of communication acts.

As social systems, organizations are living systems; they behave in a 'non-trivial' way, in that their behavior cannot be (pre-)determined externally and described by specifying causal relationships (Varela et al., 1974; von Foerster, 2003): a certain input is processed according to the inner state and the activity patterns of an organization. Situational context influences individual behavior and triggers work activities. Accordingly, business process models should reflect the stakeholder-specific situational context of functions, focusing on communication acts (cf. Wiesenfeld et al., 1998).

Such a shift requires rethinking management in general: scientific management techniques as proposed by Taylor still dominate in leading organizations, and center on control and efficiency (Hamel, 2007). They consider organizations as 'trivial' machines, and strive for deterministic behavior patterns. In a world where adaptability and creativity drive business success (cf. Hamel, 2007), they need to be adapted to, if not replaced by, mechanisms of co-ordination and guidance (cf. Böhle et al., 2004). Such a shift allows sharing not only ideas and expertise, but also values, interests and objectives (cf. Back et al., 2004, www.vernaallee.com), coming back to the initially mentioned observation of values as drivers of change.

Today's change management and organizational development processes rely on the use of information and communication technologies. They link communities by providing the technical infrastructure for collaboration and knowledge representation, processing, and sharing (cf. Van den Hooff et al., 2004). Organizations having implemented environments that recognize the nature of social systems report significant benefits in terms of knowledge transfer efficiency, response time and innovation (Deloitte, 2002).

In this paper we discuss the baseline of socio-technical work (re-)design, namely subject-oriented business process management of organizations. As it actively supports describing work processes from a stakeholder perspective and their immediate execution, scientific management of development processes can be replaced by self-control when handling change requests and ideas (cf. Bieber et al., 2002; Hill et al., 2009). It allows breaking through the (vicious) cycle of re-inventing organizations by identical means, which might lead to self-referential results and cause more economic damage than benefit, as Kotter argues metaphorically: 'The (penguin) colony ensured that changes would not be overcome by stubborn, hard-to-die traditions' (Kotter, 2008).
2 Organizational Key Asset: Communication

In this section the essential role of communication is shown, as S-BPM strives for complete and coherent descriptions of business processes. The perspective of individuals is tackled in section 2.1, and the resulting flow of communication between stakeholders is elaborated in section 2.2.

2.1 Individual Work and Process Completion

Subject-oriented Business Process Modeling (S-BPM) is based on the conceptual understanding of processes as functionally interacting subjects, i.e., actors, agents or services, such as accountants, sales persons, information systems, or knowledge management systems (see also Fleischmann et al., 2011). Subjects need not only be persons or software applications; they can also be a combination of both, meaning that a person enters data into a software application. A process is considered as the structured interaction of subjects involved in a business or work transaction. Subjects transfer information and coordinate their work
Fig. 1. An employee’s communication perspective when specifying a vacation request process
by exchanging messages. Messages can be exchanged synchronously, asynchronously, or in a combined form. The synchronization type can be specified depending on the message type and the sending subject. Each subject has an input pool as a mail box for incoming messages. The synchronization type is defined using attributes of the input pool.

In figure 1 the behavior of the subject Employee in terms of exchanging messages along a simple vacation application process is shown, starting with filling in a vacation request form to be sent to the subject Manager. In the same, though complementary, way the interaction between Manager and Employee can be specified (see figure 3). In general, each subject involved in a business process sends and receives messages, and accomplishes some tasks without interaction. The definition and the behavior of a subject depend on the order of sent and received messages, the tasks being accomplished, and the way these influence the behavior. If a subject sends a message, the information transferred with that message is derived from user inputs or computed by some application. These send functions are executed before a message is sent. Vice versa, if a subject accepts a message, a corresponding function is executed; the information received through the message is used as an input to that function. These receive and send functions represent so-called refinements of a subject. They constitute the interface of a subject to the applications used by the subject. Once all participating subjects in a specific process have been identified, each of the subjects can be specified in terms of its communication behavior in the way explained above (cf. Fleischmann et al., 2011). To complete the description of a process in terms of communication lines, the interfaces have to be adjusted, as exemplified for the vacation application in figures 2 and 3.
Fig. 2. Organizational perspective as set of communication interactions
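As a rough Java analogue of this message-exchange mechanism (hypothetical classes; this is not the API of any S-BPM suite), each subject can be pictured as owning a blocking queue as its input pool: sending deposits a message in the receiver's pool, and receiving blocks until a message has arrived.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

record Message(String type, String sender, String payload) {}

class Subject {
    final String name;
    // Input pool: the subject's mail box for incoming messages.
    final BlockingQueue<Message> inputPool = new LinkedBlockingQueue<>();

    Subject(String name) { this.name = name; }

    void send(Subject receiver, String type, String payload) {
        receiver.inputPool.add(new Message(type, name, payload));
    }

    Message receive() throws InterruptedException {
        return inputPool.take();  // blocks until a message arrives
    }
}

class VacationRequestDemo {
    public static void main(String[] args) throws InterruptedException {
        Subject employee = new Subject("Employee");
        Subject manager  = new Subject("Manager");

        // Employee behavior: fill in the request form and send it.
        employee.send(manager, "VacationRequest", "Aug 1 - Aug 14");

        // Manager behavior: receive, check, and answer the request.
        Message request = manager.receive();
        manager.send(employee, "Approval", request.payload());

        // Employee behavior: receive the decision.
        System.out.println(employee.receive().type());  // prints "Approval"
    }
}

In this sketch, sending is asynchronous and receiving is synchronous; the combined forms mentioned above would correspond to bounded pools or to senders that wait for an acknowledgement.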
2.2 Subjects as Drivers of Processes

The flow of communication in a networked, subject-driven process environment can best be illustrated by proceeding with the vacation process example. The behavior of the Manager is complementary to the Employee's: the messages sent by the Employee are received by the subject Manager, and vice versa. Figure 3 shows the behavior of the subject Manager. The Manager waits for the holiday application of the Employee. Upon receipt, the holiday application is checked (a state). This check can result in either an approval or a rejection, leading to the corresponding state. The subject
Fig. 3. Adjusted behavior of the subject Manager
Employee receives the result (i.e., the decision). In case the holiday application is approved, the subject Human Resource Department is informed about the successful application. In terms of S-BPM, the subject HR receives the approved holiday application and adds it to the Employee's days-off record, without further activities (process completion).

The description of a subject defines the sequence of sending and receiving messages, or the processing of internal functions, respectively. In this way, a subject specification contains the sequence of functions to be triggered, so-called services (as an abstraction from implementation). These services can be the standard ones for communication, like send, or predicates dealing with specific objects, such as those required when an employee files a holiday application form (the vacation request in figure 1). Consequently, each node (state) and transition has to be assigned to an operation. The implementation of that operation does not matter at this design stage, since it can be handled by (business) object specifications. A service is assigned to an internal functional node. If this state is reached, the assigned service is triggered and processed. The end conditions correspond to links leaving the internal functional node. Each result link of a sending node (state) is assigned to a named service. Before sending, this service is triggered to identify the content or parameters of a message. The service determines the values of the message parameters transferred by the message. Analogously, each output link of a receiving node (state) is also assigned to a named service. When a message is accepted in this state, that service is triggered to identify the parameters of the received message. The service determines the values of the parameters transferred by the message and provides them for further processing. These services are used to assign a certain meaning to each step in a subject's behavior. Services allow defining the functions used in a subject. All of these are triggered in a
synchronous way, i.e. a subject only reaches its subsequent state once all triggered services have been completed. The functions of a subject are defined by means of objects. In this way, a process specification can be completed for automated execution.
3 Development Key Asset: Self-directed Interactive Experience

After completing subject behavior specifications and their mutual adjustment, stakeholders can execute syntactically valid models. In section 3.1 the procedure is exemplified. Such interactive experiences can be shared by reflecting on the model, as shown in section 3.2 using a semantic content management approach.

3.1 Seamless Interactive Workflow Execution

In S-BPM, in order to enable direct process experience, stakeholders are empowered to execute what they have modeled without further transformation (seamless processing). For instance, an employee entitled to apply for vacations is able to create a new instance of a process specification (see also www.metasonic.de). After creating the process instance, the stakeholder is guided through the process. He/she is asked by the BPM suite which transition he/she wants to follow. For instance, once the stakeholder knows that he/she has to fill in the business message form with the corresponding data, and that this form has to be sent to the manager, he/she follows the transition "send". In the state "Prepare Message and select Receiver", following the transition "send", he/she fills in the business object with the data required for an application for vacation. In the following figure, the numbered elements of the user interface created by the S-BPM suite are:
1. the name of the current state: "Prepare Message and select Receiver"
2. the title of that process instance: "Request for vacation"
3. the creation date of that process instance
4. the form for filling in the business object data
Fig. 4. User interface of the execution engine (workflow system) in state “prepare message and select the person(s) to be addressed”
The stakeholder (in this case, subject 1, 'Employee') can add all the required data for a vacation request to the business object and send it to his/her manager, who is the owner of another subject (subject 2, 'Manager'). Since S-BPM focuses on the individual work perspective, stakeholders only need to know communication interfaces when participating in organizational development: the behavior description of the subject Employee allows sending the vacation request to other subjects, such as the Manager or the Human Resource Department. S-BPM utilizes the metaphor of exchanging e-mails, though focused on certain task accomplishments involving certain business objects (i.e., the content of the mail).

The workflow execution allowing interactive process experience follows a simple protocol. A stakeholder (subject 1, e.g., Employee) starts with the select activity and selects the send transition. After that, the action "prepare message and select address" is executed, and in another state the message is sent to another stakeholder (subject 2, e.g., Manager). Now subject 1 again reaches the state "select". In the state Start, subject 2 receives the message. In the following state, "follow up action", the content of the received message is read and the corresponding action is executed by a certain person (or system) who is the owner of subject 2. In the case of the vacation application, this follow-up action is the manager's decision whether the vacation application is accepted or denied. This decision must be sent to subject 1 (Employee). In the state select, subject 2 (Manager) decides to follow the send transition, prepares the message with the result of the decision, and sends it to subject 1 (Employee). In general, when a subject sends a message, the sending state is connected with the corresponding receive state in the receiving subject. Subject 1 sends a message to subject 2 in state 2; subject 2 receives that message in state "start". Following this line of interaction, a complete business process can be executed in an interactive way.

3.2 Self-directed Roundtrip Engineering

Creating new knowledge in the course of organizational development requires members of a knowledge-building network to collaboratively pose questions and comments, and to intentionally seek alternative solutions in order to expand the social system's capabilities (Hakkarainen et al., 2004). Supporting such mutual learning scenarios, a subject-oriented workflow system is capable of providing operational evidence for each stakeholder involved in a certain process. A behavior proposal by one of the stakeholders involved in a process can also be studied by others immediately after completing and validating a subject-oriented model. However, posing questions and commenting requires more than execution. It requires social media, such as chats, forums, blogs, or process portfolios, and proper content management to preserve contextual findings. Otherwise, the process of organizational development cannot be traced. Figure 5 shows such an approach: the subject-oriented behavior specification of the subject Manager when handling vacation requests is embedded into a semantic content management system.

From the content perspective, each model can be enriched with meta data, such as documents assigned to a process, and role bindings, such as the behavior of the subject Manager. These also allow for navigation (left side in figure 5, with main categories 'behavior' and 'documents'). Business process specifications handled as
Fig. 5. Knowledge sharing support – content perspective
content elements can not only be tagged with meta data, they can also be annotated (i.e., enriched with links, comments, text, videos, etc.), and become part of specific views that can be exchanged among members of an organization in the course of reflection and sharing. Any meta data, such as 'behavior' for diagrammatic content, lays the ground for focused interactions sharing experience. Coupling meta data and annotations with topic-specific forum entries, role-specific blogs, or chats allows co- and re-constructing information spaces for organizational development and mutual understanding. In the course of generating process models, other domain-specific content might either be created from scratch or added to existing content. By embedding links to forum entries, blogs, or chats into annotations, stakeholder-specific perspectives can be created and kept in user views. They structure the space for sharing information and interacting, as social interaction is based on exchanging (stakeholder-specific) views. A view is generated like an empty overhead slide and put on top of content elements (see, on the right side of the screen, a view termed 'My lecture notes' for annotating a diagram to indicate idealized subject behavior). The selection of content is supported by structured navigation (as shown on the left side of the screen in figure 5) and by filters directing the search for specific categories of information, such as behavior diagrams.
In a semantic content management system, all annotations of a development process are stored, like original content elements, in user-specific views, including the links to communication entries. Users can manage their views, including deletion and transfer to other members of the organization. The transfer of views is essential, as collaboration in virtual communities can be enabled through sharing views. Views that users have set public can be taken up by other users. They might import them to their list of individual views, and study them on top of the content elements concerned. Stakeholders taking up existing views might continue working this way, i.e., setting views public after supplementing the existing annotations. Such back-and-forth transfers lead to cascaded views and, in this way, to traceable processes in organizational development, on both the content and the interaction level (cf. Stary, 2011).
4 Conclusive Summary

'For the first time since the dawning of the industrial age, the only way to build a company that's fit for the future is to build one that's fit for human beings as well' (Hamel, 2007). Following this finding, which indicates the need for new ways of management, we need to revisit organizations as social systems. From such a perspective, stakeholders are driven by communication acts when accomplishing their tasks and when moving forward to novel structures of work. Traditional techniques of Business Process Management arrange process-specific information around functional activities that are linked by causal and/or temporal relationships. They do not support modeling organizations from a communication perspective in the first place. The discussed Subject-oriented Business Process Management technique supports stakeholders in articulating their way of task accomplishment in terms of communication relationships while producing an executable model. The latter allows for interactive experience of specific business processes, according to individual and task-relevant flows of communication. The immediate execution of subject-oriented models facilitates (collective) reflection and re-design of individual proposals. In combination with semantic content management, stakeholders can discuss alternative ways of task accomplishment along seamless roundtrips before incorporating them into daily business.
References
1. Bieber, M., Goldman-Segall, R., Hiltz, S.R., Im, I., Paul, R., Preece, J., et al.: Towards knowledge-sharing and learning in virtual professional communities. In: 35th Annual Hawaii International Conference on System Sciences (HICSS-35 2002). IEEE Computer Society Press, Los Alamitos (2002)
2. Back, A., von Krogh, G.: Knowledge Networks. Springer, Berlin (2004)
3. Böhle, F., Pfeiffer, S., Sevsay-Tegethoff, N.: Die Bewältigung des Unplanbaren. VS Verlag für Sozialwissenschaften, Wiesbaden (2004)
4. Deloitte: Collaborative Knowledge Networks. Deloitte Consulting and Deloitte & Touche, New York (2002)
5. Fleischmann, A., Stary, C.: Whom to Talk to? A Stakeholder Perspective on Business Process Development. Universal Access in the Information Society (2011) (in press)
6. Hakkarainen, K., Palonen, T., Paavola, S., Lehtinen, E.: Communities of Networked Expertise: Professional and Educational Perspectives. Advances in Learning and Instruction Series. Earli & Elsevier, London (2004)
7. Hamel, G.: The Future of Management. Harvard Business School Press, Boston (2007)
8. Hill, J.B., Cantara, M., Kerremans, M., Plummer, D.C.: Magic Quadrant for Business Process Management Suites. Gartner Research, G00164485 (February 18, 2009)
9. Laudon, K.C., Laudon, J.P.: Essentials of Management Information Systems: Managing the Digital Firm, 6th edn. Pearson, Upper Saddle River (2005)
10. Levine, R., Locke, C., Searls, D., Weinberger, D.: The Cluetrain Manifesto: The End of Business as Usual. Perseus, Cambridge (2000)
11. Kotter, J.: Our Iceberg is Melting. Management of Change. Ashley Kreuer, New York (2008)
12. Luhmann, N.: Soziale Systeme. Grundriß einer allgemeinen Theorie. Suhrkamp (2006)
13. Stary, C.: Perspective Giving - Perspective Taking: Evidence-based Learning in Organizations. International Journal of Information and Knowledge Management 10 (2011)
14. Scheer, A.-W.: ARIS - Modellierungsmethoden, Metamodelle, Anwendungen, 4th edn. Springer, Berlin (2001)
15. Tsai, W., Ghoshal, S.: Social Capital and Value Creation: The Role of Intrafirm Networks. The Academy of Management Journal 41(4), 464–476 (1998)
16. Van den Hooff, B., De Ridder, J.A.: Knowledge Sharing in Context: The Influence of Organizational Commitment, Communication Climate and CMC Use on Knowledge Sharing. Journal of Knowledge Management 8(6), 117–130 (2004)
17. Varela, F.J., Maturana, H.R., Uribe, R.: Autopoiesis: The Organization of Living Systems, Its Characterization and a Model. BioSystems 5, 187–196 (1974)
18. Von Foerster, H.: Wahrheit ist die Erfindung eines Lügners. Carl Auer, Heidelberg (2003)
19. Wiesenfeld, B.M., Raghuram, S., Garud, R.: Communication Patterns as Determinants of Organizational Identification in a Virtual Organization. Journal of Computer-Mediated Communication 3(4) (1998), http://jcmc.huji.ac.il/vol3/issue4/wiesenfeld.html
Extending Predictive Models of Exploratory Behavior to Broader Populations Shari Trewin1, John Richards1,2, Rachel Bellamy1, Bonnie E. John1, Cal Swart1, and David Sloan2 1 IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY, 10532, USA {trewin,ajtr,rachel,bejohn,cals}@us.ibm.com 2 School of Computing, University of Dundee, Dundee, DD1 4HN, Scotland {johnrichards,dsloan}@computing.dundee.ac.uk
Abstract. We describe the motivation for research aimed at extending predictive cognitive modeling of non-expert users to a broader population. Existing computational cognitive models have successfully predicted the navigation behavior of users exploring unfamiliar interfaces in pursuit of a goal. This paper explores factors that might lead to significant between-group differences in the exploratory behavior of users, with a focus on the roles of working memory, prior knowledge, and information-seeking strategies. Validated models capable of predicting novice goal-directed exploration of computer interfaces can be a valuable design tool. By using data from younger and older user groups to inform the development of such models, we aim to expand their coverage to a broader range of users. Keywords: Cognitive modeling, information foraging, usability testing, accessibility, interface design, older users.
1 Introduction

Cognitive modeling, most notably keystroke-level modeling (KLM) as originally formulated by Card, Moran, and Newell [1], has proven to be highly predictive of the times taken for skilled performance of a variety of tasks. KLM and its more complex relative, GOMS, have been extensively validated and widely applied. Gray et al [2], Callander and Zorman [3], John and Kieras [4], Luo and John [5], and Knight et al [6] illustrate the variety of successful uses of this modeling approach. More recently, several models of novice goal-directed information search on the Web have been developed and shown to predict human behavior. Our work extends existing research by focusing on comparisons of younger and older users. In this paper, we discuss cognitive factors that might impact performance on such tasks and lead to differences between older and younger groups. We seek to understand whether current theories of goal-directed exploratory behavior, as implemented in existing
models, apply broadly, and what (if any) enhancements to current models will better predict the behavior of users of different ages.
2 Modeling Skilled Behavior

HCI professionals are generally capable of constructing keystroke-level models by hand, but this can become tedious if the design is complex. The correct placement of mental operators is also error-prone, undercutting model accuracy. To increase modeling efficiency and decrease modeling errors, John et al. created CogTool [7] to support the storyboarding of designs, the demonstration of tasks against those designs, and the resulting automatic generation of accurate models.

Models of expert performance have begun to develop a more nuanced representation of users that incorporates differences between groups. Daily et al [8] applied personalized models to explain the performance of individual subjects across two different tasks. Further work by Rehling et al [9] applied individual models based on performance in one task to predict performance in a second task. Jastrzembski and Charness [10] developed 'younger' and 'older' GOMS models, based on cognitive parameters suggested by prior literature. Two mobile phone tasks were modeled: dialing a number and sending a text message. The models predicted performance differences observed in an older and a younger group of participants performing these tasks. These parameters were later mapped to equivalent ACT-R parameters, and again the age-specific parameters produced a better fit to the data than the default ACT-R parameters [11]. In further work, John and Jastrzembski [12] explored a series of models built using CogTool. The most successful model for the older adult group revisited the written instructions in the dialing task more frequently to extract the next set of digits to dial. The best models were able to detect statistically significant differences between the age groups, though the match to observed times (within 10-20%) was not as close as that of the original hand-built GOMS models (within 5%). Awareness of the role of working memory in task performance proved instrumental in the production of high-quality models of skilled performance.
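At its core, a keystroke-level prediction is simply a sum of operator times, which is what makes it attractive to automate in tools such as CogTool. The sketch below uses commonly cited approximate operator durations; the exact figures vary by source and by user population, and adjusting such parameters is precisely what the age-specific models above did.

import java.util.Map;

class KeystrokeLevelModel {
    // Approximate operator times in seconds; values differ across sources.
    static final Map<Character, Double> OPERATOR_SECONDS = Map.of(
        'K', 0.28,  // K: press a key or button (average skilled typist)
        'P', 1.10,  // P: point with a mouse to a target on screen
        'H', 0.40,  // H: home hands between keyboard and mouse
        'M', 1.35   // M: mental preparation for an action
    );

    // Predict execution time for a task encoded as an operator string,
    // e.g. "MHPK" = think, move hand to mouse, point, click.
    static double predictSeconds(String operators) {
        return operators.chars()
                        .mapToDouble(c -> OPERATOR_SECONDS.get((char) c))
                        .sum();
    }

    public static void main(String[] args) {
        System.out.println(predictSeconds("MHPK"));  // 3.13
    }
}

The hard part, as noted above, is not the arithmetic but deciding where the M operators belong; automating that placement was one motivation for CogTool.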
3 Modeling Exploratory Behavior

Navigating an unfamiliar interface can be described using the information foraging framework initially developed to describe the behavior of people navigating the Web [13], [14], [15], and subsequently used to model the behavior of users in other information-rich environments, such as debugging unfamiliar code in an integrated development environment [16], [17]. The basic idea underlying information foraging is that a user evaluates which item (e.g., a link) to select next by assessing the semantic relationship between the goal (expressed as a series of linguistic tokens) and the item (similarly expressed as a series of tokens). This semantic relationship is characterized by its "information scent", with higher-scent items being more likely to be selected than lower-scent items. Various strategies for deciding when an item has a high enough scent to follow it, what to do when no links on the current page have a
high enough scent, how rapidly the memory of having selected an item decays, and so on, can all be modeled within a general-purpose cognitive architecture such as ACT-R [18]. The computation of information scent is based on an underlying text corpus and is roughly a measure of the degree to which the goal and the item are "near" each other in the corpus. Several computational cognitive models of information seeking on the Web, incorporating information foraging theory, have been shown to make good predictions of users' interaction with Web sites [19], [20].

Our approach to modeling exploratory behavior is to build on CogTool-Explorer [21], a version of CogTool that implements novice goal-directed exploration through information foraging. CogTool-Explorer uses the ACT-R cognitive model [22], and incorporates SNIF-ACT 2.0, a computational cognitive model of how people use information scent cues to make navigation decisions on the Web [23]. In SNIF-ACT, items on a Web page are evaluated one by one, and a decision is made either to "satisfice" (choose the best item seen so far), look further, or go back. In CogTool-Explorer, headings, and then items within headings, are evaluated in the order dictated by Halverson and Hornof's Minimal Model of Visual Search [24].

3.1 Calculating Information Scent

Successful modeling of novice behavior depends crucially on the ability to make human-like judgments of the similarity between a goal and an item in a user interface. A number of algorithms have been explored, including latent semantic analysis (LSA), pointwise mutual information, generalized LSA, and Vectorspace approaches. All of these algorithms depend on the presence of a large corpus of text that represents the user's knowledge, and their performance improves as the corpus size increases. By default, CogTool-Explorer uses LSA calculations based on the TASA corpus. TASA is a corpus often used for LSA calculations, and incorporates texts that a first-year college student would be expected to have read. Blackmon's AutoCWW also used this approach. Other research has explored corpora based on material found on the Web, on Wikipedia, or in New York Times articles. Pointwise mutual information is a simpler measure that captures how likely it is to find word A in a text given that it contains word B, adjusting for the frequency of A and B in the corpus (a minimal sketch of the calculation closes this subsection). It has the advantage that the corpus can be more easily extended and scaled. For large corpora, several studies have found it to perform better than LSA on similarity judgments [25]. Stone et al [26] report that the Vectorspace model performed better than LSA in predicting eye movements in a web search task. More sophisticated scent calculations have also been developed. The CoLiDeS model of web navigation [27] integrates five factors to assess information scent: semantic similarity based on LSA, elaboration of terms to include similar terms, word frequency, previous experience, and literal matching between the goal and the user interface object being assessed.
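The pointwise-mutual-information calculation itself is small enough to show in full. The sketch below is illustrative only (toy counts, hypothetical words); a real scent computation would estimate the probabilities from a large corpus of the kind discussed above.

class PointwiseMutualInformation {

    // nA, nB: number of documents containing word A resp. word B;
    // nAB: documents containing both; n: total documents in the corpus.
    static double pmi(long nA, long nB, long nAB, long n) {
        double pA  = (double) nA  / n;
        double pB  = (double) nB  / n;
        double pAB = (double) nAB / n;
        // Positive when A and B co-occur more often than chance predicts,
        // zero when they are independent, negative when they repel.
        return Math.log(pAB / (pA * pB)) / Math.log(2);
    }

    public static void main(String[] args) {
        // Toy numbers: "vacuum" in 200 of 100,000 documents, "hoover" in
        // 150, and both together in 60: a strong positive association,
        // as one might expect in a UK corpus but not in a US corpus.
        System.out.println(pmi(200, 150, 60, 100_000));  // about 7.6
    }
}

Because the score depends entirely on the counts, swapping the corpus changes the scent; this is the formal counterpart of the generational and cultural knowledge differences discussed in section 4.2.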
3.2 The Role of Between-Group Differences

If models of information seeking are to be useful to designers (who must design for a range of user abilities), the underlying models must be broadly applicable. However, little is known about between-group differences in information foraging behavior. Blackmon et al's experiments [19], upon which both the AutoCWW and CogTool-Explorer models are based, were performed with college students. The evaluations of the SNIF-ACT theory were based on data gathered from researchers and academics, and from individuals recruited on the web about whom little is reported, except that they were able to install specialized software in order to participate in the experiment. It is plausible that these models represent strategies and approaches primarily used by relatively young people with high levels of cognitive function. Age and cognitive skills are particularly interesting dimensions to study. Older adults are a significant, and growing, group of technology users who are often underrepresented in both research and design. They also tend to be a more heterogeneous group, in part due to the variable effects of age-related changes in cognitive, physical and sensory abilities. An important research question, therefore, is whether these models are applicable to older adults.
4 Factors Influencing Goal-Directed Exploration

This section discusses three factors that may lead to significant differences not captured in current models.

4.1 Working Memory

Working memory is the mental process by which information is temporarily stored and manipulated in the performance of complex cognitive tasks such as language understanding, learning and reasoning. Many different types of information can be held in working memory, including visual, verbal, and emotional information [28]. Measures of working memory correlate well with performance on many tasks, especially those that involve both memory and processing.

Working memory capacity shows a decline with age, and is considered to be a key determinant of age-related differences in performance on cognitive tasks [29], [30], [31]. Age differences in performance on memory tasks are greater for tasks involving more executive processing [32], [33]. Reuter-Lorenz and Sylvester [34] suggest that this may be due to the use of executive processing resources to compensate for declines in memory ability. It is well known that recall is a more difficult memory task than recognition, and Craik and McDowd [35] showed that increased age amplifies the cost of memory recall. Older adults may also be less able to suppress interfering memories [36], [37] during encoding and retrieval of information.

Baddeley and Hitch's influential model of working memory [38] posits three primary subcomponents: an 'attentional-controlling system' (or 'central executive'), a 'phonological loop' that stores and rehearses verbal and other auditory information, and a 'visuo-spatial scratch pad' that manipulates visual images. More recently, an 'episodic buffer' component has been proposed, serving the purpose of integrating working memory information with that in long-term memory [39].

Working memory has a role in performance that goes beyond age-related differences.
Impairments in aspects of working memory have been found with a number of disabilities, including schizophrenia [40], ADHD [41], and learning disabilities [42]. This impacts performance in many tasks. Naumann et al. [43] provided strategy training to individuals in how to learn from hypertext, and found that the effectiveness of this training depended on the working memory resources of the individuals: for learners with poorer working memory, the strategy training actually hindered their performance. Furthermore, modern computing contexts often tax working memory with frequent interruptions and multi-tasking demands.

4.2 Prior Knowledge

Prior knowledge has been found to influence goal-directed exploration. Older adults, having greater life experience on which to draw, may outperform younger adults in ill-defined Web-based tasks that rely on background knowledge of the topic [44].

The effect of knowledge on exploration is core to information foraging. Scent depends on the perceived relationship between what is on the screen and the user's goal. To date, information foraging models have treated scent as a lexical relationship: it is calculated relative to lexical relationships in a pre-existing corpus. One can think of scent calculations as using a corpus of documents to 'stand in' for the user's knowledge.

There are likely to be generational differences in background knowledge. For example, different age groups will have learned different topics at school, and different age groups will tend to have read different books and been exposed to different media. The effects of background knowledge are readily apparent in different English-speaking cultures. For example, when asked to find 'vacuum cleaners', people who have grown up in the UK will likely follow a link that says 'Hoovers': use of the term 'Hoover' to refer to a vacuum cleaner is common in the UK, so it will be background knowledge. Among English speakers who have grown up in the US, however, younger adults are very unlikely to click such a link, as it would not appear to be related to their goal. In information foraging terms, for someone who has grown up in the US, 'Hoover' would not have high scent when the search goal was 'vacuum cleaner'. These differences in background knowledge may be sufficient to change users' exploration behavior, and background knowledge may be similar enough within a generation to see generational differences in exploration behavior.

4.3 Information-Seeking Strategies

In an information foraging task, participants must maintain many items in memory: the current goal, the paths already explored, and other promising unexplored paths, in addition to making judgments of the information scent of each option. As a result, performance in such tasks may be quite sensitive to working memory capacity. It may even be the case that people with lower working memory capacity employ very different strategies in order to compensate.

In a loosely constrained information-seeking task, Fairweather [45] observed that while there were no differences in task success between an older and a younger group, there were significant differences in how the task was approached, and strong age-related tendencies to follow particular paths and visit particular zones. Older adults made more use of guided small steps, performed more fruitless searches, and left the original site more often.
Fairweather concluded that the observed performance differences could not be attributed to taking the same steps in a less efficient way: fundamentally different approaches were being used. In a study of adaptive foraging behavior in a virtual landscape, both younger and older adults adjusted the parameters of their foraging strategy in response to task characteristics, but the performance of the older group was poorer, even when an optimal strategy was explicitly taught [46]. Hanson also observed strategy differences between older and younger individuals performing Web search tasks, but no significant differences in task success [47]. Paxton et al. [48] report MRI-based results that also suggest a strategy change in older adults in a task that stressed maintenance of goal-relevant information; they hypothesize that problems with goal maintenance lead to this shift. Cognitive models of foraging strategies, when compared with human data, can be a useful way to explore possible alternative strategies in different user groups.
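As a concrete, deliberately simplified illustration of how such strategy differences might be expressed computationally, the sketch below parameterizes a SNIF-ACT-style choice among scent-rated links. The thresholds and the working-memory-like cap on links considered are invented knobs for exploration, not values estimated from data or taken from the published models.

def choose_link(scents, satisfice_threshold=1.0, go_back_threshold=-0.5,
                max_considered=8):
    """Evaluate links one by one: satisfice on the first link whose scent
    clears the threshold, go back if nothing promising was seen, and
    otherwise pick the best link considered. max_considered crudely
    stands in for a working-memory budget on items kept in play."""
    best_index, best_scent = None, float("-inf")
    for i, scent in enumerate(scents[:max_considered]):
        if scent >= satisfice_threshold:
            return ("choose", i)               # good enough: stop evaluating
        if scent > best_scent:
            best_index, best_scent = i, scent  # remember the best seen so far
    if best_scent < go_back_threshold:
        return ("go_back", None)               # retreat up the hierarchy
    return ("choose", best_index)

# The same page under two parameterizations: a patient, high-capacity
# strategy reaches and chooses the strong fourth link; a cautious,
# low-capacity one never reaches it and goes back instead.
scents = [0.2, -0.3, 0.8, 1.4, 0.1]
print(choose_link(scents))                                           # ('choose', 3)
print(choose_link(scents, max_considered=2, go_back_threshold=0.5))  # ('go_back', None)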
5 Research Directions

In light of the discussion above, our research is exploring the following questions: How do the strategies of younger and older adults compare on information foraging tasks? What cognitive skills, including working memory, correlate with task performance and strategy use? How well can information foraging models predict human data? What is the influence of information scent algorithms and corpora?

5.1 Collecting Human Data from a Broader Population

Our approach is to gather very detailed data from older and younger adults performing a goal-directed search task on the Web, to augment existing knowledge of human information foraging. Our data include keyboard, mouse, and eye movement information. Our initial experiments use an information space derived from a popular online auction site in the UK. Items in this space are organized in a three-level category hierarchy, with the top level including 27 categories such as "Antiques", "Art", "Baby", "Jewellery & Watches", "Musical Instruments", and "Photography". Research participants are asked to find the proper third-level category for a named item such as "Watch batteries"; in this case the correct path would be "Jewellery & Watches", "Watches", "Watch Batteries". Each of the second-level and third-level screens has a "Go Back" button, making it easy to navigate back up the hierarchy. In addition, each screen has a reminder string of the form "Looking for x" to remind participants of their goal. By examining the eye-movement data we will be able to see what is considered, how long each item is considered, and whether the reminder string is utilized.
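For model development, the task environment itself is easy to simulate. The sketch below captures its skeleton: a three-level hierarchy, clicks, and a "Go Back" action, with every action logged for comparison against human traces. It is our own schematic stand-in, not the experimental software; the category names are placeholders and only one branch is filled in.

class CategoryTask:
    """Minimal simulated version of the three-level category task, for
    exercising candidate foraging models against logged human data."""

    def __init__(self, hierarchy, goal_item):
        self.hierarchy = hierarchy
        self.goal_item = goal_item
        self.path = []    # current location in the hierarchy
        self.trace = []   # every action, for comparison with human logs

    def options(self):
        """Labels visible on the current screen."""
        node = self.hierarchy
        for step in self.path:
            node = node[step]
        return list(node)

    def click(self, label):
        """Descend one level; returns True when the goal is reached."""
        self.trace.append(("click", label))
        self.path.append(label)
        return label == self.goal_item

    def go_back(self):
        """The 'Go Back' button: move one level up the hierarchy."""
        self.trace.append(("go_back", None))
        if self.path:
            self.path.pop()

# Placeholder fragment of the hierarchy; only one branch is spelled out.
hierarchy = {
    "Jewellery & Watches": {"Watches": ["Watch Batteries", "Wristwatches"]},
    "Antiques": {"Furniture": ["Chairs", "Tables"]},
}
task = CategoryTask(hierarchy, "Watch Batteries")
task.click("Jewellery & Watches")
task.click("Watches")
print(task.click("Watch Batteries"), task.trace)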
5.2 Modeling Between-Group Differences in Information Foraging

ACT-R models offer predictions of how motor actions, eye movements, and cognition are interwoven in the performance of a task. We plan to develop models that represent theories of how users are performing the task. Different corpora, representing different knowledge bases, are another possible avenue of exploration. Comparing different models with human data will test these theories, leading to a deeper understanding of age-related differences in information foraging.

The cognitive model currently underlying CogTool-Explorer has been successful in predicting user behavior in Blackmon's dictionary search tasks [19]. Although it is an ACT-R model, the goal, and the best link seen so far, are stored in ACT-R's goal buffer, where they are not subject to decay. Differences in working memory abilities cannot be represented in this model. If issues of forgetting the goal term, or the best link seen so far, arise in the data, more sophisticated models will be required to capture these effects.

ACT-R's working memory mechanism will provide a basis for models that account for forgetting. The ACT-R cognitive architecture implements a model of working memory, including a visual buffer and an auditory buffer. The attentional-controlling system is implemented as a fixed amount of memory activation that is spread across memory items. Each memory item has a level of activation based on the time since it was created or accessed, and on connections to other active elements. Those items with an activation level above a threshold are active 'working memory' chunks. Anderson et al. [49] describe experiments supporting this 'source activation' model, and ACT-R models can account for many effects observed in psychological experiments [50]. Huss and Byrne [51] further proposed an ACT-R-based model of an articulatory loop, in which an individual repeatedly subvocally verbalizes an item in order to rehearse it and maintain its activation in working memory. Their model was able to reproduce the effects of list length and stimulus length on a list recall task. Subvocal (or even vocal) articulation is a commonly used memory strategy, and may be important for accurate modeling of tasks involving memory.
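For readers unfamiliar with the mechanism, ACT-R's standard base-level learning equation, B_i = ln(sum_j t_j^(-d)), makes the decay just described explicit: each past creation or access at age t_j contributes activation that fades as a power function of time. The sketch below implements that equation with ACT-R's conventional default decay d = 0.5; spreading activation and activation noise are omitted for brevity, so this is a simplified illustration rather than the full architecture.

import math

def base_level_activation(access_times, now, decay=0.5):
    """ACT-R base-level learning: B_i = ln(sum_j (now - t_j) ** -d).
    access_times are the moments the chunk was created or accessed."""
    return math.log(sum((now - t) ** -decay for t in access_times))

def in_working_memory(access_times, now, threshold=0.0, decay=0.5):
    """A chunk counts as an active 'working memory' item while its
    activation is above the retrieval threshold."""
    return base_level_activation(access_times, now, decay) > threshold

# A goal term rehearsed at t = 0, 5 and 10 s, probed 2 s and then 60 s
# after the last rehearsal: just after rehearsal it sits above a zero
# threshold; a minute later it has decayed below it.
rehearsals = [0.0, 5.0, 10.0]
print(base_level_activation(rehearsals, now=12.0))   # ~0.32
print(in_working_memory(rehearsals, now=70.0))       # False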
The ultimate goal of our research is to derive and validate models capable of being embedded in CogTool-Explorer and used by designers to predict the behavior of a broad range of users exploring unfamiliar Web sites or user interfaces. Using data from older and younger user groups, the applicability of current models can be examined, and new models developed. We have discussed working memory, prior knowledge, and information-seeking strategies. These three factors, when reflected in models, may help to account for between-group differences in exploratory behavior.

Acknowledgments. This research was supported by an Open Collaborative Research grant from the IBM Research Division, and by RCUK EP/G066019/1 "RCUK Hub: Inclusion in the Digital Economy".

References

1. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
2. Gray, W.D., John, B.E., Atwood, M.E.: Project Ernestine: Validating a GOMS Analysis for Predicting and Explaining Real-World Task Performance. Human-Computer Interaction 8(3), 237–309 (1993)
3. Callander, M., Zorman, L.: Usability on Patrol. In: CHI 2007 Extended Abstracts on Human Factors in Computing Systems, San Jose, CA, USA, April 28 - May 03, pp. 1709–1714. ACM, New York (2007)
4. John, B.E., Kieras, D.E.: Using GOMS for User Interface Design and Evaluation: Which Technique? ACM Transactions on Computer-Human Interaction 3(4), 287–319 (1996)
5. Luo, L., John, B.E.: Predicting Task Execution Time on Handheld Devices Using the Keystroke-Level Model. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2005), Portland, Oregon, April 2-7. ACM, New York (2005)
6. Knight, A., Pyrzak, G., Green, C.: When Two Methods Are Better Than One: Combining User Study with Cognitive Modeling. In: CHI 2007 Extended Abstracts on Human Factors in Computing Systems, San Jose, CA, USA, April 28 - May 03, pp. 1783–1788. ACM, New York (2007)
7. John, B.E., Prevas, K., Salvucci, D.D., Koedinger, K.: Predictive Human Performance Modeling Made Easy. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2004), pp. 455–462. ACM, New York (2004)
8. Daily, L., Lovett, M., Reder, L.: Modeling Individual Differences in Working Memory Performance: A Source Activation Account. Cognitive Science 25, 315–353 (2001)
9. Rehling, J., Lovett, M., Lebiere, C., Reder, L., Demiral, B.: Modeling Complex Tasks: An Individual Difference Approach. In: Proceedings of the 26th Annual Conference of the Cognitive Science Society, Chicago, IL, August 4-7, pp. 1137–1142 (2004)
10. Jastrzembski, T.S., Charness, N.: The Model Human Processor and the Older Adult: Validation in a Mobile Phone Task. Journal of Experimental Psychology: Applied 13, 224–248 (2007)
11. Jastrzembski, T.S., Myers, C., Charness, N.: A Principled Account of the Older Adult in ACT-R: Age-Specific Model Human Processor Extensions in a Mobile Phone Task. In: Proceedings of the Human Factors and Ergonomics Society 54th Annual Meeting, San Francisco, CA, September 27-October 1 (2010)
12. John, B.E., Jastrzembski, T.S.: Exploration of Costs and Benefits of Predictive Human Performance Modeling for Design. In: Salvucci, D.D., Gunzelmann, G. (eds.) Proceedings of the 10th International Conference on Cognitive Modeling, Philadelphia, PA, pp. 115–120 (2010)
13. Pirolli, P., Card, S.: Information Foraging in Information Access Environments. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1995), pp. 51–58. ACM Press, New York (1995)
14. Pirolli, P.: Computational Models of Information Scent-Following in a Very Large Browsable Text Collection. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1997), pp. 3–10. ACM Press, New York (1997)
15. Pirolli, P.: Information Foraging Theory: Adaptive Interaction with Information. Oxford University Press, New York (2007)
16. Lawrance, J., Bellamy, R., Burnett, M., Rector, K.: Using Information Scent to Model the Dynamic Foraging Behavior of Programmers in Maintenance Tasks. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2008), pp. 1323–1332. ACM Press, New York (2008)
17. Lawrance, J., Burnett, M., Bellamy, R., Bogart, C., Swart, C.: Reactive Information Foraging for Evolving Goals. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2010), pp. 25–34. ACM, New York (2010)
18. Anderson, J.R.: The Adaptive Character of Thought. Erlbaum, Hillsdale (1990)
19. Blackmon, M., Kitajima, M., Polson, P.: Tool for Accurately Predicting Website Navigation Problems, Non-Problems, Problem Severity, and Effectiveness of Repairs. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2005), pp. 31–40. ACM Press, New York (2005)
20. Chi, E., Rosien, A., Supattanasiri, G., Williams, A., Royer, C., Chow, C., Robles, E., Dalal, B., Chen, J., Cousins, S.: The Bloodhound Project: Automating Discovery of Web Usability Issues Using the InfoScent™ Simulator. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2003), pp. 505–512. ACM Press, New York (2003)
21. Teo, L., John, B.E.: Towards a Tool for Predicting Goal-Directed Exploratory Behavior. In: Proceedings of the Human Factors and Ergonomics Society 52nd Annual Meeting, pp. 950–954 (2008)
22. Anderson, J., Lebiere, C.: The Atomic Components of Thought. Erlbaum, USA (1998)
23. Fu, W., Pirolli, P.: SNIF-ACT: A Cognitive Model of User Navigation on the World Wide Web. Human-Computer Interaction 22(4), 355–412 (2007)
24. Halverson, T., Hornof, A.: A Minimal Model for Predicting Visual Search in Human-Computer Interaction. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2007), pp. 431–434. ACM Press, New York (2007)
25. Turney, P.D.: Mining the Web for Synonyms: PMI-IR Versus LSA on TOEFL. In: Proceedings of the Twelfth European Conference on Machine Learning, Freiburg, Germany, pp. 491–502 (2001)
26. Stone, B., Dennis, S., Kwantes, P.J.: A Systematic Comparison of Semantic Models on Human Similarity Rating Data: The Effectiveness of Subspacing. In: Proceedings of the Thirtieth Annual Conference of the Cognitive Science Society (2008)
27. Kitajima, M., Blackmon, M.H., Polson, P.G.: Cognitive Architecture for Website Design and Usability Evaluation: Comprehension and Information Scent in Performing by Exploration. In: Proceedings of the HCI International Conference (2005)
28. Mikels, J., Larkin, G., Reuter-Lorenz, P.: Divergent Trajectories in the Aging Mind: Changes in Working Memory for Affective Versus Visual Information with Age. Psychology and Aging 20(4), 542–553 (2005)
29. Salthouse, T.A.: Differential Age-Related Influences on Memory for Verbal-Symbolic Information and Visual-Spatial Information. Journal of Gerontology 50B, 193–201 (1995)
30. Park, D.C., Lautenschlager, G., Hedden, T., Davidson, N.S., Smith, A.D., Smith, P.K.: Models of Visuospatial and Verbal Memory Across the Adult Life Span. Psychology and Aging 17, 299–320 (2002)
31. Brown, S.C., Park, D.C.: Theoretical Models of Cognitive Aging and Implications for Translational Research in Medicine. The Gerontologist 43(suppl. 1), 57–67 (2003)
32. Dobbs, A.R., Rule, B.G.: Adult Age Differences in Working Memory. Psychology and Aging 4, 500–503 (1989)
33. Salthouse, T.A., Babcock, R.L.: Decomposing Adult Age Differences in Working Memory. Developmental Psychology 27, 763–776 (1991)
34. Reuter-Lorenz, P., Sylvester, C.: The Cognitive Neuroscience of Working Memory and Aging. In: Cabeza, R., Nyberg, L., Park, D. (eds.) Cognitive Neuroscience of Aging: Linking Cognitive and Cerebral Aging, pp. 186–217. Oxford University Press, Oxford (2005)
35. Craik, F., McDowd, J.: Age Differences in Recall and Recognition. Journal of Experimental Psychology: Learning, Memory and Cognition 13(3), 57–67 (1987)
36. May, C.P., Hasher, L., Kane, M.J.: The Role of Interference in Memory Span. Memory and Cognition 27, 759–767 (1999)
37. Lustig, C., May, C.P., Hasher, L.: Working Memory Span and the Role of Proactive Interference. Journal of Experimental Psychology: General 130, 199–207 (2001)
38. Baddeley, A.D., Hitch, G.J.: Working Memory. In: Bower, G.A. (ed.) Recent Advances in Learning and Motivation, pp. 47–89. Academic Press, London (1974)
39. Baddeley, A.: The Psychology of Memory. In: Baddeley, A., Kopelman, M., Wilson, B. (eds.) The Essential Handbook of Memory Disorders for Clinicians, ch. 1. John Wiley & Sons, Chichester (2004)
40. Silver, H., Feldman, P., Bilker, W., Gur, R.C.: Working Memory Deficit as a Core Neuropsychological Dysfunction in Schizophrenia. American Journal of Psychiatry 160, 1809–1816 (2003)
41. Marusiak, C., Janzen, H.: Assessing the Working Memory Abilities of ADHD Children Using the Stanford-Binet Intelligence Scales. Canadian Journal of School Psychology 20(1-2), 84–97 (2005)
42. Swanson, H.: Individual Differences in Working Memory: A Model Testing and Subgroup Analysis of Learning-Disabled and Skilled Readers. Intelligence 17(3), 285–332 (1993)
43. Naumann, J., Richter, T., Christmann, U., Groeben, N.: Working Memory Capacity and Reading Skill Moderate the Effectiveness of Strategy Training in Learning from Hypertext. Learning and Individual Differences 18(2), 197–213 (2008)
44. Chin, J., Fu, W., Kannampallil, T.: Adaptive Information Search: Age-Dependent Interactions Between Cognitive Profiles and Strategies. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2009), Boston, MA, USA, April 04-09, pp. 1683–1692. ACM, New York (2009)
45. Fairweather, P.: How Younger and Older Adults Differ in Their Approach to Problem Solving on a Complex Website. In: Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2008). ACM Press, New York (2008)
46. Mata, R., Wilke, A., Czienskowski, U.: Cognitive Aging and Foraging Behavior. J. Gerontol. B Psychol. Sci. Soc. Sci. 64B(4), 474–481 (2009)
47. Hanson, V.: Influencing Technology Adoption by Older Adults. Interacting with Computers 22, 502–509 (2010)
48. Paxton, J., Barch, D., Racine, C., Braver, T.: Cognitive Control, Goal Maintenance, and Prefrontal Function in Healthy Aging. Cerebral Cortex 18(5), 1010–1028 (2008)
49. Anderson, J., Reder, L., Lebiere, C.: Working Memory: Activation Limitations on Retrieval. Cognitive Psychology 30, 221–256 (1996)
50. Anderson, J., Bothell, D., Lebiere, C., Matessa, M.: An Integrated Theory of List Memory. Journal of Memory and Language 38, 341–380 (1998)
51. Huss, D., Byrne, M.: An ACT-R/PM Model of the Articulatory Loop. In: Detje, F., Doerner, D., Schaub, H. (eds.) Proceedings of the Fifth International Conference on Cognitive Modeling, pp. 135–140. Universitäts-Verlag Bamberg, Bamberg (2003)
Digitizing Interaction: The Application of Parameter-Oriented Design Methodology to the Teaching/Learning of Interaction Design

Shu-Wen Tzeng

Department of Industrial and Graphic Design, Auburn University, Auburn, Alabama 36849, USA
[email protected]

Abstract. The development of digital technology has changed the way users interact with products and forced industrial design educators to rethink the role of design education with respect to both the integrity and suitability of the current design curriculum. This study seeks a better way of teaching/learning Interaction Design in the discipline of Industrial Design, with consideration of the nature of interaction design and students' learning mode. A newly created interaction design methodology is introduced in this paper, and a case study of the application of this approach in a graduate-level interaction design course illustrates how the methodology can be applied in the development of an interaction design, making the teaching/learning of Interaction Design more effective and enjoyable.

Keywords: Interaction Design, Design Methodology, Design Education.
1 Introduction

In his article "Tomorrow's Design Landscape: Anywhere Interactions", Bill Mak, an innovation interaction-design entrepreneur and the chair of IDSA's Interaction Design Professional Interest Section, stated that "Three possibilities for innovation – mobile high-speed Internet, contextual awareness and anywhere physical interactions – will change the world and professional design practice." He also foretold, "The third possibility suggests that digital convergence will not just take interaction design to a dramatically broader industrial ecosystem of electronics, machines, objects and devices but also promote the digital evolution of many traditionally non-digital product areas." [1].

To date, the development of digital technology has opened many opportunities to integrate diverse technologies and new interaction paradigms into many categories of products, and may promote innovation in many non-digital products. Hence, more and more industrial designers of digital technology products no longer regard their job as designing a physical object but as designing the interactions between the user and the product. In other words, the boundary between hardware and software has become blurred, and the demand for good user-product interaction has never been so acute.
In school, the changes in technology are also forcing design educators to rethink the role of design education with respect to both the integrity and suitability of the current design curriculum. It is apparent that we are now at the leading edge of a wave of technological change that will affect all aspects of everyday life in a profound way, and the next generation of designers will need new skills and knowledge to negotiate this new terrain. As the focus of product design has shifted from physical functionality and aesthetics to user interfaces, and finally to ubiquitous interactive paradigms, industrial design students must have a solid appreciation of Interaction Design topics to succeed. Understanding Interaction Design will allow designers to create products that are usable by everyone, extending the impact of user-product interaction to a diverse set of users within many domains.

Yet industrial design educators are now struggling with teaching Interaction Design due to the lack of methods that adapt to students' learning mode (in this case, the Net Generation's learning mode) and are easy to learn and quick to examine. Because the design of an interaction interface often depends on both the author's thinking and the presumed audience's experience, educators must equip their students with the ability to solve problems by using procedures and analytic methods. The solution to this problem relies on educators' extensive study of Interaction Design and continuous attempts at creating effective teaching strategies. Therefore, this study seeks a better way of teaching/learning Interaction Design in the discipline of Industrial Design. The goal is to equip the next generation of industrial designers with the ability to tackle people's substantial needs for interaction.
2 Interaction Design in the Field of Industrial Design

2.1 The Nature of Interaction Design

According to Bill Moggridge, who coined the term "interaction design" together with Bill Verplank in the late 1980s, Interaction Design is the study of devices with which a user can interact [2]. In other words, Interaction Design defines the behavior (the "interaction") of an artifact or system in response to its users. Because the behavior of an artifact is often embodied in both the abstract content and the concrete form of an interface design, it makes sense to assume that any interaction design is composed of the user, the content, and the form of design, as shown in Figure 1.
Fig. 1. The three main components in an interaction design
These three components (the user, the content, and the form of design) are both independent of and correlated with each other: the content and the form of design can influence the user's experience and productivity, while having no power to change the user's profile or intentions. The user, without doubt, is the center of an interaction design, since the ultimate goal of Interaction Design is to create a great user experience. Note that the user consideration here is focused more on users' intentions than on users' needs: the way a user chooses to interact with an interface is often determined by his/her purpose or intentions in accessing the content.

The second consideration, the content, refers to the services provided by an artifact. The content is collected and delivered in different ways based on the pre-determined functions. Because Interaction Design is often looking at the user's productivity rather than that of the product, it is more interested in the delivery strategy of the content than in the content itself. In their design research, Blair-Early and Zender find that the type of content delivery is one of the most important factors contributing to the quality of an interaction design, and has more power to influence the end-user experience than the content itself [3].

The form of design, often referred to as an interface, is the means by which users interact with content for a purpose. The form can be presented through different media (from a physical object to a gesture-oriented interaction) and exists in various kinds of products in our everyday life. As the number of interfaces and the diversity of users grow, the need for effective interaction design increases. Therefore, the form of design has gradually become the main focus of an interaction design. As interaction design becomes a new terrain for industrial designers, educators are under an obligation to teach interaction design to their students in an effective way.

2.2 Teaching Interaction Design to Industrial Design Students

Similar to other interaction-design-related disciplines, the interaction design curriculum in industrial design begins with design research, which investigates user, process, cultural context, form, and content, and follows with an iterative design process including idea exploration, design execution and design validation. Technically, this is a process that moves from collecting abstract information (often involving the understanding of user behaviors and goals), to defining content functions (within a conceptual framework and with an emphasis on the values of the content), and finally to constructing the concrete structure of a user interface design. This process can be presented in the structure shown in Figure 2.

Different from other disciplines, industrial design focuses more on the communication between the user and the artifact through the form of design, often in the fashion of visual display of quantitative information and visual explanations. It has long been believed that "design principles" can help designers achieve an effective interface [3], [4], [5], [6], [7], [8]. Hence, many interaction design professionals have attempted to improve interface design primarily by exploring and analyzing existing forms (patterns) of interface design, or by providing design principles to guide the design of interaction.
Two of the most famous examples are the 1992 version of the "Macintosh Human Interface Guidelines" published by Apple Computer, which suggests thirty-eight index entries for icons in an interface design [4], and the Nielsen Norman Group's 106-page report, "Site Map Usability", which delivers twenty-eight guidelines "to improve an interaction design" [9].
Fig. 2. The general process and factors in an interaction design
These design principles have two problems: they are too vague to be useful, especially for design students or junior designers; and, as final experiences, they provide no indication of how they may be achieved. Therefore, there is a need for design educators to figure out a better way to teach interaction design, especially regarding how design principles can be applied to an interaction design. What follows is a description of the proposed design methodology, which has been used for teaching/learning interaction design since 2008. This design method was created with consideration of students' learning mode and of ease of examination for future design validation, making learning more efficient and effective.
3 The Idea of Parameter-Oriented Design Methodology

The Parameter-Oriented Interaction Design Methodology is partly inspired by the parameter concept proposed by Blair-Early and Zender [3], and partly inspired by the digital technology that is extensively utilized in the everyday life of current college students. Many current university students belong to the 'Net Generation', a label used to describe today's young adults. This group of individuals, born between 1980 and 1994 [10], is characterized by familiarity with and reliance on digital technologies. A number of social psychologists have argued that the digital culture in which the Net Generation have grown up has influenced their preferences and skills in a number of key areas related to education. For example, the Net Generation are said to prefer receiving information quickly; to expect immediate answers; to prefer multi-tasking and non-linear access to information; and to rely heavily on communications technologies to access information and to carry out social and professional interactions [11], [12]. Compared to analog signals, digital signals are easier to recognize, define, and manipulate, meeting the needs of the Net Generation in terms of simplicity and speed [13].
A digital signal takes the form of a parameter, a value computed from recorded data to represent its relative standing in a system. Therefore, the Parameter-Oriented Design Methodology should be an ideal way of teaching/learning interaction design.

In their research on identifying user interface design principles for interaction design, Blair-Early and Zender propose a set of "parameters" for achieving an effective interface. They believe that principles in isolation do not provide sufficient guidance to inform design decisions; only when parameters and principles work together can they drive innovation and empower designers. Based on the three-component model of an interaction design, Blair-Early and Zender define the "parameters" that govern an effective interface as:

User Intention: Hunter – Browser.
Content Delivery: Reference – Educational – Inspiration – Entertainment.
Interface Type: Linear – Hierarchy – Matrix – Web.

Note that the Interface Type here refers only to the structure of an interface design rather than the form of an interaction design; hence, only the parameters defined for user and content are considered in this design methodology. These parameters are described below.

User Intention. The Hunter: the hunter is focused, precise and often destination-driven. The hunter values the speed and efficiency of an interface, and rarely deviates from its initial content direction to discover a new path. A hunter may scan large quantities of information quickly in order to find predetermined target information or content. The Browser: the browser is intent on the journey and, in many cases, may not have a final content destination in mind. The browser is less focused and driven in the search for content, and more likely to be open to new experiences.

Content Delivery Strategy. Reference: a content delivery strategy designed to serve discrete bits of information to users. The reference source is driven to provide as much information as possible in as few steps as possible. Educational: a content strategy designed to instruct, often in a step-by-step fashion. Educational content is driven to educate its audience. Inspiration: a content strategy designed to motivate or inspire. Often, the inspirational source has a more personal connection to the audience through calls to action and directives. The inspirational source derives its trust through emotional response and personal connection rather than through factual data. Entertainment: an entertainment delivery strategy is designed to amuse and geared to draw a browser audience. It establishes a more direct connection with the audience and requires direct participation from the user. Entertainment sources are the most open to interpretation, and may even require audience participation in establishing the content.

It is important for designers to define these parameters before working on design exploration. Because these parameters are derived from research results, identifying design parameters can help designers clarify some essential questions. Once these parameters are identified, designers can define the level of interaction, based on their knowledge of user intentions and content delivery strategies, by applying appropriate design principles to their design, making the interaction design more effective and relevant. Note that the level of interaction (the desired experience) is the key in this design methodology; it should be considered within each single design principle. The structure of the Parameter-Oriented Design Methodology is therefore presented in the following figure (Figure 3):
Fig. 3. The structure of the Parameter-Oriented Design Methodology
At the very center of this model is the user intention, which is classified into two parameters (the hunter and the browser), followed by the consideration of content strategy, which is divided into four main types: reference, educational, inspiration, and entertainment. All of the design principles are used to guide design decisions by defining the appropriate level of interaction in the fashion of digital parameters. Take "the principle of feature exposure" and "the principle of user autonomy" as examples: defining the form of an interaction design for a browsing user experiencing entertainment content relies on an understanding of both user intention and content strategy. Based on the gathered information, it is apparent that intensive feature exposure is important, and that more authorship is required, to achieve an effective interaction design in this kind of scenario. Therefore, if these design elements are presented on a scale from 1 to 9, the design decisions can be illustrated in the following chart:
Fig. 4. The design parameters identified by interaction designers based on the considerations of user intention, content strategy, and level of interaction
Note that the decisions related to "Feature Exposure" and "User Autonomy" align roughly with the interaction parameters above them, so that each design decision is taken in reference to the specific considerations being addressed. This process is repeated with each design principle in an iterative fashion. In practice, consideration of just a few principles generally leads to a design theme or system that encompasses the other principles. With great flexibility in accounting for all the relevant factors in an interaction design, the Parameter-Oriented Design Methodology is both easy to practice and inviting of invention and innovation. Recipes for making an interaction design are replaced by guidance in establishing landmarks and contexts for a particular user intention and content strategy, making the teaching/learning of interaction design more effective and enjoyable.
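One way to treat the chart in Figure 4 operationally is as a small lookup structure mapping the two identified parameters to target levels on the 1-9 scale for each principle. The sketch below is one possible encoding for illustration only; the specific numbers are invented and are not values prescribed by the methodology.

# Illustrative encoding of the PODM decision chart; the levels are
# invented examples on the 1-9 scale, not prescribed values.
PARAMETER_CHART = {
    ("browser", "entertainment"): {"feature_exposure": 8, "user_autonomy": 3},
    ("hunter", "reference"):      {"feature_exposure": 3, "user_autonomy": 2,
                                   "clear_landmarks": 8},
}

def design_levels(user_intention, content_strategy):
    """Look up the target interaction level for each design principle,
    given the two identified parameters."""
    key = (user_intention, content_strategy)
    if key not in PARAMETER_CHART:
        raise KeyError("no levels recorded for %r" % (key,))
    return PARAMETER_CHART[key]

# A hunter on a reference interface: low autonomy, prominent landmarks.
print(design_levels("hunter", "reference"))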
4 The Application of Parameter-Oriented Design Methodology to the Teaching/Learning of Interaction Design

4.1 The Execution of Parameter-Oriented Design Methodology in Design Projects

This section describes how the Parameter-Oriented Design Methodology was applied to a social networking device design conducted by two industrial design graduate students. The Friend Finder was designed for a hunter persona within a reference interface concept. Based on the students' understanding of user intention and content strategy, the result of defining the design parameters is illustrated graphically below:
Fig. 5. Student example, Brian Williams and Lauren Weigel. Design parameters were placed within a matrix based on the user intention and content strategy and how closely they relate to the search query.
The goal of this project was to develop features and systems within a person's mobile ecosystem that better enable them to manage a connection to information and others, in order to create more meaningful destinations. Because the hunter requires immediate information with the least amount of input, and the reference device aims to give the user a way of navigating and connecting within their social network, the design parameters consist of relatively low user autonomy, numerous clear landmarks with a consistent design language, and conventional visual metaphors. Some interface designs from this project are presented in Figure 6.
Fig. 6. Student example, Brian Williams and Lauren Weigel. Friend Finder interaction design and process of use.
4.2 Educational Results and Student Benefits of the Application of PODM to the Design of Interaction

Because the PODM application project followed a case study project, students were provided with enough background information to understand the dilemmas in an interface development process. Since the guidance was clearly prescribed and based on the parameters of an interaction interface, it became much easier for design students to create their first interaction design work. The produced prototypes were published in .exe format and sent out to potential users for design evaluation. Although the effectiveness of the designs was hard to verify due to the small sample size, the design students received a great deal of feedback and insight for future design improvement.

When asked to respond on a five-point Likert scale to broader questions about this design methodology in an end-of-semester learning evaluation, most students either strongly agreed or agreed that the PODM is valuable for training the decision-making process, useful for guiding project actions, and effective in enhancing their understanding of Interaction Design. Although a relevant and effective interaction design may take more effort and time for design testing and refinement, the design students in this class have built strong confidence in design exploration and decision-making, preparing them for their future careers.
5 Summary

As mentioned before, without more detailed knowledge of the effects of executing an interaction design, design principles can only be applied intuitively, which is difficult for design students. As a result of applying the Parameter-Oriented Design Methodology to projects conducted by design students, it is clear that more precisely defined parameters for the form of interaction design are needed in order to create distinct possibilities aimed toward a target experience. In addition, the integrated approach of the PODM, combining user, content, and form, should be comprehensive enough to guide design students or practicing designers in making relevant design decisions more effectively. Even so, more work needs to be done: continued research on how to convert intuition into significant knowledge, so as to make interaction design education more effective, is of enormous need in this digital era. Design educators are now at the leading edge of a wave of not only technological but also educational change that will profoundly affect all aspects of how a product can be designed.
References

1. Mak, B.: Tomorrow's Design Landscape: Anywhere Interactions. Innovation Journal, IDSA, Dulles, 43–49 (2009)
2. Moggridge, B.: Designing Interactions. The MIT Press, Massachusetts (2007)
3. Blair-Early, A., Zender, M.: User Interface Design Principles for Interaction Design. Design Issues 24(1), 85–107. The MIT Press, Massachusetts (2008)
4. Apple Computer Inc.: Macintosh Human Interface Guidelines. Addison-Wesley, Boston (1992)
5. Tufte, E.: Visual Display of Quantitative Information and Visual Explanations. Graphics Press, CT (1997)
6. Cooper, A., Reimann, R.: About Face 2. Wiley, Boston (2003)
7. van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites. Addison-Wesley, Boston (2003)
8. Tognazzini, B.: Maximizing Human Performance. Nielsen Norman Group, http://www.asktog.com/basics/03Performance.html
9. Nielsen Norman Group: Site Map Usability. Fremont, California (1998)
10. McCrindle, M.: New Generations at Work: Attracting, Recruiting, Retaining & Training Generation Y, p. 7. McCrindle Research, Sydney (2006)
11. Frand, J.L.: The Information-Age Mindset: Changes in Students and Implications for Higher Education. EDUCAUSE Review 35, 15–24 (2000)
12. Prensky, M.: Digital Natives, Digital Immigrants. On the Horizon (2001)
13. Befooty, C.: What are the Advantages and Disadvantages of Analog vs. Digital Communication?, http://www.answerbag.com/q_view/31169#ixzz12pAElagr
A Study on a Usability Measurement Based on the Mental Model

Yuki Yamada, Keisuke Ishihara, and Toshiki Yamaoka

930, Sakaedani, Wakayama City, Wakayama, 640-8510, Japan
[email protected]

Abstract. When there is a distance between the user's model and the designer's model, the system is not good. We constructed a scale based on six viewpoints of similarity and put it into practice. The results suggest that it is possible to roughly measure satisfaction with usability. If we complete this study, the method will be useful for all interface designers and usability engineers.

Keywords: Mental Model, Similarity, Usability, Interface.
1 Introduction

When users perceive a cognitive distance, while using a system, between their own image of it (the User's Model) and the system itself (the Designer's Model), they feel that the system is bad (Fig. 1). This own image is called the Mental Model [1]. The purpose of this study is to measure that cognitive distance. The Mental Model is incomplete, unstable, unscientific, and ambiguous [2]; because of these features, it is difficult to determine what the Mental Model is. However, users must have viewpoints by which they perceive the cognitive distance. We therefore focused on the similarity between models.
Fig. 1. Cognitive Distance
2 Investigation

2.1 Purpose

The purpose of this first investigation was to find viewpoints on cognitive distance, and to make an itemized list for product evaluation.
2.2 Materials and Methods

Participants were six students (including one of the authors) who study cognitive ergonomics and interface design. We prepared six different digital cameras as representative examples of user interfaces. First, the participants operated the cameras and built Mental Models to understand each system. Second, they wrote out similar points among the cameras (Fig. 2). Third, they presented their own viewpoints and grouped the opinions (Fig. 3). Finally, the points of similarity were consolidated into six items.
Fig. 2. Writing out viewpoints
Fig. 3. Grouped viewpoints
2.3 Results

We made the following (provisional) scale for measuring similarity. Each item is rated on a five-point scale from 1 (disagree) to 5 (agree):

1. Appearance of indication is good.
2. Structure of indication is good.
3. Ordering of indication is good.
4. Shapes of operating parts are good.
5. Layouts of operating parts are good.
6. Steps of operation are good.
3 Experiments

3.1 Purpose

The purpose of the experiments was to use the scale in practice and to check its function. We also looked into the importance of each item in the scale.
3.2 Materials and Methods

Participants were 23 students (average age: 22.7, SD: 1.2; 15 male, 8 female). We prepared 3 types of digital cameras, 3 types of cell phones and 3 types of electronic dictionaries. After hearing an explanation, the participants repeated the following process for each product. Finally, we interviewed them about their experiences of using the products.

1. The participants performed 3 tasks with the product (Fig. 4). We asked them to talk about their ideas concerning the products, in order to observe their cognitive processes (protocol analysis).
2. The participants filled in two evaluation forms (Fig. 5). One was the similarity form; the other was the SUS (System Usability Scale) form [3].
Fig. 4. Operation
Fig. 5. Evaluation
3.3 Result 1: Observation

Many kinds of cognitive behavior were observed; we will not take up individual examples in detail. Small problems, however, occurred early: for example, some users made no clear judgment, and some suspended their operation. In the last stage of a problem, they lost track of where they were (situation awareness [4]) and operated the system randomly. Some unconscious errors were also observed.

3.4 Result 2: Comparing the Similarity Measurement and SUS

The relation between the total similarity score (the average of the six similarity scores) and the SUS score was investigated with Pearson's correlation coefficient test. The result revealed a strong, statistically significant correlation between the total similarity score and the SUS score (r = 0.8023).
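For concreteness, the sketch below shows how such a comparison can be computed: the standard SUS scoring rule (odd items contribute rating − 1, even items 5 − rating, and the sum is scaled by 2.5) and a plain Pearson correlation. This is our illustration of the reported comparison, not the authors' analysis code, and the paired scores are invented rather than the study's data.

def sus_score(responses):
    """Standard SUS scoring for the 10-item form, each item rated 1-5;
    the result lies on a 0-100 scale."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r   # i = 0, 2, ... are items 1, 3, ...
                for i, r in enumerate(responses))
    return total * 2.5

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 5, 1]))   # 85.0
similarity = [4.2, 2.8, 3.5, 4.8, 2.1]             # invented mean similarity scores
sus = [78.0, 45.0, 60.0, 85.0, 38.0]               # invented SUS scores
print(round(pearson_r(similarity, sus), 4))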
(>55 years old group), previous experience in using mobile phones (60%), previous experience in using computers (70%), previous experience in using the internet (40%), previous experience with AT (50%), and participation in similar activities (30%). Finally, it should be mentioned that, from the early beginning of ÆGIS, it was decided that 50% of the same users would be engaged across all evaluation rounds and studies.

5 Conclusions

The current paper presents the evaluation framework established in the ÆGIS FP7 project, in the context of its iterative UCD plan, aiming to support the testing activities that have been and will be performed in three evaluation rounds and one final demonstration phase. The focus has mainly been the first evaluation round of the project, in the context of which 10 OSS prototypes/solutions have been tested. The experimental plans, the testing, and the participant recruitment and ethics processes followed have been explained in short. The first evaluation round has been completed, and the consolidated results, from both the technical validation and the human factors assessment testing with users, are available. The 10 prototypes have been evaluated in total with 185 users with impairments and 61 other users, representing various types of experts (e.g. tutors, accessibility evaluators, consultants, etc.) and developers. Tests have been conducted across 6 test sites (and 4 countries), namely in Belgium by EPR and KUL, in Spain by FONCE, in Sweden by SU-DART and in the UK by ACE and RNIB. It should be noted that, although a very specific experimental plan and measuring tools were developed for each of the solutions tested, the overall approach followed has been common across all prototypes and all test sites, in order to allow valuable comparisons in the later analysis of the results. Contextual inquiry, naturalistic observation techniques, performance testing and focus groups were the evaluation techniques deployed for the first evaluation round.

The evaluation framework presented in this paper will most probably be revisited, according to the feedback derived during the first tests; the same is valid for the supporting measuring tools.
As such, the tests of the first round (and of each subsequent round) will serve not only for the optimisation of the accessible solutions tested, but also for the optimisation of the evaluation framework and plan itself. To conclude, it should be noted that the usefulness of the evaluation framework established for the needs of the ÆGIS project is not limited to the relatively narrow context of the project; on the contrary, it should be seen as a useful guide for testing in the whole eInclusion area.

Acknowledgments. This work was partially funded by the EC FP7 project AEGIS - Open Accessibility Everywhere: Groundwork, Infrastructure, Standards, Grant Agreement No. 224348.
References

1. Annex I - "Description of Work", ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing, September 8 (2008), http://www.aegis-project.eu/
2. Bekiaris, E., G(k)emou, M., et al.: ÆGIS Ethics Manual, Deliverable 5.6.1 (revision for M18), ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (May 2010)
3. G(k)emou, M., Bekiaris, E., et al.: Novel Framework and Supporting Material for the Inclusive Evaluation of ICT, Deliverable 1.5.1, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (May 2010)
4. Pajo, S., et al.: User Groups' and Stakeholders' Definition and UCD Implementation Plan, Deliverable 1.1.1, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (February 2010)
5. Sulmon, N., et al.: Analysis of the Context of ICT Use, Deliverable 1.1.2, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (February 2010)
A Method to Automate the Ranking of Web Pages According to User Defined Accessibility Ratings

Alice Good

School of Computing, University of Portsmouth, PO1 3AE, UK
[email protected]

Abstract. This paper presents the final results of an investigation into a means of automating the rating of web pages according to their accessibility to specific user groups. These groups include users with visual impairments, restricted mobility, and dyslexia. The research comprised three integrated, user-centred studies that assisted in the development of this work and collected data that will help to develop a better method for disabled users to search for and easily locate accessible web pages. It investigated how web pages can be rated for accessibility using algorithms designed according to user-defined ratings of accessibility. The results presented in this paper demonstrate that re-ordering search results, by ranking web pages according to user-defined ratings, could provide a better user experience for people with disabilities.

Keywords: Accessibility, Algorithms, Disabilities, HCI.
1 Introduction

While assistive and adaptive software have certainly improved access to web-based information for users with disabilities, in terms of equality of access there is still no comparison with non-disabled users. The problem itself is widely recognised, with numerous interest groups researching new techniques and introducing legislation intended to provide better access to future web-based information. However, despite the availability of standards and guidelines, there is still a significant shortfall in compliance. It is this shortfall that denies people with disabilities equal access to information. Where non-compliance with design standards prevails, there will always be users who are faced with barriers. Learning potential, inclusion and empowerment are all affected by inaccessible web pages. Selecting web pages according to accessibility could help to reduce the amount of inaccessible information presented to the user and thus increase the value of the World Wide Web.

Access to information is notoriously inadequate for disabled people; both current and earlier research confirms this [1], [2], [3], [4]. In recent years, the medium has expanded to provide new services, such as social networking sites including Facebook, Bebo and MySpace. However, according to AbilityNet, disabled users still find themselves excluded due to design problems [5]. While there are a number of contributing factors, the fundamental cause can be traced back to poor web design.
Certainly, there is a marked failure to consider the needs of all users and to incorporate the necessary adjustments into the design process. Consequently, poor design leads to poor accessibility [6], [7].

With age comes increased disability: in the elderly, vision and mobility deteriorate with time, so with population aging on the increase, the severity of this problem continues to grow, and a greater percentage of people alive at this time are likely to be experiencing age-related impairments than in the past. There is a definite need to 'bridge the gap' between elderly users and the Internet; with almost a fifth of the UK population aged over 65, it is a significant market to exclude [8]. Many of these people make use of the Internet in controlling their finances, keeping up to date with current affairs and staying in contact with family and friends. The Internet can then be said to be a valuable tool in reducing the risk of social exclusion, a concern that many of the elderly face. Given that the elderly are very likely to develop age-related impairments, such as visual and/or mobility-related disabilities, web accessibility becomes an important consideration. The initiatives may be in place to encourage the elderly to get online, but poor web accessibility is likely not only to affect users' ability to access information but also to create a less than favourable web experience.

Better solutions are therefore needed to improve accessibility for these user groups. Much research has been dedicated to devising ways of modifying content according to users' needs. The focus has been placed stringently upon the alteration of content, a process that retains a certain amount of fallibility due to its dependence upon correctly coded pages, i.e. web pages written with the relevant HTML, for example 'alt' tags, so that screen readers are able to inform the user. However, correctly coded accessible web pages can be difficult to find. There needs to be a practical and effective way to guide users to accessible content. One possible solution could be adapting the order of search results according to user needs. This would enable a means of retrieving accessible information best suited to the user's needs and disabilities. Furthermore, such a method would contribute towards a better user experience too.

1.1 Inter-related Studies

This research has featured a series of inter-related studies, which are outlined below:

Study One. In the initial requirements phase, user-defined ratings of web page elements that were known to affect accessibility for specific user groups were obtained [9]. The study included participants with a wide range of sensory, physical and cognitive impairments. Participants were asked to specify and rate, with a scoring of one to three, the elements they felt affected the accessibility of web pages. Many of the elements that were rated had previously been specified by the WAI; this is particularly true for the visually impaired participants.

Study Two. These ratings formed the basis of a set of algorithms, one per user group, built from the user-defined elements of web pages that reduce accessibility, as outlined in study one [9]. The objective of each algorithm was to analyse the accessibility of a web page for a specific user group and then assign a rating based upon the page's accessibility.
Using these algorithms, users can enter a search query and Web results are then re-ordered according to how accessible they are to a particular user group.
Study Three. The focus of this paper is to present the final stage of this exploratory study, namely study three. The aim of this study was to evaluate the effectiveness of the algorithms in re-ordering search results according to user-perceived accessibility, in comparison with the Google search engine. An empirical evaluation was conducted using the same participants as study one. The purpose was fundamentally to assess whether participants had a better Web experience using the Algorithm Based Search. 1.2 Algorithm Based Search The proposed system, known as Algorithm Based Search, incorporates a user model specific to the disability group as well as a web model. The information in the user model is used to determine which algorithm is used to rate the accessibility of a web page. The type of algorithm applied depends upon whether the user is blind, short-sighted, motor-restricted or dyslexic; the user supplies this information by specifying which of these user groups best fits their needs. A database acts as the web model, representing the accessibility of each page through attributes such as the URL and the accessibility rating of the page. Each web page is analysed according to the constraints incorporated within the algorithm. When a user submits a query, the system refers to the user model to establish which algorithm is assigned. The algorithm then individually analyses the first 20 web pages that the search engine returns and inputs the URL, together with the accessibility rating it has given the page, into the database. Following this, the system ranks the web pages according to the ratings assigned and presents the 20 pages to the user as an ordered list of URLs. Pages with high ratings are considered more accessible than those with low ratings.
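A minimal sketch of this re-ranking step is shown below (Python; the function names, the scoring stub and the result-fetching callables are hypothetical, since the paper does not publish the algorithms themselves):

from typing import Callable, Dict, List

def rate_for_short_sighted(html: str) -> int:
    # Stub: penalise elements this user group rated as barriers;
    # a fuller version is sketched in the Method section below.
    return 3 - ("font-size" in html)

# One rating algorithm per user group; the others would be defined analogously.
ALGORITHMS: Dict[str, Callable[[str], int]] = {
    "short-sighted": rate_for_short_sighted,
}

def algorithm_based_search(query: str, user_group: str,
                           fetch_results: Callable[[str], List[str]],
                           fetch_html: Callable[[str], str]) -> List[str]:
    algorithm = ALGORITHMS[user_group]   # the user model selects the algorithm
    urls = fetch_results(query)[:20]     # first 20 search-engine results
    web_model = {url: algorithm(fetch_html(url)) for url in urls}  # URL -> rating
    # Pages with high ratings are considered more accessible and are shown first.
    return sorted(urls, key=web_model.get, reverse=True)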
2 Method A simulator was built using the algorithms created from the user-defined ratings of accessibility obtained in the first study of this research [9]. Each disability user group had its own algorithm: blind; short-sighted; motor-restricted; and dyslexic. The simulator was built in HTML and enabled a user to specify which user group they were in and the content of each of the two search queries (Scottish castles and health information). Twenty web pages were previously analysed according to the rating system incorporated within the algorithm. For example, for a short-sighted user, the user-defined rated elements from the first study that were found to affect accessibility were fixed font size and low contrast between background and font colour. The simulator allowed the user to switch between the set of results ordered by the applied algorithm and the set of results provided by Google. Each user set consisted of 10 search results per query. There were two search queries, so each user set had two sets of results, one for ‘Scottish castles’ and one for ‘Preventing asthma’. There was also the default set of search results provided by Google for each search query.
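For illustration, the short-sighted rating could combine these two elements roughly as follows. The relative-luminance and contrast-ratio formulas are the standard WCAG definitions; the regular expression, the 4.5:1 threshold and the one-to-three scoring are assumptions made for the sketch:

import re

def relative_luminance(r: int, g: int, b: int) -> float:
    # WCAG 2.0 relative luminance of an sRGB colour (channels 0-255).
    def channel(c: float) -> float:
        c /= 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b)

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    l1, l2 = relative_luminance(*fg), relative_luminance(*bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

def rate_for_short_sighted(css: str, fg: tuple, bg: tuple) -> int:
    # Start from the best score and subtract a point per barrier,
    # mirroring the one-to-three element ratings gathered in study one.
    rating = 3
    if re.search(r"font-size:\s*\d+(px|pt)", css):
        rating -= 1                    # fixed (non-resizable) font size
    if contrast_ratio(fg, bg) < 4.5:   # WCAG minimum-contrast threshold
        rating -= 1                    # low contrast between background and font
    return max(rating, 1)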
For this study, an empirical investigation was used to measure the effectiveness of the system in directing users to accessible pages. The system has been designed to improve the accessibility of search results for disabled users when compared to a traditional search engine. In this study, users were asked to rate only the first ten pages of the search results presented, as this amount was deemed a suitable sample; Google and most other online search engines limit the number of results to 10 per page. 2.1 Participants This study included 31 participants. It is possible to obtain excellent results using small sample groups, as research undertaken by Nielsen and Landauer (1993) reveals: sample groups consisting of no more than five people can yield meaningful data, and frequent tests with smaller groups can produce better results than large-scale studies [10]. Participants ranged in age from 16 to 65; both males and females were included. Computer skills ranged from novice to intermediate. All users reported that they experienced impairments that affect the ease with which they are able to access web-based information. 2.2 Procedure The participants were asked to perform two searches for specific information using the Internet. They were asked to enter each search query, first using the Algorithm Based Search and then using Google. Users were then asked to rate the pages, using a Likert scale, in terms of ease of access and suitability of content.
3 Results Respondents were asked to rate web pages in terms of ease of access and precision of content using both the Algorithm Based Search system and Google. Results indicated statistically significant differences in respondents’ perception of the ease of access and content of the web pages between Google search and the Algorithm Based Search. 3.1 Comparison of Responses The Likert-scale responses were converted onto a numerical scale, with ‘strongly disagree’ responses assigned the minimum weight of 1 and ‘strongly agree’ responses assigned the highest weight of 5. The differences in responses between the Algorithm Based Search and Google were calculated for each respondent and each web page. Differences in responses were recorded for each web page to analyse how the behaviour of the respondents changed as they recorded their responses for pages 1-10. A flat line above the zero mark would be the ideal outcome, as this would suggest that respondents consistently rated Algorithm Based Search results better than Google results. The mean, median and standard deviation were calculated for the difference in
responses for each web page. Figs. 1 and 2 show the mean, median and standard deviation of the difference in responses for each web page. Health Information Search. As Fig. 1 shows, the mean, median and standard deviation were generally positive, with pages 8 and 6 being the exceptions, where respondents felt that the Google search results were better than the Algorithm Based Search results. Also, the difference in responses was not consistent, with four crests and four troughs. Respondents reported particularly negative responses for web page 6, while responses for web page 8 were only marginally negative. However, the graph suggests that overall the respondents felt that the Algorithm Based Search results were better than the Google search results in terms of ease of access and suitability of content. Scottish Castle Search. Fig. 2 suggests that respondents felt positive about the algorithm-based results but that their responses turned negative as they moved down the search results. This may be because the Scottish castle web pages were very graphical, with many images; obviously, a high ratio of graphics to text creates accessibility issues for blind users. Because the algorithm checks for fonts and contrast, two attributes that do not affect images very much, the algorithm’s results further down the order may have been less appealing to the respondents than the Google results. Nevertheless, the respondents did report a significant and positive improvement in results with the algorithm.
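The per-page computation behind Figs. 1 and 2 can be sketched as follows (the two rating arrays are illustrative stand-ins, not the study's data):

import numpy as np

# Likert responses coded 1 ('strongly disagree') to 5 ('strongly agree'),
# one row per respondent, one column per page; illustrative values only.
abs_ratings = np.random.randint(1, 6, size=(31, 10))     # Algorithm Based Search
google_ratings = np.random.randint(1, 6, size=(31, 10))  # Google

diff = abs_ratings - google_ratings   # positive = Algorithm Based Search rated better
for page in range(10):
    d = diff[:, page]
    print(f"Page {page + 1}: mean={d.mean():.2f}, "
          f"median={np.median(d):.2f}, stdev={d.std(ddof=1):.2f}")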
Fig. 1. Health Information Search Statistics
Fig. 2. Scottish Castle Search Statistics
3.2 Statistical Testing of Results The Wilcoxon signed-ranks two-tailed test was also used to determine differences between users’ responses on which search method was more convenient to them, in terms of accessibility and in terms of finding the information they were looking for. This non-parametric test for significance between items was considered most appropriate because the measure uses an ordinal scale. The Wilcoxon signed-ranks test approximates a z test. To further determine the magnitude of differences, all analyses were conducted at the p < .01 significance level. We tested the hypotheses:
H0: the median response is 3, i.e. neither agree nor disagree;
H1: the median response is > 3, i.e. respondents felt a noticeable enhancement in results when searching with the Algorithm Based Search compared to searching with the Google search engine.
| | Median | Mean | Standard deviation |
| Health Information Search | 10 | 9.612903 | 3.116381 |
| Scottish Castle Search | 14 | 16.22581 | 8.452651 |
The Wilcoxon signed-rank test was then performed using SPSS version 11.5.
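In a modern toolchain the same one-sided test could be run as follows (a sketch; the responses array is an illustrative stand-in for one set of Likert ratings):

import numpy as np
from scipy.stats import wilcoxon

# Likert ratings on the 1-5 coding; illustrative values, not the study's data.
responses = np.array([4, 5, 3, 4, 4, 5, 2, 4, 5, 4])

# H0: median response = 3 ('neither agree nor disagree')
# H1: median response > 3 -> test the shifted sample against zero.
stat, p = wilcoxon(responses - 3, alternative="greater")
print(f"W = {stat}, p = {p:.4f}")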
Health Information Search. The results obtained for the health-related query were: W- = 72.50, W+ = 227.50, N = 24, p <

Remote Evaluation of WCAG 2.0 Techniques by Web Users
C. Power et al.

Technique H24: Providing text alternatives for the area elements of image maps
<area ... alt="Plan of the Upper circle" />
<area ... alt="Plan of the Royal Circle" />
<area shape="rect" coords="4,192,397,368" href="#" alt="Plan of the stalls" />
Technique G53: Identifying the purpose of a link using link text combined with the text of the enclosing sentence. The link implemented in this technique, “click here”, requires the preceding text in the same sentence to identify that it points to a page about becoming a member of the North Yorkshire Palladium.
To become a member of the North Yorkshire Palladium, click here.
Technique H33: Supplementing link text with the title attribute. This technique was implemented with link text saying “It’s behind you”, but its purpose is complemented by the title attribute “Find out more about this year’s pantomime”.
It's behind you!
Technique C7: Using CSS to hide a portion of the link text. A link was implemented with a long text, part of which was hidden with CSS. The link only showed visually “Lennie & George: Steinbeck’s Greatest
Creations”, while the text “A news story on the upcoming production of Mice & Men” is hidden by CSS.
HTML (the enclosing anchor element, stripped in extraction, is reconstructed here; the original href is not shown in the source):
<a href="...">
  <span>A news story on the upcoming production of Mice & Men </span>
  Lennie & George: Steinbeck's Greatest Creations
</a>
CSS:
#wcag_links a span {
  height: 1px;
  width: 1px;
  position: absolute;
  overflow: hidden;
  top: -10px;
}
Technique H77: Identifying the purpose of a link using link text combined with its enclosing list item. The link text “More information” has its purpose described by the combination of its text and the content of the list item in which it is enclosed. This link brings up more information about the production “Peter Pan”.
- The Crucible - More information
- Peter Pan - More information
- A Midsummer Night's Dream - More information
Technique H78: Identifying the purpose of a link using link text combined with its enclosing paragraph. This technique was implemented by means of a link as a separate statement at the end of a paragraph, with the context of the link being provided by the enclosing paragraph.
The North Yorkshire Palladium has recently been nominated for two prestigious Luvvie Awards. ... The awards are voted on by the general public. Find out more information as well as how to vote.
Technique H79: Identifying the purpose of a link using link text combined with its enclosing table cell and associated table headings. The link with text “£60” needs to be associated with a vertical header describing the categories of membership (Silver) and the place in which the seats at this price are located (Royal Circle). Failing to identify either of these would cause a misunderstanding about the discounts or about where the seat is located.
| | Bronze | Silver | Gold |
| Upper Circle | £10 | £40 | £70 |
| Royal Circle | £30 | £60 | £90 |
| Stalls | £20 | £50 | £80 |
Technique H80: Identifying the purpose of a link using link text combined with the preceding heading element. This technique was implemented with a link in a list of “benefits of becoming a member”. The link text has to be read considering the preceding heading in order to convey the message that “10% off food and drink” is only given to members. Benefits of becoming a member
- ...
- ...
- 10% off food and drink at the Bar whenever you visit.
- ...
- ...
Technique H81: Identifying the purpose of a link in a nested list using link text combined with the parent list item under which the list is nested. The link used for this technique pointed to a page with information about the production “To Kill a Mockingbird”. It was nested within a list of current productions, inside an inner list of links specific to “To Kill a Mockingbird”, with the text “About this production”.
To Kill A Mockingbird (11th February – 26th February) - About this production
- Meet the cast
- Book tickets
4 Results and Discussion The results from the remote evaluations are presented in Table 1. This table presents the techniques in order of the percentage of participants who succeeded at the task. The participants were asked to record their confidence in their answer regarding the link destination and the difficulty they had in determining that destination. The mean confidence rating, where a value of 1 indicates high confidence and 5 indicates very low confidence, is presented in column 3. Similarly, the mean difficulty of the task, where 1 indicates that the task was very easy and 5 indicates very difficult, is presented in column 4. There was a highly significant difference between the percentages of participants who correctly identified the link target destination for the different techniques (chi-square = 76.01, df = 10, p < 0.01). Assuming that the target is to have 100% of users be able to determine the link target destination through conformance to SC 2.4.4, from this table we see that only one technique comes close to meeting that target. Only the general technique, G91, where the link is implemented as plain text in an anchor element, succeeds for close to all participants. If we relax the criterion for deciding whether a technique has been successful in meeting the SC to a success rate greater than 75%, then the first seven techniques presented in the table are successful. Table 1. Results from the evaluation of Success Criterion 2.4.4 techniques
| Technique | % of all participants giving appropriate information (Rank) | Mean confidence rating (Rank) | Mean difficulty rating (Rank) |
| G91 | 96.0 (1.0) | 1.48 (1.0) | 1.25 (1.0) |
| G53 | 92.0 (2.0) | 1.55 (4.0) | 1.71 (3.0) |
| H30 | 88.0 (3.5) | 2.18 (8.5) | 1.85 (5.0) |
| H77 | 88.0 (3.5) | 1.91 (5.0) | 1.91 (7.5) |
| H81 | 84.0 (5.0) | 1.52 (3.0) | 1.86 (6.0) |
| C7 | 80.0 (6.0) | 1.50 (2.0) | 1.55 (2.0) |
| H78 | 76.0 (7.0) | 1.94 (6.5) | 1.78 (4.0) |
| H24 | 64.0 (8.0) | 1.94 (8.5) | 1.94 (9.0) |
| H79 | 44.0 (9.5) | 3.09 (11.0) | 2.64 (10.0) |
| H80 | 44.0 (9.5) | 2.18 (8.5) | 1.91 (7.5) |
| H33 | 20.0 (11.0) | 2.60 (10.0) | 2.60 (11.0) |
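The reported chi-square comparison can be sketched from the success percentages in Table 1. The sketch assumes 25 participants per technique (an assumption; all percentages are multiples of 4, which is consistent with n = 25), so the computed statistic may differ slightly from the reported 76.01:

import numpy as np
from scipy.stats import chi2_contingency

# Success percentages from Table 1, converted to counts of participants
# who did / did not identify the link target, assuming n = 25 per technique.
pct = np.array([96, 92, 88, 88, 84, 80, 76, 64, 44, 44, 20])
successes = np.rint(pct / 100 * 25).astype(int)
failures = 25 - successes

chi2, p, dof, _ = chi2_contingency(np.vstack([successes, failures]))
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4g}")  # df = 10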
There are several very interesting features in these results. First, it is notable that a technique that relies on the inclusion of a standard HTML attribute, that of the title attribute in a link, performed the worst out of all the techniques, even those that are much more conceptually complicated. This perhaps indicates that users do not use the title attribute in that way, or have the reading of the title attribute turned off in their assistive technology configurations. The poor showing of this technique, with only 20% of users understanding the target of the link, and with the average difficulty for that technique being the highest of all techniques, indicates that this is a very poor technique to recommend to web developers. In contrast, one of the key techniques widely reported to be very problematic does not confuse users as often as expected. The target destination of the “click here” link in technique G53 was correctly understood by users 92% of the time from the surrounding context. While falling short of the 100% goal, this result indicates that users have adapted to this commonly used technique, developing strategies to address the apparent shortcomings of the link itself; in this environment, the technique was successful. There is an interesting situation that presents itself when looking at technique H30, where the link destination is conveyed by the alternative text of the image encapsulated within a link anchor element. The participants were successful 88% of the time, lower than one would expect for such a simple technique. Further, the users exhibited extremely low confidence regarding their answers, and had moderately high difficulty in obtaining the correct answer. It is hypothesized that this is due to the users having a specific mental model regarding how alternative text should be used, specifically that it is used to describe an image. It is possible that the users, having encountered a destination for the link instead of a description, were unsure as to what the text referred to. This hypothesis warrants further investigation, possibly through the use of retrospective protocols to understand the mental models the users bring to surfing the web, and specifically to the use of links on the web.
5 Conclusions and Future Work This paper has presented a methodology for evaluating the success of web design and implementation techniques relating to WCAG 2.0 Success Criteria. This methodology is intended to be a lightweight, flexible means of obtaining data through remote evaluation, to ensure that techniques are contributing to the overall goal of making the web accessible to all people with disabilities. This methodology was applied to the techniques associated with Success Criterion 2.4.4, relating to users identifying link destinations. The results indicate that only one technique, that of having plain text contained within an anchor element, meets the criterion of creating an accessible link almost 100% of the time for visually disabled users. Other techniques met with varied levels of success. However, a number of simple-to-implement techniques resulted in either inaccessible links or links about which the users had low confidence regarding their destination. For the future, it is important that the web accessibility community begin to take an empirical, evidence-based approach to WCAG 2.0 Techniques, and work to identify which techniques work best for users.
References
1. Howe, J.: The rise of crowdsourcing. Wired, Issue 14.06 (2006), http://www.wired.com/wired/archive/14.06/crowds.html
2. U.S. Department of Health and Human Services (n.d.): Research-based web design and usability guidelines. U.S. Government Printing Office, Washington, D.C., http://www.usability.gov/guidelines/index.html
3. W3C: Web Content Accessibility Guidelines 1.0 (1999), http://www.w3.org/TR/WCAG10/ (last accessed on February 22, 2011)
4. W3C: Web Content Accessibility Guidelines 2.0 (2008), http://www.w3.org/TR/WCAG20 (last accessed on February 22, 2011)
5. W3C: Understanding Web Content Accessibility Guidelines 2.0 (2008a), http://www.w3.org/TR/UNDERSTANDING-WCAG20 (last accessed on February 22, 2011)
6. W3C: How to Meet the Web Content Accessibility Guidelines 2.0 (2008b), http://www.w3.org/WAI/WCAG20/quickref/ (last accessed on February 22, 2011)
Embedded Cultural Features in the Design of an Accessibility Agent for the Web Ingrid Teixeira Monteiro and Clarisse Sieckenius de Souza SERG – Semiotic Engineering Research Group Departamento de Informática, PUC-Rio Rua Marquês de São Vicente 225 – 22453-900 Rio de Janeiro, RJ - Brazil {imonteiro,clarisse}@inf.puc-rio.br
Abstract. This paper presents the Web Navigation Helper (WNH), an interface agent for users with special needs originally developed for Brazilian users. WNH mediates scripted interaction with web sites by providing alternative dialogs with appropriate style, structure, etc. The paper reports the results of qualitative empirical studies done at the early design stages. In particular, it shows how our design vision changed when findings from initial studies revealed that the technology we were about to develop was implicitly guided by a sociability model that was not prevalent in the Brazilian culture. The main contributions of the paper are to expose the process by which we became aware of cultural factors affecting the design of accessibility agents, and to propose a kind of technology that may be adopted in cultures whose sociability models are based on personal relations with friends and family members. Keywords: Web Accessibility; User Agents; Computer Mediated Communication; Cultural Issues in HCI Design; Sociability.
1 Introduction In almost every social setting, when individuals have trouble performing a task that they need to accomplish, there are people available to assist them. As so many activities migrate to computer environments, we need to think of assistants for these environments as well. Help for online activities may be available as information (e.g. web sites and public repositories) and as communication – in asynchronous (e.g. online discussion lists and blogs) and synchronous (e.g. chats) form. It may also come from user agents, an alternative that takes much more computational effort to design and develop. This paper discusses WNH, the Web Navigation Helper [7, 8], which, as its name suggests, is a user agent for helping users with special needs to navigate the Web. The metaphor underlying WNH is that of an interpreter – a mediator that is able to understand different languages and cultures, and can therefore translate material from one into the other in different kinds of situations. Specifically, WNH can translate web site interfaces into specialized interaction dialogs (e.g. dialogs appropriate for the elderly, the deaf, the blind, users with literacy deficits, etc.). The translation is not automatic; it is the result of manual interface adaptations by well-informed Web
users, who can use the WNH dialog editor to specify how a given web site's interface can be translated into an appropriate interface for the elderly, for example. This interface, which works truly as a Web Navigation Helper, is then made available to elderly users who need help to do tasks on the Web. The size and diversity of the user population in need of help are huge, and accessibility solutions are practical only if they can scale up to benefit the largest share of the targeted population. Hence, manually produced help usually fails to make a difference in the accessibility scenario. However, there is a limit to automation when it comes to translating web site interfaces into more accessible dialogs. If even human translators are known for making regrettable context interpretation mistakes, what can we say of automated software and its ability to produce contextually adequate translations? A key element in this context is that the targeted beneficiaries of the translation are individuals with special (i.e. non-standard) needs. One way to circumvent the limitations of automated solutions, especially with the advent of Web 2.0, is crowdsourcing. Now that a large number of individuals can be mobilized, there have already been attempts to use the power of social participation to promote more accessibility on the Web [11]. Thus, WNH was initially conceived as an accessibility tool to be developed counting on large-scale social volunteering. However, as we found out later, the Brazilian culture, for which WNH is primarily targeted, values certain practices and structures that go in a different direction. Therefore, we report the empirical research that led us to this realization and discuss what this may mean for promoting accessibility in Brazil and similar family-oriented cultures. In Section 2, we present a brief overview of WNH. In Section 3, we describe three empirical studies carried out to inform the design and evolution of WNH. Then, in Section 4, we present our current conclusions and outline future research work.
2 A Brief Overview of WNH WNH is a Firefox plug-in and currently works in conjunction with CoScripter, a macro recorder for the Web developed by IBM Research Almaden [9]. CoScripter macros automate tasks enabled by a particular Web application, whose interaction with users WNH can then mediate. Macros must be created by scripters, typically non-professional Web users with high levels of digital literacy, who will then develop mediating dialogs for each macro step that requires user input. When a macro is created, all interaction with the browser is captured and recorded. Thus, input values that go into textboxes, options that are selected in list boxes and checkboxes, links and buttons that are pressed, all of these are captured for further use in the mediating dialogs. For example, a simple movie search macro recorded on Google Movie Showtimes for Rio de Janeiro might look like this in CoScripter:
1. * go to "http://www.google.com/movies"
2. * enter "Toy Story 3" into the first textbox
3. * click the first "Search Movies" button
The instructions above correspond to recorded input events and values. Note that the values and events recorded in lines 1 and 3 can be reused if users want information
about movies in Rio de Janeiro. Likewise, we do not know what the actual interface looks like; all we know is that there is a textbox where users can fill in movie titles and a button that launches the search. The scripters’ job is to adjust the macro and create mediating dialogs that may look like the one in Fig. 1 (shown in English only to facilitate reading).
Fig. 1. How WNH mediates Web navigation
In order to allow each user to search for a different movie, scripters use an existing CoScripter mechanism to create variables whose values can be filled in during the execution of a CoScript. In the next step, scripters use the specialized WNH editor to design small screens like the one shown in Fig. 1 and then to associate them with instructions where user input is necessary (see instruction 2 in the example above). In Fig. 2 we show WNH editing screenshots that briefly sketch the flow that culminates in the creation of a mediating dialog for a registration page. First (1), the scripter writes the screen title and the dialog text. Second (2), the scripter sees which instructions will have their input captured in this particular screen. Third (3), the scripter chooses which labels and other interface elements will appear on the dialog screen (regardless of the ones used in the original web page). Fourth (4), the scripter indicates the data types for input. Finally (5), the scripter can preview the dialog screen.
Fig. 2. Creation of a dialog in WNH editor
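A data-structure sketch of what these five steps produce might look as follows (Python; the field names are invented for illustration, since WNH's internal representation is not published here):

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MediatingDialog:
    # One WNH dialog screen, bound to the macro instruction whose
    # input it captures. Fields mirror the five editing steps above.
    title: str                                   # step 1: screen title
    text: str                                    # step 1: dialog text
    instruction_index: int                       # step 2: instruction taking the input
    labels: List[str] = field(default_factory=list)           # step 3: interface elements
    data_types: Dict[str, str] = field(default_factory=dict)  # step 4: input data types
    doubts: Dict[str, str] = field(default_factory=dict)      # anticipated "Doubts" Q&A's

dialog = MediatingDialog(
    title="Choose your e-mail name",
    text="Type the name you would like to use in your new e-mail address.",
    instruction_index=2,
    labels=["Desired name"],
    data_types={"Desired name": "text"},
    doubts={"What is an e-mail address?":
            "An address other people use to send you messages."},
)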
Scripters can add help information to dialog screens. A special button labeled “Dúvidas” (Doubts), as shown in Fig. 1, appears on each screen. When the targeted user presses that button during interaction with WNH, she can resolve doubts about what she must do, the meaning of specific terms, the effects of the task, the overall goal of the web site, and so on. Notice that these doubts must be anticipated by scripters, or else the end user will have no help. With WNH installed in her Firefox, the user can select one of a set of available mediated activities. Once she does so, she begins to interact with WNH. The user’s interaction with WNH guides background navigation controlled by the underlying macro. In this way, the user completes the desired task using an alternative style of interaction. For further details on how WNH works, see [7].
3 Empirical Studies Informing the Design of WNH Our first approach to designing WNH was inspired by crowdsourcing. We hoped to count on volunteers with high levels of digital literacy to create mediating dialogs for users with special needs. They might even do this anonymously, if they wished. We planned to give them dialog templates, designed by HCI specialists, to guide their interface mediation. But in order to know what other requirements we would have to meet, we ran two exploratory experiments with participants who viewed themselves as potential WNH volunteers. We used a qualitative methodology, as is often the case with exploratory studies. In the first experiment we worked with seven IT professionals, and in the second with eight experienced Internet users. The groups were intentionally selected with complementary skills. Whereas the IT professionals had no particular training in written communication, the users selected for the second experiment were well-trained Language students taking a course in digital narratives and related topics (training in HCI and accessibility was excluded for both groups). In the first experiment, participants were briefly introduced to WNH and its purpose. Then they were asked to go through the experiment scenario. The main character in it (whose role they should play) wants to develop mediating dialog screens for an elderly lady who wants to visit a bus travel company web site and look for travel options (date, time, seat, etc.). The CoScript for the task was previously prepared by our team, so participants only had to design the dialogs and make the appropriate associations between them and the CoScript instructions. Because the WNH editor was not fully implemented at that time, we used printed form sheets, which participants filled out with their solutions. At the end of the experiment, each participant had a short interview with us about the experiment, the technology, and the participant’s perceptions and suggestions regarding WNH. The results of this experiment gave us much useful information. This group had no difficulty understanding how CoScripter and WNH worked together. As a rule, they could indicate, easily and correctly, when their dialogs should appear. They could also easily name and use interaction elements in their dialogs. Interestingly, one participant used a different interface element in his dialogs, clearly rephrasing what the original interface was saying. Another participant, at a certain point along the mediated dialog, directed the user to the company’s web site itself in order to select a particular bus service code. This achieved a kind of transition from mediated to non-mediated interaction, which was nice to see.
Only three participants completed all of their dialogs within the time limit for the task, and the others fell short of completing them all by only a few screens. All participants thought of questions and answers (Q&A’s) to go in the “Doubts” section. The number of Q&A’s created by participants ranged from 3 to 7. Five participants created Q&A’s for all dialogs where user input was required. The style of communication in all dialogs included various instances of technical jargon like “procedure”, “this script is optimized”, “valid date on dd/mm/yyyy format”, and the like. In some cases, the participants even spoke for the bus travel company, saying things like: “Unfortunately, it’s possible that your city isn’t served by our company.”; “We apologize, but apparently this city isn’t served by Viação Cometa”; “Contact Customer Service at…”; “Bear in mind that we cover the largest number of routes…”. The most relevant results of this experiment with IT experts were that the styles of mediating dialogs varied considerably from one participant to another, even though their professional profile was homogeneous. Participants expressed their difficulty in communicating appropriately with the targeted user. Clearly, participants were unsure about how much explanation was necessary. None of them really knew how to deal with the stereotyped user depicted in the test scenario. So, in spite of having initially said they would be glad to act as volunteers to develop WNH dialogs, at the end of the experiment they were faced with unanticipated difficulties and were no longer sure that they had the necessary knowledge to achieve the task. Three participants explicitly suggested that companies and commercial businesses operating on the Web should be interested in creating the dialogs themselves. One of them explained that companies might use this to publicize their concern for users with special needs. Before we ran the second experiment, we interviewed the instructor of an Internet course for the elderly offered by the Rio de Janeiro State administration, and attended some of the classes to observe how they went, especially as the elderly learners practiced their lessons on the computer. Our goal was to put together some guidelines that the next participants would be able to use while developing their mediating dialogs for a similar test scenario. The second experiment was run for the same purpose as the first: to investigate what kind of support WNH volunteers needed in order to design good mediating dialogs. Participants were non-experts in IT, but had advanced skills in written communication. Although the experiment scenario was similar to the one used with the first group (this time they were to help an elderly lady create a web mail account), the experiment procedure was different. Instead of running individual sessions with each participant, we used a class situation, with the whole group. They all saw the same introductory presentation, telling them about WNH, CoScripter, and our goal with the technology. We emphasized the case of elderly users. Then each participant received sheets of paper with their task description, the paper forms for the dialogs, and a summary of guidelines and facts about computer interfaces for elderly users. This step was done individually, by each participant. At the end of the experiment, we brought all participants together for a focus group session to discuss the experiment. The group as a whole needed more time than the first group to achieve the task.
Five of the eight participants understood how CoScripter and WNH worked together. The other three were confused when associating dialogs with CoScript instructions. Only two of the eight participants had no problems with naming interactive elements. The other six experienced a variety of breakdowns. For example, they talked about
‘icons’ when none were involved in the interaction, or referred to radio buttons as circles. Three participants gave us evidence that they did not really understand how CoScripter automated script execution. For example, they created dialogs for commands where no user data was necessary. Unlike the first group, this group generated very few Q&A’s for the “Doubts” section. To our surprise, most participants did not consult the list of guidelines and information about elderly users’ needs. Regarding the content and style of dialogs, as a rule, this group’s dialogs were very informative. All the participants embedded useful information in a number of dialogs, providing detailed explanations about how to use interface controls, for example (which explains the scarcity of Q&A’s). The focus group session raised a number of relevant points. One participant said he believed that the less technical the information, the closer the elderly would feel to the technology, and the more comfortable with it. Some of the participants reported having difficulty explaining the basics of interaction (like what ‘to click’ means, what a ‘cursor’ is, etc.). One participant said that he was worried that the language and style he used might sound childish to the elderly. One participant said that as she listened to the oral presentation she thought everything was easy and obvious, but when it came to actually writing down the mediating dialogs, it became difficult (which echoes the results of the first experiment). Interestingly, one participant said that he was rather pessimistic about Brazilians engaging in this sort of social volunteering online. He did not see “people helping other people at large”, he said. This statement started a lively discussion about when and if Brazilians were likely to help others, rather than ‘be selfish’ (this expression was explicitly used in the discussion). There was a consensus in the group that everybody in this culture would be happy to help their grandmother and grandfather, or some other elderly person in their family. These were people they knew, and not – as one participant put it – “the whole wide world”. It thus became clear that the point of the discussion was the reach of one’s initiative online, which extended far beyond the limited context that necessarily informed what one was doing when trying to help someone with a special need. In other words, helping their grandmother was as far as they were willing (or able) to go, because they felt they needed to know the person they were trying to help. Helping unknown people, which was the context of WNH we presented to them, was something about which a number of participants said they were rather pessimistic. However, all agreed that once they developed a script for somebody they knew well, there would probably be no problem whatsoever with sharing it with other people that they knew. Notice that this means the decision about sharing or not sharing scripts would then be made on a case-by-case basis, without escaping human control (as would probably happen in a crowdsourcing context). The two experiments gave us much valuable input for the design of WNH. However, one finding had much deeper consequences than all the others. The focus group discussion in the second experiment taught us that we had tacitly accepted crowdsourcing as the alternative for having contextually well-adapted dialogs for users with special needs. The pressure for scalable accessibility solutions led us to aim at users at large.
The reaction of one of the participants in the second experiment showed us that we were unconsciously expecting WNH scripters to work for stereotyped beneficiaries, running the risk of missing precisely those special contextual elements that distinguish this user population from other users that can deal with standard interfaces.
We then reframed WNH and began to think of it more clearly as an end-user development tool to create software agents that can achieve what Clayton Lewis refers to as “human companionship and support” [10]. This new design conception opened a new avenue for computer-mediated social participation, although it did not impose on scripters (and WNH beneficiaries) a networked sociability model [3], as our previous design did [7, 8]. We ran a third experiment to explore the new design concept. It was structured in three subsequent phases, which shared the same scenario setting and involved two different participant roles: WNH scripters and WNH end users. We recruited four participants to play the scripter role (two HCI students doing research with elderly users, and two teachers working with digital inclusion programs for the elderly in the State of Rio de Janeiro). For the end user role, we recruited six users (aged 55–82) who had just started learning how to use computers and the Internet. In the first phase, scripters were asked to produce verbal descriptions of the interaction involved in the experiment scenario for two different listeners. In one case, they would be talking over the phone to a colleague (with the same level of expertise as the scripter). In the second, they would be talking face-to-face to an elderly family member or friend (both listener roles were played by a research team member). The participant’s job was to explain orally to the other person how to use an online mortgage calculator made available by a Brazilian public bank. In the second phase, scripters listened to a brief description of our project and were introduced to the WNH Editor. Then they were asked to use the editor to create mediating dialogs for the web site they had just described and explained. Their targeted user should be the elderly person they had addressed in the first phase. In the last phase, participants matching most of the ‘digitally illiterate elderly person’ profile were asked to use the mediating dialogs created by the scripters. The group was divided in two: three participants used dialogs created by one of the HCI researchers, and the other three used those created by one of the teachers. At the end of the experiment sessions, we had short interviews with the participants. This experiment was used for triangulation with the previous experiments, in the sense that the tasks and activities that scripters had to do were very similar to the ones we had before. The design vision and context this time were, however, substantially different: we moved from online volunteering to family help, and from paper forms to the real WNH editor. Moreover, we did an internal triangulation in the third experiment, by having end users actually use the dialogs produced by the scripters. We collected considerable evidence that the new design perspective we adopted is more naturally understood by participants, who can easily engage in the proposed scenarios. During the interviews, all scripter participants, without exception, told some story about relatives (parents, grandparents, elderly in-laws) who had problems with computers and asked them for help (which they all provided). All agreed that WNH could be effectively used to create mediating dialogs, especially to facilitate frequent tasks done by relatives with lower levels of digital literacy. They also said that they would like to start using WNH as soon as it became publicly available.
Some participants explicitly said that WNH gave them an opportunity to help their friends and family in a much more practical way. The teachers, in particular, said that WNH would be useful at home, with their relatives, and with the elderly students in the Digital Inclusion program. The more experienced of the two teachers even said
that if he had more time, he could create “air-tight dialogs”, with which it would be almost impossible for the end user to make mistakes during navigation. When asked who would benefit from WNH, most participants answered that all lay users, and the elderly in particular, would. Some of them said they thought that blind and deaf users, as well as users with literacy deficits, could also be helped by WNH. The end user group of participants (most of them over 60 years of age) showed real interest in WNH. All of them said that they would like to have WNH installed on their personal computers to help them with their most frequent tasks. We observed that successful navigation with WNH had mostly to do with breaking interaction into a series of small dialogs (a stepwise interaction style). This, in itself, circumvents a number of problems having to do with ‘what to do next’ confusion. We also observed that once users knew how to use an interface control element (e.g. a textbox), they did not bother to read the short instructions and tips on how to use it that scripters had included in their dialogs. They jumped right into interaction, which was a small but noticeable gain in efficiency. In some cases, when skipping the reading led to mistakes, the end users were warned by WNH that something had gone wrong. This situation gave us the opportunity to see the importance of anticipating errors and doubts when creating WNH dialogs. In all cases where these were available, end users eventually corrected the problem and went on to the next steps. We asked the end user group whether they thought that, after using WNH dialogs repeatedly to do the same task, they would stop using mediation and go directly to the web site’s original interface. All of them agreed that the dialogs actually helped them learn to navigate the web site directly. However, half of them said that they would prefer the easier way, which would probably be to continue using WNH. The other half said that, after some time practicing with WNH, they would surely want to migrate to the web site’s original interface. We welcomed this result, since it tells us that WNH is perceived by some as a scaffold on the path to digital literacy acquisition.
4 Conclusions and Future Work Designing tools for accessibility is more than a technological matter. The research described in this paper illustrates some of the benefits of qualitative methods, since they allow us to go deep into the explicit and implicit meanings manifested by participants in the course of empirical studies. Likewise, in the process we, as designers, are naturally led to question our own explicit and implicit meanings regarding the (kind of) technology we are proposing to users. Results in this kind of research setting can be very surprising and entail radical changes. Our study gave us a window onto Brazilian sociability, which is considerably influenced by a different sociability model than the one embedded in accessibility technologies proposed elsewhere (e.g. [1, 11]). In sensitive matters like dealing with users with special needs, Brazilian sociability is closer to a community model [5] than a networked model [3]. In communities, individuals experience strong personal relations and intimacy. An individual’s identity is built and sustained by such relations and collective values. “Others” are not unknown, faceless individuals; they can always be charted in a space of inter-personal relations, because of their ties with people one personally knows. The third experiment showed us how deeply true
this is, since all participants (scripters and end users) told us stories of elderly users who were (nearly) “digitally illiterate” and how family members or friends were there to help them in various kinds of situations. We believe that the crowdsourcing model may not work for technologies like WNH in Brazil (and probably not in similar cultures either). Accessibility issues are sometimes related to a “universe of one” [2]. The needs are so special, so specific, that it is difficult to solve problems by automating generic solutions. Having someone who is not personally involved with the user generating solutions may also lead to missing many relevant aspects of the user’s special context. The new design vision for WNH brings together both the universe-of-one and the community-oriented sociability perspectives. We believe that this new vision somehow introduces a new type of technology in our country. We are talking about family-oriented IT, as an additional alternative to assistive technologies. This technology is strongly related to end user development alternatives, as is the case with technologies produced in the context of the CLever project at the University of Colorado [4]. WNH is thus an alternative technology. It can promote and reap the benefits of inter-personal relationships in computer-mediated environments and activities. Besides, end users know whom they must talk to if they have further needs or if (as is often the case) scripts and dialogs stop running because of a change in the website that they refer to. This kind of maintenance task can be costly for larger social volunteering initiatives. Another lesson we learned was that the mediated interaction supported by WNH could be used for other purposes. For example, it can be useful for teaching and learning activities, or for expressing one’s version of the interactive potential supported by web sites and services (that can be shared with others for various reasons), or even for exploring design or redesign activities (as different mediating dialogs can be created by designers themselves and then be exposed to user evaluation). We still have a long way to go. Our next steps involve reviewing and fixing some usability problems we discovered with the experiments. We would also like to do more experiments to evaluate the further uses of WNH. On a more theoretical note, and in tune with previous research of ours in Semiotic Engineering [6], we would like to explore how WNH can be used to study different metacommunication strategies and styles, towards a rhetoric of HCI. In particular, when running new experiments to observe the use of WNH in family contexts, we might have a glimpse of affective dimensions of metacommunication that we have never explored before. Paying closer attention to cultural issues silently lying beneath technologies opens different avenues for research. Not because this will necessarily lead to inventing new technologies, but perhaps (and more importantly) because it can show us how and why to combine existing technologies and then deploy them in completely different cultural settings. Acknowledgments. We thank the volunteers who helped us in this research and Chantal Intrator for technical support. This work is partially supported by Programa INC&T (CNPq 557.128/2009-9 and FAPERJ E-26/170028/2008). Clarisse de Souza receives individual research grants from CNPq (308964/2006-3) and FAPERJ (E26/102.400/2009). Ingrid Monteiro has been supported by Informática/PUC-Rio.
References
1. Bigham, J.P., Lau, T., Nichols, J.: Trailblazer: enabling blind users to blaze trails through the web. In: Proceedings of the 13th International Conference on Intelligent User Interfaces (IUI 2009), pp. 177–186. ACM, New York (2009)
2. Carmien, S.P., Fischer, G.: Design, adoption, and assessment of a socio-technical environment supporting independence for persons with cognitive disabilities. In: Proceedings of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems (CHI 2008), pp. 597–606. ACM, New York (2008)
3. Castells, M.: The Internet galaxy: reflections on the Internet, business, and society. Oxford University Press, Oxford (2001)
4. CLever Project – Center for Lifelong Learning and Design, University of Colorado – Boulder (January 2011), http://l3d.cs.colorado.edu/clever/index.html
5. da Matta, R.: O que faz o Brasil, Brasil? Rocco, Rio de Janeiro (1986)
6. de Souza, C.S.: The Semiotic Engineering of Human-Computer Interaction. The MIT Press, Cambridge (2005)
7. Intrator, C.: Using Scripts to Improve Web Accessibility. Master's dissertation, Departamento de Informática, PUC-Rio, 105 pages (2009)
8. Intrator, C., de Souza, C.S.: Using web scripts to improve accessibility. In: Proceedings of the VIII Brazilian Symposium on Human Factors in Computing Systems, Porto Alegre, October 21–24. ACM International Conference Proceeding Series, vol. 378, pp. 292–295 (2008)
9. Leshed, G., Haber, E.M., Matthews, T., Lau, T.: CoScripter: automating & sharing how-to knowledge in the enterprise. In: Proceedings of the 26th Annual SIGCHI Conference on Human Factors in Computing Systems (CHI 2008), pp. 1719–1728. ACM, New York (2008)
10. Lewis, C.: HCI for people with cognitive disabilities. SIGACCESS Accessibility and Computing, vol. 83, pp. 12–17 (September 2005)
11. Takagi, H., Kawanaka, S., Kobayashi, M., Itoh, T., Asakawa, C.: Social accessibility: achieving accessibility through collaborative metadata authoring. In: Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 193–200. ACM, New York (2008)
Some Issues Regarding the Design of Adaptive Interface Generation Systems Julio Abascal, Amaia Aizpurua, Idoia Cearreta, Borja Gamecho, Nestor Garay, and Raúl Miñón Informatika Fakultatea. University of the Basque Country/Euskal Herriko Unibertsitatea Manuel Lardizabal 1, 20018 Donostia, Spain {julio.abascal,amaia.aizpurua,idoia.cearreta,borja.gamecho,nestor.garay,raul.minon}@ehu.es Abstract. This paper describes the main issues related to the design of user-adaptive interaction systems, in order to discuss their applicability to a specific domain: the automatic generation of accessible user interfaces for people with disabilities who make use of ubiquitous services. Advances in the domain of the accessible Web are especially interesting for this purpose. Nevertheless, even though several procedures are similar, there are specific features that require new approaches, such as the formal specification of the functionality of the interface that will be generated. Keywords: Adaptive Systems, Special Needs, Adaptive User Interfaces, Ubiquitous Computing, Knowledge Modelling.
1 Introduction It is well known that users with sensory, physical or cognitive disabilities have enormous difficulties using devices and applications, and obtaining services through them, because many of them are not accessible. Most frequently, the main accessibility barriers lie in the user interfaces, which are habitually designed without taking human diversity into consideration. The most natural way to cope with diversity appears to be adaptation. In recent decades, researchers in diverse areas of the Artificial Intelligence field have worked on adaptive systems, creating valuable knowledge that can be applied to the design of adaptive user interfaces for people with disabilities. Our laboratory is involved in a research effort to design accessible interfaces for people with disabilities using intelligent machines, such as ATMs, vending machines, information kiosks, intelligent home appliances, etc., in a ubiquitous context. In order to establish the foundations for our system, we reviewed and commented on the state of the art in modelling for user-adapted interaction in diverse domains and its applicability to the provision of ubiquitous services for people with disabilities. This paper summarises a number of results from our discussions. After this brief introduction, Section 2 introduces the principal techniques for knowledge modelling. In Section 3 the main adaptation methods are summarized. In the following section, reasoning approaches are discussed. Section 5 provides examples of adaptive systems for personalizing the user interface, emphasizing the ubiquitous computing area. The paper ends with some conclusions.
2 Knowledge Modelling User modelling has been addressed by an increasing number of research projects. The reason is that knowledge modelling is essential to creating adaptive systems that generate personalized user interfaces. Thus, thanks to the information gathered in user models, it is possible to adapt the content or the presentation to users’ needs, greatly enhancing the quality of user interfaces. To this end, three main phases can be distinguished in order to create a model: (1) the design of the model, which aims to describe the selected concepts and their relationships; (2) the representation of the model, which specifies the model using a particular modelling technology; and (3) the application of the model, which includes data acquisition in order to fill in the model with real user information. Although this section describes those three phases through examples, it is mainly focused on the first one, since it forms the basis of knowledge modelling. According to the above-mentioned phases, the first task in designing a model is to analyze exhaustively the relevant user characteristics. In addition to user characteristics, context-related concept modelling is also necessary (for instance, information regarding the mobile device, location, etc.) in some domains, such as mobile environments. The selection of context-related concepts generally depends on the application domain of the system. As stated before, some authors only consider users’ features in order to create user profiles. For instance, Jrad et al. [1] focus their work on how features such as user preferences and interests can be modelled, in order to develop tourism-related personalization systems. Another example is the work of Casas et al. [2], who base their model on characteristics related to users’ capabilities and behavior. They argue that this procedure contributes positively to the development of intelligent systems that are able to help elderly users with impairments to overcome their communication barriers. In certain cases, users’ information is gathered and classified into stereotypes [3]. In this way, user interfaces are adapted according to characteristics that a group of users has in common. The proposed adaptation is mainly aimed at people with motor impairments. As a global review of models considering only user features, Gauch et al. [4] analyze the most popular techniques for collecting users’ characteristics in order to represent and build user profiles. Other authors compose the user context from more concepts, apart from the user’s own characteristics. For instance, Brusilovsky and Millán [5] provide a wide review of issues related to user models and user modelling approaches applied to the adaptive Web. In addition to features such as the user’s knowledge, interests, goals, background or individual traits, they also discuss modelling the user’s work context, including concepts such as the type of platform, user location or the user’s affective state. Similar to this last research line, Cearreta and Garay [6] propose the modelling of affective states, including other aspects of the subject, such as physical or cognitive states. Cearreta and Garay refer to the term “subject context” instead of personal context, in order to model aspects regarding both the user and the system. In this way, they model affective interactions between users and systems, taking their context into account.
Within this context, apart from the subject’s features, they also include environmental, task, socio-cultural and spatio-temporal context aspects.
Other authors model similar concepts of the context, using a different classification or terminology. For instance, Göker and Myrhaug [7] structure the context into the following parts: personal, environmental, task, social and spatio-temporal. Another example is the work of Krüger et al. [8], which models the context as physical, spatial, temporal, activity and situation. Schmidt et al. [9] propose a hierarchical model in which two different contexts are distinguished at the top level: human factors (user, social environment and task) and physical environment (conditions, infrastructure and location). It is clear that finding a consensus for context modelling is not easy. Although many authors use similar concepts for modelling, each focuses on the needs of the system under development and its end users.
With regard to the model representation phase, several technologies have been proposed. Some of the most popular technologies to represent the user model or create user profiles are: network-based [10]; term- or keyword-based [11]; mark-up-language-based [12]; hierarchies of concepts based on a taxonomy or thesaurus [13]; more complete concept-based hierarchies using existing ontologies [14]; or ontologies created from scratch [15]. Nowadays, ontology-based techniques are widely used for representing knowledge. They are usually based on different technologies, such as XML [16], RDF [17] and OWL [18]. Their main objective is to represent concepts of the real world, making it possible to share knowledge among different applications. Moreover, ontologies provide several mechanisms to reuse domain knowledge or even infer additional information using reasoning techniques. The use of ontologies can be found in several references related to adaptive systems. For instance, Hervás and Bravo [19] developed an infrastructure to support information adaptability for users, using ontologies for the management of contextual information.
Once the user model is designed and represented, it is ready for the data acquisition phase. In this regard, methods for collecting user information can be divided into two main categories: explicit and implicit [4]. Explicit user information collection methods require user intervention in order to complete the information of the model. For instance, users are sometimes explicitly asked to provide information by filling in web forms. The main downside of this type of method is that it requires users' willingness and can be disturbing for them. In order to avoid this, implicit methods acquire user data in a transparent way, and thus they do not place a burden on the user. These information-capturing techniques are based on monitoring the user's interaction behavior, frequently using data-mining techniques. Some systems have tried hybrid approaches to collect different types of information at diverse stages.
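To make the representation phase concrete, the following is a minimal sketch of an ontology-based user model expressed as RDF triples with the rdflib Python library. The vocabulary (namespace, class and property names) is entirely hypothetical, chosen only for illustration, and is not taken from any of the cited systems.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

# Hypothetical user-model vocabulary, invented for this sketch
UM = Namespace("http://example.org/usermodel#")

g = Graph()
g.bind("um", UM)

# Schema: a User class; properties are used ad hoc below
g.add((UM.User, RDF.type, RDFS.Class))

# Explicitly acquired data (e.g., from a web form)
g.add((UM.user42, RDF.type, UM.User))
g.add((UM.user42, UM.prefersContrast, Literal("high")))
g.add((UM.user42, UM.hasImpairment, Literal("low vision")))

# Implicitly acquired, context-related data (e.g., from monitoring)
g.add((UM.user42, UM.usesDevice, Literal("mobile phone")))
g.add((UM.user42, UM.currentLocation, Literal("railway station")))

print(g.serialize(format="turtle"))
```

The point of the sketch is that both the schema and the instance data are expressed as ordinary statements, so the structure itself can later be extended without redesigning a database schema.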
3 Adaptation Methods

Apart from building the user model, it is necessary to identify the adaptations that the system will perform to provide users with personalized or adaptive interfaces. There is a wide variety of adaptive methods and techniques, which can be classified into three main groups depending on the adaptation purpose: content, presentation and navigation adaptation [20]. Nonetheless, some methods may fall into two categories simultaneously, since they adapt both content and presentation, or both navigation and presentation.
Content adaptation techniques are frequently oriented to providing users with personalized content. There are several approaches: some add, delete or modify the content, while others emphasize or deemphasize it, suggesting the most relevant content according to the user's interests or needs.
According to Knutov et al. [20], adaptive navigation methods are divided into two main categories: "suggesting" and "enforcing" techniques. "Suggesting" techniques color, emphasize or sort the links of a site, keeping all of them. In contrast, "enforcing" techniques delete non-recommended links or add new links. This means that some of the original links might not be present in the adapted user interface.
Presentation methods include modification of the layout considering the user's preferences. The previously mentioned content adaptation techniques, which emphasize and deemphasize parts of the web page, can also be considered presentation techniques. Moreover, navigation techniques, such as coloring links, can be included among presentation techniques too, if the adaptation affects styles. Figure 1, based on [20], shows a classification of adaptation methods.
Not all adaptive systems require the use of all three main types of adaptive methods mentioned previously. Depending on the purpose and application domain of the adaptive system, the application of some types of techniques may not be required. For instance, adaptive systems in e-learning domains may focus on adaptive content, rather than applying presentation-related techniques. A minimal sketch of the two navigation adaptation styles is given below.
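The following small, self-contained sketch illustrates the difference between the two navigation adaptation families named above. The link set and the relevance scores are invented; how relevance is computed from the user model is outside the scope of the example.

```python
# Hypothetical links of a page, each with a relevance score for this user
links = [("Contact", 0.2), ("Accessibility help", 0.9), ("Site map", 0.5)]

def suggest(links):
    """'Suggesting' adaptation: keep every link, but sort them and mark
    the most relevant ones, so no navigation option is lost."""
    ranked = sorted(links, key=lambda link: link[1], reverse=True)
    return [(text + (" *" if score >= 0.7 else ""), score)
            for text, score in ranked]

def enforce(links, threshold=0.4):
    """'Enforcing' adaptation: delete non-recommended links, so some of
    the original links may be absent from the adapted interface."""
    return [link for link in links if link[1] >= threshold]

print(suggest(links))  # all three links, reordered and annotated
print(enforce(links))  # only "Accessibility help" and "Site map" remain
```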
4 Reasoning Approaches for Relating Modelling to Adaptation

As explained in previous sections, modelling refers to the description of the knowledge regarding the user and his/her interaction context, whereas adaptation techniques refer to the implementation of methods to perform adaptations on the user interface. Both information modelling and adaptation techniques are essential components of an adaptive system. In this section, different reasoning mechanisms for relating knowledge models to adaptations are described. This relation aims to provide the system with the knowledge required to select the most suitable adaptations for a particular user.
There are two main approaches to organizing modelling and adaptation components: User Modelling Shell Systems and User Modelling Servers [21]. In the former case, all user modelling (including content representation and reasoning mechanisms) is performed by the adaptive system. In the latter case, by contrast, the modelling is not part of the adaptive system itself, since the servers are independent from the application. Although both approaches try to separate the user modelling functionality from the adaptive application, they follow different architectures. The server approach has some advantages compared with shell systems. For example, servers can easily be integrated into existing systems or environments. In addition, they can serve more than one application at a time, so that user information can be available to several applications simultaneously. Nonetheless, user modelling servers have some drawbacks, such as the management of distributed user information and support for privacy issues.
Fig. 1. Classification of the adaptation methods, based on Knutov, E., De Bra, P., Pechenizkiy, M. (2008) [20]
With the advent of Ubiquitous Computing, other trends are becoming popular, for instance agent-based user modelling for mobile environments. Regardless of the architectural approach adopted, the information in the model has to be closely related to the adaptive application. Although the relation between the model and the adaptations might sound obvious, it is not trivial, and it plays a key role in every adaptive system. Once the model is created, the complexity of the matching depends on the technology used to represent the model. If the user's information is represented through keyword pairs (concept–value), the relation is quite straightforward. For instance, a keyword pair such as (contrast, high) implies that a particular user needs high contrast in his or her user interface. Nevertheless, when the model is more complex, reasoning mechanisms are required in order to derive assumptions about the user. These reasoning mechanisms can be classified into different types: deductive reasoning, inductive reasoning and analogical reasoning [22].
Deductive reasoning is applied to infer from more general to more specific knowledge. It is used in logic-based user model representations, such as concept formalisms. For instance, in adaptive systems that use ontologies to represent the model as a concept hierarchy, the relation can be specified through rules, and reasoning mechanisms can later be used to infer knowledge regarding the user. According to Kobsa et al. [22], one of the main drawbacks of logic-based approaches is their limited support for uncertainty.
In order to deal with this, Bayesian networks and fuzzy-logic-based approaches make use of probabilities in the user model.
In contrast to the deductive approach, inductive reasoning infers assumptions from specific cases towards a general conclusion. It entails keeping track of users' interaction behavior and deriving general conclusions from it. For this purpose, machine learning techniques, such as Nearest-Neighbor or Naive Bayes classification algorithms, can be used. This type of reasoning is often used to infer knowledge about users, such as their interests, by analyzing the items rated by users according to their affinity.
Analogical reasoning, in turn, is based on identifying similarities among users. In this case, rather than analyzing the features of the items that each user is interested in, the system tries to find other users with similar interaction behavior. Using clustering algorithms, similar user profiles can be found and grouped.
The Amazon [23] website is an example of an adaptive system that applies some of the above-mentioned reasoning techniques. It recommends products that users may want to buy in a tailored way by considering product ratings. Moreover, it also suggests products by analyzing users' previous purchases, browsing history and changes in the interests of similar user profiles. In this way, suggestions based on the user profile are adapted for each user.
In this section we have described several mechanisms to relate modelled information to the adaptations to be performed by the adaptive system. From our experience in developing prototypes for the INREDIS [24] project, rule-based mechanisms were used in the adaptive system to relate the user model to the different adaptation techniques. The user model and the adaptations were represented through an ontology. This allowed us to use predefined rules and a reasoner in order to identify the most suitable adaptations for a particular user. A toy sketch of this kind of rule-based matching is shown below.
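The following sketch shows the straightforward (concept, value) matching discussed above in its simplest possible form. It is not the INREDIS rule base (which was expressed over an ontology and evaluated by a reasoner), only a minimal illustration of the principle; all concepts, values and adaptations are invented.

```python
# Hypothetical user model as (concept, value) keyword pairs
user_model = {"contrast": "high", "vision": "low", "input": "switch"}

# Each rule maps a (concept, value) condition to a list of adaptations
rules = [
    (("contrast", "high"), ["apply high-contrast colour scheme"]),
    (("vision", "low"), ["increase font size", "offer speech output"]),
    (("input", "switch"), ["enable scanning keyboard"]),
]

def select_adaptations(model, rules):
    """Fire every rule whose condition matches the user model; a real
    reasoner would additionally chain rules and exploit the concept
    hierarchy of the ontology."""
    selected = []
    for (concept, value), adaptations in rules:
        if model.get(concept) == value:
            selected.extend(adaptations)
    return selected

print(select_adaptations(user_model, rules))
```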
5 Adaptive Systems for Personalizing User Interfaces

The previous sections describe the main issues regarding the design of an adaptive system. Diverse adaptive systems that apply specific approaches to these issues can be found in the literature. These systems belong to different application domains, such as e-learning, e-commerce or domotics. Moreover, some of them adapt web-based user interfaces, whereas others are standalone approaches. This section includes a number of examples of adaptive systems corresponding to both approaches.
One of the first adaptive web system approaches was the AVANTI project [3]. Its main objective was to provide hypermedia information to people with special needs by adapting the content, presentation and navigation of web pages to individual users. As mentioned in Section 2, Fink et al. modelled users according to different stereotypes. Depending on the stereotype, the system presented optional content and chose the most appropriate information among alternatives.
The mPersona architecture [25] is a flexible content personalization system for users with mobile devices. This architecture adapts web-based interfaces by selecting the most suitable content for users, based on their interests and preferences. This adaptive approach is based on mobile agents.
The Supple system [26] automatically generates standalone interfaces adapted to a person's device, tasks, preferences and abilities. Interface generation is presented as a
discrete optimization problem and is solved with a branch-and-bound algorithm using constraint propagation. The adaptations are expressed in a cost function that can easily be modified to build new adaptations. Content, navigation and presentation adaptations are all considered in the system to generate the interfaces.
In the area of the adaptive Web, the common approach is to adapt a previously designed user interface, in order to tailor it to users' capabilities, skills, access device, task, etc. However, when user interfaces have to be adapted in other scenarios, such as ubiquitous environments, the adaptation process may differ, even though web technologies might be used. The adaptation process in ubiquitous environments is described in the next subsection.

5.1 User Interface Adaptation in Ubiquitous Environments

As mentioned before, adapting user interfaces in ubiquitous environments requires a different approach. In so-called Ambient Intelligence environments, when a mobile user enters a place where ubiquitous services are provided, his/her mobile device and the local service communicate in a way that is transparent to the user. This is frequently achieved through a wireless network controlled by appropriate middleware. In this way, it is possible to establish direct user–service communication.
In order to provide services through this type of architecture, each service must be described in an abstract way. When a user accepts or requests a service, a specific instance of the service's user interface is downloaded onto his/her mobile device. This interface should be adapted to the user's characteristics and his/her context. It should be noted that the final user interface is not predesigned; instead, the interface is dynamically adapted from an abstract description of the interaction elements. Therefore, a machine-readable formal specification of its functionality is necessary in order to guide the process of creating the final user interface. For this purpose, User Interface Description Languages (UIDLs) are widely used [27].
Although the adaptation process in ubiquitous environments is different from that used for web-based adaptive systems, they can use similar adaptation techniques. A number of examples of adaptive systems in ubiquitous environments are presented below.
The Ubiquitous Interactor [28] addresses the problems of accessing ubiquitous services from distinct devices with different features. The interaction between services and devices is modelled, and customization forms have been implemented in order to adapt the presentation.
Another instance is provided by the previously mentioned INREDIS project [29]. Its main goal was to create a framework to provide people with disabilities with access to ubiquitous services. To accomplish this goal, the system transforms an abstract user interface into a final user interface, considering user stereotypes that guide the resource selection and page rendering.
Finally, one of the latest approaches is ViMos [19]. This is an information visualization service that applies context awareness to provide users with adapted information through devices embedded in the environment. In order to present diverse content, pieces of information are represented as user interface widgets. These widgets have several associated scalability techniques to adapt themselves according to the content to display and the available area in the user interface.
ViMos includes a library of widgets in order to display multiple kinds of data (e.g., plain text,
images, multimedia and formatted documents) by using different visualization techniques, and it provides adaptive techniques to adjust the visual layout (e.g., zoom, pagination and scrolling).
In our view, informed by the INREDIS project, the selection of adequate resources and web elements is a key task. People with disabilities must be provided with accessible interfaces, and their ability to interact with them might be compromised if the resources are not appropriate. For example, a hearing-impaired user should be provided with audio transcriptions if the service offers audio elements. Therefore, to build an adapted user interface with adequate resources, the service provider must offer different types of resources. For instance, if the user is illiterate, the system should provide an interface based on images; to this end, the service provider should offer these images, in order to make this personalization possible. Additionally, knowledge regarding adaptive techniques should be exploited. It is necessary to provide users with navigation mechanisms adapted to their needs, accessible content and a suitable layout presentation. The resource selection step can be pictured as in the short sketch below.
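The following is a minimal sketch of the resource selection step just described. The abstract element, the stereotypes and the preference order are invented for illustration; a real system such as the INREDIS Interface Generator would derive them from the user model and the service description.

```python
# Hypothetical abstract interaction element offering alternative resources
abstract_element = {
    "purpose": "play announcement",
    "resources": {"audio": "announce.ogg",
                  "text": "announce.txt",
                  "image": "announce.png"},
}

# Assumed modality preferences per user stereotype, most suitable first
PREFERENCES = {
    "hearing-impaired": ["text", "image"],
    "low-literacy": ["image", "audio"],
    "blind": ["audio", "text"],
}

def pick_resource(element, stereotype):
    """Return the resource whose modality best matches the user,
    falling back to text when no preferred modality is offered."""
    for modality in PREFERENCES.get(stereotype, ["text"]):
        if modality in element["resources"]:
            return element["resources"][modality]
    return element["resources"].get("text")

print(pick_resource(abstract_element, "hearing-impaired"))  # announce.txt
print(pick_resource(abstract_element, "low-literacy"))      # announce.png
```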
6 Conclusion

It is generally accepted that some application fields require intelligent adaptive interfaces in order to guarantee the accessibility of applications, devices and services. In this paper we have summarized the most popular techniques used to design user-adaptive applications, with special emphasis on the most recent techniques applied in web adaptation. From this exercise we extracted the main guidelines applicable to a different domain: the dynamic generation of adapted user interfaces for a Ubiquitous Computing environment. These ideas have been applied in the design of the first prototype of the Interface Generator of the INREDIS project. Based on the experience obtained in that project, we are currently working on a more advanced adaptive system called EGOKI.

Acknowledgments. This work was started within the INREDIS project, funded by the Spanish Ministry of Industry, and it has been partially supported by the Department of Education, Universities and Research of the Basque Government. In addition, A. Aizpurua, B. Gamecho and R. Miñón enjoy PhD scholarships from the Research Staff Training Program of the same Department.
References

1. Jrad, Z., Aufaure, M.-A., Hadjouni, M.: A contextual user model for web personalization. In: Weske, M., Hacid, M.-S., Godart, C. (eds.) WISE Workshops 2007. LNCS, vol. 4832, pp. 350–361. Springer, Heidelberg (2007)
2. Casas, R., Blasco, R., Robinet, A., Roy-Delgado, A., Roy-Yarza, A., McGinn, J., Picking, R., Grout, V.: User modelling in ambient intelligence for elderly and disabled people. In: Miesenberger, K., Klaus, J., Zagler, W.L., Karshmer, A.I. (eds.) ICCHP 2008. LNCS, vol. 5105, pp. 114–122. Springer, Heidelberg (2008)
3. Fink, J., Kobsa, A., Nill, A.: Adaptable and Adaptive Information Provision for All Users, Including Disabled and Elderly People. New Review of Hypermedia and Multimedia 4, 163–188 (1998)
4. Gauch, S., Speretta, M., Chandramouli, A., Micarelli, A.: User profiles for personalized information access. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 54–89. Springer, Heidelberg (2007)
5. Brusilovsky, P., Millán, E.: User models for adaptive hypermedia and adaptive educational systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 3–53. Springer, Heidelberg (2007)
6. Cearreta, I., Garay-Vitoria, N.: Ontology for adapting interactions between humans and systems able to manage affective states. In: 18th UMAP - Workshop on User Models for Motivational Systems, Hawaii, USA, pp. 1–12 (2010)
7. Göker, A., Myrhaug, H.I.: User context and personalisation. In: ECCBR Workshop on Case Based Reasoning and Personalisation, Aberdeen, UK, pp. 1–7 (2002)
8. Krüger, A., Baus, J., Heckmann, D., Kruppa, M., Wasinger, R.: Adaptive mobile guides. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 521–549. Springer, Heidelberg (2007)
9. Schmidt, A., Beigl, M., Gellersen, H.W.: There is more to context than location. Computers and Graphics 23(6), 893–901 (1999)
10. Magnini, B., Strapparava, C.: User Modelling for News Web Sites with Word Sense Based Techniques. User Modelling and User-Adapted Interaction 14 (2004)
11. Barla, M.: Towards Social-based User Modelling and Personalization. Dissertation Thesis, Slovak University of Technology in Bratislava (2010)
12. Heckmann, D., Krueger, A.: A User Modelling Markup Language (UserML) for Ubiquitous Computing. In: Brusilovsky, P., Corbett, A.T., de Rosis, F. (eds.) UM 2003. LNCS, vol. 2702, pp. 393–397. Springer, Heidelberg (2003)
13. Miller, G.A.: WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39–41 (1995)
14. Chaffee, J., Gauch, S.: Personal ontologies for web navigation. In: Procs. of the 9th Int. Conf. on Information and Knowledge Management CIKM 2000, McLean, VA, pp. 227–234 (2000)
15. Heckmann, D., Schwartz, T., Brandherm, B., Schmitz, M., Wilamowitz-Moellendorff, M.: GUMO - the General User Model Ontology. In: Ardissono, L., Brna, P., Mitrović, A. (eds.) UM 2005. LNAI (LNCS), vol. 3538, pp. 428–432. Springer, Heidelberg (2005)
16. Extensible Markup Language (XML), http://www.w3.org/XML
17. Resource Description Framework (RDF), http://www.w3.org/RDF/
18. Web Ontology Language (OWL), http://www.w3.org/2004/OWL/
19. Hervás, R., Bravo, J.: Towards the ubiquitous visualization: Adaptive user-interfaces based on the Semantic Web. Interacting with Computers 23(1), 40–56 (2011)
20. Knutov, E., De Bra, P., Pechenizkiy, M.: AH 12 years later: a comprehensive survey of adaptive hypermedia methods and techniques. The New Review of Hypermedia and Multimedia 15(1), 5–38 (2008)
21. Kobsa, A.: Generic user modeling systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 136–154. Springer, Heidelberg (2007)
22. Kobsa, A., Koenemann, J., Pohl, W.: Personalised hypermedia presentation techniques for improving online customer relationships. The Knowledge Engineering Review 16(2), 111–155 (2001)
23. Amazon.com, http://www.amazon.com/
24. INREDIS, INterfaces for RElations between Environment and people with DISabilities, http://www.inredis.es/Default.aspx
25. Panayiotou, C., Samaras, G.: mPERSONA: personalized portals for the wireless user: An agent approach. Mobile Networks and Applications 9(6), 663–677 (2004)
26. Gajos, K.Z., Wobbrock, J.O., Weld, D.S.: Automatically generating personalized user interfaces with Supple. Artificial Intelligence 174, 910–950 (2010)
27. Guerrero, J., González, J.M., Vanderdonckt, J., Muñoz, J.: A Theoretical Survey of User Interface Description Languages: Preliminary Results. In: LA-WEB 2009, Latin American Web Congress, pp. 36–43. IEEE Computer Society, Los Alamitos (2009)
28. Nylander, S.: The Ubiquitous Interactor - Mobile Services with Many Different User Interfaces. Licentiate Thesis, Dept. of Information Technology, Uppsala University (2003)
29. Miñón, R., Aizpurua, A., Cearreta, I., Garay, N., Abascal, J.: Ontology-Driven Adaptive Accessible Interfaces in the INREDIS project. In: Procs. of the Int. Workshop on Architectures and Building Blocks of Web-Based User-Adaptive Systems, Hawaii, pp. 37–39 (2010)
Search Intention Analysis for User-Centered Adaptive Visualizations

Dirk Burkhardt, Matthias Breyer, Kawa Nazemi, and Arjan Kuijper

Fraunhofer Institute for Computer Graphics Research, Fraunhoferstr. 5, 64283 Darmstadt, Germany
{dirk.burkhardt,matthias.breyer,kawa.nazemi,arjan.kuijper}@igd.fraunhofer.de
Abstract. Searching for information on the web has become a matter of course in recent years. The visualization and filtering of the results of such search queries plays a key role in different disciplines and is still an active research topic today. In this paper a new approach for classifying users' search intentions is presented. The approach uses readily available and simple parameters to differentiate between explorative and targeted search. The results of the classification are used for a differentiated presentation based on graphical visualization techniques.

Keywords: User Intention Analysis, Search result visualization, Semantic Visualization, User-Centered Interaction, Adaptive Visualization, Semantic Web.
1 Introduction

Acquiring information and knowledge from web-based knowledge repositories is already part of our daily life. In our daily routine we are confronted with many expressions that we either do not know or have only superficial knowledge about. These situations often force us to use web repositories for gathering mostly rudimentary knowledge about a given topic. As a result, most of us know a little about a great many issues, which leads to fundamental changes in our society. For offering an adequate level of information quantity, granularity and depth, it is increasingly necessary to possess knowledge about users' pre-knowledge and intentions. Today many systems collect information about users and model them in different ways for offering or recommending different information artifacts and different types of information, whereas the way this information is presented has not been investigated in depth.
Besides the changes in information acquisition, the existing trend towards alternative interaction paradigms with computer systems leads to a need to investigate and research alternative and optimized ways of presenting information and search results. Natural, gesture-based interaction, for instance, is a promising alternative to conventional mouse and keyboard interaction. Graphical user interfaces in particular will play a key role in future information and search systems, since for the average user natural interaction with graphical objects is easier than interaction with text-based systems.
Graphical objects are more similar to our perception of, and interaction with, the real world and the objects within it. Although the benefit of graphical information representation is obvious, today's search results are almost entirely presented as text-based lists. A widespread deployment of information visualizations has not taken place yet. The usage of information visualization systems is principally reserved for scientific and economic areas. Users with no or average skills in information visualization do not perceive the interaction with, and the information acquisition from, these systems as intuitive, and are overwhelmed. Other users, with a higher graphical comprehension, experience the graphical representation as simplistic and of little benefit. To bridge the gap between these heterogeneous users, it is necessary to develop adaptive systems which are able to adapt automatically to the requirements, pre-knowledge and intentions of users. The challenges in this area are the identification of relevant information about the user, the implicit extraction of this information, and the problems of "new user", "new content" and "new context", depending on the adaptation method.
To face these challenges, an adaptive visualization for semantically annotated data is presented in this paper. The adaptation especially addresses the new-user and new-content problems by analyzing the first user interaction in a searching process: the entered search query terms. To this end, a new method for analyzing the user's intention based on the entered search query input is presented. This method classifies the search intentions of users as "explorative" or "targeted", as these are the two most common and most clearly distinguishable searching behaviors. To differentiate between the search classes, word frequency algorithms are used: each search term is analyzed according to its frequency class, which classifies the words of a language based on their frequency of occurrence.
Targeted search behavior, on the one hand, is a specific and very clear search and requires pre-knowledge of the searched topic. A targeted search always consists of multiple search terms and indicates precise results. An example of this class of search could be the input of the terms "Workflow WFMC XPDL". A targeted search is not only defined by the number of search terms, but also by the frequency of each term relative to its overall occurrence in the given language. An explorative search, on the contrary, consists of one or more frequently used words. These words indicate that the user wants general information about a topic. In most cases the searched topic is unknown or the user's pre-knowledge is rudimentary. Examples of this class of search could be the input of the search terms "Workflow" or "what is a workflow". An explorative search contains at most one term with a high frequency class, and hence with a low occurrence in the given language.
In this paper we first present related work on behavior-driven adaptive visualizations. We will show that the number of such systems is very limited, whereas their added value is higher than expected. After that, the analysis method is described, with a differentiation between targeted and explorative search behavior. In the following section, the visualization of the search results based on the search behavior analysis is described; here, the adaptation to the users' requirements is the focus.
We conclude the paper with an evaluation of the system, which argues for the deployment of our method in future applications.
2 Related Work

The search process has been in focus ever since large databases could be created and managed with technical systems, but such systems have become a central information source now that a large number of users are able to use and enrich data sources on the internet. For the economic success of a search engine, the quality of the search results is very important, because an information database can only be a useful point of contact if the results are relevant to the submitted query. Thus the search engine plays an important role with its algorithm, which generates a list of results based on an issued query. Many algorithms have been developed in the past, and some of them have been published in journals and at conferences. Basic algorithms such as keyword search over relational tables and streams [3] or inference using a partially observable Markov model [4] are only a few of the algorithms in use, and they are often combined to determine good matches. These combinations of different algorithms are often kept confidential by the companies providing the search engines, which makes it hard to compare the existing search engines on the market.
The evaluation of existing search engines, and of their usability, is of interest not only to the providers themselves. For partners or researchers it is also important to know which search engine is adequate for testing a new approach or algorithm. The evaluation primarily concerns the quality of the generated result list. The challenge is to find an automatic method for determining this quality; for instance, how can the quality of a generated result set be measured? Two examples were published by Höchstötter et al. [1] and Zaragoza et al. [2], but of course more strategies exist.
With the growth of the Semantic Web, further search strategies have become possible that exploit its semantically structured background [5][6]. In the past, if a user searched for, e.g., the term "Golf", search engines were not able to identify whether the car or the sport was meant by the user. By searching through a semantically structured database it is possible to let the user choose the relevant concept and thus provide a better result set with more helpful information. Another advantage is that further kinds of interaction with the result set can be provided, such as interactive and dynamic search requests: if the user clicks on a concept or instance, the search request can be extended to improve the quality of the result set. With this interactivity, the user is not forced to improve the results only by adding further keywords; he can also improve them by clicking on a concept, e.g., to filter the result set.
An increase in the quality of search results can also be achieved by adapting the search strategy of the search engine to the user, so that the algorithm adapts the results to the user's behavior. This adaptation is achieved by generating a user model, which captures the goals, experience, etc., of a user in order to optimize the result set. Such a user model can be generated with different approaches. A common one is classification into a limited number of groups; an example of this procedure is presented by Teevan et al. [7]. A group is characterized by different user attributes such as age, gender or location. With these characteristics a user can be assigned to a group, which in turn influences and optimizes the search results.
Similarly, the search results can be optimized using the social context of a user [8]. The idea is based on users' existing
bookmarks, from which a profile is generated; this profile can then be matched and compared with searches performed by users with similar profiles. It is also possible to take into account the contents of social networks. Another aspect is optimization based on interests, as described by Ma et al. [9]. During the entire search process, the user's search requests and results are classified into interest fields and stored in a database. For every subsequent search request, these interests are used to optimize the search results. This can be done for a single user, for a group of users, or for all users.
The approaches described so far are based on the idea of generating adequate results for a user query. However, not only the generated results are important for a user; the way the information is presented is a major topic too. In contrast to result generation, usability and user experience are the main factors to be investigated here. To provide an adequate data visualization, an adaptive visualization system can be used to adapt the presentation of results to the user's behavior [10]. Non-automatic adaptation systems can also support the user in information visualization, for instance through the possibility to create a so-called "Knowledge Cockpit", in which a user can combine preferred visualizations to work and interact with the results of a search request [11].
3 Search Intention Analysis

3.1 Users' Searching Behavior

Information and Communication Technologies (ICT) do not only provide new opportunities for searching and gathering information; they have already changed the social behavior of humans in different aspects of life. An example of this "new behavior" could be the research homework of a student in school or university: a student gets a topic to research as homework and has to present it. Just a few years ago, the student had to go to libraries and search in books for the given topic. The information acquisition process took longer and was not that easy to handle, but the gathered knowledge was deeper. Students had to carry out the whole information acquisition process manually: finding the right books and further literature, searching within the books, and noting the relevant points for use in their presentations.
Today, the information acquisition process has changed completely. Students search primarily on the web, using search engines or knowledge databases. The knowledge and information acquisition process is faster, but not deep. The acquired information is mostly rudimentary: while the process itself is fast, users' general knowledge becomes broader but not deeper.
With this changed information search behavior, different questions have arisen in recent years which need to be investigated. One main question is: what is the search intention of a user searching for a term, and how can the information be adequately presented to the user? Silverstein et al. presented in [12] statistics on the search behavior in 285 million search sessions and about one billion search queries. Our concept for analyzing search intentions considers two main search parameters: on the one hand the
number of terms used in a search query, and on the other hand the frequency class of each term. For demonstrating the distribution of the term quantity, the work of Silverstein et al. [12] is essential and gives a very good overview. Table 1 shows their statistics concerning the number of terms per query [12].

Table 1. Statistics concerning the number of terms per query [12]
The distribution of the term quantity shows that most valid search queries (queries with zero terms are not included) contain one or two terms, each accounting for about 26 percent. Queries with three terms and queries with more than three terms are roughly equal, at 15 and about 13 percent, respectively.

3.2 Targeted and Explorative Search

A very simple classification of search behavior is the differentiation between targeted and explorative search. A targeted search can be defined as an explicit search for a fact or for explicit information. This kind of search anticipates declarative knowledge or information as its result. An explorative search does not require explicit knowledge about the domain on the part of the user. The user may browse the domain to gain an overview or to get a picture of something he is not familiar with. If the user has some pre-knowledge, he may want to pick up some additional facts, or he may want to consolidate the knowledge by reading, thinking, and thus cognitively structuring the facts again.
Adaptive visualizations help users perceive the information in search results. These visualization techniques preprocess the data, using additional information about the search domain or user-specific data, to present the information adequately and in a user-adapted way. An important criterion here is the differentiation of the user's search intention into targeted and explorative search. The assignment of search queries to the class of targeted search intention can be performed by measuring two criteria: (a) the quantity and (b) the frequency of the search terms in the query. These two criteria, quantity and frequency, are discussed in detail in the subsequent sections.

3.3 Term Quantity in Search Queries

A very important parameter for identifying a targeted search is the quantity of search terms. The quantity of terms in a search query is an indication of the user's existing pre-knowledge. A targeted search must contain more than one term: only if a search query is sufficiently specific can it be categorized as targeted.
An example of a search with a high quantity of search terms could be typing the following question as search terms: "When was president Obama born?". The anticipated information is very specific, namely a date. The same targeted search can be formulated as "President Obama birthday" or "day of birth Barack Obama".
If, in the above example, the search were formulated with just one term, e.g., "Obama", it would not really be clear what the user is searching for or which information is expected. Maybe the user is searching for general information about president Obama, or about someone else with the same name; the user may even want to know what "Obama" is and be searching, for example, for the Japanese city Obama. It can thus be stated clearly that a targeted search contains more than one search term. But does the converse hold? Are all search queries that contain more than one term targeted searches? This question has already been answered above: if a user does not know anything about a term and just wants general information, or wants to browse the knowledge related to the term, the query may still be formulated with more than one term, e.g., "What is Obama?". So more than just the quantity of terms is needed to separate targeted from explorative search and to provide an adequate presentation based on the user's intentions. This second criterion can be measured with a very simple and readily available tool: frequency lists. In the following section the classification using frequency lists is described.

3.4 Term Frequency in Search Queries

The term frequency describes the number of occurrences of a term in a specific language. The more a term is used in a language, the more common it is; but the more a term is used, the more general and unspecific it is. Therefore, the term frequency has to be interpreted inversely: the more specific a term is, the more pre-knowledge the user must have, and the more targeted the search query is.
A well-known inverse interpretation of term frequency is the TF-IDF algorithm, which calculates the relevance of terms for specific documents by examining the term occurrence in the specific document compared to the term occurrence in the whole document space. For analyzing the relevance of terms in search queries, a corresponding examination of term occurrences in the query space does not yield reliable results, because queries contain too little information. Nevertheless, TF-IDF values can be used to identify the relevance of search query terms in the context of the search space, respectively the document space. But this requires an expensive preprocessing of the search space. Even worse, if the search space documents change, as is the case on the internet, the preprocessing has to be repeated time after time to yield reliable term relevancies. To overcome this problem, a new approach for search term relevance measurement is described in this paper which does not depend on the search space but only on language-specific term occurrences.
The search term relevance can be measured using language-specific frequency lists. These frequency lists count the number of occurrences of a term in the natural
language and arrange the terms according to their number of occurrences. The terms are then chunked and assigned to valued frequency classes. The values are logarithmic, so the assignment of a term to the next higher frequency class indicates that this term is used half as often as terms of the lower frequency class, and is thus twice as specific. For example, the term "the" is used most often in the English language; thus it is in the first frequency class, valued "0". A term assigned to the frequency class valued "1", let us say the term "that", is used half as often as the term "the". It follows that the term "that" is twice as important, or specific, for the search query in comparison to "the".
The main advantage of this approach is that no preprocessing of the whole search space is required to obtain reliable term relevance values. Adding new documents only increases the numbers of occurrences of the terms used; afterwards, the chunks that determine the frequency class values are recalculated. Thus it is not necessary to reprocess the whole search space.
The frequency classes can be used to measure the specificity of a term in the search query. The higher the frequency classes, the more a targeted search is indicated. This holds because the more specific a term is, the more accurate the user's intention about the search domain is.

3.5 Targeted Search Intention Deduced by Quantity Measurement and Frequency Lists

A targeted search is given when the user states a search query using multiple terms (quantity) and specific terms (frequency). This can be determined algorithmically by summing the frequency class values of the search terms: if the calculated sum exceeds a threshold value, a targeted search is assumed (a short code sketch at the end of this section illustrates the rule). The quantity of terms in search queries is an indication of the user's existing pre-knowledge: the more terms a user states in a query, the more specifically the intended question is formulated. This assumption holds if at least one of the two following conditions is fulfilled: either the terms localize the result space in a semantic way, or the terms localize the result space by their frequency class, i.e., their specificity. Since algorithms for semantic interpretation are expensive and not yet reliable enough, the presented approach of using frequency classes to assign search queries to the classes of targeted and explorative search yields the most reliable results while requiring only little preprocessing.
The frequency class of a term in a search query is an indication of the accuracy with which the search domain is localized. The more the search domain can be narrowed down by specific query terms, the more awareness of the targeted search results the user has. The combination of these two criteria, quantity and frequency, determines whether a search query represents a targeted search. Furthermore, this approach is easily applicable and yields reliable results.
The determined class, targeted or explorative, can be used to adapt the search result visualization automatically to the user's search intent. If the user stated a targeted search, visualization techniques should be presented which let the user
perceive the targeted information. In comparison, when an explorative search intention is given, other visualization techniques should be used which allow an explorative search through the search result space. Moreover, this explorative search result should provide the user with the opportunity to explore the document space, not limited to the search result space, because it can be assumed that the user had no specific intention about what he is looking for. In the next chapter a case study is given for the automatic adaptation of search result visualizations to targeted and explorative search intentions.
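Before turning to the case study, the classification rule of Section 3.5 can be summarized in a short, self-contained sketch. The corpus counts, the treatment of unknown terms and the threshold below are all invented for illustration; a real implementation would read the frequency classes from a published language-specific frequency list.

```python
import math

# Hypothetical corpus frequencies (occurrences per million words);
# in practice these come from a language-specific frequency list.
CORPUS_FREQ = {"the": 60000, "what": 5000, "is": 30000, "a": 40000,
               "workflow": 12, "wfmc": 0.3, "xpdl": 0.2}

def frequency_class(term, freq=CORPUS_FREQ):
    """Frequency class in the style described above: 0 for the most
    frequent word, +1 for each halving of the occurrence count."""
    top = max(freq.values())
    f = freq.get(term.lower(), 0.1)  # unknown terms treated as very rare
    return round(math.log2(top / f))

def classify_query(query, threshold=25):
    """Sum the frequency class values of all terms; a large sum means
    many and/or specific terms, which we read as a targeted search."""
    terms = query.split()
    score = sum(frequency_class(t) for t in terms)
    if len(terms) > 1 and score >= threshold:
        return "targeted"
    return "explorative"

print(classify_query("Workflow WFMC XPDL"))  # -> targeted
print(classify_query("what is a workflow"))  # -> explorative
print(classify_query("Workflow"))            # -> explorative (single term)
```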
4 Visualization of Search Results: A Case Study

In the last chapter an approach was presented to identify whether a search query represents a targeted or an explorative search intention. In this chapter an adaptive search result visualization for targeted and explorative search intentions is presented.
If the user states an unspecific search query like "Obama", the examination of quantity and frequency values leads to the assumption that this query represents an explorative search. Therefore, visualization techniques are presented which allow an explorative navigation through the search result space, and beyond it through the document space, as presented in Fig. 1.
Fig. 1. Search result visualization for an explorative search intention
In contrast, if the user states a query with a high quantity of terms, and these terms are valued highly by their frequency classes, a targeted search intention is given. In a targeted search the user wants to perceive specific information about the targeted information object; thus, visualization techniques are applied which allow the user to read detailed information about the search result space, as presented in Fig. 2.
Fig. 2. Search result visualization for a targeted search intention
5 Conclusion

In this paper we presented a new approach for classifying users' search behavior and for visualizing the results based on this classification. The approach classifies user search behavior into two categories, targeted and explorative search. For the classification, two different parameters of the user's query are used: the quantity of the search terms and the frequency class of each searched term. The paper concluded with a differentiated visualization for the different search classes: a targeted search is visualized with a single content item, whereas an explorative search presents the information in a cockpit of visualizations and enables an exploration of the information.

Acknowledgments. The work described here was developed as a part of the Core-Technology-Cluster for Innovative User Interfaces and Visualizations of the THESEUS Program, a 60-month program partially funded by the German Federal Ministry of Economics and Technology. For more information visit semavis.de or theseus-programm.de.
References

1. Höchstötter, N., Koch, M.: Standard parameters for searching behavior in search engines and their empirical evaluation. Journal of Information Science 35(1), 45–65 (2009)
2. Zaragoza, H., Cambazoglu, B.B., Baeza-Yates, R.: Web search solved? All result rankings the same? In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM 2010), pp. 529–538. ACM, New York (2010)
3. Markowetz, A., Yang, Y., Papadias, D.: Keyword search over relational tables and streams. ACM Trans. Database Syst. 34(3), Article 17, 51 pages (2009)
4. Wang, K., Gloy, N., Li, X.: Inferring search behaviors using partially observable Markov (POM) model. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM 2010), pp. 211–220. ACM, New York (2010)
5. Chen, Y., Wang, W., Liu, Z., Lin, X.: Keyword search on structured and semi-structured data. In: Binnig, C., Dageville, B. (eds.) Proceedings of the 35th SIGMOD International Conference on Management of Data (SIGMOD 2009), pp. 1005–1010. ACM, New York (2009)
6. Demartini, G.: Leveraging semantic technologies for enterprise search. In: Proceedings of the ACM First Ph.D. Workshop in CIKM (PIKM 2007), pp. 25–32. ACM, New York (2007)
7. Teevan, J., Morris, M.R., Bush, S.: Discovering and using groups to improve personalized search. In: Baeza-Yates, R., Boldi, P., Ribeiro-Neto, B., Barla Cambazoglu, B. (eds.) Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM 2009), pp. 15–24. ACM, New York (2009)
8. Carmel, D., Zwerdling, N., Guy, I., Ofek-Koifman, S., Har'el, N., Ronen, I., Uziel, E., Yogev, S., Chernov, S.: Personalized social search based on the user's social network. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM 2009), pp. 1227–1236. ACM, New York (2009)
9. Ma, Z., Pant, G., Sheng, O.R.L.: Interest-based personalized search. ACM Trans. Inf. Syst. 25(1), Article 5 (2007)
10. Nazemi, K., et al.: A conceptual model for adaptive visualization. In: Proc. of the HCI International 2011. LNCS. Springer, Heidelberg (2011)
11. Nazemi, K., Burkhardt, D., Breyer, M., Stab, C., Fellner, D.W.: Semantic Visualization Cockpit: Adaptable Composition of Semantics-Visualization Techniques for Knowledge-Exploration. In: International Association of Online Engineering (IAOE): International Conference on Interactive Computer Aided Learning (ICL 2010), pp. 163–173. University Press, Kassel (2010)
12. Silverstein, C., Marais, H., Henzinger, M., Moricz, M.: Analysis of a very large web search engine query log. SIGIR Forum 33(1), 6–12 (1999), http://doi.acm.org/10.1145/331403.331405, doi:10.1145/331403.331405
Adaptations Based on Ontology Evolution as a Means to Exploit Collective Intelligence

Laura Burzagli, Francesco Gabbanini, and Pier Luigi Emiliani

Institute for Applied Physics, National Research Council of Italy, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Firenze, Italy
{L.Burzagli,F.Gabbanini,P.L.Emiliani}@ifac.cnr.it
Abstract. The paper discusses possible new facets of the concept of adaptation, focused on allowing users to intervene in the adaptation process by giving them the possibility both to insert new data and to alter the structure of the ontological knowledge bases which contain adaptation determinants. The approach is based on exploiting the synergy between Web 2.0 and the Semantic Web, and may give the opportunity to offer a more personalized, comprehensive and tailored service to all users, by allowing user communities to model a domain of interest and to develop it according to their interests. The paper discusses the background ideas and gives technical details of an example implementation of a software component designed to perform ontology evolution based on the analysis of contributions expressed by users of a service in natural language.

Keywords: Design for All, Semantic Web, Web 2.0, Adaptations.
1 Introduction

The approach of Design for All (envisioning services, from the design phase on, that are suitable for the broadest possible section of the population, to the greatest extent possible, without the need for specialized design) is proving to remain valid as one of the most effective design methodologies in ICT, even as technological advances go on. One of the main technical implementations of Design for All is based on automatic adaptations [1, 2, 3, 4], considered in the form of adaptivity (dynamically enhancing the interaction by reconfiguring a service while a system is in use) and adaptability (adapting a system by initializing it according to a number of parameters).
Technological and social developments that have happened since the introduction of the concept of adaptation suggest that the latter could be complemented to take into account novel aspects that have emerged in the world of ICT, and in the World Wide Web in particular. From a technological point of view, the Semantic Web is now mature enough to offer a flexible way to organize information such that forms of intelligence can be introduced in information processing systems. From a social point of view, a relevant phenomenon is represented by Web 2.0, which has given users the possibility to participate in the production of contents, making them pro-sumers of information.
The Semantic Web and Web 2.0 could give the opportunity to define new facets of adaptation, allowing users to take an active role in the process by having access to the ontological knowledge bases which contain adaptation determinants (i.e., the elements that form the basis for deciding upon adaptations), both to insert new data and to alter their structure. In this sense, we envision that the information which models users and domains of interest could also be modeled by the users themselves. This can give rise to new information systems which better respond to users' requests and better reflect users' preferences and needs. In order to implement this approach, in this paper we describe a strategy that adopts semantically enhanced techniques to model knowledge bases, which offer more expressivity than relational models and, being based on formal semantics, are interpreted unambiguously by different agents. In this perspective, adaptations are based on ontologies which are augmented thanks to user collaboration and content contribution. The paper discusses the background ideas and gives technical details of an example implementation of one of the components of a wider system that is designed to perform ontology evolution based on the analysis of contributions expressed by users of a social network in natural language.
2 A Possible Role of Users and Communities in the Adaptation Process

Since its introduction, the concept of adaptation has referred to the content, structure and presentation of information. Adaptability and adaptivity are based on processing information which models a domain of interest and a user profile containing data that may be continuously modified by the interaction with the system. Many examples can be given where adaptability and adaptivity (of contents and UI) give good results while being based on adaptation determinants that are stored in an information base having a static structure and that are not directly provided by users, except for information regarding the user profile. For example, products and services implemented within the ACTS AVANTI AC042 project [1, 4] and the IST-2001-20656 PALIO project [5] can be cited. It appears from a recent survey reported in [6] that known adaptation/personalization systems still follow a similar approach in that, although some are based on ontologies, their structure seems to be fixed by designers and not allowed to change.
Since the introduction of the idea of adaptive systems, one of the most significant changes on the web has been the important role assumed by users in the process of generating new information, due to the advent of the so-called Web 2.0 [7]. The activity which we are carrying out in our laboratories investigates whether the phenomenon of user participation can have an impact on adaptation-capable systems. The idea is that a first contribution may come from the fact that users are enabled to contribute to the enrichment of "raw" data contents related to the domain of interest, which implies that information is more up to date and complete, so that adaptation engines can potentially offer a more personalized, comprehensive and tailored service to all users. However, another and even more innovative aspect may emerge from these new scenarios: besides contributing "raw" contents, users of a service, structured in communities, could also have a role in the enrichment of the information
structure, which is usually given (aprioristically) by a team of designers. This can be done either on a direct basis, by asking users to follow explicit procedures to extend and edit types, add new ones, and define new properties that link existing data, following the model of the Freebase application (see http://www.freebase.com, last visited on January 28, 2011, and [8]), or by setting up automatic procedures that achieve the same results by analyzing users' discussions with Natural Language Processing techniques. As a result of such a process, which is open to reflect users' attitudes, interests, opinions and preferences, the information structure and contents (modeled through ontologies) may offer a better fit for a given domain of interest. In this way, a service (as a whole, thus including user interface, contents and information structure) can be reconfigured through adaptation processes while it is in use, based on information which is not necessarily fixed and does not necessarily have a fixed structure, but is allowed to change, reflecting not only the contributions of a single user, but the wider contribution of a community of users having similar interests.
3 A Web 2.0-Semantic Web Synergy
Achieving the objectives introduced above requires combining the relative strengths of two of the aspects in which intelligence manifests itself on the web: Web 2.0 (providing the information, through user-generated contributions) and the Semantic Web (providing suitable techniques to organize data in a flexible and efficient way, and methods to reason over the data). In our laboratories we are studying a new family of systems capable of exploiting synergies between Web 2.0 and the Semantic Web: an example of such systems is described in [9], which identifies and describes the main building blocks of a so-called Collective Knowledge Management System. The system foresees the use of a knowledge base structured as an ontology which can evolve thanks to users' contributions: this opens up a new perspective from which to look at the concept of adaptivity, one that directly regards contents and how they are structured. In the system, a fundamental block is designed to process information that comes from users in an unstructured form, provided through the use of Web 2.0 tools, often as natural language. This requires an elaboration layer in the knowledge management system that extracts meaningful information from text, which is then structured in the ontology (eventually altering the structure of the ontology itself). This layer is called the Ontology Evolution Manager.

3.1 The Ontology Evolution Manager
The Ontology Evolution Manager is designed to take corpora of textual documents as input, produce a series of RDF statements, and use them to enrich ontologies (see Figure 1).

Fig. 1. Sketch of the main functionalities of the Ontology Evolution Manager: a corpus of text documents is processed by the Annotation Engine and the Feed Adapter to produce RDF statements

It was designed to model knowledge contributed by web-based social networks in a flexible structure and, therefore, it is specifically suitable for unstable and evolutionary information spaces, as is the case, for example, in the tourism or mobility domains. Free text annotations posted by users progressively give rise to a description
of the domain that reflects its several facets; moreover, this description is user-contributed, thus showing what the users of a community think is important in that given domain. In order to achieve this kind of flexibility, our framework uses ontologies instead of traditional relational databases to implement the data layer: whereas a database schema is fixed at design time, and changes imply huge modifications at the software and systems level, ontology-based knowledge bases are much more flexible, in that regular statements can be used to define the structure and schema of the knowledge base as well as individuals or instances, and it is reasonably easy to extend them with entities and relationships that alter the way information is structured. Another important point is that information in ontologies can be searched effectively through the SPARQL language, which allows very expressive queries that take into account a variety of relationships between entities (while in relational databases relationships are modeled only through foreign keys; see also [10]).

3.2 A Sample Scenario
As an example to illustrate the concepts discussed so far, think of an ontology describing aspects of the tourism domain, which acts as the information base for an eTourism system that allows searching for accommodations and posting comments on them. The ontology could be designed to describe how an accommodation is structured (e.g., a hotel is a building made up of a number of floor levels, each having several rooms). Moreover, the system may include a navigator able to help hotel customers reach the hotel by following a suitable path. Suppose that a user in a wheelchair wants to reach Hotel X from the railway station and that the system suggests a path which happens to require going up a stairway at the end of Road A. That user may alert the system by writing a post reporting the presence of the stairway at the end of Road A: the post is interpreted and the relevant information is added to the knowledge base, thus allowing users with similar characteristics (e.g., persons with heavy baggage or a baby stroller) to obtain a more suitable result when they perform a similar query to the system.
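To make the scenario more concrete, the sketch below shows the kind of SPARQL query the navigator could run once the user's post has been absorbed into the knowledge base. It is a minimal illustration using the Sesame API (one of the two ontology adapters foreseen in our implementation, see Section 4); the namespace and the vocabulary (PathToHotelX, hasSegment, hasObstacle) are hypothetical and not taken from the actual ontology:

```java
import org.openrdf.query.BindingSet;
import org.openrdf.query.QueryLanguage;
import org.openrdf.query.TupleQuery;
import org.openrdf.query.TupleQueryResult;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sail.SailRepository;
import org.openrdf.sail.memory.MemoryStore;

public class AccessibleRouteQuery {

    // Hypothetical namespace: the ontology's actual IRIs are not published here.
    private static final String NS = "http://example.org/etourism#";

    public static void main(String[] args) throws Exception {
        // In-memory store for the sketch; a deployment would connect to the
        // persistent repository maintained behind the Ontology Evolution Manager.
        Repository repo = new SailRepository(new MemoryStore());
        repo.initialize();
        RepositoryConnection con = repo.getConnection();
        try {
            // List segments of the suggested path that users have reported as
            // carrying an obstacle (e.g., the stairway at the end of Road A).
            String sparql =
                "PREFIX tour: <" + NS + ">\n" +
                "SELECT ?segment ?obstacle WHERE {\n" +
                "  tour:PathToHotelX tour:hasSegment ?segment .\n" +
                "  ?segment tour:hasObstacle ?obstacle .\n" +
                "}";
            TupleQuery query = con.prepareTupleQuery(QueryLanguage.SPARQL, sparql);
            TupleQueryResult result = query.evaluate();
            while (result.hasNext()) {
                BindingSet row = result.next();
                System.out.println(row.getValue("segment")
                        + " blocked by " + row.getValue("obstacle"));
            }
            result.close();
        } finally {
            con.close();
        }
    }
}
```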
Also, suppose that the ontology includes a room entity and that a user reviewing one of the hotels s/he has stayed in writes something like: "Hotel X has a beautiful sauna". The system is then able to (a) insert sauna as a sub-entity of room, and (b) insert "Hotel X sauna" as an individual that is part of "Hotel X". This enriches the ontology with respect to both entities and individuals. Subsequent users would be able to benefit from the fact that Hotel X has a sauna, which is information that was not foreseen in the original information structure and that would have been discarded had the data layer been structured in a fixed way. This in turn reflects the system's capability to adapt continuously as users interact with it. Moreover, the domain model is left to the community, which has a form of control over it and can develop it in the direction that is currently of most interest to the community itself, keeping it relevant over time. A sketch of the RDF-level effect of the sauna example is given below; the following section then describes the Ontology Evolution Manager, which constitutes the core of the evolution process.
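At the RDF level, the two enrichment steps of the sauna example amount to adding a handful of statements. The following minimal sketch again assumes the Sesame API; the namespace, the partOf property and the individual names are hypothetical:

```java
import org.openrdf.model.URI;
import org.openrdf.model.ValueFactory;
import org.openrdf.model.vocabulary.RDF;
import org.openrdf.model.vocabulary.RDFS;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sail.SailRepository;
import org.openrdf.sail.memory.MemoryStore;

public class SaunaEnrichment {

    // Hypothetical namespace and property names, not taken from the actual ontology.
    private static final String NS = "http://example.org/etourism#";

    public static void main(String[] args) throws Exception {
        Repository repo = new SailRepository(new MemoryStore());
        repo.initialize();
        RepositoryConnection con = repo.getConnection();
        try {
            ValueFactory vf = con.getValueFactory();
            URI room = vf.createURI(NS, "Room");
            URI sauna = vf.createURI(NS, "Sauna");
            URI hotelX = vf.createURI(NS, "HotelX");
            URI hotelXSauna = vf.createURI(NS, "HotelXSauna");
            URI partOf = vf.createURI(NS, "partOf");

            // (a) Ontology learning: Sauna becomes a sub-entity of Room.
            con.add(sauna, RDFS.SUBCLASSOF, room);
            // (b) Ontology population: "Hotel X sauna" is asserted as an
            //     individual of Sauna that is part of Hotel X.
            con.add(hotelXSauna, RDF.TYPE, sauna);
            con.add(hotelXSauna, partOf, hotelX);

            System.out.println("Statements in store: " + con.size());
        } finally {
            con.close();
        }
    }
}
```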
4 Implementation Aspects
The main issues to be faced in implementing the Ontology Evolution Manager consist in: 1. extracting machine-readable knowledge from the text which makes up users' posts in the social network; 2. transforming the extracted information into a form that is suitable for insertion into an ontology. It is to be noted that the knowledge to be extracted may in principle regard both assertions about individuals (as in "hotel X has a good cuisine", where "hotel X" represents an individual of the "hotels" entity) and assertions about entities (as in "hotels have bedrooms"). In the first case the problem under study is named "ontology population", while in the second it is named "ontology learning", the two terms being subsumed under the more general "ontology evolution". A Java-based implementation of the software components used for information extraction from texts and for ontology evolution (both in the sense of "ontology population" and of "ontology learning"; see the Annotation Engine block in Figure 1) was designed in our laboratories. The focus in this section is mainly on the description of the Annotation Engine. As for the Feed Adapter, for the purpose of this paper it is sufficient to say that it was designed as a middleware, acting as an adapter between annotations and ontologies. For the extraction of knowledge from texts, Natural Language Processing (NLP) techniques were employed [11]. Generally, these consist of a series of steps in which text is analyzed and annotated with a variety of information. These steps are typically: splitting sentences; splitting text into words through tokenization; part-of-speech (POS) tagging, a form of grammatical tagging which marks up the words in a text as corresponding to a particular part of speech, based both on their definition and on their relationship with adjacent and related words in a phrase, sentence, or paragraph; term indexing using gazetteers (a particular sort of dictionary); and user-defined transduction processes that further analyze texts using finite state transducers that operate over annotations based on regular
expressions, for pattern matching, semantic extraction, and other operations over the syntactic trees produced by the previous steps. The Annotation Engine block is meant to implement NLP techniques so as to process and annotate textual contents, in order to provide coherent and structured inputs to the Feed Adapter block, which in turn uses them to enrich ontologies with new concepts and assertions. Although several algorithm implementations exist that perform some of the processing steps described above (e.g., LExO, Ontomat, OpenNLP, Text2Onto), the Annotation Engine block is based on the General Architecture for Text Engineering (GATE [12, 13]).
LExO, available at http://code.google.com/p/lexo/, last visited on April 28, 2011.
Ontomat, available at http://annotation.semanticweb.org/ontomat/index.html, last visited on April 28, 2011.
OpenNLP, available at http://opennlp.sourceforge.net/, last visited on April 28, 2011.
Text2Onto, available at http://sourceforge.net/projects/texttoonto/, last visited on April 28, 2011.
GATE, available at http://gate.ac.uk/, last visited on April 28, 2011.
GATE provides a modular, object-oriented framework implemented in Java to embed language processing functionality in diverse applications. It can be extended and customized for different tasks by loading plugins, which can in turn contain a number of resources able to hold linguistic data and to process data. GATE is distributed with an Information Extraction (IE) system called "A Nearly-New IE System" (ANNIE), which relies on finite state algorithms and the Java Annotation Patterns Engine (JAPE) to process text corpora, performing operations such as sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and pronominal co-reference. The functionalities offered by the GATE APIs were reorganized to make them available and easily usable within the more general framework of the Collective Knowledge Management System. For this purpose, the GateManager Java class, sketched in the UML diagram of Figure 2, was designed to act as a façade for the various functionalities made available by the GATE API, such as initializing the GATE system, registering GATE plugins and resources, managing text corpora, and parsing text corpora.

Fig. 2. UML class diagram of the Annotation System, part of the Ontology Evolution Manager: the GateManager façade with its register and parser visitors (gazetteer, tokenizer, sentence splitter, OpenCalais, transducer) and the Sesame and OWL-API adapters with their model handlers

The GateManager class is responsible for managing the annotation process, which is based on a pipeline approach. A text document enters the pipeline and gets processed by the various registered plug-in resources, which, in turn, enrich it with a set of annotations. Each plug-in may contain a set of text annotation resources, each of which can take advantage of the annotations produced by the resources that precede it in the pipeline. At the end of the text processing pipeline an annotated text document is obtained, and convenient Java classes were implemented that allow iterating over the annotations. With reference to the UML diagram in Figure 2, showing the underlying architecture of the natural language processing infrastructure, the GateManager class manages the annotation engine: for this purpose it needs plug-ins and
resources to be registered for pipeline annotation, each resource implementing an annotation step (see the blocks within the text processing pipeline in Figure 3). This process is implemented using a visitor pattern [14]. For more technical details please refer to [15].

Fig. 3. An example text processing pipeline: document clean-up, sentence splitter, sentence tokenizer, POS tagger, …, JAPE transducer(s), turning a text document into an annotated document

4.1 A Sample Test Case
In order to evaluate the correctness and validity (at least from a technical point of view) of the approach, a sample application was set up in which an example ontology is enriched by inspecting a corpus of text documents consisting of 88 user-generated reviews of accommodations, taken from the website http://www.accessatlast.com. Each review is expressed as free text and reflects the opinion of a user regarding an accommodation s/he has stayed in. A relation extractor (implemented by the TagCloudRelationExtractor class) uses the Annotation Engine to parse the reviews and combines the results with WordNet (used as a source of background knowledge), thus obtaining the set of novel entities and relationships depicted in Table 1; a sketch of how such a pipeline is assembled with the GATE Embedded API is given after the table. More details can be found in [16].

Table 1. Relations identified by the TagCloudRelationExtractor after parsing a corpus of 88 reviews from http://www.accessatlast.com. Bold italic characters in the original typesetting marked entities that were not present in the original ontology.

Term         Relation     Term        |  Term        Relation     Term
Barn         subClassOf   Building    |  Wheelchair  subClassOf   Chair
Cottage      subClassOf   Building    |  Studio      subClassOf   Apartment
Castle       subClassOf   Building    |  Wall        partOf       Building
Chalet       subClassOf   Building    |  Wall        partOf       Room
Chair        subClassOf   Furniture   |  Doorway     partOf       Wall
BookCase     subClassOf   Furniture   |  Wall        partOf       Hallway
Sofa         subClassOf   Furniture   |  Wall        partOf       Hall
Dresser      subClassOf   Furniture   |  Carport     subClassOf   Building
Bed          subClassOf   Furniture   |  Villa       subClassOf   Building
Suite        subClassOf   Apartment   |  Tub         partOf       Bathroom
Restaurant   subClassOf   Building    |  Wall        partOf       Room
Hospital     subClassOf   Building    |  Wall        partOf       Hallway
Stairs       subClassOf   Stairway    |  Garage      subClassOf   Building
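As forward-referenced above, the following is a hedged sketch of how a pipeline similar to that of Figure 3 can be assembled with the GATE Embedded API. It assumes a GATE installation with the ANNIE plugin in its default location (pre-GATE-8 layout) and is an illustration only, not the code of our Annotation Engine:

```java
import gate.Corpus;
import gate.Document;
import gate.Factory;
import gate.Gate;
import gate.ProcessingResource;
import gate.creole.SerialAnalyserController;
import java.io.File;

public class PipelineSketch {
    public static void main(String[] args) throws Exception {
        Gate.init();  // initialize GATE Embedded
        // Load the ANNIE plugin, which contains the standard resources.
        Gate.getCreoleRegister().registerDirectories(
            new File(Gate.getPluginsHome(), "ANNIE").toURI().toURL());

        // Assemble a pipeline analogous to Figure 3 (clean-up and JAPE
        // transducer stages are omitted for brevity).
        SerialAnalyserController pipeline = (SerialAnalyserController)
            Factory.createResource("gate.creole.SerialAnalyserController");
        pipeline.add((ProcessingResource) Factory.createResource(
            "gate.creole.tokeniser.DefaultTokeniser"));
        pipeline.add((ProcessingResource) Factory.createResource(
            "gate.creole.splitter.SentenceSplitter"));
        pipeline.add((ProcessingResource) Factory.createResource(
            "gate.creole.POSTagger"));

        // A one-document corpus standing in for the review corpus.
        Corpus corpus = Factory.newCorpus("reviews");
        Document doc = Factory.newDocument("Hotel X has a beautiful sauna.");
        corpus.add(doc);

        pipeline.setCorpus(corpus);
        pipeline.execute();

        // Token annotations now carry POS tags in their feature maps.
        System.out.println(doc.getAnnotations().get("Token"));
    }
}
```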
Different text processing pipelines can give rise to different extractors, each capable of providing a different type of enrichment to the ontology (e.g., in terms of new instances of certain entities), thus keeping the knowledge base in line with the evolution of the interests of the community that uses the service, and ultimately allowing the service to better respond to users' requests and better reflect users' preferences and needs.
5 Conclusions
The paper discusses how novel aspects that have emerged in the world of ICT, and in the World Wide Web in particular, may contribute to defining new facets of the concept
of adaptation. According to this approach, users and user communities may intervene to change both the informative contents and the structure of the ontological knowledge base on which a service builds, thus allowing the service itself (as a whole, meaning user interface, contents and information structure) to be reconfigured through adaptation processes, based on information which is not necessarily fixed and does not necessarily have a fixed structure. This user-driven evolution reflects what the users of a community think is important in a given domain. A description sketching some details of the Ontology Evolution Manager, which constitutes the core of the evolution process, is presented. An example of an ontology learning process, based on user-generated comments about tourist accommodations, is discussed. Future work aimed at improving the system will regard the evaluation of more refined NLP techniques. On the theoretical side, efforts will focus on how to cope with inconsistent assertions that the system may attempt to insert into the ontology. These are inevitably generated during ontology enrichment processes based on background knowledge coming from the web, and the topic is receiving attention from the scientific community [17]. As for the evaluation of the actual added value brought by Web 2.0 to ontology evolution processes, it is being investigated in the context of the so-called "task-based approaches" for the evaluation of ontology learning [18].
References
1. Stephanidis, C., Paramythis, A., Sfyrakis, M., Stergiou, A., Maou, N., Leventis, A., Paparoulis, G., Karagiannidis, C.: Adaptable and adaptive user interfaces for disabled users in the AVANTI project. In: Campolargo, M., Mullery, A. (eds.) IS&N 1998. LNCS, vol. 1430, pp. 153–166. Springer, Heidelberg (1998)
2. Stephanidis, C., Savidis, A.: Universal Access in the Information Society: Methods, Tools, and Interaction Technologies. Universal Access in the Information Society 1(1), 40–55 (2001)
3. Brusilovsky, P.: Adaptive hypermedia. User Modeling and User-Adapted Interaction 11(1-2), 87–110 (2001)
4. Kobsa, A.: Generic user modeling systems. User Modeling and User-Adapted Interaction 11(1-2), 49–63 (2001)
5. ICS-FORTH, IROE-CNR, University of Kent: Design guidelines for the types of adaptation to be supported by the PALIO system core. Deliverable 9 of the EU Project IST-20656 PALIO (2002)
6. Torre, I.: Adaptive systems in the era of the semantic and social web, a survey. User Modeling and User-Adapted Interaction 19(5), 433–486 (2009)
7. O'Reilly, T.: What is Web 2.0 (2005), http://oreilly.com/web2/archive/what-is-web-20.html
8. Razmerita, L., Jusevičius, M., Firantas, R.: New Generation of Social Networks Based on Semantic Web Technologies: the Importance of Social Data Portability. In: Dattolo, A., Tasso, C., Farzan, R., Kleanthous, S., Vallejo, D.B., Vassileva, J. (eds.) Proceedings of the International Workshop on Adaptation and Personalization for Web 2.0, pp. 79–87 (2009)
9. Burzagli, L., Como, A., Gabbanini, F.: Towards the convergence of web 2.0 and semantic web for E-inclusion. In: Miesenberger, K., Klaus, J., Zagler, W., Karshmer, A. (eds.) ICCHP 2010. LNCS, vol. 6179, pp. 343–350. Springer, Heidelberg (2010)
10. Hebeler, J., Fisher, M., Blace, R., Perez-Lopez, A.: Semantic Web Programming. Wiley Publishing, Inc., Indianapolis (2009)
11. Buitelaar, P., Cimiano, P. (eds.): Ontology Learning and Population: Bridging the Gap between Text and Knowledge. Frontiers in Artificial Intelligence and Applications, vol. 167. IOS Press, Amsterdam (2008)
12. Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V.: GATE: A framework and graphical development environment for robust NLP tools and applications. In: Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (2002)
13. Maynard, D., Li, Y., Peters, W.: NLP techniques for term extraction and ontology population. In: Proceedings of the 2008 Conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge, pp. 107–127. IOS Press, Amsterdam (2008)
14. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Longman Publishing Co., Boston (1995)
15. Gabbanini, F.: On a Java based implementation of ontology evolution processes based on Natural Language Processing. Tech. Rep. 65-8, Institute for Applied Physics, National Research Council of Italy (2010)
16. Gabbanini, F.: Exploiting web resources for the identification of relations between concepts: a Java based implementation and case study. Tech. Rep. 65-9, Institute for Applied Physics, National Research Council of Italy (2010)
17. Zablith, F., d'Aquin, M., Sabou, M., Motta, E.: Using ontological contexts to assess the relevance of statements in ontology evolution. In: Cimiano, P., Pinto, H.S. (eds.) EKAW 2010. LNCS, vol. 6317, pp. 226–240. Springer, Heidelberg (2010)
18. Dellschaft, K., Staab, S.: Strategies for the evaluation of ontology learning. In: Buitelaar, P., Cimiano, P. (eds.) Ontology Learning and Population: Bridging the Gap between Text and Knowledge. IOS Press, Amsterdam (2008)
The Contribution of Multimodal Adaptation Techniques to the GUIDE Interface José Coelho and Carlos Duarte LaSIGE, Faculty of Sciences, University of Lisbon, Lisbon, Portugal,
[email protected],
[email protected]

Abstract. This paper describes the European Union funded project GUIDE, focusing on multimodal and context adaptation techniques, as well as on the importance of having a multimodal system architecture based on user models and integrated with fusion and fission mechanisms, in order to give elderly and impaired users several input and output modalities in their interaction with TV and set-top box based technology. The possible future role of GUIDE in the development of accessible applications is also discussed.

Keywords: Multimodal adaptation, context adaptation, user modeling, GUIDE.
1 Introduction
The old days are gone: TV interaction is now more complicated, with complex remote controls and too many buttons, even for regular actions like changing channels, recording a TV show, or simply turning off the TV. Moreover, different users interact in different ways, as each one has a unique set of capabilities and preferences. Disabled and elderly users, however, have a much reduced set of capacities compared with typical users, being limited to the interactions they understand or can perform. Motor, physical or mental impairments make simple interactions more complicated for these users, so the platform they use to interact with a TV must be prepared to understand their unique range of interaction modalities, and to leverage their preferences in order to give them the most comfortable and effective interaction experience. This paper presents the GUIDE [1] project, a new TV and set-top box based system which employs adaptation techniques in order to provide accessibility and well-suited interaction to users with different impairments and skills, and which is capable of reflecting their preferences and the different contexts of use.
2 The GUIDE Project
GUIDE (Gentle User Interfaces for Elderly People) is an EU funded project focused on accessibility that offers elderly and impaired users a TV and set-top box platform, giving them a more intuitive way to interact with this type of technology, making use of gestures, voice commands, or simply pointing at the screen, in order to
perform various kinds of actions without ever having to touch, or understand the complexity of, traditional remote controls. With GUIDE, interactions can be seen as intuitive combinations of actions, mixing all the types of input modality a user can perform and likes to perform. GUIDE interaction is based on simple and intuitive actions that require minimal learning from the users (the only way the system will be adopted by them), and on a development process with an extensive user requirements phase, focused on fully understanding the different impairments and what each type of disabled user needs in order to interact easily with the system. To accomplish these objectives, any user interacting with GUIDE can use a range of devices: TV, remote control, speech synthesis and recognition, Microsoft Kinect, Nintendo Wii remote and Tablet PC. In addition, and to raise the user's empathy, avatar personas (face, body and voice) are used to personify the system. With all of these input and output possibilities, GUIDE interfaces can communicate with the user through visual, audio and haptic feedback. Regarding scenarios of use, GUIDE can be used to perform regular TV operations, as well as video-conferencing, home automation, media access, social interaction and tele-learning. As already mentioned, GUIDE also has adaptation mechanisms capable of "translating" an interface into several types of presentation, each targeting a different type of user the system aims at (for example, for a visually impaired user, a menu interface could have its content presented using only speech and sound, or using bigger buttons and larger fonts, or a combination of both). Additionally, thanks to these mechanisms, and following another of its objectives, GUIDE will offer developers a toolbox for the accessible development of applications, which performs a kind of automatic "translation" of "generic" interfaces into interfaces tailored for specifically impaired users.
3 GUIDE User Model: First Step to Adaptation
In general, humans differ in background, sex, education, personality, cognitive skills, preferences, motivation, goals and mood [2]. In addition, we know from the literature [3, 4] that elderly and handicapped users differ from so-called "regular" users in aims, interests, experience and abilities (for example, elderly users prefer historic and cultural information and content; for vision-impaired users the screen display should be enlarged or the presentation modality changed to tactile or audio output; etc.). Considering these many differences between users, and the necessity of representing them in the system so that GUIDE can adapt its interfaces to the complexity of each user, the first step of adaptation is having a User Model ("a representation of the knowledge and preferences of users that the system believes the user possesses" [6]) capable of clustering groups of users with different characteristics. To make this possible, the GUIDE User Model [8] is based on the Model Human Processor [5] for reasons of simplicity, and is characterized by having: 1) a user profile section which stores application-relevant characteristics about the user:
• Physical and cognitive abilities
• Domain knowledge
• Environmental data (software, hardware, user location, etc.)
• Competence in handling the GUIDE system
2) a section responsible for increasing overall simulation accuracy by incorporating the following detailed action models:
• Perception Model: simulates the phenomenon of visual perception and is capable of simulating the effects of different visual impairments.
• Cognitive Model: simulates performance from novice to expert.
• Motor Behavior Model: developed from the analysis of cursor traces from motor-impaired users; it can be used to predict paths and pointing times.
3) an inference machine section called the simulator [7], which uses the user information, takes a description of an interaction with the interface (Interface Model) as well as a task definition (Application Model), and predicts possible interaction and task completion times for each type of user in the given interface.
So, the GUIDE User Model is based on the empirical data described above, collected from a wide range of users with different abilities (through a process of questionnaires, focus groups and user trials with multimodal prototypes). The User Model clusters these users into different user profiles, and the simulator predicts interaction patterns for each of those profiles, making static and dynamic adaptation possible (described in the next section of this paper). Additionally, since this User Model does not store data about user preferences, GUIDE also has a complementary Preference Model which stores this type of information and makes personalization of interfaces possible within every cluster of users.
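As an illustration of what a clustered profile might look like in code, the following is a minimal, hypothetical sketch; the field names, ability scale and the font-size heuristic are ours and do not come from the GUIDE specification:

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical representation of a GUIDE-style user profile cluster. */
public class UserProfile {

    /** Abilities of the kind tracked by the profile section of the user model. */
    public enum Ability { VISION, HEARING, MOTOR, COGNITION }

    private final String clusterId;                    // e.g. "mild-vision-impairment"
    private final Map<Ability, Double> abilityScores = // 0.0 (severe) .. 1.0 (unimpaired)
        new EnumMap<>(Ability.class);

    public UserProfile(String clusterId) { this.clusterId = clusterId; }

    public void setAbility(Ability a, double score) { abilityScores.put(a, score); }

    public double ability(Ability a) { return abilityScores.getOrDefault(a, 1.0); }

    /** A crude adaptation decision of the kind the simulator could inform:
     *  enlarge fonts when visual ability drops below an assumed threshold. */
    public int preferredFontSize() {
        return ability(Ability.VISION) < 0.5 ? 28 : 18;
    }

    @Override
    public String toString() { return clusterId + " " + abilityScores; }
}
```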
4 GUIDE Architecture: Adaptive/Adaptable System Behavior
4.1 GUIDE Architecture
The User Model is a crucial component of any adaptive multimodal system, but it is only a small part of it (by itself, it cannot do anything). GUIDE, as a system of this type, has a complex architecture based on recognizers and sensors to perceive the user and the environment, as well as an integration module capable of receiving all the information from the input modalities, making sense of it, and selecting an adapted response using the output modalities, always considering the user characteristics and preferences as well as the different interaction contexts. Figure 1 shows how input modalities are first perceived through recognizers, which output their results to the fusion engine [19], in charge of making a common interpretation of the inputs and communicating it to the dialog manager [20]. Here, the dialog state is identified, as well as the transition to perform, the action to
communicate to a given application, and the message to return to the user through the fission component. Then, depending on the user profile and context of use, the fission engine [21] is responsible for sending a message back to the user through the most adequate modality or combination of output modalities. Finally, the context manager communicates any environment, content or user profile changes to the fission, fusion and dialog mechanisms, so they can adapt their interpretations and perform context adaptation.

Fig. 1. The architecture of an adaptive multimodal system [18]

4.2 Machine-User Adaptation
Benyon said "Humans are adaptive systems. Computer systems can be made adaptive" [10]. When people have characteristics which they cannot easily change, or would prefer not to change, GUIDE automatically changes the way it appears in order to better suit those characteristics. It adapts the user interface (font size, background colors, number and size of buttons, etc.) as well as the content shown (language, news, etc.) to every single interacting user, whatever the context of interaction. So, we can say that GUIDE is a self-modifying adaptive system, because it is capable of automatically changing its components to reflect the ad-hoc needs of every user [2, 22]. Adaptation in GUIDE is a process, based on user models (profiles) and sets of rules, that needs data in order to produce meaningful output. This data is obtained initially by an initialization application (initial adaptation) and, after that, inferred from the user's extended interaction with the system (run-time adaptation).
Initial Adaptation. GUIDE adaptation begins with an "initialization application" that allows the acquisition of primary assumptions about the user and is therefore
a valuable source for initially assigning the user to a certain user model (or user profile). Only by doing this is it possible to offer each user the most appropriate interface and to pre-define a (multimodal) way of interaction. Since each user model contains assumptions about the relevant characteristics of a user subgroup, after "going through" the initialization application a user is assigned to a user model once certain preconditions are met (adaptability [23]). From that moment on, every time the user interacts with a GUIDE application, the system is "initially" adapted to him. The GUIDE initialization application is presented to the user as a simple step-by-step configuration of a "general" interface. In each step, different types of content and different contexts of interaction are presented, so the user can test different components and parameters while the system learns the user's characteristics, from his impairments to his preferences.
Run-time Adaptation. After the initial adaptation, any user can start using the system, as it is already calibrated to his capabilities. However, each interface instance continues to be enhanced at run-time (adaptivity [23]), in order to provide continuous, high-quality adaptation as the user evolves and gets more experienced with the system. At run-time, the mechanisms of adaptivity need to infer likely events, based on their representations of users, contexts and applications, and on the experience of previous interactions. With the ability to alter the interface, functions are programmed into the system in a rule-based manner and are selected according to specific, well-defined criteria, refining the interaction with the system [2, 3]. For better understanding, we give some examples of GUIDE adaptation rules:

IF user_orientation = Low AND input_mod = Speech THEN volume = High
…
IF input_mod = Pointing AND menu_opt_number = 20 AND selection_time > 500
THEN menu_opt_number = 10 AND menu_opt_size = 20
…
IF user_experience = Veteran THEN application_help_explanation = Off

In the first example, if the user is interacting with speech and GUIDE perceives he is disoriented, the volume is raised. In the second, if the user is taking too much time to make a selection, the system halves the number of options shown in a menu and doubles their size. In the third, if the user is experienced with the system, the help explanation is turned off (because he already knows how to interact).
User-Machine Adaptation. GUIDE is not only adaptive, in that it automatically adapts interactions and interfaces to the user, but also adaptable, in that it allows manual configuration of system aspects by the user [22]. This means the user also adapts the system, and not only the opposite, particularly by adjusting interface or interaction components to reflect his preferences at any time in any interaction. For example, if the user is reading information on the screen and decides that he wants a bigger font size, he just says "Bigger text" and the interface automatically updates to fit that preference. In this way, the system is always "listening" to the user, who, by using any type of input modality, has the capacity of
changing any part of the system (this also ensures that the user is in fact in control of the system, and not the opposite). All these preferences are saved in a Preference Model that adds efficiency and personalization to the user model to which the user was allocated when he began using the system.
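To illustrate how rules of the kind listed above might be represented and evaluated, the following is a minimal sketch in Java; the state keys mirror the example rules, but the data structures are hypothetical and not GUIDE's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical sketch of rule-based run-time adaptation. */
public class AdaptationRules {

    /** A rule fires its action when its condition holds for the session state. */
    static class Rule {
        final Predicate<Map<String, Object>> condition;
        final Consumer<Map<String, Object>> action;
        Rule(Predicate<Map<String, Object>> c, Consumer<Map<String, Object>> a) {
            condition = c; action = a;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    public AdaptationRules() {
        // IF user_orientation = Low AND input_mod = Speech THEN volume = High
        rules.add(new Rule(
            s -> "Low".equals(s.get("user_orientation"))
                    && "Speech".equals(s.get("input_mod")),
            s -> s.put("volume", "High")));
        // IF input_mod = Pointing AND selection_time > 500
        // THEN menu_opt_number = 10 AND menu_opt_size = 20
        rules.add(new Rule(
            s -> "Pointing".equals(s.get("input_mod"))
                    && ((Integer) s.getOrDefault("selection_time", 0)) > 500,
            s -> { s.put("menu_opt_number", 10); s.put("menu_opt_size", 20); }));
    }

    /** Evaluate every rule against the current state, applying those that match. */
    public void apply(Map<String, Object> state) {
        for (Rule r : rules) {
            if (r.condition.test(state)) r.action.accept(state);
        }
    }
}
```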
5 Interaction Context Adaptation
Context-awareness allows user interfaces to be adapted to new situations in an environment and interaction that evolve continuously. Considering applications for disabled or impaired users, and the necessity of adapting the graphical user interface to almost every single user, the coupling of information coming from context awareness and multimodality is indispensable [10, 16]. In GUIDE, adaptation to several contexts is also important, because many users require applications that can run in different environments or with different devices. In context adaptation, every situation is defined by a set of entities defining the system (applications, tasks and computational objects), the environment (place, physical objects and atmospheric conditions) and the users; by the relations these entities have with each other (e.g., every task "has a" set of computational objects capable of describing it, and every place "has a" certain environment and a set of physical objects); and by a set of roles assigned to the entities, which typically describes every scenario of use of the application. A small change in the number of entities, in the assignment of a role, or even in the type of relation between two entities forces a new situation and a new context, calling for context adaptation so the system can continue to perform well [9, 10, 11]. This means that, for context-awareness and adaptation to be possible in GUIDE, we have to consider the physical and social contexts of the environment where every action is performed: it is important to know how the user understands the physical place he is in, and also how he relates to the social aspects of the activity he is involved in. There are also several aspects concerning the context of the application being used, related to how every user should perform each action to achieve certain goals given a set of technical characteristics. Of course, all of this only makes sense if we also take into account the most important contexts in the use of GUIDE: the physical, sensorial and cognitive capacities of each user, which define the modalities one can use to interact with the system. Next, we present a list of context dimensions and variables that can interfere during or between interactions in GUIDE, as in any typical adaptive system [14, 15, 17]:
• Environment:
─ Physical: heating, lighting, noise level, acceleration and pressure (atmospheric conditions), user location, user position, objects in the way (placed on the user or between the user and a sensor)
─ Social: background culture, user culture, people other than the user, time of day
• Application:
─ Software: application, task, modality of interaction, provision of help and training, time for interaction, communication
─ Technical: device of interaction, bandwidth (latency), device malfunction, resources available
• User:
─ Physical: motor impairments, tremors, reflexes
─ Sensorial: blindness, vision impairments, deafness, hearing impairments
─ Cognitive: reflexes, goals, memory, interest/boredom levels, affective state, focus of attention, interests
After defining all these dimensions, we present in Table 1 some use cases of context awareness and adaptation in GUIDE, showing how the system can adapt to every change in order to provide elderly and disabled users with an intuitive interaction at all times. Adaptation to contexts in GUIDE is also based on a set of rules. The activation of these rules is caused by a specific change in the environment, application or user contexts, which results in system (interface) adaptations as a means to provide the user with adequate interaction. So, for every single adaptation described in the table, there will exist in the system an adaptation rule (or a set of them) responsible for the changes in the GUIDE interface and interaction.
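As an illustration of how one such rule might be wired into code, consider the lighting use case from Table 1 below; all names and thresholds in this sketch are hypothetical, not taken from the GUIDE implementation:

```java
/** Hypothetical sketch of a context-triggered adaptation, modeled on the
 *  lighting use case in Table 1. */
public class LightingContextAdapter {

    public enum Theme { LIGHT_ON_DARK, DARK_ON_LIGHT }

    /** Callback the context manager would invoke on an environment change. */
    public interface ContextListener { void onLightingChanged(double lux); }

    /** Interface-side handler: weak ambient light switches to a darker
     *  background with lighter text and buttons, and vice versa. */
    public static class ThemeAdapter implements ContextListener {
        private Theme current = Theme.DARK_ON_LIGHT;

        @Override
        public void onLightingChanged(double lux) {
            // Assumed threshold: below ~50 lux counts as weak lighting.
            Theme wanted = lux < 50.0 ? Theme.LIGHT_ON_DARK : Theme.DARK_ON_LIGHT;
            if (wanted != current) {
                current = wanted;
                System.out.println("Re-render UI with theme: " + wanted);
            }
        }
    }

    public static void main(String[] args) {
        ThemeAdapter adapter = new ThemeAdapter();
        adapter.onLightingChanged(200.0); // daylight: keep dark-on-light
        adapter.onLightingChanged(20.0);  // light turned off: switch theme
    }
}
```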
Table 1. Use cases of context awareness and adaptation in GUIDE

Context of interaction: The user takes off his glasses during interaction.
GUIDE solution: Using the image from the motion sensors, GUIDE recognizes this situation and increases the size of the content rendered on the screen (bigger buttons, fonts, etc.).

Context of interaction: During user interaction, a dog, cat or another person gets in the way, between the TV and the couch.
GUIDE solution: Using the motion sensors, GUIDE captures a sudden change in the image, stops the interaction (locking the screen), and sends a confirmation message to the user (using the output modality the user prefers or that is most appropriate). When the user is ready, he confirms to GUIDE (by selecting a button or issuing a voice command) and the system continues the interaction.

Context of interaction: During user interaction, a dog, cat or another person is captured by the motion sensor, side by side with the user but closer to the screen.
GUIDE solution: GUIDE captures the additional, sudden presence but ignores it, still focusing on the first presence. The system sends a notification message asking if there is a change of user; if nobody responds within a configurable number of seconds, it ignores the second presence.

Context of interaction: During user interaction, a dog barks or a person other than the user talks loudly.
GUIDE solution: Using voice recognition, the system (a) does not identify the dog's barks as GUIDE commands and ignores them, or (b) recognizes the other person's tone of voice as not being the original user's and ignores all the commands issued.

Context of interaction: Lighting conditions change, caused by the time of day or by turning a light on or off.
GUIDE solution: The GUIDE interface adapts to the lighting conditions by changing the background, text and button colors: in weak lighting, the background becomes darker and fonts and buttons lighter, and vice versa.

Context of interaction: The noise level increases in the living room or the room where the user is interacting with the system.
GUIDE solution: The system perceives the noise conditions through its microphones and (a) suggests that the user switch from the speech modality to another one if the noise is too loud (typically the second preferred modality of interaction), or (b) asks the user for confirmation after each speech command, or suggests combining speech with at least one other modality (e.g., speech and pointing) for redundancy in each selection.

Context of interaction: The user is choosing a channel to watch on the TV and receives a call from his son.
GUIDE solution: A notification asks the user whether he wants to receive the call. If the user responds affirmatively, GUIDE changes to the video-conference application and to the respective preferred modality of interaction.

Context of interaction: The user changes position or location while interacting with GUIDE (e.g., using pointing).
GUIDE solution: The system checks whether there is still an image of a person captured by the motion sensor. If it can see the user, it continues the same interaction, automatically recalibrating the motion sensor mechanism. If the user has "disappeared", it sends a notification to the user and changes to the next preferred modality of interaction.

Context of interaction: The device in use runs out of battery.
GUIDE solution: The system sends a notification to the user and changes to the next preferred modality of interaction.

6 The Role in Development of Accessible Applications
The development of accessible applications is one of the most important issues in today's technology, and it is currently one of the main concerns of developers when designing new applications and interfaces. Despite this, if we asked any developer whether he would be able to make a different interface for each application he designs, so that every type of user could use it, his response would be a resounding "no". Considering the adaptation mechanisms present in GUIDE, and all the effort put into developing a system focused on accessibility and capable of addressing all the differences between several types of users in one single platform, it makes sense to extend its use to external developers. GUIDE can, in this way, be presented as a toolbox for accessibility and adaptation: a platform capable of almost automatically "translating" standard applications into several others suitable for other clusters of users. For this to be possible, developers would be "obligated" to use a specific technology such as HTML (with JavaScript and CSS) to make a "generic" interface for the application, and the GUIDE platform, making use of an impairment simulator (to infer which content could be used by which cluster of users, and which could not) and a set of translation rules, would automatically generate a set of different versions of the original interface. This way, every time a different user interacts with the application, a "tuned" and "adapted" interface is presented to him, making the application accessible to everyone.
7 Conclusion and Future Work
Multimodality is the central question in all GUIDE interaction. In this paper we have shown how multimodal adaptation techniques can be applied to the GUIDE interfaces to bring more consistency to all types of actions performed by any end user. First, the system can automatically adapt to the user by learning, in each interaction, which modality he feels more comfortable with and with which he can deliver more actions with less effort. Second, the system can adapt to the user by inferring his intentions from the extended use of the various modalities and the way he interacts, as well as from the interpretation of his User Model and Preference Model. Third, the system can recognize different users every time they make use of the TV. Fourth, at any point in time, in any interaction, the user can modify every aspect of the interface to suit his preferences or needs, retaining control of the system. Fifth, the system automatically adapts its interfaces to the context of every interaction, so that every time a change occurs the user can continue using the system in an effective manner. In the future, to validate the statements, assumptions and concepts described above, we will continue implementing the GUIDE platform and related applications, with special focus on easing accessible application development through the GUIDE toolbox, and on testing the system with users from several European countries.
References
1. European Project GUIDE – Gentle User Interfaces for Elderly People, http://www.guide-project.eu/
2. Benyon, D.: Accommodating Individual Differences through an Adaptive User Interface. In: Adaptive User Interfaces: Results and Prospects. Elsevier, North-Holland (1993)
3. Fink, J., Kobsa, A., Nill, A.: Adaptable and Adaptive Information Access for All Users, Including the Disabled and the Elderly. In: Proc. HYPERTEXT 1998, Pittsburgh (1998)
4. Kennel, A., Perrochon, L., Darvishi, A.: WAB: World-Wide Web access for blind and visually impaired computer users. New technologies in the education of the visually handicapped. ACM SIGCAPH Bulletin 55, 10–15 (1996)
5. Card, S., Moran, T., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
6. Benyon, D., Murray, D.: Applying User Modeling to Human Computer Interaction Design. Artificial Intelligence Review 7(3), 199–225 (1993)
7. Biswas, P., Robinson, P.: Evaluating the design of inclusive interfaces by simulation. In: Proc. IUI 2010, pp. 277–280. ACM Press, New York (2010)
8. Biswas, P., Langdon, P.: Towards an inclusive world – a simulation tool to design interactive electronic systems for elderly and disabled users. In: Proc. SRII (2011)
9. Shackel, B.: Human Factors and Usability. In: Preece, J., Keller, L. (eds.) Human-Computer Interaction. Prentice Hall, Hemel Hempstead (1990)
10. Benyon, D.: Adaptive Systems: A Solution to Usability Problems. User Modeling and User-Adapted Interaction 3(1), 65–87 (1993)
11. Brusilovsky, P.: Methods and techniques of adaptive hypermedia. User Modeling and User-Adapted Interaction 6(2-3), 87–129 (1996)
12. Anisetti, M., Bellandi, V., Ceravolo, P., Damiani, E.: Multimodality in Pervasive Environment: Semantic Changes in Sensor Networks. In: Dagstuhl Seminar Proceedings (2010)
13. Coutaz, J., Crowley, J.L., Dobson, S., Garlan, D.: Context is Key. Communications of the ACM 48(3) (March 2005)
14. Feki, M.A., Renouard, S., Abdulrazak, B., Chollet, G., Mokhtari, M.: Coupling context awareness and multimodality in smart homes concept. In: Miesenberger, K., Klaus, J., Zagler, W.L., Burger, D. (eds.) ICCHP 2004. LNCS, vol. 3118, pp. 906–913. Springer, Heidelberg (2004)
15. Schmidt, A., Beigl, M., Gellersen, H.W.: There is more to Context than Location. In: Proc. IMC 1998, Rostock, Germany (November 1998)
16. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. In: Lalanne, D., Kohlas, J. (eds.) Human Machine Interaction. LNCS, vol. 5440, pp. 3–26. Springer, Heidelberg (2009)
17. Costa, D., Duarte, C.: Adapting Multimodal Fission to User's Abilities. In: Stephanidis, C. (ed.) 14th International Conference on Human-Computer Interaction (HCII). LNCS, vol. 6765, pp. 357–366. Springer, Heidelberg (2011)
18. Costa, D., Duarte, C.: Self-adapting TV based applications. In: Stephanidis, C. (ed.) 14th International Conference on Human-Computer Interaction (HCII). LNCS, vol. 6765, pp. 367–374. Springer, Heidelberg (2011)
19. Feiteira, P., Duarte, C.: Adaptive Multimodal Fusion. In: Stephanidis, C. (ed.) 14th International Conference on Human-Computer Interaction (HCII). LNCS, vol. 6765, pp. 384–391. Springer, Heidelberg (2011)
20. Tran, T., Cimiano, P., Ankolekar, A.: A Rule-based Adaption Model for Ontology-based Personalization. In: Advances in Semantic Media Adaptation and Personalization. SCI, vol. 93 (2008)
21. Stephanidis, C., Paramythis, A., Sfyrakis, M., Stergiou, A., Maou, N., Leventis, A., Paparoulis, G., Karagiannidis, C.: Adaptable and adaptive user interfaces for disabled users in the AVANTI project. In: Campolargo, M., Mullery, A. (eds.) IS&N 1998. LNCS, vol. 1430, pp. 153–166. Springer, Heidelberg (1998)
Adapting Multimodal Fission to User’s Abilities David Costa and Carlos Duarte LaSIGE and Informatics Department, Faculty of Sciences of the University of Lisbon, Campo Grande, 1749-016 Lisboa, Portugal
[email protected],
[email protected]

Abstract. New ways of communication are now possible thanks to adaptive multimodal systems, enabling improvements in the accessibility of ICT applications for all users. We are developing a project which combines TV with a multimodal system in order to overcome the accessibility and usability problems experienced by impaired users. This paper focuses on the fission of outputs, and on how the presentations of applications running in GUIDE's environment are adapted to the user's capabilities.
1 Introduction
In our everyday lives, we use multiple ways to communicate with each other, using speech, gestures, expressions and vision. The modalities we use in a natural conversation are not the same as the ones we use during human-computer interaction, mainly because application developers force users to adapt to the computer's way of functioning, and not the other way around. When interacting with a computer we normally use a keyboard for typing and a pointing device for pointing or clicking; graphical user interfaces (GUI) are therefore not focused on the use of multiple modalities. Over the last two decades a new form of interface has arisen, called the "multimodal user interface", prepared to recognize human language and behaviors. These interfaces integrate recognition technologies such as speech and gestures, relegating the keyboard and mouse to a secondary role in input interaction. Of course, this brings some issues, as we no longer have the simplicity of GUIs that expect input in atomic form with an unequivocal order of events [1]. With multimodal interfaces, users are offered a set of more natural interaction options, because they provide alternative modes of input and output beyond the usual ones in human-computer interaction. Resorting to modalities like gesture and/or voice, it is possible to have an interaction closer to what users are used to in normal human-to-human interaction. Even though this is interesting for any age group, it turns out to be most compelling for elderly people, due to the disabilities or limitations inherent to advanced aging and their lack of experience with graphical user interfaces. The adaptability provided by this type of system does require some configuration and selection operations that are often not carried out by the users, due to their lack of technical knowledge or familiarity with the system. Adaptation can be very difficult and tedious for the user unless an adaptive interface solution is adopted.
Our work is focused on finding a mechanism of adaptation able to improve and refine the performance of multimodal fission across different outputs. This mechanism is responsible for deciding on the best presentation strategy: first, to bring out the content using the best available modalities suited to the user's profile and the content features; second, to distribute that content through the selected modalities (using strategies of redundancy and/or complementarity); and finally, to adjust that content for each chosen modality [2]. Multimodality helps address social and e-exclusion, as it offers the possibility of presenting the same information in different ways (sound, visual, haptic), compensating for some sensorial impairments. Presenting information using different modalities is not new, but in most cases they are used to distribute different content in different modes [3].

1.1 GUIDE: "Gentle User Interfaces for Elderly People"
This work is being developed in the scope of the European project GUIDE ("Gentle user interfaces for elderly people"), which has the goal of developing a framework for developers to efficiently integrate accessibility features into their applications. GUIDE puts a dedicated focus on the emerging hybrid TV platforms and services (connected TVs, set-top boxes, etc.), including application platforms such as HbbTV as well as proprietary middleware solutions of TV manufacturers. These platforms have the potential to become the main media terminals in users' homes, due to their convenience and wide acceptance. Especially for elderly users, applications such as home automation, audio-visual communication or continuing education can help simplify daily life, keep them connected to their social networks and enhance their understanding of the world. Ageing and accessibility are two subjects that are highly correlated in several contexts, such as interacting with electronic devices like computers, nomadic devices or set-top boxes. Approximately 50% of the elderly suffer from some kind of (typically mild) disability, such as visual, auditory or cognitive impairments, which poses several problems and challenges to social interaction. For such end users, accessible ICT can make much more of a difference in quality of life than for other citizens: it enables or simplifies participation and inclusion in their surrounding private and professional communities. When adapted in the right way, recent advances in human-computer interfaces, such as visual gestures, multi-touch, speech, or haptics, could help disabled or elderly users interact with ICT applications in a more intuitive and supportive manner. Despite these positive trends, the implementation of accessible interfaces is still expensive and risky for developers of ICT applications. Among other things, they have to cope with user-specific needs and limitations (including lack of ICT proficiency) as well as with the technological challenges of innovative UI approaches, which require special experience and effort. Therefore, today many ICT application implementations simply neglect special needs and lock out a large portion of their potential users [4].
2 Multimodal Systems
Multimodal systems are defined in [2] as "computer systems endowed with multimodal capabilities for human-computer interaction and able to interpret information from various sensory and communication channels." These systems offer users a set of modalities for interacting with machines and "are expected to be easier to learn and use, and are preferred by users for many applications" [5]. As opposed to unimodal systems, where the interaction had to be adapted to a given application, multimodal user interfaces (MUI) are flexible and offer users the capability to switch between different modes for expressing different types of information. The advantages are obvious: users with different skills, ages, native languages and physical or cognitive impairments are able to interact more effectively with computer systems that can adapt to different situations and to a context in constant evolution [1]. Systems that combine outputs have evolved since the early nineties, when text and graphics were combined (e.g., COMET [6]). More recent systems combine speech, haptics, graphics, text, 2D/3D animations or avatars (e.g., SmartKom [7]). Although most applications use few output modalities, and consequently straightforward fission techniques, the above-mentioned combinations of outputs can make presentations more complex and harder to coordinate and keep coherent. Oviatt and Cohen [8] describe how the combination of multiple modalities on the input and output side of a multimodal system makes it more robust, reducing errors in the communication. There are not only advantages in using multimodal interfaces: adding several modes and mixing them increases the complexity of the application, as each modality has its own interface and distinct human behavior. To know how to make them work together, their properties and the amount of information transmitted in each modality must be learnt. As [2] describes, the generic components for handling multimodal integration (the "integration committee") are a fusion engine (combining modalities), a fission module (dividing information among the active outputs), a dialogue manager and a context manager.
Fig. 1. The architecture of a multimodal system and its generic components [2]
Figure 1 illustrates the components and the processing flow between them. The various input modalities are first perceived by their respective recognizers, each of which sends its processing results to the fusion module. This module is responsible for interpreting the information given by the recognizers and making a decision through fusion mechanisms. Once the fusion result is computed, it is communicated to the dialogue manager, which identifies the dialogue state and the action to perform and communicates with an application and/or the fission module. The fission module returns the information to the user using the most adequate available modality or combination of modalities according to the user profile and context. These are the responsibility of the context manager, which must be aware of the user profile and of environmental context changes.

2.1 Adaptive Multimodal Fission

A multimodal system should be able to flexibly generate various presentations for the same information content, in order to meet the individual user's requirements, the environmental context, the type of task and hardware limitations. Adapting the system to combine all these ever-changing elements is a delicate task (e.g. SmartKom [7]). The fission module and fusion engine are crucial to making multimodal applications usable by all users, as they take advantage of multiple modalities to overcome sensory impairments that users may have. Within the scope of GUIDE's vision and its target user group, elderly people, this is the number one issue to be addressed. In other words, this module is responsible for choosing the output to be presented to the user and for how that output is channeled and coordinated across the different available output channels (based on the user's perceptual abilities and preferences). To do this according to the context and user profiles, the fission engine performs three tasks, described further below: message construction, modality selection and output coordination.

Based on the What-Which-How-Then (WWHT) conceptual model of Cyril Rousseau et al. [9], created to support adaptive and context-aware presentations, we describe the fission module along the following three subsections. The WWHT model's authors define three main components as the means of communication (physical and logical) between human and machine: Mode, Modality and Medium. It is also important to note that there are primary and secondary relations between components. A primary relation is, for example, in haptic systems, the "vibration" created by the system and the user's tactile mode; but as a side effect the vibration can also be heard, creating a secondary relation between the audio mode and the vibration modality. The WWHT model is based on four basic concepts, detailed in the following subsections: What information to present; Which modality(ies) to choose to present that information; How to present that information using those modality(ies); and Then: how to make the presentation evolve.
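To make the pipeline concrete before detailing each task, the following minimal Python sketch chains the three fission stages. It is our own illustration, not GUIDE code, and every function, field and modality name in it is hypothetical.

# Illustrative sketch of a three-stage fission pipeline (all names hypothetical).

def construct_message(semantic_info):
    """Semantic fission (What): decompose the dialogue manager's output
    into elementary information items."""
    return [{"content": part} for part in semantic_info["parts"]]

def select_modalities(elements, user_profile, context):
    """Election (Which): attach one or more output modalities to each item."""
    for element in elements:
        if user_profile.get("visual_impairment") and not context.get("noisy"):
            element["modalities"] = ["speech"]              # avoid relying on vision
        else:
            element["modalities"] = ["graphics", "speech"]  # redundant output
    return elements

def coordinate_output(elements):
    """Instantiation (How): fix attributes and a naive temporal order."""
    return [(i, e["modalities"], e["content"]) for i, e in enumerate(elements)]

message = {"parts": ["greeting", "list of TV channels"]}
profile = {"visual_impairment": True}
print(coordinate_output(select_modalities(construct_message(message), profile, {})))

A real fission module would of course carry richer structures between the stages; the point of the sketch is only the division of responsibilities.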
Message Construction. The presentation content must be selected and structured, i.e., it is necessary to decompose the semantic information issued from the dialogue manager into elementary data to be presented to the user. As shown in Figure 2, the information is divided into n basic elements. The authors call this phase "semantic fission" (What). There are two main approaches to content selection and structuring: schema-based and plan-based [10]. In some systems, however, selecting and structuring the content is done before the fission process begins; an example is MAGPIE [11].
Fig. 2. WWHT conceptual model for context aware and adaptive interaction [9]
In [10] the author describes the schema-based approach as one that "encodes a standard pattern of discourse by means of rhetorical predicates that reflect the function each utterance plays in text. By associating each rhetorical predicate with an access function for an underlying knowledge base, these schemas can be used to guide both the selection of content and its organization into a coherent text to achieve a given communicative goal". Schema-based systems like COMET [6] determine, from a set of existing schemas, the one best suited to the user's intentions, assigning a weight to each intention using heuristics. This step yields a list of schemas ordered by efficiency. The next stage is the generation of elementary information; if a listed schema fails to generate the data, the next most efficient schema is tested. This is where the approach falls short of plan-based approaches, because it is impossible to extend or modify part of a schema.

The plan-based approach uses a goal-driven, top-down hierarchical planning mechanism which receives communicative goals and a set of generation parameters, such as the user's profile, the presentation objective, resource limitations and so on. Systems that use this approach select parts of a knowledge base and transform them into a presentation structure. The structure's root node is the communicative goal, i.e., a somewhat complex presentation goal (e.g. describing a process like TV programming), and its leaves are elementary information to present (e.g. text, graphics, animation, etc.) [12]. WIP [13] is a presentation generation system that follows this approach: it receives a presentation goal as input and then tries to find a presentation strategy which matches the given goal.
A refinement-style plan is then generated in the form of a directed acyclic graph (DAG) whose leaves are specifications for elementary acts of presentation, which are then sent to the respective module capable of handling them. The advantage of this approach over the schema-based one is that it can represent the effects of each section of the presentation. Also, mode information can be easily incorporated and propagated during the content selection process. Plan-based approaches facilitate the coordination between mode information and content selection, as mode selection can run simultaneously with content selection and not only after it.

Modality Selection. After message construction, the presentation must be allocated, i.e., each elementary datum is allocated to a multimodal presentation adapted to the interaction context, as presented in Figure 2 (the "election" phase, or Which). This selection process follows a behavioral model that specifies the components (modes, modalities and media) to be used. The available modalities should be structured according to the type of information they can handle or the perceptual task they permit, the characteristics of the information to present, the user's profile (abilities, skills, impairments and so on) and the resource limitations. Taking all of this into consideration is necessary for optimal modality selection. Several approaches exist for meeting this goal (see the sketch at the end of this section):

• Composition - The system tries to combine selected primitives or operators, using predefined composition operators. The first criterion of modality selection is how efficiently and accurately each modality is likely to be perceived by the user; depending on the modality(ies) chosen, this decides how the presentation is rendered. The second step is to choose and combine a complete set of modalities that follows the specified message structure [14, 15].

• Rules - A set of election rules allocates the components of the presentation among the modalities using simple instructions (if ... then ...). The premise of a contextual rule describes a state of the interaction context (e.g. noise level above 100 dB), and the rule's conclusion is weighted according to its premises. Another type are criterion-referenced rules, which allow a selection based on global criteria (language, age, abilities, etc.) [6, 9, 16]. Coutaz et al. [17] define rules which allow the allocation of presentations with multiple modality-medium pairs under redundancy and/or complementarity criteria. These rules follow four properties: Equivalence, Assignment, Redundancy and Complementarity (CARE). Equivalence expresses the availability of a choice between two or more modalities, but does not impose any temporal constraint on them (e.g. to show the same information message we can use text or a speech synthesizer). Assignment expresses the absence of choice: there is no other modality available, or the system is defined to use one and only one modality for that specific case. Redundancy and Complementarity concern the use of combined multiple modalities under temporal constraints. Redundancy expresses the equivalence between two or more modalities (same expressive power), used within the same temporal window (repetitive behavior) without increasing the expressive power; it includes sequential and parallel temporal relations. Parallelism places restrictions on the types of modalities that can be used simultaneously, since a single human mode cannot be activated in parallel.
Complementarity is used when one modality is not enough to reach the goal of the presentation, and therefore several modalities are combined to reach the intended goal. Examples of systems that use the CARE properties are MATIS [17] and Rousseau's platform [18].

• Agents - Competitive and cooperative agents plan the presentations. The MAGPIE [11] system implements a set of agents that communicate with each other in order to reach a presentation goal; this system enables the dynamic creation of the modality-specific agents needed to select and integrate the basic components of the data presentation.

Output Coordination. Once the presentation is allocated, it is instantiated, which consists in determining the lexical-syntactic content and the attributes of the modalities (How). First the concrete content of the presentation is chosen, and then the attributes are fixed, such as modality attributes, spatial and temporal parameters, etc. For a coherent and synchronized presentation, all the output channels used should be coordinated with each other. The consistency of the presentation must be verified, as structural incoherencies (some modalities are chosen to express multiple basic elements in one single presentation, which is not always possible) and instantiation incoherencies (problems in the defined modality attributes) may occur. Output coordination covers the following aspects:

• Physical layout - When using more than one visually presented modality, the individual components of the presentation must be laid out [6, 11].

• Temporal coordination - When using dynamic modalities like voice synthesizers, videos or sounds, these components must be coordinated in order to achieve the presentation's goal. Because the order and duration of actions differ, the dynamic modalities used need to be synchronized and kept coherent.

• Referring expressions - Some systems produce multimodal and cross-modal (interaction between two or more sensory modalities) referring expressions, i.e., references that span several modalities or that point to another part of the presentation, which requires some coordination work [6, 19, 20].

Coordination and consistency are also necessary across presentations, as the user's and the environmental context may change as the presentation output evolves. This matters for avoiding outdated content: if content becomes outdated, it must be invalidated and an updated version of the presentation obtained (Then).
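As an illustration of the rule-based election style and the CARE properties discussed above, the sketch below encodes a few contextual and criterion-referenced rules in plain Python. The rules, thresholds and profile fields are invented for illustration; they are not taken from any of the cited systems.

# Hypothetical election rules in the simple "if ... then ..." style described above.

def elect_modalities(available, user, env):
    """Return the modalities to use and the CARE relation between them."""
    rules = [
        # premise (context state) -> conclusion (modality set, CARE relation)
        (lambda: env["noise_db"] > 100,        (["graphics"], "assignment")),
        (lambda: user["hearing_impaired"] and user["visually_impaired"],
                                               (["graphics", "speech"], "redundancy")),
        (lambda: user["visually_impaired"],    (["speech"], "assignment")),
    ]
    for premise, conclusion in rules:
        modalities, relation = conclusion
        if premise() and all(m in available for m in modalities):
            return modalities, relation
    # default: equivalence - any available modality may carry the message
    return available[:1], "equivalence"

print(elect_modalities(["graphics", "speech"],
                       {"hearing_impaired": False, "visually_impaired": True},
                       {"noise_db": 40}))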
3 GUIDE's Output Fission Module

Given GUIDE's context environment and objectives, already described, the following output components are expected to be used in order to satisfy the project requirements: video rendering equipment (e.g. TV); audio rendering equipment (e.g. speakers); a tablet supporting a subset of video and audio rendering; and a remote control supporting a subset of audio rendering and vibration feedback (e.g. Wii remote).
Video rendering. The main medium used for video rendering is obviously the TV. This is where visual presentations occur, be they the channels themselves, the adaptive user interface or video streams. A tablet may also be used to clone the TV screen or to complement information displayed on it (e.g. context menus), but it is essentially a secondary display. The main user interface should be able to generate various configurable visual elements, such as text (e.g. subtitles, information data, etc.), buttons for navigation purposes, images/photos, video (e.g. video conference or media content) and an avatar. For the UI to be adaptable to the user's needs, these elements are necessarily highly configurable and scalable (vector-based); size, font, location and color are some of the attributes needed to maintain adaptability. These graphical elements enable the system to communicate with users by illustrating, answering, suggesting, advising, helping or supporting their navigation. The 3D avatar plays a major role in elderly acceptance and adoption of the GUIDE system: an avatar able to produce non-verbal expressions, like facial expressions and gestures, gives the system a more human-like way of communicating. Although a realistic human avatar would be preferable, a cartoon-like representation was chosen due to hardware limitations (set-top box).

Audio rendering. Audio feedback will be available from the TV, tablet or remote control through audio speakers. Audio outputs range from "simple" non-speech sounds, i.e., rhythmic sequences combining different timbre, intensity, pitch and rhythm parameters, to speech synthesizers that produce artificial human speech. Besides the obvious audio-visual output from TV channels or other media (video), the GUIDE UI will provide non-speech audio feedback for alarm, warning or status messages, and for input selection/navigation feedback. These audio signals can act as information redundant with the visual feedback in order to strengthen its semantics. Synchronized and desynchronized audio-visual presentations will be provided by text-to-speech interfaces. The avatar uses lip-synchronized text-to-speech to communicate with the user, but in case of hardware limitations or high processor workload, predefined recorded sound files will be played instead (TTS can be demanding on hardware). TV audio replicated by the tablet can act as an enhancer for users with hearing impairments, because the mobility of the device makes it possible to bring it closer to the user's ears or to use headphones.

Haptic feedback. Haptic output feedback uses the vibration features present on the remote control and/or tablet devices. This modality, perceived through the user's tactile sense, is used to add new or redundant information complementing other modalities, for example together with visual and audio output when an alert or warning message is triggered, or as input feedback. This mode is a sensory channel parallel to the visual and audio ones, which means using them together will not increase the cognitive load.

Fission Architecture. There is not much research on fission of output modalities, because most applications use few different output modalities, and therefore simple, direct output mechanisms are often used. Nevertheless, based on Section 2.1 and on the expected GUIDE applications, we discuss below the best implementation of the fission module to be integrated into the GUIDE framework.
As mentioned in Section 2.1, this module usually follows three main steps: message construction, modality selection and output coordination. The module communicates directly with the dialogue manager and not with the application itself.
Taking the example of GUIDE video conferencing: when the application starts and is registered with the GUIDE system, it sends the dialogue manager the information needed to communicate (we assume this is a set of possible states and commands). The dialogue manager is expected to send information about the content to be displayed (abstract information) and also user information, so that the content can be adapted to the user's capacities. Although message construction could be done in the dialogue module, as it already has the needed information, the decomposition of the semantic information can also be done in the fission module. Between the two possible presentation structures we would choose the plan-based one, due to its advantages over the schema-based one explained earlier. This structure represents an abstract user interface presentation and is modality independent. The communication language used between the dialogue and fission modules should be XML-based; the specific language is yet to be decided (EMMA, SMUIML, UsiXML and TERESA XML are going to be considered).

The modality used to present a specific piece of data is elected using different techniques. Due to the complexity and the hardware limitations imposed by the use of a set-top box, we leave the agent-based technique out for now and will test the efficiency of the composition and rule-based approaches, resulting in a modality-dependent structure. If the user has mild hearing, visual and cognitive impairments (common among elderly people), the selection phase will follow the right procedures to adapt the presentation to the user. In this case, sounds (e.g. text-to-speech descriptions of the available options) would be avoided, at least when used alone rather than as redundant information (e.g. TTS of a selected button's description). The visual presentation of elements such as text or buttons should be altered: settings like font size, width and height, background and foreground colors, contrast and even location are all susceptible to change. Cognitive impairments are a more complex matter, and more research is needed to know how to address them; common sense suggests a simple interface with minimal options, in which the important information required by the user is always available. A sketch of this kind of profile-driven adaptation is given below.

After the selection is done, the data need to be sent to their respective renderer modules and processed. This communication is also XML-based and must coordinate all the modalities in order to maintain the coherence of the presentation. SMIL, HTIMEL or HTML+TIME are some of the languages to be considered. It is important to note that in each state the application goes through, the different presentation screens follow all the fission phases, to ensure adaptation to environmental changes (the user leaves the screen, another user appears, loud ambient noise, etc.).
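The following sketch illustrates the kind of profile-driven attribute adaptation just described. The profile format, attribute names and values are assumptions made for the example, not GUIDE's actual data model.

# Sketch: adapting presentation attributes to a user profile (invented format).

DEFAULTS = {"font_size": 14, "contrast": "normal", "tts_standalone": True}

def adapt_presentation(profile):
    settings = dict(DEFAULTS)
    if profile.get("visual") == "mild":
        settings["font_size"] = 22          # enlarge text and buttons
        settings["contrast"] = "high"
    if profile.get("hearing") == "mild":
        settings["tts_standalone"] = False  # speech only as redundant output
    if profile.get("cognitive") == "mild":
        settings["max_options_per_screen"] = 4  # simpler screens, fewer choices
    return settings

print(adapt_presentation({"visual": "mild", "hearing": "mild", "cognitive": "mild"}))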
4 Conclusion

We showed the importance of adaptive output fission in enhancing the interaction between humans and computers, and how it can compensate for human impairments in order to give all users the possibility of interacting with any application. We surveyed several multimodal systems and their techniques for output presentation, and described their advantages and disadvantages. We also presented our project, GUIDE, and the goals we intend to achieve, and more specifically the fission module under development. We are working to choose or create the techniques offering the best possible performance given GUIDE's hardware restrictions.
References
1. Faraz T.: Multimodal Interfaces, http://www.cs.utoronto.ca/~faraz/projects/6326/paper.pdf
2. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. In: Lalanne, D., Kohlas, J. (eds.) Human Machine Interaction. LNCS, vol. 5440, pp. 3–26. Springer, Heidelberg (2009)
3. Human Factors (HF): Multimodal interaction, communication and navigation guidelines. In: ETSI 2008 Guides and recommendations to promote e-accessibility, http://www.etsi.org/WebSite/Technologies/HumanFactors.aspx
4. Gentle user interfaces for elderly people, http://www.guide-project.eu/
5. Oviatt, S.: Multimodal interfaces. In: Jacko, J., Sears, A. (eds.) The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications (2003)
6. Feiner, S.K., McKeown, K.R.: Automating the Generation of Coordinated Multimedia Explanations (1991)
7. Reithinger, N., et al.: SmartKom: Adaptive and Flexible Multimodal Access to Multiple Applications. In: 5th International Conference on Multimodal Interfaces (2003)
8. Oviatt, S.L., Cohen, P.R.: Multimodal Interfaces That Process What Comes Naturally. Communications of the ACM (2000)
9. Rousseau, C., Bellik, Y., Vernier, F.: WWHT: Un modèle conceptuel pour la présentation multimodale d'information. In: IHM 2005: Proceedings of the 17th Conférence Francophone sur l'Interaction Homme-Machine (2005)
10. Duarte, C.: Design and Evaluation of Adaptive Multimodal Interfaces, PhD Thesis, Faculty of Sciences, University of Lisbon (2007)
11. Han, Y., Zukerman, I.: A mechanism for multimodal presentation planning based on agent cooperation and negotiation. In: Human-Computer Interaction (1997)
12. Herzog, G., André, E., Baldes, S., Rist, T.: Combining alternatives in the multimedia presentation of decision support information for real-time control. In: Proceedings of the IFIP Working Group 13.2 Conference (1998)
13. Wahlster, W., André, E., Finkler, W., Profitlich, H.-J., Rist, T.: Plan-Based Integration of Natural Language and Graphics Generation. Artificial Intelligence Special Volume on Natural Language Processing (1993)
14. Casner, S.M.: Task-analytic approach to the automated design of graphic presentations. ACM Trans. Graph. (1991)
15. Fasciano, M., Lapalme, G.: Intentions in the coordinated generation of graphics and text from tabular data. Knowledge Information Systems (2000)
16. Bateman, J., Kleinz, J., Kamps, T., Reichenberger, K.: Towards constructive text, diagram, and layout generation for information presentation. Computational Linguistics (2001)
17. Coutaz, J., Nigay, L., Salber, D., Blandford, A., May, J., Young, R.M.: Four Easy Pieces for Assessing the Usability of Multimodal Interaction: the CARE Properties. In: INTERACT 1995 (1995)
18. Rousseau, C., Bellik, Y., Vernier, F.: Multimodal output specification / simulation platform. In: ICMI 2005 (2005)
19. André, E., Rist, T., Muller, J.: Employing AI methods to control the behavior of animated interface agents. Applied Artificial Intelligence (1998)
20. Johnson, W.L., Rickel, J.W.: Animated pedagogical agents: Face-to-face interaction in interactive learning environments. International Journal of Artificial Intelligence in Education (1998)
Self-adapting TV Based Applications Daniel Costa and Carlos Duarte LaSIGE and Informatics Department Faculty of Sciences of the University of Lisbon Campo Grande, 1749-016 Lisboa, Portugal
[email protected],
[email protected]

Abstract. The lack of accessibility of ICT applications affects mainly disabled and/or elderly people, who are at high risk of social exclusion. Using the advantages of adaptive multimodal systems and a well-accepted device like the TV, those difficulties can be surmounted and social exclusion can be countered. GUIDE intends to simplify interaction while not limiting the users able to use it, by providing multimodal interaction. Also, applications running in the GUIDE environment will be able to adapt, in an automated way, to the user's needs and characteristics.

Keywords: Accessibility, Adaptive Multimodal Systems, Dialogue Manager.
1 Introduction

Multimodal interaction has developed greatly in the past years, as researchers explored multiple ways to enhance human-computer communication. Being a more "human" way of interacting with computers, using speech, gestures and other "natural" modalities, multimodal interaction is preferred by users over unimodal interfaces [1]. Two main objectives can be set for this type of interaction: on the one hand, to support and accommodate users' perceptual and communicative capabilities; on the other hand, to integrate computational intelligence in the real world by offering humans more natural ways of interaction [2]. With this type of interaction, the choices offered to users increase and the complexity of the applications grows. Also, with the increase in computer users, we have to deal with different levels of expertise and different characteristics [3].

Disabled and/or elderly people are a group at high risk of social exclusion, due to the physical, legal, financial and attitudinal barriers from society that they face in their everyday life. Access to the opportunities offered by society is obviously limited if they cannot be reached by persons with impairments or restricted mobility. A more subtle form of exclusion results from the sensory modalities in which information and services are presented: if information is presented in only one modality, it will exclude people with impairments in that particular sensory modality. Not being able to use a device or service because its input and output channels support only one modality is a serious restriction on one's everyday life [4]. Adaptive multimodal interfaces, giving the user an optional representation of the same information in more than one sensory mode, besides adapting automatically to the user's requirements, can compensate to a certain degree for cognitive and sensory impairments.
Besides alternative outputs, we have to think about alternative interaction inputs, so that no impairment can actually prevent users from using the system. The European project GUIDE [5] ("Gentle user interfaces for elderly people") is developing a software framework which allows developers to efficiently integrate accessibility features into their applications. GUIDE puts a dedicated focus on the emerging Hybrid TV platforms. These platforms have the potential to become the main media terminals in users' homes, due to their convenience and wide acceptance. Especially for elderly users, applications such as home automation, audio-visual communication or continuing education can help to simplify daily life, stay connected to a social network and enhance one's understanding of the world. When adapted in the right way, recent advances in human-computer interfaces, such as visual gestures, multi-touch, speech or haptics, could help disabled or elderly users interact with ICT applications in a more intuitive and supportive manner. Nowadays, many ICT application implementations simply neglect special needs and lock out a large portion of their potential users. This work will therefore help such users interact with ICT applications in spite of their disabilities, as it will adapt to the user's needs automatically.
2 Adaptive Multimodal Interfaces
2.1 Adaptation

Adaptation is a "method to increase the usability of an application system in terms of effectivity, efficiency, and ease of use" [6]. There are many cases where interface adaptation would benefit the user interaction: for instance, when the requirements of the user change over time, as a user's knowledge and experience evolve from novice status towards expert. However, the most important reason for having an adaptive system is when the system is used by a heterogeneous user population with different requirements: users have different needs, preferences, characteristics, knowledge, motivations, goals, etc.

Many applications include customization components which allow the user to change preferences or use templates. Although this is the easiest way of providing adaptation mechanisms from the designer's point of view, it is not so easy for the users. Customization done by the user can be very difficult and tedious, and sometimes it is not done at all, as pointed out by Berry [7]: "Leaving the customization process to the user, may end up reflecting his beliefs, but not his actual practices, since even domain experts have difficulty describing their true knowledge and biases about a domain". GUIDE has the goal of using self-adaptation, making adaptation the system's responsibility, as the users do not have enough knowledge to perform it themselves. However, the user will be able to adjust some settings at any time for a better personalization of the system.

2.2 Adaptive Multimodal Systems

The advantages of multimodal interfaces can be better exploited by introducing adaptive capabilities. By monitoring the user's interaction and the context of use, the system can adapt automatically, improving its ability to interact with the user by building a user model based on partial experience with that user.
These adaptive capabilities are important when dealing with users with different physical and cognitive characteristics, preferences and knowledge. Systems that use multimodal interaction and have the ability to adapt are called adaptive multimodal systems. GUIDE will be used mainly by elderly people; bringing together the advantages of multimodality and adaptation to meet these users' requirements and characteristics is therefore an added value.

Architecture. In general, the architecture of an adaptive multimodal interface is composed of recognizers for the input modalities, synthesizers for the output modalities and, between them, the "integration committee" [2]. As shown in Figure 1, the components handling multimodal integration are a fusion engine (for inputs), a fission module (for outputs), a dialogue manager and a context manager.
Fig. 1. The architecture of an adaptive multimodal system by Dumas et al. [2]
The input modalities, such as speech or gestures, are received by the various recognizers. The fusion engine is then in charge of producing a common interpretation of those inputs. The interpretation given by the fusion engine is passed to the dialogue manager, responsible for identifying the dialogue state, the transition to perform, the action to communicate to a given application, and/or the message to return through the fission component. Then, depending on the user profile and context of use, the fission engine returns the message to the user through the most adequate modality or combination of modalities. Finally, the context manager is responsible for communicating any changes in the environment, context and user profile to the other three components, so they can adapt their interpretations.

Context Models. Adaptive systems rely on information stored in models as the basis for adaptation. Different models store different types of information: user model, task model, domain model, dialogue model, environment model, etc. The most important model in the adaptive systems field is the user model, as the user is the main driver of adaptation in most systems [3].
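As a rough illustration of what such a user model might hold, here is a small sketch; the fields and the event format are invented for the example, and are not GUIDE's or [2]'s schema.

# Sketch of a user model structure (fields are illustrative).
from dataclasses import dataclass, field

@dataclass
class UserModel:
    user_id: str
    impairments: dict = field(default_factory=dict)    # e.g. {"vision": "mild"}
    preferred_modalities: list = field(default_factory=list)
    interaction_history: list = field(default_factory=list)  # long-term information

    def record(self, event):
        """Update the model from observed behavior (implicit acquisition)."""
        self.interaction_history.append(event)

model = UserModel("u42", {"hearing": "mild"}, ["graphics", "speech"])
model.record({"action": "selected_button", "modality": "remote"})
print(model)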
Being an integral part of any adaptive system, the user model is a representation of the knowledge and preferences which the system "believes" that a user (or group of users) possesses [8]. The user model holds the user's characteristics, behavioral patterns, preferences, etc. All this information is important for the adaptation process and for improving the interaction; therefore, the user model must be continuously maintained and always kept up to date.

Knowledge Acquisition Techniques. One of the most relevant aspects of the self-adaptation process is the way knowledge about the users is acquired. According to [3], the information needed to build and maintain the user model can be acquired implicitly or explicitly. By observing the user's behavior, the interface can obtain information implicitly and use it to enhance the user's experience and capabilities. This acquisition can be done by observing direct interaction with the interface, or by analyzing the information needed by the user. Two types of information can be gathered through implicit acquisition: short-term and long-term. Short-term information consists of the current dialogue state (e.g., the last sentence typed) and can be used to give immediate assistance in case of a query to the system. Long-term information consists of the interaction history.

The explicit acquisition of information for the user model can be done in two ways: the application can ask the user directly about the information that is lacking, or the user might be allowed to inspect, and possibly change, the user model. By asking for information rather than deriving it from other sources, this knowledge acquisition technique can be more precise and reach higher levels of accuracy in the information kept in the user model. However, it involves interrupting the user's task and is therefore a more intrusive way of getting the information [3]. An example of the application of these techniques can be seen in [9], which describes an approach for personalized electronic program guides intended to counter information overload and facilitate the selection of the most interesting programs to watch. The system runs on the set-top box and downloads information about the available TV programs from the satellite stream. In order to obtain precise estimates of individual TV viewers' preferences, the system relies on a hybrid user model that integrates three sources of information: the user's explicit preferences, information about the viewing preferences of stereotypical TV viewer classes, and the user's viewing behavior.

Dialogue Manager Approaches. Bui [10] describes a dialogue manager (DM) as "the program which coordinates the activity of several subcomponents in a dialogue system and its main goal is to maintain a representation of the current state of the ongoing dialogue". This section presents several approaches to implementing a DM. Finite-state models are the simplest models used to develop a dialogue management system. The dialogue structure is represented as a state transition network in which the nodes represent the system's utterances and the transitions between the nodes determine all the possible paths through the network. Dialogue control is system-driven and all the system's utterances are predetermined; a toy illustration is sketched below.
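The following minimal sketch shows such a system-driven state transition network; the states, moves and prompts are invented for illustration.

# Toy state transition network for a system-driven dialogue (states invented).

TRANSITIONS = {
    ("ask_name", "answer"):    "ask_channel",
    ("ask_channel", "answer"): "confirm",
    ("confirm", "yes"):        "done",
    ("confirm", "no"):         "ask_channel",  # go back and re-ask
}
PROMPTS = {
    "ask_name": "What is your name?",
    "ask_channel": "Which channel would you like to watch?",
    "confirm": "Shall I switch now?",
    "done": "Switching channel.",
}

state = "ask_name"
for move in ["answer", "answer", "no", "answer", "yes"]:
    print(PROMPTS[state])
    state = TRANSITIONS.get((state, move), state)  # ignore unexpected moves
print(PROMPTS[state])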
In this approach, both the task model and the dialogue model are implicit, encoded by a dialogue designer. The major advantage of this approach is its simplicity.
It is suitable for simple dialogue systems with well-structured tasks. However, the approach lacks flexibility, naturalness and applicability to other domains. As an extension of finite-state models, the frame-based model was developed to overcome their lack of flexibility. Rather than building a dialogue according to a predetermined sequence of system utterances, the frame-based approach takes the analogy of a form-filling (or slot-filling) task in which a predetermined set of information is to be gathered. This approach allows some degree of mixed initiative and multiple slot fillings. The task model is represented explicitly and the dialogue model is (implicitly) encoded by a dialogue designer. Frame-based approaches have several advantages over finite-state approaches: greater flexibility, and a more efficient and natural dialogue flow. However, the system context that contributes to determining the system's next action is fairly limited, and more complex transactions cannot be modeled with these approaches.

In [10], Bui notes that the information state approach and its extensions are an effort to overcome the limitations of finite-state and frame-based approaches. An information state-based theory of dialogue consists of five main components [11]: a description of informational components (e.g. participants, common ground, linguistic and intentional structure, obligations and commitments, beliefs, intentions, user models, etc.); formal representations of those components (e.g., as lists, sets, typed feature structures, records, Discourse Representation Structures, propositions or modal operators within a logic, etc.); a set of dialogue moves that trigger updates of the information state; a set of update rules that govern the updating of the information state; and, finally, an update strategy for deciding which rule(s) to apply at a given point from the set of applicable ones.

Another extension of the information state approach is to use probabilistic techniques such as a (fully observable) Markov Decision Process (MDP) or a Partially Observable Markov Decision Process (POMDP). The idea is to allow the dialogue strategy and the actions of a dialogue system to change dynamically, by optimizing some kind of reward or cost given the current state. This is done by modeling the dialogue as an MDP or a POMDP; reinforcement learning is then used to learn an optimal strategy. The actions are the system's responses and questions, and the rewards are either defined by the designer (a high reward for task completion, small penalties for confirmations and questions, and so on) or provided by the user, who is asked to rate the system at the end of each dialogue.

Plan-based approaches support greater complexity in dialogue modeling than the approaches presented above. Bui [10] explains that these approaches are based on the view that humans communicate to achieve goals, including changes to the mental state of the listener. The dialogue input is not only considered as a sequence of words, but as performing speech acts [12] used to achieve these goals. Usually, the task of the listener is to discover, and appropriately respond to, the speaker's underlying plan.
The plan-based approaches are based on plan-based theories of communicative action and dialogue, which claim that the speaker's speech act is part of a plan and that it is the listener's job to identify and respond appropriately to this plan.
Collaborative approaches (also known as agent-based dialogue management) view dialogues as a collaborative process between intelligent agents, with both agents working together to achieve a mutual understanding of the dialogue. Bui [10] points out that, unlike dialogue grammars and plan-based approaches, which concentrate on the structure of the task, collaborative approaches try to capture the motivations behind a dialogue and the mechanisms of dialogue itself.

Machine Learning. Machine learning techniques play an important role in adaptive multimodal interfaces, and many parts of multimodal systems are likely to be supported by them: for example, speech, face and gesture recognizers are domains of interest in machine learning. In the field of self-adaptive systems, machine learning can be useful for user, task and context modeling [2]. Machine learning is the use of algorithms that allow computer programs to improve automatically through experience. Because adaptive user interfaces must learn from observing their user's behavior, another distinguishing characteristic is their need for rapid learning. The dialogue manager (DM) exists to oversee and control the entire conversational interaction and to execute any number of functions; ultimately, the DM prescribes the next action on each turn of an interaction. Because the actions taken by the system directly impact users, the DM is largely responsible for how well the system performs, and is perceived to perform, by users (i.e. the user experience) [13]. This recommends learning methods that achieve high accuracy from small training sets over methods with higher asymptotic accuracy but slower learning rates. On the other hand, the advisory nature of these systems makes this desirable but not essential: an interface that learns slowly will be no less useful than one that does not adapt to the user at all. Still, adaptive interfaces that learn rapidly will be more competitive, in the user's eyes, than ones that learn slowly [14]. An example of a learning method is reinforcement learning, one of the most active research areas in artificial intelligence: a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment.
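To make the reinforcement learning idea concrete, here is a minimal tabular Q-learning sketch for a toy dialogue policy. The states, actions and reward values are arbitrary inventions for illustration, not drawn from any of the cited systems.

# Minimal tabular Q-learning for a toy dialogue policy (all numbers arbitrary).
import random

states  = ["unknown_goal", "known_goal"]
actions = ["ask", "confirm", "execute"]
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

def step(state, action):
    """Toy environment: asking reveals the goal; executing a known goal succeeds."""
    if state == "unknown_goal":
        return ("known_goal", 0.0) if action == "ask" else ("unknown_goal", -1.0)
    return ("unknown_goal", 10.0) if action == "execute" else ("known_goal", -0.1)

for _ in range(2000):
    s = "unknown_goal"
    for _ in range(10):
        a = random.choice(actions) if random.random() < epsilon else \
            max(actions, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, a2)] for a2 in actions) - Q[(s, a)])
        s = s2

print({s: max(actions, key=lambda act: Q[(s, act)]) for s in states})

After training, the learned policy asks when the goal is unknown and executes when it is known, which is exactly the reward-driven strategy selection described above.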
3 GUIDE TV Applications

3.1 Initialization and User Profiling

When users come into contact with GUIDE for the first time, they are presented with an application for initialization and user profiling. The system asks questions and presents tasks so that the user's preferences and limitations can be perceived and stored, and the user model can be built. This is done through a dialogue between the system and the user which is predetermined by the application, but which should be able to adapt itself based on the knowledge it acquires. These tasks take into account all the elderly-related problems, such as vision, hearing, motor or cognitive impairments. The application also gathers the user's preferences regarding the modalities they prefer for interaction. All this data is stored in the user model, which will be used by the modules composing the GUIDE adaptation core to understand the user's characteristics and apply its user-adaptation techniques.
3.2 Video Conference, Tele-learning, Media Access, Home Automation

GUIDE will provide some applications already installed and prepared to run in its environment. These applications will all run in the GUIDE framework, and will therefore be capable of self-adaptation and offer interaction through different modalities. This adaptation is based on the user's characteristics and interaction patterns. For instance, if a person has hearing-related impairments, the system will automatically adapt, presenting the information in other modalities, like images and text, rather than sound. If a person has cognitive impairments and has difficulty remembering the last screen displayed, the system will find a way to display the last and current screens at the same time, or reduce the steps and screens needed to accomplish a task.

The GUIDE video conferencing application aims at providing inclusive audiovisual communication services for elderly people with ageing-related impairments. Using video conferencing, elderly people shall be enabled to communicate with their families and friends, sharing multimedia content and experiences through multimodal user interfaces. The goal of the tele-learning (TL) application is to provide users with a means of continuing education, by enabling accessible interaction with recorded lecture material. Such scenarios are common in third-age universities, a relatively recent development across Europe that allows older citizens to pursue training in a subject of their interest. Today, tele-learning is typically done on a PC; in order to make the application accessible to people not necessarily experienced with a PC's intricacies, bringing it to a TV environment is considered a major step in making such a service available to a greater audience. The GUIDE media access application aims at giving elderly and impaired people easy, multimodal access to digital TV. This application will provide several basic TV services, like navigating other channels while watching a given channel, accessing information related to programs in real time, activating subtitles, etc. The home automation application aims at enabling users to easily control their physical home environment through a transparent and distributed user interface that hides the technology in the background and provides a multimodal, intuitive and natural interface to a "reactive home environment". The reactive home environment is equipped with embedded technologies, such as sensors, device actuators, input/output channels and software applications, that are seamlessly connected in a networked home environment via a user interface framework.
4 Conclusion

In this paper we presented the problems raised by the lack of accessibility of ICT applications and the difficulties that elderly people have to surmount when interacting with such applications. GUIDE was then described as a system with the goal of overcoming those problems, using the TV as the main platform for an adaptive multimodal system. However, building an adaptive multimodal system that covers all of GUIDE's requirements is not an easy task, starting with the architecture and all its components and ending with the adaptation strategies. All of these aspects are complex and admit many approaches, so the paths to take must be discussed and evaluated. For example, the choice of approach for implementing the dialogue manager must take into account the complexity of the applications to be developed and the hardware limitations in terms of processing capabilities.
Protocols must also be defined for the communication between the components, and especially between the adaptation module and the applications.
References
1. Oviatt, S.L.: Advances in Robust Multimodal Interface Design. IEEE Computer Graphics and Applications 23 (September 2003)
2. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. In: Lalanne, D., Kohlas, J. (eds.) Human Machine Interaction. LNCS, vol. 5440, pp. 3–26. Springer, Heidelberg (2009)
3. Duarte, C.: Design and Evaluation of Adaptive Multimodal Interfaces, PhD Thesis, Faculty of Sciences, University of Lisbon (2007)
4. ETSI. Human Factors (HF): Multimodal interaction, communication and navigation guidelines. In: ETSI Guides and recommendations to promote e-accessibility (2008), http://www.etsi.org/WebSite/Technologies/HumanFactors.aspx
5. Gentle User Interfaces for Elderly People (GUIDE), http://www.guide-project.eu/
6. Schneider-Hufschmidt, M., Kühme, T., Malinowski, U.: Adaptive User Interfaces: Principles and Practice. North-Holland, Amsterdam (1993)
7. Berry, D.C.: The problem of implicit knowledge. Expert Systems 4, 144–151 (1987)
8. Benyon, D., Murray, D.: Applying user modeling to human-computer interaction design. Artificial Intelligence Review 7(3-4), 199–225 (1993)
9. Ardissono, L., Gena, C., Torasso, P., Bellifemine, F., Difino, A., Negro, B.: User Modeling and Recommendation Techniques for Personalized Electronic Program Guides. In: Ardissono, L., Kobsa, A., Maybury, M.T. (eds.) Personalized Information Retrieval (2004)
10. Bui, T.H.: Multimodal Dialogue Management - State of the Art. CTIT Technical Report series No. 06-01, University of Twente (UT), Enschede, The Netherlands (2006)
11. Traum, D., Larsson, S.: The Information State Approach to Dialogue Management (2003)
12. Searle, J.R.: Speech acts: An Essay in the Philosophy of Language. Cambridge University Press, Cambridge (1969)
13. Paek, T., Pieraccini, R.: Automating spoken dialogue management design using machine learning: An industry perspective. Speech Communication 50(8-9), 716–729 (2008)
14. Langley, P.: Machine Learning for Adaptive User Interfaces. KI Künstliche Intelligenz, 53–62 (1997)
A Survey on Guiding Logic for Automatic User Interface Generation Gaurav Dubey Penn State University, University Park PA 16802, USA
[email protected] Abstract. Mobile devices and other computationally-capable artifacts are presenting new ways in which to interact with our applications and data. However, to do so effectively requires user interfaces that are optimized for the target platform. Designing such user interfaces manually for every platform and application would be an impractical way to approach this problem; this is one area where automatic user interface generation could prove useful. This process, however, requires guiding logic, formalized from usability knowledge and represented in a form that an automatic UI generation system can use. This paper discusses some of the issues related to this task, presenting examples from the relevant literature. The paper then introduces the concept of genericity, which could help manage the problem space encompassed by automatic UI generation. Keywords: automatic user interface generation, adaptation.
1 Introduction

In recent years, computers have pervaded our environment. Our appliances, telephones and automobiles have all been computerized to a certain extent; furthermore, it is becoming easier to interact with this computational presence. Cellular phones and tablet computers are giving us new ways to access our data and applications, and innovative new interaction mechanisms are emerging. However, as we start using computational devices with new form factors and interaction modes, we need to create interfaces for our software that are optimized for both the restrictions enforced and the capabilities granted by these devices. Doing this manually for every application and platform would be a messy, tedious approach, causing consistency and maintenance issues. On the other hand, automation would make it easier to create these interfaces in an efficient, usability-conscious manner. A system that created (or assisted in the creation of) such UIs would need guiding logic: a formalization of the usability knowledge relevant to the target platform(s). The goal of this paper is to examine several issues that emerge while designing this logic, and to discuss how these issues are treated in the literature. The corpus of papers examined in this review includes a selection of relevant papers from a larger, unpublished literature review conducted in order to identify viable research areas [4]. The papers for that review were selected using a combination of keyword searches and citation filtering.
2 Emergent Issues

This section discusses several issues that emerge when examining the guiding logic used by automatic user interface generation systems.

2.1 Generation vs. Adaptation

When generating a user interface for a new platform, one question that arises is: should the interface be generated from scratch, or should it be an adaptation of an existing interface? The papers under review include both approaches. For instance, Plomp and Mayora-Ibarra [13] claim that graphical UIs don't have enough information to be directly adapted to the speech modality, and vice versa. The paper favors an abstract model that includes enough UI generation data for both modalities. On the other hand, a number of other papers support adaptation: these include Banavar et al, Bandelloni et al and Florins et al [2, 3, 5]. Note, however, that none of these attempt to adapt existing UIs without the involvement of higher abstraction levels. One could argue that adaptation would be a better approach than pure generation not involving an existing UI, because theoretically, transformations could be designed that preserved the usability and layout aesthetics of an existing interface while adapting it to a new platform. This way, if an appealing, usable interface were used as input during the adaptation process, those properties would be transferred to the new UI. However, none of the papers under review discussed this argument in depth; furthermore, it is probably very difficult to create transformations that 'preserve usability' without platform-specific issues creeping in and causing problems.

2.2 Consistency

One way to improve the quality of automatically generated interfaces is to ensure that they are familiar to the user. This could be done by maintaining consistency between interfaces: whenever a new interface is encountered, it is rendered in such a way as to match previously encountered interfaces. Out of the papers covered by this review, only two discussed this topic in depth: Nichols, and Gajos et al [11, 6]. The PUC incorporates consistency as a major factor in its approach: a good portion of the logic described in [11] is devoted to this. The paper specifies seven requirements for the PUC's interfaces to be consistent:

− "Interfaces should manipulate similar functions in the same way"
− "Interfaces should locate similar functions in the same place"
− "Interfaces should use familiar labels for similar functions"
− "Interfaces should maintain a similar visual appearance"
− "Usability of unique functions is more important than consistency for similar functions"
− "Interface generators must provide a method to find similar functions across multiple appliances"
− "Users must be able to choose to which appliance consistency is ensured"
One of the major concerns stated by the paper when applying consistency is the balance between consistency with previous UIs and the usability of unique functions (functions that haven't been encountered in otherwise similar UIs). The paper provides a set of heuristics for positioning unique functions in the most usable way. SUPPLE also incorporates a dissimilarity measure into its cost function [6]. The paper makes use of the following features while evaluating the similarity of individual widgets:

− Language: "the primary method(s) a widget uses to convey its value"
− Domain visibility: how much of a widget's domain of values is visible
− Continuous/discrete: whether or not values can be adjusted along a continuous range
− Orientation of data presentation: whether domain values are presented vertically, horizontally, or circularly
− Widget geometry: this concerns "the general appearance of the widget"
− Primary manipulation method: "the primary way of interacting with the widget"

The paper also uses two features of container widgets:

− Layout geometry: this could be horizontal, vertical, or grid-based
− Impact on visibility: this concerns certain container types that might impact widget visibility (e.g., tab panes, pop-up windows).

2.3 Layout Method

The actual method for translating from an abstract UI to a concrete one could take the form of a set of rules applied as the adaptation system traverses an abstract model; it could also take the form of a constraint-satisfaction problem, where the interactors and layout are chosen in such a way as to optimize a numerical function (a toy example of this flavor is sketched at the end of this section). Both approaches are used in the papers under review; in some instances, they are used by the same paper. (For instance, see Bandelloni et al, Florins et al, and Kavaldjian et al for the rule-based approach [3, 5, 8], and Bandelloni et al, Song and Lee, and Gajos et al for the constraint-satisfaction approach [3, 14, 6].) Each approach has difficulties associated with it. For instance, as expressed by Adam et al [1], formalizing usability knowledge is a very difficult task, particularly because of the number of inconsistencies that start to arise, requiring special cases. Similarly, with the constraint-satisfaction approach, the function to be optimized can be highly complex, and formulating it is not a straightforward process. (Also, with approaches such as Song and Lee's [14], complex matrices must be maintained and updated as new input controls and data types are considered.) Another issue involves whether to use pre-existing tools or custom solutions. For instance, Sottet et al [15] advocate the use of existing model-based engineering technology for automatic UI generation (while acknowledging the possibility that initial attempts might not meet all the needs of the application). On the other hand, Adam et al [1] had issues with existing tools, because they weren't optimized for runtime UIs, and therefore didn't provide suitable performance. Similarly, Kavaldjian et al [8] developed their own transformation engine in order to account for deficiencies in solutions like ATL and MOLA.
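As a toy example of the constraint-satisfaction flavor, the sketch below enumerates candidate widget assignments and picks the one minimizing a simple additive cost under a width budget. This is our own simplification, not SUPPLE's actual cost function, and all widget names and numbers are invented.

# Toy constraint-optimization layout: pick widgets minimizing cost within a width budget.
from itertools import product

CANDIDATES = {                       # element -> [(widget, width, usability_cost)]
    "volume": [("slider", 120, 1.0), ("spinbox", 60, 2.5)],
    "mute":   [("checkbox", 40, 1.0), ("toggle_button", 70, 1.2)],
}
MAX_WIDTH = 160

def best_layout():
    best, best_cost = None, float("inf")
    for combo in product(*CANDIDATES.values()):
        width = sum(w for _, w, _ in combo)
        cost = sum(c for _, _, c in combo)
        if width <= MAX_WIDTH and cost < best_cost:   # hard constraint + objective
            best, best_cost = combo, cost
    return dict(zip(CANDIDATES, (name for name, _, _ in best)))

print(best_layout())  # -> {'volume': 'slider', 'mute': 'checkbox'}

Real systems replace the brute-force enumeration with branch-and-bound or similar search, and fold dissimilarity and user-preference terms into the objective.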
2.4 Element Mapping

In the set of papers under review, the issue of mapping interaction element data types to widgets was a common one. If a certain input or output element in an abstract UI model is to be translated into a concrete UI widget, what should the widget type be? A few of the papers under examination presented such mappings. For instance, Hodes and Katz [7] specified mappings for ints, reals, booleans, enums, and strings. Also, Plomp and Mayora-Ibarra [13] presented a generic vocabulary for creating UIs that are inherently adaptable to multiple modes of operation, along with sample mappings for some of the elements in this vocabulary. These mappings need not be restricted to primitive data types. If an application needs to receive or display data of a complex type, mappings could also be developed for these: for instance, an input element with the 'RGB color value' data type could be mapped to a color picker widget (a sketch of such a mapping appears after Section 2.5). Higher-level mappings of this sort would be reminiscent of the 'Smart Templates' system presented by Nichols in [11]. However, Smart Templates define complex scenarios that are left to UI generation system builders to implement, as opposed to being mappings to existing widgets. That idea would be more in line with choosing target platforms with complex widgets, as suggested by Adam et al in [1].

2.5 Layout Directives

A UI at a higher level of abstraction could have annotations and directives attached to it that guide the UI generation process. What sorts of information should these annotations contain? Among the papers examined for this review, the grouping directive was the most common type encountered. This directive indicates that a certain set of elements is logically related: based on this, an automatic UI generator might avoid placing these elements on separate pages. Papers that used this directive include Plomp and Mayora-Ibarra, Banavar et al, Bandelloni et al, Khushraj and Lassila, and Florins et al [13, 2, 3, 9, 5]. Other group-related directives indicate more specific relationships: for instance, Banavar et al [2] have single-column and same width-height directives. Bandelloni et al [3] have relation, ordering and hierarchy operators: respectively, these indicate a one-to-many element relationship, a specific element order, and relative element importance. Other directives apply to individual elements. For instance, both Nichols as well as Khushraj and Lassila allow elements to be annotated with display labels; in particular, Nichols defines a dictionary type to allow for different platforms requiring different labels [9, 11]. Florins et al [5] allow individual elements to be marked as pageable: if an element is thus marked, it can be replicated if its container is split into multiple containers during the generation process. It is highly likely that there are other useful directives that could be used to guide the UI generation process. Too many directives would be difficult to use, however: Banavar et al limit the number of available directives for this reason [2]. It might be interesting to look for a small set of directives that provides the most control. A symbolic representation allowing complex instructions to be expressed in terms of simpler directives might also be useful.
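Returning to the element mappings of Section 2.4, the sketch below shows what a small type-to-widget mapping table might look like, including a higher-level mapping for a complex type. The table contents are illustrative and not taken from any of the surveyed systems.

# Illustrative type-to-widget mapping table (contents invented).

WIDGET_MAP = {
    "bool":      "checkbox",
    "enum":      "dropdown_list",
    "int":       "spinbox",
    "real":      "slider",
    "string":    "text_field",
    "rgb_color": "color_picker",   # higher-level mapping for a complex type
}

def map_element(element):
    """Fall back to a plain text field for unknown types."""
    return WIDGET_MAP.get(element["type"], "text_field")

print(map_element({"name": "brightness", "type": "real"}))     # slider
print(map_element({"name": "highlight", "type": "rgb_color"})) # color_picker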
2.6 Error Correction
Only one of the papers had an in-depth discussion of a problem-fixing system designed to operate on the completed UI after generation: Nichols [11]. The PocketPC interface generator built for the PUC has a set of nine rules for dealing with problems of both insufficient height and insufficient width. The PUC's system for the PocketPC works by applying the rules in order to each problem encountered, until one of the rules fixes the problem. After this, the interface is searched again to fix any new problems caused by the original fix. The procedure ends when there are no problems left, or when the remaining problems cannot be fixed using the rules.
The system presented by Nichols is designed to handle only two cases on a specific platform; however, an extra pass for fixing UI problems might be a good general-purpose way to improve automatically generated UIs. The main issue would be assembling a ruleset comprehensive enough to fix most problems, but concise enough to be manageable and efficient. Different rulesets would need to be assembled for the various platforms; further, formalizing UI knowledge in such a way that it could be applied to locate and fix UI problems would be a difficult task (as described in Adam et al. [1]). The repository approach might be one way to approach this problem: maintain a repository of problem cases encountered during UI generation, along with potential solutions.
2.7 Human Guidance
An important distinction while generating user interfaces is whether the generation process is human-assisted, or whether it is fully automatic. There are benefits to both approaches: the human-assisted approach enables the designer (or possibly the user) to tweak the interface in order to improve the usability and the aesthetics in ways that an automated system cannot. On the other hand, the fully automatic approach enables dynamic, real-time interface generation (although this does not rule out the possibility of adjusting the completed interface after generation).
Among the papers under review, Banavar et al. and Sottet et al. [2, 16] place emphasis on semi-automatic systems that keep designers in the loop. This helps alleviate the problem of unappealing automatically generated interfaces [16]. However, there are also problems: Banavar et al. discuss the round-trip problem, where evolutionary updates to the underlying application model cause conflicts to arise when attempting to apply previously defined tweaks. The authors introduce a distinction between local and global tweaks in order to mitigate this. Bandelloni et al., Khushraj and Lassila, and Luyten et al. [3, 9, 10] use a fully automatic approach. Khushraj and Lassila present a semantic caching system used to manage the large amounts of context data used as input [9]. Luyten et al. present a system designed to be as inclusive as possible in terms of runtime environments: any version of the .NET or Java runtimes is suitable for running applications delivered through their system [10].
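Returning to the fix-up pass of Section 2.6: its control flow (apply an ordered ruleset to each problem, then re-scan) might be sketched as follows. The rules and the problem model here are hypothetical stand-ins, not Nichols' actual PocketPC rules.

```python
# Rough sketch of an ordered-rule fix-up pass of the kind described above.
# Rules and the problem model are hypothetical; only the control flow
# (try rules in order, re-scan after each pass) mirrors the text.

def find_problems(ui):
    """Stand-in for a real layout check: flag panes wider than the screen."""
    return [p for p in ui["panes"] if p["width"] > ui["screen_width"]]

def shrink_labels(ui, pane):
    pane["width"] -= 20                    # pretend trimming labels saves 20 px
    return pane["width"] <= ui["screen_width"]

def move_to_tab(ui, pane):
    pane["width"] = ui["screen_width"]     # moving content to a tab always fits
    return True

RULES = [shrink_labels, move_to_tab]       # applied in priority order

def fix_up(ui):
    while True:
        problems = find_problems(ui)
        if not problems:
            return True                    # nothing left to fix
        progress = False
        for pane in problems:
            for rule in RULES:
                if rule(ui, pane):         # first rule that fixes it wins
                    progress = True
                    break
        if not progress:
            return False                   # remaining problems are unfixable

ui = {"screen_width": 240, "panes": [{"width": 300}, {"width": 250}]}
print(fix_up(ui), ui["panes"])
```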
3 Genericity
This section will introduce the concept of genericity. This refers to the degree to which a single set of principles and methods is applied across the entire problem space, as opposed to dividing the space into sub-spaces, each solved using a separate approach. For instance, if a monolithic tool was designed with the intention of adapting UIs to every possible context of use using the same principles and methods, this would be a high-genericity approach (since it would be generic, and applicable to every context). On the other hand, if separate principles and methods were used for different sections of the problem space, this would be a reduced-genericity approach. (For instance: developing separate tools for adapting UIs to mobile devices and large-screen displays.)
In practice, the reduced-genericity approach makes the most sense, since trying to create a monolithic solution that accounted for all possible contexts of use would be cumbersome. However, another question arises: how should the problem space be divided? For instance, independent sets of principles and methods could be formulated and used for different form factors (e.g., desktops, mobile devices, large public screens), different application types (e.g., text editing, media playback, instant messaging), or different tasks (e.g., printing a document, saving a file, entering a date).
What is the best way to divide the problem space? The papers under review take a number of different approaches. For instance, Nichols has separate rendering solutions for different platforms [11]; Luyten et al. use separate rendering back-ends for separate UIML vocabularies (which presumably vary according to target platform) [10]. Kavaldjian et al. divide the problem space by device properties: the authors use different transformation rule sets depending on whether the device uses fine or coarse pointing granularity [8]. Gajos et al. present separate optimization functions based on the desired properties of the UI (conformance with user preferences vs. speed of use) [6]. Another way to divide the problem space is to attack specific problems in specific contexts of use: for instance, Paelke et al. discuss the idea of a framework designed to organize, publish, refine and evaluate visualization methods (with a focus on the restrictions introduced by mobile devices) [12]. Nichols suggests a system of Smart Templates, where a repository of templates for specific UI features is maintained, such that UI generators can include custom implementations for these features (if necessary) [11]. Sottet et al. discuss storing UI model transformations in libraries: by representing them as models, they can be transformed themselves, and easily compared and evaluated [16].
It is not clear what the best way to divide the problem space (thereby reducing the genericity of a UI generation system) is; however, finding it could make the work done to build generation systems more applicable and extensible.
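In code, a reduced-genericity system often boils down to a dispatch table over some partition of the context space. The sketch below uses Kavaldjian et al.'s pointing-granularity split as the partition; the rule contents and values are invented.

```python
# Minimal sketch of reduced genericity as a dispatch table: separate rule
# sets keyed on a property of the context of use. The partition (pointing
# granularity) echoes Kavaldjian et al. [8]; everything else is invented.

def coarse_pointer_rules(element):
    return {"widget": element["widget"], "min_target_px": 40}

def fine_pointer_rules(element):
    return {"widget": element["widget"], "min_target_px": 16}

RULE_SETS = {
    "coarse": coarse_pointer_rules,   # e.g. fingers on a touch screen
    "fine":   fine_pointer_rules,     # e.g. mouse or stylus
}

def render(element, context):
    rules = RULE_SETS[context["pointing_granularity"]]
    return rules(element)

print(render({"widget": "button"}, {"pointing_granularity": "coarse"}))
```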
4 Conclusion
Mobile and other computationally capable devices are changing the way we interact with our applications and data. However, in order to fully realize the potential these devices represent, software needs to be optimized to run as effectively as possible on
the target platform. This includes accommodating constraints (e.g., limits on screen resolution, keyboard size, or processing power), but it also involves making full use of the extra capabilities these devices possess. For instance, a cellphone can be tilted, rotated, moved forwards and backwards and side to side – these are just some of the ways in which a handheld device could exceed the capabilities of its desk-bound equivalent. Using these capabilities in a natural, logical fashion could improve the usability of application interfaces on non-conventional platforms; however, formalizing the usability knowledge required to automatically generate effective UIs is a daunting task, encompassing a large number of platforms, form factors and application types. Further examination of the issues identified and discussed in this paper could help developers address this challenge, leading to the creation of usable, attractive interfaces for whatever platform a user might choose.
References
1. Adam, S., Breiner, K., Mukasa, K., Trapp, M.: Challenges to the model-driven generation of user interfaces at runtime for ambient intelligent systems. Constructing Ambient Intelligence, 147–155 (2008)
2. Banavar, G., Bergman, L., Gaeremynck, Y., Soroker, D., Sussman, J.: Tooling and system support for authoring multi-device applications. Journal of Systems and Software 69(3), 227–242 (2004)
3. Bandelloni, R., Mori, G., Paterno, F.: Dynamic generation of web migratory interfaces. In: Proceedings of the 7th International Conference on Human Computer Interaction with Mobile Devices & Services, pp. 83–90 (2005)
4. Dubey, G.: Rendering dynamically generated device aware user interfaces (2010) (unpublished literature review)
5. Florins, M., Simarro, F., Vanderdonckt, J., Michotte, B.: Splitting rules for graceful degradation of user interfaces. In: Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 59–66 (2006)
6. Gajos, K., Weld, D., Wobbrock, J.: Automatically generating personalized user interfaces with Supple. Artificial Intelligence 174(12-13), 910–950 (2010)
7. Hodes, T., Katz, R.: "Smart spaces": entity description and user interface generation for a heterogeneous component-based distributed system. In: Proceedings of the 1998 DARPA/NIST Smart Spaces Workshop, pp. 7-44–7-51 (1998)
8. Kavaldjian, S., Raneburger, D., Falb, J., Kaindl, H., Ertl, D.: Semi-automatic user interface generation considering pointing granularity. In: Proceedings of the 2009 International Conference on Systems, Man and Cybernetics, pp. 2052–2058 (2009)
9. Khushraj, D., Lassila, O.: Ontological approach to generating personalized user interfaces for web services. In: Proceedings of the 4th International Semantic Web Conference, pp. 916–927 (2005)
10. Luyten, K., Thys, K., Vermeulen, J., Coninx, K.: A generic approach for multi-device user interface rendering with UIML. In: Proceedings of the 6th International Conference on Computer-Aided Design of User Interfaces, pp. 175–182 (2008)
11. Nichols, J.: Automatically generating high-quality user interfaces for appliances. Ph.D. thesis, Carnegie Mellon University (2006)
12. Paelke, V., Reimann, C., Rosenbach, W.: A visualization design repository for mobile devices. In: Proceedings of the 2nd International Conference on Computer Graphics, Virtual Reality, Visualisation and Interaction in Africa, pp. 57–62 (2003)
13. Plomp, C., Mayora-Ibarra, O.: A generic widget vocabulary for the generation of graphical and speech-driven user interfaces. International Journal of Speech Technology 5(1), 39–47 (2002)
14. Song, K., Lee, K.: Generating multimodal user interfaces for Web services. Interacting with Computers 20(4-5), 480–490 (2008)
15. Sottet, J.-S., Calvary, G., Favre, J.-M., Demeure, A., Balme, L.: Towards Model-Driven Engineering of Plastic User Interfaces. In: Bruel, J.-M. (ed.) MoDELS 2005. LNCS, vol. 3844, pp. 191–200. Springer, Heidelberg (2006)
16. Sottet, J., Ganneau, V., Calvary, G., Coutaz, J., Demeure, A., Favre, J.: Model-driven adaptation for plastic user interfaces. In: Proceedings of the 11th International Conference on Human-Computer Interaction, pp. 397–410 (2007)
Adaptive Multimodal Fusion Pedro Feiteira and Carlos Duarte LaSIGE and Informatics Department, Faculty of Sciences of the University of Lisbon, Campo Grande, 1749-016 Lisboa, Portugal {pfeiteira,cad}@di.fc.ul.pt
Abstract. Multimodal interfaces offer their users the possibility of interacting with computers in a transparent, natural way, by means of various modalities. Fusion engines are key components in multimodal systems, responsible for combining information from different sources and extracting a semantic meaning from it. This fusion process allows many modalities to be used effectively at once, therefore enabling natural communication between user and machine. Elderly users, who may have several accessibility issues, can benefit greatly from this kind of interaction. By developing fusion engines that are capable of adapting to the characteristics of these users, it is possible to make multimodal systems cope with the needs of impaired users. Keywords: Multimodal Interfaces, Adaptive multimodal fusion, Fusion engines, Evaluation.
1 Introduction
Interactive multimodal systems are starting to become widespread, covering many application domains and trying to support a variety of users in the completion of their tasks or needs. In the past, multimodal interfaces were centered on modalities such as speech or deictic gestures, but now they are striving to include more and more input options, like full-body gestures, eye gaze, touch interaction with tablets and many others. By allowing multiple modalities (input and output) to be used, multimodal systems try to replicate a kind of "human-human" interaction with their users, establishing a more natural way of communication with them. This "flexibility" of modality choice is beneficial for enhancing accessibility, especially for users with physical or cognitive impairments (e.g., visual and motor impairments).
A critical step in the functioning of a multimodal interactive system is multimodal fusion, a process responsible for scanning all the input sources, merging all the information together and deriving an interpretation from it. Fusion engines, the key components in multimodal systems that execute fusion algorithms and techniques, can also be distinguished by their capability to adapt. When dealing with end-users such as elderly citizens, with so many particularities in their limitations and capabilities, it is crucial to have adaptable user interfaces meeting user requirements.
Many criteria can be considered when trying to implement an adaptive fusion engine (e.g., noise/error ratio, quality of the recognizers). However, in the scope of the GUIDE project, the principle that defines how the fusion mechanism should act is the users and their characteristics.
2 Context
The European project GUIDE (Gentle User Interfaces for Elderly Citizens) intends to deliver a developer-oriented toolbox of adaptive, multimodal user interfaces that target the accessibility requirements of elderly users in their home environment, making use of TV set-top boxes as a processing and connectivity platform. The target users of the GUIDE project are elderly people who have accessibility issues, namely mild to moderate sensory, cognitive and physical impairments (loss or abnormality of psychological, physiological or anatomical structure or function) resulting from ageing or disability (restriction or lack of ability to perform an activity in the manner or within the range considered normal for a human being).
Different modalities for input and output will be available in GUIDE, such as speech, gestures, audio and animated avatars. This diversity will allow elderly people to have several options at their disposal to interact with applications. When dealing with this type of users, allowing a "natural" way of communication with the system proves useful, not only because they may not be used to technology, but also because they usually have certain physical or cognitive impairments that prevent some paths of interaction. Through user models, which contain information about users' impairments or disabilities, the framework is able to adapt to each user and adjust the system behavior in an optimal way to provide the best interaction experience possible.
3 Multimodal Systems
In our everyday lives we are constantly communicating with each other by means of modalities like vision or gestures, making our life truly multimodal. This kind of dialog is also desirable in HCI, because it would make the interaction with computers feel much smoother and more natural. This desire gave birth to what are called multimodal systems, which differ considerably from traditional GUI interfaces. Table 1 shows the main differences between these two types of interfaces [1].

Table 1. Main differences between GUIs and MUIs according to Oviatt et al. [1]

GUI                        MUI
Single input stream        Multiple input streams
Atomic, deterministic      Continuous, probabilistic
Sequential processing      Parallel processing
Centralized architecture   Distributed & time-sensitive architecture
One of the reasons why people use multimodal systems is simply that they like them. Devices like the mouse, joystick or keyboard limit the ease with which a user can interact in today's computing environments, including, for example, immersive virtual environments. Providing interaction alternatives not only boosts user satisfaction but also makes systems more robust, because combining information allows the weaknesses of one modality (in a certain physical or user-related context) to be complemented by the strengths of another. Multimodal interfaces have also been shown to be more reliable, reducing user errors by 36% [2].
3.1 Architecture
In Fig. 1 we can observe the general architecture of a multimodal system, along with its key components. From a software point of view, this figure demonstrates the message flow in multimodal systems, from the user to the system, including the developer applications. As we can see, input modalities are first perceived through various recognizers, which output their results to the fusion engine, in charge of forming a common interpretation of the inputs. When the fusion engine comes to an interpretation, it communicates it to the dialog manager, in charge of identifying the dialog state, the transition to perform, the action to communicate to a given
Fig. 1. General architecture of multimodal systems according to Dumas et al [1]
application, and/or the message to return through the fission component. Finally, the fission engine is responsible for returning a message to the user through the most adequate modality or combination of modalities, depending on the user profile and context of use. For this reason, the context manager, in charge of tracking the location, context and user profile, closely communicates any changes in the environment to the three other components, so that they can adapt their interpretations.
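Reduced to code, the message flow of Fig. 1 is essentially a pipeline over a shared context. The sketch below is purely schematic; the component interfaces are invented to show the direction of the calls and do not correspond to any real framework API.

```python
# Schematic only: invented component interfaces tracing the message flow
# of Fig. 1 (recognizers -> fusion -> dialog manager -> fission), with the
# context manager's data passed along as a shared context.

class FusionEngine:
    def interpret(self, inputs, context):
        # Combine recognizer outputs into one interpretation (trivially here).
        return {"intent": "+".join(i["event"] for i in inputs),
                "user": context["user"]}

class DialogManager:
    def handle(self, interpretation):
        # Identify the dialog transition and the message to return.
        return {"reply": f"ack:{interpretation['intent']}"}

class FissionEngine:
    def present(self, message, context):
        # Pick the output modality from the user profile in the context.
        modality = "speech" if context["prefers_audio"] else "screen"
        return (modality, message["reply"])

context = {"user": "u1", "prefers_audio": True}       # from the context manager
inputs = [{"event": "point"}, {"event": "say_open"}]  # from the recognizers

interpretation = FusionEngine().interpret(inputs, context)
print(FissionEngine().present(DialogManager().handle(interpretation), context))
```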
4 Multimodal Fusion
In multimodal interactive systems, multimodal fusion is a crucial step in combining and interpreting the various input modalities, and it is one of the distinguishing features that separate multimodal interfaces from unimodal interfaces [1]. The aim of sensor fusion is to analyze many measurements simultaneously and try to construct semantic meaning from them, which would be harder if only individual measurements were taken into account.
Table 2 shows how modalities can be used to interact with multimodal interfaces. The "Use of modalities" columns express the temporal availability of modalities, while the rows represent the fact that information obtained from several modalities can be either combined or treated in an independent fashion. While sequential use of modalities forces the user to use them one at a time, the support for "parallel" use of modalities allows the user to employ multiple modalities at once, increasing the rate of information transmission between user and system. If this information is further combined, it becomes a synergistic form of interaction.

Table 2. Ways to interact with multimodal interfaces

                         USE OF MODALITIES
                         Sequential    Parallel
FUSION   Combined        ALTERNATE     SYNERGISTIC
         Independent     EXCLUSIVE     CONCURRENT
Based on the type of information available, the requirements, and the number and type of modalities, different levels of fusion may be considered for a multimodal system. Some of those options will be further discussed in Section 4.1.
4.1 Levels of Fusion
Recent literature considers three levels of fusion: sensor-data, feature and decision-level fusion [1], while other work makes a distinction between decision-level and opinion-level fusion [3]. According to Sanderson and Paliwal [3], in the former the output of each classifier forms a hard decision, whereas in the latter a classifier is viewed as an expert which provides an opinion on each possible decision. Information can either be combined before the use of classifiers or experts (pre-mapping) or after a translation from data/feature space to decision/opinion space has taken place (post-mapping).
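Before the individual levels are detailed below, the hard-decision/opinion distinction itself is easy to state in code. In the toy sketch that follows (all classifier outputs are invented), the same three recognizers are fused once by majority voting over hard decisions and once by summing opinion scores, with visibly different outcomes.

```python
# Toy illustration of the decision/opinion distinction described above.
# Classifier outputs are invented; majority voting and opinion summation
# are among the standard combination techniques cited in the text [3].

from collections import Counter

# Decision-level: each classifier emits one hard decision.
decisions = ["open_menu", "open_menu", "scroll"]
majority = Counter(decisions).most_common(1)[0][0]
print("decision-level:", majority)                    # open_menu

# Opinion-level: each expert scores every candidate decision.
opinions = [
    {"open_menu": 0.6, "scroll": 0.4},
    {"open_menu": 0.5, "scroll": 0.5},
    {"open_menu": 0.2, "scroll": 0.8},
]
totals = {c: sum(o[c] for o in opinions) for c in opinions[0]}
print("opinion-level:", max(totals, key=totals.get))  # scroll
```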
In the next subsections several types of fusion are considered and explained, such as sensor-data fusion and feature-level fusion (pre-mapping). Special focus is given to decision-level and opinion-level fusion (post-mapping) because they are the most common approaches and the most adequate for a system which uses loosely-coupled modalities [1], such as GUIDE.
Sensor-data Level. Data-level fusion, also called sensor-level fusion, deals with raw data coming from recognizers, representing the richest form of information possible (quantitatively speaking). Because the signal is directly processed, no information loss occurs. It is normally used when dealing with multiple signals of the same type, involving a single modality.
Feature Level. Feature-level fusion is oriented towards closely-coupled or time-synchronized modalities, such as, for example, speech and lip movement. In this type of fusion, features are extracted from data collected by several sensors and are combined later. Unlike data-level fusion, it can suffer from data loss, but manages noise interference better.
Decision Level. One of the most common and widely accepted forms of fusion is decision-level fusion, because it allows multimodal systems to make effective use of loosely-coupled modalities, like speech and pen interaction. Because the information received by the fusion engine has already been processed, noise and failure are no longer issues to deal with. This means that fusion has to rely on pre-processed information in order to construct semantic meaning by combining partial semantic information coming from each input mode. That pre-processed information constitutes a hard decision that was produced by one or more classifiers. Those decisions can then be combined using different techniques (e.g., majority voting, ranked lists, AND fusion, OR fusion) [3].
Opinion Level. Opinion-level fusion (also called score-level fusion) is very similar to decision-level fusion because both of them operate after the mapping of the data/feature-level space into an opinion/decision space. However, in the case of opinion-level fusion, a group of classifiers, viewed as experts, provides opinions instead of hard decisions, and for that reason Sanderson and Paliwal [3] found it more adequate to make a distinction between the two types. The combination of opinions can be achieved, for example, through weighted summation or weighted product approaches [3], before using a classification criterion (e.g., the MAX operator) in order to reach a final decision. The main advantage of this approach over decision-level fusion is that the opinions from each expert can be weighted, which makes it possible to imprint adaptive features into a system by setting the reliability and discrimination of experts through time according to the state of the environment/signal quality, the users, or the application logic.
4.2 Adaptive Fusion
Fusion classifiers can be distinguished not only by the type of fusion or architecture they possess, but also by whether they are adaptive or non-adaptive [4]. The basic
concept around adaptive fusion (also called quality fusion) is to assign different weight values to each modality. As stated in Section 4.1, Sanderson and Paliwal [3] pointed out two examples of how such weighting can be used to perform adaptive opinion fusion: weighted summation fusion and weighted product fusion. Poh et al. [4] state that adaptivity works as a function of the signal quality measured on a modality. The idea is that the higher the quality of a signal, the more weight it is given. One use of this kind of adaptation is, for instance, person recognition in a biometric system. Because the light conditions can change and influence the system input (in this case, the face recognition), this visual modality may get a lower weight value whilst speech input would get a higher value, and thus be considered more trustworthy in the recognition process.
According to Poh and Kittler [4], signal quality can be measured through quality measures. These measures are a set of criteria used to assess the incoming signal quality of a modality. Such measures could be, for example, lighting or reflections in face detection and the SNR (signal-to-noise ratio) for sound. An ideal quality measure should correlate, to some extent, with the performance of the classifier processing the modality [4]. This means that characteristics of a classifier prone to affect its performance should make ideal quality measures (e.g., if head pose in face recognition affects the recognition process, then head pose would serve as an optimal quality measure).
4.3 Benchmark and Evaluation
Evaluation of multimodal systems has mainly focused so far on user interaction and user experience evaluation [5]. Although this may give valuable insight about multimodal systems, it is only one possible source of errors. Two other important sources that should be considered and studied are the modality recognizers and the fusion engines. When a query to the application does not produce the expected results, any of these three sources is a possible culprit: the user did not make the action proper to his or her intent, a recognizer issue came up, or delays in system communication made the fusion process fail.
Several frameworks are available today to quickly develop multimodal interfaces with easy plug-in of fusion engines, such as OpenInterface [6] or HephaisTK [7]. Dumas et al. [7] proposed a "divide-and-conquer" approach to evaluate multimodal interfaces, in which each component of a system (e.g., fusion engine, fission engine, recognizer output) is first tested on its own, and only later is the whole system tested, with real user data. The testbed developed by Dumas et al. [7] for use in HephaisTK makes it possible to focus on fusion algorithms and rules, since the outputs generated by the modality recognizers are simulated in order to discard any errors they may create and to test specific scenarios. Knowing exactly what information is passed to the fusion engine, it is possible to establish a "ground-truth" set of expected results, which is compared with the interpretation given by the fusion process. In this manner the performance of fusion engines can be evaluated. Some of the metrics defined to measure the quality of engines based on their performance were response time (the time between the instant the fusion engine receives its input and the instant it returns
an interpretation), efficiency (comparison of ground-truth data and produced output of the engine), adaptability (to context and user) and extensibility (the capability of supporting new or different input sources).
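A quality-adaptive weighted summation of the kind described in Section 4.2 might be sketched as follows; the quality measures, the normalization, and the opinion scores are illustrative assumptions rather than the scheme of any cited system.

```python
# Illustrative sketch of adaptive opinion fusion: expert opinions are
# combined by weighted summation, with weights derived from per-modality
# quality measures (e.g. lighting for face, SNR for speech). The
# normalization used here is an invented choice, shown only to fix ideas.

def fuse(opinions_by_modality, quality_by_modality):
    total_q = sum(quality_by_modality.values())
    weights = {m: q / total_q for m, q in quality_by_modality.items()}
    candidates = next(iter(opinions_by_modality.values())).keys()
    scores = {c: sum(weights[m] * opinions_by_modality[m][c]
                     for m in opinions_by_modality)
              for c in candidates}
    return max(scores, key=scores.get)      # MAX classification criterion

opinions = {
    "face":   {"alice": 0.9, "bob": 0.1},
    "speech": {"alice": 0.3, "bob": 0.7},
}
# Poor lighting degrades the face signal, so its quality measure drops:
print(fuse(opinions, {"face": 0.2, "speech": 0.8}))   # bob
print(fuse(opinions, {"face": 0.8, "speech": 0.2}))   # alice
```

The two calls show the adaptive effect: the same opinions yield different final decisions as the measured signal quality shifts weight between the modalities.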
5 Progress in GUIDE
As stated before, because GUIDE intends to put many loosely-coupled modalities at the user's disposal, the choice of a fusion approach becomes narrower, the possible candidates being decision-level or opinion-level fusion. Even though these two are very similar, opinion-level fusion, by setting weights for different modalities at specific points in time, allows the system to cope with the very particular needs of each user and adapt to them. User profiles are stored and constantly updated by the dialog manager of the system, which allows the fusion engine to access that data and adjust modality weights accordingly, based on user characteristics and context. All of the input data provided by users will also serve the purpose of analyzing interaction patterns so that profiles can be updated if needed. Since adaptation based on user disabilities and limitations is not yet a well-explored area of research, some strategies have to be developed in order to correctly map that data from user profiles into proper modality weights.
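One plausible shape for such a mapping, offered here purely as a hypothetical sketch since GUIDE's actual strategy was still to be developed, is a table of impairment-conditioned weight adjustments applied to base modality weights.

```python
# Hypothetical sketch of mapping user-profile data to modality weights,
# in the spirit of the adaptation GUIDE aims at. Impairment labels,
# adjustment values, and base weights are all invented.

BASE_WEIGHTS = {"speech": 1.0, "gesture": 1.0, "touch": 1.0}

ADJUSTMENTS = {
    "mild_tremor":     {"gesture": -0.5, "touch": -0.3},
    "speech_weakness": {"speech": -0.6},
    "low_vision":      {"touch": -0.2},
}

def weights_for(profile):
    w = dict(BASE_WEIGHTS)
    for impairment in profile["impairments"]:
        for modality, delta in ADJUSTMENTS.get(impairment, {}).items():
            w[modality] = max(0.1, w[modality] + delta)   # keep a floor
    return w

print(weights_for({"impairments": ["mild_tremor"]}))
# {'speech': 1.0, 'gesture': 0.5, 'touch': 0.7}
```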
6 Conclusion
In this paper we presented some of the state-of-the-art aspects of multimodal systems and their respective fusion engines. Different approaches to performing fusion are now available, as well as frameworks to quickly develop and integrate them into multimodal interfaces. Many of the ideas and concepts presented will have a significant impact on the design of the GUIDE project's multimodal adaptive framework and its respective fusion engine. In this project special focus is given to elderly users, who may have very particular needs and limitations. Taking that into account, adaptability is obviously a critical concern in a system such as GUIDE. Future work will involve studying some of the available frameworks in order to implement and evaluate our initial adaptive fusion algorithms and techniques, which will be greatly influenced by user models.
References
1. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. Human Machine Interaction 5440(2), 3–26 (2009)
2. Oviatt, S.L.: Multimodal interactive maps: Designing for human performance. Human-Computer Interaction 12(1), 93–129 (1997)
3. Sanderson, C., Paliwal, K.K.: Information fusion and person verification using speech & face information. Research Paper IDIAP-RR 02-33 1(33) (2002)
4. Poh, N., Bourlai, T., Kittler, J.: Multimodal information fusion. In: Multimodal Signal Processing Theory and Applications for Human Computer Interaction, p. 153. Academic Press, London (2010)
5. Lalanne, D., Nigay, L., Palanque, P., Robinson, P., Vanderdonckt, J.: Fusion engines for multimodal input: a survey. In: Proceedings of the 2009 International Conference on Multimodal Interfaces, pp. 153–160. ACM, New York (2009)
6. Lawson, L., Al-Akkad, A.-A., Vanderdonckt, J., Macq, B.: An open source workbench for prototyping multimodal interactions based on off-the-shelf heterogeneous components. In: Proceedings of the 1st ACM SIGCHI Symposium on Engineering Interactive Computing Systems EICS 2009, p. 245. ACM Press, New York (2009)
7. Dumas, B., Ingold, R., Lalanne, D.: Benchmarking fusion engines of multimodal interactive systems. In: Proceedings of the 2009 International Conference on Multimodal Interfaces ICMI-MLMI 2009, pp. 169–176. ACM Press, New York (2009)
Intelligent Working Environments for the Ambient Classroom
Maria Korozi, Stavroula Ntoa, Margherita Antona, and Constantine Stephanidis
Foundation for Research and Technology – Hellas (FORTH), Institute of Computer Science, GR-70013 Heraklion, Crete, Greece
University of Crete, Department of Computer Science
[email protected]
Abstract. This paper introduces a suite of Window Managers purposed for the technologically enhanced classroom. The overall objective is to instantiate a common look and feel across various classroom artifacts, thus providing a unified working environment for the students and teachers. To achieve optimal interaction and application display, the workspaces for each artifact are designed keeping in mind both the platform’s characteristics and the user’s requirements. The usability evaluation of the developed system is reported. Keywords: window managers, education, smart classroom, pervasive user interfaces.
1 Introduction
The benefits of using Information and Communication Technologies (ICTs) in education have been in the foreground for several years, as ICTs are acknowledged to be potentially powerful tools for educational change and reform. Students benefit particularly from the use of ICTs, since access to educational information is unlimited, the learning environment is enriched, active learning and collaboration are promoted, and motivation to learn is enhanced. In the recent past, learning with the use of ICT was strongly related to concepts such as distance learning, educational games, intelligent tutoring systems and e-learning applications. As a result, the notion of smart classrooms became prevalent in the past decade [9] and has drawn the attention of many researchers. Many have envisioned a smart classroom that consists of technologically enhanced artifacts purposed to serve teachers' and students' needs. Such artifacts could replace the traditional student and teacher desks as well as the classroom boards, in order to support the processes of disseminating and receiving knowledge.
The diversity of characteristics among the potential classroom artifacts outlines the need for a unified work environment for both students and teachers, and points out the importance of optimal interaction with the available applications that may travel among the various artifacts (i.e., desk, board). To address the aforementioned issues, this paper introduces the classroom Window Managers (cWMs) that host educational applications and facilitate learning activities.
A common look and feel is instantiated across the various classroom artifacts, thus transforming the classroom into a unified environment rather than a group of isolated units. In cooperation with the PUPIL system [4], a framework that includes, among other things, a library of adaptive education-oriented components, the cWMs instruct the applications to adapt appropriately to the needs of the targeted classroom artifact. Therefore, developers are relieved from having to build different versions of each application, and the use of cWMs ensures the applications' portability. The next sections present some related work in this domain, outline the characteristics of the proposed Window Managers, describe the heuristic evaluation performed, and finally summarize with conclusions and future work.
2 Related Work
During the past decade, research has focused on augmenting the classroom environment by assisting the teaching process, recording lecture activity, enabling collaboration among students, supporting remote student telepresence, and encouraging active student participation and contribution in the learning process.
The system reported in [3] aims to improve the teaching experience by offering automated device control and creating a suitable environment for each detected situation. For example, when an instructor logs on to the classroom computer, the system infers that a computer-based lecture will be given, automatically turns off the lights, lowers the screen, turns on the projector, and switches the projector to computer input. Similarly, Y. Shi et al. [8] try to facilitate lecturers by providing alternatives for command execution that do not require being in direct contact with the computer. In this approach, the teacher can use a traditional laser pen to indicate and circle a point of interest, while a camera detects the user's intention and sends a mouse click message to the computer.
In order to foster students' collaboration and active participation in the learning process, H. Breuer et al. [2] developed a platform to support the programming of distributed applications running on different platforms. In their approach, applications running on different platforms and sharing an object will automatically have the status of the object synchronized. An interactive whiteboard application is offered, named "DeepBoard", that implements a gesture-based interaction paradigm and a hierarchical semantic to store and retrieve data, created on the fly, by free-hand writing. Furthermore, collaboration among students is also supported in the Smart Classroom proposed by S. Yau et al. [10], in which each student owns a situation-aware PDA. Students' PDAs dynamically form mobile ad hoc networks for group meetings. Each PDA monitors its situation (locations of PDAs, noise, light, and mobility) and uses the related information to trigger communication activities among the students and the instructor for group discussion and automatic distribution of presentation materials.
The abovementioned approaches envision the "smart" classroom as an environment that assists the instructor and enables students' remote access, collaboration and active participation during lectures. However, the cWMs, in collaboration with the PUPIL system, approach this concept from the students' perspective. The student is provided with a
collection of workspaces tailored to the needs of the available artifacts (i.e., desk, board) that facilitate studying and transform boring school “chores” into pleasant activities.
3 Classroom Window Managers
A window manager is the system software that controls the placement and appearance of windows within a windowing system in a graphical user interface, and most window managers are designed to provide a desktop environment. The cWMs are engaged with an additional task as well, namely to ensure the optimal display of the launched applications according to the characteristics of the targeted classroom artifact.
The window managers introduced here are employed in the technologically enhanced ambient classroom, where every artifact incorporates situation-aware functionality offered by the ClassMATE platform [5]. The cWMs were developed following a user-centered design (UCD) process and focus on the students' needs gathered through a requirements elicitation process. Furthermore, each cWM has unique characteristics, as it is purposed for a specific artifact (i.e., student's desk, classroom board), trying to create an ideal workspace environment. Table 1 lists the available artifacts.

Table 1. The classroom artifacts supported by the classroom Window Managers

Artifact     Characteristics                           Resolution   Use
AmIDesk      Vision-based touch-enabled device         1600x600     desk
SmartDesk    Touch screen device                       1440x900     desk
AmIBoard     Vision-based touch-enabled device         1920x1200    board
SmartBoard   Touch-sensitive interactive whiteboard    1024x768     board
3.1 Core Modules
The cWMs incorporate mechanisms that transform the educational applications for optimal display in the various artifacts. To this purpose, the applications should be defined following a particular structure that is automatically translated by the cWMs, composing the appropriate application version. A generic application layout contains: (i) the content, (ii) a primary navigation area, visualized as tab items, for navigating through the different content categories, and (iii) a secondary navigation area to facilitate navigation among the various subcategories of the selected content category. In order to conform to the aforementioned specification, every cWM-enabled application defines its skeleton using the following XAML elements:
(i) App: denotes the root of the Application Tree hierarchy, which subsequently defines a single application (e.g., the Multimedia application).
(ii) Applet: denotes a distinct content category (e.g., the Image Displayer or the Video Displayer of the Multimedia application). At runtime every Applet is translated into a Tab item and populates the primary navigation area.
(iii) AppletComponent: designates a sub-division of an Applet, and at runtime it is translated into a menu item of the secondary navigation area. An illustrative example is the Image Displayer of the Multimedia application, which can be divided into the "Book Images" and "Wikipedia Images" subcategories.
A designer uses the aforementioned elements to build the appropriate hierarchy and create the Application Tree. Subsequently, each AppletComponent can be populated using either native WPF or PUPIL widgets, thus building the UI of the application. The following example presents the Application Tree of a Multimedia application that can on the one hand host images from different sources and on the other hand display videos relevant to these images.
Fig. 1. The Application Tree of a Multimedia application
Figure 1 is a graphical representation of the application layout composition according to the Application Tree. The composition process consists of the following steps (a simplified sketch of the translation follows the list):
• The Applet elements generate the application's tab items and menus – one menu for each tab item
• The AppletComponent elements generate menu items that populate the respective menus
• The actions of the generated tab and menu items are defined so that each tab item activates a menu and each menu item displays the appropriate content
• The children of AppletComponent elements are appropriately adapted to the artifact needs to result in optimal content presentation.
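A drastically simplified rendition of this translation is sketched below in Python, standing in for the actual WPF/XAML machinery; the tree encoding and widget names are illustrative only.

```python
# Drastically simplified sketch of the Application Tree translation
# described above (Python stand-in for the WPF/XAML machinery; the
# tree format and widget names are illustrative only).

multimedia_app = ("App", "Multimedia", [
    ("Applet", "Image Displayer", [
        ("AppletComponent", "Book Images", ["image_grid"]),
        ("AppletComponent", "Wikipedia Images", ["image_grid"]),
    ]),
    ("Applet", "Video Displayer", [
        ("AppletComponent", "Related Videos", ["video_list"]),
    ]),
])

def compose_layout(app, artifact):
    _, app_name, applets = app
    layout = {"app": app_name, "tabs": []}
    for _, applet_name, components in applets:       # Applet -> tab + menu
        tab = {"title": applet_name, "menu": []}
        for _, comp_name, widgets in components:     # AppletComponent -> menu item
            # Content widgets would be adapted here to the artifact's
            # characteristics (resolution, input method, ...).
            tab["menu"].append({"item": comp_name, "content": widgets})
        layout["tabs"].append(tab)
    return layout

print(compose_layout(multimedia_app, artifact="AmIDesk")["tabs"][0]["title"])
```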
The cWMs support application layouts that can be customized at design time through a number of available options of the Application Tree (e.g., menu orientation, tab orientation and style, etc.); however, device characteristics may lead to their modification at run time. For instance, consider a designer who creates an application having in mind the AmIDesk artifact. The designer selects the horizontal version of the menu and places it above the main content. The same application, if launched on the SmartBoard, will have a different layout where the menu will maintain its horizontal orientation but will be placed below the main content to be reachable by younger students. The variety of placement alternatives for every main UI component (menu, tabs, main content and application previews) results in more than one layout for each artifact (AmIDesk, Netbook, SmartBoard, etc.).
As far as the content UI is concerned, the WPF WrapPanel container was selected as the default container to host all the included widgets in order to take advantage of all available space that an application might exploit. In more detail, the WrapPanel is used to position child elements in sequential position from left to right, breaking content to the next line when it reaches the edge of its parent container. According to this behavior, two widgets displayed next to each other on a big screen would be rearranged on a smaller screen so that the second widget goes to the next line.
Nevertheless, a system that only supports UIs where all the contained UI elements wrap in the available space does not provide an optimal solution. The designers should be given more flexibility when building the UIs, without compromising the application's usability and appealing appearance when traveling among artifacts.
To address these issues, the group and breakLine designer elements were introduced. The group element enables the application designer to denote that all the contained widgets must always be placed on the same line. However, forcing the display of a number of widgets on a single line entails the risk of hiding some of them due to space limitations. In such a case, since the designer's directive was single-line occupancy, the strategy is to scale down the contained widgets to fit the designated area without hiding any of them. On the contrary, the use of the breakLine element forces the following widget to be displayed on the next line.
3.2 AmIDesk and SmartDesk Window Managers
The AmIDesk [1] and SmartDesk Window Managers have similar characteristics, since they are both purposed for the student's classroom desk. Their key feature is to display two applications simultaneously to enable the student's parallel work. In addition, they support working area organization by offering application sliding from left to right and vice versa. Since both the AmIDesk and SmartDesk artifacts (see Figure 2) offer widescreen resolutions, any application that needs to display additional information, such as history or help, can take advantage of the extra horizontal space and launch a new application next to it with the relevant content.
Fig. 2. The window manager layouts for the SmartDesk (a) and the AmIDesk (b). Screenshot of the AmIDesk window manager with two launched applications (c).
Toolbar. The Toolbar area is a specific region offered by the AmIDesk and SmartDesk Window Managers where useful shortcuts are displayed. The Personal Area shortcut is always visible, allowing the student to easily browse through educational/personal material. The lower area of the Toolbar is populated with course-related applications; when a shortcut is selected, the respective application is launched.
Clipboard. The Clipboard temporarily holds instances of applications (e.g., for one school hour) and disposes of its contents before the next class (the student will be asked whether to save the pinned application instances in the Personal Area). As soon as an application is pinned to the Clipboard, its state is stored, and the student can restore it by selecting the respective application thumbnail. A student can thus pin an application instance in order to access it later. The Clipboard permits browsing the pinned thumbnails through scrolling. When a thumbnail is pressed, it disappears from the Clipboard and the relevant application is either launched, if suspended, or brought to the front displaying the data recalled through the pinned instance. Finally, the pinned items can be removed by striking through them (right to left); for mass removal an "Unpin All" button is provided.
PieMenu. The Pie Menu (Figure 3a) is a round menu that appears when the student touches the screen at the same point continuously for more than 400 milliseconds, displaying a number of options. The Pie Menu allows reorganizing the launched applications. The menu includes on the one hand commands such as sliding applications from left to right and vice versa, and on the other hand commands such as sending application instances to the Clipboard, the class board and the teacher. When the Pie Menu is activated, the student can either execute a command by sliding a finger over the desired option, or deactivate the popup menu by lifting the finger from the screen. When the menu is activated, only one sliding option is displayed, according to the menu's position: if the menu is launched on the left part of the screen, only the "slide right" option is visible. The two sliding options appear at the left and right of the Pie Menu respectively, fulfilling the natural mapping design principle. Accordingly, the "send to class board" option was placed at the top center of the round menu to take advantage of the physical board's placement in a classroom. When an application instance is sent to the class board, the same application is launched at the board artifact, displaying the same data for all the students. Finally, the "send to teacher" option is visible only when the selected application is an exercise that needs to be submitted to the teacher.
Application Switcher. The Application Switcher (Figure 3b) is a utility activated when the student touches a specific area on the screen (the bottom right corner for the AmIDesk and the top right corner for the SmartDesk). It displays thumbnails previewing the launched applications, and the student can select one or two applications to view. When an application thumbnail is touched, the Application Switcher collapses and the respective application(s) come to the front. When active, the Application Switcher can be closed by touching anywhere else on the screen.
Placement Decision Making Component. The fact that the AmIDesk and SmartDesk Window Managers support the presentation of two applications simultaneously raised the need for a Placement Decision Making Component. This component decides the placement (left or right) of each newly launched application, ensuring that an application and its additional information will be launched next to each other. Furthermore, the algorithm prioritizes the most recently used application by not placing the new one on top of it, keeping the student's working area in focus.
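A minimal sketch of such a placement policy follows; the two-slot model and the tie-breaking heuristics are invented approximations of the behavior just described, not the component's actual algorithm.

```python
# Invented sketch of a two-slot placement policy: keep an application
# next to its auxiliary content, and avoid covering the most recently
# used slot when both slots are occupied.

def place(new_app, slots, last_used, related_to=None):
    """slots: {'left': app_or_None, 'right': app_or_None}."""
    if related_to in slots.values():
        # Auxiliary content goes beside the application it belongs to.
        owner = "left" if slots["left"] == related_to else "right"
        target = "right" if owner == "left" else "left"
    elif None in slots.values():
        target = "left" if slots["left"] is None else "right"
    else:
        # Both occupied: replace the one that was NOT used most recently.
        target = "right" if last_used == "left" else "left"
    slots[target] = new_app
    return target

slots = {"left": "book", "right": "dictionary"}
print(place("book_help", slots, last_used="left", related_to="book"))  # right
```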
Fig. 3. The Pie Menu (a) and the Application Switcher (b)
Glass Input Handler. The Glass Input Handler is a transparent mechanism that supports the Pie Menu commands of the AmIDesk and SmartDesk Window Managers. The concept of this mechanism was based on Java's Glass Pane [7], which is like a sheet of glass over all the UI elements, intercepting input events. Similarly, the Glass Input Handler handles the touch events transparently and either propagates them to the respective applications or launches the Pie Menu. When a "touch down" event (the user's finger contacted the touch-sensitive screen) is raised, it must be checked whether the student was aiming at an application or intending to activate the Pie Menu. Since a single "touch down" event indicates that the student may intend to activate the Pie Menu, the Glass Input Handler is responsible for monitoring the sequence of the upcoming events and deciding whether to simply propagate them or launch the Pie Menu.
Application Layout. The AmIDesk and SmartDesk artifacts support various layout alternatives that can be specified at design time during the Application Tree construction, or at runtime according to artifact characteristics (see Figure 4).
Fig. 4. Layout alternatives for the desk window managers
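The Glass Input Handler's dispatch logic might be sketched as follows. Only the 400 ms long-press threshold comes from the Pie Menu description above; the event names, the forwarding API, and the choice to resolve the gesture on release (a real handler would more likely start a timer during the press) are all illustrative assumptions.

```python
# Invented sketch of the Glass Input Handler's dispatch logic. Only the
# 400 ms long-press threshold is taken from the text; event names and
# the forwarding API are illustrative.

LONG_PRESS_MS = 400

class GlassInputHandler:
    def __init__(self, forward, open_pie_menu):
        self.forward = forward            # propagate event to applications
        self.open_pie_menu = open_pie_menu
        self.down_at = None               # (timestamp_ms, position)

    def on_event(self, kind, t_ms, pos):
        if kind == "touch_down":
            self.down_at = (t_ms, pos)    # might become a long press: hold it
        elif kind == "touch_up" and self.down_at:
            t0, p0 = self.down_at
            self.down_at = None
            if t_ms - t0 >= LONG_PRESS_MS:
                self.open_pie_menu(p0)    # stationary long press: Pie Menu
            else:
                self.forward("tap", p0)   # short tap: let the app handle it
        elif kind == "touch_move" and self.down_at:
            self.forward("drag", pos)     # movement cancels the long press
            self.down_at = None

h = GlassInputHandler(forward=print, open_pie_menu=lambda p: print("pie", p))
h.on_event("touch_down", 0, (10, 10))
h.on_event("touch_up", 500, (10, 10))     # prints: pie (10, 10)
```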
3.3 SmartBoard and AmIBoard Window Managers
The SmartBoard and AmIBoard Window Managers are intended for a board artifact. They both provide the Toolbar and Sidebar features, while the AmIBoard Window Manager additionally supports a layout that integrates application-related previews to utilize the extra space offered by its higher resolution (see Figure 5).
Application Launcher. As a board artifact is aware of the current course, it was necessary to determine an appropriate placeholder for displaying course-related application shortcuts. The concept of placing these shortcuts on the Toolbar area was discarded, as it was quite likely that some of them might become unreachable for younger learners. For that reason, a new module was introduced, the Application Launcher, which displays a grid of application shortcuts following a bottom-up placement algorithm.
Fig. 5. The window manager layouts for the SmartBoard (a) and the AmIBoard (b). Screenshot of the AmIBoard window manager displaying the Multimedia application (c).
Toolbar. The Toolbar items of a board artifact are the Application Switcher shortcut, the Application Launcher shortcut, and the application close control. The Board, as opposed to the Netbook artifact, does not display information concerning an individual student, thus the Personal Area shortcut was redundant. Taking into consideration the Toolbar's vertical placement, its contained items anchor at the bottom to be reachable by all students.
Previews. Previews take advantage of the AmIBoard's extra space. They display, in parallel to the currently selected content, snapshots of other information available through the same application. The designer can easily employ this feature by selecting the appropriate preview-enhanced layout that gives users a glimpse through the application's categories/subcategories. Figure 5c presents such a layout for an enhanced Dictionary application. While the students observe images related to the word "remarkable", the preview placeholders may display the definition of the word and/or some relevant videos and audio. When the student or teacher points to one of the preview areas, the respective category/subcategory is displayed.
Application Layout. The application layouts supported by the board Window Managers are depicted in Figure 6. Their key feature is that the Tab and Menu placeholders are lower on the screen, thus making them easily accessible to younger students.
Fig. 6. Layout alternatives for the board window managers
4 Heuristic Evaluation
As the development of the Window Managers had advanced, a number of educational applications were built (e.g., Book Application, Multimedia Application) in order to evaluate the cWMs. A heuristic evaluation [6] of the SmartDesk Window Manager was conducted.
The evaluation of the SmartDesk was performed by five evaluators. The usability issues discovered were noted down by an observer, who was monitoring the evaluation progress. The most important discovered issues are:
I1. The Pie Menu should not be activated when no applications are launched
I2. The Application Switcher should not be activated when no applications are launched
I3. The application slide should be performed by direct gestures
I4. Applications should allow maximization
I5. When an application is restored from the Clipboard it is unpinned. However, applications should remain pinned to the Clipboard until the user decides otherwise.
I6. Some Pie Menu icons were not intuitive
The chart presented in Figure 7 illustrates the heuristic evaluation results, ordered according to their severity. The most severe usability issues, those ranked higher than "major" usability problems (>3), are marked in red color. These findings are the most imperative to fix. The issues ranked higher than "minor" (>2), represented in the figure by the orange-colored bars, are next in turn to be fixed. Issues marked in blue color range from "just aesthetic" problems (severity rating 1) to "minor" problems (severity rating 2) and will be fixed before the next system version becomes available.
Fig. 7. Heuristic Evaluation results
5 Conclusions and Future Work
The main objective of this work was to lay the foundation towards the realization of a unified working environment inside the ambient classroom, where the students are constantly supported during their everyday educational activities. Each classroom artifact is equipped with an appropriate workspace (Window Manager) keeping in mind its special characteristics and the user's goals of use. Since the cWMs were developed following a user-centered design process, the user needs and goals became clearer. On the other hand, the feedback received through the heuristic evaluation led to the discovery of significant usability issues.
Additional steps should be taken to fully support the initial concept. The first step is to conduct a full-scale user-based evaluation of the currently developed cWMs, in order to acquire additional feedback from the end-users. The evaluation experiment should include scenarios that engage all the supported artifacts, so as to record and analyze their acceptance by users in real-life situations. Furthermore, since most students nowadays possess numerous mobile devices (i.e., laptop, PDA), such devices should be incorporated in the enhanced classroom. Finally, teacher's support should
be included to assist the "maestro" of the class. A completely different requirements elicitation process will be followed, as the teacher's goals differ from the students' goals, while the component-based architecture of the existing system will facilitate the integration of additional software (e.g., teacher-oriented applications) with minimal cost.
References
1. Antona, M., Margetis, G., Ntoa, S., Leonidis, A., Korozi, M., Paparoulis, G., Stephanidis, C.: Ambient Intelligence in the classroom: an augmented school desk. In: Karwowski, W., Salvendy, G. (eds.) Proceedings of the 2010 AHFE International Conference (3rd International Conference on Applied Human Factors and Ergonomics), Miami, Florida, USA, July 17-20 (2010)
2. Breuer, H., Baloian, N., Sousa, C., Matsumoto, M.: Interaction design patterns for classroom environments. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4553, pp. 163–172. Springer, Heidelberg (2007)
3. Cooperstock, J.: Classroom of the Future: Enhancing Education through Augmented Reality. In: HCI International, pp. 688–692 (2001)
4. Korozi, M.: PUPIL: pervasive UI development for the ambient classroom (Master's thesis) (2010), e-Locus, http://elocus.lib.uoc.gr/dlib/a/e/2/metadata-dlib81a07682706c2163d8f582245fd9edfd_1288689489.tkl
5. Leonidis, A.: ClassMATE: Classroom Multiplatform Augmented Technology Environment (Master's thesis) (2010), e-Locus, http://elocus.lib.uoc.gr/dlib/d/3/6/metadata-dlibd19ded9b6c0938f4723672b56b78ebe2_1288598057.tkl
6. Nielsen, J.: http://www.useit.com/papers/heuristic/
7. Oracle, The Java Tutorials - How to Use Root Panes, http://download.oracle.com/javase/tutorial/uiswing/components/rootpane.html
8. Shi, Y., Xie, W., Xu, G., Shi, R., Chen, E., Mao, Y., Liu, F.: The smart classroom: Merging technologies for seamless tele-education. IEEE Pervasive Computing, 47–55 (April-June 2003)
9. Xu, P., Han, G., Li, W., Wu, Z., Zhou, M.: Towards intelligent interaction in classroom. In: Stephanidis, C. (ed.) UAHCI 2009. LNCS, vol. 5616, pp. 150–156. Springer, Heidelberg (2009)
10. Yau, S.S., Gupta, S.K.S., Karim, F., Ahamed, S.I., Wang, Y., Wang, B.: Smart Classroom: Enhancing Collaborative Learning Using Pervasive Computing Technology. Engineering, 1–9 (2003)
Adaptive Interfaces: A Little Learning is a Dangerous Thing… Kyle Montague, Vicki L. Hanson, and Andy Cobley School of Computing University of Dundee Dundee DD1 4HN UK {kylemontague,vlh,acobley}@computing.dundee.ac.uk
Abstract. In this paper we present a possible approach to improving the accessibility and usability of software applications through shared user models. Previous work in adaptive user interfaces has relied on local, domain-specific user models, which lack scope and detail. Shared user models can increase the accuracy and depth of the data used to adapt interfaces and user interactions. This research focuses on the accessibility of touch screen mobile devices for users with low vision and mobility impairments.

Keywords: adaptive interfaces, mobile computing, low vision, physical disability.
1 Introduction

Personalised applications and websites are becoming increasingly popular, and many computer users interact with some form of user-adaptive system on a daily basis. Contemporary adaptive educational systems (AES) tend to be driven by two factors: the user's domain knowledge and the user's interests in the session [1]. Currently, domain knowledge is more dominant within the user model, whilst non-educational systems give more equal weight to the user's interests. Both the user's domain knowledge and interests rely on local user models. In this case, a little knowledge may be a dangerous thing [2]. These adaptive systems store a local user model of data collected within the scope of the application. The current research investigates shared user modelling as a possible technique to improve the accuracy and detail of user data to create more effective mobile device interactions, concentrating on improving the accessibility of the technology for users with low vision and mobility needs.
It is vital that computer technologies and software are usable by people with very diverse skills and needs. Guidelines exist to give a generalised view of accessibility issues and of techniques to reduce these issues, prompting developers to design accessible, usable, flexible, and intuitive solutions. Whilst this has encouraged designers to take a more user-centred approach, the final products may not be fully accessible. Technology users are unique, with individual needs and requirements of software tools, and may not necessarily fit designers' generalised abstract group
representations. Adaptive systems [3] attempt to bridge the gaps between design and user needs by providing personalised designs. They do this by personalising content based on domain-specific user models, which provides a valuable structure for educational systems intended to develop and progress users' domain knowledge. Adaptive systems often use a multi-layer interface design [4], with a spectrum of layers of diverse complexity and functionality for users of different skill sets. Adaptive systems allow content to be presented and tailored to meet the needs and interests of individual users. Therefore, the more realistic the user model, the higher the accuracy of the personalisation and accessibility of applications. This research focuses on the use of shared user models with touch-capable mobile devices.
2 User Models

User models are structured data sources for representing user characteristics such as interests, preferences and domain knowledge. All the data is recorded in the context of the system it was stored for, and self-contained within the application. Shared user modelling expands the range of variables being recorded, and allows multiple applications to make use of the same data. Such models already exist, with the Facebook Open Graph protocol [7] being one such example. Shared user models need to accommodate a wide and diverse range of systems, requiring them to contain additional data that allows the abstract shared data to be put into context for individual applications or uses. We therefore break the model into the following architecture: User, Application, Device and Environment. Within the User section we record interests and user preferences or attributes. The Application section records data on the user's domain knowledge and goals within the current application. Any data about hardware preferences and capabilities is recorded in the Device section, and attributes of the user's surroundings during system use are logged in the Environment section.

2.1 Web Services

With shared models, all applications must be able to obtain and update information stored within the user model. This is why the User Model Web Service (UMoWS) [3] structure was selected. The user models are contained within a database accessed only by the UMoWS. Any application wishing to make use of the shared user models must communicate via the web service methods of the UMoWS using unique authentication credentials. Applications can request and update information on a selected user as long as they have permission to do so. Connected to the UMoWS is a User Management control panel, allowing users to view and edit their model information and the access permissions for this content. The diagram in Figure 1 shows the overall structure and how applications can make use of the shared user models.
As can be seen from Figure 1, the only way to access the user models is via the UMoWS. To ensure the security of users' information, all access to the web services requires authentication, meaning applications are required to register for an API key. Before an application can retrieve or store any user data, the user must first grant it access. Access permissions can only be granted by the user,
using their login credentials via the User Management control panel. The control panel allows the user to modify or remove any granted permissions or stored information. Once an application has the appropriate permissions, it can then use the permitted web services to access the shared user model. The web services included will be the standard operations for retrieving a user's model with specific fields, adding new information, and adding new fields to be recorded (useful for domain-specific data). The Simple Object Access Protocol (SOAP) will ensure that any application wishing to access the web services can do so using a standard format.
Fig. 1. UMoWS structure showing how applications access the shared user model data via SOAP requests
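To make the access path concrete, the sketch below shows how a client application might issue such a SOAP request in Java. The endpoint URL, XML namespace, operation name (getUserModel) and its parameters are illustrative assumptions — the paper does not publish the actual UMoWS interface — but the envelope-over-HTTP-POST shape is standard SOAP usage.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Scanner;

// Hedged sketch of a UMoWS call; every name below (endpoint, namespace,
// operation, parameters) is hypothetical, for illustration only.
public class UmowsClientSketch {
    public static void main(String[] args) throws Exception {
        String envelope =
            "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
          + "<soap:Body>"
          + "<getUserModel xmlns=\"http://example.org/umows\">"
          + "<apiKey>APP-1234</apiKey>"             // per-application credentials
          + "<userId>user-42</userId>"
          + "<fields>preferences,device</fields>"   // request only the sections needed
          + "</getUserModel>"
          + "</soap:Body>"
          + "</soap:Envelope>";

        URL url = new URL("https://example.org/umows");  // hypothetical endpoint
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            out.write(envelope.getBytes("UTF-8"));   // send the SOAP request
        }
        try (Scanner in = new Scanner(con.getInputStream(), "UTF-8")) {
            while (in.hasNextLine()) {
                System.out.println(in.nextLine());   // print the SOAP response
            }
        }
    }
}
```

An update operation (storing new information or new fields) would follow the same pattern with a different operation element, which is why a single authenticated endpoint can serve every registered application.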
2.2 Data Acquisition

Applications will have their own techniques for collecting data on the user's domain knowledge; however, it is important that the shared user model also contains information about the application itself. This allows domain-specific data to be put in context and helps build relationships between the concepts within the user model. All applications will be registered with details outlining the scope of the system and the goals of the application, allowing comparisons and connections to be identified between different applications and their domain activity. The device and environment information should also be logged in the user model with each session the user has. This includes details on the device's hardware and software capabilities, showing the possible interactions that can be carried out on the device, and environmental details such as noise, light levels, and whether the user is mobile or stationary, to help identify the relevant interactions for the current session.
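As an illustration of what such per-session logging might contain, here is a minimal sketch of a session record covering the Device and Environment sections; every field name is an assumption chosen to match the prose above, not a published schema.

```java
// Illustrative per-session record for the Device and Environment sections
// of the shared user model. Field names are assumptions for illustration.
public class SessionContext {
    // Device section: hardware/software capabilities, i.e. which
    // interactions are possible on this device
    String deviceModel;
    boolean multiTouch;            // supports multi-touch gestures
    boolean vibrationFeedback;     // can substitute for tactile button feedback
    int screenWidthPx, screenHeightPx;

    // Environment section: the user's surroundings during this session
    double ambientNoiseDb;         // e.g. sampled from the microphone
    double ambientLightLux;        // e.g. sampled from the light sensor
    boolean userIsMobile;          // moving vs. stationary

    final long sessionStartMillis = System.currentTimeMillis();
}
```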
3 Adaptive Interfaces

The goal of adaptive interfaces is to improve the interactions a user has with the software based on knowledge of the user and their experience within the application
[8]. Adaptive interfaces tend to retain the same overall layout and interaction style for each user, whilst personalising the content being presented. Multi-layer interface design works differently [4]. With these, users are presented with an interface layer that matches their application skill level. As users progress and become more skilled, new features are opened up and added to the interface. Multi-layer adaptive interfaces combine the user modelling and domain knowledge monitoring techniques of adaptive systems with custom interface layers for different characteristics or situations. The diagram in Figure 2 shows how a single application created with a multi-layer interface design could provide individual users with tailored interfaces and interaction methods. Adaptive interfaces allow applications to cater for a wider range of users with distinct characteristics and needs, rather than the popular one-size-fits-all solutions we see all too often in software today.
Fig. 2. Structure showing multi-layer interfaces being used to personalise an application for individual users
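The layer-selection step can be pictured as a simple function of the user model. In this hedged sketch the layer names, the skill score and its thresholds are all invented for illustration; the point is only that the interface layer shown is chosen from the model at runtime rather than fixed at design time.

```java
// Sketch: choose a multi-layer interface layer from a modelled skill level.
// Layer names and thresholds are illustrative assumptions.
enum InterfaceLayer { BASIC, INTERMEDIATE, FULL }

class LayerSelector {
    static InterfaceLayer forSkill(int skillScore) {
        if (skillScore < 30) return InterfaceLayer.BASIC;        // core features only
        if (skillScore < 70) return InterfaceLayer.INTERMEDIATE; // common options added
        return InterfaceLayer.FULL;                              // all features exposed
    }
}
```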
4 Touch Screen Mobile Devices

Although touch screen capabilities are becoming more common in desktop computers and laptops, there has been a shift in the balance of technologies used to access news, information and the Internet in general. In recent years we have seen an increase in the uptake of both new types of devices and new styles of interaction. In the UK, for example, the overall uptake of communication devices and services has increased over the last three years [5, 6]. With technologies such as digital television and personal video recorders (PVRs) comes a higher level of functionality and menu complexity than in their predecessors, requiring greater understanding and knowledge from the user. Whilst the new input devices have a similar form to the older devices, the methods of
use and cognitive load have increased to incorporate the new functionality of the systems. Reports [5, 6] also show that the number of people with access to broadband has increased, suggesting an increased use of web services and exposure to web browser applications. Of the top ten smartphones in 2009, two were touch screen devices; by 2010, eight of the top ten were touch screen devices [9], seven of which use multi-touch displays.
Touch screens allow for very flexible interfaces and interaction methods. However, most lack the tactile feedback that a physical button provides, creating new complications when informing the user of a successful button press. Some manufacturers have elected to substitute the feedback of a physical button with device vibrations and/or notification sounds, but no standard technique has emerged, causing inconsistencies between devices, and even between applications on the same mobile phone. Touch screen devices also allow for greater flexibility when creating user interfaces; there are fewer restrictions on the types of layouts and controls possible compared with non-touch input devices. Work is currently being carried out investigating techniques to make touch screen devices more accessible for users with physical impairments that make touch screens difficult to use. That research has focused on target sizes, the positioning of buttons and the use of gestures [10].
5 Mobile Indoor Navigation Tool

Our initial prototype for shared user modelling is a tool that assists users with low vision and mobility needs with indoor navigation [11]. By loading the shared user model, the tool is able to customise the route taken from a start point to a destination, based on data that might affect a user's way-finding capabilities and interaction needs. The interface is individually tailored to meet the needs of the user, providing the instructions in the format that best suits their characteristics and preferences, as shown in Figure 3.
A pilot study was run with a small number of older adults (4 in total) aged between 60 and 86. None of the participants had any prior experience with touch screen smartphones. Most, however, picked up the interaction method of pressing the buttons they could see on screen relatively quickly. One participant did encounter a number of issues when using the navigation tool. Originally the researcher believed this to be a fault in the application's communication with the web services when retrieving the navigation instructions. The tool was reset and the route restarted. This time the researcher observed the way-finding task and identified that the user was touching the on-screen button multiple times within a short space of time due to a slight hand tremor. The false positives from this participant's touches made the technology very challenging to use. The researcher was able to explain the problem to the participant before the final task was performed. Shown in Figure 4 are the logs from the routes prior to and following the explanation of the issue. The logs from the first route show that the participant was skipping ahead through instructions and then having to select the “previous”
Fig. 3. Screen shot of the navigation tool instruction interface (left) and the corresponding user profile showing interaction preferences (right)
Fig. 4. Graphs of touch interaction logs from indoor navigation user study
button to get back to the correct one. By default, the touch screen smartphone does not threshold the duration between touches. The graph shows all the points where the duration between touches was less than 1 second. The second graph shows the logs after the situation with the application was explained to the user. The second log is more like the result we would hope to see from the application, with just one error where the “next” button was double-tapped, resulting in the user skipping from step 8 to step 10 within the way-finding sequence. The prototype was later modified to include catches for this type of interaction problem. Further work is being carried out to personalise the threshold durations that decide whether a touch is intended or not.
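The catch described above amounts to a per-user debounce filter. The following Java sketch shows one plausible form of it; the class and its API are ours rather than the prototype's code, and the 1000 ms example value simply mirrors the one-second window used in the log analysis, with the personalised threshold expected to come from the shared user model.

```java
// Sketch of a per-user touch debounce filter: touches arriving within a
// threshold of the previous accepted touch are treated as unintended
// repeats (e.g. from a hand tremor) and dropped.
class TouchDebouncer {
    private final long thresholdMillis;   // personalised, e.g. from the user model
    private long lastAcceptedMillis;
    private boolean seenTouch = false;

    TouchDebouncer(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    /** Returns true if the touch should be delivered to the application. */
    boolean accept(long touchTimeMillis) {
        if (seenTouch && touchTimeMillis - lastAcceptedMillis < thresholdMillis) {
            return false; // too soon after the last accepted touch: likely unintended
        }
        seenTouch = true;
        lastAcceptedMillis = touchTimeMillis;
        return true;
    }
}

// Example: new TouchDebouncer(1000) reproduces the 1-second window
// used when analysing the pilot study logs.
```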
6 Mobile TV Guide

To evaluate the effectiveness of shared user modelling as a method of improving the accessibility of touch screen devices, there must be at least two applications. Therefore, a second prototype mobile app is being developed to assist users with browsing electronic programme guides (EPGs) for digital television (DTV). Ofcom reports [5, 6] define content accessibility targets for DTV public sector channels. For subtitling, they state a minimum percentage of 90%; for audio description, 10%; and for
Fig. 5. Screen shot of TV guide prototype showing shortlisted content based on shared user model (left), low vision optimised version of the interface (right)
signing, 5%. With such a low percentage of programmes available in audio-described and signed formats, users could face great difficulty searching for a programme they would like to watch in the format they need. With regard to the additional (non-public-sector) channels, they may never find a programme they can access fully. The mobile TV guide app combines the rich user profile data from the shared user model with the EPG information to assist the user in finding shows that are both in an accessible format and of interest to them. The prototype will shortlist content based on the individual's access needs and present the information back to the user in the format that best suits their characteristics. Screen shots of the app and of an adapted version for a low vision user are shown in Figure 5. Interface customisations will include scaling and hue shifting of all text, images and buttons.
The mobile TV guide and indoor navigation app are two very distinct systems; however, they share a great deal of overlap in terms of interaction with the mobile device. The second prototype will allow this overlap, and the effectiveness of shared user modelling, to be properly evaluated as a potential technique for improving the accessibility of touch screen mobile devices for users with low vision and mixed levels of mobility.
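A hedged sketch of the shortlisting step follows. The Programme fields and the shape of the access-needs flags are illustrative assumptions, since the paper does not specify a data model, but the filter logic — keep only programmes available in every format the user requires — follows the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative EPG entry; real guide data would carry per-programme
// subtitle / audio-description / signing flags in some broadcaster format.
class Programme {
    String title;
    boolean subtitled, audioDescribed, signed;
}

class EpgShortlister {
    // Access-needs flags would be read from the shared user model.
    static List<Programme> shortlist(List<Programme> epg,
                                     boolean needsSubtitles,
                                     boolean needsAudioDescription,
                                     boolean needsSigning) {
        List<Programme> out = new ArrayList<>();
        for (Programme p : epg) {
            if (needsSubtitles && !p.subtitled) continue;
            if (needsAudioDescription && !p.audioDescribed) continue;
            if (needsSigning && !p.signed) continue;
            out.add(p); // accessible in every format this user requires
        }
        return out;
    }
}
```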
7 Conclusion

Touch devices afford the flexibility to create interesting methods of user interaction, but the lack of consistency between interfaces creates a number of usability and accessibility issues. Designers cannot rely on a one-size-fits-all approach to touch screen interface design, and existing adaptive systems are not always able to make the correct adaptations due to the lack of accuracy and scope in the local user model. Detailed shared user models would allow developers to build applications that can be customised at runtime to suit individuals' needs, helping to ensure that applications are accessible and usable by all, regardless of individual user differences, devices and environments.
Full studies of the indoor navigation tool and mobile TV guide with visually and physically impaired participants are currently being conducted. The prototype has been extended to capture more detailed interaction data: not just the order and timestamps of button presses, but also the duration of a single touch (press and release) and the touch position within the target element. This new data will allow further refinement of the threshold filtering of intentional interactions and the optimisation of element target sizes on an individual basis. These studies should allow for a detailed comparison between one-size-fits-all, local-model and shared-user-model adaptive systems.

Acknowledgments. This research was supported by RCUK grant number EP/G066019/1 under the Digital Economy programme.
References

1. Brusilovsky, P., Sosnovsky, S., Shcherbinina, O.: User modeling in a distributed E-learning architecture. In: Ardissono, L., Brna, P., Mitrović, A. (eds.) UM 2005. LNCS (LNAI), vol. 3538, pp. 387–391. Springer, Heidelberg (2005)
2. Pope, A.: An Essay on Criticism, Correspondence (1706)
3. Brusilovsky, P., Millán, E.: User models for adaptive hypermedia and adaptive educational systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) The Adaptive Web. LNCS, vol. 4321, pp. 3–53. Springer, Heidelberg (2007)
4. Shneiderman, B.: Promoting universal usability with multi-layer interface design. In: Proceedings of the 2003 Conference on Universal Usability, pp. 1–8. ACM, Vancouver (2003)
5. Ofcom: Communications Market Report (2010), http://stakeholders.ofcom.org.uk/binaries/research/cmr/753567/CMR_2010_FINAL.pdf (retrieved 12/01/2011)
6. Ofcom: Communications Market Report (2009), http://stakeholders.ofcom.org.uk/binaries/research/cmr/cmr09.pdf (retrieved 12/01/2011)
7. Facebook: Open Graph protocol, http://developers.facebook.com/docs/opengraph (retrieved 12/01/2011)
8. Langley, P.: User modeling in adaptive interfaces. In: Proceedings of UM 1999: User Modeling, pp. 357–370 (1999)
9. AdMob: AdMob Mobile Metrics, Metrics Highlights (2010)
10. Guerreiro, T., Nicolau, H., Jorge, J., Gonçalves, D.: Towards accessible touch interfaces. In: Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2010), pp. 19–26. ACM, New York (2010)
11. Montague, K.: Accessible indoor navigation. In: Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2010), pp. 305–306. ACM, New York (2010)
A Novel Design Approach for Multi-device Adaptable User Interfaces: Concepts, Methods and Examples Alexandros Mourouzis1, Asterios Leonidis2, Michalis Foukarakis2, Margherita Antona2 and Nicos Maglaveras1,3 1
Centre for Research and Technology Hellas (CERTH), Institute of Biomedical and Biomolecular Research, GR-57001 Thessaloniki, Greece {mourouzis,nicmag}@certh.gr 2 Foundation for Research and Technology – Hellas (FORTH), Institute of Computer Science, GR-70013 Heraklion, Greece {leonidis,foukas,antona}@ics.forth.gr 3 Aristotle University (AUTH), The Medical School, Lab of Medical Informatics, GR-54124 Thessaloniki, Greece
[email protected]

Abstract. User interface designers and engineers are faced today with unprecedented challenges. Applications are progressively required to run on multiple computing platforms and accommodate the capabilities of different devices. Users and contexts are significantly characterized by diversity, leading to a very broad range of accessibility and interaction requirements and preferences that need to be adequately addressed. The current approach of designing separate user interfaces, one for each target use, is time consuming, error prone, and does not adequately address the challenges of cross-platform consistency, universal access and integration. To this end, this paper proposes a new integrative approach to multi-device user interface development for achieving device-independence by design and further pursuing improved levels of user experience for all through adaptive presentational models for various devices and contexts of use. Explanatory examples implemented in the context of the REMOTE R&D project are also presented.

Keywords: User interface design, Multi-device, Device-independent, Adaptable user interfaces, Migratory interfaces, Accessibility, Design for All.
1 Introduction

Today, cloud computing and web-based systems allow users to interact with server-side services and information through a wide variety of interactive devices, including PCs, PDAs, and mobile phones. As modern applications are progressively required to
run on multiple, different computing platforms and accommodate the capabilities of heterogeneous devices and contexts of use, new challenges are brought forward for user interface designers and engineers [1]. Further challenges are imposed by the diversity among users, which generates a very broad range of accessibility and interaction requirements and preferences that need to be appropriately addressed [2].
The current mainstream approach consists of designing a single, independent user interface for each of the target computing platforms. Such an approach is time consuming, error prone, and does not adequately address the challenges of cross-platform consistency, universal access and integration. Thus, there is an increasing need for new solutions and methods for modelling, designing and implementing multi-device, universally accessible user interfaces for the emerging generation of interactive systems.
This paper introduces a new approach to multi-device user interface design in which device-independence is intrinsic, i.e., is achieved "by design". This approach allows the design of user interfaces targeting multiple platforms (multi-device), potentially in a single design iteration. The paper then discusses how to apply adaptive presentational models for various devices and contexts. To this end, an integrative framework is employed which supports building accessible and self-adapted user interfaces in terms of context of use and user requirements, while hiding any adaptation-related complexity. The benefits of such an adaptation framework are manifold. From a design point of view, it supports the possibility of embedding accessibility and diversity-related design knowledge into libraries of widgets, thus relieving designers of the need to acquire in-depth related expertise and to design their own adaptations. From an implementation point of view, the proposed approach makes adaptation completely transparent, eliminating the need to implement adaptation rules. It also offers extensibility, as both the presentation styles and the adaptation logic are stored in external application-independent files. Finally, both global and local adaptations can be applied to any user interface (UI) element.
To facilitate the comprehension of the proposed concepts and methods, specific examples are also presented here as achieved in the context of the REMOTE R&D project. Finally, some important research opportunities and issues emerging from this work and relevant to multi-device user interface development are highlighted.
2 The REMOTE Project

The concepts and methods proposed in this paper have been put into practice and prototyped for the first time in the context of the REMOTE R&D project [3], and several examples used throughout this paper stem from this work. REMOTE is an AAL project (see "Acknowledgments") with the main objective of advancing the state of the art in the fields of tele-healthcare and ambient intelligence, enhancing older people's personal environment with audio-visual, sensor/motoric monitoring and automation abilities for tracing vital signs, activity, behaviour and health condition, detecting risks and critical situations, and providing, proactively and reactively, effective and efficient support at home. To this end, the
project focuses on defining and establishing a multidisciplinary and integrated approach to R&D of ICT for addressing, in real-life contexts, the identified needs of frail elderly people, especially citizens at risk due to geographic and social isolation in combination with chronic conditions, such as hypertension, arthritis, asthma, stroke, Alzheimer's disease, and Parkinson's disease, and the coexistence of various disabilities and lifestyle risk factors. All the applications and services are supported by an open architecture that facilitates the development and integration of new services (e.g., for additional types of chronic conditions and different disability situations) and new delivery platforms.
By targeting older individuals with chronic conditions around Europe, REMOTE addresses a very wide range of user profiles. In this context, the aim is to provide high-quality, ambient user interfaces by effectively addressing diversity in the following dimensions: target user population and changing abilities due to aging; categories of delivered services and applications; and deployment computing platforms (e.g., PDAs, smartphones, desktops, laptops) and input devices. To this end, REMOTE has developed an approach towards delivering device- and modality-independent interfaces and dialogues, enhanced by advanced device-, user- and context-adaptation facilities for increased usability and subjective satisfaction.
3 Related Work

3.1 Multi-device User Interface Design

Today, with the emerging pervasiveness of wireless technologies, people use an increasing variety of computing devices, such as PCs, PDAs, iTV and mobile phones, often to access the same information and utilities. The possibility of freely changing among devices without having to restart the session from scratch is starting to emerge as an important aspect of user experience and mobility. HCI researchers have emphasized the importance of understanding how the design space for user interfaces is influenced by the characteristics of input and output devices [4].
Currently, designers of multi-device user interfaces aiming to support migration, distribution and adaptivity to the platform follow three main approaches. The most common approach is to make available a priori separate designs per target device, which can then be pre-installed or downloaded at runtime on the device at hand. Clearly, this is time consuming and a key source of errors and inconsistencies. For instance, even the smallest changes need to be carefully implemented and cross-checked for each of the individual interfaces. Alternatively, various model-based approaches can be followed that provide logical descriptions which can be used as a starting point for generating interfaces that adapt to the various devices at hand. In recent years, such interest has been accompanied by the use of XML-based languages to represent such logical descriptions [5], [6]. Finally, designers may employ (at runtime or a priori) software tools that automatically generate interfaces. For instance, different techniques for redesigning Web applications for small devices have been proposed (e.g., [7]). These approaches are mainly based on a layout analysis, and the small-screen pages accessed through the
thumbnails are built by extraction and summarization of parts of the original interfaces. This often results in interfaces that are awkward.
Overall, current efforts in these three approaches are characterized by a number of limitations. For instance, they usually provide support not in the runtime phase, but only in the design and authoring phase, in order to help designers efficiently obtain different versions that adapt to the various interaction features. Most works focus on adaptations induced by constraints related to the display (output) capabilities of each device and pay little attention to characteristics induced by the devices' input modalities. Even for devices that are functionally similar, vastly different interfaces are usually produced, which employ totally different conceptual models. Thus, the user has to spend significant time learning each new interface. In most cases, both the usability and the accessibility of the produced user interfaces, especially for people with disabilities, are highly questionable if not totally absent. This leads to a contradiction: a modern user interface needs to be generalised on the one hand, because it has to allow deployment on multiple devices, and specialised towards different user profiles on the other hand, to make it more usable.

3.2 User Interface Adaptation

User interface adaptation has been proposed in recent years as a comprehensive and systematic approach to achieving personalized accessibility and usability of user interfaces in the context of Universal Access and Design for All [8]. Related methods and tools foster a proactive strategy, postulating that accessibility and quality of interaction need to be embedded into a product at design time. Proactive strategies entail a purposeful effort to build access features into a product as early as possible (e.g., from its conception, to design and release). Producing and enumerating distinct interface designs through the conduct of multiple design processes would be an impractical solution, since the overall cost of managing such a large number of independent design processes in parallel, and of separately implementing each interface version, would be unacceptable. Therefore, the Unified User Interfaces methodology has been proposed and applied [9] as a vehicle to efficiently and effectively ensure, through an adaptation-based approach, the accessibility and usability of UIs for users with diverse characteristics, supporting also technological platform independence, metaphor independence and user-profile independence. In such a context, automatic UI adaptation seeks to minimize the need for a posteriori adaptations and deliver products that can be adapted for use by the widest possible end-user population (adaptable user interfaces). This implies the provision of alternative interface instances depending on the abilities, requirements and preferences of the target user groups, as well as the characteristics of the context of use (e.g., technological platform, physical environment). This approach is proposed as an alternative to the traditional design and development for the 'average' user, and aims at ensuring accessibility and usability through user interface adaptation to diverse user characteristics, including disability, expertise, preferences, etc.
4 A New Design Concept for Multi-device Adaptable User Interfaces

This paper proposes a new approach to better support the design of accessible, multi-device user interfaces, so that users can eventually change interaction devices and still continue their tasks through interfaces smoothly adapted to the new platform.

4.1 Design for 3x3 Multi-tier Menus

The design concept presented here originated mainly from previous experiences with text entry methods and scanning techniques for users with limited motor functionality of the upper limbs. Early evaluations of a research prototype of an accessible Web browser, which included a QWERTY-like soft keyboard operated via three special switches, indicated frustratingly low rates of text entry (words per minute). To overcome this, the alphanumeric layouts of early mobile phones were considered as a means to implement a 12-key soft keyboard, a potential way to achieve movement minimisation and high learnability thanks to familiarity. Then, as on a 12-key keypad the letters A to Z are typically encoded on only eight keys, a novel 8-direction method was explored for text entry with joysticks. This led us to the keyboard concept introduced in [9]: a 3x3 multi-tier on-screen keyboard, consisting of a primary menu, where eight cells are used for entering letters and one cell is reserved for entering additional, again 3x3, menus that accommodate extra characters and options. It supports various entry modes, including 1- to 10-key and joystick modes, allowing text entry with a remarkable range of devices. This work suggested several alternative entry modes, and several layouts for novice, moderate and expert users, as a text entry solution for both disabled and able-bodied users. A general conclusion of this work was that the novel concept of 3x3 multi-tier menus, originally applied to implement a multi-device and thus accessible soft keyboard, could be further employed as a basis for designing multi-device user interfaces in general.
Here it is argued that, as a starting point, an application's user interface, which can actually be considered as a limited number of user navigation paths (input-output sequences), shall be represented as a hierarchical system of user options organised in multi-tier 3x3 menus, where eight cells are used for direct commands (including for entering submenus, again 3x3 menus that accommodate extra options), and one cell is reserved for returning to the upper level of the menu hierarchy. Figure 1 presents an example of such an arrangement based on the REMOTE project services. In this example, a number of user options are organised in 3 levels using 3x3 menus. For instance, the 3x3 main menu denoted with "1" offers access to several applications for older users (Activity advisor, Guarding Angel, etc.). In this example, the selection of the option "Smart home" through the main menu will lead to a new 3x3 menu ("2") allowing the user to select a specific room in the context of interacting with the control interface of a smart home application, which in turn leads to another submenu ("3") offering several options related to the specific room in question. Assuming that, at a given point, the user has navigated to the Smart home menu (i.e., "2"), then this menu is considered as the current menu, while the “main
menu” (“1”) is considered as the parent menu and the Kitchen menu (“3”) as one of the child menus. In these terms, each menu may have only one parent menu, to which the user may return at any time through one of the nine given options, and multiple child menus, occasionally grouped in a “more options...” command (see “4”). Finally, each menu may be associated with one or more display areas (see “5”).
Fig. 1. An example of an application designed using a 3x3 multi-tier menu system

In this way, multi-device input (mouse, keypads, special switches, remote controls, and many others, including a joystick) is guaranteed "by design". In other words, the device-independence characteristic is an intrinsic aspect of this approach, which in turn allows for deploying on multiple interaction platforms (PCs, cell phones, etc.), subject to proper presentational adaptations.
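As a concrete illustration of the hierarchy just described, here is a minimal Java sketch of the underlying data structure. The class names are ours, and the eight-cell limit is one plausible encoding of the rule that the ninth cell is reserved for returning to the parent menu.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the 3x3 multi-tier hierarchy.
class Cell {
    final String label;
    final Menu childMenu;  // null for a direct command
    Cell(String label, Menu childMenu) { this.label = label; this.childMenu = childMenu; }
}

class Menu {
    final String title;
    final Menu parent;     // exactly one parent menu (null at the root)
    final List<Cell> cells = new ArrayList<>(8);

    Menu(String title, Menu parent) { this.title = title; this.parent = parent; }

    void add(Cell cell) {
        if (cells.size() >= 8) {
            throw new IllegalStateException("the ninth cell is reserved for 'back'");
        }
        cells.add(cell);
    }
}

// Example use, mirroring Figure 1:
//   Menu main = new Menu("Main menu", null);
//   Menu smartHome = new Menu("Smart home", main);   // "2" in the figure
//   Menu kitchen = new Menu("Kitchen", smartHome);   // "3" in the figure
//   main.add(new Cell("Smart home", smartHome));
//   smartHome.add(new Cell("Kitchen", kitchen));
```

Because the hierarchy carries no presentational information, the same structure can be rendered very differently per platform, as the following subsections discuss.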
Fig. 2. Example of PDA-oriented adaptation
4.2 Adaptations for Different Delivery Environments

The way a set of user options is organised, e.g., semantically, is often significantly different from the way these options are presented to a user, or among different users. A key aspect of the proposed approach is that, although designs always maintain an
underlying hierarchy of 3x3 menus (which guarantees multi-device input at any time, including in cases of new input devices plugged in at runtime or user interface migration), an unlimited number of ways may be generated to present this hierarchy to users. Related adaptations may take into account (a) the given platform's display size and other characteristics, (b) the user's characteristics and skills, and (c) other contextual parameters, such as lighting, noise, etc. The following figures illustrate how an interface based on the multi-tier 3x3 hierarchy can be adapted in order to be presented on a PDA (Fig. 2) and on a PC (Fig. 3).
Fig. 3. Example of PC-oriented adaptation
Despite some limitations implied by endorsing an underlying 3x3 multi-tier structure for organising navigation paths, a windows-like environment can also be fully supported (see Figure 3). A key advantage of this approach is that the navigation model (menu hierarchies and navigation paths) remains the same at all times (including, potentially, down to the text entry level, given that the AUK concept is fully endorsed for text input; see "5" in Fig. 3), thus minimising the risk of confusing the user when shifting from one platform to another (at runtime or in the long term).

4.3 Further Multi-purpose Adaptations

Further user interface adaptations may be employed in order to better optimize the produced layouts for specific levels of user expertise, disability, preferences, or interaction devices. For instance, adaptations may support automatic or user-triggered
reconfiguration of display parameters, such as colors, font sizes, help in context, etc. As an example, consider the following combination of optimisations: fast Internet connection; use through mouse and keyboard; use without assistive technology; with a layout for typical screen sizes; using medium fonts and a typical font family; employing the colored version; for users with medium computer literacy; and with no accessibility optimizations. This is one of many possible combinations, which can then be altered by the user or automatically at runtime. Among these various adaptations, the most relevant here are those related to the user input mode, which may include optimizations for various input settings such as mouse and keyboard; keyboard only; touch screen / mouse only / stylus; joystick; numeric keypad; 4 arrows and "ENTER"; special switches: UP, RIGHT, ENTER, etc. Clearly, each of these optimisations entails individual or sets of predefined adaptation commands. For instance, in the case of "touch screen / mouse only / stylus", the following adaptations may be applied to the UI: increase clickable areas (margin/padding); use a bright background (as it will hide fingerprints better than a black background would); highlight interactive items; use dithered or other patterned backgrounds that help the eye focus on the screen image instead of reflections, even in areas where there are no icons or menu choices; turn the cursor off; etc.

4.4 User Interface Adaptation in REMOTE

In REMOTE, an adaptation platform is employed [3] that is based on the following process: (i) identification of the UI aspects that could be conditionally adjusted and the discrete dimensions that are correlated with the adaptation decisions (user- and context-related parameters); and (ii) the definition of each UI component's alternatives according to the requirements posed by each adaptation dimension. These alternatives need to be further encoded into a rule set, loaded by a rule inference engine, evaluated, and finally propagated from the concept layer to the actual presentation layer. These mechanisms are incorporated into a complete framework that inherently supports adaptation. The Decision Making Specification Language (DMSL) engine and run-time environment [10] offer a powerful rule definition mechanism and promote scalability by utilizing external rule files while relieving the actual UI implementation code of any adaptation-related conditionality. The Adaptive Widget Library encapsulates all the complexity necessary for supporting adaptation (from evaluation request to decision application) and could be integrated into common IDEs to help developers build self-adaptive interfaces [11].
In brief, the adaptation platform infrastructure consists of the following components: the DMSL Server and the Adaptive Widget Library. The DMSL Server is divided into the DMSL Engine Core and the DMSL Proxy. The Core is responsible for loading and evaluating the rules, while the Proxy acts as a mediator between the Core and external "clients", monitoring incoming connections, processing requests and invoking the appropriate Core methods. The Adaptive Widget Library is a set of UI components, both primitive and complex, that utilizes the DMSL Server facility to support adaptation. The library's ease of use is ensured by relieving developers of the responsibility of manually adapting any widget attributes, by offering a common "adapt" method.
Each widget encloses a list of its adaptive attributes and
when instructed to adapt itself, evaluates each attribute and applies the corresponding decision. Considering that the DMSL Server is a remote component, network connectivity is an essential precondition for the overall process; its absence should therefore be handled beforehand. A fail-safe mechanism is used to minimize the side effects of potential connectivity loss: the "last" known configuration is stored and maintained locally to facilitate "static" user interface generation without supporting on-the-fly adaptation.
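The following Java sketch illustrates the widget-side pattern just described: a common adapt method that evaluates each adaptive attribute and applies the decision, falling back to the last known configuration when the server is unreachable. The DecisionEngine interface stands in for the DMSL Proxy; its API and all names here are our assumptions, not the REMOTE code.

```java
import java.util.Map;

// Stand-in for the DMSL Proxy; the real interface is not published here.
interface DecisionEngine {
    /** Returns the decided value for a widget attribute, or null if unreachable. */
    String decide(String widgetId, String attribute);
}

abstract class AdaptiveWidget {
    private final String id;
    private final Map<String, String> lastKnown; // locally cached configuration

    AdaptiveWidget(String id, Map<String, String> lastKnown) {
        this.id = id;
        this.lastKnown = lastKnown;
    }

    /** Evaluate every adaptive attribute and apply the corresponding decision. */
    void adapt(DecisionEngine engine, String... attributes) {
        for (String attr : attributes) {
            String value = engine.decide(id, attr);
            if (value == null) {
                value = lastKnown.get(attr); // fail-safe: reuse last known config
            }
            if (value != null) {
                apply(attr, value);
                lastKnown.put(attr, value);  // keep the local cache current
            }
        }
    }

    /** Concrete widgets map (attribute, value) pairs onto their presentation. */
    protected abstract void apply(String attribute, String value);
}
```

The design point the sketch tries to capture is that the UI code itself carries no adaptation conditionality: the rules live in external files evaluated by the engine, and the widget merely applies whatever decision comes back.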
5 Conclusions and Future Work

In the years to come, as the Future Internet starts to take shape, applications will gradually stop being tied to a single platform (e.g., being installed on a mobile device, a PC, etc.) and will instead, through cloud computing, start "moving about", shifting from one platform to another along with the user. In other words, ambient intelligence environments will require migratory user interfaces that allow users to move about freely while carrying on their tasks through a variety of interaction platforms and devices, and solutions for developing multi-device environments will be pursued. User dialogue with computers will then undergo radical and unprecedented changes, raising new challenges and opportunities in terms of human-computer interaction. Above all, people with disabilities, older users and other groups of users will be put at risk of exclusion by the upcoming shift from current desktop and mobile practices to cloud-based solutions and multi-device environments.
The work presented in this paper constitutes a significant step toward enabling designers to create accessible and personalised multi-device user interfaces. It applies to already available mainstream ICT and is potentially beneficial for everybody. It may also support the rapid and wide endorsement and uptake of online / cloud-based platforms, and at the same time may guarantee universal access and a high-quality user experience for all, regardless of the delivery platform and of the characteristics or changes of the user or context.
Towards the realization of this vision, further significant work is required and will be pursued in order to: assess and validate the perceived user experience of the proposed solutions, with real users, in various and changing contexts; further elaborate on this novel design concept that ensures device- and modality-independent user input (i.e., accessible user input, regardless of the input device used), including text entry; deliver an integrative framework for designing, implementing and evaluating such device-independent user interfaces; produce adequate methodologies, guidelines, and tools to design applications that comply with the aforementioned concepts; define intelligent and seamless adaptivity and adaptability features (computer-aided customisation and personalisation of the interface) with a focus on optimisations for various user devices, disabilities and preferences; provide smart mechanisms that generate (and shift between) equivalent alternatives of system output for various modalities and displays (auditory, visual and tactile, including 3D); define appropriate mechanisms for context-sensitive user support (help), especially for guaranteeing smooth transitions from one platform to another; and demonstrate exemplary interfaces that are able at any point to migrate seamlessly and/or
transform gracefully according to changes in the context of use, without raising any accessibility or usability barriers.

Acknowledgments. Part of this work has been carried out in the framework of the European Commission co-funded AAL1 project REMOTE2 ("Remote health and social care for independent living of isolated elderly with chronic conditions", Grant Agreement no. AAL-2008-1-147).
References

1. Seffah, A., Forbrig, P., Javahery, H.: Multi-devices "Multiple" user interfaces: development models and research opportunities. Journal of Systems and Software, Special Issue 73(2) (2004)
2. Doulgeraki, C., Partarakis, N., Mourouzis, A., Stephanidis, C.: Adaptable Web-based user interfaces: methodology and practice, vol. I(5) (March 2009), ISSN: 1697-9613 (print), 1887-3022 (online)
3. Leonidis, A., Foukarakis, M., Antona, M., Mourouzis, A.: User interface self-adaptation infrastructure. REMOTE Project Deliverable D2.1 (2010)
4. Card, S., Mackinlay, J., Robertson, G.: A morphological analysis of the design space of input devices. ACM Transactions on Information Systems 9, 99–122 (1991)
5. Puerta, A., Eisenstein, J.: XIML: A Common Representation for Interaction Data. In: Proc. of IUI 2002: Sixth International Conference on Intelligent User Interfaces. ACM Press, New York (2002)
6. Abrams, M., Phanouriou, C., Batongbacal, A., Williams, S., Shuster, J.: UIML: An Appliance-Independent XML User Interface Language. In: Proc. of the 8th WWW Conference (1999)
7. MacKay, B., Watters, C.R., Duffy, J.: Web Page Transformation When Switching Devices. In: Brewster, S., Dunlop, M.D. (eds.) Mobile HCI 2004. LNCS, vol. 3160, pp. 228–239. Springer, Heidelberg (2004)
8. Savidis, A., Stephanidis, C.: Unified User Interface Design: Designing Universally Accessible Interactions. Interacting with Computers 16(2), 243–270 (2004)
9. Mourouzis, A., Boutsakis, E., Ntoa, S., Antona, M., Stephanidis, C.: An accessible and usable soft keyboard. In: Stephanidis, C. (ed.) UAHCI 2007, Part II. LNCS, vol. 4555, pp. 961–970. Springer, Heidelberg (2007)
10. Savidis, A., Antona, M., Stephanidis, C.: A Decision-Making Specification Language for Verifiable User-Interface Adaptation Logic. International Journal of Software Engineering and Knowledge Engineering 15(6), 1063–1094 (2005)
11. Leonidis, A., Antona, M., Stephanidis, C.: Rapid Prototyping of Adaptable User Interfaces. International Journal of Human-Computer Interaction (2011) (to appear)
1 Website of the Ambient Assisted Living (AAL) Joint Programme: http://www.aal-europe.eu/
2 Project website: http://www.remote-project.eu/
Cultural Difference in Nonverbal Behaviors in Negotiation Conversations: Towards a Model for Culture-Adapted Conversational Agents Fumie Nori1, Afia Akhter Lipi2, and Yukiko Nakano2 1
Graduate School of Science and Technology, Seikei University Musashino-shi, Tokyo 180-8633, Japan 2 Dept. of Computer and Information Science, Seikei University Musashino-shi, Tokyo 180-8633, Japan {dm106219,y.nakano}@cc.seikei.ac.jp,
[email protected]

Abstract. As the basis for generating nonverbal expressions in animated agents, this paper proposes factors that account for cultural differences in nonverbal behaviors in negotiation interaction. First, we introduce theories of negotiation and cultural characteristics. Then, our analysis of human interaction in negotiation conversations in the CUBE-G corpus is described. Finally, by integrating cultural and negotiation parameters with empirical data obtained in the corpus analysis, we design a parameterized network model that generates culture-specific nonverbal expressions in negotiation conversations.

Keywords: Negotiation, Nonverbal behaviors, Comparative corpus analysis.
1 Introduction

Culture influences how people think, communicate and behave. It strongly affects the way they negotiate or make decisions. Much previous literature [1, 2, 7, 9, 12] has noted that culture affects the negotiation process. Some cultures emphasize the individual's decision making in negotiation, while others insist on deciding as a group. In some cultures, a negotiator with a formal style insists on addressing his/her conversation partner by her/his full name or titles, while one with an informal style tries to start the conversation on a first-name basis. Thus, there is no question about the existence of cultural influence on negotiation styles and the nonverbal behaviors that occur there. However, to our knowledge, little has been studied about models for generating culture-adapted negotiation communications, specifically in human-computer interaction (HCI). Moreover, quantitative data analysis is necessary as the basis for designing HCIs that support negotiation communication, because there are many theoretical studies but little empirical research on negotiation strategies [1].
Based on the motivations above, this paper aims at connecting theories of culture and negotiation interaction with empirical analysis. As a theory of culture, we use a well-known model introduced by Hofstede [6]. As the factors characterizing culture, he proposed five dimensions: hierarchy, gender, identity, uncertainty, and orientation. As for the factors of characterizing negotiation
interaction, we studied the previous literature in social psychology [1, 2, 3, 9] and propose four parameters that categorize negotiation: negotiation style, risk taking, degree of rapport, and negotiation goal. In our empirical study, we use negotiation conversations in the CUBE-G corpus [5], which is a standardized multimodal corpus of two cultures: German and Japanese. In this study, we focus on smiles and head nods in our analysis. Finally, by integrating theories of culture and negotiation with empirical data, we will propose a computational model which generates culture-specific nonverbal expressions in animated characters.
This paper is organized in the following way. In Section 2, previous studies are reviewed, and in Section 3 the theoretical background drawn from a literature study is briefly outlined. Section 4 describes the empirical analysis of two cultures, German and Japanese, in a negotiation interaction. Section 5 presents a computational model integrating the empirical data with cultural and negotiation characteristics. Section 6 presents concluding remarks and future directions of this research.
2 Related Work

A number of studies have identified the effect of culture on the negotiation process. Adair et al. [12] claimed that negotiators from different cultures enact different behavioral sequences at the bargaining table, leading to difficulty in synchronization and inefficient deals. They therefore proposed a four-stage model of negotiation that captures the progression of competitive and cooperative elements over time in mixed-motive negotiation. They also claimed that the particular behaviors that negotiators enact in each of the stages are shaped, in part, by the negotiators' cultural backgrounds. Finally, they predicted the particular stages and behavioral sequences needed to generate efficient deals.
Comparing how negotiation styles differ depending on the culture, Jeswald [2] found that Japanese people always go for a win-win process as the negotiation outcome, whereas only thirty-three percent of Spanish executives take this view. Moreover, [1] found that during negotiation interaction, the Japanese used a relatively large number of proposals compared to the US negotiators, and the US negotiators used a whole array of direct communications relatively more frequently than the Japanese.
Brett [1] developed a model of how culture affects the negotiation process. This study identified three key factors, affected by culture, that lead to successful integrative and distributive agreements: the first is a value for information sharing, the second is a means of searching for information, and the third is the motivation to search for information. Negotiators who are motivated to search for information, and are flexible about how that search is carried out, can reach high-quality negotiated outcomes; and culture in the first place affects the norms and behavior of the negotiators. Moreover, Teng et al. [9] found that the amount of time needed to make a decision depends on cultural background. As western cultures are on the individualistic and masculine sides of Hofstede's cultural dimensions, they are expected to take a more aggressive approach to reaching a solution. This claim is supported by the fact that
western cultures are short-term cultures, and this suggests that they tend to solve problems quickly and in a sequential manner, whereas eastern cultures, which belong to a more holistic, long-term and collectivistic group, are expected to solve problems much more slowly and exhaustively.
Facial expression is one of the most important channels in conversational management. Mark et al. [8] found that smiles are positive reinforcers that can change the behavior of the conversational partner. Receiving a smile from someone can make the receiver more helpful toward the partner, which in turn can initiate a better deal in conflict situations.
Although all this literature provides a great deal of information as to how communicative behavior, style, and attitudes during negotiation are influenced by culture, precise quantitative data is virtually absent from such sources, and such data is needed in a technical form to model culture computationally.
3 Literature on Culture and Negotiation

We employ Hofstede's theory [6] to describe cultural characteristics, and exploit theories in social psychology and related research [1, 2, 3, 9] to explain negotiation characteristics. In this section, we describe these factors in detail.

3.1 Cultural Characteristics

Hofstede's theory [6] consists of the following five dimensions, which are based on a broad empirical survey:
a. Hierarchy (Small/Large): This dimension describes the extent to which an unequal distribution of power is accepted by less powerful members.
b. Identity (Individualism/Collectivism): This dimension indicates the degree to which individuals are integrated into a group. On the individualist side, ties between individuals are loose, and everybody is expected to take care of himself. On the collectivist side, people are integrated into strong and cohesive groups.
c. Gender (Masculinity/Femininity): The gender dimension describes the distribution of roles between the genders. In feminine cultures the roles differ less than in masculine cultures, where competition is accepted and status symbols are of importance.
d. Uncertainty (Weak/Strong): The tolerance for uncertainty and ambiguity is defined in this dimension. It indicates to what extent the members of a culture feel either uncomfortable or comfortable in unstructured situations which are novel, unknown, surprising, or different from usual.
e. Orientation (Short/Long): This dimension distinguishes long- and short-term orientation. Values associated with long-term orientation are thrift and perseverance, whereas values associated with short-term orientation are respect for tradition, fulfilling social obligations, and saving one's face.
Table 1 gives Hofstede's ratings for three countries. For example, in the Identity dimension, Germany (67) is a more individualist culture than Japan (46), and the US (91) is the most individualist culture of the three.
Table 1. Hofstede ratings for three countries

             Hierarchy   Identity   Gender   Uncertainty   Orientation
  Germany        35          67        66         65            31
  Japan          54          46        95         92            80
  US             40          91        62         46            29
3.2 Negotiation Characteristics

There have been many studies on how to manage inefficient deals during negotiation and how to anticipate misunderstandings during cross-cultural negotiation, and they have proposed negotiation factors that identify cultural differences. We reviewed a number of socio-psychological studies [1, 2, 3, 9] and settled on the following four elements to characterize negotiations.

a. Negotiation style (Formal/Informal): Negotiation style concerns how politely a negotiator interacts with the other person. Culture strongly influences the style of negotiation. For example, Americans are likely to use an informal negotiation style and to call someone by her/his first name to display friendship. On the contrary, for Japanese people, who have a formal negotiation style, using a first name at a first meeting displays disrespect [2].
b. Risk taking (High/Low): This parameter indicates how willingly the negotiator takes a risk in negotiation interaction. For example, previous literature [2] claimed that Americans are risk-takers, whereas the Japanese try to avoid risk in deal making.
c. Degree of rapport (High/Low): Rapport is a social mechanism that serves the function of enabling coordinated action in dyads and groups [3]. The degree of rapport rises through the coordination of expressive behaviors, such as smiles.
d. Negotiation goal (Long/Short): This is the purpose of a negotiation deal. Negotiators from different cultures view the goal of negotiation differently. For some cultures, the goal of negotiation is a signed contract, so they are fast in decision making. For other cultures, the goal is not only a signed contract but also the creation of a long-term relationship between the two sides; they invest time in getting to know each other, which slows down the process of decision making [2, 9].
4 Corpus Analysis

This section reports the results of a comparative corpus analysis using the CUBE-G corpus [5]. We analyzed 10 videos of negotiation conversations in each culture, German and Japanese. The average duration of the analyzed conversations was 12 minutes. The negotiation scenario was a variant of the standard lost-at-sea scenario [5]. Subjects have to assume that they are shipwrecked in the Pacific Ocean. They have
time to take only three items with them that could help them survive. There are 15 items on the boat, from which they have to choose three. Every subject first chooses his/her own top three items from the item list, and then the subjects negotiate to choose three items ranked in order of importance for survival. The subjects were also instructed that their choices would be compared with the "official" list by the U.S. Coast Guard, and that their monetary award would depend on how close their choices were to the official list. This scenario therefore has the advantage of forcing the subjects to reach a consensus on item selection, and the subjects had an intrinsic motivation to argue for their choices. Then, using the video annotation tool ANVIL [11], we annotated the verbal and nonverbal behavior patterns found in the negotiation conversations of the two cultures. In this study, we annotated smiles and head nods. Fig. 1 shows an example annotation from our corpus. Our coding scheme is given below.
Smile: The corners of the lips are extended and drawn up. The cheeks are raised, and the mouth may or may not be parted, with teeth exposed or not.
  Start: Moment when the corner of the mouth goes up
  End: Moment when the corner of the mouth goes down
Head nod: Vertical movement of the head
  Start: Moment when the chin goes up or goes down
  End: Moment when the movement stops
We distinguish the following three types of head nod:
  Single: The person nods only once
  Double: The person nods twice
  Repeated: The person nods three or more times
Fig. 1. An example of annotation using ANVIL
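To make the derivation of the behavior statistics concrete, the following small sketch (Python, with toy data in an invented format rather than the actual ANVIL export schema) shows how per-conversation smile measures such as those reported below in Tables 2 and 3 can be computed from time-stamped annotation spans:

```python
# Hypothetical sketch: each annotation is a (start_seconds, end_seconds)
# span for one smile event in a single conversation.
smiles = [(12.0, 19.5), (40.2, 44.0), (61.3, 70.1)]   # toy data
conversation_length = 12 * 60                          # 12-minute conversation

total_time = sum(end - start for start, end in smiles)  # total smiling time [s]
frequency = len(smiles)                                  # number of smile events
ratio = 100.0 * total_time / conversation_length         # share of conversation [%]
mean_duration = total_time / frequency                   # average event length [s]

print(f"total={total_time:.2f}s  freq={frequency}  "
      f"ratio={ratio:.2f}%  duration={mean_duration:.2f}s")
```

Averaging these per-conversation values over the 10 conversations per culture yields figures of the kind shown in the tables that follow.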
Table 2. Analysis of smile

Culture    Total amount of time [s]  Ratio [%]  Frequency  Duration [s]
Japanese   212.85                    35.98      36.50      7.28
German     142.77                    22.28      33.70      3.98
Table 3. Analysis of head nod

Culture    Total amount of time [s]  Frequency
Japanese   92.15                     123.30
German     57.68                     50.44
4.1 Differences in Smile

Table 2 shows the average frequency and duration of smiles in both cultures. In the Japanese data, the average amount of time spent smiling per conversation was 212.9 sec, whereas in the German data it was 142.8 sec. This means that the Japanese smiled much more than the Germans. Moreover, smiling was slightly more frequent in the Japanese data (36.5 times) than in the German data (33.7 times).

4.2 Differences in Head Nod

Table 3 shows the average amount of time spent head nodding per conversation and the average frequency. As shown in the table, the frequency of head nods in the Japanese data (123.3) is more than twice that in the German data (50.4), and the amount of time is also larger in the Japanese data than in the German data. Next, we analyzed the types of head nods: single, double, and repeated. The ratios for each type of head nod are shown in Fig. 2. An interesting finding is that the proportion of single head nods is higher in the Japanese data than in the German data, whereas the proportion of repeated head nods is higher in the German data. There is no difference between the two countries in double head nods. Note that in the German data the
Fig. 2. The ratio of head nod types
Fig. 3. Head nod during own utterance.
Fig. 4. Head nod during the partner’s utterance.
proportions of single and repeated head nods are both 41%, whereas in the Japanese data the frequency of single head nods is more than twice that of repeated head nods. This indicates that the single head nod characterizes Japanese negotiation conversation. To investigate the usage of each type of head nod more precisely, we analyzed head nods with respect to turn. The results are shown in Figs. 3 and 4. Fig. 3 shows the proportion of each type of head nod that occurred during the subject's own utterances. The proportions for all types of head nod do not differ much between the two cultures. A notable difference was found in Fig. 4, which shows the proportion of head nods that occurred during the partner's turn. While the partner was speaking, Japanese subjects frequently displayed single head nods. German subjects, on the other hand, used repeated head nods, which were less frequent than the Japanese single head nods. This result suggests that such frequent single head nods, given as feedback to the conversation partner, contribute to relaxing the tension between the conversational participants and to reducing the risk in the negotiation conversation.
5 Formulating a Computational Model

This section describes how the empirical results from the comparative corpus analysis are connected with cultural and negotiation factors to design a network model that predicts nonverbal behaviors. We employ a Bayesian network for this purpose.

5.1 Design of the Model

To create a Bayesian network capable of predicting nonverbal behavior, the GeNIe [4] modeling system was used. Fig. 5 illustrates the Bayesian network created with this tool. Based on the empirical analysis in Section 4 and previous studies, we determined which factor or node should be linked to which node in the next layer. A summary of the network design is given below.

The Parent Node. The parent node layer of the Bayesian network consists of two nodes, Culture and Negotiation. The Culture node is connected with Hofstede's dimensions, and the
Negotiation node is connected with the negotiation characteristics. In the current study, the Culture node has only two values, German and Japanese, and the Negotiation node also has two values, high divergence and low divergence. In this study, the high divergence value was assigned to our negotiation scenario, and the respective data were used in training the model.

Cultural Parameters. As the cultural factors layer, we used Hofstede's five dimensions described in Section 3.1: hierarchy, identity, gender, uncertainty avoidance, and orientation. The probabilities of each node are assigned based on Hofstede's ratings for each culture given in Table 1.
Fig. 5. A network model connecting cultural and negotiation parameters with nonverbal parameters (layers, from top: parent node, cultural parameters, negotiation parameters, nonverbal parameters)
Negotiation Parameters. As the negotiation characteristics, we employed the factors proposed in [1, 2, 3, 9], which were explained in Section 3.2. The links between the cultural parameters and the negotiation parameters were designed based on previous studies [1, 2, 3, 9], which provide ample evidence that culture influences how people think, communicate, and behave, and that it strongly affects the negotiation process. The links for each node are as follows.

a) Negotiation style: We linked Hofstede's "Hierarchy" dimension with "Negotiation style". Eastern cultures are said to be highly hierarchical societies in which attitudes in communication are formal and formal procedures are adopted for addressing the counterpart [2].
b) Degree of rapport: In face-to-face interaction, rapport creates a positive impression within the dyad and affects the perception of the other person [3]. We therefore linked "Degree of rapport" with "Identity": since Japanese culture is collectivistic, it is more harmonious, and conflict is less common than in individualistic cultures [7].
c) Negotiation goal: Japan is a long-term-oriented society, where the negotiation goal is not only to sign a contract but to create a long-term relationship [2, 9]. We therefore created a link between "Negotiation goal" and "Orientation".
d) Risk taking: Since Japan is a high uncertainty-avoidance society, the Japanese try to avert any risk in decision making [2], so we linked "Uncertainty avoidance" with "Risk taking". In cultures with weaker uncertainty avoidance, on the contrary, risks are more easily accepted.

Nonverbal Parameters. The lowest layer consists of a number of behavioral parameters that depend on the cultural characteristics profiled by Hofstede's dimensions and on the negotiation parameters. We drew links between the negotiation parameters and the nonverbal behaviors. To assign the probability of each node, we employed the EM (Expectation-Maximization) algorithm, using the results of the nonverbal behavior analysis (head nods and smiles) as its input.

a) Frequency of smile and frequency of head nod: We linked the frequency of smiles and the frequency of head nods to "Risk taking" and "Negotiation style". It can be assumed that, since the Japanese are risk averse, they frequently use smiles and head nods during negotiation to relax the tension with the conversational partner.
b) Duration of smile: We linked the duration of smiles with "Degree of rapport", since a previous study [8] noted that facial expression influences the perception of trustworthiness.
c) Type of head nod: We linked "Negotiation goal" with "Type of head nod". Our empirical data in Section 4 show that Japanese people produce more single head nods than Germans, which means that the Japanese keep providing feedback to the conversation partner; this enhances the relationship with the partner and helps ensure a long-term relationship. From our analysis, we also found that German people produce more repeated head nods than Japanese people: when Germans are sure about a decision, they confirm what was agreed on by nodding repeatedly.

Output Generated. The model predicts the nonverbal parameters when the culture and negotiation nodes are specified. As shown in Fig. 5, when Japanese is chosen as the evidence for culture in the negotiation scenario, the predicted frequency of smile is high (68%), the duration of smile is long (50%), the frequency of head nod is high (68%), and the type of head nod is single (80%). When German culture is selected instead, the predicted frequency of smile is medium (100%), the duration of smile is short (98%), the frequency of head nod is medium (50%), and the type of head nod is repeated (66%).
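As an illustration of this kind of layered network, the following minimal sketch builds one thin slice of such a model with the pgmpy Python library (not GeNIe, which the study used); the node names, state encodings, and all probability values are illustrative placeholders rather than the parameters of this study:

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Parent node -> cultural parameter -> negotiation parameter -> nonverbal behavior
model = BayesianNetwork([
    ("Culture", "Uncertainty"),      # parent node -> Hofstede dimension
    ("Uncertainty", "RiskTaking"),   # cultural -> negotiation parameter
    ("RiskTaking", "HeadNodFreq"),   # negotiation -> nonverbal parameter
])

# Culture: state 0 = German, state 1 = Japanese (uniform prior for illustration)
cpd_culture = TabularCPD("Culture", 2, [[0.5], [0.5]])

# Uncertainty avoidance (weak/strong), loosely scaled from the Hofstede
# ratings in Table 1 (Germany 65, Japan 92); values are placeholders.
cpd_unc = TabularCPD("Uncertainty", 2,
                     [[0.35, 0.08],    # weak
                      [0.65, 0.92]],   # strong
                     evidence=["Culture"], evidence_card=[2])

# Risk taking (high/low) given uncertainty avoidance (placeholder values)
cpd_risk = TabularCPD("RiskTaking", 2,
                      [[0.7, 0.2],     # high
                       [0.3, 0.8]],    # low
                      evidence=["Uncertainty"], evidence_card=[2])

# Head-nod frequency (medium/high); in the study these conditional
# probabilities were learned from the corpus counts with the EM algorithm.
cpd_nod = TabularCPD("HeadNodFreq", 2,
                     [[0.6, 0.3],      # medium
                      [0.4, 0.7]],     # high
                     evidence=["RiskTaking"], evidence_card=[2])

model.add_cpds(cpd_culture, cpd_unc, cpd_risk, cpd_nod)
assert model.check_model()

# Query: distribution of head-nod frequency given Japanese culture
inference = VariableElimination(model)
print(inference.query(["HeadNodFreq"], evidence={"Culture": 1}))
```

In the study itself, the conditional probability tables of the nonverbal layer were estimated from the annotated corpus rather than set by hand, and the full network contains all five cultural dimensions and all four negotiation parameters.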
6 Conclusion

This study provided theoretical background for characterizing culture and negotiation interaction. We also performed a comparative corpus analysis for German and
Japanese, and obtained statistical values for nonverbal behaviors in a negotiation scenario of dyadic conversations. We integrated this statistical information, together with the cultural and negotiation factors, into a Bayesian network. As future work, we will implement this model in embodied conversational agents that can identify the nonverbal attributes of any culture in negotiation scenarios, so that negotiators can cope with cross-cultural negotiation deals and learn the other party's cultural background and behavior patterns. Such an understanding of culture may help a negotiator adjust his/her negotiation strategies and resolve conflicts and uncertainties.

Acknowledgment. This work is funded by the German Research Foundation (DFG) under research grant RE 2619/2-1 (CUBE-G) and the Japan Society for the Promotion of Science (JSPS) under a Grant-in-Aid for Scientific Research (C) (19500104).
References

1. Brett, J.M.: Culture and Negotiation. International Journal of Psychology 35(2), 97–104 (2000)
2. Salacuse, J.W.: Negotiating: The Top Ten Ways That Culture Can Affect Your Negotiation. Ivey Business Journal Online (September 2004)
3. Drolet, A., Morris, M.: Rapport in Conflict Resolution: Accounting for How Face-to-Face Contact Fosters Mutual Cooperation in Mixed-Motive Conflicts. Journal of Experimental Social Psychology 36, 26–50 (2000)
4. GeNIe, http://genie.sis.pitt.edu/
5. Rehm, M., Nakano, Y., André, E., Nishida, T., Bee, N., Endrass, B., Wissner, M., Lipi, A.A., Huang, H.-H.: From Observation to Simulation: Generating Culture-Specific Behavior for Interactive Systems. AI & Society, Special Issue on Enculturating Human Computer Interaction 24(3) (2009)
6. Hofstede, G., Bond, M.H.: The Confucius Connection: From Cultural Roots to Economic Growth. Organizational Dynamics 16(4), 4–21 (1988)
7. Ting-Toomey, S.: Communicating across Cultures. The Guilford Press, New York (1999)
8. Knapp, M.L., Hall, J.A.: Nonverbal Communication in Human Interaction, 7th edn., ISE International Edition
9. Teng, J.T.C., Calhoun, K.J., Cheon, M.J., Raeburn, S., Wong, W.: Is the East Really Different from the West: A Cross-Cultural Study on Information Technology and Decision Making. In: Proceedings of the 20th International Conference on Information Systems, pp. 40–46 (1999)
10. Traum, D.R., Swartout, W.R., Marsella, S.C., Gratch, J.: Fight, Flight, or Negotiate: Believable Strategies for Conversing under Crisis. In: Panayiotopoulos, T., Gratch, J., Aylett, R.S., Ballin, D., Olivier, P., Rist, T. (eds.) IVA 2005. LNCS (LNAI), vol. 3661, pp. 52–64. Springer, Heidelberg (2005)
11. Kipp, M.: Anvil - A Generic Annotation Tool for Multimodal Dialogue. In: Proceedings of the 7th European Conference on Speech Communication and Technology, pp. 1367–1370 (2001)
12. Adair, W.L., Brett, J.M.: The Negotiation Dance: Time, Culture, and Behavioral Sequences in Negotiation. Organization Science 16(1), 33–51 (2005)
13. Quaddus, M.A., Tung, L.L.: Explaining Cultural Differences in Decision Conferencing. Communications of the ACM 45, 93–98 (2002)
An Investigation into a Personalised and Web2.0-Based Search Engine Interface

Tsai-Hsuan Tsai1, Hsien-Tsung Chang2, and Shih-Ting Huang1

1 Department of Industrial Design, Chang Gung University, Taoyuan, Taiwan
2 Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
[email protected]

Abstract. This study investigates user behaviours and preferences in the usage of search engine interfaces and proposes a user-centred search engine interface with functions and services that cater to user needs and personalisation along the Web2.0 trend. A survey was conducted to examine the relationship between user behaviours and interface needs for Web2.0 search engines. Influential factors associated with user issues and Web2.0 concepts were proposed as design principles for a personalised search engine interface design. A conceptual search engine interface, matching these theory-driven principles and practical experience, was then developed. It follows the concept of user-oriented services and provides every user with a personalised search engine interface that offers mash-up search results and multiple functions. Users can share search results while using a personalised interface. Through this new concept of a search engine interface, search procedures become more convenient and better meet personal needs.

Keywords: search engine interface, user behaviours, Web2.0.
1 Introduction

Search engines have become one of the most important and most frequently used services, heavily influencing the way users perceive the Internet [1-3]. Google [4], the most widely used Web-based search engine in the world, began in 1998 and has had the highest search engine usage rate since 2001. In 2011, Google reached a worldwide usage rate of over 90% and stood number one among search engines [5]. Google's user interface, which presents search results as a list, has thus shaped the display styles of most search engines and the way users use them; users are very accustomed to Google's result display. Correspondingly, familiar search engines such as Yahoo [6] and Bing [7] display "Google-like" search results. However, many search engines offer users a myriad of small tools and features in their interfaces, with an emphasis on enriched user interaction. In response to Web2.0-related functions and technical developments, search engines have been upgraded to
include various features. For instance, some search engines use the modular functionality of Web2.0, such as mashup and tagging, as in FundooWeb [8]; KartOO [9] offers a rich internet application search interface; and Omgili [10] connects with social aspects: user contribution, recommendation, and social networks. This characterises Web2.0 Websites, where users are allowed not only to participate and share but also to create, revise, and post Website content [11-13]. With Web2.0-related techniques and core concepts, search engines offer users more accuracy and diversity, and they constantly develop individual features and different user interfaces. Hence, in spite of Google's dominance over current search engine interface displays, the Web2.0 trend offers users interfaces with various features and integrated applications that come from different sources [14]. Users no longer play a passive role in receiving messages; they actively create and contribute to content through interactive sharing [15-16]. However, these search engines have reached only a 1% worldwide usage rate among search engines [5]. Boulos and Wheeler [17] point out that search engines play a significant role in Web2.0 applications. The composition of the user interface is thus the major part of a search engine for its users [18]: user interfaces are the key for users to operate search engines. Moreover, user experience and preferences in search engine interfaces deserve attention and consideration to keep up with current user needs. In the 11th IEEE International Conference on Computer-Aided Design and Computer Graphics [19], we proposed the following views on search engine interfaces: (1) the search engine as an intermediary for the user; (2) output results according to different queries; (3) implementation of Web2.0 concepts; (4) an integrated interface and personalised methods of operation. User input is important in obtaining a good understanding of user behaviours when exploring search engine interface needs. These behaviours include habitual practices and preferences when interacting with the Web and with search engines. A user-centred approach has become the core of current user interface design. Although some Web design guidelines exist and some search engines combine Web2.0 applications to cater to current Web trends, there is a lack of consideration of user habits and preferences in the usage of today's search engine interfaces, and how user habits and user acceptance correspond to each other is also not taken into account. Search engine interfaces should focus not only on the development of individual functions but also on the usability of interfaces and on user needs and experiences in current Web trends. User behaviour patterns should be introduced and integrated into related Web2.0 services to develop search engine interfaces that truly serve users. In order to investigate user needs and preferences in the usage of search engine interfaces in the current Web2.0 trends and to propose a user-centred interface for search engines, this study conducts a survey on these needs and preferences.
2 User Habitual Practices and Preferences under the Prevalence of Web2.0

This study aims to obtain a better understanding of user behaviours when interacting with search engines and of user needs in search engine interfaces. By identifying
the relationship between user practices, preferences, and the features of Web2.0, we provide design factors for search engine interface design. The study identifies the frequency of use, type, user preferences, and habitual practices of search engine usage through questionnaires and analyses, in order to ascertain user needs for functions, use experiences, and personal preferences regarding current search engine interfaces and to understand the impact of Web2.0 trends on users, leading to the identification of design factors for search engine interfaces.

2.1 Design and Procedure

A questionnaire was employed in the survey, with 415 participants recruited from colleges and universities in Taiwan. The survey targeted general user groups of search engines, in which user habitual practices and needs are most likely found. According to Pew's Internet report [20], young people aged 18-29 have the highest search engine usage rate; this therefore became one of the criteria for participant selection. The 415 participants are aged 18 to 29 (247 males and 168 females), with an average age of 22; 331 are undergraduate students. Moreover, 100 are postgraduate students, 3 are in high school, and 1 is a research student. Nearly 100% of the participants are students, two-thirds have more than 8 years' experience using the Internet, and the search engine usage rate is nearly 100%. The questionnaire mainly examines user needs in search engine usage and user experiences on the Internet; the former includes user habitual practices, query input functions, and the search result displays of search engines, and the latter includes user preferences and specific Internet needs. To ensure questionnaire reliability, two pre-tests were run prior to the formal distribution, each using about five questionnaires to test for unclear terms and time spent.

2.2 Survey Results and Analysis

• Search engine usage among young generations in Taiwan
Of the 415 search engine users in Taiwan, Yahoo! and Google score the highest, with usage rates of over 60% (Table 1). "Everyday" accounts for the highest usage frequency, cited by about two-thirds of search engine users. The three most frequently searched kinds of content are report information, everyday knowledge, and academic research; information searches are thus the most frequent (Table 2). Data quantity is the most cited reason for search engine selection, and user habit is the second (Table 3).

• Query assistance used by users
In terms of query input, "spelling suggestion" (suggesting possible words or terms before the user's entry is complete) is regarded as a helpful function by 348 respondents (84%), "search history" (previous searches displayed below the search box) by 276 (67%), "relevant keywords" (the search engine interrogating its database for possible keyword matches related to the query) by 355 (86%), "spelling correction" (the search engine selecting the best choice among all possible corrections when users misspell a query term) by 349 (84%), and "top searches" by 302 (73%). The
needs for query functions are, in order, relevant keywords, spelling correction, spelling suggestion, top searches, and search history; each function is needed by over 87% of users (Table 4). Typing descriptive words into the search box is the most commonly used method of entering a query. Users value efficiency and accuracy while searching; accordingly, search engine user interfaces should focus on speed and accuracy.

Table 1. Search Engine Usage

Search Engine             Participants    Search Engine  Participants
YAHOO!                    373             AltaVista      3
Google                    356             Snap           2
YAHOO! ANSWERS            246             Cuil           2
Wikiseek                  143             Lycos          0
Baido                     51              HotBot         0
Live Search (MSN Search)  33              others         7
Table 2. Frequently Searched Content by Users

Search Contents      Participants
report information   350
everyday knowledge   303
academic research    232
commercial products  215
News                 196
Pictures             195
Map                  137
Blog                 126
others               10
Table 3. Reasons of Use

Reasons                Participants
data quantity          288
user habit             250
search accuracy        246
search speed           169
source of information  123
function               75
layout                 67
others                 2
Table 4. Query Input Helps Used by Users

Query Helps          Yes  Ordinary  No
relevant keywords    355  44        16
spelling correction  349  41        25
spelling suggestion  348  50        17
top searches         302  94        19
search history       276  86        53
Table 5. Search Result Options Adopted by Users

Search Options                                   Yes  Ordinary  No
automatic data listing                           348  58        9
category selection                               330  60        25
instant previews                                 294  92        30
automatically filter results by type of content  282  105       28
double window                                    272  90        53
various interface                                219  135       61
• Users' perspectives of search engine result pages
In terms of the search result interface, 330 respondents (80%) regard "category selection" (providing classified search results based on the entered query) as a useful function, 348 (84%) consider "automatic data listing" (search results automatically listed in different types) useful, 219 (53%) identified "various interface" (different queries automatically outputting diverse result pages), 294 (71%) identified "instant previews" (clicking once on an icon in the search results to instantly see a preview of the Web page), 272 (66%) identified "double window"
(showing search results provided by different search engines on the same page), and 282 (68%) identified "automatically filter results by type of content" (search results appearing in accordance with user habits, queries, and preferences) as useful. In order of need, the output functions are automatic data listing, category selection, instant previews, automatically filter results by type of content, double window, and layout settings; each function is cited as needed by over 85% of users (Table 5). Among the 415 participants, the preference ranking for search engine entrances is Yahoo!, Google, and MSN Search, in that order, and user habits account for most preference factors, with up to 272 responses (over 66%). Regarding current search engine displays, 322 respondents (78%) do not consider the current displays the best, while 93 regard them as optimal; over 90% (409 users) think that search engine interfaces could be more varied. Accordingly, the common listing style of search result displays does not fully meet user needs.

• Use satisfaction with current search interfaces
For the 415 search engine users, the three most representative Web2.0 Websites are YouTube, Wikipedia, and Facebook (Table 6). WRETCH (www.wretch.com) scores the highest usage rate (45%) among users who have set up blogs, though not every user keeps a blog (Table 7). Regarding blog-related functions, 346 respondents (83%) regard "personal management" as a helpful function, 375 (90%) identified "interface layout settings," 388 (93%) identified "edit/share/delete/revise," 382 (92%) identified "reply/leave a message," and 297 (72%) cited "tool options." In order of need, the blog-related functions are edit/share/delete/revise, reply/leave a message, interface layout settings, personal management, and tool options; each function is identified as needed by over 90% of users (Table 8). Regarding Web-related function needs among the 415 users, Instant Messenger functions (e.g., MSN, Skype, Google Talk, Yahoo Messenger) that provide users with instant sharing and responses are regarded as useful by 407 respondents (98%), while 8 do not consider them so; 380 respondents (91%) regard Wikipedia-style functions that allow every user to edit and revise Website information as helpful for the accuracy of information and knowledge, while 35 (9%) do not. Based on the survey statistics, most search engine users use varied and distinctive Websites such as YouTube and Wikipedia, which feature interactive sharing and score in the top two for use experience; moreover, the usage rate of blogs as a Web2.0 feature counts for up to 92%, and the user need for personalised management functions is over 90%.

• Specific needs of Internet users
Regarding specific needs among the 415 respondents, 310 (75%) regard "delete/revise search results" as a useful function, 273 (66%) identified "result display by hits ranking," 348 (84%) cited "personalised search results," and 353 (85%) reported that "share/integrate search results" is useful. In order, the specific needs of Internet users are share/integrate search results, personalised search results, delete/revise search results, and result display by hits ranking (Table 9); each function is needed by over 90% of users. Moreover, over 90% (384 respondents) think that being able to simultaneously select different search engines is helpful.
Table 6. Web2.0 Websites of Use

Website    Participants    Website      Participants
YouTube    391             funP         29
Wikipedia  373             Twitter      24
Facebook   154             MyShare      23
Flickr     115             Digg         12
Amazon     94              HEMiDEMi     10
Plurk      72              Del.icio.us  8
Table 7. Blog Usage

Blogs       Participants    Blogs   Participants
WRETCH      318             PChome  21
MSN         68              Sina    7
Yahoo!Blog  64              others  18
blogger     56              No      59
Xuite       49
YAM         48
Table 8. SE Users' Favourite Blog Features

Blog Features              Yes  Ordinary  No
edit/share/delete/revise   388  24        3
reply/leave a message      382  31        2
interface layout settings  375  35        5
personal management        346  63        6
tool options               297  90        28
Table 9. Specific Needs of Internet Users

Specific Needs                  Yes  Ordinary  No
share/integrate search results  353  51        11
personalised search results     348  58        9
delete/revise search results    310  89        16
result display by hits ranking  273  99        43
3 A Personalised and Web2.0-Based Search Engine Interface

Our examination of current search engine interfaces, covering query input, the search box, search result pages, and search engine options and features, shows that the available small tools and features, together with habitual practices, preferences, user habits, and user acceptance in the Web2.0 trends, shape search engine interface needs, with rising expectations and preferences. The variety of Web2.0-related applications and services serving diverse use experiences also affects the habits and preferences of current users, who expect more personalised services. Because of keyword usage, query input has become the most deep-rooted concept and thus a habit for users; the value of providing appropriate query assistance, such as relevant keywords, spelling correction, spelling suggestion, top searches, and search history, is also demonstrated in this study. Based on our investigation of user needs and preferences in the usage of search engine interfaces, this study proposes the following features with which search engines should be equipped:

• Mashup and tagging
Data displays, which over 90% of the respondents consider helpful, should allow different search engines to be selected simultaneously; moreover, in terms of query input and search result output, search engine functions strongly appeal to user requirements
and acceptance, and users expect more personalisation- and interaction-related functions in search engine interfaces in the current Web2.0 trends. With respect to current development trends, there are hundreds of assorted search engines on the Internet, but they are independently developed and scattered; despite their various functions and features, they struggle against Yahoo! and Google for users. Accordingly, these scattered search engines should be integrated so that search engine interfaces better meet user needs and provide more suitable and personalised user interfaces. For example, as shown in Fig. 1, a personalised and Web2.0-based search engine interface takes account of user needs and preferences and presents itself in an integrated way, involving users through the community concept. For query input, search result displays, personal settings, and search results, the interface offers users personalised search methods and results, making search procedures more convenient and better matched to personal needs.

• Creating users' personalised search engine interfaces
Personalised services following a user-centred approach, with integrated interaction and sharing, have become current trends. Thus, search engine interfaces, which affect user selection, should integrate a variety of Web2.0 features to cater to the behaviour patterns of current users. User habits have become an important factor for search engine users; in short, search engine output should be generated based on the personal needs of each individual. Search engines should automatically record users' habitual practices and behaviour patterns, and result displays should vary with different queries so that interfaces are diverse enough to meet user needs. For example, as shown in Fig. 1, there are four types of search results on the original display interface: general search engine, information sharing, video, and image. Each type offers several logos of different search engines in the upper right corner of its window; these search engines have the highest usage rates or are currently the most representative. Hence, when a user searches for the keyword "Clownfish" in the interface, the results of different search engines are presented simultaneously, including text, as with Google, knowledge, as with Wikipedia, video, as with YouTube, and images, as with Flickr. Users may click on the small icons of other search engines in the upper right corner to review the results provided by those engines, or use the drop-down arrow to select more search engines. Moreover, the search engine automatically records the preferences and habits of every user in addition to the types of most-used search engines. Users may also re-arrange the windows in the result display at will, adjust their size, zoom in or out, and close or add a window by clicking the upper right icon, according to personal preference, so that each user acquires a personalised search engine interface (a minimal code sketch of this mashup idea follows below). Thus, search engines should offer users personalised interfaces with layout settings (e.g., colour, position) and tool options, and offer a variety of personalised management options in accordance with the behaviour patterns of Website usage (e.g., blogs) in the current Web2.0 trends, increasing efficiency and selectivity as users use search engine interfaces.
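The following hypothetical sketch (Python; the fetch_* functions and result fields are invented placeholders, not real search engine APIs) illustrates the mashup idea of fanning a query out to several engines and ordering the result panels by recorded user preference:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_text_results(query):      # placeholder for a web-search API
    return [{"type": "text", "title": f"{query} page", "source": "web"}]

def fetch_image_results(query):     # placeholder for an image-search API
    return [{"type": "image", "title": f"{query} photo", "source": "images"}]

def fetch_video_results(query):     # placeholder for a video-search API
    return [{"type": "video", "title": f"{query} clip", "source": "video"}]

FETCHERS = [fetch_text_results, fetch_image_results, fetch_video_results]

def mashup_search(query, preferred_types):
    """Query all engines in parallel, then order panels by user preference."""
    with ThreadPoolExecutor() as pool:
        results = [r for batch in pool.map(lambda fn: fn(query), FETCHERS)
                   for r in batch]
    # Personalisation: panels the user favours (recorded habits) come first
    return sorted(results,
                  key=lambda r: preferred_types.index(r["type"])
                  if r["type"] in preferred_types else len(preferred_types))

print(mashup_search("Clownfish", preferred_types=["image", "video", "text"]))
```

In a deployed interface, the preferred_types list would be derived from the automatically recorded usage habits described above rather than passed in by hand.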
• Social aspects: user contribution, recommendation, and sharing search results with others
Search engines should allow the editing, revision, and deletion of result rankings and replies to result content. Search engines should also display results by hit ranking so that the suitability and accuracy of information rankings are filtered by users collectively, and integrated search results are presented accordingly, increasing the precision of the information and knowledge returned. Through the integration and sharing of the personalised search results of users from different fields, perhaps with online instant sharing and feedback, the community contribution of users in these Web2.0 trends will shorten the time spent clicking through data and help users acquire optimal search results. For example, as shown in Fig. 1, after users search for the keyword "Clownfish," they may click on the icons to the right of each search result to edit images, make a deletion, or post a comment. Users may re-arrange the order of search results as they desire and read other users' comments on a result; accordingly, users acquire personalised search results for the term "Clownfish." Users may click on the icons to the left of search results to show approval. Moreover, every user may share edited search results with other users as a reference; users from different fields thus share information and acquire more complete, accurate, and diverse search results. Users are considered integral to interface design. In order to develop a new user-centred interface design for search engines, this study identified issues in search engine interface design through a literature review, and a questionnaire survey
Fig. 1. A Personalised Search Engine Interface: A New Search Engine Paradigm
examined user habits and needs for search engine interfaces, as well as user views and preferences regarding interface functions in the current Web2.0 trends, in order to derive design references for a new search engine interface. This study proposes a personalised and Web2.0-based search engine interface to meet every user's needs when accessing a search engine. The new interface provides general functions, like Google, together with unique functions and interface designs such as various input/output interfaces, personalised layout settings, search result sharing, knowledge sharing, and integrated functions and interfaces. Every user may edit and arrange result data to cater to general needs. The interface offers assistance and options for keyword input and integrates the results of every search so that users may simultaneously acquire diverse search results and select commonly used search engines based on personal habits, meeting user needs in accordance with Web2.0 trends. Every user has a personalised search engine interface and may set the layout, language, and function icons based on habitual practices and preferences, allowing more convenience while using the interface. Overall, the new search engine interface provides users with an integrated interface and related services, and aims to improve the convenience and efficiency of search procedures in the current Web2.0 trends.

Acknowledgements. This research was funded by the Taiwan National Science Council (NSC 99-2410-H-182-032-).
References

1. Wiza, W., Walczak, K., Cellary, W.: Periscope: A System for Adaptive 3D Visualization of Search Results. In: Proceedings of the Ninth International Conference on 3D Web Technology. ACM, Monterey (2004)
2. Jansen, B.J., Spink, A.: How Are We Searching the World Wide Web? A Comparison of Nine Search Engine Transaction Logs. Information Processing & Management 42(1), 248–263 (2006)
3. Yang, H.-C., Tzeng, M.-C., Yang, C.-Z.: A Web Interface for Visualizing Web Search Engine Results (2002)
4. Google (1998), http://www.google.com.tw/ (cited January 24, 2011)
5. StatCounter: StatCounter Global Stats Top 5 Search Engines. StatCounter (2010)
6. YAHOO! (1994), http://search.yahoo.com/ (cited January 24, 2011)
7. Bing (2009), http://www.bing.com/ (cited January 24, 2011)
8. FundooWeb (2006), http://fundooweb.com/ (cited January 24, 2011)
9. KartOO (2001), http://www.kartoo.com/ (cited January 24, 2011)
10. Omgili (2006), http://omgili.com/ (cited January 24, 2011)
11. O'Reilly, T.: Web 2.0 Compact Definition: Trying Again. O'Reilly Radar, Sebastopol (2006)
12. Deshpande, A., Jadad, A.R.: Web 2.0: Could It Help Move the Health System into the 21st Century? The Journal of Men's Health & Gender 3(4), 332–336 (2006)
13. Needleman, M.: Web 2.0/Lib 2.0 - What Is It? (If It's Anything at All). Serials Review 33(3), 202–203 (2007)
14. Mason, R., Rennie, F.: Using Web 2.0 for Learning in the Community. The Internet and Higher Education 10(3), 196–203 (2007)
15. Stephens, M., Collins, M.: Web 2.0, Library 2.0, and the Hyperlinked Library. Serials Review 33(4), 253–256 (2007)
16. Hendler, J., Golbeck, J.: Metcalfe's Law, Web 2.0, and the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web 6(1), 14–20 (2008)
17. Boulos, M.N.K., Wheeler, S.: The Emerging Web2.0 Social Software: An Enabling Suite of Sociable Technologies in Health and Health Care Education. Health Information & Libraries Journal 24(1), 2–23 (2007)
18. Chau, M., Wong, C.H.: Designing the User Interface and Functions of a Search Engine Development Tool. Decision Support Systems 48(2), 369–382 (2010)
19. Huang, S.-T., Tsai, T.-H., Chang, H.-T.: The UI Issues for the Search Engine. In: Proceedings of the 11th IEEE International Conference on Computer-Aided Design and Computer Graphics, Yellow Mountain City, China (2009)
20. Pew Internet & American Life Project: Search Engine Use (2008)
Cyber Risks to Secure and Private Universal Access

G. Susanne Bahr1, Liam M. Mayron2, and Hannah J. Gacey1

1 Psychology, Florida Institute of Technology, Melbourne, Florida, USA
2 Government Communications Systems Division, Harris Corp., Melbourne, Florida, USA
[email protected],
[email protected],
[email protected]

Abstract. The overarching goal of UA is affording all individuals, regardless of disabilities, geographic status, infrastructure, age, or training background, the use and benefit of information technology. The goal of this paper is to raise awareness of the security and privacy cyber risks in the everyday use of UA technology. The challenge of UA technology transfer from the laboratory to real-world settings leads us to identify vulnerabilities of UA users and to present examples of cyber security strategies to safeguard data. The principles of confidentiality, integrity, and availability guide our non-exhaustive review of concrete UA approaches and their security and privacy implications for everyday use. We examine the cyber risks to the privacy and security of brain computer interfaces and of UA home networking, and conclude with a call for interdisciplinary collaboration between the security and UA expert communities to ensure the transitioning of safe and secure UA technologies to end-users.

Keywords: Universal Access, Privacy, Security, Technology Transfer, Brain Computer Interfaces, Networks, Hacking, Interdisciplinary Research.
1 Introduction

The overarching goal of universal access (UA) in human computer interaction (HCI) is affording all individuals, regardless of disabilities, geographic status, infrastructure, age, or training background, the use and benefits of information technology. The UA research agenda has prolifically generated new ideas, methodologies, and solutions for universal access to technology [1], [2], [3], [4]. At the same time, there remain facets of the UA challenge that are conspicuous by their minor treatment in the UA literature: privacy and security. For instance, in the 2009 Handbook of Universal Access, only one of sixty-one chapters deals explicitly with security [5]. In fact, with the exception of Maybury's concise chapter on international security policies, the topic is mentioned implicitly or treated superficially, if at all. It is difficult to deny that privacy and security concerns for users and their families will likely emerge from the UA axioms of inclusiveness and diversity. Clearly, universal access is not intended as a bidirectional flow of information
resulting in users being universally accessible. Potential issues may be less evident in laboratory settings because researchers are accountable to institutional review boards, which are governed by federal and international laws. Instead, privacy and security issues become tangible and consequential as UA solutions migrate from the laboratory to the real world, i.e., to commercially available technology and real networks. This challenge is at the core of this paper: what are the security and privacy cyber risks in the everyday use of UA technology? For example, should advertisers have access to interaction data that allows them to determine which users have sensory impairments or where they are located? How do we protect health-related data collected in non-traditional interaction paradigms, such as brain computer interfaces, from unwanted disclosure or corruption? These questions not only introduce privacy concerns but also raise security matters. For example, the site "pleaserobme.com" educates its users on how easily the current location and the location trails of a mobile device, i.e., its user, can be tracked using Google maps [6]. The site suggests that the lack of location privacy makes users' homes vulnerable to burglars. Such information, in combination with health information, may be exploited for other dark purposes left to the reader's imagination. Another example deals with health-related data collection and storage: the US Department of Health has recorded 216 violations of health information privacy since 2009 for databases that were hacked or left unprotected. The number of records involved, i.e., items of confidential data of real people, reaches above one million in a case of hard drive theft from BlueCrossBlueShield, Tennessee. Other types of breaches include loss, improper disposal, hacking, and unauthorized disclosure [7]. The remainder of this paper presents UA vulnerabilities and reviews cyber security strategies and objectives, leading to a brief, non-exhaustive review of UA approaches and their security and privacy implications for everyday use. Some of our ideas may seem grotesque, overstated, or paranoid. We acknowledge that media confessions of hackers and anecdotal evidence suggest that hacking into confidential and secured spaces is often perceived as a challenge, without intent to harm or the harboring of sinister intentions; for transcripts of interviews see [8]. However, such observations trivialize the problem and remain insufficient evidence to dismiss the need to protect the unsuspecting UA user from being unethically universally accessed. Hence, to protect privacy and security, we have to enter the dark side of universal access technology in the real-world network of networks.
2 Universal Access Targets and Vulnerabilities

Universal access includes those individuals who are otherwise unable to gain access to information technology. They are offline and perhaps even unlisted or "ungooglable", as epitomized in The New Yorker cartoon by Roz Chast [9]. What makes these disconnected individuals unique, or what excludes them from access? Primarily, they do not represent the average user of standard interfaces for high speed networks. Apart from making a deliberate decision against technology access, non-accessing individuals may be socioeconomically or geographically without access, or virtually disconnected due to low speed or low bandwidth connections. It appears
reasonable to argue that the UA users who are the most vulnerable and require the most protection from exploitation are individuals with motor or sensory disabilities or with cognitive impairments and medical issues. For example, the elderly have been invisibly disconnected due to unfamiliarity with technology, training issues, and a lack of age-appropriate interfaces. Consequently, a paramount UA approach to empowering "all" to avail themselves of information technology is to overcome barriers through user models and interface design. Such efforts are not without risks that require identification and management. Concerns about consumer privacy, health data privacy, demographic data privacy, and location privacy are not new to the modern network of networks, but they may escalate as UA succeeds. In this context, one might think of "cyber-patients" and their physical or psychological safety when they depend on web-based services. While Internet practices and safeguards are ongoing discussion topics, reliance on unverified assumptions about user knowledge, skills, abilities, security awareness, or compliance is unlikely to control risk. Instead, research must become sensitive to security issues and invoke processes for designing with security in mind, thereby making networks, including the Internet, relatively safe places for interactions and transactions.
Fig. 1. The Ungooglable Man by Roz Chast (reprinted with permission of The Cartoon Bank, a New Yorker Company/Condé Nast Publications Inc.)
3 Cyber Security Today

Security is a fundamental consideration in contemporary computation. For example, malware is a major concern because, as the name suggests, it is software designed to do harm. Harming a system may involve harvesting private or sensitive data, altering or corrupting data, compromising the processing of data, or denying access. Malicious code may traverse networks such as the Internet to seek a host. Indeed, being offline seems like a safe place for users and their computers. However, as malware such as Stuxnet has shown, even systems that are isolated from an external network (called "air gapped") are vulnerable [10]. For background, Stuxnet is a self-replicating worm that appears to attack industrial control systems. Its discovery in 2010 was high-profile news because it infested the internal networks of nuclear facilities. Isolation from the Internet may seem a safe strategy to protect data and software from misuse and corruption; the Stuxnet example, however, illustrates that even air-gapped systems may be vulnerable. The question arises: how do we protect privacy and data? The states of science and practice have identified many ways to secure a system [11]. The following is an overview of how to secure system health and data assuredness [12]. It comprises at least seven strategies: network intrusion detection, perimeter defense, encryption, malware detection, patch management, backups, and file reputation systems. The overview starts at the highest (network) level and ends with the assurance of individual files.

• Network intrusion detection: An intrusion detection system (IDS) analyzes network traffic (information flowing into and out of the network). The purpose of this analysis is to identify suspicious patterns of behavior that may indicate an attack or a probe of system vulnerabilities. A sophisticated IDS may include a honeypot system that simulates vulnerabilities in order to attract and identify hackers.
• Perimeter network defense: This type of defense includes systems such as firewalls that reside between the internal network and the Internet. A firewall may be installed on a computer or around an internal network of a group of computers. Authorized traffic may pass through a firewall, whereas unauthorized traffic is denied [13]. Firewalls can be implemented in hardware, software, or both. The rules for categorizing traffic vary and can be set to different levels of permissiveness and scrutiny.
• Encryption: Encryption is an approach for securing data by transforming an intelligible pattern into an apparently random one [14]. The goal is to create and use secrecy to protect information that is vulnerable to being tapped or stolen. In general, ciphers (algorithms used to transform plain text into something unreadable) have a long history of being invented and broken (see Singh, The Code Book, 2000). An everyday example of the use of encryption is the Hypertext Transfer Protocol Secure (HTTPS), which is used as the scheme for secure transactions such as online banking, e.g., https://www.bankofamerica.com [15]. (A minimal code sketch of symmetric encryption appears after this list.)
• Host-based malware detection: Host-based malware detection, which resides on one's computer, includes antivirus software. It typically consists of actively-running software processes that detect and, if possible, contain hostile infections as they
occur. Antivirus software often includes defenses against Trojans, worms, and other malicious programs, and it requires regular updates to be prepared for new developments, which leads directly to the next item.
• Patch management: To remain secure, a system must have a way to update itself in order to correct vulnerabilities that have been identified and exploited since the system was originally deployed. Generally, the majority of resources in the lifetime of a software system are expended after the initial development [16].
• File reputation systems: Systems, particularly those encountering large amounts of data of unknown provenance, must be able to judge, with reasonable certainty, the risk of opening a file, executing a process, or communicating with a remote host. File reputation systems can reference remote risk databases in order to augment their results.
• Backups: In the event that a system is compromised beyond repair, a security strategy must include backups and the ability to restore the system to a known working state. The backups must be updated and checked regularly and systematically, and hence require maintenance to be of use.

The objective of using and integrating these strategies is to secure data and system health against four types of cyber-attack: disclosure, theft, intentional damage, and denial of service [17]. In order for a system to be resistant to such threats, it must satisfy three universal security and privacy requirements [18]:

1. Confidentiality: Certain data must not be made public or should be restricted to certain users or groups of users. Hackers, malware (malicious programs), and unauthorized insider activity are several of the threats to confidentiality.
2. Integrity: Data must remain impervious to unauthorized alteration or damage. Users must have confidence that the data they interact with have not been tampered with in an improper way.
3. Availability: A system must be able to provide access to data when it is needed. System downtime, intentional or otherwise, interrupts availability. Although a power outage or a natural disaster may be a natural cause of an interruption in availability, it is also possible to intentionally disrupt a system, for example through a distributed denial-of-service attack.

As the UA user group grows and its inherent diversity multiplies, the principles of confidentiality, integrity, and availability await being addressed and may become more challenging, as seen in the next section.
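As a concrete illustration of the encryption strategy above, the following minimal sketch uses the Fernet recipe from the Python cryptography package; the example data and the key handling are illustrative only, not a deployment-ready design:

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # symmetric key; must itself be stored securely
cipher = Fernet(key)

# Confidentiality: the plaintext is transformed into an unintelligible token
token = cipher.encrypt(b"interaction log: user has a sensory impairment")

# Integrity: Fernet tokens are authenticated, so tampering is detected
# (decrypt raises InvalidToken if the ciphertext has been altered)
plaintext = cipher.decrypt(token)
assert plaintext.startswith(b"interaction log")
```

Note that such a recipe addresses confidentiality and integrity of stored or transmitted data, but not availability, which requires the complementary strategies (backups, patching) listed above.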
4 Universal Access Users under Attack

The purpose of this section is to revisit UA concepts and research to consider emerging privacy and security concerns. We return to the question at the core of this paper: what are the security and privacy risks in the everyday use of UA technology? The examples presented here examine the security and privacy implications of brain computer interfaces and of home networking for UA.

4.1 Brain Computer Interfaces

The concept of the Brain Computer Interface (BCI) is based on the measurement and interpretation of the cortical activity of a human user in real time, so that this human may
interface with a system using his or her brain activity. Interfacing thus changes from motor activities, such as typing and clicking, to thinking about typing or clicking. It has been noted that BCIs are of particular benefit to individuals with disabilities or in extreme circumstances that involve severe communication and motor control problems. Likewise, as a non-traditional human-to-system communication tool, BCIs offer an additional input/output alternative for traditional computer users. For an overview of BCIs see [19].

At the core of this chapter is the question of which cyber threats and safeguards may affect UA technology transfer, or UA solutions that are commercially deployed. In the case of BCIs, all three requirements (confidentiality, integrity, and availability) seem relevant. For instance, a BCI system whose confidentiality has been compromised may provide physiological and interaction data of the user to outside parties. Such a breach is similar to accessing medical records without obtaining consent or HIPAA-like (Health Insurance Portability and Accountability Act) agreements. This offense is not trivial; in the USA, violators of patients' privacy rights can be held accountable with civil and criminal penalties [20]. Encrypting the BCI data stream may be a defense, but it has been shown that even encrypted communications are susceptible to the man-in-the-middle attack [21]. To execute a man-in-the-middle attack, the perpetrator intercepts communications between two parties (or, in the case of BCIs, the user and the system) and relays them as intended or altered. Eavesdropping alone, without alterations, may not be detectable even in closed-loop interfaces. The questions for technology transfer are, at a minimum, whether BCI users are aware of the risks and how to implement safeguards to protect these data.

Next, a BCI whose integrity has been corrupted may not disclose private information but may act and function in an undesirable or harmful manner. BCIs require calibration and user profiles to match brain activity to the desired interaction. A compromised system may actuate undesirable inputs in response to brain activity. An example of an unwanted but perhaps harmless command would be sending an email draft instead of the final and intended communication. Other scenarios of mismatched intended and actuated input due to system compromise include deleting instead of saving, or typing erroneous characters in the case of a BCI speller [22]. In addition to BCI use at home, BCIs have been explored for driving and aviation tasks. We leave the consequences of compromised system integrity of a BCI used to operate a motor vehicle to the reader's imagination [23].

Availability limitations, finally, may not only raise user frustration but also impact safety. For a BCI to transcend novelty as an alternative and perhaps chic input device, it has to be reliably available. Users who depend on BCIs to connect to services, to work, and with friends and family cannot be subjected to noticeable maintenance and downtimes, i.e., intermittent availability. In summary, when considering the transfer of BCI technology from the laboratory to real-world settings, we meet challenges regarding the confidentiality, integrity, and availability of UA systems.
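One conceivable safeguard for the integrity of a BCI channel is to authenticate each transmitted message with a keyed hash, so that alterations by a man-in-the-middle become detectable (eavesdropping, as noted above, would still not be). The sketch below is purely illustrative, not a documented BCI protocol: the message format, field names, and shared key are invented for the example, and only the Python standard library is used.

```python
# Illustrative integrity protection for a (hypothetical) BCI command message,
# using an HMAC tag; standard library only.
import hashlib
import hmac
import json

KEY = b"shared-secret-provisioned-at-calibration"  # assumed key exchange

def sign(message: dict) -> dict:
    """Attach an HMAC-SHA256 tag to a message."""
    payload = json.dumps(message, sort_keys=True)
    tag = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify(envelope: dict) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(KEY, envelope["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])

envelope = sign({"command": "save", "timestamp": 1234567890})
assert verify(envelope)

# A man-in-the-middle who rewrites "save" to "delete" cannot forge the tag:
envelope["payload"] = envelope["payload"].replace("save", "delete")
assert not verify(envelope)
```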
Without resolving the privacy and security concerns outlined above, the deployment of BCIs may enhance the quality of life of potential users but also introduce unmitigated risks with uncertain consequences for the user. Next, we explore UA vulnerabilities in the context of home networking.
4.2 Universal Access and Home Networking

Technology and innovation are rapidly presenting consumers with new products for interfacing with the Internet, many of them marketed for their ease of use. One such product is low-level home monitoring, which lets consumers invite information-gathering artifacts into their personal space (their home) for small-scale telemonitoring and the ability to "watch" their spaces [24]. Monitoring may also be accomplished with mobile devices, to the benefit of off-site caregivers who wish to keep a virtual eye on the activity of, e.g., an elderly or disabled family member [25]. In addition to surveillance, such systems may be smart, i.e., equipped with (a) sensors to detect user state and (b) actuators to adjust the environment or artifacts in line with user needs [26]. Data gathered can be made intentionally available to a range of parties, from emergency services to the individual consumer using, for example, a web-browser-enabled mobile phone. In this context, Baillie, Pucher and Képesi (2004) point out that "the user should feel a sense of privacy and control over any artifacts proposed for their home space" [24, pg. 377]. Unfortunately, meeting this reasonable expectation is not a trivial security problem.

The ease of penetrating a home that is connected to the Internet was demonstrated recently in a high-profile court case in Germany. The incident involved approximately 150 underage girls whose privacy was invaded by taking external control of in-home webcams. The individual charged had approached his teenage targets using social networking sites [27]. The case demonstrated that granting control of one's personal webcam can be as easy as clicking on an attached photo. The click not only opens the desired picture but also, without the user's knowledge, installs a virus that enables control of the webcam by a remote user. This remote user can then turn the camera on and off and, if so inclined, record one's personal space and post the video, for instance, on www.youtube.com [28].

Unprotected home networks that reveal user activity put confidentiality at stake. Moreover, integrity must be questioned wherever access to network activity has been authorized unintentionally. It is perhaps counterintuitive, but in this context the requirement of system availability can itself create a threat to confidentiality: a system that is reliably available but compromised may not only provide access to the user as intended, but also expose user actions and data to outside parties. Similarly, being visible during UA requires that users know which data are being revealed and collected. Profile Extractor (PE) is an example of user profiling based on Internet behaviors. Here, information learned from users is collected and analyzed to construct accurate and comprehensive profiles and usage patterns. Such profiles can be leveraged in the interest of the user to capture user preferences, but they may also be exploited to expose private patterns of behavior and decision making. Particularly for UA, the use of profiles in e-commerce may include individuals whose executive functioning is impaired, which may involve suboptimal decision making and lacking control over impulsive behaviors. Such profile data can be used to safeguard individuals, e.g., to protect against out-of-control shopping sprees or irresponsible transactions in general.
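To make concrete how profiles such as those built by PE might arise from raw activity data, the sketch below aggregates a fabricated clickstream into per-category frequencies; the events, categories, and timestamps are invented for illustration, not taken from PE itself.

```python
# Minimal sketch of behavioral profiling from a clickstream; the event
# data and categories are invented for illustration.
from collections import Counter

clickstream = [
    ("2011-01-05 21:40", "pharmacy"), ("2011-01-05 21:52", "banking"),
    ("2011-01-06 02:13", "shopping"), ("2011-01-06 02:31", "shopping"),
    ("2011-01-06 02:45", "shopping"),
]

# The profile is simply a frequency summary of visited categories.
profile = Counter(category for _, category in clickstream)
print(profile.most_common())  # e.g., [('shopping', 3), ('pharmacy', 1), ...]

# Even this toy profile is revealing: a burst of late-night shopping events
# could trigger a protective spending safeguard for a vulnerable user.
```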
Likewise, the same information may also facilitate target identification for phishing, cyber stalking, or other forms of exploitation. In summary, UA with home networking adds another layer of complexity to the challenge of safeguarding confidentiality, integrity, and availability. Not all aspects of
home networking that are critical to UA were covered here. For instance, network speed and bandwidth raise concerns for UA that will be addressed in future work.
5 Conclusion

As the UA field continues to grow, solutions are increasingly likely to transfer from laboratories to commercial markets. In these everyday, non-laboratory environments, it is plausible that increasing amounts of sensitive personal information are generated and possibly exposed. Preventing misuse is imperative to an overall system security strategy. Intentionally or not, end-users may execute malicious code or disclose information that is damaging to the network, to the system or, more importantly, to themselves. Attacks can involve social components that create even greater weaknesses in any system by exploiting the vulnerabilities of users. In the absence of UA systems that are capable of truly ensuring confidentiality, integrity, and availability, protection and security against "universally accessing" the user cannot be guaranteed.

The purpose of this paper is to explore and identify privacy and security concerns that are likely to influence the success of transitioning UA research to real-world users living with real-world networks. We did not focus on the security adage that users must be instructed in proper protocols for remaining secure. One reason is that although user education is important, it is unlikely to be the optimal or only strategy: it requires development, knowledge of the trainees' cognitive and motor abilities, unimpaired user agency for decision making and, last but not least, that trainees have access to training. Our main reason for choosing this paper's topic is that UA scientists and security experts have not yet developed a rich dialogue or interdisciplinary projects. As a result, the awareness, identification, elimination, and management of cyber risks to privacy and security have not been addressed for UA. Interdisciplinary collaboration between the security and UA expert communities appears to be the necessary next step toward controlling cyber risks, thereby enabling the successful, safe, and secure transition of UA technology to end-users.
References

1. Universal access: theoretical perspectives, practice, and experience. In: Carbonell, N., Stephanidis, C. (eds.) 7th ERCIM International Workshop on User Interfaces for All. Springer, Berlin (2003)
2. User-centered interaction paradigms for universal access in the information society. In: Stary, C., Stephanidis, C. (eds.) 8th ERCIM Workshop on User Interfaces for All. Springer, Berlin (2004)
3. Stephanidis, C. (ed.): Universal Access in HCI: Towards an Information Society for All, vol. 3. Lawrence Erlbaum Associates, Mahwah (2001)
4. Stephanidis, C. (ed.): User Interfaces for All: Concepts, Methods, and Tools. Lawrence Erlbaum Associates, Mahwah (2001)
5. Maybury, M.T.: Security and Privacy for Universal Access. In: Stephanidis, C. (ed.) The Universal Access Handbook, pp. 1–11. CRC Press, Boca Raton (2009)
6. Please Rob Me: Raising Awareness about Over-sharing, http://pleaserobme.com/ (retrieved January 7, 2011)
7. U.S. Department of Health & Human Services: Breaches Affecting 500 or More Individuals, http://www.hhs.gov/ocr/privacy/hipaa/administrative/breachnotificationrule/postedbreaches.html (retrieved February 6, 2011)
8. Frontline by PBS, http://www.pbs.org/wgbh/pages/frontline/shows/hackers/interviews/ (retrieved February 13, 2011)
9. Chas, R.: The Ungooglable Man. Reprinted with permission from The Cartoon Bank/Conde Nast, The New Yorker (March 22, 2010)
10. Falliere, N., Murchu, L.O., Chien, E.: W32.Stuxnet Dossier. Symantec Security Response, 1–69 (2011)
11. Pfleeger, C.P., Pfleeger, S.L.: Security in Computing, 4th edn. Prentice Hall, Upper Saddle River (2006)
12. Mayron, L.M., Bahr, G.S., Balaban, C., Bell, M., Ford, R., Fox, K.L., Henning, R.R., Smith, W.B.: A hybrid cognitive-neurophysiological approach to resilient cyber security. In: The 2010 Military Communications Conference – Unclassified Program – Cyber Security and Network Management (MILCOM 2010-CSNM), San Jose, California (October 2010)
13. Cheswick, W., Bellovin, S., Rubin, A.: Firewalls and Internet Security: Repelling the Wily Hacker, 2nd edn. Addison-Wesley Longman Publishing Co., Inc., Boston (2003)
14. Mayron, L.M.: Secure Multimedia Communications. IEEE Security & Privacy 8(6), 76–79 (2010)
15. Bank of America, https://www.bankofamerica.com/ (retrieved January 7, 2011)
16. Schach, S.: Object-Oriented and Classical Software Engineering. McGraw-Hill, New York (2010)
17. Amoroso, E.: The Effects of Cyber Attacks. In: AT&T, Cyber Security, ch. 3. Silicon Press, Summit (2007)
18. Greene, S.S.: Security Policies and Procedures: Principles and Practices. Prentice Hall, Englewood Cliffs (2006)
19. Adams, R., Bahr, G.S., Moreno, B.: Brain Computer Interfaces: Psychology and Pragmatic Perspectives for the Future. In: Proceedings of the Artificial Intelligence and Simulation of Behaviour (AISB) 2008 Symposium on Brain Computer Interfaces and Human Computer Interaction: A Convergence of Ideas, vol. 5(1-6). The Society for the Study of Artificial Intelligence and Simulation of Behaviour, Aberdeen (2008)
20. U.S. Department of Health & Human Services: What does the HIPAA Privacy Rule do?, http://www.hhs.gov/ocr/privacy/hipaa/faq/privacy_rule_general_topics/187.html (retrieved February 6, 2011)
21. Asokan, N., Niemi, V., Nyberg, K.: Man-in-the-middle in tunnelled authentication protocols. In: Christianson, B., Crispo, B., Malcolm, J.A., Roe, M. (eds.) Security Protocols 2003. LNCS, vol. 3364, pp. 28–41. Springer, Heidelberg (2005)
22. Thulasidas, M., Guan, C., Wu, J.: Robust classification of EEG signal for brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering 14(1), 24–29 (2006)
23. Augmented Cognition International Society, http://www.augmentedcognition.org/ (retrieved February 12, 2011)
24. Baillie, L., Pucher, M., Képesi, M.: A supportive multimodal mobile robot for the home. In: Stary, C., Stephanidis, C. (eds.) User-Centered Interaction Paradigms for Universal Access in the Information Society: 8th ERCIM Workshop on User Interfaces for All, pp. 375–383. Springer, Berlin (2004)
25. Becker, S.A., Webbe, F.M.: Designing for Older Adult Users of Handheld Technology. In: Proceedings of the 28th IEEE Engineering in Medicine and Biology Society Annual International Conference, pp. 3297–3300 (2006)
26. Cook, D., Das, S.: Smart Environments: Technology, Protocols and Applications. Wiley Series on Parallel and Distributed Computing. Wiley-Interscience, Hoboken (2005)
27. Hacker spannt junge Mädchen per Webcam [Hacker spies on young girls via webcam], http://www.stern.de/digital/online/150-faelle-hacker-spannt-junge-maedchen-per-webcam-1584091.html (retrieved January 6, 2011)
28. YouTube, http://www.youtube.com/ (retrieved January 6, 2011)
Towards Information Technology Security for Universal Access

Shiran Cohen, Noam Ben-Asher, and Joachim Meyer

Department of Industrial Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Deutsche Telekom Laboratories at BGU, Beer-Sheva, Israel
{shirang,noambena,joachim}@bgu.ac.il
Abstract. One way to secure Information Technology (IT) systems is with authentication mechanisms that distinguish between users. Users who differ in their cognitive and motor abilities, cultural background, and personal characteristics should all be able to operate an IT system, including its security features. If system design fails to consider user diversity, users might bypass or disable the security features, reducing system security. Providing universal accessibility and acceptability is generally a challenge, especially when dealing with IT security. We present a conceptual model that explores and establishes guidelines for the inclusion of biometric authentication in systems that serve a wide range of users. Aspects of this model were examined in a laboratory setting using a task that simulates mobile access to an eBanking system with biometric authentication. Younger and older participants used the authentication mechanism. The age groups clearly differed in their interaction with the IT and the security system. Designing security systems for universal access remains a major challenge.

Keywords: Universal access, usability, security, authentication, information technology.
1 Introduction

There are undisputable benefits from the increased functionality, connectivity, and ability to collaborate provided by Information Technology (IT). However, numerous privacy and security issues also arise and pose threats to IT users. One of the common ways to secure an IT system is by integrating authentication and access control mechanisms into it. These mechanisms aim to distinguish between users and protect the registered user from unauthorized access by other people. Knowledge-based authentication uses a word or phrase shared between the user and the IT security system (i.e., a password), token-based authentication uses a physical token, and biometric authentication relies on the uniqueness of details in a person's anatomy or behavior [4].

As biometric authentication methods mostly rely on users' physical or behavioral features, which are always available, they can be considered highly usable compared to token-based methods, in which users need to protect the token from being lost or stolen. Moreover, the user is not required to memorize or
acquire new information, and thus memory load and cognitive effort are low. Cognitive effort refers to the amount of resources the user has to invest when learning and executing a procedure. People try to avoid cognitive effort, even if avoidance might lead to adverse consequences [5]. However, while biometric methods have several advantages, they also have disadvantages. For instance, an individual's age or occupation (e.g., construction work) may make it difficult to capture a complete and accurate fingerprint image. Other issues relate to the cost of the capturing equipment and to vulnerability and privacy concerns. Furthermore, biometric systems may solve problems associated with traditional security methods, but if a hacker manages to compromise a biometric system, the privacy of the individual is permanently compromised, because biometrics do not change over time. This property of biometrics constitutes a serious drawback [14].

Users with greatly different cognitive and motor abilities, cultural backgrounds, and personal characteristics operate IT security systems on a daily basis. The usage, context of operation, and surroundings differ between systems, users, and scenarios. This diversity is challenging when designing an IT security system. Various studies have shown that ethnicity, age, and gender may limit the use of biometrics [1]. If the design of a security system does not consider these matters, psychological acceptance of the system will be low, and users might try to bypass security features, thereby reducing and compromising their own security. Moreover, users might stop using a system altogether if they fail to access it. This study is a preliminary step in evaluating how different aspects of usability affect younger and older users' acceptance and use of biometric authentication in a simulated IT system. We first describe some of the challenges in providing a universally accessible experimental security system, and then present and discuss our findings.
2 Universal Access and IT Security

One of the main requirements that a modern IT security system must meet is universality. This requirement suggests that almost anybody should be able to use the system [12, 2]. However, having the physical capabilities required for using the system does not necessarily mean that people will want to use it (or feel they can use it). Additional limitations can prevent people from using the system; these limitations can concern the whole society (cultural limitations), a specific group (e.g., older users), or the individual (technophobia). If these matters are not considered, universality can be unintentionally violated.

With aging, motor and cognitive skills may decrease and biometric features gradually change [13]. The physical changes include the loss of the lipid (fat) layer in the skin, making fingerprints worn and difficult to image. Similarly, with age the voice changes or becomes unstable, making vocal signature identification more difficult. Diminishing cognitive abilities (and in particular memory) can cause difficulties when interacting with security systems, especially when older users need to memorize passwords or use their biometric features. Other common authentication-related failures include non-readable fingerprints, blind users who cannot register iris images,
or cuts and burns on fingers, which can cause a temporary inability to use a fingerprint reader [11].

However, providing universal access might not be enough; users should also universally accept security systems. This concept relates to the extent to which people are willing to accept security measures in their daily life [12]. Cultural incompatibilities can constrain the acceptability of authentication mechanisms. For example, facial recognition is not suitable in cultures where women hide their faces [11], and biometric authentication does not support family members performing tasks in place of the registered individual, a customary behavior in some cultures [9]. Hofstede's model [6] defined five cultural dimensions that quantify differences between national cultures: (i) power distance, (ii) individualism, (iii) masculinity, (iv) uncertainty avoidance, and (v) long-term orientation. Power distance is the extent to which a society as a whole accepts an unequal power distribution among its members. Individualism refers to the relative importance of individuals in a society. Masculinity is the difference between male and female gender roles. Uncertainty avoidance defines the society's tolerance of uncertainty and the unknown, and long-term orientation is the time focus of a culture. A study [9] that examined cross-cultural differences in attitudes towards biometric technology in India, the United Kingdom, and South Africa found clear differences in the acceptance of different methods. However, the cultural dimensions proposed by Hofstede [6] were not sufficient to explain the differences; the alternative explanation related to the effect of the crime rate in each country. Despite these findings, Hofstede's cultural dimensions can provide a general framework for investigating a society and its motivation for adopting new technology.

However, observing acceptance from the society's point of view is not always enough; there is also a need to examine what motivates individuals to adopt new technologies. The "Diffusion of Innovations" model for assessing technology adoption was developed by Rogers [10]. It defines five adopter categories in the population: (i) innovators, (ii) early adopters, (iii) early majority, (iv) late majority, and (v) laggards. One can predict that the acceptance, and thereby the interaction with and use of IT security, of "innovators" and "early adopters" can be very different from that of the "late majority" and "laggards". Moreover, it is likely that "innovators" and "early adopters" use more advanced services, which create different security needs and include advanced authentication. Additionally, there might be an age difference between the groups: innovators tend to be younger and more technology oriented than laggards. This assumption is supported by Kowalski and Goldstein [8], who found that the "early majority" is a younger population that responds better to biometric authentication methods, compared with the "late majority" and "laggards". It is crucial to remember that these models are used to explain the adoption of new technology and do not relate specifically to security systems.

Security is a supporting task: it is not considered a critical part of the means to achieve a desired goal [11]. This means it is more sensitive to reductions in efficiency than the production task on which users are focused.
If a supporting task conflicts with a production task, users will attempt to bypass it to maintain performance in the production task. If a supporting task requires significant extra effort or interferes with the production task,
users' motivation to comply depends on their understanding of the reason for the task, and they may need an incentive to comply with it. Failure to provide users with the necessary understanding, training, and motivation can result in security-related human error. The requirements for a high level of security while maintaining adequate usability and user acceptance are frequently in conflict, and a suitable balance has to be found [3]. Two measures rooted in signal detection theory can be used to assess the security and usability of an authentication system: the False Match (FM) and False Non-Match (FNM) rates [7]. FM refers to granting access to an unauthorized user due to an error in the matching process; FNM refers to incorrectly dismissing an authorized user. Generally, a low FM rate is desired, as it indicates a secure system. However, it may come at the cost of a higher FNM rate, which points to low usability and accessibility and may cause users to avoid the system (a numerical illustration follows at the end of this section).

The goal of this study was to identify differences between young and elderly users when accessing an IT security system. In addition, for each group we observed how the need to authenticate and the authentication process influence the subjective evaluations of the system's performance.
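As noted above, the FM/FNM balance follows from a single decision threshold on the matcher's score. The sketch below illustrates the trade-off; the score distributions are synthetic and chosen only for demonstration, not taken from the study.

```python
# Sketch of the False Match / False Non-Match trade-off for a score-based
# matcher; the genuine and impostor score distributions are synthetic.
import random

random.seed(1)
genuine = [random.gauss(0.75, 0.10) for _ in range(1000)]   # authorized users
impostor = [random.gauss(0.45, 0.10) for _ in range(1000)]  # unauthorized users

def rates(threshold):
    fm = sum(s >= threshold for s in impostor) / len(impostor)  # wrongly accepted
    fnm = sum(s < threshold for s in genuine) / len(genuine)    # wrongly rejected
    return fm, fnm

for t in (0.50, 0.60, 0.70):
    fm, fnm = rates(t)
    print(f"threshold={t:.2f}  FM={fm:.3f}  FNM={fnm:.3f}")

# A stricter threshold lowers FM (more secure) but raises FNM (less usable).
```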
3 Method

As previously mentioned, an authentication system has to satisfy a set of usability concerns in order to be adopted and used continuously. For this study we developed an experimental environment that combines an eBanking-like investment game with fingerprint authentication.

3.1 Participants

The first group of participants included 32 undergraduate Industrial Engineering and Management students (13 females and 19 males) from Ben-Gurion University of the Negev, with ages ranging from 23 to 28 (mean = 25). They received an extra credit point in a course for their participation and had the opportunity to win a prize of 40 NIS based on performance in the task. The second group of participants included 7 subjects (5 females and 2 males) with ages ranging from 55 to 62. These participants were recruited from the non-academic staff of the university. For participating in the experiment, participants in this group received 50 NIS and the opportunity to win an additional prize of 40 NIS based on performance in the task.

3.2 Apparatus and Procedure

The experiment took place in a computer laboratory setting. The experimental system included a personal computer connected to a 17" touch screen and a fingerprint reader (SecuGen Hamster IV). The experimental platform overrode the manufacturer's settings of the fingerprint reader and controlled both the FNM rate and the duration for which a fingerprint reading was processed before a response was provided. After arriving at the laboratory, each participant signed a consent form and received instructions on the course of the experiment. The investment game had a
touch-screen interface, and an external fingerprint reader was used for the authentication process. During the 48 steps of the game, participants could build an investment portfolio by allocating money to different investment channels, which differed in their expected profit and in the maximal amount of money the user could invest (see Fig. 1). Allocating money required prior authentication, after which the participant could transfer money between the investment channels. Alternatively, participants could choose to continue with their current portfolio. The incentive to change the portfolio was the opportunity to increase the amount of money invested in channels with a large expected profit and a small variance in the expected profit. The experimental design was within-subjects, with two independent variables and two levels for each variable (four experimental conditions, i.e., blocks, of 12 steps each). Participants were randomly assigned to one of eight groups that differed in the order of the blocks. After each experimental block, participants answered questions assessing their subjective experience with the system on seven-point Likert scales ranging from 'Agree' to 'Disagree'.
Fig. 1. Screen capture of the investment game
3.3 Independent Variables

The study focused on two independent variables: the probability of successful authentication and the authentication processing time. The probability of successful authentication was manipulated by changing the number of attempts required before the authentication process succeeded. There were two levels of probability of success, low and high. With the low probability of successful authentication, participants needed an average of 2.95 authentication attempts; with the high probability, the average number of attempts until success was 1.5. The authentication processing time is the time delay from the moment the fingerprint is captured until the system provides the participant with an answer (match or no match). The two levels of time delay, short and intermediate, differed in their duration. In the short time-delay condition, after the finger has been scanned and processed
by the system, there was an additional delay of 1 second before a message stating the success or failure of authentication appeared. In the intermediate time-delay condition, the additional delay was 4 seconds. Fig. 2 illustrates the verification process from the moment the user started an authentication attempt until feedback on authentication success or failure was given. Following each block, the authentication process was evaluated subjectively on three scales: participants reported their assessment of the reliability of the process, their satisfaction with the interaction with the authentication mechanism, and the perceived delay in processing time.
Fig. 2. Authentication process, user and system views
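The following sketch shows one way the manipulation just described could be reproduced. It models each authentication attempt as a Bernoulli trial — our assumption, since the paper does not state the platform's internal mechanism — with the per-attempt success probability tuned so that the expected number of attempts matches the reported means (1/2.95 ≈ 0.34 for the low condition, 1/1.5 ≈ 0.67 for the high), and with the artificial delay inserted before feedback.

```python
# Sketch of the experimental manipulation: forced success probability and
# added feedback delay. The Bernoulli-trial model is our assumption; only
# two of the four crossed conditions are listed for brevity.
import random
import time

CONDITIONS = {
    # name: (success probability per attempt, extra delay before feedback, s)
    "low_prob_short_delay": (1 / 2.95, 1.0),   # expected ~2.95 attempts
    "high_prob_long_delay": (1 / 1.50, 4.0),   # expected ~1.5 attempts
}

def authenticate(condition: str) -> int:
    """Simulate attempts until the platform reports a match; return count."""
    p_success, delay = CONDITIONS[condition]
    attempts = 0
    while True:
        attempts += 1
        time.sleep(delay)                # artificial processing delay
        if random.random() < p_success:  # forced match / no-match outcome
            return attempts

print(authenticate("high_prob_long_delay"))
```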
4 Results

The data collected from each group of participants were analyzed separately. The effects of the independent variables on the tendency to authenticate were assessed using generalized linear mixed models (logistic regression analyses), and the subjective evaluations of the authentication system were examined using general linear models.

4.1 Young Users

The interaction between the probability of a successful authentication and the processing delay had a significant effect on the tendency to authenticate [z = 3.396, p